Author: William A. Stein

© Copyright 2016 Dr Marta Milo and Dr Mike Croucher, University of Sheffield.

# Week 1 Practical

This Notebook contains practical assignments for Week 1.

It contains guidance on how to perform some commands in R along with practical tasks that you will have to implement yourself. You are free to base your work on the examples given here, but you are also welcome to use different methods if you prefer. For each assigned task you will need to add a description of what you have done and a comment on the results obtained, using Markdown cells.

You will need to create a new notebook in the Week 1 folder of your SageMathCloud account and name it your_username_week1.ipynb. The notebooks will be self-marked following a set of guidelines that you will receive together with a notebook that contains the solutions to the exercises. THIS IS FORMATIVE FEEDBACK that you can use to improve your coding skills.

The last version of your notebook saved by the deadlines indicated on the module website will be the one considered for self-marking. It will be moved into your assignment folder, where you will find the guidelines and the solved notebook.

All the notebooks are meant to be used interactively. All the code needs to be written into code cells -- these are cells that can be executed by an R kernel. The outputs are not present in this notebook, but the code cells are executable.

You can edit a code cell by clicking into it, and execute its code by pressing SHIFT and ENTER simultaneously. You can run all code cells at once by clicking on Cell in the menu bar above and choosing Run All.

## Basic operations in R

R is based on packages that become active when you call them into your workspace using the function library(my_package). In this practical we will not need to use packages that are not already loaded into your workspace, but it is useful to explore what is available and how to get help from R.

There are many ways to get help from R. Find out what the function library() does by using the commands help(library) or ?library.
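Beyond help() and ?, base R offers a few other ways to look things up. The short sketch below uses apropos() to list objects whose names contain a given string and args() to show a function's argument list. The function sample is used here only as an illustrative target, and the variable name hits is an assumption of this example.

```r
# List every visible object whose name contains "sample"
hits <- apropos("sample")
print(hits)

# Show the argument list of sample() without opening the full help page
print(args(sample))

# The full help page would be opened with help(sample) or ?sample
```

apropos() is handy when you remember only part of a function's name.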

Exercise 0: Create a notebook called your_username_week1.ipynb in the Week 1 folder.

Exercise 1: In your notebook called your_username_week1.ipynb, open a code cell and execute the command library(). What does it result in? What happens if you type library without parentheses? Write a description of what you've discovered in a Markdown cell.

### Changing path and verifying location of workspace

You can verify where your current working directory is by using

In [1]:
getwd()

'/projects/4a5f0542-5873-4eed-a85c-a18c706e8bcd/tmp'

The result from the above command will include a very long path such as /projects/81b488df-6f86-4914-a3a2-03e1fb248f11/ which is a code that details exactly where you are in the SageMathCloud. This is your home directory. Your home directory will be in different locations depending on the system you are using -- SageMathCloud, your laptop or perhaps Sheffield's supercomputer.

You never need to remember the long code. Instead, you use the form ~/ to refer to your home directory. That is, the string ~/Autumn2016 refers to the Autumn2016 directory inside your home directory, wherever that home directory may be on the system you are using.

You can set your working directory using the command setwd(). For example if you want to move to the directory ~/Autumn2016 you can type:

In [2]:
setwd("~/Autumn2016")

Error in setwd("~/Autumn2016"): cannot change working directory
Traceback:
1. setwd("~/Autumn2016")
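setwd() typically fails with a "cannot change working directory" error when the target directory does not exist on the system where the code is run. As a minimal sketch, the check below uses the base R functions path.expand(), file.path() and dir.exists() to inspect the path before changing directory. The variable names home and target are only illustrative.

```r
home <- path.expand("~")                 # the full path that ~ stands for
target <- file.path(home, "Autumn2016")  # build the path in a portable way
print(home)

if (dir.exists(target)) {
  setwd(target)                          # safe: the directory exists
} else {
  message("Directory not found: ", target)
}
print(getwd())                           # confirm where we are now
```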

Exercise 2: Run the setwd command above in your your_username_week1.ipynb notebook. Verify that you are indeed in the Autumn2016 directory and then set your working directory to be Autumn2016/Week1

### Variables

Variables are basic objects that contain values. In R, objects can be of varying complexity. During this module we will explore some different types of objects.

Variable names can contain numbers but can't start with numbers and they are case sensitive.

With ls() or objects() you get an overview of the objects in your workspace. Single objects can be removed with rm(). To clear your whole workspace use rm(list=ls()).
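The naming rules and the workspace commands can be tried out together. The following is a small sketch with made-up variable names (height, Height, height2); the symbol <- assigns a value and is explained just below.

```r
height <- 3      # <- assigns a value to a name
Height <- 30     # a different variable: names are case sensitive
height2 <- 7     # digits are allowed, but not as the first character

print(ls())      # lists the objects currently in the workspace
rm(height2)      # remove a single object
print(exists("height2"))  # FALSE: height2 is gone
```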

In [3]:
x<-1


In the above code cell x is a variable: x is its name and 1 is its value. The symbol <- assigns a value to an object. Basic arithmetic operations (+, -, *, /) can be performed on variables and, as we will see later, on vectors. The 'raise to the power' operation is performed with the symbol ^. For example, three raised to the power 2 is given by 3^2.
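As an illustration of assignment and arithmetic, here is a short sketch using two made-up variables, a and b, and storing each result in its own object. The names and values are arbitrary choices for this example.

```r
a <- 7
b <- 2

total <- a + b          # 9
ratio <- (a - b) / b    # 2.5
power <- a^b            # 49

print(c(total, ratio, power))
```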

Exercise 3: In your notebook called your_username_week1.ipynb:

Assign the values 3, 10 and 15 to three different objects and perform the following operations, storing the results in different objects:

• sum all three variables
• take the difference of the first two and divide by the third
• multiply all three together
• take the square value of the sum
• calculate the sum of the vector raised to the power of 4
• take the square root (sqrt) of the difference of the third and the first

Variables can contain collections of letters called strings. We manipulate strings differently from how we manipulate numbers. We use commands such as paste. For example:

In [4]:
myname <- "Marta"
Greeting <- "Ciao"
# Let's join these together using R's paste function.
message <- paste(Greeting, myname, sep=" ") # sep sets the separator placed between the strings.
print(message) # Print out the message

[1] "Ciao Marta"
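paste() also accepts numbers, which it converts to strings, and any separator can be chosen. The sketch below additionally uses paste0(), a base R shortcut that joins its arguments with no separator. The variable names week, label and filename are assumptions made for this example.

```r
week <- 1

label <- paste("Week", week, sep = "_")                       # "Week_1"
filename <- paste0("your_username", "_week", week, ".ipynb")  # no separator

print(label)
print(filename)   # "your_username_week1.ipynb"
```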

Exercise 4: In your notebook called your_username_week1.ipynb:

Write code to print your name, your email and the module code separated by a comma.

### Vectors and Matrices

You can assign a set of values to an object and this can be in the form of a row of values (vector) or a table (matrix). For example to build a vector with numbers from 1 to 10 we can use any of the following methods:

In [5]:
x <- 1:10
assign("x",1:10)
x <- seq(1,10,by=1)
x <- seq(length=10,from=1,by=1)
x <- c(1,2,3,4,5,6,7,8,9,10)    # c = concatenate


You can generate random sequences of numbers using commands like sample(), runif() and rnorm(). Explore those with R help. For example, to create a sequence of 12 random integers from 1:100 you can use

In [6]:
sample(1:100, 12, replace=TRUE)

1. 80
2. 76
3. 72
4. 59
5. 92
6. 62
7. 21
8. 11
9. 33
10. 56
11. 83
12. 24
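For comparison, here is a brief sketch of the other two generators mentioned above. It also calls set.seed(), a base R function not covered in this practical, so that the same random numbers are produced each time the cell is run. The seed value 42 and the variable names are arbitrary.

```r
set.seed(42)                        # fix the seed so results can be reproduced

u <- runif(5)                       # 5 values drawn uniformly between 0 and 1
z <- rnorm(5, mean = 10, sd = 2)    # 5 values from a normal distribution
dice <- sample(1:6, 3)              # 3 distinct values: replace defaults to FALSE

print(u)
print(z)
print(dice)
```

Without replace=TRUE, sample() never returns the same element twice, so it cannot draw more values than the vector contains.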

We can manipulate these vectors in all sorts of ways and use basic arithmetic operations on them. We can find the sum of all the elements with the command sum() or find the number of elements with the command length().

In [7]:
sum(x)
length(x)

55
10
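Arithmetic on vectors is applied element by element. A minimal sketch with a made-up vector v:

```r
v <- c(2, 4, 6)

print(v * 10)              # 20 40 60
print(v + c(1, 1, 1))      # 3 5 7
print(v^2)                 # 4 16 36
print(sum(v) / length(v))  # 4, the mean of the elements
```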

To access elements of a vector we use [] and a number corresponding to the element's position in the vector. Counting starts at 1 (rather than 0, which is the case for many other languages). For example, the 5th element of x is 5. We can also access a subset of elements using the : operator. The command x[1:5] gives the first 5 elements of x.

Negative indices exclude certain elements from the vector, e.g. x[-3] is the same as x with the third element missing. Reproduce this in the code cell.

In [8]:
x[5]
x[1:5]

5
1. 1
2. 2
3. 3
4. 4
5. 5
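The different forms of indexing can be compared side by side. The sketch below uses a separate, made-up vector v so that positions and values are easy to tell apart.

```r
v <- c(10, 20, 30, 40, 50)

print(v[2])        # 20
print(v[c(1, 5)])  # 10 50
print(v[-3])       # 10 20 40 50
print(v[-(1:2)])   # 30 40 50
```

Note the parentheses in -(1:2): without them, -1:2 would mean the sequence from -1 to 2.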

Exercise 5:

In your notebook called your_username_week1.ipynb:

Create a sequence of even numbers ranging from 2 to 30. Create a sequence of odd numbers ranging from 1 to 30. Verify the lengths and calculate the sum of the elements of the even sequence and the sum of the elements of the odd sequence. What are the two values? Calculate the sum of a sequence of numbers ranging from 1 to 30. What do you conclude?

**Numbers as characters**

We can transform numbers to characters using the command as.character(). For example

In [9]:
dept<- "BMS"
code<- 353
module<- c(dept,as.character(code))
print(module)

[1] "BMS" "353"

This enables us to concatenate letters and numbers into a single vector, with the numbers coerced into characters.
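Coercion also happens automatically when c() is given a mixture of numbers and strings, and it can be reversed with as.numeric(). A small sketch with illustrative names mixed and n:

```r
mixed <- c(1.5, "a")        # the number is coerced to the string "1.5"
print(mixed)
print(class(mixed))         # "character"

n <- as.numeric("353")      # convert a string back to a number
print(n + 1)                # 354
```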

Exercise 6: In your notebook called your_username_week1.ipynb:

Create a vector with the following strings: "BMS", "APS", "MBB". Create a vector with the following numbers: 353, 227, 253.

• concatenate the two vectors
• create a vector of three elements merging the elements of the two vectors. For example the first element will be BMS353

Matrices are multi-dimensional vectors. They can be indexed by two or more indices. We can create matrices from vectors by rearranging the dimensions using the command dim(). We also use dim() to check the dimensions of a matrix.

In [10]:
M<-1:20
dim(M)<-c(4,5)
M

1  5  9 13 17
2  6 10 14 18
3  7 11 15 19
4  8 12 16 20

More often we use the command matrix() to create a matrix or rearrange a set of data. matrix() needs the data and the number of rows (nrow) and columns (ncol). For example: matrix(data, nrow = 4, ncol = 5)

In [11]:
M<- matrix(1:20,nrow=4,ncol=5)
M

1  5  9 13 17
2  6 10 14 18
3  7 11 15 19
4  8 12 16 20
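As the output shows, matrix() fills the matrix column by column by default. The byrow argument of matrix() changes this. A short sketch with two made-up matrices:

```r
byCol <- matrix(1:6, nrow = 2)                # filled down the columns (default)
byRow <- matrix(1:6, nrow = 2, byrow = TRUE)  # filled along the rows

print(byCol)   # first row: 1 3 5
print(byRow)   # first row: 1 2 3
```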

We can index the elements of M in the same way we used for vectors. The only difference: now we need two indices in the square brackets, because M is two-dimensional. The first index corresponds to the rows, the second to the columns.
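To illustrate two-index notation, the sketch below uses a small made-up matrix A rather than M. Leaving an index empty selects every row or every column.

```r
A <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns, filled column by column

print(A[2, 3])        # 6: row 2, column 3
print(A[1, ])         # 1 3 5: the whole first row
print(A[, 2])         # 3 4: the whole second column
print(dim(A[, 2:3]))  # 2 2: columns 2 and 3 form a 2x2 matrix
```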

Exercise 7: Create new matrices with:

• The 2x2 matrix forming the upper left corner of M
• The first two rows with the third column missing
• Second row with all columns

Comment on dimensions of all these new matrices.

We can add column names and row names to matrices using the commands colnames() and rownames() respectively. For example

In [12]:
rownames(M)<-c("A","B","C","D")
colnames(M)<-c("1st", "2nd", "3rd","4th","5th")
M

  1st 2nd 3rd 4th 5th
A   1   5   9  13  17
B   2   6  10  14  18
C   3   7  11  15  19
D   4   8  12  16  20

We can now access the data in the matrix using names as index.

In [13]:
M["A",]
M[,c("2nd","3rd")]

1st 2nd 3rd 4th 5th
  1   5   9  13  17

  2nd 3rd
A   5   9
B   6  10
C   7  11
D   8  12

Exercise 8: In your notebook called your_username_week1.ipynb:

Create a [3x3] matrix containing the number of students for the years "2013-14", "2014-15" and "2015-16" in the three modules whose names you created in Exercise 6. Use invented numbers. Assign names to the columns and rows, and print the number of students for BMS353 for all years.

It is possible to perform all arithmetic operations with matrices. Some are more computationally demanding than others. We can add, subtract and multiply two matrices, or combine a matrix with a single scalar (one value). For example:

In [14]:
M+4
M/2
M+M

  1st 2nd 3rd 4th 5th
A   5   9  13  17  21
B   6  10  14  18  22
C   7  11  15  19  23
D   8  12  16  20  24

  1st 2nd 3rd 4th  5th
A 0.5 2.5 4.5 6.5  8.5
B 1.0 3.0 5.0 7.0  9.0
C 1.5 3.5 5.5 7.5  9.5
D 2.0 4.0 6.0 8.0 10.0

  1st 2nd 3rd 4th 5th
A   2  10  18  26  34
B   4  12  20  28  36
C   6  14  22  30  38
D   8  16  24  32  40
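The operators used above all work element by element, and so does *. Matrix multiplication in the linear-algebra sense uses the separate operator %*% and requires the number of columns of the left matrix to equal the number of rows of the right one. The sketch below contrasts the two on small made-up 2x2 matrices A and B.

```r
A <- matrix(1:4, nrow = 2)            # columns are (1, 2) and (3, 4)
B <- matrix(c(0, 1, 1, 0), nrow = 2)  # swaps the two columns of A in A %*% B

elementwise <- A * B     # multiplies matching entries
product <- A %*% B       # rows of A times columns of B

print(elementwise)       # rows: 0 3 / 2 0
print(product)           # rows: 3 1 / 4 2
```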

Exercise 9: In your notebook called your_username_week1.ipynb:

Create two matrices [3x4] of 12 random integers from 1:100.

• sum the matrices
• subtract the matrices
• multiply the matrices (to perform matrix multiplication you need to use the %*% operator, and you need to make sure the dimensions are compatible)
• calculate the square root of the elements of the matrices

## Create user-defined functions in R

In R it is possible to create small routines (or functions) that can be called using a single word. For example, we can create a function called myFunction which takes a value, raises it to the power of 3 and subtracts 1. It will look something like this:

In [15]:
myFunction <- function(x) {
  ux <- x^3 - 1   # raise x to the power of 3, then subtract 1
  return(ux)
}

test <- myFunction(2) # this calls the function with the value 2


What would be the value of test?

Functions can be more or less complex and can contain many instructions. A user-defined function returns one object as its output. The general syntax for a user-defined function is:

myfunction <- function(arg1, arg2, ... ){
  statements
  return(object)
}
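As a sketch of a function body with several statements, the hypothetical function rangeWidth below stores intermediate results in local variables before returning a single value. The function name and its behaviour are invented purely for illustration.

```r
rangeWidth <- function(v) {
  lowest <- min(v)            # smallest element of the input vector
  highest <- max(v)           # largest element of the input vector
  return(highest - lowest)    # one object is returned
}

print(rangeWidth(c(3, 9, 4)))  # 6
```

The variables lowest and highest exist only while the function runs; they do not appear in the workspace afterwards.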


Exercise 10: In your notebook called your_username_week1.ipynb:

Calculate the sample variance of $x=(5,4,3,2,1)$ using the formula below. The formula of the sample variance is:

$\frac{1}{N-1}\sum_{n=1}^{N}{(x_n-\frac{1}{N}\sum_{n=1}^{N}x_n)^2}$.

What happens if you use the command var(x)?

Tip: Remember to use the command sum().

Exercise 11: In your notebook called your_username_week1.ipynb:

Create a user-defined function that accepts more than one input argument. This function can do anything you like. Be creative!

Exercise 12: In your notebook called your_username_week1.ipynb:

Write a brief summary, in bullet points, of the content of this week's practical.
