
R

Kernel: R (R-Project)
options(jupyter.plot_mimetypes ='image/png')

Exercise 1

?read.table()

The read.table() command reads a file in table format and creates a data frame from it, with cases corresponding to lines and variables to fields in the file. For .csv files, use read.csv(file, header = TRUE, sep = ",", quote = "\"", dec = ".", fill = TRUE, comment.char = "", ...).
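As a minimal sketch of these arguments, the snippet below writes a tiny CSV to a temporary file and reads it back. The file contents and the names `tmp` and `demo` are made up for illustration and are not part of the exercise data.

```r
# Write a small, made-up CSV to a temporary file (illustrative data only)
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,pressure", "1,103", "2,114", "3,"), tmp)

# header = TRUE takes the column names from the first line;
# the empty numeric field in the last row is read as NA
demo <- read.csv(tmp, header = TRUE)
str(demo)
```

Because read.csv() already sets sep = "," and header = TRUE by default, only the file name usually has to be supplied.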

Exercise 2

data <- read.csv("data_wk3/smokers.csv", header = TRUE)
data
    X Smokers..mmHg. Non.smokers..mmHg.
1   1    1.03000e+02                107
2   2    1.14000e+02                101
3   3    1.25000e+02                103
4   4    1.18000e+02                 95
5   5    1.19000e+02                101
6   6    1.28000e+02                 98
7   7    1.10000e+02                112
8   8    9.80000e+01                 98
9   9    1.25000e+02                106
10 10    1.34000e+02                113
11 11    1.11000e+02                102
12 12    1.14000e+02                114
13 13    1.15000e+02                101
14 14    1.12000e+02                 94
15 15             NA
16 16             NA
17 17             NA
18 18    3.05975e-04              equal
19 19    3.89071e-04            unequal
str(data)
'data.frame': 19 obs. of 3 variables:
 $ X                 : int 1 2 3 4 5 6 7 8 9 10 ...
 $ Smokers..mmHg.    : num 103 114 125 118 119 128 110 98 125 134 ...
 $ Non.smokers..mmHg.: Factor w/ 14 levels "","101","102",..: 6 2 4 11 2 12 7 12 5 8 ...

The str(data) output shows that the data is stored in a data frame with 19 observations of 3 variables, that is, 19 rows and 3 columns. The first column, X, holds the integer row index. The second column, Smokers..mmHg., is numeric. The third column, Non.smokers..mmHg., was read as a factor rather than as numbers because it contains the text entries "equal" and "unequal". All values in a column must have the same type, so the numbers in that column are stored as strings (factor levels).
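The type coercion described above can be reproduced with a few lines of made-up CSV text. This is only a sketch: the names `txt`, `df` and `b_num` are illustrative, and stringsAsFactors = TRUE is set explicitly because its default differs between R versions.

```r
# Made-up CSV text: the last entry of column b is a word, not a number
txt <- "a,b\n1,107\n2,101\n3,equal"
df <- read.csv(text = txt, header = TRUE, stringsAsFactors = TRUE)

# Column a stays numeric (integer); column b becomes a factor
sapply(df, class)

# Convert via character first; as.numeric() on a factor would return
# the internal level codes instead of the values
b_num <- suppressWarnings(as.numeric(as.character(df$b)))
b_num  # the non-numeric entry becomes NA
```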

Exercise 3

data1 <- read.csv("data_wk3/smokers.csv", header = TRUE, nrows =14)
data1
    X Smokers..mmHg. Non.smokers..mmHg.
1   1            103                107
2   2            114                101
3   3            125                103
4   4            118                 95
5   5            119                101
6   6            128                 98
7   7            110                112
8   8             98                 98
9   9            125                106
10 10            134                113
11 11            111                102
12 12            114                114
13 13            115                101
14 14            112                 94

Exercise 4

smokers_clean <- read.csv("data_wk3/smokers.csv", header = TRUE, nrows = 17)
smokers_clean
    X Smokers..mmHg. Non.smokers..mmHg.
1   1            103                107
2   2            114                101
3   3            125                103
4   4            118                 95
5   5            119                101
6   6            128                 98
7   7            110                112
8   8             98                 98
9   9            125                106
10 10            134                113
11 11            111                102
12 12            114                114
13 13            115                101
14 14            112                 94
15 15             NA                 NA
16 16             NA                 NA
17 17             NA                 NA
write.csv(smokers_clean, file = "data_wk3/smokers_clean.csv")
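One detail worth knowing when writing a data frame back to disk: by default write.csv() also writes the row names as an unnamed first column, which read.csv() then labels X. The sketch below shows the difference on a small made-up data frame; `d`, `f1` and `f2` are illustrative names and the files are temporary.

```r
# Illustrative data frame, not the exercise data
d <- data.frame(smokers = c(103, 114, NA), non_smokers = c(107, 101, NA))

f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")
write.csv(d, file = f1)                     # default: row names are written
write.csv(d, file = f2, row.names = FALSE)  # row names are left out

names(read.csv(f1))  # includes an extra first column named X
names(read.csv(f2))  # only the two data columns
```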

Exercise 5

y<-c(27,40,72,NA,89)
my<-mean(y,na.rm=TRUE)
my
[1] 57
y[is.na(y)] <- 0
y
[1] 27 40 72 0 89
smokers_clean$Smokers..mmHg
[1] 103 114 125 118 119 128 110 98 125 134 111 114 115 112 NA NA NA
smokers_clean$Non.smokers..mmHg
[1] 107 101 103 95 101 98 112 98 106 113 102 114 101 94 NA NA NA
s<-c(smokers_clean$Smokers..mmHg)
ms1<-mean(s,na.rm=TRUE)
ms1
s[is.na(s)]<-0
s
ms2<-mean(s)
ms2
[1] 116.1429
[1] 103 114 125 118 119 128 110 98 125 134 111 114 115 112 0 0 0
[1] 95.64706
s<-c(smokers_clean$Smokers..mmHg)
mes1<-median(s,na.rm=TRUE)
mes1
s[is.na(s)]<-0
s
mes2<-median(s)
mes2
[1] 114.5
[1] 103 114 125 118 119 128 110 98 125 134 111 114 115 112 0 0 0
[1] 114
s<-c(smokers_clean$Smokers..mmHg)
sds1<-sd(s,na.rm=TRUE)
sds1
s[is.na(s)]<-0
s
sds2<-sd(s)
sds2
[1] 9.694226
[1] 103 114 125 118 119 128 110 98 125 134 111 114 115 112 0 0 0
[1] 46.46765
s<-c(smokers_clean$Smokers..mmHg)
IQRs1<-IQR(s,na.rm=TRUE)
IQRs1
s[is.na(s)]<-0
s
IQRs2<-IQR(s)
IQRs2
[1] 12.25
[1] 103 114 125 118 119 128 110 98 125 134 111 114 115 112 0 0 0
[1] 16
ns<-c(smokers_clean$Non.smokers..mmHg)
mns1<-mean(ns,na.rm=TRUE)
mns1
ns[is.na(ns)]<-0
ns
mns2<-mean(ns)
mns2
[1] 103.2143
[1] 107 101 103 95 101 98 112 98 106 113 102 114 101 94 0 0 0
[1] 85
ns<-c(smokers_clean$Non.smokers..mmHg)
mens1<-median(ns,na.rm=TRUE)
mens1
ns[is.na(ns)]<-0
ns
mens2<-median(ns)
mens2
[1] 101.5
[1] 107 101 103 95 101 98 112 98 106 113 102 114 101 94 0 0 0
[1] 101
ns<-c(smokers_clean$Non.smokers..mmHg)
sdns1<-sd(ns,na.rm=TRUE)
sdns1
ns[is.na(ns)]<-0
ns
sdns2<-sd(ns)
sdns2
[1] 6.411271
[1] 107 101 103 95 101 98 112 98 106 113 102 114 101 94 0 0 0
[1] 40.96798
ns<-c(smokers_clean$Non.smokers..mmHg)
IQRns1<-IQR(ns,na.rm=TRUE)
IQRns1
ns[is.na(ns)]<-0
ns
IQRns2<-IQR(ns)
IQRns2
[1] 8
[1] 107 101 103 95 101 98 112 98 106 113 102 114 101 94 0 0 0
[1] 11
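The cells above repeat the same pattern for each statistic: compute it with the missing values removed, then again with the missing values replaced by 0. A compact sketch of that comparison for the smokers column follows. The helper `summarise_bp` and the name `s_zero` are hypothetical, and the vector is typed in from the output shown earlier rather than read from the file. For the mean, the difference is simply a matter of the divisor: the 14 readings sum to 1626, so the mean is 1626/14 with NA removed and 1626/17 once three zeros are counted as observations.

```r
# Smokers column as shown above: 14 readings followed by 3 missing values
s <- c(103, 114, 125, 118, 119, 128, 110, 98, 125, 134,
       111, 114, 115, 112, NA, NA, NA)

# Hypothetical helper: four summary statistics for a numeric vector
summarise_bp <- function(x, na.rm = FALSE) {
  c(mean = mean(x, na.rm = na.rm), median = median(x, na.rm = na.rm),
    sd = sd(x, na.rm = na.rm), IQR = IQR(x, na.rm = na.rm))
}

s_zero <- replace(s, is.na(s), 0)  # missing values replaced by 0

# One row per treatment of the missing values
rbind(na_removed  = summarise_bp(s, na.rm = TRUE),
      zero_filled = summarise_bp(s_zero))
```

Replacing NA with 0 treats the missing readings as real measurements of 0 mmHg, which pulls the mean down and inflates the spread, so removing them with na.rm = TRUE is usually the safer choice for data like this.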
t.test(smokers_clean$Smokers..mmHg,smokers_clean$Non.smokers..mmHg)
	Welch Two Sample t-test

data:  smokers_clean$Smokers..mmHg and smokers_clean$Non.smokers..mmHg
t = 4.1621, df = 22.546, p-value = 0.0003891
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  6.495648 19.361495
sample estimates:
mean of x mean of y
 116.1429  103.2143
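To see where the numbers in this output come from, the Welch statistic can be recomputed by hand. The sketch below uses the standard Welch formulas; the two vectors are typed in from the output shown earlier (missing values already dropped), and the variable names are illustrative.

```r
s  <- c(103, 114, 125, 118, 119, 128, 110, 98, 125, 134, 111, 114, 115, 112)
ns <- c(107, 101, 103, 95, 101, 98, 112, 98, 106, 113, 102, 114, 101, 94)

# Squared standard errors of the two sample means
se2_s  <- var(s)  / length(s)
se2_ns <- var(ns) / length(ns)

# Welch t statistic and Welch-Satterthwaite degrees of freedom
t_stat <- (mean(s) - mean(ns)) / sqrt(se2_s + se2_ns)
df <- (se2_s + se2_ns)^2 /
  (se2_s^2 / (length(s) - 1) + se2_ns^2 / (length(ns) - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)
c(t = t_stat, df = df, p = p_value)
```

The result should agree with the t, df and p-value reported by t.test() above, which uses the Welch (unequal variance) version by default.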
hist(smokers_clean$Non.smokers..mmHg, prob=F, main='Non-Smokers Clean Data ', ylab="Non-Smokers", xlab='Blood Pressure')
par(fg='black')
lines(density(smokers_clean$Non.smokers..mmHg,na.rm=TRUE))
abline(v=mean(smokers_clean$Non.smokers..mmHg,na.rm=TRUE), col=rgb(0.5,0.5,0.5))
abline(v=median(smokers_clean$Non.smokers..mmHg,na.rm=TRUE), lty=3, col=rgb(0.5,0.5,0.5))
abline(v=mean(smokers_clean$Non.smokers..mmHg,na.rm=TRUE)+sd(smokers_clean$Non.smokers..mmHg,na.rm=TRUE), lty=2, col=rgb(0.7,0.7,0.7))
abline(v=mean(smokers_clean$Non.smokers..mmHg,na.rm=TRUE)-sd(smokers_clean$Non.smokers..mmHg,na.rm=TRUE), lty=2, col=rgb(0.7,0.7,0.7))
rug(smokers_clean$Non.smokers..mmHg,na.rm=TRUE)
hist(smokers_clean$Smokers..mmHg, prob=F, main='Smokers Clean Data ', ylab="Smokers", xlab='Blood Pressure')
par(fg='black')
lines(density(smokers_clean$Smokers..mmHg,na.rm=TRUE))
abline(v=mean(smokers_clean$Smokers..mmHg,na.rm=TRUE), col=rgb(0.5,0.5,0.5))
abline(v=median(smokers_clean$Smokers..mmHg,na.rm=TRUE), lty=3, col=rgb(0.5,0.5,0.5))
abline(v=mean(smokers_clean$Smokers..mmHg,na.rm=TRUE)+sd(smokers_clean$Smokers..mmHg,na.rm=TRUE), lty=2, col=rgb(0.7,0.7,0.7))
abline(v=mean(smokers_clean$Smokers..mmHg,na.rm=TRUE)-sd(smokers_clean$Smokers..mmHg,na.rm=TRUE), lty=2, col=rgb(0.7,0.7,0.7))
rug(smokers_clean$Smokers..mmHg,na.rm=TRUE)
Warning message in axis(side = side, at = at, labels = labels, ...): “"na.rm" is not a graphical parameter”
Image in a Jupyter notebook
Warning message in axis(side = side, at = at, labels = labels, ...): “"na.rm" is not a graphical parameter”
Image in a Jupyter notebook
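The two plotting cells differ only in the column and the title, so they could be wrapped in a small function. The sketch below is one possible version, not the notebook's own code: `plot_bp` is a hypothetical helper, it drops the missing values once at the start so that no na.rm argument has to be passed to rug() (the source of the warning above), and it sets prob = TRUE so that the histogram and the density curve share a scale, which differs from the prob=F used in the cells above.

```r
# Hypothetical helper wrapping the plotting steps used above
plot_bp <- function(x, title) {
  x <- x[!is.na(x)]                               # drop missing values once
  hist(x, prob = TRUE, main = title, xlab = "Blood Pressure")
  lines(density(x))                               # kernel density estimate
  abline(v = mean(x), col = "grey50")             # mean
  abline(v = median(x), lty = 3, col = "grey50")  # median
  abline(v = mean(x) + c(-1, 1) * sd(x),
         lty = 2, col = "grey70")                 # mean minus/plus one sd
  rug(x)                                          # tick marks for each reading
}

# Smokers column typed in from the output shown earlier
s <- c(103, 114, 125, 118, 119, 128, 110, 98, 125, 134,
       111, 114, 115, 112, NA, NA, NA)
plot_bp(s, "Smokers Clean Data")
```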

Exercise 6

p1<-c(smokers_clean$Smokers..mmHg,na.rm=TRUE)
p2<-c(smokers_clean$Non.smokers..mmHg,na.rm=TRUE)
plotMA(p1,p2)
Error in plotMA(p1, p2): Error from the generic function 'plotMA' defined in package 'BiocGenerics': no S4 method definition for argument 'p1' of class 'integer' was found. Did you perhaps mean calling the function 'plotMA' from another package, e.g. 'limma'? In that case, please use the syntax 'limma::plotMA'.
Traceback:
1. plotMA(p1, p2)
2. plotMA(p1, p2)
3. stop(msg)
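The call fails because the plotMA generic has no method for plain numeric vectors. Note also that c(x, na.rm=TRUE) does not remove missing values; it appends TRUE as an extra element named na.rm. As a hedged alternative, an MA plot can be drawn with base graphics using the usual definitions M = log2(x) - log2(y) and A = (log2(x) + log2(y)) / 2. The sketch below assumes the two columns can be paired row by row, which the source does not state; the vectors are typed in from the output shown earlier.

```r
s  <- c(103, 114, 125, 118, 119, 128, 110, 98, 125, 134, 111, 114, 115, 112)
ns <- c(107, 101, 103, 95, 101, 98, 112, 98, 106, 113, 102, 114, 101, 94)

# MA values on the log2 scale
M <- log2(s) - log2(ns)        # log ratio
A <- (log2(s) + log2(ns)) / 2  # mean log value

plot(A, M, xlab = "A (mean of log2 values)", ylab = "M (log2 ratio)",
     main = "MA plot: smokers vs non-smokers")
abline(h = 0, lty = 2)  # reference line for no difference
```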

Exercise 7

Log2data<-log2(smokers_clean)
Log2data
          X Smokers..mmHg. Non.smokers..mmHg.
1  0.000000       6.686501           6.741467
2  1.000000       6.832890           6.658211
3  1.584963       6.965784           6.686501
4  2.000000       6.882643           6.569856
5  2.321928       6.894818           6.658211
6  2.584963       7.000000           6.614710
7  2.807355       6.781360           6.807355
8  3.000000       6.614710           6.614710
9  3.169925       6.965784           6.727920
10 3.321928       7.066089           6.820179
11 3.459432       6.794416           6.672425
12 3.584963       6.832890           6.832890
13 3.700440       6.845490           6.658211
14 3.807355       6.807355           6.554589
15 3.906891             NA                 NA
16 4.000000             NA                 NA
17 4.087463             NA                 NA
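As the output shows, applying log2() to the whole data frame also transforms the index column X. If only the measurements should be transformed, one option is to assign the result back to selected columns. The sketch below uses a small made-up data frame with shortened column names; `d`, `cols` and `log2d` are illustrative.

```r
# Made-up data frame with an index column and two measurement columns
d <- data.frame(X = 1:3,
                Smokers = c(103, 114, NA),
                Non.smokers = c(107, 101, NA))

cols <- c("Smokers", "Non.smokers")  # columns to transform
log2d <- d
log2d[cols] <- log2(d[cols])         # X is left untouched, NA stays NA
log2d
```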
d1<-density(smokers_clean$Smokers..mmHg,na.rm=TRUE)
plot(d1)
d2<-density(smokers_clean$Non.smokers..mmHg,na.rm=TRUE)
plot(d2)
Image in a Jupyter notebook
Image in a Jupyter notebook
d3<-density(Log2data$Smokers..mmHg,na.rm=TRUE)
d4<-density(Log2data$Non.smokers..mmHg,na.rm=TRUE)
plot(d3)
plot(d4)
Image in a Jupyter notebook
Image in a Jupyter notebook
Ct_data <- read.csv("data_wk3/Ct_data.csv", header = TRUE)
Ct_data
        V1     V2     V3
1   40.000 18.010 23.684
2   10.689 14.455 21.211
3   26.276 13.791 21.652
4   14.877 20.407 26.446
5   23.972 22.347 22.231
6   13.871 19.857 23.965
7   32.987 18.636 20.707
8   24.659 40.000 32.147
9   18.903 15.492 23.371
10  40.000 40.000 18.621
11  23.597 18.811 21.903
12  22.536 21.954 29.702
13  23.239 27.539 27.370
14  11.345 18.919 28.319
15  28.168 22.384 40.000
16  22.507 14.979 40.000
17  18.622 13.154 25.924
18  27.852 40.000 20.260
19  21.913 19.899 19.860
20  21.913 26.150 26.397
21  25.456 19.578 25.362
22  25.412 25.792 32.671
23  40.000 24.537 22.742
24  17.877 15.436 11.149
25  21.938 17.142 25.800
26  40.000 16.928 19.172
27  28.105 15.298 19.676
28  22.079 19.844 25.000
29  23.322 24.323 23.793
30  40.000 18.642 24.078
⋮        ⋮      ⋮      ⋮
355 22.742 13.624 21.716
356 20.692 23.067 25.817
357 24.750 19.765 19.754
358 11.169 25.604 27.911
359 22.176 18.117 21.548
360 27.176 12.619 26.140
361 22.629 15.042 25.895
362 14.410 15.413 32.835
363 22.785 18.530  6.086
364 21.275 24.323 31.666
365 23.894 24.175 27.954
366 40.000 17.674 27.258
367 28.258 20.582 26.690
368 19.763 40.000 29.894
369 18.021 15.436 23.036
370 16.521 22.898 26.654
371 20.875 23.815 26.535
372 20.720 40.000 24.257
373 25.790 40.000 24.218
374 40.000 23.845 25.978
375 28.545 22.450 22.175
376 24.849 31.626 40.000
377 35.597 40.000 23.158
378 31.825 17.283 21.848
379 19.859 16.554 20.880
380 40.000 23.385 24.445
381 24.254 17.535 25.583
382 24.685 10.607 25.988
383 26.940 23.651 22.153
384 23.920 31.635 40.000
targetNames <- read.csv("data_wk3/targetNames.csv", header = TRUE)
targetNames
    RPLPO      NANOG            MEIS1
1   RNA SPIKE  OCT3-4           SOX2
2   B-ACTIN    PAX6             ATOH1
3   PAX2       BRN3A            BRN3C
4   PAX8       NESTIN           MYOSIN7A
5   FOXG1      SIX1             SOX9
6   DLX5       GATA3            SYP
7   GFP        RNA SPIKE H10_1  RNA SPIKE H10
8   RPLPO      NANOG            MEIS1
9   RNA SPIKE  OCT3-4           SOX2
10  B-ACTIN    PAX6             ATOH1
11  PAX2       BRN3A            BRN3C
12  PAX8       NESTIN           MYOSIN7A
13  FOXG1      SIX1             SOX9
14  DLX5       GATA3            SYP
15  GFP        RNA SPIKE H10_1  RNA SPIKE H10
16  RPLPO      NANOG            MEIS1
17  RNA SPIKE  OCT3-4           SOX2
18  B-ACTIN    PAX6             ATOH1
19  PAX2       BRN3A            BRN3C
20  PAX8       NESTIN           MYOSIN7A
21  FOXG1      SIX1             SOX9
22  DLX5       GATA3            SYP
23  GFP        RNA SPIKE H10_1  RNA SPIKE H10
24  RPLPO      NANOG            MEIS1
25  RNA SPIKE  OCT3-4           SOX2
26  B-ACTIN    PAX6             ATOH1
27  PAX2       BRN3A            BRN3C
28  PAX8       NESTIN           MYOSIN7A
29  FOXG1      SIX1             SOX9
30  DLX5       GATA3            SYP
⋮   ⋮          ⋮                ⋮
354 B-ACTIN    PAX6             ATOH1
355 PAX2       BRN3A            BRN3C
356 PAX8       NESTIN           MYOSIN7A
357 FOXG1      SIX1             SOX9
358 DLX5       GATA3            SYP
359 GFP        RNA SPIKE H10_1  RNA SPIKE H10
360 RPLPO      NANOG            MEIS1
361 RNA SPIKE  OCT3-4           SOX2
362 B-ACTIN    PAX6             ATOH1
363 PAX2       BRN3A            BRN3C
364 PAX8       NESTIN           MYOSIN7A
365 FOXG1      SIX1             SOX9
366 DLX5       GATA3            SYP
367 GFP        RNA SPIKE H10_1  RNA SPIKE H10
368 RPLPO      NANOG            MEIS1
369 RNA SPIKE  OCT3-4           SOX2
370 B-ACTIN    PAX6             ATOH1
371 PAX2       BRN3A            BRN3C
372 PAX8       NESTIN           MYOSIN7A
373 FOXG1      SIX1             SOX9
374 DLX5       GATA3            SYP
375 GFP        RNA SPIKE H10_1  RNA SPIKE H10
376 RPLPO      NANOG            MEIS1
377 RNA SPIKE  OCT3-4           SOX2
378 B-ACTIN    PAX6             ATOH1
379 PAX2       BRN3A            BRN3C
380 PAX8       NESTIN           MYOSIN7A
381 FOXG1      SIX1             SOX9
382 DLX5       GATA3            SYP
383 GFP        RNA SPIKE H10_1  RNA SPIKE H10
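The targetNames listing ends at row 383 while Ct_data ends at row 384, and the column names RPLPO, NANOG and MEIS1 also appear as regular entries further down. One possible explanation, which is an assumption and not confirmed by the source, is that targetNames.csv has no header line, so header = TRUE uses the first record as column names. The sketch below shows that effect on made-up text; `txt`, `with_header` and `without_header` are illustrative names.

```r
# Made-up file contents without a header line (illustrative only)
txt <- "RPLPO,NANOG,MEIS1\nB-ACTIN,PAX6,ATOH1\nPAX2,BRN3A,BRN3C"

with_header    <- read.csv(text = txt, header = TRUE)
without_header <- read.csv(text = txt, header = FALSE)

nrow(with_header)      # the first record was used as column names
nrow(without_header)   # every record is kept as data
names(without_header)  # default column names are generated
```

If that is the case for the real file, reading it with header = FALSE would keep all records and make the row count match Ct_data.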