Project: Jingyi Xie - Autumn2016/BMS353

Path: Autumn2016 / solutions / week4 / Week4_practical_solutions.ipynb

Views: ⁴⁰³¹

Kernel: R (R-Project)

Week 4 Practical notebook - Solutions

This notebook contains guidance on how

to load packages from Bioconductor and access the vignettes
how to load data from preprocessed data objects within bioconductor.
Analysis of gene expression data with different methods and normalisation techniques
Basic use of limma for Differential Expression Anaysis

These are just suggested solutions, you can find different ones and still correct, so please refer to them as a guidance and as suggestions.

In [1]:

options(jupyter.plot_mimetypes ='image/png')

Loading Packages

In Bioconductor thare are many packages that are normally installed on local R versions with the function biocLite(). On Sage Math Cloud this function has not been fully tested for use with our accounts, therefore we have many of them preinstalled. They can be activated using library() as you would normally do with any other R package.

Bioconductor packages may depend on other packages, which means that they can only work if the others are activated. As default behaviour Bioconductor automatically activates all dependencies, but it might happen that you need to do it manually with the command library(). To start our practical we need to load the basic Bioconductor package affy.

Exercise 1: Familiarise with the Bioconductor web portal. Load affy and check that the you have mas5() and rma() using the help function. In a markdown cell explain what are the objects that they require and what is the output ( Tip look out for Affybatch, ExpressionSet...)

In [2]:

library(affy)

Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, xtabs

The following objects are masked from ‘package:base’:

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

mas5() is the MAS 5.0 Expression measure function, which converts an Affybatch onto an Expressionset using Affymetrix's MAS 5.0 gene expression estimation method. This method is a single point statistic method.

syntax: mas5(object, normalize = TRUE, sc=500, analysis="absolute")

where:

object = an instance of AffyBatch
normalize = TRUE/FALSE - if TRUE scale normalization is used after the ExpressionSet is obtained.
sc = Scaling value for the arrays
analysis = absolute or comparison analsis

In [3]:

help(mas5)
help(rma)

rma() is the function that implements the Robust Multiarray Average method. It converts an AffyBatch object into an ExpressionSet data using the RMA expression algorithm.

syntax: rma(object, subset=NULL, verbose=TRUE, destructive=TRUE, normalize=TRUE, background=TRUE, bgversion=2, ...)

where:

object an AffyBatch object.
subset a character vector with the the names of the probesets to be used in expression calculation.
verbose logical value. If TRUE, it writes out some messages indicating progress. If FALSE nothing should be printed.
destructive logical value. If TRUE, works on the PM matrix in place as much as possible, good for large datasets.
normalize logical value. If TRUE, normalize data using quantile normalization.
background logical value. If TRUE, background correct using RMA background correction.
bgversion integer value indicating which RMA background to use 1: use background similar to pure R rma background given in affy version 1.0 - 1.0.2 2: use background similar to pure R rma background given in affy version 1.1 and above

Processing array data with Bioconductor

There is a vast amount of data in Bioconductor, raw or preprocessed, that we can be used for some of the exercises below. this notebook is the backbone for the final project notebook, in which we will be processing data from the raw data files called .CEL files. This can take some processing time for this reason here we will use preprocessed data.

To load preprocessed data we can use:

the command data() if the objects are already presnt in the package;
the command library() with the name of data package we want to use;
the command load("fileName.RDA") to upload preprocessed data from a previously saved workspace.

Exercise 2: Set your working directory to data_w4, or define a path variable that contains the path to that data folder. Ensure that you know where you are using getwd() and setwd() commands.

In [4]:

getwd()

[1] "/projects/ddda6a8e-2bca-47f5-b1d6-79b2c48d0e30/Autumn2016/solutions/week4"

In [5]:

setwd("~/Autumn2016/Week4/data_wk4") # Setting the working directory to be data_wk4

In [6]:

getwd() # Confirming that the working directory is data_wk4

[1] "/projects/ddda6a8e-2bca-47f5-b1d6-79b2c48d0e30/Autumn2016/Week4/data_wk4"

Gene Expression Analysis with Bioconductor Workflow

Step 1: build the data set of your raw values into an object called AffyBatch
Step 2: process the affybatch with selected methods
Step 3: Normalise the data if needed (some methods have this step included)
Step 4: Visualise data
Step 5: High level analysis : DE analysis, clustering, PCA ..
Step 6: Selection of final targets

This workbook covers Step 1-4

In data_wk4 there is a file called Dilution.RDA. It conatins an affybatch of a benchmark data set that was used to test methods for affymetrix gene epression estimation. The data consists of two sources of mRNA, human liver tissue and a central nervous system cell line (CNS), were hybridized to human geneChip arrays (HG-U95A) in a range of different dilutions and proportions.

Exercise 3: Load the data from Dilution.RDA and check if it is loaded with the command ls(). The dilution data is already in the form of an affybatch. Use the command show(Dilution) to check the features. Use a markdown cell to describe it.

In [7]:

load("Dilution.RDA") # Loading the Dilution.RDA file into the workspace
ls() # listing the data that is loaded to confirm that Dilution.RDA file has loaded
show(Dilution) # Checking the features of the AffyBatch object, returns the size of arrays (640x640) etc.

[1] "Dilution"   "mapCdfName"

Warning message:
“replacing previous import ‘stats::mad’ by ‘BiocGenerics::mad’ when loading ‘IRanges’”Warning message:
“replacing previous import ‘stats::IQR’ by ‘BiocGenerics::IQR’ when loading ‘IRanges’”Warning message:
“multiple methods tables found for ‘active’”Warning message:
“multiple methods tables found for ‘active<-’”Warning message:
“multiple methods tables found for ‘evalSeparately’”Warning message:
“multiple methods tables found for ‘subsetByFilter’”Warning message:
“multiple methods tables found for ‘params’”Warning message:
“multiple methods tables found for ‘filterRules’”Warning message:
“multiple methods tables found for ‘mad’”Warning message:
“multiple methods tables found for ‘IQR’”Warning message:
“multiple methods tables found for ‘unlist’”Warning message:
“multiple methods tables found for ‘rep.int’”Warning message:
“replacing previous import ‘BiocGenerics::mad’ by ‘IRanges::mad’ when loading ‘AnnotationDbi’”Warning message:
“replacing previous import ‘S4Vectors::filterRules’ by ‘IRanges::filterRules’ when loading ‘AnnotationDbi’”Warning message:
“replacing previous import ‘S4Vectors::.__C__FilterRules’ by ‘IRanges::.__C__FilterRules’ when loading ‘AnnotationDbi’”Warning message:
“replacing previous import ‘S4Vectors::params’ by ‘IRanges::params’ when loading ‘AnnotationDbi’”Warning message:
“replacing previous import ‘S4Vectors::evalSeparately’ by ‘IRanges::evalSeparately’ when loading ‘AnnotationDbi’”Warning message:
“replacing previous import ‘S4Vectors::FilterRules’ by ‘IRanges::FilterRules’ when loading ‘AnnotationDbi’”Warning message:
“replacing previous import ‘S4Vectors::active<-’ by ‘IRanges::active<-’ when loading ‘AnnotationDbi’”Warning message:
“replacing previous import ‘S4Vectors::.__C__expressionORfunction’ by ‘IRanges::.__C__expressionORfunction’ when loading ‘AnnotationDbi’”Warning message:
“replacing previous import ‘S4Vectors::.__C__FilterMatrix’ by ‘IRanges::.__C__FilterMatrix’ when loading ‘AnnotationDbi’”Warning message:
“replacing previous import ‘BiocGenerics::IQR’ by ‘IRanges::IQR’ when loading ‘AnnotationDbi’”Warning message:
“replacing previous import ‘S4Vectors::FilterMatrix’ by ‘IRanges::FilterMatrix’ when loading ‘AnnotationDbi’”Warning message:
“replacing previous import ‘S4Vectors::subsetByFilter’ by ‘IRanges::subsetByFilter’ when loading ‘AnnotationDbi’”Warning message:
“replacing previous import ‘S4Vectors::active’ by ‘IRanges::active’ when loading ‘AnnotationDbi’”Warning message:
“replacing previous import ‘AnnotationDbi::tail’ by ‘utils::tail’ when loading ‘hgu95av2cdf’”Warning message:
“replacing previous import ‘AnnotationDbi::head’ by ‘utils::head’ when loading ‘hgu95av2cdf’”

AffyBatch object
size of arrays=640x640 features (35221 kb)
cdf=HG_U95Av2 (12625 affyids)
number of samples=4
number of genes=12625
annotation=hgu95av2
notes=

The AffyBatch object contains 640x640 arrays 35221kb in size

The number of samples is 4

Number of genes is 12,625

Gene expression estimation with RMA and MAS5

When the data is processed using one of the methods we discussed (RMA, puma, MAS5, etc.), Bioconductor creates a new object called ExpressionSet. This object contains the expression values but also the phenotypic data, descriptions, annotations, sample names etc...

Exercise 4: Activate the package affyand create two different types of ExpressionSet, using rma() and mas5(). For example, you can do so with the assignment eset_rma<-rma(Dilution). Use the command show() to describe what you have obtained.

To extract gene expression values from the eset, you need the command exprs() that must be used with an ExpressionSet. For example: e_rma<-exprs(eset_rma).

In [8]:

library(affy) # loading the package affy into the workspace
eset_rma<-rma(Dilution) # Assigning the expression set from the rma function to a variable

Background correcting
Normalizing
Calculating Expression

In [9]:

eset_mas5<-mas5(Dilution) # Assigning the expression set from the mas5 function to a variable

background correction: mas 
PM/MM correction : mas 
expression values: mas 
background correcting...done.
12625 ids to be processed
|                    |
|####################|

In [10]:

eset_rma # checking the contents of the variable containing the expression set calculated using the rma function. This returns the same as show(eset_rma) command.

ExpressionSet (storageMode: lockedEnvironment)
assayData: 12625 features, 4 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: 20A 20B 10A 10B
  varLabels: liver sn19 scanner
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation: hgu95av2 

In [11]:

eset_mas5 # checking the contents of the variable containing the expression set calculated using the mas function. This returns the same as show(eset_mas) command.

ExpressionSet (storageMode: lockedEnvironment)
assayData: 12625 features, 4 samples 
  element names: exprs, se.exprs 
protocolData: none
phenoData
  sampleNames: 20A 20B 10A 10B
  varLabels: liver sn19 scanner
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation: hgu95av2 

In [12]:

e_rma<-exprs(eset_rma)
e_mas5<-exprs(eset_mas5)

In [13]:

identical(eset_mas5,eset_rma) # showing that the two functions have returned different expression sets, this is because the functions are using two different methods for analysing the same data

[1] FALSE

In [14]:

identical(e_rma,e_mas5) # showing that the two functions have returned different gene expression values

[1] FALSE

Exercise 5: Extract the expression values for mas5 and rma and do a first diagnostic on the data using density() and boxplot(). Use a markdown cell to explain what you have found. (Tip: Do you need to transform the mas5 data? Remember log2() is the transformation we use). Visualise any manipulation that you do to the data.

In [15]:

density(e_rma) # performing diagnostic on the gene expression data obtained using rma
density(e_mas5) # performing diagnostic on the gene expression data obtained using mas5

Call:
	density.default(x = e_rma)

Data: e_rma (50500 obs.);	Bandwidth 'bw' = 0.2

       x                y            
 Min.   : 1.620   Min.   :1.210e-06  
 1st Qu.: 4.737   1st Qu.:7.066e-03  
 Median : 7.855   Median :6.411e-02  
 Mean   : 7.855   Mean   :8.012e-02  
 3rd Qu.:10.972   3rd Qu.:1.483e-01  
 Max.   :14.089   Max.   :2.005e-01  

Call:
	density.default(x = e_mas5)

Data: e_mas5 (50500 obs.);	Bandwidth 'bw' = 40.5

       x                 y            
 Min.   : -121.1   Min.   :0.000e+00  
 1st Qu.:10747.9   1st Qu.:8.090e-08  
 Median :21616.9   Median :4.438e-07  
 Mean   :21616.9   Mean   :2.353e-05  
 3rd Qu.:32486.0   3rd Qu.:1.346e-06  
 Max.   :43355.0   Max.   :2.692e-03  

In [16]:

boxplot(e_mas5)

MIME type unknown not supported

Boxplot before the data is transformed - this box plot is difficult to red since the outliers have very large values.The transformation of the data helps to change the scale and make plots readable.

In [17]:

e_mas5<-log2(e_mas5) # Transforming the e_mas5

In [18]:

density(e_mas5)

Call:
	density.default(x = e_mas5)

Data: e_mas5 (50500 obs.);	Bandwidth 'bw' = 0.2614

       x                y            
 Min.   :-2.205   Min.   :3.500e-07  
 1st Qu.: 2.393   1st Qu.:4.883e-03  
 Median : 6.990   Median :3.176e-02  
 Mean   : 6.990   Mean   :5.433e-02  
 3rd Qu.:11.587   3rd Qu.:1.061e-01  
 Max.   :16.184   Max.   :1.531e-01  

In [19]:

boxplot(e_mas5, ylab="Gene Expression levels", xlab="Sample Names")

MIME type unknown not supported

boxplot for gene expression values calculated using the mas5 function: Medians are alligned and the datat looks normalised.

In [20]:

boxplot(e_rma, ylab="Gene Expression levels", xlab="Sample Names")

MIME type unknown not supported

Boxplot for gene expression values calculated via the rma() function: Bulk of gene expression values lies a lot lower on the y axis - Medians are alligned and there are no negative outliers.

** Gene Expression estimation with probabilistic models** You can evaluate the gene expression within puma package using mmgmos().

Exercise 6: Activate the puma package and create an eset from the Dilution data. For example using: e_puma<-exprs(eset_puma), it will take a little longer than the others so be patient. Apply the same diagnostics of Exercise 5. Explain in a markdown cell what you have found

In [21]:

library(puma) # Activating the puma package

Error in library(puma): there is no package called ‘puma’
Traceback:
1. library(puma)
2. stop(txt, domain = NA)

In [ ]:

eset_puma<-mmgmos(Dilution) # evaluating the gene expression of Dilution data within puma
e_puma<-exprs(eset_puma) # creating an expression set from the dilution data
e_puma

In [ ]:

plot(density(e_puma), main="Dilution data from puma", xlab="bins", ylab="Density") # plotting the density of the gene expression

The density plots helps o identify any irregularity in thr data, especially if there are more modes, which in this particular case we don't expect to be there. Data looks fine and it is already log2 transformed.

Exercise 7: Are there negative values in the experssions? Why is that? Set all the negative expressions to zero. First extract the expression from the eset and check the dim() of your data frame. Compare the three methods and explain what you found. Is the data Normalised?

Tip: You can use a for loop on the columns and set to zero the negative elements one column at the time. to do so there are many way one can be

temp<-test[,i]
temp[temp < 0] <- 0
test[,i]<-temp

Be inventive!

The negatives values are due to the fact that there are a set of gene expression estimates that have values between 0 and 1. This means that when log2 tranformed they become negative. The genes that have these gene expression values are not expressed in the sample.

In [ ]:

for (i in 1:4) {
temp<-e_puma[,i]
temp[temp<0] <- 0
e_puma[,i] <- temp
}

Estrogen Data

We now use data loaded in pumadata to look at the effect of a different type of experimental design called time series. In this data we want to see the effect of presence or absence of estrogene at 10hr or 48hr in a cell culture.

Activate the pumadata package and use data(affybatch.estrogen) to upload the data into the workspace. We can add phenotypic data to describe the affybatch as follows. You can check the results using the command pData()

In [ ]:

library(pumadata) # activating the pumadata package
data(affybatch.estrogen) # uploading the data into the workspace

In [ ]:

pData(affybatch.estrogen) <- data.frame(
"estrogen"=c("absent","absent","present","present" #adding phenotypic data
,"absent","absent","present","present")
, "time.h"=c("10","10","10","10","48","48","48","48")
, row.names=rownames(pData(affybatch.estrogen)))
pData(affybatch.estrogen) # checking the results of above

You can estimate gene expression levels with

eset_estrogen_mmgmos <- mmgmos(affybatch.estrogen, gsnorm="none")
eset_estrogen_rma <- rma(affybatch.estrogen)

but if you have lack of time in the practical you can load the data already processed

data(eset_estrogen_mmgmos) 
data(eset_estrogen_rma)

Exercise 8: Use boxplot to visualise the two esets. What do you find? (Tip: If you need to normalise the data use eset_estrogen_mmgmos_normd <- pumaNormalize(eset_estrogen_mmgmos)) Why is it important to normalise? Answer: allows the data to be corrected and remove systematic bias

In [ ]:

data(eset_estrogen_mmgmos) #loading the expression set for puma into the workspace
data(eset_estrogen_rma) #loading the expression set for rma into the workspace
eset_estrogen_mmgmos_normd <- pumaNormalize(eset_estrogen_mmgmos)

In [ ]:

eset_rma<-exprs(eset_estrogen_rma)
eset_mmgmos<-exprs(eset_estrogen_mmgmos)

In [ ]:

names(pData(affybatch.estrogen))

In [ ]:

boxplot(eset_rma, main="Estrogen data from RMA", xlab="Samples", ylab="expression")

In [ ]:

par(mfrow=c(1,2))
boxplot(eset_mmgmos,main="Estrogen data from puma", xlab="Samples", ylab="expression")
boxplot(exprs(eset_estrogen_mmgmos_normd),main="Normalised data from puma", xlab="Samples", ylab="expression")

Write results

To write results from puma eset we use write.reslts(eset_estrogen_mmgmos, file="eset_estrogen")

To write results from RMA we use write.csv(exprs(eset_estrogen_rma),"rma_data.csv")

We will not need to write the full table of results for the final project, this is just for your information.

DE Analysis

To perform Differential expression we need to summarise the replicated data, so that we have one values set for each condition. In pumawe can do this using

eset_estrogen_comb <- pumaComb(eset_estrogen_mmgmos_normd)

this might take a long time (hrs) so for this practical we use preprocessed data

In [ ]:

data(eset_estrogen_comb)

show(eset_estrogen_comb)

Exercise 9: Check the dimension and the pData() for the eset of the combined values. Calculate the FC and plot the data with a MA plot using the command ma.plot()

In [ ]:

pData(eset_estrogen_comb)

In [ ]:

dim(eset_estrogen_comb)

Before calculating the FC, we set the negative gene expression values to zero to avoid problems.

In [ ]:

estrog_comb_puma<-exprs(eset_estrogen_comb)


for (i in 1:4) {
temp<-estrog_comb_puma[,i]
temp[temp<0] <- 0
estrog_comb_puma[,i] <- temp
}

In [ ]:

fc<- estrog_comb_puma[,1:2] - estrog_comb_puma[,3:4] # calculating the fold change

colnames(fc)<-c("A-P 10H", "A-P 48H")

** Exercise 10**: Do the same for eset_estrogen_RMA, using the mean to combine the values.

In [ ]:

eset_comb_rma<-matrix(0,nrow=dim(eset_rma)[1], ncol=dim(eset_rma)[2])
eset_comb_rma[,1]<-mean(eset_rma[,1:2])
eset_comb_rma[,2]<-mean(eset_rma[,3:4])
eset_comb_rma[,3]<-mean(eset_rma[,5:6])
eset_comb_rma[,4]<-mean(eset_rma[,7:8])
rownames(eset_comb_rma)<-rownames(eset_estrogen_rma)



# calculating the FC
fc_rma<- eset_comb_rma[,1:2] - eset_comb_rma[,3:4] # calculating the fold change
colnames(fc_rma)<-c("A-P 10H", "A-P 48H")

Visualise the FC with MA plot

In [ ]:

groups<-c("A10","P10","A48","P48")
estrogen_table<-data.frame(sampleNames(eset_estrogen_comb), groups)
group10hr<-factor(groups[1:2])
group48hr<-factor(groups[3:4])

group10hr

In [ ]:

estrogen_table

In [ ]:

par(mfrow=c(2,2))
MAplot(eset_estrogen_comb) # MA plot of the full eset

First Steps with limma

To start using limma activate the package as normal using library(limma). The limma user guide is very useful, that you can access in the additional reading.

Three steps are required for limma:

Step 1: build the design contrast matrix
Step 2: fit the linear model
Step 3: calculate the p-values and FDRs with a empirical Bayes test

To simplify Step 1 we can make use of two functions that are in puma package.

design_rma<-createDesignMatrix(eset_estrogen_rma) # create design matrix
contrast_rma<-createContrastMatrix(eset_estrogen_rma) # create constrast

In [ ]:

library(limma) # loading the limma package

In [ ]:

design_rma<-createDesignMatrix(eset_estrogen_rma) # creating the design matrix
contrast_rma<-createContrastMatrix(eset_estrogen_rma) # creating contrast matrix

design_rma
contrast_rma

The contrast matrix describes all the possible comaprisons, that you can have in the data, where is 0 the sample is ignored. Where there is 1 that sample gooes to the numerator of the FC ratio. Clearly the sample with -1 goes to the denominator. So for example the first column the comparison will be present.10/absent.10 or in log2 scale present.10 - absent.10.

Step 2: fit the linear model to the design that we have created.

In [ ]:

fit_rma<-lmFit(eset_estrogen_rma,design_rma)

Step 3: calculate the p-values from the linear model we fitted.

In [ ]:

fit_rma2 <- contrasts.fit(fit_rma, contrast_rma)
fit_rma2 <- eBayes(fit_rma2)

In [ ]:

contrasts<-colnames(contrast_rma) # check which is the comparison of interest by storing them in an array
contrasts

colnames(contrast_rma)<-c("P10vsA10","A48vsA10","P48vsP10","P48vsA48","E_AvsE_P","10hva48h","E_vsTime")

You can rename the columns of the constrat matrix to have shorter names.

** Identify DE genes**

You can access the field of that objects using $. For example if you want the p.values you can use fit_rma2$p.value. You can plot histgrams of them for the constrast of interest. For example hist(fit_rma2$p.value[,1]) for constrast 1 (present.10_vs_absent.10). Finally you can access the most different genes in that constrast using topTable()

In [ ]:

pval<-fit_rma2$p.value

In [ ]:

DEgenes<-topTable(fit_rma2,coef=1, adjust.method="BH") # identify genes that are DE

# Builds a vennDiagram to see which genes constrasts have in common
results_rma_1<-decideTests(fit_rma2, method="global") 
colnames(results_rma_1)<-colnames(contrast_rma)

vennDiagram(results_rma_1[,c(1,4)]) # vennDiagram for constrast 1 and 4

Exercise 11: Repeat the limma analysis for eset_estrogen_mmgmos. How do the results compare? mmgmos has a lot more DE expressed genes detected (more sensitive?) and a lot more overlapping as well.

In [ ]:

design_mmgmos<-createDesignMatrix(eset_estrogen_mmgmos) # creating the design matrix
contrast_mmgmos<-createContrastMatrix(eset_estrogen_mmgmos) # creating contrast matrix

In [ ]:

fit_mmgmos<-lmFit(eset_estrogen_mmgmos,design_mmgmos)

In [ ]:

fit_mmgmos2 <- contrasts.fit(fit_mmgmos, contrast_mmgmos)
fit_mmgmos2 <- eBayes(fit_mmgmos2)

In [ ]:

DEgenes<-topTable(fit_mmgmos2,coef=1, adjust.method="BH") # identify genes that are DE

# Builds a vennDiagram to see which genes constrasts have in common
results_mmgmos_1<-decideTests(fit_mmgmos2, method="global") 
colnames(results_mmgmos_1)<-colnames(contrast_rma)
vennDiagram(results_mmgmos_1[,c(1,4)]) # vennDiagram for constrast 1 and 4

You can see from the Venn diagram that puma is able to identify many more DE genes and therefore it is right to say that this method is more sensitive than RMA.

In [ ]: