Gene Set Enrichment

Outline Hypergeometric Testing Simple GSEA using Z-score and Permutation GSEA using Linear Models

Gene Set Enrichment Analysis

Chao-Jen Wong

Fred Hutchinson Cancer Research Center

January 28, 2010

1 / 29

1 Hypergeometric Testing

2 Simple GSEA using Z-score and Permutation

3 GSEA using Linear Models

2 / 29

Gene set enrichment analysis

Unlike per-gene analysis ...

Search for categories where the constituent genes show changes inexpression level over the experimental conditions.

Use predefined gene set such as KEGG pathways, GO classifications,chromosome bands, and protein complexes.

No need to make a cutoff between genes that are differentiallyexpressed and those that are not.

Provided in the GESABase, Category , GOstats and topGO.

3 / 29

Outline

4 / 29

Hypergeometric testing

Basic concept: Suppose there are N balls in an urn, n are white andm are black. Drawing k balls out of the urn without replacement,how many black balls do we expect to get? What is the probabilityof getting x black balls?

Hypergeometric testing for under- and over-representation of GOterms.

Inputs

1 Gene universe, N.

2 GO categories (categorize genes by GO terms).

3 A list of interesting genes, I , (differentially expressed genes

identified by limma or just simply t-test by rowttests).

5 / 29

Interesting (Black) Not (White)In GO term n11 n12 K

Not in GO term n21 n22 N − KI N − I N

Suppose there are j interesting genes in the GO term (n11 = j),compute

1 Probability of seeing j or more black balls in K draws.

2 Expected number of black balls seeing in K draws.

6 / 29

Data preparation

Define gene universe (a vector of Entrez Gene IDs).

Select a list of interesting genes (a vector of Entrez Gene ID).

Code: gene selection via t-test

> library(genefilter)> library(day2)> library(hgu95av2.db)> data(ALLfilt_bcrneg)> ttests <- rowttests(ALLfilt_bcrneg, "mol.biol")> ## select interesting genes> smPV <- ttests[ttests$p.value < 0.005, ]> selectedEntrezIds <- unlist(mget(rownames(smPV),+ hgu95av2ENTREZID))> entrezUniverse=unlist(mget(featureNames(ALLfilt_bcrneg),+ hgu95av2ENTREZID))

7 / 29

Data preparation

Define gene universe (a vector of Entrez Gene IDs).

Select a list of interesting genes (a vector of Entrez Gene ID).

Code: gene selection via t-test

> library(genefilter)> library(day2)> library(hgu95av2.db)> data(ALLfilt_bcrneg)> ttests <- rowttests(ALLfilt_bcrneg, "mol.biol")> ## select interesting genes> smPV <- ttests[ttests$p.value < 0.005, ]> selectedEntrezIds <- unlist(mget(rownames(smPV),+ hgu95av2ENTREZID))> entrezUniverse=unlist(mget(featureNames(ALLfilt_bcrneg),+ hgu95av2ENTREZID))

7 / 29

Create a GOHyperGParams object.

Code: GOHyperGParams

> library(GOstats)> hgCutoff <- 0.001> GOparams <- new("GOHyperGParams",+ geneIds=selectedEntrezIds,+ universeGeneIds=entrezUniverse,+ annotation="hgu95av2.db",+ ontology="BP",+ pvalueCutoff=0.001,+ conditional=TRUE,+ testDirection="over")

8 / 29

Outputs and summary.

Code: hyperGTest

> hgOver <- hyperGTest(GOparams)> class(hgOver)> summary(hgOver)

Exercise: generate report using htmlReort.

> showMethods("htmlReport")> htmlReport(hgOver, file="hgResult.html")> browseURL("hgResult.hrml")

9 / 29

Outputs and summary.

Code: hyperGTest

> hgOver <- hyperGTest(GOparams)> class(hgOver)> summary(hgOver)

Exercise: generate report using htmlReort.

> showMethods("htmlReport")> htmlReport(hgOver, file="hgResult.html")> browseURL("hgResult.hrml")

9 / 29

Lab activity

1 Chapter 14: read and do the exercises in Section 14.3 and 14.4.

2 Use the topGenes dataset (load the data using data(topGenes))and find a subset of genes whose adj.P.Val are less than 0.01.

3 Repeat the conditional Hypergeometric testing to find under- andover-represented biological processes.

4 Generate html reports.

10 / 29

Outline

11 / 29

Simple GSEA

Consider two group comparison

Start with data quality assessment.

Compute per-gene t-statistics: tk for each gene k.

Null hypothesis: no difference in mean expression

Ho : ZK = 0

ZK =1�|K |

tk ∼ N (0, 1),

where K denotes the gene sets, and |K | the number of genes in thegene set.

Alternative approach: use permutation test to assess which genesets have an unusually large absolute value of zK .

12 / 29

Data preparation

ALLfill bcrneg> library(ALL)> library(hgu95av2.db)> data(ALL)> bcell <- grep("^B", as.character(ALL$BT))> types <- c("NEG", "BCR/ABL")> moltyp <- which(as.character(ALL$mol.biol) %in% types)> # subsetting> ALL_bcrneg <- ALL[, intersect(bcell, moltyp)]> ALL_bcrneg$BT <- factor(ALL_bcrneg$BT)> ALL_bcrneg$mol.biol <- factor(ALL_bcrneg$mol.biol)> # nonspecific filter: remove genes that does not> ## show much variation across samples> library(genefilter)> filt_bcrneg <- nsFilter(ALL_bcrneg,+ var.cutoff=0.5)> ALLfilt_bcrneg <- filt_bcrneg$eset

13 / 29

Using KEGG

Data representation: create an incidence matrix Am where aij = 1 ifgene j is in gene set i and aij = 0 otherwise.

> library(KEGG.db)> library(GSEABase)> gsc <- GeneSetCollection(ALLfilt_bcrneg,+ setType=KEGGCollection())> Am <- incidence(gsc)

ExpressionSet object retains only those features that are in theincidence matrix Am.

> nsF <- ALLfilt_bcrneg[colnames(Am), ]

14 / 29

Using KEGG

Exercise

1 How many gene sets and how many genes are represented by theincidence matrix Am?

2 How many gene sets have fewer than ten genes in them?

3 What is the largest number of gene sets in which a gene can befound?

4 What is the name of this gene set? (use KEGGPATHID2NAME)

Code> dim(nsF)> dim(Am)> nGene <- rowSums(Am)> rownames(Am)[nGene < 10]> sort(nGene, decreasing=TRUE)[1]> KEGGPATHID2NAME[["05200"]]

15 / 29

Using KEGG

Exercise

1 How many gene sets and how many genes are represented by theincidence matrix Am?

2 How many gene sets have fewer than ten genes in them?

3 What is the largest number of gene sets in which a gene can befound?

4 What is the name of this gene set? (use KEGGPATHID2NAME)

Code> dim(nsF)> dim(Am)> nGene <- rowSums(Am)> rownames(Am)[nGene < 10]> sort(nGene, decreasing=TRUE)[1]> KEGGPATHID2NAME[["05200"]]

15 / 29

Using KEGG

Compute the per-gene test statistics using the rowttests function.

> rtt <- rowttests(nsF, "mol.biol")> names(rtt)

[1] "statistic" "dm" "p.value"

> rttStats <- rtt$statistic

Reduce the incidence matrix by removing all gene sets that havefewer than ten genes in them.

> selectedRows <- (rowSums(Am) > 10)> Am2 <- Am[selectedRows, ]

Compute zk for each pathway: zK = 1√|K |

�k∈K tk .

> tA <- as.vector(Am2 %*% rttStats)> tAadj <- tA /sqrt(rowSums(Am2))> names(tAadj) <- rownames(Am2)

16 / 29

Using KEGG

�k∈K tk .

16 / 29

Using KEGG

�k∈K tk .

16 / 29

Using KEGG

Exercise

1 Which pathways have remarkably low (< 5) and high aggregatestatistics (> 5)?

2 What is the name the pathway that has the lowest zk score?

3 Use KEGG2heatmap to plot a heatmap for the genes in thispathway.

Code> smPW <- tAadj[tAadj < -5]> mget(names(smPW),KEGGPATHID2NAME)> lgPW <- tAadj[tAadj > 5]> mget(names(lgPW), KEGGPATHID2NAME)

17 / 29

Using KEGG

Exercise

1 Which pathways have remarkably low (< 5) and high aggregatestatistics (> 5)?

2 What is the name the pathway that has the lowest zk score?

3 Use KEGG2heatmap to plot a heatmap for the genes in thispathway.

Code> smPW <- tAadj[tAadj < -5]> mget(names(smPW),KEGGPATHID2NAME)> lgPW <- tAadj[tAadj > 5]> mget(names(lgPW), KEGGPATHID2NAME)

17 / 29

KEGG2heatmap

> KEGG2heatmap("03010", nsF, "hgu95av2")

31948_at33661_at41214_at34316_at32394_s_at1151_at39856_at39916_r_at41765_at31511_at34570_at31907_at33485_at36786_at33668_at31546_at31955_at32330_at34085_at31956_f_at33619_at32440_at36333_at35125_at31509_at32315_at33674_at31505_at34592_at32437_at36358_at33117_r_at1653_at

18 / 29

Permutation testing

Assess the significant gene sets with respect to a referencedistribution build by a number of permutations.

gseattperm: permute the sample labels.

Return p-value w.r.t. to a reference distribution:Lower: proportion of permutation t-statistics that were smaller than

the observed t-statisticsUpper: proportion of permutation t-statistics that were larger than

the observed t-statistics

Code: using gseattperm> library(Category)> set.seed(123)> pvals <- gseattperm(nsF, nsF$mol.biol, Am2, 1000)> pvalCut <- 0.05> lowC <- rownames(pvals)[pvals[, 1] <= pvalCut]> unlist(getPathNames(lowC), use.names=FALSE)

[1] "Glycerophospholipid metabolism"

[2] "Ribosome"

[3] "RNA polymerase"

[4] "DNA replication"

[5] "Mismatch repair"

[6] "Homologous recombination"

19 / 29

Outline

20 / 29

Chromosome bands

Use the mapping of genes to chromosome bands.

To answer whether there are anomalies in the pattern of geneexpression that related to chromosome bands.

Use GSEA linear models.

Human chromosome 12

p13.33

p13.32

p13.31

p11.23

p11.22

p11.21

q13.11

q13.12

q13.13

q21.31

q21.32

q21.33 q22

q24.11

q24.12

q24.13

q24.21

q24.22

q24.23

q24.31

q24.32

q24.33

Figure: Ideogram for human chromosome 12. The shaded bands togetherrepresent 12q21. Notice that the chromosome bands are hierarchicallynested, and they almost form a partition. (D. Sarker et. al. 2007)

Reference

”Using Categories defined by Chromosome Bands”by D. Sarker et. al.

21 / 29

Data preparation

Consider the comparison of BCR/ABL and NEG groups.

Use ALL_bcrneg object.

Use nsFilter to remove probes with no Entrez Gene ID and nomapping to a chromosome band. Ensure that each Entrez Gene IDmaps to exactly one probeset which has the highest IQR. Alsoremove probes with lack of variation (var < 0.5).

Code: nonspecific filtering

> ALLfilt <- nsFilter(ALL_bcrneg, require.entez=TRUE,+ remove.dupEntrez=TRUE,+ require.CytoBand=TRUE,+ var.func=IQR,+ var.cutoff=0.5)$eset

22 / 29

Data preparation

Consider the comparison of BCR/ABL and NEG groups.

Use ALL_bcrneg object.

Use nsFilter to remove probes with no Entrez Gene ID and nomapping to a chromosome band. Ensure that each Entrez Gene IDmaps to exactly one probeset which has the highest IQR. Alsoremove probes with lack of variation (var < 0.5).

Code: nonspecific filtering

> ALLfilt <- nsFilter(ALL_bcrneg, require.entez=TRUE,+ remove.dupEntrez=TRUE,+ require.CytoBand=TRUE,+ var.func=IQR,+ var.cutoff=0.5)$eset

22 / 29

Data preparation

Compute per-gene t-statistics using limma.

Code: moderate t-statistics> library(limma)> design <- model.matrix(~0 + ALLfilt$mol.biol)> colnames(design) <- c("BCR/ABL", "NEG")> contr <- c(1, -1)> fit1 <- lmFit(ALLfilt, design)> fit2 <- contrasts.fit(fit1, contr)> fit3 <- eBayes(fit2)> tlimma <- topTable(fit3, number=nrow(fit3),+ adjust.method="none")> ## annotation> entrezUniverse <- unlist(mget(tlimma$ID,+ hgu95av2ENTREZID))> tstats <- tlimma$t> names(tstats) <- entrezUniverse

23 / 29

Data preparation

Compute per-gene t-statistics using limma.

Code: moderate t-statistics> library(limma)> design <- model.matrix(~0 + ALLfilt$mol.biol)> colnames(design) <- c("BCR/ABL", "NEG")> contr <- c(1, -1)> fit1 <- lmFit(ALLfilt, design)> fit2 <- contrasts.fit(fit1, contr)> fit3 <- eBayes(fit2)> tlimma <- topTable(fit3, number=nrow(fit3),+ adjust.method="none")> ## annotation> entrezUniverse <- unlist(mget(tlimma$ID,+ hgu95av2ENTREZID))> tstats <- tlimma$t> names(tstats) <- entrezUniverse

23 / 29

Linear models

Per−gene t−statistics

12q21.1

−5 0 5

!!! !!

!!! ! !! !

! !! !!! !

! !! !! !!!!! !! ! !

! ! ! !!!

!!!!! !

!!! !!! !

!!!! !

!! !! !! !!!

!! !!!

!! !!!!! !

!!! ! ! !!

!!! ! !! !! ! !!! !!

! !!! !

!!!! !

! !! !!!! ! !

! !!!! !! !

!!!!!! !! ! !!

!!! ! !!

! !!! !

!! !!! !!

! !!!! ! !! !

! !! !!!!! !

!!! !!!! ! !! ! !

!! ! !!!

!! !!! ! !!! !!! !

! !! !!!!

!!! !!

! ! !!!! ! !! !! !! !

! !!! !!

!!!!!!

! !! !

!! ! !

! ! !!!

!!!!!!! !

!!! !! !!! ! !

!! !! !! !!! ! !!!

! !!!! !!! !

! !!!!! ! !!! !

!! ! !! !! !!

!! ! ! !! !

!!! !!! !

! !!! !! !! !!! ! !!! !!!!

!!!! ! !! !!

! !!! !!

! !! !!

!!!! ! !!! !

!! !!! !! !

! ! !! !! !!!!! !

!!! !!

!!!! !! !

! !!!!

!! !! ! !! !

!! !! !!!

! ! !! !!! !

!! !!! !! !! !!!! !!!! !!

! ! ! !!!

!! !!!!! !! ! !! ! !!

!! !!! !!!!

!! ! !! !!

!! ! ! !!!

!!!!! !!!! !!!

! ! !!! !!!! !!! !! !

! !! !

!!! !! ! !! !

!! !!! !

!! !!!!!!

!!!!!! !!! !

!! ! ! ! !

!!!!! !!

!!!! !!

! !! !!! ! !! !!

! !! ! !

! !! !!!! !!! !! !

! !!!! !! ! !! !!! !! !

! !!! ! !!! !!

! !! !!!

!!! ! !!!!

!!!! !!

! !!!!!!

! !! !! !! ! !!!! ! !

! !!! !! !! ! !!!!!! !!

! !! ! !!!!

!! !!!!! !!

!! ! !!

!! !!!! !

! !! !!

! ! !! !!

!!! !! ! ! !!! !! !! !

!! ! !!!

! !!!! !! !!!

!! !! ! !!! !!!

!! !! ! !!! !!

!!! ! !! !

!!! !!

!! !!!!

!! !!!! ! !!!!

! ! !! ! !!! !!! !! !! !!! !!!!!

!!! !! ! ! !

!! ! ! !!!!!!

!!! ! !! ! !!! !! !

! !! !! ! !!

!! !! !! !!!! !

!! !!! !

!!! ! !

!!!! ! !

!!!!! !!! ! !

!! !!!! ! ! !!!! ! !!!

! !! !! !! !! ! ! !!! !

!!! !! !!!! !! !!

!! !! !!!! !! !

!! !! !

! ! !! !!

! !!! ! !

!!! ! !!!!! !

!! !!!

!! ! !!

!!! ! !! !

!!!! !

! !!!!

! ! !!!

!! ! !!! ! ! !

!!! !!! ! !! !!

!!! !!!

!! !!!!

!! ! !! !! !!!

!! !! ! !

!! !!! !

! !!!! !

!!! ! !! !

! !!! ! !

!!!!! ! !!!

!! !!!!! !!!! !

!!!!!! !!!!

! !!!! !! !

! !!!!!

!!! !! !

!!! ! ! !! !

!!!!! !! !! ! !

! !!! !! !!

!! !!!!!

!!!! !

! !!!!!

!! ! !!!

! !! !!! !! !!

!! !! ! ! !

! !! ! !!!!! !

!! !!!!!!!

! !!!!!

! !!!!

! !! !!! !!

!!!! ! !! !

!!! !!! !

!! !! !

!!!!!! !

!!!! ! !

! ! ! !! ! !! !!! !! ! !! !! !! !!

!! !!!

! !!!!!!

! ! !! !! ! !!!

!!! !!! !!!! !!

!!! !! ! !!! !

!! !!! ! !! !!! ! !! !

!!!! !!

!!!! !! !!!!

! !!!!

!!! !! !

!!! !!! !

!!! ! !!!! !

!!! !! !!! !!

! !! !!!

!! !!! !!!

! ! !!!!!! !!!

!!!! !! ! !! ! !!

!!!!!!

!!!! ! ! !

! !!! ! ! !

! !! !!! !! ! !

! !! !!

!!!!!! ! !!

!!! !!

! ! !!!!!! !! !

! !! !! !! !!!

!!! !! !!!

!!!! ! !!!

!! !!! !!! !

!!!! !! !

! !!!!

!!! !!! !! !! !!

!! !!!! !!! !!

!!! ! !! !

! ! !!

!!!! !!! !

!! ! !

!! ! !!! !!

! !!! ! !! !!!!

! !!! !!! !! !!!

!! !!!!! !!!!! ! !!

!!!! !

!!! !!!! !!! ! !

!!!! !!!!

!! !! !! !!!

!!! !!!!

!!! !!

! !!! !!! !!

!! ! !

!! ! !! !!!!

! !! !! !!! ! !

!! !! !!!

! ! !!!

!! !!!

! !!!! !

!! !!!

!!!! ! !! !

!! ! !!!

!!! !! ! !!

! !!!!

!! !! !

!!!! !!

!!! !!!!

! ! ! !!!! !! !

!!! !!! !! ! !

! !! !! !!! !! !!! !

!!!! !! !!

!!!!!!! !

!!! !!

! !! !!

!!! !!!!

! !!! !

! !! !! !!!! !! !! !

!! !!!! ! ! ! !!! !! !

! ! !!! !!

! !! ! !! !! !

!!! !!! !! ! !!!! !! !! !!!!! !!

!! !!!! !

!! !! !!!

!! !!!! !!!! ! !!! !!

!!! !!!!

!!! !!! !! !!! !

! !!!! ! !

! ! ! ! !!

!! !! !

!! !!!! ! !

! !!!! !!

! !!! ! !! !!!!

! !! !! ! !!! !!

!! !! !

!!! !! ! !! !! !!!!!! !! ! !

!! !! !!

!! ! !! !!

!! !! !

! !!! !

!! ! !!!

!! !! !! !

! ! !! !! !

!!!! !! !!! !

!!! !!!!

!!! !!! !!! !! !!! !!!!! !! !

!! !! !! !

! !! !! !! !! !

!!! !! !!!! !! !!

!! !!!! !!

!! ! !!! !! !

! !!!!!!

! !! !!

! !!! ! !

! ! !!! ! !

!!!! !!!!! !!!

! ! !! !! !

!!! !!! !

!! !! !! !! !!

!! !! !

!!! !! !!

! !! !

!!! !!

! !!!! !! !! ! !

!! ! !

!! !!! ! !

!! !!! !

!!! !!!

! !! !

!!!!!!!! !! !! !!

!! !! !! !!!

!!! ! !!!! !

!! ! !! ! ! !!! !! !! !!

! !! !!!!!

!!!! !!

!!!!!!

!!! !! !! !

!! !! !

! ! !! !!

!! ! !!!

! ! ! !! ! !!!!

!! !! ! !! ! !! !

!!!! !! ! !!

! ! !! !

! !! ! !!! !

!!!! !!

!! !! !

! !!! !

!! ! !!

!!! !!! !!

!!!! ! !

!!!! ! !!

!!!! !!! ! !! !

! !! !!! !! !!!! ! !!

!!! ! !! !

!!! !!! ! !

!! ! !! !!!!! !! ! !!! ! !

!! !! !!

!!! !! !!

! ! !!! !! !!!

!!!! ! !!!

!!!! !

! !!!!!

! !!!!! ! !

!! !! !

! ! ! !!! !!

!! !! ! !!! !!

!! !!!

! !! ! !! !!!

! !!! !!

! !! !!

! !!!! !

! !! !! !!

! !! !!!

! !! ! !! !! !! !!

!!!! !! ! !! !

!!! ! !! !!

! !! ! !!

!!! ! !

! !! !!! ! !!! ! !!!! ! !

! !! !! !

!!!! !!!! ! ! !!! ! !!! ! !! !!!!!

!!! !! ! !!! ! !!! !! !!! !! !! !!! !

! ! !! !!!! !! !!

!!!! !

!! !!! !!!! !

!! ! !! !! !!!

! !! !!!! !!

! ! !! !!!

! !!!! !!

!! !!!! !! !

!! !! !!!!! !!

! !!! !! !! !!

!! !!! !

! !!!!

! !! !! !

! !! !!! !!!!

!! !!! !

! !! !!!!

! !!! !!!! !!

! !! !!! !!

!! ! !!! !!

!!! ! !!!

!! !!!!! !!

!! !!!!

! !! !!!! ! !!

!!! !!!!

!!!! !

!!!!! !!

!!! !!

!!!! !!! !! ! !

! !! !!! !!! !!

!!! !! !

!!! ! !! !!! !

! ! !! !! !!! !!

! !!!! ! ! ! !!! ! !!!

!!!!!!!!!!!

! !! !!!

!!!! !

!!!!!! !! !

!!! !!!

!! !! !!! ! !!! ! !!

!!!! ! !! !! !

!! !! ! ! !!! !!!! !

! !!!!

! !!! !

!! !!!

!!!! !!

! ! !!!!!

! !!!!! ! !!! !!!!

! !!! !! !! !! !!! ! ! !

! ! !! ! !!!

!! !! !

! !!! !!! !!! ! !

! !! ! !

!! !!!! !! !

! !! !!! !

! ! !!!!! !

! !!! !!!! !

! ! !!! !!

! !! ! !! !!!

!! !! !!! !!

! !!!!

!! !! !! !! !!

!! !!!! !!

! !! !!! !!

!!! !!!!! !

!!!!!!! !!! ! !!

! !!! ! !! !

! !!!! !

!! !! !! !! !! !! !

! !!! ! !!!

!! !!! !!!

! !!! !

! !!! ! !!

!! ! !! !!!!! !!! ! !

! !! !!! !!!

!! !!!! !! !! ! ! !

! ! !!!! !! ! !! !

!! !!!

! ! !!!! !!

!! !! !!!!! !

!! !! !

!!! ! !!

!!!! !!

!! !! ! !!! !!

!!! !!

!! !!!!

!! ! !!!

! ! !!! !!! ! !!!! !!

!! !!! !! !

! !! !! !! !! !

! !!! !

!!! ! ! !!

! !!! !! !!!!!

!!! !!

! ! !!! !!! ! ! !! !

!! !!! !

! !!!! ! !! !!!

!! ! !!

!!! ! !!

! !!! ! !!!!! !!

! !!! !! !! !!! !

!! !! !!!

!!! !!

!!! !! !

!! ! !!

!! !! !! !! !! !!!

! ! !!

!! !! !!!!

! ! !!

! !!!!!

!! ! ! !!!

!!!!! !!! !

! !!!!

! ! !! ! !!!!!

!!!! !! !!!! !! !

Fitting linear model with per-gene t-statistics: for each category j ,

yi = β0 + β1aij + εi ,

where aij = 1 if gene i is associated with category j , and 0otherwise. The index i may range over from universal genes to asubset of genes.

β1 ∼ N (0, 1)

Null hypothesisH0 : β1 = 0

24 / 29

Linear models

Create a ChrMapLinearMParams object.

Code: instance of class ChrMapLinearMParams

> library(Category)> params <- new("ChrMapLinearMParams",+ conditional=FALSE,+ testDirection="up",+ universeGeneIds=entrezUniverse,+ geneStats=tstats,+ annotation="hgu95av2",+ pvalueCutoff=0.01,+ minSize=4L)

25 / 29

Calling the linearMTest function

linearMTest: compute the p-values for detecting up- ordown-regulation of predefined gene sets.

Code: linearMTest

> lman <- linearMTest(params)

> lman

> summary(lman)

26 / 29

Exercise

1 Get familiar with the structure of ChrMapLinearMParams class?ChrMapLinearMParams or help(“ChrMapLinearMParams-class”)

2 Perform conditional GSEA linear models to find interestingchromosome bands that are up-regulated.

3 Summarize the result of the conditional test using summary.

Code: conditional test> slotNames(params)> paramsCond <- params> paramsCond@conditional <- TRUE> lmanCond <- linearMTest(paramsCond)> summary(lmanCond)

27 / 29

Exercise

1 Get familiar with the structure of ChrMapLinearMParams class?ChrMapLinearMParams or help(“ChrMapLinearMParams-class”)

2 Perform conditional GSEA linear models to find interestingchromosome bands that are up-regulated.

3 Summarize the result of the conditional test using summary.

Code: conditional test> slotNames(params)> paramsCond <- params> paramsCond@conditional <- TRUE> lmanCond <- linearMTest(paramsCond)> summary(lmanCond)

27 / 29

Summary

1 Basic idea behind GSEA.

2 Simple GSEA: t-tests and permutation.

3 Using KEGG categories.

4 Linear models and chromosome band categories.

5 Hypergeometric testings on GO BP terms.

28 / 29

Reference

Assaf P. Oron et. al., Gene set enrichment analysis using linearmodels and diagnostics, Bioinformatics, vol. 24 no. 22, pp.2566-2591, 2008.

Florian Hahne et. al., Bioconductor Case Studies, chapter 13-14,Springer, 2008.

Deepayan Sarker et. al., Using Categories defined chromosomebands, Bioconductor Category package vignette.

D. Sarker et.al., Modeling gene expression data via chromosomebands, Bioinformatics, 2007.

29 / 29

Gene Set Enrichment

Documents

Transcript of Gene Set Enrichment

Gene Set Enrichment Analysis Petri Törönen petri(DOT)toronen(AT)helsinki.fi.

SAM Significance Analysis of Microarrays-User guide …statweb.stanford.edu/~tibs/SAM/sam.pdf · SAM now has facilities for Gene Set Analysis [2], a variation on the Gene Set Enrichment

Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.

Enrichment Map: A Network-Based Method for Gene-Set ... · ranked to identify the top-most list of expressed genes, based on an arbitrary expression threshold, or a set of gene expression

Gene Set Enrichment Analysis Microarray Classification STAT115 Jun S. Liu and Xiole Shirley Liu.

The shinySISPA Manual - bbisr.shinyapps.winship.emory.edubbisr.shinyapps.winship.emory.edu/shinySISPA/shinySISPA_Manual.pdf · samples with similar gene set enrichment profiles. The

Gene Set Enrichment Analysis: A Knowledge-Based Approach ...

Gene Set Enrichment Analysis (GSEA)

Gene set enrichment analysis with topGO - … set enrichment analysis with topGO Adrian Alexa, J org Rahnenf uhrer April 30, 2018 ˜alexa Contents 1 Introduction 2 2 Instalation 2

Gene Set Enrichment Analysis

Bayesian variant-based gene set enrichment …xiangzhu/NHS_20160304.pdf2016/03/04 · GSEABMApathwayRSSHeight Bayesian variant-based gene set enrichment analysis using GWAS summary

Gene Ontology Network Enrichment Analysis

Pathway/ Gene Set Analysis in Genome-Wide Association Studies · (GE) – GE enrichment typically test which gene sets/pathways show enrichment ... • Pathway and gene set analysis

A Factor Graph Model for Minimal Gene Set Enrichment Analysis Diana Uskat Computational Biology - Gene Center Munich.

SeqGSEA: Gene Set Enrichment Analysis (GSEA) of RNA-Seq ...

GSVA: The Gene Set Variation Analysis package for ... · enrichment pro les for each gene set and sample. GSVA starts by evaluating whether a ... expression-level statistics into

Ensemble of Gene Set Enrichment Analyses · 1m.hamdoosh@gmail.com 2mritchie@wehi.edu.au Ensemble of Gene Set Enrichment Analyses Monther Alhamdoosh1, Luyi Tian, Milica Ng and Matthew

Gene set enrichment analysis of RNA-Seq data with the ... · Gene set enrichment analysis of RNA-Seq data with the SeqGSEA package Xi Wang 1 ;2and Murray Cairns 3 April 27, 2020 1School

EnrichNet: network-based gene set enrichment analysis Presenter: Lu Liu.

EnrichNet : network-based gene set enrichment analysis