Naveen K. Bansal and Prachi Pradeep Dept. of Math., Stat., and Comp. Sci. Marquette University

Naveen K. Bansal and Prachi Pradeep Dept. of Math., Stat., and Comp. Sci. Marquette University Milwaukee, WI (USA) Email: [email protected] and Hongmei Jiang Dept. of Statistics Northwestern University Evanston, IL (USA) Testing Multiple Hypotheses for Detecting Targeted Genes in an Experiment Involving MicroRNA 1 Seminar on Interdisciplinary Data Analysis


Page 1: Naveen K.  Bansal  and  Prachi Pradeep Dept. of Math., Stat., and Comp. Sci. Marquette University

Naveen K. Bansal and Prachi Pradeep
Dept. of Math., Stat., and Comp. Sci.
Marquette University
Milwaukee, WI (USA)
Email: [email protected]

and

Hongmei Jiang
Dept. of Statistics
Northwestern University
Evanston, IL (USA)

Testing Multiple Hypotheses for Detecting Targeted Genes in an Experiment Involving MicroRNA

Seminar on Interdisciplinary Data Analysis

Page 2

Outline:

• Biology behind microRNA
• Statistical Formulation
• Bayesian Methodology
• Real Data: Some Preliminary Results

Page 3: Biology behind microRNA

Transcription, Translation, and Protein Synthesis

Source: http://statwww.epfl.ch/davison/teaching/Microarrays

Page 4: Biology behind microRNA

Microarray Technology

Idea: measure the amount of mRNA to see which genes are being expressed. Measuring protein would be more direct, but is currently harder. Another problem is that some RNAs are not translated.

Source: http://statwww.epfl.ch/davison/teaching/Microarrays

Page 5: Biology behind microRNA

Yeast genome on a chip

Page 6: Biology behind microRNA

Back to microRNA:

Past Discoveries:

• Many segments of DNA are inactive.
• Some can move around the genome of a cell.
• For a long time, they were termed “junk DNA.”

They are not transcribed, i.e., no RNA molecule is created. However, they can insert into genes and can trigger chromosome rearrangements (McClintock, 1940).

Page 7: Biology behind microRNA

• Recent Discoveries:

Many transcribed non-coding RNAs have been identified, some containing short sequences of nucleotides and some containing long ones. They are not translated.

Transcribed RNAs containing a short sequence of nucleotides are called microRNA or miRNA.

Page 8: Biology behind microRNA

• It is believed that some miRNAs play important roles in regulating mRNAs (protein-coding genes). Much research focuses on the regulatory function of these miRNAs in cancer-causing genes.

• These miRNAs typically bind to mRNAs via base pairing at target sites of the coding sequence of the mRNA and thus prevent the translation of the mRNAs.

Page 9: Biology behind microRNA

miRNA genes are transcribed by RNA polymerase II to form primary miRNA (pri-miRNA) molecules. The ribonuclease, Drosha, then cleaves the pri-miRNA to release the pre-miRNA for cytoplasmic export and processing by Dicer. The mature miRNA product associates with the RNA-induced silencing complex for loading onto the 3′ UTR of target mRNAs to mediate translational repression.

Source: PNAS, Sept. 2007

Page 10: Biology behind microRNA

• Theory: Cells carry cancer genes, but miRNAs prevent their translation?

• Hypothesis: Identified miRNAs affect the gene expression of protein-coding mRNAs.

This can be tested in a lab:

• Silence the miRNA, and look for overexpression of the targeted genes in a microarray.

• Overexpress the miRNA, and look for suppression of the targeted genes in a microarray.

Page 11: Biology behind microRNA

“Experimental identification of microRNA-140 targets by silencing and overexpressing miR-140,” by Nicolas, Pais, and Schwach. RNA, 2008.

• Experiment-1: miR-140 was silenced. Gene expressions of 45,000 mRNAs were recorded.

• Experiment-2: miR-140 was overexpressed. Gene expressions of 45,000 mRNAs were recorded.

Three replicates

Page 12

Results of Nicolas et al. (2008)

1. t-test to determine differentially expressed genes.

2. Two different cut-off points for Experiment-1 and Experiment-2.

3. 1236 differentially expressed genes in Experiment-1 and 466 differentially expressed genes in Experiment-2, with 49 genes in common.

Page 13: Statistical Formulation

Statistical Modeling:

$t_{ij}$: $t$-statistic for the $i$th gene and $j$th experiment, $i = 1, 2, \dots, m$, $j = 1, 2$ ($m$ genes, two experiments)

$$X_{ij} = \Phi^{-1}\big(F(t_{ij})\big)$$

$X_{ij} \sim N(0, 1)$ under the null

$X_{ij} \sim N(\theta_{ij}, 1)$ under the non-null

This is justifiable under an independence assumption; see Efron (2008), Statistical Science.
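The transformation $X_{ij} = \Phi^{-1}(F(t_{ij}))$ can be sketched in a few lines. This is only an illustration: it assumes $F$ is the Student-t cdf of the test statistic under the null, and the value `df=4` (two groups of three replicates) is a guess, since the slides do not state the degrees of freedom. The function name `t_to_z` is hypothetical.

```python
import numpy as np
from scipy import stats

def t_to_z(t, df):
    """Map t-statistics to z-scores via X = Phi^{-1}(F(t)).

    F is taken to be the Student-t cdf with `df` degrees of freedom
    (an assumption; the slides do not specify the null distribution).
    """
    t = np.asarray(t, dtype=float)
    return stats.norm.ppf(stats.t.cdf(t, df))

# Under the null, the transformed values should look standard normal.
rng = np.random.default_rng(0)
z_null = t_to_z(rng.standard_t(df=4, size=100_000), df=4)
```

With heavy-tailed t-statistics (small `df`), the transform pulls extreme values in, which is why the z-scale is convenient for the normal model above.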

Page 14: Statistical Formulation

Experiment-1: $H_{i1}^{0}: \theta_{i1} = 0$ vs. $H_{i1}^{-}: \theta_{i1} < 0$ or $H_{i1}^{+}: \theta_{i1} > 0$

Experiment-2: $H_{i2}^{0}: \theta_{i2} = 0$ vs. $H_{i2}^{-}: \theta_{i2} < 0$ or $H_{i2}^{+}: \theta_{i2} > 0$

Genes that are under-expressed in Experiment-1 should be over-expressed in Experiment-2.

Genes that are over-expressed in Experiment-1 should be under-expressed in Experiment-2.

Page 15: Bayesian Methodology

Bayesian Decision Theoretic Methodology

$H_{i}^{0}: \theta_i = 0$ vs. $H_{i}^{-}: \theta_i < 0$, $H_{i}^{+}: \theta_i > 0$

$\theta_i \sim p_{-}\,\pi_{-}(\cdot) + p_{0}\,I(0) + p_{+}\,\pi_{+}(\cdot)$, $i = 1, 2, \dots, m$

Loss function: “0-1” loss

Expected loss = expected number of false discoveries

Bayes rule under controlled false discovery rate

Bansal and Miescke (2013): Journal of Multivariate Analysis

Page 16: Bayesian Methodology

Table: Possible outcomes from hypothesis tests

              Accept $H^0$   Accept $H^-$   Accept $H^+$   Total
$H^0$ true    $U_0$          $V_0$          $W_0$          $m_0$
$H^-$ true    $U_-$          $V_-$          $W_-$          $m_-$
$H^+$ true    $U_+$          $V_+$          $W_+$          $m_+$
Total         $U$            $V$            $W$            $m$

Directional False Discovery rates:

$$\mathrm{LFDR} = E\left[\frac{V_0 + V_+}{V}\, I(V > 0)\right]$$

$$\mathrm{RFDR} = E\left[\frac{W_0 + W_-}{W}\, I(W > 0)\right]$$

$\mathrm{DFDR} = E[\,\dots\,]$ (formula garbled in the source transcript)

Page 17: Bayesian Methodology

$(d_{i1}^{X}, d_{i2}^{X})$, $d_{ij}^{X} \in \{-1, 0, 1\}$, $j = 1, 2$, $i = 1, 2, \dots, m$

$d_{ij}^{X} = -1$ means the $i$th gene is selected as under-expressed

$d_{ij}^{X} = +1$ means the $i$th gene is selected as over-expressed

$d_{ij}^{X} = 0$ means the $i$th gene is selected as non-expressed

Utility Function:

$$U(d^{X}, \nu) = \sum_{i=1}^{m} \sum_{j=1}^{2} \sum_{k=-1}^{1} I\big(d_{ij}^{X} = k\big)\, I(\nu_{ij} = k)$$

where $\nu_{ij} \in \{-1, 0, 1\}$ represents the true state of nature.

Page 18: Bayesian Methodology

$E[U(d^{X}, \nu)]$ = expected number of correctly selected genes in both experiments

Maximizing the posterior expected utility yields the following Bayes rule:

$$d_{ij}^{B} = \begin{cases} -1 & \text{if } P(\theta_{ij} \in \Omega_{-1} \mid x) = \max_{k=-1,0,1} P(\theta_{ij} \in \Omega_{k} \mid x) \\ 0 & \text{if } P(\theta_{ij} \in \Omega_{0} \mid x) = \max_{k=-1,0,1} P(\theta_{ij} \in \Omega_{k} \mid x) \\ 1 & \text{if } P(\theta_{ij} \in \Omega_{1} \mid x) = \max_{k=-1,0,1} P(\theta_{ij} \in \Omega_{k} \mid x) \end{cases}$$

$\Omega_{-1} = (-\infty, 0)$, $\Omega_{0} = \{0\}$, $\Omega_{1} = (0, \infty)$
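The Bayes rule above amounts to picking, for each gene and experiment, the hypothesis with the largest posterior probability. A minimal sketch follows, assuming the three posterior probabilities are already available in an array ordered as (θ < 0, θ = 0, θ > 0); the function names and the array layout are illustrative choices, not taken from the slides.

```python
import numpy as np

def bayes_rule(post):
    """Return decisions in {-1, 0, 1} for each gene and experiment.

    post: array of shape (m, 2, 3) holding P(theta < 0 | x),
    P(theta = 0 | x), P(theta > 0 | x) along the last axis.
    Ties are broken in favour of the first maximum (an arbitrary choice).
    """
    post = np.asarray(post, dtype=float)
    return np.argmax(post, axis=-1) - 1  # index 0, 1, 2 -> decision -1, 0, 1

def utility(d, nu):
    """Number of (gene, experiment) pairs where the decision equals the truth."""
    return int(np.sum(np.asarray(d) == np.asarray(nu)))

# Two genes, two experiments.
post = [[[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]],
        [[0.1, 0.8, 0.1], [0.2, 0.5, 0.3]]]
d = bayes_rule(post)                      # [[-1, 1], [0, 0]]
u = utility(d, [[-1, 1], [0, -1]])        # 3 of the 4 decisions are correct
```

The triple sum in the utility collapses to a single equality check because exactly one value of $k$ can match both the decision and the true state.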

Page 19: Bayesian Methodology

A New False Discovery Rate

Note the objective is to select genes that are overexpressed under Experiment-1 and underexpressed under Experiment-2.

$$\mathrm{BFDR}_{12}^{+-} = E\left[\frac{\sum_{i=1}^{m} I\big(d_{i1}^{X} = 1,\, d_{i2}^{X} = -1\big)\, I(\nu_{i1} \le 0,\, \nu_{i2} \ge 0)}{\#(D_{1}^{+} \cap D_{2}^{-}) \vee 1}\right]$$

where $D_{1}^{+}$ is the set of selected overexpressed genes and $D_{2}^{-}$ is the set of selected underexpressed genes. It is desirable to have the property that

$$\mathrm{BFDR}_{12}^{+-} \le \gamma \quad (0 < \gamma < 1)$$

Page 20: Bayesian Methodology

The posterior version is given by

$$\mathrm{PFDR}_{12}^{+-} = \frac{\sum_{i=1}^{m} P(\nu_{i1} \le 0,\, \nu_{i2} \ge 0 \mid x)\, I\big(d_{i1}^{X} = 1,\, d_{i2}^{X} = -1\big)}{\#(D_{1}^{+} \cap D_{2}^{-}) \vee 1}$$

Note $\mathrm{PFDR}_{12}^{+-} \le \gamma \Rightarrow \mathrm{BFDR}_{12}^{+-} \le \gamma$
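Given the per-gene posterior probabilities and the decisions, the posterior false discovery rate is a simple average over the selected genes. The sketch below assumes `q[i]` already holds $P(\nu_{i1} \le 0, \nu_{i2} \ge 0 \mid x)$; the function name `pfdr` and the example numbers are made up for illustration.

```python
import numpy as np

def pfdr(q, d1, d2):
    """Posterior FDR for genes selected as (+1 in Experiment-1, -1 in Experiment-2).

    q:  posterior probabilities P(nu_i1 <= 0, nu_i2 >= 0 | x), length m
    d1: decisions in Experiment-1, values in {-1, 0, 1}
    d2: decisions in Experiment-2, values in {-1, 0, 1}
    """
    q, d1, d2 = (np.asarray(a) for a in (q, d1, d2))
    selected = (d1 == 1) & (d2 == -1)
    # The denominator is #(D1+ ∩ D2-) ∨ 1, so an empty selection gives 0.
    return float(q[selected].sum() / max(int(selected.sum()), 1))

value = pfdr(q=[0.02, 0.10, 0.50, 0.90],
             d1=[1, 1, 0, 1],
             d2=[-1, -1, -1, 0])   # genes 0 and 1 are selected: (0.02 + 0.10) / 2
```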

Page 21: Bayesian Methodology

Constrained Bayes Rule

$D_{Bj}^{-}$ = set of selected underexpressed genes under the $j$th experiment

$D_{Bj}^{+}$ = set of selected overexpressed genes under the $j$th experiment

$D_{Bj}^{0}$ = set of selected non-expressed genes under the $j$th experiment

$q_i = P(\nu_{i1} \le 0,\, \nu_{i2} \ge 0 \mid x)$

Rank $\{q_i : i \in D_{B1}^{+} \cap D_{B2}^{-}\}$ from the Bayes rule:

$$q_{[1]} \le q_{[2]} \le \dots \le q_{[\,|D_{B1}^{+} \cap D_{B2}^{-}|\,]}$$

Page 22: Bayesian Methodology

$$\hat{i}_0 = \max\left\{k \le |D_{B1}^{+} \cap D_{B2}^{-}| : \frac{1}{k} \sum_{i=1}^{k} q_{[i]} \le \alpha\right\}$$

Select the genes corresponding to

$$q_{[1]} \le q_{[2]} \le \dots \le q_{[\hat{i}_0]}$$

Properties:

1. Selected genes have Bayes optimality under both experiments.

2. They are controlled by a false discovery rate in the sense that only a few of them are falsely selected as overexpressed under Experiment-1 and falsely selected as underexpressed under Experiment-2.
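The constrained selection step can be sketched as a sort followed by a running mean. This assumes the candidate set (the genes in the intersection of the two Bayes selection sets) and the probabilities `q` have been computed beforehand; `constrained_selection` is a hypothetical name.

```python
import numpy as np

def constrained_selection(q, candidates, alpha):
    """Keep the largest prefix of candidates, ranked by q, whose mean q is <= alpha.

    q:          posterior probabilities of a false discovery, indexed by gene
    candidates: indices of the genes selected by the unconstrained Bayes rule
    alpha:      bound on the running mean (the posterior FDR level)
    """
    q = np.asarray(q, dtype=float)
    candidates = np.asarray(candidates, dtype=int)
    order = candidates[np.argsort(q[candidates], kind="stable")]
    running_mean = np.cumsum(q[order]) / np.arange(1, order.size + 1)
    ok = np.nonzero(running_mean <= alpha)[0]
    k = int(ok[-1]) + 1 if ok.size else 0   # k plays the role of i-hat_0
    return order[:k]

kept = constrained_selection(q=[0.01, 0.30, 0.04, 0.02, 0.90],
                             candidates=[0, 1, 2, 3], alpha=0.05)
# Sorted q: 0.01, 0.02, 0.04, 0.30; running means: 0.010, 0.015, 0.0233, 0.0925
```

Because the probabilities are sorted in increasing order, the running mean is non-decreasing, so the last index that satisfies the bound is also the maximum in the definition.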

Page 23: Bayesian Methodology

Remark: This approach can be applied to a different loss (utility) function.

$$U(d^{X}, \theta) = \sum_{i=1}^{m} \sum_{j=1}^{2} \sum_{k=-1}^{1} I\big(d_{ij}^{X} = k\big)\, I(\theta_{ij} \in \Omega_{k})$$

$\Omega_{0} = (-\epsilon, \epsilon)$, $\Omega_{-1} = (-\infty, -\epsilon)$, $\Omega_{1} = (\epsilon, \infty)$

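Page 24: Bayesian Methodology

Prior:

$X_{ij} \sim N(\theta_{ij}, 1)$, $i = 1, 2, \dots, m$, $j = 1, 2$

$P_{1}(\theta_{i1}) = p_{1}\, I(\theta_{i1} = 0) + (1 - p_{1})\, \pi_{1}(\theta_{i1})$

$P_{2}(\theta_{i2}) = p_{2}\, I(\theta_{i2} = 0) + (1 - p_{2})\, \pi_{2}(\theta_{i2})$

Under the non-null $(H_{i1}^{a}, H_{i2}^{a})$, jointly,

$$(\theta_{i1}, \theta_{i2})' \mid H_{i1}^{a}, H_{i2}^{a} \sim N\left(\begin{bmatrix} \mu_{1} \\ \mu_{2} \end{bmatrix}, \begin{bmatrix} \sigma_{1}^{2} & \rho\sigma_{1}\sigma_{2} \\ \rho\sigma_{1}\sigma_{2} & \sigma_{2}^{2} \end{bmatrix}\right), \quad i = 1, 2, \dots, m.$$

$\pi_{1}$: pdf of $N(\mu_{1}, \sigma_{1}^{2})$ and $\pi_{2}$: pdf of $N(\mu_{2}, \sigma_{2}^{2})$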

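Page 25: Bayesian Methodology

Prior (Cont.)

$P(\theta_{i1}, \theta_{i2} \mid H_{i1}^{a}, H_{i2}^{0}) = N(\mu_{1}, \sigma_{1}^{2})\, I(\theta_{i2} = 0)$

$P(\theta_{i1}, \theta_{i2} \mid H_{i1}^{0}, H_{i2}^{a}) = I(\theta_{i1} = 0)\, N(\mu_{2}, \sigma_{2}^{2})$

$$\begin{aligned} P(\theta_{i1}, \theta_{i2}) ={} & p_{1} p_{2}\, I(\theta_{i1} = 0)\, I(\theta_{i2} = 0) \\ & + p_{1}(1 - p_{2})\, I(\theta_{i1} = 0)\, \pi_{2}^{a}(\theta_{i2}) \\ & + (1 - p_{1})\, p_{2}\, \pi_{1}^{a}(\theta_{i1})\, I(\theta_{i2} = 0) \\ & + (1 - p_{1})(1 - p_{2})\, \pi^{a}(\theta_{i1}, \theta_{i2}) \end{aligned}$$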

Page 26: Bayesian Methodology

From the posterior distribution, calculate $P(\theta_{ij} < 0 \mid x)$, $P(\theta_{ij} = 0 \mid x)$, $P(\theta_{ij} > 0 \mid x)$.

This yields the Bayes selection sets $D_{Bj}^{-}$, $D_{Bj}^{0}$, and $D_{Bj}^{+}$.

Also calculate $P(\theta_{i1} \le 0,\, \theta_{i2} \ge 0 \mid x)$ for $i \in D_{B1}^{+} \cap D_{B2}^{-}$ from the posterior. Obtain the constrained Bayes rule.
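One way to compute $P(\theta_{i1} \le 0, \theta_{i2} \ge 0 \mid x)$ under the four-component prior is sketched below. It is an illustration rather than the authors' implementation: it assumes the hyper-parameters are known, that the two null indicators are independent a priori, and it uses the standard conjugate normal update within each component. The function name `posterior_q` is hypothetical.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def posterior_q(x, p1, p2, mu, Sigma):
    """q_i = P(theta_i1 <= 0, theta_i2 >= 0 | x_i) for each row x_i of x.

    x: (m, 2) z-scores; mu: prior mean (length 2); Sigma: 2x2 prior covariance.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    s1, s2 = Sigma[0, 0], Sigma[1, 1]

    # Marginal densities of x_ij: N(0, 1) under the null, N(mu_j, sigma_j^2 + 1) otherwise.
    f0_1, f0_2 = norm.pdf(x[:, 0]), norm.pdf(x[:, 1])
    fa_1 = norm.pdf(x[:, 0], mu[0], np.sqrt(s1 + 1.0))
    fa_2 = norm.pdf(x[:, 1], mu[1], np.sqrt(s2 + 1.0))
    faa = np.atleast_1d(multivariate_normal(mean=mu, cov=Sigma + np.eye(2)).pdf(x))

    # Posterior weights of the components (null, null), (null, alt), (alt, null), (alt, alt).
    w = np.column_stack([
        p1 * p2 * f0_1 * f0_2,
        p1 * (1 - p2) * f0_1 * fa_2,
        (1 - p1) * p2 * fa_1 * f0_2,
        (1 - p1) * (1 - p2) * faa,
    ])
    w /= w.sum(axis=1, keepdims=True)

    # One-dimensional conjugate updates, used when only one theta is non-null.
    p_theta1_neg = norm.cdf(0.0, (s1 * x[:, 0] + mu[0]) / (s1 + 1.0), np.sqrt(s1 / (s1 + 1.0)))
    p_theta2_pos = norm.sf(0.0, (s2 * x[:, 1] + mu[1]) / (s2 + 1.0), np.sqrt(s2 / (s2 + 1.0)))

    # Both non-null: posterior is N(M_i, S). Flipping the sign of theta_2 turns
    # {theta_1 < 0, theta_2 > 0} into a lower orthant, which the cdf can evaluate.
    S = np.linalg.inv(np.linalg.inv(Sigma) + np.eye(2))
    M = (np.linalg.solve(Sigma, mu) + x) @ S
    D = np.diag([1.0, -1.0])
    orthant = np.atleast_1d(
        multivariate_normal(mean=[0.0, 0.0], cov=D @ S @ D).cdf(-(M @ D))
    )

    # theta = 0 satisfies both "<= 0" and ">= 0", so the (null, null) weight counts fully.
    return w[:, 0] + w[:, 1] * p_theta2_pos + w[:, 2] * p_theta1_neg + w[:, 3] * orthant
```

A gene with a large positive z-score in Experiment-1 and a large negative one in Experiment-2 should receive a value near 0, which is what makes it a good candidate under the constrained rule.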

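Page 27: Bayesian Methodology

Estimation of Hyper-parameters:

Marginally,

$E(X_{ij}) = (1 - p_{j})\, \mu_{j}$

$\mathrm{Var}(X_{ij}) = (1 - p_{j})\, \sigma_{jj} + p_{j}(1 - p_{j})\, \mu_{j}^{2} + 1$

$\mathrm{Cov}(X_{i1}, X_{i2}) = (1 - p_{1})(1 - p_{2})\, \sigma_{12}$

where $\sigma_{jj} = \sigma_{j}^{2}$ and $\sigma_{12} = \rho\sigma_{1}\sigma_{2}$ are the entries of the prior covariance matrix. Matching these moments gives

$$\hat{\mu}_{j} = \frac{\bar{x}_{\cdot j}}{1 - \hat{p}_{j}}, \quad j = 1, 2$$

$$\hat{\sigma}_{jj} = \frac{s_{jj} - 1}{1 - \hat{p}_{j}} - \hat{p}_{j}\, \hat{\mu}_{j}^{2}, \quad j = 1, 2$$

$$\hat{\sigma}_{12} = \frac{s_{12}}{(1 - \hat{p}_{1})(1 - \hat{p}_{2})}$$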
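The moment-matching estimates translate directly into code. The sketch below assumes the null proportions have already been estimated and that `x` is an (m, 2) array of z-scores; the function name is hypothetical, and nothing here guards against a negative variance estimate, which can happen in small samples.

```python
import numpy as np

def moment_estimates(x, p1_hat, p2_hat):
    """Method-of-moments estimates of (mu_1, mu_2), (sigma_11, sigma_22), sigma_12."""
    x = np.asarray(x, dtype=float)
    p = np.array([p1_hat, p2_hat], dtype=float)
    xbar = x.mean(axis=0)                 # column means x-bar_.j
    s = np.cov(x, rowvar=False)           # 2x2 sample covariance (s_11, s_12, s_22)
    mu = xbar / (1.0 - p)
    var = (np.diag(s) - 1.0) / (1.0 - p) - p * mu**2
    cov12 = s[0, 1] / ((1.0 - p[0]) * (1.0 - p[1]))
    return mu, var, cov12
```

The variance formula follows from solving $s_{jj} = (1 - p_j)\sigma_{jj} + p_j(1 - p_j)\mu_j^2 + 1$ for $\sigma_{jj}$ after substituting the estimate of $\mu_j$.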

Page 28: Bayesian Methodology

Estimation of $p_{1}$ and $p_{2}$

Storey (2002) JRSS

$$\hat{p}_{1} = \frac{\#\{p_{i}^{(1)} > \lambda_{1}\}}{(1 - \lambda_{1})\, m}, \qquad \hat{p}_{2} = \frac{\#\{p_{i}^{(2)} > \lambda_{2}\}}{(1 - \lambda_{2})\, m}$$

where $\lambda_{1}$ and $\lambda_{2}$ are appropriately chosen values.
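The estimator above counts the p-values beyond a cut-off and rescales. A minimal sketch, where the default `lam=0.5` is only a placeholder for the "appropriately chosen" value and the function name is hypothetical:

```python
import numpy as np

def storey_null_proportion(pvalues, lam=0.5):
    """Estimate the proportion of true nulls as #{p_i > lam} / ((1 - lam) * m)."""
    pvalues = np.asarray(pvalues, dtype=float)
    return float(np.sum(pvalues > lam) / ((1.0 - lam) * pvalues.size))

# Evenly spread p-values mimic the all-null case, so the estimate is 1.
uniform_like = (np.arange(1000) + 0.5) / 1000
p_hat = storey_null_proportion(uniform_like, lam=0.5)
```

The raw estimate can exceed 1 on real data; capping it at 1 is a common convention, but the formula on the slide is reproduced here without a cap.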

Page 29: Bayesian Methodology

EM Algorithm Approach

$$L_{j}(p_{j}) = \prod_{i=1}^{m} \left[p_{j}\, f_{j}(x_{ij} \mid 0) + (1 - p_{j})\, m_{j}^{a}(x_{ij})\right],$$

where $f_{j}(x_{ij} \mid 0)$ and $m_{j}^{a}(x_{ij})$ are the marginal pdfs under $H_{ij}^{0}: \theta_{ij} = 0$ and $H_{ij}^{a}: \theta_{ij} \ne 0$, respectively.

$$\hat{p}_{j}^{(k+1)} = \frac{1}{m} \sum_{i=1}^{m} \frac{\hat{p}_{j}^{(k)}\, f_{j}(x_{ij} \mid 0)}{\hat{p}_{j}^{(k)}\, f_{j}(x_{ij} \mid 0) + (1 - \hat{p}_{j}^{(k)})\, m_{j}^{a}(x_{ij})}$$
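The EM update for a single experiment can be sketched as follows. The sketch assumes the non-null marginal is $N(\mu_j, \sigma_j^2 + 1)$, which follows from the normal prior, and treats $\mu_j$ and $\sigma_j^2$ as fixed inputs; in practice they would be re-estimated as well. The function name, starting value, and iteration count are arbitrary choices.

```python
import numpy as np
from scipy.stats import norm

def em_null_proportion(x, mu, sigma2, p_init=0.5, n_iter=200):
    """Iterate the EM update for the null proportion p_j of one experiment."""
    x = np.asarray(x, dtype=float)
    f0 = norm.pdf(x)                                 # f_j(x | 0): N(0, 1)
    fa = norm.pdf(x, mu, np.sqrt(sigma2 + 1.0))      # m_j^a(x): N(mu, sigma^2 + 1)
    p = p_init
    for _ in range(n_iter):
        # E-step: posterior probability that each gene is null.
        resp = p * f0 / (p * f0 + (1.0 - p) * fa)
        # M-step: the new p is the average responsibility.
        p = float(resp.mean())
    return p
```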

Page 30

Some Preliminary Results on the Nicolas et al. (2008) Data

Page 31

Page 32

Page 33

Smallest p-value for Experiment-1: 0.0002387974

and

Smallest p-value for Experiment-2: 0.0001600851

The BH FDR approach fails due to the large number of genes (m = 45,000).

However, 50 genes have p-values < 0.01 in Experiment-1 and p-values < 0.05 in Experiment-2.
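To see why the smallest p-values are far from the Benjamini-Hochberg (BH) cut-offs at this scale, the step-up rule can be sketched as below. The level `alpha=0.05` is an assumption, since the slides do not state the level used, and the function name is hypothetical.

```python
import numpy as np

def bh_num_rejections(pvalues, alpha=0.05):
    """Number of hypotheses rejected by the BH step-up procedure at level alpha."""
    p = np.sort(np.asarray(pvalues, dtype=float))
    m = p.size
    # Compare the k-th smallest p-value with k * alpha / m and keep the largest such k.
    below = np.nonzero(p <= alpha * np.arange(1, m + 1) / m)[0]
    return int(below[-1]) + 1 if below.size else 0

m = 45_000
first_threshold = 0.05 / m          # cut-off for the smallest p-value, about 1.1e-6
smallest_p_exp1 = 0.0002387974      # reported on the slide
```

The smallest reported p-value is roughly 200 times larger than the first cut-off. Comparing the first rank alone does not settle the step-up rule, which can still reject at a higher rank, but it shows how small the cut-offs become when m = 45,000.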