A Quantitative Overview to Gene Expression Profiling in Animal Genetics
Armidale Animal Breeding Summer Course, UNE, Feb. 2006
Analysis of (cDNA) Microarray Data:
Part V. Mixtures of Distributions
Model-Based Clustering via Mixtures of Distributions
Mixtures of Distributions
Definition
• The mixture model assumes that each cluster (or component) of the data is generated by an underlying normal distribution.
• Each observation in y is assumed to be an independent draw from a mixture density with k (possibly unknown but finite) components and probability density function:
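$$f(y; \Psi_k) = \sum_{i=1}^{k} \pi_i \, \phi(y; \mu_i, V_i)$$
where the $\pi_i$ are the mixing proportions (add to 1) and $\phi(y; \mu_i, V_i)$ is the normal density function with mean $\mu_i$ and variance $V_i$.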
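As a minimal sketch of the definition above, the mixture density can be evaluated directly as a weighted sum of normal densities. The function names and the two-component example values are illustrative assumptions, not part of the course material; the second argument of each component is treated as a variance.

```python
import math

def normal_pdf(y, mean, var):
    """Density of N(mean, var) at y; `var` is a variance, not a standard deviation."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(y, proportions, means, variances):
    """f(y) = sum_i pi_i * phi(y; mu_i, V_i) for a univariate normal mixture."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("mixing proportions must add to 1")
    return sum(p * normal_pdf(y, m, v)
               for p, m, v in zip(proportions, means, variances))

# Illustrative (assumed) two-component mixture evaluated at y = 0
density_at_zero = mixture_pdf(0.0, [0.9, 0.1], [0.0, 0.0], [1.0, 10.0])
```

The check on the mixing proportions mirrors the "add to 1" constraint in the definition.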
Introduction
$$f(y; \Psi_k) = \sum_{i=1}^{k} \pi_i \, \phi(y; \mu_i, V_i)$$
The Guru: http://www.maths.uq.edu.au/~gjm
Software and Resources
EM Algorithm
$$f(y; \Psi_k) = \sum_{i=1}^{k} \pi_i \, \phi(y; \mu_i, V_i)$$
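The EM algorithm obtains the maximum likelihood estimate of the parameter vector $\Psi_k$ by iteration. In the (m+1)th iteration, the estimates of the parameters of interest are updated by:
$$\pi_i^{(m+1)} = \sum_{j=1}^{n} \tau_{ij}^{(m)} \Big/ n$$
$$\mu_i^{(m+1)} = \sum_{j=1}^{n} \tau_{ij}^{(m)} y_j \Big/ \sum_{j=1}^{n} \tau_{ij}^{(m)}$$
$$V_i^{(m+1)} = \sum_{j=1}^{n} \tau_{ij}^{(m)} \left(y_j - \mu_i^{(m+1)}\right)\left(y_j - \mu_i^{(m+1)}\right)^{T} \Big/ \sum_{j=1}^{n} \tau_{ij}^{(m)}$$
where
$$\tau_{ij}^{(m)} = \pi_i^{(m)} \, \phi\left(y_j; \mu_i^{(m)}, V_i^{(m)}\right) \Big/ f\left(y_j; \Psi^{(m)}\right)$$
is the posterior probability that $y_j$ belongs to the i-th component of the mixture (…with a very elegant link to False Discovery Rate).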
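The update rules above can be sketched in a few lines of plain Python for the univariate case. This is an illustrative implementation under stated assumptions, not the EMMIX code: the function names, the convergence tolerance, and the starting values are all hypothetical, and no safeguards against collapsing components or numerical underflow are included.

```python
import math

def normal_pdf(y, mean, var):
    """Density of N(mean, var) at y (var is a variance)."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_step(y, props, means, variances):
    """One EM iteration. Returns updated (props, means, variances) and the
    log-likelihood evaluated at the parameters that were passed in."""
    k, n = len(props), len(y)
    tau, loglik = [], 0.0
    # E-step: posterior probability that y_j belongs to component i
    for yj in y:
        weighted = [props[i] * normal_pdf(yj, means[i], variances[i]) for i in range(k)]
        total = sum(weighted)  # mixture density f(y_j)
        loglik += math.log(total)
        tau.append([w / total for w in weighted])
    # M-step: weighted proportions, means and variances
    new_props, new_means, new_vars = [], [], []
    for i in range(k):
        t_sum = sum(tau[j][i] for j in range(n))
        mu = sum(tau[j][i] * y[j] for j in range(n)) / t_sum
        var = sum(tau[j][i] * (y[j] - mu) ** 2 for j in range(n)) / t_sum
        new_props.append(t_sum / n)
        new_means.append(mu)
        new_vars.append(var)
    return new_props, new_means, new_vars, loglik

def fit_mixture(y, props, means, variances, max_iter=500, tol=1e-8):
    """Iterate EM until the log-likelihood gain falls below `tol` (assumed rule)."""
    previous = -math.inf
    for _ in range(max_iter):
        props, means, variances, loglik = em_step(y, props, means, variances)
        if loglik - previous < tol:
            break
        previous = loglik
    return props, means, variances, loglik
```

Note that the variance update uses the freshly updated mean, matching the (m+1) superscript in the formula for V.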
EM Algorithm
• We proceed for k = 1, 2, 3, … components.
• Criteria for model selection include the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC):
$$AIC = -2 \log L(\hat{\Psi}_k) + 2 \nu_k$$
$$BIC = -2 \log L(\hat{\Psi}_k) + \nu_k \log(n)$$
where $\nu_k = 3k - 1$ is the number of independent parameters in the mixture.
• Alternatively, the distribution of the likelihood ratio test (LRT) can be estimated by bootstrapping and P-values obtained to contrast a model with k components against a model with k + 1 components.
$$f(y; \Psi_k) = \sum_{i=1}^{k} \pi_i \, \phi(y; \mu_i, V_i)$$
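A short sketch of the AIC and BIC comparison for univariate normal mixtures follows. The parameter count 3k - 1 corresponds to k means, k variances and k - 1 free mixing proportions. The helper names and the log-likelihood values in the usage line are made up for illustration only.

```python
import math

def n_free_parameters(k):
    """k means + k variances + (k - 1) free mixing proportions = 3k - 1."""
    return 3 * k - 1

def aic(loglik, k):
    """Akaike Information Criterion for a k-component univariate mixture."""
    return -2.0 * loglik + 2.0 * n_free_parameters(k)

def bic(loglik, k, n):
    """Bayesian Information Criterion; n is the number of observations."""
    return -2.0 * loglik + n_free_parameters(k) * math.log(n)

def best_k(logliks, n, criterion="bic"):
    """`logliks` maps k -> maximised log-likelihood; the smallest criterion wins."""
    if criterion == "bic":
        scores = {k: bic(ll, k, n) for k, ll in logliks.items()}
    else:
        scores = {k: aic(ll, k) for k, ll in logliks.items()}
    return min(scores, key=scores.get)

# Hypothetical log-likelihoods for k = 1, 2, 3 fitted to n = 1000 records
chosen = best_k({1: -1500.0, 2: -1400.0, 3: -1399.0}, n=1000)
```

In this made-up example the third component improves the log-likelihood by only one unit, which does not pay for its three extra parameters under either criterion.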
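Simulation 1
Consider these distributions and simulate:

| Distribution | Records |
| --- | --- |
| N(1, 5) | 10,000 |
| N(5, 10) | 5,000 |

The mixture becomes:
$$f(y; \hat{\Psi}) = \tfrac{2}{3} N(1, 5) + \tfrac{1}{3} N(5, 10)$$
Posterior probability:
$$\tau_{ij} = \pi_i \, \phi(y_j; \mu_i, V_i) \Big/ f(y_j; \Psi)$$
Likelihood:

| y | N(1, 5) | N(5, 10) |
| --- | --- | --- |
| -1 | 0.120 | 0.021 |
| 0 | 0.161 | 0.036 |
| 1 | 0.178 | 0.056 |
| 5 | 0.036 | 0.126 |
| 7 | 0.005 | 0.103 |

The denominator of the posterior probability is the weighted average (by mixing proportions) of the two likelihoods.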
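The likelihoods and posterior probabilities for Simulation 1 can be recomputed with a few lines of Python. This is a sketch that assumes the second argument of N(·,·) is a variance (which reproduces the tabulated likelihoods, e.g. 0.178 for N(1, 5) at y = 1); the function and constant names are illustrative.

```python
import math

PROPS = (2.0 / 3.0, 1.0 / 3.0)   # mixing proportions: 10,000 and 5,000 records
MEANS = (1.0, 5.0)
VARIANCES = (5.0, 10.0)

def normal_pdf(y, mean, var):
    """Density of N(mean, var) at y (var is a variance)."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def posterior(y):
    """Posterior probability of each component given the observation y."""
    weighted = [p * normal_pdf(y, m, v) for p, m, v in zip(PROPS, MEANS, VARIANCES)]
    total = sum(weighted)  # weighted average of the likelihoods = mixture density
    return [w / total for w in weighted]

for y in (-1, 0, 1, 5, 7):
    lik = [normal_pdf(y, m, v) for m, v in zip(MEANS, VARIANCES)]
    post = posterior(y)
    print(f"y={y:>2}  lik=({lik[0]:.3f}, {lik[1]:.3f})  post=({post[0]:.3f}, {post[1]:.3f})")
```

An observation at y = 7 is far more likely under N(5, 10), so its posterior probability for the second component is high even though that component has the smaller mixing proportion.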
Simulation 2
Consider these distributions and simulate:

| Distribution | Records | Microarray |
| --- | --- | --- |
| N(0, 1) | 9,000 | Non-DE Genes |
| N(0, 10) | 1,000 | DE Genes |

The mixture becomes:
$$f(y; \hat{\Psi}) = 0.9\, N(0, 1) + 0.1\, N(0, 10)$$
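A data set like the one in Simulation 2 can be generated with the standard library alone. This is a sketch, not the original simulation: the function name and the seed are arbitrary assumptions. The one detail worth noting is that `random.gauss` takes a standard deviation, so the variances 1 and 10 must be square-rooted.

```python
import math
import random

def simulate_mixture(counts, means, variances, seed=2006):
    """Draw a fixed number of records from each normal component and shuffle them."""
    rng = random.Random(seed)  # arbitrary seed, for reproducibility only
    data = []
    for n, mean, var in zip(counts, means, variances):
        sd = math.sqrt(var)  # random.gauss expects a standard deviation
        data.extend(rng.gauss(mean, sd) for _ in range(n))
    rng.shuffle(data)
    return data

# 9,000 "non-DE" records from N(0, 1) and 1,000 "DE" records from N(0, 10)
y = simulate_mixture(counts=[9000, 1000], means=[0.0, 0.0], variances=[1.0, 10.0])
```

The overall variance of this mixture is 0.9 × 1 + 0.1 × 10 = 1.9, which gives a quick sanity check on the simulated records.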
Simulation 2
1. Simulate: $f(y; \hat{\Psi}) = 0.9\, N(0, 1) + 0.1\, N(0, 10)$
2. Ask EMMIX to fit mixtures with up to 5 components and…
3. EMMIX model of best fit: $f(y; \hat{\Psi}) = 0.903\, N(0.006, 0.993) + 0.097\, N(0.010, 10.805)$
Simulation 2
1. Simulate: $f(y; \hat{\Psi}) = 0.9\, N(0, 1) + 0.1\, N(0, 10)$
3. EMMIX best fit: $f(y; \hat{\Psi}) = 0.903\, N(0.006, 0.993) + 0.097\, N(0.010, 10.805)$
[Figure: Frequency and Post Prob]
The posterior probabilities act as a “decision function” that changes at 2.75.
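The switching point of the decision function can be located numerically from the fitted mixture. The sketch below uses bisection to find where the posterior probability of the wide (DE) component reaches 0.5 on the positive side; the function names, search interval, and tolerance are assumptions, and a mirror-image crossing exists on the negative side.

```python
import math

# EMMIX best fit from the slide: (proportion, mean, variance) per component
FITTED = [(0.903, 0.006, 0.993), (0.097, 0.010, 10.805)]

def normal_pdf(y, mean, var):
    """Density of N(mean, var) at y (var is a variance)."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def posterior_de(y):
    """Posterior probability that y belongs to the wide (DE) component."""
    weighted = [p * normal_pdf(y, m, v) for p, m, v in FITTED]
    return weighted[1] / sum(weighted)

def crossing_point(lo=0.5, hi=10.0, tol=1e-10):
    """Bisection for posterior_de(y) = 0.5; posterior_de is increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if posterior_de(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

threshold = crossing_point()
print(f"posterior probability of DE reaches 0.5 near y = {threshold:.2f}")
```

Observations beyond this point are more likely to come from the DE component than from the non-DE component, which is what makes the posterior probability behave like a decision rule.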
Linking Posterior Probabilities with False Discovery Rate
[Figure: mixture components labelled Not-DE and DE]
Select the N most extreme genes; the FDR is then the average posterior probability of not being in the cluster of extremes.
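This link between posterior probabilities and FDR can be sketched as a running average. The example assumes that "most extreme" means ranking genes by their posterior probability of belonging to the extreme cluster, which is one plausible reading; the function name and the four posterior probabilities are made up for illustration.

```python
def fdr_by_n(post_prob_extreme):
    """Estimated FDR for the top-N genes, for N = 1..len(input).

    `post_prob_extreme` holds, per gene, the posterior probability of belonging
    to the extreme (DE) cluster. Genes are ranked from most to least extreme
    (assumed ranking), and the FDR for the top N is the average of
    (1 - posterior probability) over those N genes.
    """
    ranked = sorted(post_prob_extreme, reverse=True)
    fdr, running = [], 0.0
    for n, pp in enumerate(ranked, start=1):
        running += 1.0 - pp  # probability that this gene is a false discovery
        fdr.append(running / n)
    return fdr

# Hypothetical posterior probabilities for four genes
curve = fdr_by_n([0.99, 0.50, 0.90, 0.95])
```

Because genes are added in order of decreasing posterior probability, the estimated FDR can only grow as N increases, which is why it is natural to plot FDR against the number of selected genes.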
Simulation 2
1. Simulate: $f(y; \hat{\Psi}) = 0.9\, N(0, 1) + 0.1\, N(0, 10)$
3. EMMIX best fit: $f(y; \hat{\Psi}) = 0.903\, N(0.006, 0.993) + 0.097\, N(0.010, 10.805)$
[Figure: Post Prob]
Select the N most extreme genes; the FDR is then the average posterior probability (Post Prob) of not being in the cluster of extremes.
[Figure: FDR by N Genes]
Example: “Diets” (only REFERENCE components of the design)
88ii rg
iiHvLi
rgy
$$f(y; \Psi_k) = \sum_{i=1}^{k} \pi_i \, \phi(y; \mu_i, V_i)$$
$$f(y; \hat{\Psi}) = 0.044\, N(0.87, 67.46) + 0.590\, N(2.30, 10.42) + 0.366\, N(2.41, 2.32)$$
Example: “Diets” (only REFERENCE components of the design)
$$f(y; \hat{\Psi}) = 0.044\, N(0.87, 67.46) + 0.590\, N(2.30, 10.42) + 0.366\, N(2.41, 2.32)$$
Example: “Diets” (only REFERENCE components of the design)
$$f(y; \hat{\Psi}) = 0.044\, N(0.87, 67.46) + 0.590\, N(2.30, 10.42) + 0.366\, N(2.41, 2.32)$$
[Figure: FDR by N Genes]
In Reverter et al. (2003; JAS 81:1900), 27 genes were reported as having a posterior probability (PP) > 0.95 of being in the extreme cluster.
We can now assess that this set of 27 genes has an FDR < 10%.