The trivial case of the missing heritability
Transcript of The trivial case of the missing heritability
![Page 1: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/1.jpg)
Max Moldovan, Bioinformatics Division, WEHI
Bioinformatics Seminar
December 08, 2009
The danger of following traditions: The trivial case of the missing heritability
![Page 2: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/2.jpg)
Motivation
- It is well known that a number of human traits are highly heritable
  - Human height is 80-90% heritable (Visscher, 2008, Nature Genetics 40:489-490)
  - Autism is more than 90% heritable (Sullivan, 2005, PLoS Med. 2:e212)
  - Schizophrenia is more than 80% heritable (Freitag, 2007, Mol. Psychiatr. 12:2-22)
  - Heroin addiction is up to 60% heritable (Tsuang et al., 1996, Am. J. Med. Gen. 67:473-477)
![Page 3: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/3.jpg)
Looking at genes – ~95% of heritability is missing
![Page 4: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/4.jpg)
Searching for genetic "dark matter"
- G x E and G x G interactions
  - How deep to go?
- Rare variants
  - With larger effect sizes?
- Structural variants
  - Deletions, duplications, inversions
- Epigenetics
  - Heritable?
- Overestimated heritability
- Poorly characterized phenotypes
![Page 5: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/5.jpg)
The trivial case of the missing heritability
Talk outline:
- GWAS and inheritance models
- Traditional inference
- Efficiency robust inference
- Empirical illustration
- Discussion (implications for heritability)
![Page 6: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/6.jpg)
Genome-wide association study (GWAS)
- Genetic information (e.g. SNPs) is collected from two groups of individuals, cases and controls, who are discordant with respect to a specified trait
- Genomes are analysed in order to define regions/markers where causative genetic variants are likely to reside
- One of the main analytical objectives is to detect associations between genotypes and the trait (phenotype)
![Page 7: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/7.jpg)
Inheritance models at a single bi-allelic locus
Genotype groups (AA, Aa, aa) by model:
- A is Dominant
- A is Recessive
- A is Co-Dominant
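The cell contents of the slide's table did not survive in the transcript. As a hedged illustration, the sketch below encodes the textbook grouping of genotypes under each model as numeric scores, assuming A is the allele of interest. The names `GENOTYPES`, `MODEL_SCORES` and `genotype_groups` are hypothetical and do not come from the talk.

```python
from typing import Dict, List, Tuple

GENOTYPES = ("AA", "Aa", "aa")  # column order used on the slide

# Scores per genotype, assuming A is the allele of interest (an assumption).
MODEL_SCORES: Dict[str, Tuple[float, float, float]] = {
    "dominant": (1.0, 1.0, 0.0),     # one copy of A suffices: AA and Aa are grouped
    "recessive": (1.0, 0.0, 0.0),    # two copies of A are needed: Aa and aa are grouped
    "co-dominant": (1.0, 0.5, 0.0),  # Aa lies midway between AA and aa
}


def genotype_groups(model: str) -> Dict[float, Tuple[str, ...]]:
    """Group the genotypes that share a score under the given model."""
    groups: Dict[float, List[str]] = {}
    for genotype, score in zip(GENOTYPES, MODEL_SCORES[model]):
        groups.setdefault(score, []).append(genotype)
    return {score: tuple(members) for score, members in groups.items()}


for name in MODEL_SCORES:
    print(name, genotype_groups(name))
```

Genotypes that share a score are treated as one group, so the dominant and recessive models each collapse the three genotypes into two groups, while the co-dominant model keeps all three apart.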
![Page 8: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/8.jpg)
Statistical tests for detecting genotype-phenotype associations
- The Cochran-Armitage trend test (CATT) is shown to be optimal if the inheritance model is known (Lettre et al., 2007, Genetic Epidemiology 31:358-362)
- In practice, the inheritance model is not known
- The co-dominant CATT is the traditional choice (see recently reported GWAS)
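For readers who want to see the CATT concretely, here is a minimal sketch of its usual large-sample form for a 2x3 case-control table. It is not the code used in the talk: the function name `catt`, the score convention and the example counts are assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple


def catt(cases: Sequence[int], controls: Sequence[int],
         scores: Sequence[float]) -> Tuple[float, float]:
    """Return (z, two-sided asymptotic p-value) of the Cochran-Armitage trend test.

    cases and controls hold genotype counts in the same order as scores.
    """
    R, S = sum(cases), sum(controls)
    N = R + S
    totals = [r + s for r, s in zip(cases, controls)]
    sx = sum(x * n for x, n in zip(scores, totals))
    sxx = sum(x * x * n for x, n in zip(scores, totals))
    # Score statistic and its null variance (common form with divisor N).
    u = N * sum(x * r for x, r in zip(scores, cases)) - R * sx
    var = R * S * (N * sxx - sx * sx) / N
    if var <= 0:
        return 0.0, 1.0  # degenerate table: the scores do not vary
    z = u / math.sqrt(var)
    return z, math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical counts in the order (AA, Aa, aa).
z_add, p_add = catt((30, 40, 30), (10, 40, 50), (1.0, 0.5, 0.0))  # co-dominant scores
z_dom, p_dom = catt((30, 40, 30), (10, 40, 50), (1.0, 1.0, 0.0))  # dominant scores
print(z_add, p_add)
print(z_dom, p_dom)
```

Changing only the score vector switches the test between inheritance models. With dominant or recessive scores the squared statistic coincides with Pearson's chi-square for the collapsed 2x2 table.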
![Page 9: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/9.jpg)
Alternatives: Efficiency robust significance tests
- Statistical tests that remain sensitive to genotype-phenotype associations even when the genetic model is unknown or misspecified (Podgor et al., 1996, Stat. Med. 15:2095-2105)
- The MAX test (Freidlin et al., 2002, Human Heredity 53:146-152) is one such efficiency robust testing strategy
![Page 10: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/10.jpg)
MAX efficiency robust testing
- MAX3: additive (co-dominant), dominant and recessive CATTs: $T_{\text{MAX3}} = \max(T_A, T_D, T_R)$, then use $T_{\text{MAX3}}$ to compute p-values
- MAX4: additive (co-dominant), dominant and recessive CATTs, plus Pearson's Chi-sq: $p_{\text{p-min}} = \min(p_{\text{T-max}}, p_{\text{Chi-sq}})$, then use $p_{\text{p-min}}$ as the test statistic to compute p-values
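The MAX3 statistic can be sketched as follows. This is a minimal illustration, assuming the three CATT statistics are compared in absolute value (two-sided) and that genotype counts are ordered (AA, Aa, aa) with A as the allele of interest. The names `SCORES`, `catt_z` and `max3`, and the example counts, are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

# Assumed scores for genotypes in the order (AA, Aa, aa).
SCORES: Dict[str, Tuple[float, float, float]] = {
    "additive": (1.0, 0.5, 0.0),
    "dominant": (1.0, 1.0, 0.0),
    "recessive": (1.0, 0.0, 0.0),
}


def catt_z(cases: Sequence[int], controls: Sequence[int],
           scores: Sequence[float]) -> float:
    """Large-sample Cochran-Armitage trend statistic for one score vector."""
    R, S = sum(cases), sum(controls)
    N = R + S
    totals = [r + s for r, s in zip(cases, controls)]
    sx = sum(x * n for x, n in zip(scores, totals))
    sxx = sum(x * x * n for x, n in zip(scores, totals))
    var = R * S * (N * sxx - sx * sx) / N
    if var <= 0:
        return 0.0  # degenerate table: the scores do not vary
    u = N * sum(x * r for x, r in zip(scores, cases)) - R * sx
    return u / math.sqrt(var)


def max3(cases: Sequence[int], controls: Sequence[int]) -> Tuple[float, str]:
    """Return (T_MAX3, model attaining the maximum)."""
    z = {model: catt_z(cases, controls, x) for model, x in SCORES.items()}
    best = max(z, key=lambda model: abs(z[model]))
    return abs(z[best]), best


# Hypothetical table with a recessive-looking pattern.
t_max3, best_model = max3((40, 30, 30), (10, 45, 45))
print(t_max3, best_model)
```

The sketch stops at the statistic on purpose: turning it into a p-value requires the null distribution of the maximum, which is a separate problem from computing the statistic itself.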
![Page 11: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/11.jpg)
Problems with MAX
- The distribution of the MAX test statistics is either unknown or difficult to obtain (e.g. via asymptotic approximations)
- Permutation procedures can be applied, but they are extremely computationally intensive
- The p-values based on permutations or asymptotic approximations are not statistically valid p-values
![Page 12: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/12.jpg)
Statistical validity
$\Pr(P(Y) \le \alpha \mid H_0) \le \alpha$
- Corresponds to a test of correct size, i.e. the type I error rate does not exceed the nominal level α
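One way to probe this inequality numerically is to enumerate every table that shares the observed margins and add up the null probability of those whose p-value falls at or below α. The sketch below does this for the asymptotic additive CATT. Conditioning on the margins (which gives multivariate hypergeometric table probabilities) is an assumption of this illustration, and the names `asymptotic_catt_p` and `conditional_size` and the example margins are hypothetical.

```python
import math
from typing import Sequence


def asymptotic_catt_p(cases: Sequence[int], totals: Sequence[int],
                      scores: Sequence[float] = (1.0, 0.5, 0.0)) -> float:
    """Two-sided large-sample CATT p-value given case counts and genotype totals."""
    R, N = sum(cases), sum(totals)
    S = N - R
    sx = sum(x * n for x, n in zip(scores, totals))
    sxx = sum(x * x * n for x, n in zip(scores, totals))
    var = R * S * (N * sxx - sx * sx) / N
    if var <= 0:
        return 1.0  # degenerate margins: no evidence either way
    u = N * sum(x * r for x, r in zip(scores, cases)) - R * sx
    return math.erfc(abs(u) / math.sqrt(2.0 * var))


def conditional_size(totals: Sequence[int], n_cases: int, alpha: float) -> float:
    """Pr(p <= alpha | H0) with genotype totals and the number of cases held fixed."""
    denom = math.comb(sum(totals), n_cases)
    size = 0.0
    for r0 in range(min(totals[0], n_cases) + 1):
        for r1 in range(min(totals[1], n_cases - r0) + 1):
            r2 = n_cases - r0 - r1
            if r2 > totals[2]:
                continue  # not a feasible table for these margins
            if asymptotic_catt_p((r0, r1, r2), totals) <= alpha:
                weight = (math.comb(totals[0], r0) * math.comb(totals[1], r1)
                          * math.comb(totals[2], r2))
                size += weight / denom
    return size


# Hypothetical margins: genotype totals (AA, Aa, aa) and number of cases.
print(conditional_size((6, 8, 6), 10, 0.05))
```

A returned value above α would indicate that the test is liberal at that level for those margins, while a value below α would indicate conservatism.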
![Page 13: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/13.jpg)
Additive approximate vs. Fisher CATT p-vals
[Scatter plot: approximate additive CATT p-values (x-axis) against exact additive CATT p-values (y-axis), both axes from 0.0000 to 0.0014]
![Page 14: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/14.jpg)
Statistical validity
Liberal test with inflated size
Exact and conservative test
![Page 15: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/15.jpg)
Statistical validity
Liberal test with inflated size
Exact and efficient test
![Page 16: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/16.jpg)
Fisher-MAX p-values
The probability of each possible table (outcome):
The Fisher-MAX p-value is the probability of tables equally or more extreme than the observed one, i.e. with $T(X_1, X_2) \le t(x_1, x_2)$:
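The slide's own formulas were not preserved in the transcript, so the following is only a sketch of the general idea under stated assumptions. It conditions on the table margins, so each table has a multivariate hypergeometric probability, and it orders tables by the MAX3 statistic. Larger values therefore count as more extreme here, which is the reverse of the slide's $T \le t$ convention, where the ordering statistic appears to be p-value-like. The names `max3_stat` and `conditional_exact_p` are hypothetical.

```python
import math
from typing import Sequence

# Assumed scores for genotypes in the order (AA, Aa, aa): additive, dominant, recessive.
SCORES = ((1.0, 0.5, 0.0), (1.0, 1.0, 0.0), (1.0, 0.0, 0.0))


def max3_stat(cases: Sequence[int], totals: Sequence[int]) -> float:
    """MAX3 statistic for given case counts and genotype totals."""
    R, N = sum(cases), sum(totals)
    S = N - R
    best = 0.0
    for x in SCORES:
        sx = sum(a * n for a, n in zip(x, totals))
        sxx = sum(a * a * n for a, n in zip(x, totals))
        var = R * S * (N * sxx - sx * sx) / N
        if var > 0:
            u = N * sum(a * r for a, r in zip(x, cases)) - R * sx
            best = max(best, abs(u) / math.sqrt(var))
    return best


def conditional_exact_p(cases: Sequence[int], controls: Sequence[int]) -> float:
    """Sum the probabilities of all same-margin tables at least as extreme as observed."""
    totals = [r + s for r, s in zip(cases, controls)]
    R, N = sum(cases), sum(totals)
    t_obs = max3_stat(cases, totals)
    denom = math.comb(N, R)
    p = 0.0
    for r0 in range(min(totals[0], R) + 1):
        for r1 in range(min(totals[1], R - r0) + 1):
            r2 = R - r0 - r1
            if r2 > totals[2]:
                continue  # not a feasible table for these margins
            if max3_stat((r0, r1, r2), totals) >= t_obs - 1e-9:  # tolerance for ties
                weight = (math.comb(totals[0], r0) * math.comb(totals[1], r1)
                          * math.comb(totals[2], r2))
                p += weight / denom
    return p


# Hypothetical small table in the order (AA, Aa, aa).
print(conditional_exact_p((7, 4, 1), (2, 5, 6)))
```

Full enumeration grows quickly with sample size, so this form is only meant to make the definition concrete. It says nothing about how the talk achieved its reported run times.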
![Page 17: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/17.jpg)
Fisher-MAX p-values
- Valid: can be slightly conservative, but the conservatism can be eliminated, leading to exact p-values
- Computationally feasible: for (n1,n2)=(162,131), each SNP takes between 0.01 and 1.10 seconds to compute (~3-4 h for 300K+ SNPs on a single CPU)
- Efficient: the test is sensitive to association signals even when the model is unknown or misspecified
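The timing figures can be cross-checked with simple arithmetic. The sketch below assumes a round 300,000 SNPs for the quoted "300K+". The variable names are illustrative only.

```python
SECONDS_PER_HOUR = 3600

n_snps = 300_000               # assumed round figure for "300K+"
fastest, slowest = 0.01, 1.10  # seconds per SNP, as quoted

# Total run time if every SNP took the fastest or the slowest quoted time.
hours_if_all_fastest = n_snps * fastest / SECONDS_PER_HOUR
hours_if_all_slowest = n_snps * slowest / SECONDS_PER_HOUR

# Average per-SNP time implied by a total run of 3 to 4 hours.
implied_avg_low = 3 * SECONDS_PER_HOUR / n_snps
implied_avg_high = 4 * SECONDS_PER_HOUR / n_snps

print(f"all SNPs at the fastest rate: {hours_if_all_fastest:.2f} h")
print(f"all SNPs at the slowest rate: {hours_if_all_slowest:.2f} h")
print(f"implied average: {implied_avg_low:.3f} to {implied_avg_high:.3f} s per SNP")
```

Under these assumptions the implied average sits close to the fast end of the quoted range, so most SNPs would have to be computed in a few hundredths of a second.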
![Page 18: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/18.jpg)
HCV genotype 1 progression
- HCV infection
  - Clearance (~20%)
  - Chronic HCV (~80%)
    - No treatment response (~50%)
    - Treatment response (~50%)
Source: Based on NIH information
![Page 19: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/19.jpg)
Genome-wide analysis in Suppiah et al. 2009
- 162 cases (non-responders) vs 131 controls (responders), 311,159 SNPs
- Additive CATT was used with p-value cut-off 0.001
- One SNP was genome-wide significant
- 306 SNPs passed the p-val < 0.001 threshold
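To put the 0.001 cut-off in perspective, the short calculation below compares the observed count with the number of SNPs expected to pass by chance if every SNP were null with uniformly distributed p-values. The Bonferroni threshold at an overall level of 0.05 is shown only as one common definition of genome-wide significance. The transcript does not state which definition the study used, so treat it as an assumption.

```python
n_snps = 311_159     # SNPs analysed, from the slide
cutoff = 0.001       # p-value cut-off, from the slide
observed_hits = 306  # SNPs passing the cut-off, from the slide

# Expected number passing if all SNPs were null with uniform p-values.
expected_null_hits = n_snps * cutoff

# Assumed definition of genome-wide significance: Bonferroni at overall level 0.05.
bonferroni_threshold = 0.05 / n_snps

print(f"expected by chance: {expected_null_hits:.1f}, observed: {observed_hits}")
print(f"Bonferroni threshold: {bonferroni_threshold:.2e}")
```

The observed count is close to the chance expectation, so the 0.001 cut-off is best read as a screening filter rather than as evidence of association for any individual SNP.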
![Page 20: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/20.jpg)
Reanalysis: additive CATT, MAX3 and MAX4 p-values for the same cut-off 0.001
![Page 21: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/21.jpg)
$\max(p_{\text{MAX4}}) = 0.0028$
Reanalysis: additive CATT, MAX3 and MAX4 p-values for the same cut-off 0.001
![Page 22: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/22.jpg)
$\max(p_{\text{MAX4}}) = 0.0028 > 0.001$
Reanalysis: additive CATT, MAX3 and MAX4 p-values for the same cut-off 0.001
![Page 23: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/23.jpg)
Summary
- Looking at genomes: ~95% of heritability is missing
- There are several alternative inheritance models
- Traditional statistical inference procedures miss association signals by not accounting for alternative inheritance models
- Can some heritability be hidden in overlooked association signals?
![Page 24: The trivial case of the missing heritability](https://reader034.fdocuments.net/reader034/viewer/2022042607/557d14ecd8b42a4f498b47f9/html5/thumbnails/24.jpg)
Acknowledgments
- Bioinformatics, WEHI: Melanie Bahlo, Terry Speed
- NTNU, Norway: Mette Langaas
- AGRF: Rust Turakulov
- MBS, UniMelb: Chris Lloyd
- Millennium Institute & Westmead Children's Hospital, Sydney: Vijay Suppiah, David Booth, Jacob George
- Funding: ARC Linkage Grant