Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.
![Page 1: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/1.jpg)
Statistical methods for genetic association studies
http://www.stats.gla.ac.uk/~paulj/assoc_study_stats.ppt
![Page 2: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/2.jpg)
A tutorial on statistical methods for population association studies
David Balding
Nature Reviews Genetics (2006) 7:781-791
![Page 3: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/3.jpg)
[Figure: is the health outcome driven by Environment, by Genetics, or by G×E interaction?]
![Page 4: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/4.jpg)
Recombination
[Figure: recombination between gametophytes (gamete-producing cells) and gametes, showing alleles at loci A/a, B/b and X/x]
X/x: unobserved causative mutation
A/a: distant marker
B/b: linked marker
![Page 5: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/5.jpg)
Approaches to finding disease genes
• Population-based association study
  – “unrelated” subjects
• Family-based association study
  – nuclear families
• Admixture mapping
  – recently admixed population
• Linkage mapping
  – large pedigrees
Darvasi & Shifman (2005) Nature Genetics
![Page 6: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/6.jpg)
Types of population association study
• Candidate causative polymorphism
  – SNP (single nucleotide polymorphism), deletion, duplication
• Candidate causative gene (5-50 marker SNPs)
  – evidence from linkage study or function
• Candidate causative region (100s of marker SNPs)
  – evidence from linkage study
• Genome-wide (>300,000 marker SNPs)
  – no prior evidence required
![Page 7: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/7.jpg)
Common disease common variant (CDCV) hypothesis
![Page 8: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/8.jpg)
Preliminary analysis: data quality
• Assuming mating is random and the population is large, Hardy-Weinberg equilibrium (HWE) genotype frequencies will apply
• Allele frequencies: $P(X) = p$, $P(x) = q$
• HWE genotype frequencies: $P(XX) = p^2$, $P(Xx) = 2pq$, $P(xx) = q^2$

|     | p     | q     |
|-----|-------|-------|
| p   | $p^2$ | $pq$  |
| q   | $pq$  | $q^2$ |

• Useful data quality check:
  – chi-squared or exact test
  – log QQ plot
• But can discard causative mutations
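The chi-squared HWE check can be sketched as follows. This is a minimal illustration, not the procedure of any particular package: the function name `hwe_chi2_test` and its arguments are hypothetical, the allele frequency is estimated from the observed genotype counts, and the test has 1 degree of freedom. It assumes both alleles are observed (otherwise an expected count is zero).

```python
import math

def hwe_chi2_test(n_XX, n_Xx, n_xx):
    """Pearson chi-squared test (1 df) for departure from HWE at a biallelic locus.

    Assumes both alleles are present in the sample, so no expected count is zero.
    """
    n = n_XX + n_Xx + n_xx
    p = (2 * n_XX + n_Xx) / (2 * n)      # estimated frequency of allele X
    q = 1 - p                            # estimated frequency of allele x
    observed = (n_XX, n_Xx, n_xx)
    expected = (n * p * p, 2 * n * p * q, n * q * q)   # p^2, 2pq, q^2 scaled by n
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-squared variable with 1 df:
    # P(Z^2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Example: a sample with no heterozygotes deviates strongly from HWE
chi2, p_value = hwe_chi2_test(50, 0, 50)
```

For SNPs with small genotype counts, an exact test is usually preferred over this large-sample approximation.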
![Page 9: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/9.jpg)
Log QQ plot
![Page 10: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/10.jpg)
Preliminary analysis: dealing with missing data
• Imputation
  – various methods: maximum likelihood; probabilistic; ‘hot-deck’; regression modelling
  – test for independence of ‘missingness’ and case-control status
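One simple way to test whether missingness is independent of case-control status is a chi-squared test on the 2×2 table of (missing, called) by (case, control). The sketch below uses the shortcut formula for a 2×2 table; the function name and arguments are hypothetical, and it assumes no row or column total is zero.

```python
import math

def missingness_test(case_missing, case_called, control_missing, control_called):
    """Chi-squared test (1 df) of independence between missingness and status.

    Uses the 2x2 shortcut: chi2 = N (ad - bc)^2 / (row and column totals multiplied).
    Assumes every row and column total is non-zero.
    """
    a, b, c, d = case_missing, case_called, control_missing, control_called
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))   # chi-squared survival function, 1 df
    return chi2, p_value

# Example: 20% of case genotypes missing versus 5% of control genotypes
chi2, p_value = missingness_test(20, 80, 5, 95)
```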
![Page 11: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/11.jpg)
Choice of inheritance model
[Figure: Dominant vs additive inheritance. x-axis: number of trait alleles inherited (0, 1, 2); y-axis: trait value (0% to 100%); series: Dominant, Additive]
![Page 12: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/12.jpg)
![Page 13: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/13.jpg)
![Page 14: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/14.jpg)
Tests of association: single SNP
• Case-control
  – Treat genotype as factor with 3 levels, perform 2x3 goodness-of-fit test. Loses power if effect is additive
  – Count alleles rather than individuals, perform 2x2 goodness-of-fit test. Out of favour because
    • sensitive to deviation from HWE
    • risk estimates not interpretable

|         | Major allele homozygote (0) | Heterozygote (1) | Minor allele homozygote (2) |
|---------|-----------------------------|------------------|-----------------------------|
| Case    |                             |                  |                             |
| Control |                             |                  |                             |
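The two tests on this slide can be sketched with a Pearson chi-squared statistic on the 2×3 genotype table (2 df) and on the 2×2 allele-count table (1 df). The function names and the example counts are illustrative assumptions, and genotype counts are ordered as (0, 1, 2) copies of the minor allele.

```python
import math

def chi2_stat(table):
    """Pearson chi-squared statistic for a contingency table (list of rows)."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n   # expected count under independence
            stat += (obs - exp) ** 2 / exp
    return stat

def genotype_test(cases, controls):
    """2x3 test: genotype treated as a 3-level factor (2 df)."""
    stat = chi2_stat([list(cases), list(controls)])
    return stat, math.exp(-stat / 2)            # chi-squared survival function, 2 df

def allelic_test(cases, controls):
    """2x2 test: count alleles rather than individuals (1 df)."""
    def allele_counts(g):
        return [2 * g[0] + g[1], g[1] + 2 * g[2]]   # [major, minor] allele counts
    stat = chi2_stat([allele_counts(cases), allele_counts(controls)])
    return stat, math.erfc(math.sqrt(stat / 2))     # chi-squared survival function, 1 df

# Hypothetical genotype counts for 50 cases and 50 controls
cases, controls = (10, 20, 20), (20, 20, 10)
geno_stat, geno_p = genotype_test(cases, controls)
allele_stat, allele_p = allelic_test(cases, controls)
```

Note that the allelic test doubles the effective sample size by treating the two alleles of each individual as independent, which is why it is sensitive to deviation from HWE.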
![Page 15: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/15.jpg)
Tests of association: single SNP
• Case-control
  – Cochran-Armitage test
    • loses power if additivity assumption wrong
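A minimal sketch of the Cochran-Armitage trend test follows, assuming additive weights (0, 1, 2) for the three genotypes. The function name and example counts are hypothetical. The statistic is the squared trend score divided by its variance under the null hypothesis, and it is referred to a chi-squared distribution with 1 df.

```python
import math

def cochran_armitage(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on a 2x3 genotype table (1 df).

    cases / controls: genotype counts for 0, 1, 2 copies of the minor allele.
    weights: genotype scores; (0, 1, 2) corresponds to an additive model.
    """
    R, S = sum(cases), sum(controls)
    N = R + S
    totals = [r + s for r, s in zip(cases, controls)]
    # Trend score: T = sum_i w_i (r_i S - s_i R)
    T = sum(w * (r * S - s * R) for w, r, s in zip(weights, cases, controls))
    sum_w = sum(w * n for w, n in zip(weights, totals))
    sum_w2 = sum(w * w * n for w, n in zip(weights, totals))
    # Null variance of T: (R S / N) * (N sum w_i^2 n_i - (sum w_i n_i)^2)
    var_T = (R * S / N) * (N * sum_w2 - sum_w ** 2)
    stat = T ** 2 / var_T
    return stat, math.erfc(math.sqrt(stat / 2))   # chi-squared survival function, 1 df

# Hypothetical genotype counts for 50 cases and 50 controls
stat, p_value = cochran_armitage((10, 20, 20), (20, 20, 10))
```

Changing `weights` to (0, 1, 1) or (0, 0, 1) gives the corresponding tests for dominant or recessive effects of the minor allele.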
![Page 16: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/16.jpg)
Tests of association: single SNP
• Case-control
  – Armitage or goodness-of-fit? Depends on:
    • Prior knowledge of inheritance (additive, dominant, etc)
    • Genotype frequencies, e.g. use Armitage test when minor allele is rare, goodness-of-fit test otherwise
![Page 17: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/17.jpg)
Tests of association: single SNP
• Case-control
  – Logistic regression
    • Easily incorporates inheritance model (additive, dominant, etc)
    • But treats phenotype, not genotype, as the outcome variable, so it is easier to justify for prospective studies
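To illustrate how logistic regression incorporates the inheritance model, the sketch below recodes the genotype (0, 1, 2 minor alleles) according to the chosen model and fits a one-predictor logistic regression by Newton-Raphson. The names `CODINGS` and `fit_logistic`, the fixed iteration count, and the example data are all assumptions made for illustration. In practice a statistics package would be used, and this toy fit does not handle perfectly separated data.

```python
import math

# Genotype recoding for each inheritance model (with respect to the minor allele)
CODINGS = {
    "additive":  {0: 0, 1: 1, 2: 2},
    "dominant":  {0: 0, 1: 1, 2: 1},
    "recessive": {0: 0, 1: 0, 2: 1},
}

def fit_logistic(genotypes, status, model="additive", iters=25):
    """Fit logit P(case) = b0 + b1 * coded genotype by Newton-Raphson.

    genotypes: 0/1/2 per individual; status: 1 = case, 0 = control.
    Returns (b0, b1); exp(b1) is the odds ratio per unit of the coded genotype.
    """
    x = [CODINGS[model][g] for g in genotypes]
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, status):
            mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))   # fitted P(case)
            w = mu * (1.0 - mu)
            g0 += yi - mu              # score (gradient) for b0
            g1 += (yi - mu) * xi       # score (gradient) for b1
            h00 += w                   # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g for the 2x2 information matrix H
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: 50 cases with genotype counts (10, 20, 20), 50 controls with (20, 20, 10)
genotypes = [0] * 10 + [1] * 20 + [2] * 20 + [0] * 20 + [1] * 20 + [2] * 10
status = [1] * 50 + [0] * 50
b0, b1 = fit_logistic(genotypes, status, model="dominant")
odds_ratio = math.exp(b1)
```

Under the dominant coding the predictor is binary, so the fitted odds ratio coincides with the odds ratio of the collapsed 2×2 table.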
![Page 18: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/18.jpg)
Tests of association: single SNP
• Continuous outcome
  – Linear regression
• Ordered categorical outcomes
  – Multinomial regression
![Page 19: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/19.jpg)
Problems: population stratification
Cases
![Page 20: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/20.jpg)
Correcting for population stratification
• Genomic control
  – Genotype null SNPs and use to calculate background inflation in test statistic due to population stratification
  – Limited to simple single-SNP analyses
  – Can over- or under-correct
• Other approaches using null SNPs
  – Regression, principal components analysis, model underlying demography
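Genomic control can be sketched as follows, assuming 1-df chi-squared statistics (for example from the Armitage test). The inflation factor is estimated as the median of the null-SNP statistics divided by the median of a chi-squared distribution with 1 df (about 0.4549), and the candidate statistics are divided by it. The function name is hypothetical, and truncating the factor at 1 is a common convention rather than something stated on the slide.

```python
import statistics

# Median of a chi-squared distribution with 1 df (approximate value)
CHI2_1DF_MEDIAN = 0.4549

def genomic_control(null_stats, test_stats):
    """Estimate the inflation factor from null SNPs and deflate test statistics.

    null_stats: 1-df chi-squared statistics at presumed null SNPs.
    test_stats: 1-df chi-squared statistics at the SNPs of interest.
    """
    lam = statistics.median(null_stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)    # convention: never inflate statistics when lambda < 1
    return lam, [s / lam for s in test_stats]

# Hypothetical statistics: null SNPs whose median is twice the expected value
lam, corrected = genomic_control([0.2, 0.9098, 3.1], [10.0, 4.0])
```

Because a single factor is applied to every SNP, the correction can be too strong for some SNPs and too weak for others, which is the over- or under-correction noted above.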
![Page 21: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/21.jpg)
Problems: multiple testing
• Bonferroni correction
  – conservative when SNPs are linked
• Permutation
  – computationally demanding
• False discovery rate
• Bayesian approaches
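Two of these corrections are easy to sketch. Bonferroni compares each p-value with alpha divided by the number of tests. For the false discovery rate, the sketch uses the Benjamini-Hochberg step-up procedure, which is one common FDR method and an assumption here, since the slide does not name one. Function names and example p-values are illustrative.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 for each test with p <= alpha / m (controls family-wise error)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the false discovery rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # indices by ascending p
    k_max = 0
    for rank, i in enumerate(order, start=1):
        # Find the largest rank k with p_(k) <= k * alpha / m
        if p_values[i] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected

# Hypothetical p-values from four SNPs
p_values = [0.001, 0.02, 0.03, 0.5]
bonf = bonferroni(p_values)            # [True, False, False, False]
bh = benjamini_hochberg(p_values)      # [True, True, True, False]
```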
![Page 22: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/22.jpg)
Tests of association: multiple SNPs
• Advantages
  – Many SNPs may be linked to a gene, but individually may not have a significant effect
  – Interactions between SNPs can be modelled
  – ‘Tag’ SNPs can reduce testing of redundant linked SNPs
• Methods
  – Linear regression, logistic regression
  – Armitage test
• Haplotype-based methods
  – Natural interpretation
  – But power reduced due to multiple alleles
![Page 23: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/23.jpg)
Haplotypes
Nature Genetics 37, 915 - 916 (2005)
![Page 24: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/24.jpg)
![Page 25: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/25.jpg)
Inferring haplotype phase
![Page 26: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/26.jpg)
Inferring haplotype phase
?
![Page 27: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/27.jpg)
Inferring haplotype phase
![Page 28: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/28.jpg)
Inferring haplotype phase
![Page 29: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/29.jpg)
Inferring haplotype phase
Methods & software
• PHASE, FASTPHASE
• EH+
• FBAT
• HAPLOTYPER
• EM-DECODER
• PLEM
• HAP
• HAPLORE
• Haplo.stat
• SNPEM
• PEDPHASE
• SNPHAP
• TDTHAP
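To give a feel for what phase inference involves, here is a toy expectation-maximisation (EM) estimate of haplotype frequencies for two biallelic SNPs. It is a sketch under simplifying assumptions (unrelated individuals, HWE, no missing data) and is not the algorithm of any specific program listed above. The function name and data layout are hypothetical. Only double heterozygotes are phase-ambiguous, and EM splits them between the two possible phases in proportion to the current haplotype frequency estimates.

```python
def em_haplotype_freqs(counts, iters=100):
    """EM estimate of two-SNP haplotype frequencies from unphased genotypes.

    counts[g1][g2]: number of individuals carrying g1 copies of allele '1' at
    SNP 1 and g2 copies at SNP 2 (each 0, 1 or 2).
    Returns a dict mapping haplotype (allele at SNP 1, allele at SNP 2) to frequency.
    """
    n_hap = 2 * sum(sum(row) for row in counts)
    # Haplotype counts from genotypes whose phase is unambiguous
    base = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.0}
    for g1 in range(3):
        for g2 in range(3):
            if g1 == 1 and g2 == 1:
                continue                      # double heterozygote: handled in EM
            a = (0, 1) if g1 == 1 else (g1 // 2, g1 // 2)   # alleles at SNP 1
            b = (0, 1) if g2 == 1 else (g2 // 2, g2 // 2)   # alleles at SNP 2
            base[(a[0], b[0])] += counts[g1][g2]
            base[(a[1], b[1])] += counts[g1][g2]
    n_dh = counts[1][1]
    freqs = {h: 0.25 for h in base}           # uniform starting values
    for _ in range(iters):
        # E-step: probability that a double heterozygote is 00/11 rather than 01/10
        cis = freqs[(0, 0)] * freqs[(1, 1)]
        trans = freqs[(0, 1)] * freqs[(1, 0)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: update frequencies from expected haplotype counts
        expected = dict(base)
        expected[(0, 0)] += n_dh * w
        expected[(1, 1)] += n_dh * w
        expected[(0, 1)] += n_dh * (1 - w)
        expected[(1, 0)] += n_dh * (1 - w)
        freqs = {h: c / n_hap for h, c in expected.items()}
    return freqs

# Hypothetical data: 10 individuals 0/0, 10 double heterozygotes, 10 individuals 2/2
freqs = em_haplotype_freqs([[10, 0, 0], [0, 10, 0], [0, 0, 10]])
```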
![Page 30: Statistical methods for genetic association studies paulj/assoc_study_stats.ppt.](https://reader036.fdocuments.net/reader036/viewer/2022062318/5515c90b55034689058b4923/html5/thumbnails/30.jpg)
Inferring haplotype phase
• Phase cases and controls separately or pooled?
  – Separating can give inflated type I error
  – Pooling can reduce power