Ben Domingue Institute of Behavioral Science...
![Page 1: Ben Domingue Institute of Behavioral Science …gero.usc.edu/CBPH/files/4_29_15_PAA/Genome-wide...Ben Domingue Institute of Behavioral Science University of Colorado Boulder ben.domingue@gmail.com](https://reader033.fdocuments.net/reader033/viewer/2022051802/5afcd9397f8b9a444f8ca7bb/html5/thumbnails/1.jpg)
Genome-wide estimates of heritability
Ben Domingue
Institute of Behavioral Science
University of Colorado Boulder
ben.domingue@gmail.com
- Genes → behaviors & outcomes of interest.
- Genome-wide data: FHS, HRS, AddHealth, etc.
- Hard to get a handle on genotype/phenotype connection.
- GWAS results help, but have limited availability.
- Even when available, polygenic scores have limited predictive value.
What else can we do?
GCTA
Genome-wide Complex Trait Analysis (GCTA) tells us about heritability.
- GCTA estimates heritability without knowledge of causal variants.
- Instead uses "genetic similarity" (similar to logic of twin studies).
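Method
1. Estimate genome-wide similarity:
$$A_{jk} = \frac{1}{N} \sum_i \frac{(x_{ij} - 2p_i)(x_{ik} - 2p_i)}{2p_i(1 - p_i)}$$
2. Then estimate mixed model:
$$y = X\beta + g + \varepsilon$$
where $g \sim \mathrm{MVN}[0, \sigma_g^2 A]$.
3. Heritability:
$$\frac{\hat{\sigma}_g^2}{\hat{\sigma}_g^2 + \hat{\sigma}_\varepsilon^2}$$
Complicated model & not the DGP.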
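The three steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not GCTA itself: genotypes are simulated as independent SNPs, there are no covariates (so the $X\beta$ term reduces to centering $y$), and the variance components are fit by a simple maximum-likelihood grid search over $h^2$ rather than the REML routine GCTA uses. The function names `grm` and `ml_heritability` are hypothetical.

```python
import numpy as np

def grm(X):
    """Step 1: similarity matrix A from an (individuals x SNPs) array of 0/1/2 allele counts."""
    p = X.mean(axis=0) / 2.0                    # sample allele frequency p_i for each SNP
    keep = (p > 0) & (p < 1)                    # drop monomorphic SNPs (zero denominator)
    X, p = X[:, keep], p[keep]
    Z = (X - 2 * p) / np.sqrt(2 * p * (1 - p))  # standardized genotypes
    return Z @ Z.T / X.shape[1]                 # A_jk = (1/N) * sum_i z_ij * z_ik

def ml_heritability(y, A, grid=np.linspace(0.0, 0.99, 100)):
    """Steps 2-3: grid-search ML estimate of h^2 for y = g + e, g ~ MVN(0, sigma_g^2 A)."""
    y = y - y.mean()                            # intercept-only fixed effect
    s, U = np.linalg.eigh(A)                    # A = U diag(s) U^T
    s = np.clip(s, 0.0, None)                   # guard against tiny negative eigenvalues
    yt2 = (U.T @ y) ** 2                        # squared rotated phenotype
    n = len(y)
    best_h2, best_ll = 0.0, -np.inf
    for h2 in grid:
        d = h2 * s + (1.0 - h2)                 # rotated variances, up to total variance
        sigma2 = np.mean(yt2 / d)               # profiled total variance
        ll = -0.5 * (n * np.log(sigma2) + np.sum(np.log(d)))
        if ll > best_ll:
            best_h2, best_ll = h2, ll
    return best_h2

# Simulated data (assumption: independent SNPs, unrelated individuals).
rng = np.random.default_rng(0)
n, N = 300, 500
freqs = rng.uniform(0.1, 0.9, size=N)
X = rng.binomial(2, freqs, size=(n, N)).astype(float)
A = grm(X)

# Phenotype drawn from the mixed model with sigma_g^2 = sigma_e^2 = 0.5.
g = rng.multivariate_normal(np.zeros(n), 0.5 * A)
y = g + rng.normal(0.0, np.sqrt(0.5), size=n)
h2_hat = ml_heritability(y, A)
print(f"estimated h2: {h2_hat:.2f}")
```

With a sample this small the estimate is noisy; the point is only to show how $A$ enters the covariance of $y$ and how the heritability ratio falls out of the fitted variance components.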
Sensitivity to genetic architecture?
- Robust to # of causal variants.
- Sensitive to LD.
Sensitivity to environment?
Could genetic similarity just be a proxy for environmental similarity?
My goal: Offer intuition and basic guidance on when GCTA estimates may be reliable.
Data
HRS: 4950 non-Hispanic whites, ≈ 1.5M autosomal SNPs.
- Height: 0.40
Q1: Gen sim as function of SNPs
SNP set         Correlation
50% Sample      0.98
30% Sample      0.95
10% Sample      0.83
r² = 0.01       0.57
r² = 0.2        0.75
r² = 0.5        0.88
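A comparison of this kind can be sketched as follows, assuming that "50% Sample" means a random 50% subset of SNPs and that "Correlation" is the correlation between off-diagonal similarity values from the subset and from the full SNP set. The genotypes here are simulated independent SNPs, so the numbers will not match the HRS values above; the $r^2$ (LD-based) rows are not reproduced. `grm` and `offdiag_corr` are hypothetical helper names.

```python
import numpy as np

def grm(X):
    """Similarity matrix A from an (individuals x SNPs) array of 0/1/2 allele counts."""
    p = X.mean(axis=0) / 2.0
    keep = (p > 0) & (p < 1)                    # drop monomorphic SNPs
    X, p = X[:, keep], p[keep]
    Z = (X - 2 * p) / np.sqrt(2 * p * (1 - p))
    return Z @ Z.T / X.shape[1]

def offdiag_corr(A, B):
    """Correlation between the upper-triangle (pairwise) entries of two similarity matrices."""
    iu = np.triu_indices_from(A, k=1)
    return np.corrcoef(A[iu], B[iu])[0, 1]

# Simulated genotypes (assumption: independent SNPs, unrelated individuals).
rng = np.random.default_rng(1)
n, N = 200, 2000
freqs = rng.uniform(0.1, 0.9, size=N)
X = rng.binomial(2, freqs, size=(n, N)).astype(float)
A_full = grm(X)

corrs = {}
for frac in (0.5, 0.3, 0.1):
    cols = rng.choice(N, size=int(frac * N), replace=False)  # random SNP subset
    corrs[frac] = offdiag_corr(A_full, grm(X[:, cols]))
    print(f"{int(frac * 100)}% sample: correlation with full-SNP similarity = {corrs[frac]:.2f}")
```

In simulated data with no relatedness or LD the agreement drops off much faster than in the table, since pairwise similarity is then pure sampling noise.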
Q2: GWAS (height) variants
Q3: Heteroskedasticity
Heteroskedasticity is a common problem.
- weight on height.
- own education on paternal education.
Of concern here since we're estimating variance components.
- Simulate outcome based on GCTA model.
- $y = 0.5 \cdot \text{height} + g + \varepsilon$.
- $\varepsilon_i$ has variance $\exp(\alpha \cdot \text{height} \cdot \sigma_\varepsilon^2)$, where $\alpha$ controls level of heteroskedasticity and $\sigma_\varepsilon^2$ controls heritability.
Examine recovery of heritability, but def'n no longer simple.
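The simulation design can be sketched as below. This is an illustration only: the residual-variance expression is taken literally from the slide, height is a simulated standardized variable standing in for HRS height, genotypes are simulated, and `simulate_outcome`, `Z`, and `sigma2_g` are hypothetical names. The genetic effect is drawn as $Zu/\sqrt{N}$ with $u \sim N(0, \sigma_g^2)$, which gives $g \sim \mathrm{MVN}[0, \sigma_g^2 A]$ for $A = ZZ^\top/N$.

```python
import numpy as np

def simulate_outcome(Z, height, alpha, sigma2_eps, sigma2_g=1.0, rng=None):
    """Draw y = 0.5 * height + g + eps with height-dependent residual variance."""
    rng = np.random.default_rng() if rng is None else rng
    n, N = Z.shape
    u = rng.normal(0.0, np.sqrt(sigma2_g), size=N)   # per-SNP effects
    g = Z @ u / np.sqrt(N)                           # g ~ MVN(0, sigma2_g * Z Z^T / N)
    eps_var = np.exp(alpha * height * sigma2_eps)    # variance of eps_i, as written on the slide
    eps = rng.normal(0.0, np.sqrt(eps_var))          # one draw per individual
    y = 0.5 * height + g + eps
    return y, eps_var

# Simulated inputs (assumptions: independent SNPs, standardized height).
rng = np.random.default_rng(2)
n, N = 200, 1000
freqs = rng.uniform(0.2, 0.8, size=N)
X = rng.binomial(2, freqs, size=(n, N)).astype(float)
p = X.mean(axis=0) / 2.0
Z = (X - 2 * p) / np.sqrt(2 * p * (1 - p))
height = rng.normal(size=n)

y_homo, var_homo = simulate_outcome(Z, height, alpha=0.0, sigma2_eps=1.0, rng=rng)
y_het, var_het = simulate_outcome(Z, height, alpha=0.5, sigma2_eps=1.0, rng=rng)
print("residual variance range, alpha=0.0:", var_homo.min(), var_homo.max())
print("residual variance range, alpha=0.5:", var_het.min(), var_het.max())
```

Setting `alpha=0` recovers a constant residual variance; increasing `alpha` makes the residual variance grow with height, which is the setting in which recovery of heritability is examined.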
Q4: Environmental Moderation
Heritability not constant: What are implications for GCTA?
- Standard GCTA: $g \sim \mathrm{MVN}[0, \sigma_g^2 A]$.
- We simulate data using $g \sim \mathrm{MVN}[0, A']$ where $(i, j)$-th entry of $A'$ is $h_i h_j A_{ij}$.
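Since $A'_{ij} = h_i h_j A_{ij}$ is the same as $A' = DAD$ with $D = \mathrm{diag}(h)$, a draw from $\mathrm{MVN}[0, A']$ can be obtained by scaling a draw from $\mathrm{MVN}[0, A]$ elementwise by $h$. The sketch below shows this with simulated genotypes; the binary environment `env` and the two values of $h_i$ are hypothetical choices made only for illustration.

```python
import numpy as np

# Simulated genotypes and similarity matrix (assumption: independent SNPs).
rng = np.random.default_rng(3)
n, N = 100, 1000
freqs = rng.uniform(0.2, 0.8, size=N)
X = rng.binomial(2, freqs, size=(n, N)).astype(float)
p = X.mean(axis=0) / 2.0
Z = (X - 2 * p) / np.sqrt(2 * p * (1 - p))
A = Z @ Z.T / N

# Hypothetical moderation: the scaling h_i depends on a binary environment.
env = rng.integers(0, 2, size=n)
h = np.where(env == 1, 0.8, 0.4)

# Moderated similarity matrix: A'_ij = h_i * h_j * A_ij.
A_prime = np.outer(h, h) * A

# Draw g ~ MVN(0, A'): first g0 ~ MVN(0, A), then scale each individual by h_i.
g0 = Z @ rng.normal(size=N) / np.sqrt(N)
g = h * g0
```

Individuals with larger $h_i$ get a genetic component with larger variance, so the heritability implied by the simulated data differs across environments.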
Q4: Environmental Moderation
What if we ignore environment?
Q4: Environmental Moderation
What if we allow for environmental variation?
- LD is an important consideration (aside: I'm skeptical about using KING or REAP estimates).
- Heteroskedasticity leads to inflation of $h^2$ estimates.
- Environmental differences are likely to be problematic (and yet may be rampant?).
In closing: GCTA is like a table saw.