Whole genome association studies Introduction and practical Boulder, March 2009.

10
Whole genome association studies Introduction and practical Boulder, March 2009

Transcript of Whole genome association studies Introduction and practical Boulder, March 2009.

Page 1: Whole genome association studies Introduction and practical Boulder, March 2009.

Whole genome association studies

Introduction and practical

Boulder, March 2009

Page 2: Whole genome association studies Introduction and practical Boulder, March 2009.

100K+ SNPs, CNVs →100K+ SNPs, CNVs →

← ~

1000

s in

divi

dual

s←

~10

00s

indi

vidu

als

WholeGenome

AssociationStudy

WholeGenome

AssociationStudy

Page 3: Whole genome association studies Introduction and practical Boulder, March 2009.

…?

Control

Control

Scz

Scz

Associating phenotypic and genotypic variationAssociating phenotypic and genotypic variation

Page 4: Whole genome association studies Introduction and practical Boulder, March 2009.

Indirect association (linkage disequilibrium)

2 2 1 2 1 2 2 1 2 2 2 2 1 2

1 0 1 1 1 1 1 0 1 2 1 2 1 1

MDS PCA

Reference haplotypes(HapMap)

Observed genotypes

Imputed genotypes

Direct test of single SNP effect

Imputation of ungenotyped SNPs1) Increase coverage2) Facilitate meta-analysis across platforms3) Quality control (drop SNP/re-impute)

Empirical assessmentof ancestry1) Detect outliers, substructure2) Clusters or continuous indices3) Batch effects, relatedness, sample swaps, contamination, etc

Analytic tools to perform, validate and enhance basic single SNP WGAS, e.g.:

Page 5: Whole genome association studies Introduction and practical Boulder, March 2009.

Age-related macular degenerationAge-related macular degeneration

Page 6: Whole genome association studies Introduction and practical Boulder, March 2009.

Progress in type 2 diabetes and Crohn’s diseaseProgress in type 2 diabetes and Crohn’s disease

KCNJ11

PPARG

2000 2001 2002 2003 2004 2005 2006 2007 2008

TCF2WSF1CDKN2B/AIGF2BP2CDKAL1HHEXSLC30A8TCF7L2

NOD2

5q31

5p1310q213p21PTPN2IRGMIL12BNKX2-3

T2D - confirmed associated loci

Crohn’s - confirmed associated loci

TNFSF15 IL23RATG16L1

JAZF1CDC123ADAMTS9THADANOTCH2TSPAN8

PTPN22ITLN11q241q32CDKAL1MHC6q21CCR67p128q24

JAK210p1111q1312q1213q14ORMDL3STAT319p1321q21ICOSLG

Slide courtesy of Mark Daly

Page 7: Whole genome association studies Introduction and practical Boulder, March 2009.

5 x 10-8

CACNA1CAnkryin-G (ANK3)

Bipolar WGAS of 10,648 samplesBipolar WGAS of 10,648 samples

Sample Cases Controls P-valueSTEP 7.4% 5.8% 0.0013WTCCC 7.6% 5.9% 0.0008EXT 7.3% 4.7% 0.0002Total 7.5% 5.6% 9.1×10-9

Sample Case Controls P-valueSTEP 35.7% 32.4% 0.0015WTCCC 35.7% 31.5% 0.0003EXT 35.3% 33.7% 0.0108Total 35.6% 32.4% 7×10-8

X

>1.7 million genotyped and (high confidence) imputed SNPs

Ferreira et al (Nature Genetics, 2008)

Page 8: Whole genome association studies Introduction and practical Boulder, March 2009.

Main focus of many association studies:

additive effects of single common SNPs on disease

Main focus of many association studies:

additive effects of single common SNPs on disease

Interactionsdominant,recessive

Joint tests of aggregate effect, genes, pathways

Variants <1% frequency

Structural variants

Subtypes,endophenotypes

Page 9: Whole genome association studies Introduction and practical Boulder, March 2009.

Altshuler, Daly & Lander (2008) Science

Manolio, Brooks & Collins (2008) JCI

Further reading on association mapping and interpretation of GWAS findings

Maher (2008) Nature

Page 10: Whole genome association studies Introduction and practical Boulder, March 2009.

Practical session• Data are in ~pshaun/prac2/

• Software required: PLINK and Haploview

• PDF with instructions is ~pshaun/instruct.pdf

• Work through until section “Empirical assessment of population stratification”

• Use PLINK website for help (http://pngu.mgh.harvard.edu/purcell/plink/)