National Cancer Advisory Board Genome-Wide Association … · 2011-02-17 · National Cancer...
National Cancer Advisory Board
Genome-Wide Association Studies and the Road Ahead
Stephen Chanock, M.D.
Chief, Laboratory of Translational Genomics
Director, Core Genotyping Facility
Division of Cancer Epidemiology and Genetics
December 7, 2010
Basic principle of genetic association studies in unrelated individuals
Genome wide SNP chips
Selection of surrogate SNPs to capture common genetic variation
Millions of common SNPs across the genome
Use linkage disequilibrium across the genome
Association with disease risk in discovery followed by replication studies
Mapping of susceptibility loci
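As a rough illustration of the association step, the sketch below compares risk-allele counts between unrelated cases and controls at a single SNP. The allele counts are hypothetical and the function name is an assumption made for this example; real GWAS pipelines add covariates, quality control and genome-wide multiple-testing thresholds.

```python
import math


def allelic_association(case_risk, case_other, control_risk, control_other):
    """Return (odds_ratio, chi_square, p_value) for a 2x2 table of allele counts."""
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square for a 2x2 table (1 degree of freedom, no continuity correction)
    chi_square = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return odds_ratio, chi_square, p_value


# Hypothetical allele counts (two alleles per person); not data from the talk
odds_ratio, chi_square, p_value = allelic_association(
    case_risk=900, case_other=1100, control_risk=800, control_other=1200
)
print(f"OR = {odds_ratio:.2f}, chi-square = {chi_square:.2f}, p = {p_value:.2e}")
```

A signal found this way in a discovery set would then be tested again in independent replication studies, as the slide describes.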
“What’s past is prologue”
William Shakespeare
• In the mid 60’s, DCEG began collection of biospecimens from high risk families.
• This approach was extended in the 80’s to high-quality population studies.
• Scientific progress results from application of new technology.
• Remarkable opportunity for discovery of germ-line contributions to cancer.
Figure: Published GWAS Etiology Hits, 12.1.10 (Chung & Chanock 2010)
~140 loci marked by SNPs; 1 locus marked by a CNV
Number of loci per cancer, as labeled on the figure: 38 Prostate; 13 Colorectal; 5 Glioma; 7 Basal Cell Carcinoma; 4 Melanoma; 3 Neuroblastoma; 4 Pancreas; 7 Nasopharyngeal; 3 Lung; 6 Testicular; 6** Bladder; 19* Breast; 4 Pediatric Acute Lymphoblastic Leukemia; 4 Esophageal Squamous; 2 Thyroid; 2 Non-Hodgkin; 4 Ovary; 3 Gastric; 10 CLL; 8 Multiple; 1 Liver; 3 Hodgkins
Locus labels on the figure, in extracted order: PSCA, BNC2, ADH1B, C20orf54, ALDH2, FOXE1, NKX2-1, 6p21.33, CTBP2, 11p15, THADA, EHBP1, ITGA6, 3p12.1, EEFSEC, PDLIM5, TET2, SLC22A3, JAZF1, LMTK2, 8p21, 8q24.21 (x5), MSMB/NCOA4, 11q13 (x3), HNF1B (x2), 17q24.3, 19q13.2, KLK2/KLK3, BIK, NUDT10/NUDT11, C2orf43, 5p15, FOXP4, GPRC6A, 13q22, 10p14, 11q23.1, BMP4, GREM1, CDH1, SMAD7, RHPN2, BMP2, EIF3H, 8q24.21, PRKD2, 15q21.3, 15q23, 16q24.1, 11q24.1, 2q13, FARP2, IRF4, 8q24.21, CDKN2A/CDKN2B, RTEL1, PHLDB1, TERT, CDKN2A/CDKN2B, CCDC26, 1p36, 1q42, KIF1B, TERT/CLPTM1L, 7q32, TYR, KRT5, ASIP, MC1R, CDKN2A/CDKN2B, TYR, PLCE1, ASIP, 1q21.1, BARD1, 6p22, KLF5/KLF12, ABO, 1q32.1, CLPTM1L, ITGA9, GABBR1, HLA-F, HLA-A, IKZF1, ARIDB5, CEBPE, CHRNA3/CHRNA5, 6p21.32, TERT/CLPTM1L, SPRY4, BAK1, KITLG, GSTM1 deletion, TACC3, NAT2, PSCA, MYC, FGFR2, LSP1, TOX3, COX11/STXBP4, RAD51L1, 1p11.2, 2q35, SLC4A7/NEK10, 5p12, 5q11.2, ECHDC1/RNF146, C6orf97/ESR1, CASP8, ZNF365, 10p15.1, 10q22.3, DMRT1, ATF7IP, 1q22, 2q31, 9p13, 12q13.13, 3q26, 1q41, TP63, GATA3, REL
Profile of Cancer GWAS Regions
• Nearly all map to non-coding regions of the genome
• ~20% map to ‘gene’ deserts
• Estimated Effect Sizes (Odds Ratios) are small (< 1.5)
  Exception is KITLG in Testicular Cancer
• Multi-Cancer Susceptibility Regions
  Examples: 8q24, 5p15.33 (TERT), 11q13
Gene-Environment Interaction for Bladder Cancer Risk: Large International Study Shows NAT2 Slow Acetylation Increases Risk Only for Smokers
P-interaction = 2.8 x 10^-4
Rothman et al., Nat Genet, 2010
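One simple way to picture a gene-environment interaction test is to estimate the genotype odds ratio separately in smokers and never smokers, then ask whether the two odds ratios differ. The sketch below does this with a Wald test on the difference of log odds ratios. All counts are hypothetical, the function names are assumptions for illustration, and this is not necessarily the method used in the cited study.

```python
import math


def log_odds_ratio(carrier_cases, other_cases, carrier_controls, other_controls):
    """Return (log OR, standard error) for a 2x2 table using Woolf's method."""
    a, b, c, d = carrier_cases, other_cases, carrier_controls, other_controls
    log_or = math.log((a * d) / (b * c))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, standard_error


def interaction_test(exposed_table, unexposed_table):
    """Wald test for equality of the genotype odds ratio in two exposure strata."""
    log_or_1, se_1 = log_odds_ratio(*exposed_table)
    log_or_0, se_0 = log_odds_ratio(*unexposed_table)
    z = (log_or_1 - log_or_0) / math.sqrt(se_1 ** 2 + se_0 ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return math.exp(log_or_1), math.exp(log_or_0), z, p_value


# Hypothetical counts: (slow cases, rapid cases, slow controls, rapid controls)
ever_smokers = (600, 400, 500, 500)
never_smokers = (150, 150, 300, 300)

or_smokers, or_never, z, p_interaction = interaction_test(ever_smokers, never_smokers)
print(f"OR smokers = {or_smokers:.2f}, OR never smokers = {or_never:.2f}")
print(f"z = {z:.2f}, P-interaction = {p_interaction:.3f}")
```

In this made-up example the slow-acetylation genotype raises risk among smokers only, which is the qualitative pattern the slide reports.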
Prostate Cancer Risk Factors 2006
• Age
• Ethnic Background
• Family History
Figure: Prostate Cancer (38 as of Nov 2010)
Locus labels on the figure, in extracted order: CTBP2, 11p15, THADA, EHBP1, ITGA6, 3p12.1, EEFSEC, PDLIM5, TET2, SLC22A3, JAZF1, LMTK2, 8p21, 8q24.21, MSMB/NCOA4, 11q13, HNF1B, 17q24.3, 19q13.2, KLK2/KLK3, BIK, NUDT10/NUDT11, 10q11.23, 5p15.33, 13q22.1, FOXP4, 2p24.1, RFX6, 3p11, 2q27.3, 12q13
Numeric labels on the figure: 2, 3, 5
Annotations: PSA or Prostate Cancer or both??; Type 2 Diabetes; No distinction between advanced and early disease
Prostate Cancer Risk Factors 2010
• Age
• Ethnic Background
• Family History
• Multiple common alleles
• Each common variant provides a small contribution
• Inverse Relationship of Type 2 Diabetes & Prostate Cancer
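To see how many small contributions can add up, the sketch below combines per-allele odds ratios under a multiplicative (log-additive) model with independent SNPs. That model is an assumption made here for illustration, the odds ratios are hypothetical, and the function name is not from the talk.

```python
import math


def combined_relative_risk(per_allele_odds_ratios, risk_allele_counts):
    """Relative risk versus a person carrying no risk alleles (log-additive model)."""
    if len(per_allele_odds_ratios) != len(risk_allele_counts):
        raise ValueError("one risk-allele count is needed per SNP")
    log_risk = 0.0
    for odds_ratio, count in zip(per_allele_odds_ratios, risk_allele_counts):
        if count not in (0, 1, 2):
            raise ValueError("risk-allele counts must be 0, 1 or 2")
        # Each copy of the risk allele multiplies risk by the per-allele odds ratio
        log_risk += count * math.log(odds_ratio)
    return math.exp(log_risk)


# Hypothetical per-allele odds ratios, all below 1.5 as is typical for GWAS hits
hypothetical_odds_ratios = [1.1, 1.2, 1.3]
carrier_counts = [2, 1, 0]  # copies of the risk allele carried at each SNP
relative_risk = combined_relative_risk(hypothetical_odds_ratios, carrier_counts)
print(f"combined relative risk = {relative_risk:.3f}")
```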
1. Association Finding
2. Mapping Region
3. Assessment of Variants
4. Secondary Association
ATGGAAAGGGAGCATATTACGCACTATGAAGCGGGGGAACGA
ATGGAAAGGGAGCATATTATGCACTATGAAGCGGGGGAACGA
ATGGAAAGGGAGCATATTACGCACTATGAAGCGGAGGAACGA
ATGGAAAGGGAGCATATTATGCACTATGAAGCGGAGGAACGA
ATGGAAAGGGAGCATATTACGCACTATGAAGCGGGGGAACGA
ATGGAAAGGGAGCATATTACGCACTATGAAGCGGAGGAACGA
ATGGAAAGGGAGCATATTACGCACTATGAAGCGGGGGAACGA
ATGGAAAGGGAGCATATTACGCACTATGAAGCGGGGGAACGA
ATGGAAAGGGAGCATATTATGCACTATGAAGCGGAGGAACGA
ATGGAAAGGGAGCATATTATGCACTATGAAGCGGGGGAACGA
ATGGAAAGGGAGCATATTACGCACTATGAAGCGGGGGAACGA
GWAS or Linkage
Regional Re-Sequence
HapMap
1000 Genomes Data
Choose Variants for Testing
Follow-up Genotyping in Large Studies
Nominating Optimal Variants for Functional Studies
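The aligned sequences shown above differ at only a few positions. As a small sketch of what assessing variants means at the sequence level, the code below splits the slide's sequence run into eleven 42-base reads, finds the variable columns, and counts the two-site haplotypes. Treating each 42-base read as one chromosome is an assumption about the figure, and the function name is hypothetical.

```python
from collections import Counter

# The eleven 42-base sequences from the slide, one per line
sequences = [
    "ATGGAAAGGGAGCATATTACGCACTATGAAGCGGGGGAACGA",
    "ATGGAAAGGGAGCATATTATGCACTATGAAGCGGGGGAACGA",
    "ATGGAAAGGGAGCATATTACGCACTATGAAGCGGAGGAACGA",
    "ATGGAAAGGGAGCATATTATGCACTATGAAGCGGAGGAACGA",
    "ATGGAAAGGGAGCATATTACGCACTATGAAGCGGGGGAACGA",
    "ATGGAAAGGGAGCATATTACGCACTATGAAGCGGAGGAACGA",
    "ATGGAAAGGGAGCATATTACGCACTATGAAGCGGGGGAACGA",
    "ATGGAAAGGGAGCATATTACGCACTATGAAGCGGGGGAACGA",
    "ATGGAAAGGGAGCATATTATGCACTATGAAGCGGAGGAACGA",
    "ATGGAAAGGGAGCATATTATGCACTATGAAGCGGGGGAACGA",
    "ATGGAAAGGGAGCATATTACGCACTATGAAGCGGGGGAACGA",
]


def variable_sites(aligned):
    """Map each 0-based column with more than one base to its base counts."""
    sites = {}
    for position, column in enumerate(zip(*aligned)):
        counts = Counter(column)
        if len(counts) > 1:
            sites[position] = dict(counts)
    return sites


sites = variable_sites(sequences)
positions = sorted(sites)

# Haplotype = the bases a sequence carries at the variable positions
haplotypes = Counter("".join(seq[p] for p in positions) for seq in sequences)

print(sites)
print(dict(haplotypes))
```

How often alleles at nearby sites travel together on the same haplotype is what linkage disequilibrium measures, and it is why a genotyped surrogate SNP can stand in for untyped neighbors.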
Follow-up of GWAS Hits
Non-Protein Coding
Bioinformatic Analysis
Experimental Strategy
Unannotated Transcript
Regulatory Element
Epigenetic Alteration of Gene Levels
Effect on Genes Elsewhere
Novel Transcripts
Test Functional Elements In vitro/vivo
miRNA
RNASeq
Expression Quantitative Trait Loci Analysis
Analyze Histone Methylation
Elements
The Majority of GWAS Map to Non-Coding Regions
Functional Elements
10q11.2 & Prostate Cancer
Top Hit: rs10993994
Promoter of MSMB
MSMB = β-microseminoprotein
Prostate specific
Serum marker in studies
20,000 subjects
Risk Allele “T”
Lower expression levels
Reporter assays
Electromobility Shift Assays
Levels in Prostate Tissue
Tumor Tissue
TT Decrease Expression in NCI-60 Cell lines
Functional Analysis
10q11.2 Could Be More Complex… MSMB and NCOA4
RNA Expression MSMB and NCOA4 Normal Tumor Tissue
Anchorage Independent Growth is Specific to Prostate
MSMB - Suppression
NCOA4 - Over-expression
Cancer Susceptibility Loci Across 8q24
Gene shown: MYC
Interval sizes shown: 209 Kb, 126 Kb, 231 Kb, 58 Kb
Regions shown, in extracted order: Prostate Region 2, Prostate Region 5, Prostate Region 4, Prostate Region 3, Prostate Region 1, Breast Region, Colorectal Region*, CLL Region, Bladder Region, Ovarian Region
* Also Colorectal Adenoma
Chung and Chanock
Genome-wide association studies
Association testing
Behavioral traits: Tobacco, Caffeine, Alcohol
Biometrics: Height, Weight, BMI, Menarche/Menopause
Nutrient levels: Vitamins D, B12, Carotene
International Consortia (e.g., GIANT, SUNLIGHT)
Genome-wide association studies
Relationship testing
Sex verification
Population genetics and ancestry
Quality Control
Pursuit of X-Chromosome Aneuploidy and Cancer Risk
Genome-wide association studies
Privacy & Confidentiality
GWAS membership
Large chromosomal abnormalities, structural variation, aneuploidy in Germ-line DNA
Rodriguez-Santiago AJHG 2010; Jacobs Nature Genetics 2009
Deeper Understanding
Figure: Genetic Predisposition to Breast Cancer, European Population
Vertical axis: Population genotype relative risk (tick labels from 1 to 10)
Horizontal axis: Population risk-allele frequency (tick labels from 0.1 to 0.9)
Gene labels: BRCA1, BRCA2, TP53, PTEN, ATM, CHEK2, PALB2, BRIP1, RAD51C
Legend: BCAC, CGEMS/BCAC, WTCCC, Other
Annotation: ??
Risk prediction utility
Crohn’s Disease: Sibling relative-risk=20-35
Common (BPC) cancers: Sibling relative risk=2-3
Curves shown: Random; Using known loci; Using all estimated loci; Ideal (if we could explain all heritability)
Park et al Nature Genetics 2010
Predicting Prostate Cancer Risk
35 common SNPs from GWAS
Men between 60 and 65 years
NCI Breast and Prostate Cohort Consortium
Area Under the Curve
FH Alone = 0.52
FH + 35 SNPs = 0.66
ROC = Receiver Operating Characteristic
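The area under the ROC curve can be read as the probability that a randomly chosen case has a higher risk score than a randomly chosen control, so 0.5 is no better than chance. The sketch below computes it directly from that definition. The scores are hypothetical and the function name is an assumption for illustration.

```python
def area_under_curve(case_scores, control_scores):
    """Fraction of case/control pairs where the case scores higher (ties count half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk scores, e.g. counts of risk alleles; not data from the talk
case_scores = [3, 5, 6, 8]
control_scores = [2, 4, 5, 7]

auc = area_under_curve(case_scores, control_scores)
print(f"AUC = {auc:.3f}")
```

On this scale, moving from 0.52 to 0.66 is a real gain in discrimination but still far from the level needed to classify individuals reliably.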
Status of Genome-Wide Association Studies
Discovery of New Regions in the Genome Associated with Diseases/Traits
• New “Candidate Genes”
Clues for Mechanistic Insights Using Common Variants
• Etiology
• Gene-Environment/Lifestyle Interactions
• Outcomes & Pharmacogenomics
Challenge of Genetic Markers for Risk Prediction for Individual or Public Health Decisions
Common Variants Represent a Fraction of the Genetic Contribution to Risk
Integration of Lifestyle/Environment
Acknowledgements
NCI-DCEG: Joseph Fraumeni, Gilles Thomas, Robert Hoover, Meredith Yeager, Kevin Jacobs, Nilanjan Chatterjee, JuHyun Park, Sholom Wacholder, Mila Prokunina-Olsson, Laufey Amundadottir, Peggy Tucker, Charles Chung
NCI-OCG: Daniela Gerhard
NCI-CCR: Mike Dean, Hong Lou
HSPH: David Hunter, Pete Kraft
CGEMS Studies: ACS (M Thun), ATBC (D Albanes), CAPS (H Gronberg/J Xu), CeRePP (O Cussenot), CONOR (L Vatten), EPIC (E Riboli), JHU (W Issacs/J Xu), MEC (B Henderson), PLCO (R Hayes), WHI (R Prentiss)
DFCI: Matt Freedman, Mark Pomerantz
Bladder Cancer GWAS: Nat Rothman, Debra Silverman, Montse Garcia-Closas
Renal Cancer GWAS: M Purdue, P Brennan, M Lathrop, M Johansson
Cohort Consortium Strategic Plan
Biomedical Research Community
Cohort Consortium
Large, High-Quality Cohorts
Biospecimen Collection
Open (Participation and Access)
International
Interdisciplinary
Collaborative
Integrated Databases and Biomarkers
New Population Cohorts
Specific Cancer Proposals
Other Disease Proposals
Genetic Susceptibility Markers (Replication)
Genetic Pathways and Gene-Environment Interactions (Pooling)
Early Detection Markers
Mechanistic Insights (Integrative Biology)
Mapping Complex Diseases: Identifying Genetic Markers
Comprehensive Catalog of Genetic Variation Enables ‘Agnostic’ Search
Frequency of SNP Markers | Study Design for Discovery
Common >10% | GWAS
Uncommon 1-10% | GWAS or Sequencing
Rare < 1% | Linkage or Sequencing
SNP= Single Nucleotide Polymorphism
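The frequency classes on this slide can be written as a small lookup from minor allele frequency (MAF) to the suggested discovery design. This is only a sketch: the function name is hypothetical, and treating exactly 10% and exactly 1% as "Uncommon" is an assumption, since the slide does not say which class the boundary values belong to.

```python
def discovery_design(minor_allele_frequency):
    """Return (frequency class, study design) for a SNP, following the slide's table."""
    if not 0 < minor_allele_frequency <= 0.5:
        raise ValueError("minor allele frequency must be in (0, 0.5]")
    if minor_allele_frequency > 0.10:
        return "Common", "GWAS"
    if minor_allele_frequency >= 0.01:  # boundary handling is an assumption
        return "Uncommon", "GWAS or Sequencing"
    return "Rare", "Linkage or Sequencing"


for maf in (0.25, 0.05, 0.004):
    frequency_class, design = discovery_design(maf)
    print(f"MAF {maf:.3f}: {frequency_class} -> {design}")
```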
Next Generation Sequencing of Germline
Regional GWAS and linkage follow-up
Candidate exon
Whole genome
Whole exome
Functional Work to Define: Direct Association
11q13: Multi-Cancer Susceptibility Region
660 kb across the region
Recombination hotspots and LD structure in 11q13: GWAS Region with no candidate genes
Sampled 900 controls 5 times, drawn from 10 studies
2.5 M SNP Chip
• Beta tested at CGF
• High performance at CGF (~750 samples)
  – Completion rates of 99% with concordance > 99%
• 2.5 M Array
  – Average MAF=10.4% (range 3-15%)
  – Based on 3 algorithms (UofM, Broad and WTSI)
  – New common regions not covered on prior chips
• 2.5 M S (2nd half of 5 M)
  – Design begins in November 2010
  – Projected beta test in Spring 2011
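Completion rate and concordance are the two performance figures quoted for the chip. A minimal sketch of how such figures could be computed from genotype calls follows. The genotype data are hypothetical, "NN" as the code for a failed call is an assumption, and genotypes are assumed to be written with alleles in a consistent order.

```python
MISSING = "NN"  # assumed code for a failed genotype call


def completion_rate(genotypes):
    """Fraction of attempted genotypes that produced a call."""
    called = [g for g in genotypes if g != MISSING]
    return len(called) / len(genotypes)


def concordance(first_run, second_run):
    """Fraction of genotypes that agree, among those called in both runs."""
    compared = [(a, b) for a, b in zip(first_run, second_run) if MISSING not in (a, b)]
    agreeing = sum(1 for a, b in compared if a == b)
    return agreeing / len(compared)


# Hypothetical duplicate genotyping of one sample at five SNPs
first_run = ["AA", "AG", "GG", "NN", "AG"]
second_run = ["AA", "AG", "GG", "AG", "AA"]

print(f"completion rate (first run) = {completion_rate(first_run):.2f}")
print(f"concordance = {concordance(first_run, second_run):.2f}")
```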
IMPUTATION STUDY @ CGF
750 Caucasians (250 of ACS, ATBC & PLCO)
2.5 M SNP array
1.0M Omni array
1.0M Duo or 660w array
Affymetrix 6.0
100 AA & 75 Chinese with 2.5 M and 1 M/660w
Deposit in dbGaP in Winter of 2010
Approximately 3.4 M genotyped SNPs in CEU for analysis with 1000 CEU data (new download)
Validation for 42 events: 100% Validation
Allelic imbalance &
- normal dosage: UPD
- abnormal dosage: CNV
5 / 6 events: also found in bladder tissue; early developmental events
Rodríguez-Santiago et al. Am J Hum Genet. 2010;87:129-38
Large Structural Events (>5 Mb) by Chromosome
50,520 Elderly Subjects
>550,000 SNPs
Illumina Infinium
MDS Region
Germline Mosaic Events
Copy Neutral LOH
Duplication/trisomy
Deletion
UPD