Post on 20-Dec-2015
SNP Discovery and Analysis: Application to Association Studies
Mark J. Rieder, PhD
Dana Crawford, PhD
Deborah Nickerson, PhD
SeattleSNPs PGA
July 19-20, 2005
Practical Aspects of SNP Association Studies
1. SNP Discovery: Where do I find SNPs to use in my association studies? (e.g. databases, direct resequencing)
2. SNP Selection: How do I choose SNPs that are informative? (i.e. assessing SNP correlation - linkage disequilibrium)
3. SNP Associations: What analyses can I perform after genotyping these SNPs? (e.g. single SNP data, haplotype data)
4. SNP Replication/Function: How is function predicted or assessed? (e.g. nonsynonymous SNPs, conserved non-coding regions (CNS), transcription factor binding sites, gene expression)
SeattleSNPs Program for Genomic Applications: Overview
Aim 1: To establish a variation discovery resource capable of comprehensive resequencing of candidate genes related to HLBS.
Biological Focus: Inflammation
Genes and Pathways: Coagulation, Complement, Cytokines
Interacting Partners
SNPs in Candidate Genes
Average gene size: 26.5 kb
Comparing 2 haploid genomes: ~1 SNP in 1,200 bp
~130 SNPs per gene (1 per 200 bp) - 15,000,000 SNPs
~44 SNPs per gene with MAF > 0.05 (1 per 600 bp) - 6,000,000 SNPs
SeattleSNPs
SeattleSNPs PGA: Candidate Gene SNP Resource
• 4.9 Mb in 47 individuals = 230 Mb total sequence
• Define sequence diversity - catalogue all SNPs
• Select “optimal” tagSNP sets
• Determine haplotype structure
• Provide necessary baseline data for association studies
Warfarin Pharmacogenetics
1. Background
• Warfarin characteristics
• Pharmacokinetics/Pharmacodynamics
• Discovery of VKORC1
2. VKORC1 - SNP Discovery
3. VKORC1 - SNP Selection (tagSNPs)
4. VKORC1 - SNP Testing
• SNP/Haplotype Inference
• Haplotype Inference, Testing
5. VKORC1 - SNP Replication/Function
Pharmacogenomics as a Model for Association Studies
Reduce variability and identify outliers. Prospective testing
Personalized Medicine
Clear genotype-phenotype link: intervention --> variable response (pharmacokinetics - 5x variation)
Quantitative intervention and response: drug dose, response time, metabolism rate, etc.
Target/metabolism of drug generally known: gene target that can be tested directly with response
Warfarin Background
• Commonly prescribed oral anti-coagulant
• In 2003, 21.2 million prescriptions were written for warfarin (Coumadin)
• Prescribed following MI, atrial fibrillation, stroke, venous thrombosis, prosthetic heart valve replacement, and major surgery
• Difficult to determine effective dosage
- Narrow therapeutic range
- Monitoring of prothrombin time (INR): 2.0 - 3.0
- Large inter-individual variation
Patient/Clinical/Environmental Factors
[Figure: warfarin dose distribution. x-axis: Warfarin Dose (mg/d), 0-16; y-axis: No. of patients, 0-50. European-American patients, n = 186; average 5.2 mg/d; 30x dose variability]
Pharmacokinetic/Pharmacodynamic - Genetic
Warfarin inhibits the vitamin K cycle
[Figure: vitamin K cycle. Labels: Vitamin K-dependent clotting factors (FII, FVII, FIX, FX, Protein C/S/Z); Epoxide Reductase; γ-Carboxylase (GGCX); Warfarin; Inactivation; CYP2C9; Pharmacokinetic]
Warfarin Metabolism (Pharmacokinetics)
Major pathway for termination of pharmacologic effect is through metabolism of S-warfarin in the liver by CYP2C9
• CYP2C9 SNPs alter warfarin metabolism:
CYP2C9*1 (WT) - normal
CYP2C9*2 (Arg144Cys) - low/intermediate
CYP2C9*3 (Ile359Leu) - low
• CYP2C9 alleles occur at a significant minor allele frequency
European: *2 - 10.7%, *3 - 8.5%
Asian: *2 - 0%, *3 - 1-2%
African-American: *2 - 2.9%, *3 - 0.8%
Effect of CYP2C9 Genotype on Anticoagulation-Related Outcomes (Higashi et al., JAMA 2002)
WARFARIN MAINTENANCE DOSE
[Figure: warfarin maintenance dose (mg warfarin/day, 0-9) by CYP2C9 genotype]
Genotype  *1/*1  *1/*2  *2/*2  *1/*3  *2/*3  *3/*3
N         127    28     4      18     3      5
- Variant alleles have significant clinical impact
- Still large variability in warfarin dose (15-fold) in *1/*1 “controls”?
TIME TO STABLE ANTICOAGULATION
CYP2C9-WT ~90 days
*2 or *3 carriers take longer to reach stable anticoagulation
CYP2C9-Variant ~180 days
Analysis of Independent Predictors of Warfarin Dose
Variable                                 Change in Warfarin Dose   P value
Target INR, per 0.5 increase             21%                       <0.0005
BMI, per SD                              14%                       <0.0001
Ethnicity (African-American, [Asian])    13%, [10-15%]             0.003
Age, per decade                          13%                       <0.0001
Gender, Female                           12%                       <0.0001
Drugs (Amiodarone)                       24%                       0.007
CYP2C9*2, per allele                     19%                       <0.0001
CYP2C9*3, per allele                     30%                       <0.0001
Adapted from Gage et al., Thromb Haemost, 2004
~ 30% of the variability in warfarin dose is explained by these factors
What other candidate genes are influencing warfarin dosing?
Warfarin acts as a vitamin K antagonist
[Figure: vitamin K cycle. Labels: Vitamin K-dependent clotting factors (FII, FVII, FIX, FX, Protein C/S/Z); Epoxide Reductase; γ-Carboxylase (GGCX); Warfarin; Inactivation; CYP2C9; Pharmacodynamic]
New Target Protein for Warfarin
[Figure: vitamin K cycle. Labels: Epoxide Reductase (VKORC1); γ-Carboxylase (GGCX); Clotting Factors (FII, FVII, FIX, FX, Protein C/S/Z)]
VKORC1: 5 kb - chr 16
Rost et al. & Li et al., Nature (2004)
Warfarin Resistance VKORC1 Polymorphisms
• Rare non-synonymous mutations in VKORC1 causative for warfarin resistance (15-35 mg/d)
• NO non-synonymous mutations found in ‘control’ chromosomes (n = ~400)
Rost et al., Nature (2004)
Inter-Individual Variability in Warfarin Dose: Genetic Liabilities
[Figure: frequency vs. warfarin maintenance dose (mg/day; axis marks 0.5, 5, 15). SENSITIVITY: CYP2C9 coding SNPs - *3/*3. RESISTANCE: VKORC1 nonsynonymous coding SNPs. Common VKORC1 non-coding SNPs?]
SNP Discovery: Resequencing VKORC1
• PCR amplicons --> Resequencing of the complete genomic region
• 5 Kb upstream and each of the 3 exons and intronic segments; ~11 Kb
• SeattleSNPs PGA - pga.gs.washington.edu (24 African-Am./23 Europeans)
• Warfarin treated clinical patients (UWMC): 186 European
• Other populations: 96 European, 96 African-Am., 120 Asian
Summary of PGA samples (European, n = 23)
Total: 13 SNPs identified; 10 common / 3 rare (<5% MAF)
Clinical Samples (European patients, n = 186)
Total: 28 SNPs identified; 10 common / 18 rare (<5% MAF)
15 - intronic/regulatory
7 - promoter SNPs
2 - 3’ UTR SNPs
3 - synonymous SNPs
1 - nonsynonymous
- single heterozygous indiv. - highest warfarin dose = 15.5 mg/d
How does the comprehensive SNP discovery compare to what was known for this gene?
SNP Discovery: Resequencing Results
dbSNP - NCBI SNP database
SNP Discovery: dbSNP database
SeattleSNPs Resequencing 28 SNPs --> 15 SNPs gene region
10 dbSNPs • 8/10 confirmations
• 3 frequency/genotype data
• 7 new dbSNP entries generated by SeattleSNPs resequencing
• 8 dbSNPs/15 SNPs (~50%)
SNP Discovery: dbSNP database (VKORC1)
SNP Discovery: dbSNP database
Nickerson and Kruglyak, Nature Genetics, 2001
Mar 2005 - 5.0 million (validated - 1/600 bp)
5.0/10.0 = 50% of all common SNPs (validated)!
SNP discovery is dependent on your sample population size
[Figure: Fraction of SNPs Discovered (0.0-1.0) vs. Minor Allele Frequency (MAF, 0.0-0.5), one curve per sample size: 2, 8, 16, 24, 48, 96]
[Figure: 2 chromosomes differing at a single site (G vs. C): GTTACGCCAATACAGGATCCAGGAGATTACC and GTTACGCCAATACAGCATCCAGGAGATTACC]
Rarer and population specific SNPs are found by resequencing
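The dependence of SNP discovery on sample size can be sketched with a simple probability model. This is an illustrative assumption, not necessarily the model behind the slide's figure: chromosomes are treated as independent draws, and a SNP counts as discovered only if both alleles are seen at least once. The function name `detection_probability` is hypothetical.

```python
def detection_probability(maf, n_chromosomes):
    """Probability that both alleles of a SNP appear among n sampled chromosomes.

    Assumes independent sampling of chromosomes (an idealization).
    The SNP is missed if every sampled chromosome carries the same allele.
    """
    p = maf
    all_minor = p ** n_chromosomes          # only the minor allele observed
    all_major = (1.0 - p) ** n_chromosomes  # only the major allele observed
    return 1.0 - all_minor - all_major


if __name__ == "__main__":
    # Sample sizes taken from the curve labels on the slide.
    for n in (2, 8, 16, 24, 48, 96):
        cells = [f"MAF {maf:.2f}: {detection_probability(maf, n):.3f}"
                 for maf in (0.01, 0.05, 0.10, 0.50)]
        print(f"n = {n:3d} chromosomes | " + " | ".join(cells))
```

Under this model, rare SNPs are mostly missed in small samples, which is consistent with the slide's point that resequencing more chromosomes recovers rarer and population-specific variants.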
SNP Discovery: dbSNP database
[Figure: Minor Allele Freq. (MAF) distributions, dbSNP (Perlegen/HapMap) vs. SeattleSNPs; brackets marked 25%, 50%, 75%]
dbSNP: Increasing numbers of SNPs now have genotype data
[Figure: SNPs (millions, 0-6) by dbSNP Release, Jan-03 to Mar-05. Series: Validated SNPs, SNPs with Genotypes. Annotations: Perlegen Data; HapMap Phase II; Perlegen]
Current State of dbSNP
[Figure: SNPs (millions, 0-12) by dbSNP Release, Jan-03 to Mar-05. Series: Total Reference SNPs, Validated SNPs, Genotyped SNPs]
Many SNPs left to validate and characterize.
Development of a genome-wide SNP map: How many SNPs?
Nickerson and Kruglyak, Nature Genetics, 2001
~10 million common SNPs (>1-5% MAF) - 1/300 bp
Mar 2005 - 5.0 million (validated - 1/600 bp)
5.0/10.0 = 50% of all common SNPs validated!
Coming Soon! 5.0 million validated SNPs with genotypes!
dbSNP Issues:
Not comprehensive catalog (50% of SNPs)
Is the data confirmed? (50% are validated)
Information about allele frequency/population (50%)
No information about SNP correlations (linkage disequilibrium) or genotyping efficiency
SNP Discovery: dbSNP database
• Common SNPs
• VKORC1 - 28 total - 10 SNPs > 10% MAF
• Evaluate linkage disequilibrium (non-random association of genotype data)
Does common variation in VKORC1 have a role in determining warfarin dose?
[Figure: Frequency vs. Warfarin Dose (mg/d)]
SNP Selection: Using Linkage Disequilibrium
Allele frequencies:
Site 1: C 50%, T 50%
Site 2: A 50%, G 50%
[Figure: maternal and paternal chromosomes carrying Site 1 (C or T) and Site 2 (A or G)]
Possible 2-site comb.   Expected Freq.      Observed Freq.
C A                     0.5 X 0.5 = 0.25    0.50 *
C G                     0.5 X 0.5 = 0.25    0.01
T A                     0.5 X 0.5 = 0.25    0.01
T G                     0.5 X 0.5 = 0.25    0.48 *
* Sites Correlated
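The expected-versus-observed comparison in this example can be reproduced in a few lines. This is a minimal sketch using only the frequencies shown on the slide; the variable and function names are illustrative.

```python
# Allele frequencies at each site, as given on the slide.
site1 = {"C": 0.5, "T": 0.5}
site2 = {"A": 0.5, "G": 0.5}

# Observed two-site combination frequencies, as given on the slide.
observed = {("C", "A"): 0.50, ("C", "G"): 0.01,
            ("T", "A"): 0.01, ("T", "G"): 0.48}


def expected_under_independence(freqs1, freqs2):
    """Expected two-site frequencies if the sites were uncorrelated."""
    return {(a, b): fa * fb
            for a, fa in freqs1.items()
            for b, fb in freqs2.items()}


expected = expected_under_independence(site1, site2)

# Deviation of observed from expected; a large deviation signals correlation (LD).
deviation = {combo: observed[combo] - expected[combo] for combo in observed}

for combo in sorted(observed):
    print(f"{combo[0]} {combo[1]}: expected {expected[combo]:.2f}, "
          f"observed {observed[combo]:.2f}, deviation {deviation[combo]:+.2f}")
```

The C A and T G combinations are over-represented and the other two are nearly absent, which is what the slide marks as correlated sites.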
SNP Selection: Using Linkage Disequilibrium
• SNP discovery data (i.e. population of samples with genotypes)
• Find all correlated SNPs to minimize the total number of SNPs
• Maintains genetic information (correlations) for that locus
LD_Select - SNP tagging/binning algorithm - based on LD (r2), not haplotypes
Carlson, et al. AJHG (2004)
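Binning SNPs by pairwise r2 can be illustrated with a greedy sketch. This is not the LD_Select source code. It is a simplified illustration of r2-based binning in which the threshold of 0.8, the function name `greedy_ld_bins`, and the small r2 matrix are all hypothetical.

```python
def greedy_ld_bins(r2, threshold=0.8):
    """Greedily group SNPs into bins using a symmetric pairwise r2 matrix.

    At each step, pick the unbinned SNP that exceeds the threshold with the
    most other unbinned SNPs; that SNP and its partners form a bin. Any SNP
    in the bin that exceeds the threshold with every other bin member is
    reported as a candidate tagSNP for the bin.
    """
    unbinned = set(range(len(r2)))
    bins = []
    while unbinned:
        best_members = set()
        for i in sorted(unbinned):
            members = {j for j in unbinned if j == i or r2[i][j] >= threshold}
            if len(members) > len(best_members):
                best_members = members
        tags = sorted(i for i in best_members
                      if all(i == j or r2[i][j] >= threshold
                             for j in best_members))
        bins.append({"members": sorted(best_members), "tags": tags})
        unbinned -= best_members
    return bins


# Hypothetical r2 matrix for five SNPs (illustration only, not VKORC1 data).
example_r2 = [
    [1.00, 0.90, 0.85, 0.10, 0.20],
    [0.90, 1.00, 0.95, 0.10, 0.10],
    [0.85, 0.95, 1.00, 0.00, 0.10],
    [0.10, 0.10, 0.00, 1.00, 0.90],
    [0.20, 0.10, 0.10, 0.90, 1.00],
]

if __name__ == "__main__":
    for number, ld_bin in enumerate(greedy_ld_bins(example_r2), start=1):
        print(f"Bin {number}: members {ld_bin['members']}, tags {ld_bin['tags']}")
```

Genotyping one tag per bin preserves most of the correlation structure while reducing the number of assays, which is the idea behind the five VKORC1 bins.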
SNP Selection: VG/LD_Select on the Web
pga.gs.washington.edu/VG2
SNP Selection: tagSNP Data
SNP Selection: VKORC1 tagSNPs
Five Bins to Test
1. 381, 3673, 6484, 6853, 7566
2. 2653, 6009
3. 861
4. 5808
5. 9041
Bin 1 - p < 0.001
Bin 2 - p < 0.02
Bin 3 - p < 0.01
Bin 4 - p < 0.001
Bin 5 - p < 0.001
[Figure: e.g. Bin 1 - SNP 381, by genotype C/C, C/T, T/T]
SNP x SNP interactions - haplotype analysis?
SNP Testing: VKORC1 tagSNPs
VKORC1 Summary: SNP Discovery/SNP Selection
1. VKORC1 candidate gene for warfarin dose response
2. SNP discovery performed using PCR/resequencing to catalog common SNPs
• 28 SNPs found
• 10 common SNPs
3. SNP discovery using dbSNP
• 8/10 dbSNPs confirmed
• 7 new SNPs added
4. SNP Selection using linkage disequilibrium
• 10 common SNPs (> 10% MAF)
• 5 informative SNPs for genotyping
Haplotypes in Genetic Association Studies
Two main approaches with haplotypes:
Haplotypes --> Pick tagSNPs --> Genotype samples
Pick tagSNPs --> Infer haplotypes --> Test for association
Haplotypes in Genetic Association Studies
1. How can you get haplotypes?
2. What information do you get from haplotypes?
3. How do you use haplotypes to find tagSNPs?
4. How do you use haplotypes to test for associations?
Haplotypes – The Definition
“…a unique combination of genetic markers present in a chromosome.” pg 57 in Hartl & Clark, 1997
Constructing Haplotypes
[Figure: ways to construct haplotypes across SNP 1 (C/T) and SNP 2 (A/G): collect pedigrees (family genotypes such as C/T, A/G; C/C, A/G; T/T, G/G), somatic cell hybrids (human x rodent hybrid), and allele-specific PCR]
Constructing Haplotypes
Examples of Haplotype Inference Software:
EM Algorithm:
Haploview http://www.broad.mit.edu/mpg/haploview/index.php
Arlequin http://lgb.unige.ch/arlequin/
PHASE v2.1 http://www.stat.washington.edu/stephens/software.html
HAPLOTYPER http://www.people.fas.harvard.edu/~junliu/Haplo/docMain.htm
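To make the idea of statistical haplotype inference concrete, here is a minimal sketch of an EM algorithm for two biallelic SNPs. It is a textbook-style illustration and does not reproduce the implementation of any package listed above. The genotype coding (0, 1, 2 copies of the second allele), the function name, and the toy data are assumptions made for this example.

```python
from collections import Counter


def em_two_snp_haplotypes(genotypes, n_iter=50):
    """Estimate two-SNP haplotype frequencies from unphased genotypes by EM.

    genotypes: list of (g1, g2), each the count (0, 1, 2) of allele '1'.
    Only the double heterozygote (1, 1) is phase-ambiguous: it is either
    00/11 ("cis") or 01/10 ("trans").
    """
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    freqs = {h: 0.25 for h in haps}          # start from equal frequencies
    counts = Counter(genotypes)
    n_chrom = 2 * len(genotypes)

    for _ in range(n_iter):
        expected = {h: 0.0 for h in haps}
        for (g1, g2), n in counts.items():
            if g1 == 1 and g2 == 1:
                # E-step: split double heterozygotes by current frequencies.
                cis = freqs[(0, 0)] * freqs[(1, 1)]
                trans = freqs[(0, 1)] * freqs[(1, 0)]
                w = cis / (cis + trans) if (cis + trans) > 0 else 0.5
                expected[(0, 0)] += n * w
                expected[(1, 1)] += n * w
                expected[(0, 1)] += n * (1 - w)
                expected[(1, 0)] += n * (1 - w)
            else:
                # Phase is unambiguous when at most one site is heterozygous.
                hap_a = (1 if g1 == 2 else 0, 1 if g2 == 2 else 0)
                hap_b = (1 if g1 >= 1 else 0, 1 if g2 >= 1 else 0)
                expected[hap_a] += n
                expected[hap_b] += n
        # M-step: re-estimate frequencies from expected haplotype counts.
        freqs = {h: expected[h] / n_chrom for h in haps}
    return freqs


# Toy data (hypothetical): 4 homozygotes of each type and 2 double heterozygotes.
toy_genotypes = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2

if __name__ == "__main__":
    for hap, freq in em_two_snp_haplotypes(toy_genotypes).items():
        print(hap, round(freq, 4))
```

In the toy data the homozygotes carry only 00 and 11 haplotypes, so EM resolves the ambiguous double heterozygotes toward the 00/11 phase. The resulting haplotypes are estimates, which is why the later caveat that haplotypes are inferred matters.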
Haplotypes in SeattleSNPs
• >200 genes re-sequenced in inflammation response
• 2 populations: European- and African-Americans
• PHASEv2.0 results posted on website
• Interactive tool (VH1) to visualize and sort haplotypes
http://pga.gs.washington.edu
Haplotypes in Genetic Association Studies
Two main approaches with haplotypes:
Haplotypes --> Pick tagSNPs --> Genotype samples
Pick tagSNPs --> Infer haplotypes --> Test for association
Recombination, natural selection, population history, population demography
Haplotype block definition
Measuring Pair-wise SNP Correlations
• SNP correlation described by linkage disequilibrium (LD)
• Pair-wise measures of LD: D´ and $r^2$
$D = p_{AB} - p_A p_B$; $D' = D / D_{max}$ (related to recombination)
$r^2 = \dfrac{D^2}{f(A_1)\, f(A_2)\, f(B_1)\, f(B_2)}$ (related to power)
• $r^2$ determines power: the required sample size scales with $1/r^2$
1,000 cases and 1,000 controls at $r^2$ = 1.0 correspond to 1,250 cases and 1,250 controls at $r^2$ = 0.80
• D´ is related to recombination history
D´ = 1: no recombination
D´ < 1: historical recombination
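The two LD measures and the sample-size scaling can be computed directly from haplotype and allele frequencies. This is a minimal sketch that uses the standard definition of the maximum possible D and reports the absolute value of D´. The function names and the example frequencies are hypothetical.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, |D'|, r2) for two biallelic sites.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A (site 1) and allele B (site 2)
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


def equivalent_sample_size(n_direct, r2):
    """Samples needed when typing a correlated SNP instead of the causal one."""
    return n_direct / r2


if __name__ == "__main__":
    # Hypothetical frequencies: allele A only ever occurs together with B.
    d, d_prime, r2 = ld_measures(p_ab=0.4, p_a=0.4, p_b=0.5)
    print(f"D = {d:.3f}, D' = {d_prime:.3f}, r2 = {r2:.3f}")
    # Slide example: r2 = 0.80 inflates 1,000 cases to 1,250.
    print(equivalent_sample_size(1000, 0.80))
```

In the hypothetical example only three of the four haplotypes exist, so D´ is 1 while r2 is well below 1. This is why a tagging approach built on power considerations uses r2 rather than D´.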
Example: LDSelect
Example: Haplotype “blocks”
Using LD and Haplotypes to Pick tagSNPs
Haplotype “Blocks”
Strong LD Few Haplotypes Represent most chromosomes
Daly et al. Nat. Genet. (2001)
Block Definitions
D´ [Gabriel et al. Science (2002)]
Daly et al. Nat. Genet. (2001)
Block Definitions
Four-gamete test:
Haplotypes A B, a b, A b, a B: 4 haplotypes, D´ < 1 --> boundary
Haplotypes A B, a b: < 4 haplotypes, D´ = 1 --> block
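The four-gamete test amounts to counting the distinct two-site haplotypes observed for a pair of sites. This is a minimal sketch that assumes biallelic sites and phased haplotypes given as strings; the function name and example haplotypes are illustrative.

```python
def four_gamete_test(haplotypes, i, j):
    """Return True if all four two-site haplotypes are seen at sites i and j.

    Assumes both sites are biallelic. Seeing all four combinations is taken
    as evidence of historical recombination between the sites (a block
    boundary); fewer than four is compatible with a single block.
    """
    gametes = {(hap[i], hap[j]) for hap in haplotypes}
    return len(gametes) == 4


if __name__ == "__main__":
    print(four_gamete_test(["AB", "ab", "Ab", "aB"], 0, 1))  # four gametes
    print(four_gamete_test(["AB", "ab", "AB"], 0, 1))        # two gametes
```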
Haplotype Blocks and tagSNPs
Identifying blocks and tagSNPs:
• Manually
• Algorithms
– Haploview
Haplotype Blocks and tagSNPs
IL1B: 19 SNPs (MAF >5%)
4 “common” haplotypes
tagSNPs
LD and tagSNPs using Haploview
VKORC1, European-Americans
PHASE v2.1 data
Minimal set of tagSNPs based on r2
Where to Find Tagging Software
HaploBlockFinder http://cgi.uc.edu/cgi-bin/kzhang/haploBlockFinder.cgi
LDSelect http://droog.gs.washington.edu/ldSelect.html
SNPtagger http://www.well.ox.ac.uk/~xiayi/haplotype/index.html
TagIT http://popgen.biol.ucl.ac.uk/software.html
tagSNPs http://www-rcf.usc.edu/~stram/tagSNPs.html
Haploview http://www.broad.mit.edu/personal/jcbarret/haplo/
Haplotypes, TagSNPs, and Caveats
• Haplotypes are inferred
• Block-like structure assumed for some software
• Different block definitions
• Block boundaries sensitive to marker density
• Genotype savings may not be great (recombination)
Haplotypes in Genetic Association Studies
Two main approaches with haplotypes:
Haplotypes --> Pick tagSNPs --> Genotype samples
Pick tagSNPs --> Infer haplotypes --> Test for association
Genetic diversity of sample
Multi-SNP analysis
Five tagSNPs (10 total SNPs)
186 warfarin patients (European)
PHASE v2.1
9 haplotypes / 5 common (>5%)
Multi-SNP testing: Haplotypes
Test for association between haplotype and warfarin dose using multiple linear regression
Adjusted for all significant covariates: age, sex, amiodarone, CYP2C9 genotype
VKORC1 haplotypes cluster into divergent clades
[Figure: haplotype tree. Haplotypes: CCGATCTCTG (H1), CCGAGCTCTG (H2), TCGGTCCGCA (H7), TAGGTCCGCA (H8), TACGTTCGCG (H9). Branch labels: (381, 3673, 6484, 6853, 7566), 5808, 9041, 861. Clades: A and B]
Patients can be assigned a clade diplotype, e.g.:
Patient 1 - H1/H2 = A/A
Patient 2 - H1/H7 = A/B
Patient 3 - H7/H9 = B/B
Explore the evolutionary relationship across haplotypes
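Assigning a clade diplotype from a pair of inferred haplotypes is a simple lookup. The sketch below uses only the clade memberships implied by the three patient examples (H1 and H2 in clade A; H7 and H9 in clade B). Treating any other haplotype as "Other" is an assumption of this sketch, and the mapping should be extended from the actual haplotype tree.

```python
# Clade membership implied by the slide's patient examples only.
CLADE_OF_HAPLOTYPE = {"H1": "A", "H2": "A", "H7": "B", "H9": "B"}


def clade_diplotype(hap1, hap2, clade_map=CLADE_OF_HAPLOTYPE):
    """Collapse a haplotype pair into a clade diplotype such as 'A/B'.

    Haplotypes missing from clade_map are labeled 'Other' (an assumption
    of this sketch, not a rule stated in the source).
    """
    clades = sorted(clade_map.get(hap, "Other") for hap in (hap1, hap2))
    return "/".join(clades)


if __name__ == "__main__":
    patients = {"Patient 1": ("H1", "H2"),
                "Patient 2": ("H1", "H7"),
                "Patient 3": ("H7", "H9")}
    for name, (h1, h2) in patients.items():
        print(name, clade_diplotype(h1, h2))
```

Collapsing many haplotypes into two clades reduces the number of groups to compare, which is what allows the A/A, A/B, B/B comparison of warfarin dose.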
Multi-SNP testing: Haplotypes
VKORC1 clade diplotypes show a strong association with warfarin dose
[Figure: warfarin dose (Low to High) by clade diplotype (A/A, A/B, B/B) in all patients (n = 181), 2C9 WT patients (n = 124), and 2C9 VAR patients (n = 57); significance markers *, †]
Independent of INR levels across all groups
• European - mean ~ 5 mg/d
• African-American - higher ~ 6.0-7.0 mg/d
• Asian - lower ~ 3.0-3.5 mg/d
Hypothesis: VKORC1 haplotypes contribute to racial variability in warfarin dosing.
• “Control” populations: 120 Europeans, 96 African-Americans, 120 Asian
Multi-SNP testing: Haplotypes
Asian (Han) Clade Distribution (low dose phenotype): A 89%, B 11%
African-American Clade Distribution (high dose phenotype): A 14%, B 47%, Other 39%
European (CEPH) Clade Distribution: A 37%, B 58%
Clade A = Low; Clade B = High
Explore the evolutionary relationship across populations
Multi-SNP testing: Haplotypes
• Small sample size
• Subgroup analysis and multiple testing
• Random error
• Poorly matched control group
• Failure to attempt study replication
• Failure to detect LD with adjacent loci
• Overinterpreting results and positive publication bias
• Unwarranted ‘candidate gene’ declaration after identifying association in arbitrary genetic region
Common Errors in Association Studies
Bell and Cardon (2001)
e.g., Second case/control study; Gene expression studies
[Figure: warfarin dose by clade diplotype (AA, AB, BB) in all patients, 2C9 WT patients, and 2C9 VAR patients, shown for two cohorts; significance markers *, †]
Univ. of Washington, n = 185
Washington University, n = 386 (Brian Gage, Howard McLeod, Charles Eby)
21% variance in dose explained
SNP Replication: VKORC1
SNP Function: VKORC1 Expression
mechanism
No nonsynonymous SNPs
Several SNPs are present in evolutionarily conserved non-coding regions
- mRNA expression in human liver cell lines
SNP Function: VKORC1 Expression
Expression in human liver tissue (n = 53) shows a graded change in expression.
VKORC1 SNP alters liver-specific binding site
• Databases and resources available for SNP discovery
• Software for tagSNP selection available
• Both single and multi-SNP analysis are useful
• Replication required by several journals
SNP Discovery and Analysis Application to Association Studies
Summary
SeattleSNPs Genotyping Service
• Free genotyping (BeadArray or SNPlex)
• Emphasis on young investigators
• Research related to heart, lung, blood, or sleep disorders
• Moderate to large population samples
• Apply at pga.gs.washington.edu
• Due: October 15th, 2005
SNP Typing Formats
Scale: Low - Microtiter Plates - Fluorescence
e.g. Taqman - Good for a few markers - lots of samples - PCR prior to genotyping
Scale: Medium - Size Analysis by Electrophoresis
e.g. SNPlex - Intermediate multiplexing reduces costs - Genotype directly on genomic DNA - new paradigm for high throughput
Scale: High - Arrays - Custom or Universal
e.g. Illumina, ParAllele, Affymetrix - Highly multiplexed - 1,500 SNPs and beyond (500K+)
Taqman
Genotyping with fluorescence-based homogenous assays (single-tube assay) = 1 SNP/ tube
Technological Leap - No advance PCR
Universal PCR after preparing multiple regions for analysis.
Several are based on primer-specific ligation on genomic DNA followed by PCR of the ligated products, with different strategies and different readouts: SNPlex, Illumina, ParAllele.
Also, reduced representation (Affymetrix): cut with restriction enzyme, ligate linkers, amplify from the linkers, and read out by chip hybridization.
[Figure: step 9, characterize on capillary sequencer; detection of SNP 1 and SNP 2]
Multiplexed Genotyping - Universal Tag Readouts
[Figure: each locus-specific sequence (Locus 1, Locus 2) carries a tag sequence (Tag1, Tag2) that pairs with its cTag sequence on a substrate (bead or chip); Tags 1-4 shown on a chip array and a bead array; alleles C, T, A, G]
Multiplex ~1,000 SNPs
Not dependent on primary PCR
Illumina, ParAllele, Affymetrix
Illumina Platform
96 Multi-array Matrix matches standard microtiter plates
~1,500 SNPs typed per matrix for 96 samples
Affymetrix’s 100K Chip
http://www.affymetrix.com/products/arrays/specific/100k.affx
Optimized for 250-2000bp
High Throughput Chip Formats
Defining the scale of the genotyping project is key to selecting an approach:
Project scale                                                    Cost per SNP/genotype ($)   1000 individuals
5 to 10 SNPs in a candidate gene (many approaches, expensive)    ~0.60                       $6,000
48 (to 96) SNPs in a handful of candidate genes                  ~0.25                       $12,000
384 to 1,536 SNPs                                                ~0.15 - 0.08                $57,600-122,880
10,000 cSNPs - defined format                                    ~0.05                       $500,000
100,000 genic SNPs - defined format                              ~0.005                      $500,000
500,000 SNPs - defined format                                    ~0.004                      $2,000,000
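The project totals follow from multiplying the number of SNPs, the per-genotype cost, and the number of individuals. This minimal sketch reproduces that arithmetic. The function name is hypothetical, and the SNP counts paired with each total (for example 10 SNPs for the first row and 48 for the second) are the ones that match the listed figures.

```python
def genotyping_cost(n_snps, cost_per_genotype, n_individuals=1000):
    """Total cost in dollars: SNPs x cost per SNP genotype x individuals."""
    return round(n_snps * cost_per_genotype * n_individuals, 2)


if __name__ == "__main__":
    # (number of SNPs, approximate cost per genotype in dollars) from the table.
    scenarios = [
        (10, 0.60),
        (48, 0.25),
        (384, 0.15),
        (1536, 0.08),
        (10_000, 0.05),
        (100_000, 0.005),
        (500_000, 0.004),
    ]
    for n_snps, unit_cost in scenarios:
        total = genotyping_cost(n_snps, unit_cost)
        print(f"{n_snps:>7,} SNPs at ${unit_cost}/genotype: ${total:,.0f}")
```

The per-genotype price falls by two orders of magnitude across formats, but the total still rises with scale, so the choice of format depends on how many SNPs the study design actually needs.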
Acknowledgements
Allan Rettie, Medicinal Chemistry
Alex Reiner
Dave Veenstra
Dave Blough
Ken Thummel
Noel Hastings
Maggie Ahearn
Josh Smith
Chris Baier
Peggy Dyer-Robertson
Washington University
Brian Gage
Howard McLeod
Charles Eby
Joyce You - Hong Kong