Association mapping: finding genetic variants for common traits & diseases

Manuel Ferreira

Queensland Institute of Medical Research

Brisbane

Genetic Epidemiology

WEHI Postgraduate seminar, 31 May 2010

Predict disease risk / drug response: Personalized Medicine

Lancet 2010; 375: 1525–35

Understand disease aetiology

Rare, monogenic traits

Ng et al. Nature Genetics 2010; 42: 30-35.

DISEASE RISK

Common, complex traits

Phenotypic modelling

Linkage analysis

Association analysis

GENETICS OF COMMON DISEASES

Recent advances in assays/analysis of genetic variation

HapMap, 1000 Genomes

High-throughput genotyping & sequencing

Analytic Methods

Genome-wide association, imputation, stratification, CNVs, risk prediction

genes env

DISEASE RISK

HapMap project

“The HapMap was designed to determine the frequencies and patterns of association among roughly 3 million common Single Nucleotide Polymorphisms (SNPs) in four populations, for use in genetic association studies.” [4]

1. GOALS

[1] The International HapMap Consortium. Nature 2003; 426: 789.
[2] International HapMap Consortium. Nature 2005; 437: 1299.
[3] International HapMap Consortium. Nature 2007; 449: 851.
[4] Manolio et al. J Clin Invest 2008; 118: 1590.

HapMap project

2. STRATEGY

Individuals

30 trios, Yoruba in Ibadan, Nigeria (YRI)
30 trios, European descent in Utah (CEU)
45 unrelated Han Chinese from Beijing (CHB)
45 unrelated Japanese from Tokyo (JPT)

Genome-wide SNP discovery

dbSNP: 1.7 million (2002) → 9.2 million (2005) → 14.7 million (6.5 million validated)

Genotyping

Phase 1: MAF>0.05, validated, non-synonymous SNPs prioritised (1.27 million total)

Phases 2 and 3 expanded SNP (4 million) and population (11) coverage

http://www.hapmap.org/

SNP selection

7 genotyping platforms used/developed by 12 centres

HapMap project

3. OUTCOMES

“Systematic” catalogue of common human variation

Linkage disequilibrium (LD) or correlation between SNPs (tagging, fine-mapping, imputation)

Designing and refining high-throughput genotyping platforms

Population genetics (selection, sub-structure, recombination & mutation)

Gene A

Haplotypes

HapMap SNPs

D′ and r²

Correlation (LD) between SNPs

SNP tags: Haploview, Tagger

Genetic coverage: proportion of known SNPs tagged (Haploview)

Fine-mapping: interesting SNPs to follow up; cross-study comparisons

eg. SNP 1 ‘tags’ 4/10 variants
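The two LD measures named on this slide can be worked through in a few lines. The sketch below assumes two biallelic SNPs with known allele and haplotype frequencies; the function name and the example frequencies are hypothetical and are not taken from HapMap data.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a, p_b: frequencies of alleles A and B (both assumed strictly between 0 and 1)
    """
    # D is the excess of the observed haplotype frequency over independence.
    d = p_ab - p_a * p_b
    # D' rescales D by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r2 is the squared correlation between the two SNPs.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies: allele B only ever occurs on haplotypes carrying A.
d, d_prime, r2 = ld_stats(p_ab=0.3, p_a=0.4, p_b=0.3)
print(f"D={d:.3f}  D'={d_prime:.3f}  r2={r2:.3f}")
```

In this toy case D′ reaches its maximum while r² does not, because the two alleles differ in frequency. That is why r² is usually the more relevant measure when asking how well one SNP tags another.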

1000 Genomes project

http://www.1000genomes.org/

“The 1000 Genomes Project aims to achieve a nearly complete catalog of common human genetic variants (defined as frequency 1% or higher) by generating high-quality sequence data for >85% of the genome for three sets of 400-500 individuals (...)”

2,500 samples at 4x by 2011

Whole-genome sequencing (from $10,000 USD/sample)

Illumina HiSeq 2000: 30x coverage, 100 bp read length

Complete Genomics: 40x coverage, 35 bp read length

Whole-genome genotyping (from $300 USD/sample)

Affymetrix 6.0 chip: >900,000 SNPs, CNV probes, 82% coverage of CEU HapMap, accuracy 99.90%

Illumina Human1M BeadChip: >1 million SNPs, CNV probes, 95% coverage of CEU HapMap, accuracy 99.94%

Recent advances in assays/analysis of genetic variation

HapMap, 1000 Genomes

Analytic Methods

Genome-wide Association, stratification, imputation, CNV, risk prediction

Examples: recent GWAS.

Analytic methods

1. GENOME-WIDE ASSOCIATION

[Figure: cases versus controls, two panels: "No association" and "Association"]
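The simplest version of the case-control comparison on this slide is an allelic test on a 2×2 table of allele counts. The sketch below is a minimal illustration under that assumption; the function name and the counts are hypothetical, and real GWAS software (eg. PLINK) offers several other tests and covariate adjustments.

```python
import math


def allelic_test(case_a, case_b, ctrl_a, ctrl_b):
    """Allelic association test on a 2x2 table of allele counts.

    Returns (odds ratio, chi-square statistic, p-value) for the 1-df
    Pearson chi-square test without continuity correction.
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    cross = case_a * ctrl_b - case_b * ctrl_a
    chi2 = n * cross ** 2 / (
        (case_a + case_b) * (ctrl_a + ctrl_b) * (case_a + ctrl_a) * (case_b + ctrl_b)
    )
    odds_ratio = (case_a * ctrl_b) / (case_b * ctrl_a)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Hypothetical counts: 100 cases and 100 controls, ie. 200 alleles per group.
odds_ratio, chi2, p_value = allelic_test(case_a=60, case_b=140, ctrl_a=40, ctrl_b=160)
print(f"OR={odds_ratio:.2f}  chi2={chi2:.2f}  P={p_value:.3g}")
```

In a genome-wide scan this test is repeated for every SNP, so a far stricter significance threshold than 0.05 is required.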

Analytic methods

Study designs

Unrelated individuals
  Association tests: between individual effects
  Software: many (eg. PLINK)
  More power / $ spent, easier to collect, analyse

Families
  Association tests: between + within family effects
  Software: Merlin, etc
  Assess inheritance (CNVs), robust to population stratification

Analytic methods

2. POPULATION STRATIFICATION

Ind1   Ind2   % shared
A1     A2     100
A1     A3     50
A1     A4     25
A1     A5     10
A1     A6     8
A1     B1     5

Genetic matching

cases controls
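One way to make the idea of genetic matching concrete is to score each pair of individuals by how many alleles they share and pair each case with its most similar control. The sketch below assumes identity-by-state (IBS) sharing on genotypes coded as 0/1/2 copies of a reference allele. The slide does not say how its "% shared" column was computed, so the measure, the function names and the data are illustrative assumptions.

```python
def ibs_sharing(g1, g2):
    """Proportion of alleles shared identical-by-state across SNPs.

    g1, g2: equal-length lists of genotypes coded 0, 1 or 2.
    At each SNP two individuals share 2, 1 or 0 alleles.
    """
    shared = sum(2 - abs(a - b) for a, b in zip(g1, g2))
    return shared / (2 * len(g1))


def best_match(case_genotypes, controls):
    """Return the name of the control with the highest IBS sharing."""
    return max(controls, key=lambda name: ibs_sharing(case_genotypes, controls[name]))


# Hypothetical genotypes at four SNPs.
case = [0, 1, 2, 1]
controls = {
    "ctrl_1": [0, 2, 2, 0],  # shares 6 of 8 alleles with the case
    "ctrl_2": [2, 1, 0, 1],  # shares 4 of 8 alleles with the case
}
print(best_match(case, controls))
```

With genome-wide data the same pairwise score is computed over hundreds of thousands of SNPs, and cases without a sufficiently similar control can be dropped.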

Analytic methods

3. IMPUTATION OF UNMEASURED GENOTYPES

Reference panel (eg. HapMap)

Genotyped Dataset

Individuals

SNPs MAF    N SNPs    Imputation     Proportion   Average      Average
                      info score     of SNPs      imputation   concordance
0.01-0.05   27,078    Not imputed    0.000        -            -
                      0-0.5          0.325        0.841        0.966
                      0.5-0.8        0.149        0.917        0.979
                      ≥0.8           0.526        0.992        0.992
0.05-0.15   71,984    Not imputed    0.002        -            -
                      0-0.5          0.164        0.525        0.934
                      0.5-0.8        0.175        0.750        0.961
                      ≥0.8           0.659        0.967        0.989
0.15-0.25   65,918    Not imputed    0.004        -            -
                      0-0.5          0.082        0.248        0.874
                      0.5-0.8        0.164        0.554        0.939
                      ≥0.8           0.750        0.939        0.986
0.25-0.50   146,253   Not imputed    0.004        -            -
                      0-0.5          0.053        0.094        0.777
                      0.5-0.8        0.145        0.389        0.907
                      ≥0.8           0.798        0.917        0.981

MACH, IMPUTE, BEAGLE

Shaun Purcell, Doug Ruderfer (PLINK)
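To show how such quality metrics can be computed, the sketch below evaluates two quantities for a single imputed SNP: concordance between best-guess and true genotypes, and a variance-ratio quality score. The variance-ratio definition is an assumption in the spirit of the r² measure reported by some imputation programs; the slide does not state how its info score was defined, and the function names and data are hypothetical.

```python
def concordance(dosages, true_genotypes):
    """Fraction of individuals whose best-guess genotype equals the true one.

    dosages: imputed expected allele counts, each between 0 and 2.
    The best guess is the dosage rounded to the nearest of 0, 1, 2.
    """
    best_guess = [min(2, max(0, round(d))) for d in dosages]
    matches = sum(1 for b, t in zip(best_guess, true_genotypes) if b == t)
    return matches / len(true_genotypes)


def dosage_variance_ratio(dosages):
    """Observed dosage variance divided by the variance expected for
    perfectly known genotypes in Hardy-Weinberg equilibrium, 2p(1-p)."""
    n = len(dosages)
    mean = sum(dosages) / n
    p = mean / 2  # estimated allele frequency
    expected = 2 * p * (1 - p)
    if expected == 0:
        return 0.0
    observed = sum((d - mean) ** 2 for d in dosages) / n
    return observed / expected


# Hypothetical imputed dosages and the genotypes later obtained by direct typing.
dosages = [0.1, 0.9, 1.8, 1.1]
truth = [0, 1, 2, 2]
print(concordance(dosages, truth), round(dosage_variance_ratio(dosages), 3))
```

Uncertain imputation pulls dosages towards the population mean, which shrinks the observed variance and lowers the ratio. That is why a threshold such as 0.8 is often used to filter imputed SNPs.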

Genotyped + Imputed Dataset

Illumina

Perlegen

HapMap

Combine data from studies genotyped using different platforms

Example 1: Bipolar Disorder GWAS

                         WTCCC        STEP-UCL     ED-DUB-STEP2   Overall
Sample size
  N (% males)            4,764 (45)   3,467 (47)   2,365 (40)     10,596 (44)
  Cases (% males)        1,829 (38)   1,460 (43)   1,098 (44)     4,387 (41)
  Controls (% males)     2,935 (49)   2,007 (50)   1,267 (36)     6,209 (47)
Genotype missing rate    0.0027       0.0057       0.0031         0.0038
Power (α = 5 × 10⁻⁸)
  MAF 0.05, GRR 1.40     0.05         0.02         <0.01          0.61
  MAF 0.20, GRR 1.20     0.03         <0.01        <0.01          0.48
  MAF 0.40, GRR 1.15     0.02         <0.01        <0.01          0.31

Ferreira et al (2008) Nature Genetics 40: 1056

325,690 SNPs

>1.7 million SNPs

ANK3: Ankyrin G

Cases: 7.0%, Controls: 5.3%

Odds ratio = 1.45

Not related to sex, psychosis or age-of-onset

Replicated recently:

Smith et al (2009) Mol Psychiatry 14: 755-63.

Scott et al (2009) Proc Natl Acad Sci USA 106: 7501-6.

[Lee et al (2010) Mol Psychiatry Apr 13: Han Chinese population]

Example 2: analysis of lymphocyte subsets

Ferreira et al. (2010) Am J Hum Genet 86: 88-92

2,538 individuals | CD4+ T cell levels, CD8+ T cell levels, CD4:CD8 ratio

MHC class I
• rs2524054, C
• Increased CD8+ T levels
• Improved host control of HIV (OR=0.32, P=10⁻⁹)

MHC class II
• rs9270986, A
• Increased CD4+ T levels
• Protective effect for type-1 diabetes (OR=0.04, P=10⁻¹²⁵)
• Protective effect for rheumatoid arthritis (OR=0.60, P=10⁻¹⁵)

Structural Variants

Genomic alterations involving segment of DNA >1kb

Quantitative (Copy Number Variants): deletions, duplications, insertions

Positional (translocations)

Orientational (inversions)

Analytic methods

4. Structural Variants

Detection of CNVs

Non-polymorphic probes

McCarroll et al 2008 Nat Genet 40: 1166

Detection of CNVs

Use polymorphic probes from genotyping arrays to identify and genotype new, potentially rarer CNVs

Example: rs1006737 A/G

probe 1: ...AGCCCGAAATGTTTTCAGA...
probe 2: ...AGCCCGAAGTGTTTTCAGA...

[Plot: probe intensities with genotype clusters AA, AG and GG; axis label: intensity of probe 2]

Detection of CNVs

Ind   Genotype     Copy number for:
      (Mat/Pat)    A    G    Total
1     A/G          1    1    2
2     A/-          1    0    1
3     AA/-         2    0    2
4     -/G          0    1    1
5     -/-          0    0    0
6     AAA/G        3    1    4

[Pattern column: diagrams of the ...CG ATG... segment on the maternal and paternal chromosomes for each individual]
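The copy-number columns in the table above follow mechanically from the Mat/Pat genotype string, which the short sketch below reproduces. The function name and the string encoding (letters for allele copies, "-" for a deleted chromosome segment) are assumptions made for illustration.

```python
def allele_copy_numbers(genotype):
    """Count copies of alleles A and G in a Mat/Pat genotype string.

    Each chromosome is written as a run of allele letters (one letter per
    copy of the segment) or "-" when the segment is deleted, eg. "AAA/G".
    """
    copies_a = genotype.count("A")
    copies_g = genotype.count("G")
    return {"A": copies_a, "G": copies_g, "total": copies_a + copies_g}


# The six genotypes listed in the table.
for genotype in ["A/G", "A/-", "AA/-", "-/G", "-/-", "AAA/G"]:
    print(genotype, allele_copy_numbers(genotype))
```

Individuals 1 and 3 both have a total copy number of 2, yet only individual 1 is a normal heterozygote. Total copy number alone therefore cannot distinguish them, whereas the allele-specific counts can.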

Detection of CNVs

Polymorphic probe in CNV region

[Plot: normalized intensity of allele A]

Individuals with deletion(s), ie. total CN < 2

Individuals with duplication(s), ie. total CN > 2

Detection of CNVs

Combine information across probes to identify new CNVs

For example...
                         Cases       Controls
100kb deletion chr. 2    10/5,000    1/5,000

Birdseye: Affy 5.0, 6.0 (Korn et al 2008 Nat Genet 40: 1253)

PennCNV: Affymetrix and Illumina (Wang et al 2007 Genome Res 17: 1665)
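For rare events such as the deletion in the example above (10/5,000 cases versus 1/5,000 controls), an exact test is a natural way to compare carrier counts. The sketch below implements a one-sided Fisher's exact test with the standard library only. It is an illustration, not the analysis used by Birdseye or PennCNV, and the function name is hypothetical.

```python
from math import comb


def fisher_one_sided(case_carriers, n_cases, ctrl_carriers, n_ctrls):
    """One-sided Fisher's exact test for an excess of carriers among cases.

    Returns the hypergeometric probability of observing at least
    `case_carriers` carriers among the cases, given the total number of
    carriers in the combined sample.
    """
    carriers = case_carriers + ctrl_carriers
    total = n_cases + n_ctrls
    upper = min(carriers, n_cases)
    numerator = sum(
        comb(carriers, k) * comb(total - carriers, n_cases - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / comb(total, n_cases)


# Counts from the slide's example: 100kb deletion on chr. 2.
p_value = fisher_one_sided(case_carriers=10, n_cases=5000, ctrl_carriers=1, n_ctrls=5000)
print(f"one-sided P = {p_value:.4f}")
```

A single-locus P value of this size would not survive correction for the number of regions tested genome-wide, which is one reason rare CNV findings depend on replication in independent samples.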

Example 3: Autism whole-genome CNV analysis

Sample                    16p11         Cases      Controls    P
Discovery                 Del (600kb)   5/1,441    3/4,234     1.1 × 10⁻⁴
[Affy 500K]               Dup           7/1,441    2/4,234
Replication 1 (CHB)       Del           5/512      0/434       0.007
[array-CGH]               Dup           4/512      0/434
Replication 2 (deCODE)    Del           3/299      2/18,834    4.2 × 10⁻⁴
[Illumina]                Dup           0/299      5/18,834

Deletion frequency Iceland

Autism 1%
Psychiatric disorder 0.1%
General population 0.01%

Weiss et al. N Engl J Med 2008; 358: 667

COPPER, Birdseye

             del   dup
inherited    2     6
de novo      10    1
unknown      1     4

Example 4: SCZ whole-genome CNV analysis

Shaun Purcell

[Figure: CNVs in cases and controls plotted along each chromosome]

Genome-wide burden

Specific loci

3,391 patients with SCZ, 3,181 controls

Filter for <1% MAF, >100kb

6,753 CNVs

Cases have greater rate of CNVs than controls: 1.15-fold increase, P = 3×10⁻⁵

Rate of genic CNVs in cases versus controls: 1.18-fold increase, P = 5×10⁻⁶

Rate of non-genic CNVs in cases versus controls: 1.09-fold increase, P = 0.16

Results invariant to obvious statistical controls: array type, genotyping plate, sample collection site, mean probe intensity
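A genome-wide burden comparison like the one summarised above can be sketched as a permutation test on the number of rare CNVs carried by each person. The slides do not describe the exact procedure, so the test statistic (difference in mean CNVs per person), the function names and the toy data below are all assumptions for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def burden_permutation_test(case_counts, control_counts, n_perm=2000, seed=1):
    """One-sided permutation test for a higher CNV rate in cases.

    case_counts, control_counts: number of rare CNVs carried by each person.
    Returns (observed difference in mean CNVs per person, empirical P value).
    """
    rng = random.Random(seed)
    observed = mean(case_counts) - mean(control_counts)
    pooled = list(case_counts) + list(control_counts)
    n_cases = len(case_counts)
    hits = 0
    for _ in range(n_perm):
        # Shuffling the pooled counts reassigns case/control labels at random.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_cases]) - mean(pooled[n_cases:])
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the empirical P value above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical per-person CNV counts for a small sample.
cases = [2, 1, 3, 0, 2, 1, 2, 3, 1, 2]
controls = [1, 0, 1, 2, 0, 1, 1, 0, 2, 1]
observed, p_value = burden_permutation_test(cases, controls)
print(f"difference in mean CNVs per person = {observed:.2f}, empirical P = {p_value:.3f}")
```

Technical covariates such as array type or genotyping plate could be handled by permuting labels only within strata, which corresponds to the kind of statistical control listed on the slide.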

Genome-wide burden of rare CNVs in SCZ

Shaun Purcell

Similar successes for other common diseases

[Figure panels: before Jan 2006; Jan 2006 to Jan 2008]

Crohn’s Disease (31 loci, ~10% variance)

http://www.genome.gov/gwastudies

Altshuler, Daly & Lander. Science 2008; 322: 881
Manolio, Brooks & Collins. J Clin Invest 2008; 118: 1590

Summary

Tremendous recent technological advances

Large-scale genetic association studies feasible

>150 disease loci unequivocally identified since 2006

Provide a solid base to build our knowledge about disease mechanisms

Hundreds of loci yet to be identified for most diseases
