Page 1: The medical relevance of genome variability Gabor T. Marth, D.Sc. Department of Biology, Boston College marth@bc.edu.

The medical relevance of genome variability

Gabor T. Marth, D.Sc.

Department of Biology, Boston College

marth@bc.edu

Page 2

Lecture overview

1. Phenotypic effects caused by known genetic variants

2. Genetic mapping to find genetic variants that cause diseases – linkage analysis and association studies

3. Genome-wide association mapping resources – the HapMap

4. Structural and epigenetic variations in disease

Page 3

1. Phenotypic effects caused by known genetic variants

Page 4

Many SNPs do have phenotypic effects

Badano and Katsanis, NRG 2002

some notable genetic diseases:

cystic fibrosis, sickle-cell anemia

Page 5

Genetic variants in Pharmacogenetics

Evans and Relling, Science 1999

Page 6

Genetic variants in Pharmacogenetics

Evans and Relling, Science 1999

Page 7

Using genotype information in the drug development pipeline

Roses, NRG 2004

Page 8

Are all genetic variants functional?

~ 10 million known SNPs

On the scale of the genome, SNPs are described well by the “neutral theory” of sequence variation, so the vast majority of SNPs are likely to have no functional effect.

How do we find the few functional variants in the background of millions of non-functional SNPs?

Page 9

2. Genetic mapping to find genetic variants that cause diseases – linkage analysis and association studies

Page 10

Genetic mapping

Page 11

Allelic association (linkage)

• allelic association is the non-random assortment between alleles, i.e. it measures how well knowledge of the allele state at one site (e.g. a marker site) permits prediction of the allele state at another site (e.g. a functional site)

• significant allelic association between a marker and a functional site permits localization (mapping) even without having the functional site in our collection

• allelic association, together with the use of genetic markers, is the basis for mapping functional alleles
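The strength of allelic association between a marker site and a functional site is commonly summarized by the disequilibrium coefficient D and the derived measures D' and r². The following is a minimal sketch, assuming phased haplotypes over two biallelic sites; the function name `ld_stats` and the allele coding ('A'/'a' at the marker, 'B'/'b' at the functional site) are illustrative choices, not part of the lecture.

```python
from collections import Counter


def ld_stats(haplotypes):
    """Return (D, D_prime, r2) for two biallelic sites.

    `haplotypes` is a list of 2-character strings such as "AB": the first
    character is the allele at the marker site ('A' or 'a'), the second the
    allele at the functional site ('B' or 'b'). This coding is hypothetical.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    p_a = sum(c for h, c in counts.items() if h[0] == "A") / n
    p_b = sum(c for h, c in counts.items() if h[1] == "B") / n
    p_ab = counts["AB"] / n

    # D is the excess of the AB haplotype over its expectation under
    # independent assortment of the two sites.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return d, 0.0, 0.0  # at least one site is monomorphic

    r2 = d * d / denom
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return d, abs(d) / d_max, r2


if __name__ == "__main__":
    # Marker allele always travels with the same functional allele.
    print(ld_stats(["AB"] * 5 + ["ab"] * 5))
    # All four haplotypes equally frequent: no association.
    print(ld_stats(["AB", "Ab", "aB", "ab"]))
```

An r² near 1 means the marker predicts the functional allele almost perfectly, which is the situation that makes mapping with markers possible.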

Page 12

Mendelian diseases have simple inheritance

genotype inheritance

genotype + phenotype inheritance

Page 13

Linkage analysis compares the transmission of marker genotype and phenotype in families
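One standard way to quantify co-transmission of a marker and a phenotype in families is the two-point LOD score, which the slide itself does not name. The sketch below assumes the simplest case, phase-known meioses that can each be scored as recombinant or non-recombinant; the function names and the grid search are illustrative assumptions.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score for a recombination fraction theta.

    Compares the likelihood of the observed recombinant count at `theta`
    with the likelihood under free recombination (theta = 0.5).
    Assumes phase-known, fully informative meioses.
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1 - theta))
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


def max_lod(recombinants, meioses, steps=50):
    """Grid-search theta over (0, 0.5] and return (best_theta, best_lod)."""
    grid = [0.5 * i / steps for i in range(1, steps + 1)]
    best = max(grid, key=lambda t: lod_score(recombinants, meioses, t))
    return best, lod_score(recombinants, meioses, best)


if __name__ == "__main__":
    # 1 recombinant among 10 scored meioses (made-up family data).
    print(max_lod(1, 10))
```

Real pedigrees usually involve unknown phase, missing genotypes and incomplete penetrance, which require dedicated linkage software rather than this closed-form count.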

Page 14

Complex disease – complex inheritance

Badano and Katsanis, NRG 2002

Page 15

Allele frequency and relative risk

Brinkman et al. Nature Reviews Genetics advance online publication; published online 14 March 2006 | doi:10.1038/nrg1828

Page 16

Association study strategies

• region(s) interrogated: single gene, list of candidate genes (“candidate gene study”), or entire genome (“genome scan”)

• direct or indirect:

direct: the causative variant itself is genotyped

indirect: a marker that is co-inherited with the causative variant is genotyped

• single-SNP marker or multi-SNP haplotype marker

• single-stage or multi-stage

Page 17

Association study strategies

For economy, one cannot genotype every SNP in thousands of clinical samples: marker selection is the process by which a subset of all available SNPs is chosen.

1. hypothesis-driven (i.e. based on gene function)

2. LD-driven – based entirely on the reduction of redundancy presented by the linkage disequilibrium (LD) between SNPs; tags represent other SNPs they are correlated with
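LD-driven marker selection can be sketched as a greedy covering procedure over pairwise r² values. This is a minimal illustration under stated assumptions, not the selection method used in the lecture: the function name `greedy_tag_snps`, the r² threshold of 0.8 and the small matrix are all hypothetical.

```python
def greedy_tag_snps(r2, threshold=0.8):
    """Pick tag SNPs greedily from a symmetric pairwise r^2 matrix.

    Returns a dict mapping each chosen tag index to the sorted list of SNP
    indices it represents (the tag itself included).
    """
    untagged = set(range(len(r2)))
    tags = {}
    while untagged:
        def coverage(i):
            # Number of still-untagged SNPs that SNP i would represent.
            return sum(1 for j in untagged if r2[i][j] >= threshold)

        # Ties go to the lowest index because candidates are sorted.
        best = max(sorted(untagged), key=coverage)
        covered = sorted({best} | {j for j in untagged
                                   if r2[best][j] >= threshold})
        tags[best] = covered
        untagged -= set(covered)
    return tags


if __name__ == "__main__":
    # Made-up r^2 values: SNPs 0-2 form one LD group, SNPs 3-4 another.
    r2 = [
        [1.00, 0.90, 0.85, 0.10, 0.00],
        [0.90, 1.00, 0.95, 0.20, 0.10],
        [0.85, 0.95, 1.00, 0.10, 0.10],
        [0.10, 0.20, 0.10, 1.00, 0.90],
        [0.00, 0.10, 0.10, 0.90, 1.00],
    ]
    print(greedy_tag_snps(r2))
```

In this toy matrix two tags stand in for five SNPs, which is the reduction of redundancy the slide refers to.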

Page 18

Marker selection depends on genome LD

Daly et al. NG 2001

Page 19

Case-control association testing

• searching for markers with “significant” allele frequency differences between cases and controls; these markers signify regions that may contain causative alleles

AF(cases)

AF(controls)

clinical cases

clinical controls

• genotyping cases and controls at various polymorphisms
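A simple way to test one marker for an allele frequency difference between cases and controls is a chi-square test on the 2×2 table of allele counts. The sketch below is one common choice and is not necessarily the test used in the lecture; the function name and the counts are made up for illustration, and no multiple-testing correction is applied.

```python
import math


def allelic_chi_square(case_counts, control_counts):
    """Chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Each argument is (count of allele 1, count of allele 2), counted over
    chromosomes. Returns (chi2, p_value).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    if 0 in rows or 0 in cols:
        return 0.0, 1.0  # test undefined; report no evidence
    chi2 = n * (a * d - b * c) ** 2 / (rows[0] * rows[1] * cols[0] * cols[1])
    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical marker: allele frequency 0.6 in cases, 0.4 in controls.
    print(allelic_chi_square((60, 40), (40, 60)))
```

In a genome scan the same test is repeated at every marker, so the significance threshold has to be far stricter than for a single test.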

Page 20

3. Genome-wide association mapping resources – the HapMap

Page 21

The HapMap resource

• goal: to map out human allele and association structure at the kilobase scale

• deliverables: a set of physical and informational reagents

Page 22

LD structure in four human populations

International HapMap Consortium, Nature 2005

Page 23

LD varies across samples

African reference (YRI)

there are large differences in LD between different human populations…

European reference (CEU)

… and even between samples from the same population.

Other European samples

Page 24

Sample-to-sample LD differences make tagSNP selection problematic

groups of SNPs that are in LD in the HapMap reference samples may not be in a future set of clinical samples…

… and tags that were selected based on LD in the HapMap may no longer work (i.e. represent the SNPs they were supposed to) in the clinical samples…

… possibly resulting in missed disease associations.

Page 25

Marker selection with additional samples

test if markers selected from the HapMap continue to “tag” other SNPs in their original LD group
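The test described here can be sketched as recomputing r² between each tag and the SNPs of its original LD group in an additional set of haplotypes, then listing the pairs that fall below the tagging threshold. The names `pair_r2` and `lost_tag_pairs`, the 0/1 allele coding and the 0.8 threshold are assumptions made for this illustration.

```python
def pair_r2(haplotypes, i, j):
    """r^2 between sites i and j; haplotypes are strings of '0'/'1'."""
    n = len(haplotypes)
    p_i = sum(h[i] == "1" for h in haplotypes) / n
    p_j = sum(h[j] == "1" for h in haplotypes) / n
    p_ij = sum(h[i] == "1" and h[j] == "1" for h in haplotypes) / n
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        return 0.0  # a monomorphic site cannot be tagged
    return (p_ij - p_i * p_j) ** 2 / denom


def lost_tag_pairs(tag_groups, new_haplotypes, threshold=0.8):
    """Return (tag, snp) pairs whose r^2 in the new sample is below threshold.

    `tag_groups` maps a tag SNP index to the SNP indices it represented in
    the reference sample.
    """
    lost = []
    for tag, members in tag_groups.items():
        for snp in members:
            if snp != tag and pair_r2(new_haplotypes, tag, snp) < threshold:
                lost.append((tag, snp))
    return lost


if __name__ == "__main__":
    # Hypothetical follow-up sample: SNP 1 still tracks tag 0, SNP 2 does not.
    new_sample = ["110", "110", "001", "001", "111", "000"]
    print(lost_tag_pairs({0: [0, 1, 2]}, new_sample))
```

Any pair reported here is a SNP that the HapMap-derived tag no longer represents in the new sample.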

Page 26

Representative computational samples

Page 27

Two methods of computational sample generation

[Figure labels: HapMap; “HapMap”, “cases”, “controls”]

Method 1. “Data-relevant Coalescent”. This algorithm uses a population genetic model to connect mutations in the HapMap reference to mutations in future clinical samples. Full model but computationally slow.

Method 2. The PAC method (product of approximate conditionals, Li & Stephens). This method constructs “new” samples as mosaics of existing haplotypes, mimicking the effects of recombination. An approximation but fast.
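The mosaic idea behind Method 2 can be illustrated with a heavily simplified copying process: each new haplotype copies alleles from one existing haplotype and occasionally switches to another, mimicking recombination. This sketch is an assumption-laden toy, not the Li & Stephens model itself: it uses a single fixed switch probability per site and omits mutation and map-distance-dependent switching. All names and parameter values are hypothetical.

```python
import random


def mosaic_haplotype(reference, switch_prob, rng):
    """Build one new haplotype as a mosaic of the reference haplotypes."""
    n_sites = len(reference[0])
    source = rng.randrange(len(reference))
    alleles = []
    for site in range(n_sites):
        # A switch of the copied haplotype stands in for a recombination.
        if site > 0 and rng.random() < switch_prob:
            source = rng.randrange(len(reference))
        alleles.append(reference[source][site])
    return "".join(alleles)


def simulate_sample(reference, size, switch_prob=0.1, seed=0):
    """Generate `size` mosaic haplotypes from equal-length reference strings."""
    rng = random.Random(seed)
    return [mosaic_haplotype(reference, switch_prob, rng)
            for _ in range(size)]


if __name__ == "__main__":
    # Made-up reference haplotypes over 8 SNPs.
    hapmap_like = ["00110101", "11001010", "00111100"]
    for hap in simulate_sample(hapmap_like, size=4, switch_prob=0.2, seed=1):
        print(hap)
```

Because every allele is copied from an existing haplotype at the same site, the generated samples preserve local haplotype structure while shuffling it over longer distances.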

Page 28

LD difference -- comparison to extra experimental genotypes

0.949 +/- 0.013

0.978 +/- 0.010

0.963 +/- 0.014

• we have analyzed two extra genotype sets collected at the HapMap SNPs in three genome regions, from our clinical collaborators (Prof. Thomas Hudson, McGill; Prof. Stanley Nelson, UCLA)

Page 29

Genome-wide scans for human diseases

Klein et al., Science 2005

SNPs in Complement Factor H (CFH) gene are associated with Age-related Macular Degeneration (AMD)

Page 30

4. Somatic, structural and epigenetic variants in disease

Page 31

Somatic mutations

© Brian Staveley, Memorial University of Newfoundland

the detection of somatic mutations, and their distinction from inherited polymorphisms, is important for separating predisposing variants from mutations that occur during disease progression, e.g. in cancer

1. detect the mutations

2. classify whether somatic or inherited

Page 32

Detecting somatic mutations with comparative data

• based on comparison of cancer and normal tissue from the same individual

• cancer tissue is often highly heterogeneous, so the somatic mutant allele may be present at a low allele frequency
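With matched tumor and normal data, one way to call a low-frequency somatic allele is to ask whether the variant allele is seen more often in the tumor than the normal sample can explain. The sketch below assumes read (or clone) counts per allele and applies a one-sided Fisher's exact test; this is an illustrative choice rather than the lecture's own method, and the function names, counts and the 0.05 cutoff are hypothetical.

```python
from math import comb


def fisher_one_sided(tumor_alt, tumor_ref, normal_alt, normal_ref):
    """P(at least `tumor_alt` variant observations fall in the tumor).

    Computed from the hypergeometric distribution with all table margins
    fixed, i.e. a one-sided Fisher's exact test.
    """
    n_tumor = tumor_alt + tumor_ref
    n_alt = tumor_alt + normal_alt
    total = n_tumor + normal_alt + normal_ref
    tail = 0
    for k in range(tumor_alt, min(n_alt, n_tumor) + 1):
        tail += comb(n_alt, k) * comb(total - n_alt, n_tumor - k)
    return tail / comb(total, n_tumor)


def is_candidate_somatic(tumor_alt, tumor_ref, normal_alt, normal_ref,
                         alpha=0.05):
    """Flag a site whose variant allele is enriched in the tumor sample."""
    return fisher_one_sided(tumor_alt, tumor_ref,
                            normal_alt, normal_ref) < alpha


if __name__ == "__main__":
    # Variant allele at 10% in tumor (5 of 50), absent from 50 normal reads.
    print(fisher_one_sided(5, 45, 0, 50))
    print(is_candidate_somatic(5, 45, 0, 50))
```

The example shows why depth matters: a mutant allele at 10% frequency only becomes distinguishable from noise once enough observations are available from both tissues.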

Page 33

Detecting somatic mutations with subtraction

• if normal tissue samples are not available, we detect SNPs in cancer tissue against e.g. the human genome reference sequence

• subtract apparent mutations that are present in sequence variation databases

• search for evidence that these mutations are genetic
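The subtraction step amounts to a set difference between the variants called in the cancer tissue and the variants already recorded in a polymorphism database. A minimal sketch follows, assuming each variant is represented as a (chromosome, position, reference allele, alternate allele) tuple; the representation, the function name and the coordinates are made up for illustration.

```python
def subtract_known(candidates, known_variants):
    """Split tumor-vs-reference calls into novel and already-known variants.

    Both inputs are iterables of (chrom, pos, ref, alt) tuples. Returns
    (novel, known), each as a sorted list.
    """
    known_set = set(known_variants)
    unique_calls = set(candidates)
    novel = sorted(v for v in unique_calls if v not in known_set)
    known = sorted(v for v in unique_calls if v in known_set)
    return novel, known


if __name__ == "__main__":
    # Hypothetical calls from cancer tissue against the reference sequence.
    tumor_calls = [
        ("chr1", 1000, "A", "G"),
        ("chr2", 5000, "C", "T"),
        ("chr7", 250, "G", "A"),
    ]
    # Hypothetical entries from a sequence variation database.
    database = [("chr2", 5000, "C", "T")]
    novel, known = subtract_known(tumor_calls, database)
    print("remaining candidates:", novel)
    print("subtracted as known polymorphisms:", known)
```

Variants that survive subtraction are only candidates: an inherited variant that is simply absent from the database looks identical at this stage, which is why the follow-up search for evidence is still needed.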

Page 34

Detecting somatic mutations in murine mtDNA

• we have applied our methods for somatic mutation detection in murine mitochondrial sequences

heteroplasmy homoplasmy

• we will be applying our methods to human nuclear DNA from our collaborators

Page 35

Structural variants in disease

Feuk et al. Nature Reviews Genetics 7, 85–97 (February 2006) | doi:10.1038/nrg1767

Page 36

Structural variations and phenotype

Feuk et al. Nature Reviews Genetics 7, 85–97 (February 2006) | doi:10.1038/nrg1767

Page 37

Epigenetics and cancer

Baylin et al., NRC 2006

Page 38

Informatics of detection / integration of varied genetic and epigenetic data

chromatin structure

gene expression profiles

copy number changes

methylation profiles

chromosome rearrangements

repeat expansions

somatic mutations