Selecting TagSNPs in Candidate Genes for Genetic Association Studies Shehnaz K. Hussain, PhD, ScM Assistant Professor Department of Epidemiology, UCLA [email protected] Epidemiology 244: Cancer Epidemiology Methods

Page 1

Selecting TagSNPs in Candidate Genes for Genetic Association Studies

Shehnaz K. Hussain, PhD, ScM
Assistant Professor
Department of Epidemiology, UCLA
[email protected]

Epidemiology 244: Cancer Epidemiology Methods

Page 2

Objectives

Molecular genetics primer

Databases and tools to conduct in silico analyses for tagSNP selection/prioritization

Page 3

Central dogma

DNA (A T C G) → mRNA → Protein

Page 4

What are SNPs?

More than 99% of all nucleotides are the same in all humans

Less than 1% of nucleotides are polymorphic

SNPs are far more common than insertions-deletions

Bi-nucleotide (two alleles), e.g., T (80%) and A (20%)

Where do SNPs occur?

Exons

Introns

Flanking regions
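The allele percentages on this slide can be computed directly from genotype data. The sketch below is a minimal illustration, assuming each genotype is written as a two-letter string; the function name and the ten genotypes are hypothetical, chosen to reproduce the T (80%) / A (20%) example.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return {allele: frequency} from two-letter genotypes such as 'TA'."""
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

# Hypothetical genotypes for ten people at one SNP (20 chromosomes in total).
genotypes = ["TT"] * 6 + ["TA"] * 4
freqs = allele_frequencies(genotypes)

# Minor allele frequency (MAF): the frequency of the rarer allele.
maf = min(freqs.values())
print(freqs, maf)
```

The minor allele frequency computed this way is the quantity that a MAF threshold is applied to when SNPs are filtered before tagSNP selection.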

Page 5

What are haplotypes?

A haplotype is the pattern of nucleotides on a single chromosome

Two “copies” of each chromosome

The haplotype inference problem

Observed genotypes at six SNPs: TA TT CG GG TA AA

Unphased haplotypes: ? T ? G ? A and ? T ? G ? A

One possible resolution: A T G G A A and T T C G T A
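To see why haplotype inference is a problem, it can help to enumerate every haplotype pair that is consistent with the unphased genotypes on this slide. The following is an illustrative brute-force sketch, not a method from the lecture; real phasing software uses statistical models rather than enumeration, and the function name is hypothetical.

```python
from itertools import product

def compatible_haplotype_pairs(genotypes):
    """List the distinct unordered haplotype pairs consistent with unphased genotypes."""
    # At each site, either allele order may be assigned to (chromosome 1, chromosome 2).
    # Homozygous sites contribute a single choice because the set collapses duplicates.
    choices = [{(g[0], g[1]), (g[1], g[0])} for g in genotypes]
    pairs = set()
    for assignment in product(*choices):
        hap1 = "".join(a for a, _ in assignment)
        hap2 = "".join(b for _, b in assignment)
        # Sort so that (x, y) and (y, x) are counted once.
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)

genotypes = ["TA", "TT", "CG", "GG", "TA", "AA"]  # genotypes from the slide
pairs = compatible_haplotype_pairs(genotypes)
for hap1, hap2 in pairs:
    print(hap1, hap2)
```

With three heterozygous sites there are four compatible pairs, and the number doubles with each additional heterozygous site, which is why genotype data alone cannot identify the true haplotypes.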

Page 6

What is linkage disequilibrium?

Linkage disequilibrium (LD) describes the non-random association of nucleotides on the same chromosome in a population

One nucleotide at one position (locus) predicts the occurrence of another nucleotide at another locus

Figure panels: No LD; LD
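LD between two bi-allelic loci is commonly quantified with D and r². The sketch below uses the standard definitions, D = p(AB) − p(A)p(B) and r² = D² / [p(A)(1 − p(A))p(B)(1 − p(B))]; the function name and the example frequencies are hypothetical and only illustrate the "No LD" and "LD" extremes.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, r2) for two bi-allelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2

# No LD: the haplotype frequency equals the product of the allele frequencies.
no_ld = ld_stats(p_ab=0.25, p_a=0.5, p_b=0.5)

# Complete LD: allele A always occurs together with allele B.
full_ld = ld_stats(p_ab=0.5, p_a=0.5, p_b=0.5)
print(no_ld, full_ld)
```

An r² of 1 means that one locus predicts the other perfectly, so genotyping both would be redundant.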

Page 7

What are markers?

Diagram labels: Candidate gene; Disease Susceptibility Locus (DSL); Marker loci (SNPs), in LD with the DSL; Disease Phenotype

Test for genetic association between the phenotype and the DSL

Test for association between phenotype and marker loci

Page 8

What are tagSNPs?

TagSNPs are a subset of all SNPs in a gene that mark groups of SNPs in LD

Avoids redundant genotyping

Diagram labels: Disease Susceptibility Locus; Marker loci (SNPs), in LD with the Disease Susceptibility Locus

Page 9

The joint effect of tagSNPs in cytokine genes and cigarette smoking in cervical cancer risk

Page 10

T-cell proliferation

Diagram labels: Activated T-cell; IL-2 gene; IL-2; IL-2 receptor; IFNγ gene; IFNγ; Proliferation of TH1-cells

Page 11

Background

Cigarette smoking ↑ 1.5- to 3-fold cancer risk

Cigarette smoking ↓ levels of IL-2 and IFNγ (cervical and circulating)

↓ levels of IL-2 and IFNγ:

HPV persistence in the cervix

Cervical neoplasia

Decreased survival from invasive cervical cancer

Page 12

Model

Cigarette smoking

SNPs in IL-2, IL-2R, and IFNG

HPV-associated squamous cell cervical cancer

Page 13

Methods

Study design: Population-based case-only study

Subjects: 308 Caucasian squamous cell cervical cancer cases diagnosed 1986-2004, residing in 3 western Washington counties

Data collection: Structured in-person interviews; DNA isolated from buffy coats
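In a case-only design, the odds ratio relating genotype to exposure among cases is commonly used to estimate multiplicative gene-environment interaction, under the assumption that genotype and exposure are independent in the source population. The slides do not state which analysis was used, so the sketch below is only an illustration of that general approach; the function name and the counts are hypothetical and are not data from this study.

```python
import math

def case_only_or(a, b, c, d):
    """Odds ratio and 95% confidence interval from a 2x2 table of cases.

    a: exposed carriers,   b: exposed non-carriers,
    c: unexposed carriers, d: unexposed non-carriers
    """
    odds_ratio = (a * d) / (b * c)
    # Woolf standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts of smokers/non-smokers by variant carrier status among cases.
odds_ratio, lower, upper = case_only_or(a=40, b=60, c=30, d=90)
print(odds_ratio, lower, upper)
```

A case-only odds ratio above 1 would suggest that the joint effect of the variant and smoking is greater than multiplicative, provided the independence assumption holds.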

Page 14

Multi-stage tagSNP design

1. Select reference panel

2. Re-sequence panel, identify SNPs (many markers, few subjects)

3. Choose tagSNPs

4. Genotype tagSNPs in main study (few markers, many subjects)

Page 15

1. Select reference panel

A sample of your study population: most representative

Samples from the Coriell Repository: ability to integrate your data with other resources

Figure legend: Candidate gene SNPs; HapMap SNPs

Page 16

2. Re-sequence reference panel

Gene

Amplify and Sequence DNA

Phred (Ewing, 1998)

Phrap (Ewing, 1998)

PolyPhred (Nickerson, 1997)

Page 17

Alternatives to re-sequencing

Program for Genomic Applications (PGA):

SeattleSNPs – inflammation

NIEHS SNPs – environmental response

Innate Immunity

International HapMap Project: 5 million SNPs in four ethnically distinct populations

Page 18

3. Choose tagSNPs

Option                      LDSelect (Carlson, 2002)   Tagger (de Bakker, 2005)
r² threshold                Yes                        Yes
SNP exclusions/inclusions   No                         Yes
SNP design score            No                         Yes

Page 19

LDSelect output for IL-2

SeattleSNPs, r² ≥ 0.80, MAF ≥ 0.05, Caucasian

Bin   Total Number of Sites   TagSNPs
1     2                       rs2069763, rs2069772
2     2                       rs2069776, rs2069778
3     2                       rs2069777, rs2069779
4     1                       rs2069762
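The bin structure in this output can be illustrated with a simplified greedy binning procedure driven by an r² threshold. This is a sketch in the spirit of LD-based binning, not LDSelect's actual implementation; the function name, the SNP names, and the r² values are all hypothetical.

```python
def greedy_ld_bins(snps, r2, threshold=0.8):
    """Greedily group SNPs into bins; return a list of (bin_members, tag_snps)."""
    def linked(a, b):
        # r2 is keyed by unordered pair; missing pairs are treated as r^2 = 0.
        return r2.get(frozenset((a, b)), 0.0) >= threshold

    remaining = list(snps)
    bins = []
    while remaining:
        # Pick the SNP that exceeds the threshold with the most unbinned SNPs.
        best = max(remaining,
                   key=lambda s: sum(linked(s, t) for t in remaining if t != s))
        members = [best] + [t for t in remaining if t != best and linked(best, t)]
        # A member can serve as the tagSNP if it is linked to every other member.
        tags = [s for s in members if all(linked(s, t) for t in members if t != s)]
        bins.append((members, tags))
        remaining = [s for s in remaining if s not in members]
    return bins

# Hypothetical pairwise r^2 values for five made-up SNPs.
snps = ["snp1", "snp2", "snp3", "snp4", "snp5"]
r2 = {
    frozenset(("snp1", "snp2")): 0.95,
    frozenset(("snp1", "snp3")): 0.85,
    frozenset(("snp2", "snp3")): 0.60,
    frozenset(("snp4", "snp5")): 0.90,
}
bins = greedy_ld_bins(snps, r2)
for number, (members, tags) in enumerate(bins, start=1):
    print(number, len(members), tags)
```

Each bin needs only one of its listed tagSNPs to be genotyped, which is why a bin with two sites and two candidate tagSNPs still costs a single assay.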

Page 20

Genomic context

Exons (cSNPs): SIFT (Ng, 2002), PolyPhen (Ramensky, 2002)

Upstream flanking region

Intron-exon junctions

Page 21

Sequence conservation

UCSC Genome Browser, PhastCons (Siepel, 2005)

Figure: conservation score across a repeat region and a unique region

Page 22

Page 23

TagSNP summary

Efficient yet comprehensive coverage of the genetic variation in our candidate genes

Reduce costs

Preference should be given to putatively functional variants: literature, genomic context, sequence conservation
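When a bin offers more than one candidate tagSNP, the preference for putatively functional variants can be expressed as a simple score. The sketch below is only one possible way to do this; the function name, the annotation fields, and the weights are hypothetical assumptions, not part of the lecture or of any tagging tool.

```python
def pick_tag(candidates, annotations):
    """Choose one tagSNP from a bin, preferring higher annotation scores."""
    def score(snp):
        info = annotations.get(snp, {})
        # Hypothetical weights for the three criteria named in the summary.
        return (2 * info.get("literature", 0)
                + 2 * info.get("genomic_context", 0)
                + info.get("conserved", 0))
    # max() keeps the first candidate when scores are tied.
    return max(candidates, key=score)

# Hypothetical annotations for two made-up candidate tagSNPs in one bin.
annotations = {
    "snpA": {"conserved": 1},
    "snpB": {"genomic_context": 1, "conserved": 1},
}
chosen = pick_tag(["snpA", "snpB"], annotations)
print(chosen)
```

Because every candidate in a bin captures the same variation, choosing among them on functional grounds does not reduce coverage.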

Page 24

Thanks for your attention!

Questions?