Resources at HapMap.Org HapMap3 Tutorial Marcela K. Tello-Ruiz Cold Spring Harbor Laboratory.

Page 1:

Resources at HapMap.Org

HapMap3 Tutorial

Marcela K. Tello-Ruiz

Cold Spring Harbor Laboratory

Page 2:

Basic Concepts

[Figure: haplotypes of Parent 1 and Parent 2 at two SNPs (alleles A/a and B/b) and the haplotypes they can pass on.]

High LD -> No Recombination (r2 = 1): SNP1 “tags” SNP2. Only the A B and a b haplotypes are seen.

Low LD -> Recombination: many possibilities (A B, a b, A b, a B, etc…)

Page 3:

Basic Concepts

                     SNP1       SNP2
alleles:             A/a        B/b
POP allele freqs:    A (80%)    B (60%)
                     a (20%)    b (40%)

genotypes:           Person 1   Person 2   Person 3
SNP1                 AA         AA         Aa
SNP2                 BB         Bb         Bb

phased haplotypes (C1/C2):
C1                   A B        A B        A B
C2                   A B        A b        a b

OR, for Person 3:    C1 = A b, C2 = a B
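Person 3 is heterozygous at both SNPs, which is why two phasings are listed. As a minimal sketch (the function name `possible_phasings` is made up for illustration and is not part of any HapMap tool), the following Python enumerates every haplotype pair consistent with a set of unphased genotypes:

```python
from itertools import product


def possible_phasings(genotypes):
    """Return every distinct haplotype pair consistent with unphased genotypes.

    genotypes: list of 2-character strings, one per SNP, e.g. ["Aa", "Bb"].
    """
    phasings = set()
    # For each SNP, either allele may sit on chromosome C1; the other goes to C2.
    for choice in product((0, 1), repeat=len(genotypes)):
        c1 = "".join(g[i] for g, i in zip(genotypes, choice))
        c2 = "".join(g[1 - i] for g, i in zip(genotypes, choice))
        # A haplotype pair is unordered, so sort it to merge C1/C2 swaps.
        phasings.add(tuple(sorted((c1, c2))))
    return sorted(phasings)


people = {
    "Person 1": ["AA", "BB"],
    "Person 2": ["AA", "Bb"],
    "Person 3": ["Aa", "Bb"],
}
for name, genotypes in people.items():
    print(name, possible_phasings(genotypes))
```

Only the double heterozygote (Person 3) yields more than one pair, so it is the only genotype here whose phase has to be estimated.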

Page 4:

HapMap Glossary

• LD (linkage disequilibrium): For a pair of SNP alleles, a measure of deviation from random association; little or no recombination between the SNPs keeps LD high. Measured by D’, r2, LOD

• Phased haplotypes: Estimated distribution of SNP alleles on each chromosome. Alleles transmitted from Mom lie on the same chromosome (the maternal haplotype), while Dad’s form the paternal haplotype.

• Tag SNPs: Minimum SNP set to identify a haplotype. r2 = 1 indicates two SNPs are redundant, so each one perfectly “tags” the other.

• Questions? [email protected]
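To make the LD measures in the glossary concrete, here is a small sketch that computes D, D’ and r2 from counts of the four two-SNP haplotypes, using the standard textbook definitions. The function name `ld_stats` and the example counts are illustrative assumptions; the LOD score is not computed here.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r2) from counts of the four two-SNP haplotypes."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n  # frequency of allele A at SNP1
    p_B = (n_AB + n_aB) / n  # frequency of allele B at SNP2

    # D: deviation of the observed haplotype frequency from random association.
    d = p_AB - p_A * p_B

    # D' rescales |D| by the largest value it could take given the allele freqs.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r2: squared correlation between the alleles at the two SNPs.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Only A B and a b haplotypes present: the two SNPs tag each other.
print(ld_stats(6, 0, 0, 4))
# All four haplotypes at the frequencies expected by chance: no LD.
print(ld_stats(3, 3, 2, 2))
```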

Page 5:

HapMap Project

                        Phase 1                            Phase 2                   Phase 3
Samples & POP panels    269 samples (4 panels)             270 samples (4 panels)    1,115 samples (11 panels)
Genotyping centers      HapMap International Consortium    Perlegen                  Broad & Sanger
Unique QC+ SNPs         1.1 M                              3.8 M (phase I+II)        1.6 M (Affy 6.0 & Illumina 1M)
Reference               Nature (2005) 437:p1299            Nature (2007) 449:p851    Draft Rel. 1 (May 2008)

Page 6:

Release Notes

• Phase 1+2: Latest Release #24, October 2008 (NCBI build 36): 3.9 M unique QC+ SNPs -> 1 SNP/700 bp
http://ftp.hapmap.org/00README.releasenotes_rel24
– Added back chrX SNPs dropped in previous releases
– Corrected allele flips from rel#23a

• Phase 3: Draft release #1 (NCBI build 36)
http://ftp.hapmap.org/genotypes/2008-07_phaseIII/00README.txt
– HapMap3 sites @ Broad Institute, Sanger Center and Baylor College

Page 7:

Phase 3 Samples

label   population sample                                                                      # samples   QC+ Draft 1
ASW*    African ancestry in Southwest USA                                                      90          71
CEU*    Utah residents with Northern and Western European ancestry from the CEPH collection    180         162
CHB     Han Chinese in Beijing, China                                                          90          82
CHD     Chinese in Metropolitan Denver, Colorado                                               100         70
GIH     Gujarati Indians in Houston, Texas                                                     100         83
JPT     Japanese in Tokyo, Japan                                                               91          82
LWK     Luhya in Webuye, Kenya                                                                 100         83
MEX*    Mexican ancestry in Los Angeles, California                                            90          71
MKK*    Maasai in Kinyawa, Kenya                                                               180         171
TSI     Toscans in Italy                                                                       100         77
YRI*    Yoruba in Ibadan, Nigeria                                                              180         163
Total                                                                                          1,301       1,115

* Population is made of family trios

Page 8:

Phase 3

• 11 panels & 1,115 samples
– 558/557 males/females
– 924/191 founders/non-founders

• Platforms:
– Illumina Human 1M (Sanger)
– Affymetrix SNP 6.0 (Broad)

• EXCLUDED from QC+ data set:
– Samples with low completeness, and SNPs with low call rate in each pop (< 80%) and not in HWE (p < 0.001)
– Overall false positive rate: ~3.2%

• Data merged with PLINK (concordance over 249,889 overlapping SNPs = 0.9931)

• Alleles on the (+/fwd) strand of NCBI b36
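The SNP-level exclusion criteria can be illustrated with a short sketch. It applies the two thresholds quoted on this slide (call rate below 80%, HWE p below 0.001) to genotype counts for one SNP in one population. The chi-square goodness-of-fit test with 1 degree of freedom is an assumption made here for simplicity; the slides do not say which HWE test the project used, and the function names are hypothetical.

```python
import math


def hwe_chi2_pvalue(n_AA, n_Aa, n_aa):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(n_AA, n_Aa, n_aa, n_missing, min_call_rate=0.80, min_hwe_p=0.001):
    """True if the SNP survives both filters in this population."""
    called = n_AA + n_Aa + n_aa
    total = called + n_missing
    if called == 0:
        return False
    if called / total < min_call_rate:
        return False
    return hwe_chi2_pvalue(n_AA, n_Aa, n_aa) >= min_hwe_p


print(passes_qc(25, 50, 25, 0))    # genotype counts match HWE expectations
print(passes_qc(50, 0, 50, 0))     # no heterozygotes at all
print(passes_qc(25, 50, 25, 100))  # half of the samples failed to genotype
```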

Page 9:

Phase 3: Draft Release 1

samples   panel   QC+ SNPs    polymorphic QC+ SNPs
71        ASW     1,632,186   1,536,247
162       CEU     1,634,020   1,403,896
82        CHB     1,637,672   1,311,113
70        CHD     1,619,203   1,270,600
83        GIH     1,631,060   1,391,578
82        JPT     1,637,610   1,272,736
83        LWK     1,631,688   1,507,520
71        MEX     1,614,892   1,430,334
171       MKK     1,621,427   1,525,239
77        TSI     1,629,957   1,393,925
163       YRI     1,634,666   1,484,416

Page 10:

Phase 3 Data

• HapMap format: http://ftp.hapmap.org/genotypes/2008-07_phaseIII/hapmap_format
* Excluded 1,527 SNPs with strandedness issues & 411 indels

• PLINK format: http://ftp.hapmap.org/genotypes/2008-07_phaseIII/plink_format

• HapMap3 sites:
Broad - http://www.broad.mit.edu/~debakker/p3.html
Sanger - http://www.sanger.ac.uk/humgen/hapmap3/
Baylor - http://www.hgsc.bcm.tmc.edu/projects/human/
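For readers new to the PLINK format, here is a rough sketch of reading PLINK's text .map/.ped pair. It assumes the standard layout: four columns in .map (chromosome, SNP ID, genetic distance, base-pair position) and six leading columns in .ped followed by two alleles per marker. The marker and sample IDs are invented, and whether the files in the directory above match this layout exactly should be checked against its README.

```python
import io


def read_map(handle):
    """Return [(chrom, snp_id, position)] from a PLINK .map file."""
    markers = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        chrom, snp_id, _genetic_distance, position = fields[:4]
        markers.append((chrom, snp_id, int(position)))
    return markers


def read_ped(handle, n_markers):
    """Return {individual_id: [genotype, ...]} from a PLINK .ped file."""
    samples = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        # Columns 1-6: family, individual, father, mother, sex, phenotype.
        alleles = fields[6:]
        if len(alleles) != 2 * n_markers:
            raise ValueError("allele count does not match the .map file")
        # Alleles come in pairs, one pair per marker; "0" means missing.
        samples[fields[1]] = [
            alleles[i] + alleles[i + 1] for i in range(0, len(alleles), 2)
        ]
    return samples


# Invented two-marker, two-sample example.
MAP_TEXT = "10 rsEX1 0 1000\n10 rsEX2 0 2000\n"
PED_TEXT = "FAM1 IND1 0 0 1 -9 A A G T\nFAM2 IND2 0 0 2 -9 A G 0 0\n"

markers = read_map(io.StringIO(MAP_TEXT))
samples = read_ped(io.StringIO(PED_TEXT), len(markers))
print(markers)
print(samples)
```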

Page 11:

Goals of This Tutorial

This tutorial will show you how to:

• Find HapMap3 SNPs near a gene or region of interest (ROI)
– Visualize allele frequencies in HapMap3 populations
– Download SNP genotypes in ROI for use in Haploview 4.1
– Identify GWA hits in the vicinity of ROI & visualize in the context of all chromosomes (karyogram)
– Add custom data onto the GWAs karyogram
– Add custom tracks of association data onto ROI
– Create publication-quality images

• Download the entire HapMap3 data set in bulk
– Distinguish genotype data in PLINK and HapMap formats

• Visualize LD patterns, find tag SNPs, impute genotypes using release #24 (phase 1+2)

• Generate customized extracts of the entire dataset using HapMart

Page 12:

1: Surf to the HapMap Browser

1a. Go to www.hapmap.org

1b. Select “HapMap phase 3”

Page 13:

2: Search for TCF7L2

2. Type search term – “TCF7L2”

Search for a gene name, a chromosome band, or a phrase like “insulin receptor”

Page 14:

3: Examine Region

Region view puts your ROI in genomic context

Chromosome-wide summary data is shown in overview

Default tracks show HapMap genotyped SNPs, refGenes with exon/intron splicing patterns, etc.

3: This exonic region has many typed SNPs. Click on ruler to re-center image.

Page 15:

3: Examine Region (cont)

As you zoom in further, the display changes to include more detail

Use the Scroll/Zoom buttons and menu to change position & magnification

3: Mouse over a SNP to see allele frequency table

Click to go to SNP details page

Page 16:

4: Generate Text Reports

4: Select the desired “Download” option and press “Go” or “Configure”

Available phase 3 downloads:
- Individual genotypes
- Population allele & genotype frequencies

Page 17:

4: Generate Reports (cont)

The Genotype download format can be saved to disk or loaded directly into Haploview v4.1

Page 18:

5: Find GWA hits

5a: Scroll down to turn on GWA studies tracks in overview & region panels

5b: Find GWA hits in nearby region. Click on a GWA hit to re-center

Page 19:

5: Find GWA hits (cont)

5c: Mouse over & click on GWA hit for more info

Page 20:

6: Examine GWA hits in entire genome

6: From www.hapmap.org, select “Karyogram”

Page 21:

6: Custom GWA hits in karyogram

6: Follow these instructions to upload your own GWA data

Detailed help on the format is under the “Help” link

Page 22:

7: Create your own tracks

7: Upload example file: TCF7L2_annotations.txt

Example:

• Interested in T2DM genetics

• Create file with custom annotations from http://www.broad.mit.edu/diabetes and superimpose on the HapMap

Detailed help on the format is under the “Help” link

Page 23:

7: Create your own tracks (cont)

Save as a text file!

Some SNPs were typed (known platform) and others were imputed. Format data for both typed & imputed SNPs.

Scores allow you to display data in quantitative form, such as XY plots

Page 24:

7: Create your own tracks (cont)

Remember to point your browser to the location of your annotations (TCF7L2 gene in this case).

Page 25:

7: Create your own tracks (cont)

Make edits on your own browser window by clicking on “Edit File…”

Page 26:

7: Create your own tracks (cont)

Page 27:

8: Create Image for Publication

8a. Click on “High-res Image”

Click on the +/- sign to hide/show a section

Mouse over a track until a cross appears.

Click on track name to drag track up or down.

Page 28:

8: Image for Publication (cont)

8b. Click on “View SVG Image in new browser window”

8c. Save generated file with “.svg” extension

You can view the file in Firefox, but use other programs (Adobe Illustrator or Inkscape) to edit it and/or convert it to other formats

Page 29:

8: Image for Publication (cont)

Inkscape is free and lets you edit and convert to other formats (many journals prefer EPS)

Page 30:

9. Bulk downloads

9. From www.hapmap.org, click on “Bulk Data Download”

Or directly click on “Data”

Page 31:

9. Bulk downloads

Download the entire HapMap3 data set to your own computer

9a. Select “Genotypes”

HapMap3 genotypes & frequencies

Also available at http://ftp.hapmap.org

Analytic results (LD & phased haplotype data available for HapMap3)

HapMap Samples

Protocols & assay design

Your own copy of the HapMap Browser

Page 32:

9. Bulk downloads (cont)

9b. Click on hapmap_format/forward to download genotypes

Also at http://ftp.hapmap.org/genotypes/latest_phaseIII_ncbi_b36/
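Once a HapMap-format genotype file is on disk, per-SNP allele frequencies can be tallied with a few lines of Python. This sketch assumes the layout commonly used by HapMap genotype dumps: a whitespace-delimited header, 11 leading annotation columns (rs#, alleles, chrom, pos, strand, and so on), then one two-letter genotype per sample with "NN" for missing calls. Check the README in the download directory before relying on that. The SNP ID, sample IDs and annotation values in the example are invented.

```python
import io
from collections import Counter

N_META = 11  # assumed number of annotation columns before the genotypes


def allele_freqs(handle):
    """Yield (rs_id, {allele: frequency}) for each SNP line."""
    header = handle.readline().split()
    n_samples = len(header) - N_META
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        genotypes = fields[N_META:]
        if len(genotypes) != n_samples:
            raise ValueError("row length does not match header: " + fields[0])
        counts = Counter()
        for genotype in genotypes:
            for allele in genotype:
                if allele != "N":  # skip missing calls
                    counts[allele] += 1
        total = sum(counts.values())
        freqs = {a: c / total for a, c in counts.items()} if total else {}
        yield fields[0], freqs


# Invented one-SNP, four-sample example.
SAMPLE = (
    "rs# alleles chrom pos strand assembly# center protLSID assayLSID "
    "panelLSID QCcode S1 S2 S3 S4\n"
    "rsEXAMPLE1 A/G chr1 1000 + ncbi_b36 c p a n QC+ AA AG GG NN\n"
)
result = list(allele_freqs(io.StringIO(SAMPLE)))
print(result)
```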

Page 33:

10: Surf to the HapMap phase 1+2 genome browser

10. Go to www.hapmap.org & select “HapMap Genome Browser B36”

Page 34:

11: Search for TCF7L2

11. Type search term – “TCF7L2”

Page 35:

12: Examine Region

12. Re-center & zoom in

Page 36:

12: Turn on LD & Haplotype Tracks

12a: Scroll down to the “Tracks” section. Turn on the LD Plot and Haplotype Display tracks.

12b: Press “Update Image”

These sections allow you to adjust the display and to superimpose your own data on the HapMap

Page 37:

13: View variation patterns

Triangle plot shows LD values using r2 or D’/LOD scores in one or more HapMap populations

Phased haplotype track shows all 120 chromosomes with alleles colored yellow and blue
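The values behind a triangle plot are pairwise LD statistics over the phased chromosomes. As an illustrative sketch only (not the browser's own code), the function below computes r2 for every pair of SNP columns in a list of phased haplotype strings. It assumes each SNP has at most two alleles; the name `pairwise_r2` and the toy haplotypes are made up.

```python
from itertools import combinations


def pairwise_r2(haplotypes):
    """Return {(i, j): r2} for every pair of SNP columns with i < j.

    haplotypes: list of equal-length strings, one per phased chromosome.
    Pairs involving a monomorphic SNP are skipped (r2 is undefined there).
    """
    n = len(haplotypes)
    n_snps = len(haplotypes[0])
    # Encode each SNP as 1 where a chromosome carries the allele seen on the
    # first chromosome, 0 otherwise.
    columns = [
        [int(h[k] == haplotypes[0][k]) for h in haplotypes] for k in range(n_snps)
    ]
    result = {}
    for i, j in combinations(range(n_snps), 2):
        p_i = sum(columns[i]) / n
        p_j = sum(columns[j]) / n
        denom = p_i * (1 - p_i) * p_j * (1 - p_j)
        if denom == 0:
            continue
        p_ij = sum(a * b for a, b in zip(columns[i], columns[j])) / n
        result[(i, j)] = (p_ij - p_i * p_j) ** 2 / denom
    return result


# Four toy chromosomes over three SNPs: SNP 0 and SNP 1 always travel
# together, SNP 2 is independent of both.
r2_values = pairwise_r2(["ABC", "ABc", "abC", "abc"])
for pair, value in sorted(r2_values.items()):
    print(pair, round(value, 3))
```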

Page 38:

14: Adjust Track Settings (on the spot)

14a. Click on question mark preceding track name

14b. Adjust population and display settings & press “Configure”

Page 39:

14: Adjust Track Settings (cont)

Select the analysis track to adjust and press “Configure”

Page 40:

15: Turn on Tag SNP Track

15: Activate the “tag SNP Picker” and press “Update Image”

Page 41:

16: Adjust tag SNP picker

Tag SNPs are selected on the fly as you navigate around the genome

16a: Click on question mark behind “tag SNP Picker”

Alternatively, you may select “Annotate tag SNP Picker” and press “Configure…”

Page 42:

16: Adjust tag SNP picker (cont)

Select population

Select tagging algorithm and parameters

[optional] upload list of SNPs to be included, excluded, or design scores

16b: Press “Configure” to save changes
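To give a feel for what a tagging algorithm does with an r2 threshold, here is a simple greedy pairwise sketch. It is not the algorithm the tag SNP Picker runs; the function name, the 0.8 default threshold, the SNP IDs and the r2 values are all illustrative assumptions.

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Pick tag SNPs greedily until every SNP is captured.

    snps: list of SNP IDs.
    r2: dict mapping frozenset({snp_a, snp_b}) to their pairwise r2.
    A SNP is captured by a tag if it is the tag or has r2 >= threshold with it.
    """

    def captures(tag):
        return {
            s for s in snps
            if s == tag or r2.get(frozenset((tag, s)), 0.0) >= threshold
        }

    untagged = set(snps)
    tags = []
    while untagged:
        # Choose the SNP that captures the most still-untagged SNPs;
        # sorting makes tie-breaking deterministic.
        best = max(sorted(snps), key=lambda s: len(captures(s) & untagged))
        tags.append(best)
        untagged -= captures(best)
    return tags


# Invented example: rs1, rs2 and rs3 are in high LD; rs4 stands alone.
example_r2 = {
    frozenset(("rs1", "rs2")): 1.0,
    frozenset(("rs2", "rs3")): 0.9,
    frozenset(("rs3", "rs4")): 0.2,
}
tags = greedy_tag_snps(["rs1", "rs2", "rs3", "rs4"], example_r2)
print(tags)
```

Raising the threshold generally yields more tags, since fewer SNPs count as captured by a neighbor.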

Page 43:

17: Impute genotypes using HapMap Data

• Interested in the VAV1 gene

• Commercially available platforms with few overlapping SNPs in this region

• HapMap genotyped lots of SNPs in region

Use genotypes for HapMap SNPs to impute genotypes & compare non-overlapping SNP sets!

Page 44:

17: Impute genotypes using MACH1

17a. Go to chr19:6,765,000..6,900,000

17b. Select “Download Impute Data”, click “Configure”
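If you script against region strings like the one in step 17a, a small parser helps. The sketch below handles the `chrN:start..end` notation with optional thousands separators; the function name `parse_region` and the accepted pattern are assumptions of this example rather than a browser specification.

```python
import re

# Chromosome name, then start and end coordinates separated by "..".
REGION_RE = re.compile(r"^(chr\w+):([\d,]+)\.\.([\d,]+)$")


def parse_region(text):
    """Return (chromosome, start, end) from a string like 'chr19:1,000..2,000'."""
    match = REGION_RE.match(text.strip())
    if not match:
        raise ValueError("unrecognized region: " + text)
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start is after end: " + text)
    return chrom, start, end


region = parse_region("chr19:6,765,000..6,900,000")
print(region)
```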

Page 45:

17: Configure MACH1

17c. Upload input files: example.dat & example.ped. Enter e-mail address. Click “Go”

Page 46:

17: Impute genotypes: Input files

• example.dat (20 user-provided SNPs; all should be part of the HapMap):

M rs4807101
M rs164022
M rs625828
M rs461970
M rs331684
…

• example.ped (genotypes for 336 unrelated inds):

PED00001 IND00001 0 0 2 C/C C/C T/T C/T C/C G/G G/G …
PED00002 IND00002 0 0 1 C/T C/C T/T T/T C/C A/A A/G …
PED00003 IND00003 0 0 2 T/T G/G A/A C/T C/C A/G A/G …
…
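Before uploading, it can be worth checking that the two input files agree with each other. The sketch below reads a .dat file of `M <rsid>` lines and a .ped file laid out as in the example (family, individual, father, mother, sex, then one `X/Y` genotype per marker) and verifies that each row has one genotype per marker. The function names are made up, and the inline data is truncated to the five markers shown on the slide purely for illustration.

```python
import io


def read_dat(handle):
    """Return the marker names listed on 'M <rsid>' lines."""
    markers = []
    for line in handle:
        fields = line.split()
        if len(fields) == 2 and fields[0] == "M":
            markers.append(fields[1])
    return markers


def read_ped(handle, markers):
    """Return one dict per individual, with genotypes keyed by marker name."""
    people = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        family, individual, _father, _mother, sex = fields[:5]
        genotypes = fields[5:]
        if len(genotypes) != len(markers):
            raise ValueError("genotype count mismatch for " + individual)
        people.append({
            "family": family,
            "individual": individual,
            "sex": sex,
            "genotypes": {
                m: tuple(g.split("/")) for m, g in zip(markers, genotypes)
            },
        })
    return people


# First five markers and genotype columns from the example, truncated.
DAT_TEXT = "M rs4807101\nM rs164022\nM rs625828\nM rs461970\nM rs331684\n"
PED_TEXT = (
    "PED00001 IND00001 0 0 2 C/C C/C T/T C/T C/C\n"
    "PED00002 IND00002 0 0 1 C/T C/C T/T T/T C/C\n"
)

markers = read_dat(io.StringIO(DAT_TEXT))
people = read_ped(io.StringIO(PED_TEXT), markers)
print(people[0]["genotypes"]["rs461970"])
```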

Page 47:

17. Visualize imputed SNPs

17d. Return to browser

Your imputation results appear as an external track that can be edited. Hint: Click on “Help” link below for display options

17e. Click “Edit File”

Page 48:

17. Edit external annotations file

17f. Edit annotations file & “Submit Changes”

Page 49:

17. Edit external annotations file

Page 50:

17: Impute genotypes: Results

• Info (143 provided & imputed HapMap SNPs)

SNP          Al1   Al2   Freq1    MAF      Quality   Rsq
rs10419572   T     A     0.9041   0.0959   0.8179    0.1069
rs415218     T     A     0.9709   0.0291   0.9427    0.0313
rs4807100    A     G     0.4713   0.4713   0.9790    0.9625
rs4807101    T     C     0.4714   0.4714   0.9803    0.9649
rs1651876    T     C     0.9631   0.0369   0.9277    0.0216
…

• Geno (143 SNPs x 336 inds)

PED00001->IND00001 ML_GENO T/T T/T G/G C/C T/T T/T A/T G/G A/A T/T T/C …
PED00002->IND00002 ML_GENO T/T T/T A/G T/C T/T T/T A/T G/G A/A T/T T/C …
PED00003->IND00003 ML_GENO T/T T/T A/A T/T T/T T/T A/T G/G A/A T/T T/T …
…

• Dose (allele dosage)

PED00001->IND00001 ML_DOSE 1.719 1.911 0.004 0.003 1.913 1.980 1.246 1.884 1.949 1.948 1.302 …
PED00002->IND00002 ML_DOSE 1.861 1.957 1.000 1.000 1.952 1.892 1.086 1.909 1.949 1.948 1.096 …
PED00003->IND00003 ML_DOSE 1.994 1.999 1.993 1.995 1.955 1.656 1.297 1.863 1.987 1.988 1.374 …
…

Probability of match, imputed:experimental genotype (1.0 for provided markers)

17g. Check your e-mail for text results
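A common first step with the Info table is to keep only SNPs whose Rsq clears some cutoff. The sketch below parses the five rows shown above and applies a cutoff of 0.3. That value is an assumption chosen for illustration, not a recommendation from this tutorial, and the function name is hypothetical.

```python
INFO_TEXT = """SNP Al1 Al2 Freq1 MAF Quality Rsq
rs10419572 T A 0.9041 0.0959 0.8179 0.1069
rs415218 T A 0.9709 0.0291 0.9427 0.0313
rs4807100 A G 0.4713 0.4713 0.9790 0.9625
rs4807101 T C 0.4714 0.4714 0.9803 0.9649
rs1651876 T C 0.9631 0.0369 0.9277 0.0216
"""


def well_imputed(info_text, min_rsq=0.3):
    """Return the SNP IDs whose Rsq is at least min_rsq."""
    lines = info_text.strip().splitlines()
    header = lines[0].split()
    kept = []
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        if float(row["Rsq"]) >= min_rsq:
            kept.append(row["SNP"])
    return kept


kept = well_imputed(INFO_TEXT)
print(kept)
```

In these rows the three SNPs with a MAF below 0.1 have a low Rsq even though their Quality scores are high, so the two columns are worth reading together.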

Page 51:

18. Use HapMart to Generate Extracts of the HapMap Dataset

Find all HapMap characterized SNPs that:

1. Have a MAF > 0.20 in the Yoruban population panel (YRI)

2. Cause a nonsynonymous amino acid change

3. Were typed by Perlegen
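HapMart builds this query through its web interface. For readers who prefer to see the logic spelled out, the three criteria amount to a conjunction of filters like the sketch below. The record fields (`maf`, `consequence`, `center`), the function name and the SNP records are all hypothetical and do not reflect HapMart's actual attribute names or data.

```python
def select_snps(records, panel="YRI", min_maf=0.20,
                consequence="nonsynonymous", center="perlegen"):
    """Return IDs of SNP records that satisfy all three criteria."""
    return [
        r["rsid"]
        for r in records
        if r["maf"].get(panel, 0.0) > min_maf   # 1. MAF > 0.20 in the panel
        and r["consequence"] == consequence      # 2. nonsynonymous change
        and r["center"] == center                # 3. typed by Perlegen
    ]


# Invented records for illustration only.
records = [
    {"rsid": "rsEX1", "maf": {"YRI": 0.35}, "consequence": "nonsynonymous", "center": "perlegen"},
    {"rsid": "rsEX2", "maf": {"YRI": 0.05}, "consequence": "nonsynonymous", "center": "perlegen"},
    {"rsid": "rsEX3", "maf": {"YRI": 0.40}, "consequence": "synonymous", "center": "perlegen"},
    {"rsid": "rsEX4", "maf": {"YRI": 0.40}, "consequence": "nonsynonymous", "center": "broad"},
]
selected = select_snps(records)
print(selected)
```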

Page 52:

Further Information

• HapMap Publications & Guidelines
http://hapmap.cshl.org/publications.html.en

• Past tutorials & user’s guide to HapMap.org
http://www.hapmap.org/tutorials.html.en

[email protected]

Page 53:

HapMap DCC Present Members (CSHL)
Lincoln Stein
Marcela K. Tello-Ruiz
Zhenyuan Lu
Wei Zhao

HapMap DCC Former Members
Lalitha Krishnan
Albert Vernon Smith
Gudmundur Thorisson
Fiona Cunningham