Disruption of Genomics - Mardis.ppt · has DNA sequencing as its core technology ... Microsoft...

40
NGS: The Disruption of Genomics and Medicine Elaine R. Mardis, Ph.D. Co-director, The Genome Institute Robert E. and Louise F. Dunn Distinguished Professor of Medicine 9th Annual Texas Conference on Health Disparities

Transcript of Disruption of Genomics - Mardis.ppt · has DNA sequencing as its core technology ... Microsoft...

NGS: The Disruption of Genomics and Medicine

Elaine R. Mardis, Ph.D.Co-director, The Genome InstituteRobert E. and Louise F. Dunn Distinguished Professor of Medicine

9th Annual Texas Conference on Health Disparities

Genomics: BeginningsSequencing DNA

DNA Sequencing: Basic Principles

• DNA sequencing has a brief history (<40 yrs)

• The scientific discipline known as “Genomics”has DNA sequencing as its core technology

• What used to take years now takes weeks (or less)

• Computational methods and computing technology underlie nearly every aspect of modern genomic approaches

• Genomics is most impactful if practiced in a teamscience environment

Sanger Dideoxynucleotide Sequencing: the beginning

DNA Sequencing c. 1985: radiolabeling

Klenow DNA polymerase

32P-dATP + dC,dG,dTTP

6 M13 templates

Universal primer

One hand-poured urea/polyacrylamide gel

Power supply

Fan

X-ray filmYield: 6 x 500 bp/ ??

DNA Sequencing c. 1987: fluorescent labeling

6 M13 templates

USB T7 Polymerase kit

Custom mixes (no 32P)

Hand-poured urea/PA gel

Yield: 6 x 500 bp/24 hr$100K/instrument

1990-1998: C. elegans Genome Sequence

• Most sequence data for the worm were generated after 1994

• Development of methods, technology, algorithms and infrastructure

• First complete animal genome sequenced (100 Mb)

ABI 3730xl Sequencer

• Each 3730xl sequencer cost $350K

• One 3730xl processed 20x96 well plates/24hr day and provided ~650bp/sample (1.25Mb/day)

• By 2005, we had 135 of these sequencers and produced ~10M sequencing reactions/month (~650 Mb)

The Biological “Moon Shot”: The Human Genome

…and beyond…

February 2001

The Trajectory of Throughput: 10 years

E.R. Mardis, Nature (2011) E.R. Mardis, Ann. Rev. Analyt. Chem. (2013)

Capillary technology

Applied Biosystems 3730xl (2004)

$15,000,000~5 years, International project

Next-gen technology

Illumina HiSeq 2500 (2014)

$7,000~28 hours, one instrument

Whole human genome sequencing: 10 years

The 1000 Genomes project: a catalogue of human polymorphismcreated using next generation sequencing

What has the 1000 Genomes Project produced?

• Essentially all SNPs (MAF >1%) in multiple population groups from worldwide collection of bloods• Many variants within 0.2-1% MAF

• Highly complete catalogue of CNVs• Information required for imputation of lower

frequency alleles into existing GWAS samples• A set of validated computational methods for

use of next generation sequencing in disease samples

• Important reference genomes for clinical sequencing

Next-generation Sequencer basicsHow massively parallel sequencing works

Next-generation DNA sequencing instruments

• All NGS platforms require a library obtained either by amplification or ligation with custom linkers (adapters)

• Each library fragment is amplified on a solid surface (either bead or flat Si-derived surface) with covalently attached adapters that hybridize the library adapters

• Direct step-by-step detection of the nucleotide base incorporated by each amplified library fragment set

• Hundreds of thousands to hundreds of millions of reactions detected per instrument run = “massively parallel sequencing”

• A “digital” read type that enables direct quantitative comparisons

• Shorter read lengths than capillary sequencers

Library Construction for NGS• Shear high molecular weight DNA with sonication

• Enzymatic treatments to blunt ends

• Ligate synthetic DNA adapters

• Produce size fractions

• Quantitate

• Reduced input amount and time to library vs. conventional sequencing methods

• Read lengths much shorter than conventional sequencers

Hybrid Capture

• Hybrid capture - fragments from a whole genome library are selected by combining with probes that correspond to most (not all) human exons or gene targets.

• The probe DNAs are biotinylated, making selection from solution with streptavidin magnetic beads an effective means of purification.

• An “exome” by definition, is the exons of all genes annotated in the reference genome.

• Custom capture reagents can be synthesized to target specific loci that may be of clinical interest.

Multiplex PCR Amplification of Targets

1. Design amplification primer pairs for exons of genes of interest; tile primers to overlap fragments in larger exons

2. Group primer pairs according to G+C content, Tm and reaction condition specifics

3. Amplify genomic DNA to generate multiple products from each primer set; pool products from each set

4. Create library by ligation or tail platform adaptors on the primer ends

5. Sequence

Illumina: Massively Parallel Sequencing by Synthesis

Excitation

EmissionIncorporate

DetectDe-block

Cleave fluor

Platforms: Illumina

• High accuracy, range of capacity and throughput• Longer read lengths on some platforms (MiSeq)• Improved kits, improved software pipeline and capabilities, cloud compute

ION Torrent-pH Sensing of Base Incorporation

Platforms: Ion Torrent

PGMProton

• Two human exomes (Proton 1 chip) or one genome (@20X-Proton 2 chip) per run• Ion One Touch or Ion Chef preparatory modules• 2-4 hour/run• ~200 bp average read length• Proton 1 produces 60-80 Mreads >50 bp

• Three sequencing chips available:

• 314 = up to 100 Mb• 316 = up to 1 Gb• 318 = up to 2 Gb

• 2-7 hour/run• up to 400 bp read length• 400kreads up to 5 Mreads

• Low substitution error rate, in/dels problematic, no paired end reads• Inexpensive and fast turn-around for data production• Improved computational workflows for analysis

Translating the Cancer GenomeTherapeutic Options via NGS and analysis

Cancer is a Disease of the Genome

In the early 1970’s, Janet Rowley’s microscopy studies of leukemia cell chromosomes suggested that specific alterations led to cancer, laying the foundation for cancer genomics.

Cancer genomics

K DFG R Y

Tyrosine kinase

745 Y869

K DFG Y Y Y YTM

718 964

EGF ligand binding autophos

GXGXXG

835

R

776

H

858 947

MLREA

~80% of Iressa responders have EGFR mutations

W. Pao et al., PNAS 2004

Cancer Genomics

R.K.Wilson 2011

AML1: First tumor:normal comparison by NGS

Ley et al., Nature 2008

• Caucasian female, mid-50s at diagnosis

• De novo M1 AML• Family history of

AML and lymphoma• 100% blasts in

initial BM sample• Relapsed and died

at 23 months• Informed consent

for whole genome sequencing

• Solexa sequencer, 32 bp unpaired reads

Funding: A. Siteman

TGI: Cancer Cases by WGS (March 2014)

0

25

50

75

100

125

150

WGS: 2,700 from ~1,100 cancer patients

Exomes: 6,200 from ~3,000 cancer patients

WU-SJ PCGP: 750 patients

Pan-Cancer Analyses from TCGA

Nature Genetics 45: 1113-1120 (2013)

WUMS Genomics Tumor Board

• The Genomics Tumor Board (GTB) serves as a vehicle for education, decision-making, and patient monitoring• Physicians work with junior faculty to develop and present case reports of

each patient’s clinical history• Oversight board of GTB reviews cases and determines 1-2 per month that

are most likely to benefit from genomic diagnosis (difficult diagnoses, late stage metastatic patients)

• Appropriate tumor and normal samples are obtained and studied by genome, exome and transcriptome sequencing and analysis

• Results of genomic diagnosis are communicated to the physician lead, then to GTB participants

• Physician lead presents their decision to treat, outcomes if available, difficulties encountered

• Costs for the GTB work will be absorbed by the Division of Oncology and The Genome Institute

Integrated Discovery: DNA and RNA data

Whole genome sequencing(tumor + normal)

Exome sequencing (tumor + normal)

Whole transcriptome sequencing(tumor)

Linking Somatic Variants to Therapies

Obi Griffith, Ph.D. and Malachi Griffith, Ph.D.2013 Wired UK Genius List

Somatic/Germline Cancer Events (DNA+RNA)

Drug Gene Interaction database

(>50 database sources)

Filtered (activating/drivers)

Candidate genes/pathways

Clinically actionable events

(aka “The Report”)

Functional annotation

DrugBank

TTD

clinicaltrials.gov

PharmGKB

TEND

Literature

dGene

TALC

Clinical prioritization and reporting

Malachi & Obi Griffith

Single Nucleotide Variants

Insertion/deletions

Structural Variants

Copy Number Variations

SV-predicted gene fusions

Differentially Expressed Genes

Differentially Expressed Isoforms

Therapeutic Interpretation of Variants

MyCancerGenome

Expressed variants

**

**

• Highly curated database of mutations having a demonstrated association with cancer

• General information about each somatic variant• Chromosomal Location• Strand• Gene• Protein impact of variant (annotation)• PubMed ID evidence cited, linked

• Easy to access from the web and programmatically through an API

DoCM: A Database of Canonical Cancer Mutations

DGIdb: Drug Gene Interaction database

Griffith, M. et al., Nature Methods 2013dgidb.org

Welch et al., JAMA 2011: 305(15): 1577-1584.

Lukas Wartman, M.D. is Patient “ALL1”

Marston E, et al. Blood 2009 Jan 1;113(1):117-26.

FLT3 Over-expression in ALL1

• FLT3 was within the top 1% of all expressed genes.

• Absent a normal comparator, the literature report from Marston identified FLT3 over-expression in pre-B-ALL

• Based on wt FLT3 over-expression by the tumor cells, we predicted the cancer would be sensitive to the FLT3 inhibitor Sunitinib (Sutent) [DrugBank].

Patient biopsied metastatic

melanoma lesions

Tumor and germline DNA sequenced,

somatic mutations identified; RNA capture verifies

expressed mutations and expression level;

netMHC algorithm identifies

immunoepitopes

Apheresis samples from patient used to

verify the algorithmically-

identified immunoepitopes that elicit T cell memory

Sequencing to identify tumor-specific immunoepitopes:

Mardis, Schreiber et al., Nature 2012

Genome-driven cancer immunotherapy

Dendritic Cell Vaccine Platform

SNV specific T-cell immunityEx-vivo monitoring to evaluate generation of cytotoxic effectors

DC generationEx-vivo differentiation of monocytes into DC

DC maturationEx-vivo instruction to generate appropriate cytotoxic effectors

Infusion of DC vaccine

DC targeting Loading with selected SNV-derived peptides

A dendritic cell-based approach is currently being tested in an FDA approved protocol for metastatic melanoma patients:

• Patient 1 has received all three doses of vaccine, and is being monitored• Patient 2 has received three doses of vaccine, this patient has measurable disease and will be monitored for

progression, stability or regression• Patient 3 has measurable disease, has completed her vaccine infusions early March• Patients 4 and 5 have genomic analysis completed, in vitro assays completed, GMP peptides underway

Name [email protected]

Acknowledgements

WUSM/Siteman Cancer CenterTimothy J. Ley, M.D.Lukas Wartman, M.D.Peter Westervelt, M.D.John DiPersio, M.D.Gerry Linette, M.D.Beatriz Carreno, M.D., Ph.D.

Thanks also to:Aaron QuinlanGabor MarthMichael Zody

Our patients and their families

The Genome InstituteMalachi Griffith, Ph.D.Obi Griffith, Ph.D.Ben AinscoughZach SkidmoreAvinash RamuAllison RegierLee TraniNick SpiesVincent Magrini, Ph.D.Sean McGrathRyan DemeterJasreet Hundal, M.S.Jason WalkerDavid Larson, Ph.D.Lucinda FultonRobert FultonBrad Ozenberger

Richard K. Wilson, Ph.D.