Post on 13-Jan-2016
HST Advisory Council Thursday 16-Nov-2004 2:00 to 2:20 PM
Personal Genomes & Medicine
Thanks to: Broad Inst., DARPA-BioComp, DOE-GTL, EU-MolTools,
NHGRI-CEGS, NHLBI-PGA, NIGMS-CECBSR, PhRMA, Lipper Foundation
Agencourt, Ambergen, Atactic, BeyondGenomics, Caliper, Genomatica, Genovoxx, Helicos, MJR, NEN, Nimblegen, ThermoFinnigan, Xeotron/Invitrogen
For more info see: arep.med.harvard.edu
Why sequence?
• Cancer: mutation sets for individual clones, loss-of-heterozygosity
• Pathogen "weather map", biowarfare sensors
• RNA splicing & chromatin modification patterns
• Synthetic biology & lab selections
• Antibodies or "aptamers" for any protein
• B & T-cell receptor diversity: temporal profiling, clinical
• Preventative medicine & genotype–phenotype associations
• Cell-lineage during development
• Phylogenetic footprinting, biodiversity
Shendure et al. 2004 Nature Rev Gen 5, 335.
The idea of Common SNPs for Common Diseases has been hugely oversold.
Do association studies need the added baggage of "linkage" assumptions?
Should we determine genotype (haplotype) directly (at low cost) rather than infer it from population trends?
Rare Alleles / Common Diseases
Even "dispensable" regions of the genome can harbor neomorphic alleles. Each of us has about 10^4 mutations since the last major population bottleneck.
"-463GA, has been associated with incidence or severity of inflammatory diseases, including atherosclerosis and Alzheimer's disease, and some cancers. The polymorphism is within an Alu element" Kumar AP, et al. (2004) J Biol Chem. 279:8300-15.
Variable breakpoints in Burkitt lymphoma cells with chromosomal t(8;14) translocation separate c-myc and the IgH locus up to several hundred kb. Joos S, et al. (1992) Hum Mol Genet. 1:625-32.
Multiple rare alleles contribute to low plasma levels of HDL cholesterol. Cohen JC et al. (2004) Science. 305:869-72.
Personal genomics & cancer therapy
Mutations G719S, L858R, Del746ELREA in red.
EGFR Mutations in lung cancer: correlation with clinical response to gefitinib [Iressa] therapy. Paez, … Meyerson (2004) Science 304: 1497
Dulbecco R. (1986) A turning point in cancer research: sequencing the human genome. Science 231:1055-6.
Why 'single molecule' sequencing?
(1) Single-cells: Preimplantation (PGD), uncultivatable
(2) Co-occurrence on a molecule, complex, or cell: RNA splice-forms & DNA haplotypes
(3) Cost: $1K-100K "personal genomes" http://grants.nih.gov/grants/guide/rfa-files/RFA-HG-04-003.html
(4) Precision: Counting 10^9 RNA tags (to reduce variance) (~5e5 RNAs per human cell)
Tags at fixed costs: EST 5e3; SAGE 5e4; MPSS 5e6; Polony-FISSeq (polymerase colony) 5e9 (goal)
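A rough sketch of why counting more tags reduces variance, assuming tags are sampled uniformly from the RNA pool and that counts are Poisson distributed (both are assumptions made here for illustration, not statements from the slide). The tag totals and the ~5e5 RNAs per cell figure are the ones listed above; the function names are hypothetical.

```python
import math

RNAS_PER_CELL = 5e5  # from the slide: ~5e5 RNAs per human cell

# Tags counted per experiment at fixed cost, as listed on the slide.
TAGS_PER_METHOD = {
    "EST": 5e3,
    "SAGE": 5e4,
    "MPSS": 5e6,
    "Polony-FISSeq (goal)": 5e9,
}


def expected_count(total_tags, copies_per_cell=1.0):
    """Expected tags for a transcript at copies_per_cell (assumes uniform sampling)."""
    return total_tags * copies_per_cell / RNAS_PER_CELL


def poisson_cv(mean_count):
    """Coefficient of variation of a Poisson count: 1 / sqrt(mean)."""
    return 1.0 / math.sqrt(mean_count)


for method, tags in TAGS_PER_METHOD.items():
    mean = expected_count(tags)
    print(f"{method:22s} expected tags = {mean:10.2f}  CV = {poisson_cv(mean):.3f}")
```

Under these assumptions a single-copy transcript is expected fewer than once in an EST or SAGE run, while the 5e9-tag goal gives roughly 1e4 tags for it, so the relative counting error falls to about 1%.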
CD44 Exon Combinatorics (Zhu & Shendure)
• Alternatively Spliced Cell Adhesion Molecule
• Specific variable exons are up-or-down-regulated in various cancers (>2000 papers)
• v6 & v7 enable direct binding to chondroitin sulfate, heparin…
Zhu J, Shendure J, Mitra RD, Church GM (2003) Single molecule profiling of alternative pre-mRNA splicing. Science 301:836-8.
EXON PATTERN          Eph4  Eph4bDD  TOTAL   RATIO  P-VALUE
------------7-8-9-10   609      764   1373    1.17  1E-4
--------------8-9-10   320      390    710    1.13  3E-2
----------6-7-8-9-10   431      251    682   -1.85  4E-18
------4-5-6-7-8-9-10   218      216    434   -1.08  2E-1
----------------9-10    68      143    211    1.96  7E-7
--------5-6-7-8-9-10    86       39    125   -2.37  2E-6
----3-4-5-6-7-8-9-10    40       56     96    1.30  9E-2
------4-5---7-8-9-10    16       74     90    4.30  2E-9
--2-3-4-5-6-7-8-9-10    44       28     72   -1.69  1E-2
1-2-3-4-5-6-7-8-9-10    22        5     27   -4.73  3E-4
--------5---7-8-9-10     5       19     24    3.53  3E-3
----3-4-5---7-8-9-10     1       15     16   13.95  4E-4
--2-3-4-5---7-8-9-10     1       10     11    9.30  5E-3
Eph4 = murine mammary epithelial cell line
Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic)
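The isoform table can be read programmatically with a short sketch like the one below. The pattern parser follows the dash-delimited exon notation used in the table. The fold-change function is an assumption: it scales each count by the column total of the rows listed here and reports decreases as negative reciprocals, which roughly matches the tabulated ratios but is not confirmed as the authors' method. All function names are hypothetical.

```python
# (exon pattern, Eph4 count, Eph4bDD count), copied from the table above.
ROWS = [
    ("------------7-8-9-10", 609, 764),
    ("--------------8-9-10", 320, 390),
    ("----------6-7-8-9-10", 431, 251),
    ("------4-5-6-7-8-9-10", 218, 216),
    ("----------------9-10", 68, 143),
    ("--------5-6-7-8-9-10", 86, 39),
    ("----3-4-5-6-7-8-9-10", 40, 56),
    ("------4-5---7-8-9-10", 16, 74),
    ("--2-3-4-5-6-7-8-9-10", 44, 28),
    ("1-2-3-4-5-6-7-8-9-10", 22, 5),
    ("--------5---7-8-9-10", 5, 19),
    ("----3-4-5---7-8-9-10", 1, 15),
    ("--2-3-4-5---7-8-9-10", 1, 10),
]


def parse_exons(pattern):
    """Return the variable exons present, e.g. '------4-5---7-8-9-10' -> [4, 5, 7, 8, 9, 10]."""
    return [int(tok) for tok in pattern.split("-") if tok]


def signed_fold(count_a, count_b, total_a, total_b):
    """Fold change of b vs. a after scaling by library totals (assumed normalization).

    Increases are returned as ratio >= 1, decreases as the negative reciprocal.
    """
    ratio = (count_b / total_b) / (count_a / total_a)
    return ratio if ratio >= 1 else -1.0 / ratio


total_eph4 = sum(a for _, a, _ in ROWS)
total_bdd = sum(b for _, _, b in ROWS)

for pattern, a, b in ROWS:
    fold = signed_fold(a, b, total_eph4, total_bdd)
    print(f"exons {parse_exons(pattern)}: fold = {fold:+.2f}")
```

Because the totals here come only from the listed rows, the computed values agree with the table to within a few percent rather than exactly.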
CD44 RNA isoforms
Multi-locus haplotyping on pooled samples
Kun Zhang
Throughput = (# loci × # samples) / time
Multi-locus haplotyping
NOS3
C/T G/A G/T G/A T/A C/T C/T
~24-Kb
Chr 7
Chromosome-wide haplotyping
IL6-3572 : A/C
~60-Mb
CD36-4366 : T/A
Human Chr. 7
A..T
A..A
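As an illustration of determining haplotypes directly rather than inferring them, the sketch below tallies two-locus haplotypes such as A..T and A..A from per-molecule allele calls. The calls are made-up example data, and the function name is hypothetical; the only things taken from the slide are the two loci and the "A..T" notation.

```python
from collections import Counter

# Hypothetical per-molecule calls: (allele at IL6-3572, allele at CD36-4366).
# These observations are invented for illustration only.
example_calls = [
    ("A", "T"), ("A", "A"), ("A", "T"),
    ("C", "T"), ("A", "A"), ("A", "T"),
]


def haplotype_frequencies(calls):
    """Count each observed allele pair and return its fraction of all molecules."""
    counts = Counter(f"{first}..{second}" for first, second in calls)
    n_molecules = sum(counts.values())
    return {hap: count / n_molecules for hap, count in counts.items()}


for hap, freq in sorted(haplotype_frequencies(example_calls).items()):
    print(f"{hap}: {freq:.2f}")
```

Each molecule contributes one linked pair of alleles, so phase is observed directly and no population-level linkage assumption is needed.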
Convergence on non-electrophoretic tag-sequencing methods?
Tag length (bp): EST >400; SAGE 14-26; MPSS 20; 454 100; Polony-Seq 26 (2-ends)
Ronaghi
• Single-molecule vs. amplified single molecule
• Array vs. bead packing vs. random
• Rapid scans vs. long scans (chemically limited, 454)
• Number of immobilized primers:
  0: Chetverin'97 "Molecular Colonies"
  1: Mitra'99 > Agencourt "Bead Polonies"
  2: Kawashima'88, Adams'97 > Lynx/Solexa: "Clusters"
http://arep.med.harvard.edu/Polonator/Plone.htm
Bead Polony Sequencing Pipeline
In vitro libraries via paired tag manipulation
Bead polonies via emulsion PCR
[Dre03]
Monolayered immobilization in acrylamide
Enrichment of amplified beads
SOFTWARE
Images → Tag Sequences
Tag Sequences → Genome
FISSEQ or “wobble” sequencing
Epifluorescence Scope with Integrated Flow Cell
Polony Fluorescent In Situ Sequencing Libraries
Greg Porreca, Abraham Rosenbaum
[Diagram labels: 1 to 100kb Genomic (x2); L, M, R; PCR bead; Sequencing primers; Selector bead]
2x20bp after MmeI (BceAI, AcuI)
Dressman et al PNAS 2003 emulsion
Cleavable dNTP-Fluorophore (& terminators)
Mitra,RD, Shendure,J, Olejnik,J, Olejnik,EK, and Church,GM (2003) Fluorescent in situ Sequencing on Polymerase Colonies. Analyt. Biochem. 320:55-65
Reduce or photo-cleave
Polony-FISSeq: up to 2 billion beads/slide
Cy5 primer (570nm); Cy3 dNTP (666nm)
Jay Shendure
Self Organizing Monolayer
• # of bases sequenced (total) 23,703,953
• # bases sequenced (unique) 73
• Avg fold coverage 324,711 X
• Pixels used per bead (analysis) ~3.6
• Read Length per primer 14-15 bp
• Insertions 0.5%
• Deletions 0.7%
• Substitutions (raw) 4e-5
• Throughput: 360,000 bp/min
Polony FISSeq Stats
Current capillary sequencing 1400 bp/min (600X speed/cost ratio, ~$5K/1X)
(This may omit: PCR, homopolymer, context errors) Shendure
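The coverage and throughput figures above can be checked with a few lines of arithmetic. This is only a consistency sketch using the numbers on the slide; the variable names are hypothetical. The slide's 600X figure is a speed/cost ratio, and the cost side is not given here, so only the raw speed ratio is computed.

```python
# Figures copied from the Polony FISSeq stats above.
total_bases_sequenced = 23_703_953
unique_bases_sequenced = 73
polony_bp_per_min = 360_000
capillary_bp_per_min = 1_400

# Average fold coverage = total bases read / unique bases in the template.
fold_coverage = total_bases_sequenced / unique_bases_sequenced

# Raw speed ratio only; the slide's 600X also folds in cost, which is not derivable here.
speed_ratio = polony_bp_per_min / capillary_bp_per_min

print(f"Average fold coverage: {fold_coverage:,.0f} X")
print(f"Polony vs. capillary speed: {speed_ratio:.0f}x")
```

The computed coverage matches the stated 324,711 X when truncated, and the raw speed advantage is a few hundred fold, which suggests the remainder of the 600X figure comes from cost.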
Anonymity, privacy, identity
Required disclosure > optional > required privacy
Non-anonymous healthy genotype-phenotype studies
• Are information-rich resources (e.g. facial imaging & genome sequence) really anonymous?
• What are the risks and benefits of "open-source"?
• What level of training is needed to give informed consent on open-ended studies?
• Harvard Medical School IRB Human Subjects protocol submitted 16-Sep-2004