Personal Genome Project 9:35-10:20 Hyatt 4-Oct-2010 Future ...
Thanks to:
.gov, .edu, .org, .com
Read ===== I/O ===== Write
Personal Genome Project 9:35-10:20 Hyatt 4-Oct-2010 Future of Healthcare Using Genomics as a Key Tool
Azco
RBH
ArmRev.org
Oppenheimer Foundation
Gen9
LSRF
Will we be able to handle & interpret the data?
Christley et al., Bioinformatics (2009), apply a series of techniques to James Watson's genome that in combination reduce it to a mere 4 MBytes, small enough to be sent as an email attachment. The algorithms are implemented in C++ and are freely available at http://www.ics.uci.edu/~xhx/project/DNAzip
(Google processes 24E9 Mbytes/day)
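To put the two figures on this slide side by side, here is a back-of-envelope sketch. It uses only the numbers quoted above (4 MBytes per compressed genome, 24E9 MBytes/day); the variable names are illustrative.

```python
# Back-of-envelope comparison using only the two figures quoted on the slide.
COMPRESSED_GENOME_MB = 4      # Christley et al. (2009): Watson's genome, compressed
GOOGLE_MB_PER_DAY = 24e9      # the slide's figure for Google's daily processing

genomes_per_day = GOOGLE_MB_PER_DAY / COMPRESSED_GENOME_MB
print(f"{genomes_per_day:.1e} compressed genomes' worth of data per day")
```

At that compression level, the quoted daily volume corresponds to roughly six billion genomes.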
Medical Genomics: Individually rare collectively common (10%)
1935 genes are highly predictive & medically actionable at ~$1000 per gene test.
Why not on DTC SNP chips? Cost? Patents?
PKU, Tay Sachs, Cystic Fibrosis, BRCA1/2, etc.
rarediseases.info.nih.gov genetests.org
1963 PKU
US clinics quietly embrace whole-genome sequencing
Nature 14-Sep-2010
"If one hospital is doing it, you can be sure others will start, because patients will vote with their feet," --
Elizabeth Worthey HMGC & Children's Hospital of Wisconsin
“At the age of 3, he had more than 100 separate surgeries …On the basis of [sequencing], the physician recommended a bone-marrow transplant in June 2010. By mid-July, the child was eating his first meal.”
Reading & Writing Genomes: First semi-synthetic plasmid 1978: $10/b
CGI Human diploid genome 2009: $1500 / 6Gb
pBR322, 5 genes
(=7 logs/30y mostly since 2005)
NAR 1978 Sutcliffe & Church (BR: Bolivar & Rodriguez)
• Human insulin
• Human growth hormone
• Alpha-interferon
• G-CSF
• TPA
• GM-CSF
• Gamma-interferon
• IL-2
• Erythropoietin
• Hepatitis B vaccine
(Amgen, Biogen, Genentech, etc)
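The "7 logs" figure can be checked from the two price points on this slide. The sketch below assumes the comparison is between the 1978 per-base cost ($10/b) and the 2009 per-base cost ($1500 for 6 Gb); variable names are illustrative.

```python
import math

# Figures quoted on the slide.
cost_1978_per_base = 10.0     # $/base, first semi-synthetic plasmid (1978)
cost_2009_genome = 1500.0     # $ for a CGI human diploid genome (2009)
bases_2009 = 6e9              # 6 Gb diploid genome

cost_2009_per_base = cost_2009_genome / bases_2009
fold_change = cost_1978_per_base / cost_2009_per_base
logs = math.log10(fold_change)

print(f"2009 cost: ${cost_2009_per_base:.1e} per base")
print(f"Improvement: {fold_change:.1e}-fold = {logs:.1f} logs")
```

Under that assumption the improvement comes out between 7 and 8 orders of magnitude, consistent with the slide's rounded "7 logs".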
[Figure: bp/$ (log scale, 0.000001 to 1,000,000) vs. year (1980 to 2010). Series: dsDNA, Oligos, Seq. NIBR SynBio 15-Oct-2010, George Church]
1 Mb/$; 0.1 Mb/$ chip
Moore’s law = 1.5x/yr vs 10x/yr
1st-generation gene synthesis vs 2nd-generation sequencing (21 tech) & DNA synthesis
5-log gap
1 b/$ in cells
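The two growth rates quoted on this slide (1.5x/yr vs 10x/yr) diverge quickly once compounded. A minimal sketch, assuming constant yearly rates; applying it to a 5-log gap is an illustration only, and the function names are hypothetical.

```python
import math

MOORE_RATE = 1.5   # fold improvement per year (slide: Moore's law)
FAST_RATE = 10.0   # fold improvement per year (slide: sequencing and synthesis)

def fold_after(rate_per_year: float, years: int) -> float:
    """Cumulative fold improvement after compounding for `years` years."""
    return rate_per_year ** years

def years_to_gain(rate_per_year: float, logs: float) -> float:
    """Years needed to gain `logs` orders of magnitude at a constant rate."""
    return logs / math.log10(rate_per_year)

for years in (1, 3, 5):
    print(years, fold_after(MOORE_RATE, years), fold_after(FAST_RATE, years))

# Time to close a 5-log gap at each rate.
print(years_to_gain(MOORE_RATE, 5), years_to_gain(FAST_RATE, 5))
```

At 10x/yr five orders of magnitude take five years; at 1.5x/yr the same gain takes between 28 and 29 years.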
2nd-generation sequencing technologies
1. Illumina-GA: SbP, fluorescent read-length (2*110 bp)
2. AB-SOLiD: SbL, longest ligation reads (2*70 bp)
3. CGI: SbL, $2000 genome, rolony grid (7*10 bp)
4. Polonator: SbL/P, open-source, $170K device (2*30 bp)
5. Roche-454: SbP, long reads (900 bp)
6. Helicos: SbP-sm, high parallelism & quantitation (2*30 bp)
7. Ion Torrent: SbP, $50K, small device (100 bp)
8. Pacific Bio: SbP-sm, long reads (>2.0 kb)
9. Intelligent Bio: SbP, hexagonal grid (2*30 bp)
10. GnuBio: SbP, picoliter droplets (ND)
11. Halcyon: EM-sm, long reads (ND >Mb)
12. Visigen/StarLite: SbP-sm, Qdot-Pol-dNTP FRET (ND 500Kb)
13. Bionanomatrix: SbP-sm, fluorescent mapping (ND)
14. OxfordNanopore: Pore-protein-sm, small device (ND)
15. Nabsys: Pore-SbH-sm, small device (ND)
16. IBM: Pore Si-sm, small device (ND)
17. Genizon BioSci: SbH, in situ sequencing (ND)
18. LightSpeed: SbL, 16X density, >10X speed (ND)
19. ZS Genetics: EM-sm, iodine labels (ND)
20. GE Global: SbP-sm (ND)
21. Electronic Biosci: Pore-protein-sm (ND)
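The read formats in this list can be compared on one scale. The sketch below assumes that a format such as "2*110" means two reads of 110 bp per template (an interpretation, not stated on the slide); entries marked ND or given only as lower bounds are left out, and the function name is hypothetical.

```python
# Read formats as quoted on the slide; "ND" and ">" entries are omitted.
# Assumption: "n*m" means n reads of m bp each per template.
READ_FORMATS = {
    "Illumina-GA": "2*110",
    "AB-SOLiD": "2*70",
    "CGI": "7*10",
    "Polonator": "2*30",
    "Roche-454": "900",
    "Helicos": "2*30",
    "Ion Torrent": "100",
    "Intelligent Bio": "2*30",
}

def bases_per_template(read_format: str) -> int:
    """Total bases read per template for a format like '2*110' or '900'."""
    total = 1
    for factor in read_format.split("*"):
        total *= int(factor)
    return total

totals = {name: bases_per_template(fmt) for name, fmt in READ_FORMATS.items()}
for name, total in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:16s} {total:4d} bp")
```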
Polonator
Beadless Isothermally-amplified Rolling Circle Molecules on a grid
• Billions of library molecules can be amplified in 1 ml of solution in 30 minutes
• Unamplified molecules do not take up flow-cell space
Polony Exclusion Principle
Human Genome Sequencing Using Unchained Base Reads on Self-assembling DNA Nanoarrays.
Drmanac et al Science Jan 2010
X,Y density: 5-fold due to grid * 25 vs 1 pixel/template
Nutrition
Chemicals
Genomes Environments
Traits
Immunome TRAITS(Phenome)
Personal stem-cells
Epigenome (RNA,mC)
PERSONAL GENOME
3M alleles
One in a life-time genome + yearly (to daily) tests
Bio-weather map: Allergens, Microbes, Viruses
PersonalGenomes.org
Microbiome
Therapies
Immunome
Value of extreme & distinctive traits
Hyperthymesia, savant, synaesthesia: Solomon Shereshevskii 1886-1958; Kim Peek 1951-2009; Brad Williams 1956-present
Physics: Isaac Newton 1643-1727; Albert Einstein 1879-1955
Spiritual: Siddhārtha Gautama 563-483 BCE; Tenzin Gyatso 1935-present
2 Alleles from 4 Genome Sequences
MYH3 R672H R672C. Ng et al Nature 2009
Freeman–Sheldon syndrome (FSS, dominant)
3 Alleles from 2 Genome Sequences
Whole-Genome Sequencing in a Patient with Charcot–Marie–Tooth Neuropathy. Lupski et al. NEJM 362;13 April 1, 2010. SH3TC2 R954X, Y169H
Genetic diagnosis by whole exome capture and massively parallel DNA sequencing. Choi, Lifton, et al. PNAS 2009. “Homozygous missense D652N mutation at a position in SLC26A3 (congenital chloride diarrhea locus)”
4+10 Alleles from 4 Genome Sequences
Analysis of Genetic Inheritance in a Family Quartet by Whole-Genome Sequencing. Roach, et al. Science 2010
Exome Sequencing Identifies the Cause of a Mendelian Disorder. Ng et al. Nature Gen. 2010
• Primary ciliary dyskinesia (lungs) DNAH5
• Pyrimidine synthesis (face & limbs) DHODH
Logan & Heather Madsen
sltrib.com/health/ci_14648608
Genes, Environments, Traits, cells
1) First/only open access data
2) Avoid over-promising on de-identification
3) 100% on exam to assure informed consent (*Educate pre-consent rather than post-discovery*)
4) Genome sequence + epigenome
5) Multi-traits: images, iPS-etc. RNA, microbe/VDJ
6) Cells available for personal functional genomics
7) IRB approval for 100,000 diverse volunteers, 501(c)(3), 16,000 so far
Genome + Environment = Traits, [diseases], (treatments)
Hair: Baldness [alopecia] (minoxidil)
Eyes: [Near/Far-sightedness] (glasses); Iris color [ARMD] (sunglasses)
Face: [Developmental syndromes, Wrinkles] (Botox)
Brain: ADHD (Ritalin); Depression (Prozac); Headache (analgesics); Sleep & Circadian (caffeine, amphetamine, modafinil); Motion sickness (Dramamine, and Scopolamine)
Ears: Sensitivity (hearing aids)
Nose: Shape [breathing disorders] (CPAP)
Lip: [Cleft palate] (surgery); [Hirsutism] (calcium thioglycolate)
Mouth: Halitosis, throat exams; aerosols [airborne pathogens]
Digestion [reflux, gas, ulcer] (antibiotics, antacids, PPIs)
Back: Strain sensitivity [IDD] (analgesics)
Skin: Perspiration, Body odor, Pheromones (deodorants); Surface texture [psoriasis] (topicals, photo-treatments); Immune components [acne] (topical antibiotics); Skin color [vitamin D & sunburn] (supplements, SPF cream)
Hands: Dermatoglyphics [syndromes]; [Arthritis] (corticosteroids)
Internal sensors: Proprioceptor, Repetitive stress (NSAIAs)
Body: Height [Marfan] [short stature] (hGH); Weight [anorexia] [obesity] (Orlistat, Phentermine, Sibutramine); Allergies (antihistamines, cortisone, epinephrine, theophylline); Metabolic polymorphisms (nutriceuticals, insulin, statins)
Feet: Plantar fasciitis (orthotic shoes); Athlete’s foot (miconazole, itraconazole, terbinafine, salicylate)
1933
PGP
PGP#1 fMRI
Randy Buckner
Behavioral & cognitive tests
Ken Nakayama
Prosopagnosia
Individual report:
97 year old
PGP11
evidence.personalgenomes.org
How do we improve interpretation? e.g. PGP 6 hypertrophic cardiomyopathy allele
evidence.personalgenomes.org
Microbiome vs Immunome
Microbe tests: Detect drug resistance spectrum; earlier warning (e.g. meningitis)
Immune tests: Focus on response to exposure; longer times to detect exposure (e.g. HIV, TB)
Microbiomes: What limits diagnostics?
- Standard practice: skip diagnostics; guess at pathogen & antibiotics
- Biomarkers vs causative sequences.
- Ideally target pathogenicity & resistance
- Assay 6 nanoliters or 6 liters? (if <1 cell / ml)
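The assay-volume question can be made concrete with a sampling estimate. This is a minimal sketch that assumes target cells are Poisson-distributed in the sample (an assumption not stated on the slide) and takes 1 cell/ml as the upper bound; the function name is hypothetical.

```python
import math

def prob_at_least_one_cell(cells_per_ml: float, volume_ml: float) -> float:
    """P(sample contains >= 1 target cell), assuming Poisson-distributed counts."""
    expected = cells_per_ml * volume_ml
    return -math.expm1(-expected)   # 1 - exp(-expected), accurate for tiny values

CONCENTRATION = 1.0                                          # cells/ml, the slide's upper bound
VOLUMES_ML = {"6 nanoliters": 6e-6, "6 liters": 6000.0}      # both volumes expressed in ml

for label, volume in VOLUMES_ML.items():
    print(f"{label:>12s}: P = {prob_at_least_one_cell(CONCENTRATION, volume):.3g}")
```

Under these assumptions a 6 nl sample captures a target cell only a few times per million attempts, while a 6 l sample is essentially certain to contain one.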
Yung, Ingber et al. Lab Chip, 2009
Circulating tumor, pathogen, fetal, immune cells
Vaccination Immunome
Harvard/MIT: Vigneault, Laserson, Lieberman-Aiden, Church
Roche: Egholm, Simen
PGP Time Series Vaccine Experiment
Tracking human dynamic response to vaccination to 11 strains: Hepatitis A+B; Flu A/Brisbane/59/2007 (H1N1)-like, 10/2007 (H3N2)-like, B/Florida/4/2006-like virus; Polio; Yellow fever; Meningococcus; Typhoid; Tetanus; Diphtheria; Pertussis
Collect samples at -14d, 0d, +1d, +3d, +7d, +14d, +21d, +28d
SR1+SR2+TR1
IMGT/LIGM
V D J usage –
CDR3 size distribution
PGP Vaccination Immunome Self Organizing Map (SOM) clustering
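The two summary statistics named above, V D J usage and CDR3 size distribution, can be sketched as simple tallies. This is not the authors' pipeline: the reads below are invented for illustration, and a real analysis would take segment calls and CDR3 boundaries from an annotation step against a reference such as IMGT.

```python
from collections import Counter

# Hypothetical annotated reads: (V segment, D segment, J segment, CDR3 nucleotides).
# The combinations and sequences are made up for illustration only.
reads = [
    ("IGHV1-69", "IGHD3-10", "IGHJ4", "GCGAGAGATTACTACTTTGACTAC"),
    ("IGHV3-23", "IGHD3-10", "IGHJ4", "GCGAAAGGGTACTTTGACTAC"),
    ("IGHV1-69", "IGHD2-2", "IGHJ6", "GCGAGAGATTACTACTACGGTATGGACGTC"),
    ("IGHV3-23", "IGHD3-10", "IGHJ4", "GCGAAAGATTACTACTTTGACTAC"),
]

# V-D-J usage: how often each segment combination occurs.
vdj_usage = Counter((v, d, j) for v, d, j, _ in reads)

# CDR3 size distribution, in amino acids (3 nucleotides per residue).
cdr3_lengths = Counter(len(cdr3) // 3 for *_, cdr3 in reads)

for combo, count in vdj_usage.most_common():
    print(combo, count)
for length in sorted(cdr3_lengths):
    print(f"CDR3 length {length} aa: {cdr3_lengths[length]} reads")
```

Tallies like these, computed per time point, are the kind of feature vectors a clustering method such as a self-organizing map could take as input.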
Synthesize and test for antigen binding.
PGP#1 & #9 skin to stem cells to ...
Lee J, Park IH, Gao Y, Li JB, Li Z, Daley G, Zhang K, Church GM (2009) A Robust Approach to Identifying Tissue-specific Gene Expression Regulatory Variants Using Personalized Human Induced Pluripotent Stem Cells. PLoS Genetics Nov 2009
PGP iPSC-derived hepatic proteins & activity
Generation of Functional Human Hepatic Endoderm from Human Induced Pluripotent Stem Cells. Gareth et al (Daley, Church, Ian Wilmut labs)
[Diagram: two alleles of a heterozygous site (G/A, T/C) and their polyadenylated transcripts]
Allele-specific expression (ASE)
Test cis element variants in heterozygotes: eliminate environmental & trans-acting variation among individuals.
Cis: Copy number, enhancer, promoter, splicing, polyA, termination, transport, decay.
Allele-specific transcription factor (TF) binding
Causality: Synthetic homologous allele-replacement
Digital RNA allelotyping
Zhang, Li, Church unpublished; Forton et al. Genome Res. 2007
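Allele-specific expression at a heterozygous site comes down to asking whether the read counts for the two alleles depart from 50:50. The slide does not say which statistic was used; the sketch below uses an exact two-sided binomial test as one common choice, with hypothetical read counts and a hypothetical function name.

```python
from math import comb

def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance at one heterozygous site.

    Suitable for modest read counts; very deep coverage would need log-space arithmetic.
    """
    n = ref_reads + alt_reads
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the probability of every outcome at least as unlikely as the observed one.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))

# Hypothetical counts: 60 reads carry the reference allele, 40 the alternate.
print(binomial_two_sided_p(60, 40))
# Balanced counts give no evidence of imbalance.
print(binomial_two_sided_p(50, 50))
```

Because both alleles sit in the same cell, the test is insensitive to environmental and trans-acting differences, which is the point made on the slide.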
Clustering statistically significant allele-specific expression in reprogrammed cells; ~50% of ASE invariant among cell types
Lee, Zhang, Park, Daley, Church
In situ sequencing for metaphase chromosomes
Zhang et al 2006
In situ sequencing: Resistance to BCR-ABL kinase inhibitors in CML therapy. Nardi, Raz, Chao, Wu, Stone, Cortes, Deininger, Church, Zhu, Daley. Oncogene 27:775-82
E255K
T315I
M244V
Bakal, et al. Science 316, 1753
145 morphological measures → in situ sequencing
The Future of Reading & Writing Genomes
• Next Generation BI/O: Reading & Writing. Safety: multivirus-resistance, Genomic isolation
• Personal Genomes: Integration tasks: Personal Genomes, Environments, Traits, Stem cells, Microbiome/Immunome
• Causality: Multiplex changes in network causes & monitor effects.
• In situ multiplexing: subcellular, paralogs, alleles
• Proteome: antibody multiplexing
• Human diversity: multiplexing
Promising (implying) anonymity?
Research identification trend
(12) Identify individual case/control status from pooled SNP data. Homer et al PLoS Genetics 2008; as this became known, NCBI pulled dbGAP data
(11) Re-identification after “de-identification” using public data. Group Insurance list of birth date, gender, zip code sufficient to re-identify medical records of Governor Weld & family via voter-registration records (1998)
Self identification trend
(10) Unapproved self-identification. e.g. Celera IRB. (Kennedy Science. 2002)
(9) Obtaining data about oneself via FOIA or sympathetic researchers.
(8) DNA data CODIS data in the public domain, even if acquitted
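The re-identification route in item (11) is a join on quasi-identifiers. A minimal sketch, assuming two record sets that share birth date, gender and zip code; every record, field name and function name below is invented for illustration.

```python
# Hypothetical records, invented for illustration; no real data.
medical_records = [
    {"birth_date": "1950-01-02", "gender": "M", "zip": "02138", "diagnosis": "condition A"},
    {"birth_date": "1962-07-15", "gender": "F", "zip": "02139", "diagnosis": "condition B"},
]
voter_records = [
    {"name": "Person One", "birth_date": "1950-01-02", "gender": "M", "zip": "02138"},
    {"name": "Person Two", "birth_date": "1971-03-30", "gender": "F", "zip": "02139"},
]

QUASI_IDENTIFIERS = ("birth_date", "gender", "zip")

def key(record):
    """Tuple of quasi-identifier values used to link the two lists."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)

def reidentify(medical, voters):
    """Link 'de-identified' records to names when the quasi-identifier key is unique."""
    names_by_key = {}
    for voter in voters:
        names_by_key.setdefault(key(voter), []).append(voter["name"])
    linked = []
    for record in medical:
        names = names_by_key.get(key(record), [])
        if len(names) == 1:   # exactly one voter shares the key: re-identified
            linked.append((names[0], record["diagnosis"]))
    return linked

print(reidentify(medical_records, voter_records))
```

Removing names alone does not help when the remaining fields are unique to one person in a public list.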
Anonymity vs Open Access? Are we in denial?
Accessing “Secure data”
(7) Laptop loss. 26 million Veterans' medical records, SSN & disabilities stolen Jun 2006.
(6) Hacking. A hacker gained access to confidential medical info at the U. Washington Medical Center -- 4000 files (names, conditions, etc, 2000)
(5) Combination of surnames from genotype with geographical info. An anonymous sperm donor traced on the internet 2005 by his 15 year old son who used his own Y chromosome data.
(4) Identification by phenotype. If CT or MR imaging data is part of a study, one could reconstruct a person’s appearance. Even blood chemistry can be identifying in some cases.
(3) Inferring phenotype from genotype. Markers for eye, skin, and hair color, height, weight, geographical features, dysmorphologies, etc. are known & the list is growing.
(2) “Abandoned DNA” bearing samples (e.g. hair, dandruff, hand-prints, etc.)
(1) Government subpoena. False positive IDs and/or family coercion
Sharing genetic information on the Web-- (Forbes 05.07.07 by Peter Huber, Manhattan Institute's Center for Legal Policy)
--will lead to cures. .. Routine lab tests .. already expose genetic variations associated with cholesterol, cancer and hundreds of other traits and diseases, both common and rare. Coming soon: .. dipsticks for the genes that control .. stress, pleasure, irritability, aggression, impulsive behavior, suicidal tendencies, alcoholism and sexual proclivities.. Discovering what ails us and how to beat it is a statistical game. .. traditionally .. left to .. Washington, because they had the best access to the most data. The networked masses, first to know what hanky-panky they've been up to, what pills they've been popping, and whether they feel better or worse. What Google medicine will lack in discipline and traditional rigor, it will more than make up for in speed and scope.
Heritability & multigenics
Existing large studies
Study                                 Start  Cohort size  Samples
Framingham Heart Study                '48    13,833       1,500
Health Professional's study           '86    52,000       30,000
Nurses Health Study I                 '89    122,000      63,000
Washington County study               '89    33,000       33,000
Women's Health study                  '92    40,000       28,000
Women's Health Initiative             '93    162,000      162,000
NCI PLCO study                        '94    155,000      70,000
Nurses Health Study II                '96    116,000      60,000
Am Cancer Soc CPS II-Lifelink study   '98    184,000      109,000
Multiethnic Cohort study              '96    215,000      80,000
VITAL cohort                          '99    78,000       54,000
Agricultural Health study             '99    90,000       35,000
Southern Community Cohort study       '02    90,000       80,000
Black Women's cohort study            '05    59,000       41,000
Total                                        1,409,833    846,500
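The two totals can be re-derived from the per-study rows. A short sketch using only the figures listed on the slide; the variable names are illustrative.

```python
# (study, start year, cohort size, samples) as listed on the slide.
STUDIES = [
    ("Framingham Heart Study", "'48", 13_833, 1_500),
    ("Health Professional's study", "'86", 52_000, 30_000),
    ("Nurses Health Study I", "'89", 122_000, 63_000),
    ("Washington County study", "'89", 33_000, 33_000),
    ("Women's Health study", "'92", 40_000, 28_000),
    ("Women's Health Initiative", "'93", 162_000, 162_000),
    ("NCI PLCO study", "'94", 155_000, 70_000),
    ("Nurses Health Study II", "'96", 116_000, 60_000),
    ("Am Cancer Soc CPS II-Lifelink study", "'98", 184_000, 109_000),
    ("Multiethnic Cohort study", "'96", 215_000, 80_000),
    ("VITAL cohort", "'99", 78_000, 54_000),
    ("Agricultural Health study", "'99", 90_000, 35_000),
    ("Southern Community Cohort study", "'02", 90_000, 80_000),
    ("Black Women's cohort study", "'05", 59_000, 41_000),
]

cohort_total = sum(cohort for _, _, cohort, _ in STUDIES)
sample_total = sum(samples for _, _, _, samples in STUDIES)
sampled_fraction = sample_total / cohort_total

print(cohort_total, sample_total, f"{sampled_fraction:.0%}")
```

The sums match the slide's totals, and samples exist for roughly 60% of enrolled participants.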
Trends toward openness
HR 2764 “SEC. 218. all investigators funded by the NIH submit .. an electronic version of their final, peer-reviewed manuscripts .., to be made publicly available” (make science paid for by tax-payers accessible to the tax-payers, not just the experts) 26-Dec-07
PatientsLikeMe.com: MS, Parkinson’s, ALS, Depression, Anxiety, Bipolar, OCD, HIV/AIDS
“sharing your healthcare experiences and outcomes is good.” (Full names & photos)
Traits
Down Syndrome
Waardenburg
Eyes
Normal Dermatoglyph
Imaging Diagnostics
Control
22q11DS
Noonan
Smith-Magenis
Williams
Hammond et al, Am J Med Genet 2004, Am J Hum Genet 2005
How many ‘complete’ public genomes now?
Name         >90% public  Fold coverage  Inst      Tech
Venter       Sep-07       7.5            JCVI      ABI
Watson       Apr-08       7.4            BCM       454
Yang         Nov-08       36             BGI       ILMN
Kim 1(4)     May-09       29             Incheon   ILMN
Quake        Aug-09       28             Stanford  HLCS
Church       Nov-09       45-90          PGP       CGI
Gates 2      Mar-10       20             PGP       ILMN
Tutu         Mar-10       30             PSU       SOLiD
Angrist      Mar-10       30             PGP       ILMN
Madsen 0(4)  Mar-10       51-88          ISB       CGI
Lupski       Mar-10       30             BCM       SOLiD
Lucier       Apr-10       ?              ABI       SOLiD
Flatley      Apr-10       ?              ILMN      ILMN
West 1(4)    Apr-10       ?              ILMN      ILMN
Gill         Apr-10       15             TBI/PGI   ILMN
13