John P. Hussman Institute for Human Genomics University of ...
Stephan Züchner, MD
John P. Hussman Institute for Human Genomics
University of Miami Miller School of Medicine
• Part of a patent licensing agreement with Athena Diagnostics.
• Receiving honorarium from Illumina.
Completion of Human Genome Project: 2001 - 2011
• Completion using Sanger sequencing
• Initiation of new seq technologies:
Shot-gun approach
Sequencing by synthesis
• Today, the sequencing industry is very competitive and extremely innovative
Recent numbers …
• HiSeq2000 produces >300 billion bases per run (9 days, ~$20K);
that is a 100,000-fold improvement in 10 years
• >600 Gb by mid 2011
• The rate of technical improvement in the sequencing arena by far outpaces Moore's Law (2-fold in 1.5 years).
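The comparison with Moore's Law can be checked with a little arithmetic. The sketch below only uses the figures quoted on the slide (100,000-fold in 10 years, 2-fold in 1.5 years); the variable names are illustrative.

```python
import math

years = 10.0
moore_doubling_years = 1.5   # Moore's Law as quoted on the slide
observed_fold = 100_000      # sequencing improvement quoted on the slide

# Fold improvement that Moore's Law alone would predict over the same period
moore_fold = 2 ** (years / moore_doubling_years)

# Doubling time implied by the observed 100,000-fold improvement
implied_doubling_years = years / math.log2(observed_fold)

print(f"Moore's Law over {years:.0f} years: ~{moore_fold:.0f}-fold")
print(f"Implied doubling time: ~{implied_doubling_years * 12:.1f} months")
```

Under these numbers Moore's Law would predict roughly a 100-fold gain in 10 years, whereas the quoted sequencing improvement corresponds to a doubling about every 7 months.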
Challenges
• Most challenges relate to the analysis of data.
• Study designs.
• Interdisciplinary teams are key (molecular, bioinformatics, clinical, statistical expertise).
• Ever-evolving tool set – much time is occupied by staying up-to-date.
03/2010
and Richard A. Gibbs, Ph.D.
Commentary in Nat Rev Neurology, S. Züchner 2010
Individual genome vs exome sequencing
o Whole human genome sequencing is not yet suitable for routine use:
  - Cost for sequencing (still ~$10K per genome)
  - Cost for data processing and storage
  - Cost and time for bioinformatic analysis and follow-up studies
o For many disease-oriented applications in human genetics, partial sequencing of the human genome is sufficient (linkage peaks, association areas, etc.).
Hence, EXOME sequencing is becoming a major (temporary) application.
What (exactly) is the “Exome”?
| | Coding exons | Mb coding exonic sequence |
|---|---|---|
| CCDS | 196,266 | ~32 |
| Exome enrichment kits (Roche, Agilent, Illumina) | ~200,000 | ~38 - 62 |

o The exome comprises all coding exons in the human genome.
o The true size is unknown and will continue to change over the next years.
o Exome kits capture ~96 - 98% of CCDS (Consensus Coding Sequence).
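To put these sizes in proportion, the following sketch relates the CCDS and kit target sizes to the whole genome. The genome size of about 3,100 Mb is an assumption that does not appear on the slide; all other numbers are taken from it.

```python
GENOME_MB = 3_100  # assumption: approximate haploid human genome size (not from the slide)

ccds_mb = 32                        # CCDS coding sequence, from the slide
kit_mb_range = (38, 62)             # exome kit target sizes, from the slide
ccds_capture_range = (0.96, 0.98)   # share of CCDS captured by the kits, from the slide

# Share of the genome represented by CCDS and by the kit targets
ccds_fraction = ccds_mb / GENOME_MB
kit_fractions = tuple(mb / GENOME_MB for mb in kit_mb_range)

# Mb of CCDS sequence that a kit would be expected to capture
ccds_captured_mb = tuple(ccds_mb * f for f in ccds_capture_range)

print(f"CCDS is ~{ccds_fraction:.1%} of the genome")
print(f"Kit targets are ~{kit_fractions[0]:.1%} to ~{kit_fractions[1]:.1%} of the genome")
print(f"CCDS captured: ~{ccds_captured_mb[0]:.1f} to ~{ccds_captured_mb[1]:.1f} Mb")
```

Under that assumption the coding exome is on the order of 1 to 2% of the genome.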
Gene discovery in Mendelian diseases
6,000 monogenic disorders described
<2,000 disease genes identified
For many disorders, Mendelian genes have provided unique guidance to the underlying pathways.
Immediate modeling in vitro and in vivo possible.
Rare variant discovery in common disease
GWAS have successfully determined the contribution of common variation to disease.
A large gap of “missing heritability” exists for many phenotypes.
Rare variants may play a significant role in common so-called complex disease.
General issues with exon capture and NGS
(from Hedges et al., 2009; Nimblegen arrays/ 454 seq)
o % of reads aligning to the human genome reference sequence.
o % of reads on target.
o % of targets covered by a minimum number of reads.
o Allelic bias.
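The four quality metrics listed on this slide can be computed from simple read counts. The following is a minimal sketch on made-up numbers; the class, function names, and the choice to express on-target reads relative to aligned reads are illustrative assumptions, not the method used in the cited study.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    total_reads: int
    aligned_reads: int
    on_target_reads: int


def pct(part, whole):
    """Percentage of `part` in `whole`; 0.0 if `whole` is zero."""
    return 100.0 * part / whole if whole else 0.0


def pct_targets_covered(target_read_counts, min_reads):
    """Percentage of targets whose read count is at least `min_reads`."""
    covered = sum(1 for n in target_read_counts if n >= min_reads)
    return pct(covered, len(target_read_counts))


def alt_allele_fraction(ref_reads, alt_reads):
    """Fraction of reads carrying the alternate allele at a heterozygous site.
    A value near 0.5 indicates no allelic bias."""
    total = ref_reads + alt_reads
    return alt_reads / total if total else float("nan")


# Hypothetical toy numbers, for illustration only
stats = RunStats(total_reads=1_000_000, aligned_reads=950_000, on_target_reads=570_000)
pct_aligned = pct(stats.aligned_reads, stats.total_reads)
pct_on_target = pct(stats.on_target_reads, stats.aligned_reads)
pct_covered = pct_targets_covered([0, 5, 12, 40, 8, 25], min_reads=10)
bias = alt_allele_fraction(ref_reads=30, alt_reads=20)

print(pct_aligned, pct_on_target, pct_covered, bias)
```

In practice these counts would come from the alignment files and the capture design; here they are hard-coded to keep the example self-contained.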
Uniformity/ evenness of coverage depth
[Figure: bp-wise sequence depth of CMT genes; y-axis: read depth in reads per base position, with ticks at 10, 100, and 1000]
• Uniform depth of sequence coverage requires sequencing 100-200 times the target size
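The 100-200-fold rule translates directly into a sequencing budget. The sketch below combines it with numbers quoted elsewhere in this talk (exome kit targets of ~38 - 62 Mb, >300 Gb per HiSeq2000 run); it ignores off-target and duplicate reads, so the per-run sample counts are optimistic upper bounds, and the function name is illustrative.

```python
def required_sequence_gb(target_mb, fold):
    """Raw sequence (in Gb) needed to sequence a target of `target_mb` Mb at `fold` times its size."""
    return target_mb * fold / 1000.0


# Smallest kit at 100-fold and largest kit at 200-fold
low_gb = required_sequence_gb(38, 100)
high_gb = required_sequence_gb(62, 200)

RUN_GB = 300  # per-run output quoted for the HiSeq2000
exomes_per_run = (RUN_GB / high_gb, RUN_GB / low_gb)

print(f"Sequence needed per exome: {low_gb:.1f} to {high_gb:.1f} Gb")
print(f"Exomes per run (upper bound): {exomes_per_run[0]:.0f} to {exomes_per_run[1]:.0f}")
```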
Newer designs of capture kits improve evenness and coverage
[Figure: plots of coverage depth across exons of 40 CMT genes; panels: Nimblegen V. 1, Nimblegen V. 2]
Coefficient of variation - EZ exome V1 vs. V2
Based on 40 neuropathy-related genes.
[Figure: coefficient of variation for V1 Roche, V2 in house, and V2 Roche; * marks p<0.05]
Proportion of uncovered bases - EZ exome V1 vs. V2
Based on 40 neuropathy-related genes.
[Figure: average proportion of uncovered bases per gene for V1 Roche, V2 in house, and V2 Roche]
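The two evenness metrics used to compare the kit versions, coefficient of variation of depth and proportion of uncovered bases, can be sketched as follows. The per-base depth lists are made-up toy data, and the function names and the `min_depth` threshold are illustrative assumptions; the slides do not state how the original values were computed.

```python
from statistics import mean, pstdev


def coefficient_of_variation(depths):
    """Standard deviation of per-base depth divided by its mean (lower means more even)."""
    mu = mean(depths)
    if mu == 0:
        return float("inf")
    return pstdev(depths) / mu


def proportion_uncovered(depths, min_depth=1):
    """Fraction of bases with depth below `min_depth`."""
    return sum(1 for d in depths if d < min_depth) / len(depths)


# Hypothetical per-base depths for one gene under two capture designs
even_design = [48, 50, 52, 50, 49, 51]
uneven_design = [5, 120, 0, 60, 15, 100]

for name, depths in [("even", even_design), ("uneven", uneven_design)]:
    print(name,
          f"CV={coefficient_of_variation(depths):.3f}",
          f"uncovered={proportion_uncovered(depths):.3f}")
```

Both toy designs have the same mean depth, so the difference in the two metrics comes entirely from how evenly the reads are spread.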
(Miller syndrome)
(Bartter syndrome) November 2009
American Journal of Human Genetics, February, 2011
• Retinitis pigmentosa (RP) causes degeneration of photoreceptors:
impaired night vision, loss of peripheral vision, and loss of central vision in later life.
• Prevalence is approximately 1 in 3,000 - 4,500 individuals.
• 50 genes are known to cause RP, but …
~ 50% of RP patients have mutations in unknown genes.
Images from the Foundation Fighting Blindness.
• We studied an RP family of Ashkenazi Jewish origin.
• All known RP genes had been excluded.
• Single pedigree with only three affected siblings
- traditionally very difficult to find the underlying novel gene.
Variants detected with exome sequencing
• Across the four individuals we identified 19,307 coding single nucleotide variants.
• No novel indels co-segregated with disease.

All detected changes
| | Affected Sibling 1 | Affected Sibling 2 | Affected Sibling 3 |
|---|---|---|---|
| Missense, non-sense, splice site variations | 8,712 | 8,716 | 8,752 |
| Filtered for homozygosity and novelty | 11 | 18 | 27 |

Sharing within family
| Shared by | Variants |
|---|---|
| Affected Sibling 1+2 | 5 |
| Affected Sibling 1+2+3 | 4 |
| Affected Sibling 1+2+3 + NOT in Unaffected Sibling 4 | 1 (DHDDS) |
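The filtering steps on this slide (keep functional variants, keep homozygous and novel ones, intersect across affected siblings, drop what the unaffected sibling also carries) can be sketched with set operations. This is an illustration under a recessive-model assumption, not the pipeline used in the study: the `Variant` class, the gene labels, and the rule that only homozygous calls in the unaffected sibling exclude a variant are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    key: str          # hypothetical identifier, e.g. "GENE_A:var1"
    effect: str       # "missense", "nonsense", "splice_site", "synonymous", ...
    homozygous: bool
    novel: bool       # not seen in reference variant databases


FUNCTIONAL = {"missense", "nonsense", "splice_site"}


def homozygous_novel_functional(variants):
    """Keys of variants that pass the per-individual filters."""
    return {v.key for v in variants
            if v.effect in FUNCTIONAL and v.homozygous and v.novel}


def shared_candidates(affected, unaffected):
    """Variants passing filters in every affected sibling and not homozygous in any unaffected one."""
    per_sibling = [homozygous_novel_functional(vs) for vs in affected]
    if not per_sibling:
        return set()
    shared = set.intersection(*per_sibling)
    unaffected_hom = {v.key for vs in unaffected for v in vs if v.homozygous}
    return shared - unaffected_hom


# Toy data: GENE_A is the only variant that survives every step
a = Variant("GENE_A:var1", "missense", True, True)
b = Variant("GENE_B:var1", "nonsense", True, True)
sib1 = [a, b, Variant("GENE_C:var1", "synonymous", True, True)]
sib2 = [a, b, Variant("GENE_D:var1", "missense", False, True)]
sib3 = [a, b]
sib4 = [b, Variant("GENE_A:var1", "missense", False, True)]  # unaffected, heterozygous carrier

candidates = shared_candidates([sib1, sib2, sib3], [sib4])
print(candidates)
```

Each added sibling shrinks the candidate set, which is why even a small pedigree can narrow thousands of variants to a single gene.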
Detailed results of genotyping of population controls for the identified variant in DHDDS.

| | Chromosomes screened | Variant observed | Estimated MAF | Estimated homozygous frequency |
|---|---|---|---|---|
| Jewish | 1,434 | 8 | 0.0056 | 0.00003136 |
| Non Jewish | 13,954 | 0 | < 0.000072 | < 5.2E-09 |
| Unknown Ethnicity | 11,786 | 1 | 0.000085 | 7.2E-09 |
| Sum | 27,174 | 9 | | |
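The frequency columns can be reproduced from the counts. The slide's values are consistent with estimating the minor allele frequency (MAF) as observations divided by chromosomes screened and squaring it to get the expected homozygous frequency, i.e. assuming Hardy-Weinberg proportions; that reading is an inference from the numbers, and the function names below are illustrative.

```python
def allele_frequency(observed, chromosomes):
    """Minor allele frequency estimate. With zero observations, return the
    upper bound of one allele in the screened chromosomes."""
    return max(observed, 1) / chromosomes


def homozygous_frequency(maf):
    """Expected homozygote frequency under Hardy-Weinberg proportions (q squared)."""
    return maf ** 2


controls = {
    "Jewish": (8, 1_434),
    "Non Jewish": (0, 13_954),
    "Unknown Ethnicity": (1, 11_786),
}

for group, (observed, chromosomes) in controls.items():
    maf = allele_frequency(observed, chromosomes)
    bound = "< " if observed == 0 else ""
    print(f"{group}: MAF {bound}{maf:.6f}, homozygous {bound}{homozygous_frequency(maf):.2e}")
```

The slide appears to square the rounded MAF (0.0056 squared is 0.00003136), so the unrounded results printed here differ slightly in the last digits.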
1. Pathway analyses
DHDDS (dehydrodolichyl diphosphate synthase) links important pathways in RP
2. Conservation analysis
The mutated amino acid is highly conserved across species
3. In-silico function
3D in silico modeling of protein function
• The K42 (+) residue stabilizes the farnesyl-pyrophosphate (FPP) binding pocket via charge-charge repulsive forces towards R38 (+).
• The mutant E42 (-) will compete for R38 (+) binding.
4. Animal modeling
Morpholino knock-down of DHDDS function in zebrafish
[Figure panels: DHDDS deficient; morpholino control]
Compared to control zebrafish, morpholino knock-down of DHDDS significantly reduces escape reactions to light changes.
Histopathology of zebrafish eye – rods of photoreceptors are degenerated
[Figure panels: DHDDS deficient; wild type]
5. Additional genetic support
DHDDS mutation found in 15 out of 123 index patients (12%)
Summary
• We have identified a novel RP gene, DHDDS, highlighting a key
biological pathway.
• Exome sequencing of rare genetically heterogeneous
phenotypes will require complementary functional approaches.
• We have demonstrated that in silico protein studies and
zebrafish modeling are sufficient, fast, and cost-effective
strategies.
Science, November 2011
Team work ...
HIHG
Stephan Züchner, Gary Beecham, Adam Naj, Amjad Farooq, Martin Kohli,
Patrice L. Whitehead, William Hulme, Ioanna Konidari, Juan Young, David
Seo, Susan Blanton, Jeffery M. Vance, and Margaret A Peričak-Vance
Department of Biology
Julia Dallman
BPEI
Byron Lam, Rong Wen, Eduardo Alfonso
Vanderbilt University
Jonathan Haines
Department of Biochemistry
Amjad Farooq
Mt. Sinai Hospital, NYC
Joseph Buxbaum
What can go wrong in targeted or exome sequencing?
Capture/ enrichment:
• Technical issues, sample mix-up
• Relevant variant(s) not covered by capture/ enrichment kit (capture probe design, large
sequence never 100% suitable for hybridization)
• Uniformity/ evenness low
Sequencing:
• Technical issues
• Insufficient sequence amount (low coverage)
• Read length choice, single vs paired-end reads
Analysis:
• Ambiguous and/ or multiple alignment of reads (pseudo genes, repetitive sequence, GC)
• Variant calling fails for specific reasons (low coverage or quality)
Annotation:
• Automated mass annotation is essential, but can be erroneous or incomplete (splice variants, functional synonymous changes, binding sites for regulatory factors, unknown exons)
Interpretation:
• Wrong assumptions regarding the outcome (statistical model, class of molecular variant)
• Inadequate statistical power
• Human error
What do we usually miss with exome sequencing?
• Copy number variation
• Large indels (>20bp)
• Long repeats (STR)
• Homologous regions
• Unknown exons
• UTR
• Regulatory and intronic changes
Hussman Institute for Human Genomics
• 7 next generation sequencing instruments, max capacity of 1.5 Trillion base
pairs every 9 days (this will roughly double with instrument upgrade early May).
• Single run produces ~4 Terabytes of raw data: 1.2 Petabyte disc storage.
• 5,000 node computing cluster.
• Developed fully-automated exome capture on Caliper robot with capacity of 288
exome samples per week.
At HIHG a wide range of diseases are being studied with
targeted and exome sequencing
• Alzheimer disease
• Amyotrophic lateral sclerosis
• Age-related macular degeneration
• Autism
• Club foot
• Charcot-Marie-Tooth disease
• Deafness
• Essential tremor
• Dilated cardiomyopathy
• Hereditary spastic paraplegia
• HIV
• Multiple sclerosis
• Parkinson disease
• Variety of recessive syndromes
• …
HIHG faculty have been actively publishing in the exome field since late 2009
• Hedges D et al. (2009) Exome sequencing of a multigenerational human pedigree. PLoS One.
• Martin ER et al. (2010) SeqEM: an adaptive genotype-calling approach for next-generation sequencing studies. Bioinformatics.
• Sirmaci A et al. (2010) MASP1 mutations in patients with facial, umbilical, coccygeal, and auditory findings of Carnevale, Malpuech, OSA, and Michels syndromes. Am J Hum Genet.
• Montenegro G et al. (2011) Exome sequencing allows for rapid gene identification in a Charcot-Marie-Tooth disease family. Annals of Neurology.
• Norton N et al. (2011) Genome-wide Studies of Copy Number Variation and Exome Sequencing Identify Rare Variants in BAG3 as a Cause of Dilated Cardiomyopathy. Am J Hum Genet.
• Züchner S et al. (2011) Whole-exome sequencing links a variant in DHDDS to retinitis pigmentosa. Am J Hum Genet.
• Hedges DJ et al. (2011) Comparison of three targeted enrichment strategies on the SOLiD sequencing platform. PLoS One.
• …
Summary
Exome and targeted sequencing is a mature research tool.
Cost-effective: < US $2,000 today; ~$1,000 by end of 2011
It allows entry into Human Genomics with all its complications of data analysis and interpretation.
Is targeted sequencing here to stay (vs whole genome seq)?
• Probably as long as the economics are attractive.
• And as long as new discoveries are indeed possible.