Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.
![Page 1: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/1.jpg)
Considerations for Analyzing Targeted NGS Data
Introduction
Tim Hague, CTO
![Page 2: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/2.jpg)
![Page 3: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/3.jpg)
Introduction
Many mapping, alignment and variant calling algorithms
Most of these have been developed for whole genome sequencing and to some extent population genetic studies.
![Page 4: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/4.jpg)
Premise
In contrast, NGS based diagnostics deals with particular genes or mutations of an individual.
Different diagnostic targets present specific challenges.
![Page 5: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/5.jpg)
Goal
Present analysis issues related to differences in:
Sequencing technologies
Targeting technologies
Target specifics
Pseudogenes and segmental duplication
![Page 6: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/6.jpg)
NGS Sequencers
Illumina
Ion Torrent
Roche 454
(SOLiD)
![Page 7: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/7.jpg)
Moore B, Hu H, Singleton M, De La Vega FM, Reese MG, Yandell M. Genet Med. 2011 Mar;13(3):210-7.
![Page 8: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/8.jpg)
Sequencing Technology
Differences:
Homopolymer error rates
G/C content errors
Read length
Sequencing protocols (single vs paired reads)
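Two of the properties listed above, G/C content and homopolymer runs, can be measured directly on a read or target sequence. The following is a minimal sketch; the function names and the example read are illustrative assumptions, not part of the original slides.

```python
from itertools import groupby


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def longest_homopolymer(seq):
    """Return (base, length) of the longest run of a single repeated base."""
    best = ("", 0)
    for base, run in groupby(seq.upper()):
        length = sum(1 for _ in run)
        if length > best[1]:
            best = (base, length)
    return best


# Hypothetical read containing a run of six T bases.
read = "ACGTTTTTTGCGC"
print(gc_fraction(read))
print(longest_homopolymer(read))
```

Regions with long homopolymer runs or extreme G/C content are natural candidates for closer inspection when comparing results across sequencing platforms.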
![Page 9: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/9.jpg)
Targeting Methods
PCR primers (e.g. amplicons)
Hybridization probes (e.g. exome kits)
![Page 10: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/10.jpg)
Targeting Technology
Differences:
Exact matching regions vs regions with SNPs.
Results in:
Need for mapping against whole chromosomes to avoid false positives.
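The false-positive risk of mapping only against the target can be illustrated with a toy example. The sketch below is not a real aligner: it uses exhaustive, ungapped placement with mismatch counting, and the sequences and names (`gene`, `pseudogene`, `best_hit`) are invented for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_hit(read, reference):
    """Return (name, offset, mismatches) of the best ungapped placement."""
    best = None
    for name, seq in reference.items():
        for offset in range(len(seq) - len(read) + 1):
            mm = hamming(read, seq[offset:offset + len(read)])
            if best is None or mm < best[2]:
                best = (name, offset, mm)
    return best


# Hypothetical target gene and a near-identical off-target copy.
gene = "ACGTACGGTCATGCA"
pseudogene = "ACGTACGATCTTGCA"  # differs from gene at two positions

# The read truly originates from the pseudogene.
read = pseudogene[3:13]

target_only = {"gene": gene}
whole_reference = {"gene": gene, "pseudogene": pseudogene}

# Against the target alone, the read is forced onto the gene and its two
# mismatches look like variants.
print(best_hit(read, target_only))
# Against the fuller reference, the read finds its true, perfect location.
print(best_hit(read, whole_reference))
```

With the target-only reference, the two mismatches would be reported as SNPs in the gene even though the sample carries no such variants there.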
![Page 11: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/11.jpg)
Analysis Targets
Differences:
Rate of polymorphism
Repetitive structures
Mutation profiles
G/C content
Single genes vs multi gene complexes
![Page 12: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/12.jpg)
BRCA1/2: 1/2000
HLA: 1/29
CFTR: 1/2000
Distributions of insertions and deletions
Distribution of repeat elements
![Page 13: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/13.jpg)
![Page 14: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/14.jpg)
Segmental Duplications
Sometimes called Low Copy Repeats (LCRs)
Highly homologous, >95% sequence identity
Rare in most mammals
Comprise a large portion of the human genome (and other primate genomes)
Important for understanding HLA
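The ">95% sequence identity" criterion can be made concrete with a small calculation. This sketch assumes the two copies are already aligned to equal length and counts any gap character as a mismatch; the function names, the threshold parameter, and the sequences are illustrative assumptions.

```python
def percent_identity(a, b):
    """Percent of aligned columns where both sequences carry the same base.

    Assumes a and b are pre-aligned and of equal length; a '-' gap in
    either sequence counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def looks_like_segmental_duplication(a, b, threshold=95.0):
    """True when identity exceeds the threshold quoted on the slide."""
    return percent_identity(a, b) > threshold


# Hypothetical 100 bp copies differing at three positions.
copy_a = "ACGT" * 25
copy_b = "C" + copy_a[1:50] + "T" + copy_a[51:99] + "A"

print(percent_identity(copy_a, copy_b))
print(looks_like_segmental_duplication(copy_a, copy_b))
```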
![Page 15: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/15.jpg)
Segmental Duplications
Many LCRs are concentrated in "hotspots"
Recombinations in these regions are responsible for a wide range of disorders, including:
Charcot-Marie-Tooth syndrome type 1A
Hereditary neuropathy with liability to pressure palsies
Smith-Magenis syndrome
Potocki-Lupski syndrome
![Page 16: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/16.jpg)
Data Analysis Tools
Differences:
Detection rates of complex variants (sensitivity)
False positive rates (accuracy)
Speed
Ease of use
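Detection rate and false positive rate can be estimated by comparing a tool's calls against a set of known variants. The sketch below assumes variants are represented as `(chrom, pos, ref, alt)` tuples and that a trusted truth set exists; the metric names and the example data are assumptions for illustration only.

```python
def evaluate_calls(truth, calls):
    """Compare a call set against a truth set of (chrom, pos, ref, alt) tuples."""
    tp = len(truth & calls)   # known variants that were called
    fn = len(truth - calls)   # known variants that were missed
    fp = len(calls - truth)   # calls not supported by the truth set
    return {
        "true_positives": tp,
        "false_negatives": fn,
        "false_positives": fp,
        # Share of known variants that the tool detected.
        "sensitivity": tp / (tp + fn) if truth else 0.0,
        # Share of the tool's calls that are false positives.
        "false_discovery": fp / (tp + fp) if calls else 0.0,
    }


# Hypothetical truth set and tool output.
truth = {
    ("chr17", 100, "A", "G"),
    ("chr17", 200, "C", "T"),
    ("chr17", 300, "GA", "G"),
    ("chr17", 400, "T", "C"),
}
calls = {
    ("chr17", 100, "A", "G"),
    ("chr17", 200, "C", "T"),
    ("chr17", 500, "G", "A"),
}

print(evaluate_calls(truth, calls))
```

Running the same comparison for several tools on the same sample gives a like-for-like view of the differences listed above.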
Data analysis shouldn’t be like this!
![Page 17: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/17.jpg)
“Depending upon which tool you use, you can see pretty big differences between even the same genome called with different tools—nearly as big as the two Life Tech/Illumina genomes.”
Mark Yandell in BioIT-World.com, June 8, 2011
![Page 18: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/18.jpg)
Examples
Missing variants
SNPs, a DNP and deletions
![Page 19: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/19.jpg)
![Page 20: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/20.jpg)
Identify more valid variants
![Page 21: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/21.jpg)
Find homopolymer indels
![Page 22: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/22.jpg)
Examples
Coverage differences
![Page 23: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/23.jpg)
Four times exon coverage
Coverage scale: [0-432] vs [0-96]
![Page 24: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/24.jpg)
Higher exome coverage
Coverage scale: [0-24] vs [0-10]
![Page 25: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/25.jpg)
First conclusion
Read accuracy is not the limiting factor in accurate variant analysis.
![Page 26: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/26.jpg)
Example
Dense region of SNPs
![Page 27: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/27.jpg)
www.omixon.com
![Page 28: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/28.jpg)
Second conclusion
As variant density increases the performance of most tools goes down.
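One way to locate the regions where this effect is likely to matter is to count variants in fixed-size windows. The following is a minimal sketch; the window size, the threshold, and the positions are arbitrary illustrative values, not figures from the slides.

```python
from collections import Counter


def variants_per_window(positions, window=100):
    """Count variant positions falling in each fixed-size window.

    Returns a dict mapping window start coordinate to variant count.
    """
    counts = Counter((pos // window) * window for pos in positions)
    return dict(sorted(counts.items()))


def dense_windows(positions, window=100, threshold=3):
    """Window starts whose variant count reaches the threshold."""
    return [
        start
        for start, count in variants_per_window(positions, window).items()
        if count >= threshold
    ]


# Hypothetical variant positions on one chromosome.
positions = [105, 110, 118, 121, 560]

print(variants_per_window(positions))
print(dense_windows(positions))
```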
![Page 29: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/29.jpg)
Variant Calling
There are a few popular variant callers: GATK, SAMtools mpileup, VarScan
The most comprehensive (GATK) has a whole pipeline, including a quality recalibration step and an indel realignment step
Running these recalibration and realignment steps before any variant calling is highly recommended
Deduplication and removing non-primary alignments may also be required
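The last point can be sketched with the FLAG field of SAM records, where bit 0x100 marks a secondary (non-primary) alignment and bit 0x400 marks a duplicate. The sketch assumes duplicates have already been marked by an upstream tool and works on plain SAM text lines; the function names and example records are illustrative, and a production pipeline would normally use an established toolkit instead.

```python
SECONDARY = 0x100  # SAM flag bit: secondary (non-primary) alignment
DUPLICATE = 0x400  # SAM flag bit: PCR or optical duplicate


def keep_alignment(flag):
    """True if the record is neither secondary nor marked as duplicate."""
    return not (flag & (SECONDARY | DUPLICATE))


def filter_sam(lines):
    """Keep header lines and alignment records that pass keep_alignment."""
    kept = []
    for line in lines:
        if line.startswith("@"):
            kept.append(line)  # header lines pass through unchanged
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        if keep_alignment(flag):
            kept.append(line)
    return kept


# Hypothetical SAM content: one header line and three records.
record = "{name}\t{flag}\tchr17\t100\t60\t10M\t=\t200\t110\tACGTACGTAC\tIIIIIIIIII"
sam_lines = [
    "@SQ\tSN:chr17\tLN:1000",
    record.format(name="r1", flag=99),    # primary, not a duplicate
    record.format(name="r2", flag=355),   # 99 + 0x100: secondary
    record.format(name="r3", flag=1123),  # 99 + 0x400: duplicate
]

for line in filter_sam(sam_lines):
    print(line.split("\t")[0])
```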
![Page 30: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/30.jpg)
Indel realigner problem
![Page 31: Considerations for Analyzing Targeted NGS Data Introduction Tim Hague, CTO.](https://reader030.fdocuments.net/reader030/viewer/2022032802/56649e005503460f94ae8bd0/html5/thumbnails/31.jpg)
Variants that can be hard to find
DNPs
TNPs
Small indels next to SNPs
30+ bp indels
Homopolymer indels
Homopolymer indel and SNP together
Indels in palindromes
Dense regions of variants
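When checking how a tool handles these cases, it helps to label each variant by type first. The sketch below classifies a variant from its reference and alternate alleles; it assumes that equal-length substitutions are named purely by length and uses the 30 bp cut-off from the list above for large indels. The function name and labels are illustrative assumptions.

```python
def classify_variant(ref, alt):
    """Label a variant from its reference and alternate allele strings.

    Assumption: equal-length substitutions are named by length
    (1 = SNP, 2 = DNP, 3 = TNP, longer = MNP); indels of 30 bp or more
    are labelled as large.
    """
    if len(ref) == len(alt):
        return {1: "SNP", 2: "DNP", 3: "TNP"}.get(len(ref), "MNP")
    size = abs(len(ref) - len(alt))
    kind = "insertion" if len(alt) > len(ref) else "deletion"
    return ("large " if size >= 30 else "small ") + kind


# Hypothetical alleles covering several of the categories above.
examples = [
    ("A", "G"),
    ("AC", "GT"),
    ("ACG", "TTA"),
    ("G", "GA"),
    ("A" + "C" * 30, "A"),
]

for ref, alt in examples:
    print(ref[:5], alt[:5], classify_variant(ref, alt))
```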