Jan2016 dnanexus giab uses andrew carroll
The Global Network For Genomic Medicine™
Using Genome in a Bottle Data
Andrew Carroll, PhD
Director of Science
What is DNAnexus
Genomic Analysis in the Cloud. Scalable, Cost Effective, Secure, Compliant.
What’s in the Talk
• GIAB in PrecisionFDA
• Datasets on DNAnexus
• Example 1: Comparing mapper + variant caller combinations
• Example 2: Assessing structural variation in AJ-Trio
https://precision.fda.gov/
Public X-Ten Data on DNAnexus
Benchmarking well-known bioinformatics aligners and variant callers using the Pilot Genome (NA12878)
A. B. Diallo, A. Carroll, B. Hannigan, M. Kinsella, S. Ma, N. Thangaraj
DNAnexus, Mountain View, CA. Email: [email protected]
Mappers
• BWA maps sequences against a large reference genome, such as the human genome. It performs very well on low-divergence sequences or reads.
• Bowtie2 is a memory-efficient tool for aligning sequencing reads to long reference sequences. It performs extremely well for sequence lengths between 50 bp and 1000 bp.
• ISAAC, developed by Illumina, is a DNA sequence aligner and variant caller suite that uses high-memory hardware to improve efficiency and accuracy.
• SNAP is a relatively new aligner, as accurate as existing tools such as BWA-MEM, Bowtie2 and Novoalign. It was developed by a team from the UC Berkeley AMP Lab, Microsoft, and UCSF.
Variant Callers
• Atlas is a variant caller known for differentiating genuine SNPs and indels from sequencing and mapping errors. It is mainly used for whole-exome data.
• FreeBayes is a haplotype-based Bayesian genetic variant caller designed to find small polymorphisms, specifically SNPs, indels, MNPs and complex events smaller than the length of a short-read sequencing alignment.
• GATK HaplotypeCaller is one of the most popular variant callers. It calls SNPs and indels simultaneously using local de novo assembly and a Bayesian statistical model.
• ISAAC, developed by Illumina, is a DNA sequence aligner and variant caller suite that uses high-memory hardware to improve efficiency and accuracy.
• Platypus is an efficient variant detection tool that can detect SNPs, MNPs, short indels and replacements up to several kb.
One Example Analysis
[Figure: SNPs, sensitivity by variant caller (Atlas, Freebayes, GATK, ISAAC, Platypus). Y-axis label: Percentage; range 0.80 to 1.00.]
[Figure: SNPs, specificity by variant caller (Atlas, Freebayes, GATK, ISAAC, Platypus). Y-axis label: Percentage; range 0.93 to 1.00.]
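The slides do not say how sensitivity and specificity were computed. As a minimal sketch, assuming each call set is compared to a truth set such as the GIAB NA12878 calls by exact match on (chromosome, position, ref, alt), the counts could be turned into metrics as below. Precision is used here as a stand-in for the "specificity" axis, since true negatives are not well defined for variant calls; that substitution, the function name `compare_calls`, and the toy variants are all assumptions for illustration, not the poster's method or data.

```python
from typing import NamedTuple, Set, Tuple

# A variant is keyed by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


class Metrics(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    sensitivity: float
    precision: float


def compare_calls(truth: Set[Variant], calls: Set[Variant]) -> Metrics:
    """Compare a call set to a truth set by exact variant match (hypothetical helper)."""
    tp = len(truth & calls)   # called and in the truth set
    fp = len(calls - truth)   # called but not in the truth set
    fn = len(truth - calls)   # in the truth set but not called
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return Metrics(tp, fp, fn, sensitivity, precision)


# Toy data, made up for illustration only.
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 50, "G", "A"),
    ("chr2", 75, "T", "C"),
}
calls = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 50, "G", "A"),
    ("chr3", 10, "A", "T"),
}

metrics = compare_calls(truth, calls)
print(f"sensitivity={metrics.sensitivity:.2f} precision={metrics.precision:.2f}")
```

Exact matching is the simplest choice; real comparisons usually also restrict to high-confidence regions and normalize variant representation first, which this sketch leaves out.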
[Figure: SNPs, average sensitivity and specificity by mapper (Bowtie, BWA, ISAAC, SNAP). Y-axis label: Percentage; range 0.89 to 1.01.]
[Figure: Indels, sensitivity by variant caller (Atlas, Freebayes, GATK, ISAAC, Platypus). Y-axis range 0.00 to 1.00.]
[Figure: Indels, specificity by variant caller (Atlas, Freebayes, GATK, ISAAC, Platypus). Y-axis label: Percentage; range 0.00 to 0.90.]
[Figure: Mappers CPU-hours (Bowtie, BWA, ISAAC, SNAP). Y-axis: CPU-hours; range 0 to 350.]
[Figure: Variant callers CPU-hours (Atlas, Freebayes, GATK, ISAAC, Platypus). Y-axis: CPU-hours; range 0 to 500.]
Use of AJ-Trio to Understand SV
Baylor College of Medicine
Characterizing large genomic variants is essential to expanding the research & clinical applications of genome sequencing.
Adam English
Will Salerno
Narayanan Veeraraghavan
Singer Ma
Andrew Carroll
Pipeline Schematic
Development through Orthogonal Technology
Development through Orthogonal Technology
GIAB Inheritance Benchmarks
DNAnexus is working actively with Genome in a Bottle to help develop high-quality benchmark datasets for structural variation in the Ashkenazi Jewish Trio, applying Parliament to combine Illumina and PacBio data alongside other techniques.
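The slide does not spell out how inheritance is used as a benchmark. One common idea with a trio is a Mendelian consistency check: a child's genotype at a site should be explainable by one allele from each parent. The sketch below assumes diploid genotypes given as pairs of allele indices; the function names and the toy sites are hypothetical, and this is not a description of how Parliament or GIAB perform the check.

```python
from typing import Dict, List, Tuple

# Diploid genotype as allele indices: 0 = reference, 1 = alternate.
Genotype = Tuple[int, int]


def is_mendelian_consistent(child: Genotype, father: Genotype, mother: Genotype) -> bool:
    """True if one child allele can come from the father and the other from the mother."""
    first, second = child
    return (first in father and second in mother) or (
        first in mother and second in father
    )


def consistency_rate(sites: List[Dict[str, Genotype]]) -> float:
    """Fraction of sites whose trio genotypes are Mendelian consistent."""
    if not sites:
        return 0.0
    consistent = sum(
        is_mendelian_consistent(s["child"], s["father"], s["mother"]) for s in sites
    )
    return consistent / len(sites)


# Toy trio genotypes, made up for illustration only.
toy_sites = [
    {"child": (0, 1), "father": (0, 0), "mother": (0, 1)},  # consistent
    {"child": (1, 1), "father": (0, 1), "mother": (0, 1)},  # consistent
    {"child": (1, 1), "father": (0, 0), "mother": (0, 1)},  # violation: father has no alt allele
    {"child": (0, 0), "father": (1, 1), "mother": (0, 1)},  # violation: father can only pass alt
]

rate = consistency_rate(toy_sites)
print(f"Mendelian consistency rate: {rate:.2f}")
```

A low consistency rate at called sites points to genotyping errors in at least one family member, which is why a trio is useful when no independent truth set exists for structural variants.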