Jan2016 dnanexus giab uses andrew carroll

The Global Network For Genomic Medicine™. Using Genome in a Bottle Data. Andrew Carroll, PhD, Director of Science

Transcript of Jan2016 dnanexus giab uses andrew carroll

Page 1: Jan2016 dnanexus giab uses andrew carroll

The Global Network For Genomic Medicine™

Using Genome in a Bottle Data

Andrew Carroll, PhD

Director of Science

Page 2: Jan2016 dnanexus giab uses andrew carroll

What is DNAnexus

Genomic Analysis in the Cloud. Scalable, Cost Effective, Secure, Compliant.

Page 3: Jan2016 dnanexus giab uses andrew carroll

What’s in the Talk

• GIAB in PrecisionFDA

• Datasets on DNAnexus

• Example 1: Comparing mapper + variant caller combinations

• Example 2: Assessing structural variation in AJ-Trio

Page 4: Jan2016 dnanexus giab uses andrew carroll
Page 5: Jan2016 dnanexus giab uses andrew carroll
Page 6: Jan2016 dnanexus giab uses andrew carroll
Page 7: Jan2016 dnanexus giab uses andrew carroll
Page 8: Jan2016 dnanexus giab uses andrew carroll
Page 9: Jan2016 dnanexus giab uses andrew carroll

https://precision.fda.gov/

Page 10: Jan2016 dnanexus giab uses andrew carroll

Public X-Ten Data on DNAnexus

Page 11: Jan2016 dnanexus giab uses andrew carroll

Benchmarking well-known bioinformatics aligners and variant callers using the Pilot Genome (NA12878)

A. B. Diallo, A. Carroll, B. Hannigan, M. Kinsella, S. Ma, N. Thangaraj

DNAnexus, Mountain View, CA. Email: [email protected]

Mappers

• BWA maps sequences against large reference genomes, such as the human genome. It performs very well on low-divergence sequences or reads.

• Bowtie2 is a memory-efficient tool for aligning sequencing reads to long reference sequences. It performs extremely well on reads between 50 bp and 1,000 bp in length.

• ISAAC, developed by Illumina, is a DNA sequence aligner and variant caller that uses high-memory hardware to improve efficiency and accuracy.

• SNAP is a relatively new aligner that is as accurate as existing tools such as BWA-MEM, Bowtie2, and Novoalign. It was developed by a team from the UC Berkeley AMP Lab, Microsoft, and UCSF.

Page 12: Jan2016 dnanexus giab uses andrew carroll

Variant Callers

• Atlas is a variant caller known for differentiating genuine SNPs and indels from sequencing and mapping errors. It is mainly used for whole-exome data.

• FreeBayes is a haplotype-based Bayesian genetic variant caller designed to find small polymorphisms, specifically SNPs, indels, MNPs, and complex events smaller than the length of a short-read sequencing alignment.

• GATK HaplotypeCaller is one of the most popular variant callers. It calls SNPs and indels simultaneously using local de novo assembly and a Bayesian statistical model.

• ISAAC, developed by Illumina, is a DNA sequence aligner and variant caller that uses high-memory hardware to improve efficiency and accuracy.

• Platypus is an efficient variant detection tool that can detect SNPs, MNPs, short indels, and replacements up to several kb.

Page 13: Jan2016 dnanexus giab uses andrew carroll

One Example Analysis
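
The charts on the following slides report sensitivity and specificity for each tool against NA12878. The slides do not say how these metrics were computed, so the following is only a minimal sketch of one common way to score a call set against a truth set. It assumes variants are matched by exact (chromosome, position, ref, alt) identity, and it reports precision alongside sensitivity; whether the slides' "specificity" corresponds to precision is an assumption. The function name `score_calls` and the toy variants are hypothetical.

```python
from typing import NamedTuple, Set, Tuple

# A variant is keyed by (chromosome, 1-based position, ref allele, alt allele).
# Exact-key matching is an assumption of this sketch; real comparisons usually
# need variant normalization and restriction to confident regions first.
Variant = Tuple[str, int, str, str]


class Metrics(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    sensitivity: float  # TP / (TP + FN)
    precision: float    # TP / (TP + FP)


def score_calls(truth: Set[Variant], calls: Set[Variant]) -> Metrics:
    """Score a call set against a truth set by exact variant identity."""
    tp = len(truth & calls)   # called and present in the truth set
    fp = len(calls - truth)   # called but absent from the truth set
    fn = len(truth - calls)   # in the truth set but not called
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return Metrics(tp, fp, fn, sensitivity, precision)


# Toy data (hypothetical): four truth variants, four calls, three shared.
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 50, "G", "A"),
    ("chr2", 75, "T", "C"),
}
calls = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 50, "G", "A"),
    ("chr3", 10, "A", "T"),
}
metrics = score_calls(truth, calls)
print(metrics)
```

Running the same scoring over every mapper and caller pairing would give one row per combination, which is the kind of table the bar charts summarize.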

Page 14: Jan2016 dnanexus giab uses andrew carroll

[Chart: SNPs, sensitivity by variant caller (Atlas, FreeBayes, GATK, ISAAC, Platypus); y-axis: percentage, 0.80 to 1.00]

Page 15: Jan2016 dnanexus giab uses andrew carroll

[Chart: SNPs, specificity by variant caller (Atlas, FreeBayes, GATK, ISAAC, Platypus); y-axis: percentage, 0.93 to 1.00]

Page 16: Jan2016 dnanexus giab uses andrew carroll

[Chart: SNPs, average sensitivity and specificity by mapper (Bowtie, BWA, ISAAC, SNAP); y-axis: percentage, 0.89 to 1.01]

Page 17: Jan2016 dnanexus giab uses andrew carroll

[Chart: Indels, sensitivity by variant caller (Atlas, FreeBayes, GATK, ISAAC, Platypus); y-axis: 0.00 to 1.00]

Page 18: Jan2016 dnanexus giab uses andrew carroll

[Chart: Indels, specificity by variant caller (Atlas, FreeBayes, GATK, ISAAC, Platypus); y-axis: percentage, 0.00 to 0.90]

Page 19: Jan2016 dnanexus giab uses andrew carroll

[Chart: Mapper CPU-hours (Bowtie, BWA, ISAAC, SNAP); y-axis: CPU-hours, 0 to 350]

[Chart: Variant caller CPU-hours (Atlas, FreeBayes, GATK, ISAAC, Platypus); y-axis: CPU-hours, 0 to 500]

Page 20: Jan2016 dnanexus giab uses andrew carroll

Page 21: Jan2016 dnanexus giab uses andrew carroll

Use of AJ-Trio to Understand SV

Page 22: Jan2016 dnanexus giab uses andrew carroll

Baylor College of Medicine

Characterizing large genomic variants is essential to expanding the research & clinical applications of genome sequencing.

Adam English

Will Salerno

Narayanan Veeraraghavan

Singer Ma

Andrew Carroll

Page 23: Jan2016 dnanexus giab uses andrew carroll

Pipeline Schematic

Page 24: Jan2016 dnanexus giab uses andrew carroll

Development through Orthogonal Technology

Page 25: Jan2016 dnanexus giab uses andrew carroll

Development through Orthogonal Technology

Page 26: Jan2016 dnanexus giab uses andrew carroll

GIAB Inheritance Benchmarks

DNAnexus is working actively with Genome in a Bottle to help develop high-quality benchmark datasets for structural variation in the Ashkenazi Jewish Trio, applying Parliament to combine Illumina and PacBio data alongside other techniques.
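
The slides do not describe how inheritance is used in these benchmarks. As a rough illustration of the general idea behind trio-based checks, the sketch below tests whether a child's diploid genotype at one autosomal site can be formed by taking one allele from each parent. The function name `is_mendelian_consistent` and the genotype encoding (allele indices, 0 for reference) are assumptions made for this example, not part of Parliament or any GIAB tool.

```python
from itertools import product
from typing import Tuple

# Diploid genotype as a pair of allele indices, e.g. (0, 1) for heterozygous.
Genotype = Tuple[int, int]


def is_mendelian_consistent(child: Genotype, father: Genotype, mother: Genotype) -> bool:
    """Return True if the child genotype can arise from one paternal and one
    maternal allele. Autosomal diploid sites only; de novo events, sex
    chromosomes, and genotyping error are deliberately ignored."""
    child_sorted = tuple(sorted(child))
    return any(
        tuple(sorted((paternal, maternal))) == child_sorted
        for paternal, maternal in product(father, mother)
    )


# Hypothetical genotypes at three sites: (child, father, mother).
sites = [
    ((0, 1), (0, 0), (1, 1)),  # consistent: 0 from father, 1 from mother
    ((1, 1), (0, 0), (0, 1)),  # inconsistent: father cannot contribute allele 1
    ((0, 0), (0, 1), (0, 1)),  # consistent: 0 from each parent
]
results = [is_mendelian_consistent(c, f, m) for c, f, m in sites]
print(results)
```

A call that fails this check in the son of a trio is either an error in one of the three genotypes or a rare de novo event, which is why trios are useful for flagging candidate variants for review.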

Page 27: Jan2016 dnanexus giab uses andrew carroll