Aug2014 abrf interlaboratory study plans
The ABRF Next Generation Sequencing Study: Multi-Platform and Cross-Methodological Reproducibility of RNA and DNA Profiling
Genome in a Bottle Consortium Workshop
August 2014
Don A. Baldwin, Ph.D.
CSO, Pathonomics LLC
ABRF is an international organization of over 700 scientists from shared research resource core facilities and biotechnology laboratories.
Members represent over 250 core labs in academic and research institutions, government, and industry.
“Yellow pages” and “MarketPlace” databases of members at www.ABRF.org
Electronic discussion group facilitates sharing of technical advice and core facility networking.
The Journal of Biomolecular Techniques covers genomics, proteomics, imaging, and other biotechnologies, and core facility operational management.
www.abrf.org
Metagenomics (MGRG)
The ABRF Next Generation Sequencing (NGS) Study:
• Produce reference data sets to establish baseline performance
• Promote the use of standard samples
• Provide public access to data for self-evaluation, performance monitoring and methods development
Phase I: RNA-Seq and degraded RNA-Seq (2011-2013)
Phase II: DNA-Seq and hard-to-sequence regions and samples (2014-2016)
Phase III: Clinical genetics sequencing panels
Phase I Study Design
Major Conclusions
Intraplatform concordance: Spearman rank R > 0.86
Interplatform concordance: R > 0.83
Base quality scores ranged from Q10 to Q60, with most variation at read starts and ends
Higher alignment rates with platform-specific algorithms vs. STAR
Higher single-base mismatch and indel rates with platform-specific algorithms vs. STAR
Wide range of efficiencies and costs for splice junction profiling
Highly similar profiles from rRNA-depleted and polyA-enriched samples
Effective analysis of degraded RNA after rRNA depletion
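The concordance figures above are Spearman rank correlations. A minimal sketch of that calculation, assuming two hypothetical vectors of per-gene expression values from two sites (the numbers are invented for illustration and are not study data):

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical expression values for the same six genes at two sites.
site_a = [120.0, 15.5, 300.2, 0.0, 48.1, 7.3]
site_b = [110.4, 20.1, 280.9, 1.2, 18.0, 9.8]
rho = spearman(site_a, site_b)
print(f"Spearman rank R = {rho:.3f}")
```

Because only ranks enter the calculation, the statistic is insensitive to platform-specific scaling of expression values, which is one plausible reason to prefer it for cross-platform comparisons.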
Funded by:
• Vendor donations of sample preparation and sequencing reagents
• Participating laboratories
• ABRF
Nature Biotechnology, September 2014
6 figures, 2 tables
39 supplementary figures, 7 supplementary tables
The ABRF NGS Study, Phase I
26 primary scientists
34 contributing scientists
21 research institutions
4.3 billion reads
447 billion nucleotides
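As a rough check on scale, the two totals above imply an average read length. This is a small arithmetic sketch only: the inputs are the rounded figures from the slide, and the result averages across platforms with very different read lengths.

```python
# Rounded totals quoted for Phase I.
total_reads = 4.3e9
total_nucleotides = 447e9

# Mean nucleotides per read across all platforms combined.
mean_read_length = total_nucleotides / total_reads
print(f"about {mean_read_length:.0f} nt per read on average")
```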
The ABRF NGS Study, Phase II
DNA sequencing topics were brainstormed and prioritized by the study consortium
Samples were chosen based on the August 2013 Genome in a Bottle Workshop
Phase II DNA sequencing aims
Reference data sets
• Intra- and inter-lab replication to model the range of performance expected under normal service laboratory conditions
Reference samples
• Easily accessible for self-evaluation by comparison to the reference data
• Standardized, stably reproduced, suitable for methods development
Immediate utility
• Performance metrics and data applicable to methods used now or in the near future by sequencing core facilities
Projects
in no particular order, with project scope and sequencing coverage to be prioritized by interest and funding:
Performance using different platforms and technical protocols
• NIST GiaB designated human genomic DNA
• Measure sequencing accuracy and coverage
Performance using damaged DNA and chimeric cell populations
• DNA from formalin-fixed, paraffin embedded cells
• Measure sequencing accuracy, coverage, and limits of detection for somatic mutations
Performance on small genomes over a range of GC content
• NIST GiaB (with FDA) designated bacterial genomic DNA
• Measure sequencing accuracy and coverage
Samples
Performance using different platforms and technical protocols
Sample ID | DNA source | Sequencing project | Breadth per replicate | Depth per replicate
A | Ashkenazim Jew PGP, maternal cell line | 1 | genome | 35x
B | Ashkenazim paternal cell line | 1 | genome | 35x
C | Ashkenazim child cell line from NIST stock | 1 | genome | 35x
Samples
Performance using damaged DNA and chimeric cell populations
Sample ID | DNA source | Sequencing project | Breadth per replicate | Depth per replicate
M | pool of FFPE DNA from mutant AcroMetrix lines #1, #2 and #3 plus Horizon Dx line #4: 1 and 4 40% each, 2 and 3 10% each by copy number | | |
Syn | Accugenomics pool of synthetic templates for the mutations in lines #1-#4; tagged, stock = 40:40:10:10 | | |
C2 | Ashkenazim child cell line from Coriell stock | 2 | exome | 100x
C2f | Ashkenazim child cell line suspension, formalin-fixed, paraffin embedded | 2 | exome | 100x
Mf0 | 100% M DNA from FFPE, spike gDNA with Syn at molarity = single copy gene in total M DNA | 2 | exome | 100x
Mf1 | 25% C2f, 75% M (each target’s copy number = 15% or 3.75%); spike gDNA with Syn = M | 2 | exome | 100x
Mf2 | 50% C2f, 50% M (targets = 10% or 2.5%); spike gDNA with Syn = M | 2 | exome | 100x
Mf3 | 75% C2f, 25% M (targets = 5% or 1.25%); spike gDNA with Syn = M | 2 | exome | 100x
Mf4 | 90% C2f, 10% M (targets = 2% or 0.5%); spike gDNA with Syn = M | 2 | exome | 100x
Mf5 | 95% C2f, 5% M (targets = 1% or 0.25%); spike gDNA with Syn = M | 2 | exome | 100x
Mf6 | 99% C2f, 1% M (targets = 0.2% or 0.05%) | | |
Mf7 | 99.5% C2f, 0.5% M (targets = 0.1% or 0.025%) | | |
Mf8 | 99.9% C2f, 0.1% M (targets = 0.02% or 0.005%) | | |
Oncogenic mutations
• BRAF V600E
• KRAS G12C
• EGFR c.2235_2249 del15
• EML4-ALK
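The target percentages listed for Mf1 to Mf8 can be reproduced from the blend ratios. The sketch below multiplies the fraction of M DNA in each blend by the share of a mutant line within M (40% or 10%) and by a factor of one half. That factor is an assumption inferred from the listed numbers (for example, 75% M giving 15% and 3.75%); the slides do not state where it comes from.

```python
# Share of each mutant line in pool M by copy number (from the sample table).
LINE_SHARE = {"lines 1 and 4": 0.40, "lines 2 and 3": 0.10}

# ASSUMPTION: inferred from the listed targets, not stated in the slides.
# It would be consistent with, for example, heterozygous mutations.
ALLELE_FACTOR = 0.5

# Fraction of M DNA in each blend; the remainder is C2f.
M_FRACTION = {
    "Mf1": 0.75, "Mf2": 0.50, "Mf3": 0.25, "Mf4": 0.10,
    "Mf5": 0.05, "Mf6": 0.01, "Mf7": 0.005, "Mf8": 0.001,
}

def expected_targets(m_fraction):
    """Expected target copy-number fraction for each group of mutant lines."""
    return {
        group: m_fraction * share * ALLELE_FACTOR
        for group, share in LINE_SHARE.items()
    }

for sample_id, m_fraction in M_FRACTION.items():
    targets = expected_targets(m_fraction)
    listed = " or ".join(f"{value * 100:g}%" for value in targets.values())
    print(f"{sample_id}: targets = {listed}")
```

Laying the series out this way makes the intended limit-of-detection ladder explicit: each step lowers the expected mutant fraction, down to 0.005% in Mf8.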
Samples
Performance on small genomes over a range of GC content
Sample ID | DNA source | Sequencing project | Breadth per replicate | Depth per replicate
Sta | Staphylococcus aureus | 3 | genome | 100x
Sae | Salmonella enterica | 3 | genome | 100x
Psa | Pseudomonas aeruginosa | 3 | genome | 100x
Cls | Clostridium sporogenes | 3 | genome | 100x
P | pooled metagenomic sample with all four bacterial genomes | 3 | genome | 100x
Small genomes project: sizes and GC content
Species | Genome (bp) | Avg % GC | Reference strain | Distributor
Staphylococcus aureus | 2.8 x 10^6 | 33 | NRS77 (NCTC 8325) | NARSA #NRS77
Salmonella enterica subsp. enterica serovar Typhimurium | 4.9 x 10^6 | 52 | LT2 | ATCC #700720
Pseudomonas aeruginosa | 6.7 x 10^6 | 67 | PAO1 | ATCC #47085
Clostridium sporogenes | 4.1 x 10^6 | 28 | Metchnikoff | ATCC #15579
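The genome sizes and the 100x depth target translate directly into sequencing yield per replicate. A minimal sketch of that arithmetic follows; the read length is a hypothetical value chosen for illustration and is not specified in the slides.

```python
import math

# Genome sizes in bp, from the small genomes table.
GENOME_BP = {
    "Sta": 2_800_000,
    "Sae": 4_900_000,
    "Psa": 6_700_000,
    "Cls": 4_100_000,
}

DEPTH = 100        # target depth per replicate, from the sample table
READ_LENGTH = 150  # ASSUMPTION: hypothetical read length in nt

def bases_needed(genome_bp, depth):
    """Total sequenced bases required to reach the target mean depth."""
    return genome_bp * depth

def reads_needed(genome_bp, depth, read_length):
    """Number of reads of a given length required, rounded up."""
    return math.ceil(bases_needed(genome_bp, depth) / read_length)

for sample_id, size in GENOME_BP.items():
    print(
        sample_id,
        f"{bases_needed(size, DEPTH):,} bases",
        f"{reads_needed(size, DEPTH, READ_LENGTH):,} reads",
    )
```

These are mean-depth estimates only. Real coverage is uneven, and the spread of GC content across the four genomes (28% to 67%) is exactly the kind of effect the project is designed to measure.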
Platforms and library methods
Platform | Project 1 samples | Project 2 samples | Project 3 samples
Illumina X10 | A, B, C, C2 | |
Illumina 1 T | A, B, C, C2 | |
Illumina NextSeq 500 | A, B, C, C2 | | Sta, Sae, Psa, Cls, P
Illumina HiSeq 2500 | A, B, C, C2 | C2, C2f, Mf0-Mf5 |
Illumina 2500 Rapid run | C for long-read scaffold | |
Illumina MiSeq | C for long-read scaffold | | Sta, Sae, Psa, Cls, P
Life Technologies Proton | A, B, C | C2, C2f, Mf0-Mf5 | Sta, Sae, Psa, Cls, P
Life Technologies PGM | | | Sta, Sae, Psa, Cls, P
Pacific Biosciences | C for long-read scaffold | | Sta, Sae, Psa, Cls, P
Qiagen GeneReader | | | Sta, Sae, Psa, Cls, P
Library protocol | Project 1 samples | Project 2 samples | Project 3 samples
Illumina Moleculo | A, B, C | |
Nextera on HiSeq | A, B, C | | Sta, Sae, Psa, Cls
NuGEN on HiSeq | A, B, C | | Sta, Sae, Psa, Cls
New England Biolabs on HiSeq | A, B, C | | Sta, Sae, Psa, Cls
Kapa on HiSeq | A, B, C | | Sta, Sae, Psa, Cls
Rubicon on HiSeq | A, B, C | | Sta, Sae, Psa, Cls
Bioo on HiSeq | A, B, C | | Sta, Sae, Psa, Cls
EXOME: Agilent Sure Select | | C2, C2f, Mf0-Mf5 |
EXOME: Roche Nimblegen SeqCap EZ | | C2, C2f, Mf0-Mf5 |
EXOME: Ampliseq Exome Panel for Proton | | C2, C2f, Mf0-Mf5 |
An ABRF – GiaB collaboration
• Get vendor commitments for technical support and reagent donations
• Extract high-quality genomic DNA from cultured cells for A, B, C, C2, Sta, Sae, Psa and Cls
• Prepare equimolar blend of bacterial DNA for pool P
• Procure somatic mutation cell lines in FFPE blocks
• Extract genomic DNA from FFPE blocks of cell suspensions, prepare blends
• Assemble platform groups with at least 3 labs per instrument or method
• Each platform group will determine a consensus protocol for library preparation and sequencing
• Distribute aliquots of DNA reference stocks to participating study labs
• Construct and/or sequence libraries (intra-lab replicates encouraged)
• Collect and annotate data in a central repository
• Analyze sequencing performance
Status legend: planned, started, complete
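For the equimolar blend in pool P, equal numbers of genome copies mean that each species contributes DNA mass in proportion to its genome size. The sketch below works that out under two assumptions: "equimolar" refers to genome copies, and mass per base pair is the same for all four species (plasmids ignored). The total pool mass is a hypothetical value.

```python
# Genome sizes in bp, from the small genomes table.
GENOME_BP = {
    "Sta": 2_800_000,
    "Sae": 4_900_000,
    "Psa": 6_700_000,
    "Cls": 4_100_000,
}

def equimolar_mass_fractions(genome_bp):
    """Mass fraction of each genome when all are present at equal copy number."""
    total_bp = sum(genome_bp.values())
    return {sample_id: size / total_bp for sample_id, size in genome_bp.items()}

fractions = equimolar_mass_fractions(GENOME_BP)

TOTAL_NG = 1000.0  # ASSUMPTION: hypothetical total mass of the pool in ng

for sample_id, fraction in fractions.items():
    print(f"{sample_id}: {fraction:.1%} of pool mass, {fraction * TOTAL_NG:.0f} ng")
```

The same fractions give the expected share of reads per species in an unbiased run, so deviations from them in the pooled data would point to extraction, library or GC-related bias.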
The ABRF NGS Study leadership group
in alphabetical order, with level of participation and devotion to be prioritized by alcoholic intake:
Name | Email | Contact regarding
Baldwin, Don | [email protected] | study design
Grills, George | [email protected] | vendor and partner relations
Mason, Chris | [email protected] | data analysis
Nicolet, Charlie | [email protected] | sequencing methods
Tighe, Scott | [email protected] | logistics