Validation of a Next-Generation Sequencing Assay for...

17
Validation of a Next-Generation Sequencing Assay for Clinical Molecular Oncology Q32 Catherine E. Cottrell,* Hussam Al-Kateb,* Andrew J. Bredemeyer,* Eric J. Duncavage,* David H. Spencer,* Haley J. Abel, y Christina M. Lockwood,* Ian S. Hagemann,* Stephanie M. OGuin, y Lauren C. Burcea, y Christopher S. Sawyer, y Dayna M. Oschwald, y Jennifer L. Stratman,* Dorie A. Sher,* Mark R. Johnson, y Justin T. Brown, y Paul F. Cliften, y Bijoy George,* Leslie D. McIntosh,* Savita Shrivastava,* TuDung T. Nguyen,* Jacqueline E. Payton,* Mark A. Watson,* Seth D. Crosby, y Richard D. Head, y Robi D. Mitra, y Rakesh Nagarajan,* Shashikant Kulkarni,* y Karen Seibert,* Herbert W. Virgin IV,* Jeffrey Milbrandt, y and John D. Pfeifer* From the Departments of Pathology and Immunology,* and Genetics, y Genomics and Pathology Services, Washington University School of Medicine, St. Louis, Missouri Accepted for publication October 1, 2013. Address correspondence to Hussam Al-Kateb, Ph.D., Q26 Washington University School of Medicine, 660 S. Euclid Ave., Box 8118, St. Louis, MO 63110. E-mail: halkateb@ path.wustl.edu. Currently, oncology testing includes molecular studies and cytogenetic analysis to detect genetic aberrations of clinical signicance. Next-generation sequencing (NGS) allows rapid analysis of multiple genes for clin- ically actionable somatic variants. The WUCaMP assay uses targeted capture for NGS analysis of 25 cancer- associated genes to detect mutations at actionable loci. We present clinical validation of the assay and a detailed framework for design and validation of similar clinical assays. Deep sequencing of 78 tumor spec- imens (1000 average unique coverage across the capture region) achieved high sensitivity for detecting somatic variants at low allele fraction (AF). Validation revealed sensitivities and specicities of 100% for detection of single-nucleotide variants (SNVs) within coding regions, compared with SNP array sequence data (95% CI Z 83.4e100.0 for sensitivity and 94.2e100.0 for specicity) or whole-genome sequencing (95% CI Z 89.1e100.0 for sensitivity and 99.9e100.0 for specicity) of HapMap samples. Sensitivity for detecting variants at an observed 10% AF was 100% (95% CI Z 93.2e100.0) in HapMap mixes. Analysis of 15 masked specimens harboring clinically reported variants yielded concordant calls for 13/13 variants at AF of 15%. The WUCaMP assay is a robust and sensitive method to detect somatic variants of clinical signicance in molecular oncology laboratories, with reduced time and cost of genetic analysis allowing for strategic patient management. (J Mol Diagn 2013, -:1e17; http://dx.doi.org/10.1016/j.jmoldx.2013.10.002) Q1 Traditional approaches to the genetic characterization of clinical oncology specimens include cytogenetic analysis, uorescence in situ hybridization (FISH), and molecular studies of single genes. These methodologies are comple- mentary to each other and generate information of diagnostic and prognostic relevance. However, as new insight is gained into the complexities of cancer at the molecular level, the need emerges to obtain a more detailed cancer genetic prole for improved patient management. As illustrated by recent studies, identifying DNA mutations in cancer may aid in understanding clonal evolution, 1 risk stratication, 2 and therapeutic strategies. 3,4 With the advent of next-generation sequencing (NGS), a more complete biological character- ization of a tumor can be attained at the molecular level. 5 Increased access to sequencing technology and a decrease in the associated costs have made it possible for clinical laboratories to develop testing strategies using NGS. Clinical tests may be targeted to a panel of genes relevant to a given phenotype or disease, or may be more broad in scope (eg, whole-exome or whole-genome analyses). To date, most NGS clinical testing has focused on the detection of consti- tutional rather than somatic sequence variation, such as that reported in neuromuscular disease, mitochondrial disorders, familial cancer syndromes, cardiomyopathy, ciliopathies, Supported by Genomics and Pathology Services, Washington University School of Medicine, St. Louis, MO. Q25 C.E.C., H.A.-K., and A.J.B. contributed equally to this work. Copyright ª 2013 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.jmoldx.2013.10.002 jmd.amjpathol.org The Journal of Molecular Diagnostics, Vol. -, No. -, - 2013 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 FLA 5.2.0 DTD ĸ JMDI284_proof ĸ 14 November 2013 ĸ 2:15 pm ĸ EO: JMD13_0016

Transcript of Validation of a Next-Generation Sequencing Assay for...

  • Q32

    Q26

    Q1

    The Journal of Molecular Diagnostics, Vol. -, No. -, - 2013

    1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162

    jmd.amjpathol.org

    63646566676869707172737475767778798081828384

    Validation of a Next-Generation Sequencing Assay forClinical Molecular OncologyCatherine E. Cottrell,* Hussam Al-Kateb,* Andrew J. Bredemeyer,* Eric J. Duncavage,* David H. Spencer,* Haley J. Abel,y

    Christina M. Lockwood,* Ian S. Hagemann,* Stephanie M. O’Guin,y Lauren C. Burcea,y Christopher S. Sawyer,y

    Dayna M. Oschwald,y Jennifer L. Stratman,* Dorie A. Sher,* Mark R. Johnson,y Justin T. Brown,y Paul F. Cliften,y Bijoy George,*Leslie D. McIntosh,* Savita Shrivastava,* TuDung T. Nguyen,* Jacqueline E. Payton,* Mark A. Watson,* Seth D. Crosby,y

    Richard D. Head,y Robi D. Mitra,y Rakesh Nagarajan,* Shashikant Kulkarni,*y Karen Seibert,* Herbert W. Virgin IV,*Jeffrey Milbrandt,y and John D. Pfeifer*

    858687

    From the Departments of Pathology and Immunology,* and Genetics,

    y Genomics and Pathology Services, Washington University School of Medicine,St. Louis, Missouri

    888990

    Accepted for publication

    C

    a

    P

    h

    91929394959697

    October 1, 2013.

    Address correspondence toHussam Al-Kateb, Ph.D.,Washington University Schoolof Medicine, 660 S. EuclidAve., Box 8118, St. Louis,MO 63110. E-mail: [email protected].

    opyright ª 2013 American Society for Invend the Association for Molecular Pathology.

    ublished by Elsevier Inc. All rights reserved

    ttp://dx.doi.org/10.1016/j.jmoldx.2013.10.002

    9899100101102103104105

    FL

    Currently, oncology testing includesmolecular studies and cytogenetic analysis to detect genetic aberrationsof clinical significance. Next-generation sequencing (NGS) allows rapid analysis of multiple genes for clin-ically actionable somatic variants. The WUCaMP assay uses targeted capture for NGS analysis of 25 cancer-associated genes to detect mutations at actionable loci. We present clinical validation of the assay and adetailed framework for design and validation of similar clinical assays. Deep sequencing of 78 tumor spec-imens (�1000� average unique coverage across the capture region) achieved high sensitivity for detectingsomatic variants at low allele fraction (AF). Validation revealed sensitivities and specificities of 100% fordetection of single-nucleotide variants (SNVs) within coding regions, compared with SNP array sequencedata (95% CIZ 83.4e100.0 for sensitivity and 94.2e100.0 for specificity) or whole-genome sequencing(95% CIZ 89.1e100.0 for sensitivity and 99.9e100.0 for specificity) of HapMap samples. Sensitivity fordetecting variants at anobserved10%AFwas100%(95%CIZ93.2e100.0) inHapMapmixes. Analysis of 15masked specimens harboring clinically reported variants yielded concordant calls for 13/13 variants at AF of�15%. TheWUCaMP assay is a robust and sensitivemethod to detect somatic variants of clinical significancein molecular oncology laboratories, with reduced time and cost of genetic analysis allowing for strategicpatient management. (J Mol Diagn 2013,-: 1e17; http://dx.doi.org/10.1016/j.jmoldx.2013.10.002)

    106

    Supported by Genomics and Pathology Services, Washington UniversitySchool of Medicine, St. Louis, MO. Q25

    C.E.C., H.A.-K., and A.J.B. contributed equally to this work.

    107108109110111112113114115116117118119120121122

    Traditional approaches to the genetic characterization ofclinical oncology specimens include cytogenetic analysis,fluorescence in situ hybridization (FISH), and molecularstudies of single genes. These methodologies are comple-mentary to each other and generate information of diagnosticand prognostic relevance. However, as new insight is gainedinto the complexities of cancer at the molecular level, theneed emerges to obtain a more detailed cancer genetic profilefor improved patient management. As illustrated by recentstudies, identifying DNA mutations in cancer may aid inunderstanding clonal evolution,1 risk stratification,2 andtherapeutic strategies.3,4 With the advent of next-generationsequencing (NGS), a more complete biological character-ization of a tumor can be attained at the molecular level.5

    stigative Pathology

    .

    A 5.2.0 DTD � JMDI284_proof � 1

    Increased access to sequencing technology and a decreasein the associated costs have made it possible for clinicallaboratories to develop testing strategies using NGS. Clinicaltests may be targeted to a panel of genes relevant to a givenphenotype or disease, or may be more broad in scope (eg,whole-exome or whole-genome analyses). To date, mostNGS clinical testing has focused on the detection of consti-tutional rather than somatic sequence variation, such as thatreported in neuromuscular disease, mitochondrial disorders,familial cancer syndromes, cardiomyopathy, ciliopathies,

    123124

    4 November 2013 � 2:15 pm � EO: JMD13_0016

    Delta:1_given nameDelta:1_surnameDelta:1_given nameDelta:1_surnameDelta:1_given nameDelta:1_surnameDelta:1_given nameDelta:1_surnameDelta:1_given nameDelta:1_surnameDelta:1_given nameDelta:1_surnameDelta:1_given nameDelta:1_surnameDelta:1_given nameDelta:1_surnameDelta:1_given nameDelta:1_surnameDelta:1_given nameDelta:1_surnameDelta:1_given nameDelta:1_surnameDelta:1_given nameDelta:1_surnameDelta:1_given nameDelta:1_surnameDelta:1_given nameDelta:1_surnameDelta:1_given nameDelta:1_surnameDelta:1_given nameDelta:1_surnamemailto:[email protected]:[email protected]://dx.doi.org/10.1016/j.jmoldx.2013.10.002http://dx.doi.org/10.1016/j.jmoldx.2013.10.002http://jmd.amjpathol.orghttp://dx.doi.org/10.1016/j.jmoldx.2013.10.002

  • Q2

    Q3

    4

    ½F1�½F1�

    print&

    web4C=FPO

    Figure 1 The Washington University Cancer Mutation Profiling(WUCaMP) gene set includes NGS analysis of 25 genes with relevance acrossmultiple tumor types. As supplements to the set, ALK and MLL are assessedby FISH for rearrangements. The choice of genes for the set was based ondirect clinical actionability of the target mutations, as determined byconsensus between pathologists and oncologists at our institution.

    Cottrell et al

    125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186

    187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247

    and familial hypercholesterolemia.6e12 Nonetheless, there isan increasing role for NGS testing to direct the managementof oncology patients. For example, KRAS mutations in co-dons 12 and 13 are observed in approximately 40% ofcolorectal cancer cases,13 and these correlate with a poorresponse to anti-EGFR antibody therapy.14 Likewise, thedetection of an EGFR exon 19 or 21 mutation in nonesmallcell lung cancer is correlated with sensitivity to EGFR tyro-sine kinase inhibitors, including gefitinib and erlotinib.15

    Concomitant NGS analysis of a select set of genes withrelevance across a broad scope of cancers increases thelikelihood of detecting rare but clinically actionable variants(such asKITmutations, which are present in

  • Q5

    Q6

    Q7

    ½F2�½F2�

    print&

    web4C=FPO

    Figure 2 Schematic view of the WUCaMP assay workflow. DNA is extracted from tumor tissue (1) derived from fresh or FFPE specimens and fragmented bysonication (2). Libraries are prepared and amplified via limited-cycle PCR (3) and enriched for WUCaMP genes by fluid phase hybridization to custom cRNAcapture reagents (4). The hybridized product is amplified (5) and sequenced on an Illumina HiSeq 2000 or Illumina MiSeq instrument (6). Paired-end reads arealigned to the genome (7), PCR duplicates are removed (8), and variant calls are made (9). Variants are annotated and classified by our internally developedCGW application, using publicly available and proprietary databases, and the case is reviewed and interpreted by a clinical genomicist for sign-out in CGW (10).A report is then issued to the medical record (11).

    NGS for Clinical Molecular Oncology

    249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310

    311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371

    individually. A series of meetings with medical oncologistsacross a range of subspecialties was used to limit thenumber of genes in the set to those viewed as having themost clinical utility at that time.

    Genes

    For 25 genes (BRAF, CHIC2, CSF1R, CTNNB1, DNMT3A,EGFR, FLT3, IDH1, IDH2, JAK2, KIT, KRAS, MAP2K2,MAPK1, MET, NPM1, NRAS, PDGFRA, PIK3CA, PTEN,PTPN11, RET, RUNX1, TP53, and WT1), all exons and 200bp flanking each exon were sequenced by NGS (Figure 1).The capture region (Supplemental Table S1) totals approx-imately 300 kb. Interphase FISH for ALK and MLL (todetect rearrangements) and for EGFR (to detect amplifica-tion) supplements the sequence information derived fromthe 25 genes. (We note that MLL has been reclassified asKMT2A, but for consistency with current clinical practicehere we use the MLL gene symbol.)

    Target Selection

    Illumina (San Diego, CA) sequencing libraries were enrichedfor the targeted 25 genes using solution-phase biotinylatedcRNA capture reagents (SureSelect; Agilent Technologies,Santa Clara, CA). The design used Agilent’s proprietary al-gorithm and synthetic process to generate the custom capturereagents for the panel. Targeted regions were based oncustom coordinates for annotated exons of the 25 genes fromNCBI RefSeq release 45, incorporating 200 bp flanking each

    The Journal of Molecular Diagnostics - jmd.amjpathol.orgFLA 5.2.0 DTD � JMDI284_proof � 1

    exon (to ensure high coverage at exon boundaries) and 1000bp upstream and downstream of each gene. The resultingcapture region comprised approximately 20% coding and80% noncoding sequence. The UCSC Genome Browser(http://genome.ucsc.edu, last accessed August 23, 2013)mappability data (both ENCODE/CRG alignability andENCODE/OpenChrom uniqueness) were used during assaydesign to ensure the uniqueness of the regions selected forcapture and to make sure that areas with possible homologyto pseudogenes would not prohibit mapping and variantdetection. Mappability scores, WUCaMP assay mappingquality data by gene and exon, and location of homologoussequences are presented in Supplemental Tables S2 and S3.PIK3CA exons 10 to 14 exhibited the greatest homology toanother genomic location (a pseudogene on chromosome 22;Supplemental Table S3) and showed as high as 33% of readswith a mapping quality score of

  • Q8

    Q9

    10

    11

    12

    13

    14

    Cottrell et al

    373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434

    435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495

    QIAamp DNA blood mini kit (Qiagen). Sample quality wasassessed using gel electrophoresis, Qubit fluorometer (LifeTechnologies, Carlsbad, CA), and NanoDrop spectropho-tometer (Thermo Fisher Scientific, Waltham, MA) readings.Requirements for acceptable genomic DNA were as fol-lows: total mass �750 ng (by Qubit fluorometry), absor-bance ratio A260/A280 at �1.7 and �2.1, and absorbanceratio A260/A230 at �0.7. In our clinical experience withthis assay, extracted DNA from approximately 30% ofspecimens falls below a 750-ng threshold. We have recentlydemonstrated identical analytical sensitivity and specificitywith DNA inputs as low as 200 ng (R.D. Head et al, un-published data); in our clinical experience, yields fall belowa 200-ng threshold for approximately 13% of specimens onwhich extraction is performed.

    Library Preparation and Amplification, TargetedCapture, and Illumina-Based Sequencing

    Genomic DNA (750 to 1000 ng) was fragmented using aCovaris S220 series sonicator (Covaris, Woburn, MA) andQC was performed using an Agilent Bioanalyzer 2100(Agilent Technologies) to ensure an average fragment size of160 to 230 bp. Fragmentation was followed by end repair, A-tailing, and sequencing adapter ligation (which included aunique nucleic acid barcode, or index) using an AgilentSureSelect library kit. The adapter-ligated DNA was ampli-fied via selective, limited-cycle PCR for a total of seven cy-cles. Prepared library (500 ng) was hybridized for 24 to 72hours to WUCaMP custom capture baits (Agilent Technol-ogies). The hybridized product was amplified for 14 PCRcycles using Agilent post-capture primers and a customindexing primer. QC was performed on the amplified productusing an Agilent Bioanalyzer HS chip to ensure that the finallibrary fragment size ranged from 260 to 600 bp; the productwas quantified using an Invitrogen Quant-iT dsDNA HSassay (Life Technologies) to ensure a yield of�5 mmol/L forsequencing. Exome sequencing was performed using tar-geted solution-phase enrichment of whole-genome shotgunsequencing libraries with SureSelect Human ALL Exon V3biotinylated cRNA capture reagents (Agilent Technologies)according to the manufacturer’s instructions.

    For paired-end 101-bp sequencing on the Illumina HiSeq2000 instrument, captured libraries were denatured andloaded onto an Illumina cBot instrument at 12 to 16 pmol/Lfor cluster generation according to the manufacturer’s in-structions. Up to 20 WUCaMP libraries were sequenced perHiSeq lane. A PhiX control (Illumina) was added to lane 8of each flowcell. For paired-end 150-bp sequencing,captured libraries were denatured and loaded onto an Illu-mina MiSeq instrument at 8 pmol/L for on-board clustergeneration and sequencing according to the manufacturer’sinstructions. Up to four WUCaMP libraries were sequencedat a time per MiSeq instrument, (with version 2 upgrade,2012). A denatured PhiX control (11 pmol/L) was added toeach sample at 1% as a control.

    4FLA 5.2.0 DTD � JMDI284_proof �

    Bioinformatics Pipeline and Data Analysis

    For analysis and interpretation, we used the following soft-ware packages (all accessed September 6, 2011): Novoalign(version 2.07.11, for alignment to the reference humangenome; Novocraft Technologies, Selangor, Malaysia),SAMtools22 (version 0.1.18-1), picard tools (version 1.53, toremove PCR duplicates; http://picard.sourceforge.net Q),vcftools (version 0.1.6, to merge VCF files; http://vcftools.sourceforge.net), BEDTools23 (to compare the VCF files),the Genome Analysis Toolkit24,25 [GATK version 1.2, forlocal realignment and base quality-score recalibration, and tocall single-nucleotide variants (SNVs) and small indel vari-ants], Integrative Genomics Viewer26 (IGV version 2.0.16 orlater, for visualization), IGV-tools27 (version 1.5.15, to indexVCF files), and our internally developed Clinical GenomicistWorkstation application (CGW version 1.0, for visualizationand interpretation). Software parameters and commands areavailable on request.All analyses were based on the human reference sequence

    UCSC build hg19 (NCBI build 37.2). SNVs with a uniquecoverage depth of 100 byFisher’s exact test were excluded from analyses. Thecoverage-depth threshold of 50� was determined empiri-cally to minimize false-positive calls while maintaining highsensitivity for the range of DNA input amounts expected inthe assay. Variants were reported according to HumanGenome Variation Society nomenclature (http://www.hgvs.org/mutnomen, last accessed QSeptember 6, 2011) and wereclassified into eight categories, based on clinical actionabilityand previously reported data in the literature. Variations foundin dbSNP (version 132; http://www.ncbi.nlm.nih.gov/projects/SNP) that have >5% minor allele frequency in at least onepopulation or that were reported by the 1000 Genomes Proj-ect28 (http://www.1000genomes.org) from the initial discoveryphase, all pilot phases, or the production phase were classifiedas known polymorphisms. Database versions current as ofSeptember 6, 2011 were used for the launch of the assay.Additional data analysis was performed using Microsoft

    Excel 2010 software Qand the spliteapplyecombine strat-egy29 and R.30 Confidence intervals were calculated online(http://www.vassarstats.net/clin1.html, last accessed October4, 2013 Q). Graphing was performed using R software version2.15.1 and ggplot2 for R.31 Sanger sequence data werevisualized and analysis was aided using Mutation Surveyorsoftware (SoftGenetics, State College, PA).

    Validation Specimens and Assessment of AnalyticalPerformance Characteristics

    For validation of the WUCaMP assay, we tested 78 tumorspecimens, including 67 solid tumors derived from formalinfixed, paraffin embedded (FFPE) samples and 11 hemato-logic malignancy (acute myeloid leukemia) specimens

    Qderived from involved blood or bone marrow, and threeHapMap cell-line DNA samples. Analyses were performed

    jmd.amjpathol.org - The Journal of Molecular Diagnostics

    496

    14 November 2013 � 2:15 pm � EO: JMD13_0016

    http://picard.sourceforge.nethttp://vcftools.sourceforge.nethttp://vcftools.sourceforge.nethttp://www.hgvs.org/mutnomenhttp://www.hgvs.org/mutnomenhttp://www.ncbi.nlm.nih.gov/projects/SNPhttp://www.ncbi.nlm.nih.gov/projects/SNPhttp://www.1000genomes.orghttp://www.vassarstats.net/clin1.htmlhttp://jmd.amjpathol.org

  • NGS for Clinical Molecular Oncology

    497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558

    559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619

    in a masked fashion for 15 of the 78 tumor specimens, asdescribed below. Extracted DNA was prepared andsequenced multiple times for some specimens, to assessreproducibility (38 specimens were sequenced twice and 1specimen was sequenced four times), yielding a total of 119tumor specimen sequencing data sets from solid-tumor andhematologic specimens. For 16 of the FFPE specimens,matched fresh-frozen tissue was also tested for comparison.In addition to the 78 validation tumor specimens, afterlaunch of the assay 51 routine clinical NGS specimens weretested in parallel by Sanger sequencing in a comparativeanalysis (described below).

    Quality and depth-of-coverage metrics were measuredacross all 119 tumor specimen data sets, to establish anacceptable reference range for key measures (includingpercentage of reads mapped to the reference sequence,percentage of reads mapped to the target region, and numberof unique on-target reads).

    Analytical performance characteristics were assessedusing HapMap and clinical specimens. For accuracy and tocalculate analytic sensitivity and specificity, variant callsmade by our pipeline for HapMap specimen NA19240 werecompared with publicly available data on that specimenfrom two orthogonal platforms, the Illumina Omni 2.5Mmicroarray (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/working/20101206_hapmap_omni_results)32 and whole-genome sequence data from Complete Genomics (ftp://ftp2.completegenomics.com/YRI_trio/ASM_Build37_2.0.0/NA19240).33 To assess precision, including repeatability andreproducibility of the assay, we measured concordance be-tween repeat runs, between different lanes of the IlluminaHiSeq instrument, between instruments, and between samplesprepared by different technicians.

    Accuracy was also assessed using parallel Sanger testingin routine clinical NGS samples. PCR amplicons (368primer pairs, synthesized by Integrated DNA Technologies,Coralville, IA; sequences are available on request) weredeveloped to span the entire coding region, nearly 60 kb, ofthe WUCaMP gene set (with the exception of exon 1 ofWT1, for technical reasons). The targeted amplicon size was250 to 500 bp, and exons larger than the targeted ampliconsize were covered by multiple amplicons with 50- to 100-bpoverlap, to ensure complete high-quality coverage. Smallexons separated by short intronic regions were covered by asingle amplicon if possible. All primers were ordered withthe appropriate sense and antisense M13 tail sequences, toensure full-length Sanger reads; when used to verify NGSresults, all amplicons were sequenced with bidirectionalcoverage or in duplicate with unidirectional coverage.

    The limit of detection of the assay was assessed usingpreanalytic and in silico mixtures of two HapMap samples,NA18507 and NA19129, to validate the detection of variantsat low variant allele fractions (VAFs) and coverage depths.The two samples were first sequenced individually to establisha gold-standard set of variants using two established variantcalling tools, SAMtools,22 and GATK.24,25 For NA18507 and

    The Journal of Molecular Diagnostics - jmd.amjpathol.orgFLA 5.2.0 DTD � JMDI284_proof � 1

    NA19129, respectively, 277 and 305 calls were concordantbetween the two variant callers and present at a uniquecoverage depth of >50�, including 33 and 36 variants,respectively, in coding regions. Positions with low coverage(

  • Q15

    ½F3�½F3�

    0 500 1000 1500 2000 2500 3000 3500

    0.0

    0.2

    0.4

    0.6

    0.8

    1.0

    Minimum Unique Coverage Depth

    Perc

    ent o

    f Pos

    ition

    s

    Figure 3 Distribution of unique coverage depth across the full WUCaMPcapture region. The percentage of targeted WUCaMP positions (includingboth coding and flanking intronic sequence) that achieve unique coveragedepth on the HiSeq instrument greater than or equal to that shown on the xaxis is plotted. Rectangles (dashed lines) indicate the unique coveragedepth achieved at 95% of positions and at 50% of positions (medianunique coverage). On the y-axis scale, 1.0 indicates 100%. Q22

    Cottrell et al

    621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682

    683684685686687688689690691692693694695696697698699700701702703704705706707708709710711712713714715716717718719720721722723724725726727728729730731732733734735736737738739740741742743

    index (ie, reads with an index sequence that was not matchedto any of the 96 indices used in our laboratory) were dis-carded from the analysis. To assess for possible crossoverbetween indexed samples on the instrument during multiplexsequencing, libraries from HapMap genomic DNA samplesNA18507 and NA19240 were generated side by side,enriched for exome targets as described above, and pooled inthe same lane for multiplex sequencing on a HiSeq 2000instrument. All positions in the targeted regions of our panelwith unique homozygous nonreference genotypes inNA19240 relative to NA18507 were identified (nZ 15), andthe base counts at those positions in sequence data fromNA18507 were generated. The converse analysis was alsoperformed, querying positions in the data from NA19240 forwhich NA18507 had unique homozygous nonreference ge-notypes (n Z 11). A similar analysis was performed forwhole-exome data.

    QC

    Data collected during the assay validation phase wereapplied toward the development of QC metrics for use inroutine clinical specimen testing. These data were derivedfrom the 78 tumor specimens described above, which wereconsidered representative of the type to be encountered inour standard workflow. A QC report consists of three parts.

    Level 1. Specimen-level sequencing metrics are param-eters indicating overall sequencing quality and coverage forthe specimen, including total reads, percent mapped, num-ber of on-target reads, percentage of reads on target, per-centage of unique on-target reads (ie, not potential PCRduplicates), and percentage of targeted positions with�50�, �400�, and �1000� unique coverage.

    Level 2. The exon-level coverage metric comprises a listof exons in the capture region that did not achieve 50�unique coverage at 95% or more of positions (includinggene name and exon number). Low coverage of such exonsis declared in the clinical report.

    Level 3. The clinically actionable variants metric com-prises a list of curated nucleotide positions with knownclinically actionable variants that did not achieve 50�unique coverage.

    Results

    Overview

    WUCaMP is a high-coverage, NGS-based clinical test fordetection of somatic mutations in cancer. DNA is extractedfrom tumors, fragmented, amplified via limited-cycle PCR,and subjected to solution-phase enrichment of the exons andflanking intronic sequence of 25 target genes (totalingw300kb), using custom-designed biotinylated cRNA capture re-agents (SureSelect; Agilent Technologies). Paired-end, 101-bp or 150-bp sequencing is performed on an Illumina HiSeq2000 or an Illumina MiSeq instrument, respectively. In

    6FLA 5.2.0 DTD � JMDI284_proof �

    routine practice, data are analyzed to generate an automateddraft clinical report, which is edited and finalized by aWashington University clinical genomicist and returned tothe ordering physician. Interphase FISH for ALK, MLL, andEGFR accompanies the NGS analysis for detection of generearrangements and copy number variation at these loci.Selection of tumor tissue for analysis by a Board-certified

    anatomical pathologist and great depth of coverage are in-tegral to the WUCaMP assay. Detection of somatic variantsin cancer is most straightforward when only tumor cellDNA is sampled, such as in acute leukemias with high blastpercentages. In solid tumors, however, the malignant cellsare always surrounded by supporting stromal cells, inflam-matory cells, and benign parenchymal tissue elements,resulting in the generation of sequence derived from bothtumor and nontumor DNA that includes both constitutionaland somatic variants. For the WUCaMP assay, a pathologistreviews H&E-stained slides cut from tumor specimens andmarks the areas of greatest tumor cellularity and viability, tominimize contamination from nontumor cell DNA. Evenwhen areas of very high tumor cellularity are sampled,tumor heterogeneity can result in a range of observed AFs.The deep coverage, averaging approximately 1018� uniquecoverage across the WUCaMP assay capture region for 78validation tumor specimens (Figure 3), allows detection ofvariants at low AF resulting from low tumor cellularity ortumor heterogeneity, as described below.Analysis and interpretation of sequence data and variants

    detected in the 25 WUCaMP genes are managed by CGW,our custom Web-based application. The CGW applicationmanages order intake, including case number, patient andspecimen details, and nucleic acid index; facilitates thebioinformatics analyses for base calling, read mapping, and

    jmd.amjpathol.org - The Journal of Molecular Diagnostics

    744

    14 November 2013 � 2:15 pm � EO: JMD13_0016

    http://jmd.amjpathol.org

  • Q16

    ½T1�½T1�

    ½T2�½T2�

    ½F4�½F4�½F5�½F5�

    Table 1 Average Quality and Depth-of-Coverage Metrics across Validation Samples on Illumina Platforms

    InstrumentSample typeand number Total reads (no.) Mapped (%)

    On target(%) On-target reads (no.)

    On target,unique (%)

    On-target reads, unique(no.)

    HiSeq FFPE (n Z 99) 13.9 � 106 � 4.8 � 106 98.6 � 1.5 42.7 � 11.8 6.0 � 106 � 3.0 � 106 56.2 � 20.3 2.9 � 106 � 1.0 � 106HiSeq Fresh frozen

    and cell line(n Z 64)

    13.5 � 106 � 4.5 � 106 99.1 � 0.5 42.6 � 8.6 5.77 � 106 � 2.6 � 106 67.7 � 11.9 3.7 � 106 � 1.3 � 106

    MiSeq FFPE (n Z 4) 6.1 � 106 � 0.2 � 106 96.3 � 1.0 47.6 � 7.6 2.8 � 106 � 0.5 � 106 80.5 � 2.6 2.3 � 106 � 0.4 � 106

    Data are expressed as means � SD. Q27

    NGS for Clinical Molecular Oncology

    745746747748749750751752753754755756757758759760761762763764765766767768769770771772773774775776777778779780781782783784785786787788789790791792793794795796797798799800801802803804805806

    807808809810811812813814815816817818819820821822823824825826827828829830831832833834835836837838839840841842843844845846847848849850851852853854855856857858

    variant calling using various tools and filters; incorporatesgenomic annotation, including dbSNP and the Catalogue OfSomatic Mutations In Cancer (COSMIC) database (http://cancer.sanger.ac.uk/cancergenome/projects/cosmic, last ac-cessed October 30, 2013); applies custom clinical-gradeannotations curated by Genomics and Pathology Services;provisionally classifies each variant into one of the eightvariant levels; and inserts appropriate curated interpretationsinto a draft clinical report in the context of the constellationof variants and clinical indication. A clinical genomicist re-views specimen-level and exon-level QC data, inspectsvariants and position-level coverage data in the context ofannotation resources using the CGW gene viewer, assessesthe medical relevance of the identified variants with assis-tance from the available curated interpretations, and editsand finalizes the clinical report for sign-out.

    To validate the WUCaMP assay, we assessed analyticalperformance characteristics on 81 unique validation samples(78 tumor specimens and 3 HapMap21 DNA cell lines) andperformed a complete clinical validation for the detection ofSNVs.

    Establishment of Quality Metrics and Reportable Range

    We measured quality and depth-of-coverage metrics acrossall clinical validation specimen data sets to establish accept-able run-level QC parameters, including the percentage ofreads mapped to the reference sequence, the percentage ofmapped reads that mapped specifically to the target region,and the number of unique on-target reads (Table 1). On theIllumina HiSeq instrument, the average total number of readsacross 99 FFPE validation tumor specimen data sets was 13.9million (SD Z 4.8 million). An average of 98.6% (SD Z1.5%) of the bases were mapped, and an average of 42.7%(SDZ 11.8%) of the mapped reads were on-target (ie, withinthe w300-kb capture region). On average, validation speci-mens yielded 2.9 million unique on-target reads (SD Z 1.0

    Table 2 Percent of Capture Region That Met Unique on-Target Covera

    Instrument Sample type and number

    HiSeq FFPE (n Z 99)HiSeq Fresh frozen and cell line (n Z 64)MiSeq FFPE (n Z 4)

    The Journal of Molecular Diagnostics - jmd.amjpathol.orgFLA 5.2.0 DTD � JMDI284_proof � 1

    million), representing approximately 21% of the average totalreads per case.

    The read and coverage statistics were similar for se-quencing runs of libraries prepared from fresh-frozen tissuespecimens, blood or bone marrow specimens, or cell lines.Similar numbers of on-target reads were achieved fromformalin-fixed and unfixed sources of DNA. These dataindicate that formalin-fixed tissues are suitable for NGS-based analyses, as documented elsewhere.34

    The percentage of base positions at which high-qualityunique on-target reads were achieved at depths of 50�,400�, and 1000� was calculated (Table 2). For an averageFFPE specimen sequenced on the HiSeq instrument, 96.9%(SD Z 1.3%) of positions were covered at a depth of�50�, 82.6% (SD Z 12.2%) at a depth of �400�, and46.2% (SD Z 23.0%) at a depth of �1000�. With theMiSeq instrument, distribution of coverage was nearlyidentical to that of the HiSeq.

    Quality metric data obtained from the assay validationphase were used to establish a QC report for use in thestandard workflow for clinical specimen testing. This allowedfor an evaluation of the data quality generated from eachclinical sample in comparison with the aggregate validationdata metrics. Indicators including specimen-level sequencingmetrics, exon-level coverage metrics, and coverage at clini-cally actionable nucleotide positions were assessed for eachclinical sample. If the quality of a given clinical sample wasdeemed inadequate (because of low coverage), the assay wasrepeated to improve this outcome.

    To determine the reportable range, defined as the region ofthe genome in which sequence of an acceptable quality can bederived, average depth of unique coverage for each base po-sition in each exon of the 25WUCaMP genes was determinedfor all validation tumor specimen data sets (n Z 119). Highcoverage on a gene-by-gene (Figure 4) and an exon-by-exon(Figure 5) basis was present throughout the 25-gene captureregion. A limited number of exons (largely the GC-rich first

    ge Depth Thresholds on Illumina Platforms

    Capture region that met threshold (%)

    �50� �400� �1000�96.9 � 1.3 82.6 � 12.2 46.2 � 23.097.4 � 0.6 86.7 � 3.5 52.4 � 24.395.1 � 0.4 82.5 � 1.2 48.4 � 1.3

    7

    859860861862863864865866867868

    4 November 2013 � 2:15 pm � EO: JMD13_0016

    http://cancer.sanger.ac.uk/cancergenome/projects/cosmichttp://cancer.sanger.ac.uk/cancergenome/projects/cosmichttp://jmd.amjpathol.org

  • ½T3�½T3�

    ½T4�½T4�

    ½T5�½T5�

    print&

    web4C=FPO

    Figure 4 Distribution of unique coverage depth across the 25 genes inthe WUCaMP panel. Unique coverage data across 119 validation tumorspecimen data sets from HiSeq sequencingQ23 are plotted by gene. Each boxrepresents the interquartile range, with the midline as the median uniquecoverage; whiskers represent exon coverage for a given sequencing runwithin 2 SD of the median. Outlier exons for a sequencing run are plotted asindependent dots.

    print&web4C=FPO

    Figure 5 Distribution of unique coverage depth across exons in JAK2,one gene of the WUCaMP panel. Unique coverage data across 119 validationtumor specimen data sets from HiSeq sequencing are plotted by exon.Boxewhisker plots are defined as for Figure 4, except that unique coveragelevel is considered by position rather than averaged across an exon. The redhorizontal line near the x axis indicates 50� unique coverage. JAK2coverage was slightly below average, relative to other WUCaMP genes (datanot shown). Q24

    Cottrell et al

    869870871872873874875876877878879880881882883884885886887888889890891892893894895896897898899900901902903904905906907908909910911912913914915916917918919920921922923924925926927928929930

    931932933934935936937938939940941942943944945946947948949950951952953954955956957958959960961962963964965966967968969970971972973974975976977978979980981982983984985986987988989990991

    exon of several genes; Supplemental Tables S2 and S3) didnot achieve 50� unique coverage at 95% of exonic base-pairpositions across validation samples. Those regions includedexon 1 in BRAF, EGFR, FLT3, IDH2, KRAS, MAP2K2,MAPK1, MLL, PIK3CA, PTPN11, RET, and WT1, and alsoexons 2 and 8 in theDNMT3A gene, representing 14 of the 393exons (3.6%) contained in our target capture region.

    Accuracy and Analytical Sensitivity and Specificity

    Demonstrating accuracy across the large number of clini-cally important positions sequenced in the approximately300-kb WUCaMP capture region was not feasible usingclinical cancer specimens. Unlike those of clinical speci-mens, the genomes of cell lines from the HapMap Projecthave been thoroughly characterized by a number of geno-typing and sequencing methods, providing ample referencedata to validate SNV detection over a range of coveragesand sequence contexts. SNV calls made by our pipeline forHapMap cell line NA19240 were therefore compared withpublicly available data on NA19240 from two orthogonalplatforms, the Illumina Omni 2.5M microarray (Table 3)and whole-genome sequence data from Complete Genomics(Table 4), the latter of which allowed for comparison of themajority of sites in our capture region.

    Unfiltered coding region variant calls from the WUCaMPpipeline are presented in Supplemental Table S4. Referencebase calls and homozygous and heterozygous nonreferenceSNV calls were extracted from the Omni 2.5M microarrayand Complete Genomics data on NA19240 and werecompared with SNV calls made by our WUCaMP pipelinefor eight independent NA19240 library preparations

    8FLA 5.2.0 DTD � JMDI284_proof �

    sequenced across three separate HiSeq runs. Minimumsensitivity, specificity, and positive predictive value (PPV)calculated across these eight preparations from the Omni2.5M and Complete Genomics comparisons are reported inTables 3 and 4, respectively. For the latter comparison, a totalof 288,955 sites were compared, 302 of which were callednonreference (homozygous or heterozygous) by CompleteGenomics. The sensitivity of WUCaMP across our captureregion, as assessed using Complete Genomics as the goldstandard, was�98.3% [95% confidence interval (95% CI)Z95.9e99.4] across runs, and the specificity was 100.0% (95%CI Z 99.9e100.0). The PPV across the capture regionranged from 95.5% (95% CI Z 92.4e97.4) to 97.7% (95%CI Z 95.1e99.0) across runs.Because only approximately 20% of our capture region

    includes coding sequence, which generally has higher co-verage (because of greater sequence complexity and morecomplete tiling with capture baits) and contains the vastmajority of the clinically relevant loci, we also assessedthese metrics limited to the coding region sequence. Thisanalysis showed coding region sensitivity, specificity, andPPV were 100% (95% CI Z 89.1e100.0 for sensitivity andPPV and 95% CI Z 99.9e100.0 for specificity) when usingComplete Genomics as the gold standard, with >57,600sites compared (Table 4).To extend the assessment of the accuracy of the assay to

    routine clinical samples, a concurrent testing strategy wasused for 51 WUCaMP cases. Coding regions containinghotspots for mutations associated with clinical outcomes incertain cancers were PCR-amplified and sequenced by theSanger method. This analysis identified 56 coding variants,including germline polymorphisms and somatic variants(Table 5). All 56 coding variants were also identified byNGS in the WUCaMP assay, indicating that the WUCaMPassay is at least as sensitive as Sanger sequencing. Of thevariant calls made by NGS, seven were not made by Sangersequencing. Of these, no result was obtained for six variantpositions, because of amplification failure (likely attributable

    jmd.amjpathol.org - The Journal of Molecular Diagnostics

    992

    14 November 2013 � 2:15 pm � EO: JMD13_0016

    http://jmd.amjpathol.org

  • ½T6�½T6�

    F6�F6�

    Table 3 Minimum SNV Analytic Sensitivity, Specificity, and Positive Predictive Value Calculated for WUCaMP Illumina Platforms versusPositions Genotyped by the Illumina Omni 2.5M Array

    Instrument Region assessedSensitivity (%)[TP/(TPþFN)] 95% CI

    Specificity (%)[TN/(TNþFP)] 95% CI

    PPV (%)[TP/(TPþFP)] 95% CI

    HiSeq Entire capture region 95.9 (118/123) 90.3e98.5 99.6 (273/274) 97.7e99.9 99.2 (118/119) 94.7e99.9Coding region only 100.0 (25/25) 83.4e100.0 100.0 (79/79) 94.2e100.0 100.0 (25/25) 83.4e100.0

    MiSeq Entire capture region 95.9 (118/123) 90.3e98.5 100.0 (274/274) 98.3e100.0 100.0 (118/118) 96.1e100.0Coding region only 100.0 (25/25) 83.4e100.0 100.0 (79/79) 94.2e100.0 100.0 (25/25) 83.4e100.0

    FN, false negative; FP, false positive; PPV, positive predictive value; TN, true negative; TP, true positive.

    NGS for Clinical Molecular Oncology

    9939949959969979989991000100110021003100410051006100710081009101010111012101310141015101610171018101910201021102210231024102510261027102810291030103110321033103410351036103710381039104010411042104310441045104610471048104910501051105210531054

    10551056105710581059106010611062106310641065106610671068106910701071107210731074107510761077107810791080108110821083108410851086108710881089109010911092109310941095109610971098109911001101

    to degraded DNA derived from older FFPE tissues); theseventh was likely below the limit of detection of Sangersequencing (a SNV at an estimated AF of 20%).

    Precision and Reproducibility

    To assess precision, including repeatability and reproduc-ibility of the assay, HapMap cell line NA19240was subjectedto sequencing and analyses as described underMaterials andMethods on repeat runs that differed in the technician per-forming library preparation, HiSeq lane number, instrument,or instrument run (Table 6). We calculated average codingregion variant-call concordance rates of 100.0 � 0.0% be-tween samples prepared by different technicians, and 98.7 �1.2% between samples prepared by the same technician ondifferent days. Inter-run reproducibility, in which the samelibrary preparation was sequenced on two separate runs of thesame sequencing instrument, yielded a coding region variant-call concordance of 98.8 � 1.6%. Coding region concor-dance was 99.2� 1.2%when a single library preparation wasrun on two different instruments. Concordance in the codingregions fell short of 100% in some comparisons because of asingle position that was not covered at the 50� depthrequired to make a call in our pipeline in one of the sequencedspecimens in the pairwise comparison. This position wasidentified in dbSNP as rs12872889, in the coding portion ofexon 1 of the FLT3 gene.

    Limit of Detection

    The sensitivity for detecting variants at low AF that occurbecause of tumor heterogeneity or stromal contamination

    Table 4 Minimum SNV Analytic Sensitivity, Specificity, and Positive PGenomics Whole Genome Sequence

    InstrumentRegion assessed[size (bp)]

    Sensitivity (%)[TP/(TPþFN)] 95% CI

    Spec[TN/

    HiSeq Entire capture region(306,336)

    98.3 (297/302) 95.9e99.4 100.

    HiSeq Coding region only(59,490)

    100.0 (40/40) 89.1e100.0 100.

    MiSeq Entire capture region(306,336)

    98.3 (297/302) 95.9e99.4 100.

    MiSeq Coding region only(59,490)

    100.0 (40/40) 89.1e100.0 100.

    The Journal of Molecular Diagnostics - jmd.amjpathol.orgFLA 5.2.0 DTD � JMDI284_proof � 1

    was assessed using preanalytic and in silico mixtures oftwo HapMap samples. DNAs from HapMap samplesNA18507 and NA19129 were mixed at 1:1 and 4:1 ratiosand were sequenced as described under Materials andMethods. Variants in each sample that were heterozygousand unique to that sample relative to the other (as describedunder Materials and Methods) were used to determinesensitivity, because these positions were expected to haveVAFs of 25% and 10%, respectively, for the 50% and 20%mixes.

    Review of the VAFs at heterozygous positions unique toNA18507 (n Z 95, of which 11 were coding) revealed themean observed VAF to be 23.9% in the 50% mix and themean VAF of NA19129-specific variants to be 13.1% in the20% mix, demonstrating good agreement with the expectedvalues. For the 50% mix, the sensitivity of detection for allNA18507 heterozygous unique variants meeting the vali-dated quality criteria was 100%, both across the full captureregion (n Z 90; 95% CI Z 94.9e100.0) and in coding re-gions (n Z 11; 95% CI Z 67.9e100.0) (Figure 6 ½½). For the20% mix, the sensitivity of detection for all heterozygousNA19129 variants meeting the quality criteria (n Z 109,including 14 coding; 13.1% mean observed VAF) was 92%(95% CI Z 84.5e95.9) and 93% (95% CI Z 64.2e99.6),respectively, across the capture region and in coding regions(Figure 6). Considering only variants with observed VAFsof �10% (n Z 67), the sensitivity was 100% (95% CI Z93.2e100.0). Seven and six false-positive calls were detectedin the 50% and 20% NA19129 mixes, respectively, yieldingPPVs of 98.3% (95% CI Z 96.3e99.2) and 98.5% (95%CI Z 96.5e99.3) across the full capture region. No false-positive calls were detected in the analysis of coding

    redictive Value of WUCaMP on Illumina platforms versus Complete

    ificity (%)(TNþFP)] 95% CI

    PPV (%)[TP/(TPþFP)] 95% CI

    0 (288,262/288,276) 99.9e100.0 95.5 (299/313) 92.4e97.4

    0 (57,645/57,645) 99.9e100.0 100.0 (40/40) 89.1e100.0

    0 (288,575/288,580) 99.9e100.0 98.3 (297/302) 95.9e99.4

    0 (57,700/57,700) 99.9e100.0 100.0 (40/40) 89.1e100.0

    9

    110211031104110511061107110811091110111111121113111411151116

    4 November 2013 � 2:15 pm � EO: JMD13_0016

    http://jmd.amjpathol.org

  • ½F7�½F7�

    T7�T7�

    17

    Table 5 Sanger Correlation Results

    Sanger panel, and genesand exons covered

    Casestested(no.)

    Codingvariantsidentifiedby Sanger(no.)

    Sangercodingvariantssupportedby NGS (no.)

    BreastKIT exon 11 2 1 1PIK3CA exons 10, 21TP53 exons 5, 6, 7, 8

    Colon/gastrointestinalBRAF exon 15 2 0 0CTNNB1 exon 3KIT exon 11KRAS exons 2, 3PIK3CA exons 10, 21TP53 exons 5, 6, 7

    LungBRAF exon 15 36 43 43EGFR exons 18, 19, 20, 21KIT exon 11KRAS exons 2, 3PIK3CA exons 10, 21MET exon 14

    CholangiocarcinomaBRAF exon 15 1 0 0IDH1 exon 4IDH2 exon 4KIT exon 11KRAS exon 2PIK3CA exon 10TP53 exons 5, 6, 7, 8

    HematologicQ28BRAF exon 15 2 2 2IDH1 exon 4IDH2 exon 4JAK2 exon 14KIT exons 8, 11, 17PIK3CA exon 10TP53 exons 5, 6, 7, 8

    PancreaticBRAF exon 15 9 10 10CTNNB1 exon 3KIT exon 11KRAS exon 2PIK3CA exons 5, 21TP53 exons 4, 5, 6, 7

    Total 52* 56 56

    *Of the 51 cases, 1 was tested for two Sanger panels.

    Cottrell et al

    11171118111911201121112211231124112511261127112811291130113111321133113411351136113711381139114011411142114311441145114611471148114911501151115211531154115511561157115811591160116111621163116411651166116711681169117011711172117311741175117611771178

    1179118011811182118311841185118611871188118911901191119211931194119511961197119811991200120112021203120412051206120712081209121012111212121312141215121612171218121912201221122212231224122512261227122812291230123112321233123412351236123712381239

    regions in either mix for PPVs of 100% (95% CI Z90.6e100.0 for the 50% mix and 95% CIZ 90.4e100.0 forthe 20% mix) (Figure 6).

    The sensitivity for detecting variants at low AF was alsoassessed as a function of coverage depth using in silicomixing of NA18507 and NA19129 sequencing reads andrandom downsampling, as described under Materials andMethods. Synthetic mixed samples were generated from

    10FLA 5.2.0 DTD � JMDI284_proof �

    mapped read data from NA18507 and NA19129 in pro-portions of 50%, 20%, 10%, and 2%, which resulted inmean observed VAFs of 23.8%, 10.6%, 5.8%, and 1.1%for NA18507 variants that were heterozygous and unique(n Z 95). The performance of the variant calling pipelinefor each mix, with variants binned into groups based onunique coverage levels (0 to 100�, 101 to 200�, and soon) at the variant position, is presented in Figure 7.Sensitivity was calculated based on the proportion of var-iants in a given mix and coverage bin that were detected.For variants with 25% AF (50% mix), the sensitivity was38% (95% CI Z 23.8e53.5) and 100% (95% CI Z92.7e100.0), respectively, at 100� and 200� uniquecoverage. For variants with 10% AF (20% mix), thesensitivity was 16% (95% CI Z 4.0e25.6) at 100�,reaching approximately 90% (95% CI Z 75.1e96.6) at400� and 100% (95% CI Z 79.4e100.0) at 1000� uniquecoverage (Figure 7). Sensitivity for variants with 5% and1% AF (10% and 2% mixes, respectively) was essentially0% at all coverage levels.

    Multiplexed Sequencing Validation

    Because combining multiple samples in a singlesequencing run introduces the possibility of cross-samplecontamination, we examined the fidelity of indexed librarysequencing by determining the frequency of spuriousindices (ie, index sequences other than those expected)after multiplex sequencing. We first assessed the frequencyof contamination that occurs during library preparation bypreparing libraries from two human genomic DNA samplesside by side and sequencing in separate lanes, as describedunder Materials and Methods. A total of 95 and 94spurious indices were identified in samples 1 and 2,respectively, and the frequency of reads with the mostcommonly encountered spurious index was 0.0074% and0.1822% in samples 1 and 2, respectively (Table 7 ½½). Themost frequent spurious index in sample 1 was the indexused for library preparation for sample 2 Qand vice versa,reflecting contamination during library preparation, mostlikely due to aerosol formation. However, the frequency ofdetection of spurious indices in this experiment did notexceed 0.2% of the reads, which is far below the limit ofdetection of the assay.In addition, to investigate crossover between indexed

    samples on the instrument during multiplex sequencing,we assessed base calls in multiplexed HapMap samples atpositions with unique homozygous nonreference genotypesin one sample relative to the other, as described underMaterials and Methods. Because these positions are ho-mozygous for the reference allele in one HapMap sampleand homozygous nonreference in the other, all observednonreference bases in the index with the homozygousreference genotype represent either sequencing errors orcontamination from the other HapMap sample. The analysisof positions homozygous for the reference allele in

    jmd.amjpathol.org - The Journal of Molecular Diagnostics

    1240

    14 November 2013 � 2:15 pm � EO: JMD13_0016

    http://jmd.amjpathol.org

  • ½T8�½T8�

    ½T9�½T9�

    Table 6 Reproducibility and Repeatability of the WUCaMP Assay on the Illumina HiSeq System

    Comparison

    Capture region Coding region

    Comparisons(no.)

    Variants Concordance Variants Concordance

    Mean no. SD % SD Mean no. SD % SD

    Intratechnician repeatability* 353 1.4 96.6 1.3 43 0.5 98.7 1.2 9Intertechnician reproducibilityy 353 1.4 97.3 0.4 43 0.5 100.0 0.0 4Interlane reproducibilityz 357 6.3 95.1 1.9 43 0.5 98.8 1.6 2Inter-run reproducibilityx 352 0.0 97.2 1.6 43 0.5 98.8 1.6 2Interinstrument reproducibility{ 346 3.7 96.9 1.1 43 0.4 99.2 1.2 6

    *The same technician performed multiple library preparations from the same specimen on different days; HiSeq lane also varied.yDifferent technicians performed library preparations on the same specimen, which were sequenced together in the same lane.zThe same library preparation was sequenced in two different lanes on the same HiSeq run.xThe same library preparation was sequenced on two separate HiSeq runs.{The same library preparation was sequenced on two different instruments.

    NGS for Clinical Molecular Oncology

    12411242124312441245124612471248124912501251125212531254125512561257125812591260126112621263126412651266126712681269127012711272127312741275127612771278127912801281128212831284128512861287128812891290129112921293129412951296129712981299130013011302

    13031304130513061307130813091310131113121313131413151316131713181319132013211322132313241325132613271328132913301331

    NA19240 resulted in 0/2726 (0%) reads that matchedNA18507 (Table 8). In the converse analysis, 1/1586 reads(0.06%) matched the other sample. An extended analysisacross the entire exome yielded a 0.03% rate of potentialsample crossover at homozygous reference positions in bothHapMap samples. This resulted in an estimated on-instrument crosstalk frequency of 0.03% to 0.06%. Thetotal error rate, as estimated by fraction of any nonreferencebases called at such positions, was 0.04% to 0.06% acrossthe WUCaMP capture region and 0.3% to 0.4% across theexome.

    The Journal of Molecular Diagnostics - jmd.amjpathol.orgFLA 5.2.0 DTD � JMDI284_proof � 1

    Detection of Pathogenic Variants in Known ClinicalStandards

    The sensitivity of the WUCaMP assay to detect pathogenictumor mutations was demonstrated by assaying 15 maskedclinical specimens known to harbor one or more pathogenicSNVs in the capture region, based on prior Sanger sequencingor other genotyping assay performed in a CLIA-licensedclinical laboratory. All SNVs (13/13 on the HiSeq instrumentand 4/4 on the MiSeq) were detected using our pipeline whenthe AF of these variants was �15% (Table 9). Three SNVs

    Figure 6 Low VAF detection. For all targetregions (top row) and for coding regions only(bottom row), sensitivity, false positives, and PPVare presented for one sample with a 50% mixproportion and a second sample with a 20% mixproportion. Error bars indicate the 95% binomialconfidence interval for each point estimate. Toprow: n Z 109 variants (50% mix); n Z 95 vari-ants (20% mix). Bottom row: n Z 11 variants(50% mix); n Z 14 variants (20% mix). PPV ZTP/(TPþFP).

    11

    133213331334133513361337133813391340134113421343134413451346134713481349135013511352135313541355135613571358135913601361136213631364

    4 November 2013 � 2:15 pm � EO: JMD13_0016

    http://jmd.amjpathol.org

  • 18

    Mix50%20%10%2%

    100

    200

    300

    400

    500

    600

    700

    800

    900

    1000

    1100

    1200

    020

    4060

    8010

    0

    Sensitivity vs. coverage depth

    Observed coverage (100bp bins)

    Sens

    itivi

    ty (%

    )

    print&

    web4C=FPO

    Figure 7 Sensitivity for low VAF detection as a function of coveragedepth. Synthetic mixed samples were generated from two individual Hap-Map samples in silico, with mix proportions of 50%, 20%, 10%, and 2% andmean coverage levels across the entire target region of 1000�, 750�,500�, and 250�. Each mixed sample had 95 heterozygous variants uniqueto the minor sample present at mean observed VAFs of 23.8%, 10.6%,5.8%, and 1.1%, respectively. Data indicate the sensitivity (percentdetected) for variants with observed coverage in bins of 100. Error barsindicate the 95% binomial confidence interval for each point estimate.

    Cottrell et al

    13651366136713681369137013711372137313741375137613771378137913801381138213831384138513861387138813891390139113921393139413951396139713981399140014011402140314041405140614071408140914101411141214131414141514161417141814191420142114221423142414251426

    14271428142914301431143214331434143514361437143814391440144114421443144414451446144714481449145014511452145314541455145614571458145914601461146214631464146514661467146814691470147114721473147414751476

    present at

  • Q30

    Q31

    Table 8 Evaluation of Index Cross-Talk between Two HapMap Samples, by Region Analyzed

    MetricQ29

    Positions homozygous referencein NA19240 and homozygousnonreference in NA18507

    Positions homozygous referencein NA18507 and homozygousnonreference in NA19240

    WUCaMP target region

    Total positions (no.) 15 11Nonreference calls matching other HapMap [n/N (%)] 0/2726 (0) 1/1587 (0.06)Other nonreference calls [n/N (%)] 1/2726 (0.04) 0/1587 (0)Fraction of any nonreference bases to total bases (%) 0.04 0.06

    Whole exome

    Total positions (no.) 8866 8769Nonreference calls matching other HapMap [n/N (%)] 549/1,683,767 (0.03) 462/1,627,991 (0.03)Other nonreference calls [n/N (%)] 4462/1,683,767 (0.27) 6161/1,627,991 (0.38)Fraction of any nonreference bases to total bases (%) 0.30 0.41

    NGS for Clinical Molecular Oncology

    14891490149114921493149414951496149714981499150015011502150315041505150615071508150915101511151215131514151515161517151815191520152115221523152415251526152715281529153015311532153315341535153615371538153915401541154215431544154515461547154815491550

    1551155215531554155515561557155815591560156115621563156415651566156715681569157015711572157315741575157615771578157915801581

    in lung cancer43 and resistance to treatment with anti-EGFRmonoclonal antibodies in colorectal cancer.44 Unexpectedfindings, such as the discovery of BRAFV600E in an ovariancancer, with subsequent response to treatment with a BRAFinhibitor,45 support the utility of sequencing genes that arenot strictly characteristic for the patient’s tumor type. Thus,the WUCaMP assay that we have validated here providesactionable information to direct patient care in routinepractice.

    Compared with genetic tests designed to detect inherited(constitutional) mutations, clinical molecular oncology testshave additional complexities because they target acquired

    Table 9 Confirmation of Known Clinically Important Mutationsby WUCaMP on Illumina Sequencing Instruments

    Instrumentand sampleID

    Expected mutation[reported allelefraction (%)]

    NGS assayresult?

    Observed allelefraction byNGS (%)

    HiSeq4V00Oz IDH1 R132S Yes 38.820 FLT3-TKD D835 (15) Yes 20.23Y056A IDH2 R140Q Yes 41.822 FLT3-TKD D835 (2.9) No 0.321 FLT3-TKD D835 (43) Yes 30.23Y0563 IDH2 R140Q Yes 29.223 FLT3-TKD D835 (2.5) No 1.224 FLT3-TKD D835 (3.6) No 2.83Y055Y IDH1 R132C Yes 43.23Y055Y NRAS G12D Yes 37.73y0562 IDH1 R132C Yes 31.428 RET V804M Yes 5138 JAK2 V617F Yes 19.135 JAK2 V617F Yes 43.840 JAK2 V617F Yes 87.232 RET C634G Yes 50.8

    MiSeq20 FLT3-TKD D835 (15) Yes 23.4*3y0562 IDH1 R132C Yes 34.1*28 RET V804M Yes 47.5*35 JAK2 V617F Yes 41.4*

    *Mean observed allele fraction across two sequencing runs.

    The Journal of Molecular Diagnostics - jmd.amjpathol.org

    158215831584158515861587158815891590159115921593159415951596159715981599160016011602160316041605160616071608160916101611

    FLA 5.2.0 DTD � JMDI284_proof � 1

    (somatic) mutations. Tumor samples are an admixture ofcell types, tumor viability varies within the tissue samples,and tumors harbor intrinsic genetic heterogeneitydall ofwhich can affect test results. Analysis of FFPE specimens isfurther complicated by the chemical effects of formalinfixation on nucleic acids, which introduce crosslinks thatcan inhibit the enzymatic steps of DNA library preparationand also can cause direct DNA damage.34,46 Nonetheless,these challenges can be overcome through the use of NGStechnologies, which have been already used to sequencecancer genomes, including acute myeloid leukemia,47 small-and non-small cell lung cancer,48,49 and melanoma,50

    among others. Our own demonstration that read andcoverage statistics for fresh tissues are similar to those ofFFPE is also paramount to validation of the WUCaMP assayfor use in routine clinical practice, because for the vastmajority of solid-tumor specimens only FFPE tissue isavailable for testing.

    Based on our in silico sensitivity analysis, 85% and 100%of variants with 10% AF were detected at 400� and 1000�unique coverage levels, respectively. Thus, the high averagecoverage level achieved by the WUCaMP assay (>1000�)is crucial for the detection of variants at low AF. However,even at this depth, somatic variants present at AFs of

  • 19

    Cottrell et al

    16131614161516161617161816191620162116221623162416251626162716281629163016311632163316341635163616371638163916401641164216431644164516461647164816491650165116521653165416551656165716581659166016611662166316641665166616671668166916701671167216731674

    1675167616771678167916801681168216831684168516861687168816891690169116921693169416951696169716981699170017011702170317041705170617071708170917101711171217131714171517161717171817191720172117221723172417251726172717281729173017311732173317341735

    on our estimated VAF limit of detection. This step not onlyminimizes the chance of obtaining false-negative resultsdue to the presence of variants below the level of detection,but also avoids performing unnecessary testing. Second, aclinical genomicist reviews QC parameters of DNA extrac-tion, library preparation, and targeted capture to evaluateDNA quality and library product quantity and fragment size.Third, a clinical genomicist evaluates three levels ofsequencing run QC metrics, benchmarked to referencestandards obtained from our 119 validation tumor specimendata sets.

    The level 1 metrics concern read and coverage depth,including the number of unique on-target reads achievedacross the 300-kb capture region and the percentage of po-sitions that reach unique coverage thresholds of 50�, 400�,and 1000�. In routine clinical use, specimens that fail tomatch the range of coverage observed during validationrequire particular scrutiny of the remaining QC measures.The level 2 metric is a measure of which exons failed toachieve 50� unique coverage at 95% of exonic positions;this is frequently the first exon of genes, which are often GC-rich and therefore difficult to amplify by PCR and/or tocapture by fluid-phase targeted hybrid capture.51 Of the 14exons that failed this level of QC on an average validationrun, 2 exons are entirely noncoding; according to the COS-MIC database, no recurrent mutations are reported in theremaining 12 exons. It is unlikely, therefore, that a clinicallyimportant mutation would be missed because of the lowercoverage in these regions. For such failed exons, orthogonaltesting is not performed; instead, low coverage of the exon isdeclared in the clinical report. Finally, the level 3 metricverifies that all curated actionable mutation positions haveachieved a unique coverage depth of �50�.

    We have considered the utility of verifying NGS somaticvariant findings by Sanger sequencing, and indeed per-formed targeted Sanger sequencing in parallel with theWUCaMP assay on 51 patient specimens after launch of thetest. It is important to emphasize that this analysis includedverification of both wild-type as well as SNV sequencecalls, to monitor the occurrence of false negatives as well asfalse positives. An approach focused on verification ofidentified SNVs alone would have represented a form ofdiscrepant analysis (also known as review bias), which iswell established as an intrinsically biased approach thatgenerates significant overestimates of test performancemetrics, including sensitivity and specificity.52,53 In ouranalysis, all SNVs identified by Sanger sequencing werealso called by the WUCaMP assay, whereas a single high-quality, low-AF SNV identified by the WUCaMP assaywas not called by Sanger sequencing. This finding, althoughunsurprising in light of the limitations of Sanger sequencing,underscores the shortcomings of verifying somatic NGScalls using an approach with inherently inferior sensitivity.Given the high sensitivity and specificity of the WUCaMPassay for variants at observed AFs of 10% to 100%, targetedverification of positive and negative findings by Sanger

    14FLA 5.2.0 DTD � JMDI284_proof �

    sequencing on every clinical case would not improve theaccuracy of test results.To make review of the NGS data from the WUCaMP

    assay efficient and accurate, we have curated all publishedvariants of the genes in the assay associated with treatmentor clinical outcomes into a database that links the variantwith an annotated interpretation based on the medicalliterature. The need for this functionality, together with therequirement for software that can integrate publicly avail-able NGS analysis tools and NGS viewers, led us to developthe CGW comprehensive software application. In generalterms, CGW filters and annotates the variant calls to removethose reported in dbSNP with a VAF of >5%. It then marksvariants that are reported in COSMIC and attaches theappropriate genomic, coding, and protein nomenclature andcurated clinical interpretations from the internally developedmedical annotation database noted above (with visualizationof the identified variants and hyperlinks to relevant data-bases) before generating a draft report, all of which sig-nificantly reduces the time and effort required to reviewdata-heavy NGS results. The WUCaMP report classifiesvariants into levels based on the potential actionability ofthe variant, a classification that is tailored to the type of thevariant in the setting of the specific tumor type. SNPs notreported in COSMIC are classified as known polymor-phisms, Qand non-SNP variants lacking evidence of action-ability or prognostic value from the literature are classifiedas variants of unknown significance.Proficiency testing for NGS is currently unavailable.

    However, reference materials for NGS proficiency testing arebeing developed by CAP, the National Institute of Standardsand Technology (NIST), and the CDC, with plans for pilotofferings in 2013 and full implementation in 2014. In theinterim, we have developed our own alternative performanceassessment strategy in accord with CAP recommendations, asfollows. The alternative assessment (AA) is performedbiannually, and two samples are analyzed in each AA cycle.One sample is a tissue specimen processed starting from theDNA isolation step, to evaluate the wet-bench procedure andthe bioinformatic analysis; the second sample is an NGSsequence file, to evaluate the bioinformatic analysis only. Thecases used for AA are either obtained from other CLIA-licensed laboratories or from clinical cases previously pro-cessed and signed out in our laboratory; the cases cover bothdisease-associated and naturally occurring sequence varia-tions across the genomic regions targeted by our test (asrecommended19). Each clinical genomicist (either a pathol-ogist with subspecialty certification in Molecular GeneticPathology from the American Board of Pathology or a lab-oratory director certified by the American Board of MedicalGenetics in Clinical Molecular Genetics and/or Clinical Cy-togenetics, or by the American Board of Clinical Chemistry)performs a masked independent review of the two cases andsigns them out in the laboratory information managementsystem. Three types of comparisons are then performed. First,the QC metrics of library preparation and sequencing are

    jmd.amjpathol.org - The Journal of Molecular Diagnostics

    1736

    14 November 2013 � 2:15 pm � EO: JMD13_0016

    http://jmd.amjpathol.org

  • NGS for Clinical Molecular Oncology

    17371738173917401741174217431744174517461747174817491750175117521753175417551756175717581759176017611762176317641765176617671768176917701771177217731774177517761777177817791780178117821783178417851786178717881789179017911792179317941795179617971798

    1799180018011802180318041805180618071808180918101811181218131814181518161817181818191820182118221823182418251826182718281829183018311832183318341835183618371838183918401841184218431844184518461847184818491850185118521853185418551856185718581859

    compared with those of the original run. Second, the variantcalls are compared with those of the original run. Third, thevariant classification and interpretation are compared be-tween the reviewing clinical genomicists and the originalreport. Any significant differences observed for the QCmetrics, variant calls, or classification between the AA andthe original analysis are rigorously assessed until the sourceof the discrepancy is identified and resolved. It is worthnoting that our AA approach is somewhat similar to the CAPproficiency testing challenges for chromosomal microarrayanalysis, in which DNA is provided to the laboratory totest from the wet-bench procedure through bioinformaticanalysis and subsequent interpretation, with additional chal-lenge questions on paper used to further gauge interpretativecomponents of array testing.

    As with any genetic test, there is a risk of detectingsecondary or incidental findings. DNA sequence analysis oftumor tissue may identify variants linked to familial cancersyndromes in addition to somatically acquired mutations.Correct classification of these variants can be difficult,because there is overlap between mutations described assomatic changes and those linked to familial cancer syn-dromes.48,50,54,55 Although it is possible to distinguish be-tween a germline and a somatic change by paired analysis ofgermline and tumor tissue, current laboratory workflows andeconomic realities limit NGS to the tumor specimen itself.In the context of the WUCaMP assay, whenever a clinicallysignificant variant is suspected to be constitutional, theordering clinician is informed, formal medical geneticevaluation is recommended, and, if clinically indicated, aconstitutional sample (such as a buccal swab or peripheralblood specimen) is obtained, to establish the correct clas-sification of the problematic variant by Sanger sequencing.

    Here, we have described the validation of an NGS-basedclinical oncology test for the detection of somatic variationin actionable gene targets. The validation process wedescribe for SNVs addresses recommendations of Gargiset al,19 the NGS checklist requirements of CAP,18 and NGSregulations of the New York State Department of Health. Asdetailed above, our validation approach for independentconfirmation of SNVs meets these requirements, inasmuchas our approach includes comparison of our results withpublicly available whole-genome sequence data, SNV callsextracted from publicly available microarray data, and datafrom Sanger sequencing performed in our laboratory.Likewise, we have independently confirmed well in excessof 10 SNVs from patient samples in which NGS detected a1-bp change.

    There is admittedly some ambiguity as to the preciseintent of some of the recommendations of the regulatoryentities. For example, the New York State Department ofHealth regulations require that a minimum of 10 positivepatient samples “for each type of variant in each target area”must be independently confirmed. It is reasonable to inter-pret the phrase “each type of variant” as referring to theclass of variant (namely, SNVs as opposed to indels,

    The Journal of Molecular Diagnostics - jmd.amjpathol.orgFLA 5.2.0 DTD � JMDI284_proof � 1

    structural variants, or copy number variants), rather than tothe functional consequence of the variant (splice site, versuscoding region synonymous, missense, or nonsense) and tointerpret the phrase “target area” as referring to the entire300-kb capture region of our assay, rather than to specificmutation hotspots, exons, or genes, as we did for the vali-dation of our assay. Nonetheless, other interpretations arepossible that would clearly require different validation par-adigms in routine clinical practice. Because the New YorkState Department of Health requirements seem to be themost rigorous yet published, there is no doubt that theambiguities must be addressed. Thus, the type of regulatoryuncertainty highlighted by our validation approach forSNVs in tumor samples will likely be of interest to manyentities as they develop guidelines for NGS-based clinicaltesting that address the technical and bioinformatic featuresof hybrid capture-based methods, as well as amplification-based methods.

    The various features of our validation approach for SNVsin tumor samples evaluated by NGS should prove useful toclinical laboratories developing similar approaches to testtissue samples to guide patient management in the emergingparadigm of precision medicine (or personalized medicine),whether at the time of diagnosis or at relapse. Similarly, ourvalidation approach should also prove useful for clinicallaboratories developing validation approaches for the otherclasses of somatic variation, including indels, copy numbervariation, and structural variation.

    Acknowledgment

    We thank Manoj Tyagi for assistance with data analysis.

    Supplemental Data

    Supplemental material for this article can be found athttp://dx.doi.org/10.1016/j.jmoldx.2013.10.002.

    References

    1. Walter MJ, Shen D, Ding L, Shao J, Koboldt DC, Chen K,Larson DE, McLellan MD, Dooling D, Abbott R, Fulton R,Magrini V, Schmidt H, Kalicki-Veizer J, O’Laughlin M, Fan X,Grillot M, Witowski S, Heath S, Frater JL, Eades W, Tomasson M,Westervelt P, DiPersio JF, Link DC, Mardis ER, Ley TJ, Wilson RK,Graubert TA: Clonal architecture of secondary acute myeloid leuke-mia. N Engl J Med 2012, 366:1090e1098

    2. Patel JP, Gönen M, Figueroa ME, Fernandez H, Sun Z, Racevskis J,Van Vlierberghe P, Dolgalev I, Thomas S, Aminova O, Huberman K,Cheng J, Viale A, Socci ND, Heguy A, Cherry A, Vance G,Higgins RR, Ketterling RP, Gallagher RE, Litzow M, van denBrink MR, Lazarus HM, Rowe JM, Luger S, Ferrando A, Paietta E,Tallman MS, Melnick A, Abdel-Wahab O, Levine RL: Prognosticrelevance of integrated genetic profiling in acute myeloid leukemia. NEngl J Med 2012, 366:1079e1089

    3. Kalari KR, Rossell D, Necela BM, Asmann YW, Nair A, Baheti S,Kachergus JM, Younkin CS, Baker T, Carr JM, Tang X, Walsh MP,Chai HS, Sun Z, Hart SN, Leontovich AA, Hossain A, Kocher JP,

    15

    1860

    4 November 2013 � 2:15 pm � EO: JMD13_0016

    http://dx.doi.org/10.1016/j.jmoldx.2013.10.002http://refhub.elsevier.com/S1525-1578(13)00219-5/sref1http://refhub.elsevier.com/S1525-1578(13)00219-5/sref1http://refhub.elsevier.com/S1525-1578(13)00219-5/sref1http://refhub.elsevier.com/S1525-1578(13)00219-5/sref1http://refhub.elsevier.com/S1525-1578(13)00219-5/sref1http://refhub.elsevier.com/S1525-1578(13)00219-5/sref1http://refhub.elsevier.com/S1525-1578(13)00219-5/sref1http://refhub.elsevier.com/S1525-1578(13)00219-5/sref1http://refhub.elsevier.com/S1525-1578(13)00219-5/sref2http://refhub.elsevier.com/S1525-1578(13)00219-5/sref2http://refhub.elsevier.com/S1525-1578(13)00219-5/sref2http://refhub.elsevier.com/S1525-1578(13)00219-5/sref2http://refhub.elsevier.com/S1525-1578(13)00219-5/sref2http://refhub.elsevier.com/S1525-1578(13)00219-5/sref2http://refhub.elsevier.com/S1525-1578(13)00219-5/sref2http://refhub.elsevier.com/S1525-1578(13)00219-5/sref2http://refhub.elsevier.com/S1525-1578(13)00219-5/sref2http://refhub.elsevier.com/S1525-1578(13)00219-5/sref3http://refhub.elsevier.com/S1525-1578(13)00219-5/sref3http://refhub.elsevier.com/S1525-1578(13)00219-5/sref3http://jmd.amjpathol.org

  • Q20

    21

    Cottrell et al

    18611862186318641865186618671868186918701871187218731874187518761877187818791880188118821883188418851886188718881889189018911892189318941895189618971898189919001901190219031904190519061907190819091910191119121913191419151916191719181919192019211922

    1923192419251926192719281929193019311932193319341935193619371938193919401941194219431944194519461947194819491950195119521953195419551956195719581959196019611962196319641965196619671968196919701971197219731974197519761977197819791980198119821983

    Perez EA, Reisman DN, Fields AP, Thompson EA: Deep sequenceanalysis of non-small cell lung cancer: integrated analysis of geneexpression, alternative splicing, and single nucleotide variations inlung adenocarcinomas with and without oncogenic KRAS mutations.Front Oncol 2012, 2:12

    4. Chen J, Huang XF, Katsifis A: Activation of signal pathways and theresistance to anti-EGFR treatment in colorectal cancer. J Cell Bio-chem 2010, 111:1082e1086

    5. Duncavage EJ, Abel HJ, Szankasi P, Kelley TW, Pfeifer JD: Tar-geted next generation sequencing of clinically significant genemutations and translocations in leukemia. Mod Pathol 2012, 25:795e804

    6. Vasli N, Böhm J, Le Gras S, Muller J, Pizot C, Jost B, Echaniz-Laguna A, Laugel V, Tranchant C, Bernard R, Plewniak F, Vicaire S,Levy N, Chelly J, Mandel JL, Biancalana V, Laporte J: Next gen-eration sequencing for molecular diagnosis of neuromuscular dis-eases. Acta Neuropathol 2012, 124:273e283

    7. Calvo SE, Compton AG, Hershman SG, Lim SC, Lieber DS,Tucker EJ, Laskowski A, Garone C, Liu S, Jaffe DB,Christodoulou J, Fletcher JM, Bruno DL, Goldblatt J, Dimauro S,Thorburn DR, Mootha VK: Molecular diagnosis of infantile mito-chondrial disease with targeted next-generation sequencing. SciTransl Med 2012, 4:118ra10

    8. Pritchard CC, Smith C, Salipante SJ, Lee MK, Thornton AM,Nord AS, Gulden C, Kupfer SS, Swisher EM, Bennett RL,Novetsky AP, Jarvik GP, Olopade OI, Goodfellow PJ, King MC,Tait JF, Walsh T: ColoSeq provides comprehensive Lynch and pol-yposis syndrome mutational analysis using massively parallelsequencing. J Mol Diagn 2012, 14:357e366

    9. Voelkerding KV, Dames S, Durtschi JD: Next generation sequencingfor clinical diagnosticseprinciples and application to targeted rese-quencing for hypertrophic cardiomyopathy: a paper from the 2009William Beaumont Hospital Symposium on Molecular Pathology.J Mol Diagn 2010, 12:539e551

    10. Sikkema-Raddatz B, Johansson LF, de Boer EN, Almomani R,Boven LG, van den Berg MP, van Spaendonck-Zwarts KY, vanTintelen JP, Sijmons RH, Jongbloed JD, Sinke RJ: Targeted next-generation sequencing can replace Sanger sequencing in clinical di-agnostics. Hum Mutat 2013, 34:1035e1042

    11. Berg JS, Evans JP, Leigh MW, Omran H, Bizon C, Mane K,Knowles MR, Weck KE, Zariwala MA: Next generation massivelyparallel sequencing of targeted exomes to identify genetic mutationsin primary ciliary dyskinesia: implications for application to clinicaltesting. Genet Med 2011, 13:218e229

    12. Vandrovcova J, Thomas ERA, Atanur SS, Norsworthy PJ,Neuwirth C, Tan Y, Kasperaviciute D, Biggs J, Game L, Mueller M,Soutar AK, Aitman TJ: The use of next-generation sequencing inclinical diagnosis of familial hypercholesterolemia. Genet Med 2013[Epub ahead of press] http://dx.doi.org/10.1038/gim.2013.55

    13. Neumann J, Zeindl-Eberhart E, Kirchner T, Jung A: Frequency andtype of KRAS mutations in routine diagnostic analysis of metastaticcolorectal cancer. Pathol Res Pract 2009, 205:858e862

    14. Dienstmann R, Vilar E, Tabernero J: Molecular predictors of responseto chemotherapy in colorectal cancer. Cancer J 2011, 17:114e126

    15. Ladanyi M, Pao W: Lung adenocarcinoma: guiding EGFR-targetedtherapy and beyond. Mod Pathol 2008, 21(Suppl 2):S16eS22

    16. Girard N: Thymic tumors: relevant molecular data in the clinic.J Thorac Oncol 2010, 5(10 Suppl 4):S291eS295

    17. Godley LA: Profiles in leukemia. N Engl J Med 2012, 366:1152e115318. College of American Pathologists: Molecular Pathology Checklist.

    Northfield, IL, College of American Pathologists, 201219. Gargis AS, Kalman L, Berry MW, Bick DP, Dimmock DP,

    Hambuch T, et al: Assuring the quality of next-generationsequencing in clinical laboratory practice. Nat Biotechnol 2012,30:1033e1036

    20. National Committee for Clinical Laboratory Standards (NCCLS):Nucleic acid sequencing methods in diagnostic laboratory medicine;

    16FLA 5.2.0 DTD � JMDI284_proof �

    approved guideline. CLSI (NCCLS) document MM09-A. Wayne,PA, Clinical and Laboratory Standards Institute, 2004

    21. International HapMap Consortium; Frazer KA, Ballinger DG, Cox DR,Hinds DA, Stuve LL, Gibbs RA, et al: A second generation humanhaplotype map of over 3.1 million SNPs. Nature 2007, 449:851e861

    22. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N,Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Pro-cessing Subgroup: The Sequence Alignment/Map format and SAM-tools. Bioinformatics 2009, 25:2078e2079

    23. Quinlan AR, Hall IM: BEDTools: a flexible suite of utilities forcomparing genomic features. Bioinformatics 2010, 26:841e842

    24. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K,Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M,DePristo MA: The Genome Analysis Toolkit: a MapReduce frame-work for analyzing next-generation DNA sequencing data. GenomeRes 2010, 20:1297e1303

    25. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR,Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M,McKenna A, Fennell TJ, Kernytsky AM, Sivachenko AY,Cibulskis K, Gabriel SB, Altshuler D, Daly MJ: A framework forvariation discovery and genotyping using next-generation DNAsequencing data. Nat Genet 2011, 43:491e498

    26. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M,Lander ES, Getz G, Mesirov JP: Integrative genomics viewer. NatBiotechnol 2011, 29:24e26

    27. Thorvaldsdóttir H, Robinson JT, Mesirov JP: Integrative GenomicsViewer (IGV): high-performance genomics data visualization andexploration. Brief Bioinform 2013, 14:178e192 Q

    28. 1000 Genomes Project Consortium; Abecasis GR, Altshuler D,Auton A, Brooks LD, Durbin RM, Gibbs RA, Hurles ME,McVean GA: A map of human genome variation from population-scale sequencing. Nature 2010, 467:1061e1073

    29. Wickham H: The split-apply-combine strategy for data analysis. J StatSoftware 2011, 40:1e29

    30. R Development Core Team: R: A language and environment forstatistical computing. Vienna, Austria, R Foundation for StatisticalComputing, 2012

    31. Wickham H: ggplot: Elegant Graphics for Data Analysis. ed 2. NewYork, Springer, 2009

    32. 1000 Genomes Project Consortium; Abecasis GR, Altshuler D,Auton A, Brooks LD, Durbin RM, Gibbs RA, Hurles ME,McVean GA: A map of human genome variation from population-scale sequencing. Nature 2010, 467:1061e1073

    33. Drmanac R, Sparks AB, Callow MJ, Halpern AL, Burns NL,Kermani BG, et al: Human genome sequencing using unchained basereads on self-assembling DNA nanoarrays. Science 2010, 327:78e81

    34. Spencer DH, Sehn JK, Abel HJ, Watson MA, Pfeifer JD,Duncavage EJ: Comparison of clinical targeted next-generationsequence data from formalin-fixed and fresh-frozen tissue speci-mens. J Mol Diagn 2013, 15:623e633

    35. Spencer DH, Tyagi M, Vallania F, Bredemeyer A, Pfeifer JD, MitraRD, Duncavage EJ: Performance of common analysis methods fordetecting low-frequency single nucleotide variants in targeted next-generation sequence data. Am J Pathol 2013 [Epub ahead of press]http://dx.doi.org/10.1016/j.jmoldx.2013.09.003

    36. Spencer DH, Abel HJ, Lockwood CM, Payton JE, Szankasi P,Kelley TW, Kulkarni S, Pfeifer JD, Duncavage EJ: Detection ofFLT3 internal tandem duplication in targeted, short-read-length, next-generation sequencing data. J Mol Diagn 2013, 15:81e93

    37. Paez JG, Jänne PA, Lee JC, Tracy S, Greulich H, Gabriel S,Herman P, Kaye FJ, Lindeman N, Boggon TJ, Naoki K, Sasaki H,Fujii Y, Eck MJ, Sellers WR, Johnson BE, Meyerson M: EGFRmutations in lung cancer: correlation with clinical response to gefi-tinib therapy. Science 2004, 304:1497e1500

    38. Stephens P, Hunter C, Bignell G, Edkins S, Davies H, Teague J, et al:Lung cancer: intragenic ERBB2 kinase mutations in tumours. Nature2004, 431:525e526

    jmd.amjpathol.org - The Journal of Molecular Diagnostics

    1984

    14 November 2013 � 2:15 pm � EO: JMD13_0016

    http://refhub.elsevier.com/S1525-1578(13)00219-5/sref3http://refhub.elsevier.com/S1525-1578(13)00219-5/sref3http://refhub.elsevier.com/S1525-1578(13)00219-5/sref3http://refhub.elsevier.com/S1525-1578(13)00219-5/sref3http://refhub.elsevier.com/S1525-1578(13)00219-5/sref3http://refhub.elsevier.com/S1525-1578(13)00219-5/sref4http://refhub.elsevier.com/S1525-1578(13)00219-5/sref4http://refhub.elsevier.com/S1525-1578(13)00219-5/sref4http://refhub.elsevier.com/S1525-1578(13)00219-5/sref4http://refhub.elsevier.com/S1525-1578(13)00219-5/sref5http://refhub.elsevier.com/S1525-1578(13)00219-5/sref5http://refhub.elsevier.com/S1525-1578(13)00219-5/sref5http://refhub.elsevier.com/S1525-1578(13)00219-5/sref5http://refhub.elsevier.com/S1525-1578(13)00219-5/sref5http://refhub.elsevier.com/S1525-1578(13)00219-5/sref6http://refhub.elsevier.com/S1525-1578(13)00219-5/sref6http://refhub.elsevier.com/S1525-1578(13)00219-5/sref6http://refhub.elsevier.com/S1525-1578(13)00219-5/sref6http://refhub.elsevier.com/S1525-1578(13)00219-5/sref6http://refhub.elsevier.com/S1525-1578(13)00219-5/sref6http://refhub.elsevier.com/S1525-1578(13)00219-5/sref7http://refhub.elsevier.com/S1525-1578(13)00219-5/sref7http://refhub.elsevier.com/S1525-1578(13)00219-5/sref7http://refhub.elsevier.com/S1525-1578(13)00219-5/sref7http://refhub.elsevier.com/S1525-1578(13)00219-5/sref7http://refhub.elsevier.com/S1525-1578(13)00219-5/sref7http://refhub.elsevier.com/S1525-1578(13)00219-5/sref8http://refhub.elsevier.com/S1525-1578(13)00219-5/sref8http://refhub.elsevier.com/S1525-1578(13)00219-5/sref8http://refhub.elsevier.com/S1525-1578(13)00219-5/sref8http://refhub.elsevier.com/S1525-1578(13)00219-5/sref8http://refhub.elsevier.com/S1525-1578(13)00219-5/sref8http://refhub.elsevier.com/S1525-1578(13)00219-5/sref8http://refhub.elsevier.com/S1525-1578(13)00219-5/sref9http://refhub.elsevier.com/S1525-1578(13)00219-5/sref9http://refhub.elsevier.com/S1525-1578(13)00219-5/sref9http://refhub.elsevier.com/S1525-1578(13)00219-5/sref9http://refhub.elsevier.com/S1525-1578(13)00219-5/sref9http://refhub.elsevier.com/S1525-1578(13)00219-5/sref9http://refhub.elsevier.com/S1525-1578(13)00219-5/sref9http://refhub.elsevier.com/S1525-1578(13)00219-5/sref10http://refhub.elsevier.com/S1525-1578(13)00219-5/sref10http://refhub.elsevier.com/S1525-1578(13)00219-5/sref10http://refhub.elsevier.com/S1525-1578(13)00219-5/sref10http://refhub.elsevier.com/S1525-1578(13)00219-5/sref10http://refhub.elsevier.com/S1525-1578(13)00219-5/sref10http://refhub.elsevier.com/S1525-1578(13)00219-5/sref11http://refhub.elsevier.com/S1525-1578(13)00219-5/sref11http://refhub.elsevier.com/S1525-1578(13)00219-5/sref11http://refhub.elsevier.com/S1525-1578(13)00219-5/sref11http://refhub.elsevier.com/S1525-1578(13)00219-5/sref11http://refhub.elsevier.com/S1525-1578(13)00219-5/sref11http://dx.doi.org/10.1038/gim.2013.55http://refhub.elsevier.com/S1525-1578(13)00219-5/sref13http://refhub.elsevier.com/S1525-1578(13)00219-5/sref13http://refhub.elsevier.com/S1525-1578(13)00219-5/sref13http://refhub.elsevier.com/S1525-1578(13)00219-5/sref13http://refhub.elsevier.com/S1525-1578(13)00219-5/sref14http://refhub.elsevier.com/S1525-1578(13)00219-5/sref14http://refhub.elsevier.com/S1525-1578(13)00219-5/sref14http://refhub.elsevier.com/S1525-1578(13)00219-5/sref15http://refhub.elsevier.com/S1525-1578(13)00219-5/sref15http://refhub.elsevier.com/S1525-1578(13)00219-5/sref15http://refhub.elsevier.com/S1525-1578(13)00219-5/sref16http://refhub.elsevier.com/S1525-1578(13)00219-5/sref16http://refhub.elsevier.com/S1525-1578(13)00219-5/sref16http://refhub.elsevier.com/S1525-1578(13)00219-5/sref17http://refhub.elsevier.com/S1525-1578(13)00219-5/sref17http://refhub.elsevier.com/S1525-1578(13)00219-5/sref18http://refhub.elsevier.com/S1525-1578(13)00219-5/sref18http://refhub.elsevier.com/S1525-1578(13)00219-5/sref18http://refhub.elsevier.com/S1525-1578(13)00219-5/sref18http://refhub.elsevier.com/S1525-1578(13)00219-5/sref18http://refhub.elsevier.com/S1525-1578(13)00219-5/sref19http://refhub.elsevier.com/S1525-1578(13)00219-5/sref19http://refhub.elsevier.com/S1525-1578(13)00219-5/sref19http://refhub.elsevier.com/S1525-1578(13)00219-5/sref19http://refhub.elsevier.com/S1525-1578(13)00219-5/sref20http://refhub.elsevier.com/S1525-1578(13)00219-5/sref20http://refhub.elsevier.com/S1525-1578(13)00219-5/sref20http://refhub.elsevier.com/S1525-1578(13)00219-5/sref20http://refhub.elsevier.com/S1525-1578(13)00219-5/sref21http://refhub.elsevier.com/S1525-1578(13)00219-5/sref21http://refhub.elsevier.com/S1525-1578(13)00219-5/sref21http://refhub.elsevier.com/S1525-1578(13)00219-5/sref21http://refhub.elsevier.com/S1525-1578(13)00219-5/sref21http://refhub.elsevier.com/S1525-1578(13)00219-5/sref22http://refhub.elsevier.com/S1525-1578(13)00219-5/sref22http://refhub.elsevier.com/S1525-1578(13)00219-5/sref22http://refhub.elsevier.com/S1525-1578(13)00219-5/sref23http://refhub.elsevier.com/S1525-1578(13)00219-5/sref23http://refhub.elsevier.com/S1525-1578(13)00219-5/sref23http://refhub.elsevier.com/S1525-1578(13)00219-5/sref23http://refhub.elsevier.com/S1525-1578(13)00219-5/sref23http://refhub.elsevier.com/S1525-1578(13)00219-5/sref23http://refhub.elsevier.com/S1525-1578(13)00219-5/sref24http://refhub.elsevier.com/S1525-1578(13)00219-5/sref24http://refhub.elsevier.com/S1525-1578(13)00219-5/sref24http://refhub.elsevier.com/S1525-1578(13)00219-5/sref24http://refhub.elsevier.com/S1525-1578(13)00219-5/sref24http://refhub.elsevier.com/S1525-1578(13)00219-5/sref24http://refhub.elsevier.com/S1525-1578(13)00219-5/sref24http://refhub.elsevier.com/S1525-1578(13)00219-5/sref25http://refhub.elsevier.com/S1525-1578(13)00219-5/sref25http://refhub.elsevier.com/S1525-1578(13)00219-5/sref25http://refhub.elsevier.com/S1525-1578(13)00219-5/sref25http://refhub.elsevier.com/S1525-1578(13)00219-5/sref26http://refhub.elsevier.com/S1525-1578(13)00219-5/sref26http://refhub.elsevier.com/S1525-1578(13)00219-5/sref26http://refhub.elsevier.com/S1525-1578(13)00219-5/sref26http://refhub.elsevier.com/S1525-1578(13)00219-5/sref27http://refhub.elsevier.com/S1525-1578(13)00219-5/sref27http://refhub.elsevier.com/S1525-1578(13)00219-5/sref27http://refhub.elsevier.com/S1525-1578(13)00219-5/sref27http://refhub.elsevier.com/S1525-1578(13)00219-5/sref27http://refhub.elsevier.com/S1525-1578(13)00219-5/sref28http://refhub.elsevier.com/S1525-1578(13)00219-5/sref28http://refhub.elsevier.com/S1525-1578(13)00219-5/sref28http://refhub.elsevier.com/S1525-1578(13)00219-5/sref29http://refhub.elsevier.com/S1525-1578(13)00219-5/sref29http://refhub.elsevier.com/S1525-1578(13)00219-5/sref29http://refhub.elsevier.com/S1525-1578(13)00219-5/sref30http://refhub.elsevier.com/S1525-1578(13)00219-5/sref30http://refhub.elsevier.com/S1525-1578(13)00219-5/sref31http://refhub.elsevier.com/S1525-1