Office of Advanced Molecular Detection
National Center for Emerging and Zoonotic Infectious Diseases
NextGen Sequencing Method Validation and CLIA Considerations
Maryann Turnsek
Enteric Diseases Laboratory Branch/DDID/NCEZID/CDC
PN session: Sequence Quality/Regulatory Authorities (CLIA/ISO 17025): Looking Through the Magnifying Glass
East Coast PulseNet/OutbreakNet Regional Meeting - Tampa, FL
AMD – Innovate * Transform * Protect
Objectives
• Discuss CLIA method validation requirements from a NextGen Sequencing (NGS) perspective
• Share EDLB approach to NGS Method Validation for CLIA
NGS Method Validation and CLIA considerations
CLIA method validation requirements are based on test complexity.
NGS assays are high-complexity, laboratory developed tests (LDTs).
Regulation requires laboratories to establish analytical performance specifications for LDTs before the release of any patient test results.
Traditional definitions do not readily translate to NGS.
Method Validation Framework
[Diagram: Development/Optimization; Validation; Patient Testing, with daily QA/QC and PT/AA]
Method Validation Framework
[Diagram: Test Development/Optimization; Validation (Platform, Test, IT/Pipeline); Patient Testing, with QC (daily) and PT/AA (periodically); IT/Bioinformatics involvement]
Gargis et al., Assuring the quality of next-generation sequencing in clinical laboratory practice. Nat Biotechnol. 2012 Nov;30(11):1033-6.
Stage 1: DEVELOPMENT/OPTIMIZATION
• Standard operating procedure(s) for the entire test system (NGS), incorporating quality assurance throughout
• Preliminary analysis to define test-specific (i.e., ANI) acceptance criteria, limitations, and quality thresholds
• The majority of sequence generation and analysis occurred during Stage 1.
Standardized Workflow for NGS Test System in EDLB
• Includes all the individual procedural steps
• CLIA validation considerations examine the test system
• Pre-analytical, analytical, and post-analytical
[Workflow diagram]
Steps: Isolate received; DNA Extraction; Library Preparation; Sequencing; Data Transfer, Assembly; Organism Database (BioNumerics); Ref ID (BioNumerics); Report Results
QC checkpoints: Pre-analytical requirements; Concentration, Purity; Library Concentration; Coverage, Read Quality, Genome Size; Run Quality; Identification Test Criteria; Subtyping Test Criteria
Preliminary Analysis during Development and Optimization for Average Nucleotide Identity (ANI) in RefID Database
Input requirements:
• 10X average coverage required for ANI test
• Assembly within expected size range (MB)
[Workflow diagram: ANI Criteria checkpoint and ANI Calculation (BioNumerics) step]
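The 10X input requirement can be illustrated with a minimal sketch. It assumes average coverage is estimated as total sequenced bases divided by assembly length; the function names and the example isolate are hypothetical and are not taken from the EDLB procedure.

```python
def average_coverage(total_bases: int, assembly_size_bp: int) -> float:
    """Estimate average depth of coverage as sequenced bases / assembly length."""
    if assembly_size_bp <= 0:
        raise ValueError("assembly size must be positive")
    return total_bases / assembly_size_bp


def meets_ani_coverage(total_bases: int, assembly_size_bp: int,
                       min_coverage: float = 10.0) -> bool:
    """Apply the 10X average coverage input requirement stated for the ANI test."""
    return average_coverage(total_bases, assembly_size_bp) >= min_coverage


# Hypothetical isolate: 1,000,000 reads of 250 bp against a 5.0 Mb assembly.
total = 1_000_000 * 250
print(f"{average_coverage(total, 5_000_000):.1f}X",
      meets_ani_coverage(total, 5_000_000))
```

The expected assembly size range is organism-specific and is not given on this slide, so it is left out of the sketch.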
Preliminary Analysis during Development and Optimization for Average Nucleotide Identity (ANI) in RefID Database
Output requirements:
• 70-75% bases aligned between query sequence and reference genome
• 92-95% ANI value for speciation
[Workflow diagram: ANI Criteria checkpoint and ANI Calculation (BioNumerics) step]
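A minimal sketch of how the two output requirements could be combined into a single identification call. The slide gives ranges rather than single cutoffs, so the default thresholds below (95% ANI, 70% bases aligned) are placeholder assumptions, as is the function name; the example values are the positive control row of the WGS Run Summary.

```python
def ani_identification(best_match: str, ani_percent: float,
                       bases_aligned_percent: float,
                       min_ani: float = 95.0,       # assumed cutoff within the 92-95% range
                       min_aligned: float = 70.0    # assumed cutoff within the 70-75% range
                       ) -> str:
    """Return the best-matching species only if both output criteria are met."""
    if ani_percent >= min_ani and bases_aligned_percent >= min_aligned:
        return best_match
    return "NO ID"


print(ani_identification("Escherichia coli", 98.96, 89.14))
print(ani_identification("Escherichia coli", 90.00, 89.14))
```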
Stage 2: VALIDATION
• Stated purpose and rationale
• Selected validation strain set
• Detailed precision test plan
• Summarized results
• Defined ANI acceptance criteria
• Described the quality assurance plan
• Established performance characteristics
Strain Selection
Representative of the assay
• Initial targets: 6 Campylobacter spp., 3 Escherichia spp., 6 Listeria spp.
Taxonomic diversity within each target species
Frequency of receipt
Public Health Significance
325 sequences in the final dataset (Reminder: most work is already done.)
Same data will be used to validate other tests!
Precision Testing Plan for ANI
• Conditions to change: operator, chemistries, equipment
• Look at the variables over time
• Start with DNA extraction; end with evaluation of the ANI test output
• Subset of 16 strains: 15 target species ('True Positive') and 1 non-target species ('True Negative')
[Workflow diagram: ANI Criteria checkpoint and ANI Calculation (BioNumerics) step]
Precision Testing Plan
EDLB example for determining precision using the subset of 16 strains:
• Katie, Qiagen Extractions, Nextera XT library, MiSeq M947, v2 Chemistry (week 1)
• Ashley, Promega Extractions, NEB library, MiSeq M347, v2 Chemistry (week 2)
• Jan, NXP Robot Extractions, Nextera XT library, MiSeq M3235, v2 Chemistry (week 3)
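One way to summarize a precision plan like this is the fraction of strains that receive the same ANI call under every condition. The sketch below assumes that definition; the strain names and calls are hypothetical and are not EDLB results.

```python
from collections import defaultdict


def run_concordance(calls):
    """Fraction of strains whose call is identical across all runs.

    calls: iterable of (run_label, strain_id, species_call) tuples.
    """
    by_strain = defaultdict(set)
    for _run, strain, species in calls:
        by_strain[strain].add(species)
    if not by_strain:
        raise ValueError("no calls supplied")
    agreeing = sum(1 for species in by_strain.values() if len(species) == 1)
    return agreeing / len(by_strain)


# Hypothetical calls for two strains over the three weekly runs.
calls = [
    ("week 1", "strain_A", "Escherichia coli"),
    ("week 2", "strain_A", "Escherichia coli"),
    ("week 3", "strain_A", "Escherichia coli"),
    ("week 1", "strain_B", "Listeria monocytogenes"),
    ("week 2", "strain_B", "Listeria monocytogenes"),
    ("week 3", "strain_B", "NO ID"),
]
print(run_concordance(calls))  # 0.5: strain_B is discordant in week 3
```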
Establishing Performance Characteristics – EDLB definitions for ANI test
Sensitivity: the ability of ANI to correctly identify True Positive samples.
Specificity: the ability of ANI to correctly identify True Negative samples.
Definitions may change depending on the test!
Establishing Performance Characteristics – EDLB definitions for ANI test
True Positives (TP) are the samples we expect to correctly identify, as a representative genome of that species is present in the reference genome set.
True Negatives (TN) are the samples we do not expect to identify, as they will not have a representative genome in the reference genome set.
Definitions may change depending on the test!
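Under these definitions, sensitivity and specificity can be computed as simple proportions. The sketch below is an assumption about how that might be coded, not the EDLB calculation: each sample is an (expected, observed) pair, where an expected value of None marks a True Negative sample (no representative genome in the reference set) and an observed value of None stands for "NO ID".

```python
def sensitivity_specificity(samples):
    """Return (sensitivity, specificity) for (expected, observed) pairs."""
    positives = [(e, o) for e, o in samples if e is not None]
    negatives = [(e, o) for e, o in samples if e is None]
    if not positives or not negatives:
        raise ValueError("need at least one True Positive and one True Negative sample")
    # Sensitivity: True Positive samples that received the expected identification.
    sensitivity = sum(o == e for e, o in positives) / len(positives)
    # Specificity: True Negative samples that correctly received no identification.
    specificity = sum(o is None for _, o in negatives) / len(negatives)
    return sensitivity, specificity


# Hypothetical outcome for a 16-strain subset: 15 target strains, 1 non-target strain.
subset = [("Escherichia coli", "Escherichia coli")] * 15 + [(None, None)]
print(sensitivity_specificity(subset))
```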
QC Checkpoints
Decision points used to determine whether the test samples and controls are of acceptable quality to continue through the process. QC checkpoints are defined by the acceptance criteria within each procedure.
[Workflow diagram]
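The checkpoint idea can be sketched as an ordered list of acceptance tests, where a sample stops at the first one it fails. The checkpoint names follow the workflow diagram; the field names and the concentration thresholds are hypothetical placeholders, and only the 10X coverage value comes from the ANI input requirements.

```python
def first_failed_checkpoint(sample, checkpoints):
    """Return the name of the first checkpoint the sample fails, or None if all pass."""
    for name, is_acceptable in checkpoints:
        if not is_acceptable(sample):
            return name
    return None


# Acceptance criteria would come from each procedure; these values are placeholders.
CHECKPOINTS = [
    ("Concentration, Purity", lambda s: s["dna_ng_per_ul"] >= 1.0),
    ("Library Concentration", lambda s: s["library_nm"] >= 2.0),
    ("Coverage, Read Quality, Genome Size", lambda s: s["coverage"] >= 10.0),
]

good = {"dna_ng_per_ul": 12.0, "library_nm": 4.0, "coverage": 85.0}
low_library = {"dna_ng_per_ul": 12.0, "library_nm": 0.5, "coverage": 85.0}
print(first_failed_checkpoint(good, CHECKPOINTS))
print(first_failed_checkpoint(low_library, CHECKPOINTS))
```

Keeping the checkpoints in workflow order means a failure is reported at the earliest step where the sample could have been stopped.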
QC Checkpoints – Outcomes and Action
Stage 3: PATIENT TESTING
• External controls are prepared in parallel with test samples and included on each run
• Enrollment in a CMS*-approved Proficiency Testing (PT) program or Alternative Assessment
• An assay change will require re-validation or verification
*Centers for Medicare and Medicaid Services
External Controls
Our CLIA director requires…
• Positive (+) and Negative (-) controls are prepared and analyzed in parallel with test samples.
In EDLB…
• Rotate use of approximately 14 controls.
• The analytical interpretation has been established for (+) controls.
• The analytical interpretation has not been established for (-) controls.
[Workflow diagram]
WGS Run Summary
Run Statistics
RunID: M3235-18-054
RunStaff: PS (koj1, jmp5)
RunDate: 11/30/2018
KitUsed: Illumina NexteraXT v2
Total Strains: 19
Total Failed: 0
Thermocycler (CT014693/7-611)
- Amplification: Acceptable Temp Range? Yes
- Amplification: Error Message? No
- Amplification: Temp Check? Yes
- Tagmentation: Acceptable Temp Range? Yes
- Tagmentation: Error Message? No
- Tagmentation: Temp Check? Yes
Post-Run Metrics
Cluster Density: 841K/mm2
Clusters Passing Filter: 92.7%
Estimated Yield: 7755.4MB
Post-Run Wash: Tween + Bleach
Q30: 88.1%
Reagent Information
Tracked Component | Expiration Date | Lot Number, Part Number, Serial Number (as recorded)
ATM | 2020-04-16 | 20256901
Beads | 2018-12-12 | 16510700
EBT | 2019-09-21 | 9/21/2018
Flowcell | 2019-06-18 | 20267990, 15028382, 000000000-C3DYY
Incorporation buffer | 2019-06-25 | 20269015
Indices | 2018-12-04 | 20158607
MiSeq cartridge | 2019-06-15 | 20288799
NPM | 2019-02-12 | 20239091
NT Buffer | 2019-07-26 | 20184109
PR2 Bottle | 2019-06-25 | 15041807, MS7045099-00PR2
Reagent Kit | 2019-06-15 | 15033573, MS7202550-500V2
RSB | 2019-07-10 | 20278909
TD Buffer | 2019-06-25 | 20278674
Isolate Information
EDLB Strain ID | Workbook ID | Coverage | Avg Quality (R1; R2) | Assembled Genome Size | ANI ID | ANI Score (%) | Bases Aligned (%) | Comments
2013C-3256 | Escherichia coli | 100.6 | 36.26; 33.88 | 5.38 | Escherichia coli | 98.96 | 89.14 | Successful: Positive Control
2014N-4038 | Cronobacter sakazakii | 79.8 | 35.44; 32.02 | 4.59 | NO ID | | | Successful
2018C-3447 | Escherichia species | 234.8 | 36.49; 34.83 | 4.61 | Escherichia coli | 98.53 | 91.5 | Successful
2018D-0184 | Campylobacter species | 92.6 | 37.58; 35.63 | 1.56 | Campylobacter lari | 98.29 | 90.24 | Successful
2018D-0185 | Campylobacter species | 135.5 | 37.35; 36.17 | 1.62 | Campylobacter upsaliensis | 98.19 | 92.39 | Successful
2018EL-1128a | Escherichia species | 99.2 | 36.35; 33.3 | 4.85 | Escherichia coli | 99.06 | 91.06 | Successful
CDC03-98-PS | Escherichia coli | 53 | 35.78; 32.91 | 5.26 | Escherichia coli | 97.86 | 87.9 | Successful
CDCH8394-PS | Listeria monocytogenes | 84.5 | 36.91; 35.04 | 2.88 | Listeria monocytogenes | 99.5 | 95.99 | Successful
D2371 | Helicobacter pylori | 126.5 | 37.06; 35.83 | 1.66 | NO ID | | | Successful: Negative Control
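The control rows in a run summary like this could be checked automatically before results are reviewed. The sketch below is illustrative only: the record layout and the `expected` field are assumptions, with the expected results inferred from the positive and negative control comments in the table.

```python
def controls_acceptable(controls):
    """Return (all_passed, failed_strain_ids) for the external controls on a run."""
    failures = [c["strain"] for c in controls if c["ani_id"] != c["expected"]]
    return (not failures), failures


# Control rows from the run summary; "expected" is an assumed field.
run_controls = [
    {"strain": "2013C-3256", "role": "positive",
     "ani_id": "Escherichia coli", "expected": "Escherichia coli"},
    {"strain": "D2371", "role": "negative",
     "ani_id": "NO ID", "expected": "NO ID"},
]
print(controls_acceptable(run_controls))
```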
Re-validating after a change
• Implementation of new DNA extraction kit
• High throughput workflow using our core facility
• Added Salmonella and Vibrionaceae to the ANI test. Two parts:
1st: validated that the new Reference Genome Dataset version 2 (RDG v2) did not change the final result for our previously validated targets
2nd: validated the new targets
[Workflow diagram]
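The first part of this re-validation amounts to a concordance check between results from the previous and the new reference dataset. A minimal sketch, assuming results are stored as strain-to-identification mappings; the function name and the example strains are hypothetical.

```python
def discordant_results(previous, current):
    """Return strains whose final result differs between two reference dataset versions.

    previous, current: dicts mapping strain ID to the reported identification.
    Only previously validated strains (keys of `previous`) are compared.
    """
    return {
        strain: (previous[strain], current.get(strain))
        for strain in previous
        if current.get(strain) != previous[strain]
    }


# Hypothetical results for previously validated targets under each dataset version.
v1 = {"strain_A": "Escherichia coli", "strain_B": "Campylobacter lari"}
v2 = {"strain_A": "Escherichia coli", "strain_B": "Campylobacter lari",
      "strain_C": "Salmonella enterica"}  # new target, not part of this comparison
print(discordant_results(v1, v2))  # an empty dict means no result changed
```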
Approach to Validation of Additional Tests
[Workflow diagram: rpoB Criteria checkpoint and rpoB phylogeny step]
• The same workflow, documents, and sequence data can be used to validate additional tests.
• Examine input requirements for sequence quality, output requirements for results interpretation, and test limitations.
• Documents specific to analysis and interpretive criteria are needed.
Reporting Results from Sequence Data Submitted by External Laboratory
Form 50.34 Submitted with State ID and Identification Test Request
Sequence Data Deposited to PHL Cloud Storage
• Validation for CLIA > Acceptable Material for testing in EDLB NGS test system
• Fit into existing systems for submission
• WGS ID provided on the form to link the sequence to the request!
In conclusion…..
• There is more than one way to approach validation of an NGS-based assay subject to CLIA regulations.
• Encourage you to join in conversation with other laboratories adopting NGS.
Thank you !
Office of Advanced Molecular Detection
National Center for Emerging and Zoonotic Infectious Diseases
CLIA-certified Reporting of Escherichia Serotype and Virulence Genes using WGS
Rebecca Lindsey, PhD
Enteric Diseases Laboratory Branch/CDC
PN session: Sequence Quality/Regulatory Authorities (CLIA/ISO 17025): Looking Through the Magnifying Glass
East Coast PulseNet/OutbreakNet Regional Meeting - Tampa, FL
Objectives
Share EDLB approach to NGS Method Validation of Escherichia subtyping for CLIA
Definitions of analysis pipeline
CLIA
• Secure, externally accessible analytical engine that uses static, well-curated databases
PulseNet
• BioNumerics v7 Escherichia database with the PulseNet WGS Tools plug-in, the Genotyper. Contains the CGE SerotypeFinder and VirulenceFinder databases.
E. coli subtyping
Test name | Results | WGS Method in BN v7
Serotype | O:H | BLAST
Virulence | stx1, stx2, eae, ehxA, ipaH | BLAST and in silico PCR
Toxin subtype | stx1a, stx1c, stx1d, stx2a-stx2g | BLAST and in silico PCR
Validation plan - Selected 95 strains
• Representatives for 31 distinctive O groups and 14 H groups
• Represent all listed STEC virulence and Shiga-toxin subtypes
Standardized Workflow for NGS Test System in EDLB
Input requirements:
• 40X average coverage required for transfer to Escherichia database
• Assembly within expected size range (4.5-5.5 MB)
[Workflow diagram: Input to subtype checkpoint]
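The two input requirements above translate directly into a gate applied before a sequence is transferred to the Escherichia database. In this sketch the function name is hypothetical and treating the size limits as inclusive is an assumption; the example values are rows from the WGS Run Summary plus one made-up failing case.

```python
def ready_for_escherichia_db(coverage: float, assembly_mb: float,
                             min_coverage: float = 40.0,
                             size_range_mb: tuple = (4.5, 5.5)) -> bool:
    """Check the stated input requirements: >=40X coverage and 4.5-5.5 MB assembly."""
    low, high = size_range_mb
    return coverage >= min_coverage and low <= assembly_mb <= high


print(ready_for_escherichia_db(53.0, 5.26))   # run summary row: passes both criteria
print(ready_for_escherichia_db(100.6, 5.38))  # run summary row: passes both criteria
print(ready_for_escherichia_db(35.0, 5.10))   # hypothetical: coverage below 40X
```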
Output requirements
BLAST-based Tests | Sequence Identity (%) | Minimum relative Coverage (%)
Serotype Detection | ≥90 | ≥60
Virulence Detection | ≥90 | ≥60
in silico PCR-based Tests
Quality Parameter | Value
Maximum non-IUPAC nucleotides | 2
Maximum nucleotide mismatches | 1
Maximum sequence length deviation (maximum) | 10%
Minimum Percent Identity | 90%
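These output requirements can be read as simple pass/fail filters on each hit. The sketch below is not the BioNumerics implementation; the function and parameter names are hypothetical, and only the threshold values are taken from the tables above.

```python
def blast_hit_passes(identity_pct: float, relative_coverage_pct: float,
                     min_identity: float = 90.0, min_coverage: float = 60.0) -> bool:
    """BLAST-based tests: sequence identity >=90% and relative coverage >=60%."""
    return identity_pct >= min_identity and relative_coverage_pct >= min_coverage


def in_silico_pcr_passes(non_iupac: int, mismatches: int,
                         length_deviation_pct: float, identity_pct: float) -> bool:
    """in silico PCR-based tests: apply the four quality parameters together."""
    return (non_iupac <= 2
            and mismatches <= 1
            and length_deviation_pct <= 10.0
            and identity_pct >= 90.0)


# Hypothetical hits.
print(blast_hit_passes(99.2, 100.0))
print(blast_hit_passes(92.0, 45.0))
print(in_silico_pcr_passes(non_iupac=0, mismatches=1,
                           length_deviation_pct=2.5, identity_pct=98.0))
```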
Validation of a subset of the O and H
Validation plan - Selected 95 strains
• Representatives for 31 distinctive O groups and 14 H groups
• Represent all described STEC virulence and Shiga-toxin subtypes
Static, curated database = approved for all serotypes!
Add sequence context: physical characteristics of region (%GC, strand bias, repetitive regions, indels, SNP)
• Can accurately sequence within %GC range
• O antigens range of 25.7% - 46.8%
• H antigens range of 42.9% - 51.9%
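The %GC argument can be illustrated by computing the GC content of a target region and comparing it with the validated ranges. This is a sketch under stated assumptions: the helper names are hypothetical, GC is computed over unambiguous A/C/G/T bases only, and the toy sequence is not a real antigen gene.

```python
O_ANTIGEN_GC_RANGE = (25.7, 46.8)  # %GC range reported for O antigens
H_ANTIGEN_GC_RANGE = (42.9, 51.9)  # %GC range reported for H antigens


def gc_percent(sequence: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T) in the sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / acgt


def within_range(value: float, gc_range: tuple) -> bool:
    low, high = gc_range
    return low <= value <= high


toy = "ATGCGCAT"  # 4 of 8 bases are G or C
print(gc_percent(toy))
print(within_range(gc_percent(toy), H_ANTIGEN_GC_RANGE))
print(within_range(gc_percent(toy), O_ANTIGEN_GC_RANGE))
```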
Standardized Workflow for NGS Test System in EDLB
Can use electronic data as controls for sequence data submitted by an external partner.
[Workflow diagram: Input to subtype checkpoint]
WGS Run Summary
Run Statistics
RunID: M3235-18-054
RunStaff: PS (koj1, jmp5)
RunDate: 11/30/2018
KitUsed: Illumina NexteraXT v2
Total Strains: 19
Total Failed: 0
Thermocycler (CT014693/7-611)
- Amplification: Acceptable Temp Range? Yes
- Amplification: Error Message? No
- Amplification: Temp Check? Yes
- Tagmentation: Acceptable Temp Range? Yes
- Tagmentation: Error Message? No
- Tagmentation: Temp Check? Yes
Post-Run Metrics
Cluster Density: 841K/mm2
Clusters Passing Filter: 92.7%
Estimated Yield: 7755.4MB
Post-Run Wash: Tween + Bleach
Q30: 88.1%
Reagent Information
Tracked Component | Expiration Date | Lot Number, Part Number, Serial Number (as recorded)
ATM | 2020-04-16 | 20256901
Beads | 2018-12-12 | 16510700
EBT | 2019-09-21 | 9/21/2018
Flowcell | 2019-06-18 | 20267990, 15028382, 000000000-C3DYY
Incorporation buffer | 2019-06-25 | 20269015
Indices | 2018-12-04 | 20158607
MiSeq cartridge | 2019-06-15 | 20288799
NPM | 2019-02-12 | 20239091
NT Buffer | 2019-07-26 | 20184109
PR2 Bottle | 2019-06-25 | 15041807, MS7045099-00PR2
Reagent Kit | 2019-06-15 | 15033573, MS7202550-500V2
RSB | 2019-07-10 | 20278909
TD Buffer | 2019-06-25 | 20278674
Isolate Information
EDLB Strain ID | Workbook ID | Coverage | Avg Quality (R1; R2) | Assembled Genome Size | ANI ID | Serotype | Virulence | Toxin_wgs | Comments
ATCC BAA-460 | Escherichia coli | 100.6 | 36.26; 33.88 | 5.38 | Escherichia coli | O157:H7 | stx1, stx2, eae, ehxA | stx1a, stx2a | Successful: Positive Control
2016C-4874 | Escherichia species | 53 | 35.78; 32.91 | 5.26 | Escherichia coli | O103:H2 | stx1, eae | stx1a | Successful
2015C-5225 | Escherichia coli | 85.8 | 36.02; 34.21 | 5.00 | Escherichia coli | O26:H11 | stx1, eae, ehxA | stx1a | Successful
2013C-3256 | Escherichia coli | 100.6 | 36.26; 33.88 | 5.38 | Escherichia coli | O111:H8 | stx1, eae, ehxA | stx1a | Successful
2015C-3895 | Escherichia species | 99.2 | 36.35; 33.30 | 4.85 | Escherichia coli | O63:H6 | stx2, eae | stx2f | Successful
2015C-3165 | Escherichia species | 74.2 | 36.12; 34.30 | 4.95 | Escherichia coli | O123/O186:H2 | stx1, eae | stx1a | Successful
2015C-3799 | Escherichia coli | 53 | 35.78; 32.91 | 5.12 | Escherichia coli | :H16 | ipaH | | Successful: Negative Control
In conclusion…..
• There is more than one way to approach validation of an NGS-based assay subject to CLIA regulations.
• Encourage you to join in conversation with other laboratories adopting NGS.
For questions, please email….([email protected]) using the subject line NGS Quality – ANI ID.
Thank you !