Reducing False Positives with Automated NMR Verification
Ryan Sasaki
NMR Product Manager
SMASH 2011
Why Not More NMR?
• Cost $$$
• NMR > LC-MS > GC-MS¹
• Sample requirements/Sensitivity
• Tougher to interpret
But can we afford to NOT have a complementary data evaluation tool?
1 High-Throughput NMR Analysis: The End Game, Anthony Macherone, ASDI Group of Companies, ENC 2008
Why NMR?
[Figure: chemical structure drawings; only atom labels survive in the transcript]
The Orthogonal NMR
• LC-MS is still the way to go on a sample-by-sample basis in high-throughput work
• Usage of NMR as a complement to LC-MS
• Automatic evaluation by NMR of only those that pass LC-MS analysis
• The goal:
• Identify a “manageable” subset of compounds that may require a second look
• The challenge:
• Ensure incorrect structures get caught
Major Challenges in Automated NMR Verification Today
• Limited set of experiments for routine and high-throughput work
• 1H NMR
• COSY?
• HSQC?
• Balancing acquisition time vs. acceptable results
• How much is enough?
• Balancing False Positives vs. False Negatives
Our Focus
• Ensuring samples that pass our system are passing for the right reason
◦ Improving accuracy of NMR assignments²
• Catching the false positives
• Both endeavors are impacted by the amount of data that can be acquired.
◦ How much is enough?
2 Evaluation of the Benefit of Including COSY and HSQC 2D Data in Automated Structure Verification, ENC 2010
1H and HSQC Combined Verification
• The benefits of a combination of 1H and HSQC
◦ NMR Prediction
– 13C chemical shifts are generally more “predictable”
– Usage of both 1H and 13C improves assignments and overall verification performance
– An incorrect 1H assignment can be caught proactively by predicting the attached carbon’s chemical shift, and vice versa
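The cross-check between a proton and its attached carbon can be sketched in a few lines. This is only an illustration of the idea, not the software's actual method: the `CrossPeak` type, the `check_assignments` function, and the tolerance values (0.3 ppm for 1H, 3.0 ppm for 13C) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CrossPeak:
    """One assigned HSQC cross-peak (hypothetical data model)."""
    atom: str      # label of the C-H pair, e.g. "C2-H2"
    h_obs: float   # observed 1H shift, ppm
    c_obs: float   # observed 13C shift, ppm
    h_pred: float  # predicted 1H shift, ppm
    c_pred: float  # predicted 13C shift, ppm


def check_assignments(peaks: List[CrossPeak],
                      h_tol: float = 0.3,    # assumed tolerance, ppm
                      c_tol: float = 3.0):   # assumed tolerance, ppm
    """Return labels of assignments where the two nuclei disagree.

    An assignment is suspicious when one nucleus matches its prediction
    and the attached nucleus does not.
    """
    suspicious = []
    for p in peaks:
        h_ok = abs(p.h_obs - p.h_pred) <= h_tol
        c_ok = abs(p.c_obs - p.c_pred) <= c_tol
        if h_ok != c_ok:
            suspicious.append(p.atom)
    return suspicious
```

With made-up shifts, a peak whose 1H shift agrees with prediction but whose 13C shift is off by tens of ppm would be returned as suspicious, while a peak that matches on both nuclei would not.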
1H and HSQC Combined Verification
• The benefits of a combination of 1H and HSQC
◦ Peak Picking and multiplet creation
– Filtering of peak artifacts in 2D
– Identification of Labile Protons in 1D
– Easier identification of diastereotopic protons
– Better recognition of distinct, but overlapping multiplets
– Multiplicity-edited information can help assignments
• The drawbacks
◦ The information does not always prove that the structure is correct.
Introducing Concurrent NMR Verification
• Can be used for any verification routine (1H, 13C, Combined 1H & HSQC).
• Verification triggers the generation of multiple alternative structures every time a proposed structure passes.
• The software automatically verifies all proposed structures under default conditions and settings.
[Figure: chemical structure drawings; only atom labels survive in the transcript]
Concurrent NMR Verification
• If the software passes at least one “generated” structure, it re-runs verification under tighter chemical shift constraints
Concurrent NMR Verification
• If multiple structures (including the proposed one) survive 3 iterations, a flag is generated.
◦ False positive warning: either multiple structures pass, or all structures fail
◦ False positive alert: only an incorrect structure passes
• False positive warnings and alerts suggest
◦ The proposed structure may not be correct
◦ The data is ambiguous and additional experiments may be required.
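The workflow on the preceding slides can be sketched as a small loop. This is one possible reading of the bullets, not the software's implementation: the `verify` callback, the three tolerance values, and the exact order in which outcomes are decided are assumptions made for illustration.

```python
from enum import Enum
from typing import Callable, Sequence


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    WARNING = "false positive warning"
    ALERT = "false positive alert"


def concurrent_verify(proposed: str,
                      alternatives: Sequence[str],
                      verify: Callable[[str, float], bool],
                      tolerances: Sequence[float] = (1.0, 0.5, 0.25)) -> Outcome:
    """Classify a proposed structure against generated alternatives.

    verify(structure, tolerance) is a hypothetical callback that returns
    True when the structure passes at that chemical shift tolerance.
    tolerances[0] stands in for the default settings; later entries are
    progressively tighter (the values are placeholders).
    """
    # Alternatives are only considered once the proposed structure passes.
    if not verify(proposed, tolerances[0]):
        return Outcome.FAIL

    candidates = list(alternatives)
    for tol in tolerances:
        proposed_ok = verify(proposed, tol)
        survivors = [s for s in candidates if verify(s, tol)]
        if proposed_ok and not survivors:
            return Outcome.PASS      # only the proposed structure is left
        if not proposed_ok and survivors:
            return Outcome.ALERT     # only a generated structure passes
        if not proposed_ok and not survivors:
            return Outcome.WARNING   # all structures fail after tightening
        candidates = survivors       # still ambiguous: tighten and retry

    # Proposed and at least one alternative survived every iteration.
    return Outcome.WARNING


# Toy usage: each structure has a made-up worst-case shift error (ppm).
errors = {"proposed": 0.6, "alt_1": 0.2}
result = concurrent_verify("proposed", ["alt_1"],
                           lambda s, tol: errors[s] <= tol)
print(result.value)
```

In the toy run, both structures pass at the default tolerance, but only the alternative survives the first tightening, so the result is a false positive alert.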
Concurrent NMR Verification
• Questions to ask:
◦ How does this affect the “pass rate”?
◦ How does this affect the “false positive” rate?
◦ What is the best way to measure the results?
Concurrent NMR Verification Test 1
• 127 1H and HSQC datasets evaluated
• One positive control and one negative control structure were evaluated
[Figure: two chemical structure drawings; only atom labels survive in the transcript]
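Comparison of Standard Verification vs. Concurrent Verification
[Bar chart: number of datasets (of 127) by outcome]
• Standard Verification, Correct Structures: Pass 100, Fail 27
• Standard Verification, Incorrect Structures: Pass 26, Fail 101
• Concurrent Verification, Correct Structures: Pass 92, Fail 27, Alert/Warning 8
• Concurrent Verification, Incorrect Structures: Pass 0, Fail 119, Alert/Warning 8
• Standard Verification: 79% Pass Rate, 20% False Positive Rate
• Concurrent Verification: 72% Pass Rate, 0% False Positive Rate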
Concurrent NMR Verification Test 1 Results
• Test 1 suggested a system whereby 72% of spectra (92/127) can be automatically evaluated without human intervention with a false positive rate of 0%
• This compares with the standard verification approach, whereby 79% of spectra (100/127) can be automatically evaluated, but with a false positive rate of 20%.
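The percentages quoted above follow from the counts in the Test 1 chart. The short check below assumes that the pass rate is the share of correct structures that passed, and the false positive rate is the share of incorrect structures that passed, each out of 127 datasets. These definitions are inferred from the numbers rather than stated on the slides, and the `rate` helper is hypothetical.

```python
TOTAL = 127  # datasets evaluated in Test 1


def rate(count: int, total: int = TOTAL) -> int:
    """Percentage rounded to the nearest whole number, as on the slides."""
    return round(100 * count / total)


# Number of structures that passed, read from the Test 1 chart.
standard = {"correct_pass": 100, "incorrect_pass": 26}
concurrent = {"correct_pass": 92, "incorrect_pass": 0}

for name, counts in (("standard", standard), ("concurrent", concurrent)):
    print(name,
          "pass rate:", rate(counts["correct_pass"]),
          "false positive rate:", rate(counts["incorrect_pass"]))
```

The rounded values reproduce the 79% and 20% figures for standard verification and the 72% and 0% figures for concurrent verification.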
Concurrent NMR Verification Test 2
• The same 127 1H and HSQC datasets evaluated
• Mimic a scenario where the correct structure is not proposed
• Two negative controls (wrong structures) were evaluated
• The incorrect structure originally proposed in Test #1 was treated as the “proposed” structure, for a fair comparison of the two approaches
Comparison of Standard Verification vs. Concurrent Verification
[Bar chart: number of datasets (of 127) by outcome]
• Standard Verification, Incorrect Structures: Pass 26, Fail 101
• Concurrent Verification, Incorrect Structures: Pass 6, Fail 102, Alert/Warning 19
• Standard Verification: 20% False Positive Rate
• Concurrent Verification: 5% False Positive Rate
Concurrent NMR Verification Test 2 Results
• Test 2 showed a reduction in the false positive rate from 20% (26/127) to 5% (6/127)
• This improvement comes with the added cost of 19 additional datasets that were flagged for manual review
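The trade-off in Test 2 can be made explicit with the counts from the chart. The sketch below is illustrative only: the `summarize` helper is hypothetical, and treating every flagged dataset as one unit of manual review is an assumption.

```python
TOTAL = 127  # datasets evaluated in Test 2

# Outcomes for the incorrect "proposed" structures, read from the Test 2 chart.
standard = {"pass": 26, "fail": 101, "flagged": 0}
concurrent = {"pass": 6, "fail": 102, "flagged": 19}


def summarize(counts: dict, total: int = TOTAL) -> dict:
    """Return rounded percentages for false positives and manual review."""
    if sum(counts.values()) != total:
        raise ValueError("outcome counts must add up to the number of datasets")
    return {
        "false_positive_pct": round(100 * counts["pass"] / total),
        "manual_review_pct": round(100 * counts["flagged"] / total),
    }


# False positives avoided by switching to concurrent verification.
avoided = standard["pass"] - concurrent["pass"]

print(summarize(standard))
print(summarize(concurrent))
print("false positives avoided:", avoided)
```

On these counts, 20 false positives are avoided while 19 datasets (about 15% of the set) are sent for manual review.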
Conclusions
• Tests showed that concurrent verification can substantially reduce the false positive rate (from 20% to 0% in Test 1 and from 20% to 5% in Test 2) without a significant increase in manual labor (review of flagged results)
• The new category of false positive warnings/alerts can be used to communicate that more experiments may be required for confirmation
Current and Future Developments
• Implementation of a structure generation component to automatically generate alternative structures on the fly
• More tests to evaluate the impact of increasing the number of alternative chemical structures³
• Analysis of the impact of concurrent verification on other experiments (1H, 13C, etc.)
• Further work on the impact of COSY on Combined Verification²
• The usage of peak deconvolution to estimate the reliability of multiplicity patterns
3 ENC Posters #386 and #388
Acknowledgements
• Sergey Golotvin, ACD/Labs
• Kirill Blinov, ACD/Labs
• Asya Nikitina, ACD/Labs
• Phil Keyes, Lexicon Pharmaceuticals
• Gonzalo Hernandez, Vis Magnetica
• John Hollerton, GSK Stevenage
• Duncan Farrant, GSK Stevenage
• Randy Rutkowske, GSK RTP
• Tim Spitzer, GSK RTP