Post on 07-Feb-2016
Improving Peptide Probability Modeling in Scaffold 4
Brian C. Searle
brian.searle@proteomesoftware.com
Scaffold Users Meeting, 2013
Creative Commons Attribution
Scaffold 4 Improvements
• Probability Estimation using LFDR
• Target/Decoy Classification of multiple scores
• Delta Mass Error Modeling Improvements
• Requires Target/Decoy analysis (1:1 … 1:10)
“Correct”
“Incorrect”
$$p(+ \mid D) = \frac{p(D \mid +)\,p(+)}{p(D \mid +)\,p(+) + p(D \mid -)\,p(-)}$$
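A minimal sketch of how a local FDR (LFDR) could be estimated from a target/decoy search, assuming simple fixed-width score bins. The function name, the binning scheme, and the `decoy_ratio` parameter (decoy database size divided by target database size, so 1.0 for 1:1 and 10.0 for 1:10) are illustrative assumptions, not Scaffold's actual implementation. In each bin, the decoy count scaled by the ratio stands in for the number of incorrect target hits, and the probability that an ID is correct is 1 minus the LFDR.

```python
from bisect import bisect_right

def local_fdr_by_bin(target_scores, decoy_scores, edges, decoy_ratio=1.0):
    """Estimate the local FDR in each half-open score bin [edges[i], edges[i+1]).

    Hypothetical helper: decoy hits, scaled by decoy_ratio, approximate the
    number of incorrect target hits that fall in the same score bin.
    """
    n_bins = len(edges) - 1
    targets = [0] * n_bins
    decoys = [0] * n_bins
    for scores, counts in ((target_scores, targets), (decoy_scores, decoys)):
        for s in scores:
            i = bisect_right(edges, s) - 1
            if 0 <= i < n_bins:  # scores outside the bin range are ignored
                counts[i] += 1
    lfdr = []
    for t, d in zip(targets, decoys):
        if t == 0:
            lfdr.append(None)  # no target hits: nothing to estimate
        else:
            lfdr.append(min(1.0, (d / decoy_ratio) / t))
    return lfdr

# Made-up scores, for illustration only
targets = [0.5, 0.6, 1.2, 1.5, 1.7, 2.1, 2.2, 2.5, 2.8]
decoys = [0.4, 0.7, 1.1]
lfdr = local_fdr_by_bin(targets, decoys, edges=[0, 1, 2, 3])
p_correct = [None if v is None else 1.0 - v for v in lfdr]
```

With these made-up scores, the lowest bin holds as many decoys as targets, so its IDs get a probability near 0, while the highest bin holds no decoys and its IDs get a probability near 1.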
[Figure: Number of Identified Proteins vs. Protein-Level False Discovery Rate]
XCorr
DeltaCN
% Ions Identified
…
$$\text{discriminant score} = \log\left(\frac{\prod p(D \mid +)}{\prod p(D \mid -)}\right)$$
Naïve Bayes Classifier
• Trained to each data set
• Simple (can calculate with a formula, no magic!)
• Robust to over-fitting
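The discriminant score can be sketched in a few lines. This example assumes each score (for instance XCorr and DeltaCN) follows a Gaussian within the correct and incorrect classes, and that labeled example rows are available for training. Both are assumptions made for illustration: the slides do not say which distributions or training labels Scaffold uses, and all function names and values here are hypothetical.

```python
import math

def fit_gaussian(values):
    """Return (mean, variance) of one score column; variance is floored."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, max(var, 1e-9)

def log_gaussian(x, mean, var):
    """Log density of a normal distribution at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def train(rows):
    """Fit one independent Gaussian per score column (the 'naive' assumption)."""
    return [fit_gaussian(column) for column in zip(*rows)]

def discriminant_score(x, pos_model, neg_model):
    """log( prod p(D|+) / prod p(D|-) ), computed as a sum of log ratios."""
    return sum(
        log_gaussian(v, *pos) - log_gaussian(v, *neg)
        for v, pos, neg in zip(x, pos_model, neg_model)
    )

# Made-up (XCorr, DeltaCN) rows, for illustration only
correct_rows = [(3.0, 0.4), (3.5, 0.5), (4.0, 0.6)]
incorrect_rows = [(1.0, 0.05), (1.5, 0.1), (2.0, 0.15)]
pos_model = train(correct_rows)
neg_model = train(incorrect_rows)

high = discriminant_score((3.6, 0.5), pos_model, neg_model)   # positive: looks correct
low = discriminant_score((1.4, 0.08), pos_model, neg_model)   # negative: looks incorrect
```

Because the model is just per-score means and variances, refitting it to each data set is cheap, and adding a score adds one more term to the sum.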
[Figure: Number of Identified Proteins vs. Protein-Level False Discovery Rate]
Probability the ID is Correct
Probability the ID is Wrong
[Figure: Number of Identified Proteins vs. Protein-Level False Discovery Rate]
[Figure: Number of Identified Proteins at 1% Peptide FDR, plotted against Protein-Level FDR]
1% Peptide FDR > 10% Protein FDR?!?
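A back-of-the-envelope sketch of why a 1% peptide FDR can turn into a much higher protein FDR. It assumes that correct peptide IDs pile up on a limited set of true proteins, while each incorrect peptide ID lands on its own, otherwise unidentified protein (a worst-case simplification). The function name and all numbers are hypothetical.

```python
def protein_fdr(n_correct_peptides, n_incorrect_peptides, peptides_per_true_protein):
    """Protein-level FDR under the simplifying assumptions stated above."""
    true_proteins = n_correct_peptides / peptides_per_true_protein
    false_proteins = n_incorrect_peptides  # one false protein per wrong peptide
    return false_proteins / (true_proteins + false_proteins)

# 10,000 peptide IDs at 1% peptide FDR: 9,900 correct and 100 incorrect.
# If correct IDs average 11 peptides per protein, that is 900 true proteins.
fdr = protein_fdr(9900, 100, 11)
```

Here 100 false proteins out of 1,000 reported proteins gives a protein FDR of about 10%, even though only 1% of the peptide IDs are wrong.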
New Search Engines?
• Difficult to add new search engines with PeptideProphet (new seeds)
• Easy to add with Naïve Bayes / LFDR
• mzIdentML interchange (HUPO standard)
New Search Engines in Scaffold 4
• PEAKS
• Byonic
• MyriMatch (Tabb Lab)
• SQID (Wysocki Lab)
• MS-GF+ (Pevzner Lab)
• MS-Amanda (Mechtler Lab, PD)
• ... Any engine with decoys & mzIdentML!
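To illustrate why decoys plus mzIdentML are enough in principle, here is a minimal sketch that splits scores into target and decoy lists using only Python's standard library. The XML below is a trimmed, hypothetical fragment rather than a complete valid file, the score name `example:score` and its accession are placeholders, and treating a match as a decoy only when all of its peptide evidence is decoy is an assumed convention. This is not how Scaffold loads files.

```python
import xml.etree.ElementTree as ET

NS = {"mz": "http://psidev.info/psi/pi/mzIdentML/1.1"}

# Trimmed, hypothetical fragment for illustration (not a complete mzIdentML file)
SAMPLE = """<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1">
  <SequenceCollection>
    <PeptideEvidence id="PE1" peptide_ref="P1" dBSequence_ref="DB1" isDecoy="false"/>
    <PeptideEvidence id="PE2" peptide_ref="P2" dBSequence_ref="DB2" isDecoy="true"/>
  </SequenceCollection>
  <DataCollection><AnalysisData><SpectrumIdentificationList id="SIL1">
    <SpectrumIdentificationResult id="SIR1" spectrumID="scan=1" spectraData_ref="SD1">
      <SpectrumIdentificationItem id="SII1" rank="1" chargeState="2" peptide_ref="P1"
          experimentalMassToCharge="500.25" passThreshold="true">
        <PeptideEvidenceRef peptideEvidence_ref="PE1"/>
        <cvParam cvRef="EX" accession="EX:0000001" name="example:score" value="42.0"/>
      </SpectrumIdentificationItem>
    </SpectrumIdentificationResult>
    <SpectrumIdentificationResult id="SIR2" spectrumID="scan=2" spectraData_ref="SD1">
      <SpectrumIdentificationItem id="SII2" rank="1" chargeState="2" peptide_ref="P2"
          experimentalMassToCharge="612.31" passThreshold="true">
        <PeptideEvidenceRef peptideEvidence_ref="PE2"/>
        <cvParam cvRef="EX" accession="EX:0000001" name="example:score" value="7.5"/>
      </SpectrumIdentificationItem>
    </SpectrumIdentificationResult>
  </SpectrumIdentificationList></AnalysisData></DataCollection>
</MzIdentML>"""

def split_scores(xml_text, score_name):
    """Return (target_scores, decoy_scores) for the named cvParam score."""
    root = ET.fromstring(xml_text)
    is_decoy = {
        pe.get("id"): pe.get("isDecoy") == "true"
        for pe in root.findall(".//mz:PeptideEvidence", NS)
    }
    targets, decoys = [], []
    for item in root.findall(".//mz:SpectrumIdentificationItem", NS):
        refs = item.findall("mz:PeptideEvidenceRef", NS)
        # Assumed convention: decoy only if every referenced evidence is decoy
        decoy = bool(refs) and all(
            is_decoy.get(r.get("peptideEvidence_ref"), False) for r in refs
        )
        for cv in item.findall("mz:cvParam", NS):
            if cv.get("name") == score_name:
                (decoys if decoy else targets).append(float(cv.get("value")))
    return targets, decoys

target_scores, decoy_scores = split_scores(SAMPLE, "example:score")
```

The two resulting lists are the only engine-specific input a target/decoy probability model needs, which is why the file format rather than the engine becomes the integration point.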
Scaffold 4 Improvements
• New Naïve Bayes / LFDR Probabilities
  – Probability Estimation using LFDR
  – Target/Decoy Classification
  – Delta Mass Error Modeling
  – “Next generation” search engine interpretation
• New mzIdentML File Loading
  – Several newly supported search engines
  – Any search engine with decoys