Interpreting MS/MS Proteomics Results
Brian C. Searle
Proteome Software Inc., Portland, Oregon USA
NPC Progress Meeting (February 2nd, 2006)
The first thing I should say is that none of the material presented is original research done at Proteome Software, but we do strive to make the tools presented here available in our software product, Scaffold. With that caveat aside…
Illustrated by Toni Boudreault
Organization
Identify
SEQUEST
X! Tandem/Mascot
Differ
Combine
This is foremost an introduction, so we’re first going to talk about how you go about identifying proteins with tandem mass spectrometry in the first place.
Then we’re going to talk about the motivations behind the development of the first really useful bioinformatics technique in our field, SEQUEST.
This technique has been extended by two other tools called X! Tandem and Mascot.
We’re also going to talk about how these programs differ and how we can use that to our advantage by considering them simultaneously using probabilities.
So, this is proteomics, so we’re going to use tandem mass spectrometry to identify proteins-- hopefully many of them, and hopefully very quickly.
[Figure: a protein drawn as a chain of single-letter amino acid codes]
Start with a protein
And to use this technique you generally have to cleave the protein into peptides about 8 to 20 amino acids in length and…
Cut with an enzyme
Select a peptide
Look at each peptide individually.
We select the peptide by mass using the first half of the tandem mass spectrometer
A E P T I R H2O
Impart energy in collision cell
The mass spectrometer imparts energy into the peptide causing it to fragment at the peptide bonds between amino acids.
[Figure: spectrum of intensity vs. m/z with fragment peaks A (72.0), AE (201.1), AEP (298.1), and AEPT (399.2)]
Measure mass of daughter ions
The masses of these fragment ions are recorded using the second mass spectrometer.
B-type Ions
[Figure: B-type ion ladder for A E P T I R (+H2O); intensity vs. m/z, with peak spacings 72.0, 129.0, 97.0, 101.0, 113.1, 174.1]
These ions are commonly called B ions, based on nomenclature you don’t really want to know about…
But the mass difference between the peaks corresponds directly to the amino acid sequence.
B-type Ions
[Figure: the same B-type ion ladder for A E P T I R (+H2O), with each peak spacing labeled as a mass difference]
A-0 = 72.0
AE-A = 129.0
AEP-AE = 97.0
AEPT-AEP = 101.0
AEPTI-AEPT = 113.1
AEPTIR-AEPTI = 174.1
For example, the A-E peak minus the A peak should produce the mass of E.
You can build these mass differences up and derive a sequence for the original peptide.
This is pretty neat and it makes tandem mass spectrometry one of the best tools out there for sequencing novel peptides.
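The mass-difference idea can be sketched in a few lines of Python. This is only an illustration: the residue masses are standard monoisotopic values, the first four peaks are the ones shown on the slide, the fifth (512.3) is assumed by adding the slide's 113.1 spacing, and the function name and tolerance are made up for the example.

```python
# Monoisotopic residue masses in Da (standard values; I and L are isobaric).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I/L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.00728  # a singly charged B ion is the residue sum plus one proton


def read_b_ladder(peaks, tol=0.1):
    """Turn an ideal B-ion ladder into residue calls via peak-to-peak mass differences."""
    calls = []
    previous = PROTON
    for mz in sorted(peaks):
        delta = mz - previous
        # Every residue whose mass is within the tolerance of the spacing is a candidate.
        matches = [aa for aa, mass in RESIDUE_MASS.items() if abs(mass - delta) <= tol]
        calls.append("|".join(matches) if matches else "?")
        previous = mz
    return calls


# Hypothetical input: slide peaks 72.0 to 399.2, plus 399.2 + 113.1 for the fifth residue.
ladder = read_b_ladder([72.0, 201.1, 298.1, 399.2, 512.3])
print(ladder)
```

Note that the ladder cannot tell I from L, and at this tolerance a spacing near 128.1 would be reported as both Q and K.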
So, it seems pretty easy, doesn’t it?
But there are a couple confounding factors.
For example…
[Figure: B-type ions of A E P T I R (+H2O), each shown with a CO; intensity vs. m/z]
B ions have a tendency to degrade and lose carbon monoxide producing…
[Figure: A-type ions of A E P T I R (+H2O), each shown with a CO; m/z axis]
A ions.
Furthermore…
[Figure: Y-type ions, read as R I T P E A (+H2O); intensity vs. m/z]
…the second halves of the fragmented peptides are represented as Y ions, which sequence backwards.
And, unfortunately, this is the real world, so…
[Figure: Y-type ions for R I T P E A (+H2O); intensity vs. m/z]
… All the peaks have different measured heights and many peaks can often be missing.
[Figure: B-type, A-type, and Y-type ions overlaid for R I T P E A (+H2O); intensity vs. m/z]
All these peaks are seen together simultaneously, and we don’t even know…
[Figure: spectrum; intensity vs. m/z]
…what type of ion they are, making the mass differences approach even more difficult.
Finally, as with all analytical techniques, there’s noise, producing a final spectrum that looks like…
[Figure: noisy spectrum; intensity vs. m/z]
…this, on a good day. And so it’s actually fairly difficult to…
[Figure: the noisy spectrum annotated with A E P T I R (+H2O) and spacings 72.0, 129.0, 97.0, 101.0, 113.1, 174.1]
…compute the mass differences to sequence the peptide, certainly in a computer automated way.
So the community needed a new technique.
Now, it wasn’t all without hope…
Known Ion Types
B-type ions
A-type ions
Y-type ions
We knew a couple of things about peptide fragmentation.
Not only do we know to expect B, A, and Y ions,
but…
Known Ion Types
B-type ions
A-type ions
Y-type ions
B- or Y-type +2H ions
B- or Y-type -NH3 ions
B- or Y-type -H2O ions
… We also know a couple
of other variations on
those ions that come up.
We even know something
about the…
Known Ion Types
• B-type ions: 100%
• A-type ions: 20%
• Y-type ions: 100%
• B- or Y-type +2H ions: 50%
• B- or Y-type -NH3 ions: 20%
• B- or Y-type -H2O ions: 20%
… likelihood of seeing each type of ion,
where generally B and Y ions are most prominent.
If we know the amino acid
sequence of a peptide,
we can guess
what the spectra should look like!
So it’s actually pretty easy to guess what a spectrum
should look like
if we know what the peptide sequence is.
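As a rough sketch of that guess, the snippet below builds a model spectrum from a sequence using the relative heights from the slide (100% for B and Y, 50% for +2H, 20% for the rest). The masses are standard monoisotopic values; treating "+2H" as the doubly charged form of each B or Y ion is an assumption made here, and the function name is hypothetical.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON, WATER, AMMONIA, CO = 1.00728, 18.01056, 17.02655, 27.99491


def model_spectrum(peptide):
    """Return (label, m/z, relative intensity) tuples for a hypothetical spectrum."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(peptide)
    peaks = []
    for i in range(1, n):
        b = sum(masses[:i]) + PROTON          # N-terminal fragment
        y = sum(masses[i:]) + WATER + PROTON  # C-terminal fragment
        for label, mz in ((f"b{i}", b), (f"y{n - i}", y)):
            peaks.append((label, mz, 100))
            # Assumption: "+2H" means the doubly charged version of the ion.
            peaks.append((label + "++", (mz + PROTON) / 2, 50))
            peaks.append((label + "-NH3", mz - AMMONIA, 20))
            peaks.append((label + "-H2O", mz - WATER, 20))
        peaks.append((f"a{i}", b - CO, 20))   # A ion: B ion minus carbon monoxide
    return sorted(peaks, key=lambda peak: peak[1])


for label, mz, height in model_spectrum("ELVISLIVESK")[:5]:
    print(f"{label:8s} {mz:9.3f} {height:4d}%")
```

Running it on AEPTIR reproduces the B-ion masses from the earlier slides (72.0, 201.1, 298.1, 399.2), which is a quick sanity check on the arithmetic.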
ELVISLIVESK
Model Spectrum
*Courtesy of Dr. Richard Johnson, http://www.hairyfatguy.com/
So as an example, consider the peptide ELVIS LIVES K that was synthesized by Rich Johnson in Seattle.
Model Spectrum
We can create a hypothetical spectrum based on our rules
• B/Y type ions (100%)
• B/Y +2H type ions (50%)
• A type ions, B/Y -NH3/-H2O (20%)
Where B and Y ions are estimated at 100%, +2H ions are estimated at 50%, and other stragglers are at 20%.
Model Spectrum
So if we consider the spectrum that was derived from the ELVIS LIVES K peptide…
Model Spectrum
We can find where the overlap is between the hypothetical and the actual spectra…
Model Spectrum
And say conclusively based on the evidence that the spectrum does belong to the ELVIS LIVES K peptide.
But who cares?
The more important question is
“what about situations where we don’t know the sequence?”
We guess!
PepSeq
AAAAAAAAAA
AAAAAAAAAC
AAAAAAAACC
AAAAAAACCC
ELVISLIVESK
WYYYYYYYYY
YYYYYYYYYY
……
J. Rozenski et al., Org. Mass Spectrom., 29 (1994) 654-658.
And so this was an approach followed by a program called PepSeq, which would guess every combination of amino acids possible, build a hypothetical spectrum, and find the best matching hypothetical.
PepSeq
• Impossibly hard after 7 or 8 amino acids!
• High false positive rate because you consider so many options
This was a start, but it’s clearly impossibly hard with larger peptides and there’s a lot of room to overfit the data.
Another strategy is needed!
So obviously this isn’t going to work in the long run.
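To get a feel for why guessing everything breaks down, here is a toy brute-force search in Python. It is not PepSeq itself: it only enumerates every sequence of a given length and keeps those whose total residue mass matches a target, and the function name, tolerance, and target mass are assumptions for illustration.

```python
from itertools import product

# Monoisotopic residue masses in Da; "L" stands for both I and L (same mass).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def brute_force_candidates(target_mass, length, tol=0.01):
    """Enumerate every sequence of `length` residues and keep the mass matches."""
    hits = []
    for combo in product(RESIDUE_MASS, repeat=length):
        total = sum(RESIDUE_MASS[aa] for aa in combo)
        if abs(total - target_mass) <= tol:
            hits.append("".join(combo))
    return hits


# Hypothetical target: the summed residue masses of A, E, and P.
target = RESIDUE_MASS["A"] + RESIDUE_MASS["E"] + RESIDUE_MASS["P"]
hits = brute_force_candidates(target, 3)
print(len(RESIDUE_MASS) ** 3, "sequences tried,", len(hits), "mass matches")

# The number of guesses grows by a factor of about 20 per added residue.
for length in (3, 7, 11):
    print(length, len(RESIDUE_MASS) ** length)
```

Even at length 3, several orderings of the same residues match the mass equally well, which hints at where the false positives come from once each candidate also has to be scored against the spectrum.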
Sequencing Explosion
• 1977 Shotgun sequencing invented, bacteriophage φX174 sequenced
• 1989 Yeast Genome project announced
• 1990 Human Genome project announced
• 1992 First chromosome (Yeast) sequenced
• 1995 H. influenzae sequenced
• 1996 Yeast Genome sequenced
• 2000 Human Genome draft
…
We needed a new invention to come around, and that was shotgun Sanger sequencing.
In ’89 and ’90 the Yeast and Human Genome projects were announced, followed by the first chromosome in ’92, et cetera, et cetera.
SEQUEST
Eng, J. K.; McCormack, A. L.; Yates, J. R. III J. Am. Soc. Mass Spectrom. 1994, 5, 976-989.
In 1994 Jimmy Eng and John Yates published a technique to exploit genome sequencing for use in tandem mass spectrometry.
And the idea was…
…instead of searching all possible peptide sequences, search only those in genome databases.
Now, in the post-genomic world this seems like a pretty trivial idea, but back then there was a lot of assumption placed on the idea that we’d actually have a complete Human genome in a reasonable amount of time.
SEQUEST
2*10^14 -- All possible 11mers (ELVISLIVESK)
2*10^10 -- All possible peptides in NR
1*10^8 -- All tryptic peptides in NR
4*10^6 -- All Human tryptic peptides in NR
So, in terms of 11 amino acid peptides, we’re talking about a 10 thousand fold difference between searching every possible 11mer and searching those in the current non-redundant protein database from the NCBI, and roughly a 50 million fold difference for searching human tryptic peptides.
So that was huge,
it made hypothetical spectrum matching feasible.
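The arithmetic behind those numbers is easy to check. In the snippet below the four search-space sizes are the rounded figures from the slide, and the dictionary name is just for illustration.

```python
# With 20 amino acids there are 20^11 possible 11mers.
all_11mers = 20 ** 11
print(f"{all_11mers:.3e}")  # about 2 x 10^14, matching the slide

# Rounded search-space sizes quoted on the slide.
search_space = {
    "all possible 11mers": 2e14,
    "all possible peptides in NR": 2e10,
    "all tryptic peptides in NR": 1e8,
    "all Human tryptic peptides in NR": 4e6,
}

# Fold reduction of each restricted search relative to guessing every 11mer.
fold = {name: search_space["all possible 11mers"] / size
        for name, size in search_space.items()}
for name, reduction in fold.items():
    print(f"{name}: {reduction:,.0f}-fold smaller")
```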
SEQUEST Model Spectrum
SEQUEST made a couple of other interesting improvements as well.
Jimmy and John noted that there was a discontinuity between the intensities of the hypothetical spectrum and the actual spectrum.
Instead of trying to make a better model, they decided just to make the actual spectrum look like the model with normalization…
SEQUEST Model Spectrum
…like so. For a scoring function they decided to use Cross-Correlation, which basically sums the peaks that overlap between the hypothetical and the actual spectra.
SEQUEST Model Spectrum
And then they shifted the spectra back and…
SEQUEST Model Spectrum
…forth so that the peaks shouldn’t align.
They used this number, also called the Auto-Correlation, as their background.
SEQUEST XCorr
[Figure: Correlation Score vs. Offset (AMU), showing the Cross Correlation (direct comparison) and the Auto Correlation (background). Gentzel M. et al., Proteomics 3 (2003) 1597-1610]
This is another representation of the Cross Correlation and the Auto Correlation.
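SEQUEST XCorr
$$\mathrm{XCorr} = \frac{\mathrm{CrossCorr}}{\mathrm{avg}\left(\mathrm{AutoCorr}_{\,\mathrm{offset}=-75 \ldots 75}\right)}$$
[Figure: Correlation Score vs. Offset (AMU), showing the Cross Correlation (direct comparison) and the Auto Correlation (background). Gentzel M. et al., Proteomics 3 (2003) 1597-1610]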
The XCorr score is the Cross Correlation divided
by the average of the auto correlation over a
150 AMU range.
The XCorr is high if the direct comparison is significantly
greater than the background,
which is obviously good for peptide identification.
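A minimal sketch of that calculation, following the ratio form given on the slide, might look like this in Python. It assumes spectra already binned to 1 AMU with unit peak heights; the function names and the toy peak positions are hypothetical, and a real search engine does considerably more preprocessing.

```python
def correlation(observed, model, offset):
    """Sum of overlapping peak products with the model shifted by `offset` bins."""
    total = 0.0
    for i, intensity in enumerate(observed):
        j = i + offset
        if 0 <= j < len(model):
            total += intensity * model[j]
    return total


def xcorr(observed, model, max_offset=75):
    """Direct comparison divided by the average over offsets -75..75 (slide's form)."""
    direct = correlation(observed, model, 0)
    shifted = [correlation(observed, model, t)
               for t in range(-max_offset, max_offset + 1)]
    background = sum(shifted) / len(shifted)
    # Guard: with no overlap at any offset there is nothing to normalize against.
    return direct / background if background > 0 else 0.0


def binned(peak_bins, size=500):
    """Toy spectrum: unit-height peaks at the given integer m/z bins."""
    spectrum = [0.0] * size
    for b in peak_bins:
        spectrum[b] = 1.0
    return spectrum


model = binned([100, 250, 400])
good = xcorr(binned([100, 250, 400]), model)  # peaks line up at offset 0
bad = xcorr(binned([110, 260, 410]), model)   # peaks only line up when shifted
print(good, bad)
```

The aligned spectrum scores far above its background, while the spectrum that only matches after shifting scores zero, which is the behavior the XCorr is designed to reward.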
And this XCorr is actually a pretty robust method for estimating how accurate the match is, and so far, there really haven’t been any significant improvements on it.
SEQUEST DeltaCn
$$\mathrm{DeltaCn} = \frac{\mathrm{XCorr}_1 - \mathrm{XCorr}_2}{\mathrm{XCorr}_1}$$
The DeltaCn is another score that scientists often use.
It measures how good the XCorr is relative to the next best match.
As you can see, this is actually a pretty crude calculation.
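In code the DeltaCn is a one-liner once the candidate XCorr values are ranked. The sketch below assumes at least two candidates and a positive best score; the function name and the example scores are made up.

```python
def delta_cn(xcorrs):
    """Relative gap between the best and second-best XCorr for one spectrum."""
    ranked = sorted(xcorrs, reverse=True)
    best, second = ranked[0], ranked[1]
    return (best - second) / best


# Hypothetical XCorr values for three candidate peptides matched to one spectrum.
print(delta_cn([3.0, 1.5, 2.4]))  # about 0.2: the runner-up is 20% below the best
```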
            Accuracy Score    Relative Score
SEQUEST     Strong (XCorr)    Weak (DeltaCn)
Here’s another representation of that sentiment.
The XCorr is a strong measure of accuracy,
whereas the DeltaCn is a weak measure of relative goodness.
                    Accuracy Score    Relative Score
SEQUEST             Strong (XCorr)    Weak (DeltaCn)
Alternate Method    Weak              Strong
Obviously, there could be an alternative method that focuses more on the success of the relative score.
Mascot and X! Tandem fit that bill.
X! Tandem Scoring
by-Score = Sum of intensities of peaks matching B-type or Y-type ions
$$\mathrm{HyperScore} = \text{by-Score} \cdot N_y! \cdot N_b!$$
Fenyo, D.; Beavis, R. C. Anal. Chem., 75 (2003) 768-774
Now the X! Tandem accuracy score is rather crude. It only considers B and Y ions and attaches these factorial terms with an admittedly hand-waving argument.
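Written out in Python, the score on the slide is just a sum and two factorials. This is a sketch of the formula as presented, not of X! Tandem's actual code; the function name and the example intensities are assumptions.

```python
from math import factorial


def hyperscore(matched_b, matched_y):
    """by-Score times Nb! times Ny!, given intensities of matched B and Y peaks."""
    by_score = sum(matched_b) + sum(matched_y)
    return by_score * factorial(len(matched_b)) * factorial(len(matched_y))


# Hypothetical match: three B-ion peaks and two Y-ion peaks were found.
print(hyperscore([10, 20, 30], [5, 15]))  # 80 * 3! * 2! = 960
```

The factorial terms make the score grow very quickly with the number of matched ions, so two extra matches matter far more than slightly taller peaks.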
Distribution of “Incorrect” Hits
[Figure: histogram of # of Matches vs. Hyper Score, with the Best Hit and Second Best marked]
But instead of just considering the best match to the second best, it looks at the
distribution of lower scoring hits, assuming that they are all wrong.
This is somewhat based on ideas pioneered with the BLAST algorithm.
Here, every bar represents the number of matches at a given score.
The X! Tandem creators found that the distribution decays (or slopes down)
exponentially…
Estimate Likelihood (E-Value)
[Figure: Log(# of Matches) vs. Hyper Score, with the Best Hit marked]
…and the log of the distribution is relatively linear because of the exponential decay.
Estimate Likelihood (E-Value)
[Figure: Log(# of Matches) vs. Hyper Score, with a linear fit labeled “Expected Number Of Random Matches” and the Best Hit marked]
If the distribution represents the number of random
matches at any given score,
the linear fit should correspond to the expected number of random matches.
Estimate Likelihood (E-Value)
[Figure: Log(# of Matches) vs. Hyper Score, with the Best Hit marked. Annotation: Score of 60 has 1/10 chance of occurring at random]
This is called an E-Value, or Expected-Value.
And from this, you can calculate the likelihood that the best match is random.
In this case, a score of 60 corresponds to a log number of matches of -1, which means the estimated number of random matches for that score is 0.1.
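The extrapolation can be sketched with an ordinary least-squares line through the log of the histogram. The histogram below is invented so that it decays by exactly a factor of ten every 10 score units, which reproduces the slide's example of a score of 60 mapping to 0.1; the function names are hypothetical, and this is not X! Tandem's actual fitting code.

```python
from math import log10


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def expected_random_matches(histogram, best_score):
    """Fit log10(count) vs. score on the low-scoring hits, then extrapolate."""
    scores = sorted(histogram)
    logs = [log10(histogram[s]) for s in scores]  # counts must be positive
    slope, intercept = fit_line(scores, logs)
    return 10 ** (slope * best_score + intercept)


# Made-up histogram of (assumed incorrect) lower-scoring hits: score -> # of matches.
histogram = {10: 10000, 20: 1000, 30: 100, 40: 10}
e_value = expected_random_matches(histogram, 60)
print(e_value)  # close to 0.1
```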
X! Tandem and Mascot
E-Value = Likelihood that match is incorrect relative to N guesses: Empirical (X! Tandem)
P-Value = Likelihood that match is incorrect (E ~ P·N): Theoretical (Mascot)
Now, X! Tandem calculates this E-Value empirically.
Another search engine, Mascot, tries to get at the same kind of number using theoretical calculations, most likely based on the number of identified peaks and the likelihood of finding certain amino acids in the genome database.
They’ve never explicitly published their algorithm, so we’ll never really know, but I suspect it’s something smart.
I just want to bring up a point that we’ll touch on a little later…
X! Tandem and Mascot
E-Value = Likelihood that match is incorrect relative to N guesses: Empirical (X! Tandem)
P-Value = Likelihood that match is incorrect (E ~ P·N): Theoretical (Mascot)
Probability = Likelihood that match is correct
Note (Probability ≠ 1-P)!
…the E-Value that X! Tandem calculates and the P-Value that Mascot calculates are probabilistically based, but they can only estimate the likelihood that the match is wrong.
This is realistically not nearly as useful as knowing the probability that a peptide identification is right, which is NOT 1 minus the P-Value.
            Accuracy Score    Relative Score
SEQUEST     XCorr             DeltaCn
X! Tandem   HyperScore        E-Value
Now, let’s go back and fill in the X! Tandem part of our accuracy/relativity scoring grid.
To reiterate, the XCorr is an excellent measure of accuracy…
…whereas the E-Value is an excellent measure of how good the best score is relative to the rest.
If we assume that accuracy and relativity scores are independent measures of goodness, could we use both SEQUEST’s XCorr and X! Tandem’s E-Value together?
10 Protein Control Sample
[Figure: scatter plot of X! Tandem: -log(E-Value) vs. SEQUEST: Discriminant Score]
And the answer is a resounding yes.
Each point on this graph is a spectrum, where correct identifications are marked in red, while incorrect identifications are marked in blue.
We know what’s correct and incorrect because this is a control sample.
Although in general the spectra SEQUEST scores well are spectra X! Tandem also scores well, there is considerable scatter between the search engines.
10 Protein Control Sample
[Figure: scatter plot of X! Tandem: -log(E-Value) vs. Mascot: Ion-Identity Score]
One might wonder, since X! Tandem and Mascot use similar scoring approaches, whether they would benefit as much, but the answer is surprisingly still yes!
Now, why are the scores so different?
Why So Different?
• SEQUEST
– Considers relative intensities
• X! Tandem
– Considers semi-tryptic peptides
– Considers only B/Y-type Ions
• Mascot
– Considers theoretical P-Value relative to search space
Well, here are a couple of possible reasons.
SEQUEST is the only method to consider relative intensities.
X! Tandem is the only method to consider peptides outside the standard search space by default,
such as semi-tryptic peptides.
However, it’s the only score that considers only B and Y ions,
as opposed to a complete model.
And Mascot is the only search engine to compute a completely theoretical P-Value
Consider Multiple Algorithms?
[Figure: scatter plot of X! Tandem: -log(E-Value) vs. Mascot: Ion-Identity Score]
So we clearly want to consider multiple search engines
simultaneously,
but how?
How To Compare Search Engines?
– SEQUEST: XCorr>2.5, DeltaCn>0.1
– Mascot: Ion Score-Identity Score>0
– X! Tandem: E-Value<0.01
You can’t use a thresholding system
because it’s impossible to find corresponding
thresholds.
For example, a SEQUEST match with an XCorr of 2.5
doesn’t mean the same thing
as an X! Tandem match with an E-Value of 0.01.
How To Compare Search Engines?
– SEQUEST: XCorr>2.5, DeltaCn>0.1
– Mascot: Ion Score-Identity Score>0
– X! Tandem: E-Value<0.01
Need to convert scores to probabilities!
The simplest way would be to convert the scores into probabilities and compare those.
We advocate for Andrew Keller and Alexey Nesvizhskii’s Peptide Prophet approach because it actually calculates a true probability, not just a p-value.
10 Protein Control Sample (Q-ToF)
X! Tandem approach
[Figure: # of Matches vs. Mascot: Ion-Identity Score, showing the “Other Incorrect IDs for Spectrum” and the “Possibly Correct?” match]
So if you remember, X! Tandem considers the best peptide match for a spectrum against a distribution of incorrect matches.
10 Protein Control Sample (Q-ToF)
Peptide Prophet approach
[Figure: # of Matches vs. Mascot: Ion-Identity Score, showing “ALL Other ‘Best’ Matches” and the “Possibly Correct?” match]
Keller, A. et al., Anal. Chem. 74 (2002) 5383-5392
Well, Peptide Prophet looks across the entire sample,
and not at just one spectrum at a time.
It compares the best match against all of
the other best matches in the
sample, which is clearly bimodal.
The low mode represents matches that are most likely wrong while the high mode represents matches that are probably right.
10 Protein Control Sample (Q-ToF)
Peptide Prophet approach
[Figure: # of Matches vs. Mascot: Ion-Identity Score, with “Correct” and “Incorrect” distributions and the “Possibly Correct?” match]
Peptide Prophet curve fits two distributions to
the modes,
following the assumption that the low scoring
distribution is “Incorrect”
and that the higher scoring distribution is “correct”.
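10 Protein Control Sample (Q-ToF)
$$p(+ \mid D) = \frac{p(D \mid +)\,p(+)}{p(D \mid +)\,p(+) + p(D \mid -)\,p(-)}$$
Here + means the match is correct, - means it is incorrect, and D is the score.
[Figure: # of Matches vs. Mascot: Ion-Identity Score, with “Correct” and “Incorrect” distributions and the “Possibly Correct?” match]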
These two distributions can be analyzed using Bayesian statistics with
this formula.
Now that formula looks pretty complex,
but…
It just calculates the height of the correct distribution at a particular score, divided by the height of both distributions.
10 Protein Control Sample (Q-ToF)
$$p(+ \mid D) = \frac{p(D \mid +)\,p(+)}{p(D \mid +)\,p(+) + p(D \mid -)\,p(-)}$$
Numerator: prob of having score and being correct
Denominator: prob of having score
[Figure: “Correct” and “Incorrect” distributions vs. Mascot: Ion-Identity Score]
This is essentially the probability of having that score and being correct
divided by the probability of just having that score
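Here is a small sketch of that Bayesian calculation. The talk does not say what shape the two fitted distributions have, so this example simply assumes two normal curves with made-up means and widths; only the structure of the formula is taken from the slide.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mean, sd):
    """Height of a normal curve at x."""
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2 * pi))


def probability_correct(score, prior_correct, correct=(4.0, 1.0), incorrect=(0.0, 1.0)):
    """p(+|D): height of the correct curve over the summed height of both curves.

    `correct` and `incorrect` are hypothetical (mean, sd) pairs; `prior_correct`
    is the fraction of best matches in the sample that are right, p(+).
    """
    joint_correct = normal_pdf(score, *correct) * prior_correct
    joint_incorrect = normal_pdf(score, *incorrect) * (1.0 - prior_correct)
    return joint_correct / (joint_correct + joint_incorrect)


# The same score means different things in samples with many or few correct IDs.
print(probability_correct(2.0, prior_correct=0.5))  # 0.5
print(probability_correct(2.0, prior_correct=0.1))  # 0.1
```

The second call shows why the height of the correct distribution matters: with the incorrect curve unchanged, a sample with fewer correct identifications gives the same score a much lower probability of being right.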
[Figure: # of Matches vs. Mascot: Ion-Identity Score, with “Correct” and “Incorrect” distributions and the “Possibly Correct?” match]
This is a neat method because it actually considers the likelihood of being correct, unlike X! Tandem and Mascot, which only calculate the probability of being incorrect.
It’s because of this that Peptide Prophet can produce a true probability, which is important when the sample characteristics change.
[Figure: Q-ToF: # of Matches vs. Mascot: Ion-Identity Score, with “Correct” and “Incorrect” distributions and the “Possibly Correct?” match]
For example, the control sample we’ve been looking at was derived from Q-ToF data, which produces pretty high quality results.
[Figure: two panels, Q-ToF and Ion Trap, each showing # of Matches vs. Mascot: Ion-Identity Score with “Correct” and “Incorrect” distributions and the “Possibly Correct?” match]
If you compare that to the same sample run on an Ion Trap, the probability of being correct is greatly diminished.
If you’ll note, the Incorrect distribution doesn’t change very much between the two analyses; however, the likelihood that the identification is right changes dramatically!
[Figure: Ion Trap: # of Matches vs. Mascot: Ion-Identity Score, with “Correct” and “Incorrect” distributions and the “Possibly Correct?” match]
As Peptide Prophet considers the correct distribution, it is immune to fluctuations between samples.
P-Values and E-Values don’t consider this information, so they can’t be compared across multiple samples, or different examinations of the same sample, which is why we need to use Peptide Prophet for comparing two different search engines.
Consider Multiple Algorithms?
[Figure: scatter plot of X! Tandem: -log(E-Value) vs. Mascot: Ion-Identity Score]
So going back to the scatter plot between X! Tandem and Mascot,
we can use Peptide Prophet to compute the score
threshold that represents a 95% cut-off…
Consider Multiple Algorithms?
[Figure: scatter plot of X! Tandem: -log(E-Value) vs. Mascot: Ion-Identity Score, with 95% thresholds drawn at X! Tandem: 2.6 and Mascot: -2.5]
Like so.
This allows you to fairly consider the answers from both search engines simultaneously.
The important thing to note is that if you looked at a different sample, these thresholds should change depending on the height of the correct distributions.
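One way to picture how such a cut-off could be derived is to search for the score at which the probability of being correct reaches 95%. The sketch below uses bisection and assumes two normal distributions of equal width, so the probability rises steadily with score; the distribution parameters and function names are invented, and this is not Peptide Prophet's actual procedure.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mean, sd):
    """Height of a normal curve at x."""
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2 * pi))


def probability_correct(score, prior_correct, correct, incorrect):
    """p(+|D) from two fitted curves, each given as a hypothetical (mean, sd) pair."""
    joint_correct = normal_pdf(score, *correct) * prior_correct
    joint_incorrect = normal_pdf(score, *incorrect) * (1.0 - prior_correct)
    return joint_correct / (joint_correct + joint_incorrect)


def score_threshold(target, prior_correct, correct, incorrect, lo=-10.0, hi=20.0):
    """Bisect for the score where p(+|D) reaches `target`.

    Assumes the probability increases with score between `lo` and `hi`,
    which holds when both curves have the same width.
    """
    for _ in range(100):
        mid = (lo + hi) / 2
        if probability_correct(mid, prior_correct, correct, incorrect) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


curves = {"correct": (4.0, 1.0), "incorrect": (0.0, 1.0)}
rich_sample = score_threshold(0.95, 0.5, **curves)  # half the best matches are right
poor_sample = score_threshold(0.95, 0.2, **curves)  # only a fifth are right
print(round(rich_sample, 3), round(poor_sample, 3))
```

With the incorrect curve held fixed, the sample with fewer correct identifications needs a higher score to reach the same 95% level, so the threshold moves with the sample.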
Conclusion
• All search engines use different criteria, producing different scores
• Using multiple search engines simultaneously yields better results
• Peptide Prophet can normalize search engine results
So in conclusion, all of the search engines look at different criteria.
And we can leverage this to identify more peptides.
And Peptide Prophet is a great mechanism for doing that because it calculates true probabilities, instead of p-values.
The End