MasSPIKE (Mass SPectrum Interpretation and Kernel Extraction) for Biological Samples

Parminder Kaur, Konstantin Aizikov, Bogdan Budnik and Peter B. O’Connor

Department of Electrical and Computer Engineering, Boston University

Cardiovascular Proteomics Center, Boston University School of Medicine

Mass Spectrometry Resource, Department of Biochemistry, Boston University School of Medicine

Department of Bioinformatics, Boston University


Introduction

Goal - Reducing complex mass spectra into monoisotopic mass lists

Noise Baseline Modelling

Isotopic Distribution (ID) Identification

Charge State Determination

Picking Experimental Isotopic Peaks

Alignment of a Theoretical Isotopic Distribution (TID) with the Experimental Isotopic Distribution (EID)

Generating the Monoisotopic Mass List

Matching Observed Masses against Theoretical Fragment Masses from Given Sequence
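The last two steps of the outline can be sketched as follows. This is only an illustration under stated assumptions: ions are taken to be protonated positive ions, [M + zH]^z+, and the function names, the 10 ppm tolerance, and the example values are hypothetical rather than taken from the slides.

```python
PROTON_MASS = 1.007276  # Da, mass of the assumed charge carrier (a proton)


def neutral_mass(mz, z):
    """Neutral monoisotopic mass of an [M + zH]^z+ ion from its monoisotopic m/z."""
    return z * (mz - PROTON_MASS)


def match_masses(observed, theoretical, tol_ppm=10.0):
    """Pair each observed mass with the closest theoretical mass within tol_ppm."""
    matches = []
    for obs in observed:
        closest = min(theoretical, key=lambda th: abs(obs - th))
        error_ppm = (obs - closest) / closest * 1e6
        if abs(error_ppm) <= tol_ppm:
            matches.append((obs, closest, error_ppm))
    return matches


# Hypothetical example values, not data from the talk.
observed = [neutral_mass(501.007276, 2), neutral_mass(751.012276, 2)]
theoretical = [1000.0, 1500.0]
for obs, th, ppm in match_masses(observed, theoretical):
    print(f"observed {obs:.4f} Da matches theoretical {th:.4f} Da ({ppm:+.2f} ppm)")
```

The second observed mass is about 6.7 ppm above its theoretical value, so it still falls inside the assumed 10 ppm tolerance.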


Noise Baseline Modelling

Baseline of a top-down spectrum of bovine carbonic anhydrase (blue), noise mean vs m/z (white)

Model based on the mean of the signal across m/z range
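A minimal sketch of a mean-based baseline, assuming the spectrum is a list of intensities ordered by m/z. The window size and the function name are illustrative assumptions; the slide only states that the model uses the mean of the signal across the m/z range.

```python
def mean_baseline(intensities, window=5):
    """Estimate the baseline as the mean intensity of consecutive windows of points."""
    baseline = []
    for start in range(0, len(intensities), window):
        chunk = intensities[start:start + window]
        level = sum(chunk) / len(chunk)
        # Every point in the window gets the same baseline level.
        baseline.extend([level] * len(chunk))
    return baseline


# Hypothetical intensities: a low noisy region followed by a higher one.
spectrum = [1.0, 3.0, 2.0, 2.0, 4.0, 6.0]
print(mean_baseline(spectrum, window=2))
```

A plain mean is pulled upward by strong peaks, so a practical version would likely exclude or down-weight peak regions before averaging.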


Isotopic Distribution (ID) Identification

(a) Top-down spectrum of BCA, with red and green lines indicating the start and end of IDs (b) Zoomed-in view

Isotopic distribution identification uses S/N = 3 as the default trigger threshold
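The thresholding step can be sketched as below, assuming a per-point noise estimate is already available. The `max_gap` parameter, which merges runs separated by a few sub-threshold points (for example the valleys between isotopic peaks), is a hypothetical addition and not something the slides describe.

```python
def find_distributions(intensities, noise, snr_threshold=3.0, max_gap=0):
    """Return (start, end) index pairs of regions where intensity / noise >= threshold."""
    regions = []
    for i, (y, n) in enumerate(zip(intensities, noise)):
        if n <= 0 or y / n < snr_threshold:
            continue
        if regions and i - regions[-1][1] <= max_gap + 1:
            regions[-1][1] = i       # extend the current region
        else:
            regions.append([i, i])   # start a new region
    return [tuple(r) for r in regions]


# Hypothetical data with a flat noise level of 1.
intensities = [1, 1, 5, 1, 6, 1, 1, 1, 9, 1]
noise = [1.0] * len(intensities)
print(find_distributions(intensities, noise))             # isolated points
print(find_distributions(intensities, noise, max_gap=1))  # nearby points merged
```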


Charge State Determination

Isotopic distributions obtained from the previous step are passed as input for z determination

Two new methods: a Maximum Likelihood (ML) method using the Fourier Transform (FT) of the EID, and a Matched Filter approach


ML method using FT of EID

An EID is composed of complex exponentials with fundamental frequency corresponding to the charge state and its harmonics

Peak locations are used to identify the charge state
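The frequency picture can be illustrated with a simplified sketch. It is not the authors' ML estimator: it only evaluates the Fourier sum of a synthetic EID at the fundamental frequency of each candidate charge and keeps the strongest one. The isotope spacing of about 1.00335 Da (the 13C - 12C mass difference), the envelope heights, and all function names are assumptions made for the example.

```python
import cmath
import math

ISOTOPE_SPACING = 1.00335  # Da, approximate 13C - 12C mass difference


def synth_eid(z, mz_start, heights, sigma, step, n_points):
    """Hypothetical EID: Gaussian peaks spaced ISOTOPE_SPACING / z apart."""
    grid = [mz_start - 0.5 + i * step for i in range(n_points)]
    signal = []
    for x in grid:
        y = 0.0
        for k, h in enumerate(heights):
            center = mz_start + k * ISOTOPE_SPACING / z
            y += h * math.exp(-0.5 * ((x - center) / sigma) ** 2)
        signal.append(y)
    return grid, signal


def fourier_magnitude(grid, signal, freq):
    """Magnitude of the Fourier sum of the signal at freq cycles per m/z unit."""
    total = sum(y * cmath.exp(-2j * math.pi * freq * x) for x, y in zip(grid, signal))
    return abs(total)


def charge_from_fourier(grid, signal, z_min=1, z_max=10):
    """Pick the charge whose fundamental frequency, z / ISOTOPE_SPACING, is strongest."""
    return max(range(z_min, z_max + 1),
               key=lambda z: fourier_magnitude(grid, signal, z / ISOTOPE_SPACING))


grid, eid = synth_eid(z=4, mz_start=1000.0,
                      heights=[0.2, 0.6, 1.0, 0.9, 0.6, 0.3, 0.1],
                      sigma=0.02, step=0.005, n_points=600)
print("estimated charge:", charge_from_fourier(grid, eid))
```

Harmonics of the true charge (here 8 for a true z of 4) also carry energy, but the finite peak width attenuates them relative to the fundamental, which is why the fundamental wins in this noise-free example.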


Matched Filter (MF) Approach

Parameters for generating the TID (peak width, inter-point spacing, MAX Z, MIN Z) are based upon the data

The TID (represented by T(Z) for charge state Z) that gives the maximum value of the cross-correlation coefficient with the EID (E) generally represents the true charge state

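$$R_Z(k) = \sum_{n} E(n)\, T(Z; n + k) \qquad (1)$$

$$k_Z = \arg\max_{k} R_Z(k) \qquad (2)$$

$$r(Z) = \frac{\sum_{n} \left(E(n) - \bar{E}\right)\left(T(Z; n + k_Z) - \bar{T}(Z)\right)}{\sqrt{\sum_{n} \left(E(n) - \bar{E}\right)^{2} \, \sum_{n} \left(T(Z; n + k_Z) - \bar{T}(Z)\right)^{2}}} \qquad (3)$$

$$Z_{\mathrm{est}} = \arg\max_{Z} r(Z) \qquad (4)$$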
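A minimal sketch of the matched filter idea described above: build a TID for each candidate charge, slide it across the EID, and keep the charge and shift with the largest cross-correlation coefficient. The fixed envelope heights, Gaussian peak shape, grid step, and function names are assumptions for illustration; a real TID would be generated from the expected isotopic distribution at the relevant mass.

```python
import math

ISOTOPE_SPACING = 1.00335  # Da, approximate 13C - 12C mass difference
HEIGHTS = [0.2, 0.6, 1.0, 0.9, 0.6, 0.3, 0.1]  # hypothetical isotopic envelope


def isotopic_pattern(z, first_peak, n_points, step=0.01, sigma=0.03):
    """Gaussian peaks spaced ISOTOPE_SPACING / z apart, sampled on a regular grid."""
    pattern = []
    for i in range(n_points):
        x = i * step
        y = 0.0
        for k, h in enumerate(HEIGHTS):
            center = first_peak + k * ISOTOPE_SPACING / z
            y += h * math.exp(-0.5 * ((x - center) / sigma) ** 2)
        pattern.append(y)
    return pattern


def pearson(a, b):
    """Cross-correlation (Pearson) coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def shift_right(values, k):
    """Shift a list right by k points (left if k < 0), filling with zeros."""
    n = len(values)
    return [values[i - k] if 0 <= i - k < n else 0.0 for i in range(n)]


def matched_filter_charge(eid, z_min=1, z_max=6, max_shift=50):
    """Return (r, z, shift) for the TID that correlates best with the EID."""
    best = (0.0, None, None)
    for z in range(z_min, z_max + 1):
        tid = isotopic_pattern(z, first_peak=0.3, n_points=len(eid))
        for k in range(-max_shift, max_shift + 1):
            r = pearson(eid, shift_right(tid, k))
            if r > best[0]:
                best = (r, z, k)
    return best


# Synthetic, noise-free EID with z = 4, offset from the template by 20 grid points.
eid = isotopic_pattern(4, first_peak=0.5, n_points=400)
r, z_est, shift = matched_filter_charge(eid)
print(f"z = {z_est}, shift = {shift} points, r = {r:.3f}")
```

The exhaustive loop over shifts keeps the example short; an FFT-based cross-correlation would be the usual choice for long spectra.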


Typical MF Match

(a) Raw Spectrum

(b) Z=3

(c) Output List

(a) EID of a fragment of BCA (b) TID with Z=3 (red) and EID (blue); the TID shift corresponds to the maximum value of the cross-correlation coefficient (0.954) between the two (c) Snapshot of the output listing corresponding to the above fragment


Automated Comparison of Charge State Determination Methods

Comparison of the different methods on 775 isotopic distributions of myoglobin taken from 26 spectra, with charge states ranging from 8 to 22 and S/N from 1 to 100


Advantages of MF over ML

Results are better: 91% (MF) vs 88% (ML)

Allows for pulling out EIDs from the observed signal even when the signal contains multiple distributions

Works better in case of overlapping distributions

Since the ML method uses the FT map, it works better for higher z than for lower z, while MF works equally well in both cases


Picking Isotopic Peaks and ML alignment

(a) Picking isotopic peaks of the EID of myoglobin, Z=16 (b) TID of myoglobin. Alignment with (c) TID shifted by 5 (d) TID shifted by 6 (e) TID shifted by 7 (f) Probability of alignment as a function of TID indices


Testing ML Alignment with Low Ion Numbers

(a) Alignment of myoglobin IDs using 3150 simulations (100 ions in each simulation) (b) A typical 100 ion distribution of myoglobin

[Equations (5)-(7) are illegible in the source. Legible fragments: equation (6) defines "index" through an arg operation, and equation (7) involves a quantity equal to the length of E.]
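One way to obtain a probability of alignment as a function of TID index is sketched below. It assumes a multinomial model in which the ion counts in the picked EID peaks are drawn from the TID window they are aligned with; this model, the uniform prior over shifts, and every name and number in the code are assumptions for illustration, not the authors' formulation.

```python
import math
import random


def alignment_probabilities(eid_counts, tid_probs):
    """Probability of each TID start index under a multinomial model (uniform prior)."""
    n_e = len(eid_counts)
    log_likes = []
    for i in range(len(tid_probs) - n_e + 1):
        window = tid_probs[i:i + n_e]
        total = sum(window)
        log_likes.append(sum(c * math.log(p / total) for c, p in zip(eid_counts, window)))
    peak = max(log_likes)
    weights = [math.exp(ll - peak) for ll in log_likes]  # subtract peak for stability
    norm = sum(weights)
    return [w / norm for w in weights]


def best_index(eid_counts, tid_probs):
    probs = alignment_probabilities(eid_counts, tid_probs)
    return max(range(len(probs)), key=probs.__getitem__)


def simulate_counts(tid_probs, start, n_e, ions, rng):
    """Draw `ions` ions from the TID window that begins at `start`."""
    window = tid_probs[start:start + n_e]
    draws = rng.choices(range(n_e), weights=window, k=ions)
    return [draws.count(k) for k in range(n_e)]


# Hypothetical TID (relative abundances) and a 100-ion EID covering five peaks.
tid = [0.01, 0.04, 0.10, 0.17, 0.21, 0.19, 0.14, 0.08, 0.04, 0.02]
eid = [11, 23, 24, 25, 17]
print("aligned at TID index", best_index(eid, tid))

# Repeat with simulated 100-ion distributions whose true start index is 2.
rng = random.Random(0)
trials = 200
hits = sum(best_index(simulate_counts(tid, 2, 5, 100, rng), tid) == 2 for _ in range(trials))
print(f"correct alignments: {hits} of {trials}")
```

Normalising each window before taking logarithms compares the shape of the EID with the shape of the aligned part of the TID, so the result does not depend on how much of the total abundance the window covers.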


Separating Overlapping Distributions

(a) Raw Spectrum

(b) Z=3, r=0.74

(c) Z=4, r=0.64

(d) Residual Signal


Low Charge State Overlapping Distributions from Top-Down Spectrum of BCA

(a) Raw Spectrum

(b) Z=4

(c) Residual


(d) Z=1

(e) Z=3

(f) Residual


Analysis of top-down spectrum of Ubch10 - Mixed Z Cases

(a) Input Signal

(b) Z=14, r=0.76

(c) Residual


(d) Z=14, r=0.74

(e) Residual

(f) Z=1, r=0.5


(g) Z=2, r=0.51

(h) Z=14, r=0.57

(i) Residual


(j) Z=1, r=0.54

(k) Z=2, r=0.5

(l) Z=14, r=0.58


(m) Residual

(n) Z=1, r=0.55

(o) Z=14, r=0.55


(p) Final Residual

Applying MasSPIKE to a particularly noisy region of a top-down mass spectrum of a biologically derived protein, Ubch10 (a) Input signal (b) z=14 detected (c) Residual after subtraction of (b) from (a) (d) z=14 detected in region m/z=1056-1057 (e) Residual signal (f), (g) & (h) z=1, 2 and 14 detected simultaneously (z=1 and 2 are probably false positives due to chemical noise) (i) Residual signal after subtraction of signal due to already determined charge states (j), (k) & (l) z=1, 2 and 14 detected simultaneously again, sharing three common peaks (m) Remaining signal (n) & (o) z=1 and 14 being detected (p) Final residual. Overall, 10 isotopic distributions were recovered in an 8 m/z window
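The detect, subtract, and repeat procedure in the caption can be sketched as a greedy loop. This is a simplified illustration, not the MasSPIKE implementation: the templates are toy one-point-per-peak patterns, and the least-squares scaling, the clipping of negative residuals, and the `min_r` and `min_height` stopping rules are assumptions.

```python
import math


def pearson(a, b):
    """Cross-correlation (Pearson) coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def place(template, shift, n):
    """Embed a template in a zero list of length n, starting at index shift."""
    out = [0.0] * n
    for i, v in enumerate(template):
        out[shift + i] = v
    return out


def peel_distributions(signal, templates, min_r=0.5, min_height=0.05, max_iter=10):
    """Repeatedly find the best-correlated template, scale it, and subtract it."""
    residual = list(signal)
    found = []
    for _ in range(max_iter):
        if max(residual) < min_height:      # nothing left above the assumed noise floor
            break
        best = None
        for name, template in templates.items():
            for shift in range(len(residual) - len(template) + 1):
                placed = place(template, shift, len(residual))
                r = pearson(residual, placed)
                if best is None or r > best[0]:
                    best = (r, name, shift, placed)
        r, name, shift, placed = best
        if r < min_r:
            break
        # Least-squares amplitude of the template in the residual.
        scale = sum(s * t for s, t in zip(residual, placed)) / sum(t * t for t in placed)
        residual = [max(s - scale * t, 0.0) for s, t in zip(residual, placed)]
        found.append((name, shift, round(r, 2)))
    return found, residual


# Toy templates: same envelope, different peak spacing (4 points vs 3 points).
templates = {
    "A": [0.5, 0, 0, 0, 1.0, 0, 0, 0, 0.7, 0, 0, 0, 0.3],
    "B": [0.5, 0, 0, 1.0, 0, 0, 0.7, 0, 0, 0.3],
}
n = 30
signal = [a + 0.6 * b for a, b in zip(place(templates["A"], 5, n), place(templates["B"], 12, n))]
found, residual = peel_distributions(signal, templates)
print(found)
```

In this toy case the two patterns do not share peaks, so the subtraction is exact. When distributions share peaks, as in panels (j) to (l), the amplitude estimate is biased by the overlap and the residual is only approximate.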


Mass Spectrum of Hemoglobin from a Normal Individual


Hemoglobin Variants Analysis

Spectrum of Hemoglobin variants and comparison between theoretical and experimental masses


Conclusions

The Matched Filter method works best for charge state determination and helps in resolving overlapping distributions

Maximum likelihood based alignment improves the accuracy of monoisotopic masses

MasSPIKE has been tested on complex spectra from biologically derived proteins

Once fully implemented in BUDA[5], MasSPIKE will substantially simplify interpretation of mass spectra


Acknowledgments

Prof W. Clem Karl, Dr Amit Juneja, Dr Hua Huang, Dr Judith Jebanathirajah, Jason J. Cournoyer, Dr Cheng Zhao, Raman Mathur, Dr Cheng Lin, Dr Roger Theberge, Vera Ivleva, Dr Mark McComb, Dr Jason Pittman, Prof Catherine E. Costello, Prof Richard Cohen, Dr David Perlman

This work was supported in part by Federal funds from the National Center for Research Resources under grant No. P41-RR10888 and the National Heart, Lung, and Blood Institute under Contract No. HHSN268200248178C.


References

[1] M W Senko; S C Beu; F W McLafferty, “Automated Assignment of Charge States from Resolved Isotopic Peaks for Multiply Charged Ions”, J. Am. Soc. Mass Spectrom.; 1995; 6; 52-56

[2] A L Rockwood, “Ultrahigh-Speed Calculation of Isotope Distributions”, Anal Chem; 1996; 68; 2027-2030

[3] D M Horn; R A Zubarev; F W McLafferty, “Automated Reduction and Interpretation of High Resolution Electrospray Mass Spectra of Large Molecules”, J. Am. Soc. Mass Spectrom.; 2000; 11; 320-332

[4] P Kaur; P B O’Connor, “Use of Statistical Methods for Estimation of Total Number of Charges in a Mass Spectrometry Experiment”, Anal Chem; 2004; 76; 2756-2762

[5] P B O’Connor, “BUDA - Boston University Data Analysis”, www.bumc.bu.edu/ftms