Proteome Bioinformatics - Vanderbilt...

Post on 21-Sep-2020

Proteome Bioinformatics

Applied Bioinformatics lecture 6

David L. Tabb

Overview

• Identifying proteins through “shotgun” proteomics: bench and bioinformatics

• Understanding Peptide-Spectrum Matches

• Trawling ProteomeXchange via PeptideAtlas

Mass spectrometers

• IONIZATION: Produce ions from biological materials.

• MASS ANALYSIS: Separate or select ions by mass-to-charge (m/z) ratio.

• DETECTION: Report intensity of ions in mass spectrum.

Discovery Proteomics

• Peptide Mixture

• Liquid Chromatography

• Electrospray Ionization

• High-Resolution Mass Spectrometry

• Isolate Ions of Peptide

• Collide Ions to Dissociate

• Collect Fragments in Tandem MS

• Tandem Mass Spectra

• Peptide Identifications

• Confident Peptide List

• Assembled Protein List

Disassembly and reassembly

• Mixture of Proteins

• Mixture of Peptides

• Collection of tandem mass spectra

• Collection of raw peptide identifications (e.g. LSELIGAR, z=2, XCorr=3.5)

• Confidently identified peptide sequences: ... LSEGTSFR, LSELIGAR, LSENLRK, LSEPVHK ...

• Confidently identified proteins: ... YGR192C, YGR204W, YGR208W, YGR209C ...

After AI Nesvizhskii, Mol Cell Proteomics (2005) 4: 1419-40.
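The disassembly and reassembly idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two protein sequences and their names (PROT_A, PROT_B) are made up for the example and are not the yeast proteins named on the slide, and the digestion rule (cleave after K or R unless the next residue is P) is the usual simplification of trypsin specificity, with no missed cleavages.

```python
def tryptic_digest(sequence):
    """Disassembly: cut after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide without K/R
    return peptides


def assemble_proteins(identified_peptides, proteins):
    """Reassembly: list the identified peptides that each protein can explain."""
    evidence = {}
    for name, sequence in proteins.items():
        matched = set(tryptic_digest(sequence)) & set(identified_peptides)
        if matched:
            evidence[name] = sorted(matched)
    return evidence


# Hypothetical sequences, invented for this sketch only.
PROTEINS = {
    "PROT_A": "MKLSELIGARLSEGTSFRAAK",
    "PROT_B": "MRLSEPVHKLSELIGARGGK",
}
IDENTIFIED = ["LSEGTSFR", "LSELIGAR", "LSEPVHK"]

for protein, peptides in assemble_proteins(IDENTIFIED, PROTEINS).items():
    print(protein, peptides)
```

Note that LSELIGAR maps to both hypothetical proteins; shared peptides of this kind are what make protein assembly harder than a simple lookup.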

Database search algorithms

First published in 1994, these tools identify MS/MS scans by comparing them to predictions from database peptide sequences. Prominent examples include:

Sequest: Eng (1994) J. Amer. Soc. Mass Spectrom. 5: 976-989.

Mascot: Perkins (1999) Electrophoresis 20: 3551-3567.

X!Tandem: Craig (2003) Rapid Comm. Mass Spectrom. 17: 2310-2316.
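The comparison between a scan and database predictions can be illustrated with a toy scorer. The sketch below is not how Sequest, Mascot, or X!Tandem score matches; it swaps in a simple shared peak count, and the "observed" spectrum is synthetic (predicted LSELIGAR fragments with two peaks dropped). Real tools also restrict candidates to peptides whose mass matches the precursor, which this example skips. All function names are hypothetical.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.010565, 1.007276


def predicted_fragments(peptide):
    """Singly charged b ions (b1..) followed by y ions (y1..) for a peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b = [sum(masses[:i]) + PROTON for i in range(1, len(peptide))]
    y = [sum(masses[-i:]) + WATER + PROTON for i in range(1, len(peptide))]
    return b + y


def shared_peak_count(observed_mz, peptide, tolerance=0.02):
    """Count predicted fragments that have an observed peak within tolerance."""
    return sum(
        any(abs(obs - pred) <= tolerance for obs in observed_mz)
        for pred in predicted_fragments(peptide)
    )


def rank_candidates(observed_mz, candidates):
    """Return (score, peptide) pairs, best-scoring candidate first."""
    scored = [(shared_peak_count(observed_mz, p), p) for p in candidates]
    return sorted(scored, reverse=True)


# Candidate peptides taken from the slide; the spectrum is synthetic.
CANDIDATES = ["LSEGTSFR", "LSELIGAR", "LSENLRK", "LSEPVHK"]
OBSERVED = predicted_fragments("LSELIGAR")[1:-1]  # drop b1 and y7

for score, peptide in rank_candidates(OBSERVED, CANDIDATES):
    print(peptide, score)
```

The other candidates still pick up a few matches because they share the LSE prefix, which is why search engines rely on more discriminating scores than a raw count.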

Fragment ions result from breakage of peptide bonds

TSIIGTIGPK

• b ion: N-terminal fragment

• y ion: C-terminal fragment

HFISELEK, +2 charge state

• Labeled b ion: HF-

• Labeled y ions: -LEK, -SELEK, -ISELEK, -FISELEK

• Neutral loss of water from peptide
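As a rough illustration of where the labeled peaks come from, the sketch below computes singly charged b and y ion m/z values for HFISELEK from standard monoisotopic residue masses. The function name and output layout are hypothetical, and only the simplest ion types are covered (no neutral losses, no multiply charged fragments).

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.010565, 1.007276


def fragment_ions(peptide):
    """Return two lists of (label, fragment, m/z) for singly charged b and y ions."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        # b ions keep the N-terminus: residues plus one proton.
        b_ions.append((f"b{i}", peptide[:i] + "-", sum(masses[:i]) + PROTON))
        # y ions keep the C-terminus: residues plus water plus one proton.
        y_ions.append((f"y{i}", "-" + peptide[-i:], sum(masses[-i:]) + WATER + PROTON))
    return b_ions, y_ions


b_ions, y_ions = fragment_ions("HFISELEK")
for label, fragment, mz in b_ions + y_ions:
    print(f"{label:>3} {fragment:<10} {mz:10.4f}")
```

In this notation the slide's HF- label corresponds to b2, and -LEK, -SELEK, -ISELEK, and -FISELEK correspond to y3, y5, y6, and y7.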

Proteomic Repositories

Web resources of proteomic data have become substantial in recent years.

• Raw data and identification archives

• Peer-to-peer file storage

• Data analysis tools and databases

PeptideAtlas: protein coverage

At http://www.peptideatlas.org, search for NP_862897 in the mouse build.

Peptide-spectrum match view

• Predicted and observed fragment ions

• y ion contains peptide C-terminus

• b ion contains peptide N-terminus

• Distance between b8 and b9 is mass of ninth amino acid

Tabb (2006) Nat. Protocols 1: 2213-2222.
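The rule that the distance between b8 and b9 equals the mass of the ninth amino acid can be checked numerically. The sketch below is a minimal illustration, assuming singly charged b ions and standard monoisotopic residue masses; it uses TSIIGTIGPK from the earlier slide as the example peptide, and the function names are hypothetical. In practice the b ion ladder would be read from an observed spectrum rather than computed from a known sequence.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276


def b_ladder(peptide):
    """Singly charged b ion m/z values b1..b(n-1); index 0 holds b1."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    return [sum(masses[:i]) + PROTON for i in range(1, len(peptide))]


def residues_for_gap(gap, tolerance=0.01):
    """Residues whose mass matches the gap between two adjacent b ions."""
    return sorted(aa for aa, m in RESIDUE_MASS.items() if abs(m - gap) <= tolerance)


ladder = b_ladder("TSIIGTIGPK")
b8, b9 = ladder[7], ladder[8]
print("b9 - b8 =", round(b9 - b8, 4), "->", residues_for_gap(b9 - b8))

# Walking the whole ladder reads residues 2 through 9 of the sequence.
for i in range(1, len(ladder)):
    gap = ladder[i] - ladder[i - 1]
    print(f"b{i + 1} - b{i}:", "/".join(residues_for_gap(gap)))
```

Leucine and isoleucine have identical masses, so a gap of about 113.084 is reported as I/L; the spacing alone cannot tell them apart.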

Summary

• Proteomics generates large data sets that require automated interpretation.

• Proteomic repositories are relatively new, and interpreting experiments requires expertise.

• Visual inspection of tandem mass spectra is largely built from rules-of-thumb.

Challenges

• What areas of mouse dihydropyrimidine dehydrogenase (NP_740748.1) have been observed by proteomics?

• I want to create a targeted measurement of mouse adenylyl cyclase-associated protein 1 (NP_031624.2). What charge state does KEPALLELEGK adopt?
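For the targeted-measurement challenge, it may help to know the precursor m/z that KEPALLELEGK would show at each candidate charge state. The sketch below only does that arithmetic, assuming an unmodified peptide and standard monoisotopic masses; it does not say which charge state the peptide actually adopts, which still has to be looked up in the observed data. The function names are hypothetical.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.010565, 1.007276


def neutral_mass(peptide):
    """Monoisotopic mass of the unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def precursor_mz(peptide, charge):
    """m/z of the peptide carrying `charge` protons."""
    return (neutral_mass(peptide) + charge * PROTON) / charge


for z in (1, 2, 3):
    print(f"KEPALLELEGK z={z}: m/z = {precursor_mz('KEPALLELEGK', z):.4f}")
```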