High Throughput and Large Scale Proteomics Analysis Austin Yang, Ph.D. Department of Pharmaceutical Sciences, University of Southern California

Page 1

High Throughput and Large Scale Proteomics Analysis

Austin Yang, Ph.D. Department of Pharmaceutical Sciences, University of Southern California

Page 2

Overview

1. Shotgun proteomics and ESI mass spectrometry

2. Proteomic data mining and data visualization

Page 3

12,000 proteins

Page 4

Page 5

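Are We Ready for Mammalian Proteomics?

Protein abundance scale (concentration, copies/cell), with protein classes listed in slide order:

Cytoskeletal proteins: mM, 1 x 10^9 copies/cell

Metabolism: 0.1 mM, 1 x 10^8

Ribosomes

10 µM, 1 x 10^7

Kinases

Cyclins

1 µM, 1 x 10^6

0.1 µM, 1 x 10^5

Transcription factors: 10 nM, 1 x 10^4

Synaptic markers: 0.1 nM, 1 x 10^3

Coverage labels on the figure: 2-D Gel; Shotgun Proteomics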
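The abundance scale on this slide pairs molar concentrations with copies per cell. As a rough check, the two are related by copies = concentration x Avogadro's number x cell volume. The sketch below assumes a cell volume of about 1.66 pL, a hypothetical value chosen because it makes 1 mM correspond to roughly 1 x 10^9 copies; the slide does not state a volume.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

# Assumed cell volume in litres (hypothetical; not given on the slide).
CELL_VOLUME_L = 1.66e-12


def copies_per_cell(conc_molar, volume_l=CELL_VOLUME_L):
    """Number of molecules in one cell at the given molar concentration."""
    return conc_molar * AVOGADRO * volume_l


def orders_of_magnitude(high, low):
    """Dynamic range between two abundances, in powers of ten."""
    return math.log10(high / low)


if __name__ == "__main__":
    for label, conc in [("1 mM", 1e-3), ("0.1 mM", 1e-4), ("10 nM", 1e-8)]:
        print(f"{label:>7}: {copies_per_cell(conc):.2e} copies/cell")
    # Range spanned by the slide's scale, 1 x 10^9 down to 1 x 10^3 copies/cell.
    print("dynamic range:", orders_of_magnitude(1e9, 1e3), "orders of magnitude")
```

Under this assumed volume, each tenfold drop in concentration is a tenfold drop in copy number, which is why the two scales on the slide run in parallel.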

Page 6

Advantages of Proteomics Using LC-MS/MS

• No pre-selection of biased targets (hypothesis-free, open approach)

• Protein variants are detected simultaneously

• Protein isolation and detection are on a small scale (~10 fmol from complex mixtures: subcellular fractions, whole cells, or tissue)

• Sequence information is obtained for peptides (not just masses), and ~4,000 proteins can be sequenced in a single experiment

Page 7

Liquid Chromatography Quadrupole Ion Trap Tandem Mass Spectrometer

Page 8

Electrospray vs Nanospray

Page 9

Splitless Nano-Liquid Chromatography

Page 10

Five Independent Loop Injections

Page 11

10-cycle MudPIT Analysis

SCX (NH4OAc)    RP #1    RP #2
100 mM          MS/MS    Wash
200 mM          Wash     MS/MS
300 mM          MS/MS    Wash
400 mM          Wash     MS/MS
500 mM          MS/MS    Wash
600 mM          Wash     MS/MS
700 mM          MS/MS    Wash
800 mM          Wash     MS/MS
900 mM          MS/MS    Wash
1000 mM         Wash     MS/MS
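The alternating pattern in the 10-cycle table can be written down as a small schedule generator. This is only a sketch of the layout shown on the slide: the function name, the 100 mM step size as a parameter, and the use of column numbers 1 and 2 for the two columns are illustrative assumptions.

```python
def mudpit_schedule(n_cycles=10, step_mm=100):
    """Return (salt step in mM, column being analyzed, column being washed).

    Odd-numbered salt steps are analyzed on column 1 while column 2 is
    washed; even-numbered steps swap the roles, as in the slide's table.
    """
    schedule = []
    for cycle in range(1, n_cycles + 1):
        analyzed = 1 if cycle % 2 == 1 else 2
        washed = 2 if analyzed == 1 else 1
        schedule.append((cycle * step_mm, analyzed, washed))
    return schedule


if __name__ == "__main__":
    for salt, analyzed, washed in mudpit_schedule():
        print(f"{salt:>5} mM  analyze on column {analyzed}, wash column {washed}")
```

Alternating the columns means one column is always being washed while the other is acquiring data, so wash time does not add to the total run time.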

Page 12

Multidimensional Protein Identification Technology (MudPIT)

[Figure: digested protein complexes are loaded onto an SCX column and eluted with 0-500 mM NH4OAc in salt steps of 100, 200, 300, 400, and 500 mM onto RP #1 and RP #2.]

1,000-2,000 Sequencing Attempts in 60 Minutes

20,000 MS/MS spectra/day

Page 13

Isotope-Coded Affinity Tags (ICAT)

Page 14

Electrospray Ionization (ESI)

[Figure labels: LC; spray tip; ions in solution; ions in gaseous phase; ion source opening for the MS]

Page 15

Page 16

Theoretical CID of a Tryptic Peptide

[Figure: parent ions of the tryptic peptide F-L-G-K (464.29), drawn carrying two charges (++), undergo CID. Fragmentation along the backbone gives the b1, b2, b3 and y1, y2, y3 daughter ions; some parent ions remain non-dissociated. The resulting MS/MS spectrum plots relative intensity against m/z.]
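The b and y ion ladder for FLGK can be computed directly from residue masses. The sketch below uses standard monoisotopic residue masses and assumes singly charged fragments; the function names are illustrative. With these masses, the singly protonated peptide comes out at about 464.29, which agrees with the parent ion value on the slide.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def mh_plus(peptide):
    """m/z of the singly protonated peptide, [M+H]+."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


def fragment_ions(peptide):
    """Singly charged b ions (N-terminal) and y ions (C-terminal)."""
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(MONO[aa] for aa in peptide[:i]) + PROTON)
        y_ions.append(sum(MONO[aa] for aa in peptide[-i:]) + WATER + PROTON)
    return b_ions, y_ions


if __name__ == "__main__":
    b_ions, y_ions = fragment_ions("FLGK")
    print("[M+H]+ :", round(mh_plus("FLGK"), 2))
    for n, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
        print(f"b{n} = {b:.2f}   y{n} = {y:.2f}")
```

Each b ion and its complementary y ion together account for the whole peptide, which is why reading either series gives the sequence.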

Page 17

SequestQueue (6,000 dta x 50 = 300,000 MS/MS scans)

Page 18

Data Mining through SEQUEST and PAULA

Database Search Time

• Yeast ORFs (6,351 entries): 52 sec, 0.104 sec/s

• Non-redundant protein (100k entries): 3500 min

• EST (100K entries, 3-frames): 5-10,000 min

Page 19

SEQUEST Algorithm

Step 1. Determine the parent ion molecular mass from the experimental MS/MS spectrum.

Step 2. The 500 peptides with masses closest to that of the parent ion are retrieved from a protein database. The computer generates a theoretical MS/MS spectrum for each peptide sequence (SEQ 1, 2, 3, 4, …).

Step 3. The experimental spectrum is compared with each theoretical spectrum, and correlation scores are assigned.

Step 4. Scores are ranked, and protein identifications are made based on these cross-correlation scores.

[Figure labels: experimental MS/MS spectrum; theoretical MS/MS spectra; ZSA charge assignment; Unified Scoring Function]
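The four steps can be sketched in a few lines. This is not the SEQUEST implementation: the score below is a simplified cross-correlation-style measure (matches at zero offset minus the mean over offsets of up to 75 bins either side), and the unit-width binning, the restriction to singly charged b and y ions, the function names, and the four-peptide toy database are all assumptions made for illustration.

```python
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def parent_mass(mz, charge):
    """Step 1: neutral parent mass from the observed m/z and charge."""
    return (mz - PROTON) * charge


def neutral_mass(peptide):
    return sum(MONO[aa] for aa in peptide) + WATER


def theoretical_bins(peptide):
    """Step 2: unit-width m/z bins of singly charged b and y ions."""
    bins = set()
    for i in range(1, len(peptide)):
        b = sum(MONO[aa] for aa in peptide[:i]) + PROTON
        y = sum(MONO[aa] for aa in peptide[i:]) + WATER + PROTON
        bins.update((int(round(b)), int(round(y))))
    return bins


def xcorr_like(exp_bins, theo_bins, max_lag=75):
    """Step 3: overlap at zero offset minus the mean overlap at other offsets."""
    def overlap(lag):
        return sum(exp_bins.get(b + lag, 0.0) for b in theo_bins)
    lags = [lag for lag in range(-max_lag, max_lag + 1) if lag != 0]
    background = sum(overlap(lag) for lag in lags) / len(lags)
    return overlap(0) - background


def search(peaks, mass, database, n_candidates=500):
    """Steps 2-4: pick the closest-mass candidates, score them, rank them."""
    exp_bins = {}
    for mz, intensity in peaks:
        key = int(round(mz))
        exp_bins[key] = max(exp_bins.get(key, 0.0), intensity)
    candidates = sorted(database, key=lambda p: abs(neutral_mass(p) - mass))
    scored = [(xcorr_like(exp_bins, theoretical_bins(p)), p)
              for p in candidates[:n_candidates]]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Made-up peak list and toy database, for illustration only.
    peaks = [(147.11, 1.0), (148.08, 1.0), (204.13, 1.0),
             (261.16, 1.0), (317.22, 1.0), (318.18, 1.0)]
    toy_db = ["FLGK", "LFGK", "KGLF", "GAIIGLM"]
    for score, peptide in search(peaks, parent_mass(464.29, 1), toy_db, 3):
        print(f"{peptide}: {score:.2f}")
```

Note that the three top candidates share the same mass, so only the fragment pattern separates them. The ranking also depends entirely on which sequences are in the database, since a peptide that is absent can never be returned.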

Page 20

One spectrum TWO protein identifications

Spectrum A was searched against the NCBI human database: macrophage inhibitory factor was identified.

The same spectrum was searched against the non-redundant database: bovine G-protein gamma was identified. Since the primary amino acid sequence of human G-protein gamma is almost identical to the bovine sequence, this protein was later identified as human G-protein gamma. The initial false ID was due to a missing entry for human G-protein gamma in the human database. The sequence was later re-entered into the human database, and the third search yielded the correct ID.

Fragment ions that match both sequences are indicated by *. Spectrum B has two additional ions matched to G-protein gamma.

Mol Cell Proteomics. 2003 Jul;2(7):428-42.

Page 21

Distribution of Xcorr from correctly and incorrectly identified peptides

Page 22

X-correlation vs Peptide length

Page 23

Distribution of Xcorr vs Charge State

Page 24

F-score and probability-based peptide assignment

Page 25

Page 26

Page 27

Identification of modified LRP in APP/PS1 Transgenic Mice

Page 28

Tg Peptide

A) 1. (Q9WV18) Gamma-aminobutyric acid type B receptor, subunit 1 precursor (GABA-B-R1)

2. (NP_032102.1) gamma-aminobutyric acid (GABA-A) receptor, subunit rho 2

3. (NP_034382.1) gamma-aminobutyric acid A receptor, gamma 1

4. (NP_033733.1) cholinergic receptor, nicotinic, epsilon polypeptide; acetylcholine receptor

5. (NP_150372.1) cholinergic receptor, muscarinic 3, cardiac; AChR M3

6. (S28058) serotonin receptor 5

7. (NP_031903.1) dopamine receptor 3; D3 receptor

8. (Q60934) Glutamate receptor, ionotropic kainate 1 precursor (Glutamate receptor 5)

9. (I49696) glutamate receptor chain B (version flip)

B) 1. (NP_038589.1) 5-hydroxytryptamine (serotonin) receptor 3A

2. (P30545) Alpha-2B adrenergic receptor (Alpha-2B adrenoceptor)

3. (NP_032195.1) glutamate receptor, ionotropic, NMDA1 (zeta 1)

4. (NP_032198.1) glutamate receptor, ionotropic, NMDA2D (epsilon 4); GluRepsilon4

5. (I49696) glutamate receptor chain B (version flip)

C) 1. (NP_034428.1) glycine receptor, beta subunit

2. (JC4262) glutamate transporter 2

Neurotransmitter Receptors

Page 29

Proteomic Data Visualization and Future Directions

• information overload

• data integration

• ease of visualization

Page 30

Network for NMDA and glutamate receptors

Page 31

Network for NMDA and glutamate receptors (Zoom-in)

Page 32

Scoring Algorithm for Spectral Analysis

Raw Unidentified Spectra (~10,000-100,000)

Identified Sequence

SEQUEST

SALSA

Page 33

SALSA Overview

• SALSA is a tool for identifying MS-MS spectra in Xcalibur analysis files that display specific user-defined characteristics. Because these characteristics correspond to structural features of a peptide, SALSA allows the user to selectively locate MS-MS spectra of specific peptides or their variant or modified forms.

[Figure labels: product ion; neutral loss; charged loss; mass difference; ion series (T W D G A)]
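The idea of locating spectra by user-defined characteristics can be illustrated with simple peak-list filters. This is a hypothetical sketch, not SALSA's scoring: the source does not describe how SALSA scores a match, so the functions, the 0.5 m/z tolerance, and the made-up peak lists below are all assumptions. Spectra are represented as lists of (m/z, intensity) pairs.

```python
def has_peak(spectrum, target_mz, tol=0.5):
    """True if any peak lies within tol of target_mz (product-ion criterion)."""
    return any(abs(mz - target_mz) <= tol for mz, _ in spectrum)


def has_neutral_loss(spectrum, precursor_mz, loss, charge=1, tol=0.5):
    """True if a peak sits at the precursor m/z minus a neutral loss."""
    return has_peak(spectrum, precursor_mz - loss / charge, tol)


def has_ion_series(spectrum, spacings, tol=0.5):
    """True if some peak starts a ladder whose successive gaps match spacings."""
    for start_mz, _ in spectrum:
        mz = start_mz
        for gap in spacings:
            mz += gap
            if not has_peak(spectrum, mz, tol):
                break
        else:
            return True
    return False


def select(spectra, rule):
    """Names of the spectra for which the rule holds."""
    return [name for name, spectrum in spectra.items() if rule(spectrum)]


if __name__ == "__main__":
    # Made-up peak lists, for illustration only.
    spectra = {
        "scan_1": [(200.0, 10.0), (257.02, 8.0), (328.06, 5.0)],
        "scan_2": [(150.0, 3.0), (400.0, 9.0)],
    }
    # Ladder spaced by the residue masses of G then A.
    gly_ala = [57.02146, 71.03711]
    print(select(spectra, lambda s: has_ion_series(s, gly_ala)))
    print(select(spectra, lambda s: has_neutral_loss(s, 464.0, 64.0)))
```

Because an ion-series rule looks only at the spacing between peaks, it can still match when a modification shifts the absolute m/z values of the whole series.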

Page 34

Construction of SALSA ruler: GAIIGLMGGVV

[Figure: the ruler is the fragment ladder GA, GAI, GAII, GAIIG, GAIIGL, GAIIGLM, GAIIGLMG, GAIIGLMGG, GAIIGLMGGV, GAIIGLMGGVV laid out along the m/z axis. The ladder is drawn twice.]

Methionine Oxidation: 16 amu (one oxygen atom)
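A ruler of this kind can be sketched as a ladder of N-terminal fragments with their m/z values. The code assumes the ruler corresponds to singly charged b-type ions computed from standard monoisotopic residue masses, which the slide does not state; the function name and arguments are illustrative. It uses the sequence exactly as written on the slide.

```python
# Monoisotopic residue masses (Da) for the residues in this peptide.
MONO = {"G": 57.02146, "A": 71.03711, "I": 113.08406,
        "L": 113.08406, "M": 131.04049, "V": 99.06841}
PROTON = 1.007276
OXYGEN = 15.9949  # one oxygen atom, the "16 amu" on the slide


def ruler(peptide, mod_index=None, mod_mass=0.0):
    """Ladder of (prefix, m/z) from the 2-residue prefix to the full sequence.

    If mod_index is given, mod_mass is added at that residue, so every
    prefix that contains the residue is shifted by mod_mass.
    """
    ladder = []
    running = PROTON
    for i, aa in enumerate(peptide):
        running += MONO[aa]
        if i == mod_index:
            running += mod_mass
        if i >= 1:
            ladder.append((peptide[: i + 1], running))
    return ladder


if __name__ == "__main__":
    seq = "GAIIGLMGGVV"
    plain = ruler(seq)
    oxidized = ruler(seq, mod_index=seq.index("M"), mod_mass=OXYGEN)
    for (prefix, mz), (_, mz_ox) in zip(plain, oxidized):
        print(f"{prefix:<12} {mz:9.3f} {mz_ox:9.3f}  shift {mz_ox - mz:+.3f}")
```

Only the rungs from GAIIGLM onward move when the methionine is oxidized, so the position of the first shifted rung localizes the modification.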

Page 35

Quantification of Methionine Oxidation: Absolute Quantification Analysis

GAIIGLMVGGVV

GAIIGLMVGGVV: +7 amu

[Figure, panels A and B: MS/MS spectra annotated along the sequence with the fragment ions y3, y5, y6*, y7*, y8*, y9*, b4, b6, b7*, b9*, b11*, b12*, and [b11*]+2. Labeled species: [Aβ29-40]+1, [Aβ29-40+1O]+1, and [Aβ29-40+2O]+1.]
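The expected m/z values of the three labeled species, and the arithmetic behind a quantification of this kind, can be sketched as follows. The slide does not give the method, so two things here are assumptions: that the "+7 amu" peptide serves as an isotope-labeled internal standard spiked at a known amount, and that signal intensities are directly comparable between species. The function names and example intensities are hypothetical.

```python
# Monoisotopic residue masses (Da) for the residues in this peptide.
MONO = {"G": 57.02146, "A": 71.03711, "I": 113.08406,
        "L": 113.08406, "M": 131.04049, "V": 99.06841}
WATER = 18.010565
PROTON = 1.007276
OXYGEN = 15.994915


def mh_plus(peptide, n_oxygen=0, label_shift=0.0):
    """m/z of the singly protonated peptide with optional oxygens and label."""
    base = sum(MONO[aa] for aa in peptide) + WATER + PROTON
    return base + n_oxygen * OXYGEN + label_shift


def fraction_oxidized(i_native, i_plus_1o, i_plus_2o):
    """Share of total signal carried by the +1O and +2O forms."""
    total = i_native + i_plus_1o + i_plus_2o
    return (i_plus_1o + i_plus_2o) / total


def absolute_amount(i_analyte, i_standard, standard_amount):
    """Analyte amount from its intensity ratio to a spiked standard."""
    return standard_amount * i_analyte / i_standard


if __name__ == "__main__":
    seq = "GAIIGLMVGGVV"  # sequence shown on the slide
    for n in (0, 1, 2):
        print(f"+{n}O: {mh_plus(seq, n_oxygen=n):.3f}")
    print(f"+7 amu standard: {mh_plus(seq, label_shift=7.0):.3f}")
    # Hypothetical intensities, for illustration only.
    print("fraction oxidized:", fraction_oxidized(60.0, 30.0, 10.0))
    print("amount:", absolute_amount(50.0, 100.0, 2.0))
```

A labeled standard co-elutes with the native peptide but is offset in mass, so the intensity ratio between the two can be read from the same scan.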