Systematic Identification of Protein Domains for Structure Determination

Systematic Identification of Protein Domains for Structure Determination Ming Luo, Ph.D. University of Alabama at Birmingham March 29, 2004 NIH


Page 1: Systematic Identification of Protein Domains for Structure Determination

Systematic Identification of Protein Domains for Structure Determination

Ming Luo, Ph.D.

University of Alabama at Birmingham

March 29, 2004

NIH

Page 2: Systematic Identification of Protein Domains for Structure Determination

Current Progress on C. elegans Proteins

                Nov 03   Mar 04   Increase
CLONED            4762     7326       2564
EXPRESSED         2293     3128        835
SOLUBLE            368      503        135
PURIFIED           152      189         37
CRYSTALLIZED        58       65          7
X-RAY DATA          14       18          4
STRUCTURE            9       11      2 + 1

[Figure: four progress plots over time, each with x-axis ticks from Oct-00 to Apr-04 at six-month intervals. Panels: Protein expressed (1 mL), y-axis 0-3500; Soluble confirmed (1 L), y-axis 0-625; Purified (6 L), y-axis 0-200; Clone & expression, y-axis 0-7500.]

Date        Selected ORFs   Cloned   Expressed   Soluble (1 L)   Purified* (6 L)
4/7/2003           14,440    2,342       1,369             268               110
3/7/2004           15,556    7,326       3,218             503               189

* Unique ORFs, each expressed and purified multiple times.
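The counts in the 3/7/2004 row can be turned into stage-to-stage yields with simple division. The sketch below is only an illustration of that arithmetic; the function name `stage_yields` is hypothetical, and the stage order is assumed to follow the column order of the table.

```python
# Counts copied from the 3/7/2004 row of the progress table.
# The stage order is assumed to follow the table's column order.
stages = [
    ("Selected ORFs", 15556),
    ("Cloned", 7326),
    ("Expressed", 3218),
    ("Soluble (1 L)", 503),
    ("Purified (6 L)", 189),
]


def stage_yields(stages):
    """Return (from_stage, to_stage, fraction) for each consecutive pair of stages."""
    return [
        (src_name, dst_name, dst_count / src_count)
        for (src_name, src_count), (dst_name, dst_count) in zip(stages, stages[1:])
    ]


for src, dst, frac in stage_yields(stages):
    print(f"{src} -> {dst}: {frac:.1%}")
```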

Page 3: Systematic Identification of Protein Domains for Structure Determination

Domain Identification

Methods

1. Conserved Sequence (e.g. Pfam)

2. Spontaneous Degradation

3. Proteolysis

4. Functional Data

Markley

Page 4: Systematic Identification of Protein Domains for Structure Determination

Predict Domains by Sequence

Program used: SMART (http://smart.embl-heidelberg.de/)

Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95, 5857-5864

Letunic et al. (2002) Nucleic Acids Res. 30, 242-244

2-H91 35629 286

1 34655 320

11-D11

1 647323 475

28-C5

1 31325 273

11-E3

1 320151 304

11-D5

1 50041 436

20-D7

278 283

Four expressed, one soluble, none purified

Page 5: Systematic Identification of Protein Domains for Structure Determination

Spontaneous Degradation

1F11 76F6 3D2

Purified protein samples were stored at 4°C for over one month.

Page 6: Systematic Identification of Protein Domains for Structure Determination

Mass Spectrometry

[Figure: two TOF MS ES+ spectra, plotted as relative intensity (%) against mass, with mass axes running from 5000 to 50000.
18H1, eluted from gel: labeled peaks include 25440, 12720, 8480, and 6360.
3D2, solution specimen: labeled peaks include 8984, 12340, 17969, 31214, and 33618.]

Page 7: Systematic Identification of Protein Domains for Structure Determination

MS + AA Sequencing

MS: 19695
AA Code: SAIKD
140-309, 379

MS: 21279
AA Code: GSQSTSL
18-210, 261

76F6 3D2
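The slide pairs an observed mass with a short N-terminal sequence for each fragment. One way such a pair could be mapped onto a known full-length sequence is sketched below: find the N-terminal tag, then extend the fragment residue by residue until its average mass best matches the observed mass. This is an illustrative sketch, not the authors' procedure. The names `average_mass` and `map_fragment` and the toy sequence are hypothetical; the residue masses are standard average values, and modifications or tags are ignored.

```python
# Average residue masses (Da) for the 20 standard amino acids.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water per free peptide chain


def average_mass(peptide):
    """Average mass of an unmodified peptide."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in peptide) + WATER


def map_fragment(sequence, n_term_tag, observed_mass):
    """Return (start, end, error) for the fragment that begins at the first
    occurrence of n_term_tag and whose mass is closest to observed_mass.
    start and end are 1-based and inclusive; error is in Da."""
    start = sequence.find(n_term_tag)
    if start < 0:
        raise ValueError("N-terminal tag not found in sequence")
    best = None
    running = WATER
    for end in range(start, len(sequence)):
        running += AVG_RESIDUE_MASS[sequence[end]]
        error = abs(running - observed_mass)
        if best is None or error < best[2]:
            best = (start + 1, end + 1, error)
    return best


# Hypothetical toy sequence and mass, for illustration only.
toy_sequence = "MKTAYIAKQRSAIKDGLLEVWNPTRGHFCS"
toy_observed = average_mass(toy_sequence[10:21]) + 0.4
print(map_fragment(toy_sequence, "SAIKD", toy_observed))
```

The sketch takes the first occurrence of the tag, so a short tag that occurs more than once in a sequence would need extra handling.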

Page 8: Systematic Identification of Protein Domains for Structure Determination

Proteolysis

Trypsin Digestion

1. Trypsin:protein 1:200, 10 mM Tris, pH 7.6, 37°C.

2. N-terminal sequencing after transfer to PVDF: ELTSAEK---

3. Mass spectrometry using solution mixture: 19277, 17774

Result: 59-212

Min 0 5 10 15 20 60 MW

9H3
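When an N-terminal sequence is read from a trypsin-resistant fragment, a quick consistency check is whether that sequence begins right after a potential trypsin cleavage site. The sketch below uses the commonly quoted rule (cleavage C-terminal to K or R, but not before P). The function names and the toy sequence are hypothetical, and the real 9H3 sequence is not used.

```python
def tryptic_sites(sequence):
    """Return 1-based positions of residues after which trypsin may cleave.
    Rule used here: C-terminal to K or R, except when the next residue is P."""
    sites = []
    for i, aa in enumerate(sequence[:-1]):
        if aa in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    return sites


def starts_at_tryptic_site(sequence, n_term_tag):
    """True if the first occurrence of n_term_tag begins at the protein
    N-terminus or immediately after a potential trypsin cleavage site."""
    start = sequence.find(n_term_tag)
    if start < 0:
        raise ValueError("N-terminal tag not found in sequence")
    if start == 0:
        return True
    # A site at 1-based position p means the next fragment starts at 0-based index p.
    return start in tryptic_sites(sequence)


# Hypothetical toy sequence containing the tag read on the slide.
toy_sequence = "MSTNGKRELTSAEKLPRPQW"
print(tryptic_sites(toy_sequence))
print(starts_at_tryptic_site(toy_sequence, "ELTSAEK"))
```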

Page 9: Systematic Identification of Protein Domains for Structure Determination

Functional Data

1D10

Predicted Signal Peptide parameters from Soren Brunak's SignalP server:

Signal peptide predicted:

HMM-cleavage prediction: MPKLPLLLSFPLLFFASFAYA--(22)DEDFVT

ANN-cleavage prediction: MPKLPLLLSFPLLFFASFAYA--(22)DEDFVT

79D4
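A predicted signal-peptide cleavage site can be used to define a construct that starts at the first mature residue. The sketch below parses a cleavage line in the form shown on the slide and trims the signal peptide. It assumes that the number in parentheses is the position of the first mature residue, which is consistent with the 21-residue peptide shown; the function names are hypothetical, and only the residues printed on the slide are used.

```python
import re

# Matches lines of the form SIGNALPEPTIDE--(N)MATURESTART, as printed on the slide.
CLEAVAGE_LINE = re.compile(r"^([A-Z]+)--\((\d+)\)([A-Z]+)$")


def parse_cleavage(line):
    """Return (signal_peptide, first_mature_position, mature_start)."""
    match = CLEAVAGE_LINE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized cleavage line")
    signal = match.group(1)
    position = int(match.group(2))
    mature_start = match.group(3)
    # Assumption: the number is the 1-based position of the first mature residue.
    if len(signal) + 1 != position:
        raise ValueError("position does not follow the signal peptide")
    return signal, position, mature_start


def mature_region(full_sequence, first_mature_position):
    """Drop the signal peptide; positions are 1-based."""
    return full_sequence[first_mature_position - 1:]


signal, position, mature_start = parse_cleavage("MPKLPLLLSFPLLFFASFAYA--(22)DEDFVT")
# Only the residues shown on the slide are available here, not the full protein.
print(mature_region(signal + mature_start, position))
```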

Page 10: Systematic Identification of Protein Domains for Structure Determination

SUMMARY OF DOMAIN IDENTIFICATION

Domain ID by       Degradation   Proteolysis   Functional   Sequence   Total
Total # of ORFs    14            23            8            6          51
                   11            11            8            7          37
Expressed          10            7             5            4          26
Soluble            10            7             5            1          23
Purified           7             1             3            0          11
Xtal               5/(1 NMR)     1             2            0          8
Structure          3             0             2            0          5
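The summary counts can be compared across methods by dividing one row by another, for example soluble constructs per expressed construct. The sketch below shows that arithmetic for four of the labeled rows; the helper name `fraction` is hypothetical, and the column order is taken from the "Domain ID by" header.

```python
# Column order taken from the "Domain ID by" header of the summary.
methods = ["Degradation", "Proteolysis", "Functional", "Sequence"]

# Rows copied from the summary (per-method columns only).
counts = {
    "ORFs": [14, 23, 8, 6],
    "Expressed": [10, 7, 5, 4],
    "Soluble": [10, 7, 5, 1],
    "Purified": [7, 1, 3, 0],
}


def fraction(numerator_row, denominator_row):
    """Per-method ratio of two rows; None where the denominator is zero."""
    return {
        method: (num / den if den else None)
        for method, num, den in zip(
            methods, counts[numerator_row], counts[denominator_row]
        )
    }


for method, value in fraction("Soluble", "Expressed").items():
    print(f"{method}: {value:.0%} of expressed constructs were soluble")
```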

Page 11: Systematic Identification of Protein Domains for Structure Determination

CONCLUSIONS

Smaller structural domains are the most suitable targets for high-throughput (HTP) structure determination.

Domains identified experimentally from folded proteins are the most reliable.

Spontaneous degradation or limited proteolysis, followed by N-terminal sequencing and mass spectrometry, is the most efficient approach.

Page 12: Systematic Identification of Protein Domains for Structure Determination

Our Team

TARGET SCREEN AND AUTOMATION*: Chi-Hao Luan (Team Leader), Wen Ying Huang, ShiHong Qiu, Zhuhua Cao, Rita Gray, Qiao Shang

PROTEIN PRODUCTION*: Robert Bunzel (Team Leader), Danlin Luo, Jennifer Zhou, Alireza Arabshahi, Elizaveta Karpova, Annette McKinstry

CRYSTALLIZATION*: Larry DeLucas (Team Leader), Songlin Li, Youhong Zhang

X-RAY CRYSTALLOGRAPHY*: Songlin Li (Team Leader), Jindrich Symersky, Norbert Schormann, Guangda Lin, Shanyun Lu

BIOINFORMATICS*: Mike Carson (Team Leader), David Johnson, Jun Tsao