Protein analysis and proteomics (Part 1 of 2). Many of the images in this powerpoint presentation...


Page 1: Protein analysis and proteomics (Part 1 of 2). Many of the images in this powerpoint presentation are from Bioinformatics and Functional Genomics by Jonathan.

Protein analysis and proteomics (Part 1 of 2)


Copyright notice

Many of the images in this PowerPoint presentation are from Bioinformatics and Functional Genomics by Jonathan Pevsner (ISBN 0-471-21004-8). Copyright © 2003 by John Wiley & Sons, Inc.

These images and materials may not be used without permission from the publisher. We welcome instructors to use these PowerPoints for educational purposes, but please acknowledge the source.

The book has a homepage at http://www.bioinfbook.org, including hyperlinks to the book chapters.


Outline for today

Protein analysis and proteomics

Individual proteins
• Protein families
• Physical properties
• Localization
• Function

Large-scale protein analysis
• 2D protein gels
• Yeast two-hybrid
• Rosetta Stone approach
• Pathways


[Figure: DNA, RNA, and protein]

Page 224


protein

[1] Protein families
[2] Physical properties
[3] Protein localization
[4] Protein function

Gene ontology (GO):
-- cellular component
-- biological process
-- molecular function

Page 224


Perspective 1: Protein domains and motifs

Page 225


Definitions

Signature:
• a protein category such as a domain or motif

Domain:
• a region of a protein that can adopt a 3D structure
• a fold
• a family is a group of proteins that share a domain
• examples: zinc finger domain, immunoglobulin domain

Motif (or fingerprint):
• a short, conserved region of a protein
• typically 10 to 20 contiguous amino acid residues

Page 225


15 most common domains (human)

Domain                        Proteins
Zn finger, C2H2 type          1093
Immunoglobulin                1032
EGF-like                      471
Zn-finger, RING               458
Homeobox                      417
Pleckstrin-like               405
RNA-binding region RNP-1      400
SH3                           394
Calcium-binding EF-hand       392
Fibronectin, type III         300
PDZ/DHR/GLGF                  280
Small GTP-binding protein     261
BTB/POZ                       236
bHLH                          226
Cadherin                      226

Page 227


15 most common domains (various species)

The European Bioinformatics Institute (EBI) offers many key proteomics resources:

http://www.ebi.ac.uk/proteome/

Page 227


Definition of a domain

According to InterPro at EBI (http://www.ebi.ac.uk/interpro/):

A domain is an independent structural unit, found alone or in conjunction with other domains or repeats. Domains are evolutionarily related.

According to SMART (http://smart.embl-heidelberg.de):

A domain is a conserved structural entity with distinctive secondary structure content and a hydrophobic core. Homologous domains with common functions usually show sequence similarities.

Page 226


Varieties of protein domains

Page 228

Extending along the length of a protein

Occupying a subset of a protein sequence

Occurring one or more times


Example of a protein with domains: Methyl CpG binding protein 2 (MeCP2)

[Figure: MeCP2 with its MBD and TRD domains marked]

The protein includes a methylated DNA binding domain (MBD) and a transcriptional repression domain (TRD). MeCP2 is a transcriptional repressor.

Mutations in the gene encoding MeCP2 cause Rett syndrome, a neurological disorder primarily affecting girls.

Page 227


Page 228

Result of an MeCP2 blastp search: A methyl-binding domain shared by several proteins


Page 228

Are proteins that share only a domain homologous?


Example of a multidomain protein: HIV-1 pol

• 1003 amino acids long

• cleaved into three proteins with distinct activities:
-- aspartyl protease
-- reverse transcriptase
-- integrase

We will explore HIV-1 pol and other proteins at the Expert Protein Analysis System (ExPASy) server.

Visit www.expasy.org/

Page 229


Page 230

SwissProt entry for HIV-1 pol links to many databases


Page 231

ProDom entry for HIV-1 pol shows many related proteins


Page 231

Proteins can have both domains and patterns (motifs)

Domain (aspartyl protease)

Domain (reverse transcriptase)

Pattern (several residues)

Pattern (several residues)


Definition of a motif

A motif (or fingerprint) is a short, conserved region of a protein. Its size is often 10 to 20 amino acids.

Simple motifs include transmembrane domains and phosphorylation sites. These do not imply homology when found in a group of proteins.

PROSITE (www.expasy.org/prosite) is a dictionary of motifs (there are currently 1600 entries). In PROSITE, a pattern is a qualitative motif description (a protein either matches a pattern, or not). In contrast, a profile is a quantitative motif description. We will encounter profiles in Pfam, ProDom, SMART, and other databases.

Page 231-233
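Because a PROSITE pattern is a yes/no description, it behaves much like a regular expression. The sketch below is a minimal illustration, not PROSITE's own software: the function names are hypothetical, the example pattern is the C2H2 zinc finger pattern as recalled from PROSITE (check it against the database before relying on it), and the test sequence is invented.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handles the common elements only: x (any residue), [ABC] (allowed),
    {ABC} (forbidden), (n) and (n,m) repeats, and the < and > anchors.
    Anchors written inside brackets (for example [G>]) are not handled.
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):        # pattern starts at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):          # pattern ends at the C-terminus
            suffix, element = "$", element[:-1]

        # Optional repeat count such as x(3) or x(2,4).
        repeat = ""
        m = re.fullmatch(r"(.+?)\((\d+)(?:,(\d+))?\)", element)
        if m:
            element = m.group(1)
            upper = "," + m.group(3) if m.group(3) else ""
            repeat = "{" + m.group(2) + upper + "}"

        if element == "x":
            core = "."                     # any amino acid
        elif element.startswith("{") and element.endswith("}"):
            core = "[^" + element[1:-1] + "]"   # any residue except these
        else:
            core = element                 # single residue or [ABC] class
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def matches_pattern(sequence: str, pattern: str) -> bool:
    """Return True if the pattern occurs anywhere in the sequence."""
    return re.search(prosite_to_regex(pattern), sequence.upper()) is not None


# Pattern recalled from PROSITE's C2H2 zinc finger entry (an assumption).
C2H2 = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H"
# Invented sequence built to satisfy the pattern.
EXAMPLE = "AACPECGKAFSRSDELTRHIRTHTG"

print(prosite_to_regex(C2H2))
print(matches_pattern(EXAMPLE, C2H2))
```

A profile, by contrast, would assign each position a score rather than a pass/fail test, so it cannot be expressed as a single regular expression.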


Perspective 2: Physical properties of proteins

Page 233


Physical properties of proteins

Many websites are available for the analysis of individual proteins. ExPASy and ISREC are two excellent resources.

The accuracy of these programs is variable. Predictions based on primary amino acid sequence (such as molecular weight prediction) are likely to be more trustworthy. For many other properties (such as posttranslational modification of proteins by specific sugars), experimental evidence may be required rather than prediction algorithms.

Page 236
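Molecular weight is a good example of a prediction that follows directly from the primary sequence: sum the residue masses and add one water for the free termini. The sketch below assumes standard average residue masses typed from memory (verify them against a reference table before use); the function name is hypothetical and no posttranslational modifications are considered.

```python
# Average residue masses in daltons (amino acid minus one water).
# Values are standard textbook figures recalled from memory (an assumption).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one H2O for the free N- and C-termini


def molecular_weight(sequence: str) -> float:
    """Estimate the average molecular weight of an unmodified peptide."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    unknown = set(sequence) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


# Free glycine should come out close to its known mass of about 75.07 Da.
print(round(molecular_weight("G"), 2))
```

The estimate ignores glycosylation, phosphorylation, and proteolytic processing, which is one reason measured masses can differ from sequence-based predictions.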


Syntaxin, SNAP-25 and VAMP are three proteins that interact via coiled-coil domains


Introduction to Perspectives 3 and 4: Gene Ontology (GO) Consortium

Page 237


The Gene Ontology Consortium

An ontology is a description of concepts. The GO Consortium compiles a dynamic, controlled vocabulary of terms related to gene products.

There are three organizing principles:
• Molecular function
• Biological process
• Cellular component

You can visit GO at http://www.geneontology.org. There is no centralized GO database. Instead, curators of organism-specific databases assign GO terms to gene products for each organism.

Page 237


Page 241

GO terms are assigned to LocusLink entries


The Gene Ontology Consortium: Evidence Codes

IC    Inferred by curator
IDA   Inferred from direct assay
IEA   Inferred from electronic annotation
IEP   Inferred from expression pattern
IGI   Inferred from genetic interaction
IMP   Inferred from mutant phenotype
IPI   Inferred from physical interaction
ISS   Inferred from sequence or structural similarity
NAS   Non-traceable author statement
ND    No biological data
TAS   Traceable author statement

Page 240
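Evidence codes make it possible to weigh annotations differently, for example by separating experimentally supported terms from purely electronic ones. The sketch below is illustrative only: the `Annotation` record, the choice of which codes count as experimental, and the RBP annotations themselves are assumptions made up for this example, not curated GO data.

```python
from dataclasses import dataclass

# Evidence codes as listed on the slide.
EVIDENCE_CODES = {
    "IC": "Inferred by curator",
    "IDA": "Inferred from direct assay",
    "IEA": "Inferred from electronic annotation",
    "IEP": "Inferred from expression pattern",
    "IGI": "Inferred from genetic interaction",
    "IMP": "Inferred from mutant phenotype",
    "IPI": "Inferred from physical interaction",
    "ISS": "Inferred from sequence or structural similarity",
    "NAS": "Non-traceable author statement",
    "ND": "No biological data",
    "TAS": "Traceable author statement",
}

# Codes treated here as experimental support (a grouping chosen for this sketch).
EXPERIMENTAL = {"IDA", "IEP", "IGI", "IMP", "IPI"}


@dataclass(frozen=True)
class Annotation:
    """One GO term attached to one gene product (hypothetical record layout)."""
    gene_product: str
    ontology: str      # molecular function, biological process, or cellular component
    term: str
    evidence: str


def experimentally_supported(annotations):
    """Keep only annotations whose evidence code is in EXPERIMENTAL."""
    for a in annotations:
        if a.evidence not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {a.evidence}")
    return [a for a in annotations if a.evidence in EXPERIMENTAL]


# Invented annotations for illustration; not taken from any database.
ANNOTATIONS = [
    Annotation("RBP", "molecular function", "retinol binding", "IDA"),
    Annotation("RBP", "cellular component", "extracellular", "IEA"),
    Annotation("RBP", "biological process", "vision", "TAS"),
]

for a in experimentally_supported(ANNOTATIONS):
    print(a.gene_product, a.term, EVIDENCE_CODES[a.evidence])
```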


Perspective 3: Protein localization

Page 242


Protein localization

Proteins may be localized to intracellular compartments, the cytosol, or the plasma membrane, or they may be secreted. Many proteins shuttle between multiple compartments.

A variety of algorithms predict localization, but this is essentially a cell biological question.

Page 240


Localization of 2,900 yeast proteins

Michael Snyder and colleagues incorporated epitope tags into thousands of S. cerevisiae cDNAs, and systematically localized proteins (Kumar et al., 2002).

See http://ygac.med.yale.edu for a database including 2,900 fluorescence micrographs.

Page 243


Perspective 4: Protein function

Page 243


Protein function

Function refers to the role of a protein in the cell. We can consider protein function from a variety of perspectives.

Page 243


1. Biochemical function (molecular function)

RBP binds retinol, could be a carrier

Page 245


2. Functional assignment based on homology

RBP could be a carrier too

Other carrier proteins

Page 245


3. Function based on structure

RBP forms a calyx

Page 245


4. Function based on ligand binding specificity

RBP binds vitamin A

Page 245


5. Function based on cellular process

[Figure: DNA, RNA]

RBP is abundant, soluble, secreted

Page 245


6. Function based on biological process

RBP is essential for vision

Page 245


7. Function based on “proteomics” or high throughput “functional genomics”

High throughput analyses show...

RBP levels elevated in renal failure
RBP levels decreased in liver disease

Page 245


Functional assignment of enzymes: the EC (Enzyme Commission) system

Oxidoreductases   1,003
Transferases      1,076
Hydrolases        1,125
Lyases            356
Isomerases        156
Ligases           126

Page 246
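An EC number is hierarchical, and its first field identifies one of the six classes listed above. The sketch below shows that lookup under stated assumptions: the function name is hypothetical, only the six classes from the slide are included, and the example numbers (1.1.1.1 for alcohol dehydrogenase, 3.4.23.16 for the HIV-1 protease) are quoted from memory and should be checked against the enzyme nomenclature database.

```python
# The six EC classes, in the order given on the slide.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def ec_class(ec_number: str) -> str:
    """Return the top-level class name for an EC number such as '3.4.23.16'."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):      # accept an optional 'EC' prefix
        text = text[2:].strip()
    first_field = text.split(".")[0]
    if not first_field.isdigit() or int(first_field) not in EC_CLASSES:
        raise ValueError(f"not a recognized EC number: {ec_number!r}")
    return EC_CLASSES[int(first_field)]


# Example numbers recalled from memory (an assumption, see above).
print(ec_class("1.1.1.1"))
print(ec_class("EC 3.4.23.16"))
```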


Functional assignment of proteins: Clusters of Orthologous Groups (COGs)

Information storage and processing

Cellular processes

Metabolism

Poorly characterized

(Most useful for prokaryotes)

Page 247


This lecture continues in Part 2 with a discussion of two-dimensional gels and the yeast two-hybrid system.