Protein analysis and proteomics (Part 1 of 2)
Many of the images in this powerpoint presentation are from Bioinformatics and Functional Genomics by Jonathan Pevsner (ISBN 0-471-21004-8). Copyright © 2003 by John Wiley & Sons, Inc.
These images and materials may not be used without permission from the publisher. We welcome instructors to use these powerpoints for educational purposes, but please acknowledge the source.
The book has a homepage at http://www.bioinfbook.org including hyperlinks to the book chapters.
Copyright notice
Outline for today
Protein analysis and proteomics
Individual proteins
-- Protein families
-- Physical properties
-- Localization
-- Function
Large-scale protein analysis
-- 2D protein gels
-- Yeast two-hybrid
-- Rosetta Stone approach
-- Pathways
DNA, RNA, protein
Four perspectives on a protein:
[1] Protein families
[2] Physical properties
[3] Protein localization
[4] Protein function
Page 224
Gene ontology (GO):
-- cellular component
-- biological process
-- molecular function
Perspective 1: Protein domains and motifs
Page 225
Definitions
Signature: • a protein category such as a domain or motif
Domain:
• a region of a protein that can adopt a 3D structure
• a fold
• a family is a group of proteins that share a domain
• examples: zinc finger domain, immunoglobulin domain
Motif (or fingerprint):
• a short, conserved region of a protein
• typically 10 to 20 contiguous amino acid residues
Page 225
15 most common domains (human)
Domain                          Proteins
Zn finger, C2H2 type            1093
Immunoglobulin                  1032
EGF-like                        471
Zn-finger, RING                 458
Homeobox                        417
Pleckstrin-like                 405
RNA-binding region RNP-1        400
SH3                             394
Calcium-binding EF-hand         392
Fibronectin, type III           300
PDZ/DHR/GLGF                    280
Small GTP-binding protein       261
BTB/POZ                         236
bHLH                            226
Cadherin                        226
Page 227
15 most common domains (various species)
The European Bioinformatics Institute (EBI) offers many key proteomics resources:
http://www.ebi.ac.uk/proteome/
Page 227
Definition of a domain
According to InterPro at EBI (http://www.ebi.ac.uk/interpro/):
A domain is an independent structural unit, found alone or in conjunction with other domains or repeats. Domains are evolutionarily related.
According to SMART (http://smart.embl-heidelberg.de):
A domain is a conserved structural entity with distinctive secondary structure content and a hydrophobic core. Homologous domains with common functions usually show sequence similarities.
Page 226
Varieties of protein domains
Page 228
Extending along the length of a protein
Occupying a subset of a protein sequence
Occurring one or more times
Example of a protein with domains: Methyl CpG binding protein 2 (MeCP2)
MBD
Page 227
TRD
The protein includes a methylated DNA binding domain (MBD) and a transcriptional repression domain (TRD). MeCP2 is a transcriptional repressor.
Mutations in the gene encoding MeCP2 cause Rett Syndrome, a neurological disorder primarily affecting girls.
Page 228
Result of an MeCP2 blastp search: a methyl-binding domain shared by several proteins
Page 228
Are proteins that share only a domain homologous?
Example of a multidomain protein: HIV-1 pol
• 1003 amino acids long
• cleaved into three proteins with distinct activities:
-- aspartyl protease
-- reverse transcriptase
-- integrase
We will explore HIV-1 pol and other proteins at the Expert Protein Analysis System (ExPASy) server.
Visit www.expasy.org/
Page 229
SwissProt entry for HIV-1 pol links to many databases
Page 231
ProDom entry for HIV-1 pol shows many related proteins
Page 231
Proteins can have both domains and patterns (motifs)
Domain (aspartyl protease)
Domain (reverse transcriptase)
Pattern (several residues)
Pattern (several residues)
Page 232
Definition of a motif
A motif (or fingerprint) is a short, conserved region of a protein. Its size is often 10 to 20 amino acids.
Simple motifs include transmembrane domains and phosphorylation sites. These do not imply homology when found in a group of proteins.
PROSITE (www.expasy.org/prosite) is a dictionary of motifs (there are currently 1600 entries). In PROSITE, a pattern is a qualitative motif description (a protein either matches a pattern, or not). In contrast, a profile is a quantitative motif description. We will encounter profiles in Pfam, ProDom, SMART, and other databases.
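To make the "matches or not" nature of a pattern concrete, here is a minimal Python sketch that translates a PROSITE-style pattern into a regular expression and scans a sequence with it. It is not PROSITE's own software. It assumes only a subset of the pattern syntax (single residues, x, [..], {..}, repeat counts, and the < and > anchors). The function names and the toy sequence are made up for illustration. The two example patterns are written from memory of the protein kinase C phosphorylation site and N-glycosylation site entries and should be checked against PROSITE before use.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern (subset of the syntax) to a regex."""
    regex = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):          # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):            # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        match = re.fullmatch(r"([^()]+)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError("cannot parse pattern element: " + element)
        core, low, high = match.groups()
        if core == "x":                      # any residue
            token = "."
        elif core.startswith("{") and core.endswith("}"):
            token = "[^" + core[1:-1] + "]"  # any residue except these
        else:
            token = core                     # single residue or [ABC]
        if low is not None:                  # repeat count (n) or (n,m)
            token += "{" + low + ("," + high if high else "") + "}"
        regex.append(prefix + token + suffix)
    return "".join(regex)


def find_matches(pattern, sequence):
    """Return (1-based start, matched residues), including overlapping hits."""
    lookahead = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence)]


# Toy sequence, invented for this example (not a real protein).
toy_sequence = "MASTKRNGSA"
pkc_hits = find_matches("[ST]-x-[RK]", toy_sequence)        # phosphorylation-site style
glyc_hits = find_matches("N-{P}-[ST]-{P}", toy_sequence)    # N-glycosylation style
print(pkc_hits, glyc_hits)
```

A short pattern like these matches many unrelated proteins by chance, which is why such hits do not imply homology.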
Page 231-233
Perspective 2: Physical properties of proteins
Page 233
Page 234
Physical properties of proteins
Many websites are available for the analysis of individual proteins. ExPASy and ISREC are two excellent resources.
The accuracy of these programs is variable. Predictions based on primary amino acid sequence (such as molecular weight prediction) are likely to be more trustworthy. For many other properties (such as posttranslational modification of proteins by specific sugars), experimental evidence may be required rather than prediction algorithms.
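Molecular weight is trustworthy because it follows directly from the sequence: sum the residue masses and add one water for the free termini. The Python sketch below illustrates this. It is not the code behind any ExPASy tool. The residue masses are approximate average values entered by hand, the function name is made up, and the calculation ignores posttranslational modifications, signal peptide cleavage, and disulfide bonds.

```python
# Approximate average residue masses in daltons (free amino acid minus water).
# These values are assumptions for this sketch; check them before relying on them.
RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "E": 129.1155, "Q": 128.1307, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_MASS = 18.015  # one water is added back for the free N- and C-termini


def molecular_weight(sequence):
    """Estimate the average molecular weight (Da) of an unmodified protein."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError("unsupported residue codes: " + "".join(sorted(unknown)))
    if not sequence:
        return 0.0
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS


# Sanity check on free glycine, whose molecular weight is about 75.07 Da.
print(round(molecular_weight("G"), 2))
```

A glycosylated or proteolytically processed protein will run at a different mass on a gel than this calculation suggests, which is where experimental evidence becomes necessary.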
Syntaxin, SNAP-25 and VAMP are three proteins that interact via coiled-coil domains
Introduction to Perspectives 3 and 4: Gene Ontology (GO) Consortium
Page 237
The Gene Ontology Consortium
An ontology is a description of concepts. The GO Consortium compiles a dynamic, controlled vocabulary of terms related to gene products.
There are three organizing principles:
-- Molecular function
-- Biological process
-- Cellular component
You can visit GO at http://www.geneontology.org. There is no centralized GO database. Instead, curators of organism-specific databases assign GO terms to gene products for each organism.
Page 237
Page 241
GO terms are assigned to LocusLink entries
Page 241
The Gene Ontology Consortium: Evidence Codes
IC    Inferred by curator
IDA   Inferred from direct assay
IEA   Inferred from electronic annotation
IEP   Inferred from expression pattern
IGI   Inferred from genetic interaction
IMP   Inferred from mutant phenotype
IPI   Inferred from physical interaction
ISS   Inferred from sequence or structural similarity
NAS   Non-traceable author statement
ND    No biological data
TAS   Traceable author statement
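Evidence codes let you decide how much weight to give an annotation. The Python sketch below stores the codes listed above and keeps only annotations backed by a direct experiment. The grouping of IDA, IEP, IGI, IMP, and IPI as "experimental" is an assumption of this sketch, and the gene names and annotation tuples are hypothetical placeholders, not records from any GO database.

```python
EVIDENCE_CODES = {
    "IC": "Inferred by curator",
    "IDA": "Inferred from direct assay",
    "IEA": "Inferred from electronic annotation",
    "IEP": "Inferred from expression pattern",
    "IGI": "Inferred from genetic interaction",
    "IMP": "Inferred from mutant phenotype",
    "IPI": "Inferred from physical interaction",
    "ISS": "Inferred from sequence or structural similarity",
    "NAS": "Non-traceable author statement",
    "ND": "No biological data",
    "TAS": "Traceable author statement",
}

# Assumption for this sketch: codes that rest on a laboratory experiment.
EXPERIMENTAL = {"IDA", "IEP", "IGI", "IMP", "IPI"}


def experimentally_supported(annotations):
    """Keep (gene, term, code) tuples whose evidence code is experimental."""
    kept = []
    for gene, term, code in annotations:
        if code not in EVIDENCE_CODES:
            raise ValueError("unknown evidence code: " + code)
        if code in EXPERIMENTAL:
            kept.append((gene, term, code))
    return kept


# Hypothetical annotations, invented for illustration only.
example_annotations = [
    ("geneA", "retinol binding", "IDA"),
    ("geneB", "retinol binding", "IEA"),
    ("geneC", "visual perception", "IMP"),
    ("geneD", "extracellular region", "ISS"),
]
for gene, term, code in experimentally_supported(example_annotations):
    print(gene, term, code, EVIDENCE_CODES[code])
```

Annotations tagged IEA or ISS are predictions rather than observations and deserve more caution.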
Page 240
Perspective 3: Protein localization
Page 242
Protein localization
Proteins may be localized to intracellular compartments, cytosol, the plasma membrane, or they may be secreted. Many proteins shuttle between multiple compartments.
A variety of algorithms predict localization, but this is essentially a cell biological question.
Page 240
Localization of 2,900 yeast proteins
Michael Snyder and colleagues incorporated epitope tags into thousands of S. cerevisiae cDNAs, and systematically localized proteins (Kumar et al., 2002).
See http://ygac.med.yale.edu for a database including 2,900 fluorescence micrographs.
Page 243
Perspective 4: Protein function
Page 243
Protein function
Function refers to the role of a protein in the cell. We can consider protein function from a variety of perspectives.
Page 243
1. Biochemical function (molecular function)
RBP binds retinol, could be a carrier
Page 245
2. Functional assignment based on homology
RBP could be a carrier too
Other carrier proteins
Page 245
3. Function based on structure
RBP forms a calyx
Page 245
4. Function based on ligand binding specificity
RBP binds vitamin A
Page 245
5. Function based on cellular process
DNA RNA
RBP is abundant, soluble, secreted
Page 245
6. Function based on biological process
RBP is essential for vision
Page 245
7. Function based on “proteomics” or high throughput “functional genomics”
High throughput analyses show...
RBP levels elevated in renal failure
RBP levels decreased in liver disease
Page 245
Functional assignment of enzymes: the EC (Enzyme Commission) system
Class              Entries
Oxidoreductases    1,003
Transferases       1,076
Hydrolases         1,125
Lyases             356
Isomerases         156
Ligases            126
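An EC number is a four-level code, and its first field names one of the six classes listed above. The Python sketch below reads that first field. It assumes the conventional numbering of the classes (1 for oxidoreductases through 6 for ligases), which the slide does not state, and the function name is made up. The two example EC numbers are, to the best of my knowledge, those of the HIV-1 protease and reverse transcriptase discussed earlier, but treat them as illustrative.

```python
# Top-level EC classes, assuming the conventional numbering 1 to 6.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def ec_class(ec_number):
    """Return the top-level class name for an EC number such as '3.4.23.16'."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()              # tolerate an "EC " prefix
    fields = text.split(".")
    if len(fields) != 4:
        raise ValueError("an EC number has four fields: " + ec_number)
    if not fields[0].isdigit() or int(fields[0]) not in EC_CLASSES:
        raise ValueError("unrecognized EC class in: " + ec_number)
    return EC_CLASSES[int(fields[0])]


# Illustrative EC numbers for two HIV-1 pol products (verify before citing).
print(ec_class("EC 3.4.23.16"))  # aspartyl protease
print(ec_class("2.7.7.49"))      # reverse transcriptase
```

The remaining three fields narrow the assignment to subclass, sub-subclass, and the individual enzyme.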
Page 246
Functional assignment of proteins: Clusters of Orthologous Groups (COGs)
Information storage and processing
Cellular processes
Metabolism
Poorly characterized
(Most useful for prokaryotes)
Page 247
This lecture continues in Part 2 with a discussion of two-dimensional gels and the yeast two-hybrid system.