Classification: understanding the diversity and principles of protein structure and function
MCSG 2001 structures
Tool: structure comparison
Why compare protein structures?
• Learn about
  – structure-function relationships
  – evolution
  – common building blocks (motifs)
• Make order out of the universe of protein structures
• Help structure prediction
Inge Jonassen
Structure Comparison
• Evolutionary relationship
• Structure classification
Growth factor
Cytokine
Shapiro & Harris, 2000
9% sequence identity
Gerstein
Strategies
• Bottom up: find local matches first, then solve the combinatorial problem to identify the largest cluster of matching substructures (e.g., Dali, CE, etc.)
• Top down: find a rough global alignment first, then use a refinement procedure to identify details of local matches (e.g., 3D look-up, subgraph isomorphism, etc.)
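The first step of the bottom-up strategy (finding local matches) can be sketched as follows. This is only an illustration, not the procedure used by Dali or CE: the function names, the fragment length `frag_len` and the tolerance `tol` are assumptions made for the example, and the second step (combining the local matches into the largest consistent cluster) is left out.

```python
import math
from itertools import combinations


def internal_distances(fragment):
    """All pairwise distances between the points of one fragment."""
    return [math.dist(p, q) for p, q in combinations(fragment, 2)]


def local_matches(chain_a, chain_b, frag_len=4, tol=0.5):
    """Return (i, j) start positions of fragments with similar geometry.

    chain_a, chain_b: lists of (x, y, z) coordinates, e.g. one point per residue.
    frag_len and tol are illustrative values, not taken from any published tool.
    """
    matches = []
    for i in range(len(chain_a) - frag_len + 1):
        dist_a = internal_distances(chain_a[i:i + frag_len])
        for j in range(len(chain_b) - frag_len + 1):
            dist_b = internal_distances(chain_b[j:j + frag_len])
            # Two fragments match locally if every internal distance agrees
            # within the tolerance; no superposition is needed for this test.
            if max(abs(x - y) for x, y in zip(dist_a, dist_b)) <= tol:
                matches.append((i, j))
    return matches


# Toy coordinates: chain_b is chain_a shifted in space, so fragment i matches fragment i.
chain_a = [(0, 0, 0), (3.8, 0, 0), (5, 3, 1), (7, 4, 4), (6, 8, 5), (9, 9, 9)]
chain_b = [(x + 10, y - 2, z + 3) for x, y, z in chain_a]
matches = local_matches(chain_a, chain_b)
```

Comparing internal distances rather than raw coordinates keeps the test independent of how the two structures are placed in space.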
Types of substructure used
• Atom/atom group
• Residue
• Fragment
• Secondary structure element (SSE)
Structure described by elements of the chosen type (e.g. molecular surface)
SAP (iterated DDP)
DDP
Dali
Dali’s idea
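The slide gives only the title, so the following is a sketch of the idea as it is usually described for Dali: two structures are similar if equivalent residue pairs have similar intramolecular distances. The scoring form below follows the "elastic" similarity score from the published Dali description as best recalled; the constants `theta = 0.2` and `alpha = 20.0` (in Å) and all function names are assumptions to be checked against the original paper.

```python
import math


def distance_matrix(coords):
    """Residue-residue distance matrix from one (x, y, z) point per residue."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def elastic_score(dist_a, dist_b, pairs, theta=0.2, alpha=20.0):
    """Sum a distance-similarity term over all pairs of aligned residues.

    pairs: list of (index_in_a, index_in_b) equivalences, assumed one-to-one.
    theta: tolerated relative deviation; alpha: distance scale of the
    envelope that down-weights long-range pairs (assumed values).
    """
    total = 0.0
    for ia, ib in pairs:
        for ja, jb in pairs:
            if ia == ja:
                total += theta  # a residue paired with itself contributes theta
                continue
            da, db = dist_a[ia][ja], dist_b[ib][jb]
            mean = (da + db) / 2.0
            total += (theta - abs(da - db) / mean) * math.exp(-(mean / alpha) ** 2)
    return total


coords_a = [(0, 0, 0), (3.8, 0, 0), (5, 3, 1), (7, 4, 4), (6, 8, 5)]
coords_same = [(x + 5, y + 5, z + 5) for x, y, z in coords_a]   # shifted copy
coords_scaled = [(2 * x, 2 * y, 2 * z) for x, y, z in coords_a]  # distorted copy
identity = [(i, i) for i in range(len(coords_a))]

score_same = elastic_score(distance_matrix(coords_a), distance_matrix(coords_same), identity)
score_scaled = elastic_score(distance_matrix(coords_a), distance_matrix(coords_scaled), identity)
```

A shifted copy scores higher than a distorted one because only distance differences are penalized, never the absolute placement of the structures.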
3D Lookup: Holm & Sander (1995)
3D Lookup
Structure Comparison Tools
• DALI (http://www2.ebi.ac.uk/dali/)
• CE (http://cl.sdsc.edu/)
• VAST (http://www.ncbi.nlm.nih.gov/Structure/VAST/vast.html)
• Prosup (http://www.came.sbg.ac.at)
• FLASH (http://thr.ibms.sinica.edu.tw/flash/)
Complication: alternative alignments
Results of different methods on the comparison of Azurin (1azc:A) vs plastocyanin (1plc)
Complication: permutation
[Figure: two structures, each with termini N and C and elements A, B, C, D]
..A..B..C..D..
..C..D..A..B..
Circular Permutation (CP)
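At the level of element order, a circular permutation is a rotation: ..C..D..A..B.. is ..A..B..C..D.. cut at one point and rejoined. A minimal sketch of that check follows; the function name and the one-letter-per-element encoding are assumptions for illustration, and a real comparison would also have to confirm that the elements match in 3D.

```python
def circular_shift(order_a, order_b):
    """Return the rotation that turns order_a into order_b, or None.

    order_a, order_b: sequences of element labels, e.g. "ABCD" and "CDAB".
    A shift of 0 means the two orders are identical.
    """
    a, b = list(order_a), list(order_b)
    if len(a) != len(b):
        return None
    doubled = a + a  # every rotation of a is a window of a + a
    for shift in range(len(a)):
        if doubled[shift:shift + len(a)] == b:
            return shift
    return None


print(circular_shift("ABCD", "CDAB"))  # 2: cut between B and C
print(circular_shift("ABCD", "ADCB"))  # None: not a circular permutation
```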
A real example of CP: 1nls (Concanavalin) vs. 1led (Lectin)
[Figure: the two structures with their N- and C-termini marked]
Angle-distance map:
SSE matching:
FLASH’s algorithm
Greedy selection of distinct alignment solutions
g - f’ 50.4
d - c’ 42.1
f - e’ 33.7
e - d’ 33.3
b - a’ 30.4
c - b’ 27.2
c - e’ 8.9
e - a’ 7.7
f - b’ 6.3
b - d’ 5.8
g - c’ 4.6
d - f’ 3.8
f - a’ 1.5
c - a’ 1.2
a - c’ 1.2
(1) g - f’ d - c’ f - e’ e - d’ b - a’ c - b’
(2) c - e’ b - d’ d - f’
(3) e - a’ f - b’ g - c’
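One way such a greedy selection could work is sketched below, using the pair scores listed above. The slides do not state FLASH's actual compatibility test or cutoff, so two assumptions are made here and should not be read as the author's method: a pair joins a solution only if it reuses no SSE and keeps the same sequential order as the pairs already there, and pairs scoring below a hypothetical `min_score` of 3.0 are ignored. Under these assumptions the sketch reproduces the three groups (1) to (3).

```python
# (SSE in structure 1, SSE in structure 2, score), copied from the list above.
SCORED_PAIRS = [
    ("g", "f", 50.4), ("d", "c", 42.1), ("f", "e", 33.7), ("e", "d", 33.3),
    ("b", "a", 30.4), ("c", "b", 27.2), ("c", "e", 8.9), ("e", "a", 7.7),
    ("f", "b", 6.3), ("b", "d", 5.8), ("g", "c", 4.6), ("d", "f", 3.8),
    ("f", "a", 1.5), ("c", "a", 1.2), ("a", "c", 1.2),
]


def compatible(pair, solution):
    """Assumed test: no SSE reused, and no order crossing with existing pairs."""
    x, y = pair
    for u, v in solution:
        if x == u or y == v:
            return False
        if (ord(x) - ord(u)) * (ord(y) - ord(v)) < 0:
            return False
    return True


def greedy_solutions(scored_pairs, min_score=3.0):
    """Visit pairs by descending score; put each into the first solution it fits."""
    solutions = []
    for x, y, score in sorted(scored_pairs, key=lambda item: -item[2]):
        if score < min_score:  # hypothetical cutoff, not given in the slides
            break
        for solution in solutions:
            if compatible((x, y), solution):
                solution.append((x, y))
                break
        else:
            solutions.append([(x, y)])  # pair fits nowhere: start a new solution
    return solutions


for number, solution in enumerate(greedy_solutions(SCORED_PAIRS), start=1):
    print(f"({number})", " ".join(f"{x} - {y}'" for x, y in solution))
```

The order test is a stand-in for whatever geometric consistency check FLASH applies (the slides mention SSE matching and an angle-distance map), and it would have to be relaxed for permuted alignments.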
Optimal: Alternative:
Scrambled Protein Pair
A B C D
A D C B
[Figure: the two structures with termini labeled N, C and N’, C’]
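The difference between the element orders seen so far can be made explicit with a small classifier. This is a sketch at the level of labels only: the function name and the three category names are assumptions for illustration, and it presumes the equivalent elements have already been identified by a structural comparison.

```python
def classify_order(order_a, order_b):
    """Classify how the element order of b relates to that of a.

    Returns "sequential", "circular permutation", or "scrambled".
    Both orders must contain the same distinct labels.
    """
    a, b = list(order_a), list(order_b)
    if sorted(a) != sorted(b) or len(set(a)) != len(a):
        raise ValueError("orders must contain the same distinct elements")
    position = {label: index for index, label in enumerate(a)}
    perm = [position[label] for label in b]  # where each element of b sits in a
    n = len(perm)
    if perm == list(range(n)):
        return "sequential"
    # In a rotation, each element is followed (cyclically) by its successor in a.
    if all((perm[(i + 1) % n] - perm[i]) % n == 1 for i in range(n)):
        return "circular permutation"
    return "scrambled"


print(classify_order("ABCD", "ABCD"))  # sequential
print(classify_order("ABCD", "CDAB"))  # circular permutation
print(classify_order("ABCD", "ADCB"))  # scrambled
```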
“…few, if any, are able to detect permutations directly.” - Robert B. Russell (2002)
Unique Capabilities:
• alternative alignments allowing sub-domain motif (structural building block) discovery
• permutation detection at all levels of complexity