Eugene Krissinel, Macromolecular Structure Database, keb@ebi.ac.uk
![Page 1: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/1.jpg)
EBI is an Outstation of the European Molecular Biology Laboratory.
Assessment of macromolecular interactions and identification of macromolecular assemblies in crystalline state (PISA software)
Eugene Krissinel
Macromolecular Structure Database
![Page 2: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/2.jpg)
www.pdb.org
www.pdbj.org
www.ebi.ac.uk/msd
http://www.ebi.ac.uk/msd
![Page 3: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/3.jpg)
Macromolecular Assemblies
Complexes of protein, DNA/RNA chains and ligands, stable in native environment
The way the chains assemble represents the [Protein] Quaternary Structure (PQS)
Macromolecular assemblies are often the Biological Units, performing a certain biochemical function
Biological and biochemical significance of macromolecular assemblies is truly immense
![Page 4: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/4.jpg)
Biological Unit Example: Hemoglobin
• Iron-containing oxygen-transport tetrameric protein complex
• Oxygen binds to Fe²⁺ in the heme ligand
• Binding and release of ligands induces structural changes
PDB code: 1a00
![Page 5: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/5.jpg)
Biological Units are Hard to Study
Light / Neutron / X-ray / Small angle scatterings: mainly composition and multimeric state may be found. 3D shape may be guessed from mobility measurements.
Very few quaternary structures have been identified experimentally.
Electron microscopy: not a fantastic resolution and not applicable to all objects
NMR is not good for big chains, even less so for protein assemblies.
![Page 6: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/6.jpg)
Asymmetric Unit vs. Biological Unit
Crystal = translated Unit Cells
More than 80% of macromolecular structures are solved by means of X-ray diffraction on crystals.
It is reasonable to expect that Biological Units are the building blocks of the crystal.
An X-ray diffraction experiment produces atomic coordinates of the Asymmetric Unit, which is stored as a PDB file.
In general, neither the Asymmetric Unit (ASU) nor the Unit Cell has any direct relation to the Biological Unit. A Biological Unit may be made of
• a single ASU
• part of an ASU
• several ASUs
• parts of several ASUs
[Figure labels: PDB file (ASU); Unit Cell = all space symmetry group mates of ASU; Biological Unit]
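The statement that the Unit Cell consists of all space-group symmetry mates of the ASU can be illustrated with a small sketch. It assumes space group P2₁ (unique axis b), whose two operators in fractional coordinates are (x, y, z) and (-x, y+1/2, -z); the names `P21_OPERATORS`, `apply_operator` and `unit_cell` are hypothetical and are not taken from PISA.

```python
from fractions import Fraction as Fr

# Symmetry operators of P2_1 (unique axis b) in fractional coordinates,
# each written as (3x3 rotation matrix, translation vector).
# Assumption: this space group is chosen purely as an example.
P21_OPERATORS = [
    (((1, 0, 0), (0, 1, 0), (0, 0, 1)), (Fr(0), Fr(0), Fr(0))),        # x, y, z
    (((-1, 0, 0), (0, 1, 0), (0, 0, -1)), (Fr(0), Fr(1, 2), Fr(0))),   # -x, y+1/2, -z
]


def apply_operator(op, xyz):
    """Apply one symmetry operator and wrap the result back into the cell [0, 1)."""
    rot, trans = op
    moved = [sum(r * c for r, c in zip(row, xyz)) + t for row, t in zip(rot, trans)]
    return tuple(v % 1 for v in moved)


def unit_cell(asu_atoms, operators):
    """Return one copy of the ASU per symmetry operator (the symmetry mates)."""
    return [[apply_operator(op, atom) for atom in asu_atoms] for op in operators]


# One atom of a made-up ASU, in fractional coordinates.
asu = [(Fr(1, 10), Fr(1, 5), Fr(3, 10))]
for mate in unit_cell(asu, P21_OPERATORS):
    print([tuple(float(v) for v in atom) for atom in mate])
```

Exact fractions are used so that wrapping into the cell does not suffer from rounding; a real application would work with the operators of the actual space group and the cell dimensions of the entry.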
![Page 7: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/7.jpg)
PDB entries and Biological Units
PDB entry 1P30: a monomer?
Biological unit 1P30: homotrimer!
![Page 8: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/8.jpg)
PDB entries and Biological Units
PDB entry 2TBV: a trimer?
Biological Unit 2TBV: 180-mer!
![Page 9: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/9.jpg)
PDB entries and Biological Units
PDB entry 1E94: ?
2 Biological Units in 1E94:
A dodecamer and a hexamer!
![Page 10: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/10.jpg)
Protein Crystallography in Simple Words
[Figure: three numbered stages. 1: in vivo (no image or bad image); 2: crystallisation; 3: in crystal (good image but no associations).]
![Page 11: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/11.jpg)
Significant interfaces?
PQS server @ EBI (Kim Henrick), Trends in Biochem. Sci. (1998) 23, 358
PITA server @ EBI (Hannes Ponstingl), J. Appl. Cryst. (2003) 36, 1116
![Page 12: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/12.jpg)
Significant interfaces or artefacts?
“No single parameter absolutely differentiates the interfaces from all other surface patches”
Jones, S. & Thornton, J.M. (1996) Principles of protein-protein interactions, Proc. Natl. Acad. Sci. USA, 93, 13-20.
Formation of N>2-meric complexes is most probably a collective process involving a set of interfaces. Therefore, the significance of an interface should not be detached from the context of the protein complex.
“…the type of complexes need to be taken into account when characterizing interfaces between them.”
Jones, S. & Thornton, J.M., ibid.
![Page 13: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/13.jpg)
Is there a Measure of Significance?
![Page 14: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/14.jpg)
Chemistry’s say
It is not so much properties of individual interfaces but rather chemical stability of protein complex in general that really matters
Protein chains will most likely associate into largest complexes that are still stable
A protein complex is stable if its Gibbs free energy of dissociation is positive:
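$$\Delta G^0_{\mathrm{diss}} = -\Delta G_{\mathrm{int}} - T\Delta S > 0$$

$$K_d = \frac{\prod_{i=1}^{n} [A_i]}{[A]} = \exp\left(-\frac{\Delta G^0_{\mathrm{diss}}}{RT}\right)$$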
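A short numerical sketch may help connect the sign of the dissociation free energy to the dissociation constant. It uses only the standard thermodynamic relation K = exp(-ΔG⁰/RT); the function names and the default temperature of 298.15 K are assumptions made for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(dg_diss, temperature=298.15):
    """Dissociation constant from the standard free energy of dissociation (kcal/mol)."""
    return math.exp(-dg_diss / (R_KCAL * temperature))


def is_stable(dg_diss):
    """Stability criterion from the slide: positive free energy of dissociation."""
    return dg_diss > 0


for dg in (-5.0, 0.0, 5.0, 20.0):
    print(f"dG_diss = {dg:6.1f} kcal/mol  K_d = {dissociation_constant(dg):.3e}  stable: {is_stable(dg)}")
```

A positive free energy of dissociation gives a dissociation constant below 1, so the assembled state is favoured at standard conditions.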
![Page 15: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/15.jpg)
Detection of Biological Units in Crystals
1. Enumerate all possible assemblies in crystal, subject to crystal structure: space symmetry group, geometry and composition of the Asymmetric Unit (Graph Theory used)
2. Evaluate assemblies for chemical stability
3. Leave only stable assemblies in the list and rank them by their chances of being a Biological Unit:
• Larger assemblies take preference
• Single-assembly solutions take preference
• Otherwise, assemblies with higher $\Delta G_{\mathrm{diss}}$ take preference
Method Summary
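The filtering and ranking in step 3 can be sketched as a sort with a compound key. This is only an illustration, not the PISA implementation: the `Solution` record, the order in which the three preferences are applied, and the candidate list are all assumptions. The 106 and 90 kcal/mol values are those quoted for 1QEX elsewhere in the talk; the other two candidates are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Solution:
    label: str
    assembly_sizes: tuple  # number of chains in each assembly of this solution
    dg_diss: float         # dissociation free energy, kcal/mol (hypothetical input)


def rank_solutions(candidates):
    """Keep stable solutions and order them by the three stated preferences.

    Assumed precedence: larger assemblies first, then solutions made of a
    single assembly, then higher dissociation free energy.
    """
    stable = [s for s in candidates if s.dg_diss > 0]
    return sorted(
        stable,
        key=lambda s: (-max(s.assembly_sizes), len(s.assembly_sizes), -s.dg_diss),
    )


candidates = [
    Solution("trimer", (3,), 90.0),
    Solution("hexamer", (6,), 106.0),
    Solution("dodecamer", (12,), -4.0),          # made up; unstable, filtered out
    Solution("hexamer+trimer", (6, 3), 120.0),   # made up; multi-assembly solution
]
for s in rank_solutions(candidates):
    print(s.label, s.dg_diss)
```

With these inputs the hexamer ranks first, which mirrors why a stable but non-physiological larger assembly can displace the true biological unit.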
![Page 16: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/16.jpg)
Classification of protein complexes
Assembly classification on the benchmark set of 218 structures published in
Ponstingl, H., Kabir, T. and Thornton, J. (2003) Automatic inference of protein quaternary structures from crystals. J. Appl. Cryst. 36, 1116-1122.
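| | 1mer | 2mer | 3mer | 4mer | 6mer | Other | Sum | Correct |
|---|---|---|---|---|---|---|---|---|
| 1mer | 50 | 4 | 0 | 1 | 0 | 0 | 55 | 91% |
| 2mer | 6 | 68+11 | 0 | 2+1 | 0 | 0 | 76+12 | 90% |
| 3mer | 1 | 0 | 22 | 0 | 1 | 0 | 24 | 92% |
| 4mer | 2 | 3 | 0 | 27+6 | 0 | 0 | 32+6 | 87% |
| 6mer | 0 | 0 | 0 | 1 | 10+2 | 0 | 11+2 | 92% |
| Total | | | | | | | 198+20 | 90% |

198+20 <=> 198 homomers and 20 heteromers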
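The "Correct" percentages can be recomputed from the counts in the table. In the sketch below homomer and heteromer counts are added together (for example 68+11 becomes 79); the variable and function names are hypothetical.

```python
SIZES = ["1mer", "2mer", "3mer", "4mer", "6mer", "Other"]

# Rows of the classification table, homomers and heteromers summed.
COUNTS = {
    "1mer": [50, 4, 0, 1, 0, 0],
    "2mer": [6, 79, 0, 3, 0, 0],
    "3mer": [1, 0, 22, 0, 1, 0],
    "4mer": [2, 3, 0, 33, 0, 0],
    "6mer": [0, 0, 0, 1, 12, 0],
}


def row_accuracy(label):
    """Fraction of structures in one row that fall on the diagonal."""
    row = COUNTS[label]
    return row[SIZES.index(label)] / sum(row)


def overall_accuracy():
    """Fraction of all structures that fall on the diagonal."""
    correct = sum(COUNTS[label][SIZES.index(label)] for label in COUNTS)
    total = sum(sum(row) for row in COUNTS.values())
    return correct / total


for label in COUNTS:
    print(label, f"{100 * row_accuracy(label):.0f}%")
print("Total", f"{100 * overall_accuracy():.0f}%")
```

The diagonal holds 196 of the 218 structures, which rounds to the 90% quoted in the table.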
Fitted parameters:
1. Free energy of an H-bond: $E_{hb}$ = 0.51 kcal/mol
2. Free energy of a salt bridge: $E_{sb}$ = 0.21 kcal/mol
3. Constant entropy term: $T \cdot C$ = 11.7 kcal/mol
4. Surface entropy factor: $T \cdot F$ = 0.57·10⁻³ kcal/(mol·Å²)
Classification error in $\Delta G_{\mathrm{diss}}$: ± 5 kcal/mol
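One way to read the ± 5 kcal/mol classification error is that a stability call is unreliable when the dissociation free energy lies within 5 kcal/mol of zero. The sketch below encodes that reading; the three-way labelling and the function name `stability_call` are assumptions and are not stated in the talk. The 37 and 3 kcal/mol inputs are the values quoted for 1TON with and without Zn.

```python
CLASSIFICATION_ERROR = 5.0  # kcal/mol, as quoted on the slide


def stability_call(dg_diss, error=CLASSIFICATION_ERROR):
    """Label an assembly by its dissociation free energy, allowing for the error band."""
    if dg_diss > error:
        return "stable"
    if dg_diss < -error:
        return "unstable"
    return "borderline"


for dg in (37.0, 3.0, -8.0):
    print(dg, stability_call(dg))
```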
![Page 17: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/17.jpg)
Classification of protein-DNA complexes
Assembly classification on the benchmark set of 212 protein – DNA complexes published in
Luscombe, N.M., Austin, S.E., Berman H.M. and Thornton, J.M. (2000) An overview of the structures of protein-DNA complexes. Genome Biol. 1, 1-37.
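Classification error in $\Delta G^0_{\mathrm{diss}}$: ± 5 kcal/mol

| | 2mer | 3mer | 4mer | 5mer | 6mer | 10mer | Other | Sum | Correct |
|---|---|---|---|---|---|---|---|---|---|
| 2mer | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 100% |
| 3mer | 6 | 96 | 0 | 0 | 1 | 0 | 2 | 105 | 91% |
| 4mer | 0 | 2 | 83 | 0 | 0 | 0 | 0 | 85 | 98% |
| 5mer | 0 | 0 | 2 | 3 | 0 | 0 | 0 | 5 | 60% |
| 6mer | 1 | 0 | 0 | 0 | 13 | 0 | 1 | 15 | 87% |
| 10mer | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 100% |
| Total | | | | | | | | 212 | 93% |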
![Page 18: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/18.jpg)
Free energy of misclassifications
[Figure: histogram of the number of misclassifications (axis 0 to 20) against $|\Delta G^0_{\mathrm{diss}}|$ in kcal/mol (axis 0 to 100). Labelled entries: 1qex (6:3), 1cg2 (4:2), 2hex (10:1), 1ton (2:1), 1crx (12:6), 1d3u (8:4), 1hcn (4:2).]
![Page 19: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/19.jpg)
Misclassification of 1QEX
Predicted: homohexamer
Dissociates into 2 trimers, $\Delta G^0_{\mathrm{diss}}$: 106 kcal/mol
Biological unit: homotrimer
Dissociates into 3 monomers, $\Delta G^0_{\mathrm{diss}}$: 90 kcal/mol
BACTERIOPHAGE T4 GENE PRODUCT 9 (GP9), THE TRIGGER OF TAIL CONTRACTION AND THE LONG TAIL FIBERS CONNECTOR
![Page 20: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/20.jpg)
What is 1QEX?
Rossmann M.G., Mesyanzhinov V.V., Arisaka F and Leiman P.G. (2004) The bacteriophage T4 DNA injection machine. Curr. Opinion Struct. Biol. 14:171-180.
BACTERIOPHAGE T4 GENE PRODUCT 9 (GP9), THE TRIGGER OF TAIL CONTRACTION AND THE LONG TAIL FIBERS CONNECTOR
![Page 21: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/21.jpg)
1QEX: nobody is perfect
1QEX hexamer: identified wrongly
1QEX trimer: wrong mainchain tracing!
1S2E trimer: correct mainchain tracing, identified correctly
BACTERIOPHAGE T4 GENE PRODUCT 9 (GP9), THE TRIGGER OF TAIL CONTRACTION AND THE LONG TAIL FIBERS CONNECTOR
![Page 22: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/22.jpg)
Misclassification of 1D3U
TATA-BINDING PROTEIN / TRANSCRIPTION FACTOR
Predicted: octamer
Dissociates into 2 tetramers, $\Delta G^0_{\mathrm{diss}}$: 20 kcal/mol
Functional unit: tetramer
![Page 23: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/23.jpg)
Misclassification of 1CRX
CRE RECOMBINASE / DNA COMPLEX REACTION INTERMEDIATE
Predicted: dodecamer
Dissociates into 2 hexamers, $\Delta G^0_{\mathrm{diss}}$: 28 kcal/mol
Functional unit: trimer
![Page 24: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/24.jpg)
Synaptic complex 1CRX
Guo F., Gopaul D.N. and van Duyne G.D. (1997)
Structure of Cre recombinase complexed with DNA in a site-specific recombination synapse.
Nature 389:40-46.
CRE RECOMBINASE / DNA COMPLEX REACTION INTERMEDIATE
![Page 25: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/25.jpg)
Misclassification of 1TON
TONIN
Predicted: dimer
Dissociates at $\Delta G^0_{\mathrm{diss}}$: 37 kcal/mol
Biological unit: monomer
Apparent dimerization is an artefact due to the presence of Zn²⁺ ions added to the buffer to aid crystallization. Removing Zn from the file results in a $\Delta G^0_{\mathrm{diss}}$ of 3 kcal/mol.
Fujinaga M., James M.N.G. (1987) Rat submaxillary gland serine protease, tonin. Structure solution and refinement at 1.8 Å resolution. J. Mol. Biol. 195:373-396.
![Page 26: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/26.jpg)
Reasons to be imperfect
[Figure labels: 50% / 50%]
• Crystal packing may introduce interactions that are not engaged in native environment
• Definition of Biological Unit is based on physiological function and may be to a certain degree subjective
• Interaction artifacts are often due to the addition of binding agents in order to aid crystallization
• Theoretical models of macromolecular complexation are simplified
• Experimental data are of a limited accuracy
• No explicit account for concentrations, temperature, pH and ionic strength
![Page 27: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/27.jpg)
Web-server PISA@MSD
A new MSD-EBI tool for working with Protein Interfaces, Surfaces and Assemblies
http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html
![Page 28: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/28.jpg)
Web-server PISA@MSD
![Page 29: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/29.jpg)
Web-server PISA@MSD
![Page 30: Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk](https://reader035.fdocuments.net/reader035/viewer/2022062410/568157d3550346895dc559b0/html5/thumbnails/30.jpg)
Conclusions
Chemical-thermodynamical models for protein complex stability allow one to recover biological units from protein crystallography data with an 80-90% success rate.
A considerable part of the misclassifications is due to differences between the experimental and native environments and to artificial interactions induced by crystal packing.
The functional significance of protein interfaces cannot be reliably inferred from their properties alone. Owing to the entropy contribution and entangled interactions, interface function also depends on the composition and geometry of the protein complex.
Protein interface and assembly analysis software (PISA) is available for public use as a web-service from MSD@EBI.
Acknowledgement. This work has been supported by research grant No. 721/B19544 from the Biotechnology and Biological Sciences Research Council (BBSRC) UK.