CZ5226: Advanced Bioinformatics
Lecture 8: Molecular Modeling Method
Prof. Chen Yu Zong
Tel: 6874-6877
Email: [email protected]
http://xin.cz3.nus.edu.sg
Room 07-24, level 7, SOC1, National University of Singapore

References on Modeling of MHC Binding Peptide
• Protein Sci. 2004 Sep;13(9):2523-32
• J Am Chem Soc. 2004 Jul 14;126(27):8515-28
• Proteins. 2004 Feb 15;54(3):534-56
• Hum Immunol. 2003 Dec;64(12):1123-43
• Immunity. 2003 Oct;19(4):595-606
• Mol Med. 2003 Sep-Dec;9(9-12):220-5
• Nature. 2002 Aug 1;418(6897):552-6
• Eur J Immunol. 2002 Aug;32(8):2105-16
• Immunol Cell Biol. 2002 Jun;80(3):286-99
• Ann N Y Acad Sci. 2002 Apr;958:317-20
• Mol Immunol. 2002 May;38(14):1039-49
• J Pept Res. 2002 Mar;59(3):115-22
• Mol Immunol. 2002 Feb;38(9):681-7
• Tissue Antigens. 2002 Feb;59(2):101-12
• J Comput Aided Mol Des. 2001 Jun;15(6):573-86
• J Mol Biol. 2000 Jul 28;300(5):1205-35
• J Comput Aided Mol Des. 2000 Jan;14(1):71-82
• J Comput Aided Mol Des. 2000 Jan;14(1):53-69
• J Mol Graph Model. 1999 Jun-Aug;17(3-4):180-6, 217

What is Docking?
• Given two molecules, find their correct association:
Receptor + Ligand = Complex

General Protein–Ligand Binding
• Ligand
- Molecule that binds with a protein
- DNA, drug lead compounds, etc.
• Protein active site(s)
- Allosteric binding
- Competitive binding
• Function of binding interaction
- Natural and artificial

What is Protein-Ligand Docking?
• Definition: Computationally predict the structure of a protein-ligand complex by searching over the conformations and orientations of the two molecules. The orientation that maximizes the interaction is taken as the most likely structure of the complex.
• Importance of complexes
- structure -> function

Example: HIV-1 Protease
Active site (aspartyl groups)

Docking Strategy
PDB files -> Surface Representation -> Patch Detection -> Matching Patches -> Scoring & Filtering -> Candidate complexes

Issues Involved in Docking
• Protein Structure and Active Site
- assumed knowledge (PDBs, etc.)
- PROCAT database: 3D enzyme active site templates
• Ligand Structure
- pharmacophore (base fragment) in potential drug compound
- well known groups
• Rigid vs. Flexible
- solution or vacuum
- structure

Algorithmic Approaches to Docking
• Qualitative
– Geometric
– shape complementarity and fitting
• Quantitative
– Energy Calculations
– determine global minimum energy
– free energy measure
• Hybrid
– Geometric and energy complementarity
– 2 phase process: soft and hard docking

Design of HIV-1 Protease Inhibitor

Scoring in Ligand-Protein Docking
Potential Energy Description:
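The energy expression itself does not appear in the text above. As an illustration only, the sketch below assumes a common molecular mechanics form for the non-bonded part of such a score: a Lennard-Jones 12-6 term plus a Coulomb term, summed over all ligand-receptor atom pairs. The function name, the single shared `epsilon` and `sigma`, and the atom tuples are assumptions made for this example, not values from the lecture.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal*Å/(mol*e^2); converts q1*q2/r to kcal/mol


def interaction_energy(ligand, receptor, epsilon=0.1, sigma=3.5):
    """Sum Lennard-Jones 12-6 and Coulomb terms over all ligand-receptor pairs.

    Each atom is a tuple (x, y, z, charge). epsilon (kcal/mol) and sigma (Å)
    are illustrative values shared by every pair; real force fields use
    parameters that depend on the atom types.
    """
    total = 0.0
    for lx, ly, lz, lq in ligand:
        for rx, ry, rz, rq in receptor:
            r = math.dist((lx, ly, lz), (rx, ry, rz))
            sr6 = (sigma / r) ** 6
            lennard_jones = 4.0 * epsilon * (sr6 * sr6 - sr6)
            coulomb = COULOMB_CONSTANT * lq * rq / r
            total += lennard_jones + coulomb
    return total
```

With this form, two uncharged atoms separated by `sigma * 2 ** (1 / 6)` sit at the bottom of the Lennard-Jones well, so their energy is `-epsilon`.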

Preprocessing
• Determine internal representation
- convert coordinates of both molecules from PDB files
- e.g. Michael Connolly’s MS program (www.biohedron.com)
- dot surface
- AutoGrid
- 3d grid (array) with discrete values
- often used in rigid docking
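A 3D grid with discrete values can be sketched as follows. This is a simplified occupancy grid, assuming each atom is a sphere given as (x, y, z, radius); it is a stand-in for the idea of a precomputed grid and not the map format that AutoGrid actually writes. All names are hypothetical.

```python
import math


def occupancy_grid(atoms, origin, spacing, shape):
    """Return a nested list with 1 where a grid point lies inside any atom sphere, else 0.

    atoms   : list of (x, y, z, radius) in Å
    origin  : (x, y, z) of grid point [0][0][0]
    spacing : distance between neighbouring grid points in Å
    shape   : (nx, ny, nz) number of grid points along each axis
    """
    nx, ny, nz = shape
    grid = [[[0] * nz for _ in range(ny)] for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing)
                if any(math.dist(point, (x, y, z)) <= radius
                       for x, y, z, radius in atoms):
                    grid[i][j][k] = 1
    return grid
```

Once the receptor grid is built, checking whether a rigid ligand pose clashes with the receptor becomes a table lookup per ligand atom, which is one reason grids suit rigid docking.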

Some techniques
• Surface representation that efficiently represents the docking surface and identifies the regions of interest (cavities and protrusions)
- Connolly surface
- Lenhoff technique
- Kuntz et al. Clustered-Spheres
- Alpha shapes
• Surface matching that matches surfaces to optimize a binding score
- Geometric Hashing

Surface Representation
• Dense MS surface (Connolly)
• Sparse surface (Shuo Lin et al.)

Surface Representation
• Each atomic sphere is given the van der Waals radius of the atom
• Rolling a probe sphere over the van der Waals surface traces out the Connolly surface: contact patches where the probe touches a single atom, and solvent reentrant patches where it touches two or more atoms at once
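The rolling-probe idea can be sketched with two small checks. The code assumes atoms given as (x, y, z, radius) and a probe radius of 1.4 Å (a common choice for water, used here as an assumption); the function names are hypothetical. A probe position is allowed when the probe overlaps no atom, and the number of atoms it touches tells whether it sits over a contact patch (one atom) or a reentrant patch (two or more).

```python
import math


def probe_fits(probe_center, atoms, probe_radius=1.4, tolerance=1e-6):
    """True if a probe sphere at probe_center overlaps no van der Waals sphere."""
    return all(
        math.dist(probe_center, (x, y, z)) >= radius + probe_radius - tolerance
        for x, y, z, radius in atoms
    )


def touching_atoms(probe_center, atoms, probe_radius=1.4, tolerance=1e-6):
    """Indices of the atoms whose van der Waals sphere the probe is in contact with."""
    return [
        index
        for index, (x, y, z, radius) in enumerate(atoms)
        if abs(math.dist(probe_center, (x, y, z)) - (radius + probe_radius)) <= tolerance
    ]
```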

Lenhoff technique
• Computes a “complementary” surface for the receptor instead of the Connolly surface, i.e. computes possible positions for the atom centers of the ligand
[Figure labels: atom centers of the ligand; van der Waals surface]

Kuntz et al. Clustered-Spheres
• Uses clustered spheres to identify cavities on the receptor and protrusions on the ligand
• Computes a sphere for every pair of surface points, i and j, with the sphere center on the normal from point i
• Regions where many spheres overlap are either cavities (on the receptor) or protrusions (on the ligand)
[Figure labels: surface points i and j]
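The sphere for one pair of surface points can be sketched as below. The sketch assumes the sphere touches the surface at point i, has its center on the unit normal at i, and also passes through point j. With d = p_j - p_i and unit normal n, requiring |p_i + r n - p_j| = r gives r = |d|^2 / (2 d·n). The function name is hypothetical.

```python
def sphere_through_pair(point_i, normal_i, point_j):
    """Sphere centered on the normal at point_i that passes through point_i and point_j.

    Returns (center, radius), or None when point_j does not lie on the side
    the normal points to, in which case no such sphere exists on that side.
    normal_i must be a unit vector.
    """
    d = [b - a for a, b in zip(point_i, point_j)]
    d_dot_n = sum(dc * nc for dc, nc in zip(d, normal_i))
    if d_dot_n <= 0.0:
        return None
    radius = sum(dc * dc for dc in d) / (2.0 * d_dot_n)
    center = tuple(p + radius * n for p, n in zip(point_i, normal_i))
    return center, radius
```

Running this over all pairs and keeping, for example, the smallest sphere per point yields the set of spheres whose overlaps mark cavities or protrusions; that selection rule is an assumption and is not stated above.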

Alpha Shapes
• Formalizes the idea of “shape”
• In 2D, an “edge” between two points is “alpha-exposed” if there exists a circle of radius alpha such that the two points lie on the boundary of the circle and the circle contains no other points from the point set
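The 2D definition translates directly into a test. For two points a distance c apart, the only circles of radius alpha through both have centers on the perpendicular bisector, at distance sqrt(alpha^2 - (c/2)^2) from the midpoint, one on each side. The sketch below (function name hypothetical) checks whether either circle is empty of other points.

```python
import math


def is_alpha_exposed(p, q, points, alpha):
    """True if some circle of radius alpha through p and q has no other point strictly inside."""
    chord = math.dist(p, q)
    if chord == 0.0 or chord > 2.0 * alpha:
        return False  # no circle of this radius passes through both points
    mid = ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)
    # Unit vector perpendicular to the segment from p to q.
    perp = (-(q[1] - p[1]) / chord, (q[0] - p[0]) / chord)
    offset = math.sqrt(alpha * alpha - (chord / 2.0) ** 2)
    others = [r for r in points if r != p and r != q]
    for sign in (1.0, -1.0):
        center = (mid[0] + sign * offset * perp[0],
                  mid[1] + sign * offset * perp[1])
        if all(math.dist(center, r) >= alpha for r in others):
            return True
    return False
```

As alpha grows toward infinity the circles flatten into half-planes, so the alpha-exposed edges become the edges of the convex hull.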

Alpha Shapes: Example
[Figure panels: Alpha = infinity; Alpha = 3.0 Å]

Surface Matching
• Find the transformation (rotation + translation) that will maximize the number of matching surface points from the receptor and the ligand
First satisfy steric constraints…
• Find the best fit of the receptor and ligand using only geometrical constraints
… then use energy calculations to refine the docking
• Select the fit that has the minimum energy
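The geometric phase can be sketched as a search over candidate transformations, scored by the number of matching surface points. To stay short, the sketch restricts rotations to the z axis and takes the candidate list as given; a real search covers all 3D rotations and generates candidates from the surfaces themselves. Function names and the 0.5 Å tolerance are assumptions.

```python
import math


def transform(points, angle, translation):
    """Rotate points about the z axis by angle (radians), then translate."""
    c, s = math.cos(angle), math.sin(angle)
    tx, ty, tz = translation
    return [(c * x - s * y + tx, s * x + c * y + ty, z + tz) for x, y, z in points]


def count_matches(receptor_points, ligand_points, tolerance=0.5):
    """Number of ligand surface points within tolerance (Å) of some receptor surface point."""
    return sum(
        1
        for lp in ligand_points
        if any(math.dist(lp, rp) <= tolerance for rp in receptor_points)
    )


def best_transformation(receptor_points, ligand_points, candidates, tolerance=0.5):
    """Return the (angle, translation) candidate that maximizes the match count."""
    return max(
        candidates,
        key=lambda cand: count_matches(
            receptor_points, transform(ligand_points, cand[0], cand[1]), tolerance
        ),
    )
```

The energy refinement phase would then re-rank the best geometric fits by an energy function and keep the minimum.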

Docking Programs
More information in: http://www.bmm.icnet.uk/~smithgr/soft.html
The programs are:
• DOCK (I. D. Kuntz, UCSF)
• AutoDock (Arthur Olson, The Scripps Research Institute)
• RosettaDOCK (Baker, Univ. of Washington; Gray, Johns Hopkins Univ.)
• INVDOCK (Y. Z. Chen, NUS)

DOCK as an Example
DOCK works in 5 steps:
• Step 1 Start with crystal coordinates of target receptor
• Step 2 Generate molecular surface for receptor
• Step 3 Generate spheres to fill the active site of the receptor: the spheres become potential locations for ligand atoms
• Step 4 Matching: sphere centers are then matched to the ligand atoms, to determine possible orientations for the ligand
• Step 5 Scoring: find the top scoring orientation

DOCK as an Example
[Figure panels 1, 2, 3]
- HIV-1 protease is the target receptor
- Aspartyl groups are its active site

DOCK as an Example
[Figure panels 4, 5]
• Three scoring schemes: shape scoring, electrostatic scoring and force-field scoring
• Image 5 is a comparison of the top scoring orientation of the molecule thioketal with the orientation found in the crystal structure

The DOCK Algorithm
Two steps in rigid ligand mode:
• Orienting the putative ligand in the site
- Guided by matching the distances between pre-defined site points on the target to interatomic distances of the ligand.
- The RT (rotation-translation) matrix is used to transform the ligand.
• Scoring the resulting orientation
- Each orientation is scored for quality of fit.
- The process is repeated for a user-defined number of orientations or until the maximum number of orientations is reached.

[Figure: ligand structure (atom labels N, NH, SO, F) and binding site points]
1. Define the target binding site points.
2. Match the distances.
3. Calculate the transformation matrix for the orientation.
4. Dock the molecule.
5. Score the fit.
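Steps 1 and 2 can be sketched as a brute-force distance match: a triple of ligand atoms is paired with a triple of site points when all three pairwise distances agree within a tolerance. This is a simplified stand-in written for illustration, not DOCK's actual matching code; the function name and the 0.5 Å tolerance are assumptions.

```python
import math
from itertools import combinations, permutations


def match_by_distances(site_points, ligand_atoms, tolerance=0.5):
    """Pair triples of ligand atoms with triples of site points by pairwise distance.

    Returns a list of (site_indices, ligand_indices); position k of the
    ligand triple is matched to position k of the site triple.
    """
    matches = []
    for site_triple in combinations(range(len(site_points)), 3):
        site_d = [math.dist(site_points[a], site_points[b])
                  for a, b in combinations(site_triple, 2)]
        for ligand_triple in permutations(range(len(ligand_atoms)), 3):
            ligand_d = [math.dist(ligand_atoms[a], ligand_atoms[b])
                        for a, b in combinations(ligand_triple, 2)]
            if all(abs(sd - ld) <= tolerance for sd, ld in zip(site_d, ligand_d)):
                matches.append((site_triple, ligand_triple))
    return matches
```

Each accepted match supplies three point pairs, which is enough to determine the rotation and translation of step 3, for example by a least-squares superposition (not shown here).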

Site Points Generation in DOCK
• Program SPHGEN identifies the active site, and other sites of interest.
• Each invagination is characterized by a set of overlapping spheres.
• For receptors, a negative image of the surface invaginations is created;
• For a ligand, the program creates a positive image of the entire molecule.

The Matching
Can be directed by 2 additional features:
• Chemical matching - labeling the site points such that only particular atom types are allowed to be matched to them.
• Critical cluster - subsets of interest can be defined as critical clusters, so that at least one member of them will be part of any accepted ligand “match”.
Increase in efficiency and speed due to elimination of potentially less promising orientations!
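Both features act as filters on candidate matches. A minimal sketch, assuming a match is a list of (site point index, ligand atom index) pairs and that atom types are plain strings; the type names used in the test and the data layout are hypothetical.

```python
def accept_match(match, site_labels, ligand_types, critical_cluster):
    """Apply chemical matching and a critical-cluster filter to one candidate match.

    match            : list of (site_index, ligand_index) pairs
    site_labels      : dict site_index -> set of allowed atom types
                       (a site point absent from the dict accepts any type)
    ligand_types     : list of atom type strings, indexed by ligand atom
    critical_cluster : set of site indices, at least one of which must be used
                       (an empty set disables this filter)
    """
    for site_index, ligand_index in match:
        allowed = site_labels.get(site_index)
        if allowed is not None and ligand_types[ligand_index] not in allowed:
            return False  # chemical mismatch at this site point
    if critical_cluster and not any(site in critical_cluster for site, _ in match):
        return False  # the match uses no site point from the critical cluster
    return True
```

Rejecting a match at this stage avoids computing and scoring its orientation, which is where the gain in speed comes from.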

Other Docking programs
AutoDock
– AutoDock was designed to dock flexible ligands into receptor binding sites
– The strongest feature of AutoDock is the range of powerful optimization algorithms available
RosettaDOCK
– It models physical forces and creates a very large number of decoys
– It uses degeneracy after clustering as a final criterion in decoy selection
INVDOCK
– Docking strategy and algorithm similar to DOCK, but with the capability of finding the receptors to which a molecule can bind.

Conformational Ensembles Docking
Observations:
1. Generating an orientation of a ligand in a binding site may be separated from calculating a conformation of the ligand in that particular orientation.
2. Multiple conformations of a given ligand usually have some portion in common (internally rigid atoms such as ring systems), and therefore, contain redundancies.

Conformational Ensemble Docking
• Conformational ensembles are generated by overlaying all conformations of a given molecule onto its largest rigid fragment.
• Only atoms within this largest rigid fragment are used during the distance matching step. The RT matrix is defined.
• Each of the conformers is oriented into the site and scored. The score measures steric and electrostatic complementarity.
• One matching step: all the conformers are docked and scored in the selected orientation.
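The ensemble idea can be sketched as follows. Because all conformers are overlaid on the same rigid fragment, one orientation computed from that fragment applies to every conformer. To keep the sketch short, the orientation is reduced to a pure translation that places the fragment centroid on a site point, and the score is a toy steric count; both are assumptions standing in for the RT matrix and the real scoring function. All names are hypothetical.

```python
import math


def translate(coords, shift):
    """Shift every coordinate by the same vector."""
    return [(x + shift[0], y + shift[1], z + shift[2]) for x, y, z in coords]


def contact_score(ligand_coords, receptor_coords, contact=4.0, clash=2.0):
    """Toy steric score: +1 per ligand-receptor pair in contact, -10 per clash."""
    score = 0
    for lp in ligand_coords:
        for rp in receptor_coords:
            d = math.dist(lp, rp)
            if d < clash:
                score -= 10
            elif d <= contact:
                score += 1
    return score


def dock_ensemble(conformers, rigid_indices, site_point, receptor_coords):
    """Orient the shared rigid fragment once, then score every conformer.

    conformers    : list of coordinate lists, already overlaid on the rigid fragment
    rigid_indices : indices of the rigid fragment atoms (same in every conformer)
    Returns (best_score, conformer_index).
    """
    reference = conformers[0]
    centroid = tuple(
        sum(reference[i][axis] for i in rigid_indices) / len(rigid_indices)
        for axis in range(3)
    )
    shift = tuple(s - c for s, c in zip(site_point, centroid))  # computed once
    scored = [
        (contact_score(translate(conf, shift), receptor_coords), index)
        for index, conf in enumerate(conformers)
    ]
    return max(scored)
```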

Overview of the Ligand Ensemble Method

Advantages of Conformational Ensemble Docking
Speed increase due to:
• One matching step for all the conformers.
• The largest rigid fragment usually has fewer atoms than the whole molecule, so fewer potential matches are examined.

Disadvantages of Conformational Ensemble Docking
• Loss of information when the orientations are guided only by a subset of the atoms in the molecule. Orientations may be missed because potential distance matches from non-rigid portions of the molecule are not considered.
• The ensemble method will fail for ligands that lack internally rigid atoms.
• The use of chemical matching and critical clusters is limited.

Results of Docking Studies
The docked (blue) and crystal (yellow) structure of ligands in some PDB ligand-protein complexes. The PDB Id of each structure is shown.

Dataset and Testing Results
Protein-protein cases from protein-protein docking benchmark [6]:
- Enzyme-inhibitor – 22 cases
- Antibody-antigen – 16 cases
Protein-DNA docking: 2 unbound-bound cases
Protein-drug docking: tens of bound cases (estrogen receptor, HIV protease, COX)
Performance: several minutes for large protein molecules and seconds for small drug molecules on a standard PC.
[Figure: Endonuclease I-PpoI (1EVX) with DNA (1A73). RMSD 0.87 Å, rank 2. Labels: DNA, endonuclease, docking solution]
[Figure: Estrogen receptor with estradiol (1A52). RMSD 0.9 Å, rank 1, running time: 11 seconds. Labels: estrogen receptor, estradiol molecule from complex, docking solution]
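The RMSD figures quoted for docking solutions measure how far the docked ligand lies from its position in the crystal complex. A minimal sketch, assuming both coordinate lists contain the same atoms in the same order and that no further superposition is applied; which atoms the reported values were computed on is not stated here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))
```

The result is in the same unit as the input coordinates, so coordinates in Å give an RMSD in Å.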

Results Enzyme-Inhibitor docking
PDB | Complex description (receptor/ligand) | Pen. res. (1) | Geom score rmsd | Geom score rank | Time (min.) | ACE score rmsd | ACE score rank
1ACB | α-chymotrypsin/Eglin C | 0,2 | 2.0 | 41 | 9:37 | 1.8 | 55
1AVW | Trypsin/Soybean Trypsin inhibitor | 3,4 | 1.9 | 913 | 11:27 | 1.9 | 319
1BRC | Trypsin/APPI | 0,2 | 5.0 | 528 | 5:20 | 5.6 | 66
1BRS | Barnase/Barstar | 1,3 | 3.5 | 115 | 5:18 | 2.7 | 7
1CGI | α-chymotrypsinogen/trypsin inhibitor | 4,2 | 2.4 | 114 | 6:26 | 3.0 | 10
1CHO | α-chymotrypsin/ovomucoid 3rd Domain | 0,3 | 3.4 | 148 | 5:35 | 1.2 | 26
1CSE | Subtilisin Carlsberg/Eglin C | 0,2 | 3.8 | 166 | 6:58 | 2.3 | 540
1DFJ | Ribonuclease inhibitor/Ribonuclease A | 12,8 | 3.9 | 1446 | 11:58 | 11.9 | 612
1FSS | Acetylcholinesterase/Fasciculin II | 8,3 | 2.5 | 296 | 11:42 | 2.3 | 46
1MAH | Mouse Acetylcholinesterase/inhibitor | 2,5 | 2.5 | 436 | 14:39 | 2.3 | 57
1PPE* | Trypsin/CMT-1 | 0,0 | 2.0 | 1 | 2:34 | 2.0 | 1
1STF* | Papain/Stefin B | 0,0 | 2.2 | 4 | 8:15 | 2.1 | 13
1TAB* | Trypsin/BBI | 0,1 | 1.4 | 96 | 3:41 | 7.2* | 104
1TGS | Trypsinogen/trypsin inhibitor | 5,4 | 2.2 | 345 | 5:19 | 3.6 | 101
1UDI* | Virus Uracil-DNA glycosylase/inhibitor | 4,2 | 2.6 | 3 | 7:40 | 2.4 | 1
1UGH | Human Uracil-DNA glycosylase/inhibitor | 8,3 | 2.1 | 12 | 5:45 | 3.8 | 5
2KAI | Kallikrein A/Trypsin inhibitor | 10,7 | 4.2 | 126 | 7:15 | 4.7 | 42
2PTC | β-trypsin/Pancreatic trypsin inhibitor | 2,4 | 4.4 | 66 | 5:13 | 3.4 | 12
2SIC | Subtilisin BPN/Subtilisin inhibitor | 5,3 | 2.5 | 129 | 9:41 | 4.7 | 21
2SNI | Subtilisin Novo/Chymotrypsin inhibitor 2 | 6,7 | 8.3 | 1241 | 5:08 | 7.3 | 450
2TEC* | Thermitase/Eglin C | 0,1 | 3.0 | 66 | 7:58 | 1.4 | 29
4HTC* | α-Thrombin/Hirudin | 2,2 | 3.3 | 2 | 3:36 | 2.8 | 2
(1) Number of highly penetrating residues in unbound structures superimposed on the complex

Results Antibody-Antigen docking
PDB | Complex description (receptor/ligand) | Pen. res. (1) | Geom score rmsd | Geom score rank | Time (min.) | ACE score rmsd | ACE score rank
1AHW | Antibody Fab 5G9/Tissue factor | 3,3 | 2.5 | 29 | 10:12 | 2.5 | 10
1BQL* | Hyhel-5 Fab/Lysozyme | 0,0 | 2.5 | 13 | 6:21 | 1.4 | 7
1BVK | Antibody Hulys11 Fv/Lysozyme | 0,0 | 3.8 | 1301 | 6:25 | 3.5 | 809
1DQJ | Hyhel-63 Fab/Lysozyme | 18,7 | 4.3 | 773 | 5:30 | 5.1 | 953
1EO8* | Bh151 Fab/Hemagglutinin | 3,1 | 1.8 | 567 | 9:45 | 1.6 | 292
1FBI* | IgG1 Fab fragment/Lysozyme | 2,5 | 5.0 | 536 | 10:13 | 5.0 | 2416
1IAI* | IgG1 Idiotypic Fab/Igg2A Anti-Idiotypic Fab | 5,6 | 4.8 | 1302 | 9:13 | 3.4 | 1304
1JHL* | IgG1 Fv Fragment/Lysozyme | 0,0 | 1.6 | 282 | 13:15 | 1.3 | 143
1MEL* | Vh Single-Domain Antibody/Lysozyme | 0,1 | 1.8 | 3 | 2:40 | 2.0 | 2
1MLC | IgG1 D44.1 Fab fragment/Lysozyme | 8,3 | 4.0 | 136 | 5:29 | 2.6 | 123
1NCA* | Fab NC41/Neuraminidase | 0,0 | 2.6 | 114 | 17:50 | 2.8 | 66
1NMB* | Fab NC10/Neuraminidase | 0,0 | 2.7 | 2593 | 28:10 | 2.4 | 1734
1QFU* | Igg1-k Fab/Hemagglutinin | 0,0 | 2.7 | 44 | 5:42 | 2.7 | 23
1WEJ | IgG1 E8 Fab fragment/Cytochrome C | 0,0 | 4.3 | 232 | 7:44 | 2.6 | 87
2JEL* | Jel42 Fab Fragment/A06 Phosphotransferase | 0,2 | 4.7 | 114 | 5:02 | 4.7 | 50
2VIR* | Igg1-lambda Fab/Hemagglutinin | 0,0 | 3.1 | 258 | 7:34 | 3.5 | 306
(1) Number of highly penetrating residues in unbound structures superimposed on the complex

Quality of INVDOCK Algorithm (Proteins. 1999; 36:1)
Molecule | Docked Protein | PDB Id | RMSD | Description of Docking Quality | Energy (kcal/mol)
Indinavir | HIV-1 Protease | 1hsg | 1.38 | Match | -70.25
Xk263 Of Dupont Merck | HIV-1 Protease | 1hvr | 2.05 | Match | -58.07
Vac | HIV-1 Protease | 4phv | 0.80 | Match | -88.46
Folate | Dihydrofolate Reductase | 1dhf | 6.55 | One end match, the other in different orientation | -46.02
5-Deazafolate | Dihydrofolate Reductase | 2dhf | 1.48 | Match | -65.49
Estrogen | Estrogen Receptor | 1a52 | 1.30 | Match | -45.86
4-Hydroxytamoxifen | Estrogen Receptor | 3ert | 5.45 | Complete overlap, flipped along short axis | -55.15
Guanosine-5'-[B,G-Methylene] Triphosphate | H-Ras P21 | 121p | 0.94 | Match | -80.20
Glycyl-*L-Tyrosine | Carboxypeptidase A | 3cpa | 3.56 | Overlap, flipped along short axis | -40.63

Identification of the N-terminal peptide binding site of GRP94
GRP94 - Glucose regulated protein 94
VSV8 peptide - derived from vesicular stomatitis virus
Gidalevitz T, Biswas C, Ding H, Schneidman-Duhovny D, Wolfson HJ, Stevens F, Radford S, Argon Y. J Biol Chem. 2004

Biological motivation
The complex between the two molecules strongly stimulates the T-cell response of the immune system. The grp94 protein alone does not have this property. The immunostimulatory activity is due to the ability of grp94 to bind different peptides, so characterizing the peptide binding site is important.

GRP94 molecule
No experimental structure of the grp94 protein was available. Homology modeling was used to predict a structure from another protein with 52% sequence identity.
The structure of grp94 has since been published. The RMSD between the crystal structure and the model is 1.3 Å.

Docking
PatchDock was applied to dock the two molecules, without any binding site constraints. Docking results were clustered in the two cavities:

GRP94 molecule
There is a binding site for inhibitors between the helices. There is another cavity, formed by a beta sheet, on the opposite side.