Homology Modeling - BioJuncture · Homology modeling - applications Structure-based assessment of...
Homology Modeling
Roberto Lins, EPFL - summer semester 2005
Disclaimer: course material is mainly taken from: P.E. Bourne & H. Weissig, Structural Bioinformatics; C.A. Orengo, D.T. Jones & J.M. Thornton, Bioinformatics: Genes, Proteins and Computers; A.M. Lesk, Introduction to Bioinformatics; A.D. Baxevanis & B.F. Ouellette, Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins; several online materials (George Washington University, University of Houston, Tel-Aviv University) and resources (RCSB, NCBI, SWISS-PROT), as well as personal research data.
Functional Genomics
[Figure: the levels Genome, Expressome, Proteome and Metabolome, each paired with an algorithm and a database; labeled "TERTIARY STRUCTURE (fold)".]
Limitations of Experimental Methods
Annotated proteins in the databank: ~100,000
Total number including ORFs: ~700,000
Proteins with known structure: ~5,000!
An ORF, or Open Reading Frame, is a region of a genome that codes for a protein.
ORFs have been identified by whole-genome sequencing efforts. ORFs with no known function are termed orphans.
Dataset for analysis
Structural Biology Consortia: Brute Force Approach Towards Structure Elucidation
Employment of an army of Ph.D.s and postdocs
Aim to solve about 400 structures a year
Large-scale expression & crystallization attempts
+ Enhances the statistical base for inferring sequence-structure relationships
– Basic strategies remain the same
– No (known) new tricks
* “Unrelenting” ones will be ignored
Can we predict structure from sequence?
GCTCCTCACTGTCTGTGTTTATTCTTTTAGCTTCTTCAGATCTTTTAGTCTGAGGAAGCCTGGCATGTGCAAATGAAGTTAACCTAA...
Comparative Modeling (Homology Modeling)
Basis
Structure is much more conserved than sequence during evolution
The higher the similarity, the higher the confidence in the modeled structure
Limited applicability
A large number of proteins and ORFs have no similarity to proteins with known structure
What’s homology modeling?
It predicts the three-dimensional structure of a given protein sequence (target) based on an alignment to one or more known protein structures (templates).
If similarity between the target sequence and the template sequence is detected, structural similarity can be assumed.
In general, at least 30% sequence identity is required to generate a useful model.
It can be used to understand function, activity, specificity, etc.
It is of interest to drug companies wishing to do structure-aided drug design.
A keystone of structural proteomics
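The 30% rule of thumb can be checked on an existing pairwise alignment. The sketch below is only an illustration: the function name `percent_identity`, the choice of dividing by the number of gap-free columns (other denominators are also in use), and the toy sequences are all assumptions, not part of the course material.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent of gap-free alignment columns that hold identical residues."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # columns containing a gap are not counted
        aligned += 1
        if a == b:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


# Hypothetical, already-aligned toy sequences ("-" marks a gap).
target = "MKT-AYIAKQR"
template = "MKSLAYI-KQK"
identity = percent_identity(target, template)
print(f"{identity:.1f}% identity; meets the 30% rule of thumb: {identity >= 30.0}")
```

Real targets and templates would first be aligned with a proper alignment program; this function only post-processes the result.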
Homology modeling - applications
Structure-based assessment of target drugability
Structure-guided design of mutagenesis experiments
Tool compound design for probing biological function
Homology model based ligand design
Design of in vitro test assays
Structure-based prediction of drug metabolism and toxicity
Accuracy and application of protein structure
Does sequence similarity imply structure similarity?
Twilight zone
Safe zone (thanks to evolution!)
[Figure: RMSD of backbone atoms (Å), from 0.0 to 2.5, plotted against % identical residues in core, from 100 down to 0. Chothia & Lesk, 1986]
$$\mathrm{RMSD} = \sqrt{\frac{1}{N_{\mathrm{atoms}}} \sum_{i=1}^{N_{\mathrm{atoms}}} d_i^{2}}$$
N_atoms = total number of atoms; d_i = distance between the coordinates of an atom i at t0 and tn, when the structures are superimposed.
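The RMSD definition above can be made concrete with a few lines of code. This is a minimal sketch that assumes the two structures are already superimposed and that atoms are given in the same order; the function name `rmsd` and the toy coordinates are illustrative assumptions.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two superimposed coordinate sets.

    Each argument is a list of (x, y, z) tuples, with atom i of one set
    corresponding to atom i of the other.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate sets of equal size")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # d_i squared: squared distance between the two positions of atom i
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: every atom is displaced by 1 Å along z.
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rmsd(model, reference))
```

The optimal superposition itself (rotation and translation) is a separate step that this sketch deliberately leaves out.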
My target sequence has over 30% sequence identity with a known protein structure, so I want to generate a 3D model.
What do I have to do?
Structure prediction by homology modeling
Homology modeling makes two fundamental assumptions
– The structure of a protein is determined by its primary amino acid sequence (Anfinsen).
– During evolution, the structure of a protein has changed much more slowly than its sequence.
• Similar sequences adopt practically identical structures and distantly related sequences fold into similar structures.
In summary: homology modeling steps
1) Template recognition & initial alignment
2) Alignment correction
3) Backbone generation
4) Loop modeling
5) Side-chain modeling
6) Model optimization
7) Model validation
Template recognition & initial alignment
Select the best template from a library of known protein structures derived from the PDB
Templates can be found by using the target sequence as a query in a FASTA or BLAST search
Gaining confidence in template searching
Once a suitable template is found, a literature search on the relevant fold can determine what biological role it plays
Does this match the biological/biochemical function that you expect?
Ligand(s) present?
Resolution of the template
Family of Proteins
Multiple templates?
Further Considerations:
Proteins are homologous if they are related by divergence from a common ancestor
[Figure: tree with a duplication event giving paralogues (function may be related or very different!) and a speciation event into species 1 and species 2 giving orthologues (function more likely to be conserved).]
In summary: there are two types of homologs
- Orthologs: proteins that carry out the same function in different species
- Paralogs: proteins that perform different, but related functions within one organism
Alignment of the target onto the template
A correct alignment is necessary to create the most probable 3D structure of the target
If the sequences are aligned incorrectly, the result will be false positives or false negatives
Important to consider:
- algorithms
- scoring alignments
- gap penalties
Identify SCRs (Structurally Conserved Regions) and SVRs (Structurally Variable Regions)
Alignment Outcome
The (true) alignment indicates the evolutionary process giving rise to the different sequences, starting from the same ancestor sequence and then changing through mutations (insertions, deletions, and substitutions)
Alignment vs. databases
Task: given a query sequence and millions of database records, find the optimal alignment between the query and a record
AGTCTCCAGTTATGCCA…
Alignment vs. databases
Tool: given two sequences, there exists an algorithm to find the best alignment.
Naïve solution: apply the algorithm to each of the records, one by one.
Problem: an exact algorithm is just too slow to run millions of times (even a linear-time algorithm will run slowly on a huge database).
Solution:
- run in parallel (expensive)
- use a fast (heuristic) method to discard irrelevant records and then apply the exact algorithm to the remaining few
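The "filter first, align exactly afterwards" idea can be sketched as follows. This is a toy illustration and not how BLAST or FastA are actually implemented: the shared k-mer filter, the threshold `min_shared`, the match/mismatch/gap values and the record names are all assumptions. The exact step is a Smith-Waterman local alignment score with a linear gap cost.

```python
def kmers(seq, k=3):
    """Set of all overlapping words of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (exact dynamic programming, linear gap cost)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def search(query, database, k=3, min_shared=2):
    """Heuristic k-mer filter first, exact scoring on the survivors only."""
    query_kmers = kmers(query, k)
    hits = []
    for name, seq in database.items():
        if len(query_kmers & kmers(seq, k)) < min_shared:
            continue  # discarded cheaply; the exact algorithm never runs on it
        hits.append((smith_waterman_score(query, seq), name))
    return sorted(hits, reverse=True)


# Hypothetical three-record database.
database = {
    "rec1": "AGTCTCCAGTTATGCCA",
    "rec2": "TTTTTTTTTTTTTTTT",
    "rec3": "CCAGTTATG",
}
print(search("TCCAGTTATGC", database))
```

The price of the speed-up is sensitivity: a record that is truly related but shares fewer than `min_shared` words with the query is silently dropped.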
Sequence alignment algorithms
Used to calculate a similarity score to infer sequence homology between two sequences
Examples: the two most used in homology modeling are:
BLAST: the general strategy is to optimise the maximal segment pair (MSP) score - BLAST computes similarity, not alignment (Altschul, S. F., Gish, W., Miller, W., Myers, E. W., Lipman, D. J., J. Mol. Biol. (1990) 215:403-410)
FastA (local alignment): searches for both full and partial sequence matches, i.e., local similarity is obtained; more sensitive than BLAST, but slower; many gaps may represent a problem (Pearson, W. R., Lipman, D. J., P.N.A.S. (1988) 85:2444-2448).
Sequence alignment outputs
[Figure: example FastA and BLAST outputs]
Alignment corrections
Alignments are scored (substitution score) in order to define the similarity between two amino acid residues in the sequences.
A substitution score is calculated for each aligned pair of letters.
Substitution matrices:
- reflect the true probabilities of mutations occurring through a period of evolution
- PAM family: based on global alignments of closely related proteins. Mutation probability matrix.
- BLOSUM family: based on observed alignments; not extrapolated from comparisons of closely related sequences.
Gap Penalties
A gap is one or more empty spaces in one sequence aligned with letters in the other sequence.
These empty spaces may or may not be treated as penalties:
- a higher penalty score is assigned to the first missing amino acid than to the subsequent ones; this reflects the fact that each mutational event can insert or delete many residues at a time
Gap Penalties
Gap Penalties
[Figure: protein structure with N- and C-termini labeled]
Insertion/deletion of structural domains can ‘easily’ be done at loop sites
Gap Penalties
The overall alignment score is the sum of the similarity and gap scores: the higher the overall alignment score, the better the alignment (more conserved)
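How similarity and gap scores add up can be shown on a finished alignment. The sketch below is a simplified illustration: a plain match/mismatch score stands in for a PAM or BLOSUM lookup, and the parameter values (`gap_open=-5`, `gap_extend=-1`) are arbitrary assumptions. The first position of each gap costs more than the following ones, as described for gap penalties.

```python
def alignment_score(aligned_a, aligned_b, match=2, mismatch=-1,
                    gap_open=-5, gap_extend=-1):
    """Sum of substitution scores and gap penalties for a given alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    gap_row = None  # which sequence currently carries an open gap, if any
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            raise ValueError("column with gaps in both sequences")
        if a == "-" or b == "-":
            row = "a" if a == "-" else "b"
            # first position of a gap is penalized more than its extension
            score += gap_extend if gap_row == row else gap_open
            gap_row = row
        else:
            score += match if a == b else mismatch
            gap_row = None
    return score


# 7 matches (+14), one gap of length 2 (-5 -1), one gap of length 1 (-5)
print(alignment_score("ACGT--ACGT", "ACGTTTAC-T"))
```

Comparing this score for alternative alignments of the same two sequences is what lets an alignment program prefer one long gap over several scattered ones.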
Corrections by hand may still be needed!
Multiple Sequence Alignments
Multiple nucleotide or amino acid sequence alignments are usually performed for one of the following purposes:
- to characterize protein families and identify shared regions of homology in a multiple sequence alignment (this generally happens when a sequence search has revealed homologies to several sequences);
- to determine the consensus sequence of several aligned sequences;
- to help predict the secondary and tertiary structures of new sequences;
- as a preliminary step in molecular evolution analysis using phylogenetic methods for constructing phylogenetic trees.
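One of the uses of a multiple sequence alignment, determining a consensus sequence, can be sketched in a few lines. This assumes the simplest possible rule, the most common non-gap residue in each column; the function name `consensus` and the toy alignment are illustrative, and real tools usually apply frequency thresholds or ambiguity codes instead.

```python
from collections import Counter


def consensus(alignment, gap="-"):
    """Most common non-gap residue per column; a gap if the column is all gaps."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("alignment rows must be non-empty and of equal length")
    result = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != gap)
        result.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(result)


# Hypothetical alignment of four short sequences.
msa = [
    "MKT-AYI",
    "MKSLAYI",
    "MRT-AYV",
    "MKTLAFI",
]
print(consensus(msa))
```

Note that the fourth column is half gaps; this rule still reports the residue seen in the remaining sequences, which may or may not be what is wanted.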
Backbone generation
Uses known structurally conserved regions to generate coordinates for the unknown
For SCRs - copy coordinates from known structures
For variable regions (VR) - copy from known structure, if the residue types are similar; otherwise, use databases of fragmented loop sequences.
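Copying coordinates along an alignment can be sketched as below. This is a heavily simplified illustration, assuming one coordinate per template residue (for example the Cα atom) and a precomputed target-template alignment; the function name `copy_backbone` and the data are hypothetical. Target residues that face a gap in the template get no coordinates and are left for loop modeling.

```python
def copy_backbone(aligned_target, aligned_template, template_coords):
    """Transfer template coordinates to the aligned target residues.

    Returns one (residue, coords) pair per target residue; coords is None
    where the target residue is aligned to a gap in the template.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    if sum(ch != "-" for ch in aligned_template) != len(template_coords):
        raise ValueError("need one coordinate per template residue")
    model = []
    template_index = 0
    for target_res, template_res in zip(aligned_target, aligned_template):
        coords = None
        if template_res != "-":
            coords = template_coords[template_index]
            template_index += 1
        if target_res != "-":
            # insertion relative to the template: coords stays None
            model.append((target_res, coords))
    return model


# Hypothetical alignment: G is deleted in the target, D and E are inserted.
template_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = copy_backbone("AC-DEF", "ACG--F", template_ca)
for residue, coords in model:
    print(residue, coords)
```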
Backbone generation
Template-based fragment assembly
a) Find structurally conserved regions
b) Build model core
Loop modeling
Loop modeling
1. Database search for segments from known protein structures fitting fixed end-points
2. Molecular mechanics/molecular dynamics
3. Combination of 1+2
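The database-search approach to loops can be illustrated with a deliberately crude sketch: among stored fragments of the right length, prefer those whose end-to-end distance is closest to the distance between the two fixed end-points. The fragment library, the names and the single-distance criterion are all assumptions made for illustration; real methods superimpose several anchor residues and also check for clashes with the rest of the model.

```python
import math


def rank_loop_fragments(anchor_start, anchor_end, loop_length, fragments):
    """Rank candidate fragments by how well their ends span the anchors.

    fragments maps a name to a list of (x, y, z) points, one per residue.
    Only fragments with exactly loop_length residues are considered.
    """
    target = math.dist(anchor_start, anchor_end)
    candidates = []
    for name, coords in fragments.items():
        if len(coords) != loop_length:
            continue
        mismatch = abs(math.dist(coords[0], coords[-1]) - target)
        candidates.append((mismatch, name))
    return sorted(candidates)


# Hypothetical fragment library with one point per residue.
library = {
    "fragA": [(0.0, 0.0, 0.0), (2.0, 2.0, 0.0), (4.0, 0.0, 0.0)],
    "fragB": [(1.0, 1.0, 1.0), (3.0, 3.0, 1.0), (6.0, 1.0, 1.0)],
    "fragC": [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0), (3.0, 1.0, 0.0)],
}
ranking = rank_loop_fragments((0.0, 0.0, 0.0), (5.0, 0.0, 0.0), 3, library)
print(ranking)
```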
Loop modeling
Ab initio rebuilding (e.g., Monte Carlo, MD, etc) to build missing loops
Side chain modeling
1. Use of rotamer libraries (backbone dependent)
2. Molecular mechanics optimization
- Dead-end elimination (heuristic)
- Monte Carlo (heuristic)
- Branch & Bound (exact)
3. Mean-field methods
Model optimization
Molecular mechanics methods
Model validation/evaluation
Model should be evaluated for:
- correctness of the overall fold/structure
- errors over localized regions
- stereochemical parameters: bond lengths, angles, etc.
Some software for model verification:
- Procheck http://www.biochem.ucl.ac.uk/~roman/procheck/procheck.html
- WHAT IF http://swift.cmbi.kun.nl/whatif
- PROSA II http://www.came.sbg.ac.at/Services/prosa.html
- Profile 3D & Verify 3D http://shannon.mbi.ucla.edu/DOE/Services
Model validation/evaluation
The Ramachandran plot
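A Ramachandran plot is a scatter of the backbone dihedral angles phi and psi of each residue, so its basic ingredient is a dihedral angle computed from four atom positions. The sketch below shows one common way to compute it; the helper names and test points are assumptions. For residue i, phi uses the atoms C(i-1), N(i), CA(i), C(i) and psi uses N(i), CA(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # components of b0 and b2 perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Four test points with the central bond along z.
print(dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))
```

Programs such as Procheck then compare the resulting (phi, psi) pairs with the regions populated in high-quality experimental structures; those regions are not encoded here.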
Model validation/evaluation
Model validation/evaluation
Profile 3D & Verify 3D:
- verify newly solved structures or homology models
- find structures/folds compatible with a given sequence
- find sequences compatible with a known structure/fold from a database of sequences