Bioinformatics how to … use publicly available free tools to predict protein structure by...
![Page 1: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.](https://reader035.fdocuments.net/reader035/viewer/2022062722/56649f325503460f94c4f0e9/html5/thumbnails/1.jpg)
Bioinformatics how to …
use publicly available free tools to predict protein structure by comparative modeling
![Page 2: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.](https://reader035.fdocuments.net/reader035/viewer/2022062722/56649f325503460f94c4f0e9/html5/thumbnails/2.jpg)
Proteins are 3D objects with complex shapes
Over 60,000 protein structures have been determined, mostly by X-ray crystallography (PDB)
3D structure of ~70% of bacterial and 50% of human proteins can be predicted (comparative modeling)
![Page 3: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.](https://reader035.fdocuments.net/reader035/viewer/2022062722/56649f325503460f94c4f0e9/html5/thumbnails/3.jpg)
A predicted model simply illustrates our assumptions
No assumptions, this is nature telling us how it is
GNAAAAKKGSEQESVKEFLAKAKEDFLKKWENPAQNTAHLDQFERIKTLGTGSFGRVMLVKHKETGNHFAMKILDKQKVVKLKQIEHTLNEKRILQAVNFPFLVKLEYSFKDNSNLYMVMEYVPGGEMFSHLRRIGRFSEPHARFYAAQIVLTFEYLHSLDLIYRDLKPENLLIDQQGYIQVTDFGFAKRVKGRTWTLCGTPEYLAPEIILSKGYNKAVDWWALGVLIYEMAAGYPPFFADQPIQIYEKIVSGKVRFPSHFSSDLKDLLRNLLQVDLTKRFGNLKDGVNDIKNHKWFATTDWIAIYQRKVEAPFIPKFKGPGDTSNFDDYEEEEIRVSINEKCGKEFSEF
Sequence
Assumption (protein A is similar to protein B)
Result (protein A is similar to protein B)
![Page 4: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.](https://reader035.fdocuments.net/reader035/viewer/2022062722/56649f325503460f94c4f0e9/html5/thumbnails/4.jpg)
Unknown protein: GLLTTKFVSLLQEAKDGVLDLKLAADTLAVRQKRRIYDITNVLEGIGLIEKKSKNSIQW
Well-studied protein: SRRSASHPTYSEMIAAAIRAEKSRGGSSRQSIQKYIKSHYKVGHNADLQIKLSIRRLLAA
similarity
prediction
How do we know that these proteins are similar?
![Page 5: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.](https://reader035.fdocuments.net/reader035/viewer/2022062722/56649f325503460f94c4f0e9/html5/thumbnails/5.jpg)
How can we make such assumptions?
Statistical reliability of the prediction
E-value: the number of hits one can "expect" to see just by chance when searching a database of a particular size (the closer to zero, the better)
Z-score: the score expressed as a distance from the mean, measured in standard deviations (the bigger, the better)
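As a rough illustration of the two statistics above, the sketch below computes a Z-score from a set of background scores, and an E-value using the Karlin-Altschul form E = K * m * n * exp(-lambda * S) that BLAST-style searches are based on. The background scores, the sequence and database lengths, and the lambda and K values are illustrative assumptions, not values taken from the slides.

```python
import math
import statistics


def z_score(score, background_scores):
    """Distance of `score` from the background mean, in standard deviations."""
    mean = statistics.mean(background_scores)
    sd = statistics.stdev(background_scores)  # sample standard deviation
    return (score - mean) / sd


def e_value(raw_score, query_len, db_len, lam, k):
    """Expected number of chance hits scoring at least `raw_score`.

    Karlin-Altschul form: E = K * m * n * exp(-lambda * S).
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Hypothetical numbers, for illustration only.
background = [10, 12, 8, 11, 9]   # e.g. scores against shuffled sequences
print(f"Z-score: {z_score(30, background):.2f}")

lam, k = 0.318, 0.134             # assumed scoring-system parameters
for s in (40, 80):
    print(f"score {s}: E = {e_value(s, 300, 5_000_000, lam, k):.2e}")
```

A higher score lowers the E-value exponentially, while a larger database raises it linearly, which is why the same hit looks less significant in a bigger database.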
![Page 6: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.](https://reader035.fdocuments.net/reader035/viewer/2022062722/56649f325503460f94c4f0e9/html5/thumbnails/6.jpg)
Similar, but not homologous
Phosphoribosyltransferase and viral coat protein, identity: 42%, different folds, different functions

     99 IRLKSYCNDQSTGDIKVIGGDDLSTLTGKNVLIVEDIIDTGKTMQTLLSLVRQY.NPKMVKVASLLVKRTPRSVGY 173
        : ||.  |||  ||          |.    || |  : |   |  |  | || |  || |:|      | ||.| |
    214 VPLKTDANDQ.IGDSLY....SAMTVDDFGVLAVRVVNDHNPTKVT..SKVRIYMKPKHVRV...WCPRPPRAVPY 279
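The identity figure can be recomputed from the alignment itself. The sketch below counts identical residues over the columns where both sequences have a residue; that choice of denominator is an assumption, since the slide does not say how identity was calculated.

```python
def percent_identity(seq_a, seq_b, gap_chars="-."):
    """Return (identical, aligned, percent) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    identical = aligned = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue  # skip columns with a gap in either sequence
        aligned += 1
        if a == b:
            identical += 1
    percent = 100.0 * identical / aligned if aligned else 0.0
    return identical, aligned, percent


# The two alignment rows from the slide ('.' marks a gap).
ROW_TOP = ("IRLKSYCNDQSTGDIKVIGGDDLSTLTGKNVLIVEDIIDTGKTMQTLLSL"
           "VRQY.NPKMVKVASLLVKRTPRSVGY")
ROW_BOTTOM = ("VPLKTDANDQ.IGDSLY....SAMTVDDFGVLAVRVVNDHNPTKVT..SK"
              "VRIYMKPKHVRV...WCPRPPRAVPY")

identical, aligned, pct = percent_identity(ROW_TOP, ROW_BOTTOM)
print(f"{identical} identical of {aligned} aligned positions = {pct:.0f}%")
```

With these rows the count is 27 identities over 65 gap-free columns, which rounds to the 42% quoted on the slide.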
![Page 7: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.](https://reader035.fdocuments.net/reader035/viewer/2022062722/56649f325503460f94c4f0e9/html5/thumbnails/7.jpg)
Different, but homologous
Histone H5 and transcription factor E2F4, identity: 7%, similar fold, similar function (DNA binding)

    PTYSEMIAAAIRAEKSRGGSSRQSIQKYIKSHYKVGHNADLQIKLSIRRLLAAGVLKQTKGVGASGSFRL
    GLLTTKFVSLLQEAKD-GVLDLKLAADTLA------VRQKRRIYDITNVLEGIGLIEKKS----KNSIQW
![Page 8: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.](https://reader035.fdocuments.net/reader035/viewer/2022062722/56649f325503460f94c4f0e9/html5/thumbnails/8.jpg)
Steps in comparative modeling
Recognition: Are there any well-characterized proteins similar to my protein?
Alignment: What is the position-by-position target/template equivalence?
Modeling: What is the detailed 3D structure of my protein?
Model analysis: Is my model any good?
![Page 9: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.](https://reader035.fdocuments.net/reader035/viewer/2022062722/56649f325503460f94c4f0e9/html5/thumbnails/9.jpg)
Recognition
BLAST, PSI-BLAST or PFAM, FFAS, metaserver (bioinfo)
Name (PDB code) of the template
Statistical significance of the match (Z-score, E-value, p-value, points)
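A minimal sketch of how the recognition output might be handled programmatically, assuming the search was run with BLAST's standard 12-column tabular output (`-outfmt 6`). The hit lines below are invented placeholders, not real search results, and the E-value cutoff is an arbitrary assumption.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    template: str            # subject identifier (e.g. PDB code plus chain)
    percent_identity: float
    evalue: float


def parse_tabular_hits(text):
    """Parse BLAST tabular lines: qseqid sseqid pident ... evalue bitscore."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        hits.append(Hit(fields[1], float(fields[2]), float(fields[10])))
    return hits


def pick_template(hits, max_evalue=1e-3) -> Optional[Hit]:
    """Return the significant hit with the lowest E-value, or None."""
    significant = [h for h in hits if h.evalue <= max_evalue]
    return min(significant, key=lambda h: h.evalue) if significant else None


# Hypothetical hits (identifiers and numbers are made up for illustration).
EXAMPLE = "\n".join([
    "query\ttmplA\t45.2\t250\t130\t4\t1\t248\t5\t250\t2e-60\t210",
    "query\ttmplB\t28.0\t180\t120\t6\t10\t185\t3\t178\t4e-05\t48.1",
    "query\ttmplC\t22.5\t90\t65\t3\t40\t128\t12\t99\t3.1\t28.9",
])

best = pick_template(parse_tabular_hits(EXAMPLE))
print(best)
```

The function returns both pieces of information listed above: the name of the template and the statistical significance of the match.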
![Page 10: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.](https://reader035.fdocuments.net/reader035/viewer/2022062722/56649f325503460f94c4f0e9/html5/thumbnails/10.jpg)
Alignment
The same tools as in recognition (perhaps with different parameters), editing by hand
Position by position equivalence table
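The equivalence table can be derived mechanically from a pairwise alignment. The sketch below is one possible way to do it; the function name and the tuple layout are assumptions made for illustration, and the input is a ten-column fragment of the unknown/well-studied protein alignment shown on the earlier slide, so positions are numbered relative to that fragment.

```python
def equivalence_table(target_aln, template_aln, gap_chars="-."):
    """List (target_pos, target_res, template_pos, template_res) per aligned pair.

    Positions are 1-based and count residues only, not gap characters.
    Columns with a gap on either side have no equivalence and are skipped.
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have the same length")
    table = []
    target_pos = template_pos = 0
    for t, p in zip(target_aln, template_aln):
        t_is_residue = t not in gap_chars
        p_is_residue = p not in gap_chars
        if t_is_residue:
            target_pos += 1
        if p_is_residue:
            template_pos += 1
        if t_is_residue and p_is_residue:
            table.append((target_pos, t, template_pos, p))
    return table


TARGET = "LQEAKD-GVL"     # unknown protein, with one gap
TEMPLATE = "IRAEKSRGGS"   # well-studied protein
for row in equivalence_table(TARGET, TEMPLATE):
    print(row)
```

Hand editing of an alignment amounts to changing the gap placement in the two input strings and regenerating this table.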
![Page 11: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.](https://reader035.fdocuments.net/reader035/viewer/2022062722/56649f325503460f94c4f0e9/html5/thumbnails/11.jpg)
Modeling
Commercial programs: Accelrys (Insight), Tripos (Sybyl), …
Freeware/shareware/servers: Modeller (Andrej Sali), Jackal (Barry Honig), SCWRL (Roland Dunbrack), SwissModel
![Page 12: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.](https://reader035.fdocuments.net/reader035/viewer/2022062722/56649f325503460f94c4f0e9/html5/thumbnails/12.jpg)
Model quality
Empirical energy based tools: PSQS (http://www1.jcsg.org/psqs/psqs.cgi), SwissPDB viewer
Geometric quality: Procheck, SFCHECK, etc. (http://www.jcsg.org/scripts/prod/validation/sv3.cgi)
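Geometric checks of the Procheck kind rest on quantities such as backbone torsion angles. As a hedged illustration only (this is not Procheck's code), the sketch below computes a torsion angle from four atom positions. The coordinates are made-up test points rather than real atoms, and the sign convention (clockwise positive when looking from the second atom to the third) is the one assumed here.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(p0, p1, p2, p3):
    """Torsion angle in degrees defined by four points (range -180..180)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)            # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)            # normal of the plane through p1, p2, p3
    b2_len = math.sqrt(_dot(b2, b2))
    y = _dot(b2, _cross(n1, n2)) / b2_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up points: the last point is rotated a quarter turn about the z axis.
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

In a real check the four points would be consecutive backbone atoms read from the model's coordinate file, and the resulting angle pairs would be compared against the regions commonly observed in experimental structures.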
![Page 13: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.](https://reader035.fdocuments.net/reader035/viewer/2022062722/56649f325503460f94c4f0e9/html5/thumbnails/13.jpg)
Expectations of comparative modeling
Sequence identity scale (%): 75, 50, 25, 0
Easy (100-40% sequence identity): strong sequence similarity, strong structure similarity, obvious function analogy
Difficult (40-25%): twilight zone sequence similarity, increasing structure divergence, function diversification
Fold prediction (below 25% sequence identity): no apparent sequence similarity, extreme function divergence
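The three regimes can be written down as a simple lookup. This is only a sketch: the slide's ranges share their end points, so assigning exactly 40% to "easy" and exactly 25% to "difficult" is an assumption, as are the function name and the returned labels.

```python
def modeling_regime(percent_identity):
    """Map target/template sequence identity (%) to the slide's three regimes."""
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity >= 40:
        return "easy"
    if percent_identity >= 25:
        return "difficult (twilight zone)"
    return "fold prediction"


for identity in (85, 30, 7):
    print(identity, "->", modeling_regime(identity))
```

Identity alone is a guide rather than a guarantee: the phosphoribosyltransferase and viral coat protein pair shown earlier has 42% identity yet different folds, while the histone H5 and E2F4 pair has 7% identity and a similar fold.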
![Page 14: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.](https://reader035.fdocuments.net/reader035/viewer/2022062722/56649f325503460f94c4f0e9/html5/thumbnails/14.jpg)
Challenges of comparative modeling
Sequence identity scale (%): 100, 80, 60, 40, 20

| Recognition | Alignment | Modeling | Challenges |
|---|---|---|---|
| Trivial | Trivial | Simple | Loop modeling |
| Trivial | Easy | Simple | Loop modeling |
| Simple | Challenging | Challenging | Alignment, backbone shifts |
| Difficult | Very difficult | Significant errors | Alignment, backbone shifts |
| Often impossible | Significant errors | Often impossible | Recognition |
![Page 15: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.](https://reader035.fdocuments.net/reader035/viewer/2022062722/56649f325503460f94c4f0e9/html5/thumbnails/15.jpg)
Hands-on Activity
Click below for a hands-on, “bioinformatics how to” activity
Go to http://bioinformatics.burnham.org/
Click the Structure Biology Course “Protein Modeling Tutorial” link on the homepage.
Or go to http://bioinformatics.burnham.org/SSBC/modeling.html