Bioinformatics how to use publicly available free tools to predict protein structure by comparative modeling

Page 1: Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling.

Bioinformatics how to … use publicly available free tools to predict protein structure by comparative modeling

Page 2:

Proteins are 3D objects with complex shapes

Over 60,000 protein structures have been determined, mostly by X-ray crystallography (PDB)

3D structure of ~70% of bacterial and 50% of human proteins can be predicted (comparative modeling)

Page 3:

A predicted model simply illustrates our assumptions

No assumptions, this is nature telling us how it is

GNAAAAKKGSEQESVKEFLAKAKEDFLKKWENPAQNTAHLDQFERIKTLGTGSFGRVMLVKHKETGNHFAMKILDKQKVVKLKQIEHTLNEKRILQAVNFPFLVKLEYSFKDNSNLYMVMEYVPGGEMFSHLRRIGRFSEPHARFYAAQIVLTFEYLHSLDLIYRDLKPENLLIDQQGYIQVTDFGFAKRVKGRTWTLCGTPEYLAPEIILSKGYNKAVDWWALGVLIYEMAAGYPPFFADQPIQIYEKIVSGKVRFPSHFSSDLKDLLRNLLQVDLTKRFGNLKDGVNDIKNHKWFATTDWIAIYQRKVEAPFIPKFKGPGDTSNFDDYEEEEIRVSINEKCGKEFSEF

Sequence

Assumption (protein A is similar to protein B)

Result (protein A is similar to protein B)

Page 4:

Unknown protein: GLLTTKFVSLLQEAKDGVLDLKLAADTLAVRQKRRIYDITNVLEGIGLIEKKSKNSIQW

Well-studied protein: SRRSASHPTYSEMIAAAIRAEKSRGGSSRQSIQKYIKSHYKVGHNADLQIKLSIRRLLAA

similarity

prediction

How do we know that these proteins are similar?

Page 5:

How can we make such assumptions?

Statistical reliability of the prediction

E-value – the number of hits one can "expect" to see just by chance when searching a database of a particular size (the closer to zero, the better)

Z-score – the score expressed as its distance from the mean, measured in standard deviations (the bigger, the better)
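
As a rough illustration of these two measures, the sketch below computes a Z-score against a set of background scores and a BLAST-style E-value from a bit score. The relation E = m * n * 2^(-S'), with m and n the query and database lengths and S' the bit score, is the usual Karlin-Altschul form; the function names and all example numbers are made up for illustration and do not come from the slides.

```python
import statistics


def z_score(score, background_scores):
    """Distance of `score` from the background mean, in standard deviations."""
    mean = statistics.mean(background_scores)
    stdev = statistics.pstdev(background_scores)
    if stdev == 0:
        raise ValueError("background scores have zero variance")
    return (score - mean) / stdev


def e_value(bit_score, query_length, database_length):
    """Expected number of chance hits scoring at least `bit_score`.

    Assumes BLAST-style bit scores: E = m * n * 2 ** (-S').
    """
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical numbers, for illustration only.
background = [22.0, 25.0, 19.0, 24.0, 20.0]  # scores against unrelated proteins
print("Z-score:", z_score(60.0, background))
print("E-value:", e_value(50.0, query_length=300, database_length=20_000_000))
```

Note that the same bit score gives a larger E-value in a larger database, which is why the definition above mentions the database size.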

Page 6:

Similar, but not homologous

phosphoribosyltransferase and viral coat protein, identity: 42%, different folds, different functions

 99 IRLKSYCNDQSTGDIKVIGGDDLSTLTGKNVLIVEDIIDTGKTMQTLLSLVRQY.NPKMVKVASLLVKRTPRSVGY 173
214 VPLKTDANDQ.IGDSLY....SAMTVDDFGVLAVRVVNDHNPTKVT..SKVRIYMKPKHVRV...WCPRPPRAVPY 279

Page 7:

Different, but homologous

Histone H5 and transcription factor E2F4: identity 7%, similar fold, similar function (DNA binding)

PTYSEMIAAAIRAEKSRGGSSRQSIQKYIKSHYKVGHNADLQIKLSIRRLLAAGVLKQTKGVGASGSFRL
GLLTTKFVSLLQEAKD-GVLDLKLAADTLA------VRQKRRIYDITNVLEGIGLIEKKS----KNSIQW
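
Percent identity can be computed directly from two aligned rows like the Histone H5 / E2F4 pair above. The sketch below is one minimal way to do it; the function names are illustrative, and the choice of denominator (all alignment columns, or only columns without a gap) is an assumption, so the printed values need not match the rounded figure quoted on the slide.

```python
GAP_CHARS = "-."


def count_identities(row_a, row_b):
    """Number of alignment columns holding the same residue in both rows."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return sum(1 for a, b in zip(row_a, row_b) if a == b and a not in GAP_CHARS)


def percent_identity(row_a, row_b, over="alignment"):
    """Percent identity over all columns ('alignment') or gap-free columns ('aligned')."""
    identities = count_identities(row_a, row_b)
    if over == "alignment":
        denominator = len(row_a)
    elif over == "aligned":
        denominator = sum(
            1 for a, b in zip(row_a, row_b)
            if a not in GAP_CHARS and b not in GAP_CHARS
        )
    else:
        raise ValueError("over must be 'alignment' or 'aligned'")
    return 100.0 * identities / denominator if denominator else 0.0


# The two aligned rows from this slide.
histone_h5 = "PTYSEMIAAAIRAEKSRGGSSRQSIQKYIKSHYKVGHNADLQIKLSIRRLLAAGVLKQTKGVGASGSFRL"
e2f4 = "GLLTTKFVSLLQEAKD-GVLDLKLAADTLA------VRQKRRIYDITNVLEGIGLIEKKS----KNSIQW"

print("identity over all columns:", round(percent_identity(histone_h5, e2f4), 1))
print("identity over gap-free columns:", round(percent_identity(histone_h5, e2f4, "aligned"), 1))
```

Whichever convention is used, the value stays far below the 25% level, even though the two proteins share a fold and a function.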

Page 8:

Steps in comparative modeling

Recognition: Are there any well-characterized proteins similar to my protein?

Alignment: What is the position-by-position target/template equivalence?

Modeling: What is the detailed 3D structure of my protein?

Model analysis: Is my model any good?

Page 9:

Recognition

Tools: BLAST, PSI-BLAST or PFAM, FFAS, metaserver (bioinfo)

Output: name (PDB code) of the template

Output: statistical significance of the match (Z-score, E-value, P-value, points)
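
To make the recognition step concrete, here is a small sketch that reads BLAST tabular output (the 12-column layout produced by BLAST+ with `-outfmt 6`) and picks the most significant hit as the template. The hit names, the numbers in the example table, and the E-value cutoff of 1e-3 are all hypothetical; a real run would supply its own output and a cutoff suited to the search.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    template: str
    percent_identity: float
    e_value: float
    bit_score: float


def parse_tabular_hits(text):
    """Parse 12-column tabular output: qseqid sseqid pident length mismatch
    gapopen qstart qend sstart send evalue bitscore."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete hit line
        hits.append(Hit(fields[1], float(fields[2]), float(fields[10]), float(fields[11])))
    return hits


def best_template(hits, max_e_value=1e-3):
    """Return the hit with the lowest E-value, or None if nothing passes the cutoff."""
    significant = [hit for hit in hits if hit.e_value <= max_e_value]
    return min(significant, key=lambda hit: hit.e_value) if significant else None


# Hypothetical search result, for illustration only.
EXAMPLE_OUTPUT = "\n".join(
    "\t".join(row)
    for row in [
        ("my_protein", "hypo_template_1", "45.2", "250", "130", "4", "1", "248", "3", "251", "2e-60", "210"),
        ("my_protein", "hypo_template_2", "28.0", "120", "80", "6", "10", "128", "15", "134", "0.004", "38.5"),
        ("my_protein", "hypo_template_3", "24.1", "60", "44", "2", "30", "88", "5", "64", "3.1", "27.3"),
    ]
)

print(best_template(parse_tabular_hits(EXAMPLE_OUTPUT)))
```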

Page 10:

Alignment

Tools: the same as in recognition (perhaps with different parameters), plus editing by hand

Output: position-by-position equivalence table
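
A position-by-position equivalence table can be derived mechanically from two aligned rows. The sketch below is one possible representation, assuming `-` or `.` marks a gap and residues are numbered from 1 within each sequence; the function name and the toy alignment are illustrative only.

```python
def equivalence_table(target_row, template_row, gap_chars="-."):
    """Return one (target_index, target_residue, template_index, template_residue)
    tuple per alignment column; an index is None where that row has a gap."""
    if len(target_row) != len(template_row):
        raise ValueError("aligned rows must have the same length")
    table = []
    target_pos = 0
    template_pos = 0
    for target_res, template_res in zip(target_row, template_row):
        target_index = template_index = None
        if target_res not in gap_chars:
            target_pos += 1
            target_index = target_pos
        if template_res not in gap_chars:
            template_pos += 1
            template_index = template_pos
        table.append((target_index, target_res, template_index, template_res))
    return table


# Toy alignment, made up for illustration.
for row in equivalence_table("GLLT-KF", "PT-SEMI"):
    print(row)
```

Columns where one index is None are the insertions and deletions; these are the regions that later have to be built without direct template coordinates.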

Page 11:

Modeling

Commercial programs: Accelrys (Insight), Tripos (Sybyl), …

Freeware/shareware/servers: Modeller (Andrej Sali), Jackal (Barry Honig), SCWRL (Roland Dunbrack), SwissModel
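
Of the free tools listed, Modeller takes the target/template alignment as a PIR-format file. The sketch below writes such entries in plain Python; it does not call Modeller itself. The header layout (ten colon-separated fields, `structureX` for a template and `sequence` for the target) follows the PIR variant as I understand it from the Modeller documentation, and the codes, chain ID, residue range and sequences are hypothetical, so check the output against the manual of the Modeller version in use.

```python
def pir_entry(code, aligned_sequence, template_range=None, width=75):
    """Format one PIR alignment entry.

    template_range: (first_residue, first_chain, last_residue, last_chain)
    for a template structure, or None for the target sequence.
    """
    if template_range is None:
        fields = ["sequence", code] + [""] * 8
    else:
        first, first_chain, last, last_chain = template_range
        fields = ["structureX", code, str(first), first_chain,
                  str(last), last_chain, "", "", "", ""]
    body = aligned_sequence + "*"  # '*' terminates the sequence
    lines = [body[i:i + width] for i in range(0, len(body), width)]
    return "\n".join([f">P1;{code}", ":".join(fields)] + lines) + "\n"


# Hypothetical template and target rows, for illustration only.
alignment = (
    pir_entry("hypo_template", "PTYSEM-IAAAIR", template_range=(1, "A", 12, "A"))
    + pir_entry("my_protein", "GLLTTKFVS--LQ")
)
print(alignment)
```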

Page 12:

Model quality

Empirical energy-based tools: PSQS (http://www1.jcsg.org/psqs/psqs.cgi), SwissPDB viewer

Geometric quality: Procheck, SFCHECK, etc. (http://www.jcsg.org/scripts/prod/validation/sv3.cgi)

Page 13:

Expectations of comparative modeling

Sequence identity scale (%): 75, 50, 25, 0

Easy – 100-40% sequence identity: strong sequence similarity, strong structure similarity, obvious function analogy

Difficult – 40-25% sequence identity: twilight-zone sequence similarity, increasing structure divergence, function diversification

Fold prediction – below 25% sequence identity: no apparent sequence similarity, extreme function divergence
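
The three ranges on this slide translate into a simple lookup. The sketch below encodes them; the function name is illustrative, and assigning the boundary values (exactly 40% and exactly 25%) to the easier class is an assumption, since the slide does not say which side they belong to.

```python
def modeling_difficulty(percent_identity):
    """Map target/template sequence identity (%) to the slide's three classes."""
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity >= 40:
        return "easy"
    if percent_identity >= 25:
        return "difficult"
    return "fold prediction"


for identity in (85, 30, 7):
    print(identity, "->", modeling_difficulty(identity))
```

Identity alone is only a first guide: the earlier slides show a 42% identity pair with different folds and a 7% identity pair that is homologous, so the statistical significance of the match still has to be checked.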

Page 14:

Challenges of comparative modeling

Sequence identity scale (%), top to bottom: 100, 80, 60, 40, 20

Recognition | Alignment | Modeling | Challenges
Trivial | Trivial | Simple | Loop modeling
Trivial | Easy | Simple | Loop modeling
Simple | Challenging | Challenging | Alignment, backbone shifts
Difficult | Very difficult | Significant errors | Alignment, backbone shifts
Often impossible | Significant errors | Often impossible | Recognition

Page 15:

Hands-on Activity

Click below for a hands-on, “bioinformatics how to” activity

Go to http://bioinformatics.burnham.org/ and click the Structure Biology Course “Protein Modeling Tutorial” link on the homepage,

or go directly to http://bioinformatics.burnham.org/SSBC/modeling.html