1-D Predictions 2012 (6_1D_predictions_2012.pdf, cbs.dtu.dk)
Programme
8.00-8.30 Last week’s quiz results
8.30-9.00 Prediction of secondary structure & surface exposure
9.00-9.20 Protein disorder prediction
9.20-9.30 Break (get computers upstairs)
9.30-11.00 Ex.: Secondary structure prediction
11.00-11.10 Break
11.10-11.40 Summary & discussion
11.40-12.00 Quiz
Feedback Persons
1-D Predictions
Prediction of local features: Secondary structure & surface exposure
Learning Objectives
After today’s session you should be able to:
– Explain the meaning and usage of the following local feature terms:
• Secondary structure
• Surface accessibility/exposure
• Transmembrane helix
• Signal peptide
• Protein disorder
– Use different 1-D prediction servers and interpret the results (the exercise).
Residue Patterns
Helices
– Helix capping
– Amphiphilic residue patterns
Sheets
– Amphiphilic residue patterns
– Residue preferences at edges vs. middle
Special residues
– Proline
• Helix breaker
– Glycine
• In turns/loops/bends
(Figure labels: N, C)
1-D predictions
Local Structures
– Secondary Structure
– Trans Membrane Helix
Features
– Surface Accessibility
– Signal Peptides
Secondary Structure Elements
α-helix = H
3₁₀-helix = G
π-helix = I
Extended (β)-strand = E
Isolated β-bridge = B
Turn = T
Bend = S
Rest (Coil) = C/.
Assignment from Structure
• DSSP (http://www.cmbi.kun.nl/gv/dssp/)
• STRIDE (http://www.hgmp.mrc.ac.uk/Registered/Option/stride.html)
• DSSPcont (http://cubic.bioc.columbia.edu/services/DSSPcont/)
Helices
Three-State Prediction of Classes
α-helix = H
3₁₀-helix = G
π-helix = I
Extended (β)-strand = E
Isolated β-bridge = B
Turn = T
Bend = S
The Rest (Coil) = ./C
Predicted classes: H, E, C
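The slide groups the eight DSSP codes into three predicted classes, but the grouping itself does not survive in the transcript. A minimal sketch, assuming one common convention (H, G, I become H; E, B become E; everything else becomes C). Other reduction schemes exist, and the names `DSSP_TO_3STATE` and `reduce_to_three_states` are illustrative only:

```python
# Assumed reduction scheme (not stated on the slide):
# helices H/G/I -> H, strands E/B -> E, all other codes -> C.
DSSP_TO_3STATE = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_to_three_states(dssp):
    """Map a string of DSSP codes to the three classes H, E and C."""
    return "".join(DSSP_TO_3STATE.get(code, "C") for code in dssp)


print(reduce_to_three_states(".HHGGGEEB.TS"))
```

Because every unlisted code falls through to C, turn (T), bend (S) and the blank/dot state are all treated as coil.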
Prediction Servers
Psi-Pred (http://bioinf.cs.ucl.ac.uk/psipred/)
PHD
Prof
Jpred
PsiPRED
PSIPRED PREDICTION RESULTS

Key

Conf: Confidence (0=low, 9=high)
Pred: Predicted secondary structure (H=helix, E=strand, C=coil)
  AA: Target sequence

# PSIPRED HFORMAT (PSIPRED V2.3 by David Jones)

Conf: 962265677776523477650688877787645776578999999733875215678887
Pred: CCCHHHHHHHHHHHCCCCCCCHHHHHHHHHHHCCCCCCHHHHHHHHHCCCCCCHHHHHHH
  AA: MSLLTEVETYVLSIIPSGPLKAEIAQRLEDVFAGKNTDLEVLMEWLKTRPILSPLTKGIL
              10        20        30        40        50        60

Conf: 754642045401245555330224688880246788999999865213001344431012
Pred: HHHHHHCCCCHHHHHHHHHHHCCCCCCCCCHHHHHHHHHHHHHHHHCCHHHHHHHHHCCC
  AA: GFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDKAVKLYRKLKREITFHGAKEISLSYS
              70        80        90       100       110       120

Conf: 113899999987067751045678889988888742346778777764042033332466
Pred: HHHHHHHHHHHHHCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCHHHHHHHHH
  AA: AGALASCMGLIYNRMGAVTTEVAFGLVCATCEQIADSQHRSHRQMVTTTNPLIRHENRMV
             130       140       150       160       170       180

Conf: 554368888741366024789999999999999999862489875310478899999998
Pred: HHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCHHHHHHHHHHHHH
  AA: LASTTAKAMEQMAGSSEQAAEAMEVASQARQMVQAMRTIGTHPSSSAGLKNDLLENLQAY
             190       200       210       220       230       240

Conf: 886363002159
Pred: HHHHCCHHHCCC
  AA: QKRMGVQMQRFK
             250

Calculate PostScript, PDF and JPEG graphical output for this result using:
http://bioinf2.cs.ucl.ac.uk/cgi-bin/psipred/gra/nph-view2.cgi?id=3644f256afcf5ec3.psi
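For the exercise it can help to read such a report programmatically. The sketch below parses the Conf/Pred/AA rows of a horizontal report like the one above and lists the predicted helix segments. It is only an illustration based on the layout shown on this slide: the function names `parse_psipred_horiz` and `segments` are made up, and the sample is the last block of the report, so positions are counted from 1 within the sample.

```python
import re
from itertools import groupby

# Last block of the report on the slide, plus the legend lines,
# which the parser has to skip.
SAMPLE = """\
Conf: Confidence (0=low, 9=high)
Pred: Predicted secondary structure (H=helix, E=strand, C=coil)
  AA: Target sequence

Conf: 886363002159
Pred: HHHHCCHHHCCC
  AA: QKRMGVQMQRFK
"""


def parse_psipred_horiz(text):
    """Concatenate the Conf, Pred and AA rows of all blocks in the report."""
    rows = {"Conf": [], "Pred": [], "AA": []}
    for line in text.splitlines():
        # Data rows carry a single token after the label; legend rows do not.
        match = re.match(r"^\s*(Conf|Pred|AA):\s*(\S+)\s*$", line)
        if match:
            rows[match.group(1)].append(match.group(2))
    conf = [int(digit) for digit in "".join(rows["Conf"])]
    pred = "".join(rows["Pred"])
    seq = "".join(rows["AA"])
    if not (len(conf) == len(pred) == len(seq)):
        raise ValueError("Conf, Pred and AA rows differ in length")
    return conf, pred, seq


def segments(pred, state="H"):
    """Return 1-based (start, end) positions of each run of `state`."""
    found, position = [], 1
    for label, run in groupby(pred):
        length = len(list(run))
        if label == state:
            found.append((position, position + length - 1))
        position += length
    return found


conf, pred, seq = parse_psipred_horiz(SAMPLE)
print(segments(pred, "H"))
```

The confidence digits are returned alongside the prediction so that low-confidence stretches can be inspected before trusting a segment boundary.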
PsiPred
Trans-Membrane Helices
Transmembrane Helix Predictors
TMHMM HMMTOP DAS
Signal Peptide
SignalP Phobius Philius
Prediction Methods
Exemplified by Secondary Structure Predictions
Amino Acid Statistics
VKEFLAKAKEDFLKKWETPSQNTAQLDQFDRIKTLGTGSFGRVMLVKHKESGNHYAMKILDKQKVVKLKQIEHTLNEKRI
.HHHHHHHHHHHHHHHHS.......GGGEEEEEEEEE.SS.EEEEEEETTTTEEEEEEEEEHHHHHHTT.HHHHHHHHHH
Helix: VKEFLAKAK, KEFLAKAKE, EFLAKAKED, ...
Strand: QLDQFDRIK, LDQFDRIKT, DQFDRIKTL, ...
Coil: KKWETPSQN, KWETPSQNT, WETPSQNTA, ...
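The windows on this slide can be reproduced by sliding a 9-residue window along the sequence and labelling each window with the class of its central residue. A minimal sketch, assuming that labelling rule and the same three-state reduction convention (H/G/I to H, E/B to E, rest to C); `labelled_windows` is a hypothetical helper name:

```python
SEQ = ("VKEFLAKAKEDFLKKWETPSQNTAQLDQFDRIKTLGTGSFGRVMLVKHKESGNHYAMKIL"
       "DKQKVVKLKQIEHTLNEKRI")
SS = (".HHHHHHHHHHHHHHHHS.......GGGEEEEEEEEE.SS.EEEEEEETTTTEEEEEEEEE"
      "HHHHHHTT.HHHHHHHHHH")

# Assumed three-state reduction; the slide does not spell it out.
TO_3STATE = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def labelled_windows(seq, ss, width=9):
    """Yield (window, class of the central residue) for every full window."""
    if len(seq) != len(ss):
        raise ValueError("sequence and structure strings differ in length")
    half = width // 2
    for start in range(len(seq) - width + 1):
        centre = ss[start + half]
        yield seq[start:start + width], TO_3STATE.get(centre, "C")


for window, label in list(labelled_windows(SEQ, SS))[:3]:
    print(window, label)
```

Counting how often each residue type occurs in windows of each class is the raw material for the statistics and propensities discussed on the following slides.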
Propensities
Helix
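Only the figure title survives for this slide. As an illustration of what a propensity is, the sketch below assumes the usual ratio definition (the frequency of a residue type within a class divided by its overall frequency), so values above 1 mean the residue is over-represented in that class. The function name and the toy data are made up for the example:

```python
from collections import Counter


def propensities(seq, ss3, state="H"):
    """Return P(residue | state) / P(residue) for each residue type in seq."""
    if len(seq) != len(ss3):
        raise ValueError("sequence and structure strings differ in length")
    overall = Counter(seq)
    in_state = Counter(aa for aa, s in zip(seq, ss3) if s == state)
    n_state = sum(in_state.values())
    if n_state == 0:
        raise ValueError("no residues in the requested state")
    return {aa: (in_state[aa] / n_state) / (overall[aa] / len(seq))
            for aa in overall}


# Toy data: both alanines are helical, both glycines are coil.
print(propensities("AAGG", "HHCC", state="H"))
```

Real propensity scales are derived from many structures; a four-residue toy string only shows the arithmetic.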
BLOSUM Substitution
   A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V  B  Z  X  *
A  4 -1 -2 -2  0 -1 -1  0 -2 -1 -1 -1 -1 -2 -1  1  0 -3 -2  0 -2 -1  0 -4
R -1  5  0 -2 -3  1  0 -2  0 -3 -2  2 -1 -3 -2 -1 -1 -3 -2 -3 -1  0 -1 -4
N -2  0  6  1 -3  0  0  0  1 -3 -3  0 -2 -3 -2  1  0 -4 -2 -3  3  0 -1 -4
D -2 -2  1  6 -3  0  2 -1 -1 -3 -4 -1 -3 -3 -1  0 -1 -4 -3 -3  4  1 -1 -4
C  0 -3 -3 -3  9 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1 -3 -3 -2 -4
Q -1  1  0  0 -3  5  2 -2  0 -3 -2  1  0 -3 -1  0 -1 -2 -1 -2  0  3 -1 -4
E -1  0  0  2 -4  2  5 -2  0 -3 -3  1 -2 -3 -1  0 -1 -3 -2 -2  1  4 -1 -4
G  0 -2  0 -1 -3 -2 -2  6 -2 -4 -4 -2 -3 -3 -2  0 -2 -2 -3 -3 -1 -2 -1 -4
H -2  0  1 -1 -3  0  0 -2  8 -3 -3 -1 -2 -1 -2 -1 -2 -2  2 -3  0  0 -1 -4
I -1 -3 -3 -3 -1 -3 -3 -4 -3  4  2 -3  1  0 -3 -2 -1 -3 -1  3 -3 -3 -1 -4
L -1 -2 -3 -4 -1 -2 -3 -4 -3  2  4 -2  2  0 -3 -2 -1 -2 -1  1 -4 -3 -1 -4
K -1  2  0 -1 -3  1  1 -2 -1 -3 -2  5 -1 -3 -1  0 -1 -3 -2 -2  0  1 -1 -4
M -1 -1 -2 -3 -1  0 -2 -3 -2  1  2 -1  5  0 -2 -1 -1 -1 -1  1 -3 -1 -1 -4
F -2 -3 -3 -3 -2 -3 -3 -3 -1  0  0 -3  0  6 -4 -2 -2  1  3 -1 -3 -3 -1 -4
P -1 -2 -2 -1 -3 -1 -1 -2 -2 -3 -3 -1 -2 -4  7 -1 -1 -4 -3 -2 -2 -1 -2 -4
S  1 -1  1  0 -1  0  0  0 -1 -2 -2  0 -1 -2 -1  4  1 -3 -2 -2  0  0  0 -4
T  0 -1  0 -1 -1 -1 -1 -2 -2 -1 -1 -1 -1 -2 -1  1  5 -2 -2  0 -1 -1  0 -4
W -3 -3 -4 -4 -2 -2 -3 -2 -2 -3 -2 -3 -1  1 -4 -3 -2 11  2 -3 -4 -3 -2 -4
Y -2 -2 -2 -3 -2 -1 -2 -3  2 -1 -1 -2 -1  3 -3 -2 -2  2  7 -1 -3 -2 -1 -4
V  0 -3 -3 -3 -1 -2 -2 -3 -3  3  1 -2  1 -1 -2 -2  0 -3 -1  4 -3 -2 -1 -4
Position Specific Scoring Matrices (PSSM)
PSSM
      A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V
 1 I -2 -4 -5 -5 -2 -4 -4 -5 -5  6  0 -4  0 -2 -4 -4 -2 -4 -3  4
 2 K -1 -1 -2 -2 -3 -1  3 -3 -2 -2 -3  4 -2 -4 -3  1  1 -4 -3  2
 3 E  5 -3 -3 -3 -3  3  1 -2 -3 -3 -3 -2 -2 -4 -3 -1 -2 -4 -3  1
 4 E -4 -3  2  5 -6  1  5 -4 -3 -6 -6 -2 -5 -6 -4 -2 -3 -6 -5 -5
 5 H -4  2  1  1 -5  1 -2 -4  9 -5 -2 -3 -4 -4 -5 -3 -4 -5  1 -5
 6 V -3  0 -4 -5 -4 -4 -2 -3 -5  1 -2  1  0  1 -4 -3  3 -5 -3  5
 7 I  0 -2 -4  1 -4 -2 -4 -4 -5  1  0 -2  0  2 -5  1 -1 -5 -3  4
 8 I -3  0 -5 -5 -4 -2 -5 -6  1  2  4 -4 -1  0 -5 -2  0 -3  5 -1
 9 Q -2 -3 -2 -3 -5  4 -1  3  5 -5 -3 -3 -4 -2 -4  2 -1 -4  2 -2
10 A  2 -4 -4 -3  2 -3 -1 -4 -2  1 -1 -4 -3 -4  1  2  3 -5 -1  1
11 E -1  3  1  1 -1  0  1 -4 -3 -1 -3  0  3 -5  4 -1 -3 -6 -3 -1
12 F -3 -5 -5 -5 -4 -4 -4 -1 -1  1  1 -5  2  5 -1 -4 -4 -3  5  2
13 Y  3 -5 -5 -6  3 -4 -5 -2 -1  0 -4 -5 -3  3 -5 -2 -2 -2  7  1
14 L -1 -3 -4 -2  1  5  1 -1 -1 -1  1 -3 -3  1 -5 -1 -1 -2  3 -2
15 N -1 -4  4  1  5 -3 -4  2 -4 -4 -4 -3 -2 -4 -5  2  0 -5  0  0
16 P -2  4 -4 -4 -5  0 -3  3  2 -5 -4  0 -4 -3  0  1 -2 -1  5 -3
17 D -3 -2  1  5 -6 -2  2  2 -1 -2 -2 -3 -5 -4 -5 -1  2 -6 -3 -4
Neural Networks
Benefits
– Generally applicable
– Can capture higher order correlations
– Inputs other than sequence information
Drawbacks
– Needs a lot of data (different solved structures with low sequence identity).
– Complex methods with several pitfalls.
Neural Networks
Sequence: IKEEHVIIQAEFYLNPDQSGEF…
Window: I K E E H V I I Q A E
Figure: Input Layer, Hidden Layer and Output Layer (H, E, C), connected by weights
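The figure can be sketched in code: each residue in the window is turned into 20 inputs, the inputs pass through a hidden layer, and the output layer gives one value per class. The sketch below is an assumption-laden illustration, not the architecture of any particular predictor: the one-hot encoding, the sigmoid and softmax functions, the five hidden units and the random (untrained) weights are all chosen here only to make the data flow concrete. The NetSurfP slides later in this deck list PSSM rows, rather than one-hot vectors, as network input.

```python
import math
import random

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # column order of the matrices in these slides
STATES = "HEC"                        # helix, strand, coil


def encode_window(window):
    """One-hot (sparse) encoding: 20 inputs per residue in the window."""
    vector = []
    for residue in window:
        one_hot = [0.0] * len(AMINO_ACIDS)
        one_hot[AMINO_ACIDS.index(residue)] = 1.0
        vector.extend(one_hot)
    return vector


def dense(inputs, weights, biases):
    """Weighted sum of the inputs for every unit of a layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(window, w_hidden, b_hidden, w_out, b_out):
    """Input layer -> sigmoid hidden layer -> softmax output over H/E/C."""
    hidden = [1.0 / (1.0 + math.exp(-z))
              for z in dense(encode_window(window), w_hidden, b_hidden)]
    logits = dense(hidden, w_out, b_out)
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return {state: e / total for state, e in zip(STATES, exps)}


def random_layer(n_out, n_in, rng):
    """Small random weights; a trained network would load fitted values."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
window = "IKEEHVIIQAE"  # the 11-residue window shown on the slide
w_hidden, b_hidden = random_layer(5, len(window) * 20, rng)  # 5 units: arbitrary
w_out, b_out = random_layer(len(STATES), 5, rng)
probs = forward(window, w_hidden, b_hidden, w_out, b_out)
print(probs)  # untrained weights: the values carry no biological meaning
```

Training would adjust the weights so that the output for each window matches the class of its central residue in solved structures.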
NetSurfP
Prediction of Real Value Solvent Accessibility
By Bent Petersen
Objective
Predict residues as being either buried or exposed (25 % threshold)
– Two states/classes, Buried/Exposed
Predict the Relative Solvent Accessibility
– “Real” Value
Why predict RSA?
Residues exposed on the surface can be:
– Involved in PTMs
– Potential antigenic regions
– Involved in protein-protein interactions
– Relevant for the prediction of disease SNPs
What is ASA?
Accessible Surface Area, Å²
Surface area accessible to a rolling water molecule
RSA
RSA = Relative Solvent Accessibility
ACC = Accessible area in protein structure
ASA = Accessible Surface Area in Gly-X-Gly or Ala-X-Ala
RSA = ACC / ASA
Classification networks: Buried = RSA < 25 %, Exposed = RSA > 25 %
“Real” value networks: values 0 - 1, RSA > 1 set to 1
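The definitions above amount to a ratio, a cap at 1 and a threshold. A minimal sketch under stated assumptions: the function names are illustrative, the reference ASA must be supplied by the caller (no reference table is given on the slide), the numbers in the example are invented, and a residue at exactly 25 % is treated as exposed, a boundary case the slide leaves open:

```python
def relative_solvent_accessibility(acc, asa_reference):
    """RSA = ACC / ASA, with values above 1 set to 1 as on the slide."""
    if asa_reference <= 0:
        raise ValueError("reference ASA must be positive")
    return min(acc / asa_reference, 1.0)


def burial_class(rsa, threshold=0.25):
    """'B' (buried) below the threshold, otherwise 'E' (exposed)."""
    # Assumption: RSA exactly equal to the threshold counts as exposed.
    return "B" if rsa < threshold else "E"


# Hypothetical areas in square angstroms, for illustration only.
for acc, reference in [(12.0, 120.0), (30.0, 120.0), (150.0, 120.0)]:
    rsa = relative_solvent_accessibility(acc, reference)
    print(acc, reference, round(rsa, 3), burial_class(rsa))
```

The cap matters because a residue in a folded protein can be more exposed than in the extended Gly-X-Gly or Ala-X-Ala reference peptide.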
Learning / Training dataset
Training set Cull_1764:
– Max. Seq. ID: 25 %
– Resolution: ≤ 2.0 Å
– R-Factor: ≤ 0.2
– Seq. Length: 30-3000 AA
– Excluding non X-ray entries
Learning / Training dataset
Homology reduced against evaluation set CB513 (302 sequences removed)
Final Training set:
– 1764 sequences
– 417,978 amino acids
• Buried: 55.80 % (233,221 amino acids)
• Exposed: 44.20 % (184,757 amino acids)
Neural Network - Input
• Position Specific Scoring Matrices, PSSM (four iterations of PSI-BLAST against nr70)
              A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V
B H 2BEM.A 1 -4 -3 -2 -4 -6 -2 -3 -5 11 -6 -5 -3 -4 -4 -5 -3 -4 -5 -1 -6
A G 2BEM.A 2 -2 -5 -3 -4 -5 -4 -5  7 -5 -7 -6 -4 -5 -6 -5 -3 -4 -5 -6 -6
A Y 2BEM.A 3 -1  1 -4 -3 -5 -4 -4 -4  1 -4 -1 -4 -1  2 -5  0 -1  4  7 -2
A V 2BEM.A 4 -1 -5 -5 -6 -4 -4 -5 -5 -5  4  1 -5  6 -3 -2 -2  0 -5 -4  4
B E 2BEM.A 5 -2 -4 -3  0 -4 -1  3 -2 -4  0 -3 -2  1 -2 -3  3  3 -5 -4  0
• Secondary Structure predictions (sec predictor by Pernille Andersen)
B H 2BEM.A 1  0.003  0.003  0.966
A G 2BEM.A 2  0.018  0.086  0.868
A Y 2BEM.A 3  0.020  0.199  0.752
A V 2BEM.A 4  0.021  0.271  0.679
B E 2BEM.A 5  0.020  0.199  0.752
Method
Results - Real Value Prediction
Training / Evaluation
Predictor                     Train          Evaluated      Method
Ahmad et al. (2003)           Not Published  0.48           ANN
Yuan and Huang (2004)         Not Published  0.52           SVR
Nguyen and Rajapakse (2006)   Not Published  0.66           Two-Stage SVR
Dor and Zhou (2007)           0.738          Not Published  ANN
NetSurfP                      0.722          0.70           ANN
NetSurfP
/usr/cbs/bio/src/NetSurfP/NetSurfP -h
NetSurfP Output
Protein Disorder
Introduction to DisEMBL, IUPred & FoldUnfold
Protein Folding
Initially formed structure is in molten globule state (ensemble).
Molten globule condenses to native fold via transition state.
Figure: energy (E) diagram showing ΔG, with the states labelled U, T and F
U: Unfolded state, ensemble
T: Transition state(s), one or more narrow ensembles
F: Native fold, one structure
Degrees of Structure
Structures of Unstructured Regions
Estimate: 20% of all proteins contain unstructured regions.
– 1% of structures in PDB contain unstructured regions.
Structural genomics
– Special structural genomics projects
– Selection and modification of targets
– Prediction of crystallisable domains
Figure: Protein disorder publications in PubMed (Iakoucheva & Dunker, Structure 2003)
What’s the Fuss About?
Properties of Disordered Regions
– Flexible, i.e. adaptable
– Accessible
• Contain Extended Linear Motifs (ELM)
– Different behaviour in interaction interfaces
• Very adaptable
• Many hydrophobic interactions (close packing)
– No fixed structure without interaction partner
– Folding upon binding
DisEMBL
Basic notion
– No consensus on protein disorder definition.
– Defines three types of disorder
The method
– ANN-based
Disorder definitions
– Loop/Coil (DSSP-assigned residues: T, S, B, I)
– Hot loops (high B-factor)
– Missing residues (in X-ray structures, “Remark 465”)
Linding et al. Structure 2003
DisEMBL Derived propensity scale (implicit)
DisEMBL Output Ero1-Lα
IUPred
Basic notion:
– Globular proteins need to make a large number of inter-residue interactions to overcome the loss of entropy upon folding.
The method
– 20 x 20 energy predictor matrix (pairwise interactions).
• Derived from globular proteins.
– Quadratic expression in amino acid composition.
Definitions
– Binary definition: Order/disorder
– Two ranges:
• Long ~ regions/domains
• Short ~ loops
– Domain prediction (inverse of long range predictions).
Dosztányi et al. Bioinformatics 2005
IUPred Output Ero1-Lα
Position  Residue  Disorder Tendency
1         E        0.5055
2         E        0.3740
3         Q        0.1731
4         P        0.2164
5         P        0.1852
…
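A per-residue score table like this is usually turned into an order/disorder call with a cutoff. The sketch below reads the five rows shown on the slide and flags residues above 0.5, the cutoff conventionally used with IUPred scores. The cutoff is not stated on the slide, so treat it as an assumption, and the function names are illustrative:

```python
# The five rows visible on the slide: position, residue, disorder tendency.
ROWS = """\
1 E 0.5055
2 E 0.3740
3 Q 0.1731
4 P 0.2164
5 P 0.1852
"""


def parse_scores(text):
    """Return (position, residue, score) tuples, skipping other lines."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        records.append((int(fields[0]), fields[1], float(fields[2])))
    return records


def disordered_positions(records, cutoff=0.5):
    """Positions whose disorder tendency exceeds the (assumed) cutoff."""
    return [position for position, _, score in records if score > cutoff]


print(disordered_positions(parse_scores(ROWS)))
```

With these five rows only the first residue exceeds the cutoff; in practice isolated single-residue calls are usually read in the context of their neighbours.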
FoldUnfold
Basic notion
– Globular proteins need to establish a high number of interactions to compensate for the loss of entropy during the folding process.
The method
– Mean packing density
• Derived from globular proteins.
– ANN-based.
Definitions
– Binary definition: Order/disorder
– Two ranges:
• Long ~ regions/domains
• Short ~ loops
Galzitskaya et al. Bioinformatics 2006 & Protein Science 2000
FoldUnfold Output
Ero1-Lα
disordered: 77-99
disordered: 110-126
disordered: 135-152
disordered: 196-207
disordered: 341-351
Comparison
Disordered residues: 77-99, 110-126, 135-152, 196-207, 341-351
DisEMBL
IUPred
FoldUnfold
Ero1 example
Links
DisEMBL:
– http://dis.embl.de/
IUPred:
– http://iupred.enzim.hu/
FoldUnfold:
– http://skuld.protres.ru/~mlobanov/ogu/
Exercise
http://xray.bmc.uu.se/gerard/embo2001/predic/index.html (Steps 1-5)