Protein Structure Prediction [Based on Structural Bioinformatics, section VII]
Date posted: 21-Dec-2015
![Page 1: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/1.jpg)
Protein Structure Prediction
[Based on Structural Bioinformatics, section VII]
![Page 2: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/2.jpg)
Predicting protein 3D structure
Goal: 3D structure from 1D sequence
What kind of fold may the given sequence adopt?
• An existing fold: fold recognition, comparative modeling
• A new fold: ab initio
![Page 3: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/3.jpg)
Measuring progress
CASP – Critical Assessment of Structure Prediction
CAFASP – Critical Assessment of Fully Automated Structure Prediction
Targets: unpublished NMR or X-ray structures
Goal: predict the target 3D structure and submit it for independent and comparative review
![Page 4: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/4.jpg)
What Forces Hold the Structure?
Hydrogen Bonds
![Page 5: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/5.jpg)
What Forces Hold the Structure?
• Charge-charge interactions
  • Positively charged groups prefer to be situated against negatively charged groups
• Hydrophobic effect
![Page 6: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/6.jpg)
What Forces Hold the Structure?
Disulfide bonds: S-S bonds between cysteine residues
![Page 7: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/7.jpg)
Homology modeling
Based on two major observations:
1. The structure of a protein is uniquely defined by its amino acid sequence.
2. Similar sequences adopt practically identical structures; even distantly related sequences still fold into similar structures.
![Page 8: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/8.jpg)
Growth of the Protein Data Bank
![Page 9: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/9.jpg)
Fraction of New Folds
![Page 10: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/10.jpg)
[Rost, Protein Eng. 1999]
Two zones of sequence alignment
![Page 11: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/11.jpg)
The 7 steps to homology modeling
1. Template recognition and initial alignment
― BLAST, FASTA
2. Alignment correction
― Better alignment, MSA
![Page 12: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/12.jpg)
The 7 steps to homology modeling
3. Backbone generation
― Copy backbone atoms [and side-chains of conserved residues]
4. Loop modeling
― Knowledge based
― Energy based
![Page 13: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/13.jpg)
The 7 steps to homology modeling
5. Side-chain modeling
― Rotamer: a low-energy side-chain conformation
― Rotamer library [backbone independent, backbone dependent]
― HUGE search space [~5^N]
― High accuracy for residues in the hydrophobic core [90%], much lower for residues on the surface [50%]
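To get a feel for the size of the rotamer search space, here is a minimal sketch. It assumes roughly five rotamers per residue (read off the ~5^N estimate on the slide) and that every residue chooses its rotamer independently; the function name is hypothetical.

```python
def rotamer_combinations(n_residues: int, rotamers_per_residue: int = 5) -> int:
    """Count side-chain assignments when each residue picks one rotamer independently."""
    return rotamers_per_residue ** n_residues


# The count grows exponentially with chain length, so exhaustive search is hopeless.
for n in (10, 50, 100):
    print(f"N = {n:3d}: {rotamer_combinations(n):.3e} combinations")
```

Even a 100-residue protein gives a number far beyond what can be enumerated, which is why rotamer libraries are combined with heuristic search.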
![Page 14: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/14.jpg)
The 7 steps to homology modeling
6. Model optimization
― Predict the side-chains, then the resulting shifts in the backbone, then the rotamers for the new backbone …
7. Model validation
― Calculating the model’s energy
― Determination of normality indices:
  ― bond lengths, bond and torsion angles
  ― Inside/outside distribution of polar residues
  ― Radial distribution function
![Page 15: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/15.jpg)
Predicting protein 3D structure
Goal: 3D structure from 1D sequence
What kind of fold may the given sequence adopt?
• An existing fold: fold recognition, comparative modeling
• A new fold: ab initio
![Page 16: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/16.jpg)
Fold recognition
Which of the known folds is likely to be similar to the (unknown) fold of a new protein when only its amino-acid sequence is known?
![Page 17: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/17.jpg)
Fraction of new folds (PDB new entries in 1998)
Koppensteiner et al., 2000, JMB 296:1139-1152.
![Page 18: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/18.jpg)
Unrelated proteins adopt similar folds
Only 100 folds account for ~50% of all protein superfamilies
Possible explanations:
1. Divergent evolution
2. Convergent evolution
3. Limited number of folds
4. Misguided analysis
![Page 19: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/19.jpg)
Proteins as seen by a Biologist
Does a new protein sequence belong to a given family of proteins (with a specific set of mutation rules)?
Fold recognition is based on:
• Sequence alignment, multiple sequence alignment
• Profile HMM, PSI-BLAST
![Page 20: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/20.jpg)
Proteins as seen by a Physicist
“Thermodynamic hypothesis”: The native conformation of a protein corresponds to a global free energy minimum of the system (protein + solvent)
Naïve approach: having a correct energy function, search for the native structure in the conformational space
![Page 21: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/21.jpg)
Threading
Threading: energy based fold recognition
Define:
1. Protein model and interaction description
2. Alignment algorithm
3. Energy parameterization
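[Figure: residues (A, C, D, E) placed at numbered template positions 1-10]

Pairwise energy matrix, indexed by residue types a and b:

| a \ b | A | C | D | E | … |
|---|---|---|---|---|---|
| A | -3 | -1 | 0 | 0 | … |
| C | -1 | -4 | 1 | 2 | … |
| D | 0 | 1 | 5 | 6 | … |
| E | 0 | 2 | 6 | 7 | … |
| … | … | … | … | … | … |

Total energy: $E = \sum_{i,j \,\in\, \text{positions}} E_{a_i b_j}$, where $a_i$ and $b_j$ are the residue types at positions $i$ and $j$.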
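The pairwise sum can be sketched in a few lines. The energy values are the A/C/D/E entries from the slide's matrix; the threaded sequence and the contact list are hypothetical, chosen only to illustrate the bookkeeping, and the function names are not from any threading package.

```python
# Symmetric pairwise energies for residue types A, C, D, E (values from the slide's matrix).
PAIR_ENERGY = {
    ("A", "A"): -3, ("A", "C"): -1, ("A", "D"): 0, ("A", "E"): 0,
    ("C", "C"): -4, ("C", "D"): 1, ("C", "E"): 2,
    ("D", "D"): 5, ("D", "E"): 6,
    ("E", "E"): 7,
}


def pair_energy(a, b):
    """Look up the energy for residue types a and b, in either order."""
    return PAIR_ENERGY[(a, b)] if (a, b) in PAIR_ENERGY else PAIR_ENERGY[(b, a)]


def threading_energy(residues, contacts):
    """Sum pair energies over the template's contacting positions (0-based i, j)."""
    return sum(pair_energy(residues[i], residues[j]) for i, j in contacts)


# Hypothetical example: five residues threaded onto a template with three contacts.
residues = "ACCEC"
contacts = [(0, 2), (1, 4), (2, 3)]
print(threading_energy(residues, contacts))  # A-C + C-C + C-E = -1 - 4 + 2 = -3
```

Restricting the sum to contacting positions is a modeling assumption here; the slide only says that the sum runs over positions i, j.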
![Page 22: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/22.jpg)
Fold recognition (threading)
Find best fold for a protein sequence:
Query sequence: MAHFPGFGQSLLFGYPVYVFGD...
Potential folds: 1) ... 56) ... n)
Scores: -10 ... -123 ... 20.5
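Picking the best fold then amounts to taking the lowest score over the fold library. A minimal sketch, assuming that lower energy means a better fit and that the three scores on the slide belong to folds 1, 56 and n in that order:

```python
def best_fold(scores):
    """Return (fold_id, energy) for the fold with the lowest threading energy."""
    fold_id = min(scores, key=scores.get)
    return fold_id, scores[fold_id]


# Scores as shown on the slide; the pairing of fold labels with scores is an assumption.
scores = {"1": -10.0, "56": -123.0, "n": 20.5}
print(best_fold(scores))
```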
![Page 23: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/23.jpg)
GenTHREADER (Jones, 1999, JMB 287:797-815)
For each template:
• provide an MSA
• align the query sequence with the MSA
• assess the alignment by sequence alignment score
• assess the alignment by pairwise potentials
• assess the alignment by solvation function
• record lengths of: alignment, query, template
![Page 24: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/24.jpg)
Essentials of GenTHREADER
![Page 25: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/25.jpg)
Predicting protein 3D structure
Goal: 3D structure from 1D sequence
What kind of fold may the given sequence adopt?
• An existing fold: fold recognition, comparative modeling
• A new fold: ab initio
![Page 26: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/26.jpg)
Ab-initio folding
Goal: Predict structure from “first principles”
Requires:
• A free energy function, sufficiently close to the “true potential”
• A method for searching the conformational space
Benefits:
• Works for novel folds
• Shows that we understand the process
![Page 27: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/27.jpg)
Ab-initio folding – the challenge
1. Current potential functions have limited accuracy
2. The conformational space is HUGE
Possible simplifications:
• Reduced representation
• Simplified potentials
• Coarse search strategies
![Page 28: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/28.jpg)
Representation
Detailed representation: includes all atoms of the protein and the surrounding solvent, which is computationally expensive
Simplifications:
• Implicit solvent models
• United atom representation
• Side-chain as centroid or Cα
• Restricted side-chain configurations (rotamers)
• Restricted backbone torsion angles
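As an illustration of the "side-chain as centroid" idea, the sketch below replaces a set of side-chain atoms by their mean position. The coordinates are hypothetical and the function name is not taken from any package.

```python
def side_chain_centroid(atom_coords):
    """Mean position of the side-chain atoms; one point stands in for the whole side chain."""
    n = len(atom_coords)
    return tuple(sum(atom[k] for atom in atom_coords) / n for k in range(3))


# Hypothetical coordinates (in Å) for three side-chain atoms.
atoms = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 3.0, 0.0)]
print(side_chain_centroid(atoms))
```

Each residue then contributes one interaction site instead of several atoms, which shrinks both the energy evaluation and the search space.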
![Page 29: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/29.jpg)
Rosetta [Simons et al. 1997]
• “Structural” signatures recur within protein structures
• Use these as cues during structure search
I-sites Library: a catalog of local sequence-structure correlations
Examples: serine hairpin, type-I hairpin, frayed helix
![Page 30: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/30.jpg)
Fragment insertion Monte Carlo
Rosetta: a folding simulation program
[Flowchart: choose a fragment (from the fragments library) → change backbone torsion angles → convert to 3D → evaluate (energy function) → accept or reject → repeat]
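The loop in the flowchart can be sketched as follows. This is a toy under stated assumptions, not Rosetta's implementation: each residue is reduced to a single torsion angle, the "convert to 3D" step is skipped because the toy energy acts on the angles directly, acceptance uses the Metropolis criterion, and the fragment library and target angle are hypothetical.

```python
import math
import random


def toy_energy(angles, target=-60.0):
    """Toy energy: squared deviation of every torsion angle from a target value."""
    return sum((a - target) ** 2 for a in angles)


def fragment_insertion_mc(angles, fragments, energy, steps=1000, kT=1.0, seed=0):
    """Repeatedly insert a random fragment at a random position; accept by Metropolis."""
    rng = random.Random(seed)
    current = list(angles)
    e_cur = energy(current)
    for _ in range(steps):
        frag = rng.choice(fragments)                        # choose a fragment
        start = rng.randrange(len(current) - len(frag) + 1)
        trial = current[:start] + list(frag) + current[start + len(frag):]  # change angles
        e_new = energy(trial)                               # evaluate
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if e_new <= e_cur or rng.random() < math.exp(-(e_new - e_cur) / kT):
            current, e_cur = trial, e_new
    return current, e_cur


# Hypothetical three-residue fragments, each a run of torsion angles in degrees.
FRAGMENTS = [(-60.0, -60.0, -60.0), (-120.0, -120.0, -120.0), (60.0, 60.0, 60.0)]

start = [180.0] * 9
final, e_final = fragment_insertion_mc(start, FRAGMENTS, toy_energy, steps=500, seed=1)
print(toy_energy(start), e_final)
```

A real folding program would build 3D coordinates from the torsion angles before scoring; the toy energy stands in for that step.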
![Page 31: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/31.jpg)
Potential functions
• Molecular mechanics: models the forces that determine protein conformation
  • Van der Waals: Lennard-Jones 12-6
  • Electrostatic: Coulomb’s law
• Scoring functions: empirically derived from solved structures
  • Useful with reduced complexity models
  • Useful in treating aspects of protein thermodynamics
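For reference, the two molecular mechanics terms named above can be written out directly. This is a sketch using the standard textbook forms; the parameter names, the unit choice (kcal/mol, Å, elementary charges) and the constant dielectric are assumptions, not taken from the slides.

```python
def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy: 4 * epsilon * ((sigma / r)**12 - (sigma / r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Conversion factor giving kcal/mol for charges in e and distances in Å (assumed units).
COULOMB_CONSTANT = 332.06


def coulomb(r, q1, q2, dielectric=1.0):
    """Coulomb energy between two point charges separated by distance r."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


# The Lennard-Jones term is zero at r = sigma and reaches -epsilon at r = 2**(1/6) * sigma.
print(lennard_jones(1.0, epsilon=1.0, sigma=1.0))
print(lennard_jones(2 ** (1 / 6), epsilon=1.0, sigma=1.0))
print(coulomb(3.0, q1=1.0, q2=-1.0))
```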
![Page 32: . Protein Structure Prediction [Based on Structural Bioinformatics, section VII]](https://reader030.fdocuments.net/reader030/viewer/2022032521/56649d575503460f94a35877/html5/thumbnails/32.jpg)
Search methods
• Molecular dynamics: simulates the motion of a molecule in a given potential
  • Impractical …
• Coarse sampling of energy landscape:
  • Simulated annealing, genetic algorithms, …
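A minimal simulated annealing sketch on a one-dimensional toy landscape shows the idea of coarse sampling. The energy function, the step size and the geometric cooling schedule are all assumptions chosen for illustration.

```python
import math
import random


def rugged_energy(x):
    """Toy landscape: a parabola with superimposed ripples that create local minima."""
    return x * x + 3.0 * math.sin(5.0 * x)


def simulated_annealing(energy, x0, step=0.5, t_start=5.0, t_end=0.01,
                        n_steps=5000, seed=0):
    """Metropolis sampling while the temperature is lowered geometrically."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for k in range(n_steps):
        t = t_start * (t_end / t_start) ** (k / (n_steps - 1))  # cooling schedule
        x_new = x + rng.uniform(-step, step)                    # random local move
        e_new = energy(x_new)
        # Uphill moves are accepted less and less often as the temperature drops.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


best_x, best_e = simulated_annealing(rugged_energy, x0=4.0)
print(best_x, best_e)
```

At high temperature the walker crosses barriers between local minima; as the temperature falls it settles into a low-energy basin, with no guarantee that it is the global minimum.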