Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)
![Page 1: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/1.jpg)
Protein Folding
Bioinformatics Ch 7
(with a little of Ch 8)
![Page 2: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/2.jpg)
The Protein Folding Problem
• Central question of molecular biology: “Given a particular sequence of amino acid residues (primary structure), what will the tertiary/quaternary structure of the resulting protein be?”
• Input: AAVIKYGCAL…
• Output: φ1ψ1, φ2ψ2, … = backbone conformation (no side chains yet)
![Page 3: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/3.jpg)
Disulfide Bonds
• Two cysteines in close proximity can form a covalent bond.
• This bond is called a disulfide bond, disulfide bridge, or dicysteine bond.
• Significantly stabilizes tertiary structure.
![Page 4: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/4.jpg)
Protein Folding – Biological perspective
• “Central dogma”: Sequence specifies structure
• Denature – to “unfold” a protein back to random coil configuration
  – β-mercaptoethanol – breaks disulfide bonds
  – Urea or guanidine hydrochloride – denaturant
  – Also heat or pH
• Anfinsen’s experiments
  – Denatured ribonuclease
  – Spontaneously regained enzymatic activity
  – Evidence that it re-folded to native conformation
![Page 5: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/5.jpg)
Folding intermediates
• Levinthal’s paradox
  – Consider a 100-residue protein. If each residue can take only 3 positions, there are 3^100 ≈ 5 × 10^47 possible conformations.
  – If it takes 10^-13 s to convert from one structure to another, exhaustive search would take 1.6 × 10^27 years!
• Folding must proceed by progressive stabilization of intermediates
  – Molten globules – most secondary structure formed, but much less compact than “native” conformation.
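The Levinthal numbers above can be checked with a few lines of arithmetic. This is only a sketch of the slide's estimate; the variable names and the choice of a 365.25-day year are assumptions made for illustration.

```python
# Levinthal estimate: 100 residues, 3 positions each,
# 1e-13 s to convert from one structure to another.
residues = 100
states_per_residue = 3
seconds_per_step = 1e-13
seconds_per_year = 365.25 * 24 * 3600  # assumption: 365.25-day year

conformations = states_per_residue ** residues
search_years = conformations * seconds_per_step / seconds_per_year

print(f"conformations     ~ {conformations:.1e}")
print(f"exhaustive search ~ {search_years:.1e} years")
```

The results come out near 5 × 10^47 conformations and 1.6 × 10^27 years, matching the figures on the slide.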
![Page 6: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/6.jpg)
Forces driving protein folding
• It is believed that hydrophobic collapse is a key driving force for protein folding
  – Hydrophobic core
  – Polar surface interacting with solvent
• Minimum volume (no cavities)
• Disulfide bond formation stabilizes
• Hydrogen bonds
• Polar and electrostatic interactions
![Page 7: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/7.jpg)
Folding help
• Proteins are, in fact, only marginally stable
  – Native state is typically only 5 to 10 kcal/mole more stable than the unfolded form
• Many proteins help in folding
  – Protein disulfide isomerase – catalyzes shuffling of disulfide bonds
  – Chaperones – break up aggregates and (in theory) unfold misfolded proteins
![Page 8: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/8.jpg)
The Hydrophobic Core
• Hemoglobin A is the protein in red blood cells (erythrocytes) responsible for binding oxygen.
• The mutation E6V in the β chain places a hydrophobic Val on the surface of hemoglobin.
• The resulting “sticky patch” causes hemoglobin S to agglutinate (stick together) and form fibers which deform the red blood cell and do not carry oxygen efficiently
• Sickle cell anemia was the first identified molecular disease
![Page 9: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/9.jpg)
Sickle Cell Anemia
Sequestering hydrophobic residues in the protein core protects proteins from hydrophobic agglutination.
![Page 10: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/10.jpg)
Computational Problems in Protein Folding
• Two key questions:
  – Evaluation – how can we tell a correctly folded protein from an incorrectly folded protein?
    • H-bonds, electrostatics, hydrophobic effect, etc.
    • Derive a function, see how well it does on “real” proteins
  – Optimization – once we get an evaluation function, can we optimize it?
    • Simulated annealing/Monte Carlo
    • EC (evolutionary computing)
    • Heuristics
![Page 11: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/11.jpg)
Fold Optimization
• Simple lattice models (HP-models)
  – Two types of residues: hydrophobic and polar
  – 2-D or 3-D lattice
  – The only force is hydrophobic collapse
  – Score = number of HH contacts
![Page 12: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/12.jpg)
Scoring Lattice Models
• H/P model scoring: count noncovalent hydrophobic interactions.
• Sometimes:
  – Penalize for buried polar or surface hydrophobic residues
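A minimal sketch of the basic H/P score on a 2-D square lattice, assuming a fold is given as a list of (x, y) coordinates, one per residue. The function name `hp_score` and this input format are illustrative choices, not part of the slides; the optional burial penalties are not included.

```python
def hp_score(sequence, coords):
    """Count H-H contacts between residues that sit on neighbouring
    lattice sites but are not adjacent in the chain (noncovalent)."""
    if len(sequence) != len(coords):
        raise ValueError("need one coordinate per residue")
    index_at = {site: i for i, site in enumerate(coords)}
    if len(index_at) != len(coords):
        raise ValueError("two residues share a lattice site (bump)")

    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so each pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts


# HPPH folded into a square: the two H residues touch, so the score is 1.
print(hp_score("HPPH", [(0, 0), (0, 1), (1, 1), (1, 0)]))
# The same chain laid out straight has no contacts.
print(hp_score("HPPH", [(0, 0), (1, 0), (2, 0), (3, 0)]))
```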
![Page 13: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/13.jpg)
What can we do with lattice models?
• For smaller polypeptides, exhaustive search can be used
  – Looking at the “best” fold, even in such a simple model, can teach us interesting things about the protein folding process
• For larger chains, other optimization and search methods must be used
  – Greedy, branch and bound
  – Evolutionary computing, simulated annealing
  – Graph theoretical methods
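For short chains, the exhaustive search mentioned above can be sketched directly: try every string of U/D/L/R moves, discard the ones that revisit a site, and keep the best HH-contact count. The helper names (`decode`, `hh_contacts`, `best_fold`) are hypothetical, and the search deliberately ignores symmetry, so it visits 4^(n-1) move strings for n residues.

```python
from itertools import product

STEP = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def decode(moves):
    """Turn absolute moves into coordinates; None if a site is revisited."""
    x, y = 0, 0
    coords = [(x, y)]
    seen = {(x, y)}
    for move in moves:
        dx, dy = STEP[move]
        x, y = x + dx, y + dy
        if (x, y) in seen:
            return None
        seen.add((x, y))
        coords.append((x, y))
    return coords


def hh_contacts(sequence, coords):
    """Count H-H lattice neighbours that are not chain neighbours."""
    index_at = {site: i for i, site in enumerate(coords)}
    total = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                total += 1
    return total


def best_fold(sequence):
    """Exhaustively search all self-avoiding folds of a short HP chain."""
    best_score, best_moves = -1, None
    for moves in product("UDLR", repeat=len(sequence) - 1):
        coords = decode(moves)
        if coords is None:
            continue
        score = hh_contacts(sequence, coords)
        if score > best_score:
            best_score, best_moves = score, "".join(moves)
    return best_score, best_moves


print(best_fold("HHPPHH"))
```

Because the cost grows as 4^(n-1), this is only practical for chains of roughly a dozen residues, which is why the slide turns to heuristic search for larger chains.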
![Page 14: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/14.jpg)
Learning from Lattice Models
• The “hydrophobic zipper” effect (Ken Dill, ~1997)
![Page 15: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/15.jpg)
Representing a lattice model
• Absolute directions
  – UURRDLDRRU
• Relative directions
  – LFRFRRLLFL
  – Advantage: the immediate reversals that absolute notation allows (UD or RL) cannot occur
  – Only three directions: L, R, F
• What about bumps? LFRRR
  – Bad score
  – Use a better representation
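The relative representation can be decoded by tracking a heading as well as a position. The sketch below assumes the chain starts at (0, 0) facing "up" and that each symbol first turns and then steps one site; the slides do not fix these conventions, so treat them (and the function names) as illustrative.

```python
def decode_relative(moves, start=(0, 0), heading=(0, 1)):
    """Walk a string of L/F/R moves on a 2-D square lattice.

    Assumption: each symbol turns relative to the current heading,
    then advances one lattice site.
    """
    x, y = start
    dx, dy = heading
    coords = [(x, y)]
    for move in moves:
        if move == "L":
            dx, dy = -dy, dx      # rotate 90 degrees counter-clockwise
        elif move == "R":
            dx, dy = dy, -dx      # rotate 90 degrees clockwise
        elif move != "F":
            raise ValueError(f"unknown move {move!r}")
        x, y = x + dx, y + dy
        coords.append((x, y))
    return coords


def count_bumps(coords):
    """Number of steps that land on an already occupied site."""
    seen = set()
    bumps = 0
    for site in coords:
        if site in seen:
            bumps += 1
        seen.add(site)
    return bumps


print(count_bumps(decode_relative("LFRFRRLLFL")))  # slide example, no bumps
print(count_bumps(decode_relative("LFRRR")))       # three right turns close a square
```

Under these conventions LFRRR ends on the site occupied after the first move, which is the bump the slide refers to.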
![Page 16: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/16.jpg)
Preference-order representation
• Each position has two “preferences”
  – If it can’t have either of the two, it will take the “least favorite” path if possible
• Example: {LR},{FL},{RL},{FR},{RL},{RL},{FR},{RF}
• Can still cause bumps: {LF},{FR},{RL},{FL},{RL},{FL},{RF},{RL},{FL}
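One way to read the preference-order scheme is sketched below: at each position try the first preference, then the second, then whichever direction is left, and report failure only if all three sites are occupied. The start position, the initial "up" heading, and the function name are assumptions; the slides do not spell out the decoding details.

```python
TURN = {
    "L": lambda dx, dy: (-dy, dx),   # turn left
    "F": lambda dx, dy: (dx, dy),    # keep going forward
    "R": lambda dx, dy: (dy, -dx),   # turn right
}


def decode_preferences(prefs, start=(0, 0), heading=(0, 1)):
    """Decode a list of two-letter preferences such as ["LR", "FL"].

    Returns the list of coordinates, or None when every direction
    at some position is blocked (an unavoidable bump).
    """
    x, y = start
    dx, dy = heading
    coords = [(x, y)]
    occupied = {(x, y)}
    for first, second in prefs:
        # The "least favorite" move is whichever direction was not listed.
        third = next(t for t in "LFR" if t not in (first, second))
        for turn in (first, second, third):
            ndx, ndy = TURN[turn](dx, dy)
            site = (x + ndx, y + ndy)
            if site not in occupied:
                dx, dy = ndx, ndy
                x, y = site
                occupied.add(site)
                coords.append(site)
                break
        else:
            return None  # all three candidate sites are taken
    return coords


ok = decode_preferences(["LR", "FL", "RL", "FR", "RL", "RL", "FR", "RF"])
stuck = decode_preferences(["LF", "FR", "RL", "FL", "RL", "FL", "RF", "RL", "FL"])
print(ok)
print(stuck)
```

With these conventions the first slide example decodes to a bump-free chain (falling back to the least favorite move at its seventh position), while the second example walks into a dead end at its last position and returns `None`.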
![Page 17: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/17.jpg)
“Decoding” the representation
• The optimizer works on the representation, but to score, we have to “decode” into a structure that lets us check for bumps and score.
• Example: How many bumps in: URDDLLDRURU?
• We can do it on graph paper
  – Start at 0,0
  – Fill in the graph
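The graph-paper procedure translates directly into code: start at (0, 0), apply each move, and record every site visited. The sketch below counts a bump each time a move lands on an already occupied site; that counting rule and the function name are assumptions, since the slide does not define how bumps are tallied.

```python
STEP = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def count_bumps_absolute(moves):
    """Decode absolute moves from (0, 0) and count revisited sites."""
    x, y = 0, 0
    visited = {(x, y)}
    bumps = 0
    for move in moves:
        dx, dy = STEP[move]
        x, y = x + dx, y + dy
        if (x, y) in visited:
            bumps += 1
        else:
            visited.add((x, y))
    return bumps


print(count_bumps_absolute("URDDLLDRURU"))
```

For the slide's example the last three moves (U, R, U) retrace sites filled earlier, so this rule reports 3 bumps.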
![Page 18: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/18.jpg)
More realistic models
• Higher resolution lattices (45° lattice, etc.)
• Off-lattice models
  – Local moves
  – Optimization/search methods and representations
    • Greedy search
    • Branch and bound
    • EC, Monte Carlo, simulated annealing, etc.
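Of the search methods listed, simulated annealing is the easiest to sketch. The example below runs it on the simple 2-D HP lattice rather than on an off-lattice model, purely to keep the code short. The move set (re-randomizing one absolute direction), the geometric cooling schedule, and all parameter values are illustrative assumptions, not the method used in the slides.

```python
import math
import random

STEP = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def decode(moves):
    """Absolute moves to coordinates; None if the chain bumps into itself."""
    x, y = 0, 0
    coords = [(x, y)]
    seen = {(x, y)}
    for move in moves:
        dx, dy = STEP[move]
        x, y = x + dx, y + dy
        if (x, y) in seen:
            return None
        seen.add((x, y))
        coords.append((x, y))
    return coords


def hh_contacts(sequence, coords):
    """Count H-H lattice neighbours that are not chain neighbours."""
    index_at = {site: i for i, site in enumerate(coords)}
    total = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                total += 1
    return total


def anneal(sequence, steps=20000, t_start=2.0, t_end=0.05, seed=0):
    """Maximize HH contacts; needs a chain of at least two residues."""
    rng = random.Random(seed)
    moves = ["R"] * (len(sequence) - 1)   # start from a straight chain
    score = 0
    best_score, best_moves = score, "".join(moves)
    for k in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temperature = t_start * (t_end / t_start) ** (k / steps)
        candidate = moves[:]
        candidate[rng.randrange(len(candidate))] = rng.choice("UDLR")
        coords = decode(candidate)
        if coords is None:
            continue                      # reject conformations with bumps
        new_score = hh_contacts(sequence, coords)
        # Always accept improvements; accept worse folds with Boltzmann probability.
        if new_score >= score or rng.random() < math.exp((new_score - score) / temperature):
            moves, score = candidate, new_score
            if score > best_score:
                best_score, best_moves = score, "".join(moves)
    return best_score, best_moves


print(anneal("HPHPPHHPHPPHPHHPPHPH", steps=5000))
```

Unlike exhaustive search, this gives no guarantee of finding the optimum; the result depends on the seed, the schedule, and the move set.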
![Page 19: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/19.jpg)
Threading: Fold recognition
• Given:
  – Sequence: IVACIVSTEYDVMKAAR…
  – A database of molecular coordinates
• Map the sequence onto each fold
• Evaluate
  – Objective 1: improve scoring function
  – Objective 2: folding
![Page 20: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/20.jpg)
X-Ray Crystallography
~0.5mm
• The crystal is a mosaic of millions of copies of the protein.
• As much as 70% is solvent (water)!
• May take months (and a “green” thumb) to grow.
![Page 21: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/21.jpg)
X-Ray diffraction
• Image is averaged over:
  – Space (many copies)
  – Time (of the diffraction experiment)
![Page 22: Protein Folding Bioinformatics Ch 7 (with a little of Ch 8)](https://reader035.fdocuments.net/reader035/viewer/2022081518/5515d7c1550346cf6f8b493c/html5/thumbnails/22.jpg)
The Protein Data Bank
ATOM   1  N    ALA E  1   22.382  47.782  112.975  1.00  24.09  3APR 213
ATOM   2  CA   ALA E  1   22.957  47.648  111.613  1.00  22.40  3APR 214
ATOM   3  C    ALA E  1   23.572  46.251  111.545  1.00  21.32  3APR 215
ATOM   4  O    ALA E  1   23.948  45.688  112.603  1.00  21.54  3APR 216
ATOM   5  CB   ALA E  1   23.932  48.787  111.380  1.00  22.79  3APR 217
ATOM   6  N    GLY E  2   23.656  45.723  110.336  1.00  19.17  3APR 218
ATOM   7  CA   GLY E  2   24.216  44.393  110.087  1.00  17.35  3APR 219
ATOM   8  C    GLY E  2   25.653  44.308  110.579  1.00  16.49  3APR 220
ATOM   9  O    GLY E  2   26.258  45.296  110.994  1.00  15.35  3APR 221
ATOM  10  N    VAL E  3   26.213  43.110  110.521  1.00  16.21  3APR 222
ATOM  11  CA   VAL E  3   27.594  42.879  110.975  1.00  16.02  3APR 223
ATOM  12  C    VAL E  3   28.569  43.613  110.055  1.00  15.69  3APR 224
ATOM  13  O    VAL E  3   28.429  43.444  108.822  1.00  16.43  3APR 225
ATOM  14  CB   VAL E  3   27.834  41.363  110.979  1.00  16.66  3APR 226
ATOM  15  CG1  VAL E  3   29.259  41.013  111.404  1.00  17.35  3APR 227
ATOM  16  CG2  VAL E  3   26.811  40.649  111.850  1.00  17.03  3APR 228
• http://www.rcsb.org/pdb/
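As a rough illustration of how the ATOM records above can be read, the sketch below parses three of the Cα lines and measures the distance between consecutive Cα atoms. It assumes the fields are, in order: record type, atom serial, atom name, residue name, chain ID, residue number, x, y, z, occupancy, and temperature factor. Splitting on whitespace works for this excerpt as printed, but real PDB files are fixed-column and fields can run together, so a production parser should slice by column instead. The function name `parse_atom` is hypothetical.

```python
import math

# Three C-alpha records copied from the excerpt (trailing ID columns dropped).
RECORDS = """\
ATOM 2 CA ALA E 1 22.957 47.648 111.613 1.00 22.40
ATOM 7 CA GLY E 2 24.216 44.393 110.087 1.00 17.35
ATOM 11 CA VAL E 3 27.594 42.879 110.975 1.00 16.02
"""


def parse_atom(line):
    """Parse one whitespace-separated ATOM line (assumed field order)."""
    f = line.split()
    return {
        "serial": int(f[1]),
        "name": f[2],
        "res_name": f[3],
        "chain": f[4],
        "res_seq": int(f[5]),
        "xyz": (float(f[6]), float(f[7]), float(f[8])),
        "occupancy": float(f[9]),
        "b_factor": float(f[10]),
    }


atoms = [parse_atom(line) for line in RECORDS.splitlines() if line.startswith("ATOM")]
ca_xyz = [a["xyz"] for a in atoms if a["name"] == "CA"]

# Distance in angstroms between each pair of consecutive C-alpha atoms.
ca_distances = [math.dist(p, q) for p, q in zip(ca_xyz, ca_xyz[1:])]
for d in ca_distances:
    print(f"{d:.2f}")
```

Both distances come out close to 3.8 Å, the usual spacing between neighbouring Cα atoms, which is a quick sanity check that the coordinate columns were read correctly.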