Introductory biophysics A. Y. 2017-18 - Istituto …milotti/Didattica/...Edoardo Milotti...

Introductory biophysics A. Y. 2017-18 9. Proteins and their structure Edoardo Milotti Dipartimento di Fisica, Università di Trieste


Introductory biophysics, A. Y. 2017-18

9. Proteins and their structure

Edoardo Milotti, Dipartimento di Fisica, Università di Trieste

Edoardo Milotti - Introductory biophysics - A.Y. 2017-18

Lysozyme, a simple protein enzyme

from D. C. Phillips, PNAS 57 (1967) 483

Attacking Bacteria

Lysozyme protects us from the ever-present danger of bacterial infection. It is a small enzyme that attacks the protective cell walls of bacteria.

Bacteria build a tough skin of carbohydrate chains, interlocked by short peptide strands, that braces their delicate membrane against the cell's high osmotic pressure. Lysozyme breaks these carbohydrate chains, destroying the structural integrity of the cell wall. The bacteria burst under their own internal osmotic pressure.

The First Antibiotic

Alexander Fleming discovered lysozyme during a deliberate search for medical antibiotics. Over a period of years, he added everything that he could think of to bacterial cultures, looking for anything that would slow their growth.

He discovered lysozyme by chance. One day, when he had a cold, he added a drop of mucus to the culture and, much to his surprise, it killed the bacteria. He had discovered one of our own natural defenses against infection. Unfortunately, lysozyme is a large molecule that is not particularly useful as a drug. It can be applied topically, but cannot rid the entire body of disease, because it is too large to travel between cells. Fortunately, Fleming continued his search, finding a true antibiotic drug five years later: penicillin.

(from http://www.rcsb.org/pdb/101/motm.do?momID=9 )


Proteins are ubiquitous and carry out many functions in living organisms.

The lowest-level view of proteins is that they are linear heteropolymers. The individual monomers in these linear chains are amino acids.

from D. Voet and J. G. Voet, “Biochemistry, 4th ed.”, Wiley 2011

carboxyl group, proton donor (acid)

amino group, proton acceptor (base)

side-chain

In chemistry, zwitterions act at the same time as acids and bases

Aminoacids

from D. Voet and J. G. Voet, “Biochemistry, 4th ed.”, Wiley 2011

Peptide bond

When an amino acid takes part in two peptide bonds it has effectively lost the equivalent of one water molecule, and the whole structure is referred to as the “amino acid residue”.

Therefore its molecular weight in the peptide chain corresponds to the molecular weight of the free amino acid minus the molecular weight of water.

The R group is called the “side chain”.

from D. Voet and J. G. Voet, “Biochemistry, 4th ed.”, Wiley 2011

Mass is often given in daltons (D); 1 D = 1 atomic mass unit.

The residue masses are given for the neutral residues. For molecular masses of the parent amino acids, add 18.0 D, the molecular mass of H2O, to the residue masses.

For side chain masses, subtract 56.0 D, the formula mass of a peptide group, from the residue masses.
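The mass bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the residue masses in the dictionary are commonly tabulated average values and are an assumption here (the table from the slide is not reproduced), and the function names are hypothetical. The constants 18.0 D and 56.0 D are the ones quoted in the text.

```python
# Average residue masses in daltons for a few amino acids (one-letter codes).
# Assumption: commonly tabulated values, not copied from the slide's table.
RESIDUE_MASS = {
    "G": 57.05,   # glycine
    "A": 71.08,   # alanine
    "S": 87.08,   # serine
    "V": 99.13,   # valine
    "L": 113.16,  # leucine
    "K": 128.17,  # lysine
    "E": 129.12,  # glutamate
}

WATER_MASS = 18.0          # D, molecular mass of H2O (quoted in the text)
PEPTIDE_GROUP_MASS = 56.0  # D, formula mass of a peptide group (quoted in the text)


def free_amino_acid_mass(code: str) -> float:
    """Mass of the parent amino acid: residue mass plus one water."""
    return RESIDUE_MASS[code] + WATER_MASS


def side_chain_mass(code: str) -> float:
    """Mass of the side chain: residue mass minus the peptide group."""
    return RESIDUE_MASS[code] - PEPTIDE_GROUP_MASS


def peptide_mass(sequence: str) -> float:
    """Mass of a linear peptide: sum of residue masses plus one water
    (the H and OH that remain on the two chain ends)."""
    return sum(RESIDUE_MASS[code] for code in sequence) + WATER_MASS


print(round(free_amino_acid_mass("G"), 2))  # free glycine
print(round(side_chain_mass("A"), 2))       # methyl side chain of alanine
print(round(peptide_mass("GAG"), 2))        # tripeptide Gly-Ala-Gly
```

Only one water is added for the whole peptide, because every internal peptide bond has already released one water molecule.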

from D. Voet and J. G. Voet, “Biochemistry, 4th ed.”, Wiley 2011

+ Xaa = unknown


Disulfide bond

from D. Voet and J. G. Voet, “Biochemistry, 4th ed.”, Wiley 2011

from D. Voet and J. G. Voet, “Biochemistry, 4th ed.”, Wiley 2011

Proteins as polypeptide chains

Note that DNA codes for the linear chain, but post-translational modifications produce the final chemical configuration of the protein.

Newly synthesized polypeptides in the Endoplasmic Reticulum undergo five principal modifications before they reach their final destinations:

1. Formation of disulfide bonds (in the rough ER)
2. Proper folding (in the rough ER)
3. Addition and processing of carbohydrates (Golgi apparatus and ER)
4. Specific proteolytic cleavages (Golgi apparatus and ER)
5. Assembly into multimeric proteins (in the rough ER)

Figure legend: 1. Nucleus; 2. Nuclear pore; 3. Rough endoplasmic reticulum (RER); 4. Smooth endoplasmic reticulum (SER); 5. Ribosome on the rough ER; 6. Proteins that are transported; 7. Transport vesicle; 8. Golgi apparatus; 9. Cis face of the Golgi apparatus; 10. Trans face of the Golgi apparatus; 11. Cisternae of the Golgi apparatus

(from https://en.wikipedia.org/wiki/Endoplasmic_reticulum)


The Golgi apparatus

The Endoplasmic Reticulum


Lysozyme

from D. C. Phillips, PNAS 57 (1967) 483

Folding of the hemagglutinin (HA) precursor polypeptide HA0 and formation of an HA0 trimer within the ER.

How do proteins fold into their final shape? The ER has an important role in this process.

Many reduced, denatured proteins can spontaneously refold into their native state in vitro. In most cases such refolding requires hours to reach completion, yet new secretory proteins generally fold into their proper conformation in the ER lumen within minutes after their synthesis. The ER contains several proteins that accelerate the folding of newly synthesized proteins within the ER lumen. Protein disulfide isomerase (PDI) is one such folding catalyst; the chaperone Hsc70 is another. Like cytosolic Hsc70, this ER chaperone transiently binds to proteins and prevents them from misfolding or forming aggregates, thereby enhancing their ability to fold into the proper conformation. Two other ER proteins, the homologous lectins calnexin and calreticulin, bind to certain carbohydrates attached to newly made proteins and aid in protein folding. (from Molecular Cell Biology, 4th edition. Lodish H, Berk A, Zipursky SL, et al. New York: W. H. Freeman; 2000.)

Modified residues (edited from http://www.uniprot.org/help/mod_res )

Chemical modifications include phosphorylation, methylation, acetylation, amidation, formation of pyrrolidone carboxylic acid, isomerization, hydroxylation, sulfation, flavin-binding, cysteine oxidation and nitrosylation. Here we consider only the three most common modifications.

1. Phosphorylation

Phosphorylation refers to the transfer of a phosphate group to an amino acid. It is a key mechanism for signaling in both eukaryotic and prokaryotic cells. It can occur on a number of cytoplasmic and nuclear residues, i.e. on the hydroxyl group of serine, threonine or tyrosine, on the nitrogen of arginine, histidine or lysine, on the carboxyl group of aspartate, or on the sulfhydryl group of cysteine.

2. Methylation

Cytoplasmic and nuclear proteins can be enzymatically modified in several ways by the addition of methyl groups. Methylation reactions occurring on carboxyl groups can be reversible and modulate the activity of the target protein, while those on nitrogen atoms at the N-terminus and on side-chains are usually irreversible.

Phosphate group

Methyl group


3. Acetylation

N-terminal acetylation

N-terminal acetylation is one of the most common post-translational modifications in eukaryotes, but it is rare in prokaryotes. It refers to the addition of an acetyl group to the alpha-amino group of the first residue of a protein, often after the cleavage of the initiator methionine. The most commonly acetylated residues are glycine, alanine, serine or threonine. This reaction occurs in the cytosol. Methionine residues can also be modified if the next residue is an aspartate, glutamate, leucine, isoleucine, tryptophan, phenylalanine or asparagine residue. Note that the modified position may not correspond to the first amino acid of the displayed sequence if N-terminal acetylation occurs after proteolytic processing of the chain.

Internal acetylation

Internal acetylation is the addition of an acetyl group to the side-chain (ε) amino group of a lysine residue. In eukaryotes, it generally takes place in the nucleus and affects mainly, but not exclusively, histones. It also occurs in prokaryotes. Lysine acetylation can compete with methylation on the same residue, in which case both modifications are described as ‘alternate’.

Acetyl group

from P. Echenique, “Introduction to protein folding for physicists”, Contemp. Phys. 48 (2007) 81

from P. Echenique, “Introduction to protein folding for physicists”, Contemp. Phys. 48 (2007) 81

(Note: Xaa = unspecified aminoacid)

from P. Echenique, “Introduction to protein folding for physicists”, Contemp. Phys. 48 (2007) 81

from D. Voet and J. G. Voet, “Biochemistry, 4th ed.”, Wiley 2011

The angle ω is usually 180°, because of the partial double-bond character of the peptide bond

Angle distributions for 81,234 non-Gly, non-Pro, non-prePro residues, with backbone B-factor < 30 from the 500-structure high-resolution database, along with validation contours for favored and allowed regions. From S. C. Lovell et al., “Structure Validation by Cα Geometry: phi, psi and Cβ Deviation”, Proteins 50 (2003) 437

Ramachandran plot

Beta sheets

Alpha helices
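As a rough illustration of how the Ramachandran plot is read, the sketch below assigns a label to a (phi, psi) pair. The rectangular regions are a deliberately crude assumption made for this example (the validated contours of Lovell et al. are far more detailed), and the function name is hypothetical. The test angles are typical textbook values: about (−57°, −47°) for a right-handed alpha helix and about (−120°, +130°) for a beta strand.

```python
def classify_backbone(phi_deg: float, psi_deg: float) -> str:
    """Assign a rough secondary-structure label from backbone dihedral angles.

    Assumption: simple rectangular boxes stand in for the favored regions;
    they are illustrative only and not the published validation contours.
    """
    # Right-handed alpha-helix region (centered near phi = -57, psi = -47).
    if -100.0 <= phi_deg <= -30.0 and -80.0 <= psi_deg <= -5.0:
        return "alpha helix"
    # Beta-sheet region (upper-left quadrant of the plot).
    if -180.0 <= phi_deg <= -45.0 and 90.0 <= psi_deg <= 180.0:
        return "beta sheet"
    return "other"


for phi, psi in [(-57.0, -47.0), (-120.0, 130.0), (60.0, 45.0)]:
    print(phi, psi, classify_backbone(phi, psi))
```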


Linus Carl Pauling

Born: 28 February 1901, Portland, OR, USA

Died: 19 August 1994, Big Sur, CA, USA

Nobel Prize in Chemistry in 1954 "for his research into the nature of the chemical bond and its application to the elucidation of the structure of complex substances”

Linus Pauling and Robert Corey (A) and Herman Branson (B). Pauling’s deep understanding of chemical structure and bonding, his retentive memory for details, and his creative flair were all factors in the discovery of the alpha-helix. The wooden helix between Pauling and Corey has a scale of 1 inch per Å, an enlargement of 254,000,000 times (from D. Eisenberg, “The discovery of the alpha-helix and beta-sheet, the principal structural features of proteins”, PNAS 100 (2003) 11207)


Hydrogen bonds between the C=O group of each amino acid and the N-H group of the amino acid four positions further along the chain lead to folding into an alpha-helix
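The every-fourth-residue pattern can be listed explicitly. The following sketch (the function name is hypothetical, and residues are numbered from 1) pairs the C=O group of residue i with the N-H group of residue i + 4, which is the standard bonding pattern of the alpha helix.

```python
def alpha_helix_hbonds(n_residues: int) -> list[tuple[int, int]]:
    """Return (acceptor, donor) residue pairs for an ideal alpha helix.

    The C=O group of residue i accepts a hydrogen bond from the N-H group
    of residue i + 4; residues are numbered starting from 1.
    """
    return [(i, i + 4) for i in range(1, n_residues - 3)]


print(alpha_helix_hbonds(10))
```

A helix of n residues therefore contains n − 4 such backbone hydrogen bonds, and the first four N-H groups and last four C=O groups remain unpaired within the helix.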


from D. Voet and J. G. Voet, “Biochemistry, 4th ed.”, Wiley 2011

Helical structures are reinforced by the establishment of hydrogen bonds between widely spaced chemical groups

from P. Echenique, “Introduction to protein folding for physicists”, Contemp. Phys. 48 (2007) 81

from D. Voet and J. G. Voet, “Biochemistry, 4th ed.”, Wiley 2011

from D. Voet and J. G. Voet, “Biochemistry, 4th ed.”, Wiley 2011

Proteins display important structural motifs

• Primary structure: amino acid sequence of the polypeptide chain
• Secondary structure: spatial arrangement of the polypeptide backbone (without regard to side chains)
• Tertiary structure: three-dimensional structure of the complete chain
• Quaternary structure: arrangement of subunits

Image from the Irving Geis Collection, Howard Hughes Medical Institute.

Irving Geis (1908-1997) was one of the greatest scientific artists of the 20th century.

His innovations, particularly in depicting the structures of biological macromolecules such as DNA, earned him an international reputation. Many of his illustrations appeared in Scientific American, including a painting of the first protein crystal structure, of myoglobin, published in 1961.

(http://www.hhmi.org/news/geis.html)

Structure of human hemoglobin. It is a tetramer composed of two α and two β subunits. The protein's α and β subunits are shown in red and blue, and the iron-containing heme groups are shown in green.


The heme B group (an iron atom is strongly held at the center of a porphyrin ring)

Porphyrins are a group of heterocyclic organic compounds, composed of four pyrrole subunits interconnected at their α carbon atoms via methine bridges (=CH−).

The figure shows the structure of porphin, the simplest porphyrin

The shape of proteins is important in many ways. In sickle cell disease, hemoglobin is malformed and aggregates into strands inside red blood cells.

Sickle hemoglobin (HbS), a structural variant of normal adult hemoglobin, results from a single amino acid substitution at position 6 of the beta globin molecule (β 6Glu→Val).
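The substitution β6 Glu→Val can be written as a one-residue edit of the sequence. In the sketch below, the eight-residue string is the N-terminal fragment of the mature human β-globin chain in one-letter code, used here only for illustration; the helper name is hypothetical. Position 6 follows the traditional numbering of the mature chain, which does not count the initiator methionine.

```python
# N-terminal fragment of the mature human beta-globin chain (one-letter codes).
# Assumption: a short fragment is enough to illustrate the substitution.
BETA_GLOBIN_N_TERMINUS = "VHLTPEEK"


def substitute(sequence: str, position: int, expected: str, replacement: str) -> str:
    """Replace the residue at a 1-based position after checking the original residue."""
    index = position - 1
    if sequence[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Glu (E) at position 6 becomes Val (V) in sickle hemoglobin.
hbs_fragment = substitute(BETA_GLOBIN_N_TERMINUS, 6, "E", "V")
print(hbs_fragment)  # VHLTPVEK
```

A charged, polar glutamate is replaced by a nonpolar valine, which is the kind of change that alters how the protein surface interacts with water and with neighboring molecules.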

The sickle cell disease is a dangerous form of anemia; however, in its mild form it is known to confer some resistance against malaria.

In 1949, it was suggested that the Darwinian paradox of high frequencies of genetic blood disorders could result from a selective advantage conferred by such disorders in protecting against Plasmodium falciparum (a protozoan parasite) malaria infection in heterozygotes.

This balancing selection, commonly referred to as the ‘malaria hypothesis’, was originally suggested to explain the geographical correspondence between the distribution of thalassemia and malaria in the Mediterranean region, and was later confirmed in many locations including Sardinia, Melanesia and Kenya. At the same time, a similar relationship between HbS and malaria was independently discovered in Africa.

NATURE COMMUNICATIONS | 1:104 | DOI: 10.1038/ncomms1104 | www.nature.com/naturecommunications

© 2010 Macmillan Publishers Limited. All rights reserved.

HbS allele frequency calculated within each endemicity area allowed us to quantify the statistical strength of such differences, taking into account the inherent uncertainty of the predicted HbS allele frequencies (see Methods). Differences in areal means between endemicity regions were calculated for 100 unique realizations of the HbS allele frequency map generated by the Bayesian model (Fig. 4 and Supplementary Fig. S2). When combined, these realizations produced predictive probability distributions for the difference in areal mean HbS allele frequency between each successive endemicity class (see Table 1 and Methods).

These geostatistical measures provide the first quantitative evidence for a geographical link between the global distribution of HbS and malaria endemicity. At the global level, we found clear

Figure legend, malaria endemicity classes: malaria free, epidemic, hypoendemic, mesoendemic, hyperendemic, holoendemic.

Figure legend, HbS allele frequency classes (%): 0 – 0.51, 0.52 – 2.02, 2.03 – 4.04, 4.05 – 6.06, 6.07 – 8.08, 8.09 – 9.60, 9.61 – 11.11, 11.12 – 12.63, 12.64 – 14.65, 14.66 – 18.18.

Figure legend, HbS data points: presence, absence.

Figure 1 | Global distribution of the sickle cell gene. (a) Distribution of the data points. Red dots represent the presence and blue dots the absence of the HbS gene. The regional subdivisions were informed by Weatherall and Clegg [19], and are as follows: the Americas (light grey), Africa, including the western part of Saudi Arabia, and Europe (medium grey) and Asia (dark grey); (b) Raster map of HbS allele frequency (posterior median) generated by a Bayesian model-based geostatistical framework. The Jenks optimized classification method was used to define the classes [45]; (c) The historical map of malaria endemicity [29] was digitized from its source using the method outlined in Hay et al. [44] The classes are defined by parasite rates (PR2−10, the proportion of 2- up to 10-year olds with the parasite in their peripheral blood): malaria free, PR2−10 = 0; epidemic, PR2−10 ≈ 0; hypoendemic, PR2−10 < 0.10; mesoendemic, PR2−10 ≥ 0.10 and < 0.50; hyperendemic, PR2−10 ≥ 0.50 and < 0.75; holoendemic, PR0−1 ≥ 0.75 (this class was measured in 0- up to 1-year olds) [29,30].

Figure 2 | Map of the uncertainty of the HbS allele frequency prediction. Interval between the 2.5 and 97.5% quantiles (95% probability) of the per-pixel predicted allele frequency using a continuous scale (low: 0.00, high: 0.47).

from Piel et al., “Global distribution of the sickle cell gene and geographical confirmation of the malaria hypothesis”, Nature Comm. 1:104 (2010)


from J. S. Richardson, “Early ribbon drawings of proteins”, Nature Structural Biology 7 (2000) 624

Chymotrypsinogen (1CHG), the inactive precursor of the proteolytic enzyme chymotrypsin: sticks

Chymotrypsinogen (1CHG): ribbon

Chymotrypsinogen (1CHG): cartoon

The shape of proteins

The shape of proteins changes, and these conformational changes are important both in mechanical actions like those of ATP-synthase, helicase or dynein, and in the enzymatic activity of proteins.

X-ray diffraction studies

X-ray diffraction is the most important method for protein structure determination.

High luminosity femtosecond X-ray sources provide an extension of conventional X-ray methods.

As we have seen, proteins are synthesized by ribosomes ...

... now the destruction mechanism: proteins are destroyed just as they are built by cells ...

Proteins are destroyed by the ubiquitin-proteasome mechanism

Ubiquitin molecules attach to the proteins to be discarded.

The proteasome destroys the ubiquitin-tagged proteins.

Can we predict protein structure?

Proteins have a very large number of configurations.

This leads to the well-known Levinthal’s paradox: a protein with $N$ amino acids, and an average of, say, 3 possible equilibrium positions per amino acid, has $\sim 3^N \approx 10^{0.5N}$ possible folded configurations.

For a small protein like lysozyme, $N = 130$, and therefore $10^{0.5N} = 10^{65}$ configurations.

Trying 1 configuration/ns would mean about $3 \cdot 10^{16}$ configurations/year, and searching all the configurations to find the ground state would thus take about $3 \cdot 10^{48}$ years.
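The estimate can be checked numerically. The sketch below works with base-10 logarithms to avoid huge numbers; the function name and default parameters are illustrative choices, not part of the original slides. Note that the text rounds log10(3) ≈ 0.477 up to 0.5, so the exact count for 3 states per residue comes out about three orders of magnitude smaller than the rounded figure, which does not change the conclusion.

```python
import math


def levinthal_estimate(n_residues: int,
                       states_per_residue: int = 3,
                       trials_per_second: float = 1e9):
    """Return (log10 of the number of configurations, log10 of the search time in years)."""
    log10_configs = n_residues * math.log10(states_per_residue)
    seconds_per_year = 365.25 * 24 * 3600
    log10_trials_per_year = math.log10(trials_per_second * seconds_per_year)
    return log10_configs, log10_configs - log10_trials_per_year


log10_configs, log10_years = levinthal_estimate(130)
print(f"configurations ~ 10^{log10_configs:.1f}")    # exact 3^N for N = 130
print(f"exhaustive search ~ 10^{log10_years:.1f} years")
print(f"rounded exponent 0.5 * N = {0.5 * 130:.0f}")  # the 10^65 used in the text
```

Either way the search time exceeds the age of the universe (about 10^10 years) by dozens of orders of magnitude, which is the content of the paradox.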


Levinthal’s paradox is relevant:

• for natural protein folding: how does a protein fold in just a few microseconds?

• for computational protein folding: how can we efficiently find the ground state configuration of proteins?


One of the keys to understanding: the hydrophobic effect

Consider the Gibbs free energy expression for mixing

$\Delta G_m = \Delta H_m - T\,\Delta S_m$

Now note that

• Polar amino acids form hydrogen bonds with water.
• Nonpolar amino acids do not form hydrogen bonds, and water molecules have “less freedom”.

Configurations of liquid water molecules near hydrophobic cavities in molecular-dynamics simulations. The blue and white particles represent the oxygen (O) and hydrogen (H) atoms, respectively, of the water molecules. The dashed lines indicate hydrogen bonds (that is, O-H within 35° of being linear and O-to-O bonds of no more than 0.35 nm in length). The space-filling size of the hydrophobic (red) particle in a is similar to that of a methane molecule. The hydrophobic cluster in b contains 135 methane-like particles that are hexagonally close-packed to form a roughly spherical unit of radius larger than 1 nm. In both cases, the water molecules shown are those that are within 0.8 nm of at least one methane-like particle. For the single cavity pictured in a, each water molecule can readily participate in four hydrogen bonds. (Owing to thermal motions, hydrogen bonding in liquid water is disordered.) Water molecules in a are typical of the bulk liquid where most molecules participate in four hydrogen bonds. The water molecules shown in b, however, are not typical of the bulk. Here, the cluster is sufficiently large that hydrogen bonds cannot simply go around the hydrophobic region. In this case, water molecules near the hydrophobic cluster have typically three or fewer hydrogen bonds.
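The geometric hydrogen-bond criterion quoted in the caption can be turned into a small test function. This is a sketch under a stated assumption: "within 35° of being linear" is interpreted here as the angle between the donor O-H bond and the donor-to-acceptor O···O line being at most 35°. The function name and the coordinates in the usage lines are hypothetical; lengths are in nm.

```python
import math


def is_hydrogen_bond(donor_o, donor_h, acceptor_o,
                     max_oo_distance_nm: float = 0.35,
                     max_angle_deg: float = 35.0) -> bool:
    """Geometric hydrogen-bond test between two water molecules (coordinates in nm).

    Assumption: the angular criterion is applied to the angle between the
    donor O-H vector and the donor O -> acceptor O vector.
    """
    oo_distance = math.dist(donor_o, acceptor_o)
    if oo_distance == 0.0 or oo_distance > max_oo_distance_nm:
        return False
    oh = [h - o for h, o in zip(donor_h, donor_o)]
    oo = [a - o for a, o in zip(acceptor_o, donor_o)]
    oh_length = math.hypot(*oh)
    if oh_length == 0.0:
        return False
    cos_angle = sum(x * y for x, y in zip(oh, oo)) / (oh_length * oo_distance)
    # Clamp to [-1, 1] to protect acos against rounding errors.
    angle_deg = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
    return angle_deg <= max_angle_deg


# Donor O at the origin with its H pointing along x; acceptor O 0.28 nm away.
print(is_hydrogen_bond((0, 0, 0), (0.1, 0, 0), (0.28, 0, 0)))  # aligned: bond
print(is_hydrogen_bond((0, 0, 0), (0.1, 0, 0), (0, 0.28, 0)))  # 90 degrees off: no bond
```

Counting, for each water molecule near a solute, how many neighbors pass this test is one way to obtain the "four" versus "three or fewer" hydrogen bonds mentioned in the caption.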

from D. Chandler, “Interfaces and the driving force of hydrophobic assembly”, Nature 437 (2005) 640

A free water molecule can form hydrogen bonds in 4 directions, and protons can occupy 2 out of 4 positions, i.e., there is a total of 6 states.

A water molecule close to a hydrophobic surface can form hydrogen bonds in (roughly) 3 directions only, and therefore there is a total of 3 states.

Entropy difference (per mole):

$$\Delta S = N_A \left( k_B \ln 6 - k_B \ln 3 \right) = R \ln 2 \approx 1.38\ \mathrm{cal\,K^{-1}\,mol^{-1}}$$

$$T\,\Delta S \approx 0.41\ \mathrm{kcal\,mol^{-1}}\ \text{at 300 K}$$
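The counting argument and the resulting numbers can be reproduced with a short calculation. In this sketch the state count is taken as the number of ways to place 2 protons on the available bonding directions, which gives 6 for 4 directions and 3 for 3 directions, as in the text; the function names are hypothetical and the gas constant is expressed in cal K⁻¹ mol⁻¹.

```python
import math

R_CAL = 1.987  # gas constant in cal K^-1 mol^-1


def proton_arrangements(directions: int) -> int:
    """Number of ways to place 2 protons on the available bonding directions."""
    return math.comb(directions, 2)


def molar_entropy_difference(free_directions: int = 4,
                             surface_directions: int = 3) -> float:
    """Entropy lost per mole of water moved next to a hydrophobic surface."""
    states_free = proton_arrangements(free_directions)        # 6 states
    states_surface = proton_arrangements(surface_directions)  # 3 states
    return R_CAL * math.log(states_free / states_surface)


delta_s = molar_entropy_difference()          # cal K^-1 mol^-1
t_delta_s_kcal = 300.0 * delta_s / 1000.0     # kcal mol^-1 at 300 K
print(f"Delta S   = {delta_s:.3f} cal/(K mol)")
print(f"T Delta S = {t_delta_s_kcal:.3f} kcal/mol at 300 K")
```

The result, R ln 2 ≈ 1.377 cal K⁻¹ mol⁻¹ and about 0.41 kcal mol⁻¹ at 300 K, is below the thermal energy RT ≈ 0.6 kcal mol⁻¹ per mole of interfacial water, so the effect becomes significant only when summed over many water molecules.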


A larger hydrophobic surface leads to a larger entropy loss for the surrounding water: to lower the free energy, the total exposed surface must be minimized.

The Gibbs free energy change is negative when hydrophobic particles aggregate and thus minimize the total surface.
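A simple geometric estimate shows how much surface is removed by aggregation. The sketch assumes, purely for illustration, that n identical spherical particles merge into a single sphere of the same total volume; real clusters are not perfect spheres, and the function name is hypothetical. The merged radius is r·n^(1/3), so the ratio of the merged area to the total separate area is n^(−1/3).

```python
def exposed_area_ratio(n_particles: int) -> float:
    """Area of one merged sphere divided by the total area of n separate
    equal spheres with the same total volume (idealized geometry)."""
    return n_particles ** (-1.0 / 3.0)


for n in (1, 8, 135):
    print(n, round(exposed_area_ratio(n), 3))
```

For a cluster of 135 methane-like particles, as in the simulation figure from Chandler, the exposed area in this idealized picture drops to roughly one fifth of the area of the separated particles.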


Computational efforts proceed mostly by “brute force calculations”.

At the moment several small proteins have been “solved”; however, larger proteins still do not yield to computational attacks.

from K. A. Dill and J. L. MacCallum, “The Protein-Folding Problem, 50 Years On”, Science 338 (2012) 1042

from K. A. Dill and J. L. MacCallum, “The Protein-Folding Problem, 50 Years On”, Science 338 (2012) 1042

“... we are still missing a “folding mechanism.” By mechanism, we mean a narrative that explains how the time evolution of a protein’s folding to its native state derives from its amino acid sequence and solution conditions. A mechanism is more than just the sequences of events followed by any one given protein in experiments or in computed trajectories. ... ”

from K. A. Dill and J. L. MacCallum, “The Protein-Folding Problem, 50 Years On”, Science 338 (2012) 1042


• We have little experimental knowledge of protein-folding energy landscapes.
• We cannot consistently predict the structures of proteins to high accuracy.
• We do not have a quantitative microscopic understanding of the folding routes or transition states for arbitrary amino acid sequences.
• We cannot predict a protein’s propensity to aggregate, which is important for aging and folding diseases.
• We do not have algorithms that accurately give the binding affinities of drugs and small molecules to proteins.
• We do not understand why a cellular proteome does not precipitate, because of the high density inside a cell.
• We know little about how folding diseases happen, or how to intervene.
• Despite their importance, we still know relatively little about the structure, function, and folding of membrane proteins.
• We know little about the ensembles and functions of intrinsically disordered proteins, even though nearly half of all eukaryotic proteins contain large disordered regions. This is sometimes called the “protein nonfolding problem” or “unstructural biology.”

from K. A. Dill and J. L. MacCallum, “The Protein-Folding Problem, 50 Years On”, Science 338 (2012) 1042

http://tedxtalks.ted.com/video/The-protein-folding-problem-a-m;TEDxSBU