Levinthal Paradox



Loop Fold Structure of Proteins: Resolution of Levinthal’s Paradox

http://www.jbsdonline.com

Abstract

According to Levinthal, a protein chain of ordinary size would require an enormous time to sort through its conformational states before the final fold is reached. The experimentally observed time of folding suggests an estimate of the chain length for which that time would be sufficient. This estimate agrees, to order of magnitude, with the experimentally observed universal closed loop elements of globular proteins, 25-30 residues.

Key words: Levinthal’s paradox, protein folding, chain conformation, closed loops, loop fold structure

In 1968 Levinthal, in a report of which only a brief summary is available (1), noted that reversibly denaturable proteins during the transition from “random disordered state into a well-defined unique structure” have to go through a conformational space with an immense number of states, so that the time required for visiting all the states would also be very large. Indeed (e.g. (2)), for a protein chain of length L = 150 (residues) with n = 3 alternative conformations for every residue, the time t required for sorting out all possible conformations of the chain is:

$$t = n^{L} \cdot \tau = 3^{150} \cdot 10^{-12}\ \text{s} = 10^{48}\ \text{yrs} \qquad [1]$$

($\tau = 10^{-12}$ s is the time for an elementary transition (2)). Observed values of t are in the range $10^{-1}$ to $10^{3}$ s (2); that is, the full sorting as above is impossible. Thus, protein folding has to proceed along a certain path that avoids most of the conformational space. The path should somehow be directed by as yet unknown sequence-dependent folding rule(s).
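The arithmetic behind [1] can be checked in a few lines. The sketch below works in logarithms to avoid huge numbers; the function name and the choice of a Julian year for the unit conversion are assumptions made for illustration. With the stated inputs (L = 150, n = 3, τ = 10^-12 s) a direct evaluation lands nearer 10^52 years than the quoted 10^48 years; the gap does not affect the argument, since either value is astronomically larger than the observed folding times.

```python
import math

# Assumption for the unit conversion: one Julian year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_search_log10_years(chain_length, states_per_residue, tau_seconds):
    """Return log10 of t = n**L * tau, with t expressed in years."""
    log10_seconds = (chain_length * math.log10(states_per_residue)
                     + math.log10(tau_seconds))
    return log10_seconds - math.log10(SECONDS_PER_YEAR)


# Inputs taken from the text: L = 150 residues, n = 3, tau = 1e-12 s.
log10_years = exhaustive_search_log10_years(150, 3, 1e-12)
print(f"t is about 10^{log10_years:.0f} years")
```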

The size of the short chain for which the observed time span of $10^{-1}$ to $10^{3}$ s would be sufficient for trying every possible state can be calculated from [1], with the same assumptions, as

$$l_0 = \frac{\lg(t/\tau)}{\lg n} = 23 \text{ to } 31 \text{ residues}.$$

In this case all conformations could be tried during the given time, and the lowest energy state(s) visited. Being logarithmic, this estimate is rather insensitive to the choice of the value of n, which according to various authors may vary between 1.6 and 10 elementary conformations (3, 4). With these extreme values the above estimate spans the range $l_0$ = 11 to 74 residues. The $l_0$ value may thus serve as a rough estimate of the size of the units (chains or chain segments) that could visit all conformational states during the observed time.
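The inverse estimate is easy to reproduce. The following minimal sketch evaluates the formula for l0 over the observed folding times and over the range of n quoted above; the function name `levinthal_limit` is an assumption made for illustration, and τ = 10^-12 s is taken from [1].

```python
import math


def levinthal_limit(fold_time_s, states_per_residue, tau_s=1e-12):
    """Chain length l0 = lg(t / tau) / lg(n) whose n**l0 states fit into time t."""
    return math.log10(fold_time_s / tau_s) / math.log10(states_per_residue)


# Observed folding times span 1e-1 s to 1e3 s; n ranges from 1.6 to 10.
for n in (1.6, 3, 10):
    shortest = levinthal_limit(1e-1, n)
    longest = levinthal_limit(1e3, n)
    print(f"n = {n}: l0 from {shortest:.1f} to {longest:.1f} residues")
```

The smallest value comes from the shortest time with n = 10 and the largest from the longest time with n = 1.6, which is how the extreme range in the text arises.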

Journal of Biomolecular Structure & Dynamics, ISSN 0739-1102
Volume 20, Issue Number 1, (2002)
©Adenine Press (2002)

Igor N. Berezovsky 1,*
Edward N. Trifonov 2

1 Department of Structural Biology, The Weizmann Institute of Science, P.O.B. 26, Rehovot 76100, Israel
2 Genome Diversity Center, Institute of Evolution, University of Haifa, Haifa 31905, Israel

Phone: 972-8-9343367
Fax: 972-8-9344136
E-mail: [email protected]

Express Communication

The estimated size of the hypothetical unit is identical to the optimum of loop closure for polypeptide chains, 20 to 50 residues (5), and to the observed size of recently discovered closed loop elements, 25-30 amino acid residues (5-7), of which globular proteins are universally built. One can hypothesize that the closed loops are also elementary folding units. In this case their linear arrangement within the protein folds (5, 8) would suggest a straightforward folding path: sequential formation of the closed loop units, along with their synthesis in the ribosome. If such successive formation of the stable folding units in the course of translation is assumed, it will require time proportional to the number of the units, that is, only several fold larger than required for a single unit. The above scenario is consistent with the typical rates of translation, 3 to 20 residues per second (9). Synthesis of the protein of length L = 150 takes, thus, 8 to 50 seconds, which is a fair match to the above range of folding rates. Thus, according to the estimates above, the consecutive formation of the loop-like folding units of 25-30 residues is, by order of magnitude, time-wise consistent with both folding and translational experiments.
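The translation-time comparison is simple division; a minimal sketch is given here for completeness. Function names and default arguments are assumptions made for illustration, with the rates (3 to 20 residues per second) and unit sizes (25-30 residues) taken from the text.

```python
def synthesis_time_range(chain_length, slow_rate=3.0, fast_rate=20.0):
    """Return (shortest, longest) synthesis time in seconds for a chain,
    given translation rates in residues per second."""
    return chain_length / fast_rate, chain_length / slow_rate


def loop_unit_count_range(chain_length, min_unit=25, max_unit=30):
    """Return (fewest, most) closed-loop units that fit along the chain."""
    return chain_length / max_unit, chain_length / min_unit


fastest, slowest = synthesis_time_range(150)
fewest, most = loop_unit_count_range(150)
print(f"synthesis of 150 residues: {fastest} s to {slowest} s")
print(f"closed-loop units in 150 residues: {fewest} to {most}")
```

For L = 150 this gives 7.5 to 50 seconds (quoted as 8 to 50 in the text) and five to six units, so the per-unit time sits inside the observed folding range of 10^-1 to 10^3 s.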

Acknowledgement

We are grateful to Prof. A. Yu. Grosberg for valuable comments and enlightening discussions.

References and Footnotes

1. Levinthal, C. J. Chim. Phys. Chim. Biol. 65, 44-45 (1968).
2. Branden, C. and Tooze, J. Introduction to Protein Structure, Garland Publishing (1999).
3. Bryngelson, J. D. and Wolynes, P. G. Proc. Natl. Acad. Sci. USA 84, 7524-7528 (1987).
4. Pande, V. S., Grosberg, A. Yu. and Tanaka, T. Reviews of Modern Physics 72, 259-314 (2000).
5. Berezovsky, I. N., Grosberg A. Y. and Trifonov, E. N. FEBS Letters 466, 283-286 (2000).
6. Berezovsky, I. N. and Trifonov, E. N. J. Biomol. Struct. Dyn. 19, 397-403 (2001).
7. Berezovsky, I. N. and Trifonov, E. N. J. Mol. Biol. 307, 1419-1426 (2001).
8. Berezovsky, I. N. and Trifonov, E. N. Prot. Engineering 14, 403-407 (2001).
9. Varenne, S., Buc, J., Lloubes, R. and Lazdunski, C. J. Mol. Biol. 180, 549-576 (1984).

Date Received: July 6, 2002

Communicated by the Editor M. D. Frank-Kamenetskii


Cunning Simplicity of a Hierarchical Folding

http://www.jbsdonline.com

Abstract

A hierarchic scheme of protein folding does not solve the Levinthal paradox since it cannot provide a simultaneous explanation for the major features observed for protein folding: (i) folding within non-astronomical time, (ii) independence of the native structure from large variations in the folding rates of a given protein under different conditions, and (iii) co-existence, in a visible quantity, of only the native and the unfolded molecules during folding of moderate size (single-domain) proteins. On the contrary, a nucleation mechanism can account for all these major features simultaneously and resolves the Levinthal paradox.

Key words: protein folding, Levinthal paradox, two-state kinetics, mid-transition, co-existence of the native and the unfolded phases, folding nucleus, rate of folding

Berezovsky & Trifonov (1) have recently revisited an attractive idea of a hierarchic protein folding (2, 3) in an attempt to resolve the Levinthal paradox (4), i.e., to explain how the problem of sampling the impossibly large number of conformations by the folding protein chain can be avoided. A “hierarchic mechanism” means that some structures formed at the first stage serve, without any significant reconstruction, as building blocks at the next stage of folding, and then the bigger structures obtained at the second stage serve as the building blocks for the next stage, etc. Specifically, Berezovsky & Trifonov in their clearly written paper (1) assume that local “closed loops” of 25-30 residues (the smallest folding units) find their lowest-energy structures by an exhaustive search of all their conformations and then stick together, entering the tertiary protein fold in the already found form.

In principle, a hierarchic mechanism can help to avoid sampling of the huge conformational space. However, I would like to note that no hierarchic scenario can serve as a general solution of the Levinthal paradox. Such a mechanism may work only when the native structure is much more stable than the unfolded or denatured state of the protein chain. It cannot work, though, when protein folding occurs near the point of thermodynamic equilibrium between the native and denatured states of the protein. Indeed, any hierarchic mechanism (including the one suggested by Berezovsky & Trifonov) implies that the formed folding unit has to preserve its once found form at least until the next unit is formed, i.e., for a time comparable with that of its own formation. Long life of the once found form means that it is thermodynamically stable as compared to the initial denatured state of the same piece of the chain. Since the folding units then stick together, this means that the native protein is, in turn, more stable than the sum of these stable folded units, and thus much more stable than the denatured state of the protein chain.

Journal of Biomolecular Structure & Dynamics, ISSN 0739-1102
Volume 20, Issue Number 3, (2002)
©Adenine Press (2002)

Alexei V. Finkelstein
Institute of Protein Research, Russian Academy of Sciences, 142290, Pushchino, Moscow Region, Russia

Phone/Fax: (+7-095) 924 0493
Email: [email protected]

An Opinion Piece: Conversation on Levinthal Paradox & Protein Folding #1

However, a high stability of the native structure is not obligatory for folding. As demonstrated by numerous chevron plots (5-8), many single-domain proteins (having up to 200–250 residues) successfully fold near and even in the point of thermodynamic equilibrium between their native and denatured states. This refers to proteins having a simple two-state folding kinetics, as well as to those that have three-state folding kinetics (i.e., that have detectable folding intermediates far from the equilibrium point). The discovery of this fact in kinetics was actually a re-discovery of well-known “all-or-none” thermodynamics of de- and renaturation of single-domain proteins (9). The “all-or-none” transition means that only the native and denatured proteins are present (close to the mid-transition) in a visible quantity, while others, i.e., “semi-native” forms, are unstable and therefore present in a very small quantity.

An “all-or-none” transition is a microscopic analog of first order phase transitions (e.g., crystallization) in macroscopic systems (9, 10). Therefore, it is not a surprise that this transition in proteins was shown to follow a nucleation mechanism (8) that is well known (in physics) to be specific for the first order phase transitions (10). Moreover, it has been shown that the nucleation mechanism (that pays main attention to the boundary between the folded and denatured phases within a protein molecule) can resolve the Levinthal paradox (11) and leads to realistic estimates of the protein folding rates (12). The only necessary prerequisite of such a transition is an energy gap between the lowest-energy native fold and misfolded structures (11, 13, 14).

On the other hand, the simplicity of a hierarchic folding is cunning. In its strict form, the hierarchic mechanism means that the native protein structure is controlled by kinetics rather than by stability of the whole protein. If so, one has to explain why the same native protein structure results from foldings held under different conditions and having a 1000-fold difference in speed (which, according to the logic of Berezovsky & Trifonov (1), has to lead to folding units of rather different sizes); it is noteworthy that this question does not arise at all when the native fold is determined by its stability. Also, one has to explain why the folding with detectable folding intermediates (far from the mid-transition) is not drastically faster than the folding that does not have such intermediates, and why the evidently non native-like intermediates, observed in some cases (15, 16), do not prevent the protein from correct hierarchic folding. It is worthwhile to note that the latter fact is easily explained if the choice of the native fold is determined by its stability (rather than by folding kinetics): it is not obligatory that every element of the lowest-energy fold has an enhanced stability, though most of them have to have an enhanced stability for purely statistical reasons (17).

A hierarchic folding does not provide a general solution of the Levinthal paradox since it cannot simultaneously explain the major features observed for folding of single-domain proteins: (i) folding within non-astronomical time, (ii) independence of the native structure from large variations in the folding rates of a given protein under different conditions, and (iii) co-existence, in a visible quantity, of only the native and the denatured protein molecules during the folding near the point of thermodynamic mid-transition between these two states. However, this does not mean that a hierarchic folding scenario is completely inconsistent with all proteins. Specifically, a hierarchic folding seems to take place in large multi-domain proteins whose denaturation is not an “all-or-none” transition but proceeds as a sum of “all-or-none” denaturations of their domains (18).

The presented considerations give an important criterion of applicability of any protein-folding model to single-domain proteins: whether or not this model can explain protein folding near the point of thermodynamic mid-transition between the folded and unfolded states. A hierarchic folding scheme discussed in this paper cannot satisfy this criterion. The same applies (19) to the simplest versions of a funnel model of protein folding (20-22). On the contrary, the nucleation model of protein folding meets this criterion and resolves the Levinthal paradox.

Acknowledgements

This work was supported by an International Research Scholar’s Award from the Howard Hughes Medical Institute and by a grant from the Russian Foundation for Basic Research.


References and Footnotes

1. Berezovsky, I. N. and Trifonov, E. N., J. Biomol. Struct. Dyn. 20, 5-6 (2002).
2. Ptitsyn, O. B., Dokl. Akad. Nauk SSSR 210, 1213-1215 (1973).
3. Baldwin, R. L. and Rose, G. D., Trends Biochem. Sci. 24, 26-33, 77-83 (1999).
4. Levinthal, C., J. Chim. Phys. Chim. Biol. 65, 44-45 (1968).
5. Segawa, S.-I. and Sugihara, M., Biopolymers 23, 2473-2488 (1984).
6. Matouschek, A., Kellis, J. T., Jr., Serrano, L. and Fersht, A. R., Nature 340, 122-126 (1989).
7. Kiefhaber, T., Proc. Natl. Acad. Sci. USA 92, 9029-9033 (1995).
8. Fersht, A. R., Curr. Opin. Struct. Biol. 7, 3-9 (1997).
9. Privalov, P. L., Adv. Protein Chem. 33, 167-241 (1979).
10. Landau, L. D. and Lifshitz, E. M., Statistical Physics. London, Pergamon (1959).
11. Finkelstein, A. V. and Badretdinov, A. Y., Folding & Design 2, 115-121 (1997).
12. Galzitskaya, O. V., Ivankov, D. N. and Finkelstein, A. V., FEBS Letters 489, 113-118 (2001).
13. Bryngelson, J. D. and Wolynes, P. G., Proc. Natl. Acad. Sci. USA 84, 7524 (1987).
14. Goldstein, R. A., Luthey-Schulten, Z. A. and Wolynes, P. G., Proc. Natl. Acad. Sci. USA 89, 4918-4922; ibid., 9029-9033 (1992).
15. Baldwin, R. L., Nature Struct. Biol. 8, 92-94 (2001).
16. Fernandez, A., J. Biomol. Struct. Dyn. 19, 735-737 (2002).
17. Finkelstein, A. V., Badretdinov, A. Ya., and Gutin, A. M., Proteins 23, 142-150 (1995).
18. Privalov, P. L., Adv. Protein Chem. 35, 1-104 (1982).
19. Bogatyreva, N. S. and Finkelstein, A. V., Protein Engineering 14, 521-523 (2001).
20. Zwanzig, R., Szabo, A. and Bagchi, B., Proc. Natl. Acad. Sci. USA 89, 20-22 (1992).
21. Chan, H. S. and Dill, K. A., Proteins 30, 2-33 (1998).
22. Bicout, D. J. and Szabo, A., Protein Science 9, 452-465 (2000).

Date Received: September 16, 2002

Communicated by the Editor Ramaswamy H Sarma


Back to Units of Protein Folding

http://www.jbsdonline.com

Entia non sunt multiplicanda praeter necessitatem (Occam)

Abstract

In response to the criticism by A. Finkelstein (J. Biomol. Struct. Dyn. 20, 311-314, 2002) of our Communication (J. Biomol. Struct. Dyn. 20, 5-6, 2002) several issues are dealt with. Importance of the notion of elementary folding unit, its size and structure, and the necessity of further characterization of the units for the elucidation of the protein folding in vivo are discussed. The criticism (J. Biomol. Struct. Dyn. 20, 311-314, 2002) on the hierarchical protein folding is also briefly addressed.

Key words: Levinthal paradox; folding units; protein folding, biological relevance.

The main points of our Communication (2), the structural units of folding and closed loops as likely candidates for that role, are not challenged by A. Finkelstein (1). Rather, his piece invites a separate chapter in protein folding studies: hierarchical folding (3-8). The passage in our note (2) “If such successive formation of the stable folding units in the course of translation is assumed, …” is not a statement, and it is not about hierarchical folding, but rather a simple illustration of how the elementary structural units may partake in the initial stages of the folding process during translation, along with the synthesis of the polypeptide chain. This biological dimension is, apparently, a crucial component of the protein folding problem.

The structural units of folding may or may not remain intact once formed, and their fate during protein folding may follow many different scenarios (6, 7). But again, the discussion of these scenarios is well beyond the scope of our original Communication (2). The very notion of independently folding units is an important concept, and every contribution towards their characterization (9-13) is important. Our study suggests a rather narrow size range for the hypothetical folding units, and a structural component to it: the chain-return nature of the units (14).

Journal of Biomolecular Structure & Dynamics, ISSN 0739-1102
Volume 20, Issue Number 3, (2002)
©Adenine Press (2002)

Igor N. Berezovsky 1,*
Edward N. Trifonov 2

1 Department of Structural Biology, The Weizmann Institute of Science, P.O.B. 26, Rehovot 76100, Israel
2 Genome Diversity Center, Institute of Evolution, University of Haifa, Haifa 31905, Israel

Phone: 972-8-9343367
Fax: 972-8-9344136
Email: [email protected]

An Opinion Piece: Conversation on Levinthal Paradox & Protein Folding #2

Calculations for the ordinary protein sizes on the basis of Levinthal’s original idea (15) lead to astronomical times. This initiated decades of theoretical and experimental research (16-20) to figure out what is a time-wise agreeable path of folding of the full-size proteins. The same calculation in reverse, as in (2), leads to what we would call the Levinthal limit: the size of the polypeptide chain, or part thereof, for which the observed times of protein folding would be sufficient to sort out all possible conformations. The respective simple formalism provides a rough estimate of the size of such a chain, under the assumption that there are no returns to the same conformations during the sorting. More elaborate approaches should take into consideration both physical factors and biological circumstances that may influence the estimate. It may, thus, become either smaller (e.g. due to long-living native and non-native meta-stable states (5, 7)), or larger (e.g. due to special sequence organization (4, 21-24), and possible structural guidance within the ribosome or in chaperones (25, 26)). The balance is still to be found. Possible clues are apparent evolutionary fingerprints in protein sequences (23, 24) that indicate the same size range, 25-30 amino acid residues, as the one suggested by the polymer statistics of the polypeptide chains (14) and by the observed rates of translation (2). We believe that both experimental and theoretical studies of the units of folding are the right way to elucidate how exactly the protein folds in vivo.

References and Footnotes

1. Finkelstein, A. V., J. Biomol. Struct. Dyn. 20, 311-314 (2002).
2. Berezovsky, I. N. and Trifonov, E. N., J. Biomol. Struct. Dyn. 20, 5-6 (2002).
3. Ptitsyn, O. B., Dokl. Akad. Nauk SSSR 210, 1213-1215 (1973).
4. Baldwin, R. L. and Rose, G. D., Trends Biochem. Sci. 24, 26-33 (1999).
5. Baldwin, R. L. and Rose, G. D., Trends Biochem. Sci. 24, 77-83 (1999).
6. Fersht, A. R., Proc. Natl. Acad. Sci. USA 97, 1525-1529 (2000).
7. Baldwin, R. L., Nat. Struct. Biol. 6, 814-817 (1999).
8. Fernandez, A., J. Biomol. Struct. Dyn. 19, 735-737 (2002).
9. Abkevich, V. I., Gutin, A. M., Shakhnovich, E. I., Biochemistry 33, 10026-10036 (1994).
10. Panchenko, A. R., Luthey-Schulten, Z. and Wolynes, P. G., Proc. Natl. Acad. Sci. USA 93, 2008-2013 (1996).
11. Panchenko, A. R., Luthey-Schulten, Z., Cole, R. and Wolynes, P. G., J. Mol. Biol. 272, 95-105 (1997).
12. Berezovskii, I. N., Esipova, N. G. and Tumanyan, V. G., Biophysics 42, 557-565 (1997).
13. Galzitskaya, O. V., Ivankov, D. N. and Finkelstein, A. V., FEBS Lett 489, 113-118 (2001).
14. Berezovsky, I. N., Grosberg, A. Y. and Trifonov, E. N., FEBS Letters 466, 283-286 (2000).
15. Levinthal, C., J. Chim. Phys. Chim. Biol. 65, 44-45 (1968).
16. Baldwin, R. L., Nat. Struct. Biol. 6, 814-817 (1999).
17. Fersht, A. R., Curr. Opin. Struct. Biol. 7, 3-9 (1997).
18. Shakhnovich, E. I., Curr. Opin. Struct. Biol. 7, 29-40 (1997).
19. Pande, V. S., Grosberg, A. Yu. and Tanaka, T., Reviews of Modern Physics 72, 259-314 (2000).
20. Fersht, A. R. and Daggett, V., Cell 108, 573-582 (2002).
21. Shakhnovich, E. I. and Gutin, A. M., Proc. Natl. Acad. Sci. USA 90, 7524-7528 (1993).
22. Pande, V. S., Grosberg, A. Yu. and Tanaka, T., Proc. Natl. Acad. Sci. USA 91, 12972-12975 (1994).
23. Trifonov, E. N., Kirzhner, A., Kirzhner, V. M. and Berezovsky, I. N., J. Mol. Evol. 53, 394-401 (2001).
24. Berezovsky, I. N. and Trifonov, E. N., Comparative and Functional Genomics, in press (2002).
25. Gething, M. J. and Sambrook, J., Nature 355, 33-45 (1992).
26. Ruddon, R. W. and Bedows, E., J. Biol. Chem. 272, 3125-3128 (1997).

Date Received: October 25, 2002

Communicated by the Editor Ramaswamy H Sarma


A Few Disconnected Notes Related to Levinthal Paradox

http://www.jbsdonline.com

Abstract

We estimate that the longest protein chain capable of exhaustive sampling of all its conformations within a millisecond is shorter than 15 residues. This reinforces the understanding of Levinthal paradox which emerged in the last decade, namely, that cooperative (all-or-none) character of folding and unfolding transition is indicative of the sequences selected, such that reliable folding does not require exhaustive conformation sampling. The opinion is formulated that the discussions of Levinthal paradox should now fly to the new spheres.

A long time ago, well before the breakup of the Soviet Union, a large biophysics meeting was held in the then Soviet Republic of Georgia. The site was a rural place in the center of a famous wine producing region, the month was October, and the major event was the all-Georgian wine testing festival, advertised as a merry traditional peasant holiday. When biophysicists arrived, they found a large plaza with dozens of pavilions, each representing a particular village, and each offering (for free!) a glass of young wine. Many biophysicists deem themselves experts, and their intent was to test systematically all different sorts of wine. Before long, however, the cloud of biophysicists seemed to obey the diffusion equation perfectly, with each individual in the cloud undergoing a random walk.

The few participants (perhaps bad biophysicists) who were able to continue scientific observations soon realized that wine testing continued for an unexpectedly long time. Assuming a visit to one pavilion takes time τ, and assuming there were some M pavilions, one could have naively expected that after a time close to Mτ the testing would be over. This expectation proved terribly wrong, and the real time was much longer than that. One possible reason is trivial: wine testers, even if they were to complete the exhaustive testing, are unlikely to realize the completion of the task and to stop at that. But there is also another, more interesting reason: the time necessary for the random walk to visit all M sites does not scale as Mτ; it can be significantly larger than that, simply because a random walk visits some sites a great many times before the first visit to some other sites.

Journal of Biomolecular Structure & Dynamics, ISSN 0739-1102
Volume 20, Issue Number 3, (2002)
©Adenine Press (2002)

A. Grosberg 1,2

1 Department of Physics, University of Minnesota, Minneapolis, MN 55455, USA
2 Institute of Biochemical Physics, Russian Academy of Sciences, Moscow 117977, Russia

Email: [email protected]

An Opinion Piece: Conversation on Levinthal Paradox & Protein Folding #3

This simple property of random walks and diffusion processes seems to be underappreciated in the current discussions revolving around the celebrated Levinthal paradox (1), particularly in the ones initiated by the recent very clearly written article (2) by I. Berezovsky and E. Trifonov (IBET for brevity). In the most standard formulation, Levinthal paradox arises from the idea that the time required for a protein molecule to sample all of its conformations is at least Mτ, where M is the number of distinct conformations, and τ is the time necessary to sample one conformation. Then, the paradox goes, unguided folding into one particular (native) state requires at least time of order Mτ, which is far too long, because M is astronomically large. In fact, M is so large that, for instance, H. Frauenfelder (3) suggests calling it a “biological number,” where biological numbers dwarf astronomical ones. Clearly, M is so large because it is exponential in the degree of polymerization of the protein chain, N: $M = e^{sN}$, where s is a constant close to unity (see, e.g., (4)).

I. Berezovsky and E. Trifonov (2) turned Levinthal’s argument upside down. They argued that while exhaustive sampling of all conformations is out of reach for large N, it is definitely possible for sufficiently small N. Thus, they decided to estimate the length of protein chain $N_0$ such that at $N < N_0$ the protein can exhaustively sample all of its conformations within some specified time T, say, one millisecond. Assuming exhaustive sampling time Mτ, they wrote $\tau e^{sN_0} = T$ and obtained

$$N_0 = \frac{1}{s} \ln\frac{T}{\tau}. \qquad [1]$$

This number, according to (2), turns out to be somewhere between 23 and 31. This led IBET to a series of attractive speculations. Leaving speculations aside for a moment, let us discuss the estimate of $N_0$ more closely.
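The step from the exhaustive sampling time to [1] is a one-line rearrangement. Writing it out also shows how it maps onto the base-10 formula used by IBET, under the assumption (made here for illustration, not stated in the text) that s = ln n for n conformations per residue:

```latex
\tau e^{sN_0} = T
\;\Longrightarrow\;
sN_0 = \ln\frac{T}{\tau}
\;\Longrightarrow\;
N_0 = \frac{1}{s}\ln\frac{T}{\tau},
\qquad
s = \ln n
\;\Longrightarrow\;
N_0 = \frac{\ln(T/\tau)}{\ln n} = \frac{\lg(T/\tau)}{\lg n}.
```

With n = 3, s = ln 3 ≈ 1.1, which is consistent with s being a constant close to unity.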

First of all, the very question is perfectly legitimate: what is the maximal length of the chain which can exhaustively sample all of its conformations within the specified time interval T, such as a millisecond to a second? Comparison with wine testing makes it immediately clear that the answer depends on how the sampling is organized. Indeed, in terms of wine testing, Mτ is the time required to visit all pavilions in an orderly fashion, one pavilion after another, never returning to an already visited place. Of course, a sober person can do that, but the sober model is unrealistic for the wine tester. Equally unrealistic is the model of protein chain dynamics which orderly samples all conformational states, never returning to a once visited conformation. For all other sampling strategies, the exhaustion time is larger than Mτ. In fact, Levinthal did not say that the time of exhaustive conformation sampling (or wine testing) was Mτ; he said it was at least Mτ, in other words, that it is ≥ Mτ. This was sufficient for him to conclude that exhaustive sampling is impossible for realistic N, such as N = 150.

The opposite, and perhaps more realistic, model of protein dynamics (and also of wine testing) would be a purely random walk in the space of conformations. To begin with, suppose it is an unbiased random walk, which means there is no conformation dependent (free) energy landscape involved. (Note that an energy landscape, when present, will increase the exhaustion time, because exhaustion requires visiting all the tops of all energy barriers, unlike the folding time, which, of course, can be decreased.) How can we estimate the exhaustive sampling time for the unbiased random walk model? Consider first wine testing pavilions arranged along a line. Then a random walk of duration t brings us as far as about $\sqrt{t/\tau}$ sites, which means we cover all M sites when $\sqrt{t/\tau} \cong M$, and the time of exhaustive sampling is $t \cong \tau M^2$. Needless to say, the difference between $\tau M$ and $\tau M^2$ is very significant. Based on $\tau M^2$, IBET would have obtained

$$N_0 = \frac{1}{2s} \ln\frac{T}{\tau}, \qquad [2]$$

twice smaller than [1]. This is a length between 9 and 15, which seems to rule out most of the speculations by IBET.
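The quadratic scaling for pavilions on a line is easy to see in a small simulation. The sketch below is an illustration only: the walker starts at one end of the line and is reflected at both ends (assumptions of this sketch, not of the text), and the function names are hypothetical. One step corresponds to one elementary time τ.

```python
import random


def cover_time_line(num_sites, rng):
    """Steps an unbiased walker needs to visit every site of a line.

    The walker starts at site 0 and is reflected at both ends.
    """
    position = 0
    visited = {0}
    steps = 0
    while len(visited) < num_sites:
        if position == 0:
            position = 1                      # reflected at the left end
        elif position == num_sites - 1:
            position = num_sites - 2          # reflected at the right end
        else:
            position += rng.choice((-1, 1))   # unbiased step
        visited.add(position)
        steps += 1
    return steps


def mean_cover_time(num_sites, trials=200, seed=0):
    """Average cover time over independent walks (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(cover_time_line(num_sites, rng) for _ in range(trials)) / trials


for m in (10, 20, 40):
    print(f"M = {m}: orderly visit takes {m - 1} steps, "
          f"random walk takes about {mean_cover_time(m):.0f} steps "
          f"(M^2 = {m * m})")
```

Doubling M should roughly quadruple the average number of steps, in contrast to the mere doubling expected from orderly sampling.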

In fact, an accurate estimate of the exhaustive sampling time by a random walk is not completely trivial. A more sophisticated estimate for the one-dimensional case, which will not be derived here, reads $t \cong \tau M^2 / \ln M$. This increases the result for $N_0$ to between 11 and 16. For a random walk in a space of higher dimension d, the result depends on d. When d crosses over 2, the mechanism of sampling changes, because the random walk tends to leave behind large unvisited regions. At d > 2, exhaustive sampling is only possible because the overall volume is restricted, and the random walk is forced to come back.
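The effect of each scaling law on the chain length can be compared numerically. The sketch below solves (exhaustion time) = T for N with $M = e^{sN}$ by bisection in the variable x = ln M. The values s = ln 3 and τ = 10^-12 s are assumptions borrowed from the inputs of (2), and the function and dictionary names are hypothetical; the resulting numbers shift with these choices, so they should not be expected to match the quoted ranges digit for digit.

```python
import math

# ln(exhaustion time / tau) as a function of x = ln M, for each scaling law.
LOG_COST = {
    "M": lambda x: x,                             # orderly sampling, tau * M
    "M^2": lambda x: 2.0 * x,                     # simple 1D random-walk estimate
    "M^2/ln M": lambda x: 2.0 * x - math.log(x),  # refined 1D estimate
}


def max_chain_length(scaling, time_budget_s, tau_s=1e-12, s=math.log(3)):
    """Chain length N at which the exhaustion time equals the time budget.

    Assumes M = exp(s * N) and a budget large enough that ln(T/tau) > 2.
    """
    target = math.log(time_budget_s / tau_s)
    log_cost = LOG_COST[scaling]
    low, high = 1.0, target + 10.0   # bracket for x = ln M
    for _ in range(100):             # bisection; each log_cost increases for x >= 1
        mid = 0.5 * (low + high)
        if log_cost(mid) < target:
            low = mid
        else:
            high = mid
    return low / s


for scaling in LOG_COST:
    shortest = max_chain_length(scaling, 1e-3)
    longest = max_chain_length(scaling, 1e3)
    print(f"{scaling:>9}: N0 from {shortest:.1f} (T = 1 ms) to {longest:.1f} (T = 1000 s)")
```

The logarithmic correction only partly offsets the factor of two lost in going from $\tau M$ to $\tau M^2$, so the refined one-dimensional estimate stays well below the orderly-sampling value.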


What is d in reality is anybody’s guess. Please do not forget that d here is thedimension of the abstract space of protein conformations, not the real three-dimen-sional space. The relevant dimension was measured for the vicinity of the nativeconformation of a lattice model protein (5). Measurements in different regions ofconformation space yield results for d between 1.4 and 4.5.

A result close to (1) would be correct for a dimension d as high as M; in this case, the exhaustion time would scale as τM ln M, yielding N0 between 18 and 25. However, this estimate is completely unrealistic, because d = M corresponds to the situation where each conformation (site in conformation space, or wine pavilion) can be reached with equal probability from every other conformation in just one step τ. Clearly, real protein chains are nowhere near this extreme.
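The different scalings of the exhaustion time can be compared by solving each of them for the chain length. The sketch below assumes M = exp(sN), consistent with the Levinthal time τ exp(sN), and solves t(M) = T by bisection. The numerical inputs (τ = 10⁻¹² s, T between 10⁻¹ and 10³ s, s = ln 3) are illustrative assumptions that may differ from the values used in this note, so the printed numbers need not coincide with the ranges quoted above. The function name and scaling labels are hypothetical.

```python
import math


def solve_chain_length(log_time_ratio: float, s: float, scaling: str) -> float:
    """Return N0 such that the exhaustion time equals T, with M = exp(s * N0).

    log_time_ratio is ln(T/tau). Writing x = ln M, the scalings become
      "linear":        x         = ln(T/tau)   (t = tau * M)
      "quadratic":     2x        = ln(T/tau)   (t = tau * M^2)
      "quadratic_log": 2x - ln x = ln(T/tau)   (t = tau * M^2 / ln M)
      "mean_field":    x + ln x  = ln(T/tau)   (t = tau * M * ln M)
    """
    lhs = {
        "linear": lambda x: x,
        "quadratic": lambda x: 2.0 * x,
        "quadratic_log": lambda x: 2.0 * x - math.log(x),
        "mean_field": lambda x: x + math.log(x),
    }[scaling]
    if log_time_ratio < 2.0:
        raise ValueError("ln(T/tau) must be at least 2 for this bracket")
    # Every left-hand side is increasing for x >= 1, so bisection is safe.
    lo, hi = 1.0, log_time_ratio + 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if lhs(mid) < log_time_ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / s


if __name__ == "__main__":
    tau = 1e-12            # assumed elementary step time, seconds
    s = math.log(3.0)      # assumed entropy per residue (3 states per residue)
    for t_fold in (1e-1, 1e3):  # assumed range of observed folding times
        ratio = math.log(t_fold / tau)
        row = {k: round(solve_chain_length(ratio, s, k), 1)
               for k in ("linear", "quadratic", "quadratic_log", "mean_field")}
        print(f"T = {t_fold:g} s -> {row}")
```

Whatever the inputs, the quadratic scaling gives exactly half of the linear result, and the two logarithmically corrected scalings fall between these two limits.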

There are quite a few other factors (5, 6), all reducing the result for N0, and all fundamentally arising from the fact that we are dealing with a polymer chain in which all units are linearly connected.

To conclude this part, it appears that IBET significantly overestimated the length of a protein capable of exhaustive conformation sampling. It seems safe to say that this length is smaller than 15. We leave it to the reader to decide whether the concept of folding should be applied to such a short chain.

At the next stage, IBET suggested that exhaustively sampling blocks combine to form hierarchically folding large proteins. This possibility was critically reviewed recently by Finkelstein (7). His analysis seems quite convincing. His major point is that reasonably fast folding is frequently observed (see (8) and references therein) under conditions where the native state is not significantly lower in free energy (or not lower at all!) than the fully denatured state – that is, at the point of thermodynamic equilibrium between the native and denatured states. Under such conditions, even the globule as a whole is not particularly stable, while the parts – supposedly the units of the hierarchical scenario – are not stable at all. In addition to this convincing argument by Finkelstein, the small value of N0 makes the stability of the presumed folding units questionable, and the whole hierarchical scheme even more difficult to imagine.

It should be noted that the understanding of the Levinthal paradox has progressed very significantly since it was first formulated (1). First of all, it has been found that the folding time, under the conditions of thermodynamic equilibrium between folded and unfolded states, scales as τ exp(s′N^(2/3)) (9-11), which is very significantly smaller than the Levinthal time, proportional to τ exp(sN). This estimate, as already said, is valid under the conditions of thermodynamic equilibrium, which means that it relies on the transition between denatured and native forms being highly cooperative, of the all-or-none type. Indeed, high cooperativity is a well established experimental fact (12). It is also well understood that high cooperativity is a property of proteins that is due to their peculiar selected sequences. Among random sequences, the vast majority would not exhibit any signs of cooperativity, as was first established by Shakhnovich and Gutin (13). This latter fact has been extensively tested using lattice models (as described, e.g., in the review article (4); see also references therein). Since everything related to lattice models is perceived with a large dose of (healthy?) skepticism in the protein community, it is important to emphasize that the fact of non-cooperative folding in the majority of sequences is well understood beyond lattice models. Actually, it was foreseen by Bryngelson and Wolynes a long time ago (14).
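To get a feeling for how much smaller τ exp(s′N^(2/3)) is than τ exp(sN), the two expressions can be compared on a logarithmic scale. The sketch below is illustrative only: the coefficients s and s′ are not specified here, so both are set to ln 3 purely as an assumption, as is τ = 10⁻¹² s, and the function names are hypothetical.

```python
import math

TAU_SECONDS = 1e-12  # assumed elementary step time


def log10_levinthal_time(n_residues: float, s: float, tau: float = TAU_SECONDS) -> float:
    """log10 of tau * exp(s * N), in seconds."""
    return math.log10(tau) + s * n_residues / math.log(10.0)


def log10_cooperative_time(n_residues: float, s_prime: float, tau: float = TAU_SECONDS) -> float:
    """log10 of tau * exp(s' * N^(2/3)), in seconds."""
    return math.log10(tau) + s_prime * n_residues ** (2.0 / 3.0) / math.log(10.0)


if __name__ == "__main__":
    s = s_prime = math.log(3.0)  # assumption: both coefficients set to ln 3
    for n in (50, 100, 150):
        lev = log10_levinthal_time(n, s)
        coop = log10_cooperative_time(n, s_prime)
        print(f"N={n}: log10(t_Levinthal/s) = {lev:.1f}, log10(t_cooperative/s) = {coop:.1f}")
```

Working with log10 of the times avoids overflow and makes the gap between the two scalings easy to read off as a difference in orders of magnitude.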

Speaking about the relation between sequence selection, the all-or-none cooperative mechanism of folding, and the scaling of folding time, it is interesting to mention that experimental observations do not provide any evidence of a dependence of the folding time (under equilibrium conditions) on the chain length. This means that the above-mentioned theoretical prediction, τ exp(s′N^(2/3)), although sufficient to rule out any paradoxes, may still be an overestimate.

The role of sequence selection is also well understood from a different viewpoint, namely that of mutation stability (see also the review article (4)). In the majority of sequences, every mutation breaks the stability of the native state with a probability very close to 100%. By contrast, the selected sequences – the same ones that exhibit a highly cooperative folding-unfolding transition! – are reliable in the sense that their native state with high probability survives and remains stable even after several mutations.

While the real mechanisms of evolutionary sequence selection remain unknown, and while the computational models of sequence selection have kept improving since the first suggestions (15, 16), it is becoming increasingly clear that there are many sequences which meet the sufficient criteria of reliable folding.

To summarize, in the light of all the findings of the last decade, it seems clear (to the present author, at least) that the discussions about the Levinthal paradox must now move forward to new spheres. How does sequence selection work (or how did it work) in real evolution? What are the specific scenarios of folding dynamics for selected sequences – how specific is the nucleation, how many and which conformations belong to the transition state, what is the reaction coordinate associated with folding; in other words, how precisely do these selected sequences slide down their folding funnels (17-20)? What are the physical principles behind the selection of certain spatial structures, or folds and fold families (21)? What are the general physical principles behind the enzymatic, motor and other functions of proteins, and do they have any relation to the principles involved in folding? What are the mechanisms of aggregation, or the mechanisms preventing aggregation, of proteins? There are very many works on these subjects, and listing them is a daunting task far beyond the framework of the present note. However many questions remain open, it seems that Levinthal’s question – how can a protein sample a “biologically large” number of conformations – has been answered: the protein does not sample them. Most importantly, there are sufficiently many “good” sequences for evolution to select from, where every “good” sequence is capable of folding, and does not need exhaustive conformation sampling to do so. Understanding this was a remarkable achievement of the last decade.

I am indebted to I. Berezovsky and E. Trifonov. Their paper (2) initiated the present note, and my personal discussions with them were useful and pleasant. I also thank V. Pande for critical reading of the first draft of this manuscript.

References and Footnotes

1. Levinthal, C., J. Chim. Phys. Biol. 65, 44-45 (1968).
2. Berezovsky I., Trifonov E., J. Biomol. Struct. Dyn. 20, 5-6 (2002).
3. H. Fraunfelder, Colloquium talk at the University of Minnesota Physics Department, October, 2002.
4. Pande, V. S., Grosberg A. Yu. and Tanaka, T., Reviews of Modern Physics 72, 259-314 (2000).
5. Du, R., Grosberg, A., Tanaka, T., Phys. Rev. Letters 84, 1828-1831 (2000).
6. Scala, A., Nunes Amaral, L. A., Barthèlèmy, M., Europhys. Lett. 55, 594-600 (2001).
7. Finkelstein, A. V., J. Biomol. Struct. Dyn. 20, 311-314 (2002).
8. Fersht, A. Curr. Opin. Struct. Biol. 7, 3-9 (1997).
9. Finkelstein, A., Badretdinov, A., Folding & Design 2, 115 (1997); Folding & Design 3, 67 (1998).
10. Du, R., Grosberg, A. Yu., Tanaka, T., Phys. Rev. Letters 83, 4670-4673 (1999).
11. Finkelstein, A. V., Ptitsyn, O. B., Protein Physics, Academic Press, 2002.
12. Privalov, P. L., Adv. Protein. Chem. 33, 167-241 (1997).
13. Shakhnovich, E., Gutin, A., Biophys. Chem. 34, 187, 1989.
14. Bryngelson, J. D., Wolynes, P. G., Proc. Natl. Ac. Sci. 87, 7524-7528 (1987).
15. Shakhnovich, E. I., Gutin, A. M., Proc. Nat. Acad. Sci., USA 90, 7195 (1993).
16. Pande, V. S., Grosberg, A. Yu., Tanaka, T., Proc. Nat. Acad. Sci. USA 91, 12972 (1994).
17. Shakhnovich, E. I., Folding and Design 1, 50-52 (1996); Shakhnovich, E. I. Curr. Opin. Struct. Biol. 7, 29-40 (1997).
18. Pande, V. S., Grosberg, A. Yu., Tanaka, T., Rokhsar, D. S., Curr. Opin. Struct. Biol. 8, 68-79 (1998).
19. Onuchic, J. N., Socci, N. D., Luthey-Schulten, Z., Wolynes, P. G., Folding and Design 1, 441-450 (1996).
20. Dill, K. A., Chan, H. S., Nat. Struct. Biol. 4, 10-19 (1997).
21. Li, H., Helling, R., Tang, C., Wingreen, N., Science 273, 666 (1996).

Date Received: October 27, 2002

Communicated by the Editor Ramaswamy H Sarma


Loop Folds in Proteins and Evolutionary Conservation of Folding Nuclei

http://www.jbsdonline.com

Abstract

We show that loops of close contacts involving hydrophobic residues are important in protein folding. Contrary to Berezovsky and Trifonov (J. Biomol. Struct. Dyn. 20, 5-6, 2002), the loops important in protein folding are usually much larger than 23-31 residues, being instead comparable to the size of the protein for single domain proteins. Additionally, what is important is not single loop contacts, but a highly interconnected network of such loop contacts, which provides extra stability to a protein fold and which leads to their conservation in evolution.

Key words: protein folding, Levinthal paradox, loop fold structure, closed loops, evolutionary conservation, folding nucleus

The title “Loop Fold Structure of Proteins: Resolution of Levinthal paradox” of the communication by Igor N. Berezovsky and Edward N. Trifonov (1) follows a long line of work looking for order or regularity in proteins. It suggests that the Levinthal paradox really does exist and has not yet been resolved, which is untrue. There is significant literature on this problem, with probably the most fundamental paper on this “paradox” written ten years ago by Zwanzig et al. (2). The common opinion in the protein folding community is that the Levinthal paradox (3) of finding a needle in a haystack doesn’t exist, because proteins do not fold by randomly searching through an extremely large number of possible conformations. The folding landscape does not resemble a flat golf course with a single hole corresponding to the native state. Rather it looks more like a bumpy funnel, so that the ball almost always (except for kinetic traps) rolls downhill (4, 5). The funnel-like landscape of folding is, according to the most popular theories, the result of hydrophobic collapse, which greatly reduces the total conformational space (6-8).

A protein does not fold from random coil conformations, yet the assumption of a random coil as the starting point for the folding process is the basis of the Levinthal paradox. Proteins fold to the native state from non-native (denatured) states that are already substantially structured, which makes all the arithmetic supporting the Levinthal paradox irrelevant.

In the letter “Cunning simplicity of a hierarchical folding” Alexei V. Finkelstein (9) does not support this simplest funnel-like mechanism of folding. Instead he explains the folding pathways through the formation of a folding nucleus and a delicate thermodynamic balance between the native and denatured states of the protein (10). He also classifies the idea of Berezovsky and Trifonov as “hierarchical folding” and shows that such hierarchical folding would lead to a native state that is too stable to unfold.

In our opinion the terminology “hierarchical folding” used by Finkelstein is improper. Hierarchical folding usually refers to the old views on protein folding, that the primary structure (sequence) leads to the formation of the protein secondary structure, which in turn leads to the formation of the protein tertiary structure. Such terminology leads him to conclude that the closed loops with nearly standard size segments of 23-31 residues reported by Trifonov and coworkers cannot be combined with more modern mechanisms of folding kinetics, such as a nucleation mechanism.

Journal of Biomolecular Structure & Dynamics, ISSN 0739-1102
Volume 20, Issue Number 3, (2002)
©Adenine Press (2002)

Andrzej Kloczkowski
Robert L. Jernigan*
Baker Center for Bioinformatics and Biological Statistics
Iowa State University
123 Office and Lab Bldg.
Ames, IA 50011-3020

Phone: (515) 294-3833
Fax: (515) 294-3841
Email: [email protected]

An Opinion Piece: Conversation on Levinthal Paradox & Protein Folding #4

As a matter of fact closed loops, but not necessarily only those of almost regular size 23-31 as reported by Berezovsky et al. in the series of their earlier papers (11-13), might be quite important in protein folding. The formation of such closed loops in the protein native state does not mean that residues forming such loop contacts always remain intact during the folding process, but there is a possibility that these loop contacts inside a protein core might be a part of the folding nucleus (14), especially if several such loop contacts involving hydrophobic residues are located near one another inside the core.

We studied the non-functional evolutionarily conserved residues in several subfamilies of proteins. The non-functional conserved residues are located inside a protein core and their detection might give us valuable information about the mechanism of protein folding. This work was initiated in our Lab by the late Oleg Ptitsyn, who studied different subfamilies of c-type cytochromes (15). He came to the conclusion that there is a common folding nucleus composed of four residues. Using horse cytochrome c (1hrc), which has a total length of 105 residues, as the reference for the numbering, the conserved residues are Gly(6), Phe(10), Leu(94) and Tyr(97). All these residues are hydrophobic and form a network of five conserved contacts: 6-94, 6-97, 10-94, 10-97 and 94-97. According to Ptitsyn this network of conserved contacts could be the folding nucleus for cytochrome c, which is of critical importance for its folding. Note that the sizes of these loops are mostly substantially larger than the 23-31 residues proposed by Berezovsky et al.
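The loop sizes implied by these five contacts can be tabulated directly. The short sketch below takes the loop size to be the sequence separation |j − i| between the two contacting residues, which is an assumption about the definition (counting both end residues would add one), and its names are hypothetical.

```python
# Conserved contacts in horse cytochrome c (1hrc numbering), as listed in the text.
CONSERVED_CONTACTS = [(6, 94), (6, 97), (10, 94), (10, 97), (94, 97)]
CHAIN_LENGTH = 105            # total length of 1hrc, from the text
PROPOSED_LOOP_RANGE = (23, 31)  # loop size range proposed by Berezovsky et al.


def loop_size(contact: tuple[int, int]) -> int:
    """Sequence separation between the two residues of a contact (assumed definition)."""
    i, j = contact
    return abs(j - i)


def exceeds_proposed_range(contact: tuple[int, int]) -> bool:
    """True if the loop closed by this contact is longer than the proposed range."""
    return loop_size(contact) > PROPOSED_LOOP_RANGE[1]


if __name__ == "__main__":
    for contact in CONSERVED_CONTACTS:
        size = loop_size(contact)
        share = size / CHAIN_LENGTH
        print(f"contact {contact[0]}-{contact[1]}: loop of {size} residues "
              f"({share:.0%} of the chain), larger than 23-31: {exceeds_proposed_range(contact)}")
```

Four of the five contacts close loops of 84 to 91 residues, while the 94-97 contact is a short local one.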

The second subfamily of proteins studied by Ptitsyn and Kai-Li Ting was globins (16). Using sperm whale myoglobin (1mbd) as the reference, the structurally conserved residues are Val(10), Trp(14), Ile(111), Leu(115), Met(131) and possibly Leu(135). All these hydrophobic residues form a network of conserved contacts which is possibly the folding nucleus.

Recently Kai-Li Ting and Jernigan (17) studied the conservation of the non-functional residues in the lysozyme/α-lactalbumin family and identified the possible folding nucleus. They identified 19 conserved hydrophobic non-functional residues. That is probably too many, owing to a lack of substantial evolutionary diversity in the studied protein family. Another possibility is that three conserved subclusters are present, where each subcluster is composed of a network of conserved contacts. Importantly, all of these conserved residues are obtained only by including phylogenetically diverse cases, pointing up the uncertainties involved in sequence comparisons.

Kloczkowski and Jernigan (18) developed a theory of evolutionarily conserved residues based on the hydrophobic-polar (HP) lattice model of proteins and showed that a highly connected network of hydrophobic contacts provides extra stability to a protein fold and may be crucial in reaching the lowest energy native state. We have developed a model for locating the evolutionarily conserved residues in proteins. We first find the core of a protein of known structure based on the packing of residues, measured by the number of contacts. Then we find the so-called “supercore” inside the core, by maximizing the number of hydrophobic contacts. The predicted conserved residues for 1hrc and 1mbd are exactly the same as those reported by Ptitsyn and by Ptitsyn and Ting, respectively. The theory was also applied with success to identifying the conserved residues reported by Mirny and Shakhnovich (19), who used the COC (conservatism of conservatism) method.

These results show that looping contacts are indeed important for protein folding. What is important, however, is not the single loop contacts of regular size 23-31 residues reported by Berezovsky et al., but rather an interconnected network (or cluster) of such contacts of loops of much larger size, possibly also involving short helical contacts. In the case of 1hrc such a loop is of size 90 residues (which is nearly the total length, 105, of the protein), while in the case of 1mbd the size of the loops is 100-120, also approaching the total length, 153, of the protein. These are more consistent with the frequently remarked upon feature of proteins – that the ends of the chain are close together (20).

References and Footnotes

1. Berezovsky, I. N. and Trifonov, E. N., J. Biomol. Struct. Dyn. 20, 5-6 (2002).
2. Zwanzig, R., Szabo, A. and Bagchi, B., Proc. Natl. Acad. Sci. USA 89, 20-22 (1992).
3. Levinthal, C., J. Chim. Phys. Chim. Biol. 65, 44-45 (1968).
4. Bryngelson, J. D. and Wolynes, P. G., Proc. Natl. Acad. Sci. USA 84, 7524-7528 (1987).
5. Dill, K. A., Protein Sci. 8, 1166-1180 (1999).
6. Lau, K. F. and Dill, K. A., Proc. Natl. Acad. Sci. USA 87, 638-642 (1990).
7. Chan, H. S. and Dill, K. A., J. Chem. Phys. 95, 3775-3787 (1991).
8. Yue, K. and Dill, K. A., Proc. Natl. Acad. Sci. USA 92, 146-150 (1995).
9. Finkelstein, A. V., J. Biomol. Struct. Dyn. 20, 311-314 (2002).
10. Finkelstein, A. V., Badretdinov, A. Y. and Gutin, A. M., Proteins 23, 142-150 (1995).
11. Berezovsky, I. N., Grosberg A. Y. and Trifonov, E. N., FEBS Letters 466, 283-286 (2000).
12. Berezovsky, I. N. and Trifonov, E. N., J. Mol. Biol. 307, 1419-1426 (2001).
13. Berezovsky, I. N. and Trifonov, E. N., J. Biomol. Struct. Dyn. 19, 397-403 (2001).
14. Ptitsyn, O. B., Dokl. Akad. Nauk SSSR 210, 1213-1215 (1973).
15. Ptitsyn, O. B., J. Mol. Biol. 278, 655-666 (1998).
16. Ptitsyn, O. B. and Ting, K.-L., J. Mol. Biol. 291, 671-682 (1999).
17. Ting, K.-L. and Jernigan, R. L., J. Mol. Evol. 54, 425-436 (2002).
18. Kloczkowski, A. and Jernigan, R. L., Evolutionary Conserved Residues and Protein Folding, to be published.
19. Mirny, L. A. and Shakhnovich, E. I., J. Mol. Biol. 291, 177-196 (1999).
20. Bahar, I. and Jernigan, R. L., Biophys. J. 66, 454-466 (1994).

Date Received: November 6, 2002

Communicated by the Editor Ramaswamy H Sarma


What is Paradoxical about Levinthal Paradox?

http://www.jbsdonline.com

Abstract

We would be tempted to state that there has never been a Levinthal paradox. Indeed, Levinthal raised an interesting problem about protein folding, as he realized that proteins have no time to explore exhaustively their conformational space on the way to their native structure. He did not seem to find this paradoxical and immediately proposed a straightforward solution, which has essentially never been refuted. In other words, Levinthal solved his own paradox.

During a meeting held in 1969 in Monticello, Levinthal estimated the number of different conformations accessible to a 150-residue protein to be roughly of the order of 10^300, whereas the number of conformations sampled by a natural protein before reaching its final state is of the order of 10^8 (1). In the same report (usually incorrectly referenced), Levinthal himself solved what could at that time appear as non-intuitive, even paradoxical. Indeed, he proposed that “protein folding is speeded and guided by the rapid formation of local interactions which then determine the further folding of the peptide; this suggests local amino acid sequences which form stable interactions and serve as nucleation points in the folding process.”
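The orders of magnitude quoted from Levinthal's report can be unpacked with a few lines of arithmetic. The sketch below works with base-10 exponents only; the constant and function names are hypothetical, and the per-residue count is merely what a uniform split of the 10^300 conformations over 150 residues would imply.

```python
N_RESIDUES = 150     # chain length quoted in the text
LOG10_TOTAL = 300    # log10 of the number of accessible conformations
LOG10_SAMPLED = 8    # log10 of the number of conformations actually sampled


def implied_states_per_residue(log10_total: float, n_residues: int) -> float:
    """Per-residue conformation count if the total is split evenly over residues."""
    return 10.0 ** (log10_total / n_residues)


def log10_sampled_fraction(log10_sampled: float, log10_total: float) -> float:
    """log10 of the fraction of conformation space that is actually visited."""
    return log10_sampled - log10_total


if __name__ == "__main__":
    per_residue = implied_states_per_residue(LOG10_TOTAL, N_RESIDUES)
    fraction = log10_sampled_fraction(LOG10_SAMPLED, LOG10_TOTAL)
    print(f"implied conformations per residue: {per_residue:.0f}")
    print(f"log10 of the sampled fraction: {fraction}")
```

The sampled fraction is so small that it is only meaningful on a logarithmic scale, which is the point of Levinthal's comparison.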

Of course, there are two words in Levinthal’s sentence that are questionable: local and stable. Let us start with the last one. Clearly, a few residues can only present a very marginal stability. Anfinsen clarified this point in 1973 by introducing the concept of ‘flickering equilibria’ (2): “it seems reasonable to suggest that portions of a protein chain that serve as nucleation sites for folding will be those that can ‘flicker’ in and out of the conformation that they occupy in the final protein, and that they will form a relatively rigid structure stabilized by a set of cooperative interactions.” Moreover, Anfinsen did not restrict nucleation to local interactions along the protein chain, since he stated that “the nucleation centers might be expected to involve substructures as helices, pleated sheets or β-bends”.

It was realized at about the same time that the folding of a protein into its native structure occurs essentially independently of the initial conditions; this conclusion was reached on the basis of the experimental observation that small denatured proteins were able to refold in vitro (3), under suitable conditions. This ruled out the hypothesis, to our knowledge originally due to Chantrenne (4), that the only way of achieving correct folding is by the sequential growth of the polypeptide chain on the ribosome. In the 70’s, the prevailing view was that folding follows pathways along which nucleation events take place. The number of pathways was (and still is) a subject of debate. In particular, according to Honig et al. (5) in 1976, “proteins fold by following a multiply branched pathway”. In good agreement with this assumption, most current scenarios involve a huge number of parallel pathways, possibly sharing a number of key steps.

Another point of discussion concerned the question of whether protein folding is under thermodynamic or kinetic control. According to Anfinsen (2), native protein structures correspond to global free energy minima. But Levinthal (1) argued that “the final conformation has not necessarily to be the one of lowest free energy. It obviously must be a metastable state which is in a sufficiently deep energy well to survive possible perturbations in a biological system.”

Journal of Biomolecular Structure & Dynamics, ISSN 0739-1102
Volume 20, Issue Number 3, (2002)
©Adenine Press (2002)

Marianne Rooman*
Yves Dehouck
Jean Marc Kwasigroch
Christophe Biot
Dimitri Gilis
Ingénierie Biomoléculaire
Université Libre de Bruxelles
CP 165/64, 50 avenue Roosevelt
B-1050 Bruxelles, Belgium

*Phone: 32-2-650 2067/5572
Fax: 32-2-650 3606
Email: [email protected]

An Opinion Piece: Conversation on Levinthal Paradox & Protein Folding #5

These chosen extracts show that the folding problem was well understood in the early 70’s. What have we learned since then?

Many new folding models, theories and concepts have been proposed, often based on interesting ideas or experimental observations, and they indisputably provide valuable details and clarifications on the mechanisms of protein folding. We feel however that the originality of these ‘new’ concepts is often somewhat overrated, through comparison with a hypothetical ‘classical view’ of protein folding which would involve well defined pathways, predominance of local interactions and an absolute necessity for stable intermediates. As stressed by Honig (6), this simplistic view is very unlikely to correctly reflect the thoughts prevailing in those pioneering days. Curiously however, the lack of contradiction between old and new views has not prevented ongoing, passionate debates around Levinthal’s paradox, frequently opposing the partisans of different models.

First, there has been a continuing dispute between the supporters of nucleation and those of hierarchic folding, where typically some secondary structure elements or loops form first. These two views are easily reconciled when considering that small structure elements can only flicker in and out, as nicely pointed out by Anfinsen (2). Therefore, hierarchic folding units must not be viewed as rigid but rather as flickering entities. On the other hand, it is obvious that not all residues along the chain are equally prone to constitute nucleation centers. This is supported by the experimental observation that some peptides are more structured in solution than others (7), which actually means that they flicker in and out of a specific conformation. Moreover, some specific protein residues have been observed to form native tertiary contacts earlier than others and to stabilize folding nuclei (8). Hence, we do not feel that hierarchic folding must be opposed to nucleation; rather, these approaches should be considered as complementary.

In the context of hierarchic folding, it has been recently suggested that proteins are made up of closed loops that fold separately (9). It is difficult to believe that all current proteins exhibit this property. However, the idea that original proteins were small closed-loop peptides with flickering stability, which have been assembled during evolution, and that some trace of this evolution is left in the current proteins, is quite attractive. Again, with this softened view, much of the controversy vanishes.

Another much debated issue is whether nucleation centers consist of local interactions along the chain or of tertiary contacts. The answer seems obvious: it is protein-dependent. Indeed, numerous experimental and theoretical studies have revealed protein sequences exhibiting a strong signal towards the native structure, encoded locally along the sequence or, on the contrary, in specific tertiary contacts. Both tendencies are probably often combined, with for example small flickering secondary structure elements forming a tertiary contact and thereby inducing nucleation.

To achieve rapid folding towards the native state, it has been suggested that the energy gap between the native conformation and the other conformations that are structurally unrelated to it must be sufficiently large (10). This is probably, in general, a necessary condition, at least in a slightly modified form that takes into account the existence of proteins adopting several folded structures, whether or not this depends on the environmental conditions. Another concept proposed a few years ago is that protein energy landscapes have the shape of a funnel (11). This unquestionably yields a very nice, intuitive vision of the folding mechanism, where folding is funneled towards low energy states, which are much less numerous than high energy states.


However, this basically corresponds to requiring the independence towards initial conditions and the absence of insurmountable energy minima on the ways to the native state, and to translating the ‘old view’ in the framework of statistical mechanics. Besides, a funnel-like shape has been shown to constitute an insufficient condition for ensuring consistent folding (12).

Finally, with the finding of proteins that polymerize, aggregate, exhibit domain swapping or several folded structures, it was realized that folding is even more complex to predict and simulate than originally hoped for, and that there is probably also not a unique answer to the question whether native structures correspond to relative or absolute free energy minima. These considerations bring us to think about the folding mechanism that evolution tends to favor. This issue is probably related to the possible biological role played by the multiplicity of conformations, or to its pathological consequences. But most certainly, evolution singled out a tiny fraction of possible amino acid sequences, having adequate folding and functional properties.

Pretending that nothing has evolved since the early 70’s is certainly an exaggeration. The development of potent experimental and theoretical techniques has helped to support and clarify many of the proposed views. We have, however, the impression that some earlier contributions in the discipline have been forgotten or misinterpreted and that, in light of these, recent advances should be put somewhat into perspective.

Acknowledgments

D. G. and M. R. are Research Assistant and Research Director, respectively, at the Belgian National Fund for Scientific Research (FNRS). C. B. is supported by a BioVal research program of the Walloon Region, and Y. D. by a grant from the Fonds de la Recherche pour l’Industrie et l’Agriculture (FRIA).

References and Footnotes

1. Levinthal, C. How to Fold Graciously. In Mossbauer spectroscopy in biological systems. Proceedings of a meeting held at Allerton House, Monticello, Illinois. Edited by Debrunner P., Tsibris J. & Munck E. (University of Illinois Press, Urbana, Illinois, 1969), pp 22-24.
2. Anfinsen, C. B., Science 181, 223-230 (1973).
3. Anfinsen, C. B., Haber, E., Sela, M., and White F. H., Proc. Natl. Acad. Sci. USA 47, 1309-1314 (1961); Anfinsen, C. B. General remarks on protein structure and biosynthesis. In Informational Macromolecules, edited by Vogel, H. J., Bryson, V. and Lampen, J. O. (Academic Press, New-York & London, 1963), pp 153-166; Schechter, A. N., Chen, R. F., and Anfinsen, C. B., Science 167, 886-887 (1970).
4. Chantrenne H., The Biosynthesis of Proteins (Pergamon Press, Oxford, London, New-York & Paris, 1961), p 122.
5. Honig, B., Ray, A., and Levinthal, C., Proc. Natl. Acad. Sci. USA 73, 1974-1978 (1976).
6. Honig B., J. Mol. Biol. 293, 283-293 (1999).
7. Dyson, H. J., Cross, K. J., Houghten, R. A., Wilson, I. A., Wright, P. E., and Lerner, R. A. Nature 318, 480-483 (1985); Dyson, H. J., Rance, M., Houghten R. A., Wright, P. E., and Lerner, R. A., J. Mol. Biol. 201, 201-217 (1988); Wright, P. E., Dyson, H. J., and Lerner, R. A., Biochemistry 27, 7167-7175 (1988).
8. Itzaki, L. S., Otzen D. E., and Fersht, A. R., J. Mol. Biol. 254, 260-288 (1995).
9. Berezovsky, I. N., Grosberg, A. Y., Trifonov, E. N., FEBS Lett. 466, 283-286 (2000); Berezovsky, I. N., and Trifonov, E. N., J. Mol. Biol. 307, 1419-1426 (2001); J. Biomol. Struct. Dyn. 20, 5-6 (2002).
10. Sali, A., Shakhnovich, E., and Karplus, M., J. Mol. Biol. 235, 1614-1636 (1994); Gutin, A. M., Abkevich, V. I., and Shakhnovich, E. I., Proc. Natl. Acad. Sci. USA 92, 1282-1286 (1995).
11. Bryngelson, J. D., Onuchic, J. N., Socci, N. D., and Wolynes P. G., Proteins 21, 167-195 (1995).
12. Bogatyreva, N. S., and Finkelstein, A. V., Protein Eng. 14, 521-523 (2001).

Date Received: October 29, 2002

Communicated by the Editor Ramaswamy H Sarma


Protein Folding: Where is the Paradox?

http://www.jbsdonline.com

Abstract

In this contribution we shall try to argue that no folding scenario – be it hierarchical, non-hierarchical, nucleation, etc. – needs to be invoked to solve Levinthal’s paradox: It fails on its own grounds.

Since we could not find a satisfactory definition of paradox, we decided to coin one of our own: A paradox is a logically consistent construction which sprouts from a false premise, but one which is not too obviously so, and produces a logical conclusion which is very ostensibly false, causing surprise and forcing us to revise the starting premise. In this way, paradoxes help us to think more rigorously and be more critical of our own thoughts.

Here, we believe, is where the Socratic or moral content of the paradox resides. Thus, the layman might not know that the sum of infinitely many rational numbers may yield a finite result, and for that reason he might be surprised at some of Zeno’s conclusions, like “an arrow never reaches the target”. Since the layman would find the latter statement more striking than the original premise (we assume he has not been exposed to Calculus), he will be forced to think carefully about something he might never have thought about otherwise, i.e. that the sum of infinitely many rational numbers might be finite. And, after this thinking induced by the paradox, he might be a bit less of a layman.

Not quite in the ancient tradition of Zeno’s paradoxes, Levinthal argued that since the number of possible conformations of a protein chain may be estimated to be exponential in the number (N) of amino acids, the exhaustive exploration of conformation space in a finite time of biological relevance is practically impossible, since the exploration time is also exponential in N. He assumed the number of possible conformations for each individual amino acid to be small and fixed, say from 2 to 100, and that the conformations available to each individual amino acid may be assigned constant and equal probabilities (although the equality condition may be relaxed) at all times throughout the exploration of conformation space. This premise is false: the conformations of an individual amino acid are not equally probable, nor are their probabilities constant in time. Thus, the number of a-priori possible conformations of the chain might indeed be exponential in N, but this does not imply that the time to exhaustively visit all accessible conformations of the chain is also exponential in N.

The probability of a conformation of an individual amino acid within the chain depends on the geometric or structural constraints and basins of attraction the chain generates as each amino acid picks its coordinates. Thus, the repulsive Lennard-Jones terms in the intramolecular potential energy – to name a single contribution – dramatically reduce the probability of certain conformations (i.e. those producing an over-all structure at odds with excluded volume), while the attractive pairwise contributions dramatically increase the probability of other conformations.
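The effect of excluded volume on the count of allowed conformations can be illustrated with a toy model. The sketch below is an assumption-laden stand-in, not the authors' calculation: it replaces the repulsive Lennard-Jones terms with a hard-core rule (no lattice site may be occupied twice), places the chain on a square lattice, and uses hypothetical function names. It compares the number of self-avoiding chains with the 4^n chains obtained when every step direction is treated as independent and equally available.

```python
MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))


def count_self_avoiding_walks(n_steps: int) -> int:
    """Count n-step walks on the square lattice that never revisit a site."""

    def extend(position: tuple[int, int], visited: set, remaining: int) -> int:
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in MOVES:
            nxt = (position[0] + dx, position[1] + dy)
            if nxt not in visited:  # hard-core excluded volume
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)
        return total

    origin = (0, 0)
    return extend(origin, {origin}, n_steps)


def allowed_fraction(n_steps: int) -> float:
    """Fraction of the 4^n unconstrained walks that survive excluded volume."""
    return count_self_avoiding_walks(n_steps) / 4 ** n_steps


if __name__ == "__main__":
    for n in range(1, 11):
        print(f"n={n:2d}  self-avoiding: {count_self_avoiding_walks(n):6d}  "
              f"unconstrained: {4 ** n:8d}  fraction: {allowed_fraction(n):.3f}")
```

In this toy model the allowed fraction shrinks steadily with chain length, showing that the conformations of one unit are not independent of the rest of the chain. It says nothing by itself about attractive interactions or about the time needed to explore the remaining conformations.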

Journal of Biomolecular Structure & Dynamics, ISSN 0739-1102
Volume 20, Issue Number 3, (2002)
©Adenine Press (2002)

Ariel Fernández 1,*
Alejandro Belinky 2
María de las Mercedes Boland 3

1 Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637
2 Finance and Economics Program, Columbia University Business School, 3022 Broadway #8G, New York, NY 10027
3 100 Morningside Drive, New York, NY 10027

*Phone: 773 834 4782
Fax: 773 702 0439
Email: [email protected]

An Opinion Piece: Conversation on Levinthal Paradox & Protein Folding #6

In a nutshell: The very existence of an intramolecular potential renders the starting premise of the Levinthal paradox false.

Moreover, even the idea that the protein chain exhaustively explores all conformations available along a successful folding pathway is false. Why is the thermodynamic limit even relevant to protein folding? Where does the idea that the native structure is the free energy minimum come from? To the best of our knowledge, these are baseless hypotheses.

Thus, we fail to see why the Levinthal paradox attracts attention: Its premises are too obviously false and the striking conclusion it purports to reach is uncalled for. The former might not be much of a defect in a paradox, the latter certainly is.


Date Received: October 5, 2002

Communicated by the Editor Ramaswamy H Sarma