Approximate Scaling Properties of RNA Free Energy Landscapes

Approximate Scaling Properties of RNA Free Energy Landscapes Subbiah Baskaran Peter F. Stadler Peter Schuster SFI WORKING PAPER: 1995-10-083 SFI Working Papers contain accounts of scientific work of the author(s) and do not necessarily represent the views of the Santa Fe Institute. We accept papers intended for publication in peer-reviewed journals or proceedings volumes, but not papers that have already appeared in print. Except for papers by our external faculty, papers must be based on work done at SFI, inspired by an invited visit to or collaboration at SFI, or funded by an SFI grant. ©NOTICE: This working paper is included by permission of the contributing author(s) as a means to ensure timely distribution of the scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the author(s). It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may be reposted only with the explicit permission of the copyright holder. www.santafe.edu SANTA FE INSTITUTE


By

Subbiah Baskaran (a,b,c), Peter F. Stadler (b,d), and Peter Schuster (b,c,d,*)

(a) ES���, Marshall Space Flight Center, Huntsville, AL �����, USA
(b) Institut für Theoretische Chemie, Universität Wien, Vienna, Austria
(c) Institut für Molekulare Biotechnologie e.V., Jena, Germany
(d) Santa Fe Institute, Santa Fe, NM �����, USA

(*) Correspondence to: Institut für Theoretische Chemie, Universität Wien, Währingerstraße ��, A-���� Vienna, Austria. Email: pks@tbi.univie.ac.at


Baskaran et al.: Scaling in RNA Landscapes

Abstract

RNA free energy landscapes are analyzed by means of "time series" that are obtained from random walks restricted to excursion sets. The power spectra, the scaling of the jump size distribution, and the scaling of the curve length measured with different yardstick lengths are used to describe the structure of these "time series". Although they are stationary by construction, we find that their local behavior is consistent with both AR(1) and self-affine processes. Random walks confined to excursion sets (i.e., with the restriction that the fitness value exceeds a certain threshold at each step) exhibit essentially the same statistics as free random walks.

We find that an AR(1) time series is in general approximately self-affine on time scales up to approximately the correlation length. We present an empirical relation between the correlation parameter $\rho$ of the AR(1) model and the exponents characterizing self-affinity.

Key Words

RNA Folding, Excursion Sets, Fractal Landscape, AR(1) Process, 1/f noise


1. Introduction

Evolutionary optimization as well as combinatorial optimization take place on landscapes resulting from mapping (micro-)configurations to scalar quantities like fitness values, energies, or costs (Schuster & Stadler, ����). In most cases one lacks a detailed understanding of the structure of the fitness landscape that underlies a particular instance of biological evolution. One thus resorts to using model landscapes. Well-studied examples include combinatorial optimization problems such as the Traveling Salesman Problem, the Graph Matching Problem, or the Graph Bipartitioning Problem; various spin glass models, among them the Sherrington-Kirkpatrick models; and Kauffman's Nk models (Kauffman, ����). The ruggedness of a landscape, often measured by means of a correlation function, is of crucial importance for the dynamics of the evolution process (Eigen et al., ����; Bonhoeffer & Stadler, ����). Detailed studies of the correlation structure of model landscapes can be found, for instance, in the following references (Stadler & Schnabl, ����; Stadler & Happel, ����; Stadler, ����; Weinberger, ����a; Weinberger, ����b; Weinberger & Stadler, ����).

Only in the case of RNA landscapes do we have a sound biophysical model for the fitness function. Models based on RNA secondary structure prediction algorithms have been analyzed in great detail in a series of papers (Fontana et al., ����; Fontana et al., ����; Bonhoeffer et al., ����; Tacker et al., ����; Schuster et al., ����). Evolutionary dynamics on such landscapes has been the topic of extensive research as well (Fontana & Schuster, ����; Fontana et al., ����; Huynen et al., ����). A detailed understanding of these landscapes is a necessary prerequisite for building simpler models based on spin glass or Nk model landscapes that are significantly less costly in computer simulations and that lend themselves much more easily to analytical treatment.

Weinberger (Weinberger, ����) suggested characterizing a landscape by means of a "time series" obtained by sampling the fitness values along a random walk in sequence space. While this method is rather indirect, it yields a data set that can be analyzed by the standard methods of time series analysis (Hordijk, ����). In this contribution we shall investigate the "fractal-like" features of landscapes in terms of the approximate self-affinity of these "time series".

A great variety of systems, physical and biological, exhibit $1/\omega$ power spectra, commonly called $1/f$ noise or "flicker" noise. Some examples are resistivity fluctuations in conducting materials (Weissman, ����), luminosity fluctuations of stars and galaxies (Nolan et al., ����), flow fluctuations of highway traffic (Musha & Higuchi, ����) and of deep ocean waters (Taft et al., ����), frequency variations of quartz oscillators (Attkinson et al., ����), and the loudness fluctuations in music and speech (Voss & Clarke, ����). In biological systems $1/f$ noise has been reported for nerve membranes (Verveen & Derkson, ����) and for the DNA sequences of non-coding introns (Voss, ����; Li & Kaneko, ����) as well as of coding regions (Buldyrev et al., ����). In this paper we will show that the "time series" sampled along a random walk on an RNA free energy landscape also leads to $1/f$ noise.

This contribution is organized as follows. In section 2 we review some notions that are basic to the theory of fitness landscapes. In particular, we introduce a variety of correlation measures and highlight their relations with each other, and we consider the class of landscapes that lead to exponential correlation functions of the "time series" obtained from simple random walks. In section 3 we briefly consider self-affine time series and show that AR(1) processes mimic self-affinity on time scales up to their correlation length. These findings are applied to free energy landscapes of RNA in section 4. In particular, we shall see that the mountainous parts of the landscapes do not differ significantly from the average fitness regime, at least as long as the excursion sets do not fragment into tiny pieces. Section 5 concludes our discussion. The relaxation time of a simple random walk on a sequence space is computed in the appendix.


2. Landscapes

2.1. Rugged Landscapes

Definition. A landscape is a map $f: C \to \mathbb{R}$, where $C = (X, d)$ is a finite metric space with metric $d: X \times X \to \mathbb{R}$.

In most applications of landscapes in biology, physics, or combinatorial optimization the configuration space $(X, d)$ can be represented as a graph $\Gamma$. Two configurations $x$ and $y$ are then neighbors in $\Gamma$ if $d(x, y) = 1$. The metric $d$ is often obtained from an editing procedure that allows one to interconvert two configurations $x, y \in X$ by means of a finite sequence of operations; $d(x, y)$ is commonly defined as the number of operations in the shortest sequence that changes $x$ into $y$ or vice versa. In a biological context the "elementary operations" are in general mutations. We will restrict ourselves here to the case where $X$ is a set of sequences of common length $n$ which are constructed from some alphabet with $\kappa$ letters. In this case $d$ is the so-called Hamming distance (Hamming, ����), and the graph $\Gamma$ is known as the sequence space $Q^n_\kappa$, or Boolean hypercube in the special case $\kappa = 2$. For a recent review see (Schuster & Stadler, ����; Stadler, ����b).

2.2. Correlation Functions

A very important characteristic of a landscape is its ruggedness. Rugged landscapes are characterized by a large number of local optima (Palmer, ����), by the fact that uphill walks are short and easily trapped in local optima, and by short correlation lengths (Kauffman, ����). There is ample evidence that heuristic optimization procedures work less efficiently the more rugged a landscape is (Stadler & Schnabl, ����; Schuster & Stadler, ����). It will be convenient to define for a given landscape $f$

$$\bar f = \frac{1}{|X|}\sum_{x\in X} f(x), \qquad \sigma_f^2 = \frac{1}{|X|}\sum_{x\in X}\bigl(f(x)-\bar f\bigr)^2. \tag{1}$$

It has been suggested by various authors (Eigen et al., ����; Fontana et al., ����; Sorkin, ����; Weinberger, ����) to measure "ruggedness" by some sort of correlation measure. We shall use the following definition, which was first proposed in ref. (Eigen et al., ����):

$$\rho(d) = \frac{1}{\sigma_f^2\,|D_d|}\sum_{(x,y)\in D_d}\bigl(f(x)-\bar f\bigr)\bigl(f(y)-\bar f\bigr). \tag{2}$$

Here $D_d$ denotes the set of all pairs of vertices that have mutual distance $d$ in the graph $\Gamma$. For a sequence space we have for instance

$$|D_d| = \kappa^n(\kappa-1)^d\binom{n}{d}. \tag{3}$$

This definition is useful if $\Gamma$ is a distance-regular graph (Brouwer et al., ����). A more general mathematical framework is developed in (Stadler, ����c; Happel & Stadler, ����; Stadler & Happel, ����).

Weinberger (Weinberger, ����; Weinberger, ����a; Weinberger, ����b) suggested investigating the properties of landscapes by sampling the values along a simple random walk in the configuration space $C$:

$$\begin{array}{ccccccccccc}
x_0 & \to & x_1 & \to & x_2 & \to & \cdots & \to & x_k & \to & \cdots \\
\vert & & \vert & & \vert & & & & \vert & & \\
f_0 & \to & f_1 & \to & f_2 & \to & \cdots & \to & f_k & \to & \cdots
\end{array} \tag{4}$$

where $x_i$ and $x_{i+1}$ are neighbors in $C$. At each step one of the neighbors of $x_i$ in $\Gamma$ is chosen with uniform probability, i.e., the series $\{x_i\}$ is a simple random walk on $C$ (Spitzer, ����). By evaluating the configurations along the walk $\{x_i\}$ we obtain a random walk on the landscape, i.e., the "time series" $\{f_i = f(x_i)\}$. This series is stationary by construction.
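The sampling scheme of eq. (4) can be sketched in a few lines. This is a minimal illustration only: `toy_fitness` is a hypothetical stand-in for a real landscape (it is not an RNA folding energy), and the names `random_walk` and `hamming` are assumptions introduced here.

```python
import random

ALPHABET = "ACGU"  # alphabet with kappa = 4 letters

def toy_fitness(seq):
    # Hypothetical landscape, for illustration only: number of equal adjacent letters.
    return float(sum(a == b for a, b in zip(seq, seq[1:])))

def random_walk(n, steps, fitness, rng):
    """Simple random walk on the sequence space; returns the x_i and f_i = f(x_i)."""
    x = [rng.choice(ALPHABET) for _ in range(n)]
    xs, fs = ["".join(x)], [fitness(x)]
    for _ in range(steps):
        pos = rng.randrange(n)  # uniform position ...
        x[pos] = rng.choice([c for c in ALPHABET if c != x[pos]])  # ... and uniform new letter
        xs.append("".join(x))
        fs.append(fitness(x))
    return xs, fs

def hamming(a, b):
    return sum(p != q for p, q in zip(a, b))

rng = random.Random(0)
xs, fs = random_walk(n=20, steps=500, fitness=toy_fitness, rng=rng)
```

Choosing a uniform position and then a uniform different letter selects each of the $n(\kappa-1)$ neighbors with equal probability, which is what a simple random walk requires.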


The autocorrelation function of a stationary time series is defined by

$$r(s) = \frac{\langle f_t f_{t+s}\rangle - \langle f_t\rangle^2}{\sigma^2}, \tag{5}$$

where $\sigma^2$ is the variance, which coincides with $\sigma_f^2$ defined above, and the angular brackets indicate the expectation value taken over all random walks $\{x_i\}$ and all times $t$. Provided the graph $\Gamma$ is $D$-regular, i.e., each configuration $x$ has exactly $D$ neighbors, we may write the transition matrix of the random walk as $T = (1/D)A$. The entry $A_{xy}$ of the adjacency matrix is 1 or 0, depending on whether the configurations $x$ and $y$ are neighbors or not. It is shown in (Stadler, ����c) that the correlation function $r(s)$ has the following algebraic representation:

$$r(s) = \frac{1}{\sigma^2}\Bigl(\langle f, T^s f\rangle - \bar f^{\,2}\Bigr). \tag{6}$$

Note that $\langle\cdot,\cdot\rangle$ denotes here a scalar product (normalized by $1/|X|$, so that $r(0) = 1$), not an expectation value! Another useful representation (Fontana et al., ����) is

$$r(s) = 1 - \frac{\langle (f_{t+s}-f_t)^2\rangle}{2\sigma^2}. \tag{7}$$

The average squared difference $\langle (f_{t+s}-f_t)^2\rangle$ was used as a correlation measure in Sorkin's pioneering paper (Sorkin, ����).
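The equivalence of the product form, eq. (5), and the difference form, eq. (7), can be checked numerically. The sketch below assumes periodic (circular) time averages, under which the two estimates agree up to rounding; for a finite, non-periodic walk they agree only approximately. The function names and the test signal are illustrative assumptions.

```python
import math

def autocorr_product(f, s):
    """Eq. (5)-style estimate, using circular time averages."""
    n = len(f)
    mean = sum(f) / n
    var = sum((v - mean) ** 2 for v in f) / n
    prod = sum(f[t] * f[(t + s) % n] for t in range(n)) / n
    return (prod - mean ** 2) / var

def autocorr_difference(f, s):
    """Eq. (7)-style estimate from the mean squared difference."""
    n = len(f)
    mean = sum(f) / n
    var = sum((v - mean) ** 2 for v in f) / n
    msd = sum((f[(t + s) % n] - f[t]) ** 2 for t in range(n)) / n
    return 1.0 - msd / (2.0 * var)

# Arbitrary deterministic test signal (an assumption, not data from the paper).
series = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) + 2.0 for t in range(200)]
```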

The autocorrelation function $\rho(d)$ of the landscape itself and the autocorrelation function $r(s)$ of the "time series" sampled on the landscape are related via

$$r(s) = \sum_d \varphi_{sd}\,\rho(d), \tag{8}$$

where $\varphi_{sd}$ is the probability that a simple random walk of length $s$ ends at distance $d(x_0, x_s) = d$ from its starting point. Explicit expressions for $\varphi_{sd}$ can be found in refs. (Fontana et al., ����; Stadler, ����a); we shall not make use of them in this contribution.


2.3. Elementary Landscapes

For a quite large number of model landscapes it has been found that the correlation function $r(s)$ is exactly a decaying exponential (Stadler & Happel, ����; Weinberger & Stadler, ����), numerically indistinguishable from a decaying exponential (Stadler & Schnabl, ����; Stadler, ����), or at least very close to a decaying exponential (Weinberger, ����; Weinberger, ����a). It has been argued that a nearly exponential autocorrelation function $r(s)$ would be generic for landscapes with a Gaussian distribution of fitness values (Weinberger, ����). This argument is wrong, however.

It is not hard to check that $r(s)$ is exponential whenever $f$ is of the form $f(x) = \bar f + \psi(x)$, where $\psi$ is an eigenvector of the adjacency matrix $A$ with eigenvalue $\Lambda$. Indeed, under these conditions one finds $r(s) = (\Lambda/D)^s$. In a more general context it is useful to assume that $\psi$ is an eigenvector of the so-called graph Laplacian (Mohar, ����); for regular graphs we have $\Delta = A - DE$, where $E$ is the identity matrix, i.e., the eigenvectors of $A$ are the same as the eigenvectors of the Laplacian $\Delta$. Landscapes of this type have been termed elementary. Lov Grover (Grover, ����) found that a number of well-known model landscapes are elementary, for instance the landscape of the Traveling Salesman Problem. In (Stadler, ����c) it is also shown that $r(s)$ is exponential if and only if the landscape is elementary.
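The claim $r(s) = (\Lambda/D)^s$ can be verified exactly on a small Boolean hypercube. The sketch below is an illustration under stated assumptions: the landscape is a constant plus a Walsh function of order $p$ (an adjacency eigenvector with eigenvalue $n - 2p$), and $r(s)$ is evaluated through eq. (6) with the scalar product normalized by $1/|X|$. All names are chosen here for illustration.

```python
from itertools import product

n = 5                                   # hypercube dimension; every vertex has D = n neighbors
X = list(product((0, 1), repeat=n))     # all 2^n configurations
index = {x: i for i, x in enumerate(X)}

def neighbors(x):
    return [x[:i] + (1 - x[i],) + x[i + 1:] for i in range(n)]

def apply_T(v):
    """One walk step: (T v)(x) is the average of v over the neighbors of x."""
    return [sum(v[index[y]] for y in neighbors(x)) / n for x in X]

def r(f, s):
    """Eq. (6), with the scalar product normalized by 1/|X|."""
    mean = sum(f) / len(f)
    var = sum((v - mean) ** 2 for v in f) / len(f)
    g = f
    for _ in range(s):
        g = apply_T(g)
    return (sum(a * b for a, b in zip(f, g)) / len(f) - mean ** 2) / var

p = 2                                        # order of the Walsh function
f = [3.0 + (-1) ** sum(x[:p]) for x in X]    # constant plus an eigenvector of A
Lambda = n - 2 * p                           # the corresponding adjacency eigenvalue
```

With $n = 5$ and $p = 2$ the decay parameter is $\Lambda/D = 1/5$, one of only six values available on this graph.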

Note that the possible eigenvalues $\Lambda$ are uniquely determined by the adjacency matrix $A$, i.e., by the geometry of the configuration space. As a consequence there is only a small, finite number of possible values for the parameter $\rho := \Lambda/D$ of the exponential decay, i.e., it is not possible to construct a landscape $f$ with autocorrelation function $r(s) = \rho^s$ with an arbitrarily prescribed parameter $\rho$, in contrast to the case of merely constructing a time series. In other words, only a very special set of time series is generated by random walks on landscapes.


2.4. Power Spectra

Instead of a correlation function one can use the power spectrum

$$S(\omega) := \lim_{N\to\infty}\frac{1}{N}\left[\Bigl(\sum_{t=1}^{N} f_t\cos(\omega t)\Bigr)^2 + \Bigl(\sum_{t=1}^{N} f_t\sin(\omega t)\Bigr)^2\right] \tag{9}$$

of the time series $\{f_t\}$ as a means of characterizing the landscape. Here $N$ is the number of points sampled from the time series $\{f_t\}$. Power spectrum and autocorrelation function of a stationary process are related by the Wiener-Khinchin theorem (see, e.g., (Yaglom, ����)):

$$r(s) = \frac{1}{\pi\sigma^2}\int_0^{\pi} S(\omega)\cos(\omega s)\,d\omega, \qquad S(\omega) = \sigma^2\Bigl[1 + 2\sum_{s=1}^{\infty} r(s)\cos(\omega s)\Bigr]. \tag{10}$$

A negative slope of $S(\omega)$ implies some degree of correlation in $f_t$; a steeper slope implies a higher degree of correlation. A signal $\{f_t\}$ is called $1/f$ noise if a log-log plot of the power spectrum versus frequency can be approximated by a straight line with slope close to $-1$ in the frequency range of interest. More generally, one speaks of $1/f^a$ noise if the slope is $-a$. We shall return to this type of time series in section 3.
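For a finite sample the limit in eq. (9) is replaced by the expression at fixed $N$. The sketch below evaluates this finite-$N$ estimate for a pure cosine at a Fourier frequency $2\pi k/N$, where the result is known in closed form ($N/4$ at the signal frequency, zero at other Fourier frequencies below $\pi$). The function name `periodogram` and the test signal are assumptions made for illustration.

```python
import math

def periodogram(f, omega):
    """Finite-N version of eq. (9): (1/N) [(sum f_t cos wt)^2 + (sum f_t sin wt)^2]."""
    n = len(f)
    c = sum(v * math.cos(omega * t) for t, v in enumerate(f, start=1))
    s = sum(v * math.sin(omega * t) for t, v in enumerate(f, start=1))
    return (c * c + s * s) / n

N, k = 64, 5
omega0 = 2.0 * math.pi * k / N
signal = [math.cos(omega0 * t) for t in range(1, N + 1)]
peak = periodogram(signal, omega0)                      # spectral line of the cosine
off_peak = periodogram(signal, 2.0 * math.pi * 7 / N)   # a different Fourier frequency
```

For a landscape "time series" one would evaluate `periodogram` on a grid of frequencies and inspect the slope in a log-log plot; a single realization is noisy, so averaging over several walks is advisable.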

The most common definition of a correlation length in physics is simply the integral of the autocorrelation function. In the discrete case it is convenient to use

$$\hat\ell := \frac{1}{2} + \sum_{s=1}^{\infty} r(s). \tag{11}$$

Comparing this definition with the Wiener-Khinchin theorem, eq. (10), yields the simple relation

$$S(0) = 2\sigma^2\hat\ell, \tag{12}$$

which can be used as an alternative way of estimating the correlation length of a time series.


2.5. Excursion Sets

The parts of the landscape in which the values are close to the global maximum or minimum are of particular interest. One might ask, for instance, how the "good" solutions are distributed in sequence space. Are they clustered around a globally optimal solution, or are configurations with close-to-optimal values scattered all over the configuration space? A suitable mathematical framework for this type of question is set by the notion of excursion sets (Adler, ����). In this subsection we collect a few definitions and their immediate corollaries which will be useful for the discussion of the RNA free energy landscapes in section 4.

Definition. Let $f: X \to \mathbb{R}$ be an arbitrary landscape.

(i) A configuration $x$ is a local optimum if $f(x) \ge f(y)$ holds for all neighbors $y$ of $x$. Two configurations $x$ and $y$ are called neutral if $f(x) = f(y)$.

(ii) The set $A_E = \{x \in X \mid f(x) \ge E\}$ is called the excursion set of $f$ at level $E$. A connected component of $A_E$ is called a cycle (Freidlin & Wentzell, ����).

(iii) A connected subgraph $B \subseteq A$ is called a neutral network in $X$ if all elements are neutral, and if all neutral neighbors of any $x \in B$ are elements of $B$ as well.

For sufficiently small $E$ we have of course $A_E = X$, the entire configuration space. On the other hand, if $E$ is larger than the global optimum of $f$, then $A_E$ is empty. Clearly, $E < E'$ implies $A_E \supseteq A_{E'}$; hence excursion sets introduce a hierarchical structure on the landscape. In general, $A_E$ will not be connected, i.e., it will decompose into more than one cycle (connected component). Bounds on the number of cycles can be obtained for elementary landscapes and the special value $E = \bar f$; for details see (Stadler, ����c). Cycles play a prominent role in the analysis of simulated annealing techniques on combinatorial landscapes; see (Azencott, ����) for a recent review.
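The decomposition of an excursion set into cycles can be made concrete on a tiny example. The sketch below assumes a hypothetical two-peak landscape on the 4-dimensional Boolean hypercube and finds the connected components of $A_E$ by depth-first search; the landscape and all function names are illustrative assumptions.

```python
from itertools import product

n = 4
X = list(product((0, 1), repeat=n))

def f(x):
    # Hypothetical toy landscape with two separated peaks (all zeros and all ones).
    return abs(sum(x) - n // 2)

def neighbors(x):
    return [x[:i] + (1 - x[i],) + x[i + 1:] for i in range(n)]

def cycles(E):
    """Connected components ("cycles") of the excursion set A_E = {x : f(x) >= E}."""
    A = {x for x in X if f(x) >= E}
    seen, comps = set(), []
    for start in A:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:                      # depth-first search restricted to A_E
            x = stack.pop()
            if x in comp:
                continue
            comp.add(x)
            stack.extend(y for y in neighbors(x) if y in A and y not in comp)
        seen |= comp
        comps.append(comp)
    return comps

sizes = {E: sorted(len(c) for c in cycles(E)) for E in (0, 1, 2, 3)}
```

Raising the level from 0 to 1 splits the single cycle into two, and at level 2 each cycle shrinks to a single local optimum, which illustrates the hierarchical structure described above.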

Excursion sets, local optima, and neutral networks are closely related. We list here only a few simple geometric relationships. (i) Suppose $C_E$ is a cycle and $B$ is a neutral network; then $C_E$ and $B$ are either disjoint or $B$ is a subset of $C_E$. (ii) Each cycle $C_E$ contains at least one local optimum. (iii) A neutral network $B$ is a cycle if and only if it consists entirely of local optima. Each cycle $C_E$ contains a cycle of this type. (iv) A neutral network which is a cycle contains no other cycles except for itself. (v) If a cycle consists of only one configuration then this configuration is a local optimum.

The notion of excursion sets suggests two percolation problems. (i) At which level $E$ does $A_E$ cease to be a single cycle? (ii) At which level $E$ does $A_E$ decompose into many small cycles, as opposed to consisting of a single giant component containing almost all vertices of $A_E$ and a number of very small islands? Neither problem has been treated so far, although both seem to be of utmost importance for the understanding of adaptation on combinatorial landscapes. In this contribution we shall be content with investigating the structure of landscapes at fitness levels for which the cycles are still large in general.

2.6. Random Walks on Excursion Sets

Instead of performing the random walk on the entire configuration space $C$ one may confine it to an excursion set $A_E \subseteq C$. The random walk is then automatically constrained to a connected component of $A_E$, i.e., to a cycle. We used the following procedure to generate a walk within a cycle $C_E$. The process starts in a vertex $a_0$ known to be in the desired excursion set. These initial points are generated by screening a large number of random configurations. (Alternatively one might use configurations obtained from some simple optimization heuristics as starting points for higher excursion levels. This would, however, bias the sampling, since configurations in large "mountains" would be favored.) Then an attempt is made to move to a neighboring vertex. If it is contained in the same cycle, i.e., if its fitness is above the threshold level $E$, then it is accepted; otherwise the attempt is rejected. The "time series" is formed by the accepted moves only. This procedure generates a time series provided $C_E$ contains more than one configuration. In fact, we are only interested in large cycles.
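The procedure just described can be sketched as follows. This is an illustration under stated assumptions, not the authors' program: `toy_fitness` is a hypothetical stand-in for a folding energy, membership in the excursion set is tested with $f \ge E$, and the names `find_start` and `excursion_walk` are introduced here.

```python
import random

ALPHABET = "ACGU"

def toy_fitness(seq):
    # Hypothetical stand-in for an RNA free energy evaluation: fraction of G and C.
    return sum(c in "GC" for c in seq) / len(seq)

def find_start(n, E, fitness, rng, max_tries=100000):
    """Screen random configurations until one lies in the excursion set A_E."""
    for _ in range(max_tries):
        x = [rng.choice(ALPHABET) for _ in range(n)]
        if fitness(x) >= E:
            return x
    raise RuntimeError("no configuration found at this level")

def excursion_walk(n, E, accepted_steps, fitness, rng):
    """Random walk confined to A_E; only accepted moves enter the time series."""
    x = find_start(n, E, fitness, rng)
    xs, fs = ["".join(x)], [fitness(x)]
    while len(fs) <= accepted_steps:
        pos = rng.randrange(n)
        new = rng.choice([c for c in ALPHABET if c != x[pos]])
        y = x[:pos] + [new] + x[pos + 1:]
        if fitness(y) >= E:              # accept: the neighbor is still in A_E
            x = y
            xs.append("".join(x))
            fs.append(fitness(x))
    return xs, fs

rng = random.Random(1)
xs, fs = excursion_walk(n=30, E=0.6, accepted_steps=300, fitness=toy_fitness, rng=rng)
```

Because a rejected move leaves the walker in place and is not recorded, every recorded step connects two neighboring configurations inside the same cycle.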


It is clear that confining random walks to cycles means that they sample predominantly in the vicinity of local optima. One can hope, therefore, that the resulting time series provide information about the most interesting regions of the fitness landscape: the regions of high fitness. The major drawback is that, by eq. (8), the time series contains a superposition of two effects, namely the correlation of fitness values on the landscape and the geometrical relaxation of the walk in $C_E$. The correlation of a walk in $C_E$ is

$$r_E(s) = \sum_d \varphi^E_{sd}\,\rho_E(d) \approx \rho_E\bigl(\langle d(s)\rangle_E\bigr) \quad\text{with}\quad \langle d(s)\rangle_E = \sum_d \varphi^E_{sd}\,d. \tag{13}$$

Here $\rho_E(d)$ is the correlation of the restriction of the landscape to the excursion set $A_E$, and $\langle d(s)\rangle_E$ describes the geometric relaxation of a random walk in a cycle $C_E$. Since the topology of $C_E$ is not known, it is very difficult to retrieve more than qualitative information on the structure of the mountainous parts of the landscape.


3. Self-Affine Time Series and Fractal Landscapes

3.1. Self-Affine Time Series

Definition. A time series $\{F_t\}$ is self-affine (or fractal) if

$$s^{-H}\,(F_{t+s} - F_t) \stackrel{d}{=} (F_{t+1} - F_t), \tag{14}$$

where $s$ is the number of steps between the two measurements. The notation $\stackrel{d}{=}$ indicates equality in the sense of distributions. The parameter $H$ fulfills $0 < H < 1$. An example is fractional Brownian motion; see, e.g., (Mandelbrot, ����). The power spectrum of a time series with a distribution fulfilling (14) follows a power law (Mandelbrot & van Ness, ����) of the form

$$S(\omega) = \omega^{-a} \quad\text{with}\quad a = 1 + 2H. \tag{15}$$

In the case of fractional Brownian motion in continuous time, the parameter $H$ and the Hausdorff dimension $D_H$ of the resulting curve are related by $D_H = 2 - H$. We remark that a time series fulfilling (14) strictly for all $s$ cannot be stationary.

Instead of using the power spectrum one can use more direct methods for characterizing a self-affine time series. Probably the most immediate approach is to consider the jump size

$$J(s) = \langle\,|F_{t+s} - F_t|\,\rangle. \tag{16}$$

As an immediate consequence of (14) we have $J(s) \sim s^H$. $H$ can be obtained by means of a least-squares fit from a log-log plot; see, e.g., (Osborne & Provenzale, ����). A closely related technique has been proposed by Sorkin (Sorkin, ����). Multiplying (14) by itself and taking the expectation yields

$$s^{-2H}\,\langle (F_{t+s} - F_t)^2\rangle = \langle (F_{t+1} - F_t)^2\rangle, \tag{17}$$

and one obtains the slope $2H$ from a log-log plot of the mean square differences versus the lag $s$.
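The jump-size estimate of $H$ can be tried on a series whose exponent is known. The sketch below assumes ordinary Brownian motion with Gaussian increments ($H = 1/2$) as test data and a small set of lags; both choices, like the function names, are assumptions for illustration.

```python
import math
import random

def jump_size(F, s):
    """Sample version of eq. (16): mean absolute increment at lag s."""
    return sum(abs(F[t + s] - F[t]) for t in range(len(F) - s)) / (len(F) - s)

def fit_slope(xs, ys):
    """Least-squares slope of ys versus xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)

rng = random.Random(42)
F = [0.0]
for _ in range(20000):
    F.append(F[-1] + rng.gauss(0.0, 1.0))   # Brownian motion: expected H = 1/2

lags = [1, 2, 4, 8, 16, 32]
H = fit_slope([math.log(s) for s in lags], [math.log(jump_size(F, s)) for s in lags])
```

The estimate fluctuates from sample to sample, so only approximate agreement with 1/2 should be expected.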

Another approach to self-similarity focuses on the curve length as a function of the yardstick length used for the measurement. The method outlined below was proposed in ref. (Higuchi, ����) as an improvement of the procedure given by Burlaga and Klein (Burlaga & Klein, ����). It provides numerically stable scaling exponents even for a small number of data points. We divide the time series $\{F_t\}$ into $s$ partial series

$$F_m(s) = \bigl\{F_m, F_{m+s}, F_{m+2s}, \ldots, F_{m+\lfloor (N-m)/s\rfloor s}\bigr\}, \qquad m = 1, \ldots, s,$$

and define the length of $F_m(s)$ as

$$L_m(s) = \frac{N-1}{s\,\lfloor (N-m)/s\rfloor}\sum_{i=1}^{\lfloor (N-m)/s\rfloor}\bigl|F_{m+is} - F_{m+(i-1)s}\bigr|.$$

The curve length $L(s)$ measured with step size $s$ is then the average value taken over all the partial series:

$$L(s) = \frac{1}{s}\sum_{m=1}^{s} L_m(s). \tag{18}$$

If the time series is self-affine, then the curve length follows a power law of the form $L(s) \sim s^{-D}$. The correction factor in the definition of $L_m(s)$ approaches 1 for large data sets, and hence we find as an immediate consequence of eq. (14) that

$$L(s) \approx \frac{N}{s}\,s^H\,\langle\,|F_{t+1} - F_t|\,\rangle \sim s^{H-1}, \tag{19}$$

and therefore $D = 1 - H$. The parameters $a$, $H$, and $D$ of a self-affine time series are related by means of the equations

$$a = 1 + 2H = 3 - 2D. \tag{20}$$

Independent estimates of $a$, $H$, and $D$ can thus be used to determine to what extent a given time series is consistent with the assumption of self-affinity.
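The curve-length construction can be sketched directly from the definitions of $L_m(s)$ and $L(s)$ with $L_m(s)$ normalized by the factor $(N-1)/(s\lfloor (N-m)/s\rfloor)$. The example assumes a straight line as test series, for which the increments scale linearly with the lag ($H = 1$) and the curve length is independent of $s$, so the fitted exponent should be $D = 1 - H = 0$. Function names and the test series are illustrative assumptions.

```python
import math

def curve_length(F, s):
    """Curve length L(s), eq. (18); F[0] plays the role of F_1."""
    N = len(F)
    total = 0.0
    for m in range(1, s + 1):
        k = (N - m) // s                       # number of increments in F_m(s)
        incr = sum(abs(F[m + i * s - 1] - F[m + (i - 1) * s - 1]) for i in range(1, k + 1))
        total += (N - 1) / (s * k) * incr      # L_m(s) including its correction factor
    return total / s

def fit_slope(xs, ys):
    """Least-squares slope of ys versus xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)

F = [0.5 * t for t in range(1000)]             # straight line: increments grow linearly in s
lags = [1, 2, 4, 8, 16, 32]
lengths = [curve_length(F, s) for s in lags]
D = -fit_slope([math.log(s) for s in lags], [math.log(L) for L in lengths])
```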


3.2. Fractal Landscapes

It is obvious that a time series obtained from a random walk on a landscape cannot be strictly self-affine, since it must be (at least approximately) stationary. Hence (14) is an approximation that holds only for $s \ll n$, where $n$ is the maximal distance in configuration space.

Dividing Sorkin's eq. (17) by twice the (finite) variance $\sigma^2$ of the landscape and substituting eq. (7) we find

$$s^{-2H}\bigl(1 - r(s)\bigr) = \bigl(1 - r(1)\bigr). \tag{21}$$

Solving for the autocorrelation function yields $r(s) = 1 - c\,s^{2H}$. The parameter $c$ can be obtained as follows. Since a single step along the random walk always leads to distance 1, we have $r(1) = \rho(1) := \rho$, the nearest neighbor correlation of the landscape. Thus $\rho = 1 - c$, and we finally obtain an autocorrelation function of the form

$$r(s) = 1 - (1 - \rho)\,s^{2H}. \tag{22}$$

Eq. (22) holds of course only for $s$ small compared to the maximum distance in the landscape. It has been used for a classification of rugged landscapes (Weinberger & Stadler, ����; Stadler, ����b) in terms of the parameter

$$H = \frac{1}{2}\,\frac{\ln\bigl(1 - r(s)\bigr)}{\ln s} \tag{23}$$

for small $s$.

3.3. "AR(1) Landscapes" are Locally Fractal

An AR(1) (or Ornstein-Uhlenbeck) process is defined by the following recurrence relation, see, e.g., (Papoulis, ����; Feller, ����):

$$F_t = \rho F_{t-1} + \epsilon_t, \qquad -1 < \rho < 1, \tag{24}$$

where $\epsilon_t$ denotes Gaussian white noise with variance $\sigma_\epsilon^2$. The resulting time series is stationary and has the Markov property. Its autocorrelation function is

$$r(s) = \rho^s = \exp(-s/\ell), \qquad \ell := -\frac{1}{\ln\rho}, \tag{25}$$

where $\ell$ is the correlation length as defined in (Weinberger, ����; Fontana et al., ����). Conversely, any Gaussian stationary Markov process has an autocorrelation function of the form (25). The parameter $\rho$ measures the correlation of the time series. If $\rho \approx 0$ then the time series is almost uncorrelated, i.e., $\{F_t\}$ is almost white noise. On the other hand, for $\rho \approx 1$ the time series approximates Brownian motion.
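The recurrence (24) and the exponential autocorrelation (25) are easy to check by simulation. The sketch below assumes $\rho = 0.7$, unit noise variance, and a start drawn from the stationary distribution; these values and the function names are assumptions chosen for illustration.

```python
import math
import random

def ar1(rho, n, rng, sigma_eps=1.0):
    """Generate n points of F_t = rho * F_{t-1} + eps_t, started in the stationary state."""
    F = [rng.gauss(0.0, sigma_eps / math.sqrt(1.0 - rho * rho))]
    for _ in range(n - 1):
        F.append(rho * F[-1] + rng.gauss(0.0, sigma_eps))
    return F

def sample_autocorr(F, s):
    """Sample autocorrelation at lag s."""
    n = len(F)
    mean = sum(F) / n
    var = sum((v - mean) ** 2 for v in F) / n
    cov = sum((F[t] - mean) * (F[t + s] - mean) for t in range(n - s)) / (n - s)
    return cov / var

rho = 0.7
F = ar1(rho, 20000, random.Random(7))
ell = -1.0 / math.log(rho)                       # correlation length of eq. (25)
r_hat = [sample_autocorr(F, s) for s in (1, 2, 3)]
```

The sample values in `r_hat` should lie close to $\rho$, $\rho^2$, and $\rho^3$, up to statistical fluctuations.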

Before we proceed let us briefly discuss the relation between $\ell$ and $\hat\ell$ defined in section 2. The RNA free energy landscapes and almost all of the model landscapes that have been investigated so far have correlation lengths that scale linearly with $n$; see, e.g., (Schuster & Stadler, ����) for a recent overview. In other words, we have $\rho := 1 - x$, where $x$ scales as $1/n$ for large systems. If $r(s)$ is of the form (25), then we find

$$\ell = \frac{1}{x} - \frac{1}{2} - \frac{x}{12} + O(x^2), \qquad \hat\ell = \frac{1}{x} - \frac{1}{2}. \tag{26}$$

Thus $\ell$ and $\hat\ell$ differ only by a contribution of order $O(x) \sim 1/n$ for the landscapes of interest.
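The two correlation lengths can be compared numerically for an exponential autocorrelation function. The sketch assumes $\rho = 0.99$ and truncates the infinite sum of eq. (11) at a lag where $\rho^s$ is negligible; the function names are introduced here for illustration.

```python
import math

def ell(rho):
    """Correlation length of eq. (25)."""
    return -1.0 / math.log(rho)

def ell_hat(rho, smax=5000):
    """Correlation length of eq. (11) for r(s) = rho**s, with the sum truncated at smax."""
    return 0.5 + sum(rho ** s for s in range(1, smax + 1))

rho = 0.99
x = 1.0 - rho
difference = ell_hat(rho) - ell(rho)   # expected to be close to x / 12
```

For this value of $\rho$ both lengths are close to 99.5 and differ by less than one part in $10^5$, in line with the $O(x)$ estimate.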

The power spectrum of an AR(1) time series is

$$S(\omega) = \frac{\sigma^2(1-\rho^2)}{1 + \rho^2 - 2\rho\cos\omega}, \tag{27}$$

see, e.g., (Yaglom, ����). In fact, the Wiener-Khinchin theorem, eq. (10), shows that eq. (27) holds for all elementary landscapes, irrespective of the distribution function of the fitness values.
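That the closed form (27) follows from the Wiener-Khinchin sum for $r(s) = \rho^s$ can be confirmed numerically. The sketch assumes $\rho = 0.8$ and $\sigma^2 = 1$, and truncates the sum at a lag where $\rho^s$ is negligible; the function names are assumptions.

```python
import math

def spectrum_closed(omega, rho, sigma2=1.0):
    """Closed-form AR(1) spectrum, eq. (27)."""
    return sigma2 * (1.0 - rho * rho) / (1.0 + rho * rho - 2.0 * rho * math.cos(omega))

def spectrum_sum(omega, rho, sigma2=1.0, smax=2000):
    """Wiener-Khinchin sum, eq. (10), for r(s) = rho**s, truncated at smax."""
    return sigma2 * (1.0 + 2.0 * sum(rho ** s * math.cos(omega * s) for s in range(1, smax + 1)))

rho = 0.8
omegas = [0.0, 0.3, 1.0, math.pi]
pairs = [(spectrum_closed(w, rho), spectrum_sum(w, rho)) for w in omegas]
ell_hat = 0.5 + rho / (1.0 - rho)      # eq. (11) summed in closed form
```

At $\omega = 0$ both expressions reduce to $\sigma^2(1+\rho)/(1-\rho) = 2\sigma^2\hat\ell$, which is relation (12).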

Weinberger (Weinberger, ����) called a landscape $f: C \to \mathbb{R}$ an AR(1) landscape if the time series obtained by a random walk on the landscape is Gaussian and has an autocorrelation function of the form (25). The parameter $\rho$ describes the ruggedness of the landscape. The landscapes with exponential autocorrelation functions are exactly the elementary landscapes discussed above. An AR(1) landscape in the sense of Weinberger is thus an elementary landscape with a Gaussian fitness distribution. A number of model landscapes have been shown to be elementary (Grover, ����; Weinberger & Stadler, ����; Stadler, ����c). Most of them have in fact a Gaussian distribution of fitness values, at least asymptotically, as a consequence of the central limit theorem. The best-known examples are the p-spin models, the graph bipartitioning problem, graph matching, graph coloring, and symmetric traveling salesman problems. Kauffman's Nk models are approximately AR(1); their decomposition into elementary components is discussed in detail in (Stadler & Happel, ����). The class of landscapes that are approximately AR(1) includes a variety of landscapes based on RNA secondary structures (Fontana et al., ����; Fontana et al., ����; Bonhoeffer et al., ����; Tacker et al., ����).

[Plot: H and 1-D (vertical axis, 0.0 to 0.5) versus rho (horizontal axis, 0.0 to 1.0).]

Figure �: Approximated scaling exponents H (solid line) and 1 − D (dotted line) as a function of the correlation ρ of an AR(1) process. The deviations are of the order of ��.

The following considerations, like equ.(��), depend only on the form of the
correlation function r(s), not on the distribution function of the fitness values. They are
therefore valid for any elementary landscape. The linear approximation

\[ r(s) \approx 1 - \frac{s}{\ell}, \qquad s \ll n \]

of equ.(��) is a good approximation for highly correlated landscapes, i.e., for
landscapes with correlation lengths ℓ = O(n). By comparing equ.(��) with
equ.(��) we observe that elementary landscapes with large correlation length are
locally self-affine, with scaling parameter H = 1/2, i.e., time series obtained from
such landscapes behave locally like ordinary Brownian motion.

Surprisingly, however, we find that even AR(1) time series with small correlation
length show approximate power laws for J(s), equ.(��), and for the curve lengths
L(s). Numerical simulations show that we have H → 0 for ρ → 0, while ρ → 1
yields H → 1/2. Data obtained from direct measurement of H, according to
equ.(��), and estimates of the scaling exponent D of the curve length obtained
from (��) are consistent with each other. Best fits of the characteristic exponents
H and 1 − D as functions of ρ are shown in Figure �. It is interesting to note in
this context that certain log-normal distributions can also mimic 1/f spectra in a
limited frequency domain (Montroll & Shlesinger, ����).
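The two limits quoted above can be illustrated without any simulation. The following is a minimal sketch, assuming that J(s) denotes the mean squared increment of the series and that it scales as s^(2H); for a stationary unit-variance series with autocorrelation r(s) = ρ^s this gives J(s) = 2(1 − ρ^s). The function names and the fitting window s = 1..8 are illustrative choices, not taken from the source.

```python
import math


def jump_function(rho, s):
    """Mean squared increment J(s) = 2 (1 - rho**s) of a unit-variance AR(1) series."""
    return 2.0 * (1.0 - rho ** s)


def local_hurst(rho, s_max=8):
    """Estimate H as half the least-squares slope of log J(s) versus log s.

    The fit uses lags s = 1..s_max, so it measures the local scaling only.
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must satisfy 0 <= rho < 1")
    xs = [math.log(s) for s in range(1, s_max + 1)]
    ys = [math.log(jump_function(rho, s)) for s in range(1, s_max + 1)]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return 0.5 * slope


if __name__ == "__main__":
    for rho in (0.0, 0.2, 0.5, 0.9, 0.999):
        print(f"rho = {rho:5.3f}   local H = {local_hurst(rho):.3f}")
```

The estimate grows monotonically with ρ in this sketch; intermediate values of ρ give only an approximate power law, so the fitted H depends somewhat on the chosen window.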


��RNA Free Energy Landscapes

Folding biopolymer sequences into structures is a central problem in molecular
biology research. Both robustness and accessibility of structures, as functions of
mutational change in the underlying sequence, are crucial to natural evolution as well as
to molecular evolution applied to biotechnology. RNA molecules are an excellent
model system. In fact, they are the only class of biopolymers for which the folding
problem has been solved, at least at the level of secondary structures.

An RNA sequence is a string of length n composed of an alphabet of size κ. In
nature the alphabet consists of the κ = 4 bases Guanine, Cytosine, Adenine,
and Uracil. In this paper we shall also consider the restricted alphabet {G,C}
with κ = 2. A natural distance between sequences is the Hamming distance,
measuring the number of positions in which two sequences differ (Hamming, ����).
The configuration space is hence a generalization of the Boolean hypercube known
as the sequence space.
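As a small illustration of the sequence space geometry, the sketch below computes the Hamming distance and enumerates the one-error mutants of a sequence. The function names are hypothetical; the only facts used are the definition of the Hamming distance and that each of the n positions can be changed to κ − 1 other letters.

```python
def hamming_distance(a, b):
    """Number of positions in which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def one_error_mutants(seq, alphabet="GCAU"):
    """All sequences at Hamming distance one from seq over the given alphabet."""
    mutants = []
    for position, letter in enumerate(seq):
        for replacement in alphabet:
            if replacement != letter:
                mutants.append(seq[:position] + replacement + seq[position + 1:])
    return mutants


if __name__ == "__main__":
    print(hamming_distance("GGCC", "GCCG"))
    print(len(one_error_mutants("GGCC", "GCAU")))  # n * (kappa - 1) neighbors
    print(len(one_error_mutants("GGCC", "GC")))
```

Each sequence has n(κ − 1) neighbors, and the largest possible distance between two sequences (the diameter of the sequence space) is n.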

A secondary structure is tantamount to a list of Watson-Crick type and GU base
pairs. Such a structure can be uniquely decomposed into structural elements:
(i) base pair stacks; (ii) loops differing in size (number of unpaired bases)
and branching degree, namely hairpin loops (degree one) and internal loops (degree two or
more); and (iii) bases which are not part of a stack or a loop and are termed external
(freely rotating joints and unpaired ends). Each stack or loop element contributes
additively to the overall free energy of the structure. These energy terms are
empirically determined parameters that depend on the nucleotide sequence (Freier
et al., ����). The folding process considered here maps an RNA sequence into a
secondary structure minimizing free energy. This structure can be computed using
a dynamic programming algorithm (Zuker & Stiegler, ����; Zuker & Sankoff,
����). The implementation used in this contribution is described in detail in
(Hofacker et al., ����); it is available as a public domain package (Hofacker et al.,
����).
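To convey the flavor of dynamic programming on secondary structures, the sketch below maximizes the number of base pairs (a Nussinov-type recursion). This is a deliberately simplified stand-in: it is not the free energy minimization of the cited works and not the implementation in the Vienna RNA Package, and it ignores the stack and loop energy parameters entirely. The function name and the minimum hairpin size of three unpaired bases are assumptions made for illustration.

```python
# Allowed pairs: Watson-Crick (GC, AU) and GU.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs, with at least min_loop unpaired
    bases enclosed by every hairpin-closing pair."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with some k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(max_base_pairs("GGGAAACCC"))
```

The energy-based algorithms have the same overall structure (a table over subsequences filled from short to long spans), but each table entry is a minimum over loop decompositions weighted by the empirical energy parameters.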

In this contribution we focus not on the secondary structures themselves but rather
on the free energies, ΔG, of structure formation. The bulk properties of these
minimum free energy landscapes have been studied extensively in the past
(Bonhoeffer et al., ����; Fontana & Schuster, ����; Fontana et al., ����; Fontana et al.,
����; Fontana et al., ����; Fontana et al., ����; Schuster et al., ����; Schuster &
Stadler, ����). They are typical representatives of rugged landscapes.

[Plot: log(S(omega)) (vertical axis, 0.0 to 5.0) versus log(omega) (horizontal axis, -1.5 to 0.5).]

Figure �: Raw data of the power spectrum obtained for a GC landscape with chain length n = �� at excursion level ΔG � ��. Walk length is N = ����. The solid line is the best fit to S(ω) ∼ ω^(−a), with a = ����. The dotted line is 1/f noise.

The preceding figure shows a sample power spectrum obtained along a random walk as
described above. The data are rather noisy. In order to smooth them we break
the walk into pieces of �� steps, calculate the power spectrum for each of them,
and then average the power spectra. Note that �� steps is about twice the diameter of
the sequence space in this case; significantly longer walks are not meaningful
because the range of local self-affinity is necessarily restricted to a small multiple
of the geometrical relaxation time τ of the random walk {x_t}:

\[ \langle d(s) \rangle = \langle d(\infty) \rangle \left( 1 - e^{-s/\tau} \right). \]

For a free random walk on a sequence space we find

\[ \tau = \frac{\kappa-1}{\kappa}\, n + O(1). \]
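The smoothing procedure described above (cut the walk into pieces, compute a spectrum per piece, average) can be sketched as follows. This is a minimal illustration using a plain discrete Fourier transform; the function names, the normalization |X_k|^2 / N, and the subtraction of each segment's mean are assumptions, and the segment length must be supplied by the caller since the value used in the source is not recoverable here.

```python
import cmath
import math


def periodogram(segment):
    """Power |X_k|^2 / N at frequencies k = 1 .. N//2 of a mean-centered segment."""
    n = len(segment)
    mean = sum(segment) / n
    centered = [value - mean for value in segment]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coefficient = sum(
            value * cmath.exp(-2j * math.pi * k * t / n)
            for t, value in enumerate(centered)
        )
        spectrum.append(abs(coefficient) ** 2 / n)
    return spectrum


def averaged_spectrum(series, segment_length):
    """Average the periodograms of consecutive non-overlapping segments."""
    starts = range(0, len(series) - segment_length + 1, segment_length)
    spectra = [periodogram(series[i:i + segment_length]) for i in starts]
    if not spectra:
        raise ValueError("series is shorter than one segment")
    return [sum(column) / len(spectra) for column in zip(*spectra)]


if __name__ == "__main__":
    # Four segments of 32 points, each containing exactly 4 periods of a cosine.
    series = [math.cos(2 * math.pi * 4 * t / 32) for t in range(128)]
    spectrum = averaged_spectrum(series, 32)
    peak = max(range(len(spectrum)), key=lambda k: spectrum[k])
    print("peak at frequency index", peak + 1)
```

Averaging reduces the scatter of the individual periodograms at the cost of frequency resolution: the lowest resolved frequency is set by the segment length, not by the length of the whole walk.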


Table �: Power spectrum index a for time series obtained from RNA landscapes.
The values of a as obtained directly from the power spectrum are compared to the
values calculated from the jump exponent H and the scaling exponent D:
a_H = 1 + 2H and a_D = 3 − 2D.

n �� �� ��

&G� a aH aD a aH aD a aH aDAUGC

� ��� ���� ��� ���� ���� ���� ���� ��� ����� ���� ��� ���� ���� ���� ��� ���� ��� ������ ���� ���� ��� ��� ��� �� ���� ���� ����� ���� ���� ���� ���� ���� ���� ��� ��� ����� � ���� ���� ���� ���� ���� ���� ���� ����� � ���� ���� � ���� ���� ���� ���� ����

GC

� ���� ���� ��� ��� ��� ���� ��� ��� ���� ���� ���� ��� ��� ��� ���� ��� ��� ����� ���� ���� ��� ��� ��� ���� ��� ��� ����� ���� ���� ��� ��� ��� ���� ��� ��� ���� ���� ���� ��� ��� ��� ���� ��� ��� ���� ���� ���� ���� ��� ��� ���� ��� ��� ���

� ΔG in kcal/mol; � indicates insufficient data. Systematic errors are estimated to be of the order of ��; compare Figure �.

The expression for the relaxation time will be derived in the Appendix.

The data are consistent with a ω^(−a) spectrum with a not much larger than �.
Numerical values are shown in Table �. It turns out, however, that the data are
also consistent with the power spectrum of an AR(1) process, see Figure �.
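Why an AR(1) spectrum can pass for a power law over a limited band can be seen from a short numerical experiment. The sketch below assumes the spectrum in the form (1 − ρ²)/(1 + ρ² − 2ρ cos ω), i.e. normalized without a constant prefactor, and fits a straight line to log S versus log ω on a chosen frequency window. The function names and the window are illustrative, not values from the source.

```python
import math


def ar1_spectrum(omega, rho):
    """AR(1) power spectrum, normalized so that the autocorrelation r(0) = 1."""
    return (1.0 - rho ** 2) / (1.0 + rho ** 2 - 2.0 * rho * math.cos(omega))


def apparent_exponent(rho, omega_min, omega_max, points=50):
    """Least-squares estimate of a in S ~ omega**(-a) on a log-spaced grid."""
    log_w, log_s = [], []
    for i in range(points):
        omega = omega_min * (omega_max / omega_min) ** (i / (points - 1))
        log_w.append(math.log(omega))
        log_s.append(math.log(ar1_spectrum(omega, rho)))
    mean_w = sum(log_w) / points
    mean_s = sum(log_s) / points
    slope = sum((w - mean_w) * (s - mean_s) for w, s in zip(log_w, log_s)) / sum(
        (w - mean_w) ** 2 for w in log_w
    )
    return -slope


if __name__ == "__main__":
    for rho in (0.5, 0.8, 0.9, 0.95):
        print(f"rho = {rho:4.2f}   apparent a = {apparent_exponent(rho, 0.3, 3.0):.2f}")
```

In this sketch the fitted exponent always lies between 0 (white noise) and 2 (Brownian motion), and its value depends on both ρ and the window, which is consistent with the remark that a power-law fit and an AR(1) fit cannot be distinguished on a limited frequency range.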

The additivity of the energy contributions implies a certain degree of neutrality in
the landscape; for details see (Fontana et al., ����). Several structures which
consist of identical sets of substructures map onto the same selective values, although
their phenotypic appearances are different. In fact, there are very large neutral
networks on the level of secondary structures themselves (Schuster et al., ����;
Reidys et al., ����; Grüner, ����). This implies that even at fairly high excursion
levels (a couple of standard deviations above the mean) the excursion sets are still
large.

[Plot: log(S(omega)) (vertical axis, 2.0 to 5.0) versus log(omega) (horizontal axis, -1.4 to 0.4).]

Figure �: The dots are spectral data for walks on a GC landscape with n = ��, excursion level ��, averaged over � walks. The solid curve is the best fit to the AR(1) power spectrum, equ.(��), with an estimate for the parameter ρ = ����, corresponding to a correlation length ℓ = ����. The correlation length estimated directly from the autocorrelation function is about ��, as shown in ref. (Fontana et al., ����). The corresponding power spectrum is shown as dotted line. The solid straight line is a least squares fit with a power law ω^(−a); we find a = ����.

We find that the scaling properties do not depend strongly on the excursion level.
There is, however, a systematic trend towards smaller values of � for higher
excursion levels. Our data indicate that mountainous regions of the landscape are not
drastically different from the average. Our data are biased by the geometry of
the cycles CE, however, and hence a detailed quantitative analysis is not possible
at present.


��Conclusions

The structure of the mountainous parts of RNA free energy landscapes was studied by
random walks confined to excursion sets at given energy levels.

Spectral data and local scaling analysis of the series generated by simple random
walks show self-affinity consistent with the low-frequency behavior of an AR(1)
time series. We find that in general an AR(1) process appears to be
approximately self-affine on length scales smaller than a few correlation lengths. The
data obtained from RNA free energy landscapes indicate that a fractal-like
structure is present at length scales up to the diameter of the sequence space. This
is a consequence of the fact that the correlation length of the RNA free energy
landscapes is comparable to the sequence length.

Our computer experiments exhibit no significant dependence of the statistical
properties of the walk confined to an excursion set on the energy level. Hence, at least
qualitatively, the statistical properties of the mountains do not differ from those of the
lowlands. This is true at least as long as the excursion set does not break up into
very small cycles.

The present study suggests that a detailed investigation of the percolation of
excursion sets, of the geometry of excursion sets, and of the geometrical relaxation of
random walks confined to cycles will be necessary before a complete understanding
of the structure of the mountain ranges of fitness landscapes is possible.

Acknowledgments

The work was funded by ÖAD proj. no. Z� ����EH��� EH�Project ������. SB
gratefully thanks Prof. Murali Sheshadri and Prof. A. Nadarajan for their interest
and support. PFS thanks the Inst. Ciencias Nucleares and the Inst. de Fisica of
the Universidad Nacional Autonoma de Mexico for their hospitality in September
1995, when this paper was finished.


References

Adler� D ������ The Geometry of Random Fields New York John Wiley �

Sons

Attkinson� W� Fey� L� � Newman� J ������ Spectrum analysis of extremely

low frequency variation of quartz ocillators Proc�IEEE ��� ���

Azencott� R ����� Simulated Annealing parallelization techniques New

York John Wiley � Sons

Bonhoe�er� S� McCaskill� J� Stadler� P� � Schuster� P ������ Temperature

dependent RNA landscapes� a study based on partition functions European

Biophysics Journal ��

Bonhoe�er� S � Stadler� P F ������ Errortreshold on complex �tness land�

scapes J�Theor�Biol� ��� ������

Brouwer� A� Cohen� A� � Neumaier� A ������ Distance�regular Graphs Berlin�

New York Springer Verlag

Buldyrev� S� Goldberger� A� � Stanley� H ������ Long�range correlation prop�

erties of coding and noncoding dna sequences Genbank analysis Phys�Rev�E

��� ���������

Burlaga� L � Klein� L ������ Fractal structure of the interplanetary magnetic

�eld J�Geophys�Res� ��� ����)))

Eigen� M� McCaskill� J� � Schuster� P ������ The molecular Quasispecies

Adv� Chem� Phys� ��� ��� � ��

Feller� W ����� An Introduction to Probability Theory and its Applications

New York Wiley

Fontana� W� Griesmacher� T� Schnabl� W� Stadler� P� � Schuster� P ������

Statistics of landscapes based on free energies� replication and degredation rate

constants of RNA secondary structures Monatshefte der Chemie ���� �������


Fontana� W� Konings� D A M� Stadler� P F� � Schuster� P ������ Statistics

of rna secondary structures Biochemistry ��� ���������

Fontana�W� Schnabl�W� � Schuster� P ������ Physical aspects of evolutionary

optimization and adaption Physical Review A �� ���� ��������

Fontana� W � Schuster� P ������ A computer model of evolutionary optimiza�

tion Biophysical Chemistry �� ������

Fontana� W� Stadler� P F� Bornberg�Bauer� E G� Griesmacher� T� Hofacker�

I L� Tacker� M� Tarazona� P� Weinberger� E D� � Schuster� P ������ RNA

folding and combinatory landscapes Phys� Rev� E �� ���� ��� � ���

Freidlin� M �Wentzell� A ������ Random Perturbations of Dynamical Systems

New York Springer�Verlag

Freier� S M� Kierzek� R� Jaeger� J A� Sugimoto� N� Caruthers� M H� Neilson�

T� � Turner� D H ������ Improved free�energy parameters for predictions of

RNA duplex stability Proc� Natl� Acad� Sci� USA ��� ���������

Grover� L ����� Local search and the local structure of NP�complete problems

Oper�Res�Lett� ��� �����

Gr�uner� W ������ Evolutionary Optimization on RNA Folding Landscapes PhD

thesis Inst of Theoretical Chemistry� Uni Vienna� Austria

Hamming� R W ������ Error detecting and error correcting codes Bell

Syst�Tech�J� ��� �������

Happel� R � Stadler� P F ������ Canonical approximation of �tness landscapes

Santa Fe Institute Preprint ���������

Higuchi� T ������ Approach to an irregular time series on the basis of fractal

theory Physica D ��� ����

Hofacker� I L� Fontana� W� Stadler� P F� Bonhoe�er� L S� Tacker� M�

� Schuster� P ������ Vienna RNA Package pub�RNA�ViennaRNA����� �

ftp�itc�univie�ac�at �Public Domain Software�


Hofacker� I L� Fontana� W� Stadler� P F� Bonhoe�er� S� Tacker� M� �

Schuster� P ������ Fast folding and comparison of RNA secondary structures

Monatsh� Chemie ��� ��� �������

Hordijk� W ������ A measure of landscapes Santa Fe Institute Preprint ������

���

Huynen� M A� Stadler� P F� � Fontana� W ������ Evolution of RNA and

the Neutral Theory Proc�Natl�Acad�Sci� in press� Santa Fe Institute Preprint

���������

Kau�man� S ������ The Origin of Order New York� Oxford Oxford University

Press

Li� W � Kaneko� K ����� Long�range correlation and partial ��f� spectrum

in a noncoding dna sequence Europhys�Lett� ��� �������

Mandelbrot� B B ����� The Fractal Geometry of Nature New York Freeman

Mandelbrot� B B � vanNess� J W ������ Fractional brownian motion� frac�

tional noise� and applications SIAM Rev� ��� �����

Mohar� B ������ The laplacian spectrum of graphs In Graph Theory Combi�

natorics and Applications� �Alavi� Y� Chartrand� G� Ollermann� O� � Schwenk�

A� eds� pp �������� New York John Wiley � Sons

Montroll� E � Shlesinger� M ������ Maximum entropy formalism� fractals�

scaling phenomena� and ��f noise A tale of tails J�Stat�Phys� ��� �����

Musha� T � Higuchi� H ������ The ��f uctuation of a tra�c current on an

expressway Jap�J�Appl�Phys� ��� �������

Nolan� P L� Gruber� D E� Matteson� J L� Peterson� L E� Rothschild� R E�

Doty� J P� Levine� A M� Lewin� W H G� � Primini� F A ������ Rapid

variability of ������ kev X�rays from Cygnus X�� Astrophys�J� ��� �������

Osborne� A � Provenzale� A ������ Finite correlation dimension for stochastic

systems with power law spectra Physica D ��� �������


Palmer� R ������ Optimization on rugged landscapes In Molecular Evolution

on Rugged Landscapes� Proteins RNA and the Immune System� �Perelson� A S

� Kau�man� S A� eds� pp ��� Addison Wesley Redwood City� CA

Papoulis� A ������ Probability Random Variables and Stochastic Processes

New York McGraw Hill

Reidys� C� Schuster� P� � Stadler� P F ������ Generic properties of combi�

natory maps Neutral networks of RNA secondary structures Santa Fe Institute

Preprint ���������

Schuster� P� Fontana� W� Stadler� P F� � Hofacker� I L ������ From

sequences to shapes and back A case study in RNA secondary structures

Proc�Roy�Soc�Lond�B ���� �����

Schuster� P � Stadler� P F ������ Landscapes Complex optimization problems

and biopolymer structures Computers Chem� ��� ������

Sorkin� G B ������ Combinatorial optimization� simulated annealing� and frac�

tals Technical Report RC����� �No����� IBM Research Report

Spitzer� F ������ Markov random �elds and gibbs ensembles Amer� Math�

Monthly ��� ������

Stadler� P F ����� Correlation in landscapes of combinatorial optimization

problems Europhys� Lett� ��� ������

Stadler� P F �����a� Random walks and orthogonal functions associated with

highly symmetric graphs Disc� Math� in press� Santa Fe Institute Preprint

��������

Stadler� P F �����b� Towards a theory of landscapes In Complex Systems

and Binary Networks� �L*opez Pe+na� R� ed� Springer�Verlag New York in press�

Santa Fe Institute Preprint ��������

Stadler� P F �����c� Landscapes and their correlation functions Santa Fe

Institute Preprint ���������


Stadler� P F � Happel� R ����� Correlation structure of the landscape of the

graph�bipartitioning�problem J� Phys� A�� Math� Gen� ��� ���������

Stadler� P F � Happel� R ������ Random �eld models for �tness landscapes

Santa Fe Institute Preprint ���������

Stadler� P F � Schnabl� W ����� The landscape of the traveling salesman

problem Phys� Letters A ��� �������

Tacker� M� Fontana� W� Stadler� P� � Schuster� P ������ Statistics of RNA

melting kinetics Eur� J� Biophys� ��� ����

Taft� B� Hickey� B� Wunsch� C� � Baker� D ������ Equatorial undercurrent

and deeper ows in the central paci�c Deep Sea Res� ��� �������

Verveen� A A � Derkson� H E ������ Fluctuation phenomena in nerve mem�

branes Proc�IEEE �� �������

Voss� R F ����� Evolution of long�range fractal correlations and ��f noise in

DNA base sequences Phys�Rev�Lett� �� ���������

Voss� R F � Clarke� J ������ ��f noise in music and speech Nature ����

�������

Weinberger� E D ������ Correlated and uncorrelated �tness landscapes and

how to tell the di�erence Biol�Cybern� �� ������

Weinberger� E D �����a� Local properties of Kau�man�s N�k model A tunably

rugged energy landscape Phys� Rev� A �� ���� ���������

Weinberger� E D �����b� Fourier and Taylor series on �tness landscapes Bio�

logical Cybernetics �� ������

Weinberger� E D � Stadler� P F ������ Why some �tness landscapes are

fractal J� Theor� Biol� ��� �����

Weissman� M ������ ��f noise and other slow� non�exponential kinetics in con�

densed matter Rev�Mod�Phys� �� �������

Yaglom� A ������ Correlation Theory of Stationary and Related Random Func�

tions� volume �� New York Springer�Verlag


Zuker� M � Sanko�� D ������ RNA secondary structures and their prediction

Bull�Math�Biol� � ���� ������

Zuker� M � Stiegler� P ������ Optimal computer folding of large RNA sequences

using thermodynamic and auxilliary information Nucl�Acid Res� �� �������


Appendix Relaxation of Random Walks in Sequence Spaces

For a sequence space a detailed analysis of the geometric relaxation of a simple
random walk is possible. The probabilities φ_{sd} as defined in the main text can be obtained
recursively from

\[
\begin{aligned}
\varphi_{00} &= 1, \\
\varphi_{sd} &= 0 \quad \text{for } s < d, \\
\varphi_{sd} &= w_{+}(d-1)\,\varphi_{s-1,d-1} + w_{0}(d)\,\varphi_{s-1,d} + w_{-}(d+1)\,\varphi_{s-1,d+1},
\end{aligned}
\tag{A.1}
\]

where the coefficients w_+(d), w_−(d), and w_0(d) are the probabilities for making a step
forwards, backwards, or sidewards, given that one is at distance d from the origin of the
walk. For sequence spaces we have (Fontana et al., ����)

\[
w_{+}(d) = \frac{n-d}{n}, \qquad
w_{0}(d) = \frac{d}{n}\,\frac{\kappa-2}{\kappa-1}, \qquad
w_{-}(d) = \frac{d}{n}\,\frac{1}{\kappa-1}.
\tag{A.2}
\]

Define the moments of the distribution φ_{sd} by

\[ \Delta_m(s) = \langle d(s)^m \rangle = \sum_d \varphi_{sd}\, d^m. \tag{A.3} \]

Δ_1(s) is then the average distance after s steps. Inserting the recursion (A.1) into
the definition of Δ_m(s) yields, after considerable algebra, the following recursion
for the m-th moment:

\[
\Delta_m(s) = 1 + \sum_{k=1}^{m} \left[ \binom{m}{k}
  - \frac{1}{n}\binom{m}{k-1}\left( 1 + \frac{(-1)^{m-k}}{\kappa-1} \right) \right]
  \Delta_k(s-1).
\tag{A.4}
\]

This recursion is of the form \vec{Δ}(s) = \vec{1} + A · \vec{Δ}(s − 1), where A is lower triangular.
Hence the eigenvalues λ_m of A are given by the diagonal elements of A:

\[ \lambda_m = 1 - \frac{m}{n}\,\frac{\kappa}{\kappa-1}. \tag{A.5} \]

The m-th moment is therefore of the form

\[ \Delta_m(s) = \Delta_m(\infty) \Big[ 1 - \sum_{k=1}^{m} a_k \lambda_k^{s} \Big]. \tag{A.6} \]

The slowest mode corresponds to the eigenvalue λ_1. The corresponding relaxation
time is

\[ \tau_1 = -\frac{1}{\ln \lambda_1} = \frac{\kappa-1}{\kappa}\, n + O(1) \tag{A.7} \]

for large n. Explicit expressions for the long time limits Δ_m(∞) of the moments
are obtained as non-zero fixed points of the recursions (A.4).
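The relaxation of the mean distance can be checked numerically by iterating the distance distribution of the walk. The sketch below assumes step probabilities (n − d)/n forwards, (d/n)(κ − 2)/(κ − 1) sidewards, and (d/n)/(κ − 1) backwards for a simple random walk on sequences of length n over κ letters; the function names are hypothetical. Under these assumptions the first moment obeys Δ_1(s) = 1 + λ_1 Δ_1(s − 1) with λ_1 = 1 − κ/((κ − 1)n), so that Δ_1(s) = Δ_1(∞)(1 − λ_1^s) and Δ_1(∞) = n(κ − 1)/κ.

```python
def step_probabilities(d, n, kappa):
    """Probabilities of moving to distance d+1, staying at d, or moving to d-1."""
    forward = (n - d) / n
    sideways = (d / n) * (kappa - 2) / (kappa - 1)
    backward = (d / n) / (kappa - 1)
    return forward, sideways, backward


def distance_distribution(steps, n, kappa):
    """Distribution of the Hamming distance from the start after `steps` steps."""
    phi = [0.0] * (n + 1)
    phi[0] = 1.0  # the walk starts at distance zero
    for _ in range(steps):
        updated = [0.0] * (n + 1)
        for d, probability in enumerate(phi):
            if probability == 0.0:
                continue
            forward, sideways, backward = step_probabilities(d, n, kappa)
            if d < n:
                updated[d + 1] += forward * probability
            updated[d] += sideways * probability
            if d > 0:
                updated[d - 1] += backward * probability
        phi = updated
    return phi


def mean_distance(steps, n, kappa):
    """First moment of the distance distribution after `steps` steps."""
    return sum(d * p for d, p in enumerate(distance_distribution(steps, n, kappa)))


if __name__ == "__main__":
    n, kappa = 20, 4
    lam = 1.0 - kappa / ((kappa - 1) * n)
    limit = n * (kappa - 1) / kappa
    for s in (1, 5, 20, 80):
        print(s, round(mean_distance(s, n, kappa), 6), round(limit * (1 - lam ** s), 6))
```

The two printed columns agree in this sketch, which illustrates that the mean distance relaxes geometrically with rate λ_1, i.e. on a time scale of about (κ − 1)n/κ steps.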


Table of Contents

Introduction
Landscapes
Rugged Landscapes
Correlation Functions
Elementary Landscapes
Power Spectra
Excursion Sets
Random Walks on Excursion Sets
Self-Affine Time Series and Fractal Landscapes
Self-Affine Time Series
Fractal Landscapes
"AR(1)-Landscapes" are Locally Fractal
RNA Free Energy Landscapes
Conclusions
Acknowledgments
References
Appendix: Relaxation of Random Walks in Sequence Spaces