Computational illustration of the bootstrap effect

BioSystems, 13 (1980) 57-64
© Elsevier/North-Holland Scientific Publishers Ltd.

COMPUTATIONAL ILLUSTRATION OF THE BOOTSTRAP EFFECT*

MICHAEL CONRAD a** and MATEEN M. RIZKI b

a Department of Computer Science and Department of Biological Sciences, Wayne State University, Detroit, Michigan 48202, and b University of Michigan, Ann Arbor, Michigan 48109, U.S.A.

(Received January 2nd, 1980)

*This research was supported by Contract N00014-79-C-075 from the Division of Information Systems, Office of Naval Research.
**To whom correspondence should be addressed.

A constructive computational model is used to show that even under poor case assumptions genetic systems accumulate features which increase their amenability to evolution. These features decrease the fitness of the individual organism, but hitchhike along with the fitness-increasing traits whose probability of appearance they increase.

According to the bootstrap principle of genetic adaptability, the amenability of genes and genetic systems to evolution is a property which is itself subject to natural selection (Conrad, 1977, 1978a, 1979a). Single gene amenability is possible if the redundancy of weak interactions in a protein reduces the effect of typical mutations on features of its shape critical for function. Such redundancy can be increased by increasing the number of amino acids or by utilizing amino acids with a greater number of close structural analogs. At the multigene level amenability is increased by greater reliance on polygenic or quantitative inheritance and on regulatory genes. Such mechanisms (and other mechanisms which have been proposed, cf. Kirkpatrick, 1979) are always costly in terms of energy, and are therefore not an advantage to the individual organism. Nevertheless, amenability can hitchhike along with the advantageous traits whose appearance it makes possible. If this hitchhiking leads to increases in amenability more rapidly than individual selection leads to its decrease, further rounds of increase are possible. In effect, the amenability pulls itself up by its own bootstraps.

The basic argument for bootstrapping is that if $n$ simultaneous genetic events are required in order to move from one fitness value to some higher fitness value, the rate of evolution goes as $p^n$, where $p$ is the probability of each event. If the higher value is connected to the lower value by a sequence of intermediate values, each of which can be reached through one genetic event, the rate of evolution depends on the sum of the probabilities and is therefore very much faster (Conrad, 1972). The presence of buffering mechanisms does not guarantee that the gene or the genetic system has any future evolutionary potential. However, their existence is a precondition for any evolution that does take place. (For a more complete description of the bootstrapping process see Conrad, 1979b and accompanying discussion by Bargiello and Grossfield, 1979. The term hitchhiking was originally introduced in the context of selection for high mutation and recombination rates (Cox and Gibson, 1974; Strobeck et al., 1976).)

The purpose of this note is to show that bootstrapping would occur even under conditions unrealistically unfavorable to it. The strategy will be to specify general features which should be incorporated into any bootstrapping model and then to compute a specific "poor case" model. The model is based on the conceptual division of fitness into a specific and an amenability component. The fitness of organism $i$ at generation $t$, denoted by $f_i(t)$, is defined as the number of offspring (or genes) which it is likely to contribute to generation $t + 1$. This is the definition commonly used in population genetics. Here we also interpret it as the ratio between the efficiency of organism $i$, denoted by $\tilde{f}_i(t)$, and the average efficiency of the population, denoted by $\langle \tilde{f}(t) \rangle$. Fitness must on the average equal unity for a stable population; but efficiency (taken relative to an arbitrary initial base value) might increase to values very much greater than unity. Alternatively, the efficiency of organism $i$ can be interpreted as its relative growth rate in comparison to a base line organism.

The specific component of fitness (denoted by $s_i$) is the fitness of a hypothetical organism equivalent in all functional respects to organism $i$, but stripped of any amenability-conferring redundancy. Similarly, the specific component of efficiency (denoted by $\tilde{s}_i$) is the efficiency of a hypothetical organism equivalent in all functional respects to organism $i$, but stripped of amenability. The stripped-down organisms will be able to perform the same tasks as the amenable organisms, but more efficiently in terms of energy expenditure. The cost of amenability in terms of fitness is denoted by $a_i(t)$ and in terms of efficiency by $\tilde{a}_i(t)$.
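As a rough numerical aside on the rate argument given in the introduction, the sketch below compares a jump that needs $n$ simultaneous events with a path of $n$ single-event steps. The function names are illustrative only, and treating the stepwise case as a chain of independent geometric waiting times (mean $1/p$ each) is an assumption made for this example rather than a formula taken from the cited papers.

```python
def joint_event_probability(p: float, n: int) -> float:
    """Per-trial probability that n independent events of probability p coincide."""
    return p ** n


def expected_trials_simultaneous(p: float, n: int) -> float:
    """Mean number of trials until all n events occur in the same trial."""
    return 1.0 / joint_event_probability(p, n)


def expected_trials_stepwise(p: float, n: int) -> float:
    """Mean number of trials to complete n successive single-event steps.

    Assumption: each step is an independent geometric wait with mean 1/p.
    """
    return n / p


if __name__ == "__main__":
    p, n = 1e-3, 3
    print("simultaneous:", expected_trials_simultaneous(p, n))
    print("stepwise:    ", expected_trials_stepwise(p, n))
```

With $p = 0.001$ and $n = 3$, the joint event needs on the order of a billion trials on average, whereas the stepwise path needs about three thousand under these assumptions.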

A poor case model of bootstrapping should incorporate the following features:

(i) If amenability is zero, genetic change can never lead to offspring with a greater specific component of fitness than the parent. As the amenability of the parent increases, the chance of its producing an offspring with a greater specific component increases, but never beyond 0.5. The poor case assumption is that any such increase becomes smaller as amenability increases.

(ii) The worst case assumption is that the environment is constant, that the specific component of fitness has a saturation value, and that increases in the specific component tend to become smaller as it approaches this saturation value. In a constant environment saturation would only occur if the system reached the top of a peak which remained isolated despite the presence of amenability in the system. In a more favorable case the discovery of new, higher peaks would always be possible, thus favoring the continuation of the bootstrap process.

(iii) Since amenability is based on redundancies, it should be possible to increment or decrement it through single genetic changes (therefore in a gradual way). The worst case assumption is that the increment or decrement in amenability resulting from such single genetic changes becomes smaller as amenability increases.

The following "proof-in-principle" model shows that even under the poor case assumptions bootstrapping is possible.

$$\tilde{f}_i(t) = \tilde{s}_i(t) + \tilde{a}_i(t), \tag{1}$$

$$\tilde{s}_i(t+1) = \tilde{s}_i(t) + \alpha_i R_1 |\tilde{a}_i(t)| \left(\frac{\hat{s} - \tilde{s}_i(t)}{\hat{s}}\right) \left(\frac{\hat{a} - \tilde{a}_i(t)}{\hat{a}}\right), \tag{2}$$

$$\tilde{a}_i(t+1) = \tilde{a}_i(t) + \beta_i R_2 \left(\frac{\hat{a} - \tilde{a}_i(t)}{\hat{a}}\right), \tag{3}$$

where $\tilde{f}_i(t) \ge 0$, $\tilde{s}_i(t) \ge 0$, and $\tilde{a}_i(t) \le 0$. Each $R_j$ is a random number, newly generated for each use and for simplicity taken as uniformly distributed in the interval [0,1]; the $\alpha_i$ and $\beta_i$ are assigned values of 1 or -1 according to a rule to be specified; $\hat{s}$ is the saturation value for the specific component; and $\hat{a}$ is the saturation value for the amenability component. The index $i$ runs from 1 to $N$, where $N$ is the number of organisms in the population. For simplicity we assume $N$ fixed. In fact this is a poor case assumption, since increases in population size resulting from new adaptations would enhance amenability. The mean efficiency of the population is given by

$$\langle \tilde{f}(t) \rangle = \sum_{i=1}^{N} \frac{\tilde{f}_i(t)}{N}. \tag{4}$$

The fitness (or relative efficiency) of organism $i$ is given by

$$f_i(t) = \frac{\tilde{f}_i(t)}{\langle \tilde{f}(t) \rangle}. \tag{5}$$

Thus

$$N = \sum_{i=1}^{N} f_i(t), \tag{6}$$

which means that the fitness of organism $i$ can be interpreted in terms of its proportional contribution to the next generation (in accordance with the original definition). Since the number of offspring must be an integer, we use the following function to determine the actual contribution:

$$n_i(t+1) = T[f_i(t)] + \begin{cases} 1 & \text{if } f_i(t) - T[f_i(t)] > R_3 \\ 0 & \text{otherwise,} \end{cases} \tag{7}$$

where $n_i(t+1)$ is the number of organisms at $t + 1$ with efficiency $\tilde{f}_i(t)$ and amenability $\tilde{a}_i(t)$, $T[f_i(t)]$ is the truncation of $f_i(t)$, and $R_3$ is again a uniformly distributed random number in the interval [0,1]. The poor case constraint

$$\sum_{i=1}^{N} n_i(t+1) = N \tag{8}$$

is imposed by the next generation algorithm:

A1. If $\sum_{i=1}^{N} n_i(t+1) < N$, the number of offspring is increased by selecting organisms from the population at time $t$ until Eqn. (8) is satisfied. If $\sum_{i=1}^{N} n_i(t+1) > N$, then organisms are removed at random from the population at time $t + 1$ until Eqn. (8) is satisfied.
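The reproduction step described by Eqns. (4)-(8) and algorithm A1 can be sketched as follows. This is a minimal illustration, not the authors' program: the function names are hypothetical, the mean efficiency is assumed to be positive, and the extra organisms added when there are too few offspring are assumed to be drawn uniformly at random from generation $t$.

```python
import random


def offspring_counts(efficiencies, rng):
    """Integer offspring numbers from efficiencies, following Eqns. (4), (5), (7)."""
    n = len(efficiencies)
    mean_eff = sum(efficiencies) / n              # Eqn. (4); assumed > 0
    counts = []
    for eff in efficiencies:
        fitness = eff / mean_eff                  # Eqn. (5)
        whole = int(fitness)                      # T[f_i(t)], truncation
        extra = 1 if fitness - whole > rng.random() else 0
        counts.append(whole + extra)              # Eqn. (7)
    return counts


def next_generation(population, efficiencies, rng):
    """Algorithm A1: return exactly N parents for generation t + 1."""
    n = len(population)
    counts = offspring_counts(efficiencies, rng)
    offspring = [org for org, c in zip(population, counts) for _ in range(c)]
    while len(offspring) < n:                     # too few: add from generation t
        offspring.append(rng.choice(population))
    while len(offspring) > n:                     # too many: delete at random
        offspring.pop(rng.randrange(len(offspring)))
    return offspring


if __name__ == "__main__":
    rng = random.Random(0)
    print(next_generation(["a", "b", "c", "d"], [4.0, 0.0, 0.0, 0.0], rng))
```

In the usage line, one organism carries all of the efficiency, so its fitness is 4 and it supplies every member of the next generation.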

In the actual program the values of $\tilde{s}_i$ and $\tilde{a}_i$ are represented in an $N$-dimensional array. The random addition or deletion used to maintain constant array size is a worst case assumption.

A subsidiary algorithm is also used to prevent $\hat{s}$ and $\hat{a}$ from being absorbing values:

A2. If $\tilde{s}_i(t) \ge 0.95\hat{s}$, then the saturation term $(\hat{s} - \tilde{s}_i(t))/\hat{s}$ is replaced by 0.05. If $\tilde{a}_i(t) < 0.95\hat{a}$, then the saturation terms $(\hat{a} - \tilde{a}_i(t))/\hat{a}$ are replaced by 0.05.
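Rule A2 amounts to putting a floor of 0.05 under each saturation term. A minimal sketch, assuming that a value and its saturation value always share the same sign (non-negative for the specific component, non-positive for amenability); the function name is hypothetical:

```python
def saturation_term(value: float, saturation: float) -> float:
    """Return (saturation - value) / saturation, replaced by 0.05 as in A2.

    Comparing magnitudes covers both cases of A2: a specific component at or
    above 0.95 of its saturation value, and an amenability at or beyond 0.95
    of its (negative) saturation value.
    """
    if abs(value) >= 0.95 * abs(saturation):
        return 0.05
    return (saturation - value) / saturation


if __name__ == "__main__":
    print(saturation_term(100.0, 200.0))   # specific component halfway to saturation
    print(saturation_term(-59.0, -10.0))   # amenability beyond its saturation value
```

Because the unmodified term equals 0.05 at the threshold itself, it makes no difference to this sketch whether the comparison is strict.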

A2 could be avoided by adding small constants to $\hat{s}$ and $\hat{a}$ in Eqns. (2) and (3). However, this would introduce an unwarranted advantage for small $\tilde{s}_i$ and $\tilde{a}_i$, which would not be a worst case assumption. The model is completed by specifying $\alpha_i$ and $\beta_i$. According to assumption (iii), amenability-increasing and amenability-decreasing genetic changes are equally likely (each has probability 0.5). Thus

$$\beta_i = \begin{cases} 1 & \text{if } R_4 \ge 0.5 \\ -1 & \text{otherwise.} \end{cases} \tag{9}$$

According to assumption (i) the fraction of genetic changes that increase $\tilde{s}_i$ in general increases as amenability increases, reaching a maximum of 0.5. Thus

$$\alpha_i = \begin{cases} -1 & \text{if } 0.5 \ge R_5 \\ -1 & \text{if } g(\tilde{a}_i, \hat{a}) \ge R_5 \\ 1 & \text{otherwise.} \end{cases} \tag{10}$$

As $g(\tilde{a}_i, \hat{a})$ decreases, the rate of evolution increases. Since the rate of evolution goes as $p^n$ and $n \propto 1/\tilde{a}_i$, any choice of $g(\tilde{a}_i, \hat{a})$ must satisfy $g'(\tilde{a}_i, \hat{a}) < 0$ and $g''(\tilde{a}_i, \hat{a}) > 0$. A simple choice is

$$g(\tilde{a}_i, \hat{a}) = 1 - (\tilde{a}_i/\hat{a})^u. \tag{11}$$

Smaller values of $u$ are more favorable to bootstrapping since $\alpha_i = 1$ becomes frequent more rapidly. Bootstrapping could work for any allowable choice of exponent if it can work for $u = 1$ (since $g''(\tilde{a}_i, \hat{a}) = 0$ is worse than the worst allowable case).
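The sign rules can be made concrete with a short sketch. Read literally, Eqn. (10) gives $\alpha_i = 1$ only when $R_5$ exceeds both 0.5 and $g$, so the chance of a favorable change is $1 - \max(0.5, g) = \min(0.5, (\tilde{a}_i/\hat{a})^u)$. The function names below are hypothetical, and the use of magnitudes in the ratio is an assumption that relies on $\tilde{a}_i$ and $\hat{a}$ both being non-positive.

```python
import random


def g(a: float, a_hat: float, u: float) -> float:
    """Eqn. (11): g = 1 - (a / a_hat)**u, computed from magnitudes."""
    return 1.0 - (abs(a) / abs(a_hat)) ** u


def favorable_probability(a: float, a_hat: float, u: float) -> float:
    """Probability that alpha_i = +1 under Eqn. (10)."""
    return 1.0 - max(0.5, g(a, a_hat, u))


def sample_alpha(a: float, a_hat: float, u: float, rng) -> int:
    """Draw alpha_i as in Eqn. (10) using a single random number R_5."""
    r5 = rng.random()
    if 0.5 >= r5 or g(a, a_hat, u) >= r5:
        return -1
    return 1


def sample_beta(rng) -> int:
    """Draw beta_i as in Eqn. (9)."""
    return 1 if rng.random() >= 0.5 else -1


if __name__ == "__main__":
    for u in (1.0, 0.5):
        print(u, favorable_probability(-12.5, -50.0, u))
```

For an amenability at one quarter of its saturation value, the favorable fraction is 0.25 with $u = 1$ but already 0.5 with $u = 1/2$, which illustrates why smaller exponents favor bootstrapping.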

The model has been studied numerically for representative parameter values. A number of solutions are illustrated in Figs. 1-5. Points to note are:

(1) Bootstrapping does occur (Figs. 1-5). In each case the population is assigned low initial values for the $\tilde{s}_i$ and the $\tilde{a}_i$ ($\tilde{s}_i(0) = 1$ and $\tilde{a}_i(0) = 0$ for all $i$). Even the worst case linear function ($u = 1$) allows bootstrapping (Figs. 1 and 3). Initially almost all of the genetic changes are unfavorable. Figure 2 (with $u = 1/2$) illustrates clearly how the increase in the amenability component (negative on the efficiency scale) is correlated with the increase in efficiency. Bootstrapping starts at the beginning, but becomes marked when the amenability increases enough to allow more genetic changes to be favorable. As $|\hat{a}|$ is made smaller, the take-off time becomes smaller. Lower values of $N$ and higher values of $u$ increase the magnitude of $\tilde{a}$ required for bootstrapping, but are much less critical than $\hat{a}$. For $N = 100$ and $u = 1$, borderline bootstrapping occurs for $\hat{a} = -50$. (In comparing the figures note that the scales on the efficiency axes differ.)

[Figure 1: plot against GENERATION (0 to 1000); labelled curve: SPECIFIC COMPONENT OF EFFICIENCY.]

Fig. 1. Bootstrapping process for unfavorable choice of $u$ (rate of evolution advantage). The curves represent population averages, with zero as the initial value of the amenability and one as the initial value of the efficiency. The amenability is equal to the separation between the efficiency and the specific component of the efficiency. Its magnitude reaches a maximum value (-7.07) at generation 864, but the decrease is most marked early in the run (before generation 100). This period of decrease is accompanied by the most marked increase in fitness.

[Figure 2: plot against GENERATION; labelled curves: SPECIFIC COMPONENT OF EFFICIENCY, EFFICIENCY; legend: $\hat{a} = -50$, $N = 100$, $u = 1/2$.]

Fig. 2. Bootstrapping process for unfavorable choice of $\hat{a}$. Note the correlation between the increase in the magnitude of the amenability and the increase in the magnitude of the efficiency (near generation 300). The magnitude of the amenability reaches its maximum value (-23) between generation 825 and 970.

(2) Figure 3 (with $u = 1$ and $\hat{a} = -10$) illustrates bootstrapping in the absence of the flattening terms in Eqns. (2) and (3). Amenability is bootstrapped most in this case (to $\tilde{a}_{\max} = -59$), though this is obscured by the scale required for the very large efficiency values which the population reaches. Note that $\tilde{a}_{\max} < \hat{a}$ and that beyond $\hat{a}/2$ amenability is accumulating because it increases the amount by which the $\tilde{s}_i$ can increase in a generation, not because it increases the ratio of favorable to unfavorable genetic changes.

(3) Figure 4 (with $u = 1/2$) illustrates the saturation behavior. As before, after a period of slow growth the accumulation of both efficiency and amenability accelerates. When the specific component of efficiency approaches a saturation value, the population sheds some of its amenability. Organisms which shed amenability have an efficiency advantage. If too much amenability is shed, more of the genetic variations will be harmful. This is the reason for the dip in efficiency around generation 600. The subsequent jump in efficiency is correlated with another period of amenability accumulation (generation 620-670). It may seem odd that the population can shed amenability and then decline in efficiency. It would appear that the decline should be prevented by a re-accumulation of amenability. The explanation is that the curves describe average values for the whole population. The possibility of decline results from the fact that the distribution of individual organism amenabilities and efficiencies changes. Types which have achieved a high efficiency by shedding their amenability are more successful at displacing more amenable types near saturation, but are incapable of holding on to this efficiency. In nature accumulated amenability may often become linked to the advantageous traits whose occurrence it favored, thus making it more difficult to shed this amenability.

[Figure 3: plot against GENERATION; labelled curves: SPECIFIC COMPONENT OF EFFICIENCY, AMENABILITY.]

Fig. 3. Bootstrapping process in absence of saturation terms. The term $(\hat{s} - \tilde{s}_i(t))/\hat{s}$ is absent from Eqn. (2) and the term $(\hat{a} - \tilde{a}_i(t))/\hat{a}$ is absent from Eqn. (3). The latter term is left in Eqn. (2) since it is unfavorable to bootstrapping. The magnitude of the amenability reaches its maximum value (-59) between generation 890 and 930. The separation between the efficiency curves is obscured due to the large values involved.

(4) Figure 5 illustrates the effect of small population size ($N = 10$ as compared to $N = 100$ for Figs. 1-4). It shows the same qualitative features as other bootstrapping curves, but the fluctuations become obvious and the bootstrapping is slower than it would be for larger $N$. Bootstrapping fails for $N = 10$ when $u = 1$, $\hat{s} = 1000$, and $\hat{a} \approx -10$. In this case (not illustrated) $N = 10$ is not sufficient to ensure that the higher amenability fraction of the population will be large enough to produce a more efficient individual. Such failure of bootstrapping is accompanied by a decrease in efficiency and would certainly lead to extinction in the absence of the assumption (usually a poor case assumption) that population size is fixed. This shows that amenability plays an important role in conferring reliability as well as in facilitating evolution.
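For readers who want to experiment with runs of the kind discussed in points (1)-(4), the sketch below strings Eqns. (1)-(11) and algorithms A1 and A2 together. It is an illustration under stated assumptions, not the authors' program. In particular, the following choices are assumptions of this sketch: the step in Eqn. (2) uses the magnitude of the amenability; values are clipped so that $\tilde{s}_i \ge 0$, $\tilde{a}_i \le 0$ and $\tilde{f}_i \ge 0$; parents are copied first and the changes of Eqns. (2) and (3) are then applied to each copy; and extra parents in A1 are drawn uniformly at random. All names and default parameter values are hypothetical.

```python
import random


def sat(value, saturation):
    """Saturation term with the A2 floor of 0.05."""
    if abs(value) >= 0.95 * abs(saturation):
        return 0.05
    return (saturation - value) / saturation


def mutate(s, a, s_hat, a_hat, u, rng):
    """Apply Eqns. (2), (3), (9), (10), (11) once to one offspring."""
    g = 1.0 - (abs(a) / abs(a_hat)) ** u                      # Eqn. (11)
    r5 = rng.random()
    alpha = -1 if (0.5 >= r5 or g >= r5) else 1               # Eqn. (10)
    beta = 1 if rng.random() >= 0.5 else -1                   # Eqn. (9)
    sat_a = sat(a, a_hat)
    new_s = s + alpha * rng.random() * abs(a) * sat(s, s_hat) * sat_a  # Eqn. (2)
    new_a = a + beta * rng.random() * sat_a                   # Eqn. (3)
    return max(new_s, 0.0), min(new_a, 0.0)                   # clipping: assumption


def step(pop, s_hat, a_hat, u, rng):
    """One generation: selection (Eqns. 4-8, A1), then genetic change."""
    n = len(pop)
    eff = [max(s + a, 0.0) for s, a in pop]                   # Eqn. (1), clipped
    mean_eff = sum(eff) / n                                   # Eqn. (4)
    parents = []
    if mean_eff > 0.0:
        for org, e in zip(pop, eff):
            f = e / mean_eff                                  # Eqn. (5)
            k = int(f)
            if f - k > rng.random():                          # Eqn. (7)
                k += 1
            parents.extend([org] * k)
    while len(parents) < n:                                   # A1: too few
        parents.append(rng.choice(pop))
    while len(parents) > n:                                   # A1: too many
        parents.pop(rng.randrange(len(parents)))
    return [mutate(s, a, s_hat, a_hat, u, rng) for s, a in parents]


def run(n=100, s_hat=1000.0, a_hat=-50.0, u=0.5, generations=200, seed=0):
    """Return the final population and per-generation (mean s, mean a)."""
    rng = random.Random(seed)
    pop = [(1.0, 0.0)] * n                                    # s_i(0) = 1, a_i(0) = 0
    history = []
    for _ in range(generations):
        pop = step(pop, s_hat, a_hat, u, rng)
        history.append((sum(s for s, _ in pop) / n, sum(a for _, a in pop) / n))
    return pop, history


if __name__ == "__main__":
    _, history = run()
    print(history[-1])
```

Whether and when a given run takes off depends on the seed and on the assumptions listed above, so the output should be read as a qualitative exploration rather than a reproduction of Figs. 1-5.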

[Figure 4: plot against GENERATION (0 to 1000); labelled curve: AMENABILITY; legend: $\hat{s} = 200$, $\hat{a} = -50$, $N = 100$, $u = 1/2$.]

Fig. 4. Bootstrapping process illustrating saturation behavior and shedding of amenability. Note the two maximum magnitudes of the amenability (-11 between generations 154 and 207 and -8 between generations 662 and 668).

[Figure 5: plot against GENERATION (0 to 1000); labelled curves: SPECIFIC COMPONENT OF EFFICIENCY, EFFICIENCY, AMENABILITY; legend: $\hat{s} = 1000$, $\hat{a} = -50$, $N = 10$, $u = 1/2$.]

Fig. 5. Effect of population size. The amenability reaches its maximum magnitude (-6) between generation 500 and 600. The curves are qualitatively similar to those in Fig. 2, but with marked fluctuation and less marked increase in amenability and efficiency. If $N = 1000$ the values reached by the bootstrapping curve are nearly the same as in Fig. 2, but the takeoff point occurs earlier.

The bootstrapping model presented here contains no real genetics and is clearly artificial in a number of respects. As a poor case model it does demonstrate that inefficiently organized genes and genetic systems will develop in the course of evolution if such inefficiency sufficiently increases the rate of evolution. Amenability-conferring inefficiency is advantageous for the population, not for the individual. Nevertheless its evolution can be mediated by individual selection, not by group selection. This is what the model establishes.

References

Bargiello, T. and J. Grossfield, 1979, Commentary. BioSystems 11, 180.

Conrad, M., 1972, Information processing in molecular systems. Curr. Mod. Biol. (now BioSystems) 5, 1.

Conrad, M., 1977, Evolutionary adaptability of biological macromolecules. J. Mol. Evol. 10, 87.

Conrad, M., 1978, Evolution of the adaptive landscape, in: Theoretical Approaches to Complex Systems, R. Heim and G. Palm (eds.), Springer Lecture Notes on Biomathematics No. 21 (Springer, Heidelberg).

Conrad, M., 1979a, Mutation-absorption model of the enzyme. Bull. Math. Biol. 41, 387.

Conrad, M., 1979b, Bootstrapping on the adaptive landscape. BioSystems 11, 167.

Cox, E.C. and T.C. Gibson, 1974, Selection for high mutation rates in chemostats. Genetics 77, 169.

Kirkpatrick, F.H., 1979, Commentary. BioSystems 11, 181.

Strobeck, C., J. Maynard Smith and B. Charlesworth, 1976, The effects of hitchhiking on a gene for recombination. Genetics 82, 547.