Weitzman 1987

The Evolution of Manuscript Traditions Author(s): Michael P. Weitzman Source: Journal of the Royal Statistical Society. Series A (General), Vol. 150, No. 4 (1987), pp. 287-308 Published by: Wiley for the Royal Statistical Society Stable URL: http://www.jstor.org/stable/2982040 . Accessed: 09/01/2014 04:55 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp . JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected]. . Wiley and Royal Statistical Society are collaborating with JSTOR to digitize, preserve and extend access to Journal of the Royal Statistical Society. Series A (General). http://www.jstor.org This content downloaded from 134.58.253.30 on Thu, 9 Jan 2014 04:55:17 AM All use subject to JSTOR Terms and Conditions


J. R. Statist. Soc. A (1987) 150, Part 4, pp. 287-308

The Evolution of Manuscript Traditions

By MICHAEL P. WEITZMAN†

University College London

[Read before the Royal Statistical Society on Wednesday May 13th, 1987, the President Professor J. Durbin in the Chair]

SUMMARY The originals of ancient literary works gave rise to copies. These manuscripts were often copied in turn; often, too, manuscripts eventually perished. The editor of an ancient work must consider the relations among the manuscripts, extant and lost, which transmitted the text, and the structure of such manuscript populations in general. For example, most family-trees reconstructed by editors for the manuscripts of ancient works show exactly two main branches; is this to be expected, or due to some flaw in methods of reconstruction? Such questions are approached by modelling the evolving manuscript population through a birth-and-death process, with illustrative data for Greek and Latin literature.

Keywords: BIRTH-AND-DEATH PROCESSES; GREEK LITERATURE; HISTORY OF CLASSICAL SCHOLARSHIP; LATIN LITERATURE; TEXTUAL CRITICISM; TEXTUAL TRANSMISSION

1. INTRODUCTION

Ancient writings have not reached us as they left their authors' hands. We have only copies, each at an unknown number of removes. The manuscripts that preserve a given work are collectively called its tradition. As every act of copying introduces fresh errors, the extant manuscripts differ among themselves, and all differ from the lost original. Modern editors can often identify errors in manuscripts, as offences against grammar, poetic metre, consistency or good sense. Where two or more manuscripts share errors (or rather, readings that can safely be said not to have stood in the original) not attributable to coincidence, editors infer common descent. These historical inferences, which ideally add up to a stemma (i.e. family-tree), later form a basis for choice between the rival readings in passages where intrinsic criteria fail.

These procedures raise some questions that traditional scholarship cannot answer. For example, any codex descriptus (i.e. a manuscript descended from another extant manuscript) must obviously be discarded; but how are such manuscripts identified? Maas (1958, p. 52) would presume every manuscript, apart from the oldest, to be a descriptus, failing evidence to the contrary (normally, an original reading found in no earlier manuscript and not attributable to conjecture). Pasquali (1952, pp. 30 ff), on the other hand, required direct proof of the dependence of one manuscript on another, as when the earlier manuscript has a missing leaf or a stain and the later manuscript has an exactly corresponding lacuna without external cause, as described in Souilhe (1943, p. lxxii). Given the rarity of cogent evidence either way, the presumption of dependence will exclude far more manuscripts than that of independence. Yet conventional methods cannot show which presumption is likelier.

Nor has conventional scholarship agreed an explanation of two features common to the great majority of manuscript traditions, whether classical, patristic or medieval. First, all the extant manuscripts are usually found to share some errors, and therefore held to derive from an archetype (i.e. a latest common ancestor) later than the original. (The set of ancestors of a manuscript is here defined to include that manuscript itself.) Second, the reconstructed stemma almost invariably shows exactly two branches issuing from that archetype. Most stemmata are thus found to share the basic form of Fig. 1. Many explanations peculiar

† Address for correspondence: Dept of Hebrew and Jewish Studies, University College London, Gower St., London WC1E 6BT, UK

© 1987 Royal Statistical Society 0035-9238/87/150287 $2.00


Fig. 1 [figure not reproduced: a stemma in which the original leads to the archetype, from which exactly two branches issue]

to one particular literature have been proposed. For the first phenomenon (the archetype), for example, Maas (1952, p. 490) offered an explanation peculiar to Greek: when uncial script gave way in the ninth century AD to minuscule, furnished with accents and breathings, the first minuscule copy became the source for all later copies. Again, the prevalence of two-branched stemmata for medieval Provençal works was attributed by D. S. Avalle (1961, pp. 95f.) to a method, attested in some medieval Romance manuscripts, of accelerating production: two scribes simultaneously copied the two halves of a single exemplar, changing over midway. General phenomena, however, require general explanations.

The present paper seeks such explanations by modelling the development of manuscript traditions as a birth-and-death process. In any such model, various parameters (the probabilities of birth and death, and the duration of the process) must be specified. It would obviously be inappropriate to use the same values for all literature. Let us therefore set up different classes of literature, estimating the parameters for each class separately, on the basis of the surviving works. The model aims to shed light upon the properties of traditions in general within the relevant class, rather than directly upon any particular manuscript or literary work. The parameters are estimated here for two classes: classical Greek and classical Latin. The estimates make no claim to finality, and in any case it is hoped that the interest of the model will not be confined to these applications.

2. CONSTRUCTION OF A MODEL

Haigh (1971) viewed the growth of a manuscript tradition as a pure birth process, on which basis he derived a procedure for deciding which of the manuscripts transmitting a given work was likeliest to be the original. In fact, however, the originals of virtually all ancient works have perished, and the "death" of manuscripts cannot realistically be ignored. Whitehead and Pickford (1951), Castellani (1957) and Dearing (1962, pp. 25 ff) all constructed a hypothetical tradition (though its form differed among the three studies, and all but Castellani tried out many alternatives), and considered the set of all manuscripts that it had ever contained. They then killed off the majority of these manuscripts, leaving a small number selected at random, and examined the relations between the survivors. Kleinlogel (1968, p. 73) proposed the refinement that the younger manuscripts should be assigned a higher probability of survival, i.e. that the probability of death be age-dependent. Most of these studies aimed to explain the preponderance of two-branched stemmata, apart from that by Dearing, who, like Haigh, sought a procedure for identifying the root of a manuscript tree. All the models assume a pure birth phase followed by a pure death phase; in fact, of course, the processes of birth and death were contemporaneous.

Dearing (1974, pp. 123ff., 217ff.) later described a series of simulations in which birth and death went on side by side. He fed a sequence of events (e.g. "four births, one death, one birth ...") into the computer, which chose at random the manuscript that each event befell. No rationale, however, is offered for the length of each run of births or deaths. The underlying birth- and death-probabilities should first have been determined, in the light of historical conditions.

In a pilot study (1982), the present writer considered a birth-and-death process, starting at AD 500, midway between the dates of classical and medieval works. A phase with constant birth-rate $\lambda$ and death-rate $\mu$ was posited until AD 1500 (the age of print), and a pure death phase thereafter. Fifteen traditions simulated by computer are there described.†

In the model here proposed, the probabilities of birth ($\lambda$) and death ($\mu$) depend on time, undoubtedly the most important factor. $L(t)$ is defined as $\int_0^t \lambda(\tau)\,d\tau$, and $M(t)$ as $\int_0^t \mu(\tau)\,d\tau$. The process starts at time 0 and ends at time $T$, the present. Birth means the production of a new copy. The death of a manuscript means the loss of at least half its text; papyri and palimpsests are considered dead, as they fell out of circulation.
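The process just defined can be illustrated by a small simulation. The following is only a sketch under stated assumptions: time is advanced in discrete steps rather than continuously, the function name `simulate_tradition` is introduced here for illustration, and the constant rates in the usage lines are hypothetical values, not the estimates derived later in the paper.

```python
import random


def simulate_tradition(birth_rate, death_rate, T, dt=1.0, seed=None):
    """Return the number of live manuscripts at time T, starting from one original.

    birth_rate(t) and death_rate(t) are per-manuscript rates per unit time.
    Time advances in steps of dt; in each step every live manuscript is copied
    with probability birth_rate(t)*dt and perishes with probability
    death_rate(t)*dt (a first-order approximation, reasonable only while
    rate*dt is small).
    """
    rng = random.Random(seed)
    n, t = 1, 0.0  # the tradition starts with the original alone
    while t < T and n > 0:
        p_birth = min(1.0, birth_rate(t) * dt)
        p_death = min(1.0, death_rate(t) * dt)
        births = sum(rng.random() < p_birth for _ in range(n))
        deaths = sum(rng.random() < p_death for _ in range(n))
        n += births - deaths
        t += dt
    return n


# Hypothetical constant rates (assumptions, not the paper's estimates).
sizes = [simulate_tradition(lambda t: 0.010, lambda t: 0.009, T=2000, seed=s)
         for s in range(50)]
extinct = sum(size == 0 for size in sizes)
print(f"{extinct} of {len(sizes)} simulated traditions died out")
```

Time-dependent rates can be supplied by passing any function of `t`; the step width `dt` should be reduced if the rates are large.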

The discrepancies between the model and real life can be classified thus:

(a) The real-life counterparts of the mathematical concepts are not always clear-cut. Firstly, the definition of the original, with which the process begins, is obscured if the work was revised by the author, or issued in instalments, or first transmitted orally (Dain 1975: 103-6). Secondly, birth is not instantaneous. Thirdly, a copy may spring from more than one source in the event of contamination, i.e. if the scribe consulted more than one source, a practice sometimes directly attested (Laistner 1957: 255) and more often to be inferred from tangled relationships between manuscripts. (The immediate ancestor of a manuscript can however always be defined uniquely as its main source, i.e. the source from which it deviates the least.) Fourthly, and above all, the fate of manuscripts was only rarely decided by sheer chance, as when the sole surviving manuscript of the Homeric Hymn to Demeter was discovered in 1777 in a Moscow stable where it had "lain many years amongst the chickens and pigs" (Richardson, 1974, pp. 65 f). Chance in the model must also stand for many factors that were not accidental (e.g. local riots, scribal whims).

(b) The model assumes that all members of the relevant class of literature faced the same birth- and death-probabilities. In fact, the more popular authors faced a higher birth probability (if not also a lower death probability). Again, the starting-point, and therefore the time-span ($T$), varied within the class.

(c) Variation of the rates with factors other than time (e.g. age, local conditions, supply of copyists and materials), is neglected. So too is interaction between birth and death; for example, Renaissance scholars tended to lose old manuscripts that they had copied (Reynolds and Wilson 1974, p. 124).

(d) Even in relation to the one factor retained, namely time, the available data offer no more than a broad indication of the course of the birth- and death-probabilities.

In brief this model, like other models, does not match the complexity of real life. Such shortcomings were already noted by Kleinlogel, who considered the birth-and-death process more suitable than his "birth phase + death phase" model, but impractical, on the ground that the requisite probabilities could not be estimated (1968, p. 74). Similarly Timpanaro (1985, p. 131) denied that any "mere mathematical calculation" could resolve the problem of the two-branched stemma, or, presumably, other problems concerning manuscripts. But the fact that the great majority of stemmata over so many literatures share the form of Fig. 1 must be due to a limited number of factors which are common to manuscript traditions in general and which one can hope to incorporate in the mathematical model.

3. ESTIMATION OF PARAMETERS: THE EXAMPLES OF GREEK AND LATIN

The birth and death probabilities, as functions of time, may be traced on the basis of (i) the known course of political and scholastic history, and (ii) censuses of surviving manuscripts broken down by age. The starting-point was fixed at 450 BC for Greek and 50 BC for Latin, and the end at AD 1950; thus $T$ (in years) is 2400 for Greek and 2000 for Latin. On the historical side, I was fortunate to consult two eminent and patient specialists, Mr Nigel Wilson (Lincoln College, Oxford) for Greek and Mr Robert Ireland (University College London) for

† Weitzman (1982) contains some misprints that require correction: on p. 56, after the phrase "the interval elapsing before the event is", insert "$-\ln x/\{N(\lambda + \mu)\}$". On p. 57, for "no. 90" read "nos. 90 and 105". On p. 59, in the final diagram two extant manuscripts (namely nos. 90 and 105) independently descended from no. 40 were omitted.


Fig. 2 [figure not reproduced: Latin birth-rate and death-rate plotted against date, 50 BC to AD 1950, annotated with historical events including the imperial peak, the usurping emperors, the northern invasions, Justinian's wars in Italy, the foundation of monasteries, the Carolingian revival, the twelfth-century renaissance, Poggio, the High Renaissance and printed editions]

Latin. I pressed them, probably harder than they would have wished, to quantify their impressions, and am fully responsible for any flaws in the resulting graphs (Figs 2-3). At this stage, the scale of the functions was not known; the functions were graphed on an arbitrary scale as $\lambda^*$ and $\mu^*$ (with integrals $L^*$ and $M^*$), so that $\lambda(t) = l\lambda^*(t)$, $\mu(t) = m\mu^*(t)$, where $l$ and $m$ were unknown constants. The Greek history was complicated by three large-scale disasters, namely the destruction of the Serapeum library in Alexandria (AD 391) and the sackings of Constantinople by the Crusaders (AD 1204) and by the Turks (AD 1453). The probabilities of death at these points (based on guesses at the proportion that perished of the population) were assumed at 0.01, 0.025 and 0.025 respectively.

The age distribution of the extant manuscripts provided an indication of $l$. With a deterministic approximation, the population at time $t$ is $\exp[L(t) - M(t)]$. Of these, a proportion $\exp[M(t) - M(T)]$ and therefore a total of $\exp[L(t) - M(T)]$ survive until $T$, so that the proportion of manuscripts in the final population that pre-date $t$ is $\exp[L(t) - L(T)]$. Gerrard (1983) has shown by a stochastic analysis that this age distribution holds asymptotically as $T \to \infty$. It follows that if $N(t)$ is the observed number of extant manuscripts pre-dating $t$, then an estimator of $l$ is

$$\frac{\ln\{N(t_2)/N(t_1)\}}{L^*(t_2) - L^*(t_1)} \tag{1}$$

where t1 and t2 are two different values of t. Cumulative distributions of the age of the manuscripts in sample texts in Greek and Latin are shown in Table 1. Only a minority of manuscripts bear a date recorded by the scribe, but the rest can usually be dated fairly confidently within a century, by comparison of handwriting and physical composition.
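The numerator of eqn. (1) can be evaluated directly from the cumulative counts of Table 1. The sketch below does so for successive centuries; the function names are introduced here for illustration, and the denominator $L^*(t_2) - L^*(t_1)$ is left as an argument because it must be read off the arbitrary-scale graphs, which the text does not tabulate.

```python
import math

# Cumulative counts N(t) of extant manuscripts dated before t (AD), from Table 1.
GREEK = {1100: 25, 1200: 35, 1300: 111, 1400: 306, 1500: 780, 1600: 908}
LATIN = {1100: 39, 1200: 97, 1300: 136, 1400: 182, 1500: 512}


def log_increments(counts):
    """Return {t: ln(N(t) / N(previous date))} for successive dates in counts."""
    dates = sorted(counts)
    return {t2: math.log(counts[t2] / counts[t1])
            for t1, t2 in zip(dates, dates[1:])}


def estimate_l(counts, t1, t2, l_star_increase):
    """Eqn. (1): ln{N(t2)/N(t1)} divided by L*(t2) - L*(t1).

    l_star_increase stands for L*(t2) - L*(t1); it has to be supplied by the
    user, since the source gives L* only as a graph.
    """
    return math.log(counts[t2] / counts[t1]) / l_star_increase


for name, counts in (("Greek", GREEK), ("Latin", LATIN)):
    print(name, {t: round(v, 2) for t, v in log_increments(counts).items()})
```

The increments agree with the row of Table 1 marked "by eqn. (1)", except that the last Latin value evaluates to about 1.03 where the table prints 1.04.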

As an alternative approach, the above age distribution implies that the mean date of the manuscripts of the final population is

$$T - \int_0^T e^{l[L^*(t) - L^*(T)]}\,dt \tag{2}$$


Fig. 3 [figure not reproduced: Greek birth-rate and death-rate plotted against date, 450 BC to AD 1950, annotated with historical events including the Roman civil wars, the destruction of the Serapeum at Alexandria (AD 391), the Arab invasions, the iconoclasts, the sackings of Constantinople by the Crusaders (AD 1204) and by the Turks (AD 1453), and the Palaeologan renaissance]

which can be compared with the observed mean, which my informants set no later than AD 1425 for classical Greek literature and AD 1400 for classical Latin.
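Eqn. (2) is readily evaluated by numerical integration once $L(t) = lL^*(t)$ is available. The following sketch uses the trapezoidal rule and, purely as an assumption for checking, a constant birth rate, for which the integral has a closed form; the result is measured in years from the starting-point of the process.

```python
import math


def mean_date(L, T, steps=20000):
    """Eqn. (2): T minus the integral over [0, T] of exp[L(t) - L(T)].

    L is the cumulative birth rate as a function of elapsed time t; the
    integral is approximated by the trapezoidal rule.
    """
    h = T / steps
    total = 0.5 * (math.exp(L(0.0) - L(T)) + 1.0)  # the two end-points
    for k in range(1, steps):
        total += math.exp(L(k * h) - L(T))
    return T - h * total


# Hypothetical constant birth rate (an assumption, not the paper's graph).
rate, T = 0.005, 2000.0
approx = mean_date(lambda t: rate * t, T)
exact = T - (1.0 - math.exp(-rate * T)) / rate  # closed form for a constant rate
print(round(approx, 2), round(exact, 2))
```

To express the result as a calendar date, the elapsed time would be added to the starting-point (450 BC for Greek, 50 BC for Latin).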

Two statistics for estimating $m$ stem from the distribution of final population size. Kendall (1948) found the p.g.f. to be $G(0, T, z)$, where

$$V(t_1, t_2) = \exp[M(t_2) - M(t_1) - L(t_2) + L(t_1)],$$

$$I(t_1, t_2) = \int_{t_1}^{t_2} \lambda(\tau)\,V(t_1, \tau)\,d\tau,$$

$$G(t_1, t_2, z) = 1 + \frac{1}{V(t_1, t_2)/(z - 1) - I(t_1, t_2)}. \tag{3}$$


TABLE 1

t = date (AD)                          1100    1200    1300    1400    1500    1600

Greek
No. of extant mss. of earlier date       25      35     111     306     780     908
%age of above                          2.75    3.85   12.22   33.70   85.90     100
L(t) - L(t - 100), by eqn. (1)                 0.34    1.15    1.01    0.94    0.15
L(t) - L(t - 100), by Fig. 3                   0.65    0.85    0.82    0.76    0.35

Latin
No. of extant mss. of earlier date       39      97     136     182     512       *
%age of above                          7.62   18.95   26.56   35.55     100
L(t) - L(t - 100), by eqn. (1)                 0.91    0.34    0.29    1.04
L(t) - L(t - 100), by Fig. 2                   0.75    0.57    0.46    1.09

The sources are as follows; where a single work has been extracted, its title is enclosed in square brackets. Greek: White, J. W. (1906) "The Manuscripts of Aristophanes", Classical Philology 1, 1-20 [Plutus]; Stein, H. (1869-71) Herodoti Historiae, Berlin, v-xix; Allen, T. W. (1931), Homeri Ilias, vol. 1, Oxford, 11-55; ibid. (1910) "The Text of the Odyssey", Papers of the British School at Rome 5, 1, 3-16; Drerup, E. (1906) Isocratis Opera Omnia, vol. 1, Leipzig [to Demonicus]; Irigoin, J. (1952) Histoire du texte de Pindare, Paris, 432-442 [Olympian Odes]; Post, L. A. (1934) The Vatican Plato and its Relations, Middletown, Conn. [Gorgias]; Turyn, A. (1944), "The Manuscripts of Sophocles", Traditio 2, 1-41 [Ajax]; Alberti, G. B. (1972), Thucydidis Historiae, vol. 1, Rome, ix-xxviii; Widdra, K. (ed.) (1964) Xenophon; on Horsemanship. Throughout, the figures are restricted to manuscripts 'alive' on the definition of Section 2.

Latin: Reynolds et al. (1983), p. xxvii, on Sallust. * not given; Latin texts were hardly copied after AD 1500 (Reynolds et al., n. 103).

It follows, firstly, that the probability that the tradition dies out is

$$E = 1 - \frac{1}{V(0, T) + I(0, T)}. \tag{4}$$

As only a small proportion of ancient literature has survived, this probability should be high, say 0.9, though, failing a precise definition of literature, it cannot be quantified exactly. (A glimpse of Latin losses is offered by Bardon, 1952-60.)

Secondly, the mean size of surviving families is

$$S = 1 + \frac{I(0, T)}{V(0, T)}. \tag{5}$$

Both in Greek and in Latin, real traditions vary in size between 1 and well over 100, and a mean of 55 was assumed (though in fact the mean size of surviving traditions is probably higher for Latin than for Greek). Hence $S(1 - E)$, which by eqns. (4-5) equals $1/V(0, T) = \exp[L(T) - M(T)]$, should equal 5.5, so that

$$m = \frac{lL^*(T) - \ln(5.5)}{M^*(T)}. \tag{6}$$

A first estimate of $l$ for Latin was found from eqn. (1) applied to Table 1, with $t_1$ = AD 1100, $t_2$ = AD 1500. Thence $m$ was estimated by eqn. (6). When substituted in eqns. (4-5), however, the values of $l$ and $m$ for Latin made both $S$ and $E$ too low, at 17.43 and 0.68 respectively; the average date on eqn. (2) was AD 1382. It proved possible to raise both $S$ and $E$ by increasing $l$ (and consequently $m$, as estimated by eqn. (6)), but then the average date became later. Hence $l$ was increased to the level yielding the latest acceptable average date (AD 1400); this gave $S$ = 18.89, $E$ = 0.71. For Greek, $l$ was derived similarly by eqn. (1), with $t_1$ = AD 1100, $t_2$ = AD 1600, and thence $m$, again by eqn. (6). Here again $S$ and $E$ were too low (at 17.84 and 0.69); moreover, the average date was too late, at AD 1432. To increase $l$ would make the date later still; to lower $l$ would depress $S$ and $E$ yet further; and I shrank from changing the shape of the rates without historical evidence. In the end $l$ was lowered slightly in order that at least the average date should fall within the acceptable range, at AD 1425; thence $S$ = 17.33, $E$ = 0.68. The final choices of $l$ and $m$ underlie the scale of Figs 2-3. The rates shown are generally higher for Greek than for Latin, but the constraints leave little room to adjust them.
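The quantities in eqns. (3-5) can be computed numerically for any pair of rate functions. The sketch below is an illustration only: the function name `kendall_summary` and the constant rates are assumptions made here, and the integrals are approximated by the trapezoidal rule. It also checks the identity $S(1 - E) = \exp[L(T) - M(T)]$ on which eqn. (6) rests, and repeats the arithmetic for the final values of $S$ and $E$ reported above.

```python
import math


def kendall_summary(lam, mu, T, steps=20000):
    """Return (E, S, expected final population) for rate functions lam and mu.

    E is the extinction probability of eqn. (4), S the mean size of surviving
    traditions of eqn. (5); V(0,T) and I(0,T) are built up on an even grid.
    """
    h = T / steps
    L = M = I = 0.0
    prev = lam(0.0)  # integrand lam(t) * exp[M(t) - L(t)] at t = 0
    for k in range(1, steps + 1):
        t0, t1 = (k - 1) * h, k * h
        L += 0.5 * h * (lam(t0) + lam(t1))
        M += 0.5 * h * (mu(t0) + mu(t1))
        cur = lam(t1) * math.exp(M - L)
        I += 0.5 * h * (prev + cur)
        prev = cur
    V = math.exp(M - L)
    E = 1.0 - 1.0 / (V + I)
    S = 1.0 + I / V
    return E, S, math.exp(L - M)


# Hypothetical constant rates (assumptions, not the paper's estimates).
E, S, population = kendall_summary(lambda t: 0.010, lambda t: 0.009, 2000.0)
print(round(E, 3), round(S, 2), round(S * (1.0 - E), 3), round(population, 3))

# Arithmetic check on the values reported in the text: S(1 - E) should be near 5.5.
for label, S_text, E_text in (("Greek", 17.33, 0.68), ("Latin", 18.89, 0.71)):
    print(label, round(S_text * (1.0 - E_text), 2))
```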


Another aspect of the low values of $S$ and $E$ is that the true variation in population size of extant traditions is understated. The upper 1% limit, for example, can be found through eqns. (3-5) to be

$$\frac{\ln(0.01) - \ln(1 - E)}{\ln(1 - 1/S)},$$

giving 59 for Greek and 63 for Latin, whereas in fact far more than 1% of surviving traditions exceed these values. All these discrepancies seem due to the heterogeneity, which the model ignores, within each literature; neither the loss of the least popular works nor the proliferation of the most popular is fully reflected.
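The upper limit can be evaluated from the rounded values of $S$ and $E$ quoted in Section 3. The sketch below rests on the reading that, by eqn. (3), the chance of a final population exceeding $n$ is $(1 - E)(1 - 1/S)^n$; the function name is introduced here for illustration.

```python
import math


def upper_limit(E, S, tail=0.01):
    """Population size n at which (1 - E) * (1 - 1/S)**n falls to `tail`."""
    return (math.log(tail) - math.log(1.0 - E)) / math.log(1.0 - 1.0 / S)


for label, E, S in (("Greek", 0.68, 17.33), ("Latin", 0.71, 18.89)):
    print(label, round(upper_limit(E, S), 1))
```

With these rounded inputs the formula yields values a little below the 59 and 63 given in the text; the difference presumably reflects rounding of $S$ and $E$.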

4. EXTINCTION AND VARIATION IN FINAL POPULATION SIZE

Our random model, then, does not explain totally the high levels observed in both the above. But neither is conscious selection a sufficient explanation. In the case of Greek tragedy, for example, von Wilamowitz-Moellendorff (1907, pp. 195-200) suggested that some schoolmaster between the second and fourth centuries AD set up a restricted syllabus which gained general acceptance, other plays effectively ceasing to be copied. Other categories of Greek literature (e.g. lyric poetry) were supposed to have suffered similarly (p. 179). As Reynolds and Wilson (1974, p. 47) object, however, there is no indication who the schoolmaster was, or why his syllabus was so influential; and many books now lost were still available to the ninth-century scholar Photius. Furthermore, the theory is confined to Greek literature, while high levels of extinction and of variation in population size are general phenomena. It is better to attribute them partly to chance (or rather the host of largely unknown minor factors for which chance stands) and partly to choice, due to heterogeneity in popularity and not exercised in any particular period. Once the chance factor is taken into account, narrowing of the school syllabus can be viewed more often as the effect than the cause of the loss or scarcity of particular texts.

For Latin too, according to Reynolds et al. (1983, pp. xv, xxviii, with n. 106), historians agree that the extinctions of different traditions were spread over many centuries. This feature is shared by both applications of the model, unlike Wilamowitz's theory. The probability of extinction by time $t$ is found from eqn. (3) to be $G(0, t, 0)$ or

$$1 - \left\{\exp[M(t) - L(t)] + \int_0^t \lambda(\tau)\exp[M(\tau) - L(\tau)]\,d\tau\right\}^{-1},$$

whence Table 2. In both applications, extinctions were commonest in the earliest times, but over 10% of all books now lost were still extant in AD 900.

Another aspect of extinction is whether a manuscript originating at time $t$ will be extant, or leave extant progeny, at time $T$. Eqn. (3) gives the probability as

$$S(t, T) = \frac{\exp[M(t) - L(t)]}{\exp[M(T) - L(T)] + \int_t^T \lambda(\tau)\exp[M(\tau) - L(\tau)]\,d\tau}.$$

$S(t, T) = 1 - E$ (equalling 0.32 for Greek, 0.29 for Latin) at $t = 0$ and 1 at $t = T$. Were $\lambda$ and $\mu$ constant, it would rise monotonically. As medieval conditions were much harsher than those

TABLE 2
Date whereby the given percentage of all extinctions had occurred

Percentage of all extinctions        25        50        75        90
Greek                            BC 398    BC 243    AD 848   AD 1018
Latin                             AD 57    AD 407    AD 703    AD 918


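Fig. 4 [figure not reproduced: a genealogy in which the lost manuscript α has two descendants, the extant manuscript A and the lost manuscript β, from which the extant manuscript B is copied]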

in antiquity, however, the probability drops, for Latin, to a minimum of 0.065 in AD 185, not recovering to 0.29 until AD 660; when birth ceases in AD 1550, its value is 0.47. For Greek it drops more sharply still to a minimum of 0.00068 in AD 605, first rising again to 0.32 in AD 865; birth ceases in AD 1600, and here again the probability equals 0.47.
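The behaviour of $S(t, T)$ described here can be explored numerically. The following sketch evaluates the function on an even grid by the trapezoidal rule; the rate functions, including the "dark age" of trebled death rate, are hypothetical and chosen only to show that the probability need not rise monotonically when the rates vary.

```python
import math


def survival_curve(lam, mu, T, steps=2000):
    """Return (times, S) with S[k] = S(t_k, T) on an even grid over [0, T].

    Uses S(t,T) = exp[M(t)-L(t)] / (exp[M(T)-L(T)] + integral from t to T of
    lam(u) exp[M(u)-L(u)] du), with all integrals by the trapezoidal rule.
    """
    h = T / steps
    times = [k * h for k in range(steps + 1)]
    # w[k] = exp[M(t_k) - L(t_k)], built up from the cumulative integrals.
    w, diff = [1.0], 0.0
    for k in range(1, steps + 1):
        diff += 0.5 * h * ((mu(times[k - 1]) - lam(times[k - 1]))
                           + (mu(times[k]) - lam(times[k])))
        w.append(math.exp(diff))
    # tail[k] = integral from t_k to T of lam * w, accumulated backwards.
    tail = [0.0] * (steps + 1)
    for k in range(steps - 1, -1, -1):
        tail[k] = tail[k + 1] + 0.5 * h * (lam(times[k]) * w[k]
                                           + lam(times[k + 1]) * w[k + 1])
    S = [w[k] / (w[steps] + tail[k]) for k in range(steps + 1)]
    return times, S


# Hypothetical rates: constant births, with deaths trebled during a "dark age".
lam = lambda t: 0.010
mu = lambda t: 0.030 if 500 <= t < 900 else 0.009
times, S = survival_curve(lam, mu, T=2000.0)
print(f"S(0,T) = {S[0]:.2e}, minimum = {min(S):.2e}, S(T,T) = {S[-1]:.2f}")
```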

5. THE BACKGROUND TO THE EXTANT TRADITION

As mentioned above, editors seek to construct a family-tree or stemma of manuscripts. Let us first consider all the manuscripts that ever existed in a given tradition, and then identify successive subsets. The set of all manuscripts, living or dead, of a tradition was termed by Fourquet (1946, p. 5) the arbre réel. Of these manuscripts, those that survive or leave extant progeny may be called relevant manuscripts. (A weakness of this term is that in the event of contamination, a lost manuscript with no direct progeny extant may nevertheless have transmitted readings, and be in that sense relevant, to the extant manuscripts.) Consider now the genealogy of Fig. 4, where Greek letters represent lost manuscripts. Ms. β will add some errors to those of α, and pass them on to B, which adds others still. The editor faced with A and B alone cannot normally know that the errors peculiar to B are due to two sources. He cannot tell that β ever existed; nor will that ignorance mar his reconstruction of α, which will show the reading common to AB, where they agree, and be indeterminate where they do not. The existence of α, however, can be inferred from the errors common to AB. The only manuscripts of which an editor is normally aware are thus (i) extant manuscripts, (ii) lost manuscripts from which two or more extant branches issue, and (iii) the original. This subset of the relevant manuscripts may be termed the point manuscripts, as they correspond to the points of the stemma. (The other relevant manuscripts may by analogy be called arc manuscripts.) The extant manuscripts are in turn a subset of the point manuscripts. Some, however, are descripti (§1), and interest centres on the remainder, which form a subset of 'independent' manuscripts. Finally, in order to reconstruct all that has survived of the original text, one need not consult all the independent mss; a smaller "team" of manuscripts can usually be identified that together contain every good reading anywhere preserved (Bevenot, 1961, pp. 133 ff; West, 1973, p. 43).
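The successive subsets just defined can be made concrete with a small program. This is a sketch only: the function `classify` and the genealogy below are hypothetical, the latter extending the pattern of Fig. 4 with an invented original `omega` and an invented lost manuscript `gamma` that left no extant progeny; contamination is ignored, so that every manuscript has a single immediate ancestor.

```python
def classify(parent, extant, original):
    """Split a genealogy into relevant, point and arc manuscripts.

    parent maps each manuscript to its immediate ancestor (the original maps
    to None); extant is the set of surviving manuscripts.
    """
    children = {}
    for ms, par in parent.items():
        if par is not None:
            children.setdefault(par, []).append(ms)

    def has_extant(ms):
        # True if ms survives or leaves extant progeny.
        return ms in extant or any(has_extant(c) for c in children.get(ms, []))

    relevant = {ms for ms in parent if has_extant(ms)}
    point = set()
    for ms in relevant:
        # Number of branches issuing from ms that reach an extant manuscript.
        branches = sum(has_extant(c) for c in children.get(ms, []))
        if ms in extant or ms == original or branches >= 2:
            point.add(ms)
    return relevant, point, relevant - point


# Hypothetical genealogy: omega -> alpha -> {A, beta -> B}; omega -> gamma.
parent = {"omega": None, "alpha": "omega", "gamma": "omega",
          "A": "alpha", "beta": "alpha", "B": "beta"}
relevant, point, arc = classify(parent, extant={"A", "B"}, original="omega")
print(sorted(relevant), sorted(point), sorted(arc))
```

In this example `beta` is the only arc manuscript, and `gamma` belongs to the arbre réel without being relevant.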

The ratio of the size of the arbre réel to the extant population can be estimated by a deterministic approximation. As the population at time $t$ totals $\exp[L(t) - M(t)]$, the arbre réel comprises the original and the births $\lambda(t)\exp[L(t) - M(t)]\,\delta t$ during $[t, t + \delta t]$. The required ratio is then

$$\exp[M(T) - L(T)]\left[1 + \int_0^T \lambda(t)\exp\{L(t) - M(t)\}\,dt\right],$$

calculated at 503.1 for Greek and 7.1 for Latin. The Greek figure is suspiciously high, but even the Latin figure shows that far more manuscripts have perished than survived.

The Greek and Latin applications of the model agree better on the ratio of relevant to extant manuscripts. Consider a manuscript born at time $t$. Let $G(t, z)$ be the p.g.f. of the number of manuscripts, among that initial manuscript and its descendants, which either survive or leave extant progeny at time $T$. The number of births generated by that manuscript in any time interval $[\tau, \tau + \delta\tau]$ during its life is a Poisson variate, with p.g.f. $\exp[\lambda(\tau)(z - 1)\delta\tau]$. The p.g.f. of the number of manuscripts, in the population arising from (and including) these births, which either survive or leave progeny extant at time $T$ is then $\exp[\lambda(\tau)\{G(\tau, z) - 1\}\delta\tau]$. Now there is probability $\exp[M(t) - M(T)]$ that the initial manuscript will survive until $T$, and probability $\mu(s)\exp[M(t) - M(s)]\,\delta s$ that it will die in the interval $[s, s + \delta s]$ $(s < T)$. The initial manuscript itself will be a relevant manuscript only if its family survives, with probability $S(t, T)$. Hence

$$\begin{aligned} G(t, z) ={}& z\exp\left[M(t) - M(T) - L(T) + L(t) + \int_t^T \lambda(\tau)G(\tau, z)\,d\tau\right] \\ &+ z\exp[M(t) + L(t)]\int_t^T \mu(s)\exp\left[-M(s) - L(s) + \int_t^s \lambda(\tau)G(\tau, z)\,d\tau\right]ds \\ &+ (1 - z)\{1 - S(t, T)\}. \end{aligned}$$

The solution is

$$G(t, z) = 1 - S(t, T) + \frac{z\exp[L(t) - zM(t)]\,[S(t, T)]^{1+z}}{\exp[L(T) - zM(T)] - z\int_t^T \lambda(\tau)\exp[L(\tau) - zM(\tau)]\,[S(\tau, T)]^{1+z}\,d\tau}.$$

The mean is then

$$S(t, T) + \exp[M(t) - L(t)]\int_t^T \lambda(\tau)\exp[L(\tau) - M(\tau)]\,S(\tau, T)\,d\tau.$$

Putting $t = 0$, we obtain the expected number of relevant manuscripts. Dividing by $\exp[L(T) - M(T)]$, the expected number of manuscripts altogether, we find the ratio of relevant to extant manuscripts to be

$$e^{M(T) - L(T)}\left[S(0, T) + \int_0^T \lambda(\tau)e^{L(\tau) - M(\tau)}S(\tau, T)\,d\tau\right].$$

Strictly, this is the ratio of the expectations, rather than the expected ratio. It is calculated at 3.9 for Greek and 2.4 for Latin. Comparison with the above ratios between the sizes of the arbre réel and the extant manuscripts suggests that the relevant manuscripts comprise an unexpectedly narrow sector of the whole tradition.
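Both ratios of this section, that of the arbre réel and that of the relevant manuscripts to the extant population, can be computed together for given rate functions. The sketch below is an illustration under assumptions: the function name is introduced here, the constant rates are hypothetical, and every integral is approximated by the trapezoidal rule on an even grid.

```python
import math


def tradition_ratios(lam, mu, T, steps=4000):
    """Return (arbre reel / extant, relevant / extant) as ratios of expectations."""
    h = T / steps
    t = [k * h for k in range(steps + 1)]
    # g[k] = exp[L(t_k) - M(t_k)], the deterministic population at t_k.
    g, diff = [1.0], 0.0
    for k in range(1, steps + 1):
        diff += 0.5 * h * ((lam(t[k - 1]) - mu(t[k - 1])) + (lam(t[k]) - mu(t[k])))
        g.append(math.exp(diff))
    # tail[k] = integral from t_k to T of lam(u) exp[M(u) - L(u)] du.
    tail = [0.0] * (steps + 1)
    for k in range(steps - 1, -1, -1):
        tail[k] = tail[k + 1] + 0.5 * h * (lam(t[k]) / g[k] + lam(t[k + 1]) / g[k + 1])
    # S[k] = S(t_k, T), the chance that a manuscript born at t_k is relevant.
    S = [(1.0 / g[k]) / (1.0 / g[steps] + tail[k]) for k in range(steps + 1)]

    def trapezoid(values):
        return h * (sum(values) - 0.5 * (values[0] + values[-1]))

    births = [lam(t[k]) * g[k] for k in range(steps + 1)]
    relevant_births = [births[k] * S[k] for k in range(steps + 1)]
    arbre = (1.0 + trapezoid(births)) / g[steps]
    relevant = (S[0] + trapezoid(relevant_births)) / g[steps]
    return arbre, relevant


# Hypothetical constant rates (assumptions, not the paper's estimates).
arbre_ratio, relevant_ratio = tradition_ratios(lambda t: 0.010, lambda t: 0.009, 2000.0)
print(round(arbre_ratio, 2), round(relevant_ratio, 2))
```

Since every extant manuscript is relevant and every relevant manuscript belongs to the arbre réel, the second ratio must lie between 1 and the first.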

This point is confirmed by investigation of the number of removes between the original and an extant manuscript of given date. This number is termed the generation of the manuscript; the original has generation zero. At time t let n(i, t) be the number of live manuscripts of generation i, and put

$$F(t, z) = \sum_i n(i, t)z^i.$$

Over the interval $[t, t + \delta t]$, any manuscript of generation $i$ is subject to probabilities

$\lambda(t)\,\delta t$ of generating a new manuscript, of generation $(i + 1)$;

$\mu(t)\,\delta t$ of dying.

On a deterministic approximation, $z^i$ rises by

$$\{\lambda(t)z^{i+1} - \mu(t)z^i\}\,\delta t.$$

Hence

$$\frac{\partial F}{\partial t} = (\lambda z - \mu)F;$$

and as $F(0, z) = 1$, the required solution is

$$F(t, z) = \exp[zL(t) - M(t)].$$

As the population at time $t$ totals $\exp[L(t) - M(t)]$, the generation of the manuscripts then extant is a Poisson variate with mean $L(t)$, so that the generation of a manuscript originating at that date is unity plus such a Poisson variate. Table 3 shows $1 + L(t)$ for various values of $t$, followed by $L(T)$. The figures for Greek again seem high, and those for Latin seem low, but two features are more important than the figures themselves. First, the overall mean generation $L(T)$ exceeds the ratio of relevant to extant manuscripts, fivefold for Greek and fourfold for Latin. This testifies to the great overlap between the ancestors of the different extant manuscripts, showing in other words how narrow a line the extant manuscripts represent. Second, although later manuscripts tend to be of higher generation, the variance implies that, for example, of two manuscripts dated AD 750 and AD 1500, the younger may well be of lower generation, with probability 0.20 for Greek and 0.08 for Latin.
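The comparison of generations can be sketched numerically. The calculation below assumes that the generations of the two manuscripts are independent, each being one plus a Poisson variate with mean $L(t)$ taken from Table 3 less one; this independence is an assumption of the sketch, since the text does not spell out how its probabilities were obtained. Both the strict and the non-strict comparison are returned.

```python
import math


def poisson_pmf(mean, kmax):
    """Poisson probabilities for 0..kmax, built up by the usual recurrence."""
    probs = [math.exp(-mean)]
    for k in range(1, kmax + 1):
        probs.append(probs[-1] * mean / k)
    return probs


def younger_not_higher(mean_old, mean_young, kmax=200):
    """Return (P[young < old], P[young <= old]) for independent Poisson generations.

    The common offset of one generation cancels in the comparison.
    """
    old = poisson_pmf(mean_old, kmax)
    young = poisson_pmf(mean_young, kmax)
    strictly, at_most, cdf_young = 0.0, 0.0, 0.0
    for j in range(kmax + 1):
        strictly += old[j] * cdf_young   # cdf_young is P[young <= j - 1] here
        cdf_young += young[j]
        at_most += old[j] * cdf_young    # cdf_young is P[young <= j] here
    return strictly, at_most


# L(t) = (Table 3 entry) - 1 for manuscripts originating in AD 750 and AD 1500.
for label, old, young in (("Greek", 14.9, 20.1), ("Latin", 3.6, 9.0)):
    lt, le = younger_not_higher(old, young)
    print(label, round(lt, 2), round(le, 2))
```

On these assumptions it is the non-strict probabilities (younger manuscript of no higher generation) that come out close to the figures quoted in the text.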

A subset of the relevant, and superset of the extant, manuscripts comprises the point manuscripts. It is easily proved by induction that these cannot total more than twice the number of extant manuscripts, the maximum occurring only if there are no descripti and every lost point manuscript generates exactly two extant branches.

Finally, let us consider population size in former times, when the final population is a known number $n$. The population size is thus fixed at both the beginning and the end of the process. In terms of eqn. (3), the probability that the population equals $j$ at time $t$ and $k$ at time $\tau$ is the coefficient of $x^j y^k$ in $G[0, t, xG(t, \tau, y)]$. Hence the p.g.f. for the conditional probability required is

$$\frac{z\,[\{I(0,t)+V(0,t)\}I(t,\tau)+I(0,t)\{1-I(t,\tau)\}z]^{n-1}\,[I(0,\tau)+V(0,\tau)]^{n+1}}{[\{I(0,t)+V(0,t)\}\{I(t,\tau)+V(t,\tau)\}-I(0,t)\{I(t,\tau)+V(t,\tau)-1\}z]^{n+1}\,[I(0,\tau)]^{n-1}}.$$

If

$$P=\frac{I(0,t)[1-I(t,\tau)]}{I(0,\tau)},\qquad Q=\frac{I(0,t)\{I(t,\tau)+V(t,\tau)-1\}}{I(0,\tau)+V(0,\tau)},$$

then mean $= 1 + P(n - 1) + Q(n + 1)$, variance $= P(1 - P)(n - 1) + Q(1 + Q)(n + 1)$, while the probability that the population had dwindled to one is

$$\left[\frac{I(t,\tau)}{I(0,\tau)}\right]^{n-1}\left[\frac{I(0,\tau)+V(0,\tau)}{I(t,\tau)+V(t,\tau)}\right]^{n+1}\frac{1}{[I(0,t)+V(0,t)]^{2}}.$$

Table 4 shows the results at two points when classical literature flourished, in late antiquity (AD 200) and the Renaissance (AD 1500); at AD 600, when the Dark Ages had set in for Latin but not yet for Greek; and at AD 850, when the Dark Ages were over. The values $n = 1$ and $n = 60$ were considered, since 60 lies, as shown at the end of §3, near the 99th percentile of extant populations in both versions of the model.
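The stated mean and variance are what one would obtain if the conditional population were one plus a binomial count on $n - 1$ trials with success probability $P$ plus an independent negative binomial count of index $n + 1$ and mean $Q(n + 1)$. That reading is an inference made for this sketch, valid only when $0 \le P \le 1$ and $Q \ge 0$; it is not a statement from the paper. The Python fragment below (hypothetical function names, illustrative inputs) evaluates the three summary quantities from $P$, $Q$ and $n$ and checks them against a direct series expansion of the p.g.f.

```python
import math


def conditional_summary(P, Q, n):
    """Mean, variance and Pr(population = 1) from the formulae in the text."""
    mean = 1 + P * (n - 1) + Q * (n + 1)
    variance = P * (1 - P) * (n - 1) + Q * (1 + Q) * (n + 1)
    pr_one = (1 - P) ** (n - 1) / (1 + Q) ** (n + 1)
    return mean, variance, pr_one


def conditional_pmf(P, Q, n, jmax=2000):
    """Coefficients of z^j in z[(1-P)+Pz]^(n-1) [(1+Q)-Qz]^-(n+1).

    Assumes 0 <= P <= 1 and Q >= 0, so that both factors are ordinary
    probability generating functions (an assumption of this sketch).
    """
    binom = [math.comb(n - 1, i) * P ** i * (1 - P) ** (n - 1 - i)
             for i in range(n)]
    r = n + 1
    q = Q / (1 + Q)
    negbin = []
    term = (1 / (1 + Q)) ** r          # coefficient of z^0
    for k in range(jmax):
        negbin.append(term)
        term *= q * (k + r) / (k + 1)  # step to the coefficient of z^(k+1)
    pmf = [0.0] * (jmax + 1)
    for i, b in enumerate(binom):
        for k, nb in enumerate(negbin):
            j = 1 + i + k              # the leading factor z shifts by one
            if j <= jmax:
                pmf[j] += b * nb
    return pmf


# Illustrative values only; they are not taken from the paper.
mean, variance, pr_one = conditional_summary(P=0.3, Q=0.5, n=5)
pmf = conditional_pmf(P=0.3, Q=0.5, n=5, jmax=300)
series_mean = sum(j * p for j, p in enumerate(pmf))
print(mean, series_mean, pr_one, pmf[1])
```

Agreement between the closed-form values and the series coefficients is a consistency check on the formulae as transcribed here, not an independent derivation of $I$ and $V$.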

TABLE 3

Date of origin (AD)    Mean generation
                       Greek    Latin
500                    14.3      4.3
750                    15.9      4.6
1000                   17.3      6.5
1250                   19.0      8.2
1500                   21.1     10.0
Overall                20.5      9.2


TABLE 4

                        AD 200           AD 600           AD 850           AD 1500
                        Gk      Lat      Gk      Lat      Gk      Lat      Gk      Lat
Final population = 1
  Mean population       501     10       1536    3.8      4.2     2.4      4.1     3.5
  St. deviation         354     7.2      1086    2.6      2.9     1.5      2.8     2.4
  Pr(population = 1)    <.0001  .03      <.0001  .18      .15     .34      .15     .20
Final population = 60
  Mean population       1472    19       4555    9.4      15      9.7      134     125
  St. deviation         772     12       2379    5.2      7.2     4.4      16      13
  Pr(population = 1)    <.0001  .006     <.0001  .016     .002    .004     <.0001  <.0001

The figures for Latin in AD 200 and 600 are obviously too low, and those for Greek in AD 200 and 600 suspiciously high, despite the statement in the Letter of Aristeas (Thackeray, 1917, p. 23) §10 that the library in Alexandria ca. 300 BC contained 200,000 volumes. The parameters chosen here made the model over-active for Greek, as earlier results suggested, and under-active for Latin. Still, if the existing Greek and Latin applications of the model have opposite faults, features in which they agree merit more credence than the individual figures. One such feature is that the probability that a tradition now extant was reduced in the Dark Ages to a single manuscript (a supposition often advanced to explain the existence of an archetype) is small, even if only one manuscript survives today. Another is that if we consider one tradition that survives in a single manuscript and another that survives in sixty, then the probability that the now smaller tradition was more populous than the now larger is found (with a normal approximation) to exceed 10% until well into the Middle Ages (AD 800 for Greek, AD 1000 for Latin).

6. STRUCTURE WITHIN THE EXTANT TRADITION

Let us now consider subsets of the extant manuscripts: independent manuscripts and the "team" of manuscripts requiring citation. We first estimate the proportion of independent manuscripts among the extant manuscripts of given date. Consider one sole individual, with no ancestor, at time $t$. Let $K(t, z)$ be the p.g.f. of the number of independent manuscripts extant at time $\tau$, among the manuscript and its descendants. There is probability $\exp[M(t) - M(\tau)]$ that the initial manuscript will survive; it will then be the sole independent manuscript. If it dies at time $s$ ($s < \tau$), however, consider the direct copies of that manuscript. Their number is a Poisson variate, with p.g.f.

$$\exp\left[\int_t^s \lambda(r)(z - 1)\,dr\right];$$

and the p.g.f. of the number of independent manuscripts extant at time $\tau$, in the population arising from (and including) a direct copy born at time $r$, will be $K(r, z)$. Hence

$$K(t, z) = z\exp[M(t) - M(\tau)] + \int_t^{\tau}\mu(s)\exp\left[M(t) - M(s) + L(t) - L(s) + \int_t^s \lambda(r)K(r, z)\,dr\right]ds$$

with solution

$$K(t, z) = 1 - \frac{(1 - z)H(t, z)}{1 + (1 - z)\int_t^{\tau}\lambda(r)H(r, z)\,dr}$$


where

$$H(t, z) = \exp\left\{M(t) - M(\tau) - L(t) + L(\tau) - z\int_t^{\tau}\lambda(u)\exp[M(u) - M(\tau)]\,du\right\}.$$

The expected number of independent manuscripts in the whole population at time $\tau$ is

$$\frac{\partial K}{\partial z}(0, 1) = H(0, 1) = \exp\left[L(\tau) - M(\tau) - e^{-M(\tau)}\int_0^{\tau}\lambda(u)e^{M(u)}\,du\right].$$

Consider now the sub-group of manuscripts originating before a specific time $s$. They may be viewed as the product of a process where the birth-rate is $\lambda(t)$ until $t = s$ and zero thereafter, and the death-rate is $\mu(t)$ for all $t$. Within this sub-group, the expected number of independent manuscripts extant at $\tau$ is

$$\exp\left[L(s) - M(\tau) - e^{-M(\tau)}\int_0^{s}\lambda(u)e^{M(u)}\,du\right] = f(s), \text{ say.}$$

The expected number of independent manuscripts extant at $\tau$ and originating in the interval $[s, s + \delta s]$ is

$$\frac{df}{ds}\,\delta s = \lambda(s)\{1 - e^{M(s) - M(\tau)}\}f(s)\,\delta s.$$

But as the expected total number of manuscripts extant at $\tau$ and originating in that interval is $\lambda(s)\exp[L(s) - M(\tau)]\,\delta s$, the expected ratio (though, strictly, it is rather a ratio of expectations) of independent manuscripts among manuscripts originating in that interval is

$$[1 - e^{M(s) - M(\tau)}]\exp\left[-e^{-M(\tau)}\int_0^{s}\lambda(r)e^{M(r)}\,dr\right].$$

This is just below 1 at $s = 0$, decreasing, very slowly at first, to zero at $s = \tau$. For Greek it falls to 0.5 in AD 1460; manuscripts earlier than that date should be presumed independent, and later manuscripts dependent, failing contrary evidence. The corresponding dividing date for Latin is AD 1415. Where contamination (§2) occurred, however, the later manuscripts cannot be rejected so lightly; a manuscript whose main source is another extant manuscript may have picked up additional good readings from a lost secondary source.
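The shape of this ratio, and the notion of a dividing date, can be illustrated with a small Python sketch. It assumes constant birth- and death-rates, which the paper does not use for its Greek and Latin figures, so the numbers it yields are purely illustrative; the function names and the rate values are hypothetical. With constant rates, $M(t) = \mu t$ and the inner integral has a closed form.

```python
import math


def independent_ratio(s, tau, lam, mu):
    """Ratio of independent to all extant manuscripts originating at time s.

    Constant-rate case (an assumption of this sketch): M(t) = mu * t and
    int_0^s lam * e^{M(r)} dr = lam * (e^{mu s} - 1) / mu.
    """
    still_exposed = 1.0 - math.exp(-mu * (tau - s))
    integral = lam * (math.exp(mu * s) - 1.0) / mu
    return still_exposed * math.exp(-math.exp(-mu * tau) * integral)


def dividing_date(tau, lam, mu, tol=1e-9):
    """Bisect for the s at which the ratio falls to 0.5 (it decreases in s)."""
    lo, hi = 0.0, tau
    if independent_ratio(lo, tau, lam, mu) <= 0.5:
        raise ValueError("ratio never exceeds 0.5 for these rates")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if independent_ratio(mid, tau, lam, mu) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical rates per year over a 2000-year span.
tau, lam, mu = 2000.0, 0.02, 0.015
print(independent_ratio(0.0, tau, lam, mu))   # just below 1
print(independent_ratio(tau, tau, lam, mu))   # zero at the end of the process
print(dividing_date(tau, lam, mu))            # where the ratio crosses 0.5
```

Even in this simplified setting the ratio stays close to 1 for most of the span and collapses only near the end, which is the qualitative behaviour described above.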

The last subset is the smallest "team" of manuscripts to contain every original reading anywhere preserved. Should the archetype (or only one manuscript) survive, the "team" comprises one manuscript. Otherwise, any two manuscripts independently derived from the archetype would suffice if only we could assume that no two manuscripts ever erred independently at the same point and that contamination never occurred. In practice neither assumption is safe. Firstly, coincidence could have caused separate or identical errors in the two manuscripts in passages where the truth elsewhere survives. More systematically, contamination may have transferred into or between the two manuscripts errors that have not infected the whole tradition, or it may have introduced good readings from a lost source into some other extant manuscript in places where the archetype, followed by the two manuscripts, was corrupt. In a highly contaminated tradition, the "team" of extant manuscripts that one must consult in order to extract every original reading anywhere preserved may number 10 or more (Weitzman, 1985, p. 102).

7. THE ARCHETYPE

In almost every tradition, as mentioned in §1, common errors show that the latest common ancestor of the extant manuscripts is later than the original. So widespread is the phenomenon that Maas knew of no exception among major classical works (p. 11); yet there is no agreed explanation. Maas's hypothesis for Greek has already been mentioned, that once a text had been laboriously transliterated into minuscule script in the ninth century AD, that minuscule copy became the sole source of later copies. But many texts survive additionally in papyri that pre-date the ninth century and share errors with the medieval manuscripts; examples from Euripides and Aristophanes are cited by Pasquali (1952, pp. 190-201). Once these are brought into account, an archetype well before the ninth century must be posited. (One author, Bacchylides, survives in papyri alone, which share errors and for which Maas himself [1958, p. 36] posited an archetype as early as AD 100.) Again, when an oriental version of early date (say fifth century AD) shares errors with a particular group of extant manuscripts, it is natural to attribute these errors to a common ancestor at least as old as the version; the archetype, an ancestor common to the remaining manuscripts also, must then be older still (examples in Reynolds and Wilson, 1974, p. 53). Another argument against Maas's theory that only exceptionally were texts transliterated more than once was advanced by Grassi (1961): scribes can hardly have been sufficiently aware of one another's work, especially at great distances. To meet this objection, di Benedetto (1965, p. 161) posited an official minuscule copy, lodged at the academy of Constantinople between AD 863 and AD 1000 or thereabouts; but toilsome transliteration, a feature confined to Greek, cannot explain the phenomenon of the archetype, common to traditions in general.

What according to the model is the probability that the extant manuscripts go back to an archetype, i.e. a latest common ancestor distinct from the original? First, the original must perish. In any interval $[s, s + \delta s]$ during its life, the original had probability $\lambda(s)\,\delta s$ of generating a copy, which in turn had probability $S(s, \tau)$ of either surviving or leaving extant progeny at time $\tau$. In other words, the original had probability $\lambda(s)S(s, \tau)\,\delta s$ of generating a trace, i.e. a line still extant at the end of the process, in interval $[s, s + \delta s]$. If the original dies at time $t$, its total of traces follows the Poisson distribution, with mean

$$a(t) = \int_0^t \lambda(s)S(s, \tau)\,ds = \ln S(t, \tau) - \ln S(0, \tau) - M(t) + L(t).$$

There will be an archetype if the original dies having left exactly one trace. As the probability that the original die in $[t, t + \delta t]$ is $\exp\{-M(t)\}\mu(t)\,\delta t$, the probability that an archetype exist is

$$\int_0^{\tau} e^{-M(t)}\,\mu(t)\,a(t)\,e^{-a(t)}\,dt. \tag{7}$$

Now

$$\mu(t)\,e^{-M(t)}\,e^{-a(t)} = S(0, \tau)\,\frac{d}{dt}\left[e^{-L(t)}\left\{1 - \frac{1}{S(t, \tau)}\right\}\right].$$

Hence we may integrate (7) by parts, and, on division by $S(0, \tau)$, obtain the probability of an archetype, conditional upon the survival of the tradition, as

$$\int_0^{\tau}\lambda(t)\,e^{-L(t)}\,[1 - S(t, \tau)]\,dt \tag{8}$$

calculated at 0.76 for Greek and 0.79 for Latin.

Under the simplifying assumptions that $\lambda$ and $\mu$ are constant and $\tau \to \infty$, the probability (8) becomes $\mu/\lambda$, equalling $E$, the extinction probability. Hence a general argument. Surviving traditions are those whose originals generated at least one trace; but, given the high extinction probability, in most of these cases the original generated only one trace and so an archetype exists. The few traditions where Pasquali (1952, pp. 15 ff) found no archetype, namely Homer, Virgil's Aeneid and the Vulgate Octateuch, are precisely those where the a priori extinction probability was unusually low. An explanation on similar lines was proposed, though not rigorously justified, by Ageno (1975): no matter how often the original was copied, chance after a time made one particular family more numerous than the others, which eventually died out.
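The constant-rate limit of (8) can be checked numerically. The Python sketch below is illustrative only: it assumes constant rates, takes the integrand of (8) as $\lambda e^{-\lambda t}[1 - S(t, \tau)]$, and uses the standard closed form for the survival probability of a linear birth-and-death process with constant rates, which is brought in here rather than taken from the paper. The function names and rate values are hypothetical.

```python
import math


def survival(T, lam, mu):
    """P(a line started by one manuscript is still extant after time T).

    Standard result for constant rates lam != mu (an assumption here):
    S = r e^{rT} / (lam e^{rT} - mu), with r = lam - mu.
    """
    r = lam - mu
    g = math.exp(r * T)
    return r * g / (lam * g - mu)


def archetype_probability(tau, lam, mu, steps=20000):
    """Trapezoidal estimate of int_0^tau lam e^{-lam t} [1 - S(t, tau)] dt."""
    h = tau / steps

    def integrand(t):
        return lam * math.exp(-lam * t) * (1.0 - survival(tau - t, lam, mu))

    total = 0.5 * (integrand(0.0) + integrand(tau))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h


# Hypothetical rates per year; tau is long enough to approximate the limit.
lam, mu, tau = 0.02, 0.015, 2000.0
print(archetype_probability(tau, lam, mu), mu / lam)
```

For a long span the integral settles close to $\mu/\lambda$, so a high extinction probability translates directly into a high probability of an archetype, which is the general argument made above.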

In contrast to the transliteration theory, the archetype was not marked out at birth: it owes its position to the misfortune of its elder contemporaries. Only when they and their progeny had died out did it accede as archetype. In most of the artificial traditions simulated by the present writer (1982), the archetype acceded during its life, but, in two traditions, only after its death. If contamination was rife, the character of the tradition is much affected by the extent to which the ramifications existing at the end of the process had taken shape before the archetype acceded. In one experiment, the two main branches, and two sub-branches within each, were already formed. Copyists within each sub-branch could thus have consulted manuscripts not descended from the archetype and free from its errors, so that in different passages the truth may survive in any one or in any combination of the four sub-groups. Such phenomena are often observed, e.g. in Homer (Pasquali, 1952, pp. 211ff) and Aeschylus (Dawe, 1964, Ch. V). In other experiments, at the other extreme, the archetype acceded by becoming at one stage the sole survivor.

If the tradition survives and there is no archetype, the original must either survive or leave at least two traces, the probability being $S(0, \tau)[1 - (8)]$. It follows that the probability that a manuscript originating at time $t$ ($> 0$) should either survive or leave at least two traces, or in other words that it should be a point manuscript (§5), is

$$S(t, \tau)\left[1 - \int_t^{\tau}\lambda(r)\,e^{L(t) - L(r)}[1 - S(r, \tau)]\,dr\right] = P(t), \text{ say.}$$

The archetype is the oldest text that can be constructed from the extant witnesses; only by conjecture can we reach further back. Its date is therefore of interest. Let $n$ manuscripts be alive at time $t$. Then the archetype will arise in the interval $[t, t + \delta t]$ if a new manuscript is then "born" which becomes a point manuscript, and if none of the $n$ manuscripts leaves a trace by time $\tau$. But the probability that the population at time $t$ be $n$ is found from eqn. (3) as

$$\frac{V(0, t)[I(0, t)]^{n-1}}{[V(0, t) + I(0, t)]^{n+1}}.$$

Hence the probability that the archetype arise during $[t, t + \delta t]$ is

$$P(t)\sum_{n=1}^{\infty}\left\{\frac{V(0, t)[I(0, t)]^{n-1}}{[V(0, t) + I(0, t)]^{n+1}}\right\}n\lambda(t)[1 - S(t, \tau)]^{n}\,\delta t = \frac{\lambda(t)\,e^{M(t) - L(t)}\,P(t)\,[1 - S(t, \tau)]\,[S(0, \tau)]^{2}}{[S(t, \tau)]^{2}}\,\delta t. \tag{9}$$

The probability that the archetype arise before time $s$ is the integral of the above from 0 to $s$, which is

$$S(0, \tau)\int_0^{\tau}\lambda(t)e^{-L(t)}[1 - S(t, \tau)]\,dt - \frac{[S(0, \tau)]^{2}e^{M(s)}}{S(s, \tau)}\int_s^{\tau}\lambda(t)e^{-L(t)}[1 - S(t, \tau)]\,dt. \tag{10}$$

Given that an archetype exists, division of (10) by $[(8) \cdot S(0, \tau)]$ gives the cumulative distribution of its date. Results are summarised in Table 5, which shows a spread of dates well beyond the ninth century, in either direction.

The more recent the archetype, the likelier it is to be extant. Given that a manuscript of date $t$ is a point manuscript (which any archetype must be), the probability that it is extant is $\exp[M(t) - M(\tau)]/P(t)$, which rises from just over 0 to 1 over the period when $\lambda(t) > 0$. The greatest concentrations of dead archetypes, for both Greek and Latin, are found in antiquity and then in periods of revival after the Dark Ages: the ninth century for Latin, AD 850-1000 for Greek. To find the overall probability that there is a live archetype, we replace $P(t)$ in (9) by $\exp[M(t) - M(\tau)]$ and integrate from 0 to $\tau$, obtaining 0.13 for Greek and 0.14 for Latin (or, conditionally on there being an archetype, 0.17 and 0.18). These figures may seem high, but Reeve (1986, p. 60) counts eighteen live archetypes (= 12%) among the 150-odd Latin traditions in Reynolds et al. (1983), and some Greek examples are cited by Pasquali (1952, pp. 31 ff).

TABLE 5

DATE                              BC 250   0      AD 250   500    750    1000   1250   1500

Greek
Percentage of archetypes that are earlier:
  all                             28       35     37       40     56     76     89     98
  live                            0        0      0        2      12     27     50     91
  dead                            34       43     44       48     65     86     97     100
Prob. that archetype of
given date is extant              .00      .00    .07      .10    .10    .19    .48    .98

Latin
Percentage of archetypes that are earlier:
  all                             -        8      36       52     59     81     92     99
  live                            -        0      8        14     18     37     63     96
  dead                            -        10     43       61     69     91     98     100
Prob. that archetype of
given date is extant              -        .02    .07      .09    .09    .29    .65    1.00

8. THE BRANCHES OF THE STEMMA

The preponderance of two-branched stemmata in prefaces to critical editions was first pointed out by Bédier (1913, pp. xxv f), who considered it absurd and inferred some basic flaw in the traditional method of deriving stemmata from common errors. The problem evoked the mathematical models due to Whitehead and Pickford, Castellani and Kleinlogel, discussed in §2, as well as many other treatments, surveyed by Timpanaro (1985, pp. 123-153). All these models suggested that a preponderance of two-branched stemmata was not to be expected. Their inventors, however, rather than condemn the stemmatic method, effectively abandoned them. Thus Whitehead and Pickford (1951, pp. 89 ff) finally condemned a purely mathematical treatment as "mistaken"; Kleinlogel agreed that it was misguided ("das Pferd beim Schwanz aufzäumen"); Castellani (1957, p. 35) listed several historical factors ignored by his model and attributed to these the prevalence of two-branched stemmata. Timpanaro, however, accepts that the stemmatic method is at fault. He attributes the prevalence of two-branched stemmata largely to coincidence in error, to contamination and to conjecture, any of which can cause a stemma which is in fact many-branched to appear two-branched (see Fig. 5).

Bédier's doubts primarily concerned traditions with lost archetypes. A tradition whose archetype survives can even have just one branch (which may fan out lower down) or none (if only one manuscript survives), and, in any case, the constitution of the text is not affected by what lies below an extant archetype. Dead archetypes, however, must generate at least two surviving lines, and Bédier's complaint is that most generate only two.

To discover whether the present model implies a preponderance of two-branched stemmata, let us first consider the distribution of traces left by a manuscript originating at time $t$. If it dies at time $u$, the number of traces that it generates is a Poisson variate with mean

$$\int_t^u \lambda(r)S(r, \tau)\,dr = \phi(u) - \phi(t), \quad\text{where } \phi(t) = \ln S(t, \tau) + L(t) - M(t).$$

[Figure 5: three stemma diagrams, panels (i), (ii) and (iii).]

Fig. 5. Given a three-branched stemma, if B adopts some errors of A (i), or produces by coincidence some errors identical with A's, or if C corrects some of ω's errors through A (ii), or by conjecture, then AB will share errors against C, whence the illusion that they share an ancestor a* and that the stemma is two-branched (iii).

Using the known probabilities of death before, and of survival to, time $\tau$, we find the p.g.f. to be

$$1 + (z - 1)\,S(t, \tau)\,e^{L(t)}\int_t^{\tau}\lambda(r)\,e^{-L(r)}\,e^{\{\phi(r) - \phi(t)\}z}\,dr \tag{11}$$

whence $P_k(t)$, the probability of exactly $k$ traces, can be obtained. In particular,

$$P_1(t) = S(t, \tau)\,e^{L(t) - L(\tau)}\left(M(t) - L(t) - M(\tau) + L(\tau) - \ln S(t, \tau) + \int_t^{\tau}\lambda(r)\,e^{L(\tau) - L(r)}\{1 - S(r, \tau)\}\,dr\right).$$

The probability $P_2/(1 - P_0 - P_1)$ of generating exactly two traces, given that at least two are generated, at all times exceeds 0.625 for Greek and 0.6 for Latin. This implies that the majority of the branchings observed in (and not only at the top of) stemmata should be into just two lines, as is observed to be the case in classical traditions. The probability of generating three or more traces reaches its maximum (0.06 for Greek, 0.07 for Latin) for manuscripts born around AD 900 (Greek) or AD 800 (Latin); earlier manuscripts risked extinction in the Dark Ages, while later manuscripts had less time to generate traces before the invention of printing. Since only a minority of archetypes are as late as AD 800-900 (Table 5), the rare cases of branching into three or more lines should tend to occur in the lower positions within the stemma. This too agrees with observation (Alberti, 1979, pp. 94 f.).

Let us return to the question of the number of branches at the top of the stemma. Let $\sigma_k(t)$ be the probability that a stemma originating from one sole manuscript at time $t$ is observed to be $k$-branched at time $\tau$ ($k > 1$). Let the first event after time $t$ be a birth at time $s$. The stemma as observed at time $\tau$ will be $k$-branched if

either (i) the ancestor leaves no further trace and the descendant generates a $k$-branched stemma;

or (ii) the descendant's line dies out and the ancestor nevertheless generates a $k$-branched stemma;

or (iii) the descendant's line survives and the ancestor generates a further $(k - 1)$ traces.

Thus

$$\sigma_k(t) = \int_t^{\tau}\exp\{M(t) - M(s) + L(t) - L(s)\}\,\lambda(s)\,[2\sigma_k(s)\{1 - S(s, \tau)\} + S(s, \tau)P_{k-1}(s)]\,ds$$


with solution

$$\sigma_k(t) = [S(t, \tau)]^{2}\exp\{-M(t) + L(t)\}\int_t^{\tau}\frac{\lambda(s)\,P_{k-1}(s)\,e^{M(s) - L(s)}}{S(s, \tau)}\,ds.$$

For Bédier's problem, we require the probability that the stemma have $k$ branches, given that there are at least two branches (and thus that the tradition is not, for example, extinct). The probability of that condition is

$$\sum_{k=2}^{\infty}\sigma_k(0) = S(0, \tau) - S(0, \tau)^{2}\exp\{M(\tau) - L(\tau)\}\left\{1 + e^{-M(\tau)}\int_0^{\tau}\lambda(s)e^{M(s)}\,ds\right\}$$

since

$$\sum_{k=2}^{\infty}P_{k-1}(s) = 1 - P_0(s) = S(s, \tau)[1 - e^{L(s) - L(\tau)}].$$

The required conditional probability, i.e. $\sigma_2(0)$ divided by the above, was found to be 0.221/0.286 = 0.77 for Greek, and 0.188/0.264 = 0.71 for Latin. The preponderance of two-branched stemmata, at least in classical traditions, was thus entirely to be expected.

The underlying reason is highlighted if we again put $\lambda$ and $\mu$ constant ($\lambda > \mu$) and $\tau \to \infty$. Then (11) becomes $\mu/(\lambda - \lambda z + \mu z)$, and the probability of exactly two branches, given that there are at least two, is found to be $\mu/\lambda$, equal to the initial extinction probability. We are back to the same "principle of poverty" which accounted for the prevalence of archetypes. Here it implies that dead archetypes, which by definition produce at least two traces, usually produce only two. Already Greg (1930-1) attributed the prevalence of two-branched stemmata to this factor of heavy losses; but without a mathematical treatment, one could not answer Timpanaro's objection (1985, pp. 132 f) that a three-branched tree is almost as bare as a two-branched tree and ought therefore to occur almost as often. Not surprisingly, the few instances of many-branched stemmata for classical works tend to be observed (i) in traditions where the archetype survives (Reeve, 1986, p. 60) and thus had an unusually long opportunity of generating traces, and (ii) in traditions where all extant manuscripts are relatively recent, from the thirteenth century or later (Alberti, 1979, pp. 57, 94), so that the archetype is likely to be dated towards the later end of the distribution of Table 5 and therefore to have witnessed the revival of learning in the ninth and later centuries.
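The constant-rate case can be made concrete with a few lines of Python. The sketch assumes the limiting p.g.f. $\mu/(\lambda - \lambda z + \mu z)$, whose power-series expansion gives a geometric distribution for the number of traces; the function names and the rate values are hypothetical and chosen only for illustration.

```python
def trace_pmf(k, lam, mu):
    """P(exactly k traces), from expanding mu / (lam - lam*z + mu*z) in z.

    Writing E = mu / lam, the p.g.f. equals E / (1 - (1 - E) z), so the
    coefficient of z^k is E * (1 - E)**k.
    """
    e = mu / lam
    return e * (1.0 - e) ** k


def prob_k_given_at_least_two(k, lam, mu):
    """P(exactly k traces | at least two traces), for k >= 2."""
    p0 = trace_pmf(0, lam, mu)
    p1 = trace_pmf(1, lam, mu)
    return trace_pmf(k, lam, mu) / (1.0 - p0 - p1)


# Hypothetical rates giving an extinction probability E = 0.75.
lam, mu = 0.02, 0.015
for k in (2, 3, 4):
    print(k, prob_k_given_at_least_two(k, lam, mu))
```

Under these assumptions each extra branch costs a further factor $(1 - \mu/\lambda)$, so with a high extinction probability a three-branched top is far rarer than a two-branched one, not almost as frequent.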

9. CONCLUSION

Scholars have sometimes felt committed to implausible theories about textual transmission, simply because certain phenomena seemed inexplicable otherwise. Here, however, the phenomena are held to follow naturally from features common to most traditions: the chronological spread of extinctions from the ever present risk of manuscript "death", and the prevalence of archetypes and two-branched stemmata from the high extinction probability for the population arising from any manuscript, right down to the age of print. These results should hold in classical traditions even if the functions adopted for $\lambda(t)$ and $\mu(t)$ are disputed, and indeed in other traditions. Other results are more dependent on the exact course of the birth- and death-rates, and partake of the uncertainty surrounding that course; the dependence is probably least in results on which the Greek and Latin applications of the model agree. Despite the concentration on these two literatures, the mathematical work applies to traditions in general, some of which may prove open to less uncertainty.

A mathematical model, as Kleinlogel and others urge, is not the same as the intricate processes of history. It can, however, establish a reasoned presumption, in the place of sheer conjecture; the present model, for example, overturns Bédier's assertion that the majority of stemmata cannot be two-branched. Inevitably, in some respects the fit between the model and reality is poor. This is not a reason for dismissing the model, but rather poses a challenge to improve it; for without mathematical modelling, those aspects of transmission that are common to texts in general cannot be systematically investigated.

ACKNOWLEDGEMENTS

The help of Mr A. H. Griffiths, Mr R. I. Ireland, Prof. D. V. Lindley, Prof. M. D. Reeve, Mr N. G. Wilson and the referees, and the encouragement of Prof. A. P. Dawid and Prof. D. G. Kendall, are gratefully acknowledged.

REFERENCES

Ageno, Franca, B. (1975) Ci fu sempre un archetipo? Lettere Italiane, 27, 308-9.
Alberti, G. A. (1979) Problemi di Critica Testuale. Florence: La Nuova Italia.
Avalle, D. S. (1961) La Letteratura Medievale in Lingua d'Oc nella sua Tradizione Manoscritta. Turin: Einaudi.
Bardon, H. (1952-60) La Littérature Latine Inconnue. Paris: Klincksieck.
Bartlett, M. S. (1966) An Introduction to Stochastic Processes. Cambridge: CUP.
Bédier, J. (ed.) (1913) Le Lai de l'Ombre par Jean Renart. Paris: Firmin-Didot.
di Benedetto, V. (1965) La Tradizione Manoscritta Euripidea. Padua: Antenore.
Bévenot, M. (1961) The Tradition of Manuscripts. Oxford: OUP.
Castellani, A. (1957) Bédier avait-il raison? Fribourg Suisse: Editions Universitaires.
Dain, A. (1975) Les Manuscrits. Paris: Les Belles-Lettres.
Dawe, R. D. (1964) The Collation and Investigation of Manuscripts of Aeschylus. Cambridge: CUP.
Dearing, V. A. (1962) Methods of Textual Editing. Los Angeles: William Andrews Clark Memorial Library, University of California.
Dearing, V. A. (1974) Principles and Practice of Textual Analysis. Berkeley: University of California Press.
Fourquet, J. (1946) Le paradoxe de Bédier. Publications de la Faculté des Lettres de l'Université de Strasbourg, 105, Mélanges 1945 II (Études Littéraires). Paris.
Gerrard, R. J. (1983) Some Results on Convergence in the Theory of Stochastic Processes. Ph.D. Thesis, Cambridge.
Grassi, E. (1961) Perché la tradizione manoscritta di quasi tutti gli autori antichi risale ad un archetipo? Atene e Roma, 6, 150-1.
Greg, W. W. (1930-1) Recent theories of textual criticism. Modern Philology, 38, 401-4.
Haigh, J. (1971) The manuscript linkage problem. In Mathematics in the Archaeological and Historical Sciences (F. R. Hodson et al. eds) pp. 396-400. Edinburgh: Edinburgh University Press.
Kendall, D. G. (1948) On the generalized 'birth-and-death' process. Ann. Math. Statist., 19, 1-15.
Kleinlogel, A. (1968) Das Stemmaproblem. Philologus, 112, 63-82.
Laistner, M. L. W. (1957) Thought and Letters in Western Europe. London: Methuen.
Maas, P. (1952) Sorti della letteratura antica a Bizanzio. Appended to Pasquali (1952), pp. 487-492.
Maas, P. (1958) Textual Criticism (translated by Barbara Flower). Oxford: OUP.
Pasquali, G. (1952) Storia della Tradizione e Critica del Testo. Florence: Felice Le Monnier.
Reeve, M. D. (1986) Stemmatic Method: Qualcosa che non funziona? In The Role of the Book in Medieval Culture (P. Ganz, ed.) Turnhout: Brepols, pp. 57-69.
Reynolds, L. D. and Wilson, N. G. (1974) Scribes and Scholars. Oxford: OUP.
Reynolds, L. D. et al. (1983) Texts and Transmission. Oxford: OUP.
Richardson, N. J. (1974) The Homeric Hymn to Demeter. Oxford: OUP.
Souilhé, J. (1943) Épictète: Entretiens, vol. 1. Paris: Les Belles-Lettres.
Thackeray, H. St. J. (tr.) (1917) The Letter of Aristeas. London: SPCK.
Timpanaro, S. (1985) La Genesi del Metodo del Lachmann. Padua: Liviana.
Weitzman, M. P. (1982) Computer Simulation of the Development of Manuscript Traditions. ALLC (= Association for Literary and Linguistic Computing) Bulletin, 10, 55-59.
Weitzman, M. P. (1985) The analysis of open traditions. Studies in Bibliography, 38, 82-120.
West, M. L. (1973) Textual Criticism and Editorial Technique. Stuttgart: Teubner.
Whitehead, F. and Pickford, C. A. (1951) The two-branch stemma. Bull. Bibliographique de la Société Internationale Arthurienne, 3, 83-90.
von Wilamowitz-Moellendorff, U. (1907) Einleitung in die griechische Tragödie. Berlin: Weidmann.

DISCUSSION OF DR WEITZMAN'S PAPER

Dr R. J. Gerrard (City University): Dr Weitzman has presented us with a fine example of the application of probabilistic models to real situations. Whilst he does not make extravagant claims for the applicability of the model and does not hesitate to point out its deficiencies, he nevertheless pursues the mathematical analysis to a point where he is able to make some very interesting conclusions on problems which confront historians, in particular the existence of an archetype and the 'paradox' of the two-branched stemma.

The problem of the discrepancy between the observed average dates of surviving manuscripts and the a priori probability of extinction of the tradition could, I feel sure, be alleviated by the expedient of introducing a class of 'a priori masterpieces' which have been recognised for their intrinsic worth from the time of writing until the present. If the birth rate of such manuscripts were multiplied throughout by a constant factor a without necessarily altering the death rate, it should be possible to choose a, l and m in such a way that the model is in better agreement with the data.

On the subject of the ages of the manuscripts observed to be in existence at time $t$, Dr Weitzman's results are stated in terms of ratios of expectations. That these results hold asymptotically with probability one can be seen, however, from a martingale treatment of the subject. The quantity $H(t) = e^{M(t) - L(t)}N(t)$ is a non-negative martingale with limit $W$, where $W = 0$ if and only if extinction ultimately occurs, and, conditional on $W > 0$, $W$ has an exponential distribution with mean $I(0, \infty)$. Letting $A(t)$ be the sum of the ages of the manuscripts in circulation at time $t$, and writing $\phi(t) = \int_0^t e^{L(u)}\,du$,

one can show that $X(t) = e^{M(t)}(A(t) - e^{-L(t)}\phi(t)N(t))$ is a martingale and that under certain conditions $X(t)/\phi(t) \to W$ a.s., and $A(t)/N(t) \sim e^{-L(t)}\phi(t)$ almost surely on $\{W > 0\}$.

A similar treatment shows that, if $N(t, x)$ is the number of manuscripts at time $t$ whose age exceeds $x$, then $N(t, x)/N(t) - e^{L(t - x) - L(t)} \to 0$ almost surely on $\{W > 0\}$.

Although it may at first seem surprising that these asymptotic limits do not depend on the death rate, it can be seen that the reason for this is that the death probability for any one manuscript is independent of its age.

I have much pleasure in proposing the vote of thanks to Dr Weitzman for a most interesting and stimulating treatment of his subject.

Dr J. Haigh (Sussex University): It is a pleasure to congratulate Dr Weitzman on his resourcefulness and scholarship in bringing together such diverse material for statistical investigation, in the best traditions of read papers to this Society. Here is an attempt to construct a plausible stochastic model of the processes leading to the current distribution of the frequencies of copies of ancient manuscripts. If all the questions posed do not have answers, we have at least two ripostes: firstly, Dr Weitzman's own concluding remarks about the necessity to begin with some model; and secondly, as David Kendall said to me, "What is the point of a read paper that has so completely covered the subject that there is nothing left to say?"

The major novelty of this paper is to present Figs 2 and 3, in order that the data of Table 1 be explained via a linear birth and death process. Any statistical test of the model needs to acknowledge that the data for such a test are part of the evidence that suggested it. But, aside from this, the central critical feature is the assumption that the parameters $l$ and $m$ are constant across time, authors and works. Dr Weitzman explains eloquently why such an assumption, particularly about $l$, is unwarranted, although it is clearly reasonable to investigate the extent to which such a model can stand up. Unfortunately, as the last part of Section 3 describes, the model with constant $l$ and $m$ seems to fail. Consequently, since the numerical values in Tables 2, 3, 4 and 5 are derived from values of $l$ and $m$ that are inconsistent with the data in Table 1, they are of limited use.

However, and without necessarily expecting success, I do not believe that a model with constant $l$ and $m$ has been given its best chance. Expressions (1) and (6) are point estimates, based on average behaviour, in an application where variation is enormous. An alternative approach might be to seek to draw a likelihood surface for $(l, m)$, hoping to find values of high relative likelihood that fit the data better.

To do this, write the data as $n$ manuscripts alive now, of which $r$ were alive at time 0, and the others arose at known times $u_{r+1} < u_{r+2} < \dots < u_n$. An unknown number, $N$, have existed at some time, dying at times $t_1 < t_2 < \dots < t_N$ $(< \tau)$; $R$ of these were alive at time 0, the others arose at $s_{R+1} < \dots < s_N$. (We are not supposing that $t_i$ and $s_i$ refer to the same manuscript.) Given Fig. 2 (or 3), the likelihood $L(l, m, R, N, s, t)$ can, in principle, be computed; vary $R$, $N$, $s$ and $t$ to maximise $L(\cdot)$, and thereby plot a "maximum likelihood surface" for $(l, m)$. This may not be so formidable a task as might appear, as Figs 2 and 3 suggest good guesses for $R$, $N$, $s$ and $t$, and we can discretize $(0, \tau)$ into, say, 50 year intervals. If a pair $(l, m)$ can give a reasonable fit to the data, then analogues of Tables 2-5 can be calculated, and estimates of the missing data $R$, $N$, $s$ and $t$ given. Dr Gerrard's suggestion of a third parameter, $a$, could similarly be incorporated.

To estimate the ratio of the sizes of the arbre réel and the extant population, denote the size of the population at time $t$ by $X(t)$. Thus we seek $E\bigl((1 + \int_0^\tau \lambda(t)X(t)\,dt)/X(\tau)\bigr) = E(1/X(\tau)) + \int_0^\tau \lambda(t)\,E(X(t)/X(\tau))\,dt$. Since $E(X(t)/X(\tau)) \ge 1/E(X(\tau)/X(t))$, and $E(X(\tau)/X(t)) = \exp(L(\tau) - L(t) - M(\tau) + M(t))$, we note that the expression given in Section 5 is, theoretically, a lower bound for the expectation of the required ratio, emphasising the conclusion that far more manuscripts have perished than have survived.
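The inequality used in this argument is Jensen's, and the steps can be written out as below. Here $\lambda(t)$ and $\mu(t)$ denote the birth and death rates, $L$ and $M$ their integrals, and $\tau$ the present; the expectations are taken, as in the text, over populations that survive to $\tau$. The final line, which evaluates the bound for constant rates $\lambda \ne \mu$, is an illustrative special case added here and is not the time-varying form used in the paper.

```latex
% Jensen's inequality for the convex map y -> 1/y, applied to Y = X(\tau)/X(t) > 0:
\[
E\!\left[\frac{X(t)}{X(\tau)}\right] \;\ge\; \frac{1}{E\!\left[X(\tau)/X(t)\right]}
  = \exp\bigl(-[L(\tau)-L(t)] + [M(\tau)-M(t)]\bigr).
\]
% Substituting into the required ratio and dropping the non-negative term E[1/X(\tau)]:
\[
E\!\left[\frac{1+\int_0^\tau \lambda(t)X(t)\,dt}{X(\tau)}\right]
  \;\ge\; \int_0^\tau \lambda(t)\,
  \exp\bigl(-[L(\tau)-L(t)] + [M(\tau)-M(t)]\bigr)\,dt .
\]
% Illustrative special case (an assumption): constant rates \lambda \neq \mu.
\[
\int_0^\tau \lambda\, e^{-(\lambda-\mu)(\tau-t)}\,dt
  = \frac{\lambda}{\lambda-\mu}\,\bigl(1-e^{-(\lambda-\mu)\tau}\bigr).
\]
```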

Later in Section 5, attention is drawn to the probability that a younger manuscript may be fewer generations removed from the original than an older copy. Recent work in population genetics, such as calculating the probability that the ith most populous allele in the current population is the oldest (Watterson and Guess, 1977) may be of relevance.

The virtuosity demonstrated in Sections 6, 7 and 8 needs only one, but major, note of caution: if the mathematical model is a reasonable representation of reality, then light is shed on the problems investigated. My instinct tells me that the explanations for the existence of an archetype, and for the preponderance of two-branched stemmata, are appealing. But, without some means of justifying the underlying model more convincingly, these "explanations" similarly lack conviction.

My reservations about the applicability of these results have been aired. But we must be grateful to Dr Weitzman for his audacity in investigating the most plausible, simple, stochastic model of this complex process, and drawing our attention to the ways in which a successful model, along these lines, can aid our understanding of the data. I am pleased to second this vote of thanks.

The vote of thanks was passed by acclamation.

Dr Peter Donnelly (University College London): Let me begin by adding to those of the earlier discussants my thanks and congratulations to the author for a very interesting paper. I would like, if I may, to draw attention to close connections between the problems discussed this evening and recent work on genealogies in population genetics, and in particular to suggest an alternative class of models, not, I hasten to add, because of any lack of faith in Dr. Weitzman's analysis, but because it is both informative and reassuring that they lead to the same broad conclusions.

The perspective is slightly different. Consider a particular literary work and all its descendants (copies) as a population which "reproduces" through time. Instead of specifying birth and death rates, specify in advance (or rather, retrospectively on the basis of historical information) a deterministic or stochastic sequence of population sizes at given points in time, and (possibly conditional on this) a random mechanism for reproduction. Many of the questions addressed in the paper can now be expressed in terms of the resulting genealogy: the family tree, traced backwards in time, of the surviving manuscripts.

One approach is to specify a particular model (reproductive mechanism and collection of population sizes) and then to ask specific questions. Considerable progress is possible for certain models, see for example Donnelly (1986). Alternatively one could ask about general features of the resulting genealogy, in the hope that these would not depend too closely on precise details of the model. The striking result, that this is in fact the case, was first proved in Kingman (1982a,b). Loosely speaking, for any of a large class of time-invariant, exchangeable, "neutral" reproductive models, set in discrete or continuous time, with overlapping or non-overlapping generations, the (time rescaled) stochastic process describing the genealogy of a sample from the population, converges as the population sizes become large, to a particular, tractable, process called a coalescent. Furthermore the timescaling depends only on the population sizes and very general features of the reproductive mechanism.

What of the limiting genealogy? It is (almost surely) two-branched everywhere, so in particular for models for which the convergence obtains we would expect two-branched stemmata. Secondly, the probability of an archetype could easily be calculated. Although the theorem would apply to most plausible "reproductive" models for manuscripts, I shrink from actually performing the calculation, firstly because it would involve a commitment, which I feel totally unqualified to make, to a series of population sizes and some details of a reproductive mechanism (although only the harmonic mean of the population sizes and the variance of "offspring" numbers are important here) and secondly because it would involve a belief that the population sizes were large enough for the asymptotics to apply, a question in need of rather more mathematical study.
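To indicate what such a calculation might look like, the following Python sketch simulates the coalescent for a sample of surviving manuscripts. While $k$ lineages remain, the time to the next merger is exponential with rate $k(k-1)/2$, and each merger joins exactly two lineages, which is why the limiting genealogy is two-branched. The probability that all lineages merge within a given horizon is offered only as a rough proxy for the probability of an archetype. The horizon is expressed in rescaled coalescent units, and its value is hypothetical: converting it to calendar years would need exactly the population sizes and offspring variance that are left unspecified above.

```python
import math
import random


def tmrca(n_sample, rng):
    """Time, in coalescent units, until n_sample lineages share one ancestor."""
    t = 0.0
    k = n_sample
    while k > 1:
        # each event merges exactly two of the k lineages (a binary split)
        t += rng.expovariate(k * (k - 1) / 2.0)
        k -= 1
    return t


def prob_common_ancestor_within(n_sample, horizon, n_rep=20000, seed=0):
    """Monte Carlo estimate of P(all lineages have merged within horizon)."""
    rng = random.Random(seed)
    hits = sum(tmrca(n_sample, rng) <= horizon for _ in range(n_rep))
    return hits / n_rep


def exact_prob_two_lineages(horizon):
    """Exact value for a sample of two: one exponential wait of rate 1."""
    return 1.0 - math.exp(-horizon)


if __name__ == "__main__":
    # hypothetical sample size and horizon, chosen only for illustration
    print(prob_common_ancestor_within(n_sample=10, horizon=2.0))
    print(prob_common_ancestor_within(2, 1.0), exact_prob_two_lineages(1.0))
```

The mean of `tmrca` for a sample of $n$ is $2(1 - 1/n)$ coalescent units, so it depends only weakly on the sample size once $n$ is moderately large.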

A focus on genealogy has proved an immensely powerful tool in population genetics (Tavaré, 1984). In this context it shares the considerable advantage of robustness, and furthermore by effectively conditioning on "observed" population sizes it may mirror some aspects of reality more closely than the birth-death model (recall §3 and §5). Also, population "catastrophes", for example that due to the sacking of Constantinople, may be incorporated directly in the model.


Dr A. C. Davison (Imperial College, London): My comment relates to the possibility of maximum likelihood estimation of parameters for models of the evolution of manuscript traditions, as suggested by Dr Haigh. To find an analytic expression for the marginal likelihood of the observed data might be hard for a realistic model, but it would not be hard to apply the implicit statistical estimation method of Diggle and Gratton (1984), perhaps in the simpler form suggested by Thompson et al. (1986).
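A minimal sketch of simulation-based likelihood estimation, in the spirit of this suggestion, is given below in Python. It is not the estimator of Diggle and Gratton (1984), who smooth simulated output with kernel methods; here the probability of each observed tradition size is simply estimated by its smoothed frequency among simulated surviving traditions. The sketch assumes constant birth and death rates from a single original, and the observed sizes, grid values, time span and function names are all hypothetical.

```python
import math
import random
from collections import Counter


def final_size(birth, death, t_end, rng, cap=200):
    """Size at t_end of one tradition started from a single manuscript.

    Sizes are truncated at cap, which then stands for 'cap or more'.
    """
    n, t = 1, 0.0
    while 0 < n < cap:
        t += rng.expovariate(n * (birth + death))
        if t >= t_end:
            break
        n += 1 if rng.random() < birth / (birth + death) else -1
    return n


def extinction_prob(birth, death, t_end):
    """Exact P(size at t_end is 0) for constant rates with birth != death."""
    g = math.exp((birth - death) * t_end)
    return death * (g - 1.0) / (birth * g - death)


def simulated_loglik(observed_sizes, birth, death, t_end,
                     n_sim=2000, seed=0, cap=200):
    """Log-likelihood of extant tradition sizes, estimated by simulation."""
    rng = random.Random(seed)          # same seed at every grid point
    sims = [final_size(birth, death, t_end, rng, cap) for _ in range(n_sim)]
    survivors = [s for s in sims if s > 0]   # condition on survival
    counts = Counter(survivors)
    denom = len(survivors) + 0.5 * cap       # add-half smoothing over 1..cap
    return sum(math.log((counts[s] + 0.5) / denom) for s in observed_sizes)


def loglik_surface(observed_sizes, births, deaths, t_end, **kwargs):
    """Estimated log-likelihood at every (birth, death) pair on a grid."""
    return {(b, d): simulated_loglik(observed_sizes, b, d, t_end, **kwargs)
            for b in births for d in deaths}


if __name__ == "__main__":
    toy_sizes = [1, 1, 2, 1, 3, 1, 5, 2]     # hypothetical data, not Table 1
    surface = loglik_surface(toy_sizes, [0.6, 0.8, 1.0], [0.4, 0.6, 0.8], 3.0)
    best = max(surface, key=surface.get)
    print("highest estimated log-likelihood at (birth, death) =", best)
```

Reusing one seed across the grid is a design choice that keeps simulation noise from roughening the surface; `extinction_prob` is included only as an exact check on the simulator.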

The Author replied later, in writing, as follows.

May I first of all thank the discussants for the time and care that went into their contributions. It is particularly gratifying that two results which I could derive only approximately, as ratios of expectations, have been confirmed through rigorous analysis by Dr Gerrard on the average age of the mss, and by Dr Haigh on the ratio of the arbre réel to the extant population.

Dr Haigh notes how far the model, with the values of $l$ and $m$ that I eventually adopted, fails to satisfy simultaneously the three equations for a priori extinction probability, average size of extant traditions, and average date of extant manuscripts. He therefore doubts the applicability of my results, on the ground that they depend on those values of $l$ and $m$. As far as the precise figures are concerned, I have always shared his reservations. For some of my conclusions, however, the mathematical analysis leads on to more general arguments, which suggest that those conclusions should hold, whatever the values of the birth- and death-rates over time, given one condition, namely a high a priori extinction probability for the population arising from any manuscript at any time until the age of print. This is a plausible assumption, at least for classical literature. The conclusions I have in mind are the high random variability of population size, the high ratio which the arbre réel bears to the relevant (and therefore to the extant) manuscripts, and the high probabilities of an archetype and of a two-branched stemma, given survival. And even where the numerical results depend crucially on my values for $l$ and $m$, the algebraic formulae still seemed worth reporting, in the hope that the parameters could one day be assigned values that better fitted reality in other literatures, and even in the classical literatures.

Dr Gerrard proposes to improve the fit by identifying masterpieces (an invidious but not impossible task) and multiplying their birth-rate by a constant $a$ ($> 1$). This would imply that the average date of extant manuscripts was later for masterpieces than for other works. It would be interesting to investigate this empirically. Within each of the two resulting classes, he would still be trying to satisfy three equations with two unknowns, but the reduction of heterogeneity might bring him greater success than I attained in §3.

Dr Haigh would estimate $l$ and $m$ by maximising the likelihood of the observed number of manuscripts, with their observed dates. These data could be drawn from one or more extant traditions. Before plotting a likelihood surface for $l$ and $m$, he would vary a large and indeed unknown number of further parameters: the number of lost manuscripts ($N$), their dates of death, the number of lost manuscripts that were alive at $t = 0$ ($R$), and the birth dates of the remainder. In fact, though, $R$ is fixed, being the number of lost originals, and therefore (as all originals are lost) the number of traditions included in the data. For $N$, a minimum level would have to be set, high enough to convey the scale of the death toll, and therefore probably running into hundreds. It might therefore be better not to bring the birth and death dates of lost manuscripts explicitly into the calculation. Thanks to Diggle and Gratton, to whom Dr Davison made timely reference, the maximisation should then be tractable despite the intricate algebra. Before embarking on such extensive computation, however, one should remember how tentative the underlying functions $\lambda^*(t)$ and $\mu^*(t)$ inevitably are.

Dr Donnelly suggests a promising alternative model, which could dispense with Figs 2-3 but instead requires estimates of population size down the ages. Here the historians of scholarship could offer only a very broad range of possible sizes at each period. Yet even these estimates are definite enough to enable us, for example, to impugn the past population figures for Latin in Table 4, and so they could well give a more precise picture than my own model. It is reassuring that this approach too predicts a preponderance of two-branched stemmata. I am indeed grateful to Dr Donnelly, and also to Dr Haigh, for pointing out such parallels with population genetics, aptly enough in a year that began with the appearance of Rebecca Cann's noted article which classified the human species into a two-branched stemma!

Let me finally thank the Royal Statistical Society for creating this opportunity for interdisciplinary debate. What my discussants lacked in numbers, they amply made up in acumen and depth.

REFERENCES IN THE DISCUSSION

Cann, R. L., Stoneking, M. and Wilson, A. C. (1987) Mitochondrial DNA and human evolution. Nature, 325, No. 6099 (1-7 Jan 1987), 31-36.

Diggle, P. J. and Gratton, R. J. (1984) Monte Carlo methods of inference for implicit statistical models (with Discussion). J. R. Statist. Soc. B, 46, 193-227.

Donnelly, P. (1986) A genealogical approach to variable-population-size models in population genetics. J. Appl. Prob., 23, 283-296.

Kingman, J. F. C. (1982a) On the genealogy of large populations. J. Appl. Prob., 19A, 27-43.

Kingman, J. F. C. (1982b) Exchangeability and the evolution of large populations. In Exchangeability in Probability and Statistics (G. Koch and F. Spizzichino, eds), pp. 97-112. Amsterdam: North-Holland.

Tavaré, S. (1984) Line of descent and genealogical processes and their application in population genetic models. Theor. Pop. Biol., 26, 119-164.

Thompson, J. R., Atkinson, E. N. and Brown, B. (1986) SIMEST: An algorithm for simulation based estimation of parameters characterising a stochastic process. Technical Report 86-20, Dept of Mathematical Sciences, Rice University.

Watterson, G. A. and Guess, H. A. (1977) Is the most frequent allele the oldest? Theor. Pop. Biol., 11, 141-160.
