
Explanatory Instability. Author: Robert W. Batterman. Source: Noûs, Vol. 26, No. 3 (Sep., 1992), pp. 325-348. Published by: Blackwell Publishing. Stable URL: http://www.jstor.org/stable/2215957


Explanatory Instability*

ROBERT W. BATTERMAN

Ohio State University

0. I am going to describe a very simple game which involves throwing a die. The outcome of the game, however, is a priori rather surprising: it cries out for explanation. My purpose in presenting this example is to illustrate a ubiquitous pattern of explanatory reasoning in physics. Although this initial example is unphysical, I consider it because it makes the structure of the explanatory argument transparent. I will offer other physical cases later on. It seems to me that this type of explanation is sui generis. It does not fit well into any of the various models of statistical explanation that are currently in vogue. My discussion will focus primarily on Peter Railton's Deductive-Nomological-Probabilistic (D-N-P) model, although the conclusion remains valid for other proposed models as well. For lack of a better name, one might, perhaps somewhat paradoxically, call this type of explanation "Statistical-Deterministic".

To play the "chaos game" you need to mark off three vertices of a triangle on a piece of paper. Label them 1, 2, and 3. Choose one point as your starting place, and start tossing a 3-sided die. (If you don't happen to have one of these lying around, you can label the points (1,2), (3,4), (5,6) and use a regular 6-sided die instead.) Suppose that your starting point is 1 and the first roll lands on the side marked 3. You must put a point midway between your starting point 1 and the point marked 3. On the next roll, put a point halfway between this new point and the point assigned to the number rolled, and so on. Apparently random place- ments of the points yields, after sufficiently many iterations, the fractal pattern (called a Sierpinski gasket) in figure 1. Fractals are characterized by a dimension. Just as the plane has dimension two, it can be shown that patterns like the Sierpinski gasket have noninteger or fractional dimension. Questions of the sort I want to ask are the following: What explains the emergence of this fractal pattern? Why do the patterns resulting from different plays of the game all have the same fractal dimension?



Figure 1. Sierpinski Triangle.

1. Before turning to these questions, let me make a few general remarks about statistical explanation, probability, and determinism.

The problem of finding an adequate model for statistical explanation in physical theories is closely tied to the problem of interpreting the probabilistic assertions of those theories. This problem, in turn, relates to the question of whether determinism or indeterminism prevails in those theories. Just how intimate these connections are is in part the subject of this paper.

One might argue that the relationships between determinism and explanation and between determinism and interpretations of probability are not that close. For instance, it has seemed to some that the proper interpretation of probability should be indifferent or invariant between the possibilities of determinism and indeterminism. And, one might also hold that a model of statistical explanation should be appropriate for "deterministic" (but possibly under-described), as well as for "genuinely indeterministic," phenomena.

I think that these views are mistaken. We arrive at the correct interpretation of physical probability by examining our theories about the world.1 Different theories may require different interpretations. The classic example is the difference between the statistical generalizations or "laws" of classical statistical mechanics (SM), a "deterministic" theory, and those of quantum mechanics (QM), a "genuinely indeterministic" theory. Many have held that QM requires a propensity interpretation while SM employs some kind of frequency interpretation.


This split in interpretations of probability is reflected in the two models of statistical explanation we will be considering. Railton's D-N-P model adopts a fairly radical propensity interpretation in which probabilities are objective single case propensities.2 On the other hand, Hempel's Inductive-Statistical (I-S) model can make do with some kind of frequency/epistemic interpretation. I will argue below that Railton's model of probabilistic explanation is incapable of explaining various fundamental facts in SM, such as the existence of a state of equilibrium. The I-S model appears, initially, as if it might fare better in this regard. However, I will also argue that in the case of explanation in SM, it too misses what is genuinely explanatory. An important conclusion of this discussion will be that probabilities can enter into physical theories in a way that is fundamentally different from both the classical (ignorance based) view and the quantum (irreducible-nonepistemic) conception. They arise through the dynamics characterizing the evolution of a system. I will have something (although, unfortunately, not too much) to say about how this is possible when considering the examples below. I will try to argue that while Railton's D-N-P model is, in some respects, an improvement over the I-S model, it is too restrictive. In employing a strict propensity interpretation of probability, it rules out as non-explanatory certain features which, in fact, are essential for the correct explanation of certain phenomena.

In the next section I will present the D-N-P model of explanation and contrast it briefly with Hempel's I-S model. Following this, in section 3, I will argue that the D-N-P model is incapable of explaining the emergence of the pattern described above. I will also reach the same conclusion for two analogous, yet physical, examples of the same explanatory phenomenon. Finally, in section 4, I take up the question of explanation in statistical mechanics and ergodic theory. What concerns me here is, first, how deterministic systems can possess statistical properties, and second, how these properties can play essential roles in explaining the behavior of such systems. It seems that in certain cases, being told that a system can exhibit behavior that is "as random as" that of a roulette wheel can be genuinely explanatory. However, as I have already noted, neither the D-N-P nor the I-S model of explanation is itself equipped to handle such cases.

2. In two articles, Railton (1978) and (1981), Peter Railton has presented us with an account of probabilistic or statistical explanation that differs fundamentally from the "received" Hempelian view that statistical explanations are inductive in form. As I have already mentioned, Railton calls his account a "Deductive-Nomological-Probabilistic" model. This title is apparently meant not only to express a rejection of models involving inductive arguments, but also to establish its allegiance with the Hempelian Deductive-Nomological (D-N) model of explanation, which is appropriate, according to the received view, for explaining deterministic phenomena.

Railton argues that events such as an "irreducibly probabilistic" α-decay of a Uranium-238 nucleus can be explained by deductive subsumption under genuinely


probabilistic laws. Well, this is almost but not quite right. Since the event to be explained (the particular α-decay) is a matter of chance, we cannot give a strictly deductive argument to the explanandum sentence describing the particular α-decay: The event might not occur, even if the premises are true.

Rather, these explanations subsume a fact in the sense of giving a D-N account of the chance mechanism responsible for it, and showing that our theory implies the existence of some physical possibility, however small, that the mechanism will produce the explanandum in the circumstances given. [Railton (1978), p. 209.]

Before we discuss the D-N-P model in more detail, we ought to see why we should be dissatisfied with Hempel's I-S account of statistical explanation. What are the problems with this view which the D-N-P model seeks to address? I will not go into this in detail since Railton gives a good discussion of several well known difficulties with the I-S view. Let me just briefly mention the two most important problems. The first is the requirement that for an I-S argument to be explanatory, it must confer high probability or near certainty on the explanandum. (When asked why Johnny has the measles, we do not feel that our request for explanation is fulfilled upon being told: "Johnny was playing in the sandbox with some friends, none of whom had the measles, and there is a .000032 chance of being infected in those circumstances.") But, as Railton argues, in the case of a highly improbable α-decay (in the absence of any external radiation), the correct response to the why question is just to be told the probability of the event, in this case an extremely small number. (The half-life of U-238 is on the order of 10^9 years.)

The second problem with the I-S model is what Hempel called the "ambiguity" of I-S explanations. Since the explanatory arguments are inductive in form, it will generally be possible to find conflicting "explanatory" I-S arguments; one showing high probability that the explanandum event occurs, the other showing high probability for its failure to occur. Essentially, this problem arises out of the possibility that relative to one reference class, a given event may be (highly) probable, whereas relative to some more restrictive class, the probability of the event can be quite low. Hempel's response to this problem was to require maximal specificity in constructing I-S explanations: in essence, to take into account all relevant available information. Statistical explanation is now relativized to our present epistemic state. Of course, problems remain even after this move has been made. How do we characterize the "relevant" information? Do we really want to accept explanations of physical phenomena which are relativized to our epistemic state?

The D-N-P model aims to meet both of these objections. It allows for genuine probabilistic explanation of unlikely phenomena, and it avoids the empiricist epistemic relativization. However, it does so at some cost: D-N-P explanations are possible only for theories which postulate the existence of genuine


irreducibly probabilistic processes. (Of course, Railton takes this to be a virtue of the view as we will see.)

In restricting the domain of probabilistic explanation to such theories, Railton is able to block the problem of epistemic relativity. The paradigm of such theories is, of course, QM. On the "accepted" interpretation of QM the Ψ-function gives the most complete state description for a system. Yet, given that state description, we can at best determine only a probability distribution over the system's future states. This means that once we are in possession of the Ψ-function for a system, there is no further (before-the-measurement) information which would allow us to exhaustively partition an ensemble of similar systems (i.e. those with the same Ψ-function) into mutually exclusive classes yielding different probabilities. The problem of ambiguity or of alternative reference classes does not arise. It is plausible that the interpretation of probability which best suits this situation is an objective single case propensity theory of the kind mentioned above. (A frequency or ensemble interpretation will always allow for the (conceptual) possibility of partitioning the initial ensemble.) Therefore, we see that the interpretation of probability associated with the model of explanation does much of the work in resolving the second of the two problems which afflict the I-S model.

The other problem, that the I-S model demands high probability of the explanandum given the explanans, is resolved by giving up the inductive character of the explanations. A deductively valid demonstration that a given phenomenon (α-decay) has a particular probability (high or low) to occur is all that can possibly be demanded. If the event is objectively improbable, then it is unlikely, although not impossible, for it to occur. And, we should not expect our theory of explanation to show that its occurrence was likely. Furthermore, if the event is governed by genuine irreducibly probabilistic laws, then we should not expect to be able to give a valid demonstration of the explanandum. The conclusion of the deductive argument in a D-N-P explanation is not the explanandum; rather it is a statement expressing the objective probability for the explanandum to occur.

Actually, there are two valid deductive arguments involved in a D-N-P explanation. The first is a derivation from the theory of a probabilistic law of the form:

(∀x)(∀t)(Fx,t → Pr(Gx,t + ε) = r);  ε ≥ 0

The second is the derivation, using this law together with the relevant initial conditions, of the probability of the event to be explained. It is important to take note of the form of probabilistic laws on Railton's view. They are universal generalizations asserting of objects that if they meet certain conditions at a given time, then they have a particular propensity to display some further property. These laws are of a completely different form from the statistical "laws" of the form Pr(F | G) = r, which we find in Hempel's I-S model.


D-N-P explanations have the following form:

(1) A theoretical derivation of a probabilistic law L of the following form:

(2) L: (∀x)(∀t)(Fx,t → Pr(Gx,t + ε) = r)

(3) Fa,t0

(4) Pr(Ga,t0 + ε) = r    [and (5) G(a,t0 + ε)]

(5) is the explanandum, a statement to the effect that the chance event did indeed occur. Clearly, (5) in no way follows deductively from (1)-(4). Railton calls (5) a "parenthetic addendum", and allows that its inclusion in the schema is not indispensable [Railton (1981), p. 236]. However, its inclusion will become necessary if we want to string together D-N-P explanations so as to provide a complete explanatory account of a given chance phenomenon.

This is important, because Railton departs from the Hempelian ideal that explanations are "purely arguments." He holds that deductive arguments play essential roles in the explanations of both deterministic and probabilistic phenomena, but that they are not always themselves sufficient. What is sufficient for a genuine explanation, statistical or deterministic, is being able, in principle, to provide arbitrary parts of what he calls an ideal explanatory text. This is a key concept, one which we must briefly discuss. What I will argue in the following two sections is that Railton's ideal explanatory texts are unable to explain patterns like that in the triangle game. Later on I will also argue that for basic explanations in SM, such as why systems generally evolve from states of nonequilibrium to states of equilibrium, the ideal explanatory texts completely miss what is, in fact, explanatory.

These ideal texts are accounts which include the relevant D-N or D-N-P schema. The D-N or D-N-P schemas provide the "skeletal form" of the ideal explanatory texts, but they include much more as well. An ideal D-N explanatory text will, according to Railton, look something like

an inter-connected series of law-based accounts of all the nodes and links in the causal network culminating in the explanandum, complete with a fully detailed description of the causal mechanisms involved and theoretical derivations of all the covering laws involved.... It would be the whole story concerning why the explanandum occurred, relative to a correct theory of the lawful dependencies of the world [Railton (1981), p. 247].

An ideal D-N-P explanatory text will be similar, only there the mechanisms and causality would be probabilistic in nature. These texts are supposed to represent an ideal such that if we possessed them, we would have complete scientific understanding of the explanandum phenomenon.

The notion of an ideal explanatory text is proposed in an attempt to answer a particular objection to law based explanations, whether D-N or D-N-P. The


objection is simply that there are many "accepted" explanations which do not even come close to instancing the required D-N or D-N-P schemas. As an example, Railton cites the "explanation" of why a particular muon decayed: "Because, it was unstable" [Railton (1981), p. 239]. This may indeed be taken as explanatory in some contexts, and as less explanatory in others. However, this does not mean that explanations, statistical or otherwise, are essentially pragmatic. Railton sees the proffered "explanation" as providing explanatory information. For instance, being told that the muon is unstable "points to a [probabilistic] dispositional property of the muon responsible for the decay, distinguishes this decay from disintegration due to external forces..." [Railton (1981), p. 241]. Thus, the answer tells us that the phenomenon is probabilistic, and hence, that the ideal text is D-N-P. We are also directed towards the theory from which to derive the fundamental probabilistic law or laws which will play the essential role in the relevant D-N-P schema, the backbone of the ideal D-N-P text.

Very roughly, the idea is that if a statement S would allow us to remove some degree of uncertainty from the explanatory text, then S contains explanatory information about why the phenomenon occurred. Explanations, according to Railton, should "elucidate the mechanisms at work," whether they are deterministic or probabilistic. The conclusion is that many of the seemingly non-law based explanations, such as the example of muon decay, are explanatory because they provide information relevant for filling out ideal explanatory texts.

Now let us see what this law based account of explanation has to say about the example from the introductory section.

3. The process of rolling a die is not usually considered to be irreducibly probabilistic. One frequently hears the claim that if we only knew the exact initial and boundary conditions together with detailed knowledge of the throwing mechanism, then we would be able to predict with certainty the outcome of any one throw of a given die. In other words, there are unknown (sometimes called hidden) parameters which would enable us to exhaustively partition the class of throws into those which result in some particular number, say two, and those that do not. On the single case propensity interpretation of probability, we would have to say that strictly speaking, the sentence "the probability that this die lands on two on the next throw is 1/6" is false. (Of course, this does not mean that a statement to the effect that the relative frequency of twos in a given sequence is 1/6 is false. It is just that relative frequency is not probability.) The only probabilities that this completely deterministic process can have are the trivial ones: 0 or 1.

These considerations are sufficient to show that the D-N-P model is inappropriate for explaining the triangle patterns we observe. The ideal explanatory texts for these patterns must be of nonprobabilistic form; and so, the D-N model is appropriate. As Railton notes, these texts may very well be infinite, if we require the complete causal network. But, for the cases at hand we need only worry about certain parts of the complete text. We need to consider the D-N text which


is relevant for explaining the emergence of the observed pattern in the example described above.

What D-N text meets this challenge? The dots that appear on the page are each individually correlated with a single throw of the die. Hence, the ideal D-N text for why a given dot appears in a given spot will contain the D-N explanation of why the die lands on the number that it does, together with the "starting point" for that toss (the position of the dot which resulted from the previous trial) and the rules of the game. In other words, we can give ideal D-N texts for the occurrence of each dot at a particular point on the page by deductively subsuming that event under the laws of classical mechanics which govern the behavior of the die. What we have then is a large number of ideal D-N texts: one for each roll and dot. It is clear that any one of these ideal D-N texts is itself incapable of explaining the emergence of the pattern. But if we somehow take them together, can we then achieve the desired explanation?

In order to answer this we need to know what it means to "take them together." Each ideal D-N text contains an explanation of why that given throw yielded that particular outcome. Is there a way of combining these individual texts into some grand ideal D-N text which explains the pattern? In conjoining the individual texts we do seem to get an explanation or account of why we have the particular sequence of numbers (e.g. 1,3,2,2,3,1,2,1,1,1,2,3,...) that we do, in fact, have. Is this account sufficient to explain why we get the pattern?

It is important to distinguish between two levels at which this question may be addressed. There are two different explananda. We could be asking (i) why does the pattern arise from this particular sequence, or (ii) why is it that, in general, sequences of dice rolls normally produce patterns like these (e.g., with such and such fractal dimension)? In essence, this is a token/type distinction. In my opinion it is the second question (ii) that is the most important. We want an explanation of why it is that a remarkably vast set of sequences all yield the same kind of pattern. It is, therefore, appropriate to ask what model of explanation can address questions of type (ii).

Now, it might be argued that the D-N/D-N-P model provides a response to the first question. The conjunction of all the individual ideal D-N texts may possibly explain why we have the particular token of the triangle pattern. But, I do not see how it can explain the generic presence of this type of pattern. It makes sense to ask quite generally why sequences of throws of a die yield patterns of this type. It is especially important to ask this question since not all such sequences will result in triangle patterns. (Consider, for instance, a sequence in which the die lands only on the number 2, or some other "nonrandom" sequence such as 1,2,3,1,2,3,...)

Unless each sequence has an objective single case propensity to yield a triangle pattern, in a way completely analogous to the α-particle's propensity to decay, the D-N/D-N-P theorist cannot respond. But, before we are willing to attribute objective chance to sequences, we must require (for the analogy to be


complete) a theory from which we can derive the relevant single case propensities. I cannot imagine what such a theory would be like.

I have been arguing that the complete ideal explanatory texts for the outcomes of the game do not explain what we want them to. It is, therefore, fair to ask what does do the explanatory work. My response is that the key explanatory work is done in part by noting that the sequences of throws are sequences of (stochastically or probabilistically) independent events. The technical term for this is a Bernoulli sequence. This simply means that the probability of a given outcome on a given trial is not affected by, nor does it affect, the probability of an outcome on any other trial in the sequence.3 For such sequences, one can apply the laws of large numbers. For example, the Strong Law of Large Numbers (SLLN) for a sequence of coin tosses says:

Pr( lim_{n→∞} Hn/n = 1/2 ) = 1

That is, the probability that the relative frequency of heads, Hn/n, equals the probability of heads on a single toss in the limit as the number of tosses goes to infinity, is one. It is essential for the SLLN that the tosses be independent events.
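A quick numerical illustration may be helpful here; the trial counts in this Python sketch are arbitrary choices. The relative frequency of heads drifts toward 1/2 as the number of independent tosses grows:

```python
import random

# Relative frequency H_n/n of heads in n independent fair-coin tosses,
# printed at a few (arbitrarily chosen) values of n.
random.seed(0)
heads = 0
for n in range(1, 1_000_001):
    heads += random.random() < 0.5      # one independent toss
    if n in (10, 1_000, 100_000, 1_000_000):
        print(f"n = {n:>9}: H_n/n = {heads / n:.4f}")
```

Of course, no finite run demonstrates the theorem; the SLLN concerns the measure of the set of infinite sequences on which Hn/n fails to converge to 1/2, which is zero.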

Using the SLLN, it is possible to provide an apparently I-S-like explanation for the generic appearance of the triangle pattern by demonstrating that such patterns are highly probable in an ensemble of sequences of the kind described.4 In fact, one argues that the probability of realizing an instance of the pattern is one. The notion of "probability one" in use here comes from measure theory. It is definitely not a synonym for "with certainty," as the existence of "nonrandom" sequences of dice throws makes clear. But if the term "probability" can appear at all in a D-N explanation, it must be used in an expression synonymous with either certainty (probability one) or impossibility (probability zero). These latter uses are not measure-theoretic. I have noted above that the D-N/D-N-P model can appeal to probabilities only if the phenomenon being explained is irreducibly probabilistic. As this is not true of the phenomenon being studied here, the D-N/D-N-P model fails to provide an explanation for the existence of the generic triangle pattern.

As I see it, the heart of the explanation of why we generally get the triangle pattern (or why such patterns all have the same fractal dimension) is the appeal to a measure-theoretic limit theorem like the SLLN. The fact that the probabilities of the tosses are independent allows us to employ the SLLN to conclude that the pattern is to be expected with high (unit) probability. We have a genuine statistical explanation of a phenomenon in which not only do irreducible probabilities play no role, but for which it is not possible to provide a nonstatistical D-N explanation.

At this point the following objection naturally arises. The case we have been considering is just a mathematical example. The D-N/D-N-P theorist is, after all,


concerned with modelling explanation in the natural sciences. Perhaps my claim that this pattern of explanation is sui generis in science rests upon a mistaken generalization from a mathematical as opposed to a physical example.

However, there are many physical examples which illustrate the same pattern of explanation seen here. Many physical substances "crystallize" by forming long complicated dendrites or fingers. Much recent work on understanding these structures has appealed to fractal considerations similar to those in the example discussed above. Consider, for example, the formation of snowflakes, of lightning, or of copper dendrites by precipitation out of a solution by electrolysis. Each of these yields results (patterns) having a fractal structure. I will briefly describe the last case.

The bottom of a dish is covered with a solution of copper sulfate. The copper and sulphate ions in the solution follow Brownian paths as a result of random molecular bombardment. When a small voltage is applied across an electrode in the center of the solution, the copper ions begin to be deposited as copper at the electrode. Once the current is on, the solution is in an unstable state of relatively high potential energy. As copper ions get deposited, the system falls into a state of lower potential energy, representing a more stable equilibrium state. Various factors force the copper ions to deposit randomly at available sites along dendrites that have begun to form. When this experiment is run for a certain period of time, the copper grows outward from the electrode in a branching fractal structure as illustrated in figure 2.

Figure 2. Fractal dendrites.
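The deposition process just described is standardly idealized as diffusion-limited aggregation (DLA); the following on-lattice sketch is a schematic version of that idealization, not a model of the actual electrolysis experiment. The particle count, launch radius, and kill radius are illustrative assumptions, and the drift imposed by the applied field is ignored.

```python
import math
import random

# Schematic on-lattice diffusion-limited aggregation (DLA).  Walkers are
# launched on a circle outside the cluster, take unbiased random-walk
# steps, and stick on first contact with the cluster grown from a seed.
random.seed(1)
stuck = {(0, 0)}          # seed site, standing in for the electrode
max_r = 0                 # current cluster radius
STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def deposit_one():
    global max_r
    while True:           # relaunch until a walker sticks
        theta = random.uniform(0.0, 2.0 * math.pi)
        r = max_r + 5
        x, y = round(r * math.cos(theta)), round(r * math.sin(theta))
        while x * x + y * y <= (max_r + 20) ** 2:
            if any((x + dx, y + dy) in stuck for dx, dy in STEPS):
                stuck.add((x, y))                    # the walker is captured
                max_r = max(max_r, round(math.hypot(x, y)))
                return
            dx, dy = random.choice(STEPS)            # one Brownian step
            x, y = x + dx, y + dy

for _ in range(1000):     # deposit 1000 "ions" (illustrative count)
    deposit_one()
# Plotting the sites in `stuck` shows a branching cluster; such clusters
# are reported to have fractal dimension near 1.7 in two dimensions.
```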

It turns out that the copper pattern one gets invariably has a fractal dimension of about 1.70. The question is: Why? Clearly this is a request for an explanation of a pattern, a question of type (ii).

This fact can be explained through considerations quite analogous to those seen with the Sierpinski gasket. The ingredients of this "Statistical-Deterministic" explanation are:

(a) An assessment of the probabilities that copper ions get deposited at certain locations. (This is analogous to the Bernoulli nature of the sequences in the example above.)


(b) A proof that under iteration, a process with these probabilities gives rise to fractals of dimension 1.70. (This is analogous to the use of the SLLN in the earlier example.)

This is a reasonable reconstruction of the type of explanatory argument that actually gets made.

The SLLN seems to be doing the lion's share of the work in this explanation. Nevertheless, Railton might focus on the probabilities mentioned in (a). What is the status of these probabilities? Are they irreducible or not? Clearly, they are not. There are no "genuinely probabilistic" laws involved here. But if they are not irreducible, then why not replace them with a complete account (ideal D-N text) of what occurs in each case? This would, for example, include a complete characterization of the Brownian path taken by each ion, including a description of all collisions it has undergone prior to being captured as copper on a dendrite. This way one can avoid the SLLN by calculating the fractal dimension of each pattern using one of the numerous equivalent algorithms for determining dimension.
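Box counting is one such algorithm. A minimal sketch, assuming the points have been scaled into the unit square and with an arbitrary choice of box sizes:

```python
import math

def box_counting_dimension(points, sizes=(1/4, 1/8, 1/16, 1/32, 1/64)):
    """Estimate the box-counting dimension of (x, y) points in the unit
    square: count occupied boxes N(eps) at each scale and fit the slope
    of log N(eps) against log(1/eps)."""
    data = []
    for eps in sizes:
        boxes = {(int(x / eps), int(y / eps)) for x, y in points}
        data.append((math.log(1 / eps), math.log(len(boxes))))
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    return (sum((x - mx) * (y - my) for x, y in data)
            / sum((x - mx) ** 2 for x, _ in data))   # least-squares slope

# Applied to points from the chaos-game sketch above, this returns roughly
# log 3 / log 2 ≈ 1.585; applied to a rescaled DLA cluster, roughly 1.7.
```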

I have two objections to this line of argument. First, there is the same worry I expressed in the earlier example. What you get looks like a collection of type (i) explanations-why each particular run of the experiment yielded a fractal of dimension 1.70. But, we still do not have an answer to the question of why, in general, patterns of this sort will be produced. Secondly, and more importantly, this account simply misses the main point.

The probabilities mentioned in (a) essentially derive from instabilities inherent in the nature of the system. To trace the phenomenon deterministically as Railton demands, one must take into account factors that have virtually nothing to do with the system (= the copper sulfate solution with the applied voltage). These factors, for example, would include complete descriptions of the Brownian path of each copper ion. That is, a complete ideal text would have to include an account of every collision a given ion has suffered prior to being captured on a dendrite. Explanations are supposed to further our scientific understanding. Yet, these factors (the complete Brownian paths) play a completely insignificant role in our understanding of why the process occurs, or how robust it is, i.e. how generically it occurs.

Let's look more closely at how probabilities may arise from deterministic systems. They derive from internal symmetries and "dynamical" instabilities. This will help us see that the "Statistical-Deterministic" pattern of explanation is sui generis. By way of illustration, consider another example, one which this time has nothing directly to do with fractal patterns: the Euler strut. This is a fairly stiff ribbon of steel mounted vertically and rigidly on the floor as in figure 3. Suppose that we begin gradually to apply weight symmetrically to the top of the strut. Once the load reaches a critical value, the Euler critical point, the strut will buckle. That is, it will come to rest in a new equilibrium state-having buckled either to the left or to the right. Our concern is with these two possible


final equilibrium states in which the system can be found. What is our expectation as to the behavior of such strut plus weight systems in general?

It is fairly obvious that we expect such struts to buckle to the left about half the time and to the right about half the time. Why? Why don't we expect it to buckle to the left say three fourths of the time? This is the "pattern" that needs to be explained. It is the analog of the patterns we come to expect when playing the triangle game or when growing copper dendrites in electrolysis experiments. Once again the explanation is statistical, yet the phenomenon described is by no means irreducibly probabilistic.

Figure 3. The Euler Strut.

The explanation for this generic behavior of Euler struts begins by pointing out that once the Euler critical point is reached, the system is in a state of unstable equilibrium. It exhibits a form of dynamical instability. This instability, coupled with the symmetry of the situation, is certainly responsible for our expectation that the probability of buckling to the left is one half. Thus, there may be room for a "principle of indifference" in characterizing the deterministic probabilities inherent in the Euler strut. Here the principle would be dynamically motivated.

However, while instability and symmetry are major factors in the explanation they are not by themselves sufficient. According to the equations governing the behavior of the Euler strut, it could remain in the unstable equilibrium state at the critical point indefinitely. This behavior, of course, is never observed because even the slightest nudge sends the strut eventually into one of its new stable, buckled equilibrium states. Therefore, influences from the outside also play an important role. Unless these influences are in a sense independent, random, or spatially uncorrelated, our expectations as to the generic behavior of such struts will not be met. (I believe, however, that this is probably the least important factor in the explanation. The instability and symmetries are typically so robust


that the requirement of independence or "randomness" can be considerably weakened. Strictly speaking, the claim of genuine independence or randomness of the influences is probably false anyway. This is analogous to the fact, mentioned in note 4, that the requirements of independence can be considerably weakened.) As an example of such an external influence we can take the infamous butterfly that flaps its wings on the other side of the earth, causing a chain of molecular collisions which culminates in some particular air molecule hitting the strut, causing it to buckle to the one side.

Once again, I claim that the instability, symmetry and randomness inherent in the situation conspire to realize the conditions under which some measure-theoretic limit theorem such as the SLLN applies. And so we can conclude with probability one that the Euler strut buckles to the left half the time and to the right half the time. This follows the form of a "Statistical-Deterministic" explanation as in (a) and (b) above.
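A caricature of the strut may make the claim vivid. Assume, purely for illustration, the symmetric pitchfork normal form dx/dt = x − x³ for the post-critical dynamics (this equation and all the numbers below are stand-ins, not Batterman's own model). The unstable state x = 0 sits between two stable states x = −1 ("left") and x = +1 ("right"); a tiny symmetric random nudge decides the outcome, and over many trials the split comes out close to 50/50:

```python
import random

# Monte Carlo over tiny symmetric random nudges of an unstable equilibrium.
random.seed(2)

def buckle(x0, dt=0.02, steps=2000):
    x = x0
    for _ in range(steps):
        x += dt * (x - x ** 3)   # forward-Euler step of dx/dt = x - x^3
    return "left" if x < 0 else "right"

trials = 2000
lefts = sum(buckle(random.gauss(0.0, 1e-6)) == "left" for _ in range(trials))
print(f"buckled left in {lefts / trials:.3f} of {trials} trials")
```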

Now, however, the D-N/D-N-P theorist will, once again, cry foul for essentially the same reasons just discussed in the fractal case. It is, I believe, worth rehearsing the argument again in a bit more detail for the case of the Euler strut. The D-N/D-N-P theorist will point out that the butterfly effect will be part of the ideal D-N explanatory text for why the strut buckled to the side that it did. Once we have filled in this explanatory text for the particular deterministic event, there is nothing more to be said, nothing more to explain. Are we back to square one?

First of all, let me reiterate that this looks like a type (i) explanation, and as such (without using a limit theorem), I do not see how to unify the individual texts for an explanation of the ubiquity of the pattern. But secondly, internal to Railton's view, there is a profound tension between providing ideal explanatory texts and satisfying his prescription that explanations "should elucidate the mechanisms at work". I think that this tension may in part be responsible for the D-N/D-N-P theorist's failure to recognize that the phenomena being considered require statistical explanation.

The ideal explanatory text for why the strut buckled to the left includes a complete account of the causal chain leading (at least) from the butterfly to the air molecule that hits the strut and is "responsible" for its buckling to that side. According to the ideal text, the immediate cause responsible for the strut's buckling is its being hit by that particular air molecule. The causal mechanism responsible for this would be that of molecular collision, or the forces responsible for such collisions. A description of this mechanism would also be part of the text: Recall Railton's statement that the text for the outcome of a causal process consists in "an inter-connected series of law-based accounts of all the nodes and links in the causal network culminating in the explanandum, complete with a fully detailed description of the causal mechanism involved..." [Railton (1981), p. 247].

But this text leaves little or no room for an analysis of the instability and symmetry present in the strut. The demand that explanations require a complete


account of the causal network leading up to the (type (i)) explanandum, in a certain sense obscures the real mechanism responsible for the system's behavior; namely, the instability present in its "motion". On Railton's ideal text account, the causal mechanism responsible for the behavior of the strut would probably be taken to be the intermolecular forces responsible for the collisions, as well as the forces responsible for movement in the butterfly's wings.5 Instability is not a causal mechanism of this kind. Perhaps it is possible to think of it as causal, but I believe to do so would require a deep reworking of standard conceptions of (efficient) causation. Present views about causation do not easily subsume phrases like "the strut's being in an unstable state of equilibrium caused it to buckle to the left." I think that considerations of dynamical instability might very well lead to new and, perhaps, better theories about causation, although to pursue this line of inquiry clearly would be beyond the scope of this paper.

I have been arguing in this section that Railton's D-N/D-N-P model is unsuccessful in answering the second why question (ii) above. On the other hand, the I-S model might more easily be able to appeal to some form of the laws of large numbers, and can incorporate the crucial formalization of the various trials as sequences of independent and identically distributed outcomes. One might, therefore, hope that an I-S explanation would be available in the present case. However, as I will show in the next section, when it comes to providing explanations in statistical mechanics, both views seem wanting.

4. There is a growing recognition in the scientific community at large that classically deterministic systems can exhibit random and chaotic behavior. Demonstrating the possibility that classical dynamical systems can possess well-defined randomness properties of varying degrees is the task of modern ergodic theory. There are now several deep and powerful theorems which demonstrate the existence of a kind of equivalence between the statistical behavior of a deterministic system and a random stochastic process, for example, between the observable behavior of a hard sphere "gas" in a box and the outcomes of a roulette wheel. The province of ergodic theory is, in the words of one of its foremost practitioners, Ya. Sinai, "the study of the statistical properties of the groups of motions of non-random objects" [Sinai (1977), p. 3]. What I want to argue is that these "statistical properties of the motions of non-random objects" are explanatory, and, in fact, play an essential role in the correct explanations of various phenomena in SM. However, neither the I-S nor the D-N/D-N-P theorists seem to have noticed this crucial explanatory feature. Perhaps it would be better to say that the interpretations of probability associated with these models did not allow them to notice it.

Interestingly, there is agreement between the inductive and deductive theorists about explanation in SM. In particular, both agree that the ultimate explanation of, for example, the existence of a state of thermodynamic equilibrium and of the evolution of a system towards such a state, is not to be found in any underlying


lawlike regularity. Instead, the ultimate explanation rests on the de facto distribution of initial conditions in the world. One sees repeatedly in the literature talk of "factlike" irreversibility and "lawlike" reversibility. Very roughly, the idea is that since the dynamical laws governing the evolution of the particles of a gas are symmetric in time, the reason why we (almost) never observe behavior that is contrary to the time asymmetric second law of thermodynamics is simply that the actual distribution of initial conditions for systems heavily favors (makes overwhelmingly probable) those which, when acted upon by the time symmetric dynamical laws, evolve as we would expect: in accordance with the second law.

Consider the following quote from Railton concerning the ultimate explanation of macroscopic thermodynamic behavior. He holds that the statistical regularities of classical SM are not explanatory. Instead, "the prevalence of equilibrium and other features of macroscopic behavior must, on the classical theory [SM], ultimately be attributed to brute fact and to the operation of deterministic laws on brute fact" [Railton (1981), p. 252]. On his view, the statistical generalizations of SM "function in explanation not as ersatz laws, but as summaries of information about initial and boundary conditions" [Railton (1981), p. 251].

The following quote from Sklar, who, at least in "Statistical Explanation and Ergodic Theory", appears to hold something like an I-S model of explanation, echoes the statement of Railton's.6 In this paper Sklar questions the explanatory role of ergodic theory, in particular with regard to equilibrium SM. Ergodic theory is supposed to explain why a certain probability distribution (the microcanonical distribution) over possible microstates enables us to compute the correct equilibrium values for thermodynamic quantities by taking averages of certain functions with respect to that distribution. This is the so-called Gibbs method. Presumably, it is able to show this by demonstrating that infinite time averages are equal to phase averages taken with respect to that special distribution. Sklar argues that the ergodic "explanation" does not work. He says that there is a simpler explanation which is correct:

And it is the full answer. And it is totally independent of any ergodic results. It goes like this: How a gas behaves over time depends upon (1) its microscopic constitution; (2) the laws governing the interaction of its micro-constituents; (3) the constraints placed upon it; (4) the initial conditions characterizing the microstate of the gas at a given time.... It is the matter of fact distribution of such initial conditions among samples of a gas in the world which is responsible for many of the most important macroscopic features of gases; the existence of equilibrium states, the "inevitable" approach to equilibrium... [Sklar (1973), p. 210].7

Calculations of (Gibbs) phase averages using the microcanonical distribution work, period. He continues by saying that "this is a matter of fact, not of law. These 'facts' explain the success of the Gibbs method. In a clear sense they are the only legitimate explanation of its success" [Sklar (1973), p. 210].


Consider the evolution of an energetically isolated gas which is initially at equilibrium confined to the left half of a box. The Gibbs method tells us to represent the state of the system by a uniform probability distribution over all possible microstates compatible with the fixed value for energy and the physical constraints. We can represent microstates as points in a many dimensional phase space. If, as we are assuming, the system is at constant energy, all of the physically possible states for this particular system will be found on a surface of constant energy in that space. The results of our measurements on the system for quantities such as volume, temperature, and pressure further restrict the possible microstates of the system.

When the partition is removed the gas is no longer at equilibrium, and it will evolve to a new equilibrium state uniformly spread throughout the entire box. If, contrary to fact, we had observed the gas initially in this state, the Gibbs method would have told us to represent this equilibrium state by a uniform probability distribution over all possible microstates compatible with the enlarged constraints (e.g. greater volume). But, in actual fact, we know that not all microstates compatible with the measured values of this final equilibrium state are possible. We know that only those microstates which resulted from the evolution of microstates from the initial ensemble are possible states for the system at this later time. We want to know how the initial ensemble (representing the gas confined to half the box) evolves into the new equilibrium distribution representing the system at equilibrium throughout the entire box.

The answer is that it cannot evolve in this way. A theorem of mechanics, Liouville's theorem, tells us that the "volume" of the initial distribution remains invariant under the dynamical evolution. But, this does not mean that the initial ensemble cannot in some sense approach the uniform distribution over all the available phase space. For example, the initial ensemble might evolve in such a way as to approximate, in a coarse-grained sense, the uniform distribution. (See Figure 4.) An analogy due to Gibbs is helpful here. Consider a liquid 90% of which is water and 10% an insoluble black ink. If we stir the mixture we will eventually see the liquid as having a uniform grey color. But, were we to look more closely at the solution, using a microscope for example, we would observe regions of clear water and regions of black ink. Looking at the solution with the unaided eye corresponds to a kind of coarse-graining. We reject information that could be had by a closer examination.

We can guarantee a coarse-grained approach to equilibrium if we make an assumption about the continual rerandomization of the initial ensemble as time goes on. Such an assumption can be made explicit in the following way. Consider a cell of the partition; call it ti. This is just one of the small boxes in Figure 4. Now consider a uniform distribution of phase points in this cell. Watch these phase points evolve for a period Δt. Now look and see what proportion of points initially in ti end up in some other cell, tj, in this time interval. Call this fraction the transition probability pij from i to j. Determine the transition probabilities for the other cells of the partition in the same way. If we assume


Figure 4. Coarse-graining of the Energy Surface, and the Evolution of the Initial Ensemble.

that for all Δt the same proportion pij of points in ti at the beginning of the interval end up in tj, then we can show that the function describing the behavior of that part of the ensemble in a given cell obeys a time asymmetric kinetic equation with a unique stationary solution corresponding to the coarse-grained equilibrium distribution. Such an assumption is called a Markov postulate, and in essence it amounts to the claim that where the ensemble was located in the past is irrelevant for determining its future behavior.
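A toy version of this construction can be computed directly, with the doubling map x → 2x (mod 1) on [0, 1) standing in for the Hamiltonian flow and a partition into four equal cells (both are illustrative assumptions, not the paper's own example):

```python
import random

random.seed(3)
K = 4                        # number of cells in the partition of [0, 1)

def step(x):
    return (2.0 * x) % 1.0   # one application of the deterministic dynamics

def transition_row(i, n=100_000):
    """Estimate p_ij for all j, starting from points uniform in cell i."""
    counts = [0] * K
    for _ in range(n):
        x = (i + random.random()) / K     # uniform phase point in cell i
        counts[int(step(x) * K)] += 1     # cell reached after one step
    return [c / n for c in counts]

for i in range(K):
    print(i, [round(p, 3) for p in transition_row(i)])
```

Because this map carries a uniform distribution on a cell onto a uniform distribution over the cells it reaches, the same proportions recur at every later time step; for this toy system the rerandomization postulate holds exactly.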

The big question, of course, is whether or not this Markovian assumption is consistent with the dynamics. After all, each point in the ensemble follows a completely deterministic trajectory. Its past state is clearly relevant for its future state. Modern ergodic theory has given us hope that in some cases we can, in fact, demonstrate the consistency of the rerandomizing postulate with the underlying dynamics. It is easy to see qualitatively how this would go.

We want the initial ensemble to rapidly "spread itself out" over the available phase space in such a way that points initially close (in some single cell ti) may find themselves quite far apart as time goes on. This property of the ensemble is called trajectory instability or a sensitive dependence on initial conditions. Modern ergodic theory gives us a hierarchy of statistical properties that correlate with degrees of instability. This hierarchy begins with the concept of ergodicity and includes, in order of increasing instability, mixing, K-, and Bernoulli systems. (For our purposes here, we need not worry about the precise definitions of these terms.) Demonstrating the possibility that "nonrandom", deterministic systems may possess such properties is just what Sinai takes the task of ergodic theory to be.

The proofs show how extreme trajectory instability enables a system to exhibit macroscopic behavior which, in the most extreme cases, can be demonstrated to be isomorphic to the behavior of a roulette wheel, or a Markov process.8 We have a dynamical justification not only for the consistency of the Markovian postulate with the underlying dynamics, but for its necessity as well.

At this point let me qualify slightly my criticisms of the "received views" about explanation in SM. None of what I have said is meant to deny that the de facto distribution of initial conditions plays some role in SM explanations of the


approach to equilibrium. We still need to determine a probability distribution over the microstates in the initial ensemble. It may be that the matter of fact distribution of microstates plays an important role in the determination of this distribution.9 What I am arguing here is that there is a further law-like feature that has been overlooked.

So, it should be clear how my argument will go. I claim that an essential feature of the statistical mechanical explanation of why there is a state of equilibrium and why systems (almost) always evolve towards that state according to the second law, will be a demonstration that such systems generally possess statistical properties such as being K- or Bernoulli. These demonstrations follow directly from an analysis of the dynamical instability present in the equations describing the systems' completely deterministic motions. There are some important differences between this determination of probabilities for the SM cases and their corresponding determination for the cases discussed in section 3. For instance, in the earlier examples there was a need to appeal to random influences from outside the system. But the so-called Hamiltonian systems (such as a gas in a box) are treated as completely isolated from the external world. It is indeed remarkable that in such cases, the dynamics alone can yield probabilities. Another difference is that symmetries and the principle of indifference play much less of a role than they do in the earlier examples. This leads me to believe that there will be no general, universally applicable prescription for determining what I have called "dynamical probabilities" in deterministic systems, making it difficult to provide a well-characterized "interpretation" of probability in these cases. A detailed analysis of instabilities and symmetries will be necessary on a case by case basis. Nevertheless, there does appear to be a kind of universality in the basic form of explanation which is applicable in a wide variety of distinctly different cases.

Being told that a given individual system possesses a statistical property like the K- or Bernoulli properties is thus essential for explaining why it evolved to equilibrium. The D-N/D-N-P model of explanation cannot accommodate this key explanatory feature for the following reasons. These ergodic properties are definable only in terms of ensembles. They are probabilistic/measure-theoretic concepts and are, therefore, not definable in terms of the individual trajectory of a system. The complete ideal D-N explanatory text for why our system evolves as it does is just the description of the unique trajectory of that system in the phase space. On the classical view, because of the existence of hidden variables (exact microstates and unique trajectories), the evolutions of the other members of the initial ensemble give rise to different independent trajectories and completely separate ideal texts. The ideal D-N texts for each member of the ensemble are, therefore, disjoint.

What does, in fact, unify the individual D-N texts is the fact that the given deterministic system possesses the statistical property, such as being K- or Bernoulli. This explains why the system evolves towards an equilibrium state, by showing that the initial ensemble will with overwhelming probability approach (in a coarse-grained sense) the equilibrium distribution, as we discussed. But the D-N/D-N-P model cannot incorporate this crucial explanatory feature. The point is that while these statistical properties are definable only in ensemble terms, they nevertheless play an essential role in explaining both the existence of equilibrium in general and why the individual system evolves as it does, in accordance with the second law.
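The coarse-grained approach to equilibrium can likewise be pictured numerically. In the following sketch (again with the cat map standing in, by assumption, for the relevant Hamiltonian systems), an ensemble of 10,000 phase points begins inside a single small cell, and its coarse-grained occupation numbers flatten toward uniformity within a handful of iterations:

```python
# An ensemble starting inside one small cell spreads, under deterministic
# evolution, until every coarse cell holds ~1/16 of the points: the
# coarse-grained analogue of reaching the equilibrium distribution.
import random

def cat_map(x, y):
    return (x + y) % 1.0, (x + 2.0 * y) % 1.0

random.seed(0)
ensemble = [(random.uniform(0.0, 0.1), random.uniform(0.0, 0.1))
            for _ in range(10_000)]          # all points in one 0.1 x 0.1 patch

GRID = 4                                      # 4 x 4 = 16 coarse cells
for t in range(13):
    if t % 4 == 0:
        counts = [0] * (GRID * GRID)
        for x, y in ensemble:
            counts[int(GRID * x) + GRID * int(GRID * y)] += 1
        spread = max(counts) / len(ensemble)  # fraction in the fullest cell
        print(f"t={t:2d}: largest cell holds {spread:.2%} (uniform = 6.25%)")
    ensemble = [cat_map(x, y) for x, y in ensemble]

# Output drops from 100% at t=0 toward ~6% within roughly a dozen steps:
# the occupation numbers become essentially uniform over the coarse cells.
```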

For example, if it can be shown that the instability present in the Hamiltonian (the function characterizing the system's evolution in phase space) leads to the Bernoulli property, then we can say the following about the individual system's evolutionary behavior. Suppose we have knowledge of the system's evolution from the arbitrarily distant past, in terms of the sequence of coarse-grained boxes occupied by the representative point at unit times. Then no matter how fine the coarse-graining, no matter how small the boxes are, that knowledge gives us no information about which box the system's representative point will occupy in the next instant. In other words, the system's future behavior is probabilistically independent of its past. Thus, the particular sequence of box occupation numbers is a Bernoulli sequence. This will be true even for partitions (coarse-grainings) that are so fine that the entire sequence of box occupation numbers, for -∞ < t < +∞, suffices to individuate the system's trajectory. Such trajectory-individuating partitions are, in fact, guaranteed to exist for K- and Bernoulli systems. In this sense, then, being told that the system possesses a certain statistical property, defined in terms of the "global" behavior of the ensemble, is relevant for characterizing its individual behavior.
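This unpredictability-despite-determinism is easy to exhibit. The sketch below uses the logistic map x → 4x(1−x) with the two-cell partition at x = 1/2, a standard illustrative choice of mine; for this map and partition the symbol sequence is, apart from a measure-zero set of initial points, an i.i.d. fair-coin process, so conditioning on the observed past should leave the next cell a toss-up:

```python
# Does the past coarse-grained history tell us which cell comes next?
# Estimate P(next symbol = 1 | previous three symbols) for every history.
from collections import defaultdict

x = 0.123456789                        # arbitrary non-exceptional start
symbols = []
for _ in range(200_000):
    x = 4.0 * x * (1.0 - x)            # logistic map, fully deterministic
    symbols.append(1 if x >= 0.5 else 0)

tally = defaultdict(lambda: [0, 0])    # history -> [visits, next==1 count]
for i in range(3, len(symbols)):
    h = tuple(symbols[i - 3:i])
    tally[h][0] += 1
    tally[h][1] += symbols[i]

for h in sorted(tally):
    n, ones = tally[h]
    print(f"history {h}: P(next=1) ~ {ones / n:.3f}  ({n} samples)")

# Every history yields ~0.5 (up to floating-point and sampling noise):
# however much past box-occupation data we collect, the next cell is as
# unpredictable as a coin toss.
```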

The quotes from Sklar show that I-S theorists have also overlooked the key feature of explanation in SM; namely, the "derivation" of the statistical properties from an analysis of the lawlike instability present in the motion of the system. I-S explanations do not require that the statistical generalizations appearing in the explanans be derivable from underlying theory. They can be statements recording merely de facto regularities. This leads Sklar, for example, to conclude that most of the explanatory work in statistical mechanical explanations is done by the matter-of-fact distribution of initial conditions in the world.

It seems to me that the more appropriate model of explanation I have been advocating can be seen as a combination of the best features of the D-N-P and I-S models. It also requires, as I have tried to show, a new understanding of the nature and genesis of probabilities. First, the D-N-P model is superior to the I-S model in that it demands that the statistical generalizations or laws be derived from underlying theory. As we have just seen, for the cases of explanation in SM, this corresponds to the demonstration, through an analysis of the system's lawlike dynamical instability, that the system possesses strong statistical properties. Such demonstrations have a deductive character, and it would be appropriate, if we want, to call this part of the explanation "deductive-nomological". The second important feature of these "Statistical-Deterministic" explanations also involves a deductive argument. It is the derivation of a "probability one" assertion from a statement to the effect that the system has such a statistical property, together with a mathematical proposition; namely, a measure-theoretic limit theorem such as the SLLN or some ergodic theorem.
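In the simplest case the limit theorem doing this deductive work is the SLLN itself: with probability one, the running relative frequency of an i.i.d. Bernoulli(p) sequence converges to p. A quick empirical look at one sample path (illustrative only; the seed and sample sizes are arbitrary choices of mine):

```python
# One sample path of fair Bernoulli trials; the running frequency settles
# toward p. The SLLN upgrades this tendency to a probability-one statement
# about the limit, which is what licenses the final inductive step.
import random

random.seed(42)
p, heads = 0.5, 0
for n in range(1, 1_000_001):
    heads += random.random() < p
    if n in (100, 10_000, 1_000_000):
        print(f"n = {n:>9,}: frequency = {heads / n:.4f}")
```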

Third, there is something at least intuitively satisfying about the I-S model in its claim that the explanans should provide us with evidence from which the explanandum can be inferred with high probability or with "practical certainty". Granted, there are cases, such as the α-decay of a U238 nucleus, where this intuition completely breaks down. For such irreducibly probabilistic phenomena, the D-N-P model is appropriate, as Railton persuasively argues. But when we have a result from theory which tells us that a particular explanandum is nomically expectable with probability one, then an inductive argument to that explanandum is clearly appropriate. I see nothing wrong with calling such an argument explanatory. Thus, the final component of the new model of statistical explanation is the inductive argument that is distinctive of I-S explanations.

It seems to me that these three features provide a reasonable reconstruction of the backbone of a type of explanatory argument that actually gets made by investigators working in statistical physics. It is quite analogous to the "explanatory" story that gets told in the artificial (nonphysical) example with which this paper began.

Before concluding, let me discuss two possible responses to my argument. The first can be seen as a substantive objection to my criticism of the D-N/D-N-P model in the case of explanation in SM. Understanding why this argument is unsuccessful clearly illuminates the explanatory relevance of the statistical properties of the motions of nonrandom objects. The second is a familiar methodological objection to the inductive character of the argument which plays a role in my explanatory account. It is just the old problem of the "ambiguity" of I-S explanations, which, in part, led Railton to develop his model in the first place. This objection will also be addressed in the context of explanations in SM.

The first response is suggested by an interesting argument of Railton's in his discussion of explanation in classical SM. Recall that he argues that the statistical generalizations of SM, while not laws, nevertheless can provide explanatory information relevant for filling out the ideal D-N text which, on his view, is genuinely explanatory.

Railton points out that certain results in ergodic theory can be used to show that

if a gas is in an initial condition that obeys a relatively few constraints, it will, over infinite time, spend most of its time at or near equilibrium. This illuminates a modal feature of the causal processes involved and therefore a modal feature of the relevant ideal explanatory texts: this sort of causal process is such that its macroscopic outcomes are remarkably insensitive (in the limit) to wide variations in initial microstates [Railton (1981), p. 251].

He continues, noting that such information can be informative about the ideal causal texts, and concludes that we can capture the "intuition that ergodic theory and its kin are somehow explanatory: they shed light on a modal feature of the causal processes underlying thermodynamic behavior, thus providing information about the relevant ideal causal texts" [Railton (1981), p. 251].

So, perhaps the D-N/D-N-P model has another response to my criticism that it misses the key explanatory factors in explaining thermodynamic behavior. Does the appeal to this "modal feature" of the causal process do the trick? Let us look a bit more closely at this feature which is illuminated by ergodic theory. The results to which Railton refers are meant to justify using the microcanonical distribution as the distribution with which to represent the equilibrium state. That is, they are supposed to rationalize the use of the Gibbs method. They allow us to conclude that the equilibrium state of a system is its overwhelmingly most probable state. What the results of ergodic theory do, in fact, illuminate are the statistical properties of the systems. In a sense it is true that the ergodic results demonstrate that the macroscopic outcomes are insensitive to variations in initial conditions. But this is not in a directly causal or deterministic sense. Rather, the sense is wholly statistical.

Furthermore, as we have seen in the discussion above, contrary to what Railton claims, statistical systems such as gases exhibit extreme sensitivity to initial conditions. Two microstates initially very close to one another (in some cell ξi) can, as a result of dynamical evolution, end up in wildly different subsequent states, states which have distinguishable macroscopic characteristics.10

Thus, the only sense in which "macroscopic outcomes are remarkably insensitive" to variations in initial conditions is expressed in the claim that we can accurately describe the equilibrium state of the system in terms of a special probability distribution over possible microstates.11 But this is a statistical claim, and it is not an expression of any peculiar causal modality of the underlying process.

Turning now to the second objection, recall that a problem with the I-S account is that it appears always to be possible to find conflicting I-S arguments: one showing that the explanandum was to be expected with high probability, the other showing that it was not to be expected, also with high probability. As I mentioned in Section 2, this problem arises out of the possibility that relative to one reference class an event can be quite probable, while relative to a narrower reference class it can be quite improbable. The probability that a forty-year-old U.S. male will live to be sixty can be quite high; yet the probability that a forty-year-old U.S. male with lung cancer will live to be sixty will be very low. Different partitions of the reference class can lead to different probabilistic conclusions.
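The arithmetic behind the ambiguity is worth making explicit. The figures below are entirely hypothetical, chosen by me only to reproduce the structure of the example:

```python
# Toy arithmetic for the reference-class ambiguity. All counts are
# hypothetical illustration, not actuarial data.
population = {
    # has_lung_cancer -> (number in class, number surviving to sixty)
    False: (9_500, 8_740),
    True:  (500, 60),
}

total = sum(n for n, _ in population.values())
survivors = sum(s for _, s in population.values())
print(f"P(survive | 40-year-old male)              = {survivors / total:.2f}")

n, s = population[True]
print(f"P(survive | 40-year-old male, lung cancer) = {s / n:.2f}")

# Broad class: 0.88, so survival "is to be expected with high probability".
# Narrower class: 0.12, so it is not. Same event, conflicting I-S verdicts.
```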

Now, prima facie, one would expect that this problem of partitioning arises in the demonstrations that deterministic Hamiltonian systems will with overwhelming probability evolve to equilibrium states. As we saw earlier, the approach to equilibrium gets characterized in terms of a coarse-graining or partitioning of the phase space. Doesn't this introduce an epistemic element into the characterization of the approach to equilibrium? Does this mean that the probabilities involved are not really coming from the dynamics? The D-N/D-N-P theorist might insist that the appropriate partitioning of the phase space is infinite: the "partition" of the space into individual phase points. In the explanatory context this is equivalent to providing a complete ideal D-N explanatory text, a description of the system's unique trajectory in phase space. (Since measurable partitions are countable coverings of the phase space, this "infinite partition" is, however, not really a partition.)

Hempel's response to the partition or reference class problem was to require maximal specificity, thereby making explicit the relativization of the explanation to a particular epistemic context. Must my account also relativize statistical explanations to a given epistemic state? The answer is no. As I mentioned earlier, if a system exhibits instability that results in the K- or Bernoulli property, then the system exhibits stochastic behavior no matter how fine the partition. There exist dynamically generated partitions (sometimes called K-partitions or Markov partitions) which are so fine as to allow for the individuation of trajectories.12 Thus, in a well-defined sense, these K-partitions are maximally specific partitions. They give us all the relevant dynamical information about the system's evolution. Yet relative to these partitions, the behavior remains stochastic. So, I do not believe that the problem of ambiguity will arise, at least for the cases of explanation in classical SM which have concerned us here.13
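The claim that refining the partition does not wash out the stochasticity can also be checked numerically. Continuing the earlier logistic-map sketch with a finer four-cell partition (again my illustrative choice), we can estimate how much genuine uncertainty about the next cell survives conditioning on the past:

```python
# Conditional entropy (in bits) of the next cell given the previous three
# cells, for the logistic map under a 4-cell partition. A deterministic
# coarse-grained symbol dynamics would give 0 bits.
import math
from collections import defaultdict

x, symbols = 0.123456789, []
for _ in range(500_000):
    x = 4.0 * x * (1.0 - x)
    symbols.append(min(int(4.0 * x), 3))  # cells [0,.25),[.25,.5),[.5,.75),[.75,1]

tally = defaultdict(lambda: [0, 0, 0, 0])
for i in range(3, len(symbols)):
    tally[tuple(symbols[i - 3:i])][symbols[i]] += 1

total = sum(sum(c) for c in tally.values())
h = 0.0
for counts in tally.values():
    n = sum(counts)
    for c in counts:
        if c:
            h -= (c / total) * math.log2(c / n)
print(f"H(next cell | previous 3 cells) ~ {h:.2f} bits (0 would mean determinism)")

# The estimate stays near 1 bit per step, close to the map's Kolmogorov-Sinai
# entropy: however finely we partition, each step of the deterministic motion
# still produces fresh, coin-flip-like information.
```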

5. By way of concluding, let me begin with an important disclaimer. One might very easily get the impression from the above discussion that there is a fully worked-out explanation for the thermodynamic behavior of systems such as real gases in boxes. Unfortunately, this is not true. The ergodic results, as I mentioned, have been established for idealized dynamical systems. We are still a long way from being able generally to determine, of any given physical system, whether it really has any of the ergodic properties.14 However, I do think that the instability and randomness results are conceptually relevant and are, in fact, essential for a proper understanding of the existence of equilibrium and of the "inevitable" approach to equilibrium.

Speaking more generally, what have we learned from the above discussion? I believe that there are two important conclusions to be drawn. First, we see that we must recognize the possibility that probabilities and probabilistic laws can arise in physical theory in a way distinct from both the old classical view and the quantum theory. On the old classical view, probabilities are due entirely to our ignorance of the system's true exact state. On the quantum theory, probabilities are irreducible, where this is understood in terms of propensities and the nonexistence of hidden variables. The simple examples discussed in the early sections, as well as the later discussion of explanation in SM, indicate that probabilities may arise in some third way, via instabilities and symmetries in the dynamics of the theory.

Second, because of the close ties between interpretations of probability and models of statistical explanation, we should conclude that neither the I-S nor the D-N-P model tells the whole story. The D-N-P model, in particular, misses the key explanatory factors for a proper explanation of the patterns in the fractals and the Euler strut, as well as in the thermodynamic regularities. On the other hand, the D-N/D-N-P model is superior to the I-S view in that it demands that the probabilistic laws, essential for explanation, be derivable from our underlying theory. The standard criticism of SM is that such laws do not exist. Instead, the de facto distribution of initial conditions is said to do the explanatory work. I think that recent work on chaos in dynamical systems theory presents a serious challenge to this view.

Finally, I have offered an outline or sketch of what I take to be the proper form for statistical explanation in deterministic dynamical theories. This "Statistical-Deterministic" model adapts key features from both the D-N-P and I-S models. On the one hand, from the D-N-P model, it takes seriously the idea that the probabilities appearing in the "laws" must be derived from the underlying theory. The laws cannot simply be de facto generalizations expressing statistical regularities. From the I-S model, on the other hand, the new model retains the idea that explanations should show that the explanandum was to be expected with high (unit) probability. The final step in the explanatory argument is an inductive argument to the explanandum.

Notes

*I would like to thank the following people for helpful comments, discussions, and criticism: Hugh Chandler, Tim McCarthy, Joe Mendola, Peter Railton, Bob Wengert, and Mark Wilson.

1. In this paper I am concerned only with what are usually called objective or empirical interpretations of probability. Hence, I will ignore the various "subjective" views.

2. See Giere (1973). I think that if any kind of propensity interpretation is correct (and I am not overly optimistic), it will have to be of this type, and not of the type that takes probability to be a dispositional property of the experimental arrangement, a disposition to yield (at least on some versions) a certain relative frequency in a hypothetical sequence of trials. It has been forcefully argued by Sklar (1970) that these latter, more orthodox, and less extreme interpretations succumb to many of the same problems (and perhaps to a few more) which afflict the frequency interpretations they were intended to replace.

3. Note that despite the "causal" flavor of this characterization of independence, the assumption of independence is not causal or deterministic. There are many well-known examples of noncausal correlations which entail the failure of independence.

4. Actually, the condition of independence required for a sequence to be Bernoulli can be weakened considerably; and so one can employ a result which is somewhat weaker than the SLLN to explain why the pattern is to be expected. The process described by the game is a stationary Markov process. The "stationarity" guarantees that the probabilities of moving towards one of the vertices do not depend on where one is in the sequence. It means that the shift transformation along the sequence is measure preserving. The fact that the sequence of numbers used in the game is Bernoulli is a sufficient condition for the Markov process to be stationary, but it is not necessary. The relevant technical result, proving an ergodic theorem weaker than the SLLN, can be found in Elton (1987). This can be used to prove that with probability one, the sequence of throws of the die will yield a sequence of points which are dense on the Sierpinski gasket, regardless of the initial starting point. Note that the concept of a stationary Markov process (like that of a Bernoulli sequence) is not one which can be explicated without recourse to probabilistic notions. (I want to thank John Winnie for discussions concerning this weakened condition.)

5. Railton does, however, allow noncausal features to play key roles in the ideal explanatory texts (private communication), so the following criticism may lose some of its force.

6. At least if echoes can travel backwards in time.

7. Sklar (1973), p. 210. Sklar's views have changed somewhat since the publication of this article. Nevertheless, I think the sentiment expressed in the following quote is representative of the received stance toward these issues.

8. The idea of isomorphism is really quite simple: consider the "observable" behavior of the dynamical system, the sequence of box numbers {i} (from the cells ξi) in which the system's representative point finds itself at intervals Δt throughout its evolution. This sequence of numbers {i} is statistically indistinguishable from the sequence of numbers obtained by repeatedly playing a game of roulette in which the number of slots in the wheel is the same as the number of cells ξi in the coarse-graining.

9. For an argument that this de facto distribution cannot play such a role, see Krylov (1979) and my discussion of Krylov's work in Batterman (1990).

10. It is usual to assume that the "size" of the cells in the coarse-graining represents the limits of accuracy of our measuring instruments. That is, within any given cell we cannot individuate systems, but we can determine whether systems have their representative points in different cells.

11. By "accurately" here, I mean only that experimentally observed values for thermodynamic quantities can be computed using the microcanonical distribution.

12. As with all else in ergodic theory, this statement must be understood as true except possibly on a set of measure zero of trajectories.

13. Instead of the ambiguity problem, there will be the "sets of measure zero" problem mentioned in the last note. A solution to this problem would show that the sets of measure zero, which are ignored in ergodic theory, physically have probability zero. This is, however, a notoriously difficult problem.

14. There is in fact a theorem, known as the KAM theorem, which says roughly that for most systems there will exist regions of stability in the phase space such that states initially within these regions will remain there as time goes on. In such cases the system cannot even be ergodic, in which case we lose the straightforward justification for adopting the microcanonical distribution.

References

Batterman, R. W. (1990) "Irreversibility and Statistical Mechanics: A New Approach?", Philosophy of Science 57, 395-419.

Elton, J. H. (1987) "An Ergodic Theorem for Iterated Maps," Ergodic Theory and Dynamical Systems 7, 481-488.

Giere, R. (1973) "Objective Single Case Probabilities and the Foundations of Statistics," in P. Suppes et al. (eds.), Logic, Methodology, and Philosophy of Science 4 (Amsterdam: North-Holland), 467-483.

Krylov, N. S. (1979) Works on the Foundations of Statistical Mechanics, trans. A. B. Migdal, Ya. G. Sinai, and Y. U. Zeeman (Princeton: Princeton University Press).

Railton, P. (1978) "A Deductive-Nomological Model of Probabilistic Explanation," Philosophy of Science 45, 206-226.

Railton, P. (1981) "Probability, Explanation, and Information," Synthese 48, 233-256.

Sinai, Ya. G. (1977) Introduction to Ergodic Theory, trans. V. Scheffer (Princeton: Princeton University Press).

Sklar, L. (1970) "Is Probability a Dispositional Property?", Journal of Philosophy 67, 355-366.

Sklar, L. (1973) "Statistical Explanation and Ergodic Theory," Philosophy of Science 40, 194-212.