A Statistical Analysis of Hitting Streaks in Baseball: Comment

Author(s): Jim Albert
Source: Journal of the American Statistical Association, Vol. 88, No. 424 (Dec., 1993), pp. 1184-1188
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2291255



Comment Jim ALBERT *

1. INTRODUCTION

This article deals with an interesting question among baseball fans: is there statistical evidence for streakiness in the performance of baseball hitters? When I was growing up near Philadelphia, Mike Schmidt was one of my favorite players and, from my observation of his batting performance across seasons, I was convinced that Schmidt was a very streaky hitter. From the writings of Tversky, it appears that I was probably convinced of Schmidt's streakiness by enthusiastic baseball announcers and by my distorted view of the streakiness inherent in random data. But I still want to believe that Schmidt's hitting behavior was streaky. Albright's study, which looks for patterns of streakiness among a large group of players, clearly tackles an interesting problem.

Albright uses two general approaches in his search for patterns of streakiness. In the first approach, a binomial constant-p model is set up (for each player) as the null hypothesis; a test statistic, such as the number of runs, is constructed to detect streakiness; and the null hypothesis is accepted or rejected on the basis of a significance probability. The second approach fits a logistic regression model for each player, where hitting is modeled using "recent success" variables for streakiness together with other situational variables. (For future reference, we will refer to these recent success covariates as history variables.) Although some players exhibit a significant degree of streakiness, Albright concludes that there is little support for streakiness overall, because only a small number of the 501 players exhibit this behavior.

Albright has done an extensive statistical analysis on this large baseball data set; however, many issues seem unanswered. First, this data set is interesting because it includes a number of situational variables that many baseball people believe are relevant to hitting performance. Although these variables are included in the logistic regression analysis, there is little discussion in the article on their general usefulness in the modeling of hitting. In Section 2 we take a more detailed look at the logistic modeling, see which variables are included in a stepwise approach, and then make some general comments about the importance of all of the situational variables.

From the results of the logistic modeling of hitting, we will see that only one of the situational variables appears important for a sizable proportion of the 200 hitters that are analyzed. Most of the situational variables appear in approximately 10% of the regressions, which is the percentage that one would expect by chance.

Does the fact that one situational variable, say the home/away effect, is significant for only 10% of the hitters mean that this variable is spurious and should be ignored in predicting hitting? Indeed not! The conclusion that there is little evidence for a general home/away effect for the population of all hitters says nothing about the significance of the home/away effect for specific hitters such as Kirby Puckett. It is possible that the home/away effect is very strong for a small group of players. Similarly, Albright's analysis, which focuses on the general pattern of streakiness across all players, seems to deemphasize the fact that particular players do indeed show signs of streakiness for particular seasons. Albright suggests that these particular patterns of streakiness are spurious, because the players involved do not exhibit streakiness in neighboring years. But this suggestion implicitly assumes that streakiness is an intrinsic quality of a player that should appear throughout the player's career. It is possible that streakiness for a particular season is a combination of some inner quality and other environmental factors. I am not sure what causes streakiness. But I think it is inappropriate to dismiss the existence of this characteristic based on the overall behavior of the population of hitters and our belief about its intrinsic nature.

* Jim Albert is Professor, Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, Ohio 43403.

Another thing we learn from the logistic regression fitting is that the situational effects for players, if they exist, are generally small in magnitude. Similarly, streakiness effects are small; they are much smaller than the effects that baseball announcers would have us believe. Thus it is important to use testing procedures that are designed to detect these patterns. Albright uses the standard testing approach to detect streakiness, where one tries to reject the binomial null hypothesis by the application of a test statistic such as the number of runs. Because there is no alternative model proposed, one must question the power of the test procedure, that is, its ability to detect a streakiness effect when indeed it exists.
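To make the "number of runs" statistic concrete, here is a minimal sketch of the textbook Wald-Wolfowitz runs test applied to a 0/1 sequence of at-bat outcomes. It is offered only as an illustration of this class of test; the function name `runs_test` is hypothetical, and the exact variant used by Albright may differ.

```python
import math

def runs_test(seq):
    """Wald-Wolfowitz runs test for a 0/1 sequence (1 = hit, 0 = out).

    Returns (observed runs, expected runs under the null, z-score).
    Conditions on the observed number of hits and outs.
    """
    n1 = sum(seq)            # number of hits
    n0 = len(seq) - n1       # number of outs
    n = n0 + n1
    if n1 == 0 or n0 == 0 or n < 3:
        raise ValueError("need both outcomes and at least 3 at-bats")
    # A new run starts at every change between neighbouring outcomes.
    runs = 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    mean = 2.0 * n0 * n1 / n + 1.0
    var = 2.0 * n0 * n1 * (2.0 * n0 * n1 - n) / (n * n * (n - 1.0))
    z = (runs - mean) / math.sqrt(var)
    return runs, mean, z
```

A strongly negative z-score (fewer runs than expected) points toward streakiness, and a strongly positive one toward alternation. Because no alternative model is specified, the statistic by itself says nothing about how large a streakiness effect must be before it is likely to be detected.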

In Sections 3 and 4, we test for streakiness by developing an alternative "streaky" model. We motivate this model by first looking at the hitting performance of a batter across a season and asking whether the observed sequence of hits is consistent with a binomial model with a constant probability of success. If a player is streaky, one would expect his batting average to show noticeable patterns over the season. To see this variation graphically, we propose a simple moving average plot. For some players, this plot has little trend, indicating stability or consistency of the batter's batting average across the season. For other players, this plot shows significant peaks and valleys, indicating some significant batting slumps or hot streaks during the season.

These moving average graphs motivate the use of a Markov switching model in Section 4 to explain the variability of hitting for a particular player during a season. This model assumes that during a particular game, a player is either "hot" and hitting with probability $p_1$, or "cold" and hitting with probability $p_0$, where $p_1 > p_0$. This model is fit to 50 baseball players. By inspection of various posterior distributions, one can compare the streakiness of various batters and assess when a particular player is hot or cold during the season. In Section 5 we make some concluding remarks comparing this analysis with the methods used by Albright.

© 1993 American Statistical Association. Journal of the American Statistical Association, December 1993, Vol. 88, No. 424, Applications & Case Studies

Table 1. Summary of Stepwise Logistic Regressions Performed on 200 Major League Players

Situational variables (a)
                     I7    O2    SC    RB    RS    DN    HA    ERA   TF    TH
% of time entered    11.0  14.0  10.5  12.0  10.0  8.0   9.5   35    11.5  16.9

History variables (b)
                     Y1    Y2    Y3    Y6    Y10   Y20   W1    W2
% of time entered    6.0   3.0   5.5   6.5   8.0   8.0   0.5   3.0

% of positive effects: 41 14 48 50 50 61 42 50 73 31 31 100 17

NOTE: For each covariate, the percentage of players for which the variable is significant and the percentage of significant effects that are positive are given. Certain noteworthy percentages are boxed.

(a) For indicator variables, the effect listed is coded 1 in the data set. I7: game in 7th inning or later; O2: two outs in inning; SC: game score (player's team score minus opponent score); RB: runners on base; RS: runners in scoring position; DN: night game; HA: home game; ERA: earned run average of pitcher; TF: game played on grass; TH: pitcher has opposite arm.

(b) Y# denotes the number of hits in the latest # at-bats; W1 and W2 are the exponentially weighted sums of the results of the last 20 at-bats (walks and sacrifices included) with weights .8 and .95, respectively.

2. THE SITUATIONAL VARIABLES

One interesting aspect of this particular data set is that a large number of situational variables are recorded. Many of these covariates are similar to the streakiness variable in that the baseball establishment believes that they explain a significant portion of the variability in hitting performance. For example, it is generally believed by baseball fans that many players hit better at their home ballpark than on the road, that many players hit significantly better against pitchers of the "opposite arm," and that some players are clutch hitters in the sense that they have a significantly higher batting average when there are runners in scoring position. By looking at the effects of these other situational variables across a large group of players, one may better understand the effects of streakiness.

The effects of the entire set of situational variables, including the history variables, were investigated for a group of 200 players, 50 each from the American League in 1988 and 1989 and the National League in 1989 and 1990. For each player, a stepwise logistic regression was run similar to the one performed by Albright. First the situational variables were placed in a stepwise fashion into the model using a 10% significance level, and next the set of history variables was considered for inclusion at a 10% level. In the following analysis, I define success as getting a hit; walks and sacrifices were not included. I also adjusted the definition of the handedness of the pitcher that was given in the data set. The new variable "throw" is set equal to 1 if the pitcher has a throwing arm opposite to the hitting side of the hitter and 0 otherwise.

Table 1 summarizes the fitting of stepwise regressions for each of the 200 players. For each situational and history variable, the percentage of players for which the particular variable entered the model is given in the first row of the table. The next row gives the percentage of significant effects where the corresponding regression coefficients were positive. Both numbers are indicators of the strength of the variables in explaining hitting variation.

First consider the set of situational variables. If a particular variable had no relationship with hitting (across all players), and this variable is approximately uncorrelated with the other situational variables, then one would expect 10% of the regressions to contain the variable just by chance, with approximately half of the significant effects being positive. It is interesting to note that only one situational variable, the pitcher's earned run average (ERA), appears in a high (33%) percentage of the regressions. Practically all (96%) of the ERA effects were positive, indicating that many players hit for a significantly higher average against weaker pitchers, as measured by the ERA. Other situational variables stand out, not for the percentage of time in the model but for the direction of the effect. Generally, when the effects were significant, players hit better when runners were on base, when the game was played at home, and when the pitcher had an opposite arm. When the variable O2 was significant, the players generally hit worse when there were two outs in the inning.
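The "10% by chance" benchmark can be given a rough margin of error. The sketch below assumes, purely for illustration, that a variable with no real effect enters each of the 200 regressions independently with probability equal to the nominal 10% level; stepwise selection does not guarantee this exactly. The function name `chance_entry_rate` is hypothetical.

```python
import math

def chance_entry_rate(n_players=200, level=0.10):
    """Mean and standard deviation, in percent of players, of the entry rate
    of a variable with no real effect, assuming independent entries at the
    nominal significance level (an idealisation of the stepwise procedure)."""
    mean_pct = 100.0 * level
    sd_pct = 100.0 * math.sqrt(level * (1.0 - level) / n_players)
    return mean_pct, sd_pct

mean_pct, sd_pct = chance_entry_rate()
# 33% is the ERA entry rate quoted in the text above.
z_era = (33.0 - mean_pct) / sd_pct
print(f"chance rate {mean_pct:.1f}% +/- {sd_pct:.1f}; ERA z-score {z_era:.1f}")
```

Under these assumptions the chance entry rate has a standard deviation of roughly two percentage points, so entry rates between about 6% and 14% are unremarkable, while the ERA rate lies far outside that band.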

Comparing the sets of situational and history variables, note from Table 1 that the history variables individually appear in the model less frequently. But this difference can be explained by the positive correlation between the history variables. One history variable that is notable in terms of the direction of the effect is Y20, the number of hits in the last 20 at-bats. Note that 81% of the significant Y20 effects were negative. This negative history effect is referred to as the "plexiglass principle" by James (1988); players who hit poorly in recent at-bats are more likely to get a hit in the current at-bat and vice versa.

3. DOES A PLAYER'S HITTING PROBABILITY CHANGE ACROSS A SEASON?

From the preceding logistic regression analysis, we see that, with the possible exception of the ERA variable, no single situational or history variable appears to explain a significant amount of the hitting variability for the 200 players studied. In particular, no single history variable stands out as an important predictor of hitting, and it seems that a different approach is necessary in our study of streakiness.

If we ignore all of the situational variables, then the simplest reasonable model for hitting is to assume that the hits for a particular player follow a Bernoulli process with probability of success $p$ (which may be different between hitters). If the player exhibits some streakiness during the season, then one would expect the hitting probability $p$ to change over the course of the season. Suppose that the batter plays in $T$ games during the season and we observe the sequence $\{(x_i, n_i), i = 1, \ldots, T\}$, where $x_i$ and $n_i$ are the number of hits and at-bats of the player in the $i$th game. Then a simple way to look for changes in the hitting probability over time is to first select a window width $d$ and then plot the moving averages

$$p_t = \frac{\sum_{i=t}^{t+d-1} x_i}{\sum_{i=t}^{t+d-1} n_i}$$

as a function of the time $t$. Figure 1 displays these moving average plots using window width $d = 4$ for three of the 1988 American League players. As in the earlier analysis, a batting opportunity is defined as an official at-bat and a success is defined as a hit. Note that these three players have very different hitting profiles over the season. Chet Lemon's hitting performance seems very stable over time. In contrast, Carney Lansford appears to have some significant hot and cold streaks throughout the season. To measure the streakiness or steadiness of the hitting probabilities, we use the lag $d$ correlation of the moving averages; that is, the correlation of the sequences $\{p_t, t = 1, \ldots, T - 2d + 1\}$ and $\{p_t, t = d + 1, \ldots, T - d + 1\}$. A positive lag correlation indicates some positive dependence or streakiness in the sequence of moving averages, and a negative correlation reflects a negative dependence or steadiness in the averages. The values of the correlations are indicated on the plots in Figure 1.
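The moving averages and their lag d correlation described above can be computed in a few lines. The following is a minimal pure-Python sketch; the function names are hypothetical, and it assumes every window of d games contains at least one at-bat.

```python
import math

def moving_averages(hits, at_bats, d):
    """p_t = (hits in games t..t+d-1) / (at-bats in games t..t+d-1),
    for t = 1, ..., T-d+1 (returned as a 0-indexed list)."""
    out = []
    for t in range(len(hits) - d + 1):
        ab = sum(at_bats[t:t + d])
        if ab == 0:
            raise ValueError("window with no at-bats")
        out.append(sum(hits[t:t + d]) / ab)
    return out

def pearson(u, v):
    """Ordinary sample correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def lag_d_correlation(hits, at_bats, d):
    """Correlation of {p_t, t = 1..T-2d+1} with {p_t, t = d+1..T-d+1}."""
    p = moving_averages(hits, at_bats, d)
    return pearson(p[:-d], p[d:])
```

Pairing each moving average with the one d games later means the two windows do not share any games, so a positive correlation cannot be produced simply by overlapping windows.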

4. ONE MODEL FOR STREAKINESS

From the inspection of a number of moving average plots such as those in Figure 1, the hitting performance of some players appears to change significantly over time. This observation motivates the consideration of the following simple model, which allows for changes in a batter's hitting probability across games. (A similar version of this model was used to model econometric data in Albert and Chib 1993.) Suppose that at game $i$, the player is either "hot" and hitting with probability $p_1$ or "cold" and hitting with probability $p_0$, where $p_0 < p_1$. The state (hot or cold) that a player is in during a particular game is unknown, but we assume that the player moves between the states according to a Markov chain with given transition probabilities.

Specifically, we introduce the latent variables $Z_1, \ldots, Z_T$, where $Z_i = 1$ if the player is hot during the $i$th game of the season and $Z_i = 0$ if he is cold during that game. Conditional on the $Z_i$, the observed numbers of hits $x_1, \ldots, x_T$ are assumed independent, with $x_i$ distributed binomial$(n_i, p_1)$ if $Z_i = 1$ or binomial$(n_i, p_0)$ if $Z_i = 0$. The distribution of the states $Z_i$ follows a Markov chain with the following transition probabilities:

$$
\begin{array}{c|cc}
 & Z_i = 0 & Z_i = 1 \\ \hline
Z_{i-1} = 0 & a & 1 - a \\
Z_{i-1} = 1 & 1 - b & b
\end{array}
$$

We will be using this model to reflect streakiness of a batter's hitting performance across games. So we will set the Markov chain transition probabilities $a$ and $b$ both equal to .9. Thus if a player is in a hot state, then he remains in a hot state for the next game with probability .9. Similarly, a player in a cold state will remain in a cold state for the next game with high probability.
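With a = b = .9, the number of consecutive games spent in a state is geometric with mean 1/(1 - .9) = 10 games, and the chain spends half of its time in each state in the long run. The sketch below simulates one season from the two-state model; the function name, the default arguments, and the choice to draw the first state from the stationary distribution are assumptions made for illustration (a and b must be less than 1).

```python
import random

def simulate_season(at_bats, p0, p1, a=0.9, b=0.9, seed=0):
    """Simulate latent states and hits from the two-state switching model.

    at_bats : list of at-bats n_i per game
    p0, p1  : cold and hot hitting probabilities (p0 < p1)
    a, b    : P(stay cold), P(stay hot)
    Returns (states, hits), one entry per game.
    """
    rng = random.Random(seed)
    # Assumption: the first state is drawn from the stationary distribution.
    pi_hot = (1 - a) / ((1 - a) + (1 - b))
    z = 1 if rng.random() < pi_hot else 0
    states, hits = [], []
    for n in at_bats:
        states.append(z)
        p = p1 if z == 1 else p0
        # Binomial(n, p) draw as a sum of n Bernoulli trials.
        hits.append(sum(1 for _ in range(n) if rng.random() < p))
        stay = b if z == 1 else a
        if rng.random() >= stay:
            z = 1 - z          # switch state for the next game
    return states, hits
```

Simulated seasons of this kind are a convenient way to see how large the gap between the hot and cold probabilities must be before the moving average plot shows visible hills and valleys.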

With the transition probabilities fixed, the remaining parameters in the model are the hot and cold binomial probabilities $p_0$ and $p_1$. If these two probabilities are equal, then this model reduces to the usual binomial independence model with constant probability of success. On the other hand, if $p_0$ and $p_1$ are significantly different, then the player has a season-long batting performance that can be described as streaky. One can measure the degree of streakiness of a particular player by the difference in hot and cold probabilities $p_1 - p_0$.

Figure 1. Moving Average Plots of Batting Averages of Three Players With Window Equal to 4 Games. The lag 4 correlation of the moving averages is placed above each plot. [Panels: 1988 C. Lansford (r = 0.43); 1988 D. Evans (r = -0.05); 1988 C. Lemon (r = -0.41). Horizontal axis from 0 to 140; vertical axis from 0 to 1.]

In this setting, a Bayesian analysis is appropriate, because we have some prior knowledge about the locations of the binomial probabilities. What do we know about hitting probabilities? First, the batting averages of recent major league regular batters are approximately normally distributed with mean .27 and standard deviation .03. Second, from the earlier regression analysis, we view streakiness to be a relatively rare characteristic, and so we believe that the cold and hot probabilities are similar in size. Albert (1992) showed how one can model this prior information by means of a hierarchical distribution (Lindley and Smith 1972). For the following examples, the parameters of the prior distribution are set so that the prior median and 90th percentile of the difference of probabilities $p_1 - p_0$ are given by .04 and .10. So we believe a priori that the events "$p_1 - p_0 < .04$" and "$p_1 - p_0 > .04$" are equally likely, and we are pretty sure (with probability .90) that the difference in hot and cold probabilities is less than .10.

In this model the unknown quantities are the two hitting probabilities and the latent states $Z_i$. Given the hitting data $\{(x_i, n_i), i = 1, \ldots, T\}$ for a particular player, our updated beliefs about these quantities are summarized in terms of their posterior distribution. Albert (1992) outlined how one can simulate from the joint distribution of the probabilities and the states by the use of the Gibbs sampler.

Table 2. Posterior Means (Standard Deviations) of Hot and Cold Probabilities for Three Players in 1988

                    C. Lansford    D. Evans       C. Lemon
Cold probability    .197 (.029)    .257 (.032)    .246 (.022)
Hot probability     .359 (.030)    .330 (.031)    .286 (.026)

Table 2 gives the posterior means and posterior standard deviations of the hot and cold probabilities for the three players whose batting averages were plotted in Figure 1. One can see from this table that these three players had very different seasons in terms of streakiness. The difference between the hot and cold batting average for Chet Lemon was only .04; in contrast, Carney Lansford hit approximately 160 points higher when he was hot than when he was cold. This posterior analysis also gives us information about the latent states $Z_i$. The posterior probability $P(Z_i = 1 \mid \text{data})$ gives the probability that a player is in the hot state during game $i$. Figure 2 plots the sequence $P(Z_i = 1 \mid \text{data})$ against the game number for each of the three players. These figures mimic the moving average plots of Figure 1 and dramatically show the streakiness or lack of consistency of a player's hitting across a season. Lemon's probability of being hot stays in the interval (.3, .65) for the whole season. In contrast, Lansford had three major streaks and three major slumps during the 1988 season; for these streaks and slumps the probability of hot is close to 1 and 0.
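The posterior probabilities reported here come from a Gibbs sampler that also accounts for uncertainty in the hot and cold probabilities. As a simplified illustration only, if $p_0$ and $p_1$ are treated as known (an assumption that the analysis above does not make), the probabilities of being hot in each game follow from the standard forward-backward recursions for a two-state hidden Markov model. The function names below are hypothetical, and the initial state is assumed to follow the stationary distribution.

```python
import math

def binom_pmf(x, n, p):
    """Binomial probability of x hits in n at-bats."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def smoothed_hot_probs(hits, at_bats, p0, p1, a=0.9, b=0.9):
    """P(Z_i = 1 | all data) for each game, with p0 and p1 held fixed."""
    T = len(hits)
    trans = [[a, 1 - a], [1 - b, b]]     # rows: previous state 0 (cold), 1 (hot)
    like = [[binom_pmf(x, n, p) for p in (p0, p1)]
            for x, n in zip(hits, at_bats)]
    denom = (1 - a) + (1 - b)
    init = [(1 - b) / denom, (1 - a) / denom]   # stationary distribution

    # Forward pass: filtered state probabilities, normalised at each step.
    alpha = []
    for t in range(T):
        pred = init if t == 0 else [
            sum(alpha[t - 1][j] * trans[j][k] for j in (0, 1)) for k in (0, 1)]
        w = [pred[k] * like[t][k] for k in (0, 1)]
        s = sum(w)
        alpha.append([v / s for v in w])

    # Backward pass: information from later games, rescaled for stability.
    beta = [[1.0, 1.0] for _ in range(T)]
    for t in range(T - 2, -1, -1):
        w = [sum(trans[j][k] * like[t + 1][k] * beta[t + 1][k] for k in (0, 1))
             for j in (0, 1)]
        s = sum(w)
        beta[t] = [v / s for v in w]

    post = []
    for t in range(T):
        w0 = alpha[t][0] * beta[t][0]
        w1 = alpha[t][1] * beta[t][1]
        post.append(w1 / (w0 + w1))
    return post
```

Plotting the returned list against game number gives a curve of the same kind as Figure 2, although fixing $p_0$ and $p_1$ will generally make the curve look more decisive than a full posterior analysis would.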

The preceding model was also run on each of the 50 American League regulars. The left side of Figure 3 summarizes the posterior distributions of the differences in hot and cold probabilities $p_1 - p_0$ for all the players. Each posterior distribution is represented by a horizontal line, with the starred points corresponding to the posterior median and the endpoints to the lower and upper quartiles of the distribution. Note that practically all of the distributions have medians in the .02-.05 range and that only one hitter, Carney Lansford, has a difference in probabilities greater than .10. Recall that the prior median of $p_1 - p_0$ was equal to .04, and only 8 of the 50 hitters have a posterior median greater than .05.

Figure 2. Posterior Probabilities of Hot Plotted as a Function of Game Number for Three Players. [Panels: 1988 C. Lansford; 1988 D. Evans; 1988 C. Lemon. Horizontal axis: game number, 0 to 140; vertical axis from 0 to 1.]

Figure 3. Posterior Distributions of Difference in Hot and Cold Probabilities for 50 1988 American League Regulars and 50 "Random" Hitters. [Two panels. Horizontal axis from 0 to 0.2; vertical axis from 0 to 50.]

Next suppose that the hitting sequence of a player was a Bernoulli process with a constant probability of success. Is it possible for an outcome from such a random process to look streaky using this model? To gain some insight into this question, 50 Bernoulli sequences of hits were simulated. For each player, a simulation was run using the player's actual at-bat sequence with a probability of success equal to the player's season batting average. The posterior distributions of $p_1 - p_0$ from these 50 "random" hitters are displayed at the right side of Figure 3. It is interesting to note that three of these random sequences produced a posterior median of $p_1 - p_0$ equal to .10; however, none of the data sets were as streaky as Lansford in 1988. So it appears that one has to be careful in interpreting an up and down season for a baseball hitter, because random data can display similar patterns.
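The "random hitter" construction can be sketched as follows. The simulation step mirrors the description above: keep the player's actual at-bat counts and draw hits with a constant probability equal to the season batting average. The summary statistic, the spread of the moving average, is a crude stand-in chosen for this sketch and is not the posterior of $p_1 - p_0$ used in the analysis; all function names are hypothetical.

```python
import random

def simulate_constant_p_seasons(hits, at_bats, n_sims=50, seed=0):
    """Simulate seasons that reuse the real at-bat counts but draw hits with
    a constant probability equal to the season batting average."""
    rng = random.Random(seed)
    p = sum(hits) / sum(at_bats)
    seasons = []
    for _ in range(n_sims):
        season = [sum(1 for _k in range(n) if rng.random() < p)
                  for n in at_bats]
        seasons.append(season)
    return p, seasons

def moving_average_spread(hits, at_bats, d=4):
    """Max minus min of the d-game moving batting average (a rough
    streakiness summary; windows with no at-bats are skipped)."""
    avgs = []
    for t in range(len(hits) - d + 1):
        ab = sum(at_bats[t:t + d])
        if ab > 0:
            avgs.append(sum(hits[t:t + d]) / ab)
    return max(avgs) - min(avgs)
```

Comparing the spread observed for a real season with the spreads of its simulated counterparts shows how much up-and-down movement a constant-probability hitter produces by chance alone.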

5. CONCLUDING REMARKS

From Albright's work, it appears that it is difficult to distinguish the hitting records of the 500 players from the results of 500 sequences of coin-tossing experiments where the coins have different probabilities of success. But I believe that it is wrong to conclude from this analysis that streakiness is not present in baseball data. Instead we should recognize that streakiness, like other situational variables, is a subtle characteristic of data. It is especially hard to observe because we have distorted views of the streakiness present in random data. Because it is a subtle characteristic, we need to explicitly define what we mean by streakiness and develop models that incorporate the particular definition.

Here we define streakiness by the obvious hills and valleys that we observe in a moving average plot of a player's batting average. The Markov switching model is a first approximation for modeling this particular notion of streakiness. Certainly, one could likely improve this model, say by incorporating more than two levels of hitting ability. But the use of this model has attractive features. One can explicitly talk about the degree of streakiness or how hot a given player was on a particular day using parameters of the model. The model seems to set apart streaky players, such as Lansford in 1988, very well.

Streakiness is a characteristic of data that is not well understood by many people and is difficult to detect statistically. Thus I would hope that more alternative models could be developed to aid in this understanding and detection. Maybe Mike Schmidt was streaky, but we haven't developed the right tools to detect his streakiness.

REFERENCES

Albert, J. (1992), "Applying a Markov Switching Model to Baseball Hitting Data," technical report, Bowling Green State University, Dept. of Mathematics and Statistics.

Albert, J. H., and Chib, S. (1993), "Bayes Inference via Gibbs Sampling of Autoregressive Time Series Subject to Markov Mean and Variance Shifts," Journal of Business & Economic Statistics, 11, 1-15.

James, B. (1986), The Bill James Baseball Abstract, New York: Ballantine Books.

Lindley, D. V., and Smith, A. F. M. (1972), "Bayes Estimates for the Linear Model," Journal of the Royal Statistical Society, B, 34, 1-41.
