
Sex Bias in Graduate Admissions: Data from Berkeley

Measuring bias is harder than is usually assumed, and the evidence is sometimes contrary to expectation.

P. J. Bickel, E. A. Hammel, J. W. O'Connell

Determining whether discrimination because of sex or ethnic identity is being practiced against persons seeking passage from one social status or locus to another is an important problem in our society today. It is legally important and morally important. It is also often quite difficult. This article is an exploration of some of the issues of measurement and assessment involved in one example of the general problem, by means of which we hope to shed some light on the difficulties. We will proceed in a straightforward and indeed naive way, even though we know how misleading an unsophisticated approach to the problem is. We do this because we think it quite likely that other persons interested in questions of bias might proceed in just the same way, and careful exposure of the mistakes in our discovery procedure may be instructive.

Data and Assumptions

The particular body of data chosen for examination here consists of applications for admission to graduate study at the University of California, Berkeley, for the fall 1973 quarter. In the admissions cycle for that quarter, the Graduate Division at Berkeley received approximately 15,000 applications, some of which were later withdrawn or transferred to a different proposed entry quarter by the applicants. Of the applications finally remaining for the fall 1973 cycle 12,763 were sufficiently complete to permit a decision to admit or to deny admission. The question we wish to pursue is whether the decision to admit or to deny was influenced by the sex of the applicant. We cannot know with any certainty the influences on the evaluators in the Graduate Admissions Office, or on the faculty reviewing committees, or on any other administrative personnel participating in the chain of actions that led to a decision on an individual application. We can, however, say that if the admissions decision and the sex of the applicant are statistically associated in the results of a series of applications, we may judge that bias existed, and we may then seek to find whether discrimination existed. By "bias" we mean here a pattern of association between a particular decision and a particular sex of applicant, of sufficient strength to make us confident that it is unlikely to be the result of chance alone. By "discrimination" we mean the exercise of decision influenced by the sex of the applicant when that is immaterial to the qualifications for entry.

Dr. Bickel is professor of statistics, Dr. Hammel is professor of anthropology and associate dean of the Graduate Division, and Mr. O'Connell is a member of the data processing staff of the Graduate Division, at the University of California, Berkeley 94720.

The simplest approach (which we shall call approach A) is to examine the aggregate data for the campus. This approach would surely be taken by many persons interested in whether bias in admissions exists on any campus. Table 1 gives the data for all 12,763 applications to the 101 graduate departments and interdepartmental graduate majors to which application was made for fall 1973 (we shall refer to them all as departments). There were 8442 male applicants and 4321 female applicants. About 44 percent of the males and about 35 percent of the females were admitted. Just this kind of simple calculation of proportions impels us to examine the data further. We will pursue the question by using a familiar statistic, chi-square. As already noted, we are aware of the pitfalls ahead in this naive approach, but we intend to stumble into every one of them for didactic reasons.

We must first make clear two assumptions that underlie consideration of the data in this contingency table approach. Assumption 1 is that in any given discipline male and female applicants do not differ in respect of their intelligence, skill, qualifications, promise, or other attribute deemed legitimately pertinent to their acceptance as students. It is precisely this assumption that makes the study of "sex bias" meaningful, for if we did not hold it any differences in acceptance of applicants by sex could be attributed to differences in their qualifications, promise as scholars, and so on. Theoretically one could test the assumption, for example, by examining presumably unbiased estimators of academic qualification such as Graduate Record Examination scores, undergraduate grade point averages, and so on. There are, however, enormous practical difficulties in this. We therefore predicate our discussion on the validity of assumption 1.

Assumption 2 is that the sex ratios of applicants to the various fields of graduate study are not importantly associated with any other factors in admission. We shall have reason to challenge this assumption later, but it is crucial in the first step of our exploration, which is the investigation of bias in the aggregate data.

Tests of Aggregate Data

We pursue this investigation by computing the expected frequencies of male and female applicants admitted and denied, from the marginal totals of Table 1, on the assumption that men and women applicants have equal chances of admission to the university (that is, on the basis of assumptions 1 and 2). This computation, also given in Table 1, shows that 277 fewer women and 277 more men were admitted than we would have expected under the assumptions noted. That is a large number, and it is unlikely that so large a bias to the disadvantage of women would occur by chance alone. The chi-square value for this table is 110.8, and the probability of a chi-square that large (or larger) under the assumptions noted is vanishingly small. We should on this evidence judge that bias existed in the fall 1973 admissions. On that account, we should look for the responsible parties to see whether they give evidence of discrimination. Now, the outcome of an application for admission to graduate study is determined mainly by the faculty of the department to which the prospective student applies. Let us then examine each of the departments for indications of bias. Among the 101 departments we find 16 that either had no women applicants or denied admission to no applicants of either sex. Our computations, therefore, except where otherwise noted, will be based on the remaining 85. For a start let us identify those of the 85 with bias sufficiently large to occur by chance less than five times in a hundred. There prove to be four such departments. The deficit in the number of women admitted to these four (under the assumptions for calculating expected frequencies as given above) is 26. Looking further, we find six departments biased in the opposite direction, at the same probability levels; these account for a deficit of 64 men.

SCIENCE, VOL. 187

These results are confusing. After all, if the campus had a shortfall of 277 women in graduate admissions, and we look to see who is responsible, we ought to find somebody. So large a deficit ought not simply to disappear. There is even a suggestion of a surplus of women. Our method of examination must be faulty.

Some Underlying Dependencies

We have stumbled onto a paradox, sometimes referred to as Simpson's in this context (1) or "spurious correlation" in others (2). It is rooted in the falsity of assumption 2 above. We have assumed that if there is bias in the proportion of women applicants admitted it will be because of a link between sex of applicant and decision to admit. We have given much less attention to a prior linkage, that between sex of applicant and department to which admission is sought. The tendency of men and women to seek entry to different departments is marked. For example, in our data almost two-thirds of the applicants to English but only 2 percent of the applicants to mechanical engineering are women. If we cast the application data into a 2 × 101 contingency table, distinguishing department and sex of applicants, we find this table has a chi-square of 3091 and that the probability of obtaining a chi-square value that large or larger by chance is about zero. For the 2 × 85 table on the departments used in most of the analysis, chi-square is 3027 and the probability about zero. Thus the sex distribution of applicants is anything but random among the departments. In examining the data in the aggregate as we did in our initial approach, we pooled data from these very different, independent decision-making units. Of course, such pooling would not nullify assumption 2 if the different departments were equally difficult to enter. We will address ourselves to that question in a moment.

Table 1. Decisions on applications to Graduate Division for fall 1973, by sex of applicant: naive aggregation. Expected frequencies are calculated from the marginal totals of the observed frequencies under the assumptions (1 and 2) given in the text. N = 12,763, χ² = 110.8, d.f. = 1, P = 0 (18).

                                  Outcome
              Observed          Expected           Difference
Applicants    Admit   Deny      Admit    Deny      Admit    Deny
Men           3738    4704      3460.7   4981.3     277.3   -277.3
Women         1494    2827      1771.3   2549.7    -277.3    277.3
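The expected frequencies and the chi-square value reported in Table 1 can be checked with a short computation. The sketch below is only illustrative: the function name is hypothetical, and the use of a continuity (Yates) correction is an assumption, since the text does not say how the statistic was computed. With the correction the result comes out close to the reported 110.8.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square (1 d.f.) for the 2x2 table [[a, b], [c, d]].

    Returns (chi2, expected), where expected holds the frequencies
    implied by the marginal totals of the observed table.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    expected = [[r * k / n for k in col_totals] for r in row_totals]
    observed = [[a, b], [c, d]]
    chi2 = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            dev = abs(o - e)
            if yates:
                # Continuity correction: an assumption of this sketch,
                # not something stated in the article.
                dev = max(dev - 0.5, 0.0)
            chi2 += dev * dev / e
    return chi2, expected


# Table 1: rows are men and women, columns are admit and deny.
chi2, expected = chi_square_2x2(3738, 4704, 1494, 2827)
shortfall_women = expected[1][0] - 1494  # expected minus observed admits
print(round(shortfall_women, 1), round(chi2, 1))
```

Passing `yates=False` gives the uncorrected Pearson statistic, which is slightly larger; either way the value is far beyond any conventional significance threshold.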

Let us first examine an alternative to aggregating the data across the 85 departments and then computing a statistic, namely, computing a statistic on each department first and aggregating those. Fisher gives a method for aggregating the results of such independent experiments (3). If we apply his method to the chi-square statistics of the 85 individual contingency tables, we obtain a value that has a probability of occurrence by chance alone, that is, if sex and admission are unlinked for any major, of about 29 times in 1000 (4). Another common aggregation procedure, proposed to us in this context by E. Scott, yields a result having a probability of 6 times in 10,000 (5). This is consistent with the evidence of bias in some direction purportedly shown by Table 1. However, when we examine the direction of bias, the picture changes. For instance, if we apply Fisher's method to the one-sided statistics, testing the hypothesis of no bias or of bias in favor of women, we find that we could have obtained a value as large as or larger than the one observed, by chance alone, about 85 times in 100 (6).

Our first, naive approach of examining the aggregate data, computing expected frequencies under certain assumptions, computing a statistic, and deciding therefrom that bias existed in favor of men has now been cast into doubt on at least two grounds. First, we could not find many biased decision-making units by examining them individually. Second, when we take account of the differences among departments in the proportions of men and women applying to them and avoid this problem by computing a statistic on each department separately, and aggregating those statistics, the evidence for campus-wide bias in favor of men is extremely weak; on the contrary, there is evidence of bias in favor of women.

The missing piece of the puzzle is yet another fact: not all departments are equally easy to enter. If we cast the data into a 2 × 101 table, distinguishing department and decision to admit or deny, we find that this table has a chi-square value of 2195, with an associated probability of occurrence by chance (under assumptions 1 and 2) of about zero, showing that the odds of gaining admission to different departments are widely divergent. (For the 2 × 85 table chi-square is 2121 and the probability about zero.) Now, these odds of getting into a graduate program are in fact strongly associated with the tendency of men and women to apply to different departments in different degree. The proportion of women applicants tends to be high in departments that are hard to get into and low in those that are easy to get into. Moreover this phenomenon is more pronounced in departments with large numbers of applicants. Figure 1 is a scattergram of proportion of applicants that are women plotted against proportion of applicants that are admitted. The association is obvious on inspection although the relationship is certainly not linear (7). If we use a weighted correlation (8) as a measure of the relationship for all 85 departments in the plot we obtain ρ̂ = .56. If we apply the same measure to the 17 departments with the largest numbers of applicants (accounting for two-thirds of the total population of applicants) we obtain ρ̂ = .65, while the remaining 68 departments have a corresponding ρ̂ = .39. The significance of ρ̂ under the hypothesis of no association can be calculated. All three values obtained are highly significant.

The effect may be clarified by means of an analogy. Picture a fishnet with two different mesh sizes. A school of fish, all of identical size (assumption 1), swim toward the net and seek to pass. The female fish all try to get through the small mesh, while the male fish all try to get through the large mesh. On the other side of the net all the fish are male. Assumption 2 said that the sex of the fish had no relation to the size of the mesh they tried to get through. It is false. To take another example that illustrates the danger of incautious pooling of data, consider two departments of a hypothetical university, machismatics and social warfare. To machismatics there apply 400 men and 200 women; these are admitted in exactly equal proportions, 200 men and 100 women. To social warfare there apply 150 men and 450 women; these are admitted in exactly equal proportions, 50 men and 150 women. Machismatics admitted half the applicants of each sex, social warfare admitted a third of the applicants of each sex. But about 73 percent of the men applied to machismatics and 27 percent to social warfare, while about 69 percent of the women applied to social warfare and 31 percent to machismatics. When these two departments are pooled and expected frequencies are computed in the usual way (with assumption 2), there is a deficit of about 21 women (Table 2). A discrepancy in that direction that large or larger would be expectable less than 2 percent of the time by chance; yet both departments were seen to have been absolutely fair in dealing with their applicants.

Table 2. Admissions data by sex of applicant for two hypothetical departments. For total, χ² = 5.71, d.f. = 1, P = 0.19 (one-tailed).

                                        Outcome
                        Observed        Expected         Difference
Applicants              Admit  Deny     Admit   Deny     Admit   Deny
Department of machismatics
  Men                    200    200      200     200        0       0
  Women                  100    100      100     100        0       0
Department of social warfare
  Men                     50    100       50     100        0       0
  Women                  150    300      150     300        0       0
Totals
  Men                    250    300     229.2   320.8     20.8   -20.8
  Women                  250    400     270.8   379.2    -20.8    20.8

The creation of bias in our original situation is, of course, much more complex, since we are aggregating many tables. It results from an interaction of the three factors, choice of department, sex, and admission status, whose broad outlines are suggested by our plot but which cannot be described in any simple way.
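The two hypothetical departments make a convenient test case for seeing how pooling manufactures a deficit. The following sketch recomputes the figures in Table 2 from the counts given in the text. The helper name is hypothetical, and the continuity correction is an assumption of the sketch; it is included because it brings the pooled chi-square to the 5.71 shown in the table.

```python
def yates_chi_square(a, b, c, d):
    """Continuity-corrected chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# (admitted, denied) by sex, taken from the text.
departments = {
    "machismatics": {"men": (200, 200), "women": (100, 100)},
    "social warfare": {"men": (50, 100), "women": (150, 300)},
}

# Within each department the admission rate is the same for both sexes.
rates = {
    name: {sex: adm / (adm + den) for sex, (adm, den) in by_sex.items()}
    for name, by_sex in departments.items()
}

# Pool the two departments, as approach A would.
men_admit = sum(d["men"][0] for d in departments.values())
men_deny = sum(d["men"][1] for d in departments.values())
women_admit = sum(d["women"][0] for d in departments.values())
women_deny = sum(d["women"][1] for d in departments.values())

total = men_admit + men_deny + women_admit + women_deny
women_expected = (women_admit + women_deny) * (men_admit + women_admit) / total
deficit = women_expected - women_admit
pooled_chi2 = yates_chi_square(men_admit, men_deny, women_admit, women_deny)

print(rates)
print(round(deficit, 1), round(pooled_chi2, 2))
```

Each department taken alone shows no difference at all, while the pooled table shows a deficit of about 21 women, which is the whole of the paradox in miniature.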

In any case, aggregation in a simple and straightforward way (approach A) is misleading. More sophisticated methods of aggregation that do not rely on assumption 2 are legitimate but have their difficulties. We shall have more to say on this later.
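One of the aggregation methods mentioned earlier, Fisher's method for combining independent tests, is easy to sketch. It converts the per-department probabilities into a single statistic, minus twice the sum of their logarithms, which is referred to a chi-square distribution with twice as many degrees of freedom as there are tests. The function names and the example probabilities below are hypothetical; they are not the departmental values from the Berkeley data.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of a chi-square variate with even d.f.

    For even d.f. the tail has the closed form
    exp(-x/2) * sum_{j < df/2} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def fisher_combined(p_values):
    """Combine independent p-values; returns (statistic, combined p)."""
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return statistic, chi2_upper_tail_even_df(statistic, 2 * len(p_values))


# Hypothetical per-department probabilities, for illustration only.
example_p_values = [0.40, 0.08, 0.65, 0.03, 0.51]
statistic, combined_p = fisher_combined(example_p_values)
print(round(statistic, 2), round(combined_p, 4))
```

Applied to two-sided probabilities, the method detects departures in either direction; applied to one-sided probabilities, it tests for bias in one stated direction, which is the distinction drawn in the text.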

Fig. 1. Proportion of applicants that are women plotted against proportion of applicants admitted, in 85 departments. Size of box indicates relative number of applicants to the department. (Horizontal axis: percent women applicants.)

Disaggregation

The most radical alternative to approach A is to consider the individual graduate departments, one by one. However, this approach (which we may call approach B) also poses difficulties. Either we must sample randomly from the different departments, or we must take account of the probability of obtaining unusual sex ratios of admittees by chance in a number of simultaneously conducted independent experiments. That is, in examining 85 separate departments at the same time for evidence of bias we are conducting 85 simultaneous experiments, and in that many experiments the probability of finding some marked departures from expected frequencies "just by chance" is not insubstantial. The department with the strongest bias against admitting women in the fall 1973 cycle had a bias of sufficient magnitude to be expectable by chance alone only 69 times in 100,000. If we had selected that department for examination on a random basis, we would have been convinced that it was biased. But we did not so select it; we looked at 85 departments at once. The probability of finding a department that biased against women (or more biased) by chance alone in 85 simultaneous trials is about 57 times in 1000. Thus that particular department is not quite so certainly biased as we might have first believed, .057 being a very much larger number than .00069, although still a small enough probability to warrant a closer look. This department was the worst one in respect of bias against women in admissions; the probability of finding departments less biased by chance alone is of course greater than .057. We can also examine events in the other direction. The department most biased against men had a bias sufficiently large to be expectable by chance alone about 20 times in a million, and the chance of finding a department that biased (or more biased) in that direction by chance alone in 85 simultaneous trials (9) is about .002.
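The adjustment for simultaneous trials can be illustrated with a one-line calculation. If the 85 departmental tests are treated as independent, as the text describes them, the chance that at least one of them shows a deviation as extreme as a given single-test probability p is 1 − (1 − p)^85. The function name below is hypothetical, and the independence of the tests is the working assumption.

```python
def prob_at_least_one(p_single, trials):
    """Chance that at least one of `trials` independent tests is this extreme."""
    return 1.0 - (1.0 - p_single) ** trials


# Department most biased against women: 69 times in 100,000.
worst_against_women = prob_at_least_one(0.00069, 85)

# Department most biased against men: about 20 times in a million.
worst_against_men = prob_at_least_one(20e-6, 85)

print(round(worst_against_women, 3), round(worst_against_men, 3))
```

For small p the result is close to 85 times p, which is why a probability of .00069 for a single department grows to roughly .057 once all 85 are examined together.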

There is a further difficulty in approach B. Although it makes a great deal of sense to examine the individual departments that are in fact the independent decision-making entities in the graduate admissions process, some of them are quite small, and even in some that are of ordinary size the number of women applying is very small. Calculation of the probability of observed deviations from expected frequencies can be carried out for such units, but when the numbers involved are very small the evidence for deciding whether there is no bias or gross bias is really worthless (10). This defect is evident not only in approach B but also if we use some reasonable method of aggregation of test statistics to avoid the pitfalls of approach A such as that of Fisher, or even the approach we suggest below. That is, large biases in small departments or in departments with small numbers of women applicants will not influence a reasonable aggregate measure appreciably.

7 FEBRUARY 1975

Pooling

The difficulty we face is not only technical and statistical but also administrative. In some sense the campus is a unit. It operates under general regulations concerning eligibility for admission and procedures for admission. It is a social community that shares certain values and is subject to certain general influences and pressures. It is identifiable as a bureaucratic unit by its own members and also by external agencies and groups. It is, as a social and cultural unit, accountable to its various publics. For all these reasons it makes sense to ask the question, Is there a campus bias by sex in graduate admissions? But this question raises serious conceptual difficulties. Is campus bias to be measured by the net bias across all its constituent subunits? How does one define such a bias? For any definition, it is easy to imagine a situation in which some departments are biased in one direction and other departments in another, so that the net bias of the campus may be zero even though very strong biases are apparent in the subunits. Does one look instead at the outliers, those departments that have divergences so extreme as to call their particular practices into question? How extreme is extreme in such a procedure, and what does one do about units so small as to make such assessment meaningless?

We believe that there are no easy answers to these questions, but we are prepared to offer some suggestions. We propose that examination of campus bias must rest on a method of estimation of expected frequencies that takes into account the falsity of assumption 2 and the apparent propensity of women to apply to departments that are more difficult to enter.

We reanalyze Table 1, using all the data leading to it, by computing the expected frequencies differently than in approach A, since we now know the assumptions underlying that earlier computation to be false. We estimate the number of women expected to be admitted to a department by multiplying the estimated probability of admission of any applicant (regardless of sex) to that department by the number of women applying to it. Thus, if the chances of getting into a department were one-half for all applicants to it, and 100 women applied, we would expect 50 women to be admitted if they were being treated just like the men. We do this computation for each department separately, since each is likely to have a different probability of admission and a different number of women applying, and we sum the results to obtain the number of women expected to be admitted for the campus as a whole (11). This estimate proves to be smaller by 60 than the number of women observed to have been admitted (Table 3).
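The department-by-department computation just described can be sketched in a few lines. For each department the expected number of women admitted is the number of women applying times the overall admission rate of that department, and the differences between observed and expected are summed over departments. To turn the summed difference into a chi-square, the sketch divides its square by the sum of the per-department hypergeometric variances, which is the form used in the Cochran and Mantel-Haenszel test that the article cites as the source of its statistic. The function name and the input tuples are hypothetical; the counts are the two illustrative departments from the text, not the Berkeley data.

```python
def pooled_bias_statistic(tables):
    """Sum observed-minus-expected women admitted over departments.

    Each table is (a, b, c, d): men admitted, men denied,
    women admitted, women denied. Returns (diff, chi2) with 1 d.f.
    """
    diff = 0.0
    variance = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # the variance term is undefined for a single applicant
        expected_women = (c + d) * (a + c) / n
        diff += c - expected_women
        variance += (a + b) * (a + c) * (c + d) * (b + d) / (n * n * (n - 1))
    chi2 = diff * diff / variance if variance > 0 else 0.0
    return diff, chi2


# Machismatics and social warfare as described in the text: both fair.
fair = [(200, 200, 100, 100), (50, 100, 150, 300)]
print(pooled_bias_statistic(fair))

# A single hypothetical department that admits 60 of 100 men
# but only 40 of 100 women.
print(pooled_bias_statistic([(60, 40, 40, 60)]))
```

Because expected counts are formed inside each department, differences in how hard departments are to enter no longer leak into the campus-wide figure, which is the defect of approach A.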

The computation of Table 3 is as follows: For a four-cell contingency table of the following format:

            Admit    Deny
Men         a_i      b_i
Women       c_i      d_i

the particular cell of interest is $c_i$, containing the number of women admitted. The expected frequency under the hypothesis of no bias is $E = w_i p_i = (c_i + d_i)(a_i + c_i)/N_i$, where $N_i$ is the total of applicants to department $i$. The observed number, $O$, is the number in $c_i$. The difference between these two quantities, $O - E$, summed over $n$ departments is

$$\sum_{i=1}^{n} (O - E) = \mathrm{DIFF}$$

Then,

$$\chi^2 = \frac{(\mathrm{DIFF})^2}{\sum_{i=1}^{n} (a_i + b_i)(a_i + c_i)(c_i + d_i)(b_i + d_i) \,/\, N_i^2 (N_i - 1)}$$

with d.f. = 1. Ninety-six departments were included in the computation, since 5 of the total 101 each had only 1 applicant. If $N_i - 1$ is replaced by $N_i$ in the denominator, all 101 departments can be included, yielding χ² = 8.61; O − E remains 60.1 and the expected and observed female admittees are each increased by 1. (This statistic makes it possible to include contingency tables having an empty cell, so that no information is lost; there is thus an advantage over methods that pool the chi-square values from a set of contingency tables.)

The probability that an observed bias this large or larger in favor of women might occur by chance alone (under these new assumptions) is .0016; the probability of its occurring if there were actual discrimination against women is, of course, even smaller. This is consistent with what we found using Fisher's approach and aggregating the test statistics: there is evidence of bias in favor of women. [The test used here was proposed in another context by Cochran (12) and Mantel and Haenszel (13).]

We would be remiss if we did not point out yet another pitfall of approach A. Whereas the highly significant values of the Mantel-Haenszel or Fisher statistics just mentioned for 1973 are evidence that there is bias in favor of women, the low values obtained in other years (see below) do not indicate that every department was operating more or less without bias. Such low values could equally well arise as a consequence of cancellation. We illustrate with the hypothetical departments of machismatics and social warfare. If machismatics admitted 250 men and 50 women, creating a shortfall of 50 women, while social warfare admitted 200 women and no men, creating an excess of 50 women, the aggregate measure of bias we have introduced would be zero. We only argue that if an aggregate measure of bias is wanted the one we propose is reasonable. Of course, if we combine two-sided statistics by the Fisher method this phenomenon does not occur.

We would conclude from this examination that the campus as a whole did not engage in discrimination against women applicants. This conclusion is strengthened by similarly examining the data for the entire campus for the years 1969 through 1973. In 1969 the number of women admitted exceeded the expected frequency by 24; the probability of a deviation of this size or larger in either direction by chance alone is .196. In 1970 there were four fewer women admitted than expected, the probability of chance occurrence being .833. In 1971 there were 25 more women than expected, with a probability of .249. In 1972 there were seven more women than expected, the probability being .709. For 1973 as shown above the deviation was an excess of 60 women over the expected number: the probability of a chance deviation that large or larger in either direction is .003. These data suggest that there is little evidence of bias of any kind until 1973, when it would seem significant evidence of bias appears, in favor of women. This conclusion is supported by all the other measures we have examined. For instance, pooling the chi-square statistics by Fisher's method yields a probability of .99 in 1969, 1970, and 1971, a probability of .55 in 1972, and a probability of .029 in 1973 (14).

Table 3. Sum of expected departmental outcomes of women's applications compared with sum of observed outcomes, Graduate Division, Berkeley, fall 1973. χ² = 8.55, d.f. = 1, P = .003 (two-tailed).

Expected female admittees     1432.9
Observed female admittees     1493.0
Difference (O − E)              60.1

We may also take approach B and look for individual department outliers. Because the numbers of women students applying to some of them in any one year are often small, we aggregated the data for each department over the 5-year span, using the method just explained. (This procedure of course hides the kind of change that the aggregating approach reveals when pursued through time, but it enables us to focus on possible "offenders" in either direction in a campus that is on the average behaving itself.) During the 5-year period there were 94 units that had at least one applicant of each sex and admitted at least one applicant and denied admission to at least one in at least one year. Two of the 94 units, one in the humanities and one in the professions, show a divergence from chance expectations sufficient to arouse interest. One of these admitted 16 fewer women than expected over 5 years, a shortfall of 29 percent; the probability of such a result by chance alone in 94 trials is about .004. The other unit admitted 40 fewer women than expected over the 5-year period, a shortfall of 7 percent, with a probability in 94 trials of about .019. The next most likely result by chance was at a level of .094 and the next after that at .188. Conversely there were two units significantly biased in the opposite direction, with chance probabilities of occurrence of .033 and .047, accounting for a combined shortfall of 50 men, 13 and 24 percent respectively of the expected frequencies in the individual units.

The kinds of statistics we may wish to use in examination of individual departments may differ from those employed in these general screening processes. For example, in one of the cases of a shortfall of women cited above, it seems likely that an intensified drive to recruit minority group members caused a temporary drop in the proportion of women admitted, since most of the minority group admittees were males. In most of the cases involving favored status for women it appears that the admissions committees

were seeking to overcome long-estab-lished shortages of women in theirfields. Overall, however, it seems thatthe admissions procedure has beenquite evenhanded. Where there are di-vergences from the expected frequen-cies they are usually small in magni-tude (although they may constitute asubstantial proportion of the expectedfrequency), and they more frequentlyfavor women than discriminate againstthem.
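The cancellation pitfall described earlier in this section (machismatics and social warfare) can be made concrete with a short sketch. The admission counts are those given in the text; the applicant counts are assumptions chosen only so that the stated shortfall and excess of 50 women come out, and all function names are illustrative. The sketch compares the pooled observed-minus-expected measure with a Fisher combination of two-sided departmental chi-square tests; it is a minimal illustration under those assumptions, not the authors' program.

```python
import math

# Admission counts come from the text; applicant counts are ASSUMED so that
# the shortfall and excess of 50 women stated in the text are reproduced.
DEPARTMENTS = {
    "machismatics": {"men_applied": 600, "women_applied": 300,
                     "men_admitted": 250, "women_admitted": 50},
    "social warfare": {"men_applied": 100, "women_applied": 300,
                       "men_admitted": 0, "women_admitted": 200},
}

def excess_women(dept):
    """Observed minus expected women admitted, where the expected number
    applies the department's overall admission rate to its women applicants."""
    applicants = dept["men_applied"] + dept["women_applied"]
    admitted = dept["men_admitted"] + dept["women_admitted"]
    expected = dept["women_applied"] * admitted / applicants
    return dept["women_admitted"] - expected

def chi_square_2x2(dept):
    """Usual (uncorrected) chi-square statistic for the sex-by-decision table.
    Assumes no row or column total is zero."""
    a = dept["men_admitted"]
    b = dept["men_applied"] - a          # men denied
    c = dept["women_admitted"]
    d = dept["women_applied"] - c        # women denied
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def chi2_upper_tail_1df(x):
    """Upper-tail probability of chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of chi-square with an even number of d.f.,
    using the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def fisher_combined_p(p_values):
    """Fisher's statistic -2 * sum(ln p), referred to chi-square with 2n d.f."""
    f = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_upper_tail_even_df(f, 2 * len(p_values))

pooled_excess = sum(excess_women(d) for d in DEPARTMENTS.values())
dept_p_values = [chi2_upper_tail_1df(chi_square_2x2(d))
                 for d in DEPARTMENTS.values()]
combined_p = fisher_combined_p(dept_p_values)

if __name__ == "__main__":
    print("pooled excess of women:", pooled_excess)
    print("Fisher combined P value:", combined_p)
```

With these made-up applicant counts the pooled excess cancels to zero, while the Fisher combination of the two-sided departmental tests gives a very small probability, which is the contrast the text draws.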

More General Issues

We have already explained why assumption 1 (the equivalence of academic qualifications of men and women applicants) is necessary to the statistical examination of bias in admissions. But the assumption is clearly false in its most extensive sense; there are areas of graduate study that men and women simply have not hitherto been equally prepared to enter. One of the principal differentiators is preparation in mathematics, which is prerequisite in an elaborate stepwise fashion to a number of fields of graduate endeavor (15).

This differentiation would have little effect on women's chances to enter graduate school if it were unrelated to difficulty of entry. But it is not. Although it would appear in a logical sense that the departments requiring more mathematics would be more difficult to enter, in fact it appears to be those requiring less mathematics that are the more difficult. (For the 83 graduate programs with matching undergraduate majors, the Pearson r between proportion of applicants admitted and number of recommended or required undergraduate units in mathematics or statistics is .38.) In part this may be because departments requiring less mathematics receive applications from persons who might have preferred to enter others but cannot for lack of mathematical (or similar) background, as well as from persons intrinsically inclined toward nonmathematical subjects. In part it is because in the nonmathematical subjects (that is, the humanities and social sciences) students take longer to get through their programs; in consequence, those departments have lower throughput and thus less room, annually, to accept new students. Just why this is so is a matter of debate and of great complexity. Some of the problem may lie in the very lack of a chain of prerequisites such as that characterizing graduate work in, let us say, the physical sciences. Some may lie in the nature of the subject matter and the intractability of its data and the questions asked of the data. Some may lie in the less favorable career opportunities of these fields and in consequence a lower pull from the professional employment market. Some may lie just in the higher proportion of women enrolled and the possibility that women are under less pressure to complete their studies (having alternative options of social roles not open generally to men) and have less favorable employment possibilities if they do complete, so that the pull of the market is less for them. Whatever the reasons, the lower productivity of these fields is a fact, and it crowds the departments in them and makes them more difficult to enter.

The absence of a demonstrable bias in the graduate admissions system does not give grounds for concluding that there must be no bias anywhere else in the educational process or in its culmination in professional activity. Our intention has been to investigate the general case for bias against women in a specific matter, admission to graduate school, not only because we had the data base to do so but also because allegations of bias in the admissions process had been aired. Our approach in the beginning was naive, as befits an initial investigation. We found that even the naive question could not be answered adequately without recourse to sophisticated methodology and careful examination of underlying processes. We take this opportunity to warn all those who are concerned with problems of bias about these methodological complexities (16).

We also find, beyond this immediate area of concern in graduate admissions, that the questions of bias and discrimination are more subtle than one might have imagined, and we mean this in more than just the methodological sense. If prejudicial treatment is to be minimized, it must first be located accurately. We have shown that it is not characteristic of the graduate admissions process here examined (although this judgment does not eliminate the possibility of individual cases of prejudicial treatment, and it does not deal with politically or morally defined null hypotheses). The fairness of the faculty in admissions is an important foundation for further effort. That effort can be made directly by universities in seeking to equalize the progress of men and women toward their degrees (17). A university can use its powers of suasion to equalize the preparation of girls and boys in the primary and secondary schools for entry into all academic fields. By its own objective research it may be able to determine where and how much bias and discrimination exist and what the suitable corrective measures may be.
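The Pearson r quoted earlier in this section (between the proportion of applicants admitted and the number of undergraduate mathematics or statistics units a program expects) is an ordinary product-moment correlation across programs. The sketch below shows the calculation on six invented programs; the unit counts and admission proportions are hypothetical placeholders, not the 83 Berkeley programs, so the resulting value is not meant to reproduce .38.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# HYPOTHETICAL programs: required mathematics units and proportion admitted.
math_units = [0, 3, 6, 9, 12, 15]
proportion_admitted = [0.20, 0.35, 0.30, 0.45, 0.40, 0.55]

r = pearson_r(math_units, proportion_admitted)

if __name__ == "__main__":
    print("Pearson r:", round(r, 3))
```

A positive value means that, in the sample, programs expecting more mathematics tend to admit a larger share of their applicants, which is the direction of association the text reports.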

Summary

Examination of aggregate data on graduate admissions to the University of California, Berkeley, for fall 1973 shows a clear but misleading pattern of bias against female applicants. Examination of the disaggregated data reveals few decision-making units that show statistically significant departures from expected frequencies of female admissions, and about as many units appear to favor women as to favor men. If the data are properly pooled, taking into account the autonomy of departmental decision making, thus correcting for the tendency of women to apply to graduate departments that are more difficult for applicants of either sex to enter, there is a small but statistically significant bias in favor of women. The graduate departments that are easier to enter tend to be those that require more mathematics in the undergraduate preparatory curriculum. The bias in the aggregated data stems not from any pattern of discrimination on the part of admissions committees, which seem quite fair on the whole, but apparently from prior screening at earlier levels of the educational system. Women are shunted by their socialization and education toward fields of graduate study that are generally more crowded, less productive of completed degrees, and less well funded, and that frequently offer poorer professional employment prospects.
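The reversal summarized above, in which aggregate rates suggest bias against women while department-level comparisons do not, can be reproduced with two invented departments. All counts below are hypothetical and chosen only to show the mechanism: most women apply to the department that is harder for everyone to enter. The names are illustrative, and the pooled measure is the sum of observed minus expected women admitted, computed department by department.

```python
# HYPOTHETICAL counts, not the Berkeley data: (applicants, admitted) by sex.
departments = {
    "easier to enter": {"men": (800, 480), "women": (200, 130)},
    "harder to enter": {"men": (200, 40), "women": (800, 200)},
}

def department_rate(name, sex):
    """Proportion of applicants of one sex admitted by one department."""
    applied, admitted = departments[name][sex]
    return admitted / applied

def aggregate_rate(sex):
    """Proportion admitted when all departments are lumped together."""
    applied = sum(d[sex][0] for d in departments.values())
    admitted = sum(d[sex][1] for d in departments.values())
    return admitted / applied

def pooled_excess_women():
    """Sum over departments of observed minus expected women admitted,
    where expected uses each department's own overall admission rate."""
    total = 0.0
    for d in departments.values():
        applied = d["men"][0] + d["women"][0]
        admitted = d["men"][1] + d["women"][1]
        expected_women = d["women"][0] * admitted / applied
        total += d["women"][1] - expected_women
    return total

if __name__ == "__main__":
    for name in departments:
        print(name, "men:", department_rate(name, "men"),
              "women:", department_rate(name, "women"))
    print("aggregate men:", aggregate_rate("men"),
          "women:", aggregate_rate("women"))
    print("pooled excess of women:", pooled_excess_women())
```

In this toy example women are admitted at the higher rate within each department and the pooled excess is positive, yet the aggregate admission rate for women is the lower one, because the aggregate comparison ignores where each sex applied.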

References and Notes

1. C. R. Blyth, J. Am. Stat. Assoc. 67, 364 (1972).

2. J. Neyman, Lectures and Conferences on Mathematical Statistics and Probability (U.S. Department of Agriculture Graduate School, Washington, D.C., ed. 2, 1952), p. 147.

3. R. A. Fisher, Statistical Methods for Research Workers (Oliver and Boyd, London, ed. 4, 1932).

4. Fisher's statistic is

$$F = -2 \sum_{i=1}^{n} \ln p(T_i)$$

where $p(T_i)$ is the P value of the test statistic calculated for the $i$th experiment (department). $F$ is referred to the upper tail of a chi-square distribution with $2n$ degrees of freedom where $n$ = number of experimental results to be aggregated, here 85. In our application here, $T_i$ is the usual contingency table chi-square statistic, with P value obtained from a table of the chi-square distribution with 1 degree of freedom.

5. This method uses as a statistic

$$\sum_{i=1}^{n} \chi_i^2$$

having a chi-square distribution with d.f. = $n$ (= 85 here). $\chi_i^2$ is the usual $\chi^2$ statistic in the $i$th 2 × 2 table.

6. In this application of Fisher's statistic (4), $T_i$ is the signed square root of the chi-square statistic, with sign plus if there is an excess of men admitted and sign minus otherwise; the P value is the probability of a standard normal deviate exceeding $T_i$.

7. Transformation to linearity by simple changes of variable, for example to log (odds), is also not successful.

8. If $\pi_i$, $p_i$, and $p'_i$ represent, respectively, the probability of applying to department $i$, the probability of being admitted given that application is to department $i$, and the probability of being a male given that application is to department $i$, then a reasonable measure of the association of the numbers $p_i$, $p'_i$ is the correlation (weighted according to the share of each major in the applicant pool)

$$\rho = \frac{\sum_i \pi_i (p_i - p_\cdot)(p'_i - p'_\cdot)}{\left[\sum_i \pi_i (p_i - p_\cdot)^2 \, \sum_i \pi_i (p'_i - p'_\cdot)^2\right]^{1/2}}$$

where $p_\cdot$, $p'_\cdot$ are defined by $\sum_i \pi_i p_i$, $\sum_i \pi_i p'_i$, respectively. As usual, $|\rho| = 1$ indicates linear dependence between the $p_i$, $p'_i$ while $\rho = 0$ suggests "no relation." Positive values indicate "positive association" and so on.

This correlation can be estimated by substituting the observed proportions of applicants to department $i$, admitted applicants to department $i$ among applicants to department $i$, and male applicants to department $i$ among applicants to department $i$ for $\pi_i$, $p_i$, and $p'_i$, respectively. This is the statistic we call $\hat{\rho}$.

We can use $\hat{\rho}$ as a test statistic for the hypothesis that $\rho = 0$. To do so we need the distribution of $\hat{\rho}$ under that hypothesis. It turns out that $\hat{\rho} / (\mathrm{Var}\,\hat{\rho})^{1/2}$ has approximately a standard normal distribution. The expression $\mathrm{Var}\,\hat{\rho}$ is complicated because of the statistical dependence between the estimated $p_i$ and $p'_i$. Editorial considerations have prompted its deletion. It is obtainable from the authors.

9. The probability that an observation as extreme as (or more extreme than) the most extreme one would occur by chance alone, where $n$ = number of simultaneous independent experiments or observations, and $p$ = probability of occurrence by chance of the most extreme observation if it had been selected at random for a single observation, is $1 - (1 - p)^n$, and this for $p$ close to zero is approximately $np$.

10. Smallness of numbers of women applicants also invalidates the normal approximation used in the significance probabilities of approach B, but this can be remedied.

11. This may be expressed as

$$\sum_{i=1}^{85} w_i \, p_i$$

where $w_i$ is the number of women applying to the $i$th major and $p_i$ is the probability of entry of any applicant into the $i$th major, the latter being estimated from the number of admittees divided by the number of applicants.

12. W. G. Cochran, Biometrics 10, 417 (1954).

13. N. Mantel, J. Am. Stat. Assoc. 58, 690 (1963).

14. Further analysis of these data, in particular examination of individual units through time, is in progress.

15. Research currently being conducted by L. Sells at Berkeley shows how drastic this screening process is, particularly with respect to mathematics.

16. There is a real danger in naive determination of bias when the action following positive determination is punitive. On the basis of Table 1, which we have now shown to be misleading, regulatory agencies of the federal government would have felt themselves justified in withholding substantial amounts of research funding from the university. A further danger in punitive action of this kind is that, being concentrated in the research area, which provides an important source of support for graduate students, it punishes not only male but also female students: women in areas in which women have traditionally been enrolled, such as the social sciences, and also pioneering women in the physical and biological sciences, where federal support has been more concentrated.

17. In fact, data in hand at Berkeley suggest a dramatic decrease in the early dropout rates of women and the disappearance of the differential in dropout rates of men and women. It will be several years before we will be able to judge whether this phenomenon is one of decreased or simply of delayed attrition.

18. If the same naive aggregation is carried out for the 85 departments used in most of the analysis, N = 12,654, $\chi^2 = 105.6$, d.f. = 1, P = 0.

19. The investigation was initiated by E.A.H., using data retrievable from a computerized system developed by V. Aldrich. Advice on statistical procedures in the later stages of the investigation was provided by P.J.B., and programming and other computation was done by J.W.O'C.

Crisis Management: Some Opportunities

International emergency cooperation involving governments, technology, and science is now foreseeable.

Robert H. Kupperman, Richard H. Wilcox, Harvey A. Smith

Many alarming trends of our present culture share common roots. Worldwide inflation, worldwide resource shortages, extensive famine, and the inexorable quest for more deadly weapons may very well reach crisis proportions if these trends continue. They serve already as examples of national and international failures of efficient resource allocation and communications. It is important that we understand the possible future implications that these failures hold and, more important, that we develop means for dealing with them.

In discussing the crisis management demanded by such situations it is tempting to start by defining what is meant by a crisis, but this is a difficult matter. Crises are matters of degree, being emotionally linked to such subjective terms as calamity and emergency. In fact it is not necessary to define crises in order to discuss problems generally common to their management, including the paucity of accurate information, the communications difficulties that persist, and the changing character of the players as the negotiations for relief leave one or more parties dissatisfied.

Dr. Kupperman is Chief Scientist and Mr. Wilcox is Chief of Military Affairs of the U.S. Arms Control and Disarmament Agency, Washington, D.C. 20451. Dr. Smith is Professor of Mathematics at Oakland University, Rochester, Michigan 48063.

In a sense, crises are unto the beholder. What is a crisis to one individual or group may not be to another. However, crises are generally distinguished from routine situations by a sense of urgency and a concern that problems will become worse in the absence of action. Vulnerability to the effects of crises lies in an inability to manage available resources in a way that will alleviate the perceived problems tolerably. Crisis management, then, requires that timely action be taken both to avoid or mitigate undesirable developments and to bring about a desirable resolution of the problems.

Crises may arise from natural causes or may be induced by human adversaries, and the nature of the management required in response differs accordingly. Thus the actions required to limit physical damage from a severe hurricane and to expedite recovery from it differ substantially from the tactics needed to minimize the economic effects of a major transportation strike and to moderate the conditions which caused it. Yet each also exhibits some characteristics of the other. For example, recovery from the devastation wrought by the hurricane's wind and floodwaters brings competition among different managers whose conceptions of recovery differ: Is the goal to reestablish the status quo, including slums, or to seize upon the opportunity for urban renewal? Similarly, a transportation strike may cause such economic chaos that the Congress (535 crisis managers) might threaten to pass laws that are detrimental to a union leadership's prestige and control over its members.

It is useful to note the characteristics common to most crisis management. Perhaps the most frustrating is the uncertainty concerning what has happened or is likely to happen, coupled with a strong feeling of the necessity to take some action anyway "before it is too late." This leads to an emphasis on garnering information: military commanders press their intelligence staffs, and civil leaders try to get more out of their field personnel and management information systems. Unfortunately, few conventional information systems are equal to the task of covering unconventional situations, so managers in a crisis must frequently fall back upon experience, intuition, and bias to make ad hoc decisions (1).

The problems of uncertainty are exacerbated by the dynamic nature of many crises. Storms follow unpredictable courses; famine is affected by vagaries in the weather; terrorists perform apparently irrational acts; and foreign leaders, responding to different value systems or simply interpreting situations differently, select unexpected courses of action. Thus, with limited information and resources the manager may find it difficult just to keep up with rapid developments, let alone improve the overall picture of the situation.

During a crisis, not only does an involved manager suffer from poor information, but he has the problem of identifying the objectives he wishes to accomplish and ordering them by priority in accord with his limited re-
