
Educational Administration Quarterly 1–35

© The Author(s) 2016

Reprints and permissions: sagepub.com/journalsPermissions.nav

DOI: 10.1177/0013161X16638430

eaq.sagepub.com

Review

Gender Differences in Instructional Leadership: A Meta-Analytic Review of Studies Using the Principal Instructional Management Rating Scale

Philip Hallinger1,2, Li Dongyu3, and Wen-Chung Wang3

Abstract

Purpose: Instructional leadership has assumed steadily increasing importance within the general role set of principals over the past 60 years. One persisting finding within this corpus of studies concerns the consistently higher ratings obtained by female principals on instructional leadership when compared with their male counterparts. This article employed meta-analysis first to test if there are significant differences in perceptions of the instructional leadership practices of male and female principals. Then, the results were further analyzed to describe the nature of the differences revealed in the first-stage analysis. Method: The database for the meta-analysis consisted of 40 data sets drawn from 28 studies that had used the Principal Instructional Management Rating Scale (PIMRS) in studies of gender and instructional leadership. The data sets comprised perception data collected variously from principals and teachers on more than 2,000 principals between 1983 and 2014. The data were analyzed at three different construct levels measured by the PIMRS instrument: PIMRS Total Score, three dimensions, and 10 functions. Results: The meta-analysis yielded statistically significant gender differences in instructional leadership (Cohen’s d = .288, p < .01) when the results of all studies were combined. The direction of the findings suggested that female principals engaged in more active instructional leadership than male counterparts. Implications: The gender differences in instructional leadership are consistent with the broader results of the meta-analytic literature on gender and leadership. Implications are discussed for both research and practice.

1 Chulalongkorn University, Bangkok, Thailand
2 University of Johannesburg, Johannesburg, South Africa
3 Hong Kong Institute of Education, Tai Po, Hong Kong

Corresponding Author: Philip Hallinger, Faculty of Education, Chulalongkorn University, 254 Phayathai Road, Bangkok 10330, Thailand. Email: [email protected]

Keywords: instructional leadership, gender, principal, school leadership, meta-analysis, PIMRS

In 2015, the requirement for principals to assume central responsibility for instructional leadership pervades education systems throughout the world (Hallinger & Lee, 2013; Hallinger & Wang, 2015; Robinson, Lloyd, & Rowe, 2008). A key line of empirical inquiry in educational leadership has focused on identifying how “personal characteristics” of principals influence their role behavior (e.g., Bossert, Dwyer, Rowan, & Lee, 1982; Goldring, Huff, May, & Camburn, 2008; Hallinger, 2011; Hallinger & Heck, 1996; Leithwood, Begley, & Cousins, 1990; Leithwood & Jantzi, 2008). One principal characteristic that has maintained the interest of scholars for more than 50 years is gender (Charters & Jovick, 1981; Collard, 2001; Frasher & Frasher, 1979; Krüger, 1996, 2008; Shakeshaft, 2006).

Interest in how gender shapes leadership is linked to both equity and instrumental concerns (Krüger, 1996, 2008; Pounder & Coleman, 2002; Shakeshaft, 2006). In summarizing the early literature in this domain, Eagly, Karau, and Johnson (1992) concluded that females tend to employ a more participatory and task-focused style of leadership than male principals. Moreover, they observed that

occupying a role that is congenial in gender-relevant terms may have gains for school leaders in terms of some increase in their tendency to organize activities to accomplish relevant tasks. As school principals, women may encounter role expectations that are especially congenial with their own gender role. (p. 92)

Subsequent research has focused more specifically on gender differences in the enactment of instructional leadership (Hallinger, 2011). Although scholars have suggested that female principals may be more active instructional leaders, this latter generation of studies has never been systematically reviewed (Hallinger, 2011; Hallinger & Murphy, 1985; Krüger, 2008).




In this review of research, we sought to determine if there were differences in the levels and patterns of instructional leadership practiced by male and female principals. We used meta-analysis to synthesize results from a corpus of 40 independent data sets drawn from 28 studies.1 All the studies had employed the Principal Instructional Management Rating Scale (PIMRS; Hallinger, 1982/1990) for data collection. Our sample included data collected from more than 6,000 teachers and 2,800 principals in three countries.

Meta-analysis is playing an increasingly important role in demarking knowledge accumulation in educational leadership (e.g., Hallinger, Wang, & Chen, 2013; Leithwood & Sun, 2012; Robinson et al., 2008; Scheerens, 2012; Sun & Leithwood, 2015; Witziers, Bosker, & Kruger, 2003). Meta-analysis offers a more systematic means of integrating findings from a body of empirical studies than traditional methods of research review and is, therefore, increasingly prominent in policy-oriented research (Hallinger, 2014; Lipsey & Wilson, 2001). Thus, we undertook this study in the hope that our findings would advance a topic of continuing interest in our field.

Theoretical Perspective

We begin by providing an overview of research on principal instructional leadership. The study focuses specifically on studies that employed the PIMRS, so we also devote some attention to discussing its underlying conceptual framework. Finally, we provide an overview of previous research on the relationship between gender and principal instructional leadership.

General Perspectives on Principal Instructional Leadership

Among the global trends in educational leadership and management that emerged over the past 30 years, few have been more significant, widespread, or persistent than the focus on understanding linkages between leadership and learning (Bossert et al., 1982; Hallinger & Heck, 1996; Heck & Hallinger, 2014; Robinson et al., 2008; Sun & Leithwood, 2015; Witziers et al., 2003). The genealogy of scholarship on “instructional leadership” can be traced to the conceptual efforts of Bridges (1967) and the empirical research of Gross and Trask (1964) in the United States. However, it was the “effective schools movement” (Edmonds, 1979) that served as a catalyst for more concerted and sustained efforts to understand if and how principal leadership makes a difference in student learning (Bossert et al., 1982; Erickson, 1979; Hallinger & Murphy, 1985; Leithwood et al., 1990). For example, in 1979, Donald Erickson authored a pointed and, strange as it may seem today, controversial proposal for the field of educational administration.




Three years ago I opined that the most promising relevant work, largely ignored by scholars identified with “educational administration,” was the work on “school effects.” The literature during the last three years has further reinforced my dual conviction that “school effects” studies, broadly defined, represent the current leading edge in the research domain I am assessing, and that few scholars affiliated with “educational administration” are taking note of them, though nothing could be more profoundly pertinent than the school effects studies to the consequence of educational organization. (Erickson, 1979, p. 10)

With hindsight, Erickson’s judgments 30 years ago were nothing less than prescient. Reviews of research conducted during this watershed period resulted in a significant uptick in both the quantity and quality of studies of principal instructional leadership (e.g., Bossert et al., 1982; Bridges, 1982; Erickson, 1979; Leithwood & Montgomery, 1982). Among the most influential conceptualizations of leadership and learning to emerge from this era was the “instructional management model” developed by Bossert et al. (1982). They proposed that the effects of principal instructional management on learning were both moderated by characteristics of the principal and his/her environment and mediated by features of the school organization and learning climate.

Consistent with this model, other scholars identified a number of relevant personal characteristics of principals including gender, personality type, self-efficacy, years of teaching and administrative experience, prior training, and race (e.g., Bridges, 1982; Eagly et al., 1992; Erickson, 1979; Hallinger & Murphy, 1985; Leithwood & Montgomery, 1982). During the ensuing decades, these personal characteristics have been studied extensively in relation to their role in shaping the leadership behavior of school principals (e.g., Coleman, 2007; Goldring et al., 2008; Hallinger, 2011; Hallinger & Heck, 1996; Leithwood et al., 1990). The current study’s attempt to examine the relationship between gender and principal instructional leadership is located within this broader perspective on leadership and learning.

The PIMRS Conceptual Framework

Concurrent with publication of the Bossert model, Hallinger and Murphy (1985) developed a more explicit conceptual framework aimed at defining the instructional leadership role of the principal (see also Hallinger, Murphy, Weil, Mesa, & Mitman, 1983). The framework proposed three “dimensions” for this leadership role: Defines the School’s Mission, Manages the Instructional Program, and Develops a Positive School Learning Climate (see Figure 1). The first dimension, Defines the School’s Mission, centers on the principal’s role working to ensure that the school has a clear mission focused on the academic progress of students. Murphy and Torre (2014) assert that clear mission and goals represent “essential scaffolding” for school improvement. While this dimension does not assume that the principal defines the school’s mission alone, it does propose that the principal is responsible for ensuring that an academic mission exists and ensuring its effective communication to staff, students, and the community (Edmonds, 1979; Hallinger & Heck, 1996, 2002; Hallinger et al., 1983; Robinson et al., 2008; Scheerens, 2012; Sun & Leithwood, 2015).

The second dimension, Manages the Instructional Program, focuses on the principal’s role in “managing the technical core” of the school, that is, learning and teaching (Hallinger & Murphy, 1985). Although the principal must share and delegate many tasks involved in monitoring and developing the school’s instructional program, overall coordination remains a key leadership responsibility of the principal (Bossert et al., 1982; Edmonds, 1979; Hallinger & Heck, 1996; Hallinger & Murphy, 1985, 2012; Robinson et al., 2008). The third dimension, Develops a Positive School Learning Climate, conforms to the notion that successful schools create an “academic press” through the development of a school climate characterized by high standards and expectations, capacity development, and continuous improvement (Edmonds, 1979; Hallinger & Heck, 1996; Hallinger et al., 1983; Leithwood & Montgomery, 1982; Leithwood & Sun, 2012; Robinson et al., 2008).

Figure 1. PIMRS instructional leadership model (Hallinger & Murphy, 1985).




Gender and School Leadership

Researchers first reported differences in patterns of leadership exercised by male and female school principals in the 1960s (e.g., Gross & Trask, 1964; Hemphill, Griffiths, & Frederiksen, 1962). Subsequent scholarship conducted during the 1970s and 1980s sought to amplify initial findings using a variety of leadership constructs including task/interpersonal leadership orientation, democratic/autocratic leadership orientation, initiating structure/consideration, and participatory/directive leadership style (e.g., Adkinson, 1981; Arcy, 1980; Charters & Jovick, 1981; Fischel & Pottker, 1977; Frasher & Frasher, 1979; Gross & Trask, 1976; Salley, McPherson, & Baehr, 1979). In 1992, Eagly and colleagues conducted a meta-analytic synthesis of this early generation of research and concluded that female principals tended to adopt more participatory, democratic, task-focused leadership styles than their male counterparts. In a subsequent meta-analysis conducted a decade later, Eagly, Johannesen-Schmidt, and van Engen (2003) found that female leaders tended to achieve stronger ratings on transformational leadership as well as engaging in more contingent reward behaviors associated with transactional leadership.

Another parallel line of inquiry into gender and school leadership evolved over time focusing on how gender shapes the principal’s exercise of “instructional leadership.” Taking a cue from the research of Gross and Trask (1976), Hallinger and Murphy (1985) reported that “the similarity in findings concerning the gender variable in this and previous [school leadership] studies suggests that it is worth examining more closely in future research” (p. 234). This stimulated subsequent empirical inquiry into gender differences in principal instructional leadership (e.g., Cunningham, 2004; Geiselman, 2004; Howell, 1989; Krüger, 1996, 2008; McCabe, 1993; Miller, 1991; Munroe, 2009; Nogay & Beebe, 2008; O’Donnell, 2002; Schoch, 1992; Trout, 1985). Thus, 25 years later, Hallinger (2011) recommended that this topic was also ready for meta-analytic review.

Methodology

When a domain of research evidences a concentration of quantitative studies that have analyzed the relationship between two variables, meta-analysis offers a powerful means of quantitatively integrating substantive findings (Glass, 1977). Lipsey and Wilson (2001) noted that the analytical power of meta-analysis is magnified when studies have used the same measures of the relevant variables. The current study used meta-analysis to synthesize quantitative findings drawn from a corpus of studies all of which had employed the same means of measuring the independent variable (i.e., gender) and the dependent variable of instructional leadership (i.e., PIMRS ratings).

The Principal Instructional Management Rating Scale

Hallinger and Murphy’s (1985) instructional leadership framework paved the way for subsequent development of the PIMRS (Hallinger, 1982/1990). The PIMRS consists of 50 “behaviorally anchored” items (Latham & Wexley, 1981) designed to measure the 3 dimensions and 10 functions shown earlier in Figure 1. Three parallel forms of the PIMRS have been developed: Principal Form (self-assessment), Teacher Form, and Supervisor Form. The items that comprise each form are identical; only the stems change to reflect the differing perspectives of the role groups (Hallinger & Murphy, 1985; Hallinger & Wang, 2015).

For each item, the rater assesses the frequency with which the principal has been observed by the respondent to enact a particular instructional leadership behavior. The items employ a Likert-type scale ranging from (1) almost never to (5) almost always. The instrument can be scored by calculating scores for the full test and/or for the dimension/function-level constructs. The resulting data are used to generate profiles that describe the level of principal engagement in the dimensions comprising this role (Hallinger & Wang, 2015).
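The scoring logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the instrument's official scoring procedure: it assumes the 50 items are stored in order with five consecutive items per function, that functions are grouped 2/3/5 under the three dimensions, and that scores are item means. None of these details is stated in the text above, and the name `score_pimrs` is hypothetical.

```python
from statistics import mean

# Assumption: 50 items / 10 functions = 5 consecutive items per function.
ITEMS_PER_FUNCTION = 5

# Assumption: function indices (0-9) are grouped 2 / 3 / 5 under the dimensions.
DIMENSIONS = {
    "Defines the School's Mission": range(0, 2),
    "Manages the Instructional Program": range(2, 5),
    "Develops a Positive School Learning Climate": range(5, 10),
}


def score_pimrs(responses):
    """Return total, dimension, and function scores for one completed form."""
    if len(responses) != 50:
        raise ValueError("expected 50 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    # One mean per block of five items (function level).
    functions = [
        mean(responses[i:i + ITEMS_PER_FUNCTION])
        for i in range(0, 50, ITEMS_PER_FUNCTION)
    ]
    # Dimension score = mean of the function scores assigned to it.
    dimensions = {
        name: mean(functions[i] for i in idx) for name, idx in DIMENSIONS.items()
    }
    return {"total": mean(responses), "dimensions": dimensions, "functions": functions}
```

Because every function has the same number of items, averaging function scores within a dimension gives the same result as averaging that dimension's items directly.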

The PIMRS Principal and Teacher Forms have been tested extensively for reliability and validity (Hallinger & Wang, 2015; Hallinger et al., 2013). A recent meta-analytic study of 40+ PIMRS studies found a full-scale alpha reliability (Cronbach, 1951) of .96 for the Principal Form and a Generalizability Theory reliability rho hat (Kane, Gilmore, & Crooks, 1976) of .99 for the Teacher Form (Hallinger et al., 2013). All three dimension-level reliability coefficients were found to exceed .90 for both forms. Function-level reliability coefficients ranged from .75 to .86 for the Principal Form and from .90 to .95 for the Teacher Form (Hallinger et al., 2013).
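For readers unfamiliar with the alpha coefficient cited above, the following minimal sketch shows how Cronbach's alpha is conventionally computed from a respondents-by-items score matrix. It is a generic textbook calculation, not the procedure used in the cited reliability study, and it does not cover the generalizability coefficient reported for the Teacher Form. The function name and the example data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])  # number of items
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")

    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's summed score.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up data: three respondents whose item scores move together perfectly.
print(cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]]))
```

Alpha approaches 1 when items covary strongly relative to their individual variances, which is the pattern the high full-scale values above indicate.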

Hallinger and Wang (2015) assessed the validity of the PIMRS using five separate validation procedures (Cronbach, 1988; Nunnally & Bernstein, 1994). “Content validity” was supported through judgments of constructs and items by school professionals (Hallinger & Wang, 2015). Construct validity was assessed at the “scale level” through school document analysis, analysis of intercorrelations among subscales, and confirmatory factor analysis (Kline, 2013). Construct validity was further assessed at the “item level” using Rasch analysis (Wright, Linacre, Gustafen, & Martin-Lof, 1994). Hallinger and Wang (2015) reported that 90% of the items comprising the 3 dimensions and 10 functions “fit” the proposed subscale structure. Further analysis demonstrated measurement invariance, implying that PIMRS scores can be equated for principals across school levels. In sum, the PIMRS meets standards of reliability and validity commonly applied for research instruments (Cronbach, 1988; Nunnally & Bernstein, 1994).

Identification of Sources for Meta-Analysis

The authors established the following criteria for inclusion of studies in the meta-analysis.

• Studies of gender and principal instructional leadership that had used the PIMRS for data collection

• PIMRS studies produced as master’s theses, doctoral dissertations, or journal articles

• Studies beginning in 1983, the publication year of the first PIMRS study, and continuing to 2014

A variety of approaches were employed to identify relevant studies. We examined recently published reviews of related literature, and also employed Google Scholar™ to identify relevant journal articles. Knowing that many PIMRS studies have been conducted as doctoral dissertations, we also searched ProQuest’s Dissertations Express™. Finally, we obtained a number of raw data sets from the publisher.

After a careful review of potential sources, we identified 28 studies containing suitable data for meta-analysis. All were doctoral dissertations, though three had also been published as journal articles (i.e., Hallinger & Murphy, 1985; Nogay & Beebe, 2008; O’Donnell & White, 2005). Several authors had administered the PIMRS to both teachers and principals. Therefore, we concluded the search with a larger number of data sets (i.e., 40) than studies (i.e., 28; see Table 2). The total sample for this analysis consisted of 2,807 principals (1,586 males and 1,221 females) and 6,175 teacher respondents covering elementary, middle, and high schools from three countries (i.e., United States, Thailand, Taiwan).

Thirteen data sets had been collected with the PIMRS Teacher Form and 27 data sets with the PIMRS Principal Form. Since Hallinger and Wang (2015) noted that scores generated from different PIMRS forms cannot be treated as equivalent, our approach to using data collected with the two different forms requires elaboration. First, our meta-analysis was based on tests of difference and significance rather than comparisons of absolute score levels (Glass, 1977; Hedges & Olkin, 1985; Lipsey & Wilson, 2001). Second, even when the teacher and principal data sets had been obtained from a single study, the data were collected from different respondents. Therefore, we were able to treat the data sets as “independent samples” and combine them for the purposes of analysis. Nonetheless, concerns over the equivalency of data obtained from the PIMRS Teacher and Principal Forms implied the need to examine this issue empirically (see data analysis section below).

There were 17 data sets where results were reported for the full scale (i.e., PIMRS Total Score), 16 for the three dimensions, and 31 for the 10 functions (see Table 1). The varying approaches to reporting PIMRS results represented a practical constraint since one cannot transform dimension- or function-level scores (e.g., means and SDs) extracted from data tables into full scale scores. This constraint is common in meta-analysis and reduced the sample size for our meta-analytic procedures (Lipsey & Wilson, 2001).

In any research synthesis or meta-analysis, the composition of the studies is a salient consideration both for the researchers and readers (Hallinger, 2014; Hallinger & Wang, 2015; Lipsey & Wilson, 2001). Confidence in the findings of meta-analysis is only as high as the quality of the studies that contributed data toward the results. Doctoral dissertations, while vetted by the granting universities, do not typically employ review procedures as rigorous as those used by refereed journals (e.g., double-blind peer review). Although scholars actually recommend the inclusion of dissertations in meta-analysis to reduce publication bias (Sutton, Song, Gilbody, & Abrams, 2000), meta-analyses that rely solely or heavily on doctoral dissertations come under greater scrutiny with respect to data quality. The current meta-analytic study relied largely on unpublished doctoral dissertations (89% of sources; see Table 2), a feature that we need to place in perspective for the reader.

Over the past 50 years, scholars have substantiated that the bulk of empirical research on educational leadership is conducted by doctoral students (Bridges, 1982; Hallinger, 2011; Hallinger & Heck, 1996). If this is the case, then one would expect research syntheses, including meta-analytic reviews, in our field to rely heavily on doctoral research. The authors examined this assumption empirically by examining the composition of “all meta-analytic reviews of educational leadership published in refereed journals.” As it is only recently that meta-analysis has become more commonly used in educational leadership, our Google Scholar search identified only seven studies (see Table 2).

Data presented in Table 2 indicate that the high percentage of doctoral dissertations comprising the database in our study (i.e., 89%) was reasonably consistent with the percentage of dissertations featured in other meta-analytic reviews of educational leadership. The average percentage of dissertations per study was 67%, and doctoral dissertations comprised 80% of all sources included in these published studies. Moreover, it should be noted that these figures underestimate the actual percentage of doctoral dissertations.



Table 1. Summary of Selected Studies and Study Characteristics.

| No. | Author | Year | Data Type | Data Source | School Level | Sample Size: Principals | Sample Size: Male Principals | Sample Size: Female Principals | Sample Size: Teacher | PIMRS Levels of Data Analysis |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Atkinson | 2013 | Raw | T | ALL | 75 | 10 | 65 | 1,311 | All levels |
| 2 | Babcock | 1991 | Ext | P | E | 213 | 127 | 86 | — | Functions |
| 3 | Bauer | 2013 | Raw | P | H | 77 | 21 | 56 | — | All levels |
| 4 | Bauer | 2013 | Raw | T | H | 77 | 21 | 56 | 154 | All levels |
| 5 | Carr | 2011 | Raw | P | E | 6 | 2 | 4 | — | All levels |
| 6 | Dunn | 2010 | Ext | P | M&H | 128 | 75 | 53 | — | Functions |
| 7 | Duryea | 1988 | Ext | P | ALL | 28 | 9 | 19 | — | Dimensions |
| 8 | Gallon | 1998 | Ext | T | ALL | 15 | 4 | 11 | 704 | Functions |
| 9 | Geiselman | 2004 | Ext | P | E | 180 | 76 | 104 | — | Dimensions |
| 10 | Geiselman | 2004 | Ext | T | E | 180 | 76 | 104 | 799 | Dimensions |
| 11 | Groff | 2002 | Ext | P | E | 110 | 59 | 51 | — | Functions |
| 12 | Hallingerᵃ | 1983 | Ext | T | E | 6 | 3 | 3 | 104 | Functions |
| 13 | Howellᵇ | 1989 | Ext | P1 | E | 202 | 143 | 59 | — | Functions |
| 14 | Howell | 1989 | Ext | P2 | E | 202 | 143 | 59 | — | Functions |
| 15 | Howell | 1989 | Ext | P3 | E | 37 | 30 | 7 | — | Functions |
| 16 | Howell | 1989 | Ext | P4 | E | 37 | 30 | 7 | — | Functions |
| 17 | Jennings | 2013 | Ext | P | E | 30 | 9 | 21 | — | Functions |
| 18 | Lehl | 1989 | Ext | P | E | 16 | 7 | 9 | — | Total Score |
| 19 | Marin | 2013 | Ext | P | ALL | 139 | 74 | 65 | — | Functions |
| 20 | McCabe | 1993 | Ext | P | M&H | 5 | 3 | 2 | — | All levels |
| 21 | McDonald | 2012 | Raw | P | E | 14 | 2 | 12 | — | All levels |
| 22 | McDonald | 2012 | Raw | T | E | 14 | 2 | 12 | 104 | All levels |
| 23 | Millerᶜ | 1991 | Ext | P1 | M | 44 | 34 | 10 | — | Total Score |
| 24 | Miller | 1991 | Ext | P2 | M | 27 | 24 | 3 | — | Total Score |
| 25 | Munroe | 2009 | Raw | P | E | 35 | 14 | 21 | — | All levels |
| 26 | Nogayᵃ | 1995 | Ext | P | M&H | 61 | 33 | 28 | — | All levels |
| 27 | Nogayᵃ | 1995 | Ext | T | M&H | 61 | 33 | 28 | 424 | All levels |
| 28 | O’Donnellᵃ | 2002 | Ext | P | M | 75 | 57 | 18 | — | Dimensions |
| 29 | O’Donnellᵃ | 2002 | Ext | T | M | 75 | 57 | 18 | 300 | Dimensions |
| 30 | Peariso | 2011 | Ext | P | E | 36 | 24 | 12 | — | Functions |
| 31 | Poovatanikul | 1993 | Ext | P | M&H | 44 | 31 | 13 | — | Total Score |
| 32 | Pratley | 1992 | Ext | P | M&H | 76 | 64 | 12 | — | Functions |
| 33 | Pratley | 1992 | Ext | T | M&H | 62 | 52 | 10 | 420 | Functions |
| 34 | Rose | 1991 | Ext | P | ALL | 71 | 55 | 16 | — | Functions |
| 35 | Ruzicska | 1989 | Ext | T | E&M | 11 | 8 | 3 | 192 | All levels |
| 36 | Schoch | 1992 | Ext | T | E | 70 | 40 | 30 | 537 | Functions |
| 37 | Trout | 1985 | Ext | P | M&H | 28 | 14 | 14 | — | Total & Functions |
| 38 | Trout | 1985 | Ext | T | M&H | 28 | 14 | 14 | 275 | Total & Functions |
| 39 | Yang | 1996 | Ext | P | E | 106 | 53 | 53 | — | Functions |
| 40 | Yang | 1996 | Ext | T | E | 106 | 53 | 53 | 851 | Functions |

Note. PIMRS = Principal Instructional Management Rating Scale; T = teacher; P = principal; E = elementary school; M = middle school; H = high school; ALL = elementary, middle, and high school; Raw = raw data from author; Ext = data extracted from dissertation.
ᵃ Published studies. ᵇ In the Howell study, we obtained four separate sets of effect size estimates. ᶜ In the Miller study, we obtained two sets of effect size estimates from two groups of principals (P1 & P2).




The column “published papers” in Table 2 includes not only refereed articles but also book chapters and conference papers. It would be difficult to argue that the latter two sources undergo more rigorous review than doctoral studies. In sum, we conclude that the composition of the database of studies analyzed in this study is consistent with other published meta-analyses in our field.

Data Collection

Data extraction. Each study was reviewed to identify key characteristics. Descriptive (e.g., sample sizes, school level, year) as well as statistical data were extracted from each study and entered into an MS Excel spreadsheet. In cases where we had obtained a raw data set, we generated relevant statistics for all three construct levels. The final database used for meta-analysis included 40 data sets drawn from 28 studies containing a variety of different statistics organized by construct level.

Table 2. Distribution of Dissertations and Published Studies in Meta-Analyses in Educational Leadership Published in Refereed Journals.

| Studies | Year | Journal | Unpublished Dissertations (Raw) | Unpublished Dissertations (%) | Pub & Confᵃ (Raw) | Total Sources (Raw) |
|---|---|---|---|---|---|---|
| 1. Eagly et al. | 1992 | EAQ | 112 | 90% | 13 | 125 |
| 2. Witziers et al. | 2003 | EAQ | 2 | 5% | 35 | 37 |
| 3. Chin | 2007 | APER | 28 | 100% | 0 | 28 |
| 4. Robinson et al. | 2008 | EAQ | 0 | 0% | 22 | 22 |
| 5. Leithwood and Sun | 2012 | EAQ | 79 | 100% | 0 | 79 |
| 6. Hallinger et al. | 2013 | EAQ | 129 | 94% | 6 | 135 |
| 7. Sun and Leithwood | 2015 | SESI | 79 | 72% | 31 | 110 |
| 8. Marzano, Waters, and McNulty | 2005 | Book | 54 | 80% | 15 | 69 |
| Across 8 studiesᵇ | | | 483 | 80% | 122 | 605 |
| Average per studyᵇ | | | 60 | 67% | 15 | 76 |
| Current study | 2016 | | 25 | 89% | 3 | 28 |

Note. EAQ = Educational Administration Quarterly; APER = Asia Pacific Education Review; SESI = School Effectiveness and School Improvement.
ᵃ This column contains both papers published in refereed journals as well as conference papers and book chapters. ᵇ The summary “across the 8 studies” totals the sources for all studies and reports the average within this meta-analytic literature (e.g., grand mean dissertations across 8 studies = 80%). The next line lists the average per study, where the mean dissertations per study = 67%.
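The two summary rows of Table 2 answer different questions: the pooled figure weights each review by its number of sources, whereas the per-study figure gives every review equal weight. The short sketch below recomputes both from the raw counts in the table. Percentages recomputed from raw counts can differ by a point or so from the rounded values printed in the table, and the variable names are illustrative only.

```python
# (unpublished dissertations, total sources) for the eight reviews in Table 2
reviews = {
    "Eagly et al. (1992)": (112, 125),
    "Witziers et al. (2003)": (2, 37),
    "Chin (2007)": (28, 28),
    "Robinson et al. (2008)": (0, 22),
    "Leithwood and Sun (2012)": (79, 79),
    "Hallinger et al. (2013)": (129, 135),
    "Sun and Leithwood (2015)": (79, 110),
    "Marzano, Waters, and McNulty (2005)": (54, 69),
}

total_diss = sum(d for d, _ in reviews.values())
total_sources = sum(t for _, t in reviews.values())

# Pooled share: all dissertations divided by all sources.
pooled_pct = 100 * total_diss / total_sources

# Per-study share: average of each review's own percentage.
mean_study_pct = sum(100 * d / t for d, t in reviews.values()) / len(reviews)

print(f"pooled: {pooled_pct:.1f}%  mean per study: {mean_study_pct:.1f}%")
```

The gap between the two figures arises because reviews with few or no dissertations (e.g., Witziers et al.; Robinson et al.) pull the per-study mean down more than they affect the pooled share.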




Data quality. Our reliance on doctoral dissertations, although justified above, still raises issues with respect to “data quality.” Historically, doctoral studies in educational leadership have tended to rely on measurement instruments of dubious reliability and validity, and simple statistics (Bridges, 1982; Hallinger, 2011). First, as noted above, in this meta-analysis all the studies employed the same validated instrument, the PIMRS, for the dependent variable. This not only reduced concerns over data quality, but meant that we could focus on the same set of conceptual constructs across studies. Second, the independent variable of interest, gender, was easily measured as a discrete variable (i.e., male/female). Together, these features enabled the authors to achieve a larger sample size for many of our analyses than has been the case in other meta-analyses (e.g., Robinson et al., 2008), even those where the total sample of studies was larger (e.g., Eagly et al., 1992; Witziers et al., 2003). Finally, for the purposes of our meta-analysis, we only required relatively simple statistics to compute effect sizes. This contrasts, for example, with the corpus of studies comprising meta-analyses of leadership effects on student achievement (e.g., Robinson et al., 2008; Scheerens, 2012; Witziers et al., 2003). In sum, after examining these features of the composite studies, we did not view the reliance on doctoral dissertations as a major constraint with respect to quality for this meta-analysis.

Data Analysis

Our data analysis procedures examined differences in the direction, magnitude, and significance of differences in the level of instructional leadership exercised by male and female principals across the studies. Then we analyzed the effect sizes (ES) to determine if there were identifiable patterns of difference in how male and female principals enacted the various dimensions and functions of this role.

Transforming statistics into a common metric. We began by transforming statistics generated from different tests (e.g., t test, ANOVA, regression) into a common effect size metric. Researchers conducting meta-analyses in educational leadership have employed a variety of effect size metrics including Cohen’s d, z1, r, and g (e.g., Chin, 2007; Eagly et al., 1992; Leithwood & Sun, 2012; Robinson et al., 2008; Scheerens, 2012; Sun & Leithwood, 2015; Witziers et al., 2003). The leadership effects literature often reports “variance explained” as a measure of the predictive power of one variable on another (e.g., Hallinger & Heck, 1996; Robinson et al., 2008; Witziers et al., 2003).

In this study, however, we had a different objective. Rather than testing a set of variables for their predictive power, we sought to determine whether and to what extent there was a “difference” in the leadership exercised by male and female principals. For this purpose, we selected Cohen’s d, one of the most widely used metrics in meta-analysis. In this study, Cohen’s d represents the “standardized mean difference” between the scores of male and female principals on the PIMRS (Cohen, 1992; Hedges & Olkin, 1985).

Tables presenting mean and standard deviation, correlation r, and ANOVA eta-squared (η²) statistics had been collected from the 28 studies. We transformed the r and η² statistics into Cohen’s d using formulas specified in the literature (e.g., Cohen, 1988; Rosenthal, 1994). When working with means and standard deviations as well as with the raw data sets, we computed Cohen’s d using the following formula (Cohen, 1992):

$$ d = \frac{\bar{X}_M - \bar{X}_F}{S_p}, \qquad (1) $$

where $\bar{X}_M$ is the mean for male principals, $\bar{X}_F$ is the mean for female principals, and $S_p$ is the pooled standard deviation. The standard error of the effect size estimate is

$$ SE(d) = \sqrt{\frac{n_M + n_F}{n_M n_F} + \frac{d^2}{2(n_M + n_F)}}. \qquad (2) $$

Using these formulae, we produced a Cohen’s d statistic for each available construct level (i.e., PIMRS Total Score, Dimension, Function) across the corpus of studies (see Tables 3 and 4). To identify the direction of the effect size, we examined the coding of the studies and redefined the d values as either positive or negative. Throughout our analyses a positive d indicates more active instructional leadership from male principals, and a negative d indicates more active instructional leadership from the female group.
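As an illustration only, Equations 1 and 2 can be sketched in a few lines of Python. The pooled standard deviation formula and the r-to-d conversion shown here are the standard textbook versions and are assumptions on our part; the source does not spell out which variants were applied. The function names and the input values are hypothetical.

```python
import math

def pooled_sd(sd_m, n_m, sd_f, n_f):
    """Pooled standard deviation S_p (standard two-group formula; assumed)."""
    return math.sqrt(((n_m - 1) * sd_m ** 2 + (n_f - 1) * sd_f ** 2) / (n_m + n_f - 2))

def cohens_d(mean_m, sd_m, n_m, mean_f, sd_f, n_f):
    """Equation 1: male mean minus female mean, in pooled-SD units.

    A negative value indicates higher scores for female principals.
    """
    return (mean_m - mean_f) / pooled_sd(sd_m, n_m, sd_f, n_f)

def se_of_d(d, n_m, n_f):
    """Equation 2: standard error of the effect size estimate."""
    return math.sqrt((n_m + n_f) / (n_m * n_f) + d ** 2 / (2 * (n_m + n_f)))

def d_from_r(r):
    """Common r-to-d conversion, assuming roughly equal group sizes."""
    return 2 * r / math.sqrt(1 - r ** 2)

# Hypothetical summary statistics, not taken from any study in the corpus.
d_example = cohens_d(mean_m=3.5, sd_m=0.5, n_m=20, mean_f=3.7, sd_f=0.5, n_f=30)
se_example = se_of_d(d_example, n_m=20, n_f=30)
print(f"d = {d_example:.3f}, SE = {se_example:.3f}")
```

With these made-up inputs the female mean is higher, so d comes out negative, matching the sign convention used throughout the analyses.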

Testing for homogeneity. The next step in data analysis involved testing homogeneity of the data set. This involves examining whether the distribution of effect sizes across data sets may have been influenced by sources other than sampling error (Lipsey & Wilson, 2001). We employed the Q statistic in our test for homogeneity. Q was computed as

$$ Q = \sum_i SE(d_i)^{-2}\, d_i^2 - \frac{\left[\sum_i SE(d_i)^{-2}\, d_i\right]^2}{\sum_i SE(d_i)^{-2}}, \qquad (3) $$

where i indexes data sets, Q is asymptotically chi-square distributed with degrees of freedom equal to the number of effect sizes (data sets) minus one, and the other terms are as defined previously. If homogeneity of the study sample is rejected, then additional tests are required to explore other potential sources of error in the sample (Leithwood & Sun, 2012; Lipsey & Wilson, 2001). If the homogeneity assumption is not rejected, then it is possible to “combine” effect sizes obtained from multiple data sets into a weighted mean effect size (Lipsey & Wilson, 2001).
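The homogeneity test in Equation 3 can be sketched as follows. This is a minimal illustration under stated assumptions: the effect sizes and standard errors are invented for the example, the function name is hypothetical, and the small lookup of chi-square critical values at α = .05 contains only standard tabled values (the df = 16 entry is the one cited later in this article).

```python
def q_statistic(ds, ses):
    """Equation 3: homogeneity statistic Q with inverse-variance weights.

    Returns (Q, degrees of freedom), where df = number of effect sizes - 1.
    """
    weights = [1.0 / se ** 2 for se in ses]
    sum_w = sum(weights)
    sum_wd = sum(w * d for w, d in zip(weights, ds))
    sum_wd2 = sum(w * d * d for w, d in zip(weights, ds))
    return sum_wd2 - sum_wd ** 2 / sum_w, len(ds) - 1

# Standard chi-square critical values at alpha = .05 for a few df.
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 16: 26.296}

# Hypothetical effect sizes and standard errors for three data sets.
ds = [-0.2, -0.4, -0.3]
ses = [0.1, 0.1, 0.2]

q, df = q_statistic(ds, ses)
homogeneity_rejected = q > CHI2_CRITICAL_05[df]
print(f"Q = {q:.3f}, df = {df}, homogeneity rejected: {homogeneity_rejected}")
```

In this toy case Q falls well below the critical value, so the effect sizes would be treated as homogeneous and could be combined.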

Weighting the effect sizes. Since sample size differs across studies, researchers commonly “weight” each effect size by the inverse of its sampling variance. This produces an additional statistic, the weighted mean effect size. We computed this statistic using the following formula:

$$ \bar{d} = \frac{\sum_i SE(d_i)^{-2}\, d_i}{\sum_i SE(d_i)^{-2}}, \qquad (4) $$

with a standard error of

$$ SE(\bar{d}) = \sqrt{\frac{1}{\sum_i SE(d_i)^{-2}}}. \qquad (5) $$

We computed the weighted mean effect size and its standard error to assess the direction, magnitude, and significance of gender differences on 14 PIMRS constructs (i.e., Total Score, 3 Dimensions, 10 Functions). These analyses also informed our analysis of patterns of instructional leadership exercised by male and female principals on the PIMRS dimensions and functions.
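A short sketch of Equations 4 and 5 follows. The input values are hypothetical, and the 95% confidence interval (mean ± 1.96 SE) and the Z score (mean divided by its SE) are the usual normal-approximation conventions, which we assume here because the source does not state them explicitly.

```python
import math

def weighted_mean_es(ds, ses):
    """Equations 4 and 5: inverse-variance weighted mean effect size.

    Returns the weighted mean, its standard error, the Z score, and an
    approximate 95% confidence interval (normal approximation; assumed).
    """
    weights = [1.0 / se ** 2 for se in ses]
    sum_w = sum(weights)
    mean = sum(w * d for w, d in zip(weights, ds)) / sum_w
    se_mean = math.sqrt(1.0 / sum_w)
    z = mean / se_mean
    ci = (mean - 1.96 * se_mean, mean + 1.96 * se_mean)
    return mean, se_mean, z, ci

# Hypothetical effect sizes and standard errors for three data sets.
mean_d, se_mean_d, z_score, ci_95 = weighted_mean_es([-0.2, -0.4, -0.3], [0.1, 0.1, 0.2])
print(f"mean d = {mean_d:.3f}, SE = {se_mean_d:.3f}, Z = {z_score:.2f}")
print(f"95% CI = [{ci_95[0]:.3f}, {ci_95[1]:.3f}]")
```

Because the weights are inverse variances, the third data set (with the larger standard error) contributes only a quarter as much as each of the first two.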

Combining effect sizes. Lipsey and Wilson (2001) demonstrated that when a study reports multiple effect sizes representing the same construct (e.g., three PIMRS Dimensions), they can be averaged to create a “total effect size” statistic for the study (p. 101). Rosenthal and Rubin (1986) further supported the use of this approach when the dependent measures are highly correlated. This was the case in the current study (Hallinger & Murphy, 1985; Hallinger & Wang, 2015).

In the 16 studies where we possessed a d statistic for the PIMRS Total Score, this was used to represent a “total effect size” for the study. For studies that did not report results based on the total score, we averaged ds reported at either the dimension or function level to produce a “total effect size” statistic for the study. Using Babcock’s (1991) study as an example, we extracted principal means and standard deviations reported at the function level. We computed 10 function-level ds and standard errors. We then computed a total mean effect size for the study (i.e., −0.552, SE = 0.142; reported in Table 7). A similar approach was employed for the other 20 data sets. Finally, we averaged the 40 study-level effect sizes into an overall effect size (Lipsey & Wilson, 2001).

Examining differences between principal and teacher data sets. In meta-analysis, we can treat the data source as a “study characteristic” and test the between-group variance, even if the assumption of homogeneity has not been rejected. Therefore, as a follow-up test, we calculated the between-groups homogeneity statistic (QB) by conducting the analog to the ANOVA under the fixed-effect model. This approach is widely used by researchers when testing the effect of categorical variables, in this case teachers or principals as the data source (see Eagly et al., 1992; Lipsey & Wilson, 2001).

We first calculated the Qj for each group (i.e., Q statistics for principals [QPrin] and for teachers [QTch]). We then obtained the within-groups homogeneity (Qw) by summing the Qj across groups (j indexes the groups). We then computed QB as the Q total (QT), which we obtained from the previous homogeneity analysis, minus Qw. If QB is significant, the difference in mean effect sizes across groups cannot be explained by sampling error alone; that is, there is a significant difference between the groups.
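The partition of Q described above (QB = QT − Qw) can be sketched as follows. The group labels echo the two data sources in this study, but the numerical values are invented for illustration and the function names are hypothetical.

```python
def q_statistic(ds, ses):
    """Homogeneity statistic Q with inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in ses]
    sum_w = sum(weights)
    sum_wd = sum(w * d for w, d in zip(weights, ds))
    sum_wd2 = sum(w * d * d for w, d in zip(weights, ds))
    return sum_wd2 - sum_wd ** 2 / sum_w

def q_between(groups):
    """Analog to the ANOVA: Q_B = Q_T - Q_W.

    `groups` maps a group label to a (ds, ses) pair. Returns
    (Q_B, df_B), where df_B = number of groups - 1.
    """
    all_ds = [d for ds, _ in groups.values() for d in ds]
    all_ses = [se for _, ses in groups.values() for se in ses]
    q_total = q_statistic(all_ds, all_ses)
    q_within = sum(q_statistic(ds, ses) for ds, ses in groups.values())
    return q_total - q_within, len(groups) - 1

# Hypothetical effect sizes by data source (not values from the corpus).
groups = {
    "principals": ([-0.4, -0.2], [0.1, 0.1]),
    "teachers": ([-0.1, -0.3], [0.1, 0.1]),
}
qb, df_b = q_between(groups)
# Critical chi-square value at alpha = .05 with df = 1 is 3.84.
print(f"Q_B = {qb:.3f}, df = {df_b}, significant: {qb > 3.84}")
```

Here QB is below the critical value, so the toy data would show no significant between-groups effect of data source.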

Assessing differences in patterns of instructional leadership practice. The final analytical step involved descriptive analysis of patterns in the instructional leadership practices of the male and female principals. This analysis was aimed at determining if observed differences formed a general pattern or if differences were concentrated in particular instructional leadership dimensions or functions. This involved re-examining data generated in the earlier analyses conducted at the dimension and function levels.

Interpreting the magnitude of effect sizes. Following these meta-analytic tests, we faced the task of interpreting the results. As noted above, in meta-analysis different statistics have been used by researchers for measuring effect size (d, r, r², etc.). Moreover, for a given statistic, researchers have applied different “standards” (i.e., cutoff points) for interpreting the practical meaning of effect sizes (Glass, 1977; Hedges & Olkin, 1985; Lipsey & Wilson, 2001). In selecting a “standard” for this study, we referred to Cohen’s (1992) discussion of this issue.

My intent was that medium ES represent an effect likely to be visible to the naked eye of a careful observer. (It has since been noted in effect size surveys that it approximates the average size of observed effects in various fields.) I set small ES to be noticeably smaller than medium but not so small as to be trivial, and I set large ES to be the same distance above medium as small was below it. . . . For this test [i.e., Cohen’s d], the H0 is that d = 0 and the small, medium, and large ESs (or H1s) are d = .20, .50, and .80. (Cohen, 1992, pp. 156-157)

Cohen (1992) further noted that interpretation of the magnitude of effect sizes varies across different fields of inquiry. Relevant factors to consider include the nature of the phenomenon being studied (e.g., leadership) as well as the research methods used by researchers in the sample of studies. Eagly et al. (1992) observed that widely varying and largely uncontrolled conditions under which school leadership studies are typically conducted tend to reduce the magnitude of effect sizes. This constraint was relevant in the current study in which 100% of the data had been obtained through nonexperimental research conducted by doctoral students. Eagly and colleagues further concluded: “As a consequence, neither sex nor other variables would ordinarily produce large effect sizes in studies of principals’ leadership style” (Eagly et al., 1992, pp. 92-93). This assertion has been borne out in subsequent meta-analytic studies of school leadership conducted over the succeeding 20 years, which typically report “small effects” (e.g., Chin, 2007; Leithwood & Sun, 2012; Robinson et al., 2008; Scheerens, 2012; Sun & Leithwood, 2015; Witziers et al., 2003).

We kept these factors in mind when interpreting the magnitude of effect sizes in our study. Operationally, we followed Cohen’s (1992) recommendation and coded d > .20 as a “small effect,” d > .50 as a “moderate effect,” and d > .80 as a “large effect.” These standards are applied in all subsequent tables and guide our discussion of the meta-analytic results. In addition to the d statistic, it is common to use Cohen’s U3 statistic to describe the mean difference between two groups. This represents the percentage of one group that will be above the mean of the other group. For example, ds of .20, .50, and .80 indicate that 58%, 69%, and 79% of one group will be above the mean of the other group, respectively.
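The coding rule and the U3 values quoted above can be reproduced with a small sketch. It assumes that U3 is computed as the standard normal cumulative probability at |d| (i.e., normally distributed scores with equal variances), and it treats the cutoffs as strict inequalities as written in the text; the function names are hypothetical.

```python
import math

def u3(d):
    """Cohen's U3: share of the higher-scoring group above the other group's mean.

    Assumes normal distributions with equal variances, so U3 = Phi(|d|).
    """
    return 0.5 * (1.0 + math.erf(abs(d) / math.sqrt(2.0)))

def magnitude_label(d):
    """Code |d| using the cutoffs stated in the text (.20, .50, .80)."""
    size = abs(d)
    if size > 0.80:
        return "large"
    if size > 0.50:
        return "moderate"
    if size > 0.20:
        return "small"
    return "no practical effect"

for d in (0.20, 0.50, 0.80):
    print(f"d = {d:.2f}: U3 = {100 * u3(d):.0f}%")
```

Run as written, the loop returns 58%, 69%, and 79%, the three benchmark percentages given in the text.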

Results

We used the analytical procedures described above to determine the level and pattern of differences in instructional leadership exercised by male and female principals. We begin by describing the direction and magnitude of effect sizes reported from the studies at different construct levels. Next, we report the homogeneity analysis used to determine the approach required for subsequent meta-analytic procedures. Then we report the meta-analyses conducted for different construct levels as well as for the overall data set. Finally, we examine the pattern of differences between male and female principals with respect to the instructional leadership dimensions and functions.

Descriptive Results

The 12 studies (17 data sets) that reported PIMRS Total Scores (i.e., full scale score) included 623 principals (273 males, 350 females). The direction of d suggested more active instructional leadership from female principals in 14 studies and males in 3 studies (see Table 2). The magnitude of d exceeded the cutoff representing a “practical effect” (i.e., .20) in 10 of the 14 studies where the direction suggested more active instructional leadership from female principals, and in only one study where the direction favored male principals (see Table 3).

Table 4 presents 48 ds drawn from 16 studies that reported dimension-level scores for 974 principals (424 males, 550 females). Forty of the dimension-level ds suggested more active instructional leadership among female principals, 27 of which offered evidence of a practical effect (i.e., the d statistic exceeded −.20). In contrast, only two of the eight ds suggesting more active instructional leadership from male principals reached the level indicating a practical effect (i.e., > .20).

Analysis of 31 relevant data sets yielded 287 PIMRS function-level scores for 2,138 principals (1,215 males, 923 females; data not shown). Among the 231 function-level ds in the direction of female principals, 150 indicated a practical effect (i.e., 40 large effect, 36 moderate effect, and 74 small effect). In contrast, 16 indicated a difference in favor of male principals at a level that could be interpreted as small. In sum, the direction of ds at all three construct levels suggested that female principals were perceived as exercising more active instructional leadership. The magnitude of ds, however, suggested that the effect was small to moderate in size. Subsequent analyses sought to further refine and test the significance of these descriptive findings.

Homogeneity Analysis

As discussed earlier, the appropriate sequence of meta-analytic procedures depends on the homogeneity of the study sample. A significant Q statistic indicates that the variability of the effect sizes cannot be explained only by sampling error. In this event, a single weighted mean effect size is not capable of estimating the population mean. Thus, our next step involved computing Q statistics for all 14 constructs.



Table 3. Effect Size Estimates for 16 Studies Based on the PIMRS Total Score.

| No. | Author | Year | Data Type | Data Source | School Level | Principals (n) | Male (n) | Female (n) | Teachers (n) | Total PIMRS d | Total PIMRS SE | Magnitude and Direction of d |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Atkinson | 2013 | Raw | T | ALL | 75 | 10 | 65 | 1311 | −0.391 | 0.214 | Fem+ |
| 2 | Bauer | 2013 | Raw | P | H | 77 | 21 | 56 | | −0.075 | 0.104 | Fem |
| 3 | Bauer | 2013 | Raw | T | H | 77 | 21 | 56 | 154 | 0.062 | 0.086 | Male |
| 4 | Carr | 2011 | Raw | P | E | 6 | 2 | 4 | | −1.892 | 1.024 | Fem+++ |
| 5 | Lehl | 1989 | Ext | P | E | 16 | 7 | 9 | | −0.995 | 0.268 | Fem+++ |
| 6 | McCabe | 1993 | Ext | P | M&H | 5 | 3 | 2 | | −0.013 | 0.240 | Fem |
| 7 | McDonald | 2012 | Raw | P | E | 14 | 2 | 12 | | −0.532 | 0.399 | Fem++ |
| 8 | McDonald | 2012 | Raw | T | E | 14 | 2 | 12 | 104 | −1.414 | 0.303 | Fem+++ |
| 9 | Miller | 1991 | Ext | P1^a | M | 44 | 34 | 10 | | −0.690 | 5.702 | Fem++ |
| 10 | Miller | 1991 | Ext | P2 | M | 27 | 24 | 3 | | 0.730 | 12.134 | Male++ |
| 11 | Munroe | 2009 | Raw | P | E | 35 | 14 | 21 | | −0.042 | 0.141 | Fem |
| 12 | Nogay | 1995 | Ext | P | M&H | 61 | 33 | 28 | | −0.303 | 0.113 | Fem+ |
| 13 | Nogay | 1995 | Ext | T | M&H | 61 | 33 | 28 | 424 | −0.350 | 0.143 | Fem+ |
| 14 | Poovatanikul | 1993 | Ext | P | M&H | 44 | 31 | 13 | | −0.850 | 0.175 | Fem+++ |
| 15 | Ruzicska | 1989 | Ext | T | E&M | 11 | 8 | 3 | 155 | −2.652 | 0.189 | Fem+++ |
| 16 | Trout | 1985 | Ext | P | M&H | 28 | 14 | 14 | | 0.118 | 0.214 | Male |
| 17 | Trout | 1985 | Ext | T | M&H | 28 | 14 | 14 | 560 | −0.055 | 0.295 | Fem |
| | Total | | | | | 623 | 273 | 350 | 2,708 | | | |

Note. PIMRS = Principal Instructional Management Rating Scale; T = teacher; P = principal; E = elementary school; M = middle school; H = high school; ALL = elementary, middle, and high school; Raw = raw data from author; Ext = data extracted from dissertation. Positive ds estimates indicate more active instructional leadership from male principals and negative ds from female principals. In the last column, Fem indicates that the direction of d favored females, Male indicates that it favored males. Where this is followed by a “+” it indicates a small effect, “++” a moderate effect, and “+++” a large effect based on d. If no “+” is present, it indicates that the magnitude of d suggested no practical effect.
^a This study gathered data from two different samples of principals.



Table 4. Effect Size Estimates at the PIMRS Dimension Level for 16 Studies.

| No. | Author | Year | Principals: Total | Principals: Male | Principals: Fem | Defines the School’s Mission: d | SE | Dir/Mag | Manages the Instructional Program: d | SE | Dir/Mag | Promotes a Positive School Climate: d | SE | Dir/Mag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Atkinson | 2013 | 75 | 10 | 65 | −0.213 | 0.294 | Fem+ | −0.513 | 0.213 | Fem++ | −0.384 | 0.228 | Fem+ |
| 2 | Bauer | 2013 | 77 | 21 | 56 | −0.077 | 0.133 | Fem | −0.135 | 0.117 | Fem | −0.053 | 0.256 | Fem |
| 3 | Bauer | 2013 | 77 | 21 | 56 | −0.044 | 0.127 | Fem | 0.110 | 0.108 | Male | 0.072 | 0.080 | Male |
| 4 | Carr | 2011 | 6 | 2 | 4 | −2.111 | 1.059 | Fem+++ | −0.828 | 0.898 | Fem+++ | −2.456 | 1.119 | Fem+++ |
| 5 | Duryea | 1988 | 28 | 9 | 19 | −0.938 | 1.236 | Fem+++ | −0.561 | 2.308 | Fem++ | −0.080 | 4.381 | Fem |
| 6 | Geiselman | 2004 | 180 | 76 | 104 | −0.583 | 0.075 | Fem++ | −0.494 | 0.075 | Fem+ | −0.283 | 0.075 | Fem+ |
| 7 | Geiselman | 2004 | 180 | 76 | 104 | −0.161 | 0.035 | Fem | −0.140 | 0.035 | Fem | −0.100 | 0.035 | Fem |
| 8 | McCabe | 1993 | 5 | 3 | 2 | −0.180 | 0.425 | Fem | 0.140 | 0.256 | Male | 0.270 | 0.251 | Male+ |
| 9 | McDonald | 2012 | 14 | 2 | 12 | −0.295 | 0.453 | Fem+ | −0.454 | 0.526 | Fem+ | −0.574 | 0.402 | Fem++ |
| 10 | McDonald | 2012 | 14 | 2 | 12 | −1.291 | 0.325 | Fem+++ | −1.508 | 0.328 | Fem+++ | −1.294 | 0.316 | Fem+++ |
| 11 | Munroe | 2009 | 35 | 14 | 21 | 0.158 | 0.166 | Male | −0.122 | 0.173 | Fem | −0.062 | 0.153 | Fem |
| 12 | Nogay | 1995 | 61 | 33 | 28 | −0.026 | 0.178 | Fem | −0.158 | 0.133 | Fem | −0.407 | 0.108 | Fem+ |
| 13 | Nogay | 1995 | 61 | 33 | 28 | −0.543 | 0.161 | Fem++ | −0.215 | 0.150 | Fem+ | −0.282 | 0.145 | Fem+ |
| 14 | O’Donnell | 2002 | 75 | 57 | 18 | 0.227 | 0.167 | Male+ | 0.110 | 0.171 | Male | 0.148 | 0.128 | Male |
| 15 | O’Donnell | 2002 | 75 | 57 | 18 | −0.320 | 0.169 | Fem+ | −0.361 | 0.157 | Fem+ | −0.430 | 0.145 | Fem+ |
| 16 | Ruzicska | 1989 | 11 | 8 | 3 | −4.204 | 0.153 | Fem+++ | −2.896 | 0.168 | Fem+++ | −1.700 | 0.265 | Fem+++ |
| N | 16 | | 974 | 424 | 550 | | | | | | | | | |

Note. Positive ds estimates indicate more active instructional leadership from male principals and negative ds from female principals. In the last column, Fem indicates that the direction of d favored females, Male indicates that it favored males. Where this is followed by a “+” it indicates a small effect, “++” a moderate effect, and “+++” a large effect based on d. If no “+” is present, it indicates that the magnitude of d suggested no practical effect.




Q for the PIMRS Total Score is 24.854, which is not significant (p > .05; the critical value of chi-square at α = .05, df = 16, is 26.296). Moreover, as shown in Table 5, 12 of the 14 Q statistics are not significant. These results indicate that the assumption of homogeneity is not rejected, and the data sets can be analyzed without additional testing of moderators. Therefore, we proceed next to the meta-analytic synthesis where we computed the weighted mean effect size for all 14 constructs to determine the relationship between principal gender and instructional leadership.

Meta-Analysis of Effect Sizes by Construct Level

In Table 6 we report the weighted mean effect sizes for the PIMRS Total Score, 3 Dimensions, and 10 Functions. These were all negative (ranging from −0.159 to −0.358, M = −0.257), “small” in magnitude, and statistically significant (p < .01). The confidence intervals were not overly large and the upper level of all 14 confidence intervals still fell in the negative effects domain. These results further affirm the finding of small but statistically significant gender differences favoring more active instructional leadership from the female principals.

Table 5. Q Statistics for 14 PIMRS Constructs.

| Instructional Leadership Constructs | No. of ds | Q | df |
|---|---|---|---|
| PIMRS Total Score | 17 | 24.854 | 16 |
| Dimensions | | | |
| Defines the School’s Mission | 16 | 32.350* | 15 |
| Manages the Instructional Program | 16 | 20.011 | 15 |
| Develops the School Learning Climate | 16 | 16.061 | 15 |
| Functions | | | |
| Frames the School’s Goals | 29 | 30.870 | 28 |
| Communicates the School’s Goals | 28 | 31.468 | 27 |
| Supervises and Evaluates Instruction | 28 | 24.388 | 27 |
| Coordinates the Curriculum | 28 | 31.322 | 27 |
| Monitors Student Progress | 29 | 256.276* | 28 |
| Protects Instructional Time | 30 | 34.156 | 29 |
| Maintains High Visibility | 28 | 15.020 | 27 |
| Provides Incentives for Teachers | 29 | 34.743 | 28 |
| Promotes Professional Development | 29 | 29.425 | 28 |
| Provides Incentives for Learning | 29 | 26.762 | 28 |

Note. PIMRS = Principal Instructional Management Rating Scale.
*p < .05.



Table 6. Summary of Weighted Mean Effect Size Estimates and Related Statistics by Construct Level.

| Instructional Leadership Constructs | Data Sets | d | SE | Dir & Mag | Upper CI | Lower CI | Z Score |
|---|---|---|---|---|---|---|---|
| PIMRS Total Score | 17 | −0.295 | 0.091 | Fem+ | −0.116 | −0.474 | 3.236** |
| Dimensions | | | | | | | |
| Defines the School’s Mission | 16 | −0.281 | 0.070 | Fem+ | −0.143 | −0.419 | 3.990** |
| Manages the Instructional Program | 16 | −0.258 | 0.070 | Fem+ | −0.121 | −0.396 | 3.689** |
| Promotes a Positive School Climate | 16 | −0.206 | 0.070 | Fem+ | −0.069 | −0.343 | 2.952** |
| Functions | | | | | | | |
| Frames the School’s Goals | 29 | −0.242 | 0.050 | Fem+ | −0.143 | −0.341 | 4.809** |
| Communicates the School’s Goals | 28 | −0.278 | 0.050 | Fem+ | −0.179 | −0.377 | 5.503** |
| Supervises and Evaluates Instruction | 28 | −0.244 | 0.050 | Fem+ | −0.145 | −0.343 | 4.844** |
| Coordinates the Curriculum | 28 | −0.259 | 0.050 | Fem+ | −0.161 | −0.358 | 5.146** |
| Monitors Student Progress | 29 | −0.358 | 0.052 | Fem+ | −0.255 | −0.460 | 6.833** |
| Protects Instructional Time | 30 | −0.214 | 0.048 | Fem+ | −0.119 | −0.308 | 4.409** |
| Maintains High Visibility | 28 | −0.159 | 0.050 | Fem | −0.061 | −0.258 | 3.175** |
| Provides Incentives for Teachers | 29 | −0.280 | 0.049 | Fem+ | −0.184 | −0.376 | 5.738** |
| Promotes Professional Development | 29 | −0.332 | 0.049 | Fem+ | −0.237 | −0.428 | 6.801** |
| Provides Incentives for Learning | 29 | −0.198 | 0.050 | Fem | −0.099 | −0.296 | 3.935** |

Note. Positive effect size estimates indicate more active instructional leadership from male principals and negative effect sizes from females.
**When Z score exceeds 1.96 it is significant at p < .01 (two tailed).




Overall Effect Size Analyses

Next, we averaged the weighted mean effect size into a single mean value for each study (see Table 7). In 33 of the 37 data sets (89%), the direction of the study’s mean effect size indicated more active instructional leadership by female principals. Finally, we computed a summary mean effect size for the full set of studies of −0.303 (SE = 0.043; CI = [−0.206, −.0371]). The Z value of 7.046 indicates that the overall effect size statistic is significant at the p < .01 level (see Table 7). A d of −0.303 suggests that 62% of females were above the mean of males. This analysis again supports the conclusion that there is a small but statistically significant difference in principal instructional leadership, with females demonstrating more active engagement in this role.
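As a rough arithmetic check on the summary statistics just reported, the Z value and U3 can be recomputed from the rounded d and SE. This sketch assumes Z is the absolute effect size divided by its standard error and that U3 is the standard normal cumulative probability at |d|; small discrepancies from the reported Z are expected because the inputs are rounded to three decimals.

```python
import math

# Summary values as reported in the text (rounded to three decimals).
d_overall = -0.303
se_overall = 0.043

# Z score for the mean effect size (assumed: |d| / SE).
z_overall = abs(d_overall) / se_overall

# Cohen's U3 under a normal, equal-variance assumption: Phi(|d|).
u3_overall = 0.5 * (1.0 + math.erf(abs(d_overall) / math.sqrt(2.0)))

print(f"Z = {z_overall:.2f}, exceeds 2.58: {z_overall > 2.58}")
print(f"U3 = {100 * u3_overall:.0f}%")
```

The recomputed Z is close to the reported 7.046 and well above the 2.58 threshold for p < .01 (two-tailed), and U3 rounds to the reported 62%.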

Comparing Teacher and Principal Perceptions of Principal Instructional Leadership

Next, for reasons discussed earlier, we reanalyzed the teacher and principal data sets separately. We obtained a weighted mean effect size of −0.327 (upper CI −0.428, lower CI −0.227) for the principal data sets, and −0.208 (upper CI −0.353, lower CI −0.064) for the teacher respondent data sets. These d statistics, −0.327 and −0.208, suggest that 63% and 58% of females were above the mean of males, respectively. Although the results suggest somewhat larger differences perceived within the principal self-report data, both ds indicated small but statistically significant differences in favor of female principals.

To determine whether these differences were statistically significant, we calculated the between-groups homogeneity statistic (QB) using procedures specified earlier. QB was 1.755, with 1 degree of freedom (i.e., the number of categories minus 1).2 QB was not statistically significant at p < .05, thereby indicating no significant between-groups effect based on the data source (the critical value of chi-square at α = .05 and df = 1 is 3.84). This affirmed the acceptability of our meta-analytic procedures that combined the teacher and principal data sets into a single database for analysis.

Nature of Gender Differences in Principal Instructional Leadership

Our next task was to examine the pattern of differences observed between the instructional leadership of the male and female principals. Simply stated, we wished to determine if differences in the instructional leadership of male and female principals were broadly distributed or concentrated in particular PIMRS dimensions and functions. To address this question, we analyzed the direction and magnitude of effect sizes across the PIMRS dimensions and functions.

Table 7. Effect Size Estimates on PIMRS Total Score by Study.

| No. | Author | Year | Data Type | Data Source | School Level | N Principals | Total Effect Size: d | SE | Dir/Mag |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Atkinson | 2013 | Raw | T | ALL | 31 | −0.391 | 0.341 | Fem+ |
| 2 | Babcock | 1991 | Ext | P | E | 215 | −0.552 | 0.142 | Fem++ |
| 3 | Bauer | 2013 | Raw | P | H | 77 | −0.075 | 0.256 | Fem |
| 4 | Bauer | 2013 | Raw | T | H | 77 | 0.062 | 0.163 | Male |
| 5 | Carr | 2011 | Raw | P | E | 6 | −1.892 | 1.024 | Fem+++ |
| 6 | Dunn | 2010 | Ext | P | M&H | 128 | −0.215 | 0.180 | Fem+ |
| 7 | Duryea | 1988 | Ext | P | ALL | 28 | −0.527 | 0.411 | Fem++ |
| 8 | Gallon | 1998 | Ext | T | ALL | 15 | −0.061 | 0.584 | Fem |
| 9 | Geiselman | 2004 | Ext | P | E | 180 | −0.454 | 0.153 | Fem+ |
| 10 | Geiselman | 2004 | Ext | T | E | 180 | −0.134 | 0.151 | Fem |
| 11 | Groff | 2002 | Ext | P | E | 110 | −0.515 | 0.194 | Fem++ |
| 12 | Hallinger | 1983 | Ext | T | E | 10 | −0.533 | 0.831 | Fem++ |
| 13 | Howell | 1989 | Ext | P | E | 202 | −0.296 | 0.155 | Fem+ |
| 14 | Jennings | 2013 | Ext | P | E | 30 | −0.024 | 0.398 | Fem |
| 15 | Lehl | 1989 | Ext | P | E | 16 | −0.995 | 0.534 | Fem+++ |
| 16 | Marin | 2013 | Ext | P | ALL | 139 | −0.119 | 0.170 | Fem |
| 17 | McCabe | 1993 | Ext | P | M&H | 6 | −0.013 | 0.913 | Fem |
| 18 | McDonald | 2012 | Raw | P | E | 15 | −0.532 | 0.770 | Fem++ |
| 19 | McDonald | 2012 | Raw | T | E | 15 | −1.414 | 0.809 | Fem+++ |
| 20 | Miller | 1991 | Ext | P1 | M | 44 | −0.690 | 0.367 | Fem++ |
| 21 | Miller | 1991 | Ext | P2 | M | 27 | 0.730 | 0.620 | Male++ |
| 22 | Munroe | 2009 | Raw | P | E | 35 | −0.042 | 0.345 | Fem |
| 23 | Nogay | 1995 | Ext | P | M&H | 61 | −0.303 | 0.258 | Fem+ |
| 24 | Nogay | 1995 | Ext | T | M&H | 61 | −0.350 | 0.259 | Fem+ |
| 25 | O’Donnell | 2002 | Ext | P | M | 75 | 0.162 | 0.271 | Male |
| 26 | O’Donnell | 2002 | Ext | T | M | 75 | −0.370 | 0.272 | Fem+ |
| 27 | Peariso | 2011 | Ext | P | E | 36 | −0.109 | 0.354 | Fem |
| 28 | Poovatanikul | 1993 | Ext | P | M&H | 44 | −0.850 | 0.343 | Fem+++ |
| 29 | Pratley | 1992 | Ext | P | M&H | 76 | −0.470 | 0.317 | Fem+ |
| 30 | Pratley | 1992 | Ext | T | M&H | 62 | −0.532 | 0.349 | Fem++ |
| 31 | Rose | 1991 | Ext | P | ALL | 71 | −0.239 | 0.285 | Fem+ |
| 32 | Ruzicska | 1989 | Ext | T | E&M | 11 | −2.652 | 0.882 | Fem+++ |
| 33 | Schoch | 1992 | Ext | T | E | 70 | −0.386 | 0.244 | Fem+ |
| 34 | Trout | 1985 | Ext | P | M&H | 28 | 0.118 | 0.378 | Male |
| 35 | Trout | 1985 | Ext | T | M&H | 28 | −0.055 | 0.378 | Fem++ |
| 36 | Yang | 1996 | Ext | P | E | 106 | −0.358 | 0.196 | Fem+ |
| 37 | Yang | 1996 | Ext | T | E | 106 | −0.133 | 0.194 | Fem |

(continued)

Figure 2 shows data previously arrayed in Table 6 in graphic form. The visual data indicate a fairly similar pattern of results across the 3 dimensions and the 10 functions. More specifically, the direction and magnitude of effects suggest small but significant differences on each of the three PIMRS dimensions (i.e., d(D1) = −0.281, d(D2) = −0.258, d(D3) = −0.206). Although there was more variation in the magnitude of effect sizes on function-level constructs (i.e., −0.159 to −0.358), all 10 ds suggested more active instructional leadership from the female principals (p < .01). Thus, gender-related differences appeared to be a “general effect” rather than concentrated in a few specific functions such as Supervising and Evaluating Instruction or Communicating the School’s Goals.

Discussion

Differences in the leadership styles employed by male and female principals were identified in the literature on educational leadership as far back as the 1950s and 1960s (e.g., Gross & Trask, 1964; Hemphill et al., 1962; Ramseyer, 1955). Over time, an accumulating body of research has yielded anecdotal references to female principals engaging the instructional leadership role more actively than male principals (Hallinger, 1983, 2011). Given the increasing importance assigned to instructional leadership in educational policy and practice, it appeared timely to review the empirical literature on this issue.

We employed meta-analysis to quantitatively integrate findings from 40 independent data sets drawn from 28 studies that had used the PIMRS to assess the instructional leadership of more than 2,500 principals from three countries.

Table 7. (continued)

| Summary statistic | N Principals | Value | SE |
|---|---|---|---|
| Weighted mean ES | 2,531 | −0.303 | 0.043 |
| Z for mean ES | | 7.046** | |
| Q (df = 36) | | 38.824 | |

Note. Positive ds estimates indicate more active instructional leadership from male principals and negative ds from female principals. In the last column, Fem indicates that the direction of d favored females, Male indicates that it favored males. Where this is followed by a “+” it indicates a small effect, “++” a moderate effect, and “+++” a large effect based on d. If no “+” is present, it indicates that the magnitude of d suggested no practical effect.
**The ES is considered significant at p < .01 (two-tailed) when the Z score exceeds 2.58.




In sum, the results indicated a “small but statistically significant effect” of gender on instructional leadership, with more active instructional leadership from female principals. This conclusion was reflected in the direction, magnitude, and significance of the total weighted mean effect size (Cohen’s d = −0.303; U3 = 62%; p < .01). Analysis of variation in effect sizes across the 3 PIMRS dimensions and 10 functions further indicated that gender differences were “general” rather than concentrated in specific areas of leadership practice.

To place these results in perspective, we refer to Eagly’s meta-analytic studies of gender differences in leadership (e.g., Eagly & Carli, 2003; Eagly et al., 1992; Eagly et al., 2003). Although Eagly’s research has focused on gender differences using leadership constructs other than instructional leadership, she employed “the common metric of effect sizes, which in reviews of gender and leadership generally take the form of a standardized difference (or d), defined as the difference between the mean scores of women and men” (Eagly & Carli, 2003, p. 811). With this in mind, it should be noted that the effect sizes (i.e., Cohen’s d) reported in the current study were generally as large or larger than those reported in Eagly’s rigorous meta-analytic studies of gender and leadership (e.g., Eagly & Carli, 2003; Eagly et al., 1992; Eagly et al., 2003).

A third issue concerns our interpretation of the summary effect size (i.e., d = −0.303, U3 = 62%) as “potentially meaningful.” In fact, the meaning of the gender-related differences in instructional leadership identified in this study can only be determined when the gender and leadership relationship is studied in conjunction with relevant dependent variables of interest to scholars and policy makers (e.g., teacher efficacy, teacher learning, teaching and learning quality, teacher commitment). This suggests the need for future multivariate studies that examine both moderators of gender (e.g., years of teaching experience of the principal) as well as mediators of leadership effects on student learning. With this in mind, we cautiously characterize the “small effect” identified in this study as “potentially meaningful.” Thus, the importance of this study lies in documenting and advancing the resolution of a question that has retained its relevance in the educational leadership literature for more than 50 years.

Figure 2. Weighted mean effect sizes and confidence intervals.

Limitations of the Study

As is the case in all meta-analyses (Lipsey & Wilson, 2001), key limitations derive from the sample of studies. For example, scholars have observed that master’s theses and doctoral dissertations meet a less rigorous and consistent standard than peer-reviewed journal articles (e.g., Bridges, 1982; Hallinger, 2011; Robinson et al., 2008). Nonetheless, as described earlier, almost all research syntheses and meta-analytic reviews published in our field’s top journals have relied heavily on doctoral dissertations. Moreover, quality concerns within the dissertations included in this study were ameliorated, to a degree, by two features of the studies. They all employed a common validated instrument, the PIMRS, for data collection, and the statistics needed from the studies for the purpose of this meta-analysis were relatively basic (e.g., correlations). Therefore, we suggest that our database of studies was neither atypical nor a reason to disqualify the results.

A second limitation lies in the concentration of data sets (37 of 40) that came from studies conducted in the United States. This limited geographic range is relevant given the diverse roles and status accorded to women across different societies (Shakeshaft, 2006). It is, however, interesting to note that results from studies conducted in Thailand (Poovatanikul, 1993) and Taiwan (Yang, 1996) were quite consistent with results from the United States. Nonetheless, generalization of our findings beyond the United States awaits confirmation based on a larger sample of principals working in a broader set of societies.

A third limitation concerns the exclusive focus on studies that used the PIMRS. Although deployment of a common instrument in all 40 data sets clearly enabled a more powerful application of meta-analysis, it introduced an “implicit bias” toward research conducted from an “American perspective on instructional leadership” (see Robinson et al., 2008; Scheerens, 2012; Witziers et al., 2003). This further highlights the need to be cautious in generalizing the results beyond the United States.

A fourth limitation also follows from the use of the PIMRS as the dependent measure. This instrument uses a response scale designed to assess the “relative frequency” of observed leadership behavior (Hallinger & Wang, 2015; Latham & Wexley, 1981). The instrument was not designed to yield scores that, by themselves, measure the quality, utility, suitability, or effectiveness of the principal’s instructional leadership. Thus, our characterizations of gender differences throughout the article have emphasized the level of principal activity or engagement in the instructional leadership role. Although we assert that this approach is both justifiable and useful, it offers only a one-dimensional perspective on how principals enact their instructional leadership role (Hallinger & Wang, 2015).3

The final limitation concerns the delimited scope of this study. This research addressed neither why female principals may engage the instructional leadership role more actively than males, nor whether these perceived differences carry over into relevant teaching and learning processes and outcomes. Testing explanations for the causes and impact of gender differences is critical for the purposes of policy and practice. Nonetheless, the first step in this research process was to establish the nature of gender differences in the enactment of the instructional leadership role. With a step taken toward that objective, future research should undertake multivariate studies that advance our understanding of the causes and effects of these perceived gender differences in principal instructional leadership. In particular, we emphasize the need for studies that examine potential moderators of gender (e.g., years of prior teaching experience of the principal, school size) as well as measures of the impact of principal instructional leadership (e.g., teacher behavior, school improvement).

Implications for Practice

When shifting the focus toward practice, the findings cohere into a leadership orientation that seems remarkably well-suited to 21st-century schools. Eagly and Carli (2003) described a leadership orientation among females as task-focused and democratic, while also exhibiting key features of transformational leadership. Our results extend this assertion to include a stronger disposition to engage the principal’s role as an instructional leader.

Nonetheless, the small size of the gender differences found in this study warrants caution when moving from research findings to practice. Although our results do not support a call for wholesale changes to principal selection, they do call into question the passive approach to principal recruitment and selection that continues to predominate in many education systems around the world. For example, in the United States, it is only recently that the overall gender distribution of principals has shifted slightly in favor of females (52% to 48%) at the primary school level. Even so, 70% of America’s high school principals continue to be male (Bitterman, Goldring, & Gray, 2013). Elsewhere in the world, in both developing and industrialized societies, the percentage of female principals consistently lags well behind males (Shakeshaft, 2006): Australia (20%/80%), France (30%/70%), Italy (30%/70%), Philippines (15%/75%), the United Kingdom (25%/75%). Our findings offer an “instrumental rationale” to complement equity-based arguments for increasing the density of female principals in the population of school leaders in contexts where they continue to lag male counterparts (see Coleman, 2007; Collard, 2001; Krüger, 1996, 2008; Nogay & Beebe, 2008; Pounder & Coleman, 2002; Shakeshaft, 2006).

This study was conducted within the lineage of a half-century effort to understand linkages between school leadership, teaching, and learning (Bossert et al., 1982; Bridges, 1967; Erickson, 1979; Hallinger & Heck, 1996; Scheerens, 2012). Our decision to focus on gender differences was stimulated by a series of reports that female principals appeared to engage in more active instructional leadership than male counterparts. We hope that this meta-analytic study has both brought greater clarity to the boundaries of current empirically grounded knowledge and identified potentially fruitful directions for future research and practice.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors wish to acknowledge the funding support of the Research Grants Council (RGC) of Hong Kong through the General Research Fund (GRF 841711).

Notes

1. The discrepancy between the sample size of Principal Instructional Management Rating Scale studies examined in this report and the number identified by Hallinger (2011) lies in the fact that not all the studies that he had identified contained data suitable for our analyses.

2. $Q_{\text{Prin}} = 24.185$; $Q_{\text{Tch}} = 15.770$; $Q_W = Q_{\text{Prin}} + Q_{\text{Tch}} = 39.956$; $Q_B = Q_T - Q_W = 41.711 - 39.956 = 1.755$.

3. Robinson et al. (2008) make a useful clarification of this distinction in their discussion of leadership capabilities.
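
The arithmetic in Note 2 can be retraced with a short sketch. It assumes that the between-group statistic $Q_B$ is referred to a chi-square distribution with one degree of freedom (two rater groups, principals and teachers); that choice of degrees of freedom, the function names, and the variable names are illustrative assumptions rather than details taken from the note. Recomputing from the rounded components gives $Q_W = 39.955$ and $Q_B = 1.756$, which differ from the reported values in the third decimal, presumably because the note rounds from unrounded figures.

```python
import math

# Values as reported in Note 2 (already rounded to three decimals there).
Q_PRIN = 24.185   # within-group Q for principal-rated data sets
Q_TCH = 15.770    # within-group Q for teacher-rated data sets
Q_TOTAL = 41.711  # total Q across all data sets


def partition_q(q_total, q_groups):
    """Return (Q_W, Q_B), where Q_W sums the group Qs and Q_B = Q_T - Q_W."""
    q_within = sum(q_groups)
    q_between = q_total - q_within
    return q_within, q_between


def chi2_upper_tail_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


q_within, q_between = partition_q(Q_TOTAL, [Q_PRIN, Q_TCH])

# Assumption (not stated in the note): two rater groups, so df = 2 - 1 = 1.
p_value = chi2_upper_tail_df1(q_between)

print(f"Q_W = {q_within:.3f}, Q_B = {q_between:.3f}, p = {p_value:.3f}")
```

Under that assumption, $Q_B$ falls well below 3.841, the .05 critical value of chi-square with one degree of freedom, so the sketch would not flag a difference between principal-rated and teacher-rated data sets.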

References

Adkinson, J. A. (1981). Women in school administration: A review of the research. Review of Educational Research, 51, 311-343.

Arcy, J. (1980). Self-perceptions of leader behavior of male and female elementary school principals in selected school districts in the Midwest United States (Unpublished doctoral dissertation). Iowa State University, Ames, IA.




Atkinson, R. (2013). An assessment of the perceived instructional leadership behaviors of assistant principals (Unpublished doctoral dissertation). Virginia Commonwealth University, Richmond, VA.

Babcock, C. (1991). Perceived instructional leadership behavior: Gender differences, female principals, male principals (Unpublished doctoral dissertation). Western Michigan University, Kalamazoo, MI.

Bauer, A. (2013). A comparative study of the leadership behaviors of the assistant principal (Unpublished doctoral dissertation). St. John’s University, New York, NY.

Bitterman, A., Goldring, R., & Gray, L. (2013). Characteristics of public and private elementary and secondary school principals in the United States: Results from the 2011-12 Schools and Staffing Survey. Washington, DC: National Center for Education Statistics, U.S. Department of Education.

Bossert, S., Dwyer, D., Rowan, B., & Lee, G. (1982). The instructional management role of the principal. Educational Administration Quarterly, 18(3), 34-64.

Bridges, E. (1967). Instructional leadership: A concept re-examined. Journal of Educational Administration, 5, 136-147.

Bridges, E. (1982). Research on the school administrator: The state-of-the-art, 1967-1980. Educational Administration Quarterly, 18(3), 12-33.

Carr, T. (2011). An examination of leadership styles in implementing instructional technology: A case study to examine the elementary school principal perspective (Unpublished doctoral dissertation). North Central University, Minneapolis, MN.

Charters, W. W., Jr., & Jovick, T. D. (1981). The gender of principals and principal/teacher relations in elementary schools. In P. A. Schmuck, W. W. Charters, & R. O. Carlson (Eds.), Educational policy and management: Sex differentials (pp. 307-331). Cambridge, MA: Academic Press.

Chin, J. M. C. (2007). Meta-analysis of transformational school leadership effects on school outcomes in Taiwan and the USA. Asia Pacific Education Review, 8, 166-177.

Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Erlbaum.

Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.

Coleman, M. (2007). Gender and educational leadership in England: A comparison of secondary headteachers’ views over time. School Leadership & Management, 27, 383-399.

Collard, J. L. (2001). Leadership and gender: An Australian perspective. Educational Management Administration & Leadership, 29, 343-355.

Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297-334.

Cronbach, L. J. (1988). Five perspectives on validity argument. In H. Wainer & H. Braun (Eds.), Test validity (pp. 3-17). Hillsdale, NJ: Erlbaum.

Cunningham, J. (2004). The effect of gender-role identity of female principals in Texas on teacher perceptions of instructional leadership behaviors (Unpublished doctoral dissertation). Texas A&M University, Corpus Christi, TX.

Dunn, A. (2010). A quantitative study of the perceptions of school leadership preparation in the State of Georgia (Unpublished doctoral dissertation). Walden University, Minneapolis, MN.



Duryea, M. (1988). The relationship between perceived principal instructional leadership behavior and locus of control (Unpublished doctoral dissertation). University of Houston, Houston, TX.

Eagly, A. H., & Carli, L. L. (2003). The female leadership advantage: An evaluation of the evidence. Leadership Quarterly, 14, 807-834.

Eagly, A. H., Johannesen-Schmidt, M. C., & van Engen, M. (2003). Transformational, transactional, and laissez faire leadership styles: A meta-analysis comparing women and men. Psychological Bulletin, 129, 569-591.

Eagly, A. H., Karau, S., & Johnson, B. (1992). Gender and leadership style among school principals: A meta-analysis. Educational Administration Quarterly, 28, 76-102.

Edmonds, R. (1979). Effective schools for the urban poor. Educational Leadership, 37, 15-24.

Erickson, D. A. (1979). Research on educational administration: The state-of-the-art. Educational Researcher, 8(3), 9-14.

Fischel, A., & Pottker, J. (1977). Performance of women principals: A review of behavioral and attitudinal studies. In J. Pottker & A. Fischel (Eds.), Sex bias in the schools (pp. 289-299). London, England: Associated University Press.

Frasher, J., & Frasher, R. (1979). Educational administration: A feminine profession. Educational Administration Quarterly, 15(2), 1-13.

Gallon, S. (1998). A study of teachers’ perceptions of the principal as an instructional leader: A comparative analysis of elementary, middle and senior high school principals (Unpublished doctoral dissertation). Florida International University, Miami, FL.

Geiselman, S. (2004). Predicting elementary school student achievement: The impact of principal gender and principal leadership skills (Unpublished doctoral dissertation). University of Louisville, Louisville, KY.

Glass, G. V. (1977). Integrating findings: The meta-analysis of research. Review of Research in Education, 5, 351-379.

Goldring, E., Huff, J., May, H., & Camburn, E. (2008). School context and individual characteristics: What influences principal practice? Journal of Educational Administration, 46, 332-352.

Groff, M. (2002). Elementary instructional leadership: Factors that affect performance (Unpublished doctoral dissertation). Saint Louis University, St. Louis, MO.

Gross, N., & Trask, A. (1964). Men and women as elementary school principals. Eugene, OR: ERIC.

Gross, N., & Trask, A. (1976). The sex factor and the management of schools. New York, NY: Wiley.

Hallinger, P. (1983). Assessing the instructional management behavior of principals (Unpublished doctoral dissertation). Stanford University, Stanford, CA.

Hallinger, P. (1990). Principal instructional management rating scale. Sarasota, FL: Leading Development Associates. (Original work published 1982)

Hallinger, P. (2011). A review of three decades of doctoral studies using the Principal Instructional Management Rating Scale: A lens on methodological progress in educational leadership. Educational Administration Quarterly, 47, 271-306.




Hallinger, P. (2014). Reviewing reviews of research in educational leadership: An empirical analysis. Educational Administration Quarterly, 50, 539-576.

Hallinger, P., & Heck, R. H. (1996). Reassessing the principal’s role in school effectiveness: A review of empirical research, 1980-1995. Educational Administration Quarterly, 32, 5-44.

Hallinger, P., & Heck, R. H. (2002). What do you call people with visions? The role of vision, mission and goals in school leadership and improvement. In K. Leithwood, P. Hallinger, J. Chapman, D. Corson, & A. Hart (Eds.), The second international handbook of educational leadership and administration (pp. 9-40). Dordrecht, Netherlands: Kluwer.

Hallinger, P., & Lee, M. S. (2013). Exploring principal capacity to lead reform of teaching and learning quality in Thailand. International Journal of Educational Development, 33, 305-315.

Hallinger, P., & Murphy, J. F. (1985). Assessing the instructional leadership behavior of principals. Elementary School Journal, 86, 217-248.

Hallinger, P., & Murphy, J. F. (2012). Running on empty? Finding the time and capacity to lead learning. NASSP Bulletin, 97, 5-21.

Hallinger, P., Murphy, J., Weil, M., Mesa, R. H. P., & Mitman, A. (1983). Effective schools: Identifying the specific practices and behaviors of the principal. NASSP Bulletin, 67(463), 83-91.

Hallinger, P., & Wang, W. C. (2015). Assessing instructional leadership with the Principal Instructional Management Rating Scale. Dordrecht, Netherlands: Springer.

Hallinger, P., Wang, W. C., & Chen, C. W. (2013). Assessing the measurement properties of the Principal Instructional Management Rating Scale: A meta-analysis of reliability studies. Educational Administration Quarterly, 49, 272-309.

Heck, R. H., & Hallinger, P. (2014). Modeling the effects of school leadership on teaching and learning over time. Journal of Educational Administration, 52, 653-681.

Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL: Academic Press.

Hemphill, J., Griffiths, D., & Frederiksen, N. (1962). Administrative performance and personality. New York, NY: Teachers College, Columbia University.

Howell, V. (1989). Analyses by gender and by school enrollment, ratings of elementary school principals’ instructional leadership behaviors (Unpublished doctoral dissertation). Tennessee State University, Knoxville, TN.

Jennings, T. C. (2013). School principals’ personality types and instructional leadership practices in selected distinguished and non-distinguished Title One elementary schools in South Carolina (Unpublished doctoral dissertation). South Carolina State University, Orangeburg, SC.

Kane, M. T., Gilmore, G. M., & Crooks, T. J. (1976). Student evaluations of teaching: The generalizability of class means. Journal of Educational Measurement, 13, 171-183.

Kline, R. B. (2013). Exploratory and confirmatory factor analysis. In Y. Petscher & C. Schatschneider (Eds.), Applied quantitative analysis in the social sciences (pp. 171-207). New York, NY: Routledge.



Krüger, M. L. (1996). Gender issues in school headship: Quality versus power? European Journal of Education, 31, 447-461.

Krüger, M. L. (2008). School leadership, sex and gender: Welcome to difference. International Journal of Leadership in Education, 11, 155-168.

Latham, G., & Wexley, K. (1981). Increasing productivity through performance appraisal. Menlo Park, CA: Addison Wesley.

Lehl, A. (1989). A comparison study of the instructional knowledge and the instructional leadership of elementary principals in Nebraska (Unpublished doctoral dissertation). University of Nebraska, Lincoln, NE.

Leithwood, K., Begley, P., & Cousins, B. (1990). The nature, causes and consequences of principals’ practices: An agenda for future research. Journal of Educational Administration, 28(4), 5-31.

Leithwood, K., & Jantzi, D. (2008). Linking leadership to student learning: The contributions of leader efficacy. Educational Administration Quarterly, 44, 496-528.

Leithwood, K., & Montgomery, D. J. (1982). The role of the elementary school principal in program improvement. Review of Educational Research, 52, 309-339.

Leithwood, K., & Sun, J. P. (2012). The nature and effects of transformational school leadership: A meta-analytic review of unpublished research. Educational Administration Quarterly, 48, 387-423.

Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage.

Marin, J. (2013). The relationship between self-efficacy, ethnicity, and beliefs about diversity to instructional leadership and transformational leadership practices of urban principals (Unpublished doctoral dissertation). University of Southern California, Los Angeles, CA.

Marzano, R. J., Waters, T., & McNulty, B. A. (2005). School leadership that works: From research to results. Alexandria, VA: Association for Supervision and Curriculum Development.

McCabe, B. (1993). Antecedents to principal instructional management behavior (Unpublished doctoral dissertation). University of California-Santa Barbara, Santa Barbara, CA.

McDonald, J. (2012). Instructional leadership and student achievement: The role of Catholic identity in supporting instructional leadership (Unpublished doctoral dissertation). Drexel University, Philadelphia, PA.

Miller, L. (1991). Perceptions of Alabama middle school teachers and principals regarding the principal’s instructional leadership role and competencies as developed by effective school studies (Unpublished doctoral dissertation). University of Alabama, Tuscaloosa, AL.

Munroe, M. D. (2009). Correlation of emotional intelligence and instructional leadership behaviors (Unpublished doctoral dissertation). University of Phoenix, Phoenix, AZ.

Murphy, J. F., & Torre, D. (2014). Vision: Essential scaffolding. Educational Management Administration & Leadership, 43, 177-197.

Nogay, K. (1995). The relationship of superordinate and subordinate gender to the perceptions of leadership behaviors of female secondary principals (Unpublished doctoral dissertation). Youngstown State University, Youngstown, OH.




Nogay, K., & Beebe, R. J. (2008). Gender and perceptions: Females as secondary principals. Journal of School Leadership, 18(6), 583-602.

Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York, NY: McGraw-Hill.

O’Donnell, R. (2002). Middle-level principals’ instructional leadership behaviors and student achievement (Unpublished doctoral dissertation). Lehigh University, Bethlehem, PA.

O’Donnell, R., & White, G. (2005). Within the accountability era: Principals’ instructional leadership behaviors and student achievement. NASSP Bulletin, 89(645), 56-71.

Peariso, J. F. (2011). A study of principals’ instructional leadership behaviors and beliefs of good pedagogical practice among effective California high schools serving socioeconomically disadvantaged and English learners (Unpublished doctoral dissertation). Liberty University, Lynchburg, VA.

Poovatanikul, V. (1993). Analyses of the perceptions of Thai principals and teachers as to the principal’s role as instructional leader (Unpublished doctoral dissertation). Southern Illinois University, Carbondale, IL.

Pounder, J., & Coleman, M. (2002). Women—Better leaders than men? In general and educational management it still “all depends.” Leadership & Organization Development Journal, 23, 122-133.

Pratley, D. (1992). A study of middle level principals’ involvement in instructional leadership (Unpublished doctoral dissertation). Michigan State University, East Lansing, MI.

Ramseyer, J. A. (1955). Factors affecting educational administration: Guideposts for research and action (The School-Community Development Study; Monograph Series No. 2). Columbus, OH: College of Education, Ohio State University.

Robinson, V. M., Lloyd, C., & Rowe, K. (2008). The impact of leadership on student outcomes: An analysis of the differential effects of leadership types. Educational Administration Quarterly, 44, 635-674.

Rose, M. (1991). Perceptions of selected principals in South Carolina concerning their instructional management behaviors (Unpublished doctoral dissertation). University of Southern Mississippi, Hattiesburg, MS.

Rosenthal, R. (1994). Parametric measures of effect size. In H. Cooper & L. V. Hedges (Eds.), The handbook of research synthesis (pp. 231-244). New York, NY: Russell Sage Foundation.

Rosenthal, R., & Rubin, D. B. (1986). Meta-analytic procedures for combining studies with multiple effect sizes. Psychological Bulletin, 99, 400-406.

Ruzicska, J. (1989). The relationships among principals’ sense of efficacy, instructional leadership, and student achievement (Unpublished doctoral dissertation). University of San Francisco, San Francisco, CA.

Salley, C., McPherson, R., & Baehr, M. (1979). What principals do: A preliminary occupational analysis. In D. Erickson & T. Reller (Eds.), The principal in metropolitan schools (pp. 42-66). Berkeley, CA: McCutchan.

Scheerens, J. (2012). School leadership effects revisited: Review and meta-analysis of empirical studies. Dordrecht, Netherlands: Springer.



Schoch, A. (1992). The relationship between instructional leadership behavior, school effectiveness, school size, gender, race, and years of principalship experience in elementary schools in South Carolina (Unpublished doctoral dissertation). University of South Carolina, Columbia, SC.

Shakeshaft, C. (2006). Gender and educational management. In C. Skelton, B. Francis, & L. Smulyan (Eds.), The Sage handbook of gender and education (pp. 497-511). Thousand Oaks, CA: Sage.

Sun, J. P., & Leithwood, K. (2015). Direction-setting school leadership practices: A meta-analytical review of evidence about their influence. School Effectiveness and School Improvement, 26, 499-523.

Sutton, A. J., Song, F., Gilbody, S. M., & Abrams, K. R. (2000). Modeling publication bias in meta-analysis: A review. Statistical Methods in Medical Research, 9, 421-445.

Trout, K. (1985). The relationship between gender and selected characteristics associated with instructional leadership for senior high school principals (Unpublished doctoral dissertation). University of South Carolina, Columbia, SC.

Witziers, B., Bosker, R., & Krüger, M. (2003). Educational leadership and student achievement: The elusive search for an association. Educational Administration Quarterly, 39, 398-425.

Wright, B. D., Linacre, J. M., Gustafson, J. E., & Martin-Löf, P. (1994). Reasonable mean square fit values. Rasch Measurement Transactions, 8, 370.

Yang, C.-S. (1996). Instructional leadership behaviors of elementary school principals in Taiwan, Republic of China (Unpublished doctoral dissertation). University of Northern Colorado, Greeley, CO.

Author Biographies

Dr. Philip Hallinger is professor of educational management at Chulalongkorn University (Thailand) and Distinguished Visiting Professor of educational leadership and management at the University of Johannesburg (South Africa). In 2014, he received the Excellence in Research Award from AERA and the Roald F. Campbell Award for Lifetime Achievement in Educational Administration from UCEA. His research interests include instructional leadership, school leadership effects, and problem-based leadership development.

Dr. Li Dongyu is a post-doctoral fellow in the Faculty of Education and Human Development, the Hong Kong Institute of Education. Her research interests include educational leadership, gender differences, and meta-analysis.

Dr. Wen-Chung Wang is chair professor of educational psychology and director of the Assessment Research Centre in the Hong Kong Institute of Education, Hong Kong SAR. He is a fellow of the AERA and associate editor of several psychology journals. His research interests include Rasch measurement, item response theory, computerized adaptive testing, cognitive diagnosis modeling, educational and psychological measurement, and psychometrics.