Propensity Score Matching

DOI: 10.1111/j.1475-679X.2009.00361.x
Journal of Accounting Research
Vol. 48 No. 2 May 2010
Printed in U.S.A.

Chief Executive Officer Equity Incentives and Accounting Irregularities

CHRISTOPHER S. ARMSTRONG,∗ ALAN D. JAGOLINZER,† AND DAVID F. LARCKER‡

Received 12 May 2008; accepted 8 September 2009

ABSTRACT

This study examines whether Chief Executive Officer (CEO) equity-based holdings and compensation provide incentives to manipulate accounting reports. While several prior studies have examined this important question, the empirical evidence is mixed and the existence of a link between CEO equity incentives and accounting irregularities remains an open question. Because inferences from prior studies may be confounded by assumptions inherent in research design choices, we use propensity-score matching and assess hidden (omitted variable) bias within a broader sample. In contrast to most prior research, we do not find evidence of a positive association between CEO equity incentives and accounting irregularities after matching CEOs on the observable characteristics of their contracting environments. Instead, we find some evidence that accounting irregularities occur less frequently at firms where CEOs have relatively higher levels of equity incentives.

∗The Wharton School, University of Pennsylvania; †Stanford University, Graduate School of Business; ‡Stanford University, Graduate School of Business. We thank Paul Rosenbaum for insightful methodological discussions and Bo Lu for making available his nonbipartite matching algorithm. We also thank Doug Skinner (editor), an anonymous referee, John Core, Ian Gow, Wayne Guay, Christopher Ittner, Daniel Taylor, Andrew Yim, and workshop participants at Penn State University and Tilburg University for helpful feedback. Jagolinzer acknowledges financial support from the James and Doris McNamara Faculty Fellowship and the John A. and Cynthia Fry Gunn Faculty Scholarship.

Copyright ©, University of Chicago on behalf of the Accounting Research Center, 2009


Sticky Note

Pretty neat paper. The approach is for settings in which the causal variable (equity incentives) is endogenous and affects the variable of interest (accounting irregularities).

How to do it:
1. Collect PortfolioDelta (the equity incentive measure).
2. Rank the observations and group them into 5 quintiles.
3. Run an ordered logistic regression: y = 1 if the rank is 1, 2 if the rank is 2, and so on (see my note on ordered logistic regression); x = the known variables that cause incentives.
4. Extract the predicted probabilities for each firm; these are the propensity scores.
5. Do the matching: match with a minimum-distance algorithm (table 4). This matching is very creative, with treatment and control being the same quintiles in reverse order.
6. Count the number of irregularities in the treatment and control groups and find the difference: no difference!
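The steps in the note above can be sketched in code. The following Python fragment is a minimal illustration, not the authors' procedure. It assumes a scalar propensity score has already been estimated for each observation (the ordered logistic regression itself is not shown), it uses a simple greedy pairing in place of the nonbipartite matching algorithm credited in the acknowledgments, and the function names and the distance measure (absolute difference in scores) are hypothetical.

```python
def assign_quintiles(values):
    """Return a quintile label (1..5) for each value, based on its rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    quintile = [0] * len(values)
    for rank, i in enumerate(order):
        quintile[i] = rank * 5 // len(values) + 1
    return quintile


def greedy_match(scores, quintiles):
    """Pair observations with close scores but different incentive quintiles.

    Assumption: distance is the absolute difference in propensity scores.
    Returns (higher_quintile_index, lower_quintile_index) tuples.
    """
    candidates = sorted(
        (abs(scores[i] - scores[j]), i, j)
        for i in range(len(scores))
        for j in range(i + 1, len(scores))
        if quintiles[i] != quintiles[j]
    )
    used, pairs = set(), []
    for _, i, j in candidates:
        if i not in used and j not in used:
            # The member in the higher quintile plays the "treatment" role.
            pairs.append((i, j) if quintiles[i] > quintiles[j] else (j, i))
            used.update((i, j))
    return pairs


def irregularity_difference(pairs, irregular):
    """Irregularities among higher-incentive members minus lower-incentive members."""
    treated = sum(irregular[t] for t, _ in pairs)
    control = sum(irregular[c] for _, c in pairs)
    return treated - control


# Hypothetical data: portfolio deltas, propensity scores, irregularity flags.
port_delta = [25_000, 90_000, 30_000, 500_000, 210_000, 4_800_000]
scores = [0.21, 0.35, 0.22, 0.36, 0.50, 0.51]
irregular = [0, 1, 0, 0, 0, 1]

quintiles = assign_quintiles(port_delta)
pairs = greedy_match(scores, quintiles)
print(pairs, irregularity_difference(pairs, irregular))
```

A greedy pass is easy to read but does not minimize the total distance across all pairs, which an optimal matching algorithm would do.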

1. Introduction

This study examines the relationship between chief executive officer (CEO) equity incentives and accounting irregularities (e.g., restatements, Securities and Exchange Commission Accounting and Auditing Enforcement Releases, and shareholder class action lawsuits). Although equity holdings may alleviate certain agency problems between executives and shareholders, concerns have arisen among researchers, regulators, and the business press that “high-powered” equity incentives might also motivate executives to manipulate accounting information for personal gain. This view assumes that stock price is a function of reported earnings and that executives manipulate accounting earnings to increase the value of their personal equity holdings.1 If this allegation is true and the economic cost of accounting manipulation is large, this idea has important implications for executive-compensation contract design and corporate monitoring by both internal and external parties.

Although at least 10 recent studies have examined the relationship between equity incentives and various types of accounting irregularities, no conclusive set of results has emerged from this literature. Eight prior studies find evidence of a positive relationship, but even within this group the evidence is mixed with regard to which components of an executive’s equity incentives (e.g., restricted stock, unvested options, and vested options) produce this association. Two additional studies do not find evidence of a relationship, even though they share similar proxies and samples with studies that do find a relationship.

Most prior studies adopt a research design that relies heavily on assumptions about the functional form of the relationship between accounting irregularities and equity incentives (as well as whatever control variables are used in the study). Specifically, these studies match firms on the outcome variable of interest (e.g., a firm that experienced accounting fraud is matched with a firm that did not experience fraud during the same period) using a small number of variables such as firm size and industrial classification. Other potential confounding variables are “controlled” through their inclusion in an estimation equation that relates accounting irregularities to equity incentives. Although common in empirical research, this research design relies on a variety of restrictive and perhaps unrealistic assumptions to produce reliable inferences.

Prior studies have also tended to analyze a relatively small sample of firms that lie in the intersection of the Standard & Poor’s ExecuComp database and either Government Accountability Office (GAO) Financial Statement Restatements or U.S. Securities and Exchange Commission (SEC) Accounting and Auditing Enforcement Releases (AAERs). Since ExecuComp does not provide data for the majority of firms in the economy, it is possible that the results of prior studies are influenced by selection bias.2 Moreover, it is not clear whether small samples (e.g., between 50 and 200 observations) provide sufficient statistical power for an analysis of the determinants of a relatively rare event such as a major accounting manipulation. This uncertainty hinders the ability to draw inferences regarding the primary research hypothesis when a statistically significant relationship is not detected. Finally, prior studies have generally ignored the likely endogenous matching of executives with their observed compensation contracts and, thus, their observed level of equity incentives. Since this type of endogenous matching is an important feature of the executive labor market, it is difficult to interpret prior results, because the reported parameter estimates are likely to be biased.

1 This view implicitly ignores (or considers as trivial) the effect of executive ethics, actions by monitors, and executives’ expected costs associated with manipulation. This view also requires that the market is unable to distinguish between “true” and manipulated earnings.

We draw inferences regarding the relationship between CEO equity incentives and accounting irregularities from a broad data set and use a research design that better addresses the potential confounds inherent in observational studies (Rosenbaum and Rubin [1983], Rosenbaum [2002]). To reduce the potential for “overt bias,” we employ a propensity-score matched-pair research design to join observations that are similar along a comprehensive set of firm- and manager-level dimensions.3 The propensity-score method forms matched pairs of CEO firm-years that have similar contracting environments but differing levels of CEO equity incentives. This approach alleviates misspecification that occurs when the research design assumes an incorrect functional form for the relationship between the variables of interest (including controls) and the outcome.

We also assess the sensitivity of our results to “hidden bias,” or unobserved correlated omitted variables, using the bounding techniques developed by Rosenbaum [2002]. This bounding approach provides insight into the likelihood that our results are confounded by explanations such as endogenous matching of CEOs and equity incentives on the basis of unobserved variables such as the level of CEO risk aversion. Thus, our research design relaxes the assumptions of the traditional matched-pairs approach and assesses the impact of omitted-variable and endogeneity concerns.
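To make the bounding idea concrete, the following Python sketch computes Rosenbaum-style bounds for a matched-pair design with a binary outcome, using the sign-test (McNemar) form of the bounds. It is an illustration under stated assumptions rather than the procedure used in this paper. The parameter gamma is the assumed maximum odds ratio by which an unobserved variable could make one member of a pair more likely than the other to have the higher incentive level, only discordant pairs (exactly one member with an irregularity) enter the test, and the function names and example counts are hypothetical.

```python
from math import comb


def binom_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


def rosenbaum_bounds(n_discordant, n_treated_events, gamma):
    """Bounds on the one-sided sign-test p-value under hidden bias gamma >= 1.

    Among discordant pairs, the chance that the higher-incentive member is
    the one with the irregularity lies between 1/(1+gamma) and
    gamma/(1+gamma). gamma = 1 corresponds to no hidden bias.
    """
    p_low = 1 / (1 + gamma)
    p_high = gamma / (1 + gamma)
    return (
        binom_upper_tail(n_discordant, n_treated_events, p_low),
        binom_upper_tail(n_discordant, n_treated_events, p_high),
    )


# Hypothetical counts: 10 discordant pairs, 8 with the event on the
# higher-incentive side. The upper bound widens as gamma grows.
for gamma in (1.0, 1.5, 2.0):
    low, high = rosenbaum_bounds(10, 8, gamma)
    print(f"gamma={gamma}: p-value between {low:.4f} and {high:.4f}")
```

A result is usually described as sensitive to hidden bias at the smallest gamma for which the upper bound crosses the chosen significance level.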

In contrast to most prior studies, we do not observe a positive relationship between CEO equity incentives and the incidence of accounting irregularities. Instead, our evidence suggests that the level of CEO equity incentives has a modest negative relationship with the incidence of accounting irregularities. This result is more consistent with the notion that equity incentives reduce agency costs that arise with respect to financial reporting than is the interpretation that equity incentives cause managers to manipulate reported earnings.

2 Studies using ExecuComp data may be prone to selection bias concerns, since ExecuComp focuses exclusively on firms listed in the Fortune 1500 (e.g., Cadman, Klasa, and Matsunaga [2006]).

3 Rosenbaum [2002, p. 71] defines overt bias as “one that can be seen in the data at hand,” which means that it is bias that is related to observable variables. It can result from either omission of observable variables or from the specification of an improper functional form for the relationship between observable variables and the outcome variable of interest. In contrast, “hidden bias” is associated with the omission of unobservable variables (i.e., correlated omitted variables). We consider both types of bias in our analysis.

Although we provide only one substantive application, propensity-score methods can (and perhaps should) be applied to other empirical accounting studies in which the hypothesized causal variable is an endogenous choice by managers, boards of directors, or other similar parties. In particular, using propensity scores to generate matched pairs with maximum variation in the causal variable of interest while minimizing the variation in the controls is, in many cases, a superior econometric approach to matching on the outcome variable and relying on a linear or some other assumed functional form to control for confounding variables. Moreover, propensity-score methods also enable the researcher to explicitly quantify the sensitivity of the results for the primary causal variable to unobserved correlated omitted variables.

Section 2 of this paper reviews the prior literature examining the relationship between executive incentives and accounting irregularities. Section 3 describes the sample and our primary measurements. Section 4 discusses the propensity-score matched-pair research design and compares this approach with the regression research design that is common in prior studies. Section 5 presents our primary empirical results. Section 6 discusses sensitivity analyses. Section 7 provides concluding remarks. Finally, appendix A includes basic methodological background regarding observational studies and appendix B discusses the importance of functional form when selecting regression or matching approaches for inference.

2. Prior Research

At least 10 recent studies (summarized in table 1) examine the relationship between accounting irregularities and executives’ equity incentives. These studies generally hypothesize that equity-based compensation and holdings provide incentives for managers to manipulate accounting numbers (e.g., Harris and Bromiley [2007], Efendi, Srivastava, and Swanson [2007], Bergstresser and Philippon [2006]), perhaps to increase gains from pending insider sales (Cheng and Warfield [2005]). Harris and Bromiley [2007], for example, suggest that the likelihood of managerial impropriety rises with “the strength of inducements” and therefore test for a positive relationship between the probability of accounting misrepresentation and stock-option compensation. Few studies (e.g., O’Connor et al. [2006], Burns and Kedia [2006]), however, explicitly consider the alternative possibility that equity incentives might instead lessen management’s desire to manipulate accounting numbers by aligning managers’ interests with those of shareholders.

Eight of the 10 papers listed in table 1 find some evidence that executives’equity incentives exhibit a positive statistical association with accounting


TABLE 1
Summary of Prior Literature

| Study | Primary Equity Incentives Proxy | Accounting Irregularities Proxy | Unit of Analysis | Research Design | Sample | Observed Association |
|---|---|---|---|---|---|---|
| Baber, Kang, and Liang [2007] | Compensation Mix, Exercisable Options scaled by Shares Outstanding | Restatements | CEO | Matched pair (year, industry, exchange, assets) logistic regression | 193 firm-years plus matches, 1997–2002 | None |
| Harris and Bromiley [2007] | Option and Bonus Value scaled by Total Compensation Value | Restatements | CEO | Matched pair (year, industry, sales) conditional logistic regression | 434 firm-years plus matches, 1997–2002 | Positive for Option Value scaled by Total Compensation Value |
| Larcker, Richardson, and Tuna [2007] | Compensation Mix | Abnormal Accruals, Restatements | CEO | OLS regression, pooled logistic regression | 1,484 firm-years, 118 firm-years plus all other firm-year observations, 2002–2003 | Positive, none |
| Efendi, Srivastava, and Swanson [2007] | Component Value, Option Intrinsic Value, Option Delta | Restatements, Severe Restatements | CEO | Matched pair (year, industry, assets) logistic regression, ordered logistic regression | 95 firm-years plus matches, 2001–2002 | Positive for Option Intrinsic Value and Option Delta |
| Erickson, Hanlon, and Maydew [2006] | Portfolio Delta | AAERs | Top 5 execs | Matched firms (year, industry, assets) logistic regression | 50 firm-years plus matches, 1996–2003 | None |
| Johnson, Ryan, and Tian [2009] | Portfolio Delta and Component Deltas | AAERs | Top 5 execs and CEO only | Matched pair (year, industry, revenues) conditional logistic regression | 53 firm-years plus matches, 1992–2001 | Positive only for incentives related to unrestricted stock |
| Burns and Kedia [2006] | Portfolio Delta and Component Deltas | Restatements, Restatement Magnitude | CEO | Pooled logistic regression, pooled OLS regression | 266 firm-years plus all other ExecuComp firm-years, 1995–2002 | Positive only for incentives related to stock options |
| Bergstresser and Philippon [2006] | Incentive Ratio (Portfolio Delta scaled by Compensation) | Discretionary Accruals | CEO | OLS regression | 4,761 firm-years, 1994–2000 | Positive |
| O’Connor et al. [2006] | Black–Scholes [1973] Option Value | Restatements | CEO | Matched pair (year, industry, sales, income, option vesting schedules) | 65 firm-years plus matches, 2000–2004 | Positive if (1) CEO is board chair and other board members do not receive options, or (2) CEO is not board chair and other board members receive options |
| Cheng and Warfield [2005] | Component Holdings scaled by shares outstanding | Meet/Just Beat Expectations, Abnormal Accruals | CEO | Pooled logistic regression, pooled OLS | 4,301 firm-years, 6,307 firm-years, 1993–2000 | Positive only for unexercisable options and stock holdings |


manipulation. Although the results of these studies might be considered as a consensus for this research question, there is considerable variation across inferences presented within these papers. This lack of consistency occurs even though similar proxies for accounting manipulation and equity incentives are used and there is considerable cross-sectional and temporal overlap in their samples. For example, Johnson, Ryan, and Tian [2009] and Erickson, Hanlon, and Maydew [2006] both assess the relationship between the incidence of accounting fraud (identified using AAERs) and the equity portfolio delta computed for top firm executives.4 Although the two samples exhibit considerable overlap, Johnson, Ryan, and Tian [2009] report evidence of a strong positive association between unrestricted equity holdings and the incidence of accounting fraud, while Erickson, Hanlon, and Maydew [2006] do not observe any statistical association. Similarly, Baber, Kang, and Liang [2007] and Harris and Bromiley [2007] both examine the relationship between equity incentives and the incidence of accounting restatements. Their samples differ in the number of observations but overlap completely in observation years. In spite of this overlap, the studies report surprisingly different results. Harris and Bromiley [2007] find a positive association between the incidence of accounting restatements and the ratio of option compensation to total compensation, while Baber, Kang, and Liang [2007] do not find a similar statistical association.

Some prior studies provide evidence of a positive association only for certain components of option-related holdings (e.g., Harris and Bromiley [2007], Burns and Kedia [2006], Efendi, Srivastava, and Swanson [2007]). Others provide evidence of a positive association for different equity components, such as unvested options and stock ownership (Cheng and Warfield [2005]), vested stock holdings (Johnson, Ryan, and Tian [2009]), and the entire equity portfolio (Bergstresser and Philippon [2006]). Yet another study finds evidence of a positive association for option-related equity components only when conditioned on the Board of Directors’ composition and compensation structure (O’Connor et al. [2006]). These inconsistencies highlight the difficulty in drawing general inferences regarding the association between equity incentives and accounting irregularities from prior research.

3. Sample and Measurement Choice

Our sample of CEO equity incentives, measured between 2001 and 2005, is obtained from a comprehensive database provided by Equilar, Inc.5 This database is similar to ExecuComp’s in that it provides executive-compensation and equity-holdings data collected from annual proxy filings (DEF 14A) with the SEC. However, the Equilar data provides 3,634, 3,930, 4,043, 4,051, and 4,047 CEO-firm observations (in contrast with the roughly 1,500 CEO-firm observations available annually from ExecuComp) across fiscal years 2001–2005, respectively.6

It is difficult to construct an appropriate empirical measure for the incidence of accounting manipulation, since this managerial action is unobserved. Most empirical studies infer manipulation from observing “extreme” outcomes in which manipulation is likely to have occurred (e.g., incidences of accounting restatements and regulatory or legal action). One concern with this measurement method is that it incorrectly classifies firms that manipulate accounting but that are not identified for restatement or for regulatory or legal action. The potential for misclassification is a limitation of our study as well as of previous studies in this area.

To reduce the risk of misclassification, we consider three different types of “accounting irregularities.” The first is financial restatements related to accounting manipulation. These data are obtained from Glass-Lewis & Co., which comprehensively collects restatement information from SEC filings, press releases, and other public data. We identify accounting restatements between 2001 and 2005 that relate to perceived reporting manipulation classified as accounting fraud, an SEC investigation, a securities class action suit, improper reserve allowances, improper revenue recognition, or improper expense recognition.7 We code a restatement incident as the first fiscal year in which improper accounting occurred that subsequently necessitated a restatement. As shown in table 2 (panel A), we identify 464 restatement incidents (3.4% of the total sample) across the time period covered in our analysis, with the most observations occurring during fiscal year 2004.

The second accounting irregularity we consider is whether the firm was accused of accounting manipulation in a class action lawsuit. We identify these firms in a database provided by Woodruff-Sawyer and Co. that records class action lawsuit damage periods between 2001 and 2005. The lawsuits allege

4 Equity portfolio delta is the change in the (typically risk-neutral) dollar value of an executive’s equity portfolio (stock, restricted stock, and stock-option holdings) for a 1% change in the price of the underlying stock.

5 The period 2001–2005 overlaps with regulatory environment changes (e.g., Sarbanes–Oxley Act, Regulation FD, SEC Rule 10b5-1) that may affect inferences relative to those reported in studies that examine earlier periods. We assess the sensitivity of our inferences to time-period choice in section 6.4.

6 The total of 19,705 pooled observations is the maximum number of CEO-firm-years available from Equilar. Eliminating observations with missing analysis data yields 13,706 prematch CEO-firm-year observations. Requiring one-year-ahead data yields 10,773 CEO-firm-year observations for the propensity-score estimation. The propensity-score matching algorithm yields a primary analysis sample of 9,118 CEO-firm-year observations (4,559 matched pairs).
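The sample counts reported above can be cross-checked with a few lines of arithmetic. The Python snippet below is only a bookkeeping sketch: every number is taken from the text and footnote 6, while the variable names and stage labels are made up for illustration.

```python
# CEO-firm observations per fiscal year, as reported in the text.
yearly_counts = {2001: 3634, 2002: 3930, 2003: 4043, 2004: 4051, 2005: 4047}

# Sample size at each stage, as reported in footnote 6.
stages = [
    ("pooled Equilar observations", 19705),
    ("nonmissing analysis data", 13706),
    ("one-year-ahead data available", 10773),
    ("matched sample", 9118),
]
matched_pairs = 4559

# The yearly counts should add up to the pooled total, and each matched
# pair contributes two observations to the matched sample.
pooled_total = sum(yearly_counts.values())
observations_in_pairs = 2 * matched_pairs
print(pooled_total, observations_in_pairs)

# Share of observations retained at each step of the sample construction.
for (label, n), (_, previous) in zip(stages[1:], stages[:-1]):
    print(f"{label}: {n} ({n / previous:.1%} of the previous stage)")
```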

7 Revenue recognition restatements may result from changes in GAAP or GAAP enforcement (e.g., Staff Accounting Bulletin 101). We classify these restatements as “manipulation,” since many GAAP enforcement changes resulted from regulatory perception that revenue was being misreported. For sensitivity, we also restrict our restatement sample to the subsample of Glass-Lewis restatements that note revenue recognition, expense recognition, or concerns over reserves and allowances (Palmrose, Richardson, and Scholz [2004]) and also note a material weakness, a late filing, an auditor change, or a restatement via 8-K filing. Results for this restricted restatement sample are qualitatively similar to our reported results.


TABLE 2
Descriptive Statistics for Accounting Irregularities, CEO Equity Incentives, and Firm Characteristics

Panel A: Accounting Irregularities (n = 13,706)a

| Variable | Period | Number of Firms | Percentage |
|---|---|---|---|
| Manipulation Restatement | Pooled | 464 | 3.4% |
| | 2001 | 17 | 0.1% |
| | 2002 | 69 | 0.5% |
| | 2003 | 96 | 0.7% |
| | 2004 | 203 | 1.5% |
| | 2005 | 79 | 0.6% |
| Accounting Lawsuit | Pooled | 464 | 3.4% |
| | 2001 | 122 | 0.9% |
| | 2002 | 98 | 0.7% |
| | 2003 | 118 | 0.9% |
| | 2004 | 83 | 0.6% |
| | 2005 | 43 | 0.3% |
| AAER | Pooled | 157 | 1.2% |
| | 2001 | 50 | 0.4% |
| | 2002 | 48 | 0.4% |
| | 2003 | 35 | 0.3% |
| | 2004 | 18 | 0.1% |
| | 2005 | 6 | 0.0% |


Panel B: CEO Equity Incentives (n = 10,773)b

| Variable | Period | EqIncQuint = 1 Mean | EqIncQuint = 1 Median | EqIncQuint = 2 Mean | EqIncQuint = 2 Median | EqIncQuint = 3 Mean | EqIncQuint = 3 Median | EqIncQuint = 4 Mean | EqIncQuint = 4 Median | EqIncQuint = 5 Mean | EqIncQuint = 5 Median |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PortDelta | Pooled | 24,980 | 24,357 | 88,124 | 84,941 | 210,928 | 204,436 | 519,365 | 489,012 | 4,819,973 | 1,822,611 |
| | 2001 | 25,999 | 27,159 | 91,085 | 87,636 | 224,167 | 222,667 | 537,664 | 501,469 | 5,117,285 | 2,114,765 |
| | 2002 | 17,086 | 16,875 | 62,607 | 60,660 | 153,596 | 149,491 | 400,245 | 379,358 | 4,274,464 | 1,412,645 |
| | 2003 | 25,841 | 25,213 | 94,731 | 91,931 | 227,433 | 224,623 | 549,970 | 518,465 | 4,829,008 | 1,863,581 |
| | 2004 | 28,628 | 27,830 | 99,706 | 97,226 | 230,843 | 223,186 | 572,769 | 548,654 | 4,994,529 | 1,963,088 |
| | 2005 | 26,072 | 26,110 | 91,802 | 89,460 | 218,183 | 216,843 | 536,657 | 498,068 | 4,930,282 | 1,879,171 |

Panel C: Firm Characteristics (n = 10,773)c

| Variable | Mean | Median | Std. Dev. |
|---|---|---|---|
| Leverage | 0.204 | 0.144 | 0.231 |
| MarketCap | 3,775 | 551 | 16,791 |
| Idiosyncrisk | 0.152 | 0.130 | 0.088 |
| MkttoBook | 2.865 | 2.120 | 2.853 |
| Tenure | 6.862 | 4.800 | 7.235 |
| OutsideChmn | 0.132 | 0.000 | 0.339 |
| OutsideLdDir | 0.194 | 0.000 | 0.395 |
| CEOApptdOutsDirs | 0.700 | 0.800 | 0.329 |
| StaggeredBd | 0.580 | 1.000 | 0.494 |
| PctOldOutsDirs | 0.143 | 0.125 | 0.142 |
| PctBusyOutsDirs | 0.250 | 0.222 | 0.209 |
| PctFoundingDirs | 0.034 | 0.000 | 0.075 |
| OutsideDirHolds | 0.013 | 0.003 | 0.033 |
| NumberDirs | 8.600 | 8.000 | 2.633 |
| PctFinExpsAud | 0.674 | 0.667 | 0.244 |
| DirCompMix | 0.495 | 0.531 | 0.319 |
| NumInstOwns | 133 | 89 | 151 |
| NumBlockhldrs | 1.886 | 2.000 | 1.530 |
| Activists | 0.013 | 0.000 | 0.115 |

a Restatement data are obtained from Glass-Lewis & Co., which comprehensively collects restatement information from SEC filings, press releases, and other public data. We identify accounting restatements between 2001 and 2005 that relate to perceived reporting manipulation classified as accounting fraud, an SEC investigation, a securities class action suit, improper reserve allowances, improper revenue recognition, or improper expense recognition. We code a restatement incident as the first fiscal year in which improper accounting occurred that subsequently required restatement.

Accounting lawsuits are obtained from a database provided by Woodruff-Sawyer and Co. that records class action lawsuit periods between 2001 and 2005. These lawsuits allege earnings estimate improprieties, financial misrepresentation, failure to adhere to GAAP, or restatement of earnings. We code a lawsuit incident as the first fiscal year in which the firm is named in a lawsuit damage period.

SEC Accounting and Auditing Enforcement Releases (AAERs) are identified from the comprehensive AAER listing provided on the SEC Web site for allegation periods between 2001 and 2005. These allegations cite earnings-estimate improprieties, financial misrepresentation, or failure to adhere to GAAP. We code an AAER incident as the first fiscal year in which the SEC alleges accounting manipulation occurred, as detailed in the Enforcement Release.

b CEO equity incentives are measured as the portfolio delta (PortDelta), which is the change in the risk-neutral dollar value of the CEO’s equity portfolio for a 1% change in the firm’s stock price (Core and Guay [1999]). To compute PortDelta, the value of stock and restricted stock is assumed to change dollar-for-dollar with changes in the price of the underlying stock. The value of stock options is assumed to change according to the option’s delta, which is the derivative of its Black–Scholes [1973] value with respect to the price of the underlying stock (see Core and Guay [2002]). Black–Scholes [1973] parameters are computed using methods similar to Core and Guay [2002]. Specifically, annualized volatility is calculated using continuously compounded monthly returns over the prior 36 months (with a minimum of 12 months of returns). The risk-free rate is calculated using an interpolated interest rate on a Treasury note with the same maturity (to the closest month) as the remaining life of the option multiplied by 0.7 to account for the prevalence of early exercise. Dividend Yield is calculated as the dividends paid over the past 12 months scaled by the stock price at the beginning of the month.

c Leverage is the ratio of total debt to market value of assets computed from Compustat as (data9 + data34)/((data199 ∗ data25) + data9). MarketCap is the market value of equity computed from Compustat as (data199 ∗ data25). Idiosyncrisk is the standard deviation of residuals from a firm-specific regression of monthly returns on the monthly return to the CRSP value-weighted portfolio index (Core and Guay [1999]). At least 12 and no more than 36 monthly return observations are required for estimation. MkttoBook is the market value of equity divided by the book value of equity computed from Compustat as ((data199 ∗ data25)/data216). Tenure is the CEO’s tenure with the firm in years, as provided by Equilar. OutsideChmn is a dichotomous variable that equals 1 if the board chairman is delineated as an outsider by Equilar and is 0 otherwise. OutsideLdDir is a dichotomous variable that equals 1 if the lead independent director is delineated as an outsider by Equilar and is 0 otherwise. CEOApptdOutsDirs is the number of outside directors whose tenure is less than the CEO’s tenure, scaled by the total number of directors. StaggeredBd is a dichotomous variable that equals 1 if Equilar delineates the board service terms as staggered and is 0 otherwise. PctOldOutsDirs is the ratio of outside directors who are at least 69 years old to total directors. PctBusyOutsDirs is the ratio of outside directors who serve simultaneously on at least two boards to total directors. PctFoundingDirs is the ratio of directors who are founding firm members to total directors. OutsideDirHolds is the ratio of the sum of shares held by outside directors to total shares outstanding. NumberDirs is the number of directors on the board. PctFinExpsAud is the ratio of directors with financial expertise who serve on the audit committee to total directors. Directors are classified as financial experts if they have experience as CEO, CFO, financial accountant, or auditor, or if they have been licensed as a Certified Public or Chartered Accountant. DirCompMix is the ratio of total dollar equity compensation to total equity plus cash compensation for nonexecutive directors. NumInstOwns is the number of institutional owners delineated in the CDA/Spectrum database. NumBlockhldrs is the number of institutional owners that own at least 5% of outstanding shares. Activists is the number of institutional owners denoted as activists by Cremers and Nair [2005] and Larcker, Richardson, and Tuna [2007].
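As an illustration of the variable definitions in note c, the following Python sketch computes Leverage, MarketCap, and MkttoBook from the Compustat items named there, and Idiosyncrisk from two return series. It is a sketch under assumptions: the function names and input values are hypothetical, the regression is a plain single-factor ordinary least squares fit, and the use of the sample standard deviation for the residuals is an assumption, since the note does not specify the divisor.

```python
from statistics import mean, stdev


def firm_characteristics(data9, data34, data199, data25, data216):
    """Compute three ratios from the Compustat items cited in note c."""
    market_cap = data199 * data25
    return {
        "MarketCap": market_cap,
        "Leverage": (data9 + data34) / (market_cap + data9),
        "MkttoBook": market_cap / data216,
    }


def idiosyncratic_risk(firm_returns, market_returns):
    """Std. dev. of residuals from regressing firm returns on market returns."""
    n = len(firm_returns)
    if n != len(market_returns) or not 12 <= n <= 36:
        raise ValueError("need 12 to 36 paired monthly returns")
    mx, my = mean(market_returns), mean(firm_returns)
    sxx = sum((x - mx) ** 2 for x in market_returns)
    sxy = sum((x - mx) * (y - my) for x, y in zip(market_returns, firm_returns))
    beta = sxy / sxx
    alpha = my - beta * mx
    residuals = [y - (alpha + beta * x) for x, y in zip(market_returns, firm_returns)]
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return stdev(residuals)


# Hypothetical inputs, for illustration only.
example = firm_characteristics(data9=100.0, data34=50.0, data199=20.0,
                               data25=10.0, data216=80.0)
print(example)
```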


disclosure or financial-statement earnings estimate improprieties, financial misrepresentation, failure to adhere to GAAP, or restatement of earnings.8 We code a lawsuit incident as the first fiscal year in which the firm is named in a lawsuit damage period. We identify 464 incidents of accounting-related lawsuit allegation periods (3.4% of the total sample) across the time period, with the most observations occurring during fiscal year 2001 (table 2, panel A).

The final accounting irregularity we consider is whether the firm wasaccused of accounting manipulation in an AAER from the SEC. We iden-tify these firms from the comprehensive AAER listing provided on the SECWeb site for allegation periods between 2001 and 2005 that allege earnings-estimate improprieties, financial misrepresentation, or failure to adhere toGAAP.9 We code an AAER incident as the first fiscal year in which the SECalleges that accounting manipulation occurred, as detailed in the Enforce-ment Release. Table 2 (panel A) shows that there were only 157 incidentsof accounting-related AAER allegation periods (1.2% of the total sample)across the time period, indicating that AAERs occur much less frequentlythan do both accounting restatements and accounting-related litigation.10

Consistent with prior literature (e.g., Core and Guay [1999], Erickson,Hanlon, and Maydew [2006], Burns and Kedia [2006]), we measure CEOequity incentives as the portfolio delta, defined as the (risk-neutral) dollarchange in the CEO’s equity portfolio value for a 1% change in the firm’sstock price. The value of stock and restricted stock is assumed to changedollar-for-dollar with changes in the price of the underlying stock. The valueof stock options is assumed to change according to the option’s delta, whichis the derivative of its Black–Scholes [1973] value with respect to the priceof the underlying stock (Core and Guay [2002]).11
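The portfolio delta described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the single risk-free rate and volatility shared by all grants, and every input value are hypothetical, and the 0.7 scaling of the remaining option life follows the parameter description in footnote 11.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_delta(price, strike, years, rate, vol, div_yield):
    """Dividend-adjusted Black-Scholes delta of a European call option."""
    d1 = (math.log(price / strike)
          + (rate - div_yield + 0.5 * vol ** 2) * years) / (vol * math.sqrt(years))
    return math.exp(-div_yield * years) * norm_cdf(d1)

def portfolio_delta(price, shares, option_grants, rate, vol, div_yield):
    """Dollar change in portfolio value for a 1% change in the stock price.

    option_grants is a list of (number_of_options, strike, remaining_years).
    Stock and restricted stock move dollar-for-dollar (delta of 1 per share).
    """
    share_equivalents = float(shares)
    for n_options, strike, remaining_years in option_grants:
        # Assumption from footnote 11: maturity is 0.7 x remaining life
        # to allow for early exercise.
        share_equivalents += n_options * bs_call_delta(
            price, strike, 0.7 * remaining_years, rate, vol, div_yield)
    return 0.01 * price * share_equivalents

# Hypothetical CEO holdings, for illustration only.
delta = portfolio_delta(price=50.0, shares=100_000,
                        option_grants=[(200_000, 45.0, 6.0)],
                        rate=0.04, vol=0.35, div_yield=0.01)
print(round(delta, 2))
```

Because each option delta lies between 0 and 1, options add less than one share-equivalent each, so the result exceeds the stock-only delta of 0.01 × 50 × 100,000 = 50,000 dollars by less than the option count would suggest.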

8 Woodruff-Sawyer and Co. collects comprehensive class action lawsuit data to help estimate premiums for brokering directors and officers’ liability insurance. A class action damage period is the period that precedes the lawsuit filing date during which the plaintiff alleges that damages (e.g., accounting manipulation) had occurred.

9 We define an enforcement action allegation period as the period that precedes the AAER filing date during which the SEC alleges that accounting manipulation had occurred. For most AAER filings, the allegation period involves several years that well precede the AAER filing date. It is common, for example, to observe 2007-year filings that refer back to allegation windows that occur between 2001 and 2005.

10 Untabulated results show that all three measures display a positive contemporaneous correlation. In particular, the Pearson correlation between (i) restatements and AAERs is 0.07, (ii) restatements and litigation is 0.10, and (iii) AAERs and litigation is 0.21. All three are highly statistically significant (p < 0.001 using a two-tailed test). There are 64 CEO-firm-year observations that experience both restatement and litigation events, 27 observations that experience both restatement and AAER events, 55 observations that experience both litigation and AAER events, and 13 observations that experience all three events contemporaneously.

11 The parameters of the Black–Scholes [1973] formula are calculated as follows. Annualized volatility is calculated using continuously compounded monthly returns over the prior 36 months (with a minimum of 12 months of returns). The risk-free rate is calculated using

Since we are concerned with economically substantive differences in the level of equity incentives among executives, we partition equity incentives into quintiles for our analyses. Using quintiles also allows us to relax the assumption that CEO equity incentives have a monotonic association with accounting irregularities.12 Quintile rankings also exhibit better measurement properties than continuous incentive measurements do, since the empirical distribution of CEO portfolio deltas is right-skewed (table 2, panel B).
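As a rough illustration of the quintile partition, the following Python sketch ranks a set of portfolio deltas into five equal-sized groups. The function name and the sample values are hypothetical, and breaking ties by original order is an arbitrary assumption rather than anything stated in the text.

```python
def quintile_ranks(values):
    """Assign each value a quintile rank from 1 (lowest) to 5 (highest).

    Ranks depend only on sorted position, so extreme values in a
    right-skewed distribution cannot dominate the measure.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for position, i in enumerate(order):
        ranks[i] = position * 5 // n + 1
    return ranks

# Hypothetical right-skewed portfolio deltas (dollars), for illustration only.
deltas = [1_200, 3_500, 8_000, 15_000, 22_000,
          40_000, 75_000, 150_000, 600_000, 4_000_000]
ranks = quintile_ranks(deltas)
print(ranks)  # [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
```

The largest value is more than six times the second largest, yet both fall in quintile 5, which is the sense in which ranks are less sensitive to skewness than the raw deltas.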

Figure 1 presents frequency histograms for both contemporaneous and one-year-ahead accounting irregularities partitioned by CEO equity-incentives quintile. Consistent with results from prior literature, figure 1 provides some evidence of a positive (univariate) relationship between CEO equity incentives and the incidence of accounting irregularities, with the strongest monotonic pattern appearing for AAER and lawsuit outcomes. Rank correlations (untabulated) confirm that AAERs (coefficient = 0.023, p-value = 0.0070) and lawsuits (coefficient = 0.052, p-value < 0.0001) have a statistically significant positive association with CEO equity incentives. However, equity incentives are correlated with many characteristics of executives’ contracting environments that could also produce univariate patterns similar to those in figure 1.

4. Research Method

Since a pure experiment with random assignment is typically infeasible, most empirical accounting studies are observational in nature. There is an extensive literature in econometrics and statistics that identifies conditions necessary to make causal statements in an observational study about the impact of the treatment variable (CEO equity incentives) on the outcome (accounting irregularities). We summarize the theoretical framework in appendix A.

Prior research typically selects a set of firms with an observed accounting irregularity and then obtains another firm without an irregularity that

an interpolated interest rate on a Treasury note with the same maturity (to the closest month) as the remaining life of the option multiplied by 0.7 to account for the prevalence of early exercise. Dividend yield is calculated as the dividends paid over the past 12 months scaled by the stock price at the beginning of the month. This is essentially the same method described by Core and Guay [2002].

12 Relaxing the monotonicity assumption also allows us to better isolate the location of any association between equity incentives and accounting irregularities on the support of the equity-incentives distribution. In the extreme case, there could be a positive association at one end (e.g., high incentives) and a negative association at the other end (e.g., low incentives), and these separate effects would be obscured in a model that imposes monotonicity in the relationship.

FIG. 1. Accounting irregularities: frequency distributions. Frequency of observed Accounting Irregularities for each equity-incentive quintile (Contemporaneous n = 13,706; One Year Ahead n = 10,773). Accounting Irregularity frequency is presented for each quintile of CEO equity incentives. CEO equity incentives are measured as the portfolio delta (PortDelta), which is the change in the risk-neutral dollar value of the CEO’s equity portfolio for a 1% change in the firm’s stock price (Core and Guay [1999]). To compute PortDelta, the value of stock and restricted stock is assumed to change dollar-for-dollar with changes in the price of the underlying stock. The value of stock options is assumed to change according to the option’s delta, which is the derivative of its Black–Scholes [1973] value with respect to the price of the underlying stock (see Core and Guay [2002]). Black–Scholes [1973] parameters are computed using methods similar to Core and Guay [2002]. Specifically, annualized volatility is calculated using continuously compounded monthly returns over the prior 36 months (with a minimum of 12 months of returns). The risk-free rate is calculated using an interpolated interest rate on a Treasury note with the same maturity (to the closest month) as the remaining life of the option multiplied by 0.7 to account for the prevalence of early exercise. Dividend Yield is calculated as the dividends paid over the past 12 months scaled by the stock price at the beginning of the month. Contemporaneous Irregularities are those that occur in the same fiscal year as CEO equity-incentives measurement. One Year Ahead Irregularities are those that occur in the fiscal year that follows CEO equity-incentive measurement.

is matched on year, industry, and size.13 The effect of incentives on the frequency of accounting irregularities is then inferred from the estimated coefficient on equity incentives. Other variables are “controlled” through their inclusion in the regression estimation.

The validity of this common research design relies on several critical assumptions. As discussed more fully in appendix B, the partial-matched econometric method produces unbiased parameter estimates only if there is an identical functional relationship between the control variables and the outcome variable for each level of treatment.14 If instead the true relationship between the controls and the outcome variable either differs across levels of treatment or is inconsistent with the functional form imposed by the research design, the partial-matched econometric method will produce biased parameter estimates. Further, this misspecification increases the likelihood of drawing an erroneous conclusion about the existence of a causal effect of the treatment.

We adopt an alternative approach that is more robust to misspecification of the functional form of the underlying relationship between equity incentives and accounting irregularities. Specifically, we use a matched-pair research design that matches a treatment firm with a control firm that is similar across all observable relevant variables. Our matching algorithm uses the common partial-match variables plus all other variables that would typically be included as control variables. Matching on these additional variables relaxes the assumption of a constant functional relationship with the outcome variable and therefore is robust to misspecification of the functional form (see appendix A).

4.1 IMPLEMENTATION OF THE PROPENSITY-SCORE MATCHED-PAIRS DESIGN

Our matched-pair research design consists of five steps. First, we estimate an ordered logistic propensity-score model, which is the probability that an executive will receive a certain level of equity incentives (i.e., the treatment) conditional on observable features of the contracting environment. Second, we form matched pairs by identifying the pairings that result in

13 Most prior studies match on differences in the outcome rather than on differences in the treatment. The distinction between the two alternative research designs has important inferential implications, since only the latter isolates the relationship of interest. Because matching on the outcome does not remove variation in control variables, the research design implicitly searches for any cause(s) of an effect. In contrast, to the extent it is possible to achieve covariate balance, matching on the treatment removes variation in other potentially confounding variables to isolate the effect of a treatment of interest. Further, matching on the outcome has two key limitations. First, inferences from this design rely heavily on the assumed functional form of the relationship (see appendix B). Second, this design may induce low power, since it does not ensure that variation remains in the treatment variable of interest (see section 6.3). In contrast, matching on the treatment is analogous to a randomized experiment in which the randomized treatment assignment deliberately induces variation in treatment.

14 Logistic regression, for example, assumes that a linear functional relationship exists between the log of the odds ratio and the observable predictor variables.

observations with the smallest propensity-score differences (i.e., the most similar observed contracting environments) but the greatest difference in actual CEO equity incentives (i.e., the most dissimilar contracts). Third, we examine the covariate balance between the treatment and control samples and (if necessary) remove the most dissimilar matched pairs to achieve better control for potentially confounding factors.15 Fourth, we examine the relationship between equity incentives and accounting irregularities by assessing whether the frequency of accounting irregularities is significantly different between the treatment and control groups. Fifth, we estimate the sensitivity of reported results to potential hidden bias by relaxing the assumption that matched observations have an equal probability of receiving a certain level of treatment conditional on the observable contracting environment (Rosenbaum [2002]). The final step explicitly acknowledges that unobservable contracting characteristics can affect each executive’s level of equity incentives (e.g., endogenous matching of executives and contracts on unobservable firm and CEO characteristics such as CEO risk aversion). This assessment quantifies the potential impact of this confounding effect on the observed statistical association between the treatment variable and the outcome.

4.1.1. Propensity-Score Model. One problem with implementing a matched-pair research design is the difficulty of obtaining proper matches when each observation is characterized by many relevant dimensions (or covariates). As the number of dimensions increases, it becomes increasingly difficult to find pairs of observations that are similar along all of these dimensions. Rosenbaum and Rubin [1983] develop the propensity score as a way to address this dimensionality problem. In particular, the propensity score is the conditional probability of receiving some level of treatment given the observable covariates.16

The treatment of interest in this study is the level of CEO equity incentives, so we require a propensity-score model of the conditional probability of receiving a certain level of equity incentives given observable features of a CEO’s contracting environment. Prior theoretical and empirical research suggests a number of economic and governance characteristics that are associated with the level of CEO equity incentives, and we draw on this literature to specify the propensity-score model. Demsetz and Lehn [1985],

15 It is important to note that this step is not ad hoc and does not induce estimation bias. This step simply identifies and then removes matched pairs for which the matching algorithm did not produce an effective covariate match (without using any information about the outcome variable). Removing these observations alleviates inference problems that are discussed in appendix B.

16 Rosenbaum and Rubin [1983] discuss the necessary conditions for matching on the propensity score (which is a scalar value) rather than matching on each of the individual covariates. One condition is that the potential outcomes are independent of treatment assignment given the observed covariates. A second condition is that the propensity score cannot perfectly classify observations into the treatment or control groups. This is necessary to ensure that for each observation, there is a potential match that has a similar probability of receiving the treatment.

for example, suggest that larger firms and firms with greater monitoring difficulties will provide greater CEO incentives. Dechow and Sloan [1991] suggest that firms with CEO horizon problems will provide greater CEO equity incentives. Finally, Core, Holthausen, and Larcker [1999] suggest that firm governance characteristics, in part, determine CEO equity incentives. Therefore, we include proxies for size (market capitalization), complexity (idiosyncratic risk), growth opportunities (market-to-book ratio), monitoring (leverage), CEO horizon problems (CEO tenure), and firm-governance characteristics (e.g., the number of directors, the number of activist shareholders) in the propensity-score estimation.

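We estimate the following ordered logistic propensity-score model, annually, for the CEOs in our sample:

$$
\begin{aligned}
\Pr(\mathit{EqIncQuint}) ={}& \alpha_k + \beta_1 \mathit{Leverage}_i + \beta_2 \mathrm{Log}(\mathit{MarketCap})_i \\
&+ \beta_3 \mathrm{Log}(\mathit{Idiosyncrisk})_i + \beta_4 \mathit{MkttoBook}_i \\
&+ \beta_5 \mathrm{Log}(1 + \mathit{Tenure}_i) + \beta_6 \mathit{OutsideChmn}_i \\
&+ \beta_7 \mathit{OutsideLdDir}_i + \beta_8 \mathit{CEOApptdOutsDirs}_i \\
&+ \beta_9 \mathit{StaggeredBd}_i + \beta_{10} \mathit{PctOldOutsDirs}_i \\
&+ \beta_{11} \mathit{PctBusyOutsDirs}_i + \beta_{12} \mathit{PctFoundingDirs}_i \\
&+ \beta_{13} \mathit{OutsideDirHolds}_i + \beta_{14} \mathrm{Log}(1 + \mathit{NumberDirs}_i) \\
&+ \beta_{15} \mathit{PctFinExpsAud}_i + \beta_{16} \mathit{DirCompMix}_i \\
&+ \beta_{17} \mathrm{Log}(1 + \mathit{NumInstOwns}_i) + \beta_{18} \mathrm{Log}(1 + \mathit{NumBlockhldrs}_i) \\
&+ \beta_{19} \mathrm{Log}(1 + \mathit{Activists}_i) + \varepsilon_i.
\end{aligned}
\tag{1}
$$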

Variables are defined in appendix C. The independent variables in equation (1) are measured in the year prior to equity-incentives measurement, and descriptive statistics for these variables are presented in table 2 (panel C).17
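To make the ordered logistic structure concrete, the sketch below converts a linear index (the sum of the coefficient-times-covariate terms) and four cutpoints into probabilities for the five quintiles. It assumes the common parameterization P(quintile ≤ k) = Λ(cutpoint_k − index), which the text does not spell out; the function names and the index value are hypothetical, and the cutpoints reuse the table 3 intercepts only as plausible magnitudes.

```python
import math

def logistic(z):
    """Logistic cumulative distribution function."""
    return 1.0 / (1.0 + math.exp(-z))

def ordered_logit_probs(index, cutpoints):
    """Return P(quintile = 1..5) for a linear index and four ordered cutpoints.

    Assumed parameterization: P(quintile <= k) = logistic(cutpoint_k - index).
    """
    cumulative = [logistic(c - index) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for value in cumulative:
        probs.append(value - previous)
        previous = value
    return probs

# Hypothetical index value; cutpoints are illustrative magnitudes only.
probs = ordered_logit_probs(7.0, [4.524, 6.334, 7.882, 9.672])
print([round(p, 3) for p in probs])
```

Under this convention a larger index shifts probability mass toward the higher quintiles, and the five probabilities always sum to one.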

Table 3 reports the aggregated estimates of the annual ordered logistic propensity-score regression of the level of equity incentives.18 The first column presents the average of the annual coefficient estimates, and the

17 Although we select the predictor variables in equation (1) based on prior research, we acknowledge that this choice process is somewhat arbitrary. An alternative research design would be to include only the traditional economic determinants of CEO incentives, as opposed to also including corporate governance variables. We include the governance variables because prior research shows that they are important determinants of the level of equity incentives (e.g., Core, Holthausen, and Larcker [1999]). In addition, if the propensity score only uses economic determinants, there is a high likelihood that the resulting matches will not be balanced with respect to the governance variables. This would result in an identification problem, which would make it difficult to determine whether the accounting irregularities are caused by differences in the level of equity incentives, corporate governance, or both.

18 In untabulated sensitivity analyses, we include two-digit SIC code as an additional propensity score estimation covariate. We also alter the algorithm to require matching from firms with the same two-digit SIC code. Both procedures produce fewer matches and modestly worse covariate balance across the treatment and control samples, but neither alters our primary inferences.

TABLE 3
Propensity-Score Estimation Using Ordered Logistic Regression

Dependent Variable =                     Avg.      Aggr.         Yrs. with     Yrs. with
EqIncQuint                     Pred.     Coeff.    z-Statistic   Pos. Coeff.   Neg. Coeff.
Leverage                       −         −0.398    −3.938        0             4
Log(MarketCap)                 +         1.524     41.320        4             0
Log(Idiosyncrisk)              +         0.111     2.438         4             0
MkttoBook                      +         0.033     4.083         4             0
Log(1 + Tenure)                +         0.734     32.190        4             0
OutsideChmn                    ?         −0.699    −11.565       0             4
OutsideLdDir                   ?         −0.302    −3.861        0             4
CEOApptdOutsDirs               ?         1.027     15.844        4             0
StaggeredBd                    ?         −0.031    −0.824        2             2
PctOldOutsDirs                 ?         0.085     0.674         2             2
PctBusyOutsDirs                ?         −0.002    0.162         2             2
PctFoundingDirs                ?         3.015     10.837        4             0
OutsideDirHolds                ?         3.912     6.384         4             0
Log(1 + NumberDirs)            ?         −1.272    −13.427       0             4
PctFinExpsAud                  ?         −0.006    −0.136        2             2
DirCompMix                     ?         0.298     4.522         4             0
Log(1 + NumInstOwns)           ?         −0.344    −6.737        0             4
Log(1 + NumBlockhldrs)         ?         0.195     5.432         4             0
Log(1 + Activists)             ?         −0.673    −2.903        0             4
Intercept EqIncQuint 1 → 2     +         4.524     21.288        4             0
Intercept EqIncQuint 2 → 3     +         6.334     29.194        4             0
Intercept EqIncQuint 3 → 4     +         7.882     35.516        4             0
Intercept EqIncQuint 4 → 5     +         9.672     42.373        4             0
CEO-firm-year obs.                       10,773
Adj. Pseudo-R2                           0.273

EqIncQuint is a dichotomous variable that equals 1 if the CEO’s portfolio delta falls within the kth quintile of the cross-sectional distribution of CEO deltas and equals 0 otherwise. The portfolio delta is the change in dollar value of the CEO’s equity portfolio for a 1% change in the firm’s underlying stock price. Leverage is the ratio of total debt to market value of assets computed from Compustat as (data9 + data34)/((data199 ∗ data25) + data9). MarketCap is the market value of equity computed from Compustat as (data199 ∗ data25). Idiosyncrisk is the standard deviation of residuals from a firm-specific regression of monthly returns on the monthly return to the CRSP value-weighted portfolio index (Core and Guay [1999]). At least 12 and no more than 36 monthly return observations are required for estimation. MkttoBook is the market value of equity divided by the book value of equity computed from Compustat as ((data199 ∗ data25)/data216). Tenure is the CEO’s tenure with the firm in years, as provided by Equilar. OutsideChmn is a dichotomous variable that equals 1 if the board chairman is delineated as an outsider by Equilar and is 0 otherwise. OutsideLdDir is a dichotomous variable that equals 1 if the lead independent director is delineated as an outsider by Equilar and is 0 otherwise. CEOApptdOutsDirs is the number of outside directors whose tenure is less than the CEO’s tenure, scaled by the total number of directors. StaggeredBd is a dichotomous variable that equals 1 if Equilar delineates the board service terms as staggered and is 0 otherwise. PctOldOutsDirs is the ratio of outside directors who are at least 69 years old to total directors. PctBusyOutsDirs is the ratio of outside directors who serve simultaneously on at least two boards to total directors. PctFoundingDirs is the ratio of directors who are founding firm members to total directors. OutsideDirHolds is the sum of shares held by outside directors to total shares outstanding. NumberDirs is the number of directors on the board. PctFinExpsAud is the ratio of directors with financial expertise who serve on the audit committee to total directors. Directors are classified as financial experts if they have experience as CEO, CFO, financial accountant, or auditor, or if they have been licensed as a Certified Public or Chartered Accountant. DirCompMix is the ratio of total dollar equity compensation to total equity plus cash compensation for nonexecutive directors. NumberInstOwns is the number of institutional owners delineated in the CDA/Spectrum database. NumBlockhldrs is the number of institutional owners that own at least 5% of outstanding shares. Activists is the number of institutional owners denoted as activists by Cremers and Nair [2005] and Larcker, Richardson, and Tuna [2007].

The first column reports the average coefficient estimate across year-specific estimation from 2001 through 2005. The second column reports an aggregate z-statistic, which is calculated as the sum of the individual annual z-statistics divided by the square root of the number of years over which equation (1) is estimated. This aggregated z-statistic assumes that each annual estimation is independent of the other estimations. The final two columns report the number of years for which the year-specific coefficient is positive and negative, respectively. Adj. Pseudo R2 is the average McFadden’s [2000] adjusted pseudo R2.
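The aggregation rule in the table note is simple enough to state as a short Python sketch. The function name and the four annual z-statistics are hypothetical; only the formula (sum of annual z-statistics divided by the square root of the number of years) comes from the note.

```python
import math

def aggregate_z(annual_z):
    """Sum of annual z-statistics divided by the square root of the number of years.

    Treats the annual estimates as independent, as the table note assumes.
    """
    return sum(annual_z) / math.sqrt(len(annual_z))

# Hypothetical annual z-statistics for one coefficient, for illustration only.
z_agg = aggregate_z([2.0, 1.5, 2.5, 2.0])
print(z_agg)  # 4.0
```

With four years, coefficients that are individually only moderately significant can yield a large aggregated statistic, which is why the independence assumption matters for interpreting the second column.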

second column reports an aggregated z-statistic.19 The final two columns report the number of years in which the sign of each annual coefficient estimate is positive and negative, respectively.

Our results are generally consistent with prior research, in that we find that CEO equity incentives are greater at larger firms, firms with growth opportunities, and firms with longer-tenured CEOs. In addition, we find that equity incentives are lower at firms with stronger monitoring (e.g., outside chairman, lead director, number of institutional shareholders, and number of activist shareholders). We also observe that equity incentives exhibit a positive association with the percentage of the outside directors appointed by the CEO, the percentage of founders on the board, the percentage of shares held by the outside directors, and the degree to which equity incentives are used to compensate outside directors. Finally, table 3 indicates that the propensity-score model has reasonable explanatory power (Adj. Pseudo-R2 = 27.3%). This is important, since a propensity score with very low explanatory power effectively induces random matching, which increases the likelihood that inferences will be confounded by correlated omitted variables.

4.1.2. Matching Algorithm. In the case where a binary treatment is present (i.e., treatment or no treatment), matched pairs are formed by selecting an observation that received the treatment and selecting another observation with the closest propensity score that did not receive the treatment. Since we use CEO equity-incentive quintiles as our treatment, matching becomes an optimization problem of minimizing a function of the aggregate distances between the propensity scores of the matched pairs. We follow the approach outlined in Lu et al. [2001] and simultaneously minimize the difference between propensity scores and maximize the difference between equity-incentive levels with the following distance metric:

$$
\Delta_{i,j} =
\begin{cases}
\dfrac{(\mathit{PScore}_i - \mathit{PScore}_j)^2}{(\delta_i - \delta_j)^2} & \text{if } \delta_i \neq \delta_j, \\[1ex]
\infty & \text{if } \delta_i = \delta_j.
\end{cases}
\tag{2}
$$

PScore is the propensity score computed from equation (1), δ is each observation’s equity-incentive quintile, and i, j index the individual observations.20
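A direct transcription of the distance metric in equation (2) may help fix ideas. The function name and the example scores are hypothetical; the formula itself is taken from the equation.

```python
import math

def pair_distance(pscore_i, quintile_i, pscore_j, quintile_j):
    """Equation (2): squared propensity-score gap scaled by squared quintile gap.

    Pairs in the same quintile get infinite distance, so they are never matched.
    """
    if quintile_i == quintile_j:
        return math.inf
    return (pscore_i - pscore_j) ** 2 / (quintile_i - quintile_j) ** 2

# Hypothetical propensity scores, for illustration only.
near = pair_distance(0.40, 2, 0.46, 3)   # one quintile apart
far = pair_distance(0.40, 2, 0.46, 5)    # three quintiles apart
same = pair_distance(0.40, 2, 0.46, 2)   # same quintile
print(near, far, same)
```

For the same propensity-score gap, the pair that is three quintiles apart has one ninth of the distance of the pair that is one quintile apart, which is how the metric rewards larger differences in treatment.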

19 The aggregated z-statistic is calculated as the sum of the individual annual z-statistics divided by the square root of the number of years for which the propensity score model is estimated. The construction of this aggregate z-statistic assumes that each of the annual estimates is independent. However, the significance of either the individual or aggregated results presented in table 3 does not affect our primary analysis of the relationship between equity incentives and accounting irregularities since matched pairs are formed annually based on the respective propensity score model.

20 The distance metric can be generalized to the case where the treatment variable (i.e., denominator) is continuous. See Hirano and Imbens [2004] for a theoretical discussion and Armstrong, Ittner, and Larcker [2009] and Armstrong, Blouin, and Larcker [2009] for examples of implementing this approach.

TABLE 4
Matched-Pair Frequencies for Equity-Incentive Quintiles: Frequencies of the Dosage Differences between the Matched Pairs

Control Equity          Treatment Equity Incentive Quintile
Incentive Quintile      1       2       3       4       5       Total
1                       0       670     188     25      71      954
2                               0       821     442     130     1,393
3                                       0       786     404     1,190
4                                               0       1,022   1,022
5                                                       0       0
Total                   0       670     1,009   1,253   1,627   4,559

Matched pairs are formed using the following distance metric:

$$
\Delta_{i,j} =
\begin{cases}
\dfrac{(\mathit{PScore}_i - \mathit{PScore}_j)^2}{(\delta_i - \delta_j)^2} & \text{if } \delta_i \neq \delta_j, \\[1ex]
\infty & \text{if } \delta_i = \delta_j.
\end{cases}
$$

PScore is the propensity score computed from equation (1), δ is each observation’s equity incentive treatment quintile, and i, j denote individual observations.

Matched pairs are identified through a nonbipartite algorithm to identify, across all possible permutations, the minimum sum of pairwise distances, $\sum \Delta_{i,j}$ for $i \neq j$, where each observation is paired with another and observations can be used only once for matching.

Higher equity-incentive observations are labeled as treatment, and lower equity-incentive observations are labeled as control.

We then use a nonbipartite algorithm to identify, across all possible permutations, the minimum sum of pairwise distances, $\sum \Delta_{i,j}$ for $i \neq j$, where each observation is paired with another and observations can be used only once for matching (i.e., matching without replacement).21 In particular, we employ the nonbipartite matching algorithm suggested by Derigs [1988], which is an “optimal” algorithm in the sense that it considers the potential distances between other matched pairs when forming a particular matched pair (Lu et al. [2001]).
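The idea of minimizing the total distance over all possible pairings can be illustrated with an exhaustive search on a toy sample. This is explicitly not the Derigs [1988] algorithm: it is a brute-force sketch that runs in exponential time and is usable only for a handful of observations. All names and data are hypothetical.

```python
import math

def pair_distance(a, b):
    """Equation (2) distance for observations given as (pscore, quintile)."""
    if a[1] == b[1]:
        return math.inf
    return (a[0] - b[0]) ** 2 / (a[1] - b[1]) ** 2

def optimal_pairs(obs):
    """Exhaustively search all pairings for the minimum total distance.

    Returns (total_distance, pairs); pairs is None if no valid pairing exists.
    """
    def solve(remaining):
        if not remaining:
            return 0.0, []
        first, rest = remaining[0], remaining[1:]
        best_cost, best_pairs = math.inf, None
        for k, partner in enumerate(rest):
            d = pair_distance(obs[first], obs[partner])
            if math.isinf(d):
                continue  # same quintile: not an admissible match
            cost, pairs = solve(rest[:k] + rest[k + 1:])
            if pairs is not None and d + cost < best_cost:
                best_cost, best_pairs = d + cost, [(first, partner)] + pairs
        return best_cost, best_pairs
    return solve(tuple(range(len(obs))))

# Hypothetical (propensity score, quintile) observations, for illustration only.
obs = [(0.30, 1), (0.31, 3), (0.70, 4), (0.72, 5)]
cost, pairs = optimal_pairs(obs)
# Label the higher-quintile member of each pair as treatment: (treatment, control).
labeled = [(i, j) if obs[i][1] > obs[j][1] else (j, i) for i, j in pairs]
print(pairs, labeled)
```

Each observation appears in exactly one pair, which corresponds to matching without replacement. A production implementation would rely on a polynomial-time minimum-weight matching routine rather than this enumeration.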

The distribution of matched pairs according to their pairwise equity-incentive quintiles is presented in table 4. The columns indicate the quintile of the treatment observation in each matched pair, while the rows indicate the quintile of its control counterpart. For example, the (3,5) element of the matrix is 404, which indicates that there are 404 matched pairs for which the treatment is in the highest quintile of equity incentives (i.e., 5) and the control is in the middle quintile of equity incentives (i.e., 3). The diagonal elements are all zero, since we preclude matches with identical equity-incentive levels. Not surprisingly, most matched pairs (72.36%) lie immediately off the diagonal, where the difference in the quintile rank of incentives between the treatment and control is one. Only 4.96% of the

21 It is not clear whether prior studies match with or without replacement. If matching is done with replacement and the same firm is included in multiple matches, it is necessary to adjust (increase) the standard error used for statistical tests. Depending on the correlation across matches, this adjustment can be quite large. In general, the distinction between matching with and without replacement represents a tradeoff of efficiency versus bias.

paired observations have a difference of at least three quintiles. This result indicates that CEOs with similar contracting environments tend to have similar levels of equity incentives and that the propensity-score estimation method reasonably predicts CEO equity-incentive levels.
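The two percentages quoted for table 4 can be reproduced from the cell counts. The short Python check below encodes the table with blank cells entered as zeros; the variable names are illustrative.

```python
# Rows: control quintile 1-5; columns: treatment quintile 1-5 (table 4).
# Blank cells in the table are entered as 0.
table4 = [
    [0, 670, 188, 25, 71],
    [0, 0, 821, 442, 130],
    [0, 0, 0, 786, 404],
    [0, 0, 0, 0, 1022],
    [0, 0, 0, 0, 0],
]

total = sum(sum(row) for row in table4)

# Count matched pairs by the quintile gap (treatment minus control).
pairs_by_gap = {}
for control, row in enumerate(table4):
    for treatment, count in enumerate(row):
        if count:
            gap = treatment - control
            pairs_by_gap[gap] = pairs_by_gap.get(gap, 0) + count

share_gap_one = 100 * pairs_by_gap[1] / total
share_gap_three_plus = 100 * sum(
    n for gap, n in pairs_by_gap.items() if gap >= 3) / total
print(total, f"{share_gap_one:.2f}%", f"{share_gap_three_plus:.2f}%")
# 4559 72.36% 4.96%
```

The remaining 1,034 pairs (188 + 442 + 404) differ by exactly two quintiles.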

4.1.3. Covariate Balance between Treatment and Control Samples. Covariate balance is achieved if both the treatment and control groups appear similar along their observable dimensions except for their level of equity incentives. An adequate degree of covariate balance is necessary to properly account for the confounding effects of the observed control variables used to match the observations. If balance is not achieved, it may be necessary to remove the matched pairs that contributed to the imbalance.22 Examining covariate balance is important also because it can highlight potential identification problems. If there is a variable for which it is not possible to achieve adequate balance across the treatment and control groups, the treatment effect cannot be identified by the research design. For example, assume that the treatment group (CEOs with high equity incentives) always consists of larger firms than the control group (CEOs with low equity incentives). This setting will produce an identification problem, because any observed difference in outcome between the treatment and control groups cannot be uniquely attributed to either the treatment or to firm size.

To assess covariate balance between the treatment and control groups, we report both a parametric t-test of the difference in means and a nonparametric Kolmogorov–Smirnov (KS) test of the difference between two distributions.23 Table 5 presents the means and medians of the treatment and control groups along with the p-values (two-tailed) for both the t-test and the KS-test. The p-values for the t-test and KS-test indicate that the matching algorithm was successful in achieving balance for most covariates. In particular, 19 of the 20 t-tests and 13 of the 20 KS-tests are not statistically significant (p > 0.05, two-tailed). Moreover, even in the cases in which the means and medians are statistically different, the economic differences between the treatment and control sample are very small. Statistical significance appears to occur because we have a relatively large sample size for

22 Although removing observations can improve covariate balance, it may also restrict the range over which the researcher can make statements about the relationship between the treatment and the outcome of interest. It is only appropriate to draw inferences from within the overlapping support of the distributions. Inferences from outside this range are based on extrapolation and rely on an assumption about the functional form of the relationship outside this range (e.g., linearity).

23 The t-test assumes normality of the data, while the two-sample Kolmogorov–Smirnov test is a nonparametric test and is sensitive to differences in both the location and shape of the empirical distributions of the samples. Following Sekhon [2009, p. 10], we bootstrap the KS-test statistic with 2,000 bootstrap samples because “the bootstrapped Kolmogorov–Smirnov test, unlike the standard test, provides correct coverage even when there are point masses in the distributions being compared.” This is important in our case, since we include a number of dichotomous variables in our specification.



TABLE 5
Covariate Balance between the Matched Pairs: Test Statistics of Covariate Distributions for the Treatment (High CEO Equity Incentives) and the Control (Low CEO Equity Incentives) Groups (n = 4,559 Matched Pairs).

                           Mean       Mean     Median     Median      t-Test  KS Bootstrap
Covariate             Treatment    Control  Treatment    Control     p-Value       p-Value

Leverage                  0.205      0.200      0.149      0.152       0.281         0.407
Log(MarketCap)            6.322      6.297      6.234      6.210       0.417         0.179
Log(Idiosyncrisk)        −2.045     −2.018     −2.059     −2.024       0.017         0.000
MkttoBook                 2.785      2.770      2.113      2.028       0.794         0.002
Log(1 + Tenure)           1.640      1.625      1.740      1.705       0.445         0.000
OutsideChmn               0.122      0.121      0.000      0.000       0.898         0.910
OutsideLdDir              0.157      0.160      0.000      0.000       0.667         0.687
CEOApptdOutsDirs          0.703      0.707      0.800      0.800       0.556         0.313
StaggeredBd               0.597      0.594      1.000      1.000       0.749         0.747
PctOldOutsDirs            0.152      0.152      0.125      0.133       0.945         0.252
PctBusyOutsDirs           0.239      0.242      0.200      0.200       0.450         0.153
PctFoundingDirs           0.036      0.035      0.000      0.000       0.436         0.000
OutsideDirHolds           0.014      0.014      0.003      0.003       0.854         0.001
Log(NumberDirs)           2.222      2.222      2.197      2.197       0.927         0.758
PctFinExpsAud             0.745      0.750      1.000      1.000       0.531         0.540
DirCompMix                0.495      0.507      0.549      0.537       0.075         0.000
Log(NumInstOwns)          4.359      4.368      4.466      4.466       0.640         0.822
Log(NumBlockhldrs)        0.895      0.917      1.099      1.099       0.067         0.009
Log(Activists)            0.009      0.010      0.000      0.000       0.647         0.697

Leverage is the ratio of total debt to market value of assets computed from Compustat as (data9 + data34)/((data199 * data25) + data9). MarketCap is the market value of equity computed from Compustat as (data199 * data25). Idiosyncrisk is the standard deviation of residuals from a firm-specific regression of monthly returns on the monthly return to the CRSP value-weighted portfolio index (Core and Guay [1999]). At least 12 and no more than 36 monthly return observations are required for estimation. MkttoBook is the market value of equity divided by the book value of equity computed from Compustat as ((data199 * data25)/data216). Tenure is the CEO’s tenure with the firm in years, as provided by Equilar. OutsideChmn is a dichotomous variable that equals 1 if the board chairman is delineated as an outsider by Equilar and is 0 otherwise. OutsideLdDir is a dichotomous variable that equals 1 if the lead independent director is delineated as an outsider by Equilar and is 0 otherwise. CEOApptdOutsDirs is the number of outside directors whose tenure is less than the CEO’s tenure, scaled by the total number of directors. StaggeredBd is a dichotomous variable that equals 1 if Equilar delineates the board service terms as staggered and is 0 otherwise. PctOldOutsDirs is the ratio of outside directors who are at least 69 years old to total directors. PctBusyOutsDirs is the ratio of outside directors who serve simultaneously on at least two boards to total directors. PctFoundingDirs is the ratio of directors who are founding firm members to total directors. OutsideDirHolds is the ratio of the sum of shares held by outside directors to total shares outstanding. NumberDirs is the number of directors on the board. PctFinExpsAud is the ratio of directors with financial expertise who serve on the audit committee to total directors. Directors are classified as financial experts if they have experience as CEO, CFO, financial accountant, or auditor, or if they have been licensed as a Certified Public or Chartered Accountant. DirCompMix is the ratio of total dollar equity compensation to total equity plus cash compensation for nonexecutive directors. NumInstOwns is the number of institutional owners delineated in the CDA/Spectrum database. NumBlockhldrs is the number of institutional owners that own at least 5% of outstanding shares. Activists is the number of institutional owners denoted as activists by Cremers and Nair [2005] and Larcker, Richardson, and Tuna [2007].

these tests. These results suggest that the covariates are generally balanced across the treatment and control samples and that differences in these observed variables across the treatment and control groups are not likely to confound our estimates of the average treatment effect.
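The balance diagnostics described in this section can be illustrated with a short Python sketch. It computes a two-sample KS statistic and a bootstrapped p-value obtained by resampling from the pooled sample, in the spirit of the bootstrapped KS test cited in footnote 23. This is a minimal sketch and not the authors' implementation: the function names, the pooled-resampling scheme, and the large-sample normal approximation used for the difference in means are assumptions made for illustration.

```python
import bisect
import random
from math import sqrt
from statistics import NormalDist, mean, variance


def ks_statistic(x, y):
    """Largest absolute gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    gap = 0.0
    for v in xs + ys:  # the supremum is attained at an observed value
        fx = bisect.bisect_right(xs, v) / len(xs)
        fy = bisect.bisect_right(ys, v) / len(ys)
        gap = max(gap, abs(fx - fy))
    return gap


def ks_bootstrap_pvalue(x, y, n_boot=2000, seed=0):
    """Share of pooled resamples whose KS statistic is at least the observed one.

    Assumption: resampling with replacement from the pooled sample imposes the
    null of a common distribution and tolerates ties (point masses).
    """
    rng = random.Random(seed)
    observed = ks_statistic(x, y)
    pooled = list(x) + list(y)
    nx = len(x)
    hits = 0
    for _ in range(n_boot):
        draw = rng.choices(pooled, k=len(pooled))
        if ks_statistic(draw[:nx], draw[nx:]) >= observed - 1e-12:
            hits += 1
    return hits / n_boot


def mean_difference_pvalue(x, y):
    """Two-tailed p-value for a difference in means (normal approximation)."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    if se == 0.0:
        return 1.0 if mean(x) == mean(y) else 0.0
    z = (mean(x) - mean(y)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Hypothetical covariate values for a handful of matched observations.
    treated = [0.21, 0.15, 0.30, 0.18, 0.25, 0.12]
    control = [0.20, 0.16, 0.28, 0.19, 0.22, 0.14]
    print(mean_difference_pvalue(treated, control))
    print(ks_bootstrap_pvalue(treated, control, n_boot=500))
```

With a sample as large as the 4,559 matched pairs, the normal approximation and a t-distribution would give nearly identical p-values; for small samples a proper t-test would be preferable.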


5. Results

5.1 PRIMARY RESULTS

Table 6 presents our primary results regarding the relationship between equity incentives and accounting irregularities. The formal statistical test of this relationship entails examining the discordant frequency of accounting irregularities that are associated with a particular treatment level.24,25 Accounting irregularities are counted for the first year in which an accounting-manipulation-related restatement is observed (panel A), in which the firm is involved in a class action damage period (panel B), or in which the firm is involved in an AAER damage period (panel C).26

For each accounting irregularity, we present the results for both the contemporaneous and one-year-ahead relationship in three ways that take advantage of different amounts of information about the equity-incentives quintile of the treatment and control observation. First, we present results according to each possible pairing of equity-incentives quintile. Since there are five levels of equity incentives and we preclude a matched pair from having an identical level of equity incentives, there are 10 possible combinations for each pair. This is the finest level of aggregation and preserves information about both the magnitude of the difference in the level of equity incentives and the location on the support of the equity-incentive distribution. Second, we group matched pairs according to the difference in equity-incentive quintiles between the treatment and control observations. This is a coarser level of aggregation that preserves information about the difference in the level of equity incentives between the treatment and control observations but ignores information about their location on the support of the equity-incentive distribution (e.g., a 5–3 quintile pair is treated the same as a 3–1 quintile pair because they both represent a difference of two quintiles between the treatment and control observations).

24 A pair of observations is concordant if each observation experiences the same outcome and discordant if each has a different outcome. We assess the significance between the number of concordant and discordant pairs using McNemar’s [1947] χ² statistic. With small samples, McNemar’s χ² may be misleading and an exact cumulative binomial test should be used (Liddell [1983]). None of our inferences change when this exact test is used for evaluating the results in table 6.

25 When it is not possible to achieve adequate covariate balance, an alternative approach is to form matched pairs with the propensity scores and then estimate a (logistic) regression of the outcome as a function of treatment and the vector of control variables used in the propensity-score model (Ho et al. [2007]). For sensitivity, we estimate logistic regressions of accounting irregularities on the level of equity incentives and the controls that were used in the propensity-score estimation regression. Results (untabulated) are similar to those reported in table 6, an outcome that is not surprising given the high degree of covariate balance achieved through first-stage matching.

26 It is extremely rare for the same firm to appear in multiple discordant pairs. Therefore, correlation across observations from the same firm is unlikely to induce inference problems. Cross-sectional correlation is also not likely to induce inference problems, since treatment and control firms are matched in the same year.


TABLE 6
Accounting Irregularities: Frequency of Observed Accounting Irregularities for the Treatment (High CEO Equity Incentives) and the Control (Low CEO Equity Incentives) Groups

Panel A: Accounting manipulation-related restatements

EqIncQuint_t    Restatement Frequency_t                        Restatement Frequency_t+1
T       C          T    C      p     Γ  $Incent T  $Incent C      T    C      p     Γ  $Incent T  $Incent C

5       4         38   40  0.910        1,582,081    501,198     31   55  0.013  1.37  1,529,602    528,784
5       3          8   11  0.646                                 15   16  1.000
5       2          4    4  0.724                                  2    4  0.683
5       1          0    3  0.248                                  2    1  1.000
4       3         29   38  0.328          472,528    222,198     29   33  0.703          450,777    223,529
4       2         16   17  1.000                                 11   11  0.831
4       1          0    0  1.000                                  1    0  1.000
3       2         36   32  0.716          204,615     96,106     35   44  0.368          185,660     91,726
3       1          7    8  1.000                                  9    4  0.267
2       1         23   21  0.880           83,670     29,833     24   21  0.766           78,978     25,552
Pooled  Pooled   161  174  0.512          429,801    147,548    159  189  0.120          713,271    178,435

DiffEqIncQuint_t
T − C              T    C      p     Γ  $Incent T  $Incent C      T    C      p     Γ  $Incent T  $Incent C

4                  0    3  0.248                                  2    1  1.000
3                  4    4  0.724                                  3    4  1.000
2                 31   36  0.625                                 35   31  0.712
1                126  131  0.803          368,860    167,047    119  153  0.045  1.04    371,114    162,026

(Continued)


TABLE 6 (Continued)

Panel B: Accounting-related shareholder lawsuits

EqIncQuint_t    Lawsuit Frequency_t                            Lawsuit Frequency_t+1
T       C          T    C      p     Γ  $Incent T  $Incent C      T    C      p     Γ  $Incent T  $Incent C

5       4         42   65  0.033  1.13  1,596,888    527,222     31   41  0.289        1,474,446    524,969
5       3          6   16  0.055  1.49                            7   22  0.009  3.15
5       2          4    8  0.386                                  5    5  0.752
5       1          4    5  1.000                                  4    1  0.371
4       3         31   36  0.625          451,960    226,679     22   17  0.522          401,866    183,203
4       2         17   17  0.864                                 11   13  0.838
4       1          1    0  1.000                                  0    0  1.000
3       2         31   24  0.418          184,689     74,294     12   26  0.035  1.41    170,823     75,432
3       1          0    2  0.480                                  1    1  0.480
2       1          7   14  0.190           77,684     24,576      5    7  0.773           65,287     24,233
Pooled  Pooled   143  187  0.018  1.06    630,373    189,108     98  133  0.025  1.07    713,271    178,435

DiffEqIncQuint_t
T − C              T    C      p     Γ  $Incent T  $Incent C      T    C      p     Γ  $Incent T  $Incent C

4                  4    5  1.000                                  4    1  0.371
3                  5    8  0.579                                  5    5  0.752
2                 23   35  0.149                                 19   36  0.031  1.30
1                111  139  0.088  1.01    568,488    276,337     70   91  0.115          581,103    265,093

(Continued)


TABLE 6 (Continued)

Panel C: Accounting-related SEC enforcement actions (AAERs)

EqIncQuint_t    AAER Frequency_t                               AAER Frequency_t+1
T       C          T    C      p     Γ  $Incent T  $Incent C      T    C      p     Γ  $Incent T  $Incent C

5       4          9   29  0.002  6.26  1,636,782    477,647      7   19  0.031  1.80  1,523,063    514,917
5       3          2    5  0.450                                  3    0  0.248
5       2          0    2  0.480                                  0    1  1.000
5       1          2    1  1.000                                  2    0  0.480
4       3         10   10  0.823          439,622    243,417      9    5  0.423          381,774    238,192
4       2          4   11  0.121                                  3    6  0.505
4       1          0    0  0.000                                  1    0  1.000
3       2         10   11  1.000          183,834     81,296      6   12  0.239          177,221     72,115
3       1          1    1  0.480                                  1    0  1.000
2       1          6    3  0.505           72,381     23,258      4    3  1.000           60,539     19,342
Pooled  Pooled    44   73  0.010  1.32    557,843    153,427     36   46  0.320          406,358    120,498

DiffEqIncQuint_t
T − C              T    C      p     Γ  $Incent T  $Incent C      T    C      p     Γ  $Incent T  $Incent C

4                  2    1  1.000                                  2    0  0.480
3                  0    2  0.480                                  1    1  0.480
2                  7   17  0.066  1.29                            7    6  1.000
1                 35   53  0.070  1.05    504,338    249,946     26   39  0.137          372,203    212,802

EqIncQuint is a dichotomous variable that equals 1 if the CEO’s portfolio delta falls within the kth quintile of the cross-sectional distribution of CEO deltas and equals 0 otherwise. The portfolio delta is the change in dollar value of the CEO’s equity portfolio for a 1% change in the firm’s underlying stock price. DiffEqIncQuint is the difference between EqIncQuint for the Treatment and Control groups. Each cell contains the discordant pair frequency of observing an accounting irregularity. In other words, the frequency count in the Treatment category denotes the number of observations for which there is an observed accounting irregularity in the treatment group but no observed accounting irregularity in the control group. p-values are computed using McNemar’s nonparametric test for differences in frequency across distributions. Γ values quantify the amount of hidden bias necessary to alter the statistical significance (p = 0.10) that results from the assumption that two observations with identical propensity scores have an equal probability of receiving treatment. $Incent is the median portfolio delta computed for the Treatment and Control matched observations reported in the frequency cells. $Incent is reported only when DiffEqIncQuint = 1 (and for pooled data), since this reflects the minimum equity-incentive distance and there is sufficient sample size for tests of median differences. All $Incent T − $Incent C differences are statistically significant at the 1% level (two-tailed) using a Kruskal–Wallis test.


Third, we pool all of the treatment and control observations and look for differences in the incidence of accounting irregularities between these two groups. This is the coarsest level of aggregation and ignores information about both the magnitude and location of the equity incentives. It considers only whether each observation in a matched pair has a higher or lower level of equity incentives. It does, however, have the benefit of maximizing the sample size, which increases the power of the test. To help assess the economic magnitude of incentive-level differences between the treatment and control groups, table 6 also reports $Incent, which is the median portfolio delta for matched observations reported in the frequency cells.27
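The three levels of aggregation can be made concrete with a short Python sketch that tallies discordant pairs by quintile pairing, by quintile difference, and in the pooled sample. The tuple layout and the function name are hypothetical and chosen only for illustration; the input pairs in the usage example are invented and are not taken from table 6.

```python
from collections import Counter


def discordant_counts(pairs):
    """Tally discordant matched pairs at three levels of aggregation.

    Each pair is (q_treat, q_ctrl, irregularity_treat, irregularity_ctrl),
    where the quintiles run from 1 to 5 and the outcomes are 0 or 1.
    """
    by_pairing, by_difference, pooled = Counter(), Counter(), Counter()
    for q_t, q_c, y_t, y_c in pairs:
        if y_t == y_c:
            continue  # concordant pairs do not enter the discordant counts
        side = "T" if y_t else "C"  # which member had the irregularity
        by_pairing[(q_t, q_c, side)] += 1      # finest: e.g., the 5-4 cell
        by_difference[(q_t - q_c, side)] += 1  # coarser: difference in quintiles
        pooled[side] += 1                      # coarsest: all pairs together
    return by_pairing, by_difference, pooled


if __name__ == "__main__":
    example_pairs = [
        (5, 4, 1, 0),  # irregularity only at the treatment firm
        (5, 3, 0, 1),  # irregularity only at the control firm
        (3, 1, 0, 1),
        (2, 1, 1, 1),  # concordant, ignored
    ]
    by_pairing, by_difference, pooled = discordant_counts(example_pairs)
    print(dict(by_difference))  # 5-3 and 3-1 pairs share the difference of 2
    print(dict(pooled))
```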

The results presented in table 6 (panel A) do not support the notion that higher equity-incentive levels are associated with a greater incidence of accounting-related restatements. There are no instances of statistically larger restatement frequencies for treatment observations relative to control observations for any comparison. In contrast, we find some modest evidence consistent with the alternative explanation that equity incentives align managers’ interests with those of shareholders. When there is a difference of one between the level of equity incentives in the treatment and control observations (i.e., DiffEqIncQuint = 1), there are 34 (= 153 − 119) more restatement incidents observed in the subsequent year (p-value = 0.045) for the firms with lower incentives (control firms) relative to the firms with higher incentives (treatment firms).
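The test applied to the discordant counts in this comparison can be sketched in Python as follows. The sketch implements McNemar's χ² statistic with a continuity correction and the exact binomial alternative mentioned in footnote 24. Whether the authors applied the continuity correction is an assumption; it is adopted here because, for the counts 119 and 153, the corrected statistic gives a p-value that rounds to the reported 0.045. The function names are illustrative.

```python
from math import comb, erfc, sqrt


def mcnemar_chi2(b, c, continuity=True):
    """McNemar's chi-square test on the two discordant counts b and c.

    Returns (statistic, two-tailed p-value) with one degree of freedom.
    """
    n = b + c
    if n == 0:
        return 0.0, 1.0
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)
    stat = diff ** 2 / n
    # Survival function of a chi-square with 1 df: erfc(sqrt(stat / 2)).
    return stat, erfc(sqrt(stat / 2.0))


def mcnemar_exact(b, c):
    """Exact two-tailed binomial test: discordant pairs split 50/50 under the null."""
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, j) for j in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    # Panel A, DiffEqIncQuint = 1, one-year-ahead restatements: T = 119, C = 153.
    stat, p = mcnemar_chi2(119, 153)
    print(round(stat, 3), round(p, 3))
    print(round(mcnemar_exact(119, 153), 3))
```

The exact test is the safer choice for the sparse cells of table 6, where one or both discordant counts are in the single digits.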

The results presented in table 6 (panel B) are similar to those in panel A. We find no evidence that higher equity incentives are associated with a higher frequency of accounting-related lawsuits. Instead, the pooled results for contemporaneous and one-year-ahead lawsuit frequency show more lawsuits for firms with lower incentives relative to firms with higher incentives (p-value = 0.018 and 0.025, respectively). Looking forward one year, there is also evidence of fewer lawsuits at firms with higher equity incentives, where the equity incentives differ by two quintiles (p-value = 0.031).

Finally, the results in table 6 (panel C), which relates to AAER damage periods, are consistent with those in panels A and B. There is no evidence of a positive association between equity incentives and the incidence of AAER damage periods. However, there is evidence that higher equity incentives are associated with a lower incidence of AAERs. Specifically, when observations are pooled and when the difference in the level of equity incentives is one or two, there are more contemporaneous AAERs for the control firms relative to the treatment firms (p-value = 0.010, 0.070, and 0.066, respectively).

Overall, the results in table 6 do not provide evidence of a positive association between equity incentives and the frequency of accounting irregularities. In contrast, there is a modest negative association between incentives

27 Table 6 reports $Incent only when the difference between equity incentives quintiles is equal to 1 (and for pooled data), since this reflects the minimum equity-incentive distance and there is sufficient sample size for tests of median differences. All treatment-control $Incent differences are statistically significant (p < 0.01, two-tailed) using a Kruskal–Wallis test.


and the frequency of accounting irregularities. Thus, our results are more consistent with incentive alignment than with managerial rent extraction.

5.2 HIDDEN BIAS SENSITIVITY

It is well known that the results of nonexperimental empirical studies are susceptible to hidden bias caused by the omission of an unobservable yet relevant variable (i.e., a correlated omitted variable). Surprisingly, few empirical accounting studies attempt to quantify the potential effects of hidden bias on their primary conclusions. We use a bounding approach outlined by Rosenbaum [2002] and DiPrete and Gangl [2004] to assess the sensitivity of our inferences to potential hidden bias that might exist because of endogenous matching of executives and equity-incentive contracts and other similar factors.28 Rosenbaum [2002] and DiPrete and Gangl [2004] note that although propensity-score matching effectively alleviates overt bias relating to observable covariates, it does not remove hidden bias that might arise from unobserved covariates. Both studies outline an approach to identify the limits at which an unobservable confounding variable would alter inferences that can be drawn from an analysis based on only the observed variables.

Rosenbaum [2002] shows that hidden bias exists if two observations (denoted i and j) have the same observed x covariates but different probabilities (denoted as π) of receiving treatment because of some unobserved factor. In the case of a binary treatment, the odds that each observation, i and j, receive treatment are π_i/(1 − π_i) and π_j/(1 − π_j), respectively. Since these two observations look similar across their observable covariates x, they would be paired by a matching algorithm to minimize overt bias. If the odds ratio (denoted as Γ by Rosenbaum [2002]) does not equal 1, each observation in a matched pair has an unequal probability of receiving treatment and there is a hidden bias inherent in the analysis. Rosenbaum [2002] shows that relaxing the assumption that Γ = 1 (i.e., that two observations with identical observable covariates have an identical probability of receiving treatment) can be used to compute significance test boundaries under different assumptions about the strength of the hidden bias that is necessary to alter the qualitative inferences from a study.
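The bounding logic can be illustrated with a Python sketch for matched pairs with a binary outcome. It relies on one common formulation of the bound, which is an assumption here: if the odds ratio of treatment within a pair lies between 1/Γ and Γ, then within a discordant pair the probability that the irregularity falls on a given member is at most Γ/(1 + Γ), so the worst-case one-sided p-value is an upper binomial tail evaluated at that probability. The paper does not state which computation produced the Γ values in table 6, so this sketch should not be expected to reproduce them. The function names and the example counts are hypothetical.

```python
from math import comb


def upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))


def worst_case_pvalue(larger, smaller, gamma):
    """Upper bound on the one-sided p-value when hidden bias is at most gamma.

    `larger` and `smaller` are the two discordant counts. Intended for
    moderate counts; very large counts would overflow the float arithmetic.
    """
    n = larger + smaller
    return upper_tail(n, larger, gamma / (1.0 + gamma))


def critical_gamma(larger, smaller, alpha=0.10, hi=50.0, tol=1e-6):
    """Smallest gamma at which the worst-case p-value reaches alpha (bisection).

    Returns 1.0 when the result is not significant even without hidden bias.
    Assumes the worst-case p-value at `hi` is at least alpha.
    """
    if worst_case_pvalue(larger, smaller, 1.0) >= alpha:
        return 1.0
    lo = 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if worst_case_pvalue(larger, smaller, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    # Hypothetical discordant counts: 40 control-only and 20 treatment-only events.
    for gamma in (1.0, 1.25, 1.5, 2.0):
        print(gamma, round(worst_case_pvalue(40, 20, gamma), 4))
    print(round(critical_gamma(40, 20), 2))
```

A value of Γ close to 1 indicates that a small departure from equal treatment odds within pairs would be enough to remove statistical significance.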

We assess the sensitivity of observed statistically significant results by estimating the boundary Γ values for cases in which McNemar’s test p-values

28 It is quite likely that hidden bias is present in this study, as well as the papers summarized in table 1, because of selection on unobservables (Heckman and Hotz [1989]). Selection on unobservables occurs when it is not possible to observe all of the covariates that affect a decision maker’s selection. For example, if certain firms use contracts with a high level of equity incentives to select more risk-seeking executives and more risk-seeking executives choose firms that offer riskier compensation packages (with higher equity incentives), there is endogenous matching on executive risk aversion. If risk-seeking executives are also more likely to manipulate accounting reports, this endogenous matching on an unobservable variable (i.e., the degree of CEO risk aversion) would induce hidden bias in our results, and we might misattribute a difference in the frequency of accounting irregularities to differences in the level of CEO equity incentives, rather than to differences in the degree of CEO risk aversion.


are below 0.10 (two-tailed). Specifically, in the cases in which there is a statistically significant difference between the outcomes of the treatment and control groups, we calculate the value of Γ (or the odds ratio) at which a significance level of 0.10 would be obtained. These Γ values allow us to quantify the amount of hidden bias necessary to invalidate the statistical significance that results from the assumption that two observations with identical propensity scores have an equal probability of receiving the treatment.

These boundary values are presented (where applicable) in table 6. We find that there are several cases in which a statistically significant relationship is observed, yet only a small Γ value is needed to reduce the statistical significance of the result.29 This finding suggests that these results are very sensitive to hidden bias and should be interpreted with caution. For example, consider the case of one-year-ahead restatements related to accounting manipulation for the 5–4 incentive quintile pairs presented in panel A of table 6. In this case, there are 55 restatements observed from the control group and 31 restatements observed from the treatment group. Although this difference is statistically significant (p-value = 0.013), a value of Γ = 1.37 would result in a p-value of 0.10. Therefore, this result would be marginally significant (p-value = 0.10) if control firms were actually 1.37 times more (rather than equally) likely to receive lower equity incentives than treatment firms, after conditioning on observable features of the contracting environment using the propensity score.

There are also cases in which the observed results are much less sensitive to hidden bias. For example, panel C of table 6 reports 9 AAER incidents for the fifth incentive quintile and 29 AAER incidents for the fourth incentive quintile in the contemporaneous AAER estimation (p-value = 0.002). For this comparison, we find that Γ = 6.26, which provides robust support for the inference that very high levels of CEO equity incentives minimize AAER frequency.

Overall, there is mixed evidence on whether results are robust to hidden bias. The most robust results occur in the higher quintiles of the equity-incentive distribution. As discussed above, these results also provide some evidence of a negative association between incentives and irregularities. Thus, the results that appear least sensitive to potential hidden bias are those that are consistent with an incentive-alignment explanation.

29 To our knowledge, no objective benchmark exists to determine whether a given Γ is “large” or “small.” Therefore, the designation is subjective and depends on the reader’s prior beliefs as to the degree of endogenous selection on unobservable factors (e.g., risk-aversion, talent, productivity) in CEO contracting. Larger values of Γ, however, provide greater confidence that results are robust to hidden bias. Smaller values of Γ indicate that results are sensitive to hidden bias, thereby confounding inferences from the analysis. Future research should consider identifying threshold Γ values, perhaps through gathering and evaluating an empirical distribution of Γ values implicit in other studies. An alternative approach outlined by Altonji, Elder, and Taber [2005] is to express the degree of selection on unobservable factors relative to the degree of selection on observable factors that would be necessary to alter the statistical significance of the results.


6. Sensitivity Analyses

6.1 PROXY FOR CEO EQUITY INCENTIVES

It is possible that our proxy for equity incentives does not adequately measure the degree to which CEOs’ utility is sensitive to changes in firm value. To assess the sensitivity of our results to our choice of equity-incentive proxy, we consider an alternative equity-incentive measure, EqCompMix. This measure is computed as the ratio of the risk-neutral dollar value of options plus restricted stock compensation to the risk-neutral value of total annual compensation (i.e., stock options, restricted stock, salary, bonus, and target long-term incentive-plan payouts). This (or a similar) incentive measure has been used in prior studies (e.g., Erickson, Hanlon, and Maydew [2006]; Baber, Kang, and Liang [2007]). In addition, compensation consultants commonly use equity mix in their recommendations to the board concerning executive compensation, and it may be a suitable alternative proxy for managerial incentives. After re-estimating the propensity-score model, matching algorithm, and primary tests, we find results (untabulated) to be generally similar to those reported in table 6.

6.2 EQUILAR VERSUS EXECUCOMP SAMPLE

It is possible that our results are sensitive to sample selection because prior studies generally use data from the larger and more mature firms that comprise the ExecuComp database. To evaluate this possibility, we re-estimate our results after constraining the sample to the subset of Equilar firms that also appear in the ExecuComp database. Our results (not tabulated) are consistent with those reported in table 6.

6.3 ECONOMETRIC APPROACH

Since the propensity-score matched-pair research design is quite different from the more traditional outcome-matched logistic regression, it is instructive to examine the sensitivity of our results to the choice of econometric approach. In table 7 (panel A), we report conditional logistic estimates from an outcome-matched sample. This sample was generated by matching (without replacement) firms with an accounting irregularity to firms without an accounting irregularity by year, two-digit SIC code, and total assets. The remaining variables from the propensity-score estimation (see table 3) are included as controls. In contrast to the propensity-score results in table 6, we find little statistically significant evidence of an association between accounting irregularities and equity incentives.30
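The construction of an outcome-matched sample can be sketched in Python as follows. The sketch pairs each irregularity firm with the closest firm by total assets among firms without an irregularity in the same year and two-digit SIC code, and removes each match from the pool so that matching is without replacement. The greedy processing order, the record layout, and the function name are assumptions made for illustration; the text does not specify how ties or the order of matching were handled.

```python
def outcome_match(irregular, clean):
    """Greedy 1:1 matching without replacement on year, two-digit SIC, and assets.

    Each firm is a dict with the hypothetical keys 'id', 'year', 'sic2',
    and 'assets'. Returns a list of (irregular_id, matched_id) tuples.
    """
    available = list(clean)  # copy so the caller's list is left untouched
    matches = []
    for firm in irregular:
        candidates = [
            c for c in available
            if c["year"] == firm["year"] and c["sic2"] == firm["sic2"]
        ]
        if not candidates:
            continue  # no eligible control in the same year and industry
        best = min(candidates, key=lambda c: abs(c["assets"] - firm["assets"]))
        available.remove(best)  # without replacement
        matches.append((firm["id"], best["id"]))
    return matches


if __name__ == "__main__":
    # Invented records, for illustration only.
    irregular = [
        {"id": "A", "year": 2003, "sic2": "36", "assets": 100.0},
        {"id": "B", "year": 2003, "sic2": "36", "assets": 105.0},
    ]
    clean = [
        {"id": "X", "year": 2003, "sic2": "36", "assets": 104.0},
        {"id": "Y", "year": 2003, "sic2": "36", "assets": 300.0},
    ]
    print(outcome_match(irregular, clean))
```

Because the matching conditions on the outcome rather than on the treatment, nothing in this procedure balances the remaining covariates across levels of equity incentives, which is the contrast with propensity-score matching that this section examines.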

To explore the sensitivity of our results further, table 7 (panel B) provides comparative results of the covariate balance obtained from propensity-score

30 The only statistically significant difference is observed for one-year-ahead AAERs for which the estimated coefficient for equity quintile 3 is smaller than both the estimated coefficients for equity quintiles 2 and 4. These results are mixed and are inconsistent with results in table 6.


TABLE 7
Sensitivity Analysis: Conditional Logistic Regression

Panel A: Regression estimates

                                         Restatement_t         Lawsuit_t            AAER_t
                                       Coeff.   z-Stat   Coeff.   z-Stat   Coeff.   z-Stat

EqIncQuint 2 = 1                        0.037     0.10   −0.219    −0.60   −0.794    −0.72
EqIncQuint 3 = 1                       −0.014    −0.06   −0.193    −0.67   −0.412    −0.36
EqIncQuint 4 = 1                       −0.288    −0.66   −0.012    −0.03   −0.174    −0.17
EqIncQuint 5 = 1                       −0.406    −0.94   −0.613    −1.35   −1.382    −1.09
Matched CEO–firm-year obs.                         770               798               262
Adj. Pseudo-R2                                   0.144             0.198             0.429

Tests of Coefficients                          p-Value           p-Value           p-Value
EqIncQuint 2 = 1 ≠ EqIncQuint 3 = 1               0.89              0.90              0.38
EqIncQuint 2 = 1 ≠ EqIncQuint 4 = 1               0.38              0.51              0.32
EqIncQuint 2 = 1 ≠ EqIncQuint 5 = 1               0.20              0.23              0.52
EqIncQuint 3 = 1 ≠ EqIncQuint 4 = 1               0.47              0.40              0.75
EqIncQuint 3 = 1 ≠ EqIncQuint 5 = 1               0.24              0.12              0.31
EqIncQuint 4 = 1 ≠ EqIncQuint 5 = 1               0.74              0.02              0.13

                                       Restatement_t+1       Lawsuit_t+1          AAER_t+1
                                       Coeff.   z-Stat   Coeff.   z-Stat   Coeff.   z-Stat

EqIncQuint 2 = 1                       −0.372    −1.41   −0.076    −0.17    1.666     1.19
EqIncQuint 3 = 1                       −0.073    −0.24   −0.163    −0.33   −0.691    −0.39
EqIncQuint 4 = 1                       −0.108    −0.33   −0.115    −0.20    2.258     1.13
EqIncQuint 5 = 1                       −0.767    −1.79   −0.263    −0.58    0.437     0.24
Matched CEO–firm-year obs.                         668               514               176
Adj. Pseudo-R2                                   0.095             0.184             0.506

Tests of Coefficients                          p-Value           p-Value           p-Value
EqIncQuint 2 = 1 ≠ EqIncQuint 3 = 1               0.89              0.81              0.01
EqIncQuint 2 = 1 ≠ EqIncQuint 4 = 1               0.38              0.93              0.60
EqIncQuint 2 = 1 ≠ EqIncQuint 5 = 1               0.20              0.69              0.40
EqIncQuint 3 = 1 ≠ EqIncQuint 4 = 1               0.47              0.90              0.00
EqIncQuint 3 = 1 ≠ EqIncQuint 5 = 1               0.24              0.76              0.19
EqIncQuint 4 = 1 ≠ EqIncQuint 5 = 1               0.74              0.71              0.12

(Continued)


TABLE 7 (Continued)

Panel B: Covariate balance between irregularity and matched observations

                             Table 5   Restatement_t Restatement_t+1       Lawsuit_t     Lawsuit_t+1          AAER_t        AAER_t+1
                              Median          Median          Median          Median          Median          Median          Median
                         Trt.-Cntrl.    Irreg.-Match    Irreg.-Match    Irreg.-Match    Irreg.-Match    Irreg.-Match    Irreg.-Match

PortDelta                   1.193***       −0.291***          −0.117          −0.209           0.029         −0.248*           0.050
Leverage                       0.000        0.034***        0.019***        0.010***           0.000        0.027***        0.019***
Log(MarketCap)                 0.041           0.000           0.002           0.003           0.002           0.002           0.002
Log(Idiosyncrisk)          −0.036***        0.059***           0.041        0.118***        0.116***         0.134**         0.127**
MkttoBook                   0.057***       −0.312***        −0.277**          −0.105          −0.026          −0.320          −0.082
Log(1 + Tenure)             0.000***       −0.261***           0.000           0.000           0.169          −0.000           0.084
OutsideChmn                    0.000           0.000           0.000           0.000           0.000           0.000           0.000
OutsideLdDir                   0.000           0.000           0.000           0.000           0.000           0.000           0.000
CEOApptdOutsDirs               0.000        −0.066**           0.000           0.000           0.000           0.000           0.000
StaggeredBd                    0.000           0.000           0.000           0.000           0.000           0.000           0.000
PctOldOutsDirs                 0.000           0.000           0.000        0.000***           0.000         −0.000*           0.000
PctBusyOutsDirs                0.000           0.000           0.015           0.000           0.000          −0.033          −0.054
PctFoundingDirs                0.000           0.000           0.000           0.000           0.000           0.000           0.000
OutsideDirHolds                0.000           0.000           0.000           0.000           0.000           0.000           0.000
Log(NumberDirs)                0.000           0.068           0.000           0.000           0.000           0.000           0.000
PctFinExpsAud                  0.000           0.000           0.000           0.000           0.000           0.000           0.000
DirCompMix                     0.000          −0.004           0.000         0.035**        0.077***          −0.050          −0.025
Log(NumInstOwns)               0.000           0.063           0.057         0.079**        0.083***           0.141           0.115
Log(NumBlockhldrs)             0.000           0.000           0.000           0.000           0.000           0.000           0.000
Log(Activists)                 0.000           0.000           0.000           0.000           0.000           0.000           0.000
Number of pairs                4,559             385             334             399             257             131              88

EqIncQuint is a dichotomous variable that equals 1 if the CEO’s portfolio delta (PortDelta) falls within the kth quintile of the cross-sectional distribution of CEO deltas and equals 0 otherwise. PortDelta is the change in dollar value of the CEO’s equity portfolio for a 1% change in the firm’s underlying stock price. Leverage is the ratio of total debt to market value of assets computed from Compustat as (data9 + data34)/((data199 * data25) + data9). MarketCap is the market value of equity computed from Compustat as (data199 * data25). Idiosyncrisk is the standard deviation of residuals from a firm-specific regression of monthly returns on the monthly return to the CRSP value-weighted portfolio index (Core and Guay [1999]). At least 12 and no more than 36 monthly return observations are required for estimation. MkttoBook is the market value of equity divided by the book value of equity computed from Compustat as ((data199 * data25)/data216). Tenure is the CEO’s tenure with the firm in years, as provided by Equilar. OutsideChmn is a dichotomous variable that equals 1 if the board chairman is delineated as an outsider by Equilar and is 0 otherwise. OutsideLdDir is a dichotomous variable that equals 1 if the lead independent director is delineated as an outsider by Equilar and is 0 otherwise. CEOApptdOutsDirs is the number of outside directors whose tenure is less than the CEO’s tenure, scaled by the total number of directors. StaggeredBd is a dichotomous variable that equals 1 if Equilar delineates the board service terms as staggered and is 0 otherwise. PctOldOutsDirs is the ratio of outside directors who are at least 69 years old to total directors. PctBusyOutsDirs is the ratio of outside directors who serve simultaneously on at least two boards to total directors. PctFoundingDirs is the ratio of directors who are founding firm members to total directors. OutsideDirHolds is the ratio of the sum of shares held by outside directors to total shares outstanding. NumberDirs is the number of directors on the board. PctFinExpsAud is the ratio of directors with financial expertise who serve on the audit committee to total directors. DirCompMix is the ratio of total dollar equity compensation to total equity plus cash compensation for nonexecutive directors. NumInstOwns is the number of institutional owners delineated in the CDA/Spectrum database. NumBlockhldrs is the number of institutional owners that own at least 5% of outstanding shares. Activists is the number of institutional owners denoted as activists by Cremers and Nair [2005] and Larcker, Richardson, and Tuna [2007]. *, **, and *** denote statistical significance at the 0.10, 0.05, and 0.01 levels, respectively, from a KS bootstrap test of median differences.


matching with those obtained from standard partial outcome-based matching. Although it is difficult to make direct comparisons with prior studies because of differences in sample composition and sample size, these results show that standard partial outcome-matching generally does not achieve balance for Leverage and Log(Idiosyncrisk) and often does not achieve balance for MkttoBook. In contrast, propensity-score matching appears to achieve balance for Leverage and yields generally smaller median differences for Log(Idiosyncrisk) and MkttoBook across samples. These results at least suggest the possibility that the difference in results between tables 6 and 7 is related to the absence of covariate balance in the outcome-matched sample.

Equally important, covariate balance comparisons clearly show that propensity-score matching induces considerably more variation in the primary variable of interest, PortDelta. This is an important distinction, because increasing variation in the treatment variable will generate more powerful tests of the relationship between equity incentives and accounting irregularities.

6.4 TIME PERIOD

To assess the sensitivity of our results to alternative (earlier) time periods, we compare the results produced by standard partial outcome-matched logistic regression and the propensity score using the AAER sample in Erickson, Hanlon, and Maydew [2006]. In this test, we examine the association between one-year-ahead AAERs and the level of equity incentives. Following Erickson, Hanlon, and Maydew [2006], each AAER firm is matched to two firms without an AAER from the same year, two-digit SIC code, and similar total assets. Although Hosmer and Lemeshow [2000] show that conditional logistic regression is more appropriate when the sample is formed by matching on both the dependent and independent variables, we report the results for both standard logistic and conditional logistic estimation in table 8.³¹

We include control variables used by Erickson, Hanlon, and Maydew [2006] in this estimation.

Consistent with Erickson, Hanlon, and Maydew [2006], we do not observe evidence of a relationship between the incidence of AAERs and the level

31 Accounting researchers often cite Maddala [1991] to justify estimation methods regarding limited dependent variables. In limited dependent variable regressions (e.g., logit or probit) in which observations are matched based on outcome alone, Maddala [1991] shows that bias is observed only in the intercept. Therefore, in this specific setting, one can draw unbiased inferences from nonintercept coefficients and can correct for the bias in the intercept (e.g., King and Zeng [2001]). Hosmer and Lemeshow [2000] show, however, that conditional logistic regression is required to produce appropriate inferences in cases in which observations are matched based on outcome and on selected control variables. In this specific setting, conditional logistic regression is necessary to account for the lack of independence between matched pairs in the sample, because pair component observations are not randomly sampled. When Maddala [1991, p. 790] states that the conditional logit “. . . is not relevant for the problems in accounting that we are dealing with,” he is not considering cases in which the sample is formed by matching on both the dependent and selected independent variables.


of CEO equity incentives from estimates produced by standard logistic regression. However, conditional logistic regression provides some evidence of a positive association between the incidence of AAERs and the level of CEO equity incentives. The association is most pronounced for CEOs in the highest incentive quintile and is more consistent with results from several prior studies (table 1). This finding also illustrates that inferences from a matched sample are sensitive to the choice of standard or conditional logistic estimation.³²

Finally, we estimate the results from a propensity-score matched-pair design within this sample.³³ Similar to the conditional logistic regression results, table 8 (panel B) reports evidence of a positive association between the incidence of AAERs and the level of CEO equity incentives at the highest level of the equity-incentive distribution. However, in contrast to conditional logistic results in table 8 (panel A), panel B also reports evidence of a negative association between the incidence of AAERs and the level of CEO equity incentives at the lowest level of the equity-incentive distribution. Γ values

TABLE 8
Sensitivity Analysis: Early Sample

Panel A: Regression estimates

Variable | AAER(t+1): Coeff. | AAER(t+1): z-Stat | AAER(t+1): Coeff. | AAER(t+1): z-Stat
EqIncQuint 2 = 1 | −0.155 | −0.21 | −0.040 | −0.04
EqIncQuint 3 = 1 | 0.642 | 0.87 | 1.066 | 1.00
EqIncQuint 4 = 1 | 0.099 | 0.13 | 1.158 | 0.97
EqIncQuint 5 = 1 | 1.063 | 1.36 | 3.414 | 2.30
CEO–firm-year obs. | 150 | | 150 |
Adj. Pseudo-R2 | 0.191 | | 0.412 |
Estimation Method | Logistic | | Cond. Logistic |

Tests of Coefficients | p-Value | p-Value
EqIncQuint 2 = 1 ≠ EqIncQuint 3 = 1 | 0.25 | 0.25
EqIncQuint 2 = 1 ≠ EqIncQuint 4 = 1 | 0.72 | 0.23
EqIncQuint 2 = 1 ≠ EqIncQuint 5 = 1 | 0.09 | 0.01
EqIncQuint 3 = 1 ≠ EqIncQuint 4 = 1 | 0.39 | 0.91
EqIncQuint 3 = 1 ≠ EqIncQuint 5 = 1 | 0.50 | 0.02
EqIncQuint 4 = 1 ≠ EqIncQuint 5 = 1 | 0.12 | 0.02

(Continued)

32 Although we followed the methods used by Erickson, Hanlon, and Maydew [2006] to select our control sample, our control sample may differ from theirs. Thus, we cannot make a direct comparison between the results in table 8 and those in Erickson, Hanlon, and Maydew [2006].

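33 The CEO incentive propensity score is estimated as a function of the regressors reported in Erickson, Hanlon, and Maydew [2006]. Specifically,

$$
\begin{aligned}
\mathit{EqIncQuint} ={}& \beta_1 + \beta_2\,\text{CEO = CHAIR} + \beta_3\,\text{NUMMTGS} + \beta_4\,\text{FINANCING} + \beta_5\,\text{LEVERAGE} \\
&+ \beta_6\,\text{MARKET VALUE OF EQUITY} + \beta_7\,\text{ALTMAN'S Z} + \beta_8\,\text{BOOK TO MARKET} \\
&+ \beta_9\,\text{EARNINGS TO PRICE} + \beta_{10}\,\text{RET ON ASSETS} + \beta_{11}\,\text{SALES GROWTH} \\
&+ \beta_{12}\,\text{AGE OF FIRM} + \beta_{13}\,\text{M\&A IN FIRST YEAR OF FRAUD} + \beta_{14}\,\text{STOCK VOLATILITY} \\
&+ \beta_{15}\,\text{CEO TENURE} + \beta_{16}\,\text{MISSING CEO TENURE} + \varepsilon_i .
\end{aligned}
$$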
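As a rough illustration of the two steps this footnote implies (estimate a propensity score by logistic regression, then pair observations with similar scores), the sketch below uses a single made-up covariate and a binary treatment. This is a deliberate simplification: the paper's treatment is a quintile ranking with sixteen regressors and a nonbipartite matching algorithm, none of which is reproduced here. The function names, the data, the gradient-ascent fit, and the caliper of 0.25 are all assumptions for illustration.

```python
import math

def fit_logit(x, d, lr=0.1, iters=5000):
    """Fit P(D=1 | x) = 1 / (1 + exp(-(a + b*x))) by gradient ascent
    on the mean log-likelihood. Returns the intercept a and slope b."""
    a, b = 0.0, 0.0
    n = len(x)
    for _ in range(iters):
        grad_a = grad_b = 0.0
        for xi, di in zip(x, d):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            grad_a += di - p
            grad_b += (di - p) * xi
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b

def greedy_match(scores, d, caliper=0.25):
    """Pair each treated unit with the closest unused control whose
    propensity score lies within the caliper (matching without replacement)."""
    available = [j for j, dj in enumerate(d) if dj == 0]
    pairs = []
    for i, di in enumerate(d):
        if di != 1 or not available:
            continue
        j = min(available, key=lambda k: abs(scores[k] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs

# Hypothetical data: one covariate; treated units tend to have larger x.
x = [1.0, 2.0, 2.5, 3.0, 4.0, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.5]
d = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

a, b = fit_logit(x, d)
scores = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
pairs = greedy_match(scores, d)
for i, j in pairs:
    print(f"treated x={x[i]:.1f} (score {scores[i]:.3f})  <->  "
          f"control x={x[j]:.1f} (score {scores[j]:.3f})")
```

Greedy matching is used only because it is short; its result depends on the order in which treated units are processed, which an optimal matching algorithm would avoid.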


TABLE 8 (Continued)

Panel B: AAER frequency for treatment (high CEO equity incentives) and control (low CEO equity incentives) groups, matched by propensity score

EqIncQuint T | EqIncQuint C | AAER Frequency T | AAER Frequency C | p | Γ
5 | 4 | 16 | 5 | 0.029 | 1.68
5 | 3 | 0 | 0 | 1.000 |
5 | 2 | 0 | 0 | 1.000 |
5 | 1 | 0 | 0 | 1.000 |
4 | 3 | 3 | 6 | 0.505 |
4 | 2 | 0 | 0 | 1.000 |
4 | 1 | 0 | 0 | 1.000 |
3 | 2 | 3 | 2 | 1.000 |
3 | 1 | 0 | 0 | 1.000 |
2 | 1 | 2 | 13 | 0.010 | 11.92
Pooled | Pooled | 24 | 26 | 0.888 |

DiffEqIncQuint (T − C) | AAER Frequency T | AAER Frequency C | p | Γ
4 | 0 | 0 | 1.000 |
3 | 0 | 0 | 1.000 |
2 | 0 | 0 | 1.000 |
1 | 24 | 26 | 0.888 |

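Regression statistics in panel A derive from logistic or conditional logistic regression of

$$
\begin{aligned}
\mathit{AAER} ={}& \beta_1 + \sum_{k=2}^{5} \beta_k\,\mathit{EqIncQuint}_k + \beta_6\,\text{CEO = CHAIR} \\
&+ \beta_7\,\text{NUMMTGS} + \beta_8\,\text{FINANCING} + \beta_9\,\text{LEVERAGE} \\
&+ \beta_{10}\,\text{MARKET VALUE OF EQUITY} + \beta_{11}\,\text{ALTMAN'S Z} \\
&+ \beta_{12}\,\text{BOOK TO MARKET} + \beta_{13}\,\text{EARNINGS TO PRICE} \\
&+ \beta_{14}\,\text{RET ON ASSETS} + \beta_{15}\,\text{SALES GROWTH} + \beta_{16}\,\text{AGE OF FIRM} \\
&+ \beta_{17}\,\text{M\&A IN FIRST YEAR OF FRAUD} + \beta_{18}\,\text{STOCK VOLATILITY} \\
&+ \beta_{19}\,\text{CEO TENURE} + \beta_{20}\,\text{MISSING CEO TENURE} + \varepsilon_i .
\end{aligned}
$$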

EqIncQuint is a dichotomous variable that equals 1 if the CEO’s portfolio delta falls within the kth quintile of the cross-sectional distribution of CEO deltas and equals 0 otherwise. The portfolio delta is the change in dollar value of the CEO’s equity portfolio for a 1% change in the firm’s underlying stock price. See Erickson, Hanlon, and Maydew [2006] for other variable definitions.

Cells in panel B contain the discordant pair frequency of observing an accounting irregularity. In other words, the frequency count in the Treatment category denotes the number of observations where there is an observed accounting irregularity in the treatment group but no observed accounting irregularity in the control group. p-values are computed using McNemar’s nonparametric test for differences in frequency across distributions. DiffEqIncQuint is the difference between EqIncQuint for the Treatment and Control groups. Propensity scores are estimated through logistic regression of

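$$
\begin{aligned}
\mathit{EqIncQuint} ={}& \beta_1 + \beta_2\,\text{CEO = CHAIR} + \beta_3\,\text{NUMMTGS} + \beta_4\,\text{FINANCING} \\
&+ \beta_5\,\text{LEVERAGE} + \beta_6\,\text{MARKET VALUE OF EQUITY} + \beta_7\,\text{ALTMAN'S Z} \\
&+ \beta_8\,\text{BOOK TO MARKET} + \beta_9\,\text{EARNINGS TO PRICE} + \beta_{10}\,\text{RET ON ASSETS} \\
&+ \beta_{11}\,\text{SALES GROWTH} + \beta_{12}\,\text{AGE OF FIRM} + \beta_{13}\,\text{M\&A IN FIRST YEAR OF FRAUD} \\
&+ \beta_{14}\,\text{STOCK VOLATILITY} + \beta_{15}\,\text{CEO TENURE} + \beta_{16}\,\text{MISSING CEO TENURE} + \varepsilon_i .
\end{aligned}
$$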

Γ values quantify the amount of hidden bias necessary to alter the statistical significance (p = 0.10) that results from the assumption that two observations with identical propensity scores have an equal probability of receiving treatment.
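The McNemar p-values in panel B can be checked from the discordant-pair counts alone. The sketch below assumes the two-sided chi-square form of the test with a continuity correction; the notes do not say which variant was used, but this one agrees with the reported p column to three decimals for the rows that have discordant pairs. The function name `mcnemar_p` and the row labels are illustrative, not taken from the paper.

```python
import math

def mcnemar_p(treated_only, control_only):
    """Two-sided McNemar p-value with continuity correction.

    treated_only: pairs where only the treated observation has an AAER.
    control_only: pairs where only the control observation has an AAER.
    """
    n = treated_only + control_only
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    stat = max(abs(treated_only - control_only) - 1, 0) ** 2 / n
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))

# Discordant-pair counts (T, C) copied from table 8, panel B.
panel_b = {
    "quintile 5 vs 4": (16, 5),
    "quintile 4 vs 3": (3, 6),
    "quintile 3 vs 2": (3, 2),
    "quintile 2 vs 1": (2, 13),
    "pooled": (24, 26),
}
for label, (t, c) in panel_b.items():
    print(f"{label}: p = {mcnemar_p(t, c):.3f}")
```

Only discordant pairs enter the statistic, which is why rows with zero counts in both columns report p = 1.000.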


reported in panel B indicate that the positive association observed at the upper end of the equity-incentive distribution is considerably more sensitive to potential hidden bias relative to the negative association observed at the lower end of the equity-incentive distribution.
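To make the idea of a Γ threshold concrete, the sketch below implements the textbook Rosenbaum-style bound for a McNemar-type test: if hidden bias can shift the odds of treatment within a pair by at most a factor Γ, the chance that the treated unit is the one with the event in a discordant pair is at most Γ/(1 + Γ), which gives a worst-case one-sided binomial p-value. This is an assumption-laden illustration, not the paper's procedure. It is not claimed to reproduce the Γ column of table 8, which may rest on a different statistic or a two-sided convention. The function names and the grid step are hypothetical.

```python
from math import comb

def upper_bound_p(n_discordant, n_treated_events, gamma):
    """Largest one-sided p-value consistent with hidden bias of at most gamma.

    Under bias gamma, the probability that the treated unit carries the event
    in a discordant pair is bounded above by gamma / (1 + gamma).
    """
    p_plus = gamma / (1.0 + gamma)
    return sum(
        comb(n_discordant, k) * p_plus ** k * (1.0 - p_plus) ** (n_discordant - k)
        for k in range(n_treated_events, n_discordant + 1)
    )

def critical_gamma(n_discordant, n_treated_events, alpha=0.10, step=0.01):
    """Smallest gamma on a grid at which the worst-case p-value exceeds alpha."""
    gamma = 1.0
    while upper_bound_p(n_discordant, n_treated_events, gamma) <= alpha:
        gamma += step
    return gamma

# Example: 21 discordant pairs, 16 with the event on the treated side.
print(upper_bound_p(21, 16, 1.0))   # gamma = 1 means no hidden bias
print(critical_gamma(21, 16))
```

A larger critical Γ means that more hidden bias would be needed to overturn the finding, which is the sense in which one association can be called less sensitive than another.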

There are two observations worth noting in this analysis. First, results are sensitive to research design choice and appear to be sensitive to the time period selected. Second, a nonmonotonic relationship may exist between equity incentives and accounting irregularities. Therefore, it is difficult to assess the relationship between equity incentives and accounting irregularities without considering research design choices that relax assumptions regarding the functional form linking treatment and control variables to the outcome.

7. Conclusion

The widespread use of “high-powered” equity incentives for CEOs and other top executives has generated interest in assessing whether these incentives align managers’ interests with those of shareholders or whether they instead induce managers to manipulate accounting information for personal gain. A number of studies have examined this question, but their evidence is quite mixed regarding the relationship between equity incentives and various accounting irregularities. This paper examines this research question using a larger sample that is more representative of the economy and an econometric approach that better alleviates overt bias and provides an assessment of hidden bias.

Using a propensity-score matched-pair research design that is robust to misspecification of the underlying functional form that confounds the traditional logistic regression approaches, we find little evidence of a positive relationship between CEO equity incentives and the incidence of accounting-related restatements, shareholder lawsuits alleging accounting manipulation, and AAERs. If anything, our results suggest that higher equity-based compensation and holdings may actually reduce the incidence of improper financial reporting. Specifically, we find some evidence that firms at which the CEO has greater equity incentives have a lower frequency of accounting irregularities than do firms with similar contracting environments at which the CEO has a relatively lower level of equity incentives.

Unlike most prior research, our results are most consistent with the notion that equity incentives play a role in aligning managers’ interests with those of shareholders with regard to financial reporting. In sensitivity analyses, we find results that are similar when we use CEO equity mix (rather than portfolio delta) and when we constrain our sample to ExecuComp (rather than Equilar) firms.

Results generated with a propensity-score matched-pair research design can be quite different from those produced using standard and conditional logistic regression. Since the propensity-score approach is robust to the functional form linking control variables to the outcome, propensity-score results


provide a better basis for statistical inference about the effect of the treatment in the absence of precise knowledge about the underlying structural relationship among the variables of interest. Moreover, the propensity-score approach allows for an explicit assessment of the sensitivity of the results to hidden bias (e.g., correlated omitted variables). Finally, there seems to be a temporal aspect for this research question, and this aspect may account for some of the differences between our results and those of prior research.

One important question that we do not answer is what, if not equity incentives, compels managers to engage in accounting manipulation. It would be useful for future research to develop and estimate structural models of managerial decisions regarding accounting manipulation. At this point, we do not know why executives engage in illegal and unethical behavior that can result in substantial legal and human-capital costs (e.g., Armstrong and Larcker [2009]). To gain further insight into this question, future research might consider behavioral explanations in addition to traditional economic or agency rationalizations. Manipulative behavior may result from social influence, where other firms, for example, manipulate accounting and thus allow executives to infer that this behavior is “legitimate.” Alternatively, this behavior may be a function of lax ethical norms in the firm or the personal characteristics of executives engaged in accounting irregularities (e.g., Chatterjee and Hambrick [2007]). Research in this direction would likely enhance our understanding of the determinants of accounting irregularities.

Finally, propensity-score methods should be considered for future empirical accounting research in which the hypothesized causal variable is an endogenous choice (except, perhaps, in settings in which the outcome variable is very costly to collect). In particular, researchers should use propensity-score methods to generate matched pairs that induce maximum variation in the causal variable of interest (i.e., a full sample match). This approach is consistent with fundamental research in econometrics and statistics and is an arguably superior econometric approach to matching on the outcome variable and relying on a regression model to control for confounding variables (i.e., a partial match). Future research should also consider bounding methods to explicitly quantify the sensitivity of the results for the primary causal variable to unobserved correlated omitted variables. This will provide readers with the necessary information to assess the extent to which reported results are robust to correlated omitted variable and endogeneity concerns.

APPENDIX A

Background for an Observational Study

The potential outcomes framework (Rubin [1974, 1977], Holland [1986], Heckman and Navarro-Lozano [2004]) is useful for illustrating the features of an observational study. Assume that, for each individual $i$, there is an indicator $D_i$ that equals 1 if the individual receives the treatment (e.g., high


equity incentives) and equals 0 otherwise. For each individual there is a potential outcome (e.g., accounting irregularity) if the individual receives the treatment, denoted $Y_1$, and another potential outcome (e.g., no accounting irregularity) if the individual does not receive the treatment, denoted $Y_0$. The potential outcomes for each individual are defined as $Y_i = Y_i(D_i)$, and these are a function of both observable (denoted by $X$) and unobservable outcome-specific covariates (denoted by $\varepsilon_0$ and $\varepsilon_1$).³⁴ In the case of additive separability, we can write these outcomes as follows.

$$Y_1 = \mu_1(X) + \varepsilon_1 \tag{A1a}$$

$$Y_0 = \mu_0(X) + \varepsilon_0. \tag{A1b}$$

The individual-level treatment effect, $\Delta = (Y_1 - Y_0)$, represents the effect of the treatment on a particular individual.³⁵ Although this quantity exists in theory, it cannot be observed because only one of the two potential outcomes is observed for any particular individual. The outcome that did not occur (e.g., $Y_0$ if the treatment was received) is referred to as the “counterfactual” outcome, and its unobservability creates an identification problem that precludes the determination of the treatment effect for a specific individual.

One way to address this identification problem is to group observations according to whether they received the treatment and estimate the difference between the average outcomes of the treatment and control groups (i.e., those that did and did not receive the treatment, respectively), which can identify the average treatment effect (ATE). One particularly important estimator of the average treatment effect is the average treatment effect on the treated (ATT), which is the effect of treatment for those individuals who actually receive treatment.³⁶

$$
\begin{aligned}
ATT &= E[\Delta \mid X, D = 1] \\
&= E[Y_1 - Y_0 \mid X, D = 1] \\
&= E[Y_1 \mid X, D = 1] - E[Y_0 \mid X, D = 1] \\
&= E[Y_1 \mid X, D = 1] - E[Y_0 \mid X, D = 0].
\end{aligned}
\tag{A2}
$$

34 The unobservable components $\varepsilon_0$ and $\varepsilon_1$ are indexed separately to allow for the possibility that these factors differ according to whether treatment was received. If the incidence of treatment is related to unobservables $\varepsilon_0$ and $\varepsilon_1$, then there is an endogenous relationship known as selection on the unobservables (Heckman and Robb [1985]), which results in hidden bias. Below, we discuss how bounds can be established on the size of this relationship relative to the relationship between the outcome and the observable variables.

35 A treatment effect is often referred to as a “causal effect,” which is defined as the difference between an observed outcome and its unobserved, counterfactual outcome.

36 The identifying assumption required to estimate ATT (and, implicitly, used by the matching method to estimate the ATT) derived by Heckman et al. [1997] is $E[Y_0 \mid X, D = 1] = E[Y_0 \mid X, D = 0] = E[Y_0 \mid X]$. This requires that the expected outcome of those not receiving treatment conditional on the observable covariates $X$ is the same regardless of whether treatment was received.


The ATT estimator compares the average outcome for those individuals who received treatment ($Y_1$) to the average outcome for those individuals who did not receive treatment (which serves as an estimate of the counterfactual outcome, $Y_0$).

In a matched-pair research design, each observation that received the treatment is paired with an observation that is similar along all other relevant observable dimensions (i.e., $X$) but that did not receive the treatment. Since each matched pair is similar in every observable respect except that one observation received the treatment while the other did not, any difference in the outcome can, in the absence of hidden bias, be attributed to the difference in treatment. The average effect of the treatment is calculated by combining equations (A1a) and (A1b) with equation (A2) as follows.³⁷

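$$
\begin{aligned}
E[Y_1 - Y_0 \mid X, D] &= E[\mu_1(X) - \mu_0(X) + \varepsilon_1 - \varepsilon_0 \mid X, D] \\
&= E[\mu_1(X) + \varepsilon_1 \mid X, D = 1] - E[\mu_0(X) + \varepsilon_0 \mid X, D = 0] \\
&= E[\mu_1(X) \mid X, D = 1] - E[\mu_0(X) \mid X, D = 0] \\
&= E[\mu_1(X) \mid X, D = 1] - E[\mu_0(X) \mid X, D = 1] \\
&= E[\mu_1(X) - \mu_0(X) \mid X].
\end{aligned}
\tag{A3}
$$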

It is important to note that even if the functional forms of $\mu_1$ and $\mu_0$ are different, matching on $X$ will still produce an unbiased estimate of the average treatment effect.
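A minimal numerical sketch of this point, under invented assumptions: the mean functions `mu0` (linear) and `mu1` (quadratic) are hypothetical, there is no noise and no hidden bias, and the control pool contains an exact match on X for every treated unit. Matching never needs to know either functional form; it only compares outcomes at equal X.

```python
def mu0(x):
    """Hypothetical control-outcome function (linear in x)."""
    return 1.0 + 0.5 * x

def mu1(x):
    """Hypothetical treated-outcome function (quadratic in x)."""
    return 2.0 + x ** 2

treated_x = [0.5, 1.0, 1.5, 2.0]
control_x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

# Observed data: treated units reveal Y1 only, controls reveal Y0 only.
treated = [(x, mu1(x)) for x in treated_x]
control = [(x, mu0(x)) for x in control_x]

def nearest_control(x, pool):
    """Return the control unit whose covariate is closest to x."""
    return min(pool, key=lambda unit: abs(unit[0] - x))

# Matched-pair differences: treated outcome minus matched control outcome.
pair_diffs = [y - nearest_control(x, control)[1] for x, y in treated]
att_matched = sum(pair_diffs) / len(pair_diffs)

# Benchmark that is only computable here because both functions are known.
att_true = sum(mu1(x) - mu0(x) for x in treated_x) / len(treated_x)

print(att_matched, att_true)  # prints: 2.25 2.25
```

With inexact matches or noisy outcomes the two numbers would differ in any one sample; the sketch isolates the functional-form point only.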

An alternative way to develop an estimator of the average treatment effect is to recast equations (A1a) and (A1b) in a “switching regression” framework (e.g., Roy [1951], Goldfeld and Quandt [1973], and Rubin [1978]) to yield the following linear model:

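$$
\begin{aligned}
Y &= Y_0 + (Y_1 - Y_0)D \\
&= \alpha_0 + \beta_0 X + \varepsilon_0 + \bigl(\alpha_1 + \beta_1 X + \varepsilon_1 - (\alpha_0 + \beta_0 X + \varepsilon_0)\bigr)D.
\end{aligned}
\tag{A4}
$$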

If the treatment solely affects the level of the outcome so that there is a homogeneous treatment effect (i.e., $\beta_0 = \beta_1$ and $\varepsilon_0 = \varepsilon_1$), this equation simplifies to

$$Y = \alpha_0 + \beta X + (\alpha_1 - \alpha_0)D + \varepsilon, \tag{A5}$$

and the estimated coefficient on the treatment indicator, $D$, provides an estimate of the treatment effect (i.e., $\alpha_1 - \alpha_0$). This approach assumes a

37 The second step is based on the assumption that $E[\varepsilon_1 \mid X, D = 1] = E[\varepsilon_0 \mid X, D = 0] = E[\varepsilon \mid X]$ or that the error is mean independent of the treatment. This assumption is referred to as “selection on observables” (Heckman and Robb [1985]) because it implies that there are no unobserved factors that affect selection into the treatment and control groups. As we discuss further below, one way to assess the importance of this assumption is to establish boundaries on the significance level of the results, to assess the degree to which selection on unobservable variables would be required to alter the conclusions of the study (Rosenbaum [2002]).


linear relationship between the outcome and controls. It also assumes that the relationship between the outcome and every control variable is identical for the treatment and control samples. The implications of violating these assumptions are developed in appendix B.
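The step from equation (A4) to equation (A5) can be spelled out by collecting the terms of (A4) that multiply $D$. The regrouping below is a worked restatement using the paper's own symbols, not an additional result:

```latex
\begin{aligned}
Y &= \alpha_0 + \beta_0 X + \varepsilon_0
     + \bigl[(\alpha_1 - \alpha_0) + (\beta_1 - \beta_0)X
     + (\varepsilon_1 - \varepsilon_0)\bigr] D \\
  &= \alpha_0 + \beta X + (\alpha_1 - \alpha_0) D + \varepsilon
     \qquad \text{when } \beta_0 = \beta_1 = \beta
     \text{ and } \varepsilon_0 = \varepsilon_1 = \varepsilon .
\end{aligned}
```

Written this way, the two restrictions are visible as the terms that drop out: $(\beta_1 - \beta_0)XD$, a slope that differs between the treatment and control samples, and $(\varepsilon_1 - \varepsilon_0)D$, an unobservable that differs with treatment.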

APPENDIX B

Comparison of Matching and Regression Approaches

To compare the efficacy of propensity-score matching relative to regression methods (including partial matching with regression), we rely on figure B1, which depicts three cases of the true underlying relationship between $Y$ (the outcome) and $X$ (the observed covariate or control variable). We assume that the probability density functions for $Y$ given $X$ are distributed normally with different means and possibly different variances. In the first case (panel A), both the treatment and control observations exhibit an identical linear relationship between $X$ and $Y$. In the second case (panel B), both the treatment and control observations exhibit a nonidentical linear relationship between $X$ and $Y$. In particular, the degree to which treatment affects the outcome is linear in both the treatment and control samples, but the slopes differ across this partition. In the third (and


FIG. B1. Inferring causal effects without covariate balance. This figure depicts a nonlinear relationship between the dependent and independent variables (i.e., $Y$ and $X$, respectively) for both the treatment and control samples. The average treatment effect is appropriately evaluated at the average of the overlapping support. Matching accomplishes this task by using only observations from the treatment and control samples in the region of overlapping support. Linear regression with a treatment indicator estimates a linear projection over the entire support of both the treatment and control distributions and assumes an identical slope, but different intercepts, for the two samples. In this example, the estimate from linear regression will understate the average treatment effect.


perhaps most realistic) case (panel C), both the treatment and control observations exhibit a nonidentical, nonlinear relationship between $X$ and $Y$. In this setting, the degree to which treatment affects the outcome may be nonlinear for both the treatment and control groups, and the functional form of the relationship differs across this partition.

Perhaps because of perceived difficulty in identifying an appropriate match across multiple dimensions, researchers often use a partial match with regression-controls research design. In this design, researchers match observations along only a few dimensions (e.g., year, industry, and size) and then “control” for other dimensions by including additional variables in a regression analysis (e.g., structure of the board of directors). Inferences from this research design (or from a regression without matching), however, rely on potentially unrealistic, stringent assumptions about the underlying relationship between the outcome variable and the “control” variables. If these assumptions are not satisfied, inferences are likely to be confounded. Consider, for example, a setting in which two firms are matched on size and industry membership and for which there is an additional covariate (e.g., board size) that is expected to be related to the outcome of interest. If the researcher pools observations, includes a treatment indicator, $D$, and estimates a linear regression of this relationship, the estimation model resembles:

$$Y_i = \alpha + \gamma D_i + \beta X_i + \varepsilon_i, \tag{B1}$$

where $X$ is a covariate that is not included in the matching procedure but is instead included as a “control.”³⁸ For this estimation, it can be shown that the coefficient for $X$ at the point $X = E(X_{OS})$ reflects a weighted average of the slope coefficients that would be estimated within the treatment and control groups separately.³⁹

In the case of an identical linear treatment effect illustrated in panel A of figure B1, the estimated coefficient on the treatment indicator will provide an unbiased estimate of the average treatment effect (i.e., $E[Y_1 \mid D = 1, X] - E[Y_0 \mid D = 0, X]$). This occurs because the slope coefficients are identical for both the treatment and control groups and can be seen from the expression for $\gamma$, which is the estimate of the treatment effect in equation (B1). The covariate in the regression essentially adjusts the estimated treatment and control means to the mean value of the covariate in the overlapping support,

38 For this example, we assume a linear functional form for expositional purposes only. The issues we discuss in this section generalize to any specific functional form estimation of a pooled, partial-match regression setting with additional controls.

39 Specifically, $\beta = \rho\beta_1 + (1 - \rho)\beta_0$, where $\rho$ and $(1 - \rho)$ are the fractions of the pooled sample that are from the treatment and control groups, respectively, and $\beta_0$ and $\beta_1$ represent the within-control and within-treatment sample slope coefficients, respectively. The average treatment effect is evaluated at point $E(X_{OS})$, since it is the expected value within the region of overlapping support for $X$.


$X_{OS}$ (although any value in the overlapping support will provide identical estimates in this case).

In the case of a nonidentical linear treatment effect, illustrated in panel B of figure B1, the coefficient estimate of the treatment indicator from equation (B1) will yield a biased estimate of the treatment effect, $\gamma$. This biased estimate occurs because the estimated slope coefficient in equation (B1) is the weighted-average pooled estimate for both the treatment and control samples, and this pooled estimate is not equal to the actual slope for either of the samples, so the estimated counterfactual is incorrect. In the particular case illustrated in panel B, the estimated treatment effect from equation (B1) will underestimate the true, average treatment effect. Conversely, the estimate of the effect of the control on the outcome will be overstated because a portion of the treatment effect will be misattributed to the control.⁴⁰ A solution to this problem (similar to the test of parallel lines in traditional analysis of covariance) is to alter equation (B1) to incorporate a separate intercept and slope for each sample and estimate:⁴¹

$$Y_i = \alpha + \gamma D_i + \beta X_i + \delta D_i * X_i + \varepsilon_i. \tag{B2}$$

If the researcher can correctly specify the functional form linking $X$ and $Y$ (e.g., linear over the entire range of $X$), the transformation from equation (B1) to equation (B2) will provide an unbiased estimate for the treatment effect.
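The contrast between equations (B1) and (B2) can be seen in a small noise-free example. Everything below is assumed for illustration: the data are invented, with control outcomes $1 + X$ and treated outcomes $2 + 2X$, so the true effect at covariate value $x$ is $1 + x$; the helper `ols` is a hypothetical pure-Python least-squares routine. One point the sketch makes explicit is that in (B2) the effect at a given $x$ is $\gamma + \delta x$, so it has to be evaluated at a chosen point such as the mean of the overlapping support.

```python
def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Hypothetical sample: controls span X = 0..6, treated units only X = 4..6.
x = [0, 1, 2, 3, 4, 5, 6, 4, 5, 6]
d = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
y = [2 + 2 * xi if di else 1 + xi for xi, di in zip(x, d)]

# Equation (B1): common slope, treatment shifts the intercept only.
alpha1, gamma_b1, beta1 = ols([[1, di, xi] for xi, di in zip(x, d)], y)

# Equation (B2): adds the D*X interaction, so slopes may differ by group.
alpha2, gamma2, beta2, delta2 = ols(
    [[1, di, xi, di * xi] for xi, di in zip(x, d)], y)

x_os = 5.0  # mean of X in the overlapping support [4, 6]
effect_b2 = gamma2 + delta2 * x_os

print(f"(B1) coefficient on D:      {gamma_b1:.4f}")
print(f"(B2) effect at X_OS = 5:    {effect_b2:.4f}")
print(f"true effect at X_OS = 5:    {1 + x_os:.4f}")
```

In this constructed sample the (B1) coefficient falls short of the true effect of 6 at the overlap mean, in line with the direction described for panel B, while (B2) recovers it because the linear-by-group specification happens to be correct here.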

Finally, the case of a nonidentical, nonlinear treatment effect is illustrated in panel C of figure B1. It can easily be seen that estimating a model similar to equation (B1) will almost never provide an unbiased estimate of the desired treatment effect. In the unlikely case that the underlying nonlinear structural model is known, the functions depicted in panel C can be estimated and the treatment effect can be evaluated at any point, but not by using linear regression. Fortunately, the average treatment effect can be estimated with a matched-pair research design, as shown in appendix A.

The discussion above shows that the addition of “control” variables in a regression framework may not adequately control for the effect of confounding variables on the outcome of interest. In particular, panel A of figure B1 illustrates the highly specialized case in which this approach will

40 The situation can also be reversed, depending on the relationship between the control covariate and the outcome in the treatment and control subsamples. It can easily be the case that the true average treatment effect is overstated because part of the effect of the control on the outcome is misattributed to the treatment. The key point is that the coefficient on the indicator variable in equation (A5) is not the correct estimate for the treatment effect.

41 Although this is a simple estimation modification, none of the papers discussed in table 1 examine whether the slopes on the covariates differ across the treatment and control groups.


provide an unbiased estimate of the average treatment effect. However, this outcome requires a constant slope linking $X$ to $Y$ across groups. Panels B and C of figure B1 illustrate how, in the more general case, modeling the relationship as linear results in a biased estimate of the treatment effect. In general, a matched research design (in which the match is performed along all relevant, observable dimensions) will provide a more robust estimate of the average treatment effect. The only case in which the regression approach can dominate the propensity-score matched design occurs when the structural model linking the outcome variable to the covariates is known and can be fully specified. However, knowledge of the underlying structural model is extremely unlikely, and misspecification of this structural model can result in additional sources of bias in the estimates of the treatment effect.

APPENDIX C

Variable Definitions

Variable | Definition | Data Source

EqIncQuint | The quintile ranking of the CEO’s portfolio delta, for which quintiles are computed annually from the cross-sectional distribution of portfolio deltas. Portfolio delta is calculated as the change in the risk-neutral dollar value of the CEO’s equity portfolio for a 1% change in the firm’s stock price. | Equilar

Leverage | The ratio of total debt to market value of assets computed as (data9 + data34)/((data199 ∗ data25) + data9) | Compustat

MarketCap | The market value of equity computed as (data199 ∗ data25) | Compustat

Idiosyncrisk | The standard deviation of residuals from a firm-specific regression of monthly returns on the monthly return to the CRSP value-weighted portfolio index using the previous 36 months (and requiring at least 12 months) of observations (Core and Guay [1999]) | CRSP

MkttoBook | The market value of equity divided by the book value of equity computed as ((data199 ∗ data25)/data216) | Compustat

Tenure | The CEO’s tenure with the firm in years | Equilar

OutsideChmn | A dichotomous variable that equals 1 if the chairman of the Board of Directors is an outsider and 0 otherwise | Equilar

OutsideLdDir | A dichotomous variable that equals 1 if the firm has appointed a lead independent director and 0 otherwise | Equilar

CEOApptdOutsDirs | The fraction of outside directors appointed by the CEO; calculated as the number of outside directors whose tenure is less than the CEO’s tenure, scaled by the total number of directors | Equilar

StaggeredBd | A dichotomous variable that equals 1 if the corporate directors have staggered terms and 0 otherwise | Equilar

PctOldOutsDirs | The number of outside directors who are at least 69 years old scaled by the total number of directors | Equilar

PctBusyOutsDirs | The number of outside directors who serve simultaneously on at least two boards scaled by the total number of directors | Equilar

PctFoundingDirs | The number of directors who are founders of the firm scaled by the total number of directors | Equilar

OutsideDirHolds | The number of shares held by outside directors scaled by the total number of shares outstanding | Equilar

NumberDirs | The number of directors on the board | Equilar

PctFinExpsAud | The number of directors with financial expertise who serve on the audit committee scaled by the total number of directors. Financial experts are directors who have experience as CEO, CFO, financial accountant, or auditor, or who have been licensed as a Certified Public or Chartered Accountant. This variable is manually coded from detailed biographical data. | Equilar

DirCompMix | The ratio of total dollar equity compensation to total equity plus cash compensation for nonexecutive directors | Equilar

NumberInstOwns | The number of institutional owners of the firm’s shares | CDA/Spectrum

NumBlockhldrs | The number of institutional owners that own at least 5% of the firm’s outstanding shares | CDA/Spectrum

Activists | The number of institutional owners denoted as activists. Activist shareholders are identified as CDA/Spectrum manager numbers 12,000, 12,100, 12,120, 18,740, 38,330, 81,590, 49,050, 54,360, 57,500, 58,650, 63,600, 63,850, 63,895, 66,550, 66,610, 66,635, 82,895, 83,360, 90,803, and 93,405 (Cremers and Nair [2005], Larcker, Richardson, and Tuna [2007]). | CDA/Spectrum

REFERENCES

ALTONJI, J. G.; T. E. ELDER; AND C. R. TABER. “Selection on Observed and Unobserved Variables: Assessing the Effectiveness of Catholic Schools.” Journal of Political Economy 113 (2005): 151–84.

ARMSTRONG, C.; J. BLOUIN; AND D. LARCKER. “The Incentives for Tax Planning.” Working paper, The Wharton School at the University of Pennsylvania, 2009.

ARMSTRONG, C.; C. D. ITTNER; AND D. LARCKER. “Economic Characteristics, Corporate Governance, and the Influence of Compensation Consultants on Executive Pay Levels.” Working paper, The Wharton School at the University of Pennsylvania, 2009.

ARMSTRONG, C., AND D. LARCKER. “Discussion of ‘The Impact of the Options Backdating Scandal on Shareholders’ and ‘Taxes and the Backdating of Stock Option Exercise Dates.’” Journal of Accounting and Economics 47 (2009): 50–58.


BABER, W.; S. KANG; AND L. LIANG. “Shareholder Rights, Corporate Governance, and Accounting Restatement.” Working paper, Georgetown University, 2007.

BERGSTRESSER, D., AND T. PHILIPPON. “CEO Incentives and Earnings Management.” Journal of Financial Economics 80 (2006): 511–29.

BLACK, F., AND M. SCHOLES. “The Pricing of Options and Corporate Liabilities.” Journal of Political Economy 81 (1973): 637–54.

BRICKLEY, J. A.; J. L. COLES; AND R. L. TERRY. “Outside Directors and the Adoption of Poison Pills.” Journal of Financial Economics 35 (1994): 371–90.

BURNS, N., AND S. KEDIA. “The Impact of Performance-Based Compensation on Misreporting.” Journal of Financial Economics 79 (2006): 35–67.

CADMAN, B.; S. KLASA; AND S. MATSUNAGA. “Evidence on How Systematic Differences Between ExecuComp and Non-ExecuComp Firms Can Affect Empirical Research Results.” Working paper, University of Oregon, 2006.

CHATTERJEE, A., AND D. HAMBRICK. “It’s All about Me: Narcissistic CEOs and Their Effects on Company Strategy and Performance.” Administrative Science Quarterly 52 (2007): 351–86.

CHENG, Q., AND T. D. WARFIELD. “Equity Incentives and Earnings Management.” The Accounting Review 80 (2005): 441–76.

CORE, J., AND W. GUAY. “The Use of Equity Grants to Manage Optimal Equity Incentive Levels.” Journal of Accounting & Economics 28 (1999): 151–84.

CORE, J., AND W. GUAY. “Estimating the Value of Employee Stock Option Portfolios and Their Sensitivities to Price and Volatility.” Journal of Accounting Research 40 (2002): 613–30.

CORE, J.; R. HOLTHAUSEN; AND D. LARCKER. “Corporate Governance, Chief Executive Officer Compensation, and Firm Performance.” Journal of Financial Economics 51 (1999): 371–406.

CREMERS, M., AND V. B. NAIR. “Governance Mechanisms and Equity Prices.” Journal of Finance 60 (2005): 2859–94.

DECHOW, P., AND R. SLOAN. “Executive Incentive and the Horizon Problem.” Journal of Accounting & Economics 14 (1991): 51–89.

DEMSETZ, H., AND K. LEHN. “The Structure of Corporate Ownership: Causes and Consequences.” Journal of Political Economy 83 (1985): 1155–77.

DERIGS, U. “Solving Non-bipartite Matching Problems via Shortest Path Techniques.” Annals of Operations Research 13 (1988): 225–61.

DIPRETE, T., AND M. GANGL. “Assessing Bias in the Estimation of Causal Effects: Rosenbaum Bounds on Matching Estimators and Instrumental Variables Estimation with Imperfect Instruments.” Sociological Methodology 34 (2004): 271–310.

EFENDI, J.; A. SRIVASTAVA; AND E. P. SWANSON. “Why Do Corporate Managers Misstate Financial Statements? The Role of Option Compensation and Other Factors.” Journal of Financial Economics 85 (2007): 667–708.

ERICKSON, M.; M. HANLON; AND E. L. MAYDEW. “Is There a Link between Executive Equity Incentives and Accounting Fraud?” Journal of Accounting Research 44 (2006): 113–43.

GOLDFELD, S. M., AND R. E. QUANDT. “The Estimation of Structural Shifts by Switching Regressions.” Annals of Economic and Social Measurement 2 (1973): 475–85.

HARRIS, J., AND P. BROMILEY. “Incentives to Cheat: The Influence of Executive Compensation and Firm Performance on Financial Misrepresentation.” Organizational Science 18 (2007): 350–67.

HECKMAN, J., AND S. NAVARRO-LOZANO. “Using Matching, Instrumental Variables, and Control Functions to Estimate Economic Choice Models.” The Review of Economics and Statistics 86 (2004): 30–57.

HECKMAN, J., AND R. ROBB. “Using Longitudinal Data to Estimate Age, Period, and Cohort Effects in Earnings Equations.” In Cohort Analysis in Social Research Beyond the Identification Problem, edited by Mason and Feinberg. New York: Springer-Verlag, 1985.

HECKMAN, J. J., AND V. J. HOTZ. “Choosing among Alternative Nonexperimental Methods for Estimating the Impact of Social Programs: The Case of Manpower Training.” Journal of the American Statistical Association 84 (1989): 862–74.


HECKMAN, J. J.; H. ICHIMURA; AND P. E. TODD. “Matching as an Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme.” Review of Economic Studies 64 (1997): 605–54.

HIRANO, K., AND G. W. IMBENS. “The Propensity Score with Continuous Treatments.” In Applied Bayesian Modeling and Causal Inference From Incomplete Data Perspectives, edited by A. Gelman and X. L. Meng. England: West Sussex, 2004: 73–84.

HO, D. E.; K. IMAI; G. KING; AND E. A. STUART. “Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference.” Political Analysis 199 (2007): 1–38.

HOLLAND, P. “Statistics and Causal Inference (with discussion).” Journal of the American Statistical Association 81 (1986): 945–70.

HOSMER, D. W., AND S. LEMESHOW. Applied Logistic Regression. New York: John Wiley and Sons, 2000.

JOHNSON, S. A.; H. E. RYAN; AND Y. S. TIAN. “Managerial Incentives and Corporate Fraud: The Sources of Incentives Matters.” Review of Finance 13 (2009): 115–45.

KING, G., AND L. ZENG. “Logistic Regression in Rare Events Data.” Political Analysis 9 (2001): 137–63.

LARCKER, D.; S. RICHARDSON; AND I. TUNA. “Corporate Governance, Accounting Outcomes, and Organizational Performance.” The Accounting Review 82 (2007): 963–1008.

LIDDELL, F. “Simplified Exact Analysis of Case-Referent Studies: Match Pairs; Dichotomous Exposure.” Journal of Epidemiology and Community Health 37 (1983): 82–4.

LU, B.; E. ZANUTTO; R. HORNIK; AND P. R. ROSENBAUM. “Matching with Doses in an Observational Study of a Media Campaign Against Drug Use.” Journal of the American Statistical Association 96 (2001): 1245–53.

MADDALA, G. S. “A Perspective on the Use of Limited-Dependent and Qualitative Variable Models in Accounting Research.” The Accounting Review 66 (1991): 788–807.

MCFADDEN, D. “Statistical Tools.” Working paper, University of California, Berkeley, 2000.

MCNEMAR, Q. “Note on the Sampling Error of the Differences between Correlated Proportions of Percentages.” Psychometrika 12 (1947): 153–57.

O’CONNOR, J. P.; R. L. PRIEM; J. E. COOMBS; AND K. M. GILLEY. “Do CEO Stock Options Prevent or Promote Fraudulent Financial Reporting?” Academy of Management Journal 49 (2006): 483–500.

PALMROSE, Z.; V. J. RICHARDSON; AND S. SCHOLZ. “Determinants of Market Reactions to Restatement Announcements.” Journal of Accounting & Economics 37 (2004): 59–89.

ROSENBAUM, P. R. Observational Studies, 2nd ed. Berlin: Springer Series in Statistics, 2002.

ROSENBAUM, P. R., AND D. B. RUBIN. “The Central Role of the Propensity Score in Observational Studies for Causal Effects.” Biometrika 70 (1983): 41–55.

ROY, A. D. “Some Thoughts on the Distribution of Earnings.” Oxford Economic Papers 3 (1951): 135–46.

RUBIN, D. B. “Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies.” Journal of Educational Psychology 66 (1974): 688–701.

RUBIN, D. B. “Assignment to a Treatment Group on the Basis of a Covariate.” Journal of Educational Statistics 2 (1977): 1–26.

RUBIN, D. B. “Bayesian Inference for Causal Effects: The Role of Randomization.” Annals of Statistics 6 (1978): 34–58.

SEKHON, J. S. “Multivariate and Propensity Score Matching Software with Automated Balance Optimization: The Matching Package for R.” Journal of Statistical Software (2009), forthcoming.


DOI: 10.1111/j.1475-679X.2010.00367.x
Journal of Accounting Research
Vol. 48 No. 2 May 2010
Printed in U.S.A.

Discussion of Chief Executive Officer Equity Incentives and Accounting Irregularities

JOHN E. CORE∗

1. Introduction

In an interesting and important paper, Armstrong, Jagolinzer, and Larcker (AJL) re-examine the question of whether CEO equity incentives cause accounting irregularities. As the authors note, this question is already much studied in the literature with a variety of methods and samples. Broadly speaking, the prior literature hypothesizes that equity incentives cause managers to manipulate accounting information, and generally finds a positive relation or no relation between equity incentives and proxies for manipulation. AJL add to this literature a larger sample that includes smaller firms (a beginning sample of roughly 4,000 firms a year versus roughly 1,500 firms a year used in ExecuComp). In addition, they introduce a novel method (propensity score matching) that is robust to misspecification of functional form (“overt bias”) and that provides an assessment of correlated omitted variables bias (“hidden bias”).

In contrast to prior research that generally examines pre–Sarbanes-Oxley data (2001 and before), AJL’s sample runs from 2001 to 2005.

∗The Wharton School, University of Pennsylvania. This discussion has benefited from, and reflects the comments of, participants at the 2009 Journal of Accounting Research conference. I gratefully acknowledge helpful comments from Wayne Guay and Greg Miller, and the financial support of the Wharton School. Any errors are the sole responsibility of the author.


Copyright ©, University of Chicago on behalf of the Accounting Research Center, 2010


Using this later sample, AJL find no evidence of a positive relation between CEO equity incentives and their proxies for accounting irregularities. As proxies for accounting irregularities, they use restatements, shareholder lawsuits, and U.S. Securities and Exchange Commission Accounting and Auditing Enforcement Releases (AAERs). Using “partial match” logistic regressions similar to those used by some prior researchers, AJL find no relation between incentives and accounting manipulation. Using propensity score matching, the authors generally find no relation between incentives and accounting manipulation, although there is some evidence of a negative relation between incentives and lawsuits. However, the negative relation with lawsuits does not appear to be robust to potential correlated omitted variables bias. The authors note that their results are similar if they examine their later accounting irregularities using the sample of ExecuComp firms generally used by prior research. Finally, the authors also show results using the earlier sample used by Erickson, Hanlon, and Maydew [2006]. In a “partial match” conditional logistic regression, AJL find a positive relation between incentives and fraud that is consistent with Johnson, Ryan, and Tian [2009] and in contrast to Erickson, Hanlon, and Maydew [2006]. However, in the better specified, full-sample propensity-score model, AJL find no relation between incentives and accounting manipulation, consistent with the full-sample logistic regression in Erickson, Hanlon, and Maydew [2006]. In summary, the propensity score method shows no relation between incentives and accounting irregularities in the late sample, and no relation between incentives and accounting irregularities in the early sample. Overall, AJL’s results suggest no relation between incentives and accounting irregularities. These results are important in that they add to a growing body of evidence that shows that, when empirical tests are correctly specified, there is no evidence of a relation between incentives and accounting irregularities.

The remainder of this discussion is as follows. In section 2, I briefly outline the hypotheses and method of the prior literature and those of AJL. In section 3, I discuss AJL’s results. Much of the discussion in the conference was on understanding propensity-score methods and whether and to what extent these methods should be used in accounting research, and I address these issues in section 4. In the final section, I conclude.

2. Summary of Prior Literature—Hypotheses and Method

2.1 HYPOTHESES

As noted in AJL’s introduction, the basic hypothesis in prior literature is that equity incentives motivate executives to manipulate accounting information for personal gain. For this to be the case, executives need to believe that they can increase the stock price by manipulating earnings. In addition, as discussed by participants and tested by, for example, Bergstresser and Philippon [2006] and Erickson, Hanlon, and Maydew [2006], if executives are rational when they manipulate earnings in an attempt to increase the value of their holdings, they will do so when they expect to be able to benefit by selling equity. Thus, one expects vested options and stock holdings to predict rational manipulation; it would be irrational for an executive to manipulate if all his or her options and stock holdings were unvested.

Participants noted that the underlying “more equity is worse” story in this literature is similar to the story in an earlier and ongoing literature that “more equity is better” (e.g., Morck, Shleifer, and Vishny [1988]). In both these literatures, how results are interpreted depends crucially on whether equity incentives are assumed to be exogenous or endogenous (Demsetz and Lehn [1985]). Equity holdings in theory are chosen by boards to maximize firm value and by managers to maximize private value, but for the “more equity is worse” and “more equity is better” stories to go through, these choices need to have a substantial random (exogenous) component. The “more equity is worse” story is causal if there is a substantial random component to the way that boards grant and require equity ownership, and if equity ownership has become too large on average. If boards do not understand equity compensation, but continue to grant it, it is conceivable that equity incentives are random and too large.

On the other hand, there are other explanations that involve maximizing, nonrandom behavior. AJL give the example of risk-aversion: CEOs with low risk aversion may hold more incentives and may be more likely to commit fraud. Another is monitoring difficulty. Firms with greater monitoring difficulty may use more incentives, but their greater monitoring difficulty allows more accounting manipulation. Another possible link is CEO power. Suppose that powerful CEOs overpay themselves using equity, and have to hold this equity to camouflage their excess pay. Then if powerful CEOs are more likely to commit fraud, there may appear to be a link between equity incentives and fraud.

Whether there are perverse effects of equity incentives is a very important question. As discussed earlier and as demonstrated by Hribar and Nichols [2007], tests using equity incentives as an independent variable appear to be readily confounded by endogeneity and correlated omitted variables problems.1 Thus, techniques that readily assess the sensitivity of results to correlated omitted variables seem not only worthwhile but also necessary.

2.2 METHOD

Prior work tests the hypothesis that incentives cause accounting irregularities using versions of the following regression model:

$$\text{Accounting Irregularity} = \beta' X + \gamma\,\text{Equity Incentives} + \varepsilon. \tag{1}$$

In this model, the “treatment” or variable of interest is Equity Incentives, and the dependent variable is some type of accounting irregularity or earnings management, and X are controls for determinants of Accounting Irregularity and/or Equity Incentives.

1 Note that endogeneity is fundamentally a correlated omitted variables problem. If one could observe the part of an endogenous regressor that was correlated with the error term, one could include this omitted variable, and the regression would be correctly specified.

As illustrated in AJL’s table 1, the prior literature uses a variety of proxies for incentives and accounting irregularities. Dependent variables examined by the prior literature include: (1) AAERs (e.g., Erickson, Hanlon, and Maydew [2006]); (2) restatements (e.g., Burns and Kedia [2006]); and (3) accruals (e.g., Bergstresser and Philippon [2006]). Perhaps reflecting the uncertainty discussed earlier about how to measure incentives to manage earnings, the prior literature has used a variety of proxies for equity incentives including: (1) total portfolio equity incentives (e.g., Erickson, Hanlon, and Maydew [2006]); (2) vested incentives (e.g., Erickson, Hanlon, and Maydew [2006] and Burns and Kedia [2006]); and (3) option compensation as a percentage of total compensation (e.g., Burns and Kedia [2006]). In addition, studies differ as to whose incentives are measured: Most studies use (1) CEO incentives, but some studies (e.g., Erickson, Hanlon, and Maydew [2006], Johnson, Ryan, and Tian [2009]) use (2) incentives for the top five executives.

The prior literature takes two approaches to estimating equation (1), both of which are used by Erickson, Hanlon, and Maydew [2006]. The first approach is to estimate equation (1) on the full sample of firms using OLS regression or logistic regression (as is appropriate depending on whether the dependent variable is continuous or discrete). This is the approach used by Erickson, Hanlon, and Maydew [2006] (table 6), Bergstresser and Philippon [2006], and Burns and Kedia [2006]. The other approach is what AJL term “partial match”: for each firm with an accounting irregularity, the researcher finds one or two firms without irregularities that are matched on year, industry, size, etc. This is the approach used by Erickson, Hanlon, and Maydew [2006] (table 5) and Johnson, Ryan, and Tian [2009].
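The “partial match” design described above can be sketched in a few lines. The following is a minimal illustration, not the procedure of any cited study: the record fields (`id`, `year`, `sic2`, `assets`, `irregularity`), the one-to-one matching, and the no-reuse rule for controls are all assumptions made for the example.

```python
def partial_match(firms):
    """Match each irregularity firm to one control firm with the same year
    and two-digit SIC code and the closest total assets.

    `firms` is a list of dicts with hypothetical keys:
    id, year, sic2, assets, irregularity (1 or 0).
    Controls are used at most once; unmatched cases are dropped.
    """
    cases = [f for f in firms if f["irregularity"] == 1]
    pool = [f for f in firms if f["irregularity"] == 0]
    used = set()
    matches = {}
    for case in cases:
        # Exact match on year and industry, restricted to unused controls.
        candidates = [
            c for c in pool
            if c["year"] == case["year"]
            and c["sic2"] == case["sic2"]
            and c["id"] not in used
        ]
        if not candidates:
            continue
        # Closest control on size (total assets).
        best = min(candidates, key=lambda c: abs(c["assets"] - case["assets"]))
        used.add(best["id"])
        matches[case["id"]] = best["id"]
    return matches


firms = [
    {"id": "A", "year": 2003, "sic2": 35, "assets": 500.0, "irregularity": 1},
    {"id": "B", "year": 2003, "sic2": 35, "assets": 480.0, "irregularity": 0},
    {"id": "C", "year": 2003, "sic2": 35, "assets": 900.0, "irregularity": 0},
    {"id": "D", "year": 2003, "sic2": 28, "assets": 510.0, "irregularity": 0},
    {"id": "E", "year": 2004, "sic2": 35, "assets": 505.0, "irregularity": 0},
    {"id": "F", "year": 2004, "sic2": 28, "assets": 1000.0, "irregularity": 1},
]
matches = partial_match(firms)
print(matches)  # {'A': 'B'}; firm F has no eligible control
```

Note that the match is on the outcome (irregularity versus no irregularity) and on only a few characteristics, which is why the resulting samples are small and why the remaining controls still enter through a regression.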

I will discuss propensity-score matching in more detail in the fourth section, but the intuition for the procedure is as follows. If one uses regression to estimate equation (1), one must assume a linear relation between the dependent variable and the controls (i.e., β′X). The propensity score is generated by estimating a logistic or probit regression. The estimated propensity score intuitively provides a single matching variable that optimally weights the control variables X. By matching on the propensity score and taking differences, the researcher avoids the need to specify the functional form of the relation between the dependent variable and the controls.2 Both regression and the propensity score procedure require the exogeneity assumption that the residuals are uncorrelated with incentives. As discussed earlier, this assumption seems tenuous in the case of incentives. The propensity score approach allows a “Rosenbaum [2002] bounds” test of how inference is affected under various correlated omitted variables scenarios. Frank [2000] has developed a similar bounding test for OLS regression, and Larcker and Rusticus [2009] show an example of this bounding test in an accounting setting.

2 The propensity score method in a sense pushes back the linearity assumption, as the propensity score is generated in a logistic regression model in which linearity is assumed. See equation (5) later.

Most work on propensity scores has been developed for the situation in which the incentive variable (or “treatment” variable) in equation (1) is binary, so that firms with incentives are matched to firms without incentives. To use propensity score methods to estimate equation (1) with its continuous measure of incentives, AJL must use one of two advanced propensity score methods (see their footnote 20). One approach (from Hirano and Imbens [2004]) leaves the treatment variable (incentives) continuous. This approach allows nonlinearities in the control variables, but estimates a monotonic treatment effect, similar to a standard regression. The other approach, following Lu et al. [2001], requires ranking the treatment variable and replacing the continuous variable with its ranks. AJL follow the latter approach and rank incentives into five quintiles, so that the regression analogue of their propensity score approach is

$$\text{Accounting Irregularity} = \beta' X + \sum_{i=1}^{5} \gamma_i\,\text{EqIncQuintile} + \varepsilon, \tag{2}$$

where EqIncQuintile is the quintile rank of the CEO’s total portfolio incentives. This ranking procedure introduces the possibility of nonmonotonic effects of the incentive variable at the cost of complexity and of making the incentive variable discrete. Note that the hypotheses discussed earlier do not suggest a nonmonotonic relation between incentives and accounting irregularities. There are two additional points to note about the Lu et al. [2001] approach. First, while standard propensity score methods are much studied, the “propensity score with doses” approach of Lu et al. [2001] was novel to the statistics literature. Second, while both Lu et al. [2001] and AJL rank the treatment variable into quintiles, the choice is ad hoc, and conference participants wondered about the sensitivity of the results to alternative ranking schemes.
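As a concrete illustration of the ranking step, the sketch below converts a continuous incentive measure into quintile ranks within a single cross-section. It is only an assumption about how such a ranking might be coded: the function name and the sample values are hypothetical, ties are broken by sort order, and AJL's own annual ranking may handle ties and group boundaries differently.

```python
def quintile_ranks(values):
    """Return a 1-5 quintile rank for each value (1 = lowest fifth).

    Ranks are assigned by sorted position, so each quintile holds
    roughly one fifth of the observations.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices, ascending
    ranks = [0] * n
    for position, i in enumerate(order):
        ranks[i] = position * 5 // n + 1
    return ranks


# Hypothetical portfolio deltas for ten CEOs in one year.
deltas = [12.0, 950.0, 87.5, 310.0, 45.0, 5.5, 2200.0, 160.0, 600.0, 28.0]
ranks = quintile_ranks(deltas)
print(ranks)  # [1, 5, 3, 4, 2, 1, 5, 3, 4, 2]
```

Replacing the cutoff `5` with another number of groups is a simple way to probe the sensitivity to the ranking scheme that conference participants raised.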

3. Results

In order to focus on results that are comparable to prior literature, I will first discuss the results in AJL’s table 8. Here AJL use the 50 AAERs identified by Erickson, Hanlon, and Maydew [2006]. The specification used by Erickson et al. is:

$$\text{AAER}_{t+1} = \beta' X + \gamma\,\text{Top Five Equity Incentives}_{t} + \varepsilon_{t+1}. \tag{3}$$

When AJL estimate “partial match” conditional logistic regressions in table 8, panel A, they do so with equation (2) above, which uses CEO equity incentives. Table 8, panel A shows estimates of this “partial match” model. Here the sample is 50 AAERs and 100 firms without AAERs matched on year, two-digit SIC code, and total assets. The logistic regression in the left two columns of table 8, panel A shows no monotonic relation, and is similar to the logistic regression results of Erickson, Hanlon, and Maydew [2006].3

However, as AJL and Johnson, Ryan, and Tian [2009] note, partial matching with a binary dependent variable requires estimation by conditional logistic regression, so the logistic regression results are not interpretable. The conditional logistic regression in the right two columns of table 8, panel A shows what appears to be a positive, monotonic relation between incentives and AAERs. This result is similar to the conditional logistic regression results in Johnson, Ryan, and Tian [2009]. For this sample, inference is greatly different when the correct conditional logistic regression is used (incentives are positively related to AAERs) from when it is not (incentives are unrelated to AAERs).
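For readers less familiar with the estimator, the textbook form of the conditional likelihood makes clear why it suits a matched design. The notation below is illustrative and is not taken from AJL: $s$ indexes matched sets, $j = 0$ is the firm with the irregularity, $j = 1, \dots, J_s$ are its matched controls, and $x_{sj}$ collects the regressors. Each matched set contributes

$$L_s(\beta) = \frac{\exp(\beta' x_{s0})}{\sum_{j=0}^{J_s} \exp(\beta' x_{sj})},$$

so any intercept specific to the matched set multiplies the numerator and the denominator alike and cancels. An unconditional logistic regression on the same sample instead imposes one common intercept, even though the ratio of irregularity firms to control firms is fixed by the sampling design.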

In contrast to the conditional logistic regression estimates, the propensity score model shown in table 8, panel B shows no evidence of a monotonic relation between incentives and AAERs. Similar to the conditional logistic regression results, AAERs appear to be higher in the fifth quintile, but in contrast to the conditional logistic regression results, AAERs appear to be lower in the second quintile. The row labeled “pooled” in this panel tests whether AAERs are different for all of the quintile comparisons combined, and is in essence a test for monotonicity: Are more incentives associated with higher or lower accounting irregularities? This test is most directly comparable to the prior literature that tested for a monotonic effect using linear regressions, and provides no evidence of a monotonic effect (p = 0.89).

These propensity score results are generated using the full sample, which is the sample of 50 AAERs pooled with all ExecuComp firms without an AAER (this sample numbered 13,033 firms in Erickson, Hanlon, and Maydew [2006]). Erickson, Hanlon, and Maydew’s [2006] full sample results in their table 6 show no monotonic relation between incentives and AAERs, and are correctly specified if the linear relation shown in equation (3) is correctly specified. Because the propensity score approach used by AJL also uses the full sample, it seems to be most comparable to the full sample logistic regression in Erickson, Hanlon, and Maydew [2006]. The propensity score findings of no monotonic relation essentially confirm similar findings using full sample logistic regression in Erickson et al. In other words, the propensity score procedure relaxes the linearity assumptions, but does not appear to yield different inference. AJL miss an opportunity by not showing full sample logistic regression estimates of equation (2). They show that the propensity score provides different results from the small-sample “partial match” logistic regression, but the reader is left to wonder about the comparison to the full sample logistic regression.

In the main results shown in tables 6 and 7, AJL also examine restatements and lawsuits, and use a later sample. AJL provide specifications in which the first accounting manipulation year is measured the year after incentives (“year-ahead”) and in which the first accounting manipulation year is measured in the same year as incentives (“contemporaneous”). I focus on the year-ahead results because this specification seems to have a more plausible causal link and because it is consistent with the specification used by Erickson, Hanlon, and Maydew [2006] (shown in equation (3) earlier). Consider first the partial match results for year-ahead accounting irregularities in table 7. Very few of the comparisons are significant, and although there is no test of this, none of the estimates suggest a monotonic relation between incentives and accounting irregularities. These partial match tests are conducted by matching each firm with an accounting irregularity to a single firm without an accounting irregularity based on year, two-digit SIC code, and total assets, so the sample sizes are fairly small (176 to 668 observations).

3 Johnson et al. note that they find similar results to Erickson et al. when they estimate an unconditional logistic regression.

In contrast, the propensity score method shown in table 6 uses much larger samples, probably about 7,300 observations for year-ahead accounting irregularities.4 The results for restatements (table 6, panel A) and for AAERs (table 6, panel C) both show that quintile 5 is significantly less than quintile 4. However, for both measures, there are no other significant comparisons, and there is no evidence of a monotonic relation between incentives and restatements or between incentives and AAERs. Panel B shows the results for lawsuits. The propensity score model shows that quintile 5 is significantly less than quintile 4, and that quintile 3 is significantly less than quintile 2. There is evidence consistent with the hypothesis that there is a negative monotonic relation between incentives and lawsuits, but this relation is very sensitive to correlated omitted variables. The Rosenbaum [2002] bounds test here shows a Γ = 1.07, which means only a small change in the odds ratio of 50/50 (Γ = 1.00) under the null would render the results insignificant. The propensity score method shows no monotonic relation between incentives and accounting irregularities in the late sample, and no monotonic relation between incentives and accounting irregularities in the early sample. Overall, these results suggest no relation between incentives and accounting irregularities.
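To put a sensitivity parameter of 1.07 in perspective, it helps to translate it into probabilities. Under the standard binary-treatment form of Rosenbaum's sensitivity model, two matched units may differ in their odds of treatment by at most a factor Γ, so the probability that a given member of a matched pair is the treated one lies between 1/(1 + Γ) and Γ/(1 + Γ). The short sketch below computes these bounds; the function name is hypothetical, and applying the binary-treatment formula to AJL's quintile comparisons is a simplifying assumption.

```python
def rosenbaum_probability_bounds(gamma):
    """Bounds on the within-pair probability of being the treated unit
    when hidden bias can shift the odds of treatment by at most `gamma`."""
    if gamma < 1.0:
        raise ValueError("gamma must be at least 1")
    lower = 1.0 / (1.0 + gamma)
    upper = gamma / (1.0 + gamma)
    return lower, upper


lower, upper = rosenbaum_probability_bounds(1.07)
print(round(lower, 3), round(upper, 3))  # 0.483 0.517
```

In other words, an unobserved variable that moves the within-pair assignment probability from 0.500 to roughly 0.517 would be enough to overturn the lawsuit result.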

4. Understanding Armstrong, Jagolinzer, and Larcker’s Propensity-Score Procedure

The major innovation of AJL is to estimate an analogue of equation (2) using matching by a novel propensity score technique. As noted earlier, this technique was first used in the statistics literature by Lu et al. [2001], and is a generalization of a simpler, typical, “standard” propensity score technique. In the standard technique, the treatment variable of interest is binary, and therefore only one matched comparison is estimated. In AJL’s generalization, the variable of interest can take on five discrete values. Consequently, 10 matched comparisons are estimated.

4 There are 9,118 observations for the contemporaneous model, and I assume that 80% of these are in the year-ahead model.
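The count of comparisons follows from simple combinatorics: with k treatment levels there are k(k − 1)/2 unordered pairs of levels. A brief sketch (the helper name is an assumption for illustration):

```python
from itertools import combinations


def pairwise_comparisons(levels):
    """List every unordered pair of distinct treatment levels."""
    return list(combinations(levels, 2))


binary = pairwise_comparisons([0, 1])
quintile = pairwise_comparisons([1, 2, 3, 4, 5])
print(len(binary), len(quintile))  # 1 10
```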

To understand AJL’s more sophisticated approach, it is useful to first understand the standard approach, which is what the AJL appendices illustrate.

4.1 “STANDARD” PROPENSITY SCORE APPROACH

As discussed in AJL’s appendix B, the standard propensity score technique is a method for estimating the “treatment” effect of a binary indicator variable D on a dependent variable Y, controlling for the effects of control variables X. The standard propensity score approach is used as an alternative to a linear regression approach such as

$$Y = \alpha + \beta' X + \gamma D + \varepsilon. \tag{4}$$

In accounting, the standard propensity score technique has been used to address questions such as whether a firm with a Big 4 auditor (D = 1) pays higher audit fees (e.g., Clatworthy, Makepeace, and Peel [2009]), or whether the use of a compensation consultant increases CEO compensation (e.g., Armstrong, Ittner, and Larcker [2009]).

Unlike equation (4), which assumes linearity, the propensity score method, by matching observations closely, makes any nonlinearities in the controls irrelevant. The method begins by assuming that the choice to receive treatment can be modeled with the following linear model:5

$$D^{*} = \delta' X + u \quad (D = 1 \text{ if } D^{*} \ge 0;\ \text{otherwise } D = 0). \tag{5}$$

One estimates equation (5) by logistic regression or by probit regression, and computes the propensity score as the predicted probability: E[probability | X] = p(δ̂′X). The propensity score is used to match a treated observation with an untreated observation. The propensity score provides a single matching variable that optimally weights the control variables X (subject to the assumed linear form in equation (5)). Consequently, as AJL (p. 263) note: “since each matched pair is similar in every observable respect except that one observation received the treatment while the other did not, any difference in the outcome can . . . be attributed to the difference in treatment.” The treatment effect is estimated as the mean or median difference between the matched samples of treated and untreated observations.
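The steps just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not AJL's implementation: the data are simulated, there is a single control variable, the logit is fit by plain gradient ascent rather than a statistics package, and matching is nearest-neighbor with replacement. The function names (`fit_logit`, `matched_treatment_effect`, `simulate`) and all parameter values are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logit(x, d, steps=500, lr=0.5):
    """Estimate equation (5) as a logit with one control: Pr(D=1|x) = sigmoid(a + b*x).

    Plain gradient ascent on the mean log-likelihood (an assumption made to
    keep the sketch dependency-free).
    """
    a = b = 0.0
    n = len(x)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for xi, di in zip(x, d):
            resid = di - sigmoid(a + b * xi)
            grad_a += resid
            grad_b += resid * xi
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


def matched_treatment_effect(x, d, y):
    """Mean outcome difference across propensity-score matched pairs."""
    a, b = fit_logit(x, d)
    score = [sigmoid(a + b * xi) for xi in x]
    treated = [i for i, di in enumerate(d) if di == 1]
    controls = [i for i, di in enumerate(d) if di == 0]
    diffs = []
    for i in treated:
        # Nearest untreated observation on the score (matching with replacement).
        j = min(controls, key=lambda k: abs(score[k] - score[i]))
        diffs.append(y[i] - y[j])
    return sum(diffs) / len(diffs)


def simulate(n=1000, seed=0):
    """Hypothetical data: true effect 2.0, nonlinear control, selection on x."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    d = [1 if rng.random() < sigmoid(0.8 * xi) else 0 for xi in x]
    y = [2.0 * di + xi + xi * xi + rng.gauss(0.0, 0.5) for xi, di in zip(x, d)]
    return x, d, y


if __name__ == "__main__":
    x, d, y = simulate()
    print(round(matched_treatment_effect(x, d, y), 2))
```

Because the simulated outcome depends on the control through both x and x², a regression of the form in equation (4) would be misspecified here, whereas the matched difference does not rely on the functional form of the controls.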

4.2 RELATION BETWEEN REGRESSION, HECKMAN REGRESSION, AND PROPENSITY SCORE

As also discussed in AJL’s appendix B, there are a number of other approaches to estimating binary treatment effects, including regression approaches with which readers are likely to be more familiar. In this section, I compare an OLS regression approach, the Heckman regression approach, and the standard propensity score approach.

5 In general, the determinants of treatment in equation (5) can be different from the controls X in equation (4). For ease of exposition, I assume that they are the same.

As discussed in AJL’s appendices, the linear model shown in equation (4) is restrictive. A more general model shown in AJL’s equations (A1a) and (A1b) assumes that treated (Y1) and untreated (Y0) observations can be expressed as

Y1 = μ1(X ) + ε1 (6)

Y0 = μ0(X ) + ε0. (7)

Here, for example, audit fees have a potentially different and potentially nonlinear relation with the control variables X for a firm with a Big 4 auditor (D = 1) than for a firm without a Big 4 auditor (D = 0). Note that, conditional on X and the treatment, the expected values are

E [Y1 | X, D = 1] = E [μ1(X ) | X ] + E [ε1 | X, D = 1] (8)

E [Y0 | X, D = 0] = E [μ0(X ) | X ] + E [ε0 | X, D = 0]. (9)

As shown in AJL’s equation (A3), the average effect of the treatment is consistently estimated as

E [Y1 − Y0 | X, D] = E [μ1(X ) | X ] − E [μ0(X ) | X ]. (10)

To estimate the average treatment effect using equations (8)–(10), one must answer two questions: (1) what are the conditional expectations of the residuals above? and (2) can linear regressions provide a good estimate of the μ(·) functions above? Which method is used depends on the answers to these questions.

If the residuals are uncorrelated with the choice to receive treatment, then

E [ε1 | X, D = 1] − E [ε0 | X, D = 0] = 0. (11)

As discussed in AJL’s footnote 37, the matching literature refers to this exogeneity assumption as “selection on observables” (Heckman and Robb [1985]).
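The role of equation (11) can be made explicit by subtracting equation (9) from equation (8). The following worked step is a restatement in LaTeX notation, using only the symbols already defined in equations (6)–(11) and the fact that E[μ(X) | X] = μ(X):

```latex
\begin{align*}
E[Y_1 \mid X, D=1] - E[Y_0 \mid X, D=0]
  &= \mu_1(X) - \mu_0(X)
     + \bigl( E[\varepsilon_1 \mid X, D=1] - E[\varepsilon_0 \mid X, D=0] \bigr) \\
  &= \mu_1(X) - \mu_0(X) \qquad \text{if equation (11) holds.}
\end{align*}
```

When equation (11) fails, the term in parentheses does not vanish, and the observed difference between treated and untreated outcomes mixes the treatment effect with selection.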

If one can assume exogeneity as shown in equation (11), and if a linear regression on X (and possibly functions of X) provides a good estimate of the μ(·) functions above, then, as illustrated in AJL’s figure B1, panel B, one may use linear regressions to estimate equations (8) and (9), and use the fitted parameters to obtain an unbiased estimate of the treatment effect in equation (10).
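As a minimal sketch of this regression route, assuming a single control variable and exactly linear μ(·) functions, one can fit separate lines to the treated and untreated observations and average the difference in fitted values over the sample. The function names and the toy data are hypothetical and are not taken from AJL.

```python
def ols_line(x, y):
    """Intercept and slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def regression_treatment_effect(x, d, y):
    """Average of fitted mu1(x) - mu0(x) over all observations, as in equation (10)."""
    x1 = [xi for xi, di in zip(x, d) if di == 1]
    y1 = [yi for yi, di in zip(y, d) if di == 1]
    x0 = [xi for xi, di in zip(x, d) if di == 0]
    y0 = [yi for yi, di in zip(y, d) if di == 0]
    a1, b1 = ols_line(x1, y1)  # fitted version of equation (8)
    a0, b0 = ols_line(x0, y0)  # fitted version of equation (9)
    return sum((a1 + b1 * xi) - (a0 + b0 * xi) for xi in x) / len(x)


# Toy data (hypothetical): mu1(x) = 3 + 1.5x and mu0(x) = 1 + 0.5x, no noise.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
d = [0, 1, 0, 1, 0, 1, 0, 1]
y = [3.0 + 1.5 * xi if di == 1 else 1.0 + 0.5 * xi for xi, di in zip(x, d)]
effect = regression_treatment_effect(x, d, y)
```

In this toy case the effect at each x is 2 + x, so the sample average is 2 + 3.5 = 5.5. The estimate is only as good as the linear fit to each μ(·), which is the linearity question raised above.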

However, if one is not willing to assume linearity, but can assume exogeneity as shown in equation (11), one can use the propensity score method to obtain an unbiased estimate of the treatment effect in equation (10).


4.3 ENDOGENEITY

On the other hand, when the treatment is a choice like large incentives, it may not be reasonable to assume that the residuals are uncorrelated with the choice to receive treatment. Instead of assuming (11), the Heckman method offers a way to test the validity of this exogeneity assumption. To use Heckman, one must assume that the error terms in equations (5)–(7) are jointly normal. Given the normality assumption, the conditional expectations of the residuals in equations (8) and (9) can be expressed as E[ε1 | X, D = 1] = σ1λ(δ̂′X) and E[ε0 | X, D = 0] = −σ0λ(−δ̂′X), where δ̂ is obtained by estimating equation (5) with a probit model, λ(·) = ϕ(·)/Φ(·), and ϕ(·) and Φ(·) are the density and cumulative distribution function, respectively, of the standard normal. Given the expected value of the residuals, and assuming linearity, one may use the following regressions to estimate equations (8) and (9):

E[Y1 | X, D = 1] = α1 + β1′X + σ1λ(δ̂′X) (12)

E[Y0 | X, D = 0] = α0 + β0′X − σ0λ(−δ̂′X). (13)

Then one uses the fitted parameters to obtain an unbiased estimate of the treatment effect in equation (10). If σ1 or σ0 is significantly different from zero in the equations above, there would be an endogeneity problem in a linear model that did not include the λ(·)’s. The Heckman approach thus provides a way to test for, and to correct for, endogeneity by providing estimates of the conditional expectations of the residuals. However, in practice, there are concerns about the sensitivity of inference from Heckman models, so care must be used in identifying and estimating them (e.g., Chaney, Jeter, and Shivakumar [2008], Clatworthy, Makepeace, and Peel [2009], Francis and Lennox [2008]), just as care must be used in identifying and estimating instrumental variables models more generally (e.g., Larcker and Rusticus [2009]).
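The selection terms that enter equations (12) and (13) are simple functions of the fitted probit index δ̂′X. The sketch below computes them with the Python standard library; the function names are hypothetical, and the index value passed in is assumed to come from a separately estimated probit model.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills(z):
    """lambda(z) = pdf(z) / cdf(z); not numerically safe for very negative z."""
    return norm_pdf(z) / norm_cdf(z)


def selection_terms(index):
    """Regressors multiplying sigma1 in (12) and sigma0 in (13) for a given probit index."""
    treated_term = inverse_mills(index)      # lambda(index), used when D = 1
    untreated_term = -inverse_mills(-index)  # -lambda(-index), used when D = 0
    return treated_term, untreated_term
```

At an index of zero (a fitted treatment probability of one half) the two terms are approximately 0.798 and −0.798, since λ(0) = √(2/π).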

The propensity score approach cannot directly assess whether the exogeneity assumption in equation (11) is correct. So instead of testing for endogeneity, one assesses whether the results are sensitive to endogeneity using the Rosenbaum [2002] bounds approach that AJL describe on p. 252. The test works as follows. The propensity score procedure pairs observations with similar propensity scores, and then computes the significance of observed differences, assuming that the propensity scores, and the chance that each observation in a pair received treatment, are equal. But if there is a correlated omitted variable, the true propensity scores are not equal, and the chance that each observation in a pair received treatment is not equal. The bounding procedure recomputes the significance levels of the given pairs, assuming different likelihoods of receiving treatment.6
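For a sign-type test on matched pairs, the bounding logic can be sketched as follows. With no hidden bias (Γ = 1), the number of pairs that go in a given direction is binomial with probability 1/2; with hidden bias of at most Γ, that probability is at most Γ/(1 + Γ), which gives an upper bound on the one-sided p-value. The function names and the numbers in the example (100 discordant pairs, 60 in the hypothesized direction) are hypothetical and are not AJL's data.

```python
from math import comb


def binom_upper_tail(n, t, p):
    """Pr(T >= t) for T ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(t, n + 1))


def pvalue_upper_bound(n_discordant, t, gamma):
    """Largest one-sided sign-test p-value consistent with hidden bias gamma >= 1.

    gamma = 1 reproduces the usual 50/50 null; under bias gamma the per-pair
    probability is bounded above by gamma / (1 + gamma).
    """
    p_plus = gamma / (1.0 + gamma)
    return binom_upper_tail(n_discordant, t, p_plus)


def critical_gamma(n_discordant, t, alpha=0.05, step=0.01, gamma_max=5.0):
    """Smallest gamma on a grid at which the p-value bound exceeds alpha."""
    n_steps = int(round((gamma_max - 1.0) / step))
    for i in range(n_steps + 1):
        gamma = 1.0 + i * step
        if pvalue_upper_bound(n_discordant, t, gamma) > alpha:
            return round(gamma, 2)
    return gamma_max


# Hypothetical example: 60 of 100 discordant pairs go in the hypothesized direction.
p_no_bias = pvalue_upper_bound(100, 60, 1.0)
p_small_bias = pvalue_upper_bound(100, 60, 1.1)
```

In this hypothetical example the result is significant at the 5% level when Γ = 1 but not when Γ = 1.1, so a small departure from the 50/50 null is enough to overturn the inference. This is the sense in which a critical Γ close to 1 indicates sensitivity to a correlated omitted variable.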

4.4 STANDARD ERRORS

Note that the λ(·)’s above in the Heckman approach are generated regressors that result from estimating equation (5). Accordingly, if one estimated equations (12) and (13) with OLS, the p-values would be biased (likely too small) because they do not account for the extra variance generated by using equation (5) to estimate the λ(·)’s. One of the important contributions of the Heckman approach is that it gives valid standard errors.

Like the Heckman procedure, propensity score approaches are multi-step procedures that begin by estimating the probability model in equation (5), and then use the generated propensity scores to estimate the treatment effect. As such, the standard error of the estimated treatment effect must account for not only the final stage treatment effect estimation, but also for the first stage probability model estimation (Li and Prabhala [2006], p. 55; Wooldridge [2002], pp. 619–21). One disadvantage of the propensity score procedure is that it is not clear how to compute correct standard errors, and standard errors do not appear to have been developed for some procedures (especially not standard errors that are robust to time-series and cross-sectional correlation). It appears that, similar to Lu et al. [2001], AJL’s significance tests for the propensity score matching are only based on uncertainty in the final stage (i.e., significance levels for differences in outcomes using McNemar’s [1947] test), and do not reflect the uncertainty in the first-stage estimation of the logistic regression model. Accordingly, one must use caution in interpreting the p-values shown in tables 6 and 8, panel B, as the significance levels are likely to be overstated.
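For reference, the textbook form of McNemar's statistic for matched pairs with a binary outcome is given below. The symbols b and c are introduced here for illustration only: b is the number of pairs in which only the first member has the outcome, and c is the number in which only the second member does.

```latex
\[
\chi^2 = \frac{(b - c)^2}{b + c},
\qquad \chi^2 \sim \chi^2_{1} \ \text{approximately, under the null of no difference.}
\]
```

The statistic depends only on the discordant pairs and takes the matched pairs as given, which is consistent with the point above that it reflects final-stage uncertainty only.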

4.5 WHICH METHOD(S) SHOULD BE USED?

In summary, two key determinants of model choice are: (1) is the variable of interest likely to be exogenous? and (2) can linear models provide a good specification? Both OLS regression and the propensity-score method assume exogeneity, and both offer ways of testing the sensitivity of inference to the assumption of exogeneity. OLS methods offer the advantages of familiarity and of well-developed standard errors for a variety of forms of cross-sectional and time-series correlation. Propensity-score methods offer the advantage of not requiring linearity assumptions, but a disadvantage is that the methods are newer and less widely used. In addition, they are two-stage methods, and standard errors do not appear well developed. Beyond these are instrumental variables methods, which have the advantage of addressing endogeneity directly, but come at the cost of requiring good instruments. Which method is best used seems to be an open issue that depends on the setting. It seems useful to offer a parallel second method as a sensitivity analysis to the primary method chosen (as AJL do by showing partial match results). Clatworthy, Makepeace, and Peel [2009] provide an excellent illustration of the standard propensity score, and also compare estimates obtained from ordinary regression, Heckman regressions, and propensity scores. Unfortunately, they do not assess the sensitivity of the propensity score results with the Rosenbaum bounds test discussed earlier, so it is not possible to determine whether the difference between the Heckman results and propensity score results is due to endogeneity.

6 For example, a sign test for a median assumes under the null that there is an equal probability that one member of a pair is greater than the other. The bounds procedure tests the sensitivity of inference to changes in this probability under the null.

For a more general comparison of propensity score methods and instrumental variables methods, see Heckman and Navarro-Lozano [2004] and Wooldridge [2002]. Finally, Angrist and Pischke [2009, pp. 86–7] offer the following comparison of regression and propensity score matching:

The propensity score estimates reported by Dehejia and Wahba are remarkably close to the randomized trial that constitutes their benchmark. Nevertheless, we believe regression should be the starting point for most empirical projects. This is not a theorem; undoubtedly, there are circumstances in which propensity score matching provides more reliable estimates of average causal effects. The first reason we don’t find ourselves on the propensity score bandwagon is practical: there are many details to be filled in when implementing propensity score matching, such as how to model the score and how to do inference; these details are not yet standardized. Different researchers might therefore reach different conclusions, even when using the same data and covariates. Moreover, . . . there isn’t much theoretical daylight between regression and propensity score matching. If the regression model for covariates is fairly flexible, say, close to saturated, regression can be seen as a type of propensity score weighting, so the difference is mainly in the implementation.

5. Conclusion

AJL contribute new findings to the growing body of research on incentives and accounting manipulation, and they contribute a new technique to the accounting literature as a whole. Their propensity score approach offers an alternative to multiple regression that does not assume linearity and that allows an explicit assessment of the sensitivity of the results to endogeneity bias (correlated omitted variables bias). In a “partial match” conditional logistic regression of AAERs from 1992 to 2001, AJL find a positive relation between incentives and fraud that is consistent with Johnson, Ryan, and Tian [2009]. However, in the better-specified propensity-score specification, AJL find no relation between incentives and fraud. Using a later (2001 to 2005) and larger sample that includes smaller firms, they find no evidence to support the hypothesis that there is a causal positive relation between CEO equity incentives and accounting manipulation. Although they find some evidence of a negative relation, their bounding test suggests that this relation is not causal and is likely due to a correlated omitted variables bias.

AJL focus their tables and discussion on showing the superiority of the propensity score method to the partial match method. It is intuitive that the propensity score may be more powerful. Partial match designs proceed by matching first on one variable, then on another, etc., for example, first on industry, then on size, and then on profitability, etc. As the number of desired matching variables increases, it becomes increasingly difficult to find pairs of observations, and, as a result, a small number of observations is used (150 in AJL’s replication of Erickson, Hanlon, and Maydew [2006]). In contrast, the propensity score collapses all the relevant variables to a single dimension, so many more observations can be used (about 7,300 in AJL’s replication of Erickson, Hanlon, and Maydew [2006]), and the pairs that are used are better matched.

However, it is important to note that the propensity score method applies to many more situations than simply to partial match regressions. If one is willing to rank the variable of interest, one can use the propensity score technique of AJL as an alternative to any regression approach. Had the authors fit full sample logistic regressions analogous to Burns and Kedia [2006] and Erickson, Hanlon, and Maydew [2006], they would have increased their contribution by demonstrating the performance of the propensity score method in comparison to these regression methods. Hribar and Nichols [2007] show that Bergstresser and Philippon’s [2006] result that incentives are associated with earnings management is driven by a correlated omitted variable problem; they do this by including the omitted variable (operating volatility). Would the bounding procedure used by AJL suggest an omitted variable problem in the Bergstresser and Philippon [2006] specification without “knowing” what the omitted variable was? If it did, this would be a useful demonstration of the bounding procedure’s value.

The authors conclude the paper by saying that “future research should also consider bounding methods to explicitly quantify the sensitivity of the results for the primary causal variable to unobserved correlated omitted variables” (p. 261). Their result on lawsuits discussed above shows why this is important. In the absence of this sensitivity test, it is tempting to take the lawsuit result, which is consistent with the hypothesis that more incentives lead to better outcomes for shareholders, and to translate it into a “more incentives are better” policy prescription similar to Jensen and Murphy [1990]. But the bounding procedure shows that this relation is unlikely to be causal. Or suppose the result went the other way and suggested that more incentives cause fraud, as it did in much of the early literature on incentives and accounting manipulation. Then it is tempting to translate the result into a “more incentives are worse” policy prescription similar to those now being used (in the apparent lack of any scientific evidence) to justify pay and incentive regulation in the banking industry. What AJL are asking for, and what we as scientists should wholeheartedly support, is not only a statement by researchers that a variable of interest is significantly associated with the outcome, but also, and more importantly, a statement about whether the observed association is likely to be causal.

As discussed earlier in the context of the Heckman procedure, and as is illustrated in Larcker and Rusticus [2009] in the context of OLS regressions, there are other methods for assessing the sensitivity of inference. So researchers do not have to use propensity-score methods to use this valuable idea. While there was much interest at the conference in this new propensity score method, there was also skepticism, not only by conference participants but also by other researchers (e.g., Angrist and Pischke [2009], Heckman and Navarro-Lozano [2004], Wooldridge [2002]). Which bounding method to use, and how to use the propensity score method if it is used, are useful areas for future accounting research to examine.

REFERENCES

ANGRIST, J. D., AND J. PISCHKE. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton: Princeton University Press, 2009.

ARMSTRONG, C.; C. D. ITTNER; AND D. F. LARCKER. “Economic Characteristics, Corporate Governance, and the Influence of Compensation Consultants on Executive Pay Levels.” Working paper, The Wharton School, University of Pennsylvania, 2009.

BERGSTRESSER, D., AND T. PHILIPPON. “CEO Incentives and Earnings Management.” Journal of Financial Economics 80 (2006): 511–29.

BURNS, N., AND S. KEDIA. “The Impact of Performance-Based Compensation on Misreporting.” Journal of Financial Economics 79 (2006): 35–67.

CHANEY, P.; D. JETER; AND L. SHIVAKUMAR. “Self Selection Models and Endogeneity Issues in Accounting Research: The Case of Audit Pricing.” Working paper, Vanderbilt University, 2008.

CLATWORTHY, M.; G. MAKEPEACE; AND M. PEEL. “Selection Bias and the Big Four Premium: New Evidence using Heckman and Matching Models.” Accounting and Business Research 39 (2009): 139–66.

DEMSETZ, H., AND K. LEHN. “The Structure of Corporate Ownership: Causes and Consequences.” Journal of Political Economy 93 (1985): 1155–77.

ERICKSON, M.; M. HANLON; AND E. L. MAYDEW. “Is There a Link between Executive Equity Incentives and Accounting Fraud?” Journal of Accounting Research 44 (2006): 113–43.

FRANCIS, J. R., AND C. S. LENNOX. “Selection Models in Accounting Research.” Working paper, University of Missouri, Columbia, 2008.

FRANK, K. A. “Impact of a Confounding Variable on a Regression Coefficient.” Sociological Methods and Research 29 (2000): 147–94.

HECKMAN, J., AND S. NAVARRO-LOZANO. “Using Matching, Instrumental Variables and Control Functions to Estimate Economic Choice Models.” The Review of Economics and Statistics 86 (2004): 30–57.

HECKMAN, J., AND R. ROBB. “Using Longitudinal Data to Estimate Age, Period, and Cohort Effects in Earnings Equations,” in Cohort Analysis in Social Research Beyond the Identification Problem, edited by W.M. Mason and S.E. Feinberg. New York: Springer-Verlag, 1985.

HIRANO, K., AND G. W. IMBENS. “The Propensity Score with Continuous Treatments,” in Applied Bayesian Modeling and Causal Inference from Incomplete Data Perspectives, edited by A. Gelman and X. L. Meng. West Sussex, England: John Wiley & Sons, 2004: 73–84.

HRIBAR, P., AND C. D. NICHOLS. “The Use of Unsigned Earnings Quality Measures in Tests of Earnings Management.” Journal of Accounting Research 45 (2007): 1017–53.

JENSEN, M. C., AND K. J. MURPHY. “Performance Pay and Top-Management Incentives.” Journal of Political Economy 98 (1990): 225–64.

JOHNSON, S. A.; H. E. RYAN; AND Y. S. TIAN. “Managerial Incentives and Corporate Fraud: The Sources of Incentives Matters.” Review of Finance 13 (2009): 115–45.

LARCKER, D. F., AND T. O. RUSTICUS. “On the Use of Instrumental Variables in Accounting Research.” Working paper, Stanford University, 2009.

LI, K., AND N. R. PRABHALA. “Self-Selection Models in Corporate Finance,” in Handbook of Corporate Finance: Empirical Corporate Finance, edited by B. Espen Eckbo. North-Holland: Elsevier, 2006.

LU, B.; E. ZANUTTO; R. HORNIK; AND P. R. ROSENBAUM. “Matching with Doses in an Observational Study of a Media Campaign against Drug Use.” Journal of the American Statistical Association 96 (2001): 1245–53.

MCNEMAR, Q. “Note on the Sampling Error of the Differences between Correlated Proportions of Percentages.” Psychometrika 12 (1947): 153–57.

MORCK, R.; A. SHLEIFER; AND R. W. VISHNY. “Management Ownership and Market Valuation: An Empirical Analysis.” Journal of Financial Economics 20 (1988): 293–315.

ROSENBAUM, P. R. Observational Studies, Second edition. Berlin: Springer Series in Statistics, 2002.

WOOLDRIDGE, J. M. Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press, 2002.