REGRESSION DIAGNOSTIC USING LOCAL INFLUENCE: A REVIEW

This article was downloaded by: [The Aga Khan University] On: 07 November 2014, At: 23:28 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Communications in Statistics - Theory and Methods Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lsta20 REGRESSION DIAGNOSTIC USING LOCAL INFLUENCE: A REVIEW M. Mercedes Suárez Rancel a & Miguel A. González Sierra a a Department of Statistics, Operation Research and Computation, Mathematics Faculty , University of La Laguna , La Laguna-Tenerife-Canary Islands, 38271, Spain Published online: 15 Feb 2007. To cite this article: M. Mercedes Suárez Rancel & Miguel A. González Sierra (2001) REGRESSION DIAGNOSTIC USING LOCAL INFLUENCE: A REVIEW, Communications in Statistics - Theory and Methods, 30:5, 799-813, DOI: 10.1081/STA-100002258 To link to this article: http://dx.doi.org/10.1081/STA-100002258 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. 
This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http:// www.tandfonline.com/page/terms-and-conditions


COMMUN. STATIST.—THEORY METH., 30(5), 799–813 (2001)

REGRESSION DIAGNOSTIC USING LOCAL INFLUENCE: A REVIEW

M. Mercedes Suarez Rancel∗ and Miguel A. Gonzalez Sierra#

Department of Statistics, Operation Research and Computation, Mathematics Faculty, University of La Laguna, 38271, La Laguna-Tenerife-Canary Islands, Spain

ABSTRACT

The local influence approach of Cook (1) to regression diagnostics is developed and discussed, and compared with Cook’s (2) deletion approach. The ability of the local influence approach to handle cases simultaneously, as well as some of its theoretical and practical difficulties, are reviewed. The perturbation ideas of the approach are applied to the linear model, distinguishing between local perturbations of the model assumptions and of the data.

Key Words: Curvature; Diagnostics; Local influence; Regression.

1. INTRODUCTION

Statistical models are simplifications of reality; we rarely expect the model to be exactly true. Nevertheless, when we select a statistical technique and perform statistical inference, we often act as if the model is true. This is often justified by claiming that “small” deviations from the theoretical properties of the selected inferential techniques cause only minor changes in the results produced by the inference. Unfortunately, this argument need not be true. In many applications apparently small changes in a model, a model assumption, or a data point can have very large effects on the results.

∗E-mail: [email protected] #E-mail: [email protected]

Copyright © 2001 by Marcel Dekker, Inc. www.dekker.com

A practical and well established approach to influence analysis in statistical modeling is based on case deletion; Cook (2) pioneered the idea. The effect or influence of the ith case of the data is measured by a comparison of parameter estimates before and after deletion of the ith case. The idea of differentiation instead of deletion is prominent in the local influence approach of Cook (1). Local influence is based on the perturbation of a case and not on its total deletion, and employs a differential comparison of parameter estimates before and after perturbation. Pregibon (3) and Cook and Weisberg (4) present this perturbation scheme as a reasonable compromise between conservatism (the case-weight is 1) and liberalism (the case-weight is 0). There are several parallels between the approaches, which will be stressed.

Cook (1) applied the method to linear regression analysis, but indicated that the method is general and can be applied to a wide variety of problems, later giving a unifying approach to local influence (see Cook (5)). Applications of local influence analysis to specific problems have been described in several publications. Beckman, Nachtsheim and Cook (6) develop and describe applications to mixed model analysis of variance. Lee and Zhao (7), Thomas and Cook (8, 9), O’Hara Hines et al. (10) and Paula (11, 12) apply local influence methods to the generalized linear model, while Pettitt and Bin Daud (13) and Weissfeld (14) do the same for the Cox proportional hazards model. Billor (15) uses the maximum pseudo likelihood ridge estimator to assess the effect of changing case-weights in ridge regression. Laurent and Cook (16) propose a local diagnostic in nonlinear regression. The idea of local influence is adapted for use in the binary regression model by Tan Qu and Kutner (17) and Farrell and Cadigan (18).

In this paper, we review the application and interpretation of the local influence analysis method. In Section 2 the local influence concept of Cook (1) is introduced. In Sections 3 and 4 local influence is applied to the linear regression model, distinguishing between local perturbations of the model assumptions and of the data. Section 5 reviews theoretical and practical difficulties which arise in Cook’s approach. Finally, in Section 6 several alternative developments of local influence are considered.

2. LOCAL INFLUENCE: COOK’S VERSION

We suppose a statistical model with parameters $\beta$ of interest and a set of $n$ response data $y$ to which the model will be fitted; covariate data and other parameters may also be present. The log likelihood or log profile likelihood, as the case may be, of the parameters $\beta$ will be denoted by $L(\beta)$. Suppose the perturbations are $w$; the $w$ are mathematical quantities introduced into the model. Parameter estimates, usually maximum likelihood, are then obtained as functions of the perturbation, say as $\hat\beta(w)$; the null perturbation is $w_0$, with $\hat\beta(w_0) = \hat\beta$ being the parameter estimate for the unperturbed model. Influence assessment involves the comparison of $\hat\beta(w)$ and $\hat\beta$. One general way of comparing them is by their likelihood distance apart, defined as

$$D(w) = 2\left[L(\hat\beta) - L\big(\hat\beta(w)\big)\right].$$

In Cook’s (1) approach to local influence an important idea is given for the simultaneous handling of perturbations to all cases. Cook suggests regarding the likelihood distance $D(w)$, or some other suitable measure of the distance between $\hat\beta(w)$ and $\hat\beta = \hat\beta(w_0)$, as a surface in the $n$-dimensional Euclidean space of $w$. He then proposes a line across this surface with direction cosines $\ell$ which passes through the point $w_0$ of the null perturbation; this line is thus of the form $w = w_0 + a\ell$, with $a$ measuring distance along the line and $|a|$ being the Euclidean magnitude of the perturbation. The direction cosines $\ell$ are thus associated with the cases of the data, and the $n$-dimensional surface $D(w)$ is replaced by the much more tractable one-dimensional influence curve $\{a, D(w_0 + a\ell)\}$. The next key idea is to choose the direction $\ell$. This is ideally done in relation to a perturbation of chosen magnitude $a$, so that $D(w_0 + a\ell)$ is as large as possible; the individual perturbations contributing to $a$ which each case should receive are thus selected, the most influential cases being those associated with the larger direction cosines. There is some elegant theory for determining $\ell$ due to Cook (1). The required second derivative of $D(w_0 + a\ell)$, which Cook emphasizes is the curvature at $w_0$, and which we will call “Cook’s directional curvature” in direction $\ell$, evaluates as

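$$C_\ell = 2\,\big|\ell'\Delta'\ddot{L}^{-1}\Delta\,\ell\big|, \quad \text{where } \|\ell\| = 1, \tag{2}$$

$$\Delta = \{\partial^2 L(\beta \mid w)/\partial\beta_i\,\partial w_j\} \quad \text{and} \quad \ddot{L} = \{\partial^2 L(\beta)/\partial\beta_i\,\partial\beta_j\},$$

both evaluated at $\hat\beta$ and $w_0$.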

There are several ways in which (2) can be used to study local influence in practice. The extremes $C_{\max} = \max_\ell C_\ell$ and $C_{\min} = \min_\ell C_\ell$ are two possible options. $C_{\max}$ and $C_{\min}$ correspond to the maximum and minimum absolute eigenvalues of

$$F = \{\partial^2 L(\hat\beta_w)/\partial w_j\,\partial w_k\}, \quad j, k = 1, 2, \ldots, q. \tag{2a}$$

The vector $\ell_{\max}$ indicates how to perturb the postulated model to obtain the greatest local changes in the likelihood displacement. When simultaneously perturbing all case-weights in linear regression, for example, suppose that the $i$th element of $\ell_{\max}$, $\ell_i$, is found to be relatively large. This indicates that perturbations in the weight $w_i$ of the $i$th case may lead to substantial changes in the results of the analysis.

3. LOCAL INFLUENCE OF ASSUMPTIONS FOR THE LINEAR MODEL

The key model assumptions of standard linear regression examined here with regard to local influence are the constancy of the error variance and the independence of the errors given their constant variance. Once local influence on the homoscedasticity assumption has been confirmed and the corresponding transformations have been applied, it is necessary to ask whether the locally influential observations affect the parameters of the transformation. More concretely, the effect of the disturbances of the variance on the estimates of the parameters of the Box and Cox (19) transformation will be analyzed.

3.1. Variance Perturbations

We consider the perturbed model

$$Y = X\beta + \varepsilon, \tag{3}$$

where $\operatorname{Var}(\varepsilon) = \sigma^2 W^{-1}$ with $W = \operatorname{diag}(w_1, w_2, \ldots, w_n)$ and $\sigma^2$ known. The relevant part of the log-likelihood for the perturbed model is

$$L(\beta \mid w) = -\frac{1}{2\sigma^2}\sum_{i=1}^{n} w_i\,(y_i - x_i'\beta)^2, \tag{4}$$

where $y_i$ is the $i$th component of $Y$ and $x_i'$ is the $i$th row of $X$. Differentiating (4) with respect to $\beta$ and $w$, and evaluating at $\hat\beta$ and $w_0 = 1$, we find

$$\Delta = X'D(e)/\sigma^2,$$

where $e = (e_i)$ is the $n \times 1$ vector of ordinary residuals when $w_0 = 1$ and $D(e) = \operatorname{diag}(e_1, e_2, \ldots, e_n)$. Since $\ddot{L}(\hat\beta) = -X'X/\sigma^2$, we have

$$C_\ell = 2\,\big|\ell'\Delta'\ddot{L}^{-1}\Delta\,\ell\big| = 2\,\ell'D(e)PD(e)\,\ell/\sigma^2, \tag{5}$$

where $P = X(X'X)^{-1}X'$ is the prediction matrix and $\|\ell\| = 1$. When $\sigma^2$ is unknown, the analogous result is

$$C_\ell = 2\,\ell'\left[D(e)PD(e) + e_{sq}e_{sq}'/(2n\sigma^2)\right]\ell/\sigma^2, \tag{6}$$

where $e_{sq}$ is the $n \times 1$ vector with elements $e_i^2$.


For a simple random sample, $F$ has only one nonzero eigenvalue, $C_{\max}$, with corresponding eigenvector $\ell_{\max} = e/\|e\|$. Thus, the local changes in $\hat\beta$ will be zero when $w_0 = 1$ is perturbed in any direction that is orthogonal to $e$. In this simple situation, the maximum curvature is $C_{\max} = 2$, which is independent of the data. For this reason a curvature of 2 serves as a useful general reference, with curvatures much larger than 2 indicating notable local sensitivity.

Note that if we find that perturbation of the $i$th variance in a linear regression model causes large changes to the parameter estimates, this does not allow us to infer that the variance of the $i$th case differs from the rest. The conclusion reached is that a case is influential if one or more of the model assumptions for this case cannot be varied without important changes to the resulting parameter estimates. The strength of influence of a particular case might be gauged by the number of types of such perturbation which cause large changes in the estimates.
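The reference value of 2 for a simple random sample can be checked numerically. The following is a minimal sketch, assuming the curvature (5) with $\sigma^2$ replaced by the plug-in value $e'e/n$ (an assumption made here so that the formula can be evaluated from the data alone); the function name `case_weight_curvature` and the simulated data are hypothetical and not part of the reviewed papers.

```python
import numpy as np

def case_weight_curvature(X, y, sigma2=None):
    """Largest curvature and its direction for case-weight perturbations, eq. (5)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    P = X @ np.linalg.solve(X.T @ X, X.T)    # prediction matrix P
    e = y - P @ y                            # ordinary residuals
    if sigma2 is None:
        sigma2 = e @ e / len(y)              # assumption: plug-in value e'e/n
    M = 2.0 * (e[:, None] * P * e[None, :]) / sigma2   # 2 D(e) P D(e) / sigma^2
    eigval, eigvec = np.linalg.eigh(M)       # eigenvalues in ascending order
    return eigval[-1], eigvec[:, -1], e

# Simple random sample: the design matrix is a single column of ones.
rng = np.random.default_rng(0)
y = rng.normal(size=30)
X = np.ones((30, 1))
c_max, l_max, e = case_weight_curvature(X, y)

# |cosine| of the angle between l_max and the residual vector e
alignment = abs(l_max @ e) / np.linalg.norm(e)
```

Under these assumptions `c_max` equals 2 up to rounding and `alignment` equals 1, in line with $\ell_{\max} = e/\|e\|$. Passing a design matrix with explanatory columns gives the general case-weight curvature for a regression.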

3.2. Perturbation of Independence

To assess the importance of the independence assumption in the standard linear model, it is proposed to perturb the variance matrix from the unit diagonal form to a tridiagonal form representing first-order moving-average autocorrelation in the index of the data. Because of the local nature of the analysis, only first-order dependence is relevant. Thus, in the perturbed standard linear model, Lawrance (20) considers the perturbation $\operatorname{Var}(\varepsilon) = \sigma^2 V$, where

$$V = \begin{pmatrix}
1 & \rho_1 & 0 & \cdots & 0 \\
\rho_1 & 1 & \rho_2 & \cdots & 0 \\
0 & \rho_2 & 1 & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & \rho_{n-1} \\
0 & 0 & \cdots & \rho_{n-1} & 1
\end{pmatrix}, \tag{7}$$

and where $\rho_i$ ($i = 1, 2, \ldots, n-1$) are the perturbations away from 0. Lawrance perturbs one assumption at a time, and the variances are assumed constant. In the local influence framework he assumes that $\rho_i = a\ell_i$ ($1 \le i \le n-1$), where $|a| < \min\{1/\max(\text{positive } \ell_i),\; 1/\max(-\text{negative } \ell_i)\}$.

Using generalized least squares, the estimator of $\beta$ is

$$\hat\beta = (X'V^{-1}X)^{-1}X'V^{-1}Y. \tag{8}$$

Differentiating (8) with respect to $a$, we obtain

$$\left.\frac{\partial\hat\beta(a)}{\partial a}\right|_{a=0} = (X'X)^{-1}X'R\,\ell, \tag{9}$$


where

$$R = \begin{pmatrix}
e_2 & 0 & 0 & \cdots & 0 \\
e_1 & e_3 & 0 & \cdots & 0 \\
0 & e_2 & e_4 & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & e_n \\
0 & 0 & \cdots & 0 & e_{n-1}
\end{pmatrix} \tag{10}$$

is of dimension $n \times (n-1)$. Thus, Cook’s directional curvature becomes

$$C_\ell = \ell'R'X(X'X)^{-1}X'R\,\ell/\sigma^2, \tag{11}$$

which may be maximised to obtain $\ell_{\max}$.

Finally, we consider the effects of correlation perturbations to the $(i, i+1)$th cases. Immediately from the $i$th diagonal element of (11), Cook’s curvature for the $i$th case becomes

$$\left(p_{ii}\,e_{i+1}^2 + 2\,p_{i,i+1}\,e_i e_{i+1} + p_{i+1,i+1}\,e_i^2\right)/\sigma^2. \tag{12}$$

The first and last terms will be large when the leverages and residuals are large; (12) becomes larger when the associated off-diagonal element of the leverage matrix is large and of the same sign as the product of the adjacent residuals. The presence of the off-diagonal term is not surprising when the effect of interest concerns pairs of cases.
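The relation between (11) and (12) can be illustrated numerically. The sketch below builds the $n \times (n-1)$ matrix $R$ of (10), forms the matrix $R'PR/\sigma^2$ of the quadratic form (11), and compares its diagonal with (12). The function name `correlation_curvature`, the simulated data and the value $\sigma^2 = 1$ are assumptions for illustration only.

```python
import numpy as np

def correlation_curvature(X, y, sigma2):
    """Matrix of the quadratic form (11) for first-order correlation perturbations."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    P = X @ np.linalg.solve(X.T @ X, X.T)   # prediction (leverage) matrix
    e = y - P @ y                           # ordinary residuals
    R = np.zeros((n, n - 1))
    idx = np.arange(n - 1)
    R[idx, idx] = e[1:]                     # column i holds e_{i+1} in row i
    R[idx + 1, idx] = e[:-1]                # and e_i in row i+1, as in (10)
    M = R.T @ P @ R / sigma2
    return M, P, e

rng = np.random.default_rng(1)
n = 12
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)
sigma2 = 1.0                                # assumed known for this sketch

M, P, e = correlation_curvature(X, y, sigma2)

# Individual curvatures from formula (12), one per adjacent pair (i, i+1)
p = np.diag(P)
p_off = np.diag(P, k=1)
pair_curvature = (p[:-1] * e[1:] ** 2
                  + 2.0 * p_off * e[:-1] * e[1:]
                  + p[1:] * e[:-1] ** 2) / sigma2

# Direction of maximum curvature: eigenvector of the largest eigenvalue of M
eigval, eigvec = np.linalg.eigh(M)
l_max = eigvec[:, -1]
```

The diagonal of `M` agrees with `pair_curvature`, and an index plot of `l_max` would point to the adjacent pairs whose correlation perturbation matters most.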

3.3. Regression Transformation Diagnostics Using Local Influence

The assumption of constant variances in the transformed model (3) is slightly perturbed; this is seen as a way to handle cases that are badly modeled (for whatever reason). In particular, it provides a technical approach to determining those cases of the data that have the strongest general influence on the estimated transformation parameter. Following Box and Cox (19), the log-likelihood for $\lambda$, given data $y$ and weights $w$, after maximizing over $\beta$ and $\sigma^2$, is

$$L_w(\lambda) = -\tfrac{1}{2}n - \tfrac{1}{2}n\log(2\pi) + \tfrac{1}{2}n\log(n) + \tfrac{1}{2}\sum_{i=1}^{n}\log(1 + w_i) - \tfrac{1}{2}n\log\{z'B_w z\}, \tag{13}$$

where $z = z(\lambda) = y^{(\lambda)}/J(\lambda)^{1/n}$, $J(\lambda) = \prod_{i=1}^{n} dy_i^{(\lambda)}/dy_i$, and

$$B_w = W - WX(X'WX)^{-1}X'W. \tag{14}$$


The maximum likelihood estimate of $\lambda$, denoted by $\hat\lambda_w$, then satisfies

$$z'B_w\dot{z} = 0, \tag{15}$$

where the dot denotes the derivative of $z$ with respect to $\lambda$. Furthermore, the estimator $\hat\lambda_w$ can be regarded as a surface with Euclidean coordinates $w$. A curve over this surface is mapped from a straight-line path that passes through the point of null perturbation. The direction and location of this path are specified by $w = w_0 + a\ell$. The basis of Lawrance’s local influence diagnostic is the slope of the curve on the $\hat\lambda_w$ surface at the point of null perturbation, $a = 0$. Lawrance (21) obtained the direction of maximum local influence, which is given by

$$\ell_{\max,i} = r_i\dot{r}_i \Big/ \Big[\sum_{i=1}^{n}\{r_i\dot{r}_i\}^2\Big]^{1/2}, \quad i = 1, 2, \ldots, n, \tag{16}$$

where $r_i$ and $\dot{r}_i$ are the $i$th residuals from the regressions of $z$ and $\dot{z}$ on the columns of $X$, respectively. The diagnostic technique that arises from the direction of maximum slope, (16), is the plot of the scaled product $r_i\dot{r}_i$ against the case index $i$; any points that are separated from a central band about 0 indicate cases or groups of cases that are, at least locally, most influential. The advantage of (16) is that it assesses the effect of joint perturbations on the data cases. Thus, Lawrance (21) and Pena and Yohai (22) assert that, in a local sense, the results are free from the masking effects that present difficulties to individual case-deletion methods. However, the available outlier identification methods using local influence often do not succeed in detecting outliers, because they are based on measures which are not resistant to masking and swamping effects. They are affected by the observations they are supposed to identify (see Cook, Pena and Weisberg (23)), and they lack suitable cut-off points. To mitigate these effects, Schall and Gonin (24) replace the notion of likelihood displacement by the $L_p$-norm displacement, and Suarez and Gonzalez (25) propose an alternative quasi-likelihood displacement

$$LD_{(i)}(w_i) = -2\left[L(\hat\beta) - L_{(i)}(\hat\beta_{w_i} \mid w)\right] + \left[\operatorname{Var}(\hat{Y}) - \operatorname{Var}(\hat{Y}_w)\right],$$

where $L_{(i)}$ denotes the likelihood when the $i$th observation is eliminated.

In studying diagnostics on the transformation parameter estimator, Lawrance (21) indicated that cases that are locally influential may often have large influence on the deletion-parameter estimator, $\hat\lambda_{(i)}$. Furthermore, he discussed the relationship between his local-influence diagnostic $\ell_{\max,i}$ and Cook and Wang’s (26) deletion estimator. Since Lawrance’s perturbation scheme does not affect the Jacobian of the power transformation, Tsai and Wu (27) give an alternative local influence diagnostic.
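The direction (16) can be sketched in a few lines. The example below assumes the usual Box and Cox power family $y^{(\lambda)} = (y^\lambda - 1)/\lambda$ (with $\log y$ at $\lambda = 0$), for which $J(\lambda)^{1/n}$ is the geometric mean of $y$ raised to the power $\lambda - 1$; it approximates $\dot{z}$ by a central finite difference and locates $\hat\lambda$ by a coarse grid search that minimizes $z'(I - P)z$. The function names, the grid, the step size `h` and the simulated data are all hypothetical choices, not taken from Lawrance (21).

```python
import numpy as np

def normalized_boxcox(y, lam):
    """z(lambda) = y^(lambda) / J(lambda)^(1/n) for the power family (assumed)."""
    gm = np.exp(np.mean(np.log(y)))          # geometric mean of y
    if abs(lam) < 1e-12:
        return gm * np.log(y)
    return (y ** lam - 1.0) / (lam * gm ** (lam - 1.0))

def lawrance_direction(P, y, lam, h=1e-5):
    """Unit-length direction (16): scaled products r_i * rdot_i."""
    z = normalized_boxcox(y, lam)
    z_dot = (normalized_boxcox(y, lam + h)
             - normalized_boxcox(y, lam - h)) / (2.0 * h)   # numerical dz/dlambda
    r = z - P @ z                            # residuals of z on the columns of X
    r_dot = z_dot - P @ z_dot                # residuals of z_dot on the columns of X
    prod = r * r_dot
    return prod / np.linalg.norm(prod)

rng = np.random.default_rng(2)
n = 40
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
y = np.exp(0.3 + 0.4 * x + 0.1 * rng.normal(size=n))       # positive responses
P = X @ np.linalg.solve(X.T @ X, X.T)

# Grid search for lambda_hat: maximizing (13) at w = 0 minimizes z'(I - P)z
grid = np.linspace(-2.0, 2.0, 401)
rss = [z @ (z - P @ z) for z in (normalized_boxcox(y, lam) for lam in grid)]
lam_hat = grid[int(np.argmin(rss))]

l_max = lawrance_direction(P, y, lam_hat)
```

Plotting `l_max` against the case index gives the diagnostic plot described above; cases far from the central band around 0 are the locally most influential ones for $\hat\lambda$.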


4. LOCAL PERTURBATIONS IN THE DATA OF THE LINEAR REGRESSION MODEL

We now consider the sensitivity of parameter estimates to the data values used in the estimation, assuming a correct model. The local influence approach is used to perturb data values, a logically distinct operation from perturbing assumptions. The identification of data values upon which the parameter estimates crucially rest is, of course, of much practical importance; it drives the process of data validation and correction. Lawrance (20) considers perturbations to response and individual explanatory variables; the explanatory case has been fully studied by Cook (1), although the derivations here are intended to be slightly more explicit. First, perturbations to response values are discussed.

4.1. Perturbations to Response Values

Perturbing $Y$ to $Y + w$ gives

$$\hat\beta(w) = (X'X)^{-1}X'(Y + w), \quad \text{where} \quad \left.\frac{\partial\hat\beta(w)}{\partial w}\right|_{w=0} = (X'X)^{-1}X'.$$

Thus, Cook’s directional curvature is

$$C_\ell = \ell'P\ell.$$

Since $P$ is idempotent, its $k$ non-zero eigenvalues are unity. There is no unique eigenvector in this situation and thus no unique direction maximizing the local curvature. The eigenvectors $\ell_{\max}$ maximizing $C_\ell$ are not functions of the responses. The individual Cook’s curvatures are just the leverage coefficients. Also,

$$\left.\frac{\partial X\hat\beta(w)}{\partial w}\right|_{w=0} = X(X'X)^{-1}X' = P,$$

so that the two-way table of fitted values and case derivatives is given by the leverage matrix, perhaps a useful further way to interpret $P$.
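The statements about $C_\ell = \ell'P\ell$ can be checked directly. The sketch below, with a hypothetical function name and simulated design matrix, confirms that the $k$ non-zero eigenvalues of $P$ equal one and that the curvature in the direction of a single case is its leverage $p_{ii}$.

```python
import numpy as np

def additive_response_curvature(P, l):
    """Directional curvature l'Pl for the additive perturbation Y -> Y + w."""
    l = np.asarray(l, dtype=float)
    return l @ P @ l

rng = np.random.default_rng(3)
n, k = 15, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
P = X @ np.linalg.solve(X.T @ X, X.T)       # prediction (leverage) matrix

eigval = np.linalg.eigvalsh(P)              # ascending: n - k zeros, then k ones

u = np.zeros(n)
u[0] = 1.0                                  # perturb the first response only
curvature_case_1 = additive_response_curvature(P, u)
leverage_case_1 = P[0, 0]
```

Because the unit eigenvalue is repeated $k$ times, any unit vector in the column space of $X$ attains the maximum curvature, which is why no unique $\ell_{\max}$ exists for this scheme.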

Another way to perturb the response values is multiplicatively; $Y$ is perturbed to $\{\operatorname{diag}(1 + a\ell)\}Y$; hence

$$\left.\frac{\partial\hat\beta(a)}{\partial a}\right|_{a=0} = (X'X)^{-1}X'D(Y)\,\ell,$$

where $D(Y)$ is the diagonal matrix with the entries of $Y$ on its diagonal. Hence, Cook’s directional curvature is

$$C_\ell = \ell'D(Y)PD(Y)\,\ell,$$

which is maximized to produce a unique $\ell_{\max}$; the individual curvatures are then $p_{ii}y_i^2$. The use of such a perturbation scheme remains to be justified.

Schwarzmann (28) notes that if the response vector $y$ is perturbed to $y + w$, the direction of maximum curvature is that of the residual vector $e$. This provides a further justification for a local-influence analysis, because residual diagnostic methods are now seen to be a special case of perturbing the data and inspecting the direction of the greatest local sensitivity. As the vector of ordinary residuals coincides with the direction of maximum curvature for additive perturbations of the response vector, it becomes clear that diagnostics based on residuals will identify “wrong data” rather than a “wrong model”.

4.2. Perturbations to Explanatory Variables

It is well known that minor perturbations of the explanatory variables in linear regression can seriously influence the results of a least squares analysis when collinearity is present. In this sense, it is interesting to study the influence of the $i$th row of $X$ on the eigenstructure of $X$ in general and on its condition number.

Farebrother (29) shows that the condition number traditionally used by numerical analysts is closely related to a variant of Cook’s (1) absolute measure of local influence.

Replacing the values of the $i$th explanatory variable $X_i$, the $(i+1)$th column of $X$, by $X_i + w$, and taking $w$ as $a\ell$ where $\ell$ is a column vector of direction cosines, we obtain

$$\left.\frac{\partial\hat\beta(a)}{\partial a}\right|_{a=0} = (X'X)^{-1}\{d_i e' - \hat\beta_i X'\}\,\ell, \quad \text{and} \quad C_\ell = \ell'M\ell,$$

where $M = (e d_i' - \hat\beta_i X)(X'X)^{-1}(d_i e' - \hat\beta_i X')/\sigma^2$ and $d_i = (0, \ldots, 1, \ldots, 0)'$.

It can be verified that the direction of maximum curvature is

$$\ell_{\max} = e - \hat\beta_i x^*,$$

where $x^*$ is the residual vector from regressing $X_i$ on the other explanatory variables. The elements of $\ell_{\max}$ are proportional to the vertical distances from the line of slope $\hat\beta_i$ in the added variable plot of $e$ against $x^*$. Thus, if we now consider the deletion diagnostic $\vartheta = e_{[i]} - e = \hat\beta_i x^*$, where $e_{[i]}$ is the residual vector when the $i$th explanatory variable is deleted, we can verify that

$$\vartheta = e - \ell_{\max}, \tag{17}$$

so that a case that is locally influential is not necessarily globally influential, and conversely.
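The claim that $\ell_{\max}$ is proportional to $e - \hat\beta_i x^*$ can be verified numerically from the matrix $M$. In the sketch below the function name `explanatory_curvature_matrix`, the simulated data, the zero-based column index `i` and the value $\sigma^2 = 1$ are assumptions for illustration; the eigenvector of the largest eigenvalue of $M$ is compared with $e - \hat\beta_i x^*$ up to sign and scale.

```python
import numpy as np

def explanatory_curvature_matrix(X, y, i, sigma2):
    """M = (e d_i' - b_i X)(X'X)^{-1}(d_i e' - b_i X') / sigma^2 for column i of X."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y                 # least squares estimate
    e = y - X @ beta                         # ordinary residuals
    d = np.zeros(p)
    d[i] = 1.0                               # selects the perturbed column
    A = np.outer(e, d) - beta[i] * X         # n x p matrix e d_i' - b_i X
    M = A @ XtX_inv @ A.T / sigma2
    # x*: residual from regressing column i on the remaining columns
    others = np.delete(X, i, axis=1)
    coef = np.linalg.lstsq(others, X[:, i], rcond=None)[0]
    x_star = X[:, i] - others @ coef
    return M, e, beta[i], x_star

rng = np.random.default_rng(4)
n = 25
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 0.5, -1.5]) + rng.normal(size=n)

M, e, b_i, x_star = explanatory_curvature_matrix(X, y, i=1, sigma2=1.0)
eigval, eigvec = np.linalg.eigh(M)
l_max = eigvec[:, -1]                        # direction of maximum curvature

target = e - b_i * x_star
cosine = abs(l_max @ target) / np.linalg.norm(target)
```

A value of `cosine` equal to 1 (up to rounding) indicates that the two directions coincide, which is what the expression for $\ell_{\max}$ above states.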


5. PROBLEMS OF LOCAL INFLUENCE

Cook (1) gives a general method for assessing the influence of local departures from assumptions in likelihood-based models (not necessarily regression models). For that, he suggests using the normal curvature of a likelihood displacement surface. There are practical and theoretical difficulties which arise in Cook’s approach and which deserve further attention: lack of invariance of the curvature under reparametrisation of the perturbation scheme; choice of yardstick; computability of the maximum curvature; and lack of definition of the parameters.

Accordingly, Schall and Dunne (30) suggest a modification of Cook’s measure for the local influence of model perturbations which is invariant under reparametrisations of the perturbation scheme, and Billor and Loynes (31) propose a new measure of local influence which is simpler to compute. Moreover, they divide perturbation schemes into two main groups: model perturbation, which means modifying the assumptions, and data perturbation.

5.1. Scaled Curvature

Schall and Dunne (30) consider the variance inflation factor (VIF) in linear regression. The $i$th VIF, associated with $\beta_i$, is defined as

$$\mathrm{VIF}_i = \frac{\{(X'X)^{-1}\}_{ii}}{\{(X'X)_{ii}\}^{-1}} = (X'X)_{ii}\,(X'X)^{ii}, \tag{18}$$

where $(X'X)_{ii}$ and $(X'X)^{ii}$ respectively denote the $i$th diagonal elements of $X'X$ and its inverse. It is clear that $\mathrm{VIF}_i \ge 1$.
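Definition (18) is straightforward to compute. The sketch below evaluates it as written, on the raw cross-product matrix $X'X$ without centering or scaling the columns (a choice made here to follow (18) literally); the function name `vif` and the example matrices are hypothetical.

```python
import numpy as np

def vif(X):
    """Variance inflation factors (18): diag(X'X) times diag((X'X)^{-1})."""
    X = np.asarray(X, dtype=float)
    XtX = X.T @ X
    return np.diag(XtX) * np.diag(np.linalg.inv(XtX))

# Orthogonal columns: no inflation, every VIF equals 1.
vif_orthogonal = vif(np.eye(3))

# Two nearly collinear columns: their VIFs become large.
rng = np.random.default_rng(5)
x1 = rng.normal(size=50)
x2 = x1 + 0.05 * rng.normal(size=50)
x3 = rng.normal(size=50)
vif_collinear = vif(np.column_stack([x1, x2, x3]))
```

In the second example the first two entries of `vif_collinear` are much larger than the third, reflecting the near collinearity of `x1` and `x2`, while all entries stay at or above 1.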

To extend the notion of collinearity and variance inflation factor to general likelihood models, they use

$$\mathrm{VIF}_i = I_{ii}\,I^{ii} \quad (i = 1, 2, \ldots, k), \tag{19}$$

where $I_{ii}$ and $I^{ii}$ respectively denote the $i$th diagonal elements of $I$ (the information matrix) and its inverse. Statistic (19) can be interpreted as measuring the parameter collinearity between $\theta_i$ and $\theta_{(i)}$.

In the local curvature context, they define the $(k+1) \times (k+1)$ matrix for $(\theta, a)$, where $w = w_0 + a\ell$, as

$$I = -\begin{pmatrix} \ddot{L} & \Delta\ell \\ \ell'\Delta' & \ell'B\ell \end{pmatrix}, \tag{20}$$

where $\ddot{L}$ and $\Delta$ are defined in (2) and $B = E(G) = E[\partial^2 L(\theta \mid w)/\partial w\,\partial w']$.


Thus

$$\mathrm{VIF}_\ell = I_{(k+1)(k+1)}\,I^{(k+1)(k+1)} = \frac{\ell'B\ell}{\ell'B\ell - \ell'\Delta'\ddot{L}^{-1}\Delta\ell} = \frac{|\ell'B\ell|}{|\ell'B\ell| - \tfrac{1}{2}C_\ell} = \frac{1}{1 - c_\ell}, \tag{21}$$

where $c_\ell = \tfrac{1}{2}C_\ell/|\ell'B\ell|$ is the scaled curvature, which is invariant under common reparametrisation of the components of $w$.

The direction of maximum curvature is not necessarily the same as the direction of maximum variance inflation factor. However, the two will coincide when $B$ is a multiple of the identity matrix.

5.2. Billor and Loynes’ Approach

To avoid the difficulties of Cook’s local influence approach, Billor and Loynes (31) suggest an alternative likelihood displacement

$$LD^*(w) = -2\left[L(\hat\theta) - L(\hat\theta_w \mid w)\right], \tag{22}$$

where $L(\hat\theta_w \mid w)$ is the log-likelihood of the perturbed model, whereas Cook (1) uses the perturbation only in the estimation of the parameters. Billor and Loynes (31) suggest that the first derivative of $LD^*$ provides valuable information about the local behavior of $LD^*$, so they use the direction which produces the maximum increase in $LD^*$, with slope

$$l_{\max} = \|\nabla LD^*(w_0)\| = 2\,\|\nabla L(\hat\theta \mid w)\|.$$

If we take the (perturbed) model

$$Y = X\beta + \varepsilon, \tag{23}$$

where $\operatorname{Var}(\varepsilon) = \sigma^2 W^{-1}$ and $W = \operatorname{diag}(1, 1, \ldots, 1 + w_i, 1, \ldots, 1)$, then

$$\ell_i = \ell_{\max,i} = \left(1 - \frac{e_i^2}{\sigma^2}\right), \tag{24}$$

which is simpler to compute than the classic approach. Furthermore, if we replace the residuals in this expression by the errors $\varepsilon$, which are assumed to be independent normally distributed random variables with mean 0 and variance $\sigma^2$, a convenient cut-off point is obtained as $\sqrt{2n + 4\sqrt{14n}}$. It may be noted that this value depends on the dimensions of the problem.
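A short numerical sketch of the first-order measure may help. It evaluates the components (24) and the slope $l_{\max}$, compares them with a finite-difference gradient of $LD^*$ computed from the Gaussian log-likelihood with weights $1 + w_i$ (parameters held at their unperturbed estimates, which gives the same first derivative at $w_0$), and evaluates the cut-off in the form read here as $\sqrt{2n + 4\sqrt{14n}}$. The function names, the simulated residuals and $\sigma^2 = 1$ are assumptions for illustration.

```python
import numpy as np

def billor_loynes_slope(e, sigma2):
    """Gradient components (24) of LD* at the null perturbation, and its norm."""
    g = 1.0 - e ** 2 / sigma2
    return g, np.linalg.norm(g)

def cutoff(n):
    """Cut-off for the slope, as reconstructed in the text above (assumption)."""
    return np.sqrt(2.0 * n + 4.0 * np.sqrt(14.0 * n))

def ld_star(w, e, sigma2):
    """LD*(w) = 2[L(theta_hat | w) - L(theta_hat | 0)], weights 1 + w_i."""
    loglik_w = np.sum(0.5 * np.log1p(w) - (1.0 + w) * e ** 2 / (2.0 * sigma2))
    loglik_0 = np.sum(-e ** 2 / (2.0 * sigma2))
    return 2.0 * (loglik_w - loglik_0)

rng = np.random.default_rng(6)
n = 20
e = rng.normal(size=n)                       # stand-in residuals (hypothetical)
sigma2 = 1.0

g, slope = billor_loynes_slope(e, sigma2)

# Central finite-difference gradient of LD* at w = 0, one case at a time
h = 1e-6
g_numeric = np.empty(n)
for i in range(n):
    step = np.zeros(n)
    step[i] = h
    g_numeric[i] = (ld_star(step, e, sigma2) - ld_star(-step, e, sigma2)) / (2.0 * h)

exceeds_cutoff = slope > cutoff(n)
```

The agreement between `g` and `g_numeric` illustrates why this measure is cheap: it needs only the residuals and $\sigma^2$, with no matrix inversion or eigen-decomposition.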


6. FURTHER TOPICS

Because $LD(w)$ is, in general, difficult to compute, and with the aim of understanding the nature of local influence and not only detecting it, Escobar and Meeker (32) and Wu and Luo (33, 34) propose alternative developments of local influence.

6.1. Taylor Series Approximation for LD(w)

Escobar and Meeker (32) suggest that consideration of the overall quadratic approximation to $LD$, i.e. to the complete surface rather than restricting attention to sections in single directions $\ell$, leads to more understanding. We have

$$LD(w) \approx \tfrac{1}{2}w'Fw = \tfrac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j \sum_{s=1}^{r}\lambda_s\nu_{si}\nu_{sj}, \tag{25}$$

where $w_i$, the weight of the $i$th case, belongs to $[-1, 1]$; $F$ is defined in (2a) and has rank $r$, with spectral decomposition $V\Lambda V' = \sum_{s=1}^{r}\lambda_s\nu_s\nu_s'$, where $\Lambda$ is an $r \times r$ diagonal matrix; the eigenvalues of $F$ are $\lambda_1, \lambda_2, \ldots, \lambda_n$ (in descending order); and $V = (\nu_1, \nu_2, \ldots, \nu_r)$ is the corresponding $n \times r$ matrix of orthogonal eigenvectors with norm 1. This shows that, when studying the effects of joint perturbations, one should examine all of the eigenvectors, particularly those with large eigenvalues and those whose directions are dominated by just a few components.
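The equality of the two forms in (25) can be illustrated with a small example. The matrix `F` below is a hypothetical symmetric matrix of rank 3 used only as a stand-in for the matrix in (2a), and `quadratic_ld` is an illustrative name.

```python
import numpy as np

def quadratic_ld(F, w):
    """Quadratic approximation (25) evaluated through the spectral decomposition of F."""
    lam, V = np.linalg.eigh(F)               # F = V diag(lam) V'
    scores = V.T @ w                         # coordinates of w along each eigenvector
    return 0.5 * np.sum(lam * scores ** 2)

rng = np.random.default_rng(7)
A = rng.normal(size=(6, 3))
F = A @ A.T                                  # symmetric stand-in with rank r = 3
w = rng.uniform(-1.0, 1.0, size=6)           # perturbation weights in [-1, 1]

ld_spectral = quadratic_ld(F, w)
ld_direct = 0.5 * w @ F @ w

# Eigenvectors ordered by decreasing eigenvalue, for inspection of their components
lam, V = np.linalg.eigh(F)
order = np.argsort(lam)[::-1]
leading_vectors = V[:, order[:3]]
```

Each column of `leading_vectors` can be plotted against the case index; columns dominated by one or two large components single out individual cases, while spread-out columns indicate joint effects.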

6.2. Second Order Approach

Inspired by Cook’s (1) assessment of local influence by studying the curvature of a surface associated with the overall discrepancy measure, Wu and Luo (33) assess local influence through the curvature of the perturbation-formed surfaces of the residual sum of squares (RSS) and of the multiple potential, respectively.

Consider the perturbed model (3) and let $\eta$ be a quantity under consideration, such as the RSS or the multiple potential, which does not have a zero derivative at the null perturbation. The variation is locally dominated by the first-order term, the tangent approximation; this is the first-order approach. But if the curvature is large, the second-order terms may still be important globally; hence the second-order approach, to be used alongside the first-order approach.

In general, the curvature is no longer identical to the second derivative:

$$C_\ell = \frac{\|(-\dot\eta', 1)\|\;\ell'\ddot\eta\,\ell}{\|\dot\alpha\|^2\,(1 + \dot\eta'\dot\eta)} = \frac{\ell'\ddot\eta\,\ell}{(1 + \dot\eta'\dot\eta)^{1/2}\;\ell'(I + \dot\eta\dot\eta')\,\ell}, \tag{26}$$

where

$$\left.\frac{\partial\eta}{\partial a}\right|_{a=0} = \dot\eta'\ell \quad \text{and} \quad \left.\frac{\partial^2\eta}{\partial a^2}\right|_{a=0} = \ell'\ddot\eta\,\ell. \tag{27}$$

The curvature (26) takes the form

$$\frac{\ell'F\ell}{\ell'D\ell}. \tag{28}$$

According to matrix theory, the local maximum curvatures and their corresponding directions are the eigenvalue and eigenvector solutions of the equation

$$|F - \lambda D| = 0. \tag{29}$$

They discuss the residual sum of squares and two versions of expected multiple potential, defined as

$$\operatorname{tr}\big(P_J(I - P_J)^{-1}\big) \quad \text{and} \quad |I - P_J|,$$

respectively, where $J' = (i_1, \ldots, i_m)$, $X_{(J)}$ is the minor of $X$ with the rows indexed by $J$ deleted, $X_J$ is the minor of $X$ with only the rows indexed by $J$ retained, and $P_J$ is the $m \times m$ minor of $P$ given by the intersection of the rows and columns indexed by $J$.
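Equation (29) is a generalized symmetric eigenproblem, which can be reduced to a standard one when $D$ is positive definite. The sketch below assumes, as a reading of (26) and (28) and not as a statement from Wu and Luo (33), that $F = \ddot\eta/(1 + \dot\eta'\dot\eta)^{1/2}$ and $D = I + \dot\eta\dot\eta'$; the gradient `g`, the Hessian `H` and the function name are hypothetical stand-ins.

```python
import numpy as np

def max_curvature_direction(F, D):
    """Largest root of |F - lambda D| = 0 and its direction, for positive definite D."""
    Lc = np.linalg.cholesky(D)               # D = Lc Lc'
    Lc_inv = np.linalg.inv(Lc)
    S = Lc_inv @ F @ Lc_inv.T                # equivalent standard symmetric problem
    lam, U = np.linalg.eigh(S)               # ascending eigenvalues
    l = Lc_inv.T @ U[:, -1]                  # map the eigenvector back
    return lam[-1], l / np.linalg.norm(l)

rng = np.random.default_rng(8)
m = 5
g = rng.normal(size=m)                       # stand-in for the first derivative of eta
B = rng.normal(size=(m, m))
H = (B + B.T) / 2.0                          # stand-in for the second derivative of eta

D = np.eye(m) + np.outer(g, g)
F = H / np.sqrt(1.0 + g @ g)

lam_max, l_max = max_curvature_direction(F, D)
rayleigh = (l_max @ F @ l_max) / (l_max @ D @ l_max)   # ratio (28) at l_max
```

At `l_max` the ratio (28) equals `lam_max`, the largest generalized eigenvalue, so this direction maximizes the curvature under the stated assumptions.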

7. SUMMARY AND CONCLUDING REMARKS

Statistical models are generally approximate descriptions of more compli-cated processes, and because of this lack of exactness, consideration of the in-fluence of model perturbations is important. Local influence is a useful tool foridentifying importantly influential cases and for assessing the effects the pertur-bations to the assumed data/model will have on inferences.

In this paper the weighting diagnostic is reviewed and introduced like a com-plementary tool to the deletion diagnostic. The analysis of possible connections be-tween both diagnostics provides a further justification for local-influence analysis.

ACKNOWLEDGMENTS

We thank Ali S. Hadi and Robert Loynes for their helpful comments on an earlier version of this manuscript.

REFERENCES

(1) Cook, R. D. Assessment of Local Influence. Journal of the Royal Statistical Society Series B-Methodological, 1986, Vol. 48, 133–169.

(2) Cook, R. D. Detection of Influential Observations in Linear Regression. Technometrics, 1977, Vol. 19, 15–18.

(3) Pregibon, D. Logistic Regression Diagnostics. The Annals of Statistics, 1981, Vol. 9, 705–724.

(4) Cook, R. D.; Weisberg, S. Residuals and Influence in Regression; Chapman and Hall: London, 1982.

(5) Cook, R. D. Influence Assessment. Journal of Applied Statistics, 1987, Vol. 14, 117–131.

(6) Beckman, R.; Nachtsheim, C.; Cook, R. D. New Diagnostics for Mixed Model Analysis of Variance. Technometrics, 1987, Vol. 29, 413–426.

(7) Lee, A. H.; Zhao, Y. J. Sensitivity of Pearson’s Goodness-of-Fit Statistic in Generalized Linear Models. Communications in Statistics-Theory and Methods, 1996, Vol. 25, 143–157.

(8) Thomas, W.; Cook, R. D. Assessing Influence on Regression Coefficients in Generalized Linear Models. Biometrika, 1989, Vol. 76, 741–749.

(9) Thomas, W.; Cook, R. D. Assessing Influence on Predictions from Generalized Linear Models. Technometrics, 1990, Vol. 32, 59–65.

(10) O’Hara Hines, R. J.; Lawless, J. F.; Carter, D. M. Diagnostics for a Cumulative Multinomial Generalized Linear Model, with Applications to Grouped Toxicological Mortality Data. Journal of the American Statistical Association, 1992, Vol. 87, 1059–1069.

(11) Paula, G. A. Assessing Local Influence in Restricted Regression Models. Computational Statistics and Data Analysis, 1993, Vol. 16, 63–79.

(12) Paula, G. A. Influence and Residuals in Restricted Regression Models. Journal of Statistical Computation and Simulation, 1995, Vol. 51, 315–331.

(13) Pettitt, A. N.; Bin Daud, I. Case-Weighted Measures of Influence for Proportional Hazards Regression. Applied Statistics, 1989, Vol. 38, 51–67.

(14) Weissfeld, L. A. Influence Diagnostics for the Proportional Hazards Models. Statistics and Probability Letters, 1990, Vol. 10, 411–417.

(15) Billor, N.; Loynes, R. M. An Application of the Local Influence Approach to Ridge Regression. Journal of Applied Statistics, 1999, Vol. 26, 177–183.

(16) Laurent, R. T.; Cook, R. D. Leverage, Local Influence and Curvature in Nonlinear Regression. Biometrika, 1993, Vol. 80, 99–106.

(17) Tan, M.; Qu, Y.; Kutner, M. H. Model Diagnostics for Marginal Regression Analysis of Correlated Binary Data. Communications in Statistics-Simulation and Computation, 1997, Vol. 26, 539–558.

(18) Farrell, P. J.; Cadigan, N. G. Local Influence in Binary Regression Models and its Correspondence with Global Influence. Communications in Statistics-Theory and Methods, 2000, Vol. 29, 349–368.

(19) Box, G. E. P.; Cox, D. R. An Analysis of Transformations. Journal of the Royal Statistical Society Series B, 1964, Vol. 26, 211–246.

(20) Lawrance, A. J. Local and Deletion Influence. Institute of Mathematics and its Applications Volume: Proceedings on Robustness and Diagnostics (ed. W. Stahel), Springer Verlag, 1990, 141–157.

(21) Lawrance, A. J. Regression Transformation Diagnostic Using Local Influence. Journal of the American Statistical Association, 1988, Vol. 83, 1067–1072.

(22) Pena, D.; Yohai, V. J. The Detection of Influential Subsets in Linear Regression by Using an Influence Matrix. Journal of the Royal Statistical Society Series B-Methodological, 1995, Vol. 57, 145–156.

(23) Cook, R. D.; Pena, D.; Weisberg, S. The Likelihood Displacement: A Unifying Principle for Influence Measures. Communications in Statistics-Theory and Methods, 1988, Vol. 17, 623–640.

(24) Schall, R.; Gonin, R. Diagnostics for Nonlinear Lp-norm Estimation. Computational Statistics and Data Analysis, 1991, Vol. 11, 189–198.

(25) Suarez, M. M.; Gonzalez, M. A. Measures and Procedures for the Identification of Locally Influential Observations. Communications in Statistics-Theory and Methods, 2000, Vol. 28, 343–366.

(26) Cook, R. D.; Wang, P. C. Transformation and Influential Cases in Regression. Technometrics, 1983, Vol. 25, 337–343.

(27) Tsai, C. L.; Wu, X. Z. Transformation-Model Diagnostics. Technometrics, 1992, Vol. 34, 197–202.

(28) Schwarzmann, B. A Connection Between Local-Influence and Residual Diagnostics. Technometrics, 1991, Vol. 33, 103–104.

(29) Farebrother, R. W. Relative Local Influence and the Condition Number. Communications in Statistics-Simulation and Computation, 1992, Vol. 21, 707–710.

(30) Schall, R.; Dunne, T. T. A Note on the Relationship Between Parameter Collinearity and Local Influence. Biometrika, 1992, Vol. 79, 399–404.

(31) Billor, N.; Loynes, R. M. Local Influence: A New Approach. Communications in Statistics-Theory and Methods, 1993, Vol. 22, 1595–1611.

(32) Escobar, L. A.; Meeker, W. Q. Assessing Influence in Regression Analysis with Censored Data. Biometrics, 1992, Vol. 48, 507–528.

(33) Wu, X.; Luo, Z. Residual Sum of Square and Multiple Potential Diagnostics by a 2nd-Order Local Approach. Statistics and Probability Letters, 1993a, Vol. 16, 289–296.

(34) Wu, X.; Luo, Z. 2nd-Order Approach to Local Influence. Journal of the Royal Statistical Society Series B-Methodological, 1993b, Vol. 55, 929–936.

Received November 1998
Revised December 2000
