Hindawi Publishing Corporation, Journal of Applied Mathematics, Volume 2014, Article ID 286414, 7 pages, http://dx.doi.org/10.1155/2014/286414


Research Article
Robust AIC with High Breakdown Scale Estimate

Shokrya Saleh

Institute of Mathematical Sciences, University of Malaya, 50603 Kuala Lumpur, Malaysia

Correspondence should be addressed to Shokrya Saleh; f84fh@hotmail.com

Received 25 May 2014; Revised 20 August 2014; Accepted 20 August 2014; Published 8 September 2014

Academic Editor: Bernard J. Geurts

Copyright © 2014 Shokrya Saleh. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Akaike Information Criterion (AIC) based on least squares (LS) regression minimizes the sum of the squared residuals; LS is sensitive to outlier observations. Alternative criteria, which are less sensitive to outlying observations, have been proposed; examples are robust AIC (RAIC), robust Mallows Cp (RCp), and robust Bayesian information criterion (RBIC). In this paper, we propose a robust AIC by replacing the scale estimate with a high breakdown point estimate of scale. The robustness of the proposed method is studied through its influence function. We show, through simulated and real data examples, that the proposed robust AIC is effective in selecting accurate models in the presence of outliers and high leverage points.

1. Introduction

Akaike Information Criterion (AIC) [1] is a powerful technique for model selection, and it has been widely used for selecting models in many fields of study.

Consider a multiple linear regression model:

$$ y_i = \mu + \mathbf{x}_i^T \beta + \epsilon_i, \tag{1} $$

where $\mathbf{x}_i = (x_{i1}, \ldots, x_{ip})^T$ is a vector containing $p$ explanatory variables, $y_i$ is the response variable, $\beta$ is a vector of $p$ parameters, $\mu$ is the intercept parameter, and $\epsilon_i$ is the error component, which is independent and identically distributed (iid) with mean 0 and variance $\sigma^2$. The classical AIC is defined as

$$ \mathrm{AIC} = \ln\left(\frac{\mathrm{SSE}}{n}\right) + 2p, \tag{2} $$

where $\mathrm{SSE} = \sum r_i^2$, with $r_i = y_i - \hat{\mu}_{LS} - \mathbf{x}_i^T \hat{\beta}_{LS}$.
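As a minimal sketch of equation (2), the function below computes the criterion from a list of LS residuals. The function name and the example residuals are hypothetical, and $p$ is taken to be the number of explanatory variables, as in the text above.

```python
import math


def classical_aic(residuals, p):
    """Classical AIC of equation (2): ln(SSE / n) + 2p."""
    n = len(residuals)
    sse = sum(r * r for r in residuals)
    if n == 0 or sse <= 0.0:
        raise ValueError("need at least one residual and SSE > 0")
    return math.log(sse / n) + 2 * p


# Hypothetical LS residuals from a model with p = 2 explanatory variables.
example_residuals = [1.0, -1.0, 1.0, -1.0]
example_aic = classical_aic(example_residuals, p=2)
```

Because the residuals enter only through SSE, a single very large residual can move the criterion arbitrarily far, which is the weakness discussed next.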

Since the LS estimator is vulnerable in the presence of outliers, it is not surprising that AIC inherits this problem. Several robust alternatives to the AIC have been proposed in the literature (see [2–4]). For example, Ronchetti [3] proposed and investigated the properties of a robust version of AIC with respect to $M$-estimation. A similar idea was used by Martin [2] for autoregressive models. More recently, Tharmaratnam and Claeskens [4] proposed a robust AIC with respect to $S$-estimation and $MM$-estimation, generalizing the information criteria using the full likelihood models. In contrast to these more complicated derivations, we introduce a straightforward approach to deriving a robust AIC, which focuses on modifying the estimate of the scale.

To set the idea, the influence of outliers on AIC is illustrated through the presence of an outlier in the $Y$-direction (called a vertical outlier) or in the $X$-direction (called a leverage point). For this, a point with coordinates $(0, y_{10})$ is added, where the value of $y_{10}$ ranges between $-1.5$ and $3$. A similar approach is used for leverage points, where the added point is replaced with $(x_{10}, 0)$ (Figure 1). Table 1 and Figure 2 show that the value of AIC increases as the size of contamination in $(x_{10}, y_{10})$ increases, as expected, and if $x$ or $y$ is extremely large, the AIC is unbounded; that is, it tends to infinity.

The remainder of the paper is organized as follows. Section 1.1 reviews some robust regression estimation methods. In Section 1.2 we review a robust version of AIC; we discuss the robustness problem from the viewpoint of model selection and point out the sensitivity of the robust AIC based on the $M$-estimator to leverage points. We derive the influence function of AIC and study its properties in Section 2. The performance of the robust AIC is evaluated and compared to the commonly used classical AIC in Section 3. Finally, concluding remarks are presented in Section 4.


Figure 1: Data and positions for $y_{10}$ and $x_{10}$ points. (Two scatter panels of $y$ against $x$.)

Figure 2: Effect of adding one observation $(0, y_{10})$ or $(x_{10}, 0)$ on the values of AIC. (Two panels: AIC against $y_{10}$ and AIC against $x_{10}$.)

Table 1: AIC for different values of $y_{10}$ and $x_{10}$.

y_10     AIC      x_10     AIC
 3        3.2      4.5      3.3
 2.5      2.9      4        3.2
 2        2.4      3.5      3.0
 1.5      1.9      3        2.8
 1        1.2      2.5      2.6
 0.5      0.2      0       -0.5
 0       -0.5     -2.5      2.6
-0.5      0.3     -3        2.9
-1        1.2     -3.5      3.1
-1.5      1.9     -4        3.2
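The kind of experiment summarized in Table 1 can be sketched as follows, assuming a simple regression with one explanatory variable. The nine base points are hypothetical and are not the data behind Table 1, so the numbers produced will differ from the table; the sketch only shows how the classical AIC reacts when a tenth point $(0, y_{10})$ is moved away from the bulk of the data.

```python
import math


def simple_ls_aic(xs, ys, p=1):
    """Fit y = mu + beta * x by least squares and return ln(SSE / n) + 2p."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta = sxy / sxx
    mu = y_bar - beta * x_bar
    sse = sum((y - mu - beta * x) ** 2 for x, y in zip(xs, ys))
    return math.log(sse / n) + 2 * p


# Hypothetical clean data (9 points); NOT the data used for Table 1.
BASE_X = [-1.0, -0.75, -0.5, -0.25, 0.0, 0.25, 0.5, 0.75, 1.0]
BASE_Y = [-1.1, -0.7, -0.6, -0.2, 0.1, 0.2, 0.6, 0.7, 1.1]


def aic_with_added_point(x10, y10):
    """AIC after appending a tenth observation (x10, y10) to the base data."""
    return simple_ls_aic(BASE_X + [x10], BASE_Y + [y10])


# Vertical outlier: keep x10 = 0 and move y10 further from the fitted line.
vertical = {y10: aic_with_added_point(0.0, y10) for y10 in (0.0, 1.5, 3.0, 30.0)}
```

In this sketch the AIC keeps growing as $y_{10}$ grows, because the added residual enters SSE without any bound.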

1.1. A Robust Regression Estimate. The $M$-estimator [5] $\hat{\beta}_M$ is the value of $\beta$ that minimizes the following function:

$$ \sum_{i=1}^{n} \rho\left(y_i - \mu - \mathbf{x}_i^T \beta\right), \tag{3} $$

where $\rho$ is symmetric and nondecreasing on $[0, \infty)$. Furthermore, $\rho(0) = 0$ and $\rho$ is continuously differentiable almost everywhere. In addition, $\rho$ is a function which is less sensitive to outliers than squares, yielding the estimating equation

$$ \sum_{i=1}^{n} \psi\left(y_i - \hat{\mu} - \mathbf{x}_i^T \hat{\beta}\right) \mathbf{x}_i = 0, \tag{4} $$

where $\psi = \rho'$. If we choose the $\rho$ function in (3) as the Tukey Biweight function

$$ \rho(r, d) = \begin{cases} 3\left(\dfrac{r}{d}\right)^2 - 3\left(\dfrac{r}{d}\right)^4 + \left(\dfrac{r}{d}\right)^6 & \text{if } |r| \le d, \\ 1 & \text{elsewhere}, \end{cases} \tag{5} $$

where $d = 1.5476$ yields $b = E_\Phi[\rho(Z, d)] = 2(1 - F_0(d))$, with $\Phi$ the standard normal cumulative distribution function and $Z \sim N(0, 1)$, the resulting estimator is then the Biweight $S$-estimator. $M$-estimators are efficient and highly robust to unusual values of $Y$, but one rogue leverage point can break them down completely. For this reason, generalized $M$-estimators were introduced, which solve

$$ \sum_{i=1}^{n} w(\mathbf{x}_i)\, \psi\left(y_i - \hat{\mu} - \mathbf{x}_i^T \hat{\beta}\right) \mathbf{x}_i = 0, \tag{6} $$

where $w$ is a weight function [6].

In recent years, a good deal of attention in the literature has been focused on high-breakdown methods, that is, methods that are resistant to even multiple severe outliers. Many methods are based on minimizing a more robust scale estimate than the sum of squared residuals. For example, Rousseeuw [7] proposed least median of squares (LMS), a high-breakdown method that minimizes the median of the squared residuals rather than their sum. In addition, Rousseeuw [8] proposed least trimmed squares (LTS), which minimizes the sum of the $H$ smallest squared residuals, defined as

$$ \hat{\beta}_{(\mathrm{LTS}, H, N)} = \arg\min_{\beta} \sum_{i=1}^{H} r_{(i)}^2(\beta), \tag{7} $$

based on the ordered absolute residuals $|r_{(1)}| \le |r_{(2)}| \le \cdots \le |r_{(n)}|$. LTS converges at the rate of $n^{1/2}$, with the same asymptotic efficiency under normality as Huber's skip estimator. The convergence rate of LMS is $n^{1/3}$, and its objective function is less smooth than that of LTS.
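The loss functions reviewed in this subsection can be sketched in a few lines. The block below is an illustration under stated assumptions, not a fitting routine: the function names are hypothetical, and the LTS and LMS criteria are evaluated for a given vector of residuals rather than minimized over $\beta$.

```python
import statistics


def tukey_biweight_rho(r, d=1.5476):
    """Tukey biweight rho of equation (5): rises from 0 to 1, then stays at 1."""
    if abs(r) <= d:
        u2 = (r / d) ** 2
        return 3 * u2 - 3 * u2 ** 2 + u2 ** 3
    return 1.0


def lts_objective(residuals, h):
    """Sum of the h smallest squared residuals, the quantity minimized in (7)."""
    squared = sorted(r * r for r in residuals)
    return sum(squared[:h])


def lms_objective(residuals):
    """Median of the squared residuals, the LMS criterion."""
    return statistics.median(r * r for r in residuals)
```

With residuals `[1.0, -2.0, 10.0, 0.5]` and `h = 2`, the residual 10.0 does not enter the LTS objective at all, and it affects the LMS objective only through its rank.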

1.2. The Robust Version of AIC. Consider the scale estimate of the errors defined by

$$ S^2 = \frac{\mathrm{SSE}}{n - p}, \tag{8} $$

with $\mathrm{SSE} = \sum_{i=1}^{n} r_i^2$ and $r_i = y_i - \hat{\mu} - \mathbf{x}_i^T \hat{\beta}$. By expressing SSE in (2) in terms of $S$, the AIC in (2) can be written as

$$ \mathrm{AIC}_S = \ln\left(\frac{(n - p)\, S^2}{n}\right) + 2p. \tag{9} $$

Notice that large values of AIC indicate that the model (and hence the explanatory variables) is less successful in explaining the variations in the response, while a small value of AIC indicates an excellent fit (i.e., model) to the response data.
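A short numerical check of the rewriting in (9) may help: since $(n - p) S^2 = \mathrm{SSE}$, the two forms must agree. The residuals and names below are hypothetical, and $S^2$ is taken to be $\mathrm{SSE}/(n - p)$.

```python
import math


def scale_squared(residuals, p):
    """S^2 = SSE / (n - p), the scale estimate in (8); requires n > p."""
    n = len(residuals)
    return sum(r * r for r in residuals) / (n - p)


def aic_from_scale(s2, n, p):
    """AIC_S = ln((n - p) * S^2 / n) + 2p, as in (9)."""
    return math.log((n - p) * s2 / n) + 2 * p


residuals = [0.5, -1.5, 1.0, -0.5, 0.5, 0.0]  # hypothetical residuals
p = 2
n = len(residuals)
aic_direct = math.log(sum(r * r for r in residuals) / n) + 2 * p
aic_via_scale = aic_from_scale(scale_squared(residuals, p), n, p)
```

Writing the criterion through $S^2$ is what makes it possible to plug in a different estimate of scale without changing anything else.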

Ronchetti [3] proposed a robust counterpart of the AIC statistic. The extension of AIC to RAIC is inspired by the extension of maximum likelihood estimation to $M$-estimation. The author derived RAIC for an error distribution with density function $f(\epsilon) = \exp(-\rho(\epsilon))$. For a given constant $\alpha$ and a given function $\rho$, the author chose the model that minimizes

$$ \mathrm{RAIC}(p; \alpha, \rho) = \sum_{i=1}^{n} \rho\left(\frac{y_i - \hat{\mu} - \mathbf{x}_i^T \hat{\beta}_M}{\hat{\sigma}}\right) + \alpha p, \tag{10} $$

where $\hat{\sigma}$ is some robust estimate of $\sigma$ and $\hat{\beta}_M$ is the $M$-estimator defined as in (3). Huber [9] suggested

$$ \alpha = \frac{2E\left[\psi^2\left((y_i - \hat{\mu} - \mathbf{x}_i^T \hat{\beta})/\hat{\sigma}\right)\right]}{E\left[\psi'\left((y_i - \hat{\mu} - \mathbf{x}_i^T \hat{\beta})/\hat{\sigma}\right)\right]} $$

and $k = 1.345$.

We introduce an alternative robust version of AIC by replacing $S$ in (8) with a robust estimator of scale which attains a 50% breakdown point. When $H = n/2$, (7) finds the estimates corresponding to the half sample having the smallest sum of squared residuals. As expected, the breakdown point is then 50%, and the estimated scale from LTS is

$$ \hat{\sigma}^2_{(\mathrm{LTS}, H, N)} = \frac{1}{H} \sum_{i=1}^{H} r_{(i)}^2\left(\hat{\beta}_{(\mathrm{LTS}, H, N)}\right). \tag{11} $$

Among other robust estimators, the $M$-estimator and the Biweight $S$-estimator are compared to least trimmed squares. Based on the results shown in Table 2, it is evident that the $M$-estimator is much more robust than LS but suffers from leverage points. The Biweight $S$-estimator and LTS show robust behavior: $\mathrm{AIC}_{BS}$ is stable even as the size of the outliers increases. In the next section, we generalize these findings by computing the associated influence functions.
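The proposal can be sketched as follows, under two assumptions that are not spelled out in the text: the LTS scale of (11) is substituted for $S^2$ in (9), and the residuals are taken as given (in practice they would come from an LTS fit, which is not implemented here). All names and numbers are hypothetical.

```python
import math


def lts_scale_squared(residuals, h=None):
    """Equation (11): mean of the h smallest squared residuals (h = n // 2 by default)."""
    n = len(residuals)
    if h is None:
        h = n // 2
    squared = sorted(r * r for r in residuals)
    return sum(squared[:h]) / h


def aic_lts(residuals, p):
    """Plug the LTS scale into (9): ln((n - p) * sigma2_LTS / n) + 2p (assumed form)."""
    n = len(residuals)
    return math.log((n - p) * lts_scale_squared(residuals) / n) + 2 * p


def aic_ls(residuals, p):
    """Classical form (2) for comparison: ln(SSE / n) + 2p."""
    n = len(residuals)
    return math.log(sum(r * r for r in residuals) / n) + 2 * p


# Hypothetical residuals: seven ordinary values plus one outlier of varying size.
clean = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.3]
mild = clean + [50.0]
severe = clean + [5000.0]
```

In this sketch `aic_lts(mild, 1)` and `aic_lts(severe, 1)` coincide, because the outlying residual never enters the trimmed sum, while `aic_ls` keeps increasing with the size of the outlier.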

2. Influence Function

Consider the linear model in (1) for $i = 1, \ldots, n$. Assume that the distribution of the errors satisfies $F_\sigma(x) = F_0(x/\sigma)$, where $\sigma$ is the residual scale parameter and $F_0$ is symmetric with a valid probability density function.

Let $\mathbf{x}$ and $y$ be independent stochastic variables with distribution $H$. The functional $T$ is Fisher-consistent for the parameters $(\mu, \beta)$ at the model distribution $H$ as follows:

$$ T(H) = \begin{bmatrix} a(H) \\ \mathbf{b}(H) \end{bmatrix} = \begin{bmatrix} \mu \\ \beta \end{bmatrix}. \tag{12} $$

For a Fisher-consistent scale estimator, $F_\sigma(x) = F(x/\sigma)$ for all $\sigma > 0$. In general, the influence function of $T$ at the distribution $H$ is defined as

$$ IF((\mathbf{x}, y); T, H) = \lim_{\epsilon \to 0} \frac{T\left((1 - \epsilon) H + \epsilon \Delta_{(\mathbf{x}, y)}\right) - T(H)}{\epsilon} = \frac{\partial}{\partial \epsilon}\left(T(\Delta_{(\mathbf{x}, y)})\right)\Big|_{\epsilon = 0}, \tag{13} $$

where $T(H)$ is the functional defined as the solution of the objective model and $\Delta_{(\mathbf{x}, y)}$ is the distribution which contains outliers. The following theorem gives the influence function of AIC with any scale $S$.

Table 2: Robust AIC for different values of $y_{10}$ (vertical) and $x_{10}$ (leverage).

y_10    AIC_M   AIC_BS  AIC_LTS    x_10    AIC_M   AIC_BS  AIC_LTS
 3      3.7     3.9     3.4         4.5    3.9     3.9     3.4
 2.5    3.7     3.9     3.4         4      3.7     3.9     3.4
 2      3.7     3.9     3.4         3.5    3.6     3.9     3.4
 1.5    3.7     3.9     3.4         3      3.6     3.9     3.4
 1      3.7     3.9     3.4         2.5    3.7     3.9     3.4
 0.5    3.7     3.9     3.4         0      3.5     3.6     3.2
 0      3.6     3.6     3.2        -2.5    3.7     3.9     3.4
-0.5    3.7     3.9     3.4        -3      3.7     3.9     3.4
-1      3.7     3.9     3.4        -3.5    3.9     3.9     3.4
-1.5    3.7     3.9     3.4        -4      4       3.9     3.4

Theorem 1. Let $H$ be some distribution other than $F$. Take $(X, Y) \sim H$ and denote by $\epsilon$ the error term of the model. Assume that $S$ is differentiable with partial derivatives equal to zero at the origin $(0, 0)$. Then

$$ IF((\mathbf{x}, y); \mathrm{AIC}_S, H) = \frac{2n}{n - p}\, IF\left(\frac{r_{ip}}{\sigma_p}; S, F_0\right), \tag{14} $$

where $r_{ip} = y_i - \hat{\mu} - \mathbf{x}_i^T \hat{\beta}_p$ and $\hat{\sigma}_p^2 = \mathrm{SSE}/(n - p)$ are computed from sum model. (The proof is in the Appendix.)

It is clear that the influence function in (14) is bounded if $IF(r_{ip}/\sigma_p; S, F_0)$ is bounded. It is evident that AIC is nonrobust, since the LS estimate has an unbounded influence function. The influence function of $M$-estimation with respect to $Y$ is bounded by the choice of $\psi$, but it is unbounded with respect to the $X$ direction. That is,

$$ IF((\mathbf{x}, y); T, F) = M^{-1}(\psi, F)\, \mathbf{x}\, \psi\left(y - \mathbf{x}^T T(F)\right), \tag{15} $$

where $\psi = \rho'$ and $M$ is a certain $p \times p$ matrix given by

$$ M(\psi, F) = \int \psi'\left(y - \mathbf{x}^T T(F)\right) \mathbf{x} \mathbf{x}^T \, dF(\mathbf{x}, y). \tag{16} $$

The influence function of the AIC using LTS estimators, following Theorem 1, takes the form

$$ IF((\mathbf{x}, y); \mathrm{AIC}_{LTS}, H) = \frac{2n}{n - p} \cdot IF\left(\frac{r_{ip}}{\sigma_p}; \hat{\sigma}^2_{LTS}, F_0\right). \tag{17} $$

The influence function of $\mathrm{AIC}_{LTS}$ is bounded in both the $Y$ and $X$ directions, as $IF(r_{ip}/\sigma_p; \hat{\sigma}^2_{LTS}, F_0)$ is bounded. Moreover, we conclude that AIC with a high breakdown point estimator provides reliable fits to the data in the presence of leverage points.
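A finite-sample analogue of the influence function is the sensitivity curve, $(n + 1)\,[T(r_1, \ldots, r_n, z) - T(r_1, \ldots, r_n)]$, which measures the effect of one added value $z$. The sketch below applies it to the logarithm of an LS-type scale and of a trimmed (LTS-type) scale. It is an illustration only: the sample is hypothetical, the statistics act on fixed residuals, and no constants from Theorem 1 are reproduced.

```python
import math


def log_ls_scale(residuals):
    """ln of the mean squared residual (LS-type scale, up to a constant factor)."""
    return math.log(sum(r * r for r in residuals) / len(residuals))


def log_trimmed_scale(residuals):
    """ln of the mean of the h = n // 2 smallest squared residuals (LTS-type scale)."""
    h = len(residuals) // 2
    squared = sorted(r * r for r in residuals)
    return math.log(sum(squared[:h]) / h)


def sensitivity_curve(stat, sample, z):
    """(n + 1) * (stat(sample + [z]) - stat(sample)) for one added value z."""
    n = len(sample)
    return (n + 1) * (stat(sample + [z]) - stat(sample))


SAMPLE = [-1.2, -0.7, -0.3, -0.1, 0.2, 0.4, 0.8, 1.1, 1.5]  # hypothetical residuals
Z_VALUES = (5.0, 50.0, 500.0)
ls_curve = [sensitivity_curve(log_ls_scale, SAMPLE, z) for z in Z_VALUES]
trimmed_curve = [sensitivity_curve(log_trimmed_scale, SAMPLE, z) for z in Z_VALUES]
```

For this sample the LS-type curve keeps growing with $z$, while the trimmed curve takes the same value for every $z$ large enough to fall outside the retained half.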

3. Numerical Examples

In this section, $\mathrm{AIC}_{LTS}$, $\mathrm{AIC}_{LMS}$, and $\mathrm{AIC}_{BS}$ are compared with AIC and RAIC. In this study, 50 independent replicates of 3 independent uniform random variables on $[-1, 1]$, $x_{i1}$, $x_{i2}$, and $x_{i3}$, and 50 independent normally distributed errors $\epsilon_i \sim N(0, 9)$ were generated. The true model is given by $y_i = x_{i1} + x_{i2} + \epsilon_i$, for $i = 1, \ldots, 50$, using two variables $x_{i1}$ and $x_{i2}$. We computed the AIC based on each of the following estimation criteria: (i) LS estimation, (ii) $M$-estimator, (iii) BS estimator, (iv) LTS estimator, and (v) LMS estimator. In order to illustrate the robustness to outliers, we consider the following cases:

(a) vertical outliers (outliers in the $y$ only),
(b) good leverage points (outliers in the $y$ and $\mathbf{x}$),
(c) bad leverage points (outliers in the $\mathbf{x}$ only).

For all situations, we randomly generate (0%, 5%, 10%, 20%, 30%, and 40%) of outliers from $N(50, 0.1^2)$ and $N(100, 0.5^2)$ distributions, respectively. For each of these settings, we simulate 1000 samples.
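One possible reading of this design is sketched below. Several details are assumptions, since the text does not fix them: $N(0, 9)$ is read as variance 9, contaminated responses are drawn from $N(50, 0.1^2)$ and contaminated explanatory variables from $N(100, 0.5^2)$, all three explanatory variables of a contaminated row are replaced, and the number of contaminated rows is the rounded fraction of $n$. The function name and the case labels are hypothetical.

```python
import random


def simulate_sample(n=50, contamination=0.10, case="vertical", seed=0):
    """Return a list of (x, y) rows, x being a list of three explanatory values."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        eps = rng.gauss(0.0, 3.0)  # assumption: N(0, 9) means variance 9
        rows.append((x, x[0] + x[1] + eps))  # true model uses x1 and x2 only

    n_out = round(contamination * n)  # assumption: rounded count of outliers
    for idx in rng.sample(range(n), n_out):
        x, y = rows[idx]
        if case in ("vertical", "good_leverage"):
            y = rng.gauss(50.0, 0.1)  # assumption: outlying responses
        if case in ("good_leverage", "bad_leverage"):
            x = [rng.gauss(100.0, 0.5) for _ in range(3)]  # assumption: outlying x
        rows[idx] = (x, y)
    return rows


vertical_rows = simulate_sample(50, 0.10, "vertical", seed=1)
leverage_rows = simulate_sample(50, 0.10, "bad_leverage", seed=1)
```

Repeating such a draw 1000 times per setting and recording which subset each criterion selects would give percentage tables of the kind reported in the paper.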

3.1. Simulation Result. The resulting fit to the data is classified as one of the following:

(i) correct fit (true model),
(ii) overfit (models containing all the variables in the true model plus other variables that are redundant),
(iii) under fit (models with only a strict subset of the variables in the true model),
(iv) wrong fit (models that are neither of the above).
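The four categories above can be expressed with set comparisons. The function below is a small sketch with a hypothetical name; variables are identified by arbitrary labels such as 1, 2, 3.

```python
def classify_fit(selected, true_vars):
    """Classify a selected variable set against the true variable set."""
    selected, true_vars = set(selected), set(true_vars)
    if selected == true_vars:
        return "correct fit"
    if selected > true_vars:   # all true variables plus redundant ones
        return "overfit"
    if selected < true_vars:   # a strict subset of the true variables
        return "under fit"
    return "wrong fit"         # neither of the above
```

With the true model $\{x_1, x_2\}$ of the simulation, `classify_fit({1, 2, 3}, {1, 2})` gives "overfit", `classify_fit({1}, {1, 2})` gives "under fit", and `classify_fit({1, 3}, {1, 2})` gives "wrong fit".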

Tables 3, 4, and 5 show detailed simulation results for the different versions of the AIC methods. For uncontaminated datasets, the classical AIC performs best compared to the robust AIC. When vertical outliers are introduced, the classical AIC selects a large proportion of wrong fit models and, as expected, the robust AIC usually (i.e., with higher proportion) selects the correct model.

For bad leverage points, we observe that AIC tends to produce an overfit, and with a high level of contamination it produces a higher proportion of wrong fits. $\mathrm{AIC}_M$ tends to produce either an under fit or a wrong fit model. However, the robust estimates produce comparable power in the presence of bad leverage points.

For good leverage points, AIC also tends to produce an overfit. On the other hand, the robust AIC tends to produce either a correct fit or an under fit model.

Table 3: Percentage of selected models from classical AIC, robust RAIC, AIC_LTS, AIC_LMS, and AIC_BS with vertical outliers.

ε (%)   Fit            AIC     RAIC    AIC_LTS  AIC_LMS  AIC_BS
0       Correct fit    84.6    54.4    57.6     65.2     45.2
        Overfit        15.4    0       0        0        0
        Under fit      0       43.6    41.2     34       54.6
        Wrong fit      0       2.0     1.2      0.8      0.2
5       Correct fit    2.8     49.6    56.8     62.6     45.4
        Overfit        2       0       0        0        0
        Under fit      22      51      42.8     36.6     54.6
        Wrong fit      73.2    1.4     0.4      0.8      0
10      Correct fit    3.2     45.0    51       56       39.8
        Overfit        0.8     0       0        0        0
        Under fit      20.2    52.4    48       43.8     55.2
        Wrong fit      75.8    2.6     1.0      0.2      0
20      Correct fit    4.2     30.8    49.4     58       34.2
        Overfit        0.2     0       0        0        0
        Under fit      22.8    67.8    49.6     41.4     65.8
        Wrong fit      72.8    1.4     1.0      0.6      0
30      Correct fit    1.6     0       44.0     51.2     23.2
        Overfit        0.2     0       0        0        0
        Under fit      22.6    73.8    55.4     48.2     76.8
        Wrong fit      72.8    26.2    0.6      0.6      0
40      Correct fit    2.2     0       37.6     41.2     12.4
        Overfit        0.6     0       0        0        0
        Under fit      22.8    70.8    62.4     58.4     87.4
        Wrong fit      74.4    29.2    0        0.4      0.2

3.2. Example 2 (Stack Loss Data). The stack loss data were presented by [10]. This data set consists of 21 observations on three independent variables, and it contains four outliers (cases 1, 3, 4, and 21) and high leverage points (cases 1, 2, 3, and 21). The data are given in Table 6.

We applied the traditional and robust versions of the AIC methods to the data. Table 7 shows that the classical method selects the full model and the robust RAIC method ignored one of the important variables ($x_{i2}$), whereas the robust AIC methods based on high breakdown point estimators agreed on the importance of the two variables $x_{i1}$ and $x_{i2}$.
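An all-subsets computation of the classical criterion on the stack loss data can be sketched as follows. This is an illustration under assumptions: it uses the form $\ln(\mathrm{SSE}/n) + 2p$ of equation (2) with $p$ equal to the number of explanatory variables in the subset, plain LS via the normal equations, and hypothetical function names. Its values need not coincide with those reported in Table 7, and the robust variants are not implemented.

```python
import math
from itertools import combinations

# Stack loss data as listed in Table 6: (x1, x2, x3, y).
STACK_LOSS = [
    (80, 27, 89, 42), (80, 27, 88, 37), (75, 25, 90, 37), (62, 24, 87, 28),
    (62, 22, 87, 18), (62, 23, 87, 18), (62, 24, 93, 19), (62, 24, 93, 20),
    (58, 23, 87, 15), (58, 18, 80, 14), (58, 18, 89, 14), (58, 17, 88, 13),
    (58, 18, 82, 11), (58, 19, 93, 12), (50, 18, 89, 8), (50, 18, 86, 7),
    (50, 19, 72, 8), (50, 19, 79, 8), (50, 20, 80, 9), (56, 20, 82, 15),
    (70, 20, 91, 15),
]


def solve_linear_system(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def ls_sse(columns):
    """SSE of the LS fit of y on an intercept plus the chosen columns (0-based)."""
    design = [[1.0] + [float(row[c]) for c in columns] for row in STACK_LOSS]
    y = [float(row[3]) for row in STACK_LOSS]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    return sum((yi - sum(c * v for c, v in zip(coef, d))) ** 2
               for d, yi in zip(design, y))


def classical_aic_all_subsets():
    """Map each non-empty subset of (x1, x2, x3) to ln(SSE / n) + 2p."""
    n = len(STACK_LOSS)
    return {
        columns: math.log(ls_sse(columns) / n) + 2 * len(columns)
        for size in (1, 2, 3)
        for columns in combinations(range(3), size)
    }
```

Calling `classical_aic_all_subsets()` returns one value per subset, keyed by 0-based column indices, for example `(0, 1)` for the model with $x_1$ and $x_2$.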

4. Conclusions

The least trimmed squares (LTS) and the least median of squares (LMS) are robust regression methods frequently used in practice. Nevertheless, they are not commonly used for selecting models. This paper introduced the Akaike Information Criterion (AIC) based on LTS and LMS scales, which are robust against outliers and leverage points. Our simulation results illustrated excellent performance of $\mathrm{AIC}_{LTS}$ and $\mathrm{AIC}_{LMS}$ for contaminated data sets. This paper focused on the AIC variable selection criterion; one might be interested in extending other robust model selection criteria to advanced robust breakdown point estimation methods such as LTS, LMS, or BS estimators. In addition, this paper has considered regression models with continuous variables; future studies might consider mixed variables (i.e., continuous and dummy).

Table 4: Percentage of selected models from classical AIC, robust RAIC, and robust AIC_LTS, AIC_LMS, and AIC_BS with bad leverage points.

ε (%)   Fit            AIC     RAIC    AIC_LTS  AIC_LMS  AIC_BS
5       Correct fit    0       0       54.6     60.8     43.8
        Overfit        70.4    0       0        0        0
        Under fit      0       64.4    44       38.4     55.4
        Wrong fit      29.6    35.6    1.4      0.8      0.8
10      Correct fit    0       0       63.8     67.8     51.0
        Overfit        63      0       0        0        0
        Under fit      0       54.6    34.6     31.4     48.8
        Wrong fit      37      42.4    1.6      0.8      0.2
20      Correct fit    0       0       56.6     63.4     49.2
        Overfit        54.8    0       0        0        0
        Under fit      0.2     60.8    42.4     35.8     50.6
        Wrong fit      44.8    39.2    1.0      0.8      0.2
30      Correct fit    0       0       56.6     61.4     46
        Overfit        37.6    0       0        0        0
        Under fit      0.25    29.6    42.6     36       53.8
        Wrong fit      60.4    42.6    0.8      2.4      0.2
40      Correct fit    1.0     0       55.2     64.6     51.4
        Overfit        13.8    0       0        0        0
        Under fit      1.2     54.4    43.8     33.8     48.6
        Wrong fit      81      45.4    1.0      1.6      0

Table 5: Percentage of selected models from classical AIC, robust RAIC, and robust AIC_LTS, AIC_LMS, and AIC_BS with good leverage points.

ε (%)   Fit            AIC     RAIC    AIC_LTS  AIC_LMS  AIC_BS
5       Correct fit    0.2     47.0    53.6     58.4     42.6
        Overfit        99.8    0       0        0        0
        Under fit      0       50.4    46       41.2     57.4
        Wrong fit      0       2.6     0.4      0.4      0
10      Correct fit    0       44.2    54.6     59.4     40.4
        Overfit        99.6    0       0        0        0
        Under fit      0.2     53.8    44.6     39.8     59.6
        Wrong fit      0.2     2.0     0.8      0.8      0
20      Correct fit    0.8     32.4    50.4     56.2     33.2
        Overfit        97.6    0       0        0        0
        Under fit      0.8     66.6    48.4     43       66.4
        Wrong fit      0.8     1.0     1.2      0.8      0.2
30      Correct fit    1.8     0       46.0     50.6     27.0
        Overfit        97.8    96.8    0        0        0
        Under fit      2.8     2.8     53.6     49.2     72.8
        Wrong fit      0.6     0.4     0.4      0.2      0.2
40      Correct fit    0.2     0       37.4     37       13.8
        Overfit        97.4    100     0        0        0
        Under fit      2.2     0       62.6     62.6     86.2
        Wrong fit      0.2     0       0        0.4      0

Table 6: Stack loss data set.

x1    x2    x3    y
80    27    89    42
80    27    88    37
75    25    90    37
62    24    87    28
62    22    87    18
62    23    87    18
62    24    93    19
62    24    93    20
58    23    87    15
58    18    80    14
58    18    89    14
58    17    88    13
58    18    82    11
58    19    93    12
50    18    89    8
50    18    86    7
50    19    72    8
50    19    79    8
50    20    80    9
56    20    82    15
70    20    91    15

Table 7: The result of variable selection for the stack loss data.

Selected variables   AIC     RAIC    AIC_LTS  AIC_LMS  AIC_BS
x1                   6.7     8.0     4.7      4.3      5.5
x2                   7.1     6.5     5.9      5.4      7.0
x3                   8.4     7.3     7.0      6.4      7.3
x1, x2               8.2     9.0     5.5      4.7      6.9
x1, x3               8.7     8.9     6.9      6.3      7.6
x2, x3               9.1     9.0     8.1      7.3      8.8
x1, x2, x3           4.7     10.6    7.6      6.7      9.1

Appendix

Proof of Theorem 1. Consider

$$
\begin{aligned}
IF((\mathbf{x}, y); \mathrm{AIC}, H)
&= \frac{\partial}{\partial \epsilon}\left(\mathrm{AIC}(\Delta_{(\mathbf{x}, y)})\right)\Big|_{\epsilon = 0} \\
&= \frac{\partial}{\partial \epsilon}\left(\ln\left(\frac{n - p}{n} \cdot S_p^2(H_\epsilon)\right) + 2p\right)\Big|_{\epsilon = 0} \\
&= \frac{n}{(n - p) \cdot S_p^2(H)} \cdot \frac{\partial}{\partial \epsilon}\left(S_p^2(H_\epsilon)\right)\Big|_{\epsilon = 0} \qquad \text{(A1)} \\
&= \frac{n}{(n - p)\, \sigma_p^2} \cdot 2\sigma_p\, \frac{\partial}{\partial \epsilon}\left(S_p(H_\epsilon)\right)\Big|_{\epsilon = 0}, \qquad \text{(A2)}
\end{aligned}
$$

where

$$
\begin{aligned}
\mathrm{AIC}(\Delta_{(\mathbf{x}, y)}) &= \ln\left(\frac{n - p}{n} \cdot S_p^2(H_\epsilon)\right) + 2p, \\
\mathrm{AIC}(H) &= \ln\left(\frac{n - p}{n} \cdot S_p^2(H)\right) + 2p, \\
S_p^2(H) &= \sigma_p^2,
\end{aligned}
$$

$$
\begin{aligned}
\frac{\partial}{\partial \epsilon} S_p(H_\epsilon)\Big|_{\epsilon = 0}
&= \frac{\partial}{\partial \epsilon} S\left((1 - \epsilon) F_{\sigma_p} + \epsilon \Delta_{r_{ip}}\right)\Big|_{\epsilon = 0} \\
&= \sigma_p\, \frac{\partial}{\partial \epsilon} S\left((1 - \epsilon) F_0 + \epsilon \Delta_{r_{ip}/\sigma_p}\right)\Big|_{\epsilon = 0} \\
&= \sigma_p\, IF\left(\frac{r_{ip}}{\sigma_p}; S, F_0\right), \qquad \text{(A3)}
\end{aligned}
$$

where

$$ r_{ip} = y_i - \hat{\mu} - \mathbf{x}_i^T \hat{\beta}_p. \tag{A4} $$

Inserting (A3) into (A2) yields

$$ IF((\mathbf{x}, y); \mathrm{AIC}, H) = \frac{2n}{n - p}\, IF\left(\frac{r_{ip}}{\sigma_p}; S, F_0\right). \tag{A5} $$

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to thank Professor Nor Aishah Hamzah and Dr. Rossita M. Yunus for their support to complete this study. The author is also grateful to the anonymous reviewers for their valuable comments on an earlier draft of this paper. This research has been funded by University of Malaya under Grant no. RG208-11AFR.

References

[1] H. Akaike, "Information theory and an extension of the maximum likelihood principle," in Breakthroughs in Statistics, pp. 610–624, Springer, 1992.

[2] R. D. Martin, "Robust estimation of autoregressive models," in Directions in Time Series, pp. 228–254, 1980.

[3] E. Ronchetti, Robust testing in linear models: The infinitesimal approach [doctoral thesis], University of Geneva, Geneva, Switzerland, 1982.

[4] K. Tharmaratnam and G. Claeskens, "A comparison of robust versions of the AIC based on M-, S- and MM-estimators," Statistics, vol. 47, no. 1, pp. 216–235, 2013.

[5] P. J. Huber, "Robust estimation of a location parameter," Annals of Mathematical Statistics, vol. 35, no. 1, pp. 73–101, 1964.

[6] W. S. Krasker and R. E. Welsch, "Efficient bounded-influence regression estimation," Journal of the American Statistical Association, vol. 77, pp. 379–595, 1982.

[7] P. J. Rousseeuw, "Least median of squares regression," Journal of the American Statistical Association, vol. 79, no. 388, pp. 871–880, 1984.

[8] P. Rousseeuw, "Multivariate estimation with high breakdown point," in Mathematical Statistics and Applications, W. Grossmann, G. Pflug, I. Vinceze, and W. Wertz, Eds., pp. 283–297, Reidel, Dordrecht, The Netherlands, 1985.

[9] P. J. Huber, Robust Statistics, Springer, New York, NY, USA, 2011.

[10] K. A. Brownlee, Statistical Theory and Methodology in Science and Engineering, John-Wiley, New York, NY, USA, 2009.


Page 2: Research Article Robust AIC with High Breakdown Scale Estimatedownloads.hindawi.com/journals/jam/2014/286414.pdf · 1. Introduction Akaike Information Criterion (AIC) [ ] is a powerful

2 Journal of Applied Mathematics

minus10 minus05 00 05 10

3

2

1

0

minus1

minus2

x

y

minus4 minus2 0 2 4

10

05

00

minus05

minus10

x

y

Figure 1 Data and positions for 11991010and 119909

10points

6

4

2

0

0 0

minus2

minus20 minus10 10 20 minus20 minus10 10 20

AIC AIC

4

3

2

1

0

minus1

minus2

y10 x10

Figure 2 Effect of adding one observation (0 11991010) or (119909

10 0) on the values of AIC

Table 1 AIC for different values of 11991010and 119909

10

11991010

AIC 11990910

AIC3 32 45 3325 29 4 322 24 35 3015 19 3 281 12 25 2605 02 0 minus050 minus05 minus25 26minus05 03 minus3 29minus1 12 minus35 31minus15 19 minus4 32

11 A Robust Regression Estimate The119872-estimator [5] 119872is

the value of 120573 that minimizes the following function

119894=119899

sum

119894=1

120588 (119910119894minus 120583 minus x119879

119894120573) (3)

where 120588 is symmetric and nondecreasing on [0infin) Further-more 120588(0) = 0 and 120588 is almost continuously differentiableanywhere Furthermore120588 is a function which is less sensitiveto outliers than squares yielding the estimating equation

119894=119899

sum

119894=1

120595 (119910119894minus 120583 minus x119879

119894) x119894= 0 (4)

where $\psi = \rho'$. If we choose the $\rho$ function in (3) as the Tukey biweight function

$$\rho(r, d) = \begin{cases} 3\left(\dfrac{r}{d}\right)^{2} - 3\left(\dfrac{r}{d}\right)^{4} + \left(\dfrac{r}{d}\right)^{6}, & \text{if } |r| \le d, \\ 1, & \text{elsewhere}, \end{cases} \tag{5}$$

where $d = 1.5476$ yields $b = E_{\Phi}[\rho(Z, d)] = 2(1 - F_0(d))$, with $\Phi$ the standard normal cumulative distribution function and $Z \sim N(0, 1)$, the resulting estimator is then the biweight $S$-estimator. $M$-estimators are efficient and highly robust to unusual values of $Y$, but one rogue leverage point can break them down completely. For this reason, generalized $M$-estimators were introduced, which solve

$$\sum_{i=1}^{n} w(\mathbf{x}_i)\, \psi\left(y_i - \mu - \mathbf{x}_i^T \hat{\beta}\right) \mathbf{x}_i = 0, \tag{6}$$

where $w$ is a weight function [6].

In recent years, a good deal of attention in the literature has been focused on high-breakdown methods, that is, methods that remain resistant to even multiple severe outliers. Many methods are based on minimizing a more robust scale estimate than the sum of squared residuals. For example, Rousseeuw [7] proposed LMS, a high-breakdown method that minimizes the median of the squared residuals rather than their sum. In addition, Rousseeuw [8] proposed least trimmed squares (LTS), which minimizes the sum of the $H$ smallest squared residuals, defined as

$$\hat{\beta}_{\mathrm{LTS}(H,N)} = \arg\min_{\beta} \sum_{i=1}^{H} r_{(i)}^{2}(\beta), \tag{7}$$

based on the ordered absolute residuals $|r_{(1)}| \le |r_{(2)}| \le \cdots \le |r_{(n)}|$. LTS converges at the rate of $n^{1/2}$, with the same asymptotic efficiency under normality as Huber's skip estimator. The convergence rate of LMS is $n^{1/3}$, and its objective function is less smooth than that of LTS.
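As a small numerical companion to (5), the following Python sketch evaluates the Tukey biweight $\rho$ and approximates $E_{\Phi}[\rho(Z, d)]$ by a Riemann sum. The function names, the integration grid, and the use of NumPy are illustrative assumptions, not part of the original study. With $d = 1.5476$ the expectation comes out close to one half, the value usually associated with a 50% breakdown point.

```python
import numpy as np

def tukey_biweight_rho(r, d=1.5476):
    """Tukey biweight rho of equation (5); it equals 1 outside [-d, d]."""
    u = np.asarray(r, dtype=float) / d
    inside = 3 * u**2 - 3 * u**4 + u**6
    return np.where(np.abs(u) <= 1.0, inside, 1.0)

def expected_rho_under_normal(d=1.5476, half_width=10.0, n_grid=200_001):
    """Approximate E[rho(Z, d)] for Z ~ N(0, 1) with a simple Riemann sum.

    half_width and n_grid are arbitrary choices made for this illustration.
    """
    z = np.linspace(-half_width, half_width, n_grid)
    dz = z[1] - z[0]
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)  # standard normal density
    return float(np.sum(tukey_biweight_rho(z, d) * phi) * dz)

b = expected_rho_under_normal()
print(f"E[rho(Z, d)] for d = 1.5476: {b:.4f}")
```

Because $\rho$ is capped at 1, a single arbitrarily large residual contributes at most 1 to the objective, which is the mechanism behind the robustness to vertical outliers described above.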

1.2. The Robust Version of AIC. Consider the scale estimate of the errors defined by

$$S^{2} = \frac{\mathrm{SSE}}{n - p}, \tag{8}$$

with $\mathrm{SSE} = \sum_{i=1}^{n} r_i^{2}$ and $r_i = y_i - \mu - \mathbf{x}_i^T \hat{\beta}$. By expressing SSE in (2) in terms of $S$, the AIC in (2) can be written as follows:

$$\mathrm{AIC}_S = \ln\left(\frac{(n - p)\, S^{2}}{n}\right) + 2p. \tag{9}$$

Notice that large values of AIC indicate that the model (and hence its explanatory variables) is less successful in explaining the variation in the response, while a small value of AIC indicates a good fit of the model to the response data.

Ronchetti [3] proposed a robust counterpart of the AIC statistic. The extension of AIC to RAIC is inspired by the extension of maximum likelihood estimation to $M$-estimation. The author derived RAIC for an error distribution with density function $f(\epsilon) = \exp(-\rho(\epsilon))$. For a given constant $\alpha$ and a given function $\rho$, the author chose the model that minimizes

$$\mathrm{RAIC}(p; \alpha, \rho) = \sum_{i=1}^{n} \rho\left(\frac{y_i - \mu - \mathbf{x}_i^T \hat{\beta}_M}{\hat{\sigma}}\right) + \alpha p, \tag{10}$$

where $\hat{\sigma}$ is some robust estimate of $\sigma$ and $\hat{\beta}_M$ is the $M$-estimator defined as in (3). Huber [9] suggested

$$\alpha = \frac{2E\left[\psi^{2}\left(\dfrac{y_i - \mu - \mathbf{x}_i^T \hat{\beta}}{\hat{\sigma}}\right)\right]}{E\left[\psi'\left(\dfrac{y_i - \mu - \mathbf{x}_i^T \hat{\beta}}{\hat{\sigma}}\right)\right]}$$

and $k = 1.345$.

We introduce an alternative robust version of AIC by replacing $S$ in (8) with a robust estimator of scale that attains a 50% breakdown point. When $H = n/2$, (7) finds the estimates corresponding to the half sample having the smallest sum of squared residuals. As expected, the breakdown point is then 50%, and the estimated scale from LTS is

$$\hat{\sigma}^{2}_{\mathrm{LTS}(H,N)} = \frac{1}{H} \sum_{i=1}^{H} r_{(i)}^{2}\left(\hat{\beta}_{\mathrm{LTS}(H,N)}\right). \tag{11}$$

For comparison, the $M$-estimator and the biweight $S$-estimator are set against least trimmed squares. Based on the results shown in Table 2, it is evident that the $M$-estimator is much more robust than LS but suffers from leverage points. The biweight $S$-estimator and LTS show robust behavior: $\mathrm{AIC}_{\mathrm{BS}}$ is stable even as the size of the outliers increases. In the next section, we generalize these findings by computing the associated influence functions.
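The effect of swapping the scale estimate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the residuals are taken as given rather than obtained from an actual LTS fit, the trimmed scale is plugged into the AIC formula in place of $S^{2}$, and all function names and data values are hypothetical.

```python
import numpy as np

def aic_from_scale(s2, n, p):
    """AIC_S = ln((n - p) * S^2 / n) + 2p, with s2 playing the role of S^2."""
    return float(np.log((n - p) * s2 / n) + 2 * p)

def ls_scale2(residuals, p):
    """Classical scale: SSE / (n - p)."""
    r = np.asarray(residuals, dtype=float)
    return float(np.sum(r**2) / (r.size - p))

def lts_scale2(residuals, h=None):
    """Trimmed scale: mean of the h smallest squared residuals.

    Assumption: h defaults to n // 2, and the residuals are supplied by
    the caller instead of coming from an LTS regression fit.
    """
    r2 = np.sort(np.asarray(residuals, dtype=float) ** 2)
    if h is None:
        h = r2.size // 2
    return float(np.mean(r2[:h]))

# Hypothetical residuals from a model with p = 2 parameters.
clean = np.array([0.5, -1.0, 0.8, -0.3, 1.2, -0.7, 0.1, -0.9, 0.4, 0.6])
contaminated = clean.copy()
contaminated[4] = 100.0  # turn the largest residual into a gross outlier

n, p = clean.size, 2
aic_ls_clean = aic_from_scale(ls_scale2(clean, p), n, p)
aic_ls_cont = aic_from_scale(ls_scale2(contaminated, p), n, p)
aic_lts_clean = aic_from_scale(lts_scale2(clean), n, p)
aic_lts_cont = aic_from_scale(lts_scale2(contaminated), n, p)

print("LS-based AIC :", aic_ls_clean, "->", aic_ls_cont)
print("LTS-based AIC:", aic_lts_clean, "->", aic_lts_cont)
```

In this toy setting the classical value jumps once the outlier enters SSE, while the trimmed version is unchanged because the outlier never falls among the $H$ smallest squared residuals.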

Table 2: Robust AIC for different values of $y_{10}$ (vertical) and $x_{10}$ (leverage).

| $y_{10}$ | $\mathrm{AIC}_M$ | $\mathrm{AIC}_{\mathrm{BS}}$ | $\mathrm{AIC}_{\mathrm{LTS}}$ | $x_{10}$ | $\mathrm{AIC}_M$ | $\mathrm{AIC}_{\mathrm{BS}}$ | $\mathrm{AIC}_{\mathrm{LTS}}$ |
|---|---|---|---|---|---|---|---|
| 3 | 3.7 | 3.9 | 3.4 | 4.5 | 3.9 | 3.9 | 3.4 |
| 2.5 | 3.7 | 3.9 | 3.4 | 4 | 3.7 | 3.9 | 3.4 |
| 2 | 3.7 | 3.9 | 3.4 | 3.5 | 3.6 | 3.9 | 3.4 |
| 1.5 | 3.7 | 3.9 | 3.4 | 3 | 3.6 | 3.9 | 3.4 |
| 1 | 3.7 | 3.9 | 3.4 | 2.5 | 3.7 | 3.9 | 3.4 |
| 0.5 | 3.7 | 3.9 | 3.4 | 0 | 3.5 | 3.6 | 3.2 |
| 0 | 3.6 | 3.6 | 3.2 | −2.5 | 3.7 | 3.9 | 3.4 |
| −0.5 | 3.7 | 3.9 | 3.4 | −3 | 3.7 | 3.9 | 3.4 |
| −1 | 3.7 | 3.9 | 3.4 | −3.5 | 3.9 | 3.9 | 3.4 |
| −1.5 | 3.7 | 3.9 | 3.4 | −4 | 4 | 3.9 | 3.4 |

2. Influence Function

Consider the linear model in (1) for $i = 1, \ldots, n$. Assume that the distribution of the errors satisfies $F_{\sigma}(\mathbf{x}) = F_0(\mathbf{x}/\sigma)$, where $\sigma$ is the residual scale parameter and $F_0$ is symmetric with a valid probability density function.

Let $\mathbf{x}$ and $y$ be independent stochastic variables with distribution $H$. The functional $T$ is Fisher-consistent for the parameters $(\mu, \beta)$ at the model distribution $H$ as follows:

$$T(H) = \begin{bmatrix} a(H) \\ \mathbf{b}(H) \end{bmatrix} = \begin{bmatrix} \mu \\ \beta \end{bmatrix}. \tag{12}$$

For a Fisher-consistent scale estimator, $F_{\sigma}(\mathbf{x}) = F(\mathbf{x}/\sigma)$ for all $\sigma > 0$. In general, the influence function of $T$ at the distribution $H$ is defined as

$$\mathrm{IF}((\mathbf{x}, y); T, H) = \lim_{\epsilon \to 0} \frac{T\left((1 - \epsilon) H + \epsilon \Delta_{(\mathbf{x}, y)}\right) - T(H)}{\epsilon} = \left.\frac{\partial}{\partial \epsilon}\, T\left((1 - \epsilon) H + \epsilon \Delta_{(\mathbf{x}, y)}\right)\right|_{\epsilon = 0}, \tag{13}$$

where $T(H)$ is the functional defined as the solution of the objective function of the model and $\Delta_{(\mathbf{x}, y)}$ is the contaminating distribution, which contains the outliers (a point mass at $(\mathbf{x}, y)$). The following theorem gives the influence function of AIC with any scale $S$.

Theorem 1. Let $H$ be some distribution other than $F$. Take $(\mathbf{X}, Y) \sim H$ and denote by $\epsilon$ the error term of the model. Assume that $S$ is differentiable, with partial derivatives equal to zero at the origin $(0, 0)$. Then

$$\mathrm{IF}((\mathbf{x}, y); \mathrm{AIC}_S, H) = \frac{2n}{n - p}\, \mathrm{IF}\left(\frac{r_{ip}}{\sigma_p}; S, F_0\right), \tag{14}$$

where $r_{ip} = y_i - \mu - \mathbf{x}_i^T \hat{\beta}_p$ and $\hat{\sigma}_p^{2} = \mathrm{SSE}/(n - p)$ are computed from the submodel. (The proof is in the Appendix.)

It is clear that the influence function in (14) is bounded if $\mathrm{IF}(r_{ip}/\sigma_p; S, F_0)$ is bounded. It is evident that the classical AIC is nonrobust, since the LS estimate has an unbounded influence function. The influence function of $M$-estimation is bounded with respect to $Y$ by the choice of $\psi$, but it is unbounded in the $X$ direction. That is,

$$\mathrm{IF}((\mathbf{x}, y); T, F) = M^{-1}(\psi, F)\, \mathbf{x}\, \psi\left(y - \mathbf{x}^T T(F)\right), \tag{15}$$

where $\psi = \rho'$ and $M$ is a certain $p \times p$ matrix given by

$$M(\psi, F) = \int \psi'\left(y - \mathbf{x}^T T(F)\right) \mathbf{x} \mathbf{x}^T \, dF(\mathbf{x}, y). \tag{16}$$

The influence function of the AIC using LTS estimators, following Theorem 1, takes the form

$$\mathrm{IF}((\mathbf{x}, y); \mathrm{AIC}_{\mathrm{LTS}}, H) = \frac{2n}{n - p} \cdot \mathrm{IF}\left(\frac{r_{ip}}{\sigma_p}; \hat{\sigma}^{2}_{\mathrm{LTS}}, F_0\right). \tag{17}$$

The influence function of $\mathrm{AIC}_{\mathrm{LTS}}$ is bounded in both the $Y$ and $X$ directions, since $\mathrm{IF}(r_{ip}/\sigma_p; \hat{\sigma}^{2}_{\mathrm{LTS}}, F_0)$ is bounded. We therefore conclude that AIC with a high breakdown point estimator provides reliable fits to the data in the presence of leverage points.
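A finite-sample analogue of this boundedness can be sketched in Python. The example below is a simplified illustration, not the computation used in the paper: it uses an intercept-only model, a hypothetical clean sample, and $H = \lfloor n/2 \rfloor + 1$ as an assumed subset size. For a location model the best $H$-subset is contiguous in the sorted data, so scanning all windows gives the exact LTS scale.

```python
import numpy as np

def aic_from_scale(s2, n, p):
    """AIC_S = ln((n - p) * S^2 / n) + 2p, as in equation (9)."""
    return float(np.log((n - p) * s2 / n) + 2 * p)

def ls_scale2_location(y, p=1):
    """Classical scale SSE / (n - p) around the sample mean."""
    y = np.asarray(y, dtype=float)
    return float(np.sum((y - y.mean()) ** 2) / (y.size - p))

def lts_scale2_location(y, h):
    """Exact LTS scale for an intercept-only model.

    Every window of h consecutive order statistics is tried, and the
    smallest within-window sum of squares is divided by h.
    """
    ys = np.sort(np.asarray(y, dtype=float))
    best_sse = np.inf
    for start in range(ys.size - h + 1):
        window = ys[start:start + h]
        best_sse = min(best_sse, float(np.sum((window - window.mean()) ** 2)))
    return best_sse / h

base = np.linspace(-2.0, 2.0, 20)      # hypothetical clean sample
outlier_values = [5.0, 50.0, 500.0]    # one added point, moved further out

aic_ls, aic_lts = [], []
for z in outlier_values:
    y = np.append(base, z)
    n, p = y.size, 1
    h = n // 2 + 1                     # assumed subset size
    aic_ls.append(aic_from_scale(ls_scale2_location(y, p), n, p))
    aic_lts.append(aic_from_scale(lts_scale2_location(y, h), n, p))

for z, a, b in zip(outlier_values, aic_ls, aic_lts):
    print(f"outlier = {z:6.1f}   AIC (LS) = {a:7.3f}   AIC (LTS) = {b:7.3f}")
```

The LS-based value keeps growing as the added point moves away, whereas the LTS-based value stays fixed once the point is excluded from the best subset.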

3. Numerical Examples

In this section, $\mathrm{AIC}_{\mathrm{LTS}}$, $\mathrm{AIC}_{\mathrm{LMS}}$, and $\mathrm{AIC}_{\mathrm{BS}}$ are compared with AIC and RAIC. In this study, 50 independent replicates of 3 independent uniform random variables on $[-1, 1]$, $x_{i1}$, $x_{i2}$, and $x_{i3}$, and 50 independent normally distributed errors $\epsilon_i \sim N(0, 9)$ were generated. The true model is given by $y_i = x_{i1} + x_{i2} + \epsilon_i$, for $i = 1, \ldots, 50$, using the two variables $x_{i1}$ and $x_{i2}$. We computed the AIC based on each of the following estimation criteria: (i) LS estimation, (ii) $M$-estimator, (iii) BS estimator, (iv) LTS estimator, and (v) LMS estimator. In order to illustrate the robustness to outliers, we consider the following cases:

(a) vertical outliers (outliers in $y$ only),
(b) good leverage points (outliers in both $y$ and $\mathbf{x}$),
(c) bad leverage points (outliers in $\mathbf{x}$ only).

For all situations, we randomly generate (0, 5, 10, 20, 30, and 40)% of outliers from the $N(50, 0.1^{2})$ and $N(100, 0.5^{2})$ distributions, respectively. For each of these settings, we simulate 1000 samples.
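One possible reading of this design is sketched below in Python. Several details are assumptions made for illustration only: outlying responses are drawn from $N(50, 0.1^{2})$ and outlying regressors from $N(100, 0.5^{2})$, all three regressors of a contaminated case are replaced, $N(0, 9)$ is read as variance 9, and the function name, seed handling, and return values are hypothetical.

```python
import numpy as np

def simulate_sample(n=50, contamination=0.10, kind="vertical", seed=0):
    """Generate one contaminated sample under the assumptions stated above."""
    if kind not in ("vertical", "good_leverage", "bad_leverage"):
        raise ValueError(f"unknown contamination kind: {kind}")
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n, 3))   # x1, x2, x3 on [-1, 1]
    eps = rng.normal(0.0, 3.0, size=n)        # variance 9, standard deviation 3
    y = X[:, 0] + X[:, 1] + eps               # true model uses x1 and x2 only

    m = int(round(contamination * n))         # number of contaminated cases
    idx = rng.choice(n, size=m, replace=False)
    if kind in ("vertical", "good_leverage"):
        y[idx] = rng.normal(50.0, 0.1, size=m)            # outliers in y
    if kind in ("bad_leverage", "good_leverage"):
        X[idx, :] = rng.normal(100.0, 0.5, size=(m, 3))   # outliers in x
    return X, y, idx

X_v, y_v, idx_v = simulate_sample(kind="vertical")
X_b, y_b, idx_b = simulate_sample(kind="bad_leverage")
print(len(idx_v), "vertical outliers;", len(idx_b), "bad leverage points")
```

Repeating the call with 1000 different seeds for each contamination level and each case would give the collection of samples on which the five criteria are compared.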

3.1. Simulation Result. The resulting fit to the data is classified as one of the following:

(i) correct fit (true model),
(ii) overfit (models containing all the variables in the true model plus other variables that are redundant),
(iii) under fit (models with only a strict subset of the variables in the true model),
(iv) wrong fit (models that are none of the above).

Tables 3, 4, and 5 show detailed simulation results for the different versions of the AIC methods. For uncontaminated datasets, the classical AIC performs best compared to the robust AIC. When vertical outliers are introduced, the classical AIC selects a large proportion of wrong fit models, whereas, as expected, the robust AIC usually (i.e., with higher proportion) selects the correct model.

For bad leverage points, we observe that AIC tends to produce overfit, and at high levels of contamination it yields a higher proportion of wrong fits. $\mathrm{AIC}_M$, in turn, tends to produce either an under fit or a wrong fit model. The robust estimates, however, produce comparable power in the presence of bad leverage points.

For good leverage points, AIC also tends to produce overfit. On the other hand, the robust AIC tends to produce either a correct fit or an under fit model.
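The four categories can be written as a small set comparison. The Python sketch below is illustrative: the function name and variable labels are hypothetical, and treating an empty selection as an under fit is an assumption.

```python
def classify_fit(selected, true_vars=frozenset({"x1", "x2"})):
    """Classify a selected variable set against the true model's variables."""
    selected = frozenset(selected)
    true_vars = frozenset(true_vars)
    if selected == true_vars:
        return "correct fit"
    if selected > true_vars:   # all true variables plus redundant ones
        return "overfit"
    if selected < true_vars:   # strict subset of the true variables
        return "under fit"
    return "wrong fit"         # none of the above

for model in [{"x1", "x2"}, {"x1", "x2", "x3"}, {"x1"}, {"x1", "x3"}]:
    print(sorted(model), "->", classify_fit(model))
```

Tabulating these labels over the simulated samples gives percentages of the kind reported in Tables 3, 4, and 5.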

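Table 3: Percentage of selected models from classical AIC, robust RAIC, $\mathrm{AIC}_{\mathrm{LTS}}$, $\mathrm{AIC}_{\mathrm{LMS}}$, and $\mathrm{AIC}_{\mathrm{BS}}$ with vertical outliers.

| $\epsilon$ (%) | Fit | AIC | RAIC | $\mathrm{AIC}_{\mathrm{LTS}}$ | $\mathrm{AIC}_{\mathrm{LMS}}$ | $\mathrm{AIC}_{\mathrm{BS}}$ |
|---|---|---|---|---|---|---|
| 0 | Correct fit | 84.6 | 54.4 | 57.6 | 65.2 | 45.2 |
| 0 | Overfit | 15.4 | 0 | 0 | 0 | 0 |
| 0 | Under fit | 0 | 43.6 | 41.2 | 34 | 54.6 |
| 0 | Wrong fit | 0 | 2.0 | 1.2 | 0.8 | 0.2 |
| 5 | Correct fit | 2.8 | 49.6 | 56.8 | 62.6 | 45.4 |
| 5 | Overfit | 2 | 0 | 0 | 0 | 0 |
| 5 | Under fit | 22 | 51 | 42.8 | 36.6 | 54.6 |
| 5 | Wrong fit | 73.2 | 1.4 | 0.4 | 0.8 | 0 |
| 10 | Correct fit | 3.2 | 45.0 | 51 | 56 | 39.8 |
| 10 | Overfit | 0.8 | 0 | 0 | 0 | 0 |
| 10 | Under fit | 20.2 | 52.4 | 48 | 43.8 | 55.2 |
| 10 | Wrong fit | 75.8 | 2.6 | 1.0 | 0.2 | 0 |
| 20 | Correct fit | 4.2 | 30.8 | 49.4 | 58 | 34.2 |
| 20 | Overfit | 0.2 | 0 | 0 | 0 | 0 |
| 20 | Under fit | 22.8 | 67.8 | 49.6 | 41.4 | 65.8 |
| 20 | Wrong fit | 72.8 | 1.4 | 1.0 | 0.6 | 0 |
| 30 | Correct fit | 1.6 | 0 | 44.0 | 51.2 | 23.2 |
| 30 | Overfit | 0.2 | 0 | 0 | 0 | 0 |
| 30 | Under fit | 22.6 | 73.8 | 55.4 | 48.2 | 76.8 |
| 30 | Wrong fit | 72.8 | 26.2 | 0.6 | 0.6 | 0 |
| 40 | Correct fit | 2.2 | 0 | 37.6 | 41.2 | 12.4 |
| 40 | Overfit | 0.6 | 0 | 0 | 0 | 0 |
| 40 | Under fit | 22.8 | 70.8 | 62.4 | 58.4 | 87.4 |
| 40 | Wrong fit | 74.4 | 29.2 | 0 | 0.4 | 0.2 |

3.2. Example 2 (Stack Loss Data). Stack loss data was presented by Brownlee [10]. This data set consists of 21 observations on three independent variables, and it contains four outliers (cases 1, 3, 4, and 21) and high leverage points (cases 1, 2, 3, and 21). The data are given in Table 6.

We applied the traditional and robust versions of the AIC methods to the data. Table 7 shows that the classical method selects the full model and the robust RAIC method ignores one of the important variables ($x_{i2}$), whereas the robust AIC methods based on high breakdown point estimators agree on the importance of the two variables $x_{i1}$ and $x_{i2}$.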

4. Conclusions

The least trimmed squares (LTS) and the least median of squares (LMS) are robust regression methods frequently used in practice. Nevertheless, they are not commonly used for selecting models. This paper introduced the Akaike Information Criterion (AIC) based on LTS and LMS scales, which are robust against outliers and leverage points. In our simulations, $\mathrm{AIC}_{\mathrm{LTS}}$ and $\mathrm{AIC}_{\mathrm{LMS}}$ selected the correct model far more often than the classical AIC for contaminated data sets. This paper focused on the AIC variable selection criterion; one might be interested in extending other robust model selection criteria to high breakdown point estimation methods such as LTS, LMS, or BS estimators. In addition, this paper has considered regression models with continuous variables; future studies might consider mixed variables (i.e., continuous and dummy).

Appendix

Proof of Theorem 1. Consider

\begin{align*}
\mathrm{IF}((\mathbf{x}, y); \mathrm{AIC}, H) &= \left.\frac{\partial}{\partial \epsilon}\, \mathrm{AIC}\left(\Delta_{(\mathbf{x}, y)}\right)\right|_{\epsilon = 0} \\
&= \left.\frac{\partial}{\partial \epsilon} \left( \ln\left(\frac{n - p}{n} \cdot S_p^{2}(H_{\epsilon})\right) + 2p \right)\right|_{\epsilon = 0} \\
&= \frac{n}{(n - p) \cdot S_p^{2}(H)} \cdot \left.\frac{\partial}{\partial \epsilon}\, S_p^{2}(H_{\epsilon})\right|_{\epsilon = 0} \tag{A.1} \\
&= \frac{n}{(n - p)\, \sigma_p^{2}} \cdot 2\sigma_p \left.\frac{\partial}{\partial \epsilon}\, S_p(H_{\epsilon})\right|_{\epsilon = 0}, \tag{A.2}
\end{align*}

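Table 4: Percentage of selected models from classical AIC, robust RAIC, and robust $\mathrm{AIC}_{\mathrm{LTS}}$, $\mathrm{AIC}_{\mathrm{LMS}}$, and $\mathrm{AIC}_{\mathrm{BS}}$ with bad leverage points.

| $\epsilon$ (%) | Fit | AIC | RAIC | $\mathrm{AIC}_{\mathrm{LTS}}$ | $\mathrm{AIC}_{\mathrm{LMS}}$ | $\mathrm{AIC}_{\mathrm{BS}}$ |
|---|---|---|---|---|---|---|
| 5 | Correct fit | 0 | 0 | 54.6 | 60.8 | 43.8 |
| 5 | Overfit | 70.4 | 0 | 0 | 0 | 0 |
| 5 | Under fit | 0 | 64.4 | 44 | 38.4 | 55.4 |
| 5 | Wrong fit | 29.6 | 35.6 | 1.4 | 0.8 | 0.8 |
| 10 | Correct fit | 0 | 0 | 63.8 | 67.8 | 51.0 |
| 10 | Overfit | 63 | 0 | 0 | 0 | 0 |
| 10 | Under fit | 0 | 54.6 | 34.6 | 31.4 | 48.8 |
| 10 | Wrong fit | 37 | 42.4 | 1.6 | 0.8 | 0.2 |
| 20 | Correct fit | 0 | 0 | 56.6 | 63.4 | 49.2 |
| 20 | Overfit | 54.8 | 0 | 0 | 0 | 0 |
| 20 | Under fit | 0.2 | 60.8 | 42.4 | 35.8 | 50.6 |
| 20 | Wrong fit | 44.8 | 39.2 | 1.0 | 0.8 | 0.2 |
| 30 | Correct fit | 0 | 0 | 56.6 | 61.4 | 46 |
| 30 | Overfit | 37.6 | 0 | 0 | 0 | 0 |
| 30 | Under fit | 0.25 | 29.6 | 42.6 | 36 | 53.8 |
| 30 | Wrong fit | 60.4 | 42.6 | 0.8 | 2.4 | 0.2 |
| 40 | Correct fit | 1.0 | 0 | 55.2 | 64.6 | 51.4 |
| 40 | Overfit | 13.8 | 0 | 0 | 0 | 0 |
| 40 | Under fit | 1.2 | 54.4 | 43.8 | 33.8 | 48.6 |
| 40 | Wrong fit | 81 | 45.4 | 1.0 | 1.6 | 0 |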

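Table 5: Percentage of selected models from classical AIC, robust RAIC, and robust $\mathrm{AIC}_{\mathrm{LTS}}$, $\mathrm{AIC}_{\mathrm{LMS}}$, and $\mathrm{AIC}_{\mathrm{BS}}$ with good leverage points.

| $\epsilon$ (%) | Fit | AIC | RAIC | $\mathrm{AIC}_{\mathrm{LTS}}$ | $\mathrm{AIC}_{\mathrm{LMS}}$ | $\mathrm{AIC}_{\mathrm{BS}}$ |
|---|---|---|---|---|---|---|
| 5 | Correct fit | 0.2 | 47.0 | 53.6 | 58.4 | 42.6 |
| 5 | Overfit | 99.8 | 0 | 0 | 0 | 0 |
| 5 | Under fit | 0 | 50.4 | 46 | 41.2 | 57.4 |
| 5 | Wrong fit | 0 | 2.6 | 0.4 | 0.4 | 0 |
| 10 | Correct fit | 0 | 44.2 | 54.6 | 59.4 | 40.4 |
| 10 | Overfit | 99.6 | 0 | 0 | 0 | 0 |
| 10 | Under fit | 0.2 | 53.8 | 44.6 | 39.8 | 59.6 |
| 10 | Wrong fit | 0.2 | 2.0 | 0.8 | 0.8 | 0 |
| 20 | Correct fit | 0.8 | 32.4 | 50.4 | 56.2 | 33.2 |
| 20 | Overfit | 97.6 | 0 | 0 | 0 | 0 |
| 20 | Under fit | 0.8 | 66.6 | 48.4 | 43 | 66.4 |
| 20 | Wrong fit | 0.8 | 1.0 | 1.2 | 0.8 | 0.2 |
| 30 | Correct fit | 1.8 | 0 | 46.0 | 50.6 | 27.0 |
| 30 | Overfit | 97.8 | 96.8 | 0 | 0 | 0 |
| 30 | Under fit | 2.8 | 2.8 | 53.6 | 49.2 | 72.8 |
| 30 | Wrong fit | 0.6 | 0.4 | 0.4 | 0.2 | 0.2 |
| 40 | Correct fit | 0.2 | 0 | 37.4 | 37 | 13.8 |
| 40 | Overfit | 97.4 | 100 | 0 | 0 | 0 |
| 40 | Under fit | 2.2 | 0 | 62.6 | 62.6 | 86.2 |
| 40 | Wrong fit | 0.2 | 0 | 0 | 0.4 | 0 |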

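Table 6: Stack loss data set.

| $x_1$ | $x_2$ | $x_3$ | $y$ |
|---|---|---|---|
| 80 | 27 | 89 | 42 |
| 80 | 27 | 88 | 37 |
| 75 | 25 | 90 | 37 |
| 62 | 24 | 87 | 28 |
| 62 | 22 | 87 | 18 |
| 62 | 23 | 87 | 18 |
| 62 | 24 | 93 | 19 |
| 62 | 24 | 93 | 20 |
| 58 | 23 | 87 | 15 |
| 58 | 18 | 80 | 14 |
| 58 | 18 | 89 | 14 |
| 58 | 17 | 88 | 13 |
| 58 | 18 | 82 | 11 |
| 58 | 19 | 93 | 12 |
| 50 | 18 | 89 | 8 |
| 50 | 18 | 86 | 7 |
| 50 | 19 | 72 | 8 |
| 50 | 19 | 79 | 8 |
| 50 | 20 | 80 | 9 |
| 56 | 20 | 82 | 15 |
| 70 | 20 | 91 | 15 |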

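Table 7: The result of variable selection for the stack loss data.

| Selected variables | AIC | RAIC | $\mathrm{AIC}_{\mathrm{LTS}}$ | $\mathrm{AIC}_{\mathrm{LMS}}$ | $\mathrm{AIC}_{\mathrm{BS}}$ |
|---|---|---|---|---|---|
| $x_1$ | 6.7 | 8.0 | 4.7 | 4.3 | 5.5 |
| $x_2$ | 7.1 | 6.5 | 5.9 | 5.4 | 7.0 |
| $x_3$ | 8.4 | 7.3 | 7.0 | 6.4 | 7.3 |
| $x_1, x_2$ | 8.2 | 9.0 | 5.5 | 4.7 | 6.9 |
| $x_1, x_3$ | 8.7 | 8.9 | 6.9 | 6.3 | 7.6 |
| $x_2, x_3$ | 9.1 | 9.0 | 8.1 | 7.3 | 8.8 |
| $x_1, x_2, x_3$ | 4.7 | 10.6 | 7.6 | 6.7 | 9.1 |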

where

$$\mathrm{AIC}\left(\Delta_{(\mathbf{x}, y)}\right) = \ln\left(\frac{n - p}{n} \cdot S_p^{2}(H_{\epsilon})\right) + 2p,$$

$$\mathrm{AIC}(H) = \ln\left(\frac{n - p}{n} \cdot S_p^{2}(H)\right) + 2p,$$

$$S_p^{2}(H) = \sigma_p^{2},$$

$$\begin{aligned}
\left.\frac{\partial}{\partial \epsilon}\, S_p(H_{\epsilon})\right|_{\epsilon = 0} &= \left.\frac{\partial}{\partial \epsilon}\, S\left((1 - \epsilon) F_{\sigma_p} + \epsilon \Delta_{r_{ip}}\right)\right|_{\epsilon = 0} \\
&= \sigma_p \left.\frac{\partial}{\partial \epsilon}\, S\left((1 - \epsilon) F_0 + \epsilon \Delta_{r_{ip}/\sigma_p}\right)\right|_{\epsilon = 0} \\
&= \sigma_p\, \mathrm{IF}\left(\frac{r_{ip}}{\sigma_p}; S, F_0\right),
\end{aligned} \tag{A.3}$$

where

$$r_{ip} = y_i - \mu - \mathbf{x}_i^T \hat{\beta}_p. \tag{A.4}$$

Inserting (A.3) into (A.2) yields

$$\mathrm{IF}((\mathbf{x}, y); \mathrm{AIC}, H) = \frac{2n}{n - p}\, \mathrm{IF}\left(\frac{r_{ip}}{\sigma_p}; S, F_0\right). \tag{A.5}$$

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to thank Professor Nor Aishah Hamzah and Dr. Rossita M. Yunus for their support to complete this study. The author is also grateful to the anonymous reviewers for their valuable comments on an earlier draft of this paper. This research has been funded by University of Malaya under Grant no. RG208-11AFR.

References

[1] H. Akaike, "Information theory and an extension of the maximum likelihood principle," in Breakthroughs in Statistics, pp. 610–624, Springer, 1992.

[2] R. D. Martin, "Robust estimation of autoregressive models," in Directions in Time Series, pp. 228–254, 1980.

[3] E. Ronchetti, Robust testing in linear models: The infinitesimal approach [doctoral thesis], University of Geneva, Geneva, Switzerland, 1982.

[4] K. Tharmaratnam and G. Claeskens, "A comparison of robust versions of the AIC based on M-, S- and MM-estimators," Statistics, vol. 47, no. 1, pp. 216–235, 2013.

[5] P. J. Huber, "Robust estimation of a location parameter," Annals of Mathematical Statistics, vol. 35, no. 1, pp. 73–101, 1964.

[6] W. S. Krasker and R. E. Welsch, "Efficient bounded-influence regression estimation," Journal of the American Statistical Association, vol. 77, pp. 379–595, 1982.

[7] P. J. Rousseeuw, "Least median of squares regression," Journal of the American Statistical Association, vol. 79, no. 388, pp. 871–880, 1984.

[8] P. Rousseeuw, "Multivariate estimation with high breakdown point," in Mathematical Statistics and Applications, W. Grossmann, G. Pflug, I. Vinceze, and W. Wertz, Eds., pp. 283–297, Reidel, Dordrecht, The Netherlands, 1985.

[9] P. J. Huber, Robust Statistics, Springer, New York, NY, USA, 2011.

[10] K. A. Brownlee, Statistical Theory and Methodology in Science and Engineering, John-Wiley, New York, NY, USA, 2009.


Page 3: Research Article Robust AIC with High Breakdown Scale Estimatedownloads.hindawi.com/journals/jam/2014/286414.pdf · 1. Introduction Akaike Information Criterion (AIC) [ ] is a powerful

Journal of Applied Mathematics 3

where 120595 = 1205881015840 If we choose 120588 function in (3) as Tukey

Biweight function

120588 (119903 119889) =

3(119903

119889)

2

minus 3(119903

119889)

4

+ (119903

119889)

6

if |119903| le 119889

1 elsewhere(5)

where 119889 = 15476 yields 119887 = 119864Φ[120588(119885 119889)] = 2(1 minus

1198650(119889)) with Φ the standard normal cumulative distribution

function and 119885 sim 119873(0 1) the resulting estimator is thenBiweight 119878-estimator 119872-estimators are efficient and highlyrobust to unusual value of 119884 but one rogue leverage pointcan break themdown completely For this reason generalized119872-estimators were introduced which solve

119894=119899

sum

119894=1

119908 (x119894) 120595 (119910

119894minus 120583 minus x119879

119894) x119894= 0 (6)

where 119908 is a weight function [6]In recent years a good deal of attention in the literature

has been focused on high-breakdown methods that is onemethod must be resistant to even multiple severe outliersMany methods are based on minimizing a more robustscale estimate than sum of squared residuals For exampleRousseeuw [7] proposed LMS a high-breakdown methodthat minimizes the median of the squared residuals ratherthan their sum In addition Rousseeuw [8] proposed leasttrimmed squares (LTS) which minimizes the sum of the 119867smallest squared residuals defined as

LTS119867119873 = arg min119894=119867

sum

119894=1

1199032

119894(120573) (7)

based on the ordered absolute residuals |119903(1)| ⩽ |119903

(2)| ⩽

sdot sdot sdot ⩽ |119903(119899)| LTS converges at the rate of 11989912 with the

same asymptotic efficiency under normality as Huberrsquos skipestimator The convergence rate of LMS is 119899

13 and itsobjective function is less smooth than LTS

12 The Robust Version of AIC Consider scale estimate oferrors defined by

119878 =SSE

(119899 minus 119901) (8)

with SSE = sum119894=119899

119894=11199032

119894and 119903119894= 119910119894minus 120583 minus x119879

119894 By replacing the

value of SSE in (2) in terms of 119878 AIC in (2) can be expressedas follows

AIC119878= ln(

(119899 minus 119901) 1198782

119899) + 2119901 (9)

Notice that the largest values of AIC indicate that the model(and hence the explanatory variables) is less successful inexplaining the variations in the response while a small valueof AIC indicates an excellent fit (ie model) to the responsedata

Ronchetti [3] proposed a robust counterpart of theAIC statistic The extension of AIC to RAIC is inspired

by the extension of maximum likelihood estimation to 119872-estimationThe author derivedRAIC for an error distributionwith density function119891(120598) = exp(minus120588(120598)) For a given constant120572 and a given function 120588 the author chose the model thatminimizes

RAIC (119901 120572 120588)

=

119894=119899

sum

119894=1

120588((119910119894minus 120583 minus x119879

119894119872)

) + 120572119901

(10)

where is some robust estimate of 120590 and 119872

is the 119872-estimator defined as in (3) Huber [9] suggested 120572 =

2119864([1205952(((119910119894minus120583minus x119879

119894)))])(119864[1205951015840(((119910

119894minus120583minus x119879

119894)))]) and

119896 = 1345We introduce an alternative robust version of AIC by

replacing 119878 in (8) to the robust estimator of scale whichattains a 50 breakdown point When 119867 = 1198992 (8) findsthe estimates corresponding to the half samples having thesmallest sum of squares of residuals As such as expected thebreakdown point is 50 and the estimated scale from LTS is

2

(LTS119867119873) =1

119867

119894=119867

sum

119894=1

1199032

(119894)((LTS119867119873)) (11)

For other robust estimations the 119872-estimator and theBiweight 119878-estimator are compared to least trimmed ofsquares Based on the results shown in Table 2 it is evidentthat the119872-estimator is muchmore robust than LS but suffersfrom leverage pointsThe Biweight 119878-estimator and LTS showrobust behavior the AICBS is stable even though the size ofthe outliers increases In the next section we generalize thesefindings by computing the associated influence functions

2 Influence Function

Consider the linear model in (1) for 119894 = 1 119899 Assume thedistribution of errors satisfying 119865

120590(x) = 119865

0(x120590) where 120590 is

the residual scale parameter and 1198650is symmetric with valid

probability density functionLet x and 119910 be independent stochastic variables with

distribution 119867 The functional 119879 is Fisher-consistent for theparameters (120583120573) at the model distribution119867 as follows

119879 (119867) = [119886 (119867)

b (119867)] = [120583

120573] (12)

For a Fisher-consistent scale estimator 119865120590(x) = 119865(x120590) for

all 120590 gt 0 In general the influence function of 119879 at thedistribution 119865 is defined as

119868119865 ((x 119910) 119879119867)

= lim120598rarr0

119879 ((1 minus 120598)119867 + 120598Δ(x119910) minus 119879 (119867))

120598

=120597

120597120598(119879 (Δ

(x119910)))

(13)

where 119879(119867) is the functional defined as the solution of theobjective model and Δ

(x119910) is the distribution which contains

4 Journal of Applied Mathematics

Table 2 Robust AIC for different values of 11991010

(vertical) and 11990910

(leverage)

11991010

AIC119872

AICBS AICLTS 11990910

AIC119872

AICBS AICLTS

3 37 39 34 45 39 39 3425 37 39 34 4 37 39 342 37 39 34 35 36 39 3415 37 39 34 3 36 39 341 37 39 34 25 37 39 3405 37 39 34 0 35 36 320 36 36 32 minus25 37 39 34minus05 37 39 34 minus3 37 39 34minus1 37 39 34 minus35 39 39 34minus15 37 39 34 minus4 4 39 34

outliers The following theorem gives the influence functionof AIC with any scale 119878

Theorem 1 Let 119867 be some distribution other than 119865 Take(X 119884) sim 119867 and denote by 120598 the error term of the modelAssume that 119878has the property that is differentiablewith partialderivatives equal to zero at the origin (00) Then

119868119865 ((x 119910) AIC119878 119867)

=2119899

(119899 minus 119901)119868119865(

119903119894119901

120590119901

119878 1198650)

(14)

where 119903119894119901= 119910119894minus120583minus x119879

119894119901and 2119901= 119878119878119864(119899 minus 119901) are computed

from sum model (The proof is in the Appendix)

It is clear that the influence function in (13) is boundedif the 119868119865(119903

119894119901120590119901 119878 1198650) is also bounded It is evident that

AIC is nonrobust since the LS estimate has unboundedinfluence function The influence function of 119872-estimationwith respect to 119884 is bounded by choice of 120595 but it isunbounded with respect to the119883 direction That is

119868119865 ((x 119910) 119879 119865)

= 119872minus1(120595 119865) x120595 (119910 minus x119879119879 (119865))

(15)

where 120595 = 1205881015840 and119872 is a certain 119901 times 119901matrix given by

119872(120595 119865)

= int1205951015840(119910 minus x119879119879 (119865)) xx119879119889119865 (x 119910)

(16)

The influence function of the AIC using LTS estimatorsfollowingTheorem 1 takes the form

119868119865 ((x 119910) AICLTS 119867)

=2119899

(119899 minus 119901)sdot 119868119865(

119903119894119901

120590119901

2

LTS 1198650) (17)

We have noted that the influence function of AICLTS isbounded in both 119884 and 119883 directions as 119868119865(119903

119894119901120590119901 2

LTS 1198650)

is bounded Moreover we conclude that AIC with highbreakdown point estimator provides reliable fits to the datain presence of leverage points

3 Numerical Examples

In this section AICLTS AICLMS and AICBS are comparedwith AIC and RAIC In this study 50 independent replicatesof 3 independent uniform random variables on [minus1 1] ofx1198941 x1198942 and x

1198943and 50 independent normally distributed

errors 120598119894sim 119873(0 9) were generated The true model is given

by 119910119894= x1198941+ x1198942+ 120598119894 for 119894 = 1 50 using two variables

x1198941and x

1198942 We have computed the following AIC based on

the respective estimation criterion (i) LS estimation (ii)119872-estimator (iii) BS estimator (iv) LTS estimator and (v) LMSestimator In order to illustrate the robustness to outliers weconsider the following cases

(a) vertical outliers (outliers in the 119910 only)(b) good leverage points (outliers in the 119910 and x)(c) bad leverage points (outliers in the x only)

For all situations we randomly generate (0 5 10 2030 and 40) of outliers from119873(50 01

2) and119873(100 052)

distributions respectively For each of these settings wesimulate 1000 samples

31 Simulation Result The resulting fit to the data is classifiedas one of the following

(i) correct fit (true model)(ii) overfit (models containing all the variables in the true

model plus other variables that are redundant)(iii) under fit (models with only a strict of the variables in

true model)(iv) wrong fit (model that are neither of the above)

Tables 3 4 and 5 show detailed simulation result for differentversions of AIC methods For uncontaminated datasets theclassical AIC performs best compared to robust AIC Byintroducing vertical outliers the classical AIC selects a largeproportion of wrong fit models and as expected the robustAIC will usually (ie with higher proportion) select thecorrect model

For bad leverage points we observe that AIC tends toproduce overfit and with high level of contamination it takesa higher proportion of wrong fit However AIC

119872tend to

produce either an under fit or wrong fit model However therobust estimate produces comparable power in the presenceof bad leverage points

For good leverage points AIC tends also to produceoverfit On the other hand the robust AIC tend to produceeither correct fit or under fit model

32 Example 2 (Stack Loss Data) Stack loss data was pre-sented by [10] This data set consists of 21 observations onthree independent variables and it contains four outliers

Journal of Applied Mathematics 5

Table 3 Percentage of select models from classical AIC robust RAIC AICLTS AICLMS and AICBS with vertical outliers

120598

AIC RAIC AICLTS AICLMS AICBS

0

Correct fit 846 544 576 652 452Overfit 154 0 0 0 0Under fit 0 436 412 34 546Wrong fit 0 20 12 08 02

5

Correct fit 28 496 568 626 454Overfit 2 0 0 0 0Under fit 22 51 428 366 546Wrong fit 732 14 04 08 0

10

Correct fit 32 450 51 56 398Overfit 08 0 0 0 0Under fit 202 524 48 438 552Wrong fit 758 26 10 02 0

20

Correct fit 42 308 494 58 342Overfit 02 0 0 0 0Under fit 228 678 496 414 658Wrong fit 728 14 10 06 0

30

Correct fit 16 0 440 512 232Overfit 02 0 0 0 0Under fit 226 738 554 482 768Wrong fit 728 262 06 06 0

40

Correct fit 22 0 376 412 124Overfit 06 0 0 0 0Under fit 228 708 624 584 874Wrong fit 744 292 0 04 02

(cases 1 3 4 and 21) and high leverage points (cases 1 2 3and 21) The data are given in Table 6

We applied the traditional and robust versions of AICmethods on the data Table 7 shows that the classical methodselects the full model and robust RAIC method ignored oneof the important variables (x

1198942) whereas robust AICmethods

based on high break down points estimators agreed with theimportance of the two variables x

1198941and x1198942

4 Conclusions

The least trimmed squares (LTS) and the least mediansquares (LMS) are robust regression methods frequentlyused in practice Nevertheless they are not commonly usedfor selecting models This paper introduced the AkaikeInformation Criterion (AIC) based on LTS and LMS scaleswhich are robust against outliers and leverage point Our sim-ulation result illustrated excellent performance of AICLTS andAICLMS for contaminated data sets This paper focused onthe AIC variable selection criteria one might be interested inextending other robust model selection criteria to advancedrobust breakdown point estimation methods such as LTS

LMS or BS estimators In addition this paper has consideredregression model with continuous variables however futurestudies might consider mixed variables (ie continuous anddummy)

Appendix

Proof of Theorem 1 Consider

119868119865 ((x 119910) AIC 119867) = 120597

120597120598(AIC (Δ

(x119910)))10038161003816100381610038161003816100381610038161003816120598=0

=120597

120597120598(ln

119899 minus 119901

119899sdot 1198782

119901(119867120598)) + 2119901

=119899

(119899 minus 119901) sdot 1198782119901(119867)

sdot120597

120597120598(1198782

119901(119867120598))

10038161003816100381610038161003816100381610038161003816120598=0

(A1)

=119899

(119899 minus 119901) 1205902119901

sdot 2120590119901

120597

120597120598(119878119901(119867120598))

10038161003816100381610038161003816100381610038161003816120598=0

(A2)

6 Journal of Applied Mathematics

Table 4 Percentage of select models from classical AIC robust RAIC and robust AICLTS AICLMS and AICBS with bad leverage points

120598

AIC RAIC AICLTS AICLMS AICBS

5

Correct fit 0 0 546 608 438Overfit 704 0 0 0 0Under fit 0 644 44 384 554Wrong fit 296 356 14 08 08

10

Correct fit 0 0 638 678 510Overfit 63 0 0 0 0Under fit 0 546 346 314 488Wrong fit 37 424 16 08 02

20

Correct fit 0 0 566 634 492Overfit 548 0 0 0 0Under fit 02 608 424 358 506Wrong fit 448 392 10 08 02

30

Correct fit 0 0 566 614 46Overfit 376 0 0 0 0Under fit 025 296 426 36 538Wrong fit 604 426 08 24 02

40

Correct fit 10 0 552 646 514Overfit 138 0 0 0 0Under fit 12 544 438 338 486Wrong fit 81 454 10 16 0

Table 5: Percentage of selected models from classical AIC, robust RAIC, and robust AIC_LTS, AIC_LMS, and AIC_BS with good leverage points.

ε (%)  Fit          AIC    RAIC   AIC_LTS  AIC_LMS  AIC_BS
5      Correct fit  0.2    47.0   53.6     58.4     42.6
       Overfit      99.8   0      0        0        0
       Under fit    0      50.4   46       41.2     57.4
       Wrong fit    0      2.6    0.4      0.4      0
10     Correct fit  0      44.2   54.6     59.4     40.4
       Overfit      99.6   0      0        0        0
       Under fit    0.2    53.8   44.6     39.8     59.6
       Wrong fit    0.2    2.0    0.8      0.8      0
20     Correct fit  0.8    32.4   50.4     56.2     33.2
       Overfit      97.6   0      0        0        0
       Under fit    0.8    66.6   48.4     43       66.4
       Wrong fit    0.8    1.0    1.2      0.8      0.2
30     Correct fit  1.8    0      46.0     50.6     27.0
       Overfit      97.8   96.8   0        0        0
       Under fit    2.8    2.8    53.6     49.2     72.8
       Wrong fit    0.6    0.4    0.4      0.2      0.2
40     Correct fit  0.2    0      37.4     37       13.8
       Overfit      97.4   100    0        0        0
       Under fit    2.2    0      62.6     62.6     86.2
       Wrong fit    0.2    0      0        0.4      0


Table 6: Stack loss data set.

Case  x1  x2  x3  y
1     80  27  89  42
2     80  27  88  37
3     75  25  90  37
4     62  24  87  28
5     62  22  87  18
6     62  23  87  18
7     62  24  93  19
8     62  24  93  20
9     58  23  87  15
10    58  18  80  14
11    58  18  89  14
12    58  17  88  13
13    58  18  82  11
14    58  19  93  12
15    50  18  89  8
16    50  18  86  7
17    50  19  72  8
18    50  19  79  8
19    50  20  80  9
20    56  20  82  15
21    70  20  91  15

Table 7: Results of variable selection for the stack loss data.

Selected variables  AIC   RAIC  AIC_LTS  AIC_LMS  AIC_BS
x1                  6.7   8.0   4.7      4.3      5.5
x2                  7.1   6.5   5.9      5.4      7.0
x3                  8.4   7.3   7.0      6.4      7.3
x1, x2              8.2   9.0   5.5      4.7      6.9
x1, x3              8.7   8.9   6.9      6.3      7.6
x2, x3              9.1   9.0   8.1      7.3      8.8
x1, x2, x3          4.7   10.6  7.6      6.7      9.1

where

$$
\begin{aligned}
\mathrm{AIC}(\Delta_{(\mathbf{x}, y)}) &= \ln\left(\frac{n-p}{n} \cdot S_p^2(H_\epsilon)\right) + 2p, \\
\mathrm{AIC}(H) &= \ln\left(\frac{n-p}{n} \cdot S_p^2(H)\right) + 2p, \\
S_p^2(H) &= \sigma_p^2.
\end{aligned}
$$

By the scale equivariance of $S$,

$$
\begin{aligned}
\frac{\partial}{\partial \epsilon} S_p(H_\epsilon)\Big|_{\epsilon=0}
&= \frac{\partial}{\partial \epsilon} S\left((1-\epsilon) F_{\sigma_p} + \epsilon \Delta_{r_{ip}}\right)\Big|_{\epsilon=0} \\
&= \sigma_p \frac{\partial}{\partial \epsilon} S\left((1-\epsilon) F_0 + \epsilon \Delta_{r_{ip}/\sigma_p}\right)\Big|_{\epsilon=0} \\
&= \sigma_p\, \mathrm{IF}\left(\frac{r_{ip}}{\sigma_p}; S, F_0\right), && \text{(A3)}
\end{aligned}
$$

where

$$
r_{ip} = y_i - \mu - \mathbf{x}_i^{T} \boldsymbol{\beta}_p. \tag{A4}
$$

Inserting (A3) into (A2) yields

$$
\mathrm{IF}((\mathbf{x}, y); \mathrm{AIC}, H) = \frac{2n}{(n-p)}\, \mathrm{IF}\left(\frac{r_{ip}}{\sigma_p}; S, F_0\right). \tag{A5}
$$
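The passage from $F_{\sigma_p}$ to $F_0$ in (A3) rests on scale equivariance: multiplying the residuals by $\sigma_p$ multiplies the scale estimate by the same factor. The sketch below checks this numerically. It assumes a trimmed root-mean-square scale as a stand-in for $S$; the function name `trimmed_scale`, the default coverage `h`, and the value of `sigma_p` are illustrative choices, not taken from the paper.

```python
import numpy as np

def trimmed_scale(residuals, h=None):
    """Square root of the mean of the h smallest squared residuals.

    No consistency factor is applied; this is an illustrative scale only.
    """
    r2 = np.sort(np.asarray(residuals, dtype=float) ** 2)
    n = r2.size
    if h is None:
        h = n // 2 + 1  # assumed coverage: slightly more than half the sample
    return float(np.sqrt(r2[:h].mean()))

rng = np.random.default_rng(0)
r = rng.normal(size=200)   # residuals on the standardized scale (F_0)
sigma_p = 3.0              # hypothetical scale of the submodel

lhs = trimmed_scale(sigma_p * r)   # S applied to rescaled residuals
rhs = sigma_p * trimmed_scale(r)   # sigma_p times S on the standardized scale
```

The two quantities `lhs` and `rhs` agree up to floating point error, which is the property used in the second line of (A3).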

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to thank Professor Nor Aishah Hamzah and Dr. Rossita M. Yunus for their support to complete this study. The author is also grateful to the anonymous reviewers for their valuable comments on an earlier draft of this paper. This research has been funded by University of Malaya, under Grant no. RG208-11AFR.

References

[1] H. Akaike, "Information theory and an extension of the maximum likelihood principle," in Breakthroughs in Statistics, pp. 610–624, Springer, 1992.

[2] R. D. Martin, "Robust estimation of autoregressive models," in Directions in Time Series, pp. 228–254, 1980.

[3] E. Ronchetti, Robust testing in linear models: The infinitesimal approach [doctoral thesis], University of Geneva, Geneva, Switzerland, 1982.

[4] K. Tharmaratnam and G. Claeskens, "A comparison of robust versions of the AIC based on M-, S- and MM-estimators," Statistics, vol. 47, no. 1, pp. 216–235, 2013.

[5] P. J. Huber, "Robust estimation of a location parameter," Annals of Mathematical Statistics, vol. 35, no. 1, pp. 73–101, 1964.

[6] W. S. Krasker and R. E. Welsch, "Efficient bounded-influence regression estimation," Journal of the American Statistical Association, vol. 77, pp. 379–595, 1982.

[7] P. J. Rousseeuw, "Least median of squares regression," Journal of the American Statistical Association, vol. 79, no. 388, pp. 871–880, 1984.

[8] P. Rousseeuw, "Multivariate estimation with high breakdown point," in Mathematical Statistics and Applications, W. Grossmann, G. Pflug, I. Vinceze, and W. Wertz, Eds., pp. 283–297, Reidel, Dordrecht, The Netherlands, 1985.

[9] P. J. Huber, Robust Statistics, Springer, New York, NY, USA, 2011.

[10] K. A. Brownlee, Statistical Theory and Methodology in Science and Engineering, John Wiley, New York, NY, USA, 2009.


Table 2: Robust AIC for different values of $y_{10}$ (vertical) and $x_{10}$ (leverage).

y10    AIC_M  AIC_BS  AIC_LTS    x10    AIC_M  AIC_BS  AIC_LTS
3      3.7    3.9     3.4        4.5    3.9    3.9     3.4
2.5    3.7    3.9     3.4        4      3.7    3.9     3.4
2      3.7    3.9     3.4        3.5    3.6    3.9     3.4
1.5    3.7    3.9     3.4        3      3.6    3.9     3.4
1      3.7    3.9     3.4        2.5    3.7    3.9     3.4
0.5    3.7    3.9     3.4        0      3.5    3.6     3.2
0      3.6    3.6     3.2        −2.5   3.7    3.9     3.4
−0.5   3.7    3.9     3.4        −3     3.7    3.9     3.4
−1     3.7    3.9     3.4        −3.5   3.9    3.9     3.4
−1.5   3.7    3.9     3.4        −4     4      3.9     3.4

outliers. The following theorem gives the influence function of AIC with any scale estimate $S$.

Theorem 1. Let $H$ be some distribution other than $F$. Take $(X, Y) \sim H$ and denote by $\epsilon$ the error term of the model. Assume that $S$ is differentiable with partial derivatives equal to zero at the origin $(0, 0)$. Then

$$
\mathrm{IF}((\mathbf{x}, y); \mathrm{AIC}_S, H) = \frac{2n}{(n-p)}\, \mathrm{IF}\left(\frac{r_{ip}}{\sigma_p}; S, F_0\right), \tag{14}
$$

where $r_{ip} = y_i - \mu - \mathbf{x}_i^{T} \boldsymbol{\beta}_p$ and $\hat{\sigma}_p^2 = \mathrm{SSE}/(n-p)$ are computed from the submodel. (The proof is in the Appendix.)

It is clear that the influence function in (14) is bounded if $\mathrm{IF}(r_{ip}/\sigma_p; S, F_0)$ is also bounded. It is evident that the classical AIC is nonrobust, since the LS estimate has an unbounded influence function. The influence function of $M$-estimation with respect to $Y$ is bounded by the choice of $\psi$, but it is unbounded with respect to the $X$ direction. That is,

$$
\mathrm{IF}((\mathbf{x}, y); T, F) = M^{-1}(\psi, F)\, \mathbf{x}\, \psi\left(y - \mathbf{x}^{T} T(F)\right), \tag{15}
$$

where $\psi = \rho'$ and $M$ is a certain $p \times p$ matrix given by

$$
M(\psi, F) = \int \psi'\left(y - \mathbf{x}^{T} T(F)\right) \mathbf{x}\mathbf{x}^{T}\, dF(\mathbf{x}, y). \tag{16}
$$
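The asymmetry in (15) can be seen numerically: a bounded $\psi$ caps the residual term, but the leading factor $\mathbf{x}$ is not capped, so the product grows with the leverage of the contaminating point. The sketch below assumes Huber's $\psi$ with tuning constant $c = 1.345$ (a common choice that is not stated in the paper) and drops the constant matrix $M^{-1}(\psi, F)$; the function names are illustrative.

```python
import numpy as np

def huber_psi(r, c=1.345):
    """Huber's psi: the identity near zero, clipped to [-c, c] beyond."""
    return np.clip(r, -c, c)

def if_direction(x, r, c=1.345):
    """x * psi(r): the part of (15) that depends on the contaminating point."""
    return np.asarray(x, dtype=float) * huber_psi(r, c)

# Vertical outliers: the residual grows while x stays fixed,
# so the contribution is capped at c * |x|.
vertical = [abs(if_direction(1.0, r)) for r in (2.0, 20.0, 200.0)]

# Leverage points: x grows while the residual stays fixed,
# so the contribution grows linearly in |x|.
leverage = [abs(if_direction(x, 2.0)) for x in (1.0, 10.0, 100.0)]
```

Under these assumptions `vertical` stays constant while `leverage` increases tenfold at each step, which mirrors the statement that $M$-estimation is bounded in the $Y$ direction only.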

The influence function of the AIC using LTS estimators, following Theorem 1, takes the form

$$
\mathrm{IF}((\mathbf{x}, y); \mathrm{AIC}_{\mathrm{LTS}}, H) = \frac{2n}{(n-p)} \cdot \mathrm{IF}\left(\frac{r_{ip}}{\sigma_p}; \hat{\sigma}^2_{\mathrm{LTS}}, F_0\right). \tag{17}
$$

We have noted that the influence function of $\mathrm{AIC}_{\mathrm{LTS}}$ is bounded in both the $Y$ and $X$ directions, as $\mathrm{IF}(r_{ip}/\sigma_p; \hat{\sigma}^2_{\mathrm{LTS}}, F_0)$ is bounded. Moreover, we conclude that AIC with a high breakdown point estimator provides reliable fits to the data in the presence of leverage points.
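The contrast between the classical and the LTS-based criterion can be sketched on a small simple-regression example in which one point is moved to $(0, y_{10})$. The code below is an illustration under several assumptions and is not the author's implementation: the LTS fit is approximated by random elemental starts followed by concentration steps, the LTS scale is the raw trimmed mean of squared residuals without a consistency factor, the coverage `h` is about half the sample, and the AIC form is the one written in the Appendix. All function names and the toy data are hypothetical.

```python
import numpy as np

def ls_fit(X, y):
    """Ordinary least squares coefficients."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def lts_fit(X, y, h, n_starts=50, n_csteps=10, seed=0):
    """Approximate LTS: random elemental starts refined by concentration steps.

    Returns the coefficients and the mean of the h smallest squared
    residuals (a raw LTS scale^2, without a consistency factor).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_obj, best_beta = np.inf, None
    for _ in range(n_starts):
        idx = rng.choice(n, size=p, replace=False)
        beta = ls_fit(X[idx], y[idx])
        for _ in range(n_csteps):
            # Refit on the h observations with the smallest squared residuals.
            keep = np.argsort((y - X @ beta) ** 2)[:h]
            beta = ls_fit(X[keep], y[keep])
        obj = np.sort((y - X @ beta) ** 2)[:h].sum()
        if obj < best_obj:
            best_obj, best_beta = obj, beta
    return best_beta, best_obj / h

def aic(scale2, n, p):
    """AIC in the form used in the Appendix: ln((n - p)/n * S^2) + 2p."""
    return float(np.log((n - p) / n * scale2) + 2 * p)

rng = np.random.default_rng(1)
n, p = 30, 2
x = rng.uniform(-1, 1, n)
y = 1 + 2 * x + rng.normal(0, 0.5, n)
X = np.column_stack([np.ones(n), x])
h = (n + p + 1) // 2  # assumed coverage, about half the sample

results = {}
for y10 in (5.0, 50.0, 500.0):
    Xc, yc = X.copy(), y.copy()
    Xc[9], yc[9] = (1.0, 0.0), y10  # move the 10th point to (0, y10)
    beta_ls = ls_fit(Xc, yc)
    s2_ls = ((yc - Xc @ beta_ls) ** 2).sum() / (n - p)
    _, s2_lts = lts_fit(Xc, yc, h)
    results[y10] = (aic(s2_ls, n, p), aic(s2_lts, n, p))
```

Each entry of `results` holds the pair (classical AIC, LTS-based AIC) for one value of $y_{10}$. In this sketch the first component keeps growing as the point moves away, while the second stays nearly unchanged because the moved point is trimmed from the scale estimate.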

3. Numerical Examples

In this section, $\mathrm{AIC}_{\mathrm{LTS}}$, $\mathrm{AIC}_{\mathrm{LMS}}$, and $\mathrm{AIC}_{\mathrm{BS}}$ are compared with AIC and RAIC. In this study, 50 independent replicates of 3 independent uniform random variables on $[-1, 1]$, $x_{i1}$, $x_{i2}$, and $x_{i3}$, and 50 independent normally distributed errors $\epsilon_i \sim N(0, 9)$ were generated. The true model is given by $y_i = x_{i1} + x_{i2} + \epsilon_i$, for $i = 1, \ldots, 50$, using the two variables $x_{i1}$ and $x_{i2}$. We have computed the AIC based on each of the following estimation criteria: (i) LS estimation, (ii) $M$-estimator, (iii) BS estimator, (iv) LTS estimator, and (v) LMS estimator. In order to illustrate the robustness to outliers, we consider the following cases:

(a) vertical outliers (outliers in the $y$ only),
(b) good leverage points (outliers in the $y$ and $\mathbf{x}$),
(c) bad leverage points (outliers in the $\mathbf{x}$ only).

For all situations, we randomly generate (0%, 5%, 10%, 20%, 30%, and 40%) of outliers from $N(50, 0.1^2)$ and $N(100, 0.5^2)$ distributions, respectively. For each of these settings, we simulate 1000 samples.
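One sample of this design can be generated as sketched below. Several details are assumptions, since the text does not fix them: $N(0, 9)$ is read as variance 9 (standard deviation 3), contaminated responses are drawn from $N(50, 0.1^2)$ and contaminated covariates from $N(100, 0.5^2)$, outliers replace existing observations rather than being appended, all three covariates of a contaminated row are replaced, and the outlier count is rounded to an integer. The function name `simulate_sample` and its arguments are hypothetical.

```python
import numpy as np

def simulate_sample(n=50, eps=0.10, kind="vertical", seed=None):
    """Draw one contaminated sample; returns X, y, and the outlier indices."""
    if kind not in ("vertical", "good_leverage", "bad_leverage"):
        raise ValueError(f"unknown contamination kind: {kind}")
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n, 3))
    errors = rng.normal(0.0, 3.0, size=n)   # N(0, 9) read as variance 9
    y = X[:, 0] + X[:, 1] + errors          # true model uses x1 and x2 only

    m = int(round(eps * n))                 # number of contaminated rows
    idx = rng.choice(n, size=m, replace=False)
    if kind in ("vertical", "good_leverage"):
        # Outliers in the y direction (assumed N(50, 0.1^2)).
        y[idx] = rng.normal(50.0, 0.1, size=m)
    if kind in ("bad_leverage", "good_leverage"):
        # Outliers in the x direction (assumed N(100, 0.5^2)).
        X[idx] = rng.normal(100.0, 0.5, size=(m, 3))
    return X, y, idx

X_v, y_v, idx_v = simulate_sample(eps=0.10, kind="vertical", seed=0)
X_b, y_b, idx_b = simulate_sample(eps=0.20, kind="bad_leverage", seed=1)
```

Repeating the call 1000 times per setting, with a different seed each time, would give the replicates over which the selection percentages are tallied.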

3.1. Simulation Result. The resulting fit to the data is classified as one of the following:

(i) correct fit (true model),
(ii) overfit (models containing all the variables in the true model plus other variables that are redundant),
(iii) under fit (models with only a strict subset of the variables in the true model),
(iv) wrong fit (models that are neither of the above).
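The four categories above amount to set comparisons between the selected variables and the true ones, and can be sketched as follows. The function names and the use of integer variable indices are illustrative choices, not part of the paper.

```python
from collections import Counter

CATEGORIES = ("correct fit", "overfit", "under fit", "wrong fit")

def classify_fit(selected, true_vars):
    """Classify a selected variable set against the true variable set."""
    selected, true_vars = set(selected), set(true_vars)
    if selected == true_vars:
        return "correct fit"
    if selected > true_vars:      # strict superset: redundant variables added
        return "overfit"
    if selected < true_vars:      # strict subset: true variables missing
        return "under fit"
    return "wrong fit"            # neither of the above

def selection_percentages(selections, true_vars):
    """Percentage of replicates falling in each category."""
    counts = Counter(classify_fit(s, true_vars) for s in selections)
    total = len(selections)
    return {c: 100.0 * counts.get(c, 0) / total for c in CATEGORIES}

# Toy usage with the true model {x1, x2} and four hypothetical selections.
summary = selection_percentages([{1, 2}, {1, 2}, {1}, {1, 3}], {1, 2})
```

With this convention a model such as $\{x_1, x_3\}$ counts as a wrong fit, because it both omits a true variable and includes a redundant one.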

Tables 3, 4, and 5 show detailed simulation results for the different versions of AIC. For uncontaminated datasets, the classical AIC performs best compared to the robust AIC. When vertical outliers are introduced, the classical AIC selects a large proportion of wrong fit models and, as expected, the robust AIC usually (i.e., with higher proportion) selects the correct model.

For bad leverage points, we observe that AIC tends to produce an overfit, and at high levels of contamination it produces a higher proportion of wrong fits. $\mathrm{AIC}_M$, in turn, tends to produce either an under fit or a wrong fit model. By contrast, the AIC based on high breakdown point estimates retains a comparable proportion of correct fits in the presence of bad leverage points (Table 4).

For good leverage points, AIC also tends to produce an overfit. On the other hand, the robust AIC tends to produce either a correct fit or an under fit model.

Table 3: Percentage of selected models from classical AIC, robust RAIC, AIC_LTS, AIC_LMS, and AIC_BS with vertical outliers.

ε (%)  Fit          AIC    RAIC   AIC_LTS  AIC_LMS  AIC_BS
0      Correct fit  84.6   54.4   57.6     65.2     45.2
       Overfit      15.4   0      0        0        0
       Under fit    0      43.6   41.2     34       54.6
       Wrong fit    0      2.0    1.2      0.8      0.2
5      Correct fit  2.8    49.6   56.8     62.6     45.4
       Overfit      2      0      0        0        0
       Under fit    22     51     42.8     36.6     54.6
       Wrong fit    73.2   1.4    0.4      0.8      0
10     Correct fit  3.2    45.0   51       56       39.8
       Overfit      0.8    0      0        0        0
       Under fit    20.2   52.4   48       43.8     55.2
       Wrong fit    75.8   2.6    1.0      0.2      0
20     Correct fit  4.2    30.8   49.4     58       34.2
       Overfit      0.2    0      0        0        0
       Under fit    22.8   67.8   49.6     41.4     65.8
       Wrong fit    72.8   1.4    1.0      0.6      0
30     Correct fit  1.6    0      44.0     51.2     23.2
       Overfit      0.2    0      0        0        0
       Under fit    22.6   73.8   55.4     48.2     76.8
       Wrong fit    72.8   26.2   0.6      0.6      0
40     Correct fit  2.2    0      37.6     41.2     12.4
       Overfit      0.6    0      0        0        0
       Under fit    22.8   70.8   62.4     58.4     87.4
       Wrong fit    74.4   29.2   0        0.4      0.2

3.2. Example 2 (Stack Loss Data). Stack loss data was presented by [10]. This data set consists of 21 observations on three independent variables, and it contains four outliers (cases 1, 3, 4, and 21) and high leverage points (cases 1, 2, 3, and 21). The data are given in Table 6.

We applied the traditional and robust versions of AIC to the data. Table 7 shows that the classical method selects the full model and the robust RAIC method ignored one of the important variables ($x_{i2}$), whereas the robust AIC methods based on high breakdown point estimators agreed with the importance of the two variables $x_{i1}$ and $x_{i2}$.
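The all-subsets search behind such a comparison can be sketched for the classical criterion as follows. The code assumes the AIC form written in the Appendix with $S_p^2 = \mathrm{SSE}/(n-p)$, that is, $\ln(\mathrm{SSE}/n) + 2p$, and it assumes that $p$ counts the intercept; neither convention is confirmed by the text, so the resulting values are not claimed to reproduce Table 7. The names `STACK_LOSS` and `classical_aic` are illustrative.

```python
from itertools import combinations

import numpy as np

# Stack loss data: columns x1, x2, x3, y (21 cases).
STACK_LOSS = np.array([
    [80, 27, 89, 42], [80, 27, 88, 37], [75, 25, 90, 37], [62, 24, 87, 28],
    [62, 22, 87, 18], [62, 23, 87, 18], [62, 24, 93, 19], [62, 24, 93, 20],
    [58, 23, 87, 15], [58, 18, 80, 14], [58, 18, 89, 14], [58, 17, 88, 13],
    [58, 18, 82, 11], [58, 19, 93, 12], [50, 18, 89, 8], [50, 18, 86, 7],
    [50, 19, 72, 8], [50, 19, 79, 8], [50, 20, 80, 9], [56, 20, 82, 15],
    [70, 20, 91, 15],
], dtype=float)

def classical_aic(X, y):
    """ln(SSE/n) + 2p for an LS fit with intercept (p counts the intercept)."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    sse = float(((y - Xd @ beta) ** 2).sum())
    return float(np.log(sse / n) + 2 * Xd.shape[1])

y = STACK_LOSS[:, 3]
scores = {}
for k in (1, 2, 3):
    for subset in combinations(range(3), k):
        names = tuple(f"x{j + 1}" for j in subset)
        scores[names] = classical_aic(STACK_LOSS[:, list(subset)], y)

# The subset with the smallest criterion value is the selected model.
best = min(scores, key=scores.get)
```

A robust variant would replace the LS fit and $\mathrm{SSE}/(n-p)$ inside `classical_aic` by a high breakdown point fit and its scale estimate, leaving the enumeration over subsets unchanged.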



Page 6: Research Article Robust AIC with High Breakdown Scale Estimatedownloads.hindawi.com/journals/jam/2014/286414.pdf · 1. Introduction Akaike Information Criterion (AIC) [ ] is a powerful

6 Journal of Applied Mathematics

Table 4 Percentage of select models from classical AIC robust RAIC and robust AICLTS AICLMS and AICBS with bad leverage points

120598

AIC RAIC AICLTS AICLMS AICBS

5

Correct fit 0 0 546 608 438Overfit 704 0 0 0 0Under fit 0 644 44 384 554Wrong fit 296 356 14 08 08

10

Correct fit 0 0 638 678 510Overfit 63 0 0 0 0Under fit 0 546 346 314 488Wrong fit 37 424 16 08 02

20

Correct fit 0 0 566 634 492Overfit 548 0 0 0 0Under fit 02 608 424 358 506Wrong fit 448 392 10 08 02

30

Correct fit 0 0 566 614 46Overfit 376 0 0 0 0Under fit 025 296 426 36 538Wrong fit 604 426 08 24 02

40

Correct fit 10 0 552 646 514Overfit 138 0 0 0 0Under fit 12 544 438 338 486Wrong fit 81 454 10 16 0

Table 5 Percentage of select models from classical AIC robust RAIC and robust AICLTS AICLMS and AICBS with good leverage points

120598

AIC RAIC AICLTS AICLMS AICBS

5

Correct fit 02 470 536 584 426Overfit 998 0 0 0 0Under fit 0 504 46 412 574Wrong fit 0 26 04 04 0

10

Correct fit 0 442 546 594 404Overfit 996 0 0 0 0Under fit 02 538 446 398 596Wrong fit 02 20 08 08 0

20

Correct fit 08 324 504 562 332Overfit 976 0 0 0 0Under fit 08 666 484 43 664Wrong fit 08 10 12 08 02

30

Correct fit 18 0 460 506 270Overfit 978 968 0 0 0Under fit 28 28 536 492 728Wrong fit 06 04 04 02 02

40

Correct fit 02 0 374 37 138Overfit 974 100 0 0 0Under fit 22 0 626 626 862Wrong fit 02 0 0 04 0

Journal of Applied Mathematics 7

Table 6 Stack loss data set

x1

x2

x3

119910

80 27 89 4280 27 88 3775 25 90 3762 24 87 2862 22 87 1862 23 87 1862 24 93 1962 24 93 2058 23 87 1558 18 80 1458 18 89 1458 17 88 1358 18 82 1158 19 93 1250 18 89 850 18 86 750 19 72 850 19 79 850 20 80 956 20 82 1570 20 91 15

Table 7: Results of variable selection for the stack loss data.

Selected variables  AIC   RAIC  AIC_LTS  AIC_LMS  AIC_BS
x1                  67    80    47       43       55
x2                  71    65    59       54       70
x3                  84    73    70       64       73
x1 x2               82    90    55       47       69
x1 x3               87    89    69       63       76
x2 x3               91    90    81       73       88
x1 x2 x3            47    106   76       67       91
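The all-subsets comparison behind Table 7 can be sketched for the classical criterion only. The block below fits each of the seven subsets of x1, x2, x3 from Table 6 by least squares and evaluates a criterion of the form ln((n − p)/n · S_p²) + 2p, which is how the appendix formula reads after extraction. Two assumptions are made here and are not confirmed by the text: p counts the intercept, and S_p² is the usual LS residual variance SSE/(n − p). The sketch is not expected to reproduce the printed values in Table 7, and the robust variants (RAIC, AIC_LTS, AIC_LMS, AIC_BS) are not implemented.

```python
import itertools
import numpy as np

# Stack loss data from Table 6: columns x1, x2, x3, y.
STACKLOSS = np.array([
    [80, 27, 89, 42], [80, 27, 88, 37], [75, 25, 90, 37], [62, 24, 87, 28],
    [62, 22, 87, 18], [62, 23, 87, 18], [62, 24, 93, 19], [62, 24, 93, 20],
    [58, 23, 87, 15], [58, 18, 80, 14], [58, 18, 89, 14], [58, 17, 88, 13],
    [58, 18, 82, 11], [58, 19, 93, 12], [50, 18, 89, 8],  [50, 18, 86, 7],
    [50, 19, 72, 8],  [50, 19, 79, 8],  [50, 20, 80, 9],  [56, 20, 82, 15],
    [70, 20, 91, 15],
], dtype=float)

def ls_criterion(X, y):
    """ln((n - p)/n * s2) + 2p with an LS fit; p includes the intercept (assumption)."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])   # intercept plus chosen regressors
    p = design.shape[1]
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    s2 = resid @ resid / (n - p)                # LS residual variance
    return np.log((n - p) / n * s2) + 2 * p

X_all, y = STACKLOSS[:, :3], STACKLOSS[:, 3]
names = ["x1", "x2", "x3"]

scores = {}
for k in range(1, len(names) + 1):
    for cols in itertools.combinations(range(len(names)), k):
        label = " ".join(names[c] for c in cols)
        scores[label] = ls_criterion(X_all[:, list(cols)], y)

best = min(scores, key=scores.get)              # subset with the smallest criterion
for label, value in scores.items():
    print(f"{label:10s} {value:8.3f}")
print("selected:", best)
```

Replacing `s2` with a high breakdown scale estimate of the residuals would give the kind of robust criterion the paper proposes; that substitution is left out because it needs a robust regression fit.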

where

$$
\mathrm{AIC}\left(\Delta_{(\mathbf{x},y)}\right) = \ln\left(\frac{n-p}{n}\cdot S_p^{2}(H_\epsilon)\right) + 2p,
$$

$$
\mathrm{AIC}(H) = \ln\left(\frac{n-p}{n}\cdot S_p^{2}(H)\right) + 2p,
$$

$$
S_p^{2}(H) = \sigma_p^{2},
$$

$$
\begin{aligned}
\left.\frac{\partial}{\partial\epsilon} S_p(H_\epsilon)\right|_{\epsilon=0}
&= \left.\frac{\partial}{\partial\epsilon} S\left((1-\epsilon)F_{\sigma_p} + \epsilon\Delta_{r_{ip}}\right)\right|_{\epsilon=0} \\
&= \sigma_p \left.\frac{\partial}{\partial\epsilon} S\left((1-\epsilon)F_{0} + \epsilon\Delta_{r_{ip}/\sigma_p}\right)\right|_{\epsilon=0} \\
&= \sigma_p\, IF\left(\frac{r_{ip}}{\sigma_p};\, S, F_0\right),
\end{aligned}
\tag{A3}
$$

where

$$
r_{ip} = y_i - \mu - \mathbf{x}_{ip}^{T}\boldsymbol{\beta}_p. \tag{A4}
$$

Inserting (A2) into (A1) yields

$$
IF\left((\mathbf{x},y);\, \mathrm{AIC}, H\right) = \frac{2n}{n-p}\, IF\left(\frac{r_{ip}}{\sigma_p};\, S, F_0\right). \tag{A5}
$$
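Equation (A5) says that the influence function of the criterion is proportional to that of the scale functional S, so the criterion has bounded influence whenever the scale estimate does. A finite-sample way to see the difference is the sensitivity curve, n · (S(sample plus one point z) − S(sample)). The sketch below compares the sample standard deviation with the normalized median absolute deviation (MAD). The MAD is used here only as a simple stand-in for a high breakdown scale; it is not one of the LTS, LMS, or BS scales used in the paper, and the simulated residuals are an assumption for illustration.

```python
import numpy as np

def sd_scale(r):
    """Classical scale: sample standard deviation."""
    return np.std(np.asarray(r, dtype=float), ddof=1)

def mad_scale(r):
    """Normalized median absolute deviation (stand-in for a high breakdown scale)."""
    r = np.asarray(r, dtype=float)
    return 1.4826 * np.median(np.abs(r - np.median(r)))

def sensitivity_curve(scale, sample, z):
    """n * (S(sample with z added) - S(sample)), a finite-sample analogue of IF(z; S, F)."""
    augmented = np.append(sample, z)
    return len(augmented) * (scale(augmented) - scale(sample))

rng = np.random.default_rng(0)
residuals = rng.standard_normal(99)   # simulated clean residuals (assumption)

grid = (10.0, 100.0, 1000.0)
sd_curve = [sensitivity_curve(sd_scale, residuals, z) for z in grid]
mad_curve = [sensitivity_curve(mad_scale, residuals, z) for z in grid]

for z, s, m in zip(grid, sd_curve, mad_curve):
    print(f"z = {z:7.1f}   SD: {s:10.3f}   MAD: {m:8.3f}")
```

The standard deviation column keeps growing with z, while the MAD column stops changing once z lies beyond the bulk of the sample, which is the bounded-influence behaviour that (A5) transfers to the criterion.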

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to thank Professor Nor Aishah Hamzah and Dr. Rossita M. Yunus for their support to complete this study. The author is also grateful to the anonymous reviewers for their valuable comments on an earlier draft of this paper. This research has been funded by University of Malaya, under Grant no. RG208-11AFR.

References

[1] H. Akaike, "Information theory and an extension of the maximum likelihood principle," in Breakthroughs in Statistics, pp. 610–624, Springer, 1992.

[2] R. D. Martin, "Robust estimation of autoregressive models," in Directions in Time Series, pp. 228–254, 1980.

[3] E. Ronchetti, Robust testing in linear models: The infinitesimal approach [doctoral thesis], University of Geneva, Geneva, Switzerland, 1982.

[4] K. Tharmaratnam and G. Claeskens, "A comparison of robust versions of the AIC based on M-, S- and MM-estimators," Statistics, vol. 47, no. 1, pp. 216–235, 2013.

[5] P. J. Huber, "Robust estimation of a location parameter," Annals of Mathematical Statistics, vol. 35, no. 1, pp. 73–101, 1964.

[6] W. S. Krasker and R. E. Welsch, "Efficient bounded-influence regression estimation," Journal of the American Statistical Association, vol. 77, pp. 379–595, 1982.

[7] P. J. Rousseeuw, "Least median of squares regression," Journal of the American Statistical Association, vol. 79, no. 388, pp. 871–880, 1984.

[8] P. Rousseeuw, "Multivariate estimation with high breakdown point," in Mathematical Statistics and Applications, W. Grossmann, G. Pflug, I. Vincze, and W. Wertz, Eds., pp. 283–297, Reidel, Dordrecht, The Netherlands, 1985.

[9] P. J. Huber, Robust Statistics, Springer, New York, NY, USA, 2011.

[10] K. A. Brownlee, Statistical Theory and Methodology in Science and Engineering, John Wiley, New York, NY, USA, 2009.


