Research Article Robust Quadratic Regression and Its...

11
Hindawi Publishing Corporation Mathematical Problems in Engineering Volume 2013, Article ID 210510, 10 pages http://dx.doi.org/10.1155/2013/210510 Research Article Robust Quadratic Regression and Its Application to Energy-Growth Consumption Problem Yongzhi Wang, 1 Yuli Zhang, 2 Fuliang Zhang, 3 and Jining Yi 3,4 1 College of Instrumentation & Electrical Engineering, Jilin University, Changchun 130061, China 2 Department of Automation, TNList, Tsinghua University, Beijing 100084, China 3 Development and Research Center of China Geological Survey, Beijing 100037, China 4 School of Earth Sciences and Resources, China University of Geosciences, Beijing 100083, China Correspondence should be addressed to Yongzhi Wang; [email protected] Received 1 May 2013; Accepted 8 August 2013 Academic Editor: Yudong Zhang Copyright © 2013 Yongzhi Wang et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. We propose a robust quadratic regression model to handle the statistics inaccuracy. Unlike the traditional robust statistic approaches that mainly focus on eliminating the effect of outliers, the proposed model employs the recently developed robust optimization methodology and tries to minimize the worst-case residual errors. First, we give a solvable equivalent semidefinite programming for the robust least square model with ball uncertainty set. en the result is generalized to robust models under 1 - and -norm critera with general ellipsoid uncertainty sets. In addition, we establish a robust regression model for per capital GDP and energy consumption in the energy-growth problem under the conservation hypothesis. Finally, numerical experiments are carried out to verify the effectiveness of the proposed models and demonstrate the effect of the uncertainty perturbation on the robust models. 1. Introduction Traditional regression analysis is a useful tool to model the linear or nonlinear relationship between the observed data. In the simplest linear regression model, there is only one explanatory variable (the regressors) and one depen- dent variable (the regressand) that is assumed to be an affine function of ; it is further extended to a polynomial regression model where is an th order polynomial of . In this case, the corresponding multivariate regression model contains more than one explanatory variable. To make the regression models work well, there are several specific assumptions on the model and the observed data. Consider the following standard multivariate linear regression model: = =1 + , = 1, . . . , , (1) where ( , ) =1 are given observed data and is a ran- dom error vector. Assuming that ( ) =1 have zero mean and constant variance, they are independent of each other. Besides the assumption on the random errors, there is another important weak exogeneity assumption that the explanatory variables are known deterministic values. Under this assumption, one can arbitrarily transform their values and construct any complex function relationship between the regressors and the regressand. For example, in this case the polynomial regression is merely a linear regression with regressors { 1 , 2 ,..., }. Although this weak exogeneity assumption makes the linear regression model very powerful to fit the given data or predict the regressand for given known regressors, it may lead to overfitting or inconsistent estimations [1]. Actually this assumption may be quite unreasonable in some case. For instance, in the process of collecting data, there is oſten unavoidable observation noise that makes the observed data quite inaccurate. Furthermore, in statistics the incomplete sampling approach sometimes can only give an approxima- tion of the real values. Researches on regression models with imprecise data have been reported. One way to handle the noisy observation is the measure error model or the errors-in-variable model, where it is assumed that there exist some unknown latent (or

Transcript of Research Article Robust Quadratic Regression and Its...

Page 1: Research Article Robust Quadratic Regression and Its ...downloads.hindawi.com/journals/mpe/2013/210510.pdf · quadratic regression model with ball uncertainty set. is result is then

Hindawi Publishing CorporationMathematical Problems in EngineeringVolume 2013 Article ID 210510 10 pageshttpdxdoiorg1011552013210510

Research ArticleRobust Quadratic Regression and Its Application toEnergy-Growth Consumption Problem

Yongzhi Wang1 Yuli Zhang2 Fuliang Zhang3 and Jining Yi34

1 College of Instrumentation amp Electrical Engineering Jilin University Changchun 130061 China2Department of Automation TNList Tsinghua University Beijing 100084 China3Development and Research Center of China Geological Survey Beijing 100037 China4 School of Earth Sciences and Resources China University of Geosciences Beijing 100083 China

Correspondence should be addressed to Yongzhi Wang iamwangyongzhi126com

Received 1 May 2013 Accepted 8 August 2013

Academic Editor Yudong Zhang

Copyright copy 2013 Yongzhi Wang et al This is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

Wepropose a robust quadratic regressionmodel to handle the statistics inaccuracyUnlike the traditional robust statistic approachesthat mainly focus on eliminating the effect of outliers the proposed model employs the recently developed robust optimizationmethodology and tries to minimize the worst-case residual errors First we give a solvable equivalent semidefinite programmingfor the robust least square model with ball uncertainty set Then the result is generalized to robust models under 119897

1- and 119897

infin-norm

critera with general ellipsoid uncertainty sets In addition we establish a robust regression model for per capital GDP and energyconsumption in the energy-growth problem under the conservation hypothesis Finally numerical experiments are carried out toverify the effectiveness of the proposed models and demonstrate the effect of the uncertainty perturbation on the robust models

1 Introduction

Traditional regression analysis is a useful tool to modelthe linear or nonlinear relationship between the observeddata In the simplest linear regression model there is onlyone explanatory variable 119909 (the regressors) and one depen-dent variable 119910 (the regressand) that is assumed to be anaffine function of 119909 it is further extended to a polynomialregression model where 119910 is an 119899th order polynomial of119909 In this case the corresponding multivariate regressionmodel contains more than one explanatory variable Tomakethe regression models work well there are several specificassumptions on the model and the observed data

Consider the following standard multivariate linearregression model

119910119894=

119899

sum

119895=1

119909119894119895+ 120576119894 119894 = 1 119898 (1)

where (119909119894 119910119894)119898

119894=1are given observed data and 120576 is a ran-

dom error vector Assuming that (120576119894)119898

119894=1have zero mean

and constant variance they are independent of each other

Besides the assumption on the random errors there isanother important weak exogeneity assumption that theexplanatory variables are known deterministic values Underthis assumption one can arbitrarily transform their valuesand construct any complex function relationship betweenthe regressors and the regressand For example in this casethe polynomial regression is merely a linear regression withregressors 1199091

119894 1199092

119894 119909

119901

119894

Although this weak exogeneity assumption makes thelinear regression model very powerful to fit the given dataor predict the regressand for given known regressors it maylead to overfitting or inconsistent estimations [1] Actuallythis assumption may be quite unreasonable in some caseFor instance in the process of collecting data there is oftenunavoidable observation noise that makes the observed dataquite inaccurate Furthermore in statistics the incompletesampling approach sometimes can only give an approxima-tion of the real values

Researches on regression models with imprecise datahave been reported One way to handle the noisy observationis the measure error model or the errors-in-variable modelwhere it is assumed that there exist some unknown latent (or

2 Mathematical Problems in Engineering

true) variables that follow the true functional relationshipand the actual observations are affected by certain randomnoise [2] Based on different assumptions about randomnoise there are a variety of regression models such as themethod ofmoments [3] that is based on the third-(or higher-)order joint cumulants of observable variables and theDemingregression [4] assuming that the ratio of the noise variance isknown A brief historical overview of linear regression witherrors in variables can be found in [5]

In addition to the errors-in-variablemodelsmotivated bythe robust optimization theory under uncertainty studies onthe robust regression models are reported In such case theperturbations are deterministic and unknown but boundedGhaoui and Lebret [6] study the robust linear regressionwith bounded uncertainty sets under least square criterionThey utilize the second-order cone programming (SOCP)and semi-definite programming to minimize the worst-case residual errors Shivaswamy et al [7] propose SOCPformulations for robust linear classification and regressionmodels when the first two moments of the uncertain data arecomputable Ben-Tal et al [8] provide an excellent frameworkof robust linear classification and regression Based on gen-eral assumption on the uncertainty sets they provide explicitequivalent formulations for robust least squares 119897

1 119897infin and

Huber penalty regressions For more results regarding robustclassification and regression that are similar to this workwe refer the readers to [9ndash11] Also according to [8] thetraditional robust statistic approaches (see [12]) which tryto reject the outliers in data are different from the point ofview in this paper as the authors here intend to minimize themaximal (worst-case) residual errors However in order toovercome the confliction a two-step approach can be easilyimplemented First the outliers are identified and relateddata is removed Then our proposed method is applied inorder to safely eliminate the effect generated from imprecisedata We employ this approach in the real energy-growthregression problem in Section 3

Besides the regression models there are a wide variety offorecasting models such as support vector machines deci-sion tree neural network and Bayes classifier For example[13] utilizes the support vectormachine based on trend-basedsegmentation method for financial time series forecastingThe proposed models have been tested by using variousstocks from America stock market with different trends [14]proposes a new adaptive local linear prediction method toreduce the parameters uncertainties in the prediction of achaotic time series Real hydrological time series are usedto validate the effectiveness of the proposed methods Morerelated literatures can be found in [15] (chaotic time seriesanalysis) [16] (fractal time series) and [17] (knowledge-basedGreenrsquos kernel for support vector regression) Compared withthese models we focus on the handling of the statisticsinaccuracy The regression model is an appropriate basis todevelop effective and tractable robust models

In this paper we try to extend the robust linear regres-sion model to general multivariate quadratic regression andprovide equivalent tractable formulations Different from thesimple extension from the classical linear model to classi-cal polynomial (even general nonlinear) models under the

weak exogeneity assumption the perturbation of explanatoryvariables in the quadratic terms will affect the model in acomplex nonlinear manner Although [8 12] have discussedthe robust polynomial interpolation problem only an upperbound and the corresponding suboptimal coefficients aregiven They further conjecture that the proposed problemcannot be solved exactly in polynomial time Our proposedrobust multivariate quadratic regression model in this paperalso needs to solve a complex biquadratic min-max opti-mization problem However under certain assumption onthe uncertainty sets we can obtain a series of equivalentsemidefinite programming formulations for robust quadraticregression under different residual error criteria

In particular we first extend the traditional quadraticregression model by introducing the separable ball (2-norm)uncertainty set and formulate the optimal robust regressionproblem as a min-max problem that tries to minimize themaximal residual error By utilizing the S-lemma [18] andSchur complement lemma we provide an equivalent semi-definite programming formulation for the robust least squarequadratic regression model with ball uncertainty set Thisresult is then generalized to models with general ellipsoiduncertainty sets and under the 119897

1- 119897infin-norm criteria Fur-

thermore the robust quadratic regression models are appliedto the economic growth and energy consumption regressionproblem We take the per capital GDP as the explanatoryvariable and the per capital energy consumption as thedependent variable Under the conservation hypothesis weestablish a corresponding robust model Finally we test theproposed model on different history data sets and compareour models with the classical regression models

The paper proceeds as follows In Section 2 we presenta general robust quadratic regression model give a solvableequivalent semi-definite programming for the robust leastsquare quadratic regression model with ball uncertainty setand further generalize the result In Section 3 the proposedmodels are applied to the energy-growth problemNumericalexperiments are carried out in Section 4 and Section 5concludes this paper and gives future research directions

2 Robust Quadratic Regression Models

21 General RobustModels Consider the standardmultivari-ate quadratic regression model

119910 = 119909119879119876119909 + 2120572

119879119909 + 120573 (2)

where 119909 isin 119877119899 denotes the 119899-dimension explanatory data 119910 isin

119877 denotes the dependent data and 119876 isin 119877119899times119899 120572 isin 119877

119899 and120573 isin 119877 are unknown coefficients that will be determined basedon certain minimal criteria

Given a set of data 119863 = [119883 119884119879] isin 119877

(119899+1)times119898 where 119883 =

[1199091 119909

119898] isin 119877

119899times119898 and 119884 = [1199101 119910

119898] isin 119877

119898 we utilizethe 119901-norm to measure the prediction error

119890119901(120572 120573 119876) =

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817

(

1199101minus 119909119879

11198761199091minus 21205721198791199091minus 120573

119910119898

minus 119909119879

119898119876119909119898

minus 2120572119879119909119898

minus 120573

)

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817119901

(3)

Mathematical Problems in Engineering 3

In traditional regression models we assume that theexplanatory data are precise and reliable Based on thisweak exogeneity assumption the quadratic regression can beexpressed as the following linear regression

min120572120573119876

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817

(

1199101minus 119876 ∘ 119883

1minus 21205721198791199091minus 120573

119910119898

minus 119876 ∘ 119883119898

minus 2120572119879119909119898

minus 120573

)

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817 119901

(4)

where 119883119894

= 119909119894119909119879

119894are the problem data and the linear

operator ∘ for matrix 119860 and 119861 isin 119877119904times119897 is defined as 119860 ∘

119861 = sum119904

119894=1sum119897

119895=1119860119894119895119861119894119895 Therefore we can easily solve the

above linear regression model for 119901 = 1 2 (the least squareregression) and +infin

To relax the weak exogeneity assumption we assume thatthe real data are contained in the following uncertainty set

119880 = [119883 119884119879] 119909119894= 119909119894+ Δ119909119894

119910119894= 119910119894+ Δ119910119894 119894 = 1 119898

10038171003817100381710038171003817(Δ119910119894 Δ119909119894)119894=1119898

100381710038171003817100381710038172le 120575

(5)

To minimize the worst-case residual error we establishthe following robust quadratic regression model

min120572120573119876

max[119883119884119879]isin119880

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817

(

1199101minus 119909119879

11198761199091minus 21205721198791199091minus 120573

119910119898

minus 119909119879

119898119876119909119898

minus 2120572119879119909119898

minus 120573

)

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817119901

(6)

From the computational perspective although the robustlinear regression problem (where the coefficients 119876 are setto zero) with a large variety of uncertainty sets can beefficiently solved the robust quadratic regression problemsare much more difficult Actually for general uncertaintysets and least square criteria even the inner maximizationproblem which includes convex biquadratic polynomial asthe objective function and general convex set as feasible setis in general not solvable in polynomial run timeNext wewillintroduce some meaningful uncertainty sets and provide thecorresponding tractable equivalences

22 Separable Ball Uncertainty SetsModel In this subsectionwe consider the following separable ball uncertainty set

119880119904

= 1198801times 1198802times sdot sdot sdot times 119880

119898 (7)

where119880119894= (119909

119894 119910119894) isin 119877119899+1

119909119894= 119909119894+ Δ119909119894

119910119894= 119910119894+ Δ1199101198941003817100381710038171003817(Δ119910119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

(8)

and 120575119894ge 0 Thus the inner problem (IP) is of the following

form (here we first consider square of the original objectivefunction)

(IP) max119898

sum

119894=1

(119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573)2

st (119909119894 119910119894) isin 119880119894 119894 = 1 119898

(9)

Note that for the inner problem the separable uncertaintyset and the summation form of the objective functionallow us to decompose it into 119898 small scale subproblemswith quadratic objective function and ball constraints Thequadratic objective function and constraints motivate us touse the following S-lemma to obtain an equivalent solvablereformulation

Lemma 1 (inhomogeneous version of S-lemma [8]) Let119860 119861

be symmetric matrices of the same size and let the quadraticform 119909

119879119860119909 + 2119886

119879119909 + 120573 be strictly positive at some point Then

the implication

119909119879119860119909 + 2119886

119879119909 + 120573 ge 0 997904rArr 119909

119879119861119909 + 2119887

119879119909 + 120573 ge 0 (10)

holds true if and only if

exist120582 ge 0 [119861 minus 120582119860 (119887 minus 120582119886)

119879

119887 minus 120582119886 120573 minus 120582120572] ⪰ 0 (11)

We can obtain the following equivalent semidefiniteprogramming for the separable robust least square quadraticregression model

Proposition 2 The robust least square quadratic regressionmodel with separable uncertainty set 119880

119904is equivalent to the

following semidefinite programming

min120572120573119876V119906119903119905120591

V

st (

119906119894minus 1199051198941205752

119894

119905119894

119905119894119868119899times119899

) + 119865 ⪰ 0 119894 isin 119872+

(

119906119894minus 1205911198941205752

119894

120591119894

120591119894119868119899times119899

) minus 119865 ⪰ 0 119894 isin 119872+

119903119894+ 119906119894ge 0 119903

119894minus 119906119894ge 0 119894 isin 119872

0

119903119894= 120573 + 119909

119879

119894119876119909119894+ 2120572119879119909119894minus 119910119894 119894 = 1 119898

(V 119906

119879

119906 V119868119898times119898

) ⪰ 0

119905119894ge 0 120591

119894ge 0 119906

119894 119903119894isin 119877 119894 = 1 119898

V 120573 isin 119877 120572 isin 119877119899 119876 isin 119877

119899times119899

(12)

where

119865 = (

119903119894

minus1

2(119876119879119909119894+ 120572)119879

minus1

20 0

1times119899

119876119879119909119894+ 120572 0

119899times1119876

)

119872+= 119894 120575

119894gt 0⋂ 1 119898

1198720= 119894 120575

119894= 0⋂ 1 119898

(13)

4 Mathematical Problems in Engineering

Proof First consider the inner maximization subproblem Itis obvious that

(IP119894) max (119910

119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573)2

st (119909119894 119910119894) isin 119880119894

lArrrArr min 1199062

119894

st 119906119894ge

10038161003816100381610038161003816119910119894minus119909119879

119894119876119909119894minus2120572119879119909119894minus120573

10038161003816100381610038161003816 forall(119909

119894 119910119894) isin 119880119894

(14)

If 120575119894= 0 we have that

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894+ 119906119894ge 0

forall(119909119894 119910119894) isin 119880119894lArrrArr 119903

119894+ 119906119894ge 0

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894minus 119906119894ge 0

forall(119909119894 119910119894) isin 119880119894lArrrArr 119903

119894minus 119906119894ge 0

(15)

where 119903119894= 119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894

If 120575119894gt 0 we can utilize the S-lemma as follows

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894+ 119906119894ge 0 forall(119909

119894 119910119894) isin 119880119894

lArrrArr Δ119909119879

119894119876Δ119909119894+ 2(119876

119879119909119894+ 120572)119879

Δ119909119894minus Δ119910119894+ 119903119894+ 119906119894ge 0

forall1003817100381710038171003817(Δ119910119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

lArrrArr (

1

Δ119910119894

Δ119909119894

)

119879

(

119906119894+ 119903119894

minus1

2(119876119879119909119894+ 120572)119879

minus1

20 0

1times119899

119876119879119909119894+ 120572 0

119899times1119876

)

times (

1

Δ119910119894

Δ119909119894

) ge 0 forall1003817100381710038171003817(Δ119910119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

lArrrArr exist119905119894ge0 st(

119906119894+ 119903119894minus 1199051198941205752

119894minus1

2(119876119879119909119894+ 120572)119879

minus1

2119905119894

01times119899

119876119879119909119894+ 120572 0

119899times1119876 + 119905119894119868119899times119899

)

⪰ 0

(16)

Note that in the last step if 1205752

119894gt 0 then there exists

(Δ119910119894 Δ119909119894) = (0 0) such that quadratic form 120575

2

119894minus(Δ119910

119894 Δ119909119894)2

2

is strictly positive thus the condition of S-lemma holds trulySimilarly we have that

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894minus 119906119894le 0 forall(119909

119894 119910119894) isin 119880119894

lArrrArr Δ119909119879

119894119876Δ119909119894+ 2(119876

119879119909119894+ 120572)119879

Δ119909119894minus Δ119910119894+ 119903119894minus 119906119894le 0

forall1003817100381710038171003817(Δ119910119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

lArrrArr exist120591119894ge 0 st(

119906119894minus 119903119894minus 1205911198941205752

119894

1

2minus(119876119879119909119894+ 120572)119879

1

2120591119894

01times119899

minus119876119879119909119894minus 120572 0

119899times1120591119894119868119899times119899

minus 119876

)

⪰ 0

(17)

Thus the inner maximization problem is equivalent to thefollowing semi-definite programming

(IP) min119906119894 119903119894 119905119894120591119894

119898

sum

119894=1

1199062

119894

st (

119906119894+ 119903119894minus 1199051198941205752

119894minus1

2(119876119879119909119894+ 120572)119879

minus1

2119905119894

01times119899

119876119879119909119894+ 120572 0

119899times1119876 + 119905119894119868119899times119899

) ⪰ 0

119894 isin 119872+

(

119906119894minus 119903119894minus 1205911198941205752

119894

1

2minus(119876119879119909119894+ 120572)119879

1

2120591119894

01times119899

minus119876119879119909119894minus 120572 0

119899times1120591119894119868119899times119899

minus 119876

) ⪰ 0

119894 isin 119872+

119903119894+ 119906119894ge 0 119903

119894minus 119906119894ge 0 119894 isin 119872

0

119903119894= 120573 + 119909

119879

119894119876119909119894+ 2120572119879119909119894minus 119910119894

119894 = 1 119898

119905119894ge 0 120591

119894ge 0 119906

119894 119903119894isin 119877 119894 = 1 119898

(18)

Note that based on the Schur complement lemma thesecond-order cone constraint V ge radicsum

119898

119894=11199062119894can also be

formalized as the following semi-definite constraint

(V 119906

119879

119906 V119868119898times119898

) ⪰ 0 (19)

Thus we complete the proof by embedding the equivalentsemi-definite programming into the outer problem

Due to the advance of interior algorithms for conicprogramming the above semidefinite programming can beefficiently solved in polynomial run time There are severalefficient and free software packages for solving the semidef-inite programming such as the SDPT3 [19] Next we makeseveral extensions based on the separable robust least squarequadratic regression model

Mathematical Problems in Engineering 5

23 Ellipsoid Uncertainty Set and More Norm CriterionThe above result on standard ball uncertainty set can befurther extended to that on the following general ellipsoiduncertainty set

1198801015840

119904= 1198801015840

1times 1198801015840

2times sdot sdot sdot times 119880

1015840

119898

1198801015840

119894= (119909

119894 119910119894) isin 119877119899+1

119909119894= 119909119894+ Δ119909119894

119910119894= 119910119894+ Δ1199101198941003817100381710038171003817119875119894 (Δ119910

119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

(20)

where 119875119894

isin 119877119896times(119899+1) Linear transformation operator 119875

119894

allows us to impose more restrictions on the uncertaintyset For example if we choose the diagonal matrix 119875

119894=

Diag1205901 120590

119899+1 we can put different weights on deviation

of components of (119909119894 119910119894) general matrix can further restrict

the correlated deviation of different componentsTo obtain the corresponding reformulation we only need

to modify the first two constraints based on the S-lemma asfollows

(119906119894minus 1199051198941205752

119894

119905119894119875119879

119894119875119894

) + 119865 ⪰ 0 119894 isin 119872+

(119906119894minus 1205911198941205752

119894

120591119894119875119879

119894119875119894

) minus 119865 ⪰ 0 119894 isin 119872+

(21)

We further consider the robust quadratic regressionmodels with 119897

infin-norm and 119897

1-norm criterion Note that for

119897infin-norm criteria the inner maximization problem is of the

following form

max1le119894le119898

max(119909119894 119910119894)isin119880119894

119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

lArrrArr min 119906 119906 ge10038161003816100381610038161003816119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

10038161003816100381610038161003816

forall (119909119894 119910119894) isin 119880119894 1 le 119894 le 119898

(22)

And for 1198971-norm criteria we have the following equivalent

reformulation

max1le119894le119898

119898

sum

119894=1

10038161003816100381610038161003816119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

10038161003816100381610038161003816

(119909119894 119910119894) isin 119880119894 1 le 119894 le 119898

lArrrArr min

119898

sum

119894=1

119906119894 119906119894ge

10038161003816100381610038161003816119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

10038161003816100381610038161003816

forall (119909119894 119910119894) isin 119880119894 1 le 119894 le 119898

(23)

Using the similar approach as in Proposition 2 both canbe further reformulated as semi-definite programming

Proposition 3 The separable robust quadratic regressionmodel under 119897

infin-norm and 119897

1-norm criteria are equivalent to

the following semidefinite programming respectively

(119897infin-norm) min

120572120573119876119906119903119905120591

119906

st (

119906 minus 1199051198941205752

119894

119905119894

119905119894119868119899times119899

) +119865⪰0 119894isin119872+

(

119906 minus 1205911198941205752

119894

120591119894

120591119894119868119899times119899

) minus119865⪰ 0 119894isin119872+

119903119894+ 119906 ge 0 119903

119894minus 119906 ge 0 119894 isin 119872

0

119903119894=120573+119909

119879

119894119876119909119894+2120572119879119909119894minus119910119894 119894=1 119898

119905119894ge 0 120591119894ge 0 119903119894isin 119877 119894 = 1 119898

119906 120573 isin 119877 120572 isin 119877119899 119876 isin 119877

119899times119899

(1198971-norm) min

120572120573119876119906119903119905120591

119898

sum

119894=1

119906119894

st (

119906119894minus 1199051198941205752

119894

119905119894

119905119894119868119899times119899

)+119865 ⪰ 0 119894 isin 119872+

(

119906119894minus 1205911198941205752

119894

120591119894

120591119894119868119899times119899

)minus119865⪰0 119894 isin 119872+

119903119894+ 119906119894ge 0 119903

119894minus 119906119894ge 0 119894 isin 119872

0

119903119894= 120573 + 119909

119879

119894119876119909119894+ 2120572119879119909119894minus 119910119894

119894 = 1 119898

119905119894ge 0 120591

119894ge 0 119906

119894 119903119894isin 119877 119894 = 1 119898

120573 isin 119877 120572 isin 119877119899 119876 isin 119877

119899times119899

(24)

where

119865 = (

119903119894

minus1

2(119876119879119909119894+ 120572)119879

minus1

20 0

1times119899

119876119879119909119894+ 120572 0

119899times1119876

) (25)

3 Robust Energy-Growth Regression Models

Studies have been reported on the causal relationshipbetween economic growth and energy consumption In thissection we try to apply the proposed robust quadraticregression model to the energy-growth problem

The seminal paper of J Kraft andA Kraft [20] first studiesthe casual relationship for USA In a recent survey Ilhan [21]categorizes the casual relationships into four types no causal-ity unidirectional causality running from economic growth

6 Mathematical Problems in Engineering

1960 1970 1980 1990 2000 20100

2

Year

2

4

Per c

apita

l ene

rgy

(ton)

Per c

apita

l GD

P ($

1000

0)

Per capital GDP Per capital energy

(a)

15

20

25

30

35

40

45

50

Per c

apita

l ene

rgy

(ton)

06 08 10 12 14 16 18 20Per capital GDP ($10000)

(b)

Figure 1 Germany data from 1960 to 2006

0

2

4

Year

2

4

6

8

10Pe

r cap

ital e

nerg

y (to

n)

1860 1880 1900 1920 1940 1960 1980 2000 2020

Per c

apita

l GD

P ($

1000

0)

Per capital GDP Per capital energy

(a)

2

3

4

5

6

7

8

9

10

00 05 10 15 20 25 30 35

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

(b)

Figure 2 USA data from 1870 to 2006

to energy consumption the reverse case and the bidirectionalcausality Note that the resulted relationships depend onthe selected data and analysis approaches Sometimes theresults obtained from different approaches conflict with eachother when even using the data from the same country Forexample using the Toda-Yamamoto causality test methodBowden and Payne [22] show that energy consumption playsan important role in economic growth in USA based onhistory data from 1949 to 2006 while using the same methodSoytas and Sari [23] find that no causality exists betweenthem based on USA data from 1960 to 2006 On the otherhand based on the sameUSArsquos data from 1947 to 1990 Cheng[24] and Stern [25] conclude different causalities by utilizingdifferent analyzing approaches

Unlike the previous energy-growth studies we attemptto provide a long-run stationary regression model betweenthe per capital GDP (G) and per capital energy consumption

(EC) The underlying assumption of our model is similarto the traditional ldquoconservation hypothesisrdquo that means thatan increase in real GDP will cause an increase in energyconsumption [21] The ldquoper capitalrdquo perspective providesus with a new insight on the causality and new regressionmodels Figures 1 and 2 demonstrate the relationship betweenper capital energy consumption and per capital GDP inUSA and Germany respectively From the subfigures onthe left hand side we can see that in both countries thereis a gradual increase in economy while the per capitalenergy consumption may decrease after reaching a certainlevel the subfigures on the right hand side inspire us toestablish a nonlinear regression model to characterize therelationship

To eliminate effect of the imprecise statistics data weemploy the proposed robust quadratic regression model andput different weights on the residual errors at different time

Mathematical Problems in Engineering 7

Table 1 LS-CQR and LS-RQR models with different 120598

Model 119876 120572 120573 Err 119879 (s)CQR 120598 = 000 minus4254 6721 minus6099 1688 0000RQR 120598 = 001 minus3899 6225 minus5433 1621 0500RQR 120598 = 002 minus3690 5938 minus5063 1663 0500RQR 120598 = 003 minus3423 5561 minus4564 1735 0500RQR 120598 = 004 minus2900 4755 minus3363 2029 0516RQR 120598 = 005 minus2243 3719 minus1817 2574 0484

points Specifically we establish the followingweighted robustquadratic regression model

min120572120573119902

max(119866119905 119864119862119905)isin119880

120598

119905

(

119879

sum

119905=1

(119908119905(EC119905minus 1199021198662

119905minus 2120572119866

119905minus 120573))119901

)

1119901

(26)

where the weight factor 119908119905

isin [0 1] represents the relativeimportance of the predicted residual error in the 119905th year Wecould set 119908

119905= 0 for the abnormal data point and set 119908

119905as an

increase function of 119905 to emphasize the importance of recentdata The uncertainty set is defined as

119880120598

119905= (119866

119905EC119905)

10038171003817100381710038171003817(119866119905minus 119866119905) (EC

119905minus EC119905)100381710038171003817100381710038172

le 120575119905 (27)

where 120575119905

= 120576radic1198662

119905+ EC2119905 Parameter 120576 controls the relative

amplitude of the fluctuation in observed dataThe weighted robust quadratic regression model can be

summarized as follows

(1) Solve the classical quadratic regression model usingthe nominal values (119866

119905EC119905)119879

119905=1

(2) Based on the quadratic regression remove the datawith the first 119896 largest residual errors and set weightsvalue 119908

119905

(3) Solve the equivalent semi-definite programmingproblem and return the final weighted robustquadratic regression model

4 Numerical Experiments

In this section we verify the effectiveness of the proposedrobust quadratic regression models on several data sets Theequivalent semi-definite programming problem is solved bythe SDPT3 solver [19] Numerical experiments are imple-mented usingMATLAB 770 and run on Intel(R) Core(TM)2CPU E7400

First we test the proposed robust least square quadraticregression (LS-RQR) model with Germany data from 1960 to2006As previously discussed after the preliminary quadraticregression analysis we will remove the data with the first 119896largest residual errors where 119896 = 3timesdata sizeThen for therest of data we establish the classical least square quadraticregression (LS-CQR) and LS-RQR models respectively

Table 1 lists the computation results for LS-CQR andLS-RQR with a series of 120598 values The listed Err value

History data

15

20

25

30

35

40

45

50

Per c

apita

l ene

rgy

(ton)

06 08 10 12 14 16 18 20Per capital GDP ($10000)

LS-RQR ( = 001)LS-CQR ( = 000)

LS-RQR ( = 002)

LS-RQR ( = 004)LS-RQR ( = 003)

LS-RQR ( = 005)120598

120598

120598

120598

120598

120598

Figure 3 LS-CQR and LS-RQR models on Germany data

0

2

4

6

8

10

12

14

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 001)

LS-RQR ( = 003)LS-RQR ( = 005)

120598

120598

120598

120598

Figure 4 Mean square error of LS-CQR and LS-RQRmodels when120598 varies

represents the mean square error from the nominal valueand 119879 represents the run time for solving the optimizationproblem It is seen that the resulted robust model exhibitssmaller absolute values of 119876 120572 and 120573 with the increase of120598 value that is the regression curve is more flat as the model

8 Mathematical Problems in Engineering

15

20

25

30

35

40

45

50

History data

06 08 10 12 14 16 18 20

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

L1-RQR ( = 002)L2-RQR ( = 002)LI-RQR ( = 002)

120598

120598120598

Figure 5 RQR models under 1198971- 1198972- and 119897

infin-norm criteria

2

3

4

5

6

7

8

9

10

00 05 10 15 20 25 30 35

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

5

10

15

20

25

30

Err

000 002 004 006 008 010

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 6 USA data from 1870 to 2006

parameters are less precise It is obvious that one drawback ofthe robustmodel is that themean square error will increase asuncertainty increases Figure 3 plots the regression curves fordifferent models and also supports our analysis of the effectof increasing data uncertainty on robust regression

To demonstrate the effectiveness of the robust modelswe test the worst-case performance of the resulted modelswhen 120598 varies from 0 to 01 Specifically for each 120598 valuewe randomly generate 500 groups of data from the defineduncertainty set 119880120598

119905and then calculate the maximal residual

error at each data point Figure 4 plots the worst-case errorof LS-CQR model and LS-RQR models with 120598 = 001 003and 005 It is seen that the error of LS-CQR model increases

rapidly and LS-RQR with 120598 = 005 has the most flat errorcurve Figure 4 also indicates that it is critical to accuratelyestimate the variability of the data and set proper value for120598 In our case we recommend LS-RQR with 120598 = 003 that isalmost always better than the traditional LS-CQR model

Next we test the proposed RQR models under 1198971(L1-

RQR) and 119897infin-(LI-RQR) norm criteria on the same data set

Figure 5 plots the corresponding regression curves for thesame uncertainty set 120598 = 002 For the same 120598 value LI-RQR model can be considered as the most robust one andL1-RQR andL2-RQRmodels are similar It is noticeable that itcontradicts with the traditional robust regression terms Forexample [26] refers to the 119897

1-norm regression as the robust

Mathematical Problems in Engineering 9

10 12 14 16 18 20 22 24

35

40

45

50

55

60Pe

r cap

ital e

nerg

y (to

n)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)120598

120598

(b)

Figure 7 Switzerland data from 1965 to 2006

06 08 10 12 14 16 18 20 22 24

25

30

35

40

45

50

55

60

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 8 Belgium data from 1960 to 2006

regression model in the sense that the corresponding modelis insensitive to the large residual errors(corresponding to theoutliers)However after removing the possible abnormal datapoints here we try tomake our regression analysis insensitiveto the worst-case residual errors at each data point

Finally we apply the proposed RQR model on more datasets including USA data from 1870 to 2006 Switzerland datafrom 1965 to 2006 and Belgium data from 1960 to 2006Figures 6 7 and 8 give the resulted regressionmodels and theworst-case residual errors for different 120598 values It is seen thatthe proposed RQRmodels still almost always outperform the

CQR model especially for large uncertainty sets Based onthe robust quadratic regression models these three countriesreach the highest per capital energy consumption points atper capital GDP value around 23 000 while the peak valuesvary from 57 to 85 Ton

5 Conclusions and Future Works

In this paper we studied themultivariate quadratic regressionmodel with imprecise statistic data Unlike the traditionalrobust statistic approaches that focus on the detection of

10 Mathematical Problems in Engineering

the outliers and the elimination of the effects we employedthe recently developed robust optimization framework anduncertainty set theory

In particular we first extended the existing robust lin-ear regression results to the robust least square quadraticregression model with the separable ball uncertainty setThe specific form of the uncertainty set allowed us to usethe well-known S-lemma and give the tractable equivalentsemidefinite programmingWe further generalized the resultto robust models under 119897

1- and 119897

infin-norm criteria with general

ellipsoid uncertainty sets Next the proposed robust modelswere applied to the energy-growth problem Under the clas-sical conservation hypothesis we employed the traditionalquadratic regression model to remove the abnormal dataand established a robust quadratic regression model forthe per capital GDP and per capital energy consumptionFinally the proposed models were tested on the historydata of Germany USA Switzerland and Belgium From thenumerical experiments we found that (1) the amplitude of theuncertainty perturbation 120575 plays a critical role on the robustmodels (2) with the increase of 120575 the robust model has amore flat curve (3) for the same 120575 value compared with 119897

1-

and 1198972-normmodels 119897

infin-norm model is the most robust one

(4) as expected the robust approach provides a serial robustregression models that can reduce the worst-case residualerrors when the observed data contain noise

For further research robust polynomial (nonlinear)regressionmodels are interesting in their own right Althoughwe may always reduce them to the linear regression modelwith polynomially (or nonlinearly) transformed uncertaintydata set it is still worth studying whether the resultedregression models are solvable for quadratic regression withcoupled uncertainty sets

Acknowledgment

This work was supported by Geological Survey Project ofChina (nos 1212010881801 1212011120995)

References

[1] Z Griliches and V Ringstad ldquoErrors-in-the-variables bias innonlinear contextsrdquo Econometrica vol 38 no 2 pp 368ndash3701970

[2] W A Fuller Measurement Error Models John Wiley amp SonsNew York NY USA 1987

[3] T Erickson and T M Whited ldquoTwo-step GMM estimationof the errors-in-variables model using high-order momentsrdquoEconometric Theory vol 18 no 3 pp 776ndash799 2002

[4] P J Cornbleet and N Gochman ldquoIncorrect least-squaresregression coefficientsrdquo Clinical Chemistry vol 25 no 3 pp432ndash438 1979

[5] J W Gillard ldquoAn historical overview of linear regression witherrors in both variablesrdquo Tech Rep Cardiff University Schoolof Mathematics Cardiff UK 2006

[6] L El Ghaoui and H Lebret ldquoRobust solutions to least-squaresproblemswith uncertain datardquo SIAM Journal onMatrixAnalysisand Applications vol 18 no 4 pp 1035ndash1064 1997

[7] P K Shivaswamy C Bhattacharyya and A J Smola ldquoSecondorder cone programming approaches for handling missing anduncertain datardquo Journal ofMachine Learning Research vol 7 pp1283ndash1314 2006

[8] A Ben-Tal L El Ghaoui and A Nemirovski Robust Optimiza-tion Princeton University Press Princeton NJ USA 2009

[9] T B Trafalis and R C Gilbert ldquoRobust classification andregression using support vector machinesrdquo European Journal ofOperational Research vol 173 no 3 pp 893ndash909 2006

[10] H Xu C Caramanis and S Mannor ldquoRobustness and reg-ularization of support vector machinesrdquo Journal of MachineLearning Research vol 10 pp 1485ndash1510 2009

[11] T B Trafalis andRCGilbert ldquoRobust support vectormachinesfor classification and computational issuesrdquoOptimizationMeth-ods amp Software vol 22 no 1 pp 187ndash198 2007

[12] P J Huber Robust Statistics JohnWiley amp Sons New York NYUSA 1981

[13] J L Wu and P C Chang ldquoA trend-based segmentation methodand the support vector regression for financial time seriesforecastingrdquo Mathematical Problems in Engineering vol 2012Article ID 615152 20 pages 2012

[14] D X She and X H Yang ldquoA new adaptive local linearprediction method and its application in hydrological timeSeriesrdquoMathematical Problems in Engineering vol 2010 ArticleID 205438 15 pages 2010

[15] Z Liu ldquoChaotic time series analysisrdquoMathematical Problems inEngineering vol 2010 Article ID 720190 31 pages 2010

[16] M Li ldquoFractal time series a tutorial reviewrdquo MathematicalProblems in Engineering vol 2010 Article ID 157264 26 pages2010

[17] T Farooq A Guergachi and S Krishnan ldquoKnowledge-basedgreenrsquos kernel for support vector regressionrdquo MathematicalProblems in Engineering vol 2010 Article ID 378652 16 pages2010

[18] I Polik and T Terlaky ldquoA survey of the S-lemmardquo SIAMReviewvol 49 no 3 pp 371ndash418 2007

[19] K C Toh R H Tutunu and M J Todd ldquoOn the imple-mentation and usage of SDPT3Ca Matlab software package forsemidefinitequadratic-linear programmingrdquo version 4 0 2006httpecommonslibrarycornelleduhandle181315133

[20] J Kraft and A Kraft ldquoOn the relationship between energy andGNPrdquo Journal of Energy and Development vol 3 no 2 pp 401ndash403 1978

[21] O Ilhan ldquoA literature survey on energy growth nexusrdquo EnergyPolicy vol 38 pp 340ndash349 2010

[22] N Bowden and J E Payne ldquoThe causal relationship betweenUSenergy consumption and real output a disaggregated analysisrdquoJournal of Policy Modeling vol 31 no 2 pp 180ndash188 2009

[23] U Soytas and R Sari ldquoEnergy consumption economic growthand carbon emissions challenges faced by an EU candidatememberrdquo Ecological Economics vol 68 no 6 pp 1667ndash16752009

[24] B Cheng ldquoAn investigation of cointegration and causalitybetween energy consumption and economic growthrdquo Journalof Energy Development vol 21 no 1 pp 73ndash84 1995

[25] D I Stern ldquoEnergy and economic growth in the USA amultivariate approachrdquo Energy Economics vol 15 no 2 pp 137ndash150 1993

[26] S Boyd andLVandenbergheConvexOptimization CambridgeUniversity Press Cambridge UK 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 2: Research Article Robust Quadratic Regression and Its ...downloads.hindawi.com/journals/mpe/2013/210510.pdf · quadratic regression model with ball uncertainty set. is result is then

2 Mathematical Problems in Engineering

true) variables that follow the true functional relationshipand the actual observations are affected by certain randomnoise [2] Based on different assumptions about randomnoise there are a variety of regression models such as themethod ofmoments [3] that is based on the third-(or higher-)order joint cumulants of observable variables and theDemingregression [4] assuming that the ratio of the noise variance isknown A brief historical overview of linear regression witherrors in variables can be found in [5]

In addition to the errors-in-variablemodelsmotivated bythe robust optimization theory under uncertainty studies onthe robust regression models are reported In such case theperturbations are deterministic and unknown but boundedGhaoui and Lebret [6] study the robust linear regressionwith bounded uncertainty sets under least square criterionThey utilize the second-order cone programming (SOCP)and semi-definite programming to minimize the worst-case residual errors Shivaswamy et al [7] propose SOCPformulations for robust linear classification and regressionmodels when the first two moments of the uncertain data arecomputable Ben-Tal et al [8] provide an excellent frameworkof robust linear classification and regression Based on gen-eral assumption on the uncertainty sets they provide explicitequivalent formulations for robust least squares 119897

1 119897infin and

Huber penalty regressions For more results regarding robustclassification and regression that are similar to this workwe refer the readers to [9ndash11] Also according to [8] thetraditional robust statistic approaches (see [12]) which tryto reject the outliers in data are different from the point ofview in this paper as the authors here intend to minimize themaximal (worst-case) residual errors However in order toovercome the confliction a two-step approach can be easilyimplemented First the outliers are identified and relateddata is removed Then our proposed method is applied inorder to safely eliminate the effect generated from imprecisedata We employ this approach in the real energy-growthregression problem in Section 3

Besides the regression models there are a wide variety offorecasting models such as support vector machines deci-sion tree neural network and Bayes classifier For example[13] utilizes the support vectormachine based on trend-basedsegmentation method for financial time series forecastingThe proposed models have been tested by using variousstocks from America stock market with different trends [14]proposes a new adaptive local linear prediction method toreduce the parameters uncertainties in the prediction of achaotic time series Real hydrological time series are usedto validate the effectiveness of the proposed methods Morerelated literatures can be found in [15] (chaotic time seriesanalysis) [16] (fractal time series) and [17] (knowledge-basedGreenrsquos kernel for support vector regression) Compared withthese models we focus on the handling of the statisticsinaccuracy The regression model is an appropriate basis todevelop effective and tractable robust models

In this paper we try to extend the robust linear regres-sion model to general multivariate quadratic regression andprovide equivalent tractable formulations Different from thesimple extension from the classical linear model to classi-cal polynomial (even general nonlinear) models under the

weak exogeneity assumption the perturbation of explanatoryvariables in the quadratic terms will affect the model in acomplex nonlinear manner Although [8 12] have discussedthe robust polynomial interpolation problem only an upperbound and the corresponding suboptimal coefficients aregiven They further conjecture that the proposed problemcannot be solved exactly in polynomial time Our proposedrobust multivariate quadratic regression model in this paperalso needs to solve a complex biquadratic min-max opti-mization problem However under certain assumption onthe uncertainty sets we can obtain a series of equivalentsemidefinite programming formulations for robust quadraticregression under different residual error criteria

In particular we first extend the traditional quadraticregression model by introducing the separable ball (2-norm)uncertainty set and formulate the optimal robust regressionproblem as a min-max problem that tries to minimize themaximal residual error By utilizing the S-lemma [18] andSchur complement lemma we provide an equivalent semi-definite programming formulation for the robust least squarequadratic regression model with ball uncertainty set Thisresult is then generalized to models with general ellipsoiduncertainty sets and under the 119897

1- 119897infin-norm criteria Fur-

thermore the robust quadratic regression models are appliedto the economic growth and energy consumption regressionproblem We take the per capital GDP as the explanatoryvariable and the per capital energy consumption as thedependent variable Under the conservation hypothesis weestablish a corresponding robust model Finally we test theproposed model on different history data sets and compareour models with the classical regression models

The paper proceeds as follows In Section 2 we presenta general robust quadratic regression model give a solvableequivalent semi-definite programming for the robust leastsquare quadratic regression model with ball uncertainty setand further generalize the result In Section 3 the proposedmodels are applied to the energy-growth problemNumericalexperiments are carried out in Section 4 and Section 5concludes this paper and gives future research directions

2 Robust Quadratic Regression Models

21 General RobustModels Consider the standardmultivari-ate quadratic regression model

119910 = 119909119879119876119909 + 2120572

119879119909 + 120573 (2)

where 119909 isin 119877119899 denotes the 119899-dimension explanatory data 119910 isin

119877 denotes the dependent data and 119876 isin 119877119899times119899 120572 isin 119877

119899 and120573 isin 119877 are unknown coefficients that will be determined basedon certain minimal criteria

Given a set of data 119863 = [119883 119884119879] isin 119877

(119899+1)times119898 where 119883 =

[1199091 119909

119898] isin 119877

119899times119898 and 119884 = [1199101 119910

119898] isin 119877

119898 we utilizethe 119901-norm to measure the prediction error

119890119901(120572 120573 119876) =

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817

(

1199101minus 119909119879

11198761199091minus 21205721198791199091minus 120573

119910119898

minus 119909119879

119898119876119909119898

minus 2120572119879119909119898

minus 120573

)

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817119901

(3)

Mathematical Problems in Engineering 3

In traditional regression models we assume that theexplanatory data are precise and reliable Based on thisweak exogeneity assumption the quadratic regression can beexpressed as the following linear regression

min120572120573119876

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817

(

1199101minus 119876 ∘ 119883

1minus 21205721198791199091minus 120573

119910119898

minus 119876 ∘ 119883119898

minus 2120572119879119909119898

minus 120573

)

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817 119901

(4)

where 119883119894

= 119909119894119909119879

119894are the problem data and the linear

operator ∘ for matrix 119860 and 119861 isin 119877119904times119897 is defined as 119860 ∘

119861 = sum119904

119894=1sum119897

119895=1119860119894119895119861119894119895 Therefore we can easily solve the

above linear regression model for 119901 = 1 2 (the least squareregression) and +infin

To relax the weak exogeneity assumption we assume thatthe real data are contained in the following uncertainty set

119880 = [119883 119884119879] 119909119894= 119909119894+ Δ119909119894

119910119894= 119910119894+ Δ119910119894 119894 = 1 119898

10038171003817100381710038171003817(Δ119910119894 Δ119909119894)119894=1119898

100381710038171003817100381710038172le 120575

(5)

To minimize the worst-case residual error we establishthe following robust quadratic regression model

min120572120573119876

max[119883119884119879]isin119880

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817

(

1199101minus 119909119879

11198761199091minus 21205721198791199091minus 120573

119910119898

minus 119909119879

119898119876119909119898

minus 2120572119879119909119898

minus 120573

)

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817119901

(6)

From the computational perspective although the robustlinear regression problem (where the coefficients 119876 are setto zero) with a large variety of uncertainty sets can beefficiently solved the robust quadratic regression problemsare much more difficult Actually for general uncertaintysets and least square criteria even the inner maximizationproblem which includes convex biquadratic polynomial asthe objective function and general convex set as feasible setis in general not solvable in polynomial run timeNext wewillintroduce some meaningful uncertainty sets and provide thecorresponding tractable equivalences

22 Separable Ball Uncertainty SetsModel In this subsectionwe consider the following separable ball uncertainty set

119880119904

= 1198801times 1198802times sdot sdot sdot times 119880

119898 (7)

where119880119894= (119909

119894 119910119894) isin 119877119899+1

119909119894= 119909119894+ Δ119909119894

119910119894= 119910119894+ Δ1199101198941003817100381710038171003817(Δ119910119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

(8)

and 120575119894ge 0 Thus the inner problem (IP) is of the following

form (here we first consider square of the original objectivefunction)

(IP) max119898

sum

119894=1

(119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573)2

st (119909119894 119910119894) isin 119880119894 119894 = 1 119898

(9)

Note that for the inner problem the separable uncertaintyset and the summation form of the objective functionallow us to decompose it into 119898 small scale subproblemswith quadratic objective function and ball constraints Thequadratic objective function and constraints motivate us touse the following S-lemma to obtain an equivalent solvablereformulation

Lemma 1 (inhomogeneous version of S-lemma [8]) Let119860 119861

be symmetric matrices of the same size and let the quadraticform 119909

119879119860119909 + 2119886

119879119909 + 120573 be strictly positive at some point Then

the implication

119909119879119860119909 + 2119886

119879119909 + 120573 ge 0 997904rArr 119909

119879119861119909 + 2119887

119879119909 + 120573 ge 0 (10)

holds true if and only if

exist120582 ge 0 [119861 minus 120582119860 (119887 minus 120582119886)

119879

119887 minus 120582119886 120573 minus 120582120572] ⪰ 0 (11)

We can obtain the following equivalent semidefiniteprogramming for the separable robust least square quadraticregression model

Proposition 2 The robust least square quadratic regressionmodel with separable uncertainty set 119880

119904is equivalent to the

following semidefinite programming

min120572120573119876V119906119903119905120591

V

st (

119906119894minus 1199051198941205752

119894

119905119894

119905119894119868119899times119899

) + 119865 ⪰ 0 119894 isin 119872+

(

119906119894minus 1205911198941205752

119894

120591119894

120591119894119868119899times119899

) minus 119865 ⪰ 0 119894 isin 119872+

119903119894+ 119906119894ge 0 119903

119894minus 119906119894ge 0 119894 isin 119872

0

119903119894= 120573 + 119909

119879

119894119876119909119894+ 2120572119879119909119894minus 119910119894 119894 = 1 119898

(V 119906

119879

119906 V119868119898times119898

) ⪰ 0

119905119894ge 0 120591

119894ge 0 119906

119894 119903119894isin 119877 119894 = 1 119898

V 120573 isin 119877 120572 isin 119877119899 119876 isin 119877

119899times119899

(12)

where

119865 = (

119903119894

minus1

2(119876119879119909119894+ 120572)119879

minus1

20 0

1times119899

119876119879119909119894+ 120572 0

119899times1119876

)

119872+= 119894 120575

119894gt 0⋂ 1 119898

1198720= 119894 120575

119894= 0⋂ 1 119898

(13)

4 Mathematical Problems in Engineering

Proof First consider the inner maximization subproblem Itis obvious that

(IP119894) max (119910

119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573)2

st (119909119894 119910119894) isin 119880119894

lArrrArr min 1199062

119894

st 119906119894ge

10038161003816100381610038161003816119910119894minus119909119879

119894119876119909119894minus2120572119879119909119894minus120573

10038161003816100381610038161003816 forall(119909

119894 119910119894) isin 119880119894

(14)

If 120575119894= 0 we have that

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894+ 119906119894ge 0

forall(119909119894 119910119894) isin 119880119894lArrrArr 119903

119894+ 119906119894ge 0

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894minus 119906119894ge 0

forall(119909119894 119910119894) isin 119880119894lArrrArr 119903

119894minus 119906119894ge 0

(15)

where 119903119894= 119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894

If 120575119894gt 0 we can utilize the S-lemma as follows

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894+ 119906119894ge 0 forall(119909

119894 119910119894) isin 119880119894

lArrrArr Δ119909119879

119894119876Δ119909119894+ 2(119876

119879119909119894+ 120572)119879

Δ119909119894minus Δ119910119894+ 119903119894+ 119906119894ge 0

forall1003817100381710038171003817(Δ119910119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

lArrrArr (

1

Δ119910119894

Δ119909119894

)

119879

(

119906119894+ 119903119894

minus1

2(119876119879119909119894+ 120572)119879

minus1

20 0

1times119899

119876119879119909119894+ 120572 0

119899times1119876

)

times (

1

Δ119910119894

Δ119909119894

) ge 0 forall1003817100381710038171003817(Δ119910119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

lArrrArr exist119905119894ge0 st(

119906119894+ 119903119894minus 1199051198941205752

119894minus1

2(119876119879119909119894+ 120572)119879

minus1

2119905119894

01times119899

119876119879119909119894+ 120572 0

119899times1119876 + 119905119894119868119899times119899

)

⪰ 0

(16)

Note that in the last step if 1205752

119894gt 0 then there exists

(Δ119910119894 Δ119909119894) = (0 0) such that quadratic form 120575

2

119894minus(Δ119910

119894 Δ119909119894)2

2

is strictly positive thus the condition of S-lemma holds trulySimilarly we have that

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894minus 119906119894le 0 forall(119909

119894 119910119894) isin 119880119894

lArrrArr Δ119909119879

119894119876Δ119909119894+ 2(119876

119879119909119894+ 120572)119879

Δ119909119894minus Δ119910119894+ 119903119894minus 119906119894le 0

forall1003817100381710038171003817(Δ119910119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

lArrrArr exist120591119894ge 0 st(

119906119894minus 119903119894minus 1205911198941205752

119894

1

2minus(119876119879119909119894+ 120572)119879

1

2120591119894

01times119899

minus119876119879119909119894minus 120572 0

119899times1120591119894119868119899times119899

minus 119876

)

⪰ 0

(17)

Thus the inner maximization problem is equivalent to thefollowing semi-definite programming

(IP) min119906119894 119903119894 119905119894120591119894

119898

sum

119894=1

1199062

119894

st (

119906119894+ 119903119894minus 1199051198941205752

119894minus1

2(119876119879119909119894+ 120572)119879

minus1

2119905119894

01times119899

119876119879119909119894+ 120572 0

119899times1119876 + 119905119894119868119899times119899

) ⪰ 0

119894 isin 119872+

(

119906119894minus 119903119894minus 1205911198941205752

119894

1

2minus(119876119879119909119894+ 120572)119879

1

2120591119894

01times119899

minus119876119879119909119894minus 120572 0

119899times1120591119894119868119899times119899

minus 119876

) ⪰ 0

119894 isin 119872+

119903119894+ 119906119894ge 0 119903

119894minus 119906119894ge 0 119894 isin 119872

0

119903119894= 120573 + 119909

119879

119894119876119909119894+ 2120572119879119909119894minus 119910119894

119894 = 1 119898

119905119894ge 0 120591

119894ge 0 119906

119894 119903119894isin 119877 119894 = 1 119898

(18)

Note that based on the Schur complement lemma thesecond-order cone constraint V ge radicsum

119898

119894=11199062119894can also be

formalized as the following semi-definite constraint

(V 119906

119879

119906 V119868119898times119898

) ⪰ 0 (19)

Thus we complete the proof by embedding the equivalentsemi-definite programming into the outer problem

Due to the advance of interior algorithms for conicprogramming the above semidefinite programming can beefficiently solved in polynomial run time There are severalefficient and free software packages for solving the semidef-inite programming such as the SDPT3 [19] Next we makeseveral extensions based on the separable robust least squarequadratic regression model

Mathematical Problems in Engineering 5

23 Ellipsoid Uncertainty Set and More Norm CriterionThe above result on standard ball uncertainty set can befurther extended to that on the following general ellipsoiduncertainty set

1198801015840

119904= 1198801015840

1times 1198801015840

2times sdot sdot sdot times 119880

1015840

119898

1198801015840

119894= (119909

119894 119910119894) isin 119877119899+1

119909119894= 119909119894+ Δ119909119894

119910119894= 119910119894+ Δ1199101198941003817100381710038171003817119875119894 (Δ119910

119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

(20)

where 119875119894

isin 119877119896times(119899+1) Linear transformation operator 119875

119894

allows us to impose more restrictions on the uncertaintyset For example if we choose the diagonal matrix 119875

119894=

Diag1205901 120590

119899+1 we can put different weights on deviation

of components of (119909119894 119910119894) general matrix can further restrict

the correlated deviation of different componentsTo obtain the corresponding reformulation we only need

to modify the first two constraints based on the S-lemma asfollows

(119906119894minus 1199051198941205752

119894

119905119894119875119879

119894119875119894

) + 119865 ⪰ 0 119894 isin 119872+

(119906119894minus 1205911198941205752

119894

120591119894119875119879

119894119875119894

) minus 119865 ⪰ 0 119894 isin 119872+

(21)

We further consider the robust quadratic regressionmodels with 119897

infin-norm and 119897

1-norm criterion Note that for

119897infin-norm criteria the inner maximization problem is of the

following form

max1le119894le119898

max(119909119894 119910119894)isin119880119894

119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

lArrrArr min 119906 119906 ge10038161003816100381610038161003816119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

10038161003816100381610038161003816

forall (119909119894 119910119894) isin 119880119894 1 le 119894 le 119898

(22)

And for 1198971-norm criteria we have the following equivalent

reformulation

max1le119894le119898

119898

sum

119894=1

10038161003816100381610038161003816119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

10038161003816100381610038161003816

(119909119894 119910119894) isin 119880119894 1 le 119894 le 119898

lArrrArr min

119898

sum

119894=1

119906119894 119906119894ge

10038161003816100381610038161003816119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

10038161003816100381610038161003816

forall (119909119894 119910119894) isin 119880119894 1 le 119894 le 119898

(23)

Using the similar approach as in Proposition 2 both canbe further reformulated as semi-definite programming

Proposition 3 The separable robust quadratic regressionmodel under 119897

infin-norm and 119897

1-norm criteria are equivalent to

the following semidefinite programming respectively

(119897infin-norm) min

120572120573119876119906119903119905120591

119906

st (

119906 minus 1199051198941205752

119894

119905119894

119905119894119868119899times119899

) +119865⪰0 119894isin119872+

(

119906 minus 1205911198941205752

119894

120591119894

120591119894119868119899times119899

) minus119865⪰ 0 119894isin119872+

119903119894+ 119906 ge 0 119903

119894minus 119906 ge 0 119894 isin 119872

0

119903119894=120573+119909

119879

119894119876119909119894+2120572119879119909119894minus119910119894 119894=1 119898

119905119894ge 0 120591119894ge 0 119903119894isin 119877 119894 = 1 119898

119906 120573 isin 119877 120572 isin 119877119899 119876 isin 119877

119899times119899

(1198971-norm) min

120572120573119876119906119903119905120591

119898

sum

119894=1

119906119894

st (

119906119894minus 1199051198941205752

119894

119905119894

119905119894119868119899times119899

)+119865 ⪰ 0 119894 isin 119872+

(

119906119894minus 1205911198941205752

119894

120591119894

120591119894119868119899times119899

)minus119865⪰0 119894 isin 119872+

119903119894+ 119906119894ge 0 119903

119894minus 119906119894ge 0 119894 isin 119872

0

119903119894= 120573 + 119909

119879

119894119876119909119894+ 2120572119879119909119894minus 119910119894

119894 = 1 119898

119905119894ge 0 120591

119894ge 0 119906

119894 119903119894isin 119877 119894 = 1 119898

120573 isin 119877 120572 isin 119877119899 119876 isin 119877

119899times119899

(24)

where

119865 = (

119903119894

minus1

2(119876119879119909119894+ 120572)119879

minus1

20 0

1times119899

119876119879119909119894+ 120572 0

119899times1119876

) (25)

3 Robust Energy-Growth Regression Models

Studies have been reported on the causal relationshipbetween economic growth and energy consumption In thissection we try to apply the proposed robust quadraticregression model to the energy-growth problem

The seminal paper of J Kraft andA Kraft [20] first studiesthe casual relationship for USA In a recent survey Ilhan [21]categorizes the casual relationships into four types no causal-ity unidirectional causality running from economic growth

6 Mathematical Problems in Engineering

1960 1970 1980 1990 2000 20100

2

Year

2

4

Per c

apita

l ene

rgy

(ton)

Per c

apita

l GD

P ($

1000

0)

Per capital GDP Per capital energy

(a)

15

20

25

30

35

40

45

50

Per c

apita

l ene

rgy

(ton)

06 08 10 12 14 16 18 20Per capital GDP ($10000)

(b)

Figure 1 Germany data from 1960 to 2006

0

2

4

Year

2

4

6

8

10Pe

r cap

ital e

nerg

y (to

n)

1860 1880 1900 1920 1940 1960 1980 2000 2020

Per c

apita

l GD

P ($

1000

0)

Per capital GDP Per capital energy

(a)

2

3

4

5

6

7

8

9

10

00 05 10 15 20 25 30 35

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

(b)

Figure 2 USA data from 1870 to 2006

to energy consumption the reverse case and the bidirectionalcausality Note that the resulted relationships depend onthe selected data and analysis approaches Sometimes theresults obtained from different approaches conflict with eachother when even using the data from the same country Forexample using the Toda-Yamamoto causality test methodBowden and Payne [22] show that energy consumption playsan important role in economic growth in USA based onhistory data from 1949 to 2006 while using the same methodSoytas and Sari [23] find that no causality exists betweenthem based on USA data from 1960 to 2006 On the otherhand based on the sameUSArsquos data from 1947 to 1990 Cheng[24] and Stern [25] conclude different causalities by utilizingdifferent analyzing approaches

Unlike the previous energy-growth studies we attemptto provide a long-run stationary regression model betweenthe per capital GDP (G) and per capital energy consumption

(EC) The underlying assumption of our model is similarto the traditional ldquoconservation hypothesisrdquo that means thatan increase in real GDP will cause an increase in energyconsumption [21] The ldquoper capitalrdquo perspective providesus with a new insight on the causality and new regressionmodels Figures 1 and 2 demonstrate the relationship betweenper capital energy consumption and per capital GDP inUSA and Germany respectively From the subfigures onthe left hand side we can see that in both countries thereis a gradual increase in economy while the per capitalenergy consumption may decrease after reaching a certainlevel the subfigures on the right hand side inspire us toestablish a nonlinear regression model to characterize therelationship

To eliminate effect of the imprecise statistics data weemploy the proposed robust quadratic regression model andput different weights on the residual errors at different time

Mathematical Problems in Engineering 7

Table 1 LS-CQR and LS-RQR models with different 120598

Model 119876 120572 120573 Err 119879 (s)CQR 120598 = 000 minus4254 6721 minus6099 1688 0000RQR 120598 = 001 minus3899 6225 minus5433 1621 0500RQR 120598 = 002 minus3690 5938 minus5063 1663 0500RQR 120598 = 003 minus3423 5561 minus4564 1735 0500RQR 120598 = 004 minus2900 4755 minus3363 2029 0516RQR 120598 = 005 minus2243 3719 minus1817 2574 0484

points Specifically we establish the followingweighted robustquadratic regression model

min120572120573119902

max(119866119905 119864119862119905)isin119880

120598

119905

(

119879

sum

119905=1

(119908119905(EC119905minus 1199021198662

119905minus 2120572119866

119905minus 120573))119901

)

1119901

(26)

where the weight factor 119908119905

isin [0 1] represents the relativeimportance of the predicted residual error in the 119905th year Wecould set 119908

119905= 0 for the abnormal data point and set 119908

119905as an

increase function of 119905 to emphasize the importance of recentdata The uncertainty set is defined as

119880120598

119905= (119866

119905EC119905)

10038171003817100381710038171003817(119866119905minus 119866119905) (EC

119905minus EC119905)100381710038171003817100381710038172

le 120575119905 (27)

where 120575119905

= 120576radic1198662

119905+ EC2119905 Parameter 120576 controls the relative

amplitude of the fluctuation in observed dataThe weighted robust quadratic regression model can be

summarized as follows

(1) Solve the classical quadratic regression model usingthe nominal values (119866

119905EC119905)119879

119905=1

(2) Based on the quadratic regression remove the datawith the first 119896 largest residual errors and set weightsvalue 119908

119905

(3) Solve the equivalent semi-definite programmingproblem and return the final weighted robustquadratic regression model

4 Numerical Experiments

In this section we verify the effectiveness of the proposedrobust quadratic regression models on several data sets Theequivalent semi-definite programming problem is solved bythe SDPT3 solver [19] Numerical experiments are imple-mented usingMATLAB 770 and run on Intel(R) Core(TM)2CPU E7400

First we test the proposed robust least square quadraticregression (LS-RQR) model with Germany data from 1960 to2006As previously discussed after the preliminary quadraticregression analysis we will remove the data with the first 119896largest residual errors where 119896 = 3timesdata sizeThen for therest of data we establish the classical least square quadraticregression (LS-CQR) and LS-RQR models respectively

Table 1 lists the computation results for LS-CQR andLS-RQR with a series of 120598 values The listed Err value

History data

15

20

25

30

35

40

45

50

Per c

apita

l ene

rgy

(ton)

06 08 10 12 14 16 18 20Per capital GDP ($10000)

LS-RQR ( = 001)LS-CQR ( = 000)

LS-RQR ( = 002)

LS-RQR ( = 004)LS-RQR ( = 003)

LS-RQR ( = 005)120598

120598

120598

120598

120598

120598

Figure 3 LS-CQR and LS-RQR models on Germany data

0

2

4

6

8

10

12

14

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 001)

LS-RQR ( = 003)LS-RQR ( = 005)

120598

120598

120598

120598

Figure 4 Mean square error of LS-CQR and LS-RQRmodels when120598 varies

represents the mean square error from the nominal valueand 119879 represents the run time for solving the optimizationproblem It is seen that the resulted robust model exhibitssmaller absolute values of 119876 120572 and 120573 with the increase of120598 value that is the regression curve is more flat as the model

8 Mathematical Problems in Engineering

15

20

25

30

35

40

45

50

History data

06 08 10 12 14 16 18 20

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

L1-RQR ( = 002)L2-RQR ( = 002)LI-RQR ( = 002)

120598

120598120598

Figure 5 RQR models under 1198971- 1198972- and 119897

infin-norm criteria

2

3

4

5

6

7

8

9

10

00 05 10 15 20 25 30 35

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

5

10

15

20

25

30

Err

000 002 004 006 008 010

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 6 USA data from 1870 to 2006

parameters are less precise It is obvious that one drawback ofthe robustmodel is that themean square error will increase asuncertainty increases Figure 3 plots the regression curves fordifferent models and also supports our analysis of the effectof increasing data uncertainty on robust regression

To demonstrate the effectiveness of the robust modelswe test the worst-case performance of the resulted modelswhen 120598 varies from 0 to 01 Specifically for each 120598 valuewe randomly generate 500 groups of data from the defineduncertainty set 119880120598

119905and then calculate the maximal residual

error at each data point Figure 4 plots the worst-case errorof LS-CQR model and LS-RQR models with 120598 = 001 003and 005 It is seen that the error of LS-CQR model increases

rapidly and LS-RQR with 120598 = 005 has the most flat errorcurve Figure 4 also indicates that it is critical to accuratelyestimate the variability of the data and set proper value for120598 In our case we recommend LS-RQR with 120598 = 003 that isalmost always better than the traditional LS-CQR model

Next we test the proposed RQR models under 1198971(L1-

RQR) and 119897infin-(LI-RQR) norm criteria on the same data set

Figure 5 plots the corresponding regression curves for thesame uncertainty set 120598 = 002 For the same 120598 value LI-RQR model can be considered as the most robust one andL1-RQR andL2-RQRmodels are similar It is noticeable that itcontradicts with the traditional robust regression terms Forexample [26] refers to the 119897

1-norm regression as the robust

Mathematical Problems in Engineering 9

10 12 14 16 18 20 22 24

35

40

45

50

55

60Pe

r cap

ital e

nerg

y (to

n)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)120598

120598

(b)

Figure 7 Switzerland data from 1965 to 2006

06 08 10 12 14 16 18 20 22 24

25

30

35

40

45

50

55

60

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 8 Belgium data from 1960 to 2006

regression model in the sense that the corresponding modelis insensitive to the large residual errors(corresponding to theoutliers)However after removing the possible abnormal datapoints here we try tomake our regression analysis insensitiveto the worst-case residual errors at each data point

Finally we apply the proposed RQR model on more datasets including USA data from 1870 to 2006 Switzerland datafrom 1965 to 2006 and Belgium data from 1960 to 2006Figures 6 7 and 8 give the resulted regressionmodels and theworst-case residual errors for different 120598 values It is seen thatthe proposed RQRmodels still almost always outperform the

CQR model especially for large uncertainty sets Based onthe robust quadratic regression models these three countriesreach the highest per capital energy consumption points atper capital GDP value around 23 000 while the peak valuesvary from 57 to 85 Ton

5 Conclusions and Future Works

In this paper we studied themultivariate quadratic regressionmodel with imprecise statistic data Unlike the traditionalrobust statistic approaches that focus on the detection of

10 Mathematical Problems in Engineering

the outliers and the elimination of the effects we employedthe recently developed robust optimization framework anduncertainty set theory

In particular we first extended the existing robust lin-ear regression results to the robust least square quadraticregression model with the separable ball uncertainty setThe specific form of the uncertainty set allowed us to usethe well-known S-lemma and give the tractable equivalentsemidefinite programmingWe further generalized the resultto robust models under 119897

1- and 119897

infin-norm criteria with general

ellipsoid uncertainty sets Next the proposed robust modelswere applied to the energy-growth problem Under the clas-sical conservation hypothesis we employed the traditionalquadratic regression model to remove the abnormal dataand established a robust quadratic regression model forthe per capital GDP and per capital energy consumptionFinally the proposed models were tested on the historydata of Germany USA Switzerland and Belgium From thenumerical experiments we found that (1) the amplitude of theuncertainty perturbation 120575 plays a critical role on the robustmodels (2) with the increase of 120575 the robust model has amore flat curve (3) for the same 120575 value compared with 119897

1-

and 1198972-normmodels 119897

infin-norm model is the most robust one

(4) as expected the robust approach provides a serial robustregression models that can reduce the worst-case residualerrors when the observed data contain noise

For further research robust polynomial (nonlinear)regressionmodels are interesting in their own right Althoughwe may always reduce them to the linear regression modelwith polynomially (or nonlinearly) transformed uncertaintydata set it is still worth studying whether the resultedregression models are solvable for quadratic regression withcoupled uncertainty sets

Acknowledgment

This work was supported by Geological Survey Project ofChina (nos 1212010881801 1212011120995)

References

[1] Z Griliches and V Ringstad ldquoErrors-in-the-variables bias innonlinear contextsrdquo Econometrica vol 38 no 2 pp 368ndash3701970

[2] W A Fuller Measurement Error Models John Wiley amp SonsNew York NY USA 1987

[3] T Erickson and T M Whited ldquoTwo-step GMM estimationof the errors-in-variables model using high-order momentsrdquoEconometric Theory vol 18 no 3 pp 776ndash799 2002

[4] P J Cornbleet and N Gochman ldquoIncorrect least-squaresregression coefficientsrdquo Clinical Chemistry vol 25 no 3 pp432ndash438 1979

[5] J W Gillard ldquoAn historical overview of linear regression witherrors in both variablesrdquo Tech Rep Cardiff University Schoolof Mathematics Cardiff UK 2006

[6] L El Ghaoui and H Lebret ldquoRobust solutions to least-squaresproblemswith uncertain datardquo SIAM Journal onMatrixAnalysisand Applications vol 18 no 4 pp 1035ndash1064 1997

[7] P K Shivaswamy C Bhattacharyya and A J Smola ldquoSecondorder cone programming approaches for handling missing anduncertain datardquo Journal ofMachine Learning Research vol 7 pp1283ndash1314 2006

[8] A Ben-Tal L El Ghaoui and A Nemirovski Robust Optimiza-tion Princeton University Press Princeton NJ USA 2009

[9] T B Trafalis and R C Gilbert ldquoRobust classification andregression using support vector machinesrdquo European Journal ofOperational Research vol 173 no 3 pp 893ndash909 2006

[10] H Xu C Caramanis and S Mannor ldquoRobustness and reg-ularization of support vector machinesrdquo Journal of MachineLearning Research vol 10 pp 1485ndash1510 2009

[11] T B Trafalis andRCGilbert ldquoRobust support vectormachinesfor classification and computational issuesrdquoOptimizationMeth-ods amp Software vol 22 no 1 pp 187ndash198 2007

[12] P J Huber Robust Statistics JohnWiley amp Sons New York NYUSA 1981

[13] J L Wu and P C Chang ldquoA trend-based segmentation methodand the support vector regression for financial time seriesforecastingrdquo Mathematical Problems in Engineering vol 2012Article ID 615152 20 pages 2012

[14] D X She and X H Yang ldquoA new adaptive local linearprediction method and its application in hydrological timeSeriesrdquoMathematical Problems in Engineering vol 2010 ArticleID 205438 15 pages 2010

[15] Z Liu ldquoChaotic time series analysisrdquoMathematical Problems inEngineering vol 2010 Article ID 720190 31 pages 2010

[16] M Li ldquoFractal time series a tutorial reviewrdquo MathematicalProblems in Engineering vol 2010 Article ID 157264 26 pages2010

[17] T Farooq A Guergachi and S Krishnan ldquoKnowledge-basedgreenrsquos kernel for support vector regressionrdquo MathematicalProblems in Engineering vol 2010 Article ID 378652 16 pages2010

[18] I Polik and T Terlaky ldquoA survey of the S-lemmardquo SIAMReviewvol 49 no 3 pp 371ndash418 2007

[19] K C Toh R H Tutunu and M J Todd ldquoOn the imple-mentation and usage of SDPT3Ca Matlab software package forsemidefinitequadratic-linear programmingrdquo version 4 0 2006httpecommonslibrarycornelleduhandle181315133

[20] J Kraft and A Kraft ldquoOn the relationship between energy andGNPrdquo Journal of Energy and Development vol 3 no 2 pp 401ndash403 1978

[21] O Ilhan ldquoA literature survey on energy growth nexusrdquo EnergyPolicy vol 38 pp 340ndash349 2010

[22] N Bowden and J E Payne ldquoThe causal relationship betweenUSenergy consumption and real output a disaggregated analysisrdquoJournal of Policy Modeling vol 31 no 2 pp 180ndash188 2009

[23] U Soytas and R Sari ldquoEnergy consumption economic growthand carbon emissions challenges faced by an EU candidatememberrdquo Ecological Economics vol 68 no 6 pp 1667ndash16752009

[24] B Cheng ldquoAn investigation of cointegration and causalitybetween energy consumption and economic growthrdquo Journalof Energy Development vol 21 no 1 pp 73ndash84 1995

[25] D I Stern ldquoEnergy and economic growth in the USA amultivariate approachrdquo Energy Economics vol 15 no 2 pp 137ndash150 1993

[26] S Boyd andLVandenbergheConvexOptimization CambridgeUniversity Press Cambridge UK 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 3: Research Article Robust Quadratic Regression and Its ...downloads.hindawi.com/journals/mpe/2013/210510.pdf · quadratic regression model with ball uncertainty set. is result is then

Mathematical Problems in Engineering 3

In traditional regression models we assume that theexplanatory data are precise and reliable Based on thisweak exogeneity assumption the quadratic regression can beexpressed as the following linear regression

min120572120573119876

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817

(

1199101minus 119876 ∘ 119883

1minus 21205721198791199091minus 120573

119910119898

minus 119876 ∘ 119883119898

minus 2120572119879119909119898

minus 120573

)

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817 119901

(4)

where 119883119894

= 119909119894119909119879

119894are the problem data and the linear

operator ∘ for matrix 119860 and 119861 isin 119877119904times119897 is defined as 119860 ∘

119861 = sum119904

119894=1sum119897

119895=1119860119894119895119861119894119895 Therefore we can easily solve the

above linear regression model for 119901 = 1 2 (the least squareregression) and +infin

To relax the weak exogeneity assumption we assume thatthe real data are contained in the following uncertainty set

119880 = [119883 119884119879] 119909119894= 119909119894+ Δ119909119894

119910119894= 119910119894+ Δ119910119894 119894 = 1 119898

10038171003817100381710038171003817(Δ119910119894 Δ119909119894)119894=1119898

100381710038171003817100381710038172le 120575

(5)

To minimize the worst-case residual error we establishthe following robust quadratic regression model

min120572120573119876

max[119883119884119879]isin119880

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817

(

1199101minus 119909119879

11198761199091minus 21205721198791199091minus 120573

119910119898

minus 119909119879

119898119876119909119898

minus 2120572119879119909119898

minus 120573

)

100381710038171003817100381710038171003817100381710038171003817100381710038171003817100381710038171003817119901

(6)

From the computational perspective although the robustlinear regression problem (where the coefficients 119876 are setto zero) with a large variety of uncertainty sets can beefficiently solved the robust quadratic regression problemsare much more difficult Actually for general uncertaintysets and least square criteria even the inner maximizationproblem which includes convex biquadratic polynomial asthe objective function and general convex set as feasible setis in general not solvable in polynomial run timeNext wewillintroduce some meaningful uncertainty sets and provide thecorresponding tractable equivalences

22 Separable Ball Uncertainty SetsModel In this subsectionwe consider the following separable ball uncertainty set

119880119904

= 1198801times 1198802times sdot sdot sdot times 119880

119898 (7)

where119880119894= (119909

119894 119910119894) isin 119877119899+1

119909119894= 119909119894+ Δ119909119894

119910119894= 119910119894+ Δ1199101198941003817100381710038171003817(Δ119910119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

(8)

and 120575119894ge 0 Thus the inner problem (IP) is of the following

form (here we first consider square of the original objectivefunction)

(IP) max119898

sum

119894=1

(119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573)2

st (119909119894 119910119894) isin 119880119894 119894 = 1 119898

(9)

Note that for the inner problem the separable uncertaintyset and the summation form of the objective functionallow us to decompose it into 119898 small scale subproblemswith quadratic objective function and ball constraints Thequadratic objective function and constraints motivate us touse the following S-lemma to obtain an equivalent solvablereformulation

Lemma 1 (inhomogeneous version of S-lemma [8]) Let119860 119861

be symmetric matrices of the same size and let the quadraticform 119909

119879119860119909 + 2119886

119879119909 + 120573 be strictly positive at some point Then

the implication

119909119879119860119909 + 2119886

119879119909 + 120573 ge 0 997904rArr 119909

119879119861119909 + 2119887

119879119909 + 120573 ge 0 (10)

holds true if and only if

exist120582 ge 0 [119861 minus 120582119860 (119887 minus 120582119886)

119879

119887 minus 120582119886 120573 minus 120582120572] ⪰ 0 (11)

We can obtain the following equivalent semidefiniteprogramming for the separable robust least square quadraticregression model

Proposition 2 The robust least square quadratic regressionmodel with separable uncertainty set 119880

119904is equivalent to the

following semidefinite programming

min120572120573119876V119906119903119905120591

V

st (

119906119894minus 1199051198941205752

119894

119905119894

119905119894119868119899times119899

) + 119865 ⪰ 0 119894 isin 119872+

(

119906119894minus 1205911198941205752

119894

120591119894

120591119894119868119899times119899

) minus 119865 ⪰ 0 119894 isin 119872+

119903119894+ 119906119894ge 0 119903

119894minus 119906119894ge 0 119894 isin 119872

0

119903119894= 120573 + 119909

119879

119894119876119909119894+ 2120572119879119909119894minus 119910119894 119894 = 1 119898

(V 119906

119879

119906 V119868119898times119898

) ⪰ 0

119905119894ge 0 120591

119894ge 0 119906

119894 119903119894isin 119877 119894 = 1 119898

V 120573 isin 119877 120572 isin 119877119899 119876 isin 119877

119899times119899

(12)

where

119865 = (

119903119894

minus1

2(119876119879119909119894+ 120572)119879

minus1

20 0

1times119899

119876119879119909119894+ 120572 0

119899times1119876

)

119872+= 119894 120575

119894gt 0⋂ 1 119898

1198720= 119894 120575

119894= 0⋂ 1 119898

(13)

4 Mathematical Problems in Engineering

Proof First consider the inner maximization subproblem Itis obvious that

(IP119894) max (119910

119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573)2

st (119909119894 119910119894) isin 119880119894

lArrrArr min 1199062

119894

st 119906119894ge

10038161003816100381610038161003816119910119894minus119909119879

119894119876119909119894minus2120572119879119909119894minus120573

10038161003816100381610038161003816 forall(119909

119894 119910119894) isin 119880119894

(14)

If 120575119894= 0 we have that

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894+ 119906119894ge 0

forall(119909119894 119910119894) isin 119880119894lArrrArr 119903

119894+ 119906119894ge 0

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894minus 119906119894ge 0

forall(119909119894 119910119894) isin 119880119894lArrrArr 119903

119894minus 119906119894ge 0

(15)

where 119903119894= 119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894

If 120575119894gt 0 we can utilize the S-lemma as follows

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894+ 119906119894ge 0 forall(119909

119894 119910119894) isin 119880119894

lArrrArr Δ119909119879

119894119876Δ119909119894+ 2(119876

119879119909119894+ 120572)119879

Δ119909119894minus Δ119910119894+ 119903119894+ 119906119894ge 0

forall1003817100381710038171003817(Δ119910119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

lArrrArr (

1

Δ119910119894

Δ119909119894

)

119879

(

119906119894+ 119903119894

minus1

2(119876119879119909119894+ 120572)119879

minus1

20 0

1times119899

119876119879119909119894+ 120572 0

119899times1119876

)

times (

1

Δ119910119894

Δ119909119894

) ge 0 forall1003817100381710038171003817(Δ119910119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

lArrrArr exist119905119894ge0 st(

119906119894+ 119903119894minus 1199051198941205752

119894minus1

2(119876119879119909119894+ 120572)119879

minus1

2119905119894

01times119899

119876119879119909119894+ 120572 0

119899times1119876 + 119905119894119868119899times119899

)

⪰ 0

(16)

Note that in the last step if 1205752

119894gt 0 then there exists

(Δ119910119894 Δ119909119894) = (0 0) such that quadratic form 120575

2

119894minus(Δ119910

119894 Δ119909119894)2

2

is strictly positive thus the condition of S-lemma holds trulySimilarly we have that

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894minus 119906119894le 0 forall(119909

119894 119910119894) isin 119880119894

lArrrArr Δ119909119879

119894119876Δ119909119894+ 2(119876

119879119909119894+ 120572)119879

Δ119909119894minus Δ119910119894+ 119903119894minus 119906119894le 0

forall1003817100381710038171003817(Δ119910119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

lArrrArr exist120591119894ge 0 st(

119906119894minus 119903119894minus 1205911198941205752

119894

1

2minus(119876119879119909119894+ 120572)119879

1

2120591119894

01times119899

minus119876119879119909119894minus 120572 0

119899times1120591119894119868119899times119899

minus 119876

)

⪰ 0

(17)

Thus the inner maximization problem is equivalent to thefollowing semi-definite programming

(IP) min119906119894 119903119894 119905119894120591119894

119898

sum

119894=1

1199062

119894

st (

119906119894+ 119903119894minus 1199051198941205752

119894minus1

2(119876119879119909119894+ 120572)119879

minus1

2119905119894

01times119899

119876119879119909119894+ 120572 0

119899times1119876 + 119905119894119868119899times119899

) ⪰ 0

119894 isin 119872+

(

119906119894minus 119903119894minus 1205911198941205752

119894

1

2minus(119876119879119909119894+ 120572)119879

1

2120591119894

01times119899

minus119876119879119909119894minus 120572 0

119899times1120591119894119868119899times119899

minus 119876

) ⪰ 0

119894 isin 119872+

119903119894+ 119906119894ge 0 119903

119894minus 119906119894ge 0 119894 isin 119872

0

119903119894= 120573 + 119909

119879

119894119876119909119894+ 2120572119879119909119894minus 119910119894

119894 = 1 119898

119905119894ge 0 120591

119894ge 0 119906

119894 119903119894isin 119877 119894 = 1 119898

(18)

Note that based on the Schur complement lemma thesecond-order cone constraint V ge radicsum

119898

119894=11199062119894can also be

formalized as the following semi-definite constraint

(V 119906

119879

119906 V119868119898times119898

) ⪰ 0 (19)

Thus we complete the proof by embedding the equivalentsemi-definite programming into the outer problem

Due to the advance of interior algorithms for conicprogramming the above semidefinite programming can beefficiently solved in polynomial run time There are severalefficient and free software packages for solving the semidef-inite programming such as the SDPT3 [19] Next we makeseveral extensions based on the separable robust least squarequadratic regression model

Mathematical Problems in Engineering 5

23 Ellipsoid Uncertainty Set and More Norm CriterionThe above result on standard ball uncertainty set can befurther extended to that on the following general ellipsoiduncertainty set

1198801015840

119904= 1198801015840

1times 1198801015840

2times sdot sdot sdot times 119880

1015840

119898

1198801015840

119894= (119909

119894 119910119894) isin 119877119899+1

119909119894= 119909119894+ Δ119909119894

119910119894= 119910119894+ Δ1199101198941003817100381710038171003817119875119894 (Δ119910

119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

(20)

where 119875119894

isin 119877119896times(119899+1) Linear transformation operator 119875

119894

allows us to impose more restrictions on the uncertaintyset For example if we choose the diagonal matrix 119875

119894=

Diag1205901 120590

119899+1 we can put different weights on deviation

of components of (119909119894 119910119894) general matrix can further restrict

the correlated deviation of different componentsTo obtain the corresponding reformulation we only need

to modify the first two constraints based on the S-lemma asfollows

(119906119894minus 1199051198941205752

119894

119905119894119875119879

119894119875119894

) + 119865 ⪰ 0 119894 isin 119872+

(119906119894minus 1205911198941205752

119894

120591119894119875119879

119894119875119894

) minus 119865 ⪰ 0 119894 isin 119872+

(21)

We further consider the robust quadratic regressionmodels with 119897

infin-norm and 119897

1-norm criterion Note that for

119897infin-norm criteria the inner maximization problem is of the

following form

max1le119894le119898

max(119909119894 119910119894)isin119880119894

119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

lArrrArr min 119906 119906 ge10038161003816100381610038161003816119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

10038161003816100381610038161003816

forall (119909119894 119910119894) isin 119880119894 1 le 119894 le 119898

(22)

And for 1198971-norm criteria we have the following equivalent

reformulation

max1le119894le119898

119898

sum

119894=1

10038161003816100381610038161003816119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

10038161003816100381610038161003816

(119909119894 119910119894) isin 119880119894 1 le 119894 le 119898

lArrrArr min

119898

sum

119894=1

119906119894 119906119894ge

10038161003816100381610038161003816119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

10038161003816100381610038161003816

forall (119909119894 119910119894) isin 119880119894 1 le 119894 le 119898

(23)

Using the similar approach as in Proposition 2 both canbe further reformulated as semi-definite programming

Proposition 3 The separable robust quadratic regressionmodel under 119897

infin-norm and 119897

1-norm criteria are equivalent to

the following semidefinite programming respectively

(119897infin-norm) min

120572120573119876119906119903119905120591

119906

st (

119906 minus 1199051198941205752

119894

119905119894

119905119894119868119899times119899

) +119865⪰0 119894isin119872+

(

119906 minus 1205911198941205752

119894

120591119894

120591119894119868119899times119899

) minus119865⪰ 0 119894isin119872+

119903119894+ 119906 ge 0 119903

119894minus 119906 ge 0 119894 isin 119872

0

119903119894=120573+119909

119879

119894119876119909119894+2120572119879119909119894minus119910119894 119894=1 119898

119905119894ge 0 120591119894ge 0 119903119894isin 119877 119894 = 1 119898

119906 120573 isin 119877 120572 isin 119877119899 119876 isin 119877

119899times119899

(1198971-norm) min

120572120573119876119906119903119905120591

119898

sum

119894=1

119906119894

st (

119906119894minus 1199051198941205752

119894

119905119894

119905119894119868119899times119899

)+119865 ⪰ 0 119894 isin 119872+

(

119906119894minus 1205911198941205752

119894

120591119894

120591119894119868119899times119899

)minus119865⪰0 119894 isin 119872+

119903119894+ 119906119894ge 0 119903

119894minus 119906119894ge 0 119894 isin 119872

0

119903119894= 120573 + 119909

119879

119894119876119909119894+ 2120572119879119909119894minus 119910119894

119894 = 1 119898

119905119894ge 0 120591

119894ge 0 119906

119894 119903119894isin 119877 119894 = 1 119898

120573 isin 119877 120572 isin 119877119899 119876 isin 119877

119899times119899

(24)

where

119865 = (

119903119894

minus1

2(119876119879119909119894+ 120572)119879

minus1

20 0

1times119899

119876119879119909119894+ 120572 0

119899times1119876

) (25)

3 Robust Energy-Growth Regression Models

Studies have been reported on the causal relationshipbetween economic growth and energy consumption In thissection we try to apply the proposed robust quadraticregression model to the energy-growth problem

The seminal paper of J Kraft andA Kraft [20] first studiesthe casual relationship for USA In a recent survey Ilhan [21]categorizes the casual relationships into four types no causal-ity unidirectional causality running from economic growth

6 Mathematical Problems in Engineering

1960 1970 1980 1990 2000 20100

2

Year

2

4

Per c

apita

l ene

rgy

(ton)

Per c

apita

l GD

P ($

1000

0)

Per capital GDP Per capital energy

(a)

15

20

25

30

35

40

45

50

Per c

apita

l ene

rgy

(ton)

06 08 10 12 14 16 18 20Per capital GDP ($10000)

(b)

Figure 1 Germany data from 1960 to 2006

0

2

4

Year

2

4

6

8

10Pe

r cap

ital e

nerg

y (to

n)

1860 1880 1900 1920 1940 1960 1980 2000 2020

Per c

apita

l GD

P ($

1000

0)

Per capital GDP Per capital energy

(a)

2

3

4

5

6

7

8

9

10

00 05 10 15 20 25 30 35

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

(b)

Figure 2 USA data from 1870 to 2006

to energy consumption the reverse case and the bidirectionalcausality Note that the resulted relationships depend onthe selected data and analysis approaches Sometimes theresults obtained from different approaches conflict with eachother when even using the data from the same country Forexample using the Toda-Yamamoto causality test methodBowden and Payne [22] show that energy consumption playsan important role in economic growth in USA based onhistory data from 1949 to 2006 while using the same methodSoytas and Sari [23] find that no causality exists betweenthem based on USA data from 1960 to 2006 On the otherhand based on the sameUSArsquos data from 1947 to 1990 Cheng[24] and Stern [25] conclude different causalities by utilizingdifferent analyzing approaches

Unlike the previous energy-growth studies we attemptto provide a long-run stationary regression model betweenthe per capital GDP (G) and per capital energy consumption

(EC) The underlying assumption of our model is similarto the traditional ldquoconservation hypothesisrdquo that means thatan increase in real GDP will cause an increase in energyconsumption [21] The ldquoper capitalrdquo perspective providesus with a new insight on the causality and new regressionmodels Figures 1 and 2 demonstrate the relationship betweenper capital energy consumption and per capital GDP inUSA and Germany respectively From the subfigures onthe left hand side we can see that in both countries thereis a gradual increase in economy while the per capitalenergy consumption may decrease after reaching a certainlevel the subfigures on the right hand side inspire us toestablish a nonlinear regression model to characterize therelationship

To eliminate effect of the imprecise statistics data weemploy the proposed robust quadratic regression model andput different weights on the residual errors at different time

Mathematical Problems in Engineering 7

Table 1 LS-CQR and LS-RQR models with different 120598

Model 119876 120572 120573 Err 119879 (s)CQR 120598 = 000 minus4254 6721 minus6099 1688 0000RQR 120598 = 001 minus3899 6225 minus5433 1621 0500RQR 120598 = 002 minus3690 5938 minus5063 1663 0500RQR 120598 = 003 minus3423 5561 minus4564 1735 0500RQR 120598 = 004 minus2900 4755 minus3363 2029 0516RQR 120598 = 005 minus2243 3719 minus1817 2574 0484

points Specifically we establish the followingweighted robustquadratic regression model

min120572120573119902

max(119866119905 119864119862119905)isin119880

120598

119905

(

119879

sum

119905=1

(119908119905(EC119905minus 1199021198662

119905minus 2120572119866

119905minus 120573))119901

)

1119901

(26)

where the weight factor 119908119905

isin [0 1] represents the relativeimportance of the predicted residual error in the 119905th year Wecould set 119908

119905= 0 for the abnormal data point and set 119908

119905as an

increase function of 119905 to emphasize the importance of recentdata The uncertainty set is defined as

119880120598

119905= (119866

119905EC119905)

10038171003817100381710038171003817(119866119905minus 119866119905) (EC

119905minus EC119905)100381710038171003817100381710038172

le 120575119905 (27)

where 120575119905

= 120576radic1198662

119905+ EC2119905 Parameter 120576 controls the relative

amplitude of the fluctuation in observed dataThe weighted robust quadratic regression model can be

summarized as follows

(1) Solve the classical quadratic regression model usingthe nominal values (119866

119905EC119905)119879

119905=1

(2) Based on the quadratic regression remove the datawith the first 119896 largest residual errors and set weightsvalue 119908

119905

(3) Solve the equivalent semi-definite programmingproblem and return the final weighted robustquadratic regression model

4 Numerical Experiments

In this section we verify the effectiveness of the proposedrobust quadratic regression models on several data sets Theequivalent semi-definite programming problem is solved bythe SDPT3 solver [19] Numerical experiments are imple-mented usingMATLAB 770 and run on Intel(R) Core(TM)2CPU E7400

First we test the proposed robust least square quadraticregression (LS-RQR) model with Germany data from 1960 to2006As previously discussed after the preliminary quadraticregression analysis we will remove the data with the first 119896largest residual errors where 119896 = 3timesdata sizeThen for therest of data we establish the classical least square quadraticregression (LS-CQR) and LS-RQR models respectively

Table 1 lists the computation results for LS-CQR andLS-RQR with a series of 120598 values The listed Err value

History data

15

20

25

30

35

40

45

50

Per c

apita

l ene

rgy

(ton)

06 08 10 12 14 16 18 20Per capital GDP ($10000)

LS-RQR ( = 001)LS-CQR ( = 000)

LS-RQR ( = 002)

LS-RQR ( = 004)LS-RQR ( = 003)

LS-RQR ( = 005)120598

120598

120598

120598

120598

120598

Figure 3 LS-CQR and LS-RQR models on Germany data

0

2

4

6

8

10

12

14

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 001)

LS-RQR ( = 003)LS-RQR ( = 005)

120598

120598

120598

120598

Figure 4 Mean square error of LS-CQR and LS-RQRmodels when120598 varies

represents the mean square error from the nominal valueand 119879 represents the run time for solving the optimizationproblem It is seen that the resulted robust model exhibitssmaller absolute values of 119876 120572 and 120573 with the increase of120598 value that is the regression curve is more flat as the model

8 Mathematical Problems in Engineering

15

20

25

30

35

40

45

50

History data

06 08 10 12 14 16 18 20

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

L1-RQR ( = 002)L2-RQR ( = 002)LI-RQR ( = 002)

120598

120598120598

Figure 5 RQR models under 1198971- 1198972- and 119897

infin-norm criteria

2

3

4

5

6

7

8

9

10

00 05 10 15 20 25 30 35

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

5

10

15

20

25

30

Err

000 002 004 006 008 010

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 6 USA data from 1870 to 2006

parameters are less precise It is obvious that one drawback ofthe robustmodel is that themean square error will increase asuncertainty increases Figure 3 plots the regression curves fordifferent models and also supports our analysis of the effectof increasing data uncertainty on robust regression

To demonstrate the effectiveness of the robust modelswe test the worst-case performance of the resulted modelswhen 120598 varies from 0 to 01 Specifically for each 120598 valuewe randomly generate 500 groups of data from the defineduncertainty set 119880120598

119905and then calculate the maximal residual

error at each data point Figure 4 plots the worst-case errorof LS-CQR model and LS-RQR models with 120598 = 001 003and 005 It is seen that the error of LS-CQR model increases

rapidly and LS-RQR with 120598 = 005 has the most flat errorcurve Figure 4 also indicates that it is critical to accuratelyestimate the variability of the data and set proper value for120598 In our case we recommend LS-RQR with 120598 = 003 that isalmost always better than the traditional LS-CQR model

Next we test the proposed RQR models under 1198971(L1-

RQR) and 119897infin-(LI-RQR) norm criteria on the same data set

Figure 5 plots the corresponding regression curves for thesame uncertainty set 120598 = 002 For the same 120598 value LI-RQR model can be considered as the most robust one andL1-RQR andL2-RQRmodels are similar It is noticeable that itcontradicts with the traditional robust regression terms Forexample [26] refers to the 119897

1-norm regression as the robust

Mathematical Problems in Engineering 9

10 12 14 16 18 20 22 24

35

40

45

50

55

60Pe

r cap

ital e

nerg

y (to

n)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)120598

120598

(b)

Figure 7 Switzerland data from 1965 to 2006

06 08 10 12 14 16 18 20 22 24

25

30

35

40

45

50

55

60

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 8 Belgium data from 1960 to 2006

regression model in the sense that the corresponding modelis insensitive to the large residual errors(corresponding to theoutliers)However after removing the possible abnormal datapoints here we try tomake our regression analysis insensitiveto the worst-case residual errors at each data point

Finally we apply the proposed RQR model on more datasets including USA data from 1870 to 2006 Switzerland datafrom 1965 to 2006 and Belgium data from 1960 to 2006Figures 6 7 and 8 give the resulted regressionmodels and theworst-case residual errors for different 120598 values It is seen thatthe proposed RQRmodels still almost always outperform the

CQR model especially for large uncertainty sets Based onthe robust quadratic regression models these three countriesreach the highest per capital energy consumption points atper capital GDP value around 23 000 while the peak valuesvary from 57 to 85 Ton

5 Conclusions and Future Works

In this paper we studied themultivariate quadratic regressionmodel with imprecise statistic data Unlike the traditionalrobust statistic approaches that focus on the detection of

10 Mathematical Problems in Engineering

the outliers and the elimination of the effects we employedthe recently developed robust optimization framework anduncertainty set theory

In particular we first extended the existing robust lin-ear regression results to the robust least square quadraticregression model with the separable ball uncertainty setThe specific form of the uncertainty set allowed us to usethe well-known S-lemma and give the tractable equivalentsemidefinite programmingWe further generalized the resultto robust models under 119897

1- and 119897

infin-norm criteria with general

ellipsoid uncertainty sets Next the proposed robust modelswere applied to the energy-growth problem Under the clas-sical conservation hypothesis we employed the traditionalquadratic regression model to remove the abnormal dataand established a robust quadratic regression model forthe per capital GDP and per capital energy consumptionFinally the proposed models were tested on the historydata of Germany USA Switzerland and Belgium From thenumerical experiments we found that (1) the amplitude of theuncertainty perturbation 120575 plays a critical role on the robustmodels (2) with the increase of 120575 the robust model has amore flat curve (3) for the same 120575 value compared with 119897

1-

and 1198972-normmodels 119897

infin-norm model is the most robust one

(4) as expected the robust approach provides a serial robustregression models that can reduce the worst-case residualerrors when the observed data contain noise

For further research robust polynomial (nonlinear)regressionmodels are interesting in their own right Althoughwe may always reduce them to the linear regression modelwith polynomially (or nonlinearly) transformed uncertaintydata set it is still worth studying whether the resultedregression models are solvable for quadratic regression withcoupled uncertainty sets

Acknowledgment

This work was supported by Geological Survey Project ofChina (nos 1212010881801 1212011120995)

References

[1] Z Griliches and V Ringstad ldquoErrors-in-the-variables bias innonlinear contextsrdquo Econometrica vol 38 no 2 pp 368ndash3701970

[2] W A Fuller Measurement Error Models John Wiley amp SonsNew York NY USA 1987

[3] T Erickson and T M Whited ldquoTwo-step GMM estimationof the errors-in-variables model using high-order momentsrdquoEconometric Theory vol 18 no 3 pp 776ndash799 2002

[4] P J Cornbleet and N Gochman ldquoIncorrect least-squaresregression coefficientsrdquo Clinical Chemistry vol 25 no 3 pp432ndash438 1979

[5] J W Gillard ldquoAn historical overview of linear regression witherrors in both variablesrdquo Tech Rep Cardiff University Schoolof Mathematics Cardiff UK 2006

[6] L El Ghaoui and H Lebret ldquoRobust solutions to least-squaresproblemswith uncertain datardquo SIAM Journal onMatrixAnalysisand Applications vol 18 no 4 pp 1035ndash1064 1997

[7] P K Shivaswamy C Bhattacharyya and A J Smola ldquoSecondorder cone programming approaches for handling missing anduncertain datardquo Journal ofMachine Learning Research vol 7 pp1283ndash1314 2006

[8] A Ben-Tal L El Ghaoui and A Nemirovski Robust Optimiza-tion Princeton University Press Princeton NJ USA 2009

[9] T B Trafalis and R C Gilbert ldquoRobust classification andregression using support vector machinesrdquo European Journal ofOperational Research vol 173 no 3 pp 893ndash909 2006

[10] H Xu C Caramanis and S Mannor ldquoRobustness and reg-ularization of support vector machinesrdquo Journal of MachineLearning Research vol 10 pp 1485ndash1510 2009

[11] T B Trafalis andRCGilbert ldquoRobust support vectormachinesfor classification and computational issuesrdquoOptimizationMeth-ods amp Software vol 22 no 1 pp 187ndash198 2007

[12] P J Huber Robust Statistics JohnWiley amp Sons New York NYUSA 1981

[13] J L Wu and P C Chang ldquoA trend-based segmentation methodand the support vector regression for financial time seriesforecastingrdquo Mathematical Problems in Engineering vol 2012Article ID 615152 20 pages 2012

[14] D X She and X H Yang ldquoA new adaptive local linearprediction method and its application in hydrological timeSeriesrdquoMathematical Problems in Engineering vol 2010 ArticleID 205438 15 pages 2010

[15] Z Liu ldquoChaotic time series analysisrdquoMathematical Problems inEngineering vol 2010 Article ID 720190 31 pages 2010

[16] M Li ldquoFractal time series a tutorial reviewrdquo MathematicalProblems in Engineering vol 2010 Article ID 157264 26 pages2010

[17] T Farooq A Guergachi and S Krishnan ldquoKnowledge-basedgreenrsquos kernel for support vector regressionrdquo MathematicalProblems in Engineering vol 2010 Article ID 378652 16 pages2010

[18] I Polik and T Terlaky ldquoA survey of the S-lemmardquo SIAMReviewvol 49 no 3 pp 371ndash418 2007

[19] K C Toh R H Tutunu and M J Todd ldquoOn the imple-mentation and usage of SDPT3Ca Matlab software package forsemidefinitequadratic-linear programmingrdquo version 4 0 2006httpecommonslibrarycornelleduhandle181315133

[20] J Kraft and A Kraft ldquoOn the relationship between energy andGNPrdquo Journal of Energy and Development vol 3 no 2 pp 401ndash403 1978

[21] O Ilhan ldquoA literature survey on energy growth nexusrdquo EnergyPolicy vol 38 pp 340ndash349 2010

[22] N Bowden and J E Payne ldquoThe causal relationship betweenUSenergy consumption and real output a disaggregated analysisrdquoJournal of Policy Modeling vol 31 no 2 pp 180ndash188 2009

[23] U Soytas and R Sari ldquoEnergy consumption economic growthand carbon emissions challenges faced by an EU candidatememberrdquo Ecological Economics vol 68 no 6 pp 1667ndash16752009

[24] B Cheng ldquoAn investigation of cointegration and causalitybetween energy consumption and economic growthrdquo Journalof Energy Development vol 21 no 1 pp 73ndash84 1995

[25] D I Stern ldquoEnergy and economic growth in the USA amultivariate approachrdquo Energy Economics vol 15 no 2 pp 137ndash150 1993

[26] S Boyd andLVandenbergheConvexOptimization CambridgeUniversity Press Cambridge UK 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 4: Research Article Robust Quadratic Regression and Its ...downloads.hindawi.com/journals/mpe/2013/210510.pdf · quadratic regression model with ball uncertainty set. is result is then

4 Mathematical Problems in Engineering

Proof First consider the inner maximization subproblem Itis obvious that

(IP119894) max (119910

119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573)2

st (119909119894 119910119894) isin 119880119894

lArrrArr min 1199062

119894

st 119906119894ge

10038161003816100381610038161003816119910119894minus119909119879

119894119876119909119894minus2120572119879119909119894minus120573

10038161003816100381610038161003816 forall(119909

119894 119910119894) isin 119880119894

(14)

If 120575119894= 0 we have that

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894+ 119906119894ge 0

forall(119909119894 119910119894) isin 119880119894lArrrArr 119903

119894+ 119906119894ge 0

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894minus 119906119894ge 0

forall(119909119894 119910119894) isin 119880119894lArrrArr 119903

119894minus 119906119894ge 0

(15)

where 119903119894= 119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894

If 120575119894gt 0 we can utilize the S-lemma as follows

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894+ 119906119894ge 0 forall(119909

119894 119910119894) isin 119880119894

lArrrArr Δ119909119879

119894119876Δ119909119894+ 2(119876

119879119909119894+ 120572)119879

Δ119909119894minus Δ119910119894+ 119903119894+ 119906119894ge 0

forall1003817100381710038171003817(Δ119910119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

lArrrArr (

1

Δ119910119894

Δ119909119894

)

119879

(

119906119894+ 119903119894

minus1

2(119876119879119909119894+ 120572)119879

minus1

20 0

1times119899

119876119879119909119894+ 120572 0

119899times1119876

)

times (

1

Δ119910119894

Δ119909119894

) ge 0 forall1003817100381710038171003817(Δ119910119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

lArrrArr exist119905119894ge0 st(

119906119894+ 119903119894minus 1199051198941205752

119894minus1

2(119876119879119909119894+ 120572)119879

minus1

2119905119894

01times119899

119876119879119909119894+ 120572 0

119899times1119876 + 119905119894119868119899times119899

)

⪰ 0

(16)

Note that in the last step if 1205752

119894gt 0 then there exists

(Δ119910119894 Δ119909119894) = (0 0) such that quadratic form 120575

2

119894minus(Δ119910

119894 Δ119909119894)2

2

is strictly positive thus the condition of S-lemma holds trulySimilarly we have that

119909119879

119894119876119909119894+ 2120572119879119909119894+ 120573 minus 119910

119894minus 119906119894le 0 forall(119909

119894 119910119894) isin 119880119894

lArrrArr Δ119909119879

119894119876Δ119909119894+ 2(119876

119879119909119894+ 120572)119879

Δ119909119894minus Δ119910119894+ 119903119894minus 119906119894le 0

forall1003817100381710038171003817(Δ119910119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

lArrrArr exist120591119894ge 0 st(

119906119894minus 119903119894minus 1205911198941205752

119894

1

2minus(119876119879119909119894+ 120572)119879

1

2120591119894

01times119899

minus119876119879119909119894minus 120572 0

119899times1120591119894119868119899times119899

minus 119876

)

⪰ 0

(17)

Thus the inner maximization problem is equivalent to thefollowing semi-definite programming

(IP) min119906119894 119903119894 119905119894120591119894

119898

sum

119894=1

1199062

119894

st (

119906119894+ 119903119894minus 1199051198941205752

119894minus1

2(119876119879119909119894+ 120572)119879

minus1

2119905119894

01times119899

119876119879119909119894+ 120572 0

119899times1119876 + 119905119894119868119899times119899

) ⪰ 0

119894 isin 119872+

(

119906119894minus 119903119894minus 1205911198941205752

119894

1

2minus(119876119879119909119894+ 120572)119879

1

2120591119894

01times119899

minus119876119879119909119894minus 120572 0

119899times1120591119894119868119899times119899

minus 119876

) ⪰ 0

119894 isin 119872+

119903119894+ 119906119894ge 0 119903

119894minus 119906119894ge 0 119894 isin 119872

0

119903119894= 120573 + 119909

119879

119894119876119909119894+ 2120572119879119909119894minus 119910119894

119894 = 1 119898

119905119894ge 0 120591

119894ge 0 119906

119894 119903119894isin 119877 119894 = 1 119898

(18)

Note that based on the Schur complement lemma thesecond-order cone constraint V ge radicsum

119898

119894=11199062119894can also be

formalized as the following semi-definite constraint

(V 119906

119879

119906 V119868119898times119898

) ⪰ 0 (19)

Thus we complete the proof by embedding the equivalentsemi-definite programming into the outer problem

Due to the advance of interior algorithms for conicprogramming the above semidefinite programming can beefficiently solved in polynomial run time There are severalefficient and free software packages for solving the semidef-inite programming such as the SDPT3 [19] Next we makeseveral extensions based on the separable robust least squarequadratic regression model

Mathematical Problems in Engineering 5

23 Ellipsoid Uncertainty Set and More Norm CriterionThe above result on standard ball uncertainty set can befurther extended to that on the following general ellipsoiduncertainty set

1198801015840

119904= 1198801015840

1times 1198801015840

2times sdot sdot sdot times 119880

1015840

119898

1198801015840

119894= (119909

119894 119910119894) isin 119877119899+1

119909119894= 119909119894+ Δ119909119894

119910119894= 119910119894+ Δ1199101198941003817100381710038171003817119875119894 (Δ119910

119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

(20)

where 119875119894

isin 119877119896times(119899+1) Linear transformation operator 119875

119894

allows us to impose more restrictions on the uncertaintyset For example if we choose the diagonal matrix 119875

119894=

Diag1205901 120590

119899+1 we can put different weights on deviation

of components of (119909119894 119910119894) general matrix can further restrict

the correlated deviation of different componentsTo obtain the corresponding reformulation we only need

to modify the first two constraints based on the S-lemma asfollows

(119906119894minus 1199051198941205752

119894

119905119894119875119879

119894119875119894

) + 119865 ⪰ 0 119894 isin 119872+

(119906119894minus 1205911198941205752

119894

120591119894119875119879

119894119875119894

) minus 119865 ⪰ 0 119894 isin 119872+

(21)

We further consider the robust quadratic regressionmodels with 119897

infin-norm and 119897

1-norm criterion Note that for

119897infin-norm criteria the inner maximization problem is of the

following form

max1le119894le119898

max(119909119894 119910119894)isin119880119894

119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

lArrrArr min 119906 119906 ge10038161003816100381610038161003816119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

10038161003816100381610038161003816

forall (119909119894 119910119894) isin 119880119894 1 le 119894 le 119898

(22)

And for 1198971-norm criteria we have the following equivalent

reformulation

max1le119894le119898

119898

sum

119894=1

10038161003816100381610038161003816119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

10038161003816100381610038161003816

(119909119894 119910119894) isin 119880119894 1 le 119894 le 119898

lArrrArr min

119898

sum

119894=1

119906119894 119906119894ge

10038161003816100381610038161003816119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

10038161003816100381610038161003816

forall (119909119894 119910119894) isin 119880119894 1 le 119894 le 119898

(23)

Using the similar approach as in Proposition 2 both canbe further reformulated as semi-definite programming

Proposition 3 The separable robust quadratic regressionmodel under 119897

infin-norm and 119897

1-norm criteria are equivalent to

the following semidefinite programming respectively

(119897infin-norm) min

120572120573119876119906119903119905120591

119906

st (

119906 minus 1199051198941205752

119894

119905119894

119905119894119868119899times119899

) +119865⪰0 119894isin119872+

(

119906 minus 1205911198941205752

119894

120591119894

120591119894119868119899times119899

) minus119865⪰ 0 119894isin119872+

119903119894+ 119906 ge 0 119903

119894minus 119906 ge 0 119894 isin 119872

0

119903119894=120573+119909

119879

119894119876119909119894+2120572119879119909119894minus119910119894 119894=1 119898

119905119894ge 0 120591119894ge 0 119903119894isin 119877 119894 = 1 119898

119906 120573 isin 119877 120572 isin 119877119899 119876 isin 119877

119899times119899

(1198971-norm) min

120572120573119876119906119903119905120591

119898

sum

119894=1

119906119894

st (

119906119894minus 1199051198941205752

119894

119905119894

119905119894119868119899times119899

)+119865 ⪰ 0 119894 isin 119872+

(

119906119894minus 1205911198941205752

119894

120591119894

120591119894119868119899times119899

)minus119865⪰0 119894 isin 119872+

119903119894+ 119906119894ge 0 119903

119894minus 119906119894ge 0 119894 isin 119872

0

119903119894= 120573 + 119909

119879

119894119876119909119894+ 2120572119879119909119894minus 119910119894

119894 = 1 119898

119905119894ge 0 120591

119894ge 0 119906

119894 119903119894isin 119877 119894 = 1 119898

120573 isin 119877 120572 isin 119877119899 119876 isin 119877

119899times119899

(24)

where

119865 = (

119903119894

minus1

2(119876119879119909119894+ 120572)119879

minus1

20 0

1times119899

119876119879119909119894+ 120572 0

119899times1119876

) (25)

3 Robust Energy-Growth Regression Models

Studies have been reported on the causal relationshipbetween economic growth and energy consumption In thissection we try to apply the proposed robust quadraticregression model to the energy-growth problem

The seminal paper of J Kraft andA Kraft [20] first studiesthe casual relationship for USA In a recent survey Ilhan [21]categorizes the casual relationships into four types no causal-ity unidirectional causality running from economic growth

6 Mathematical Problems in Engineering

1960 1970 1980 1990 2000 20100

2

Year

2

4

Per c

apita

l ene

rgy

(ton)

Per c

apita

l GD

P ($

1000

0)

Per capital GDP Per capital energy

(a)

15

20

25

30

35

40

45

50

Per c

apita

l ene

rgy

(ton)

06 08 10 12 14 16 18 20Per capital GDP ($10000)

(b)

Figure 1 Germany data from 1960 to 2006

0

2

4

Year

2

4

6

8

10Pe

r cap

ital e

nerg

y (to

n)

1860 1880 1900 1920 1940 1960 1980 2000 2020

Per c

apita

l GD

P ($

1000

0)

Per capital GDP Per capital energy

(a)

2

3

4

5

6

7

8

9

10

00 05 10 15 20 25 30 35

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

(b)

Figure 2 USA data from 1870 to 2006

to energy consumption the reverse case and the bidirectionalcausality Note that the resulted relationships depend onthe selected data and analysis approaches Sometimes theresults obtained from different approaches conflict with eachother when even using the data from the same country Forexample using the Toda-Yamamoto causality test methodBowden and Payne [22] show that energy consumption playsan important role in economic growth in USA based onhistory data from 1949 to 2006 while using the same methodSoytas and Sari [23] find that no causality exists betweenthem based on USA data from 1960 to 2006 On the otherhand based on the sameUSArsquos data from 1947 to 1990 Cheng[24] and Stern [25] conclude different causalities by utilizingdifferent analyzing approaches

Unlike the previous energy-growth studies we attemptto provide a long-run stationary regression model betweenthe per capital GDP (G) and per capital energy consumption

(EC) The underlying assumption of our model is similarto the traditional ldquoconservation hypothesisrdquo that means thatan increase in real GDP will cause an increase in energyconsumption [21] The ldquoper capitalrdquo perspective providesus with a new insight on the causality and new regressionmodels Figures 1 and 2 demonstrate the relationship betweenper capital energy consumption and per capital GDP inUSA and Germany respectively From the subfigures onthe left hand side we can see that in both countries thereis a gradual increase in economy while the per capitalenergy consumption may decrease after reaching a certainlevel the subfigures on the right hand side inspire us toestablish a nonlinear regression model to characterize therelationship

To eliminate effect of the imprecise statistics data weemploy the proposed robust quadratic regression model andput different weights on the residual errors at different time

Mathematical Problems in Engineering 7

Table 1 LS-CQR and LS-RQR models with different 120598

Model 119876 120572 120573 Err 119879 (s)CQR 120598 = 000 minus4254 6721 minus6099 1688 0000RQR 120598 = 001 minus3899 6225 minus5433 1621 0500RQR 120598 = 002 minus3690 5938 minus5063 1663 0500RQR 120598 = 003 minus3423 5561 minus4564 1735 0500RQR 120598 = 004 minus2900 4755 minus3363 2029 0516RQR 120598 = 005 minus2243 3719 minus1817 2574 0484

points Specifically we establish the followingweighted robustquadratic regression model

min120572120573119902

max(119866119905 119864119862119905)isin119880

120598

119905

(

119879

sum

119905=1

(119908119905(EC119905minus 1199021198662

119905minus 2120572119866

119905minus 120573))119901

)

1119901

(26)

where the weight factor 119908119905

isin [0 1] represents the relativeimportance of the predicted residual error in the 119905th year Wecould set 119908

119905= 0 for the abnormal data point and set 119908

119905as an

increase function of 119905 to emphasize the importance of recentdata The uncertainty set is defined as

119880120598

119905= (119866

119905EC119905)

10038171003817100381710038171003817(119866119905minus 119866119905) (EC

119905minus EC119905)100381710038171003817100381710038172

le 120575119905 (27)

where 120575119905

= 120576radic1198662

119905+ EC2119905 Parameter 120576 controls the relative

amplitude of the fluctuation in observed dataThe weighted robust quadratic regression model can be

summarized as follows

(1) Solve the classical quadratic regression model usingthe nominal values (119866

119905EC119905)119879

119905=1

(2) Based on the quadratic regression remove the datawith the first 119896 largest residual errors and set weightsvalue 119908

119905

(3) Solve the equivalent semi-definite programmingproblem and return the final weighted robustquadratic regression model

4 Numerical Experiments

In this section we verify the effectiveness of the proposedrobust quadratic regression models on several data sets Theequivalent semi-definite programming problem is solved bythe SDPT3 solver [19] Numerical experiments are imple-mented usingMATLAB 770 and run on Intel(R) Core(TM)2CPU E7400

First we test the proposed robust least square quadraticregression (LS-RQR) model with Germany data from 1960 to2006As previously discussed after the preliminary quadraticregression analysis we will remove the data with the first 119896largest residual errors where 119896 = 3timesdata sizeThen for therest of data we establish the classical least square quadraticregression (LS-CQR) and LS-RQR models respectively

Table 1 lists the computation results for LS-CQR andLS-RQR with a series of 120598 values The listed Err value

History data

15

20

25

30

35

40

45

50

Per c

apita

l ene

rgy

(ton)

06 08 10 12 14 16 18 20Per capital GDP ($10000)

LS-RQR ( = 001)LS-CQR ( = 000)

LS-RQR ( = 002)

LS-RQR ( = 004)LS-RQR ( = 003)

LS-RQR ( = 005)120598

120598

120598

120598

120598

120598

Figure 3 LS-CQR and LS-RQR models on Germany data

0

2

4

6

8

10

12

14

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 001)

LS-RQR ( = 003)LS-RQR ( = 005)

120598

120598

120598

120598

Figure 4 Mean square error of LS-CQR and LS-RQRmodels when120598 varies

represents the mean square error from the nominal valueand 119879 represents the run time for solving the optimizationproblem It is seen that the resulted robust model exhibitssmaller absolute values of 119876 120572 and 120573 with the increase of120598 value that is the regression curve is more flat as the model

8 Mathematical Problems in Engineering

15

20

25

30

35

40

45

50

History data

06 08 10 12 14 16 18 20

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

L1-RQR ( = 002)L2-RQR ( = 002)LI-RQR ( = 002)

120598

120598120598

Figure 5 RQR models under 1198971- 1198972- and 119897

infin-norm criteria

2

3

4

5

6

7

8

9

10

00 05 10 15 20 25 30 35

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

5

10

15

20

25

30

Err

000 002 004 006 008 010

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 6 USA data from 1870 to 2006

parameters are less precise It is obvious that one drawback ofthe robustmodel is that themean square error will increase asuncertainty increases Figure 3 plots the regression curves fordifferent models and also supports our analysis of the effectof increasing data uncertainty on robust regression

To demonstrate the effectiveness of the robust modelswe test the worst-case performance of the resulted modelswhen 120598 varies from 0 to 01 Specifically for each 120598 valuewe randomly generate 500 groups of data from the defineduncertainty set 119880120598

119905and then calculate the maximal residual

error at each data point Figure 4 plots the worst-case errorof LS-CQR model and LS-RQR models with 120598 = 001 003and 005 It is seen that the error of LS-CQR model increases

rapidly and LS-RQR with 120598 = 005 has the most flat errorcurve Figure 4 also indicates that it is critical to accuratelyestimate the variability of the data and set proper value for120598 In our case we recommend LS-RQR with 120598 = 003 that isalmost always better than the traditional LS-CQR model

Next we test the proposed RQR models under 1198971(L1-

RQR) and 119897infin-(LI-RQR) norm criteria on the same data set

Figure 5 plots the corresponding regression curves for thesame uncertainty set 120598 = 002 For the same 120598 value LI-RQR model can be considered as the most robust one andL1-RQR andL2-RQRmodels are similar It is noticeable that itcontradicts with the traditional robust regression terms Forexample [26] refers to the 119897

1-norm regression as the robust

Mathematical Problems in Engineering 9

10 12 14 16 18 20 22 24

35

40

45

50

55

60Pe

r cap

ital e

nerg

y (to

n)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)120598

120598

(b)

Figure 7 Switzerland data from 1965 to 2006

06 08 10 12 14 16 18 20 22 24

25

30

35

40

45

50

55

60

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 8 Belgium data from 1960 to 2006

regression model in the sense that the corresponding modelis insensitive to the large residual errors(corresponding to theoutliers)However after removing the possible abnormal datapoints here we try tomake our regression analysis insensitiveto the worst-case residual errors at each data point

Finally we apply the proposed RQR model on more datasets including USA data from 1870 to 2006 Switzerland datafrom 1965 to 2006 and Belgium data from 1960 to 2006Figures 6 7 and 8 give the resulted regressionmodels and theworst-case residual errors for different 120598 values It is seen thatthe proposed RQRmodels still almost always outperform the

CQR model especially for large uncertainty sets Based onthe robust quadratic regression models these three countriesreach the highest per capital energy consumption points atper capital GDP value around 23 000 while the peak valuesvary from 57 to 85 Ton

5 Conclusions and Future Works

In this paper we studied themultivariate quadratic regressionmodel with imprecise statistic data Unlike the traditionalrobust statistic approaches that focus on the detection of

10 Mathematical Problems in Engineering

the outliers and the elimination of the effects we employedthe recently developed robust optimization framework anduncertainty set theory

In particular we first extended the existing robust lin-ear regression results to the robust least square quadraticregression model with the separable ball uncertainty setThe specific form of the uncertainty set allowed us to usethe well-known S-lemma and give the tractable equivalentsemidefinite programmingWe further generalized the resultto robust models under 119897

1- and 119897

infin-norm criteria with general

ellipsoid uncertainty sets Next the proposed robust modelswere applied to the energy-growth problem Under the clas-sical conservation hypothesis we employed the traditionalquadratic regression model to remove the abnormal dataand established a robust quadratic regression model forthe per capital GDP and per capital energy consumptionFinally the proposed models were tested on the historydata of Germany USA Switzerland and Belgium From thenumerical experiments we found that (1) the amplitude of theuncertainty perturbation 120575 plays a critical role on the robustmodels (2) with the increase of 120575 the robust model has amore flat curve (3) for the same 120575 value compared with 119897

1-

and 1198972-normmodels 119897

infin-norm model is the most robust one

(4) as expected the robust approach provides a serial robustregression models that can reduce the worst-case residualerrors when the observed data contain noise

For further research robust polynomial (nonlinear)regressionmodels are interesting in their own right Althoughwe may always reduce them to the linear regression modelwith polynomially (or nonlinearly) transformed uncertaintydata set it is still worth studying whether the resultedregression models are solvable for quadratic regression withcoupled uncertainty sets

Acknowledgment

This work was supported by Geological Survey Project ofChina (nos 1212010881801 1212011120995)

References

[1] Z Griliches and V Ringstad ldquoErrors-in-the-variables bias innonlinear contextsrdquo Econometrica vol 38 no 2 pp 368ndash3701970

[2] W A Fuller Measurement Error Models John Wiley amp SonsNew York NY USA 1987

[3] T Erickson and T M Whited ldquoTwo-step GMM estimationof the errors-in-variables model using high-order momentsrdquoEconometric Theory vol 18 no 3 pp 776ndash799 2002

[4] P J Cornbleet and N Gochman ldquoIncorrect least-squaresregression coefficientsrdquo Clinical Chemistry vol 25 no 3 pp432ndash438 1979

[5] J W Gillard ldquoAn historical overview of linear regression witherrors in both variablesrdquo Tech Rep Cardiff University Schoolof Mathematics Cardiff UK 2006

[6] L El Ghaoui and H Lebret ldquoRobust solutions to least-squaresproblemswith uncertain datardquo SIAM Journal onMatrixAnalysisand Applications vol 18 no 4 pp 1035ndash1064 1997

[7] P K Shivaswamy C Bhattacharyya and A J Smola ldquoSecondorder cone programming approaches for handling missing anduncertain datardquo Journal ofMachine Learning Research vol 7 pp1283ndash1314 2006

[8] A Ben-Tal L El Ghaoui and A Nemirovski Robust Optimiza-tion Princeton University Press Princeton NJ USA 2009

[9] T B Trafalis and R C Gilbert ldquoRobust classification andregression using support vector machinesrdquo European Journal ofOperational Research vol 173 no 3 pp 893ndash909 2006

[10] H Xu C Caramanis and S Mannor ldquoRobustness and reg-ularization of support vector machinesrdquo Journal of MachineLearning Research vol 10 pp 1485ndash1510 2009

[11] T B Trafalis andRCGilbert ldquoRobust support vectormachinesfor classification and computational issuesrdquoOptimizationMeth-ods amp Software vol 22 no 1 pp 187ndash198 2007

[12] P J Huber Robust Statistics JohnWiley amp Sons New York NYUSA 1981

[13] J L Wu and P C Chang ldquoA trend-based segmentation methodand the support vector regression for financial time seriesforecastingrdquo Mathematical Problems in Engineering vol 2012Article ID 615152 20 pages 2012

[14] D X She and X H Yang ldquoA new adaptive local linearprediction method and its application in hydrological timeSeriesrdquoMathematical Problems in Engineering vol 2010 ArticleID 205438 15 pages 2010

[15] Z Liu ldquoChaotic time series analysisrdquoMathematical Problems inEngineering vol 2010 Article ID 720190 31 pages 2010

[16] M Li ldquoFractal time series a tutorial reviewrdquo MathematicalProblems in Engineering vol 2010 Article ID 157264 26 pages2010

[17] T Farooq A Guergachi and S Krishnan ldquoKnowledge-basedgreenrsquos kernel for support vector regressionrdquo MathematicalProblems in Engineering vol 2010 Article ID 378652 16 pages2010

[18] I Polik and T Terlaky ldquoA survey of the S-lemmardquo SIAMReviewvol 49 no 3 pp 371ndash418 2007

[19] K C Toh R H Tutunu and M J Todd ldquoOn the imple-mentation and usage of SDPT3Ca Matlab software package forsemidefinitequadratic-linear programmingrdquo version 4 0 2006httpecommonslibrarycornelleduhandle181315133

[20] J Kraft and A Kraft ldquoOn the relationship between energy andGNPrdquo Journal of Energy and Development vol 3 no 2 pp 401ndash403 1978

[21] O Ilhan ldquoA literature survey on energy growth nexusrdquo EnergyPolicy vol 38 pp 340ndash349 2010

[22] N Bowden and J E Payne ldquoThe causal relationship betweenUSenergy consumption and real output a disaggregated analysisrdquoJournal of Policy Modeling vol 31 no 2 pp 180ndash188 2009

[23] U Soytas and R Sari ldquoEnergy consumption economic growthand carbon emissions challenges faced by an EU candidatememberrdquo Ecological Economics vol 68 no 6 pp 1667ndash16752009

[24] B Cheng ldquoAn investigation of cointegration and causalitybetween energy consumption and economic growthrdquo Journalof Energy Development vol 21 no 1 pp 73ndash84 1995

[25] D I Stern ldquoEnergy and economic growth in the USA amultivariate approachrdquo Energy Economics vol 15 no 2 pp 137ndash150 1993

[26] S Boyd andLVandenbergheConvexOptimization CambridgeUniversity Press Cambridge UK 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 5: Research Article Robust Quadratic Regression and Its ...downloads.hindawi.com/journals/mpe/2013/210510.pdf · quadratic regression model with ball uncertainty set. is result is then

Mathematical Problems in Engineering 5

23 Ellipsoid Uncertainty Set and More Norm CriterionThe above result on standard ball uncertainty set can befurther extended to that on the following general ellipsoiduncertainty set

1198801015840

119904= 1198801015840

1times 1198801015840

2times sdot sdot sdot times 119880

1015840

119898

1198801015840

119894= (119909

119894 119910119894) isin 119877119899+1

119909119894= 119909119894+ Δ119909119894

119910119894= 119910119894+ Δ1199101198941003817100381710038171003817119875119894 (Δ119910

119894 Δ119909119894)10038171003817100381710038172 le 120575

119894

(20)

where 119875119894

isin 119877119896times(119899+1) Linear transformation operator 119875

119894

allows us to impose more restrictions on the uncertaintyset For example if we choose the diagonal matrix 119875

119894=

Diag1205901 120590

119899+1 we can put different weights on deviation

of components of (119909119894 119910119894) general matrix can further restrict

the correlated deviation of different componentsTo obtain the corresponding reformulation we only need

to modify the first two constraints based on the S-lemma asfollows

(119906119894minus 1199051198941205752

119894

119905119894119875119879

119894119875119894

) + 119865 ⪰ 0 119894 isin 119872+

(119906119894minus 1205911198941205752

119894

120591119894119875119879

119894119875119894

) minus 119865 ⪰ 0 119894 isin 119872+

(21)

We further consider the robust quadratic regressionmodels with 119897

infin-norm and 119897

1-norm criterion Note that for

119897infin-norm criteria the inner maximization problem is of the

following form

max1le119894le119898

max(119909119894 119910119894)isin119880119894

119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

lArrrArr min 119906 119906 ge10038161003816100381610038161003816119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

10038161003816100381610038161003816

forall (119909119894 119910119894) isin 119880119894 1 le 119894 le 119898

(22)

And for 1198971-norm criteria we have the following equivalent

reformulation

max1le119894le119898

119898

sum

119894=1

10038161003816100381610038161003816119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

10038161003816100381610038161003816

(119909119894 119910119894) isin 119880119894 1 le 119894 le 119898

lArrrArr min

119898

sum

119894=1

119906119894 119906119894ge

10038161003816100381610038161003816119910119894minus 119909119879

119894119876119909119894minus 2120572119879119909119894minus 120573

10038161003816100381610038161003816

forall (119909119894 119910119894) isin 119880119894 1 le 119894 le 119898

(23)

Using the similar approach as in Proposition 2 both canbe further reformulated as semi-definite programming

Proposition 3 The separable robust quadratic regressionmodel under 119897

infin-norm and 119897

1-norm criteria are equivalent to

the following semidefinite programming respectively

(119897infin-norm) min

120572120573119876119906119903119905120591

119906

st (

119906 minus 1199051198941205752

119894

119905119894

119905119894119868119899times119899

) +119865⪰0 119894isin119872+

(

119906 minus 1205911198941205752

119894

120591119894

120591119894119868119899times119899

) minus119865⪰ 0 119894isin119872+

119903119894+ 119906 ge 0 119903

119894minus 119906 ge 0 119894 isin 119872

0

119903119894=120573+119909

119879

119894119876119909119894+2120572119879119909119894minus119910119894 119894=1 119898

119905119894ge 0 120591119894ge 0 119903119894isin 119877 119894 = 1 119898

119906 120573 isin 119877 120572 isin 119877119899 119876 isin 119877

119899times119899

(1198971-norm) min

120572120573119876119906119903119905120591

119898

sum

119894=1

119906119894

st (

119906119894minus 1199051198941205752

119894

119905119894

119905119894119868119899times119899

)+119865 ⪰ 0 119894 isin 119872+

(

119906119894minus 1205911198941205752

119894

120591119894

120591119894119868119899times119899

)minus119865⪰0 119894 isin 119872+

119903119894+ 119906119894ge 0 119903

119894minus 119906119894ge 0 119894 isin 119872

0

119903119894= 120573 + 119909

119879

119894119876119909119894+ 2120572119879119909119894minus 119910119894

119894 = 1 119898

119905119894ge 0 120591

119894ge 0 119906

119894 119903119894isin 119877 119894 = 1 119898

120573 isin 119877 120572 isin 119877119899 119876 isin 119877

119899times119899

(24)

where

119865 = (

119903119894

minus1

2(119876119879119909119894+ 120572)119879

minus1

20 0

1times119899

119876119879119909119894+ 120572 0

119899times1119876

) (25)

3 Robust Energy-Growth Regression Models

Studies have been reported on the causal relationshipbetween economic growth and energy consumption In thissection we try to apply the proposed robust quadraticregression model to the energy-growth problem

The seminal paper of J Kraft andA Kraft [20] first studiesthe casual relationship for USA In a recent survey Ilhan [21]categorizes the casual relationships into four types no causal-ity unidirectional causality running from economic growth

6 Mathematical Problems in Engineering

1960 1970 1980 1990 2000 20100

2

Year

2

4

Per c

apita

l ene

rgy

(ton)

Per c

apita

l GD

P ($

1000

0)

Per capital GDP Per capital energy

(a)

15

20

25

30

35

40

45

50

Per c

apita

l ene

rgy

(ton)

06 08 10 12 14 16 18 20Per capital GDP ($10000)

(b)

Figure 1 Germany data from 1960 to 2006

0

2

4

Year

2

4

6

8

10Pe

r cap

ital e

nerg

y (to

n)

1860 1880 1900 1920 1940 1960 1980 2000 2020

Per c

apita

l GD

P ($

1000

0)

Per capital GDP Per capital energy

(a)

2

3

4

5

6

7

8

9

10

00 05 10 15 20 25 30 35

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

(b)

Figure 2 USA data from 1870 to 2006

to energy consumption the reverse case and the bidirectionalcausality Note that the resulted relationships depend onthe selected data and analysis approaches Sometimes theresults obtained from different approaches conflict with eachother when even using the data from the same country Forexample using the Toda-Yamamoto causality test methodBowden and Payne [22] show that energy consumption playsan important role in economic growth in USA based onhistory data from 1949 to 2006 while using the same methodSoytas and Sari [23] find that no causality exists betweenthem based on USA data from 1960 to 2006 On the otherhand based on the sameUSArsquos data from 1947 to 1990 Cheng[24] and Stern [25] conclude different causalities by utilizingdifferent analyzing approaches

Unlike the previous energy-growth studies we attemptto provide a long-run stationary regression model betweenthe per capital GDP (G) and per capital energy consumption

(EC) The underlying assumption of our model is similarto the traditional ldquoconservation hypothesisrdquo that means thatan increase in real GDP will cause an increase in energyconsumption [21] The ldquoper capitalrdquo perspective providesus with a new insight on the causality and new regressionmodels Figures 1 and 2 demonstrate the relationship betweenper capital energy consumption and per capital GDP inUSA and Germany respectively From the subfigures onthe left hand side we can see that in both countries thereis a gradual increase in economy while the per capitalenergy consumption may decrease after reaching a certainlevel the subfigures on the right hand side inspire us toestablish a nonlinear regression model to characterize therelationship

To eliminate effect of the imprecise statistics data weemploy the proposed robust quadratic regression model andput different weights on the residual errors at different time

Mathematical Problems in Engineering 7

Table 1 LS-CQR and LS-RQR models with different 120598

Model 119876 120572 120573 Err 119879 (s)CQR 120598 = 000 minus4254 6721 minus6099 1688 0000RQR 120598 = 001 minus3899 6225 minus5433 1621 0500RQR 120598 = 002 minus3690 5938 minus5063 1663 0500RQR 120598 = 003 minus3423 5561 minus4564 1735 0500RQR 120598 = 004 minus2900 4755 minus3363 2029 0516RQR 120598 = 005 minus2243 3719 minus1817 2574 0484

points Specifically we establish the followingweighted robustquadratic regression model

min120572120573119902

max(119866119905 119864119862119905)isin119880

120598

119905

(

119879

sum

119905=1

(119908119905(EC119905minus 1199021198662

119905minus 2120572119866

119905minus 120573))119901

)

1119901

(26)

where the weight factor 119908119905

isin [0 1] represents the relativeimportance of the predicted residual error in the 119905th year Wecould set 119908

119905= 0 for the abnormal data point and set 119908

119905as an

increase function of 119905 to emphasize the importance of recentdata The uncertainty set is defined as

119880120598

119905= (119866

119905EC119905)

10038171003817100381710038171003817(119866119905minus 119866119905) (EC

119905minus EC119905)100381710038171003817100381710038172

le 120575119905 (27)

where 120575119905

= 120576radic1198662

119905+ EC2119905 Parameter 120576 controls the relative

amplitude of the fluctuation in observed dataThe weighted robust quadratic regression model can be

summarized as follows

(1) Solve the classical quadratic regression model usingthe nominal values (119866

119905EC119905)119879

119905=1

(2) Based on the quadratic regression remove the datawith the first 119896 largest residual errors and set weightsvalue 119908

119905

(3) Solve the equivalent semi-definite programmingproblem and return the final weighted robustquadratic regression model

4 Numerical Experiments

In this section we verify the effectiveness of the proposedrobust quadratic regression models on several data sets Theequivalent semi-definite programming problem is solved bythe SDPT3 solver [19] Numerical experiments are imple-mented usingMATLAB 770 and run on Intel(R) Core(TM)2CPU E7400

First we test the proposed robust least square quadraticregression (LS-RQR) model with Germany data from 1960 to2006As previously discussed after the preliminary quadraticregression analysis we will remove the data with the first 119896largest residual errors where 119896 = 3timesdata sizeThen for therest of data we establish the classical least square quadraticregression (LS-CQR) and LS-RQR models respectively

Table 1 lists the computation results for LS-CQR andLS-RQR with a series of 120598 values The listed Err value

History data

15

20

25

30

35

40

45

50

Per c

apita

l ene

rgy

(ton)

06 08 10 12 14 16 18 20Per capital GDP ($10000)

LS-RQR ( = 001)LS-CQR ( = 000)

LS-RQR ( = 002)

LS-RQR ( = 004)LS-RQR ( = 003)

LS-RQR ( = 005)120598

120598

120598

120598

120598

120598

Figure 3 LS-CQR and LS-RQR models on Germany data

0

2

4

6

8

10

12

14

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 001)

LS-RQR ( = 003)LS-RQR ( = 005)

120598

120598

120598

120598

Figure 4 Mean square error of LS-CQR and LS-RQRmodels when120598 varies

represents the mean square error from the nominal valueand 119879 represents the run time for solving the optimizationproblem It is seen that the resulted robust model exhibitssmaller absolute values of 119876 120572 and 120573 with the increase of120598 value that is the regression curve is more flat as the model

8 Mathematical Problems in Engineering

15

20

25

30

35

40

45

50

History data

06 08 10 12 14 16 18 20

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

L1-RQR ( = 002)L2-RQR ( = 002)LI-RQR ( = 002)

120598

120598120598

Figure 5 RQR models under 1198971- 1198972- and 119897

infin-norm criteria

2

3

4

5

6

7

8

9

10

00 05 10 15 20 25 30 35

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

5

10

15

20

25

30

Err

000 002 004 006 008 010

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 6 USA data from 1870 to 2006

parameters are less precise It is obvious that one drawback ofthe robustmodel is that themean square error will increase asuncertainty increases Figure 3 plots the regression curves fordifferent models and also supports our analysis of the effectof increasing data uncertainty on robust regression

To demonstrate the effectiveness of the robust modelswe test the worst-case performance of the resulted modelswhen 120598 varies from 0 to 01 Specifically for each 120598 valuewe randomly generate 500 groups of data from the defineduncertainty set 119880120598

119905and then calculate the maximal residual

error at each data point Figure 4 plots the worst-case errorof LS-CQR model and LS-RQR models with 120598 = 001 003and 005 It is seen that the error of LS-CQR model increases

rapidly and LS-RQR with 120598 = 005 has the most flat errorcurve Figure 4 also indicates that it is critical to accuratelyestimate the variability of the data and set proper value for120598 In our case we recommend LS-RQR with 120598 = 003 that isalmost always better than the traditional LS-CQR model

Next we test the proposed RQR models under 1198971(L1-

RQR) and 119897infin-(LI-RQR) norm criteria on the same data set

Figure 5 plots the corresponding regression curves for thesame uncertainty set 120598 = 002 For the same 120598 value LI-RQR model can be considered as the most robust one andL1-RQR andL2-RQRmodels are similar It is noticeable that itcontradicts with the traditional robust regression terms Forexample [26] refers to the 119897

1-norm regression as the robust

Mathematical Problems in Engineering 9

10 12 14 16 18 20 22 24

35

40

45

50

55

60Pe

r cap

ital e

nerg

y (to

n)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)120598

120598

(b)

Figure 7 Switzerland data from 1965 to 2006

06 08 10 12 14 16 18 20 22 24

25

30

35

40

45

50

55

60

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 8 Belgium data from 1960 to 2006

regression model in the sense that the corresponding modelis insensitive to the large residual errors(corresponding to theoutliers)However after removing the possible abnormal datapoints here we try tomake our regression analysis insensitiveto the worst-case residual errors at each data point

Finally we apply the proposed RQR model on more datasets including USA data from 1870 to 2006 Switzerland datafrom 1965 to 2006 and Belgium data from 1960 to 2006Figures 6 7 and 8 give the resulted regressionmodels and theworst-case residual errors for different 120598 values It is seen thatthe proposed RQRmodels still almost always outperform the

CQR model especially for large uncertainty sets Based onthe robust quadratic regression models these three countriesreach the highest per capital energy consumption points atper capital GDP value around 23 000 while the peak valuesvary from 57 to 85 Ton

5 Conclusions and Future Works

In this paper we studied themultivariate quadratic regressionmodel with imprecise statistic data Unlike the traditionalrobust statistic approaches that focus on the detection of

10 Mathematical Problems in Engineering

the outliers and the elimination of the effects we employedthe recently developed robust optimization framework anduncertainty set theory

In particular we first extended the existing robust lin-ear regression results to the robust least square quadraticregression model with the separable ball uncertainty setThe specific form of the uncertainty set allowed us to usethe well-known S-lemma and give the tractable equivalentsemidefinite programmingWe further generalized the resultto robust models under 119897

1- and 119897

infin-norm criteria with general

ellipsoid uncertainty sets Next the proposed robust modelswere applied to the energy-growth problem Under the clas-sical conservation hypothesis we employed the traditionalquadratic regression model to remove the abnormal dataand established a robust quadratic regression model forthe per capital GDP and per capital energy consumptionFinally the proposed models were tested on the historydata of Germany USA Switzerland and Belgium From thenumerical experiments we found that (1) the amplitude of theuncertainty perturbation 120575 plays a critical role on the robustmodels (2) with the increase of 120575 the robust model has amore flat curve (3) for the same 120575 value compared with 119897

1-

and 1198972-normmodels 119897

infin-norm model is the most robust one

(4) as expected the robust approach provides a serial robustregression models that can reduce the worst-case residualerrors when the observed data contain noise

For further research robust polynomial (nonlinear)regressionmodels are interesting in their own right Althoughwe may always reduce them to the linear regression modelwith polynomially (or nonlinearly) transformed uncertaintydata set it is still worth studying whether the resultedregression models are solvable for quadratic regression withcoupled uncertainty sets

Acknowledgment

This work was supported by Geological Survey Project ofChina (nos 1212010881801 1212011120995)

References

[1] Z Griliches and V Ringstad ldquoErrors-in-the-variables bias innonlinear contextsrdquo Econometrica vol 38 no 2 pp 368ndash3701970

[2] W A Fuller Measurement Error Models John Wiley amp SonsNew York NY USA 1987

[3] T Erickson and T M Whited ldquoTwo-step GMM estimationof the errors-in-variables model using high-order momentsrdquoEconometric Theory vol 18 no 3 pp 776ndash799 2002

[4] P J Cornbleet and N Gochman ldquoIncorrect least-squaresregression coefficientsrdquo Clinical Chemistry vol 25 no 3 pp432ndash438 1979

[5] J W Gillard ldquoAn historical overview of linear regression witherrors in both variablesrdquo Tech Rep Cardiff University Schoolof Mathematics Cardiff UK 2006

[6] L El Ghaoui and H Lebret ldquoRobust solutions to least-squaresproblemswith uncertain datardquo SIAM Journal onMatrixAnalysisand Applications vol 18 no 4 pp 1035ndash1064 1997

[7] P K Shivaswamy C Bhattacharyya and A J Smola ldquoSecondorder cone programming approaches for handling missing anduncertain datardquo Journal ofMachine Learning Research vol 7 pp1283ndash1314 2006

[8] A Ben-Tal L El Ghaoui and A Nemirovski Robust Optimiza-tion Princeton University Press Princeton NJ USA 2009

[9] T B Trafalis and R C Gilbert ldquoRobust classification andregression using support vector machinesrdquo European Journal ofOperational Research vol 173 no 3 pp 893ndash909 2006

[10] H Xu C Caramanis and S Mannor ldquoRobustness and reg-ularization of support vector machinesrdquo Journal of MachineLearning Research vol 10 pp 1485ndash1510 2009

[11] T B Trafalis andRCGilbert ldquoRobust support vectormachinesfor classification and computational issuesrdquoOptimizationMeth-ods amp Software vol 22 no 1 pp 187ndash198 2007

[12] P J Huber Robust Statistics JohnWiley amp Sons New York NYUSA 1981

[13] J L Wu and P C Chang ldquoA trend-based segmentation methodand the support vector regression for financial time seriesforecastingrdquo Mathematical Problems in Engineering vol 2012Article ID 615152 20 pages 2012

[14] D X She and X H Yang ldquoA new adaptive local linearprediction method and its application in hydrological timeSeriesrdquoMathematical Problems in Engineering vol 2010 ArticleID 205438 15 pages 2010

[15] Z Liu ldquoChaotic time series analysisrdquoMathematical Problems inEngineering vol 2010 Article ID 720190 31 pages 2010

[16] M Li ldquoFractal time series a tutorial reviewrdquo MathematicalProblems in Engineering vol 2010 Article ID 157264 26 pages2010

[17] T Farooq A Guergachi and S Krishnan ldquoKnowledge-basedgreenrsquos kernel for support vector regressionrdquo MathematicalProblems in Engineering vol 2010 Article ID 378652 16 pages2010

[18] I Polik and T Terlaky ldquoA survey of the S-lemmardquo SIAMReviewvol 49 no 3 pp 371ndash418 2007

[19] K C Toh R H Tutunu and M J Todd ldquoOn the imple-mentation and usage of SDPT3Ca Matlab software package forsemidefinitequadratic-linear programmingrdquo version 4 0 2006httpecommonslibrarycornelleduhandle181315133

[20] J Kraft and A Kraft ldquoOn the relationship between energy andGNPrdquo Journal of Energy and Development vol 3 no 2 pp 401ndash403 1978

[21] O Ilhan ldquoA literature survey on energy growth nexusrdquo EnergyPolicy vol 38 pp 340ndash349 2010

[22] N Bowden and J E Payne ldquoThe causal relationship betweenUSenergy consumption and real output a disaggregated analysisrdquoJournal of Policy Modeling vol 31 no 2 pp 180ndash188 2009

[23] U Soytas and R Sari ldquoEnergy consumption economic growthand carbon emissions challenges faced by an EU candidatememberrdquo Ecological Economics vol 68 no 6 pp 1667ndash16752009

[24] B Cheng ldquoAn investigation of cointegration and causalitybetween energy consumption and economic growthrdquo Journalof Energy Development vol 21 no 1 pp 73ndash84 1995

[25] D I Stern ldquoEnergy and economic growth in the USA amultivariate approachrdquo Energy Economics vol 15 no 2 pp 137ndash150 1993

[26] S Boyd andLVandenbergheConvexOptimization CambridgeUniversity Press Cambridge UK 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 6: Research Article Robust Quadratic Regression and Its ...downloads.hindawi.com/journals/mpe/2013/210510.pdf · quadratic regression model with ball uncertainty set. is result is then

6 Mathematical Problems in Engineering

1960 1970 1980 1990 2000 20100

2

Year

2

4

Per c

apita

l ene

rgy

(ton)

Per c

apita

l GD

P ($

1000

0)

Per capital GDP Per capital energy

(a)

15

20

25

30

35

40

45

50

Per c

apita

l ene

rgy

(ton)

06 08 10 12 14 16 18 20Per capital GDP ($10000)

(b)

Figure 1 Germany data from 1960 to 2006

0

2

4

Year

2

4

6

8

10Pe

r cap

ital e

nerg

y (to

n)

1860 1880 1900 1920 1940 1960 1980 2000 2020

Per c

apita

l GD

P ($

1000

0)

Per capital GDP Per capital energy

(a)

2

3

4

5

6

7

8

9

10

00 05 10 15 20 25 30 35

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

(b)

Figure 2 USA data from 1870 to 2006

to energy consumption the reverse case and the bidirectionalcausality Note that the resulted relationships depend onthe selected data and analysis approaches Sometimes theresults obtained from different approaches conflict with eachother when even using the data from the same country Forexample using the Toda-Yamamoto causality test methodBowden and Payne [22] show that energy consumption playsan important role in economic growth in USA based onhistory data from 1949 to 2006 while using the same methodSoytas and Sari [23] find that no causality exists betweenthem based on USA data from 1960 to 2006 On the otherhand based on the sameUSArsquos data from 1947 to 1990 Cheng[24] and Stern [25] conclude different causalities by utilizingdifferent analyzing approaches

Unlike the previous energy-growth studies we attemptto provide a long-run stationary regression model betweenthe per capital GDP (G) and per capital energy consumption

(EC) The underlying assumption of our model is similarto the traditional ldquoconservation hypothesisrdquo that means thatan increase in real GDP will cause an increase in energyconsumption [21] The ldquoper capitalrdquo perspective providesus with a new insight on the causality and new regressionmodels Figures 1 and 2 demonstrate the relationship betweenper capital energy consumption and per capital GDP inUSA and Germany respectively From the subfigures onthe left hand side we can see that in both countries thereis a gradual increase in economy while the per capitalenergy consumption may decrease after reaching a certainlevel the subfigures on the right hand side inspire us toestablish a nonlinear regression model to characterize therelationship

To eliminate effect of the imprecise statistics data weemploy the proposed robust quadratic regression model andput different weights on the residual errors at different time

Mathematical Problems in Engineering 7

Table 1 LS-CQR and LS-RQR models with different 120598

Model 119876 120572 120573 Err 119879 (s)CQR 120598 = 000 minus4254 6721 minus6099 1688 0000RQR 120598 = 001 minus3899 6225 minus5433 1621 0500RQR 120598 = 002 minus3690 5938 minus5063 1663 0500RQR 120598 = 003 minus3423 5561 minus4564 1735 0500RQR 120598 = 004 minus2900 4755 minus3363 2029 0516RQR 120598 = 005 minus2243 3719 minus1817 2574 0484

points Specifically we establish the followingweighted robustquadratic regression model

min120572120573119902

max(119866119905 119864119862119905)isin119880

120598

119905

(

119879

sum

119905=1

(119908119905(EC119905minus 1199021198662

119905minus 2120572119866

119905minus 120573))119901

)

1119901

(26)

where the weight factor 119908119905

isin [0 1] represents the relativeimportance of the predicted residual error in the 119905th year Wecould set 119908

119905= 0 for the abnormal data point and set 119908

119905as an

increase function of 119905 to emphasize the importance of recentdata The uncertainty set is defined as

119880120598

119905= (119866

119905EC119905)

10038171003817100381710038171003817(119866119905minus 119866119905) (EC

119905minus EC119905)100381710038171003817100381710038172

le 120575119905 (27)

where 120575119905

= 120576radic1198662

119905+ EC2119905 Parameter 120576 controls the relative

amplitude of the fluctuation in observed dataThe weighted robust quadratic regression model can be

summarized as follows

(1) Solve the classical quadratic regression model usingthe nominal values (119866

119905EC119905)119879

119905=1

(2) Based on the quadratic regression remove the datawith the first 119896 largest residual errors and set weightsvalue 119908

119905

(3) Solve the equivalent semi-definite programmingproblem and return the final weighted robustquadratic regression model

4 Numerical Experiments

In this section we verify the effectiveness of the proposedrobust quadratic regression models on several data sets Theequivalent semi-definite programming problem is solved bythe SDPT3 solver [19] Numerical experiments are imple-mented usingMATLAB 770 and run on Intel(R) Core(TM)2CPU E7400

First we test the proposed robust least square quadraticregression (LS-RQR) model with Germany data from 1960 to2006As previously discussed after the preliminary quadraticregression analysis we will remove the data with the first 119896largest residual errors where 119896 = 3timesdata sizeThen for therest of data we establish the classical least square quadraticregression (LS-CQR) and LS-RQR models respectively

Table 1 lists the computation results for LS-CQR andLS-RQR with a series of 120598 values The listed Err value

History data

15

20

25

30

35

40

45

50

Per c

apita

l ene

rgy

(ton)

06 08 10 12 14 16 18 20Per capital GDP ($10000)

LS-RQR ( = 001)LS-CQR ( = 000)

LS-RQR ( = 002)

LS-RQR ( = 004)LS-RQR ( = 003)

LS-RQR ( = 005)120598

120598

120598

120598

120598

120598

Figure 3 LS-CQR and LS-RQR models on Germany data

0

2

4

6

8

10

12

14

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 001)

LS-RQR ( = 003)LS-RQR ( = 005)

120598

120598

120598

120598

Figure 4 Mean square error of LS-CQR and LS-RQRmodels when120598 varies

represents the mean square error from the nominal valueand 119879 represents the run time for solving the optimizationproblem It is seen that the resulted robust model exhibitssmaller absolute values of 119876 120572 and 120573 with the increase of120598 value that is the regression curve is more flat as the model

8 Mathematical Problems in Engineering

15

20

25

30

35

40

45

50

History data

06 08 10 12 14 16 18 20

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

L1-RQR ( = 002)L2-RQR ( = 002)LI-RQR ( = 002)

120598

120598120598

Figure 5 RQR models under 1198971- 1198972- and 119897

infin-norm criteria

2

3

4

5

6

7

8

9

10

00 05 10 15 20 25 30 35

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

5

10

15

20

25

30

Err

000 002 004 006 008 010

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 6 USA data from 1870 to 2006

parameters are less precise It is obvious that one drawback ofthe robustmodel is that themean square error will increase asuncertainty increases Figure 3 plots the regression curves fordifferent models and also supports our analysis of the effectof increasing data uncertainty on robust regression

To demonstrate the effectiveness of the robust modelswe test the worst-case performance of the resulted modelswhen 120598 varies from 0 to 01 Specifically for each 120598 valuewe randomly generate 500 groups of data from the defineduncertainty set 119880120598

119905and then calculate the maximal residual

error at each data point Figure 4 plots the worst-case errorof LS-CQR model and LS-RQR models with 120598 = 001 003and 005 It is seen that the error of LS-CQR model increases

rapidly and LS-RQR with 120598 = 005 has the most flat errorcurve Figure 4 also indicates that it is critical to accuratelyestimate the variability of the data and set proper value for120598 In our case we recommend LS-RQR with 120598 = 003 that isalmost always better than the traditional LS-CQR model

Next we test the proposed RQR models under 1198971(L1-

RQR) and 119897infin-(LI-RQR) norm criteria on the same data set

Figure 5 plots the corresponding regression curves for thesame uncertainty set 120598 = 002 For the same 120598 value LI-RQR model can be considered as the most robust one andL1-RQR andL2-RQRmodels are similar It is noticeable that itcontradicts with the traditional robust regression terms Forexample [26] refers to the 119897

1-norm regression as the robust

Mathematical Problems in Engineering 9

10 12 14 16 18 20 22 24

35

40

45

50

55

60Pe

r cap

ital e

nerg

y (to

n)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)120598

120598

(b)

Figure 7 Switzerland data from 1965 to 2006

06 08 10 12 14 16 18 20 22 24

25

30

35

40

45

50

55

60

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 8 Belgium data from 1960 to 2006

regression model in the sense that the corresponding modelis insensitive to the large residual errors(corresponding to theoutliers)However after removing the possible abnormal datapoints here we try tomake our regression analysis insensitiveto the worst-case residual errors at each data point

Finally we apply the proposed RQR model on more datasets including USA data from 1870 to 2006 Switzerland datafrom 1965 to 2006 and Belgium data from 1960 to 2006Figures 6 7 and 8 give the resulted regressionmodels and theworst-case residual errors for different 120598 values It is seen thatthe proposed RQRmodels still almost always outperform the

CQR model especially for large uncertainty sets Based onthe robust quadratic regression models these three countriesreach the highest per capital energy consumption points atper capital GDP value around 23 000 while the peak valuesvary from 57 to 85 Ton

5 Conclusions and Future Works

In this paper we studied themultivariate quadratic regressionmodel with imprecise statistic data Unlike the traditionalrobust statistic approaches that focus on the detection of

10 Mathematical Problems in Engineering

the outliers and the elimination of the effects we employedthe recently developed robust optimization framework anduncertainty set theory

In particular we first extended the existing robust lin-ear regression results to the robust least square quadraticregression model with the separable ball uncertainty setThe specific form of the uncertainty set allowed us to usethe well-known S-lemma and give the tractable equivalentsemidefinite programmingWe further generalized the resultto robust models under 119897

1- and 119897

infin-norm criteria with general

ellipsoid uncertainty sets Next the proposed robust modelswere applied to the energy-growth problem Under the clas-sical conservation hypothesis we employed the traditionalquadratic regression model to remove the abnormal dataand established a robust quadratic regression model forthe per capital GDP and per capital energy consumptionFinally the proposed models were tested on the historydata of Germany USA Switzerland and Belgium From thenumerical experiments we found that (1) the amplitude of theuncertainty perturbation 120575 plays a critical role on the robustmodels (2) with the increase of 120575 the robust model has amore flat curve (3) for the same 120575 value compared with 119897

1-

and 1198972-normmodels 119897

infin-norm model is the most robust one

(4) as expected the robust approach provides a serial robustregression models that can reduce the worst-case residualerrors when the observed data contain noise

For further research robust polynomial (nonlinear)regressionmodels are interesting in their own right Althoughwe may always reduce them to the linear regression modelwith polynomially (or nonlinearly) transformed uncertaintydata set it is still worth studying whether the resultedregression models are solvable for quadratic regression withcoupled uncertainty sets

Acknowledgment

This work was supported by Geological Survey Project ofChina (nos 1212010881801 1212011120995)

References

[1] Z Griliches and V Ringstad ldquoErrors-in-the-variables bias innonlinear contextsrdquo Econometrica vol 38 no 2 pp 368ndash3701970

[2] W A Fuller Measurement Error Models John Wiley amp SonsNew York NY USA 1987

[3] T Erickson and T M Whited ldquoTwo-step GMM estimationof the errors-in-variables model using high-order momentsrdquoEconometric Theory vol 18 no 3 pp 776ndash799 2002

[4] P J Cornbleet and N Gochman ldquoIncorrect least-squaresregression coefficientsrdquo Clinical Chemistry vol 25 no 3 pp432ndash438 1979

[5] J W Gillard ldquoAn historical overview of linear regression witherrors in both variablesrdquo Tech Rep Cardiff University Schoolof Mathematics Cardiff UK 2006

[6] L El Ghaoui and H Lebret ldquoRobust solutions to least-squaresproblemswith uncertain datardquo SIAM Journal onMatrixAnalysisand Applications vol 18 no 4 pp 1035ndash1064 1997

[7] P K Shivaswamy C Bhattacharyya and A J Smola ldquoSecondorder cone programming approaches for handling missing anduncertain datardquo Journal ofMachine Learning Research vol 7 pp1283ndash1314 2006

[8] A Ben-Tal L El Ghaoui and A Nemirovski Robust Optimiza-tion Princeton University Press Princeton NJ USA 2009

[9] T B Trafalis and R C Gilbert ldquoRobust classification andregression using support vector machinesrdquo European Journal ofOperational Research vol 173 no 3 pp 893ndash909 2006

[10] H Xu C Caramanis and S Mannor ldquoRobustness and reg-ularization of support vector machinesrdquo Journal of MachineLearning Research vol 10 pp 1485ndash1510 2009

[11] T B Trafalis andRCGilbert ldquoRobust support vectormachinesfor classification and computational issuesrdquoOptimizationMeth-ods amp Software vol 22 no 1 pp 187ndash198 2007

[12] P J Huber Robust Statistics JohnWiley amp Sons New York NYUSA 1981

[13] J L Wu and P C Chang ldquoA trend-based segmentation methodand the support vector regression for financial time seriesforecastingrdquo Mathematical Problems in Engineering vol 2012Article ID 615152 20 pages 2012

[14] D X She and X H Yang ldquoA new adaptive local linearprediction method and its application in hydrological timeSeriesrdquoMathematical Problems in Engineering vol 2010 ArticleID 205438 15 pages 2010

[15] Z Liu ldquoChaotic time series analysisrdquoMathematical Problems inEngineering vol 2010 Article ID 720190 31 pages 2010

[16] M Li ldquoFractal time series a tutorial reviewrdquo MathematicalProblems in Engineering vol 2010 Article ID 157264 26 pages2010

[17] T Farooq A Guergachi and S Krishnan ldquoKnowledge-basedgreenrsquos kernel for support vector regressionrdquo MathematicalProblems in Engineering vol 2010 Article ID 378652 16 pages2010

[18] I Polik and T Terlaky ldquoA survey of the S-lemmardquo SIAMReviewvol 49 no 3 pp 371ndash418 2007

[19] K C Toh R H Tutunu and M J Todd ldquoOn the imple-mentation and usage of SDPT3Ca Matlab software package forsemidefinitequadratic-linear programmingrdquo version 4 0 2006httpecommonslibrarycornelleduhandle181315133

[20] J Kraft and A Kraft ldquoOn the relationship between energy andGNPrdquo Journal of Energy and Development vol 3 no 2 pp 401ndash403 1978

[21] O Ilhan ldquoA literature survey on energy growth nexusrdquo EnergyPolicy vol 38 pp 340ndash349 2010

[22] N Bowden and J E Payne ldquoThe causal relationship betweenUSenergy consumption and real output a disaggregated analysisrdquoJournal of Policy Modeling vol 31 no 2 pp 180ndash188 2009

[23] U Soytas and R Sari ldquoEnergy consumption economic growthand carbon emissions challenges faced by an EU candidatememberrdquo Ecological Economics vol 68 no 6 pp 1667ndash16752009

[24] B Cheng ldquoAn investigation of cointegration and causalitybetween energy consumption and economic growthrdquo Journalof Energy Development vol 21 no 1 pp 73ndash84 1995

[25] D I Stern ldquoEnergy and economic growth in the USA amultivariate approachrdquo Energy Economics vol 15 no 2 pp 137ndash150 1993

[26] S Boyd andLVandenbergheConvexOptimization CambridgeUniversity Press Cambridge UK 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 7: Research Article Robust Quadratic Regression and Its ...downloads.hindawi.com/journals/mpe/2013/210510.pdf · quadratic regression model with ball uncertainty set. is result is then

Mathematical Problems in Engineering 7

Table 1 LS-CQR and LS-RQR models with different 120598

Model 119876 120572 120573 Err 119879 (s)CQR 120598 = 000 minus4254 6721 minus6099 1688 0000RQR 120598 = 001 minus3899 6225 minus5433 1621 0500RQR 120598 = 002 minus3690 5938 minus5063 1663 0500RQR 120598 = 003 minus3423 5561 minus4564 1735 0500RQR 120598 = 004 minus2900 4755 minus3363 2029 0516RQR 120598 = 005 minus2243 3719 minus1817 2574 0484

points Specifically we establish the followingweighted robustquadratic regression model

min120572120573119902

max(119866119905 119864119862119905)isin119880

120598

119905

(

119879

sum

119905=1

(119908119905(EC119905minus 1199021198662

119905minus 2120572119866

119905minus 120573))119901

)

1119901

(26)

where the weight factor 119908119905

isin [0 1] represents the relativeimportance of the predicted residual error in the 119905th year Wecould set 119908

119905= 0 for the abnormal data point and set 119908

119905as an

increase function of 119905 to emphasize the importance of recentdata The uncertainty set is defined as

119880120598

119905= (119866

119905EC119905)

10038171003817100381710038171003817(119866119905minus 119866119905) (EC

119905minus EC119905)100381710038171003817100381710038172

le 120575119905 (27)

where 120575119905

= 120576radic1198662

119905+ EC2119905 Parameter 120576 controls the relative

amplitude of the fluctuation in observed dataThe weighted robust quadratic regression model can be

summarized as follows

(1) Solve the classical quadratic regression model usingthe nominal values (119866

119905EC119905)119879

119905=1

(2) Based on the quadratic regression remove the datawith the first 119896 largest residual errors and set weightsvalue 119908

119905

(3) Solve the equivalent semi-definite programmingproblem and return the final weighted robustquadratic regression model

4 Numerical Experiments

In this section we verify the effectiveness of the proposedrobust quadratic regression models on several data sets Theequivalent semi-definite programming problem is solved bythe SDPT3 solver [19] Numerical experiments are imple-mented usingMATLAB 770 and run on Intel(R) Core(TM)2CPU E7400

First we test the proposed robust least square quadraticregression (LS-RQR) model with Germany data from 1960 to2006As previously discussed after the preliminary quadraticregression analysis we will remove the data with the first 119896largest residual errors where 119896 = 3timesdata sizeThen for therest of data we establish the classical least square quadraticregression (LS-CQR) and LS-RQR models respectively

Table 1 lists the computation results for LS-CQR andLS-RQR with a series of 120598 values The listed Err value

History data

15

20

25

30

35

40

45

50

Per c

apita

l ene

rgy

(ton)

06 08 10 12 14 16 18 20Per capital GDP ($10000)

LS-RQR ( = 001)LS-CQR ( = 000)

LS-RQR ( = 002)

LS-RQR ( = 004)LS-RQR ( = 003)

LS-RQR ( = 005)120598

120598

120598

120598

120598

120598

Figure 3 LS-CQR and LS-RQR models on Germany data

0

2

4

6

8

10

12

14

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 001)

LS-RQR ( = 003)LS-RQR ( = 005)

120598

120598

120598

120598

Figure 4 Mean square error of LS-CQR and LS-RQRmodels when120598 varies

represents the mean square error from the nominal valueand 119879 represents the run time for solving the optimizationproblem It is seen that the resulted robust model exhibitssmaller absolute values of 119876 120572 and 120573 with the increase of120598 value that is the regression curve is more flat as the model

8 Mathematical Problems in Engineering

15

20

25

30

35

40

45

50

History data

06 08 10 12 14 16 18 20

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

L1-RQR ( = 002)L2-RQR ( = 002)LI-RQR ( = 002)

120598

120598120598

Figure 5 RQR models under 1198971- 1198972- and 119897

infin-norm criteria

2

3

4

5

6

7

8

9

10

00 05 10 15 20 25 30 35

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

5

10

15

20

25

30

Err

000 002 004 006 008 010

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 6 USA data from 1870 to 2006

parameters are less precise It is obvious that one drawback ofthe robustmodel is that themean square error will increase asuncertainty increases Figure 3 plots the regression curves fordifferent models and also supports our analysis of the effectof increasing data uncertainty on robust regression

To demonstrate the effectiveness of the robust modelswe test the worst-case performance of the resulted modelswhen 120598 varies from 0 to 01 Specifically for each 120598 valuewe randomly generate 500 groups of data from the defineduncertainty set 119880120598

119905and then calculate the maximal residual

error at each data point Figure 4 plots the worst-case errorof LS-CQR model and LS-RQR models with 120598 = 001 003and 005 It is seen that the error of LS-CQR model increases

rapidly and LS-RQR with 120598 = 005 has the most flat errorcurve Figure 4 also indicates that it is critical to accuratelyestimate the variability of the data and set proper value for120598 In our case we recommend LS-RQR with 120598 = 003 that isalmost always better than the traditional LS-CQR model

Next we test the proposed RQR models under 1198971(L1-

RQR) and 119897infin-(LI-RQR) norm criteria on the same data set

Figure 5 plots the corresponding regression curves for thesame uncertainty set 120598 = 002 For the same 120598 value LI-RQR model can be considered as the most robust one andL1-RQR andL2-RQRmodels are similar It is noticeable that itcontradicts with the traditional robust regression terms Forexample [26] refers to the 119897

1-norm regression as the robust

Mathematical Problems in Engineering 9

10 12 14 16 18 20 22 24

35

40

45

50

55

60Pe

r cap

ital e

nerg

y (to

n)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)120598

120598

(b)

Figure 7 Switzerland data from 1965 to 2006

06 08 10 12 14 16 18 20 22 24

25

30

35

40

45

50

55

60

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 8 Belgium data from 1960 to 2006

regression model in the sense that the corresponding modelis insensitive to the large residual errors(corresponding to theoutliers)However after removing the possible abnormal datapoints here we try tomake our regression analysis insensitiveto the worst-case residual errors at each data point

Finally we apply the proposed RQR model on more datasets including USA data from 1870 to 2006 Switzerland datafrom 1965 to 2006 and Belgium data from 1960 to 2006Figures 6 7 and 8 give the resulted regressionmodels and theworst-case residual errors for different 120598 values It is seen thatthe proposed RQRmodels still almost always outperform the

CQR model especially for large uncertainty sets Based onthe robust quadratic regression models these three countriesreach the highest per capital energy consumption points atper capital GDP value around 23 000 while the peak valuesvary from 57 to 85 Ton

5 Conclusions and Future Works

In this paper we studied themultivariate quadratic regressionmodel with imprecise statistic data Unlike the traditionalrobust statistic approaches that focus on the detection of

10 Mathematical Problems in Engineering

the outliers and the elimination of the effects we employedthe recently developed robust optimization framework anduncertainty set theory

In particular we first extended the existing robust lin-ear regression results to the robust least square quadraticregression model with the separable ball uncertainty setThe specific form of the uncertainty set allowed us to usethe well-known S-lemma and give the tractable equivalentsemidefinite programmingWe further generalized the resultto robust models under 119897

1- and 119897

infin-norm criteria with general

ellipsoid uncertainty sets Next the proposed robust modelswere applied to the energy-growth problem Under the clas-sical conservation hypothesis we employed the traditionalquadratic regression model to remove the abnormal dataand established a robust quadratic regression model forthe per capital GDP and per capital energy consumptionFinally the proposed models were tested on the historydata of Germany USA Switzerland and Belgium From thenumerical experiments we found that (1) the amplitude of theuncertainty perturbation 120575 plays a critical role on the robustmodels (2) with the increase of 120575 the robust model has amore flat curve (3) for the same 120575 value compared with 119897

1-

and 1198972-normmodels 119897

infin-norm model is the most robust one

(4) as expected the robust approach provides a serial robustregression models that can reduce the worst-case residualerrors when the observed data contain noise

For further research robust polynomial (nonlinear)regressionmodels are interesting in their own right Althoughwe may always reduce them to the linear regression modelwith polynomially (or nonlinearly) transformed uncertaintydata set it is still worth studying whether the resultedregression models are solvable for quadratic regression withcoupled uncertainty sets

Acknowledgment

This work was supported by Geological Survey Project ofChina (nos 1212010881801 1212011120995)

References

[1] Z Griliches and V Ringstad ldquoErrors-in-the-variables bias innonlinear contextsrdquo Econometrica vol 38 no 2 pp 368ndash3701970

[2] W A Fuller Measurement Error Models John Wiley amp SonsNew York NY USA 1987

[3] T Erickson and T M Whited ldquoTwo-step GMM estimationof the errors-in-variables model using high-order momentsrdquoEconometric Theory vol 18 no 3 pp 776ndash799 2002

[4] P J Cornbleet and N Gochman ldquoIncorrect least-squaresregression coefficientsrdquo Clinical Chemistry vol 25 no 3 pp432ndash438 1979

[5] J W Gillard ldquoAn historical overview of linear regression witherrors in both variablesrdquo Tech Rep Cardiff University Schoolof Mathematics Cardiff UK 2006

[6] L El Ghaoui and H Lebret ldquoRobust solutions to least-squaresproblemswith uncertain datardquo SIAM Journal onMatrixAnalysisand Applications vol 18 no 4 pp 1035ndash1064 1997

[7] P K Shivaswamy C Bhattacharyya and A J Smola ldquoSecondorder cone programming approaches for handling missing anduncertain datardquo Journal ofMachine Learning Research vol 7 pp1283ndash1314 2006

[8] A Ben-Tal L El Ghaoui and A Nemirovski Robust Optimiza-tion Princeton University Press Princeton NJ USA 2009

[9] T B Trafalis and R C Gilbert ldquoRobust classification andregression using support vector machinesrdquo European Journal ofOperational Research vol 173 no 3 pp 893ndash909 2006

[10] H Xu C Caramanis and S Mannor ldquoRobustness and reg-ularization of support vector machinesrdquo Journal of MachineLearning Research vol 10 pp 1485ndash1510 2009

[11] T B Trafalis andRCGilbert ldquoRobust support vectormachinesfor classification and computational issuesrdquoOptimizationMeth-ods amp Software vol 22 no 1 pp 187ndash198 2007

[12] P J Huber Robust Statistics JohnWiley amp Sons New York NYUSA 1981

[13] J L Wu and P C Chang ldquoA trend-based segmentation methodand the support vector regression for financial time seriesforecastingrdquo Mathematical Problems in Engineering vol 2012Article ID 615152 20 pages 2012

[14] D X She and X H Yang ldquoA new adaptive local linearprediction method and its application in hydrological timeSeriesrdquoMathematical Problems in Engineering vol 2010 ArticleID 205438 15 pages 2010

[15] Z Liu ldquoChaotic time series analysisrdquoMathematical Problems inEngineering vol 2010 Article ID 720190 31 pages 2010

[16] M Li ldquoFractal time series a tutorial reviewrdquo MathematicalProblems in Engineering vol 2010 Article ID 157264 26 pages2010

[17] T Farooq A Guergachi and S Krishnan ldquoKnowledge-basedgreenrsquos kernel for support vector regressionrdquo MathematicalProblems in Engineering vol 2010 Article ID 378652 16 pages2010

[18] I Polik and T Terlaky ldquoA survey of the S-lemmardquo SIAMReviewvol 49 no 3 pp 371ndash418 2007

[19] K C Toh R H Tutunu and M J Todd ldquoOn the imple-mentation and usage of SDPT3Ca Matlab software package forsemidefinitequadratic-linear programmingrdquo version 4 0 2006httpecommonslibrarycornelleduhandle181315133

[20] J Kraft and A Kraft ldquoOn the relationship between energy andGNPrdquo Journal of Energy and Development vol 3 no 2 pp 401ndash403 1978

[21] O Ilhan ldquoA literature survey on energy growth nexusrdquo EnergyPolicy vol 38 pp 340ndash349 2010

[22] N Bowden and J E Payne ldquoThe causal relationship betweenUSenergy consumption and real output a disaggregated analysisrdquoJournal of Policy Modeling vol 31 no 2 pp 180ndash188 2009

[23] U Soytas and R Sari ldquoEnergy consumption economic growthand carbon emissions challenges faced by an EU candidatememberrdquo Ecological Economics vol 68 no 6 pp 1667ndash16752009

[24] B Cheng ldquoAn investigation of cointegration and causalitybetween energy consumption and economic growthrdquo Journalof Energy Development vol 21 no 1 pp 73ndash84 1995

[25] D I Stern ldquoEnergy and economic growth in the USA amultivariate approachrdquo Energy Economics vol 15 no 2 pp 137ndash150 1993

[26] S Boyd andLVandenbergheConvexOptimization CambridgeUniversity Press Cambridge UK 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 8: Research Article Robust Quadratic Regression and Its ...downloads.hindawi.com/journals/mpe/2013/210510.pdf · quadratic regression model with ball uncertainty set. is result is then

8 Mathematical Problems in Engineering

15

20

25

30

35

40

45

50

History data

06 08 10 12 14 16 18 20

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

L1-RQR ( = 002)L2-RQR ( = 002)LI-RQR ( = 002)

120598

120598120598

Figure 5 RQR models under 1198971- 1198972- and 119897

infin-norm criteria

2

3

4

5

6

7

8

9

10

00 05 10 15 20 25 30 35

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

5

10

15

20

25

30

Err

000 002 004 006 008 010

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 6 USA data from 1870 to 2006

parameters are less precise It is obvious that one drawback ofthe robustmodel is that themean square error will increase asuncertainty increases Figure 3 plots the regression curves fordifferent models and also supports our analysis of the effectof increasing data uncertainty on robust regression

To demonstrate the effectiveness of the robust modelswe test the worst-case performance of the resulted modelswhen 120598 varies from 0 to 01 Specifically for each 120598 valuewe randomly generate 500 groups of data from the defineduncertainty set 119880120598

119905and then calculate the maximal residual

error at each data point Figure 4 plots the worst-case errorof LS-CQR model and LS-RQR models with 120598 = 001 003and 005 It is seen that the error of LS-CQR model increases

rapidly and LS-RQR with 120598 = 005 has the most flat errorcurve Figure 4 also indicates that it is critical to accuratelyestimate the variability of the data and set proper value for120598 In our case we recommend LS-RQR with 120598 = 003 that isalmost always better than the traditional LS-CQR model

Next we test the proposed RQR models under 1198971(L1-

RQR) and 119897infin-(LI-RQR) norm criteria on the same data set

Figure 5 plots the corresponding regression curves for thesame uncertainty set 120598 = 002 For the same 120598 value LI-RQR model can be considered as the most robust one andL1-RQR andL2-RQRmodels are similar It is noticeable that itcontradicts with the traditional robust regression terms Forexample [26] refers to the 119897

1-norm regression as the robust

Mathematical Problems in Engineering 9

10 12 14 16 18 20 22 24

35

40

45

50

55

60Pe

r cap

ital e

nerg

y (to

n)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)120598

120598

(b)

Figure 7 Switzerland data from 1965 to 2006

06 08 10 12 14 16 18 20 22 24

25

30

35

40

45

50

55

60

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 8 Belgium data from 1960 to 2006

regression model in the sense that the corresponding modelis insensitive to the large residual errors(corresponding to theoutliers)However after removing the possible abnormal datapoints here we try tomake our regression analysis insensitiveto the worst-case residual errors at each data point

Finally we apply the proposed RQR model on more datasets including USA data from 1870 to 2006 Switzerland datafrom 1965 to 2006 and Belgium data from 1960 to 2006Figures 6 7 and 8 give the resulted regressionmodels and theworst-case residual errors for different 120598 values It is seen thatthe proposed RQRmodels still almost always outperform the

CQR model especially for large uncertainty sets Based onthe robust quadratic regression models these three countriesreach the highest per capital energy consumption points atper capital GDP value around 23 000 while the peak valuesvary from 57 to 85 Ton

5 Conclusions and Future Works

In this paper we studied themultivariate quadratic regressionmodel with imprecise statistic data Unlike the traditionalrobust statistic approaches that focus on the detection of

10 Mathematical Problems in Engineering

the outliers and the elimination of the effects we employedthe recently developed robust optimization framework anduncertainty set theory

In particular we first extended the existing robust lin-ear regression results to the robust least square quadraticregression model with the separable ball uncertainty setThe specific form of the uncertainty set allowed us to usethe well-known S-lemma and give the tractable equivalentsemidefinite programmingWe further generalized the resultto robust models under 119897

1- and 119897

infin-norm criteria with general

ellipsoid uncertainty sets Next the proposed robust modelswere applied to the energy-growth problem Under the clas-sical conservation hypothesis we employed the traditionalquadratic regression model to remove the abnormal dataand established a robust quadratic regression model forthe per capital GDP and per capital energy consumptionFinally the proposed models were tested on the historydata of Germany USA Switzerland and Belgium From thenumerical experiments we found that (1) the amplitude of theuncertainty perturbation 120575 plays a critical role on the robustmodels (2) with the increase of 120575 the robust model has amore flat curve (3) for the same 120575 value compared with 119897

1-

and 1198972-normmodels 119897

infin-norm model is the most robust one

(4) as expected the robust approach provides a serial robustregression models that can reduce the worst-case residualerrors when the observed data contain noise

For further research robust polynomial (nonlinear)regressionmodels are interesting in their own right Althoughwe may always reduce them to the linear regression modelwith polynomially (or nonlinearly) transformed uncertaintydata set it is still worth studying whether the resultedregression models are solvable for quadratic regression withcoupled uncertainty sets

Acknowledgment

This work was supported by Geological Survey Project ofChina (nos 1212010881801 1212011120995)

References

[1] Z Griliches and V Ringstad ldquoErrors-in-the-variables bias innonlinear contextsrdquo Econometrica vol 38 no 2 pp 368ndash3701970

[2] W A Fuller Measurement Error Models John Wiley amp SonsNew York NY USA 1987

[3] T Erickson and T M Whited ldquoTwo-step GMM estimationof the errors-in-variables model using high-order momentsrdquoEconometric Theory vol 18 no 3 pp 776ndash799 2002

[4] P J Cornbleet and N Gochman ldquoIncorrect least-squaresregression coefficientsrdquo Clinical Chemistry vol 25 no 3 pp432ndash438 1979

[5] J W Gillard ldquoAn historical overview of linear regression witherrors in both variablesrdquo Tech Rep Cardiff University Schoolof Mathematics Cardiff UK 2006

[6] L El Ghaoui and H Lebret ldquoRobust solutions to least-squaresproblemswith uncertain datardquo SIAM Journal onMatrixAnalysisand Applications vol 18 no 4 pp 1035ndash1064 1997

[7] P K Shivaswamy C Bhattacharyya and A J Smola ldquoSecondorder cone programming approaches for handling missing anduncertain datardquo Journal ofMachine Learning Research vol 7 pp1283ndash1314 2006

[8] A Ben-Tal L El Ghaoui and A Nemirovski Robust Optimiza-tion Princeton University Press Princeton NJ USA 2009

[9] T B Trafalis and R C Gilbert ldquoRobust classification andregression using support vector machinesrdquo European Journal ofOperational Research vol 173 no 3 pp 893ndash909 2006

[10] H Xu C Caramanis and S Mannor ldquoRobustness and reg-ularization of support vector machinesrdquo Journal of MachineLearning Research vol 10 pp 1485ndash1510 2009

[11] T B Trafalis andRCGilbert ldquoRobust support vectormachinesfor classification and computational issuesrdquoOptimizationMeth-ods amp Software vol 22 no 1 pp 187ndash198 2007

[12] P J Huber Robust Statistics JohnWiley amp Sons New York NYUSA 1981

[13] J L Wu and P C Chang ldquoA trend-based segmentation methodand the support vector regression for financial time seriesforecastingrdquo Mathematical Problems in Engineering vol 2012Article ID 615152 20 pages 2012

[14] D X She and X H Yang ldquoA new adaptive local linearprediction method and its application in hydrological timeSeriesrdquoMathematical Problems in Engineering vol 2010 ArticleID 205438 15 pages 2010

[15] Z Liu ldquoChaotic time series analysisrdquoMathematical Problems inEngineering vol 2010 Article ID 720190 31 pages 2010

[16] M Li ldquoFractal time series a tutorial reviewrdquo MathematicalProblems in Engineering vol 2010 Article ID 157264 26 pages2010

[17] T Farooq A Guergachi and S Krishnan ldquoKnowledge-basedgreenrsquos kernel for support vector regressionrdquo MathematicalProblems in Engineering vol 2010 Article ID 378652 16 pages2010

[18] I Polik and T Terlaky ldquoA survey of the S-lemmardquo SIAMReviewvol 49 no 3 pp 371ndash418 2007

[19] K C Toh R H Tutunu and M J Todd ldquoOn the imple-mentation and usage of SDPT3Ca Matlab software package forsemidefinitequadratic-linear programmingrdquo version 4 0 2006httpecommonslibrarycornelleduhandle181315133

[20] J Kraft and A Kraft ldquoOn the relationship between energy andGNPrdquo Journal of Energy and Development vol 3 no 2 pp 401ndash403 1978

[21] O Ilhan ldquoA literature survey on energy growth nexusrdquo EnergyPolicy vol 38 pp 340ndash349 2010

[22] N Bowden and J E Payne ldquoThe causal relationship betweenUSenergy consumption and real output a disaggregated analysisrdquoJournal of Policy Modeling vol 31 no 2 pp 180ndash188 2009

[23] U Soytas and R Sari ldquoEnergy consumption economic growthand carbon emissions challenges faced by an EU candidatememberrdquo Ecological Economics vol 68 no 6 pp 1667ndash16752009

[24] B Cheng ldquoAn investigation of cointegration and causalitybetween energy consumption and economic growthrdquo Journalof Energy Development vol 21 no 1 pp 73ndash84 1995

[25] D I Stern ldquoEnergy and economic growth in the USA amultivariate approachrdquo Energy Economics vol 15 no 2 pp 137ndash150 1993

[26] S Boyd andLVandenbergheConvexOptimization CambridgeUniversity Press Cambridge UK 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 9: Research Article Robust Quadratic Regression and Its ...downloads.hindawi.com/journals/mpe/2013/210510.pdf · quadratic regression model with ball uncertainty set. is result is then

Mathematical Problems in Engineering 9

10 12 14 16 18 20 22 24

35

40

45

50

55

60Pe

r cap

ital e

nerg

y (to

n)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)120598

120598

(b)

Figure 7 Switzerland data from 1965 to 2006

06 08 10 12 14 16 18 20 22 24

25

30

35

40

45

50

55

60

Per c

apita

l ene

rgy

(ton)

Per capital GDP ($10000)

History dataCQRRQR ( = 003)120598

(a)

0

2

4

6

8

10

12

000 002 004 006 008 010

Err

LS-CQRLS-RQR ( = 003)

120598

120598

(b)

Figure 8 Belgium data from 1960 to 2006

regression model in the sense that the corresponding modelis insensitive to the large residual errors(corresponding to theoutliers)However after removing the possible abnormal datapoints here we try tomake our regression analysis insensitiveto the worst-case residual errors at each data point

Finally we apply the proposed RQR model on more datasets including USA data from 1870 to 2006 Switzerland datafrom 1965 to 2006 and Belgium data from 1960 to 2006Figures 6 7 and 8 give the resulted regressionmodels and theworst-case residual errors for different 120598 values It is seen thatthe proposed RQRmodels still almost always outperform the

CQR model especially for large uncertainty sets Based onthe robust quadratic regression models these three countriesreach the highest per capital energy consumption points atper capital GDP value around 23 000 while the peak valuesvary from 57 to 85 Ton

5 Conclusions and Future Works

In this paper we studied themultivariate quadratic regressionmodel with imprecise statistic data Unlike the traditionalrobust statistic approaches that focus on the detection of

10 Mathematical Problems in Engineering

the outliers and the elimination of the effects we employedthe recently developed robust optimization framework anduncertainty set theory

In particular we first extended the existing robust lin-ear regression results to the robust least square quadraticregression model with the separable ball uncertainty setThe specific form of the uncertainty set allowed us to usethe well-known S-lemma and give the tractable equivalentsemidefinite programmingWe further generalized the resultto robust models under 119897

1- and 119897

infin-norm criteria with general

ellipsoid uncertainty sets Next the proposed robust modelswere applied to the energy-growth problem Under the clas-sical conservation hypothesis we employed the traditionalquadratic regression model to remove the abnormal dataand established a robust quadratic regression model forthe per capital GDP and per capital energy consumptionFinally the proposed models were tested on the historydata of Germany USA Switzerland and Belgium From thenumerical experiments we found that (1) the amplitude of theuncertainty perturbation 120575 plays a critical role on the robustmodels (2) with the increase of 120575 the robust model has amore flat curve (3) for the same 120575 value compared with 119897

1-

and 1198972-normmodels 119897

infin-norm model is the most robust one

(4) as expected the robust approach provides a serial robustregression models that can reduce the worst-case residualerrors when the observed data contain noise

For further research robust polynomial (nonlinear)regressionmodels are interesting in their own right Althoughwe may always reduce them to the linear regression modelwith polynomially (or nonlinearly) transformed uncertaintydata set it is still worth studying whether the resultedregression models are solvable for quadratic regression withcoupled uncertainty sets

Acknowledgment

This work was supported by Geological Survey Project ofChina (nos 1212010881801 1212011120995)

References

[1] Z Griliches and V Ringstad ldquoErrors-in-the-variables bias innonlinear contextsrdquo Econometrica vol 38 no 2 pp 368ndash3701970

[2] W A Fuller Measurement Error Models John Wiley amp SonsNew York NY USA 1987

[3] T Erickson and T M Whited ldquoTwo-step GMM estimationof the errors-in-variables model using high-order momentsrdquoEconometric Theory vol 18 no 3 pp 776ndash799 2002

[4] P J Cornbleet and N Gochman ldquoIncorrect least-squaresregression coefficientsrdquo Clinical Chemistry vol 25 no 3 pp432ndash438 1979

[5] J W Gillard ldquoAn historical overview of linear regression witherrors in both variablesrdquo Tech Rep Cardiff University Schoolof Mathematics Cardiff UK 2006

[6] L El Ghaoui and H Lebret ldquoRobust solutions to least-squaresproblemswith uncertain datardquo SIAM Journal onMatrixAnalysisand Applications vol 18 no 4 pp 1035ndash1064 1997

[7] P K Shivaswamy C Bhattacharyya and A J Smola ldquoSecondorder cone programming approaches for handling missing anduncertain datardquo Journal ofMachine Learning Research vol 7 pp1283ndash1314 2006

[8] A Ben-Tal L El Ghaoui and A Nemirovski Robust Optimiza-tion Princeton University Press Princeton NJ USA 2009

[9] T B Trafalis and R C Gilbert ldquoRobust classification andregression using support vector machinesrdquo European Journal ofOperational Research vol 173 no 3 pp 893ndash909 2006

[10] H Xu C Caramanis and S Mannor ldquoRobustness and reg-ularization of support vector machinesrdquo Journal of MachineLearning Research vol 10 pp 1485ndash1510 2009

[11] T B Trafalis andRCGilbert ldquoRobust support vectormachinesfor classification and computational issuesrdquoOptimizationMeth-ods amp Software vol 22 no 1 pp 187ndash198 2007

[12] P J Huber Robust Statistics JohnWiley amp Sons New York NYUSA 1981

[13] J L Wu and P C Chang ldquoA trend-based segmentation methodand the support vector regression for financial time seriesforecastingrdquo Mathematical Problems in Engineering vol 2012Article ID 615152 20 pages 2012

[14] D X She and X H Yang ldquoA new adaptive local linearprediction method and its application in hydrological timeSeriesrdquoMathematical Problems in Engineering vol 2010 ArticleID 205438 15 pages 2010

[15] Z Liu ldquoChaotic time series analysisrdquoMathematical Problems inEngineering vol 2010 Article ID 720190 31 pages 2010

[16] M Li ldquoFractal time series a tutorial reviewrdquo MathematicalProblems in Engineering vol 2010 Article ID 157264 26 pages2010

[17] T Farooq A Guergachi and S Krishnan ldquoKnowledge-basedgreenrsquos kernel for support vector regressionrdquo MathematicalProblems in Engineering vol 2010 Article ID 378652 16 pages2010

[18] I Polik and T Terlaky ldquoA survey of the S-lemmardquo SIAMReviewvol 49 no 3 pp 371ndash418 2007

[19] K C Toh R H Tutunu and M J Todd ldquoOn the imple-mentation and usage of SDPT3Ca Matlab software package forsemidefinitequadratic-linear programmingrdquo version 4 0 2006httpecommonslibrarycornelleduhandle181315133

[20] J Kraft and A Kraft ldquoOn the relationship between energy andGNPrdquo Journal of Energy and Development vol 3 no 2 pp 401ndash403 1978

[21] O Ilhan ldquoA literature survey on energy growth nexusrdquo EnergyPolicy vol 38 pp 340ndash349 2010

[22] N Bowden and J E Payne ldquoThe causal relationship betweenUSenergy consumption and real output a disaggregated analysisrdquoJournal of Policy Modeling vol 31 no 2 pp 180ndash188 2009

[23] U Soytas and R Sari ldquoEnergy consumption economic growthand carbon emissions challenges faced by an EU candidatememberrdquo Ecological Economics vol 68 no 6 pp 1667ndash16752009

[24] B Cheng ldquoAn investigation of cointegration and causalitybetween energy consumption and economic growthrdquo Journalof Energy Development vol 21 no 1 pp 73ndash84 1995

[25] D I Stern ldquoEnergy and economic growth in the USA amultivariate approachrdquo Energy Economics vol 15 no 2 pp 137ndash150 1993

[26] S Boyd andLVandenbergheConvexOptimization CambridgeUniversity Press Cambridge UK 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 10: Research Article Robust Quadratic Regression and Its ...downloads.hindawi.com/journals/mpe/2013/210510.pdf · quadratic regression model with ball uncertainty set. is result is then

10 Mathematical Problems in Engineering

the outliers and the elimination of the effects we employedthe recently developed robust optimization framework anduncertainty set theory

In particular we first extended the existing robust lin-ear regression results to the robust least square quadraticregression model with the separable ball uncertainty setThe specific form of the uncertainty set allowed us to usethe well-known S-lemma and give the tractable equivalentsemidefinite programmingWe further generalized the resultto robust models under 119897

1- and 119897

infin-norm criteria with general

ellipsoid uncertainty sets Next the proposed robust modelswere applied to the energy-growth problem Under the clas-sical conservation hypothesis we employed the traditionalquadratic regression model to remove the abnormal dataand established a robust quadratic regression model forthe per capital GDP and per capital energy consumptionFinally the proposed models were tested on the historydata of Germany USA Switzerland and Belgium From thenumerical experiments we found that (1) the amplitude of theuncertainty perturbation 120575 plays a critical role on the robustmodels (2) with the increase of 120575 the robust model has amore flat curve (3) for the same 120575 value compared with 119897

1-

and 1198972-normmodels 119897

infin-norm model is the most robust one

(4) as expected the robust approach provides a serial robustregression models that can reduce the worst-case residualerrors when the observed data contain noise

For further research robust polynomial (nonlinear)regressionmodels are interesting in their own right Althoughwe may always reduce them to the linear regression modelwith polynomially (or nonlinearly) transformed uncertaintydata set it is still worth studying whether the resultedregression models are solvable for quadratic regression withcoupled uncertainty sets

Acknowledgment

This work was supported by Geological Survey Project ofChina (nos 1212010881801 1212011120995)

References

[1] Z Griliches and V Ringstad ldquoErrors-in-the-variables bias innonlinear contextsrdquo Econometrica vol 38 no 2 pp 368ndash3701970

[2] W A Fuller Measurement Error Models John Wiley amp SonsNew York NY USA 1987

[3] T Erickson and T M Whited ldquoTwo-step GMM estimationof the errors-in-variables model using high-order momentsrdquoEconometric Theory vol 18 no 3 pp 776ndash799 2002

[4] P J Cornbleet and N Gochman ldquoIncorrect least-squaresregression coefficientsrdquo Clinical Chemistry vol 25 no 3 pp432ndash438 1979

[5] J W Gillard ldquoAn historical overview of linear regression witherrors in both variablesrdquo Tech Rep Cardiff University Schoolof Mathematics Cardiff UK 2006

[6] L El Ghaoui and H Lebret ldquoRobust solutions to least-squaresproblemswith uncertain datardquo SIAM Journal onMatrixAnalysisand Applications vol 18 no 4 pp 1035ndash1064 1997

[7] P K Shivaswamy C Bhattacharyya and A J Smola ldquoSecondorder cone programming approaches for handling missing anduncertain datardquo Journal ofMachine Learning Research vol 7 pp1283ndash1314 2006

[8] A Ben-Tal L El Ghaoui and A Nemirovski Robust Optimiza-tion Princeton University Press Princeton NJ USA 2009

[9] T B Trafalis and R C Gilbert ldquoRobust classification andregression using support vector machinesrdquo European Journal ofOperational Research vol 173 no 3 pp 893ndash909 2006

[10] H Xu C Caramanis and S Mannor ldquoRobustness and reg-ularization of support vector machinesrdquo Journal of MachineLearning Research vol 10 pp 1485ndash1510 2009

[11] T B Trafalis andRCGilbert ldquoRobust support vectormachinesfor classification and computational issuesrdquoOptimizationMeth-ods amp Software vol 22 no 1 pp 187ndash198 2007

[12] P J Huber Robust Statistics JohnWiley amp Sons New York NYUSA 1981

[13] J L Wu and P C Chang ldquoA trend-based segmentation methodand the support vector regression for financial time seriesforecastingrdquo Mathematical Problems in Engineering vol 2012Article ID 615152 20 pages 2012

[14] D X She and X H Yang ldquoA new adaptive local linearprediction method and its application in hydrological timeSeriesrdquoMathematical Problems in Engineering vol 2010 ArticleID 205438 15 pages 2010

[15] Z Liu ldquoChaotic time series analysisrdquoMathematical Problems inEngineering vol 2010 Article ID 720190 31 pages 2010

[16] M Li ldquoFractal time series a tutorial reviewrdquo MathematicalProblems in Engineering vol 2010 Article ID 157264 26 pages2010

[17] T Farooq A Guergachi and S Krishnan ldquoKnowledge-basedgreenrsquos kernel for support vector regressionrdquo MathematicalProblems in Engineering vol 2010 Article ID 378652 16 pages2010

[18] I Polik and T Terlaky ldquoA survey of the S-lemmardquo SIAMReviewvol 49 no 3 pp 371ndash418 2007

[19] K C Toh R H Tutunu and M J Todd ldquoOn the imple-mentation and usage of SDPT3Ca Matlab software package forsemidefinitequadratic-linear programmingrdquo version 4 0 2006httpecommonslibrarycornelleduhandle181315133

[20] J Kraft and A Kraft ldquoOn the relationship between energy andGNPrdquo Journal of Energy and Development vol 3 no 2 pp 401ndash403 1978

[21] O Ilhan ldquoA literature survey on energy growth nexusrdquo EnergyPolicy vol 38 pp 340ndash349 2010

[22] N Bowden and J E Payne ldquoThe causal relationship betweenUSenergy consumption and real output a disaggregated analysisrdquoJournal of Policy Modeling vol 31 no 2 pp 180ndash188 2009

[23] U Soytas and R Sari ldquoEnergy consumption economic growthand carbon emissions challenges faced by an EU candidatememberrdquo Ecological Economics vol 68 no 6 pp 1667ndash16752009

[24] B Cheng ldquoAn investigation of cointegration and causalitybetween energy consumption and economic growthrdquo Journalof Energy Development vol 21 no 1 pp 73ndash84 1995

[25] D I Stern ldquoEnergy and economic growth in the USA amultivariate approachrdquo Energy Economics vol 15 no 2 pp 137ndash150 1993

[26] S Boyd andLVandenbergheConvexOptimization CambridgeUniversity Press Cambridge UK 2004

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 11: Research Article Robust Quadratic Regression and Its ...downloads.hindawi.com/journals/mpe/2013/210510.pdf · quadratic regression model with ball uncertainty set. is result is then

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of