Multi Linear Regression Handout 2x1
8/18/2019 Multi Linear Regression Handout 2x1
Automation Lab, IIT Bombay
CL 202: Introduction to Data Analysis
Linear and Nonlinear Regression
Sachin C. Patawardhan and Mani Bhushan
Department of Chemical Engineering
I.I.T. Bombay
31-Mar-16 Regression 1
Outline
Mathematical Models in Chemical Engineering
Linear Regression Problem
Ordinary and Weighted Least Squares formulations through an algebraic viewpoint and geometric interpretations
Ordinary and Weighted Least Squares formulations through a probabilistic viewpoint
Ordinary Least Squares and Minimum Variance Estimation
Ordinary Least Squares and Maximum Likelihood Estimation
Confidence intervals for parameter estimates and hypothesis testing
Nonlinear regression problem: nonlinear-in-parameter models and maximum likelihood parameter estimation
Examples of linear and nonlinear regression
Appendix: Ordinary Least Squares and Cramer-Rao Bound
Mathematical Models
Mathematical Model: mathematical description of a real
physical process
Used in all fields: biology, physiology, engineering, chemistry,
biochemistry, physics, and economics
Deterministic models: each variable and parameter can be
assigned a definite fixed number or a series of fixed
numbers, for any given set of conditions.
Stochastic models: variables or parameters used to
describe the input-output relationships and the structure of
the elements (and the constraints) are not precisely known
Elements of a Model
Independent inputs (x)
Output (y) (dependent variable)
Parameters (θ)
Transformation operator (T): algebraic or differential

Mathematical Model: y = T(x₁, ..., x_n; θ₁, ..., θ_m)
Mathematical Models
Models are used for
Behavior Prediction/Analysis: Understand the influence of the
independent inputs to a system on the observed system output
System/process/material design
Catalyst design, membrane design
Equipment Design: sizing of processing equipment
Flow-sheeting: deciding flow of material and energy in a
chemical plant
System / process operation: monitoring and control, safety and
hazard analysis, abnormal behavior diagnosis
Models in Chemical Engineering
Models popularly used in chemical engineering
Transport phenomena based models: continuum equations
describing the conservation of mass, momentum, and energy
Population balance models: Residence time distributions
(RTD) and other age distributions
Empirical models based on data fitting: typical examples are polynomials used to fit empirical data, thermodynamic correlations, correlations based on dimensionless groups used in heat, mass, and momentum transfer, and transfer function models used in process control
Empirical Modeling
The exact expression relating the dependent and the independent variables may not be known.
The Weierstrass theorem states that any continuous function on a closed interval can be approximated by a polynomial to an arbitrary degree of accuracy.
Invoking the Weierstrass theorem, the relationship between the dependent and independent variables is approximated as a polynomial.
The order of the polynomial used typically depends on the range of values over which the approximation is constructed.
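As a minimal numerical illustration of the Weierstrass idea (not from the handout; exp(x) on [0, 1] is an assumed example function), a least-squares polynomial fit of increasing degree drives down the worst-case approximation error on a grid:

```python
import numpy as np

# Approximate a continuous function, exp(x) on [0, 1], by least-squares
# polynomials of increasing degree; the maximum grid error shrinks as the
# degree grows, as the Weierstrass theorem suggests it can.
x = np.linspace(0.0, 1.0, 200)
f = np.exp(x)

max_errors = []
for degree in (1, 2, 4):
    coeffs = np.polyfit(x, f, degree)      # least-squares polynomial fit
    approx = np.polyval(coeffs, x)
    max_errors.append(np.max(np.abs(f - approx)))

print(max_errors)  # decreasing with degree
```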
Empirical Modeling Examples
Temperature dependence of resistance:
R = a + bT for T ∈ [T₁, T₂]
R = α + βT + γT² for T ∈ [T₃, T₄]

Temperature dependence of C_p:
C_p = a + bT for T ∈ [T₁, T₂]
C_p = α + βT + γT² for T ∈ [T₃, T₄]

Boiling point of hydrocarbons in a homologous series as a function of the number of carbon atoms n:
T = a + bn + cn² (or, with a cubic term, T = α + βn + γn² + δn³)
Empirical Modeling Examples
Temperature and pressure dependence of reaction yield:
Y = α + βT + γP for T ∈ [T₁, T₂], P ∈ [P₁, P₂]
Y = a + bT + cP + dT² + eP² + fPT for T ∈ [T₃, T₄], P ∈ [P₃, P₄]

Dimensionless-group based models in heat transfer: Nu = α Reᵖ Pr^q

Reaction rate equations (Arrhenius): rate = k₀ e^(−E/RT) C_A

Simplified VLE model: y = αx / (1 + (α − 1)x)
x: liquid mole fraction, y: vapor mole fraction, α: relative volatility
Linear in Parameter Models
To begin with, we consider models that can be represented in the following abstract form:
y = θ₁ f₁(x) + θ₂ f₂(x) + ... + θ_p f_p(x), x = [x₁ x₂ ... x_m]ᵀ

Defining new variables z_i = f_i(x), we can write
y = θ₁ z₁ + θ₂ z₂ + ... + θ_p z_p

Let v denote a combined error arising from errors in modeling and errors in the measurement of y; then
y = θ₁ z₁ + θ₂ z₂ + ... + θ_p z_p + v

Sources of error:
Measurement errors in the dependent variable (y)
Modeling or approximation errors
Linear Regression Problem
For the class of models considered till now, the dependent variable is a linear function of the model parameters.
Linear function definition: g(α x₁ + β x₂) = α g(x₁) + β g(x₂)

Defining z = [z₁ z₂ ... z_p]ᵀ and θ = [θ₁ θ₂ ... θ_p]ᵀ, the model is y = zᵀθ + v.

Given data sets S_Z = {z⁽¹⁾, z⁽²⁾, ..., z⁽ⁿ⁾} and y = {y₁, y₂, ..., y_n} generated from n independent experiments, and the model equations
y_i = (z⁽ⁱ⁾)ᵀθ + v_i, i = 1, 2, ..., n
estimate θ such that some scalar objective of (v₁, v₂, ..., v_n) is minimized.
Linear Regression Problem
Choice of Objective Function
2-norm: ||v||₂ = (Σᵢ vᵢ²)^(1/2), or a weighted version Σᵢ wᵢ vᵢ² with wᵢ > 0 for all i
1-norm: ||v||₁ = Σᵢ |vᵢ|
∞-norm: ||v||_∞ = maxᵢ |vᵢ|

In practice, the 2-norm based formulation is preferred over the other two choices because of
(a) amenability to analytical treatment
(b) ease of geometric interpretations
(c) ease of interpretation from the viewpoint of probability and statistics
Model Parameter Estimation
Consider estimation of the parameters θ = [a b]ᵀ of a simple linear model of the form
y = a + bx + v (z₁ = 1, z₂ = x, i.e. f₁(x) = 1, f₂(x) = x)
from data sets S_Z = {x(1), x(2), ..., x(n)} and y = {y₁, y₂, ..., y_n}:
y₁ = a + b x₁ + v₁ ... (1)
y₂ = a + b x₂ + v₂ ... (2)
...
yᵢ = a + b xᵢ + vᵢ ... (i)
...
y_n = a + b x_n + v_n ... (n)

In matrix notation, Y = Aθ + V, with
Y = [y₁ y₂ ... y_n]ᵀ, A = [1 x₁; 1 x₂; ...; 1 x_n], θ = [a b]ᵀ, V = [v₁ v₂ ... v_n]ᵀ
Model Parameter Estimation
Number of unknown variables = 2 (parameters) + n (errors)
Number of equations = n
Number of equations < number of unknowns
The system of linear equations has an infinite number of solutions.
To estimate the model parameters, we resort to optimization.
The necessary conditions for optimality provide 2 additional constraints so that the combined system of equations has a unique solution.
Model Parameter Estimation
Defining a scalar function J(v₁, ..., v_n), where
vᵢ = yᵢ − a z₁ᵢ − b z₂ᵢ for i = 1, ..., n
the necessary conditions for optimality are
∂J/∂a = 0 and ∂J/∂b = 0

Most commonly used scalar measures:
Ordinary least squares: J = Σᵢ vᵢ²
Weighted least squares: J = Σᵢ wᵢ vᵢ² (wᵢ > 0 for all i)

A quadratic objective function (a) leads to an analytical solution, (b) has a nice geometric interpretation, and (c) facilitates interpretation and analysis through statistics.
Ordinary Least Squares
Using vector-matrix notation, J = VᵀV = Σᵢ vᵢ², with V = Y − Aθ.

Rules of differentiation of a scalar function with respect to a vector:
∂(xᵀBy)/∂x = By, ∂(yᵀBx)/∂x = Bᵀy, ∂(xᵀBx)/∂x = 2Bx when B is symmetric

The necessary condition for optimality becomes
∂J/∂θ = −2Aᵀ(Y − Aθ) = 0, i.e. (AᵀA)θ = AᵀY
θ̂_OLS = (AᵀA)⁻¹AᵀY, Ŷ = Aθ̂_OLS
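The closed-form solution above can be sketched numerically. This is a minimal example on synthetic data (the true values a = 1, b = 2 and the noise level are illustrative assumptions, not the handout's data); `np.linalg.lstsq` is the numerically preferred route, and it agrees with the normal-equation formula:

```python
import numpy as np

# OLS for the simple linear model y = a + b*x + v on synthetic data.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 1.0 + 2.0 * x + 0.1 * rng.standard_normal(x.size)

A = np.column_stack([np.ones_like(x), x])            # design matrix rows [1, x_i]
theta = np.linalg.solve(A.T @ A, A.T @ y)            # (A^T A)^-1 A^T Y
theta_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)  # numerically robust solver

print(theta)  # close to [1.0, 2.0]
```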
Geometric Interpretations
Assumption: the true behavior is Y = Aθ* + V, where θ* are the true parameters.

Defining Ŷ = Aθ̂, the estimated model residuals are
V̂ = Y − Ŷ = Y − Aθ̂

Ŷ = Aθ̂ = [1 x₁; 1 x₂; ...; 1 x_n][â; b̂]
Vector Ŷ lies in the column space of matrix A.
Geometric Interpretations
Ŷ = A(AᵀA)⁻¹AᵀY = HY: a projection of vector Y on the column space of A.
H = A(AᵀA)⁻¹Aᵀ is known as the hat (or projection) matrix.
Note: H is idempotent, i.e. H² = H.

The necessary condition for optimality implies AᵀV̂ = 0, i.e. vector V̂ is perpendicular to the column space of A.
V̂ = Y − Ŷ = (I − H)Y

Vector Y is split into two orthogonal components:
Ŷ = HY, lying in the column space of A
V̂ = (I − H)Y, orthogonal to the column space of A
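The two properties above (idempotency of H and orthogonality of the residual to the column space of A) can be checked directly; the small design matrix here is an illustrative assumption:

```python
import numpy as np

# Hat matrix H = A (A^T A)^-1 A^T and the orthogonal split Y = HY + (I - H)Y.
rng = np.random.default_rng(1)
A = np.column_stack([np.ones(8), np.arange(8.0)])
Y = rng.standard_normal(8)

H = A @ np.linalg.inv(A.T @ A) @ A.T
Y_hat = H @ Y                    # component in the column space of A
V_hat = (np.eye(8) - H) @ Y      # residual component

print(np.allclose(H @ H, H))         # H is idempotent
print(np.allclose(A.T @ V_hat, 0.0)) # residual orthogonal to col(A)
```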
Geometric Interpretations

Ethanol-Water Example
Experimental data: density and weight percent of ethanol in an ethanol-water mixture
Ref.: Ogunnaike, B. A., Random Phenomena, CRC Press, London, 2010
Ethanol-Water Example
Ref.: Ogunnaike, B. A., Random Phenomena, CRC Press, London, 2010
Quadratic Polynomial Model
Consider the model for temperature and pressure dependence of reaction yield:
Y = a + bT + cP + dT² + eP² + fPT + v

Data available: (Y₁, T₁, P₁), (Y₂, T₂, P₂), ..., (Y_n, T_n, P_n)
Corresponding modeling equations:
Yᵢ = a + bTᵢ + cPᵢ + dTᵢ² + ePᵢ² + fTᵢPᵢ + vᵢ, i = 1, 2, ..., n

Defining z⁽ⁱ⁾ = [1 Tᵢ Pᵢ Tᵢ² Pᵢ² TᵢPᵢ]ᵀ and θ = [a b c d e f]ᵀ,
Yᵢ = (z⁽ⁱ⁾)ᵀθ + vᵢ for i = 1, 2, ..., n

In matrix notation, Y = Aθ + V, where Y is n×1, A is the n×6 matrix with rows [1 Tᵢ Pᵢ Tᵢ² Pᵢ² TᵢPᵢ], θ is 6×1, and V is n×1.
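A sketch of how the n×6 design matrix above is assembled (the T and P values here are illustrative, not the handout's data):

```python
import numpy as np

# Design matrix for Y = a + b*T + c*P + d*T^2 + e*P^2 + f*T*P + v,
# one row [1, T_i, P_i, T_i^2, P_i^2, T_i*P_i] per experiment.
T = np.array([100.0, 110.0, 120.0, 130.0, 140.0])
P = np.array([1.0, 1.5, 2.0, 2.5, 3.0])

A = np.column_stack([np.ones_like(T), T, P, T**2, P**2, T * P])
print(A.shape)  # (5, 6)
```

Note that although the model is quadratic in T and P, it is still linear in the parameters, so the OLS machinery applies unchanged.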
Generalization of OLS
Thus, consider estimation of the parameter vector θ of a general multi-linear model of the form
y = θ₁z₁ + θ₂z₂ + ... + θ_p z_p + v
from data sets S_Z = {z⁽¹⁾, z⁽²⁾, ..., z⁽ⁿ⁾} and y = {y₁, y₂, ..., y_n}.

In matrix notation, Y = Aθ + V, where Y = [y₁ ... y_n]ᵀ is n×1, A is the n×p matrix with rows [z₁⁽ⁱ⁾ z₂⁽ⁱ⁾ ... z_p⁽ⁱ⁾], θ = [θ₁ ... θ_p]ᵀ is p×1, and V = [v₁ ... v_n]ᵀ is n×1.
Weighted Least Square
Defining a weighting matrix W = diag(w₁, w₂, ..., w_n) with wᵢ > 0 for all i, let
J = Σᵢ wᵢ vᵢ² = VᵀWV

The multilinear regression problem can be formulated as
Min_θ J = VᵀWV subject to V = Y − Aθ

Using the necessary condition for optimality,
∂J/∂θ = −2AᵀW(Y − Aθ) = 0
θ̂_WLS = (AᵀWA)⁻¹AᵀWY, Ŷ = Aθ̂_WLS

Selecting W = I_{n×n} reduces WLS to OLS.
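A minimal WLS sketch on synthetic heteroscedastic data (the true parameters and the noise model are illustrative assumptions); weighting each point by the inverse of its noise variance is a common choice:

```python
import numpy as np

# Weighted least squares: theta_WLS = (A^T W A)^-1 A^T W Y.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 5.0, 40)
sigma = 0.05 + 0.1 * x                       # noise level grows with x
y = 0.5 + 1.5 * x + sigma * rng.standard_normal(x.size)

A = np.column_stack([np.ones_like(x), x])
W = np.diag(1.0 / sigma**2)                  # weights ~ inverse variance

theta_wls = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
theta_ols = np.linalg.solve(A.T @ A, A.T @ y)  # the W = I special case
print(theta_wls, theta_ols)
```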
Example: Multi-linear Regression
Laboratory experimental data on yield obtained from a catalytic process at various temperatures and pressures.
Fitted multi-linear model: Ŷ = 9.75 + 0.757 T + 0.213 P
Ref.: Ogunnaike, B. A., Random Phenomena, CRC Press, London, 2010
Reactor Yield Data
Estimated Model
Fitted multi-linear model: Ŷ = 9.75 + 0.757 T + 0.213 P
Example: Multi-linear Regression
Boiling points of a series of hydrocarbons
Ref.: Ogunnaike, B. A., Random Phenomena, CRC Press, London, 2010
Candidate Models
[Figure: boiling point (°C) vs. n, the number of carbon atoms, showing the data, the linear fit, and the quadratic fit]

Linear Model: T = a + bn; fitted: T = 39n − 170
Quadratic Model: T = α + βn + γn²; fitted: T = −3n² + 67n − 220
Unaddressed Issues
Model parameter estimates change if
the data set size n, matrix A, and vector Y change
matrix A is the same but only Y changes (due to measurement errors)
n is the same but a different set of input conditions, i.e. a different A matrix, is chosen
How do we compare estimates generated through two independent sets of experiments?
Can we come up with confidence intervals for the 'true' parameters?
Need for Statistical Approach
If we have multiple candidate models, how does one systematically select the most suitable model?
If the identified model is used for prediction, how do we quantify uncertainties in the model predictions?
A linear algebra/optimization based treatment of the model parameter estimation problem does not help in answering these questions systematically.
Remedy: formulate and solve the parameter estimation problem using the framework of probability and statistics.
Notations
Consider n independent random variables Y₁, Y₂, ..., Y_n. A data set {y₁, y₂, ..., y_n} is collected from n independent experiments, one for each Yᵢ, i.e. a set of realizations of {Y₁, Y₂, ..., Y_n}.

Model for RV Yᵢ, relating Yᵢ and the random error Vᵢ:
Yᵢ = (z⁽ⁱ⁾)ᵀθ* + Vᵢ, θ*: true parameter vector
Model relating realizations of the RVs:
yᵢ = (z⁽ⁱ⁾)ᵀθ* + vᵢ

Model for RV Ŷᵢ, relating Ŷᵢ and the parameter estimates θ̂ (an RV):
Ŷᵢ = (z⁽ⁱ⁾)ᵀθ̂, model residuals V̂ᵢ = Yᵢ − Ŷᵢ
Model relating realizations of the RVs:
ŷᵢ = (z⁽ⁱ⁾)ᵀθ̂, v̂ᵢ = yᵢ − ŷᵢ
Context Sensitive Notations
Vectors of realizations: Y = Aθ + V, with Y = [y₁ y₂ ... y_n]ᵀ, A the n×p matrix with rows [z₁⁽ⁱ⁾ z₂⁽ⁱ⁾ ... z_p⁽ⁱ⁾], θ = [θ₁ ... θ_p]ᵀ, and V = [v₁ v₂ ... v_n]ᵀ.

Vectors of random variables: Y = Aθ + V, with Y = [Y₁ Y₂ ... Y_n]ᵀ and V = [V⁽¹⁾ V⁽²⁾ ... V⁽ⁿ⁾]ᵀ.

Note: bold Y and V represent vectors of random variables; ordinary Y and V represent vectors of "realizations" of the random variables.
Notations
Note:
θ*: true parameter vector (fixed, NOT a RV)
θ̂ (bold): parameter estimates (random variable vector)
θ̂ (ordinary): parameter estimates (a realization of θ̂)

A major simplifying assumption: the set S_Z = {z⁽¹⁾, z⁽²⁾, ..., z⁽ⁿ⁾} consists of perfectly known vectors, i.e. there are no errors in the measurements or knowledge of z⁽ⁱ⁾.
Regression Problem Formulation
Let us assume that the modeling error V is a zero mean RV with variance σ², i.e.
E[V] = 0 and Var[V] = σ²
It is assumed that z is a deterministic vector and known exactly. Thus,
E[Y] = θ₁* z₁ + θ₂* z₂ + ... + θ_p* z_p
where θ₁*, ..., θ_p* represent the true model parameters.
Note: at this stage NO assumption has been made about the form of the distribution F_V(v).
Regression Problem Formulation
Now consider a data set generated from n independent experiments and the corresponding modeling equations
yᵢ = θ₁* z₁⁽ⁱ⁾ + ... + θ_p* z_p⁽ⁱ⁾ + vᵢ, i = 1, 2, ..., n
It is further assumed that each
Vᵢ = Yᵢ − (θ₁* z₁⁽ⁱ⁾ + ... + θ_p* z_p⁽ⁱ⁾), i = 1, 2, ..., n
is independent and identically distributed.
Note: the RVs Yᵢ, i = 1, 2, ..., n, are independent but are NOT identically distributed.
Regression Problem Formulation
Using vector notation and collecting all model equations, we have
Y = Aθ* + V (random variables) or Y = Aθ* + V (realizations)
where Y = [Y₁ Y₂ ... Y_n]ᵀ, A is the n×p matrix with rows [z₁⁽ⁱ⁾ ... z_p⁽ⁱ⁾], θ* = [θ₁* ... θ_p*]ᵀ, and V = [V⁽¹⁾ ... V⁽ⁿ⁾]ᵀ.
Regression Problem Formulation
Since E[Vᵢ] = 0 for i = 1, 2, ..., n, it implies that
E[V] = 0 and E[Y] = Aθ*
Let R ≡ Cov(V) = E[VVᵀ]. Since the Vᵢ are assumed to be IID (E[Vᵢ] = 0 and var(Vᵢ) = E[Vᵢ²] = σ² for i = 1, 2, ..., n), it follows that
R = E[VVᵀ] = σ² I_{n×n}

Problem: estimate the unknown constant θ* from the measurements Y = Aθ* + V.
Ordinary Least Squares
The ordinary least squares (OLS) estimate of θ is obtained by minimizing the objective function
VᵀV = (Y − Aθ)ᵀ(Y − Aθ)
with respect to θ:
θ̂_OLS = (AᵀA)⁻¹AᵀY, Ŷ = Aθ̂_OLS

Note: (1/n)VᵀV = (1/n)Σᵢ vᵢ² (i.e. the sample variance). Thus, OLS can be viewed as an estimator that minimizes the sample variance of the modeling errors.

Is θ̂_OLS an unbiased estimate of θ*?
Ordinary Least Squares
Since Y is a realization of the RV Y, it follows that θ̂_OLS can be viewed as a realization of the RV θ̂_OLS, i.e.
θ̂_OLS = (AᵀA)⁻¹AᵀY = (AᵀA)⁻¹Aᵀ(Aθ* + V) = θ* + (AᵀA)⁻¹AᵀV
Taking expectations on both sides,
E[θ̂_OLS] = θ* + (AᵀA)⁻¹Aᵀ E[V] = θ*
Thus, θ̂_OLS is an unbiased estimate of θ*.
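Unbiasedness can be checked by simulation: repeating the experiment with fresh noise realizations and averaging the estimates. The true parameters and design here are illustrative assumptions:

```python
import numpy as np

# Monte Carlo check of E[theta_hat_OLS] = theta*: average OLS estimates
# over many independent noise realizations of the same experiment.
rng = np.random.default_rng(3)
theta_star = np.array([1.0, -0.5])
A = np.column_stack([np.ones(30), np.linspace(0.0, 1.0, 30)])
pinv = np.linalg.inv(A.T @ A) @ A.T        # the fixed matrix (A^T A)^-1 A^T

estimates = []
for _ in range(2000):
    Y = A @ theta_star + 0.2 * rng.standard_normal(30)
    estimates.append(pinv @ Y)

mean_est = np.mean(estimates, axis=0)
print(mean_est)  # close to theta* = [1.0, -0.5]
```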
Ordinary Least Squares
Defining the matrix L = (AᵀA)⁻¹Aᵀ,
θ̂_OLS = θ* + LV, i.e. θ̂_OLS − θ* = LV
Cov(θ̂_OLS) = E[(θ̂_OLS − θ*)(θ̂_OLS − θ*)ᵀ] = L E[VVᵀ] Lᵀ = LRLᵀ = σ²LLᵀ = σ²(AᵀA)⁻¹

Difficulty: σ² is not known in practice.
Remedy: estimate σ² from the samples,
σ̂² = V̂ᵀV̂/(n − p), V̂ = Y − Aθ̂_OLS
Estimate of Cov(θ̂_OLS): σ̂²(AᵀA)⁻¹
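A sketch of the remedy above on synthetic data (true σ = 0.3 is an assumed value): estimate σ² from the residuals with n − p degrees of freedom and form the covariance estimate σ̂²(AᵀA)⁻¹:

```python
import numpy as np

# Residual-based estimate of sigma^2 and of Cov(theta_hat_OLS).
rng = np.random.default_rng(4)
n, sigma_true = 200, 0.3
x = np.linspace(0.0, 4.0, n)
A = np.column_stack([np.ones(n), x])
y = A @ np.array([2.0, 1.0]) + sigma_true * rng.standard_normal(n)

theta = np.linalg.solve(A.T @ A, A.T @ y)
v_hat = y - A @ theta
p = A.shape[1]
sigma2_hat = (v_hat @ v_hat) / (n - p)           # sigma_hat^2 = V^T V / (n - p)
cov_theta = sigma2_hat * np.linalg.inv(A.T @ A)  # estimate of Cov(theta_hat)

print(np.sqrt(sigma2_hat))  # close to sigma_true = 0.3
```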
Minimum Variance Estimator
Given measurements Y and a model Y = Aθ* + V, where A is a known n×p matrix and V is a vector of random variables such that
E[V] = 0 and Cov(V) = E[VVᵀ] = R
suppose we want to find an unbiased estimate θ̂ of the unknown θ* ∈ R^p such that
Cov(θ̂) = E[(θ̂ − θ*)(θ̂ − θ*)ᵀ]
is as small as possible.
Note: Here R is a symmetric and positive definite matrix
Minimum Variance Estimator
Taking clues from the OLS solution, let us propose a linear estimator of the form
θ̂ = LY, where L is a p×n matrix

Minimum variance parameter estimation problem: find the matrix L such that Cov(θ̂) is as small as possible.
Minimum Variance Estimator
The unbiasedness requirement implies
E[θ̂] = E[LY] = E[L(Aθ* + V)] = LAθ* + L E[V] = LAθ*
Since E[V] = 0, the unbiasedness condition E[θ̂] = θ* will hold if and only if we choose L such that LA = I.

Minimum variance parameter estimation problem:
Min_L J = Cov(θ̂) subject to LA = I
Minimum Variance Estimator
To formulate an optimization problem, we need to construct a scalar objective function. Consider the scalar function
J = tr[Cov(θ̂)] = tr{E[(θ̂ − θ*)(θ̂ − θ*)ᵀ]} = Var(θ̂₁) + Var(θ̂₂) + ... + Var(θ̂_p)
Thus, the problem of finding L that minimizes this scalar function of Cov(θ̂) can be formulated as
Min_L J = tr{E[(θ̂ − θ*)(θ̂ − θ*)ᵀ]} subject to LA = I
Minimum Variance Estimator
Since θ̂ = LY = L(Aθ* + V) = θ* + LV (using LA = I), it follows that
θ̂ − θ* = LV and (θ̂ − θ*)(θ̂ − θ*)ᵀ = LVVᵀLᵀ
This is equivalent to finding the matrix L such that
J = (1/2) tr{E[(θ̂ − θ*)(θ̂ − θ*)ᵀ]} + tr[Λ(I − LA)]
is minimized with respect to L, where Λ represents the matrix of Lagrange multipliers.
Minimum Variance Estimator
Since E[VVᵀ] = R, it follows that
E[(θ̂ − θ*)(θ̂ − θ*)ᵀ] = E[LVVᵀLᵀ] = LRLᵀ
Thus, the optimization problem can be reformulated as minimizing the objective function
J = (1/2) tr[LRLᵀ] + tr[Λ(I − LA)]
with respect to L.
Minimum Variance Estimator
Using the results
∂ tr[BAC]/∂A = BᵀCᵀ and ∂ tr[ABAᵀ]/∂A = A(B + Bᵀ)
the necessary conditions for optimality are
∂J/∂L = LR − ΛᵀAᵀ = 0 and ∂J/∂Λ = I − LA = 0
Thus, we have
L = ΛᵀAᵀR⁻¹ and (I − LA) = (I − ΛᵀAᵀR⁻¹A) = 0
Minimum Variance Estimator
This implies Λᵀ = (AᵀR⁻¹A)⁻¹, and it follows that
L = (AᵀR⁻¹A)⁻¹AᵀR⁻¹
Thus, the minimum variance estimator is
θ̂_MV = LY = (AᵀR⁻¹A)⁻¹AᵀR⁻¹Y
For the minimum variance estimator,
θ̂_MV = θ* + (AᵀR⁻¹A)⁻¹AᵀR⁻¹V
which implies
Cov(θ̂_MV) = E[(θ̂_MV − θ*)(θ̂_MV − θ*)ᵀ] = (AᵀR⁻¹A)⁻¹
Gauss-Markov Theorem
Comparing the minimum variance estimator
θ̂_MV = (AᵀR⁻¹A)⁻¹AᵀR⁻¹Y
with the weighted least squares solution
θ̂_WLS = (AᵀWA)⁻¹AᵀWY
indicates that selecting W = R⁻¹ yields the MV estimator.

Gauss-Markov theorem: The minimum variance unbiased linear estimator is identical to the weighted least squares estimator when the weighting matrix is selected as the inverse of the "measurement error" covariance matrix.
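The identification W = R⁻¹ can be sketched numerically; the diagonal error covariance R below is an illustrative assumption, and the check simply confirms that the WLS formula with W = R⁻¹ reproduces θ̂_MV:

```python
import numpy as np

# WLS with W = R^-1 coincides with the minimum variance estimator
# theta_MV = (A^T R^-1 A)^-1 A^T R^-1 Y.
rng = np.random.default_rng(5)
A = np.column_stack([np.ones(10), np.arange(10.0)])
R = np.diag(0.1 + 0.05 * np.arange(10.0))     # Cov(V), assumed known
Y = A @ np.array([1.0, 2.0]) + rng.multivariate_normal(np.zeros(10), R)

Rinv = np.linalg.inv(R)
theta_mv = np.linalg.solve(A.T @ Rinv @ A, A.T @ Rinv @ Y)

W = Rinv                                      # the Gauss-Markov weighting
theta_wls = np.linalg.solve(A.T @ W @ A, A.T @ W @ Y)

print(np.allclose(theta_mv, theta_wls))  # True
```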
Regression: OLS as MV Estimator
Returning to the regression problem, R = Cov(V) = σ²I. It follows that the OLS estimator is the minimum variance estimator:
θ̂_MV = (Aᵀ(σ²I)⁻¹A)⁻¹Aᵀ(σ²I)⁻¹Y = (AᵀA)⁻¹AᵀY = θ̂_OLS
Cov(θ̂_MV) = σ²(AᵀA)⁻¹ = Cov(θ̂_OLS)
Any other linear unbiased estimator of θ, say θ̃ = L̃Y with L̃A = I, will have
tr[Cov(θ̃)] ≥ tr[Cov(θ̂_MV)]
Insights
OLS is an unbiased parameter estimator. The variance of the errors in the parameter estimates can be reduced by increasing the sample size.
The OLS estimator can be viewed as an estimator
that minimizes the sample variance of the model residuals
that yields the parameter estimates with the minimum possible variance (the most efficient linear estimator)
This is as far as we can go without making any assumption about the distribution of the model residuals.
Need to Choose Distribution
For selecting a suitable 'black-box' model that explains the data best from among candidate models, we need to test the hypothesis of whether an estimated model coefficient is 'close to zero' or 'not close to zero', i.e. whether the associated term in the model can be retained or neglected.
We need to generate confidence intervals for the true model parameters.
We need to use the estimated model for carrying out predictions.
Thus, we cannot proceed further unless we select a suitable distribution for the model residuals.
Example: Global Temperature Rise
[Figure: global temperature deviation vs. year, showing the data, the linear model fit, and the quadratic model fit]
Models developed using OLS:
Linear model: Ŷ = −8.187 + 4.168×10⁻³ t + V
Quadratic model: Ŷ = 114.4 − 0.1230 t + 3.053×10⁻⁵ t² + V
Statistics
Linear Model:
θ̂_OLS = [−8.187, 4.168×10⁻³]ᵀ, σ̂² = 1.8208×10⁻²
(AᵀA)⁻¹ ≈ [5.615, −7.48×10⁻³; −7.48×10⁻³, 4.19×10⁻⁶]

Quadratic Model:
θ̂_OLS = [114.4, −0.1230, 3.053×10⁻⁵]ᵀ, σ̂² = 1.582×10⁻²
together with the corresponding 3×3 matrix (AᵀA)⁻¹
Example: Global Temperature Rise
[Figure: frequency histograms of the normalized residuals of the linear model and of the quadratic model, both roughly bell shaped]
Normalized residual: ṽᵢ = v̂ᵢ/σ̂, where σ̂ is the sample standard deviation.
Choice of Distribution
Least squares (LS) estimation penalizes the square of the deviations from zero error (i.e. the mean). Thus, it 'favors' errors close to zero. Moreover, positive and negative errors of equal magnitude are 'equally penalized'.
Consequence: histograms of the model residuals are approximately bell shaped in most LS estimation.
Thus, it is reasonable to assume that the model residuals have a Gaussian/normal distribution.
This choice also follows from a generalized version of the Central Limit Theorem.
Regression Problem Reformulation
Up till now, it is assumed that each
V⁽ⁱ⁾ = Y⁽ⁱ⁾ − (θ₁ z₁⁽ⁱ⁾ + ... + θ_p z_p⁽ⁱ⁾), i = 1, 2, ..., n
is independent and identically distributed. No distribution was specified for V⁽ⁱ⁾.

Now we additionally assume that each V⁽ⁱ⁾ is Gaussian, i.e.
V⁽ⁱ⁾ ~ N(0, σ²) for i = 1, 2, ..., n
or, in other words, V ~ N(0, σ² I_{n×n}).
Gaussian Assumption: Visualization
[Figure: modeling error densities centered on the true regression line]
Ref.: Ogunnaike, B. A., Random Phenomena, CRC Press, London, 2010
Consequences of Gaussianity
31-Mar-16 Regression 60
i p pi
ii
iiV
nV V V
nV V V
i
z z yv
v|θ v N
|θ v N |θ v N |θ v N
|θ vvv f θ L
niV
i
n
n
...
exp
....
,....,,
,...,,:
,....,,
11
2
2
21
21
22
1
21
21
21
where
followsasparametersunknownfor
functionlikelihoodtheconstructcanwei.i.d.,and
normalarethatassumptiontheUnder
θ
n
i
ivnn
θ L1
2
2
2
2
1
22
2 ln ln ln
Maximum Likelihood Estimation
Alternatively, using vector-matrix notation with V ~ N(0, σ² I_{n×n}),
L(θ) = f_V(V|θ) = (2πσ²)^(−n/2) exp[−VᵀV/(2σ²)] = (2πσ²)^(−n/2) exp[−(Y − Aθ)ᵀ(Y − Aθ)/(2σ²)]
ln L(θ) = −(n/2) ln(2π) − (n/2) ln σ² − (1/(2σ²)) (Y − Aθ)ᵀ(Y − Aθ)

The necessary condition for optimality in vector-matrix notation:
∂ ln L(θ)/∂θ = (1/σ²) Aᵀ(Y − Aθ) = 0
OLS as ML Estimator
Thus, the maximum likelihood point estimate of θ is
θ̂_ML = (AᵀA)⁻¹AᵀY = θ̂_OLS, Ŷ = Aθ̂_ML
Let θ̂_ML represent a realization of the RV θ̂_ML. Then it follows that
E[θ̂_ML] = θ*, Cov(θ̂_ML) = Cov(θ̂_OLS) = σ²(AᵀA)⁻¹, and θ̂_ML ~ N(θ*, σ²(AᵀA)⁻¹)

Thus, if we assume that the modeling errors are i.i.d. samples from the Gaussian distribution, then the OLS estimator turns out to be identical to the Maximum Likelihood (ML) estimator.
Consequences of Gaussianity
From the properties of Gaussian RVs, it follows that
θ̂_OLS = θ̂_MV = θ̂_ML = θ* + (AᵀA)⁻¹AᵀV
is a Gaussian random vector with
E[θ̂] = θ* and Cov(θ̂) = σ²(AᵀA)⁻¹

Let us define the matrix P = (AᵀA)⁻¹, with pᵢᵢ the i'th diagonal element of P.
From the properties of the multivariate Gaussian RV, it follows that the marginal pdf of θ̂ᵢ is univariate normal:
θ̂ᵢ ~ N(θᵢ*, σ² pᵢᵢ)
Confidence Intervals on Parameters
Since θ̂ᵢ ~ N(θᵢ*, σ² pᵢᵢ), in principle we can construct the confidence intervals on θᵢ* as
P(θ̂ᵢ − z_{α/2} σ √pᵢᵢ ≤ θᵢ* ≤ θ̂ᵢ + z_{α/2} σ √pᵢᵢ) = 1 − α

Difficulty: σ² is unknown.
Remedy: estimate σ² using the model residuals,
σ̂² = V̂ᵀV̂/(n − p)
and use it to construct the CI.
Question: what is the distribution of (θ̂ᵢ − θᵢ*)/(σ̂ √pᵢᵢ)?
Confidence Intervals on Parameters
Consider
VᵀV/σ² = Σᵢ (Vᵢ/σ)² ~ χ²_n
Writing V = Y − Aθ* = (Y − Aθ̂) + A(θ̂ − θ*) = V̂ + A(θ̂ − θ*),
VᵀV = V̂ᵀV̂ + (θ̂ − θ*)ᵀAᵀA(θ̂ − θ*)
Since V̂ is orthogonal to the column space of A, it follows that ⟨V̂, A(θ̂ − θ*)⟩ = 0, so the cross terms vanish.

Thus,
VᵀV/σ² = V̂ᵀV̂/σ² + (θ̂ − θ*)ᵀAᵀA(θ̂ − θ*)/σ²
with (θ̂ − θ*)ᵀAᵀA(θ̂ − θ*)/σ² ~ χ²_p, and from the properties of χ² RVs it follows that
V̂ᵀV̂/σ² ~ χ²_{n−p}
Confidence Intervals on Parameters
Now σ̂² = (1/(n − p)) V̂ᵀV̂ = (1/(n − p)) Σᵢ v̂ᵢ², where V̂ = Y − Aθ̂. Thus, it follows that
(n − p) σ̂²/σ² = V̂ᵀV̂/σ² ~ χ²_{n−p}
and
(θ̂ᵢ − θᵢ*)/(σ̂ √pᵢᵢ) = [(θ̂ᵢ − θᵢ*)/(σ √pᵢᵢ)] / √[((n − p) σ̂²/σ²)/(n − p)] = Z/√(χ²_{n−p}/(n − p)) ~ t_{n−p}

Thus, a (1 − α)×100% confidence interval for θᵢ* is given by
P(θ̂ᵢ − t_{α/2, n−p} σ̂ √pᵢᵢ ≤ θᵢ* ≤ θ̂ᵢ + t_{α/2, n−p} σ̂ √pᵢᵢ) = 1 − α
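The interval above can be sketched in a few lines; the data are illustrative, and `scipy.stats.t.ppf` supplies the t critical value:

```python
import numpy as np
from scipy import stats

# 95% CI for each coefficient: theta_i_hat +/- t_{0.025, n-p} sigma_hat sqrt(p_ii).
rng = np.random.default_rng(7)
n = 80
x = np.linspace(0.0, 2.0, n)
A = np.column_stack([np.ones(n), x])
y = A @ np.array([1.0, 3.0]) + 0.2 * rng.standard_normal(n)

theta = np.linalg.solve(A.T @ A, A.T @ y)
p = A.shape[1]
sigma2_hat = np.sum((y - A @ theta) ** 2) / (n - p)
P = np.linalg.inv(A.T @ A)

se = np.sqrt(sigma2_hat * np.diag(P))     # sigma_hat * sqrt(p_ii)
t_crit = stats.t.ppf(0.975, df=n - p)     # t_{alpha/2, n-p}, alpha = 0.05
ci_lower, ci_upper = theta - t_crit * se, theta + t_crit * se
print(ci_lower, ci_upper)
```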
Example: Global Warming
Linear Model: θ̂_OLS = [−8.187, 4.168×10⁻³]ᵀ, P = (AᵀA)⁻¹ as before
σ̂²_θ₂ = σ̂² p₂₂ = (1.8208×10⁻²)(4.19×10⁻⁶) ≈ (2.763×10⁻⁴)²

Thus, the 95% confidence interval for θ₂* (with t_{0.025, 140} = 1.977):
P(4.168×10⁻³ − 1.977 × 2.763×10⁻⁴ ≤ θ₂* ≤ 4.168×10⁻³ + 1.977 × 2.763×10⁻⁴) = 0.95
P(3.622×10⁻³ ≤ θ₂* ≤ 4.714×10⁻³) = 0.95
Hypothesis Testing
While developing a black box model from data, we are often not clear about the terms to be included in the model. For example, for the global temperature data, should we develop a linear model or a quadratic model?

To measure the importance of the contribution of the i'th component of z to
E[Y] = θ₁* z₁ + ... + θ_p* z_p
we can test the hypothesis
Null hypothesis H₀: θᵢ* = 0
Alternate hypothesis H₁: θᵢ* ≠ 0
Hypothesis Testing
If H₀ is true, then
θ̂ᵢ/(σ̂ √pᵢᵢ) ~ T_{n−p}
and, at a level of significance α, the test of H₀ is:
Reject H₀ if |t₀| > t_{α/2, n−p}, where t₀ = θ̂ᵢ/(σ̂ √pᵢᵢ). Otherwise accept (i.e. we fail to reject) H₀.

Let k be the observed value of the test statistic t₀. Then
p-value = 2 P(T_{n−p} ≥ |k|)
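The test can be sketched as follows on illustrative data with a genuinely nonzero slope; `scipy.stats.t.sf` gives the upper-tail probability needed for the two-sided p-value:

```python
import numpy as np
from scipy import stats

# t-test of H0: theta_2* = 0 via t0 = theta_2_hat / (sigma_hat sqrt(p_22)).
rng = np.random.default_rng(8)
n = 60
x = np.linspace(0.0, 1.0, n)
A = np.column_stack([np.ones(n), x])
y = A @ np.array([0.0, 2.0]) + 0.1 * rng.standard_normal(n)

theta = np.linalg.solve(A.T @ A, A.T @ y)
p = A.shape[1]
sigma2_hat = np.sum((y - A @ theta) ** 2) / (n - p)
P = np.linalg.inv(A.T @ A)

t0 = theta[1] / np.sqrt(sigma2_hat * P[1, 1])
p_value = 2.0 * stats.t.sf(abs(t0), df=n - p)   # 2 P(T_{n-p} >= |t0|)
print(t0, p_value)  # large |t0|, tiny p-value: reject H0
```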
Example: Global Warming
Quadratic Model: θ̂_OLS = [114.4, −0.1230, 3.053×10⁻⁵]ᵀ, σ̂² = 1.582×10⁻², P = (AᵀA)⁻¹

We are interested in finding whether inclusion of the quadratic term is contributing to the mean of Y:
Null hypothesis H₀: θ₃* = 0
Alternate hypothesis H₁: θ₃* ≠ 0
Hypothesis Testing
If H₀ is true, then θ̂₃/(σ̂ √p₃₃) ~ T₁₃₉. Let the level of significance be α = 0.05.
The observed test statistic is
t₀ = θ̂₃/(σ̂ √p₃₃) = 4.705
Since |t₀| = 4.705 > t_{0.025, 139} = 1.977, we reject H₀.
p-value = 2 P(T₁₃₉ ≥ 4.705)

Thus, there is strong evidence that the quadratic term contributes to the correlation.
Mean Response
Consider the model Y₀ = z₀ᵀθ* + V₀. Suppose we select z₀ and collect samples of Y₀. Since Y₀ is an RV, we will get samples {y₀,₁, y₀,₂, ..., y₀,m}, with
y₀,ⱼ = z₀ᵀθ* + v₀,ⱼ

Question: what is the mean of the RV Y₀ for a fixed z₀? It is E[Y₀] = z₀ᵀθ*.
Since we do not know θ*, an estimate of E[Y₀] can be constructed using θ̂ as
Ŷ₀ = z₀ᵀθ̂
Mean Response
Since θ̂ is a random vector, Ŷ₀ = z₀ᵀθ̂ is a random variable. Is Ŷ₀ an unbiased estimate of E[Y₀]?
E[Ŷ₀] = z₀ᵀ E[θ̂] = z₀ᵀθ*
Thus, we have
Var(Ŷ₀) = E[(Ŷ₀ − z₀ᵀθ*)²] = z₀ᵀ E[(θ̂ − θ*)(θ̂ − θ*)ᵀ] z₀ = z₀ᵀ Cov(θ̂) z₀
or
Var(Ŷ₀) = σ² z₀ᵀ(AᵀA)⁻¹z₀ = σ² z₀ᵀPz₀
Since θ̂ is a Gaussian RV, it follows that
Ŷ₀ ~ N(z₀ᵀθ*, σ² z₀ᵀPz₀)
Mean Response
(Ŷ₀ − z₀ᵀθ*)/(σ̂ √(z₀ᵀPz₀)) = Z/√(χ²_{n−p}/(n − p)) ~ t_{n−p}

Since in practice we rarely know the true σ², an estimate of Var(Ŷ₀) can be computed as
σ̂²_{Ŷ₀} = σ̂² z₀ᵀPz₀, where σ̂² = V̂ᵀV̂/(n − p) and V̂ = Y − Aθ̂

Thus, the (1 − α)×100% confidence interval on the true mean response is
P(Ŷ₀ − t_{α/2, n−p} σ̂ √(z₀ᵀPz₀) ≤ z₀ᵀθ* ≤ Ŷ₀ + t_{α/2, n−p} σ̂ √(z₀ᵀPz₀)) = 1 − α
Future Response
Given the model Y₀ = z₀ᵀθ* + V₀ at a fixed z₀, in some situations we are interested in predicting Y₀.
Since we do not know θ*, an estimate of Y₀ can be constructed using θ̂ as
Ŷ₀ = z₀ᵀθ̂
Apart from determining a single value to predict a response, we are interested in finding a prediction interval that, with a given degree of confidence, will contain the response.
Future Response
Consider Y₀ − Ŷ₀. Since
Y₀ ~ N(z₀ᵀθ*, σ²) (with V₀ ~ N(0, σ²)) and Ŷ₀ ~ N(z₀ᵀθ*, σ² z₀ᵀPz₀)
and Y₀ (i.e. the future response) is independent of the past data {Y₁, Y₂, ..., Y_n} used to obtain θ̂, it follows that
Y₀ − Ŷ₀ ~ N(0, σ²(1 + z₀ᵀPz₀))
Future Response
Thus,
(Y₀ − Ŷ₀)/(σ √(1 + z₀ᵀPz₀)) ~ N(0, 1)
and
(Y₀ − z₀ᵀθ̂)/(σ̂ √(1 + z₀ᵀPz₀)) = Z/√(χ²_{n−p}/(n − p)) ~ T_{n−p}
Thus, for any α we have
P(−t_{α/2, n−p} ≤ (Y₀ − z₀ᵀθ̂)/(σ̂ √(1 + z₀ᵀPz₀)) ≤ t_{α/2, n−p}) = 1 − α
Prediction Interval
A (1 − α)×100% prediction interval for the future response Y₀ at z = z₀ is
z₀ᵀθ̂ ± t_{α/2, n−p} σ̂ √(1 + z₀ᵀPz₀)

Recall: the (1 − α)×100% confidence interval on the mean response at z = z₀ (i.e. on E[Y₀]) is
z₀ᵀθ̂ ± t_{α/2, n−p} σ̂ √(z₀ᵀPz₀)
CI and PI
Difference between confidence interval (CI) and
prediction interval (PI):
Confidence interval (CI) is on a fixed parameter
of interest (like E[Y0] )
Prediction interval (PI) is on a random variable
(like Y0 )
At any z0, the prediction interval on future
response is wider than the confidence interval on
the mean response.
31-Mar-16 Regression 79
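The CI/PI comparison can be checked numerically. A minimal sketch for simple linear regression y = c + m x, on hypothetical data; for this model the quadratic form z_0^T P z_0 reduces to 1/n + (x_0 - \bar{x})^2 / S_{xx}, and t_{0.025,4} = 2.7764 is taken from tables:

```python
import math

# Simple linear regression on hypothetical data; compare CI and PI half-widths.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n
Sxx = sum((x - xbar) ** 2 for x in xs)
m = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / Sxx
c = ybar - m * xbar

sse = sum((y - (c + m * x)) ** 2 for x, y in zip(xs, ys))
sigma_hat = math.sqrt(sse / (n - 2))   # n - p degrees of freedom, p = 2
t_crit = 2.7764                        # t_{0.025, 4} from tables

def half_widths(x0):
    """(CI half-width on mean response, PI half-width) at x0."""
    lever = 1.0 / n + (x0 - xbar) ** 2 / Sxx   # z0' P z0 for this model
    return (t_crit * sigma_hat * math.sqrt(lever),
            t_crit * sigma_hat * math.sqrt(1.0 + lever))
```

The extra "1 +" under the square root is exactly why the PI is always wider than the CI, and both intervals are narrowest at x_0 = \bar{x}.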
Mileage Related to Engine Displacement

Consider the mileage (y, miles/gallon) and engine displacement (x, inch³) data for various cars (Montgomery and Runger, 2003). An expert car engineer insists that the mileage is related to displacement as Y = mx + c.

Proposed model:

    Y = c + m x + V

Estimated model parameters:

    \hat{\theta} = [\hat{c} \;\; \hat{m}]^T = [33.73 \;\; -0.047]^T,   with   P = (A^T A)^{-1}
Mileage Related to Engine Displacement

[Figure: scatter plot of x (engine displacement) vs. y (gasoline mileage), showing the raw data, the fitted regression model, the lower/upper CI for the mean response, and the lower/upper prediction interval for an individual prediction.]

Note: the PI is narrowest at x = \bar{x} and increases as we move away from \bar{x}.
Assessing Quality of Fit

How do we assess whether the fitted model is able to adequately explain the response variable Y?

Variability Analysis: given the data set { (z_i, y_i) : i = 1, 2, ...., n },

    SS_Y = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i + \hat{y}_i - \bar{y})^2
         = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + 2 \sum_{i=1}^{n} (y_i - \hat{y}_i)(\hat{y}_i - \bar{y})
         = SS_E + SS_R   (Residuals + Regression)

since the cross term \sum_{i=1}^{n} (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) = \sum_{i=1}^{n} \hat{v}_i (\hat{y}_i - \bar{y}) vanishes, as shown below.
Variability Analysis

In OLS, the necessary condition for optimality

    A^T ( Y - A \hat{\theta} ) = 0   i.e.   A^T \hat{V} = 0

implies that

    \hat{Y}^T \hat{V} = \hat{\theta}^T A^T \hat{V} = 0   i.e.   \sum_{i=1}^{n} \hat{y}_i \hat{v}_i = 0

Note: models used in multilinear regression are typically of the form

    Y_i = \theta_1 + \theta_2 z_{1i} + \theta_3 z_{2i} + .... + \theta_p z_{(p-1)i} + V_i   for i = 1, 2, ...., n
Variability Analysis

This implies that the first column of matrix A is [1 \; 1 \; .... \; 1]^T. Thus, the necessary condition for optimality

    A^T \hat{V} = A^T ( Y - A \hat{\theta} ) = 0

includes the constraint

    \sum_{i=1}^{n} \hat{v}_i = \sum_{i=1}^{n} (y_i - \hat{y}_i) = 0

Together, these two conditions make the cross term vanish, so

    SS_Y = SS_E + SS_R
    {Total variability} = {Variability left unexplained} + {Variability captured by regression model}
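The decomposition SS_Y = SS_E + SS_R and the constraint \sum \hat{v}_i = 0 hold for any OLS fit that includes an intercept; a sketch on hypothetical data:

```python
# Verify SS_Y = SS_E + SS_R and sum(vhat_i) = 0 for an OLS fit with intercept.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]
n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n
m = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
    / sum((x - xbar) ** 2 for x in xs)
c = ybar - m * xbar
yhat = [c + m * x for x in xs]          # fitted values

ss_y = sum((y - ybar) ** 2 for y in ys)             # total variability
ss_e = sum((y - yh) ** 2 for y, yh in zip(ys, yhat))  # residuals
ss_r = sum((yh - ybar) ** 2 for yh in yhat)           # regression
```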
Variability Analysis

The variability left unexplained is the variability of the residuals, denoted as SS_E or SS_{Res}. A good measure of the quality of the fit is

    R^2 = \frac{SS_R}{SS_Y} = 1 - \frac{SS_E}{SS_Y}

R^2 quantifies the proportion of the variability in the response variable explained by the input variables.

R^2 is called the coefficient of determination (a direct measure of the quality of fit). A good fit should result in a high R^2.
Variability Analysis

    R^2 = \frac{ \text{Variation in } Y \text{ explained by regression} }{ \text{Total observed variation in } Y }

Note: 0 \le R^2 \le 1.

A coefficient of determination close to 1 indicates that the model adequately captures the relevant information contained in the data. Conversely, a coefficient of determination close to 0 indicates a model that is inadequate to capture the relevant information contained in the data.

In general, it is possible to improve R^2 by introducing additional parameters in a model. However, note that the improved R^2 can be, at times, misleading.
Variability Analysis

An alternate measure of model fit is the adjusted R^2:

    R^2_{adj} = 1 - \frac{ SS_E / (n - p) }{ SS_Y / (n - 1) }

SS_E / (n - p): the residual mean square.
SS_Y / (n - 1): this term remains constant regardless of the number of variables in the model.

R^2_{adj} penalizes a model that improves R^2 through the inclusion of more parameters. Relatively high and comparable values of R^2 and R^2_{adj} indicate that the variability in the data has been captured adequately without using an excessive number of parameters.
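Both measures can be computed with one small helper, assuming y holds the observations and yhat the fitted values from a p-parameter model:

```python
def r_squared(y, yhat, p):
    """Return (R^2, adjusted R^2) for fitted values from a p-parameter model."""
    n = len(y)
    ybar = sum(y) / n
    ss_y = sum((yi - ybar) ** 2 for yi in y)
    ss_e = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
    r2 = 1.0 - ss_e / ss_y
    # Adjusted R^2 trades SS terms for mean squares, penalizing extra parameters
    r2_adj = 1.0 - (ss_e / (n - p)) / (ss_y / (n - 1))
    return r2, r2_adj
```

For a perfect fit both values equal 1; otherwise R^2_{adj} < R^2 whenever p > 1.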
Variability Analysis: Examples

Gasoline mileage example:

    SS_Y = 1237.54,  SS_R = 955.72,  R^2 = 0.77

Global warming example:

    Linear model:     SS_Y = 6.6934,  SS_E = 2.5491,  R^2 = 0.6192,  R^2_{adj} = 0.6164
    Quadratic model:  SS_Y = 6.6934,  SS_E = 2.1989,  R^2 = 0.6715,  R^2_{adj} = 0.6668

Note: 0 \le R^2 \le 1.
Example: Multi-linear Regression

Boiling points of a series of hydrocarbons.

Ref.: Ogunnaike, B. A., Random Phenomena, CRC Press, London, 2010
Candidate Models

    Linear model:     T = a + b n
    Quadratic model:  T = \alpha + \beta n + \gamma n^2

[Figure: boiling point (°C) vs. n, number of carbon atoms, with the data and both fitted models. Linear model: T = 39 n - 170. Quadratic model: T = -3 n^2 + 67 n - 220.]
Raw Model Residues

Quadratic model:

    T = -218.1429 + 66.6667 n - 3.0238 n^2 + v(k),   \hat{\sigma} = 6.3373

[Figure: model residue v(k) (°C) vs. n, number of carbon atoms.]
Confidence Interval

    P = (A^T A)^{-1} = [  1.9464  -0.9107   0.0893
                         -0.9107   0.5060  -0.0536
                          0.0893  -0.0536   0.0060 ]

Thus, the 95% confidence intervals for \theta_i^*, with t_{0.025, 5} = 2.5706, are:

    Parameter 1:  -240.871 \le \theta_1^* \le -195.415
    Parameter 2:    55.079 \le \theta_2^* \le   78.254
    Parameter 3:   -4.2807 \le \theta_3^* \le  -1.7670

To measure the importance of the contribution of the 3rd component (the n^2 term) to the quadratic model, we can test the hypothesis \theta_3^* = 0.
Hypothesis Testing

    Null hypothesis       H_0: \theta_3^* = 0
    Alternate hypothesis  H_1: \theta_3^* \ne 0

If H_0 is true, then

    \hat{\theta}_3 / ( \hat{\sigma} \sqrt{P_{33}} ) \sim T(5)

and, at level of significance \alpha = 0.01, the test of H_0 is:
Reject H_0 if | \hat{\theta}_3 | / ( \hat{\sigma} \sqrt{P_{33}} ) > t_{0.005, 5} = 4.0321; fail to reject H_0 otherwise.

Test statistic:

    \frac{ | \hat{\theta}_3 | }{ \hat{\sigma} \sqrt{P_{33}} } = \frac{3.0238}{0.4889} = 6.1845

Since 6.1845 > 4.0321, the null hypothesis is rejected.
Hypothesis Testing

k = 6.1845 is the observed value of the test statistic.

    p\text{-value} = 2 \, P( T_5 > 6.1845 ) = 0.0016

Note: p-value (0.0016) < level of significance (0.01). Thus, there is strong evidence that the quadratic term contributes to the correlation between the boiling point and the carbon number.

Coefficient of determination:

    Linear model:     R^2 = 0.974,  R^2_{adj} = 0.9698
    Quadratic model:  R^2 = 0.997,  R^2_{adj} = 0.9958
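The significance test on the n^2 coefficient can be reproduced directly from the example's numbers (\hat{\theta}_3 = -3.0238, \hat{\sigma}\sqrt{P_{33}} \approx 0.4889, t_{0.005,5} = 4.0321 from tables):

```python
# t-test on a single regression coefficient, using values from the example.
theta3_hat = -3.0238   # fitted coefficient of n^2
se_theta3 = 0.4889     # sigma_hat * sqrt(P_33)
t_crit = 4.0321        # t_{0.005, 5}: two-sided test at alpha = 0.01, n - p = 5

t_stat = abs(theta3_hat) / se_theta3
reject_h0 = t_stat > t_crit   # True here: the quadratic term is significant
```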
Analysis of Residuals

Linear model: normalized residuals show a pattern.
Quadratic model: normalized residuals are randomly spread between +/- 2.

[Figure: normalized model residual v(k) vs. n, number of carbon atoms, for the linear and quadratic models.]
Example: Multi-linear Regression

Laboratory experimental data on yield obtained from a catalytic process at various temperatures and pressures (n = 32).

Fitted multi-linear model:

    \hat{y} = 75.9 + 0.0757 x_1 + 3.21 x_2

Ref.: Ogunnaike, B. A., Random Phenomena, CRC Press, London, 2010
Raw Model Residues

    \hat{\theta} = [ 75.8660 \;\; 0.0757 \;\; 3.2120 ]^T,   \hat{\sigma} = 0.9415

[Figure: model residue v(k) vs. sample number.]
Confidence Interval

    P = (A^T A)^{-1} = [  9.6437  -0.0925  -0.0065
                         -0.0925   0.0010   0.0000
                         -0.0065   0.0000   0.0004 ]

Thus, the 95% confidence intervals for \theta_i^*, with t_{0.025, 29} = 2.0452, are:

    Parameter 1:  69.89  \le \theta_1^* \le 81.85
    Parameter 2:  0.0150 \le \theta_2^* \le 0.1370
    Parameter 3:  2.9941 \le \theta_3^* \le 3.4299

To measure the importance of the contribution of the 2nd component (x_1) to the proposed model, we can test the hypothesis \theta_2^* = 0.
Hypothesis Testing

    Null hypothesis       H_0: \theta_2^* = 0
    Alternate hypothesis  H_1: \theta_2^* \ne 0

If H_0 is true, then

    \hat{\theta}_2 / ( \hat{\sigma} \sqrt{P_{22}} ) \sim T(29)

and, at level of significance \alpha = 0.05, the test of H_0 is:
Reject H_0 if | \hat{\theta}_2 | / ( \hat{\sigma} \sqrt{P_{22}} ) > t_{0.025, 29} = 2.0452; fail to reject H_0 otherwise.

Test statistic:

    \frac{ | \hat{\theta}_2 | }{ \hat{\sigma} \sqrt{P_{22}} } = \frac{0.0757}{0.0298} = 2.5439

Since 2.5439 > 2.0452, the null hypothesis is rejected.
Hypothesis Testing

k = 2.5439 is the observed value of the test statistic.

    p\text{-value} = 2 \, P( T_{29} > 2.5439 ) = 0.0166

Note: p-value (0.0166) < level of significance (0.05).

At level of significance \alpha = 0.01, the test of H_0 is:
Reject H_0 if | \hat{\theta}_2 | / ( \hat{\sigma} \sqrt{P_{22}} ) > t_{0.005, 29} = 2.7564; fail to reject H_0 otherwise.

Since k = 2.5439 < 2.7564, we fail to reject the null hypothesis.
Note: p-value (0.0166) > level of significance (0.01).
Nonlinear in Parameter Models

Dimensionless-group based models in heat and mass transfer:

    Nu = a \, Re^p Pr^q,   Sh = a \, Re^p Sc^q

Reaction rate equations:

    -r_A = k_0 \, e^{-E/RT} C_A^n

Simplified VLE model:

    Y = \frac{ \alpha x }{ 1 + (\alpha - 1) x }

Thermodynamic correlations:

    Antoine equation:     \ln P_v = A - \frac{B}{T + C}
    Van der Waals model:  P = \frac{RT}{V - b} - \frac{a}{V^2}
    Redlich-Kwong model:  P = \frac{RT}{V - b} - \frac{a}{\sqrt{T} \, V (V + b)}
Nonlinear-in-Parameter Models

Abstract model form:

    Y = g( x, \theta^* ) + \epsilon,   x = [ x_1 \; x_2 \; .... \; x_m ]^T,   \theta = [ \theta_1 \; \theta_2 \; .... \; \theta_n ]^T

where \theta^* denotes the true parameters. Defining the model residual e_i(\theta) = y_i - g( x_i, \theta ), the parameter estimation problem is:

    Ordinary least squares:  \hat{\theta}_{OLS} = \min_{\theta} \sum_{i=1}^{n} e_i(\theta)^2
    Weighted least squares:  \hat{\theta}_{WLS} = \min_{\theta} \sum_{i=1}^{n} w_i \, e_i(\theta)^2   (w_i > 0 for all i)

The parameter estimation problem has to be solved using numerical optimization tools.
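For a model that is nonlinear in a single parameter, the OLS objective can be minimized with any one-dimensional numerical search. A sketch using ternary search on the hypothetical model y = e^{-\theta x}, with noise-free data so the minimizer coincides with the true \theta:

```python
import math

# OLS for a model nonlinear in its single parameter: y = exp(-theta * x).
# A derivative-free ternary search stands in for a general optimizer.
xs = [0.5, 1.0, 2.0, 3.0, 4.0]
ys = [math.exp(-0.5 * x) for x in xs]      # noise-free, true theta = 0.5

def sse(theta):
    """OLS objective: sum of squared model residuals."""
    return sum((y - math.exp(-theta * x)) ** 2 for x, y in zip(xs, ys))

lo, hi = 0.0, 2.0
for _ in range(200):                        # shrink bracket on the unimodal sse
    m1 = lo + (hi - lo) / 3.0
    m2 = hi - (hi - lo) / 3.0
    if sse(m1) < sse(m2):
        hi = m2
    else:
        lo = m1
theta_hat = 0.5 * (lo + hi)
```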
Regression Problem Formulation

Now consider a data set

    S = { ( x_i, y_i ) : i = 1, 2, ...., n }

generated from n independent experiments, and model equations

    Y_i = g( x_i, \theta ) + \epsilon_i   for i = 1, 2, ...., n

It is assumed that each random modeling error \epsilon_i, for i = 1, 2, ...., n, is independent and identically distributed. It is further assumed that

    \epsilon_i \sim N( 0, \sigma^2 )   for i = 1, 2, ...., n
Consequences of Gaussianity

Under the assumption that the errors are normal and i.i.d., we can construct the likelihood function for the unknown parameters \theta as follows:

    L(\theta) = f( e_1, e_2, ...., e_n \,|\, \theta ) = N( e_1 | \theta ) \, N( e_2 | \theta ) \, .... \, N( e_n | \theta )

where

    e_i = y_i - g( x_i, \theta ),   N( e_i | \theta ) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\left( -\frac{ e_i^2 }{ 2 \sigma^2 } \right)

Taking logarithms,

    \ln L(\theta) = -\frac{n}{2} \ln( 2\pi ) - \frac{n}{2} \ln( \sigma^2 ) - \frac{1}{2 \sigma^2} \sum_{i=1}^{n} e_i^2
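The log-likelihood can be written down directly; for fixed \sigma^2, maximizing it is the same as minimizing the sum of squared residuals (the model and data below are illustrative):

```python
import math

def log_likelihood(theta, sigma2, data, g):
    """ln L = -(n/2) ln(2*pi) - (n/2) ln(sigma2) - (1/(2*sigma2)) * sum e_i^2."""
    n = len(data)
    sse = sum((y - g(x, theta)) ** 2 for x, y in data)
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * n * math.log(sigma2)
            - sse / (2.0 * sigma2))

g = lambda x, th: th * x                        # illustrative model g(x, theta)
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]     # generated with theta = 2
```

At fixed sigma2, the theta with the smaller SSE always has the larger log-likelihood, which is the OLS = ML equivalence derived next.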
Maximum Likelihood Estimation

    \hat{\theta}_{ML} = \min_{\theta} \left[ -\ln L(\theta) \right]
                      = \min_{\theta} \left[ \frac{n}{2} \ln( 2\pi ) + \frac{n}{2} \ln( \sigma^2 ) + \frac{1}{2 \sigma^2} \sum_{i=1}^{n} \left( y_i - g( x_i, \theta ) \right)^2 \right]

This implies that the maximum likelihood point estimate of \theta is

    \hat{\theta}_{ML} = \min_{\theta} \sum_{i=1}^{n} \left( y_i - g( x_i, \theta ) \right)^2 = \hat{\theta}_{OLS}

Thus, under the Gaussian assumption, the OLS estimator turns out to be identical to the Maximum Likelihood (ML) estimator.
Gauss-Newton Method

Consider the Taylor-series based approximation in the neighborhood of a guess solution \theta^{(k)}:

    g( x_i, \theta ) \approx g( x_i, \theta^{(k)} ) + \left[ \frac{ \partial g( x_i, \theta^{(k)} ) }{ \partial \theta } \right]^T ( \theta - \theta^{(k)} )

For small \Delta\theta^{(k)} = \theta - \theta^{(k)}, the model equations can be approximated as

    Y_i = g( x_i, \theta^{(k)} ) + ( z_i^{(k)} )^T \Delta\theta^{(k)} + V_i   for i = 1, 2, ...., n

Defining

    \tilde{Y}_i^{(k)} = Y_i - g( x_i, \theta^{(k)} )   and   z_i^{(k)} = \frac{ \partial g( x_i, \theta^{(k)} ) }{ \partial \theta }

we obtain

    \tilde{Y}_i^{(k)} = ( z_i^{(k)} )^T \Delta\theta^{(k)} + V_i   for i = 1, 2, ...., n
Gauss-Newton Method

Stacking the model equations,

    \tilde{Y}^{(k)} = A^{(k)} \Delta\theta^{(k)} + V

    \tilde{Y}^{(k)} = [ \tilde{y}_1^{(k)} \; .... \; \tilde{y}_n^{(k)} ]^T,   A^{(k)} = [ z_1^{(k)} \; .... \; z_n^{(k)} ]^T,   V = [ v_1 \; .... \; v_n ]^T

Under the assumption V \sim N( 0, \sigma^2 I ), the maximum likelihood estimate of \Delta\theta^{(k)} is

    \Delta\hat{\theta}^{(k)} = \left[ ( A^{(k)} )^T A^{(k)} \right]^{-1} ( A^{(k)} )^T \tilde{Y}^{(k)}

Thus, starting from an initial guess \theta^{(0)}, we can generate a new guess for \theta as

    \theta^{(k+1)} = \theta^{(k)} + \Delta\hat{\theta}^{(k)}

and continue the iterations till the following termination criterion is satisfied:

    \left| \Phi( \theta^{(k+1)} ) - \Phi( \theta^{(k)} ) \right| < \epsilon \; (\text{tolerance}),   where   \Phi( \theta ) = \sum_{i=1}^{n} \left( y_i - g( x_i, \theta ) \right)^2
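The iteration above can be sketched in pure Python for a hypothetical two-parameter model g(x, \theta) = \theta_1 e^{-\theta_2 x}: the normal equations (A^{(k)})^T A^{(k)} \Delta\theta = (A^{(k)})^T \tilde{Y}^{(k)} are assembled and the 2x2 system solved by hand. Noise-free data and a nearby initial guess (such as a linearizing transformation would supply) are assumed:

```python
import math

# Gauss-Newton for the hypothetical model g(x, theta) = t1 * exp(-t2 * x).
xs = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(-0.7 * x) for x in xs]   # noise-free, true (2.0, 0.7)

def gauss_newton(t1, t2, iters=50, tol=1e-12):
    for _ in range(iters):
        # Accumulate (A^k)' A^k and (A^k)' Ytilde^k for the 2x2 normal equations
        a11 = a12 = a22 = b1 = b2 = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(-t2 * x)
            z1, z2 = e, -t1 * x * e        # dg/dt1, dg/dt2 at the current guess
            r = y - t1 * e                 # ytilde_i = y_i - g(x_i, theta^k)
            a11 += z1 * z1; a12 += z1 * z2; a22 += z2 * z2
            b1 += z1 * r; b2 += z2 * r
        det = a11 * a22 - a12 * a12
        d1 = (a22 * b1 - a12 * b2) / det   # Delta theta^k by Cramer's rule
        d2 = (a11 * b2 - a12 * b1) / det
        t1, t2 = t1 + d1, t2 + d2          # theta^{k+1} = theta^k + Delta theta^k
        if abs(d1) + abs(d2) < tol:        # termination criterion
            break
    return t1, t2

t1_hat, t2_hat = gauss_newton(1.8, 0.6)    # initial guess near the solution
```

Far from the solution the raw step can overshoot; production codes add damping (e.g. Levenberg-Marquardt) for robustness.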
Covariance of Parameter Estimate

Let \theta^{(N)} represent the optimum solution obtained when the Gauss-Newton method terminates. From the properties of OLS, it follows that

    Cov( \Delta\hat{\theta}^{(N)} ) = \sigma^2 \left[ ( A^{(N)} )^T A^{(N)} \right]^{-1}

An estimate can be constructed as follows:

    \hat{Cov}( \Delta\hat{\theta}^{(N)} ) = \hat{\sigma}^2 \left[ ( A^{(N)} )^T A^{(N)} \right]^{-1},   \hat{\sigma}^2 = \frac{ \hat{V}_N^T \hat{V}_N }{ n - p },   \hat{V}_N = Y - g( x, \hat{\theta}_N )

Since the optimum solution \hat{\theta}_N = \theta^{(N)} + \Delta\hat{\theta}^{(N)} is only a translation of the RV \Delta\hat{\theta}^{(N)}, we can argue that Cov( \hat{\theta}_N ) is identical to that of \Delta\hat{\theta}^{(N)}, i.e.

    \hat{Cov}( \hat{\theta}_N ) = \hat{\sigma}^2 \left[ ( A^{(N)} )^T A^{(N)} \right]^{-1}
Confidence Intervals on Parameters

Let us assume that the ML estimator is unbiased, E[ \hat{\theta}_N ] = \theta^*. Defining P_N = \left[ ( A^{(N)} )^T A^{(N)} \right]^{-1}, from the assumption of Gaussian distribution of V it follows that

    \hat{\theta}_N \sim N( \theta^*, \; \sigma^2 P_N )

Since \hat{V}_N^T \hat{V}_N / \sigma^2 \sim \chi^2( n - p ), it follows that

    \frac{ \hat{\theta}_{N,i} - \theta_i^* }{ \hat{\sigma} \sqrt{ [ P_N ]_{ii} } } \sim T( n - p )

Thus, the (1 - \alpha)100\% confidence interval for \theta_i^* is

    P\left[ \hat{\theta}_{N,i} - t_{\alpha/2, n-p} \, \hat{\sigma} \sqrt{ [ P_N ]_{ii} } \;\le\; \theta_i^* \;\le\; \hat{\theta}_{N,i} + t_{\alpha/2, n-p} \, \hat{\sigma} \sqrt{ [ P_N ]_{ii} } \right] = 1 - \alpha
Linearizing Transformations

In some special cases, a linear-in-parameter form can be derived using variable transformations.

    Nu = a \, Re^p Pr^q   \Rightarrow   \log Nu = \log a + p \log Re + q \log Pr + \tilde{V}

    -r_A = k_0 \, e^{-E/RT} C_A^n   \Rightarrow   \log( -r_A ) = \log k_0 - \frac{E}{R} \left( \frac{1}{T} \right) \log e + n \log C_A + \tilde{V}

Simplified VLE model: defining \tilde{y} = 1/Y and \tilde{x} = 1/x,

    \frac{1}{Y} = \frac{1}{\alpha} \cdot \frac{1}{x} + \frac{\alpha - 1}{\alpha} + \tilde{V}

OLS/WLS methods developed for linear-in-parameter models can be used for estimating parameters of the transformed model.
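The VLE transformation can be exercised end-to-end: generate noise-free data from Y = \alpha x / (1 + (\alpha - 1) x), regress 1/Y on 1/x by OLS, and recover \alpha from the slope 1/\alpha (\alpha = 2.5 is an arbitrary illustrative value):

```python
# Simplified VLE model Y = a*x / (1 + (a-1)*x); transformed model:
# 1/Y = (1/a)*(1/x) + (a-1)/a. Fit the transformed model by OLS.
a_true = 2.5                                   # illustrative value
xs = [0.1, 0.2, 0.3, 0.5, 0.7, 0.9]
ys = [a_true * x / (1.0 + (a_true - 1.0) * x) for x in xs]   # noise-free

xt = [1.0 / x for x in xs]                     # transformed variables
yt = [1.0 / y for y in ys]
n = len(xt)
xbar = sum(xt) / n
ybar = sum(yt) / n
b1 = sum((u - xbar) * (v - ybar) for u, v in zip(xt, yt)) \
     / sum((u - xbar) ** 2 for u in xt)        # slope = 1/a
b0 = ybar - b1 * xbar                          # intercept = (a-1)/a
alpha_hat = 1.0 / b1
```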
Nonlinear in Parameter Models

Difficulty: the original model residual e_i(\theta) cannot be transformed. Solving

    \hat{\tilde{\theta}} = \min_{\tilde{\theta}} \sum_{i=1}^{n} \tilde{e}_i( \tilde{\theta} )^2

for the transformed parameters \tilde{\theta}, and recovering estimates \hat{\theta} from the transformed \hat{\tilde{\theta}}, is NOT equivalent to estimating \hat{\theta} by solving

    \hat{\theta} = \min_{\theta} \sum_{i=1}^{n} e_i( \theta )^2

Parameters estimated using the transformed model serve as a good initial guess for solving the nonlinear optimization problem.
A Fix using WLS

By this approach, we try to approximate the original OLS objective and solve

    \hat{\theta}_{WLS} = \min_{\theta} \sum_{i=1}^{n} w_i \, \tilde{e}_i^2,   with   w_i \approx \frac{ V_i^2 }{ \tilde{V}_i^2 }

Note: \tilde{V}_i is a complex function of V_i; let us denote it as \tilde{V}_i = h_i( V_i ) for i = 1, 2, ...., n. Using a Taylor series expansion in the neighborhood of V_i = 0,

    \tilde{V}_i = h_i( V_i ) \approx h_i( 0 ) + h_i'( 0 ) \, V_i = h_i'( 0 ) \, V_i

Choose

    w_i = \frac{1}{ \left[ h_i'( 0 ) \right]^2 }
8/18/2019 Multi Linear Regression Handout 2x1
57/67
31-03-2016
57
Automation LabIIT Bombay
WLS Example
31-Mar-16 Regression 113
V f f Y
Y Y
f f f Y
p p
p p
xx
xxx
1121
121132
ln ....ln ln ln
ˆ
...ˆ
modeldtransformeand
modelConsider
2
00
21
1
2
1
2
11ii
iiiV i
i
iiiiii
T
pT
n
i
iiT
n
i
iT
T
ywY Y
V
Y Y Y Y V
θ
y yθ
Minv
θ
Minθ
ii
Choose
problemestimationparameterdTransforme
ˆ
ˆln ˆln ˆln ln
....ln
ˆln ln ˆ
,
WLS Example

    \hat{\tilde{\theta}}_{WLS} = \min_{\tilde{\theta}} \sum_{i=1}^{n} y_i^2 \left( \ln y_i - \tilde{z}_i^T \tilde{\theta} \right)^2

i.e. weighted least squares with

    W = diag( y_1^2, y_2^2, ...., y_n^2 ),   \tilde{Y} = [ \ln y_1 \; \ln y_2 \; .... \; \ln y_n ]^T
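The weighted fit can be sketched for the two-parameter case Y = \theta_1 x^{\theta_2}: weighted OLS of \ln y on \ln x with weights w_i = y_i^2. The data below are noise-free, so both weighted and unweighted fits recover the true parameters exactly; with noisy data the weights re-balance the transformed objective toward the original one:

```python
import math

# WLS fit of the transformed model ln y = ln t1 + t2 * ln x, weights w_i = y_i^2,
# for the hypothetical power-law model Y = t1 * x**t2.
t1_true, t2_true = 2.0, 1.5
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [t1_true * x ** t2_true for x in xs]      # noise-free for illustration

w = [y * y for y in ys]                        # w_i = y_i^2
u = [math.log(x) for x in xs]                  # ln x_i
v = [math.log(y) for y in ys]                  # ln y_i
sw = sum(w)
ubar = sum(wi * ui for wi, ui in zip(w, u)) / sw   # weighted means
vbar = sum(wi * vi for wi, vi in zip(w, v)) / sw
t2_hat = sum(wi * (ui - ubar) * (vi - vbar) for wi, ui, vi in zip(w, u, v)) \
         / sum(wi * (ui - ubar) ** 2 for wi, ui in zip(w, u))
t1_hat = math.exp(vbar - t2_hat * ubar)        # recover t1 from the intercept
```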