Basis Expansions and Regularization - Columbia Universitymadigan/DM08/regularization.ppt.pdf ·...

Basis Expansions and Regularization

Based on Chapter 5 of Hastie, Tibshirani and Friedman

Basis Expansions for Linear Models

$$f(X) = \sum_{m=1}^{M} \beta_m h_m(X)$$

Here the $h_m$’s might be:

• $h_m(X) = X_m$, $m = 1, \dots, p$ recovers the original model

• $h_m(X) = X_j^2$ or $h_m(X) = X_j X_k$

• $h_m(X) = I(L_m \le X_k \le U_m)$, where the boundaries $L_m$ and $U_m$ are the “knots”
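The three kinds of $h_m$ listed above can be sketched in a few lines of NumPy. This is a minimal illustration and not taken from the slides: the two-column input, the intercept column, the interval [0, 0.5] for the indicator, and the names `basis_matrix`, `basis_functions` and `true_beta` are all assumptions made for the example.

```python
import numpy as np

def basis_matrix(X, basis_functions):
    """Stack h_1(X), ..., h_M(X) as the columns of an N x M design matrix."""
    return np.column_stack([h(X) for h in basis_functions])

# Hypothetical choices of h_m for a two-column input X.
basis_functions = [
    lambda X: np.ones(len(X)),        # intercept column (an addition for this sketch)
    lambda X: X[:, 0],                # h(X) = X_1
    lambda X: X[:, 1],                # h(X) = X_2
    lambda X: X[:, 0] ** 2,           # h(X) = X_1^2
    lambda X: X[:, 0] * X[:, 1],      # h(X) = X_1 X_2
    # h(X) = I(L <= X_2 <= U) with illustrative limits L = 0, U = 0.5
    lambda X: ((X[:, 1] >= 0.0) & (X[:, 1] <= 0.5)).astype(float),
]

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
H = basis_matrix(X, basis_functions)

# Noiseless synthetic response, so least squares should recover true_beta.
true_beta = np.array([1.0, 2.0, -1.0, 0.5, 3.0, -2.0])
y = H @ true_beta

# The model is linear in the h_m, so ordinary least squares applies unchanged.
beta_hat, *_ = np.linalg.lstsq(H, y, rcond=None)
```

The point of the sketch is that once the $h_m$ are chosen, fitting is an ordinary linear regression on the transformed columns.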

Regression Splines

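Bottom left panel uses the basis:

$$h_1(X) = 1, \quad h_2(X) = X, \quad h_3(X) = (X - \xi_1)_+, \quad h_4(X) = (X - \xi_2)_+$$

where $\xi_1$, $\xi_2$ are the knots and $(\cdot)_+$ denotes the positive part.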

Number of parameters = (3 regions) × (2 params per region) − (2 knots × 1 constraint per knot) = 4
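A continuous piecewise-linear fit with two knots can be sketched as follows. The knot locations, the synthetic target and the function name `linear_spline_basis` are illustrative assumptions; the four columns correspond to the four parameters counted above.

```python
import numpy as np

def linear_spline_basis(x, knots):
    """Columns 1, X, (X - xi_1)_+, (X - xi_2)_+, ... for a continuous piecewise-linear fit."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x), x]
    cols += [np.maximum(x - xi, 0.0) for xi in knots]   # positive-part terms
    return np.column_stack(cols)

knots = [1.0, 2.0]                     # illustrative knot locations
x = np.linspace(0.0, 3.0, 61)
H = linear_spline_basis(x, knots)

# A continuous piecewise-linear target: slope 1, then 0 after x = 1, then 2 after x = 2.
y = x - np.maximum(x - 1.0, 0.0) + 2.0 * np.maximum(x - 2.0, 0.0)

beta_hat, *_ = np.linalg.lstsq(H, y, rcond=None)
```

Each positive-part coefficient is the change in slope at its knot, which is why continuity is built into the basis and needs no explicit constraint.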


Cubic Spline

$$h_1(X) = 1, \quad h_2(X) = X, \quad h_3(X) = X^2, \quad h_4(X) = X^3, \quad h_5(X) = (X - \xi_1)_+^3, \quad h_6(X) = (X - \xi_2)_+^3$$

Number of parameters = (3 regions) × (4 params per region) − (2 knots × 3 constraints per knot) = 6

The fit has continuous first and second derivatives at the knots, so the knot discontinuity is essentially invisible to the human eye.
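As a rough check of the smoothness claim, the sketch below builds the six-column truncated power basis, fits it to synthetic data, and compares the second derivative of the fit just left and right of each knot. The data, knot locations and function names are assumptions made for illustration.

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Columns 1, X, X^2, X^3, (X - xi_1)_+^3, (X - xi_2)_+^3, ..."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.maximum(x - xi, 0.0) ** 3 for xi in knots]
    return np.column_stack(cols)

def cubic_spline_basis_d2(x, knots):
    """Second derivative of each basis column, used to check smoothness at the knots."""
    x = np.asarray(x, dtype=float)
    cols = [np.zeros_like(x), np.zeros_like(x), 2.0 * np.ones_like(x), 6.0 * x]
    cols += [6.0 * np.maximum(x - xi, 0.0) for xi in knots]
    return np.column_stack(cols)

knots = [1.0, 2.0]                                   # illustrative knot locations
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 3.0, 100))
y = np.sin(2.0 * x) + 0.1 * rng.standard_normal(100)  # synthetic data

H = cubic_spline_basis(x, knots)
beta_hat, *_ = np.linalg.lstsq(H, y, rcond=None)

# Jump in the fitted second derivative across each knot (should be negligible).
eps = 1e-8
jumps = [
    (cubic_spline_basis_d2([xi + eps], knots) - cubic_spline_basis_d2([xi - eps], knots)) @ beta_hat
    for xi in knots
]
```

Only the third derivative jumps at a knot, by six times the coefficient of the corresponding truncated cubic term.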

Natural Cubic Spline

Adds a further constraint that the fitted function is linear beyond the boundary knots.

A natural cubic spline model with K knots is represented by K basis functions:

$$H_1(X) = 1, \quad H_2(X) = X, \quad H_{k+2}(X) = d_k(X) - d_{K-1}(X), \quad k = 1, \dots, K-2,$$

where

$$d_k(X) = \frac{(X - \xi_k)_+^3 - (X - \xi_K)_+^3}{\xi_K - \xi_k}$$

Each of these basis functions has zero 2nd and 3rd derivative outside the boundary knots.
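The natural cubic spline basis can be sketched directly from its definition (1, X, and differences of the scaled truncated cubics $d_k$). The function name and the knot values are illustrative assumptions; the final lines check linearity beyond the right boundary knot via second differences on an evenly spaced grid.

```python
import numpy as np

def natural_cubic_basis(x, knots):
    """N x K matrix with columns H_1 = 1, H_2 = X, H_{k+2} = d_k - d_{K-1}, k = 1..K-2."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    K = len(knots)

    def d(k):
        # k is a 0-based knot index here; knots[K - 1] is the last knot xi_K.
        num = np.maximum(x - knots[k], 0.0) ** 3 - np.maximum(x - knots[K - 1], 0.0) ** 3
        return num / (knots[K - 1] - knots[k])

    d_last = d(K - 2)                                  # d_{K-1} in 1-based notation
    cols = [np.ones_like(x), x]
    cols += [d(k) - d_last for k in range(K - 2)]
    return np.column_stack(cols)

knots = [0.0, 1.0, 2.5, 4.0]                           # illustrative knots, K = 4
H_inside = natural_cubic_basis(np.linspace(0.0, 4.0, 9), knots)

# Beyond the right boundary knot every column should be linear in x,
# so second differences on an evenly spaced grid should vanish.
H_right = natural_cubic_basis(np.linspace(5.0, 8.0, 7), knots)
second_diffs = np.diff(H_right, n=2, axis=0)
```

To the left of the first knot all truncated terms are zero, so linearity there is immediate; the right-hand side is the case worth checking numerically.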

Natural Cubic Spline Models

Can use these ideas in, for example, regression models.

For example, if you use 4 knots and hence 4 basis functions per predictor variable, then simply fit a logistic regression model with four times the number of predictor variables…

Smoothing Splines

Consider this problem: among all functions f(x) with two continuous derivatives, find the one that minimizes the penalized residual sum of squares:

$$RSS(f, \lambda) = \sum_{i=1}^{N} \{y_i - f(x_i)\}^2 + \lambda \int \{f''(t)\}^2 \, dt$$

where λ is the smoothing parameter:

• λ = 0: f can be any function that interpolates the data

• λ = ∞: least squares line
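The two limiting cases can be seen numerically in a discrete analogue of the criterion. The sketch below is not a smoothing spline: it assumes equally spaced $x_i$, replaces the integral of the squared second derivative with a sum of squared second differences of the fitted values, and solves the resulting linear system. The function name and the synthetic data are assumptions.

```python
import numpy as np

def second_difference_smoother(y, lam):
    """Minimize sum (y_i - f_i)^2 + lam * sum (f_{i+1} - 2 f_i + f_{i-1})^2 over f."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)          # (n-2) x n second-difference operator
    # Setting the gradient to zero gives (I + lam * D^T D) f = y.
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 30)                    # equally spaced inputs (assumed)
y = np.sin(4.0 * x) + 0.2 * rng.standard_normal(30)

f_interp = second_difference_smoother(y, 0.0)    # lam = 0: reproduces the data
f_smooth = second_difference_smoother(y, 10.0)   # moderate smoothing
f_linear = second_difference_smoother(y, 1e8)    # very large lam: nearly a straight line

line = np.polyval(np.polyfit(x, y, 1), x)        # ordinary least squares line
```

In this discrete version the fitted values are $(I + \lambda D^T D)^{-1} y$, where the penalty matrix $D^T D$ does not depend on λ. Straight lines have zero second differences, so they are left unpenalized and the fit tends to the least squares line as λ grows.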

Smoothing Splines

Theorem: The unique minimizer of this penalized RSS is a natural cubic spline with knots at the unique values of $x_i$, $i = 1, \dots, N$:

$$f(x) = \sum_{j=1}^{N} H_j(x) \theta_j$$

Seems like there will be N features and presumably overfitting of the data. But the smoothing term shrinks the model towards the linear fit:

$$RSS(\theta, \lambda) = (y - H\theta)^T (y - H\theta) + \lambda \theta^T \Omega_H \theta$$

where $\{H\}_{ij} = H_j(x_i)$ and $\{\Omega_H\}_{jk} = \int H_j''(t) H_k''(t) \, dt$, so that

$$\hat{\theta} = (H^T H + \lambda \Omega_H)^{-1} H^T y$$

This is a generalized ridge regression. The fitted values are $H\hat{\theta} = S_\lambda y$, and one can show that

$$S_\lambda = (I + \lambda K)^{-1}$$

where K does not depend on λ.

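Nonparametric Logistic Regression

Consider logistic regression with a single x:

$$\operatorname{logit} \Pr(Y = 1 \mid X = x) = \log \frac{\Pr(Y = 1 \mid X = x)}{\Pr(Y = 0 \mid X = x)} = f(x)$$

and a penalized log-likelihood criterion:

$$\ell(f; \lambda) = \sum_{i=1}^{N} \left[ y_i \log p(x_i) + (1 - y_i) \log(1 - p(x_i)) \right] - \frac{1}{2} \lambda \int \{f''(t)\}^2 \, dt$$

$$\phantom{\ell(f; \lambda)} = \sum_{i=1}^{N} \left[ y_i f(x_i) - \log\left(1 + e^{f(x_i)}\right) \right] - \frac{1}{2} \lambda \int \{f''(t)\}^2 \, dt$$

where $p(x) = \Pr(Y = 1 \mid X = x)$.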

Again, one can show that the optimal f is a natural spline with knots at the data points.

Can use Newton-Raphson to do the fitting.
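A Newton-Raphson fit can be sketched as follows, assuming the function is written as f = Hθ for some basis matrix H and the penalty as ½ λ θᵀΩθ for some penalty matrix Ω. For brevity the usage below swaps in a discrete stand-in rather than a natural spline basis: H is the identity (one fitted value per equally spaced data point) and Ω is a second-difference penalty. All names and the synthetic data are assumptions.

```python
import numpy as np

def penalized_logistic_newton(H, y, Omega, lam, n_iter=50, tol=1e-10):
    """Maximize sum[y_i f_i - log(1 + exp(f_i))] - 0.5 * lam * theta' Omega theta, f = H theta."""
    theta = np.zeros(H.shape[1])
    for _ in range(n_iter):
        f = H @ theta
        p = 1.0 / (1.0 + np.exp(-f))                   # fitted probabilities
        w = p * (1.0 - p)                              # Newton weights
        grad = H.T @ (y - p) - lam * Omega @ theta     # gradient of the criterion
        neg_hess = H.T @ (w[:, None] * H) + lam * Omega
        step = np.linalg.solve(neg_hess, grad)
        theta = theta + step                           # plain Newton update, no line search
        if np.max(np.abs(step)) < tol:
            break
    return theta

# Synthetic binary data on an equally spaced grid (illustrative only).
rng = np.random.default_rng(3)
n = 100
x = np.linspace(0.0, 1.0, n)
true_f = 2.0 * np.sin(2.0 * np.pi * x)
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-true_f))).astype(float)

# Discrete stand-in for the spline machinery: identity basis, second-difference penalty.
H = np.eye(n)
D = np.diff(np.eye(n), n=2, axis=0)
Omega = D.T @ D
lam = 500.0

theta_hat = penalized_logistic_newton(H, y, Omega, lam)
p_hat = 1.0 / (1.0 + np.exp(-(H @ theta_hat)))
```

Each update solves a penalized weighted least squares problem, so the fit can equally be read as iteratively reweighted least squares. The plain update has no step-size control, which a more careful implementation might add.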

Thin-Plate Splines

The discussion up to this point has been one-dimensional. The higher-dimensional analogues of smoothing splines are “thin-plate splines.” In 2-D, instead of minimizing:

$$RSS(f, \lambda) = \sum_{i=1}^{N} \{y_i - f(x_i)\}^2 + \lambda \int \{f''(t)\}^2 \, dt$$

minimize:

$$RSS(f, \lambda) = \sum_{i=1}^{N} \{y_i - f(x_i)\}^2 + \lambda J[f]$$

where

$$J[f] = \iint \left[ \left( \frac{\partial^2 f(x)}{\partial x_1^2} \right)^2 + 2 \left( \frac{\partial^2 f(x)}{\partial x_1 \partial x_2} \right)^2 + \left( \frac{\partial^2 f(x)}{\partial x_2^2} \right)^2 \right] dx_1 \, dx_2$$

Thin-Plate Splines

The solution has the form:

$$f(x) = \beta_0 + \beta^T x + \sum_{j=1}^{N} \alpha_j h_j(x)$$

where $h_j(x) = \eta(\lVert x - x_j \rVert)$ and $\eta(z) = z^2 \log z^2$, a type of “radial basis function”.
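The form of the solution can be illustrated in the interpolating limit (λ → 0), where the coefficients solve a single linear system. This is a sketch under assumptions: the side condition that the α's are orthogonal to the constant and linear terms is the standard one for this type of radial basis fit but is not stated on the slide, and the function names and synthetic data are illustrative.

```python
import numpy as np

def eta(r2):
    """eta(z) = z^2 log z^2, written in terms of r2 = z^2, with eta(0) = 0."""
    safe = np.where(r2 > 0.0, r2, 1.0)      # avoids log(0); log(1) = 0
    return r2 * np.log(safe)

def sq_dists(A, B):
    """Matrix of squared Euclidean distances between rows of A and rows of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sum(diff**2, axis=-1)

def fit_thin_plate_interpolant(X, y):
    """Solve [[E, T], [T', 0]] [alpha; beta] = [y; 0] with E_ij = eta(||x_i - x_j||)."""
    N, d = X.shape
    E = eta(sq_dists(X, X))
    T = np.column_stack([np.ones(N), X])    # columns for beta_0 and beta
    A = np.block([[E, T], [T.T, np.zeros((d + 1, d + 1))]])
    rhs = np.concatenate([y, np.zeros(d + 1)])
    sol = np.linalg.solve(A, rhs)
    return sol[:N], sol[N:]                 # alpha, (beta_0, beta_1, ..., beta_d)

def predict(X_new, X, alpha, beta):
    """f(x) = beta_0 + beta' x + sum_j alpha_j eta(||x - x_j||)."""
    return eta(sq_dists(X_new, X)) @ alpha + beta[0] + X_new @ beta[1:]

rng = np.random.default_rng(4)
X = rng.uniform(0.0, 1.0, size=(40, 2))     # synthetic 2-D inputs
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2

alpha, beta = fit_thin_plate_interpolant(X, y)
fitted = predict(X, X, alpha, beta)         # should reproduce y at the data points
```

With λ > 0 the system gains a multiple of λ on the diagonal of E; the exact multiple depends on how η is normalized, so it is left out here.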
