Basis Expansions and Regularization - Columbia Universitymadigan/DM08/regularization.ppt.pdf ·...

Basis Expansions and Regularization Based on Chapter 5 of Hastie, Tibshirani and Friedman


Basis Expansions and Regularization

Based on Chapter 5 of Hastie, Tibshirani and Friedman


Basis Expansions for Linear Models

$$f(X) = \sum_{m=1}^{M} \beta_m h_m(X)$$

Here the $h_m$’s might be:

• $h_m(X) = X_m$, $m = 1,\dots,p$ recovers the original model

• $h_m(X) = X_j^2$ or $h_m(X) = X_j X_k$

• $h_m(X) = I(L_m \le X_k \le U_m)$, an indicator for a region of $X_k$
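As a minimal sketch of the three kinds of $h_m$ listed above, the following builds an expanded design matrix and fits it by ordinary least squares. The function name `expand_basis`, the explicit intercept column, the interval bounds -0.5 and 0.5, and the simulated data are all assumptions made for illustration; they do not come from the slides.

```python
import numpy as np

def expand_basis(X, lower=-0.5, upper=0.5):
    """Columns: 1, X_j, X_j^2, X_j*X_k (j < k), and I(lower <= X_1 <= upper).

    The intercept and the interval bounds are illustrative choices.
    """
    n, p = X.shape
    cols = [np.ones(n)]
    cols += [X[:, j] for j in range(p)]                      # h_m(X) = X_m
    cols += [X[:, j] ** 2 for j in range(p)]                 # h_m(X) = X_j^2
    cols += [X[:, j] * X[:, k]                               # h_m(X) = X_j X_k
             for j in range(p) for k in range(j + 1, p)]
    cols.append(((X[:, 0] >= lower) & (X[:, 0] <= upper)).astype(float))
    return np.column_stack(cols)

# Simulated data whose signal is a pure interaction X_1 * X_2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 1.0 + X[:, 0] * X[:, 1] + 0.1 * rng.normal(size=200)

H = expand_basis(X)                            # 200 x 7 matrix of h_m(X)
beta, *_ = np.linalg.lstsq(H, y, rcond=None)   # the model stays linear in beta
```

The model is nonlinear in X but still linear in the coefficients, so the usual least squares machinery applies unchanged.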

[Figure: piecewise fits, with the breakpoints between regions labeled “knots”]

Regression Splines

Bottom left panel uses:

$$h_1(X) = 1, \quad h_2(X) = X, \quad h_3(X) = (X - \xi_1)_+, \quad h_4(X) = (X - \xi_2)_+$$

Number of parameters = (3 regions) × (2 params per region) − (2 knots × 1 constraint per knot) = 4
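The four basis functions above can be sketched directly in NumPy. The knot locations (1 and 2), the grid, and the target function below are assumptions chosen so that the recovered coefficients are easy to read off.

```python
import numpy as np

def piecewise_linear_basis(x, knot1, knot2):
    """Basis 1, X, (X - knot1)_+, (X - knot2)_+ for a continuous piecewise-linear fit."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([
        np.ones_like(x),
        x,
        np.maximum(x - knot1, 0.0),   # (X - xi_1)_+
        np.maximum(x - knot2, 0.0),   # (X - xi_2)_+
    ])

x = np.linspace(0.0, 3.0, 31)
B = piecewise_linear_basis(x, 1.0, 2.0)

# A continuous piecewise-linear target with slopes 1, -1, 2 on the three regions.
y = x - 2.0 * np.maximum(x - 1.0, 0.0) + 3.0 * np.maximum(x - 2.0, 0.0)
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
```

Each truncated term changes the slope at its knot without breaking continuity, which is why only 4 of the 6 unconstrained parameters remain.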

[Figure: cubic spline]

Cubic Spline

$$h_1(X) = 1, \quad h_2(X) = X, \quad h_3(X) = X^2, \quad h_4(X) = X^3, \quad h_5(X) = (X - \xi_1)_+^3, \quad h_6(X) = (X - \xi_2)_+^3$$

Number of parameters = (3 regions) × (4 params per region) − (2 knots × 3 constraints per knot) = 6

The fitted function has continuous first and second derivatives at the knots.

Knot discontinuity essentially invisible to the human eye
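A short sketch of the truncated power basis, written for a general degree so the parameter count can be checked for the cubic case. The function name `truncated_power_basis` and the knot locations are assumptions for illustration.

```python
import numpy as np

def truncated_power_basis(x, knots, degree=3):
    """Columns 1, X, ..., X^degree, then (X - xi_k)_+^degree for each knot."""
    x = np.asarray(x, dtype=float)
    powers = [x ** d for d in range(degree + 1)]
    truncated = [np.maximum(x - k, 0.0) ** degree for k in knots]
    return np.column_stack(powers + truncated)

x = np.linspace(0.0, 3.0, 61)
knots = [1.0, 2.0]
B = truncated_power_basis(x, knots)           # cubic spline basis, 6 columns

# (3 regions) x (4 params per region) - (2 knots) x (3 constraints per knot)
n_params = 3 * 4 - 2 * 3
```

With `degree=1` the same function gives the four-column piecewise-linear basis, matching the earlier count of 4.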


Natural Cubic Spline

Adds a further constraint that the fitted function is linear beyond the boundary knots

A natural cubic spline model with K knots is represented by K basis functions:

$$H_1(X) = 1, \quad H_2(X) = X, \quad H_{k+2}(X) = d_k(X) - d_{K-1}(X),$$

$$\text{where } d_k(X) = \frac{(X - \xi_k)_+^3 - (X - \xi_K)_+^3}{\xi_K - \xi_k}$$

Each of these basis functions has zero 2nd and 3rd derivative outside the boundary knots
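The basis above can be sketched as follows, with a numerical check that every basis function is linear to the right of the last knot (second differences on an equally spaced grid vanish). The function name `natural_cubic_basis` and the knot values are assumptions for illustration.

```python
import numpy as np

def natural_cubic_basis(x, knots):
    """K basis functions: 1, X, and d_k(X) - d_{K-1}(X) for k = 1..K-2."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    K = len(knots)
    last = knots[-1]

    def d(k):
        # d_k(X) with 0-based k, i.e. knots[k] is xi_{k+1} in the slide's notation
        num = np.maximum(x - knots[k], 0.0) ** 3 - np.maximum(x - last, 0.0) ** 3
        return num / (last - knots[k])

    cols = [np.ones_like(x), x]
    d_last = d(K - 2)                 # d_{K-1} in 1-based indexing
    for k in range(K - 2):            # k = 1..K-2 in 1-based indexing
        cols.append(d(k) - d_last)
    return np.column_stack(cols)

knots = [0.0, 1.0, 2.0, 3.0]

# Equally spaced grid strictly beyond the right boundary knot.
x_out = np.linspace(3.5, 6.0, 11)
H_out = natural_cubic_basis(x_out, knots)
second_diff = np.diff(H_out, n=2, axis=0)   # zero for functions linear in x
```

The cubic and quadratic terms of $d_k$ and $d_{K-1}$ cancel beyond $\xi_K$, which is what the vanishing second differences reflect.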


Natural Cubic Spline Models

Can use these ideas in, for example, regression models.

For example, if you use 4 knots and hence 4 basis functions per predictor variable, then simply fit a logistic regression model with four times the number of predictor variables…


Smoothing Splines

Consider this problem: among all functions f(x) with two continuous derivatives, find the one that minimizes the penalized residual sum of squares:

$$\mathrm{RSS}(f, \lambda) = \sum_{i=1}^{N} \{y_i - f(x_i)\}^2 + \lambda \int \{f''(t)\}^2 \, dt$$

where λ is the smoothing parameter.

λ = 0: f can be any function that interpolates the data

λ = infinity: least squares line
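The two limiting cases can be illustrated with a discrete analogue of the criterion. This sketch is not the exact smoothing spline: it replaces the integral of the squared second derivative with a sum of squared second differences on an equally spaced grid (an assumption made to keep the example short), and the data are simulated.

```python
import numpy as np

def second_difference_smoother(y, lam):
    """Minimize sum (y_i - f_i)^2 + lam * sum (f_{i+1} - 2 f_i + f_{i-1})^2."""
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)        # (n-2) x n second-difference matrix
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)                  # equally spaced, assumed
y = np.sin(4 * x) + 0.2 * rng.normal(size=x.size)

f_rough = second_difference_smoother(y, 0.0)   # no penalty: reproduces the data
f_stiff = second_difference_smoother(y, 1e9)   # heavy penalty: nearly a straight line
line = np.polyval(np.polyfit(x, y, 1), x)      # ordinary least squares line
```

With no penalty the fit passes through every observation, and with a very large penalty only functions with zero curvature survive, which leaves the least squares line.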


Smoothing Splines

Theorem: The unique minimizer of this penalized RSS is a natural cubic spline with knots at the unique values of $x_i$, $i = 1,\dots,N$:

$$f(x) = \sum_{j=1}^{N} H_j(x)\,\theta_j$$

Seems like there will be N features and presumably overfitting of the data. But the smoothing term shrinks the model towards the linear fit:

$$\mathrm{RSS}(\theta, \lambda) = (y - H\theta)^T (y - H\theta) + \lambda\, \theta^T \Omega_H \theta$$

where $\{H\}_{ij} = H_j(x_i)$ and $\{\Omega_H\}_{jk} = \int H_j''(t)\, H_k''(t)\, dt$, so that

$$\hat{\theta} = (H^T H + \lambda \Omega_H)^{-1} H^T y$$

This is a generalized ridge regression. Can show that the fitted values are $\hat{f} = H\hat{\theta} = S_\lambda y$ with

$$S_\lambda = (I + \lambda K)^{-1}$$

where K does not depend on λ.
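A small numerical sketch of the generalized ridge form, under stated assumptions: six equally spaced $x_i$ serve as the knots, the natural cubic spline basis is the truncated-power form from the earlier slide, and $\Omega_H$ is integrated with Simpson's rule on each knot interval (exact here, because each $H_j''$ is piecewise linear). The helper names are hypothetical. Because H is square and invertible in this setup, $K = H^{-T} \Omega_H H^{-1}$.

```python
import numpy as np

def basis_and_second_derivative(x, knots):
    """Natural cubic spline basis H_j(x) and its second derivatives H_j''(x)."""
    x = np.asarray(x, dtype=float)
    K, last = len(knots), knots[-1]

    def d(k):
        return (np.maximum(x - knots[k], 0.0) ** 3
                - np.maximum(x - last, 0.0) ** 3) / (last - knots[k])

    def d2(k):  # second derivative of d_k
        return 6.0 * (np.maximum(x - knots[k], 0.0)
                      - np.maximum(x - last, 0.0)) / (last - knots[k])

    H = [np.ones_like(x), x] + [d(k) - d(K - 2) for k in range(K - 2)]
    H2 = [np.zeros_like(x), np.zeros_like(x)] + [d2(k) - d2(K - 2) for k in range(K - 2)]
    return np.column_stack(H), np.column_stack(H2)

def penalty_matrix(knots):
    """Omega_jk = integral of H_j''(t) H_k''(t) dt, by Simpson's rule per knot interval."""
    K = len(knots)
    omega = np.zeros((K, K))
    for a, b in zip(knots[:-1], knots[1:]):
        pts = np.array([a, 0.5 * (a + b), b])
        _, H2 = basis_and_second_derivative(pts, knots)
        w = (b - a) / 6.0 * np.array([1.0, 4.0, 1.0])   # Simpson weights
        omega += H2.T @ (w[:, None] * H2)
    return omega

x = np.linspace(0.0, 1.0, 6)                  # knots at the unique x_i
H, _ = basis_and_second_derivative(x, x)
omega = penalty_matrix(x)

def smoother_matrix(lam):
    """S_lambda = H (H^T H + lam * Omega)^{-1} H^T."""
    return H @ np.linalg.solve(H.T @ H + lam * omega, H.T)

# K = H^{-T} Omega H^{-1}; it is built without reference to lambda.
K_mat = np.linalg.solve(H.T, omega) @ np.linalg.inv(H)
```

Comparing `smoother_matrix(lam)` with `np.linalg.inv(np.eye(6) + lam * K_mat)` for several values of `lam` is one way to see that a single K serves every λ.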

Nonparametric Logistic Regression

Consider logistic regression with a single x:

$$\log \frac{\Pr(Y = 1 \mid X = x)}{\Pr(Y = 0 \mid X = x)} = f(x)$$

and a penalized log-likelihood criterion:

$$\ell(f; \lambda) = \sum_{i=1}^{N} \left[ y_i \log p(x_i) + (1 - y_i) \log(1 - p(x_i)) \right] - \frac{1}{2} \lambda \int \{f''(t)\}^2 \, dt$$

$$= \sum_{i=1}^{N} \left[ y_i f(x_i) - \log\left(1 + e^{f(x_i)}\right) \right] - \frac{1}{2} \lambda \int \{f''(t)\}^2 \, dt$$

where $p(x) = \Pr(Y = 1 \mid X = x)$.

Again can show that the optimal f is a natural spline with knots at the data points.

Can use Newton-Raphson to do the fitting.
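A hedged sketch of the Newton-Raphson iteration for a penalized log-likelihood of this form. To stay short it uses a discrete stand-in for the penalty (squared second differences of the fitted values at equally spaced points) rather than the natural spline basis, and the binary responses are made up; both are assumptions, not the slides' setup.

```python
import numpy as np

def second_difference_penalty(n):
    """P = D^T D, a discrete stand-in for the integrated squared second derivative."""
    D = np.diff(np.eye(n), n=2, axis=0)
    return D.T @ D

def gradient(f, y, lam, P):
    """Gradient of sum[y_i f_i - log(1 + e^{f_i})] - (lam / 2) f^T P f."""
    p = 1.0 / (1.0 + np.exp(-f))
    return y - p - lam * P @ f

def newton_raphson(y, lam, n_iter=30):
    n = len(y)
    P = second_difference_penalty(n)
    f = np.zeros(n)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-f))
        neg_hessian = np.diag(p * (1.0 - p)) + lam * P
        f = f + np.linalg.solve(neg_hessian, gradient(f, y, lam, P))
    return f

# Made-up responses: mostly 1 in the first half, mostly 0 in the second half.
n = 40
idx = np.arange(n)
y = np.where(idx < 20, idx % 4 != 0, idx % 4 == 0).astype(float)

lam = 5.0
f_hat = newton_raphson(y, lam)
p_hat = 1.0 / (1.0 + np.exp(-f_hat))          # fitted probabilities
```

Each step solves a weighted, penalized least squares problem, which is why the procedure is often described as iteratively reweighted least squares.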


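Thin-Plate Splines

The discussion up to this point has been one-dimensional. The higher-dimensional analogues of smoothing splines are “thin-plate splines.” In 2-D, instead of minimizing:

$$\mathrm{RSS}(f, \lambda) = \sum_{i=1}^{N} \{y_i - f(x_i)\}^2 + \lambda \int \{f''(t)\}^2 \, dt$$

minimize:

$$\mathrm{RSS}(f, \lambda) = \sum_{i=1}^{N} \{y_i - f(x_i)\}^2 + \lambda J[f]$$

where

$$J[f] = \iint \left[ \left( \frac{\partial^2 f(x)}{\partial x_1^2} \right)^2 + 2 \left( \frac{\partial^2 f(x)}{\partial x_1 \partial x_2} \right)^2 + \left( \frac{\partial^2 f(x)}{\partial x_2^2} \right)^2 \right] dx_1 \, dx_2$$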


Thin-Plate Splines

The solution has the form:

$$f(x) = \beta_0 + \beta^T x + \sum_{j=1}^{N} \alpha_j h_j(x)$$

where $h_j(x) = \eta(\lVert x - x_j \rVert)$ and $\eta(z) = z^2 \log z^2$, a type of “radial basis function”
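A minimal sketch of a function of this form, fitted as an exact interpolant (the λ = 0 case) in 2-D. Two things here are assumptions not stated on the slide: the side conditions $\sum_j \alpha_j = 0$ and $\sum_j \alpha_j x_j = 0$, which are the standard constraints that make the linear system square and the penalty finite, and the simulated data. Function names are hypothetical.

```python
import numpy as np

def eta(z):
    """eta(z) = z^2 log z^2, with eta(0) = 0 by continuity."""
    z = np.asarray(z, dtype=float)
    out = np.zeros_like(z)
    pos = z > 0
    out[pos] = z[pos] ** 2 * np.log(z[pos] ** 2)
    return out

def radial_features(x, centers):
    """Matrix of h_j(x_i) = eta(||x_i - x_j||)."""
    dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    return eta(dist)

def fit_thin_plate_interpolant(X, y):
    """Solve [[E, T], [T^T, 0]] [alpha; beta] = [y; 0] for 2-D inputs."""
    n = X.shape[0]
    E = radial_features(X, X)
    T = np.column_stack([np.ones(n), X])       # columns for beta_0 and beta
    A = np.block([[E, T], [T.T, np.zeros((3, 3))]])
    sol = np.linalg.solve(A, np.concatenate([y, np.zeros(3)]))
    return sol[:n], sol[n:]                    # alpha, (beta_0, beta_1, beta_2)

def predict(x_new, X, alpha, beta):
    """f(x) = beta_0 + beta^T x + sum_j alpha_j h_j(x)."""
    return beta[0] + x_new @ beta[1:] + radial_features(x_new, X) @ alpha

rng = np.random.default_rng(2)
X = rng.uniform(size=(12, 2))                  # simulated 2-D inputs
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2
alpha, beta = fit_thin_plate_interpolant(X, y)
fitted = predict(X, X, alpha, beta)
```

If the responses are exactly linear in x, the radial coefficients α come out as zero and only the linear part remains, mirroring the way the one-dimensional smoothing spline leaves linear functions unpenalized.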
