Post on 02-Jun-2018
8/11/2019 Polynomial Curve Fitting - NG
Navneet Goyal
Department of Computer Science, BITS-Pilani, Pilani Campus, India
Polynomial Curve Fitting
BITS F464 Machine Learning
Seems a very trivial concept!!
All of us know it well!!
Why are we discussing it in a Machine Learning course?
A simple regression problem!!
It motivates a number of key concepts of ML!!
Let's discover
Polynomial Curve Fitting
Polynomial Curve Fitting
Observe a real-valued input variable x
Use x to predict the value of a target variable t
Synthetic data generated from sin(2πx)
Random noise in the target values
[Figure: training data, target variable t against input variable x]
Reference: Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006
Polynomial Curve Fitting
[Figure: target variable t against input variable x]
N observations of x:
x = (x_1, ..., x_N)^T
t = (t_1, ..., t_N)^T
Goal is to exploit the training set to predict the value of t from x
Inherently a difficult problem
Data generation:
N = 10
Spaced uniformly in the range [0, 1]
Generated from sin(2πx) by adding small Gaussian noise
Noise is typically due to unobserved variables
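The data generation described above can be sketched as follows. This is a minimal illustration, not the code used for the slides: the function name `make_sine_data`, the noise standard deviation of 0.3 and the random seed are assumptions chosen for the example.

```python
import numpy as np

def make_sine_data(n_points=10, noise_std=0.3, seed=0):
    """Return (x, t): x spaced uniformly in [0, 1], t = sin(2*pi*x) + Gaussian noise.

    noise_std and seed are illustrative assumptions, not values from the slides.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)           # N inputs, uniformly spaced
    noise = rng.normal(0.0, noise_std, n_points)  # small Gaussian noise on the targets
    t = np.sin(2.0 * np.pi * x) + noise
    return x, t

x_train, t_train = make_sine_data()
```

Each call with the same seed returns the same training set, which makes later comparisons between models repeatable.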
Polynomial Curve Fitting
[Figure: target variable t against input variable x]
y(x, w) = w_0 + w_1 x + w_2 x^2 + ... + w_M x^M
where M is the order of the polynomial
Is a higher value of M better? We'll see shortly!
Coefficients w_0, ..., w_M are denoted by the vector w
Nonlinear function of x, linear function of the coefficients w
Such models are called linear models
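A short sketch of why the polynomial is "linear in w": once the powers of x are collected into a matrix, the prediction is a plain matrix-vector product. The helper names `design_matrix` and `predict` are hypothetical, introduced only for this example.

```python
import numpy as np

def design_matrix(x, M):
    """One row per input value; columns are x**0, x**1, ..., x**M."""
    return np.vander(np.asarray(x, dtype=float), M + 1, increasing=True)

def predict(x, w):
    """Evaluate y(x, w) = sum_j w[j] * x**j as a matrix-vector product."""
    w = np.asarray(w, dtype=float)
    return design_matrix(x, len(w) - 1) @ w

x = np.array([0.0, 0.5, 1.0])
w = np.array([1.0, -2.0, 3.0])   # y = 1 - 2x + 3x^2
y = predict(x, w)
```

Doubling w doubles y (linear in the coefficients), whereas doubling x does not (nonlinear in the input).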
Sum-of-Squares Error Function
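In the notation of the Bishop reference, the sum-of-squares error between the predictions y(x_n, w) and the targets t_n is usually written as follows (the factor 1/2 is a convention that simplifies later derivatives):

```latex
E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \left\{ y(x_n, \mathbf{w}) - t_n \right\}^2
```

E(w) is non-negative and is zero only if the curve passes exactly through every training point.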
Polynomial curve fitting
Choice of M??
Called model selection or model comparison
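A minimal sketch of comparing candidate orders M by least squares. The data settings (noise level 0.3, seed 0) and the helper names are assumptions for illustration; the orders 0, 1, 3 and 9 are the ones shown on the following slides.

```python
import numpy as np

def fit_polynomial(x, t, M):
    """Least-squares coefficients w* for an order-M polynomial."""
    Phi = np.vander(x, M + 1, increasing=True)      # columns x**0 .. x**M
    w_star, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return w_star

def sum_of_squares_error(x, t, w):
    """E(w) = 0.5 * sum of squared residuals."""
    residual = np.vander(x, len(w), increasing=True) @ w - t
    return 0.5 * float(np.sum(residual ** 2))

# Illustrative training set: 10 points from sin(2*pi*x) plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, size=10)

train_error = {M: sum_of_squares_error(x, t, fit_polynomial(x, t, M))
               for M in (0, 1, 3, 9)}
```

The training error can only go down as M grows, because each lower-order polynomial is a special case of the higher-order one. That is why training error alone cannot be used to choose M.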
Polynomial curve fitting
0th Order Polynomial
Poor representation of sin(2πx)
1st Order Polynomial
Poor representation of sin(2πx)
3rd Order Polynomial
Best fit to sin(2πx)
9th Order Polynomial
Over-fit: poor representation of sin(2πx)
Good generalization is the objective
Dependence of generalization performance on M?
Consider a separate test set of 100 points
Calculate E(w*) for both training data and test data
Choose the M that minimizes E(w*) on the test data
Root Mean Square (RMS) error: E_RMS = sqrt(2 E(w*) / N)
Sometimes convenient to use, as division by N allows us to compare data sets of different sizes on an equal footing
Square root ensures E_RMS is measured on the same scale (and in the same units) as the target variable t
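The train/test comparison above can be sketched like this. It is an illustrative setup, not the experiment behind the slides: the noise level, the seed and the choice to draw test inputs uniformly at random are assumptions.

```python
import numpy as np

def fit(x, t, M):
    """Least-squares coefficients for an order-M polynomial."""
    Phi = np.vander(x, M + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return w

def rms_error(x, t, w):
    """E_RMS = sqrt(2 * E(w) / N), i.e. the root of the mean squared residual."""
    residual = np.vander(x, len(w), increasing=True) @ w - t
    return float(np.sqrt(np.mean(residual ** 2)))

rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 10)
t_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.3, 10)
x_test = rng.uniform(0.0, 1.0, 100)              # separate test set of 100 points
t_test = np.sin(2 * np.pi * x_test) + rng.normal(0.0, 0.3, 100)

rms_train, rms_test = {}, {}
for M in range(10):
    w_star = fit(x_train, t_train, M)
    rms_train[M] = rms_error(x_train, t_train, w_star)
    rms_test[M] = rms_error(x_test, t_test, w_star)

best_M = min(rms_test, key=rms_test.get)         # order with the lowest test error
```

Plotting `rms_train` and `rms_test` against M gives the kind of curve discussed on the over-fitting slide: training error keeps falling, while test error need not.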
Polynomial Curve Fitting
Flexibility & Model Complexity
M=0, very rigid!! Only 1 parameter to play with!
Flexibility & Model Complexity
M=1, not so rigid!! 2 parameters to play with!
Over-fitting
For small M (0, 1, 2): too inflexible to handle the oscillations of sin(2πx)
For M = 3 to 8: flexible enough to handle the oscillations of sin(2πx)
For M = 9: too flexible!!
Training error (TE) = 0, generalization error (GE) = high
Why is it happening?
Polynomial Coefficients
Data Set Size
M = 9
- The larger the data set, the more complex the model we can afford to fit to the data
- Rough heuristic: the number of data points should be no less than 5-10 times the number of adaptive parameters in the model
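A small sketch of the data-set-size effect for M = 9. The training sizes 15 and 100, the noise level and the seed are assumptions for illustration; the error is measured against the noise-free curve sin(2πx) on a dense grid.

```python
import numpy as np

def generalization_rms(n_train, M=9, noise_std=0.3, seed=0):
    """Fit an order-M polynomial to n_train noisy points and return its
    RMS distance from the noise-free curve sin(2*pi*x) on a dense grid."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_train)
    t = np.sin(2 * np.pi * x) + rng.normal(0.0, noise_std, n_train)
    w, *_ = np.linalg.lstsq(np.vander(x, M + 1, increasing=True), t, rcond=None)
    x_grid = np.linspace(0.0, 1.0, 200)
    y_grid = np.vander(x_grid, M + 1, increasing=True) @ w
    return float(np.sqrt(np.mean((y_grid - np.sin(2 * np.pi * x_grid)) ** 2)))

sizes = (15, 100)
rms_by_size = {n: generalization_rms(n) for n in sizes}
# M = 9 has 10 adaptive parameters (w_0 .. w_9).
points_per_parameter = {n: n / 10 for n in sizes}
```

With 100 points the model has 10 data points per parameter, which sits at the upper end of the 5-10 heuristic; with 15 points it has only 1.5.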
Over-fitting Problem
Should we limit the number of parameters according to the available training set?
Complexity of the model should depend only on the complexity of the problem!
Least-squares estimation (LSE) is a specific case of maximum likelihood
Over-fitting is a general property of maximum likelihood
The over-fitting problem can be avoided using the Bayesian approach!
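One way to see the link between least squares and maximum likelihood, assuming (as in the Bishop reference) that each target is Gaussian-distributed around y(x, w) with precision β. The symbol β is introduced here for the example and does not appear on the slides.

```latex
p(t \mid x, \mathbf{w}, \beta) = \mathcal{N}\!\left( t \mid y(x, \mathbf{w}), \beta^{-1} \right)

\ln p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta)
  = -\frac{\beta}{2} \sum_{n=1}^{N} \left\{ y(x_n, \mathbf{w}) - t_n \right\}^2
    + \frac{N}{2} \ln \beta - \frac{N}{2} \ln (2\pi)
```

Only the first term depends on w, so maximizing the log likelihood over w is the same as minimizing the sum-of-squares error.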
Over-fitting Problem
In the Bayesian approach, the effective number of parameters adapts automatically to the size of the data set
In the Bayesian approach, models can have more parameters than the number of data points
Penalize large coefficient values
Regularization
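A minimal sketch of penalizing large coefficients, assuming the common quadratic penalty: minimize 0.5 * sum_n (y(x_n, w) - t_n)^2 + 0.5 * lam * ||w||^2. The function name `fit_ridge_polynomial`, the two values of lam and the choice to penalize w_0 along with the other coefficients are assumptions made for this example.

```python
import numpy as np

def fit_ridge_polynomial(x, t, M, lam):
    """Closed-form minimizer of 0.5*||Phi w - t||^2 + 0.5*lam*||w||^2.

    Assumption: every coefficient, including w_0, is penalized.
    """
    Phi = np.vander(x, M + 1, increasing=True)
    A = Phi.T @ Phi + lam * np.eye(M + 1)
    return np.linalg.solve(A, Phi.T @ t)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, 10)

w_weak = fit_ridge_polynomial(x, t, M=9, lam=np.exp(-18.0))  # very light penalty
w_strong = fit_ridge_polynomial(x, t, M=9, lam=1.0)          # heavy penalty

norm_weak = float(np.linalg.norm(w_weak))
norm_strong = float(np.linalg.norm(w_strong))
```

Increasing lam shrinks the coefficient vector, which suppresses the wild oscillations of the unregularized M = 9 fit; too large a value flattens the curve and under-fits.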
Regularization:
Regularization:
Regularization: E_RMS vs. ln λ
Polynomial Coefficients
Take-aways from Polynomial Curve Fitting
Concept of over-fitting
Model complexity & flexibility
Will keep revisiting it from time to time