Transcript of "Numerical Analysis" lecture slides, khu.ac.kr, 2015 Fall: Linear Least Squares

Page 1

Numerical Analysis

2015 Fall

Linear Least Squares

Page 2

Linear Least-Squares Regression

• Linear least-squares regression is a method to determine the "best" coefficients in a linear model for a given data set.

• "Best" for least-squares regression means minimizing the sum of the squares of the estimate residuals. For a straight-line model, this gives:

S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right)^2

• This method will yield a unique line for a given set of data.

Page 3

Least-Squares Fit of a Straight Line

• Using the model:

y = a_0 + a_1 x

the best fit minimizes the sum of the squares of the estimate residuals:

S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right)^2

• Setting the partial derivatives of S_r with respect to a_0 and a_1 to zero yields the coefficient formulas applied on the next page (see the sketch below).
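A brief sketch of the omitted algebra (standard least-squares derivation, not shown on the slide); the resulting formulas match those used on page 4:

\frac{\partial S_r}{\partial a_0} = -2\sum_{i=1}^{n}\left(y_i - a_0 - a_1 x_i\right) = 0,
\qquad
\frac{\partial S_r}{\partial a_1} = -2\sum_{i=1}^{n} x_i\left(y_i - a_0 - a_1 x_i\right) = 0

which rearrange to the normal equations and their solution:

a_1 = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2},
\qquad
a_0 = \bar{y} - a_1 \bar{x}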

Page 4

Example

i    x_i = v (m/s)    y_i = F (N)    (x_i)^2    x_i y_i
1         10               25           100         250
2         20               70           400        1400
3         30              380           900       11400
4         40              550          1600       22000
5         50              610          2500       30500
6         60             1220          3600       73200
7         70              830          4900       58100
8         80             1450          6400      116000
Σ        360             5135         20400      312850

a_1 = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2} = \frac{8(312850) - (360)(5135)}{8(20400) - (360)^2} = 19.47024

a_0 = \bar{y} - a_1 \bar{x} = 641.875 - 19.47024(45) = -234.2857

F_{est} = -234.2857 + 19.47024\,v
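A minimal MATLAB sketch (not part of the original slides) that reproduces this computation from the tabulated data:

v = [10 20 30 40 50 60 70 80]';          % velocity data (m/s)
F = [25 70 380 550 610 1220 830 1450]';  % force data (N)
n = length(v);
a1 = (n*sum(v.*F) - sum(v)*sum(F)) / (n*sum(v.^2) - sum(v)^2);  % slope
a0 = mean(F) - a1*mean(v);                                      % intercept
fprintf('F_est = %.4f + %.5f*v\n', a0, a1)   % F_est = -234.2857 + 19.47024*v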

Page 5

Example

All of the data sets shown have the same best-fit linear equation and the same coefficient of determination value.

Page 6

Polynomial Regression

• The least-squares procedure from Chapter 13 can be readily extended to fit data to a higher-order polynomial. Again, the idea is to minimize the sum of the squares of the estimate residuals.

• The figure shows the same data fit with:

a) a first-order polynomial

b) a second-order polynomial

Page 7

Polynomial Regression

• For a second-order polynomial, the best fit would mean minimizing:

S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)^2

Page 8

Process and Measures of Fit

• For a second-order polynomial, the best fit would mean minimizing:

S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)^2

• In general, this would mean minimizing:

S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 - \cdots - a_m x_i^m \right)^2

• The standard error for fitting an mth-order polynomial to n data points is:

s_{y/x} = \sqrt{\frac{S_r}{n - (m+1)}}

because the mth-order polynomial has (m+1) coefficients.

• The coefficient of determination r^2 is still found using:

r^2 = \frac{S_t - S_r}{S_t}
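A hedged MATLAB sketch (not from the slides) applying these formulas; it reuses the v-F data from page 4 and fits a second-order polynomial with MATLAB's built-in polyfit:

v = [10 20 30 40 50 60 70 80]';
F = [25 70 380 550 610 1220 830 1450]';
p = polyfit(v, F, 2);             % coefficients [a2 a1 a0], descending powers
Fp = polyval(p, v);               % fitted values at the data points
St = sum((F - mean(F)).^2);       % total sum of squares
Sr = sum((F - Fp).^2);            % residual sum of squares
r2 = 1 - Sr/St;                   % coefficient of determination
syx = sqrt(Sr/(length(v) - 3));   % standard error, n - (m+1) with m = 2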

Page 9

Example

Page 10

Multiple Linear Regression

• Another useful extension of linear regression is the case where y is a linear function of two or more independent variables:

y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_m x_m

Page 11

Multiple Linear Regression

• Another useful extension of linear regression is the case where y is a linear function of two or more independent variables:

y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_m x_m

• Again, the best fit is obtained by minimizing the sum of the squares of the estimate residuals:

S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_{1,i} - a_2 x_{2,i} - \cdots - a_m x_{m,i} \right)^2
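A minimal MATLAB sketch (with small made-up data, not from the slides) of fitting y = a_0 + a_1 x_1 + a_2 x_2 through the normal equations introduced on the following pages:

x1 = [0 2 2.5 1 4 7]';            % first independent variable (hypothetical)
x2 = [0 1 2 3 6 2]';              % second independent variable (hypothetical)
y  = [5 10 9 0 3 27]';            % dependent data (hypothetical)
Z  = [ones(size(x1)) x1 x2];      % one column per term of the model
a  = (Z'*Z)\(Z'*y);               % least-squares coefficients [a0; a1; a2]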

Page 12

General Linear Least Squares

• Linear, polynomial, and multiple linear regression all belong to the general linear least-squares model:

y = a_0 z_0 + a_1 z_1 + a_2 z_2 + \cdots + a_m z_m + e

where z_0, z_1, …, z_m are a set of m+1 basis functions and e is the error of the fit.

• The basis functions can be any functions of the data but cannot contain any of the coefficients a_0, a_1, etc. For example, y = a_0 + a_1 \cos x + a_2 \sin x fits this model: the basis functions are nonlinear in x, but the model is linear in the coefficients.

Page 13

Solving General Linear Least Squares Coefficients

• The equation:

y = a_0 z_0 + a_1 z_1 + a_2 z_2 + \cdots + a_m z_m + e

can be re-written for each data point as a matrix equation:

\{y\} = [Z]\{a\} + \{e\}

where {y} contains the dependent data, {a} contains the coefficients of the equation, {e} contains the error at each point, and [Z] is:

[Z] = \begin{bmatrix} z_{01} & z_{11} & \cdots & z_{m1} \\ z_{02} & z_{12} & \cdots & z_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ z_{0n} & z_{1n} & \cdots & z_{mn} \end{bmatrix}

• with z_{ji} representing the value of the jth basis function calculated at the ith point.
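As a concrete illustration (a sketch, not from the slides): for the basis choice z_0 = 1, z_1 = cos x, z_2 = sin x and column data x, each column of [Z] is one basis function evaluated at every data point:

Z = [ones(size(x)) cos(x) sin(x)]   % columns: z0(x_i), z1(x_i), z2(x_i)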

Page 14

Solving General Linear Least Squares Coefficients

• Generally, [Z] is not a square matrix, so simple inversion cannot be used to solve for {a}. Instead the sum of the squares of the estimate residuals is minimized:

S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left( y_i - \sum_{j=0}^{m} a_j z_{ji} \right)^2

• The outcome of this minimization yields:

[Z]^T [Z] \{a\} = [Z]^T \{y\}
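A brief sketch of the omitted minimization step (standard matrix calculus, not shown on the slide): writing the residuals as \{e\} = \{y\} - [Z]\{a\},

S_r = \left(\{y\} - [Z]\{a\}\right)^T \left(\{y\} - [Z]\{a\}\right),
\qquad
\frac{\partial S_r}{\partial \{a\}} = -2[Z]^T\left(\{y\} - [Z]\{a\}\right) = 0

which rearranges directly to [Z]^T[Z]\{a\} = [Z]^T\{y\}.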

Page 15

MATLAB Example

• Given x and y data in columns, solve for the coefficients of the best-fit parabola y = a_0 + a_1 x + a_2 x^2:

Z = [ones(size(x)) x x.^2]
a = (Z'*Z)\(Z'*y)

– Note also that MATLAB's left-divide computes the least-squares solution directly when the matrix is not square, so a = Z\y would work as well.

• To calculate measures of fit:

St = sum((y-mean(y)).^2)
Sr = sum((y-Z*a).^2)
r2 = 1-Sr/St
syx = sqrt(Sr/(length(x)-length(a)))

Homework

Page 16

Example

[Z]^T [Z] \{a\} = [Z]^T \{y\}

Page 17

Nonlinear Regression

• As seen in the previous chapter, not all fits are linear equations of coefficients and basis functions.

• One method to handle this is to transform the variables and solve for the best fit of the transformed variables (see the sketch after this list). There are two problems with this method:

– Not all equations can be transformed easily or at all.

– The best-fit line represents the best fit for the transformed variables, not the original variables.

• Another method is to perform nonlinear regression to directly determine the least-squares fit.
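A minimal MATLAB sketch of the transformation approach (reusing the v-F data from page 4 and the power model F = a_0 v^{a_1} from page 19; not part of the original slides):

v = [10 20 30 40 50 60 70 80]';
F = [25 70 380 550 610 1220 830 1450]';
% Linearize F = a0*v^a1 by taking logs: log(F) = log(a0) + a1*log(v)
p  = polyfit(log(v), log(F), 1);   % straight-line fit in log-log space
a1 = p(1);                         % exponent of the power model
a0 = exp(p(2));                    % intercept transformed back to original scale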

Page 18

Nonlinear Regression in MATLAB

• To perform nonlinear regression in MATLAB, write a function that returns the sum of the squares of the estimate residuals for a fit and then use MATLAB’s fminsearch function to find the values of the coefficients where a minimum occurs.

• The arguments to the function to compute Sr should be the coefficients, the independent variables, and the dependent variables.

Homework

Page 19

Nonlinear Regression in MATLAB Example

• Given dependent force data F for independent velocity data v, determine the coefficients for the fit:

F = a_0 v^{a_1}

• First, write a function called fSSR.m containing the following:

function f = fSSR(a, xm, ym)
yp = a(1)*xm.^a(2);      % model prediction a0*x^a1
f = sum((ym-yp).^2);     % sum of the squares of the residuals

• Then, use fminsearch in the command window to obtain the values of a that minimize fSSR:

a = fminsearch(@fSSR, [1, 1], [], v, F)

where [1, 1] is an initial guess for the [a0, a1] vector and [] is a placeholder for the options.
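Note (an aside, not on the slide): passing the data through trailing arguments of fminsearch, as above, is an older MATLAB calling convention. The currently documented way is an anonymous function that captures v and F:

a = fminsearch(@(a) fSSR(a, v, F), [1, 1])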

Homework

Page 20

Nonlinear Regression Results

Page 21

Nonlinear Regression Results

• The resulting coefficients will produce the largest r^2 for the data and may be different from the coefficients produced by a transformation.
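A hedged end-to-end sketch (combining fSSR from page 19 with the log-transform fit from page 17, v and F as on page 4) for comparing the two sets of coefficients on the original variables:

% Direct nonlinear fit of F = a0*v^a1
a_direct = fminsearch(@(a) fSSR(a, v, F), [1, 1]);
% Transformed (log-log) fit mapped back to [a0 a1]
p = polyfit(log(v), log(F), 1);
a_trans = [exp(p(2)), p(1)];
% Residual sums of squares in the original variables;
% the direct fit should give the smaller (or equal) Sr
[fSSR(a_direct, v, F), fSSR(a_trans, v, F)]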