
Linear Regression – Linear Least Squares

ME 120 Notes

Gerald Recktenwald

Portland State University

Department of Mechanical Engineering

gerry@pdx.edu


Introduction

Engineers and scientists work with lots of data.

Scientists try to understand the way things behave in the physical world.

Engineers try to use the discoveries of scientists to produce useful products or services.

Given data from a measurement, how can we obtain a simple mathematical model that fits the data? By "fit the data", we mean that the function follows the trend of the data.


Polynomial Curve Fits

[Figure: three example polynomial curve fits: linear, quadratic, and cubic.]

Basic Idea

• Given data set (xi, yi), i = 1, . . . , n

• Find a function y = f(x) that is close to the data

The least squares process avoids guesswork.


Some sample data

x (time, s)    y (velocity, m/s)
1               9
2              21
3              28
4              41
5              47

It is always important to visualize your data.

You should be able to plot this data by hand.


Plot the Data

• Plot x on the horizontal axis and y on the vertical axis.
  ◦ What is the range of the data?
  ◦ Use the range to select an appropriate scale so that the data uses all (or most) of the available paper.
• In this case, x is the independent variable and y is the dependent variable.
• Label the axes.

Suppose that x is a measured value of time, and y is a measured velocity of a ball. A minimal code sketch of these plotting steps follows.
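For readers working in Python, the sketch below (our addition, not part of the original notes) carries out the plotting steps above with Matplotlib, using the sample data from the table:

    import matplotlib.pyplot as plt

    # Sample data: x is time (s), y is velocity (m/s)
    x = [1, 2, 3, 4, 5]
    y = [9, 21, 28, 41, 47]

    plt.plot(x, y, 'o')           # plot points only; do not connect raw data
    plt.xlabel('Time (s)')        # label the axes
    plt.ylabel('Velocity (m/s)')
    plt.xlim(0, 6)                # scale chosen from the range of the data
    plt.ylim(0, 60)
    plt.show()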

Analyze the plot

[Figure: sample data plotted as points, with Time (s) on the horizontal axis and Velocity (m/s) on the vertical axis.]

An equation that represents the data is valuable:

• A simple formula is more compact and reusable than a set of points.
• This data looks linear, so our "fit function" will be y = mx + b.
• The value of the slope or intercept may have physical significance.

Trial and Error: Not a good plan

Pick any two points (x1, y1) and (x2, y2), and solve for m and b:

y1 = mx1 + b
y2 = mx2 + b

Subtract the two equations:

y1 − y2 = mx1 − mx2    (the b terms cancel)

Solve for m:

m = (y1 − y2)/(x1 − x2) = (y2 − y1)/(x2 − x1) = rise/run

Trial and Error: Not a good plan

Now that we know m, we can use either of the two original equations to solve for b:

y1 = mx1 + b    ⇒    b = y1 − mx1

So, by picking two arbitrary data pairs, we can find the equation of a line that is at least related to our data set. However, that is not the best way to solve the problem. Before showing a better way, let's use this simplistic approach.
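A minimal Python sketch of this two-point calculation (our addition; the function name two_point_line is ours):

    def two_point_line(x1, y1, x2, y2):
        """Slope and intercept of the line through two points."""
        m = (y2 - y1) / (x2 - x1)   # rise over run
        b = y1 - m * x1             # from y1 = m*x1 + b
        return m, b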

Simplistic line approximation: Use points 1 and 5

[Figure: sample data with the line through points 1 and 5 drawn on top; Time (s) vs. Velocity (m/s).]

Compute the slope and intercept using points 1 and 5:

m = (47 − 9) m/s / (5 − 1) s = 9.5 m/s²

b = 47 m/s − (9.5 m/s²)(5 s) = −0.5 m/s

Simplistic line approximation: Use points 2 and 4

[Figure: sample data with the line through points 2 and 4 drawn on top; Time (s) vs. Velocity (m/s).]

Compute the slope and intercept using points 2 and 4:

m = (41 − 21) m/s / (4 − 2) s = 10 m/s²

b = 41 m/s − (10 m/s²)(4 s) = 1 m/s

Simplistic line approximation: Use points 2 and 5

[Figure: sample data with the line through points 2 and 5 drawn on top; Time (s) vs. Velocity (m/s).]

Compute the slope and intercept using points 2 and 5:

m = (47 − 21) m/s / (5 − 2) s = 8.667 m/s²

b = 47 m/s − (8.667 m/s²)(5 s) = 3.667 m/s

Simplistic line approximation: So far . . .

Computing m and b from any two points gives values of m and b that depend on the points you choose.

Data                  m      b
(x1, y1), (x5, y5)    9.50   −0.5
(x2, y2), (x4, y4)    10.0    1.0
(x2, y2), (x5, y5)    8.67    3.7

Don't use the simplistic approach, because m and b depend on the choice of data. Instead, use the least squares method, which gives only one slope and one intercept for a given set of data. The short script below reproduces this table.
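Reusing the two_point_line sketch from above, this short loop (our addition) reproduces the table:

    x = [1, 2, 3, 4, 5]
    y = [9, 21, 28, 41, 47]

    # Index pairs are 0-based: points (1,5), (2,4), (2,5) in the slides' numbering
    for i, j in [(0, 4), (1, 3), (1, 4)]:
        m, b = two_point_line(x[i], y[i], x[j], y[j])
        print(f"points {i+1} and {j+1}: m = {m:.2f}, b = {b:.2f}")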

Least Squares Method

• Compute the slope and intercept in a way that minimizes an error (to be defined).
• Use calculus or linear algebra to derive equations for m and b.
• There is only one slope and intercept for a given set of data that satisfies the least squares criterion.

Do not guess m and b! Use least squares!


Least Squares: The Basic Idea

[Figure: data points (xi, yi) with a fit line drawn near them; the curve fit value at xi is mxi + b.]

The best fit line goes near the data, but not through them. The equation of the line is

y = mx + b

The data (xi, yi) are known. m and b are unknown.

The curve fit value at xi is mxi + b. The discrepancy between the known data and the unknown fit function is taken as the vertical distance

yi − (mxi + b)

The error can be positive or negative, so we use the square of the error:

[yi − (mxi + b)]²
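A minimal Python sketch (our addition) that evaluates this sum of squared errors for a candidate m and b on the sample data:

    def total_error(m, b, x, y):
        """Sum of squared vertical distances between data and the line."""
        return sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))

    x = [1, 2, 3, 4, 5]
    y = [9, 21, 28, 41, 47]
    print(total_error(9.5, -0.5, x, y))   # 18.5 for the points-1-and-5 line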

Least Squares Computational Formula

Use calculus to minimize the sum of squares of the errors

Total error in the fit = ∑_{i=1}^{n} [yi − (mxi + b)]²

Minimizing the total error with respect to the two parameters m and b gives

m = (n ∑ xiyi − ∑ xi ∑ yi) / (n ∑ xi² − (∑ xi)²)

b = (∑ yi − m ∑ xi) / n

Notice that b depends on m, so solve for m first. A direct implementation of these formulas appears below.
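The following Python sketch (our addition, not the notes' own code) implements the two formulas directly:

    def least_squares_fit(x, y):
        """Slope m and intercept b of the least squares line fit."""
        n = len(x)
        sum_x  = sum(x)
        sum_y  = sum(y)
        sum_xy = sum(xi * yi for xi, yi in zip(x, y))
        sum_x2 = sum(xi ** 2 for xi in x)
        m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
        b = (sum_y - m * sum_x) / n   # b depends on m, so compute m first
        return m, b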

Summary of Fitting a Line with Least Squares

• Plot the data first

• The slope and intercept of the least squares line fit are

  m = (n ∑ xiyi − ∑ xi ∑ yi) / (n ∑ xi² − (∑ xi)²)

  b = (∑ yi − m ∑ xi) / n

• The Trendline feature of Excel computes the least squares fit, and can fit polynomial and other functions. A cross-check using a standard numerical library appears below.
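As a cross-check (our addition), NumPy's polyfit with degree 1 computes the same least squares line fit, returning the coefficients of y = mx + b:

    import numpy as np

    x = [1, 2, 3, 4, 5]
    y = [9, 21, 28, 41, 47]
    m, b = np.polyfit(x, y, 1)   # degree-1 (linear) least squares fit
    print(m, b)                  # matches the hand formulas: 9.6, 0.4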

Subscript and Summation Notation

The following slides provide a review of subscript and summation notation.

Subscript and Summation Notation

Suppose we have a set of numbers

x1, x2, x3, x4

We can use a variable for the subscript:

xi, i = 1, …, 4

To add the numbers in the set, we can write

s = x1 + x2 + x3 + x4

as

s = ∑_{i=1}^{4} xi

Subscript and Summation Notation

We can put any formula with subscripts inside the summation notation.

Therefore,

∑_{i=1}^{4} xiyi = x1y1 + x2y2 + x3y3 + x4y4

∑_{i=1}^{4} xi² = x1² + x2² + x3² + x4²

The result of a sum is a number, which we can then use in other computations.

Subscript and Summation Notation

The order of operations matters. Thus,

∑_{i=1}^{4} xi² ≠ (∑_{i=1}^{4} xi)²

x1² + x2² + x3² + x4² ≠ (x1 + x2 + x3 + x4)²

A quick numeric check appears below.
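A quick numeric check in Python (our addition), using the set 1, 2, 3, 4:

    x = [1, 2, 3, 4]
    print(sum(xi**2 for xi in x))   # 1 + 4 + 9 + 16 = 30
    print(sum(x)**2)                # (1 + 2 + 3 + 4)² = 100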

Subscript and Summation: Lazy Notation

Sometimes we do not bother to write the range of the set of data. Thus,

s = ∑ xi

implies

s = ∑_{i=1}^{n} xi

where n is understood to mean "however many data are in the set".

Subscript and Summation: Lazy Notation

So, in general,

xi

implies that there is a set of x values

x1, x2, …, xn    or    xi, i = 1, …, n

Practice with Least Squares

Practice!

Given some data (xi, yi), i = 1, …, n:

1. Always plot your data first!
2. Compute the slope and intercept of the least squares line fit:

   m = (n ∑ xiyi − ∑ xi ∑ yi) / (n ∑ xi² − (∑ xi)²)

   b = (∑ yi − m ∑ xi) / n

Sample data:

x (time, s)    y (velocity, m/s)
1               9
2              21
3              28
4              41
5              47

A worked check of the sums appears below.
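If you want to check your hand computation, here is the arithmetic laid out in Python (our addition), with the intermediate sums shown in comments:

    x = [1, 2, 3, 4, 5]
    y = [9, 21, 28, 41, 47]
    n = len(x)                                    # n = 5

    sum_x  = sum(x)                               # 15
    sum_y  = sum(y)                               # 146
    sum_xy = sum(xi*yi for xi, yi in zip(x, y))   # 534
    sum_x2 = sum(xi**2 for xi in x)               # 55

    m = (n*sum_xy - sum_x*sum_y) / (n*sum_x2 - sum_x**2)   # 480/50 = 9.6
    b = (sum_y - m*sum_x) / n                               # 2/5 = 0.4
    print(m, b)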