Linear Regression
Transcript of kti.mff.cuni.cz/~marta/su2a.pdf
Simple Linear Regression
● First, we consider only one dimension $X_1$.
● Regression: we predict a numeric goal Y.
● Linear: we assume a linear relation $Y = f(X)$,
● with intercept and slope: $Y \approx \beta_0 + \beta_1 X$.
● We have training data $(x_1, y_1), \ldots, (x_N, y_N)$.
● We minimize the least squares criterion.
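As a minimal illustration (not from the slides; the simulated data and variable names are assumptions), a least squares fit in R, the language the course examples use:

```r
# simulated training data: one feature x, numeric goal y
set.seed(1)
x <- runif(100, 0, 10)
y <- 2 + 0.5 * x + rnorm(100, sd = 1)

# least squares fit of y ~ beta0 + beta1 * x
fit <- lm(y ~ x)
coef(fit)   # estimated intercept and slope
```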
Residual Sum of Squares (RSS)
● Residuum: the difference between the true and the predicted value of y, $e_i = y_i - \hat{y}_i$.
● i is the observation index, i = 1, ..., N.
● We minimize $RSS = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$, or equivalently MSE(train.data) = RSS/N.
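Continuing the hypothetical fit sketched above, RSS and the training MSE follow directly from the residuals:

```r
rss <- sum(residuals(fit)^2)   # RSS = sum of squared residuals e_i
mse <- rss / length(y)         # training MSE = RSS / N
c(rss, mse)
```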
Linear Regression Coefficient Estimates
● Simple linear regression: $\hat{\beta}_1 = \frac{\sum_{i=1}^{N}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N}(x_i - \bar{x})^2}$, $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$.
● Multivariate linear regression: $\hat{\beta} = (X^T X)^{-1} X^T y$,
● where X denotes the N×(p+1) training matrix ⟨1, x⟩ (a column of ones prepended to the features),
● and y the N-vector of the training goal variable.
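A sketch of the multivariate formula on the simulated data from above; X is built by prepending a column of ones:

```r
X <- cbind(1, x)                              # N x (p+1) matrix <1, x>
beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y  # (X^T X)^{-1} X^T y
beta_hat                                      # matches coef(fit)
```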
Assessing the Accuracy of the Coefficient Estimates
● Different training data lead to different estimates (in the slide figure: red = true model, blue = estimated models).
● The dispersion is characterized by variance (true variance vs. sample variance).
Standard Error, Variance
● For data $x_1, \ldots, x_n$,
● the (sample) variance is $s^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$,
● the (sample) standard deviation is $s = \sqrt{s^2}$;
● it is our estimate of the true value $\sigma$.
● The variance of the mean estimate $\hat{\mu} = \bar{x}$ is $Var(\hat{\mu}) = \sigma^2 / n$.
● Unbiased estimate: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$.
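A quick check of these estimators in R (on the simulated x from above):

```r
n <- length(x)
s2_biased   <- sum((x - mean(x))^2) / n   # 1/n sample variance
s2_unbiased <- var(x)                     # R's var() divides by n-1
se_mean     <- sqrt(s2_unbiased / n)      # SE of the mean estimate
```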
Standard Error of the Parameters
● The standard errors of the parameters are:
$SE(\hat{\beta}_0)^2 = \sigma^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \right]$, $SE(\hat{\beta}_1)^2 = \frac{\sigma^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$,
● where $\sigma^2 = Var(\varepsilon)$.
● We estimate $\sigma$ by the residual standard error $RSE = \sqrt{RSS/(n-2)}$.
● Notice that $SE(\hat{\beta}_1)$ is smaller when the $x_i$ are more spread out (more leverage).
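In R the estimated standard errors appear in the coefficient table of summary(); a sketch with the hypothetical fit from above:

```r
summary(fit)$coefficients   # Estimate, Std. Error, t value, Pr(>|t|)
summary(fit)$sigma          # residual standard error (estimate of sigma)
```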
Hypothesis Testing, Confidence Intervals
● There is an approx. 95% chance that the interval $\hat{\beta}_1 \pm 2 \cdot SE(\hat{\beta}_1)$ will contain the true value of $\beta_1$.
● Similarly for $\beta_0$.
● Hypothesis test:
● assume the null hypothesis $H_0: \beta_1 = 0$ versus the alternative $H_a: \beta_1 \neq 0$.
● What is the probability of the measured $t = \hat{\beta}_1 / SE(\hat{\beta}_1)$ or higher? This is the p-value of the t-test
● with (n-2) degrees of freedom.
● If it is sufficiently low (< 5%), we reject the null hypothesis.
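Both the intervals and the t-tests are available directly in R (again on the hypothetical fit):

```r
confint(fit, level = 0.95)   # approx. 95% intervals beta_hat +- 2*SE
summary(fit)$coefficients    # t value and p-value Pr(>|t|) per parameter
```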
Importance of Features
● If Pr(>|t|) is low, the parameter is significant.
● Usually, the significance level 0.05 is taken;
● to be 'really' sure (e.g. in medicine), 0.001.
● A parameter with a p-value above 0.05 can be non-zero merely due to chance.
Assessing the Accuracy of the Model
● Residual standard error: $RSE = \sqrt{\frac{1}{n-2} RSS} = \sqrt{\frac{1}{n-2}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$,
● the average amount that the response deviates from the true regression line.
● RSE depends on the scale of Y.
● Example (Wage data): mean(wage) = 111.7036, RSE = 41.64581,
● pred.y$fit[7] - pred.y$fit[1] = 8.099244.
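A hedged sketch of how numbers like these could be produced; Wage comes with the ISLR package, but the model formula and the pred.y call here are assumptions, not the slide's exact code:

```r
library(ISLR)                              # Wage data set from the textbook
fit.wage <- lm(wage ~ age, data = Wage)    # assumed model
mean(Wage$wage)                            # scale of the response
summary(fit.wage)$sigma                    # residual standard error (RSE)
pred.y <- predict(fit.wage, se.fit = TRUE)
pred.y$fit[7] - pred.y$fit[1]              # difference of two fitted values
```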
R² Statistic
● The proportion of variance explained: $R^2 = \frac{TSS - RSS}{TSS} = 1 - \frac{RSS}{TSS}$,
● scale independent, always in [0, 1],
● where $TSS = \sum_{i=1}^{n}(y_i - \bar{y})^2$ (total sum of squares) relates to the trivial model, the mean.
● 'Our' wage R² = 0.0043 is very low.
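On the simulated data from above, R² can be computed from RSS and TSS:

```r
tss <- sum((y - mean(y))^2)   # TSS: error of the trivial (mean) model
1 - rss / tss                 # R^2; equals summary(fit)$r.squared
```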
Multiple Linear Regression
● Model: $Y = \beta_0 + \beta_1 X_1 + \ldots + \beta_p X_p + \varepsilon$,
● p – the number of variables (features).
● Minimizing RSS we get the coefficients $\hat{\beta}_0, \ldots, \hat{\beta}_p$.
● Compared with the one-dimensional fits: is advertisement in newspaper important? (A predictor can look significant alone yet not in the multiple model; see the sketch below.)
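A small simulated sketch (invented data; x3 is deliberately irrelevant) showing how a predictor's significance is judged inside a multiple fit:

```r
x1 <- runif(100); x2 <- runif(100); x3 <- runif(100)
y2 <- 1 + 2 * x1 - x2 + rnorm(100, sd = 0.1)   # x3 plays no role
fit.mult <- lm(y2 ~ x1 + x2 + x3)
summary(fit.mult)   # x3 should get a large Pr(>|t|)
```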
Linear Regression – Matrix Form
● We look for a function f in the form $f(X) = X\beta$
● that minimizes RSS: $RSS(\beta) = (y - X\beta)^T (y - X\beta)$.
Linear Regression – Derivation
● We take the derivative of RSS: $\frac{\partial RSS}{\partial \beta} = -2 X^T (y - X\beta)$,
● set it to 0: $X^T (y - X\beta) = 0$,
● and get the solution $\hat{\beta} = (X^T X)^{-1} X^T y$,
● and the prediction $\hat{y} = X\hat{\beta} = X (X^T X)^{-1} X^T y$.
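The prediction step can be checked numerically via the hat matrix (using X, y, fit from the earlier sketch):

```r
H <- X %*% solve(t(X) %*% X) %*% t(X)   # hat matrix: y_hat = H y
y_hat <- H %*% y
max(abs(y_hat - fitted(fit)))           # ~ 0, matches lm()'s fitted values
```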
Collinearity
● Extreme collinearity: $X^T X$ is not invertible.
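A tiny demonstration (invented duplicate column): with perfectly collinear predictors, lm() flags the redundant coefficient as NA because $X^T X$ has no inverse:

```r
x_dup <- x                 # exact copy of x: extreme collinearity
coef(lm(y ~ x + x_dup))    # the x_dup coefficient comes out NA
```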
Correlation of Variables
● Remark 2: with too high a number of predictors p, some are correlated and get a good F-statistic merely due to chance.
● Feature selection: Chapter 6, it is on the schedule.
Patterns in Residuals – Nonlinearity
● A visible pattern in the residuals (e.g. a U-shape against the fitted values) indicates nonlinearity in the data.
Qualitative (discrete) Predictors
● Encoding by 0/1; for a multi-valued predictor we encode each value (except one) separately (dummy variables).
● Example: ethnicity.
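A brief sketch of dummy encoding in R (the three levels are invented for illustration):

```r
eth <- factor(c("A", "B", "C", "A", "B"))   # hypothetical 3-level predictor
model.matrix(~ eth)   # one 0/1 column per level except the baseline "A"
```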
The Estimated Slope is Fixed
● Without an interaction term, all categories share the same slope; only the intercept differs.
Non-linear Models
● There are too many combinations to check;
● if you know what helps, ADD IT – log, exp, product, ... (a sketch follows this list).
● Simplified ideas of nonlinear models:
● splines – piecewise polynomial functions,
● SVM – a trick to check higher-degree polynomials,
● basis functions, trees – piecewise 'kernel, constant',
● stacking – linear regression on trained models,
● and others.
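Reusing the simulated x, y from the first sketch, hand-picked nonlinear terms can be added inside the lm() formula:

```r
fit.nl <- lm(y ~ x + I(x^2) + log(x + 1))   # I() protects arithmetic in formulas
summary(fit.nl)
```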
Non-linear Model
Correlated Observations (residuals)
● Usual with time series;
● it usually leads to underestimating the error.
Non-constant Variance of Error Terms
● Remedies: log transformation, weighted least squares.
Outliers
● An error in the dataset, or a missing predictor?
High Leverage – distant X values
● Leverage statistic: the diagonal of $H = X(X^T X)^{-1} X^T$.
● One-dimensional: $h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n}(x_j - \bar{x})^2}$.
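In R the leverage values are the diagonal of H; a sketch on the earlier fit:

```r
h <- hatvalues(fit)      # leverage statistic per observation
which(h > 2 * mean(h))   # a common rule-of-thumb flag for high leverage
```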
k-NN Regression
● Predict $\hat{f}(x_0)$ as the average of the k nearest training responses: $\hat{f}(x_0) = \frac{1}{k} \sum_{x_i \in N_k(x_0)} y_i$ (see the sketch below).
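A minimal base-R sketch of k-NN regression in one dimension (the helper name knn_reg is invented):

```r
# predict at x0: average the y of the k nearest training points
knn_reg <- function(x0, x, y, k = 3) {
  idx <- order(abs(x - x0))[1:k]   # indices of the k nearest neighbours
  mean(y[idx])
}
knn_reg(5, x, y, k = 5)   # prediction at x0 = 5
```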
Comparison of Lin. Reg. and k-NN
● Almost linear relation – the linear model is better;
● highly nonlinear relation – k-NN is better.