Linear regression models in matrix terms. The regression function in matrix terms.
Linear regression models in matrix terms
The regression function in matrix terms
Simple linear regression function

$$Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \quad \text{for } i = 1, \ldots, n$$

Written out for each observation:

$$\begin{aligned}
Y_1 &= \beta_0 + \beta_1 x_1 + \varepsilon_1 \\
Y_2 &= \beta_0 + \beta_1 x_2 + \varepsilon_2 \\
&\;\;\vdots \\
Y_n &= \beta_0 + \beta_1 x_n + \varepsilon_n
\end{aligned}$$
Simple linear regression function in matrix notation

$$Y = X\beta + \varepsilon$$

$$\underbrace{\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}}_{Y} =
\underbrace{\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}}_{X}
\underbrace{\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}}_{\beta} +
\underbrace{\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}}_{\varepsilon}$$
Definition of a matrix

An r×c matrix is a rectangular array of symbols or numbers arranged in r rows and c columns. A matrix is almost always denoted by a single capital letter in boldface type. For example:

$$A = \begin{bmatrix} 1 & 2 \\ 6 & 3 \end{bmatrix} \qquad
B = \begin{bmatrix} 1 & 0 & 8 & 3.4 \\ 1 & 2 & 9 & 3.1 \\ 1 & 5 & 6 & 2.5 \\ 1 & 1 & 7 & 2.8 \\ 1 & 0 & 4 & 1.9 \end{bmatrix} \qquad
X = \begin{bmatrix} 1 & x_{11} & x_{12} \\ 1 & x_{21} & x_{22} \\ 1 & x_{31} & x_{32} \\ 1 & x_{41} & x_{42} \\ 1 & x_{51} & x_{52} \\ 1 & x_{61} & x_{62} \end{bmatrix}$$
Definition of a vector and a scalar

A column vector is an r×1 matrix, that is, a matrix with only one column. A row vector is a 1×c matrix, that is, a matrix with only one row. For example:

$$q = \begin{bmatrix} 2 \\ 5 \\ 8 \end{bmatrix} \qquad
h = \begin{bmatrix} 21 & 46 & 32 & 90 \end{bmatrix}$$

A 1×1 “matrix” is called a scalar, but it’s just an ordinary number, such as 29 or σ².
Matrix multiplication

$$Y = X\beta + \varepsilon$$

• The Xβ in the regression function is an example of matrix multiplication.
• Two matrices can be multiplied together:
  – Only if the number of columns of the first matrix equals the number of rows of the second matrix.
  – The number of rows of the resulting matrix equals the number of rows of the first matrix.
  – The number of columns of the resulting matrix equals the number of columns of the second matrix.
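These conformability rules can be sketched in a few lines of code. A minimal pure-Python sketch (the helper name `matmul` and the list-of-rows representation are illustrative choices, not from the slides):

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows.

    Raises ValueError unless columns(A) == rows(B); the result
    has rows(A) rows and columns(B) columns.
    """
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2, 3],
     [4, 5, 6]]                  # a 2x3 matrix
B = [[1, 0, 0, 0, 0],
     [0, 1, 0, 0, 0],
     [0, 0, 1, 0, 0]]           # a 3x5 matrix
C = matmul(A, B)
print(len(C), len(C[0]))  # 2 5
```

A 2×3 matrix times a 3×5 matrix yields a 2×5 result, while reversing the order raises the conformability error.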
Matrix multiplication

• If A is a 2×3 matrix and B is a 3×5 matrix, then the matrix multiplication AB is possible. The resulting matrix C = AB has 2 rows and 5 columns.
• Is the matrix multiplication BA possible? No: B has 5 columns but A has only 2 rows.
• If X is an n×p matrix and β is a p×1 column vector, then Xβ is an n×1 column vector.
Matrix multiplication

$$C = AB = \begin{bmatrix} 1 & 9 & 7 \\ 8 & 1 & 2 \end{bmatrix}
\begin{bmatrix} 3 & 2 & 1 & 5 \\ 5 & 4 & 7 & 3 \\ 6 & 9 & 6 & 8 \end{bmatrix}
= \begin{bmatrix} 90 & 101 & 106 & 88 \\ 41 & 38 & 27 & 59 \end{bmatrix}$$

The entry in the ith row and jth column of C is the inner product (element-by-element products added together) of the ith row of A with the jth column of B. For example:

$$\begin{aligned}
c_{11} &= 1(3) + 9(5) + 7(6) = 90 \\
c_{12} &= 1(2) + 9(4) + 7(9) = 101 \\
c_{23} &= 8(1) + 1(7) + 2(6) = 27 \\
c_{24} &= 8(5) + 1(3) + 2(8) = 59
\end{aligned}$$
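The worked example above can be checked directly; a short sketch of the inner-product rule:

```python
A = [[1, 9, 7],
     [8, 1, 2]]
B = [[3, 2, 1, 5],
     [5, 4, 7, 3],
     [6, 9, 6, 8]]

# c_ij = inner product of row i of A with column j of B
C = [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(4)]
     for i in range(2)]
print(C)  # [[90, 101, 106, 88], [41, 38, 27, 59]]
```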
The Xβ multiplication in the simple linear regression setting

$$X\beta = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}
= \begin{bmatrix} \beta_0 + \beta_1 x_1 \\ \beta_0 + \beta_1 x_2 \\ \vdots \\ \beta_0 + \beta_1 x_n \end{bmatrix}$$
Matrix addition

$$Y = X\beta + \varepsilon$$

• The Xβ + ε in the regression function is an example of matrix addition.
• Simply add the corresponding elements of the two matrices.
  – For example, add the entry in the first row, first column of the first matrix with the entry in the first row, first column of the second matrix, and so on.
• Two matrices can be added together only if they have the same number of rows and columns.
Matrix addition

$$C = A + B = \begin{bmatrix} 2 & 4 & 1 \\ 1 & 8 & 7 \\ 3 & 5 & 6 \end{bmatrix}
+ \begin{bmatrix} 7 & 5 & 2 \\ 9 & 3 & 1 \\ 2 & 1 & 8 \end{bmatrix}
= \begin{bmatrix} 9 & 9 & 3 \\ 10 & 11 & 8 \\ 5 & 6 & 14 \end{bmatrix}$$

For example:

$$c_{11} = 2 + 7 = 9 \qquad c_{12} = 4 + 5 = 9 \qquad c_{23} = 7 + 1 = 8$$
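The entrywise rule makes this example a one-liner to verify (a sketch, using the same matrices as above):

```python
A = [[2, 4, 1], [1, 8, 7], [3, 5, 6]]
B = [[7, 5, 2], [9, 3, 1], [2, 1, 8]]

# c_ij = a_ij + b_ij, entry by entry (shapes must match)
C = [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]
print(C)  # [[9, 9, 3], [10, 11, 8], [5, 6, 14]]
```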
The Xβ + ε addition in the simple linear regression setting

$$Y = X\beta + \varepsilon = \begin{bmatrix} \beta_0 + \beta_1 x_1 \\ \beta_0 + \beta_1 x_2 \\ \vdots \\ \beta_0 + \beta_1 x_n \end{bmatrix}
+ \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}
= \begin{bmatrix} \beta_0 + \beta_1 x_1 + \varepsilon_1 \\ \beta_0 + \beta_1 x_2 + \varepsilon_2 \\ \vdots \\ \beta_0 + \beta_1 x_n + \varepsilon_n \end{bmatrix}
= \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}$$
Multiple linear regression function in matrix notation

$$Y = X\beta + \varepsilon$$

$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
= \begin{bmatrix} 1 & x_{11} & x_{12} & x_{13} \\ 1 & x_{21} & x_{22} & x_{23} \\ \vdots & \vdots & \vdots & \vdots \\ 1 & x_{n1} & x_{n2} & x_{n3} \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix}
+ \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$
Least squares estimates of the parameters
Least squares estimates

The p×1 vector containing the estimates of the p parameters can be shown to equal:

$$b = \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_{p-1} \end{bmatrix} = (X'X)^{-1}X'Y$$

where (X'X)-1 is the inverse of the X'X matrix and X' is the transpose of the X matrix.
Definition of the transpose of a matrix

The transpose of a matrix A is a matrix, denoted A' or AT, whose rows are the columns of A and whose columns are the rows of A … all in the same original order.

$$A = \begin{bmatrix} 1 & 5 \\ 4 & 8 \\ 7 & 9 \end{bmatrix} \qquad
A' = A^T = \begin{bmatrix} 1 & 4 & 7 \\ 5 & 8 & 9 \end{bmatrix}$$
The X'X matrix in the simple linear regression setting

$$X'X = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}
= \begin{bmatrix} n & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 \end{bmatrix}$$
Definition of the identity matrix

The (square) n×n identity matrix, denoted In, is a matrix with 1’s on the diagonal and 0’s elsewhere.

$$I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

The identity matrix plays the same role as the number 1 in ordinary arithmetic:

$$\begin{bmatrix} 9 & 7 \\ 4 & 6 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 9 & 7 \\ 4 & 6 \end{bmatrix}$$
Definition of the inverse of a matrix

The inverse A-1 of a square (!!) matrix A is the unique matrix such that:

$$AA^{-1} = A^{-1}A = I$$
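For a 2×2 matrix the inverse has a closed form, which makes the defining property easy to check. A sketch (the helper name `inv2` is an illustrative choice; it assumes the determinant is nonzero):

```python
def inv2(A):
    # inverse of a 2x2 matrix via the adjugate / determinant formula
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular; no inverse exists")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 2], [6, 3]]
Ainv = inv2(A)

# A times its inverse should give the 2x2 identity (up to rounding)
prod = [[sum(A[i][k] * Ainv[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
print(prod)
```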
Least squares estimates in simple linear regression setting

$$b = \begin{bmatrix} b_0 \\ b_1 \end{bmatrix} = (X'X)^{-1}X'Y = \;?$$

$$X'X = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}
= \begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix}$$

| soap (xi) | suds (yi) | xi yi | xi² |
|---|---|---|---|
| 4.0 | 33 | 132.0 | 16.00 |
| 4.5 | 42 | 189.0 | 20.25 |
| 5.0 | 45 | 225.0 | 25.00 |
| 5.5 | 51 | 280.5 | 30.25 |
| 6.0 | 53 | 318.0 | 36.00 |
| 6.5 | 61 | 396.5 | 42.25 |
| 7.0 | 62 | 434.0 | 49.00 |
| **38.5** | **347** | **1975.0** | **218.75** |

Find X'X.
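The column totals in the table deliver X'X directly. A quick sketch using the soap data:

```python
x = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0]   # soap amounts
n = len(x)
sum_x = sum(x)                    # 38.5
sum_x2 = sum(xi ** 2 for xi in x) # 218.75

# X'X = [[n, sum(x)], [sum(x), sum(x^2)]]
XtX = [[n, sum_x], [sum_x, sum_x2]]
print(XtX)  # [[7, 38.5], [38.5, 218.75]]
```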
Least squares estimates in simple linear regression setting

Find the inverse of X'X. It’s very messy to determine inverses by hand. We let computers find inverses for us. Therefore:

$$X'X = \begin{bmatrix} 7 & 38.5 \\ 38.5 & 218.75 \end{bmatrix} \qquad
(X'X)^{-1} = \begin{bmatrix} 4.4643 & -0.78571 \\ -0.78571 & 0.14286 \end{bmatrix}$$
Least squares estimates in simple linear regression setting

$$b = \begin{bmatrix} b_0 \\ b_1 \end{bmatrix} = (X'X)^{-1}X'Y = \;?$$

$$X'Y = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
= \begin{bmatrix} \sum y_i \\ \sum x_i y_i \end{bmatrix}$$

Using the column totals from the soap and suds table above (Σyi = 347 and Σxi yi = 1975), find X'Y.
Least squares estimates in simple linear regression setting

$$b = (X'X)^{-1}X'Y = \begin{bmatrix} 4.4643 & -0.78571 \\ -0.78571 & 0.14286 \end{bmatrix}
\begin{bmatrix} 347 \\ 1975 \end{bmatrix}$$

$$b = \begin{bmatrix} b_0 \\ b_1 \end{bmatrix}
= \begin{bmatrix} 4.4643(347) - 0.78571(1975) \\ -0.78571(347) + 0.14286(1975) \end{bmatrix}
= \begin{bmatrix} -2.67 \\ 9.51 \end{bmatrix}$$

The regression equation is: suds = - 2.68 + 9.50 soap
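The whole slide-by-slide calculation can be replayed in a few lines. A pure-Python sketch, with the 2×2 inverse folded into the determinant formula:

```python
soap = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0]
suds = [33, 42, 45, 51, 53, 61, 62]
n = len(soap)

# entries of X'X and X'Y from the column sums
sx = sum(soap); sx2 = sum(v * v for v in soap)
sy = sum(suds); sxy = sum(a * b for a, b in zip(soap, suds))

# b = (X'X)^{-1} X'Y, using the 2x2 determinant formula
det = n * sx2 - sx * sx               # 49.0
b0 = (sx2 * sy - sx * sxy) / det      # intercept
b1 = (-sx * sy + n * sxy) / det       # slope
print(round(b0, 2), round(b1, 2))  # -2.68 9.5
```

The exact fractions are b0 = -131.25/49 ≈ -2.68 and b1 = 465.5/49 = 9.5, matching the fitted equation above.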
Linear dependence

The columns of the matrix:

$$A = \begin{bmatrix} 1 & 2 & 4 & 1 \\ 2 & 1 & 8 & 6 \\ 3 & 6 & 12 & 3 \end{bmatrix}$$

are linearly dependent, since (at least) one of the columns can be written as a linear combination of the others. Here, the third column is 4 times the first column.

If none of the columns can be written as a linear combination of the others, then we say the columns are linearly independent.
Linear dependence is not always obvious

$$A = \begin{bmatrix} 1 & 4 & 1 \\ 2 & 3 & 1 \\ 3 & 2 & 1 \end{bmatrix}$$

Formally, the columns a1, a2, …, an of an n×n matrix are linearly dependent if there are constants c1, c2, …, cn, not all 0, such that:

$$c_1 a_1 + c_2 a_2 + \cdots + c_n a_n = 0$$

(Here, the first column plus the second column equals 5 times the third column.)
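For the matrix above, the constants (c1, c2, c3) = (1, 1, -5) do the job, which a short sketch can confirm:

```python
A = [[1, 4, 1],
     [2, 3, 1],
     [3, 2, 1]]
cols = list(zip(*A))  # columns a1, a2, a3 of A
c = (1, 1, -5)        # candidate constants, not all zero

# c1*a1 + c2*a2 + c3*a3, computed entry by entry
combo = [sum(ci * col[i] for ci, col in zip(c, cols)) for i in range(3)]
print(combo)  # [0, 0, 0]
```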
Implications of linear dependence on regression
• The inverse of a square matrix exists only if the columns are linearly independent.
• Since the regression estimate b depends on (X'X)-1, the parameter estimates b0, b1, …, cannot be (uniquely) determined if some of the columns of X are linearly dependent.
The main point about linear dependence
• If the columns of the X matrix (that is, if two or more of your predictor variables) are linearly dependent (or nearly so), you will run into trouble when trying to estimate the regression function.
Implications of linear dependence on regression

| soap1 | soap2 | suds |
|---|---|---|
| 4.0 | 8 | 33 |
| 4.5 | 9 | 42 |
| 5.0 | 10 | 45 |
| 5.5 | 11 | 51 |
| 6.0 | 12 | 53 |
| 6.5 | 13 | 61 |
| 7.0 | 14 | 62 |

* soap2 is highly correlated with other X variables
* soap2 has been removed from the equation

The regression equation is: suds = - 2.68 + 9.50 soap1
Fitted values and residuals
Fitted values

$$\hat{y} = \begin{bmatrix} \hat{y}_1 \\ \hat{y}_2 \\ \vdots \\ \hat{y}_n \end{bmatrix}
= \begin{bmatrix} b_0 + b_1 x_1 \\ b_0 + b_1 x_2 \\ \vdots \\ b_0 + b_1 x_n \end{bmatrix}$$
Fitted values

The vector of fitted values

$$\hat{y} = Xb = X(X'X)^{-1}X'y$$

is sometimes represented as a function of the hat matrix H:

$$H = X(X'X)^{-1}X'$$

That is:

$$\hat{y} = X(X'X)^{-1}X'y = Hy$$
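The hat matrix for the soap data can be assembled from the pieces already computed, and multiplying it into y reproduces the fitted values b0 + b1·x. A sketch (the helper `matmul` is an inline convenience, not from the slides):

```python
def matmul(A, B):
    # row-by-column matrix product for lists of rows
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

soap = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0]
suds = [[33], [42], [45], [51], [53], [61], [62]]  # y as a column vector

X = [[1.0, x] for x in soap]
Xt = [list(r) for r in zip(*X)]
XtX = matmul(Xt, X)

# invert the 2x2 X'X by the determinant formula
(a, b), (c, d) = XtX
det = a * d - b * c
XtX_inv = [[d / det, -b / det], [-c / det, a / det]]

# H = X (X'X)^{-1} X', then fitted values y_hat = H y
H = matmul(matmul(X, XtX_inv), Xt)
y_hat = matmul(H, suds)
print([round(v[0], 2) for v in y_hat])
```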
The residual vector

$$e_i = y_i - \hat{y}_i \quad \text{for } i = 1, \ldots, n$$

$$e = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}
= \begin{bmatrix} y_1 - \hat{y}_1 \\ y_2 - \hat{y}_2 \\ \vdots \\ y_n - \hat{y}_n \end{bmatrix}
= y - \hat{y}$$
The residual vector written as a function of the hat matrix

$$e = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}
= \begin{bmatrix} y_1 - \hat{y}_1 \\ y_2 - \hat{y}_2 \\ \vdots \\ y_n - \hat{y}_n \end{bmatrix}
= y - \hat{y} = y - Hy = (I - H)y$$
Sum of squares and the analysis of variance table
Analysis of variance table in matrix terms

| Source | DF | SS | MS | F |
|---|---|---|---|---|
| Regression | p-1 | SSR = b'X'Y - (1/n)Y'JY | MSR = SSR/(p-1) | MSR/MSE |
| Error | n-p | SSE = Y'Y - b'X'Y | MSE = SSE/(n-p) | |
| Total | n-1 | SSTO = Y'Y - (1/n)Y'JY | | |
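With b in hand, the three sums of squares for the soap data can be computed and checked against the identity SSTO = SSR + SSE. A sketch in which scalar sums stand in for the matrix products b'X'Y and Y'JY:

```python
soap = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0]
suds = [33, 42, 45, 51, 53, 61, 62]
n = len(suds)
b0, b1 = -131.25 / 49, 9.5  # least squares estimates (exact fractions)

YtY = sum(y * y for y in suds)                 # Y'Y
YJY_over_n = sum(suds) ** 2 / n                # (1/n) Y'JY
btXtY = b0 * sum(suds) + b1 * sum(a * b for a, b in zip(soap, suds))  # b'X'Y

SSTO = YtY - YJY_over_n
SSR = btXtY - YJY_over_n
SSE = YtY - btXtY
print(round(SSTO, 2), round(SSR, 2), round(SSE, 2))
```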
Sum of squares

In general, if you pre-multiply a vector by its transpose, you get a sum of squares:

$$y'y = \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
= y_1^2 + y_2^2 + \cdots + y_n^2 = \sum_{i=1}^{n} y_i^2$$
Error sum of squares

$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
Error sum of squares

$$SSE = (y - \hat{y})'(y - \hat{y})$$
Total sum of squares

Previously, we’d write:

$$SSTO = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}$$

But, it can be shown that equivalently:

$$SSTO = Y'Y - \frac{1}{n} Y'JY$$

where J is a (square) n×n matrix containing all 1’s.
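A sketch verifying, on the suds data, that the matrix form Y'Y - (1/n)Y'JY matches the familiar sum of squared deviations from the mean:

```python
y = [33, 42, 45, 51, 53, 61, 62]
n = len(y)

# classic form: sum of squared deviations from the mean
ybar = sum(y) / n
ssto_classic = sum((yi - ybar) ** 2 for yi in y)

# matrix form: with J all ones, Y'JY is just (sum of y) squared
YtY = sum(yi * yi for yi in y)
YtJY = sum(y) * sum(y)
ssto_matrix = YtY - YtJY / n
print(round(ssto_classic, 4), round(ssto_matrix, 4))
```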
An example of total sum of squares

If n = 2:

$$\sum_{i=1}^{2} (Y_i - \bar{Y})^2 = \sum_{i=1}^{2} Y_i^2 - \frac{\left(\sum_{i=1}^{2} Y_i\right)^2}{2}
= Y_1^2 + Y_2^2 - \frac{(Y_1 + Y_2)^2}{2}$$

But, note that we get the same answer by:

$$Y'Y - \frac{1}{2} Y'JY
= \begin{bmatrix} Y_1 & Y_2 \end{bmatrix} \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}
- \frac{1}{2} \begin{bmatrix} Y_1 & Y_2 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}
= Y_1^2 + Y_2^2 - \frac{(Y_1 + Y_2)^2}{2}$$
Analysis of variance table in matrix terms

| Source | DF | SS | MS | F |
|---|---|---|---|---|
| Regression | p-1 | SSR = b'X'Y - (1/n)Y'JY | MSR = SSR/(p-1) | MSR/MSE |
| Error | n-p | SSE = Y'Y - b'X'Y | MSE = SSE/(n-p) | |
| Total | n-1 | SSTO = Y'Y - (1/n)Y'JY | | |
Model assumptions
Error term assumptions
• As always, the error terms εi are:
  – independent
  – normally distributed (with mean 0)
  – with equal variances σ²
• Now, how can we say the same thing using matrices and vectors?
Error terms as a random vector

The n×1 random error term vector, denoted as ε, is:

$$\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$
The mean (expectation) of the random error term vector

The n×1 mean error term vector, denoted as E(ε), is:

$$E(\varepsilon) = \begin{bmatrix} E(\varepsilon_1) \\ E(\varepsilon_2) \\ \vdots \\ E(\varepsilon_n) \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = 0$$

The first equality is the definition; the second is the assumption that each error term has mean 0.
The variance of the random error term vector

The n×n variance matrix, denoted as σ²(ε), is defined as:

$$\sigma^2(\varepsilon) = \begin{bmatrix}
\sigma^2(\varepsilon_1) & \sigma(\varepsilon_1, \varepsilon_2) & \cdots & \sigma(\varepsilon_1, \varepsilon_n) \\
\sigma(\varepsilon_2, \varepsilon_1) & \sigma^2(\varepsilon_2) & \cdots & \sigma(\varepsilon_2, \varepsilon_n) \\
\vdots & \vdots & \ddots & \vdots \\
\sigma(\varepsilon_n, \varepsilon_1) & \sigma(\varepsilon_n, \varepsilon_2) & \cdots & \sigma^2(\varepsilon_n)
\end{bmatrix}$$

Diagonal elements are variances of the errors. Off-diagonal elements are covariances between errors.
The ASSUMED variance of the random error term vector

BUT, we assume error terms are independent (covariances are 0), and have equal variances (σ²):

$$\sigma^2(\varepsilon) = \begin{bmatrix}
\sigma^2 & 0 & \cdots & 0 \\
0 & \sigma^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma^2
\end{bmatrix}$$
Scalar by matrix multiplication

Just multiply each element of the matrix by the scalar. For example:

$$2 \begin{bmatrix} 1 & 4 & 0 \\ 7 & 6 & 5 \\ 1 & 3 & 2 \end{bmatrix}
= \begin{bmatrix} 2 & 8 & 0 \\ 14 & 12 & 10 \\ 2 & 6 & 4 \end{bmatrix}$$
The ASSUMED variance of the random error term vector

$$\sigma^2(\varepsilon) = \begin{bmatrix}
\sigma^2 & 0 & \cdots & 0 \\
0 & \sigma^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma^2
\end{bmatrix}
= \sigma^2 \begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
= \sigma^2 I$$
The general linear regression model

Putting the regression function and assumptions all together, we get:

$$Y = X\beta + \varepsilon$$

where:

• Y is an (n×1) vector of response values
• β is a (p×1) vector of unknown parameters
• X is an (n×p) matrix of predictor values
• ε is an (n×1) vector of independent, normal error terms with mean E(ε) = 0 and variance σ²(ε) = σ²I