02-03-17
1
Linear models
F. Farnir, E. Moyse
Biostatistics & Bioinformatics
Faculty of Vet. Medicine
University of Liege
Outline of the course
• Intro: why use linear models ?
• R as a tool to learn and do statistics
• Matrix language
• Linear model
◦ Basic formulation
◦ Estimating parameters
◦ Testing hypotheses about the parameters
• Examples of linear models
Outline of the course
◦ Simple linear regression
◦ Simple ANOVA
◦ A more complicated example
◦ A more complex situation: repeated measures
• Summary & exercises
Matrix language
• Matrix definition
◦ A matrix M is an n*m table of scalar elements.
◦ The elements of the matrix are noted M(i,j),
where i corresponds to the row in the matrix
(i = 1, 2, …, n) and j to the column in the
matrix (j = 1, 2, …, m)
◦ Example:
if M = [ 10  8  5 ]
       [  6  6  4 ], then M(1,3) = 5.
Matrix language
• Special cases
◦ When m = n, we have a square matrix.
◦ When M is square and M(i,j) = M(j,i) for all j > i, we have a symmetric matrix.
◦ When m (or n) = 1, we have a row (column) vector.
◦ When m = n = 1, the matrix reduces to a scalar (i.e. the usual math objects we are used to playing with…)
◦ When M is square, and M(i,j) = 0 for all i ≠ j, the matrix is diagonal.
Matrix language
• Vector definition using R
> v1<-c(2,4,6,8,10) # Simplest form
> v2<-seq(2,10,2)   # Args: min,max,step
> v3<-1:10          # Successive numbers
• Matrix definition using R
> m1<-matrix(v3,byrow=T,nr=2)
> m1
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    6    7    8    9   10
>
Matrix language
• Matrix operations
◦ Equality: 2 matrices M and N are equal if:
• they have the same dimensions n and m.
• M(i,j) = N(i,j) for all possible values of (i,j).
◦ Example:
[ a  b  c ]   [ 10  8  5 ]
[ d  e  f ] = [  6  6  4 ]
means that a = 10, b = 8, …, f = 4.
Matrix language
• Matrix operations
◦ Transposition: N is the transpose of M if:
• M dimensions are m & n, and N dims. are n & m
• N(i,j) = M(j,i) for all possible values of (i,j).
• The usual notation is M' = N
◦ Example:
M = [ 10  8  5 ]        [ 10  6 ]
    [  6  6  4 ]  ⇒ M' = [  8  6 ]
                        [  5  4 ]
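Transposition is available directly in R as t(); a quick sketch checking the example above:

```r
# Build the example matrix and transpose it
m <- matrix(c(10, 8, 5, 6, 6, 4), byrow=TRUE, nrow=2)
mt <- t(m)        # the 3x2 transpose
mt2 <- t(mt)      # transposing twice recovers the original matrix
```

Note that t() also works on the vectors created above: t(v1) turns a column vector into a 1-row matrix.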
Matrix language
• Matrix operations
◦ Addition: C = M + N if:
• C, M & N have the same dimensions n and m.
• C(i,j) = M(i,j) + N(i,j) for all possible values of (i,j).
◦ Example:
[ 10  8  5 ]   [ -3  -5  -5 ]   [  7  3  0 ]
[  6  6  4 ] + [  9   0   1 ] = [ 15  6  5 ]
◦ Subtraction is defined similarly
Matrix language
• Matrix operations using R
◦ Addition
> m1<-matrix(1:12,byrow=T,nr=3)
> m2<-matrix(1:12,byrow=F,nr=3)
> m<-m1+m2
> m
     [,1] [,2] [,3] [,4]
[1,]    2    6   10   14
[2,]    7   11   15   19
[3,]   12   16   20   24
>
Matrix language
• Matrix operations
◦ Scalar multiplication: N = a*M
• M & N have the same dimensions n and m.
• a is a scalar
• N(i,j) = a*M(i,j) for all possible values of (i,j).
◦ Example:
2 * [ 10  8  5 ]   [ 20  16  10 ]
    [  6  6  4 ] = [ 12  12   8 ]
Matrix language
• Matrix operations using R
◦ Scalar multiplication
> m1<-matrix(1:12,byrow=T,nr=3)
> m2<-3*m1
> m2
     [,1] [,2] [,3] [,4]
[1,]    3    6    9   12
[2,]   15   18   21   24
[3,]   27   30   33   36
>
Matrix language
• Matrix operations
◦ Matrix multiplication: C = M*N
• M & N have compatible dimensions n,m and m,q.
• C dimension is n,q
• C(i,j) = Σk M(i,k)*N(k,j) for all possible values of (i,j).
◦ Example:
[ 10  8  5 ]   [  1 ]   [ 21 ]
[  6  6  4 ] * [  2 ] = [ 14 ]
               [ -1 ]
Matrix language
• Matrix operations using R
◦ Matrix multiplication
> m1<-matrix(1:12,byrow=T,nr=3)
> m2<-matrix(1:12,byrow=T,nr=4)
> m<-m1%*%m2
> m
     [,1] [,2] [,3]
[1,]   70   80   90
[2,]  158  184  210
[3,]  246  288  330
>
Matrix language
• Matrix operations
◦ Matrix multiplication (cont'd)
◦ Another example:
[ 10  8  5 ]   [ 1  0  0 ]   [ 10  8  5 ]
[  6  6  4 ] * [ 0  1  0 ] = [  6  6  4 ]
               [ 0  0  1 ]
• The matrix I with I(i,i) = 1 for all i and I(i,j) = 0 for all i,j, i ≠ j is called the identity matrix (for obvious reasons…)
Matrix language
• Matrix operations using R
◦ Matrix multiplication with identity
> m1<-matrix(1:12,byrow=T,nr=3)
> m2<-diag(4) # Diagonal matrix of dim 4,4
> m1%*%m2
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
>
Matrix language
• Matrix operations
◦ Matrix inversion: N = M⁻¹
• M & N are square and of same dimensions n,n
• M*N = I
• N(i,j) are not easy to compute by hand in general.
• N is unique when it exists
◦ Example:
M = [ 1  2 ]  ⇒  N = M⁻¹ = [ -2.0   1.0 ]        M*N = [ 1  0 ]
    [ 3  4 ]               [  1.5  -0.5 ]  and         [ 0  1 ]
Matrix language
• Matrix operations using R
◦ Matrix inversion
> m<-matrix(1:4,byrow=T,nr=2)
> n<-solve(m) # Computes the inverse of m
> n
     [,1] [,2]
[1,] -2.0  1.0
[2,]  1.5 -0.5
> m%*%n
     [,1] [,2]
[1,]    1    0
[2,]    0    1
>
Matrix language
• Matrix operations
◦ Matrix inversion: questions ?
• Are all matrices invertible ? No
• When rows (or columns) are linear combinations of other rows (or columns) of the matrix, the matrix is not invertible.
• The max # of independent rows and columns is called the rank of the matrix, noted r(M)
• Example:
M = [ 1  2  3 ]
    [ 6  5  4 ]  => M[3,] = 2*M[1,] + M[2,]
    [ 8  9 10 ]
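In R, the rank can be obtained from a QR decomposition (the same qr(...)$rank trick is used later in the test_H function); a quick check on the example above:

```r
# The third row is a linear combination of the first two
m <- matrix(c(1,2,3, 6,5,4, 8,9,10), byrow=TRUE, nrow=3)
r <- qr(m)$rank   # only 2 independent rows, so r(M) = 2
```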
Matrix language
• Matrix operations
◦ Matrix inversion: questions ?
• But such dependencies are not always obvious to see…
• Fortunately, a measure, called the determinant, can be
systematically computed (although not easily…),
with the property that a null determinant indicates
a non-invertible (« singular ») matrix.
Matrix language
• Matrix operations using R
◦ Matrix inversion: singular matrix
> m<-matrix(c(1,2,3,6,5,4,8,9,10),byrow=T,nr=3)
> det(m) # Computes the determinant of m
[1] 0
> solve(m)
Error in solve.default(m) :
  Lapack routine dgesv: system is exactly singular
>
Matrix language
• Matrix operations
◦ Matrix inversion: questions ?
• So, what to do with singular and rectangular matrices ? Use a generalized inverse
◦ G is a generalized inverse of M if M*G*M = M
◦ Example:
M = [ 5  2  3 ]         [ 0   0    0  ]
    [ 2  2  0 ]  ⇒ G1 = [ 0  1/2   0  ]
    [ 3  0  3 ]         [ 0   0   1/3 ]
Matrix language
• Matrix operations using R
◦ Generalized inverse
> m<-matrix(c(5,2,3,2,2,0,3,0,3),byrow=T,nr=3)
> det(m) # Computes the determinant of m
[1] 0
> g1<-matrix(c(0,0,0,0,1/2,0,0,0,1/3),byrow=T,nr=3)
> m%*%g1%*%m
     [,1] [,2] [,3]
[1,]    5    2    3
[2,]    2    2    0
[3,]    3    0    3
>
Matrix language
• Matrix operations using R
◦ Generalized inverse
> m<-matrix(c(5,2,3,2,2,0,3,0,3),byrow=T,nr=3)
> library(MASS)
> g2<-ginv(m) # Penrose generalized inverse
> g2
           [,1]        [,2]        [,3]
[1,] 0.09259259  0.07407407  0.01851852
[2,] 0.07407407  0.25925926 -0.18518519
[3,] 0.01851852 -0.18518519  0.20370370
> m%*%g2%*%m
     [,1] [,2] [,3]
[1,]    5    2    3
[2,]    2    2    0
[3,]    3    0    3
>
Matrix language
• Matrix operations
◦ Matrix inversion: summary
• For square matrices M with det(M) ≠ 0, a unique inverse matrix, noted M⁻¹, exists: solve(M)
• Note that: M*M⁻¹*M = M*I = M
• For square matrices M with det(M) = 0 and for rectangular matrices, an infinite number of generalized inverses, noted G, exist: ginv(M)
Matrix language: an application
• Solving a system of linear equations
◦ Consider the following system:
x + 2*y + z = 8
2*x + y - z = 1
x + 2*z = 7
◦ Using our definitions, we can rewrite it as A*x = b where
    [ 1  2  1 ]       [ x ]       [ 8 ]
A = [ 2  1 -1 ],  x = [ y ],  b = [ 1 ]
    [ 1  0  2 ]       [ z ]       [ 7 ]
Matrix language: an application
• Solving a system of linear equations (cont'd)
◦ Now we can solve using matrix operations:
A⁻¹*A*x = A⁻¹*b
=> I*x = A⁻¹*b
=> x = A⁻¹*b
=> x = x[1] = (A⁻¹*b)[1],
   y = x[2] = (A⁻¹*b)[2],
   z = x[3] = (A⁻¹*b)[3].
Matrix language
• Solving a system of equations using R
> A<-matrix(c(1,2,1,2,1,-1,1,0,2),byrow=T,nr=3)
> b<-c(8,1,7)
> x<-solve(A)%*%b
> x
     [,1]
[1,]    1
[2,]    2
[3,]    3
>
Matrix language: an application
• Solving a system of linear equations
◦ Now, consider the following system:
x + 2*y + z = 8
2*x + y - z = 1
-x + y + 2*z = 7
◦ Again, we can rewrite it as A*x = b where
    [  1  2  1 ]       [ x ]       [ 8 ]
A = [  2  1 -1 ],  x = [ y ],  b = [ 1 ]
    [ -1  1  2 ]       [ z ]       [ 7 ]
Matrix language: an application
• Solving a system of linear equations
◦ But this time, the 3rd eqn is the 1st - the 2nd:
x + 2*y + z = 8
2*x + y - z = 1
-x + y + 2*z = 7    (= eqn 1 - eqn 2)
◦ So, we actually have 2 equations with 3
unknowns, which leads to an infinity of
solutions…
Matrix language: an application
• Solving a system of linear equations
◦ In our matrix writing, this leads to
A[3,] = A[1,] – A[2,]
◦ Accordingly, the inverse of A does not exist
and generalized inverses need to be used !
◦ Since an infinity of generalized inverses can be
obtained, this leads to an infinity of solutions,
as expected !
Matrix language
• Solving a system of equations using R
> A<-matrix(c(1,2,1,2,1,-1,-1,1,2),byrow=T,nr=3)
> b<-c(8,1,7)
> library(MASS)
> x<-ginv(A)%*%b
> x
          [,1]
[1,] 0.3333333
[2,] 2.6666667
[3,] 2.3333333
> y<-c(1,2,3) # Let's show this is another solution
> A%*%y
     [,1]
[1,]    8
[2,]    1
[3,]    7
>
Linear models
• A basic formulation
◦ A model is a linear model if the relationship
between the parameters and the modelled
variable is linear.
◦ Examples:
• Linear regression: y(i) = β0 + β1*x(i) + e(i)
• Quadratic regression: y(i) = β0 + β1*x(i) + β2*x(i)² + e(i)
• One-way ANOVA: y(ij) = µ + a(i) + e(ij)
• Multiple-way ANOVA, other regressions, mixtures
of ANOVA and regressions, …
Linear models
• Matrix formulation: y = X*β + e
◦ y' = (y(1),…,y(n)) = vector of observations
• known (observed, or measured)
◦ β' = (b(1),…,b(m)) = vector of parameters
• unknown, and to be estimated
◦ e' = (e(1),…,e(n)) = vector of residuals
• unknown, but supposed N(0,σ²*I)
◦ X = design (or incidence) matrix, linking the parameters to the observations.
• known
Linear models
• Two main problems:
◦ How do we obtain estimators b for the
parameters β of the model ?
◦ How do we test hypotheses about the
parameters of the model ?
Linear models
• Estimation problem:
◦ The rationale we use to infer estimators is to
choose parameter values that make the errors « as
small as possible »
• To simultaneously reduce all components of e, we
choose to minimize e'*e = Σ e(i)²
• This leads to a « least-squares (of the error) estimate »
of the following form:
b = (X'*X)⁻¹*X'*y
Linear models
• Estimation problem: example 1
◦ Assume we have the following simple problem
10 = 3*β + e1
12 = 2*β + e2
◦ 2 equations – 1 unknown => no solution
◦ We choose an estimator of β that minimizes the
sum of the squared errors:
min (e1² + e2²) = min [(10 – 3*β)² + (12 – 2*β)²]
=> 2*(10-3*b)*(-3) + 2*(12-2*b)*(-2) = 0
=> b = 54/13
Linear models
• Estimation problem: example 1
◦ Using the matrix notation:
y' = (10 12), X' = (3 2),
b = (b), e' = (e1 e2)
◦ X'X = 13 => (X'X)⁻¹ = 1/13
X'y = 3*10 + 2*12 = 54
◦ => b = (X'X)⁻¹ * X'y = 54/13
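The same computation can be checked numerically in R, applying the least-squares formula directly (a small sketch):

```r
# The two "equations": 10 = 3*beta + e1, 12 = 2*beta + e2
y <- c(10, 12)
X <- matrix(c(3, 2), nrow=2)            # one column: the coefficients of beta
b <- solve(t(X) %*% X) %*% t(X) %*% y   # least-squares estimate, equals 54/13
```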
Linear models
• Estimation problem: example 2
◦ The simplest linear model:
yi = µ + ei
◦ In matrix form: y = X*β + e
• y' = (y1 y2 … yn)
• X' = (1 1 … 1)
• β = (µ)
• e' = (e1 e2 … en)
◦ The estimator b of β is then: b = (X'*X)⁻¹*X'*y
Linear models
• Estimation problem: example 2
◦ Let's compute b:
• X'*X = n => (X'*X)⁻¹ = 1/n
• X'*y = y1 + y2 + … + yn = Σ yi
• b = (X'*X)⁻¹ * X'*y = 1/n * Σ yi = m
◦ Let's compute (y-X*b)'*(y-X*b)/(n-r(X))
• (X*b)' = (m m … m)
• (y-X*b)' = (y1-m y2-m … yn-m)
• (y-X*b)'*(y-X*b) = (y1-m)² + (y2-m)² + … + (yn-m)²
= Σ (yi-m)²
• r(X) = # of independent rows of X = 1
Linear models
• Estimation problem: example 2
◦ Consequently:
• (y-X*b)'*(y-X*b)/(n-r(X)) = Σ (yi-m)²/(n-1) = s²
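This can be verified numerically in R: with X a column of ones, the least-squares machinery reproduces the sample mean and the sample variance (the data vector here is just for illustration):

```r
y <- c(4.1, 4.6, 4.9, 5.2, 5.4)
n <- length(y)
X <- matrix(1, nrow=n)                             # X' = (1 1 ... 1)
b <- solve(t(X) %*% X) %*% t(X) %*% y              # = mean(y)
s2 <- t(y - X %*% b) %*% (y - X %*% b) / (n - 1)   # = var(y), since r(X) = 1
```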
Linear models
• Testing problem:
◦ Most of the (null) hypotheses we might want to
test can be written as:
H0: L*β = c
where L and c are a known matrix and vector,
respectively. This is known as the « general linear
hypothesis ».
Linear models
• Testing problem (cont'd):
◦ A general test can be devised, based on an F statistic, for such hypotheses:
• G is a generalized inverse of X'*X. If the hypothesis is « testable », all G provide the same F value
• q = # of independent lines of L
• The hypothesis is of course embedded in the statistic
• The denominator is the estimator of σ².
• Normality assumptions are necessary to obtain the F distribution
Fq,n-r(X) = [(L*b-c)’*(L*G*L’)-1*(L*b-c)/q] / [(y-X*b)’*(y-X*b)/(n-r(X))]
Examples of linear models
• A simple linear regression:
◦ Let's consider the following dataset:
◦ A question of interest: is there a significant
relation between weights and weeks ?
Week Weight
1 4.1
2 4.6
3 4.9
4 5.2
5 5.4
Examples of linear models
• A simple linear regression:
◦ A first answer is to look at a plot of weights
versus weeks. This can be achieved using R:
> weeks<-1:5
> weights<-c(4.1,4.6,4.9,5.2,5.4)
> plot(weeks,weights)
Examples of linear models
• A simple linear regression: (figure: scatter plot of weights versus weeks)
Examples of linear models
• A simple linear regression: there is a clear,
almost linear, increasing trend
◦ This could be modeled using a classical linear
regression:
Y(i) = β0 + β1*X(i) + e(i)
or, using the matrix notation:
Y = X*β + e where β = [ β0 ]
                      [ β1 ]
Examples of linear models
• A simple linear regression:
◦ Computing the estimators b of β.
• X and y are easy to obtain:
    [ b0 ]
b = [ b1 ] = (X'*X)⁻¹ * X'*y

    [ 1  X(1) ]   [ 1  1 ]        [ y(1) ]   [ 4.1 ]
X = [ 1  X(2) ] = [ 1  2 ] ,  y = [ y(2) ] = [ 4.6 ]
    [ ⋮    ⋮  ]   [ ⋮  ⋮ ]        [   ⋮  ]   [  ⋮  ]
    [ 1  X(5) ]   [ 1  5 ]        [ y(5) ]   [ 5.4 ]
Examples of linear models
• A simple linear regression:
◦ Computing the estimators b of β.
> weeks<-1:5
> weights<-c(4.1,4.6,4.9,5.2,5.4)
> Y<-weights
> X<-matrix(c(rep(1,5),weeks),byrow=F,nr=5)
> b<-solve(t(X)%*%X)%*%t(X)%*%Y
> b
     [,1]
[1,] 3.88
[2,] 0.32
> abline(b[1],b[2],col="red")
Examples of linear models
• A simple linear regression:
◦ Computing the estimators b of β. (figure: data with fitted regression line)
Examples of linear models
• A simple linear regression:
◦ Testing a hypothesis: β1 = 0.
• The hypothesis can be put in the form L*β = c
as follows:
• L = (0,1), c = 0
• Next, we can use these elements in the formula
• Note that: q = 1 and r(X) = 2
Fq,n-r(X) = [(L*b-c)’*(L*G*L’)-1*(L*b-c)/q] / [(y-X*b)’*(y-X*b)/(n-r(X))]
Examples of linear models
• A simple linear regression:
◦ Testing a hypothesis: β1 = 0.
> weeks<-1:5
> weights<-c(4.1,4.6,4.9,5.2,5.4)
> Y<-weights
> n<-length(Y)
> X<-matrix(c(rep(1,5),weeks),byrow=F,nr=5)
> G<-solve(t(X)%*%X)
> b<-G%*%t(X)%*%Y
> L<-matrix(c(0,1),nr=1)
> c<-c(0)
> hypo<-L%*%b-c
> numer<-t(hypo)%*%solve(L%*%G%*%t(L))%*%hypo
> denom<-t(Y-X%*%b)%*%(Y-X%*%b)
> F<-(numer/1)/(denom/(n-2))
> pf(F,1,n-2,lower.tail=FALSE)
            [,1]
[1,] 0.001857831
Examples of linear models
• A simpler solution to linear regression:
◦ Testing a hypothesis: β1 = 0.
> weeks<-1:5
> weights<-c(4.1,4.6,4.9,5.2,5.4)
# 'lm' stands for 'linear models'
> lr<-lm(weights~weeks)
> summary(lr)

Call:
lm(formula = weights ~ weeks)

Residuals:
    1     2     3     4     5
-0.10  0.08  0.06  0.04 -0.08
Examples of linear models
• A simpler solution to linear regression (cont'd):
◦ Testing a hypothesis: β1 = 0.
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.88000    0.10132   38.29 3.92e-05 ***
weeks        0.32000    0.03055   10.47  0.00186 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.09661 on 3 degrees of freedom
Multiple R-squared: 0.9734, Adjusted R-squared: 0.9645
F-statistic: 109.7 on 1 and 3 DF, p-value: 0.001858
Examples of linear models
• The classical solution for linear regression:
◦ Testing a hypothesis: β1 = 0.
> weeks<-1:5
> weights<-c(4.1,4.6,4.9,5.2,5.4)
# b = Σ(X-Xm)*(Y-Ym) / Σ(X-Xm)²
> xm<-mean(weeks)
> x<-weeks-xm
> ym<-mean(weights)
> y<-weights-ym
> b<-sum(x*y)/sum(x**2)
> b
[1] 0.32
> SCR<-b*sum(x*y)
> SCT<-sum(y*y)
> SCE<-SCT-SCR
> dfR<-1
> dfE<-length(weights)-2
> F<-(SCR/dfR)/(SCE/dfE)
> pf(F,dfR,dfE,lower.tail=FALSE)
[1] 0.001857831
Examples of linear models
• A simple analysis of variance:
◦ As a second example, consider these data, with
horses' heart rates:
Ardennes Warm Half
106.6 115.4 100.2
100.8 97.8 102.1
110.9 120.3 99.6
114.5 98.2 103.8
115.9 113.2 100.7
91.9 107.6
95.0
Examples of linear models
• A simple analysis of variance:
◦ The question of interest here is: is there a link between heart rates and breed ?
◦ This question can be addressed using an ANOVA, i.e. the following model:
y(ij) = µ + α(i) + e(ij), i = 1, …, 3
or, using the matrix notation:
                      [ µ  ]
y = X*β + e where β = [ αA ]
                      [ αW ]
                      [ αH ]
Examples of linear models
• A simple analysis of variance
◦ Elements of the model
    [ 1  1  0  0 ]       [ 106.6 ]
X = [ 1  1  0  0 ] , y = [ 100.8 ]
    [ ⋮  ⋮  ⋮  ⋮ ]       [   ⋮   ]
    [ 1  0  0  1 ]       [  95.0 ]

         [ 18  6  5  7 ]          [ 1894.5 ]
⇒ X'*X = [  6  6  0  0 ] , X'*y = [  640.6 ]
         [  5  0  5  0 ]          [  544.9 ]
         [  7  0  0  7 ]          [  709.0 ]
Examples of linear models
• A simple analysis of variance
◦ Elements of the model (using R)
# Data
> X<-matrix(c(rep(c(1,1,0,0),6),rep(c(1,0,1,0),5),
+ rep(c(1,0,0,1),7)),byrow=TRUE,nr=18)
> Y<-c(106.6,100.8,110.9,114.5,115.9,91.9,115.4,
+ 97.8,120.3,98.2,113.2,100.2,102.1,99.6,103.8,
+ 100.7,107.6,95.0)
# Parameters estimators
> XX<-t(X)%*%X
> XY<-t(X)%*%Y
Examples of linear models
• A simple analysis of variance
◦ Computing estimators of β
It is easy to see that the X'X matrix is singular (the
first row is equal to the sum of the 3 following
ones) => use a generalized inverse
> library(MASS)
> G1<-ginv(XX)
> b1<-G1%*%XY
> b1
         [,1]
[1,] 79.25810
[2,] 27.50857
[3,] 29.72190
[4,] 22.02762
Examples of linear models
• A simple analysis of variance
◦ Computing estimators of β
Note that another generalized inverse could be
obtained « by hand », by setting the estimator
of µ = 0 (and inverting the remaining diagonal):
       [ 18  6  5  7 ]       [ 0   0    0    0  ]
X'*X = [  6  6  0  0 ]  ⇒ G = [ 0  1/6   0    0  ]
       [  5  0  5  0 ]       [ 0   0   1/5   0  ]
       [  7  0  0  7 ]       [ 0   0    0   1/7 ]
Examples of linear models
• A simple analysis of variance
◦ Computing estimators of β (using R)
# Another G
> G2<-matrix(rep(0,16),nr=4)
> G2[2,2]<-1/6
> G2[3,3]<-1/5
> G2[4,4]<-1/7
# Check generalized inverse
> XX%*%G2%*%XX
     [,1] [,2] [,3] [,4]
[1,]   18    6    5    7
[2,]    6    6    0    0
[3,]    5    0    5    0
[4,]    7    0    0    7
Examples of linear models
• A simple analysis of variance
◦ Computing estimators of β (using R) (cont'd)
# Solutions
> b2<-G2%*%XY
> b2
         [,1]
[1,]   0.0000
[2,] 106.7667
[3,] 108.9800
[4,] 101.2857
# i.e. the 3 breeds' averages
# Observe that:
> b1[1,1]+b1[2,1]
[1] 106.7667
> b2[1,1]+b2[2,1]
[1] 106.7667
Examples of linear models
• A simple analysis of variance
◦ Testing a first hypothesis:
• H0: µA = µW = µH
which can be rewritten as:
H0: (µA = µW & µA = µH) or (µA - µW = 0 & µA - µH = 0)
• In terms of the general linear hypothesis, this can be
written as:
                   [ µ  ]
[ 0  1  -1   0 ] * [ αA ]   [ 0 ]
[ 0  1   0  -1 ]   [ αW ] = [ 0 ]
                   [ αH ]
Examples of linear models
• A simple analysis of variance
◦ Testing a first hypothesis using R:
# Hypothesis 1
> L<-matrix(c(0,1,-1,0,0,1,0,-1),byrow=TRUE,nr=2)
> q<-2
> n<-18
> rX<-3
> num<-(t(L%*%b1)%*%solve(L%*%G1%*%t(L))%*%L%*%b1)/q
> den<-(t(Y-X%*%b1)%*%(Y-X%*%b1))/(n-rX)
# Test the hypothesis
> F<-num/den
> F
         [,1]
[1,] 1.549397
> pf(F,q,n-rX,lower.tail=FALSE)
          [,1]
[1,] 0.2445187
Examples of linear models
• A simple analysis of variance
◦ Easily testing the first hypothesis using R:
# Hypothesis 1
> breed<-factor(c(rep("A",6),rep("W",5),rep("H",7)))
> model<-lm(Y~breed)
> summary(model)

Call:
lm(formula = Y ~ breed)

Residuals:
     Min       1Q   Median       3Q      Max
-14.8667  -4.8964   0.3238   5.7907  11.3200
Examples of linear models
• A simple analysis of variance
◦ Easily testing the first hypothesis using R:
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  106.767      3.225  33.106 1.94e-15 ***
breedW         2.213      4.783   0.463    0.650
breedH        -5.481      4.395  -1.247    0.231
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 7.9 on 15 degrees of freedom
Multiple R-squared: 0.1712, Adjusted R-squared: 0.06071
F-statistic: 1.549 on 2 and 15 DF, p-value: 0.2445
Examples of linear models
• A simple analysis of variance
◦ Testing a second hypothesis using R:
# Hypothesis 2: H0: µ(H)=0.5*(µ(A)+µ(W))
> L<-matrix(c(0,-0.5,-0.5,1),byrow=TRUE,nr=1)
> q<-1
> n<-18
> rX<-3
> num<-(t(L%*%b1)%*%solve(L%*%G1%*%t(L))%*%L%*%b1)/q
> den<-(t(Y-X%*%b1)%*%(Y-X%*%b1))/(n-rX)
# Test the hypothesis
> F<-num/den
> F
         [,1]
[1,] 2.965257
> pf(F,q,n-rX,lower.tail=FALSE)
          [,1]
[1,] 0.1056209
Examples of linear models
• A more complex example:
◦ As a third example, consider this dataset
H St Ge Age HR H St Ge Age HR
1 T M 88 68 8 NT M 64 76
2 T M 96 64 9 NT M 77 75
3 T F 90 76 10 NT F 100 71
4 T F 73 71 11 NT F 75 85
5 T M 85 63 12 NT M 63 81
6 T F 99 63 13 NT M 73 80
7 T F 60 67 14 NT F 67 81
15 NT F 76 83
Examples of linear models
• A more complex example:
◦ Possible questions are:
• Is there any effect of training on the heart rate (HR) ?
• Is there an age effect and/or a gender effect on HR ?
• Is the (potential) training effect similar in males and in
females ?
◦ Answers:
• All these questions can be « easily » addressed using
a linear model…
Examples of linear models
• A more complex example:
◦ The model:
y(ijk) = µ + β*a(ijk) + τi + γj + (τγ)ij + e(ijk)
◦ Using the matrix notation:
y = X*β + e
where β' = (µ, β, τT, τNT, γF, γM, τγTF, τγTM, τγNTF, τγNTM)
Examples of linear models
• A more complex example:
◦ Obtaining X and y
    [ 68 ]       [ 1   88  1  0  0  1  0  1  0  0 ]
    [ 64 ]       [ 1   96  1  0  0  1  0  1  0  0 ]
    [ 76 ]       [ 1   90  1  0  1  0  1  0  0  0 ]
    [ 71 ]       [ 1   73  1  0  1  0  1  0  0  0 ]
    [ 63 ]       [ 1   85  1  0  0  1  0  1  0  0 ]
    [ 63 ]       [ 1   99  1  0  1  0  1  0  0  0 ]
    [ 67 ]       [ 1   60  1  0  1  0  1  0  0  0 ]
y = [ 76 ] , X = [ 1   64  0  1  0  1  0  0  0  1 ]
    [ 75 ]       [ 1   77  0  1  0  1  0  0  0  1 ]
    [ 71 ]       [ 1  100  0  1  1  0  0  0  1  0 ]
    [ 85 ]       [ 1   75  0  1  1  0  0  0  1  0 ]
    [ 81 ]       [ 1   63  0  1  0  1  0  0  0  1 ]
    [ 80 ]       [ 1   73  0  1  0  1  0  0  0  1 ]
    [ 81 ]       [ 1   67  0  1  1  0  0  0  1  0 ]
    [ 83 ]       [ 1   76  0  1  1  0  0  0  1  0 ]
Examples of linear models
• A more complex example:
◦ Performing computations (using R)
# Building matrices
> y<-c(68,64,76,71,63,63,67,76,75,71,85,81,80,81,83)
> X<-matrix(rep(0,150),nr=15)
> X[,1]<-rep(1,15)
> X[,2]<-c(88,96,90,73,85,99,60,64,77,
+ 100,75,63,73,67,76)
> X[,3]<-c(rep(1,7),rep(0,8))
> X[,4]<-1-X[,3]
> X[,5]<-c(0,0,1,1,0,1,1,0,0,1,1,0,0,1,1)
> X[,6]<-1-X[,5]
> X[,7]<-c(X[,5][1:7],rep(0,8))
> X[,8]<-c(X[,6][1:7],rep(0,8))
> X[,9]<-c(rep(0,7),X[,5][8:15])
> X[,10]<-c(rep(0,7),X[,6][8:15])
Examples of linear models
• A more complex example:
◦ Performing computations (using R) (cont'd)
# Computing estimators
> XX<-t(X)%*%X
> Xy<-t(X)%*%y
> G<-ginv(XX)
> b<-G%*%Xy
> b
            [,1]
 [1,] 38.1162191
 [2,] -0.1592766
 [3,] 15.6683053
 [4,] 22.4479138
 [5,] 20.1285345
 [6,] 17.9876846
 [7,]  8.1587098
 [8,]  7.5095955
 [9,] 11.9698247
[10,] 10.4780891
Examples of linear models
• A more complex example:
◦ Remarks:
• Other solutions could be obtained, using other
generalized inverses.
• Since regression parameters are « estimable », the
other solutions would give the same solution for β
(see notes for an example).
Examples of linear models
• A more complex example:
◦ Testing hypotheses (using R)
# test_H: a generic function to test hypotheses
test_H<-function(X,y,L,c) {
  library(MASS)
  XX<-t(X)%*%X
  G<-ginv(t(X)%*%X)
  b<-G%*%t(X)%*%y
  num<-t(L%*%b-c)%*%solve(L%*%G%*%t(L))%*%(L%*%b-c)
  den<-t(y-X%*%b)%*%(y-X%*%b)
  q<-dim(L)[1]
  n<-length(y)
  rX<-qr(XX)$rank
  F<-(num/q)/(den/(n-rX))
  pF<-pf(F,q,n-rX,lower.tail=FALSE)
  c(F,pF)
}
Examples of linear models
• A more complex example:
◦ a) Testing the regression coefficient
# Test: beta=0
> L<-matrix(c(0,1,rep(0,8)),nr=1)
> c<-matrix(c(0))
> test_H(X,y,L,c)
[1] 2.1324558 0.1749022
# No significant regression
◦ b) Testing the training effect
# Test: tau(T)-tau(NT)=0
> L<-matrix(c(0,0,1,-1,rep(0,6)),nr=1)
> c<-matrix(c(0))
> test_H(X,y,L,c)
[1] 14.951074581 0.003126119
# Significant training effect
Examples of linear models
• A word of caution:
◦ When testing the training effect, we actually
compare the means of the 2 groups (Trained –
Not Trained)
◦ The raw means embed information on other
effects of the model, which might not be
desirable…
• This can be shown by replacing the observations by the
assumed model and averaging over each group (see
next slide)
Examples of linear models
• A word of caution:
ȳ(T) = µ + β*ā(T) + τT + (4*γF + 3*γM)/7 + (4*τγTF + 3*τγTM)/7 + ē(T)

ȳ(NT) = µ + β*ā(NT) + τNT + (γF + γM)/2 + (τγNTF + τγNTM)/2 + ē(NT)

ȳ(T) - ȳ(NT) = β*(ā(T) - ā(NT)) + (τT - τNT) + (γF - γM)/14
             + (8*τγTF + 6*τγTM - 7*τγNTF - 7*τγNTM)/14 + (ē(T) - ē(NT))
Examples of linear models
• A word of caution:
◦ This complicated expression shows that:
� Due to the non-balanced nature of the dataset,
comparing training statuses involves the gender
effect
� The presence of a covariate might induce differences
if both groups are not balanced wrt age
� The potential interactions between training status
and gender might render comparison of status
meaningless.
Examples of linear models
• A possible solution:
◦ Use « Least Square Means » (LSM)
• We first obtain averages (LSM) on subgroups
• We average these means to obtain marginal LSM
• Example: LSM(T,F) = ?, LSM(T) = ?
Heart rates   Trained        Not Trained
Females       76,71,63,67    71,85,81,83
Males         68,64,63       76,75,81,80

ȳ(TF) = µ + β*ā + τT + γF + τγTF + ē(TF)
(ā: the age, conventionally averaged over the whole dataset)
Examples of linear models
• A possible solution:
◦ Use « Least Square Means » (LSM)
• We first obtain averages on subgroups
# Compute L for the 4 subgroups
> L_TF<-matrix(c(1,mean(X[,2]),1,0,1,0,1,0,0,0),nr=1)
> L_TM<-matrix(c(1,mean(X[,2]),1,0,0,1,0,1,0,0),nr=1)
> L_NTF<-matrix(c(1,mean(X[,2]),0,1,1,0,0,0,1,0),nr=1)
> L_NTM<-matrix(c(1,mean(X[,2]),0,1,0,1,0,0,0,1),nr=1)
# Compute LSM for the 4 subgroups
> LSM_TF<-L_TF%*%b
> LSM_TM<-L_TM%*%b
> LSM_NTF<-L_NTF%*%b
> LSM_NTM<-L_NTM%*%b
Examples of linear models
• A possible solution:
◦ Use « Least Square Means » (LSM)
• We then average to obtain marginal LSM
# Compute LSM for the main effects
> LSM_T<-0.5*(LSM_TF+LSM_TM)
> LSM_NT<-0.5*(LSM_NTF+LSM_NTM)
> LSM_F<-0.5*(LSM_TF+LSM_NTF)
> LSM_M<-0.5*(LSM_TM+LSM_NTM)
Examples of linear models
• A possible solution:
◦ Use « Least Square Means » (LSM)
• Finally, since they are linear combinations of the
parameters, LSM (or differences of LSM) can be
tested using the general linear hypothesis test given
above !
• Example: let's compare the T & NT groups
LSM_T - LSM_NT
= 0.5*(LSM_TM + LSM_TF - LSM_NTF - LSM_NTM)
= 0.5*(L_TM + L_TF - L_NTF - L_NTM)*b
= (0,0,1,-1,0,0,0.5,0.5,-0.5,-0.5)*b
Examples of linear models
• A possible solution:
◦ Use « Least Square Means » (LSM)
• (0,0,1,-1,0,0,0.5,0.5,-0.5,-0.5)*b
# Compute difference of LSM between T and NT
> L<-matrix(c(0,0,1,-1,0,0,0.5,0.5,-0.5,-0.5),nr=1)
> c<-0
> test_H(X,y,L,c)
[1] 14.951074581 0.003126119
# Same result as before, showing that the obtained
# solution is corrected for the other effects of the
# model !
A (even) more complex situation
• Imagine the following situation
◦ 2 groups of 2 individuals are followed
longitudinally and 3 measures are taken on
each individual at 3 specific times (see figure)
A more complex situation
• Some questions are:
◦ Is there a significant difference in the
measures between the groups ?
◦ Is there a significant difference in the
measures between the times ?
• If yes, for which times ?
◦ [Are the dynamic behaviours of the 2 groups
different ?]
A more complex situation
• These questions can easily be addressed
using linear models, as done above.
◦ Omitting the interaction for simplicity:
    [  89.4 ]       [ 1  1  0  1  0  0 ]       [ µ  ]
    [ 106.4 ]       [ 1  1  0  0  1  0 ]       [ γ1 ]
    [ 116.3 ]       [ 1  1  0  0  0  1 ]       [ γ2 ]
    [ 103.7 ]       [ 1  1  0  1  0  0 ]   β = [ τ1 ]
    [ 113.7 ]       [ 1  1  0  0  1  0 ]       [ τ2 ]
y = [ 118.0 ] , X = [ 1  1  0  0  0  1 ] ,     [ τ3 ]
    [  91.5 ]       [ 1  0  1  1  0  0 ]
    [  89.8 ]       [ 1  0  1  0  1  0 ]
    [ 110.6 ]       [ 1  0  1  0  0  1 ]
    [  85.0 ]       [ 1  0  1  1  0  0 ]
    [  88.5 ]       [ 1  0  1  0  1  0 ]
    [  97.2 ]       [ 1  0  1  0  0  1 ]
A more complex situation
• LM analysis, using R (1):
#
# Observations
#
> y<-c(89.4,106.4,116.3,103.7,113.7,118.0,91.5,89.8,
+ 110.6,85.0,88.5,97.2)
#
# Design matrix
#
> X<-matrix(rep(0,72),nr=12)
> X[,1]<-1
> X[1:6,2]<-1
> X[7:12,3]<-1
> X[c(1,4,7,10),4]<-1
> X[c(2,5,8,11),5]<-1
> X[c(3,6,9,12),6]<-1
A more complex situation
• LM analysis, using R (2):
#
# Compute solutions
#
> XX<-t(X)%*%X
> Xy<-t(X)%*%y
> library(MASS)
> b<-ginv(XX)%*%Xy
#
# Test of group effect
#
> L<-matrix(c(0,1,-1,0,0,0),nr=1)
> c<-matrix(c(0),nr=1)
> test_H(X,y,L,c)
[1] 14.89196727 0.00481566
#
# Significant group effect (p = 0.0048)
A more complex situation
• LM analysis, using R (3):
#
# Test of time effect
#
> L<-matrix(c(0,0,0,1,-1,0,0,0,0,0,1,-1),nr=2,byrow=T)
> c<-matrix(c(0,0),nr=2)
> test_H(X,y,L,c)
[1] 8.25934879 0.01133366
# Significant time effect (p = 0.0113)

# Or, equivalently:
> L<-matrix(c(0,0,0,1,-1,0,0,0,0,1,0,-1),nr=2,byrow=T)
> c<-matrix(c(0,0),nr=2)
> test_H(X,y,L,c)
[1] 8.25934879 0.01133366
# Significant time effect (p = 0.0113)
A more complex situation
• LM analysis, using R (4):
# Obtain LSM for groups
> L_G1T1<-matrix(c(1,1,0,1,0,0),nr=1)
> L_G1T2<-matrix(c(1,1,0,0,1,0),nr=1)
> L_G1T3<-matrix(c(1,1,0,0,0,1),nr=1)
> L_G2T1<-matrix(c(1,0,1,1,0,0),nr=1)
> L_G2T2<-matrix(c(1,0,1,0,1,0),nr=1)
> L_G2T3<-matrix(c(1,0,1,0,0,1),nr=1)
> LSM_G1T1<-L_G1T1%*%b
> LSM_G1T2<-L_G1T2%*%b
> LSM_G1T3<-L_G1T3%*%b
> LSM_G2T1<-L_G2T1%*%b
> LSM_G2T2<-L_G2T2%*%b
> LSM_G2T3<-L_G2T3%*%b
> LSM_G1<-(LSM_G1T1+LSM_G1T2+LSM_G1T3)/3
> LSM_G2<-(LSM_G2T1+LSM_G2T2+LSM_G2T3)/3
A more complex situation
• LM analysis, using R (5):
#
# Show LSM for groups, and difference
#
> c(LSM_G1,LSM_G2,LSM_G1-LSM_G2)
[1] 107.91667  93.76667  14.15000
#
# Test whether true difference is 0
#
> L_delta<-(L_G1T1+L_G1T2+L_G1T3-L_G2T1-L_G2T2-L_G2T3)/3
> c_delta<-matrix(c(0),nr=1)
> LSM_delta<-L_delta%*%b
#
# Test the difference
#
> test_H(X,y,L_delta,c_delta)
[1] 14.89196727 0.00481566
# Of course identical to previous groups test
A more complex situation
• LM analysis, summary:
◦ Everything seems fine, but...
• Independence assumptions have clearly been violated (measures taken on the same individual are likely to be correlated)
• Erroneously assuming independence might:
  - Underestimate the random residual variation (σ²e)
  - Consequently, overestimate effects...
  - And thus, increase false positive rates
◦ So, a question of interest is: how can we take these correlations into account ?
A more complex situation
• Idea: use a more general family of linear
models, named « mixed models »,
allowing for correlations
◦ « Mixed » refers to the simultaneous use of
« fixed » and « random » effects
• Fixed: this effect would be the same if we repeated
the experiment
  - Example: groups, times
• Random: this effect is randomly sampled in a
population of possible levels
  - Example: animals
• Matrix formulation:
y = X*β + Z*u + e
◦ Z = design (or incidence) matrix, linking the
random effects to the observations.
• known
◦ u = vector of random effects
• unknown, values to be predicted
• assumed to be random samples from N(0,I*σ²u)
  - so, var(ui) = σ²u for all i
  - and cov(ui,uj) = 0 for all combinations of i and j ≠ i (i.e.
    individuals are assumed to be un(co)related)
• σ²u is an unknown parameter, to be estimated
A more complex situation
• Matrix formulation (cont'd):
y = X*β + Z*u + e
◦ e = vector of random residuals
• unknown
• assumed to be random samples from N(0,I*σ²e)
  - so, var(ei) = σ²e for all i
  - and cov(ei,ej) = 0 for all combinations of i and j ≠ i (i.e.
    residuals are assumed to be un(co)related)
• σ²e is an unknown parameter, to be estimated
◦ Furthermore, we will assume:
• Cov(ui,ej) = 0 for all i,j
A more complex situation
• Variances and covariances
◦ u ~ N(0;G) with G = I*σ²u
◦ e ~ N(0;R) with R = I*σ²e
◦ V = V(y)
= V(X*β + Z*u + e)
= V(X*β) + V(Z*u) + V(e)
+ 2*Cov(X*β,Z*u) + 2*Cov(X*β,e)
+ 2*Cov(Z*u,e)
= 0 + Z*V(u)*Z' + R + 0 + 0 + 2*Z*Cov(u,e)
= Z*G*Z' + R + 0
= Z*G*Z' + R
A more complex situation
• Variances and covariances: example
◦ Back to our problem...
#
# Random effect design matrix
#
> Z<-matrix(rep(0,48),nr=12)
> Z[c(1:3),1]<-1
> Z[c(4:6),2]<-1
> Z[c(7:9),3]<-1
> Z[c(10:12),4]<-1
#
# Known correlation matrices
# Arbitrary values are used to start with
#
> sigma_2_a<-10.0
> sigma_2_e<-20.0
> G<-diag(4)*sigma_2_a  # No correlation between animals
> R<-diag(12)*sigma_2_e # No correlation between residuals
• Variances and covariances: example
◦ Back to our problem...
#
# Observations variance-covariance matrix
#
> V<-Z%*%G%*%t(Z)+R
> V
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
 [1,]   30   10   10    0    0    0    0    0    0     0     0     0
 [2,]   10   30   10    0    0    0    0    0    0     0     0     0
 [3,]   10   10   30    0    0    0    0    0    0     0     0     0
 [4,]    0    0    0   30   10   10    0    0    0     0     0     0
 [5,]    0    0    0   10   30   10    0    0    0     0     0     0
 [6,]    0    0    0   10   10   30    0    0    0     0     0     0
 [7,]    0    0    0    0    0    0   30   10   10     0     0     0
 [8,]    0    0    0    0    0    0   10   30   10     0     0     0
 [9,]    0    0    0    0    0    0   10   10   30     0     0     0
[10,]    0    0    0    0    0    0    0    0    0    30    10    10
[11,]    0    0    0    0    0    0    0    0    0    10    30    10
[12,]    0    0    0    0    0    0    0    0    0    10    10    30
• Variances and covariances: example
◦ More generally, in our problem, V is block-diagonal,
with one 3x3 block B per individual:
    [ B 0 0 0 ]             [ σu²+σe²   σu²       σu²    ]
V = [ 0 B 0 0 ]  where  B = [  σu²     σu²+σe²    σu²    ]
    [ 0 0 B 0 ]             [  σu²      σu²      σu²+σe² ]
    [ 0 0 0 B ]
and all entries linking observations on different individuals are 0.
• We can see that:
◦ Introducing a random individual effect in the
model has introduced a correlation
between observations on the same
individual
◦ The price to be paid:
• More parameters (u, σ²u)
• Much more complicated resolution => use of
specialized software (SAS, AIREML, ...)
• Some details given in the appendix for the R solution
A more complex situation
• An alternative solution:
◦ Instead of introducing an individual effect in
order to correlate the observations,
correlations can be introduced directly in the
R matrix
• No random effect anymore (Z = 0)
• V = Z*G*Z' + R = R
• Two parameters (σ1² & σ2²) need to be estimated, with
R(i,i) = σ1² + σ2² for all i
R(i,j) = σ1² for all i ≠ j (same animal)
• An alternative solution (cont'd):
• Other equivalent coding: R = K*σ², with
σ² = σu² + σe², ρ = σu²/(σu² + σe²)
K(i,i) = 1 for all i
K(i,j) = ρ for all i ≠ j (same animal)
• This correlation structure is referred to as « compound
symmetry » (CS). It involves only 2 parameters (σ² and ρ).
A more complex situation
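As a concrete illustration (a sketch, not part of the original slides), the CS matrix R = K*σ² can be built in R for the 4-animals × 3-measures example, reusing the values σu² = 10 and σe² = 20 from earlier:

```r
# Sketch: CS covariance matrix for 4 animals x 3 repeated measures,
# using the illustrative values sigma_u^2 = 10 and sigma_e^2 = 20
sigma_2_u <- 10
sigma_2_e <- 20
sigma2 <- sigma_2_u + sigma_2_e            # sigma^2 = sigma_u^2 + sigma_e^2
rho <- sigma_2_u / sigma2                  # rho = sigma_u^2 / sigma^2
Kblock <- matrix(rho, nrow = 3, ncol = 3)  # rho for all pairs within an animal
diag(Kblock) <- 1                          # K(i,i) = 1
K <- kronecker(diag(4), Kblock)            # block-diagonal: one block per animal
R <- K * sigma2
R[1:4, 1:4]  # upper-left corner reproduces the 30/10/0 pattern of V above
```

This reproduces the same 12×12 matrix V obtained earlier with Z*G*Z' + R, which is why the two codings are equivalent.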
� An alternative solution (cont’d):
◦ The latter approach is more flexible, because R structures other than CS can be introduced
� For example, it could be expected that measures taken at times 1 and 3 should be less correlated than measures taken at times 1 and 2
� A possible structure to model this is:
σ² = σu² + σe², ρ = σu²/(σu² + σe²)
K(i,i) = 1 for all i
K(i,j) = ρ^|i-j| for all i,j (same animal)
� This type of structure is named "first-order autoregressive" (AR(1)); it also involves only 2 parameters (σ² and ρ)
A more complex situation
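A small sketch of the AR(1) correlation block, with an arbitrary illustrative value ρ = 0.5 (not an estimate from the data):

```r
# Sketch: AR(1) correlation block for 3 time points; rho is illustrative
rho <- 0.5
times <- 1:3
Kblock <- rho^abs(outer(times, times, "-"))  # K(i,j) = rho^|i-j|
Kblock
# corr(times 1, 3) = rho^2 = 0.25 < corr(times 1, 2) = 0.5, as required
K <- kronecker(diag(4), Kblock)              # full 12 x 12 matrix, 4 animals
```

The decay ρ^|i-j| is what distinguishes AR(1) from CS, where all within-animal correlations are equal.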
� Model selection refers to the procedures used to select the "best" model for a given dataset
� Various approaches have been proposed, and we’ll show one as an example
� Although almost fully automated procedures exist and are implemented, nothing will replace the experimenter’s knowledge of the problem and sound reasoning...
A word on model selection
� We can re-use a previous example to
show the rationale:
A word on model selection
H   St  Ge  Age  HR
1   T   M    88  68
2   T   M    96  64
3   T   F    90  76
4   T   F    73  71
5   T   M    85  63
6   T   F    99  63
7   T   F    60  67
8   NT  M    64  76
9   NT  M    77  75
10  NT  F   100  71
11  NT  F    75  85
12  NT  M    63  81
13  NT  M    73  80
14  NT  F    67  81
15  NT  F    76  83
� Several candidate models can be compared to select the "best" one:
(1) yi = m + ei
(2) yi = m + b*Ai + ei
(3) yi = m + b*Ai + Gi + ei
(4) yi = m + b*Ai + Gi + Si + ei
(5) yi = m + b*Ai + Gi + Si + Gi*Si + ei
(6) yi = m + Gi + Si + Gi*Si + ei
(7) yi = m + Gi + Si + ei
...
Possible models
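The candidate models can be fitted directly with lm(); the sketch below re-enters the data from the table (the variable names hr, age, gender, status are assumptions for illustration):

```r
# Sketch: fitting candidate models (1)-(5) with lm(), data from the table
hr     <- c(68,64,76,71,63,63,67,76,75,71,85,81,80,81,83)
age    <- c(88,96,90,73,85,99,60,64,77,100,75,63,73,67,76)
gender <- factor(c("M","M","F","F","M","F","F","M","M","F","F","M","M","F","F"))
status <- factor(c(rep("T",7), rep("NT",8)))
m1 <- lm(hr ~ 1)                      # (1) mean only
m2 <- lm(hr ~ age)                    # (2) + age regression
m3 <- lm(hr ~ age + gender)           # (3) + gender
m4 <- lm(hr ~ age + gender + status)  # (4) + status
m5 <- lm(hr ~ age + gender * status)  # (5) + gender x status interaction
```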
� A model is nested within another if it is built with a subset of the factors of the other model:
(1) => (2) => (3) => (4) => (5)
(7) => (6) => (5)
but, for example:
(6) ≠> (4)
� Comparing nested (linear) models can be done using an F test (see next slide)
Nested models
� Idea: if a new factor does not contribute significantly to the model, the extra fit provided by this factor is an estimator of the error variance.
� Remark: when models are not nested, other criteria (such as the Akaike Information Criterion, AIC) must be used.
F test for nested models selection
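For non-nested models, a hedged sketch of an AIC comparison, e.g. models (2) and (6), using the same data as in the table (variable names are illustrative):

```r
# Sketch: comparing non-nested models (2) and (6) with AIC; lower is better
hr     <- c(68,64,76,71,63,63,67,76,75,71,85,81,80,81,83)
age    <- c(88,96,90,73,85,99,60,64,77,100,75,63,73,67,76)
gender <- factor(c("M","M","F","F","M","F","F","M","M","F","F","M","M","F","F"))
status <- factor(c(rep("T",7), rep("NT",8)))
m2 <- lm(hr ~ age)              # model (2)
m6 <- lm(hr ~ gender * status)  # model (6)
AIC(m2, m6)                     # not nested: no F test, use AIC instead
```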
Fq,n-r(Xc) = [(Xc*bc − Xr*br)'*y/q] / [(y − Xc*bc)'*(y − Xc*bc)/(n − r(Xc))], with q = r(Xc) − r(Xr)
� Example: comparing (3) to (2)
F test for nested models selection
# Reduced model
> Xr<-matrix(rep(0,30),nr=15)
> Xr[,1]<-rep(1,15)
> Xr[,2]<-age
# Complete model
> Xc<-matrix(rep(0,45),nr=15)
> Xc[,1]<-rep(1,15)
> Xc[,2]<-age
> Xc[,3]<-gender
# Solutions
> library(MASS)
> br<-ginv(t(Xr)%*%Xr)%*%t(Xr)%*%hr
> bc<-ginv(t(Xc)%*%Xc)%*%t(Xc)%*%hr
# Test
> q<-1    # r(Xc) - r(Xr)
> rxc<-3  # r(Xc)
> num<-t(Xc%*%bc-Xr%*%br)%*%hr/q
> den<-t(hr-Xc%*%bc)%*%(hr-Xc%*%bc)/(15-rxc)
> pf(num/den,q,15-rxc,lower.tail=FALSE)
[,1][1,] 0.4151881
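As a cross-check (a sketch, not from the slides), the same nested comparison of models (2) and (3) can be done with the built-in anova() function on two lm() fits, which computes the same F test:

```r
# Sketch: nested-model F test with anova(); data re-entered from the table
hr     <- c(68,64,76,71,63,63,67,76,75,71,85,81,80,81,83)
age    <- c(88,96,90,73,85,99,60,64,77,100,75,63,73,67,76)
gender <- factor(c("M","M","F","F","M","F","F","M","M","F","F","M","M","F","F"))
mr <- lm(hr ~ age)           # reduced model (2)
mc <- lm(hr ~ age + gender)  # complete model (3)
anova(mr, mc)                # F test on 1 and 12 degrees of freedom
```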
Appendix
� Procedure:
1. Obtain variance component (σ², ρ, ...) estimates using specific methods
� The preferred method today is called REML (REstricted Maximum Likelihood)
2. Obtain solutions using these estimates
� A practical method is to use the so-called Henderson's mixed model equations (MME)
3. Perform (approximate) testing
� Use a modified version of the general linear hypothesis described above
Appendix: solving mixed models
� Procedure – 1) REML: AI algorithm
◦ θ(k+1) = θ(k) + AI⁻¹*SC
� k = iteration #
� θ = (σu², σe²)' and θ(0) ~ arbitrary
� P = projection matrix = V⁻¹ − V⁻¹*X*(X'*V⁻¹*X)⁻*X'*V⁻¹
� SC = "score vector":
SC(1) = 0.5*[y'*P*Z*G*Z'*P*y − trace(Z*G*Z'*P)]
SC(2) = 0.5*[y'*P*P*y − trace(P)]
� AI = "average information matrix"
Appendix: solving mixed models
AI(1,1) = 0.5*y'*P*Z*G*Z'*P*Z*G*Z'*P*y
AI(1,2) = AI(2,1) = 0.5*y'*P*Z*G*Z'*P*P*y
AI(2,2) = 0.5*y'*P*P*P*y
� Procedure – 1) REML: R implementation
Appendix: solving mixed models
#
# REML estimators computation
#
library(MASS)
# Init computations
diff<-1000.0
AI<-matrix(rep(0,4),nr=2)
SC<-matrix(rep(0,2),nr=2)
sigma_2_u<-10.0
sigma_2_e<-20.0
sigma_2<-c(sigma_2_u,sigma_2_e)
# Loop while estimates differ
while (diff>0.01) {
  # Loop body => see next slide
}
� Procedure – 1) REML: R implementation
Appendix: solving mixed models
# Loop body (1)
# Variance of the observations
ZGZ<-Z%*%G%*%t(Z)
V<-(ZGZ)*sigma_2_u+R*sigma_2_e
# P matrix
Vi<-solve(V)
XVi<-t(X)%*%Vi
P<-Vi-t(XVi)%*%(ginv(XVi%*%X)%*%XVi)
# Partial computations
Py<-P%*%y
ZGZP<-ZGZ%*%P
ZGZPy<-ZGZ%*%Py
# Continued on next slide...
� Procedure – 1) REML: R implementation
Appendix: solving mixed models
# Loop body (2)
# Traces
trP<-0
trPZGZ<-0
for (i in 1:dim(P)[1]) {
  trP<-trP+P[i,i]
  trPZGZ<-trPZGZ+ZGZP[i,i]
}
# AI matrix
AI[1,1]<-0.5*t(ZGZPy)%*%(P%*%ZGZPy)
AI[1,2]<-0.5*t(Py)%*%(P%*%ZGZPy)
AI[2,2]<-0.5*t(Py)%*%(P%*%Py)
AI[2,1]<-AI[1,2]
# Score vector
SC[1]<-0.5*(t(Py)%*%ZGZPy-trPZGZ)
SC[2]<-0.5*(t(Py)%*%Py-trP)
� Procedure – 1) REML: R implementation
Appendix: solving mixed models
# Loop body (3)
# New estimators
new_sigma_2<-sigma_2+solve(AI)%*%SC
new_sigma_2_u<-new_sigma_2[1]
new_sigma_2_e<-new_sigma_2[2]
# Difference
diff<-(sigma_2[1]-new_sigma_2[1])**2
diff<-diff+(sigma_2[2]-new_sigma_2[2])**2
sigma_2<-new_sigma_2
}
sigma_2
         [,1]
[1,] 18.82629
[2,] 26.21528
� Procedure – 2) MME: method
◦ BLUE (β̂) and BLUP (û) can be obtained using:

    | X'*R⁻¹*X   X'*R⁻¹*Z       |   | β̂ |   | X'*R⁻¹*y |
    | Z'*R⁻¹*X   Z'*R⁻¹*Z + G⁻¹ | * | û | = | Z'*R⁻¹*y |

◦ In our case, this can be written:

    | X'*X   X'*Z                |   | β̂ |   | X'*y |
    | Z'*X   Z'*Z + I*(σe²/σu²)  | * | û | = | Z'*y |

Appendix: solving mixed models
� Procedure – 2) MME: R implementation
Appendix: solving mixed models
MMEl<-matrix(rep(0,100),nr=10)
MMEr<-matrix(rep(0,10),nr=10)
MMEl[1:6,1:6]<-t(X)%*%X
MMEl[1:6,7:10]<-t(X)%*%Z
MMEl[7:10,1:6]<-t(Z)%*%X
MMEl[7:10,7:10]<-t(Z)%*%Z+solve(G)*(sigma_2[2]/sigma_2[1])
MMEr[1:6]<-t(X)%*%y
MMEr[7:10]<-t(Z)%*%y
sol<-ginv(MMEl)%*%MMEr
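The MME can also be illustrated on a small self-contained toy example (a sketch: all numbers are invented for illustration, and λ = σe²/σu² is taken as known rather than estimated):

```r
# Toy sketch of Henderson's MME: 2 fixed-effect levels, 2 random animal
# effects, 4 observations; all values are illustrative
y <- c(10, 12, 11, 15)
X <- matrix(c(1,0, 1,0, 0,1, 0,1), nrow = 4, byrow = TRUE)  # fixed effects
Z <- matrix(c(1,0, 0,1, 1,0, 0,1), nrow = 4, byrow = TRUE)  # animal effects
lambda <- 20/10  # sigma_e^2 / sigma_u^2, assumed known here
LHS <- rbind(cbind(t(X) %*% X, t(X) %*% Z),
             cbind(t(Z) %*% X, t(Z) %*% Z + diag(2) * lambda))
RHS <- rbind(t(X) %*% y, t(Z) %*% y)
sol <- solve(LHS, RHS)  # first 2 entries: BLUE; last 2: BLUP
sol                     # BLUEs 11 and 13; BLUPs -0.75 and 0.75
```

Note how the BLUPs are "shrunken" towards 0 by the λ term added to Z'Z, which is the practical effect of treating the animal effect as random.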
� Procedure – 3) Approximate testing
◦ The approximation comes from the fact that
only estimators of β and u are available
◦ The denominator degrees of freedom are estimated using various methods (see details in the literature) well beyond the scope of this text...!
◦ The estimator is a simple extension of the method for fixed-effects-only models
Appendix: solving mixed models
Fq,v = [(L*b̂ − c)'*(L*Ĉ*L')⁻¹*(L*b̂ − c)]/q
� Procedure – 3) Approximate testing
Appendix: solving mixed models
# Testing group effects
> L<-matrix(c(0,1,-1,rep(0,7)),nr=1)
> Lsol<-L%*%sol
> C_hat<-ginv(MMEl/sigma_2[2])
> LCL<-L%*%(C_hat%*%t(L))
> LCLi<-ginv(LCL)
> Fg<-t(Lsol)%*%(LCLi%*%Lsol)/1
> dfg1<-1
> dfg2<-2 # Cfr Kenward-Roger...
> 1-pf(Fg,dfg1,dfg2)
[,1][1,] 0.1145035>
� Procedure – 3) Approximate testing
Appendix: solving mixed models
# Testing time effects
> L<-matrix(c(0,0,0,1,-1,0,0,0,0,0,
+             0,0,0,1,0,-1,0,0,0,0),nr=2,byrow=T)
> Lsol<-L%*%sol
> C_hat<-ginv(MMEl/sigma_2[2])
> LCL<-L%*%(C_hat%*%t(L))
> LCLi<-ginv(LCL)
> Ft<-t(Lsol)%*%(LCLi%*%Lsol)/2  # q = 2 (two contrast rows in L)
> dft1<-2
> dft2<-6 # Cfr Kenward-Roger...
> 1-pf(Ft,dft1,dft2)
[,1][1,] 0.006966431 >
� The same analyses, with SAS: program
Appendix: solving mixed models
options ls=80;
data phd;
input groupe temps animal pheno @@;
cards;
1 1 1  89.4  1 1 2 103.7  1 2 1 106.4  1 2 2 113.7
1 3 1 116.3  1 3 2 118.0  2 1 3  91.5  2 1 4  85.0
2 2 3  89.8  2 2 4  88.5  2 3 3 110.6  2 3 4  97.2
;
proc glm;
class groupe temps;
model pheno=groupe temps;
lsmeans groupe /pdiff stderr;
proc mixed;
class groupe temps animal;
model pheno=groupe temps / solution;
repeated /sub=animal type=cs;
� The same analyses, with SAS: result (1)
Appendix: solving mixed models
(SAS listing: procedure output shown as an image in the original slides)