Post on 06-Feb-2016
Chapter 5 Multilevel Models
• 5.1 Cross-sectional multilevel models
– 5.1.1 Two-level models
– 5.1.2 Multiple level models
– 5.1.3 Multiple level modeling in other fields
• 5.2 Longitudinal multilevel models
– 5.2.1 Two-level models
– 5.2.2 Multiple level models
• 5.3 Prediction
• 5.4 Testing variance components
• Appendix 5A – High order multilevel models
Multilevel Models
• Multilevel models are a conditional modeling framework that takes into account hierarchical and clustered data structures.
• Used extensively in educational science and other disciplines in the social and behavioral sciences.
• A multilevel model can be viewed as a linear mixed effects model; hence, the statistical inference techniques introduced in Chapter 3 are readily applicable.
• By considering multilevel data and models as a separate unit, we expand the breadth of applications that linear mixed effects models enjoy.
• Also known as hierarchical models
5.1 Cross-sectional multilevel models
• Two-level model example
• Level 2 (Schools), Level 1 (Students within a school)
• Level 1 Model (student replications – j)
yij = β0i + β1i zij + εij
– yij - student’s performance on an achievement test
– zij - total family income
• Level 2 Model
– Thinking of the schools as a random sample, we model {β0i, β1i} as random quantities.
β0i = β0 + α0i and β1i = β1 + α1i ,
– where α0i, α1i are mean zero random variables.
Two-level model example
• The combined level 1 and level 2 models form:
yij = (β0 + α0i ) + (β1 + α1i) zij + εij
= α0i + α1i zij + β0 + β1 zij + εij .
• The two-level model may be written as a single linear mixed effects model.
• Specifically, we define the vectors αi = (α0i , α1i)´, zij = (1, zij)´ (a constant followed by the scalar zij), β = (β0, β1)´ and xij = zij, to write
yij = zij´ αi + xij´ β + εij .
• We model and interpret behavior through the succession of level models.
• We estimate the combined levels through a single (linear mixed effects) model.
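The equivalence between the succession of level models and the single mixed effects form can be checked numerically. The following is a minimal sketch: the function names, the seed and all parameter values are made up for illustration and do not come from the slides.

```python
import random

def level_by_level(beta0, beta1, alpha0_i, alpha1_i, z_ij, eps_ij):
    """Evaluate the model in two stages: level 2 first, then level 1."""
    beta0_i = beta0 + alpha0_i                # level 2: school-specific intercept
    beta1_i = beta1 + alpha1_i                # level 2: school-specific slope
    return beta0_i + beta1_i * z_ij + eps_ij  # level 1

def combined(beta0, beta1, alpha0_i, alpha1_i, z_ij, eps_ij):
    """Evaluate the single mixed effects form y = z'alpha + x'beta + eps."""
    z_vec = (1.0, z_ij)                       # random-effects design row
    x_vec = z_vec                             # fixed-effects design row (x = z here)
    alpha_i = (alpha0_i, alpha1_i)
    beta = (beta0, beta1)
    random_part = sum(z * a for z, a in zip(z_vec, alpha_i))
    fixed_part = sum(x * b for x, b in zip(x_vec, beta))
    return random_part + fixed_part + eps_ij

rng = random.Random(0)        # arbitrary seed
beta0, beta1 = 50.0, 0.3      # hypothetical fixed effects
max_gap = 0.0
for _ in range(1000):
    alpha0_i, alpha1_i = rng.gauss(0, 2), rng.gauss(0, 0.1)
    z_ij, eps_ij = rng.uniform(10, 100), rng.gauss(0, 5)
    gap = abs(level_by_level(beta0, beta1, alpha0_i, alpha1_i, z_ij, eps_ij)
              - combined(beta0, beta1, alpha0_i, alpha1_i, z_ij, eps_ij))
    max_gap = max(max_gap, gap)

print(max_gap)  # differs from zero only by floating-point rounding
```

The two functions describe the same model; the first mirrors how the model is interpreted, the second mirrors how it is estimated.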
Two-level model example - variation
• Modify the level-2 model so that
β0i = β0 + β01 xi + α0i and β1i = β1 + β11 xi + α1i ,
– where xi indicates whether the school was a Catholic-based or a public school.
• The combined level 1 and (new) level 2 models form:
yij = α0i + α1i zij + β0 + β01 xi + β1 zij + β11 xi zij + εij .
• We can again write this as
yij = zij´ αi + xij´ β + εij .
– by defining αi = (α0i , α1i)´, zij = (1, zij)´, β = (β0, β01, β1, β11)´ and xij = (1, xi, zij, xi zij)´.
• The term β11 xi zij , an interaction between the level-1 variable zij and the level-2 variable xi, is known as a cross-level interaction.
• Many researchers argue that understanding cross-level interactions is a major motivation for analyzing multilevel data.
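To see what the cross-level interaction does, it can help to build the design rows xij = (1, xi, zij, xi zij)´ and zij = (1, zij)´ explicitly and compare the income slope across school types. This is a sketch only: the coefficient values are hypothetical, and coding xi = 1 for Catholic-based and xi = 0 for public schools is an assumption (the slides do not state the coding).

```python
def fixed_row(x_i, z_ij):
    """Fixed-effects design row (1, x_i, z_ij, x_i * z_ij)."""
    return [1.0, x_i, z_ij, x_i * z_ij]

def random_row(z_ij):
    """Random-effects design row (1, z_ij)."""
    return [1.0, z_ij]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conditional_mean(beta, alpha_i, x_i, z_ij):
    """E[y_ij | alpha_i] = z_ij' alpha_i + x_ij' beta."""
    return dot(random_row(z_ij), alpha_i) + dot(fixed_row(x_i, z_ij), beta)

beta = [50.0, 2.0, 0.5, 0.25]   # hypothetical (beta0, beta01, beta1, beta11)
alpha_i = [1.0, -0.125]         # hypothetical school effects (alpha0i, alpha1i)

def income_slope(x_i):
    """Change in the conditional mean for a one-unit increase in z_ij."""
    return (conditional_mean(beta, alpha_i, x_i, 4.0)
            - conditional_mean(beta, alpha_i, x_i, 3.0))

slope_public = income_slope(0.0)     # beta1 + alpha1i
slope_catholic = income_slope(1.0)   # beta1 + beta11 + alpha1i
print(slope_public, slope_catholic, slope_catholic - slope_public)
```

Under these assumptions the difference between the two slopes equals β11, which is why the cross-level term is read as the effect of school type on the income slope.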
Three Level Models
• Level 1 model (students)
yi,j,k = z1,i,j,k´ βi,j + x1,i,j,k´ β1 + ε1,i,j,k ,
– The predictors z1,i,j,k and x1,i,j,k may depend on the student (gender, family income and so on), classroom (teacher characteristics, classroom facilities and so on) or school (organization, structure, location and so on).
• Level 2 model (classroom)
βi,j = Z2,i,j γi + X2,i,j β2 + ε2,i,j.
– The predictors Z2,i,j and X2,i,j may depend on the classroom or school, but not students.
• Level 3 model (School)
γi = Z3,i β3 + ε3,i .
– The predictors Z3,i may depend on the school, but not students or classroom.
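Combined Model
• The combined level 1, 2 and 3 models form:
yi,j,k = z1,i,j,k´ ( Z2,i,j (Z3,i β3 + ε3,i) + X2,i,j β2 + ε2,i,j) + x1,i,j,k´ β1 + ε1,i,j,k
= xi,j,k´ β + zi,j,k´ αi,j + ε1,i,j,k ,
• where
$$
\boldsymbol{\beta} = \begin{pmatrix} \boldsymbol{\beta}_1 \\ \boldsymbol{\beta}_2 \\ \boldsymbol{\beta}_3 \end{pmatrix}, \quad
\mathbf{x}_{i,j,k} = \begin{pmatrix} \mathbf{x}_{1,i,j,k} \\ \mathbf{X}_{2,i,j}' \mathbf{z}_{1,i,j,k} \\ \mathbf{Z}_{3,i}' \mathbf{Z}_{2,i,j}' \mathbf{z}_{1,i,j,k} \end{pmatrix}, \quad
\mathbf{z}_{i,j,k} = \begin{pmatrix} \mathbf{z}_{1,i,j,k} \\ \mathbf{Z}_{2,i,j}' \mathbf{z}_{1,i,j,k} \end{pmatrix}, \quad
\boldsymbol{\alpha}_{i,j} = \begin{pmatrix} \boldsymbol{\varepsilon}_{2,i,j} \\ \boldsymbol{\varepsilon}_{3,i} \end{pmatrix}.
$$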
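The step from the nested form to the mixed effects form can be made explicit by distributing z1,i,j,k´ over the level-2 and level-3 terms and then grouping fixed and random parts. The following sketch uses only the symbols of the three level models; the underbraces mark which terms are collected into xi,j,k´ β and zi,j,k´ αi,j.

```latex
\begin{align*}
y_{i,j,k}
 &= \mathbf{z}_{1,i,j,k}' \bigl( \mathbf{Z}_{2,i,j} ( \mathbf{Z}_{3,i} \boldsymbol{\beta}_3 + \boldsymbol{\varepsilon}_{3,i} )
    + \mathbf{X}_{2,i,j} \boldsymbol{\beta}_2 + \boldsymbol{\varepsilon}_{2,i,j} \bigr)
    + \mathbf{x}_{1,i,j,k}' \boldsymbol{\beta}_1 + \varepsilon_{1,i,j,k} \\
 &= \underbrace{ \mathbf{x}_{1,i,j,k}' \boldsymbol{\beta}_1
    + \mathbf{z}_{1,i,j,k}' \mathbf{X}_{2,i,j} \boldsymbol{\beta}_2
    + \mathbf{z}_{1,i,j,k}' \mathbf{Z}_{2,i,j} \mathbf{Z}_{3,i} \boldsymbol{\beta}_3 }_{\mathbf{x}_{i,j,k}' \boldsymbol{\beta}}
    + \underbrace{ \mathbf{z}_{1,i,j,k}' \boldsymbol{\varepsilon}_{2,i,j}
    + \mathbf{z}_{1,i,j,k}' \mathbf{Z}_{2,i,j} \boldsymbol{\varepsilon}_{3,i} }_{\mathbf{z}_{i,j,k}' \boldsymbol{\alpha}_{i,j}}
    + \varepsilon_{1,i,j,k} .
\end{align*}
```

Reading off the coefficient of each βm and each disturbance vector gives the stacked definitions of β, xi,j,k, zi,j,k and αi,j.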
Motivation for multilevel models
• Multilevel modeling provides a structure for hypothesizing relationships in a complex system.
• The ability to estimate cross-level effects is one advantage of multilevel modeling when compared to an alternative research strategy calling for the analysis of each level in isolation from the others.
• Second and higher levels of multilevel models also provide us with an opportunity to estimate the variance structure using a parsimonious, parametric structure.
– One typically assumes that disturbance terms from different levels are uncorrelated.
5.2 Longitudinal multilevel models
• Similar to cross-sectional multilevel models, except:
– Use a t subscript to denote the Level 1 replication, for time.
– Allow for correlation among Level 1 observations to represent serial patterns.
– Possibly include functions of time as Level 1 predictors.
• Typical example – use students as Level 2 unit of analysis and time as Level 1 unit of analysis.
• Growth curve models are a classic example:
– We seek to monitor the natural development or aging of an individual.
– This development is typically monitored without intervention and the goal is to assess differences among groups.
– In growth curve modeling, one uses a polynomial function of age or time to track growth.
Example – Dental Data
• Originally due to Potthoff and Roy (1964); see also Rao (1987).
• y is the distance, measured in millimeters, from the center of the pituitary to the pterygomaxillary fissure.
• Measurements were taken on 11 girls and 16 boys at ages 8, 10, 12, and 14.
• The interest is in
– how the distance grows with age and
– whether there is a difference between males and females.
• Figure 5.1 Multiple Time Series Plot of Dental Measurements. Open circles are girls, solid circles are boys. [Plot not reproduced: vertical axis Measure, 16 to 32; horizontal axis Age, 8 to 14.]
Dental Model
• Level 1 model
yit = β0i + β1i z1,it + εit ,
• z1,it is age.
• Level 2 model
β0i = β00 + β01 GENDERi + α0i and
β1i = β10 + β11 GENDERi + α1i.
• GENDER equals 1 for females and 0 for males.
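Substituting the level 2 equations into the level 1 equation gives the combined model. The following is a sketch of that substitution, with AGE written for z1,it; the two mean lines follow from setting GENDER to 0 or 1 and taking expectations over the mean zero terms.

```latex
\begin{align*}
y_{it} &= \beta_{00} + \beta_{01}\,\mathrm{GENDER}_i + \beta_{10}\,\mathrm{AGE}_{it}
          + \beta_{11}\,\mathrm{GENDER}_i \,\mathrm{AGE}_{it}
          + \alpha_{0i} + \alpha_{1i}\,\mathrm{AGE}_{it} + \varepsilon_{it}, \\[4pt]
\text{males:}\quad   \mathrm{E}\, y_{it} &= \beta_{00} + \beta_{10}\,\mathrm{AGE}_{it}, \\
\text{females:}\quad \mathrm{E}\, y_{it} &= (\beta_{00} + \beta_{01}) + (\beta_{10} + \beta_{11})\,\mathrm{AGE}_{it}.
\end{align*}
```

In this form β01 is the difference in intercepts between females and males, and the cross-level coefficient β11 is the difference in growth rates, which addresses the two questions of interest for these data.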
Three Level Example
• Children Mental Health Assessment by Guo and Hussey (1999)
• The Level 1 model (time replications) is
yi,j,t = z1,i,j,t´ βi,j + x1,i,j,t´ β1 + ε1,i,j,t ,
– Assessment y is the “Devereux Scales of Mental Disorders,” a score made up of 111 items.
– x1,i,j,t = PROGRAMi,j,t , which equals 1 if the child was in program residence at the time of the assessment and 0 if the child was in day treatment or day treatment combined with treatment foster care.
– z1,i,j,t = (1 TIMEi,j,t)´. TIMEi,j,t is measured in days since the inception of the study.
– Thus, the level-1 model can be written as
yi,j,t = β0,i,j + β1,i,j TIMEi,j,t + β1 PROGRAMi,j,t + ε1,i,j,t .
Children Mental Health Assessment
• The level 2 model (child replications) is
βi,j = Z2,i,j γi + X2,i,j β2 + ε2,i,j,
• where there are i = 1, …, n children and j = 1, …, Ji raters.
• The level 2 model of Guo and Hussey can be written as
β0,i,j = β0,i,0 + β0,0,1 RATERi,j + ε2,i,j
• and
β1,i,j = β2,0 + β2,1 RATERi,j .
• The variable RATERi,j = 1 if the rater was a teacher and = 0 if the rater was a caretaker.
• The level 3 model (rater replications) is
γi = Z3,i β3 + ε3,i ,
which here takes the form
β0,i,0 = β0,0,0 + β0,1,0 GENDERi + ε3,i .
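Substituting the level 3 equation into the level 2 intercept equation, and both level 2 equations into the level 1 model yi,j,t = β0,i,j + β1,i,j TIMEi,j,t + β1 PROGRAMi,j,t + ε1,i,j,t, gives a single equation. The following is a sketch of that substitution using only the coefficients named on these slides.

```latex
\begin{align*}
y_{i,j,t} ={}& \beta_{0,0,0} + \beta_{0,1,0}\,\mathrm{GENDER}_i + \beta_{0,0,1}\,\mathrm{RATER}_{i,j}
   + \beta_{2,0}\,\mathrm{TIME}_{i,j,t} + \beta_{2,1}\,\mathrm{RATER}_{i,j}\,\mathrm{TIME}_{i,j,t}
   + \beta_{1}\,\mathrm{PROGRAM}_{i,j,t} \\
 & + \varepsilon_{3,i} + \varepsilon_{2,i,j} + \varepsilon_{1,i,j,t} .
\end{align*}
```

The first line collects the fixed effects, including the RATER by TIME cross-level interaction. The second line collects one disturbance per level: child, rater within child, and time.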
5.3 Prediction
• Recall that we estimate model parameters and predict random variables.
• Consider a two-level longitudinal model
– Level 1 model (replication on time)
yi,t = z1,i,t´ βi + x1,i,t´ β1 + ε1,i,t ,
– Level 2 model
βi = Z2,i β2 + αi .
– Linear mixed model is
yi,t = z1,i,t´ (Z2,i β2 + αi) + x1,i,t´ β1 + ε1,i,t ,
• The best linear unbiased predictor (BLUP) of βi is
bi,BLUP = ai,BLUP + Z2,i b2,GLS ,
– where ai,BLUP = D Zi´ Vi⁻¹ (yi - Xi bGLS ).
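A small numerical illustration of the BLUP formula, as a sketch under stated assumptions: a single subject with a random intercept only, Vi = Zi D Zi´ + σ² I (the usual linear mixed model variance with uncorrelated level-1 disturbances), and known variance components. All numbers, including the "GLS" coefficients, are made up. In this special case the formula should reduce to a shrunken average of the subject's residuals, which the code checks.

```python
import numpy as np

def blup_random_effects(D, Z, V, y, X, b_gls):
    """a_BLUP = D Z' V^{-1} (y - X b_GLS) for one subject."""
    resid = y - X @ b_gls
    return D @ Z.T @ np.linalg.solve(V, resid)

# Hypothetical subject with T = 4 observations and a random intercept only.
T = 4
sigma2_alpha, sigma2_eps = 2.0, 1.0           # assumed known variance components
Z = np.ones((T, 1))                           # random-intercept design
D = np.array([[sigma2_alpha]])                # Var(alpha_i)
V = Z @ D @ Z.T + sigma2_eps * np.eye(T)      # Var(y_i), assuming R = sigma2_eps * I
X = np.column_stack([np.ones(T), [8.0, 10.0, 12.0, 14.0]])
b_gls = np.array([16.0, 0.5])                 # made-up fixed-effects estimates
y = np.array([21.0, 22.5, 23.0, 25.5])        # made-up responses

a_blup = blup_random_effects(D, Z, V, y, X, b_gls)

# Closed form for the random-intercept case: shrink the mean residual.
shrinkage = T * sigma2_alpha / (T * sigma2_alpha + sigma2_eps)
closed_form = shrinkage * np.mean(y - X @ b_gls)

print(a_blup, closed_form)
```

Using `np.linalg.solve` rather than forming the inverse of V explicitly is a numerical convenience; the quantity computed is the same.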
Three-Level Model Prediction
• Estimate model parameters
• Next, compute BLUP residuals
– Use the formula, ai,BLUP = D Zi´ Vi⁻¹ (yi - Xi bGLS ).
– This yields the BLUPs for αi,j = (ε2,i,j´ ε3,i´)´, say,
ai,j,BLUP = (e2,i,j,BLUP ´ e3,i,BLUP ´)´.
• Then, compute BLUP predictors of γi and βi,j
– gi,BLUP = Z3,i b3,GLS + e3,i,BLUP
– bi,j,BLUP = Z2,i,j gi,BLUP + X2,i,j b2,GLS + e2,i,j,BLUP .
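• Forecasts are also straightforward: for AR(1) level-1 disturbances, this simplifies to
$$
\hat{y}_{i,j,T_{ij}+L} = \mathbf{z}_{1,i,j,T_{ij}+L}' \, \mathbf{b}_{i,j,BLUP} + \mathbf{x}_{1,i,j,T_{ij}+L}' \, \mathbf{b}_{1,GLS} + \rho^{L} \, e_{i,j,T_{ij},BLUP} .
$$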
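The AR(1) forecast adds a geometrically decaying share of the last BLUP residual to the predicted level. A minimal sketch follows, assuming ρ denotes the AR(1) autocorrelation parameter; the function name and all numerical inputs are hypothetical.

```python
def ar1_forecast(z_future, b_blup, x_future, b1_gls, rho, lead, last_resid_blup):
    """L-step forecast: z'b_BLUP + x'b_1,GLS + rho**L * (last BLUP residual)."""
    level = (sum(z * b for z, b in zip(z_future, b_blup))
             + sum(x * b for x, b in zip(x_future, b1_gls)))
    return level + rho ** lead * last_resid_blup

# Hypothetical inputs for one (i, j) pair.
z_future = (1.0, 8.0)      # (1, TIME) at the forecast date
b_blup = (50.0, -0.25)     # made-up BLUP of (beta_0ij, beta_1ij)
x_future = (1.0,)          # PROGRAM at the forecast date
b1_gls = (2.0,)            # made-up GLS estimate of beta_1
rho = 0.5                  # assumed AR(1) parameter
last_resid = 4.0           # made-up BLUP residual at time T_ij

two_step = ar1_forecast(z_future, b_blup, x_future, b1_gls, rho, 2, last_resid)
ten_step = ar1_forecast(z_future, b_blup, x_future, b1_gls, rho, 10, last_resid)
print(two_step, ten_step)
```

With |ρ| < 1 the residual correction fades as the lead time L grows, so long-horizon forecasts approach the level implied by the BLUP coefficients alone. The design rows are held fixed across lead times here only to isolate that effect.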
5.4 Testing variance components
• For the error components model, do we wish to pool? We can express this as a hypothesis of the form H0: σα² = 0.
• For a two-level model, do the data provide evidence that our 2nd level model is viable? We may wish to test H0: Var α = 0.
• Unfortunately, the usual likelihood ratio testing procedure is not valid for testing many variance components of interest.
– In particular, the concern is for testing parameters where the null hypothesis is on the boundary of possible values.
– That is, σα² = 0 is on the boundary.
– Usual approximations are not valid when we use the boundary restriction in our definition of the estimator.
– As a general rule, the standard hypothesis testing procedures favor the simpler null hypothesis more often than they should.
Alternative Testing Procedures
• Suppose that we wish to test the null hypothesis H0: σ² = σ0², where σ0² is a known positive constant.
– Standard likelihood ratio test is okay.
– This procedure is not available when σ0² = 0 because the log-likelihood under H0 is not well defined.
• However, H0: σ² = 0 is still a testable hypothesis
– A simple test is to reject H0 if the maximum likelihood estimator, $n^{-1} \sum_{i=1}^{n} y_i^2$, exceeds zero.
– This test procedure has power 1 versus all alternatives and a significance level of zero, a good test!
Error Components Model
• Consider the likelihood ratio test statistic for assessing H0: σα² = 0.
• The asymptotic distribution turns out to be $\tfrac{1}{2}\chi^2_{(1)}$.
• Typically, the asymptotic distribution of the likelihood ratio test statistic for one parameter is $\chi^2_{(1)}$.
• This means that using nominal values, we will accept the null hypothesis more often than we should; thus, we will sometimes use a simpler model than suggested by the data.
• Suppose that one allows for negative estimates. Then, the asymptotic distribution turns out to be the usual $\chi^2_{(1)}$.
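The practical effect of the boundary can be seen by comparing p-values. The sketch below reads the boundary result as an equal mixture of a point mass at zero and a chi-square distribution with one degree of freedom, which is the standard interpretation of this notation. The function names and the example statistic are hypothetical.

```python
import math

def chi2_1_sf(x):
    """P(chi-square with 1 df > x), using chi2_1 = Z**2 for a standard normal Z."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))

def naive_p_value(lrt):
    """p-value that ignores the boundary and uses chi-square(1) directly."""
    return chi2_1_sf(lrt)

def boundary_p_value(lrt):
    """p-value under a 50:50 mixture of a point mass at 0 and chi-square(1)."""
    if lrt <= 0:
        return 1.0
    return 0.5 * chi2_1_sf(lrt)

lrt = 2.706                     # hypothetical likelihood ratio statistic
p_naive = naive_p_value(lrt)
p_boundary = boundary_p_value(lrt)
print(round(p_naive, 3), round(p_boundary, 3))
```

For any positive statistic the naive p-value is twice the boundary-adjusted one, so a test that ignores the boundary rejects too rarely and leans toward the simpler model.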
Recommendations
• No general theory is available.
• Some additional theoretical results are available.
• This can be important for some applications.
• Simulation methods are always possible.
• You have to know what your software package is doing
– It may be giving you the appropriate test statistics or it may ignore the boundary issue.
– If it incorrectly ignores the boundary issue, the test procedures are biased towards simpler models.
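As one illustration of the simulation route, the sketch below simulates the likelihood ratio statistic for H0: σα² = 0 in a deliberately simple setting: a balanced error components model with an overall mean as the only fixed effect, where the maximum likelihood estimates have closed forms. The sample sizes, seed and cutoffs are assumptions chosen for illustration; 3.841 is the 5% point of chi-square(1), and 2.706 is its 10% point, which is the 5% point of the equal mixture of a point mass at zero and chi-square(1).

```python
import math
import random

def lrt_error_components(y):
    """LRT statistic for H0: sigma_alpha^2 = 0 in a balanced one-way model.

    y is a list of n subjects, each a list of T responses; the only fixed
    effect is an overall mean.
    """
    n, T = len(y), len(y[0])
    subj_means = [sum(row) / T for row in y]
    grand_mean = sum(subj_means) / n
    ssw = sum((v - m) ** 2 for row, m in zip(y, subj_means) for v in row)
    ssb = T * sum((m - grand_mean) ** 2 for m in subj_means)
    sigma2_hat = ssw / (n * (T - 1))     # MLE of sigma^2 away from the boundary
    lam_hat = ssb / n                    # MLE of sigma^2 + T * sigma_alpha^2
    if lam_hat <= sigma2_hat:            # boundary case: MLE of sigma_alpha^2 is 0
        return 0.0
    sigma2_null = (ssw + ssb) / (n * T)  # MLE of sigma^2 under H0
    return (n * T * math.log(sigma2_null)
            - n * (T - 1) * math.log(sigma2_hat)
            - n * math.log(lam_hat))

def simulate_null(n=30, T=5, reps=2000, seed=1):
    """Draw data with sigma_alpha^2 = 0 and collect the LRT statistics."""
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        y = [[rng.gauss(0.0, 1.0) for _ in range(T)] for _ in range(n)]
        stats.append(lrt_error_components(y))
    return stats

stats = simulate_null()
frac_zero = sum(s == 0.0 for s in stats) / len(stats)       # mass at the boundary
naive_rate = sum(s > 3.841 for s in stats) / len(stats)     # nominal 5% chi-square(1) test
mixture_rate = sum(s > 2.706 for s in stats) / len(stats)   # 5% cutoff under the mixture
print(frac_zero, naive_rate, mixture_rate)
```

In a run of this kind one would expect a large share of the statistics to be exactly zero and the nominal chi-square(1) test to reject less often than its stated 5% level, which is the bias towards simpler models. Unbalanced data or additional covariates would require numerical maximization in place of these closed forms.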