Easy Power and Sample Size for Most of the Mixed Models You Will Ever See
Keith E. Muller
Chief, Division of Methodology
Department of Health Outcomes and Policy
University of Florida, [email protected]
Anna E. Barón (Colorado SPH)
Sarah M. Kreidler (Colorado SPH)
Yueh-Yun Chi (Florida Biostatistics)
Deborah H. Glueck (Colorado SPH)
http://samplesizeshop.org/
Sponsorship
Supported by the National Institute of Dental and Craniofacial Research under award NIDCR 1 R01 DE020832-01A1, Multilevel and Longitudinal Study Sample Size Tools for Behavioral Scientists (12/09/10-11/30/14), led by Keith Muller and Deborah Glueck. The content is solely the responsibility of the authors, and does not necessarily represent the official views of the National Institute of Dental and Craniofacial Research nor the National Institutes of Health.
Additional support from NIAAA R01-AA013458-01, NHLBI R01-HL091005, and NIDCR U54-DE019261.
Saga of Sample Size Selection
• We have long needed to select sample size for designs with clusters, repeated measures and multiple outcomes, and now we see combinations.
• Existing approaches: 1) simulations, 2) exemplary data, 3) large sample approximations, and 4) special cases.
• We think the ideas and software we present today make the job easier than ever before.
• The first version of our free power software was written 30 years ago.
• Previous versions were matrix based and user hostile. Now point and click (GUI).
Software Development Team UCD
Outline
1. Characterizing the Linear Models You See
- Common Tests in the Linear Mixed Model (LMM)
- The LMM as a General Linear Multivariate Model
2. Six-Step Checklist for Power and Sample Size Analysis
- Two Real Design Examples
- Using the Checklist for the Examples
3. Simple Adjustments for Power with Missing Data
4. Free, Web-based Software, GLIMMPSE, and Related Web Resources
1. Characterizing The Linear Models You See
General Linear Mixed Model Commonly Used for Clustered and Repeated Measures Data
• Laird and Ware (1982); Demidenko (2004); Muller and Stewart (2006)
• Studies with Clustering
- Designed: Cluster randomized studies
- Observational: Clustered observations
• Studies with Repeated Measures
- Designed: Randomized clinical trials
- Observational: Cohort studies, natural history
• Combinations
- Cluster randomized longitudinal studies
Data Structures
Clustering
Clustering is a restricted form of multilevel data, a special simple case of importance that we can cover.
Repeated Measures
Repeated measures is a restricted form of multilevel data, a special simple case of importance that we can cover.
Distinguishing the Independent Sampling Unit from the Observational Unit
Observational Unit: alcohol use by child, in a school, in a particular year.
Independent Sampling Unit (ISU): the school. The ISU, the "Subject," is not always the participant.
N ISUs: how many clusters, here schools.
n total observations: sums over ISUs, clusters, times.
Clustering in One or More Levels
[Diagram: a Community (Level 1) contains Schools (Level 2), and each School contains Classrooms (Level 3).]
Repeated Measures: Memory of Pain Trial
Three Classes of Tests in the General Linear Mixed Model
A) Power for testing fixed effects (means)
B) Power for testing random effects (covariance)
C) Power for testing fixed and random effects
General and accurate power and sample size tools are not available.
There are good methods for most common tests in A.
Power and Sample Size for Fixed Effects in the General Linear Mixed Model
• Many General Linear Mixed Model tests can be recast as tests in the General Linear Multivariate Model, GLMM (Muller and Stewart, 2006; Muller, et al., 2007)
• Why do we care?
- Muller, et al. (1992) show how to do power for time by treatment using the multivariate framework.
• We know how to compute power and sample size for a wide class of linear mixed models!
- Typical clinical trial or longitudinal study in which main inference is about time by treatment interaction, and others.
Four Requirements for a Reversible LMM Scenario
Requirement   Description
1             "Nice" Design Within ISU
2             Specific Covariance of Responses
3             Wald Test Statistic of Fixed Effects
4             Specific Error df for F Reference Distribution
Four Requirements to Recast a LMM as a GLMM: 1
To be reversible to a General Linear Multivariate Model, a Linear Mixed Model scenario must:
• Have a "Nice" Design
- No missing or mistimed data, Balanced Within ISU
- Treatment assignment does not change over time; no repeated covariates
- Saturated in time and time by treatment effects
- Unequal ISU group sizes OK
Four Requirements to Recast a LMM as a GLMM: 2
To be reversible to a General Linear Multivariate Model, a Linear Mixed Model scenario must:
• Fit an Unstructured Covariance Model
Four Requirements to Recast a LMM as a GLMM: 3
To be reversible to a General Linear Multivariate Model, a Linear Mixed Model scenario must:
• Use Wald test for inference about Fixed Effects
- Most common test used for analysis
Four Requirements to Recast a LMM as a GLMM: 4
To be reversible to a General Linear Multivariate Model, a Linear Mixed Model scenario must:
• Use Kenward-Roger F method
- df approximation method with modified covariance matrix
- With reversibility, covariance matrix is unstructured and test is equivalent to Hotelling-Lawley Trace test
- Muller et al. (2007), among many others, showed it gives the best LMM test
Power and Sample Size for Multivariate Model
• Muller, LaVange, Ramey and Ramey (1992) described power for univariate and multivariate approaches to repeated measures, including Hotelling-Lawley Trace (HLT)
• If data analysis fits unstructured covariance, then the Kenward-Roger Wald test is equivalent to HLT when reversible (Edwards, et al., 2008)
There is a second path to a reversible scenario.
• If data analysis fits "random intercept" only, then the Wald test with residual method for error df is equivalent to the univariate approach to repeated measures with uncorrected test (Gurka, Edwards, Muller, 2011)
Two Paths to Reversing a LMM Scenario
                 Repeated Measures (Test)
Requirement      Multivariate (Hotelling)     Univariate (Uncorrected)
1                Balanced within ISU          Balanced within ISU
2                Unstructured Covariance      Compound Symmetric
3                Wald Test Fixed Effects      Wald Test Fixed Effects
4                Kenward-Roger df             Residual df
Either column gives an easy path to power.
Outline
1. Characterizing the Linear Models You See
- Common Tests in the Linear Mixed Model (LMM)
- The LMM as a General Linear Multivariate Model
2. Six-Step Checklist for Power and Sample Size Analysis
- Two Real Design Examples
- Using the Checklist for the Examples
3. Simple Adjustments for Power with Missing Data
4. Free, Web-based Software, GLIMMPSE, and Related Web Resources
First of Two Examples
• Memory of Pain: Proposed study comparing effect of sensory focus intervention to placebo on memory of dental pain (Law et al., 1994; Logan et al., 1995)
Second of Two Examples
• Project Northland Chicago (PNC) Trial: Proposed longitudinal cohort study using data from previous community-randomized controlled trial to test intervention for adolescents (ages 11-14) designed to prevent alcohol use (Komro et al., 2007)
Clustering, Randomization, Repeated Measures
The PNC Trial: Cluster Randomized Design
The PNC Trial: Clustering
[Diagram: Recruit 22 Communities (Community 1, ..., Community 22); each community i contains School 1, ..., School n_i.]
The PNC Trial: Clustering + Randomization
[Diagram: Randomize Community to Treat or Standard Delayed Treatment; each arm collects 6th Grade, 7th Grade, and 8th Grade Data; then Analysis.]
Checklist for Power and Sample Size Analysis
Step 1 • Power or sample size? Type I error?
Step 2 • What is the sampling scheme?
Step 3 • What responses are measured?
Checklist for Power and Sample Size Analysis
Step 4 • What is the primary hypothesis of interest?
Step 5 • What are the means?
Step 6 • What is the covariance structure?
Step 1. What is the Study Design Goal?
Goal for the Memory of Pain Trial
• Determine Sample Size
• Power of 0.9 and α (Type I Error Rate) = 0.01
• Primary Hypothesis: Time trend by Treatment Interaction
• Expect the Treated group mean to be 1.2 points lower in Memory of Pain (5-point scale) compared to the Placebo at the last time measurement (12 months)
Step 2a. Specify Study Design Groups
One-sample, Two-sample, Multi-sample
Two Samples for the Memory of Pain Trial
[Diagram: Randomize Participant to SIT Treatment or Placebo; each arm collects Month 0, Month 6, and Month 12 Data; then Stop.]
Step 2b. Specify Study Design Covariates
Step 2c. Specify Cluster Sampling Scheme
Step 2d. Specify Relative Group Sizes
Step 3a. Specify Response Variables
Step 3b. Specify Repeated Measures
Repeated Measures for the Memory of Pain Trial
Step 4. Specify Primary Hypothesis of Interest
Step 5. Specify Mean Differences Between Groups
Mean Differences for the Memory of Pain Trial
Treatment group mean is 1.2 points lower on Memory of Pain compared to the Placebo group mean at the last time measurement (12 months).
Consider effect sizes of 0.5 times up to 2 times the stated effect to allow for uncertainty of the input information.
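The range of effect sizes described above can be tabulated before running any power calculation. This is a minimal sketch in Python: the slide gives only the endpoints (0.5 and 2 times the stated effect), so the intermediate multipliers and all names here are illustrative assumptions.

```python
# Stated effect from the slide: treatment mean 1.2 points lower at 12 months.
STATED_DIFFERENCE = 1.2

# Endpoints 0.5 and 2 come from the slide; the intermediate multipliers are assumptions.
MULTIPLIERS = [0.5, 1.0, 1.5, 2.0]


def scaled_differences(stated, multipliers):
    """Return (multiplier, scaled mean difference) pairs for a sensitivity analysis."""
    return [(m, m * stated) for m in multipliers]


scenarios = scaled_differences(STATED_DIFFERENCE, MULTIPLIERS)
for multiplier, difference in scenarios:
    print(f"{multiplier:.1f} x stated effect -> mean difference {difference:.2f}")
```

Each scenario would then be fed to the power software as a separate mean difference.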
Step 6. Variance Structure: Multi-level Model Sources of Correlation
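Clustering Covariance Pattern
For clustering, exchangeable sampling induces Compound Symmetry
\[
\sigma^2
\begin{bmatrix} 1 & \rho_0 & \rho_0 \\ \rho_0 & 1 & \rho_0 \\ \rho_0 & \rho_0 & 1 \end{bmatrix}
= 0.25
\begin{bmatrix} 1 & 0.3 & 0.3 \\ 0.3 & 1 & 0.3 \\ 0.3 & 0.3 & 1 \end{bmatrix}
\]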
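The compound symmetry pattern for a cluster can be built in a few lines. This is a minimal sketch in plain Python using the slide's values (variance 0.25, intraclass correlation 0.3, cluster size 3); the function name is an illustrative assumption, not part of any software mentioned in the talk.

```python
def compound_symmetry(size, variance, icc):
    """Return a size x size compound-symmetric covariance matrix as nested lists."""
    return [
        [variance * (1.0 if row == col else icc) for col in range(size)]
        for row in range(size)
    ]


# Values from the slide: variance 0.25, intraclass correlation 0.3, cluster size 3.
cs = compound_symmetry(size=3, variance=0.25, icc=0.3)
for row in cs:
    print([round(value, 3) for value in row])
```

Every pair of members in the same cluster shares the same covariance, which is what exchangeable sampling means here.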
Covariance Patterns Often Used for Repeated Measures
• Unstructured
• AR(1)
• Linear Exponent AR(1) (LEAR allows slower decay)
• "Random Intercept" implies CS covariance of responses; Gurka, Edwards, Muller (2011) showed danger
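Covariance Patterns for Repeated Measures - Unstructured
Unstructured
\[
\sigma^2
\begin{bmatrix} 1 & \rho_1 & \rho_2 \\ \rho_1 & 1 & \rho_3 \\ \rho_2 & \rho_3 & 1 \end{bmatrix}
= 0.25
\begin{bmatrix} 1 & 0.3 & 0.2 \\ 0.3 & 1 & 0.5 \\ 0.2 & 0.5 & 1 \end{bmatrix}
\]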
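Covariance Patterns for Repeated Measures - AR(1)
First order autoregressive
\[
\sigma^2
\begin{bmatrix} 1 & \rho_1 & \rho_1^2 \\ \rho_1 & 1 & \rho_1 \\ \rho_1^2 & \rho_1 & 1 \end{bmatrix}
= 0.25
\begin{bmatrix} 1 & 0.30 & 0.09 \\ 0.30 & 1 & 0.30 \\ 0.09 & 0.30 & 1 \end{bmatrix}
\]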
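The AR(1) pattern follows from a single rule: the correlation between two equally spaced occasions is the lag-1 correlation raised to the number of steps between them. This is a minimal sketch in plain Python with the slide's values (variance 0.25, lag-1 correlation 0.30); the function name is an illustrative assumption.

```python
def ar1_covariance(n_times, variance, rho):
    """Covariance with correlation rho ** |j - k| between equally spaced occasions j and k."""
    return [
        [variance * rho ** abs(j - k) for k in range(n_times)]
        for j in range(n_times)
    ]


# Values from the slide: variance 0.25, lag-1 correlation 0.30, three occasions.
ar1 = ar1_covariance(n_times=3, variance=0.25, rho=0.3)
correlations = [[round(value / 0.25, 2) for value in row] for row in ar1]
for row in correlations:
    print(row)
```

The lag-2 correlation is 0.3 squared, matching the 0.09 in the pattern above.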
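Covariance Patterns for Repeated Measures - LEAR
Linear Exponent AR(1) (δ = 0.5)
\[
\sigma^2
\begin{bmatrix} 1 & \rho_1 & \rho_1^{1+\delta} \\ \rho_1 & 1 & \rho_1 \\ \rho_1^{1+\delta} & \rho_1 & 1 \end{bmatrix}
= 0.25
\begin{bmatrix} 1 & 0.30 & 0.16 \\ 0.30 & 1 & 0.30 \\ 0.16 & 0.30 & 1 \end{bmatrix}
\]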
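The LEAR pattern can be sketched in plain Python. The slide does not state the exponent formula, so this assumes the form d_min + δ (d - d_min) / (d_max - d_min), where d is the distance between two occasions; with three equally spaced occasions that gives an exponent of 1 + δ at lag 2, which reproduces the 0.16 shown for δ = 0.5. The function name and the convention that times are in units of the measurement interval are also assumptions.

```python
def lear_correlation(times, rho, delta):
    """LEAR-style correlation matrix (assumed exponent form, see lead-in).

    Exponent for distance d: d_min + delta * (d - d_min) / (d_max - d_min).
    """
    distances = [abs(s - t) for s in times for t in times if s != t]
    d_min, d_max = min(distances), max(distances)
    span = d_max - d_min

    def corr(s, t):
        if s == t:
            return 1.0
        d = abs(s - t)
        scaled = (d - d_min) / span if span > 0 else 0.0
        return rho ** (d_min + delta * scaled)

    return [[corr(s, t) for t in times] for s in times]


# Slide values: lag-1 correlation 0.30, decay rate delta = 0.5, three occasions.
lear = lear_correlation(times=[0, 1, 2], rho=0.3, delta=0.5)
for row in lear:
    print([round(value, 2) for value in row])
```

Setting delta to 1 gives the AR(1) exponent at lag 2, and smaller values give slower decay.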
Commonly Used Covariance Patterns for Multiple Response Variables
• Unstructured observed
• Structure from Structural Equations Model
• Theoretical framework
Building Overall Covariance Structure
Clustering
Repeated Measures
Multiple Response Variables
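Building Overall Covariance Structure
\[
\Sigma = \underbrace{\sigma^2}_{\text{Variance}}
\underbrace{\begin{bmatrix} 1 & \rho_0 & \rho_0 \\ \rho_0 & 1 & \rho_0 \\ \rho_0 & \rho_0 & 1 \end{bmatrix}}_{\text{Clusters of Size 3}}
\otimes
\underbrace{\begin{bmatrix} 1 & \rho_1 & \rho_2 \\ \rho_1 & 1 & \rho_3 \\ \rho_2 & \rho_3 & 1 \end{bmatrix}}_{\text{3 Repeated Measures}}
\otimes
\underbrace{\begin{bmatrix} 1 & \rho_4 \\ \rho_4 & 1 \end{bmatrix}}_{\text{2 Response Variables}}
\]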
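The overall structure is a Kronecker product of one correlation matrix per source of correlation, scaled by the variance. This is a minimal sketch in plain Python: the cluster and repeated-measures values reuse numbers from the earlier pattern slides, while the correlation of 0.4 between the two response variables is a hypothetical value chosen only for illustration.

```python
def kronecker(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [
        [a_val * b_val for a_val in a_row for b_val in b_row]
        for a_row in a
        for b_row in b
    ]


clusters = [[1.0, 0.3, 0.3], [0.3, 1.0, 0.3], [0.3, 0.3, 1.0]]  # rho_0 = 0.3
times = [[1.0, 0.3, 0.2], [0.3, 1.0, 0.5], [0.2, 0.5, 1.0]]     # rho_1, rho_2, rho_3
responses = [[1.0, 0.4], [0.4, 1.0]]                            # rho_4 = 0.4 is hypothetical
variance = 0.25

correlation = kronecker(kronecker(clusters, times), responses)
sigma = [[variance * value for value in row] for row in correlation]
print(len(sigma), "x", len(sigma[0]))
```

With clusters of size 3, 3 repeated measures, and 2 responses, each ISU contributes 3 x 3 x 2 = 18 observations, so the matrix is 18 x 18.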
Overall Covariance Model for Memory of Pain Trial
- Variance of Memory of Pain = 0.96
- Correlation of responses 6 months apart = 0.5
- Correlation decays slowly over time; between 0 and 12 months, correlation = 0.4
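Overall Covariance Model for Memory of Pain Trial
\[
\sigma^2
\begin{bmatrix} 1 & \rho_1 & \rho_2 \\ \rho_1 & 1 & \rho_3 \\ \rho_2 & \rho_3 & 1 \end{bmatrix}
= 0.96
\begin{bmatrix} 1 & 0.5 & 0.4 \\ 0.5 & 1 & 0.5 \\ 0.4 & 0.5 & 1 \end{bmatrix}
\]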
Example 2: PNC, Alcohol Use Prevention Study Power
[Diagram: Recruit 22 Communities (Community 1, ..., Community 22); each community i contains School 1, ..., School n_i.]
PNC, Alcohol Use Prevention Study Example for Power
[Diagram: Randomize Community to Treat or Standard Delayed Treatment; each arm collects 6th Grade, 7th Grade, and 8th Grade Data; then Analysis.]
PNC Trial: Study Design Checklist
1. What is the study design goal?
a. Solving for power or sample size: Power
b. Type I error rate: 0.05
PNC Trial: Study Design Checklist
• What is the sampling scheme?
- How many groups? 2 treatment groups
- What are the covariates? None
- Is clustering present? Yes; one level
- Are group sizes equal or unequal? Equal, with 10 communities per group
PNC Trial: Study Design Checklist
• What responses are measured?
- What are the response variables? Alcohol use behavior scale
- Are repeated measures present? Yes, at 6th, 7th and 8th grades
• What is the primary hypothesis of interest?
- Time Trend by Treatment Interaction
PNC Trial: Study Design Checklist
• What are the means?
- Mean difference is 0.25 reduction in self reported alcohol use in treatment group vs. control
PNC Trial: Study Design Checklist
• What is the variance structure?
• What are the sources of correlation in the study design?
- Clustering (one level), with clusters of size 10 (# children/cluster)
- Repeated Measures, 3 occasions, 1 year apart
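The checklist answers fix both the number of independent sampling units and the total number of observations, which are easy to confuse. This is a minimal sketch of the arithmetic in Python, using the checklist values for the PNC example (2 treatment groups, 10 communities per group, clusters of size 10, 3 occasions); the variable names are illustrative.

```python
GROUPS = 2                   # treatment groups
COMMUNITIES_PER_GROUP = 10   # independent sampling units (ISUs) per group
CHILDREN_PER_COMMUNITY = 10  # cluster size
OCCASIONS = 3                # 6th, 7th and 8th grade

n_isu = GROUPS * COMMUNITIES_PER_GROUP
observations_per_isu = CHILDREN_PER_COMMUNITY * OCCASIONS
total_observations = n_isu * observations_per_isu

print(f"ISUs: {n_isu}")
print(f"Observations per ISU: {observations_per_isu}")
print(f"Total observations: {total_observations}")
```

The covariance matrix for one ISU therefore has 30 rows and columns, while the sample size that drives power is the count of ISUs.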
PNC Trial: Study Design Checklist
• What is the variance structure?
• What is the pattern of variability for each source of correlation?
- Variance: σ² = 0.09
- Intraclass correlation for community: ρ₀ = 0.01
- Correlation for responses 1 year apart: ρ₁ = 0.3
- Correlation decays slowly with decay rate: δ = 0.3
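Overall Covariance Structure for PNC Trial
\[
\sigma^2
\begin{bmatrix} 1 & \cdots & \rho_0 \\ \vdots & \ddots & \vdots \\ \rho_0 & \cdots & 1 \end{bmatrix}
\otimes
\begin{bmatrix} 1 & \rho_1 & \rho_1^{1+\delta} \\ \rho_1 & 1 & \rho_1 \\ \rho_1^{1+\delta} & \rho_1 & 1 \end{bmatrix}
= 0.09
\begin{bmatrix} 1 & \cdots & 0.01 \\ \vdots & \ddots & \vdots \\ 0.01 & \cdots & 1 \end{bmatrix}
\otimes
\begin{bmatrix} 1 & 0.30 & 0.21 \\ 0.30 & 1 & 0.30 \\ 0.21 & 0.30 & 1 \end{bmatrix}
\]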
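Because the overall structure is a Kronecker product, any single covariance can be read off as the variance times one clustering correlation times one time correlation, without building the full matrix. This is a minimal sketch in plain Python with the PNC inputs (variance 0.09, intraclass correlation 0.01, lag-1 correlation 0.3, decay rate 0.3). The lag-2 exponent 1 + δ is the form that reproduces the slide's 0.21; the function names and the child/grade indexing are illustrative assumptions.

```python
VARIANCE = 0.09  # sigma squared
ICC = 0.01       # intraclass correlation within a community
RHO = 0.3        # correlation for responses 1 year apart
DELTA = 0.3      # LEAR decay rate


def time_correlation(grade_a, grade_b):
    """Correlation between two grades (6, 7, 8) under the assumed LEAR exponent."""
    lag = abs(grade_a - grade_b)
    if lag == 0:
        return 1.0
    # With lags of 1 and 2 years, the exponent is 1 at lag 1 and 1 + DELTA at lag 2.
    return RHO ** (1 + DELTA * (lag - 1))


def child_correlation(child_a, child_b):
    """Compound symmetry within a community: 1 for the same child, ICC otherwise."""
    return 1.0 if child_a == child_b else ICC


def covariance(child_a, grade_a, child_b, grade_b):
    """One element of the Kronecker-structured covariance for a single community."""
    return VARIANCE * child_correlation(child_a, child_b) * time_correlation(grade_a, grade_b)


print(round(covariance(0, 6, 0, 6), 4))  # same child, same grade
print(round(covariance(0, 6, 0, 7), 4))  # same child, one year apart
print(round(covariance(0, 6, 1, 6), 4))  # two children, same grade
```

This element-wise view makes it easy to sanity-check individual entries of a large matrix against the inputs.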
Outline
1. Characterizing the Linear Models You See
- Common Tests in the Linear Mixed Model (LMM)
- The LMM as a General Linear Multivariate Model
2. Six-Step Checklist for Power and Sample Size Analysis
- Two Real Design Examples
- Using the Checklist for the Examples
3. Simple Adjustments for Power with Missing Data
4. Free, Web-based Software, GLIMMPSE, and Related Web Resources
3. Simple Adjustments for Power with Missing Data
Missing Data Adjustments
• Some useful approximations from null case results in Catellier and Muller (2000) and better non-null case results in Ringham et al. (in review):
- Complete data power is an upper bound
- Power for N = (100% - % missing) x # ISUs appears somewhat liberal
- Power for N = (100% - % missing)^p x # ISUs appears somewhat conservative to OK
- Results assuming Missing at Random
• More work is in progress to identify better approximations
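The first adjustment listed above, N = (100% - % missing) x # ISUs, is simple arithmetic. This is a minimal sketch in Python; the function name is an illustrative assumption, the example inputs (20 ISUs, 10% missing) are hypothetical, and the slides do not say how to round a fractional result, so none is applied.

```python
def adjusted_isu_count(n_isu, fraction_missing):
    """N = (100% - % missing) x # ISUs, with the missing fraction on a 0-1 scale."""
    if not 0.0 <= fraction_missing < 1.0:
        raise ValueError("fraction_missing must be in [0, 1)")
    return (1.0 - fraction_missing) * n_isu


# Hypothetical inputs: 20 ISUs planned, 10% of data expected to be missing.
adjusted = adjusted_isu_count(n_isu=20, fraction_missing=0.10)
print(f"Run the complete-data power calculation with N = {adjusted:.1f}")
```

The adjusted N would replace the number of ISUs in an otherwise unchanged complete-data power calculation.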
Review
• Under widely applicable restrictions, a General Linear Mixed Model can be expressed as a General Linear Multivariate Model with accurate power and sample size available.
• Answers to a series of simple questions can completely specify the inputs to a power analysis.
• Convenient adjustments may suffice for simple missing data patterns.
Outline
1. Characterizing the Linear Models You See
- Common Tests in the Linear Mixed Model (LMM)
- The LMM as a General Linear Multivariate Model
2. Six-Step Checklist for Power and Sample Size Analysis
- Two Real Design Examples
- Using the Checklist for the Examples
3. Simple Adjustments for Power with Missing Data
4. Free, Web-based Software, GLIMMPSE, and Related Web Resources
Any Brief Questions Before We Look at Our Software?
BIBLIOGRAPHY
Books
Kleinbaum DG, Kupper LL, Nizam A, and Muller KE (2007) Applied Regression Analysis and Other Multivariable Methods. 4th edition. Boston: Duxbury Press.
Muller KE and Stewart PW (2006) Linear Model Theory; Univariate, Multivariate, and Mixed Models. New York: Wiley.
Muller KE and Fetterman BA (2002) Regression and ANOVA: An Integrated Approach Using SAS® Software. Cary, NC: SAS Institute.

The following articles can be accessed from www.samplesizeshop.org or www.health-outcomes-policy.ufl.edu/muller
Catellier DJ and Muller KE (2000) Tests for gaussian repeated measures with missing data in small samples, Statistics in Medicine, 19, 1101-1114.
Cheng J, Edwards LJ, Maldonado-Molina MM, Komro KA, and Muller KE (2010) Real longitudinal data analysis for real people: building a good enough mixed model, Statistics in Medicine, 29, 504-520.
Edwards LJ, Muller KE, Wolfinger RD, Qaqish BF, and Schabenberger O (2008) An R-square statistic for fixed effects in the linear mixed model, Statistics in Medicine, 27, 6137-6157.
Glueck DH and Muller KE (2003) Adjusting power for a baseline covariate in linear models, Statistics in Medicine, 22, 2535-2551.
Guo Y, Logan HL, Glueck DH, and Muller KE (2013) Selecting a sample size for studies with repeated measures, BMC Medical Research Methodology, in press.
Gurka MJ, Edwards LJ, Muller KE, and Kupper LL (2006) Extending the Box-Cox transformation to the linear mixed model, Journal of the Royal Statistical Society Series A, Statistics in Society, 169, 255-272.
Gurka MJ, Muller KE, and Edwards LJ (2011) Avoiding bias in mixed model inference for fixed effects, Statistics in Medicine, 30, 2696-2707. doi:10.1002/sim.4293
Komro KA, Maldonado-Molina MM, Tobler AL, Bonds JR, and Muller KE (2007) Effects of home access and availability of alcohol on young adolescents' alcohol use, Addiction (Abingdon, England), 102, 1597-1608.
Muller KE (2009) Analysis of Variance (ANOVA concepts and computations), Wiley Interdisciplinary Reviews: Computational Statistics.
Muller KE and Barton CN (1989) Approximate Power for Repeated-Measures ANOVA Lacking Sphericity, Journal of the American Statistical Association, 84, 549-555.
Muller KE, Edwards LJ, Simpson SL, and Taylor DJ (2007) Statistical tests with accurate size and power for balanced linear mixed models, Statistics in Medicine, 26, 3639-3660. doi:10.1002/sim.2827
Muller KE, Lavange LM, Ramey SL, and Ramey CT (1992) Power calculations for general linear multivariate models including repeated measures applications, Journal of the American Statistical Association, 87, 1209-1226.
Muller KE and Peterson BL (1984) Practical methods for computing power in testing the multivariate general linear hypothesis, Computational Statistics and Data Analysis, 2, 143-158.
Simpson SL, Edwards LJ, Muller KE, Sen PK, and Styner MA (2010) A linear exponent AR(1) family of correlation structures, Statistics in Medicine, 29, 1825-1838.
Taylor DJ and Muller KE (1995) Computing confidence bounds for power and sample size of the general linear univariate model, The American Statistician, 49, 43-47. doi:10.2307/2684810

Other Sources
Adams G, Gulliford MC, Ukoumunne OC, Eldridge S, Chinn S, and Campbell MJ (2004) Patterns of intra-cluster correlation from primary care research to inform study design and analysis, Journal of Clinical Epidemiology, 57, 785-794.
Demidenko E (2004) Mixed Models: Theory and Applications (1st ed.). Wiley-Interscience.
Kerry SM and Bland JM (1998) The intracluster correlation coefficient in cluster randomisation, BMJ (Clinical research ed.), 316(7142), 1455.
Laird NM and Ware JH (1982) Random-effects models for longitudinal data, Biometrics, 38, 963-974.
Law A, Logan H, and Baron RS (1994) Desire for control, felt control, and stress inoculation training during dental treatment, Journal of Personality and Social Psychology, 67, 926-936.
Logan HL, Baron RS, Keeley K, Law A, and Stein S (1991) Desired control and felt control as mediators of stress in a dental setting, Health Psychology, 10, 352-359.