Easy Power and Sample Size for Most of the Mixed Models ... · - Common Tests in the Linear Mixed...


Easy Power and Sample Size for Most of the Mixed Models You Will Ever See

Keith E. Muller
Chief, Division of Methodology
Department of Health Outcomes and Policy
University of Florida, [email protected]

Anna E. Barón (Colorado SPH)
Sarah M. Kreidler (Colorado SPH)
Yueh-Yun Chi (Florida Biostatistics)
Deborah H. Glueck (Colorado SPH)

http://samplesizeshop.org/


Sponsorship
Supported by the National Institute of Dental and Craniofacial Research under award NIDCR 1 R01 DE020832-01A1, Multilevel and Longitudinal Study Sample Size Tools for Behavioral Scientists (12/09/10-11/30/14), led by Keith Muller and Deborah Glueck. The content is solely the responsibility of the authors, and does not necessarily represent the official views of the National Institute of Dental and Craniofacial Research nor the National Institutes of Health.

Additional support from NIAAA R01-AA013458-01, NHLBI R01-HL091005, and NIDCR U54-DE019261.


Saga of Sample Size Selection
• We have long needed to select sample size for designs with clusters, repeated measures and multiple outcomes, and now we see combinations.
• Existing approaches: 1) simulations, 2) exemplary data, 3) large sample approximations, and 4) special cases.
• We think the ideas and software we present today make the job easier than ever before.
• The first version of our free power software was written 30 years ago.
• Previous versions matrix based, user hostile. Now point and click (GUI).


Software Development Team UCD


Outline
1. Characterizing the Linear Models You See
- Common Tests in the Linear Mixed Model (LMM)
- The LMM as a General Linear Multivariate Model

2. Six-Step Checklist for Power and Sample Size Analysis
- Two Real Design Examples
- Using the Checklist for the Examples

3. Simple Adjustments for Power with Missing Data

4. Free, Web-based Software, GLIMMPSE, and Related Web Resources


1. Characterizing the Linear Models You See
General Linear Mixed Model Commonly Used for Clustered and Repeated Measures Data
• Laird and Ware (1982); Demidenko (2004); Muller and Stewart (2006)
• Studies with Clustering
- Designed: Cluster randomized studies
- Observational: Clustered observations
• Studies with Repeated Measures
- Designed: Randomized clinical trials
- Observational: Cohort studies, natural history
• Combinations
- Cluster randomized longitudinal studies


Data Structures

Clustering
Clustering is a restricted form of multilevel data, a special simple case of importance that we can cover.

Repeated Measures
Repeated measures are a restricted form of multilevel data, a special simple case of importance that we can cover.


Distinguishing the Independent Sampling Unit from the Observational Unit

Observational Unit: alcohol use by child in a school in a particular year

Independent Sampling Unit (ISU): the school. The ISU, the "Subject," is not always the participant.

Number of ISUs: how many clusters, here schools. The total number of observations sums over ISUs, clusters, and times.
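As a small illustration of the distinction, the sketch below counts total observations in a balanced design, where every ISU has the same cluster size and the same number of measurement occasions. The function name and the example numbers (20 schools, 10 children each, 3 years) are hypothetical and are not taken from the slides.

```python
def total_observations(n_isu: int, cluster_size: int, n_times: int) -> int:
    """Total observational units in a balanced design.

    Assumes every ISU has `cluster_size` members and every member
    is measured on `n_times` occasions (no missing data).
    """
    return n_isu * cluster_size * n_times


# Hypothetical example: 20 schools (ISUs), 10 children per school, 3 years.
n_obs = total_observations(n_isu=20, cluster_size=10, n_times=3)
print(n_obs)
```

The sample size that drives power is the number of ISUs (20 here), not the much larger count of observations.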


Clustering in One or More Levels

[Diagram: a Community (Level 1) contains Schools (Level 2); each School contains Classrooms (Level 3).]


Repeated Measures: Memory of Pain Trial


Three Classes of Tests in the General Linear Mixed Model

A) Power for testing fixed effects (means)
B) Power for testing random effects (covariance)
C) Power for testing fixed and random effects

General and accurate power and sample size tools are not available.

There are good methods for most common tests in A.


Power and Sample Size for Fixed Effects in the General Linear Mixed Model

• Many General Linear Mixed Model tests can be recast as tests in the General Linear Multivariate Model, GLMM (Muller and Stewart, 2006; Muller, et al., 2007)
• Why do we care?
- Muller, et al. (1992) show how to do power for time by treatment using the multivariate framework.
• We know how to compute power and sample size for a wide class of linear mixed models!
- Typical clinical trial or longitudinal study in which main inference is about time by treatment interaction, and others.
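For orientation, the General Linear Multivariate Model and its general linear hypothesis are commonly written as below. The symbols follow standard textbook notation and are an assumption here, since the slides do not display the model equation: Y is N x p (one row per ISU), X is N x q, B is q x p, C holds between-ISU contrasts, and U holds within-ISU contrasts.

```latex
\[
Y = XB + E, \qquad
\text{rows of } E \text{ independent } N_p(0, \Sigma), \qquad
H_0 : \Theta = C B U = \Theta_0
\]
```

A time by treatment interaction corresponds to choosing C to compare groups and U to compare occasions.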


Four Requirements for a Reversible LMM Scenario

Requirement   Description
1             "Nice" Design Within ISU
2             Specific Covariance of Responses
3             Wald Test Statistic of Fixed Effects
4             Specific Error df for Reference Distribution


Four Requirements to Recast a LMM as a GLMM: 1
To be reversible to a General Linear Multivariate Model, a Linear Mixed Model scenario must:
• Have a "Nice" Design
- No missing or mistimed data, Balanced Within ISU
- Treatment assignment does not change over time; no repeated covariates
- Saturated in time and time by treatment effects
- Unequal ISU group sizes OK


Four Requirements to Recast a LMM as a GLMM: 2
To be reversible to a General Linear Multivariate Model, a Linear Mixed Model scenario must:
• Fit an Unstructured Covariance Model


Four Requirements to Recast a LMM as a GLMM: 3
To be reversible to a General Linear Multivariate Model, a Linear Mixed Model scenario must:
• Use Wald test for inference about Fixed Effects
- Most common test used for analysis


Four Requirements to Recast a LMM as a GLMM: 4
To be reversible to a General Linear Multivariate Model, a Linear Mixed Model scenario must:
• Use Kenward-Roger method
- df approximation method with modified covariance matrix
- With reversibility, covariance matrix is unstructured and test is equivalent to Hotelling-Lawley Trace test
- Muller et al. (2007), among many others, showed it gives the best LMM test


Power and Sample Size for Multivariate Model
• Muller, LaVange, Ramey and Ramey (1992) described power for univariate and multivariate approaches to repeated measures, including Hotelling-Lawley Trace (HLT)
• If data analysis fits unstructured covariance, then Kenward-Roger Wald test equivalent to HLT when reversible (Edwards, et al., 2008)

There is a second path to a reversible scenario.
• If data analysis fits "random intercept" only, then Wald Test with residual method for error df equivalent to univariate approach to repeated measures with uncorrected test (Gurka, Edwards, Muller, 2011)


Two Paths to Reversing a LMM Scenario

              Repeated Measures (Test)
Requirement   Multivariate (Hotelling)    Univariate (Uncorrected)
1             Balanced within ISU         Balanced within ISU
2             Unstructured Covariance     Compound Symmetric
3             Wald Test Fixed Effects     Wald Test Fixed Effects
4             Kenward-Roger df            Residual df

Either column gives an easy path to power.


Outline
1. Characterizing the Linear Models You See
- Common Tests in the Linear Mixed Model (LMM)
- The LMM as a General Linear Multivariate Model




First of Two Examples
• Memory of Pain: Proposed study comparing effect of sensory focus intervention to placebo on memory of dental pain (Law et al., 1994; Logan et al., 1995)


Second of Two Examples
• Project Northland Chicago (PNC) Trial: Proposed longitudinal cohort study using data from previous community-randomized controlled trial to test intervention for adolescents (ages 11-14) designed to prevent alcohol use (Komro et al., 2007)

Clustering
Randomization
Repeated Measures


The PNC Trial: Cluster Randomized Design
The PNC Trial: Clustering

[Diagram: Recruit 22 Communities (Community 1, ..., Community 22); each community contains its own set of schools (School 1, ..., School n).]


The PNC Trial: Clustering + Randomization

[Diagram: Randomize Community into two arms, "Treat" and "Standard Delayed Treatment"; each arm contributes 6th Grade Data, 7th Grade Data, and 8th Grade Data, followed by Analysis.]


Checklist for Power and Sample Size Analysis

Step 1 • Power or sample size? Type I error?
Step 2 • What is the sampling scheme?
Step 3 • What responses are measured?


Checklist for Power and Sample Size Analysis

Step 4 • What is the primary hypothesis of interest?
Step 5 • What are the means?
Step 6 • What is the covariance structure?


Step 1. What is the Study Design Goal?


Goal for the Memory of Pain Trial
• Determine Sample Size
• Power of 0.9 and α (Type I Error Rate) = 0.01
• Primary Hypothesis: Time trend by Treatment Interaction
• Expect the Treated group mean to be 1.2 points lower in Memory of Pain (5-point scale) compared to the Placebo at the last time measurement (12 months)
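As a rough cross-check only, the sketch below applies the familiar two-sample normal approximation to the 12-month difference alone. This is not the multivariate method the talk describes: it ignores the repeated measures and the small-sample error df, so it is a ballpark figure. The variance of 0.96 is the value stated on the covariance slide for this trial; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_sample(delta: float, variance: float,
                           alpha: float, power: float) -> int:
    """Normal-approximation sample size per group for a two-sided,
    two-sample comparison of means with a common variance."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_power = z.inv_cdf(power)              # quantile for the target power
    return ceil(2.0 * variance * (z_alpha + z_power) ** 2 / delta ** 2)


# Inputs from the slides: 1.2-point difference, variance 0.96,
# Type I error rate 0.01, target power 0.9.
n = n_per_group_two_sample(delta=1.2, variance=0.96, alpha=0.01, power=0.9)
print(n)
```

A figure like this is useful mainly for spotting gross input errors before running the full time by treatment interaction analysis.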


Step 2a. Specify Study Design Groups
One-sample, Two-sample, Multi-sample


Two Samples for the Memory of Pain Trial

[Diagram: Randomize Participant into two arms, SIT Treatment and Placebo; each arm contributes Month 0 Data, Month 6 Data, and Month 12 Data, followed by Stop.]


Step 2b. Specify Study Design Covariates


Step 2c. Specify Cluster Sampling Scheme


Step 2d. Specify Relative Group Sizes


Step 3a. Specify Response Variables


Step 3b. Specify Repeated Measures


Repeated Measures for the Memory of Pain Trial


Step 4. Specify Primary Hypothesis of Interest


Step 5. Specify Mean Differences Between Groups


Mean Differences for the Memory of Pain Trial
Treatment group mean is 1.2 points lower on Memory of Pain compared to the Placebo group mean at the last time measurement (12 months).

Consider effect sizes of 0.5 times up to 2 times the stated effect to allow for uncertainty of the input information.
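One way to picture the scaling is to write the cell means as a matrix and multiply it by each scale factor before forming the time by treatment contrast. In the sketch below, only the 1.2-point gap at month 12 comes from the slide. The zero placeholders for the other cells, the successive-difference choice for the within-ISU contrast, and all variable names are assumptions made for illustration.

```python
import numpy as np

# Assumed cell means: rows are Treatment, Placebo; columns are months 0, 6, 12.
# Only the -1.2 entry reflects the slide; the zeros are placeholders.
base_means = np.array([[0.0, 0.0, -1.2],
                       [0.0, 0.0, 0.0]])

# Between-group contrast: Treatment minus Placebo.
C = np.array([[1.0, -1.0]])

# Within-ISU contrasts: month 6 minus month 0, and month 12 minus month 6.
U = np.array([[-1.0, 0.0],
              [1.0, -1.0],
              [0.0, 1.0]])


def interaction_contrast(means: np.ndarray, scale: float = 1.0) -> np.ndarray:
    """Return C @ (scale * means) @ U, the time by treatment contrast."""
    return C @ (scale * means) @ U


# Scale the stated effect by 0.5, 1 and 2, as the slide suggests.
for scale in (0.5, 1.0, 2.0):
    print(scale, interaction_contrast(base_means, scale))
```

Running the power analysis once per scale factor gives a range of sample sizes, which shows how sensitive the design is to the assumed effect.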


Step 6. Variance Structure: Multi-level Model Sources of Correlation


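Clustering Covariance Pattern
For clustering, exchangeable sampling induces Compound Symmetry

\[
\sigma^2
\begin{bmatrix}
1 & \rho_0 & \rho_0 \\
\rho_0 & 1 & \rho_0 \\
\rho_0 & \rho_0 & 1
\end{bmatrix}
= 0.25
\begin{bmatrix}
1 & 0.3 & 0.3 \\
0.3 & 1 & 0.3 \\
0.3 & 0.3 & 1
\end{bmatrix}
\]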

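A compound symmetric covariance matrix can be built for any cluster size with a few lines of NumPy. The sketch below reproduces the slide's example (variance 0.25, intraclass correlation 0.3, cluster of size 3); the function name is hypothetical.

```python
import numpy as np


def compound_symmetry(size: int, variance: float, rho: float) -> np.ndarray:
    """Covariance with equal variances and one common correlation `rho`."""
    corr = (1.0 - rho) * np.eye(size) + rho * np.ones((size, size))
    return variance * corr


# Slide example: variance 0.25, correlation 0.3, cluster of size 3.
cs_cov = compound_symmetry(size=3, variance=0.25, rho=0.3)
print(cs_cov)
```

Every off-diagonal entry equals the variance times the correlation, which is what exchangeable sampling within a cluster implies.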

Covariance Patterns Often Used for Repeated Measures
• Unstructured
• AR(1)
• Linear Exponent AR(1) (LEAR allows slower decay)
• "Random Intercept" implies CS covariance of responses; Gurka, Edwards, Muller (2011) showed danger


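Covariance Patterns for Repeated Measures - Unstructured

\[
\sigma^2
\begin{bmatrix}
1 & \rho_1 & \rho_2 \\
\rho_1 & 1 & \rho_3 \\
\rho_2 & \rho_3 & 1
\end{bmatrix}
= 0.25
\begin{bmatrix}
1 & 0.3 & 0.2 \\
0.3 & 1 & 0.5 \\
0.2 & 0.5 & 1
\end{bmatrix}
\]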


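Covariance Patterns for Repeated Measures - AR(1)
First order autoregressive

\[
\sigma^2
\begin{bmatrix}
1 & \rho_1 & \rho_1^2 \\
\rho_1 & 1 & \rho_1 \\
\rho_1^2 & \rho_1 & 1
\end{bmatrix}
= 0.25
\begin{bmatrix}
1 & 0.30 & 0.09 \\
0.30 & 1 & 0.30 \\
0.09 & 0.30 & 1
\end{bmatrix}
\]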
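The AR(1) pattern raises the adjacent-occasion correlation to the power of the lag between occasions. A minimal sketch, assuming equally spaced occasions and using a hypothetical function name:

```python
import numpy as np


def ar1_corr(n_times: int, rho: float) -> np.ndarray:
    """AR(1) correlation matrix for equally spaced occasions."""
    idx = np.arange(n_times)
    lag = np.abs(idx[:, None] - idx[None, :])  # lag between each pair
    return rho ** lag


# Slide example: variance 0.25, adjacent correlation 0.3, 3 occasions.
ar1_cov = 0.25 * ar1_corr(n_times=3, rho=0.3)
print(ar1_corr(3, 0.3))
```

With a correlation of 0.3 the lag-2 correlation is already 0.09, which illustrates how quickly AR(1) decays.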


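Covariance Patterns for Repeated Measures - LEAR
Linear Exponent AR(1) (δ = 0.5)

\[
\sigma^2
\begin{bmatrix}
1 & \rho_1 & \rho_1^{1+\delta} \\
\rho_1 & 1 & \rho_1 \\
\rho_1^{1+\delta} & \rho_1 & 1
\end{bmatrix}
= 0.25
\begin{bmatrix}
1 & 0.30 & 0.16 \\
0.30 & 1 & 0.30 \\
0.16 & 0.30 & 1
\end{bmatrix}
\]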
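The LEAR family of Simpson et al. (2010) uses the exponent d_min + δ (d − d_min) / (d_max − d_min), where d is the distance between two occasions. The sketch below assumes that form and assumes times are expressed in units where the closest pair of occasions is one unit apart; the function name is hypothetical.

```python
import numpy as np


def lear_corr(times, rho: float, delta: float) -> np.ndarray:
    """Linear exponent AR(1) correlation matrix for the given times."""
    t = np.asarray(times, dtype=float)
    d = np.abs(t[:, None] - t[None, :])   # pairwise distances
    positive = d[d > 0]
    d_min, d_max = positive.min(), positive.max()
    span = d_max - d_min
    # With only one distinct distance there is nothing to scale.
    scaled = (d - d_min) / span if span > 0 else np.zeros_like(d)
    corr = rho ** (d_min + delta * scaled)
    np.fill_diagonal(corr, 1.0)           # each occasion with itself
    return corr


# Slide example: base correlation 0.3, decay rate 0.5, three occasions.
lear = lear_corr(times=[0, 1, 2], rho=0.3, delta=0.5)
print(np.round(lear, 2))
```

Under this form a decay rate of 0 gives compound symmetry, and for three equally spaced occasions a decay rate of 1 gives AR(1), so values in between give the slower decay the slide mentions.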


Commonly Used Covariance Patterns for Multiple Response Variables

• Unstructured observed
• Structure from Structural Equations Model
• Theoretical framework


Building Overall Covariance Structure

Clustering

Repeated Measures

Multiple Response Variables


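Building Overall Covariance Structure

Factors, in order: Variance; Clusters (clusters of size 3); Time (3 repeated measures); Responses (2 response variables)

\[
\Sigma = \sigma^2
\begin{bmatrix}
1 & \rho_0 & \rho_0 \\
\rho_0 & 1 & \rho_0 \\
\rho_0 & \rho_0 & 1
\end{bmatrix}
\otimes
\begin{bmatrix}
1 & \rho_1 & \rho_2 \\
\rho_1 & 1 & \rho_3 \\
\rho_2 & \rho_3 & 1
\end{bmatrix}
\otimes
\begin{bmatrix}
1 & \rho_4 \\
\rho_4 & 1
\end{bmatrix}
\]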
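The Kronecker structure can be assembled directly with `numpy.kron`. In the sketch below, the cluster and time correlations reuse the illustrative values from the earlier covariance slides, while the between-response correlation of 0.4 and the variable names are hypothetical choices made only so the example runs.

```python
import numpy as np

sigma2 = 0.25  # common variance, as in the earlier slide examples

# Clusters of size 3, compound symmetric with correlation 0.3.
cluster = np.array([[1.0, 0.3, 0.3],
                    [0.3, 1.0, 0.3],
                    [0.3, 0.3, 1.0]])

# 3 repeated measures, unstructured correlations 0.3, 0.2, 0.5.
time = np.array([[1.0, 0.3, 0.2],
                 [0.3, 1.0, 0.5],
                 [0.2, 0.5, 1.0]])

# 2 response variables; the 0.4 correlation is a hypothetical value.
responses = np.array([[1.0, 0.4],
                      [0.4, 1.0]])

# Overall covariance for one ISU: 3 * 3 * 2 = 18 observations.
overall_cov = sigma2 * np.kron(np.kron(cluster, time), responses)
print(overall_cov.shape)
```

Each ISU contributes one such block, and blocks for different ISUs are independent.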


Overall Covariance Model for Memory of Pain Trial
- Variance of Memory of Pain = 0.96
- Correlation of responses 6 months apart = 0.5
- Correlation decays slowly over time, between 0 and 12 months correlation = 0.4


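Overall Covariance Model for Memory of Pain Trial

\[
\sigma^2
\begin{bmatrix}
1 & \rho_1 & \rho_2 \\
\rho_1 & 1 & \rho_3 \\
\rho_2 & \rho_3 & 1
\end{bmatrix}
= 0.96
\begin{bmatrix}
1 & 0.5 & 0.4 \\
0.5 & 1 & 0.5 \\
0.4 & 0.5 & 1
\end{bmatrix}
\]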


Example 2: PNC, Alcohol Use Prevention Study Power

[Diagram: Recruit 22 Communities (Community 1, ..., Community 22); each community contains its own set of schools (School 1, ..., School n).]


PNC, Alcohol Use Prevention Study Example for Power

[Diagram: Randomize Community into two arms, "Treat" and "Standard Delayed Treatment"; each arm contributes 6th Grade Data, 7th Grade Data, and 8th Grade Data, followed by Analysis.]


PNC Trial: Study Design Checklist
1. What is the study design goal?
a. Solving for power or sample size: Power
b. Type I error rate: 0.05


PNC Trial: Study Design Checklist
• What is the sampling scheme?
• How many groups? 2 treatment groups
• What are the covariates? None
• Is clustering present? Yes; one level
• Are group sizes equal or unequal? Equal, with 10 communities per group


PNC Trial: Study Design Checklist
• What responses are measured?
• What are the response variables? Alcohol use behavior scale
• Are repeated measures present? Yes, at 6th, 7th and 8th grades
• What is the primary hypothesis of interest? Time Trend by Treatment Interaction


PNC Trial: Study Design Checklist
• What are the means? Mean difference is 0.25 reduction in self reported alcohol use in treatment group vs. control


PNC Trial: Study Design Checklist
• What is the variance structure?
• What are the sources of correlation in the study design?
- Clustering (one level), with clusters of size 10 (# children/cluster)
- Repeated Measures, 3 occasions, 1 year apart


PNC Trial: Study Design Checklist
• What is the variance structure?
• What is the pattern of variability for each source of correlation?
- Variance: σ² = 0.09
- Intraclass correlation for community: ρ₀ = 0.01
- Correlation for responses 1 year apart: ρ₁ = 0.3
- Correlation decays slowly with decay rate: δ = 0.3


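Overall Covariance Structure for PNC Trial

\[
\sigma^2
\begin{bmatrix}
1 & \cdots & \rho_0 \\
\vdots & \ddots & \vdots \\
\rho_0 & \cdots & 1
\end{bmatrix}
\otimes
\begin{bmatrix}
1 & \rho_1 & \rho_1^{1+\delta} \\
\rho_1 & 1 & \rho_1 \\
\rho_1^{1+\delta} & \rho_1 & 1
\end{bmatrix}
= 0.09
\begin{bmatrix}
1 & \cdots & 0.01 \\
\vdots & \ddots & \vdots \\
0.01 & \cdots & 1
\end{bmatrix}
\otimes
\begin{bmatrix}
1 & 0.30 & 0.21 \\
0.30 & 1 & 0.30 \\
0.21 & 0.30 & 1
\end{bmatrix}
\]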
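The PNC inputs can be checked numerically. The sketch below uses the values stated in the checklist (variance 0.09, intraclass correlation 0.01, clusters of size 10, correlation 0.3 one year apart, decay rate 0.3) and assumes the LEAR exponent form of Simpson et al. (2010) for equally spaced occasions; the function names are hypothetical.

```python
import numpy as np


def cs_corr(size: int, rho: float) -> np.ndarray:
    """Compound symmetric correlation matrix."""
    return (1.0 - rho) * np.eye(size) + rho * np.ones((size, size))


def lear_corr_equal_spacing(n_times: int, rho: float, delta: float) -> np.ndarray:
    """LEAR correlation for n_times >= 2 occasions spaced one unit apart."""
    idx = np.arange(n_times, dtype=float)
    d = np.abs(idx[:, None] - idx[None, :])
    d_min, d_max = 1.0, float(n_times - 1)
    span = d_max - d_min
    scaled = (d - d_min) / span if span > 0 else np.zeros_like(d)
    corr = rho ** (d_min + delta * scaled)
    np.fill_diagonal(corr, 1.0)
    return corr


# Inputs from the PNC checklist slides.
pnc_time = lear_corr_equal_spacing(n_times=3, rho=0.3, delta=0.3)
pnc_cov = 0.09 * np.kron(cs_corr(size=10, rho=0.01), pnc_time)

print(np.round(pnc_time, 2))
print(pnc_cov.shape)
```

The lag-2 correlation rounds to 0.21, matching the slide, and each community contributes a 30 x 30 covariance block (10 children by 3 grades).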






3. Simple Adjustments for Power with Missing Data
Missing Data Adjustments

• Some useful approximations from null case results in Catellier and Muller (2000) and better non-null case results in Ringham et al. (in review):

- Complete data power is an upper bound
- Power for N = (100% - % missing) x # ISUs appears somewhat liberal

- Power for N (100% - % missing) x # ISUs appearsœ :

somewhat conservative to OK
- Results assuming Missing at Random

ì More work is in progress to identify better approximations
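The first rule of thumb above is simple arithmetic: shrink the number of ISUs by the expected percentage of missing data and compute power at the reduced N. A minimal sketch with a hypothetical function name and hypothetical example numbers (20 ISUs, 10% missing):

```python
def adjusted_isu_count(n_isu: int, pct_missing: float) -> float:
    """Effective number of ISUs: (100% - % missing) x # ISUs."""
    return n_isu * (100.0 - pct_missing) / 100.0


# Hypothetical example: 20 ISUs with 10% of data expected to be missing.
n_effective = adjusted_isu_count(n_isu=20, pct_missing=10.0)
print(n_effective)
```

Per the slide, power computed at this reduced N appears somewhat liberal, while complete-data power at the full N is an upper bound.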


Review
• Under widely applicable restrictions, a General Linear Mixed Model can be expressed as a General Linear Multivariate Model with accurate power and sample size available.
• Answers to a series of simple questions can completely specify the inputs to a power analysis.
• Convenient adjustments may suffice for simple missing data patterns.






Any Brief Questions Before We Look at Our Software?

BIBLIOGRAPHY

Books

Kleinbaum DG, Kupper LL, Nizam A, and Muller KE (2007) Applied Regression Analysis and Other Multivariable Methods. 4th edition. Boston: Duxbury Press.
Muller KE and Stewart PW (2006) Linear Model Theory; Univariate, Multivariate, and Mixed Models. New York: Wiley.
Muller KE and Fetterman BA (2002) Regression and ANOVA: An Integrated Approach Using SAS® Software. Cary, NC: SAS Institute.

The following articles can be accessed from www.samplesizeshop.org or www.health-outcomes-policy.ufl.edu/muller

Catellier DJ and Muller KE (2000) Tests for gaussian repeated measures with missing data in small samples, Statistics in Medicine, 19, 1101-1114.
Cheng J, Edwards LJ, Maldonado-Molina MM, Komro KA, and Muller KE (2010) Real longitudinal data analysis for real people: building a good enough mixed model, Statistics in Medicine, 29, 504-520.
Edwards LJ, Muller KE, Wolfinger RD, Qaqish BF, and Schabenberger O (2008) An R-square statistic for fixed effects in the linear mixed model, Statistics in Medicine, 27, 6137-6157.
Glueck DH and Muller KE (2003) Adjusting power for a baseline covariate in linear models, Statistics in Medicine, 22, 2535-2551.
Guo Y, Logan HL, Glueck DH, and Muller KE (2013) Selecting a sample size for studies with repeated measures, BMC Medical Research Methodology, in press.
Gurka MJ, Edwards LJ, Muller KE, and Kupper LL (2006) Extending the Box-Cox transformation to the linear mixed model, Journal of the Royal Statistical Society Series A, Statistics in Society, 169, 255-272.
Gurka MJ, Muller KE, and Edwards LJ (2011) Avoiding bias in mixed model inference for fixed effects, Statistics in Medicine, 30, 2696-2707. doi:10.1002/sim.4293
Komro KA, Maldonado-Molina MM, Tobler AL, Bonds JR, and Muller KE (2007) Effects of home access and availability of alcohol on young adolescents' alcohol use, Addiction (Abingdon, England), 102, 1597-1608.
Muller KE (2009) Analysis of Variance (ANOVA concepts and computations), Wiley Interdisciplinary Reviews: Computational Statistics.
Muller KE and Barton CN (1989) Approximate Power for Repeated-Measures ANOVA Lacking Sphericity, Journal of the American Statistical Association, 84, 549-555.
Muller KE, Edwards LJ, Simpson SL, and Taylor DJ (2007) Statistical tests with accurate size and power for balanced linear mixed models, Statistics in Medicine, 26, 3639-3660. doi:10.1002/sim.2827
Muller KE, LaVange LM, Ramey SL, and Ramey CT (1992) Power calculations for general linear multivariate models including repeated measures applications, Journal of the American Statistical Association, 87, 1209-1226.
Muller KE and Peterson BL (1984) Practical methods for computing power in testing the multivariate general linear hypothesis, Computational Statistics and Data Analysis, 2, 143-158.
Simpson SL, Edwards LJ, Muller KE, Sen PK, and Styner MA (2010) A linear exponent AR(1) family of correlation structures, Statistics in Medicine, 29, 1825-1838.
Taylor DJ and Muller KE (1995) Computing confidence bounds for power and sample size of the general linear univariate model, The American Statistician, 49, 43-47. doi:10.2307/2684810

Other Sources

Adams G, Gulliford MC, Ukoumunne OC, Eldridge S, Chinn S, and Campbell MJ (2004) Patterns of intra-cluster correlation from primary care research to inform study design and analysis, Journal of Clinical Epidemiology, 57, 785-794.
Demidenko E (2004) Mixed Models: Theory and Applications (1st ed.). Wiley-Interscience.
Kerry SM and Bland JM (1998) The intracluster correlation coefficient in cluster randomisation, BMJ (Clinical research ed.), 316(7142), 1455.
Laird NM and Ware JH (1982) Random-effects models for longitudinal data, Biometrics, 38, 963-974.
Law A, Logan H, and Baron RS (1994) Desire for control, felt control, and stress inoculation training during dental treatment, Journal of Personality and Social Psychology, 67, 926-936.
Logan HL, Baron RS, Keeley K, Law A, and Stein S (1991) Desired control and felt control as mediators of stress in a dental setting, Health Psychology, 10, 352-359.
