Paired Comparisons and Repeated Measuresmaitra/stat501/lectures/InferenceForMeans... · Paired...

Paired Comparisons and Repeated Measures

• Hotelling’s T-squared tests for a single mean vector have

useful applications for studies with paired comparisons or

repeated measurements.

1. Paired comparison designs in which two treatments are

applied to each sample unit and p variables are measured

on each unit under each treatment.

2. Repeated measures designs in which the same variable is

measured at the same set of p time points on each sample

Paired Comparison Designs

• What is a paired comparisons design? A design where twodifferent treatments (or a treatment and an ’absence of treat-ment’ or control) are assigned to each sample unit.

• Examples:

– We measure volumes of sales of a certain product in acertain market before and after an advertising campaign

– We measure blood pressure on a sample of individualsbefore and after receiving some drug

– We count traffic accidents at a collection of intersectionswhen stop sign controls or light controls were used forsignalization.

Paired Comparison Designs

• One advantage of this type of design is that the differences

in the outcome measures under one treatment or the other

reflect only the effect of treatment since everything else in

the units receiving the treatments is absolutely identical.

• It is not quite so simple: in drug trials there may be carry-

over effects to worry about and in ’before/after’ experiments

other conditions may also change.

Paired Comparisons when p = 1• In the univariate case, let X1j and X2j denote the responses

of unit j under treatments 1 and 2, respectively.

• Differences Dj = X1j −X2j, j = 1, ..., n reflect the effect oftreatment. If D ∼ N(δ, σ2

t =D̄ − δsd/√n∼ tn−1,

D̄ =1

Dj, s2d =

n− 1

(Dj − D̄)2.

• A 100(1 − α)% confidence interval for δ = E(X1j − X2j) isgiven by

d̄− tn−1(α/2)sd√n≤ δ ≤ d̄+ tn−1(α/2)

sd√n.

Paired Comparisons when p > 1

• When p variables are measured on each sample unit, we

compute vectors of differences ([post treatment]-[pre treat-

ment]):

Dj1Dj2

...Djp

X1j1X1j2

...X1jp

−X2j1X2j2

...X2jp

for j = 1, ..., n sample units.

• For Dj ∼ NIDp(δ,Σd) we have

T2 = n(D̄ − δ)′S−1D (D̄ − δ) ∼

p(n− 1)

n− pFp,n−p,

Paired Comparisons when p > 1

• where

D̄ =1

Dj, SD =1

n− 1

(Dj − D̄)(Dj − D̄)′.

• In particular, a test of H0 : δ = 0 versus H1 : δ 6= 0 rejects

H0 at level α if

T2 = nD̄′S−1D D̄ >

p(n− 1)

n− pFp,n−p(α).

If we fail to reject H0, we conclude that there are no sub-

stantial treatment effects on any of the p variables.

Confidence region for δ

• As before, a 100(1−α)% CR for δ consists of all δ such that

(D̄ − δ)′S−1D (D̄ − δ) ≤

p(n− 1)

n(n− p)Fp,n−p(α).

• Simultaneous T2 intervals for the p individual mean differ-ences are given by

D̄i ±√p(n− 1)

(n− p)Fp,n−p(α)

√√√√s2D,i

• When n− p is large,

p(n− 1)

(n− p)Fp,n−p(α) ≈ χ2

p(α).

Example 6.1: Effluent Study

• Wastewater treatment plants are required to monitor thequality of the treated water that goes into rivers and streams.

• To monitor the reliability of a self-monitoring program, 11samples of water were divided in half. One half was sent toa state lab and another one to a private lab for analysis.

• Two variables were measured: biochemical oxygen demand(BOD) and suspended solids (SS).

• Question of interest: Is there a difference between labs inmean values for either BOD or SS ?

Example 6.1: Effluent Study (cont’d)

Commercial lab State labSample x1j1 (BOD) x1j2 (SS) x2j1 (BOD) x2j2 (SS) dj1 dj2

1 6 27 25 15 -19 122 6 23 28 13 -22 103 18 64 36 22 -18 424 8 44 35 29 -27 155 11 30 15 31 -4 -16 34 75 44 64 -10 117 28 26 42 30 -14 -48 71 124 54 64 17 609 43 54 34 56 9 -2

10 33 30 29 20 4 1011 20 14 39 21 -19 -7

• Sample statistics are

[−9.3613.27

], Sd =

[199.26 88.3888.38 418.61

• To test H0 : δ = 0 versus H1 : δ 6= 0, we compute T2:

T2 = 11[−9.36 13.27]

[0.0055 −0.0012−0.0012 0.0026

] [9.36

]= 13.6.

• For α = 0.05, [p(n − 1)/(n − p)]F2,9(0.05) = 9.47. Since

T2 > 9.47, we reject H0.

• The two labs appear to produce significantly different mean

measurements for either BOD or SS or both. Which variable

has different means?

• The private lab seems to report lower BOD and higher SS

than the state lab. We compute the two simultaneous T2

intervals to obtain:

δ1 : (−22.46, 3.74), δ2 : (−5.71, 32.25).

• Both intervals include zero, yet we rejected the null

hypothesis of equal mean vectors.

Example 6.1: Effluent Study (cont’d)• The T2 intervals are conservative (wider than needed for

simultaneous 95% coverage) when only two comparisons arebeing made.

• The Bonferroni intervals are less conservative, but also coverzero.

• One complication is that normality of the Dj’s is suspect;water sample 8 appears to be an outlier. With such a smallsample, its effect is substantial.

• Consider the correlation between the measurements of BODand SS

/* This program is posted as effluent.sas

on the course web page. It computes a

one-sample Hotelling T-squared test. */

DATA set1;

INPUT sample BOD1 SS1 BOD2 SS2;

dBOD=BOD1-BOD2;

dSS = SS1-SS2;

/* LABEL BOD1 = BOD (private lab)

SS1 = SS (private lab)

BOD2 = BOD (state lab)

SS2 = SS (state lab); */

datalines;

PROC GLM DATA=set1;

MODEL dBOD dSS = Z / NOINT SOLUTION;

MANOVA H=Z / PRINTE;

Wilks’ Lambda 0.450625 5.49 2 9 0.0277

• Using R (see the code in effluent.R)

Cstar <-matrix( c(1, 0, -1, 0, 0, 1, 0, -1), 2, 4, byrow=T)

# create a design matrix

library(ICSNP)

HotellingsT2( as.matrix(edat[, 2:5]) %*% t(Cstar), mu = c(0,0))

# T.2 = 6.1377, df1 = 2, df2 = 9, p-value = 0.02083

# alternative hypothesis: true location is not equal to c(0,0)

Repeated Measures Designs

• Suppose now that q treatments are applied to each of the n

sample units. Then

X ′j = [Xj1, Xj2, ..., Xjp]′, j = 1, ..., n.

• Of interest are contrasts of the elements of µ = E(Xj) suchas

µ1 − µ2µ1 − µ3

...µ1 − µq

1 −1 0 · · · 01 0 −1 · · · 0... ... ... ... ...1 0 0 · · · −1

µ = Cµ.

• The contrast matrix C has p− 1 independent rows.

• If there is no effect of treatment, then Cµ = 0.

Testing Hypotheses in Repeated MeasuresDesigns

• To test H0 : Cµ = 0 we again use T2:

T2 = n(Cx̄)′(CSC′)−1(Cx̄)

• The null hypothesis is rejected if

T2 ≥(n− 1)(p− 1)

(n− p+ 1)Fp−1,n−p+1(α).

• A confidence region for contrasts Cµ is given by all Cµ suchthat

n(Cx̄)′(CSC′)−1(Cx̄) ≤(n− 1)(p− 1)

(n− p+ 1)Fp−1,n−p+1(α).

Example 6.2: Dog Anesthetics

• A sample of 19 dogs were administered four treatments each:

(1) high CO2 pressure, (2) low CO2 pressure,

(3) high pressure + halothane, (4) low pressure + halothane.

• Outcome variable was milliseconds between heartbeats.

• Three contrasts are of interest:

(µ3 + µ4)− (µ1 + µ2) : Effect of halothane

(µ1 + µ3)− (µ2 + µ4) : Effect of CO2 pressure

(µ1 + µ4)− (µ2 + µ3) : Interaction between halothane and CO2.

Example 6.2: Dog Anesthetics (cont’d)

• The 3× 4 contrast matrix C is: −1 −1 1 11 −1 1 −11 −1 −1 1

• The test of H0 : Cµ = 0 rejects the null if

T2 = n(Cx̄)′(CSC′)−1(Cx̄) = 116 ≥(18)(3)

(16)F3,16(α) = 10.94.

• We clearly reject the null, so the question now is whetherthere is a difference between CO2 pressure, between halothanelevels or perhaps there is no main effect of treatment butthere is still an interaction.

• Three simultaneous confidence intervals (one for each rowof C):

c′1µ : (x̄3 + x̄4)− (x̄1 + x̄2)±

√18(3)

16F3,16(0.05)

√c′1Sc1

19= 209.31± 73.70.

• For the effect of CO2 pressure and the interaction, we findrespectively: −60.05± 54.70, −12.79± 65.97

• Use of halothane produces longer times between heartbeats.This effect is similar at both high and low CO2 pressurelevels because the interaction is not significant (thirdcontrast). Higher CO2 pressure produces shorter timesbetween heartbeats.

• The same null hypothesis can be represented with differentcontrast matrices

• For example, the null hypothesis that all four treatmentshave the same mean is also tested with H0 : Cµ = 0, where

−1 1 0 00 −1 1 00 0 −1 1

or C =

1 0 0 −10 1 0 −10 0 1 −1

• The value of T2 is unchanged

T2 = n(Cx̄)′(CSC′)−1(Cx̄) = 116

• The value of T2 will be the same for any set of contraststhat implies the same joint null hypothesis.

/* This program performs a one-sample

Hotelling T-squared test for equality

of means for repeated measures. The

program is posted as dogs.sas

Data are posted as dogs.dat */

data set1;

infile ’c:\stat501\data\dogs.dat’;

input dog x1-x4;

Z = 1;

/* LABEL X1 = High CO2 without Halothane

X2 = Low CO2 without Halothane

X3 = High CO2 with Halothane

X4 = Low CO2 with Halothane; */

/* Test the null hypothesis that

all treatments have the same mean

time between heartbeats */

proc glm data=set1;

model x1-x4 = z /noint nouni;

manova H=z M=x1-x2+x3-x4, -x1-x2+x3+x4,

-x1+x2+x3-x4 prefix=c /printe ;

M Matrix Describing Transformed Variables

x1 x2 x3 x4

c1 1 -1 1 -1

c2 -1 -1 1 1

c3 -1 1 1 -1

MANOVA Test Criteria and Exact F Statistics

for the Hypothesis of No Overall Z Effect

on the Variables Defined by the M Matrix Transformation

Statistic Value F-Value Num DF Den DF Pr > F

Wilks’ Lambda 0.13431200 34.38 3 16 <.0001

Pillai’s Trace 0.86568800 34.38 3 16 <.0001

Hotelling-Lawley Trace 6.44535118 34.38 3 16 <.0001

Roy’s Greatest Root 6.44535118 34.38 3 16 <.0001

R Code for the same

• posted in dogs.R

library(ICSNP)

C<-matrix( c(1, -1, 1, -1, -1, -1, 1, 1, -1, 1, 1, -1), 3, 4, byrow=T)

X <- as.matrix(dogdat[ ,2:5]) %*% t(C) # remember the need for using t(C).

HotellingsT2(X)

# Hotelling’s one sample T2-test

# data: as.matrix(dogdat[, 2:5]) %*% t(C)

# T.2 = 34.3752, df1 = 3, df2 = 16, p-value = 3.318e-07

# alternative hypothesis: true location is not equal to c(0,0,0)

Paired Comparisons and Repeated Measuresmaitra/stat501/lectures/InferenceForMeans... · Paired...

Documents

Transcript of Paired Comparisons and Repeated Measuresmaitra/stat501/lectures/InferenceForMeans... · Paired...

Comparisons, Superlatives, & Equal Comparisons

Terms Between subjects = independent Each subject gets only one level of the variable. Repeated measures = within subjects = dependent = paired.

Development of Tourism Industry in Iran (Hybrid Approach Delphi - Paired Comparisons - Borda)

Paired Programming!

University of Dundee Breast cancer risk reduction Anderson, Annie … · Statistical analysis used Chi squared tests for comparisons in proportions and paired t tests for comparisons

1 Paired Differences Paired Difference Experiments 1.Rationale for using a paired groups design 2.The paired groups design 3.A problem 4.Two distinct ways.

The Relation Between MOS and Pairwise Comparisons and …rkm38/pdfs/zerman2018cross_content.pdf · The results of paired comparisons are typically scaled in ... quality expressed

Paired Assoicates

Assessing 3D Scan Quality Through Paired-comparisons Psychophysics · 2018-01-04 · psychophysics of perception to do that evaluation. As psychophysics is not subject to opinion,

PAIRED COMPAiSONSCU) FLORIDA STATE UNIV TALLAHASSEE- … · to the analysis of basic paired comparisons data like those of Table 1. Then extensions of the method are developed for

ANOVA approaches to Repeated Measures • univariate repeated ...

SL-Model for Paired Comparisons

Bradley-Terry models for paired comparisons incorporating ...

20060403506 - DTICthe method of paired comparisons and the method of magnitude estimation, in terms of subjective attitude. These ... Jobson, Woodell, & Hines' presented an approach

-referenced variable X€¦ · Geary„s c is based on paired comparisons of values of a geo-referenced variable X In neighbouring regions to measure spatial autocorrelation. Geary„s

Oral Reading Strategies - udel.edu · Paired Reading: Best Practices Text selection at student’s instructional level 90 -95 % word recognition (repeated reading: 85%) Allow student

· Web viewp.31) Use of the word ‘violence’ has a significant impact, especially when paired with the repeated referral to Julie as an ‘innocente victime’ (Couvent, p.38).,

Værdisætning af offentlige investeringer og serviceydelser · (Mackenzie 1993, Holmes & Adamowicz 2003) og Paired Comparisons (Sinden 1974). 3.1 Nytte og præferencer Både CVM

Schmerzerkennung,-.-diagnose- und quan5ﬁzierung-€¦ · Grimace-scale-posthoc testing for repeated measures with Bonferroni correction for multiple comparisons. In the CFA assay

The Repeated-Measures ANOVA - Rutgers Universitymmm431/quant_methods_S13/QM_Lecture14.pdf · Repeated-Measures ANOVA Logic of the Repeated-Measures ANOVA • The repeated-measures