J.D. Bramble, Ph.D. Creighton University Medical Center Med 483 -- Fall 2005 Confidence Intervals.
Confidence Intervals
Data can be described by point estimates: mean, standard deviation, etc.
Point estimates from a sample are not always equal to the population parameters.
Data can also be described by interval estimates, which show the variability of the estimate.
Using the standard error we can see how much the estimate is likely to vary from the true value.
Confidence Intervals
Interval estimates are called confidence intervals (CI).
CIs define an upper limit and a lower limit associated with a known probability.
These limits are known as confidence limits.
The probability associated with the CI is most commonly 95%, but may be 99% or 90%.
Confidence limits set the boundaries that are likely to include the population mean.
Thus, we can conclude that in general, we are 95% confident that the true mean of the population is found within these limits.
Confidence Intervals
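The interpretation above can be checked with a quick simulation (a minimal sketch assuming NumPy is available): draw many samples from a population with a known mean and count how often the 95% interval captures it.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, trials = 50.0, 10.0, 30, 2000
covered = 0
for _ in range(trials):
    sample = rng.normal(mu, sigma, n)
    se = sample.std(ddof=1) / np.sqrt(n)   # standard error of the mean
    lo = sample.mean() - 1.96 * se          # lower confidence limit
    hi = sample.mean() + 1.96 * se          # upper confidence limit
    covered += (lo <= mu <= hi)
print(covered / trials)                     # close to 0.95
```

The population parameters (mu = 50, sigma = 10) are illustrative, not from the slides; the point is only that the coverage proportion lands near 0.95.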
Standard Error
The standard error is defined as SE = s/√n.
We expect that the sample mean x̄ is within one standard error of μ quite often.
SE is a measure of the precision of x̄ as an estimate of μ. The smaller the SE, the more precise the estimate.
SE includes two factors that affect the precision of the measurement: n and s.
Standard Deviation vs Standard Error
Standard deviation describes the dispersion of the data: the variability from one data point to the next.
Standard error (SE) describes the uncertainty in the mean of the data that results from sampling error: the variability associated with the sample mean.
Calculating Confidence Intervals
Recall that 95% of the area under the standard normal curve lies between z = -1.96 and z = +1.96.
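The 1.96 cutoff can be recovered from the standard normal distribution rather than a printed table (a sketch assuming SciPy is available):

```python
from scipy import stats

z = stats.norm.ppf(0.975)                      # upper 2.5% cutoff, about 1.96
area = stats.norm.cdf(z) - stats.norm.cdf(-z)  # area between -z and +z, 0.95
```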
Calculating Confidence Intervals
The general formula is: x̄ ± z(σ/√n)
For P = 0.95:
Lower limit = x̄ - 1.96(σ/√n)
Upper limit = x̄ + 1.96(σ/√n)
Calculating CIs from Sample Data
We use the t-distribution. The t-distribution describes the distribution of the sample mean when the variance is also estimated from the sample data.
Thus, the formula for the CI in these cases is:
x̄ ± t(s/√n)
Example: Problem
To assess the effectiveness of hormone replacement therapy on bone mineral density, 94 women between the ages of 45 and 64 were given estrogen medication. After taking the medication for 36 months, the bone mineral density was measured for each woman in the study. The average density was 0.878 g/cm2 with a standard deviation of 0.126 g/cm2. Calculate a 95% CI for the mean bone mineral density of this population.
Example: SE and t
Recall that SE is: SE = s/√n = 0.126/√94 = 0.013
t(α/2, df) = t(0.025, 93) = 1.990
Example: Calculations
0.878 ± 1.99(0.126/√94) = 0.878 ± 0.026
CI: 0.852 to 0.904
Example: Conclusion
The 95% confidence limits are: lower: 0.852 g/cm2; upper: 0.904 g/cm2
We are 95% confident that the average bone density of all women age 45 to 64 who take this hormone replacement medication is between 0.852 g/cm2 and 0.904 g/cm2.
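The bone-density interval can be reproduced with SciPy's t quantile (a sketch; the slide's table value 1.990 is a rounding of the exact t(0.025, 93) cutoff):

```python
from math import sqrt
from scipy import stats

xbar, s, n = 0.878, 0.126, 94
se = s / sqrt(n)                       # standard error, about 0.013
t_crit = stats.t.ppf(0.975, df=n - 1)  # two-sided 95% cutoff, about 1.99
lo, hi = xbar - t_crit * se, xbar + t_crit * se
```

Rounded to three decimals this reproduces the limits 0.852 and 0.904 g/cm2.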
For a 95% confidence interval, we expect that 95% of the intervals constructed from repeated samples of the population would contain the true mean.
Example: Conclusions (cont’d)
Other Confidence Limits
For a 99% or 90% CI the calculations and interpretations are similar.
Which CI will give the widest interval, and which the narrowest?
CIs can be established for any parameter: mean, proportion, relative risk, odds ratio, etc.
Using CI to Test Hypotheses
Diastolic blood pressure of 12 people measured before and after administration of a new drug.
Paired t-test hypotheses: H0: μd ≥ 0; Ha: μd < 0
x̄d = -3.1
sd = 4.1
Using CI to Test Hypotheses
x̄d ± t(α, n - 1)(sd/√n) = -3.1 ± 1.795(4.1/√12) = -3.1 ± 2.12
CI: -5.22 to -0.98
Conclusion: since zero does not fall within the interval, we can conclude with 95% confidence that there is a significant decrease in blood pressure after taking the new drug.
If we did a paired t-test, the conclusions would be the same.
Using CI to Test Hypotheses
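The blood-pressure interval can be reproduced from the summary statistics alone (a sketch assuming SciPy; the slide's t = 1.795 corresponds to the t(0.05, 11) cutoff):

```python
from math import sqrt
from scipy import stats

n, d_bar, s_d = 12, -3.1, 4.1
se = s_d / sqrt(n)                    # standard error of the mean difference
t_crit = stats.t.ppf(0.95, df=n - 1)  # about 1.796
lo, hi = d_bar - t_crit * se, d_bar + t_crit * se
```

Both limits come out negative, which is what lets us conclude a significant decrease.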
Visual Representation of CI
[Figure: confidence intervals computed from 11 different samples, plotted against the true population mean (μ); most of the intervals contain μ.]
J.D. Bramble, Ph.D. Creighton University Medical Center Med 483 -- Fall 2005
ANOVA: Analysis of Variance
Single Factor
Objectives
Know the assumptions for an ANOVA
Know when ANOVA is used rather than a t-test
Set up ANOVA tables and understand the relationships between the values within the table
Compute the F-ratio and appropriate degrees of freedom
Know how and when to use a two-factor ANOVA
Apply Tukey's multiple comparison procedure
A statistical method of comparing the means of different groups.
A single factor ANOVA for two groups produces the same p-value as an independent t-test.
The t-test is inappropriate for more than two groups; it increases the probability of a Type I error.
Using a t-test to compare each pair of means leads to problems regarding the proper level of significance.
ANOVA vs. t-test
ANOVA is not limited to two groups.
It can appropriately handle comparisons of several means from several groups.
Thus, ANOVA overcomes the difficulty of doing multiple t-tests.
The sampling distribution used is the F distribution.
ANOVA vs. t-test
ANOVA: assumptions
The observations are independent: one observation is not correlated with another.
The variances of the various groups are homogeneous.
ANOVA is a robust test that is not as sensitive to departures from normality and homogeneity, especially when sample sizes are large and nearly equal for each group.
ANOVA: Characteristics
ANOVA analyzes the variance of the groups to evaluate differences in the means.
Within group: measures the variance of observations within each group (variance due to "chance").
Between groups: measures the variance between the groups (variance due to treatment or chance).
ANOVA: Characteristics
It can be shown that when the means of each group are equal, the within-group and between-group variances are equal.
The F-statistic is the ratio of the estimated variances:
F = (variance due to treatment + chance) / (variance due to chance)
ANOVA : the F distribution
The ratio follows an F distribution.
The F statistic has two sets of degrees of freedom:
For between groups: (I - 1), where I is the number of groups
For within groups: I(J - 1), where J is the number of observations in each group
ANOVA: single factor
Let I = the number of population samples
Let J = the number of observations in each sample
Thus the data consist of IJ observations.
The overall or grand mean is:
x̄ = (Σ_i Σ_j x_ij) / IJ, summing over i = 1..I and j = 1..J
Now it is necessary to compute the sums of squares for the treatment -- SSTr (between group); the error -- SSE (within group); and the total -- SST.
SSTr is the sum of the squared deviations between groups.
The total sum of squares, SST, measures the amount of variation about the grand mean.
With algebraic manipulation we find that:
SST = SSTr + SSE
ANOVA: single factor
SSTr = J Σ_i (x̄_i - x̄)² = (1/J) Σ_i T_i² - T²/IJ
SST = Σ_i Σ_j (x_ij - x̄)² = Σ_i Σ_j x_ij² - T²/IJ
where T_i is the total for group i and T is the grand total.
When completing the ANOVA table usually only SSTr and SST are calculated. SSE is found by SSE = SST - SSTr.
ANOVA: sums of squares
After calculating the sums of squares, F is simply the ratio of the mean squares of the treatment and the error.
The mean square is the sum of squares divided by the appropriate degrees of freedom:
MSTr = SSTr / (I - 1)
MSE = SSE / I(J - 1)
F = MSTr / MSE
ANOVA: mean sums of squares
ANOVA: single factor table

Source of variation   df         SS     MS     F
Treatment             I - 1      SSTr   MSTr   MSTr/MSE
Error                 I(J - 1)   SSE    MSE
Total                 IJ - 1     SST
ANOVA: Example
An experiment was conducted to examine various modes of medication delivery. A total of 15 subjects diagnosed with the flu were enrolled and the length of time until alleviation of major symptoms was measured for three groups: Group A received an inhaled version, Group B received an injection, and Group C received an oral dose.
Single factor example

Groups        A      B      C
Time (min)   56     62     72
            102     58    100
             90     78    117
             87     68    109
             94     87    103
Average    85.8   70.6  100.2
H0: all three means are equal, or μ1 = μ2 = μ3
Ha: at least one mean is different
α = 0.05
Critical value: F(α, df) with I - 1 = 2 and I(J - 1) = 12: F(0.05, 2, 12) = 3.89
Single factor example: set up
Single factor example: calculating sums of squares
SST = 114,893 - (1283)²/15 = 114,893 - 109,739.3 = 5,153.7
SSTr = (1/5)[(429)² + (353)² + (501)²] - (1283)²/15 = 2,190.9
SSE = 5,153.7 - 2,190.9 = 2,962.8
Single factor example: completing the table

Source of variation   df   SS        MS        F
Treatments            2    2,190.9   1,095.5   4.44
Error                 12   2,962.8   246.9
Total                 14   5,153.7
Single factor example: decision and conclusions
Compare Fstat to Fcrit: 4.44 > 3.89; therefore, reject H0.
The evidence suggests that the time it takes to alleviate major flu symptoms differed significantly by mode of medication delivery.
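The flu example can be checked with scipy.stats.f_oneway, which runs the same single-factor ANOVA on the raw data:

```python
from scipy import stats

group_a = [56, 102, 90, 87, 94]     # inhaled
group_b = [62, 58, 78, 68, 87]      # injection
group_c = [72, 100, 117, 109, 103]  # oral
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
```

The F statistic matches the table value of 4.44, and the p-value falls below 0.05, agreeing with the decision to reject H0.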
Where is the Difference?
Recall the hypotheses of the ANOVA: H0 is that all the means are equal; Ha is that at least one is not.
If we fail to reject H0, the analysis is complete.
What does it mean when H0 is rejected? At least one mean is different.
Which μ's are different from one another? The answer is clear with only two treatment levels, but not with three or more treatment levels.
Finding the difference
We must do a post hoc analysis: a test that is done after the ANOVA.
The purpose is to determine the location of the difference.
Several post hoc tests are available and are discussed in the text. These tests include Bonferroni, Scheffé, Student-Newman-Keuls, and Tukey's HSD.
Tukey's HSD
w = Q(α, I, I(J - 1)) · √(MSE / J)
where α = significance level, I = number of groups, J = number of observations per treatment, and MSE = mean square error (or within-group MS)
Using Tukey's HSD
All the information, except Q, needed to find w is located in the ANOVA table.
Q is determined from the studentized range distribution using α, I, and df_within.
Once w is determined, order all treatment level means in ascending order.
Underline those values that differ by less than w. Treatment means not joined by a common underline correspond to treatments that are significantly different.
Example
Using the previous example, we now want to find which form(s) of medication really differ from the others. To start we order the means:

Groups      B      A      C
Average   70.6   85.8   100.2
The ANOVA
Does this data indicate that the amount of time it takes a student to nod off depends on the statistical topic being studied?

Source                df   SS       MS       F
Treatment (between)   3    5882.4   1960.8   21.09
Error (within)        16   1487.4   93.0
Total                 19   7369.8

Since the computed F-statistic of 21.09 is greater than the critical value of F(0.05, 3, 16) = 3.24, we reject H0.
Computing the Tukey's
There are I = 3 treatments and the degrees of freedom for the error is 12; thus, from the table Q(0.05, 3, 12) = 3.77.
Computing the Tukey value we get:
w = Q(α, I, I(J - 1)) · √(MSE / J) = 3.77 · √(246.9 / 5) = 26.5
And the Difference is…
Ordering the treatment level means and underscoring those that differ by less than 26.5:

Groups      B      A      C
Average   70.6   85.8   100.2

We conclude that the only significant difference is between group B (injection) and group C (oral).
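The HSD cutoff and the pairwise decisions can be computed directly from the ANOVA table values (a sketch; Q = 3.77 is read from a studentized-range table, as on the slide):

```python
from math import sqrt

q_crit = 3.77               # Q(0.05, 3, 12) from a studentized-range table
mse, J = 246.9, 5           # error MS and observations per group
w = q_crit * sqrt(mse / J)  # honestly significant difference, about 26.5

means = {"B": 70.6, "A": 85.8, "C": 100.2}
sig_BC = abs(means["C"] - means["B"]) > w  # 29.6 > w: B and C differ
sig_AB = abs(means["A"] - means["B"]) > w  # 15.2 < w: not significant
sig_AC = abs(means["C"] - means["A"]) > w  # 14.4 < w: not significant
```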
J.D. Bramble, Ph.D. Creighton University Medical Center Med 483 -- Fall 2005
ANOVA: Analysis of Variance
Two Factor
Two-Factor ANOVA
Single factor ANOVAs: subjects or treatments are categorized in only one way (e.g., type of treatment).
Two factor ANOVAs: subjects or treatments are categorized in two ways (e.g., type of treatment and gender).
Two factor ANOVAs test the influence of both factors.
Examples of Two Factor Designs
An experiment is designed to test if there is a difference in how fast 3 different antacid brands (Acid Eater, Relieve the Burn, and Blah Stomach) dissolve in male and female stomachs.
Which type of study technique (1 hour every day, 3 hours once a week, an "all-nighter" prior to the exam, a late night party before the exam) results in better test scores, controlling for the person's age (<17, 18-20, 21-23, 24-26, 27+)?
Advantages of Two-way ANOVAs
Economy
In a two-factor analysis we can test interactions.
Testing for an interaction allows us to determine whether the effect of the treatment varies by the conditions in which the treatment is applied.
Example

Instruction          Computer   Classroom   Means
Ability   Whiz          90          82        86
          Novice        80          88        84
Means                   85          85
Three Research Hypotheses
Is there a significant difference between those taught by computer and those taught in the classroom?
Is there a significant difference between computer whizzes and computer novices?
Is there a significant interaction between the type of instruction and the computer ability of the subject?
Two Factor ANOVA Example
Researchers are interested in the effect of caffeine on performance. Controlling for the student's academic program, subjects were given 3 different levels of caffeine for two weeks prior to taking a standard aptitude test. The test scores are recorded below.

Program      None   Low   Med   High
Undergrad     76     82    68    63
Med           67     69    59    56
Pharm         81     96    67    64
Nur           56     59    54    58
Law           51     70    42    37
Writing the hypothesis
Hypotheses are written the same as for a single factor ANOVA, with the addition of a second set of hypotheses for the second factor.
For Factor A (caffeine level) (I = # treatment levels):
H0: μnone = μlow = μmed = μhigh
Ha: at least one is different
For Factor B (program) (J = # treatment levels):
H0: μundergrad = μmed = μpharm = μnur = μlaw
Ha: at least one is different
Critical Values
Critical values for a two factor ANOVA are found by looking on an F-table at the appropriate degrees of freedom.
Degrees of freedom for a two factor ANOVA are found for all sources of variation and the total: For factor A: I - 1. For factor B: J - 1. For error: (I - 1)(J - 1). For total: IJ - 1.
Calculating Degrees of Freedom
For our example I = 4 and J = 5; thus, the df are:
df_caffeine = 4 - 1 = 3
df_program = 5 - 1 = 4
df_error = 3 · 4 = 12
df_total = (4 · 5) - 1 = 19
Notice the relationship between the degrees of freedom is the same as for a single factor ANOVA: df_FactorA + df_FactorB + df_error = df_total
Calculating Critical Values
With the df known we can now find the critical values: one for Factor A and one for Factor B.
The critical values are found by looking on an F-table at the appropriate alpha and degrees of freedom for each factor. For Factor A: F(α, df_FactorA, df_error). For Factor B: F(α, df_FactorB, df_error).
For our example: Factor A is F(0.05, 3, 12) = 3.49; Factor B is F(0.05, 4, 12) = 3.26.
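These F critical values can be pulled from SciPy rather than a printed table (a sketch):

```python
from scipy import stats

alpha = 0.05
f_a = stats.f.ppf(1 - alpha, dfn=3, dfd=12)  # Factor A cutoff, about 3.49
f_b = stats.f.ppf(1 - alpha, dfn=4, dfd=12)  # Factor B cutoff, about 3.26
```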
Computing the Test Statistic
First compute the sums of squares for the different sources of variation. SST -- sums of square for the total SSA -- sums of square for factor A SSB -- sums of square for factor B SSE -- sums of square for the error
The relationship still holds that if you add all the sums of squares you get the sums of squares of the total. Thus, SST = SSA + SSB + SSE
The mean sums of squares for Factor 1, Factor 2, and the Error can be computed by dividing the sums of squares by the appropriate degrees of freedom.
The F-statistic is calculated by dividing the mean sums of square for each factor by the mean sums of square of the error.
Computing the Test Statistic
The ANOVA Table

Sources of Variation   df               SS    MS    F
Factor 1               I - 1            SSA   MSA   MSA/MSE
Factor 2               J - 1            SSB   MSB   MSB/MSE
Error                  (I - 1)(J - 1)   SSE   MSE
Total                  IJ - 1           SST

For the caffeine example:

Sources of Variation   df   SS        MS       F
Caffeine (Factor A)    3    1182.95   394.32   10.72
Program (Factor B)     4    1947.5    486.88   13.24
Error                  12   441.3     36.78
Total                  19   3571.75
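The two-factor sums of squares for the caffeine example can be reproduced with NumPy (a sketch of the hand calculation; the values match the ANOVA table above):

```python
import numpy as np

# rows: programs (J = 5), columns: caffeine levels (I = 4)
data = np.array([
    [76, 82, 68, 63],   # Undergrad
    [67, 69, 59, 56],   # Med
    [81, 96, 67, 64],   # Pharm
    [56, 59, 54, 58],   # Nur
    [51, 70, 42, 37],   # Law
])
grand = data.mean()
n_rows, n_cols = data.shape
ss_a = n_rows * ((data.mean(axis=0) - grand) ** 2).sum()  # caffeine (Factor A)
ss_b = n_cols * ((data.mean(axis=1) - grand) ** 2).sum()  # program (Factor B)
ss_t = ((data - grand) ** 2).sum()                        # total
ss_e = ss_t - ss_a - ss_b                                 # error
```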
Decision and Conclusion
The relationship for making the decision is the same.
For Factor A, F(0.05, 3, 12) = 3.49 and Fstat = 10.72. Since Fstat > Fcrit, we reject H0.
For Factor B, F(0.05, 4, 12) = 3.26 and Fstat = 13.24. Since Fstat > Fcrit, we again reject H0.
Where is the difference?
Look on the table to get Q. For Factor A: Q(α, I, (I - 1)(J - 1)). (Notice that I is the # of levels for Factor A and (I - 1)(J - 1) is the df of the error.) Likewise, Q for Factor B: Q(α, J, (I - 1)(J - 1)).
The formula for w is:
For Factor A: w = Q(α, I, (I - 1)(J - 1)) · √(MSE / J)
For Factor B: w = Q(α, J, (I - 1)(J - 1)) · √(MSE / I)
Computing the Tukey's
For Factor A: w = Q(0.05, 4, 12) · √(MSE / J) = 4.20 · √(36.78 / 5) = 11.39
Ordering the means and underscoring all the pairs that differ by less than w = 11.39:

Groups     High   Med   None   Low
Average    55.6   58    66.2   75.2

There is a significant difference in test scores between the low caffeine group and both the medium and high caffeine groups.
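The cutoff w = 11.39 can be checked numerically (a sketch; Q = 4.20 is Q(0.05, 4, 12) from a studentized-range table):

```python
from math import sqrt

q_crit = 4.20              # Q(0.05, 4, 12) from a studentized-range table
mse, J = 36.78, 5          # error MS and observations per caffeine level
w = q_crit * sqrt(mse / J) # about 11.39

means = {"High": 55.6, "Med": 58.0, "None": 66.2, "Low": 75.2}
# Low differs from Med (17.2) and High (19.6); other gaps fall under w
```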
Two Factor ANOVA: Repeated Measures
Measurements are made repeatedly on each subject (before, during, and after the intervention).
Subjects are recruited as matched sets on variables such as age or diagnosis.
A laboratory experiment is run several times, each time with several parallel treatments.
When appropriate, the repeated measures ANOVA test is usually more powerful than ordinary ANOVA.
Two Factor ANOVA Example
How do various types of music affect agitation in Alzheimer's patients?

Group     Piano            Mozart           Easy Listening
Early     21 24 22 18 20   9 12 10 5 9      29 26 30 24 26
Middle    22 20 25 18 20   14 18 11 9 13    15 18 20 13 19
Writing the hypothesis
For Factor A (music):
H0: μpiano = μmozart = μeasy listening
Ha: at least one is different
For Factor B (stage):
H0: μearly = μmiddle
Ha: at least one is different
For the interaction:
H0: no interaction between music and stage on agitation level
Ha: there is an interaction
Compute SS for the total (SST), error (SSE), factor A (SSA), factor B (SSB), and the interaction of AB (SSAB): SST = SSA + SSB + SSAB + SSE
Each SS has associated degrees of freedom: SST: IJK - 1; SSE: IJ(K - 1); SSA: I - 1; SSB: J - 1; SSAB: (I - 1)(J - 1)
Two Factor Repeated Measures ANOVA
MS are computed by the appropriate SS/df. The test statistic is the appropriate MS divided by MSE.

Hypotheses      Test Statistic   Critical Value
H01 vs. Ha1     MSA / MSE        F(α, I - 1, IJ(K - 1))
H02 vs. Ha2     MSB / MSE        F(α, J - 1, IJ(K - 1))
H012 vs. Ha12   MSAB / MSE       F(α, (I - 1)(J - 1), IJ(K - 1))

Two Factor Repeated Measures ANOVA
Repeated Measures ANOVA Table
Source df SS MS F
Factor 1 I - 1 SS1 SS1 / df1 MS1/MSE
Factor 2 J-1 SS2 SS2 / df2 MS2/MSE
Interaction (I-1)(J-1) SS1x2 SS1x2 / df1x2 MS1x2/MSE
Within (Error) IJ(K-1) SSE SSE / dfE
Total IJK-1 SST
The ANOVA Table
Source df SS MS F
Music 2 740 370 49.89
Stage 1 30 30 4.05
Music x Stage 2 260 130 17.53
Error 24 178 7.42
Total 29 1208
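The table entries can be reproduced from the music/stage data with NumPy (a sketch of the sums-of-squares arithmetic; array axes are stage × music × replicate):

```python
import numpy as np

# data[stage][music][replicate]: stages (Early, Middle) x music (Piano, Mozart, Easy Listening)
data = np.array([
    [[21, 24, 22, 18, 20], [9, 12, 10, 5, 9],   [29, 26, 30, 24, 26]],  # Early
    [[22, 20, 25, 18, 20], [14, 18, 11, 9, 13], [15, 18, 20, 13, 19]],  # Middle
])
n_stage, n_music, k = data.shape
grand = data.mean()
ss_music = n_stage * k * ((data.mean(axis=(0, 2)) - grand) ** 2).sum()
ss_stage = n_music * k * ((data.mean(axis=(1, 2)) - grand) ** 2).sum()
cell_means = data.mean(axis=2)
ss_inter = k * ((cell_means - grand) ** 2).sum() - ss_music - ss_stage
ss_error = ((data - cell_means[..., None]) ** 2).sum()  # within-cell variation
```

These reproduce the table values: SS = 740 (music), 30 (stage), 260 (interaction), and 178 (error).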
Repeated Measures: Tukey's
When no significant interaction is found:
For comparing levels of factor A, obtain Q(α, I, IJ(K - 1))
For comparing levels of factor B, obtain Q(α, J, IJ(K - 1))
w = Q · √(MSE / JK) for factor A comparisons
w = Q · √(MSE / IK) for factor B comparisons
Arrange the sample means in increasing order and underscore pairs differing by less than w.
Multivariate Analysis of Variance
Referred to as MANOVA.
Used when there are multiple dependent variables.
The dependent variables are usually related to one another.
MANOVA helps to determine the effect of the treatment (IV) on each outcome (DV).
MANOVA Example
Do sex, race, and educational level affect how well people deal with the pressure of a terminal disease?
IV = sex (2), race (4), education (4)
DV = Coping strategies (5)
MANOVA can estimate the effects of the IV (sex, race, and education) for each of the five scales of coping strategies, independent of one another.
Analysis of Covariance
Referred to as ANCOVA.
Allows researchers to adjust or equalize baseline differences between groups.
In addition to the DV and IV, a covariate is entered into the model.
The covariate is a variable that is known to have an effect on the DV.
ANCOVA example
Wood et al. (2002) tested an educational intervention to promote breast self-examination (BSE).
Quasi-experimental design.
Groups differed in knowledge and skill related to BSE.
Entering these covariates into the model is essential to determine whether the difference is due to the intervention or to the initial difference.