CHEE824 - Winter 2006
J. McLellan 1
Background Slides for CHEE824
• Hypothesis tests
– For comparison of means
– Comparison of variances
– Discussion of power of a hypothesis test - type I and type II errors
• Joint confidence regions (for the linear case)
Hypothesis Tests
… are an alternative approach to confidence limits for factoring in uncertainty in decision-making
Approach
– make a hypothesis statement
– use appropriate test statistic for statement
– consider range of values for test statistic that would be likely to occur if hypothesis were true
– compare value of test statistic estimated from data to range - if the value falls outside the range (significant), the hypothesis is rejected; otherwise it is not rejected
Example
Naphtha reformer in a refinery
» under old catalyst, octane number was 90
» under new catalyst, average octane number of 92 has been estimated using a sample of 4 data points
» standard deviation of octane number in unit is known to be 1.5
» has the octane number improved significantly?
» we could use confidence limits to answer this question
• for the mean, with known variance
• form interval, and see if old value (90) is contained in interval for new mean
» consider direct test … hypothesis test
Example
Hypothesis test -
Null hypothesis (“status quo”)
$$H_0: \mu = 90$$
Alternate hypothesis
$$H_a: \mu > 90$$
– approach
» mean is estimated using sample average
» if observed average is within reasonable variation limits of old mean, conclude that no significant change has occurred
» reference distribution - Standard Normal
Example
» to compare with Standard Normal, we must standardize
» if mean under new catalyst was actually the old mean, then
$$\frac{\bar{X} - 90}{\sigma/\sqrt{4}}$$
would be distributed as a Standard Normal distribution
• observed values would vary accordingly
» now choose a fence - limit that contains 95% of values of Standard Normal
» if observed value exceeds fence, then it is unlikely that the mean under the new catalyst is equal to the old mean
• small chance of obtaining an observed average outside this range
» if value exceeds fence, reject null hypothesis
Example
» Compute test statistic value using observed average of 92:
$$\frac{\bar{x} - 90}{\sigma/\sqrt{4}} = \frac{92 - 90}{1.5/\sqrt{4}} = 2.67$$
» now determine fence - test at the 5% significance level (95% confidence level) - upper tail area is 0.05
• z = 1.65
» compare: 2.67 > 1.65 - conclude that mean must be significantly higher, since likelihood of obtaining an average of 92 when true mean is 90 is very small
[Figure: Standard Normal density with fence; upper tail area is 0.05]
We only use the upper tail here, because we are interested in testing to see whether the new mean is greater than the old mean.
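The octane calculation can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the function name `upper_tail_z_test` is an illustrative choice, not something from the course material.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(xbar, mu0, sigma, n, alpha=0.05):
    """One-sided test of H0: mu = mu0 against Ha: mu > mu0, sigma known.

    Returns the standardized test statistic, the fence (upper-tail
    critical value) and the reject/do-not-reject decision.
    """
    z = (xbar - mu0) / (sigma / sqrt(n))
    fence = NormalDist().inv_cdf(1.0 - alpha)  # upper tail area = alpha
    return z, fence, z > fence


# Octane example: old mean 90, observed average 92, sigma = 1.5, n = 4
z, fence, reject = upper_tail_z_test(xbar=92, mu0=90, sigma=1.5, n=4)
print(f"z = {z:.2f}, fence = {fence:.2f}, reject H0: {reject}")
```

The fence computed here is about 1.645, which the slide rounds to 1.65.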
Example
– there is a small chance (0.05) that we could obtain an observed average that would lie outside the fence even though the mean had not changed
» in this case, we would erroneously reject the null hypothesis, and conclude that the catalyst had caused a significant increase
» referred to as a “Type I error” - false rejection
• this would happen 5% of the time
• to reduce, move fence further to the extreme of the distribution - reduce upper tail area
» α = 0.05 is the “significance level”
• (1-α) is sometimes referred to as the “confidence level”
» α is a tuning parameter for the hypothesis test
Hypothesis Tests
Review sequence
1) formulate hypothesis
$$H_0: \mu = 90, \qquad H_a: \mu > 90$$
2) form test statistic
$$\frac{\bar{X} - 90}{\sigma/\sqrt{4}}$$
3) compare to “fence” value z = 1.65
4) in this case, reject null hypothesis
Types of Hypothesis Tests
One-sided tests
– null hypothesis - parameter equal to old value
– alternate hypothesis - parameter >, < old value
» e.g., $H_0: \mu = 90$, $H_a: \mu > 90$
Two-sided tests
– null hypothesis - parameter equal to old value
– alternate hypothesis - parameter not equal to old value (could be greater than, less than)
» e.g., $H_0: \mu = 90$, $H_a: \mu \neq 90$
In two-sided tests, two fences are used (upper, lower), and significance area is split evenly between lower and upper tails.
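Hypothesis Tests for Means
… with known variance
Two-Sided Test - at the α significance level
Hypotheses:
$$H_0: \mu = \mu_0, \qquad H_a: \mu \neq \mu_0$$
Test Statistic:
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
Fences:
$$-z_{\alpha/2}, \quad z_{\alpha/2}$$
Reject H0 if
$$\left| \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \right| > z_{\alpha/2}$$
[Figure: Standard Normal density with a rejection region in each tail]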
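Hypothesis Tests for Means
… with known variance
One-Sided Test - at the α significance level
Hypotheses:
$$H_0: \mu = \mu_0, \qquad H_a: \mu > \mu_0$$
Test Statistic:
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
Fences:
$$z_{\alpha}$$
Reject H0 if
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} > z_{\alpha}$$
[Figure: Standard Normal density with rejection region in the upper tail]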
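Hypothesis Tests for Means
… with known variance
One-Sided Test - at the α significance level
Hypotheses:
$$H_0: \mu = \mu_0, \qquad H_a: \mu < \mu_0$$
Test Statistic:
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
Fences:
$$-z_{\alpha} = z_{1-\alpha}$$
Reject H0 if
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < z_{1-\alpha}$$
[Figure: Standard Normal density with rejection region in the lower tail]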
Hypothesis Tests for Means
When the variance is unknown, we estimate it using the sample variance.
Test statistic
– use “standardization” using sample standard deviation
$$\frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$
Reference distribution
– becomes the Student’s t distribution
– degrees of freedom are those of the sample variance
» n-1
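Hypothesis Tests for Means
… with unknown variance
Two-Sided Test - at the α significance level
Hypotheses:
$$H_0: \mu = \mu_0, \qquad H_a: \mu \neq \mu_0$$
Test Statistic:
$$\frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$
Fences:
$$t_{n-1,\alpha/2}, \qquad -t_{n-1,\alpha/2} = t_{n-1,1-\alpha/2}$$
Reject H0 if
$$\left| \frac{\bar{X} - \mu_0}{s/\sqrt{n}} \right| > t_{n-1,\alpha/2}$$
[Figure: t density with a rejection region in each tail]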
Hypothesis Tests for Means
… with unknown variance
One-Sided Test - at the α significance level
Hypotheses:
$$H_0: \mu = \mu_0, \qquad H_a: \mu > \mu_0$$
Test Statistic:
$$\frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$
Fences:
$$t_{n-1,\alpha}$$
Reject H0 if
$$\frac{\bar{X} - \mu_0}{s/\sqrt{n}} > t_{n-1,\alpha}$$
[Figure: t density with rejection region in the upper tail]
Hypothesis Tests for Means
… with unknown variance
One-Sided Test - at the α significance level
Hypotheses:
$$H_0: \mu = \mu_0, \qquad H_a: \mu < \mu_0$$
Test Statistic:
$$\frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$
Fences:
$$-t_{n-1,\alpha} = t_{n-1,1-\alpha}$$
Reject H0 if
$$\frac{\bar{X} - \mu_0}{s/\sqrt{n}} < t_{n-1,1-\alpha}$$
[Figure: t density with rejection region in the lower tail]
Hypothesis Tests for Variances
• Hypotheses
» e.g.,
$$H_0: \sigma^2 = \sigma_0^2, \qquad H_a: \sigma^2 \neq \sigma_0^2$$
• Test Statistic
» since
$$\frac{s^2}{\sigma^2} \sim \frac{\chi^2_{n-1}}{n-1}$$
then, under the null hypothesis, the test statistic is
$$\frac{(n-1)s^2}{\sigma_0^2} \sim \chi^2_{n-1}$$
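Hypothesis Tests for Variances
Two-Sided Test - at the α significance level
Hypotheses:
$$H_0: \sigma^2 = \sigma_0^2, \qquad H_a: \sigma^2 \neq \sigma_0^2$$
Test Statistic:
$$\frac{(n-1)s^2}{\sigma_0^2}$$
Fences:
$$\chi^2_{n-1,1-\alpha/2}, \quad \chi^2_{n-1,\alpha/2}$$
Reject H0 if
$$\frac{(n-1)s^2}{\sigma_0^2} < \chi^2_{n-1,1-\alpha/2} \quad \text{or} \quad \frac{(n-1)s^2}{\sigma_0^2} > \chi^2_{n-1,\alpha/2}$$
[Figure: Chi-squared density with a rejection region in each tail]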
Hypothesis Tests for Variances
One-Sided Test - at the α significance level
Hypotheses:
$$H_0: \sigma^2 = \sigma_0^2, \qquad H_a: \sigma^2 > \sigma_0^2$$
Test Statistic:
$$\frac{(n-1)s^2}{\sigma_0^2}$$
Fences:
$$\chi^2_{n-1,\alpha}$$
Reject H0 if
$$\frac{(n-1)s^2}{\sigma_0^2} > \chi^2_{n-1,\alpha}$$
[Figure: Chi-squared density with rejection region in the upper tail]
Hypothesis Tests for Variances
One-Sided Test - at the α significance level
Hypotheses:
$$H_0: \sigma^2 = \sigma_0^2, \qquad H_a: \sigma^2 < \sigma_0^2$$
Test Statistic:
$$\frac{(n-1)s^2}{\sigma_0^2}$$
Fences:
$$\chi^2_{n-1,1-\alpha}$$
Reject H0 if
$$\frac{(n-1)s^2}{\sigma_0^2} < \chi^2_{n-1,1-\alpha}$$
[Figure: Chi-squared density with rejection region in the lower tail]
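As a rough illustration of the variance tests, the sketch below computes the test statistic (n-1)s²/σ0² and compares it with an upper-tail fence. The data values are hypothetical, chosen only for illustration, and the fence 16.92 is the tabulated Chi-squared value for 9 degrees of freedom and upper tail area 0.05, entered by hand because the Python standard library has no Chi-squared quantile function.

```python
def variance_test_statistic(s_sq, sigma0_sq, n):
    """Chi-squared test statistic (n - 1) * s^2 / sigma0^2."""
    return (n - 1) * s_sq / sigma0_sq


# Hypothetical numbers (not from the slides): 10 points, s^2 = 4.5,
# and a hypothesized variance of 2.0
n = 10
s_sq = 4.5
sigma0_sq = 2.0

# Tabulated upper-tail fence: chi-squared, 9 degrees of freedom, area 0.05
CHI2_9_UPPER_05 = 16.92

stat = variance_test_statistic(s_sq, sigma0_sq, n)
reject = stat > CHI2_9_UPPER_05  # one-sided test of Ha: sigma^2 > sigma0^2
print(f"statistic = {stat:.2f}, fence = {CHI2_9_UPPER_05}, reject H0: {reject}")
```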
Outline
• random samples
• notion of a statistic
• estimating the mean - sample average
• assessing the impact of variation on estimates - sampling distribution
• estimating variance - sample variance and standard deviation
• making decisions - comparisons of means, variances using confidence intervals, hypothesis tests
• comparisons between samples
Comparisons Between Two Samples
So far, we have tested means and variances against known values
» can we compare estimates of means (or variances) between two samples?
» Issue - uncertainty present in both quantities, and must be considered
Common Question
» do both samples come from the same underlying parent population?
» e.g., compare populations before and after a specific treatment
Preparing to Compare Samples
Experimental issues
» ensure that data are collected in a randomized order for each sample
• ensure that there are no systematic effects - e.g., catalyst deactivation, changes in ambient conditions, cooling water heating up gradually
» blocking - subject experimentation to same conditions - ensure quantities other than those of interest aren’t changing
Comparison of Variances
… is typically conducted prior to comparing means
» recall that standardization required for hypothesis test (or confidence interval) for the mean requires use of the standard deviation, so we should compare variances first before choosing appropriate mean comparison
Approach
» focus on ratio of variances $\sigma_1^2/\sigma_2^2$
• is this ratio = 1?
• will be assessed using sample variances
» what should we use for a reference distribution?
Comparison of Variances
Test Statistic
– for use in both hypothesis tests and confidence intervals
The quantity
$$\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \sim F_{n_1-1,\,n_2-1}$$
follows the F-distribution
» n1 and n2 are the number of points in the samples used to compute $s_1^2$ and $s_2^2$ respectively
The F Distribution
… arises from the ratio of two Chi-squared random variables, each divided by their degrees of freedom
$$\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \sim F_{n_1-1,\,n_2-1}, \qquad \frac{s^2}{\sigma^2} \sim \frac{\chi^2_{n-1}}{n-1}$$
» sample variance is sum of squared Normal random variables
» dividing by population variance standardizes them, and the expression becomes sum of squared standard Normal r.v.’s, i.e., Chi-squared
Confidence Interval Approach
Form probability statement for this test statistic:
$$P\left( F_{n_1-1,n_2-1,1-\alpha/2} < \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} < F_{n_1-1,n_2-1,\alpha/2} \right) = 1 - \alpha$$
and rearrange:
$$P\left( \frac{s_1^2}{s_2^2\, F_{n_1-1,n_2-1,\alpha/2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2\, F_{n_1-1,n_2-1,1-\alpha/2}} \right) = 1 - \alpha$$
Confidence Interval Approach
100(1-α)% Confidence Interval
$$\frac{s_1^2}{s_2^2\, F_{n_1-1,n_2-1,\alpha/2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2\, F_{n_1-1,n_2-1,1-\alpha/2}}$$
Approach:
» compute confidence interval
» determine whether “1” lies in the interval
• if so - identical variances is a reasonable conjecture
• if not - different variances
Hypothesis Test Approach
Typical approach
– use a 1-sided test, with the test direction dictated by which variance is larger
Test Statistic
$$\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} = \frac{s_1^2}{s_2^2}$$
Under the null hypothesis, we are assuming that
$$\frac{\sigma_1^2}{\sigma_2^2} = 1$$
Hypothesis Tests for Variances
One-Sided Test - at the α significance level
For $s_1^2 > s_2^2$
Hypotheses:
$$H_0: \sigma_1^2 = \sigma_2^2, \qquad H_a: \sigma_1^2 > \sigma_2^2$$
Test Statistic:
$$\frac{s_1^2}{s_2^2}$$
Fences:
$$F_{n_1-1,n_2-1,\alpha}$$
Reject H0 if
$$\frac{s_1^2}{s_2^2} > F_{n_1-1,n_2-1,\alpha}$$
Hypothesis Tests for Variances
One-Sided Test - at the α significance level
For $s_2^2 > s_1^2$
Hypotheses:
$$H_0: \sigma_1^2 = \sigma_2^2, \qquad H_a: \sigma_2^2 > \sigma_1^2$$
Test Statistic:
$$\frac{s_2^2}{s_1^2}$$
Fences:
$$F_{n_2-1,n_1-1,\alpha}$$
Reject H0 if
$$\frac{s_2^2}{s_1^2} > F_{n_2-1,n_1-1,\alpha}$$
Why the reversal?
Why the reversal?
• Property of F-distribution
• typically, we would compare $s_1^2/s_2^2$ against $F_{n_1-1,n_2-1,1-\alpha}$
• Problem -
» tables for upper tail areas of 1-α are not always available
• Solution - use the following fact for F-distributions
$$F_{\nu_1,\nu_2,1-\alpha} = \frac{1}{F_{\nu_2,\nu_1,\alpha}}$$
• to use this, reverse the test ratio - previous slide
Example
Global warming problem from tutorial:
» s1 - standard devn for March ‘99 is 3.2 °C
» s2 - standard devn for March ‘98 is 2.3 °C
» each is estimated using 31 data points
» has the variance of temperature readings increased in 1999?
» first, work with variances:
• 1999 -- 10.2 °C²
• 1998 -- 5.3 °C²
» since a) we are interested in whether variance increased, and b) 1999 variance (10.2) is greater than 1998 variance (5.3), use the ratio $s_1^2/s_2^2$
Example
Hypotheses:
$$H_0: \sigma_1^2 = \sigma_2^2, \qquad H_a: \sigma_1^2 > \sigma_2^2$$
» observed value of ratio = 1.94
» “fence value” - test at the 5% significance level:
• $F_{31-1,\,31-1,\,0.05} = 1.84$
» since observed value of test statistic exceeds fence value, reject the null hypothesis
• variance has increased
Note
» if we had conducted the test at the 1% significance level (F=2.39), we would not have rejected the null hypothesis
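The comparison in this example can be checked with a short calculation. This is a sketch only: the fence values 1.84 and 2.39 are the ones quoted on the slide (entered by hand rather than computed), and the function name is illustrative. Note that the ratio of 1.94 comes from the unrounded variances 3.2² and 2.3².

```python
def variance_ratio(s_larger, s_smaller):
    """Ratio of sample variances, given the two sample standard deviations."""
    return s_larger ** 2 / s_smaller ** 2


# March '99 (s1 = 3.2) against March '98 (s2 = 2.3), 31 points each
ratio = variance_ratio(3.2, 2.3)

# Fence values as quoted on the slide for F with (30, 30) degrees of freedom
F_FENCE_5PCT = 1.84
F_FENCE_1PCT = 2.39

reject_5pct = ratio > F_FENCE_5PCT
reject_1pct = ratio > F_FENCE_1PCT
print(f"ratio = {ratio:.2f}")
print(f"reject at 5%: {reject_5pct}, reject at 1%: {reject_1pct}")
```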
Example
Now use confidence intervals to compare variances:
$$\frac{s_1^2}{s_2^2\, F_{n_1-1,n_2-1,\alpha/2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2\, F_{n_1-1,n_2-1,1-\alpha/2}}$$
» use a 95% confidence interval - outer tail area is 2.5% on each side
» this is a 2-tailed interval, so we need
$$F_{n_1-1,n_2-1,\alpha/2} = F_{31-1,31-1,0.025} = 2.07$$
$$F_{n_1-1,n_2-1,1-\alpha/2} = F_{n_1-1,n_2-1,0.975} = \frac{1}{F_{n_2-1,n_1-1,0.025}} = \frac{1}{F_{31-1,31-1,0.025}} = 0.48$$
Example
Confidence interval:
$$\frac{10.2}{5.3\,(2.07)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{10.2}{5.3\,(0.48)} \quad \Rightarrow \quad 0.93 < \frac{\sigma_1^2}{\sigma_2^2} < 4.0$$
Conclusion
» since 1 is contained in this interval, we conclude that the variances are plausibly the same
» why does the conclusion differ from the hypothesis test?
• 2-sided confidence interval vs. 1-sided hypothesis test
• in confidence interval, 1 is close to the lower boundary
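The interval limits can be reproduced as follows. This is a minimal sketch: the two F values (2.07 and 0.48) are taken from the slides and passed in by hand, and the function and argument names are illustrative assumptions.

```python
def variance_ratio_interval(s1_sq, s2_sq, f_upper, f_lower):
    """Confidence interval for sigma1^2 / sigma2^2.

    f_upper is the F fence with upper tail area alpha/2,
    f_lower is the F fence with upper tail area 1 - alpha/2.
    Dividing by the larger fence gives the lower limit.
    """
    ratio = s1_sq / s2_sq
    return ratio / f_upper, ratio / f_lower


# Variances 10.2 (1999) and 5.3 (1998); fences from the slides
lower, upper = variance_ratio_interval(10.2, 5.3, f_upper=2.07, f_lower=0.48)
contains_one = lower < 1.0 < upper
print(f"{lower:.2f} < sigma1^2/sigma2^2 < {upper:.2f}; contains 1: {contains_one}")
```

The lower limit is only slightly below 1, which is why the two-sided interval and the one-sided test reach different conclusions here.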
Comparing Means
The appropriate approach depends on:
» whether variances are known
» whether a test of sample variances indicates that variances can be considered to be equal
• measurements coming from same population
Assumption: data are Normally distributed
The approach is similar, however the form depends on the conditions above
» form test statistic
» use reference distribution
» re-arrange (confidence intervals) or compare to fence (hypothesis tests)
Comparing Means
Known Variances
» if variances are known ($\sigma_1^2, \sigma_2^2$), then
$$\bar{X}_1 - \bar{X}_2 \sim N\left( \mu_1 - \mu_2,\; \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \right)$$
» now we can standardize to obtain our test statistic
$$\frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim Z$$
Note - we are assuming that the samples used for the averages are independent.
Comparing Means
Known Variances
Confidence Interval
» form probability statement for test statistic as a Standard Normal random variable
» re-arrange interval
» procedure analogous to that for mean with known variance
$$(\bar{X}_1 - \bar{X}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
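Comparing Means
Known Variances
Hypothesis Test - Two-Sided Test
$$H_0: \mu_1 = \mu_2, \qquad H_a: \mu_1 \neq \mu_2$$
Test Statistic
$$\frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
Fences
$$-z_{\alpha/2}, \quad z_{\alpha/2}$$
Reject H0 if
$$\left| \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \right| > z_{\alpha/2}$$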
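A two-sided comparison of means with known variances might be coded along these lines. The sketch uses only the Python standard library; the function name and the numbers in the usage lines are hypothetical and are not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(xbar1, xbar2, var1, var2, n1, n2, alpha=0.05):
    """Two-sided test of H0: mu1 = mu2 with known variances var1, var2.

    Assumes the two samples are independent and Normally distributed.
    """
    z = (xbar1 - xbar2) / sqrt(var1 / n1 + var2 / n2)
    fence = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # tail area alpha/2 each side
    return z, fence, abs(z) > fence


# Hypothetical data: two averages of 4 points each, known variance 2.25
z, fence, reject = two_sample_z_test(92.0, 90.0, 2.25, 2.25, 4, 4)
print(f"z = {z:.3f}, fences = +/-{fence:.3f}, reject H0: {reject}")
```

With these made-up numbers the statistic falls just inside the fences, so the null hypothesis is not rejected: uncertainty in both averages widens the reference spread compared with testing one average against a fixed value.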
Comparing Means
Unknown Variance
– appropriate choice depends on whether variances can be considered equal or are different
» test using comparison of variances
» if variances can be considered to be equal, assume that we are sampling with same population variance
» pool variance estimate to obtain estimate with more degrees of freedom
Pooling Variance
– If variances can reasonably be considered to be the same, then we can assume that we are sampling from population with same variance
» convert sample variances back to sums of squares, add them together, and divide by the combined number of degrees of freedom
$$s_1^2 = \frac{1}{n_1 - 1}\sum_{i=1}^{n_1} (X_{1,i} - \bar{X}_1)^2 \quad \Rightarrow \quad (n_1 - 1)\,s_1^2 = \sum_{i=1}^{n_1} (X_{1,i} - \bar{X}_1)^2$$
» can follow similar procedure for $s_2^2$
Pooling Variance
– We have obtained the original sum of squares from each sample variance
– combine to form overall sum of squares
$$SS_{overall} = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2$$
– degrees of freedom
$$\nu_{overall} = (n_1 - 1) + (n_2 - 1) = n_1 + n_2 - 2$$
– pooled variance estimate
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
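The pooling steps can be written as a small helper. This is a sketch; the function name and the sample values are hypothetical, chosen only to show the arithmetic.

```python
def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Pool two sample variances, assuming a common population variance.

    Returns the pooled variance estimate and its degrees of freedom.
    """
    ss_overall = (n1 - 1) * s1_sq + (n2 - 1) * s2_sq  # recover sums of squares
    dof = (n1 - 1) + (n2 - 1)                         # n1 + n2 - 2
    return ss_overall / dof, dof


# Hypothetical samples: s1^2 = 4.0 from 6 points, s2^2 = 6.0 from 11 points
sp_sq, dof = pooled_variance(4.0, 6, 6.0, 11)
print(f"pooled variance = {sp_sq:.3f} with {dof} degrees of freedom")
```

The pooled value lies between the two sample variances and is weighted toward the sample with more degrees of freedom.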
Comparing Means
Unknown Variance - “Equal Variances”
Confidence Intervals
$$(\bar{X}_1 - \bar{X}_2) - t_{\nu,\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + t_{\nu,\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
» recall that $t_{\nu,1-\alpha/2} = -t_{\nu,\alpha/2}$
» since variance is estimated, we use the t-distribution as a reference distribution
» degrees of freedom ν = (n1-1) + (n2-1)
» if 0 lies in this interval, means are not significantly different
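Comparing Means
Unknown Variance - “Equal Variances”
Hypothesis Test
$$H_0: \mu_1 = \mu_2, \qquad H_a: \mu_1 \neq \mu_2$$
Test Statistic
$$\frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
Fences
$$t_{\nu,\alpha/2}, \qquad -t_{\nu,\alpha/2} = t_{\nu,1-\alpha/2}$$
Reject H0 if
$$\left| \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \right| > t_{\nu,\alpha/2}$$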
Comparing Means
Unknown Variance - “Unequal Variances”
– test becomes an approximation
• approach
» test statistic
$$\frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
» reference distribution - Student’s t distribution
» estimate an “equivalent” number of degrees of freedom
Comparing Means
Unknown Variance - “Unequal Variances”
– equivalent number of degrees of freedom
– degrees of freedom ν is largest integer less than or equal to
$$\frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}{\dfrac{\left( s_1^2/n_1 \right)^2}{n_1 - 1} + \dfrac{\left( s_2^2/n_2 \right)^2}{n_2 - 1}}$$
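The equivalent degrees of freedom are easy to get wrong by hand, so a short sketch may help. It assumes the form of the expression with (n1 - 1) and (n2 - 1) in the denominator terms; the function name and the numbers are hypothetical.

```python
from math import floor


def equivalent_dof(s1_sq, n1, s2_sq, n2):
    """Equivalent degrees of freedom for the unequal-variance comparison.

    Returns the largest integer less than or equal to the computed value,
    together with the unrounded value.
    """
    a = s1_sq / n1
    b = s2_sq / n2
    raw = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return floor(raw), raw


# Hypothetical samples: s1^2 = 4.0 from 5 points, s2^2 = 9.0 from 10 points
nu, raw = equivalent_dof(4.0, 5, 9.0, 10)
print(f"unrounded value = {raw:.2f}, degrees of freedom used = {nu}")
```

The result never exceeds the pooled value n1 + n2 - 2, so the fences are at least as wide as in the equal-variance case.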
Comparing Means
Unknown Variance - “Unequal Variances”
Confidence Intervals
» similar to case of known variances, but using sample variances and t-distribution
$$(\bar{X}_1 - \bar{X}_2) - t_{\nu,\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + t_{\nu,\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
» degrees of freedom ν is the effective number of degrees of freedom (from previous slide)
» recall that $t_{\nu,1-\alpha/2} = -t_{\nu,\alpha/2}$
» if 0 isn’t contained in interval, conclude that means differ
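Comparing Means
Unknown Variance - “Unequal Variances”
Hypothesis Test
$$H_0: \mu_1 = \mu_2, \qquad H_a: \mu_1 \neq \mu_2$$
Test Statistic
$$\frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Fences
$$t_{\nu,1-\alpha/2}, \quad t_{\nu,\alpha/2}$$
Reject H0 if
$$\left| \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \right| > t_{\nu,\alpha/2}$$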
Paired Comparisons for Means
Previous approach
» 2 data sets obtained from 2 processes
» compute average, sample variance for EACH data set
» compare differences between sample averages
Issue -
» extraneous variation present because we have conducted one experimental program for process 1, and one distinct experimental program for process 2
» additional variation reduces sensitivity of tests
• location of fences depends in part on extent of variation
» can we conduct experiments in a paired manner so that they have as much variation in common as possible, and extraneous variation is eliminated?
Paired Comparisons of Means
Approach -
» set up pairs of experimental runs with as much in common as possible
» collect pairs of observations for each experimental run -- process 1, process 2
» compute differences
» conduct a confidence interval or hypothesis test on the mean of the differences, using the average of the differences in the test statistic
• variance estimated using the sample variance of the differences
» test to see if the mean of the differences is plausibly zero (no difference in population means)
Paired Comparison of Means
Example - oxide thickness on silicon wafers
» runs at two positions in a furnace
» run pairs of tests with a wafer in each location
Furnace Position
A      B      difference
920    923    -3
914    924    -10
927    913    14
891    881    10
943    923    20
902    884    18
910    887    23
856    858    -2
937    916    21
857    857    0
average    9.1
variance   141.6556
std        11.90191
Paired Comparison of Means
Confidence Interval
$$\bar{D} - t_{n-1,\alpha/2}\, \frac{s_d}{\sqrt{n}} < \mu_1 - \mu_2 < \bar{D} + t_{n-1,\alpha/2}\, \frac{s_d}{\sqrt{n}}$$
» $\bar{D}, s_d$ are average and standard deviation of differences
» conclude that means are plausibly identical if zero is contained in interval
» n is number of data points in paired samples (e.g., 10 pairs)
Paired Comparison of Means
Hypothesis Test
$$H_0: \mu_1 - \mu_2 = 0, \qquad H_a: \mu_1 - \mu_2 \neq 0$$
Test Statistic
$$\frac{\bar{D}}{s_d/\sqrt{n}}$$
Fences
$$t_{n-1,1-\alpha/2}, \quad t_{n-1,\alpha/2}$$
Reject H0 if
$$\left| \frac{\bar{D}}{s_d/\sqrt{n}} \right| > t_{n-1,\alpha/2}$$
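The paired procedure can be applied to the oxide thickness data from the furnace example. This is a sketch under stated assumptions: the slides do not give a significance level or a conclusion for this example, so a 5% two-sided test is assumed here, and the fence 2.262 is the tabulated t value for 9 degrees of freedom and upper tail area 0.025, entered by hand.

```python
from math import sqrt
from statistics import mean, stdev

# Oxide thickness at furnace positions A and B (10 paired runs)
position_a = [920, 914, 927, 891, 943, 902, 910, 856, 937, 857]
position_b = [923, 924, 913, 881, 923, 884, 887, 858, 916, 857]

differences = [a - b for a, b in zip(position_a, position_b)]
n = len(differences)
d_bar = mean(differences)  # average of the differences
s_d = stdev(differences)   # sample standard deviation of the differences

# Assumed test level: alpha = 0.05, two-sided; tabulated t fence for 9 dof
T_FENCE = 2.262

t_stat = d_bar / (s_d / sqrt(n))
half_width = T_FENCE * s_d / sqrt(n)
interval = (d_bar - half_width, d_bar + half_width)
reject = abs(t_stat) > T_FENCE

print(f"D-bar = {d_bar:.1f}, s_d = {s_d:.3f}, t = {t_stat:.2f}")
print(f"95% interval: {interval[0]:.2f} to {interval[1]:.2f}, reject H0: {reject}")
```

Under these assumptions the interval does not contain zero, which is consistent with the test rejecting the hypothesis of no difference between positions.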
“Tuning” Hypothesis Tests
What significance level should we use for a hypothesis test?
[Figure: reference distribution with the rejection region marked in the tails]
Rejection region has area α. If the null hypothesis were actually true, there is probability α that the observed value would fall outside the fences, and we would erroneously reject the null hypothesis - FALSE REJECTION - referred to as a Type I error
Adjusting the False Rejection Rate
… is achieved by moving the fences further out
» use a higher threshold as a basis to reject null hypothesis
» i.e., make the outer tail area SMALLER
» e.g., instead of testing at 5% significance (95% confidence level), test at 1% significance level (99% confidence level)
Failure to Detect
Suppose the mean has actually increased.
[Figure: reference distribution under the null hypothesis and the shifted distribution, showing the false rejection region and the failure to detect region]
False rejection region with area = α
Failure to detect region - for observed values of the test statistic falling in this region, the null hypothesis should in fact be rejected; however it isn’t, because the values fall within the acceptance region - FAILURE TO REJECT - referred to as a Type II error, which has a probability β of occurring
Failure to Detect
The probability of a type II error depends on:
– size of the shift to be detected
– location of the fence -- significance level (Type I error probability)
– these influence degree of overlap of two distributions, and thus the overlap area
[Figure: overlapping distributions; overlap area = β]
Failure to Detect
Schematic: Distribution for X-bar is standardized as:
$$\frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
however, if the true mean has shifted, this is not a standard Normal random variable. If the new mean has shifted by Δμ, then we must use
$$\frac{\bar{X} - \mu_0 - \Delta\mu}{\sigma/\sqrt{n}} = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} - \frac{\Delta\mu}{\sigma/\sqrt{n}}$$
as the standardized form
Failure to Detect
Computing β - for 1-sided hypothesis test
» outer tail area on high side is α
» fence value is $z_\alpha$
» type II error probability is
$$\beta = P\left( \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < z_\alpha \right)$$
where $\bar{X}$ has mean μ0+Δμ
» in order to compute probability of type II error, convert to standard normal:
$$\beta = P\left( Z < z_\alpha - \frac{\Delta\mu}{\sigma/\sqrt{n}} \right)$$
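For the known-variance, one-sided case the type II error probability can be evaluated numerically from the Standard Normal cumulative distribution. The sketch below assumes that setting and uses only the Python standard library; the function name is illustrative, and the usage lines reuse the octane numbers (shift of 2, σ = 1.5, n = 4) purely as an example.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(shift, sigma, n, alpha=0.05):
    """Probability of failing to reject H0 in an upper-tail z test
    when the true mean has shifted upward by `shift`."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha)           # fence
    return std_normal.cdf(z_alpha - shift / (sigma / sqrt(n)))


beta = type_ii_error(shift=2.0, sigma=1.5, n=4)
power = 1.0 - beta
print(f"beta = {beta:.3f}, power = {power:.3f}")

# A larger sample lowers the failure-to-detect rate for the same shift
beta_n10 = type_ii_error(shift=2.0, sigma=1.5, n=10)
print(f"beta with n = 10: {beta_n10:.4f}")
```

With no shift the function returns 1 - α, as expected: the statistic then falls inside the fence exactly as often as the null hypothesis predicts.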
Failure to Detect
Introduce
$$\phi = \frac{\Delta\mu}{\sigma}$$
» size of shift as multiple of standard deviation of X (population)
» no analytical expression for β
» summarize in graphs referred to as Operating Characteristic Curves
» 1-β is called the POWER of the hypothesis test
Operating Characteristic Curve
• Example shape of the curve
[Figure: operating characteristic curves for n=1, n=5 and n=50, for a fixed value of α, with an arrow indicating increasing sample size n. Horizontal axis: size of shift (0 to 4); vertical axis: probability of failing to reject (0 to 1).]
Operating Characteristic Curve
• Illustrates trade-off between false detection/failure to detect for fixed sample size
• Use - examples
» given desired false detection, failure to detect rates, determine sample size required to detect given shift
» given sample size and false detection rate, determine failure to detect rate given size of shift
[Figure: diagram relating sample size n, failure to detect rate, and false detection rate]
Operating Characteristic Curves
… are available for:
» 2-sided hypothesis test for mean
• variance known
• variance unknown
» 1-sided hypothesis test for mean
• variance known
• variance unknown
» tests for variance
Joint Confidence Region (JCR)
… answers the question
Where do the true values of the parameters lie?
Recall that for individual parameters, we gain an understanding of where the true value lies by:
» examining the variability pattern (distribution) for the parameter estimate
» identify a range in which most of the values of the parameter estimate are likely to lie
» manipulate this range to determine an interval which is likely to contain the true value of the parameter
Joint Confidence Region
Confidence interval for individual parameter:
Step 1) The ratio of the estimate, centred at the true value, to its standard deviation is distributed as a Student’s t-distribution with degrees of freedom equal to those of the variance estimate
$$\frac{\hat{\beta}_i - \beta_i}{s_{\hat{\beta}_i}} \sim t_\nu$$
Step 2) Find interval $[-t_{\nu,\alpha/2},\; t_{\nu,\alpha/2}]$ which contains 100(1-α)% of values - i.e., probability of a t-value falling in this interval is (1-α)
Step 3) Rearrange this interval to obtain interval $\hat{\beta}_i \pm t_{\nu,\alpha/2}\, s_{\hat{\beta}_i}$ which contains true value of parameter 100(1-α)% of the time
Joint Confidence Region
Comments on Individual Confidence Intervals:
» sometimes referred to as marginal confidence intervals - cf. marginal distributions vs. joint distributions from earlier
» marginal confidence intervals do NOT account for correlations between the parameter estimates
» examining only marginal confidence intervals can sometimes be misleading if there is strong correlation between several parameter estimates
• value of one parameter estimate depends in part on another
• deletion of the other changes the value of the parameter estimate
• decision to retain might be altered
Joint Confidence Region
Sequence:
Step 1) Identify a statistic which is a function of the parameter estimate statistics
Step 2) Identify a region in which values of this statistic lie a certain fraction of the time (a 100(1-α)% region)
Step 3) Use this information to determine a region which contains the true value of the parameters 100(1-α)% of the time
Joint Confidence Region
The quantity
$$\frac{(\hat{\beta} - \beta)^T X^T X (\hat{\beta} - \beta)}{p\, s_\varepsilon^2} \sim F_{p,\,n-p}$$
is the ratio of two sums of squares, and is distributed as an F-distribution with p degrees of freedom in the numerator, and n-p degrees of freedom in the denominator
» $s_\varepsilon^2$ is the estimate of inherent noise variance (if MSE is used, degrees of freedom is n-p)
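Joint Confidence Region
We can define a region by thinking of those values of the ratio which have a value less than $F_{p,n-p,1-\alpha}$ (here the subscript 1-α is the cumulative probability, so the upper tail area is α)
i.e.,
$$\frac{(\hat{\beta} - \beta)^T X^T X (\hat{\beta} - \beta)}{p\, s_\varepsilon^2} \le F_{p,n-p,1-\alpha}$$
Rearranging yields:
$$(\hat{\beta} - \beta)^T X^T X (\hat{\beta} - \beta) \le p\, s_\varepsilon^2\, F_{p,n-p,1-\alpha}$$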
Joint Confidence Region - Definition
The joint confidence region for the parameters is defined as those parameter values satisfying:
$$(\hat{\beta} - \beta)^T X^T X (\hat{\beta} - \beta) \le p\, s_\varepsilon^2\, F_{p,n-p,1-\alpha}$$
Interpretation:
» the region defined by this inequality contains the true values of the parameters 100(1-α)% of the time
» if values of zero for one or more parameters lie in this region, those parameters are plausibly zero, and consideration should be given to dropping the corresponding terms from the model
Joint Confidence Region - Example with 2 Parameters
Let’s reconsider the solder thickness example:
$$X^T X = \begin{bmatrix} 10 & 2367 \\ 2367 & 563335 \end{bmatrix}; \qquad \hat{\beta} = \begin{bmatrix} 458.10 \\ -1.13 \end{bmatrix}; \qquad s_\varepsilon^2 = 135.38$$
95% Joint Confidence Region (JCR) for slope&intercept:
$$(\hat{\beta} - \beta)^T X^T X (\hat{\beta} - \beta) = \begin{bmatrix} \hat{\beta}_0 - \beta_0 & \hat{\beta}_1 - \beta_1 \end{bmatrix} X^T X \begin{bmatrix} \hat{\beta}_0 - \beta_0 \\ \hat{\beta}_1 - \beta_1 \end{bmatrix} \le p\, s_\varepsilon^2\, F_{p,n-p,1-\alpha} = 2\, s_\varepsilon^2\, F_{2,\,10-2,\,0.95}$$
Joint Confidence Region - Example with 2 Parameters
95% Joint Confidence Region (JCR) for slope&intercept:
$$\begin{bmatrix} 458.10 - \beta_0 & -1.13 - \beta_1 \end{bmatrix} X^T X \begin{bmatrix} 458.10 - \beta_0 \\ -1.13 - \beta_1 \end{bmatrix} \le 2\,(135.38)\, F_{2,8,0.95} = 2\,(135.38)(4.46) = 1207.59$$
The boundary is an ellipse...
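Membership in the joint confidence region can be checked point by point by evaluating the quadratic form. This is a minimal sketch using the numbers from the solder thickness example; the function names and the trial points are illustrative assumptions, and the F value 4.46 is entered by hand from the slide.

```python
# Values from the solder thickness example
XTX = [[10.0, 2367.0],
       [2367.0, 563335.0]]
BETA_HAT = [458.10, -1.13]   # least squares intercept and slope
S_SQ = 135.38                # estimate of the noise variance
F_FENCE = 4.46               # F fence for (2, 8) degrees of freedom, 95%
P = 2                        # number of parameters

BOUND = P * S_SQ * F_FENCE   # right-hand side of the inequality


def quadratic_form(beta):
    """(beta_hat - beta)^T X^T X (beta_hat - beta) for a 2-parameter model."""
    d0 = BETA_HAT[0] - beta[0]
    d1 = BETA_HAT[1] - beta[1]
    return XTX[0][0] * d0 * d0 + 2.0 * XTX[0][1] * d0 * d1 + XTX[1][1] * d1 * d1


def in_joint_region(beta):
    """True if the candidate parameter pair lies inside the 95% JCR."""
    return quadratic_form(beta) <= BOUND


print(f"bound = {BOUND:.2f}")
print(in_joint_region([458.10, -1.13]))  # centre of the ellipse
print(in_joint_region([478.10, -1.13]))  # intercept moved alone by 20
print(in_joint_region([508.10, -1.33]))  # intercept and slope moved together
print(in_joint_region([0.0, 0.0]))       # both parameters zero
```

The second and third trial points show the effect of correlation: moving the intercept by 20 on its own leaves the region, while a larger move of 50 paired with a compensating change in slope stays inside it.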
Joint Confidence Region - Example with 2 Parameters
[Figure: elliptical joint confidence region. Horizontal axis: Intercept (320 to 600); vertical axis: Slope (-1.6 to -0.6).]
• rotated - implies correlation between estimates of slope and intercept
• centred at least squares parameter estimates
• greater “shadow” along horizontal axis --> variance of intercept estimate is greater than that of slope
Interpreting Joint Confidence Regions
1) Are axes aligned with coordinate axes?
» is ellipse horizontal or vertical?
» indicates no correlation between parameter estimates
2) Which axis has the greatest shadow?
» projection of ellipse along axis
» indicates which parameter estimate has the greatest variance
3) The elliptical region is, by definition, centred at the least squares parameter estimates
4) Long, narrow, rotated ellipses indicate significant correlation between parameter estimates
5) If a value of zero for one or more parameters lies in the region, these parameters are plausibly zero - consider deleting from model
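Joint Confidence Regions
What is the motivation for the ratio
$$\frac{(\hat{\beta} - \beta)^T X^T X (\hat{\beta} - \beta)}{p\, s_\varepsilon^2}$$
used to define the joint confidence region?
Consider the joint distribution for the parameter estimates:
$$\frac{1}{(2\pi)^{p/2} \det(\Sigma_{\hat{\beta}})^{1/2}} \exp\left\{ -\frac{1}{2} (\hat{\beta} - \beta)^T \Sigma_{\hat{\beta}}^{-1} (\hat{\beta} - \beta) \right\}$$
Substitute in estimate for parameter covariance matrix:
$$(\hat{\beta} - \beta)^T \left( (X^T X)^{-1} s_\varepsilon^2 \right)^{-1} (\hat{\beta} - \beta) = \frac{(\hat{\beta} - \beta)^T X^T X (\hat{\beta} - \beta)}{s_\varepsilon^2}$$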
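Confidence Intervals from Densities
[Figure: two panels. Individual Interval: density $f_{\hat{\beta}}(b)$ plotted against b, with area = 1-alpha between the lower and upper limits. Joint Region: joint density $f_{\hat{\beta}_0 \hat{\beta}_1}(b_0, b_1)$ over the (b0, b1) plane, with volume = 1-alpha over the Joint Confidence Region.]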
Relationship to Marginal Confidence Limits
[Figure: elliptical joint confidence region, centred at least squares parameter estimates, with the marginal confidence interval for intercept and the marginal confidence interval for slope marked on the axes. Horizontal axis: Intercept (320 to 600); vertical axis: Slope (-1.6 to -0.6).]
Relationship to Marginal Confidence Limits
[Figure: the 95% confidence region for parameters considered jointly (ellipse) overlaid on the 95% confidence region implied by considering parameters individually (rectangle formed by the marginal confidence intervals for intercept and slope). Horizontal axis: Intercept (320 to 600); vertical axis: Slope (-1.6 to -0.6).]
Relationship to Marginal Confidence Intervals
Marginal confidence intervals are contained in joint confidence region
» potential to miss portions of plausible parameter values at tails of ellipsoid
» using individual confidence intervals implies a rectangular region, which includes sets of parameter values that lie outside the joint confidence region
» both situations can lead to
• erroneous acceptance of terms in model
• erroneous rejection of terms in model