The Right Questions about Statistics: How confidence intervals work Maths Learning Centre
Maths Learning Centre, The University of Adelaide
A confidence interval is designed to give a RANGE of possible answers
for a “WHAT’S THE NUMBER?” question, using DATA from a sample.
You calculate which numbers could have been likely to produce your data.
What is the median number of chapters
in a novel?
Randomly choose 12 books
Book #  Chapters
1       21
2       16
3       14
4       18
5       14
6       16
7       19
8       15
9       17
10      13
11      19
12      20
Which possible medians might be likely to
produce this data?
For each particular median, we could do a
hypothesis test to answer the question (in this case it’s called
the sign test) …
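As a rough illustration of what one such test involves, here is a minimal Python sketch of a two-sided sign test on the 12 chapter counts. The function name `sign_test_p_value` is made up for this example, and the handling of ties (books with exactly the hypothesised number of chapters are dropped) is an assumption, since the slides do not say which convention was used.

```python
from math import comb

# Chapter counts for the 12 randomly chosen books (from the table above).
chapters = [21, 16, 14, 18, 14, 16, 19, 15, 17, 13, 19, 20]

def sign_test_p_value(data, median):
    """Two-sided sign test p-value for a hypothesised median (illustrative)."""
    above = sum(1 for x in data if x > median)
    below = sum(1 for x in data if x < median)
    n = above + below            # assumption: values equal to the median are dropped
    if n == 0:
        return 1.0
    k = min(above, below)
    # If the hypothesised median were true, each book would be equally likely
    # to fall above or below it, like a fair coin. Add up the chance of a
    # split at least as lopsided as the one observed, then double for two sides.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

print(round(sign_test_p_value(chapters, 10), 4))  # 0.0005
print(round(sign_test_p_value(chapters, 15), 2))  # 0.23
```

Under these assumptions the sketch gives p-values close to the ones quoted on the slides.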
[Number line: candidate medians from 10 to 23]
Is the median for all books 10?
SIGN TEST: P-value = 0.0005
Under 0.05, so NO, the median is not 10.
Is the median for all books 11?
SIGN TEST: P-value = 0.0005
Under 0.05, so NO, the median is not 11.
Is the median for all books 12?
SIGN TEST: P-value = 0.0005
Under 0.05, so NO, the median is not 12.
Is the median for all books 13?
SIGN TEST: P-value = 0.001
Under 0.05, so NO, the median is not 13.
Is the median for all books 14?
SIGN TEST: P-value = 0.02
Under 0.05, so NO, the median is not 14.
Is the median for all books 15?
SIGN TEST: P-value = 0.23
Over 0.05, so YES, the median could be 15.
Is the median for all books 16?
SIGN TEST: P-value = 0.75
Over 0.05, so YES, the median could be 16.
Is the median for all books 17?
SIGN TEST: P-value = 1.0
Over 0.05, so YES, the median could be 17.
Is the median for all books 18?
SIGN TEST: P-value = 0.56
Over 0.05, so YES, the median could be 18.
Is the median for all books 19?
SIGN TEST: P-value = 0.11
Over 0.05, so YES, the median could be 19.
Is the median for all books 20?
SIGN TEST: P-value = 0.01
Under 0.05, so NO, the median is not 20.
Is the median for all books 21?
SIGN TEST: P-value = 0.001
Under 0.05, so NO, the median is not 21.
Is the median for all books 22?
SIGN TEST: P-value = 0.0005
Under 0.05, so NO, the median is not 22.
Is the median for all books 23?
SIGN TEST: P-value = 0.0005
Under 0.05, so NO, the median is not 23.
This is the (95%) confidence interval: the medians from 15 to 19.
That is, all the medians that would give a p-value of over 0.05.
It shows all the medians that could be true based on hypothesis tests at the 5% level of significance.
You could say it’s the medians I am “happy to believe” based on my data.
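The whole scan over candidate medians can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper `sign_test_p_value` is a hypothetical name, it uses a two-sided sign test that drops books tied with the candidate median, and it only tries the whole-number medians from 10 to 23 shown on the slides.

```python
from math import comb

# Chapter counts for the 12 randomly chosen books.
chapters = [21, 16, 14, 18, 14, 16, 19, 15, 17, 13, 19, 20]

def sign_test_p_value(data, median):
    """Two-sided sign test p-value; ties with the median are dropped (assumption)."""
    above = sum(x > median for x in data)
    below = sum(x < median for x in data)
    n, k = above + below, min(above, below)
    if n == 0:
        return 1.0
    return min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n)

# Test every candidate median and keep the ones that are NOT rejected
# at the 5% level of significance.
kept = [m for m in range(10, 24) if sign_test_p_value(chapters, m) > 0.05]

print(kept)                  # [15, 16, 17, 18, 19]
print(min(kept), max(kept))  # 15 19
```

The medians that survive every test are exactly the ones the interval is made of.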
Another example...
What is the mean weight of a novel?
Randomly choose 12 books
Book    Weight (g)
1       242
2       366
3       424
4       312
5       307
6       238
7       317
8       265
9       317
10      314
11      217
12      379
Which possible means might be likely to
produce this data?
For each particular mean, we could do a
hypothesis test to answer the question (in this case it’s called
the t-test) …
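To make one such t-test concrete, here is a minimal Python sketch that uses only the standard library. The function names `t_pdf` and `t_test_p_value` are invented for this example, and the p-value is found by numerically integrating the t density with Simpson's rule, which is a stand-in chosen for this sketch rather than the method behind the slides; in practice a statistics package (for example `scipy.stats.ttest_1samp`) would normally do this step.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

# Weights in grams of the 12 randomly chosen books.
weights = [242, 366, 424, 312, 307, 238, 317, 265, 317, 314, 217, 379]

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_test_p_value(data, assumed_mean, steps=2000):
    """Return (t, two-sided p-value) for a one-sample t-test (illustrative)."""
    n = len(data)
    df = n - 1
    std_error = stdev(data) / sqrt(n)
    t = (mean(data) - assumed_mean) / std_error
    # Simpson's rule (steps must be even) for the area between 0 and |t|.
    upper = abs(t)
    h = upper / steps
    area = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    # Whatever lies outside -|t| .. |t| is the two-sided p-value.
    return t, max(0.0, 1 - 2 * area)

t_300, p_300 = t_test_p_value(weights, 300)
t_200, p_200 = t_test_p_value(weights, 200)
print(round(t_300, 2), round(p_300, 2))
print(p_200 < 0.05)
```

An assumed mean of 300 gives a large p-value, so it is kept, while an assumed mean of 200 gives a tiny p-value and is rejected.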
[Number line: candidate means from 200 to 400]
Is the mean for all books 200?
T-TEST: P-value = 0.00007
Under 0.05, so NO, the mean is not 200.
Is the mean for all books 201?
T-TEST: P-value = 0.00008
Under 0.05, so NO, the mean is not 201.
Is the mean for all books 202?
T-TEST: P-value = 0.00009
Under 0.05, so NO, the mean is not 202.
And so on...
This is the (95%) confidence interval.
That is, all the means that would give a p-value of over 0.05.
It shows all the means that could be true based on hypothesis tests at the 5% level of significance.
You could say it’s the means I am “happy to believe” based on my data.
The two ends of the interval are the means that would have given a p-value of exactly 0.05 in the hypothesis test.
BUT you don’t have to go through all this to find a
confidence interval!
You just have to figure out where the ends are.
So you find them by doing the hypothesis test
calculation backwards...
HYPOTHESIS TEST
Assume mean is 300
Use assumed mean and data to calculate a test statistic.
t = (mean – 300) / std. error = 0.46
Use test statistic to get p-value.
p = 0.65
Compare p-value to 0.05 to decide.
More than 0.05, so the mean could be 300.
CONFIDENCE INTERVAL
Assume p-value is 0.05
Use p-value to get “critical” test statistics.
t* = ± 2.20
Use critical values to get two means.
CI ends = mean ± t* × std. error = 269 & 347
95% confidence interval is between these.
95% CI is from 269 to 347
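The backwards calculation, CI ends = mean ± t* × std. error, can be checked with a short Python sketch. It takes the critical value t* = 2.20 straight from the slides instead of deriving it, and the variable names are chosen only for this illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Weights in grams of the 12 randomly chosen books.
weights = [242, 366, 424, 312, 307, 238, 317, 265, 317, 314, 217, 379]

t_star = 2.20  # critical value quoted on the slides (12 books, so 11 degrees of freedom)

sample_mean = mean(weights)                      # about 308 g
std_error = stdev(weights) / sqrt(len(weights))  # sample std. deviation / sqrt(n)

# Work backwards from the critical value to the two means that would
# sit right on the p = 0.05 boundary.
lower = sample_mean - t_star * std_error
upper = sample_mean + t_star * std_error

print(round(lower), round(upper))  # 269 347
```

Only the critical value and two summary numbers from the data are needed, which is why there is no need to run a separate test for every possible mean.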
Let’s go over that again…
What is the mean weight of a novel?
Randomly choose 12 books.
Choose a hypothesis test, and a significance level.
T-Test, 0.05
Find the critical values of the test statistic.
t* = ± 2.20
Use critical values and your data to calculate the two ends.
CI ends = mean ± t* × std. error = 269 & 347
State the answer.
The 95% CI is from 269 g to 347 g.
So this is how to find a confidence interval:
•Have a “what’s the number?” question.
•Collect data.
•Choose a matching hypothesis test.
•Work backwards to calculate two ends.
•The confidence interval is between these two values.
And this is what a confidence interval means:
•The values in the CI would be retained with a matching hypothesis test.
•The values in the CI are those you are “happy to believe” based on your data.
•The values in the CI have a high chance of producing data like yours.