Lecture 11 Notes - Asal Aslemand -...
Transcript of Lecture 11 Notes - Asal Aslemand -...
1
Lecture 11 Notes➢ Chapter 17. More About Tests
➢ Chapter 18. Inferences About Means
Important Ideas from Lecture 10
2
• All the hypothesis tests boils down to the same question: “Is an observed difference or pattern too large to be
attributed to chance?”
• We measure “how large” by putting our sample results in the context of a sampling distribution model (e.g.,
Normal model, 𝑡 distribution – which we will learn later in this lecture).
Steps in conducting Hypothesis Testing:
1. State the null and the alternative hypothesis.
2. Check the necessary assumptions.
3. Identify the test-statistic. Find the value of the test-statistic.
4. Find the p-value of the test-statistic.
5. State (if any) a conclusion.
Important Ideas from Lecture 10
3
P-value:
• It is a conditional probability.
• It is not the probability that Ho (null hypothesis: current belief) is true.
• It is: P(observed statistic value [or even more extreme] | Ho]. Given Ho (the null hypothesis), because Ho gives
the parameter values that we need to find required probability.
• P-value serves as a measure of the strength of the evidence against the null hypothesis (but it should not serve as
a hard and fast rule for decision).
• If p-value = 0.03 (for example) all we can say is that there is 3% chance of observing the statistic value we
actually observed (or one even more inconsistent with the null value).
More About P-values
4
Recall the example in lecture 10 in which we investigated whether there is evidence to suggest that the true
proportion 𝑝 of all Canadians who worked at a job or business at anytime (between July 2007 and June 2008),
regardless of the number of hours per week, was less than 50%.
• The p-value of the observed test-statistics was less than 0.0001. With this p-value all we can say that if the true
percentage of all Canadians who worked at a job or business at anytime (between July 2007 and June 2008),
regardless of the number of hours per week was 50%, the probability of observing percentage of Canadians -
who worked at a job or business at anytime (between July 2007 and June 2008), regardless of the number of
hours per week – no higher than 50% in a sample like this is less than 1 chance in 10000.
More About P-values
5
Example:
A New England Journal of Medicine paper reported that the seven-year risk of heart attack in diabetes patients
taking the drug Avandia was increased from the baseline of 20.2% to an estimated risk of 28.9%.
• This study estimated the p-value: P( Ƹ𝑝 ≥ 28.9% | 𝑝 = 20.20%) = 0.03
This p-value means that a heart attack rate of at least as high as the one they observe could be expected in 3% of
similar experiments, even if, there were no increased risk from taking Avandia.
• An earlier study had estimated the seven-year risk to be 26.9% and thus reported the p-value of:
P( Ƹ𝑝 ≥ 26.9% | 𝑝 = 20.20%) = 0.27
This p-value means that a heart attack rate of at least as high as the one they observe could be expected in 27% of
similar experiments, even if, there were no increased risk from taking Avandia. This is not remarkable enough to
reject Ho: 𝑝 = 20.20%. In other words this study was not convincing.
More About P-values
6
• Big p-values just mean that what we have observed is not surprising. It means that, the results are in line with our
assumption that the null hypothesis models the world, so we have no reason to reject it.
• A big p-value does not prove that the null hypothesis is true.
• When we see a big p-value, all we can say is: we cannot reject Ho (we fail to reject Ho) – we cannot conclude Ha
(We have no evidence to support Ha).
Example of Hypothesis Testing for a Proportion
7
In 1980s, it was generally believed that congenital abnormalities affect 5% of the nation’s children. Some people
believe that the increase in the number of chemicals in the environment in recent years has led in the incidence of
abnormalities. A recent study examined 384 children and found that 46 of them showed signs of abnormality. Is
this strong evidence that the risk has increased?
We will aim at stating and/or answering the following:
a. Write appropriate hypotheses.
b. Check the necessary assumptions.
c. Identify the test-statistic. Find the value of the test-statistic.
d. Find the p-value of the test-statistic and explain its meaning in the context of this problem.
e. Give (if any) a conclusion.
f. Do environmental chemicals cause congenital abnormalities?
Example of Hypothesis Testing for a Proportion
8
a. Write appropriate Hypotheses (State the Null and the Alternative Hypotheses)
Ho: 𝑝 = 0.05 verses Ha: 𝑝 > 0.05
b. Check the Necessary Assumptions:
• Independence Assumption:
There is no reason to think that one child having genetic abnormalities would affect the probability that other children have them.
• Randomization Condition:
This sample may not be random, but genetic abnormalities are plausibly independent. The sample is probably representative of all children, with regards to genetic abnormalities.
• 10% Condition:
The sample of 384 children is less than 10% of all children.
• Success/Failure Condition:
np = (384)(0.05) = 19.2 and n(1 − 𝑝) = (384)(0.95) = 364.8 are both greater than 10, so the sample is large enough.
Example of Hypothesis Testing
9
c. Identify the test-statistics. Find the value of the test-statistic.
We showed in part b that the conditions have been satisfied, so a Normal model can be used to model the sampling distribution of the sample proportion. That is,:
Ƹ𝑝 is approx. normal with mean 𝑝 = 0.05 and standard error of 𝜎 ො𝑝 =𝑝(1−𝑝)
𝑛=
0.05(1−0.05)
384= 0.0111
Under Ho: 𝑝 = 0.05, the test statistics has a Z standard normal distribution.
Thus, we perform a one-proportion z-test:
Ƹ𝑝 = 46
384= 0.1198 is the estimated proportion of children with genetic abnormalities
Z = ො𝑝 −𝑝
𝜎ෝ𝑝= 0.1198−0.05
0.05(1−0.05)
384
= 0.1198 −0.05
0.0111≅ 6.28
The value of Z is approximately 6.28, meaning that the observed proportion of children with genetic abnormalities is over 6 standard deviations above the hypothesized proportion (𝑝 = 0.05).
Example of Hypothesis Testing
10
d. Find the p-value of the test-statistic and explain its meaning in the context of this problem.
P-value = P(Z > 6.28) ≅ 0.000
Note: We find the area above the
Z of 6.28 since Ha: 𝑝 > 0.05
If 5% of children have genetic abnormalities, the
chance of observing 46 children with genetic
abnormalities in a random sample of 384 children is
almost 0.
Example of Hypothesis Testing
11
e. Give (if any) a conclusion.
With a P-value of this low, we reject the null hypothesis. There is a very strong evidence that more than 5% of children have genetic abnormalities.
f. Do environmental chemicals cause congenital abnormalities?
We do not know that environmental chemicals cause genetic abnormalities. We merely have evidence that suggests that a greater percentage of children are diagnosed with genetic abnormalities now, compared to the 1980s.
Using 95% CI for a Proportion
12
• The p-value in the previous example was extremely small (less than 0.0001). That is a strong evidence to suggest
that more than 5% of children have genetic abnormalities. However, it does not say that the percentage of
sampled children with genetic abnormalities was “a lot more than 5%”. That is, the p-value by itself says nothing
about how much greater the percentage might be. The confidence interval provides that information.
• To assess the difference in practical terms, we should also construct a confidence interval:
95% CI for P: ෝ𝑝 ± 𝑀𝐸 = ෝ𝑝 ± (𝑍∗𝑥 𝑆𝐸( ො𝑝))
= 0.1198 ± (1.96 x 0.0166)
= 0.1198 ± 0.0324 = (0.0874, 0.1522)
Recall that Ƹ𝑝 = 46
384= 0.1198
Z* for 95% interval: Z* = 1.96
𝑆𝐸 𝑃 =ො𝑝(1− ො𝑝)
𝑛=
0.1198 (1−0.1198)
384= 0.0166
Interpretation:
We are 95% Confident that the true percentage of
children with genetic abnormalities is between
8.74% and 15.22%.
13
Z = 39.377 ≅ 6.28 (take the positive sign because the difference between ො𝑝 = 0.1198 and 𝑝 = 0.05 is positive: +0.0698
P-value = 3.494 x 10−10 < 0.0001. p-value is less than 𝛼 = 0.05
We reject Ho and conclude Ha; We have a very strong evidence to conclude that more than 5% of all children have genetic
abnormalities.
95% CI for P: (9.1%, 15.6%) – We are 95% confident that the true percentage of all children that have genetic
abnormalities is between approximately 9.1% and 15.6%. Since both values of this CI are more than the hypothesized
value of P = 0.05 (5%), we can further infer that this true percentage is more than 5% (with 0.95 probability).
Same Example: CI and Hypothesis Testing for a Proportion (Two-sided Test) – in R
Same Example:
Ho: 𝑝 = 0.05
Ha: 𝑝 ≠ 0.05
Decisions Errors in Tests
14
• When Ho is true, a Type I error occurs if Ho is rejected.
The probability of making a type I error is denoted by 𝛼.
• When Ho is false, a Type II error occurs if Ho is not rejected.
The probability of making a type II error is denoted by 𝛽.
Example of Decisions Errors in Tests
15
In medical disease testing, the null hypothesis is usually the assumption that a person is healthy. The alternative is
that the person has the disease we are testing for.
Ho: Healthy verses Ha: Infected
• Type I error: Reject Ho when it is true.
A Type I error is a false positive: A healthy person is diagnosed with the disease.
That is, a person must go under further test.
• Type II error: Fail to reject Ho (“Accept Ho”) when it is false.
A Type II error is a false negative in which an infected person is diagnosed as disease-free.
That is, a sick person gets untreated.
For example: If a new treatment is being tested for a disease (e.g., epilepsy), a Type I error will lead to future
patients getting a useless treatment; a Type II error means a useful treatment will remain undiscovered.
Example of Decisions Errors in Tests
16
Jury Trial:
Ho: Innocent verses Ha: Guilty
• Type I error: Reject Ho when it is true.
A Type I error occurs if the jury convicts an innocent .
• Type II error: Fail to reject Ho (“Accept Ho”) when it is false.
A Type II error occurs if the jury fails to convict a guilty person.
What type of error could we making in our example?
17
In 1980s, it was generally believed that congenital abnormalities affect 5% of the nation’s children.
Some people believe that the increase in the number of chemicals in the environment in recent years
has led in the incidence of abnormalities. A recent study examined 384 children and found that 46 of
them showed signs of abnormality. Is this strong evidence that the risk has increased?
Ho: 𝑝 = 0.05 verses Ha: 𝑝 > 0.05
𝑍 ≅ 6.28, 𝑝-value < 0.0001 (which is less than 𝛼 = 0.05). We reject Ho and conclude Ha.
This means we could be making a Type I error. We decided that the true percentage is more than 5%
based on our data (as evidence against Ho), however, it could be that the hypothesized value of 5% is
true (e.g., 𝐻0: 𝑝 = 0.05 could be true).
What type of error could we making in our example?
18
Suppose it was claimed that the percentage of adult Canadians who worked at a job or business at
anytime (between July 2007 and June 2008), regardless of the number of hours per week, was 50%. Of
the 4,756 respondents, 1,581 indicated that they worked at a job or business at anytime (between July
2007 and June 2008), regardless of the number of hours per week . Is there evidence to suggest that the
true proportion 𝑝 is greater than 0.50?
Ho: 𝑝 = 0.50 Ha: 𝑝 > 0.50
Z = - 23.02
P-value = P(Z > -23.02) ≅ 1, P-value > 𝛼 = 0.05; We Fail to Reject Ho; We cannot conclude Ha.
This means we could be making a Type II error. We indicated that there is no evidence to conclude that
the true percentage of adult Canadians who worked at a job or business at anytime (between July 2007
and June 2008), regardless of the number of hours per week was more than 50% - this conclusion
implies that Ho: 𝑝 = 0.50 is plausible, but it could not be the case.
What about making a correct decision? Power of Test
19
Power of test refers to probability of correctly reject Ho when it is false: P(reject Ho | Ho is false)
Power = 1 – Beta = 1 – P(fail to reject Ho | Ho is false)
Note: The complement of reject Ho is, fail to reject Ho. For example: P(A|B) = 1 - P(AC|B)
• When we think about power, we imagine the null hypothesis is false.
• The value of power depends on how far the truth lies from the value we hypothesize.
• We call this distance between the null hypothesis value (for example) 𝑝0 and the truth 𝑝, the effect size.
• The effect size is unknown, of course, since it involves the true p.
• But, we can estimate the effect size as the difference between the null value and the observed estimate.
• The effect size is central to how we think about the power of hypothesis test.
• A larger effect is easier to see and results in larger power.
• Small effects are difficult to detect. They will result in more Type II errors and therefore lower power.
• The power of the test both depends on the size of the effect and the amount of variability in the sampling model. For
proportions, we use a Normal sampling model (for Ƹ𝑝) with standard deviation inversely proportional to the square root of the
sample size, n.
Example of Power and Beta
20
A newsletter reports that 90% of adults drink milk. The researchers are interested in investigating if
less than 90% of adults drink milk (at alpha = 0.05). They collect a random sample of 200 adults in
a certain region.
a. Calculate power of the test if the percentage of adults who drink milk is really 85%.
b. Calculate beta if the percentage of adults who drink milk is really 85%.
Example of Power and Beta
21
A newsletter reports that 90% of adults drink milk. The researchers are interested in investigating if less than 90% of
adults drink milk (at alpha = 0.05). They collect a random sample of 200 adults in a certain region.
a. Calculate power of the test if the percentage of adults who drink milk is really 85%.
Alpha = 0.05 = P(reject Ho | Ho is true)
P(Z < -1.645) = 0.05 (this is the rejection region)
Z critical value is -1.645
Power = P(reject Ho | Ho is false)
= P(ො𝑝−0.90
0.90(1−0.90)
200
< -1.645 | P = 0.85)
= P( Ƹ𝑝 < 0.8651 | P = 0.85)
= 𝑃(𝑍 <0.8651−0.85
0.85 1−0.85
200
≅ 0.60) ≅ 0.73
Example of Power and Beta
22
A newsletter reports that 90% of adults drink milk. The researchers are interested in investigating if less than 90% of
adults drink milk (at alpha = 0.05). They collect a random sample of 200 adults in a certain region.
b. Calculate beta if the percentage of adults who drink milk is really 85%.
Beta = 1 – power = 1 – 0.73 = 0.27
Example of Power and Sample Size
23
A newsletter reports that 90% of adults drink milk. The researchers are interested in investigating if less than 90% of
adults drink milk (at alpha = 0.05). They collect a random sample of 100 adults in a certain region.
a. Calculate power of the test if the percentage of adults who drink milk is really 85%.
Alpha = 0.05 = P(reject Ho | Ho is true)
P(Z < -1.645) = 0.05 (this is the rejection region)
Z critical value is -1.645
Power = P(reject Ho | Ho is false)
= P(ො𝑝−0.90
0.90(1−0.90)
100
< -1.645 | P = 0.85)
= P( Ƹ𝑝 < 0.85065 | P = 0.85)
= 𝑃(𝑍 <0.85065−0.85
0.85 1−0.85
100
≅ 0.02) ≅ 0.51
Power increases as Sample Size, n increases
24
A newsletter reports that 90% of adults drink milk. The researchers are interested in investigating if less than 90%
of adults drink milk (at alpha = 0.05).
They collect a random sample of 50 adults in a
certain region. Calculate power of the test if the
percentage of adults who drink milk is really 85%.
They collect a random sample of 200 adults in a
certain region. Calculate power of the test if the
percentage of adults who drink milk is really 85%.
If we keep 𝛼 at the same size, larger sample sizes increase the power of test because sampling variability (sampling distributing)
are much narrower. The critical value, 𝑝∗ gets closer to 𝑝0 and farther from p.
The Sampling Model for Sample Mean
• When a random sample is drawn from any population with mean 𝜇 and standard deviation 𝜎, its sample mean ഥ𝑦
has a sampling distribution with the same mean as the population mean 𝜇, and standard error of 𝜎ത𝑦 =𝜎
𝑛.
• The Central Limit Theorem tells us that no matter what population the random sample comes from, the shape of
the sampling distribution is approximately normal as long as the sample size n is large. That is, the larger the
sample used, the more closely the Normal model approximates the sampling distribution of the sample mean.
• For large n (n > 60; note some use n > 30), we express the sampling distribution of ഥ𝑦 : ഥ𝑦 ~𝑁(𝜇, 𝜎ത𝑦 =𝜎
𝑛)
However, in practice the population parameters are unknown. For example, 𝜎 is unknown. In that case, we estimate
𝜎 by the sample standard deviation, 𝑆. Thus, we replace 𝜎 with S: ഥ𝑦 ~𝑁(𝜇, 𝜎ത𝑦 =𝜎
𝑛≅
𝑠
𝑛)
25
Testing for the Mean of a Quantitative Population for the Big Sample
Suppose we wish to test a hypothesis about a mean of a quantitative population denoted by 𝜇:
𝐻0: 𝜇 = 𝜇0 𝐻𝑎: 𝜇 ≠ 𝜇0
For large n (n > 60), we know that by Central Limit Theorem sampling distribution of ഥ𝑦 : ഥ𝑦 ~𝑁(𝜇, 𝜎ത𝑦 =𝜎
𝑛)
And if 𝜎 is unknown we estimate 𝜎 by the sample standard deviation, 𝑆.
Thus, we replace 𝜎 with S: ഥ𝑦 ~𝑁(𝜇, 𝜎ത𝑦 =𝜎
𝑛≅
𝑠
𝑛)
Our test statistic is: Z = ഥ𝑦 −𝜇𝜎
𝑛
≅ഥ𝑦 −𝜇𝑠
𝑛
26
Example: Testing for the Mean of a Quantitative Population with large Sample
27
Researchers claimed that the true mean number of hours of work for all Canadians with poor health is
different from 40 hours (usual hours of work per week). In order to test their hypothesis, these researchers
relied on the obtained statistics from the Canadian Community Health Survey (CCHS, 2011) for a random
sample of 65 Canadians with poor health. For this random sample, the mean hours of work was 33.91
hours and with standard deviation of 13.85. The histogram of hours of work for this sample was
approximately normal.
• How far away from 40 hours the sample mean need to be in order for researchers be able to support
their claim?
• In other words, how many estimated standard error do the sample mean need to be away from the value
of 40 hours so that researchers could support their claim?
• We need to find the value of the test-statistics, which summarizes how far (e.g., how many est. standard
error) the point estimate (e.g., ഥ𝑦 ) is way from the hypothesized 𝐻0value.
• In this example, we are interested to see how many est. standard error, ഥ𝑦 = 33.91 is away from 40.
• Recall CLT: For a large random sample, ഥ𝑦 ~𝑁(𝜇, 𝜎ത𝑦 =𝜎
𝑛). However, 𝜎 is unknown in this example.
We estimate 𝜎 by the sample standard deviation, S. Thus, the estimated standard error is 𝑠
𝑛.
Example: Testing for the Mean of a Quantitative Population with large Sample
28
Researchers claimed that the true mean number of hours of work for all Canadians with poor health is
different from 40 hours (usual hours of work per week). In order to test their hypothesis, these researchers
relied on the obtained statistics from the Canadian Community Health Survey (CCHS, 2011) for a random
sample of 65 Canadians with poor health. For this random sample, the mean hours of work was 33.91
hours and with standard deviation of 13.85 hours. The histogram of hours of work for this sample was
approximately normal.
𝐻0: 𝜇 = 40 𝐻𝑎: 𝜇 ≠ 40
Z = ഥ𝑦 −𝜇𝜎
𝑛
≅ഥ𝑦 −𝜇𝑠
𝑛
= 33.91−40
13.85
65
= -3.55
𝑝-value = 2x Area (below Z of -3.55) = 2(0.0002) = 0.004
𝑝-value of 0.0004 < (𝛼 = 0.05). We reject Ho and conclude Ha.
We have strong evidence to conclude that the mean hours of work for Canadians with poor health is
different from 40.
Confidence Interval for the Mean of a Quantitative Population when the Sample is Big
Recall the generic form of stating (finding) a CI: Point Estimate ± Margin of Error
Therefore, the confidence for the mean 𝜇 has the form: ො𝜇 ± Margin of Error
= ഥ𝑦 ± ME
= ഥ𝑦 ± (𝑍 ∗ 𝑆𝐸(ഥ𝑦 ))
= ഥ𝑦 ± (Z( 𝜎
𝑛))
≅ ഥ𝑦 ± (Z( 𝑠
𝑛))
29
Example: CI for the Mean of a Quantitative Population with large Sample
30
For a random sample of 65 Canadians with poor health, the hours of work had mean 33.91 hours and
standard deviation of 13.85 hours. Find 95% CI for 𝜇: ഥ𝑦 ± ME
= ഥ𝑦 ± (𝑍 ∗ 𝑆𝐸(ഥ𝑦 ))
= ഥ𝑦 ± (Z( 𝜎
𝑛))
≅ ഥ𝑦 ± (Z( 𝑠
𝑛))
= 33.91 ± (1.96( 13.85
65))
= 33.91 ± (1.96 x 1.72)
= 33.91 ± 3.37 = (30.54, 37.28)
Interpretation: We are 95% confident that the true mean hours of work for Canadians with poor health is
between 30.54 and 37.28 hours .
31
But what if sample size
was not large enough?
What test-statistic do
we use instead?
32
The t distribution
The density, t distribution, was calculated by William Gosset.Recall: When population standard deviation 𝝈 is unknown, its value is estimated by the sample standard deviation S; The value for S is different for different random samples with different sizes (n).
• The t distribution is bell-shaped and is symmetric (like the Normal model) about the mean 0.• The standard deviation is a bit larger than 1 and its value depends on degrees of freedom, df = n-1
(one less than the sample size). • The t distribution has a slightly different spread for different values of df.• The t distribution has a wider shape than Z standard normal distribution when sample size is small. • When df is about 60 or more, the two distributions (Z and t) are nearly identical. • We can think of t distribution with df = ∞ (infinity) as standard normal distribution, Z, because
as n (sample size) increases, we have 𝑠
𝑛≅
𝜎
𝑛
William Gosset
Assumptions and Conditions for the t distribution
• Independence Assumption:
The data values should be independent (form each other).
• Randomization Condition:
This condition is satisfied if the data arise from a random sample or suitably randomized experiment. Randomly sampled data, especially data from simple random sample are ideal – almost surely they are independent, with well defined target population.
• Normal Population Assumption:
Student’s t distribution will not work for data that are badly skewed. So, Examine graphical displays (e.g., Histogram or Boxplot); Check mean vs median. But note that t distribution performs adequately well even when the assumption of normality is violated (e.g., slight skewedness in the data). Thus, we say that t distribution is robust when the assumption of normality is violated.
Even for small n (sample size) check for Nearly Normal Condition:
• The data come from a distribution that is unimodal and reasonably symmetric. Check this assumption by making a histogram, boxplot, or normal probability plot.
• For very small sample, n <15, the data should follow a Normal model fairly closely. If you find clear outlier or skewness do not use t method.
• For sample size n between 15 and 40, the t method will work reasonably well for mildly to moderately skewed unimodal data, but would perform badly in the presence of strong skewness or outliers. Make a histogram, boxplot, or normal probability plot of data.
33
Testing for the Mean of a Quantitative Population for Small Sample with Unknown 𝜎
Suppose we wish to test a hypothesis about a mean of a quantitative population denoted by 𝜇:
𝐻0: 𝜇 = 𝜇0 𝐻𝑎: 𝜇 ≠ 𝜇0
When certain assumptions and conditions are met, the standardized sample mean has:
𝑡 =ഥ𝑦 − 𝜇𝑠𝑛
Follows a student’s t-model with n-1 degrees of freedom. Note, we estimate the standard deviation of ഥ𝑦 with:
SE ഥ𝑦 =𝑠
𝑛
34
35
Example of Hypothesis Testing for a Population Mean 𝝁
Studies on attitudes toward statistics showed that, on a 7-point Likert scale, where “1” is an indicative of
a strong negative attitudes, to “7” which is an indicative of strong positive attitudes about statistics, male
students’ feelings concerning statistics (known as the “Affect” component) is 4, on average. A researcher
takes a random sample of 18 male students who were enrolled in an introductory statistics course and
estimates their mean affect toward statistics. Her sample had mean 4.08 and standard deviation of 0.90.
Is there evidence to suggest that the true mean Affect for the male group is more than 4?
36
Checking the Assumptions and Conditions in our Example
The histogram looks bell-shaped and symmetric.
The boxplot shows no outlier and is approx. symmetric.
The normal probability plot is close to the straight line.
All these three plot suggest that there is no violation of
assumption of normality.
37
Checking the Assumptions and Conditions in our Example
Independence Assumption: Males students are independently (of each other) selected.
Randomization Condition: Males students are randomly selected for this study.
Normal Population Assumption: Histogram, boxplot, and normal probability plots shows that this
distribution came from a distribution that is unimodal and reasonably symmetric.
38
Example of Hypothesis Testing for a Population Mean 𝝁
Studies on attitudes toward statistics showed that, on a 7-point Likert scale, where “1” is an indicative of
a strong negative attitudes, to “7” which is an indicative of strong positive attitudes about statistics, male
students’ feelings concerning statistics (known as the “Affect” component) is 4, on average. A researcher
takes a random sample of 18 male students who were enrolled in an introductory statistics course and
estimates their mean affect toward statistics. Her sample had mean 4.08 and standard deviation of 0.90.
Is there evidence to suggest that the true mean Affect for the male group is more than 4?
39
Example of Hypothesis Testing for a Population Mean 𝝁
It is claimed that the males’ mean Affect, that is feelings concerning statistics, is 4 (on a 7-point Likert scale).
A researcher takes a random sample of 18 male students who were enrolled in an introductory statistics course and
estimates their affect toward statistics. He sample had mean 4.08 and standard deviation of 0.90. Is there evidence
to suggest that the true mean Affect for the male group is more than 4?
𝐻0: 𝜇 = 4 𝐻𝑎: 𝜇 > 4
Under Ho, our test statistic is: 𝑡 =ത𝑦−𝜇𝑠
𝑛
(t distribution with df = n -1)
𝑡 =4.08−40.90
18
= 0.08
0.211= 0.39 with df = 18 – 1 = 17
𝑡 = 0.39 has p-value greater than 0.10
(see explanation: next slide)
So, p-value is greater than 𝛼 = 0.05.
We Fail to Reject Ho. We cannot conclude Ha.
We have no evidence to conclude that the mean Affect for the male group is
more than 4 (this implies that 𝐻0: 𝜇 = 4 is plausible – we could be making a Type II error).
Finding P-Value from t-table
Is there evidence to suggest that the true mean Affect for the male group is more than 4?
𝐻0: 𝜇 = 4 𝐻𝑎: 𝜇 > 4
𝑡 = 0.39 with df = 18 – 1 = 17
Find P-value (Picture version of this explanation is on the next slide):
Go along the line of df = 17 and find a t-score close to 0.39.
We see that t-values are increasing in the row of df = 17.
So it in this case, we take, the first t-value of 1.330 as our reference.
Our test-statistics of t = 0.39 is smaller than t = 1.330
the area above t = 1.330 is 0.100; so area above t = 0.39 would be much
greater than 0.100. Ultimately, our p-value is a big value!
𝑡 = 0.39 has p-value > 𝛼 = 0.05. We Fail to Reject Ho.
We cannot conclude Ha. We have no evidence to conclude that the
mean Affect for the male students taking an introductory statistics
course is more than 4
(this implies that 𝐻0: 𝜇 = 4 is plausible but we could be making a
Type II error).
Finding P-Value using t-distribution
Online Applet: https://istats.shinyapps.io/tdist/
Is there evidence to suggest that the true mean Affect for the male group is more than 4?𝑯𝟎: 𝝁 = 𝟒 𝑯𝒂: 𝝁 > 𝟒
One-sample t-Interval for the Mean: Example: 95% CI for a Mean (when n is small)
When the assumptions and conditions are met, the Confidence Interval for a mean is:
= ഥ𝑦 ± (𝑡𝑛−1∗ ∗ 𝑆𝐸(ഥ𝑦 ))
where the standard error of the mean 𝑆𝐸 ഥ𝑦 =𝑠
𝑛
The critical 𝑡𝑛−1∗ depends on the confidence level that you specify on the number of degrees of freedom, n-1, which
we get from the sample size.
42
43
95% Confidence Interval for Males’ Mean Affect:
95% CI for 𝜇: ഥ𝑦 ± ME
ഥ𝑦 ± (𝑡𝑛−1∗ ∗ 𝑆𝐸(ഥ𝑦 ))
Confidence Level is 0.95
𝛼 = 0.05 (error probability)
𝛼/2 = 0.025
n = 18; df = n - 1 = 18 – 1 = 17
t-score: 𝑡0.025 with df =17 is 2.110
ത𝑦 ± 𝑡0.025(𝑠
𝑛)
= 4.08 ± [(2.110) 0.897
18]
= 4.08 ± 0.45
= (3.63, 4.53)
Interpretation: We are 95% confident that the true mean affect for male is between 3.63 and 4.53.
44
CI and Hypothesis Testing for a Population Mean 𝝁 – in R
One-sided test:
𝑯𝟎: 𝝁 = 𝟒 𝑯𝒂: 𝝁 > 𝟒
Two-sided test:
𝑯𝟎: 𝝁 = 𝟒 𝑯𝒂: 𝝁 ≠ 𝟒
45
Example of Hypothesis Testing for a Population Mean 𝝁
Laughter is often called “the best medicine”; studies have shown that laughter can reduce muscle tension and
increase oxygenation of the blood. Researchers investigated the physiological changes that accompany laughter. 25
subjects (18-34 years old) watched film clips designed to evoke laughter. During the laughing period, researchers
measured the heart rate (beats per minutes) of each subject and obtained: ത𝑦 = 73.5, and 𝑆 = 6. It is well known
that mean restoring heart rate is 71 beats per minute. Is there evidence that the true mean heart rate during laughter
exceeds 71 beats per minute? Use alpha = 0.05
Assumptions
Random Sample of 18-34 years old is taken from the population.
Heart rate during laughter has a normal distribution.
46
Example of Hypothesis Testing for a Population Mean 𝝁
Ho: 𝜇 = 71 Ha: 𝜇 > 71
𝑡𝑜𝑏𝑠𝑒𝑟𝑣𝑒𝑑 =ത𝑦−𝜇𝑆
𝑛
=73.5−71
6
25
= 2.083 ~𝑡(𝑑𝑓 = 24)
P-value: 𝑃 𝑡24 > 2.083 = ?
Look under the t-table (we will do this in class together).
Along the df=24, search for a value close to 2.083.
We find: 2.064 < 2.083 < 2.492
Look above the table, the second row, since we have a one-sided test, note the associated area below each of the
values: 0.010 < p-value < 0.025 ;
P-value < 0.05.P-value is small. We reject Ho.
We have evidence to indicate that the true mean heart rate during laughter exceeds 71 beats per minute.
47
Example of Hypothesis Testing for a Population Mean 𝜇 – Using CI
Let’s find the 95% CI for the true mean heart rate during laughter.
Recall n = 25 had ത𝑦 = 73.5 and 𝑆 = 6Thus, df = 25-1=24
Alpha = 0.05 (since confidence level is 0.95)
Confidence interval is two-sided so we divide alpha be 2:
Alpha/2 = 0.05/2 = 0.025
Look up value, in t table for 𝑡(0.025;𝑑𝑓=24) = 2.064
95% CI for the true mean heart rate during laughter is:
ത𝑦 ± 𝑡0.025;24𝑆
𝑛
= 73.5 ± 2.0646
25
= 73.5 ± 2.4768= (71.02, 75.98)
Conduct hypothesis testing: Ho: 𝜇 = 71 Ha: 𝜇 > 71
We reject Ho since 71 is not in this interval.
We indicate that the true mean heart rate during laughter exceeds 71 beats per minute.
Interpretation: We are 95% confident that
the true mean heart beat during laughter is
between 71.02 and 75.98 beats per minute.
48
Determining Sample Size
How large of a sample size is needed if researchers need to estimate the true mean heart beats during laughter to be
within 4% with 95% confidence level?
• Suppose we know the actual value of 𝜎 = 5.
To solve for n is the Margin of Error part of Confidence Interval:
n = (𝑍∗2)
𝑆2
𝑀𝐸2= (1.96)2∗ (5)2
0.042= 60025
• Suppose we do not know the actual value of 𝜎; we estimate 𝜎 with previous information:
Recall n = 25 had ത𝑦 = 73.5 and 𝑆 = 6Thus, df = 25-1=24
We use t0.025;24=2.064
n = (𝑡0.025;24)2 𝑆2
𝑀𝐸2= (2.064)2∗ (6)2
0.042= 95852.16
Round up for precision: n = 95853