Post on 04-Jun-2018
8/13/2019 Ch11 [Non-Parametric Tests]
Chapter 11
Elementary Statistics
Larson Farber
Nonparametric Tests
The Sign Test
Section 11.1
Nonparametric Tests
A nonparametric test is a hypothesis test that does not require any specific conditions about the shape of the populations or the value of any population parameters. Nonparametric tests are often called “distribution-free” tests.
The Sign Test is a nonparametric test that can be used to test a population median against a hypothesized value, k.
Hypotheses
Left-tailed test: H0: median ≥ k and Ha: median < k
or
Right-tailed test: H0: median ≤ k and Ha: median > k
or
Two-tailed test: H0: median = k and Ha: median ≠ k
Sign Test
To use the sign test, first compare each entry in the sample to the hypothesized median, k.
• If the entry is below the median, assign it a – sign.
• If the entry is above the median, assign it a + sign.
• If the entry is equal to the median, assign it a 0.
Compare the number of + and – signs. (Ignore 0’s.) If the number of + signs and the number of – signs are approximately equal, the null hypothesis is not likely to be rejected. If they are not approximately equal, however, it is likely that the null hypothesis will be rejected.
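Sign Test
Test Statistic: When n ≤ 25, the test statistic is the smaller number of + or – signs.
When n > 25, the test statistic is:
$$z = \frac{(x + 0.5) - 0.5n}{\sqrt{n}/2}$$
where x is the smaller number of + or – signs and n is the total number of + and – signs.
For n > 25, you are testing the binomial probability that p = 0.50.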
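The large-sample statistic can be sketched in a few lines of Python. This is only an illustration: the function name `sign_test_z` and the sign counts passed to it are assumptions made up for the example, not values from the slides.

```python
import math

def sign_test_z(plus: int, minus: int) -> float:
    """Normal-approximation z statistic for the sign test (intended for n > 25)."""
    n = plus + minus              # zeros are ignored, so n counts only + and - signs
    if n <= 25:
        raise ValueError("for n <= 25 use the smaller sign count directly")
    x = min(plus, minus)          # smaller number of + or - signs
    return ((x + 0.5) - 0.5 * n) / (math.sqrt(n) / 2)

# Hypothetical counts, chosen only to show the arithmetic.
z = sign_test_z(plus=10, minus=26)
print(z)  # -2.5
```

With 10 plus signs and 26 minus signs, n = 36 and x = 10, so z = (10.5 – 18) / 3 = –2.5.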
Application
A meteorologist claims that the daily median temperature for the month of January in San Diego is 57º Fahrenheit. The temperatures (in degrees Fahrenheit) for 18 randomly selected January days are listed below. At α = 0.01, can you support the meteorologist’s claim?
58 62 55 55 53 52 52 59 55 55 60 56 57 61 58 63 63 55
1. Write the null and alternative hypothesis.
H0: median = 57º and Ha: median ≠ 57º
2. State the level of significance.
α = 0.01
3. Determine the sampling distribution.
Binomial with p = 0.5
Since Ha contains the ≠ symbol, this is a two-tailed test.
Compare each temperature with the hypothesized median, 57:
Temp.  Sign    Temp.  Sign
58     +       55     –
62     +       60     +
55     –       56     –
55     –       57     0
53     –       61     +
52     –       58     +
52     –       63     +
59     +       63     +
55     –       55     –
There are 8 + signs and 9 – signs. So, n = 8 + 9 = 17.
4. Find the critical value.
With n = 17, use Table 8. The critical value is 2.
5. Find the rejection region.
Reject H0 if the test statistic is less than or equal to 2.
6. Find the test statistic.
The test statistic is the smaller number of + or – signs, so the test statistic is 8.
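The sign counts and the test statistic for the temperature data can be checked with a short Python sketch. The variable names are illustrative assumptions, and the critical value 2 is simply the table value quoted above rather than something the code derives.

```python
# January temperatures from the application, compared with the hypothesized median k.
temps = [58, 62, 55, 55, 53, 52, 52, 59, 55,
         55, 60, 56, 57, 61, 58, 63, 63, 55]
k = 57
critical_value = 2   # table value for n = 17, taken from the text

plus = sum(1 for t in temps if t > k)    # entries above the median
minus = sum(1 for t in temps if t < k)   # entries below the median
n = plus + minus                         # entries equal to k are ignored
statistic = min(plus, minus)             # n <= 25, so use the smaller count
reject = statistic <= critical_value

print(plus, minus, n, statistic, reject)  # 8 9 17 8 False
```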
7. Make your decision.
The test statistic, 8, does not fall in the critical region. Fail to reject the null hypothesis.
8. Interpret your decision.
There is not enough evidence to reject the meteorologist’s claim that the median daily temperature for January in San Diego is 57º.
The sign test can also be used with paired data (such as before and after). Find the difference between corresponding values and record the sign. Use the same procedure.
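For paired data the only new step is taking the differences first. The sketch below uses made-up before/after values for six subjects. They are hypothetical and serve only to show where the signs come from.

```python
# Hypothetical before/after measurements for six subjects (illustration only).
before = [12, 15, 9, 14, 11, 10]
after = [10, 15, 8, 15, 9, 7]

diffs = [b - a for b, a in zip(before, after)]
plus = sum(1 for d in diffs if d > 0)
minus = sum(1 for d in diffs if d < 0)   # zero differences are ignored
n = plus + minus
statistic = min(plus, minus)             # smaller number of + or - signs

print(diffs, plus, minus, n, statistic)  # [2, 0, 1, -1, 2, 3] 4 1 5 1
```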
The Wilcoxon Test
Section 11.2
Wilcoxon Signed-Rank Test
The Wilcoxon signed-rank test is a nonparametric test that can be used to determine whether two dependent samples were selected from populations with the same distribution.
To find the test statistic, w_s:
• Find the difference for each pair: Sample 1 value – Sample 2 value
• Find the absolute value of the difference.
• Rank order these absolute differences from smallest to largest. Tied values share the average of their ranks.
• Affix a + or – sign to each of the rankings.
• Find the sum of the positive ranks.
• Find the sum of the negative ranks.
• Select the smaller of the absolute values of the sums.
Application
The table shows the daily headache hours suffered by 12 patients before and after receiving a new drug for seven weeks. At α = 0.01, is there enough evidence to conclude that the new drug helped to reduce daily headache hours?
1. Write the null and alternative hypothesis.
H0: The headache hours after using the new drug are at least as long as before using the drug.
Ha: The new drug reduces headache hours. (Claim)
2. State the level of significance.
α = 0.01
Patient  Before  After  Diff.  Abs  Rank  Signed Rank
1        2.1     2.2    –0.1   0.1  1.5   –1.5
2        3.9     2.8    1.1    1.1  5.0   5.0
3        3.8     2.5    1.3    1.3  6.0   6.0
4        2.5     2.6    –0.1   0.1  1.5   –1.5
5        2.4     1.9    0.5    0.5  3.0   3.0
6        3.6     1.8    1.8    1.8  8.0   8.0
7        3.4     2.0    1.4    1.4  7.0   7.0
8        2.4     1.6    0.8    0.8  4.0   4.0
The sum of the positive ranks is 5 + 6 + 3 + 8 + 7 + 4 = 33.
The sum of the negative ranks is –1.5 + (–1.5) = –3.
The test statistic is the smaller of the absolute values of these sums, w_s = 3.
There are 8 + and – signs, so n = 8. The critical value is 2. Because w_s = 3 is greater than the critical value, fail to reject the null hypothesis.
There is not enough evidence to conclude that the new drug reduces headache hours.
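The rank sums in this application can be reproduced with a small Python sketch using the eight before/after pairs from the table. The helper `average_ranks` and the variable names are assumptions for illustration. The code stores the negative rank sum as its absolute value, and the critical value still has to be read from the table.

```python
before = [2.1, 3.9, 3.8, 2.5, 2.4, 3.6, 3.4, 2.4]
after = [2.2, 2.8, 2.5, 2.6, 1.9, 1.8, 2.0, 1.6]

def average_ranks(values):
    """Rank values from smallest to largest; tied values share the average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

# Round to one decimal so floating-point noise cannot split tied differences.
diffs = [round(b - a, 1) for b, a in zip(before, after)]
diffs = [d for d in diffs if d != 0]            # zero differences carry no sign
ranks = average_ranks([abs(d) for d in diffs])

positive_sum = sum(r for d, r in zip(diffs, ranks) if d > 0)
negative_sum = sum(r for d, r in zip(diffs, ranks) if d < 0)  # absolute value of the negative rank sum
w_s = min(positive_sum, negative_sum)
n = len(diffs)

print(positive_sum, negative_sum, w_s, n)  # 33.0 3.0 3.0 8
```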
Wilcoxon Rank-Sum Test
The Wilcoxon rank-sum test is a nonparametric test that can be used to determine whether two independent samples were selected from populations having the same distribution.
Both samples must have at least 10 entries. Then n_1 represents the size of the smaller sample and n_2 the size of the larger sample.
When the samples are the same size, it does not matter which is n_1.
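Wilcoxon Rank-Sum Test
Test statistic:
Combine the data from both samples and rank it. R = the sum of the ranks for the smaller sample. Find the z-score for the value of R:
$$z = \frac{R - \mu_R}{\sigma_R}$$
where
$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} \quad \text{and} \quad \sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$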
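A rough Python sketch of the rank-sum z-score follows. The slides give no data for this test, so the two samples are hypothetical, and the names `rank_sum_z` and `average_ranks` are assumptions introduced for the example.

```python
import math

def average_ranks(values):
    """Rank values from smallest to largest; tied values share the average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def rank_sum_z(sample_a, sample_b):
    """z-score of R, the rank sum of the smaller sample (the first one if sizes match)."""
    small, large = sorted([sample_a, sample_b], key=len)
    n1, n2 = len(small), len(large)
    ranks = average_ranks(list(small) + list(large))  # rank the combined data
    R = sum(ranks[:n1])                               # ranks belonging to the smaller sample
    mu_R = n1 * (n1 + n2 + 1) / 2
    sigma_R = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (R - mu_R) / sigma_R

# Hypothetical samples of 10 entries each, completely separated on purpose.
group_a = list(range(1, 11))
group_b = list(range(11, 21))
z = rank_sum_z(group_a, group_b)
print(round(z, 2))  # -3.78
```

Here R = 55, μ_R = 105 and σ_R = √175, which gives a z-score far in the left tail, as expected when every value in one sample lies below every value in the other.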
The Kruskal-Wallis Test
Section 11.3
The Kruskal-Wallis Test
The Kruskal-Wallis test is a nonparametric test that can be used to determine whether three or more independent samples were selected from populations having the same distribution.
H0: There is no difference in the population distributions.
Ha: There is a difference in the population distributions.
Combine the data and rank the values. Then separate the data according to sample and find the sum of the ranks for each sample.
R_i = the sum of the ranks for sample i.
The Kruskal-Wallis Test
Given three or more independent samples, the test statistic H for the Kruskal-Wallis test is:
$$H = \frac{12}{N(N + 1)} \left( \frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \cdots + \frac{R_k^2}{n_k} \right) - 3(N + 1)$$
where k represents the number of samples, n_i is the size of the i-th sample, N is the sum of the sample sizes, and R_i is the sum of the ranks of the i-th sample.
The sampling distribution is a chi-square distribution with k – 1 degrees of freedom (where k = the number of samples).
Reject the null hypothesis when H is greater than the critical value. (Always use a right-tail test.)
Application
You want to compare the hourly pay rates of accountants who work in Michigan, New York and Virginia. To do so, you randomly select 10 accountants in each state and record their hourly pay rate as shown below. At the .01 level, can you conclude that the distributions of accountants’ hourly pay rates in these three states are different?
MI(1)  NY(2)  VA(3)
14.24  21.18  17.02
14.06  20.94  20.63
14.85  16.26  17.47
17.47  21.03  15.54
14.83  19.95  15.38
19.01  17.54  14.90
13.08  14.89  20.48
15.94  18.88  18.50
13.48  20.06  12.80
16.94  21.81  15.57
1. Write the null and alternative hypothesis.
H0: There is no difference in the hourly pay rate in the 3 states.
Ha: There is a difference in the hourly pay in the 3 states.
2. State the level of significance.
α = 0.01
3. Determine the sampling distribution.
The sampling distribution is chi-square with d.f. = 3 – 1 = 2.
4. Find the critical value.
From Table 6, the critical value is 9.210.
5. Find the rejection region.
Reject H0 if the test statistic H is greater than 9.210 (right tail of the χ² distribution).
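The table value 9.210 can be cross-checked without a table. This relies on a shortcut that holds only for 2 degrees of freedom: the chi-square right-tail area beyond x is exp(–x/2). The sketch below is a quick check under that assumption, not a general method for other degrees of freedom.

```python
import math

alpha = 0.01
# For 2 degrees of freedom the right-tail area beyond x is exp(-x / 2),
# so the critical value solves exp(-x / 2) = alpha.
critical_value = -2 * math.log(alpha)
print(round(critical_value, 3))  # 9.21
```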
Test Statistic
Data   State  Rank
12.80  VA     1
13.08  MI     2
13.48  MI     3
14.06  MI     4
14.24  MI     5
14.83  MI     6
14.85  MI     7
14.89  NY     8
14.90  VA     9
15.38  VA     10
15.54  VA     11
15.57  VA     12
15.94  MI     13
16.26  NY     14
16.94  MI     15
17.02  VA     16
17.47  MI     17.5
17.47  VA     17.5
17.54  NY     19
18.50  VA     20
18.88  NY     21
19.01  MI     22
19.95  NY     23
20.06  NY     24
20.48  VA     25
20.63  VA     26
20.94  NY     27
21.03  NY     28
21.18  NY     29
21.81  NY     30
Michigan salaries are in ranks: 2, 3, 4, 5, 6, 7, 13, 15, 17.5, 22. The sum is 94.5.
New York salaries are in ranks: 8, 14, 19, 21, 23, 24, 27, 28, 29, 30. The sum is 223.
Virginia salaries are in ranks: 1, 9, 10, 11, 12, 16, 17.5, 20, 25, 26. The sum is 147.5.
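6. Find the test statistic.
R_1 = 94.5, R_2 = 223, R_3 = 147.5
n_1 = 10, n_2 = 10 and n_3 = 10, so N = 30
$$H = \frac{12}{30(30 + 1)} \left( \frac{94.5^2}{10} + \frac{223^2}{10} + \frac{147.5^2}{10} \right) - 3(30 + 1) \approx 10.76$$
7. Make your decision.
The test statistic 10.76 falls in the rejection region (10.76 > 9.210), so reject the null hypothesis.
8. Interpret your decision.
There is enough evidence to conclude that the distributions of hourly pay rates differ among the 3 states.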
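The rank sums and the value of H can be reproduced from the raw pay rates with a short Python sketch. The dictionary layout and the helper `average_ranks` are assumptions made for illustration. The data are the 30 pay rates from the application.

```python
pay = {
    "MI": [14.24, 14.06, 14.85, 17.47, 14.83, 19.01, 13.08, 15.94, 13.48, 16.94],
    "NY": [21.18, 20.94, 16.26, 21.03, 19.95, 17.54, 14.89, 18.88, 20.06, 21.81],
    "VA": [17.02, 20.63, 17.47, 15.54, 15.38, 14.90, 20.48, 18.50, 12.80, 15.57],
}

def average_ranks(values):
    """Rank values from smallest to largest; tied values share the average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

combined = [v for sample in pay.values() for v in sample]
ranks = average_ranks(combined)       # ranks of the combined data
N = len(combined)

# Split the ranks back out by state and add them up.
rank_sums = {}
start = 0
for state, sample in pay.items():
    rank_sums[state] = sum(ranks[start:start + len(sample)])
    start += len(sample)

H = 12 / (N * (N + 1)) * sum(rank_sums[s] ** 2 / len(pay[s]) for s in pay) - 3 * (N + 1)
print(rank_sums, round(H, 2))  # {'MI': 94.5, 'NY': 223.0, 'VA': 147.5} 10.76
```

The tie at 17.47 (one Michigan and one Virginia entry) is what produces the shared rank of 17.5.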
Rank Correlation
Section 11.4
Rank Correlation
The Spearman rank correlation coefficient, r_s, is a measure of the strength of the relationship between two variables. The Spearman rank correlation coefficient is calculated using the ranks of paired sample data entries. The formula for the Spearman rank correlation coefficient is
$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
where n is the number of paired data entries and d is the difference between the ranks of a paired data entry.
The hypotheses:
H0: ρ_s = 0 (There is no correlation between the variables.)
Ha: ρ_s ≠ 0 (There is a significant correlation between the variables.)
Rank Correlation
Seven candidates applied for a nursing position. The seven candidates were placed in rank order first by x and then by y. The results of the rankings are listed below. Using a .05 level of significance, test the claim that there is a significant correlation between the variables.
H0: ρ_s = 0 (There is no correlation between the variables.)
Ha: ρ_s ≠ 0 (There is a significant correlation between the variables.)
Candidate  x  y
1          2  1
2          4  4
3          1  3
4          5  2
5          7  6
6          3  1
7          6  7
Application
Candidate  x  y  d = x – y  d²
1          2  1  1          1
2          4  4  0          0
3          1  3  –2         4
4          5  2  3          9
5          7  6  1          1
6          3  1  2          4
7          6  7  –1         1
Σd² = 20
$$r_s = 1 - \frac{6(20)}{7(7^2 - 1)} \approx 0.643$$
Critical Value = 0.715
Since the statistic 0.643 does not fall in the rejection region, fail to reject H0. There is not enough evidence to support the claim that there is a significant correlation.
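The arithmetic behind r_s can be checked with a few lines of Python. The ranks are copied exactly as listed in the table, the variable names are illustrative assumptions, and the critical value 0.715 is the one quoted in the text rather than computed.

```python
# Ranks of the seven candidates, as listed in the table.
x = [2, 4, 1, 5, 7, 3, 6]
y = [1, 4, 3, 2, 6, 1, 7]

n = len(x)
sum_d2 = sum((xi - yi) ** 2 for xi, yi in zip(x, y))   # sum of squared rank differences
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

critical_value = 0.715            # taken from the text
reject = abs(r_s) > critical_value

print(sum_d2, round(r_s, 3), reject)  # 20 0.643 False
```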