Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in...

16
1 Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1. Zinc Data Model Validation The Normal plot suggests that the Normality assumption is acceptable: the plotted points follow an approximately straight line and the Anderson-Darling test gives a p- value of 0.86, which means that the data are consistent with the null hypothesis that they were generated from a Normal distribution. Based on the signalling rules used in the laboratory, the individuals chart suggests a stable system, with points varying at random around a mean of 195.62. The assumptions underlying the test and confidence interval would appear to hold in this case. Significance Test The hypothesis to be tested is that the long-run mean concentration equals the spiked value of 200 μg L -1 : H o : μ = 200 H 1 : μ 200 We choose a significance level of α=0.05 and carry out a two-tailed test. The reference t-distribution will have 29 degrees of freedom; this means that our critical values will be ±2.05 as shown in Figure 2.S.1.3. Figure 2.S.1.3: A t-distribution with 29 degrees of freedom

Transcript of Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in...

Page 1: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

1 Course Manual: PG Diploma in Statistics, TCD, Base Module

Chapter 2: Solutions to Supplementary Exercises 2.S.1. Zinc Data Model Validation The Normal plot suggests that the Normality assumption is acceptable: the plotted points follow an approximately straight line and the Anderson-Darling test gives a p-value of 0.86, which means that the data are consistent with the null hypothesis that they were generated from a Normal distribution. Based on the signalling rules used in the laboratory, the individuals chart suggests a stable system, with points varying at random around a mean of 195.62. The assumptions underlying the test and confidence interval would appear to hold in this case. Significance Test The hypothesis to be tested is that the long-run mean concentration equals the spiked value of 200 µg L-1:

Ho: µ = 200 H1: µ ≠ 200 We choose a significance level of α=0.05 and carry out a two-tailed test. The reference t-distribution will have 29 degrees of freedom; this means that our critical values will be ±2.05 as shown in Figure 2.S.1.3.

Figure 2.S.1.3: A t-distribution with 29 degrees of freedom

Page 2: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

2 Course Manual: PG Diploma in Statistics, TCD, Base Module

The test statistic is:

Since this value is a long way outside the critical values we reject the null hypothesis; the observed sample mean is statistically significantly different from the spiked value of 200. We conclude that the analytical system is not producing results for the control samples that vary at random around 200 µg L-1: it is biased downwards. Confidence Interval A 95% confidence interval for the long-run mean zinc concentration around which the measurements on the control samples vary is given by

± tc

195.62 ± 2.05

195.62 ± 2.10. We estimate that the long-run mean is somewhere between 193.52 and 197.72 µg L-1. Note that this interval does not include the spiked value of 200. The analytical system appears to be biased downwards by between 2.28 and 6.48 µg L-1. Relationship between Methods Where a confidence interval does not contain the null hypothesis value for the corresponding t-test, as in the current example, the test result will be statistically significant, i.e., the null hypothesis will be rejected. Conversely, if the test fails to reject the null hypothesis, then the hypothesised mean will be covered by the confidence interval. Thus, the confidence interval gives the set of possible long-run values which would not be rejected by a t-test.

Page 3: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

3 Course Manual: PG Diploma in Statistics, TCD, Base Module

2.S.2. Brain Density Data With small sample sizes it is always difficult to assess the validity of the assumptions underlying the statistical methods used in analysing the data. The model here is that the data are independently randomly scattered around some mean (which may or may not be zero) and that that random scatter corresponds to what would be expected from a single Normal distribution. Figure 2.S.2.1 is a scatterplot of the difference between the density measurement for the alcoholic and his/her control against the average of the two measurements. A picture like this can indicate that the differences tend to vary in a systematic way with the size of the measurements (something that is found often) – if this occurs it would contradict the assumption of a single Normal distribution of differences. The largest difference (4.7) lies in the top right-hand corner of the scatterplot and could suggest that the chance variation increases with the average size of the densities being measured. Without this point, the remaining 10 points seem fairly randomly scattered around the average of 1.5. The large value does not appear exceptionally large on the Normal plot of Figure 2.S.2.2 and the Anderson-Darling test does not reject the assumption of a Normal distribution. For the moment we will accept the data as conforming to the model required for the paired t-test and corresponding confidence interval. Further data might cast doubt on this assumption, leading to a review of the analysis given below.

Figure 2.S.2.1: Scatterplot of brain density differences versus means for pairs of subjects

Page 4: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

4 Course Manual: PG Diploma in Statistics, TCD, Base Module

Figure 2.S.2.2: Normal plot of density differences

A paired t-test tests the hypothesis that the mean of a population of such differences is equal to some hypothesised value; here that value is zero. Since there are 11 differences the test statistic will have 10 degrees of freedom: for a two-tailed test with a significance level of α = 0.05 the critical value is ±2.23. The Minitab paired t-test output shown in Table 2.S.2.3 gives a t-value of 3.19 which exceeds the critical value and is therefore evidence of a statistically significant difference on average. The p-value is 0.01 which is an alternative way of reporting the statistical significance.

Paired T for Controls - Alcoholics N Mean StDev SE Mean Controls 11 40.6636 2.5625 0.7726 Alcoholics 11 39.1455 1.7224 0.5193 Difference 11 1.51818 1.57976 0.47632 95% CI for mean difference: (0.45689, 2.57948) T-Test of mean difference = 0 (vs not = 0): T-Value = 3.19 P-Value = 0.010

Table 2.S.2.3: Minitab paired t-test analysis of the brain density data The 95% confidence interval of Table 2.S.2.3 suggests that in the population of such differences the brain densities for alcoholics would be reduced by between 0.46 and 2.58 units, on average. Note that if the largest difference of 4.7 is excluded, the implied population difference is somewhat smaller than before but the test and confidence interval still indicate a reduction in brain density.

Page 5: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

5 Course Manual: PG Diploma in Statistics, TCD, Base Module

2.S.3. Generic-Reference Drug Data We begin with some graphical analyses, which will, in fact, answer most, if not all, our questions. Figure 2.S.3.1 shows that the absorbance differences (generic–reference) for the 20 subjects appear to vary randomly around zero, thus indicating no systematic difference between the two drugs.

Figure 2.S.3.1: Scatterplot of differences versus subject number

Figure 2.S.3.1 does not show any systematic differences between the first ten subjects (who received the generic drug first) and the second set of ten (who received the reference drug first). Had there been a difference then some kind of carry-over effect of the drugs would be indicated. Figure 2.S.3.2, the Normal probability plot, does not call into question the Normality assumption that underlies the paired t-test, which is the natural tool for the data analysis, given the self-paired study design. Figure 2.S.3.3 does not show any systematic change in the differences as a function of the mean response from the different subjects: any such trend would call into question the assumption underlying the t-test - that the data (i.e., differences) come from a single Normal distribution.

Page 6: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

6 Course Manual: PG Diploma in Statistics, TCD, Base Module

Figure 2.S.3.2: Normal probability plot of differences

Figure 2.S.3.3: Scatterplot of differences versus mean response from each subject

The Minitab analysis, shown in Table 2.S.3.3 suggests that there is no differential effect – there is no systematic tendency for either drug to be absorbed more than the other. This is shown by the low t-value (0.15) and the correspondingly high p-value (0.88). The confidence interval for the long-run mean difference ranges from –464 to 538; the fact that it covers zero is consistent with the test result not rejecting a long-run value of

Page 7: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

7 Course Manual: PG Diploma in Statistics, TCD, Base Module

zero. Perhaps the most striking aspect of the data is the variability of responses and, in particular, the variability of the differences, as shown in Figure 2.S.3.4. The individual responses vary from 938 to 4108 (a range of 3170 units), but the differences within subject vary from –2353 to 2030 (a range of 4383 units); thus the variation within subject is even greater than the variation between subjects.

Paired T for Generic - Reference N Mean StDev SE Mean Generic 20 2085.50 642.56 143.68 Reference 20 2048.50 859.54 192.20 Difference 20 37.0000 1070.6215 239.3982 95% CI for mean difference: (-464.0663, 538.0663) T-Test of mean difference = 0 (vs not = 0): T-Value = 0.15 P-Value = 0.879

Table 2.S.3.3: A Minitab paired t-test analysis of the drug absorption data

Figure 2.S.3.4: Dotplots of differences and original observations

The paired design suggests that the investigators expected that some subjects would show high absorbance, on both drugs, while others would show relatively low absorbance. Taking differences would then eliminate the subject-to-subject systematic variation, and, if there were a systematic difference between the amounts of the two drugs absorbed by the subjects, the confidence interval would estimate the size of this shift. Figure 2.S.3.5 shows that such an expectation (if I am correct in my assumption) was not borne out in practice. Plotting the responses on the two drugs against each

Page 8: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

8 Course Manual: PG Diploma in Statistics, TCD, Base Module

other would be expected to show a general tendency for the pairs of responses to vary from the bottom left of the scatterplot (those whose absorbance was low on both drugs) to the top right-hand side of the scatterplot (subjects showing high absorbance for both drugs). Figure 2.S.3.5 shows no such tendency – a subject’s absorbance of the reference drug is not a useful predictor of his response to the generic drug.

Figure 2.S.3.5: Scatterplot of differences versus mean response from each subject

Page 9: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

9 Course Manual: PG Diploma in Statistics, TCD, Base Module

2.S.4. Boys-Girls IQ Scores Figure 2.S.4.1 shows dotplots for the three datasets. There are two relatively low values for both girls and boys (all), however, removal of these four values does not affect in any material way the analyses carried out below.

Figure 2.S.4.1: Dotplots of the three sets of data

The histogram and the Normal plot of residuals (data minus the groups means; Figures 2.S.4.2 and 2.S.4.3) suggest that the assumption of Normality is reasonable. Figures 2.S.4.4 and 2..S.4.5, which are Normal plots for girls and boys separately, are similarly supportive of the Normality assumption. We may proceed, therefore, to carrying out a two-sample t-test. Figure 2.S.4.1 suggests that the variability of IQ scores is about the same for boys and girls (an F-test (not shown) supports this). Accordingly, we make the assumption of a common population standard deviation in carrying out our analyses. First we will consider the reduced dataset for boys; later we will analyse all the IQ scores for the boys.

Page 10: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

10 Course Manual: PG Diploma in Statistics, TCD, Base Module

Figure 2.S.4.2: Histogram of residuals for IQ data

Figure 2.S.4.3: Normal plot of residuals for IQ data

Page 11: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

11 Course Manual: PG Diploma in Statistics, TCD, Base Module

Figure 2.S.4.4: Normal plot of IQ scores for girls

Figure 2.S.4.5: Normal plot of IQ scores for all the boys

Page 12: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

12 Course Manual: PG Diploma in Statistics, TCD, Base Module

Two-Sample T-Test and CI: Boys, Girls

N Mean StDev SE Mean Boys 31 111.1 11.9 2.1 Girls 31 105.8 14.3 2.6

Difference = mu (Boys) - mu (Girls) Estimate for difference: 5.22581 95% CI for difference: (-1.44660, 11.89822) T-Test of difference = 0 (vs not =): T-Value = 1.57 P-Value = 0.122 DF = 60 Both use Pooled StDev = 13.1327

Table 2.S.4.3: Two sample t-test and confidence interval, comparing boys and girls

The p-value (corresponding to a t-value of 1.57) is 0.122; this suggests that the population mean IQ scores for boys and girls do not differ. The confidence interval for the population mean difference is from –1.45 to 11.90; since this covers zero we cannot reject the null hypothesis of zero long-run mean difference. The results of the t-test and the confidence interval lead to identical conclusions, which must be the case, since the two methods are based on the same data and there is a one-to-one correspondence in the way they use the data.

Table 2.S.4.4 repeats the analysis for all the boys’ scores. The only difference involved in the calculations is that a weighted average1 of the two sample variances is calculated in Table 2.S.4.4, whereas only a simple average is needed for Table 2.S.4.3. The results are virtually identical to those from the reduced dataset. Note that the standard error that appears in the denominator of the test is calculated as:

to allow for the different sample sizes; the square of the common estimate of the long-run standard deviation, , is calculated using the weighted average shown in the footnote.

1 for Table 2.S.4.4; for Table 2.S.4.3.

Page 13: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

13 Course Manual: PG Diploma in Statistics, TCD, Base Module

Two-Sample T-Test and CI: Boys-all, Girls

N Mean StDev SE Mean Boys-all 47 111.0 12.1 1.8 Girls 31 105.8 14.3 2.6 Difference = mu (Boys-all) - mu (Girls) Estimate for difference: 5.11874 95% CI for difference: (-0.87760, 11.11507) T-Test of difference = 0 (vs not =): T-Value = 1.70 P-Value = 0.093 DF = 76 Both use Pooled StDev = 13.0122

Table 2.S.4.4: Two sample t-test and confidence interval, comparing boys (all) and girls

Page 14: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

14 Course Manual: PG Diploma in Statistics, TCD, Base Module

2.S.5. Collagen type VII protein data Model Validation Figure 2.S.5.1 shows the residuals (data minus group means) plotted against the mean responses for the two groups. While the control sample (higher mean) responses are somewhat less scattered than the T molecule responses, given the relatively small sample sizes the figure does not provide strong evidence to call into question the assumption of constant standard deviation. An F-test gives an F-statistic of 2.0 with a corresponding p-value of 0.15, which does not reject the hypothesis of equal long-run standard deviations. In any case, see below, dropping the assumption has no effect on the conclusions drawn. Figure 2.S.5.2 is a Normal plot of the residuals – it shows that the data points are close to a straight line. This, together with the p-value of 0.89 for the Anderson-Darling test, gives good support to the assumption of data Normality. Table 2.S.5.2 is a Minitab two-sample analysis of the data.

Two-sample T for C vs T N Mean StDev SE Mean C 18 1.031 0.114 0.027 T 18 0.469 0.163 0.038 Difference = mu (C) - mu (T) Estimate for difference: 0.5616 95% CI for difference: (0.4664, 0.6569) T-Test of difference = 0 (vs not =): T-Value = 11.98 P-Value = 0.000 DF = 34 Both use Pooled StDev = 0.1406

Table 2.S.5.2: A Minitab two-sample analysis of the protein data

The t-test is highly statistically significant (note the p-value < 0.0005) – the fact that the T molecule results in less than half the amount of collagen type VII protein is unlikely to be due to chance: the difference in the sample means is much larger than would be expected from chance variation, given the observed cell-to-cell variation as measured by the combined (or pooled) standard deviation of 0.14. The confidence interval provides a measure of the size of the reduction in protein production (in the dimensionless instrumental response units). Note that an approximate t-test that does not assume constant standard deviation gives virtually identical results.

Page 15: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

15 Course Manual: PG Diploma in Statistics, TCD, Base Module

Figure 1: Residuals versus group means

Figure 2: Normal plot of residuals versus

Page 16: Chapter 2: Solutions to Supplementary Exercises 2.S.1 ... No… · Course Manual: PG Diploma in Statistics, TCD, Base Module Chapter 2: Solutions to Supplementary Exercises 2.S.1.

16 Course Manual: PG Diploma in Statistics, TCD, Base Module

References

[1] Mullins, E., Statistics for the Quality Control Chemistry Laboratory, Royal Society

of Chemistry, Cambridge, 2003. [2] Samuels, M.L., Statistics for the Life Sciences, Dellen Publishing Company, 1989. [3] Golden, C.J. et al., Differences in brain densities between chronic alcoholic and

normal control patients, Science, 211, 508-510, 1981. [4] Moore, D.S., The Basic Practice of Statistics, W.H. Freeman, New York, 2007.

Text © Eamonn Mullins, 2012; data, see references