CHAPTER 6faculty.ccc.edu/.../Chapter6.pdf · CHAPTER 6 CONFIDENCE INTERVALS ... estimates or...

CHAPTER 6

CONFIDENCE INTERVALS

Populations are often too large to study in full, so we select samples and use them to draw conclusions about the populations. The most common estimators used to make inferences about a population are point estimates and interval estimates. A point estimate, such as the sample mean, median, or mode, uses a statistic to estimate the parameter at a single value or point. An interval estimate (confidence interval) specifies a range within which the unknown population parameter probably lies, and it is usually accompanied by a statement of the level of confidence placed on its accuracy. After completing this chapter, you will be able to use Microsoft Excel to:

- Calculate confidence intervals for the mean.
- Calculate confidence intervals for proportions.
- Determine sample sizes for means and proportions.

Confidence Interval for the Mean (Large Samples or σ Known)

In this section we explain how to use Microsoft Excel to calculate the confidence interval for the population mean when σ is known or the sample is large (i.e., n ≥ 30). Microsoft Excel has a statistical function called CONFIDENCE, which returns the value of the margin of error, E. After Excel computes this value, you need only subtract it from and add it to the sample mean (xbar) to find the confidence interval. This function assumes a normal distribution, regardless of the sample size. Therefore, if the population distribution is not normal, a large sample (n ≥ 30) should be used. The Excel command is

CONFIDENCE(Alpha, Standard_dev, Size),

where Alpha (α) = 1 - confidence level. In other words, an alpha of 0.01 indicates a 99% confidence level. The standard deviation argument is the population standard deviation, σ. If σ is not known and the sample size is large (n ≥ 30), we use the sample standard deviation s instead of σ. Recall from our textbook that the confidence interval for the population mean is

xbar - E ≤ μ ≤ xbar + E

Note that Microsoft Excel uses the formula E = zc*σ/√n to calculate the margin of error, where zc is the critical z-value for the chosen confidence level c. For example, if c = 95%, then zc = 1.96.

Example 6.1: [σ is known] John decided to estimate the average number of miles he drives each day. Over a four-month period, he selected a random sample of 40 days and kept a record of the distance driven on each of those days. The sample mean (xbar) was 29.58 miles, and the population standard deviation was assumed to be σ = 5.5 miles. Find a 99% confidence interval for the population mean of miles John drove per day in the four-month period.

Solution: We will create a worksheet template to solve this problem and similar problems.

- On a new worksheet, enter the data as shown in Figure (6.1).
- In cell A8, type =B3-B4 and press the Enter key.
- In cell B8, type =B3+B4 and press the Enter key.
- Save this template under the name Confidence Interval Template For Mean_Large Samples.xls. This template can be used to find the confidence interval for a population mean for large samples or a known population standard deviation.
- In cell B3, enter the value of the sample mean, xbar.
- Activate cell B4, and from the formula bar click Insert Function, fx.
- A dialog box will be displayed as in Figure (6.2). Select the Statistical category, scroll through the menu, highlight the Confidence function, and click OK.
- Complete the Confidence dialog box shown in Figure (6.3) by entering Alpha = 1 - 0.99 = 0.01, Standard_dev = 5.5, and Size = 40, then click OK. This function returns a value of 2.2400 as the margin of error, E.

Alternatively, you may find the margin of error E by activating cell B4, typing the formula =CONFIDENCE(0.01, 5.5, 40), and pressing the Enter key. With xbar = 29.58 in cell B3, the confidence interval for the population mean is 29.58 ± 2.2400, that is, (27.3400, 31.8200) [see Figure (6.4)]. You can use this template to find the confidence interval for the mean (large samples or known σ) in similar problems.

Figure (6.1)

Figure (6.2)

Figure (6.3)

Figure (6.4)

Example 6.2: A researcher wishes to estimate the average amount of money a person spends on lottery tickets each month. A sample of 50 people who play the lottery had a mean of $18.70 and a standard deviation of $6.40. Construct the 90% confidence interval for the population mean.

Solution: Because the sample is large (n ≥ 30), the sample standard deviation s = 6.4 is used in place of σ. Using the Confidence Interval Template for the mean (large samples or known σ), we find the confidence interval (17.2112, 20.1888) [see Figure (6.5)].
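For readers who want to cross-check the worksheet template outside Excel, the following is a minimal Python sketch of the same calculation, E = zc*σ/√n followed by xbar ± E. The function name `z_confidence_interval` is hypothetical (it is not part of Excel or of this manual), and the critical value is taken from Python's standard-library normal distribution rather than from the CONFIDENCE function.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, confidence):
    """Return (E, lower, upper) for a mean when sigma is known or n is large."""
    alpha = 1 - confidence
    # Two-tailed critical value zc: an area of alpha/2 lies in each tail.
    zc = NormalDist().inv_cdf(1 - alpha / 2)
    margin = zc * sigma / sqrt(n)  # the quantity the text calls E
    return margin, xbar - margin, xbar + margin


# Margin of error for Example 6.1 (sigma = 5.5, n = 40, 99% confidence)
E1, _, _ = z_confidence_interval(29.58, 5.5, 40, 0.99)
print(f"Example 6.1: E = {E1:.4f}")  # E = 2.2400

# Example 6.2 (xbar = 18.7, s = 6.4 used in place of sigma, n = 50, 90% confidence)
E2, lower, upper = z_confidence_interval(18.7, 6.4, 50, 0.90)
print(f"Example 6.2: ({lower:.4f}, {upper:.4f})")  # (17.2112, 20.1888)
```

Passing the confidence level rather than alpha is only a design choice for readability; the worksheet's Alpha argument is simply 1 minus that value.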

Figure (6.5)

Example 6.3: The college president asks the math teacher to estimate the average age of the students at their college. The math teacher would like to be 95% confident that the estimate is accurate within one year. From a previous study, the standard deviation of ages is known to be 3 years. How large a sample is necessary?

Solution: Microsoft Excel doesn't have a function for this, so you need to create a template that evaluates the formula. The formula used to calculate the sample size is n = (zc*σ/E)^2.

- Open a new worksheet, and in cell A1, type Sample Size for Mean/Confidence Interval.
- In cells A3, A4, and A5, type z-value (zc), Std_Dev (s), and Margin of Error (E), respectively.
- In cells A6 and A7, type Sample Size and Rounded Size, respectively.
- In cell B6, type =((B3*B4)/B5)^2, and press the Enter key.
- In cell B7, type =INT(B6+0.99). This formula rounds the answer up to the next whole number. Note that cells B6 and B7 will display #DIV/0! at first, but they will change when you fill in the values [see Figure (6.6)].
- Save your file under the name Sample Size Mean Template.xls or under a different name. This template can be used to find the sample size whenever a population mean needs to be estimated.
- In cell B3, type 1.96, since the required confidence level is 95%.
- In cell B4, type 3, the estimated standard deviation.
- In cell B5, type 1, since the margin of error is to be no more than one year. The sample size of 34.5744 is displayed in cell B6, and the rounded value of the sample size, 35, is displayed in cell B7 [see Figure (6.7)].
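The template's arithmetic can also be sketched in a few lines of Python. This is only an illustration under the same assumptions as the example (zc = 1.96, σ = 3, E = 1); the function name `sample_size_for_mean` is hypothetical.

```python
from math import ceil


def sample_size_for_mean(zc, sigma, margin):
    """Smallest whole sample size n for which zc*sigma/sqrt(n) <= margin."""
    return ceil((zc * sigma / margin) ** 2)


# Example 6.3: (1.96 * 3 / 1)^2 = 34.5744, which rounds up to 35
print(sample_size_for_mean(1.96, 3, 1))  # 35
```

The sketch uses `ceil` rather than the worksheet's INT(B6+0.99). The two agree for this example, but INT(x+0.99) would fail to round up a value whose fractional part is smaller than 0.01, so a true ceiling is the safer choice in general.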

Figure (6.6)

Figure (6.7)

Confidence Interval for the Mean (Small Samples and σ Unknown)

When the population standard deviation σ is unknown and the sample size is small (n < 30), the sample standard deviation s is used instead of σ, and the confidence interval is computed using Student's t-distribution. The confidence interval for the population mean is

xbar - E ≤ μ ≤ xbar + E,

where xbar is the sample mean and the margin of error is E = tc*s/√n. Microsoft Excel contains two commands for Student's t-distribution under the Statistical category of Insert Function.

TDIST(x, degrees of freedom, tails): This returns the area in the tail(s) of Student's t-distribution beyond the specified value of x for the specified number of degrees of freedom and number of tails (1 or 2).

TINV(probability, degrees of freedom): This returns the critical t-value, tc, such that the total area in the two tails beyond -tc and +tc equals the specified probability for the specified degrees of freedom. The TINV command finds the value of tc used in computing the confidence interval for the mean for small samples and unknown population standard deviation (σ). For instance, if we have a sample of size 15, then the tc value used to compute a 98% confidence interval for the mean is 2.624. Notice that for a 98% confidence interval, 2% of the area is in the two tails, and the number of degrees of freedom is d.f. = 15 - 1 = 14.

Example 6.4: Find the critical value tc for a 95% confidence interval when the sample size is 16.

Solution:

- Since the confidence level is 0.95, α = 1 - 0.95 = 0.05, which is the area in the two tails.
- Click Insert Function.
- Select the Statistical or All category.
- Scroll through the list of function names, highlight TINV, and click OK.
- In the TINV dialog box [see Figure (6.8)], enter 0.05 in the probability text box and 15 (that is, 16 - 1) in the degrees of freedom text box. Then click OK. A value of 2.1314 will be returned [see Figure (6.9)].

Also, note that the value of the margin of error, E, can be found from the Descriptive Statistics dialog box, as we will see later. If you have data entered into a worksheet, you can use these menu selections to automatically calculate the sample mean (xbar) and the sample standard deviation (s) for the data, as well as the margin of error (E) for the confidence interval. This margin of error is based on Student's t-distribution regardless of the sample size (n).

Figure (6.8)

Figure (6.9)

Example 6.5: You randomly select 16 restaurants and measure the temperature of the coffee sold at each. The sample mean temperature is 162.0 degrees Fahrenheit with a sample standard deviation of 10.0 degrees Fahrenheit. Find the 95% confidence interval for the mean temperature. Assume the temperatures are approximately normally distributed.

Solution:

Note that the area in the two tails is 0.05 and the number of degrees of freedom is d.f. = 16 - 1 = 15. We will create a template to solve such problems.

- On a new worksheet, enter the data shown in Figure (6.10).
- In cell B8, type =(B4*B6)/SQRT(B5) and press the Enter key.
- In cell A12, type =B3-B8, and press the Enter key.
- In cell B12, type =B3+B8, and press the Enter key.
- Save this file under the name Confidence Interval Template For Mean_Small Samples.xls [see Figure (6.11)]. This template can be used to find the confidence interval for the population mean for small samples and unknown population standard deviation (σ).
- Cells B8, A12, and B12 display #DIV/0! at first, but they will change when you fill in the values.
- In cell B3, enter the value of xbar, which is 162.0.
- In cell B4, enter the sample standard deviation s, which is 10.
- In cell B5, enter the sample size n, which is 16.
- In cell B6, enter the critical t-value, tc, which is 2.1314, as computed in Example 6.4. You may also compute tc by typing the formula =TINV(0.05, 15).

Thus, the lower and upper limits of the confidence interval are computed in cells A12 and B12. Hence the confidence interval is (156.6714, 167.3286) [see Figure (6.12)].
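As a rough cross-check of the template, the same steps can be sketched in Python. The function name `t_confidence_interval` is hypothetical, and the critical value 2.1314 is simply copied from Example 6.4 (Python's standard library has no equivalent of TINV), so the last decimal place can differ slightly from the worksheet, which keeps the unrounded tc.

```python
from math import sqrt


def t_confidence_interval(xbar, s, n, tc):
    """Return (E, lower, upper) using E = tc*s/sqrt(n) with a supplied critical t-value."""
    margin = tc * s / sqrt(n)
    return margin, xbar - margin, xbar + margin


# Example 6.5: xbar = 162.0, s = 10.0, n = 16, tc = 2.1314 (from Example 6.4)
E, lower, upper = t_confidence_interval(162.0, 10.0, 16, 2.1314)
print(round(E, 4), round(lower, 2), round(upper, 2))  # 5.3285 156.67 167.33
```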

In the next example, we will demonstrate how to use summary statistics to find the sample mean (xbar), the sample standard deviation (s), and the margin of error (E). Then we construct the confidence interval for the population mean for a small sample and unknown population standard deviation (σ).

Figure (6.10)

Figure (6.11)

Figure (6.12)

Example 6.6: The following data represent the weights, in pounds, of 22 randomly selected football players.

225 230 235 239 231 227 244 223 250 226 242
254 252 226 231 247 223 224 233 224 240 243

Use the Descriptive Statistics dialog box to get the summary statistics and create a 95% confidence interval for the population mean weight.

Solution: We have a small sample (n < 30), and the population standard deviation σ is unknown.

- Type weights in cell A1, and type the given weight values in cells A2 through A23.
- Click Tools > Data Analysis. Select Descriptive Statistics, and click OK.
- Complete the Descriptive Statistics dialog box [see Figure (6.13)].
- For the Input Range field, enter A2:A23 or select the cells.
- Select Output Range and set it to start at cell C1.
- Select Summary Statistics and Confidence Level for Mean, and click OK.
- Widen columns C and D to accommodate the output. Note that the sample mean (xbar) is in D3, and the margin of error (E) for a 95% confidence interval, based on Student's t-distribution rather than the standard normal distribution, is in cell D16.
- In cell C18, type Confidence Interval for the Mean (Small Samples and Sigma Unknown) and merge cells.
- In cells C19 and D19, type lower and upper, respectively.
- In cell C20, enter the formula =D3-D16 and press the Enter key. This returns the lower limit of the confidence interval.
- In cell D20, enter the formula =D3+D16, and press the Enter key. This returns the upper limit of the confidence interval.

Thus, the confidence interval is (230.4465, 239.4625) [see Figure (6.14)].
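The summary statistics and the margin of error reported by the Descriptive Statistics tool can be reproduced approximately with Python's standard library. This is a sketch, not the tool's implementation: the constant `T_CRIT_95_DF21` is an assumed table value for the two-tailed 95% critical t-value with 21 degrees of freedom (the value =TINV(0.05, 21) should return), and it is hard-coded because the standard library has no inverse t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

weights = [225, 230, 235, 239, 231, 227, 244, 223, 250, 226, 242,
           254, 252, 226, 231, 247, 223, 224, 233, 224, 240, 243]

# Assumption: two-tailed 95% critical t-value for d.f. = 22 - 1 = 21, from a t-table.
T_CRIT_95_DF21 = 2.0796

n = len(weights)
xbar = mean(weights)   # sample mean (cell D3 in the worksheet output)
s = stdev(weights)     # sample standard deviation (n - 1 in the denominator)
margin = T_CRIT_95_DF21 * s / sqrt(n)  # margin of error E (cell D16)
lower, upper = xbar - margin, xbar + margin

print(f"n = {n}, xbar = {xbar:.4f}, s = {s:.4f}, E = {margin:.4f}")
print(f"interval = ({lower:.2f}, {upper:.2f})")
```

The limits should agree with the worksheet's (230.4465, 239.4625) to about two decimal places; small differences come from the rounded critical value.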

Figure (6.13)

Figure (6.14)

Confidence Interval for Population Proportions

The confidence interval for the population proportion p is calculated as follows:

phat - E ≤ p ≤ phat + E,

with the margin of error E = zc*SQRT(phat*qhat/n), where phat is the sample proportion, n is the sample size, and qhat = 1 - phat. Note that zc depends on the level of confidence:

90% level of confidence => zc = 1.65
95% level of confidence => zc = 1.96
99% level of confidence => zc = 2.58

There is no built-in function in Microsoft Excel that calculates a confidence interval for a population proportion or the sample size needed to estimate one. So we will create two templates: one to solve problems that involve finding confidence intervals for a population proportion, and the other to solve problems that involve finding sample sizes.

Example 6.7: In a survey of 1011 US adults, 293 said that their favorite sport to watch is football. Find a point estimate for the population proportion of US adults who say their favorite sport to watch is football, and construct the 99% confidence interval for that population proportion.

Solution: Since n = 1011 and x = 293, the point estimate phat is equal to x/n. Thus, phat = 293/1011 ≈ 0.29, and qhat = 1 - phat = 0.71. We will create a template that can be used whenever you want to find the confidence interval for a population proportion.

- On a new worksheet, enter the data as shown in Figure (6.15).
- In cell B6, type =B4*SQRT(B3*(1-B3)/B5), and press the Enter key. This returns the value of the margin of error (E).
- In cell A9, type =B3-B6, and press the Enter key.
- In cell B9, type =B3+B6, and press the Enter key. Note that cells B6, A9, and B9 display #DIV/0!. This is because the values in cells B3, B4, and B5 have not yet been entered.
- Save this file under the name Confidence Interval Template _Population Proportion.xls [see Figure (6.16)]. This template can be used to find the confidence interval for a population proportion.
- In cell B3, enter 0.29, the sample proportion.
- In cell B4, enter 2.58, the critical z-value (zc) for a 99% confidence level.
- In cell B5, enter 1011, the sample size.
- The lower and upper limits of the confidence interval are now displayed in cells A9 and B9. Hence, the confidence interval is (0.25, 0.33) = (25%, 33%) [see Figure (6.17)].
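The proportion template reduces to one formula, which the following Python sketch mirrors. The function name `proportion_confidence_interval` is hypothetical, and zc is passed in directly (2.58 here, the rounded value used in the text) rather than computed.

```python
from math import sqrt


def proportion_confidence_interval(phat, n, zc):
    """Return (E, lower, upper) using E = zc*sqrt(phat*(1 - phat)/n)."""
    margin = zc * sqrt(phat * (1 - phat) / n)
    return margin, phat - margin, phat + margin


# Example 6.7: phat = 0.29, n = 1011, zc = 2.58 for 99% confidence
E, lower, upper = proportion_confidence_interval(0.29, 1011, 2.58)
print(f"E = {E:.4f}, interval = ({lower:.2f}, {upper:.2f})")  # E = 0.0368, interval = (0.25, 0.33)
```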

Figure (6.15)

Figure (6.16)

Figure (6.17)

In the next example, we will create a template that can be used to find the sample size when estimating a population proportion. The formula for determining the sample size is

n = phat*qhat*(zc/E)^2

This formula can be used whenever a preliminary estimate of phat (and hence qhat) is available. If no such estimate is available, we assume that phat = qhat = 0.5, which makes the product phat*qhat as large as possible and therefore gives the most conservative (largest) sample size.

Example 6.8: A researcher wishes to estimate, with 90% confidence, the population proportion of people who own a computer. A previous study shows that 37% of those interviewed had a computer at home. The researcher wishes to be accurate within 3% of the true proportion. Find the minimum sample size necessary.

Solution: We will create a template to solve this problem and similar problems. Note that E = 3% = 0.03, phat = 0.37, and zc = 1.65, since the confidence level is 90%.

- On a new worksheet, in cell A1, type Sample Size for Proportion/Confidence Interval.
- In cells A3, A4, A5, A6, and A7, type z-value (zc), phat, Error (E), Size (n), and Rounded Size, respectively.
- In cell B6, type =B4*(1-B4)*(B3/B5)^2, and press the Enter key. This returns the sample size.
- In cell B7, type =INT(B6+0.99). This rounds the answer up to the next whole number.
- Save this file under the name Sample Size_Proportion.xls, or under a different name [see Figure (6.18)]. This template is used to find the sample size needed to estimate the population proportion.
- In cells B3, B4, and B5, enter 1.65, 0.37, and 0.03, respectively.
- The unrounded sample size, 705.1275, is computed in cell B6, and the rounded value is displayed in cell B7. Thus the minimum sample size is 706 [see Figure (6.19)].
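The sample-size template for proportions can be sketched the same way. The function name `sample_size_for_proportion` is hypothetical, and the default phat = 0.5 is an illustrative way of encoding the "no prior estimate" case described before this example.

```python
from math import ceil


def sample_size_for_proportion(zc, margin, phat=0.5):
    """Smallest whole n from n = phat*(1 - phat)*(zc/margin)^2.

    phat defaults to 0.5, the conservative choice when no prior estimate exists.
    """
    return ceil(phat * (1 - phat) * (zc / margin) ** 2)


# Example 6.8: zc = 1.65, E = 0.03, prior estimate phat = 0.37
print(sample_size_for_proportion(1.65, 0.03, phat=0.37))  # 706

# Same precision with no prior estimate (phat = qhat = 0.5)
print(sample_size_for_proportion(1.65, 0.03))  # 757
```

Comparing the two results shows how much a prior estimate away from 0.5 reduces the required sample.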

Figure (6.18)

Figure (6.19)

Microsoft Excel Lab Experiments:

Lab Experiment 6.1: Open or retrieve the Excel worksheet Heights.xls from the CD-ROM included with this manual. This file contains the heights (in feet) of 28 randomly selected basketball players. Use the Descriptive Statistics dialog box to get summary statistics for the data and to create a 95% confidence interval for the mean height of basketball players.

Lab Experiment 6.2: A survey of 52 first-time ice-skaters showed that 17 did not want to repeat this experience. Find the 90% confidence interval for the true proportion of ice-skaters who did not want to repeat this experience.

Lab Experiment 6.3: A survey showed that 22% of 100 women over age 50 in the study were widows.

(a) How large a sample must one take to be 98% confident that the estimate is within 0.05 of the true proportion of women over age 50 who were widows?

(b) How large should the sample be if no prior information about the sample proportion was available?

Lab Experiment 6.4:⁶ When Mendel conducted his famous genetics experiments with peas, one sample of offspring consisted of 426 green peas and 154 yellow peas.

(a) Use Microsoft Excel to find the following confidence interval estimates of the percentage of yellow peas.

99% Confidence Interval: __________________________
98% Confidence Interval: __________________________
95% Confidence Interval: __________________________
90% Confidence Interval: __________________________

6 Mario F. Triola, Minitab Manual, Pearson Wesley, 2004.

(b) After examining the pattern of the above confidence intervals, complete the following statement. “As the level of confidence decreases, the confidence interval limits ____________”.

(c) In your own words, explain why the preceding completed statement makes sense. That is, why should the confidence intervals behave as you have described?

Lab Experiment 6.5: Open or retrieve the Excel worksheet Weights.xls from the CD-ROM included with this manual. This file contains the weights (in lb) of 45 bears. Assuming σ is known to be 124.2 lb, find the 99% confidence interval for the population mean of all such bear weights.

Lab Experiment 6.6: (Simulated Data)⁷

In this experiment we will generate 500 IQ scores and then construct a confidence interval based on the sample result. IQ scores have a normal distribution with a mean of 100 and a standard deviation of 15. First generate the sample of 500 scores as follows.

1. Choose Tools > Data Analysis > Random Number Generation.
2. The Random Number Generation screen appears. Choose 1 for the number of variables (that is, 1 column), 500 for the number of random numbers (that is, 500 rows), "Normal" for the distribution, 100 for the mean, and 15 for the standard deviation, and store the output starting at cell A1 [see Figure (6.20)]. Click OK.
3. Use the Descriptive Statistics dialog box to get the summary statistics:

n = _________________________
xbar = _______________________
s = __________________________

Using the generated values, construct a 95% confidence interval estimate of the population mean of all IQ scores. Enter the 95% confidence interval here: ________________________________________________________

7 Mario F. Triola, Minitab Manual, Pearson Wesley, 2004.

Because of the way that the sample data was generated, we know that the population mean is 100. Do the confidence interval limits contain the true mean IQ score of 100? _____________________________________________________________
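The same simulation idea can be sketched outside Excel. The following Python fragment is only an illustration of the procedure, not a substitute for the lab: the seed value is arbitrary, the generated scores will differ from those Excel produces, and the interval uses the normal critical value (reasonable for n = 500) instead of the t-based margin reported by the Descriptive Statistics tool.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

random.seed(6)  # arbitrary seed so the run is repeatable; any value works

# Generate 500 simulated IQ scores from a normal distribution (mean 100, sd 15).
scores = [random.gauss(100, 15) for _ in range(500)]

n = len(scores)
xbar = mean(scores)
s = stdev(scores)

# 95% interval with the large-sample formula E = zc*s/sqrt(n).
zc = NormalDist().inv_cdf(0.975)
margin = zc * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(f"n = {n}, xbar = {xbar:.2f}, s = {s:.2f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
print("Contains the true mean of 100:", lower <= 100 <= upper)
```

Rerunning with different seeds shows that most, but not all, of the intervals contain 100, which is what a 95% confidence level means.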

Figure (6.20)
