ACTIVITY 10 - Mrs. Hamilton AP...


10 Introduction to Inference

ACTIVITY 10 A Little Tacky!

Materials: Small box of thumbtacks

When you flip a fair coin, it is equally likely to land "heads" or "tails." Do thumbtacks behave in the same way? In this activity, you will toss a thumbtack several times and observe whether it comes to rest with the point up (U) or point down (D). The question you are trying to answer is: what proportion of the time does a tossed thumbtack settle with its point up (U)?

1. Before you begin the activity, make a guess about what will happen. If you could toss your thumbtack over and over and over, what proportion of all tosses do you think would settle with the point up (U)?

2. Toss your thumbtack 50 times. Record the result of each toss (U or D) in a table like the one shown. In the third column, calculate the proportion of point up (U) tosses you have obtained so far.

Toss    Outcome    Cumulative proportion of U's

3. Make a scatterplot with the number of tosses on the horizontal axis and the cumulative proportion of U's on the vertical axis. Connect consecutive points with a line segment. Does the overall proportion of U's seem to be approaching a single value?

4. Your set of 50 tosses can be thought of as a simple random sample from the population of all possible tosses of your thumbtack. The parameter p is the (unknown) population proportion of tosses that would land point up (U). What is your best estimate for p? It's p̂, the proportion of U's in your 50 thumbtack tosses. Record your value of p̂. How does it compare with the conjecture you made in step 1?

5. If you tossed your thumbtack 50 more times (don't do it!), would you expect to get the same value of p̂? In Chapter 9, we learned that the values of p̂ in repeated samples could be described by a sampling distribution. The mean of the sampling distribution, μ_p̂, is equal to the population proportion p. How far will your sample proportion p̂ be from the true value p? If the sampling distribution is approximately normal, then the 68-95-99.7 rule tells us that about 95% of all p̂-values will be within two standard deviations of p.


6. The sampling distribution of p̂ will be approximately normal if np̂ ≥ 10 and n(1 − p̂) ≥ 10. Verify that these conditions are satisfied for your sample.

7. Estimate the standard deviation of the sampling distribution by computing √(p̂(1 − p̂)/n) using your value of p̂. This is the formula we developed in Chapter 9 with p replaced by p̂.

8. Construct the interval p̂ ± 2√(p̂(1 − p̂)/n) based on your sample of 50 tosses. This is called a confidence interval for p.

9. Your teacher will draw a number line with a scale marked off from 0 to 1 that has tick marks every 0.05 units. Draw your confidence interval above the number line. Your classmates will do the same. Do most of the intervals overlap? If so, what values are contained in all of the overlapping intervals?

10. About 95% of the time, the sample proportion p̂ of point up (U) tosses will be within two standard deviations of the actual population proportion of point up (U) tosses of a thumbtack. But if p̂ is within two standard deviations of p, then p is within two standard deviations of p̂. So about 95% of the time, the interval p̂ ± 2√(p̂(1 − p̂)/n) will contain the true proportion p.

11. There is no way to know whether the confidence interval you constructed in step 8 actually "catches" the true proportion p of times that your thumbtack will land point up. What we can say is that the method you used in step 8 will succeed in capturing the unknown population parameter about 95% of the time. Likewise, we would expect about 95% of all the confidence intervals drawn by the members of your class in step 9 to capture p.

This activity shows you how sample statistics can be used to estimate unknown population parameters. This is one of the two types of statistical inference that you will meet in this chapter.
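For readers without thumbtacks at hand, steps 2 through 8 can be mimicked in a short Python sketch. The "true" proportion 0.6 used below is an invented value that only drives the simulation, and the function names are illustrative rather than part of the activity.

```python
import math
import random


def simulate_tosses(n, p_up, seed=0):
    """Simulate n thumbtack tosses; True means point up (U)."""
    rng = random.Random(seed)
    return [rng.random() < p_up for _ in range(n)]


def cumulative_proportions(tosses):
    """Running proportion of U's after each toss (third column of the table)."""
    props, ups = [], 0
    for i, up in enumerate(tosses, start=1):
        ups += up
        props.append(ups / i)
    return props


def two_sd_interval(p_hat, n):
    """Interval p_hat +/- 2 * sqrt(p_hat * (1 - p_hat) / n), as in step 8."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - 2 * se, p_hat + 2 * se


n = 50
tosses = simulate_tosses(n, p_up=0.6)  # 0.6 is a made-up "true" p
p_hat = cumulative_proportions(tosses)[-1]

# Normality conditions from step 6.
conditions_ok = n * p_hat >= 10 and n * (1 - p_hat) >= 10
low, high = two_sd_interval(p_hat, n)

print(f"p_hat = {p_hat:.2f}, conditions met: {conditions_ok}")
print(f"interval: {low:.3f} to {high:.3f}")
```

Changing the seed plays the role of a different student's 50 tosses, so rerunning with several seeds gives a set of intervals comparable to the ones drawn above the class number line in step 9.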

INTRODUCTION

When we select a sample, we know the responses of the individuals in the sample. Often we are not content with information about the sample. We want to infer from the sample data some conclusion about a wider population that the sample represents.


STATISTICAL INFERENCE

Statistical inference provides methods for drawing conclusions about a population from sample data.

We have, of course, been drawing conclusions from data all along. What is new in formal inference is that we use probability to express the strength of our conclusions. Probability allows us to take chance variation into account and so to correct our judgment by calculation. Here are two examples of how probability can correct our judgment.

EXAMPLE 10.1 DRAFT LOTTERIES AND DRUG STUDIES

In the Vietnam War years, a lottery determined the order in which men were drafted for army service. The lottery assigned draft numbers by choosing birth dates in random order. We expect a correlation near zero between birth dates and draft numbers if the draft numbers come from random choice. The actual correlation between birth date and draft number in the first draft lottery was r = -0.226. That is, men born later in the year tended to get lower draft numbers. Is this small correlation evidence that the lottery was biased? Our unaided judgment can't tell because any two variables will have some association in practice, just by chance. So we calculate that a correlation this far from zero has probability less than 0.001 in a truly random lottery. Because a correlation as strong as that observed would almost never occur in a random lottery, there is strong evidence that the lottery was unfair.

Probability calculations can also protect us from jumping to a conclusion when only chance variation is at work. Give a new drug and a placebo to 20 patients each; 12 of those taking the drug show improvement, but only 8 of the placebo patients improve. Is the drug more effective than the placebo? Perhaps, but a difference this large or larger between the results in the two groups would occur about one time in five simply because of chance variation. An effect that could so easily be just chance is not convincing.
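The text does not say how the "one time in five" figure was obtained. One plausible reconstruction, offered only as a sketch, treats the 20 patients who improved as fixed and asks how often random assignment alone would put 12 or more of them in the drug group. That model (a hypergeometric count) and the function name are assumptions, not the book's stated method.

```python
from math import comb


def prob_at_least(k_min, improved=20, group_size=20, total=40):
    """P(at least k_min of the improvers fall in the drug group) when the
    group_size drug patients are chosen at random from all `total` patients."""
    total_ways = comb(total, group_size)
    ways = sum(
        comb(improved, k) * comb(total - improved, group_size - k)
        for k in range(k_min, min(improved, group_size) + 1)
    )
    return ways / total_ways


p = prob_at_least(12)
print(f"P(12 or more improvers in the drug group by chance) = {p:.3f}")
```

Under this model the probability comes out a little below one in five, which is in line with the text's point that a 12-versus-8 split is easy to get by chance.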

In this chapter, we will meet the two most common types of formal statistical inference. Section 10.1 concerns confidence intervals for estimating the value of a population parameter. Section 10.2 presents tests of significance, which assess the evidence for a claim about a population. Both types of inference are based on the sampling distributions of statistics. That is, both report probabilities that state what would happen if we used the inference method many times. This kind of probability statement is characteristic of statistical inference. Users of statistics must understand the meaning of the probability statements that appear, for example, on computer output for statistical procedures.

The methods of formal inference require the long-run regular behavior that probability describes. Inference is most reliable when the data are produced by a properly randomized design. When you use statistical inference you are acting as if the data are a random sample or come from a randomized experiment. If this is not true, your conclusions may be open to challenge. Do not be overly impressed by the complex details of formal inference. This elaborate machinery cannot remedy basic flaws in producing the data, such as voluntary response samples and uncontrolled experiments. Use the common sense developed in your study of the first nine chapters of this book, and proceed to formal inference only when you are satisfied that the data deserve such analysis.

The purpose of this chapter is to describe the reasoning used in statistical inference. We will illustrate the reasoning by a few specific inference techniques, but these are oversimplified so that they are not very useful in practice. Later chapters will first show how to modify these techniques to make them practically useful and will then introduce inference methods for use in most of the settings we met in learning to explore data. There are libraries, both of books and of computer software, full of more elaborate statistical techniques. Informed use of any of these methods requires an understanding of the underlying reasoning. A computer will do the arithmetic, but you must still exercise judgment based on understanding.

10.1 ESTIMATING WITH CONFIDENCE

What decides whether you will gain admission to the college or university of your choice? By taking challenging courses (AP courses, for example) and earning high grades in those courses, you certainly improve your chances. Many schools also look closely at your performance on standardized tests, such as the SAT I: Reasoning Test (more commonly referred to as the SAT). This test has two parts: one for verbal reasoning ability and one for mathematical reasoning ability. In 2000, 1,260,278 college-bound seniors took the SAT. Their mean SAT Math score was 514 with a standard deviation of 113. For the SAT Verbal, the mean was 505 with a standard deviation of 111.

In early 2000, University of California President Richard Atkinson stirred considerable controversy when he suggested that the University of California system drop the SAT I: Reasoning Test as a factor in college admissions decisions. He suggested replacing this test with tests that reflect course content better.

EXAMPLE 10.2 SAT MATH SCORES IN CALIFORNIA

Suppose you want to estimate the mean SAT Math score for the more than 350,000 high school seniors in California. Only about 49% of California students take the SAT. These self-selected seniors are planning to attend college and so are not representative of all California seniors. You know better than to make inferences about the population based on any sample data. At considerable effort and expense, you give the test to a simple random sample (SRS) of 500 California high school seniors. The mean for your sample is x̄ = 461. What can you say about the mean score in the population of all 350,000 seniors?

The law of large numbers tells us that the sample mean x̄ from a large SRS will be close to the unknown population mean μ. Because x̄ = 461, we guess that μ is "somewhere around 461." To make "somewhere around 461" more precise, we ask: How would the sample mean vary if we took many samples of 500 seniors from this same population?

Recall the essential facts about the sampling distribution of x̄:

The central limit theorem tells us that the mean x̄ of 500 scores has a distribution that is close to normal.

The mean of this normal sampling distribution is the same as the unknown mean μ of the entire population.

The standard deviation of x̄ for an SRS of 500 students is σ/√500, where σ is the standard deviation of individual SAT Math scores among all California high school seniors.

Let us suppose that we know that the standard deviation of SAT Math scores in the population of all California seniors is σ = 100. The standard deviation of x̄ is then

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{100}{\sqrt{500}} \approx 4.5$$

(It is usually not realistic to assume we know σ. We will see in the next chapter how to proceed when σ is not known. For now, we are more interested in statistical reasoning than in details of realistic methods.)

If we choose many samples of size 500 and find the mean SAT Math score for each sample, we might get mean x̄ = 461 from the first sample, x̄ = 455 from the second, x̄ = 463 from the third sample, and so on. If we collect all these sample means and display their distribution, we get the normal distribution with mean equal to the unknown μ and standard deviation 4.5. Inference about the unknown μ starts from this sampling distribution. Figure 10.1 displays the distribution. The different values of x̄ appear along the axis in the figure, and the normal curve shows how probable these values are.

FIGURE 10.1 The sampling distribution of the mean score x̄ of an SRS of 500 California seniors on the SAT Math quantitative test.
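The idea of "collecting many sample means" can be tried out numerically. The sketch below draws 1000 samples of 500 scores and compares the spread of their means with σ/√n. Two things here are assumptions made only for the simulation: the population mean of 460 (the real μ is unknown) and the choice to model individual scores as normal.

```python
import math
import random
import statistics

SIGMA = 100  # population standard deviation supposed in the text
N = 500      # sample size
MU = 460     # hypothetical value; the real mu is unknown

theoretical_sd = SIGMA / math.sqrt(N)

rng = random.Random(1)
# Each entry is the mean of one simulated SRS of N scores.
sample_means = [
    statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(1000)
]
simulated_sd = statistics.stdev(sample_means)

print(f"sigma / sqrt(n)             = {theoretical_sd:.2f}")
print(f"SD of 1000 simulated means = {simulated_sd:.2f}")
```

The two printed values should be close to each other, and a histogram of `sample_means` would look like the normal curve of Figure 10.1 centered at the assumed μ.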


Statistical confidence

Figure 10.2 is another picture of the same sampling distribution. It illustrates the following line of thought:

The 68-95-99.7 rule says that in 95% of all samples, the mean score x̄ for the sample will be within two standard deviations of the population mean score μ. So the mean of 500 SAT Math scores will be within 9 points of μ in 95% of all samples.

Whenever x̄ is within 9 points of the unknown μ, μ is within 9 points of the observed x̄. This happens in 95% of all samples.

So in 95% of all samples, the unknown μ lies between x̄ − 9 and x̄ + 9. Figure 10.3 displays this fact in picture form.

FIGURE 10.2 In 95% of all samples, x̄ lies within ±9 of the unknown population mean μ. So μ also lies within ±9 of x̄ in those samples.

FIGURE 10.3 To say that x̄ ± 9 is a 95% confidence interval for the population mean μ is to say that in repeated samples, 95% of these intervals capture μ.


This conclusion just restates a fact about the sampling distribution of x̄. The language of statistical inference uses this fact about what would happen in the long run to express our confidence in the results of any one sample.

EXAMPLE 10.3 95% CONFIDENCE

Our sample of 500 California seniors gave x̄ = 461. We say that we are 95% confident that the unknown mean SAT Math score for all California high school seniors lies between

x̄ − 9 = 461 − 9 = 452

and

x̄ + 9 = 461 + 9 = 470

Be sure you understand the grounds for our confidence. There are only two possibilities:

1. The interval between 452 and 470 contains the true μ.

2. Our SRS was one of the few samples for which x̄ is not within 9 points of the true μ. Only 5% of all samples give such inaccurate results.

We cannot know whether our sample is one of the 95% for which the interval x̄ ± 9 catches μ, or if it is one of the unlucky 5%. The statement that we are 95% confident that the unknown μ lies between 452 and 470 is shorthand for saying, "We got these numbers by a method that gives correct results 95% of the time."

The interval of numbers between the values x̄ ± 9 is called a 95% confidence interval for μ. Like most confidence intervals we will meet, this one has the form

estimate ± margin of error

The estimate (x̄ in this case) is our guess for the value of the unknown parameter. The margin of error ±9 shows how accurate we believe our guess is, based on the variability of the estimate. This is a 95% confidence interval because it catches the unknown μ in 95% of all possible samples.
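The arithmetic behind Example 10.3 is small enough to write out. This is a minimal sketch in which the helper name `confidence_interval` is invented for illustration. It follows the text in using two standard deviations of x̄ and rounding the margin to a whole point.

```python
import math


def confidence_interval(estimate, margin_of_error):
    """Return (low, high) for an interval of the form estimate +/- margin."""
    return estimate - margin_of_error, estimate + margin_of_error


sd_of_mean = 100 / math.sqrt(500)  # about 4.47, which the text rounds to 4.5
margin = round(2 * sd_of_mean)     # two standard deviations, rounded to 9
low, high = confidence_interval(461, margin)

print(f"margin of error = {margin}")
print(f"95% confidence interval: {low} to {high}")
```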

CONFIDENCE INTERVAL

A level C confidence interval for a parameter has two parts:

An interval calculated from the data, usually of the form

estimate ± margin of error

A confidence level C, which gives the probability that the interval will capture the true parameter value in repeated samples.


Users can choose the confidence level, most often 90% or higher because we most often want to be quite sure of our conclusions. We will use C to stand for the confidence level in decimal form. For example, a 95% confidence level corresponds to C = 0.95.

Figure 10.3 is one way to picture the idea of a 95% confidence interval. Figure 10.4 illustrates the idea in a different form. Study these figures carefully. If you understand what they say, you have mastered one of the big ideas of statistics. Figure 10.4 shows the result of drawing many SRSs from the same population and calculating a 95% confidence interval from each sample. The center of each interval is at x̄ and therefore varies from sample to sample. The sampling distribution of x̄ appears at the top of the figure to show the long-term pattern of this variation. The 95% confidence intervals from 25 SRSs appear below. The center x̄ of each interval is marked by a dot. The arrows on either side of the dot span the confidence interval. All except one of these 25 intervals cover the true value of μ. In a very large number of samples, 95% of the confidence intervals would contain μ.

FIGURE 10.4 Twenty-five samples from the same population gave these 95% confidence intervals. In the long run, 95% of all samples give an interval that contains the population mean μ.
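The experiment pictured in Figure 10.4 can be imitated by simulation. The sketch below counts how many intervals of the form x̄ ± 9 capture μ, first for 25 samples and then for 1000. The value μ = 460 and the normal model for individual scores are assumptions made only so that there is a known "truth" to check against. In real inference μ is never available.

```python
import random
import statistics


def count_captures(mu, sigma, n, margin, num_samples, seed=0):
    """Count how many intervals x_bar +/- margin contain mu."""
    rng = random.Random(seed)
    captured = 0
    for _ in range(num_samples):
        # Mean of one simulated SRS of n scores.
        x_bar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - margin <= mu <= x_bar + margin:
            captured += 1
    return captured


hits_25 = count_captures(460, 100, 500, 9, 25)
hits_1000 = count_captures(460, 100, 500, 9, 1000, seed=1)

print(f"{hits_25} of 25 intervals captured mu")
print(f"{hits_1000} of 1000 intervals captured mu")
```

With only 25 samples the count can easily be 23, 24, or 25. The long-run proportion settles near 95% only as the number of samples grows, which is the sense in which the confidence level describes the method rather than any single interval.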


EXERCISES

10.1 POLLING WOMEN A New York Times poll on women's issues interviewed 1025 women randomly selected from the United States, excluding Alaska and Hawaii. The poll found that 47% of the women said they do not get enough time for themselves.

(a) The poll announced a margin of error of ±3 percentage points for 95% confidence in its conclusions. What is the 95% confidence interval for the percent of all adult women who think they do not get enough time for themselves?

(b) Explain to someone who knows no statistics why we can't just say that 47% of all adult women do not get enough time for themselves.

(c) Then explain clearly what "95% confidence" means.

10.2 NAEP SCORES Young people have a better chance of full-time employment and good wages if they are good with numbers. How strong are the quantitative skills of young Americans of working age? One source of data is the National Assessment of Educational Progress (NAEP) Young Adult Literacy Assessment Survey, which is based on a nationwide probability sample of households. The NAEP survey includes a short test of quantitative skills, covering mainly basic arithmetic and the ability to apply it to realistic problems. Scores on the test range from 0 to 500. For example, a person who scores 233 can add the amounts of two checks appearing on a bank deposit slip; someone scoring 325 can determine the price of a meal from a menu; a person scoring 375 can transform a price in cents per ounce into dollars per pound.

Suppose that you give the NAEP test to an SRS of 840 people from a large population in which the scores have mean 280 and standard deviation σ = 60. The mean x̄ of the 840 scores will vary if you take repeated samples.

(a) Describe the shape, center, and spread of the sampling distribution of x̄.

(b) Sketch the normal curve that describes how x̄ varies in many samples from this population. Mark its mean and the values one, two, and three standard deviations on either side of the mean.

(c) According to the 68-95-99.7 rule, about 95% of all the values of x̄ fall within ______ of the mean of this curve. What is the missing number? Call it m for "margin of error." Shade the region from the mean minus m to the mean plus m on the axis of your sketch, as in Figure 10.2.

(d) Whenever x̄ falls in the region you shaded, the true value of the population mean, μ = 280, lies in the confidence interval between x̄ − m and x̄ + m. Draw the confidence interval below your sketch for one value of x̄ inside the shaded region and one value of x̄ outside the shaded region. (Use Figure 10.4 as a model for the drawing.)

(e) In what percent of all samples will the true mean μ = 280 be covered by the confidence interval x̄ ± m?

10.3 EXPLAINING CONFIDENCE A student reads that a 95% confidence interval for the mean NAEP quantitative score for men of ages 21 to 25 is 267.8 to 276.2. Asked to explain the meaning of this interval, the student says, "95% of all young men have scores between 267.8 and 276.2." Is the student right? Justify your answer.

10.4 AUTO EMISSIONS Oxides of nitrogen (called NOX for short) emitted by cars and trucks are important contributors to air pollution. The amount of NOX emitted by a particular model varies from vehicle to vehicle. For one light truck model, NOX emissions vary with mean μ that is unknown and standard deviation σ = 0.4 grams per mile. You test an SRS of 50 of these trucks. The sample mean NOX level x̄ estimates the unknown μ. You will get different values of x̄ if you repeat your sampling.

(a) Describe the shape, center, and spread of the sampling distribution of x̄.

(b) Sketch the normal curve for the sampling distribution of x̄. Mark its mean and the values one, two, and three standard deviations on either side of the mean.

(c) According to the 68-95-99.7 rule, about 95% of all values of x̄ lie within a distance m of the mean of the sampling distribution. What is m? Shade the region on the axis of your sketch that is within m of the mean, as in Figure 10.2.

(d) Whenever x̄ falls in the region you shaded, the unknown population mean μ lies in the confidence interval x̄ ± m. For what percent of all possible samples does this happen?

(e) Following the style of Figure 10.4, draw the confidence intervals below your sketch for two values of x̄, one that falls within the shaded region and one that falls outside it.

Confidence interval for a population mean

We can now give the recipe for a level C confidence interval for the mean μ of a population when the data come from an SRS of size n. The construction of the interval depends on the fact that the sampling distribution of the sample mean x̄ is at least approximately normal. This distribution is exactly normal if the population distribution is normal. When the population distribution is not normal, the central limit theorem tells us that the sampling distribution of x̄ will be approximately normal if n is sufficiently large. Be sure to check that these conditions are satisfied before you construct a confidence interval.

CONDITIONS FOR CONSTRUCTING A CONFIDENCE INTERVAL FOR μ

The construction of a confidence interval for a population mean μ is appropriate when

the data come from an SRS from the population of interest, and

the sampling distribution of x̄ is approximately normal.

Our construction of a 95% confidence interval for the mean SAT Math score began by noting that any normal distribution has probability about 0.95 within 2 standard deviations of its mean. To construct a level C confidence interval, we want to catch the central probability C under a normal curve. To do that, we must go out z* standard deviations on either side of the mean. Since any normal distribution can be standardized, we can get the value z* from the standard normal table. Here is an example of how to find z*.


EXAMPLE 10.4 FINDING z*

To find an 80% confidence interval, we must catch the central 80% of the normal sampling distribution of x̄. In catching the central 80% we leave out 20%, or 10% in each tail. So z* is the point with area 0.1 to its right (and 0.9 to its left) under the standard normal curve. Search the body of Table A to find the point with area 0.9 to its left. The closest entry is z* = 1.28. There is area 0.8 under the standard normal curve between -1.28 and 1.28. Figure 10.5 shows how z* is related to areas under the curve.

FIGURE 10.5 The central probability 0.8 under a standard normal curve lies between -1.28 and 1.28. That is, there is area 0.1 to the right of 1.28 under the curve.

Figure 10.6 shows the general situation for any confidence level C. If we catch the central area C, the leftover tail area is 1 - C, or (1 - C)/2 on each side. You can find z* for any C by searching Table A. Here are the results for the most common confidence levels:

Confidence level    Tail area    z*
90%                 0.05         1.645
95%                 0.025        1.960
99%                 0.005        2.576

Notice that for 95% confidence we use z* = 1.960. This is more exact than the approximate value z* = 2 given by the 68-95-99.7 rule. The bottom row in Table C gives the values of z* for many confidence levels C. This row is labeled z*. (You can find Table C inside the rear cover. We will use the other rows of the table in the next chapter.) Values z* that mark off a specified area under the standard normal curve are often called critical values of the distribution.
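Software can stand in for the table lookup. As a sketch, the Python standard library's `statistics.NormalDist` has an `inv_cdf` method that returns the point with a given area to its left. The wrapper name `z_star` is chosen here for illustration.

```python
from statistics import NormalDist


def z_star(confidence_level):
    """Upper (1 - C)/2 critical value of the standard normal distribution."""
    tail_area = (1 - confidence_level) / 2
    # The point with area 1 - tail_area to its left has tail_area to its right.
    return NormalDist().inv_cdf(1 - tail_area)


for c in (0.80, 0.90, 0.95, 0.99):
    print(f"C = {c:.2f}   tail area = {(1 - c) / 2:.3f}   z* = {z_star(c):.3f}")
```

Values read from a printed table can differ from these in the last digit, because the table is searched for the closest entry rather than inverted exactly.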


FIGURE 10.6 In general, the central probability C under a standard normal curve lies between −z* and z*. Because z* has area (1 − C)/2 to its right under the curve, we call it the upper (1 − C)/2 critical value.

CRITICAL VALUES

The number z* with probability p lying to its right under the standard normal curve is called the upper p critical value of the standard normal distribution.

Here's the thinking that leads to the level C confidence interval:

Any normal curve has probability C between the point z* standard deviations below its mean and the point z* standard deviations above its mean.

The standard deviation of the sampling distribution of x̄ is σ/√n, and its mean is the population mean μ. So there is probability C that the observed sample mean x̄ takes a value between

μ − z*σ/√n and μ + z*σ/√n

Whenever this happens, the population mean μ is contained between

x̄ − z*σ/√n and x̄ + z*σ/√n

That is our confidence interval. The estimate of the unknown μ is x̄, and the margin of error is z*σ/√n.
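The step from bounds on x̄ to bounds on μ is a rearrangement of one inequality. Written out as a worked equation in the same symbols, it reads:

```latex
\begin{align*}
\mu - z^*\frac{\sigma}{\sqrt{n}} \;\le\; \bar{x} \;\le\; \mu + z^*\frac{\sigma}{\sqrt{n}}
&\iff -z^*\frac{\sigma}{\sqrt{n}} \;\le\; \bar{x} - \mu \;\le\; z^*\frac{\sigma}{\sqrt{n}} \\
&\iff -z^*\frac{\sigma}{\sqrt{n}} \;\le\; \mu - \bar{x} \;\le\; z^*\frac{\sigma}{\sqrt{n}} \\
&\iff \bar{x} - z^*\frac{\sigma}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{x} + z^*\frac{\sigma}{\sqrt{n}}
\end{align*}
```

Because the first and last statements are equivalent, they are true for exactly the same samples, and so both have probability C.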


CONFIDENCE INTERVAL FOR A POPULATION MEAN

Choose an SRS of size n from a population having unknown mean μ and known standard deviation σ. A level C confidence interval for μ is

$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$

Here z* is the value with area C between −z* and z* under the standard normal curve. This interval is exact when the population distribution is normal and is approximately correct for large n in other cases.
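The recipe in the box translates directly into a few lines of code. The function below is a sketch with an invented name, `z_interval`. It assumes σ is known, as the box requires, and takes z* from the standard library rather than from Table C.

```python
import math
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence_level):
    """Level C confidence interval x_bar +/- z* sigma / sqrt(n), sigma known."""
    z = NormalDist().inv_cdf((1 + confidence_level) / 2)  # critical value z*
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# SAT Math example: x_bar = 461, sigma = 100, n = 500, 95% confidence.
low, high = z_interval(461, 100, 500, 0.95)
print(f"{low:.1f} to {high:.1f}")
```

For the SAT data this gives an interval slightly narrower than 452 to 470, because z* = 1.960 replaces the rounded value 2 from the 68-95-99.7 rule.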

EXAMPLE 10.5 VIDEO SCREEN TENSION

A manufacturer of high-resolution video terminals must control the tension on the mesh of fine wires that lies behind the surface of the viewing screen. Too much tension will tear the mesh and too little will allow wrinkles. The tension is measured by an electrical device with output readings in millivolts (mV). Some variation is inherent in the production process. Careful study has shown that when the process is operating properly, the standard deviation of the tension readings is σ = 43 mV. Here are the tension readings from an SRS of 20 screens from a single day's production.

269.5 297.0 269.6 283.3 304.8 280.4 233.5 257.4 317.5 327.4 264.7 307.7 310.0 343.3 328.1 342.6 338.8 340.1 374.6 336.1

Construct a 90% confidence interval for the mean tension μ of all the screens produced on this day.

Step 1: Identify the population of interest and the parameter you want to draw conclusions about. The population of interest is all of the video terminals produced on the day in question. We want to estimate μ, the mean tension for all of these screens.

Step 2: Choose the appropriate inference procedure. Verify the conditions for using the selected procedure. Since we know σ, we should use the confidence interval for a population mean that was just introduced to estimate μ. Now we must check that the two required conditions, (1) SRS from the population of interest and (2) sampling distribution of x̄ approximately normal, are met.

The data come from an SRS of 20 screens from the population of all screens produced that day.

Is the sampling distribution of x̄ approximately normal? Past experience suggests that the tension readings of screens produced on a single day follow a normal distribution quite closely. If the population distribution is normal, the sampling distribution of x̄ will be normal. Let's examine the sample data.

FIGURE 10.7 A stemplot (a) and a normal probability plot (b) of the video screen tension readings for Example 10.5.

A stemplot of the tension readings (Figure 10.7(a)) shows no outliers or strong skewness. The normal probability plot in Figure 10.7(b) tells us that the sample data are approximately normally distributed. These data give us no reason to doubt the normality of the population from which they came.

Step 3: If the conditions are met, carry out the inference procedure. You can check that the mean tension reading for the 20 screens in our sample is x̄ = 306.3 mV. The confidence interval formula is x̄ ± z*σ/√n. For a 90% confidence level, the critical value is z* = 1.645. So the 90% confidence interval for μ is

$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = 306.3 \pm 1.645\,\frac{43}{\sqrt{20}} = 306.3 \pm 15.8 = (290.5,\ 322.1)$$

Step 4: Interpret your results in the context of the problem. We are 90% confident that the true mean tension in the entire batch of video terminals produced that day is between 290.5 and 322.1 mV.
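Step 3 can be checked from the raw readings. The sketch below recomputes the sample mean and the 90% interval for the 20 tension values listed in the example. The variable names are illustrative, and z* is taken from the standard library instead of Table C.

```python
import math
from statistics import NormalDist, fmean

# Tension readings (mV) for the 20 screens in Example 10.5.
readings = [
    269.5, 297.0, 269.6, 283.3, 304.8, 280.4, 233.5, 257.4, 317.5, 327.4,
    264.7, 307.7, 310.0, 343.3, 328.1, 342.6, 338.8, 340.1, 374.6, 336.1,
]
sigma = 43.0  # known process standard deviation (mV)

x_bar = fmean(readings)
z = NormalDist().inv_cdf(0.95)  # upper 0.05 critical value, for 90% confidence
margin = z * sigma / math.sqrt(len(readings))

print(f"x_bar  = {x_bar:.1f} mV")
print(f"margin = {margin:.1f} mV")
print(f"90% CI = ({x_bar - margin:.1f}, {x_bar + margin:.1f}) mV")
```

Rounded to one decimal place, the results agree with the interval reported in Step 4.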

Suppose that a single computer screen had a tension reading of 306.3 mV, the same value as the sample mean in Example 10.5. Repeating the calculation with n = 1 shows that the 90% confidence interval based on a single measurement is

$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = 306.3 \pm 1.645\,\frac{43}{\sqrt{1}} = 306.3 \pm 70.7 = (235.6,\ 377.0)$$

The mean of twenty measurements gives a smaller margin of error and therefore a shorter interval than a single measurement. Figure 10.8 illustrates the gain from using 20 observations.

FIGURE 10.8 Confidence intervals for n = 20 and n = 1 for Example 10.5. Larger samples give shorter intervals.


We will use the four-step process of Example 10.5 throughout our study of statistical inference. You can think of this general structure as your Inference Toolbox. Specific inference procedures (tools) will be added to your toolbox for use in a variety of settings. Examples that use the Inference Toolbox will be

INFERENCE TOOLBOX Confidence intervals

To construct a confidence interval:

Step 1: Identify the population of interest and the parameter you want to draw conclusions about.

Step 2: Choose the appropriate inference procedure. Verify the conditions for using the selected procedure.

Step 3: If the conditions are met, carry out the inference procedure.

CI = estimate ± margin of error

Step 4: Interpret your results in the context of the problem.

The form of confidence intervals for the population mean μ rests on the fact that the statistic x̄ used to estimate μ has a normal distribution. Because many sample statistics have normal distributions (at least approximately), it is useful to notice that the confidence interval has the form

estimate ± z* σ_estimate

The estimate based on the sample is the center of the confidence interval. The margin of error is z* σ_estimate. The desired confidence level determines z* from Table A. The standard deviation of the estimate, σ_estimate, depends on the particular estimate we use. When the estimate is x̄ from an SRS, the standard deviation of the estimate is σ/√n.

EXERCISES

10.5 ANALYZING PHARMACEUTICALS A manufacturer of pharmaceutical products analyzes a specimen from each batch of a product to verify the concentration of the active ingredient. The chemical analysis is not perfectly precise. Repeated measurements on the same specimen give slightly different results. The results of repeated measurements follow a normal distribution quite closely. The analysis procedure has no bias, so the mean μ of the population of all measurements is the true concentration in the specimen. The standard deviation of this distribution is known to be σ = 0.0068 grams per liter. The laboratory analyzes each specimen three times and reports the mean result.


Three analyses of one specimen give concentrations

Construct a 99% confidence interval for the true concentration μ. Use the Inference Toolbox as a guide.

10.6 SURVEYING HOTEL MANAGERS A study of the career paths of hotel general managers sent questionnaires to an SRS of 160 hotels belonging to major U.S. hotel chains. There were 114 responses. The average time these 114 general managers had spent with their current company was 11.78 years. Give a 99% confidence interval for the mean number of years general managers of major-chain hotels have spent with their current company. (Take it as known that the standard deviation of time with the company for all general managers is 3.2 years.) Use the Inference Toolbox as a guide.

10.7 ENGINE CRANKSHAFTS Here are measurements (in millimeters) of a critical dimension on a sample of auto engine crankshafts:

The data come from a production process that is known to have standard deviation σ = 0.060 mm. The process mean is supposed to be μ = 224 mm but can drift away from this target during production.

(a) We expect the distribution of the dimension to be close to normal. Make a plot of these data and describe the shape of the distribution.

(b) Give a 95% confidence interval for the process mean at the time these crankshafts were produced.

How confidence intervals behave

The confidence interval $\bar{x} \pm z^*\sigma/\sqrt{n}$ for the mean of a normal population illustrates several important properties that are shared by all confidence intervals in common use. The user chooses the confidence level, and the margin of error follows from this choice. We would like high confidence and also a small margin of error. High confidence says that our method almost always gives correct answers. A small margin of error says that we have pinned down the parameter quite precisely. The margin of error is

$$\text{margin of error} = z^* \frac{\sigma}{\sqrt{n}}$$

This expression has $z^*$ and $\sigma$ in the numerator and $\sqrt{n}$ in the denominator. So the margin of error gets smaller when

$z^*$ gets smaller. Smaller $z^*$ is the same as smaller confidence level C (look at Figure 10.6 again). There is a trade-off between the confidence level and the margin of error. To obtain a smaller margin of error from the same data, you must be willing to accept lower confidence.

$\sigma$ gets smaller. The standard deviation $\sigma$ measures the variation in the population. You can think of the variation among individuals in the population as noise that obscures the average value $\mu$. It is easier to pin down $\mu$ when $\sigma$ is small.

n gets larger. Increasing the sample size n reduces the margin of error for any fixed confidence level. Because n appears under a square root sign, we must take four times as many observations in order to cut the margin of error in half.
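The three effects listed above can be checked numerically. The following is a minimal sketch, not part of the text's Inference Toolbox: `margin_of_error` is a hypothetical helper, the baseline uses the screen tension figures (standard deviation 43, 20 measurements) with the usual 90% critical value 1.645, and the "smaller standard deviation" of 20 is an invented value chosen only for contrast.

```python
from math import sqrt


def margin_of_error(z_star, sigma, n):
    """Margin of error z* * sigma / sqrt(n) for a mean with known sigma."""
    return z_star * sigma / sqrt(n)


# Baseline: 90% confidence (z* = 1.645), sigma = 43, n = 20.
base = margin_of_error(1.645, 43, 20)

higher_conf = margin_of_error(2.575, 43, 20)    # 99% confidence: larger z*
smaller_sigma = margin_of_error(1.645, 20, 20)  # hypothetical, less variable population
four_times_n = margin_of_error(1.645, 43, 80)   # four times as many observations

print(f"baseline        {base:.1f}")
print(f"99% confidence  {higher_conf:.1f}")
print(f"sigma = 20      {smaller_sigma:.1f}")
print(f"n = 80          {four_times_n:.1f}")
```

Quadrupling n from 20 to 80 cuts the margin exactly in half, because n enters only through its square root.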

EXAMPLE 10.6 CHANGING THE CONFIDENCE LEVEL

Suppose that the manufacturer in Example 10.5 wants 99% confidence rather than 90%. Table C gives the critical value for 99% confidence as $z^*$ = 2.575. The 99% confidence interval for $\mu$ based on an SRS of 20 video monitors with mean $\bar{x}$ = 306.3 is

$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = 306.3 \pm 2.575 \frac{43}{\sqrt{20}} = 306.3 \pm 24.8 = (281.5,\ 331.1)$$

Demanding 99% confidence instead of 90% confidence has increased the margin of error from 15.8 to 24.8. Figure 10.9 compares these two measurements.

FIGURE 10.9 90% and 99% confidence intervals for Example 10.6. Higher confidence requires a longer interval.

EXERCISES

10.8 CORN YIELD Crop researchers plant 15 plots with a new variety of corn. The yields in bushels per acre are

Assume that $\sigma$ = 10 bushels per acre.

(a) Find the 90% confidence interval for the mean yield $\mu$ for this variety of corn. Use your Inference Toolbox.

(b) Find the 95% confidence interval.

(c) Find the 99% confidence interval.

(d) How do the margins of error in (a), (b), and (c) change as the confidence level increases?


10.9 MORE CORN Suppose that the crop researchers in Exercise 10.8 obtained the same value of $\bar{x}$ from a sample of 60 plots rather than 15.

(a) Compute the 95% confidence interval for the mean yield $\mu$.

(b) Is the margin of error larger or smaller than the margin of error found for the sample of 15 plots in Exercise 10.8? Explain in plain language why the change occurs.

(c) Will the 90% and 99% intervals for a sample of size 60 be wider or narrower than those for n = 15? (You need not actually calculate these intervals.)

10.10 CONFIDENCE LEVEL AND INTERVAL LENGTH Examples 10.5 and 10.6 give confidence intervals for the screen tension $\mu$ based on 20 measurements with $\bar{x}$ = 306.3 and $\sigma$ = 43. The 99% confidence interval is 281.5 to 331.1 and the 90% confidence interval is 290.5 to 322.1.

(a) Find the 80% confidence interval for $\mu$.

(b) Find the 99.9% confidence interval for $\mu$.

(c) Make a sketch like Figure 10.9 to compare all four intervals. How does increasing the confidence level affect the length of the confidence interval?

10.11 SAMPLE SIZE AND MARGIN OF ERROR Find the margin of error for 90% confidence in Example 10.5 if the manufacturer measures the tension of 80 video monitors. Check that your result is half as large as the margin of error based on 20 measurements in Example 10.5.

Choosing the sample size

A wise user of statistics never plans data collection without planning the inference at the same time. You can arrange to have both high confidence and a small margin of error by taking enough observations. The margin of error of the confidence interval for the mean of a normally distributed population is $m = z^*\sigma/\sqrt{n}$. To obtain a desired margin of error m, substitute the value of $z^*$ for your desired confidence level, set the expression for m less than or equal to the specified margin of error, and solve the inequality for n. The procedure is best illustrated with an example.

EXAMPLE 10.7 DETERMINING SAMPLE SIZE

Company management wants a report of the mean screen tension for the day's production accurate to within ±5 mV with 95% confidence. How large a sample of video monitors must be measured to comply with this request?

For 95% confidence, Table C gives $z^*$ = 1.96. We know that $\sigma$ = 43. Set the margin of error to be at most 5:


$$1.96 \frac{43}{\sqrt{n}} \le 5$$

$$\sqrt{n} \ge \frac{(1.96)(43)}{5}$$

$$\sqrt{n} \ge 16.856$$

$$n \ge 284.125$$

so take n = 285.

Because n is a whole number, the company must measure the tension of 285 video screens to meet management's demand. On learning the cost of this many measurements, management may reconsider this request!
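The same calculation can be written as a short Python sketch. The helper name `required_sample_size` is hypothetical; it simply solves the margin-of-error inequality for n and rounds up to the next whole number, using the values from Example 10.7.

```python
from math import ceil


def required_sample_size(z_star, sigma, m):
    """Smallest whole n for which z* * sigma / sqrt(n) is at most m."""
    # Solving z* * sigma / sqrt(n) <= m gives n >= (z* * sigma / m) ** 2.
    return ceil((z_star * sigma / m) ** 2)


# Example 10.7: z* = 1.96 (95% confidence), sigma = 43, margin of error 5.
n = required_sample_size(1.96, 43, 5)
print(n)  # 285
```

Rounding up rather than to the nearest integer matters here: 284 screens would leave the margin of error slightly above 5.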

Here is the principle:

SAMPLE SIZE FOR DESIRED MARGIN OF ERROR

To determine the sample size n that will yield a confidence interval for a population mean with a specified margin of error m, set the expression for the margin of error to be less than or equal to m and solve for n:

$$z^* \frac{\sigma}{\sqrt{n}} \le m$$

In practice, taking observations costs time and money. The required sample size may be impossibly expensive. Do notice once again that it is the size of the sample that determines the margin of error. The size of the population (as long as the population is much larger than the sample) does not influence the sample size we need.

EXERCISES

10.12 A BALANCED SCALE? To assess the accuracy of a laboratory scale, a standard weight known to weigh 10 grams is weighed repeatedly. The scale readings are normally distributed with unknown mean (this mean is 10 grams if the scale has no bias). The standard deviation of the scale readings is known to be 0.0002 gram.

(a) The weight is weighed five times. The mean result is 10.0023 grams. Give a 98% confidence interval for the mean of repeated measurements of the weight.

(b) How many measurements must be averaged to get a margin of error of ±0.0001 with 98% confidence?

10.13 SURVEYING HOTEL MANAGERS, II How large a sample of the hotel managers in Exercise 10.6 (page 549) would be needed to estimate the mean $\mu$ within ±1 year with 99% confidence?

10.14 ENGINE CRANKSHAFTS, II How large a sample of the crankshafts in Exercise 10.7 (page 549) would be needed to estimate the mean $\mu$ within ±0.020 mm with 95% confidence?


Some cautions

Any formula for inference is correct only in specific circumstances. If statistical procedures carried warning labels like those on drugs, most inference methods would have long labels indeed. Our handy formula $\bar{x} \pm z^*\sigma/\sqrt{n}$ for estimating a normal mean comes with the following list of warnings for the user:

The data must be an SRS from the population. We are completely safe if we actually carried out the random selection of an SRS. We are not in great danger if the data can plausibly be thought of as observations taken at random from a population. That is the case in Exercise 10.5 (page 548), where we have in mind the population resulting from a very large number of repeated analyses of the same specimen.

The formula is not correct for probability sampling designs more complex than an SRS. Correct methods for other designs are available. We will not discuss confidence intervals based on multistage or stratified samples. If you plan such samples, be sure that you (or your statistical consultant) know how to carry out the inference you desire.

There is no correct method for inference from data haphazardly collected with bias of unknown size. Fancy formulas cannot rescue badly produced data.

Because $\bar{x}$ is strongly influenced by a few extreme observations, outliers can have a large effect on the confidence interval. You should search for outliers and try to correct them or justify their removal before computing the interval. If the outliers cannot be removed, ask your statistical consultant about procedures that are not sensitive to outliers.

If the sample size is small and the population is not normal, the true confidence level will be different from the value C used in computing the interval. Examine your data carefully for skewness and other signs of nonnormality. The interval relies only on the distribution of $\bar{x}$, which even for quite small sample sizes is much closer to normal than the individual observations. When $n \ge 15$, the confidence level is not greatly disturbed by nonnormal populations unless extreme outliers or quite strong skewness are present. We will discuss this issue in more detail in the next chapter.

You must know the standard deviation $\sigma$ of the population. This unrealistic requirement renders the interval $\bar{x} \pm z^*\sigma/\sqrt{n}$ of little use in statistical practice. We will learn in the next chapter what to do when $\sigma$ is unknown. However, if the sample is large, the sample standard deviation s will be close to the unknown $\sigma$. Then $\bar{x} \pm z^* s/\sqrt{n}$ is an approximate confidence interval for $\mu$.

The most important caution concerning confidence intervals is a consequence of the first of these warnings. The margin of error in a confidence interval covers only random sampling errors. The margin of error is obtained from the sampling distribution and indicates how much error can be expected because of chance variation in randomized data production. Practical difficulties, such as undercoverage and nonresponse in a sample survey, can cause additional errors that may be larger than the random sampling error. Remember this unpleasant fact when reading the results of an opinion poll or other sample survey. The practical conduct of the survey influences the trustworthiness of its results in ways that are not included in the announced margin of error.

Every inference procedure that we will meet has its own list of warnings. Because many of the warnings are similar to those above, we will not print the full warning label each time. It is easy to state (from the mathematics of probability) conditions under which a method of inference is exactly correct. These conditions are never fully met in practice. For example, no population is exactly normal. Deciding when a statistical procedure should be used in practice often requires judgment assisted by exploratory analysis of the data.

Finally, you should understand what statistical confidence does not say. We are 95% confident that the mean SAT Math score for all California high school seniors lies between 452 and 470. That is, these numbers were calculated by a method that gives correct results in 95% of all possible samples. We cannot say that the probability is 95% that the true mean falls between 452 and 470. No randomness remains after we draw one particular sample and get from it one particular interval. The true mean either is or is not between 452 and 470. The probability calculations of standard statistical inference describe how often the method gives correct answers.
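The phrase "a method that gives correct results in 95% of all possible samples" can be illustrated by simulation. The sketch below is an assumption-laden toy, not a calculation from the text: it invents a normal population with a true mean of 300 and standard deviation 43, draws many samples of size 20, and counts how many of the resulting 95% intervals contain the true mean. In real problems the true mean is unknown, which is exactly why only the long-run behavior of the method can be described.

```python
import random
from math import sqrt
from statistics import mean

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population; in practice MU is unknown to the person sampling.
MU, SIGMA, N = 300.0, 43.0, 20
Z_STAR = 1.96   # critical value for 95% confidence
REPS = 2000     # number of repeated samples

margin = Z_STAR * SIGMA / sqrt(N)
hits = 0
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    xbar = mean(sample)
    # Any one interval either contains MU or it does not.
    if xbar - margin <= MU <= xbar + margin:
        hits += 1

coverage = hits / REPS
print(f"{coverage:.3f} of the {REPS} intervals contain the true mean")
```

The printed proportion should land near 0.95. That figure describes the procedure across repeated samples, not the chance that any single computed interval is right.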

EXERCISES

10.15 A TALK SHOW OPINION POLL A radio talk show invites listeners to enter a dispute about a proposed pay increase for city council members. "What yearly pay do you think council members should get? Call us with your number." In all, 958 people call. The mean pay they suggest is $\bar{x} = \$8740$ per year, and the standard deviation of the responses is $s = \$1125$. For a large sample such as this, s is very close to the unknown population $\sigma$. The station calculates the 95% confidence interval for the mean pay $\mu$ that all citizens would propose for council members to be \$8669 to \$8811.

(a) Is the station's calculation correct?

(b) Does their conclusion describe the population of all the city's citizens? Explain your answer.

10.16 THE 2000 PRESIDENTIAL ELECTION A closely contested presidential election pitted George W. Bush against Al Gore in 2000. A poll taken immediately before the 2000 election showed that 51% of the sample intended to vote for Gore. The polling organization announced that they were 95% confident that the sample result was within ±2 points of the true percent of all voters who favored Gore.

(a) Explain in plain language to someone who knows no statistics what "95% confident" means in this announcement.

(b) The poll showed Gore leading. Yet the polling organization said the election was too close to call. Explain why.

(c) On hearing of the poll, a nervous politician asked, "What is the probability that over half the voters prefer Gore?" A statistician replied that this question can't be answered from the poll results, and that it doesn't even make sense to talk about such a probability. Explain why.

10.17 PRAYER IN THE SCHOOLS The New York Times/CBS News Poll asked the question, "Do you favor an amendment to the Constitution that would permit organized prayer in public schools?" Sixty-six percent of the sample answered "Yes." The article describing the poll says that it "is based on telephone interviews conducted from Sept. 13 to Sept. 18 with 1,664 adults around the United States, excluding Alaska and Hawaii. . . . the telephone numbers were formed by random digits, thus permitting access to both listed and unlisted residential numbers."

(a) The article gives the margin of error as 3 percentage points. Make a confidence statement about the percent of all adults who favor a school prayer amendment.

(b) The news article goes on to say: "The theoretical errors do not take into account a margin of additional error resulting from the various practical difficulties in taking any survey of public opinion." List some of the "practical difficulties" that may cause errors in addition to the ±3% margin of error. Pay particular attention to the news article's description of the sampling method.

10.18 95% CONFIDENCE A student reads that a 95% confidence interval for the mean SAT Math score of California high school seniors is 452 to 470. Asked to explain the meaning of this interval, the student says, "95% of California high school seniors have SAT Math scores between 452 and 470." Is the student right? Justify your answer.

SUMMARY

A confidence interval uses sample data to estimate an unknown population parameter with an indication of how accurate the estimate is and of how confident we are that the result is correct.

Any confidence interval has two parts: an interval computed from the data and a confidence level. The interval often has the form

estimate ± margin of error

The confidence level states the probability that the method will give a correct answer. That is, if you use 95% confidence intervals often, in the long run 95% of your intervals will contain the true parameter value. You do not know whether a 95% confidence interval calculated from a particular set of data contains the true parameter value.

A level C confidence interval for the mean $\mu$ of a normal population with known standard deviation $\sigma$, based on an SRS of size n, is given by

$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$

The critical value $z^*$ is chosen so that the standard normal curve has area C between $-z^*$ and $z^*$. Because of the central limit theorem, this interval is approximately correct for large samples when the population is not normal.


Other things being equal, the margin of error of a confidence interval gets smaller as

the confidence level C decreases

the population standard deviation $\sigma$ decreases

the sample size n increases

The sample size required to obtain a confidence interval with specified margin of error m for a normal mean is found by setting

$$z^* \frac{\sigma}{\sqrt{n}} \le m$$

and solving for n, where $z^*$ is the critical value for the desired level of confidence. Always round n up when you use this formula.

A specific confidence interval recipe is correct only under specific conditions. The most important conditions concern the method used to produce the data. Other factors such as the form of the population distribution may also be important.

Use the Inference Toolbox (page 548) as a guide when you construct a confidence interval.

SECTION 10.1 EXERCISES

10.19 WHO SHOULD GET WELFARE? A news article on a Gallup Poll noted that "28 percent of the 1548 adults questioned felt that those who were able to work should be taken off welfare." The article also said, "The margin of error for a sample size of 1548 is plus or minus three percentage points." Opinion polls usually announce margins of error for 95% confidence. Using this fact, explain to someone who knows no statistics what "margin of error plus or minus three percentage points" means.

10.20 HOTEL COMPUTER SYSTEMS How satisfied are hotel managers with the computer systems their hotels use? A survey was sent to 560 managers in hotels of size 200 to 500 rooms in Chicago and Detroit. In all, 135 managers returned the survey. Two questions concerned their degree of satisfaction with the ease of use of their computer systems and with the level of computer training they had received. The managers responded using a seven-point scale, with 1 meaning "not satisfied," 4 meaning "moderately satisfied," and 7 meaning "very satisfied."

(a) What do you think is the population for this study? There are some major shortcomings in the data production. What are they? These shortcomings reduce the value of the formal inference you are about to do.

(b) The measurements of satisfaction are certainly not normally distributed, because they take only whole-number values from 1 to 7. Nonetheless, the use of confidence intervals based on the normal distribution is justified for this study. Why?


(c) The mean response for satisfaction with ease of use was $\bar{x}$ = 5.396. Give a 95% confidence interval for the mean in the entire population. (Assume that the population standard deviation is $\sigma$ = 1.75.) Use your Inference Toolbox.

(d) For satisfaction with training, the mean response was $\bar{x}$ = 4.398. Taking $\sigma$ = 1.75, give a 99% confidence interval for the population mean.

10.21 A NEWSPAPER POLL A New York Times poll on women's issues interviewed 1025 women and 472 men randomly selected from the United States, excluding Alaska and Hawaii. The poll announced a margin of error of ±3 percentage points for 95% confidence in conclusions about women. The margin of error for results concerning men was ±4 percentage points. Why is this larger than the margin of error for women?

10.22 HEALING OF SKIN WOUNDS Biologists studying the healing of skin wounds measured the rate at which new cells closed a razor cut made in the skin of an anesthetized newt. Here are data from 18 newts, measured in micrometers (millionths of a meter) per hour:

(a) Make a stemplot of the healing rates (split the stems). It is difficult to assess normality from 18 observations, but look for outliers or extreme skewness. Now make a normal probability plot. What do you find?

(b) Scientists usually assume that animal subjects are SRSs from their species or genetic type. Treat these newts as an SRS and suppose you know that the standard deviation of healing rates for this species of newt is 8 micrometers per hour. Give a 90% confidence interval for the mean healing rate for the species.

(c) A friend who knows almost no statistics follows the formula $\bar{x} \pm z^*\sigma/\sqrt{n}$ in a biology lab manual to get a 95% confidence interval for the mean. Is her interval wider or narrower than yours? Explain to her why it makes sense that higher confidence changes the length of the interval.

10.23 MORE NEWTS! How large a sample would enable you to estimate the mean healing rate of skin wounds in newts (see Exercise 10.22) within a margin of error of 1 micrometer per hour with 90% confidence?

10.24 DETERMINING SAMPLE SIZE Researchers planning a study of the reading ability of third-grade children want to obtain a 95% confidence interval for the population mean score on a reading test, with margin of error no greater than 3 points. They carry out a small pilot study to estimate the variability of test scores. The sample standard deviation is s = 12 points in the pilot study, so in preliminary calculations the researchers take the population standard deviation to be $\sigma$ = 12.

(a) The study budget will allow as many as 100 students. Calculate the margin of error of the 95% confidence interval for the population mean based on n = 100.

(b) There are many other demands on the research budget. If all of these demands were met, there would be funds to measure only 10 children. What is the margin of error of the confidence interval based on n = 10 measurements?


(c) Find the smallest value of n that would satisfy the goal of a 95% confidence interval with margin of error 3 or less. Is this sample size within the limits of the budget?

10.25 HOW THE POLL WAS CONDUCTED The New York Times includes a box entitled "How the poll was conducted" in news articles about its own opinion polls. Here are quotations from one such box (March 26, 1995). The box also announced a margin of error of ±3%, with 95% confidence.

The latest New York Times/CBS News poll is based on telephone interviews conducted March 9 through 12 with 1,156 adults around the United States, excluding Alaska and Hawaii. [The box then describes random digit dialing, the method used to select the sample.]

In addition to sampling error, the practical difficulties of conducting any survey of public opinion may introduce other sources of error into the poll. Variations in question wording or in the order of questions, for instance, can lead to somewhat different results.

(a) This account mentions several sources of possible errors in the poll's results. List these sources.

(b) Which of the sources of error you listed in (a) are covered by the announced margin of error?

10.26 CALCULATING CONFIDENCE INTERVALS

(a) Use your TI-83/89 to find the confidence interval for the mean rate of healing for newts in Exercise 10.22. Follow the Technology Toolbox below.

(b) If you have summary statistics but not raw data, you would select "Stats" as your input method and then provide the sample mean $\bar{x}$, the population standard deviation $\sigma$, and the number n of observations. Use your calculator in this way to find the confidence interval for the mean number of years hotel managers have spent with their current company in Exercise 10.6 (page 549).

Caution: Calculating the confidence interval is only one part of the inference process. Follow the steps in your Inference Toolbox.

TECHNOLOGY TOOLBOX Calculator confidence intervals

You can use your TI-83/89 to construct confidence intervals, using either data stored in a list or summary statistics. In Exercise 10.5, for example, we would begin by entering the three specimen concentrations into L1/list1.

TI-83: Press STAT, choose TESTS, and then choose 7:ZInterval.... Adjust your settings as shown. Then choose "Calculate."

TI-89: Press 2nd F2 (F7), choose 1:ZInterval.... Choose Input Method = Data. Adjust your settings as shown. Press ENTER.

10.2 TESTS OF SIGNIFICANCE

Confidence intervals are one of the two most common types of statistical inference. Use a confidence interval when your goal is to estimate a population parameter. The second common type of inference, called tests of significance, has a different goal: to assess the evidence provided by data about some claim concerning a population. Here is the reasoning of statistical tests in a nutshell.

EXAMPLE 10.8 I'M A GREAT FREE-THROW SHOOTER

I claim that I make 80% of my basketball free throws. To test my claim, you ask me to shoot 20 free throws. I make only 8 of the 20. "Aha!" you say. "Someone who makes 80% of his free throws would almost never make only 8 out of 20. So I don't believe your claim."

Your reasoning is based on asking what would happen if my claim were true and we repeated the sample of 20 free throws many times: I would almost never make as few as 8. This outcome is so unlikely that it gives strong evidence that my claim is not true.

You can say how strong the evidence against my claim is by giving the probability that I would make as few as 8 out of 20 free throws if I really make 80% in the long run. This probability is 0.0001. I would make as few as 8 of 20 only once in 10,000 tries in the long run if my claim to make 80% is true. The small probability convinces you that my claim is false.
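The probability quoted here can be checked with a short Python sketch. It assumes, as a modeling choice not spelled out in the example, that the 20 shots are independent and each is made with probability 0.80, so the number of makes is binomial. The helper `binom_cdf` is hypothetical and uses only the standard library.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) when X counts successes in n independent trials with chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Chance of making 8 or fewer of 20 free throws if the 80% claim is true.
p_value = binom_cdf(8, 20, 0.80)
print(round(p_value, 4))  # 0.0001
```

Under that binomial assumption the result agrees with the 0.0001 stated in the example.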
