Sampling Methods & Central Limit Theorem

8- 1

Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.

8- 2

When you have completed this chapter, you will be able to:

1. Explain under what conditions sampling is the proper way to learn something about a population.

2. Describe methods for selecting a sample.

3. Define and construct a sampling distribution of the sample mean.

4. Explain the central limit theorem.

5. Use the central limit theorem to find probabilities of selecting possible sample means from a specified population.

8- 3

We use sample information to make decisions or inferences about the population.

Two KEY steps:

1. Choice of a proper method for selecting sample data

2. Proper analysis of the sample data (more later)


8- 5

KEY 1.

If the proper choice of method for selecting the sample is NOT MADE

… the SAMPLE will not be truly representative of the TOTAL Population!

… and wrong conclusions can be drawn!

8- 6

Why Sample the Population?

Because…

…checking all items in the population is physically impossible and would also be too time-consuming

…studying all the items in a population would NOT be cost effective

…the sample results are usually adequate

…certain tests are destructive in nature

8- 7

Sampling Techniques

Probability Sampling
Each data unit in the population has a known likelihood of being included in the sample

Non-Probability Sampling
Does not involve random selection; inclusion of an item is based on convenience

with Replacement
Each data unit in the population is allowed to appear in the sample more than once

without Replacement
Each data unit in the population is allowed to appear in the sample no more than once
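The difference between sampling with and without replacement can be sketched in a few lines of Python. The population labels and the seed below are illustrative assumptions, not part of the slides.

```python
import random

# Hypothetical population of data units (labels are illustrative only).
population = ["A", "B", "C", "D", "E"]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# With replacement: a unit may appear in the sample more than once.
with_replacement = rng.choices(population, k=3)

# Without replacement: a unit may appear in the sample no more than once.
without_replacement = rng.sample(population, k=3)

print(with_replacement)
print(without_replacement)
```

Both calls give every unit the same chance of selection on each draw, so both are forms of probability sampling.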

8- 8

Methods

Simple Random
…each item (person) in the population has an equal chance of being included

Systematic Random
…items (people) of the population are arranged in some order. A random starting point is selected, and then every kth member of the population is selected for the sample

Stratified Random
…a population is first divided into subgroups, called strata, and a sample is selected from each stratum

Cluster
…a population is first divided into primary units, and samples are selected from the primary units
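The four methods can be sketched as follows. The population, the stratum and cluster labels, and all function names are hypothetical. The cluster function assumes one common reading of the method, in which a few primary units are chosen at random before sampling within them.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 20 people, each tagged with a stratum and a cluster.
population = [
    {"id": i, "stratum": "junior" if i < 12 else "senior", "cluster": i // 5}
    for i in range(20)
]

def simple_random(pop, n):
    """Every item has an equal chance of being included."""
    return rng.sample(pop, n)

def systematic_random(pop, k):
    """Random starting point among the first k items, then every kth item."""
    start = rng.randrange(k)
    return pop[start::k]

def stratified_random(pop, n_per_stratum):
    """Divide the population into strata, then sample from each stratum."""
    strata = {}
    for item in pop:
        strata.setdefault(item["stratum"], []).append(item)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(pop, n_clusters, n_per_cluster):
    """Divide into primary units, pick some units at random, sample within them."""
    clusters = {}
    for item in pop:
        clusters.setdefault(item["cluster"], []).append(item)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for c in chosen:
        sample.extend(rng.sample(clusters[c], n_per_cluster))
    return sample
```

For example, `systematic_random(population, 5)` returns four people whose positions are five apart, and `stratified_random(population, 2)` returns two juniors and two seniors.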

8- 9

Terminology

“Sampling error”
… is the difference between a sample statistic and its corresponding population parameter

“Sampling distribution of the sample mean”
… is a probability distribution consisting of all possible sample means of a given sample size selected from a population

8- 10

Example

The law firm of Hoya and Associates has five partners. At their weekly partners meeting each reported the number of hours they billed their clients last week:

Partner       Hours
Dunn          22
Hardy         26
Kiers         30
Malinowski    26
Tillman       22

If two partners are selected randomly…how many different samples are possible?

8- 11

If two partners are selected randomly…how many different samples are possible?

5 objects …taken 2 at a time

Using 5C2 …

…for a Total of 10 Samples!

8- 12

If two partners are selected randomly…how many different samples are possible?

$_5C_2 = \frac{5!}{2!\,(5 - 2)!} = 10$ Samples
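As a quick check of the count, here is a minimal Python sketch that computes 5C2 and lists the pairs. The variable names are illustrative.

```python
import math
from itertools import combinations

partners = ["Dunn", "Hardy", "Kiers", "Malinowski", "Tillman"]

# Number of ways to choose 2 of 5 partners: 5! / (2! * (5 - 2)!)
n_samples = math.comb(5, 2)

# Enumerate the unordered pairs themselves.
samples = list(combinations(partners, 2))

print(n_samples)
for pair in samples:
    print(pair)
```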

8- 13

Partners    Samples of 2    Mean
1&2         (22+26)/2 =     24
1&3         (22+30)/2 =     26
1&4         (22+26)/2 =     24
1&5         (22+22)/2 =     22
2&3         (26+30)/2 =     28
2&4         (26+26)/2 =     26
2&5         (26+22)/2 =     24
3&4         (30+26)/2 =     28
3&5         (30+22)/2 =     26
4&5         (26+22)/2 =     24

8- 14

Example …continued

Organize the sample means into a Sampling Distribution

Sample Mean    Frequency    Relative frequency (Probability)
22             1            1/10
24             4            4/10
26             3            3/10
28             2            2/10
Total          10 Samples
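The sampling distribution can be rebuilt from the billed hours with a short sketch like the one below. It assumes the ten samples are the unordered pairs of partners, each equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

hours = {"Dunn": 22, "Hardy": 26, "Kiers": 30, "Malinowski": 26, "Tillman": 22}

# Mean of billed hours for every possible sample of two partners.
sample_means = [(hours[a] + hours[b]) / 2 for a, b in combinations(hours, 2)]

# Frequency of each distinct sample mean, and its relative frequency.
frequency = Counter(sample_means)
total = len(sample_means)
distribution = {
    m: Fraction(count, total) for m, count in sorted(frequency.items())
}

for m, prob in distribution.items():
    print(m, frequency[m], prob)
```

`Fraction` reduces the relative frequencies, so 4/10 prints as 2/5.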

8- 15

Example …continued

Compute the mean of the sample means. Compare it with the population mean

Sample Mean    Frequency
22             1
24             4
26             3
28             2

$\mu_{\bar{X}} = \frac{22(1) + 24(4) + 26(3) + 28(2)}{10} = 25.2$

8- 16

Example …continued

Partner       Hours
Dunn          22
Hardy         26
Kiers         30
Malinowski    26
Tillman       22

$\mu = \frac{22 + 26 + 30 + 26 + 22}{5} = 25.2$

Note

The population mean is the same as the mean of the sample means…25.2 hours!
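The comparison can be verified with a small sketch using Python's `statistics` module. The variable names are illustrative.

```python
from itertools import combinations
from statistics import mean

hours = [22, 26, 30, 26, 22]

# Population mean of the five partners' billed hours.
population_mean = mean(hours)

# Mean of each of the 10 possible samples of two partners.
sample_means = [mean(pair) for pair in combinations(hours, 2)]

# Mean of the sample means.
mean_of_sample_means = mean(sample_means)

print(population_mean, mean_of_sample_means)
```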

8- 17

Central Limit Theorem

The sampling distribution of the means of all possible samples of size n generated from the population will be approximately normally distributed, provided n is sufficiently large!

Sampling Distributions:

Mean: $\mu_{\bar{X}} = \mu$

Variance: $\sigma^2_{\bar{X}} = \sigma^2 / n$

Standard Deviation (standard error of the mean): $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
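The mean and variance formulas can be checked exactly on a tiny population, as in the sketch below. It assumes samples are drawn with replacement, so the draws are independent, which is the setting in which $\sigma^2/n$ holds exactly. It reuses the five billed-hours values purely as illustrative data. It does not demonstrate normality.

```python
from itertools import product
from statistics import mean, pvariance

hours = [22, 26, 30, 26, 22]  # illustrative population
n = 2                         # sample size

mu = mean(hours)              # population mean
sigma_sq = pvariance(hours)   # population variance

# All 5**n equally likely ordered samples drawn WITH replacement.
sample_means = [mean(s) for s in product(hours, repeat=n)]

mean_of_means = mean(sample_means)
variance_of_means = pvariance(sample_means)
standard_error = variance_of_means ** 0.5

print(mean_of_means, mu)
print(variance_of_means, sigma_sq / n)
print(standard_error, (sigma_sq / n) ** 0.5)
```

When sampling without replacement from a population this small, the spread of the sample means would be smaller than $\sigma/\sqrt{n}$.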

8- 18

Point Estimates

A point estimate is one value (a single point) that is used to estimate a population parameter

Examples of point estimates:

sample mean
sample standard deviation
sample variance
sample proportion

8- 19

Point Estimates

Population follows… the normal distribution

The sampling distribution of the sample means also follows the normal distribution

Probability of a sample mean falling within a particular region, use:

$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

Population does NOT follow… the normal distribution

If the sample is of at least 30 observations, the sampling distribution of the sample means WILL approximately follow the normal distribution

Probability of a sample mean falling within a particular region, use:

$z = \frac{\bar{X} - \mu}{s / \sqrt{n}}$

8- 20

Central Limit Theorem

[Chart 8 – 6: Results for Several Populations]

8- 34

Using the Sampling Distribution of the Sample Mean

Data…

Suppose it takes an average of 330 minutes for taxpayers to prepare, copy, and mail an income tax return form.

A consumer watchdog agency selects a random sample of 40 taxpayers and finds the standard deviation of the time needed is 80 minutes

What is the standard error of the mean?

Formula

$s / \sqrt{n} = 80 / \sqrt{40} = 12.6$
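The standard error above can be reproduced with a couple of lines of Python. The variable names are illustrative.

```python
import math

s = 80   # sample standard deviation, in minutes
n = 40   # sample size

# Standard error of the mean: s / sqrt(n)
standard_error = s / math.sqrt(n)

print(round(standard_error, 1))  # 12.6
```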

8- 35

Using the Sampling Distribution of the Sample Mean

What is the likelihood the sample mean is greater than 320 minutes?

Answer…

8- 36

Using the Sampling Distribution of the Sample Mean

What is the likelihood the sample mean is greater than 320 minutes?

Data…

* average of 330 minutes
* random sample of 40
* standard deviation is 80 minutes

Formula

1. $z = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{320 - 330}{80 / \sqrt{40}} = -0.79$

[Normal curve: a1 is the area between 320 and 330]

8- 37

Using the Sampling Distribution of the Sample Mean

What is the likelihood the sample mean is greater than 320 minutes?

2. Look up 0.79 in Table

a1 = 0.2852

Required Area = 0.2852 + .5 = 0.7852
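The table lookup can be cross-checked with Python's `statistics.NormalDist`, which evaluates the standard normal CDF directly. This sketch uses the unrounded z-score, so the result differs from the table value only in the fourth decimal place.

```python
import math
from statistics import NormalDist

mu = 330      # assumed population mean, in minutes
s = 80        # sample standard deviation, in minutes
n = 40        # sample size
x_bar = 320   # threshold for the sample mean

standard_error = s / math.sqrt(n)
z = (x_bar - mu) / standard_error

# P(sample mean > 320) = 1 - Phi(z)
probability = 1 - NormalDist().cdf(z)

print(round(z, 2), round(probability, 4))
```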

8- 38

Sampling Distribution of Proportion

The normal distribution (a continuous distribution) yields a good approximation of the binomial distribution (a discrete distribution) for large values of n.

Use when np and n(1 − p) are both greater than 5!

8- 39

Mean and Variance of a Binomial Probability Distribution

Formula

$\mu = np$

Formula

$\sigma^2 = np(1 - p)$

8- 40

Sampling Distribution of Proportion

A multinational company claims that 55% of its employees are bilingual. To verify this claim, a statistician selected a sample of 60 employees of the company using simple random sampling and found 48% to be bilingual.

Based on this information, what can we say about the company’s claim?

np = 60(.55) = 33

n(1 − p) = 60(.45) = 27

The sample size is big enough to use the normal approximation with a mean of .55 and a standard deviation of $\sqrt{(.55)(.45)/60} = 0.064$
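The rule-of-thumb check and the standard deviation of the sample proportion can be sketched as follows. The variable names are illustrative, and the standard deviation is taken as the square root of p(1 − p)/n.

```python
import math

n = 60     # sample size
p = 0.55   # claimed population proportion

# Rule of thumb for the normal approximation: both counts greater than 5.
expected_yes = n * p
expected_no = n * (1 - p)
approximation_ok = expected_yes > 5 and expected_no > 5

# Standard deviation (standard error) of the sample proportion.
standard_error = math.sqrt(p * (1 - p) / n)

print(approximation_ok, round(standard_error, 3))
```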

8- 41

Sampling Distribution of Proportion …continued

Formula

$z = \frac{X - \mu}{s}$

1. $z = \frac{0.48 - 0.55}{0.064} = -1.09$

2. Look up 1.09 in Table

a1 = 0.3621

Required Area = .5 – 0.3621 = 0.1379 or 14%

[Normal curve: a1 is the area between .48 and .55]

8- 42

Sampling Distribution of Proportion …continued

Conclusion

If the company’s claim is true, there is approximately a 14% chance of finding a sample proportion of 48% or less in a sample of this size.
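The 14% figure can be cross-checked with `statistics.NormalDist`. This sketch uses the unrounded standard error, so the z-score and the area agree with the table values to about three decimal places.

```python
import math
from statistics import NormalDist

n = 60         # sample size
p = 0.55       # claimed population proportion
p_hat = 0.48   # observed sample proportion

standard_error = math.sqrt(p * (1 - p) / n)
z = (p_hat - p) / standard_error

# P(sample proportion <= 0.48), assuming the claimed 55% is correct.
probability = NormalDist().cdf(z)

print(round(z, 2), round(probability, 4))
```

The probability is computed under the assumption that the claim holds. It describes how likely such a sample is, not how likely the claim is.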

8- 43

Sampling Distribution of Mean

Suppose the mean selling price of a litre of gasoline in Canada is $.659.

Further, assume the distribution is positively skewed, with a standard deviation of $0.08.

What is the probability of selecting a sample of 35 gasoline stations and finding the sample mean within $.03 of the population mean?

8- 44

Sampling Distribution of Mean

Data…

* mean selling price is $.659
* SD of $0.08
* Sample of 35 gasoline stations

Probability of sample mean within $.03?

Find the z-scores for .659 +/- .03, i.e. .629 and .689

1. $z = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{0.629 - 0.659}{0.08 / \sqrt{35}} = -2.22$

2. $z = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{0.689 - 0.659}{0.08 / \sqrt{35}} = 2.22$

8- 45

Sampling Distribution of Mean

Probability of sample mean within $.03?

Find areas from table…

z = −2.22: a1 = .4868

z = 2.22: a2 = .4868

Required A = .4868 + .4868 = .9736

We would expect about 97% of the sample means to be within $0.03 of the population mean.
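The same result can be reproduced with `statistics.NormalDist`. The sketch relies on the normal approximation for a sample of 35 from a skewed population, and uses unrounded z-scores, so the area differs slightly from the table-based .9736.

```python
import math
from statistics import NormalDist

mu = 0.659      # mean price per litre, in dollars
s = 0.08        # standard deviation, in dollars
n = 35          # number of gasoline stations sampled
margin = 0.03   # half-width of the interval around the mean

standard_error = s / math.sqrt(n)
z_low = ((mu - margin) - mu) / standard_error
z_high = ((mu + margin) - mu) / standard_error

# P(mu - 0.03 < sample mean < mu + 0.03)
standard_normal = NormalDist()
probability = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

print(round(z_low, 2), round(z_high, 2), round(probability, 4))
```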

8- 46

Test your learning…

www.mcgrawhill.ca/college/lind

Click on…

Online Learning Centre for

quizzes
extra content
data sets
searchable glossary
access to Statistics Canada’s E-Stat data
…and much more!

8- 47

This completes Chapter 8