Chapter 6 Sampling Distributions (样本分布)

The Sampling Distribution of the Sample Mean
The Sampling Distribution of the Sample Proportion

Sample Mean

• Let there be a population of units of size N
• Consider all its samples of a fixed size n (n < N)
• For all possible samples of size n, we obtain a population of sample means. That is, $\bar{x}$ is a random variable which may take any of these means as its value
• Before we draw the sample, the sample mean $\bar{x}$ is a random variable
• We consider the probability distribution of the random variable $\bar{x}$, i.e., the probability distribution for the population of sample means

Section 6.1 The Sampling Distribution of the Sample Mean

The sampling distribution of the sample mean is the probability distribution of the population of the sample means obtainable from all possible samples of size n from a population of size N.

Example 6.1 The Stock Return Case

• We have a population of the percent returns from six stocks
– In order, the values of % return are: 10%, 20%, 30%, 40%, 50%, and 60%
• Label each stock A, B, C, …, F in order of increasing % return
• The mean rate of return is 35% with a standard deviation of 17.078%
– Any one of these stocks is as likely to be picked as any other of the six
• Uniform distribution with N = 6
• Each stock has a probability of 1/6 of being picked

The Stock Return Case #2

Stock     % Return   Frequency   Relative Frequency
Stock A   10         1           1/6
Stock B   20         1           1/6
Stock C   30         1           1/6
Stock D   40         1           1/6
Stock E   50         1           1/6
Stock F   60         1           1/6
Total                6           1

The Stock Return Case #3

• Now, select all possible samples of size n = 2 from this population of stocks of size N = 6
– That is, select all possible pairs of stocks
• How to select?
– Sample randomly
– Sample without replacement
– Sample without regard to order

The Stock Return Case #4

• Result: There are $\binom{6}{2} = 15$ possible samples of size n = 2
• Calculate the sample mean of each and every sample
• For example, if we choose the two stocks with returns 10% and 20%, then the sample mean is 15%

The Stock Return Case #5

Sample Mean   Frequency   Relative Frequency
15            1           1/15
20            1           1/15
25            2           2/15
30            2           2/15
35            3           3/15
40            2           2/15
45            2           2/15
50            1           1/15
55            1           1/15
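The enumeration in the stock return case can be checked with a few lines of code. The sketch below is illustrative only: Python is not part of the original slides, and the variable names are assumptions. It lists every unordered pair drawn without replacement from the six returns and tabulates the resulting sample means.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

# Population of six stock returns (percent), stocks A through F.
returns = [10, 20, 30, 40, 50, 60]

# All samples of size n = 2, without replacement and without regard to order.
samples = list(combinations(returns, 2))
sample_means = [mean(pair) for pair in samples]

# Frequency of each distinct sample mean.
frequency = Counter(sample_means)

print("number of samples:", len(samples))
for value in sorted(frequency):
    print(f"{value:>4}  {frequency[value]}/{len(samples)}")
print("mean of all sample means:", mean(sample_means))
```

The printed frequencies can be compared row by row with the table of sample means, and the last line with the population mean of 35%.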

Observations

• The population of N = 6 stock returns has a uniform distribution.
• But the histogram of the 15 sample mean returns:
1. Seems to be centered over the same mean return of 35%, and
2. Appears to be bell-shaped and less spread out than the histogram of individual returns

Example 6.2 Sampling all the stocks

• Consider the population of returns of all 1,815 stocks listed on NYSE for 1987
– See Figure 6.2(a) on next slide
– The mean rate of return was -3.5% with a standard deviation of 26%
• Draw all possible random samples of size n = 5 and calculate the sample mean return of each
– Sample with a computer
– See Figure 6.2(b) on next slide

[Figure 6.2: (a) histogram of the individual stock returns; (b) histogram of the sample mean returns for n = 5]

• Observations
– Both histograms appear to be bell-shaped and centered over the same mean of -3.5%
– The histogram of the sample mean returns looks less spread out than that of the individual returns
• Statistics
– Mean of all sample means: $\mu_{\bar{x}} = \mu = -3.5\%$
– Standard deviation of all possible means: $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 26/\sqrt{5} = 11.63\%$

General Conclusions

• If the population of individual items is normal, then the population of all sample means is also normal
• Even if the population of individual items is not normal, there are circumstances in which the population of all sample means is approximately normal (see the Central Limit Theorem (中心极限定理) later)

General Conclusions

• The mean of all possible sample means equals the population mean
– That is, $\mu_{\bar{x}} = \mu$
• The standard deviation $\sigma_{\bar{x}}$ of all sample means is less than the standard deviation of the population
– That is, $\sigma_{\bar{x}} < \sigma$
• Each sample mean averages out the high and the low measurements, and so is closer to $\mu$ than many of the individual population measurements

• The empirical rule holds for the sampling distribution of the sample mean when that distribution is (approximately) normal
– 68.26% of all possible sample means are within (plus or minus) one standard deviation $\sigma_{\bar{x}}$ of $\mu$
– 95.44% of all possible observed values of $\bar{x}$ are within (plus or minus) two $\sigma_{\bar{x}}$ of $\mu$
• In the example, 95.44% of all possible sample mean returns are in the interval [-3.5 ± (2 × 11.63)] = [-3.5 ± 23.26]
• That is, 95.44% of all possible sample means are between -26.76% and 19.76%
– 99.73% of all possible observed values of $\bar{x}$ are within (plus or minus) three $\sigma_{\bar{x}}$ of $\mu$
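The empirical-rule intervals for the stock return example follow directly from $\sigma_{\bar{x}} = \sigma/\sqrt{n}$. The following is a minimal sketch, not taken from the slides, that uses the figures of Example 6.2 (mean -3.5%, standard deviation 26%, n = 5); the function name `interval` is an assumption made for illustration.

```python
import math

mu = -3.5      # population mean return (percent), from Example 6.2
sigma = 26.0   # population standard deviation (percent)
n = 5          # sample size

# Standard deviation of the sample mean: sigma / sqrt(n).
sigma_xbar = sigma / math.sqrt(n)

def interval(k):
    """Return the interval mu +/- k standard deviations of the sample mean."""
    return (mu - k * sigma_xbar, mu + k * sigma_xbar)

for k, share in [(1, "68.26%"), (2, "95.44%"), (3, "99.73%")]:
    low, high = interval(k)
    print(f"{share} of sample means lie in [{low:.2f}, {high:.2f}]")
```

For k = 2 the endpoints round to the -26.76% and 19.76% quoted in the example.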

Properties of the Sampling Distribution of the Sample Mean #1

• If the population being sampled is normal, then so is the sampling distribution of the sample mean, $\bar{x}$
• The mean $\mu_{\bar{x}}$ of the sampling distribution of $\bar{x}$ is
$\mu_{\bar{x}} = \mu$
That is, the mean of all possible sample means is the same as the population mean

Properties of the Sampling Distribution of the Sample Mean #2

• The variance $\sigma_{\bar{x}}^2$ of the sampling distribution of $\bar{x}$ is
$\sigma_{\bar{x}}^2 = \sigma^2 / n$
That is, the variance of the sampling distribution of $\bar{x}$ is
– directly proportional to the variance of the population, and
– inversely proportional to the sample size

Properties of the Sampling Distribution of the Sample Mean #3

• The standard deviation $\sigma_{\bar{x}}$ of the sampling distribution of $\bar{x}$ is
$\sigma_{\bar{x}} = \sigma / \sqrt{n}$
That is, the standard deviation of the sampling distribution of $\bar{x}$ is
– directly proportional to the standard deviation of the population, and
– inversely proportional to the square root of the sample size

Notes

• $\bar{x}$ is the point estimate of $\mu$, and the larger the sample size n, the more accurate the estimate, because when n increases, $\sigma_{\bar{x}}$ decreases and $\bar{x}$ is more closely clustered around the population mean $\mu$
– In order to reduce $\sigma_{\bar{x}}$, take bigger samples!

Example 6.3 Car Mileage Case

• Population of all midsize cars of a particular make and model
– Population is normal with mean $\mu$ and standard deviation $\sigma$
– Draw all possible samples of size n
– Then the sampling distribution of the sample mean is normal with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
– In particular, draw samples of size:
• n = 5
• n = 49

• For n = 5: $\sigma_{\bar{x}} = \sigma/\sqrt{5} = \sigma/2.2361$
• For n = 49: $\sigma_{\bar{x}} = \sigma/\sqrt{49} = \sigma/7$

So, all possible sample means for n = 49 will be more closely clustered around $\mu$ than in the case of n = 5

Reasoning from the Sampling Distribution

• Recall from the Chapter 2 mileage example, $\bar{x}$ = 31.5531 mpg for a sample of size n = 49
– With s = 0.7992
• Does this give statistical evidence that the population mean $\mu$ is greater than 31 mpg?
– That is, does the sample mean give evidence that $\mu$ is at least 31 mpg?
• Calculate the probability of observing a sample mean that is greater than or equal to 31.5531 mpg if $\mu$ = 31 mpg
– Want $P(\bar{x} \ge 31.5531 \text{ if } \mu = 31)$

• Use s as the point estimate for $\sigma$ so that
$\sigma_{\bar{x}} = \sigma/\sqrt{n} \approx s/\sqrt{n} = 0.7992/\sqrt{49} \approx 0.1143$
• Then
$P(\bar{x} \ge 31.5531 \text{ if } \mu = 31) = P\left(z \ge \frac{31.5531 - \mu_{\bar{x}}}{\sigma_{\bar{x}}}\right) = P\left(z \ge \frac{31.5531 - 31}{0.1143}\right) = P(z \ge 4.84)$
• But z = 4.84 is off the standard normal table
• The largest z value in the table is 3.09, which has a right hand tail area of 0.001

[Figure: the probability that $\bar{x}$ > 31.5531 when $\mu$ = 31]

• z = 4.84 > 3.09, so P(z ≥ 4.84) < 0.001
• That is, if $\mu$ = 31 mpg, then fewer than 1 in 1,000 of all possible samples have a mean at least as large as observed
• Have either of the following explanations:
– If $\mu$ is actually 31 mpg, then picking this sample is an almost unbelievable thing
OR
– $\mu$ is not 31 mpg
• Difficult to believe such a small chance would occur, so conclude that there is strong evidence that $\mu$ does not equal 31 mpg
– $\mu$ is in fact larger than 31 mpg
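The tail probability in the car mileage argument need not be read from a table. The sketch below is an illustration rather than part of the original slides: it recomputes the z value and evaluates the upper-tail area of the standard normal curve with `math.erfc` from the Python standard library. The variable names are assumptions, and computing $s/\sqrt{n}$ at full precision gives about 0.1142 rather than the rounded 0.1143, which leaves z unchanged to two decimals.

```python
import math

x_bar = 31.5531   # observed sample mean (mpg)
mu_0 = 31.0       # hypothesized population mean (mpg)
s = 0.7992        # sample standard deviation, used as the estimate of sigma
n = 49            # sample size

# Estimated standard deviation of the sample mean.
sigma_xbar = s / math.sqrt(n)

# Standardized distance of the observed mean from the hypothesized mean.
z = (x_bar - mu_0) / sigma_xbar

# Upper-tail area of the standard normal: P(Z >= z) = erfc(z / sqrt(2)) / 2.
tail = 0.5 * math.erfc(z / math.sqrt(2))

print(f"sigma_xbar = {sigma_xbar:.4f}, z = {z:.2f}, upper tail = {tail:.2e}")
```

The computed tail area is far below the 0.001 bound obtained from the table, which is consistent with the conclusion that the evidence against $\mu$ = 31 mpg is strong.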

Central Limit Theorem (中心极限定理) #1

• If the population is non-normal, what is the shape of the sampling distribution of the sample means?
• In fact the sampling distribution is approximately normal if the sample is large enough, even if the population is non-normal
– by the “Central Limit Theorem”

• No matter what probability distribution describes the population (provided it has a finite standard deviation $\sigma$), if the sample size n is large enough, then the population of all possible sample means is approximately normal with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
• Further, the larger the sample size n, the closer the sampling distribution of the sample means is to being normal
– In other words, the larger n, the better the approximation

[Diagram: a random sample (x1, x2, …, xn) is drawn from a right-skewed population distribution of X with parameters ($\mu$, $\sigma$); as n becomes large, the sampling distribution of the sample mean $\bar{x}$ is nearly normal with parameters ($\mu_{\bar{x}} = \mu$, $\sigma_{\bar{x}} = \sigma/\sqrt{n}$)]

Example 6.4 Effect of the Sample Size

• The larger the sample size, the more nearly normally distributed is the population of all possible sample means
• Also, as the sample size increases, the spread of the sampling distribution decreases

How Large?

• How large is “large enough?”
• If the sample size n is at least 30, then for most sampled populations, the sampling distribution of sample means is approximately normal
– Refer to Figure 6.6 on next slide
– Shown in Fig 6.6(a) is an exponential (right skewed) distribution
– In Figure 6.6(b), 1,000 samples of size n = 5
» Slightly skewed to right
– In Figure 6.6(c), 1,000 samples with n = 30
» Approximately bell-shaped and normal
• If the population is normal, the sampling distribution of $\bar{x}$ is normal regardless of the sample size

Example: Central Limit Theorem Simulation
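A simulation of this kind can be sketched in a few lines. The code below is an assumption-laden illustration, not the simulation shown in the original slide: it draws 1,000 samples of size n = 5 and n = 30 from an exponential population with mean 1 and standard deviation 1 (an arbitrary choice of parameters), and compares the spread of the sample means with $\sigma/\sqrt{n}$. The seed and the function name are also arbitrary.

```python
import random
from statistics import mean, stdev

random.seed(6)  # fixed seed so that the run is repeatable (arbitrary choice)

def simulate_sample_means(n, num_samples=1000):
    """Draw num_samples samples of size n from an exponential population
    with mean 1 (and standard deviation 1); return their sample means."""
    return [mean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(num_samples)]

for n in (5, 30):
    means = simulate_sample_means(n)
    print(f"n = {n:2d}: mean of sample means = {mean(means):.3f}, "
          f"std dev of sample means = {stdev(means):.3f}, "
          f"sigma/sqrt(n) = {1 / n ** 0.5:.3f}")
```

Plotting a histogram of `means` for each n (with any plotting tool) should show the right skew fading as n grows, in line with the Central Limit Theorem.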

Unbiased Estimates (无偏估计)

• A sample statistic is an unbiased point estimate of a population parameter if the mean of all possible values of the sample statistic equals the population parameter
• $\bar{x}$ is an unbiased estimate of $\mu$ because $\mu_{\bar{x}} = \mu$
– In general, the sample mean is always an unbiased estimate of $\mu$
– The sample median is often an unbiased estimate of $\mu$
• But not always

• The sample variance $s^2$ is an unbiased estimate of $\sigma^2$ if the sampled population is infinite
– That is why $s^2$ has a divisor of n – 1 (if we used n as the divisor when estimating $\sigma^2$, we would not obtain an unbiased estimate)
• However, s is not an unbiased estimate of $\sigma$
– Even so, since there is no easy way to calculate an unbiased point estimate of $\sigma$, the usual practice is to use s as an estimate of $\sigma$
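The claims about $s^2$ and s can be verified exactly on a small case. The sketch below is illustrative and not from the slides: it reuses the six stock returns of Example 6.1 as the population and enumerates all ordered samples of size 2 drawn with replacement, which is assumed here as a stand-in for independent draws from an infinite population. All names are hypothetical.

```python
from itertools import product
from math import sqrt

population = [10, 20, 30, 40, 50, 60]
N = len(population)
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance

def sample_variance(sample, divisor):
    """Sum of squared deviations from the sample mean, over the divisor."""
    x_bar = sum(sample) / len(sample)
    return sum((x - x_bar) ** 2 for x in sample) / divisor

n = 2
# Every ordered pair is equally likely when sampling with replacement.
samples = list(product(population, repeat=n))

avg_s2 = sum(sample_variance(s, n - 1) for s in samples) / len(samples)
avg_biased = sum(sample_variance(s, n) for s in samples) / len(samples)
avg_s = sum(sqrt(sample_variance(s, n - 1)) for s in samples) / len(samples)

print(f"population variance          = {sigma2:.3f}")
print(f"average s^2 (divisor n - 1)  = {avg_s2:.3f}")
print(f"average with divisor n       = {avg_biased:.3f}")
print(f"sigma = {sqrt(sigma2):.3f}, average s = {avg_s:.3f}")
```

In this enumeration the divisor n – 1 reproduces $\sigma^2$ on average, the divisor n underestimates it, and the average of s falls below $\sigma$.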

Minimum Variance Estimates (最小方差估计)

• Want the sample statistic to have a small standard deviation
– All values of the sample statistic should be clustered around the population parameter. Then, the statistic from any sample should be close to the population parameter
• Given a choice between unbiased estimates, choose the one with the smallest standard deviation
• Even though the sample mean and the sample median are both unbiased estimates of $\mu$ for a normal population, the sampling distribution of sample means has a smaller standard deviation than that of sample medians

• The sample mean is a minimum-variance unbiased estimate (最小方差无偏估计) of $\mu$
– When the sample mean is used to estimate $\mu$, we are more likely to obtain an estimate close to $\mu$ than if we used any other sample statistic
– Therefore, the sample mean is the preferred estimate of $\mu$

Section 6.2 The Sampling Distribution of the Sample Proportion (样本比例)

• For a population of units, we select samples of size n and calculate, for each sample, the proportion $\hat{p}$ of its units that fall into a particular category.
• $\hat{p}$ is a random variable and has its own probability distribution.
• The probability distribution of all possible sample proportions is called the sampling distribution of the sample proportion

The sampling distribution of $\hat{p}$
• is approximately normal, if n is large (meeting the conditions that np ≥ 5 and n(1 – p) ≥ 5)
• has mean $\mu_{\hat{p}} = p$
• has standard deviation $\sigma_{\hat{p}} = \sqrt{p(1 - p)/n}$
where p is the population proportion for the category

Example 6.5 The Cheese Spread Case

• A food processing company developed a new cheese spread spout which may save production cost. If fewer than 10% of current purchasers reject the design, the company will adopt the new spout.
• 1,000 current purchasers are randomly selected and surveyed, and 63 of them say they would stop buying the cheese spread if the new spout were used. So, the sample proportion is $\hat{p}$ = 0.063.
• To evaluate the strength of this evidence, we ask: if p = 0.1, what is the probability of observing a sample of size 1,000 with sample proportion $\hat{p}$ ≤ 0.063?

• If p = 0.10, since n = 1000, np ≥ 5 and n(1 – p) ≥ 5, $\hat{p}$ is approximately normal with
$\mu_{\hat{p}} = p = 0.10,$
$\sigma_{\hat{p}} = \sqrt{p(1 - p)/n} = \sqrt{(0.10)(0.90)/1000} = 0.0094868,$
$P(\hat{p} \le 0.063 \text{ if } p = 0.10) = P\left(z \le \frac{0.063 - \mu_{\hat{p}}}{\sigma_{\hat{p}}}\right) = P\left(z \le \frac{0.063 - 0.10}{0.0094868}\right) = P(z \le -3.90) < 0.001.$

• So, if p = 0.1, the chance of observing at most 63 out of 1,000 randomly selected customers who do not accept the new design is less than 0.001
• But such an observation did occur. This means that we have extremely strong evidence that p ≠ 0.1, and that p is in fact less than 0.1
• Therefore, the company can adopt the new design
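The cheese spread calculation can be reproduced as follows. This is a minimal sketch under the normal approximation described in Section 6.2; the code and its variable names are illustrative assumptions and do not come from the slides.

```python
import math

p = 0.10        # hypothesized population proportion
n = 1000        # sample size
p_hat = 63 / n  # observed sample proportion

# Conditions for the normal approximation to the distribution of p_hat.
conditions_ok = n * p >= 5 and n * (1 - p) >= 5

# Standard deviation of the sample proportion: sqrt(p(1 - p) / n).
sigma_p_hat = math.sqrt(p * (1 - p) / n)
z = (p_hat - p) / sigma_p_hat

# Lower-tail area of the standard normal: P(Z <= z) = erfc(-z / sqrt(2)) / 2.
tail = 0.5 * math.erfc(-z / math.sqrt(2))

print(f"conditions met: {conditions_ok}")
print(f"sigma_p_hat = {sigma_p_hat:.7f}, z = {z:.2f}, lower tail = {tail:.2e}")
```

The lower-tail area comes out well under 0.001, matching the bound used in the example.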

Chapter 7 Confidence Intervals (置信区间)

z-Based Confidence Intervals for a Population Mean: $\sigma$ Known
t-Based Confidence Intervals for a Population Mean: $\sigma$ Unknown
Sample Size Determination (样本量计算)
Confidence Intervals for a Population Proportion

Section 7.1 z-Based Confidence Intervals for a Population Mean

• The starting point is the sampling distribution (样本分布) of the sample mean
– Recall from Chapter 6 that if a population is normally distributed with mean $\mu$ and standard deviation $\sigma$, then the sampling distribution of $\bar{x}$ is normal with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
– Use a normal curve as a model of the sampling distribution of the sample mean
• Exactly, because the population is normal
• Approximately, by the Central Limit Theorem for large samples (大样本中心极限定理)


• Recall the empirical rule, so…

– 68.26% of all possible sample means are within one standard deviation of the population mean

– 95.44% of all possible sample means are within two standard deviations of the population mean

– 99.73% of all possible sample means are within three standard deviations of the population mean

Example 7.1 The Car Mileage Case

• Recall that the population of car mileages is normally distributed with mean $\mu$ and standard deviation $\sigma$ = 0.8 mpg
– Note that $\mu$ is unknown and is to be estimated, but assume that $\mu$ = 31 mpg
• Taking samples of size n = 5, the sampling distribution of sample mean mileages is normal with mean $\mu_{\bar{x}} = \mu$ (which is also unknown) and standard deviation
$\sigma_{\bar{x}} = \sigma/\sqrt{n} = 0.8/\sqrt{5} = 0.35777$
• The probability is 0.9544 that $\bar{x}$ will be within plus or minus $2\sigma_{\bar{x}}$ = 2 × 0.35777 = 0.7155 of $\mu$

The Car Mileage Case #2

• That the sample mean $\bar{x}$ is within ±0.7155 of $\mu$ is equivalent to…
– $\bar{x}$ will be such that the interval [$\bar{x}$ ± 0.7155] contains $\mu$
• Then there is a 0.9544 probability that $\bar{x}$ will be a value so that the interval [$\bar{x}$ ± 0.7155] contains $\mu$
– In other words, $P(\bar{x} - 0.7155 \le \mu \le \bar{x} + 0.7155) = 0.9544$
– The interval [$\bar{x}$ ± 0.7155] is referred to as the 95.44% confidence interval for $\mu$

The Car Mileage Case #3

[Figure: 95.44% confidence intervals for $\mu$]
• Three intervals shown
• Two contain $\mu$
• One does not

• According to the 95.44% confidence interval, we know even before we sample that, of all the possible samples that could be selected …
• … there is a 95.44% probability that the sample mean $\bar{x}$ that is calculated is such that the interval [$\bar{x}$ ± 0.7155] will contain the actual (but unknown) population mean $\mu$
– In other words, of all possible sample means, 95.44% of all the corresponding intervals will contain the population mean $\mu$
– Note that there is a 4.56% probability that the interval does not contain $\mu$
• The sample mean is either too high or too low
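The "95.44% of all corresponding intervals" interpretation can be illustrated by simulation. The sketch below is not from the slides: it assumes $\mu$ = 31 mpg and $\sigma$ = 0.8 mpg as in Example 7.1, draws many samples of size n = 5 from a normal population, and counts how often the interval [$\bar{x}$ ± 0.7155] contains $\mu$. The seed, the number of trials, and the names are arbitrary choices.

```python
import random

random.seed(7)  # fixed seed so that the run is repeatable (arbitrary choice)

mu, sigma, n = 31.0, 0.8, 5        # mu is the value assumed in the example
half_width = 2 * sigma / n ** 0.5  # two standard deviations of the sample mean

trials = 10_000
covered = 0
for _ in range(trials):
    # Draw one sample of size n and compute its mean.
    x_bar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    # Does the interval around this sample mean contain the true mean?
    if x_bar - half_width <= mu <= x_bar + half_width:
        covered += 1

coverage = covered / trials
print(f"half width = {half_width:.4f}")
print(f"fraction of intervals containing mu = {coverage:.4f}")
```

The observed fraction should be close to 0.9544, with small run-to-run variation if the seed is changed.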

Generalizing

• In the example, we found the probability that $\mu$ is contained in an interval of integer multiples of $\sigma_{\bar{x}}$
• More usual to specify a round-number probability and find the corresponding number of $\sigma_{\bar{x}}$
• The probability that the confidence interval will not contain the population mean $\mu$ is denoted by $\alpha$
– In the example, $\alpha$ = 0.0456

Generalizing Continued

• The probability that the confidence interval will contain the population mean $\mu$ is denoted by 1 – $\alpha$
– 1 – $\alpha$ is referred to as the confidence coefficient
– (1 – $\alpha$) × 100% is called the confidence level
– In the example, 1 – $\alpha$ = 0.9544
• Usual to use two decimal point probabilities for 1 – $\alpha$
– Here, focus on 1 – $\alpha$ = 0.95 or 0.99

General Confidence Interval

• In general, the probability is 1 – $\alpha$ that the population mean $\mu$ is contained in the interval
$\left[\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}\right] = \left[\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right]$
– The normal point $z_{\alpha/2}$ gives a right hand tail area under the standard normal curve equal to $\alpha/2$
– The normal point $-z_{\alpha/2}$ gives a left hand tail area under the standard normal curve equal to $\alpha/2$
– The area under the standard normal curve between $-z_{\alpha/2}$ and $z_{\alpha/2}$ is 1 – $\alpha$


z-Based Confidence Intervals for a Mean with $\sigma$ Known

• If a population has standard deviation $\sigma$ (known),
• and if the population is normal or if the sample size is large (n ≥ 30), then …
• … a (1 – $\alpha$) × 100% confidence interval for $\mu$ is
$\left[\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right] = \left[\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right]$

95% Confidence Level

• For a 95% confidence level, 1 – $\alpha$ = 0.95, so $\alpha$ = 0.05 and $\alpha/2$ = 0.025
• For 95% confidence, need the normal point $z_{0.025}$
• The area under the standard normal curve between $-z_{0.025}$ and $z_{0.025}$ is 0.95
• Then the area under the standard normal curve between 0 and $z_{0.025}$ is 0.475
• From the standard normal table, the area is 0.475 for z = 1.96
• Then $z_{0.025}$ = 1.96

Page 54:

The Effect of α on Confidence Interval Width

95% confidence: z_{α/2} = z_{0.025} = 1.96; 99% confidence: z_{α/2} = z_{0.005} = 2.575

Page 55:

95% Confidence Interval

The 95% confidence interval is

$$\left[\bar{x} \pm z_{0.025}\,\sigma_{\bar{x}}\right] = \left[\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}\right] = \left[\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right]$$

Page 56:

99% Confidence Interval
• For 99% confidence, need the normal point z_{0.005}
• Reading between table entries in the standard normal table, the area is 0.495 for z_{0.005} = 2.575
• The 99% confidence interval is

$$\left[\bar{x} \pm z_{0.005}\,\sigma_{\bar{x}}\right] = \left[\bar{x} \pm 2.575\frac{\sigma}{\sqrt{n}}\right] = \left[\bar{x} - 2.575\frac{\sigma}{\sqrt{n}},\ \bar{x} + 2.575\frac{\sigma}{\sqrt{n}}\right]$$

Page 57:

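Example 7.2 The Car Mileage Case

Given: x̄ = 31.5531 mpg, σ = 0.8 mpg, n = 49

95% Confidence Interval:

$$\left[\bar{x} \pm z_{0.025}\frac{\sigma}{\sqrt{n}}\right] = \left[31.5531 \pm 1.96\frac{0.8}{\sqrt{49}}\right] = [31.5531 \pm 0.224] = [31.33,\ 31.78]$$

99% Confidence Interval:

$$\left[\bar{x} \pm z_{0.005}\frac{\sigma}{\sqrt{n}}\right] = \left[31.5531 \pm 2.575\frac{0.8}{\sqrt{49}}\right] = [31.5531 \pm 0.294] = [31.26,\ 31.85]$$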
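The arithmetic in this example can be checked with a short script. This is a minimal sketch using only the numbers given on the slide; the function name `z_interval` is illustrative.

```python
import math


def z_interval(x_bar, sigma, n, z):
    """z-based confidence interval for a mean when sigma is known."""
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Car mileage case: x_bar = 31.5531 mpg, sigma = 0.8 mpg, n = 49.
for label, z in (("95%", 1.96), ("99%", 2.575)):
    low, high = z_interval(31.5531, 0.8, 49, z)
    print(f"{label} interval: [{low:.2f}, {high:.2f}]")
```

The margins are 1.96 × 0.8/7 = 0.224 and 2.575 × 0.8/7 ≈ 0.294, which reproduce the two intervals shown above.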

Page 58:

• The 99% confidence interval is slightly wider than the 95% confidence interval
– The higher the confidence level, the wider the interval
• Reasoning from the intervals:
– The target mean mileage should be at least 31 mpg
– Both confidence intervals exceed this target
– According to the 95% confidence interval, we can be 95% confident that the mean mileage is between 31.33 and 31.78 mpg
– So we can be 95% confident that the mean mileage exceeds the target by at least 0.33 mpg and at most 0.78 mpg

Page 59:

Section 7.2 t-Based Confidence Intervals for a Population Mean

• If σ is unknown (which is usually the case), we can construct a confidence interval for μ based on the sampling distribution of

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

• If the population is normal, then for any sample size n, this sampling distribution is called the t distribution

Page 60:

The t Distribution
• The curve of the t distribution is similar to that of the standard normal curve
– Symmetrical and bell-shaped
– The t distribution is more spread out than the standard normal distribution
– The spread of the t distribution is given by the number of degrees of freedom
• Denoted by df
• For a sample of size n, there are one fewer degrees of freedom, that is,

df = n – 1

Page 61:

Degrees of Freedom and the t Distribution

As the number of degrees of freedom increases, the spread of the t distribution decreases and the t curve approaches the standard normal curve

Page 62:

The t Distribution and Degrees of Freedom

• For a t distribution with n – 1 degrees of freedom,
– As the sample size n increases, the degrees of freedom also increase
– As the degrees of freedom increase, the spread of the t curve decreases
– As the degrees of freedom increase indefinitely, the t curve approaches the standard normal curve
• If n ≥ 30, so df = n – 1 ≥ 29, the t curve is very similar to the standard normal curve

Page 63:

t and Right-Hand Tail Areas
• Use a t point denoted by t_α
– t_α is the point on the horizontal axis under the t curve that gives a right-hand tail area equal to α
– So the value of t_α in a particular situation depends on the right-hand tail area α and the number of degrees of freedom
• df = n – 1
• For a confidence interval, the tail area is set by the specified confidence coefficient 1 – α

Page 65:

Using the t Distribution Table

• Rows correspond to the different values of df
• Columns correspond to different values of α
• See Table 7.3, Tables A.4 and A.20 in Appendix A, and the table on the inside cover
– Tables 7.3 and A.4 give t points for df from 1 to 30, then for df = 40, 60, 120, and ∞
• On the row for ∞, the t points are the z points
– Table A.20 gives t points for df from 1 to 100
• For df greater than 100, t points can be approximated by the corresponding z points on the bottom row for df = ∞
– Always look at the accompanying figure for guidance on how to use the table

Page 66:

Example 7.3

Find t_α for a sample of size n = 15 and a right-hand tail area of 0.025
– For n = 15, df = 14
– α = 0.025
• Note that a right-hand tail area of 0.025 corresponds to a confidence level of 0.95 (two tails of 0.025 each)
– In Table 7.3, along the row labeled 14 and under the column labeled 0.025, read a table entry of 2.145
– So t_α = 2.145
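The table lookup can be cross-checked numerically. The sketch below is one possible approach, not the method used in the slides: it integrates the t density with Simpson's rule and finds the t point by bisection, using only the Python standard library. The function names are illustrative.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_right_tail(t, df, steps=4000):
    """Right-hand tail area beyond t >= 0: 0.5 minus the area over [0, t]."""
    if t == 0:
        return 0.5
    h = t / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_point(alpha, df):
    """Find t_alpha (for 0 < alpha < 0.5) by bisection on the tail area."""
    lo, hi = 0.0, 1.0
    while t_right_tail(hi, df) > alpha:  # widen the bracket until it contains t_alpha
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_right_tail(mid, df) > alpha:
            lo = mid  # tail still too large, so t_alpha lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_point(0.025, 14), 3))  # compare with the table entry 2.145
```

A statistics library would normally be used for this; the hand-rolled integration is only meant to show what the table entry represents.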

Page 67:

t-Based Confidence Intervals for a Mean: σ Unknown

If the sampled population is normally distributed with mean μ, then a (1 – α)100% confidence interval for μ is

$$\left[\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}\right]$$

t_{α/2} is the t point giving a right-hand tail area of α/2 under the t curve having n – 1 degrees of freedom

Page 68:

Example 7.4 Debt-to-Equity Ratio
• Estimate the mean debt-to-equity ratio of the loan portfolio of a bank
• Select a random sample of 15 commercial loan accounts
– Box plot is given in figure below
• Know: x̄ = 1.343, s = 0.192, n = 15
• Want a 95% confidence interval for the mean ratio
• Assume all ratios are normally distributed, but σ is unknown

Page 69:

• Have to use the t distribution
• At 95% confidence,
– 1 – α = 0.95, so α = 0.05 and α/2 = 0.025
• For n = 15,
– df = 15 – 1 = 14
• Use the t table to find t_{α/2} for df = 14
– t_{α/2} = t_{0.025} = 2.145 for df = 14
• The 95% confidence interval:

$$\left[\bar{x} \pm t_{0.025}\frac{s}{\sqrt{n}}\right] = \left[1.343 \pm 2.145\frac{0.192}{\sqrt{15}}\right] = [1.343 \pm 0.106] = [1.237,\ 1.449]$$
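The interval for the debt-to-equity example can be reproduced in a few lines. This is a minimal sketch that takes the t point 2.145 from the table, as the slide does, and uses x̄ = 1.343 as it appears in the computation; the function name `t_interval` is illustrative.

```python
import math


def t_interval(x_bar, s, n, t_point):
    """t-based confidence interval for a mean when sigma is unknown."""
    margin = t_point * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Debt-to-equity case: x_bar = 1.343, s = 0.192, n = 15, t_0.025 = 2.145 (df = 14).
low, high = t_interval(1.343, 0.192, 15, 2.145)
print(f"[{low:.3f}, {high:.3f}]")
```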

Page 70:

Section 7.3 Sample Size Determination

If σ is known, let E denote the desired margin of error. Then a sample of size

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$

is chosen so that x̄ is within E units of μ, with 100(1 – α)% confidence.
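As a quick illustration of the sample size formula n = (z_{α/2} σ / E)², the sketch below rounds the result up to the next whole unit. The input values are assumptions chosen for illustration (σ = 0.8 mpg and E = 0.3 mpg borrowed from the car mileage case); they are not a worked example from the slides.

```python
import math


def sample_size_for_mean(z, sigma, margin):
    """Smallest whole n with z * sigma / sqrt(n) <= margin (sigma known)."""
    return math.ceil((z * sigma / margin) ** 2)


# Illustrative inputs: 95% confidence (z = 1.96), sigma = 0.8, E = 0.3.
print(sample_size_for_mean(1.96, 0.8, 0.3))
```

Rounding up rather than to the nearest integer keeps the margin of error at or below E.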

Page 71:

If σ is unknown and is estimated from s, then a sample of size

$$n = \left(\frac{t_{\alpha/2}\,s}{E}\right)^2$$

is chosen so that x̄ is within E units of μ, with 100(1 – α)% confidence. The number of degrees of freedom for the t_{α/2} point is the size of the preliminary sample minus 1

Page 72:

Example 7.5 Car Mileage Case

Given: x̄ = 31.5531 mpg, s = 0.7992 mpg, n = 49

t-based 95% Confidence Interval:

$$\left[\bar{x} \pm t_{0.025}\frac{s}{\sqrt{n}}\right] = \left[31.5531 \pm 2.0106\frac{0.7992}{\sqrt{49}}\right] = [31.5531 \pm 0.2296] = [31.32,\ 31.78]$$

where t_{α/2} = t_{0.025} = 2.0106 for df = 49 – 1 = 48 degrees of freedom

Note: the error bound B = 0.2296 mpg is within the maximum error of 0.3 mpg

Page 73:

Section 7.4 Confidence Intervals for a Population Proportion

If the sample size n is large*, then a (1 – α)100% confidence interval for p is

$$\left[\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right]$$

* Here n should be considered large if both

$$n\hat{p} \ge 5 \quad\text{and}\quad n(1-\hat{p}) \ge 5$$

Page 74:

Example 7.6 Phe-Mycin Side Effects
Given: n = 200, 35 patients experience nausea.

$$\hat{p} = \frac{35}{200} = 0.175$$

Note:

$$n\hat{p} = 200(0.175) = 35 \qquad n(1-\hat{p}) = 200(0.825) = 165$$

so both quantities are > 5

For 95% confidence, z_{α/2} = z_{0.025} = 1.96 and

$$\left[\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right] = \left[0.175 \pm 1.96\sqrt{\frac{(0.175)(0.825)}{200}}\right] = [0.175 \pm 0.053] = [0.122,\ 0.228]$$
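The Phe-Mycin interval can be verified with a short script. This is a minimal sketch built from the large-sample formula and the numbers in the example; the function name `proportion_interval` and the choice to raise an error when the large-sample condition fails are illustrative.

```python
import math


def proportion_interval(successes, n, z):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    # Large-sample condition from the slides: n*p_hat and n*(1 - p_hat) at least 5.
    if n * p_hat < 5 or n * (1 - p_hat) < 5:
        raise ValueError("sample too small for the large-sample interval")
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


# Phe-Mycin case: 35 of 200 patients experience nausea, 95% confidence.
p_hat, low, high = proportion_interval(35, 200, 1.96)
print(f"p_hat = {p_hat}, interval = [{low:.3f}, {high:.3f}]")
```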

Page 75:

Determining Sample Size for Confidence Interval for p

A sample size

$$n = p(1-p)\left(\frac{z_{\alpha/2}}{E}\right)^2$$

will yield an estimate p̂, precisely within E units of p, with 100(1 – α)% confidence

Note that the formula requires a preliminary estimate of p. The conservative value of p = 0.5 is generally used when there is no prior information on p
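To illustrate the formula n = p(1 – p)(z_{α/2}/E)² with the conservative choice p = 0.5, here is a minimal sketch. The margin E = 0.03 is a hypothetical value chosen only for illustration; it does not come from the slides.

```python
import math


def sample_size_for_proportion(p, z, margin):
    """Smallest whole n with z * sqrt(p * (1 - p) / n) <= margin."""
    return math.ceil(p * (1 - p) * (z / margin) ** 2)


# Conservative p = 0.5, 95% confidence (z = 1.96), hypothetical E = 0.03.
print(sample_size_for_proportion(0.5, 1.96, 0.03))
```

Because p(1 – p) is largest at p = 0.5, this choice gives the largest required n, which is why it is called conservative.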

Page 76:

A Comparison of Confidence Intervals and Tolerance Intervals

A tolerance interval contains a specified percentage of individual population measurements
• Often 68.26%, 95.44%, 99.73%

A confidence interval is an interval intended to contain the population mean μ, and the confidence level expresses how sure we are that this interval contains μ
• Often the confidence level is set high (e.g., 95% or 99%)
– Because such a level is considered high enough to provide convincing evidence about the value of μ

Page 77:

Example 7.7 Car Mileage Case
Tolerance intervals shown:

[x̄ ± s] contains 68.26% of all individual cars
[x̄ ± 2s] contains 95.44% of all individual cars
[x̄ ± 3s] contains 99.73% of all individual cars

The t-based 95% confidence interval for μ is [31.32, 31.78], so we can be 95% confident that μ lies between 31.32 and 31.78 mpg

Page 78:

Summary: Selecting an Appropriate Confidence Interval for a Population Mean