Sampling Distributions A review by Hieu Nguyen (03/27/06)

Page 1: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Sampling Distributions

A review by Hieu Nguyen (03/27/06)

Page 2: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Parameter vs Statistic

A parameter is a number that describes the entire population.

Example: A parameter for the US population is the proportion of all people who support President Bush’s nomination of Samuel Alito to the Supreme Court.

p=.74

Page 3: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Parameter vs Statistic

A statistic is a description of a sample taken from the population. It is only an estimate of the population parameter.

Example: In a poll of 1001 Americans, 73% of those surveyed supported Alito’s nomination.

p-hat=.73

Page 4: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Bias

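The bias of a statistic is the difference between the mean of its sampling distribution and the population parameter.

A statistic is unbiased if the mean of its sampling distribution equals the population parameter. A single sample does not have to match the parameter exactly.

Example: The sample proportion from a random poll is an unbiased estimate of p. Individual polls may report 73% or 75% approval of Alito’s nomination, but the results of many such polls average out to 74%.

mean of p-hat=.74=p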

Page 5: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Sampling Variability

Samples naturally have varying results. The mean or sample proportion of one sample may be different from that of another.

In the poll mentioned before, p-hat=.73. A repetition of the same poll may have p-hat=.75.

Page 6: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Central Limit Theorem (CLT)

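Populations that are wildly skewed may cause samples to vary a great deal.

However, the CLT states that when the samples are large enough, the sample means (or sample proportions) follow an approximately normal distribution centered at the population parameter, no matter what shape the population has. The CLT is closely related to the law of large numbers, which says that a sample mean tends to get closer to the population mean as the sample size grows.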
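A rough way to see the CLT at work is to simulate it. The sketch below is only an illustration: the exponential population (mean 1, standard deviation 1), the sample sizes, and the helper name `sample_means` are all assumptions chosen for the demo, not part of the original slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable


def sample_means(n, num_samples=2000):
    """Means of num_samples samples of size n drawn from a right-skewed
    exponential population (population mean 1, population SD 1)."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


means_small = sample_means(n=2)   # tiny samples: means still skewed and spread out
means_large = sample_means(n=50)  # larger samples: means cluster near 1

for label, means in (("n=2", means_small), ("n=50", means_large)):
    print(label,
          "center:", round(statistics.fmean(means), 3),
          "spread:", round(statistics.stdev(means), 3))
```

Both sets of sample means are centered near the population mean, but the spread shrinks roughly like 1/sqrt(n) as the sample size grows.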

Page 7: Sampling Distributions A review by Hieu Nguyen (03/27/06)

CLT Example

Imagine that many polls of 1001 Americans are done to find the proportion of those who supported Alito’s nomination.

Although the poll results vary, most samples have a sample proportion that is close to the population parameter p=.74.

Page 8: Sampling Distributions A review by Hieu Nguyen (03/27/06)

CLT Example

Plot the sample proportions of all samples to see the effects of the CLT. Notice how there are more sample proportions near the population parameter p=.74.

This histogram is actually a sampling distribution.

Page 9: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Sampling Distributions: Definition

Textbook definition:

A sampling distribution is the distribution of values taken by the statistic in all possible samples of the same size from the same population.

In other words, a sampling distribution is a histogram of the statistics from many samples of the same size taken from one population.
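The definition can be illustrated by simulating the poll many times and tallying the results. This is a sketch under assumptions: it takes p=.74 and n=1001 from the earlier slides, but the number of repeated polls (1000), the text histogram, and the helper name `run_poll` are made up for the illustration.

```python
import random
from collections import Counter

random.seed(1)  # fixed seed so the run is repeatable

P, N, NUM_POLLS = 0.74, 1001, 1000  # NUM_POLLS is an arbitrary choice


def run_poll(p, n):
    """Return p-hat for one simulated poll of n people."""
    supporters = sum(random.random() < p for _ in range(n))
    return supporters / n


p_hats = [run_poll(P, N) for _ in range(NUM_POLLS)]

# Text histogram: one row per p-hat value rounded to two decimals.
bins = Counter(round(p_hat, 2) for p_hat in p_hats)
for value in sorted(bins):
    print(f"{value:.2f} {'#' * (bins[value] // 10)}")
```

The rows pile up around .74 and thin out on both sides, which is the shape of the sampling distribution of p-hat.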

Page 10: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Two Most Common Types of Sampling Distributions

Sample Proportion Distribution: distribution of the sample proportions of samples from a population

Sample Mean Distribution: distribution of the sample means of samples from a population

For both types, the ideal shape is a normal distribution.

Page 11: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Sampling Distributions: Conditions

Before assuming that a sampling distribution is normal, check the following conditions:

Plausible independence

Randomness

Each sample is less than 10% of the population

Page 12: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Sampling Distributions As Normal Distributions

When all conditions are met, the sampling distribution can be considered a normal distribution with a center and a spread.

Note: With sample proportion distributions, another condition must be met:

Success-failure condition: there must be at least 10 successes and 10 failures according to the population parameter and sample size (np ≥ 10 and nq ≥ 10, where q = 1 - p).
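The two numeric conditions are easy to check mechanically. In the sketch below the function names are hypothetical, and the population size of 300,000,000 is only a round placeholder for the US population, not a figure from the slides. Independence and randomness still have to be judged from how the sample was collected.

```python
def ten_percent_ok(n, population_size):
    """10% condition: the sample is less than 10% of the population."""
    return n < 0.10 * population_size


def success_failure_ok(p, n, minimum=10):
    """Success-failure condition: at least `minimum` expected successes
    (n*p) and at least `minimum` expected failures (n*(1-p))."""
    return n * p >= minimum and n * (1 - p) >= minimum


# Alito poll: n = 1001, p = .74; the population size is a placeholder.
print(ten_percent_ok(1001, 300_000_000))
print(success_failure_ok(0.74, 1001))
```

For the poll, n*p is about 741 and n*q is about 260, so both conditions hold comfortably.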

Page 13: Sampling Distributions A review by Hieu Nguyen (03/27/06)

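Sampling Distributions As Normal Distributions: Equations

Sample Proportion Distribution

p = population proportion (given), q = 1 - p, n = sample size

\[ SD(\hat{p}) = \sqrt{\frac{pq}{n}} \]

\[ N\left(p,\ SD(\hat{p})\right) \]

Sample Mean Distribution

μ = population mean (given)

σ = population standard deviation (given)

\[ SD(\bar{y}) = \frac{\sigma}{\sqrt{n}} \]

\[ N\left(\mu,\ SD(\bar{y})\right) \]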
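The two models can be written as small helper functions. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the function names `proportion_model` and `mean_model` are made up for the illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_model(p, n):
    """N(p, sqrt(p*q/n)): model for the sample proportion p-hat."""
    return NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))


def mean_model(mu, sigma, n):
    """N(mu, sigma/sqrt(n)): model for the sample mean y-bar."""
    return NormalDist(mu=mu, sigma=sigma / sqrt(n))


poll = proportion_model(p=0.74, n=1001)
print(poll.mean, round(poll.stdev, 4))  # center and spread of the p-hat model
```

With p=.74 and n=1001 the spread comes out to about .0139.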

Page 14: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Sampling Distributions As Normal Distributions: Note

Note: If any of the parameters are unknown, use the corresponding statistics from a sample to approximate them.

Page 15: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Using Sampling Distributions

Sampling distributions can estimate the probability of getting a certain statistic in a random sample.

Use z-scores or the NormalCDF function on the TI-83/84.

Page 16: Sampling Distributions A review by Hieu Nguyen (03/27/06)

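Using Sampling Distributions: Z-Scores w/ Example

Use the z-score table to find appropriate probabilities. The z-score of a sample proportion gives the probability that p-hat falls below, or above, that value.

\[ z = \frac{\hat{p} - p}{SD(\hat{p})} \]

Example: Find the probability that a poll of 1001 Americans will return a sample proportion of .72 or less in support of Alito’s nomination.

\[ p = .74 \]

\[ SD(\hat{p}) = \sqrt{\frac{pq}{n}} = \sqrt{\frac{.74 \times .26}{1001}} = .0139 \]

\[ z = \frac{\hat{p} - p}{SD(\hat{p})} = \frac{.72 - .74}{.0139} = -1.443 \]

\[ P(\hat{p} < .72) = .0749 \]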
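The same z-score calculation can be reproduced in a few lines. This sketch replaces the printed z-table with the standard normal CDF from Python's `statistics` module, so the result can differ from the table value in the last digit because the table rounds z to two decimals.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.74, 1001            # population proportion and poll size
sd = sqrt(p * (1 - p) / n)   # SD(p-hat) = sqrt(pq/n)
z = (0.72 - p) / sd          # z-score of p-hat = .72
prob = NormalDist().cdf(z)   # P(p-hat < .72) from the standard normal

print(round(sd, 4), round(z, 3), round(prob, 3))
```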

Page 17: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Using Sampling Distributions: NormalCDF Function w/ Example

The syntax for the NormalCDF function is:

NormalCDF(lower limit, upper limit, μ, σ)

Example: Find the probability that a sample of size 25 will have a mean of 5 or less, given that the population has a mean of 7 and a standard deviation of 3.

\[ \mu = 7, \quad \sigma = 3, \quad n = 25 \]

\[ SD(\bar{y}) = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{25}} = .6 \]

\[ \text{NormalCDF}(0, 5, 7, .6) = .000429 \]
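For readers without a TI-83/84, a stand-in for NormalCDF can be sketched in Python. The function name `normalcdf` and its argument order simply imitate the calculator syntax quoted above; it is an assumption of this example, not a built-in Python function.

```python
from math import sqrt
from statistics import NormalDist


def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under N(mu, sigma) between lower and upper.
    Use float('-inf') or float('inf') for an open-ended limit."""
    model = NormalDist(mu, sigma)
    return model.cdf(upper) - model.cdf(lower)


sd = 3 / sqrt(25)                        # SD(y-bar) = sigma / sqrt(n) = .6
prob = normalcdf(0, 5, mu=7, sigma=sd)   # same limits as the slide
print(round(prob, 6))
```

The result agrees with the slide's .000429 to within rounding.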

Page 18: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Sampling Distribution for Two Populations

Use a difference sampling distribution if the question presents 2 different populations.

\[ \mu_{x-y} = \mu_x - \mu_y \]

\[ \sigma_{x-y} = \sqrt{\sigma_x^2 + \sigma_y^2} \]

Page 19: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Sampling Distribution for Two Populations: Example (adapted from AP Statistics – Chapter 9 – Sampling Distribution Multiple Choice Questions)

Medium oranges have a mean weight of 14oz and a standard deviation of 2oz. Large oranges have a mean weight of 18oz and a standard deviation of 3oz. Find the probability of finding a medium orange that weighs more than a large orange.

\[ \mu_x = 14, \quad \sigma_x = 2 \]

\[ \mu_y = 18, \quad \sigma_y = 3 \]

\[ \mu_{y-x} = \mu_y - \mu_x = 18 - 14 = 4 \]

\[ \sigma_{y-x} = \sqrt{\sigma_y^2 + \sigma_x^2} = \sqrt{3^2 + 2^2} = 3.606 \]

\[ \text{NormalCDF}(-\infty, 0, 4, 3.606) = .134 \]
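The orange example can be checked with a short script. The sketch assumes, as the slide's normal model implicitly does, that the two weights are normally distributed and independent of each other; the variable names are made up for the illustration.

```python
from math import hypot
from statistics import NormalDist

mu_medium, sd_medium = 14, 2   # medium oranges (x)
mu_large, sd_large = 18, 3     # large oranges (y)

# Difference (large - medium): means subtract, variances add.
diff = NormalDist(mu=mu_large - mu_medium,
                  sigma=hypot(sd_large, sd_medium))  # sqrt(3^2 + 2^2)

# A medium orange outweighs a large one when the difference is below 0.
prob = diff.cdf(0)
print(diff.mean, round(diff.stdev, 3), round(prob, 3))
```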

Page 20: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Example Problem (adapted from De Veaux Sampling Distribution Models Exercise #42)

Ayrshire cows average 47 pounds of milk a day, with a standard deviation of 6 pounds. For Jersey cows, the mean daily production is 43 pounds, with a standard deviation of 5 pounds. Assume that Normal models describe milk production for these breeds.

A) We select an Ayrshire at random. What’s the probability that she averages more than 50 pounds of milk a day?

B) What’s the probability that a randomly selected Ayrshire gives more milk than a randomly selected Jersey?

C) A farmer has 20 Jerseys. What’s the probability that the average production for this small herd exceeds 45 pounds of milk a day?

D) A neighboring farmer has 10 Ayrshires. What’s the probability that his herd average is at least 5 pounds higher than the average for the Jersey herd?

Page 21: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Example Problem Solution

First, check the assumptions:

Independent samples

Randomness

Sample represents less than 10% of population

Page 22: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Example Problem Solution

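A) Use the normal model to estimate the appropriate probability.

\[ \mu = 47, \quad \sigma = 6 \]

\[ z = \frac{x - \mu}{\sigma} = \frac{50 - 47}{6} = .5 \]

\[ P(x > 50) = .309 \]

\[ \text{NormalCDF}(50, \infty, 47, 6) = .309 \]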

Page 23: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Example Problem Solution

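B) Create a normal model for the difference between Ayrshires and Jerseys. Use the model to estimate the appropriate probability.

\[ \mu_a = 47, \quad \sigma_a = 6 \]

\[ \mu_j = 43, \quad \sigma_j = 5 \]

\[ \mu_{a-j} = \mu_a - \mu_j = 47 - 43 = 4 \]

\[ \sigma_{a-j} = \sqrt{\sigma_a^2 + \sigma_j^2} = \sqrt{6^2 + 5^2} = 7.810 \]

\[ z = \frac{x - \mu_{a-j}}{\sigma_{a-j}} = \frac{0 - 4}{7.810} = -.512 \]

\[ P(x > 0) = .696 \]

\[ \text{NormalCDF}(0, \infty, 4, 7.810) = .696 \]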

Page 24: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Example Problem Solution

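C) Create a sampling distribution model for n=20 Jerseys. Use the model to estimate the appropriate probability.

\[ \mu = 43, \quad \sigma = 5, \quad n = 20 \]

\[ SD(\bar{y}) = \frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{20}} = 1.118 \]

\[ z = \frac{\bar{y} - \mu}{SD(\bar{y})} = \frac{45 - 43}{1.118} = 1.789 \]

\[ P(\bar{y} > 45) = .0367 \]

\[ \text{NormalCDF}(45, \infty, 43, 1.118) = .0367 \]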

Page 25: Sampling Distributions A review by Hieu Nguyen (03/27/06)

Example Problem Solution

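D) First create a sampling distribution model for 10 random Ayrshires and 20 random Jerseys. Then create a normal model for the difference between the 10 Ayrshires and 20 Jerseys.

\[ \mu_j = 43, \quad \sigma_j = 5, \quad n_j = 20 \]

\[ SD(\bar{y}_j) = \frac{\sigma_j}{\sqrt{n_j}} = \frac{5}{\sqrt{20}} = 1.118 \]

\[ \mu_a = 47, \quad \sigma_a = 6, \quad n_a = 10 \]

\[ SD(\bar{y}_a) = \frac{\sigma_a}{\sqrt{n_a}} = \frac{6}{\sqrt{10}} = 1.897 \]

\[ \mu_{a-j} = \mu_a - \mu_j = 47 - 43 = 4 \]

\[ \sigma_{a-j} = \sqrt{SD(\bar{y}_a)^2 + SD(\bar{y}_j)^2} = \sqrt{1.897^2 + 1.118^2} = 2.202 \]

\[ z = \frac{x - \mu_{a-j}}{\sigma_{a-j}} = \frac{5 - 4}{2.202} = .454 \]

\[ P(x > 5) = .325 \]

\[ \text{NormalCDF}(5, \infty, 4, 2.202) = .325 \]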
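All four parts of the cow problem can be double-checked in one short script. This is only a sketch: the helper `upper_tail` and the variable names are made up for the illustration, and the breeds are assumed to be independent, as in the solution above. Small differences in the last digit against the table-based answers are expected.

```python
from math import hypot, sqrt
from statistics import NormalDist


def upper_tail(model, value):
    """P(X > value) under the given normal model."""
    return 1 - model.cdf(value)


ayrshire = NormalDist(mu=47, sigma=6)   # one Ayrshire cow
jersey = NormalDist(mu=43, sigma=5)     # one Jersey cow

# A) One Ayrshire gives more than 50 pounds.
prob_a = upper_tail(ayrshire, 50)

# B) Ayrshire minus Jersey is above 0 (means subtract, variances add).
prob_b = upper_tail(NormalDist(47 - 43, hypot(6, 5)), 0)

# C) Mean of 20 Jerseys exceeds 45 pounds.
sd_j = 5 / sqrt(20)
prob_c = upper_tail(NormalDist(43, sd_j), 45)

# D) Mean of 10 Ayrshires beats the mean of 20 Jerseys by more than 5.
sd_a = 6 / sqrt(10)
prob_d = upper_tail(NormalDist(47 - 43, hypot(sd_a, sd_j)), 5)

for part, prob in zip("ABCD", (prob_a, prob_b, prob_c, prob_d)):
    print(part, round(prob, 3))
```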