1 Chapter 6 Estimates and Sample Sizes 6-1 Estimating a Population Mean: Large Samples / σ Known...

Chapter 6: Estimates and Sample Sizes

6-1 Estimating a Population Mean: Large Samples / σ Known

6-2 Estimating a Population Mean: Small Samples / σ Unknown

6-3 Estimating a Population Proportion

6-4 Estimating a Population Variance: Will cover with chapter 8

Overview

This chapter presents:

Methods for estimating population means and proportions

Methods for determining sample sizes

Estimating a Population Mean: Large Samples / σ Known

Assumptions

A large sample is defined here as one with n > 30; σ is assumed to be known.

Data collected carelessly can be absolutely worthless, even if the sample is quite large.

Definitions

Estimator: a formula or process for using sample data to estimate a population parameter

Estimate: a specific value or range of values used to approximate some population parameter

Point Estimate: a single value (or point) used to approximate a population parameter

The sample mean x̄ is the best point estimate of the population mean µ.

Definition: Confidence Interval (or Interval Estimate)

a range (or an interval) of values used to estimate the true value of the population parameter

Lower # < population parameter < Upper #

As an example: Lower # < µ < Upper #

Why Confidence Intervals

A couple of points

1. Even though x̄ is the best estimate for µ and s is the best estimate for σ, they do not give us an indication of how good they are.

2. A confidence interval gives us a range of values based on

a) variation of the sample data

b) how accurate we want to be

3. The width of the range of values gives us an indication of how good the estimate is.

4. Half the width is called the Margin of Error (E). We will discuss how to calculate this later.

Definition: Degree of Confidence (level of confidence or confidence coefficient)

The proportion of times that the confidence interval actually contains the population parameter

Degree of Confidence = 1 - α, often expressed as a percentage value

usually 90%, 95%, or 99%, so (α = 10%), (α = 5%), (α = 1%)

Interpreting a Confidence Interval

Let: 1 - α = .95

98.08° < µ < 98.32°

Correct: We are 95% confident that the interval from 98.08 to 98.32 actually does contain the true value of µ.

This means that if we were to select many different samples of sufficient size and construct the confidence intervals, 95% of them would actually contain the value of the population mean µ.

Wrong: There is a 95% chance that the true value of µ will fall between 98.08 and 98.32. (µ is a fixed value, not a random quantity, so the probability statement applies to the sample-based interval, not to the population parameter.)

Confidence Intervals from 20 Different Samples

Simulations

http://www.ruf.rice.edu/~lane/stat_sim/conf_interval/index.html
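The "95% of intervals contain µ" interpretation can also be checked with a small simulation. The sketch below is only an illustration: the population values (µ = 98.2, σ = 0.62), the sample size, and the function name `coverage` are assumptions made for the demo, not values from these slides.

```python
import random
from statistics import NormalDist, mean

def coverage(trials=2000, n=40, mu=98.2, sigma=0.62, confidence=0.95, seed=1):
    """Fraction of simulated z intervals that contain the true mean mu.

    mu, sigma, n, and seed are hypothetical demo values.
    """
    rng = random.Random(seed)
    # Critical value z_{alpha/2} for the requested degree of confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        # Count the interval as a "hit" when it captures the true mean
        if x_bar - margin < mu < x_bar + margin:
            hits += 1
    return hits / trials

print(coverage())  # a proportion close to 0.95
```

Each run of the loop stands for one new sample; µ never moves, only the interval does.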

Definition: Critical Value

The number on the borderline separating sample statistics that are likely to occur from those that are unlikely to occur. The number zα/2 is a critical value that is a z score with the property that it separates an area of α/2 in the right tail of the standard normal distribution.

ENGLISH PLEASE!!!!

The Critical Value

[Figure: standard normal curve centered at z = 0 with an area of .025 in each tail; the critical value is found from the calculator.]

α/2 = 2.5% = .025, α = 5%

Critical Values

Finding zα/2 for 95% Degree of Confidence

α = 0.05, α/2 = 0.025

[Figure: standard normal curve with an area of .025 in each tail, bounded by z = -1.96 and z = 1.96.]

zα/2 = 1.96

Use calculator to find a z score of 1.96
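For readers without a TI calculator, the same critical value can be found with Python's standard library. This is a minimal sketch; the helper name `z_critical` is made up for illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2}: the z score with area alpha/2 in the right tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```

Note that the exact 99% value rounds to 2.576, while the table value used later in these slides is 2.575; the difference rarely changes a final answer.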

Finding zα/2 for other Degrees of Confidence

1. 1 -

2. 1 -

3. 1 -

4. 1 -

5. 1 - (will use on test for ease of calculation)

Find critical value and sketch

Examples:

Definition

Margin of Error is the maximum likely difference observed between the sample mean x̄ and the true population mean µ.

denoted by E

Confidence Interval (or Interval Estimate) for Population Mean µ (Based on Large Samples: n > 30)

x̄ - E < µ < x̄ + E

lower limit: x̄ - E, upper limit: x̄ + E

\[ E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]

When can we use zα/2?

If n > 30 and we know σ.

If n ≤ 30, the population must have a normal distribution and we must know σ.

Knowing σ is largely unrealistic.

Round-Off Rule for Confidence Intervals Used to Estimate µ

1. When using the original set of data, round the confidence interval limits to one more decimal place than used in the original set of data.

2. When the original set of data is unknown and only the summary statistics (n, x̄, s) are used, round the confidence interval limits to the same number of decimal places used for the sample mean.

Example: A study found the starting salaries of 100 college graduates who have taken a statistics course. The sample mean was $43,704 and the sample standard deviation was $9,879. Find the margin of error E and the 95% confidence interval. (The solution uses the sample standard deviation as the value of σ.)

n = 100

x̄ = 43704

σ = 9879

1 - α = 0.95

α/2 = 0.05/2 = 0.025

zα/2 = 1.96

\[ E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{9879}{\sqrt{100}} = 1936.3 \]

x̄ - E < µ < x̄ + E

43704 - 1936.3 < µ < 43704 + 1936.3

$41,768 < µ < $45,640

Based on the sample provided, we are 95% confident the population (true) mean of starting salaries is between $41,768 and $45,640.
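The salary calculation can be reproduced in a few lines. This is a sketch, not the calculator's method: the function name `z_interval` is invented here, and it uses the unrounded critical value rather than 1.96, so E differs from the slide in the first decimal place.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Return (lower limit, upper limit, margin of error) for a z interval."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)  # E = z_{alpha/2} * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin, margin

low, high, E = z_interval(43704, 9879, 100)
print(round(E, 1), round(low), round(high))
```

After rounding to whole dollars the limits agree with the hand calculation.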

TI-83 Calculator

Finding Confidence intervals using z

1. Press STAT

2. Cursor to TESTS

3. Choose ZInterval

4. Choose Input: STATS*

5. Enter σ, x̄, n, and the confidence level

6. Cursor to Calculate

*If your input is raw data, then input your raw data in L1 then use DATA

Width of Confidence Intervals

Test Question: What happens to the width of confidence intervals with changing confidence levels?
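One way to explore this question is to hold σ and n fixed and vary only the confidence level. The sketch below reuses the salary example's σ = 9879 and n = 100; the helper name `margin_of_error` is an assumption for illustration.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, confidence):
    """E = z_{alpha/2} * sigma / sqrt(n) for the given degree of confidence."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / sqrt(n)

# Same data, three confidence levels
margins = {level: margin_of_error(9879, 100, level) for level in (0.90, 0.95, 0.99)}
for level, E in margins.items():
    print(f"{level:.0%}: E = {E:.1f}, width = {2 * E:.1f}")
```

Only the critical value changes between the three lines of output, so any change in width comes from zα/2 alone.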

Finding the Point Estimate and E from a Confidence Interval

Point estimate of µ:

x̄ = [(upper confidence interval limit) + (lower confidence interval limit)] / 2

Margin of Error:

E = (upper confidence interval limit) - x̄

Example

Find x̄ and E

26 < µ < 40

x̄ = (40 + 26) / 2 = 33

E = 40 - 33 = 7

Use for #4 on hw
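The two formulas above translate directly into code. A minimal sketch, with a made-up function name:

```python
def point_estimate_and_margin(lower, upper):
    """Recover x-bar (the interval midpoint) and E (half the width)."""
    x_bar = (upper + lower) / 2
    margin = upper - x_bar
    return x_bar, margin

print(point_estimate_and_margin(26, 40))  # (33.0, 7.0)
```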

Sample Size for Estimating Mean µ

\[ E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]

(solve for n by algebra)

\[ n = \left[ \frac{z_{\alpha/2}\,\sigma}{E} \right]^2 \]

zα/2 = critical z score based on the desired degree of confidence

E = desired margin of error

σ = population standard deviation

Example: If we want to estimate the mean weight of plastic discarded by households in one week, how many households must be randomly selected to be 99% confident that the sample mean is within 0.25 lb of the true population mean? (A previous study indicates the standard deviation is 1.065 lb.)

α = 0.01

zα/2 = 2.575

E = 0.25

σ = 1.065

\[ n = \left[ \frac{z_{\alpha/2}\,\sigma}{E} \right]^2 = \left[ \frac{(2.575)(1.065)}{0.25} \right]^2 = 120.3 \]

which rounds up to 121 households.

If n is not a whole number, round it up to the next higher whole number.

We would need to randomly select 121 households to be 99% confident that this mean is within 1/4 lb of the

population mean.

Example: How large will the sample have to be if we want to decrease the margin of error from 0.25 to 0.2? Would you expect it to be larger or smaller?

α = 0.01

zα/2 = 2.575

E = 0.20

σ = 1.065

\[ n = \left[ \frac{z_{\alpha/2}\,\sigma}{E} \right]^2 = \left[ \frac{(2.575)(1.065)}{0.2} \right]^2 = 188.02 \]

which rounds up to 189 households.

We would need to randomly select a larger sample because we require a smaller margin of error.
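Both sample-size examples can be checked with a short function. This is a sketch under the slides' own inputs (table value z = 2.575, σ = 1.065); the function name is hypothetical.

```python
from math import ceil

def sample_size_for_mean(z, sigma, margin):
    """n = (z * sigma / E)^2, rounded up to the next whole number."""
    return ceil((z * sigma / margin) ** 2)

print(sample_size_for_mean(2.575, 1.065, 0.25))  # 121
print(sample_size_for_mean(2.575, 1.065, 0.20))  # 189
```

`ceil` implements the round-up rule: any fractional result moves to the next higher whole number.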

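What happens when E is doubled?

\[ n = \left[ \frac{z_{\alpha/2}\,\sigma}{E} \right]^2 = \frac{(z_{\alpha/2}\,\sigma)^2}{E^2} \]

E = 1: \[ n = \frac{(z_{\alpha/2}\,\sigma)^2}{1} \]

E = 2: \[ n = \frac{(z_{\alpha/2}\,\sigma)^2}{4} \]

Sample size n is decreased to 1/4 of its original value if E is doubled.

Larger errors allow smaller samples.

Smaller errors require larger samples.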

Class Assignment

1. Use OLDFAITHFUL Data in Datasets File

2. Construct a 95% and 90% confidence interval for the mean eruption duration. Write a conclusion for the 95% interval. Assume σ to be 58 seconds.

3. Compare the 2 confidence intervals. What can you conclude?

4. How large a sample must you choose to be 99% confident the sample mean eruption duration is within 10 seconds of the true mean?

Guidelines:

1. Choose a partner

2. Suggest having one person working the calculator and one writing

3. Due at the end of class (5 HW points)

4. Each person must turn in a paper

Estimating a Population Mean: Small Samples / σ Unknown

Small Samples Assumptions

1. n ≤ 30

2. The sample is a random sample.

3. The sample is from a normally distributed population.

Case 1 (σ is known): Largely unrealistic.

Case 2 (σ is unknown): Use the Student t distribution if the population is normal; if n is very large, use z.

Determining which distribution to use

Case 1 (σ is known):

n > 30: use z

n < 30 & normal: use z

n < 30 & skewed: neither

Case 2 (σ is unknown):

n very large: use z

n > 30: use t

n < 30 & normal: use t

n < 30 & skewed: neither

Determining which distribution to use

1. n = 150 ; x̄ = 100 ; s = 15 ; skewed distribution

2. n = 8 ; x̄ = 100 ; s = 15 ; normal distribution

3. n = 8 ; x̄ = 100 ; s = 15 ; skewed distribution

4. n = 150 ; x̄ = 100 ; σ = 15 ; skewed distribution

5. n = 8 ; x̄ = 100 ; σ = 15 ; skewed distribution
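The decision rules for the two cases can be written as a small function, which is handy for checking the practice problems. This is a sketch with two stated assumptions: n = 30 is treated as a small sample, and the "n very large, use z" shortcut for unknown σ is left out because the slides give no cutoff for "very large". The function name is made up.

```python
def choose_distribution(n, sigma_known, population_normal):
    """Return "z", "t", or "neither" following the rules listed above.

    Assumptions: n <= 30 counts as small; the "very large n" shortcut
    for unknown sigma is omitted because no cutoff is given.
    """
    if n > 30:
        return "z" if sigma_known else "t"
    if population_normal:
        return "z" if sigma_known else "t"
    return "neither"

# Practice problems 1-5: (n, sigma known?, population normal?)
problems = [(150, False, False), (8, False, True), (8, False, False),
            (150, True, False), (8, True, False)]
for number, args in enumerate(problems, start=1):
    print(number, choose_distribution(*args))
```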

Important Facts about the Student t Distribution

1. Developed by William S. Gosset in 1908

2. Density function is complex

3. Shape is determined by the sample size n

4. Has the same general symmetric bell shape as the normal distribution but it reflects the greater variability (with wider distributions) that is expected with small samples.

5. The Student t distribution has a mean of t = 0, but the standard deviation varies with the sample size and is always greater than 1

6. Is essentially the normal distribution for large n. For values of n > 30, the differences are so small that we can use the critical z or t value.

[Figure: Student t Distributions for n = 3 and n = 12, plotted with the standard normal distribution. The t distributions show greater variability than the standard normal due to small sample size.]

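Student t Distribution

If the distribution of a population is essentially normal, then the distribution of

\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]

is essentially a Student t distribution for all samples of size n. It is used to find critical values denoted by tα/2.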

Book Definition: Degrees of Freedom (df)

Corresponds to the number of sample values that can vary after certain restrictions have been imposed on all data values.

This doesn't help me, how about you?

Definition: Degrees of Freedom (df)

In general, the degrees of freedom of an estimate is equal to the number of independent scores (n) that go into the estimate minus the number of parameters estimated.

In this section

df = n - 1 because we are estimating µ with x̄

Table A-3 / Calculators / Excel

Table from website

TI-84 (only)

Excel function (tinv)

Table A-3 t Distribution

Degrees of   .005 (one tail)  .01 (one tail)   .025 (one tail)  .05 (one tail)   .10 (one tail)   .25 (one tail)
freedom      .01 (two tails)  .02 (two tails)  .05 (two tails)  .10 (two tails)  .20 (two tails)  .50 (two tails)
1            63.657           31.821           12.706           6.314            3.078            1.000
2            9.925            6.965            4.303            2.920            1.886            .816
3            5.841            4.541            3.182            2.353            1.638            .765
4            4.604            3.747            2.776            2.132            1.533            .741
5            4.032            3.365            2.571            2.015            1.476            .727
6            3.707            3.143            2.447            1.943            1.440            .718
7            3.500            2.998            2.365            1.895            1.415            .711
8            3.355            2.896            2.306            1.860            1.397            .706
9            3.250            2.821            2.262            1.833            1.383            .703
10           3.169            2.764            2.228            1.812            1.372            .700
11           3.106            2.718            2.201            1.796            1.363            .697
12           3.054            2.681            2.179            1.782            1.356            .696
13           3.012            2.650            2.160            1.771            1.350            .694
14           2.977            2.625            2.145            1.761            1.345            .692
15           2.947            2.602            2.132            1.753            1.341            .691
16           2.921            2.584            2.120            1.746            1.337            .690
17           2.898            2.567            2.110            1.740            1.333            .689
18           2.878            2.552            2.101            1.734            1.330            .688
19           2.861            2.540            2.093            1.729            1.328            .688
20           2.845            2.528            2.086            1.725            1.325            .687
21           2.831            2.518            2.080            1.721            1.323            .686
22           2.819            2.508            2.074            1.717            1.321            .686
23           2.807            2.500            2.069            1.714            1.320            .685
24           2.797            2.492            2.064            1.711            1.318            .685
25           2.787            2.485            2.060            1.708            1.316            .684
26           2.779            2.479            2.056            1.706            1.315            .684
27           2.771            2.473            2.052            1.703            1.314            .684
28           2.763            2.467            2.048            1.701            1.313            .683
29           2.756            2.462            2.045            1.699            1.311            .683
Large (z)    2.575            2.327            1.960            1.645            1.282            .675

Critical z Value vs Critical t Values

See “t distribution pdf.xls”

Finding tα/2 for the following Degrees of Confidence and Sample Sizes

1. 1 - n = 12

2. 1 - n = 15

3. 1 - n = 9

4. 1 - n = 20

Find critical value and sketch

Examples:

Confidence Interval for the Estimate of µ

Based on an Unknown σ and a Small Simple Random Sample from a Normally Distributed Population

x̄ - E < µ < x̄ + E

where

\[ E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} \]

tα/2 found in Table A-3

Using the Normal and t Distribution

Example: Let's do an example comparing z and t. Construct confidence intervals for each using the following data.

n = 16

x̄ = 50

s = 20

α/2 = 0.05/2 = 0.025

Now, we wouldn't use a z distribution here because of the small sample, but let's do it anyway and compare the width of that confidence interval to the width of one created using a t distribution.
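The comparison can be sketched as follows. The critical values are read from Table A-3 (1.960 from the "Large (z)" row and 2.132 for df = 15 in the .05 two-tails column); the helper name `interval` is an assumption for illustration.

```python
from math import sqrt

n, x_bar, s = 16, 50, 20
z_crit = 1.960  # Table A-3, "Large (z)" row, .05 two tails
t_crit = 2.132  # Table A-3, df = n - 1 = 15, .05 two tails

def interval(critical):
    """Interval x-bar +/- critical * s / sqrt(n)."""
    margin = critical * s / sqrt(n)
    return x_bar - margin, x_bar + margin

z_low, z_high = interval(z_crit)
t_low, t_high = interval(t_crit)
print(f"z: {z_low:.2f} < mu < {z_high:.2f}  (width {z_high - z_low:.2f})")
print(f"t: {t_low:.2f} < mu < {t_high:.2f}  (width {t_high - t_low:.2f})")
```

The t interval comes out wider, which reflects the extra uncertainty from estimating σ with s in a small sample.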

Example: A study of 12 Dodge Vipers involved in collisions resulted in repairs averaging $26,227 and a standard deviation of $15,873. Find the 95% interval estimate of µ, the mean repair cost for all Dodge Vipers involved in collisions. (The 12 cars' distribution appears to be bell-shaped.)

n = 12

x̄ = 26,227

s = 15,873

α/2 = 0.05/2 = 0.025

tα/2 = 2.201 (df = 11)

\[ E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} = \frac{(2.201)(15{,}873)}{\sqrt{12}} = 10{,}085.3 \]

x̄ - E < µ < x̄ + E

26,227 - 10,085.3 < µ < 26,227 + 10,085.3

$16,141.7 < µ < $36,312.3

We are 95% confident that this interval contains the average cost of repairing a Dodge Viper.
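The Viper calculation can be reproduced with a table lookup instead of a calculator. In this sketch the list `T_05_TWO_TAILS` copies the .05 two-tails column of Table A-3 for df = 1 to 29; the list name and function name are made up for illustration, and the function only covers 95% intervals.

```python
from math import sqrt

# Table A-3, .025 one tail / .05 two tails, for df = 1..29
T_05_TWO_TAILS = [
    12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
    2.201, 2.179, 2.160, 2.145, 2.132, 2.120, 2.110, 2.101, 2.093, 2.086,
    2.080, 2.074, 2.069, 2.064, 2.060, 2.056, 2.052, 2.048, 2.045,
]

def t_interval_95(x_bar, s, n):
    """Return (lower, upper, E) for a 95% t interval with df = n - 1."""
    df = n - 1
    if not 1 <= df <= len(T_05_TWO_TAILS):
        raise ValueError("df is outside the range covered by the table")
    t_crit = T_05_TWO_TAILS[df - 1]
    margin = t_crit * s / sqrt(n)  # E = t_{alpha/2} * s / sqrt(n)
    return x_bar - margin, x_bar + margin, margin

low, high, E = t_interval_95(26227, 15873, 12)
print(round(E, 1), round(low, 1), round(high, 1))
```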

TI-83 Calculator

Finding Confidence intervals using t

1. Press STAT

2. Cursor to TESTS

3. Choose TInterval

4. Choose Input: STATS*

5. Enter s, x̄, n, and the confidence level

6. Cursor to Calculate

*If your input is raw data, then input your raw data in L1 then use DATA

Estimating a Population Proportion

Assumptions

1. The sample is a random sample.

2. The conditions for the binomial distribution are satisfied (see Section 4-3).

3. The normal distribution can be used to approximate the distribution of sample proportions because np ≥ 5 and nq ≥ 5 are both satisfied.

Notation for Proportions

p = population proportion

p̂ = x/n = sample proportion of x successes in a sample of size n (pronounced 'p-hat')

q̂ = 1 - p̂ = sample proportion of failures in a sample of size n

Definition: Point Estimate

The sample proportion p̂ is the best point estimate of the population proportion p.

Confidence Interval for Population Proportion

p̂ - E < p < p̂ + E

where

\[ E = z_{\alpha/2} \sqrt{\frac{\hat{p}\,\hat{q}}{n}} \]

Round-Off Rule for Confidence Interval Estimates of p

Round the confidence interval limits to

three significant digits.
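A proportion interval can be sketched in the same way as the intervals for a mean. The counts below (675 successes in 1000 trials) are hypothetical numbers chosen only to exercise the formula, and the function name is made up; the np ≥ 5 and nq ≥ 5 check is applied to the sample counts.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, confidence=0.95):
    """Return (lower, upper, E) for a population proportion."""
    p_hat = x / n
    q_hat = 1 - p_hat
    # Normal approximation requirement, checked with the sample counts
    if x < 5 or n - x < 5:
        raise ValueError("normal approximation not justified")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * q_hat / n)  # E = z_{alpha/2} * sqrt(p-hat * q-hat / n)
    return p_hat - margin, p_hat + margin, margin

low, high, E = proportion_interval(x=675, n=1000)  # hypothetical counts
print(f"{low:.3f} < p < {high:.3f}")
```

The format string rounds the limits to three decimal places, which matches three significant digits for proportions between 0.1 and 1.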

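Determining Sample Size

\[ E = z_{\alpha/2} \sqrt{\frac{\hat{p}\,\hat{q}}{n}} \]

(solve for n by algebra)

\[ n = \frac{(z_{\alpha/2})^2\,\hat{p}\,\hat{q}}{E^2} \]

Sample Size for Estimating Proportion p

When an estimate p̂ is known:

\[ n = \frac{(z_{\alpha/2})^2\,\hat{p}\,\hat{q}}{E^2} \]

When no estimate p̂ is known:

\[ n = \frac{(z_{\alpha/2})^2 \cdot 0.25}{E^2} \]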

Example: We want to determine, with a margin of error of two percentage points, the percentage of Americans who own their house. Assuming that we want 90% confidence in our results, how many Americans must we survey? An earlier study indicates 67.5% of Americans own their own home.

\[ n = \frac{(z_{\alpha/2})^2\,\hat{p}\,\hat{q}}{E^2} = \frac{(1.645)^2 (0.675)(0.325)}{(0.02)^2} = 1483.8215 \]

which rounds up to 1484 Americans. (The value 1483.8215 comes from the unrounded critical value z ≈ 1.64485; 1.645 is its rounded form.)

To be 90% confident that our sample percentage is within two percentage points of the true percentage for all Americans, we should randomly select and survey 1484 Americans.

Round-Off Rule for Sample Size n

When finding the sample size n, if the result is not a whole number, always increase the value of n to the next larger whole number.

n = 1483.8215 = 1484 (rounded up)

Example: We want to determine, with a margin of error of two percentage points, the percentage of Americans who own their house. Assuming that we want 90% confidence in our results, how many Americans must we survey? There is no prior information suggesting a possible value for the sample percentage.

\[ n = \frac{(z_{\alpha/2})^2 \cdot 0.25}{E^2} = \frac{(1.645)^2 (0.25)}{(0.02)^2} = 1690.9647 \]

which rounds up to 1691 Americans. (As before, the value shown comes from the unrounded z ≈ 1.64485.)

With no prior information, we need a larger sample to achieve the same results with 90% confidence and an error of no more than 2%.
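Both home-ownership examples can be checked with one function. This is a sketch with a hypothetical function name; it uses the unrounded critical value from Python's standard library, which reproduces the figures 1483.8215 and 1690.9647 before rounding up.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence, p_hat=None):
    """n = z^2 * p-hat * q-hat / E^2, or z^2 * 0.25 / E^2 with no estimate."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    return ceil(z ** 2 * pq / margin ** 2)  # always round up

print(sample_size_for_proportion(0.02, 0.90, p_hat=0.675))  # 1484
print(sample_size_for_proportion(0.02, 0.90))               # 1691
```

The value 0.25 is the largest possible product p̂q̂ (reached at p̂ = 0.5), which is why the no-estimate case always asks for the larger sample.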

TI-83 Calculator

Finding Confidence intervals using z (proportions)

1. Press STAT

2. Cursor to TESTS

3. Choose 1-PropZInt

4. Enter x, n, and the confidence level

5. Cursor to Calculate
