Calculating sampling error

• Actually, we’ll calculate what is known (technically) as the confidence interval. This is the “+ or -” number that you hear when sample results are reported.

• The formula is:

1.96 √[(1-f)(p)(1-p)/(n-1)], where

f = n/N

n = sample size

N = population size

p = proportion (a sample “response”)

• The questions were asked over a period from…. The methodology was used to select and question a representative sample of randomly selected adults. The margin of error is +/- 3 points.

• This ABC News/Washington Post poll was conducted by telephone Jan. 26-31, 2005, among a random national sample of 1,204 adults. The results have a three-point error margin.
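As a rough check, the full formula can be sketched in a few lines of Python. The function name `margin_of_error` and its parameters are illustrative, not from the slides, and the sketch assumes a simple random sample. It is applied here to the sample of 1,204 adults in the poll quoted above, with p = .5 and a population large enough that f is treated as zero.

```python
import math

def margin_of_error(p, n, N=None, z=1.96):
    """Half-width of the interval as a proportion: z * sqrt((1-f) p (1-p) / (n-1)).

    N is the population size; if it is omitted, f = n/N is taken to be 0
    (an assumption that fits a very large population).
    """
    f = n / N if N is not None else 0.0
    return z * math.sqrt((1 - f) * p * (1 - p) / (n - 1))

# Sample size from the ABC News/Washington Post example; p = .5 is an assumption.
moe = margin_of_error(p=0.5, n=1204)
print(f"{100 * moe:.1f} percentage points")
```

This comes out a little under three points, in line with the rounded figure the poll reports.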

But 1-f has little effect unless n is large in comparison to N. This is unlikely when the population is large. Examples:

If N = 100,000,000 and n = 1,500, 1-f = .999985.

If N = 250,000 and n = 1,500, 1-f = .994.

So, unless n is (impossibly) large, or N is small, 1-f can be ignored. (When n/N is about .15 or above, then one needs to think about it.)
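The two examples can be reproduced with a short sketch. The helper name `one_minus_f` is made up for illustration, and the third population size (10,000) is a hypothetical case added to show what happens when n/N reaches .15. The square root is printed as well, since that is the factor that actually multiplies the error.

```python
import math

def one_minus_f(n, N):
    """Return 1 - f, where f = n/N is the fraction of the population sampled."""
    return 1 - n / N

n = 1500
# The first two population sizes are from the examples; 10,000 is hypothetical.
for N in (100_000_000, 250_000, 10_000):
    term = one_minus_f(n, N)
    print(f"N = {N:>11,}  1-f = {term:.6f}  sqrt(1-f) = {math.sqrt(term):.4f}")
```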

• So, now we’re left with

1.96 √[(p)(1-p)/(n-1)]

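• This is a profound result. Why? Because the population size N no longer appears: the error depends on the sample size n (and on p), not on how large the population is.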

• So, to repeat, we’re left with

1.96 √[(p)(1-p)/(n-1)]

This expression is largest if p = .50. So, let’s be conservative and estimate the largest error (confidence interval) for a given sample size. I.e., let p = .50.
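The claim that p = .50 gives the largest value is easy to check numerically. This is only a sketch: it scans a grid of proportions from .01 to .99, and the names `variance_term` and `grid` are illustrative.

```python
def variance_term(p):
    """The part of the formula that depends on p."""
    return p * (1 - p)

# Proportions .01, .02, ..., .99 (an arbitrary grid chosen for illustration).
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=variance_term)
print(best_p, variance_term(best_p))
```

Any other p gives a smaller product, so using p = .50 can only overstate the error, never understate it.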

• Doing this gives us:

1.96 √[(.5)(1-.5)/(n-1)] = 1.96 √[.25/(n-1)]

• But in this expression,

1.96 √[.25/(n-1)]

we can use n instead of n-1. This gives us

1.96 √(.25/n)

• Finally, since we’re approximating, we can use 2 instead of 1.96. And, to make it a percentage rather than a proportion, we can use 200 instead of 2.

• So, from 1.96 √(.25/n)

we get 200 √(.25/n), which simplifies to 100/√n

• This is a simple, useful approximation of the confidence interval (the + or - number) for a sample of size n.

• Example:

• n = 500, c.i. = 4.47

(for comparison, 196 √[.25/(n-1)] = 4.39)
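To see how little the shortcuts cost, here is a small sketch that puts the approximation next to the unsimplified version for a few sample sizes. The function names are illustrative, and both functions assume p = .50.

```python
import math

def approx_margin_pct(n):
    """The approximation: 200 * sqrt(.25 / n), in percentage points."""
    return 200 * math.sqrt(0.25 / n)

def exact_margin_pct(n):
    """The version before rounding: 196 * sqrt(.25 / (n - 1))."""
    return 196 * math.sqrt(0.25 / (n - 1))

# Sample sizes chosen for illustration; 500 is the one from the example.
for n in (100, 500, 1000, 1500):
    print(n, round(approx_margin_pct(n), 2), round(exact_margin_pct(n), 2))
```

The approximation always runs slightly high, which is consistent with the conservative spirit of the derivation.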

• In short, to answer the question “approximately how accurate is a sample of 1000 respondents,” we simply calculate 200 √(.25/n). The answer is: 3.16, or about 3.2%.

• If you wish, you can do a little algebra and answer this question: If I want a c.i. of, say, ± 4%, what sample size do I need?

(the answer is 625)
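The algebra amounts to solving 200 √(.25/n) = c.i. for n, which gives n = (100 / c.i.)². A minimal sketch, with a made-up function name and rounding up to a whole respondent:

```python
import math

def required_sample_size(margin_pct):
    """Smallest n whose approximate margin, 200 * sqrt(.25 / n), is at most margin_pct."""
    return math.ceil((100 / margin_pct) ** 2)

print(required_sample_size(4))  # the case asked about above
print(required_sample_size(3))
```

Note how quickly the required n grows: halving the margin takes four times the sample.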

• Cautionary note: samples of, say, 1000 can represent a large population. But subgroups within that sample may be much smaller.

• It’s the subgroup size that matters.

• This is one reason why non-simple random samples are taken (to obtain larger subgroup sizes).
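A quick illustration of the subgroup point, using hypothetical numbers: a sample of 1,000 in which some subgroup of interest has only 150 respondents. The approximation is the same one derived above, and the function name is illustrative.

```python
import math

def approx_margin_pct(n):
    """Approximate margin in percentage points: 200 * sqrt(.25 / n)."""
    return 200 * math.sqrt(0.25 / n)

full_sample = 1000  # hypothetical total sample size
subgroup = 150      # hypothetical number of respondents in one subgroup

print(f"whole sample: +/- {approx_margin_pct(full_sample):.1f}")
print(f"subgroup:     +/- {approx_margin_pct(subgroup):.1f}")
```

Statements about the subgroup carry a margin well over twice that of the full sample.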

Point left out

• What I talked about—the ± number you hear reported—is the so-called 95% confidence interval. Suppose it’s three percent.

• What this means is that you can be 95% certain that the population proportion will be within 3 percentage points of the proportion you find in the sample.

• Example: In a survey, you find that 45% of the sample says that they are Republican. The point is made that the sample is accurate to within ± 2.5%.

• You can be confident that in the population, between 42.5% and 47.5% are Republican. (Strictly speaking, this would be correct 19 times out of 20, i.e., 95% of the time.)
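The “19 times out of 20” idea can be checked with a small simulation. This is a sketch under assumptions that are not in the slides: the true population proportion is set to 45%, each sample has 1,600 respondents (the size at which the approximate margin is ± 2.5 points), and 1,000 samples are drawn with a fixed seed. All names are illustrative.

```python
import math
import random

def coverage(true_p=0.45, n=1600, trials=1000, seed=0):
    """Fraction of simulated samples whose proportion lands within the margin of true_p."""
    rng = random.Random(seed)
    margin = 2 * math.sqrt(0.25 / n)  # the approximate margin, as a proportion
    hits = 0
    for _ in range(trials):
        # Draw one simple random sample of n yes/no responses.
        sample_p = sum(rng.random() < true_p for _ in range(n)) / n
        if abs(sample_p - true_p) <= margin:
            hits += 1
    return hits / trials

print(coverage())
```

The printed fraction should come out close to .95, though it will vary a little with the seed and the number of trials.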
