Sample Size

Page 1: Sample Size

Sample Size

Page 2: Sample Size

Review: Base Rate Neglect

On HW 4, I asked you to find a fallacy on the internet.

More than one of you found studies where scientists took data from thousands of people and found correlations.

Page 3: Sample Size

Misunderstanding

You said, “this is the base rate neglect fallacy. There are billions of people in the world, and these studies only looked at thousands of people. They are neglecting the base rate.”

But that is not the base rate neglect fallacy. And this is important to know: you shouldn’t ignore good science just because you’re confused about what the base rate fallacy is.

Page 4: Sample Size

Base Rate Neglect

First of all, the base rate neglect fallacy has nothing at all to do with the number of people there are in the world. Nothing.

It has to do with the probability of a variable taking on a certain value, for instance, the probability that someone’s height = 1.5m, the probability that terrorist = true (someone is a terrorist)…

Page 5: Sample Size

Base Rates

This is the “base rate” of people who are 1.5m tall, and the “base rate” of terrorists.

If 1 in 100 people are terrorists, then the rate of terrorists is 1 in 100 and the probability that a randomly selected person is a terrorist is 1 in 100.

Page 6: Sample Size

Base Rates

We call this the base rate, because it is the probability that someone is a terrorist when we don’t know anything else about them.

It might be that the base rate of terrorists is 1 in 100, but the rate of terrorists among people who are holding rocket launchers is 1 in 2, and the rate of terrorists among retirees is 1 in 500.

Page 7: Sample Size

Tests

The base rate neglect fallacy happens when we have a test that is meant to detect the value of a variable.

For example we might have a test that tells us whether someone has AIDS or not, or whether someone is driving over the speed limit, or whether they are drunk.

Page 8: Sample Size

Reliability of Tests

Here is the crucial fact. Please learn this:

As the base rate of X = x decreases, the proportion of positive results on tests for X = x that are false positives increases.

A positive test result is less reliable when the condition we are testing for is rare (low base rate).
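
To see why a low base rate hurts, here is a minimal sketch that computes the chance that a positive result is a true positive. The test characteristics are assumptions made up for illustration (a test that catches every real case and wrongly flags 1% of everyone else); they do not come from any real test.

```python
def positive_predictive_value(base_rate, sensitivity, false_positive_rate):
    """Probability that someone who tests positive really has the condition."""
    true_positives = base_rate * sensitivity
    false_positives = (1 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical test: 100% sensitivity, 1% false positive rate (assumed values).
for base_rate in (0.5, 0.1, 0.01, 0.001):
    ppv = positive_predictive_value(base_rate, sensitivity=1.0, false_positive_rate=0.01)
    print(f"base rate {base_rate}: P(condition | positive) = {ppv:.3f}")
```

The test itself never changes in this loop. Only the base rate does, and the trustworthiness of a positive result falls with it.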

Page 9: Sample Size

Base Rate Neglect Fallacy

The base rate neglect fallacy happens when:

1. There is a low base rate of some condition.

2. We have a test for that condition.

3. Someone tests positive.

4. We assume that means they have the condition, ignoring the unreliability of tests for conditions with low base rates.

Page 10: Sample Size

Prosecutor’s Fallacy

The base rate neglect fallacy is often called the prosecutor’s fallacy, as I shall explain.

Page 11: Sample Size

Murder!

Let’s suppose that there has been a murder.

There is almost no evidence to go on except that the police find one hair at the crime scene.

Page 12: Sample Size

You are the Suspect

If someone is the killer, there is a 100% chance that their DNA will match the hair’s DNA.

The police have a database that contains the DNA of everyone in Hong Kong.

They run the DNA in the hair through their database and discover that you are a match!

Page 13: Sample Size

Comprehension Question

If you have been following along you should be able to answer this question:

What is the probability that you are the murderer, given that you are a DNA match for the hair?

Page 14: Sample Size

Answer

If you said 100%, then you have just committed the base rate neglect fallacy.

The correct answer is “Much lower, because the base rate of people who committed this murder out of the Hong Kong population as a whole is 1 in 7 million.”

Page 15: Sample Size

Perfect Conditions for Fallacy

Here’s what we have:

1. A low base rate (only 1 person who committed this murder in the world).

2. A test for whether someone is the murderer.

3. You, who’ve tested positive on this test.

4. And the police who think you did it!

Page 16: Sample Size

Let’s Look at the Numbers

We know that if you are the murderer, then there is a 100% chance of a DNA match.

But what is the false positive rate? How likely is it that a randomly selected person will match the DNA?

Page 17: Sample Size

False Results

Here’s a quote from “False result fear over DNA tests,” Nick Paton Walsh, The Guardian:

“Researchers had asked the labs to match a series of DNA samples. They knew which ones were from the same person, but found that in over 1 per cent of cases the labs falsely matched samples, or failed to notice a match.”

Page 18: Sample Size

Let’s assume that half of the cases where “labs falsely matched samples, or failed to notice a match” were cases where they falsely matched samples.

So the probability of a false positive is ½ x 1% = 0.5%, or 5 in 1,000.

Page 19: Sample Size

Since there are 7 million people in Hong Kong, we expect about 0.5% x 7 million = 35,000 of them to match the hair’s DNA.

Actually, it’s 35,000 + 1, because the true killer is a match, and not by accident.

Page 20: Sample Size

So we expect that there are 35,001 DNA matches in all of Hong Kong.

And only one of them is the murderer. So what is the probability that you are the murderer?

1 in 35,001. That’s way less than 100%.
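
The same arithmetic can be written out as a short sketch. The numbers are the ones used above (7 million people, a 0.5% false positive rate, one killer); the variable names are just labels chosen for this example.

```python
population = 7_000_000        # people in the DNA database
false_positive_rate = 0.005   # 0.5%, i.e. half of the 1% error rate
true_killers = 1              # the killer always matches

expected_false_matches = false_positive_rate * population
expected_matches = expected_false_matches + true_killers
p_guilty_given_match = true_killers / expected_matches

print(f"expected matches: {expected_matches:,.0f}")
print(f"P(killer | match): {p_guilty_given_match:.6f}")
```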

Page 21: Sample Size

Important Things to Remember

There are three important things to remember:

1. If the test is more accurate (fewer false positives), then it’s more reliable.

2. If the base rate is higher, the test is more reliable.

3. If the police have other reasons to suspect you, the test is more reliable.

Page 22: Sample Size

1. If the test is more reliable…

Theoretically, DNA tests only return a false positive about 1 in 3 billion times.

In that case, we’d expect only .002 false positives in all of Hong Kong.

So your chances of being guilty would be 1 in 1.002, or 99.8%. Still, that’s lower than 100%.

Page 23: Sample Size

2. If the base rate is higher…

Maybe the person who died was stabbed 5,000 times, once each by 5,000 different people. So there are 5,000 murderers.

Then with the previous false positive number at 35,000, you have a 5,000 in 40,000 chance of being one of the killers, or 12.5%.
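
Both variations (a more accurate test, and more killers) fit one small function. This is a sketch using the same counting shortcut as the slides: expected false matches are taken over the whole population. The function name is made up for this example.

```python
def chance_of_guilt(population, false_positive_rate, true_killers=1):
    """Share of all expected DNA matches that belong to real killers."""
    expected_false_matches = population * false_positive_rate
    return true_killers / (true_killers + expected_false_matches)


# 1. A far more accurate test: false positives about 1 in 3 billion.
print(chance_of_guilt(7_000_000, 1 / 3_000_000_000))         # about 0.998

# 2. A higher base rate: 5,000 killers, with the 0.5% false positive rate.
print(chance_of_guilt(7_000_000, 0.005, true_killers=5_000))  # about 0.125
```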

Page 24: Sample Size

3. If the police have some other reason to suspect you…

To figure out your chances of being guilty, we looked at the probability that a randomly selected person from HK would be a DNA match. We were assuming you were randomly selected.

But what if you weren’t randomly selected? What if the police tested you because you had a reason to kill the victim?

Page 25: Sample Size

Reason to Suspect You

Then we would have to look at not the probability that a randomly selected person would match, but the probability that a person who had reason to kill the victim would match.

Suppose there are 5 people who had reasons to kill the victim, and the killer is one of them.

Page 26: Sample Size

Much Higher Chance

Then your chances are:

Let K = you’re the killer and M = you’re a match.

P(K | M) = [P(K) x P(M | K)] ÷ P(M)

= [(1/5) x 100%] ÷ P(M)

= 0.2 ÷ [(1 + 0.025) ÷ 5]

= 97.6%
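
Here is a sketch of the same calculation using Bayes’ theorem, with P(M) expanded by the law of total probability. It assumes the 0.5% false positive rate from earlier and 5 equally likely suspects. It gives about 98.0%, slightly above the 97.6% figure, because the shortcut used in the slide counts all five suspects, the killer included, as possible false matches.

```python
def posterior(prior, p_match_if_killer, p_match_if_innocent):
    """P(K | M) via Bayes' theorem."""
    # P(M) = P(K) * P(M | K) + P(not K) * P(M | not K)
    p_match = prior * p_match_if_killer + (1 - prior) * p_match_if_innocent
    return prior * p_match_if_killer / p_match


p_killer_given_match = posterior(
    prior=1 / 5,               # 5 suspects, one of them is the killer
    p_match_if_killer=1.0,     # the killer always matches
    p_match_if_innocent=0.005, # assumed false positive rate
)
print(f"{p_killer_given_match:.1%}")
```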

Page 27: Sample Size

SAMPLING

Page 28: Sample Size

Now we know what the base rate neglect fallacy is (hopefully).

But this still doesn’t answer our question: how many people do we need in our scientific study to reliably generalize the results to everyone?

Page 29: Sample Size

For example, if I want to know whether increased economic dependence in men is correlated with increased infidelity, how many people do I need to study?

Surely one is too few. Is 10 fine? Do I need 100? A million?

Page 30: Sample Size

Sample

In statistics, the people who we are studying are called the sample. (Or if I’m studying the outcomes of coin flips, my sample is the coin flips that I’ve looked at. Or if I’m studying penguins, it’s the penguins I’ve studied.)

Our question is then: what sample size is needed for a result that applies to the population?

Page 31: Sample Size

Evaluating Evidence

Well, remember what we learned last class. There are two measures of success for a study:

Statistical significance: how likely would my results be if they were just due to random chance? Does the study rule out the null hypothesis?

Page 32: Sample Size

Evaluating Evidence

Effect size: If I find that A and B are positively correlated, how much does the value of A affect B? What’s the percentage difference in the odds/probability of B as we vary the odds/probability of A?

Page 33: Sample Size

Two Questions

So there are really two questions we’re asking:

How many people do I need to study to obtain statistically significant results?

How big should my sample be to accurately estimate effect sizes in the population at large?

Page 34: Sample Size

Law of Large Numbers

Luckily, we do know that more is always better.

The “Law of Large Numbers” says that if you make a large number of observations, the average of the results should be close to the expected value.

(There is no “Law of Small Numbers”)

Page 35: Sample Size

Average of Dice Rolls
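
You can watch the Law of Large Numbers at work with a quick simulation. This is only a sketch: it uses Python’s pseudo-random generator with a fixed seed (an arbitrary choice, so the run is repeatable) and tracks the running average of simulated die rolls.

```python
import random


def running_average_of_rolls(num_rolls, seed=0):
    """Return the average after each roll of a simulated fair die."""
    rng = random.Random(seed)
    total = 0
    averages = []
    for i in range(1, num_rolls + 1):
        total += rng.randint(1, 6)   # one roll, faces 1 to 6
        averages.append(total / i)
    return averages


averages = running_average_of_rolls(100_000)
for n in (10, 100, 1_000, 100_000):
    print(f"after {n:>7,} rolls: average = {averages[n - 1]:.3f}")
```

The early averages jump around, while the later ones settle near 3.5, the expected value of a single roll.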

Page 36: Sample Size

Example

Let’s think about a particular problem.

Suppose we are having an election between Mitt and Barack and we want to know how many people in the population plan to vote for Mitt.

How many people do we need to ask?

Page 37: Sample Size

Non-Random Samples

The first thing we should realize is that it’s not going to do us any good to ask a non-random group of people.

Suppose everyone who goes to ILoveMitt.com is voting for Mitt. If I ask them, it will seem like 100% of the population will vote for Mitt, even if only 3% will really vote for him.

Page 38: Sample Size

Internet Polls

(Important Critical Thinking Lesson:

Internet polls are not trustworthy. They are biased toward people who have the internet, people who visit the site that the poll is on, and people who care enough to vote on a useless internet poll.)

Page 39: Sample Size

Representative Samples

The opposite of a biased sample is a representative sample.

A perfectly representative sample is one where if n% of the population is X, then n% of the sample is X, for every X.

For example, if 10% of the population smokes, 10% of the sample smokes.

Page 40: Sample Size

Random Sampling

One way to get a representative sample is to randomly select people from the population, so that each has a fair and equal chance of ending up in the sample.

For example, when we randomize our experiments, we randomly sample the participants to obtain our experimental group. (Ideally our participants are randomly sampled from the population at large.)

Page 41: Sample Size

Problems with Random Sampling

Random sampling isn’t a cure-all, however.

For example, if I randomly select 10 people from a (Western) country, on average I’ll get 5 men and 5 women. On average.

But, on any particular occasion, I might select (randomly) 7 men and 3 women, or 4 men and 6 women.
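
How often does a random sample of 10 come out perfectly balanced? A rough sketch, assuming each person picked is a man with probability 50%, independently of the others (reasonable when the population is large):

```python
from math import comb


def prob_men(k, n=10, p=0.5):
    """Binomial probability of exactly k men among n randomly chosen people."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


print(f"exactly 5 men: {prob_men(5):.3f}")   # 252/1024, about 0.246
print(f"exactly 7 men: {prob_men(7):.3f}")   # 120/1024, about 0.117
```

So under these assumptions an even 5/5 split happens only about a quarter of the time.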

Page 42: Sample Size

Stratified Sampling

One way to fix these problems would be to randomly sample 5 women and randomly sample 5 men. Then I would always have an even split between men and women, and my men would be randomly drawn from the group of men, while my women were randomly drawn from the group of women.
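
A minimal sketch of stratified sampling follows. The population here is invented for the example (1,000 made-up records with a "sex" field), and the function name is hypothetical. The idea is simply to split the population into groups and draw a random sample of the same size from each group.

```python
import random


def stratified_sample(population, key, per_group, seed=0):
    """Randomly draw per_group members from each group defined by key."""
    rng = random.Random(seed)
    groups = {}
    for person in population:
        groups.setdefault(key(person), []).append(person)
    sample = []
    for members in groups.values():
        sample.extend(rng.sample(members, per_group))  # random within the group
    return sample


# Made-up population: half "F", half "M".
population = [{"id": i, "sex": "F" if i % 2 else "M"} for i in range(1000)]
sample = stratified_sample(population, key=lambda person: person["sex"], per_group=5)
print(sum(person["sex"] == "F" for person in sample), "women out of", len(sample))
```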

Page 43: Sample Size

Example

Let’s continue with our example.

We’re convinced that we should randomly sample n individuals from the population of women and n from the population of men.

Still, what is that number n?

Page 44: Sample Size

We know that, of the people in our sample, X% will vote for Mitt.

We want to know, of the people in the population, what percent will vote for Mitt?

We can never know that it is exactly X, unless we ask everyone. But we can increase our confidence.

Page 45: Sample Size

Confidence Interval

What we can do is find out, based on our sample, that we are Z% sure (confident) that the percentage of people who will vote for Mitt is between A% and B%.

For example, we can be 90% confident that the percentage of people who vote for Mitt is between 44% and 48%.

Page 46: Sample Size

Confidence Interval

This would mean we think there’s a 10% chance that either less than 44% or more than 48% of people vote for Mitt.

The very same data might warrant us in saying that we are 95% confident that the percentage of people who vote for Mitt is between 40% and 52%.
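
To make the trade-off concrete, here is a sketch that computes two confidence intervals from the same data. The poll result (460 of 1,000 people saying they will vote for Mitt) is invented for illustration, and the interval uses the standard normal approximation, which is an assumption on my part. The numbers it produces will not match the illustrative 44% to 48% and 40% to 52% figures above.

```python
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # e.g. about 1.96 for 95%
    margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin


# Hypothetical poll: 460 of 1,000 respondents plan to vote for Mitt.
intervals = {c: proportion_confidence_interval(460, 1000, c) for c in (0.90, 0.95)}
for confidence, (low, high) in intervals.items():
    print(f"{confidence:.0%} confident: between {low:.1%} and {high:.1%}")
```

Higher confidence always costs a wider interval when the data stay the same.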

Page 47: Sample Size

Sample Size Determination

So if we want to know how many people to look at, we should determine:

1. What level of confidence we want.

2. How big we want our confidence interval to be.

Page 48: Sample Size

Common Choices

Common choices for these numbers are:

1. We want to be 95% confident of our estimation.

2. We want our confidence interval to be 6% wide (e.g. between 42% and 48%).

Page 49: Sample Size

Expected Value, Deviation

Each variable has an expected value (for example, 3.5 is the expected value of a die roll, the average of all the sides of a die).

Each variable has an expected deviation from its expected value: how far are the die values (1, 2, 3, 4, 5, 6) from the expected value (3.5)? The answer is 1.5 on average.

Page 50: Sample Size

Variance

The variance is the expected squared deviation:

[(6 – 3.5)^2 + (5 – 3.5)^2 + (4 – 3.5)^2 + (3 – 3.5)^2 + (2 – 3.5)^2 + (1 – 3.5)^2] ÷ 6

Or about 2.9 for a die. Don’t worry, you don’t need to know this.

Page 51: Sample Size

Standard Deviation

The standard deviation is the square root of the variance. So √2.9 for a die.

The important point is that we can use this number, the standard deviation, to figure out how many people we need in our sample.
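
If you want to check the die numbers yourself, this short sketch computes the expected value, the average deviation, the variance, and the standard deviation directly from the six faces:

```python
faces = [1, 2, 3, 4, 5, 6]

expected_value = sum(faces) / len(faces)
# Average distance of a face from the expected value.
mean_abs_deviation = sum(abs(f - expected_value) for f in faces) / len(faces)
# Variance: average squared distance from the expected value.
variance = sum((f - expected_value) ** 2 for f in faces) / len(faces)
std_dev = variance ** 0.5

print(expected_value, mean_abs_deviation, round(variance, 2), round(std_dev, 2))
```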

Page 52: Sample Size

Solving for Sample Size

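If we want our 95% confidence interval to be 6% wide, then, since a 95% confidence interval reaches about 2 standard deviations on either side of the estimate, we need:

4 x standard deviation = 6%

The standard deviation of an estimate of a proportion from a sample of n people is at most √(0.25/n). That is the worst case, when the true proportion is 50%.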

Page 53: Sample Size

A Little Bit of Math

So,

4 x standard deviation = 6%

4 x √(0.25/n) = 0.06

√(0.25/n) = 0.06/4 = 0.015

0.25/n = 0.015^2 = 0.000225

n = 0.25/0.000225 = 1,111
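
The algebra can be wrapped in a small function. This is a sketch under the same simplifications as the slide: the interval is treated as 4 standard deviations wide and the proportion is assumed to be 50% (the worst case). The function name and parameters are made up for this example.

```python
import math


def sample_size_for_width(width, p=0.5, sds_per_width=4):
    """Solve width = sds_per_width * sqrt(p * (1 - p) / n) for n."""
    return p * (1 - p) * (sds_per_width / width) ** 2


n = sample_size_for_width(0.06)
print(f"n = {n:.2f}")                       # about 1111.11
print("rounded up:", math.ceil(n))          # you cannot survey a fraction of a person
```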

Page 54: Sample Size

The Important Point

What’s the point?

The point is that you need about 1,000 people to be 95% sure that the percentages you estimate from the sample are within 3 percentage points (a 6% wide interval) of the actual voting behavior of the population.

Page 55: Sample Size

Things to Note

This doesn’t mean that studies with fewer than a thousand people can’t tell us anything.

What they tell us will just come with either less than 95% confidence or a confidence interval wider than 6%.

If a confidence interval 20% wide is fine, you only need 100 people.
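
Going the other way, this sketch shows how wide the 95% interval is for a few sample sizes. It uses the same simplifications as the sample size calculation: 4 standard deviations per interval and a worst-case proportion of 50%. The sample sizes in the loop are arbitrary picks.

```python
def interval_width(n, p=0.5, sds_per_width=4):
    """Approximate width of a 95% confidence interval for a proportion."""
    return sds_per_width * (p * (1 - p) / n) ** 0.5


for n in (100, 400, 1_111, 10_000):
    print(f"n = {n:>6,}: interval about {interval_width(n):.1%} wide")
```

Notice that to halve the width you need four times as many people.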

Page 56: Sample Size

The Base Rate Still Matters

It also doesn’t mean that 100 or 1,000 people is sufficient for any study.

When the value of the variable being studied is rare in the population, you need more people. For example, if it’s 1 in 1 million, then most samples of 1,000 won’t contain it, but that doesn’t mean it’s at 0 prevalence.
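
A quick sketch of that last example, assuming each person in the sample independently has a 1 in 1 million chance of showing the rare value:

```python
prevalence = 1 / 1_000_000
sample_size = 1_000

# Probability that nobody in the sample has the rare value.
p_no_cases = (1 - prevalence) ** sample_size
print(f"P(sample contains no cases) = {p_no_cases:.4f}")   # about 0.999
```

So roughly 999 samples out of 1,000 would show nothing at all, even though the condition really exists in the population.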