Drawing conclusions from data

Drawing Conclusions from Data: An introduction to statistical testing without equations

Metcalf Institute 15th Annual Science Immersion Workshop for Journalists

Jonathan Stray, Columbia University

You see a story in the data. Is it really there?

Why wouldn't there be a story?

•  You misunderstand how the data is collected
•  The data is incomplete or bad
•  The pattern is due to chance
•  The pattern is real, but it isn't a causal relationship
•  The data doesn't generalize the way you want it to

How was this data created?

Intentional or unintentional problems

What doesn't a Twitter map show?

NYC population, colored by income

"Interview the data"

Where do these numbers come from?
Who recorded them? How?
For what purpose was this data collected?
How do we know it is complete?
What are the demographics?
Is this the right way to quantify this issue?
Who is not included in these figures?
Who is going to look bad or lose money as a result of these numbers?
What arbitrary choices had to be made to generate the data?
Is the data consistent with other sources?
Who has already analyzed it?
Does it have known flaws?
Are there multiple versions?


What stats know-how gets you

•  You misunderstand how the data is collected
•  The data is incomplete or bad
•  The pattern is due to chance
•  The pattern is real, but it isn't a causal relationship
•  The data doesn't generalize the way you want it to

Statistical testing

Assumes the data is good, but includes an element of chance.

Then the first question is: is the pattern a coincidence... or not? Or: is "coincidence" consistent with the data?

Is this die loaded?

First rule of statistics

Smaller samples have more variance. That's why more data is always better, from the point of view of statistical testing. More data increases "statistical power."
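To see the first rule in action, here is a minimal sketch that rolls a simulated fair die in small and large batches and compares how much the fraction of sixes bounces around. The batch sizes (12 and 600 rolls), the number of batches, and the random seed are arbitrary choices for illustration, not figures from the talk.

```python
import random
import statistics

def six_fractions(rolls_per_sample, n_samples, rng):
    """Fraction of sixes in each of n_samples batches of fair-die rolls."""
    fractions = []
    for _ in range(n_samples):
        sixes = sum(1 for _ in range(rolls_per_sample) if rng.randint(1, 6) == 6)
        fractions.append(sixes / rolls_per_sample)
    return fractions

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable

small = six_fractions(12, 500, rng)    # 500 batches of 12 rolls each
large = six_fractions(600, 500, rng)   # 500 batches of 600 rolls each

# Both sets of fractions centre near 1/6; what differs is the scatter.
print("12 rolls per batch:  spread =", round(statistics.pstdev(small), 3))
print("600 rolls per batch: spread =", round(statistics.pstdev(large), 3))
```

The die is fair in both cases. A handful of rolls can easily look lopsided by chance alone, which is why a small sample is weak evidence that a die is loaded.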

Are these two dice loaded?

Two dice: non-uniform distribution
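The non-uniform shape comes from counting: a quick sketch that lists all 36 equally likely outcomes of two fair dice and tallies the totals. Nothing here is simulated; it is a plain enumeration.

```python
from collections import Counter
from itertools import product

# Tally the total for each of the 36 equally likely (die 1, die 2) outcomes.
totals = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Print a text histogram: one '#' per way of rolling that total.
for total in sorted(totals):
    ways = totals[total]
    print(f"{total:2d}: {'#' * ways} ({ways}/36)")
```

A total of 7 can be rolled six ways, while 2 and 12 can each be rolled only one way, so seeing many 7s from a pair of fair dice is expected rather than suspicious.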

The Null Hypothesis, H0

The paIern I see is just due to chance.

Distribution of data under H0 = what might the data look like if generated purely by chance?
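For a single die, the distribution under H0 can be worked out exactly by counting. The sketch below assumes a hypothetical experiment of 60 rolls that produced 16 sixes; both numbers are made up for illustration and do not come from the talk.

```python
from math import comb

def null_distribution(n_rolls, p_six=1 / 6):
    """Chance of seeing k sixes, for k = 0..n_rolls, if the die is fair."""
    return [
        comb(n_rolls, k) * p_six**k * (1 - p_six) ** (n_rolls - k)
        for k in range(n_rolls + 1)
    ]

n_rolls = 60         # hypothetical experiment, not from the slides
observed_sixes = 16  # hypothetical result; a fair die averages 10 sixes in 60 rolls

null = null_distribution(n_rolls)

# How often would a fair die give at least this many sixes?
tail = sum(null[observed_sixes:])
print(f"Chance of {observed_sixes} or more sixes from a fair die: {tail:.3f}")
```

The list `null` is the null distribution: every possible count of sixes, paired with how often pure chance would produce it.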

Comparing two sets of numbers

Let's say you measure the grades of students in two different classes and the averages are different. Is this evidence that something is different between the two classes?

Constructing the null distribution

We don't have a theoretical argument (as we did with the dice). But if the two classes are really the same, then we can switch students between them if we want, and the null hypothesis will still hold.

Constructing the null distribution

Observed data
Class A = 0.90 0.93 1.25 1.24 1.38 0.94 1.14 0.73 1.46
Class B = 1.15 0.88 0.90 0.74 1.21

Permuted data
Class A = 1.25 0.90 0.90 0.93 0.74 0.73 0.94 1.15 0.88
Class B = 1.21 1.14 1.46 1.38 1.24
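The shuffling idea can be sketched in a few lines using the observed class data above. The number of shuffles (5000) and the random seed are arbitrary assumptions, and the function name `shuffled_differences` is only illustrative.

```python
import random

class_a = [0.90, 0.93, 1.25, 1.24, 1.38, 0.94, 1.14, 0.73, 1.46]
class_b = [1.15, 0.88, 0.90, 0.74, 1.21]

def mean(values):
    return sum(values) / len(values)

def shuffled_differences(a, b, n_shuffles, rng):
    """Difference in class means for n_shuffles random reassignments."""
    pooled = a + b  # new list, so the original classes are left untouched
    diffs = []
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # hand out students to the two classes at random
        new_a, new_b = pooled[:len(a)], pooled[len(a):]
        diffs.append(mean(new_a) - mean(new_b))
    return diffs

rng = random.Random(0)  # fixed seed, an arbitrary choice
null_diffs = shuffled_differences(class_a, class_b, 5000, rng)
observed = mean(class_a) - mean(class_b)

print(f"observed difference (A minus B): {observed:.3f}")
print(f"shuffled differences range from {min(null_diffs):.3f} to {max(null_diffs):.3f}")
```

A histogram of `null_diffs` is the null distribution: the differences you would see if class membership made no difference at all.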

observed difference

How sure do we need to be?

These plots of the null distribution show us how often H0 will look like a pattern. This is the "p-value." If the p-value is lower, we have stronger evidence that what we see is "real." How low is low enough?
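With only 14 students, the p-value for the class example can be computed by trying every possible split instead of sampling random shuffles. This is a sketch under one assumption the slides don't state: it uses a two-sided test, counting splits that differ as much as the real one in either direction.

```python
from itertools import combinations

class_a = [0.90, 0.93, 1.25, 1.24, 1.38, 0.94, 1.14, 0.73, 1.46]
class_b = [1.15, 0.88, 0.90, 0.74, 1.21]

pooled = class_a + class_b
total = sum(pooled)
n_a, n_b = len(class_a), len(class_b)
observed = sum(class_a) / n_a - sum(class_b) / n_b

as_extreme = 0
n_splits = 0
# Try every way of choosing which 5 of the 14 students land in Class B.
for b_group in combinations(pooled, n_b):
    sum_b = sum(b_group)
    diff = (total - sum_b) / n_a - sum_b / n_b
    n_splits += 1
    # Two-sided (an assumption): count splits at least as far from zero
    # as the real one. The tiny tolerance guards against rounding error.
    if abs(diff) >= abs(observed) - 1e-12:
        as_extreme += 1

p_value = as_extreme / n_splits
print(f"{as_extreme} of {n_splits} splits are at least as extreme: p = {p_value:.3f}")
```

The p-value is simply the share of chance-only splits that look at least as striking as the real data.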

Two values

"Significance" is how sure you are that the effect you're seeing is real.
"Effect size" is how big the effect is.
Example: class grades differed by 3%, p<0.05
Warning: large significance doesn't mean a large effect!
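The warning can be made concrete with a rough calculation. The sketch below holds a tiny, hypothetical effect fixed (two groups whose averages differ by 3% of one standard deviation) and only changes the sample size. It relies on a normal approximation, which is a simplifying assumption and not a method from the talk; `approx_p_value` is an illustrative name.

```python
from math import erfc, sqrt

def approx_p_value(effect_in_sd, n_per_group):
    """Two-sided p-value for a difference in means (normal approximation)."""
    # The difference between two group averages has a standard error of
    # sqrt(2 / n) standard deviations, so the z-score grows with sqrt(n).
    z = effect_in_sd * sqrt(n_per_group / 2)
    return erfc(z / sqrt(2))

effect = 0.03  # hypothetical effect size, in standard deviations
for n in (100, 10_000, 1_000_000):
    print(f"n = {n:>9,} per group: p = {approx_p_value(effect, n):.2g}")
```

The effect never changes size, yet the p-value shrinks as the sample grows. With enough data, even a trivially small difference becomes highly significant.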

I see a trend. Is it "real"?

Looking for correlations

Suppose you want to know if more firearms correlate with more firearm homicides. First, a scatterplot.

Constructing the null distribution

Again, we don't have a theoretical distribution that tells us what the distributions of firearms and homicides should be if they're independent. But...

Constructing the null distribution

But... if the X and Y variables are truly independent, then switching which X goes with which Y won't make any difference.
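A minimal sketch of this shuffle for a correlation follows. The `x_values` and `y_values` below are placeholder numbers invented for illustration, not real firearm or homicide figures, and `shuffle_test`, the shuffle count, and the seed are all assumptions.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def shuffle_test(xs, ys, n_shuffles=2000, seed=0):
    """Observed correlation, and how often shuffled pairings match it."""
    rng = random.Random(seed)
    observed = pearson(xs, ys)
    shuffled_ys = list(ys)
    as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled_ys)  # break any real link between X and Y
        if abs(pearson(xs, shuffled_ys)) >= abs(observed) - 1e-12:
            as_extreme += 1
    # The +1 counts the real pairing as one of the possible arrangements.
    return observed, (as_extreme + 1) / (n_shuffles + 1)

# Placeholder values for illustration only (not real data).
x_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_values = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3, 6.8, 8.4, 8.9, 10.2]

r, p_value = shuffle_test(x_values, y_values)
print(f"correlation = {r:.3f}, p = {p_value:.4f}")
```

If shuffled pairings rarely produce a correlation as strong as the real one, "coincidence" is a poor explanation for the trend. That still says nothing about what caused it.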

A correlation puzzle

Suppose you discover that the students with the top 5% of standardized test scores come from smaller classes. Why?

Have we learned nothing?

Smaller samples will always have higher variance. So the smaller classes will have higher scores. They will also have lower scores. Protect yourself from reasoning errors: always plot null distributions.
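The puzzle is easy to reproduce in a simulation where class size has no effect whatsoever. In this sketch every student's score comes from the same distribution; the class sizes (10 and 40), the score scale (mean 70, standard deviation 10), and the seed are arbitrary assumptions.

```python
import random

rng = random.Random(0)  # fixed seed, an arbitrary choice

def class_average(size):
    """Average score of a class whose students are all drawn alike."""
    scores = [rng.gauss(70, 10) for _ in range(size)]
    return sum(scores) / size

# 1000 small classes and 1000 large ones, with no real difference between them.
classes = [("small", class_average(10)) for _ in range(1000)]
classes += [("large", class_average(40)) for _ in range(1000)]

classes.sort(key=lambda c: c[1])  # rank classes by average score
cutoff = len(classes) // 20       # 5% of all classes
bottom, top = classes[:cutoff], classes[-cutoff:]

small_in_top = sum(1 for kind, _ in top if kind == "small")
small_in_bottom = sum(1 for kind, _ in bottom if kind == "small")

print(f"small classes in the top 5%:    {small_in_top} of {cutoff}")
print(f"small classes in the bottom 5%: {small_in_bottom} of {cutoff}")
```

Small classes crowd both ends of the ranking. Looking only at the top 5% would suggest that small classes help, when the simulation contains no such effect.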

Have I really found the cause?

Suppose you apply a statistical test, and the smaller classes really are unlikely to have scores this high by chance. Was it really because of the smaller class size?

You will invent stories about your data

How correlation happens

•  X causes Y
•  Y causes X
•  random chance!
•  hidden variable causes X and Y
•  Z causes X and Y

Guns and firearm homicides?

•  if you have a gun, you're going to use it
•  if it's a dangerous neighborhood, you'll buy a gun
•  the correlation is due to chance

Beauty and responses

•  telling a woman she's beautiful doesn't work
•  if a woman is beautiful, 1) she'll respond less 2) people will tell her that

Beauty is a "confounding variable." The correlation is real, but you've misunderstood the causal structure.
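A confounding variable is easy to simulate. In this generic sketch a hidden variable Z drives both X and Y, while X and Y never influence each other. The variable names, the noise levels, and the sample size are illustrative assumptions and are not tied to the beauty example.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed, an arbitrary choice
n = 5000

z = [rng.gauss(0, 1) for _ in range(n)]     # the hidden variable
x = [zi + rng.gauss(0, 1) for zi in z]      # Z pushes X up or down
y = [zi + rng.gauss(0, 1) for zi in z]      # Z pushes Y too; X plays no part

r_xy = pearson(x, y)

# Take Z's contribution out of both variables and look again.
x_without_z = [xi - zi for xi, zi in zip(x, z)]
y_without_z = [yi - zi for yi, zi in zip(y, z)]
r_after = pearson(x_without_z, y_without_z)

print(f"correlation of X and Y:              {r_xy:.2f}")
print(f"correlation once Z is accounted for: {r_after:.2f}")
```

X and Y are clearly correlated, and a statistical test would confirm it, yet changing X would do nothing to Y. Only knowing about Z reveals the real structure.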

Generalizability

Suppose you apply a statistical test, and the smaller classes really are unlikely to have scores this high by chance. Will the same thing be true in other states?

Are those three students you interviewed really representative of all students?
Everyone you know is talking about it, but is everyone else?
What's the margin of error of this poll?
The statistics of generalizability: another time...

In Short

•  First ask about what the data means, where it came from, and if it's good.

•  Then ask about coincidence. Get a look at the null distribution.

•  If the correlation is significant, then ask about causality. Rule out each case.

•  Are your results standing in for things you don't actually have data on?
