Natural experiments: the basics - chato.cl · Natural experiments Thad Dunning: Natural Experiments...

Natural experiments: the basics Class Data Mining Technology for Business and Society Program M. Sc. Data Science University Sapienza University of Rome Semester Spring 2016 Lecturer Carlos Castillo http://chato.cl/ Sources: Thad Dunning: Natural Experiments in the Social Sciences. Cambridge University Press, 2012 [ link].


Page 1:

Natural experiments: the basics

Class: Data Mining Technology for Business and Society
Program: M. Sc. Data Science
University: Sapienza University of Rome
Semester: Spring 2016
Lecturer: Carlos Castillo http://chato.cl/

Sources:
● Thad Dunning: Natural Experiments in the Social Sciences. Cambridge University Press, 2012 [link].

Page 2:

Results from a “Natural Experiment”

Yulia Tyshchuk, Cindy Hui, Martha Grabowski, William A. Wallace: “Social Media and Warning Response Impacts in Extreme Events: Results from a Naturally Occurring Experiment” HICSS 2012

● “On April 6th, 2010, at 8:15 a.m., an armed perpetrator robbed Regina Check Cashing Corporation, located at 450 Hoosick Street in Troy, N.Y., which is about one mile away from the Rensselaer Polytechnic Institute (RPI) campus. Later on, the perpetrator was seen on campus, specifically, in the East Campus Athletic Village. The RPIAlert system was activated and the first ‘stay in shelter’ warning, via on campus loudspeakers, emails, phone calls, voice mails and text messages, was issued at 9:30 a.m. Two more ‘stay in shelter’ warnings were issued at 10:48 a.m. and 11:48 a.m. that day, before the ‘all clear’ message was issued at 12:52 p.m.”

● Paper describes Twitter's network evolution, keywords, etc. after the event

Page 3:

Natural experiments

Thad Dunning: Natural Experiments in the Social Sciences. Cambridge University Press, 2012 [link].

Page 4:

What is a randomized controlled experiment?

● Start with a population (= “study group”)
● Separate control and treatment groups at random
● Apply treatment to treatment group
● Measure outcomes in both groups
● Compare outcomes
● Profit!
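The steps above can be sketched in a few lines of Python. This is only an illustration on made-up data: the baseline outcomes, the group sizes and the constant `TRUE_EFFECT` are assumptions, not part of the lecture.

```python
import random
from statistics import mean

rng = random.Random(42)

# Hypothetical study group: each subject has a baseline outcome.
study_group = [rng.gauss(50.0, 10.0) for _ in range(10_000)]

TRUE_EFFECT = 2.0  # assumed treatment effect, for illustration only

# Separate treatment and control groups at random.
indices = list(range(len(study_group)))
rng.shuffle(indices)
half = len(indices) // 2
treatment_ids, control_ids = indices[:half], indices[half:]

# Apply treatment to the treatment group, then measure outcomes in both groups.
treated_outcomes = [study_group[i] + TRUE_EFFECT for i in treatment_ids]
control_outcomes = [study_group[i] for i in control_ids]

# Compare outcomes: the difference of group means estimates the effect.
estimated_effect = mean(treated_outcomes) - mean(control_outcomes)
print(f"estimated effect: {estimated_effect:.2f}")
```

Because assignment is random, the two groups have similar baselines on average, so the difference of means should land close to the assumed effect.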

Page 5:

Key elements of randomized controlled experiments

1. Randomized: assignment of subjects to treatment/control groups is done at random

2. Controlled: response of subjects assigned to treatment is compared to response of subjects assigned to control

3. Experiment: treatment received by treatment group is under the control of a researcher

Page 6:

Why natural experiments?

● In some contexts, direct manipulation is:
– Expensive
– Impractical
– Unethical

● Most results from social sciences and computational social science in large populations are observational

Page 7:

Example: John Snow's cholera research (1855)

Do not confuse John Snow with Jon Snow.

Red = cholera death

Blue = water pump

Prevailing hypothesis at the time: cholera was caused by miasma in the air.

Page 8:

Snow's observations (1853-1854)

● Two companies supplied water: (1) Southwark & Vauxhall and (2) Lambeth
● In 1852, Lambeth moved its intake pipes upstream on the River Thames, above the city's sewage discharge, but Southwark & Vauxhall did not

Page 9:

Snow's words

Page 10:

Comparison

Randomized controlled experiment

1. Response of subjects assigned to treatment compared to response of subjects assigned to control

2. Assignment of subjects to groups is done using a randomization device

3. Treatment is under the control of a researcher

Natural experiment

1. Response of subjects assigned to treatment compared to response of subjects assigned to control

2. Assignment of subjects to groups is as-if random, or as good as random

3. Treatment was not under the control of a researcher

Page 11:

Types of natural experiment

1. Standard natural experiment

2. Instrumental variables

3. Regression discontinuity

Page 12:

“Perfect” natural experiment

● Doherty, Gerber, Green 2006
– Compare political attitudes of lottery winners vs lottery non-winners
– Found that “lottery-induced affluence increases hostility toward estate taxes, marginally increases hostility towards government redistribution, but has little effect on broader attitudes concerning economic stratification or the role of government as a provider of social insurance”

Weakness: study group = lottery players. Are they representative of the whole society?

Page 13:

Another natural experiment: random attacks incite violence?

Lyall 2009
● Russian soldiers in Chechnya were instructed to strike random positions at random times for random durations

Page 14:

Another natural experiment: random attacks incite violence? (No)

Lyall 2009
● Russian soldiers in Chechnya were instructed to strike random positions at random times for random durations

● Moreover, many Russian soldiers were drunk (data from disciplinary actions)

● Fewer attacks on Russian soldiers came from villages that had been attacked

Page 15:

Natural experiment through instrumental variables

● Control/treatment differences too difficult to model

● Use an instrumental variable instead of assignment to control/treatment

Page 16:

Example: exposure of Chinese to protests in Hong Kong [Zhang 2015]

● Study group: Chinese who traveled to Hong Kong before the protests started

● Control: returned to China 36 to 6 days before the protests started

● Treatment: returned after the protests, and hence possibly witnessed them

Page 17:

Descriptive statistics

Page 18:

Descriptive statistics

Page 19:

Evaluating effects: difference in differences

● Found increase in political activity on Weibo

● Before: 1.66/100 posts were political posts

● After: difference in differences of 0.66 posts: 75% more.
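As a reminder of how a difference in differences is computed, here is a minimal sketch. The function name and all four rates are hypothetical values chosen for illustration; they are not the figures from [Zhang 2015].

```python
def difference_in_differences(treat_before, treat_after, control_before, control_after):
    """Change in the treatment group minus change in the control group."""
    return (treat_after - treat_before) - (control_after - control_before)

# Hypothetical rates of political posts per 100 posts (NOT the study's data).
did = difference_in_differences(
    treat_before=1.5, treat_after=2.5,
    control_before=1.5, control_after=1.75,
)
print(did)
```

Subtracting the control group's change removes any trend that affected both groups, so what remains is attributed to the treatment.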

Page 20:

Example: economic output and war [Miguel, Satyanath, Sergenti 2004]

● Do economic shocks push countries to war?
– Big methodological problem is reciprocal causation: poverty creates the conditions for war, which creates more poverty
● External variable for economic output: rainfall
● Study across countries with high/low relative rainfall
● Low rainfall in one year increases the chance of war the next year

→ evidence that economic shocks breed war
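To see why an external variable helps, consider a simulation with synthetic data (not the data of the study). An unobserved factor drives both economic growth and conflict, which biases a naive regression; rainfall only affects conflict through growth. All coefficients, the continuous "conflict" outcome and the helper `cov` are assumptions made for this sketch.

```python
import random

rng = random.Random(0)
n = 20_000
TRUE_EFFECT = -0.5  # assumed effect of growth on conflict, for illustration only

rain, growth, conflict = [], [], []
for _ in range(n):
    z = rng.gauss(0, 1)                       # rainfall shock (the instrument)
    u = rng.gauss(0, 1)                       # unobserved factor affecting both variables
    x = 0.8 * z - 1.0 * u + rng.gauss(0, 1)   # economic growth
    y = TRUE_EFFECT * x + 1.0 * u + rng.gauss(0, 1)  # conflict intensity
    rain.append(z)
    growth.append(x)
    conflict.append(y)

def cov(a, b):
    """Sample covariance of two equally long lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

# Naive slope of conflict on growth: biased by the unobserved factor u.
naive_slope = cov(growth, conflict) / cov(growth, growth)

# Instrumental-variable slope: uses only the variation in growth caused by rainfall.
iv_slope = cov(rain, conflict) / cov(rain, growth)

print(f"naive: {naive_slope:.2f}  IV: {iv_slope:.2f}  true: {TRUE_EFFECT}")
```

In this setup the naive slope overstates the effect, while the instrumental-variable slope stays close to the assumed value.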

Page 21:

Example: military and future income

● Setting: we want to study whether going to the Vietnam war affected the future income of people in the US
– E.g. lost experience or years of career caused a drop in future salary, trauma of war caused a decline in productivity, etc.
– Very important to apportion stipends, pensions, etc.

● Instrumental variable: eligibility for military draft
– Date of birth is assigned a lottery number from 1 to 365
– All whose number is at or below a cutoff X are drafted (low numbers were called first)
– Not all drafted go to war, not all that go to war were drafted

Page 22:

Example: military and future income

● Study group: white men of military age in 1971

● This is called intention-to-treat analysis because the study group is divided by the intention to send people to war (draft eligibility), not by who actually went to war

● Why? Because going to war is not a random process, while day of birth is as-if random

● More on this later ...

Instrumental variable            1984 earnings adjusted by inflation
Eligible by day of birth         $16,172
Not eligible by day of birth     $15,813 (about 2.2% less)

Page 23:

Regression discontinuity design

● A scalar variable is used to decide who receives treatment and who does not

● An arbitrary threshold is used

● Observe outcomes just below and just above the threshold

Page 24:

Regression discontinuity designs

Students with a score above 11 are given a “Certificate of Merit”

Hypothesis: getting a certificate of merit increases chances of scholarship

Page 25:

Regression discontinuity designs: three hypothetical series (A, B, C)

Page 26:

Electronic voting in Brazil [Fujiwara 2015]

● Literally hundreds of candidates per ballot
● Municipalities with 40,500 voters or more → electronic ballot
● Municipalities with fewer than 40,500 voters → paper ballot
● Create two bands and compare, using h=20k: [40500-h, 40500) and [40500, 40500+h)
● Jump from 75% to 90% valid votes, particularly in municipalities with lower literacy

Band size: if it's too narrow we question sample size, if it's too wide we question randomness
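The band comparison can be sketched as follows. The threshold and h come from the slide; the function `rd_estimate` and the synthetic municipalities are assumptions for illustration and do not reproduce Fujiwara's data or method.

```python
import random
from statistics import mean

THRESHOLD = 40_500  # from the slide: electorate size at which electronic ballots apply
H = 20_000          # band size h from the slide

def rd_estimate(units, threshold=THRESHOLD, h=H):
    """Mean outcome just above the threshold minus mean outcome just below it.

    `units` is a sequence of (running_variable, outcome) pairs.
    """
    units = list(units)
    below = [y for x, y in units if threshold - h <= x < threshold]
    above = [y for x, y in units if threshold <= x < threshold + h]
    if not below or not above:
        raise ValueError("one of the bands is empty; consider a wider h")
    return mean(above) - mean(below)

# Synthetic municipalities (made-up data): the valid-vote share jumps by 0.15 at the threshold.
rng = random.Random(1)
municipalities = []
for _ in range(5_000):
    voters = rng.randint(5_000, 100_000)
    share = 0.75 + (0.15 if voters >= THRESHOLD else 0.0) + rng.gauss(0, 0.03)
    municipalities.append((voters, share))

jump = rd_estimate(municipalities)
print(f"estimated jump at the threshold: {jump:.3f}")
```

Calling `rd_estimate` with several values of `h` is a simple way to check how sensitive the estimate is to the band size.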

Page 27:

The Neyman-Holland-Rubin model (see, e.g., [Freedman 2006])

Page 28:

What we really would like to observe

● We would like to observe the following:
– If we take a random subject i
– And create two parallel universes, one (T) in which i receives treatment and one (C) in which she doesn't
– What would be the expected treatment effect:
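In the usual potential-outcomes notation this quantity can be written as below. The symbols are assumptions made here: $Y_i^T$ and $Y_i^C$ are the outcomes of subject $i$ in universes T and C, and $n$ is the size of the study group.

```latex
\mathrm{ATE}
  = \mathbb{E}\left[ Y_i^T - Y_i^C \right]
  = \frac{1}{n} \sum_{i=1}^{n} \left( Y_i^T - Y_i^C \right)
```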

Page 29:

What we observe

● Instead, we observe

● By construction, we never observe both outcomes for the same unit, which is why the unobserved outcome is called counterfactual

● For each unit, we observe either YT or YC
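A common estimator built only from observed outcomes is the difference of group means. This is a sketch under assumed notation: $n_T$ and $n_C$ are the sizes of the treatment and control groups, and $Y_i^T$, $Y_i^C$ are the observed outcomes of the units in each group.

```latex
\widehat{\mathrm{ATE}}
  = \frac{1}{n_T} \sum_{i \in \text{treatment}} Y_i^T
  \;-\;
    \frac{1}{n_C} \sum_{i \in \text{control}} Y_i^C
```

When assignment is random or as-if random, each group mean estimates the corresponding average over the whole study group, so the difference estimates the average treatment effect.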

Page 30:

Example in regression discontinuity designs

(Students with a score above 11 were given a “Certificate of Merit”)

Page 31:

Neyman's model

[Diagram: the study group is split into a treatment group and a control group]

Page 32:

Justification

● We assume:

● Which requires:

Average causal effect

How do we satisfy these requirements?

Page 33:

Reality is more complicated ...

● In reality, matching designs are often used, in which the control group is of similar size to treatment group and with similar characteristics

● The best matching design is strong in terms of “similar characteristics”, ideally taking everything into account

● More on this later, but first ...

Page 34:

Analysis in instrumental variables design

Page 35:

In instrumental variables

● Instrumental variable: eligibility for military draft
– Date of birth is assigned a lottery number from 1 to 365
– All whose number is at or below a cutoff X are drafted (low numbers were called first)

● Note imperfect overlap between “Drafted” and “Went to war”:
– Drafted only: declared medically unfit, escaped to Canada, etc.
– Went to war only: volunteered to go kill people
– Both: drafted and went to war

Page 36:

In instrumental variables

● Complier:
– If drafted, goes to war
– If not drafted, doesn't go to war

● Always-treat:
– Goes to war no matter what

● Never-treat:
– Never goes to war

Page 37:

Neyman's model with crossovers

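[Diagram: the study group is split into a treatment group and a control group. In the treatment group, compliers and always-treats receive treatment, while never-treats do not. In the control group, always-treats receive treatment, while compliers and never-treats do not.]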

Page 38:

Analysis for instrumental variables

● Average response of compliers to treatment
– How much money will a random person who was drafted, and hence went to war, make?

● Average response of compliers to control
– How much money will a random person who was not drafted, and hence did not go to war, make?

● Objective: determine the average effect of treatment on compliers, i.e. the difference between these two averages

Page 39:

Let's work this out

N = effect on never-treats; T = effect on always-treats

α proportion of always-treats; β proportion of compliers, γ proportion of never-treats
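One way to work this out, as a sketch. It reads N and T as the average responses of never-treats and always-treats, and introduces two assumed symbols, $\bar{Y}_c^{T}$ and $\bar{Y}_c^{C}$, for the average response of compliers under treatment and under control. It also assumes that the three types occur in the same proportions in both groups (as-if random assignment) and that assignment by itself does not change the response of always-treats or never-treats.

```latex
\begin{aligned}
\bar{Y}^{\text{treatment}} &= \alpha T + \beta \bar{Y}_c^{T} + \gamma N \\
\bar{Y}^{\text{control}}   &= \alpha T + \beta \bar{Y}_c^{C} + \gamma N \\
\bar{Y}^{\text{treatment}} - \bar{Y}^{\text{control}}
  &= \beta \left( \bar{Y}_c^{T} - \bar{Y}_c^{C} \right) \\
\bar{Y}_c^{T} - \bar{Y}_c^{C}
  &= \frac{\bar{Y}^{\text{treatment}} - \bar{Y}^{\text{control}}}{\beta}
\end{aligned}
```

The terms for always-treats and never-treats cancel, so the effect on compliers is the intention-to-treat difference divided by the proportion of compliers β.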

Page 40:

How do we estimate β?

[Diagram: the study group is split into a treatment group and a control group]

α proportion of always-treats; β proportion of compliers, γ proportion of never-treats


Page 42:

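How do we estimate β?

α proportion of always-treats; β proportion of compliers, γ proportion of never-treats

[Diagram: treatment group and control group]

● α + β = proportion of treated in the treatment group (always-treats and compliers)
● α = proportion of treated in the control group (always-treats only)
● Hence β can be estimated as the difference between these two proportions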
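A small simulation can illustrate this estimate and the resulting effect on compliers. Everything here is an assumption made for the sketch: the type proportions, the outcome scale, `COMPLIER_EFFECT` and the helper names; none of it comes from the draft data.

```python
import random
from statistics import mean

rng = random.Random(7)

# Assumed proportions of always-treats, compliers and never-treats (illustrative).
ALPHA, BETA, GAMMA = 0.1, 0.6, 0.3
COMPLIER_EFFECT = -2.0  # assumed effect of treatment on the outcome

def simulate_unit(assigned_to_treatment):
    """Return (treated, outcome) for one unit of a randomly drawn type."""
    kind = rng.choices(["always", "complier", "never"], weights=[ALPHA, BETA, GAMMA])[0]
    treated = kind == "always" or (kind == "complier" and assigned_to_treatment)
    outcome = rng.gauss(16.0, 1.0) + (COMPLIER_EFFECT if treated else 0.0)
    return treated, outcome

treatment_group = [simulate_unit(True) for _ in range(50_000)]
control_group = [simulate_unit(False) for _ in range(50_000)]

def share_treated(group):
    return sum(1 for treated, _ in group if treated) / len(group)

# Proportion treated: about alpha + beta in treatment, about alpha in control.
beta_hat = share_treated(treatment_group) - share_treated(control_group)

# Intention-to-treat difference, then scale by the estimated share of compliers.
itt = mean(y for _, y in treatment_group) - mean(y for _, y in control_group)
complier_effect_hat = itt / beta_hat

print(f"beta: {beta_hat:.3f}  ITT: {itt:.3f}  complier effect: {complier_effect_hat:.3f}")
```

The intention-to-treat difference is diluted by the units whose behavior does not depend on assignment; dividing by the estimated β undoes that dilution.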

Page 43:

Qualitative evidence

● Do not ignore qualitative evidence! It tells you about the causal process

● Several kinds of causal-process observations, including:
– How were units assigned to treatment?

– Which instrumental variables could be useful?

– What is the mechanism by which treatment acts?

Page 44:

Evaluation

● Quality of random assignment
– Information
– Incentives
– Capacities

● Credibility of the model

● Relevance of the intervention

Page 45:

Additional material: application to WWW research

● See also this tutorial:

https://sites.google.com/site/csswwwtutorial/