Post on 09-Jul-2020
On Causal Analysis for Heterogeneous Networks
Katerina Marazopoulou, David Arbour, David Jensen
August 2017
University of Massachusetts Amherst College of Information and Computer Sciences
KDD Workshop on Causal Discovery
Katerina Marazopoulou On Causal Analysis for Heterogeneous Networks 2
source: Visual Complexity
Causal inference in networks: How is the behavior of an individual affected by his/her peers?
source: Visual Complexity
How does the presence of multiple relationship types affect causal analysis?
Outline
• Background: Causal effect estimation on networks
• Causal effect estimation in heterogeneous networks
• Experiments on synthetic data
• Application on real-world dataset
Causal Effect Estimation in Networks
[Figure: example network with edges labeled "friends"]
• Population of n individuals that form an undirected graph
• Binary treatment T and outcome O
• The outcome of a node depends on the global treatment assignment: $O_i(\mathbf{T} = \mathbf{t})$, where $\mathbf{t} \in \{0, 1\}^n$
• ATE between global treatment and global control: $\tau(\mathbf{1}, \mathbf{0}) = \frac{1}{n} \sum_{i=1}^{n} E[O_i(\mathbf{T} = \mathbf{1}) - O_i(\mathbf{T} = \mathbf{0})]$
Causal effect estimation
Estimation procedure for causal inference:
1. Treatment assignment
2. Exposure model: When an individual is considered to be treated
3. Analysis: How to estimate the causal quantity of interest
Causal effect estimation
Estimation procedure for causal inference:
1. Treatment assignment
2. Exposure model: Fraction neighborhood exposure [Gui et al. 2015]
3. Analysis: Linear regression adjustment [Gui et al. 2015]
Gui, Basin, Han. WWW 2015
The Gui et al. framework
• Fraction neighborhood exposure model:
The response function depends on a node’s own treatment assignment and the proportion of its treated peers
$g(T_i, \sigma_i) = \alpha + \beta T_i + \gamma \sigma_i$
• ATE: $\tau(\mathbf{1}, \mathbf{0}) = g(T_i = 1, \sigma_i = 1) - g(T_i = 0, \sigma_i = 0) = \beta + \gamma$
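The fraction neighborhood exposure model and the regression adjustment can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation of Gui et al.: the random graph `A`, the coefficient values, and the helper names `treated_fraction` and `estimate_ate` are all hypothetical.

```python
import numpy as np

def treated_fraction(A, T):
    """Fraction of each node's neighbors that are treated (0 for isolated nodes)."""
    deg = A.sum(axis=1)
    return np.divide(A @ T, deg, out=np.zeros(len(T)), where=deg > 0)

def estimate_ate(A, T, O):
    """Fit O ~ alpha + beta*T + gamma*sigma by least squares and return beta + gamma."""
    X = np.column_stack([np.ones(len(T)), T, treated_fraction(A, T)])
    alpha, beta, gamma = np.linalg.lstsq(X, O, rcond=None)[0]
    return beta + gamma

rng = np.random.default_rng(0)
n = 200
# Hypothetical undirected random graph: symmetric, no self-loops.
upper = np.triu(rng.random((n, n)) < 0.05, k=1)
A = (upper | upper.T).astype(float)
T = rng.integers(0, 2, size=n).astype(float)

# Noise-free outcomes from assumed coefficients alpha=1, beta=2, gamma=3,
# so the regression should recover beta + gamma = 5 up to rounding error.
O = 1.0 + 2.0 * T + 3.0 * treated_fraction(A, T)
ate_hat = estimate_ate(A, T, O)
```

Because the outcomes here are generated without noise from the same linear form, the fit is exact; with noisy outcomes the estimate would only approximate the sum of the two coefficients.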
Heterogeneous Network
[Figure: example network with edges labeled "friends" and "coworkers"]
Response function
Homogeneous networks:
$g(T_i, \sigma_i) = \alpha + \beta T_i + \gamma \sigma_i$
Heterogeneous networks:
$g_{f,c}(T_i, \sigma_i) = \alpha + \beta T_i + \gamma_f \sigma_i^f + \gamma_c \sigma_i^c$
Sets of peers
• There are more options than friends and coworkers.
• We can consider any combination of non-overlapping sets of peers
friends and coworkers
friends only
friends or coworkers but not both
$g_{f,c}(T_i, \sigma_i) = \alpha + \beta T_i + \gamma_f \sigma_i^f + \gamma_c \sigma_i^c$
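As a worked example of the heterogeneous response function, the sketch below evaluates it on a hypothetical four-node graph. The edge lists, the treatment vector, and the coefficient values are assumptions chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def treated_fraction(A, T):
    """Fraction of each node's peers (rows of A) that are treated; 0 if no peers."""
    deg = A.sum(axis=1)
    return np.divide(A @ T, deg, out=np.zeros(len(T)), where=deg > 0)

# Hypothetical nodes A, B, C, D (indices 0..3).
# Friend edges: A-B, A-C, C-D. Coworker edges: A-B, A-D.
friends = np.array([[0, 1, 1, 0],
                    [1, 0, 0, 0],
                    [1, 0, 0, 1],
                    [0, 0, 1, 0]], dtype=float)
coworkers = np.array([[0, 1, 0, 1],
                      [1, 0, 0, 0],
                      [0, 0, 0, 0],
                      [1, 0, 0, 0]], dtype=float)

T = np.array([1.0, 0.0, 1.0, 0.0])               # A and C are treated
alpha, beta, gamma_f, gamma_c = 0.5, 1.0, 2.0, 0.5  # assumed coefficients

sigma_f = treated_fraction(friends, T)     # [0.5, 1.0, 0.5, 1.0]
sigma_c = treated_fraction(coworkers, T)   # [0.0, 1.0, 0.0, 1.0]
g_fc = alpha + beta * T + gamma_f * sigma_f + gamma_c * sigma_c  # [2.5, 3.0, 2.5, 3.0]

# Full treatment versus full control: both fractions move from 0 to 1.
tau_fc = beta + gamma_f + gamma_c          # 3.5
```

Node C has no coworkers in this example, so its coworker fraction is set to zero; that convention is a choice made for the sketch, not something the slides specify.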
Peer-sets of interest
• Friends (homogeneous network)
• Coworkers (homogeneous network)
• Friends or coworkers (union as a homogeneous network)
• Disjoint
• Friends-coworkers
Sets of peers we consider
[Figure: five panels, one per peer set (Friends, Coworkers, Friends or coworkers, Disjoint, Friends-coworkers), each drawn on an example graph with nodes A, B, C, and D. Edge labels appearing in the panels: friends, coworkers, friends only, coworkers only, friends and coworkers, friends or coworkers.]
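One way to make the five peer sets concrete is to derive them from two adjacency matrices. The sketch below is an interpretation rather than the authors' code: the four-node graph is hypothetical, and reading "Disjoint" as the three non-overlapping sets (friends only, coworkers only, friends and coworkers) is an assumption based on the labels in the slides.

```python
import numpy as np

# Hypothetical nodes A, B, C, D (indices 0..3), boolean symmetric adjacency.
# Friend edges: A-B, A-C, C-D. Coworker edges: A-B, A-D.
friends = np.array([[0, 1, 1, 0],
                    [1, 0, 0, 0],
                    [1, 0, 0, 1],
                    [0, 0, 1, 0]], dtype=bool)
coworkers = np.array([[0, 1, 0, 1],
                      [1, 0, 0, 0],
                      [0, 0, 0, 0],
                      [1, 0, 0, 0]], dtype=bool)

# Each peer set maps to the list of adjacency matrices that would each
# contribute one exposure term to the response function.
peer_sets = {
    "Friends": [friends],
    "Coworkers": [coworkers],
    "Friends or coworkers": [friends | coworkers],
    "Disjoint": [friends & ~coworkers,    # friends only
                 coworkers & ~friends,    # coworkers only
                 friends & coworkers],    # friends and coworkers
    "Friends-coworkers": [friends, coworkers],
}

for name, mats in peer_sets.items():
    # Each undirected edge is stored twice, so halve the count.
    print(name, [int(m.sum()) // 2 for m in mats])
```

In this example the edge A-B is both a friendship and a coworker tie, so it falls in the third Disjoint set, while Friends-coworkers counts it once under each relationship.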
Peer sets of interest: Where are they used?
Used for:
• Response functions
• ATE estimators
• Outcome generation
Example: Friends-coworkers
[Figure: example graph with nodes A, B, C, and D and edges labeled Friends and Coworkers]
• Response function: $g_{f,c}(T_i, \sigma_i) = \alpha + \beta T_i + \gamma_f \sigma_i^f + \gamma_c \sigma_i^c$
• ATE estimator: $\tau_{f,c} = \beta + \gamma_f + \gamma_c$
• Outcome generation: $O_i = w_0 + w_1 T_i + w_2^f \frac{F[\cdot, i]^\top O_t}{D_i^F} + w_2^c \frac{C[\cdot, i]^\top O_t}{D_i^C} + \epsilon$
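The outcome-generation formula for the Friends-coworkers peer set can be sketched as an iterative update. Several details below are assumptions that the slides do not state: `F` and `C` are taken to be adjacency matrices, `D` the corresponding degrees, outcomes start at zero, and the update runs for a fixed number of steps. The function names are hypothetical.

```python
import numpy as np

def peer_mean(A, values):
    """A[:, i]^T values / degree_i: mean of `values` over node i's peers (0 if none)."""
    deg = A.sum(axis=0)
    return np.divide(A.T @ values, deg, out=np.zeros(len(values)), where=deg > 0)

def generate_outcomes(F, C, T, w0, w1, w2f, w2c, noise_sd, steps, rng):
    """Repeatedly apply O <- w0 + w1*T + w2f*mean_F(O) + w2c*mean_C(O) + eps."""
    O = np.zeros(len(T))  # assumed initial outcomes
    for _ in range(steps):
        eps = noise_sd * rng.standard_normal(len(T))
        O = w0 + w1 * T + w2f * peer_mean(F, O) + w2c * peer_mean(C, O) + eps
    return O

# Hypothetical example: two ring graphs on 10 nodes, every node treated, no noise.
n = 10
ring1 = np.roll(np.eye(n), 1, axis=1)
ring2 = np.roll(np.eye(n), 2, axis=1)
F = ring1 + ring1.T   # each node is friends with its two nearest neighbors
C = ring2 + ring2.T   # each node works with the nodes two steps away
T = np.ones(n)
rng = np.random.default_rng(0)
O = generate_outcomes(F, C, T, w0=1.0, w1=2.0, w2f=0.3, w2c=0.2,
                      noise_sd=0.0, steps=200, rng=rng)
# With identical nodes and no noise, O settles at (w0 + w1) / (1 - w2f - w2c) = 6.
```

The update only converges when the peer weights sum to less than one in absolute value, which is why the example keeps `w2f + w2c` at 0.5.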
How does ignoring/mis-specifying the type of relationships affect estimation of causal effects?
Experiments (synthetic data)
Goal: impact on estimation of causal effects
• Generation of graphs
Erdos-Renyi
Watts-Strogatz
Stochastic block model
• Generation of treatment values
1. Independent assignment for every node
2. Graph cluster randomization [Ugander et al. 2013]
Ugander, Karrer, Backstrom, Kleinberg. KDD 2013
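The two treatment-assignment schemes listed above can be sketched as follows. This is a simplified illustration: the clustering step of graph cluster randomization is not shown, and the cluster labels are assumed to be given. The function names and the example labels are hypothetical.

```python
import numpy as np

def independent_assignment(n, p, rng):
    """Treat each node independently with probability p."""
    return (rng.random(n) < p).astype(int)

def cluster_assignment(clusters, p, rng):
    """Treat whole clusters with probability p; nodes inherit their cluster's treatment."""
    labels, inverse = np.unique(clusters, return_inverse=True)
    cluster_treatment = (rng.random(len(labels)) < p).astype(int)
    return cluster_treatment[inverse]

rng = np.random.default_rng(0)
clusters = np.array([0, 0, 1, 1, 2, 2, 2])   # assumed cluster label per node
T_indep = independent_assignment(len(clusters), 0.5, rng)
T_cluster = cluster_assignment(clusters, 0.5, rng)
```

Under cluster-level assignment, nodes in the same cluster always share a treatment value, so a node's peers inside its cluster are exposed to the same condition as the node itself.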
• Generation of outcome values
1. Outcome interference: $O_{i,t+1} \sim w_0 + w_1 T_i + f(O_{\text{peers of } i,\, t}) + \epsilon$
2. Treatment interference: $O_i \sim w_0 + w_1 T_i + f(T_{\text{peers of } i}) + \epsilon$
where: $\epsilon = \sigma_\epsilon \mathcal{N}(0, 1)$
Results
Experiment configuration:
• Graph model: Watts-Strogatz
• Treatment assignment: Graph cluster randomization
• Treatment probability: 0.5
• Outcome generation: Treatment interference
[Figure: heatmap of absolute relative bias for every pair of generative model (x-axis) and assumed model (y-axis), each one of Coworkers, Disjoint, Friends, Friends-Coworkers, and Friends or Coworkers. Cell labels range from -46.2 to 3.8; the color scale runs from 10 to 40.]
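Relative bias compares the ATE estimated under an assumed exposure model with the true ATE of the generative model. The sketch below illustrates the idea for one generative model (Friends-Coworkers, treatment interference) and three assumed models. It is not the experimental code behind the results: the random graphs, independent treatment assignment, coefficient values, and noise level are all assumptions, and `f` is taken to be a weighted fraction of treated peers.

```python
import numpy as np

def treated_fraction(A, T):
    """Fraction of each node's peers that are treated (0 for nodes without peers)."""
    deg = A.sum(axis=1)
    return np.divide(A @ T, deg, out=np.zeros(len(T)), where=deg > 0)

def random_graph(n, p, rng):
    """Hypothetical undirected random graph as a symmetric 0/1 float matrix."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)

def fit_ate(T, O, exposures):
    """Regress O on [1, T, exposures...] and sum all non-intercept coefficients."""
    X = np.column_stack([np.ones(len(T)), T] + exposures)
    coef = np.linalg.lstsq(X, O, rcond=None)[0]
    return coef[1:].sum()

rng = np.random.default_rng(2)
n = 1000
F = random_graph(n, 0.01, rng)   # friends
C = random_graph(n, 0.01, rng)   # coworkers
T = (rng.random(n) < 0.5).astype(float)
sf, sc = treated_fraction(F, T), treated_fraction(C, T)

# Assumed generative coefficients and noise level.
w0, w1, wf, wc, noise_sd = 1.0, 1.0, 1.0, 1.0, 0.2
O = w0 + w1 * T + wf * sf + wc * sc + noise_sd * rng.standard_normal(n)
true_ate = w1 + wf + wc

assumed = {"Friends-Coworkers": [sf, sc], "Friends": [sf], "Coworkers": [sc]}
rel_bias = {name: 100.0 * (fit_ate(T, O, cols) - true_ate) / true_ate
            for name, cols in assumed.items()}
for name, value in rel_bias.items():
    print(f"{name}: {value:.1f}%")
```

In this setup the two exposure terms are nearly uncorrelated, so an assumed model that omits one relationship type should miss roughly that type's share of the true ATE and show a clearly negative relative bias, while the correctly specified model should land near zero.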
Results
Experiment configuration:
• Graph model: Watts-Strogatz
• Treatment assignment: Graph cluster randomization
• Treatment probability: varying
• Outcome generation: Treatment interference
[Figure: relative bias (% over true ATE, y-axis from -30 to 0) versus treatment probability (x-axis, 0.1 to 0.9), with one panel per generative model (Coworkers, Disjoint, Friends, Friends-Coworkers, Friends or Coworkers) and one series per exposure model (Coworkers, Disjoint, Friends, Friends-Coworkers, Friends or Coworkers).]
Model selection
Given a set of alternative models, is it possible to identify the true generating model?
Procedure:
• Generate synthetic networks and synthetic data (as before).
• Compute BIC for each of the five alternative models.
• Select model with the lowest BIC.
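The procedure above can be sketched for linear response functions. This is an illustration under assumptions, not the authors' code: BIC is computed with the standard Gaussian linear-model formula (up to a constant shared by all candidates), the data are generated from a Friends-coworkers model with made-up coefficients, and the candidate exposure terms follow one reading of the five peer sets.

```python
import numpy as np

def treated_fraction(A, T):
    """Fraction of each node's peers that are treated (0 for nodes without peers)."""
    deg = A.sum(axis=1)
    return np.divide(A @ T, deg, out=np.zeros(len(T)), where=deg > 0)

def random_graph(n, p, rng):
    """Hypothetical undirected random graph as a symmetric boolean matrix."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return upper | upper.T

def bic_linear(X, y):
    """n*log(RSS/n) + k*log(n) for a least-squares fit with k coefficients."""
    n, k = X.shape
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return n * np.log(float(resid @ resid) / n) + k * np.log(n)

rng = np.random.default_rng(3)
n = 1000
F = random_graph(n, 0.01, rng)   # friends
C = random_graph(n, 0.01, rng)   # coworkers
T = (rng.random(n) < 0.5).astype(float)

def frac(mask):
    return treated_fraction(mask.astype(float), T)

# Exposure terms for each of the five candidate models.
candidates = {
    "Friends": [frac(F)],
    "Coworkers": [frac(C)],
    "Friends or coworkers": [frac(F | C)],
    "Disjoint": [frac(F & ~C), frac(C & ~F), frac(F & C)],
    "Friends-coworkers": [frac(F), frac(C)],
}

# Assumed generative model: Friends-coworkers with different peer weights.
O = 1.0 + 1.0 * T + 2.0 * frac(F) + 0.5 * frac(C) + 0.2 * rng.standard_normal(n)

bic = {name: bic_linear(np.column_stack([np.ones(n), T] + cols), O)
       for name, cols in candidates.items()}
selected = min(bic, key=bic.get)
print(selected)
```

The BIC penalty grows with the number of coefficients, so the three-term Disjoint model has to fit noticeably better than the two-term Friends-coworkers model to be selected.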
[Figure: accuracy of model selection (y-axis, 0.00 to 1.00) by configuration of coefficients (x-axis, C0 to C3), with one panel per graph model (Erdos-Renyi, Stochastic block model, Watts-Strogatz) and one series per noise level (0.5, 1.0, 2.0).]
[Figure: the same plot of model selection accuracy by configuration of coefficients, graph model, and noise level, with a "Random" annotation added in each panel.]
Real data
• Study on the diffusion of microfinance loans through various social networks
• Survey conducted in 75 villages in southern India
• Village-level survey and follow-up survey on a subsample of individuals for each village
• Individual surveys identify 13 types of social relationships (e.g., friends, relatives, borrowing money from, going to temple with)
• Individuals' attributes (age, gender, etc.)
Real heterogeneous network
Experimental setup for real data
• Several pairs of social relationships
• Combinations of treatment-outcome variables
• Estimate effect using different response functions
[Figure: estimated effect (y-axis, 0.0 to 0.8) for four combinations of relation pair (Relation1-Relation2) and treatment-outcome pair: friends-relatives with gender-savings, friends-relatives with gender-working, helping with decisions-relatives with gender-working, and borrowing money-relatives with gender-working. One point per assumed model: Rel1, Rel2, Rel1orRel2, Rel1-Rel2, Disjoint.]
Summary
• Recent work has extended causal inference frameworks to network data.
• We address causal effect estimation in heterogeneous networks within this framework.
• Mis-specifying the relational structure of causal dependence can lead to significant bias.
• Model selection can help distinguish among candidate response functions.
Directions for future work
• Formal characterization of bias and variance of ATE estimators for heterogeneous networks
• Interactions of relational semantics (effect present from multiple relational phenomena)
• Measure of model selection for relational data
• Fully automated methods for choosing appropriate response functions
• Extending A/B testing framework for heterogeneous networks
Questions?
Thank you!