Causality
Bernhard Schölkopf and Jonas Peters, MPI for Intelligent Systems, Tübingen
MLSS, Tübingen
21st July 2015
Charig et al.: “Comparison of treatment of renal calculi by open surgery, (...) ”, British Medical Journal, 1986
B. Scholkopf & J. Peters (MPI) Causality 21st July 2015
J. Mooij et al.: Distinguishing cause from effect using observational data: methods and benchmarks, submitted
Assume $P(X_1, \dots, X_4)$ has been induced by

$X_1 = f_1(X_3, N_1)$
$X_2 = N_2$
$X_3 = f_3(X_2, N_3)$
$X_4 = f_4(X_2, X_3, N_4)$

• $N_i$ jointly independent
• $G_0$ has no cycles

Graph $G_0$: X2 → X3, X3 → X1, X2 → X4, X3 → X4

Functional causal model. Can the DAG be recovered from $P(X_1, \dots, X_4)$? No.

J. Peters, J. Mooij, D. Janzing and B. Schölkopf: Causal Discovery with Continuous Additive Noise Models, JMLR 2014
S. Shimizu, P. Hoyer, A. Hyvärinen, A. Kerminen: A linear non-Gaussian acyclic model for causal discovery. JMLR, 2006
P. Bühlmann, J. Peters, J. Ernest: CAM: Causal add. models, high-dim. order search and penalized regr., Annals of Statistics 2014
Assume $P(X_1, \dots, X_4)$ has been induced by

$X_1 = f_1(X_3) + N_1$
$X_2 = N_2$
$X_3 = f_3(X_2) + N_3$
$X_4 = f_4(X_2, X_3) + N_4$

• $N_i \sim \mathcal{N}(0, \sigma_i^2)$ jointly independent
• $G_0$ has no cycles

Graph $G_0$: X2 → X3, X3 → X1, X2 → X4, X3 → X4

Additive noise model with Gaussian noise. Can the DAG be recovered from $P(X_1, \dots, X_4)$? Yes, if and only if the $f_i$ are nonlinear.

J. Peters, J. Mooij, D. Janzing and B. Schölkopf: Causal Discovery with Continuous Additive Noise Models, JMLR 2014
P. Bühlmann, J. Peters, J. Ernest: CAM: Causal add. models, high-dim. order search and penalized regr., Annals of Statistics 2014
S. Shimizu, P. Hoyer, A. Hyvärinen, A. Kerminen: A linear non-Gaussian acyclic model for causal discovery. JMLR, 2006
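To make the four-variable additive noise model concrete, here is a minimal Python sketch that samples from it. The slide leaves the functions and noise variances unspecified, so the choices below (tanh, sin, a product term, and the four standard deviations) are purely hypothetical placeholders.

```python
import numpy as np

def sample_anm(n, rng):
    """Draw n samples from an additive noise model on G0 with Gaussian noise.

    The functions f1, f3, f4 and the noise scales are illustrative
    assumptions; the slide does not specify them.
    """
    # Jointly independent Gaussian noise terms N1..N4 (hypothetical scales).
    n1, n2, n3, n4 = (rng.normal(0.0, s, n) for s in (0.5, 1.0, 0.3, 0.2))
    # Generate variables in a causal order of G0: X2, X3, X1, X4.
    x2 = n2                    # X2 = N2
    x3 = np.tanh(x2) + n3      # X3 = f3(X2) + N3
    x1 = np.sin(x3) + n1       # X1 = f1(X3) + N1
    x4 = x2 * x3 + n4          # X4 = f4(X2, X3) + N4
    return np.column_stack([x1, x2, x3, x4])

rng = np.random.default_rng(0)
data = sample_anm(1000, rng)   # columns: X1, X2, X3, X4
```

Sampling in a causal order works because the graph has no cycles: every variable is computed only from its parents and its own noise term.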
Consider a distribution generated by

$Y = f(X) + N_Y$, with $N_Y$ and $X$ independent and Gaussian

Graph: X → Y

Then, if $f$ is nonlinear, there is no

$X = g(Y) + M_X$, with $M_X$ and $Y$ independent and Gaussian

Graph: X ← Y

J. Peters, J. Mooij, D. Janzing and B. Schölkopf: Causal Discovery with Continuous Additive Noise Models, JMLR 2014
Consider a distribution corresponding to

$Y = X^3 + N_Y$, with $N_Y$ and $X$ independent and Gaussian

Graph: X → Y

with

$X \sim \mathcal{N}(1, 0.5^2)$
$N_Y \sim \mathcal{N}(0, 0.4^2)$
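A sample from this model can be drawn in a few lines of Python. This is only a sketch: the sample size and the seed are assumptions, since the slides do not state how many points were drawn.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed (assumption)
n = 2000                         # sample size (assumption, not given on the slide)

x = rng.normal(loc=1.0, scale=0.5, size=n)      # X ~ N(1, 0.5^2)
noise = rng.normal(loc=0.0, scale=0.4, size=n)  # N_Y ~ N(0, 0.4^2), independent of X
y = x ** 3 + noise                              # Y = X^3 + N_Y
```

As a sanity check, the population mean of Y is E[X^3] = 1^3 + 3 · 1 · 0.5^2 = 1.75, so the sample mean of `y` should land near that value.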
[Figure: plot of Y (vertical axis, 0 to 15) against X (horizontal axis, −0.5 to 2.5).]
[Figure: plot of Y (vertical axis, 0 to 15) against gam(X ~ s(Y))$residuals (horizontal axis, −0.8 to 0.4), i.e. the residuals of a smooth regression of X on Y.]
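The residual check for the example Y = X^3 + N_Y can be sketched in Python as follows. The slides use R's `gam` smoother; the polynomial regression and the kernel dependence score (a biased HSIC estimate) below are stand-ins chosen for illustration, and the sample size, seed, polynomial degree and bandwidth rule are all assumptions.

```python
import numpy as np

def rbf_gram(v):
    """Gaussian kernel Gram matrix with a median-heuristic bandwidth."""
    d2 = (v[:, None] - v[None, :]) ** 2
    return np.exp(-d2 / np.median(d2[d2 > 0]))

def hsic(a, b):
    """Biased empirical HSIC; larger values indicate stronger dependence."""
    n = len(a)
    H = np.eye(n) - 1.0 / n   # centering matrix I - (1/n) 1 1^T
    return np.trace(rbf_gram(a) @ H @ rbf_gram(b) @ H) / n ** 2

def residual_dependence(cause, effect, degree=5):
    """Regress effect on cause and score dependence between residuals and cause."""
    z = (cause - cause.mean()) / cause.std()   # standardize for a stable fit
    coeffs = np.polyfit(z, effect, degree)     # polynomial stand-in for a smoother
    residuals = effect - np.polyval(coeffs, z)
    return hsic(z, residuals)

rng = np.random.default_rng(0)
n = 500
x = rng.normal(1.0, 0.5, n)
y = x ** 3 + rng.normal(0.0, 0.4, n)

score_forward = residual_dependence(x, y)    # candidate model X -> Y
score_backward = residual_dependence(y, x)   # candidate model Y -> X
print(score_forward, score_backward)
```

In the forward direction the residuals are essentially the independent noise N_Y, so the score should be small. In the backward direction the spread of the residuals changes with Y, which shows up as a larger score; the direction with the smaller score would be preferred.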
Surprise (under some assumptions):
2 variables ⇒ p variables

J. Peters, J. Mooij, D. Janzing and B. Schölkopf: Causal Discovery with Continuous Additive Noise Models, JMLR 2014

Let $P(X_1, \dots, X_p)$ be induced by a ...

model | equation | conditions | identif.
structural equation model | $X_i = f_i(X_{PA_i}, N_i)$ | - | ✗
additive noise model | $X_i = f_i(X_{PA_i}) + N_i$ | nonlin. fct. | ✓
causal additive model | $X_i = \sum_{k \in PA_i} f_{ik}(X_k) + N_i$ | nonlin. fct. | ✓
linear Gaussian model | $X_i = \sum_{k \in PA_i} \beta_{ik} X_k + N_i$ | linear fct. | ✗

(results hold for Gaussian noise)
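That linear functions with Gaussian noise do not reveal the causal direction can be verified by hand for two variables. The short derivation below assumes zero-mean variables; the symbols $a$, $b$, $\sigma_X$ and $\sigma_N$ are introduced here for illustration and do not appear on the slides.

```latex
\begin{align*}
Y &= aX + N_Y, \qquad X \sim \mathcal{N}(0, \sigma_X^2),\quad
     N_Y \sim \mathcal{N}(0, \sigma_N^2),\quad X \perp\!\!\!\perp N_Y \\
b &:= \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(Y)}
    = \frac{a\,\sigma_X^2}{a^2 \sigma_X^2 + \sigma_N^2},
    \qquad M_X := X - bY \\
\operatorname{Cov}(M_X, Y) &= \operatorname{Cov}(X, Y) - b \operatorname{Var}(Y) = 0
\end{align*}
```

Because $(M_X, Y)$ is jointly Gaussian, zero covariance implies independence. Hence $X = bY + M_X$ is a valid additive noise model in the backward direction, and the two directions cannot be told apart from the joint distribution.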
GAUL GAUSS “the LINEAR”
[Figure: accuracy (%) against decision rate (%), both axes from 0 to 100, for the methods IGCI, LiNGaM, Additive Noise and PNL. Legend: Significant, Not significant.]
see also
D. Lopez-Paz, K. Muandet, B. Schölkopf, I. Tolstikhin: Towards a Learning Theory of Cause-Effect Inference, ICML 2015
E. Sgouritsa, D. Janzing, P. Hennig, B. Schölkopf: Inf. of Cause and Effect with Unsupervised Inverse Regr., AISTATS 2015
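An accuracy versus decision rate curve of the kind plotted above can be computed as in the following Python sketch. It assumes that each method returns a confidence score per cause-effect pair and that decisions are taken on the most confident pairs first; the function name and the toy numbers are hypothetical.

```python
def accuracy_vs_decision_rate(confidences, correct):
    """Return (decision rate %, accuracy %) pairs, most confident decisions first.

    confidences: one score per cause-effect pair (higher = more certain).
    correct:     whether the inferred direction for that pair was right.
    """
    order = sorted(range(len(confidences)),
                   key=lambda i: confidences[i], reverse=True)
    curve, hits = [], 0
    for k, i in enumerate(order, start=1):
        hits += bool(correct[i])
        # After deciding on the k most confident pairs:
        curve.append((100.0 * k / len(order), 100.0 * hits / k))
    return curve

# Toy input with made-up scores for four pairs.
curve = accuracy_vs_decision_rate([0.9, 0.2, 0.6, 0.4],
                                  [True, False, True, False])
print(curve)
```

A method whose confidence scores are informative should show high accuracy at low decision rates, dropping as it is forced to decide on less certain pairs.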
Real data: genetic perturbation experiments for yeast (Kemmeren et al., 2014)
p = 6170 genes
$n_{obs}$ = 160 wild-types
$n_{int}$ = 1479 gene deletions (targets known)
true hits: ≈ 0.1% of pairs
“Invariant prediction” method: E = {obs, int}

J. Peters, P. Bühlmann, N. Meinshausen: Causal inference using inv. pred.: identification and conf. intervals, arXiv 1501.01332
D. Rothenhaeusler, C. Heinze et al.: backShift: Learning causal cyclic graphs from unknown shift interv., arXiv 1506.02494
M. Rojas-Carulla et al.: A Causal Perspective on Domain Adaptation, arXiv 1507.05333
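The idea behind invariant prediction with two environments can be sketched in Python on simulated data. This is a simplified illustration, not the procedure of the cited paper: the invariance check (normal-approximation tests on the mean and second moment of pooled least-squares residuals), the toy model X1 → Y → X2, and all function names are assumptions made here.

```python
import itertools
import math
import numpy as np

def welch_pvalue(a, b):
    """Two-sided p-value for equal means (Welch statistic, normal approximation)."""
    se = math.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    z = (a.mean() - b.mean()) / se
    return math.erfc(abs(z) / math.sqrt(2))

def invariance_pvalue(X, y, env, subset):
    """Crude check: do residuals of y on X[:, subset] look alike in both environments?"""
    design = np.column_stack([np.ones(len(y))] + [X[:, j] for j in subset])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)   # pooled least squares
    res = y - design @ beta
    r0, r1 = res[env == 0], res[env == 1]
    p_mean = welch_pvalue(r0, r1)             # same residual mean?
    p_var = welch_pvalue(r0 ** 2, r1 ** 2)    # same residual second moment?
    return min(1.0, 2 * min(p_mean, p_var))   # Bonferroni over the two checks

def invariant_prediction(X, y, env, alpha=0.05):
    """Intersect all predictor subsets whose invariance is not rejected."""
    p = X.shape[1]
    pvals = {s: invariance_pvalue(X, y, env, s)
             for k in range(p + 1)
             for s in itertools.combinations(range(p), k)}
    accepted = [set(s) for s, pv in pvals.items() if pv > alpha]
    estimate = set.intersection(*accepted) if accepted else set()
    return estimate, pvals

# Toy data: X1 -> Y -> X2; the "int" environment shifts and rescales X1 only.
rng = np.random.default_rng(0)
n = 1000
env = np.repeat([0, 1], n)                    # 0 = obs, 1 = int
x1 = np.where(env == 0, rng.normal(0, 1, 2 * n), rng.normal(2, 2, 2 * n))
y = x1 + rng.normal(0, 1, 2 * n)
x2 = y + rng.normal(0, 1, 2 * n)
X = np.column_stack([x1, x2])                 # column 0 = X1, column 1 = X2

estimate, pvals = invariant_prediction(X, y, env)
print(estimate, pvals)
```

Subsets that do not contain the true cause X1 (the empty set and {X2}) should be rejected, because the residuals of Y then change between the environments. Taking the intersection of the accepted subsets is what makes the output conservative: a variable is reported only if every accepted subset contains it.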
[Figure, most significant pair: activity of gene 4710 (vertical axis) against activity of gene 5954 (horizontal axis). Panels: observational training data; interventional training data (interventions on genes other than 5954 and 4710); interventional test data point (intervention on gene 5954).]

[Figure, 2nd most significant pair: activity of gene 3730 (vertical axis) against activity of gene 3729 (horizontal axis). Panels: observational training data; interventional training data (interventions on genes other than 3729 and 3730); interventional test data point (intervention on gene 3729).]

[Figure, 3rd most significant pair: activity of gene 1475 (vertical axis) against activity of gene 3672 (horizontal axis). Panels: observational training data; interventional training data (interventions on genes other than 3672 and 1475); interventional test data point (intervention on gene 3672).]
[Figure: number of strong intervention effects (vertical axis, 0 to 8) against number of intervention predictions (horizontal axis, 0 to 25). Legend: PERFECT, INVARIANT, HIDDEN-INVARIANT, PC, RFCI, REGRESSION (CV-Lasso), GES and GIES, RANDOM (99% prediction interval).]
http://xkcdsw.com/3039
B. Watterson: It’s a magical world, Andrews McMeel Publishing, 1996