Encoding Linguistic Structures with Graph Convolutional Networks · 2019-11-10 · @South England...
Encoding Linguistic Structures with Graph Convolutional Networks
Diego Marcheggiani
Joint work with Ivan Titov and Joost Bastings
University of Amsterdam
University of Edinburgh
@South England NLP Meetup
Structured (Linguistic) Priors
Sequa makes and repairs jet engines.
[Figure: syntactic dependency arcs (ROOT, SBJ, COORD, CONJ, NMOD, OBJ) and semantic role arcs (creator, creation, entity repaired, repairer) drawn over the sentence]
“I voted for Palpatine because he was most aligned with my values,” she said.
Sequence to Sequence
• Language is not (only) a sequence of words
• We have linguistic knowledge
[Sutskever et al., 2014]
[Figure: encoder-decoder model; the encoder reads "the black cat", the decoder is fed "<s> le chat noir" and outputs "le chat noir </s>"]
Encode structured linguistic knowledge into neural networks using Graph Convolutional Networks
Outline
• Semantic Role Labeling
• Graph Convolutional Networks (GCN)
• Syntactic GCN for Semantic Role Labeling (SRL)
• SRL Model
• Exploiting Semantics in Neural Machine Translation with GCNs
Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling. Diego Marcheggiani, Ivan Titov. In Proceedings of EMNLP, 2017.
Exploiting Semantics in Neural Machine Translation with Graph Convolutional Networks. Diego Marcheggiani, Joost Bastings, Ivan Titov. In Proceedings of NAACL-HLT, 2018.
Semantic Role Labeling
• Predicting the predicate-argument structure of a sentence
• Discover and disambiguate predicates
• Identify arguments and label them with their semantic roles
Sequa makes and repairs jet engines.
Predicates: makes (make.01), repairs (repair.01)
make.01: Creator = Sequa, Creation = engines
repair.01: Repairer = Sequa, Entity repaired = engines
Semantic Role Labeling
• Only the head of an argument is labeled
• Sequence labeling task for each predicate
• Focus on argument identification and labeling
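To make this setup concrete, here is a small sketch of how the running example could be stored when only argument heads carry a label and one label sequence is produced per predicate. The role names come from the slides; the `"_"` null label, the dictionary layout, and the helper name `arguments` are assumptions made for illustration.

```python
# Dependency-based SRL as sequence labeling: one label sequence per predicate,
# with a role only on the head word of each argument.
# "_" marks "no role" (an assumed convention, not taken from the slides).
tokens = ["Sequa", "makes", "and", "repairs", "jet", "engines", "."]

# predicate token index -> (predicate sense, per-token role labels)
srl = {
    1: ("make.01",   ["Creator",  "_", "_", "_", "_", "Creation",        "_"]),
    3: ("repair.01", ["Repairer", "_", "_", "_", "_", "Entity repaired", "_"]),
}

def arguments(pred_index):
    """Return (role, head word) pairs for one predicate."""
    _, labels = srl[pred_index]
    return [(role, tokens[i]) for i, role in enumerate(labels) if role != "_"]

for idx, (sense, _) in srl.items():
    print(sense, arguments(idx))
```

Each predicate gets its own pass over the sentence, which is why the model later conditions on the predicate when classifying every other word.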
Semantic Role Labeling
• Question answering: Narayanan and Harabagiu 2004; Shen and Lapata 2007; Khashabi et al. 2018
• Machine translation: Wu and Fung 2009; Aziz et al. 2011
• Information extraction: Surdeanu et al. 2003; Christensen et al. 2010
Related work
• SRL systems that use syntax with simple NN architectures
  • [FitzGerald et al., 2015]
  • [Roth and Lapata, 2016]
• Recent models ignore linguistic bias
  • [Zhou and Xu, 2014]
  • [He et al., 2017]
  • [Marcheggiani et al., 2017]
Tutorial on Semantic Role Labeling at EMNLP 2017
Motivations
• Some semantic dependencies are mirrored in the syntactic graph
• Not all of them: the syntax-semantics interface is not trivial
Sequa makes and repairs jet engines.
[Figure: syntactic dependency arcs (ROOT, SBJ, COORD, CONJ, NMOD, OBJ) and semantic role arcs drawn over the sentence; the first build shows the creator and creation arcs, the second adds entity repaired and repairer]
Graph Convolutional Networks (message passing)
Undirected graph; update of the blue node
[Gori et al. 2005; Scarselli et al. 2009; Kipf and Welling, 2016]
$$h_i = \mathrm{ReLU}\Big( W_0 h_i + \sum_{j \in N(i)} W_1 h_j \Big)$$
Self loop: the term $W_0 h_i$. Neighborhood: the sum of $W_1 h_j$ over the neighbors $j \in N(i)$.
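The node update, h_i = ReLU(W0 h_i + sum over neighbors j of W1 h_j), can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: the adjacency-matrix formulation, the function name `gcn_layer`, and the toy graph are assumptions.

```python
import numpy as np

def gcn_layer(H, A, W0, W1):
    """One message-passing step: h_i = ReLU(W0 h_i + sum_{j in N(i)} W1 h_j).

    H:  (n, d) node features, one row per node
    A:  (n, n) symmetric 0/1 adjacency matrix without self-loops
    W0: (d_out, d) self-loop weights
    W1: (d_out, d) neighbor weights
    """
    self_term = H @ W0.T           # W0 h_i for every node i
    neigh_term = A @ (H @ W1.T)    # sum of W1 h_j over the neighbors j of i
    return np.maximum(self_term + neigh_term, 0.0)

# Toy undirected path graph 0 - 1 - 2 with 2-dimensional features.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
W0 = np.eye(2)
W1 = 0.5 * np.eye(2)

H1 = gcn_layer(H, A, W0, W1)
print(H1)
```

Multiplying by `A` performs the neighborhood sum for all nodes at once, so one layer costs a sparse matrix product rather than an explicit loop over edges.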
GCNs Pipeline
Input X = H(0) → hidden layer H(1) → hidden layer H(2) → … → output Z = H(n)
• X = H(0): initial feature representation of nodes
• Z = H(n): representation informed by nodes’ neighborhood
[Kipf and Welling, 2016]
Extend GCNs for syntactic dependency trees
Example
Lane disputed those estimates
[Figure: dependency tree over the sentence with arcs SBJ (disputed → Lane), OBJ (disputed → estimates) and NMOD (estimates → those)]
[Figure, built up over several slides: one syntactic GCN layer. Every word sends its own representation through $W^{(1)}_{self}$ (self-loop). Messages along an arc, from head to dependent, use $W^{(1)}_{subj}$, $W^{(1)}_{obj}$, $W^{(1)}_{nmod}$; messages against the arc, from dependent to head, use $W^{(1)}_{subj'}$, $W^{(1)}_{obj'}$, $W^{(1)}_{nmod'}$. Each word sums its incoming messages and applies ReLU(Σ·). A second layer repeats the same pattern with the $W^{(2)}$ matrices.]
Stacking GCNs widens the syntactic neighborhood
[Marcheggiani and Titov, 2017]
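The claim that stacking widens the syntactic neighborhood can be checked on the example sentence. The sketch below only tracks which words can influence which after a given number of layers; the head/dependent indices follow the usual dependency analysis of the sentence, and the function name `receptive_field` is hypothetical.

```python
import numpy as np

tokens = ["Lane", "disputed", "those", "estimates"]
arcs = [(1, 0), (1, 3), (3, 2)]   # (head, dependent): SBJ, OBJ, NMOD

n = len(tokens)
M = np.eye(n, dtype=int)          # self-loops
for head, dep in arcs:
    M[head, dep] = M[dep, head] = 1   # messages flow in both directions

def receptive_field(word, num_layers):
    """Words whose input can influence `word` after `num_layers` GCN layers."""
    reach = np.linalg.matrix_power(M, num_layers)[tokens.index(word)] > 0
    return [t for t, r in zip(tokens, reach) if r]

print(receptive_field("Lane", 1))   # Lane and its head
print(receptive_field("Lane", 2))   # two hops away
print(receptive_field("Lane", 3))   # the whole sentence
```

With one layer "Lane" only sees its head "disputed"; it takes three layers before "those", three arcs away, can contribute.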
Syntactic GCNs
$$h_v^{(k+1)} = \mathrm{ReLU}\Big( \sum_{u \in N(v)} \big( W^{(k)}_{L(u,v)} h_u^{(k)} + b^{(k)}_{L(u,v)} \big) \Big)$$
• $N(v)$ is the syntactic neighborhood of $v$; the self-loop is included in $N$
• Each term $W^{(k)}_{L(u,v)} h_u^{(k)} + b^{(k)}_{L(u,v)}$ is a message; messages are direction and label specific
• Overparametrized: one matrix for each label-direction pair
• The weight matrices are therefore tied to the edge direction only, $W^{(k)}_{L(u,v)} = V^{(k)}_{dir(u,v)}$, while the biases remain label specific
[Marcheggiani and Titov, 2017]
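A loop-based sketch of one syntactic GCN layer, using the tied form in which the weight matrix depends only on the direction of the edge (along the arc, against it, or self-loop) and the bias depends on the label as well. The dictionary keys, the function name `syntactic_gcn_layer`, and the random toy parameters are assumptions for illustration; this is not the authors' implementation.

```python
import numpy as np

def syntactic_gcn_layer(H, edges, V, b):
    """h_v' = ReLU( sum_{u in N(v)} (V[dir(u,v)] h_u + b[L(u,v)]) ).

    H:     (n, d) input representations, one row per word
    edges: list of (head, dependent, label) dependency arcs
    V:     dict direction -> (d_out, d) matrix; directions are
           "along" (head -> dependent), "opposite", and "self"
    b:     dict (label, direction) -> (d_out,) bias vector
    """
    n = H.shape[0]
    out = np.zeros((n, V["self"].shape[0]))
    for v in range(n):                                   # self-loop, part of N(v)
        out[v] += V["self"] @ H[v] + b[("self", "self")]
    for head, dep, label in edges:
        out[dep] += V["along"] @ H[head] + b[(label, "along")]        # head -> dependent
        out[head] += V["opposite"] @ H[dep] + b[(label, "opposite")]  # dependent -> head
    return np.maximum(out, 0.0)

# "Lane disputed those estimates"
edges = [(1, 0, "SBJ"), (1, 3, "OBJ"), (3, 2, "NMOD")]
rng = np.random.default_rng(0)
d = 3
H = rng.normal(size=(4, d))
V = {k: rng.normal(size=(d, d)) for k in ("along", "opposite", "self")}
b = {(lab, dr): np.zeros(d) for lab in ("SBJ", "OBJ", "NMOD") for dr in ("along", "opposite")}
b[("self", "self")] = np.zeros(d)

H1 = syntactic_gcn_layer(H, edges, V, b)
print(H1.shape)
```

With this tying the layer holds three weight matrices no matter how many dependency labels the treebank uses, whereas the untied form needs one matrix per label-direction pair.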
Edge-wise Gates
• Not all edges are equally important for the final task
• We should not blindly rely on predicted syntax
• Gates decide the “importance” of each message
• Gates depend on nodes and edges
[Figure: the dependency tree of "Lane disputed those estimates" (SBJ, OBJ, NMOD) with a scalar gate g on every message entering each word's ReLU(Σ·)]
[Marcheggiani and Titov, 2017]
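The slides do not spell out how a gate is computed. The sketch below assumes one simple form that fits "gates depend on nodes and edges": a scalar sigmoid of the sending node's vector dotted with a direction-specific vector, plus a label-specific scalar bias. The names `edge_gate`, `gated_message`, `v_dir` and `b_label` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def edge_gate(h_u, v_dir, b_label):
    """Scalar gate in (0, 1) for the message node u sends over one edge."""
    return sigmoid(h_u @ v_dir + b_label)

def gated_message(h_u, W, b, v_dir, b_label):
    """Message W h_u + b, scaled by its gate before it is summed at the receiver."""
    return edge_gate(h_u, v_dir, b_label) * (W @ h_u + b)

h_u = np.array([1.0, -2.0])
W = np.eye(2)
b = np.zeros(2)

# A large positive bias opens the gate; a large negative bias closes it.
open_msg = gated_message(h_u, W, b, v_dir=np.zeros(2), b_label=20.0)
closed_msg = gated_message(h_u, W, b, v_dir=np.zeros(2), b_label=-20.0)
print(open_msg, closed_msg)
```

A gate near zero lets the model discard a message, which is how it can downweight an arc that a parser predicted wrongly.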
Our Model
• Word representation
• Bidirectional LSTM encoder
• GCN encoder
• Local role classifier
[Marcheggiani and Titov, 2017]
Word Representation
• Pretrained word embeddings
• Word embeddings
• POS tag embeddings
• Predicate lemma embeddings
[Figure: a word representation vector under each word of "Lane disputed those estimates"]
[Marcheggiani and Titov, 2017]
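One way to read the four components is as vectors that are concatenated per word. The sketch below assumes exactly that; the vocabulary, tag set, embedding sizes, random tables, and the `"<none>"` lemma used for words that are not the predicate are all placeholders chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"Lane": 0, "disputed": 1, "those": 2, "estimates": 3}
pos_tags = {"NNP": 0, "VBD": 1, "DT": 2, "NNS": 3}
lemmas = {"<none>": 0, "dispute": 1}   # "<none>" for non-predicates (assumption)

pretrained = rng.normal(size=(len(vocab), 4))   # stand-in for fixed pretrained vectors
word_emb = rng.normal(size=(len(vocab), 4))     # trainable word embeddings
pos_emb = rng.normal(size=(len(pos_tags), 2))   # POS tag embeddings
lemma_emb = rng.normal(size=(len(lemmas), 2))   # predicate lemma embeddings

def represent(word, pos, predicate_lemma="<none>"):
    """Concatenate the four embeddings into one word representation."""
    return np.concatenate([
        pretrained[vocab[word]],
        word_emb[vocab[word]],
        pos_emb[pos_tags[pos]],
        lemma_emb[lemmas[predicate_lemma]],
    ])

x_pred = represent("disputed", "VBD", "dispute")
x_arg = represent("estimates", "NNS")
print(x_pred.shape, x_arg.shape)
```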
BiLSTM Encoder
• Encode each word with its left and right context
• Stacked BiLSTM
[Figure: word representations of "Lane disputed those estimates" feeding J layers of BiLSTM]
[Marcheggiani and Titov, 2017]
GCNs Encoder
• Syntactic GCNs after BiLSTM encoder
• Add syntactic information
• Skip connections
• Longer dependencies are captured
[Figure: word representations of "Lane disputed those estimates" → J layers BiLSTM → K layers GCN over the arcs nsubj, dobj, nmod]
[Marcheggiani and Titov, 2017]
Semantic Role Classifier
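[Figure: the full model on "Lane disputed those estimates": word representations → J layers BiLSTM → K layers GCN (arcs nsubj, dobj, nmod) → classifier. The classifier takes the candidate argument representation concatenated (∘) with the predicate representation and predicts a role, here A1.]
• Local log-linear classifier
$$p(r \mid t_i, t_p, l) \propto \exp\big( W_{l,r}\,(t_i \circ t_p) \big)$$
Here $t_i$ is the candidate argument representation, $t_p$ the predicate representation, $l$ the predicate lemma, $r$ the role, and $\circ$ denotes concatenation.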
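The log-linear classifier, p(r | t_i, t_p, l) ∝ exp(W_{l,r}(t_i ∘ t_p)), is a softmax over role scores. The sketch below treats the per-lemma weight matrix as given, with one row per role; the function name `role_distribution`, the three-role inventory, and the toy numbers are assumptions.

```python
import numpy as np

def role_distribution(t_i, t_p, W_l):
    """p(r | t_i, t_p, l) proportional to exp(W_{l,r} (t_i ∘ t_p)).

    t_i: candidate argument representation, shape (d,)
    t_p: predicate representation, shape (d,)
    W_l: (num_roles, 2d) weights for predicate lemma l, one row per role r
    """
    scores = W_l @ np.concatenate([t_i, t_p])   # ∘ is concatenation
    scores = scores - scores.max()              # stabilise the exponentials
    p = np.exp(scores)
    return p / p.sum()

roles = ["A0", "A1", "_"]            # "_" = not an argument (assumed label)
t_i = np.array([1.0, 0.0])
t_p = np.array([0.0, 1.0])
W_l = np.array([[2.0, 0.0, 0.0, 0.0],
                [0.0, 0.0, 0.0, 0.0],
                [0.0, 0.0, 0.0, 0.0]])

p = role_distribution(t_i, t_p, W_l)
print(roles[int(p.argmax())], p)
```

The classifier is local: each candidate word is scored independently of the roles assigned to the others.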
Experiments
• Data
  • CoNLL-2009 dataset: English and Chinese
  • F1 evaluation measure
• Model
  • Hyperparameters tuned on English development set
  • State-of-the-art predicate disambiguation models
[Marcheggiani and Titov, 2017]
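For reference, F1 is the harmonic mean of precision and recall. The sketch below computes it over sets of (predicate, argument head, role) triples; it is a simplified illustration and not the official CoNLL-2009 scorer, and the triple format is an assumption.

```python
def f1_score(gold, predicted):
    """Harmonic mean of precision and recall over labeled items."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)

gold = {("makes", "Sequa", "Creator"), ("makes", "engines", "Creation")}
pred = {("makes", "Sequa", "Creator"), ("makes", "jet", "Creation")}
score = f1_score(gold, pred)
print(score)
```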
Ablation Experiments (Dev set)
SRL w/o predicate disambiguation (F1):
• English: BiLSTM 82.7, GCN 83.3
• Chinese: BiLSTM 75.2, GCN 77.1
[Marcheggiani and Titov, 2017]
English Test Set
SRL with predicate disambiguation (F1):
• FitzGerald et al. (2015) (global): 87.3
• Roth and Lapata (2016) (global): 87.7
• Marcheggiani et al. (2017, CoNLL) (local): 87.7
• Ours (Bi-LSTM + GCN) (local): 88.0
[Marcheggiani and Titov, 2017]
English Out of Domain
SRL with predicate disambiguation (F1):
• FitzGerald et al. (2015) (global): 75.2
• Roth and Lapata (2016) (global): 76.1
• Marcheggiani et al. (2017, CoNLL) (local): 77.7
• Ours (Bi-LSTM + GCN) (local): 77.2
[Marcheggiani and Titov, 2017]
English Test Set (Ensemble)
SRL with predicate disambiguation (F1):
• FitzGerald et al. (2015) (ensemble): 87.7
• Roth and Lapata (2016) (ensemble): 87.9
• Ours (Bi-LSTM + GCN) (ensemble): 89.1
[Marcheggiani and Titov, 2017]
Chinese Test Set
SRL with predicate disambiguation (F1):
• Zhao et al. (2009) (global): 77.7
• Björkelund et al. (2009) (global): 78.6
• Roth and Lapata (2016) (global): 79.4
• Ours (Bi-LSTM + GCN) (local): 82.5
[Marcheggiani and Titov, 2017]
Syntactic Graph Convolutional Networks
• Fast and simple
• Can be seamlessly applied to other tasks
Graph Convolutional Encoders for Syntax-aware Machine Translation. Joost Bastings, Ivan Titov, Wilker Aziz, Diego Marcheggiani, Khalil Sima'an. In Proceedings of EMNLP, 2017.
Improvements on English to German and English to Czech translations
Multi-document Question Answering
[De Cao et al., 2018]
• Nodes are entities and edges are co-reference links
• Inference on a graph representing the document collection
Syntactic Graph Convolutional Networks
![Page 61: Encoding Linguistic Structures with Graph Convolutional Networks · 2019-11-10 · @South England NLP Meetup. Structured (Linguistic) Priors ... 2017. Exploiting Semantics in Neural](https://reader033.fdocuments.net/reader033/viewer/2022042914/5f4d30c36d689868d17bdbcb/html5/thumbnails/61.jpg)
Outline
• Semantic Role Labeling
• Graph Convolutional Networks (GCN)
• Syntactic GCN for Semantic Role Labeling (SRL)
• SRL Model
• Exploiting Semantics in Neural Machine Translation with GCNs
![Page 63: Encoding Linguistic Structures with Graph Convolutional Networks · 2019-11-10 · @South England NLP Meetup. Structured (Linguistic) Priors ... 2017. Exploiting Semantics in Neural](https://reader033.fdocuments.net/reader033/viewer/2022042914/5f4d30c36d689868d17bdbcb/html5/thumbnails/63.jpg)
Motivations [Marcheggiani et al., 2018]
SRL helps to generalize over different surface realizations of the same underlying “meaning”.
John gave his wonderful wife a nice present .
Giver: John · Entity given to: his wonderful wife · Thing given: a nice present
John gave a nice present to his wonderful wife .
Giver: John · Thing given: a nice present · Entity given to: his wonderful wife
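Note (illustrative sketch, not from the talk): one way to see the generalization is to write both sentences as predicate-argument frames and compare them. The `frame` helper and the hand-written role annotations below are assumptions for illustration; a real pipeline would obtain them from an SRL system.

```python
from typing import Dict, Tuple

# A frame is a predicate plus a mapping from role name to argument text.
Frame = Tuple[str, Dict[str, str]]

def frame(predicate: str, **roles: str) -> Frame:
    """Hypothetical helper: bundle a predicate with its labeled arguments."""
    return predicate, dict(roles)

# "John gave his wonderful wife a nice present ."
double_object = frame(
    "gave",
    giver="John",
    entity_given_to="his wonderful wife",
    thing_given="a nice present",
)

# "John gave a nice present to his wonderful wife ."
prepositional = frame(
    "gave",
    giver="John",
    thing_given="a nice present",
    entity_given_to="his wonderful wife",
)

# Word order differs on the surface, but the role structure is identical.
same_meaning = double_object == prepositional
print(same_meaning)
```

The comparison ignores argument order because roles are stored by name, which is the property the slide attributes to SRL.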
![Page 65: Encoding Linguistic Structures with Graph Convolutional Networks · 2019-11-10 · @South England NLP Meetup. Structured (Linguistic) Priors ... 2017. Exploiting Semantics in Neural](https://reader033.fdocuments.net/reader033/viewer/2022042914/5f4d30c36d689868d17bdbcb/html5/thumbnails/65.jpg)
Motivations
Lost in translation
![Page 66: Encoding Linguistic Structures with Graph Convolutional Networks · 2019-11-10 · @South England NLP Meetup. Structured (Linguistic) Priors ... 2017. Exploiting Semantics in Neural](https://reader033.fdocuments.net/reader033/viewer/2022042914/5f4d30c36d689868d17bdbcb/html5/thumbnails/66.jpg)
Related work
• Semantics in statistical MT
  • [Wu and Fung, 2009]
  • [Liu and Gildea, 2010]
  • [Aziz et al., 2011]
  • ...
• Syntax in neural MT
  • [Sennrich and Haddow, 2016]
  • [Aharoni and Goldberg, 2017]
  • [Bastings et al., 2017]
  • ...
• Semantics in neural MT
  • ???
[Marcheggiani et al., 2018]
![Page 67: Encoding Linguistic Structures with Graph Convolutional Networks · 2019-11-10 · @South England NLP Meetup. Structured (Linguistic) Priors ... 2017. Exploiting Semantics in Neural](https://reader033.fdocuments.net/reader033/viewer/2022042914/5f4d30c36d689868d17bdbcb/html5/thumbnails/67.jpg)
Predicate-argument encoding
John gave his wonderful wife a nice present
Roles: Giver, Thing given, Entity given to
[Figure: two stacked Semantic GCN layers over the sentence. Predicate-argument edges carry role-specific weights W_A0, W_A1, W_A2, with primed weights W_A0', W_A1', W_A2' on the corresponding reverse edges; every word also has a self-loop with weight W_self.]
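Note (a minimal sketch, not the authors' implementation): the figure can be read as message passing along labeled predicate-argument edges. The NumPy code below assumes one weight matrix per role label and direction, as drawn on the slide, and leaves out the gates and biases a full model may use. The edge list, dimensions, and random weights are made up for illustration, and both layers share weights only to keep the example short.

```python
import numpy as np

def semantic_gcn_layer(h, edges, weights, w_self):
    """One simplified GCN layer over predicate-argument edges.

    h       : (n_words, d) array of word states
    edges   : list of (predicate_index, argument_index, role_label)
    weights : dict mapping "A0", "A0'", ... to (d, d) matrices
    w_self  : (d, d) self-loop matrix
    """
    out = h @ w_self                                  # self-loop message for every word
    for pred, arg, role in edges:
        out[arg] += h[pred] @ weights[role]           # predicate -> argument
        out[pred] += h[arg] @ weights[role + "'"]     # argument -> predicate (primed weight)
    return np.maximum(out, 0.0)                       # ReLU

words = "John gave his wonderful wife a nice present".split()
# Assumed dependency-style SRL graph: "gave" is the predicate, arguments are head words.
edges = [(1, 0, "A0"), (1, 7, "A1"), (1, 4, "A2")]

rng = np.random.default_rng(0)
d = 4
h0 = rng.normal(size=(len(words), d))
weights = {r: rng.normal(size=(d, d)) for r in ("A0", "A1", "A2", "A0'", "A1'", "A2'")}
w_self = np.eye(d)

h1 = semantic_gcn_layer(h0, edges, weights, w_self)  # first Semantic GCN layer
h2 = semantic_gcn_layer(h1, edges, weights, w_self)  # second layer (weights reused for brevity)
print(h2.shape)
```

Words without semantic edges (for example "his") only receive their own self-loop message, so stacking layers is what lets information travel more than one hop.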
![Page 68: Encoding Linguistic Structures with Graph Convolutional Networks · 2019-11-10 · @South England NLP Meetup. Structured (Linguistic) Priors ... 2017. Exploiting Semantics in Neural](https://reader033.fdocuments.net/reader033/viewer/2022042914/5f4d30c36d689868d17bdbcb/html5/thumbnails/68.jpg)
Our Model
• Standard sequence-to-sequence with attention
• Semantic GCN encoder on top of a bidirectional RNN
• RNN decoder
[Marcheggiani et al., 2018]
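Note (hedged sketch): the attention step that links the encoder states to the decoder can be illustrated as below. Plain dot-product scoring is swapped in for simplicity; the slides do not say which scoring function the model uses, and the random arrays only stand in for GCN outputs and a decoder state.

```python
import numpy as np

def attend(decoder_state, encoder_states):
    """Dot-product attention: weight encoder states by similarity to the decoder state."""
    scores = encoder_states @ decoder_state      # one score per source word
    weights = np.exp(scores - scores.max())      # numerically stable softmax
    weights /= weights.sum()
    context = weights @ encoder_states           # weighted sum of encoder states
    return weights, context

rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(8, 4))         # stand-in for BiRNN + GCN outputs (8 words)
decoder_state = rng.normal(size=4)               # stand-in for the current decoder state

attn_weights, context = attend(decoder_state, encoder_states)
print(attn_weights.round(3))
```

The decoder would consume `context` together with the previous target token at every step.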
![Page 69: Encoding Linguistic Structures with Graph Convolutional Networks · 2019-11-10 · @South England NLP Meetup. Structured (Linguistic) Priors ... 2017. Exploiting Semantics in Neural](https://reader033.fdocuments.net/reader033/viewer/2022042914/5f4d30c36d689868d17bdbcb/html5/thumbnails/69.jpg)
Our model
John gave his wonderful wife a nice present
[Figure: the sentence is encoded by a BiRNN/CNN, followed by two stacked Semantic GCN layers (edge weights W_A0, W_A1, W_A2, W_A0', W_A1', W_A2' and self-loop weights W_self). An attention mechanism connects the encoder to an RNN decoder, illustrated with the tokens "<bos> John".]
[Marcheggiani et al., 2018]
![Page 71: Encoding Linguistic Structures with Graph Convolutional Networks · 2019-11-10 · @South England NLP Meetup. Structured (Linguistic) Priors ... 2017. Exploiting Semantics in Neural](https://reader033.fdocuments.net/reader033/viewer/2022042914/5f4d30c36d689868d17bdbcb/html5/thumbnails/71.jpg)
Experiments
• Data
  • WMT '16 English-German dataset (~4.5 million sentence pairs)
  • BLEU as evaluation measure
• Model
  • Hyperparameters tuned on News Commentary En-De (~226K sentence pairs)
  • GRU as RNN
[Marcheggiani et al., 2018]
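Note (illustrative sketch, not the evaluation script used for the reported numbers): BLEU combines clipped n-gram precisions with a brevity penalty. The unsmoothed, single-reference, sentence-level version below is a simplification; reported scores are normally corpus-level and computed with a standard tool. The two German strings are toy inputs chosen for illustration.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed single-reference BLEU on whitespace tokens (illustrative only)."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if clipped == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates that are not longer than the reference.
    brevity = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)

reference = "John verkaufte das Auto an Mark ."
exact = sentence_bleu("John verkaufte das Auto an Mark .", reference)
one_wrong_word = sentence_bleu("John verkaufte das Auto nach Mark .", reference)
print(exact, round(one_wrong_word, 3))
```

A single wrong preposition breaks several higher-order n-grams at once, which is why the second score drops well below the first.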
![Page 75: Encoding Linguistic Structures with Graph Convolutional Networks · 2019-11-10 · @South England NLP Meetup. Structured (Linguistic) Priors ... 2017. Exploiting Semantics in Neural](https://reader033.fdocuments.net/reader033/viewer/2022042914/5f4d30c36d689868d17bdbcb/html5/thumbnails/75.jpg)
Results
[Marcheggiani et al., 2018]
Full WMT 2016 English-German BLEU (bar chart, y-axis from 20 to 26):
• BiRNN (Bastings et al. 2017): 23.3
• BiRNN + Syntactic GCN (Bastings et al. 2017): 23.9
• BiRNN + Semantic GCN: 24.5 (+1.2 BLEU over BiRNN)
• BiRNN + Syntactic GCN + Semantic GCN: (bar not shown on this slide)
Semantics is helpful
![Page 78: Encoding Linguistic Structures with Graph Convolutional Networks · 2019-11-10 · @South England NLP Meetup. Structured (Linguistic) Priors ... 2017. Exploiting Semantics in Neural](https://reader033.fdocuments.net/reader033/viewer/2022042914/5f4d30c36d689868d17bdbcb/html5/thumbnails/78.jpg)
Results
[Marcheggiani et al., 2018]
Full WMT 2016 English-German BLEU (bar chart, y-axis from 20 to 26):
• BiRNN (Bastings et al. 2017): 23.3
• BiRNN + Syntactic GCN (Bastings et al. 2017): 23.9
• BiRNN + Semantic GCN: 24.5
• BiRNN + Syntactic GCN + Semantic GCN: 24.9 (+1.6 BLEU over BiRNN)
Syntax and semantics are complementary
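Note (small worked check): the gains quoted on the Results slides follow from subtracting the BiRNN baseline from each score. The pairing of scores with systems below assumes the bars appear in the same order as the chart labels.

```python
# BLEU scores from the Results slides (Full WMT 2016 English-German).
bleu = {
    "BiRNN": 23.3,
    "BiRNN + Syntactic GCN": 23.9,
    "BiRNN + Semantic GCN": 24.5,
    "BiRNN + Syntactic GCN + Semantic GCN": 24.9,
}

baseline = bleu["BiRNN"]
# Round to one decimal to match the precision shown on the slides.
gains = {name: round(score - baseline, 1) for name, score in bleu.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain} BLEU")
```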
![Page 79: Encoding Linguistic Structures with Graph Convolutional Networks · 2019-11-10 · @South England NLP Meetup. Structured (Linguistic) Priors ... 2017. Exploiting Semantics in Neural](https://reader033.fdocuments.net/reader033/viewer/2022042914/5f4d30c36d689868d17bdbcb/html5/thumbnails/79.jpg)
Analysis
[Marcheggiani et al., 2018]

SOURCE: John sold the car to Mark .
Roles: Seller, Thing sold, Buyer
BiRNN: John verkaufte das Auto nach Mark .
SEM GCN: John verkaufte das Auto an Mark .
Note: BiRNN mistranslates “to” as “nach” (directionality)

SOURCE: The boy sitting on a bench in the park plays chess .
Roles: Thing sitting, Location, AM-LOC, Player, Game
BiRNN: Der Junge auf einer Bank im Park spielt Schach .
SEM GCN: Der Junge sitzt auf einer Bank im Park Schach .

SOURCE: The boy walking down the dusty road is drinking a beer .
Roles: Walker, AM-DIR, Drinker, Liquid
BiRNN: Der Junge zu Fuß die staubige Straße ist ein Bier trinken .
SEM GCN: Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier .
![Page 82: Encoding Linguistic Structures with Graph Convolutional Networks · 2019-11-10 · @South England NLP Meetup. Structured (Linguistic) Priors ... 2017. Exploiting Semantics in Neural](https://reader033.fdocuments.net/reader033/viewer/2022042914/5f4d30c36d689868d17bdbcb/html5/thumbnails/82.jpg)
Analysis [Marcheggiani et al., 2018]

SOURCE: John sold the car to Mark .
Roles: Seller, Thing sold, Buyer
BiRNN: John verkaufte das Auto nach Mark .
SEM GCN: John verkaufte das Auto an Mark .

SOURCE: The boy sitting on a bench in the park plays chess .
Roles: Thing sitting, Location, AM-LOC, Player, Game
BiRNN: Der Junge auf einer Bank im Park spielt Schach .
SEM GCN: Der Junge sitzt auf einer Bank im Park Schach .
Note: Both translations are wrong, but the BiRNN’s one is grammatically correct

SOURCE: The boy walking down the dusty road is drinking a beer .
Roles: Walker, AM-DIR, Drinker, Liquid
BiRNN: Der Junge zu Fuß die staubige Straße ist ein Bier trinken .
SEM GCN: Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier .
![Page 85: Encoding Linguistic Structures with Graph Convolutional Networks · 2019-11-10 · @South England NLP Meetup. Structured (Linguistic) Priors ... 2017. Exploiting Semantics in Neural](https://reader033.fdocuments.net/reader033/viewer/2022042914/5f4d30c36d689868d17bdbcb/html5/thumbnails/85.jpg)
Conclusion
• GCNs for encoding linguistic structures into NN
  • Semantics, coreference, discourse
  • Fast
  • Cheap
• State-of-the-art model for dependency-based SRL
• First to exploit semantics in NMT
![Page 90: Encoding Linguistic Structures with Graph Convolutional Networks · 2019-11-10 · @South England NLP Meetup. Structured (Linguistic) Priors ... 2017. Exploiting Semantics in Neural](https://reader033.fdocuments.net/reader033/viewer/2022042914/5f4d30c36d689868d17bdbcb/html5/thumbnails/90.jpg)
Roadmap
Including structured bias into neural NLP models
Low-resource setting
Long-range dependencies
Document level
Cross-document level
Integrating external knowledge i.e., knowledge graphs
Thanks for your attention!