ARTIFICIAL INTELLIGENCE 217
Learning and Applying Contextual Constraints in Sentence Comprehension*

Mark F. St. John** and James L. McClelland
Department of Psychology, Carnegie-Mellon University, Pittsburgh, PA 15213, USA
ABSTRACT

A parallel distributed processing model is described that learns to comprehend single clause sentences. Specifically, it assigns thematic roles to sentence constituents, disambiguates ambiguous words, instantiates vague words, and elaborates implied roles. The sentences are pre-segmented into constituent phrases. Each constituent is processed in turn to update an evolving representation of the event described by the sentence. The model uses the information derived from each constituent to revise its ongoing interpretation of the sentence and to anticipate additional constituents. The network learns to perform these tasks through practice on processing example sentence/event pairs. The learning procedure allows the model to take a statistical approach to solving the bootstrapping problem of learning the syntax and semantics of a language from the same data. The model performs very well on the corpus of sentences on which it was trained, and generalizes to sentences on which it was not trained, but learns slowly.

1. Introduction
The goal of our research has been to develop a model that can learn to convert a simple sentence into a conceptual representation of the event that the sentence describes. Specifically, we have been concerned with the later stages of this process: the conversion of a sequence of sentence constituents, such as noun phrases, into a representation of the event. A number of problems make this process difficult. First, the words of a sentence may be ambiguous or vague. In the sentence, "The pitcher threw the ball," each content word is ambiguous. "Pitcher" could either refer to a ball-player or a container;

* The authors would like to thank Geoffrey Hinton, Brian MacWhinney, Andrew Hudson, and the members of the PDP Research Group at Carnegie-Mellon. This research was supported by NSF Grant BNS 86-09729, ONR Contracts N00014-86-0146, N00014-86-00167, and N00014-86-0349, and NIMH Career Development Award MH00385 to the second author.
** Present address: Department of Cognitive Science, University of California at San Diego, La Jolla, CA 92093, USA.
Artificial Intelligence 46 (1990) 217-257
0004-3702/90/$03.50 © 1990 Elsevier Science Publishers B.V. (North-Holland)

"threw" could either refer to toss or host; and "ball" could refer to a sphere or a dance. How are the appropriate meanings selected so that a single, coherent interpretation of the sentence is produced? Vague words also present difficulties. In the sentences, "The container held the apples" and "The container held the cola," the word "container" refers to two different objects (1). How does the context affect the interpretation of vague words?

A third problem is the complexity of assigning the correct thematic roles (9) to the objects referred to in a sentence. Consider:

(1) The teacher ate the spaghetti with the busdriver.
(2) The teacher ate the spaghetti with the red sauce.
(3) The busdriver hit the fireman.
(4) The busdriver was hit by the fireman.

In the first two examples, semantics play an important role. In the first sentence, it is the reader's knowledge that busdrivers are people that precludes the reader from deciding the busdriver is to be served as a condiment. Instead it must be that both he and the teacher are eating the spaghetti. Semantic constraints work conversely in the second sentence. In the third sentence, semantics do not help determine who is the agent and who is the patient. Instead, word order determines the thematic role assignments. The busdriver is the agent because "the busdriver" is the pre-verbal constituent. Finally, in the fourth sentence, the influence of other morphological features can be seen. The passive verb tense and the "by" preposition, in conjunction with the word order, determine that the busdriver is the patient. Thematic role assignment, then, requires the joint consideration of a variety of aspects of the sentence.

A fourth problem for processing sentences is that a sentence may leave some thematic constituents implicit that are nevertheless present in the event. For example, in sentences (1) and (2) above, the spaghetti was undoubtedly eaten with forks. Psychological evidence indicates that missing constituents, when strongly related to the action, are inferred and added to the description of the event. McKoon and Ratcliff (23) found, for example, that "hammer" was inferred after subjects read "Bobby pounded the boards together with nails."

Our model of the comprehension process centers on viewing the process as a form of constraint satisfaction. The surface features of a sentence, its particular words and their order and morphology, provide a rich set of constraints on the sentence's meaning. Each feature constrains the meaning in a number of respects. Conjunctions of features, such as word order and passive-voice morphology, provide additional constraints. Together, the constraints lead to a coherent interpretation of the sentence (17). These constraints are not typically all-or-none. Instead, constraints tend to vary in strength: some are strong and others are relatively weak. An example adapted from Marcus (18) provides an illustration of the competition between constraints.

(1) Which dragon did the knight give the boy?
(2) Which boy did the knight give the dragon?
(3) Which boy did the knight give the sword?
(4) Which boy did the knight give to the sword?

Apparently, in the first two sentences, a weak syntactic constraint makes us prefer the first noun as the patient and the noun after the verb as the recipient. The subtle semantics in the second sentence, that knights don't give boys to dragons, does not override the syntactic constraint for most readers, though it may make the sentence seem ungrammatical to some. In sentence (3), a stronger semantic constraint overrides this syntactic constraint: swords, which are inanimate objects, cannot receive boys. Finally, in the fourth sentence, a stronger syntactic constraint overrides the semantics. It is clear from this example that constraints vary in strength and compete to produce an interpretation of a sentence. A good method for capturing this competition is to assign real-valued strengths to the constraints, and to allow them to compete or cooperate according to their strength.

Parallel distributed processing, or connectionist, models are particularly good for modeling this style of processing. They allow large amounts of information to be processed simultaneously and competitively, and they allow evidence to be weighted on a continuum (20, 22). A number of researchers have pursued this idea and have built models to apply connectionism to sentence processing (5, 6, 35). The development of this approach, however, has been retarded because it is difficult to determine exactly what constraints are imposed by each feature or set of features in a sentence. It is even more difficult to determine the appropriate strengths each of these constraints should have. Connectionist learning procedures, however, allow a model to learn the appropriate constraints and assign appropriate strengths to them. To take advantage of this feature, learning was added to our list of goals.

The model is given a sentence as input. From the sentence, the model must produce a representation of the event to which the sentence refers. The actual event that corresponds to the sentence is then used as feedback to train the model. But learning is not without its own problems. Several features of the learning task make learning difficult. One problem is that the environment is probabilistic. On different occasions, a sentence may refer to different events, that is, it may be referentially ambiguous. For example, a sentence like, "The pitcher threw the ball," may refer to either the tossing of a projectile or the hosting of a party. The robust, graded, and incremental character of connectionist learning algorithms leads us to hope that they will be able to cope with the variability in the environment in which they learn.

A second learning problem concerns the difficulty of learning the mapping between the parts of the sentence and the parts of the event (10, 26). Learning the mapping is sometimes referred to as a bootstrapping problem since the meaning of the content words and significance of the syntax must be acquired from the same set of data.
To learn the syntax, it seems necessary to already know the word meanings. Conversely, to learn the word meanings it seems necessary to know how the syntax maps the words onto the event description.

The connectionist learning procedure takes a statistical approach to this problem. Through exposure to large numbers of sentences and the events they describe, the mapping between features of the sentences and characteristics of the events will emerge as statistical regularities. For instance, in the long run, the learning procedure should discover the regularity that sentences beginning with "the boy" and containing a transitive verb in the active voice refer to events in which a young, male human participates as an agent. The discovery of the entire ensemble of such regularities provides a joint solution to the problems of learning the syntax and the meanings of words.

Some aspects of these goals have been addressed by our own earlier work (21, 30). However, these previous models used a cumbersome a priori representation of sentences that proved unworkable (see (30) for discussion). Given the recent successes in using connectionist learning procedures to learn internal representations (12, 28), we decided to explore the feasibility of having a network learn its own representation of sentences.

A final characteristic of language comprehension we wanted to capture is sometimes called the principle of immediate update (3, 19, 34). As each constituent of the sentence is encountered, the interpretation of the entire event is adjusted to reflect the constraints arising from the new constituent in conjunction with the constraints from constituents already encountered. Based on all of the available constraints, the model should try to anticipate upcoming constituents. It should also adjust its interpretation of preceding constituents to reflect each new bit of information. In this way, particular sentence interpretations may gain and lose support throughout the course of processing as each new bit of information is processed. This immediate update should be accomplished while avoiding the difficulty of performing backtracking.

In sum, the model addresses six goals:
• to disambiguate ambiguous words;
• to instantiate vague words;
• to assign thematic roles;
• to elaborate implied roles;
• to learn to perform these tasks;
• to immediately adjust its interpretation as each constituent is processed.

2. Description of the SG Model

2.1. Task
The model's task is to process a single clause sentence without embeddings into a representation of the event it describes. The sentence is presented to the model as a temporal sequence of constituents. A constituent is either a simple noun phrase, a prepositional phrase, or a verb (including the auxiliary verb, if any). The information each of these sentence constituents yields is immediately used as evidence to update the model's internal representation of the entire event. This representation is called the sentence gestalt because all of the information from the sentence is represented together within a single, distributed representation; the model is called the Sentence Gestalt, or SG, model because it contains this representation. This general concept of sentence representation comes from Hinton's pioneering work (11). From the sentence gestalt, the model can produce, as output, a representation of the event. This event representation consists of a set of pairs. Each pair consists of a thematic role and the concept that fills that role. Together, the pairs describe the event.

2.2. Architecture and processing
The model consists of two parts. One part, the sequential encoder, sequentially processes each constituent to produce the sentence gestalt. The second part is used to produce the output representation from the sentence gestalt.

2.2.1. Producing the sentence gestalt
To process the constituent phrases of a sentence, we adapted an architecture from Jordan (15) that uses the output of previous processing as input on the next iteration (see Fig. 1). Each constituent is processed in turn to update the sentence gestalt. To process a constituent, it is first represented as a pattern of activation over the current constituent units. Activation from these units projects to the first hidden unit layer and combines with the activation from the sentence gestalt created as the result of processing the previous constituent. The actual implementation of this arrangement is to copy the activation from the sentence gestalt to the previous sentence gestalt units, and allow activation to feed forward from there. Activation in the hidden layer then creates a new

[Figure 1: network diagram; the sentence gestalt is copied back to the previous sentence gestalt units.]
Fig. 1. The architecture of the network. The boxes highlight the functional parts: one area processes the constituents into the sentence gestalt, and the other area processes the sentence gestalt into the output representation. The numbers indicate the number of units in each layer.

pattern of activation over the sentence gestalt units. The sentence gestalt, therefore, is not a superimposition of each constituent. Rather, each new pattern in the sentence gestalt is computed through two layers of weights and represents the model's new best guess interpretation of the meaning of the sentence.
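
The update described above can be sketched in a few lines. This is a minimal illustration rather than the simulation code: the layer sizes, the random untrained weights, the all-zero starting gestalt, and the names `update_gestalt` and `encode_sentence` are assumptions introduced here, and biases are omitted.

```python
import math
import random

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(weights, inputs):
    # Fully connected layer of logistic units (biases omitted for brevity).
    return [logistic(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def random_weights(n_out, n_in, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]

# Hypothetical layer sizes; the actual sizes are the ones given in Fig. 1.
N_CONSTITUENT, N_HIDDEN, N_GESTALT = 8, 6, 5

rng = random.Random(0)
W_hidden = random_weights(N_HIDDEN, N_CONSTITUENT + N_GESTALT, rng)
W_gestalt = random_weights(N_GESTALT, N_HIDDEN, rng)

def update_gestalt(constituent, previous_gestalt):
    # The current constituent and the copied-back gestalt feed the first
    # hidden layer, which in turn produces the new sentence gestalt.
    hidden = layer(W_hidden, constituent + previous_gestalt)
    return layer(W_gestalt, hidden)

def encode_sentence(constituents):
    gestalt = [0.0] * N_GESTALT  # assumed starting state before any input
    history = []
    for constituent in constituents:
        gestalt = update_gestalt(constituent, gestalt)
        history.append(gestalt)
    return history
```

Because each new gestalt passes through two layers of weights, it is a recomputed interpretation and not a running sum of the constituent patterns.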

2.2.2. Producing the output
As noted previously, several other models have used a type of sentence gestalt to represent a sentence. McClelland and Kawamoto (21) used units that represented the conjunction of semantic features of the verb with the semantic features of a concept. To encode a sentence, the patterns of activity produced for each verb/concept conjunction were activated in a single pool of units that contained every possible conjunction. St. John and McClelland (30) used a similar conjunctive representation to encode a number of sentences at once. These representations suffer from inefficiency and scale badly because so many units are required to represent all of the conjunctions. The current model's representation is far more efficient.

The model's efficiency comes from making the sentence gestalt a trainable, hidden unit layer. Making the sentence gestalt trainable allows the network to create the primitives it needs to represent the sentence efficiently. Instead of having to represent every possible conjunction, only those conjunctions that are useful will be learned and added to the representation. Further, these primitives do not have to be conjunctions between the verb and a concept. A hidden layer could learn to represent conjunctions between the concepts themselves or other combinations of information if they were useful for solving its task.

Since a layer of hidden units cannot be trained by explicitly specifying its activation values, we invented a way of "decoding" the sentence gestalt into an output layer. Backpropagation can then be used to train the hidden layer. The output layer represents the event as a set of thematic role and filler pairs. For example, the event described by "The pitcher threw the ball" would be represented as the set {agent/pitcher(ball-player), action/threw(toss), patient/ball(sphere)}. The words in parentheses indicate which concepts the ambiguous words correspond to.

The output layer can represent one role/filler pair at a time. To decode a particular role/filler pair, the sentence gestalt is probed with half of the pair, either the role or the filler. Activation from the probe and the sentence gestalt combine in the second hidden layer which in turn activates the entire role/filler pair in the output layer. The entire event can be decoded in this way by successively probing with each half of each pair.
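
The probing step can be sketched as follows. The sketch assumes, for illustration only, that the probe is presented over units laid out like the output layer (role units followed by filler units), that weights are random and untrained, and that a handful of roles and concepts stand in for the full sets; `role_probe`, `probe` and `decode_event` are hypothetical names. With untrained weights the activations are meaningless; the point is the flow of activation.

```python
import math
import random

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(weights, inputs):
    # Fully connected layer of logistic units (biases omitted for brevity).
    return [logistic(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def random_weights(n_out, n_in, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]

ROLES = ["agent", "action", "patient"]         # illustrative subset of roles
FILLERS = ["teacher", "ate", "soup", "steak"]  # illustrative subset of concepts
N_GESTALT, N_HIDDEN = 5, 6                     # hypothetical layer sizes
N_OUTPUT = len(ROLES) + len(FILLERS)           # holds one role/filler pair

rng = random.Random(1)
W_hidden = random_weights(N_HIDDEN, N_GESTALT + N_OUTPUT, rng)
W_output = random_weights(N_OUTPUT, N_HIDDEN, rng)

def role_probe(role):
    # Half of a pair: the role unit is on and every filler unit is off.
    return [1.0 if r == role else 0.0 for r in ROLES] + [0.0] * len(FILLERS)

def probe(gestalt, probe_pattern):
    # Probe and gestalt combine in the second hidden layer, which then
    # activates the complete role/filler pair in the output layer.
    hidden = layer(W_hidden, gestalt + probe_pattern)
    return layer(W_output, hidden)

def decode_event(gestalt):
    # Decode the event by probing with each role in turn.
    return {role: probe(gestalt, role_probe(role)) for role in ROLES}
```

Probing with a filler would work the same way, with a filler unit switched on in the probe instead of a role unit.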

When more than one concept can plausibly fill a role, we assume that the correct response is to activate each possible filler to a degree. The degree of activation of the units representing each filler corresponds to the filler's conditional probability of occurring in the given context. The network should learn weights to produce these activations through training. To achieve this goal, we employed an error measure in the learning procedure, cross-entropy (13), that converges on this goal:

$$C = -\sum_j \left[ T_j \log_2 (A_j) + (1 - T_j) \log_2 (1 - A_j) \right]$$

where $T_j$ is the target activation and $A_j$ is the output activation of unit $j$.
As with many connectionist learning procedures, the goal is to minimize the error measure or cost-function (13). The minimum of C occurs at the point in weight space where the activation value of each output unit equals the conditional probability that the unit should be on in the current context. In the model, when the network is probed with a particular role, several of the output units represent the occurrence of a particular filler of that role. When C is at its minimum, the units' activation values represent the conditional probability of the occurrence of that filler, in that role, given the current situation.¹ Note, however, that on any particular training trial, the target of training is one particular event. The minimum of C, then, is defined across the ensemble of training examples the model is shown. Probing with the filler works similarly. The activation value of each role unit in the output layer represents the conditional probability of the probed filler playing that role in the current situation. In performing gradient descent in C, the network is searching for weights that allow it to match activations to these conditional probabilities.
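
A small numerical check makes the claim about the minimum concrete. The sketch below is illustrative only: it considers a single output unit whose target is 1 on a fraction p of trials and 0 otherwise, and searches a grid of candidate activations for the one with the lowest average cost. The helper names `expected_cost` and `best_activation` are introduced here and are not part of the model.

```python
import math

def cross_entropy(targets, activations):
    # C = -sum_j [ T_j log2(A_j) + (1 - T_j) log2(1 - A_j) ]
    return -sum(
        t * math.log2(a) + (1 - t) * math.log2(1 - a)
        for t, a in zip(targets, activations)
    )

def expected_cost(p, a):
    # Average cost for one unit held at activation a when its target
    # is 1 on a fraction p of the training trials and 0 on the rest.
    return p * cross_entropy([1], [a]) + (1 - p) * cross_entropy([0], [a])

def best_activation(p, steps=999):
    # Grid search over activations strictly between 0 and 1.
    candidates = [(i + 1) / (steps + 1) for i in range(steps)]
    return min(candidates, key=lambda a: expected_cost(p, a))
```

For example, `best_activation(0.7)` lands at approximately 0.7: when a filler occurs on 70% of the trials for a given situation, the average cost is lowest when the unit's activation equals that conditional probability, even though every individual trial has a target of exactly 0 or 1.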

2.3. Environment and training regime
Training consists of trials in which the network is presented with a sentence and the event it describes. The rationale is that the language learner experiences some event and then hears a sentence about it. The learner processes the sentence and compares the conceptual representation its comprehension mechanism produces to the conceptual representation it obtained from experiencing the event. Discrepancies are used as feedback for the comprehension mechanism. These sentence/event pairs were generated on-line for each training trial. Some pairs are more likely to be generated than others. Over training, these differences in the likelihood of generation translate into differences in training frequency.

The network is trained to generate the event from the sentence as input. To

¹ The situation, as defined in the learning procedure, is the combination of the previous sentence gestalt, the current constituent, and the current probe. It would be desirable to define the situation solely in terms of the sequence of sentence constituents. While our results suggest that the sentence gestalt learns to save all the relevant information from earlier constituents, we have no proof that it does.

promote immediate processing, a special training regime is used. After each constituent has been processed, the network is trained to predict the set of role/filler pairs of the entire event. From the first constituent of the sentence, then, the model is forced to try to predict the entire event. This training regime, therefore, assumes that the complete event is available to the learning procedure as soon as sentence processing begins, but it does not assume any special knowledge about which aspects of the event correspond to which sentence constituents. Of course, after processing only the first constituent, the model generally cannot correctly guess the entire event. By forcing it to try, this training procedure requires the model to discover the mapping between constituents and aspects of the event, as it forces the model to extract as much information as possible from each constituent. Consequently, as each new constituent is processed, the model's predictions of the event are refined to reflect the additional evidence it supplies.
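
The training regime can be pictured as a schedule of (sentence prefix, probe, target) triples. The sketch below only enumerates that schedule; it does not train anything. The function name `training_examples`, the tuple layout, and the three-role event are assumptions made for illustration (events in the corpus contain more roles than are shown here).

```python
def training_examples(constituents, event):
    """List the (prefix, probe, target) triples for one sentence/event pair.

    After every constituent, the network is probed with each half of every
    role/filler pair of the complete event and trained toward the full pair.
    """
    examples = []
    for n in range(1, len(constituents) + 1):
        prefix = constituents[:n]  # the constituents processed so far
        for role, filler in event:
            examples.append((prefix, ("role", role), (role, filler)))
            examples.append((prefix, ("filler", filler), (role, filler)))
    return examples

# "The teacher ate the soup", with a simplified three-role event.
SENTENCE = [("pre-verbal", "teacher"), ("verbal", "ate"), ("first-post-verbal", "soup")]
EVENT = [("agent", "teacher"), ("action", "ate"), ("patient", "soup")]
SCHEDULE = training_examples(SENTENCE, EVENT)
```

Here `SCHEDULE` has 18 entries: 3 prefixes, each probed with both halves of 3 pairs. Note that the very first prefix, containing only "teacher", is already paired with the patient target, which is what forces the model to anticipate constituents it has not yet seen.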

2.4. An illustration of processing
An example of how a trained network processes a sentence will help illustrate how the model works. To process the sentence, "The teacher ate the soup," the constituents of the sentence are processed in turn. As each constituent is processed, the network performs a type of pattern completion. The model tries to predict the entire event by augmenting the information supplied by the constituents processed so far with additional information that correlates with the information supplied by the constituents. With each additional constituent, the model's predictions improve. Early in the sentence, many possible events are consistent with what little is known about the sentence so far. The completion process activates each of these alternatives slightly, according to their support. As more constituents are processed, the additional evidence more strongly supports fewer possible events.

The pattern of activation over the sentence gestalt can be observed directly, and responses to probes can be examined, to see what it is representing after processing each constituent of the sentence (see Fig. 2). After processing the first constituent, "The teacher," of our example sentence, the network assumes the sentence is in the active voice and therefore assigns teacher to the agent role. The network also fills in the semantic features of teachers (person, adult, and female) according to its previous experience with teachers. When probed with the action role, the network weakly activates a number of possible actions which the teacher performs. The network similarly makes guesses about the other roles for which it is probed.

When the second constituent, "ate," is processed, the sentence gestalt is refined to represent the new information. In addition to representing both that teacher is the agent and that ate is the action, the network is able to make better guesses about the other roles. For example, it infers that the patient is

[Figure 2: "The teacher ate the soup." Panels: Sentence Gestalt Activations; Role/Filler Activations. Output units shown: agent (person, adult, male, female, busdriver, teacher); action (consumed, ate, gave, threw(host), drove(motiv.)); patient (person, adult, child, female, schoolgirl, ball(party), steak, soup, crackers).]
Fig. 2. The evolution of the sentence gestalt during processing. On the left, the activation of part of the sentence gestalt is shown after each sentence constituent has been processed. On the right, the activation of selected output units is shown when the evolving gestalt is probed with each role. The #s correspond to the number of constituents that have been presented to the network at that point. #1 means the network has seen "The teacher;" #2 means it has seen "The teacher ate;" etc. The activations (ranging between 0 and 1) are depicted as the darkened area of each box.

food. Since, in the network's experience, teachers typically eat soup, the network produces activation corresponding to the inference that the food is soup. After the third constituent is processed, the network has settled on an interpretation of the sentence. The thematic roles are represented with their appropriate fillers.

2.5. Specifics of the model

2.5.1. Input representation
Each sentence constituent can be thought of as a surface role/filler pair. It consists of one unit indicating the surface role of the constituent and one unit representing each word in the constituent. One unit stands for each of [...] verbs, 31 nouns, 4 prepositions, 3 adverbs, and 7 ambiguous words. Two of the ambiguous words have two verb meanings, three have two noun meanings, and two have a verb and a noun meaning. Six of the words are vague terms (e.g., someone, something, and food). For prepositional phrases, the preposition and the noun are each represented by a unit in the input. For the verb constituent, the presence of the auxiliary verb "was" is likewise encoded by a separate unit. Articles are not represented, and nouns are assumed to be singular and definite throughout.

The surface role, or location, of each constituent is coded by four units that represent location respective to the verb: pre-verbal, verbal, first-post-verbal, and n-post-verbal. The first-post-verbal unit is active for the constituent immediately following the verb, and the n-post-verbal unit is active for any constituent occurring after the first-post-verbal constituent. A number of constituents, therefore, may share the n-post-verbal position. The sentence, "The ball was hit by someone with the bat in the park," would be encoded as the following ordered set, in which the words in parentheses represent units in the input that are on for each constituent: {(pre-verbal, ball), (verbal, was, hit), (first-post-verbal, by, someone), (n-post-verbal, with, bat), (n-post-verbal, in, park)}. Without the surface roles, the network must learn to use the temporal order of the constituents to produce syntactic constraints. Additional simulation has shown that the network can learn the corpus without the surface roles. Interestingly, removing the surface roles did not slow down learning.
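
The surface role coding can be reproduced with a short function. This sketch assumes the sentence has already been segmented into constituents with articles removed, and that the position of the verb constituent is known; the function name `assign_surface_roles` and the list-of-words input format are choices made here for illustration.

```python
def assign_surface_roles(constituents, verb_index):
    """Tag each pre-segmented constituent with its location relative to the verb."""
    encoded = []
    for i, words in enumerate(constituents):
        if i < verb_index:
            role = "pre-verbal"
        elif i == verb_index:
            role = "verbal"
        elif i == verb_index + 1:
            role = "first-post-verbal"
        else:
            role = "n-post-verbal"  # shared by all later constituents
        encoded.append((role, *words))
    return encoded

# "The ball was hit by someone with the bat in the park" (articles dropped).
EXAMPLE = [["ball"], ["was", "hit"], ["by", "someone"], ["with", "bat"], ["in", "park"]]
ENCODED = assign_surface_roles(EXAMPLE, verb_index=1)
```

`ENCODED` matches the ordered set given in the text: the last two constituents both receive the n-post-verbal role, so only their temporal order distinguishes them.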

2.5.2. Output representation
The output has one unit for each of 9 possible thematic roles (e.g., agent, action, patient, instrument) and one unit for each of 45 concepts, including 28 noun concepts, 14 actions, and 3 adverbs. Additionally, there is a unit for the passive voice. Finally, there are 13 "feature" units, such as male, female, and adult. These units are included in the output to allow the demonstration of more subtle effects of constraints on interpretation (see Appendix A for the complete set of roles and concepts). This representation is not meant to be comprehensive. Instead, it is meant to provide a convenient way to train and demonstrate the processing abilities of the network. Any one role/filler pattern, then, consists of two parts. For the role, one of the 9 role units should be active, and for the filler, a unit representing the concept, action, or adverb should be active. If relevant, some of the feature units or the passive voice unit should be active.

2.5.3. Training environment
While the sentences often include ambiguous or vague words, the events are always specific and complete: each event consists of a specific action and each

² A second output layer was included in the simulations. This layer reproduced the sentence constituent that fits with the role/filler pair being probed. Consequently, the model was required to retain the specific words in the sentence as well as their meaning. Since this aspect of the processing does not fit into the context of the current discussion, these units are not discussed further. Additional simulation has shown that this extra demand on the network does not qualitatively affect its performance.

thematic role related to this action is filled by some specific concept. Accordingly, each event occurs in a particular location, and actions requiring an instrument always have a specific instrument.

Sentence/event pairs are created on-line during training from scaffoldings called sentence-frames. The sentence-frames specify which thematic roles and fillers can be used with that action. Each of the 14 actions has a separate sentence-frame. Four additional frames were made to cover passive versions of sentences involving the actions kissed, shot, hit and gave.

To create a sentence/event pair, a sentence-frame is picked at random and then each thematic role is processed in turn. (Appendix B contains a sample sentence-frame.) For example, let's assume that the Ate sentence-frame is chosen. Agent is the first role processed. First, a concept to fill the role is selected from the set of concepts that can play the agent role in the Ate sentence-frame. This role/filler pair is added to the event description. Since some roles, such as instrument and location, may not be mentioned in the sentence, it is randomly determined, according to a preset probability, whether a role will be included in the sentence. If the role is to be included, a word is chosen to represent the filler in the sentence. Otherwise, the role is left out of the sentence, but it is still included in the event description. Since the agent role must be included in sentences about eating, it is placed in the sentence and a word is chosen. Assuming busdriver is chosen as the filler concept, a word to describe busdriver is selected. For example, the word "someone" might be chosen.
Nex
t , the action role is processed, Since
the
Ate
se
nten
ce-f
ram
e is
bei
ngus
ed, the action must be
ate,
A
wor
d to
desc
ribe
at
e is
then
cho
sen:
"co
n-su
med
" fo
r ex
ampl
e. T
hen
the
patie
nt is
cho
sen,
The
pro
babi
litie
s of
cho
osin
gpa
rtic
ular
pat
ient
s de
pend
upo
n w
hat h
as b
een
sele
cted
for
the
agen
t and
actio
n. G
iven
the
selection of
busd
rive
r as the agent
stea
k is
a m
uch
mor
elikely patient than
soup
. L
et's
assume that
stea
k is
sel
ecte
d, a
nd th
at th
e w
ord
stea
k" is
cho
sen
to r
epre
sent
it. I
n ge
nera
l, by
cha
ngin
g th
e pr
obab
ilitie
s of
sele
ctin
g sp
ecif
ic f
iller
s ac
cord
ing
to w
hich
oth
er f
iller
s ha
ve b
een
sele
cted
so
far,
sta
tistic
al r
egul
ariti
es a
mon
g th
e fi
llers
will
dev
elop
acr
oss
the
corp
us,
In th
e sa
me
way
, the
rem
aini
ng r
ole/
fille
r pa
irs' f
or th
e se
nten
ce-f
ram
e ar
ege
nera
ted.
Ass
umin
g on
ly th
e fir
st th
ree
role
s ar
e ch
osen
to b
e in
clud
ed in
this
sent
ence
, the input sentence w
ill b
e
, "
Som
eone
con
sum
ed th
e st
eak," The
even
t will
be
the
entir
e se
t of
role
/fill
er pairs (agent/busdriver, action/ate,
patie
nt /
stea
k, i
nstr
umen
t / k
nife
, loc
atio
n /li
ving
-roo
m, etc,
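The generation procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the frame contents, the probabilities, the word lists, and the names `ATE_FRAME`, `WORDS`, `pick`, and `generate_pair` are all hypothetical and are not taken from the actual sentence-frames (Appendix B holds a real sample). The one property it tries to mirror is that every role is filled in the event, while only some roles are mentioned in the sentence.

```python
import random

# Hypothetical, heavily simplified "Ate" sentence-frame (illustrative values only).
ATE_FRAME = {
    "action": "ate",
    "agents": {"busdriver": 0.5, "teacher": 0.5},
    # Patient probabilities are conditioned on the agent already selected.
    "patients": {
        "busdriver": {"steak": 0.7, "soup": 0.1, "crackers": 0.2},
        "teacher": {"soup": 0.6, "crackers": 0.3, "steak": 0.1},
    },
    "instruments": {"steak": "knife", "soup": "spoon", "crackers": "fingers"},
    # Preset probability that a role is mentioned in the sentence.
    "include_prob": {"agent": 1.0, "action": 1.0, "patient": 0.8, "instrument": 0.3},
}

# Hypothetical words for each concept, including vague ones.
WORDS = {
    "busdriver": ["the busdriver", "the adult", "someone"],
    "teacher": ["the teacher", "the adult", "someone"],
    "ate": ["ate", "consumed"],
    "steak": ["the steak", "something"],
    "soup": ["the soup", "something"],
    "crackers": ["the crackers", "something"],
    "knife": ["with a knife"],
    "spoon": ["with a spoon"],
    "fingers": ["with fingers"],
}


def pick(rng, dist):
    """Draw one key of `dist` with probability proportional to its value."""
    concepts = list(dist)
    return rng.choices(concepts, weights=[dist[c] for c in concepts])[0]


def generate_pair(frame, rng):
    """Return (sentence constituents, complete event) for one training trial."""
    event = {"action": frame["action"]}
    event["agent"] = pick(rng, frame["agents"])
    event["patient"] = pick(rng, frame["patients"][event["agent"]])
    event["instrument"] = frame["instruments"][event["patient"]]

    constituents = []
    for role in ("agent", "action", "patient", "instrument"):
        # The event always keeps the role; the sentence only sometimes mentions it.
        if rng.random() < frame["include_prob"][role]:
            constituents.append(rng.choice(WORDS[event[role]]))
    return constituents, event


rng = random.Random(0)
constituents, event = generate_pair(ATE_FRAME, rng)
print(" ".join(constituents))
print(event)
```

Conditioning the patient distribution on the agent is what lets statistical regularities among fillers build up across the generated corpus.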
In this way, 120 different events can be generated from the set of frames with some being more likely to appear than others. The most frequent event occurs, on average, 5.5 times per 100 trials, but the least frequent event occurs only 9 times per 10,000. The number of words that can be chosen to describe an event and the option to include or eliminate optional constituents from the sentence brings the number of sentence/event pairs to 22,645.
In training the model on this corpus, sentence-frames were picked at random, and sentence/event pairs were generated according to the random procedure described above. No specific sentences or events were set aside to not be trained.

The sentences are limited in complexity because of the limitations of the event representation. Only one filler can be assigned to a role in a particular sentence. Also, all the roles are assumed to belong to the sentence as a whole. Therefore, no embedded clauses or phrases attached to single constituents are possible.
5.4. Training procedure details
Each training trial consists of first generating a sentence/event pair, and then presenting the sentence to the model for training. The constituents are presented to the model sequentially, one at a time. After the model processes a constituent, the model is probed with each half of each role/filler pair for the entire event. The error produced on each probe is collected and propagated backward through the network (cf. (28)). The weight changes from each sentence trial are added together and used to update the weights after every 60 trials. The learning rate, ε, was set to 0.0005, and momentum was set to 0. No attempt was made to optimize these values, so it is likely that learning time could be improved by tuning these parameters.
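The update schedule described above (weight changes summed over sentence trials and applied every 60 trials) can be illustrated with a small sketch. The class name `AccumulatingUpdater` and its interface are hypothetical, and the gradient is assumed to be supplied by back-propagation elsewhere; the defaults simply follow the learning rate and momentum values as they appear in the text.

```python
class AccumulatingUpdater:
    """Sum per-trial gradients and apply them once every `trials_per_update` trials."""

    def __init__(self, n_weights, learning_rate=0.0005, momentum=0.0,
                 trials_per_update=60):
        self.learning_rate = learning_rate
        self.momentum = momentum
        self.trials_per_update = trials_per_update
        self.weights = [0.0] * n_weights
        self.pending = [0.0] * n_weights   # summed gradients since the last update
        self.velocity = [0.0] * n_weights  # previous weight change (for momentum)
        self.trials_seen = 0

    def add_trial(self, gradient):
        """`gradient` is dE/dw for one sentence trial, summed over all its probes."""
        for i, g in enumerate(gradient):
            self.pending[i] += g
        self.trials_seen += 1
        if self.trials_seen % self.trials_per_update == 0:
            self._apply()

    def _apply(self):
        for i in range(len(self.weights)):
            self.velocity[i] = (self.momentum * self.velocity[i]
                                - self.learning_rate * self.pending[i])
            self.weights[i] += self.velocity[i]
            self.pending[i] = 0.0


updater = AccumulatingUpdater(n_weights=2)
for _ in range(60):
    updater.add_trial([1.0, -2.0])  # placeholder gradients, not real model output
print(updater.weights)
```

Applying the summed changes only every 60 trials smooths out the trial-to-trial noise that comes from drawing sentence/event pairs at random.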
3. Results
1. Overall performance
First, we will assess the model's ability to comprehend sentences generally. Then we will examine the model's ability to fulfill our specific processing goals, and we will examine the development of the model's performance across training trials. Finally, we will discuss the model's ability to generalize.

Once the model was able to process correctly both active and passive sentences, the simulation was stopped and evaluated. Correct processing was defined as activating the correct units more strongly than the incorrect units. After 330,000 random sentence trials, the model began correctly processing the passive sentences in the corpus.
A set of 100 test sentence/event pairs was generated randomly from the corpus. These sentence/event pairs were generated in the same way the training sentences were generated, except that they were generated without regard to their frequency during training, so seldom practiced pairs were as likely to appear in the test set as frequently practiced pairs. Of these pairs, a number were set aside for separate analysis because they were ambiguous: at least two different interpretations could be derived from each (e.g., Someone ate something). Of the remaining sentence/event pairs, every sentence contained at least one vague or ambiguous word, yet each had only one interpretation.

These unambiguous sentence/event pairs were tested by first allowing the model to process all of the constituents of the sentence. Then the model was probed with each half of each constituent that was mentioned in the sentence. The output produced in response to each probe was compared to the target output. Figure 3 presents a histogram of the results.
[Figure 3: histogram titled "Unambiguous Sentences"; horizontal axis: cross-entropy.]
Fig. 3. Histogram of the cross-entropy error for random sentences after 330,000 sentence trials. The sentences were drawn randomly from the corpus without regard to their frequency.
A cross-entropy measure of between 0 and 10 results from sentences that are processed almost perfectly. Only small errors occur when an output unit should be completely activated (with a value of 1), but only obtains an activation of 0.7 or 0.8, or when a unit should have an activation of 0, but has an activation of 0.1 or 0.2. Cross-entropy errors of between 15 and 20 occur when one of the role/filler pairs is incorrect. For example, if teacher were supposed to be the agent, but the network activates busdriver, an error of about 15 would result.
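To make the error measure concrete, here is a minimal sketch of a cross-entropy error summed over output units. It assumes the standard form for binary-valued targets, $-\sum_j [t_j \ln a_j + (1 - t_j) \ln(1 - a_j)]$; the section itself does not restate the formula, so the log base, the clipping constant `eps`, and what gets summed are assumptions, and the absolute values produced here need not match the scale of the figures quoted above.

```python
import math


def cross_entropy(targets, activations, eps=1e-12):
    """Sum of per-unit cross-entropy between 0/1 targets and unit activations."""
    total = 0.0
    for t, a in zip(targets, activations):
        a = min(max(a, eps), 1.0 - eps)  # keep log() finite at exactly 0 or 1
        total -= t * math.log(a) + (1.0 - t) * math.log(1.0 - a)
    return total


# A nearly correct output pattern versus one that activates the wrong unit.
targets = [1.0, 0.0]
print(cross_entropy(targets, [0.8, 0.1]))
print(cross_entropy(targets, [0.1, 0.8]))
```

The qualitative point is the one made in the text: slightly off activations on the correct units cost little, while activating the wrong filler costs much more.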
For these unambiguous sentences, the cross-entropy, summed over constituents, averaged 3.9 per sentence. Another measure of performance is the number of times an output unit that should be on is less active than an output unit that should be off. The idea behind this measure is that as long as the correct unit within any set, such as people or gender, is the most active, it can win a competition with the other units in that set. Checking that all of the correct units are more active than any of the incorrect units is a quick, and conservative, way of calculating this measure. An incorrect unit was more active in 14 out of the 1710 possible cases, or on 0.8% of the opportunities. The 14 errors were distributed over 8 of the 55 sentences. In 5 of the 8 sentences, the error involved the incorrect instantiation of the specific concept, or a feature of that concept, referred to by a vague word. Two other errors involved the incorrect activation of the concept representing a nonvague word. In each case, the incorrect concept was similar to the correct concept. Therefore, errors were not random; they involved the misactivation of a similar concept or the misactivation of a feature of a similar concept. The errors in the remaining sentence involved the incorrect assignment of thematic roles in a passive, reversible sentence: "Someone hit the pitcher" (see the section on learning for a discussion of this problem).

Additional practice, of course, improved the model's performance. Improvement is slow, however, because the sentences processed incorrectly are relatively rare. After a total of 630,000 trials, the number of sentences having a cross-entropy higher than 15 dropped from 3 to 1. The number of errors dropped from 14 to 11.
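The conservative check described above, that every unit which should be on is more active than any unit which should be off, might be sketched as follows. The function name `misordered_units` is hypothetical, and counting each offending "off" unit as one error is an assumption; the text does not spell out exactly how the possible cases were enumerated.

```python
def misordered_units(targets, activations):
    """Count 'off' units that are at least as active as the weakest 'on' unit."""
    on = [a for t, a in zip(targets, activations) if t == 1]
    off = [a for t, a in zip(targets, activations) if t == 0]
    if not on or not off:
        return 0
    weakest_on = min(on)
    return sum(1 for a in off if a >= weakest_on)


# Both correct units beat every incorrect unit: no errors.
print(misordered_units([1, 0, 0, 1], [0.9, 0.1, 0.2, 0.7]))
# An incorrect unit (0.6) beats the correct one (0.4): one error.
print(misordered_units([1, 0, 0], [0.4, 0.6, 0.1]))
```

Comparing against the weakest correct unit across the whole output is stricter than comparing within each competing set, which is why the check is conservative.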
2. Performance on specific tasks
Our specific interest was to develop a processor that could correctly perform several important language comprehension tasks. Five typical sentences were drawn from the corpus to test each processing task. The categories and one example sentence for each are presented in Table 1. The parentheses denote the implicit, to be inferred, role.
Table 1
Task categories

Category                 Example
Role assignment
  Active semantic        The schoolgirl stirred the kool-aid with a spoon.
  Active syntactic       The busdriver gave the rose to the teacher.
  Passive semantic       The ball was hit by the pitcher.
  Passive syntactic      The busdriver was given the rose by the teacher.
Word ambiguity           The pitcher hit the bat with the bat.
Concept instantiation    The teacher kissed someone.
Role elaboration         The teacher ate the soup (with a spoon).
The first category involves role assignment. The category was divided into four sub-categories based on the type of information available to help assign the correct thematic roles to constituents. Sentences in the active semantic group contain semantic information that can help assign roles. In the example from Table 1, of the concepts referred to in the sentence, only the schoolgirl can play the role of an agent of stirring. The network can therefore use that semantic information to assign schoolgirl to the agent role. Similarly, kool-aid is something that can be stirred, but cannot stir or be used to stir something else. After each sentence was processed, the sentence gestalt was probed with the filler half of each role/filler pair. The network then had to complete the pair by filling in the correct thematic role. For each pair, in each sentence, the unit representing the correct role was the most active. Sentences in the passive semantic category are processed equally well.

Of course the semantic knowledge necessary to perform this task is never provided in the input or programmed into the network. Instead, it must be developed internally in the sentence gestalt as the network learns to process sentences.

Syntactic information does not have to be used in these cases; the semantic constraints suffice. In fact, if the surface location of the constituents is removed from the input, the roles are still assigned correctly. Further, if the constituents are presented in different orders, the activation values in the output are affected only slightly. Sentence processing that can rely on semantic information, therefore, essentially does, though confusing the syntax appears to have a slight corrupting effect.

The relative strengths of syntactic and semantic constraints are determined by their reliability in the training corpus. The more reliable a constraint, the more potent its influence in processing. This effect of the corpus on later processing is also found in natural languages. Word order in English is very reliable and is a very strong constraint on the meaning of a sentence. In Italian, however, word order is less reliable and is a much weaker constraint which can be over-ridden by semantic constraints. Consequently, the sentence "The pencil kicked the cow" in English is taken to mean that the pencil did the kicking because of the word order, while in Italian it is taken to mean that the cow did the kicking because of the semantics of the situation (17).

To process sentences in the active and passive syntactic categories, however, the network cannot rely entirely on semantic constraints to assign thematic roles. Sentences in these categories were created by including in the corpus pairs of reversible events, such as the busdriver giving a rose to the teacher and the teacher giving a rose to the busdriver. Both of these events were trained with equal frequency. Without a difference in frequency, there is no semantic regularity to help predict which of the two events a sentence refers to. The model must rely on syntactic information, such as word order, to assign the thematic roles. Passive sentences further complicate processing by making word order, by itself, unpredictive. The past participle and the "by" preposition provide cues designating the passive, but in themselves do not cue which person plays which role either. The word order information must be used in conjunction with the passive cues to determine the correct role assignments.

When sentences in the syntactic categories were tested, for each role/filler pair in each test sentence, the correct role was the most active. Figure 4 provides an example of role assignment in the semantic and syntactic categories.

[Figure 4: two panels. "The schoolgirl stirred the kool-aid with a spoon." (probes: schoolgirl, stirred, kool-aid, spoon) and "The busdriver was given the rose by the teacher." (probes: busdriver, was given, rose (noun), teacher). Each probe shows activations over the role units agent, action, patient, instrument, location, co-agent, co-patient, recipient.]
Fig. 4. Role assignment. After a sentence is processed, the network is probed with the filler half of each role/filler pair. The activation over a subset of the thematic role units is displayed. The first sentence contains semantic information useful for role assignment, while the second sentence contains only syntactic information useful for role assignment.

The remaining three categories involve the use of context to help specify the concepts referred to in a sentence. Sentences in the word ambiguity category contain one or more ambiguous words. After processing a sentence, the network was probed with the role half of each role/filler pair. The output patterns for the fillers were then examined. Figure 5 provides an example sentence with ambiguous words. For all pairs in each test sentence, the correct filler was the most active.

Disambiguation requires the competition and cooperation of constraints from both the word and its context. While the word itself cues two different interpretations, the context fits only one. In "The pitcher hit the bat with the bat," "pitcher" cues both container and ball-player. The context cues both ball-player and busdriver because the model has seen sentences involving both people hitting bats. All the constraints supporting ball-player combine, and together they win the competition for the interpretation of the sentence. As can be seen from the present example, even when several words of a sentence are ambiguous, the event which they support in common dominates the disparate events that they support individually. The processing of both instances of "bat" works similarly: the word and the context mutually support the correct interpretation. Consequently, the final interpretation of each word fits together into a globally consistent event.

[Figure 5: filler activations for "The pitcher hit the bat with the bat." Agent probe: person, adult, child, male, female, busdriver, teacher, schoolgirl, pitcher(person). Action probe: kissed, hit, threw(tossed). Patient and instrument probes: person, thing, food, bat(animal), bat(baseball).]
Fig. 5. Word disambiguation. The sentence "The pitcher hit the bat with the bat" is processed by the network. The network is then probed with each thematic role in the event. The activation over a subset of the fillers is displayed. The network correctly disambiguates each word.
Concept instantiation works similarly. Though the word cues a number of more specific concepts, only one fits the context. Again, the constraints from the word and from the context combine to produce a unique, specific interpretation of the term. As with the disambiguation task, each test sentence was processed, and then the network was probed with the role half of each role/filler pair. The output filler patterns were examined to see if the correct concept and semantic features were instantiated (see Fig. 6). In each case, the correct concept and features were the most active.
Depending upon the sentence, however, the context may only partially constrain the interpretation. Such is the case in "The teacher kissed someone." "Someone" could refer to any of the four people found in the corpus. Since, in the network's experience, females only kiss males, the context constrains the interpretation of "someone" to be either the busdriver or the pitcher, but no further. Consequently, the model can activate the male and person features of the patient while leaving the units representing busdriver and pitcher only partially active. The features adult and child are also partially and equally active because the busdriver is an adult while the pitcher is a child (see Fig. 6). While pitcher is slightly more active in this example, neither is activated above 0.5 (see the section on ambiguous sentences for an explanation of the difference in activations). In general, the model is capable of inferring as much information as the evidence permits: the more evidence, the more specific the inference. Word disambiguation can be seen as one type of this general inference process. The only difference is that for ambiguous words, both the general concept and the specific features differ between the alternatives, while for vague words, the general concept is the same and only some of the specific features differ.

[Figure 6: two panels of patient filler activations. "The schoolgirl spread something with a knife.": person, place, thing, food, icecream, crackers, jelly, iced-tea, steak, kool-aid, soup. "The teacher kissed someone.": person, adult, child, male, female, busdriver, teacher, pitcher(person), schoolgirl.]
Fig. 6. Concept instantiation. The network has learned that jelly is always the patient of spread. When the network processes "The schoolgirl spread something with a knife," it instantiates "something" as jelly. For the sentence "The teacher kissed someone," the network partially instantiates "someone" as a male and person, and activates both pitcher and busdriver partially.
Finally, sentences in the role elaboration category test the model's ability to infer thematic roles not mentioned in the input sentence. For example, in "The teacher ate the soup," no instrument is mentioned, yet a spoon can be inferred. For each test sentence, after the sentence was processed, the network was probed with the role half of the to-be-inferred role/filler pair. The correct filler was the most active in each case. Figure 7 provides an example. For role elaboration, the context alone provides the constraints for making the inference. Extra roles that are very likely will be inferred strongly. When the roles are less likely, or could be filled by more than one concept, they are only weakly inferred.

[Figure 7: two panels of filler activations. "The teacher ate the soup.", instrument probe: person, place, thing, food, utensil, knife, spoon. "The schoolgirl ate.", patient probe: person, place, thing, food, steak, soup, icecream, crackers, iced-tea, kool-aid.]
Fig. 7. Role elaboration. After processing the sentence "The teacher ate the soup," the network is probed with the instrument role. The filler activations are displayed. The network correctly infers spoon. For "The schoolgirl ate," the model must infer a patient. Because the schoolgirl is likely to eat a variety of foods, no particular food is well activated.
As it stands, there is nothing to keep the network from generalizing to infer extra roles for every sentence, even for events in which these roles make no sense. For instance, in "The busdriver drank the iced-tea," no instrument should be inferred, yet the network infers knife because of its association with the busdriver. It appears that since the busdriver uses a knife in many events about eating, the network generalizes to infer the knife as an instrument for his drinking. However, in events further removed from eating, instruments are not inferred. For example, in "The busdriver rose" no instrument is activated. It appears, then, that generalization of roles is affected by the degree of similarity between events. When events are similar, elaborative roles may be generalized. When events are distinct, roles do not generalize, and the model has no reason to activate any particular filler for a role.
3. Immediate update
As each constituent is processed, the information it conveys modifies the sentence gestalt and strengthens the inferences it supports. But the beginning of a sentence may not always accurately predict its eventual full meaning. For example, in "The adult ate the steak with daintiness" the identity of the adult is initially unknown. After "The adult ate" has been processed, busdriver and teacher are equally active. After processing "The adult ate the steak" the model guesses that the agent is the busdriver, since steak is typically eaten by busdrivers. At this point, the model has sufficient information to instantiate "the adult" to be the busdriver. Along with this inference, gusto is inferred as the manner of eating, since busdrivers eat with gusto. Here the model demonstrates its ability to infer additional thematic roles.

The model has, at this point, been led down the garden path toward an ultimately incorrect interpretation of the sentence. The next constituent processed, "with daintiness," only fits with the teacher and the schoolgirl. Since the sentence specifies an adult, the agent must be the teacher. The model must revise its representation of the event to fit with the new information by de-activating busdriver and activating teacher (see Fig. 8).

In general, as each constituent is processed, the information it explicitly conveys is added to the representation of the sentence along with implicit information implied by the constituent in the current situation. When the evidence is ambiguous and supports many conflicting inferences (such as after "The adult ate" has been processed) all the inferences are weakly activated in the sentence gestalt. When new evidence suggests a different interpretation, the sentence gestalt is revised.
[Figure 8: sentence gestalt activations and role/filler activations after each constituent of "The adult ate the steak with daintiness." Units shown: agent (person, adult, child, male, female, busdriver, teacher); action (ate, shot, drove (trans), drove (motiv)); patient (person, adult, child, busdriver, schoolgirl, steak, soup, crackers); adverb (gusto, pleasure, daintiness).]
Fig. 8. The sequential processing of a garden-path sentence. After "the steak" has been processed, the network instantiates "the adult" with the concept busdriver. When "with daintiness" is processed, the network must reinterpret "the adult" to mean teacher.
4. Ambiguous sentences
The ambiguous sentences in the test set were tested separately. As noted above, an ambiguous sentence has more than one consistent interpretation. For example, the adult in the sentence, "The adult drank the iced-tea in the living-room" can be instantiated with either busdriver or teacher as the agent, but the sentence offers no clues that teacher is the correct agent in this particular sentence/event pair in the test set. In these ambiguous cases, the model should compromise and activate busdriver and teacher partially and equally, causing two small errors. What the network typically did, however, was to activate one concept slightly more than the other.

One reason for these differences in activations is the recent training history of the network. The sentence/event pairs trained more recently have a greater impact on the weights and, therefore, on subsequent processing. Because selection of training examples occurs randomly, several sentences involving a particular agent may occur before a sentence/event involving a different agent is trained. Such training biases can lead to a bias in the activation of alternatives in ambiguous sentences. We tested this explanation by training the network on sentence/event pairs that consisted of an ambiguous sentence and the subordinate, weakly activated, event. From one to three training trials were required to balance the activation of the subordinate event with that of the previously dominant event.
The sensitivity of the network to recent training on ambiguous sentences is due to the dynamics of the activation function. Because the activation function is sigmoidal, it is sensitive to changes in the value of its input when the value is in the middle of its range. Since each meaning of an ambiguous sentence should be activated partially, its inputs to the activation function must lie in this middle range. Consequently, minor changes to the weights that determine the input value will have a major impact on the activation value. Conversely, in the extremes of the range of the activation function, changes in the input value will have little discernible effect on the activation value. Since the one meaning of an unambiguous sentence should be activated fully, its inputs to the activation function lie in the extremes of the function's range. Minor changes in the weights, therefore, will have only minor effects on the activation value, making the processing of unambiguous sentences robust to recent training.
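The argument about the sigmoid can be checked numerically with a short sketch. It assumes the usual logistic function $1/(1 + e^{-x})$ as the activation function; the input values 0 and 5 and the step size 0.5 are arbitrary illustrations, not values from the simulations.

```python
import math


def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_slope(x):
    """Derivative of the logistic function: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


step = 0.5  # the same small change in net input, applied at two operating points
mid_change = sigmoid(0.0 + step) - sigmoid(0.0)      # unit near 0.5 (ambiguous case)
extreme_change = sigmoid(5.0 + step) - sigmoid(5.0)  # unit near 1.0 (unambiguous case)

print(sigmoid_slope(0.0), sigmoid_slope(5.0))
print(mid_change, extreme_change)
```

The same change in net input moves a half-active unit far more than a nearly saturated one, which is the asymmetry the paragraph above relies on.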
5. Learning
As the network learns to comprehend sentences correctly, a number of developmental phenomena can be observed. In fact, the only real failures in performance stem from a developmental effect. Problems in processing only arise in processing infrequent and irregular sentences. For example, sentences about the busdriver eating soup are rare. The network is seven times more likely to see a sentence about the busdriver eating steak than eating soup. This frequency difference creates a strong regularity between "The busdriver ate" and the concept steak. In a sentence about the busdriver eating soup, the word "soup" constrains the patient to be soup, while "The busdriver ate" partially constrains the patient to be steak. The constraints compete for an interpretation of the sentence. When the regularities are particularly strong, the contextual constraints can win the competition and cause the bottom-up activation from the word itself to be overridden.
Though this effect seems like a serious flaw, it is a flaw that the model shares with people. In an illuminating experiment, Erickson and Mattson (8) asked subjects questions like, "How many animals of each kind did Moses take on the Ark?" Subjects typically answered, "Two" despite their knowledge, when later asked, that Moses had nothing to do with the Ark. Constraints from the context overwhelmed the constraint from the word "Moses." Erickson and Mattson also describe a second order effect in that subjects will balk when asked, "How many animals of each kind did Nixon take on the Ark?" Apparently, the degree of semantic overlap between the correct concept and the foil affects how easily subjects will be misled. The model demonstrates the second order effect as well. Given "The busdriver ate the ball" the model fails to activate any patient. The model cannot explicitly balk, but its failure to represent the sentence accurately or misinterpret it is similar to balking. This example suggests that the SG model holds promise for demonstrating interesting human-like errors in comprehension.
In the model, this frequency or regularity effect diminishes with training: the reliability of a constraint, its probability of correctly predicting the output, rather than its overall frequency, becomes increasingly important. The word "soup" perfectly predicts the concept soup: whenever "soup" appears in a sentence, the event contains the concept soup. On the other hand, the busdriver eats a variety of foods: "the busdriver ate" is only 70% reliable as a predictor of steak. With increased training, even low frequency constraints are practiced. If they are reliable, they gain strength and eventually outweigh more frequent but less reliable constraints. Similar developmental trends occur as children learn language (17). Progress is slow, but after a total of 630,000 trials even these very infrequent and irregular sentences are processed correctly.
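The difference between how often a constraint is practiced and how reliably it predicts the output can be illustrated with a small sketch. The co-occurrence counts below are hypothetical, chosen only so that "the busdriver ate" predicts steak 70% of the time and steak is seven times as frequent as soup, as quoted above; the function names are illustrative and not part of the model.

```python
from collections import Counter

# Hypothetical (cue, patient concept) co-occurrence counts for a toy corpus.
# The numbers are assumptions chosen to match the ratios quoted in the text.
events = Counter({
    ("the busdriver ate", "steak"): 70,
    ("the busdriver ate", "soup"): 10,
    ("the busdriver ate", "other food"): 20,
    ("soup", "soup"): 10,  # the word "soup" always co-occurs with the concept soup
})

def frequency(cue):
    """How often the cue is practiced at all."""
    return sum(n for (c, _), n in events.items() if c == cue)

def reliability(cue, concept):
    """P(concept | cue): how often the cue correctly predicts the concept."""
    return events[(cue, concept)] / frequency(cue)

print(frequency("the busdriver ate"), reliability("the busdriver ate", "steak"))
print(frequency("soup"), reliability("soup", "soup"))
```

In this toy setting the contextual cue is practiced ten times as often as the word "soup" but is less reliable, which is the situation in which, according to the text, the less frequent cue eventually wins out with enough training.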
The early effect of frequency works for syntactic constraints as well as semantic constraints. As shown in Fig. 9, the model masters sentences in the active voice sooner than it masters sentences in the passive voice. This difference is due to the greater frequency of sentences in the active voice in the corpus. While 14 sentence frames use the active voice, only 4 frames use the passive voice. After 330,000 trials, though, both voices are handled correctly.
The syntactic constraints develop more slowly than the regular semantic constraints. Yet while every sentence contains word order constraints, only an occasional sentence will contain a particular semantic constraint. Based on the frequency of practice with particular constraints, then, the word order constraints should be learned much earlier than the semantic constraints. Two caveats to the frequency rule help explain this result. First, the syntactic constraints involve the conjunction of word order with the presence or absence of the passive markers, and such conjunctions are difficult to learn. Second, learning tends to generalize across semantically similar words, so training on one word can facilitate the learning of similar words.

[Figure 9: percent correct (0 to 100) for four sentence types: active syntactic, passive syntactic, regular semantic, irregular semantic.]

Fig. 9. Development of performance. Active syntactic: The busdriver kissed the teacher; passive syntactic: The teacher was kissed by the busdriver; regular semantic: The busdriver ate the steak; irregular semantic: The busdriver ate the soup. Correct performance means the correct concepts are more active than incorrect concepts.
The large number of training trials required to achieve good performance led us to look for ways to improve the speed of learning. One experiment consisted of removing the first hidden layer from the architecture. The input and the previous sentence gestalt then fed directly into the sentence gestalt layer. It was hoped that by making the network one layer more shallow, and by reducing the number of weights, error correction would proceed more quickly. Training of the modified network was stopped when it became apparent that the model did not learn the semantic constraints more quickly, and it had not learned the syntactic constraints at all. It is possible that the network could still learn the syntactic constraints given more training. The point of the modification, however, was to speed learning, and this goal was not fulfilled.

The reason for the utility of the first hidden layer in computing the syntactic constraints lies in the conjunctive nature of the constraints. The hidden layer is useful in computing the conjunction of word order and active/passive voice markers. Without the hidden layer, the sentence gestalt would have to compute the conjunction as well as represent the meaning of the sentence. Such a representation is apparently difficult for the network to find.
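Why a conjunction of word order and voice marker benefits from an intermediate layer can be seen in a deliberately tiny example. The sketch below is not the SG architecture: it assumes binary inputs and threshold units, and the encoding (position, passive) is invented for illustration. It only shows that the agent assignment is an exclusive-or of the two cues, which a single unit reading the cues directly cannot compute, while two conjunction detectors make it easy.

```python
from itertools import product

# Toy encoding (an assumption for illustration, not the SG model's input code):
# position = 1 if the noun phrase precedes the verb, 0 if it follows it;
# passive  = 1 if the passive markers are present, 0 otherwise.
# The noun phrase is the agent when exactly one of the two is 1.
cases = {(pos, pas): int(pos != pas) for pos, pas in product((0, 1), repeat=2)}

def step(x):
    return 1 if x > 0 else 0

def single_unit_solves(w1, w2, b):
    """A single threshold unit reading position and passive directly."""
    return all(step(w1 * pos + w2 * pas + b) == role
               for (pos, pas), role in cases.items())

# Search a coarse weight grid: no single unit captures the conjunction.
grid = [x / 2 for x in range(-8, 9)]
found = any(single_unit_solves(w1, w2, b) for w1, w2, b in product(grid, repeat=3))

def with_hidden_layer(pos, pas):
    """Two hidden units each detect one conjunction; the output ORs them."""
    h_active = step(pos - pas - 0.5)    # precedes verb AND no passive marker
    h_passive = step(pas - pos - 0.5)   # follows verb AND passive marker
    return step(h_active + h_passive - 0.5)

print(found)
print(all(with_hidden_layer(p, q) == r for (p, q), r in cases.items()))
```

The grid search is only a demonstration; the underlying fact is that exclusive-or is not linearly separable, so no choice of weights would succeed for the single unit.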
6. Representations
While the input to the network is a local encoding where each word is represented by a different unit, the network can create internal representations that are distributed and that explicitly encode helpful semantic information. The weights running from the input layer to the first hidden layer can be seen as "constraint vectors" which determine how each word influences the evolution of the sentence gestalt. These constraint vectors are the model's bottom-up representation of each word. Words that impose similar constraints should develop similar constraint vectors. A cluster analysis of the weight vectors reveals their similarity. Separate cluster analyses were performed for unambiguous verbs and nouns (see Fig. 10).

The verbs cluster into a number of hierarchical groups. One cluster contains the consumption verbs. Another contains stirred and spread. These two clusters then combine into a cluster of verbs involving people and food. Kissed, hit, and shot formed another cluster. For each of these verbs there were passive-voice sentences in the corpus, and each could take an animate object. Gave, the only dative verb, stands apart from the other verbs. This clustering reflects the similarity of the case frames of the members of the different clusters.

The constraint vectors of nouns further reflect similarities in the constraints they impose on the evolving sentence gestalt. This similarity is reflected in two ways. As with the verbs, semantically similar words cluster: all of the people cluster, and dog and spot are very similar. Words that occur together in the same context also have similar constraint vectors. For example, ice cream clusters with park, and jelly clusters with knife. In the corpus, ice cream is always eaten in the park, and jelly is always spread with a knife. Their similar constraint vectors follow from the similar constraints they impose on the events described by the sentences in which they appear.
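The cluster analysis of constraint vectors can be sketched as a bottom-up (agglomerative) procedure that repeatedly merges the two most similar clusters. Everything specific below is an assumption: the three-dimensional vectors are made up (the learned weights are not given in the text), and average linkage over Euclidean distance is one plausible choice of merge rule, not necessarily the one the authors used.

```python
import math
from itertools import combinations

# Hypothetical 3-dimensional "constraint vectors"; real ones would be the
# learned input-to-hidden weights, which are not given in the text.
vectors = {
    "ate":     (0.9, 0.1, 0.0),
    "drank":   (0.8, 0.2, 0.0),
    "stirred": (0.5, 0.6, 0.0),
    "spread":  (0.4, 0.8, 0.0),
    "hit":     (0.0, 0.1, 0.9),
    "shot":    (0.1, 0.0, 0.8),
}

def average_distance(a, b):
    """Average-linkage distance between two clusters (an assumed linkage rule)."""
    total = sum(math.dist(vectors[x], vectors[y]) for x in a for y in b)
    return total / (len(a) * len(b))

def cluster(words):
    """Repeatedly merge the two closest clusters; return the merge order."""
    clusters = [(w,) for w in words]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: average_distance(*pair))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b))
    return merges

for a, b in cluster(sorted(vectors)):
    print(a, "+", b)
```

With these made-up vectors the consumption verbs pair up first, the two food-related clusters then combine, and the hit/shot cluster joins last, which mirrors the kind of hierarchy described for the verbs.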
7. Generalization
An important remaining question is whether the model is actually learning useful constraints that it can apply to novel sentences or whether it is simply memorizing sentence/event pairs. Since sentences in the first simulation were generated randomly and none were set aside to not be trained, the first simulation cannot be used to evaluate generalization. Two new corpora were developed to test the model's generalization abilities. One of the new corpora was designed to test the model's ability to generalize syntactic regularities and the other was designed to test its ability to generalize semantic regularities. The model was trained and evaluated on each corpus separately.

[Figure 10: two dendrograms on a similarity axis running from high to low. Verb leaves legible in the extracted text: gave, stirred, spread, drank, consumed, ate, hit, shot. Noun leaves legible in the extracted text: person, child, ice cream, park, soup, crackers, spoon, kitchen, jelly, knife, dog, spot, steak, iced-tea, living-room.]

Fig. 10. Cluster analyses. The analysis computes the similarity between the weight vectors leading from each input unit to the first hidden layer. The more similar two vectors or clusters of vectors (in Euclidean distance), the sooner they are combined into a new cluster. Physical distance in the figure is irrelevant; only the clustering is important. For instance, "stirred" is not notably more similar to "drank" than it is to "ate."
7.1. Syntax
For syntax, we tested the model's ability to learn and use the syntax of active and passive sentences. Could the model learn to use the active/passive voice markers and the temporal order of the constituents, word order, productively on novel sentences? This is a test of compositionality. The model must learn to compose the familiar constituents of sentences in new combinations. To perform this generalization task, the model must learn several types of information. First, it must learn the concept referred to by each word in a sentence: "John" refers to John and "saw" refers to saw. Second, the model must learn that the order of the constituents, which constituent comes before the verb and which after, is important to assigning the agent and patient. Third, the model must learn the relevance of the passive markers: the past participle of the verb and the prepositional phrase beginning with "by." Fourth, the model must learn to integrate the order information and the passive marker information to correctly assign the thematic roles.
The model was trained on a corpus of sentences composed of ten people and ten reversible actions, such as "John saw Mary." These sentences could appear in the active or passive voice. The basic corpus consisted of all 2000 sentences (10 people by 10 actions by 10 people by 2 voices). Before training began, though, 250 of these sentences (12.5%) were set aside for later testing: they were not trained. The model was then trained on the remaining sentences. Once the training corpus was mastered (after 100,000 trials), the 250 test sentences were presented to the model. The model processed 97% of these sentences correctly. In only 11 sentences did the model incorrectly assign a thematic role. In these sentences, when probed with one of the fillers, the model activated an incorrect role more strongly than the correct role.
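The construction of the syntactic corpus and its held-out test set can be sketched as follows. The names of the people and actions are placeholders (the text gives only the counts and one example sentence), the passive rendering is a simplification that does not form real participles, and the fixed random seed is there only to make the sketch repeatable.

```python
import random
from itertools import product

# Placeholder names: the text gives only the counts (ten people, ten
# reversible actions), plus "John saw Mary" as one example.
people = [f"person{i}" for i in range(10)]
actions = [f"action{i}" for i in range(10)]

def realize(agent, action, patient, voice):
    """Render an event as a surface string in the requested voice."""
    if voice == "active":
        return f"{agent} {action} {patient}"
    return f"{patient} was {action} by {agent}"

# Each item pairs a sentence with the event it describes.
corpus = [
    (realize(agent, action, patient, voice),
     {"agent": agent, "action": action, "patient": patient})
    for agent, action, patient, voice
    in product(people, actions, people, ("active", "passive"))
]

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
shuffled = corpus[:]
rng.shuffle(shuffled)
test_set, training_set = shuffled[:250], shuffled[250:]

print(len(corpus), len(test_set), len(training_set))
```

Because the held-out sentences reuse people and actions that all occur in training, success on them tests recombination of familiar constituents rather than knowledge of new words.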
7.2. Semantics
Semantic regularities may also provide the basis for generalization. We tested the model's ability to learn some semantic regularities and apply them in novel contexts. There are two basic generalization effects we would like to see in the model's behavior. First, as with syntax, the trained model should exhibit compositionality: the model should be able to represent successfully sentences it has never seen before. Secondly, the model should make predictions: it should be able to use information presented early in a sentence to help it process subsequent information.
To test these generalization effects, we again created a simple corpus for the model to learn. This corpus consisted of 8 adults and 8 children, 5 actions, and 8 objects for each action. The corpus was arranged to make age a semantic regularity. For each action, 3 objects were presented with adults, 3 were presented with children, and 2 were presented with both adults and children. This arrangement created a corpus of 400 sentences (8 adults by 5 actions by 5 objects plus 8 children by 5 actions by 5 objects). For example:

George watched the news.
George watched Johnny Carson.
George watched David Letterman.
George watched Star Trek.
George watched the Road Runner.
Bobby watched He-Man.
Bobby watched Smurfs.
Bobby watched Mighty Mouse.
Bobby watched Star Trek.
Bobby watched the Road Runner.
The regularity is that the age of the agent, in conjunction with the verb, predicts a set of objects. Age, though, is not explicitly encoded in the input or the output representations. Instead, it is a "hidden" feature. (Unlike people in the first simulation, people in this corpus are represented by a single unit, i.e., locally.) The model must learn which agents are the adults and which are the children based on its experience seeing the objects with which each is paired. The children all watch one set of shows, and the adults all watch another set. Age, therefore, organizes the sentences into sets. Since age does not appear in the input, however, it must be induced by the model and accessed as each sentence is processed.

Before the model was trained on the corpus of sentences, 50 of the 400 sentences (12.5%) were randomly picked to be set aside as the generalization set. The model was then trained on the remaining sentences. When these sentences were processed correctly (after 90,000 trials), the model was tested for compositional and predictive generalization.

Again, to be compositional, the model should be able to process correctly the sentences on which it had not been trained. The model processed 86% of these sentences correctly. On the remaining 14%, or 7 sentences, the model activated the wrong object more strongly than it activated the correct object. At the point in training when the model was stopped, then, it could compose novel sentences reasonably well.
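The arithmetic behind the semantic corpus (400 sentences, 50 held out) can be checked with a short sketch. All agent, action, and object names are placeholders; only the counts (8 adults, 8 children, 5 actions, and per action 3 adult-only, 3 child-only, and 2 shared objects) come from the text, and the seed is fixed only for repeatability.

```python
import random

adults = [f"adult{i}" for i in range(8)]      # placeholder names
children = [f"child{i}" for i in range(8)]
actions = [f"action{i}" for i in range(5)]

def objects_for(action):
    """Eight placeholder objects per action: 3 adult-only, 3 child-only, 2 shared."""
    adult_only = [f"{action}-adult-obj{i}" for i in range(3)]
    child_only = [f"{action}-child-obj{i}" for i in range(3)]
    shared = [f"{action}-shared-obj{i}" for i in range(2)]
    return adult_only + shared, child_only + shared

corpus = []
for action in actions:
    adult_objects, child_objects = objects_for(action)
    corpus += [(agent, action, obj) for agent in adults for obj in adult_objects]
    corpus += [(agent, action, obj) for agent in children for obj in child_objects]

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
held_out = set(rng.sample(corpus, 50))
training_set = [s for s in corpus if s not in held_out]

print(len(corpus), len(held_out), len(training_set))
```

Note that age never appears as a field of any sentence: it is recoverable only from which objects each agent is paired with, which is what makes it a "hidden" feature.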
Semantic generalization should allow the model to predict subsequent information. There are two types of regularities that the model should discover to help make predictions. One is the general regularity that children do children's activities and that adults do adult activities. To test the learning of this regularity we observed which objects the model predicted after the model had processed the agent and action. Given an agent and an action, the model should predict the objects appropriate for that action and an agent of that age. For example, given the partial sentence, "Bobby watched ...," the model should predict the object to be either He-Man, Smurfs, Mighty Mouse, the Road Runner, or Star Trek. Specifically, this regularity suggests that the model should activate each of these objects to 1/5, or 0.2.
What makes this a generalization task is that some of the sentences in the corpus were set aside and not trained: some agents were never paired with certain objects. For example, the network was never trained on the sentence "Bobby watched He-Man." To activate each object in the age-appropriate set to 0.2, the model must generalize. It must generalize both to discover the complete set of five children's shows, and to associate that set with each of the eight children. This generalization can be tested by observing whether He-Man is activated as a predicted object for "Bobby watched ...."

The other type of regularity is specific to each agent. The fact that the training corpus does not contain "Bobby watched He-Man" is also a regularity that the model can learn. It learns that it should not predict He-Man as an object for "Bobby watched ...." Since, for Bobby, there are only four shows that he actually watches, each should be given an activation level of 1/4, or 0.25. The model's tendency to generalize by using the general regularity about children's viewing habits, therefore, is counteracted by the specific regularity about Bobby's personal viewing habits. The activation of the appropriate/untrained (e.g., He-Man for Bobby) objects should be a compromise between the two competing forces. The appropriate/untrained objects, then, should have an activation somewhat less than 0.2. For the appropriate/trained (e.g., Mighty Mouse for Bobby) objects, the activation level should lie between 0.2 and 0.25, and the activation level for the inappropriate/untrained objects (e.g., the news for Bobby) should be close to 0.

The model's predictions on each of the five actions for four of the people in the corpus were tabulated. The activation values for the model's object predictions were categorized into three groups: the appropriate/trained, the appropriate/untrained, and the inappropriate/untrained. The average activation in each group is shown in Table 2. The means are significantly different from one another.

Three conclusions can be drawn from these data. One, the appropriate/trained objects are activated to the appropriate degree. The model correctly predicts the correct set of objects for the action and the age of the agent. Two, the model generalizes in that it also activates age-correct objects it has not actually seen in that context before. The substantially weaker activation of these appropriate/untrained objects demonstrates the competition between the general and specific regularities. The general regularity activates the objects but the agent-specific regularities reduce that activation. Three, the set of inappropriate/untrained objects for each action is turned off.
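The target activation levels discussed above follow from spreading one unit of activation evenly over the candidate objects under each regularity. The sketch below simply reproduces that arithmetic for the Bobby example; the helper name is illustrative, and treating the targets as uniform over each set is the reading suggested by the text, not a statement about the model's actual output.

```python
def uniform_targets(objects):
    """Spread one unit of activation evenly over the candidate objects."""
    return {obj: 1 / len(objects) for obj in objects}

childrens_shows = ["He-Man", "Smurfs", "Mighty Mouse", "the Road Runner", "Star Trek"]
bobby_trained = [s for s in childrens_shows if s != "He-Man"]  # He-Man was held out

general = uniform_targets(childrens_shows)   # age-based regularity
specific = uniform_targets(bobby_trained)    # Bobby's own training sentences

for show in childrens_shows + ["the news"]:
    print(show, general.get(show, 0.0), specific.get(show, 0.0))
```

For a trained show such as Mighty Mouse the two regularities give 0.2 and 0.25, for the held-out He-Man they give 0.2 and 0, and for the news both give 0, which is why the three groups are expected to be ordered as described.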
The difference in activation between the appropriate/untrained and inappropriate/untrained objects understates their difference in the network. As the activation of a unit approaches 1 or 0, the non-linearity of the activation function requires that exponentially more activation be added to move closer to 1 or 0. A large change in the net input, therefore, would be required to reduce the activation level of an appropriate/untrained object to that of an inappropriate/untrained object. Consequently, the two types of objects are substantially different, and the general regularity is having a significant impact on prediction.
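The point about the non-linearity can be made concrete under the assumption that units use the standard logistic activation function (the text here says only that the function is non-linear). Inverting the logistic gives the net input needed to produce a given activation; the activation values in the loop are arbitrary illustrations, not results from the model.

```python
import math

def logistic(net):
    """Standard logistic activation (assumed here, not stated in this passage)."""
    return 1.0 / (1.0 + math.exp(-net))

def net_input_for(activation):
    """Inverse of the logistic: the net input that yields a given activation."""
    return math.log(activation / (1.0 - activation))

for a in (0.5, 0.1, 0.01, 0.001):
    print(a, round(net_input_for(a), 2))
```

Under this assumption, an activation of 0.1 corresponds to a net input of about -2.2 and an activation of 0.01 to about -4.6, so a difference of 0.09 in activation near the floor reflects a change of more than two units of net input.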
Interestingly, there is a trade-off during training between compositional generalization and predictive generalization. During training the model improves its ability to compose novel sentences, but loses the ability to make general predictions. The general predictions are lost as the model learns the specific regularities of the training corpus. Compositionality is gained as the input constituents become lexeme-like. As the network trains, it slowly learns which parts of the input are responsible for which parts of the output. The network slowly hones the meanings of the input constituents until they are essentially represented as lexemes.

Table 2
Semantic prediction: average activation by appropriateness (appropriate, inappropriate) and training (trained, untrained). [Cell values as extracted: "21 01I 36 (XI6"; their assignment to cells is not recoverable.]
The model's training procedure encourages predictive generalization for both types of regularities by explicitly requiring the model to predict the object after processing the agent and action. Recall that the model must answer questions about the entire sentence after processing each constituent. The model learns that all of the untrained objects, both appropriate and inappropriate, should not be predicted because they never occur.
It may seem that the model diverges from human behavior as it reduces the influence of the general regularity in favor of the person-specific regularities. It must be remembered, though, that the model is trained on only a small set of people. If there were many more people, the person-specific regularities would receive less practice and the general regularity would receive more practice. This change would make the age generalization much stronger.
The model can clearly learn and productively apply regularities from its training corpus to novel sentences. However, the degree to which the model composes novel sentences, 97% in the syntactic corpus, but only 86% in the semantic corpus, is far from the degree we expect from people. Again, this performance may be due to the size and content of the corpus. The effects of these factors on generalization are not well understood. Our simulation should be taken only as an indication that some degree of compositionality can be acquired. As demonstrated by learning the rule for the passive voice construction, it can learn syntactic regularities, and as demonstrated by learning the meaning of each word and the hidden feature age, it can learn semantic regularities.
Generalization based on these regularities takes two forms. When the regularity consists of input/output pairings, generalization leads to compositionality. Input/output pairings are like lexemes in that the specific input constrains a specific part of the output. There are few, if any, constraints on other parts of the output. Lexemes will compose in novel combinations easily because each one makes a separate and independent contribution to the interpretation.
When the regularity consists of input/input pairings, generalization can lead to prediction. For input/input pairings, one part of the input affects multiple parts of the output. Some of the parts may be future constituents, so the input, in effect, predicts future inputs. These input/input pairings may not compose easily. The parts may make mutually contradictory predictions. For example, in "Bobby watched the news," "Bobby" predicts that the news will not be watched, and "the news" predicts that Bobby, a child, will not be watching. These contradictory predictions will create conflict in the sentence gestalt and make the interpretation hard to represent.
Additionally, a novel pairing may not compose easily because the constraints specific to a pair may not have been learned. In other words, there may be constraints, in the environment, that pertain to a pair of input constituents. If the model has never experienced this pair, it cannot know these constraints, and they will not affect the interpretation.
8. Variable syntactic frames
The model is able to learn and use syntactic information. It can correctly apply syntactic information to assign thematic roles in sentences with reversible verbs. But how complex a syntax can the model learn? Since the model can only represent simple sentences in the output layer, it cannot learn or use the complex syntax involved in sentences with embedded clauses. The syntax of sentences involving "gave," however, is relatively complex because of the variability in the location of roles in the sentence. Because it need only involve simple sentences, it is representable in the output layer, and because it is representable, it can be used as the target for error correction. The question is, will the model be able to learn?
We created a corpus consisting of the different legal constructions of "gave."

The busdriver gave the rose to the teacher.
The busdriver gave the teacher the rose.
The teacher was given the rose by the busdriver.
The rose was given to the teacher by the busdriver.
The rose was given by the busdriver to the teacher.
The corpus consisted of 56 events of this type. Each event was equally frequent so that there were no semantic regularities for the model to detect and use to help it assign the agent and recipient thematic roles. The model was able to master this corpus. It learned to correctly assign each of the thematic roles in the event described by the sentence.

There are still many more phenomena on which the model has not been trained or tested. The general point, though, is that the model appears to have the capability to learn and use syntactic constraints productively. Where its limits are is presently unknown.
9. Discussion
The SG model has been quite successful in meeting the goals that we set out for it, but it is of course far from being the final word on sentence comprehension. Here we briefly review the model's accomplishments. Following this review, we consider some of its limitations and how they might be addressed by further work.
9.1. Accomplishments of the model
One of the principal successes of the SG model is that it correctly assigns constituents to thematic roles based on syntactic and semantic constraints. The syntactic constraints are more difficult for the model to master than the semantic constraints even though we have provided explicit cues to the syntax, in the form of the surface location of the constituents, in the input. The model does, however, come to master these constraints as they are exemplified in the corpus of training sentences. Though syntactic constraints can be significantly more subtle than those our model has faced thus far, those it has faced are fairly difficult. To correctly handle active and passive sentences, the model must map surface constituents onto different roles depending on the presence of various surface cues elsewhere in the sentence.
The model also exhibits considerable capacity to use context to disambiguate meanings and to instantiate vague terms in contextually appropriate ways. Indeed, it is probably most appropriate to view the model as treating each constituent in a sentence as a clue or set of clues that constrain the overall event description, rather than as treating each constituent as a lexical item with a particular meaning. Although each clue may provide stronger constraints on some aspects of the event description than on others, it is simply not the case that the meaning associated with the part of the event designated by each constituent is conveyed by only that constituent itself.
The model likewise infers unspecified arguments roughly to the extent that they can be reliably predicted from the context. Here we see very clearly that constituents of an event description can be cued without being specifically designated by any constituent of the sentence. These inferences are graded to reflect the degree to which they are appropriate given the set of clues provided. The drawing of these inferences is also completely intrinsic to the basic comprehension process: no special separate inference processes must be spawned to make inferences; they simply occur implicitly as the constituents of the sentence are processed.
The model demonstrates the capacity to update its representation as each new constituent is encountered. Our demonstration of this aspect of the model's performance is somewhat informal; nevertheless, its capabilities seem impressive. As each constituent is encountered, the interpretation of all aspects of the event description is subject to change. If we revert to thinking in terms of meanings of particular constituents, both prior and subsequent context can influence the interpretation of each constituent. Unlike most conventional sentence processing models, the ability to exploit subsequent context is again an intrinsic part of the process of interpreting each new constituent. There is no backtracking; rather, the representation of the sentence is simply updated to reflect the constraints imposed by each constituent as it is encountered.
While avoiding backtracking, the model also avoids the computational explosion of computing each possible interpretation of a sentence as it encounters ambiguous words and thematic role assignments. A simpler model helps explain how the SG model avoids these dual pitfalls. Kawamoto (16) describes an auto-associative model of lexical access. In his model, patterns of activation represent a word and its meaning. For each ambiguous word, there are two patterns, each representing the association of the word with one of its meanings. Positive weights interconnect the units representing a pattern, and negative weights interconnect units between patterns. When an ambiguous word is processed, it initially activates semantic units representing both meanings. The resulting pattern of activation is a combination of the semantic features of both meanings, so the bindings among the features of a meaning are lost in the activation pattern. These bindings, however, are preserved in the weights, and the model will settle into one interpretation of the word or the other. One can think of the alternative interpretations as minima in an energy landscape. The initial pattern of activation falls on the high energy ridge between the minima. The settling process moves the pattern of activation down one side of the ridge to one of the minima.
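The settling behavior described above can be imitated with a very small network. This is a toy sketch, not Kawamoto's model: the four units, the hand-set weights (+1 within a meaning, -1 between meanings), the update rule, and all parameter values are assumptions chosen only to show a blended initial pattern sliding into one of two stable interpretations.

```python
# Four semantic units: a1, a2 belong to meaning A; b1, b2 to meaning B.
# +1 within a meaning, -1 between meanings, no self-connections.
WEIGHTS = [
    [0, 1, -1, -1],
    [1, 0, -1, -1],
    [-1, -1, 0, 1],
    [-1, -1, 1, 0],
]

def settle(context, word_input=0.5, rate=0.2, steps=50):
    """Let the pattern settle; `context` is a small extra input per unit."""
    state = [0.5, 0.5, 0.5, 0.5]  # the word initially activates both meanings
    for _ in range(steps):
        net = [sum(w * s for w, s in zip(row, state)) + word_input + c
               for row, c in zip(WEIGHTS, context)]
        # Move each activation in the direction of its net input, clipped to [0, 1].
        state = [min(1.0, max(0.0, s + rate * n)) for s, n in zip(state, net)]
    return state

print(settle(context=[0.1, 0.1, 0.0, 0.0]))  # context favours meaning A
print(settle(context=[0.0, 0.0, 0.1, 0.1]))  # context favours meaning B
```

The initial state, with all four units half active, carries no information about which features belong together; that information lives only in the weights, and a slight contextual bias is enough to tip the pattern into one coherent meaning.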
The SG model does not settle, but the idea is similar. When there is not sufficient information to resolve an ambiguity, the sentence interpretations may become conflated in the pattern of activation over the sentence gestalt and in the output responses to probes. The bindings within an interpretation, however, are preserved in the weights. When sufficient new information for disambiguation is provided, the sentence gestalt computes a single interpretation with correct thematic role and semantic feature bindings. For example, given "The adult ate," and with agent as the probe, the model partially activates busdriver, teacher, male, and female. Which is male and which is female is lost in the activation pattern over the output layer. (It is not clear to what extent the bindings are lost in the activation pattern over the sentence gestalt. In general, the representation over the sentence gestalt is an area for further inquiry.) After the model is further given "... the steak with gusto," it is clear that the agent is the busdriver. When probed for the agent, the model unambiguously activates busdriver in the output layer.
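The loss of bindings in a conflated output can be made concrete with a small sketch. It is not the SG network itself: it simply assumes that the output for an unresolved agent behaves like a probability-weighted blend of the candidate fillers' feature vectors. The feature assignments (busdriver as male, teacher as female) and the 50/50 probabilities are assumptions chosen for illustration.

```python
# Hypothetical feature vectors; the gender assignments are assumed for illustration.
FILLERS = {
    "busdriver": {"busdriver": 1.0, "male": 1.0},
    "teacher": {"teacher": 1.0, "female": 1.0},
}


def blend(probabilities):
    """Return the probability-weighted sum of the candidate fillers' features."""
    output = {}
    for filler, p in probabilities.items():
        for unit, value in FILLERS[filler].items():
            output[unit] = output.get(unit, 0.0) + p * value
    return output


if __name__ == "__main__":
    # After "The adult ate": both agents are equally likely (assumed).
    print(blend({"busdriver": 0.5, "teacher": 0.5}))
    # After "... the steak with gusto": only the busdriver remains.
    print(blend({"busdriver": 1.0, "teacher": 0.0}))
```

In the blended case all four units sit at the same level, so nothing in the output pattern says that male goes with busdriver rather than teacher; that pairing is recoverable only once the ambiguity is resolved.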
For the unambiguous sentences in the corpus, the model predominately achieves the correct bindings and interpretations. For the ambiguous sentences, the model conflates, in the output layer, the patterns for each possible role filler for the role being probed. Theoretically, the model should set the activations to match the conditional probabilities for the aspects of the event that remain underspecified. Instead, these activations tend to vacillate based on recent, related training trials. This vacillation toward alternate interpretations is reminiscent of the frequent finding that humans generally do not notice the ambiguity of sentences. Instead, they generally settle for one interpretation or the other, unless their attention is explicitly drawn to the ambiguity. In sum, the long-term, average probabilities of picking particular interpretations may reflect the statistical properties of the environment, while the moment to moment fluctuations of interpretations reflect recent experience.
The gradual, incremental learning capabilities of the network underlie its ability to solve the bootstrapping problem, that is, to learn simultaneously about both the syntax and semantics of constituents. The problem of learning syntax and semantics is central for developmental psycholinguistics. Naigles, Gleitman and Gleitman (25) state that learning syntax and semantics using only statistical information seems impossible because, "at a minimum, it would require such extensive storage and manipulation of contingently categorized event/conversation pairs as to be unrealistic." Yet it is exactly by using such information that our model solves the problem. The model learns the syntax and semantics of the training corpus simultaneously. Across training trials, the model gradually learns which aspects of the event description each constituent of the input constrains and in what ways it constrains these aspects.
The problem of discovering which event in the world a sentence describes when multiple events are present would be handled in a similar way, though we have not modeled it. Again, the aspects of the world that the sentence actually describes would be discovered gradually over repeated trials, while those aspects that spuriously co-occur with these described aspects would wash out.
For both the bootstrapping and the ambiguous reference problem, then, our model takes a gradual, statistical approach. We do not want to overstate the case here, since the child learning a language confronts a considerably more complex version of these problems than our model does. Our sentences are pre-segmented into constituents, are very simple in structure, and are much fewer in number than the sentences a child would hear. However, the results demonstrate that the bootstrapping and ambiguous reference problems might ultimately be overcome by an extension of the present approach.
Many of the accomplishments of the SG model are shared by predecessors. Cottrell (5), Cottrell and Small (6), Waltz and Pollack (35), and McClelland and Kawamoto (21) have all demonstrated the use of syntactic and semantic constraints in role assignment and meaning disambiguation. Of these, the first two embodied the immediate update principle, but did not learn, while the third learned in a limited way, and had a fixed set of input slots. The greater learning capability of our model allows it to find connection strengths that solve the constraints embodied in the corpus without requiring the modeler to induce these constraints and without the modeler trying to build them in by hand. It also allows the model to construct its own representations in the sentence gestalt, and this ability allows these representations to be considerably more compact than in other cases. Some previous models have used conjunctive representations in which role/filler pairs are explicitly represented by units pre-assigned to represent either specific role/filler pairs (5, 35) or particular combinations of role features and filler features (21). Particularly, when such representations are extended so that triples, rather than simply pairs, can be represented (30, 33), these networks can become intractably large even with small vocabularies.
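The growth in size can be sketched with simple arithmetic. The example below assumes the most naive conjunctive scheme, one pre-assigned unit per combination, which is a simplification of the cited models rather than a description of any of them. The role count of 9 matches the output roles listed in Appendix A; the filler and head counts of 30 are hypothetical.

```python
from math import prod


def conjunctive_units(*dimension_sizes):
    """Units needed if every combination of values gets its own pre-assigned unit."""
    return prod(dimension_sizes)


if __name__ == "__main__":
    roles = 9      # output roles listed in Appendix A
    fillers = 30   # hypothetical vocabulary of fillers
    heads = 30     # hypothetical: any filler may also serve as a head
    print("role/filler pairs:", conjunctive_units(roles, fillers))
    print("head/role/filler triples:", conjunctive_units(heads, roles, fillers))
```

Under these assumptions, moving from pairs to triples multiplies the unit count by the number of possible heads, so even a modest vocabulary produces thousands of units.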
The present model avoids intractable size by learning to use its representational capacity sparingly to represent just those role/filler pairings that are consistent with its experience. This ability prevents the model from being able to represent totally arbitrary events: its representational capacities are strongly constrained by the range of its experience. In this regard the model seems similar to humans: it is widely known that human comprehension is strongly influenced by experience (24).
The major simulation reported here contained explicit surface role markings in the input. These markings were designed to make the learning of the syntactic information easier. Additional simulations, though, showed that the network could learn the syntactic information from only the temporal sequences of constituents. In a different task, where the network must attempt to anticipate the next input, there have been several demonstrations that networks can learn to keep track of parse position, at least for small finite-state grammars (7, 29). Extracting information from the temporal order of input sequences, then, is a function generally within the computational limits of recurrent networks.
Finally, the model is able to generalize the processing knowledge it has learned and apply it to novel sentences. Generalization can come in two varieties: compositional and predictive generalization. Compositional generalization occurs when the model has learned the constraints on sentence interpretation contributed by each element of the sentence, including both syntax and semantics, and has learned how to combine that information in novel sequences. The cluster analysis of the input weights confirms that the network is learning the semantic constraints imposed by constituents. It seems likely that a considerable part of the specification of these constraints might be derivable by the network from experience on a subset of the possible contexts where a word can occur. The interpretation acquired in these experiences would then cause the new word to behave like other similar words in contexts in which it was not trained. Illustrations that back propagation networks can generalize in this way are provided by Hinton (12), Taraban, McDonald and MacWhinney (31), and Rumelhart (27). In both the syntactic and the semantic generalization corpora, the model demonstrates its ability to learn this information and compose it.
The second variety of generalization is predictive generalization. It occurs when the model can use a regularity to predict upcoming sentence constituents. In the semantic generalization experiment, the model learned regularities about the age of agents in the corpus. The model then used these regularities to predict appropriate objects.
2. Deficiencies and limitations of the model
The model has several limitations and a few obvious deficiencies. The model only addresses a limited number of language phenomena. It does not address quantification, reference and co-reference, coordinate constructions, or many other phenomena. Perhaps the most important limitation is the limitation on the complexity of the sentences, and of the events that they describe. In general, it is necessary to characterize the roles and fillers of sentences with respect to their superordinate constituents. Similarly in complex events, there may be more than one actor, each performing an action in a different sub-event of the overall event or action. Representing these structures requires head/role/filler triples instead of simple role/filler pairs.
One solution is to train the model using triples rather than pairs as the sentence and event constituents. The difficulty lies in specifying the non-sentence members of the triples. These non-sentence members would stand for entire structures. Thus they would be very much like the patterns that we are currently using as sentence gestalts. It would be desirable to have the learning procedure induce these representations, but this is a bootstrapping problem that we have not yet attempted to solve.
Another limitation of the model is the use of local representations both for concepts and for roles. The present model used predominantly local representations of concept meanings only for convenience; in reality we would suppose that the conceptual representations underlying events would be represented by distributed patterns (14). This kind of representation would have several advantages. Context has the capability not only of selecting among highly distinct meanings such as between flying bats and baseball bats, but also, we believe, of shading meanings, emphasizing certain features and altering properties slightly as a function of context (21). Both of these phenomena are easily captured if we view the representation of a concept as a distributed pattern.
Similarly, there are several problems with the concept of role which are solved if distributed representations are used. It is often difficult to determine whether two roles are the same, and it is very difficult to decide exactly how many different roles there are. If roles were represented as distributed patterns, these issues would simply fall by the wayside. In earlier work (21), it was necessary to invent distributed representations for concepts, but recently a number of researchers have shown that such representations can be learned (12, 24, 28). The procedure should also apply to distributed representations of roles.
A final limitation is the small size of the corpus used in training the model. Given the length of time required for training, one might be somewhat pessimistic about the possibility that a network of this kind could master a substantial corpus. However, it should be noted that the extent to which learning time grows with corpus size is extremely hard to predict for connectionist models, and is highly problem dependent. For some problems (e.g. parity), learning time per pattern increases more than linearly with the number of training patterns (32), while for other problems (e.g. negation), learning time per pattern actually can decrease as the number of patterns increases (27). Where the current problem falls on this continuum is not yet known. A comparison of the learning times between the general corpus and the syntactic corpus used in the generalization experiments, however, is suggestive. The network required 630,000 trials to learn the 120 events in the general corpus (330,000 trials to learn all but the most irregular events). On the other hand the network required only 100,000 trials to learn the 2000 events of the syntactic corpus. The syntactic corpus, of course, is extremely regular, and the regularities are compositional. Given the model's good generalization results on the syntactic corpus, it is possible that the model will scale well to very large corpora if their regularities are composable.
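The comparison can be put on a per-event basis with a short calculation. The trial and event counts below are the ones reported in the text; the per-event ratio is just one convenient summary and is not a measure used in the paper itself.

```python
# (training trials, number of events), as reported in the text.
CORPORA = {
    "general": (630_000, 120),
    "syntactic": (100_000, 2000),
}


def trials_per_event(trials, events):
    """Average number of training trials spent per distinct event."""
    return trials / events


if __name__ == "__main__":
    for name, (trials, events) in CORPORA.items():
        print(name, trials_per_event(trials, events))
```

On these figures the general corpus needed 5250 trials per event and the syntactic corpus 50, roughly a hundredfold difference in favor of the regular, compositional corpus.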
One final deficiency of the model is its tendency to activate fillers for roles that do not apply to a particular frame. This tendency could perhaps be overcome by explicit training that there should be no output for a particular role, but this seems inelegant and impractical, especially if we are correct in believing that the set of roles is open-ended. The absence of roles seems somehow implicit in events, rather than explicitly noted. Perhaps event representations that preserved more detail of the real-world event would provide the relevant implicit constraints.
S. Conclusion
The SG model represents another step in what will surely be a long series of explorations of connectionist models of language processing. The model is an advance in our view, but there is still a very long way to go. The next step is to find ways to extend the approach to more complex structures and more extensive corpora, while increasing the rate of learning.
Appendix A. Input and Output Representations
Input
surface locations: pre-verbal, verbal, post-verbal-, post-verbal-n
words:
consumed, ate, drank, stirred, spread, kissed, gave, hit, shot, threw, drove
shed, rose
someone, adult, child, dog, busdriver, teacher, schoolgirl, pitcher, spot
something, food, steak, soup, ice cream, crackers, jelly, iced-tea, kool-aid
utensil, spoon, knife, finger, gun
place, kitchen, living-room, park, bat, ball, bus, fur
gusto, pleasure, daintiness
with, in, to, by
was
Output
roles: agent, action, patient, instrument, co-agent, co-patient, location, adverb, recipient
actions and concepts:
ate, drank, stirred, spread, kissed, gave, hit, shot, threw(tossed), threw(hosted), drove(transported), drove(motivated), shed(verb), rose(verb)
busdriver, teacher, schoolgirl, pitcher(person), spot
steak, soup, ice cream, crackers, jelly, iced-tea, kool-aid
spoon, knife, finger, gun
kitchen, living-room, shed(noun), park
rose(noun), bat(animal), bat(baseball), ball(sphere), ball(party), bus
pitcher(container), fur
gusto, pleasure, daintiness
action features: consumed, passive
concept features:
person, adult, child, dog, male, female
thing, food, utensil
place, in-doors, out-doors
Appendix B. Sample Sentence-Frame
Hit
In the sentence-frame below, superior numbers 1-4 have the following meaning:
1 Include a role in the input with this probability.
2 Choose this filler with this probability.
3 Choose this word with this probability.
4 The word appears in this prepositional phrase.
agent 100^1
  25^2 busdriver 70^3 adult 20 person 10
    verb 100
      100 hit 100
    patient 100
      25 shed-n 80 something 20
        instrument 50
          100 bus 80 something 20 with^4
      40 ball-s 80 something 20
        location 50
          100 park 100 in
        instrument 50
          100 bat-b 80 something 20 with
      10 bat-a 80 something 20
        location 50
          100 shed-n 100 in
        instrument 50
          100 bat-b 80 something 20 with
      25 pitcher-p 70 child 20 person 10
        location 50
          100 park 100 in
        instrument 50
          100 ball-s 80 something 20 with
  25 teacher 70 adult 20 person 10
    verb 100
      100 hit 100
    patient 100
      34 pitcher-c 80 something 20
        location 50
          100 kitchen 100 in
        instrument 50
          100 spoon 80 something 20 with
      33 pitcher-p 70 child 20 person 10
        location 50
          100 living-room 100 in
        instrument 50
          100 pitcher-c 80 something 20 with
      33 schoolgirl 70 child 20 person 10
        location 50
          100 kitchen 100 in
        instrument 50
          100 spoon 80 something 20 with
  25 pitcher-p 70 child 20 person 10
    verb 100
      100 hit 100
    patient 100
      40 ball-s 80 something 20
        location 50
          100 park 100 in
        instrument 50
          100 bat-b 80 something 20 with
      10 bat-a 80 something 20
        location 50
          100 shed-n 100 in
        instrument 50
          100 bat-b 80 something 20 with
      25 bus 80 something 20
        location 50
          100 park 100 in
        instrument 50
          100 ball-s 80 something 20 with
      25 busdriver 70 adult 20 person 10
        location 50
          100 park 100 in
        instrument 50
          100 ball-s 80 something 20 with
  25 schoolgirl 70 child 20 person 10
    verb 100
      100 hit 100
    patient 100
      34 pitcher-c 80 something 20
        location 50
          100 kitchen 100 in
        instrument 50
          100 spoon 80 something 20 with
      33 spot 80 dog 20
        location 50
          100 kitchen 100 in
        instrument 50
          100 spoon 80 something 20 with
      33 teacher 70 adult 20 person 10
        location 50
          100 kitchen 100 in
        instrument 50
          100 spoon 80 something 20 with
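How a sentence might be drawn from such a frame can be sketched as follows. This is an illustrative reading of the frame, not the authors' corpus generator: it covers a single path (busdriver as agent, ball-s as patient), treats the listed numbers as percentages, and the names PATH, pick and sample_constituents are hypothetical.

```python
import random

# One path through the "hit" frame: (role, percent chance the role appears,
# [(word, percent chance of that word)], preposition or None).
PATH = [
    ("agent", 100, [("busdriver", 70), ("adult", 20), ("person", 10)], None),
    ("verb", 100, [("hit", 100)], None),
    ("patient", 100, [("ball-s", 80), ("something", 20)], None),
    ("location", 50, [("park", 100)], "in"),
    ("instrument", 50, [("bat-b", 80), ("something", 20)], "with"),
]


def pick(weighted, rng):
    """Pick one word from (word, percent) pairs in proportion to the percents."""
    words = [word for word, _ in weighted]
    weights = [percent for _, percent in weighted]
    return rng.choices(words, weights=weights, k=1)[0]


def sample_constituents(path, rng):
    """Sample the surface constituents of one sentence along the given path."""
    constituents = []
    for role, include_percent, words, preposition in path:
        # rng.random() is in [0, 1), so a 100 percent role is always included.
        if rng.random() * 100 < include_percent:
            word = pick(words, rng)
            phrase = f"{preposition} {word}" if preposition else word
            constituents.append((role, phrase))
    return constituents


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(sample_constituents(PATH, rng))
```

Roles marked 100 always appear, while the location and instrument phrases each appear in about half of the sampled sentences, so the same event yields sentences of varying completeness and specificity.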
References
1. R.C. Anderson and A. Ortony, On putting apples into bottles: A problem of polysemy, Cognitive Psychol. 7 (1975) 167-180.
2. F.C. Bartlett, Remembering: An Experimental and Social Study (Cambridge University Press, Cambridge, 1932).
3. P.A. Carpenter and M.A. Just, Reading comprehension as the eyes see it, in: M.A. Just and P.A. Carpenter, eds., Cognitive Processes in Comprehension (Erlbaum, Hillsdale, NJ, 1977).
4. W.G. Chase and H.A. Simon, Perception in chess, Cognitive Psychol. 4 (1973) 55-81.
5. G.W. Cottrell, A connectionist approach to word sense disambiguation, Dissertation, Computer Science Department, University of Rochester, NY (1985).
6. G.W. Cottrell and S.L. Small, A connectionist scheme for modeling word sense disambiguation, Cognition and Brain Theory 6 (1983) 89-120.
7. J.L. Elman, Finding structure in time, CRL Tech. Rept. 8801, Center for Research in Language, University of California, San Diego, La Jolla, CA (1988).
8. T.D. Erickson and M.E. Mattson, From words to meaning: A semantic illusion, J. Verbal Learn. Verbal Behav. 20 (1981) 540-551.
9. C.J. Fillmore, The case for case, in: E. Bach and R.T. Harms, eds., Universals in Linguistic Theory (Holt, New York, 1968).
10. L.R. Gleitman and E. Wanner, Language acquisition: The state of the state of the art, in: E. Wanner and L.R. Gleitman, eds., Language Acquisition: The State of the Art (Cambridge University Press, Cambridge, MA, 1982).
11. G.E. Hinton, Implementing semantic networks in parallel hardware, in: G.E. Hinton and J.A. Anderson, eds., Parallel Models of Associative Memory (Erlbaum, Hillsdale, NJ, 1981).
12. G.E. Hinton, Learning distributed representations of concepts, in: Proceedings Eighth Annual Conference of the Cognitive Science Society, Amherst, MA (1986).
13. G.E. Hinton, Connectionist learning procedures, Tech. Rept. CMU-CS-87-115, Computer Science Department, Carnegie-Mellon University, Pittsburgh, PA (1987).
14. G.E. Hinton, J.L. McClelland and D.E. Rumelhart, Distributed representations, in: D.E. Rumelhart, J.L. McClelland and the PDP Research Group, eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition 1: Foundations (MIT Press, Cambridge, MA, 1986).
15. M.I. Jordan, Attractor dynamics and parallelism in a connectionist sequential machine, in: Proceedings Eighth Annual Conference of the Cognitive Science Society, Amherst, MA (1986).
16. A.H. Kawamoto, Distributed representations of ambiguous words and their resolution in a connectionist network, in: S.L. Small, G.W. Cottrell and M.K. Tanenhaus, eds., Lexical Ambiguity Resolution: Perspectives from Psycholinguistics, Neuropsychology, and Artificial Intelligence (Morgan Kaufmann, San Mateo, CA, 1988).
17. B. MacWhinney, E. Bates and R. Kliegl, Cue validity and sentence interpretation in English, German, and Italian, J. Verbal Learn. Verbal Behav. 23 (1984) 127-150.
18. M.P. Marcus, A Theory of Syntactic Recognition for Natural Language (MIT Press, Cambridge, MA, 1980).
19. W. Marslen-Wilson and L.K. Tyler, The temporal structure of spoken language understanding, Cognition 8 (1980) 1-71.
20. J.L. McClelland and J.L. Elman, Interactive processes in speech perception: The TRACE model, in: J.L. McClelland, D.E. Rumelhart and the PDP Research Group, eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition 2: Applications (MIT Press, Cambridge, MA, 1986).
21. J.L. McClelland and A.H. Kawamoto, Mechanisms of sentence processing: Assigning roles to constituents, in: J.L. McClelland, D.E. Rumelhart and the PDP Research Group, eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition 2: Applications (MIT Press, Cambridge, MA, 1986).
22. J.L. McClelland and D.E. Rumelhart, An interactive activation model of context effects in letter perception: Part 1. An account of basic findings, Psychol. Rev. 88 (1981) 375-407.
23. G. McKoon and R. Ratcliff, The comprehension processes and memory structures involved in instrumental inference, J. Verbal Learn. Verbal Behav. 20 (1981) 671-682.
24. R. Miikkulainen and M.G. Dyer, Building distributed representations without microfeatures, Tech. Rept., Artificial Intelligence Laboratory, Computer Science Department, University of California, Los Angeles, CA (1988).
25. L.G. Naigles, H. Gleitman and L.R. Gleitman, Syntactic bootstrapping in verb acquisition: Evidence from comprehension, Tech. Rept., Department of Psychology, University of Pennsylvania, Philadelphia, PA (1987).
26. W.V. Quine, Word and Object (Harvard Press, Cambridge, MA, 1960).
27. D.E. Rumelhart, colloquium presented to the Department of Computer Science, Carnegie-Mellon University, Pittsburgh, PA (1987).
28. D.E. Rumelhart, G.E. Hinton and R.J. Williams, Learning internal representations by error propagation, in: D.E. Rumelhart, J.L. McClelland and the PDP Research Group, eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition 1: Foundations (MIT Press, Cambridge, MA, 1986).
29. D. Servan-Schreiber, A. Cleeremans and J.L. McClelland, Encoding sequential structure in simple recurrent networks, Tech. Rept. CMU-CS-88-183, Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA (1988).
30. M.F. St. John and J.L. McClelland, Reconstructive memory for sentences: A PDP approach, Proceedings Inference: OUIC 86, University of Ohio, Athens, OH (1987).
31. R. Taraban, J. McDonald and B. MacWhinney, Category learning in a connectionist model: Learning to decline the German definite article, in: R. Corrigan, ed., Milwaukee Conference on Categorization (Benjamins, Philadelphia, PA, to appear).
32. G. Tesauro, Scaling relationships in back-propagation learning: Dependence on training set size, Tech. Rept., Center for Complex Systems Research, University of Illinois at Urbana-Champaign, Champaign, IL (1987).
33. D.S. Touretzky and S. Geva, A distributed connectionist representation for concept structures, in: Proceedings Ninth Annual Conference of the Cognitive Science Society, Seattle, WA (1987).
34. T.A. van Dijk and W. Kintsch, Strategies of Discourse Comprehension (Academic Press, Orlando, FL, 1983).
35. D.L. Waltz and J.B. Pollack, Massively parallel parsing: A strongly interactive model of natural language interpretation, Cognitive Sci. 9 (1985) 51-74.