words of a sentence may be Artificial Intelligence

21
ARTIFICIAL INTELLIGENCE 217 Learning and Applying Contextual Constraints in Sentence Comprehension * Mark F. St. John ** and James L. McClelland Department of Psychology, Carnegie- Mellon University, Pittsburgh , PA 15213 USA ABSTRACf A parallel distributed processing model is described that learns to comprehend single clause entences, Specifically, it assigns thematic roles 10 sentence constituents, disambiguates ambiguous word~' , instantiates vague words , and elaborates implied roles, The sentences are pre-segmented illlo wm' tituent phrQJ,' es, Each constituent is processed in turn 10 update an evolving representation of the event described by the sentence, The model uses the information derived from each constituent to revise its ongoing interpretation of the sentence and to anticipate additional constituents, The network learns to perform these tasks through practice on processing example sentence/event pairs, The learning procedure allows the model to take a statistical approach to solving the bootstrapping problem of learning the syntax "nd semantics of a language from the same data, The model performs very well on the corpus of sentences on which it was trained, and generalizes to sentences on which it was not trained, but learns slowly, 1. Introduction The goal of our research has been to develop a model that can learn to convert a simple sentence into a conceptual representation of the event that the sentence describes, Specifically, we have been concerned with the later stages of this process: the conversion of a sequence of sentence constituents , such as noun phrases , into a representation of the event. A number of problems make this process difficult. First, the words of a sentence may be ambiguous or vague. In the sentence , " The pitcher threw the ball " each content word is ambiguous. " Pitcher " could either refer to a ball- player or a container; The authors would like to thank Geoffrey Hinton, Brian MacWhinney, Andrew Hudson, and the members of the PDP Research Group at Carnegie- Mellon, This research was supported by NSF Grant BNS 86- 09729, ONR Contracts NOOOI4- 86- 0146 , NOOOI4- 86- OOI67 , and NOOO14. 86- 0349 , and NIMH Career Development Award MHOO385 to the second author, .. Present address: Department of Cognitive Science, University of California at San Diego, La Jolla , CA 92093, USA, Artificial Intelligence 46 (1990) 217- 257 0004- 3702/90/$03, 50 (Q 1990 Elsevier Sci~nce Publishers B.v, (North- Holland) 218 F, ST, JOHN AND j, L. McCLELLAND threw " could either refer to toss or host; and " ball" could refer to a sphere or a dance, How are the appropriate meanings selected so that a single, coherent interpretation of the sentence is produced? Vague words also present difficul- ties. In the sentences , " The container held the apples " and "The container held the cola, " the word " container " refers to two different objects (1). How does the context affect the interpretation of vague words? A third problem is the complexity of assigning the correct thematic roles (9) to the objects referred to in a sentence. Consider: (1) The teacher ate the spaghetti with the busdriver, (2) The teacher ate the spaghetti with the red sauce, (3) The busdriver hit the fireman. (4) The busdriver was hit by the fireman. In the first two examples , semantics play an important role. In the first sentence, it is the reader s knowledge that busdrivers are people that precludes the reader from deciding the busdriver is to be served as a condiment. Instead it must be that both he and the teacher are eating the spaghetti. Semantic constraints work conversely in the second sentence. In the third sentence, semantics do not help determine who is the agent and who is the patient. Instead , word order determines the thematic role assignments. The busdriver is the agent because " the busdriver " is the pre-verbal constituent. Finally, in the fourth sentence, the influence of other morphological features can be seen. The passive verb tense and the " " preposition, in conjunction with the word order, determine that the busdriver is the patient. Thematic role assignment then, requires the joint consideration of a variety of aspects of the sentence, A fourth problem for processing sentences is that a sentence may leave some thematic constituents implicit that are nevertheless present in the event. For example in sentences (1) and (2) above , the spaghetti was undoubtedly eaten with forks, Psychological evidence indicates that missing constituents , when strongly related to the action , are inferred and added to the description of the event. McKoon and Ratcliff (23) found , for example , that " hammer " was inferred after subjects read " Bobby pounded the boards together with nails. Our model of the comprehension process centers on viewing the process as a form of constraint satisfaction. The surface features of a sentence , its particular words and their order and morphology, provide a rich set of constraints on the sentence s meaning. Each feature constrains the meaning in a number of respects, Conjunctions of features , such as word order and passive-voice morphology, provide additional constraints, Together, the constraints lead to a coherent interpretation of the sentence (17), These constraints are not typically all-or-none. Instead, constraints tend to vary in strength: some are strong and others are relatively weak, An example adapted from Marcus (18) provides an illustration of the competition between constraints. (1) Which dragon did the knight give the boy?

Transcript of words of a sentence may be Artificial Intelligence

Page 1: words of a sentence may be Artificial Intelligence

AR

TIF

ICIA

L I

NT

EL

LIG

EN

CE

217

Learning and Applying Contextual

Con

stra

ints

in S

ente

nce

Com

preh

ensi

on *

Mar

k F.

St.

John

** and James L. McClelland

Dep

artm

ent o

f Ps

ycho

logy

, Car

negi

e-M

ello

n U

nive

rsity

,Pi

ttsbu

rgh

, PA

1521

3U

SA

AB

STR

AC

f

A p

aral

lel d

istr

ibut

ed p

roce

ssin

g m

odel

is described that learns to co

mpr

ehen

d si

ngle

cla

use

ente

nces

, Spe

cifi

cally

, it a

ssig

ns th

emat

ic r

oles

10

sent

ence

con

stitu

ents

, dis

ambi

guat

es a

mbi

guou

sw

ord~

', in

stan

tiate

s va

gue

wor

ds, a

nd e

labo

rate

s im

plie

d ro

les,

The

sen

tenc

es a

re p

re-s

egm

ente

d ill

low

m'tituent phrQJ,'es

, Eac

h co

nstit

uent

is p

roce

ssed

in tu

rn 1

0 up

date

an

evol

ving

rep

rese

ntat

ion

of th

eev

ent d

escr

ibed

by

the

sent

ence

, The

mod

el u

ses

the

info

rmat

ion

deri

ved

from

eac

h co

nstit

uent

tore

vise

its

ongo

ing

inte

rpre

tatio

n of

the

sent

ence

and

to a

ntic

ipat

e ad

ditio

nal c

onst

ituen

ts, T

he n

etw

ork

lear

ns to

per

form

thes

e ta

sks

thro

ugh

prac

tice

on p

roce

ssin

g ex

ampl

e se

nten

ce/e

vent

pai

rs, T

helearning procedure allows the model to take a st

atis

tical

app

roac

h to

sol

ving

the

boot

stra

ppin

gpr

oble

m o

f le

arni

ng th

e sy

ntax

"nd

sem

antic

s of

a la

ngua

ge f

rom

the

sam

e da

ta, T

he m

odel

per

form

sve

ry w

ell o

n th

e co

rpus

of

sent

ence

s on

whi

ch it

was

trai

ned,

and

gen

eral

izes

to s

ente

nces

on

whi

ch it

was

not

trai

ned,

but

lear

ns s

low

ly,

1. Introduction

The

goa

l of

our

rese

arch

has

bee

n to

dev

elop

a m

odel

that

can

lear

n to

con

vert

a simple sentence into a conceptual representation of the event that the

sent

ence

des

crib

es, S

peci

fical

ly, w

e ha

ve b

een

conc

erne

d w

ith th

e la

ter

stag

esof

this

pro

cess

: the

con

vers

ion

of a

seq

uenc

e of

sen

tenc

e co

nstit

uent

s, s

uch

asno

un p

hras

es, i

nto

a re

pres

enta

tion

of th

e ev

ent.

A n

umbe

r of

pro

blem

s m

ake

this

pro

cess

dif

ficu

lt. F

irst

, the

wor

ds o

f a

sent

ence

may

be

ambi

guou

s or

vague. In the sentence

, "

The

pitc

her

thre

w th

e ba

ll"

each

con

tent

wor

d is

ambi

guou

s. "

Pitc

her" could ei

ther

ref

er to

a b

all-player or a co

ntai

ner;

The

aut

hors

wou

ld li

ke to

than

k G

eoff

rey

Hin

ton,

Bri

an M

acW

hinn

ey, A

ndre

w H

udso

n, a

ndth

e m

embe

rs o

f the

PD

P R

esea

rch

Gro

up a

t Car

negi

e-M

ello

n, T

his

rese

arch

was

sup

port

ed b

yN

SF

Gra

nt B

NS

86-09729, ONR Contracts NOOOI4-

86-

0146

, NO

OO

I4-8

6-O

OI6

7, a

nd N

OO

O14

.

86-

0349

, and

NIM

H C

aree

r D

evel

opm

ent A

war

d M

HO

O38

5 to

the

seco

nd a

utho

r,..

Pre

sent

add

ress

: Dep

artm

ent o

f Cog

nitiv

e S

cien

ce, U

nive

rsity

of C

alifo

rnia

at S

an D

iego

, La

Jolla

, CA

920

93, U

SA,

Artificial Intelligence 46

(19

90)

217-

257

0004

-370

2/90

/$03

, 50 (Q 1990

Els

evie

r Sc

i~nc

e Pu

blis

hers

B.v

, (N

orth

-Hol

land

)

218

F, S

T, J

OH

N A

ND

j,L.

McC

LELL

AN

D

thre

w"

coul

d ei

ther

ref

er to

toss

or

host

; and

"ba

ll" c

ould

ref

er to

a s

pher

e or

a da

nce,

How

are

the

appr

opri

ate

mea

ning

s se

lect

ed s

o th

at a

sin

gle,

coh

eren

tin

terp

reta

tion

of th

e se

nten

ce is

pro

duce

d? V

ague

wor

ds a

lso

pres

ent d

iffic

ul-

ties. In the sentences

, "

The

con

tain

er h

eld

the

appl

es" and "T

he c

onta

iner

held

the

cola

," th

e w

ord

"con

tain

er" refers to two different objects (1

). H

owdo

es th

e co

ntex

t aff

ect t

he in

terp

reta

tion

of v

ague

wor

ds?

A th

ird

prob

lem

is th

e co

mpl

exity

of

assi

gnin

g th

e co

rrec

t the

mat

ic r

oles

(9)

to th

e ob

ject

s re

ferr

ed to

in a

sen

tenc

e. C

onsi

der:

(1)

The

teac

her

ate

the spaghetti with the busdriver,

(2)

The

teac

her

ate

the spaghetti with the red sauce,

(3)

The

bus

driv

er h

it th

e fi

rem

an.

(4)

The

bus

driv

er w

as h

it by

the

fire

man

.

In th

e fi

rst t

wo

exam

ples

, semantics play an important role. In the fi

rst

sent

ence

, it i

s th

e re

ader

s kn

owle

dge

that

bus

driv

ers

are

peop

le th

at p

recl

udes

the

read

er f

rom

dec

idin

g th

e bu

sdri

ver

is to

be

serv

ed a

s a

cond

imen

t. In

stea

dit

mus

t be

that

bot

h he

and

the teacher are eating the sp

aghe

tti. S

eman

tic

cons

trai

nts

wor

k conversely in the second sentence. In the third sentence,

sem

antic

s do

not

hel

p de

term

ine

who

is th

e ag

ent a

nd w

ho is

the

patie

nt.

Inst

ead

, wor

d or

der

dete

rmin

es th

e th

emat

ic r

ole

assi

gnm

ents

. The

bus

driv

er is

the

agen

t bec

ause

"th

e bu

sdriv

er"

is th

e pr

e-ve

rbal

con

stitu

ent.

Fina

lly, i

n th

efo

urth

sen

tenc

e, th

e in

flue

nce

of o

ther

mor

phol

ogic

al f

eatu

res

can

be s

een.

The

passive verb tense and the "

" preposition, in conjunction with the w

ord

orde

r, d

eter

min

e th

at th

e bu

sdri

ver

is th

e pa

tient

. The

mat

ic r

ole

assi

gnm

ent

then, requires the joint consideration of a variety of aspects of the sentence,

A f

ourt

h pr

oble

m f

or p

roce

ssin

g se

nten

ces

is th

at a

sen

tenc

e m

ay le

ave

som

eth

emat

ic c

onst

ituen

ts im

plic

it th

at a

re n

ever

thel

ess

pres

ent i

n th

e ev

ent.

For

exam

ple

in s

ente

nces

(1)

and

(2)

abo

ve, t

he s

pagh

etti

was

und

oubt

edly

eat

enwith forks, Psychological evidence indicates that missing constituents, w

hen

strongly related to the action, a

re in

ferr

ed a

nd a

dded

to th

e de

scrip

tion

of th

eev

ent.

McK

oon

and

Rat

clif

f (2

3) fo

und, for example, that "ha

mm

er" was

infe

rred

aft

er s

ubje

cts

read

"B

obby

pou

nded

the

boar

ds to

geth

er w

ith n

ails

.O

ur m

odel

of

the

com

preh

ensi

on p

roce

ss c

ente

rs o

n vi

ewin

g th

e pr

oces

s as

afo

rm o

f co

nstr

aint

sat

isfa

ctio

n. T

he s

urfa

ce f

eatu

res

of a

sen

tenc

e, i

ts p

artic

ular

wor

ds a

nd th

eir

orde

r an

d m

orph

olog

y, p

rovi

de a

ric

h se

t of

cons

trai

nts

on th

ese

nten

ces meaning. Each feature constrains the meaning in a number of

resp

ects

, Con

junc

tions

of

feat

ures

, suc

h as

wor

d or

der

and

pass

ive-

voic

e

mor

phol

ogy,

pro

vide

add

ition

al c

onst

rain

ts, T

oget

her,

the

cons

trai

nts

lead

to a

cohe

rent

inte

rpre

tatio

n of

the

sent

ence

(17)

, The

se c

onst

rain

ts a

re n

ot ty

pica

llyal

l-or

-non

e. I

nste

ad, c

onst

rain

ts te

nd to

var

y in

str

engt

h: s

ome

are

stro

ng a

ndot

hers

are

rel

ativ

ely

wea

k, A

n ex

ampl

e ad

apte

d fr

om M

arcu

s (1

8) p

rovi

des

anill

ustr

atio

n of

the

com

petit

ion

betw

een

cons

trai

nts.

(1)

Whi

ch d

rago

n di

d th

e kn

ight

giv

e th

e bo

y?

common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 2: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

219

(2)

Whi

ch b

oy d

id th

e kn

ight

giv

e th

e dr

agon

?(3

) W

hich

boy

did

the

knig

ht g

ive

the

swor

d?(4

) W

hich

boy

did

the

knig

ht g

ive

to th

e sw

ord?

App

aren

tly, i

n th

e fi

rst t

wo

sent

ence

s, a

wea

k sy

ntac

tic c

onst

rain

t mak

es u

spr

efer

the

firs

t nou

n as

the

patie

nt a

nd th

e no

un a

fter

the

verb

as

the

reci

pien

t.T

he s

ubtle

sem

antic

s in

the

seco

nd s

ente

nce ,

that

kni

ghts

don

t give boys to

drag

ons,

doe

s no

t ove

rrid

e th

e sy

ntac

tic c

onst

rain

t for

mos

t rea

ders

, tho

ugh

itm

ay m

ake

the sentence seem ungrammatical to some, In se

nten

ce (

3), a

stro

nger

sem

antic

con

stra

int o

verr

ides

this

syn

tact

ic c

onst

rain

t: sw

ords

, whi

char

e in

anim

ate

obje

cts

, can

not r

ecei

ve b

oys,

Fin

ally

, in

the

four

th s

ente

nce, a

stronger syntactic constraint overrides the semantics, It is cl

ear

from

this

exam

ple

that

con

stra

ints

var

y in

str

engt

h an

d co

mpe

te to

pro

duce

an

inte

rpre

-ta

tion

of a

sen

tenc

e, A

goo

d m

etho

d fo

r ca

ptur

ing

this

com

petit

ion

is to

ass

ign

real-valued strengths to the constraints, and to alIow them to co

mpe

te o

rco

oper

ate

acco

rdin

g to

thei

r st

reng

th.

Parallel distributed processing, or connectionist, models are pa

rtic

ular

lygood for modeling this style of processing, T

hey

allo

w la

rge

amou

nts

ofinformation to be processed simultaneously and competitively, and they alIow

evid

ence

to b

e w

eigh

ted

on a

con

tinuu

m (

20, 2

2), A

num

ber

of re

sear

cher

sha

ve p

ursu

ed th

is id

ea a

nd h

ave

built

mod

els

to a

pply

con

nect

ioni

sm to

sent

ence

pro

cess

ing

(5, 6

, 35)

, The

dev

elop

men

t of

this

app

roac

h, h

owev

er, h

asbe

en r

etar

ded

beca

use

it is

dif

ficu

lt to

det

erm

ine

exac

tly w

hat c

onst

rain

ts a

reim

pose

d by

eac

h fe

atur

e or

set

of

feat

ures

in a

sen

tenc

e, I

t is

even

mor

edi

ffic

ult t

o de

term

ine

the

appr

opri

ate

stre

ngth

s ea

ch o

f th

ese

cons

trai

nts

shou

ldha

ve, C

onne

ctio

nist

lear

ning

pro

cedu

res ,

how

ever

, allo

w a

mod

el to

lear

n th

eap

prop

riat

e co

nstr

aint

s an

d as

sign

app

ropr

iate

str

engt

hs to

them

,T

o ta

ke a

dvan

tage

of

this

fea

ture

, lea

rnin

g w

as a

dded

to o

ur li

st o

f go

als,

The

mod

el is

giv

en a

sen

tenc

e as

inpu

t. F

rom

the

sent

ence

, the

model must

prod

uce

a re

pres

enta

tion

of th

e ev

ent t

o w

hich

the

sent

ence

ref

ers,

The

act

ual

event that corresponds to the sentence is then used as fe

edba

ck to

trai

n th

em

odel

. But

lear

ning

is n

ot w

ithou

t its

ow

n pr

oble

ms,

Sev

eral

fea

ture

s of

the

lear

ning

task

mak

e le

arni

ng d

iffi

cult.

One

pro

blem

is th

at th

e en

viro

nmen

t is

prob

abili

stic

, On

diff

eren

t occ

asio

ns, a

sen

tenc

e m

ay r

efer

to d

iffe

rent

eve

nts

that

is, i

t may

be

refe

rent

ially

am

bigu

ous,

For

exa

mpl

e, a

sen

tenc

e lik

e, "

The

pitc

her

thre

w th

e ba

lI\'

may

ref

er to

eith

er th

e tossing of a projectile or the

host

ing

of a

par

ty, T

he r

obus

t , g

rade

d, a

nd in

crem

enta

l cha

ract

er o

f co

nnec

-tio

nist

lear

ning

alg

orith

ms

lead

s us

to h

ope

that

they

will

be

able

to c

ope

with

the

vari

abili

ty in

the

envi

ronm

ent i

n w

hich

they

lear

n,A

sec

ond

lear

ning

pro

blem

con

cern

s th

e di

ffic

ulty

of

lear

ning

the

map

ping

betw

een

the

part

s of

the

sent

ence

and

the

part

s of

the

even

t (10

, 26)

, Lea

rnin

gth

e m

appi

ng is

som

etim

es r

efer

red

to a

s a

boot

-str

appi

ng p

robl

em s

ince

the

mea

ning

of

the

cont

ent w

ords

and

sig

nific

ance

of t

he s

ynta

x m

ust b

e ac

quire

d

220

F, S

T, J

OH

N A

ND

j,L,

McC

LELL

AN

D

from

the

sam

e se

t of d

ata,

To

lear

n th

e sy

ntax

, it s

eem

s ne

cess

ary

to a

lrea

dykn

ow th

e w

ord meanings. Conversely, to learn the word m

eani

ngs

it se

ems

nece

ssar

y to

kno

w h

ow th

e sy

ntax

map

s th

e w

ords

ont

o th

e ev

ent d

escr

iptio

n,T

he c

onne

ctio

nist

lear

ning

pro

cedu

re ta

kes

a st

atis

tical

app

roac

h to

this

prob

lem

, Thr

ough

exp

osur

e to

larg

e nu

mbe

rs o

f se

nten

ces

and

the

even

ts th

eyde

scri

be, t

he m

appi

ng b

etw

een

feat

ures

of

the

sent

ence

s an

d ch

arac

teri

stic

s of

the

even

ts w

ill e

mer

ge a

s st

atis

tical

reg

ular

ities

, For

inst

ance

, in

the

long

run

the

lear

ning

pro

cedu

re s

houl

d di

scov

er th

e re

gula

rity

that

sen

tenc

es b

egin

ning

with

"the boy" and containing a tr

ansi

tive

verb

in th

e ac

tive

voic

e re

fer

toev

ents

in w

hich

a y

oung

, mal

e hu

man

par

ticip

ates

as

an a

gent

, The

dis

cove

ryof

the

entir

e en

sem

ble

of s

uch

regu

lari

ties

prov

ides

a jo

int s

olution to the

prob

lem

s of

lear

ning

the

synt

ax a

nd th

e m

eani

ngs

of w

ords

.So

me

aspe

cts

of th

ese

goal

s ha

ve b

een

addr

esse

d by

our

ow

nea

rlier

wor

k(2

1, 3

0), H

owev

er, these previous models used a cumbersome

a priori

repr

e-se

ntat

ion

of s

ente

nces

that

pro

ved

unw

orka

ble

(see

(30)

for

dis

cuss

ion)

, Giv

enth

e re

cent

suc

cess

es in

usi

ng c

onne

ctio

nist

lear

ning

pro

cedu

res

to le

arn

inte

rnal

repr

esen

tatio

ns (

12, 2

8), w

e de

cide

d to

exp

lore

the

feas

ibili

ty o

f having a

netw

ork

lear

n its

ow

n re

pres

enta

tion

of s

ente

nces

,A

fin

al c

hara

cter

istic

of

lang

uage

com

preh

ensi

on w

e w

ante

d to

cap

ture

isso

met

imes

cal

Ied

the

prin

cipl

e of

imm

edia

te u

pdat

e (3

, 19,

34)

, As

each

constituent of the sentence is encountered, the interpretation of the entire

even

t is

adju

sted

to r

efle

ct th

e co

nstr

aint

s ar

isin

g fr

om th

e ne

w c

onst

ituen

t in

conj

unct

ion

with

the

cons

trai

nts

from

con

stitu

ents

alr

eady

enc

ount

ered

, Bas

edon

all

of th

e av

aila

ble

cons

trai

nts,

the

mod

el s

houl

d tr

y to

ant

icip

ate

upco

min

gco

nstit

uent

s, I

t sho

uld

also

adj

ust i

ts in

terp

reta

tion

of p

rece

ding

con

stitu

ents

tore

flec

t eac

h ne

w b

it of

info

rmat

ion,

In

this

way

, par

ticul

ar s

ente

nce

inte

rpre

ta-

tions

may

gai

n an

d lo

se s

uppo

rt th

roug

hout

the

cour

se o

f pr

oces

sing

as

each

new

bit

of in

form

atio

n is

pro

cess

ed. T

his

imm

edia

te u

pdat

e sh

ould

be

acco

m-

plis

hed

whi

le a

void

ing

the

diff

icul

ty o

f pe

rfor

min

g ba

cktr

acki

ng,

In s

um, t

he m

odel

add

ress

es s

ix g

oals

:

. to disambiguate am

bigu

ous

wor

ds;

. to instantiate va

gue

wor

ds;

. to assign thematic ro

les;

. to elaborate implied ro

les;

. to learn to perform these ta

sks;

. to immediately adjust its

inte

rpre

tatio

n as

eac

h co

nstit

uent

is p

roce

ssed

,

2, D

escr

iptio

n of

the

SG

Mod

el

1. T

ask

The

mod

el's

task

is to

pro

cess

a s

ingl

e cl

ause

sen

tenc

e w

ithou

t em

bedd

ings

into

a representation of the event it de

scri

bes,

The

sen

tenc

e is

pre

sent

ed to

the

common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 3: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

221

mod

el a

s a

tem

pora

l seq

uenc

e of

con

stitu

ents

, A c

onst

ituen

t is

eith

er a

sim

ple

noun

phr

ase,

a p

repo

sitio

nal p

hras

e, o

r a

verb

(in

clud

ing

the

auxi

liary

ver

b, if

any)

, The

info

rmat

ion

each

of

thes

e se

nten

ce c

onst

ituen

ts y

ield

s is

imm

edia

tely

used as evidence to update the model's internal representation of the entire

even

t. T

his

repr

esen

tatio

n is

cal

led

the

sent

ence

ges

talt

beca

use

all o

f th

ein

form

atio

n fr

om th

e se

nten

ce is

rep

rese

nted

toge

ther

with

in a

sin

gle

, dis

trib

ut-

ed r

epre

sent

atio

n; th

e m

odel

is c

alle

d th

e Se

nten

ce G

esta

lt, o

r SG

, mod

el

because it co

ntai

ns th

is r

epre

sent

atio

n. T

his

gene

ral c

oncept of sentence

repr

esen

tatio

n co

mes

fro

m H

into

ns

pion

eeri

ng w

ork

(11)

, Fro

m th

e se

nten

cege

stal

t, th

e m

odel

can

pro

duce

, as

outp

ut, a

rep

rese

ntat

ion

of th

e ev

ent,

Thi

sev

ent r

epre

sent

atio

n co

nsis

ts o

f a

set o

f pa

irs,

Eac

h pa

ir c

onsi

sts

of a

them

atic

role

and

the

conc

ept t

hat f

ills

that

rol

e, T

oget

her,

the

pair

s de

scri

be th

e ev

ent.

2,

Architecture and processing

The

mod

el c

onsi

sts

of tw

o pa

rts,

One

par

t, th

e se

quen

tial e

ncod

er, s

eque

ntia

llypr

oces

ses

each

con

stitu

ent t

o pr

oduc

e th

e se

nten

ce g

esta

lt. T

he s

econ

d pa

rt is

used

to p

rodu

ce th

e ou

tput

rep

rese

ntat

ion

from

the

sent

ence

ges

talt,

1,

Prod

ucin

g th

e se

nten

ce g

esta

lt

To

proc

ess

the

cons

titue

nt p

hras

es o

f a

sent

ence

, we

adap

ted

an a

rchi

tect

ure

from Jordan (1

5) th

at u

ses

the

outp

ut o

f pr

evio

us processing as input on the

next

i~er

atio

n (s

ee F

ig, 1

), E

ach

cons

titue

nt is

pro

cess

ed in

turn

to u

pdat

e th

ese

nten

ce g

esta

lt. T

o pr

oces

s a

cons

titue

nt, i

t is

firs

t rep

rese

nted

as

a pa

ttern

of

activation over the

current constituent

units

, Act

ivat

ion

from

thes

e un

its

proj

ects

to th

e fi

rst h

idde

n un

it la

yer

and

com

bine

s w

ith th

e ac

tivat

ion

from

the

sentence gestalt

crea

ted

as th

e re

sult

of p

roce

ssin

g th

e pr

evio

us c

onst

ituen

t. T

heac

tual

impl

emen

tatio

n of

this

arr

ange

men

t is

to c

opy

the

activ

atio

n fr

om th

esentence gestalt

to the

previous sentence gestalt

units

, and

allo

w a

ctiv

atio

n to

feed

for

war

d fr

om th

ere,

Act

ivat

ion

in th

e hi

dden

laye

r th

en c

reat

es a

new

...."

""""

,-""

",..,

.".."

. cop

y

-....

".."

"....

"..,.

.,....

..".."

"..,.

."

Fig,

1, T

he a

rchi

tect

ure

of th

e ne

twor

k, T

he b

oxes

hig

hlig

ht th

e. functional parts: Area

processes the constituents into the sentence gestalt, and Area

proc

esse

s th

e se

nten

ce g

esta

lt in

to

the

outp

ut r

epre

sent

atio

n, T

he n

umbe

rs in

dica

te th

e nu

mbe

r of

uni

ts in

eac

h la

yer,

222

F, S

T, J

OH

N A

ND

j,L.

McC

LELL

AN

D

pattern of activation over the

sentence gestalt

units

, The

sen

tenc

e ge

stal

t,

ther

efor

e, is

not

a superimposition of each co

nstit

uent

. Rat

her,

eac

h ne

w

patte

rn in

the

sent

ence

ges

talt

is c

ompu

ted

thro

ugh

two

laye

rs o

f w

eigh

ts a

nd

repr

esen

ts th

e m

odel

's n

ew b

est g

uess

inte

rpre

tatio

n of

the

mea

ning

of

the

sent

ence

,

2,

Pro

duci

ng th

e ou

tput

As

note

d pr

evio

usly

, sev

eral

oth

er m

odel

s ha

ve u

sed

a ty

pe o

f se

nten

ce g

esta

ltto represent a sentence, McClelland and Kawamoto (2

1) u

sed

units

that

repr

esen

ted

the

conj

unct

ion

ot's

eman

tic f

eatu

res

of th

e ve

rb w

ith th

e se

man

ticfe

atur

es o

f a

conc

ept,

To

enco

de a

sen

tenc

e, t

he p

atte

rns

of a

ctiv

ity p

rodu

ced

for

each

ver

b/co

ncep

t con

junc

tion

wer

e ac

tivat

ed in

a s

ingl

e po

ol o

f un

its th

atco

ntai

ned

ever

y po

ssib

le c

onju

nctio

n, S

t. Jo

hn a

nd M

cCle

lland

(30)

use

d a

sim

ilar

conj

unct

ive

repr

esen

tatio

n to

enc

ode

a nu

mbe

r of

sen

tenc

es a

t once

,

The

se r

epre

sent

atio

ns s

uffe

r fr

om in

effic

ienc

y an

d sc

ale

badl

y be

caus

e so

man

y

units are required to represent all of the conjunctions, T

he c

urre

nt m

odel

'

repr

esen

tatio

n is

far

mor

e ef

ficie

nt.

The

mod

el's

eff

icie

ncy

com

es f

rom

mak

ing

the

sent

ence

ges

talt

a tr

aina

ble,

hidd

en u

nit l

ayer

. Mak

ing

the

sent

ence

ges

talt

trai

nabl

e al

low

s th

e ne

twor

k to

crea

te th

e pr

imiti

ves

it ne

eds

to r

epre

sent

the

sent

ence

effi

cien

tly. I

nste

ad o

f

having to represent every possible conjunction, o

nly

thos

e co

njun

ctio

ns th

at

are

usef

ul w

ill b

e le

arne

d an

d ad

ded

to th

e re

pres

enta

tion,

Fur

ther

, these

prim

itive

s do

not

hav

e to

be

conj

unct

ions

bet

wee

n th

e ve

rb a

nd a

con

cept

. Ahidden layer could learn to re

pres

ent c

onju

nctio

ns b

etw

een

the

conc

epts

them

selv

es o

r ot

her

com

bina

tions

of

info

rmat

ion

if th

ey w

ere

usef

ul f

or s

olvi

ngits

task

,S

ince

a la

yer

of h

idde

n un

its c

anno

t be

trai

ned

by e

xplic

itly

spec

ifyin

g its

activ

atio

n va

lues

, we

inve

nted

a w

ay o

f "d

ecod

ing"

the

sent

ence

ges

talt

into

an

outp

ut la

yer,

Bac

kpro

paga

tion

can

then

be

used

to tr

ain

the

hidd

en la

yer,

The

outp

ut la

yer

repr

esen

ts th

e ev

ent a

s a

set o

f th

emat

ic r

ole

and

fille

r pa

irs,

For

example, the event de

scri

bed

by "

The

pitc

her

thre

w th

e ba

ll" would be

repr

esen

ted

as th

e se

t (ag

ent/p

itche

r(ba

ll-pl

ayer

), a

ctio

n/th

rew

(tos

s), p

atie

nt/

ball(

sphe

reH

, The

wor

ds in

par

enth

eses

indi

cate

whi

ch c

once

pts

the

ambi

guou

sw

ords

cor

resp

ond

to,

The

out

put l

ayer

can

rep

rese

nt o

ne r

ole/

fille

r pa

ir a

t a ti

me,

To

deco

de a

part

icul

ar r

ole/

fille

r pa

ir, t

he s

ente

nce

gest

alt i

s pr

obed

with

hal

f of

the

pair

,either the role or the filler, Activation from the probe and the

sent

ence

ges

talt

com

bine

in th

e se

cond

hid

den

laye

r w

hich

in tu

rn a

ctiv

ates

the

entir

e ro

le/f

iller

pair

in th

e ou

tput

laye

r, T

he e

ntire

eve

nt c

an b

e de

code

d in

this

way

by

successively probing with each half of

eac

h pa

ir,

Whe

n m

ore

than

one

con

cept

can

pla

usib

ly f

ill a

rol

e, w

e as

sum

e th

at th

e

corr

ect r

espo

nse

is to

act

ivat

e ea

ch p

ossi

ble

fille

r to

a d

egre

e, T

he d

egre

e of

activation of the units representing ea

ch f

iller

cor

resp

onds

to th

e fi

ller

common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 4: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

223

cond

ition

al p

roba

bilit

y of

occ

urri

ng in

the

give

n co

ntex

t. T

he n

etw

ork

shou

ldle

arn

wei

ghts

to p

rodu

ce th

ese

activ

atio

ns th

roug

h tr

aini

ng, T

o ac

hiev

e th

isgo

al, w

e em

ploy

ed a

n er

ror

mea

sure

in th

e le

arni

ng p

roce

dure

, cro

ss-e

ntro

py(1

3), t

hat c

onve

rges

on

this

goa

l:

c = - L (Tj

log2

+ (1- T)log2

(1-

whe

re

is the target activation and

is the output activation of unit

j,

with

man

y co

nnec

tioni

st le

arni

ng p

roce

dure

s , th

e go

al is

to m

inim

ize

the

erro

rm

easu

re o

r co

st- f

unct

ion

(13)

. The

min

imum

of

C o

ccur

s at

the

poin

t in

wei

ght

spac

e w

here

the

activ

atio

n. v

alue

of e

ach

outp

ut u

nit e

qual

s th

e co

nditi

onal

prob

abili

ty th

at th

e un

it sh

ould

be

on in

the

curr

ent c

onte

xt. I

n th

e m

odel

whe

n th

e ne

twor

k is

pro

bed

with

a p

artic

ular

rol

e, s

ever

al o

f th

e ou

tput

uni

tsre

pres

ent t

he o

ccur

renc

e of

a p

artic

ular

fill

er o

f th

at r

ole,

Whe

n C

is a

t its

min

imum

, the units' a

ctiv

atio

n va

lues

rep

rese

nt th

e co

nditi

onal

pro

babi

lity

ofth

e oc

curr

ence

of

that

fill

er, in that role, g

iven

the

curr

ent s

ituat

ion:

Not

eho

wev

er, t

hat o

n an

y pa

rtic

ular

trai

ning

tria

l , the target of training is

one

part

icul

ar e

vent

, The

min

imum

of

C, t

hen

, is

defi

ned

acro

ss th

e en

sem

ble

oftr

aini

ng e

xam

ples

the

mod

el is

sho

wn,

Prob

ing

with

the

fille

r w

orks

sim

ilarl

y, T

he a

ctiv

atio

n va

lue

of e

ach

role

uni

tin the output layer represents the conditional probability of the pr

obed

fille

rpl

ayin

g th

at r

ole

in th

e cu

rren

t situ

atio

n. In

per

form

ing

grad

ient

des

cent

in C

the

netw

ork

is s

earc

hing

for

wei

ghts

that

allo

w it

to m

atch

act

ivat

ions

to th

ese

cond

ition

al p

roba

bilit

ies.

3. E

nvir

onm

ent a

nd tr

aini

ng r

egim

e

Tra

inin

g co

nsis

ts o

f tr

ials

in w

hich

the

netw

ork

is p

rese

nted

with

a s

ente

nce

and

the

even

t it d

escribes, The rationale is that the la

ngua

ge le

arne

r ex

-pe

rien

ces

som

e ev

ent a

nd th

en h

ears

a s

ente

nce

abou

~ it.

The

lear

ner

proc

esse

sth

e se

nten

ce a

nd c

ompa

res

the

conc

eptu

al r

epre

sent

atio

n its

com

preh

ensi

onmechanism produces to the co

ncep

tual

rep

rese

nt~t

~on

it ob

tain

ed f

rom

ex-

peri

enci

ng th

e ev

ent,

Dis

crep

anci

es a

re u

sed

as f

eedb

ack

for

the

com

preh

en-

sion

mec

hani

sm, T

hese

sen

tenc

e/ev

ent p

airs

wer

e ge

nera

ted

on- l

ine

for

each

trai

ning

tria

l. So

me

pair

s ar

e m

ore

likel

y to

be

gene

rate

d th

an o

ther

s, O

ver

trai

ning

, the

se d

iffe

renc

es in

the

likel

ihoo

d of

gen

erat

ion

tran

slat

e in

to d

iffe

r-ences in training frequency.

The

net

wor

k is

trai

ned

to g

ener

ate

the

even

t fro

m th

e se

nten

ce a

s in

put.

To

I The

situ

atio

n, a

s de

fined

in th

e le

arni

ng p

roce

dure

, is

the

com

bina

tion

of th

e pr

evio

us s

ente

nce

gest

alt ,

the

curr

ent c

onst

ituen

t , a

nd th

e cu

rren

t pro

be. I

t wou

ld b

e de

sira

ble

to d

efin

e th

e si

tuat

ion

sole

ly in

term

s of

the

sequ

ence

of

sent

ence

con

stitu

ents

, Whi

le o

ur r

esul

ts s

ugge

st th

at th

e se

nten

cege

stal

t lea

rns

to s

ave

all t

he r

elev

ant i

nfor

mat

ion

from

ear

lier

cons

titue

nts,

we

have

no

proo

f th

at it

does

,

224

F, S

T, J

OH

N A

ND

J,L

. McC

LELL

AN

D

prom

ote

imm

edia

te p

roce

ssin

g, a

spe

cial

trai

ning

reg

ime

is u

sed.

Aft

er e

ach

cons

titue

nt h

as b

een

proc

esse

d, th

e ne

twor

k is

trai

ned

to p

redi

ct th

e se

t of

role

/fill

er p

airs

of

the

entir

e ev

ent.

From

the

firs

t con

stitu

ent o

f th

e se

nten

ceth

en, t

he m

odel

is forced to try to predict the entire event. This tr

aini

ngre

gim

e, t

here

fore

, ass

umes

that

the

com

plet

e ev

ent i

s av

aila

ble

to th

e le

arni

ngpr

oced

ure

as s

oon

as s

ente

nce

proc

essi

ng b

egin

s, b

ut it

doe

s no

t assume any

spec

ial k

now

ledg

e ab

out w

hich

aspects of the event correspond to which

sent

ence

con

stitu

ents

. Of

cour

se, a

fter

pro

cess

ing

only

the

firs

t con

stitu

ent ,

the

mod

el g

ener

ally

can

not c

orre

ctly

gue

ss th

e en

tire

even

t. B

y fo

rcin

g it

to tr

y,th

is tr

aini

ng p

roce

dure

req

uire

s th

e m

odel

to d

isco

ver

the

map

ping

bet

wee

nco

nstit

uent

s an

d as

pect

s of

the

even

t, a

s it

forc

es th

e m

odel

to e

xtra

ct a

s m

uch

info

rmat

ion

as p

ossi

ble

from

eac

h co

nstit

uent

. Con

sequ

ently

, as

each

new

cons

titue

nt is

pro

cess

ed, t

he m

odel

's p

redi

ctio

ns o

f the

eve

nt a

re r

efin

ed to

refl

ect t

he a

dditi

onal

evi

denc

e it

supp

lies.

4. A

n ill

ustr

atio

n of

pro

cess

ing

An

exam

ple

of h

ow a

trai

ned

netw

ork

proc

esse

s a

sent

ence

will

hel

p ill

ustr

ate

how

the

mod

el w

orks

. To

proc

ess

the

sent

ence

, "

The

teac

her

ate

the

soup

,th

e co

nstit

uent

s of

the

sent

ence

are

pro

cess

ed in

turn

. As

each

con

stitu

ent i

spr

oces

sed

, the

net

wor

k pe

rfor

ms

a ty

pe o

f pat

tern

com

plet

ion.

The

mod

el tr

ies

to p

redi

ct th

e en

tire

even

t by

augm

entin

g th

e in

form

atio

n su

pplie

d by

the

cons

titue

nts

proc

esse

d so

far

with

add

ition

al in

form

atio

n th

at c

orre

late

s w

ithth

e in

form

atio

n su

pplie

d by

the

cons

titue

nts.

With

eac

h ad

ditio

nal c

onst

ituen

t , th

e m

odel

' s p

redi

ctio

ns im

prov

e. E

arly

inthe sentence, m

any

poss

ible

eve

nts

are

cons

iste

nt w

ith w

hat l

ittle

is k

now

nab

out t

he s

ente

nce

so f

ar, T

he c

ompl

etio

n pr

oces

s ac

tivat

es e

ach

of th

ese

alte

rnat

ives

slig

htly

, acc

ordi

ng to

thei

r su

ppor

t. A

s m

ore

cons

titue

nts

are

proc

esse

d, the additional evidence m

ore

stro

ngly

sup

port

s fe

wer

pos

sibl

eev

ents

.T

he p

atte

rn o

f ac

tivat

ion

over

the

sent

ence

ges

talt

can

be o

bser

ved

dire

ctly

,an

d re

spon

ses

to p

robe

s ca

n be

exa

min

ed, t

o se

e w

hat i

t is

repr

esen

ting

afte

rprocessing each constituent of the sentence (see Fig. 2). After processing the

first

con

stitu

ent, "

The

teac

her

" of

our

exa

mpl

e se

nten

ce, t

he n

etw

ork

assu

mes

the sentence is in the active voice and therefore assigns

teac

her

to the agent

role

, The

net

wor

k al

so f

ills

in th

e se

man

tic f

eatu

res

of te

ache

rs (

pers

on, a

dult,

and

fem

ale)

acc

ordi

ng to

its

prev

ious

exp

erie

nce

with

teac

hers

. Whe

n pr

obed

with

the

actio

n ro

le, t

he n

etw

ork

wea

kly

activ

ates

a n

umbe

r of

pos

sibl

e ac

tions

whi

ch th

e te

ache

r pe

rfor

ms.

The

net

wor

k si

mila

rly m

akes

gue

sses

abo

ut th

eot

her

role

s fo

r w

hich

it is

pro

bed.

Whe

n th

e se

cond

con

stitu

ent, "

ate" is processed, the sentence ge

stal

t is

refi

ned

to r

epre

sent

the

new

info

rmat

ion,

In

addi

tion

to r

epre

sent

ing

both

that

teac

her

is the agent and that

ate

is the action, t

he n

etw

ork

is a

ble

to m

ake

bette

r gu

esse

s ab

out t

he o

ther

rol

es, F

or e

xam

ple ,

it in

fers

that

the

patie

nt is

common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 5: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

The

teac

her

ate

the

soup

.

. Sen

tenc

e G

esta

lt A

ctiv

atio

nsR

ole/

Fill

er A

ctiv

atio

nsun

itag

ent

I:J

I:J

8:J

pers

onad

uhI:

J8:

Jm

ale

8:J

I:J

8:J

fem

ale

bus

driv

erte

ache

r8:

JI:

Jac

tion

cons

umed

I:J

ate

I:J

8:J

gave

I:J

thre

w(h

ost)

8:J

drov

e(m

otiv

. ) I:J

8:J

patie

ntpe

rson

8:J

8:J

aduh

I:J

child

I:J

fem

ale

8:J

8:J

8:J

scho

olgi

rlI:

J8:

J8:

J8:

J18

1ba

ll(~a

rty)

8:J

8:J

stea

soup

IIJ

181

crac

kers

I:J

225

Fig. 2. The evolution

of

the

sent

ence

ges

talt

duri

ng p

roce

ssin

g. O

n th

e le

ft, the activation

of

part

of

the

sent

ence

ges

talt

is s

how

n af

ter

each

sen

tenc

e co

nstit

uent

has

bee

n pr

oces

sed.

On

the

righ

tth

e ac

tivat

ion

of s

elec

ted

outp

ut u

nits

is s

how

n w

hen

the

evol

ving

ges

talt

is p

robe

d w

ith e

ach

role

.T

he

#s

correspond to the number

of

cons

titue

nts

that

hav

e be

en p

rese

nted

to th

e ne

twor

k at

that

poin

t. #1

mea

ns th

e ne

twor

k ha

s se

en "

The

teac

her;

" #2

mea

ns it

has

see

n " T

he te

ache

r at

e;et

c. T

he a

ctiv

atio

ns (

rang

ing

betw

een

0 an

d 1)

are

dep

icte

d as

the

dark

ened

are

a of

eac

h bo

x.

food. Since, in the network's experience, teachers ty

pica

lly e

at s

oup,

the

netw

ork

prod

uces

act

ivat

ion

corr

espo

ndin

g to

the

infe

renc

e th

at th

e fo

od is

soup

. After the third constituent is processed, t

he n

etw

ork

has

settl

ed o

n an

inte

rpre

tatio

n of

th

e se

nten

ce. T

he th

emat

ic r

oles

are

rep

rese

nted

with

thei

rap

prop

riate

fille

rs.

5. S

peci

fics

of th

e m

odel

1.

Input representation

Each sentence constituent can be thought

of

as a

sur

face

rol

e/fi

ller

pair

. It

cons

ists

of

one unit indicating the surface role

of

the

cons

titue

nt a

nd o

ne u

nit

repr

esen

ting

each

wor

d in

the

constituent. One unit stands for each

of

verb

s , 3

1 no

uns ,

4 p

repo

sitio

ns, 3

adv

erbs

, and 7 ambiguous words, Two

of

the

ambi

guou

s w

ords

hav

e tw

o ve

rb m

eani

ngs ,

thre

e ha

ve tw

o no

un m

eani

ngs ,

and

two

have

a v

erb

and

a no

un m

eani

ng. S

ix o

f the

wor

ds a

re v

ague

term

s (e

.so

meo

ne, s

omet

hing

, and

foo

d). F

or p

repo

sitio

nal p

hras

es, t

he p

repo

sitio

n an

d

226

F. S

T. J

OH

N A

ND

j,L,

McC

LELL

AN

D

the

noun

are

eac

h re

pres

ente

d by

a u

nit i

n th

e in

put.

For

the

verb

con

stitu

ent

the presence

of

the

auxi

liary

ver

b "w

as"

is li

kew

ise

enco

ded

by a

sep

arat

e un

it.

Art

icle

s ar

e no

t rep

rese

nted

, and

nou

ns a

re a

ssum

ed to

be

sing

ular

and

defi

nite

thro

ugho

ut.

The

sur

face

rol

e, o

r lo

catio

n, o

f ea

ch c

onst

ituen

t is

code

d by

fou

r un

its th

atre

pres

ent l

ocat

ion

resp

ectiv

.eto

the

verb

: pre

-ver

bal,

verb

al, first-po

st-v

erba

l,

and

n-po

st-v

erba

l. T

he fi

rst-post-verbal unit is active for the co

nstit

uent

imm

edia

tely

follo

win

g th

e ve

rb, a

nd th

e n-

post-verbal unit is active for any

cons

titue

nt o

ccur

ring

afte

r th

e fir

st-p

ost-

verb

al c

onst

ituen

t. A

num

ber

of

cons

titue

nts,

ther

efor

e, m

ay' s

hare

the

n-po

st-v

erba

l pos

ition

. The

sen

tenc

eT

he b

all w

as h

it by

som

eone

with

the

bat i

n th

e pa

rk"

wou

ld b

e en

code

d as

the

follo

win

g or

dere

d se

t in

whi

ch th

e w

ords

in p

aren

thes

es r

epre

sent

uni

ts in

the

inpu

t tha

t are

on

for

each

con

stitu

ent (

(pre

-ver

bal,

bal

l), (

verb

al, w

as, h

it),

(fir

st-p

ost-

verb

al, b

y, s

omeo

ne),

(n-

post

-ver

bal,

with

, bat), (n-po

st-v

erba

l, in

,

park

)). W

ithou

t the

sur

face

rol

es, t

he n

etw

ork

mus

t lea

rn to

use

the

tem

pora

lor

der

of

the

cons

titue

nts

to p

rodu

ce s

ynta

ctic

con

stra

ints

. Add

ition

al s

imul

a-tio

n ha

s sh

own

that

the

netw

ork

can

lear

n th

e co

rpus

with

out t

he s

urfa

ce r

oles

.In

tere

stin

gly,

rem

ovin

g th

e su

rfac

e ro

les

did

not s

low

dow

n le

arni

ng,

2.

Out

put r

epre

sent

atio

n

The

out

put h

as o

ne u

nit f

or each of 9 po

ssib

le th

emat

ic r

oles

(e.g., agent

actio

n, p

atie

nt, i

nstr

umen

t) a

nd o

ne u

nit f

or e

ach

of 4

5 co

ncep

ts, i

nclu

ding

28

noun

con

cept

s, 1

4 ac

tions

, and

3 a

dver

bs. A

dditi

onal

ly, t

here

is a

uni

t for

the

pass

ive

voic

e. F

inal

ly, t

here

are

13

"fea

ture

" units

, suc

h as

mal

e, fe

mal

e, a

nd

adul

t. T

hese

uni

ts a

re in

clud

ed in

the

outp

ut to

allo

w th

e de

mon

stra

tion

mor

e su

btle

eff

ects

of

cons

trai

nts

on in

terp

reta

tion

(see

App

endi

x A

for

the

complete set

of

role

s an

d co

ncep

ts),

Thi

s re

pres

enta

tion

is n

ot m

eant

to b

eco

mpr

ehen

sive

. Ins

tead

, it i

s m

eant

to p

rovi

de a

con

veni

ent w

ay to

trai

n an

ddemonstrate the processing abilities

of

the

netw

ork.

Any

one

role

/fill

er

patte

rn, t

hen,

con

sist

s of

two

part

s. F

or th

e ro

le, o

ne o

f th

e 9

role

uni

ts s

houl

dbe

act

ive

, and

for

the

fille

r, a

uni

t rep

rese

ntin

g th

e co

ncep

t, a

ctio

n, o

r ad

verb

should be active. If relevant, some

of the feature units or the passive voice unit

shou

ld b

e ac

tive.

3.

Tra

inin

g en

viro

nmen

t

Whi

le th

e se

nten

ces

ofte

n in

clud

e ambiguous or vague words

, the

eve

nts

are

always specific and complete: each event consists

of

a sp

ecif

ic a

ctio

n an

d ea

ch

2 A

sec

ond output layer was included in the si

mul

atio

ns, T

his

laye

r re

prod

uced

the

sent

ence

cons

titue

nt th

at li

ts w

ith th

e ro

le/li

ller

pair

bei

ng p

robe

d. C

onse

quen

tly, t

he m

odel

was

req

uire

d to

reta

in th

e sp

ecili

c w

ords

in th

e se

nten

ce a

s w

ell a

s th

eir

mea

ning

, Sin

ce this aspect of the

processing does not lit into the context of the current discussion, these units are not discussed

furt

her,

Add

ition

al s

imul

atio

n ha

s sh

own

that

this

ext

ra d

eman

d on

the

netw

ork

does

not

qual

itativ

ely

affe

ct it

s pe

rfor

man

ce.

common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 6: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

227

them

atic

rol

e re

late

d to

this

act

ion

is fi

lled

by s

ome

spec

ific

con

cept

. Acc

ord-

ingl

y, e

ach

even

t occ

urs

in a

par

ticul

ar location, and actions requiring an

inst

rum

ent a

lway

s ha

ve a

spe

cific

inst

rum

ent.

Sen

tenc

e/ev

ent p

airs

are

cre

ated

on-

line during training from scaffoldings

calle

d se

nten

ce-frames, The sentence-fr

ames

spe

cify

whi

ch th

emat

ic r

oles

and

fille

rs c

an b

e us

ed w

ith th

at a

ctio

n. E

ach

of th

e 14 actions has a separate

sent

ence

-fra

me,

Fou

r ad

ditio

nal f

ram

es w

ere

mad

e to

cov

er p

assi

ve v

ersi

ons

ofsentences involving the actions

kiss

ed, s

hot,

hit

and

gave

,T

o cr

eate

a s

ente

nce/

even

t pai

r, a

sen

tenc

e-fr

ame

is p

icke

d at

ran

dom

and

then

eac

h th

emat

ic r

ole

is p

roce

ssed

in tu

rn. (

App

endi

x B

con

tain

s a

sam

ple

sent

ence

-fra

me.) For example, let's assume that the

Ate

se

nten

ce-f

ram

e is

chos

en. A

gent

is th

e fi

rst r

ole

proc

esse

d, F

irst ,

a c

once

pt to

fill

the

role

isselected from the set of concepts that can play the agent role in the

Ate

sent

ence

-fra

me,

Thi

s ro

le/f

iller

pai

r is

add

ed to

the

even

t description, Since

som

e ro

les

, suc

h as

inst

rum

ent a

nd lo

catio

n, m

ay n

ot b

e m

entio

ned

in th

ese

nten

ce, i

t is

rand

omly

det

erm

ined

, acc

ordi

ng to

a p

rese

t pro

babi

lity,

whe

ther

a ro

le w

ill b

e in

clud

ed in

the

sent

ence

. If t

he r

ole

is to

be

incl

uded

, a w

ord

isch

osen

to r

epre

sent

the

fille

r in

the

sent

ence

, Oth

erw

ise

, the

rol

e is

left

out

of

the

sent

ence

, but

it is

stil

l included in the event description, Since the agent

role

mus

t be

incl

uded

in s

ente

nces

abo

ut e

atin

g, it

is p

lace

d in

the

sent

ence

and

a w

ord

is ch

osen, Assuming

busd

rive

r is

cho

sen

as th

e fil

ler

conc

ept,

a

wor

d to

desc

ribe

bu

sdri

ver

is s

elec

ted,

For

exa

mpl

e, t

he w

ord

"som

eone

might be chosen.

Nex

t , the action role is processed, Since

the

Ate

se

nten

ce-f

ram

e is

bei

ngus

ed, the action must be

ate,

A

wor

d to

desc

ribe

at

e is

then

cho

sen:

"co

n-su

med

" fo

r ex

ampl

e. T

hen

the

patie

nt is

cho

sen,

The

pro

babi

litie

s of

cho

osin

gpa

rtic

ular

pat

ient

s de

pend

upo

n w

hat h

as b

een

sele

cted

for

the

agen

t and

actio

n. G

iven

the

selection of

busd

rive

r as the agent

stea

k is

a m

uch

mor

elikely patient than

soup

. L

et's

assume that

stea

k is

sel

ecte

d, a

nd th

at th

e w

ord

stea

k" is

cho

sen

to r

epre

sent

it. I

n ge

nera

l, by

cha

ngin

g th

e pr

obab

ilitie

s of

sele

ctin

g sp

ecif

ic f

iller

s ac

cord

ing

to w

hich

oth

er f

iller

s ha

ve b

een

sele

cted

so

far,

sta

tistic

al r

egul

ariti

es a

mon

g th

e fi

llers

will

dev

elop

acr

oss

the

corp

us,

In th

e sa

me

way

, the

rem

aini

ng r

ole/

fille

r pa

irs' f

or th

e se

nten

ce-f

ram

e ar

ege

nera

ted.

Ass

umin

g on

ly th

e fir

st th

ree

role

s ar

e ch

osen

to b

e in

clud

ed in

this

sent

ence

, the input sentence w

ill b

e

, "

Som

eone

con

sum

ed th

e st

eak," The

even

t will

be

the

entir

e se

t of

role

/fill

er pairs (agent/busdriver, action/ate,

patie

nt /

stea

k, i

nstr

umen

t / k

nife

, loc

atio

n /li

ving

-roo

m, etc,

In th

is w

ay, 1

20 d

iffer

ent e

vent

s ca

n be

gen

erat

ed fr

om th

e se

t of frames

with

som

e be

ing

mor

e lik

ely

to a

ppea

r th

an o

ther

s. T

he m

ost f

requ

ent e

vent

occu

rs, o

n av

erag

e, 5

, 5 ti

mes

per

100

tria

ls, b

ut th

e le

ast f

requ

ent e

vent

occ

urs

only 9 times per 10, 0

00. T

he n

umbe

r of

wor

ds th

at c

an b

e ch

osen

to d

escr

ibe

an e

vent

and

the

optio

n to

incl

ude

or e

limin

ate

optio

nal c

onst

ituen

ts f

rom

the

sent

ence

bri

ngs

the

num

ber

of s

ente

nce/

even

t pai

rs to

22

645,

228

F. S

T. J

OH

N A

ND

J.L

. McC

LELL

AN

D

In training the model on this corpus, sentence-fr

ames

wer

e pi

cked

at

rand

om, a

nd s

ente

nce/

even

t pai

rs w

ere

gene

rate

d ac

cord

ing

to th

e ra

ndom

proc

edur

e de

scrib

ed a

bove

, No

spec

ific

sent

ence

s or

eve

nts

wer

e se

t asi

de to

not b

e tr

aine

d.T

he s

ente

nces

are

lim

ited

in c

ompl

exity

bec

ause

of

the

limita

tions

of

the

even

t rep

rese

ntat

ion,

Onl

y on

e fi

ller

can

be assigned to a role in a particular

sent

ence

, Als

o, a

ll th

e ro

les

are

assu

med

to b

elon

g to

the

sent

ence

as

a w

hole

,T

here

fore

, no

embe

dded

cla

uses

or

phra

ses

atta

ched

to s

ingl

e co

nstit

uent

s ar

e

poss

ible

,

5.4,

T

rain

ing

proc

edur

e de

tails

Eac

h tr

aini

ng tr

ial c

onsi

sts

of f

irst

gen

erat

ing

a se

nten

ce/e

vent

pai

r, a

nd th

enpresenting the sentence to the m

odel

for

trai

ning

. The

constituents are

pres

ente

d to

the

mod

el s

eque

ntia

lly, o

ne a

t a ti

me,

Aft

er th

e m

odel

pro

cess

es a

cons

titue

nt, t

he m

odel

is p

robe

d w

ith e

ach

half

of

each

rol

e/fi

ller

pair

for

the

entir

e ev

ent.

The

err

or p

rodu

ced

on e

ach

prob

e is

col

lect

ed a

nd p

ropa

gate

dba

ckw

ard

thro

ugh

the

netw

ork

(ct.

(28)

), T

he weight changes from each

sent

ence

tria

l are

add

ed to

geth

er a

nd u

sed

to u

pdat

e th

e w

eigh

ts a

fter

ever

y 60

trials. The learning rate,

E,

was

set

to O

,()(

)()5

, and

mom

entu

m w

as s

et to

0,

No

atte

mpt

was

mad

e to

opt

imiz

e th

ese

valu

es, s

o it

is li

kely

that

lear

ning

tim

eco

uld

be im

prov

ed b

y tu

ning

thes

e pa

ram

eter

s.

3. Results

1. O

vera

ll pe

rfor

man

ce

Firs

t, we will assess the model's

abili

ty to

com

preh

end

sent

ence

s ge

nera

lly,

The

n w

e w

ill e

xam

ine

the

mod

el's

abi

lity

to f

ulfi

ll ou

r sp

ecifi

c pr

oces

sing

goa

ls,

and

we

will

examine the development of the m

odel

's p

erfo

rman

ce a

cros

str

aini

ng tr

ials

, Fin

ally

, we

will

dis

cuss

the

mod

el's

abi

lity

to g

ener

aliz

e,O

nce

the

mod

el w

as a

ble

to p

roce

ss c

orre

ctly

bot

h active and passive

sent

ence

s, th

e si

mul

atio

n w

as s

topp

ed a

nd e

valu

ated

. Cor

rect

pro

cess

ing

was

defin

ed a

s ac

tivat

ing

the

corr

ect u

nits

mor

e st

rong

ly th

an th

e in

corr

ect u

nits

.After 330

000

rand

om s

ente

nce

tria

ls, t

he m

odel

beg

an c

orre

ctly

pro

cess

ing

the

passive sentences in the corpus.

A s

et o

f 10

0 te

st s

ente

nce/

even

t pai

rs w

ere

gene

rate

d ra

ndom

ly f

rom

the

corpus, These sentence/event pairs were generated in the sa

me

way

the

trai

ning

sen

tenc

es w

ere

gene

rate

d ex

cept

that

they

wer

e ge

nera

ted

with

out

rega

rd to

thei

r fr

eque

ncy

durin

g tr

aini

ng, s

o se

ldom

pra

ctic

ed p

airs

wer

e as

likel

y to

app

ear

in th

e te

st s

et a

s fr

eque

ntly

pra

ctic

ed p

airs

, Of

thes

e pa

irs;

w

ere

set a

side

for

sep

arat

e an

alys

is b

ecau

se th

ey w

ere

ambi

guou

s: a

t lea

st tw

odi

ffer

ent i

nter

pret

atio

ns c

ould

be

deri

ved

from

eac

h (e

.g, S

omeo

ne a

te s

ome-

thin

g), O

f the

rem

aini

ng s

ente

nce/

even

t pai

rs, e

very

sen

tenc

e co

ntai

ned

at

common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 7: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

229

leas

t one

vag

ue o

r am

bigu

ous

wor

d, y

et e

ach

had

only

one

inte

rpre

tatio

n.T

hese

una

mbi

guou

s se

nten

ce/e

vent

pai

rs w

ere

test

ed b

y fi

rst a

llow

ing

the

mod

el to

pro

cess

all

of th

e co

nstit

uent

s of

the

sent

ence

. The

n th

e m

odel

was

prob

ed w

ith e

ach

half

of

each

con

stitu

ent t

hat w

as m

entio

ned

in th

e se

nten

ce.

The

out

put p

rodu

ced

in r

espo

nse

to e

ach

prob

e w

as c

ompa

red

to th

e ta

rget

outp

ut. F

igur

e 3

pres

ents

a h

isto

gram

of

the

resu

lts.

Una

mbi

guou

s S

ente

nces

1ft

~ ~

~ ~ 1f

t

cros

a-en

lrop

y

Fig,

3, H

isto

gram

of

the

cros

s-en

trop

y er

ror

for

rand

om s

ente

nces

aft

er 3

3000

0 se

nten

ce tr

ials

,

The

sen

tenc

es w

ere

draw

n ra

ndom

ly f

rom

the corpus without regard to their fr

eque

ncy,

Acr

oss-

entr

opy

mea

sure

of

betw

een

0 an

d 10

res

ults

from sentences that are processed almost

perf

ectly

, Onl

y sm

all e

rror

s oc

cur

whe

n an

out

put u

nit s

houl

d be

com

plet

ely

activ

ate

(with

a v

alue

of I), but only obtains an activation of 0,7 or 0,8,

or

whe

n a

unit

shou

ld h

ave

an a

ctiv

atio

n of

0,

but h

as a

n ac

tivat

ion

of 0

,1 or 0,2,

Cro

ss-e

ntro

py e

rror

s of

bet

wee

n 15

and

20

occu

r w

hen

one

ofthe role/filler pairs is incorrect, For example, if

teac

her

wer

e su

ppos

ed to

be

the

agen

t, bu

t the

network activates

busd

r;ve

ran

err

or o

f abo

ut 1

5 w

ould

res

ult,

230

F, S

T, J

OH

N A

ND

j,L.

McC

LELL

AN

D

For these unambiguous se

nten

ces,

the

cros

s-en

trop

y, s

umm

ed o

ver

con-

stitu

ents

, ave

rage

d 3.

9 pe

r se

nten

ce, A

noth

er m

easu

re o

f pe

rfor

man

ce is

the

num

ber

of ti

mes

an

outp

ut u

nit t

hat s

houl

d be

on

is le

ss a

ctiv

e th

an a

n ou

tput

unit

that

sho

uld

be o

ff. T

he id

ea b

ehin

d th

is m

easu

re is

that

as

long

as

the

corr

ect u

nit w

ithin

any

set

, suc

h as

peo

ple

or g

ende

r, is

the

mos

t act

ive

, it c

anw

in a

com

petit

ion

with

the

othe

r un

its in

that

set

. Che

ckin

g th

at a

ll of

the

corr

ect u

nits

are

mor

e ac

tive

than

any

of t

he in

corr

ect u

nits

is a

qui

ck, a

ndco

nser

vativ

e, w

ay o

f ca

lcul

atin

g th

is m

easu

re. A

n in

corr

ect u

nit w

as m

ore

active in 14 out of the

17W

. po

ssib

le c

ases

, or

on 0

.8%

of t

he o

ppor

tuni

ties.

The

14

erro

rs w

ere

i:lis

trib

uted

ove

r 8

of th

e 55

sen

tenc

es..

In

5 of

the

8se

nten

ces,

the

erro

r in

volv

~d

the

inco

rrec

t ins

tant

iatio

n of

the

spec

ific

conc

ept

or a

fea

ture

of

that

con

cept

, ref

erre

d to

by

a va

gue

wor

d. T

wo

othe

r er

rors

invo

lved

the

inco

rrec

t act

ivat

ion

of th

e co

ncep

t rep

rese

ntin

g a

non

vagu

e w

ord.

In each case, the in

corr

ect c

once

pt was similar to the correct concept.

The

refo

re, e

rror

s w

ere

not r

ando

m; t

hey

invo

lved

the

mis

activ

atio

n of

a s

imila

rco

ncep

t or

the

mis

activ

atio

n of

a f

eatu

re o

f a

sim

ilar

conc

ept.

The

err

ors

in th

ere

mai

ning

sen

tenc

e in

volv

ed th

e in

corr

ect a

ssig

nmen

t of

them

atic

rol

es in

pa

ssiv

e, r

ever

sibl

e se

nten

ce: "

Som

eone

hit

the

pitc

her" (see the section on

lear

ning

for

a d

iscu

ssio

n of

this

pro

blem

).A

dditi

onal

pra

ctic

e, o

f co

urse

, im

prov

ed th

e m

odel

's p

erfo

rman

ce. I

mpr

ove-

men

t is

slow

, how

ever

, bec

ause

the

sent

ence

s pr

oces

sed

inco

rrec

tly a

re r

ela-

tively rare. After a total of 630

000

tria

ls, t

he n

umbe

r of

sentences having a

cros

s-en

trop

y hi

gher

than

15

drop

ped

from

3 to

1. T

he n

umbe

r of

err

ors

drop

ped

from

14

to 1

1.

2. P

erfo

rman

ce o

n sp

ecifi

c ta

sks

Our

spe

cific

inte

rest

was

to d

evel

op a

pro

cess

or th

at c

ould

cor

rect

ly p

erfo

rmse

vera

l im

port

ant l

angu

age

com

preh

ensi

on ta

sks.

Fiv

e ty

pica

l sen

tenc

es w

ere

draw

n fr

om th

e co

rpus

to te

st e

ach

proc

essi

ng ta

sk. T

he c

ateg

orie

s an

d on

eex

ampl

e se

nten

ce f

or e

ach

are

pres

ente

d in

Tab

le 1

. The

par

enth

eses

den

ote

the

impl

icit

, to

be in

ferr

ed, role.

Table 1

Tas

k ca

tego

ries

Cat

egor

yE

xam

ple

Role assignment

Active semantic

Act

ive

synt

actic

Pass

ive

sem

antic

Pass

ive

synt

actic

Wor

d am

bigu

ityC

once

pt in

stan

tiatio

nR

ole

elab

orat

ion

The

sch

oolg

irl s

tirre

d th

e ko

ol-a

id w

ith a

spo

on,

The

bus

driv

er g

ave

the

rose

to th

e te

ache

r,T

he b

all w

as h

it by

the

pitc

her,

The

bus

driv

er w

as g

iven

the

rose

by

the

teac

her,

The

pitc

her

hit t

he b

at w

ith th

e ba

t,T

he te

ache

r ki

ssed

som

eone

,T

he te

ache

r at

e th

e so

up (

with

a s

poon

),

common
Pencil
common
Pencil
common
Pencil
Page 8: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

The

fir

st c

ateg

ory

invo

lves

rol

e as

sign

men

t. T

he c

ateg

ory

was

div

ided

into

four sub-categories based on the type of information available to help assign

the

corr

ect t

hem

atic

rol

es to

con

stitu

ents

, Sen

tenc

es in

the

activ

e se

man

ticgr

oup

cont

ain

sem

antic

info

rmat

ion

that

can

hel

p as

sign

rol

es, I

n th

e ex

ampl

efr

om T

able

I, o

f th

e co

ncep

ts r

efer

red

to in

the

sent

ence

, only the schoolgirl

can

play

the

role

of a

n ag

ent o

f stir

ring,

The

net

wor

k ca

n th

eref

ore

use

that

sem

antic

info

rmat

ion

to a

ssig

n sc

hool

girl

to th

e ag

ent r

ole,

Sim

ilarly

, koo

l-aid

is s

omet

hing

that

can

be

stirr

ed, b

ut c

anno

t stir

or

be u

sed

to s

tir s

omet

hing

else

, Aft

er e

ach

sent

ence

was

pro

cess

ed, t

he s

ente

nce

gest

alt w

as p

robe

d w

ithth

e fi

ller

half

of

each

rol

e/fi

ller

pair

, The

net

wor

k th

en h

ad to

com

plet

e th

epa

ir b

y fi

lling

in th

e co

rrec

t the

mat

ic r

ole,

For

eac

h pa

ir, i

n ea

ch s

ente

nce

, the

unit

repr

esen

ting

the

corr

ect r

ole

was

the

mos

t act

ive,

Sen

tenc

es in

the

pass

ive

sem

antic

cat

egor

y ar

e pr

oces

sed

equa

lly w

ell,

Of

cour

se th

e se

man

tic k

now

l-ed

ge n

eces

sary

to p

erfo

rm th

is ta

sk is

nev

er provided in the input or pro-

gram

med

into

the

netw

ork,

Ins

tead

, it m

ust b

e de

velo

ped

inte

rnal

ly in

the

sent

ence

ges

talt

as th

e ne

twor

k le

arns

to p

roce

ss s

ente

nces

,Sy

ntac

tic in

form

atio

n do

es n

ot h

ave

to b

e us

ed in

thes

e ca

ses;

the

sem

antic

cons

trai

nts

suff

ice.

In

fact

, if

the

surf

ace

loca

tion

of th

e co

nstit

uent

s is

rem

oved

from

the

inpu

t , th

e ro

les

are

still

ass

igne

d co

rrec

tly, F

urth

er, i

f th

e co

nstit

uent

sar

e pr

esen

ted

in d

iffer

ent o

rder

s, t

he a

ctiv

atio

n va

lues

in th

e ou

tput

are

affe

cted

onl

y sl

ight

ly. S

ente

nce

proc

essi

ng th

at c

an r

ely

on s

eman

tic in

form

a-tio

n, t

here

fore

, ess

entia

lly d

oes ,

thou

gh c

onfu

sing

the

synt

ax a

ppea

rs to

hav

e a

slig

ht c

orru

ptin

g ef

fect

,T

he r

elat

ive

stre

ngth

s of

syn

tact

ic a

nd s

eman

tic c

onst

rain

ts a

re d

eter

min

edby

thei

r re

liabi

lity

in th

e tr

aini

ng c

orpu

s, T

he m

ore

relia

ble

a co

nstr

aint

, the

mor

e po

tent

its

infl

uenc

e in

pro

cess

ing,

Thi

s ef

fect

of

the

corp

us o

n la

ter

proc

essi

ng is

als

o fo

und

in n

atur

al la

ngua

ges,

Wor

d or

der

in E

nglis

h is

ver

yre

liabl

e an

d is

a v

ery

stro

ng c

onst

rain

t on

the

mea

ning

of

a se

nten

ce, I

n It

alia

n,ho

wev

er, w

ord

orde

r is

less

rel

iabl

e an

d is

a m

uch

wea

ker

cons

trai

nt w

hich

can

be over-ridden by semantic co

nstr

aint

s, C

onse

quen

tly, '

the

sent

ence

"T

hepe

ncil

kick

ed th

e co

w" in English is

take

n to

mea

n th

at th

e pe

ncil

did

the

kick

ing

beca

use

of th

e w

ord

orde

r , w

hile

in I

talia

n it

is ta

ken

to m

ean

that

the

cow

did

the

kick

ing

beca

use

of th

e se

man

tics

of th

e si

tuat

ion

(17)

,T

o pr

oces

s se

nten

ces

in th

e ac

tive

and

pass

ive

synt

actic

cat

egor

ies

, how

ever

the

netw

ork

cann

ot r

ely

entir

ely

on s

eman

tic c

onst

rain

ts :t

o as

~ign

them

atic

role

s, S

ente

nces

in th

ese

cate

gori

es w

ere

crea

ted

by in

clud

ing in the corpus

pair

s of

rev

ersi

ble

even

ts, s

uch

as th

e bu

sdri

ver

givi

ng a

ros

e to

the

teac

her

and

the

teac

her

givi

ng a

rose to the busdriver, Both of these events w

ere

trai

ned

with

equ

al f

requ

ency

. With

out a

dif

fere

nce

in f

requ

ency

, the

re is

no

sem

antic

reg

ular

ity to

hel

p pr

edic

t whi

ch o

f th

e tw

o ev

ents

a s

ente

nce

refe

rs to

,T

he m

odel

mus

t rel

y on

syn

tact

ic in

form

atio

n, s

uch

as w

ord

orde

r , to assign

the

them

atic

rol

es, P

assi

ve s

ente

nces

fur

ther

com

plic

ate

proc

essi

ng b

y m

akin

gw

ord

orde

r , b

y its

elf ,

unp

redi

ctiv

e, T

he p

ast p

artic

iple

and

the

"" preposi-

231

232

F, S

T, J

OH

N A

ND

J.L

. McC

LELL

AN

D

tion

prov

ide

cues

des

igna

ting

the

pass

ive,

but

in th

emse

lves

do

not c

ue w

hich

pers

on p

lays

whi

ch r

ole

eith

er. T

he w

ord

orde

r in

form

atio

n m

ust b

e us

ed in

conj

unct

ion

with

the

pass

ive

cues

to d

eter

min

e th

e co

rrec

t rol

e as

sign

men

ts,

Whe

n se

nten

ces

in th

e sy

ntac

tic c

ateg

orie

s w

ere

test

ed, f

or e

ach

role

/fill

erpa

ir in

eac

h te

st s

ente

nce,

the

corr

ect r

ole

was

the

mos

t act

ive,

Fig

ure

4provides an example of role assignment in the semantic and sy

ntac

ticca

tego

ries

,T

he r

emai

ning

thre

e ca

tego

ries

invo

lve

the

use

of c

onte

xt to

hel

p sp

ecify

the

conc

epts

ref

erre

d to

in a

sen

tenc

e, S

ente

nces

in th

e w

ord

ambi

guity

cat

egor

ycontain one or m

ore

ambi

guou

s w

ords

, Aft

er processing a sentence, the

netw

ork

was

pro

bed

with

the

role

hal

f of

eac

h ro

le/f

iller

pai

r, T

he o

utpu

tpa

ttern

s fo

r th

e fil

lers

wer

e th

en e

xam

ined

, Fig

ure

5 pr

ovid

es a

n ex

ampl

ese

nten

ce w

ith a

mbi

guou

s w

ords

, For

all

pair

s in

eac

h te

st s

ente

nce,

the

corr

ect

fille

r w

as th

e m

ost a

ctiv

e,Disambiguation requires the competition and cooperation of co

nstr

aint

sfr

om b

oth

the

wor

d an

d its

con

text

. Whi

le th

e w

ord

itsel

f cue

s tw

o di

ffer

ent

inte

rpre

tatio

ns, t

he c

onte

xt f

its o

nly

one,

In

"The

pitc

her

hit t

he b

at w

ith th

eba

t

" "

pitc

her"

cues both

cont

aine

r an

d ba

ll-pl

ayer

, T

he c

onte

xt c

ues

both

ball-

play

er

and

busd

rive

r be

caus

e th

e m

odel

has

see

n se

nten

ces

invo

lvin

g bo

thpe

ople

hitt

ing

bats

, All

the

constraints supporting

ball-

play

er

com

bine

, and

agen

t ac

tion

CJ

patie

nt C

Jin

stru

men

t CJ

loca

tion

CJ

co-a

gent

CJ

co-p

atie

nt C

Jre

cipi

ent C

Jsc

hool

girl

The

sch

oolg

irl s

tirre

d th

e ko

ol-a

ld w

ith s

spo

on.

agen

t CJ

agen

t CJ

actio

n ac

tion

CJ

patie

nt C

J pa

tient

in

stru

men

t CJ

inst

rum

ent C

Jlo

catio

n C

J lo

catio

n C

Jco

-age

nt C

J co

-age

nt C

Jco

-pat

ient

CJ

co-p

atie

nt C

Jre

cipi

ent C

J re

cipi

ent C

Jst

irre

d ko

ol-s

ld

agen

t CJ

actio

n C

Jpa

tient

CJ

inst

rum

ent

loca

tion

CJ

co-a

gent

CJ

co-p

atie

nt C

Jre

cipi

ent C

Jsp

oon

The

bus

drlv

er w

ss g

iven

the

rose

by

the

tesc

her.

agen

t CJ

agen

t CJ

agen

t ac

tion

CJ

actio

n ac

tion

patie

nt

patie

nt

patie

nt

inst

rum

ent

inst

rum

ent

inst

rum

ent

loca

tion

loca

tion

loca

tion

co-a

gent

co

-age

nt

co-a

gent

co

-pat

ient

0 co-

patie

nt 0 co-

patie

nt

reci

pien

t re

cipi

ent

reci

pien

t CJ

bus driver wss given

rose

(nou

n)

agen

t ac

tion

CJ

patie

nt C

Jin

stru

men

t CJ

loca

tion

CJ

co-a

gent

CJ

co-p

atie

nt C

Jre

cipi

ent C

J

tesc

her

Fig.

4, R

ole

assi

gnm

ent.

Aft

er a

sen

tenc

e is

pro

cess

ed, t

he n

etw

ork

is p

robe

d w

ith th

e fi

ller

half

of

each

rol

e/fi

ller

pair

. The

act

ivat

ion

over

a s

ubse

t of

the

them

atic

rol

e un

its is

dis

play

ed. T

he fi

rst

sentence contains semantic information useful for role assignment. while the second sentence

contains only syntactic information useful for role assignment.

common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 9: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

233

The

pitc

her

hit t

he b

at w

ith th

e ba

t.

pers

on

adul

l CJ

child

m

ale

fem

ale

CJ

busd

river

CJ

teac

her

CJ

scho

olgi

rl C

Jpi

lche

r(pe

rson

) ag

ent a

ctio

n pa

tient

Inst

rum

ent

Fig.

5, W

ord

disa

mbi

guat

ion.

The

sen

tenc

e "T

he p

itche

r hi

t the

bat

with

the

bat"

is p

roce

ssed

by

the

netw

ork,

The

net

wor

k is

then

pro

bed

with

eac

h th

emat

ic r

ole

in th

e ev

ent,

The

act

ivat

ion

over

a su

bset

of t

he fi

llers

is d

ispl

ayed

. The

net

wor

k co

rrec

tly di

sam

bigu

ates

eac

h w

ord,

ale

CJ

kiss

ed C

Jhi

t .:J

thre

w(t

osse

d) C

J

pers

on C

Jth

ing

food

CJ

bat(animal) .:J

bat(

base

ball)

pers

on C

Jth

ing

food

CJ

bat(

anim

al)

CJ

bat(baseball) .:J

toge

ther

they

win

the

com

petit

ion

for

the

inte

rpre

tatio

n of

the

sent

ence

, As

can

be s

een

from

the

pres

ent e

xam

ple,

eve

n w

hen

seve

ral w

ords

of

a se

nten

ceare ambiguous, the event which they support in co

mm

on d

omin

ates

the

disp

arat

e ev

ents

that

they

sup

port

indi

vidu

ally

, The

pro

cess

ing

of b

oth

in-

stances of "ba

t" w

ork

sim

ilarl

y: th

e w

ord

and

the

cont

ext m

utua

lly s

uppo

rt th

eco

rrec

t int

erpr

etat

ion,

Con

sequ

ently

, the

fina

l int

erpr

etat

ion

of e

ach

wor

d fit

stogether into a globally consistent event.

Con

cept

inst

antia

tion

wor

ks s

imila

rly, T

houg

h th

e w

ord

cues

a n

umbe

r of

mor

e sp

ecif

ic c

once

pts,

onl

y on

e fit

s th

e co

ntex

t. A

gain

, the

con

stra

ints

from

the

wor

d an

d fr

om th

e co

ntex

t com

bine

to p

rodu

ce a

uni

que,

spec

ific

inte

rpre

-

tatio

n of

the

term

, As

with

the

disa

mbi

guat

ion

task

, eac

h te

st s

ente

nce

was

processed, and then the ne

twor

k w

as p

robe

d w

ith th

e ro

le h

alf

of e

ach

role

/fill

er p

air,

The

out

put f

iller

pat

tern

s w

ere

exam

ined

to s

ee if

the

corr

ect

conc

ept a

nd s

eman

tic f

eatu

res

wer

e in

stan

tiate

d (s

ee F

ig, 6

). I

n ea

ch c

ase,

the

corr

ect c

once

pt a

nd fe

atur

es w

ere

the

mos

t act

ive,

Dep

endi

ng u

pon

the

sent

ence

, how

ever

, the context m

ay o

nly

part

ially

cons

trai

n th

e in

terp

reta

tion,

Suc

h is

the

case

in "

The

teac

her

kiss

ed s

omeo

ne,

Som

eone

" co

uld

refe

r to

any

of

the

four

peo

ple

foun

d in

the

corp

us, S

ince

, in

the network's

expe

rien

ce, f

emal

es o

nly

kiss

mal

es, t

he c

onte

xt c

onst

rain

s th

einterpretation of "so

meo

ne" to be either the busdriver or the pitcher, b

ut n

o

further, Consequently, the model can activate the

mal

e an

d pe

rson

fe

atur

es o

f

the patient while leaving the units representing

busd

rive

r an

d pi

tche

r on

ly

partially active, The features

adul

t an

d ch

ild

are

also

par

tially

and

equ

ally

activ

e be

caus

e th

e bu

sdri

ver

is a

n ad

ult w

hile

the

pitc

her

is a

chi

ld (

see

Fig,

6).

Whi

le

pitc

her

is s

light

ly m

ore

activ

e in

this

exa

mpl

e, n

eith

er is

act

ivat

ed a

bove

5 (s

ee th

e se

ctio

n on

am

bigu

ous

sent

ence

s fo

r an

exp

lana

tion

of th

e di

ffer

-en

ce in

act

ivat

ions

), In

gen

eral

, the

mod

el is

cap

able

of

infe

rrin

g as

muc

h

info

rmat

ion

as th

e ev

iden

ce p

erm

its: t

he m

ore

evid

ence

, the

mor

e sp

ecifi

c th

ein

fere

nce,

Wor

d di

sam

bigu

atio

n ca

n be

see

n as

one

type

of

this

general inference

234

F. S

T, J

OH

N A

ND

j,L.

McC

LELL

AN

D

The

sch

oolg

irl s

prea

d so

met

hing

with

a k

nlle

.pe

rson

CJ

icec

ream

CJ

plac

e C

J cr

acke

rs C

Jth

ing

jelly

fo

od

iced

-tea

CJ

stea

k C

J ko

ol-a

id

soup

pa

tient

The

teac

her

kiss

ed s

omeo

ne.

pe,r

son

busd

rive

r adult lEI

teac

her

CJ

child lEI

pilc

her(

pers

on)

lEI

male .:J

scho

olgi

rl

lem

ale

patie

nt

Fig, 6, Concept instantiation, The network has learned that

jelly

is always the patient of

spre

ad,

Whe

n th

e ne

twor

k processes "T

he s

choo

lgirl

spr

ead

som

ethi

ng w

ith a

kni

fe," it instantiates

som

ethi

ng" as

jelly

, For the sentence "T

he te

ache

r ki

ssed

som

eone

," the network pa

rtia

lly

inst

antia

tes

"som

eone

" as a

mal

e an

d pe

rson

, and activates both

pilc

her

and

busd

rive

r pa

rtia

lly,

proc

ess,

The

onl

y di

ffer

ence

is th

at f

or ambiguous words, both the general

concept and the sp

ecif

ic f

eatu

res

diff

er b

etw

een

the

alte

rnat

ives

, whi

le f

orvague words

, the

gen

eral

con

cept

is th

e sa

me

and

only

som

e of

the

spec

ific

feat

ures

diff

er,

Fina

lly, s

ente

nces

in th

e ro

le e

labo

ratio

n ca

tego

ry te

st th

e m

odel

's a

bilit

y to

infe

r th

emat

ic r

oles

not

men

tione

d in

the

inpu

t sen

tenc

e. F

or e

xam

ple,

in "

The

teacher ate the soup," no instrument is mentioned, yet a spoon can

infe

rred

. For

eac

h te

st s

ente

nce,

aft

er th

e se

nten

ce w

as p

roce

ssed

, the

net

wor

kw

as p

robe

d w

ith th

e ro

le h

alf

of th

e to

-be-

infe

rred

rol

e/fi

ller

pair

. The

cor

rect

fille

r w

as th

e m

ost a

ctiv

e in

eac

h ca

se, F

igur

e 7

prov

ides

an

exam

ple.

For

rol

e

The

teac

her

ate

the

soup

., The schoolgirl ate.

pers

on

plac

e C

Jth

ing

lood

st

eak

soup

ic

ecre

am

crac

kers

ic

ed-te

a D

kool

-aid

patie

nt

pers

on

plac

e th

ing

lood

ut

ensi

l kn

ife C

Jsp

oon

Inst

rum

ent

Fig, 7, Role elaboration, After processing the sentence "T

he te

ache

r at

e th

e so

up,"

the

netw

ork

is

prob

ed w

ith th

e in

stru

men

t rol

e, T

he f

iller

act

ivat

ions

are

disp

laye

d, T

he n

etw

ork

corr

ectly

infe

rs

spoo

n,

For "T

he s

choo

lgirl

ate

," th

e m

odel

mus

t inf

er a

pat

ient

, Bec

ause

the

scho

olgi

rl is

like

ly to

eat a

var

iety

of f

oods

, no

part

icul

ar fo

od is

wel

l act

ivat

ed.

common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 10: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

235

elab

orat

ion

, the

con

text

alo

ne p

rovi

des

the

cons

trai

nts

for

mak

ing

the

infe

r-en

ce. E

xtra

rol

es th

at a

re v

ery

likel

y w

ill b

e in

ferr

ed s

tron

gly.

Whe

n th

e ro

les

are

less

like

ly, o

r co

uld

be f

iIle

d by

more than one concept

, the

y ar

e on

lyw

eakl

y in

ferr

ed.

As

it st

ands

, the

re is

not

hing

to k

eep

the

netw

ork

from

gen

eral

izin

g to

infe

rex

tra

role

s fo

r ev

ery

sent

ence

, eve

n fo

r ev

ents

in w

hich

thes

e ro

les

mak

e no

sense. For instance, i

n "T

he b

usdr

iver

dra

nk th

e ic

ed-t

ea, " no instrument

shou

ld b

e in

ferr

ed, y

et th

e ne

twor

k in

fers

kni

fe b

ecau

se o

f its

ass

ocia

tion

with

the busdriver. It appears that since the busdriver uses a knife in m

any

even

tsab

out e

atin

g, th

e ne

twor

k ge

nera

lizes

to in

fer

the

knife

as

an in

stru

men

t for

his

drin

king

. How

ever

, in

even

ts f

urth

er r

emov

ed f

rom

eat

ing,

inst

rum

ents

are

not

infe

rred

. For

exa

mpl

e, in

"T

he b

usdr

iver

ros

e"

no in

stru

men

t is

activ

ated

. It

appe

ars,

then

, tha

t gen

eral

izat

ion

of r

oles

is a

ffec

ted

by th

e de

gree

of

sim

ilari

tybe

twee

n ev

ents

. Whe

n ev

ents

are

sim

ilar ,

ela

bora

tive

role

s m

ay b

e ge

nera

l-iz

ed. W

hen

even

ts a

re d

istin

ct, r

oles

do

not g

ener

aliz

e, a

nd th

e m

odel

has

no

reas

on to

act

ivat

e an

y pa

rtic

ular

fiI

ler

for

a ro

le.

3. Immediate update

As

each

constituent is processed, the information it

conv

eys

mod

ifie

s th

ese

nten

ce g

esta

lt an

d st

reng

then

s th

e in

fere

nces

it s

uppo

rts.

But

the

begi

nnin

gof

a s

ente

nce

may

not

alw

ays

accu

rate

ly p

redi

ct it

s ev

entu

al f

uIl m

eani

ng. F

orex

ampl

e, i

n "T

he a

dult

ate

the

stea

k w

ith d

aint

ines

s" the identity of the adult

is in

itiaI

ly u

nkno

wn.

Aft

er "

The

adu

lt at

e" has been processed

busd

rive

r an

dte

ache

r are equaIly active. After processing "T

he a

dult

ate

the

stea

k" the

model guesses that the agent is the

busd

rive

r si

nce

stea

k is

typi

caIly

eat

en b

ybu

sdri

vers

. At t

his

poin

t , th

e m

odel

has

suf

fici

ent i

nfor

mat

ion

to in

stan

tiate

the adult" to be the

busd

rive

r.

Alo

ng w

ith th

is in

fere

nce

gust

o is

infe

rred

as

the manner of eating, since busdrivers eat with gu

sto.

Her

e th

e m

odel

dem

onst

rate

s its

abi

lity

to in

fer

addi

tiona

l the

mat

ic r

oles

.T

he m

odel

has

, at t

his

poin

t, be

en le

d do

wn

the

gard

en p

ath

tow

ard

anul

timat

ely

inco

rrec

t int

erpr

etat

ion

of th

e se

nten

ce. T

he n

ext c

onst

ituen

t pro

-ce

ssed

, "

with

dai

ntin

ess,

" on

ly f

its w

ith th

e te

ache

r an

d th

e sc

hool

girl

. Sin

ceth

e se

nten

ce s

peci

fies

an a

dult

, the agent must be the

teac

her.

The model must

revise its representation of the ev

ent t

o fit

with

. the

new

information by

de-a

ctiv

atin

g bu

sdri

ver

and activating

teac

her

(see

FIg

. 8).

In general

, as

each

con

stitu

ent i

s pr

oces

sed,

the

info

rmat

ion

it explicitly

conveys is added to the re

pres

enta

tion

of th

e se

nten

ce a

long

with

impl

icit

information implied by the constituent in the current si

tuat

ion.

Whe

n th

eev

iden

ce is

am

bigu

ous

and

supp

orts

may

con

flic

ting

infe

renc

es (

such

as

afte

rT

he a

dult

ate"

has

bee

n pr

oces

sed)

all

the

infe

renc

es a

re w

eakl

y ac

tivat

ed in

the

sent

ence

ges

talt.

Whe

n ne

w e

vide

nce

sugg

ests

a d

iffe

rent

inte

rpre

tatio

n,th

e se

nten

ce g

esta

lt is

rev

ised

.

236

F, S

T. J

OH

N A

ND

j,L.

McC

LELL

AN

D

The

adu

b at

e th

e st

eak

with

dai

ntin

ess,

Sen

tenc

e G

esta

lt A

cllv

allo

nsR

olel

Fl1I

er A

ctiv

atio

nsun

itag

ent

I:J

pers

onaO

Ob

child

mal

eID

..:

Ife

mal

e..:

Ibu

s dr

iver

..:I

teac

her

..:I

actio

nI:

JI:

JI:

Jat

esh

otI:

JI:

Jdr

ove~

tran

s)

drove motiv)

patI

ent

pers

onad

ub..:

Ich

ildbu

s dr

iver

scho

olgi

rl..:

I

..:I

stea

kso

upcr

acke

rsad

verb

gust

oID

..:

I..:

Ipl

easu

reda

intin

ess

..:I

Fig.

8, T

he s

eque

ntia

l pro

cess

ing

of a

gar

den-

path sentence, After "

the

stea

k" h

as b

een

proc

esse

d,the network instantiates "the adult" with the concept

busd

r;ve

r,

Whe

n "w

ith d

aint

ines

s" is

proc

esse

d, th

e ne

twor

k m

ust r

eint

erpr

et "the adult" to mean

teac

her.

4. Ambiguous se

nten

ces

The

am

bigu

ous

sent

ence

s in

the

test

set

wer

e te

sted

sep

arat

ely.

As

note

dab

ove, a

n ambiguous sentence has more than one co

nsis

tent

inte

rpre

tatio

n.Fo

r ex

ampl

e, th

e ad

ult i

n th

e se

nten

ce, "

The

adu

lt dr

ank

the

iced

-tea in the

livin

g-ro

om" can be instantiated with either

busd

rive

r or

te

ache

r as

the

agen

t,but the sentence offers no clues that

teac

her

is th

e co

rrec

t age

nt in

this

particular sentence/event pair in the test set. In

thes

e am

bigu

ous

case

s, th

emodel should compromise and activate

busd

rive

r an

d te

ache

r pa

rtia

Ily

and

equa

Ily,

cau

sing

two

smaI

l err

ors.

Wha

t the

net

wor

k ty

pica

Ily

did,

how

ever

was

to a

ctiv

ate

one

conc

ept s

light

ly m

ore

than

the

othe

r.O

ne r

easo

n fo

r th

ese

diffe

renc

es in

act

ivat

ions

is th

e re

cent

trai

ning

his

tory

of th

e ne

twor

k. T

he s

ente

nce/

even

t pai

rs tr

aine

d m

ore

rece

ntly

hav

e a

grea

ter

impact on the w

eigh

ts a

nd, therefore, on subsequent pr

oces

sing

. Bec

ause

sele

ctio

n of

trai

ning

exa

mpl

es o

ccur

s ra

ndom

ly, s

ever

al s

ente

nces

involving a

part

icul

ar a

gent

may

occ

ur b

efor

e a

sent

ence

/eve

nt in

volv

ing

a di

ffere

nt a

gent

common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 11: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

237

is tr

aine

d, S

uch

trai

ning

bia

ses

can

lead

to a

bia

s in

the

activ

atio

n of

alte

rnat

ives

in a

mbi

guou

s se

nten

ces.

We

test

ed th

is e

xpla

natio

n by

trai

ning

the

netw

ork

on s

ente

nce/

even

t pai

rs th

at c

onsi

sted

of

an a

mbi

guou

s se

nten

ce a

ndthe subordinate

, wea

kly

activ

ated

, eve

nt. F

rom

one

to th

ree

trai

ning

tria

lsw

ere

requ

ired

to b

alan

ce th

e ac

tivat

ion

of th

e su

bord

inat

e ev

ent w

ith th

at o

fth

e pr

evio

usly

dom

inan

t eve

nt.

The

sen

sitiv

ity o

f the

net

wor

k to

rec

ent t

rain

ing

on a

mbi

guou

s se

nten

ces

isdu

e to

the

dyna

mic

s of

the

activ

atio

n fu

nctio

n, B

ecau

se th

e ac

tivat

ion

func

tion

is s

igm

oida

l, it

is s

ensi

tive

to c

hang

es in

the

valu

e of

its

inpu

t whe

n th

e va

lue

isin

the

mid

dle

of it

s ra

nge,

Sin

ce e

ach

mea

ning

of

an a

mbi

guou

s se

nten

cesh

ould

be

activ

ated

par

tially

, its

inpu

ts to

the

activ

atio

n fu

nctio

n m

ust l

ie in

this

mid

dle

rang

e, C

onse

quen

tly, m

inor

cha

nges

to th

e w

eigh

ts th

at d

eter

min

eth

e in

put v

alue

will

hav

e a

maj

or im

pact

on

the activation value,

Con

vers

ely,

in th

e ex

trem

es o

f the

ran

ge o

f the

act

ivat

ion

func

tion

, cha

nges

in th

e in

put v

alue

will

hav

e lit

tle discernible effect on the activation value,

Sin

ce th

e on

e m

eani

ng o

f an

unam

bigu

ous

sent

ence

sho

uld

be a

ctiv

ated

ful

ly,

its in

puts

to th

e ac

tivat

ion

func

tion

lie in

the

extr

emes

of

the

func

tion

s ra

nge,

Min

or c

hang

es in

the

wei

ghts

, the

refo

re, w

ill h

ave

only

min

or c

hang

es o

n th

eactivation value, m

akin

g th

e pr

oces

sing

of

unam

bigu

ous

sent

ence

s ro

bust

tore

cent

trai

ning

.

5. Learning

As

the

netw

ork

lear

ns to

com

preh

end sentences correctly, a nu

mbe

r of

deve

lopm

enta

l phe

nom

ena

can

be o

bser

ved,

In fa

ct, t

he o

nly

real

failu

res

inpe

rfor

man

ce s

tem

fro

m a

dev

elop

men

tal e

ffec

t. Pr

oble

ms

in processing only

aris

e in

pro

cess

ing

infr

eque

nt a

nd ir

regu

lar

sent

ence

s, F

or e

xam

ple,

sen

tenc

esab

out t

he b

usdr

iver

eat

ing

soup

are

rar

e, T

he n

etw

ork

is s

even

tim

es m

ore

likel

y to

see

a s

ente

nce

abou

t the

bus

driv

er e

atin

g st

eak

than

eat

ing

soup

, Thi

sfr

eque

ncy

diff

eren

ce c

reat

es a

str

ong

regu

lari

ty b

etw

een

"The

bus

driv

er a

teand the concept

stea

k,

In a

sen

tenc

e ab

out t

he b

usdr

iver

eat

ing

soup

, the

wor

dso

upconstrains the patient to be

soup

, w

hile

"T

he b

usdr

iver

ate

" partially

constrains the patient to be

stea

k,

The

con

stra

ints

com

pete

for

an in

terp

reta

-tio

n of

the

sent

ence

, Whe

n th

e re

gula

ritie

s ar

e pa

rtic

ular

ly s

tron

g, th

e co

n-te

xtua

l con

stra

ints

can

win

the

com

petit

ion

and

caus

e th

e bo

ttom

-up

activ

atio

nfr

om th

e w

ord

itsel

f to

be

over

ridd

en,

Tho

ugh

this

effe

ct s

eem

s lik

e a

serio

us fl

aw, i

t is

a fl

aw th

at th

e m

odel

sha

res

with

peo

ple,

In a

n ill

umin

atin

g ex

perim

ent,

Eri

ckso

n an

d M

atts

on (

8) a

sked

subj

ects

que

stio

ns li

ke

, "

How

man

y an

imal

s of

eac

h ki

nd d

id M

oses

take

on

the

Ark

?" S

ubje

cts

typi

cally

ans

wer

ed, "

Tw

o"

desp

ite th

eir

know

ledg

e, w

hen

late

r as

ked

, tha

t Mos

es h

ad n

othi

ng to

do

with

the

Ark

, Con

stra

ints

fro

m th

eco

ntex

t ove

rwhe

lmed

the

cons

trai

nt f

rom

the

wor

d "M

oses

,E

rick

son

and

Mat

tson

als

o de

scri

be a

sec

ond

orde

r ef

fect

in th

at s

ubje

cts

will

238

F, S

T, J

OH

N A

ND

j,L.

McC

LELL

AN

D

balk

whe

n as

ked,

"H

ow m

any

anim

als

of e

ach

kind

did

Nix

on ta

ke o

n th

eA

rk?"

App

aren

tly, t

he d

egre

e of

sem

antic

ove

rlap

bet

wee

n th

e co

rrec

t con

cept

and

the

foil

affe

cts

how

eas

ily s

ubje

cts

will

be

mis

led,

The

mod

el d

emon

stra

tes

the

seco

nd o

rder

eff

ect a

s w

ell,

Giv

en "

The

bus

driv

er a

te th

e ba

ll"

the

mod

elfa

ils to

act

ivat

e an

y pa

tient

. The

mod

el c

anno

t exp

licitl

y ba

lk, b

ut it

s fa

ilure

to

repr

esen

t the

sen

tenc

e ac

cura

tely

or

mis

inte

rpre

t it i

s si

mila

r to

bal

king

, Thi

s

exam

ple

sugg

ests

that

the

SG m

odel

hol

ds p

rom

ise

for

dem

onst

ratin

g in

tere

st-

ing

hum

an-l

ike

erro

rs in

com

preh

ensi

on.

In th

e m

odel

, thi

s fr

eque

ncy

or r

egul

arity

eff

ect d

imin

ishe

s w

ith tr

aini

ng: t

here

liabi

lity

of a

con

stra

int,

its p

roba

bilit

y of

cor

rect

ly p

redi

ctin

g th

e ou

tput

rath

er th

an it

s ov

eral

l fre

quen

cy, b

ecom

es in

crea

sing

ly im

port

ant.

The

wor

dso

up" perfectly predicts the concept

soup

: w

hene

ver

"sou

p" appears in a

sentence, the event contains the concept

soup

. O

n th

e other hand, the

busd

rive

r ea

ts a

var

iety

of

food

s: "the busdriver ate"

is o

nly

70%

rel

iabl

e as

apredictor of

stea

k,

With

incr

ease

d tr

aini

ng, e

ven

low

fre

quen

cy c

onst

rain

ts a

repr

actic

ed, I

f th

ey a

re r

elia

ble,

they

gai

n st

reng

th a

nd e

vent

ually

out

wei

gh m

ore

freq

uent

but

less

rel

iabl

e co

nstr

aint

s. S

imila

r de

velo

pmen

tal t

rend

s oc

cur

asch

ildre

n le

arn

lang

uage

(17

). P

rogr

ess

is s

low

, but

aft

er a

tota

l of

630

000

tria

ls

even

thes

e ve

ry in

freq

uent

and

irre

gula

r se

nten

ces

are

proc

esse

d co

rrec

tly,

The

ear

ly e

ffec

t of

freq

uenc

y w

orks

for

syn

tact

ic c

onst

rain

ts a

s w

ell a

sse

man

tic c

onst

rain

ts, A

s sh

own

in F

ig, 9

, the

mod

el m

aste

rs s

ente

nces

in th

eactive voice sooner than it masters sentences in the passive voice, This

diff

eren

ce is

due

to th

e gr

eate

r fr

eque

ncy

of s

ente

nces

in th

e ac

tive

voic

e in

the

corp

us, W

hile

14

sent

ence

fra

mes

use

the

activ

e vo

ice,

onl

y 4

fram

es u

se th

epa

ssiv

e vo

ice,

Aft

er 3

3000

0 tr

ials

, tho

ugh

, bot

h vo

ices

are

han

dled

cor

rect

ly.

The

syn

tact

ic c

onst

rain

ts d

evel

op m

ore

slow

ly th

an th

e re

gula

r se

man

ticco

nstr

aint

s, Y

et w

hile

eve

ry s

ente

nce

cont

ains

wor

d or

der

cons

trai

nts,

onl

y an

occa

sion

al s

ente

nce

will

con

tain

a p

artic

ular

sem

antic

con

stra

int.

Bas

ed o

n th

e

100

Perc

ent

Correct 50

Perf

orm

ance

c::::

::J a

ctiv

e sy

ntac

ticE

im p

assi

ve s

ynta

cllc

c::J regular semantic

Irregular Bemanllc

Fig.

9, D

evel

opm

ent o

f pe

rfor

man

ce, a

ctiv

e sy

ntac

tic-T

he b

usdr

iver

kis

sed

the

teac

her;

pas

sive

synt

actic

-The

teac

her

was

kis

sed

by th

e bu

sdri

ver;

reg

ular

sem

antic

-The

bus

driv

er a

te th

e st

eak;

irre

gula

r se

mam

ic-

The

bus

driv

er a

te th

e so

up, C

orre

ct p

erfo

rman

ce m

eans

the

corr

ect c

once

pts

are

mor

e ac

tive

than

inco

rrec

t con

cept

s,

common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 12: words of a sentence may be Artificial Intelligence

SEN

TE

NC

E C

OM

PRE

HE

NSI

ON

239

freq

uenc

y of

pra

ctic

e w

ith p

artic

ular

con

stra

ints

, then, the word order con-

stra

ints

sho

uld

be le

arne

d m

uch

earl

ier

than

the

sem

antic

con

stra

ints

, Tw

ocaveats to the frequency rule help explain this result. First, the syntactic

cons

trai

nts

invo

lve

the

conj

unct

ion

of w

ord

orde

r w

ith th

e pr

esen

ce o

r ab

senc

eof the passive markers, a

nd s

uch

conj

unct

ions

are

dif

ficu

lt to

lear

n, S

econ

dle

arni

ng te

nds

to g

ener

aliz

e ac

ross

sem

antic

ally

sim

ilar

wor

ds, s

o tr

aini

ng o

non

e w

ord

can

faci

litat

e th

e le

arni

ng o

f sim

ilar

wor

ds.

The

larg

e nu

mbe

r of

trai

ning

tria

ls r

equi

red

to a

chie

ve g

ood

perf

orm

ance

led

us to

look

for

way

s to

impr

ove

the

spee

d of

lear

ning

. One

exp

erim

ent c

onsi

sted

of r

emov

ing

the

firs

t hid

den

laye

r fr

om th

e ar

chite

ctur

e. T

he in

put a

nd th

epr

evio

us s

ente

nce

gest

alt t

hen

fed

dire

ctly

into

the

sent

ence

ges

talt

laye

r. I

tw

as h

oped

that

by making the network one layer more shallow, and by

redu

cing

the

num

ber

of w

eigh

ts, e

rror

cor

rect

ion

wou

ld p

roce

ed m

ore

quic

kly,

Tra

inin

g of

the

mod

ifie

d ne

twor

k w

as s

topp

ed w

hen

it be

cam

e ap

pare

nt th

atth

e m

odel

did

not

lear

n th

e se

man

tic c

onst

rain

ts m

ore

quic

kly,

and

it h

ad n

otle

arne

d th

e sy

ntac

tic c

onst

rain

ts a

t all.

It i

s po

ssib

le th

at th

e ne

twor

k co

uld

still

lear

n th

e sy

ntac

tic c

onst

rain

ts g

iven

mor

e tr

aini

ng, T

he p

oint

of

the

mod

ific

a-tio

n, h

owev

er, w

as to

spe

ed le

arni

ng, a

nd th

is g

oal w

as n

ot fu

lfille

d.T

he r

easo

n fo

r th

e ut

ility

of

the

firs

t hid

den

laye

r in

com

putin

g th

e sy

ntac

ticco

nstr

aint

s lie

s in

the

conj

unct

ive

natu

re o

f the

con

stra

ints

. The

hid

den

laye

r is

usef

ul in

com

putin

g th

e co

njun

ctio

n of

wor

d or

der

and

active/passive voice

mar

kers

, With

out t

he h

idde

n la

yer , the sentence ge

stal

t wou

ld h

ave

toco

mpu

te th

e co

njun

ctio

n as

wel

l as

repr

esen

t the

mea

ning

of

the

sent

ence

.Su

ch a

rep

rese

ntat

ion

is a

ppar

ently

dif

ficu

lt fo

r th

e ne

twor

k to

fin

d.

6. Representations

While the input to the network is a local encoding w

here

eac

h w

ord

isre

pres

ente

d by

a d

iffe

rent

uni

t, th

e ne

twor

k ca

n cr

eate

inte

rnal

rep

rese

ntat

ions

that

are

dis

trib

uted

and

that

exp

licitl

y en

code

hel

pful

'sem

antic

info

rmat

ion,

The

wei

ghts

run

ning

fro

m th

e in

put l

ayer

to th

e fi

rst h

idde

n la

yer

can

be s

een

as "constraint vectors"

whi

ch d

eter

min

e ho

w e

ach

wor

d in

fluen

ces

the

evol

u-tio

n of

the

sent

ence

ges

talt.

The

se c

onst

rain

t vec

tors

are

the

mod

el' s

botto

m-

up r

epre

sent

atio

n of

eac

h w

ord,

Wor

ds th

at im

pose

sim

ilar

cons

trai

nts

shou

ldde

velo

p si

mila

r co

nstr

aint

vec

tors

. A c

lust

er analysis ~f the

" w

eigh

t vec

tors

reve

als

thei

r si

mila

rity

. Sep

arat

e cl

uste

r an

alys

es w

ere

perf

orm

ed f

or u

n-am

bigu

ous

verb

s an

d no

uns

(see

Fig

, 10)

,T

he v

erbs

clu

ster

into

a n

umbe

r of

hie

rarc

hica

l gro

ups.

One

clu

ster

con

tain

sthe consumption ve

rbs,

Ano

ther

con

tain

s st

irred

and

spr

ead.

These two

clus

ters

then

com

bine

into

a c

lust

er o

f ve

rbs

invo

lvin

g pe

ople

and

foo

d. K

isse

dhi

t , and shot formed another cl

uste

r. F

or e

ach

of these verbs there were

pass

ive-

voic

e se

nten

ces

in th

e co

rpus

, and

eac

h co

uld

take

an

anim

ate

obje

ct,

Gav

e, t

he o

nly

dativ

e ve

rb, s

tand

s ap

art f

rom

the

othe

r ve

rbs.

Thi

s cl

uste

ring

240

F. S

T. J

OH

N A

ND

J.L

. McC

LELL

AN

D

refle

cts

the

sim

ilarit

y of

the case frames of the m

embe

rs o

f th

e di

ffer

ent

clus

ters

.T

he c

onst

rain

t vec

tors

of

noun

s fu

rthe

r re

flec

t sim

ilari

ties

in th

e co

nstr

aint

sth

ey im

pose

on

the

evol

ving

sen

tenc

e ge

stal

t. T

his

sim

ilari

ty is

ref

lect

ed in

two

way

s. A

s w

ith th

e ve

rbs,

sem

antic

ally

sim

ilar

wor

ds c

lust

er: a

ll of

the

peop

lecl

uste

r, a

nd d

og a

nd s

pot a

re v

ery

sim

ilar.

Wor

ds th

at o

ccur

together in the

sam

e co

ntex

t als

o ha

ve s

imila

r co

nstr

aint

vec

tors

, For

exa

mpl

e, ice cr

eam

clus

ters

with

par

k, a

nd je

lly c

lust

ers

with

kni

fe, I

n th

e co

rpus

, ice

cre

am is

always eaten in the park

, and

jelly

is a

lway

s sp

read

with

a k

nife

. The

ir s

imila

rco

nstr

aint

vec

tors

fol

low

fro

m th

e si

mila

r co

nstr

aint

s th

ey im

pose

on

the

even

tsde

scri

bed

by th

e se

nten

ces

in w

hich

they

app

ear,

7. Generalization

An

impo

rtan

t rem

aini

ng q

uest

ion

is w

heth

er th

e m

odel

is actually learning

usef

ul c

onst

rain

ts th

at it

can

app

ly to

nov

el s

ente

nces

or

whe

ther

it is

sim

ply

mem

oriz

ing

sent

ence

/eve

nt p

airs

, Sin

ce s

ente

nces

in th

e fi

rst s

imul

atio

n w

ere

high

sim

ilaril

y -

low

sim

ilaril

y 10 1

2 ga

ve

stir

red

spre

ad dran

kc:

::::

(1).., ji)..,;:+ "C:

cons

umed at

e

hit

shot

Fig.

10.

Clu

ster

ana

lyse

s. T

he a

naly

sis

com

pute

s th

e si

mila

rity

bet

wee

n th

e w

eigh

t vec

tors

lead

ing

from

eac

h in

put u

nit t

o th

e fi

rst h

idde

n la

yer,

The

mor

e si

mila

r tw

o ve

ctor

s or

clu

ster

s of

vec

tors

(in

eucl

idea

n di

stan

ce).

the

soon

er th

ey a

re c

ombi

ned

into

a n

ew c

lust

er. P

hysical distance in the

figur

e is

irre

leva

nt; o

nly

the

clus

terin

g is

impo

rtan

t. F

or in

stan

ce, "

stir

red"

is n

ot n

otab

ly m

ore

similar to "dr

ank"

than

it is

to "

ate,

common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 13: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

241

high

sim

ilarit

y. lo

w s

imila

rity

pers

onch

ildic

ecre

am park

::J iiI..,;:+

soup

crac

kers

spoo

nki

tche

nje

llykn

iledo

gsp

olsl

eak

iced

-lea

livin

g-ro

om

Fig. 10 (conld,

generated randomly and none were set aside to not be trained, the fi

rst

sim

ulat

ion

cann

ot b

e us

ed to

eva

luat

e ge

nera

lizat

ion.

Tw

o ne

w c

orpo

ra w

ere

deve

lope

d to

test

the

mod

el's

gen

eral

izat

ion

abili

ties.

One

of t

he n

ew c

orpo

raw

as d

esig

ned

to te

st th

e m

odel

's a

bilit

y to

gen

eral

ize

synt

actic

reg

ular

ities

and

the

othe

r w

as d

esig

ned

to te

st it

s ab

ility

to g

ener

aliz

e se

man

tic r

egul

ariti

es. T

hem

odel

was

trai

ned

and

eval

uate

d on

eac

h co

rpus

sep

arat

ely.

1.

Synt

ax

For

syn

tax,

we

test

ed th

e m

odel

's a

bilit

y to

lear

n an

d us

e th

e sy

ntax

of

activ

ean

d pa

ssiv

e se

nten

ces.

Cou

ld th

e m

odel

lear

n to

use

the active/passive voice

mar

kers

and

the

tem

pora

l ord

er o

f th

e co

nstit

uent

s, w

ord

orde

r, p

rodu

ctiv

ely

on n

ovel

sen

tenc

es?

Thi

s is

a te

st o

f co

mpo

sitio

nalit

y. T

he m

odel

mus

t lea

rn to

com

pose

the

fam

iliar

con

stitu

ents

of

sent

ence

s in

new

com

bina

tions

. To

perf

orm

this

generalization task, the m

odel

mus

t lea

rn s

ever

al ty

pes

ofin

form

atio

n. F

irst,

it m

ust l

earn

the

conc

ept r

efer

red

to b

y ea

ch w

ord

in a

sentence: "Jo

hn" refers to

John

an

d "s

aw" refers to

saw

. Se

cond

, the

mod

el

mus

t lea

rn th

at th

e or

der

of th

e co

nstit

uent

s, w

hich

con

stitu

ent c

omes

bef

ore

the

verb

and

whi

ch a

fter,

is im

port

ant t

o as

sign

ing

the

agen

t and

pat

ient

.T

hird

, the

mod

el m

ust l

earn

the relevance of the passive markers: the past

242

F. S

T. J

OH

N A

ND

J.L

. McC

LELL

AN

D

part

icip

le o

f the

ver

b an

d th

e pr

epos

ition

al p

hras

e be

ginn

ing

with

"by

.

Four

th, t

he m

odel

mus

t lea

rn to

inte

grat

e th

e or

der

info

rmat

ion

and

the

pass

ive

mar

ker

info

rmat

ion

to c

orre

ctly

ass

ign

the

them

atic

rol

es.

The

mod

el w

as tr

aine

d on

a c

orpu

s of

sen

tenc

es c

ompo

sed

of te

n pe

ople

and

ten reversible actions

, suc

h as

"John saw Mary.

" T

hese

sen

tenc

es c

ould

app

ear

in th

e ac

tive

or p

assi

ve v

oice

. The

bas

ic c

orpu

s co

nsis

ted

of a

ll 2000 sentences

(10

peop

le b

y 10

act

ions

by

10 p

eopl

e by

2 v

oice

s). B

efor

e tr

aini

ng b

egan

thou

gh, 2

50 o

f th

ese

sent

ence

s (1

2.5%

) w

ere

set a

side

for

late

r te

stin

g: th

eyw

ere

not t

rain

ed. T

he m

odel

was

then

trai

ned

on th

e re

mai

ning

sen

tenc

es.

Onc

e th

e tr

aini

ng c

orpu

s w

as m

aste

red

(afte

r 10

000

0 tr

ials

), th

e 25

0 te

stse

nten

ces

wer

e pr

esen

ted

to th

e m

odel

. The

mod

el p

roce

ssed

97%

of

thes

ese

nten

ces

corr

ectly

. In

' onl

y 11

sen

tenc

es d

id th

e m

odel

inco

rrec

tly a

ssig

n a

thematic role. In these sentences, w

hen

prob

ed w

ith o

ne o

f the

fille

rs, the

mod

el a

ctiv

ated

an

inco

rrec

t rol

e m

ore

stro

ngly

than

the

corr

ect r

ole.

2.

Sem

antic

s

Sem

antic

reg

ular

ities

may

als

o pr

ovid

e th

e ba

sis

for

gene

raliz

atio

n. W

e te

sted

the

mod

el's

abi

lity

to le

arn

som

e se

man

tic r

egul

ariti

es a

nd a

pply

them

in n

ovel

cont

exts

. The

re a

re tw

o ba

sic

gene

raliz

atio

n ef

fect

s w

e w

ould

like

to s

ee in

the

mod

el's

beh

avio

r. F

irst,

as

with

syn

tax,

the

trai

ned

mod

el s

houl

d ex

hibi

tco

mpo

sitio

nalit

y: th

e m

odel

sho

uld

be a

ble

to r

epre

sent

suc

cess

fully

sen

tenc

esit

has

neve

r se

en b

efor

e. S

econ

dly,

the

mod

el s

houl

d m

ake

pred

ictio

ns: i

tshould be able to use information presented early in

a s

ente

nce

to h

elp

itpr

oces

s su

bseq

uent

info

rmat

ion.

To

test

thes

e ge

nera

lizat

ion

effe

cts

, we

agai

n cr

eate

d a

sim

ple

corp

us f

or th

em

odel

to le

arn.

Thi

s co

rpus

con

sist

ed o

f 8

adul

ts a

nd 8

chi

ldre

n, 5

act

ions

, and

8 ob

ject

s fo

r ea

ch a

ctio

n. T

he c

orpu

s w

as arranged to make age a semantic

regu

lari

ty. F

or e

ach

actio

n, 3 objects were presented with adults, 3 were

pres

ente

d w

ith c

hild

ren,

and

2 w

ere

pres

ente

d w

ith b

oth

adul

ts a

nd c

hild

ren.

Thi

s ar

rang

emen

t cre

ated

a c

orpu

s of

400

sen

tenc

es (

8 ad

ults

by

5 ac

tions

by

5ob

ject

s pl

us 8

chi

ldre

n by

5 a

ctio

ns b

y 5

obje

cts)

. For

exa

mpl

e

Geo

rge

wat

ched

the

new

s.G

eorg

e w

atch

ed J

ohnn

y C

arso

n.G

eorg

e w

atch

ed D

avid

Let

term

an.

Geo

rge

wat

ched

Sta

r T

rek.

Geo

rge

wat

ched

the

Roa

dR

unne

r.

Bob

by w

atch

ed H

e-M

an.

Bob

by w

atch

ed S

mur

fs.

Bob

by w

atch

ed M

ight

yM

ouse

.B

obby

wat

ched

Sta

r T

rek.

Bob

by w

atch

ed th

e R

oad

Run

ner.

The

reg

ular

ity is

that

the

age

of th

e ag

ent,

in conjunction with the verb

pred

icts

a s

et o

f ob

ject

s. A

ge, t

houg

h, i

s no

t exp

licitl

y en

code

d in

the

inpu

t or

the output representations. Instead, i

t is

a "h

idde

n" feature (Unlike people in

common
Pencil
common
Pencil
Page 14: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

243

the

first

sim

ulat

ion

, peo

ple

in th

is c

orpu

s ar

e re

pres

ente

d by

a single unit, i,

loca

lly.)

, The

mod

el m

ust l

earn

whi

ch a

gent

s ar

e th

e ad

ults

and

whi

ch a

re th

ech

ildre

n ba

sed

on it

s ex

perie

nce

seei

ng th

e ob

ject

s w

ith w

hich

eac

h is

pai

red.

The

chi

ldre

n al

l wat

ch o

ne s

et o

f sh

ows,

and

the

adul

ts a

ll w

atch

ano

ther

set

.A

ge, t

here

fore

, org

aniz

es th

e se

nten

ces

into

set

s, S

ince

age

doe

s no

t app

ear

inthe input

, how

ever

, it m

ust b

e in

duce

d by

the

mod

el a

nd accessed as each

sent

ence

is p

roce

ssed

.B

efor

e th

e m

odel

was

trai

ned

on th

e corpus of sentences, 50 of the 400

sent

ence

s (1

2,5%

) w

ere

rand

omly

pic

ked

to b

e se

t asi

de a

s th

e ge

nera

lizat

ion

set.

The

mod

el was then trained on the remaining se

nten

ces,

Whe

n th

ese

sent

ence

s w

ere

proc

esse

d co

rrec

tly (

afte

r 90

000

tria

ls),

the

mod

el w

as te

sted

for

com

posi

tiona

l and

pre

dict

ive

gene

raliz

atio

n,A

gain

, to be compositional, the m

odel

sho

uld

be a

ble

to p

roce

ss c

orre

ctly

the

sent

ence

s on

whi

ch it

had

not

bee

n tr

aine

d, T

he m

odel

pro

cess

ed 8

6% o

fth

ese

sent

ence

s co

rrec

tly, O

n th

e re

mai

ning

14%

, or 7 sentences, t

he m

odel

activ

ated

the

wro

ng o

bjec

t mor

e st

rong

ly th

an it

act

ivat

ed th

e co

rrec

t obj

ect.

At t

he p

oint

in tr

aini

ng w

hen

the

mod

el w

as s

topp

ed, t

hen,

it c

ould

com

pose

nove

l sen

tenc

es r

easo

nabl

y w

ell,

Semantic generalization should allow the model to predict subsequent

info

rmat

ion,

The

re a

re tw

o ty

pes

of r

egul

ariti

es th

at th

e m

odel

sho

uld

disc

over

to h

elp

mak

e pr

edic

tions

. One

is th

e ge

nera

l reg

ular

ity th

at c

hild

ren

doch

ildre

ns

activ

ities

and

that

adu

lts d

o ad

ult a

ctiv

ities

, To

test

the

lear

ning

of

this

reg

ular

ity w

e ob

serv

ed w

hich

obj

ects

the

mod

el p

redi

cted

aft

er th

e m

odel

had processed the agent and action, Given an agent and an action

, the

mod

elsh

ould

pre

dict

the

obje

cts

appr

opri

ate

for

that

act

ion

and

an a

gent

of

that

age

.Fo

r ex

ampl

e, g

iven

the

part

ial s

ente

nce

, "

Bob

by w

atch

ed..

, ," the model

shou

ld p

redi

ct th

e ob

ject

to b

e ei

ther

He-

Man

, Sm

urfs

, Mig

hty

Mou

se, t

heR

oad

Run

ner,

or

Sta

r T

rek,

Spe

cific

ally

, thi

s re

gula

rity

sugg

ests

that

the

mod

elshould activate each of these subjects liS or 0,

Wha

t mak

es th

is a

gen

eral

izat

ion

task

is th

at s

ome

of th

e se

nten

ces

in th

eco

rpus

wer

e se

t asi

de a

nd n

ot tr

aine

d: s

ome

agen

ts w

ere

neve

r pa

ired

with

cert

ain

obje

cts,

For

exa

mpl

e, th

e ne

twor

k w

as n

ever

trai

ned

on th

e se

nten

ceB

obby

wat

ched

He-

Man

," T

o ac

tivat

e ea

ch o

bjec

t in

the

age

appr

opri

ate

set

to 0.

, the

mod

el m

ust g

ener

aliz

e, I

t mus

t generalize both to discover the

com

plet

e se

t of f

ive

child

ren

s sh

ows,

and

to associate, that set to ea

ch o

f th

eei

ght c

hild

ren,

Thi

s ge

nera

lizat

ion

can

be te

sted

by

obse

rvin

g w

heth

er H

e-M

anis

act

ivat

ed a

s a

pred

icte

d ob

ject

for

"B

obby

wat

ched

, , ,

,T

he o

ther

type

of

regu

larit

y is

spe

cific

to e

ach

agen

t. T

he fa

ct th

at th

etr

aini

ng c

orpu

s do

es n

ot c

onta

in

, "

Bob

by w

atch

ed H

e-M

an"

is a

lso

a re

gula

ri-

ty th

at th

e m

odel

can

lear

n. It

lear

ns th

at it

sho

uld

not p

redi

ct H

e-M

an a

s an

obje

ct f

or "

Bob

by w

atch

ed, .

. . " Since

, for

Bob

by, t

here

are

onl

y fo

ur s

how

sthat he actually watches, e

ach

shou

ld b

e gi

ven

an a

ctiv

atio

n le

vel o

f 1/

4 or

25, T

he m

odel

's te

nden

cy to

gen

eral

ize

by u

sing

the

gene

ral r

egul

arity

abo

ut

244

F, S

T. J

OH

N A

ND

J.L

. McC

LELL

AN

D

child

ren

s vi

ewin

g ha

bits

, the

refo

re, i

s co

unte

ract

ed b

y th

e sp

ecifi

c re

gula

rity

about Bobbys personal viewing ha

bits

, The

activation of the appropriate/

untr

aine

d (e

,g, H

e-M

an f

or B

obby

) ob

ject

s sh

ould

be

a compromise between

the two competing fo

rces

, The

app

ropr

iate

/unt

rain

ed o

bjec

ts, t

hen, should

have

an

activ

atio

n so

mew

hat l

ess

than

0.2

. For

the

appr

opria

te/tr

aine

d (e

.M

ight

y M

ouse

for

Bob

by)

obje

cts,

the

activ

atio

n le

vel s

houl

d lie

bet

wee

n 0.

and

0,25

, and

the

activ

atio

n le

vel f

or th

e in

appr

opri

ate/

untr

aine

d ob

ject

s (e

,th

e ne

ws

for

Bob

by)

shou

ld b

e cl

ose

to 0

,T

he m

odel

's p

redi

ctio

ns o

n ea

ch o

f th

e fi

ve a

ctio

ns f

or f

our

of th

e pe

ople

inth

e co

rpus

wer

e ta

bula

ted.

The

activation values for the model's object

pred

ictio

ns w

ere

cate

gori

zed

into

thre

e gr

oups

: the

app

ropr

iate

/trai

ned, the

appr

opria

te/u

ntra

ined

, and

the

inap

prop

riate

/unt

rain

ed. T

he a

vera

ge a

ctiv

a-

tion

in e

ach

grou

p is

sho

wn

in T

able

2. T

he m

eans

are

sig

nific

antly

diff

eren

t

from

one

ano

ther

.T

hree

con

clus

ions

can

be

draw

n fr

om th

ese

data

. One

, the

app

ropr

iate

/tr

aine

d ob

ject

s ar

e ac

tivat

ed to

the

appr

opri

ate

degr

ee. T

he m

odel

cor

rect

ly

pred

icts

the

corr

ect s

et o

f ob

ject

s fo

r th

e ac

tion

and

the

age

of th

e ag

ent.

Tw

oth

e m

odel

generalizes in that it also ac

tivat

es a

ge-c

orre

ct o

bjec

ts it

has

not

actu

ally

see

n in

that

con

text

bef

ore.

The

sub

stan

tially

wea

ker

activ

atio

n of

thes

e ap

prop

riate

/unt

rain

ed o

bjec

ts d

emon

stra

tes

the

com

petit

ion

betw

een

the

gene

ral a

nd s

peci

fic

regu

lari

ties.

The

gen

eral

reg

ular

ity a

ctiv

ates

the

obje

cts

but t

he a

gent

-spe

cific

reg

ular

ities

red

uce

that

act

ivat

ion,

Thr

ee, t

he s

et o

fin

appr

opria

te/u

ntra

ined

obj

ects

for

each

act

ion

is tu

rned

off.

The

dif

fere

nce

in a

ctiv

atio

n be

twee

n th

e ap

prop

riate

/unt

rain

ed a

nd in

ap-

prop

riat

e/un

trai

ned

obje

cts

unde

rsta

tes

thei

r di

ffer

ence

in th

e ne

twor

k, A

s th

eactivation of a unit approaches 1 or 0, the non-linearity of the activation

func

tion

requ

ires

that

exp

onen

tially

mor

e ac

tivat

ion

be a

dded

to m

ove

clos

erto 1 or 0, A large change in the net input

, the

refo

re, w

ould

be

requ

ired

tore

duce

the

activ

atio

n le

vel o

f an

app

ropr

iate

/unt

rain

ed o

bjec

t to

that

of

anin

appr

opri

ate/

untr

aine

d ob

ject

. Con

sequ

ently

, the

two

type

s of

obj

ects

are

substantially different

, and

the

gene

ral r

egul

arity

is h

avin

g a

sign

ifica

nt im

pact

on p

redi

ctio

n,In

tere

stin

gly,

ther

e is

a tr

ade-

off d

uring training between compositional

gene

raliz

atio

n an

d pr

edic

tive

gene

raliz

atio

n, D

urin

g tr

aini

ng th

e m

odel

im-

prov

es it

s ab

ility

to c

ompo

se n

ovel

sen

tenc

es, b

ut lo

ses

the

abili

ty to

mak

ege

nera

l pre

dict

ions

, The

gen

eral

pre

dict

ions

are

lost

as

the

mod

el le

arns

the

Table 2

Sem

antic

pre

dict

ion

App

ropr

iate

ness

Tra

inin

g

Tra

ined

Unt

rain

ed

App

ropr

iate

Inap

prop

riat

e21

01I

36

(XI6

common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 15: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

245

spec

ific

reg

ular

ities

of

the

trai

ning

cor

pus.

Com

posi

tiona

lity

is g

aine

d as

the

inpl

lt co

nstit

uent

s be

com

e le

xem

e-lik

e. A

s th

e ne

twor

k tr

ains

, it s

low

ly le

arns

whi

ch p

arts

of

the

inpu

t are

res

pons

ible

for

whi

ch p

arts

of

the

outp

ut. T

hene

twor

k sl

owly

hon

es th

e m

eani

ngs

of th

e in

put c

onst

ituen

ts u

ntil

they

are

esse

ntia

lly r

epre

sent

ed a

s le

xem

es.

The

mod

el's

trai

ning

pro

cedu

re e

ncou

rage

s pr

edic

tive

gene

raliz

atio

n fo

rbo

th ty

pes

of r

egul

ariti

es b

y ex

plic

itly

requ

iring

the

mod

el to

pre

dict

the

obje

ctaf

ter

proc

essi

ng th

e ag

ent a

nd a

ctio

n. R

ecal

l tha

t the

mod

el must answer

ques

tions

abo

ut th

e en

tire

sent

ence

aft

er p

roce

ssin

g ea

ch c

onst

ituen

t. T

hem

odel

lear

ns th

at a

ll of

the

untr

aine

d ob

ject

s, b

oth

appr

opria

te a

nd in

appr

op-

riat

e, s

houl

d no

t be

pred

icte

d be

caus

e th

ey n

ever

occ

ur.

It m

ay s

eem

that

the

mod

el d

iver

ges

from

hum

an b

ehav

ior

as it

red

uces

the

infl

uenc

e of

the

gene

ral r

egul

arity

in f

avor

of

the

pers

on-s

peci

fic

regu

lari

ties.

It

mus

t be

rem

embe

red

, tho

ugh

, tha

t the

mod

el is

trai

ned

on a

set

of

only

pe

ople

. If

ther

e w

ere

man

y m

ore

peop

le, t

he p

erso

n-sp

ecif

ic r

egul

ariti

es w

ould

rece

ive

less

pra

ctic

e an

d th

e ge

nera

l reg

ular

ity w

ould

rece

ive

mor

e pr

actic

e.T

his

chan

ge w

ould

mak

e th

e ag

e ge

nera

lizat

ion

muc

h st

rong

er.

The

mod

el c

an c

lear

ly le

arn

and

prod

uctiv

ely

appl

y re

gula

ritie

s fr

om it

str

aini

ng c

orpu

s to

nov

el s

ente

nces

. How

ever

, the

deg

ree

to w

hich

the

mod

elco

mpo

ses

nove

l sen

tenc

es, 9

7% in

the

synt

actic

cor

pus,

but

onl

y 86

% in

the

semantic corpus, i

s fa

r fr

om th

e de

gree

we

expe

ct f

rom

peo

ple.

Aga

in, t

his

perf

orm

ance

may

be

due

to th

e si

ze a

nd c

onte

nt o

f the

cor

pus.

The

effe

cts

ofth

ese

fact

ors

on g

ener

aliz

atio

n ar

e no

t wel

l und

erst

ood.

Our

sim

ulat

ion

shou

ldbe

take

n on

ly a

s an

indi

catio

n th

at s

ome

degr

ee o

f com

posi

tiona

lity

can

beac

quir

ed.

As

dem

onst

rate

d by

lear

ning

the

rule

for

the passive voice construction, it

can

lear

n sy

ntac

tic r

egul

ariti

es, a

nd a

s de

mon

stra

ted

by le

arni

ng th

e m

eani

ngof each word and the hidden-fe

atur

e ag

e, it

can

lear

n se

man

tic r

egul

ariti

es.

Gen

eral

izat

ion

base

d on

thes

e re

gula

ritie

s ta

kes

two

form

s. W

hen

the

regu

lari

-ty consists of input/output pairings, g

ener

aliz

atio

n le

ads

to c

ompo

sitio

nalit

y.In

put/o

utpu

t pai

ring

s ar

e lik

e le

xem

es in

that

the

spec

ific

inpu

t con

stra

ins

asp

ecifi

c pa

rt o

f the

out

put.

The

re a

re fe

w, i

f an

y, c

onst

rain

ts o

n ot

her

part

s of

the

outp

ut. L

exem

es w

ill c

ompo

se in

nov

el c

ombi

natio

ns e

asily

bec

ause

eac

hon

e m

akes

a s

epar

ate

and

inde

pend

ent c

ontr

ibut

ion

to th

e in

terp

reta

tion.

Whe

n th

e re

gula

rity

con

sist

s of

inpu

t/inp

ut p

airi

ngs,

gen

eral

izat

ion

can

lead

to prediction. For input/input pairings, one part of the input affects multiple

part

s of

the

outp

ut. S

ome

of th

e pa

rts

may

be

futu

re c

onst

ituen

ts, s

o th

e in

put,

in effect, p

redi

cts

futu

re in

puts

. The

se in

put/i

nput

pai

ring

s m

ay n

ot c

ompo

seea

sily

. The

par

ts m

ay m

ake

mut

ually

con

trad

icto

ry p

redi

ctio

ns. F

or e

xam

ple

in, "

Bobby watched the news

" "

Bob

by" predicts that

the news

will

not

be

wat

ched

, and

"the news" predicts that

Bob

by,

a ch

ild, w

ill n

ot b

e w

atch

ing.

The

se c

ontr

adic

tory

pre

dict

ions

will

cre

ate

conf

lict i

n th

e se

nten

ce g

esta

lt an

dm

ake

the

inte

rpre

tatio

n ha

rd to

rep

rese

nt.

246

F. S

T. J

OH

N A

ND

J.L

. McC

LELL

AN

D

Add

ition

ally

, a n

ovel

pai

ring

may

not

com

pose

eas

ily b

ecau

se th

e co

nstr

aint

ssp

ecifi

c to

a p

air

may

not

hav

e be

en le

arne

d. In

oth

er w

ords

, the

re m

ay b

eco

nstr

aint

s, in

the

envi

ronm

ent,

that

per

tain

to a

pai

r of

inpu

t con

stitu

ents

. If

the

mod

el h

as n

ever

exp

erie

nced

this

pai

r, it

can

not k

now

thes

e co

nstr

aint

s,an

d th

ey w

ill n

ot a

ffec

t the

inte

rpre

tatio

n.

8. V

aria

ble

synt

actic

fram

es

The

mod

el is

abl

e to

lear

n an

d us

e sy

ntac

tic in

form

atio

n. I

t can

cor

rect

ly a

pply

synt

actic

info

rmat

ion

to. a

ssig

n thematic roles in sentences with re

vers

ible

verb

s. B

ut h

ow c

ompl

ex .

synt

ax c

an th

e m

odel

lear

n? S

ince

the

mod

el c

an o

nly

represent simple sentences in the output layer, it

cann

ot le

arn

or u

se th

e

com

plex

syn

tax

invo

lved

in s

ente

nces

with

em

bedd

ed c

laus

es. T

he s

ynta

x of

sentences involving "

gave

" however, is relatively co

mpl

ex b

ecau

se o

f th

eva

riab

ility

in th

e lo

catio

n of

rol

es in

the

sent

ence

. Bec

ause

it n

eed

only

invo

lve

sim

ple

sent

ence

s, it

is representable in the output la

yer,

and

bec

ause

it is

repr

esen

tabl

e, it

can

be

used

as

the

targ

et f

or e

rror

cor

rect

ion.

The

que

stio

n is

,w

ill th

e m

odel

be

able

to le

arn?

We

crea

ted

a co

rpus

con

sist

ing

of th

e di

ffer

ent l

egal

con

stru

ctio

ns o

f "g

ave.

The

bus

driv

er g

ave

the

rose

to th

e te

ache

r.T

he b

usdr

iver

gav

e th

e te

ache

r th

e ro

se.

The

teac

her

was

giv

en th

e ro

se b

y th

e bu

sdriv

er.

The

ros

e w

as g

iven

to th

e te

ache

r by

the

busd

river

.T

he r

ose

was

giv

en b

y th

e bu

sdri

ver

to th

e te

ache

r.

The

cor

pus

cons

iste

d of

56

even

ts o

f th

is ty

pe. E

ach

even

t was

equ

ally

fre

quen

tso

that

ther

e w

ere

no s

eman

tic r

egul

ariti

es fo

r th

e m

odel

to d

etec

t and

use

tohe

lp it

ass

ign

the

agen

t and

rec

ipie

nt th

emat

ic r

oles

. The

mod

el w

as a

ble

tom

aste

r th

is c

orpu

s. I

t lea

rned

to c

orre

ctly

ass

ign

each

of

the

them

atic

rol

es in

the event described by the sentence.

The

re a

re s

till m

any

mor

e ph

enom

ena

on w

hich

the

mod

el h

as n

ot b

een

trai

ned

or te

sted

. The

gen

eral

poi

nt, t

houg

h, i

s th

at th

e m

odel

app

ears

to h

ave

the

capa

bilit

y to

lear

n an

d us

e sy

ntac

tic c

onst

rain

ts p

rodu

ctiv

ely.

Whe

re it

slim

its a

re is

pre

sent

ly u

nkno

wn.

4, Discussion

The

SG

mod

el h

as b

een

quite

suc

cess

ful i

n m

eetin

g th

e go

als

that

we

set o

utfo

r it,

but

it is

of

cour

se f

ar f

rom

bei

ng th

e fi

nal w

ord

on s

ente

nce

com

preh

en-

sion

. Her

e w

e br

iefly

review the model's accomplishments. Following th

is

revi

ew, w

e co

nsid

er s

ome

of it

s lim

itatio

ns a

nd h

ow th

ey m

ight

be

addr

esse

d by

furt

her

wor

k.

common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 16: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

247

1. A

ccom

plis

hmen

ts o

f th

e m

odel

One

of

the

prin

cipl

e su

cces

ses

of th

e SG

mod

el is that it correctly assigns

cons

titue

nts

to th

emat

ic r

oles

bas

ed o

n sy

ntac

tic a

nd s

eman

tic c

onst

rain

ts, T

hesy

ntac

tic c

onst

rain

ts a

re m

ore

diff

icul

t for the model to master than the

sem

antic

con

stra

ints

eve

n th

ough

we

have

pro

vide

d ex

plic

it cu

es to

the

synt

axin

the

form

of

the

surf

ace

loca

tion

of th

e co

nstit

uent

s , in

the

inpu

t. T

he m

odel

does

, how

ever

, com

e to

mas

ter

thes

e co

nstr

aint

s as

they

are

exe

mpl

ified

in th

eco

rpus

of

trai

ning

sen

tenc

es, T

houg

h sy

ntac

tic c

onst

rain

ts c

an b

e si

gnif

ican

tlym

ore

subt

le th

an th

ose

our

mod

el h

as f

aced

thus

far

, tho

se it

has

fac

ed a

refa

irly

dif

ficu

lt. T

o co

rrec

tly l1

andl

e ac

tive

and

pass

ive

sent

ence

s, the model

mus

t map

sur

face

con

stitu

ents

ont

o di

ffer

ent r

oles

dep

endi

ng o

n th

e pr

esen

ceof

var

ious

sur

face

cue

s el

sew

here

in th

e se

nten

ce,

The

mod

el a

lso

exhi

bits

con

side

rabl

e ca

paci

ty to

use

con

text

to d

isam

bigu

ate

meanings and to in

stan

tiate

vag

ue te

rms

in contextually appropriate ways,

Inde

ed, it is pr

obab

ly m

ost a

ppro

pria

te to

vie

w th

e m

odel

as

trea

ting

each

constituent in a sentence as a clue or set of cl

ues

that

con

stra

in th

e ov

eral

lev

ent d

escr

iptio

n, r

athe

r th

an a

s tr

eatin

g ea

ch c

onst

ituen

t as

a le

xica

l ite

m w

itha

part

icul

ar m

eani

ng, A

lthou

gh e

ach

clue

may

pro

vide

str

onge

r co

nstr

aint

s on

som

e as

pect

s of

the

even

t des

crip

tion

than

on

othe

rs, i

t is simply not the case

that

the

mea

ning

ass

ocia

ted

with

the

part

of the event designated by each

cons

titue

nt is

con

veye

d by

onl

y th

at c

onst

ituen

t its

elf,

The

mod

el li

kew

ise

infe

rs u

nspe

cifie

d ar

gum

ents

rou

ghly

to th

e ex

tent

that

they

can

be

relia

bly

pred

icte

d fr

om th

e co

ntex

t, H

ere

we

see

very

cle

arly

that

cons

titue

nts

of a

n ev

ent d

escr

iptio

n ca

n be

cue

d w

ithou

t bei

ng s

peci

fica

llyde

sign

ated

by

any

cons

titue

nt o

f th

e se

nten

ce, T

hese

infe

renc

es a

re g

rade

d to

refle

ct th

e de

gree

to w

hich

they

are

app

ropr

iate

giv

en th

e se

t of c

lues

pro

vide

d,T

he d

raw

ing

of th

ese

infe

renc

es is

als

o co

mpl

etel

y in

trin

sic to the basic

com

preh

ensi

on p

roce

ss: n

o sp

ecia

l sep

arat

e in

fere

nce

proc

esse

s m

ust b

e sp

aw-

ned

to m

ake

infe

renc

es, t

hey

sim

ply

occu

r im

plic

itly

as' the constituents of the

sent

ence

are

pro

cess

ed,

The

mod

el d

emon

stra

tes

the

capa

city

to u

pdat

e its

.rep

rese

ntat

ion

as e

ach

new

con

stitu

ent i

s en

coun

tere

d, O

ur d

emon

stra

tion

of th

is aspect of the

mod

el's

per

form

ance

is s

omew

hat i

nfor

mal

; nev

erth

eles

s , it

s ca

pabi

litie

s se

emim

pres

sive

, As

each

con

stitu

ent i

s en

coun

tere

d, t

he in

terp

reta

tion

of a

ll as

pect

sof

the

even

t des

crip

tion

is s

ubje

ct to

cha

nge,

If

we

reve

rt to

thin

king

in te

rms

of m

eani

ngs

of p

artic

ular

con

stitu

ents

, bot

h pr

ior

and

subs

eque

nt c

onte

xt c

anin

fluen

ce th

e in

terp

reta

tion

of e

ach

cons

titue

nt. U

nlik

e m

ost c

onve

ntio

nal

sent

ence

pro

cess

ing

mod

els

, the

abi

lity

to e

xplo

it su

bseq

uent

con

text

is a

gain

an in

trin

sic

part

of

the

proc

ess

of in

terp

retin

g ea

ch n

ew c

onst

ituen

t. T

here

isno

bac

ktra

ckin

g; r

athe

r , the representation of the sentence is simply up

date

dto

ref

lect

the

cons

trai

nts

impo

sed

by e

ach

cons

titue

nt a

s it

is e

ncou

nter

ed,

Whi

le a

void

ing

back

trac

king

, the

mod

el also avoids the computational

248

F. S

T, J

OH

N A

ND

j,L.

McC

LELL

AN

D

expl

osio

n of

com

putin

g ea

ch p

ossi

ble

inte

rpre

tatio

n of

a s

ente

nce

as it

enc

oun-

ters

am

bigu

ous

wor

ds a

nd th

emat

ic r

ole

assi

gnm

ents

, A s

impl

er m

odel

hel

psex

plai

n ho

w th

e S

G m

odel

avo

ids

thes

e du

al p

itfal

ls, K

awam

oto

(16)

des

crib

esan

aut

o-as

soci

ativ

e m

odel

of

lexi

cal a

cces

s, I

n hi

s m

odel

, pat

tern

s of

act

ivat

ion

repr

esen

t a w

ord

and

its m

eani

ng, F

or e

ach

ambi

guou

s w

ord,

ther

e ar

e tw

opa

ttern

s, e

ach

repr

esen

ting

the association of the w

ord

with

one

of i

tsm

eani

ngs,

Pos

itive

wei

ghts

inte

rcon

nect

the

units

rep

rese

ntin

g a

patte

rn, a

ndne

gativ

e w

eigh

ts in

terc

onne

ct u

nits

bet

wee

n pa

ttern

s,W

hen

an a

mbi

guou

s w

ord

is p

roce

ssed

, it initially ac

tivat

es s

eman

tic u

nits

repr

esen

ting

both

mea

ning

s, T

he r

esul

ting

patte

rn o

f ac

tivat

ion

is a

com

bina

-tion of the semantic features of both m

eani

ngs,

so

the

bind

ings

am

ong

the

features of a meaning are lost in the ac

tivat

ion

patte

rn, T

hese

bin

ding

s,ho

wev

er a

re preserved in the weights, a

nd th

e m

odel

will

set

tle in

to o

nein

terp

reta

tion

of th

e w

ord

or th

e ot

her,

One

can think of the al

tern

ativ

ein

terp

reta

tions

as

min

ima

in a

n en

ergy

land

scap

e, T

he in

itial

pat

tern

of

activ

atio

n fa

lls o

n th

e hi

gh e

nerg

y rid

ge b

etw

een

the

min

ima,

The

set

tling

proc

ess

mov

es th

e pa

ttern

of

activ

atio

n do

wn

one

side

of

the

ridg

e to

one

of

the

min

ima,

The

SG

mod

el d

oes

not s

ettle

, but

the

idea

is s

imila

r. W

hen

ther

e is

not

suff

icie

nt in

form

atio

n to

res

olve

an

ambi

guity

, the

sen

tenc

e in

terp

reta

tions

may

beco

me

conf

late

d in

the

patte

rn o

f ac

tivat

ion

over

the

sent

ence

ges

talt

and

inth

e ou

tput

res

pons

es to

pro

bes,

The

bin

ding

s w

ithin

an

inte

rpre

tatio

n, h

ow-

ever, are preserved in the w

eigh

ts, W

hen

suffi

cien

t new

info

rmat

ion

for

disa

mbi

guat

ion

is p

rovi

ded

, the

sen

tenc

e ge

stal

t com

pute

s a

sing

le in

terp

reta

-tio

n w

ith c

orre

ct th

emat

ic r

ole

and

sem

antic

fea

ture

bin

ding

s, F

or e

xam

ple,

give

n "T

he a

dult

ate,

" and with

agen

t as the probe, the model pa

rtia

llyac

tivat

es

busd

rive

r, teacher, male,

and

fem

ale,

W

hich

is m

ale

and

whi

ch is

fem

ale

is lo

st in

the

activ

atio

n pa

ttern

ove

r th

e ou

tput

laye

r, (

It is

not

cle

ar to

wha

t ext

ent t

he b

indi

ngs

are

lost

in th

e ac

tivat

ion

patte

rn o

ver

the

sent

ence

gest

alt.

In g

ener

al, t

he r

epre

sent

atio

n ov

er th

e se

nten

ce g

esta

lt is

an

area

for

further inquiry,),

Aft

er th

e m

odel

is f

urth

er g

iven

, , ,

the

stea

k w

ith g

usto

" it

is c

lear

that

the

agen

t is

the

busd

rive

r. W

hen

prob

ed f

or th

e ag

ent,

the

mod

elunambiguously activates

busd

rive

r in

the

outp

ut la

yer,

For the unambiguous sentences in the corpus, the model predominately

achi

eves

the

corr

ect b

indi

ngs

and

inte

rpre

tatio

ns, F

or th

e am

bigu

ous

sent

-en

ces,

the

mod

el c

onfla

tes,

in th

e ou

tput

laye

r, the patterns for each possible

role

fill

er f

or th

e ro

le b

eing

pro

bed,

The

oret

ical

ly, t

he m

odel

sho

uld

set t

heac

tivat

ions

to m

atch

the

cond

ition

al p

roba

bilit

ies

for

the

aspe

cts

of th

e ev

ent

that

rem

ain

unde

rspe

cifi

ed, I

nste

ad, t

hese

act

ivat

ions

tend

to v

acill

ate

base

d on

rece

nt, r

elat

ed tr

aini

ng tr

ials

, Thi

s va

cilla

tion

tow

ard

alte

rnat

e in

terp

reta

tions

is r

emin

isce

nt o

f th

e fr

eque

nt f

indi

ng th

at h

uman

s ge

nera

lly d

o no

t not

ice

the

ambi

guity

of

sent

ence

s, I

nste

ad, t

hey

gene

rally

set

tle f

or o

ne in

terp

reta

tion

or

common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 17: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

249

the

othe

r, u

nles

s th

eir

atte

ntio

n is

exp

licitl

y dr

awn

to th

e am

bigu

ity, I

n su

mthe long-te

rm, a

vera

ge p

roba

bilit

ies

of p

icki

ng p

artic

ular

inte

rpre

tatio

ns m

ayre

flect

the

stat

istic

al p

rope

rtie

s of

the environment, while the m

omen

t to

mom

ent f

luct

uatio

ns o

f in

terp

reta

tions

ref

lect

rec

ent e

xper

ienc

e,T

he g

radu

al, i

ncre

men

tal l

earn

ing

capa

bilit

ies

of th

e ne

twor

k un

derl

ie it

sab

ility

to s

olve

the

boot

stra

ppin

g pr

oble

m, that is, to learn simultaneously

about both the syntax and semantics of co

nstit

uent

s, T

he p

robl

em o

f lea

rnin

gsy

ntax

and

sem

antic

s is

cen

tral

for

dev

elop

men

tal p

sych

olin

guis

tics,

Nai

gles

Gle

itman

and

Gle

itman

(25

) st

ate

that

lear

ning

syn

tax

and

sem

antic

s us

ing

only

stat

istic

al in

form

atio

n se

ems impossible because, "at a minimum, i

t wou

ldre

quir

e su

ch e

xten

sive

sto

rage

and

man

ipul

atio

n of

con

tinge

ntly

cat

egor

ized

even

t/con

vers

atio

n pa

irs a

s to

be

unre

alis

tic,"

Yet

it is

exa

ctly

by

usin

g su

chin

form

atio

n th

at o

ur m

odel

sol

ves

the

prob

lem

, The

mod

el le

arns

the

synt

axan

d se

man

tics

of th

e tr

aini

ng c

orpu

s si

mul

tane

ousl

y, A

cros

s tr

aini

ng tr

ials

, the

mod

el g

radu

ally

lear

ns w

hich

asp

ects

of

the

even

t des

crip

tion

each

con

stitu

ent

of th

e in

put c

onst

rain

s an

d in

wha

t way

s it

cons

trai

ns th

ese

aspe

cts,

The

pro

blem

of discovering which event in the world a sentence describes

whe

n m

ultip

le e

vent

s ar

e pr

esen

t wou

ld b

e ha

ndle

d in

a s

imila

r w

ay, t

houg

h w

eha

ve n

ot m

odel

ed it

. Aga

in, t

he a

spec

ts o

f th

e w

orld

that

the

sent

ence

act

ually

desc

ribe

s w

ould

be

disc

over

ed g

radu

ally

ove

r re

peat

ed tr

ials, while those

aspe

cts

that

spu

riou

sly

co-o

ccur

with

thes

e de

scri

bed

aspe

cts

wou

ld w

ash

out.

For both the bootstrapping and the am

bigu

ous

refe

renc

e pr

oble

m, t

hen

, our

mod

el ta

kes

a gr

adua

l, s

tatis

tical

app

roac

h, W

e do

not

wan

t to

over

stat

e th

ecase here

, sin

ce th

e ch

ild le

arni

ng a

lang

uage

con

fron

ts a

con

side

rabl

y m

ore

com

plex

ver

sion

of t

hese

pro

blem

s th

an o

ur m

odel

doe

s, O

ur s

ente

nces

are

pre-

segm

ente

d in

to c

onst

ituen

ts, are very simple in structure

, and

are

muc

hfe

wer

in n

umbe

r th

an th

e se

nten

ces

a ch

ild w

ould

hea

r, H

owev

er, t

he r

esul

tsde

mon

stra

te th

at th

e bo

otst

rapp

ing

and

ambi

guou

s re

fere

nce

prob

lem

s m

ight

ultim

atel

y be

ove

rcom

e by

an

exte

nsio

n of

the

pres

ent a

ppro

ach,

Man

y of

the

acco

mpl

ishm

ents

of

the

SG m

odel

are

sha

red

by p

rede

cess

ors,

Cot

trel

l, (

5) C

ottr

ell a

nd S

mal

l (6)

, Wal

tz a

nd P

olla

ck (

35),

and

McC

lella

nd a

ndK

awam

oto

(21)

hav

e al

l demonstrated the use of sy

ntac

tic a

nd s

eman

ticco

nstr

aint

s in

rol

e as

sign

men

t and

mea

ning

dis

ambi

guat

ion,

Of t

hese

, the

firs

ttw

o em

bodi

ed th

e im

med

iate

upd

ate

prin

cipl

e, b

ut d

id n

ot le

arn,

whi

le th

eth

ird le

arne

d in

a li

mite

d w

ay, a

nd h

ad a

fix

ed s

et o

f in

put s

lots

,T

he g

reat

er le

arni

ng c

apab

ility

of

our

mod

el a

llow

s it

to f

ind

conn

ectio

nst

reng

ths

that

sol

ve th

e co

nstr

aint

s em

bodi

ed in

the

corp

us w

ithou

t req

uirin

gth

e m

odel

er to

indu

ce th

ese

cons

trai

nts

and

with

out t

he m

odel

er tr

ying

to b

uild

them

in b

y ha

nd. I

t als

o al

low

s th

e m

odel

to c

onst

ruct

its

own

repr

esen

tatio

nsin the sentence ge

stal

t, an

d th

is a

bilit

y al

low

s th

ese representations to be

cons

ider

ably

mor

e co

mpa

ct th

an in

oth

er c

ases

,S

ome

prev

ious

mod

els

have

use

d conjunctive representations in which

250

F, S

T, J

OH

N A

ND

J,L

. McC

LELL

AN

D

role

/fill

er p

airs

are

exp

licitl

y re

pres

ente

d by

uni

ts pre-assigned to represent

eith

er s

peci

fic

role

/fill

er p

airs

(5,

35) or pa

rtic

ular

com

bina

tions

of

role

feat

ures

and

fill

er f

eatu

res

(21)

, Par

ticul

arly

, whe

n su

ch r

epre

sent

atio

ns a

reex

tend

ed s

o th

at tr

iple

s, r

athe

r th

an s

impl

y pa

irs, c

an b

e re

pres

ente

d (3

033

),

thes

e ne

twor

ks c

an b

ecom

e in

trac

tabl

y la

rge

even

with

sm

all v

ocab

ular

ies,

The

pres

ent m

odel

avo

ids

intr

acta

ble

size

by

lear

ning

to u

se it

s re

pres

enta

tiona

lca

paci

ty s

pari

ngly

to r

epre

sent

just

thos

e ro

le/f

iller

pai

ring

s th

at a

re c

onsi

sten

tw

ith it

s ex

peri

ence

, Thi

s ab

ility

pre

vent

s th

e m

odel

fro

m b

eing

abl

e to

repr

esen

t tot

ally

arb

itrar

y, ~vents: its representational capacities are strongly

cons

trai

ned

by th

e ra

nge

of it

s ex

peri

ence

, In

this

reg

ard

the

mod

el s

eem

ssi

mila

r to

hum

ans:

it is

wid

ely

know

n th

at h

uman

com

preh

ensi

on is

str

ongl

yin

fluen

ced

by e

xper

ienc

e (2

4),

The

maj

or s

imul

atio

n re

port

ed h

ere

cont

aine

d ex

plic

it su

rfac

e ro

le m

arki

ngs

in th

e in

put.

The

se markings were designed to make the learning of the

synt

actic

info

rmat

ion

easi

er. A

dditi

onal

sim

ulat

ions

, tho

ugh

, sho

wed

that

the

netw

ork

coul

d le

arn

the

synt

actic

info

rmat

ion

from

onl

y th

e te

mpo

ral s

e-qu

ence

s of

con

stitu

ents

, In

a di

ffer

ent t

ask

, whe

re th

e ne

twor

k m

ust a

ttem

pt to

antic

ipat

e th

e ne

xt in

put,

ther

e ha

ve b

een

seve

ral d

emon

stra

tions

that

net

-w

orks

can

lear

n to

kee

p tr

ack

of p

arse

pos

ition

, at l

east

for

smal

l fin

ite-s

tate

gram

mar

s (7

,29)

, Ext

ract

ing

info

rmat

ion

from

the

tem

pora

l ord

er o

f inp

utse

quen

ces,

then, is a function generally within the co

mpu

tatio

nal l

imits

of

recu

rren

t net

wor

ks,

Fin

ally

, the

mod

el is

abl

e to

generalize the processing kn

owle

dge

it ha

slearned and apply it to no

vel s

ente

nces

, Gen

eral

izat

ion

can

com

e in

two

varie

ties:

com

posi

tiona

l and

pre

dict

ive

gene

raliz

atio

n. C

ompo

sitio

nal g

ener

ali-

zatio

n oc

curs

whe

n th

e m

odel

has

lear

ned

the

cons

trai

nts

on s

ente

nce

inte

rpre

-ta

tion

cont

ribu

ted

by e

ach

elem

ent o

f th

e se

nten

ce, i

nclu

ding

bot

h sy

ntax

and

sem

antic

s, and has learned ho

w to

com

bine

that

info

rmat

ion

in n

ovel

se-

quen

ces,

The

clu

ster

ana

lysi

s of

the

inpu

t wei

ghts

con

fim

s th

at th

e ne

twor

k is

lear

ning

the

sem

antic

con

stra

ints

impo

sed

by c

onst

ituen

ts, I

t see

ms

likel

y th

at a

cons

ider

able

par

t of t

he s

peci

ficat

ion

of th

ese

cons

trai

nts

mig

ht b

e de

rivab

le b

yth

e ne

twor

k fr

om e

xper

ienc

e on

a s

ubse

t of

the

poss

ible

con

text

s w

here

a w

ord

can

occu

r, T

he in

terp

reta

tion

acqu

ired

in th

ese

expe

rien

ces

wou

ld th

en c

ause

the

new

wor

d to

beh

ave

like

othe

r si

mila

r w

ords

in c

onte

xts

in w

hich

it w

as n

ottrained, lllustrations that back pr

opag

atio

n ne

twor

ks c

an g

ener

aliz

e in

this

way

are

prov

ided

by

Hin

ton

(12)

, Tar

aban

, McD

onal

d an

d M

acW

hinn

ey (

31),

and

Rum

elha

rt (

27),

In

both

the

synt

actic

and

the

sem

antic

gen

eral

izat

ion

corp

ora,

the

mod

el d

emon

stra

tes

its a

bilit

y to

lear

n th

is in

form

atio

n an

d co

mpo

se it

.T

he s

econ

d va

riet

y of

generalization is predictive generalization. It occurs

whe

n th

e m

odel

can

use

a r

egul

arity

to p

redi

ct u

pcom

ing

sent

ence

con

stitu

ents

.In the semantic ge

nera

lizat

ion

expe

rimen

t, the model le

arne

d re

gula

ritie

sab

out t

he a

ge o

f ag

ents

in th

e co

rpus

, The

mod

el th

en u

sed

thes

e re

gula

ritie

sto predict appropriate objects,

common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 18: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

251

2. D

efic

ienc

ies

and

limita

tions

of t

he m

odel

The

mod

el h

as s

ever

al li

mita

tions

and

a fe

w o

bvio

us d

efic

ienc

ies.

The

mod

elon

ly a

ddre

sses

a li

mite

d nu

mbe

r of

lang

uage

phe

nom

ena,

It d

oes

not a

ddre

ssqu

antif

icat

ion

, ref

eren

ce a

nd c

o-re

fere

nce

, coo

rdin

ate

cons

truc

tions

, or

man

yot

her

phen

omen

a. P

erha

ps th

e m

ost i

mpo

rtan

t lim

itatio

n is

the

limita

tion

onthe complexity of the sentences, and of the events that they de

scrib

e, In

gene

ral ,

it is ne

cess

ary

to c

hara

cter

ize

the

role

s an

d fi

llers

of

sent

ence

s w

ithre

spec

t to

thei

r su

pero

rdin

ate

cons

titue

nts.

Sim

ilarly

in c

ompl

ex e

vent

s, t

here

may

be

mor

e th

an o

ne a

ctor

, eac

h pe

rfor

min

g an

action in a di

ffer

ent

sub-

even

t of

the

over

all e

vent

or

actio

n. R

epre

sent

ing

thes

e st

ruct

ures

req

uire

she

ad/r

ole/

fille

r tr

iple

s in

stea

d of

sim

ple

role

/fill

er p

airs

.O

ne s

olut

ion

is to

trai

n th

e m

odel

usi

ng tr

iple

s ra

ther

than

pai

rs a

s th

ese

nten

ce a

nd e

vent

con

stitu

ents

, The

dif

ficu

lty li

es in

specifying the non-

sent

ence

mem

bers

of

the

trip

les.

The

se n

on-s

ente

nce

mem

bers

wou

ld s

tand

for

entir

e st

ruct

ures

, Thu

s th

ey w

ould

be

very

muc

h lik

e th

e pa

ttern

s th

at w

e ar

ecu

rren

tly u

sing

as

sent

ence

ges

talts

, It w

ould

be

desi

rabl

e to

hav

e th

e le

arni

ngpr

oced

ure

indu

ce th

ese

repr

esen

tatio

ns, but this is a bo

otst

rapp

ing

prob

lem

that

we

have

not

yet

atte

mpt

ed to

sol

ve.

Ano

ther

lim

itatio

n of

the

mod

el is

the

use

of lo

cal r

epre

sent

atio

ns b

oth

for

conc

epts

and

for

rol

es, T

he p

rese

nt m

odel

use

d pr

edom

inan

tly lo

cal r

epre

-se

ntat

ions

of

conc

ept m

eani

ngs

only

for

con

veni

ence

; in

real

ity w

e w

ould

supp

ose

that

the

conc

eptu

al r

epre

sent

atio

ns u

nder

lyin

g ev

ents

wou

ld b

e re

pre-

sent

ed b

y di

stri

bute

d pa

ttern

s (1

4), T

his

kind

of

repr

esen

tatio

n w

ould

hav

ese

vera

l adv

anta

ges.

Con

text

has

the

capa

bilit

y no

t onl

y of

sel

ectin

g am

ong

high

ly d

istin

ct m

eani

ngs

such

as

betw

een

flyi

ng b

ats

and

base

ball

bats

, but

als

ow

e be

lieve

, of shading meanings

, em

phas

izin

g ce

rtai

n fe

atur

es a

nd a

lterin

gpr

oper

ties

slig

htly

as

a fu

nctio

n of

con

text

(21

), B

oth

of th

ese

phen

omen

a ar

eea

sily

cap

ture

d if

we

view

the

repr

esen

tatio

n of

a c

once

pt a

s a

dist

ribut

ed~~

rn,

Sim

ilarly

, the

re a

re s

ever

al problems with the concept of role which are

solv

ed if

dis

trib

uted

rep

rese

ntat

ions

are

use

d. It

is. o

ften

dif

ficu

lt to

det

erm

ine

whe

ther

two

role

s ar

e th

e sa

me ,

and

it is

ver

y di

fficu

lt to

dec

ide

exac

tly h

owm

any

diffe

rent

rol

es th

ere

are.

If r

oles

wer

e re

pres

ente

d as

dis

trib

uted

patte

rns ,

thes

e is

sues

wou

ld s

impl

y fa

ll by

the

way

sidl

?' In

ear

lier

wor

k (2

1),

was

nec

essa

ry to

inve

nt d

istr

ibut

ed r

epre

sent

atio

ns f

or c

once

pts ,

but

rec

ently

anu

mbe

r of

res

earc

hers

hav

e sh

own

that

suc

h re

pres

enta

tions

can

be

lear

ned

(12

, 24

, 28)

. The

pro

cedu

re s

houl

d al

so a

pply

to d

istr

ibut

ed r

epre

sent

atio

ns o

fro

les,

A f

inal

lim

itatio

n is

the

smal

l siz

e of

the

corp

us u

sed

in tr

aini

ng th

e m

odel

.G

iven

the length of time required for tr

aini

ng, o

ne might be somewhat

pess

imis

tic a

bout

the

poss

ibili

ty th

at a

net

wor

k of

this

kin

d co

uld

mas

ter

asu

bsta

ntia

l cor

pus,

How

ever

, it should be

not

ed th

at th

e extent to which

252

F, S

T, J

OH

N A

ND

j,L.

McC

LELL

AN

D

lear

ning

tim

e gr

ows

with

cor

pus

size

is e

xtre

mel

y ha

rd to

pre

dict

for

con

nec-

tioni

st m

odel

s, a

nd is

hig

hly

prob

lem

dep

ende

nt. F

or s

ome

prob

lem

s (e

.pa

rity

), le

arni

ng ti

me

per

patte

rn in

crea

ses

mor

e th

an li

near

ly w

ith th

e nu

mbe

rof training patterns (3

2), w

hile

for

othe

r pr

oble

ms

(e.g, negation), learning

time

per

patte

rn a

ctua

lly c

an d

ecre

ase

as th

e nu

mbe

r of

pat

tern

s in

crea

ses

(27)

,

Whe

re th

e cu

rren

t pro

blem

fal

ls o

n th

is c

ontin

uum

is n

ot y

et k

now

n. A

com

pari

son

of th

e le

arni

ng ti

mes

bet

wee

n th

e ge

nera

l cor

pus

and

the

synt

actic

corpus used in the ge

nera

lizat

ion

expe

rim

ents

, how

ever

, is

sugg

estiv

e, T

he

netw

ork

requ

ired

630

,000

tria

ls to

lear

n th

e 12

0 events in the general corpus

(330

000

tria

ls to

lear

n al

l but

the

mos

t irr

egul

ar e

vent

s). O

n th

e ot

her

hand

the network required only 100,00

0 tr

ials

to learn the 2000 events of the

synt

actic

cor

pus.

The

syn

tact

ic c

orpu

s, o

f co

urse

, is

extr

emel

y re

gula

r, a

nd th

ere

gula

ritie

s ar

e co

mpo

sitio

nal.

Giv

en th

e m

odel

's g

ood

gene

raliz

atio

n re

sults

on th

e sy

ntac

tic c

orpu

s, it

is p

ossi

ble

that

the

mod

el w

ill s

cale

wel

l to

very

larg

e

corp

ora

if th

eir

regu

lari

ties

are

com

posa

ble,

One

fin

al d

efic

ienc

y of

the

mod

el is

its

tend

ency

to a

ctiv

ate

fille

rs f

or r

oles

that do not apply to a pa

rtic

ular

fra

me,

Thi

s tendency could perhaps be

over

com

e by

exp

licit

trai

ning

that

ther

e sh

ould

be

no o

utpu

t for

a p

artic

ular

role

, but

this

see

ms

inel

egan

t and

impr

actic

al, e

spec

ially

if w

e ar

e co

rrec

t in

believing that the set of roles is op

en-e

nded

, The

abs

ence

of

role

s se

ems

som

ehow

impl

icit

in e

vent

s, r

athe

r th

an e

xplic

itly

note

d. P

erha

ps e

vent

rep

re-

sent

atio

ns th

at p

rese

rved

mor

e de

tail

of th

e re

al-w

orld

eve

nt w

ould

pro

vide

the

rele

vant

impl

icit

cons

trai

nts,

S. Conclusion

The

SG

mod

el r

epre

sent

s an

othe

r st

ep in

wha

t will

sur

ely

be a

long

ser

ies

of

expl

orat

ions

of

conn

ectio

nist

mod

els

of la

ngua

ge p

roce

ssin

g, T

he m

odel

is a

nad

vanc

e in

our

vie

w, b

ut th

ere

is s

till a

ver

y lo

ng w

ay to

go.

The

nex

t ste

p is

tofi

nd w

ays

to e

xten

d th

e ap

proa

ch to

mor

e co

mpl

ex s

truc

ture

s an

d m

ore

extensive corpora, while increasing the rate of learning,

Appendix A. Input and Output R

epre

sent

atio

ns

Inpu

t

surf

ace

loca

tions

:pr

e-ve

rbal

, ver

bal,

pos

t-ve

rbal

-, p

ost-

verb

al-n

wor

dsco

nsum

ed, ate, d

rank

, stirred, s

prea

d, kissed, g

ave, hit, shot, t

hrew

, dro

ve

shed

, ros

e

common
Pencil
common
Pencil
common
Pencil
Page 19: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

253

som

eone

, adu

lt, c

hild

, dog

, bus

driv

er, t

each

er, s

choo

lgirl

, pitc

her,

spo

t

som

ethi

ng, f

ood

, ste

ak, s

oup,

ice

crea

m, c

rack

ers,

jelly

, ice

d-te

a, k

ool-a

id

uten

sil,

spoo

n, k

nife

, fin

ger,

gun

place, kitchen, living-room, park, bat, ball, bus, f

urgu

sto

, ple

asur

e, d

aint

ines

swith, in, t

o, b

yw

as

Out

put

role

s:agent, action, patient, instrument, co-agent, co-patient, location, adverb

reci

pien

t

actions and concepts:

ate,

dra

nk, s

tirre

d, s

prea

d, k

isse

d, g

ave, hit, s

hot,

thre

w(t

osse

d),

thre

w(h

oste

d), d

rove

(tra

nspo

rted

), d

rove

(mot

ivat

ed),

she

d(ve

rb),

rose

(ve

rb)

busd

rive

r, te

ache

r, s

choo

lgirl

, pitc

her(

pers

on),

spo

tst

eak,

sou

p, ic

e cr

eam

, cra

cker

s, jelly, iced-te

a, k

ool-a

idsp

oon,

kni

fe, f

inge

r, g

unki

tche

n, l

ivin

g-ro

om, s

hed(

noun

), p

ark

rose

(nou

n), b

at(a

nim

al),

bat

(bas

ebal

l), b

all(

sphe

re),

bal

l(pa

rty)

, bus

pitcher(container), fur

gust

o, p

leas

ure

, dai

ntin

ess

actio

n fe

atur

es:

cons

umed

, pas

sive

conc

ept f

eatu

res:

pers

on, a

dult,

chi

ld, d

og, m

ale

, fem

ale

thin

g, fo

od, u

tens

ilpl

ace, in-do

ors,

out

-doo

rs

App

endi

x B

. Sam

ple

Sent

ence

-Fra

me

Hit

In the sentence-fr

ame

belo

w, s

uper

ior

num

bers

1,

4 ha

ve th

e fo

llow

ing

mea

ning

:

lnc1ude a role in the input with this pr

obab

ility

o2

Cho

ose

this

fill

er w

ith th

is p

roba

bilit

y.3

Cho

ose

this

wor

d w

ith th

is p

roba

bilit

y T

he w

ord

appe

ars

in th

is p

repo

sitio

nal p

hras

e.

254

F, S

T. J

OH

N A

ND

j,L.

McC

LELL

AN

D

agent 100

2 bu

sdri

ver

703

adul

t 20

pers

on 1

0

verb 100

100 hit 100

patient 100

25 s

hed-

n 80

som

ethi

ng 2

0instrument SO

100 bus 80 something 20 with

40 b

all-

s 80

som

ethi

ng 2

0lo

catio

n S

O ' .

100

park

100

ininstrument SO

100

bat-b 80 something 20 with

10 b

at-a

80

som

ethi

ng 2

0location SO

100

shed

-n 1

00 in

instrument SO

100

bat-

b 80

som

ethi

ng 2

0 w

ith25

pitc

her-

p 70

chi

ld 2

0 pe

rson

10

location SO

100

park

100

ininstrument SO

100

ball-

s 80

som

ethi

ng 2

0 w

ith25 teacher 70 adult 20 person 10

verb 100

100 hit 100

patient 100

34 p

itche

r-c

80 s

omet

hing

20

location SO

100

kitc

hen

100

ininstrument SO

100 spoon 80 something 20 with

33 p

itche

r-p

70 c

hild

20

pers

on 1

0location SO

100

livin

g-ro

om 1

00 in

instrument SO

100

pitc

her-

c 80

som

ethi

ng 2

0 w

ith33

sch

oolg

irl 7

0 ch

ild 2

0 pe

rson

10

location SO

100

kitc

hen

100

ininstrument SO

100 spoon 80 something 20 with

25 p

itche

r-p

70 c

hild

20

pers

on 1

0

common
Pencil
common
Pencil
Page 20: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

255

verb 100

tOO hit 100

patient 100

40 b

all-

s 80

som

ethi

ng 2

0lo

catio

n 50

100

park

100

ininstrument 50

100

bat-b 80 something 20 with

10 b

at-a

80

som

ethi

ng 2

0lo

catio

n 50

100

shed

-n 1

00 in

instrument 50

100

bat-b 80 something 20 with

25 bus 80 something 20

location 50

100

park

100

ininstrument 50

100

ball-

s 80

som

ethi

ng 2

0 w

ith25

bus

driv

er 7

0 ad

ult 2

0 pe

rson

10

location 50

100

park

100

ininstrument 50

100

ball-

s 80

som

ethi

ng 2

0 w

ith25

sch

oolg

irl 7

0 ch

ild 2

0 pe

rson

10

verb 100

100 hit 100

patient 100

34 p

itche

r-c

80 s

omet

hing

20

location 50

100

kitc

hen

100

ininstrument 50

100 spoon 80 something 20 with

33 s

pot 8

0 do

g 20

loca

tion

5010

0 ki

tche

n 10

0 in

instrument 50

100 spoon 80 something 20 with

33 te

ache

r 70

adu

lt 20

per

son

10lo

catio

n 50

100

kitc

hen

100

ininstrument 50

100

spoo

n 80

som

ethi

ng 2

0 w

ith

256

F, S

T. J

OH

N A

ND

J.L

. McC

LELL

AN

D

RE

FER

EN

CE

S

I. R.c. Anderson and A. Ortony, On pu

tting

app

les

into

bot

tles:

A problem of polysemy,

Cognitive PJycllOl.

7 (1975) 167-

18U

,

2, F,c. Bartlett, Remembering: An Experimental and Social Study

(Cam

brid

ge U

nive

rsity

Pre

ss,

Cam

brid

ge, 1932).

3. P.A. Carpenter and M.

A. J

ust,

Rea

ding

com

preh

ensi

on a

s th

e ey

es s

ee it

, in:

M.A

, Jus

t and

A. C

arpe

nter

, eds

.Cognitive Processes in Comprehension

(Erl

baum

, Hillsdale, NJ, 1977).

4. W

G, C

hase

and

H.A. Simon, Perception in chess,

Cognitive Psychol.

4 (1

973)

55-

81.

5. G.W

. Cot

trel

l, A

con

nect

ioni

st a

ppro

ach

to w

ord

sens

e di

sam

bigu

atio

n, D

isse

rtat

ion,

Com

pu-

ter

Scie

nce

Dep

artm

ent,

Uni

vers

ity o

f R

oche

ster

, NY

(19

85).

6. G,W

. Cot

trel

l and

A,L

. Sm

all,

A c

onne

ctio

nist

sch

eme

for

mod

elin

g w

ord

sens

e di

sam

bigu

a-tio

nCognition and Brain Theory

6 (1

983)

89-

120,

7. J

.L. Elman, Finding structure in time, C

RL

Tec

h, R

ept.

8801, Center for Research in

Lan

guag

e, U

nive

rsity

of

Cal

ifor

nia,

San

Die

go, L

a Jo

lla, C

A (

1988

),8. T.

O. E

ricks

on a

nd M

.E. M

atts

on, F

rom

wor

ds to

mea

ning

: A s

eman

tic illusion, J.

Ver

bal

Learn. Verbal Behav.

20 (1981) 54U-

551.

9, c.J. Fillmore, T

he c

ase

for

case

, in:

E, B

ach

and

R,T, Harms, eds" Universals in Linguistic

The

ory

(Hol

t, N

ew Y

ork,

196

8),

10. L

. R. G

leitm

an a

nd E

. Wan

ner,

Lan

guag

e ac

quis

ition

: The

sta

te o

f th

e st

ate

of th

e ar

t, in

: E.

Wanner and L.R. Gleitman, eds.,

Language Acquisition: The State of the Art

(Cam

brid

geUniversity Press, C

ambr

idge

, MA

, 198

2),

11. G

.E. H

into

n, I

mpl

emen

ting

sem

antic

net

wor

ks in

par

alle

l har

dwar

e, in

: G.E

. Hin

ton

and

J.Anderson, eds"

Parallel Models of AJsociative Memory

(Erlb

aum

, Hill

sdal

e, N

J, 1

981)

.

12. G

.E. Hinton, Learning distributed representations of concepts, in:

Proc

eedi

ngs

Eig

hth

Ann

ual

Conference of the Cognitive Science Society,

Am

hers

t, M

A (

1986

),13

. G.E, Hinton, Connectionist learning procedures, T

ech.

Rep

t. C

MU

-CS-

87-1

15, C

ompu

ter

Sci

ence

Dep

artm

ent,

Car

negi

e-M

ello

n U

nive

rsity

, Pitt

sbur

gh, P

A (

1987

).14

. G.E

. Hin

ton,

J.L. McClelland and D.

E. R

umel

hart

, Dis

trib

uted

representations, in: D.

Rum

elha

rt, J

.L. M

cCle

lland

and

the

PDP

Res

earc

h G

roup

, eds

.Pa

ralle

l DiJ

trib

wed

Pro

cess

-ing: Explorations in the Microstructure of Cognition

I:

Foun

datio

ns

(MIT

Pre

ss, C

ambr

idge

,, 1986).

15. M

.I. J

orda

n, A

ttrac

tor

dyna

mic

s an

d pa

ralle

lism

in a

con

nect

ioni

st s

eque

ntia

l mac

hine

, in:

Proc

eedi

ngs

Eig

hth

Ann

ual C

onfe

renc

e of

the

Cog

nitiv

e Sc

ienc

e So

ciet

y,

Am

hers

t, M

A (

1986

).16

, A.H

, Kaw

amot

o, D

istr

ibut

ed r

epre

sent

atio

ns o

f am

bigu

ous

wor

ds a

nd th

eir

reso

lutio

n in

aco

nnec

tioni

st n

etw

ork,

in: S

.L. Small, G.W

. Cot

trel

l, a

nd M

.K. Tanenhaus, eds"

Lex

ical

Am

bigu

ity R

esol

utio

n: P

ersp

ectiv

es f

rom

Psy

chol

ingu

istic

s, N

euro

psyc

holo

gy, a

nd A

rtif

icia

lIn

telli

genc

e (M

orga

n K

aufm

ann,

San

Mat

eo, C

A, 1988),

17. B

. Mac

Whi

nney

, E. B

atcs

and

R. K

liegl

, Cue

val

idity

and

sen

tenc

e in

terp

reta

tion

in E

nglis

hG

erm

an, and Italian, J. Verbal Learn, and Verbal Behav.

23 (1984) 127-

150.

18. M

.P. Marcus,

A T

heor

y of

Syntactic Recogni/ion for Natural Language

(MIT

Pre

ss, C

am-

brid

ge, M

A, 1980).

19, W. Marslen-Wilson and L.K. Tyler

, The

tem

pora

l str

uctu

re o

f sp

oken

lang

uage

und

erst

andi

ng,

Cog

r/i/i

on

8 (1

980)

1-7

1.20

. J.L

. McC

lella

nd a

nd J

.L. E

lman

, Int

erac

tive

proc

esse

s in

spe

ech

perc

eptio

n: T

he T

RA

CE

mod

el, in: J.L. McClelland, D.

E. R

umel

hart

and

the

PDP

Res

earc

h G

roup

, eds

.Pa

ralle

lDiJ/ribu/ed Processing: Explora/ions in the Micros/ructure of Cognition

2:

App

lica/

ions

(M

ITPr

ess,

Cam

brid

ge, M

A, 1

986)

.

21, J

,L. M

cCle

lland

and

A.H

, Kaw

amot

o, M

echa

nism

s of

sen

tenc

e pr

oces

sing

: Ass

igni

ng r

oles

toco

nstit

uent

s, in

: J.L. McClelland, D.

E. R

umel

hart

and

the

PDP

Res

earc

h G

roup

, eds

.Parallel Dis/ribu/ed ProcesJing: Explora/ionJ in the Micros/ruc/ure of Cognition

2:

App

lica/

ions

(MIT

Pre

ss, C

ambr

idge

, MA

, 198

6).

common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
common
Pencil
Page 21: words of a sentence may be Artificial Intelligence

SE

NT

EN

CE

CO

MP

RE

HE

NS

ION

257

22, j

,McClelland and D,E. Rumelhart

, An

inte

ract

ive

activ

atio

n m

odel

of

cont

ext e

ffec

ts in

1elle

r pe

rcep

tion:

Par

i 1. A

n ac

coun

t of b

asic

find

ings

Psychol, Rev,

88 (

1981

) 37

5-40

7,23

, G, M

cKoo

n an

d R

, Rat

cliff

, The

com

preh

ensi

on p

roce

sses

and

mem

ory

stru

ctur

es in

volv

ed in

instrumental inference

J, Verbal Learn. and Verbal Behav.

20 (

19IH

) 67

1-68

2,24

, R, M

iikku

lain

en a

nd M

. G, D

yer,

Bui

ldin

g di

stri

bute

d re

pres

enta

tions

with

out m

icro

feat

ures

,T

ech,

Rep

l., A

rtifi

cial

Int

ellig

ence

Lab

orat

ory,

Com

pute

r Sc

ienc

e D

epar

tmen

t, U

nive

rsity

of

Cal

ifor

nia

, Los

Ang

eles

, CA

(19

88),

25. LG, Naigles, H, Gleitman and LR. Gleitman

, Syn

tact

ic b

oots

trap

ping

in v

erb

acqu

isiti

on:

Evi

denc

e fr

om c

ompr

ehen

sion

, Tec

h, R

epl.,

Dep

artm

ent o

f Psy

chol

ogy,

Uni

vers

ity o

f Pe

nn-

sylv

ania

, Phi

lade

lphi

a, P

A (

1987

),26, W.V

. Qui

neWord and Object

(Har

vard

Pre

ss, C

ambr

idge

, MA

, 1960).

27, D,

E, R

umel

hart

, col

loqu

ium

pre

sent

ed to

the

Dep

artm

ent o

f C

ompu

ter

Scie

nce

, Car

negi

e-M

ello

n U

nive

rsity

, Pill

sbur

gh, P

A (

1987

),28. D.E, Rumelhart

, G,E

, Hin

ton

and

R.J

. Will

iam

s, L

earn

ing

inte

rnal

rep

rese

ntat

ions

by

erro

rpr

opag

atio

n, in: D,E

. Rum

elha

rt, j

.M

cCle

lland

and

the

PDP

Res

earc

h G

roup

, eds

.Parallel Distributed Processing: Explorations

in the Microstructure of Cognition

I:

Foun

datio

ns(M

IT P

ress

, Cam

brid

ge, M

A, 1986).

29, D, Servan-Sc

hrei

ber,

A, C

leer

eman

s an

d j,L McClelland

, Enc

odin

g se

quen

tial s

truc

ture

insi

mpl

e re

curr

ent n

etw

orks

, Tec

h, R

epl.

CM

U-C

S-88

-183

, Dep

artm

ent o

f C

ompu

ter

Scie

nce

Car

negi

e M

ello

n U

nive

rsity

, Pill

sbur

gh, P

A (

1988

),30

. M, F

, SI.

john

and

j,M

cCle

lland

, Rec

onst

ruct

ive

mem

ory

for

sent

ence

s: A

PD

P ap

proa

chProceedings Inference: OU/C

86, University

of

Ohi

o, A

then

s, O

H (

1987

),31

. R, T

arab

an j,

McD

onal

d an

d B

. Mac

Whi

nney

, Cat

egor

y le

arni

ng in

a c

onne

ctio

nist

mod

el:

Lear

ning

to d

eclin

e th

e G

erm

an d

efin

ite a

rtic

le, in: R. Corrigan, ed"

Milw

auke

e C

onfe

renc

eon

C

ateg

oriz

atio

n (B

enja

min

s , P

hila

delp

hia

, PA

, to

appe

ar).

32. G. Tesauro

, Sca

ling

rela

tions

hips

in b

ack-

prop

agat

ion

lear

ning

: Dep

ende

nce

on tr

aini

ng s

etsi

ze, T

ech,

Rep

l., C

ente

r fo

r C

ompl

ex S

yste

ms

Res

earc

h, U

nive

rsity

of

Illin

ois

at U

rban

a-C

ham

paig

n, C

ham

paig

nIL

(1

987)

,33

. S.

T

oure

tzky

and

S, G

eva, A distributed connectionist representation for concept struc-

tures, in: Pr

ocee

ding

s N

inth

Ann

ual C

onfe

renc

e of

the

Cog

nitiv

e Sc

ienc

e So

ciet

y,

Seal

lle, W

A(

1987

),34, T,A

, van

Dijk

and

W, K

ints

ch,

Stra

tegi

es o

f D

isco

urse

Com

preh

ensi

on

(Aca

dem

ic P

ress

Orla

ndo,

FL, 1983),

35. D,

Wal

tz a

nd j,

B, Pollack, Massively pa

ralle

l par

sing

: A s

tron

gly

inte

ract

ive

mod

el o

fnatural language interpretation,

Cognitive Sci,

9 (1

985)

51-

74,

common
Pencil
common
Pencil