Spelling out the lexeme
The Dell model
Outline
- Connectionist models
- Dell (1986)
  - Phenomena
  - Structure of the model
  - Accounts of the phenomena
  - Evaluation
- Dell (1988)
  - Changes and motivation
Connectionist Models
= (artificial) neural networks, ANNs, PDP models
Models of cognition, more or less inspired by information processing in the brain: massive parallelism, using large networks of relatively simple units.
Widely applied in cognitive psychology, robotics, medical information systems, pattern recognition, ...
Example
A node (unit, neurode) has an activation level, input connections (with weights), and output connections.
[Figure: a single node with activation level 0.7, receiving input over three connections with weights 0.1, 0.1, and 0.6.]
A(t) = F[A(t-1), IN(t-1)]
IN_i(t-1) = Σ_j w_ij · A_j(t-1)
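A minimal sketch of this update rule in Python (the decay factor and the linear form of F are illustrative assumptions, not a particular model's exact function):

def weighted_input(weights, activations):
    """IN_i(t-1): weighted sum of the sending nodes' activations."""
    return sum(w * a for w, a in zip(weights, activations))

def update_activation(a_prev, net_input, decay=0.4):
    """A(t) = F[A(t-1), IN(t-1)]: decayed old activation plus new input."""
    return (1 - decay) * a_prev + net_input

# One node with activation 0.7 and the three weighted inputs from the figure;
# the sender activations are made-up values for illustration.
a = 0.7
incoming_weights = [0.1, 0.1, 0.6]
sender_activations = [0.5, 0.2, 0.9]
a = update_activation(a, weighted_input(incoming_weights, sender_activations))
print(a)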
Important Dimensions
- Representational assumptions (localist, distributed)
- Network structure (layers, connectivity)
- Activational dynamics
- Learning rule (if any)
- Biological plausibility
Representational assumptions
Localist or distributed?
- Localist networks: one-to-one correspondence between nodes and psychological units (i.e., 'grandmother cells') (Dell, 1986; McClelland & Rumelhart, 1981; Roelofs, 1997).
- Distributed networks (true parallel distributed processing): representations as patterns of activation over many nodes (Seidenberg & McClelland, 1989).
How to deal with serial order? McClelland & Rumelhart (1981): 4-letter words only, with each letter represented separately for each position.
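A minimal sketch of this position-specific coding (the node-naming scheme is an assumption for illustration):

# Position-specific localist coding, as in McClelland & Rumelhart (1981):
# each (letter, position) pair is its own node, so serial order is built
# into the representation rather than computed.
word = "TAKE"
active_nodes = {f"{letter}@{pos}" for pos, letter in enumerate(word, start=1)}
print(sorted(active_nodes))  # ['A@2', 'E@4', 'K@3', 'T@1']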
Architectural assumptions
- How many layers? (linear separability problem)
- Connectivity between layers: full connectivity or bounded connectivity?
- Types of connections: excitatory, inhibitory, or both
- (Initial) weights: random (PDP models), fixed (McClelland & Rumelhart, 1981), or uniform (Dell, 1986)
Activational Dynamics
A_i(t) = F[A_i(t-1), IN_i(t-1)]
IN_i(t-1) = G_j[A_j(t-1), w_ij]
- How is input determined?
- How does input change the activation level?
- How is competition resolved?
  - 'Lateral' inhibition => winner-takes-all
  - 'Virtual' inhibition (Roelofs): Luce ratio
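A sketch of the two competition schemes in Python (activations assumed non-negative; the numbers are illustrative):

def winner_takes_all(activations):
    """Idealized lateral inhibition: only the most active node survives."""
    winner = max(range(len(activations)), key=lambda i: activations[i])
    return [a if i == winner else 0.0 for i, a in enumerate(activations)]

def luce_ratio(activations):
    """'Virtual' inhibition (Roelofs): each node's selection probability is
    its activation divided by the summed activation of all competitors."""
    total = sum(activations)
    return [a / total for a in activations]

acts = [0.7, 0.2, 0.1]
print(winner_takes_all(acts))  # [0.7, 0.0, 0.0]
print(luce_ratio(acts))        # [0.7, 0.2, 0.1]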
Other Dimensions
- Learning rules: supervised (backpropagation, delta rule) or unsupervised (Hebb rule, self-organizing networks)
- Biological plausibility: a constraint or not? (full connectivity, Dale's law, ...)
The Dell (1986) model
Phenomenon of interest: phonological speech errors
- Characteristics & distribution
- Lexical bias effect
- Repeated phoneme effect
Data: Corpus Analyses
Substitution errors:
- 57% Anticipation: "Bake my bike" [take]
- 13% Perseveration: "She pulled a pantrum" [tantrum]
- 5% Exchange: "waple malnut" [maple walnut]
Wordshape errors:
- 3% Deletion: "_till cheaper" [still]
- 21% Addition: "hit frate" [rate]
- 1% Shift: "back bloxes" [black boxes]
Characteristics
- Anticipations > perseverations >> exchanges
- Additions > deletions
- Effects of speech rate
- Phonotactic constraints respected
- Effects of similarity between interacting elements:
  - Onsets -> onsets, codas -> codas
  - Same stress value
  - Similar phonological features (bilabial, nasal, etc.)
- Initialness effect
- Frequency effects
Characteristics (2)
- Lexical bias effect
- Repeated phoneme effect: repeated sounds tend to trigger misordering of their neighbours
- Speech rate interacts with each of these effects.
Note: see Stemberger (1992), The reliability and replicability of naturalistic speech error data. In B. Baars (Ed.), Experimental slips and human error.
Processing
- Input to the 'current' lexeme
- Activation spreads through the network
- Spread determined by p (spreading rate) and q (decay rate)
- Selection after a fixed number of time steps R
- Selection at the phoneme level, according to the rule:
  SYL -> Onset + Nucleus + Coda
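A minimal sketch of such a spreading-activation loop (the connection list, parameter values, and the simple linear update are illustrative assumptions, not Dell's exact equations):

def spread(connections, activation, p=0.1, q=0.4, steps=4):
    """Run a fixed number of time steps (Dell's R); activation passes in
    both directions along each connection (feed-forward and feedback)."""
    for _ in range(steps):
        incoming = {node: 0.0 for node in activation}
        for sender, receiver in connections:
            incoming[receiver] += p * activation[sender]
            incoming[sender] += p * activation[receiver]
        activation = {node: (1 - q) * activation[node] + incoming[node]
                      for node in activation}
    return activation

# Tiny fragment of the 'cat' network: lexeme -> syllable -> phonemes,
# plus an unconnected distractor onset /s/.
links = [("cat", "kaet"), ("kaet", "k"), ("kaet", "ae"), ("kaet", "t")]
acts = {"cat": 1.0, "kaet": 0.0, "k": 0.0, "ae": 0.0, "t": 0.0, "s": 0.0}
final = spread(links, acts)
print(max(["k", "s"], key=lambda ph: final[ph]))  # the onset slot selects /k/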
Model Architecture
[Figure: localist network for 'cat', 'spa', and 'spat', with a lexeme layer (cat, spa, spat), a syllable layer (kaet, spa, spaet), a rime/cluster layer (aet, asp), onset/nucleus/coda phoneme nodes (s, p, k, ae, a, t, null), and a feature layer (fric, stop).]
Saying Cat: t = 1 through t = 6, Selection, and After Selection
[Animation over the same network diagram: activation spreads from the lexeme 'cat' through the syllable, phoneme, and feature layers over six time steps; at selection, the most highly activated onset, nucleus, and coda are chosen; a final frame shows the state of the network after selection.]
Why does the model’s tongue slip?
An error occurs if a unit other than the correct one is the most highly activated onset, nucleus, or coda.
Reasons:
- Residual activation of a previous word
- Anticipatory activation of an upcoming word
- Noise in the model
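A sketch of this error mechanism (the activation values, noise level, and competitor set are illustrative assumptions):

import random

def select_onset(activations, noise_sd=0.1):
    """Pick the most active onset after adding Gaussian noise to each node."""
    noisy = {ph: a + random.gauss(0.0, noise_sd) for ph, a in activations.items()}
    return max(noisy, key=noisy.get)

onsets = {
    "k": 0.50,  # correct onset of 'cat'
    "p": 0.20,  # residual activation from a previous word
    "s": 0.25,  # anticipatory activation from an upcoming word
}
errors = sum(select_onset(onsets) != "k" for _ in range(10_000))
print(f"onset error rate: {errors / 10_000:.3f}")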
What does the model capture (1)?
- Effects of speech rate: the less time to selection, the more errors the model makes.
- The general distribution of speech errors: substitutions > non-substitutions; anticipations > perseverations > exchanges.
Note: the ratio of anticipations to perseverations can be pushed around (time-to-selection, decay rate, amount of anticipatory priming). The exchange rate cannot be pushed around independently.
What does the model capture (2)? Lexical bias effect
[Animation: the word nodes 'kaet' (cat) and 'raet' (rat) above the phoneme nodes /k/, /r/, /ae/, /t/; the shared phonemes /ae/ and /t/ feed activation back up to 'raet', which in turn boosts the competing onset /r/.]
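A sketch of this feedback account (the weight and the two-step schedule are illustrative assumptions): the phonemes activate the word node 'raet', and 'raet' then reinforces its own onset /r/; a nonword competitor has no word node to do this, so word-forming errors are favoured.

# Lexical bias via phoneme-to-word feedback. The weight w and the
# activation values are made-up numbers for illustration.
w = 0.1
phonemes = {"r": 0.3, "ae": 0.6, "t": 0.6}  # /ae/ and /t/ shared with 'kaet'

# Upward step: the phonemes feed the word node 'raet'.
raet = w * (phonemes["r"] + phonemes["ae"] + phonemes["t"])

# Downward step: 'raet' feeds its phonemes back, boosting the intruder /r/.
phonemes["r"] += w * raet
print(f"/r/ after feedback: {phonemes['r']:.3f}")  # > 0.30, unlike a nonword rival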
What does the model capture (3)? Repeated phoneme effect
[Animation: word nodes 'kaet' (cat), 'raen' (ran), and 'run' above phoneme nodes /k/, /r/, /ae/, /t/, /n/, /u/; the phoneme /ae/, shared by 'kaet' and 'raen', feeds activation back to both word nodes, which raises the activation of the competing onsets.]
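A sketch of this mechanism under the same illustrative assumptions as above: the shared /ae/ feeds both word nodes, and each word then re-activates the other's onset, making misordering of /k/ and /r/ more likely.

# Repeated-phoneme effect via feedback through a shared phoneme.
w = 0.1
words = {"kaet": 0.8, "raen": 0.5}
ae = w * (words["kaet"] + words["raen"])             # shared /ae/ hears both words
words = {wd: a + w * ae for wd, a in words.items()}  # /ae/ feeds both back
r_support = w * words["raen"]                        # extra support for onset /r/
print(f"'raen' -> {words['raen']:.3f}, /r/ support {r_support:.4f}")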
Evaluation
- Distribution of speech errors: good global fit, but sensitive to particular parameter settings.
- Effects of speech rate: follow naturally from the processing assumptions.
- Lexical bias: a direct result of the feedback assumption.
- Repeated phoneme: a direct result of the feedback assumption.
- Interactions of speech rate × lexical bias and speech rate × repeated phoneme: at fast rates there is too little time for feedback.
Evaluation (2)
- Similarity effects:
  - Position effect: built into the model
  - Stress effect: not accounted for
  - Phonological similarity effect: feedback from the feature layer
- Well-formedness effect: built into the model (/ml/ is not a cluster)
- Initialness effect: not accounted for
- Addition bias: the model predicts the opposite (a deletion bias).
Evaluation (3)
- Fast running speech involves resyllabification: 'gave it him' becomes 'ga-vi-tim'. The model produces only citation forms.
- Fixed number of time steps: the model cannot predict naming latencies, only error patterns. Thus it cannot account for what normally goes on in phonological encoding, only for derailments of the process.
Dell (1988)
- Introduction of explicit representations of phonological structure: 'wordshape headers'.
- Introduction of a new selection mechanism:
  1. Selection of wordshape headers
  2. The units to be selected depend on the headers
  3. Selection of units can be serial
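A sketch of header-driven serial selection (the slot labels, candidate sets, and activations are illustrative assumptions):

# A wordshape header (here CVC) determines which slots exist and lets
# selection proceed serially, one slot at a time, left to right.
candidates = {
    "C1": {"k": 0.8, "r": 0.3},   # onset consonants
    "V":  {"ae": 0.9, "o": 0.2},  # vowels
    "C2": {"t": 0.7, "n": 0.4},   # coda consonants
}

def encode(header):
    """Select the most active unit for each slot, in header order."""
    return [max(candidates[slot], key=candidates[slot].get) for slot in header]

print(encode(["C1", "V", "C2"]))  # ['k', 'ae', 't'] -> 'cat'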
Revised model (my version)
[Figure: revised network for the Dutch example words 'lopen' and 'gaper', with a WORDS layer (lopen, gaper), a wordshape header (CVCVC), a SYLLABLES layer (lo:, ga:, p@, p@r), and phoneme nodes (l, g, p, r, o:, a:, @) linked to serially numbered C and V slots.]
Why this revision?
- Solves the problem of the deletion bias
- Evidence for a role of CV structure in language production (Costa & Sebastian, 1998; Meijer, 1994, 1996; Sevald, Dell, & Cole, 1995)
- Evidence for seriality in phonological encoding: Meyer (1990)