Using Evoked Magnetoencephalographic Responses for the Cognitive Neuroscience of Language

Post on 16-Mar-2016



Alec Marantz, MIT

KIT/MIT MEG Joint Research Lab, Department of Linguistics and Philosophy

From Cog Sci to Cog Neurosci

• Cognitive Science, including Linguistics, has used behavioral data to develop computational theories of language representation and use

• These theories play out along the dimensions of time (sequential processing stages), space (separation of processing functions) and complexity (difficulty of processing)

Cognitive Neuroscience of Language

• Cognitive Science moves to Cognitive Neuroscience when the temporal, spatial, and complexity dimensions of cognitive theories are mapped onto the time course, localization, and intensity of brain activity

• However, because of the lack of temporal information, the development of Neurolinguistics with fMRI and PET techniques has tended to flatten theories of the Cognitive Neuroscience of Language

Cognitive Science: Taft & Forster 1977 (traditional articulated Cog Sci)

Affix stripping, followed by recombination of stem and affix

sample prediction from model:

• -semble is a stem, since assemble, resemble, dissemble are words

• -sassin (assassin) is not a stem, since only assassin is a word

• It should take longer to reject “semble” as a non-word than “sassin,” since “semble” is a lexical item (“semble” requires looping from box 4 through box 5 in the model before reaching box 7, while “sassin” pushes directly from box 4 to box 7, “No”)
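The prediction above can be made concrete with a toy sketch. This is only an illustration under stated assumptions, not the Taft & Forster model itself: the prefix list, the rule that a bound stem needs more than one prefixed word, and the two-stage layout (standing in loosely for the extra loop through box 5) are all hypothetical simplifications.

```python
# Words are taken from the slide; PREFIXES and the stage layout are
# illustrative assumptions, not part of the original model description.
WORDS = {"assemble", "resemble", "dissemble", "assassin"}
PREFIXES = ("dis", "as", "re")


def build_stems(words, prefixes):
    """A remainder counts as a stem if more than one prefixed word shares it."""
    counts = {}
    for word in words:
        for prefix in prefixes:
            if word.startswith(prefix) and len(word) > len(prefix):
                rest = word[len(prefix):]
                counts[rest] = counts.get(rest, 0) + 1
    return {stem for stem, n in counts.items() if n > 1}


STEMS = build_stems(WORDS, PREFIXES)


def decide(item, words, stems):
    """Return (response, stages), counting each lookup as one stage."""
    stages = 1                      # stage 1: search for a lexical entry
    if item not in words and item not in stems:
        return ("No", stages)       # no entry: reject immediately
    stages += 1                     # stage 2: entry found, check free-standing use
    return ("Yes" if item in words else "No", stages)


for probe in ("semble", "sassin"):
    print(probe, decide(probe, WORDS, STEMS))
```

In this sketch both probes are rejected, but "semble" passes through one more stage than "sassin", which is the source of the predicted difference in rejection time.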

Taft 2004: further behavioral support for articulated model of processing stages

More contemporary instantiation of model -- makes predictions about RTs based, e.g., on a theory of the experimental task

Flattened computational model: Gonnerman & Plaut (2000)

• Masked priming experiment compares responses to four prime-TARGET conditions: Semantic (sofa-COUCH), Morphological (hunter-HUNT), Orthographic (passive-PASS), Unrelated (award-MUNCH)

• Claim: failure to find special location for the morphological condition (using fMRI) supports flat model in which morphology is an emergent property of semantic and phonological/orthographic relatedness

fMRI experiment consistent with flattened computational model. Temporal/sequential processing not at issue.

But the masked priming experimental design is confounded with respect to predictions from a Taft-style model with affix-stripping, since the “orthographic” items consist of possible stems and strippable affixes (e.g., tenable/ten, passive/pass)

Articulated vs. Flattened Model

• Taft’s articulated affix-stripping model predicts that “tenable” and “bendable” should be processed in the same “places” (in the model/brain) and in the same temporal sequence (affix stripping followed by stem activation followed by recombination), with differences in “complexity” (measured, e.g., by level of brain activity or latency of brain events)

• Thus the cognitive science model predicts the fMRI results and makes further predictions testable with techniques that allow exploration of the latency of brain responses

MEG allows cognitive neuroscience to fully embrace cognitive science

• MEG records the magnetic fields generated by electrical activity in the brain, millisecond by millisecond

• MEG has the spatial resolution, the temporal resolution and the sensitivity necessary to test predictions from cognitive science along the space, time and complexity dimensions

Plot

• Examples of MEG experiments exploiting the temporal, spatial, and intensity resolution of the technique

• A return to Taft’s stages

• The future: even closer ties between experimental designs in cognitive science and cognitive neuroscience

KIT/MIT MEG Lab

Magnetoencephalography (MEG) = study of the brain’s magnetic fields

http://www.ctf.com/Pages/page33.html

Magnetoencephalography (MEG)

[Figure: distribution of the magnetic field at 93 ms (auditory M100), with outgoing and ingoing field marked; averaged epoch of activity in all sensors, overlapping waveforms, one line per sensor]

Liina Pylkkänen, Aug 03, Tateshina

MEG exemplified

Parametric variation in letter string length and in added visual noise

Categorical symbol vs letter manipulation

M100 response varies in intensity with visual noise; M170 response varies in intensity with string length

[Figure panels: M100 response; M170 response]

Note separation in space and temporal sequence (M100 vs. M170) consistent with sequential processing model

Intensity of M170 response to letters as compared to symbols confirms function of processing at M170 time & location (“visual word form” or “letter string” area)

Reaction time to read words predicted by combination of M170 amplitude and latency

Latency coding? Response latency correlates with stimulus properties.

Auditory M100 (from auditory cortex)

Frequency of tone predicts latency of M100 peak

Temporal coding? Shape of response over time at M100 latency and source location correlates with phonetic category of stimulus

Voiced (b, d) vs. voiceless (p, t) consonant auditory evoked response

• Different ways of measuring the shape of the M100 response to voiced vs. voiceless consonants yield good computational “experts” that can classify data from a single response as either a pa/ta or a ba/da with significantly greater than chance accuracy

Sequential processing of words

What happens in the brain when we read words?

[Figure: evoked response to a visually presented word; x-axis: time, -100 to 700 msec; y-axis: field strength (fT); response components: 150-200 ms (M170), 200-300 ms (M250), 300-400 ms (M350), 400-500 ms]

Pylkkänen and Marantz, Trends in Cognitive Sciences

Letter string processing

(Tarkiainen et al. 1999)

Lexical activation

(Pylkkänen et al. 2002)

Note left lateralization of responses in standard perisylvian language areas

Latency of M350 sensitive to lexical factors such as lexical frequency and repetition

M350

(Pylkkänen, Stringfellow, Flagg, Marantz, Biomag2000 Proceedings, 2000)

[Figure panels: Repetition; Frequency]

[Figure: Behavioral data, reaction time by frequency category (1 = most frequent to 6 = least frequent)]

[Figure: Latency of M350 component by frequency category (1 = most frequent to 6 = least frequent)]

Frequency categories (n/million): 1: 700; 2: 140; 3: 30; 4: 6; 5: 1; 6: 0.2

(Embick, Hackl, Shaeffer, Kelepir, Marantz, Cognitive Brain Research, 2001)

M350 is (in time and place) the locus of lexical activation; lexical decision modulated by competition among activated items occurs later and elsewhere

Vitevitch and Luce (1998), stages of word processing

• Phonotactic probability (sub-lexical frequency of bits of words) affects lexical activation, with frequency being facilitatory

• Phonological neighborhood density affects lexical decision (“after” activation), with density being inhibitory

• Phonotactic probability and neighborhood density are usually highly correlated, so the same items that facilitate activation inhibit decision

• So, words with high phonotactic probabilities from dense neighborhoods should show quicker M350 latencies but slower RTs in lexical decision

Words and non-words with high-probability sound sequences, from dense neighborhoods, show quicker M350s and slower RTs

Pylkkänen et al. (2002)

M350: not sensitive to competition from phonological neighbors; RT is

[Figure: M350 latency and RT (msec, axis 300 to 700) for a high phonotactic probability word (LINE) vs. a low phonotactic probability word (PAGE); panel labels: sublexical phonological frequency effect, neighborhood competition effect; both differences marked **]

Irregular Past Tense Priming: Stockall & Marantz (to appear in Mental Lexicon)

• In cross-modal priming (hear one word, make a lexical decision on a letter string presented immediately after), irregulars don’t generally prime their stems behaviorally: gave-GIVE, taught-TEACH

• Allen & Badecker show that orthographic overlap in this experimental design leads to RT inhibition and that past-tense/stem pairs with higher orthographic overlap yield less priming than those with less overlap

Prediction of linguistic theories (e.g., Distributed Morphology)

• Irregular past tense/“stem” priming paradigms (gave/give, taught/teach) should yield identity priming at the stage of root/stem activation (the M350) and form competition effects among allomorphs subsequently, slowing reaction time relative to pure stem/stem identity priming.

MEG irregular past-tense priming experiment

Design: Visual-visual immediate priming, lexical decision on the target (see Pastizzo and Feldman 2002)

[Figure: trial timeline (+, prime, target); duration of trial (ms): 450, 50, 200, 0 …2500]

MEG Results: M350 Priming for Past Tense/Stem equivalent to identity priming

Significant priming for:

Identity condition (*p=0.01)

TAUGHT-TEACH vs. SMACK-TEACH (*p=0.04)

GAVE-GIVE vs. PLUM-GIVE (*p=0.05)

No reliable effect for: STIFF-STAFF vs. GRAB-STAFF (p=0.13)

Amount of priming, n=8

RT Results: Competition effects; no significant priming for TAUGHT-TEACH

[Figure: amount of priming in RT (ms) by condition: Identity 62.4; stiff-staff -27.6; taught-teach -13.0; gave-give 18.7]

Significant priming for Identity condition (**p=0.0009)

GAVE-GIVE (*p=0.03)

Significant inhibition for STIFF-STAFF (*p=0.01)

No reliable effect for TAUGHT-TEACH (p=0.21) (but trend towards inhibition)

MEG & RT Results: MEG taps stem activation; RT reflects decision in the face of competition

[Figure: amount of priming for gave-give, identity, stiff-staff, and taught-teach; series: M350 latency and RT]

Follow-up: Add regulars and ritzy/glitzy condition

• Regulars: walk-walked

• Orthographic & Semantic Overlap: boil-broil

• Reverse order, stem before past tense

ritzy-glitzy items

drop~drip, clash~clang, flip~flop, blossom~bloom, pet~pat, ghost~ghoul, gloom~glum, shrivel~shrink, squish~squash, crumple~rumple, boil~broil, screech~scream, strain~sprain, converge~merge, mangle~tangle, scald~scorch, slim~trim, crinkle~wrinkle, bump~lump, attain~gain, burst~bust, scrape~scratch

[Figure: amount of priming (ms/fT) for give-gave, teach-taught, date-dated, and boil-broil; series: MEG and RT]

[Figure: amount of priming (ms/fT) for gave-give, identity, stiff-staff, and taught-teach; series: M350 latency and RT]

Order effect on RT; i.e., on form competition

Linguistic Computational Models of Morphology fully supported

• Relation between irregular past tense form and stem is like that between regular past tense form and stem (or between identical stems), not like that between words phonologically/orthographically and semantically related (boil - broil)

• Root priming separates from form competition (between allomorphs of stem) in time course of lexical access

Taft (2004), “Morphological Decomposition and the Reverse Base Frequency Effect.”

• Claim: Base frequency effects (RT to complex word correlates with frequency of stem) reflect access of the stem of morphologically complex forms, whereas surface frequency effects (RT to complex word correlates with frequency of complex word) reflect the stage of checking the recombination of stem and affix for existence and/or well-formedness.

• “The suggestion being made, then, is that the advantage at the early stages of processing of having a relatively high base frequency could be potentially obscured by counterbalancing factors happening at later stages of processing.” [750-1]

Lexical Decision Task

• non-word foils consisting of existing words with ungrammatical affixes (mirths, kettled, joying, redly, iratest) (just like the Devlin “orthographic” cases)

• three classes of words:

“mending” class: low surface frequency, low base frequency

“seeming” class: low surface frequency, high base frequency

“growing” class: mid surface frequency, high base frequency

• Claim: advantage of high base frequency for “seem” at the stem access stage (indexed by the M350) is offset in RT by a disadvantage from the low frequency of –ing with the “seem” stem, i.e., at the post-affix recombination stage, indexed by RT

• (For Taft, manipulating the foils in lexical decision attenuated the surface frequency effect, arguing for two stages of processing in the indirect fashion typical of good cognitive science)

Reilly and Holt 2004, with the KIT/MIT MEG Team

• Replicate Taft’s experiment in the MEG Lab

• Predict: base frequency affects root access and thus M350 latency; surface frequency affects post-M350 recombination stage and thus RT

Results: M350 Latency tracks Base Frequency, RT tracks Surface Frequency

Measure              Mending class              Seeming class               Growing class
                     (low surface, low base)    (low surface, high base)    (mid surface, high base)
Surface frequency    7.8                        7.7                         75.9
Base frequency       36.5                       460.3                       456.9
RT, Taft (ms)        687                        701                         653
RT, MIT (ms)         780                        805                         746
M350, MIT (ms)       375                        362                         356

Surface Frequency effect at RT (significant at .05 level): Mending and Seeming slower than Growing

Base Frequency effect at M350 Latency (significant at .05 level): Mending slower than Seeming and Growing
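The two contrasts named in the results above can be recomputed from the reported MIT class means. The sketch below is purely descriptive arithmetic on those three means per measure; it is not the significance test used in the study, and the `contrast` helper is a hypothetical name introduced here.

```python
# Class means as reported on the slide (MIT replication), in ms.
rt_mit = {"mending": 780, "seeming": 805, "growing": 746}
m350_mit = {"mending": 375, "seeming": 362, "growing": 356}


def contrast(measure, slower, faster):
    """Mean of the classes expected to be slower minus mean of the faster ones."""
    def mean(names):
        return sum(measure[name] for name in names) / len(names)
    return mean(slower) - mean(faster)


# Surface frequency contrast on RT: low surface (mending, seeming) vs. mid (growing).
rt_surface_effect = contrast(rt_mit, ["mending", "seeming"], ["growing"])

# Base frequency contrast on M350: low base (mending) vs. high base (seeming, growing).
m350_base_effect = contrast(m350_mit, ["mending"], ["seeming", "growing"])

print(rt_surface_effect, m350_base_effect)  # 46.5 16.0
```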

Conclusion

• MEG serves as a tool to upgrade cognitive science (& linguistics) to cognitive neuroscience without losing the empirically motivated richness of cognitive computational theories

• Cog Sci notions of space, time, and complexity map onto brain space, latency and magnitude of neural activity

What’s the next step?

• Traditional approaches to MEG analysis involve averaging together many responses (repeated from an experimental “bin”) prior to computing differences in responses by condition within each subject

• This contrasts with standard cognitive science practice (e.g., with RT) of including a dependent measure from each trial in the ANOVA.

• To fully incorporate cognitive theories into cognitive neuroscience, including the correlation of continuous variables with continuous response measures and the use of item analyses in complex designs, we need to include single trial MEG data in our analyses
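The difference between the two approaches can be sketched on simulated data. Everything here is an assumption made for illustration: the Gaussian response shape, the noise level, the 1 ms sampling, the 300-420 ms search window, and the number of trials. Averaging first gives one latency per bin; the single-trial route gives one latency per trial, which is what an RT-style analysis needs.

```python
import math
import random

random.seed(0)
N_SAMPLES = 600  # one sample per ms after stimulus onset (assumed)


def simulate_trial(true_latency, noise_sd=0.05):
    """Toy evoked response: Gaussian bump at true_latency plus sensor noise."""
    return [
        math.exp(-((t - true_latency) / 20.0) ** 2) + random.gauss(0.0, noise_sd)
        for t in range(N_SAMPLES)
    ]


def peak_latency(waveform, window=(300, 420)):
    """Time (ms) of the largest value inside the search window."""
    lo, hi = window
    return max(range(lo, hi), key=lambda t: waveform[t])


true_latencies = [random.randint(330, 390) for _ in range(40)]
trials = [simulate_trial(lat) for lat in true_latencies]

# Traditional approach: average the epochs, then measure one peak per bin.
averaged = [sum(samples) / len(trials) for samples in zip(*trials)]
averaged_latency = peak_latency(averaged)

# Single-trial approach: one dependent measure per trial, as with RT.
single_trial_latencies = [peak_latency(trial) for trial in trials]

print(averaged_latency, len(single_trial_latencies))
```

The per-trial values can then enter an ANOVA, an item analysis, or a correlation with a continuous stimulus variable in the same way reaction times do.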

Why not single trial MEG?

• For the type of experiment discussed in this talk, we would need to extract response amplitude and latency information from each trial, given a “response” defined in terms of source localization

• So, we would look at each single response for dipole source activation (latency of peak response, amplitude of response) for a source identified from grand averaged data for a subject

M100 Latency, Single Trials (Marantz, in preparation)

• Left hemisphere M100 source computed via single dipole model from grand averaged response to 60 tones, 30 at 200 Hz, 30 at 1 kHz

• Weight matrix from dipole source used as spatial filter over raw data to derive dipole activation latency for each tone individually
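The spatial-filter step described above can be sketched as a weighted sum over sensors at each time sample, followed by a peak search. This is a minimal illustration, not the analysis code used for these data: the three-sensor toy trial, the weights, the sample times, and the 80-160 ms search window are all made-up values, and taking the maximum (rather than, say, the largest absolute value) as the peak is an assumption.

```python
def apply_spatial_filter(weights, trial):
    """Project sensor data onto the source.

    weights: one weight per sensor (derived from the dipole fit).
    trial:   list of per-sensor time series (sensors x samples).
    """
    n_samples = len(trial[0])
    return [
        sum(w * sensor[t] for w, sensor in zip(weights, trial))
        for t in range(n_samples)
    ]


def peak_latency_ms(source, times_ms, window):
    """Latency of the largest source activation inside the window."""
    lo, hi = window
    candidates = [(v, t) for v, t in zip(source, times_ms) if lo <= t <= hi]
    return max(candidates, key=lambda pair: pair[0])[1]


# Hypothetical toy trial: 3 sensors, 5 time samples.
times_ms = [80, 100, 120, 140, 160]
weights = [0.5, -0.5, 0.0]
trial = [
    [0, 1, 4, 1, 0],
    [0, -1, -2, -1, 0],
    [9, 9, 9, 9, 9],   # sensor with zero weight is ignored by the filter
]

source = apply_spatial_filter(weights, trial)
latency = peak_latency_ms(source, times_ms, window=(80, 160))
print(source, latency)  # [0.0, 1.0, 3.0, 1.0, 0.0] 120
```

Applying the same weights to every raw trial yields one activation latency per tone, which can then be plotted against tone frequency.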

Single trial M100 latencies

[Figure: scatter plot of single-trial left hemisphere M100 activation latency (ms, axis 80 to 160) as a function of stimulus tone frequency, 200 Hz vs. 1 kHz]

Single trial analysis as in behavioral studies is possible using only normal MEG techniques and tools

• No fancy pre-processing

• No fancy localization or statistical tools

• For responses less automatic than the M100, expect overlap in scatter plots to be greater (approaching that for RTs in e.g. lexical decision experiments)

Taft & Forster re-visited

• Is RT slow-down for -semble (bound stem) over -sassin (pseudo-stem) attributable to lexical access for “semble” but not for “sassin,” as Taft claims, or to response competition from words (resemble, dissemble, assemble vs. assassin)?

• Prediction: slow-down at lexical access should show up at M350 while slow-down for response competition should occur after (as shown by neighborhood density and past tense studies)

Brown & Marantz (in preparation)

• 3 subjects

• 20 real stems, 20 pseudo stems (matched by Taft & Forster along various dimensions) per condition

• Single trial analysis of MEG data: M350 dipole activation peak analysis, with M350 dipole fitted over left-hemisphere sensors on the grand average to all stimuli in the experiment

Slow-down is observed at M350: for 3 subjects and 108 observations, difference is significant over the single trial MEG data but not yet for RT

Measure                              Real stems (-semble)    Pseudo stems (-sassin)    p
Reaction time                        784 ms                  719 ms                    0.16
M350 latency (over single trials)    356 ms                  339 ms                    0.005
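One way to compare two conditions over single-trial latencies is a permutation test on the difference of means, sketched below. This is a swapped-in technique for illustration; the slides do not say which test produced the reported p-values. The simulated latencies use the reported condition means (356 ms and 339 ms), but the spread of 40 ms and the even split of the 108 observations are invented assumptions, so the printed p-value says nothing about the real data.

```python
import random


def permutation_p_value(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference of condition means."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)                      # reassign trials to conditions
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)     # add-one to avoid p = 0


# Simulated single-trial M350 latencies (ms); spread and counts are assumed.
rng = random.Random(1)
real_stem = [rng.gauss(356, 40) for _ in range(54)]
pseudo_stem = [rng.gauss(339, 40) for _ in range(54)]

p_value = permutation_p_value(real_stem, pseudo_stem)
print(p_value)
```

Because each trial contributes its own observation, the same function works unchanged for reaction times, which is the sense in which MEG latency can sit alongside RT as a dependent variable.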

• Taft theory of decomposition in which bound stems have lexical entries is fully supported by the MEG data

• Single trial MEG data is at least as consistent as reaction time data

• MEG can be used on par with RT to add additional dependent variables to experiments testing computational theories within cognitive neuroscience

Thank you.

marantz@mit.edu