Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken...

16
Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI ROELOFS 1 * Radboud University Nijmegen Abstract: It is widely assumed that spoken word production in Germanic languages like Dutch and English involves a parallel activation of phonemic segments and metrical frames in memory, followed by a serial association of segments to the frame, as implemented in the WEAVER++ model (Levelt, Roelofs, & Meyer, 1999). However, for Oriental languages like Mandarin Chinese and Japanese, researchers have sug- gested that the serial association concerns atonal phonological syllables (Mandarin Chinese) or moras (Japanese) to tonal frames. Here, the utility of these theoretical suggestions is demonstrated by computer simulations of key empirical findings using versions of WEAVER++ for English, Mandarin Chinese (纺织者++), and Japanese ( ウィーバー++). The simulation outcomes suggest that, although languages may differ in the phonological structure of their words, the principles underlying phonological encoding are similar across languages. Key words: computational modeling, implicit priming, Japanese, Mandarin Chinese, phonological encoding. An immediately noticeable difference among the 6000 or so languages of the world concerns the sound form of their words. Languages clearly differ in the segmental repertoire of vowels and consonants as well as in supra- segmental properties such as stress and tone (Ladefoged, 2001; Ladefoged & Maddieson, 1996). In European languages like Dutch and English, one of the syllables of a polysyllabic word carries primary stress. The stress pattern across syllables may sometimes distinguish one word from another, as in the case of English abstract (noun) and abstract (verb). In contrast, Oriental languages like Mandarin Chinese and Japanese make use of tones, whereby the tone pattern across syllables distinguishes words that would otherwise be homophones (Gussenhoven, 2004; Yip, 2002). The use of tones to differentiate words, analogously to consonants and vowels, is not a rare phenomenon. It has been estimated that more than half of the languages in the world are tone languages (Yip, 2002). A difference among tone languages concerns the phonologi- cal units that carry the tones, which is the syllable in Mandarin Chinese and the mora (i.e., a subsyllabic unit that determines syllable weight) in Japanese (Gussenhoven, 2004; Yip, 2002). Moreover, tone languages differ in the number of tones that they use. Mandarin Chinese has four lexically contrastive tones (high, mid-rising, low-dipping, and high-falling) *Correspondence concerning this article should be sent to: Ardi Roelofs, Radboud University Nijmegen, Donders Institute for Brain, Cognition and Behaviour, Centre for Cognition, Spinoza Building B.02.34, Montessorilaan 3, 6525 HR Nijmegen, The Netherlands. (E-mail: [email protected]) 1 I am indebted to Train-Min Chen for developing a preliminary version of WEAVER++ for Mandarin Chinese and to Carlos Gussenhoven, Zeshu Shao, and Atsuko Takashima for helpful comments. Japanese Psychological Research 2015, Volume 57, No. 1, 22–37 Special issue: Speech sounds across languages Invited Review © 2014 Japanese Psychological Association. Published by Wiley Publishing Asia Pty Ltd. doi: 10.1111/jpr.12050

Transcript of Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken...

Page 1: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

Modeling of phonological encoding in spoken wordproduction: From Germanic languages to Mandarin

Chinese and Japanese

ARDI ROELOFS1* Radboud University Nijmegen

Abstract: It is widely assumed that spoken word production in Germanic languageslike Dutch and English involves a parallel activation of phonemic segments andmetrical frames in memory, followed by a serial association of segments to the frame,as implemented in the WEAVER++ model (Levelt, Roelofs, & Meyer, 1999). However,for Oriental languages like Mandarin Chinese and Japanese, researchers have sug-gested that the serial association concerns atonal phonological syllables (MandarinChinese) or moras (Japanese) to tonal frames. Here, the utility of these theoreticalsuggestions is demonstrated by computer simulations of key empirical findings usingversions of WEAVER++ for English, Mandarin Chinese (纺 织 者++), and Japanese(ウィーバー++). The simulation outcomes suggest that, although languages may differin the phonological structure of their words, the principles underlying phonologicalencoding are similar across languages.

Key words: computational modeling, implicit priming, Japanese, Mandarin Chinese,phonological encoding.

An immediately noticeable difference amongthe 6000 or so languages of the world concernsthe sound form of their words. Languagesclearly differ in the segmental repertoire ofvowels and consonants as well as in supra-segmental properties such as stress and tone(Ladefoged, 2001; Ladefoged & Maddieson,1996). In European languages like Dutch andEnglish, one of the syllables of a polysyllabicword carries primary stress. The stress patternacross syllables may sometimes distinguish oneword from another, as in the case of Englishabstract (noun) and abstract (verb). In contrast,Oriental languages like Mandarin Chineseand Japanese make use of tones, wherebythe tone pattern across syllables distinguishes

words that would otherwise be homophones(Gussenhoven, 2004; Yip, 2002).

The use of tones to differentiate words,analogously to consonants and vowels, is not arare phenomenon. It has been estimated thatmore than half of the languages in the worldare tone languages (Yip, 2002). A differenceamong tone languages concerns the phonologi-cal units that carry the tones, which is thesyllable in Mandarin Chinese and the mora(i.e., a subsyllabic unit that determines syllableweight) in Japanese (Gussenhoven, 2004; Yip,2002). Moreover, tone languages differ inthe number of tones that they use. MandarinChinese has four lexically contrastive tones(high, mid-rising, low-dipping, and high-falling)

*Correspondence concerning this article should be sent to: Ardi Roelofs, Radboud University Nijmegen,Donders Institute for Brain, Cognition and Behaviour, Centre for Cognition, Spinoza Building B.02.34,Montessorilaan 3, 6525 HR Nijmegen, The Netherlands. (E-mail: [email protected])

1I am indebted to Train-Min Chen for developing a preliminary version of WEAVER++ for Mandarin Chineseand to Carlos Gussenhoven, Zeshu Shao, and Atsuko Takashima for helpful comments.

bs_bs_banner

Japanese Psychological Research2015, Volume 57, No. 1, 22–37Special issue: Speech sounds across languages Invited Review

© 2014 Japanese Psychological Association. Published by Wiley Publishing Asia Pty Ltd.

doi: 10.1111/jpr.12050

Page 2: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

and one neutral tone, whereby the syllables ofpolysyllabic words may each carry their owntone (Duanmu, 2007). Japanese has only twotones, with one mora of a word carrying highpitch and the other moras bearing low pitch. InDutch and English, the stressed syllable of aword is pronounced louder, longer, and with ahigher pitch than unstressed syllables (Booij,1995; Kenstowicz, 1994; Levelt, 1989; Roach,2009). In Japanese, the accented mora of aword is pronounced with a relatively high tonefollowed by a drop in pitch, while all morasare spoken with equal length and loudness(Akamatsu, 2000; Labrune, 2012).

Over the past few decades, computationallyimplemented models have been developed forhow speakers encode the phonological forms ofwords in English (Dell, 1986, 1988) and Dutch(Levelt, Roelofs, & Meyer, 1999; Roelofs, 1997).However, it has remained unclear to whatextent the encoding principles underlying theseGermanic languages, in which words havestress patterns, apply to languages in which tonemay distinguish one word from another(Gussenhoven, 2004; Yip, 2002). During thepast few years, experimental paradigms thatwere used to examine phonological encoding inGermanic languages, such as implicit priming(Meyer, 1990, 1991), have also been employedin investigating Mandarin Chinese (Bi, Wei,Janssen, & Han, 2009; Chen, Chen, & Dell,2002; O’Seaghdha, Chen, & Chen, 2010) andJapanese (Kureta, Fushimi, & Tatsumi, 2006).In implicit priming experiments, participantshave to produce spoken words in homogeneousblocks of trials in which a word-form propertyis held constant among the response words(e.g., the first phonemic segment /b/ of theresponse words book, bed, and bus) and inblocks in which the property varies (e.g., aswith the response words book, cat, and leg).Typically, response time (RT) is smaller in thehomogeneous than the heterogeneous blocks,often called the implicit priming effect, suggest-ing the benefit of anticipatory preparationof the shared word-form property (here, thesegment /b/) in homogeneous blocks.

Implicit priming experiments on Dutch,English, Mandarin Chinese, and Japanese have

used very similar designs, which allows fora direct comparison of the results. This hasrevealed salient differences in implicit primingeffects among the Germanic languages, Manda-rin Chinese, and Japanese, suggesting importantdifferences in phonological encoding amongthese languages. In particular, whereas phone-mic segments and stress patterns seem to play adominant role in the phonological encoding ofDutch and English words, such a role is playedby syllables and tones in Mandarin Chineseand by moras and tones in Japanese. This raisesthe question as to what extent the principlesunderlying the models of phonological encod-ing developed for English and Dutch apply toMandarin Chinese and Japanese. This issue isaddressed in the present article.

As a theoretical framework for the discus-sion, the article uses the WEAVER++ modelof spoken word planning (Levelt et al., 1999),which has been more extensively applied tofindings from implicit priming than the modelof Dell (1986, 1988). WEAVER++ has beendeveloped primarily for Dutch but has alsobeen applied to English. The model assumesthat phonological encoding in languages likeDutch and English involves a parallel activa-tion of phonemic segments and metricalframes in memory, followed by prosodification,which concerns serial content-to-frame associa-tion (for metrically irregular words, discussedbelow) and syllabification, as illustrated inFigure 1. These assumptions are supported byevidence from implicit priming experiments,which show that sharing of word-initial frag-ments, including single phonemic segments,yields implicit priming effects, but that sharingof noninitial fragments does not (Meyer,1990, 1991). However, studies suggest that forobtaining implicit priming effects in MandarinChinese, the word-initial fragment has tobe minimally a syllable (Chen et al., 2002;O’Seaghdha et al., 2010), whereas it has to beminimally a mora in Japanese (Kureta et al.,2006). Researchers have suggested that forthese languages, the serial association concernsstored atonal phonological syllables (MandarinChinese) or stored moras (Japanese) to tonalframes. In the research reported in the present

Modeling of phonological encoding 23

© Japanese Psychological Association 2014.

Page 3: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

article, the utility of these theoretical sugges-tions was examined through WEAVER++simulations of the key implicit priming findingson English, Mandarin Chinese, and Japanese.

The article is organized as follows. I startby describing phonological encoding inWEAVER++ in somewhat more detail. Next,I explain the implicit priming paradigm, dis-cuss the key findings obtained for Dutchand English, and indicate how WEAVER++explains these findings. Then, I describe the keyfindings on Mandarin Chinese and Japanese,and outline how WEAVER++ may be modifiedto accommodate these findings, followingtheoretical suggestions in the literature. In par-ticular, I adopt the proximate units principleproposed by O’Seaghdha et al. (2010), whichholds that languages differ in the phonologicalunits that are directly connected to lexeme(i.e., whole word form) or morpheme nodes inthe form network (i.e., segments and metricalframes for Dutch and English, atonal phono-logical syllables and tonal frames for MandarinChinese, and moras and tonal frames for Japa-

nese). Finally, I report the results of computersimulations using versions of WEAVER++ forEnglish, Mandarin Chinese (纺 织 者++), andJapanese (ウィーバー++), which demonstratethe utility of the theoretical suggestions. Thesimulation outcomes suggest that, although lan-guages may differ in the phonological structureof their words, the principles underlying phono-logical encoding are similar across languages,with the proximate units principle accommo-dating language-specific phonological proper-ties, such as whether or not tone is used todistinguish one word from another.

Phonological encoding

in WEAVER++

The WEAVER++ model is described in detailin several other articles and I refer to thesepublications for a discussion of the modeland its empirical support (Levelt et al., 1999;Roelofs, 1997). According to the model, theconceptually driven planning of a spokenword involves conceptual preparation, lemmaretrieval, and word-form encoding, with thelatter involving morphological, phonological,and phonetic encoding. Information aboutwords is stored in a large associative networkthat is accessed by spreading activation, whilecondition-action rules select nodes and performencoding operations. The network containsnodes for lexical concepts, lemmas, morphemes,phonemic segments, metrical frames (in casethe word’s stress pattern cannot be derivedby rule), and syllable motor programs. Forexample, for the English word guitar, there arenodes for the concept GUITAR(X), the lemmaof guitar specifying that the word is a noun(for languages such as Dutch, lemmas alsospecify grammatical gender) and allowing forthe abstract morphosyntactic specification ofnumber (singular, plural), the morpheme<guitar>, a metrical frame specifying that stressis on the second syllable, the phonemic seg-ments /g/, /I/, /t/, /A:/, and /r/, and the syllablemotor programs [gI] and [tA:r]. Activationspreads from level to level, whereby each nodesends a proportion of its activation to con-nected nodes.

Phonologicalword

Frameac�va�on

Lexeme/morpheme

Prosodifica�on

Contentac�va�on

Figure 1 Phonological encoding in WEAVER++involves parallel activation of phonological contentsand frames, followed by prosodification, whichincludes serial content-to-frame association. Contentand frame refer to phonemic segments and irregularmetrical structures for Germanic languages likeDutch and English, atonal phonological syllables andtonal structures for Mandarin Chinese, and morasand tonal structures for Japanese.

A. Roelofs24

© Japanese Psychological Association 2014.

Page 4: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

Figure 2 illustrates the representation of theform of guitar in the network, including con-tent nodes for the stem morpheme, phonemicsegments, and syllable motor programs, aswell as frame nodes for the metrical structure.Although previous research has examined thenature of metrical structures in spoken wordproduction (Roelofs & Meyer, 1998), it is stillunclear exactly what the metrical frames inmemory specify. It has been proposed thatthese frames make explicit the stress patternacross syllables (Levelt et al., 1999), which maybe coded as an abstract grouping of syllablesinto feet and phonological words (Roelofs,1996a). However, whether feet are really speci-fied is still unclear. In phonological theory inlinguistics, metrical structures are typically rep-resented as metrical trees or grids (Kenstowicz,1994). The form network illustrated in Figure 2simplifies earlier proposals by assuming thatmemory only indicates for each syllablewhether it is metrically weak (w) or strong (s),in line with what has been suggested in theliterature for tone and tonal frames (Kuretaet al., 2006). The issue of what is made explicitby prosodic (i.e., metrical or tonal) frames inmemory is taken up in the General Discussion.

In morphological encoding, one or moremorphemes are selected for a lemma and itsabstract morphosyntactic specification. Forexample, for the lemma of guitar and thespecification “singular”, the stem morpheme<guitar> is selected. In phonological encoding,condition-action rules select the associated seg-

ments (i.e., /g/, /I/, /t/, /A:/, and /r/) and metricalframe (specifying that stress is on the secondsyllable), and associate the segments to themetrical frame, thereby assigning syllable posi-tions to the segments (e.g., /g/ will become onsetof the first syllable). The association proceedsserially from the first to last segment of theword. The online assignment of syllable posi-tions allows for syllabification across mor-pheme and word boundaries, as in hea-ring orhea-rit (see Levelt et al., 1999, for extensive dis-cussion). The outcome of phonological encod-ing is a phonological word representation,which specifies the segments grouped into syl-lables and the stress pattern across syllables.In phonetic encoding, the phonological wordrepresentation is used to retrieve correspond-ing syllable motor programs (i.e., [gI] and[tA:r]) from a mental syllabary, which is astore of ready-made articulatory programs (cf.Cholin, Dell, & Levelt, 2011; Cholin, Levelt, &Schiller, 2006; Levelt & Wheeldon, 1994).

The model assumes short-term bufferingfacilities for intermediate results (referred toas the “suspend/resume mechanism” in earlierarticles, Roelofs, 1997), which is importantfor the account of implicit priming effects. Innormal word planning, these buffers not onlysupport the serial and incremental encodingof morphological, phonological, and phoneticrepresentations, but also will absorb temporalasynchronies that may arise from differentspeeds of processing at different encodingstages. For example, if a word consists of two

<guitar>

/g//I1 2 1 2 3 4

o n o n

[gI

Morpheme

Segments

Syllablemotorprograms

Metrical framesw /r/

5

c

a

a

Figure 2 Illustration of the representation of the English word guitar in the form network of WEAVER++.c = coda; n = nucleus; o = onset; s = strong; w = weak.

Modeling of phonological encoding 25

© Japanese Psychological Association 2014.

Page 5: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

morphemes, the second morpheme may havebeen selected while phonological encoding forthe first morpheme is still in progress. By tem-porarily buffering the second morpheme, thetemporal asynchrony between morphologicaland phonological encoding may be resolved. Inphonological encoding, syllabification of a wordcan start as soon as the first few segments areavailable, as may be the case in the homoge-neous condition of implicit priming experi-ments. The resulting partial representation canbe buffered until the missing segments are avail-able and syllabification can continue. Thus,encoding operations are always completed as faras possible, after which they are put on hold.When further information becomes available,the encoding processes continue from wherethey stopped. Buffered forms in WEAVER++are only expandable toward the end of the word.

Implicit priming in Dutch

and English

The implicit priming paradigm was designedby Meyer (1990, 1991). To overcome the diffi-culty of finding appropriate pictures to evokeword responses, Meyer used paired-associates.Participants first learned prompt-responsepairs, such as CRADLE-baby, TOAST-bagel,WATER-basin. Next, they had to produce theresponse word of a pair (e.g., bagel) upon visualpresentation on a computer screen of theprompt word (TOAST). The order of promptsacross trials was random. The RT, the intervalbetween prompt onset and speech onset, wasthe main dependent variable. Response wordswere produced in two types of trial blocks,homogeneous and heterogeneous. In a homo-geneous block, all the response words shared aword-form property, for example, the first syl-lable (as in baby, bagel, and basin). In the het-erogeneous blocks, the response words did notshare such form property (e.g., baby, melon,hammer). The response words were tested inboth the homogeneous and the heterogeneousblocks of trials.

Using this technique, mean RT is typicallyshorter in homogeneous than in heterogeneousblocks, the implicit priming effect (Meyer,

1990, 1991). Moreover, a strong effect of serialorder is obtained. Sharing of word-initial frag-ments (e.g., the first phonemic segment, as inbook, bed, and bus, or the first syllable, asin baby, bagel, and basin) yields facilitation,but sharing of noninitial fragments doesnot (e.g., the second syllable, as in hammer,summer, beamer).The magnitude of the implicitpriming effect on RTs increases with thenumber of shared word-initial segments. More-over, the magnitude of the effect depends onmorpheme structure and other abstract linguis-tic variables (Cholin, Schiller, & Levelt, 2004;Janssen, Roelofs, & Levelt, 2002, 2004; Roelofs,1997). For example, the implicit priming effectis larger when a shared syllable makes up amorpheme in the response words (e.g., the syl-lable in of input) than when the same syllabledoes not make up a morpheme (e.g., the syl-lable in of insect, Roelofs, 1996a; Roelofs &Baayen, 2002), and the effect is larger for low-frequency morphemes compared with high-frequency morphemes (Roelofs, 1996b, 1998).Also, the effect is larger for low-frequency thanhigh-frequency syllables (Cholin & Levelt,2009). Moreover, there is an implicit primingeffect when all words in a block of trials sharean initial phonemic segment, but no effect at allwhen the initial segments share all but one oftheir phonological features. For example, thereis no implicit priming effect for the responsesbaby, bagel, and patient, where the segments /b/and /p/ share all phonological features exceptvoicing (i.e., /b/ is voiced and /p/ is voiceless).This suggests that the implicit priming effect ofshared phonemic segments arises at the level ofphonological rather than phonetic encoding(see Roelofs, 1999, for an extensive discussion).Sharing of only the metrical structure does notyield an implicit priming effect (Roelofs &Meyer, 1998). In line with this, Schiller, Fikkert,and Levelt (2004) also obtained no effectof auditory priming of metrical structure inpicture naming.

The implicit priming effect is not onlyobtained using the prompt-response proce-dure, but also with picture naming and wordreading (Bi et al., 2009; Chen & Chen, 2013;O’Seaghdha et al., 2010; Roelofs, 1999, 2004,

A. Roelofs26

© Japanese Psychological Association 2014.

Page 6: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

2006; Santiago, 2000). The latter suggests thatthe priming effect does not arise from remem-bering the prompt-response pairs per se, butfrom the anticipatory preparation and tempo-rary maintenance of partial morphologi-cal, phonological, and phonetic codes. Implicitpriming effects have been obtained in severalEuropean languages, including the Germaniclanguages Dutch (Meyer, 1990, 1991; Roelofs,1996a,b, 1998; Roelofs & Baayen, 2002; Roelofs& Meyer, 1998) and English (Damian &Bowers, 2003), and the Romance languagesFrench (Alario, Perre, Castel, & Ziegler, 2007)and Spanish (Santiago, 2000).

Several of the key findings on implicitpriming have been simulated by WEAVER++.According to the model, shorter naming RTs inhomogeneous than heterogeneous blocks maybe obtained when the word production systemencodes a partial word-form representation forthe response words before the identity of theactual response on a trial is known. This partialrepresentation is temporarily buffered untilinformation about the remainder of the wordform becomes available after lexical selection.Consequently, encoding of the buffered repre-sentation may be skipped and quicker formencoding may occur in planning the form ofthe actual response on a trial. For example,if the response words in a block of trials arebook, bed, and bus, the first segment /b/ maybe selected in advance. Although the actualresponse on a trial is not predictable (i.e., thismay be book, bed, or bus), the /b/ may be pre-pared regardless of the eventual response.When the overlap is larger (e.g., as in baby,bagel, and basin), more can be prepared (i.e., /b/and /eI/). Participants are informed about theprompt-response pairs or picture names beforethe start of a block of trials. Thus, a participantknows in advance what the overlap is amongthe response words before the first trial ofan experimental block and may prepare theresponse accordingly. In a heterogeneous blockof trials (e.g., book, cat, and leg), nothing canbe phonologically encoded in advance. Partialpreparation is not possible because theresponse words differ in onset segments (i.e.,/b/, /k/, and /l/).Therefore, RTs will be shorter in

homogeneous than in heterogeneous blocks, asempirically observed (Damian & Bowers, 2003;Meyer, 1990, 1991). The effects of morphologi-cal status (Roelofs, 1996a; Roelofs & Baayen,2002) and morpheme frequency (Roelofs,1996b, 1998) suggest that preparation may alsohappen at the level of morphological encod-ing. Moreover, the effect of syllable frequency(Cholin & Levelt, 2009) suggests that advancepreparation may also involve syllabary access.

Implicit priming in Mandarin

Chinese and Japanese

In implicit priming experiments on MandarinChinese, Chen et al. (2002) examined the roleof syllables and tones. In particular, they testedwhether sharing of single onset segments,tones, atonal syllables, and tonal syllablesyields implicit priming effects. For example, inthe homogeneous condition, the responsesshared the first syllable and tone (e.g., qing1in qing1.liang2, qing1.sheng1, qing1.ting2,qing1.jiau1), the first syllable but not thetone (e.g., fei in fei1.ji1, fei2.pang4, fei3.cuei4,fei4.yan2), the first segment (e.g., /m/ inmo1.cai3, ma2.que4, mu3.dan1, mi4.yue4), orthe first tone (e.g., tone 1 in fei1.ji1, ke1.ji4,xi1.gua1, qing1.jiau1), whereas the responsesdid not systematically share part of theirform in the heterogeneous condition (e.g.,qing1.liang2, xi2.su2, fei3.bang4, ke4.ben3).Chen et al. (2002) observed that sharing ofthe first syllable plus tone yielded an implicitpriming effect, which was larger than the effectof sharing the first syllable without the tone.However, sharing the first segment or tone onlyyielded no effect. The lack of a tone-only effectcorresponds to the absence of an implicitpriming effect of only sharing metrical structurein Dutch (Roelofs & Meyer, 1998). The resultsfor Mandarin Chinese are illustrated inFigure 3 (black bars).The absence of an implicitpriming effect for single initial segments inMandarin Chinese has been replicated recentlyby O’Seaghdha et al. (2010) and Chen andChen (2013). Moreover, O’Seaghdha et al.(2010) observed that sharing of the firstsegment in English does yield an implicit

Modeling of phonological encoding 27

© Japanese Psychological Association 2014.

Page 7: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

priming effect, in line with earlier findingsfor Dutch (Meyer, 1990, 1991). These resultssuggest that phonemic segments and phono-logical syllables play a different role in Manda-rin Chinese than in Germanic languageslike English and Dutch (see You, Zhang, &Verdonschot, 2012, for converging evidencefrom masked priming in Mandarin Chinese).Whereas for getting implicit priming effectsin English and Dutch, advance knowledge ofthe initial segment of a word suffices, the wordinitial fragment has to be minimally a syllable inMandarin Chinese to obtain such effects.

There were only 12 participants in each ofthe experiments of Chen et al. (2002) andO’Seaghdha et al. (2010), which is not a largenumber. However, the onset priming effect wasobtained for English but not for MandarinChinese. This suggests that the absence of aneffect for Mandarin is not just a matter of lack ofexperimental power,otherwise the effect shouldalso have been absent for English. Moreover,Chen and Chen (2013) replicated the absence ofan onset priming effect for Mandarin Chineseusing much larger groups of participants.

In implicit priming experiments on Japanese,Kureta et al. (2006) examined the role of seg-ments and moras. In particular, they testedwhether sharing of a single onset segmentor first mora yields implicit priming effects.

For example, in the homogeneous conditionthe responses shared the first mora (e.g., te intezjoo, tegami, tecuja) or first segment (e.g., /t/ intenshi, tonbi, tanki), whereas in the heteroge-neous condition the responses did not sharepart of their form (e.g., tenshi, panda, konto).Kureta et al. (2006) observed that sharing ofthe first mora yields an implicit priming effect,but that sharing of the first segment does not.These results are illustrated in Figure 4. Thus,whereas for obtaining implicit priming effectsin English and Dutch, advance knowledgeof the initial segment of a word suffices, theword initial fragment has to be minimally amora in Japanese to obtain such effects (seeVerdonschot et al., 2011, for converging evi-dence from masked priming in Japanese).

To account for these differences in implicitpriming effects among Germanic languages,Mandarin Chinese, and Japanese, O’Seaghdhaet al. (2010) proposed the proximate units prin-ciple, according to which languages differ in thephonological units that are directly connectedto lexeme or morpheme nodes in the formnetwork. The proximal units are phonemicsegments and metrical frames for Dutch andEnglish (cf. Levelt et al., 1999), atonal phono-logical syllables and tonal frames for MandarinChinese (cf. Chen et al., 2002), and moras andtonal frames for Japanese (cf. Kureta et al.,

Empirical WEAVER++

Syllable + tone Syllable Segment Tone

Overlap

Prim

ingeff

ect(m

s)

-60-50-40-30-20-1001020

Ø Ø

Figure 3 Implicit priming effects as a function of type of overlap in Mandarin Chinese: Real data from Chenet al. (2002) and WEAVER++ simulation results. Ø = No effect in the model.

A. Roelofs28

© Japanese Psychological Association 2014.

Page 8: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

2006). Nodes for atonal phonological syllables(Mandarin Chinese) and moras (Japanese) areconnected to lexeme or morpheme nodes,on the one hand, and to phonemic segmentnodes, on the other hand (cf. Qu, Damian, &Kazanina, 2012). Moreover, content-to-frameassociation creates phonological word repre-sentations that are used to retrieve correspond-ing syllable motor programs from a mentalsyllabary (cf. Tamaoka & Makioka, 2009).In the next section, I report the results ofcomputer simulations using versions ofWEAVER++ for English, Mandarin Chinese(纺 织 者++), and Japanese (ウィーバー++),which tested the utility of these theoretical sug-gestions concerning word-form encoding.

Computer simulations for English,

Mandarin Chinese, and Japanese

The computational protocol was the same as inprevious WEAVER++ simulations of word-form encoding in Dutch (Roelofs, 1996a, 1997,2002, 2008; Roelofs & Meyer, 1998).The param-eter values were fixed and identical to those inearlier simulations. The latency of applicationof condition-action rules was set to 50 ms. Theform network included phonological syllable

nodes and four-tone tonal frames for MandarinChinese and mora nodes and two-tone tonalframes for Japanese. In all simulations, encod-ing operations were completed as far as pos-sible given the advance information about formoverlap among the response words. Furtherdetails are given below.

Simulating implicit priming in EnglishWhereas sharing an onset segment generatesan implicit priming effect in Dutch (Meyer,1991), it yields no effect in Mandarin Chinese(Chen et al., 2002). O’Seaghdha et al. (2010)replicated the absence of an implicit primingeffect for shared single onset segments in Man-darin Chinese, and showed that an implicitpriming effect for shared onsets is obtained inEnglish, in line with the seminal findings ofMeyer (1991) for Dutch. The effect obtainedfor English was 16 ms using picture naming(O’Seaghdha et al., 2010, Experiment 6).

Computer simulations confirmed that sharingof a single onset segment yields an implicitpriming effect for English in WEAVER++. Thesimulation used a form network as illustratedin Figure 2 for the English word guitar. In thehomogeneous condition, the responses sharedthe first segment, like the /g/ in guitar, gazelle,and gouache, whereas there was no such seg-mental overlap in the heterogeneous condi-tion. Syllabification was assumed to take 25 msper segment (cf. Indefrey, 2011). Onset overlapyielded an implicit priming effect of 17 ms in thesimulations, which corresponds to the empiricalresults obtained for English (O’Seaghdha et al.,2010).

Simulating implicit priming inMandarin ChineseMandarin Chinese has four lexically contrastivetones and some 400 different atonal syllables,which realize some 1200 different tonal syllables(i.e., 400 possible tonal syllables do not occur).Unlike Dutch and English, there is no resyllabi-fication across morpheme and word boundaries.Moreover, implicit priming experiments suggestthat there is no morphological decompositionin production (Chen & Chen, 2007), differentfrom Dutch and English (cf. Dell, 1986; Roelofs,

Empirical WEAVER++

Mora Segment

Overlap

Prim

ingeff

ect(ms)

-60-50-40-30-20-1001020

Ø

Figure 4 Implicit priming effects as a function oftype of overlap in Japanese: Real data from Kuretaet al. (2006) and WEAVER++ simulation results.Ø = No effect in the model.

Modeling of phonological encoding 29

© Japanese Psychological Association 2014.

Page 9: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

1996a,b; Roelofs & Baayen, 2002). Based ontheir implicit priming results, Chen et al. (2002)and O’Seaghdha et al. (2010) argued that theform network of Mandarin Chinese includesnodes for atonal phonological syllables (linkedto phonemic segments) and tones. In the simu-lations, it was assumed that the form network forMandarin Chinese contains nodes for lexemes(i.e., whole word forms), tonal frames, atonalphonological syllables (i.e., tonally unspecified),syllabified phonemic segments,and syllable pro-grams specified for tone, as illustrated inFigure 5 for the word xizang (Tibet). The high,mid-rising, low-dipping, and high-falling tonesare often denoted by the numerals 1, 2, 3, and 4(and the neutral tone by 0), as is done inFigure 5. In form encoding, atonal phonologicalsyllable nodes and tonal frame nodes areselected for a selected lexeme node, followed byserial syllable-to-tone association. Because seg-ments are already syllabified, segments withinan atonal phonological syllable may be selectedin parallel. That is, there is no need for serialsyllabification as in Germanic languages. Theconstructed phonological word presentationis used to select the corresponding syllableprogram nodes.

In the simulations, it was tested whethersharing of a single onset segment, first tone, first

atonal syllable, and first tonal syllable yieldsimplicit priming effects, following Chen et al.(2002). Figure 3 illustrates the simulation out-comes, showing that sharing of the first syllableplus tone yielded an implicit priming effectwhich was larger than the effect of sharingthe first syllable without the tone. Moreover,sharing the first segment only or the first toneonly yielded no priming effects. These simula-tion results correspond to the empirical obser-vations of Chen et al. (2002) for a shared tonalsyllable (their Experiment 1), atonal syllable(Experiment 2), first segment (Experiment 5),and first tone (Experiment 4).

If the response words in a homogeneousblock of trials share the first syllable plus tone,the first atonal syllable and first tone may beselected and associated, the correspondingsegment nodes may be selected, and thecorresponding syllable program node may beselected in advance. In a heterogeneous blockof trials, nothing can be phonologically andphonetically encoded in advance. Therefore,RTs will be shorter in homogeneous than inheterogeneous blocks, as empirically observed.If the response words in a homogeneous blockof trials share the first syllable but not the tone,the first atonal syllable and the correspondingsegment nodes may be selected, but association

<xizang>

//i://z/ /æ/

41

1 2

o n n c

o n o n

i:1] [zæŋ4]

/ŋ/

i:/ /zæŋ/

o

c

1 2

Lexeme

Atonal syllables

Syllablemotorprograms

Segments

Tonal frame∫

Figure 5 Illustration of the representation of the Mandarin Chinese word xizang (西藏) in the form networkof WEAVER++. The numbers below the arrows in the tonal frame denote tone numbers. The links betweenatonal syllables and segments specify syllable positions. Syllable motor programs are specified for tone.1 = high; 4 = high falling; c = coda; n = nucleus; o = onset.

A. Roelofs30

© Japanese Psychological Association 2014.

Page 10: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

to the tone cannot take place. Moreover, a syl-lable program node cannot be selected, becausethis depends on the completion of syllable-to-tone association (i.e., syllable programs arespecified for tone). Consequently, there will bean implicit priming effect for sharing the firstatonal syllable, but this effect will be smallerthan when both the first atonal syllable andfirst tone are shared, as empirically observed.Moreover, when the response words in a homo-geneous block of trials share only the firstsegment, the corresponding segment may beselected. No other node in the network maybe selected in advance. Given that the othersegments of the first phonological syllablecannot be selected in advance (because theyare unknown), the selection of these nodes willset the pace during the planning of the actualresponse on a trial. As a consequence, noimplicit priming effect will occur, as empiricallyobserved. Similarly, when only the first toneis shared, no implicit priming effect will beobtained, as observed.

O’Seaghdha et al. (2010) suggest that partici-pants in implicit priming experiments “inten-tionally orient” to the proximate units of theirlanguage and prepare these if possible. Theystated, “Mandarin speakers plan syllables anddo not engage onset information even thoughthis would potentially provide a benefit inhomogeneous contexts” (p. 290). “Not only doMandarin speakers use syllable units to planand regulate speech, but this language propertyappears to block them from engaging theshared onsets of those syllables” (p. 290).However, the simulation results reveal that thisis not a necessary assumption. In the simula-tions, a shared onset was prepared but this didnot yield an implicit priming effect in the RTs.The effect of onset preparation was hidden bythe parallel selection of segments of a phono-logical syllable, made possible by the storedsyllabifications (i.e., serial syllabification of seg-ments is assumed not to occur in MandarinChinese). Nevertheless, one may argue that theassumption of parallel selection of segments inMandarin Chinese and serial selection in Ger-manic languages goes against the generalityof the model. If segment selection proceeds

serially in Mandarin Chinese, the assumptionof no intentional orientation to segments ofO’Seaghdha et al. (2010) may account for theabsence of onset priming in Chinese. Futurestudies may further examine these two possi-bilities (i.e., parallel selection of segmentsversus no intentional orientation to segments inMandarin Chinese).

To conclude, the simulations showed thatsharing of atonal and tonal syllables yieldsimplicit priming effects in Mandarin Chinese,whereas sharing of single onset segments andfirst tones yields no effect. In contrast, sharingof single onset segments does yield an implicitpriming effect in simulations for English, inagreement with the empirical observations.

Simulating implicit priming in JapaneseJapanese uses two tones, with one mora of aword carrying high pitch and the other morasbearing low pitch. There are some 100 coremoras, which together with a number of conso-nants make up some 300 core syllables in thelanguage (Tamaoka & Makioka, 2009). As inMandarin Chinese, there is no resyllabificationacross morpheme and word boundaries. Inthe simulations, it was assumed that the formnetwork for Japanese contains nodes forlexemes, tonal frames, moras, phonemic seg-ments, and syllable programs, as illustrated inFigure 6 for the word tenshi (angel). In formencoding, mora nodes and tonal frame nodesare selected for a selected lexeme node, fol-lowed by serial mora-to-tone association andsyllabification. The constructed phonologicalword representation is used to select the corre-sponding syllable program nodes.

In the simulations, it was tested whethersharing of a single onset segment and first morayields implicit priming effects, following Kuretaet al. (2006). In both cases, the first tone wasalso shared. Figure 4 illustrates the simulationoutcomes, showing that sharing of the first morayielded an implicit priming effect, but thatsharing of the first segment did not. Theseresults correspond to the empirical observa-tions of Kureta et al. (2006) for a shared initialsegment (their Experiment 1) and first mora(Experiment 3).

Modeling of phonological encoding 31

© Japanese Psychological Association 2014.

Page 11: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

If the response words in a homogeneous blockof trials share the first mora and tone pattern,the first mora and tonal frame nodes may beselected and associated, and the correspondingsegment nodes may be selected and syllabified.In a heterogeneous block of trials, nothing canbe phonologically encoded in advance. There-fore,RTs will be shorter in homogeneous than inheterogeneous blocks, as empirically observed.If the response words in a homogeneous blockof trials share only the first segment and tonepattern, the corresponding segment and tonalframe nodes may be selected. However, giventhat the first mora cannot be selected in advance(because it is unknown),mora-to-frame associa-tion cannot take place in advance but has to bedone during the planning of the actual responseon a trial. As a consequence, no implicit prim-ing effect will occur, as observed empirically.According to the model, the reason for theabsence of an onset effect in Japanese is essen-tially the same as for Mandarin Chinese. In bothcases, onset segments are prepared in the homo-geneous condition,but this is not reflected in theRTs because of the dynamics of phonologicalencoding in these languages.

General discussion

As outlined previously, languages differ in thephonological structure of their words. In line

with these differences, studies using the implicitpriming paradigm have revealed salient differ-ences in implicit priming effects among Dutch/English, Mandarin Chinese, and Japanese. Inparticular, whereas sharing of word-initial seg-ments yields implicit priming effects in Dutchand English, the word-initial fragment has tobe minimally a syllable in Mandarin Chineseand minimally a mora in Japanese to yield sucheffects. These results confirm other evidencesuggesting that phonemic segments and stresspatterns feature prominently in the phonologi-cal encoding of Dutch and English words,whereas a dominant role is played by syllablesand tones in Mandarin Chinese and by morasand tones in Japanese. The present articleaddressed the issue of whether the principlesunderlying the models of phonological encod-ing developed for English and Dutch apply toMandarin Chinese and Japanese. Accordingto the WEAVER++ model (Levelt et al., 1999),phonological encoding in Dutch and Englishinvolves parallel activation of phonemic seg-ments and metrical frames in memory, whichis followed by a serial association of segmentsto the frame. Researchers have suggested thatfor Mandarin Chinese and Japanese, the serialassociation concerns stored atonal phonologi-cal syllables (Mandarin Chinese) or storedmoras (Japanese) to tonal frames. That is, theproximate units may differ among languages,

<tenshi>

LL2 3

1 2 1 2

o n c o

[teHnL L]

/te/ /n/

n

1 2

Lexeme

Moras

Syllablemotorprograms

Segments

Tonal frame

1

H3

/i/

ɕ

ɕ

ɕ

Figure 6 Illustration of the representation of the Japanese word tenshi (てんし) in the form network ofWEAVER++. Syllable motor programs are specified for tone. c = coda; H = high tone; L = low tone; n =nucleus; o = onset.

A. Roelofs32

© Japanese Psychological Association 2014.

Page 12: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

but activation and association processes aresimilar (cf. O’Seaghdha et al., 2010). The utilityof these theoretical suggestions was demon-strated by computer simulations of the keyimplicit priming findings using versions ofWEAVER++ for English, Mandarin Chinese,and Japanese.

The simulation outcomes support the exis-tence of both unity and differentiation of pho-nological encoding processes among languages.In the remainder of this article, a number ofissues are discussed. These issues concern thenature of the metrical and tonal frames, thedistinction between stress and tone languages,the status of phonemic segments, subsyllabiceffects in Chinese, and the nature of implicitpriming effects (i.e., what exactly the implicitpriming paradigm measures). I briefly discussthese issues in turn and make suggestions forfuture research.

Nature of metrical and tonal framesIn the simulations, it was assumed that tonalframes make explicit which of four tones isassociated with each syllable of a word in Man-darin Chinese and which mora is associatedwith the high tone of a word in Japanese, fol-lowing suggestions in the literature (Kuretaet al., 2006; O’Seaghdha et al., 2010). Similarly,it has been assumed that metrical frames makeexplicit the number of syllables and whichof the syllables of a word receives primarystress in Dutch and English (Levelt et al., 1999).The form network needs to specify the stresspattern across syllables only for a minority ofthe words in these languages (some 10% of allwords), because most words have stress on thefirst stressable syllable. Consequently, for mostwords in Dutch and English, stress can beassigned by rule. Similarly, a low-high tonepattern is prevalent in Japanese (Kureta et al.,2006), which therefore may be assigned by rule.The Japanese form lexicon needs to specify thetone pattern only for a minority of all words(i.e., some 14%, see Kureta et al., 2006). In con-trast, which of the four tones is associated witheach syllable of a word in Mandarin Chinese isnot predictable and therefore needs to be speci-fied for each word in the form network.

Evidence for the assumption that metricalframes make explicit the number of syllablesand which of the syllables of a word receivesprimary stress in Dutch was obtained in implicitpriming experiments by Roelofs and Meyer(1998) using words that did not have stress onthe first stressable syllable. The results of theexperiments suggested that sharing of the firstsyllable of these words only yields an implicitpriming effect if all response words in a block oftrials have the same metrical structure. Thissuggests that segment-to-frame association canonly take place if the full metrical frame isshared. However, subsequent (yet unpublished)experiments by Meyer and the author failed toreplicate this finding. Instead, implicit primingeffects were obtained regardless of whether thefull metrical frame was shared or not. Theseresults suggest that for words with an unpre-dictable stress pattern, the form networkonly indicates which of the syllables receivesprimary stress rather than specifying both thenumber of syllables and the syllable receivingprimary stress. Similarly, for Japanese, giventhat one mora of a word carries the high toneand the other moras bear the low tone, it maybe that the form network only specifies whichof the moras carries the high tone rather thanindicating for each mora whether it bears thelow or high tone, as assumed by Kureta et al.(2006). Future experiments may examine thislatter possibility by testing for implicit primingeffects using response words that share the firstmora plus low tone, but vary in whether thehigh tone is on the second or third mora.

Distinction between stress and tonelanguagesIn Dutch and English, the stress pattern acrosssyllables may distinguish one word fromanother, whereas in words in Mandarin Chineseand Japanese, the tones associated with syl-lables may distinguish words that would other-wise be identical in form (Gussenhoven, 2004;Yip, 2002). Tone is used in standard Dutchand English as part of intonation patterns(Roach, 2009), but it is not lexically distinctive.Moreover, intonational pitch accent is assignedby rule to the syllable carrying primary stress

Modeling of phonological encoding 33

© Japanese Psychological Association 2014.

Page 13: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

within a word. However, some dialects ofDutch, such as Roermond Dutch, do use tone ina lexically distinctive manner (Gussenhoven,2004), and therefore tone should be specified inthe form lexicon of the speakers of those dia-lects. Moreover, it has been argued that wordsin Mandarin Chinese have stress patterns,although much less obviously than in Dutchand English (Duanmu, 2007). Words of twoor more syllables can show different levels ofstress, with unstressed syllables showing vowelreduction or tone neutralization. These casesblur the distinction between stress and tone lan-guages, and suggest that the suprasegmentalframes for some dialects of Dutch and for Man-darin Chinese should specify more than, respec-tively, stress and tone. Future implicit primingexperiments may examine this issue by manipu-lating tone in Roermond Dutch and stress inMandarin Chinese.

Status of phonemic segmentsIn the simulations, it was assumed that theproximate units differed among languages (i.e.,atonal syllables and four-tone tonal frames forMandarin Chinese and moras and two-tonetonal frames for Japanese), but that the formnetwork contains phonemic segment nodeslinked to a syllabary for all languages. Evidencethat the form network of Mandarin Chinesecontains segments nodes has been providedby Qu et al. (2012), who obtained segmentaleffects in naming colored pictures in an electro-physiological (EEG) study. Event-related brainpotentials (ERPs) differed between a conditionin which picture and color name shared theonset segment and a condition in which theydid not. However, there was no RT differencebetween the conditions. This suggests that seg-ments may play a role in planning the produc-tion of words in Mandarin Chinese, withoutyielding a behavioral effect. Similarly, initialsegments were actually prepared in thesegment-only condition of the WEAVER++simulations for Mandarin Chinese, reported inthe present article. However, advance prepara-tion yielded no implicit priming effect on laten-cies in the model because the preparation washidden by the selection of the other segments of

the first syllable during the planning of theactual response on a trial. Future EEG experi-ments may examine whether segments play arole in planning spoken words in MandarinChinese and Japanese by measuring ERPs insegment-only and tone-only conditions.

Subsyllabic effects in ChineseUsing implicit priming in a different dialect ofChinese, namely Cantonese, Wong, Huang,and Chen (2012) observed that sharing the firstsyllable of disyllabic words yields an implicitpriming effect, but that sharing of the firstsegment does not, thereby replicating pre-vious findings obtained for Mandarin Chinese.However, sharing the body (i.e., onset andnucleus) of the first syllable also yielded animplicit priming effect, although the magnitudeof the body effect was smaller than the syllableeffect. This priming effect of body overlapmay suggest that subsyllabic units play a role inCantonese phonological encoding (cf. Wong &Chen, 2008, 2009). Alternatively, Verdonschot,Nakayama, Zhang, Tamaoka, and Schiller(2013) suggested that the subsyllabic effect mayrelate to the fact that participants in the experi-ments of Wong and colleagues were Cantonese-English bilinguals, who learned English asa second language when they were young.Perhaps there was an influence from Englishon phonological encoding in Cantonese forthese participants. Using masked priming,Verdonschot et al. (2013) observed that undercertain experimental conditions a subsyllabiconset priming effect may be obtained in highlyproficient Mandarin-English bilingual speak-ers, suggesting that characteristics of phono-logical encoding in a second (Germanic)language may exert an influence on phonologi-cal encoding in a first (Chinese) language.Future studies may further empirically examinethis possibility. Elsewhere, WEAVER++ hasbeen applied to bilingual language produc-tion. For an account of bilingual phonologicalencoding, including computer simulations,I refer to Roelofs (2003) and Roelofs andVerhoef (2006), and for an account of bilinguallexical selection, see Roelofs, Dijkstra, andGerakaki (2013).

A. Roelofs34

© Japanese Psychological Association 2014.

Page 14: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

Nature of implicit priming effectsThe implicit priming paradigm has been one ofthe workhorses in examining the unity and dif-ferentiation of phonological encoding amonglanguages. The conclusions from experimentsusing this paradigm are based on the assump-tion that implicit priming effects reflect antici-patory preparation of word forms. This wasalso assumed in the WEAVER++ simulations.However, although this assumption is widelyaccepted, alternative interpretations are pos-sible. In particular, the homogeneous blocks notonly provide prior information about responseproperties to the participants, but also causethat the relevant attribute is repeated from oneresponse to the next. It therefore remains pos-sible that implicit priming effects are caused bythis repetition of response properties ratherthan by advance preparation, at least for Ger-manic languages (the absence of onset primingin Mandarin Chinese and Japanese alreadyargues against a mere repetition account forthese languages).

These two accounts (preparation vs. repeti-tion) were tested in two experiments in Dutch(Roelofs, 2013). Rather than manipulatingadvance knowledge between blocks of trials,it was manipulated trial-by-trial by using cues,controlling for repetition of response proper-ties between trials. On each trial, a visual cueindicated two possible responses, one of whichhad to be produced. The cued responses shareda form property (homogeneous) or they did not(heterogeneous). In one experiment, the formproperty was the first syllable, and in the otherexperiment, the properties were the first syl-lable and first morpheme. The cues variedfrom trial to trial so that no property repetitionoccurred across trials. For example, a homoge-neous cue indicated that basin and bagel werethe possible picture naming responses on a par-ticular trial. Next, following a short delay, apicture of a basin or a bagel was presented,which had to be named. A heterogeneous cueindicated, for example, that basin and melonwere the possible responses, which was fol-lowed by the presentation of a picture of abasin or a melon. The RT was shorter followinghomogeneous than rather than heterogeneous

cues, replicating the standard implicit primingeffect that is obtained with homogeneousand heterogeneous blocks of trials (Meyer,1990, 1991). Cueing syllables plus morphemesyielded a larger priming effect than cueing syl-lables only, replicating the results of Roelofs(1996a,b) and Roelofs and Baayen (2002).These results suggest that implicit primingeffects in Dutch reflect anticipatory prepara-tion of form properties rather than repetition ofthese properties from trial to trial.

Conclusions

To recapitulate, RT experiments have shownthat whereas sharing word-initial segmentsyields implicit priming effects in Dutch andEnglish, the initial fragment has to be mini-mally a syllable in Mandarin Chinese andminimally a mora in Japanese to yield sucheffects. The outcomes of simulations usingWEAVER++ versions for English, MandarinChinese, and Japanese support the suggestionthat by adopting the proximate units principle,the differences in implicit priming effectsamong languages can be explained.The presentresults suggest that although languages differin the phonological structure of their words, theprinciples underlying phonological encodingare similar across languages. Word forms ofdifferent languages seem to be “woven on thesame loom”.

References

Akamatsu, T. (2000). Japanese phonology: A func-tional approach. Munich, Germany: LincomEuropa.

Alario, F.-X., Perre, L., Castel, C., & Ziegler, J. C.(2007). The role of orthography in speech pro-duction revisited. Cognition, 102, 464–475.

Bi, Y., Wei, T., Janssen, N., & Han, Z. (2009). Thecontribution of orthography to spoken wordproduction: Evidence from Mandarin Chinese.Psychonomic Bulletin and Review, 16, 555–560.

Booij, G. E. (1995). The phonology of Dutch. Oxford,UK: Oxford University Press.

Chen, J.-Y., & Chen, T.-M. (2007). Form encoding inChinese word production does not involve mor-phemes. Language and Cognitive Processes, 22,1001–1020.

Modeling of phonological encoding 35

© Japanese Psychological Association 2014.

Page 15: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

Chen, T.-M., & Chen, J.-Y. (2013). The syllable as theproximate unit in Mandarin Chinese word pro-duction: An intrinsic or accidental property ofthe production system? Psychonomic Bulletinand Review, 20, 154–162.

Chen, J.-Y., Chen, T.-M., & Dell, G. S. (2002). Word-form encoding in Mandarin Chinese as assessedby the implicit priming task. Journal of Memoryand Language, 46, 751–781.

Cholin, J., Dell, G. S., & Levelt, W. J. M. (2011). Plan-ning and articulation in incremental word pro-duction: Syllable-frequency effects in English.Journal of Experimental Psychology: LearningMemory and Cognition, 37, 109–122.

Cholin, J., & Levelt, W. J. M. (2009). Effects of syllablepreparation and syllable frequency in speechproduction: Further evidence for syllabic unitsat a post-lexical level. Language and CognitiveProcesses, 24, 662–684.

Cholin, J., Levelt, W. J. M., & Schiller, N. O. (2006).Effects of syllable frequency in speech produc-tion. Cognition, 99, 205–235.

Cholin, J., Schiller, N. O., & Levelt, W. J. M. (2004).The preparation of syllables in speech produc-tion. Journal of Memory and Language, 50,47–61.

Damian, M. F., & Bowers, J. S. (2003). Effects oforthography on speech production in a form-preparation paradigm. Journal of Memory andLanguage, 49, 119–132.

Dell, G. S. (1986). A spreading-activation theory ofretrieval in sentence production. PsychologicalReview, 93, 283–321.

Dell, G. S. (1988). The retrieval of phonologicalforms in production: Tests of predictions froma connectionist model. Journal of Memory andLanguage, 27, 124–142.

Duanmu, S. (2007). The phonology of standardChinese. Oxford, UK: Oxford University Press.

Gussenhoven, C. (2004). The phonology of toneand intonation. Cambridge, UK: CambridgeUniversity Press.

Indefrey, P. (2011). The spatial and temporal signa-tures of word production components: A criticalupdate. Frontiers in Psychology, 2, Article 255.

Janssen, D. P., Roelofs, A., & Levelt, W. J. M. (2002).Inflectional frames in language production. Lan-guage and Cognitive Processes, 17, 209–236.

Janssen, D. P., Roelofs, A., & Levelt, W. J. M. (2004).Stem complexity and inflectional encoding inlanguage production. Journal of PsycholinguisticResearch, 33, 365–381.

Kenstowicz, M. (1994). Phonology in generativegrammar. Oxford, UK: Basil Blackwell.

Kureta, Y., Fushimi, T., & Tatsumi, I. F. (2006).The functional unit in phonological encoding:

Evidence for moraic representation in nativeJapanese speakers. Journal of Experimental Psy-chology: Learning, Memory, and Cognition, 32,1102–1119.

Labrune, L. (2012). The phonology of Japanese.Oxford, UK: Oxford University Press.

Ladefoged, P. (2001). Vowels and consonants: Anintroduction to the sound of languages. Oxford:Blackwell.

Ladefoged, P., & Maddieson, I. (1996). The soundsof the world’s languages. Oxford: BlackwellPublishers.

Levelt, W. J. M. (1989). Speaking: From intention toarticulation. Cambridge, MA: MIT Press.

Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). Atheory of lexical access in speech production.Behavioral and Brain Sciences, 22, 1–38.

Levelt, W. J. M., & Wheeldon, L. (1994). Do speakershave access to a mental syllabary? Cognition, 50,239–269.

Meyer, A. S. (1990). The time course of phonologicalencoding in language production: The encodingof successive syllables of a word. Journal ofMemory and Language, 29, 524–545.

Meyer, A. S. (1991). The time course of phonologicalencoding in language production: Phonologicalencoding inside a syllable. Journal of Memoryand Language, 30, 69–89.

O’Seaghdha, P. G., Chen, J.-Y., & Chen, T.-M. (2010).Proximate units in word production: Phonologi-cal encoding begins with syllables in MandarinChinese but with segments in English. Cognition,115, 282–302.

Qu, Q., Damian, M. F., & Kazanina, N. (2012). Sound-sized segments are significant for Mandarinspeakers. Proceedings of the National Academyof Sciences of the United States of America, 109,14265–14270.

Roach, P. (2009). English phonetics and phonology: Apractical course. Cambridge, UK: CambridgeUniversity Press.

Roelofs, A. (1996a). Serial order in planning theproduction of successive morphemes of aword. Journal of Memory and Language, 35, 854–876.

Roelofs, A. (1996b). Morpheme frequency in speechproduction: Testing WEAVER. In G. E. Booij &J. van Marle (Eds.), Yearbook of morphology1996 (pp. 135–154). Dordrecht, The Netherlands:Kluwer Academic Publishers.

Roelofs, A. (1997). The WEAVER model of word-form encoding in speech production. Cognition,64, 249–284.

Roelofs, A. (1998). Rightward incrementality inencoding simple phrasal forms in speech pro-duction: Verb-particle combinations. Journal of

A. Roelofs36

© Japanese Psychological Association 2014.

Page 16: Modeling of phonological encoding in spoken word ... · Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese ARDI

Experimental Psychology: Learning, Memory,and Cognition, 24, 904–921.

Roelofs, A. (1999). Phonological segments and fea-tures as planning units in speech production.Language and Cognitive Processes, 14, 173–200.

Roelofs, A. (2002). Spoken language planning andthe initiation of articulation. Quarterly Journal ofExperimental Psychology, Section A: HumanExperimental Psychology, 55, 465–483.

Roelofs, A. (2003). Shared phonological encodingprocesses and representations of languages inbilingual speakers. Language and Cognitive Pro-cesses, 18, 175–204.

Roelofs, A. (2004). Seriality of phonological encodingin naming objects and reading their names.Memory and Cognition, 32, 212–222.

Roelofs, A. (2006). The influence of spelling onphonological encoding in word reading, objectnaming, and word generation. Psychonomic Bul-letin and Review, 13, 33–37.

Roelofs, A. (2008). Attention, gaze shifting, and dual-task interference from phonological encoding inspoken word planning. Journal of ExperimentalPsychology: Human Perception and Perfor-mance, 34, 1580–1598.

Roelofs, A. (2013). Preparation of word-form attri-butes in speeded naming: Role of miscuing.Manuscript submitted for publication.

Roelofs, A., & Baayen, H. (2002). Morphology byitself in planning the production of spokenwords. Psychonomic Bulletin and Review, 9, 132–138.

Roelofs, A., Dijkstra, T., & Gerakaki, S. (2013). Mod-eling of word translation: Activation flow fromconcepts to lexical items. Bilingualism: Languageand Cognition, 16, 343–353.

Roelofs, A., & Meyer, A. S. (1998). Metrical structurein planning the production of spoken words.Journal of Experimental Psychology: Learning,Memory, and Cognition, 24, 922–939.

Roelofs, A., & Verhoef, K. (2006). Modeling thecontrol of phonological encoding in bilingual

speakers. Bilingualism: Language and Cognition,9, 167–176.

Santiago,J. (2000). Implicit priming of picture naming:A theoretical and methodological note on theimplicit priming task. Psicológica, 21, 39–59.

Schiller, N. O., Fikkert, P., & Levelt, C. C. (2004).Stress priming in picture naming:An SOA study.Brain and Language, 90, 231–240.

Tamaoka, K., & Makioka, S. (2009). Japanese mentalsyllabary and effects of mora, syllable, bi-moraand word frequencies on Japanese speech pro-duction. Language and Speech, 52, 79–112.

Verdonschot, R. G., Kiyama, S., Tamaoka, K.,Kinoshita, S., La Heij, W., & Schiller, N. O.(2011). The functional unit of Japanese wordnaming: Evidence from masked priming. Journalof Experimental Psychology: Learning, Memoryand Cognition, 37, 1458–1473.

Verdonschot, R. G., Nakayama, M., Zhang, Q.,Tamaoka, K., & Schiller, N. O. (2013). The proxi-mate phonological unit of Chinese-English bilin-guals: Proficiency matters. PLoS ONE, 8, e61454.

Wong, A. W.-K., & Chen, H.-C. (2008). Processingsegmental and prosodic information in Canton-ese word production. Journal of ExperimentalPsychology: Learning, Memory, and Cognition,34, 1172–1190.

Wong, A. W.-K., & Chen, H.-C. (2009). What areeffective phonological units in Cantonesespoken word planning? Psychonomic Bulletinand Review, 16, 888–892.

Wong, A. W.-K., Huang, J., & Chen, H.-C. (2012).Phonological units in spoken word production:Insights from Cantonese. PLoS ONE, 7, e48776.

Yip, M. (2002). Tone. Cambridge, UK: CambridgeUniversity Press.

You, W., Zhang, Q.-F., & Verdonschot, R. G. (2012).Masked syllable priming effects in word andpicture naming in Chinese. PLoS ONE, 7,e46595.

(Received July 8, 2013; accepted November 9, 2013)

Modeling of phonological encoding 37

© Japanese Psychological Association 2014.