PERCEPTION OF NASAL CONSONANTS WITH SPECIAL REFERENCE TO … · 2017-07-06 · account for the...


PERCEPTION OF NASAL CONSONANTS WITH SPECIAL REFERENCE TO CATALAN

Daniel Recasens+

1. INTRODUCTION

In this paper I study the role that different place cues play in the recognition of nasal stops. I claim that their perceptual relevance is strongly dependent on how they are related at the articulatory and acoustic levels and, essentially, on the nature of the process of speech perception itself. I show that this is the case by investigating experimentally interactive perceptual effects between transitions and murmurs in the recognition of final unreleased alveolar [n], palatal [ɲ] and velar [ŋ] after [a] in Catalan, using synthetic speech stimuli.¹ Special emphasis is given to the cues for the palatal nasal.

I proceed first to investigate what acoustic properties of the signal can be shown to convey place information by looking at a large amount of production and perceptual data on nasal murmurs and formant transitions. The role of releases in the process of place identification for nasals is also taken into account. A consideration of other cues besides formant transitions seems highly advisable. In an early perceptual experiment with synthetic speech (Liberman, Delattre, Cooper, & Gerstman, 1954) it was found that, in contrast with initial non-nasal stops, final nasal consonants ([m], [n], [ŋ]) after different vowels were properly identified only 55% of the time for stimuli with appropriate transition endpoints and a cross-category fixed nasal murmur. Results from more recent experiments, both with synthetic (Garcia, 1966, 1967a, 1967b; Hecker, 1962; House, 1957; Nakata, 1959) and with natural (Henderson, Note 1; Malecot, 1956; Nord, 1976) speech stimuli have shown that not only formant transitions but also murmurs and releases are cues to place of articulation for nasal consonants. As will be shown, experimental data from the literature on speech perception suggest that all these cues for nasals ought to be considered as interdependent and, therefore, need to be taken into account in models of the perceptual evaluation of place cues.

+Also University of Connecticut, Storrs.
Acknowledgment. I acknowledge very gratefully the assistance of Ignatius G. Mattingly at all stages of this investigation and, especially, in the preparation of the manuscript. I would also like to thank Arthur S. Abramson, Louis Goldstein, Alvin Liberman, Bruno Repp, and the faculty of the Department of Linguistics of the University of Connecticut for their relevant comments. The research was supported by NICHD Grant HD01994 and BRS Grant RR05596 to Haskins Laboratories. An abstract of this paper was presented at the 101st Meeting of the Acoustical Society of America (Journal of the Acoustical Society of America, 1981, 69, S83). I am also appreciative of Margo Carter's help with the figures.

[HASKINS LABORATORIES: Status Report on Speech Research SR-69 (1982)]


One must also consider the place cues for [ɲ] within the overall nasal set. It will be pointed out in what way their being taken into consideration at the analysis and synthesis levels affects theoretical considerations about perceptual relevance of transition patterns advanced by other scholars in earlier experiments that did not account for [ɲ].

The allophonic system of Catalan nasals in absolute final position is adequate to test these hypotheses with synthetic speech experiments. It consists of unreleased [m], [n], [ɲ], [ŋ] and allows analysis of the perceptual effects of transitions vs. murmurs with special reference to place cues for [ɲ]. In my experimental paradigm, which differs from approaches taken previously by other investigators, complete patterns of synthetic transitions and murmurs are directly based on real speech utterances and combined reciprocally in perceptual continua for all the different place categories. Perceptual results are related to production data on nasals collected from Catalan speakers and speakers of other languages, and discussed in the light of the literature. Evidence for a complementary perceptual influence of transitions and murmurs consistent with parallel effects observable for both cues at the acoustic level is reported. This and other findings argue for some form of motor theory (Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967) that refers to the unitary articulatory gesture to account for the perceptual processing of dynamic acoustic cues in syllables ending with nasal stops. An integration model similar to that of Dorman, Studdert-Kennedy, and Raphael (1977) for non-nasal stops is proposed to account for the perception of nasals after [a] as well as other vocalic nuclei.

Data for nasal consonants in syllable-final position are taken into consideration because the perceptual effectiveness of murmurs in place recognition in this position is known to be considerably higher than in syllable-initial position (Malecot, 1956; Nord, 1976). Open vowels [a], [æ] have been chosen for analysis and discussion since the perception of consonantal nasalization improves with [a] vs. [i], [u] (Ali, Gallagher, Goldstein, & Daniloff, 1971; Martony, 1964; Zee, 1981). Also [ŋ], which happens to be harder to identify, in general, than other nasal consonants (Garcia, 1966, 1967a,b; Malecot, 1956; Ohala, 1975), can be recognized with [a] quite easily when synthesized (Hecker, 1962) or presented in natural speech (Wang & Fillmore, 1961), but rather poorly when it happens to be contiguous to [i], [u]. In section 4, I will also refer to the interactive effects of nasal cues and other vowel nuclei.

2. CUES FOR NASAL CONSONANTS: PERCEPTUAL RELEVANCE, ARTICULATORY AND ACOUSTIC CHARACTERISTICS

2.1. Manner Cues

Certain well-defined spectral characteristics of nasal murmurs mark nasal consonants as a class, independent of place of articulation and the adjacent nasalized vowels (Delattre, 1958, 1968; Fant, 1960; Fujimura, 1962; Fujimura & Lindqvist, 1970; Hattori, Yamamoto, & Fujimura, 1958; Mattingly, 1968). Formant transitions, on the other hand, are essentially place markers; in fact, as shown below, only the first formant transition contributes effectively to manner identification. In the following paragraphs I refer to those spectral characteristics, their general articulatory correlates and perceptual relevance, in preparation for discussing not only differences in murmur patterns for nasals of different place of articulation but also those experimental paradigms concerned with place identification that make use of fixed or slightly variable murmurs:

a. First formant (N1), at around 250-300 Hz, with higher intensity than the upper spectral regions, dependent on a large internal cavity size (pharynx and nasal subsystems) behind the tongue constriction because of nasal coupling. According to Delattre (1958), the intensity level of N1 and the other spectral regions of the murmur is around 6 dB (N1) and 15 dB (N2, N3, N4...) lower than for a normal non-nasalized vowel. It seems to be the most important class cue for nasal consonants, in contrast with the negligible perceptual role of the frequencies of higher nasal formants (Delattre, 1968).

b. Presence of an antiformant (NZ), varying in frequency with place, according to the size of the mouth cavity behind the tongue constriction, which acts as a shunt. It seems to convey mainly place information.

c. Concentration of formants (N2, N3, N4...) between 300-4000 Hz, with large bandwidth (BW) values, mainly due to the large surface area of the nasal cavities and the dissipative energy losses originated within them. The small perceptual significance of those formants seems to result not only from their low intensity level with respect to N1 (especially N2, often absent, as reported by Fant, 1962, and Weinstein, McCandless, Mondshein, & Zue, 1975) but also from their spectral variability. In synthetic speech experiments, the following ranges of fixed frequency values have been found effective in realizing the nasal murmur when place is held constant: N2: around 1000-1150 Hz; N3: around 2000-2500 Hz (Delattre, 1954; Liberman et al., 1954; Massone, 1980, for Argentinian Spanish; Miller & Eimas, 1977).

The value for N2 has been proved to be dependent on the size of the narrow velar passage to the nasal cavity (Bjuggren & Fant, 1964). The significance of an N3 around a "typical" 2200 Hz area has also been pointed out by Fant (1962); this resonance seems to be chiefly dependent on the characteristics of the pharynx cavity (Fant, 1960; Fujimura, 1962). In line with these observations, De Mori, Gubrynowicz, and Laface (1979) have recently proposed the automatic interpretation of any frequency concentration between 2000-2800 Hz as N3 and of the first nasal formant below it as N2 as a speech recognition rule for identification of nasal consonants.

There are also available data on optimal formant bandwidth values for consonantal nasality. Thus Martony (1964) has stressed the perceptual relevance of an N2 bandwidth value around 250 Hz, given an optimal N1 value at around 100-150 Hz. Such an N2 bandwidth is close to frequencies chosen as the most favorable for nasal perception by Nakata (1959) (N2: 200 Hz; N1: 300 Hz) and Pickett (1965) (N1, N2, N3: 180 Hz; N4: 300 Hz).

d. Overall lower intensity level than vowels. House (1957) assigned to the murmur an intensity 8 dB lower than that appropriate for [i].
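The labeling rule attributed above to De Mori et al. (1979) can be sketched in a few lines. This is only an illustration of the rule as paraphrased in the text, not their implementation: the function name, the input format (a list of spectral peak frequencies in Hz) and the choice of the lowest peak inside the band when several are present are all assumptions.

```python
def label_nasal_formants(peaks_hz, n3_band=(2000.0, 2800.0)):
    """Label N3 and N2 among spectral peak frequencies given in Hz.

    Rule as paraphrased in the text: a frequency concentration inside
    the 2000-2800 Hz band is taken as N3, and the first peak below it
    is taken as N2. Returns a dict whose values may be None.
    """
    peaks = sorted(peaks_hz)
    low, high = n3_band
    in_band = [p for p in peaks if low <= p <= high]
    if not in_band:
        return {"N2": None, "N3": None}
    # Assumption: if several peaks fall in the band, use the lowest one.
    n3 = in_band[0]
    below = [p for p in peaks if p < n3]
    # "First formant below N3" is read here as the nearest lower peak.
    n2 = below[-1] if below else None
    return {"N2": n2, "N3": n3}


# Hypothetical murmur peaks, chosen only to exercise the rule.
labels = label_nasal_formants([300.0, 1000.0, 2200.0, 2900.0])
```

With only two peaks (for example 300 and 2200 Hz) this sketch would label the 300 Hz peak as N2, so a practical version would need an extra check against the N1 region.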


Other manner cues besides nasal murmurs need to be accounted for:

a. Vowel nasalization, taken into consideration in experiments with synthetic speech for a 100 msec (Haskins Laboratories QPR 13, 1954, Appendix 2) and 20 msec (Miller & Eimas, 1977) overlapping period between vowel and consonant. It can be simulated by replacing F1 by two formants (N1, F1) and an antiformant (NZ) and by increasing gradually and monotonically N1 and NZ values from 0 (absence of oronasal coupling) to 600 Hz for N1 and 650 Hz for NZ (Fujimura & Lindqvist, 1970, Figure 13) or 660 Hz for N1 and 700 Hz for NZ (Fant, 1960) (presence of a small degree of oronasal coupling). It would be desirable to reproduce the effect of higher extra pole-zero concentrations of the nasalized vowel and to reach a better understanding of the configuration of frequency continuities and discontinuities between murmur and vowel formants at the closure onset in order to find out to what extent they help to identify consonantal nasality. A continuity between vowel formants and nasal poles (F1-N1, F2-N3, F3-N4...) (Fant, 1960) is confirmed by my considerations on analysis and synthesis of nasals (section 3.2.). Moreover, Takeuchi, Kasuya, and Kido (1975) have shown that an uninterrupted pole excursion between vowel and nasal consonant, with the addition of some nasality parameter that represents the amount of spectral difference between nasal and vowel spectra, can be regarded as a valid cue for detection of nasals as a class.

b. Nasalized releases following the nasal murmur, different from non-nasal stop releases in presenting low-frequency masking (Blumstein & Stevens, 1979).

c. F1 transitions, generally negative but less so than for non-nasal stops. This differential acoustic cue has perceptual relevance for nasals as a class (Fant, 1967; Mattingly, 1968; Miller & Eimas, 1977).
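The simulation of vowel nasalization described under (a) can be illustrated with a small parameter-track generator. This is a minimal sketch under stated assumptions: the text only requires a gradual, monotonic rise of N1 and NZ, so the linear ramp, the 10 msec frame step and the function name are illustrative choices. The default targets are the 600 Hz and 650 Hz values cited from Fujimura and Lindqvist (1970), and the default duration is the 100 msec overlap period mentioned above.

```python
def nasalization_tracks(duration_ms=100.0, step_ms=10.0,
                        n1_target_hz=600.0, nz_target_hz=650.0):
    """Return (time_ms, N1_hz, NZ_hz) frames for the vowel-nasal overlap.

    N1 and NZ rise from 0 Hz (no oronasal coupling) to their targets.
    Assumption: the rise is linear; the source only says it is gradual
    and monotonic.
    """
    n_steps = int(round(duration_ms / step_ms))
    frames = []
    for i in range(n_steps + 1):
        fraction = i / n_steps  # 0.0 at vowel side, 1.0 at target
        frames.append((i * step_ms,
                       n1_target_hz * fraction,
                       nz_target_hz * fraction))
    return frames


frames = nasalization_tracks()
```

Passing `n1_target_hz=660.0, nz_target_hz=700.0` gives the alternative targets cited from Fant (1960), and `duration_ms=20.0` with a smaller step corresponds to the shorter overlap used by Miller and Eimas (1977).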

2.2. Place Cues

A. Nasal murmur

It has been suggested that nasal murmurs also play a relevant role in identification of different places of articulation. Thus, experimenters at Haskins Laboratories emphasized very early the polyvalent nature of formant transitions and nasal murmur characteristics in the process of place discrimination among nasal consonants (Cooper, Delattre, Liberman, Borst, & Gerstman, 1952). As Fant (1980) has proposed recently: "The base rule stating that stationary segments signal the manner and transitions the place of articulation has more exceptions than one might expect. Thus the nasal murmur of [m], [n], and [ŋ] may contain strong place cues..." (p. 14). In the light of such observations, I will argue for consistent place cues in the nasal murmur portion by comparing relevant production and perceptual data.

Table 1 presents mean and extreme frequency values for N1, N2, N3, N4 and NZ in production data from male speakers of different languages (Czech, German, English, Hungarian, Polish, Russian, and Swedish) reported by several authors. A summary of the results, included at the bottom part of Table 1, shows that formant patterns for [ɲ] and [ŋ] are very similar except in the case of N4, which is higher for [ŋ] than for [ɲ]; [m] and [n] present lower

Table 1

Pole-zero analysis frequency values (in Hz) corresponding to the nasal murmurs of [m], [n], [ɲ], [ŋ] in Czech (9), German (9), English (4), Hungarian (8), Polish (1, 5, 6, 7), Russian (2, 9) and Swedish (3) (male speakers). A summary of values is also included.

       N1        N2         N3         N4         NZ          References

[m]    300-340   850-900    1150-1500  1700-2000  --          (1) Dukiewicz (1967)
       250       800        1000       2000       550         (2) Fant (1960)
       100-350   875-1300   2000-2500  --         --          (3) Fant (1973)
       280-290   950-980    1360       1950       750-1250    (4) Fujimura (1962)
       100-400   800-1200   2150-2500  2800-3000  --          (5) Jassem (1964)
       300       750        1300       2000       800         (6) Jassem (1973)
       300       850        1300       1850       1100        (7) Kacprowski (1963)
       --        850        2000-2600  --         --          (8) Magdics (1969)
       --        600-800    1300-1900  1800-2900  --          (9) Romportl (1973)

[n]    310-430   900-1000   1500-1800  1900-2350  --          (1)
       250       800        1200       2000       1800        (2)
       100-300   1000-1750  2400-2900  --         --          (3)
       300       1000       1450       2000       1450-2700   (4)
       100-400   1700-2000  2150-2500  2800-3000  --          (5)
       330       1200       1750       2050       1450        (6)
       320       900        1500       2050       1750        (7)
       --        1000-1600  2050       --         --          (8)
       --        830        1900       2200       --          (9)

[ɲ]    310-430   1000-1200  1500-1600  2000-2300  --          (1)
       250       800        1900       2600       2200        (2)
       100-400   1100       2200-2500  2800-3000  --          (5)
       345       1225       2100       2750       2750        (6)
       340       1000       1650       2180       2400        (7)
       --        1500-2000  --         --         --          (8)
       --        900-1100   2200-2500  2850       --          (9)

[ŋ]    320-440   900-1300   1500-2200  2000-2550  --          (1)
       300       1000       2200       2900       --          (2)
       100-500   1000-1250  1625-2400  3200-4000  --          (3)
       250-400   950-1150   1700-2200  2300-3000  above 3000  (4)
       100-400   900-1100   2200-2500  2800-3000  --          (5)
       350       1100       1900       2750       above 3800  (6)
       350       1050       1800       2450       --          (7)
       --        750-1500   2300-2600  --         --          (8)
       --        800        2000       2600       --          (9)

Summary

[m]    100-400   600-1300   1000-2600  1700-3000  550-1250         (Extremes)
       285       795        1610       2090       860              (Mean)
[n]    100-430   600-2000   1200-2900  1900-3000  1450-2700        (Extremes)
       300       1115       1790       2190       1770             (Mean)
[ɲ]    100-430   800-2000   1500-2500  2000-3000  2200-2400        (Extremes)
       320       1140       1920       2450       2450             (Mean)
[ŋ]    100-500   750-1500   1500-2600  2000-4000  above 3000-3800  (Extremes)
       330       1060       2015       2650       above 3400       (Mean)

formant values, those for [m] being even lower than those for [n]. Data on N1 frequency values present a succession [ŋ]>[ɲ]>[n]>[m] presumably related to cross-category differences in size of the coupling section at the velar pharyngeal passage and pharyngonasal tract size; although the differences in those frequency values are small, it should be pointed out that they can be found in data reported by some researchers in Table 1.² Data on NZ frequency values give an arrangement [ŋ]>[ɲ]>[n]>[m] that is consistent with NZ dependence on the oral cavity size behind the tongue constriction. (Also for Japanese [m] vs. [n], see Hattori et al., 1958.) The summary also shows frequency proximity of NZ and some specific formant, depending on place category: [m] (N2), [n] (N3), [ɲ] (N4), [ŋ] (higher than N4) (Fujimura, 1963). A way to look at the distinctive role of NZ is by considering its position with respect to two adjacent formants (a pole-zero cluster) at different frequency regions according to nasal categories of different place of articulation (Fujimura, 1963). According to data in Table 1, when values reported by different investigators are accounted for, such a classification procedure can be said to hold strictly for [n] (between N3 and N4) and [ŋ] (above the general low formant structure, even above N6, according to Kacprowski & Mikiel, 1965), but not for [m], whose antiresonance can lie between N1 and N2 or N2 and N3, nor [ɲ], whose antiresonance can lie between N3 and N4 or N4 and N5. This variability seems to be partly related to differences in vowel context.
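The proximity of NZ to a specific formant for each place category can be checked against the mean values in the summary of Table 1. The sketch below is illustrative only: the dictionary layout and function name are assumptions, the place labels are plain-text stand-ins for the phonetic symbols, and the velar NZ, given only as "above 3400" in the table, is represented by that lower bound.

```python
# Mean values (Hz) from the summary of Table 1: [N1, N2, N3, N4] and NZ.
SUMMARY_MEANS_HZ = {
    "m":       {"formants": [285, 795, 1610, 2090],  "nz": 860},
    "n":       {"formants": [300, 1115, 1790, 2190], "nz": 1770},
    "palatal": {"formants": [320, 1140, 1920, 2450], "nz": 2450},
    # Assumption: "above 3400" is represented by its lower bound.
    "velar":   {"formants": [330, 1060, 2015, 2650], "nz": 3400},
}


def nz_neighbor(formants_hz, nz_hz):
    """Name the nasal formant closest to NZ, or report that NZ lies
    above the whole listed formant structure."""
    if nz_hz > max(formants_hz):
        return "above N%d" % len(formants_hz)
    nearest = min(range(len(formants_hz)),
                  key=lambda i: abs(formants_hz[i] - nz_hz))
    return "N%d" % (nearest + 1)


NZ_NEIGHBORS = {place: nz_neighbor(v["formants"], v["nz"])
                for place, v in SUMMARY_MEANS_HZ.items()}
```

On the summary means this reproduces the pairing stated in the text (N2 for the labial, N3 for the alveolar, N4 for the palatal, above N4 for the velar); it says nothing about the row-by-row variability that the text also notes.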

Information about perceptual relevance of released vs. unreleased murmurs as place cues is provided by experiments in which they were presented for identification alone or directly attached to the vocalic portion with no formant transitions or release. I refer to results obtained for open vowels [a], [æ] with labial, alveolar, palatal, and velar nasal murmurs in final position.³

In experiments with natural English speech, released murmur segments for [m] were categorized quite accurately (80%-100%) whether presented in isolation or following [æ] without formant transitions (Malecot, 1956). Henderson (Note 1) found for the overall vocalic set a higher accuracy in place identification for [m] without transition or release (about 75%) than for [n], [ŋ] (65%-75%); following [a], in the absence of formant transitions, the [m] murmur identification was always higher than 90%, independently of the presence or absence of release. Identification scores for synthetic [m] murmurs in isolation with American English subjects are reasonably consistent with results with natural speech (65%-85%) (House, 1957; Nakata, 1959). Moreover, Manrique de, Gurlekian, and Massone (1980) report in experiments with natural Argentinian Spanish speech that not only isolated [m] murmurs but also isolated [n] murmurs give a higher percentage of [m] than [n] identifications.

The syllable [an], with no transitions or release, is identified 50%-60% of the time (Henderson, Note 1); with release, the average rises to 97% in Henderson's experiment but not in Malecot's (50%). Both naturally-spoken (Malecot, 1956) and synthetic (House, 1957; Nakata, 1959) [n] murmurs presented in isolation give the same 50%-60% effect. [n] murmurs not heard as [n] tend to be heard as [m], as in Manrique de et al. (1980).

Dukiewicz (1967) has shown that [ɲ] murmurs presented in isolation to Polish speakers elicit no [ɲ] judgments. Correct recognition of [ɲ] improves 30% when murmur is presented with its corresponding onset.


Isolated [ŋ] murmurs are less well identified than those for [n] and [m] (Malecot, 1956: 12%; House, 1957: 62%; Nakata, 1959: 41%); they tend to be interpreted mainly as [m] but also as [n]. According to Malecot, released [ŋ] murmur without transitions is very poorly identified (34%) and confused mainly with [æn]. Henderson also finds [ŋ] murmur to be a poor indicator with vowel [æ] (45%); however, for [aŋ] without transitions, responses rise to 74% (unreleased murmur) and 92% (released murmur).

Frequency values chosen in synthetic speech experiments (English, Italian, Polish) with fixed and variable murmur patterns are another useful indirect source of reference in the investigation of murmur structures for nasals of different place of articulation (Table 2). While experiments with fixed patterns account exclusively for class properties of consonantal nasality (section 2.1.c.), these variable patterns capture N1 and NZ dynamics correctly to simulate different place properties but reproduce only to some degree the complex formant structure observed for different nasal consonants (see Table 1). A good approximation to NZ can be obtained by replacing it, together with the two surrounding poles, by a single formant having an appropriate bandwidth. This method may explain good perceptual results obtained by Nakata (1959) and Kacprowski and Mikiel (1965) with initial synthesis values listed in Table 2.

A comparison of the production and perception data given above allows us to infer presumable place cues for nasal consonants. Thus, the remarkable importance of released or unreleased [m] murmur in place identification can be related to the particular low N1 and NZ frequency values within an overall low murmur spectrum. Such spectral configuration has been reported to differ consistently and strikingly from that of the other members of the nasal set (Table 1; Malecot, 1956; Romportl, 1973). The perceptual importance of NZ for [m] vs. [n] has been reported by De Mori et al. (1979) and that of a strong low spectrum component about 1000-1500 Hz in the case of [m] vs. a high concentration about 2300 Hz for [n], [ŋ] by Delattre (1958).

Intraspeaker and interspeaker inconsistency for murmurs other than [m] has been noted in different languages (Malecot, 1956; Delattre, 1958; Romportl, 1973). As already stated, for [a] and [æ], while [ɲ] murmur seems to carry very little place information, [ŋ] murmur and [n] murmur provide important place information in the unreleased case and are identified quite consistently when released. The perceptual distinction between [ŋ] vs. [n] could be well cued by high vs. low N1 and absence vs. presence of NZ at the central region of the spectrum. Accordingly, a higher N1 value for [ŋ] (240 Hz) than for [m], [n] (180 Hz) was reported to help perceptual place identification (Haskins Laboratories QPR 11, 1954). NZ's being above 3000 Hz or absent in the case of [ŋ] is consistent with Ohala's (1975) observation that its perceptual effectiveness is presumably severely attenuated because of that high frequency location. High perceptual distinctiveness and, at the same time, similarity of murmur spectra (high N1 and NZ, similar N2-N3-N4 configuration) between [ŋ] and [ɲ] accord well with the strong role of [ɲ] transitions in [ɲ] identification (section 2.2.B.).

With respect to acoustic aspects other than spectral frequency, it has been found that N1, N3 bandwidths for [ŋ] are greater than those for [m], [n] and that [n] presents a very high degree of N2 damping with respect to [ ]

Table 2

Pole-zero frequency values (in Hz) corresponding to the nasal murmurs of [m], [n], [ɲ], [ŋ], assigned to synthetic speech stimuli in experiments with English (1, 2, 3), Italian (5) and Polish (4). A summary of values is also included.

       N1       N2         N3         N4    NZ          References

[m]    250      1360       2200       --    1550        (1) Hecker (1962)
       180      1000       2250       --    750         (2) House (1957)
       200      1100       2500       --    --          (3) Nakata (1959)
       200      1100       2500       --    --          (4) Kacprowski & Mikiel (1965)
       300      1100       2000       --    --          (5) Ferrero et al. (1977)

[n]    250      1300       2000       --    4000        (1)
       240      1100       2300       --    3500        (2)
       200      1700       2500       --    --          (3)
       200      1700       2800       --    --          (4)
       300      1800       2800       --    --          (5)

[ɲ]    200      2300       2800       --    --          (4)
       400      2300       3000       --    --          (5)

[ŋ]    330      1300       2200       --    --          (?)
       200      2300       2500       --    --          (?)
       260      900        2300       --    --          (?)
       An NZ value "above 5000" is listed for [ŋ]; its row and the [ŋ] reference numbers are not legible in the source.

Summary (Extremes)

[m]    180-300  1000-1360  2000-2500  --    750-1550
[n]    200-300  1100-1800  2000-2800  --    3500-4000
[ɲ]    200-400  2300       2800-3000  --    --
[ŋ]    200-330  900-2300   2200-2500  --    above 5000

(Fujimura, 1962). High N1 damping for [ŋ] has also been reported by other investigators (Fant, 1973; Kacprowski, 1965). No evidence about the perceptual significance of these differences is available. The perceptual influence of coarticulation effects due to the adjacent vowels upon place identification of a nasal consonant has been shown in experiments on automatic speech recognition (De Mori et al., 1979); such influence is to be expected from significant variations of the pole-zero structure of the nasal murmur according to the vocalic environment. Moreover, the perceptual importance of relative amplitude levels at different spectral areas should be investigated. In this respect it is highly plausible that, according to data reported by Eek (1972), energy minima at typical NZ frequency zones (section 2.2.A.) convey relevant perceptual information in distinguishing nasal categories of different place of articulation.

B. Formant Transitions

In Table 3 I present analysis frequency values for F2-F3 transition endpoints as well as positive (+), steady (=) and negative (-) transition ranges according to data on syllables with [a] and [m], [n], [ɲ], [ŋ] reported by several authors.⁴ The four languages chosen, all of which have [ɲ], are Hungarian, Italian, Polish, and Russian.

According to the summary presented at the bottom of Table 3, F2 for [ɲ] does not overlap in frequency with other categories and shows a constant (positive) transition direction. In fact, a 250 Hz separation minimum between F2 values for [n] and [ɲ] is a good indication of high F2 distinctiveness for [ɲ]. Such a difference was also found for Czech and Russian by Romportl (1973) ([n]: 1100-1300 Hz; [ɲ]: 1600-1800 Hz).
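The 250 Hz figure can be recomputed from the F2 endpoint ranges in the summary of Table 3. The following is a small worked check rather than part of the original analysis; the dictionary, the plain-text place labels and the function names are illustrative assumptions.

```python
# F2 endpoint ranges (Hz) from the summary of Table 3.
F2_ENDPOINTS_HZ = {
    "m":       (800, 1200),
    "n":       (1250, 1650),
    "palatal": (1900, 2300),
    "velar":   (900, 1400),
}


def separation_hz(range_a, range_b):
    """Gap between two frequency ranges in Hz.

    Positive: the ranges are separated by that amount.
    Zero or negative: they touch or overlap by the absolute value.
    """
    return max(range_a[0], range_b[0]) - min(range_a[1], range_b[1])


def min_separation(place):
    """Smallest gap between one category and every other category."""
    target = F2_ENDPOINTS_HZ[place]
    return min(separation_hz(target, rng)
               for other, rng in F2_ENDPOINTS_HZ.items() if other != place)


palatal_gap = min_separation("palatal")
velar_gap = min_separation("velar")
```

Here `palatal_gap` comes out at 250 Hz, the distance between the top of the alveolar range (1650 Hz) and the bottom of the palatal range (1900 Hz), while `velar_gap` is negative because the velar range overlaps the labial and alveolar ranges.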

On the other hand, F2 transition values for [ŋ] overlap significantly with those for other nasals, namely [m] and [n], being in the central part of the overall range of endpoint frequency values for the consonantal set. Variability of F2 transition values with [a] from slightly rising to steady and slightly falling has also been noted for the velar non-nasal correlates [g], [k] (Fischer-Jørgensen, 1954; Halle, Hughes, & Radley, 1957; Potter, Kopp, & Green, 1947; see, however, Dalby & Port, 1980, who found quite strongly positive F2 ranges). Flat transitions were found for English [ŋ] by Green (1959).

Predictions of acoustic theory of speech production support F2-F3 values reported in Table 3. Thus, a comparison with one of Fant's nomograms (1960, p. 84) shows that, while given frequencies for labial and alveolar nasals match well with constriction points located near the lip opening area, formant frequencies for [ŋ] correspond to a constriction place at about 9 cm from lip opening and those for [ɲ] to a constriction at about 4 cm. Negative transitions for [m] are due to a complete labial constriction. On the other hand, a more forward constriction point for [n], [ɲ] than for [ŋ] causes F2 and F3 values for [a] to increase as a result of a decrease in front cavity size. Higher frequency for palatals than alveolars is presumably related to a greater increase in the conductivity index of the internal resonator neck: palatographic evidence for [ɲ] in different languages often shows alveolar contact, as for [n], plus several degrees of prepalatal and/or dorsopalatal contact.


-------------------------------------~----------------------------------------

Table 3

Analysis frequency values (in Hz) corres~nding to F2-F3transitions in syllables with [a.] and [mJ, En]' [}-], [,]

in Hungarian (3), Italian (5, 6), Polish (1, 2) and Russian (4).A summary of values is also included.

Endpoints Ranges Direction References

F2 F3 F2 F3 F2 F3

Em] 1000- 2850- -50/ +50/ + (1 )Dukiewicz (1967)1100 3050 -250 +100

-/= (2 )Jassem ( 1962, 1964 )1050 1695 -380 -500 (3 )Magdics (1969)800 2150 -500 -50 (4 )Fant (1960)1200 2340 -200 -140 (5 )Vagges et

al- (1978)------------------------------------------------------------------------------En] 1250- 2800- +50/ -150/ + -/+ (1 )

1500 2900 +200 +200+ (2 )

1b50 1b50 +220 -540 + (3 )1400- 2250 0 -50 (4 )15001425 2665 +55 +40 + + (5 )1450- 2650 +50 -50 =/+ (6 )Ferrero et1500 al. (1 979)

------------------------------------------------------------------------------[:r- ] 1900- 2500- +500/ -450/ + +/-/= (1 )

2300 3150 +1000 +550+ + (2 )

1970 2032 +495 :f95 + (3 )2050 2850 +550/ +550 + + (4 )

+6502140 2990 +825 +475 + + (5 )2200 3200 +650 +350 + + (6 )

---------------.--------------------------------------------------------------[ ,] 900- 2900- -50/ -100/ -/+ (1 )

1400 3200 -200 +600-/+ (2 )=/- (3 )

------------------------------------------------------------------------------Summary

em] 800-1200

1695­3050

-50/-500

-500/ -(,,)+100

- (+ )

------------------------------------------------------------------------------en] 1250-

16501650­2900

0/+220

-540/ +(=)+200

---------------------------~--------------------------------------------------(:r-] 1900­

23002030­3200

+495/+1000

-450/ ++550

+(-, =)

---------------------~--------------------------------------------------------[, ] 900­

14002900­3200

-50/-200

-100/ -(=,+600 +)


------------------------------------------------------------------------------


Besides conveying manner information, F1 transition values seem to contribute to place identification. Thus, while the [ɲ] transition is extremely negative (Fant, 1960; Vagges, Ferrero, Caldognetto-Magno, & Lavagnoli, 1978), that for [ŋ] is usually only slightly negative and can even be positive (Dukiewicz, 1967). For [a] followed by [ɲ], the negativity is due to an important increase of back cavity size and a noticeable increase of vocal tract constriction (Delattre, 1951; Fant, 1960). The slightly negative F1 excursion between [a] and [ŋ] is related to a smaller increase in pharynx cavity size.

Available data on perceptual experiments with natural speech and synthetic speech give information about the relevance of formant transitions in place recognition. According to Henderson (Note 1), sequences of [a] followed by a nasal consonant, without nasal murmur and with or without final release, give 60% of [m] responses when presented with [m] transitions and 80% of [n] responses with [n] transitions but only 15%-30% of [ŋ] responses with [ŋ] transitions, in favor of a majority of [an] judgments. For [aɲ], with natural Polish speech, Dukiewicz (1967) found that transitions compensated for the negligible place information conveyed by murmurs (section 2.2.A.).

In Table 4 I present transition frequency values reported in perceptual tests with English-speaking subjects for synthetic [m], [n], [ŋ] and vowels [a], [æ]. Unlike the murmurs, there is much available perceptual data on transition cues. A comparison of values for [a] in Tables 3 and 4 shows that F2 frequency values categorized as [ŋ] in synthetic speech experiments (1920-2300 Hz) correspond exactly to analysis values for F2 of [ɲ], while analysis values for F2 of [ŋ] correspond to values categorized as [m] and [n] in experiments with synthetic speech. It also suggests that, in the absence of [ɲ] as a labeling category, the F2 difference of 250 Hz (1650-1900 Hz) between transition frequency values appropriate for [n] and [ɲ] (Table 3) was interpreted exclusively as [n] by English listeners (Table 4). On these grounds, it seems clear that stimuli with [ɲ]-like transitions were interpreted as [ŋ] by English listeners but might well have been categorized as [ɲ] by speakers of other languages, while stimuli with [ŋ]-like transitions, in the absence of an appropriate negative F3 and some [ŋ]-like murmur spectrum, were interpreted as [m] or [n]. This view is supported by observations reported in the literature. Thus experimenters at Haskins Laboratories stated that while, with no F3, "a large plus F2 transition is heard as [ɲ] rather than [ŋ], with a -3 transition positive F2 transitions are now heard clearly as [ŋ]" (Haskins Laboratories QPR 8, 1953, pp. 21-22). Direct evidence about the effect of a strongly rising F2 in cueing palatals is also provided by Derkach, Fant, and Serpa-Leitao (1970) for Russian palatalized consonants. Thus, they found this F2 transition type to be the most relevant perceptual palatalization cue with vowel [a]. Consistently with the contrast in formant transition patterns for [ɲ] and [ŋ], important improvements in identification of the non-nasal stop [g] without F3 transition (Liberman et al., 1954) were found to be dependent on the presence of an optimal -480 F3 (Harris, Hoffman, Liberman, Delattre, & Cooper, 1958). In studies on speech analysis listed in Table 3, the direction of F3 was found to be predominantly negative as well.
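The comparison of F2 ranges drawn above can be checked mechanically. The following is a minimal sketch, assuming the F2 endpoint ranges given in the summary rows of Tables 3 and 4; the dictionary keys and the `overlap` helper are illustrative names, not part of the original study.

```python
def overlap(a, b):
    """Return the shared (low, high) interval of two ranges, or None."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None


# F2 endpoint ranges in Hz, from the summary rows of Table 3 (analysis data).
analysis_f2 = {"m": (800, 1200), "n": (1250, 1650),
               "palatal": (1900, 2300), "velar": (900, 1400)}

# F2 endpoint ranges in Hz, from the summary rows of Table 4
# (values labeled by English listeners in synthesis experiments).
synthesis_f2 = {"m": (900, 1450), "n": (1300, 1890), "velar": (1920, 2300)}

for spoken, analysis_range in analysis_f2.items():
    shared = {label: overlap(analysis_range, synth_range)
              for label, synth_range in synthesis_f2.items()}
    matches = {label: rng for label, rng in shared.items() if rng}
    print(f"analysis [{spoken}] overlaps synthesis labels: {matches}")
```

On these figures the analysis range for the palatal falls entirely inside the region English listeners labeled as velar, while the analysis range for the velar overlaps only the regions labeled [m] and [n].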

In the light of previous comments, there are reasons to believe that degrees of F1 excursion should be included as variable parameters in perceptual continua of place identification for nasals. Thus, as shown on other


------------------------------------------------------------------------------

Table 4

Frequency values (in Hz) corresponding to F2-F3 transitions in syllables with [a], [æ] and [m], [n], [ŋ] obtained from synthesis experiments with English. A summary of values is also included.

Endpoints

F2 F3

em] ar.700(Locus)900-1450ar.1300

1000- 2000-1300 2300920-1300

Ranges

F2

-190/+360-200

-500

-310/+70

Direction

F3 F2

-/+

-200/+100

-/+

References

(1 )Liberman etal. (1 954) ([a.])

(2 )Naka ta(1 959)( [Gt] )

(3 )Hecker(1 962 )([~])

(4 )Garcia (1966,1967a,b)([~])

(5 )Miller & Eimas(1977)([a] )

------------------------------------------------------------------------------en] ar.1800 +360 + + (1 )

(Locus)1450- +360/ + (2 )1890 +800ar.180C +300 + (3 )

-500/ +200/ -/+ - (4 )1300- 2400- +1920 3000 +120 +8001300- +70/ + (5 )1700 +470

[ ? ] ar.3000 -360 + (1 )(Locus)1920- +830/ + (2 )2300 +12102000 +200 + (3 )1920- 2200- +120/ 07 + +7= (4 )2300 2400 +500 +200

Summary

em] 900-1450

en] 1300-1890

[?] 1920­2300


-500/+360

-500/+800

+120/+1210

-200/ - (+ )+100

+200/ + (-)+800

-360/ ++200

+

-(=, +)


grounds by Derkach et al. (1970), a strongly negative F1 transition is to be considered a relevant palatalization cue. The perceptual relevance of a higher F1 starting-ending point in identification of a non-nasal velar [g] vs. [d], [b] has been pointed out by Fujimura (1971).

Individual and overall duration of formant transitions can be shown to differ among place categories. It is significant that details about this aspect (longer transitions for [ŋ], shorter transitions for [m] and [n]) have been taken into consideration in works on speech synthesis (Hecker, 1962; Mattingly, 1968; Nakata, 1959). One must point out that such observations about relative transition duration for different categories mean little unless vocalic context is kept constant. In such circumstances, duration of the F2 transition was found to be a rather important cue for the identification of different Polish nasal-vowel sequences (Kacprowski & Mikiel, 1965).

C. Interaction of Cues

In sections 2.2.A. and 2.2.B. I have taken into consideration how formant transitions and nasal murmurs contribute separately to the identification of nasal consonants of different place of articulation and what acoustic traits make transitions and murmurs perceptually relevant in such an identification process. Results show that:

(1) either [m] transitions or [m] murmur are sufficient place identification cues for [m]. The strong perceptual role of [m] murmur in [æ], [a] environments is to be particularly emphasized (Carlson, Granstrom, & Pauli, 1972; Henderson, Note 1; Malecot, 1956);

(2) [n] transitions are more powerful place cues than [n] murmur for [n] identification;

(3) [ŋ] cues, transitions, murmur, and release, are needed for a satisfactory [ŋ] identification with vowels [æ], [ɛ] but not with [a]. In this case [ŋ] murmur is a stronger cue than [ŋ] transitions;

(4) [ɲ] transitions but not [ɲ] murmurs are sufficient place identification cues for [ɲ].

The arrangement of cues in running speech suggests that place cues for nasals ought to be explored interdependently instead of in an isolated way. Any attempt to detect them should focus primarily on the interactive role of transitions, murmurs, and releases. Results reported by experiments with natural and synthetic speech, already discussed, can be said to adduce some valuable (although indirect) information on this issue. Such experiments presented murmurs in isolation for identification (Dukiewicz, 1967; House, 1957; Malecot, 1956; Manrique de et al., 1980; Nakata, 1959) or murmurs or transitions combined with the vocalic steady state portion, with or without final release (Malecot, 1956; Henderson, Note 1). But none of these attempts succeeded in combining all cues reciprocally to detect possible cross-category effects. A more realistic approach is reported by Malecot (1956) who, in addition to the experiments mentioned above, compared the cue value of transitions and released murmurs extracted from natural speech utterances with [æ] and [m], [n], [ŋ]; stimuli were tested with American English subjects.


According to Malecot's data, [m] cues (murmur and transitions) override transitions and murmurs for [n] and [ŋ], and [n] cues (murmur and transitions) override transitions and murmur for [ŋ]. These overriding effects are significant in all cases except for [n] murmur upon [ŋ] transition (52% to 48%). They report valuable information about cross-category effects of transitions vs. murmurs and, therefore, are of particular interest for my experiment on Catalan nasals.

D. Summary

It has been shown that place cues for nasals are complex. In order to characterize them satisfactorily, experiments with natural and synthetic speech need to be appropriately designed. Some relevant suggestions about this subject, to be taken into consideration in further research and partly accounted for in my perceptual experiment on Catalan nasal consonants, are presented in the following paragraphs.

Synthetic stimuli have to recreate the temporal arrangement of cues found in natural speech. Results about the perceptual relevance of cues summarized in the preceding section are to be taken with care. Thus, except for those reported in Malecot's experiment on interaction of cues, they derive from experimental paradigms in which the arrangement of cues in the stimuli can be shown to deviate from the arrangement of cues observed in the syllabic structure. This is not only the case for isolated murmurs but also for combinations of segments: abutting the remaining portions of the signal (Malecot, 1956; Henderson, Note 1) when murmurs or transitions have been removed clearly alters the timing relationship between all VC cues and, as Henderson herself has pointed out, might cause masking phenomena when transitions and final release are presented in succession; on the other hand, preserving the timing relationship between remaining cues would leave unnatural silent portions in the stimuli.

Malecot's experiment on interaction of cues accounts for their temporal arrangement in the syllable but does not provide any evidence for place information conveyed by different transition characteristics and murmur spectral regions. A speech synthesis paradigm is needed for this purpose in which frequency values are close to analysis data of real speech. Moreover, acoustic traits of transitions and murmurs need to be varied simultaneously so that results do not refer independently to optimal transition cues and optimal murmur cues but to unitary transition-murmur cues. In fact, data reported in this paper reveal that murmurs and transitions for [m], [n], [ɲ], [ŋ] are perceptually complementary, similarly to bursts and transitions for non-nasal stops, and, consequently, are integrated analogously in the perceptual process with reference to well-stated production constraints.

In the following experiment, I try to put into practice these views on research strategies. I investigate the perceptual effects of acoustic cues in combination, using Catalan subjects, by contraposing transition and unreleased murmur patterns for final nasal consonants of different place of articulation. In contrast with the experimental paradigms reviewed, the arrangement of cues within the syllabic structure is preserved while varying transition and murmur simultaneously and systematically. Although the perceptual relevance of all individual acoustic parameters is not tested, such a dynamic approach allows a better understanding of their role within the overall transition and murmur patterns as well as the VC structure. Analysis data that have been reviewed and theoretical issues that have been raised as well as additional information obtained from Catalan speakers are taken into consideration.

3. AN EXPERIMENTAL STUDY ON CATALAN NASAL CONSONANTS

1. Phonetic and Phonological Description

In Eastern Catalan phones [m], [n], [ɲ], [ŋ] appear in absolute final position. [m], [n], [ɲ] also appear intervocalically and correspond to underlying /m/, /n/, /ɲ/. [ŋ] is found word-internally only immediately before [k] and, in final position, occasionally in free variation with [ŋk] depending on the speaker and the lexical item. These phonetic facts argue for [ŋ] being an allophone of underlying /n/ before a velar stop, generated by the following phonological derivation (see also Mascaro, 1978):

    "Underlying form"           /sang/        /bank/
    Regressive Assimilation       ŋ             ŋ
    Optional Deletion            (∅)           (∅)
    Devoicing                    (k)
    "Surface form"             [saŋ(k)]      [baŋ(k)]
                               'blood'       'bench'
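The derivation can be written out as an ordered rule cascade. The following is a minimal sketch, assuming one character per segment and the rule order shown in the derivation; the function `derive` and its `delete_final_stop` flag (standing for the optional rule) are illustrative and are not taken from Mascaro (1978).

```python
VELAR_STOPS = {"g", "k"}


def derive(underlying: str, delete_final_stop: bool) -> str:
    """Apply the three rules in order to a string of one-character segments."""
    segments = list(underlying)

    # Regressive assimilation: /n/ becomes velar before a velar stop.
    for i in range(len(segments) - 1):
        if segments[i] == "n" and segments[i + 1] in VELAR_STOPS:
            segments[i] = "ŋ"

    # Optional deletion: drop a word-final velar stop after the velar nasal.
    if (delete_final_stop and len(segments) >= 2
            and segments[-1] in VELAR_STOPS and segments[-2] == "ŋ"):
        segments.pop()

    # Devoicing: a surviving word-final /g/ surfaces as [k].
    if segments and segments[-1] == "g":
        segments[-1] = "k"

    return "".join(segments)


for form in ("sang", "bank"):
    print(f"/{form}/ -> [{derive(form, True)}] or [{derive(form, False)}]")
```

Both surface variants of each word fall out of the same underlying form, which is the point of treating [ŋ] as an allophone of /n/.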

The presence of underlying /g/ and /k/ accords well with the phonetic realization of derived formations such as [səŋgi'nari] 'bloodthirsty', [səŋgu'nos] 'bloody', [bəŋ'kɛtə] 'low bench'. Other minimal pairs with contrasting nasal stops in final position can be found. Thus [fam] 'hunger', [fan] '(they) do', [faŋ] 'mud'; [bam] 'Aux. we go', [ban] 'edict', [baɲ] 'bath', [baŋ] 'bench'.

Final [m], [n], [ɲ], [ŋ] are weakly released or unreleased according to individual speakers. Given the occurrence of the full set of nasals of different place of articulation and the unreleased murmur condition, it is possible, then, in Eastern Catalan, to investigate the role that transitions and unreleased murmurs play in the process of identification of different final place categories.

2. Production Study

Analysis values corresponding to production data from a single male Catalan speaker were chosen as reference points for the patterns to be used in synthetic speech experiments. Several samples of monosyllabic minimal pairs were analyzed by means of a Voice Identification sound spectrograph, a digital spectrographic analyzer and a linear prediction model analysis. To see to what extent acoustic patterns found in production data on nasals from this reference speaker were in accordance with those of other Catalan informants, data on the same minimal pairs embedded in a neutral sentence were also collected from 12 other male Catalan speakers and analyzed spectrographically. Frequency values for the reference speaker as well as range of frequency variation and mean frequency values for the other 12 speakers are presented in Tables 5 and 6. Both sets of data are compared below and discussed in the light of theoretical predictions and data from the literature.

Nasal murmur values for the reference Catalan speaker (Table 5) are consistent with those given in Table 1. The structural configuration of poles and zeros in Table 1 is violated by N3 and N4 for [ɲ], which, as for the other Catalan speakers, happen to be higher than those for [ŋ].⁵ The continuity between F3 transition and N4 for [ɲ] suggests that palatal N4 is mouth cavity dependent. Large N1 bandwidth values for [ŋ] are consistent with data obtained by other investigators (section 2.2.A.).⁶ As shown in the same table, the other Catalan speakers differ from the reference speaker in that they fail to show a contrast between [n] and [ŋ] with respect to N3 and N4. This fact suggests that, if murmur spectra are found to convey relevant information in discriminating [n] from [ŋ], perceptual distinctiveness is to be assigned to contrasting values for N1 and NZ. It would be interesting to test the index of perceptual confusability between [n] and [ŋ] in Catalan using real speech stimuli with tokens of unreleased [ŋ].

For Catalan speakers (the reference speaker and 12 others) the general direction corresponding to F2-F3 transition mean values (Table 6) is consistent with that reported in the current literature on synthetic speech experiments for [m], [n], [ŋ] in English (Table 4). Also, cross-category F1 values can be predicted on the basis of the acoustic theory of speech production. F2, not only for [ɲ] but also for [n], is consistently positive (see Table 3 for comparison), even when extremes of the observed range of values are taken into consideration. While F2 values for [m] and [ɲ] fall well apart from those for [n] and [ŋ], respectively, values for [ŋ] overlap significantly with those for [n] and even [m]. This is consistent with the fact that an appropriate F3 is needed to synthesize a satisfactory, unambiguous velar nasal consonant.

The nasal murmur was at least 1.5 times longer than the preceding vocalic segment for the reference Catalan speaker. Transition durations for [ɲ] were consistently longer (70 msec average) than those for [n] (50 msec average) and [ŋ] (35 msec). For this speaker, as well as for many other Catalan speakers, positive F2-F3 excursions were still observable during the nasal murmur period as an effect of the dynamic motion exhibited by the large mass of tongue body towards the dorsopalatal region. The perceptual relevance of the timing relationship between the gliding movement and the nasal closure onset has not been investigated in my perception experiment: nasal formants were kept steady instead, as found in the productions of some Catalan speakers. Murmur release and final voiceless stop after [ŋ] were present or absent as predicted in section 3.1.

3. Perceptual Study

A. Procedure

To explore the perceptual role of transitions and unreleased nasal murmurs in place recognition as well as to detect identification cues for [ɲ], continua with [an], [aɲ], [aŋ] were synthesized in two parallel blocks of two slightly different tests each (1a, 1b; 2a, 2b) according to analysis values


Table 5

Analysis frequency values (ranges and means) (in Hz) for murmurs in VC syllables [am], [an], [aɲ], [aŋ] according to data from a single Catalan speaker and 12 other Catalan subjects.

        N1        N2         N3          N4          NZ          Subjects

[m]     200       1120       1360        2100                    (1) Single subject
        170-320   910-1105   1120-1510   1370-1800               (2) 12 subjects (range values)
        255       1015       1300        1565                    (3) 12 subjects (mean values)
------------------------------------------------------------------------------
[n]     200-300   800-900    1460-1650   1950-2100   1780        (1)
        BW: 180   BW: 330    BW: 235
        225-325   880-1135   1440-1640   1775-2600               (2)
        285       1035       1515        2130                    (3)
------------------------------------------------------------------------------
[ɲ]     180-230   900-1150   2000-2200   2900-3350   2360-3000   (1)
        BW: 150   BW: 220    BW: 140     BW: 140
        200-340   800-1180   1365-2335   1740-3000               (2)
        295       1055       1760        2265                    (3)
------------------------------------------------------------------------------
[ŋ]     300-400   1150-1240  1860-2200   2430-2650   2900-3400   (1)
        BW: 275   BW: 200    BW: 300     BW: 250
        225-360   900-1240   1375-1640   1960-2730               (2)
        295       1060       1530        2160                    (3)
------------------------------------------------------------------------------


------------------------------------------------------------------------------

Table 6

Analysis frequency values (ranges and means) (in Hz) for transitions in VC syllables [am], [an], [aɲ], [aŋ] according to data from a single Catalan speaker and 12 other Catalan subjects.

        V steady-state                    Endpoints
        F1        F2         F3          F1        F2         F3          Subjects

[m]     920       1435       2420        800       1150       2420        (1) Single subject
        630-825   1110-1580  1940-2250   525-765   980-1560   1880-2250   (2) 12 subjects (range values)
        725       1290       2130        655       1215       2075        (3) 12 subjects (mean values)
------------------------------------------------------------------------------
[n]     920       1535       2520        775       1540-1630  2600        (1)
        570-805   1200-1510  1955-2610   525-760   1350-1660  1955-2610   (2)
        715       1325       2300        685       1560       2365        (3)
------------------------------------------------------------------------------
[ɲ]     920       1535       2520        450-550   1840-1980  2750-2850   (1)
        685-875   1200-1570  1975-2700   405-805   1625-2050  2025-2925   (2)
        755       1400       2315        635       1860       2440        (3)
------------------------------------------------------------------------------
[ŋ]     920       1535       2520        900       1400-1640  2125-2500   (1)
        685-900   1285-1680  1995-2700   645-900   1310-2025  1860-2475   (2)
        770       1440       2255        740       1550       2140        (3)

------------------------------------------------------------------------------
        Ranges                              Direction
        F1          F2          F3          F1        F2        F3

[m]     -120        -285        0           -         -         =         (1)
        -155/-25    -270/+30    -110/0      -         -(=,+)    -(=)      (2)
        -70         -75         -55         -         -         -         (3)
------------------------------------------------------------------------------
[n]     -145        -5/+95      +80         -         +(=)      +         (1)
        -90/0       +115/+300   -190/+285   -(=)      +         +(-,=)    (2)
        -30         +235        +65         -         +         +         (3)
------------------------------------------------------------------------------
[ɲ]     -470/-370   +305/+445   +230/+330   -         +         +         (1)
        -425/+50    +200/+630   -285/+435   -(+,=)    +         +(-,=)    (2)
        -120        +460        +125        -         +         +         (3)
------------------------------------------------------------------------------
[ŋ]     -20         -135/+105   -395/-20    -         -/+       -/=       (1)
        -100/+10    -75/+450    -315/+90    -(=)      +(=)      -(+,=)    (2)
        -30         +110        -65         -         +         -         (3)
------------------------------------------------------------------------------


for the Catalan reference speaker displayed in Tables 5 and 6. The synthetic stimuli were prepared using a software serial formant synthesizer (SYNTH) at Haskins Laboratories with variable BW parameters, an extra pole (used as N2) and an extra zero (used as NZ) (Mattingly, Pollock, Levas, Scully, & Levitt, 1981).

In tests 1a and 1b a series of variable F2-F3 transition endpoints was combined with three fixed murmur patterns believed to be optimal for [n], [ɲ], [ŋ]; in tests 2a and 2b a series of variable murmur values was combined with three fixed, optimal transition patterns. In contrast with previous experimental studies, these two conditions allow us to determine identification frequency ranges across place categories for transitions and murmurs as well as to investigate more adequately the interaction of the two acoustic cues within a syllable structure recreated from that of natural speech utterances. Actual values are given in Figures 1 and 2. Poles of the murmur pattern are represented by single lines and zeros by double lines. Test 1a differs from 1b, and 2a from 2b, in vowel steady state and transition endpoint values. Two versions of the same experimental design were given simply to test the perceptual effect of a larger variety of F2-F3 transition values and, therefore, to be able to determine identification cross-over points with higher precision. I give some details about the variable transition endpoints (Figure 1) and variable murmur structures (Figure 2), which will be taken into consideration in the discussion section on perceptual data obtained from Catalan informants:

a. F2 transition endpoints:

Test 1a: From 1430 Hz (-70) to 1920 Hz (+420) in 7 steps of 70 Hz;
Test 1b: From 1600 Hz (0) to 2000 Hz (+400) in 8 steps of 50 Hz.

b. F3 transition endpoints:

Test 1a: From 2340 Hz (-160) to 2900 Hz (+400) in 7 steps of 80 Hz;
Test 1b: From 2300 Hz (-300) to 3100 Hz (+500) in 8 steps of 100 Hz.

c. Nasal murmur structures:

N1: From 250 Hz (6 steps) to 350 Hz in 5 steps of 20 Hz;
N2: From 900 Hz to 1200 Hz in 11 steps of 30 Hz;
N3: From 1600 Hz to 2100 Hz (6 steps) in 5 steps of 100 Hz;
N4: From 2500 Hz to 3000 Hz and vice versa in 5 steps of 100 Hz;
NZ: From 1800 Hz to 3200 Hz in 11 steps of 140 Hz.
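The continua listed above can be regenerated from their start value and step size. The sketch below rests on one interpretive assumption: for the transition endpoints, "7 steps" and "8 steps" are read as numbers of intervals (giving 8 and 9 stimulus values), whereas for N2 and NZ "11 steps" is read as 11 stimulus values (10 intervals), since only these readings reproduce the stated end frequencies. The helper `continuum` is an illustrative name.

```python
def continuum(start_hz: int, step_hz: int, n_intervals: int) -> list[int]:
    """Equally spaced values: start, start + step, ..., start + n * step."""
    return [start_hz + i * step_hz for i in range(n_intervals + 1)]


# Transition endpoints (steps read as intervals).
f2_test_1a = continuum(1430, 70, 7)
f3_test_1a = continuum(2340, 80, 7)
f2_test_1b = continuum(1600, 50, 8)
f3_test_1b = continuum(2300, 100, 8)

# Murmur pole N2 and zero NZ (11 values, hence 10 intervals; an assumption).
n2_values = continuum(900, 30, 10)
nz_values = continuum(1800, 140, 10)

for name, values in [("F2 1a", f2_test_1a), ("F3 1a", f3_test_1a),
                     ("F2 1b", f2_test_1b), ("F3 1b", f3_test_1b),
                     ("N2", n2_values), ("NZ", nz_values)]:
    print(f"{name}: {len(values)} values, {values[0]}-{values[-1]} Hz")
```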

Formant bandwidth values were also varied across stimuli according to frequencies included in Figures 1 and 2. A constant value of 900 Hz for F1 was chosen. Preceding [a] was nasalized from vowel onset to transition endpoint by means of a progressive frequency rise of a single low pole (500 to 600 Hz) - zero (500 to 700 Hz) pair. Each stimulus was 560 msec long, having a vowel steady state of 200 msec, a transition of 60 msec and a murmur portion of 300 msec. There was a progressive 10 dB decrease from vowel to murmur and an F0 lowering slope between vowel onset (120 Hz) and murmur offset (80 Hz).
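The temporal layout of a stimulus can be laid out as a short sketch. The segment durations and the F0 values at onset and offset are those stated above; the straight-line shape of the F0 fall is an assumption made here for illustration, since the text specifies only a lowering slope between the two values.

```python
# Segment durations in msec, as stated for each stimulus.
SEGMENTS_MS = [("vowel steady state", 200), ("transition", 60), ("murmur", 300)]
F0_ONSET_HZ, F0_OFFSET_HZ = 120.0, 80.0

total_ms = sum(duration for _, duration in SEGMENTS_MS)


def f0_at(t_ms: float) -> float:
    """F0 at time t_ms, assuming a linear fall over the whole stimulus."""
    return F0_ONSET_HZ + (F0_OFFSET_HZ - F0_ONSET_HZ) * t_ms / total_ms


onset_ms = 0
for name, duration in SEGMENTS_MS:
    print(f"{name}: {onset_ms}-{onset_ms + duration} msec, "
          f"F0 at segment onset {f0_at(onset_ms):.1f} Hz")
    onset_ms += duration

print(f"total duration: {total_ms} msec")
```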

[Figure 1: frequency (in Hz) against time, showing the variable F2 and F3 transitions of tests 1a and 1b and the three fixed murmur patterns.]

Figure 1. Synthesis patterns for tests 1a and 1b (fixed murmur conditions) with inclusion of bandwidth values (in Hz) for murmur formants.

[Figure 2: frequency (in Hz) against time, showing the three fixed F2 and F3 transition patterns of tests 2a and 2b and the variable murmur patterns.]

Figure 2. Synthesis patterns for tests 2a and 2b (fixed transition condition) with inclusion of bandwidth values (in Hz) for murmur formants.


Every test was administered using several stimuli per step and the overall set of stimuli randomized before presentation for identification. Overall, test 1a was composed of 144 tokens (6 per step), test 1b of 162 (6 per step), test 2a of 165 (5 per step) and test 2b of 176 (4 per step). Intervals of 4 sec were included between successive stimuli and longer 10 sec intervals after every ten stimuli. Twenty-four paid Catalan subjects took each of the four tests twice and were asked to identify orthographically the final nasal stop as [n], [ɲ] or [ŋ]. They were all students who had had no previous experience with synthetic speech and did not know the purpose of the experiment. Thirteen took the tests binaurally through headphones; the rest listened to stimuli reproduced through a loudspeaker because of problems connected with taking the tests in the field. To find out whether such different listening conditions could have had some strong effect on subjects' responses, I listened to stimuli under both conditions and obtained almost identical cross-over points and response distributions.
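The token counts can be turned into the number of distinct stimuli per test and the number of responses each subject gave. This is a minimal bookkeeping sketch using only the figures stated above; the variable names are illustrative.

```python
# (tokens in the test, repetitions of each stimulus), as stated in the text.
TESTS = {"1a": (144, 6), "1b": (162, 6), "2a": (165, 5), "2b": (176, 4)}
SESSIONS_PER_TEST = 2   # each subject took each test twice
N_SUBJECTS = 24

distinct_stimuli = {}
for name, (tokens, repetitions) in TESTS.items():
    # Every token count divides evenly by its repetition count.
    assert tokens % repetitions == 0
    distinct_stimuli[name] = tokens // repetitions

responses_per_subject = SESSIONS_PER_TEST * sum(t for t, _ in TESTS.values())
responses_total = responses_per_subject * N_SUBJECTS

print(distinct_stimuli)
print(responses_per_subject, responses_total)
```

The counts for tests 1a and 1b (24 and 27 distinct stimuli) correspond to 8 and 9 transition endpoints crossed with three fixed murmurs, and the count for test 2a (33) to 11 murmur values crossed with three fixed transitions.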

B. Results

Table 7 shows category judgments for each test in all variable conditions; data from all subjects and from the most consistent 14 labelers are reproduced separately. Results for each variable condition have been displayed in Figures 3, 4, 5, and 6. Figures 3 and 4 give perceptual data obtained from tests 1a and 1b, and Figures 5 and 6 give data from tests 2a and 2b. Figures 3 and 5 give data from all 24 Catalan subjects and Figures 4 and 6 data from the most consistent labelers (14 selected subjects). Each subplot in Figures 3 and 4 represents judgments for a particular murmur; stimulus numbers for different F2-F3 transition values lie on the abscissa. Each subplot in Figures 5 and 6 represents judgments for a particular F2-F3 transition set; stimulus numbers for different murmur values lie on the abscissa. Among all subjects those who identified the stimuli with most consistency were chosen as "best" labelers. A comparison of identification curves obtained from these 14 selected informants with those obtained from all 24 shows that, as expected, the best labelers categorized stimuli more distinctively. Thus, the following summary on perceptual data about [n], [ɲ], [ŋ] identification will refer mainly to responses obtained from the most consistent Catalan subjects.

a. Table 7. - A comparison of percentages of category identification between murmurs and transitions shows that the labeling of appropriate transitions is always more consistent than that of appropriate murmurs. The effect of transitions vs. murmurs in both sets of tests (1a-2a, 1b-2b) is much higher for [ɲ] (2.5-2.8 ratio) than for [n], [ŋ], and for [n] (1.4-1.7 ratio) than for [ŋ] (1.1-1.6 ratio).

Tests 1a and 1b show that an optimal [ɲ] murmur does not favor the identification of one or another of the place categories. An optimal [n] murmur slightly favors identification of [n], [ɲ] vs. [ŋ] (see test 1b). An optimal [ŋ] murmur favors significantly [ŋ] vs. [n] responses (1.5-1.9 ratio).

Tests 2a and 2b show that the presence of optimal [ŋ] transitions correlates significantly with [ŋ] vs. [n] responses (1.1-2.1 ratio) while that of optimal [n] transitions correlates significantly with [n] vs. [ŋ] responses (1.1-2.1 ratio), independently of [ɲ] identification. Appropriate [ɲ] transitions contribute exclusively to [ɲ] identification in a range of 90.9%-95.7%.
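The transition vs. murmur ratios quoted above can be recomputed from the percentages in Table 7. The sketch assumes that each ratio divides the percentage of matching responses under the optimal transition (tests 2a, 2b) by the percentage of matching responses under the optimal murmur (tests 1a, 1b), pairing 1a with 2a and 1b with 2b within each subject group; this pairing is an interpretation of the text, and the dictionary layout is illustrative.

```python
# Percent responses matching the cued category (Table 7), in the order:
# tests 1a/2a all subjects, 1a/2a best labelers,
# tests 1b/2b all subjects, 1b/2b best labelers.
murmur_match = {
    "velar":   [40.8, 42.7, 44.1, 45.1],
    "n":       [32.9, 34.5, 37.1, 40.4],
    "palatal": [35.8, 33.3, 36.0, 33.0],
}
transition_match = {
    "velar":   [63.8, 68.4, 53.0, 54.0],
    "n":       [51.2, 60.4, 54.8, 65.7],
    "palatal": [95.7, 95.1, 93.1, 90.9],
}

ratios = {
    category: [t / m for t, m in zip(transition_match[category],
                                     murmur_match[category])]
    for category in murmur_match
}

for category, values in ratios.items():
    print(f"{category}: {min(values):.2f}-{max(values):.2f}")
```

Under this pairing the recomputed values agree with the quoted ranges once truncated to one decimal place, with the palatal ratios clearly separated from those of the other two categories.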


Table 7

Category judgments for each test (percentage). Data from all 24 subjects and the best 14 labelers are displayed separately.

                        [ŋ]     [n]     [ɲ]     Subjects

Test 1a
  [ŋ] murmur            40.8    26.6    32.6    (1) All 24 subjects
                        42.7    24.9    32.2    (2) Best 14 labelers
  [n] murmur            30.3    32.9    36.8    (1)
                        28.4    34.5    37      (2)
  [ɲ] murmur            33      31.2    35.8    (1)
                        32.3    34.4    33.3    (2)

Test 1b
  [ŋ] murmur            44.1    23.6    32.3    (1)
                        45.1    22.6    32.2    (2)
  [n] murmur            26.7    37.1    36.1    (1)
                        23.3    40.4    36.3    (2)
  [ɲ] murmur            32.5    31.5    36      (1)
                        32.3    34.7    33      (2)

Test 2a
  [ŋ] transitions       63.8    35.3    0.4     (1)
                        68.4    31.1    0.5     (2)
  [n] transitions       46      51.2    2.8     (1)
                        39      60.4    0.4     (2)
  [ɲ] transitions       1.5     2.8     95.7    (1)
                        0.6     4.3     95.1    (2)

Test 2b
  [ŋ] transitions       53      45.4    1.6     (1)
                        54      45.9    0.2     (2)
  [n] transitions       37.9    54.8    7.1     (1)
                        30.4    65.7    3.9     (2)
  [ɲ] transitions       2.8     4.1     93.1    (1)
                        2.6     6.5     90.9    (2)

211

Page 24: PERCEPTION OF NASAL CONSONANTS WITH SPECIAL REFERENCE TO … · 2017-07-06 · account for the perception of nasals after [aJ as well as other vocalic nuclei. Data for nasal consonants

Figure 3. Perceptual results for tests 1a (above) and 1b (below) (all 24 Catalan subjects). Panels: [ŋ] murmur, [n] murmur, [ɲ] murmur. Ordinate: percentage of responses with indicated label; abscissa: stimuli.


Figure 4. Perceptual results for tests 1a (above) and 1b (below) (14 best Catalan labelers). Panels: [ŋ] murmur, [n] murmur, [ɲ] murmur. Ordinate: percentage of responses with indicated label; abscissa: stimuli.


Figure 5. Perceptual results for tests 2a (above) and 2b (below) (all 24 Catalan subjects). Panels: [ŋ] transitions, [n] transitions, [ɲ] transitions. Ordinate: percentage of responses with indicated label; abscissa: stimuli.


Figure 6. Perceptual results for tests 2a (above) and 2b (below) (14 best Catalan labelers). Panels: [ŋ] transitions, [n] transitions, [ɲ] transitions. Ordinate: percentage of responses with indicated label; abscissa: stimuli.


b. Tests 1a, 1b (Figures 3 and 4). Identification peaks show that category judgments can be predicted on the basis of appropriate transitions but not of appropriate murmurs, especially for the [n] and [ɲ] murmur conditions. I characterize below an optimal set of F2-F3 transition directions and range values (in Hz) for each place category, according to perceptual data reported by all Catalan speakers, with special reference to the most consistent labelers:

[ŋ]   F2: slightly negative to slightly positive   (-70 to +80)
      F3: strongly negative to steady              (-300 to 0)

[n]   F2: positive                                 (+140 to +250)
      F3: steady to positive                       (0 to +200)

[ɲ]   F2: strongly positive                        (+280 to +420)
      F3: strongly positive                        (+200 to +500)
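The optimal ranges listed above can be read as a simple lookup: a stimulus whose F2 and F3 transition ranges both fall inside one category's intervals is a good exemplar of that category. The sketch below is only an illustration of that reading; the ASCII category names ("ng", "n", "ny") and the function name are assumptions introduced here, and the hard interval boundaries are a simplification of what are graded identification functions.

```python
# Optimal F2/F3 transition ranges in Hz, relative to the vowel steady state,
# copied from the list above. "ng" = velar, "n" = alveolar, "ny" = palatal
# (ASCII stand-ins chosen for this sketch).
OPTIMAL_RANGES = {
    "ng": {"F2": (-70, 80), "F3": (-300, 0)},
    "n": {"F2": (140, 250), "F3": (0, 200)},
    "ny": {"F2": (280, 420), "F3": (200, 500)},
}


def matching_categories(f2_range_hz, f3_range_hz):
    """Return the categories whose optimal F2 and F3 intervals both
    contain the given transition ranges (bounds are inclusive)."""
    matches = []
    for category, ranges in OPTIMAL_RANGES.items():
        f2_low, f2_high = ranges["F2"]
        f3_low, f3_high = ranges["F3"]
        if f2_low <= f2_range_hz <= f2_high and f3_low <= f3_range_hz <= f3_high:
            matches.append(category)
    return matches


print(matching_categories(0, -150))    # inside the velar intervals
print(matching_categories(200, 100))   # inside the alveolar intervals
print(matching_categories(350, 400))   # inside the palatal intervals
print(matching_categories(110, 50))    # between velar and alveolar F2 ranges
```

Because the three F2 intervals do not overlap, at most one category can match; stimuli that match none (such as an F2 range of +110 Hz) lie in the region between categories.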

Murmurs appropriate for different categories have no effect upon optimal transition values for [ŋ] and [ɲ]. While [ɲ] murmur has no effect upon optimal [n] transitions, a significantly higher average of [ŋ] than [n] judgments is obtained for typical [n] formant values (F2: up to +260; F3: up to +230) followed by [ŋ] murmur.

c. Tests 2a, 2b (Figures 5 and 6). Identification peaks show that category judgments can be predicted to a large extent on the basis of appropriate transitions. This is clearly the case for the [ɲ] transition condition: while no optimal [ɲ] murmur can be found along the different murmur continua, optimal [ɲ] transitions completely override [n] and [ŋ] murmurs. Therefore, an optimal set of murmur values (in Hz) for [ŋ] and [n] is to be found exclusively in the [ŋ] and [n] transition conditions:

         [ŋ]       [n]

N1:      350       250
N2:     1200       960
N3:     2100      1800
N4:     2500      2700
NZ:     3200      2080
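As a quick consistency check on these values, the two murmur patterns can be compared on the properties the discussion section singles out: N1 height and the position of the nasal zero (NZ) relative to N3 and N4. The sketch assumes that the first column of values is the velar murmur and the second the alveolar murmur, which is how the later description of the optimal murmurs reads; the dictionary keys and the helper name are illustrative.

```python
# Optimal murmur values in Hz. Assumption: first column of the list above
# is the velar murmur ("ng"), second column the alveolar murmur ("n").
OPTIMAL_MURMURS = {
    "ng": {"N1": 350, "N2": 1200, "N3": 2100, "N4": 2500, "NZ": 3200},
    "n": {"N1": 250, "N2": 960, "N3": 1800, "N4": 2700, "NZ": 2080},
}


def zero_between_n3_and_n4(murmur):
    """True if the nasal zero lies strictly between N3 and N4."""
    return murmur["N3"] < murmur["NZ"] < murmur["N4"]


for category, murmur in OPTIMAL_MURMURS.items():
    print(category,
          "N1 =", murmur["N1"],
          "| NZ between N3 and N4:", zero_between_n3_and_n4(murmur))
```

Under that column assignment the alveolar murmur has the lower N1 and a zero between N3 and N4, while the velar murmur has the higher N1 and a zero above N4, that is, outside the central part of the spectrum.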

The perceptual effect of these optimal murmur values is obvious: for [ŋ] and [n] transitions the percentage of [ŋ] responses increases as the optimal [ŋ] murmur (stimulus 11) is approached, and that of [n] responses also rises towards the optimal [n] murmur (stimulus 3). Optimal murmurs for [ŋ] and [n] are shown to be dependent upon [n] and [ŋ] transitions, respectively: thus, the percentage of [ŋ] responses for the optimal [ŋ] murmur is significantly lower with [n] than with [ŋ] transitions, and that of [n] responses for the optimal [n] murmur is significantly lower with [ŋ] than with [n] transitions. Moreover, it must be noticed that, while [n] transitions never override optimal [ŋ] murmurs, [ŋ] transitions are shown to override optimal [n] murmurs in test 2a (Figure 6).

C. Summary and Discussion

Perceptual data from Catalan subjects indicate that, overall, for the vowel [a], transitions provide more effective cues than murmurs for nasal consonants of different place of articulation. This is consistent with results obtained in previous experiments with synthetic speech tested with American English speakers. In agreement with Henderson's (Note 1) results, it has been shown, however, that the contribution of murmurs to place identification is much higher than that suggested by data reported in most of those experiments, and category-dependent as well. In fact, my results confirm that a better characterization of place cues for nasals depends strongly on a considerable improvement in the construction of the experimental paradigm selected for perceptual testing. I next summarize the perceptual results on Catalan and evaluate them in the light of the material reviewed in Section 2.2.

While, as stated, transitions are more powerful cues than murmurs in the identification of nasals, important cross-category differences can be established. The transition effect is highly relevant for [ɲ], and more relevant for [n] than for [ŋ]. Reciprocally, [ŋ] murmurs contribute more to place identification than other murmurs. This contrast in the perceptual relevance of cues is consistent with the tendency of [ŋ] murmur to prevail over [n] transitions, and not vice versa, in what may be called an inter-category trading relation, as characterized below in this discussion section. On the other hand, while [ɲ] transitions override murmurs appropriate for other nasal categories, optimal [ɲ] murmurs have been shown to convey no place information. This negative effect may have been maximized by not having taken into consideration, in the synthetic reproduction of murmurs, the characteristic [j] glide component during the murmur. Spectrographic analysis reveals that, while very little movement can be detected for nasal formants during the closure period in the case of [n] and [ŋ], those for [ɲ] show a continuation of the positive excursion of the F2-F3 transitions. Since formant transitions result from articulatory dynamics, we have, apparently, a continuation of tongue movement during lingual closure and complete oronasal coupling. All these findings about the relevance of transitions and murmurs in the identification of different place categories are consistent with the summary of interactive perceptual effects included in section 2.2.C., based upon production and perception data from other languages.

Optimal cues obtained for [ŋ] indicate that transition direction for F2 is not perceptually relevant as long as it remains close to the vowel steady-state frequency; however, the F3 transition must be negative for a satisfactory [ŋ] identification. Optimal [ŋ] murmur, on the other hand, is characterized by a high N1 and the absence of NZ in the central part of the spectrum. Identification of [n] is mainly dependent upon a consistently positive F2, with a steady or positive F3. Optimal alveolar murmurs have a low N1 and an NZ between N3 and N4. These results for [ŋ] and [n] conform well with the indications about perceptually relevant acoustic cues in sections 2.2.A. and 2.2.B., according to data from other languages. [ɲ] has been found to be exclusively dependent on strongly positive F2-F3 transitions, in agreement with reported perceptual data from Polish and with cues for Russian palatalized non-nasal consonants. This powerful transition effect also confirms suggestions made in section 2.2.B. about the possibility of a [ɲ] identification for a strongly positive F2 transition in the absence of appropriate [ŋ] cues by speakers of languages with [ɲ]. Moreover, the fact that no perceptual effect follows from the contrast in N3-N4 values between [ɲ] murmur and [n], [ŋ] murmurs (Figure 1) is consistent with the irrelevance of high formants in the murmur portion for place identification and, therefore, with the perceptual significance of N1 and NZ values. In this summary of cues one must point out that the inclusion in the experimental paradigm of different F1 transitions for contrasting nasal categories might have added some relevant information about place identification.

Data in Figure 7 show to what extent the perceptual results are consistent with production measurements gathered from Catalan speakers. Crosses along both diagonal lines point to values for F2-F3 transition ranges corresponding to the synthetic stimuli of tests 1a and 1b. Phonetic symbols recorded on these lines indicate prevailing interstimuli category judgments for transition continua in all the different murmur conditions. Dots stand for F2×F3 transition range values corresponding to productions of [m], [n], [ɲ], [ŋ] by single Catalan speakers summarized in Table 6; they are grouped into production spaces for each nasal category. Transition range values for the reference Catalan speaker chosen to prepare the synthetic speech stimuli are represented by encircled dots. Such a graph has been found to accord better with the perceptual processing of nasals than one that relates stimulus points to production data on transition endpoint values: a comparison of results from tests 1a and 1b has shown that, in categorizing stimuli, listeners were in fact attending to F2-F3 transition ranges relative to the vowel steady-state value and not to absolute transition endpoints. A satisfactory coincidence is found between perceptual category judgments and category production spaces for different nasals, thus confirming that transitions are good place identification cues. While this is clearly the case for the perceptual contrast between [n] and [ɲ],⁷ it can be seen that murmur structures ([ŋ] murmur for [ŋ] identification; [n], [ɲ] murmurs for [n] identification) are used by Catalan speakers as identification cues for F2-F3 range values that lie somewhere between or on the edges of the [ŋ] and [n] production spaces. This finding argues for the existence of a trading relation between acoustic traits spread over time in the process of [ŋ] vs. [n] identification, and shows that acoustic cues are integrated into a unitary phonetic percept in the process of dynamic perception. Thus, perceptual complementarity between transitions and murmurs accords with the fact that, given an ambiguous set of F2-F3 formant transitions between the [ŋ] and [n] production spaces, listeners report /ŋ/ or /n/ judgments depending on whether the following murmur structure is appropriate for [ŋ] or [n], respectively. Moreover, consistent with reported observations, [ŋ] murmur appears to have greater perceptual weight than [n] murmur, since the perceptual range for [ŋ] affects the [n] production space more considerably than that for [n] affects the [ŋ] production space.
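The trading relation described in this section can be caricatured as a small decision rule. The sketch below is a deliberately deterministic toy, whereas the reported effects are graded response percentages; the function name, the ASCII category labels ("ng", "n", "ny") and the "ambiguous" label for transitions lying between the velar and alveolar production spaces are all assumptions made for illustration, not part of the experimental design.

```python
def toy_place_judgment(transitions, murmur):
    """Majority response suggested by the summary, as a hard rule.

    transitions: "ng", "n", "ny", or "ambiguous" (between the velar and
                 alveolar production spaces).
    murmur:      "ng", "n", or "ny".
    """
    # Palatal transitions override any following murmur.
    if transitions == "ny":
        return "ny"
    # A velar murmur prevails over alveolar or ambiguous transitions.
    if murmur == "ng":
        return "ng"
    # Velar transitions hold against alveolar or palatal murmurs
    # (reported for test 2a; test 2b is closer to an even split).
    if transitions == "ng":
        return "ng"
    # Alveolar or ambiguous transitions with alveolar or palatal murmur.
    return "n"


for transitions in ("ng", "n", "ny", "ambiguous"):
    for murmur in ("ng", "n", "ny"):
        print(transitions, "+", murmur, "murmur ->",
              toy_place_judgment(transitions, murmur))
```

The asymmetry of the rule (a velar murmur can pull alveolar transitions towards a velar response, but an alveolar murmur does not pull velar transitions towards an alveolar one) is what the text describes as the greater perceptual weight of the velar murmur.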

4. CONCLUSIONS

In the previous sections I have investigated the interactive role of formant transitions and unreleased murmurs in the identification of [n], [ɲ], [ŋ] with [a] in VC syllables using synthetic speech stimuli. Perceptual results from Catalan speakers strongly suggest that syllabic cues such as transitions and murmurs are simultaneously processed in a phonetic mode. As Studdert-Kennedy (1977) has made clear, dynamic acoustic events "are jointly shaped by the timing mechanisms of motor control and by the demands of the auditory system for perceptual contrast and compression" (p. 17). As exemplified below, production and perception data reported in this paper show that there is evidence for both related strategies in the process of identification of place for nasals, namely, reference to articulatory constraints and to constraints imposed by the auditory system.


Figure 7. Comparison of perception (Figures 4, 6) and production (Table 6) data for Catalan subjects with special reference to formant transitions. Ordinate: F2 ranges (in Hz); abscissa: F3 ranges (in Hz).


Reference to psychoacoustic constraints imposed by the auditory system is presumably needed to account for the specific acoustic cues that make transitions and murmurs perceptually relevant. On the one hand, it would account for the correlation between the amount of F2 transition range and the perceptual relevance of the F2 transition, found to hold for nasals in the order [ɲ] > [n] > [ŋ]. An explanation for this effect is suggested by Klatt and Shattuck (1975). They found in experiments with non-speech stimuli that the perceptual importance of an F2-like chirp with respect to an F3-like chirp is positively correlated with its frequency height. That is, the effect would increase with an F2 transition such as that for [ɲ] (strongly positive) with respect to [n], [ŋ], and such as that for [n] (moderately positive) with respect to [ŋ] (slightly positive but also slightly negative or steady). No auditory constraint is known that can handle the perceptually distinctive cues of [ŋ] vs. [n] murmurs.

An auditory analysis of that sort is compatible with a perceptual processing mechanism of relevant acoustic cues that keeps track of the underlying unitary articulatory gesture. In fact, it is to be thought that constraints imposed by the auditory mechanism in the interpretation of cues are integrated at a more central stage with reference to a dynamic and continuous set of coarticulation strategies. Evidence for such related events can be derived from Figure 7: according to the data displayed there, nasals are perceived with reference to well-established category spaces and essentially processed, at least for [a], upon F2-F3 transition range values, and upon murmur characteristics for potentially "ambiguous" transition configurations. Reference to the continuous production event is also needed to account for the perceptual decoding of syllabic spread cues: thus, as shown, transitions and murmurs (and, presumably, releases) are evaluated simultaneously and complementarily, in such a way that a cross-category maximum-to-minimum perceptual effect for transitions ([ɲ] > [n] > [ŋ]) corresponds invariably to a minimum-to-maximum effect for murmurs ([ɲ] < [n] < [ŋ]). Such reciprocal correlation conforms to the existence of a trading relation between transitions and murmurs for [ŋ] and [n] with [a] in Catalan (section 3.3.C.) and a defined compensatory effect between strongly positive [ɲ] transitions and perceptually irrelevant [ɲ] murmur. Further evidence for simultaneous processing of nasal cues at the syllabic level according to vowel quality has been reported for [ŋ] preceded by [e], [ɛ] (section 2.2.C.).

A perceptual model that allows this sort of auditory analysis of transition ranges to occur presupposes reference to a basic set of articulatory gestures but is not compatible with a feature recognition model based upon the auditory detection of invariant short-term spectral properties. Blumstein and Stevens (1979) report properties of this sort for a 6 msec window at the release of [m] (diffuse-falling frequency-amplitude template) vs. [n] (diffuse-rising frequency-amplitude template) in initial position. Frequency-amplitude spectral characteristics at the release can hardly be thought to be "primary place cues", since unreleased nasals are equally common and occurring releases may be perceptually weak, almost indistinguishable; moreover, murmur spectra can hardly provide such cues, given their low amplitude component, high variability and particular pole-zero structure. In fact, transitions have been found in my experiment to be the best place cues in combination with appropriate murmurs: overall, gliding F2-F3 transitions have given more than 95% of [ɲ] judgments, and optimal transition-murmur combinations 80% of [n] and [ŋ] responses. While, in order to handle these aspects, the perceptual model proposed by Stevens and his coworkers can be shown to be too simple and limited, it becomes too complex on other grounds. Thus, contrary to what has been suggested by Stevens (1975), examination of long stretches of acoustic data (presumably hundreds of msec long) before phonetic feature decoding begins is not needed in the case of diphthong-like spectral nuclei such as palatal and palatalized articulations: short pre-closure transitions are satisfactory cues for [ɲ] with [a] and can be processed in the same way as those for [n] and [ŋ].

I have argued for a perceptual process of nasals based upon the simultaneous integration of acoustic cues according to demands imposed by the production and auditory systems. But, given [a] and other possible vocalic environments, what is the articulatory basis for this integration process? Little perceptual data on the identification of nasals of different places of articulation in different vocalic environments is available. The only systematic approach is that of Henderson (Note 1). Henderson's data and the evidence provided in this paper support the view that the perceptual interpretation of transitions and murmurs for nasals preceded by different vowels is similar to the integration of burst and transitions for non-nasal stops of the same place of articulation in CV environments (Dorman et al., 1977). Thus [ŋ] murmurs are optimal cues and [ŋ] transitions very weak cues for [i], [a], [ɔ], [o], [u]; for [e], [ɛ] the role of transitions becomes more relevant, while for [e] (but not for [ɛ]) that of the murmur decreases. For [n], transitions are very effective cues with back vowels but very weak with [i] and [e], while murmur is, complementarily, a better cue for [i], in accordance with earlier findings (Hecker, 1962; Nakata, 1959; Ohala, 1975). General effects for [n] are to be expected also for [ɲ] and ought to be correlated with long transition duration with back vowels vs. short transition duration and little excursion range with high vowels (see, for Polish, Jassem, 1964).

While the perceptual relevance of vowel-to-consonant transition ranges for nasals accords well with data from Dorman et al. (1977), that of murmurs is only consistent for [ŋ] with all vowels but [ɛ] and for [n] with [i]. Differently from alveolar bursts, for [n] a strong murmur effect is found in the case of [o], [ɛ], [u] and less for [a], [e], [ɔ]. Such findings about the category identification role of murmurs in different vocalic environments suggest that the interpretation strategy used by listeners in associating murmur and articulatory event differs from that proposed by Dorman et al. for burst and front cavity size. In the case of final nasals, different articulatory conditions argue for different integration strategies: no burst is present and release is weak and unnecessary; spectral continuity cannot be expected between oral transition endpoints and oro-nasal murmur concentrations characterized, moreover, by a low perceptual relevance of the mid spectral regions; finally, energy concentration and crucial place information in nasal murmurs are dependent on the size characteristics of the oro-nasal system behind the tongue constriction point (back cavity). A plausible integration model for Henderson's data would associate the back cavity for the nasal consonant with the overall front-back cavity system for the vowel, so that the perceptual effectiveness of the murmur would depend on the degree of similarity between back nasal cavity and front or back cavity size appropriate for the vowel. Such a model would predict perceptual relevance for [ŋ] murmur with the back cavity size of coarticulated back and front vowels, for [m] murmurs with the considerable front cavity size of back vowels, and for [ɲ] murmur with the wide pharyngeal pass of palatal vowels. For [n] murmur there would be integration with the overall tract system of a mid vowel and the front cavity of [i]. Obviously, more experimental evidence is needed. In any case, data from Henderson (Note 1) and experiments reported in this paper show that transitions and murmurs, analogously to bursts and transitions, are equivalent and complementary.

REFERENCE NOTE

1. Henderson, J. On the perception of nasal consonants. Unpublished Generals Examination paper, University of Connecticut, 1978.

REFERENCES

Ali, A., Gallagher, T., Goldstein, J., & Daniloff, R. Perception of coarticulated nasality. Journal of the Acoustical Society of America, 1971, 49, 538-540.

Bjuggren, G., & Fant, G. The nasal cavity structures. Royal Institute of Technology, STL-QPSR, 1964, 5-7.

Blumstein, S. E., & Stevens, K. N. Acoustic invariance in speech production: Evidence from measurements of the spectral characteristics of stop consonants. Journal of the Acoustical Society of America, 1979, 66, 1001-1017.

Carlson, R., Granstrom, B., & Pauli, S. Perceptive evaluation of segmental cues. Royal Institute of Technology, STL-QPSR, 1972, 1, 18-24.

Cooper, F. S., Delattre, P. C., Liberman, A. M., Borst, J. M., & Gerstman, L. J. Some experiments on the perception of synthetic speech sounds. Journal of the Acoustical Society of America, 1952, 24, 597-606.

Dalby, J., & Port, R. Radial trajectories in F2×F3 plane as place invariants. Research in Phonetics, 1980, 1, 201-216.

Delattre, P. The physiological interpretation of sound spectrograms. Publications of the Modern Language Association of America, 1951, 66, 864-875.

Delattre, P. Les attributs acoustiques de la nasalité vocalique et consonantique. Studia Linguistica, 1954, 103-109.

Delattre, P. Les indices acoustiques de la parole. Phonetica, 1958, 226-251.

Delattre, P. Divergences entre nasalités vocalique et consonantique en français. Word, 1968, 24, 64-72.

De Mori, R., Gubrynowicz, R., & Laface, P. Inference of a knowledge source for the recognition of nasals in continuous speech. IEEE Transactions on Acoustics, Speech and Signal Processing, 1979, ASSP-27, 538-549.

Derkach, M., Fant, G., & Serpa-Leitão, A. de. Phoneme coarticulation in Russian hard and soft VCV-utterances with voiceless fricatives. Royal Institute of Technology, STL-QPSR, 1970, 2-3, 1-7.

Dorman, M. F., Studdert-Kennedy, M., & Raphael, L. J. Stop-consonant recognition: Release burst and formant transitions as functionally equivalent, context-dependent cues. Perception & Psychophysics, 1977, 22, 109-122.

Dukiewicz, L. Polskie gloski nosowe (Analiza akustyczna). Warsaw: Polska Akademia Nauk, 1967.


Eek, A. Acoustical description of the Estonian sonorant types. Estonian Papers in Phonetics, 1972, 9-35.

Fant, G. Acoustic theory of speech production. The Hague: Mouton, 1960.

Fant, G. Descriptive analysis of acoustic aspects of speech. Logos, 1962, 3-17.

Fant, G. Auditory patterns of speech. In W. Wathen-Dunn (Ed.), Models for the perception of speech and visual form. Cambridge, Mass.: MIT Press, 1967, 111-125.

Fant, G. Acoustic description and classification of phonetic units. Speech sounds and features. Cambridge, Mass.: MIT Press, 1973, 32-83.

Fant, G. Perspectives in speech research. Royal Institute of Technology, STL-QPSR, 1980, 2-3, 1-16.

Ferrero, F., Genre, T., Boe, L. J., & Contini, M. Nozioni di Fonetica Acustica. Torino: Edizioni Omega, 1979.

Ferrero, F., Vagges, K., Righini, G., & Pelamatti, G. M. Un sistema di sintesi dell'italiano: primi risultati. Rivista Italiana di Acustica, 1977, 1, 33-48.

Fischer-Jørgensen, E. Acoustic analysis of stop consonants. Miscellanea Phonetica, 1954, 2, 42-59.

Fujimura, O. Analysis of nasal consonants. Journal of the Acoustical Society of America, 1962, 34, 1865-1875.

Fujimura, O. Formant-antiformant structure of nasal murmurs. Proceedings of the Speech Communication Seminar (1962). Vol. I. Stockholm: Royal Institute of Technology, Speech Transmission Laboratory, 1963, 1-9.

Fujimura, O. Remarks on stop consonants. Synthesis experiments and acoustic cues. In L. L. Hammerich, R. Jakobson, & E. Zwirner (Eds.), Form and substance. Denmark: Akademisk Forlag, 1971, 221-232.

Fujimura, O., & Lindqvist, J. Sweep-tone measurements of vocal-tract characteristics. Journal of the Acoustical Society of America, 1970, 541-557.

Garcia, E. The identification and discrimination of synthetic nasals. Haskins Laboratories Status Report on Speech Research, 1966, SR-7/8, 3.1-3.16.

Garcia, E. Labelling of synthetic nasals. Haskins Laboratories Status Report on Speech Research, 1967, SR-9, 4.1-4.17. (a)

Garcia, E. Discrimination of three-formant nasal-vowel syllables. Haskins Laboratories Status Report on Speech Research, 1967, SR-12, 143-153. (b)

Green, P. S. Consonant-vowel transitions. A spectrographic study. Studia Linguistica, 1959, 11, 57-105.

Halle, M., Hughes, G. W., & Radley, J.-P. A. Acoustic properties of stop consonants. Journal of the Acoustical Society of America, 1957, 29, 107-116.

Harris, K. S., Hoffman, H. S., Liberman, A. M., Delattre, P. C., & Cooper, F. S. Effect of third-formant transitions on the perception of the voiced stop consonants. Journal of the Acoustical Society of America, 1958, 30, 122-126.

Haskins Laboratories Quarterly Progress Report. Research Study on Reinforcement of Speech. Number: Eight (1953). Haskins Laboratories.

Haskins Laboratories Quarterly Progress Report. Research Study on Reinforcement of Speech. Number: Eleven (1954). Haskins Laboratories.

Haskins Laboratories Quarterly Progress Report. Research Study on Reinforcement of Speech. Number: Thirteen (1954). Haskins Laboratories.


Hattori, S., Yamamoto, K., & Fujimura, O. Nasalization of vowels in relation to nasals. Journal of the Acoustical Society of America, 1958, 30, 267-274.

Hecker, M. H. Studies of nasal consonants with an articulatory speech synthesizer. Journal of the Acoustical Society of America, 1962, 34, 179-188.

House, A. S. Analog studies of nasal consonants. Journal of Speech and Hearing Disorders, 1957, 22, 190-204.

Jassem, W. The acoustics of consonants. In A. Sovijärvi & P. Aalto (Eds.), Proceedings of the Fourth International Congress of Phonetic Sciences. The Hague: Mouton, 1962, pp. 50-72.

Jassem, W. A spectrographic study of Polish speech sounds. In D. Abercrombie, D. B. Fry, P. A. D. MacCarthy, N. L. Scott, & J. L. M. Trim (Eds.), In honour of Daniel Jones. London: Longmans, Green, 1964, 334-348.

Jassem, W. Podstawy fonetyki akustycznej. Warsaw: Polska Akademia Nauk, 1973.

Kacpranski, R. P. Spectral analysis of German nasal consonants. Phonetica, 1965, 165-170.

Kacprowski, J., & Mikiel, W. Simplified rules for parametric synthesis of nasal and stop consonants in C-V syllables by means of the "terminal-analog" speech synthesizer. Acoustica, 1965, 16, 356-364.

Klatt, D. H., & Shattuck, S. R. Perception of brief stimuli that resemble rapid formant transitions. In G. Fant & M. A. A. Tatham (Eds.), Auditory analysis and perception of speech. New York: Academic Press, 1975, 293-301.

Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy, M. Perception of the speech code. Psychological Review, 1967, 74, 431-461.

Liberman, A. M., Delattre, P. C., Cooper, F. S., & Gerstman, L. J. The role of consonant-vowel transitions in the perception of the stop and nasal consonants. Psychological Monographs, 1954, No. 8.

Magdics, K. Studies in the acoustic characteristics of Hungarian speech sounds. Indiana University Publications, Uralic and Altaic Series, 1969, 97.

Malecot, A. Acoustic cues for nasal consonants: An experimental study involving a tape-splicing technique. Language, 1956, 32, 274-284.

Manrique, A. M. B. de, Gurlekian, J. A., & Massone, M. I. Función de las propiedades acústicas en el reconocimiento de las consonantes nasales y líquidas españolas. Informe XIII del Laboratorio de Investigaciones Sensoriales. Buenos Aires, 1980, 13.

Martony, J. The role of formant amplitudes in synthesis of nasal consonants. Royal Institute of Technology, STL-QPSR, 1964, 2, 28-31.

Mascaro, J. Catalan phonology and the phonological cycle. Bloomington: Indiana University Linguistics Club, 1978.

Massone, M. I. Estudio acústico de las consonantes españolas nasales y líquidas. Informe XIII del Laboratorio de Investigaciones Sensoriales. Buenos Aires, 1980, 13, 5.

Mattingly, I. Synthesis by rule of General American English. Supplement to Haskins Laboratories Status Report on Speech Research, 1968.

Mattingly, I., Pollock, S., Levas, A., Scully, W., & Levitt, A. Software synthesizer for phonetic research. Journal of the Acoustical Society of America, 1981, 69, S83. (Abstract)

Miller, J. L., & Eimas, P. D. Studies on the perception of place and manner of articulation: A comparison of the labial-alveolar and nasal-stop distinctions. Journal of the Acoustical Society of America, 1977, 61, 835-845.

Nakata, K. Synthesis and perception of nasal consonants. Journal of theAcoustical Society of America, 1959, ..2.1, 661-666.

Nord, L. Perceptual experiments with nasals. Royal Institute of Technology,STL-QPSR, 1976, 2-3, 5-8.

Ohala, J. J. Phonetic explanations for nasal sound patterns. In Ch. A.Ferguson, L. M. Hyman, & J. J. Ohala (Eds.), Nasalfest, 1975, 289-316.

Pickett, J. M. Some acoustic cues for synthesis of the /n-d/ distinction.Journal of the Acoustical Society of America, 1965, 35, 474-477.

Potter, R. K.-,-Kopp, G. A., & Green, ~ C. Visible speech. New York: D.Van Nostrand Inc., 1947.

Romportl, M. Zur akustischen analyse und Klassifizierung der nasale. tudiesin Phonetics. The Hague: Mouton, 1973, 78-83.

Stevens, K. N. Feature detection and auditory segmentation: Consonant per-ception. In G. Fant & M. A. A. Tatham (Elds.), Auditory analysis andperception of speech. New York: Academic Press, 1975, 191-195.

Studdert-Kennedy, M. Universals in phonetic structure and their role in linguistic communication. In T. H. Bullock (Ed.), Recognition of complex acoustic signals. Berlin: Dahlem Konferenzen, 1977, 37-48.

Takeuchi, S., Kasuya, H., & Kido, K. A method for extraction of the spectral cues of nasal consonants. Journal of the Acoustical Society of Japan, 1975, 31, 739-740.

Vagges, K., Ferrero, F. E., Caldognetto-Magno, E., & Lavagnoli, C. Some acoustic characteristics of Italian consonants. Journal of Italian Linguistics, 1978, 2, 69-85.

Wang, W. S.-Y., & Fillmore, Ch. J. Intrinsic cues and consonant perception. Journal of Speech and Hearing Research, 1961, 4, 130-136.

Weinstein, C. J., McCandless, S. S., Mondshein, L. F., & Zue, V. W. A system for acoustic-phonetic analysis of continuous speech. IEEE Transactions on Acoustics, Speech and Signal Processing, 1975, ASSP-23, 1, 54-67.

Zee, E. Effect of vowel quality on perception of post-vocalic nasal consonants in noise. Journal of Phonetics, 1981, 9, 35-48.

FOOTNOTES

1 I represent with [ɲ] palatal as well as palatalized nasal stops. Velar nasal stops are consistently represented with [ŋ] independently of their phonological status.

2 Research is to be done on the relative differences in pharynx cavity size and size of velum opening among nasals of different place of articulation. It remains to be seen to what extent the acoustic structure and the perceptual role of their murmur formants are affected by those differences.

3 The reliability of the results remains open to the objections raised in Section 3.3.C. Furthermore, possible bias effects can be related to the fact that all experiments were forced-choice. Also, the fact that American English subjects in Malecot's experiment gave 80% [ŋ] responses for original [ŋ] but 100% [m] and [n] responses for original [m] and [n], respectively, suggests response bias effects against the velar correlate. That American English speakers identify [ŋ] after [a] very reliably has been shown with original natural speech stimuli by Henderson (Note 1) (100%) and Zee (1981) (96%).

4 I call transition range value the amount of frequency contrast between steady state vowel and starting point or endpoint value of an adjacent formant transition. It is expressed in Hz and can be positive (+), negative (-), or null (:::).
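The definition in this footnote can be illustrated with a minimal sketch. The sign convention used here (transition edge minus vowel steady state, so that a transition pointing to a higher frequency is positive) is an assumption, as are the function names, the optional tolerance, and the example frequencies; none of them are taken from the paper.

```python
def transition_range(vowel_steady_state_hz, transition_edge_hz):
    """Signed frequency contrast (Hz) between the starting point or endpoint
    of a formant transition and the adjacent vowel steady state.

    Assumed convention: positive when the transition edge lies above the
    steady-state frequency, negative when it lies below.
    """
    return transition_edge_hz - vowel_steady_state_hz


def range_sign(range_hz, tolerance_hz=0.0):
    """Label a transition range value as positive, negative, or null.

    tolerance_hz is a hypothetical dead band around zero; it defaults to an
    exact comparison.
    """
    if range_hz > tolerance_hz:
        return "positive"
    if range_hz < -tolerance_hz:
        return "negative"
    return "null"


# Hypothetical F2 values: steady state at 1300 Hz, transition endpoint at 2000 Hz.
f2_range = transition_range(1300.0, 2000.0)
print(f2_range, range_sign(f2_range))
```

Under this assumed convention, a rising transition toward a high locus yields a positive value and a falling one a negative value.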

5 Acoustic measurements of zeros were inferred from frequency areas that show a major reduction in the magnitude of the energy envelope as observed in spectrographic spectral sections. A final decision about zero frequency values to be included in the synthetic speech patterns used for perceptual testing was also reached on the basis of measurements reported in the literature as well as well-accepted observations on formant-antiformant spectral characteristics of nasal murmurs corresponding to different place categories (see Sections 2.1.B. and 2.2.A., and Table 1).

6 Bandwidth values were estimated by measuring the distance between two points at the right and left side of the spectrum envelope, equally located 3 dB below the peak level.
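The measurement described in this footnote can be sketched as follows. This is only an illustration under stated assumptions: the spectrum envelope is taken to be a sampled list of (frequency, level in dB) points, the crossing points 3 dB below the peak are found by linear interpolation between samples, and the function name and example envelope are hypothetical. The original measurements were made on spectrographic sections, not computed this way.

```python
def bandwidth_3db(freqs_hz, levels_db, peak_index):
    """Estimate the bandwidth of a spectral peak as the distance (Hz) between
    the two points, left and right of the peak, that lie 3 dB below it.

    Returns None if the envelope never falls 3 dB below the peak on one side.
    """
    target = levels_db[peak_index] - 3.0

    def crossing(step):
        # Walk away from the peak (step = -1 left, +1 right) until the
        # envelope reaches the target level, then interpolate linearly.
        i = peak_index
        while 0 <= i + step < len(levels_db):
            j = i + step
            if levels_db[j] <= target:
                frac = (levels_db[i] - target) / (levels_db[i] - levels_db[j])
                return freqs_hz[i] + frac * (freqs_hz[j] - freqs_hz[i])
            i = j
        return None

    left = crossing(-1)
    right = crossing(+1)
    if left is None or right is None:
        return None
    return right - left


# Hypothetical envelope: a symmetric peak at 1000 Hz falling 6 dB per 100 Hz.
freqs = [800.0, 900.0, 1000.0, 1100.0, 1200.0]
levels = [-12.0, -6.0, 0.0, -6.0, -12.0]
print(bandwidth_3db(freqs, levels, peak_index=2))
```

Interpolation is one simple choice for locating the 3 dB points on a sampled envelope; a finer frequency grid would reduce its influence on the estimate.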

7 The mismatch between perceptual stimuli and frequency values corresponding to the [ɲ] production space (Figure 7) did not affect the quality of [ɲ] judgments, thus suggesting that subjects perceive a positive F2 transition as a palatal cue when pointing to a critical high locus.
