1 - The Perception of Musical Tones, Pages 1-33


1 The Perception of Musical Tones

     Andrew J. Oxenham

    Department of Psychology, University of Minnesota, Minneapolis

    I. Introduction

     A. What Are Musical Tones? 

    The definition of a tone—a periodic sound that elicits a pitch sensation—encompasses

    the vast majority of musical sounds. Tones can be either pure—sinusoidal variations

    in air pressure at a single frequency—or complex. Complex tones can be divided into

    two categories, harmonic and inharmonic. Harmonic complex tones are periodic,

    with a repetition rate known as the fundamental frequency (F0), and are composed of 

    a sum of sinusoids with frequencies that are all integer multiples, or harmonics,

of the F0. Inharmonic complex tones are composed of multiple sinusoids that are not simple integer multiples of any common F0. Most musical instrumental or vocal tones are more or less harmonic, but some, such as bell chimes, can be

    inharmonic.
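To make the definition concrete, a harmonic complex tone can be synthesized directly as a sum of sinusoids at integer multiples of the F0. The following is a minimal Python sketch (the sampling rate, duration, and amplitude values are illustrative choices, not from the chapter); because every component frequency is an integer multiple of the F0, the resulting waveform repeats exactly at the fundamental period 1/F0.

```python
import numpy as np

def harmonic_tone(f0, n_harmonics, sr=44100, dur=0.1, amps=None):
    """Harmonic complex tone: a sum of sinusoids at integer multiples of f0."""
    t = np.arange(int(sr * dur)) / sr
    amps = amps if amps is not None else np.ones(n_harmonics)
    return sum(a * np.sin(2 * np.pi * f0 * k * t)
               for k, a in zip(range(1, n_harmonics + 1), amps))

sr = 44100
x = harmonic_tone(441.0, 10, sr=sr)   # 441 Hz, so the period is exactly 100 samples
period = int(sr / 441.0)

# The waveform repeats at the fundamental period 1/F0:
print(np.allclose(x[:period], x[period:2 * period], atol=1e-9))
```

An inharmonic tone would be obtained by the same construction with component frequencies that are not integer multiples of a common F0, in which case no such short repetition period exists.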

    B. Measuring Perception

The physical attributes of a sound, such as its intensity and spectral content, can be readily measured with modern technical instrumentation. Measuring the perception of sound is a different matter. Gustav Fechner, a 19th-century German scientist,

    is credited with founding the field of psychophysics—the attempt to establish a

    quantitative relationship between physical variables (e.g., sound intensity and fre-

    quency) and the sensations they produce (e.g., loudness and pitch;  Fechner, 1860).

    The psychophysical techniques that have been developed since Fechner’s time to

    tap into our perceptions and sensations (involving hearing, vision, smell, touch, and

    taste) can be loosely divided into two categories of measures, subjective and objec-

    tive. The subjective measures typically require participants to estimate or produce

magnitudes or ratios that relate to the dimension under study. For instance, in establishing a loudness scale, participants may be presented with a series of tones

    at different intensities and then asked to assign a number to each tone, correspond-

ing to its loudness. This method of magnitude estimation thus produces a psychophysical function that directly relates loudness to sound intensity. Ratio estimation

    follows the same principle, except that participants may be presented with two

    The Psychology of Music. DOI: http://dx.doi.org/10.1016/B978-0-12-381460-9.00001-8

    © 2013 Elsevier Inc. All rights reserved.



    sounds and then asked to judge how much louder (e.g., twice or three times) one

sound is than the other. The complementary methods are magnitude production and ratio production. In these production techniques, the participants are required to vary the relevant physical dimension of a sound until it matches a given magnitude (number), or until it matches a specific ratio with respect to a reference sound.

    In the latter case, the instructions may be something like “adjust the level of the

    second sound until it is twice as loud as the first sound.” All four techniques have

    been employed numerous times in attempts to derive appropriate psychophysical

    scales (e.g.,   Buus, Muesch, & Florentine, 1998; Hellman, 1976; Hellman &

    Zwislocki, 1964; Stevens, 1957; Warren, 1970). Other variations on these methods

    include categorical scaling and cross-modality matching. Categorical scaling involves

    asking participants to assign the auditory sensation to one of a number of fixed

categories; following our loudness example, participants might be asked to select a category ranging from very quiet to very loud (e.g., Mauermann, Long, & Kollmeier,

    2004). Cross-modality matching avoids the use of numbers by, for instance, asking

    participants to adjust the length of a line, or a piece of string, to match the perceived

    loudness of a tone (e.g.,   Epstein & Florentine, 2005). Although all these methods

    have the advantage of providing a more-or-less direct estimate of the relationship

    between the physical stimulus and the sensation, they have a number of disadvan-

    tages also. First, they are subjective and rely on introspection on the part of the

    subject. Perhaps because of this they can be somewhat unreliable, variable across

and within participants, and prone to various biases (e.g., Poulton, 1977). The other approach is to use an objective measure, where a right and wrong

    answer can be verified externally. This approach usually involves probing the limits

of resolution of the sensory system, by measuring absolute threshold (the smallest detectable stimulus), relative threshold (the smallest detectable change in a stimulus), or masked threshold (the smallest detectable stimulus in the presence of another

    stimulus). There are various ways of measuring threshold, but most involve a forced-

    choice procedure, where the subject has to pick the interval that contains the target

    sound from a selection of two or more. For instance, in an experiment measuring

absolute threshold, the subject might be presented with two successive time intervals, marked by lights; the target sound is played during one of the intervals, and the

    subject has to decide which one it was. One would expect performance to change

    with the intensity of the sound: at very low intensities, the sound will be completely

    inaudible, and so performance will be at chance (50% correct in a two-interval task);

    at very high intensities, the sound will always be clearly audible, so performance will

be near 100%, assuming that the subject continues to pay attention. A psychometric function can then be derived, which plots the performance of a subject as a function of the stimulus parameter. An example of a psychometric function is shown in Figure 1, which plots percent correct as a function of sound pressure level. This type of forced-choice paradigm is usually preferable (although often more time-consuming) to more subjective measures, such as the method of limits, which is often used today

    to measure audiograms. In the method of limits, the intensity of a sound is decreased

    until the subject reports no longer being able to hear it, and then the intensity

    of the sound is increased until the subject again reports being able to hear it.
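The shape of a two-interval forced-choice psychometric function can be sketched numerically. The logistic form, threshold, and slope below are illustrative assumptions (the chapter does not prescribe a functional form); the key properties are the ones stated in the text: performance is at chance (50%) for inaudible levels and approaches 100% at high levels.

```python
import numpy as np

rng = np.random.default_rng(0)

def percent_correct(level_db, threshold_db=5.0, slope=2.0):
    """Assumed logistic psychometric function for two-interval forced choice:
    chance (0.5) far below threshold, approaching 1.0 far above it."""
    return 0.5 + 0.5 / (1.0 + np.exp(-(level_db - threshold_db) / slope))

def simulate_2afc(level_db, n_trials=10000):
    """Each simulated trial is 'correct' with the probability given above."""
    return np.mean(rng.random(n_trials) < percent_correct(level_db))

print(round(float(percent_correct(-30)), 3))  # far below threshold: chance
print(round(float(percent_correct(40)), 3))   # far above threshold: near-perfect
```

Fitting such a function to forced-choice data, and reading off the level at some criterion performance (e.g., 75% correct), is the usual way a threshold is defined in this paradigm.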



The trouble with such measures is that they rely not just on sensitivity but also on

    criterion—how willing the subject is to report having heard a sound if he or she is

    not sure. A forced-choice procedure eliminates that problem by forcing participants

    to guess, even if they are unsure which interval contained the target sound. Clearly,

    testing the perceptual limits by measuring thresholds does not tell us everything

    about human auditory perception; a primary concern is that these measures are typi-

    cally indirect—the finding that people can detect less than a 1% change in frequency

    does not tell us much about the perception of much larger musical intervals, such as

an octave. Nevertheless, it has proved extremely useful in helping us to gain a deeper understanding of perception and its relation to the underlying physiology of the

    ear and brain.

    Measures of reaction time, or response time (RT), have also been used to probe

    sensory processing. The two basic forms of response time are simple response time

    (SRT), where participants are instructed to respond as quickly as possible by push-

    ing a single button once a stimulus is presented, and choice response time (CRT),

    where participants have to categorize the stimulus (usually into one of two catego-

    ries) before responding (by pressing button 1 or 2).

Although RT measures are more common in cognitive tasks, they also depend on some basic sound attributes, such as sound intensity, with higher intensity

    sounds eliciting faster reactions, measured using both SRTs (Kohfeld, 1971;

    Luce & Green, 1972) and CRTs (Keuss & van der Molen, 1982).

    Finally, measures of perception are not limited to the quantitative or numerical

    domain. It is also possible to ask participants to describe their percepts in words.

    This approach has clear applications when dealing with multidimensional attributes,

    such as timbre (see below, and Chapter 2 of this volume), but also has some inherent

    difficulties, as different people may use descriptive words in different ways.

To sum up, measuring perception is a thorny issue that has many solutions, all with their own advantages and shortcomings. Perceptual measures remain a crucial

    “systems-level” analysis tool that can be combined in both human and animal stud-

    ies with various physiological and neuroimaging techniques, to help us discover

    more about how the ears and brain process musical sounds in ways that elicit

    music’s powerful cognitive and emotional effects.

Figure 1  A schematic example of a psychometric function, plotting percent correct in a two-alternative forced-choice task against the sound pressure level of a test tone (signal level in dB SPL on the abscissa, percent correct from 50 to 100 on the ordinate).



    broadband sounds remains roughly constant when expressed as a ratio or in deci-

    bels is in line with the well-known Weber’s law, which states that the JND between

    two stimuli is proportional to the magnitude of the stimuli.
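Weber's law has a convenient consequence for the decibel scale: if the just-noticeable increment ΔI is a fixed fraction of the baseline intensity I, then the JND expressed in decibels, 10·log10((I + ΔI)/I), is the same at every level. A short sketch (the Weber fraction of 0.2 is an arbitrary illustrative value):

```python
import math

def jnd_in_db(weber_fraction):
    """Weber's law: dI is proportional to I, so the JND in decibels,
    10*log10((I + dI)/I) = 10*log10(1 + dI/I), is level-independent."""
    return 10 * math.log10(1 + weber_fraction)

# With an assumed Weber fraction of 0.2 (a 20% intensity increment),
# the JND is about 0.8 dB at every baseline intensity:
for intensity in (1e-6, 1.0, 1e6):
    di = 0.2 * intensity
    print(round(10 * math.log10((intensity + di) / intensity), 3))
```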

In contrast to our ability to judge differences in sound level between two sounds presented one after another, our ability to categorize or label sound levels is rather

    poor. In line with  Miller’s (1956)  famous “7 plus or minus 2” postulate for infor-

    mation processing and categorization, our ability to categorize sound levels accu-

    rately is fairly limited and is subject to a variety of influences, such as the context

    of the preceding sounds. This may explain why the musical notation of loudness

(in contrast to pitch) has relatively few categories between pianissimo and fortissimo—typically just six (pp, p, mp, mf, f, and ff).

    2. Equal Loudness Contours and the Loudness Weighting Curves

    There is no direct relationship between the physical sound level (in dB SPL) and

    the sensation of loudness. There are many reasons for this, but an important one is

    that loudness depends heavily on the frequency content of the sound.   Figure 2

    shows what are known as equal loudness contours. The basic concept is that two

    pure tones with different frequencies, but with levels that fall on the same loudness

    contour, have the same loudness. For instance, as shown in  Figure 2, a pure tone

    with a frequency of 1 kHz and a level of 40 dB SPL has the same loudness as a

    pure tone with a frequency of 100 Hz and a level of about 64 dB SPL; in other words,

    a 100-Hz tone has to be 24 dB higher in level than a 40-dB SPL 1-kHz tone in order

Figure 2  The equal-loudness contours, from 10 to 100 phons, together with the hearing threshold, plotted as sound pressure level in dB against frequency in Hz (16 Hz to 16 kHz), taken from ISO 226:2003. Original figure kindly provided by Brian C. J. Moore.



    to be perceived as being equally loud. The equal loudness contours are incorporated

    into an international standard (ISO 226) that was initially established in 1961 and was

    last revised in 2003.

These equal loudness contours have been derived several times from painstaking psychophysical measurements, not always with identical outcomes (Fletcher &

    Munson, 1933; Robinson & Dadson, 1956; Suzuki & Takeshima, 2004). The mea-

    surements typically involve either loudness matching, where a subject adjusts the

    level of one tone until it sounds as loud as a second tone, or loudness comparisons,

    where a subject compares the loudness of many pairs of tones and the results are

    compiled to derive points of subjective equality (PSE). Both methods are highly

    susceptible to nonsensory biases, making the task of deriving a definitive set of 

    equal loudness contours a challenging one (Gabriel, Kollmeier, & Mellert, 1997).

The equal loudness contours provide the basis for the measure of “loudness level,” which has units of “phons.” The phon value of a sound is the dB SPL value

    of a 1-kHz tone that is judged to have the same loudness as the sound. So, by defi-

    nition, a 40-dB SPL tone at 1 kHz has a loudness level of 40 phons. Continuing the

    preceding example, the 100-Hz tone at a level of about 64 dB SPL also has a loud-

    ness level of 40 phons, because it falls on the same equal loudness contour as the

    40-dB SPL 1-kHz tone. Thus, the equal loudness contours can also be termed the

    equal phon contours.

    Although the actual measurements are difficult, and the results somewhat conten-

tious, there are many practical uses for the equal loudness contours. For instance, in issues of community noise annoyance from rock concerts or airports, it is more use-

    ful to know about the perceived loudness of the sounds in question, rather than just

    their physical level. For this reason, an approximation of the 40-phon equal loudness

    contour is built into most modern sound level meters and is referred to as the

    “A-weighted” curve. A sound level that is quoted in dB (A) is an overall sound level

    that has been filtered with the inverse of the approximate 40-phon curve. This means

    that very low and very high frequencies, which are perceived as being less loud, are

    given less weight than the middle of the frequency range.
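The A-weighting curve itself has a standard closed-form definition (IEC 61672), built from four corner frequencies and normalized to 0 dB at 1 kHz. The sketch below implements that standard formula; note that it is only an approximation of the 40-phon contour, so its low-frequency attenuation will not exactly match values read off Figure 2.

```python
import math

def a_weighting_db(f):
    """Standard A-weighting gain in dB at frequency f (IEC 61672):
    approximately the inverse of the 40-phon equal loudness contour,
    0 dB at 1 kHz, strongly negative at very low and very high frequencies."""
    f2 = f * f
    ra = (12194.0**2 * f2 * f2) / (
        (f2 + 20.6**2)
        * math.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
        * (f2 + 12194.0**2)
    )
    return 20 * math.log10(ra) + 2.00  # +2.00 dB normalizes the 1-kHz gain to 0

print(round(a_weighting_db(1000), 2))   # ~0 dB at 1 kHz by construction
print(round(a_weighting_db(100), 1))    # low frequencies are attenuated
```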

As with all useful tools, the A-weighted curve can be misused. Because it is based on the 40-phon curve, it is most suitable for low-level sounds; however, that

    has not prevented it from being used in measurements of much higher-level sounds,

    where a flatter filter would be more appropriate, such as that provided by the

    much-less-used C-weighted curve. The ubiquitous use of the dB (A) scale for all

    levels of sound therefore provides an example of a case where the convenience of a

    single-number measure (and one that minimizes the impact of difficult-to-control

    low frequencies) has outweighed the desire for accuracy.

    3. Loudness Scales

    Equal loudness contours and phons tell us about the relationship between loudness

    and frequency. They do not, however, tell us about the relationship between loud-

    ness and sound level. For instance, the phon, based as it is on the decibel scale at

    1 kHz, says nothing about how much louder a 60-dB SPL tone is than a 30-dB



    SPL tone. The answer, according to numerous studies of loudness, is not twice as

    loud. There have been numerous attempts since Fechner’s day to relate the physical

    sound level to loudness.  Fechner (1860), building on Weber’s law, reasoned that if 

JNDs were constant on a logarithmic scale, and if equal numbers of JNDs reflected an equal change in loudness, then loudness must be related logarithmically to sound

    intensity. Harvard psychophysicist S. S. Stevens disagreed, claiming that JNDs

    reflected “noise” in the auditory system, which did not provide direct insight into

    the function relating loudness to sound intensity (Stevens, 1957). Stevens’s

    approach was to use magnitude and ratio estimation and production techniques, as

    described in Section I of this chapter, to derive a relationship between loudness and

    sound intensity. He concluded that loudness ( L ) was related to sound intensity ( I )

    by a power law:

L = kI^α    (Eq. 1)

where the exponent, α, has a value of about 0.3 at medium frequencies and for

    moderate and higher sound levels. This law implies that a 10-dB increase in level

    results in a doubling of loudness. At low levels, and at lower frequencies, the expo-

    nent is typically larger, leading to a steeper growth-of-loudness function. Stevens

    used this relationship to derive loudness units, called “sones.” By definition, 1 sone

    is the loudness of a 1-kHz tone presented at a level of 40 dB SPL; 2 sones is twice

as loud, corresponding roughly to a 1-kHz tone presented at 50 dB SPL, and 4 sones corresponds to the same tone at about 60 dB SPL.
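The sone scale and the "10 dB per doubling" rule both follow directly from Eq. 1 with α ≈ 0.3: a 10-dB increase multiplies intensity by 10, and 10^0.3 ≈ 2. The sketch below encodes this textbook simplification (valid for a 1-kHz tone at moderate and higher levels; the function name is mine):

```python
# Stevens's power law, L = k * I**alpha, with alpha ~ 0.3.  A 10-dB increase
# multiplies I by 10, so loudness is multiplied by 10**0.3 ~ 2: every 10 dB
# roughly doubles the loudness.

def sones(level_db_spl):
    """Loudness in sones for a 1-kHz tone, normalized so that 40 dB SPL
    corresponds to 1 sone (a simplification of Stevens's power law)."""
    return 2.0 ** ((level_db_spl - 40.0) / 10.0)

print(sones(40))  # 1.0 sone by definition
print(sones(50))  # 2.0 sones: twice as loud
print(sones(60))  # 4.0 sones
```

With Warren's exponent of 0.5 the doubling step would instead be about 6 dB, and with Viemeister and Bacon's 0.1 about 30 dB, which is exactly the disagreement discussed below.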

    Numerous studies have supported the basic conclusion that loudness can be

    related to sound intensity by a power law. However, in part because of the variability

    of loudness judgments, and the substantial effects of experimental methodology

    (Poulton, 1979), different researchers have found different values for the best-fitting

    exponent. For instance, Warren (1970) argued that presenting participants with sev-

    eral sounds to judge invariably results in bias. He therefore presented each subject

    with only one trial. Based on these single-trial judgments, Warren also derived a

power law, but he found an exponent value of 0.5. This exponent value is what one might expect if the loudness of a sound were inversely proportional to its distance from the

    receiver, leading to a 6-dB decrease in level for every doubling of distance. Yet

    another study, which tried to avoid bias effects by using the entire (100-dB) level

    range within each experiment, derived an exponent of only 0.1, implying a doubling

    of loudness for every 30-dB increase in sound level (Viemeister & Bacon, 1988).

    Overall, it is generally well accepted that the relationship between loudness and

    sound intensity can be approximated as a power law, although methodological issues

    and intersubject and intrasubject variability have made it difficult to derive a defini-

    tive and uncontroversial function relating the sensation to the physical variable.

    4. Partial Loudness and Context Effects

    Most sounds that we encounter, particularly in music, are accompanied by other

    sounds. This fact makes it important to understand how the loudness of a sound is



    (Moore & Glasberg, 1997), and others have been extended to explain the loudness

    of sounds that fluctuate over time (Chalupper & Fastl, 2002; Glasberg & Moore,

    2002). However, none has yet attempted to incorporate context effects, such as

    loudness recalibration or loudness enhancement.

    B. Pitch

    Pitch is arguably the most important dimension for conveying music. Sequences of 

    pitches form a melody, and simultaneous combinations of pitches form harmony—

    two foundations of Western music. There is a vast body of literature devoted to

    pitch research, from both perceptual and neural perspectives (Plack, Oxenham,

    Popper, & Fay, 2005). The clearest physical correlate of pitch is the periodicity, or

repetition rate, of a sound, although other dimensions, such as sound intensity, can have small effects (e.g., Verschuure & van Meeteren, 1975). For young people

    with normal hearing, pure tones with frequencies between about 20 Hz and 20 kHz

    are audible. However, only sounds with repetition rates between about 30 Hz and

    5 kHz elicit a pitch percept that can be called musical and is strong enough to carry

    a melody (e.g.,   Attneave & Olson, 1971; Pressnitzer, Patterson, & Krumbholz,

    2001; Ritsma, 1962). Perhaps not surprisingly, these limits, which were determined

    through psychoacoustical investigation, correspond quite well to the lower and

    upper limits of pitch found on musical instruments: the lowest and highest notes of 

a modern grand piano, which covers the ranges of all standard orchestral instruments, correspond to 27.5 Hz and 4186 Hz, respectively.
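Those two endpoint frequencies are related by the equal-tempered semitone: frequency doubles every 12 semitones, so the piano's 88 keys span 87 semitones from A0 at 27.5 Hz. A quick check of the arithmetic (the helper function is mine):

```python
# Equal-tempered note frequencies double every octave (12 semitones), so the
# key n semitones above the piano's lowest note A0 (27.5 Hz) has frequency
# 27.5 * 2**(n / 12).  The modern piano spans 88 keys, A0 to C8.

def key_frequency(n, f_lowest=27.5):
    """Frequency of the key n semitones above A0 in equal temperament."""
    return f_lowest * 2.0 ** (n / 12.0)

print(key_frequency(0))              # 27.5 (A0, lowest note)
print(round(key_frequency(87)))      # 4186 (C8, highest note)
print(round(key_frequency(48), 1))   # 440.0 (A4, the orchestral A)
```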

    We tend to recognize patterns of pitches that form melodies (see Chapter 7 of 

    this volume). We do this presumably by recognizing the musical intervals between

    successive notes (see Chapters 4 and 7 of this volume), and most of us seem rela-

    tively insensitive to the absolute pitch values of the individual note, so long as the

    pitch relationships between notes are correct. However, exactly how the pitch is

    extracted from each note and how it is represented in the auditory system remain

    unclear, despite many decades of intense research.

    1. Pitch of Pure Tones

    Pure tones produce a clear, unambiguous pitch, and we are very sensitive to

    changes in their frequency. For instance, well-trained listeners can distinguish

    between two tones with frequencies of 1000 and 1002 Hz—a difference of only

    0.2% (Moore, 1973). A semitone, the smallest step in the Western scale system,

    is a difference of about 6%, or about a factor of 30 greater than the JND of 

    frequency for pure tones. Perhaps not surprisingly, musicians are generally better

than nonmusicians at discriminating small changes in frequency; what is more surprising is that it does not take much practice for people with no musical train-

    ing to “catch up” with musicians in terms of their performance. In a recent study,

    frequency discrimination abilities of trained classical musicians were compared

    with those of untrained listeners with no musical background, using both pure

    tones and complex tones (Micheyl, Delhommeau, Perrot, & Oxenham, 2006).

    Initially thresholds were about a factor of 6 worse for the untrained listeners.



    However, it took only between 4 and 8 hours of practice for the thresholds of the

    untrained listeners to match those of the trained musicians, whereas the trained

    musicians did not improve with practice. This suggests that most people are able

    to discriminate very fine differences in frequency with very little in the way of specialized training.
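The "factor of 30" comparison above is simple arithmetic on the equal-tempered semitone ratio of 2^(1/12):

```python
# A semitone in equal temperament is a frequency ratio of 2**(1/12), i.e. a
# step of about 6%, roughly 30 times the ~0.2% frequency JND that
# well-trained listeners can achieve for pure tones.

semitone_pct = (2 ** (1 / 12) - 1) * 100   # ~5.95%
jnd_pct = 0.2                              # trained-listener JND from the text

print(round(semitone_pct, 2))
print(round(semitone_pct / jnd_pct))
```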

    Two representations of a pure tone at 440 Hz (the orchestral A) are shown in

    Figure 3. The upper panel shows the waveform—variations in sound pressure as a

    function of time—that repeats 440 times a second, and so has a period of 1/440 s,

    or about 2.27 ms. The lower panel provides the spectral representation, showing

    that the sound has energy only at 440 Hz. This spectral representation is for an

    “ideal” pure tone—one that has no beginning or end. In practice, spectral energy

    spreads above and below the frequency of the pure tone, reflecting the effects of 

onset and offset. These two representations (spectral and temporal) provide a good introduction to two ways in which pure tones are represented in the peripheral

    auditory system.
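The two representations in Figure 3 can be generated numerically: synthesize the 440-Hz waveform and take its discrete Fourier transform. In this sketch the duration is chosen to hold a whole number of periods, so the finite-length spectrum still peaks exactly at 440 Hz (with an arbitrary truncation, energy would spread around 440 Hz, as the text notes for real onsets and offsets).

```python
import numpy as np

# A pure tone at 440 Hz: the waveform repeats every 1/440 s ~ 2.27 ms, and an
# ideal tone has spectral energy only at 440 Hz.

sr = 44100
dur = 1.0                                   # 1 s = exactly 440 periods
t = np.arange(int(sr * dur)) / sr
x = np.sin(2 * np.pi * 440.0 * t)

spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)

print(round(1000 / 440, 2))                 # period in ms: 2.27
print(freqs[np.argmax(spectrum)])           # frequency of the spectral peak
```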

    The first potential code, known as the “place” code, reflects the mechanical fil-

    tering that takes place in the cochlea of the inner ear. The basilar membrane, which

    runs the length of the fluid-filled cochlea from the base to the apex, vibrates in

Figure 3  Schematic diagram of the time waveform (upper panel; pressure in arbitrary units against time in ms) and power spectrum (lower panel; magnitude in arbitrary units against frequency in Hz) of a pure tone with a frequency of 440 Hz.



    considerably worse when the low-frequency temporal information was presented to

    the “wrong” place in the cochlea, suggesting that place information is important.

    In light of this mixed evidence, it may be safest to assume that the auditory sys-

tem uses both place and timing information from the auditory nerve in order to extract the pitch of pure tones. Indeed, some theories of pitch explicitly require both

    accurate place and timing information (Loeb, White, & Merzenich, 1983). Gaining

    a better understanding of how the information is extracted remains an important

    research goal. The question is of particular clinical relevance, as deficits in pitch

    perception are a common complaint of people with hearing loss and people with

    cochlear implants. A clearer understanding of how the brain uses information from

    the cochlea will help researchers to improve the way in which auditory prostheses,

    such as hearing aids and cochlear implants, present sound to their users.

    2. Pitch of Complex Tones

    A large majority of musical sounds are complex tones of one form or another, and

    most have a pitch associated with them. Most common are harmonic complex

    tones, which are composed of the F0 (corresponding to the repetition rate of the

    entire waveform) and upper partials, harmonics, or overtones, spaced at integer

    multiples of the F0. The pitch of a harmonic complex tone usually corresponds to

    the F0. In other words, if a subject is asked to match the pitch of a complex tone to

the pitch of a single pure tone, the best match usually occurs when the frequency of the pure tone is the same as the F0 of the complex tone. Interestingly, this is

    true even when the complex tone has no energy at the F0 or the F0 is masked

    (de Boer, 1956; Licklider, 1951; Schouten, 1940; Seebeck, 1841). This phenome-

    non has been given various terms, including pitch of the missing fundamental, peri-

    odicity pitch, residue pitch, and virtual pitch. The ability of the auditory system to

    extract the F0 of a sound is important from the perspective of perceptual constancy:

    imagine a violin note being played in a quiet room and then again in a room with a

    noisy air-conditioning system. The low-frequency noise of the air-conditioning sys-

tem might well mask some of the lower-frequency energy of the violin, including the F0, but we would not expect the pitch (or identity) of the violin to change

    because of it.

    Although the ability to extract the periodicity pitch is clearly an important one,

    and one that is shared by many different species (Shofner, 2005), exactly how the

    auditory system extracts the F0 remains for the most part unknown. The initial

    stages in processing a harmonic complex tone are shown in   Figure 4. The upper

    two panels show the time waveform and the spectral representation of a harmonic

    complex tone. The third panel depicts the filtering that occurs in the cochlea—each

point along the basilar membrane can be represented as a band-pass filter that responds to only those frequencies close to its center frequency. The fourth panel

    shows the “excitation pattern” produced by the sound. This is the average response

    of the bank of band-pass filters, plotted as a function of the filters’ center frequency

    (Glasberg & Moore, 1990). The fifth panel shows an excerpt of the time waveform

    at the output of some of the filters along the array. This is an approximation of the



Figure 4  Representations of a harmonic complex tone with a fundamental frequency (F0) of 440 Hz. The upper panel shows the time waveform. The second panel shows the power spectrum of the same waveform. The third panel shows the auditory filter bank, representing the filtering that occurs in the cochlea. The fourth panel shows the excitation pattern, or the time-averaged output of the filter bank. The fifth panel shows some sample time waveforms at the output of the filter bank, including filters centered at the F0 and the fourth harmonic, illustrating resolved harmonics, and filters centered at the 8th and 12th harmonics of the complex, illustrating harmonics that are less well resolved and show amplitude modulations at a rate corresponding to the F0.


    waveform that drives the inner hair cells in the cochlea, which in turn synapse with

    the auditory nerve fibers to produce the spike trains that the brain must interpret.

    Considering the lower two panels of  Figure 4, it is possible to see a transition

    as one moves from the low-numbered harmonics on the left to the high-numbered harmonics on the right: The first few harmonics generate distinct peaks

    in the excitation pattern, because the filters in that frequency region are narrower

    than the spacing between successive harmonics. Note also that the time waveforms

    at the outputs of filters centered at the low-numbered harmonics resemble pure

    tones. At higher harmonic numbers, the bandwidths of the auditory filters become

    wider than the spacing between successive harmonics, and so individual peaks in

    the excitation pattern are lost. Similarly, the time waveform at the output of higher-

    frequency filters no longer resembles a pure tone, but instead reflects the interac-

tion of multiple harmonics, producing a complex waveform that repeats at a rate corresponding to the F0.
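This transition can be sketched quantitatively using the equivalent rectangular bandwidth (ERB) formula of Glasberg and Moore (1990), ERB(f) = 24.7(4.37f/1000 + 1) Hz. Treating a harmonic as "resolved" whenever the filter centered on it is narrower than the harmonic spacing (the F0) is a crude criterion of my own choosing, but it lands in the 5th-to-10th-harmonic range cited below:

```python
# Glasberg & Moore (1990): ERB of the auditory filter centered at f (Hz).
def erb(f_hz):
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

# Crude resolvability criterion: filter bandwidth vs. harmonic spacing (F0).
f0 = 220.0
for n in range(1, 13):
    f = n * f0
    status = "resolved" if erb(f) < f0 else "unresolved"
    print(n, round(erb(f)), status)
```

For a 220-Hz F0 this simple rule classifies roughly the first eight harmonics as resolved, with higher harmonics falling inside increasingly broad filters that pass several components at once.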

    Harmonics that produce distinct peaks in the excitation pattern and/or produce

    quasi-sinusoidal vibrations on the basilar membrane are referred to as being

    “resolved.” Phenomenologically, resolved harmonics are those that can be “heard

    out” as separate tones under certain circumstances. Typically, we do not hear the

    individual harmonics when we listen to a musical tone, but our attention can be

    drawn to them in various ways, for instance by amplifying them or by switching

    them on and off while the other harmonics remain continuous (e.g.,   Bernstein &

Oxenham, 2003; Hartmann & Goupell, 2006). The ability to resolve or hear out individual low-numbered harmonics as pure tones was already noted by Hermann

von Helmholtz in his classic work, On the Sensations of Tone

    (Helmholtz, 1885/1954).

    The higher-numbered harmonics, which do not produce individual peaks of 

    excitation and cannot typically be heard out, are often referred to as being “unre-

    solved.” The transition between resolved and unresolved harmonics is thought to

    lie somewhere between the 5th and 10th harmonic, depending on various factors,

    such as the F0 and the relative amplitudes of the components, as well as on how

resolvability is defined (e.g., Bernstein & Oxenham, 2003; Houtsma & Smurzynski, 1990; Moore & Gockel, 2011; Shackleton & Carlyon, 1994).
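This transition can be illustrated with a short sketch. It compares the equivalent rectangular bandwidth (ERB) of the auditory filters, using the Glasberg and Moore (1990) formula cited later in this chapter, against the harmonic spacing; the simple bandwidth-versus-spacing criterion is an assumption for illustration, not the full definition of resolvability used in the literature:

```python
# Hypothetical sketch: estimate which harmonics of a complex tone are
# "resolved" by comparing the auditory filter bandwidth at each harmonic
# with the spacing (F0) between harmonics. The ERB formula is from
# Glasberg and Moore (1990); the bandwidth < spacing criterion is an
# illustrative simplification.

def erb(f_hz):
    """Equivalent rectangular bandwidth (Hz) of the auditory filter at f_hz."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def resolved_harmonics(f0, n_max=12):
    """Harmonic numbers whose local filter bandwidth is narrower than F0."""
    return [n for n in range(1, n_max + 1) if erb(n * f0) < f0]

print(resolved_harmonics(440.0))  # harmonics 1-8 pass this criterion
```

For a 440-Hz F0, this crude criterion places the transition near the 8th harmonic, within the 5th-to-10th range quoted above.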

    Numerous theories and models have been devised to explain how pitch is extracted

    from the information present in the auditory periphery (de Cheveigné, 2005). As with

    pure tones, the theories can be divided into two basic categories—place and temporal

    theories. The place theories generally propose that the auditory system uses the

    lower-order, resolved harmonics to calculate the pitch (e.g.,   Cohen, Grossberg, &

    Wyse, 1995; Goldstein, 1973; Terhardt, 1974b; Wightman, 1973). This could be

    achieved by way of a template-matching process, with either “hard-wired” harmonic

templates or templates that develop through repeated exposure to harmonic series, which eventually become associated with the F0. Temporal theories typically involve

    evaluating the time intervals between auditory-nerve spikes, using a form of autocor-

    relation or all-interval spike histogram (Cariani & Delgutte, 1996; Licklider, 1951;

    Meddis & Hewitt, 1991; Meddis & O’Mard, 1997; Schouten, Ritsma, & Cardozo,

    1962). This information can be obtained from both resolved and unresolved harmonics.


    Pooling these spikes from across the nerve array results in a dominant interval

    emerging that corresponds to the period of the waveform (i.e., the reciprocal of the

    F0). A third alternative involves using both place and temporal information. In one

version, coincident timing between neurons with harmonically related CFs is postulated to lead to a spatial network of coincidence detectors—a place-based template

    that emerges through coincident timing information (Shamma & Klein, 2000). In

    another version, the impulse-response time of the auditory filters, which depends on

    the CF, is postulated to determine the range of periodicities that a certain tonotopic

    location can code (de Cheveigné & Pressnitzer, 2006). Recent physiological studies

have supported at least the plausibility of place-time mechanisms to code pitch

    (Cedolin & Delgutte, 2010).
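The autocorrelation idea underlying the temporal theories can be sketched in a few lines of code. This is a deliberately simplified, hypothetical illustration, with no cochlear filtering or spike generation: a waveform built from only high-numbered harmonics still yields its strongest autocorrelation peak at the period of the missing fundamental:

```python
import math

# Simplified sketch of the autocorrelation account of pitch: build a
# harmonic complex containing only high-numbered (nominally unresolved)
# harmonics and recover the period of the missing fundamental from the
# largest autocorrelation peak.

def harmonic_complex(f0, harmonics, fs, dur):
    """Sampled sum of equal-amplitude cosine harmonics of f0."""
    n = int(fs * dur)
    return [sum(math.cos(2 * math.pi * h * f0 * t / fs) for h in harmonics)
            for t in range(n)]

def autocorr_f0(x, fs, fmin=50.0, fmax=500.0):
    """Return fs / (lag of the largest autocorrelation peak)."""
    def ac(lag):
        return sum(x[i] * x[i + lag] for i in range(len(x) - lag))
    lags = range(int(fs / fmax), int(fs / fmin) + 1)
    best = max(lags, key=ac)
    return fs / best

fs = 8000.0
x = harmonic_complex(100.0, [8, 9, 10, 11, 12], fs, 0.05)
print(round(autocorr_f0(x, fs)))  # recovers the 100-Hz missing fundamental
```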

    Distinguishing between place and temporal (or place-time) models of pitch has

proved very difficult. In part, this is because spectral and temporal representations of a signal are mathematically equivalent: any change in the spectral representation

    will automatically lead to a change in the temporal representation, and vice versa.

    Psychoacoustic attempts to distinguish between place and temporal mechanisms

    have focused on the limits imposed by the peripheral physiology in the cochlea and

    auditory nerve. For instance, the limits of frequency selectivity can be used to test

the place theory: if all harmonics are clearly unresolved (and therefore provide

    no place information) and a pitch is still heard, then pitch cannot depend solely on

    place information. Similarly, the putative limits of phase-locking can be used: if 

the periodicity of the waveform and the frequencies of all the resolved harmonics are all above the limit of phase locking in the auditory nerve and a pitch is still

    heard, then temporal information is unlikely to be necessary for pitch perception.

    A number of studies have shown that pitch perception is possible even when

    harmonic tone complexes are filtered to remove all the low-numbered, resolved

    harmonics (Bernstein & Oxenham, 2003; Houtsma & Smurzynski, 1990;

    Kaernbach & Bering, 2001; Shackleton & Carlyon, 1994). A similar conclusion

    was reached by studies that used amplitude-modulated broadband noise, which has

    no spectral peaks in its long-term spectrum (Burns & Viemeister, 1976, 1981).

These results suggest that pitch can be extracted from temporal information alone, thereby ruling out theories that consider only place coding. However, the pitch sen-

    sation produced by unresolved harmonics or modulated noise is relatively weak 

    compared with the pitch of musical instruments, which produce full harmonic

    complex tones.

    The more salient pitch that we normally associate with music is provided by

    the lower-numbered resolved harmonics. Studies that have investigated the

    relative contributions of individual harmonics have found that harmonics 3 to 5

    (Moore, Glasberg, & Peters, 1985), or frequencies around 600 Hz (Dai, 2000),

seem to have the most influence on the pitch of the overall complex. This is where current temporal models also encounter some difficulty: they are able to extract the

    F0 of a complex tone as well from unresolved harmonics as from resolved harmo-

    nics, and therefore they do not predict the large difference in pitch salience and

    accuracy between low- and high-numbered harmonics that is observed in psycho-

    physical studies (Carlyon, 1998). In other words, place models do not predict good


    enough performance with unresolved harmonics, whereas temporal models predict

    performance that is too good. The apparently qualitative and quantitative difference

    in the pitch produced by low-numbered and high-numbered harmonics has led to the

suggestion that there may be two pitch mechanisms at work, one to code the temporal envelope repetition rate from high-numbered harmonics and one to code the

    F0 from the individual low-numbered harmonics (Carlyon & Shackleton, 1994),

    although subsequent work has questioned some of the evidence proposed for the two

    mechanisms (Gockel, Carlyon, & Plack, 2004; Micheyl & Oxenham, 2003).

    The fact that low-numbered, resolved harmonics are important suggests that

    place coding may play a role in everyday pitch. Further evidence comes from a

    variety of studies. The study mentioned earlier that used tones with low-frequency

    temporal information transposed into a high-frequency range (Oxenham et al.,

2004) also examined complex-tone pitch perception by transposing the information from harmonics 3, 4, and 5 of a 100-Hz F0 to high-frequency regions of the cochlea—

    roughly 4 kHz, 6 kHz, and 10 kHz. If temporal information was sufficient to elicit

    a periodicity pitch, then listeners should have been able to hear a pitch correspond-

    ing to 100 Hz. In fact, none of the listeners reported hearing a low pitch or was

    able to match the pitch of the transposed tones to that of the missing fundamental.

    This suggests that, if temporal information is used, it may need to be presented to

    the “correct” place along the cochlea.

    Another line of evidence has come from revisiting early conclusions that no

pitch is heard when all the harmonics are above about 5 kHz (Ritsma, 1962). The initial finding led researchers to suggest that timing information was crucial and

    that at frequencies above the limits of phase locking, periodicity pitch was not per-

    ceived. A recent study revisited this conclusion and found that, in fact, listeners

    were well able to hear pitches between 1 and 2 kHz, even when all the harmonics

    were filtered to be above 6 kHz, and were sufficiently resolved to ensure that no

    temporal envelope cues were available (Oxenham et al., 2011). This outcome leads

    to an interesting dissociation: tones above 6 kHz on their own do not produce a

    musically useful pitch; however, those same tones when combined with others in a

harmonic series can produce a musical pitch sufficient to convey a melody. The results suggest that the upper limit of musical pitch may not in fact be explained by

    the upper limit of phase locking: the fact that pitch can be heard even when all

    tones are above 5 kHz suggests either that temporal information is not necessary

    for musical pitch or that usable phase locking in the human auditory nerve extends

    to much higher frequencies than currently believed (Heinz, Colburn, & Carney,

    2001; Moore & Sęk, 2009).

    A further line of evidence for the importance of place information has come from

    studies that have investigated the relationship between pitch accuracy and auditory

filter bandwidths. Moore and Peters (1992) investigated the relationship between auditory filter bandwidths, measured using spectral masking techniques (Glasberg &

    Moore, 1990), pure-tone frequency discrimination, and complex-tone F0 discrimi-

    nation in young and elderly people with normal and impaired hearing. People

    with hearing impairments were tested because they often have auditory filter band-

    widths that are broader than normal. A wide range of results were found—some


    participants with normal filter bandwidths showed impaired pure-tone and

    complex-tone pitch discrimination thresholds; others with abnormally wide filters

    still had relatively normal pure-tone pitch discrimination thresholds. However,

none of the participants with broadened auditory filters had normal F0 discrimination thresholds, suggesting that perhaps broader filters resulted in fewer or no

    resolved harmonics and that resolved harmonics are necessary for accurate F0 dis-

    crimination. This question was pursued later by   Bernstein and Oxenham (2006a,

    2006b), who systematically increased the lowest harmonic present in a harmonic

    complex tone and measured the point at which F0 discrimination thresholds wors-

    ened. In normal-hearing listeners, there is quite an abrupt transition from good

    to poor pitch discrimination as the lowest harmonic present is increased from the

    9th to the 12th (Houtsma & Smurzynski, 1990). Bernstein and Oxenham reasoned

    that if the transition point is related to frequency selectivity and the resolvability of the harmonics, then the transition point should decrease to lower harmonic numbers

    as the auditory filters become wider. They tested this in hearing-impaired listeners

    and found a significant correlation between the transition point and the estimated

    bandwidth of the auditory filters (Bernstein & Oxenham, 2006b), suggesting that

    harmonics may need to be resolved in order to elicit a strong musical pitch.

    Interestingly, even though resolved harmonics may be  necessary  for accurate pitch

    perception, they may not be   sufficient .   Bernstein and Oxenham (2003)  increased

    the number of resolved harmonics available to listeners by presenting alternating

harmonics to opposite ears. In this way, the spacing between successive components in each ear was doubled, thereby doubling the number of peripherally

    resolved harmonics. Listeners were able to hear out about twice as many harmonics

    in this new condition, but that did not improve their pitch discrimination thresholds

    for the complex tone. In other words, providing access to harmonics that are

    not normally resolved does not improve pitch perception abilities. These results are

    consistent with theories that rely on pitch templates. If harmonics are not normally

    available to the auditory system, they would be unlikely to be incorporated

    into templates and so would not be expected to contribute to the pitch percept

when presented by artificial means, such as presenting them to alternate ears.

Most sounds in our world, including those produced by musical instruments,

    tend to have more energy at low frequencies than at high; on average, spectral

amplitude decreases at a rate of about 1/f, or −6 dB/octave. It therefore makes sense

    that the auditory system would rely on the lower numbered harmonics to determine

    pitch, as these are the ones that are most likely to be audible. Also, resolved harmo-

    nics—ones that produce a peak in the excitation pattern and elicit a sinusoidal tem-

    poral response—are much less susceptible to the effects of room reverberation than

    are unresolved harmonics. Pitch discrimination thresholds for unresolved harmonics

are relatively good (approximately 2%) when all the components have the same starting phase (as in a stream of pulses). However, thresholds are much worse when the phase

    relationships are scrambled, as they would be in a reverberant hall or church, and

    listeners’ discrimination thresholds can be as poor as 10%—more than a musical

    semitone. In contrast, the response to resolved harmonics is not materially affected

    by reverberation: changing the starting phase of a single sinusoid does not affect its


waveshape—it still remains a sinusoid, with frequency discrimination thresholds

    of considerably less than 1%.
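The phase sensitivity of unresolved harmonics can be illustrated numerically. In the sketch below (with arbitrary parameter choices), the crest factor, the peak-to-RMS ratio, serves as a crude proxy for the envelope cues at issue: scrambling the starting phases of a set of high-numbered harmonics flattens the summed waveform, whereas a single sinusoid is unaffected by its starting phase:

```python
import math, random

# Illustration: the crest factor (peak-to-RMS ratio) of a sum of
# high-numbered harmonics drops when the starting phases are scrambled,
# mimicking the effect of reverberation on unresolved harmonics.
# All parameter choices are arbitrary.

def crest_factor(f0, harmonics, phases, fs=8000, dur=0.05):
    """Peak-to-RMS ratio of a sum of unit-amplitude harmonics of f0."""
    n = int(fs * dur)
    x = [sum(math.cos(2 * math.pi * h * f0 * t / fs + p)
             for h, p in zip(harmonics, phases)) for t in range(n)]
    rms = math.sqrt(sum(v * v for v in x) / n)
    return max(abs(v) for v in x) / rms

random.seed(0)
harm = [8, 9, 10, 11, 12]                      # high-numbered harmonics
aligned = crest_factor(100.0, harm, [0.0] * 5)
scrambled = crest_factor(100.0, harm,
                         [random.uniform(0, 2 * math.pi) for _ in harm])
print(aligned > scrambled)  # aligned phases give the peakier waveform
```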

    A number of physiological and neuroimaging studies have searched for represen-

tations of pitch beyond the cochlea (Winter, 2005). Potential correlates of periodicity have been found in single- and multi-unit studies of the cochlear nucleus (Winter,

    Wiegrebe, & Patterson, 2001), in the inferior colliculus (Langner & Schreiner,

    1988), and auditory cortex (Bendor & Wang, 2005). Human neuroimaging studies

    have also found correlates of periodicity in the brainstem (Griffiths, Uppenkamp,

    Johnsrude, Josephs, & Patterson, 2001) as well as in auditory cortical structures

    (Griffiths, Buchel, Frackowiak, & Patterson, 1998). More recently,   Penagos,

    Melcher, and Oxenham (2004)   identified a region in human auditory cortex that

    seemed sensitive to the degree of pitch salience, as opposed to physical parameters,

such as F0 or spectral region. However, these studies are also not without some controversy. For instance, Hall and Plack (2009) failed to find any single region in the

    human auditory cortex that responded to pitch, independent of other stimulus para-

    meters. Similarly, in a physiological study of the ferret’s auditory cortex,   Bizley,

    Walker, Silverman, King, and Schnupp (2009) found interdependent coding of pitch,

    timbre, and spatial location and did not find any pitch-specific region.

    In summary, the pitch of single harmonic complex tones is determined primarily

    by the first 5 to 8 harmonics, which are also those thought to be resolved in the

    peripheral auditory system. To extract the pitch, the auditory system must somehow

    combine and synthesize information from these harmonics. Exactly how this occursin the auditory system remains a matter of ongoing research.

    C. Timbre

    The official ANSI definition of timbre is: “That attribute of auditory sensation

    which enables a listener to judge that two nonidentical sounds, similarly presented

    and having the same loudness and pitch, are dissimilar” (ANSI, 1994). The stan-

    dard goes on to note that timbre depends primarily on the frequency spectrum of the sound, but can also depend on the sound pressure and temporal characteristics.

    In other words, anything that is not pitch or loudness is timbre. As timbre has its

    own chapter in this volume (Chapter 2), it will not be discussed further here.

    However, timbre makes an appearance in the next section, where its influence on

    pitch and loudness judgments is addressed.

    D. Sensory Interactions and Cross-Modal Influences

    The auditory sensations of loudness, pitch, and timbre are for the most part studied

    independently. Nevertheless, a sizeable body of evidence suggests that these sen-

    sory dimensions are not strictly independent. Furthermore, other sensory modali-

    ties, in particular vision, can have sizeable effects on auditory judgments of 

    musical sounds.


    1. Pitch and Timbre Interactions

Pitch and timbre are the two dimensions most likely to be confused, particularly by people without any musical training. Increasing the F0 of a complex tone results in

an increase in pitch, whereas changing the spectral center of gravity of a tone increases

    its brightness—one aspect of timbre (Figure 5). In both cases, when asked to describe

    the change, many listeners would simply say that the sound was “higher.”
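One way to make the distinction concrete is with a toy spectral centroid, a standard proxy for brightness; the harmonic amplitudes below are arbitrary illustrative values:

```python
# Toy illustration: the spectral centroid (a common proxy for brightness)
# can change while the F0, and hence the pitch, stays fixed.
# The amplitude values are arbitrary.

def spectral_centroid(f0, amps):
    """Amplitude-weighted mean frequency of a harmonic complex whose
    h-th harmonic has amplitude amps[h - 1]."""
    num = sum(h * f0 * a for h, a in enumerate(amps, start=1))
    return num / sum(amps)

dull = spectral_centroid(220.0, [1.0, 0.5, 0.25, 0.125])    # low-harmonic weighting
bright = spectral_centroid(220.0, [0.125, 0.25, 0.5, 1.0])  # high-harmonic weighting
print(dull < bright)  # same F0 (pitch), but the second tone is "brighter"
```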

    In general, listeners find it hard to ignore changes in timbre when making pitch

     judgments. Numerous studies have shown that the JND for F0 increases when

    the two sounds to be compared also vary in spectral content (e.g.,   Borchert,

    Micheyl, & Oxenham, 2011; Faulkner, 1985; Moore & Glasberg, 1990). In principle,

this could be because the change in spectral shape actually affects pitch or because listeners have difficulty ignoring timbre changes and concentrating solely on pitch.

    Studies using pitch matching have generally found that harmonic complex tones are

    best matched with a pure-tone frequency corresponding to the F0, regardless of 

    the spectral content of the complex tone (e.g., Patterson, 1973), which means that the

    detrimental effects of differing timbre may be related more to a “distraction” effect

    than to a genuine change in pitch (Moore & Glasberg, 1990).

    2. Effects of Pitch or Timbre Changes on the Accuracy of Loudness

     Judgments

    Just as listeners have more difficulty judging pitch in the face of varying timbre,

    loudness comparisons between two sounds become much more challenging when

    either the pitch or timbre of the two sounds differs. Examples include the difficulty

    of making loudness comparisons between two pure tones of different frequency

Figure 5 Representations of F0 and spectral peak, which primarily affect the sensations of pitch and timbre, respectively. [The figure shows four schematic spectra (level in dB versus frequency), combining low or high F0 with low or high spectral peak: increasing the F0 increases pitch, whereas raising the spectral peak increases brightness.]


    (Gabriel et al., 1997; Oxenham & Buus, 2000), and the difficulty of making loud-

    ness comparisons between tones of differing duration, even when they have the

    same frequency (Florentine, Buus, & Robinson, 1998).

    3. Visual Influences on Auditory Attributes

    As anyone who has watched a virtuoso musician will know, visual input affects the

    aesthetic experience of the audience. More direct influences of vision on auditory

    sensations, and vice versa, have also been reported in recent years. For instance,

    noise that is presented simultaneously with a light tends to be rated as louder than

    noise presented without light (Odgaard, Arieh, & Marks, 2004). Interestingly, this

    effect appears to be sensory in nature, rather than a “late-stage” decisional effect,

    or shift in criterion; in contrast, similar effects of noise on the apparent brightness

    of light (Stein, London, Wilkinson, & Price, 1996) seem to stem from higher-level

    decisional and criterion-setting mechanisms (Odgaard, Arieh, & Marks, 2003).

    On the other hand, recent combinations of behavioral and neuroimaging techniques

    have suggested that the combination of sound with light can result in increased sen-

    sitivity to low-level light, which is reflected in changes in activation of sensory cor-

    tices (Noesselt et al., 2010).

    Visual cues can also affect other attributes of sound. For instance, Schutz and

    colleagues (Schutz & Kubovy, 2009; Schutz & Lipscomb, 2007) have shown that

    the gestures made in musical performance can affect the perceived duration of a

    musical sound: a short or “staccato” gesture by a marimba player led to shorter

     judged durations of the tone than a long gesture by the player, even though the

    tone itself was identical. Interestingly, this did not hold for sustained sounds, such

    as a clarinet, where visual information had much less impact on duration judg-

    ments. The difference may relate to the exponential decay of percussive sounds,

    which have no clearly defined end, allowing the listeners to shift their criterion for

    the end point to better match the visual information.

    III. Perception of Sound Combinations

     A. Object Perception and Grouping 

    When a musical tone, such as a violin note or a sung vowel, is presented, we normally

    hear a single sound with a single pitch, even though the note actually consists of 

    many different pure tones, each with its own frequency and pitch. This “perceptual

    fusion” is partly because all the pure tones begin and end at roughly the same time,

    and partly because they form a single harmonic series (Darwin, 2005). The impor-

tance of onset and offset synchrony can be demonstrated by delaying one of the components relative to all the others. A delay of only a few tens of milliseconds is

    sufficient for the delayed component to “pop out” and be heard as a separate

    object. Similarly, if one component is mistuned compared to the rest of the com-

    plex, it will be heard out as a separate object, provided the mistuning is sufficiently

    large. For low-numbered harmonics, mistuning a harmonic by between 1 and 3% is


    sufficient for it to “pop out” (Moore, Glasberg, & Peters, 1986). Interestingly, a

    mistuned harmonic can be heard separately, but can still contribute to the overall

    pitch of the complex; in fact a single mistuned harmonic continues to contribute to

    the overall pitch of the complex, even when it is mistuned by as much as 8%—well above the threshold for hearing it out as a separate object (Darwin & Ciocca,

    1992; Darwin, Hukin, & al-Khatib, 1995; Moore et al., 1985). This is an example

    of a failure of “disjoint allocation”—a single component is not disjointly allocated

    to just a single auditory object (Liberman, Isenberg, & Rakerd, 1981; Shinn-

    Cunningham, Lee, & Oxenham, 2007).

    B. Perceiving Multiple Pitches

    How many tones can we hear at once? Considering all the different instruments in

    an orchestra, one might expect the number to be quite high, and a well-trained con-

    ductor will in many cases be able to hear a wrong note played by a single instru-

    ment within that orchestra. But are we aware of all the pitches being presented at

    once, and can we count them? Huron (1989) suggested that the number of indepen-

dent “voices” we can perceive and count is actually rather low. He used

sounds of homogeneous timbre (organ notes) and played participants sections from a

    piece of polyphonic organ music by J. S. Bach with between one and five voices

playing simultaneously. Despite the fact that most of the participants were musically trained, their ability to judge accurately the number of voices present

    decreased dramatically when the number of voices actually present exceeded three.

    Using much simpler stimuli, consisting of several simultaneous pure tones,

    Demany and Ramos (2005)   made the interesting discovery that participants could

    not tell whether a certain tone was present or absent from the chord, but they

    noticed if its frequency was changed in the next presentation. In other words, lis-

    teners detected a change in the frequency of a tone that was itself undetected.

    Taken together with the results of   Huron (1989), the data suggest that the pitches

of many tones can be processed simultaneously, but that listeners may only be consciously aware of a subset of three or four at any one time.

    C. The Role of Frequency Selectivity in the Perception of Multiple Tones

    1. Roughness

    When two pure tones of differing frequency are added, the resulting waveform

    fluctuates in amplitude at a rate corresponding to the difference of the two frequen-

cies. These amplitude fluctuations, or “beats,” are illustrated in Figure 6, which shows how the two tones are sometimes in phase, and add constructively (A), and

    sometimes out of phase, and so cancel (B). At beat rates of less than about 10 Hz,

    we hear the individual fluctuations, but once the rate increases above about 12 Hz,

    we are no longer able to follow the individual fluctuations and instead perceive a

    “rough” sound (Daniel & Weber, 1997; Terhardt, 1974a).
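A short simulation can confirm that the beat rate equals the frequency difference; the windowing scheme and threshold below are ad hoc choices for illustration:

```python
import math

# Simulation of beats: sum two pure tones and count envelope minima per
# second; the count matches the difference frequency. The 5-ms window
# and 10% threshold are ad hoc choices.

def beat_rate(f1, f2, fs=8000, dur=1.0):
    """Count per-second envelope dips for the sum of two pure tones."""
    n = int(fs * dur)
    x = [math.cos(2 * math.pi * f1 * t / fs) +
         math.cos(2 * math.pi * f2 * t / fs) for t in range(n)]
    win = int(0.005 * fs)  # crude envelope: max |x| in 5-ms windows
    env = [max(abs(v) for v in x[i:i + win]) for i in range(0, n - win, win)]
    thresh = 0.1 * max(env)
    dips, in_dip = 0, False
    for e in env:  # count contiguous runs of near-silent windows
        if e < thresh and not in_dip:
            dips, in_dip = dips + 1, True
        elif e >= thresh:
            in_dip = False
    return dips

print(beat_rate(440.0, 444.0))  # beats at the 4-Hz difference frequency
```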


    2. Pitch Perception of Multiple Sounds

    Despite the important role of tone combinations or chords in music, relatively few

    psychoacoustic studies have examined their perception.   Beerends and Houtsma

    (1989)   used complex tones consisting of just two consecutive harmonics each.

    Although the pitch of these two-component complexes is relatively weak, with prac-

    tice, listeners can learn to accurately identify the F0 of such complexes. Beerends

    and Houtsma found that listeners were able to identify the pitches of the two com-

    plex tones, even if the harmonics from one sound were presented to different ears.

    The only exception was when all the components were presented to one ear and

    none of the four components was deemed to be “resolved.” In that case, listeners

    were not able to identify either pitch accurately.

    Carlyon (1996) used harmonic tone complexes with more harmonics and filtered

    them so that they had completely overlapping spectral envelopes. He found that

    when both complexes were composed of resolved harmonics, listeners were able to

    hear out the pitch of one complex in the presence of the other. However, the sur-

    prising finding was that when both complexes comprised only unresolved harmo-

    nics, then listeners did not hear a pitch at all, but described the percept as an

    unmusical “crackle.” To avoid ambiguity,   Carlyon (1996)   used harmonics that

    were either highly resolved or highly unresolved. Because of this, it remained

    unclear whether it is the resolvability of the harmonics before or after the two

    sounds are mixed that determines whether each tone elicits a clear pitch. Micheyl

    and colleagues addressed this issue, using a variety of combinations of spectral

    region and F0 to vary the relative resolvability of the components (Micheyl,

    Bernstein, & Oxenham, 2006; Micheyl, Keebler, & Oxenham, 2010). By compar-

    ing the results to simulations of auditory filtering, they found that good pitch dis-

    crimination was only possible when at least two of the harmonics from the target

    sound were deemed resolved   after   being mixed with the other sound (Micheyl

    et al., 2010). The results are consistent with place theories of pitch that rely on

    resolved harmonics; however, it may be possible to adapt timing-based models of 

    pitch to similarly explain the phenomena (e.g.,  Bernstein & Oxenham, 2005).

    D. Consonance and Dissonance

    The question of how certain combinations of tones sound when played together

    is central to many aspects of music theory. Combinations of two tones that form

    certain musical intervals, such as the octave and the fifth, are typically deemed as

    sounding pleasant or consonant, whereas others, such as the augmented fourth (tri-

    tone), are often considered unpleasant or dissonant. These types of percepts involv-

ing tones presented in isolation of a musical context have been termed sensory consonance or dissonance. The term musical consonance (Terhardt, 1976, 1984)

    subsumes sensory factors, but also includes many other factors that contribute to

    whether a sound combination is judged as consonant or dissonant, including the

    context (what sounds preceded it), the style of music (e.g., jazz or classical), and

    presumably also the personal taste and musical history of the individual listener.


    There has been a long-standing search for acoustic and physiological correlates

    of consonance and dissonance, going back to the observations of Pythagoras that

    strings whose lengths had a small-number ratio relationship (e.g., 2:1 or 3:2)

sounded pleasant together. Helmholtz (1885/1954) suggested that consonance may be related to the absence of beats (perceived as roughness) in musical sounds.

    Plomp and Levelt (1965) developed the idea further by showing that the ranking by

    consonance of musical intervals within an octave was well predicted by the number

    of component pairs within the two complex tones that fell within the same auditory

    filters and therefore caused audible beats (see also   Kameoka & Kuriyagawa,

    1969a, 1969b). When two complex tones form a consonant interval, such as an

    octave or a fifth, the harmonics are either exactly coincident, and so do not produce

    beats, or are spaced so far apart as to not produce strong beats. In contrast, when

the tones form a dissonant interval, such as a minor second, none of the components are coincident, but many are close enough to produce beats.
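This beating account can be sketched by counting the partial pairs that fall within roughly one auditory filter, here estimated with the Glasberg and Moore (1990) ERB formula; the limits separating “coincident” from “beating” pairs are ad hoc assumptions for illustration:

```python
# Sketch of the Plomp and Levelt (1965) idea: count pairs of partials
# (one from each tone) close enough in frequency to interact within one
# auditory filter and beat. The ERB formula is from Glasberg and Moore
# (1990); the 1% "coincident" floor and one-ERB ceiling are ad hoc.

def erb(f_hz):
    """Equivalent rectangular bandwidth (Hz) at frequency f_hz."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def beating_pairs(f0_a, f0_b, n_harm=6):
    """Count partial pairs close enough to beat but not coincident."""
    pa = [h * f0_a for h in range(1, n_harm + 1)]
    pb = [h * f0_b for h in range(1, n_harm + 1)]
    return sum(1 for x in pa for y in pb
               if 0.01 * min(x, y) < abs(x - y) < erb((x + y) / 2))

minor_second = beating_pairs(440.0, 466.0)
fifth = beating_pairs(440.0, 660.0)
print(minor_second > fifth)  # the dissonant interval beats more
```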

    Another alternative theory of consonance is based on the “harmonicity” of the

    sound combination, or how closely it resembles a single harmonic series. Consider,

    for instance, two complex tones that form the interval of a perfect fifth, with F0s of 

    440 and 660 Hz. All the components from both tones are multiples of a single

    F0—220 Hz—and so, according to the harmonicity account of consonance, should

    sound consonant. In contrast, the harmonics of two tones that form an augmented

    fourth, with F0s of 440 Hz and 622 Hz, do not approximate any single harmonic

series within the range of audible pitches and so should sound dissonant, as found empirically. The harmonicity theory of consonance can be implemented by using a

    spectral template model (Terhardt, 1974b) or by using temporal information,

    derived for instance from spikes in the auditory nerve (Tramo, Cariani, Delgutte, &

    Braida, 2001).
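    The worked example above reduces to a simple calculation: rounding the F0s to integer hertz and taking their greatest common divisor gives the implied common fundamental of the combined series. This is only a toy sketch of the harmonicity idea, not the template or temporal models cited in the text; the function names and the 30-Hz lower pitch limit are illustrative assumptions, and a real model would tolerate slight mistuning rather than require exact integer ratios.

```python
# Toy sketch of the harmonicity account: two harmonic tones are treated
# as consonant if all their partials are multiples of a single common F0
# that lies within the range of audible pitches.
from math import gcd

def implied_f0(f0_a_hz, f0_b_hz):
    """Largest common fundamental (Hz) of two harmonic tones, integer Hz."""
    return gcd(round(f0_a_hz), round(f0_b_hz))

def is_consonant(f0_a_hz, f0_b_hz, lowest_audible_pitch=30.0):
    """Consonant if the implied common F0 falls in the pitch range (assumed ~30 Hz floor)."""
    return implied_f0(f0_a_hz, f0_b_hz) >= lowest_audible_pitch

print(implied_f0(440, 660), is_consonant(440, 660))  # fifth: 220 Hz, True
print(implied_f0(440, 622), is_consonant(440, 622))  # tritone: 2 Hz, False
```

    For the perfect fifth the implied F0 is 220 Hz, a clearly audible pitch; for the augmented fourth the common divisor collapses to 2 Hz, far below any pitch percept, matching the dissonance reported empirically.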

    Because the beating and harmonicity theories of consonance and dissonance produce very similar predictions, it has been difficult to distinguish between them experimentally. A recent study took a step toward this goal by examining individual differences in a large group (more than 200) of participants (McDermott, Lehr, & Oxenham, 2010). First, listeners were asked to provide preference ratings for "diagnostic" stimuli that varied in beating but not harmonicity, or vice versa. Next, listeners were asked to provide preference ratings for various musical sound combinations, including dyads (two-note chords) and triads (three-note chords), using natural and artificial musical instruments and voices. When the ratings in the two types of tasks were compared, the correlations between the ratings for the harmonicity diagnostic tests and the musical sounds were significant, but the correlations between the ratings for the beating diagnostic tests and the musical sounds were not. Interestingly, the number of years of formal musical training also correlated with both the harmonicity and musical preference ratings, but not with the beating ratings. Overall, the results suggested that harmonicity, rather than lack of beating, underlies listeners' consonance preferences and that musical training may amplify the preference for harmonic relationships.

    Developmental studies have shown that infants as young as 3 or 4 months show a preference for consonant over dissonant musical intervals (Trainor & Heinmiller, 1998; Zentner & Kagan, 1996, 1998). However, it is not yet known whether infants are responding more to beats or inharmonicity, or both. It would be interesting to discover whether the adult preferences for harmonicity revealed by McDermott et al. (2010) are shared by infants, or whether infants initially base their preferences on acoustic beats.

    IV. Conclusions and Outlook

    Although the perception of musical tones should be considered primarily in musical contexts, much about the interactions between acoustics, auditory physiology, and perception can be learned through psychoacoustic experiments using relatively simple stimuli and procedures. Recent findings using psychoacoustics, alone or in combination with neurophysiology and neuroimaging, have extended our knowledge of how pitch, timbre, and loudness are perceived and represented neurally, both for tones in isolation and in combination. However, much still remains to be discovered. Important trends include the use of more naturalistic stimuli in experiments and for testing computational models of perception, as well as the simultaneous combination of perceptual and neural measures when attempting to elucidate the underlying neural mechanisms of auditory perception. Using the building blocks provided by the psychoacoustics of individual and simultaneous musical tones, it is possible to proceed to answering much more sophisticated questions regarding the perception of music as it unfolds over time. These and other issues are tackled in the remaining chapters of this volume.

    Acknowledgments

    Emily Allen, Christophe Micheyl, and John Oxenham provided helpful comments on an earlier version of this chapter. The work from the author's laboratory is supported by funding from the National Institutes of Health (Grants R01 DC 05216 and R01 DC 07657).

    References

    American National Standards Institute. (1994). Acoustical terminology. ANSI S1.1-1994. New York, NY: Author.

    Arieh, Y., & Marks, L. E. (2003a). Recalibrating the auditory system: A speed-accuracy analysis of intensity perception. Journal of Experimental Psychology: Human Perception and Performance, 29, 523–536.

    Arieh, Y., & Marks, L. E. (2003b). Time course of loudness recalibration: Implications for loudness enhancement. Journal of the Acoustical Society of America, 114, 1550–1556.

    Attneave, F., & Olson, R. K. (1971). Pitch as a medium: A new approach to psychophysical scaling. American Journal of Psychology, 84, 147–166.

    Beerends, J. G., & Houtsma, A. J. M. (1989). Pitch identification of simultaneous diotic and dichotic two-tone complexes. Journal of the Acoustical Society of America, 85, 813–819.

    Bendor, D., & Wang, X. (2005). The neuronal representation of pitch in primate auditory cortex. Nature, 436, 1161–1165.

    Bernstein, J. G., & Oxenham, A. J. (2003). Pitch discrimination of diotic and dichotic tone complexes: Harmonic resolvability or harmonic number? Journal of the Acoustical Society of America, 113, 3323–3334.

    Bernstein, J. G., & Oxenham, A. J. (2005). An autocorrelation model with place dependence to account for the effect of harmonic number on fundamental frequency discrimination. Journal of the Acoustical Society of America, 117, 3816–3831.

    Bernstein, J. G., & Oxenham, A. J. (2006a). The relationship between frequency selectivity and pitch discrimination: Effects of stimulus level. Journal of the Acoustical Society of America, 120, 3916–3928.

    Bernstein, J. G., & Oxenham, A. J. (2006b). The relationship between frequency selectivity and pitch discrimination: Sensorineural hearing loss. Journal of the Acoustical Society of America, 120, 3929–3945.

    Bizley, J. K., Walker, K. M., Silverman, B. W., King, A. J., & Schnupp, J. W. (2009). Interdependent encoding of pitch, timbre, and spatial location in auditory cortex. Journal of Neuroscience, 29, 2064–2075.

    Borchert, E. M., Micheyl, C., & Oxenham, A. J. (2011). Perceptual grouping affects pitch judgments across time and frequency. Journal of Experimental Psychology: Human Perception and Performance, 37, 257–269.

    Burns, E. M., & Viemeister, N. F. (1976). Nonspectral pitch. Journal of the Acoustical Society of America, 60, 863–869.

    Burns, E. M., & Viemeister, N. F. (1981). Played again SAM: Further observations on the pitch of amplitude-modulated noise. Journal of the Acoustical Society of America, 70, 1655–1660.

    Buus, S., Muesch, H., & Florentine, M. (1998). On loudness at threshold. Journal of the Acoustical Society of America, 104, 399–410.

    Cariani, P. A., & Delgutte, B. (1996). Neural correlates of the pitch of complex tones. I. Pitch and pitch salience. Journal of Neurophysiology, 76, 1698–1716.

    Carlyon, R. P. (1996). Encoding the fundamental frequency of a complex tone in the presence of a spectrally overlapping masker. Journal of the Acoustical Society of America, 99, 517–524.

    Carlyon, R. P. (1998). Comments on "A unitary model of pitch perception" [Journal of the Acoustical Society of America, 102, 1811–1820 (1997)]. Journal of the Acoustical Society of America, 104, 1118–1121.

    Carlyon, R. P., & Shackleton, T. M. (1994). Comparing the fundamental frequencies of resolved and unresolved harmonics: Evidence for two pitch mechanisms? Journal of the Acoustical Society of America, 95, 3541–3554.

    Cedolin, L., & Delgutte, B. (2010). Spatiotemporal representation of the pitch of harmonic complex tones in the auditory nerve. Journal of Neuroscience, 30, 12712–12724.

    Chalupper, J., & Fastl, H. (2002). Dynamic loudness model (DLM) for normal and hearing-impaired listeners. Acta Acustica united with Acustica, 88, 378–386.

    Chen, Z., Hu, G., Glasberg, B. R., & Moore, B. C. (2011). A new method of calculating auditory excitation patterns and loudness for steady sounds. Hearing Research, 282(1–2), 204–215.

    Glasberg, B. R., & Moore, B. C. J. (1990). Derivation of auditory filter shapes from notched-noise data. Hearing Research, 47, 103–138.

    Glasberg, B. R., & Moore, B. C. J. (2002). A model of loudness applicable to time-varying sounds. Journal of the Audio Engineering Society, 50, 331–341.

    Gockel, H., Carlyon, R. P., & Plack, C. J. (2004). Across-frequency interference effects in fundamental frequency discrimination: Questioning evidence for two pitch mechanisms. Journal of the Acoustical Society of America, 116, 1092–1104.

    Goldstein, J. L. (1973). An optimum processor theory for the central formation of the pitch of complex tones. Journal of the Acoustical Society of America, 54, 1496–1516.

    Griffiths, T. D., Buchel, C., Frackowiak, R. S., & Patterson, R. D. (1998). Analysis of temporal structure in sound by the human brain. Nature Neuroscience, 1, 422–427.

    Griffiths, T. D., Uppenkamp, S., Johnsrude, I., Josephs, O., & Patterson, R. D. (2001). Encoding of the temporal regularity of sound in the human brainstem. Nature Neuroscience, 4, 633–637.

    Hall, D. A., & Plack, C. J. (2009). Pitch processing sites in the human auditory brain. Cerebral Cortex, 19, 576–585.

    Hartmann, W. M., & Goupell, M. J. (2006). Enhancing and unmasking the harmonics of a complex tone. Journal of the Acoustical Society of America, 120, 2142–2157.

    Heinz, M. G., Colburn, H. S., & Carney, L. H. (2001). Evaluating auditory performance limits: I. One-parameter discrimination using a computational model for the auditory nerve. Neural Computation, 13, 2273–2316.

    Hellman, R. P. (1976). Growth of loudness at 1000 and 3000 Hz. Journal of the Acoustical Society of America, 60, 672–679.

    Hellman, R. P., & Zwislocki, J. (1964). Loudness function of a 1000-cps tone in the presence of a masking noise. Journal of the Acoustical Society of America, 36, 1618–1627.

    Helmholtz, H. L. F. (1885/1954). On the sensations of tone (A. J. Ellis, Trans.). New York, NY: Dover.

    Henning, G. B. (1966). Frequency discrimination of random amplitude tones. Journal of the Acoustical Society of America, 39, 336–339.

    Houtsma, A. J. M., & Smurzynski, J. (1990). Pitch identification and discrimination for complex tones with many harmonics. Journal of the Acoustical Society of America, 87, 304–310.

    Huron, D. (1989). Voice denumerability in polyphonic music of homogenous timbres. Music Perception, 6, 361–382.

    Jesteadt, W., Wier, C. C., & Green, D. M. (1977). Intensity discrimination as a function of frequency and sensation level. Journal of the Acoustical Society of America, 61, 169–177.

    Kaernbach, C., & Bering, C. (2001). Exploring the temporal mechanism involved in the pitch of unresolved harmonics. Journal of the Acoustical Society of America, 110, 1039–1048.

    Kameoka, A., & Kuriyagawa, M. (1969a). Consonance theory part I: Consonance of dyads. Journal of the Acoustical Society of America, 45, 1451–1459.

    Kameoka, A., & Kuriyagawa, M. (1969b). Consonance theory part II: Consonance of complex tones and its calculation method. Journal of the Acoustical Society of America, 45, 1460–1469.

    Keuss, P. J., & van der Molen, M. W. (1982). Positive and negative effects of stimulus intensity in auditory reaction tasks: Further studies on immediate arousal. Acta Psychologica, 52, 61–72.

    Kohfeld, D. L. (1971). Simple reaction time as a function of stimulus intensity in decibels of light and sound. Journal of Experimental Psychology, 88, 251–257.

    Kohlrausch, A., Fassel, R., & Dau, T. (2000). The influence of carrier level and frequency on modulation and beat-detection thresholds for sinusoidal carriers. Journal of the Acoustical Society of America, 108, 723–734.

    Langner, G., & Schreiner, C. E. (1988). Periodicity coding in the inferior colliculus of the cat. I. Neuronal mechanisms. Journal of Neurophysiology, 60, 1799–1822.

    Liberman, A. M., Isenberg, D., & Rakerd, B. (1981). Duplex perception of cues for stop consonants: Evidence for a phonetic mode. Perception & Psychophysics, 30, 133–143.

    Licklider, J. C., Webster, J. C., & Hedlun, J. M. (1950). On the frequency limits of binaural beats. Journal of the Acoustical Society of America, 22, 468–473.

    Licklider, J. C. R. (1951). A duplex theory of pitch perception. Experientia, 7, 128–133.

    Loeb, G. E., White, M. W., & Merzenich, M. M. (1983). Spatial cross correlation: A proposed mechanism for acoustic pitch perception. Biological Cybernetics, 47, 149–163.

    Luce, R. D., & Green, D. M. (1972). A neural timing theory for response times and the psychophysics of intensity. Psychological Review, 79, 14–57.

    Mapes-Riordan, D., & Yost, W. A. (1999). Loudness recalibration as a function of level. Journal of the Acoustical Society of America, 106, 3506–3511.

    Marks, L. E. (1994). "Recalibrating" the auditory system: The perception of loudness. Journal of Experimental Psychology: Human Perception and Performance, 20, 382–396.

    Mauermann, M., Long, G. R., & Kollmeier, B. (2004). Fine structure of hearing threshold and loudness perception. Journal of the Acoustical Society of America, 116, 1066–1080.

    McDermott, J. H., Lehr, A. J., & Oxenham, A. J. (2010). Individual differences reveal the basis of consonance. Current Biology, 20, 1035–1041.

    Meddis, R., & Hewitt, M. (1991). Virtual pitch and phase sensitivity of a computer model of the auditory periphery. I: Pitch identification. Journal of the Acoustical Society of America, 89, 2866–2882.

    Meddis, R., & O'Mard, L. (1997). A unitary model of pitch perception. Journal of the Acoustical Society of America, 102, 1811–1820.

    Micheyl, C., Bernstein, J. G., & Oxenham, A. J. (2006). Detection and F0 discrimination of harmonic complex tones in the presence of competing tones or noise. Journal of the Acoustical Society of America, 120, 1493–1505.

    Micheyl, C., Delhommeau, K., Perrot, X., & Oxenham, A. J. (2006). Influence of musical and psychoacoustical training on pitch discrimination. Hearing Research, 219, 36–47.

    Micheyl, C., Keebler, M. V., & Oxenham, A. J. (2010). Pitch perception for mixtures of spectrally overlapping harmonic complex tones. Journal of the Acoustical Society of America, 128, 257–269.

    Micheyl, C., & Oxenham, A. J. (2003). Further tests of the "two pitch mechanisms" hypothesis. Journal of the Acoustical Society of America, 113, 2225.

    Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 81–96.

    Moore, B. C. J. (1973). Frequency difference limens for short-duration tones. Journal of the Acoustical Society of America, 54, 610–619.

    Moore, B. C. J., & Glasberg, B. R. (1990). Frequency discrimination of complex tones with overlapping and non-overlapping harmonics. Journal of the Acoustical Society of America, 87, 2163–2177.

    Moore, B. C. J., & Glasberg, B. R. (1996). A revision of Zwicker's loudness model. Acustica, 82, 335–345.

    Moore, B. C. J., & Glasberg, B. R. (1997). A model of loudness perception applied to cochlear hearing loss. Auditory Neuroscience, 3, 289–311.

    Moore, B. C. J., Glasberg, B. R., & Baer, T. (1997). A model for the prediction of thresholds, loudness, and partial loudness. Journal of the Audio Engineering Society, 45, 224–240.

    Moore, B. C. J., Glasberg, B. R., & Peters, R. W. (1985). Relative dominance of individual partials in determining the pitch of complex tones. Journal of the Acoustical Society of America, 77, 1853–1860.

    Moore, B. C. J., Glasberg, B. R., & Peters, R. W. (1986). Thresholds for hearing mistuned partials as separate tones in harmonic complexes. Journal of the Acoustical Society of America, 80, 479–483.

    Moore, B. C. J., Glasberg, B. R., & Vickers, D. A. (1999). Further evaluation of a model of loudness perception applied to cochlear hearing loss. Journal of the Acoustical Society of America, 106, 898–907.

    Moore, B. C. J., & Gockel, H. E. (2011). Resolvability of components in complex tones and implications for theories of pitch perception. Hearing Research, 276, 88–97.

    Moore, B. C. J., & Peters, R. W. (1992). Pitch discrimination and phase sensitivity in young and elderly subjects and its relationship to frequency selectivity. Journal of the Acoustical Society of America, 91, 2881–2893.

    Moore, B. C. J., & Sęk, A. (2009). Sensitivity of the human auditory system to temporal fine structure at high frequencies. Journal of the Acoustical Society of America, 125, 3186–3193.

    Noesselt, T., Tyll, S., Boehler, C. N., Budinger, E., Heinze, H. J., & Driver, J. (2010). Sound-induced enhancement of low-intensity vision: Multisensory influences on human sensory-specific cortices and thalamic bodies relate to perceptual enhancement of visual detection sensitivity. Journal of Neuroscience, 30, 13609–13623.

    Oberfeld, D. (2007). Loudness changes induced by a proximal sound: Loudness enhancement, loudness recalibration, or both? Journal of the Acoustical Society of America, 121, 2137–2148.

    Odgaard, E. C., Arieh, Y., & Marks, L. E. (2003). Cross-modal enhancement of perceived brightness: Sensory interaction versus response bias. Perception & Psychophysics, 65, 123–132.

    Odgaard, E. C., Arieh, Y., & Marks, L. E. (2004). Brighter noise: Sensory enhancement of perceived loudness by concurrent visual stimulation. Cognitive, Affective, & Behavioral Neuroscience, 4, 127–132.

    Oxenham, A. J., Bernstein, J. G. W., & Penagos, H. (2004). Correct tonotopic representation is necessary for complex pitch perception. Proceedings of the National Academy of Sciences USA, 101, 1421–1425.

    Oxenham, A. J., & Buus, S. (2000). Level discrimination of sinusoids as a function of duration and level for fixed-level, roving-level, and across-frequency conditions. Journal of the Acoustical Society of America, 107, 1605–1614.

    Oxenham, A. J., Micheyl, C., Keebler, M. V., Loper, A., & Santurette, S. (2011). Pitch perception beyond the traditional existence region of pitch. Proceedings of the National Academy of Sciences USA, 108, 7629–7634.

    Palmer, A. R., & Russell, I. J. (1986). Phase-locking in the cochlear nerve of the guinea-pig and its relation to the receptor potential of inner hair-cells. Hearing Research, 24, 1–15.

    Shofner, W. P. (2005). Comparative aspects of pitch perception. In C. J. Plack, A. J. Oxenham, R. Fay, & A. N. Popper (Eds.), Pitch: Neural coding and perception (pp. 56–98). New York, NY: Springer Verlag.

    Stein, B. E., London, N., Wilkinson, L. K., & Price, D. D. (1996). Enhancement of perceived visual intensity by auditory stimuli: A psychophysical analysis. Journal of Cognitive Neuroscience, 8, 497–506.

    Stevens, S. S. (1957). On the psychophysical law. Psychological Review, 64, 153–181.

    Suzuki, Y., & Takeshima, H. (2004). Equal-loudness-level contours for pure tones. Journal of the Acoustical Society of America, 116, 918–933.

    Terhardt, E. (1974a). On the perception of periodic sound fluctuations (roughness). Acustica, 30, 201–213.

    Terhardt, E. (1974b). Pitch, consonance, and harmony. Journal of the Acoustical Society of America, 55, 1061–1069.

    Terhardt, E. (1976). Psychoakustisch begründetes Konzept der musikalischen Konsonanz [A psychoacoustically based concept of musical consonance]. Acustica, 36, 121–137.

    Terhardt, E. (1984). The concept of musical consonance: A link between music and psychoacoustics. Music Perception, 1, 276–295.

    Trainor, L. J., & Heinmiller, B. M. (1998). The development of evaluative responses to music: Infants prefer to listen to consonance over dissonance. Infant Behavior and Development, 21, 77–88.

    Tramo, M. J., Cariani, P. A., Delgutte, B., & Braida, L. D. (2001). Neurobiological foundations for the theory of harmony in western tonal music. Annals of the New York Academy of Sciences, 930, 92–116.

    van de Par, S., & Kohlrausch, A. (1997). A new approach to comparing binaural masking level differences at low and high frequencies. Journal of the Acoustical Society of America, 101, 1671–1680.

    Verschuure, J., & van Meeteren, A. A. (1975). The effect of intensity on pitch. Acustica, 32, 33–44.

    Viemeister, N. F. (1983). Auditory intensity discrimination at high frequencies in the presence of noise. Science, 221, 1206–1208.

    Viemeister, N. F., & Bacon, S. P. (1988). Intensity discrimination, increment detection, and magnitude estimation for 1-kHz tones. Journal of the Acoustical Society of America, 84, 172–178.

    Wallace, M. N., Rutkowski, R. G., Shackleton, T. M., & Palmer, A. R. (2000). Phase-locked responses to pure tones in guinea pig auditory cortex. Neuroreport, 11, 3989–3993.

    Warren, R. M. (1970). Elimination of biases in loudness judgments for tones. Journal of the Acoustical Society of America, 48, 1397–1403.

    Wightman, F. L. (1973). The pattern-transformation model of pitch. Journal of the Acoustical Society of America, 54, 407–416.

    Winckel, F. W. (1962). Optimum acoustic criteria of concert halls for the performance of classical music. Journal of the Acoustical Society of America, 34, 81–86.

    Winter, I. M. (2005). The neurophysiology of pitch. In C. J. Plack, A. J. Oxenham, R.