
Psychological Review, 1981, Vol. 88, No. 6, 503-522

Copyright 1981 by the American Psychological Association, Inc. 0033-295X/81/8806-0503$00.75

The Internal Representation of Pitch Sequences in Tonal Music

Diana Deutsch, University of California, San Diego

John Feroe, Department of Mathematics, Vassar College

A model for the internal representation of pitch sequences in tonal music is advanced. This model assumes that pitch sequences are retained as hierarchical networks. At each level of the hierarchy, elements are organized as structural units in accordance with laws of figural goodness, such as proximity and good continuation. Further, elements that are present at each hierarchical level are elaborated by further elements so as to form structural units at the next-lower level, until the lowest level is reached. Processing advantages of the system are discussed.

It may generally be stated that we tend to encode and retain information in the form of hierarchies when given the opportunity to do so. For example, programs of behavior tend to be retained as hierarchies (Miller, Galanter, & Pribram, 1960) and goals in problem solving as hierarchies of subgoals (Ernst & Newell, 1969). Visual scenes appear to be encoded as hierarchies of subscenes (Hanson & Riseman, 1978; Navon, 1977; Palmer, 1977; Winston, 1973). The phrase structure of a sentence lends itself readily to hierarchical interpretations (Chomsky, 1963; Miller & Chomsky, 1963; Yngve, 1960). When presented with artificial serial patterns that may be hierarchically encoded, we readily form encodings that reflect pattern structure (Bjork, 1968; Kotovsky & Simon, 1973; Restle, 1970; Restle & Brown, 1970; Simon & Kotovsky, 1963; Vitz & Todd, 1967, 1969).

In considering how we form hierarchies, however, theories have generally been constrained by the nature of the stimulus material under consideration. For example, visually perceived objects are naturally formed out of parts and subparts. The hierarchical structure of language must necessarily be constrained by the logical structure of events in the world. The attainment of a goal is generally arrived at by an optimal system of subgoals, and so on.

Preparation of this paper was supported by United States Public Health Service Grant MH-21001.

Requests for reprints should be sent to Diana Deutsch, Department of Psychology, C-009, University of California, San Diego, La Jolla, California 92093.

An analogous situation exists for theories based on experiments utilizing serial patterns that were devised by the experimenter. To take a concrete example, Restle's (1970) theory of hierarchical representation of serial patterns evolved from findings based on the following experimental paradigm. Subjects were presented with a row of six lights, which turned on and off in repetitive sequence, and they were required on each trial to predict which light would come on next. The sequences were structured as hierarchies of operators. For instance, given the basic sequence X = (1, 2), the operation R ('repeat of X') produces the sequence 1 2 1 2; the operation M ('mirror-image of X') produces the sequence 1 2 6 5; and the operation T ('transposition +1 of X') produces the sequence 1 2 2 3. Through recursive application of such operations, long sequences can be produced that have compact structural descriptions. Thus M(T(R(T(1)))) describes the sequence 1 2 1 2 2 3 2 3 6 5 6 5 5 4 5 4. Restle and Brown (1970), using sequences constructed in this fashion, found compelling evidence that subjects were encoding these patterns in accordance with their hierarchical structure. However, each pattern was constructed so as to allow for only one parsimonious interpretation. Thus it is difficult to estimate the generalizability of this model to situations where alternative hierarchical realizations are possible.
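The three operations described above can be sketched in a few lines of Python. This is only an illustration of the examples quoted in the text, not Restle's own formalization: the function names are hypothetical, and the sketch assumes six lights numbered 1 to 6 and does not handle a transposition that would run past the last light.

```python
# Illustrative sketch of the operators R, M and T on a row of six lights.
# Function names are hypothetical; wrap-around past light 6 is not handled.
NUM_LIGHTS = 6

def repeat(seq):
    """R: the sequence followed by an exact repetition of itself."""
    return seq + seq

def mirror(seq):
    """M: the sequence followed by its mirror image across the row of lights."""
    return seq + [NUM_LIGHTS + 1 - x for x in seq]

def transpose(seq, step=1):
    """T: the sequence followed by a copy shifted by `step` positions."""
    return seq + [x + step for x in seq]

# M(T(R(T(1)))): the operators are applied from the innermost outward.
pattern = mirror(transpose(repeat(transpose([1]))))
print(pattern)  # [1, 2, 1, 2, 2, 3, 2, 3, 6, 5, 6, 5, 5, 4, 5, 4]
```

The sixteen-element pattern is thus recoverable from one starting element and four operator applications, which is the sense in which the structural description is compact.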


In contrast, the hierarchical structure of tonal music provides us with a unique opportunity to examine how we optimally form hierarchies, since such music is solely the product of human processing mechanisms, unfettered by external constraints. Further, tonal music can reasonably be considered to have evolved so as to capitalize on these mechanisms.

In this article we propose a model of how the observer represents the pitch sequences of tonal music in abstract form. This model falls into the class of those developed by Leeuwenberg (1971), Restle (1970; Restle & Brown, 1970), Simon and his colleagues (Simon, 1972; Simon & Kotovsky, 1963; Simon & Sumner, 1968; Greeno & Simon, 1974), and Vitz and Todd (1967, 1969), among others, in that it proposes a specific language or notation for describing serial patterns, and this language is considered to reflect specific encodings. Indeed, many of the concepts and certain notations are owed to this previous work, as will be described below. However, our model differs from earlier ones in its basic architecture. In essence it may be characterized as a hierarchical network, at each level of which structural units are represented as an organized set of elements. Elements that are present at any given level are elaborated by further elements so as to form structural units at the next-lower level. It is further proposed that gestalt principles such as proximity and good continuation contribute to organization at each hierarchical level.

Before embarking on a formal description of the model, it should be noted that this concerns the representation of pitch information at the highest stage of abstraction, and that such information is assumed to be represented in parallel at lower stages also. At the lowest stage absolute pitch values are held to be represented, and interactions in storage that occur at this stage have been described elsewhere (Deutsch, 1975, in press-a). The next-higher stage is concerned with abstracted intervals and chords (Deutsch, 1969, 1978b). At the highest stage pitch information is further mapped onto a set of highly overlearned alphabets (Cuddy & Cohen, 1976; Cuddy, Cohen, & Miller, 1979; Deutsch, 1977, 1980; Dowling, 1978; Frances, 1958; Krumhansl, 1979; Krumhansl & Shepard, 1979).

Craik and Lockhart (1972) have argued that the higher the stage of abstraction of information, the longer its persistence in memory. This may well be true of the system retaining pitch information. Memory for melodic and harmonic intervals clearly persists longer than memory for absolute pitch values (Attneave & Olson, 1971; Deutsch, 1969). It appears plausible that memory for higher order abstractions persists longer still, but this hypothesis requires experimental verification.

The Model

Our model can best be introduced by musical example. Let us consider the pitch sequence shown on Figure 1(a). One way to represent this sequence is in terms of steps traversing the chromatic scale. We may say that a basic subsequence consisting of a step up this scale is presented four times in succession, the second presentation being four steps up from the first, the third being three steps up from the second, and the fourth being five steps up from the third. This type of analysis assigns prominence to the basic subsequence, and does not relate the successive transpositions to each other in any meaningful way. If the observer did indeed encode the pitch sequence in such a fashion, we may expect the basic subsequence to be well remembered, but the exact positions at which it is realized to be only poorly remembered.

The above analysis does not accord with musical intuitions. A musical analysis of this sequence would instead describe it as on the two structural levels shown on Figure 1. We can see that the basic relationship expressed here is that of the elaboration of a higher-level subsequence by a lower-level subsequence. At the higher level, shown on Figure 1(b), there is an arpeggiation that ascends through the C major triad (C-E-G-C). At the lower level each note of this triad is preceded by a neighbor embellishment, thus forming a two-note pattern. We may represent this hierarchical structure in tree form as on Figure 2. (In this example, as is often the case, the elements of the higher-level subsequence are given metrical stress, to emphasize their prominence.)

Figure 1. Pitch sequence represented on two hierarchical levels. (Panel a: Lower level. Panel b: Higher level.)

Various points should be observed here. The first is that in this representation, a specific sequence of notes is realized at each structural level. This contrasts with representations in which specific events are realized only at the lowest structural level, the elements at higher levels being rule systems. We may also observe that in this representation, notes (or sequences of notes) that are present at any given level are also present at all lower levels. Thus the higher up a note (or sequence of notes) is represented in this hierarchy, the larger the number of its representations. This analysis therefore assigns prominence to elements at higher rather than at lower structural levels. In contrast, representations of serial patterns that are based on the concept of a subsequence that is repeatedly presented under transformation assign greater prominence to elements at lower structural levels.

Another point illustrated by this example is that when a note at a higher level is elaborated by a sequence of notes at a lower level, the dominant note in the lower-level sequence (i.e., the note that also occurs at the higher level) need not be the first note of this sequence. In the present example the dominant note is the second of the two lower-level notes, the first being a submetrical embellishment of the second.

Finally, we can see that in this example, distinct pitch alphabets are employed at different structural levels: The alphabet of the major triad is employed at the higher level, and the chromatic alphabet at the lower level. Such use of multiple alphabets occurs very commonly in music and, as we shall see, confers several processing advantages.

Formal Rules for the Representation¹

Elementary Operators

1. An alphabet $\alpha$ is a linearly ordered set of symbols $\alpha = \{\ldots, e_1, e_2, \ldots\}$ which may be finite or extend infinitely in either direction. Common pitch alphabets in sequences of tonal music are the chromatic scale, the major and minor diatonic scales, and arpeggiated chords. These will be described below.

2. With respect to an element $e_k$ in an alphabet $\alpha$ the elementary operators $s$ (same), $n$ (next), $p$ (predecessor), $n^i$, $p^i$ are defined as follows:²

$$
\begin{aligned}
s(e_k) &= e_k \\
n(e_k) &= e_{k+1} \\
p(e_k) &= e_{k-1} \\
n^i(e_k) &= e_{k+i} \\
p^i(e_k) &= e_{k-i}
\end{aligned}
$$

3. A structure $\mathbf{A}$ of length $n$ is notated as

$$\mathbf{A} = (A_0, A_1, \ldots, A_{\ell-1}, *, A_{\ell+1}, \ldots, A_{n-1})$$

where for each integer $j$ with $0 \le j \le n - 1$, $j \ne \ell$, the symbol $A_j$ is an elementary operator. The symbol $*$ provides a reference point for the other operators. It appears exactly once in position $\ell$ where $0 \le \ell \le n - 1$. We note the particular cases $(*, A_1, \ldots, A_{n-1})$, $(A_0, \ldots, A_{n-2}, *)$ and $(*)$.

4. A sequence $A$ is notated as $\{\mathbf{A}; \alpha\}$ where $\mathbf{A}$ is a structure and $\alpha$ is an alphabet.

B-C   D#-E   F#-G   B-C

Figure 2. Tree diagram of pitch sequence shown on Figure 1.

1 Simon (1972) gives a detailed description of relatedformalisms.

2 The symbols $s$ (same), $n$ (next), $p$ (predecessor), $n^i$, and $p^i$ are due to Simon (1972).


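A sequence $A$ together with a reference element $r \in \alpha$ produces a sequence of notes

$$S = \{A; r\} = (a_0, a_1, \ldots, a_{n-1})$$

where each $a_k \in \alpha$ is as follows:

$$
a_k =
\begin{cases}
r & \text{if } k = \ell \\
A_k(A_{k-1}(\ldots A_{\ell+1}(r) \ldots)) & \text{if } k > \ell \\
A_k(A_{k+1}(\ldots A_{\ell-1}(r) \ldots)) & \text{if } k < \ell
\end{cases}
$$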

If the alphabet for a sequence is understood, the explicit reference to it in the notation may be omitted.

5. In any structure, the occurrence of a string of length $k$ of an elementary operator $A$ will be abbreviated $kA$. For example,

$$\{(*, n^2, n^2, n^2, p, p); \alpha\} = \{(*, 3n^2, 2p); \alpha\}$$ ³

Here we give some simple examples to illustrate the system so far presented. The representation {{(*, 4n); C}; c}, where C represents the C major scale and c the reference element, corresponds to the sequence of notes C-D-E-F-G shown on Figure 3(a). When the structure and reference element are held constant, but the alphabet of the C major triad is substituted for that of the C major scale, we have the representation {{(*, 4n); Ctr}; c}, which corresponds to the sequence of notes C-E-G-C-E shown on Figure 3(b). When the alphabet of the chromatic scale is substituted instead, we have {{(*, 4n); Cr}; c}, which corresponds to the sequence of notes C-C#-D-D#-E shown on Figure 3(c).

For a given structure and alphabet, a different sequence of notes is produced when the reference element is altered. Thus {{(*, 4n); C}; e} corresponds to the sequence of notes E-F-G-A-B shown on Figure 3(d). Similarly, {{(*, 4n); Ctr}; e} corresponds to the sequence of notes E-G-C-E-G shown on Figure 3(e); and {{(*, 4n); Cr}; e} corresponds to the sequence of notes E-F-F#-G-G# shown on Figure 3(f).

It should be observed that the identical sequence of notes may be represented in terms of a number of alternative structures, depending on the placement of the reference element. Thus the sequence shown on Figure 3(a) may be represented alternatively as {{(p, *, 3n); C}; d}; as {{(2p, *, 2n); C}; e}; or as {{(4p, *); C}; g}; and so on. This flexibility in placement of the reference element is important and reflects the fact that the dominant element in a sequence will vary depending on the context in which this sequence occurs. For example, the lower-level sequences B-C, D#-E, F#-G, B-C shown on Figure 1 should be represented as {(p, *); Cr}, since in each case the second of the two notes is dominant, and these second notes combine to form a sequence at a higher level. However, given a different context, any of these two-note sequences might be represented as {(*, n); Cr} instead. Thus whereas a large number of alternative representations may in principle be constructed for many sequences, the constraints imposed by the hierarchical organization of tonal music greatly reduce the number of alternative representations that the listener will produce. These constraints will be discussed in detail below.

Sequence Operators

1. A compound sequence is produced by the combination of two or more sequences under the action of a sequence operator. The central sequence operator is pr (prime), with two others, ret (retrograde) and inv (inversion) defined as elaborations of pr. As is the case for sequences, the designation of a reference element r for a compound sequence produces a sequence of notes.

Figure 3. Simple examples to illustrate the system.

3 The coding of a run of identical symbols in such a fashion has been proposed by others (e.g., Leeuwenberg, 1971; Restle, 1970; Simon, 1972; Vitz & Todd, 1969).

2. Consider two sequences: $A = \{(A_0, \ldots, *, \ldots, A_{n-1}); \alpha\}$ and $B = \{(B_0, \ldots, *, \ldots, B_{m-1}); \beta\}$ for not necessarily distinct alphabets $\alpha$ and $\beta$. Observe that for $r \in \alpha$, $\{A; r\}$ is a sequence of notes $(a_0, \ldots, a_{n-1})$, and that for each $i$, $0 \le i \le n - 1$ such that $a_i \in \beta$, $\{B; a_i\}$ is a sequence of notes $(b_{i0}, \ldots, b_{i(m-1)})$. The compound sequence $A[\mathrm{pr}]B$ together with the reference element $r$ produces the sequence of notes of length $n \times m$

$$
\begin{aligned}
\{A[\mathrm{pr}]B; r\} &= \{B; a_0\}, \{B; a_1\}, \ldots, \{B; a_{n-1}\} \\
&= (b_{00}, b_{01}, \ldots, b_{0(m-1)}, b_{10}, \ldots, b_{1(m-1)}, \ldots, b_{(n-1)0}, \ldots, b_{(n-1)(m-1)}).
\end{aligned}
$$

Note that this is possible only if $a_i \in \beta$ for $0 \le i \le n - 1$. (Thus there are constraints on the alphabet of $B$ imposed by the alphabet of $A$.)

This process is reversible; that is, the sequence of notes

$$(b_{00}, \ldots, b_{0(m-1)}, b_{10}, \ldots, b_{(n-1)(m-1)})$$

and the sequence $B = \{\mathbf{B}; \beta\}$ produce the representation

$$\{B; a_0\}, \{B; a_1\}, \ldots, \{B; a_{n-1}\}$$

and therefore the higher-level sequence of notes $(a_0, \ldots, a_{n-1})$.

The example shown on Figure 1 provides a simple illustration of the use of the operator pr (prime). This has the representation:

A = {(*, 3n); Ctr}

B = {(p, *); Cr}

S = {A[pr]B; c}

where Ctr represents the C major triad, Cr the chromatic scale, and c the reference element.

3. For any sequence $B = \{(B_0, \ldots, B_{\ell-1}, *, B_{\ell+1}, \ldots, B_{m-1}); \beta\}$ define the retrograde sequence $\tilde{B} = \{(B_{m-1}, \ldots, B_{\ell+1}, *, B_{\ell-1}, \ldots, B_0); \beta\}$. The compound sequence $A[\mathrm{ret}]B$ together with the reference element $r$ produces the sequence of notes

$$\{A[\mathrm{ret}]B; r\} = \{A[\mathrm{pr}]\tilde{B}; r\}$$

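4. For any sequence $B = \{(B_0, \ldots, B_{\ell-1}, *, B_{\ell+1}, \ldots, B_{m-1}); \beta\}$ define the inverted sequence $\bar{B} = \{(\bar{B}_0, \ldots, \bar{B}_{\ell-1}, *, \bar{B}_{\ell+1}, \ldots, \bar{B}_{m-1}); \beta\}$ where

$$
\bar{B}_j =
\begin{cases}
n & \text{if } B_j = p \\
n^i & \text{if } B_j = p^i \\
s & \text{if } B_j = s \\
p & \text{if } B_j = n \\
p^i & \text{if } B_j = n^i
\end{cases}
$$

The compound sequence $A[\mathrm{inv}]B$ together with the reference element $r$ produces the sequence of notes $\{A[\mathrm{inv}]B; r\} = \{A[\mathrm{pr}]\bar{B}; r\}$.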

5. Recognizing that a structure might be invoked with different alphabets, define a sequence with multiple alphabets as $B = \{\mathbf{B}; \beta_0, \beta_1, \ldots, \beta_{n-1}\}$ where each $\beta_i$ is an alphabet not necessarily distinct from the others. A compound sequence, say $A[\mathrm{pr}]B$ where $A = \{\mathbf{A}; \alpha\}$ for $\mathbf{A} = (A_0, \ldots, *, \ldots, A_{\ell-1})$, is realized for a reference element $r \in \alpha$ as a sequence of notes

$$\{A[\mathrm{pr}]B; r\} = \{\{\mathbf{B}; \beta_{0 \,(\mathrm{mod}\, n)}\}; a_0\}, \ldots, \{\{\mathbf{B}; \beta_{(\ell-1) \,(\mathrm{mod}\, n)}\}; a_{\ell-1}\}$$

Similar definitions hold for ret and inv. Note that the single alphabet case given in Rule 2, that is, the case $n = 1$, is simply a particular case of this rule.

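6. The power of the sequence operators is extended by allowing for a string of operators to act on a string of sequences. For any sequence $A = \{(A_0, \ldots, *, \ldots, A_{n-1}); \alpha\}$, sequences $B_0, B_1, \ldots, B_{M-1}$, and sequence operators $\mathrm{op}_0, \mathrm{op}_1, \ldots, \mathrm{op}_{N-1}$ the compound sequence

$$A[\mathrm{op}_0, \mathrm{op}_1, \ldots, \mathrm{op}_{N-1}](B_0, \ldots, B_{M-1})$$

together with the reference element $r$ produces the sequence of notes

$$\{A[\mathrm{op}_0, \ldots, \mathrm{op}_{N-1}](B_0, \ldots, B_{M-1}); r\} = \{C_0; a_0\}, \{C_1; a_1\}, \ldots, \{C_{n-1}; a_{n-1}\}$$

where $\{A; r\} = (a_0, \ldots, a_{n-1})$ and

$$
C_i =
\begin{cases}
B_{i \,(\mathrm{mod}\, M)} & \text{if } \mathrm{op}_{i \,(\mathrm{mod}\, N)} = \mathrm{pr} \\
\tilde{B}_{i \,(\mathrm{mod}\, M)} & \text{if } \mathrm{op}_{i \,(\mathrm{mod}\, N)} = \mathrm{ret} \\
\bar{B}_{i \,(\mathrm{mod}\, M)} & \text{if } \mathrm{op}_{i \,(\mathrm{mod}\, N)} = \mathrm{inv}
\end{cases}
$$

Note that for $N = M = 1$, the definitions in Rules 2, 3, and 4 are simply special cases of this rule.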

A simple example of the use of pr (prime) together with ret (retrograde) is illustrated on Figure 4(a). This may be represented as

A = {(*, s); C}

B = {(*, 2n); C}

S = {A[pr, ret]B; c}

where C represents the C major scale and c the reference element.

A simple example of the use of pr (prime) together with inv (inversion) is illustrated on Figure 4(b). This may be represented as

A = {(*, p); Ctr}

B = {(*, n, p); Ctr}

S = {A[pr, inv]B; c}

where Ctr represents the C major triad and c the reference element.

7. In any compound sequence $A[\mathrm{op}_0, \ldots, \mathrm{op}_{N-1}](B_0, \ldots, B_{M-1})$ it is permissible for one or more of $A, B_0, \ldots, B_{M-1}$ to be compound sequences. In this case Rules 2, 3, 4, 5, and 6 above apply as stated, subject to the restriction imposed by the fact that $\tilde{B}_t$ and $\bar{B}_t$ are not defined if $B_t$ is a compound sequence.

8. In any compound sequence the occurrence of a string of length $k$ of a sequence $A$ or sequence operator op will be abbreviated $kA$ and $k$op respectively.

Alternation

Consider two sequences of notes,

$S = (a_0, a_1, \ldots, a_{n-1})$ of length $n$, and

$T = (b_0, b_1, \ldots, b_{m-1})$ of length $m$.

For integers $i$ and $j$, subject to the constraint that $n/i = m/j$ is an integer $k$, define the sequence of notes of length $n + m$

$$(a_0, a_1, \ldots, a_{i-1})(b_0, b_1, \ldots, b_{j-1}) \cdots (a_{(k-1)i}, \ldots, a_{ki-1})(b_{(k-1)j}, \ldots, b_{kj-1})$$

Note that if $n = m$ and $i = j = 1$ this results in the simple alternation of the elements of two equally long sequences of notes.

Figure 4. Simple examples to illustrate the use of sequence operators ret (retrograde) and inv (inversion).

A relatively simple example illustrating the use of the alternation operation is given below, illustrated on Figure 11.
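The alternation operation defined above amounts to interleaving blocks of i notes from the first sequence with blocks of j notes from the second. A minimal sketch follows; the function name `alternate` and the toy sequences are hypothetical and are not taken from the musical example on Figure 11.

```python
def alternate(s, t, i, j):
    """Interleave blocks of i notes from s with blocks of j notes from t.

    Requires len(s) / i == len(t) / j to be an integer k, as in the definition.
    """
    if len(s) % i or len(t) % j or len(s) // i != len(t) // j:
        raise ValueError("n/i and m/j must be the same integer")
    k = len(s) // i
    result = []
    for block in range(k):
        result.extend(s[block * i:(block + 1) * i])
        result.extend(t[block * j:(block + 1) * j])
    return result

# Equal lengths with i = j = 1: simple alternation of the elements.
print(alternate(["a0", "a1", "a2"], ["b0", "b1", "b2"], 1, 1))
# ['a0', 'b0', 'a1', 'b1', 'a2', 'b2']

# Unequal lengths: two notes of s, then one note of t, twice (k = 2).
print(alternate(["a0", "a1", "a2", "a3"], ["b0", "b1"], 2, 1))
# ['a0', 'a1', 'b0', 'a2', 'a3', 'b1']
```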

Choice of Sequence Operators

The sequence operators pr (prime), ret (retrograde), and inv (inversion) were chosen from considerations of musical analysis. Prime is the basic operation that produces a compound sequence from a set of sequences. The term prime is employed in the theory of twelve-tone music to refer to the presentation of a row of tones without transformation (Perle, 1972). The term is borrowed here, but no other assumptions from twelve-tone theory are implied. The term retrograde is used in music theory to refer to the presentation of a sequence of notes in reverse order. Similarly, the term inversion is employed to refer to the presentation of a sequence of notes in such a way that all ascending steps become descending steps, and vice versa. Retrogression and inversion are frequently used as compositional devices in both traditional and contemporary music.


It should be noted, however, that in tonal music, inversion takes place along a given pitch alphabet (such as a diatonic scale or a triad) with the result that interval sizes are typically altered. This is captured in our formalism. (In atonal music based on the twelve-tone chromatic scale, inversion results in the preservation of interval sizes also; we do not assume that this is necessary.) The frequent use of retrogression and inversion in music provides strong evidence that we employ these operations with ease.

A further advantage of these operations is that they considerably reduce the number of structures that are required. For example, given the basic structure (*, n), its retrograde is (n, *) and its inversion is (*, p). These operations therefore allow for a considerable reduction in memory load.

Pitch Alphabets

The system as so far described specifies the pitch alphabet associated with each structure in absolute terms. This device was employed in order to simplify the exposition of other parts of the system; however, it is unsatisfactory on several grounds. First, in order to represent even simple melodic lines in absolute terms we would need to call on a huge repertory of alphabets. Second, it is evident that we encode pitch alphabets in relational terms rather than absolute, since we retain segments of music in transposable form. Third, in tonal music there are certain well-defined rules governing relationships between pitch alphabets, and these rules considerably restrict the number of alphabets that can be invoked in combination. These rules are captured in notational devices used by musicians, and we propose that they reflect the ways in which relationships between alphabets are encoded.

One may think of the twelve-tone chromatic scale as the parent alphabet from which families of alphabets are derived. The alphabets most commonly employed in tonal music are the major and minor diatonic scales (e.g., C major, D minor). All these can be expressed in terms of steps traversing the chromatic scale. Thus one octave of the ascending C major scale may be notated as

$$\{\{(*, 2n^2, n, 3n^2, n); Cr\}; c\}$$

where Cr refers to the chromatic alphabet and c is the reference element. (The descending major scale is the ascending scale in retrograde form.) Similarly, one octave of the ascending D major scale may be notated as

$$\{\{(*, 2n^2, n, 3n^2, n); Cr\}; d\}$$

Observe that these two representations differ only in the identities of their reference elements. Thus we may assume that we have encoded in long-term memory the sequence

$$\{(*, 2n^2, n, 3n^2, n); Cr\}$$

which specifies any major diatonic scale.

The harmonic form of the minor scale may be notated as

$$\{(*, n^2, n, 2n^2, n, n^3, n); Cr\}$$

The natural form may be notated as

$$\{(*, n^2, n, 2n^2, n, 2n^2); Cr\}$$

In both cases the ascending form is the retrograde of the descending form. The melodic minor scale has two different representations depending on whether it is in ascending or descending form. In ascending form it may be notated as

$$\{(*, n^2, n, 4n^2, n); Cr\}$$

and in descending form as

$$\{(2n^2, n, 2n^2, n, n^2, *); Cr\}$$

Again we assume that these sequences are retained in long-term memory.
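The scale structures above can be checked with a small Python sketch. It assumes, for illustration only, that a note is an integer number of semitones above an arbitrary C, so that an operator n^i on the chromatic alphabet is simply the addition of i; the constant and function names are hypothetical.

```python
# Scale structures over the chromatic alphabet Cr, written as semitone steps
# (n = 1, n^2 = 2, n^3 = 3); "*" marks the position of the reference element.
MAJOR = ["*", 2, 2, 1, 2, 2, 2, 1]              # (*, 2n^2, n, 3n^2, n)
HARMONIC_MINOR = ["*", 2, 1, 2, 2, 1, 3, 1]     # (*, n^2, n, 2n^2, n, n^3, n)
NATURAL_MINOR = ["*", 2, 1, 2, 2, 1, 2, 2]      # (*, n^2, n, 2n^2, n, 2n^2)
MELODIC_MINOR_UP = ["*", 2, 1, 2, 2, 2, 2, 1]   # (*, n^2, n, 4n^2, n)
MELODIC_MINOR_DOWN = [2, 2, 1, 2, 2, 1, 2, "*"] # (2n^2, n, 2n^2, n, n^2, *)

def realize_chromatic(structure, reference):
    """Realize a structure on the chromatic alphabet from a reference pitch."""
    ref = structure.index("*")
    notes = [reference] * len(structure)
    for k in range(ref + 1, len(structure)):
        notes[k] = notes[k - 1] + structure[k]
    for k in range(ref - 1, -1, -1):
        notes[k] = notes[k + 1] + structure[k]
    return notes

C, D = 0, 2
print(realize_chromatic(MAJOR, C))               # [0, 2, 4, 5, 7, 9, 11, 12]
print(realize_chromatic(MAJOR, D))               # [2, 4, 6, 7, 9, 11, 13, 14]
print(realize_chromatic(MELODIC_MINOR_DOWN, C))  # [12, 10, 8, 7, 5, 3, 2, 0]
```

The C major and D major results differ by a constant two semitones, reflecting that only the reference element has changed. Each ascending structure spans exactly twelve semitones, and the descending melodic minor runs from the upper tonic down to the reference element.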

The term "key" is used to refer to the collection of notes forming a particular diatonic scale. Thus the term "key of C major" refers to the collection of notes (C, D, E, F, G, A, B). The term "key of D major" refers to the collection (D, E, F#, G, A, B, C#), and so on. Any segment of tonal music is held to be in one of the 12 possible major or minor keys. This will be reflected in our notation.

Another common alphabet employed in tonal music is the arpeggiation of a triad. Triads can be constructed on each note or degree of a diatonic scale. As shown on Figure 5, each triad has the structure

$$(*, 2n^2, n^3)$$

which is realized upon specifying as alphabet the diatonic scale on which it is based, and as reference element the diatonic position of its fundamental note. Thus in the key of C major the C major triad may be notated as

$$\{\{(*, 2n^2, n^3); C\}; 1\}$$

where C denotes the alphabet of the C major scale, and 1 specifies the diatonic position of its fundamental note. Similarly, again in the key of C major, the D minor triad may be notated as

$$\{\{(*, 2n^2, n^3); C\}; 2\}$$

Observe that the intervals comprising the different triads vary, so that they may be major, minor or diminished, depending on the scale degrees on which they are based. However, this difference can be ignored in musical notation, which may simply specify a triad by its scale degree. We assume that this reflects a simplicity of encoding, that is, that all triads are encoded in terms of the same overlearned structure.
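The following sketch applies the single triad structure to each degree of the C major scale and then measures the resulting intervals in semitones. The encoding (semitone integers, alphabets as sorted pitch-class lists, n^i as +i) and the function names are assumptions made for illustration.

```python
# One triad structure, (*, 2n^2, n^3), realized on every degree of C major.
C_MAJOR = [0, 2, 4, 5, 7, 9, 11]
TRIAD = ["*", 2, 2, 3]
QUALITY = {(4, 3): "major", (3, 4): "minor", (3, 3): "diminished"}

def step(alphabet, pitch, k):
    """Move k positions along the alphabet from a pitch that belongs to it."""
    octave, pc = divmod(pitch, 12)
    index = octave * len(alphabet) + alphabet.index(pc) + k
    octave, degree = divmod(index, len(alphabet))
    return 12 * octave + alphabet[degree]

def triad_on_degree(degree, scale=C_MAJOR):
    """Realize {{(*, 2n^2, n^3); scale}; degree}, with degrees numbered from 1."""
    notes = [scale[degree - 1]]
    for k in TRIAD[1:]:
        notes.append(step(scale, notes[-1], k))
    return notes

def quality(notes):
    """Classify a triad by the semitone sizes of its two stacked thirds."""
    return QUALITY[(notes[1] - notes[0], notes[2] - notes[1])]

for degree in range(1, 8):
    notes = triad_on_degree(degree)
    print(degree, notes, quality(notes))
# 1 [0, 4, 7, 12] major
# 2 [2, 5, 9, 14] minor
# 3 [4, 7, 11, 16] minor
# 4 [5, 9, 12, 17] major
# 5 [7, 11, 14, 19] major
# 6 [9, 12, 16, 21] minor
# 7 [11, 14, 17, 23] diminished
```

The structure is identical in every case; the major, minor and diminished qualities appear only when the result is measured on the chromatic scale.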

Another alphabet that is traversed in tonal music is the arpeggiation of a seventh chord. Such a chord is formed by the addition to a triad of a note that forms an interval of a seventh with the fundamental note. As shown on Figure 6, each seventh chord therefore has the structure

$$(*, 3n^2, n)$$

regardless of the intervals formed by its components. Other chords may also serve as alphabets, and can be notated in analogous fashion.

We can take advantage of such relationships to simplify our notation and at the same time enable it to reflect more accurately the ways in which pitch alphabets and their relationships are encoded. We will dispense with specifying alphabets in absolute terms, with the exception of the chromatic scale (Cr). Instead, for each sequence of notes we shall specify a key, such as G (G major) or c (C minor). If the alphabet associated with a structure is diatonic it will not be specified further (as provided for in Rule 4). If it is triadic, we will only specify the scale degree on which it is based (I, II, etc.). Similarly if it is a seventh chord we will specify it as I7, II7, etc. The reference element (r) is also specified as a scale degree (Arabic numerals are used here to differentiate the specification of the reference element from the specification of a chord arpeggiation).

Thus the example on Figure 1 may be notated as

A = {(*, 3n); I}

B = {(p, *); Cr}

S = {A[pr]B; 1}C

where I indicates the triad on the first degree, Cr the chromatic alphabet, 1 the reference element and C the key of C major.

Figure 5. The diatonic triads. (Panel a: Major. Panel b: Minor in harmonic form. Panel c: Minor in natural form.)

Degree             I      II          III    IV     V      VI     VII
a. Major           major  minor       minor  major  major  minor  diminished
b. Harmonic minor  minor  diminished  major  minor  major  major  diminished
c. Natural minor   minor  diminished  major  minor  minor  major  major

Figure 6. The diatonic seventh chords. (Panel a: Major. Panel b: Harmonic minor.)

Degree             I7    II7  III7  IV7  V7  VI7  VII7
a. Major           M7    m7   m7    M7   V7  m7   ø7
b. Harmonic minor  m/M7  ø7   M7    m7   V7  M7   °7

Observe that if a sequence of notes is transposed to a different key, only one symbol in this notation is changed. (For example, if the above sequence were transposed to the key of G major, the C would change to G.) Further, if a sequence of notes is modulated between major and minor, again only one symbol is changed. (If the above sequence were modulated to C minor, the C would change to c.) Thus these new notational devices capture the ready transposability of melodic segments and their easy modulation: In each case the representation is barely altered.
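The effect of changing only the key symbol can be sketched as follows. The encoding is an assumption for illustration: a key is a tonic pitch class plus a mode, the minor mode is taken in its harmonic form, pitches are semitone integers, and the function names are hypothetical. The structures for A and B are the ones given above and stay fixed while the key varies.

```python
# Hypothetical encoding of S = {A[pr]B; 1}key with A = {(*, 3n); I} and
# B = {(p, *); Cr}. Only the key (tonic, mode) changes between the calls.
MODES = {"major": [0, 2, 4, 5, 7, 9, 11],
         "minor": [0, 2, 3, 5, 7, 8, 11]}   # harmonic form, an assumption
CHROMATIC = list(range(12))

def step(alphabet, pitch, k):
    octave, pc = divmod(pitch, 12)
    index = octave * len(alphabet) + alphabet.index(pc) + k
    octave, degree = divmod(index, len(alphabet))
    return 12 * octave + alphabet[degree]

def realize(structure, alphabet, reference):
    ref = structure.index("*")
    notes = [reference] * len(structure)
    for k in range(ref + 1, len(structure)):
        notes[k] = step(alphabet, notes[k - 1], structure[k])
    for k in range(ref - 1, -1, -1):
        notes[k] = step(alphabet, notes[k + 1], structure[k])
    return notes

def triad_alphabet(tonic, mode, degree):
    """Pitch classes of the triad on a scale degree (1-based) of the key."""
    scale = MODES[mode]
    return sorted((tonic + scale[(degree - 1 + i) % 7]) % 12 for i in (0, 2, 4))

def figure1_in_key(tonic, mode):
    higher = realize(["*", 1, 1, 1], triad_alphabet(tonic, mode, 1), tonic)
    lower = []
    for note in higher:
        lower.extend(realize([-1, "*"], CHROMATIC, note))
    return lower

in_c_major = figure1_in_key(0, "major")   # key symbol C
in_g_major = figure1_in_key(7, "major")   # key symbol G
in_c_minor = figure1_in_key(0, "minor")   # key symbol c
print(in_c_major)  # [-1, 0, 3, 4, 6, 7, 11, 12]
print(in_g_major)  # [6, 7, 10, 11, 13, 14, 18, 19]
print(in_c_minor)  # [-1, 0, 2, 3, 6, 7, 11, 12]
```

The G major result is the C major result shifted by seven semitones, and the C minor result differs from C major only where the third of the triad is lowered.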

It can be seen that in specifying a sequence that has an arpeggiated chord as alphabet, we are in effect specifying a structure that has as alphabet another structure (such as $(*, 2n^2, n^3)$, the structure for the triad), which has in turn as alphabet another structure (such as $(*, 2n^2, n, 3n^2, n)$, the structure for the major diatonic scale), which in turn is based on the fundamental alphabet Cr. Thus in place of a substantially large number of alphabets we now have a very small number of highly overlearned structures that act on each other in hierarchical fashion. This system allows for the production of melodic segments of enormous variety through the invocation of a very small set of basic structures. To give a concrete idea of this encoding parsimony, let us restrict ourselves to tonal music that is composed of the following alphabets: the 12 major scales, the 12 ascending and 12 descending melodic minor scales, the 12 harmonic minor scales, the 12 major triads, the 12 minor triads, the 12 diminished triads, the 12 major seventh chords, the 12 minor seventh chords, the 12 diminished seventh chords, the 12 half-diminished seventh chords, the 12 dominant seventh chords, and the chromatic scale. This gives us a total of 145 possible alphabets, specified in absolute terms, that the listener would be required to invoke. However, in the present system the listener need only retain seven overlearned structures to obtain the same result (the structure for the major scale, the ascending and descending melodic minor scales, the harmonic minor scale, the triad, the seventh chord and the chromatic scale). Adding a further arpeggiated chord to the repertory of alphabets would analogously lead to the addition of a large number of alphabets as specified in absolute terms, but only one additional structure on the present system. This encoding parsimony is achieved through the superposition of one unequal-interval scale on another. In a musical system that was composed instead of equal-interval scales, the advantage of such a hierarchy would be greatly reduced (Figure 7).
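The arithmetic behind the two totals can be spelled out in a few lines. The lists below simply transcribe the alphabets and structures enumerated in the paragraph above; the variable names are hypothetical.

```python
# Alphabet families that exist once for each of the 12 chromatic starting notes.
FAMILIES_PER_KEY = [
    "major scale", "ascending melodic minor scale",
    "descending melodic minor scale", "harmonic minor scale",
    "major triad", "minor triad", "diminished triad",
    "major seventh chord", "minor seventh chord", "diminished seventh chord",
    "half-diminished seventh chord", "dominant seventh chord",
]
# Alphabets in absolute terms: every family in 12 transpositions, plus the
# chromatic scale, which is the same alphabet at every transposition.
absolute_alphabets = 12 * len(FAMILIES_PER_KEY) + 1

# Overlearned structures needed in the hierarchical system.
STRUCTURES = [
    "major scale", "ascending melodic minor scale",
    "descending melodic minor scale", "harmonic minor scale",
    "triad", "seventh chord", "chromatic scale",
]

print(absolute_alphabets)  # 145
print(len(STRUCTURES))     # 7
```

A further chord type would add twelve or more entries to the first count but only one to the second.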

We should also observe that in the present system there is a restriction on the number of alphabets, as specified in absolute terms, that are allowed to be combined to form a sequence of notes. This is in accordance with musical intuitions. If we were to pick a combination of alphabets at random, the resultant sequence of notes would be likely to sound incorrect to a listener who is familiar primarily with tonal music. We propose that this is because the listener would be unable to fit such a combination into the coding system proposed here. At the same time, tonal music is enormously versatile, and we are not generally conscious of these restrictions.

We are not assuming that the proposed system is hardwired in any way; clearly from consideration of other types of music it is not. However, it is likely that any musical tradition would have evolved its own system of rules that restrict the number of allowable combinations of alphabets, because without such restrictions the processing load would be too heavy.

Chord Progressions

A different type of generative process also occurs in tonal music. This concerns underlying harmony. The degree to which harmonic structure influences melodic structure has been the subject of considerable debate among music theorists, some asserting that melody can be understood only in terms of implied harmony (Schenker, 1956, 1973) and others hypothesizing a relative independence (Meyer, 1973; Narmour, 1977). It would seem that the degree to which one process depends on the other is a function both of the type of music and also of the tendencies of the individual listener.

Chord progressions are strongly hierarchical in nature. In tonal music the tonic triad predominates over the other triads in a key. It serves as a point of departure for harmonic progressions, and also as the ultimate goal of a harmonic progression. Thus the tonic triad may generate a progression that itself generates other progressions; and so on, in hierarchical fashion. A detailed analysis of chord progressions is, however, outside the scope of the present paper.

The generation of a sequence of chords differs in an important respect from the generation of a sequence of notes. A note has only one realization; however, a chord is an abstraction that can be realized in a number of different ways. Thus I in the key of C major may be realized as any combination of Cs, Es or Gs. V in the key of C major may be realized as any combination of Gs, Bs or Ds. A harmonic progression therefore results from the generation of one abstraction by another. This type of generation is more similar to that found in transformational linguistics, where a grammatical category such as "noun phrase" may, through the application of a rewriting rule, produce other grammatical categories such as "determiner" and "noun." It may be observed, however, that the choice of a lexical formative ultimately depends on the grammatical category to which it belongs. In contrast, a given note can in principle serve as the realization of any part of a sequence of chords.

[Figure 7 panels: Chromatic scale; C major scale; Triad on 1 of C major.]

Figure 7. Parsimony of encoding achieved by embedding of alphabets. (The intervals composing the different triads vary depending on the scale degrees on which they are based. However, they have the identical abstract structure when encoded in terms of the diatonic scale, rather than in terms of the equal-interval chromatic scale.)

Figure 8. Example to illustrate the system. (From Beethoven, Sonata, op. 22.)
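The parsimony described in the caption to Figure 7 can be illustrated with a minimal sketch. It assumes that pitches are numbered in semitones (C = 0), that the C major scale is carved out of the chromatic alphabet, and that a triad is then taken as two double steps within the scale alphabet. The function names are hypothetical and are not part of the formalism.

```python
# Sketch only: semitone numbering (C = 0) and all names here are assumptions.
MAJOR_SCALE_STEPS = [2, 2, 1, 2, 2, 2, 1]  # whole and half steps of a major scale


def major_scale(tonic, octaves=2):
    """Return the scale alphabet as semitone numbers within the chromatic scale."""
    notes = [tonic]
    for _ in range(octaves):
        for step in MAJOR_SCALE_STEPS:
            notes.append(notes[-1] + step)
    return notes


def triad_on_degree(scale, degree):
    """Triad on a 1-based scale degree: the start note, then two double steps."""
    start = degree - 1
    return [scale[start], scale[start + 2], scale[start + 4]]


def chromatic_intervals(notes):
    """Successive distances measured in the chromatic alphabet (semitones)."""
    return [b - a for a, b in zip(notes, notes[1:])]


c_major = major_scale(0)
for degree in range(1, 8):
    triad = triad_on_degree(c_major, degree)
    # The scale-based description never changes; the semitone sizes do.
    print(degree, triad, chromatic_intervals(triad))
```

Every triad is produced by the same call on the scale alphabet, whereas the semitone intervals printed for each degree differ, which is the sense in which the embedded encoding is the more economical one.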

We assume that the generation of a sequence of chords may occur in parallel with the generation of a sequence of notes. The sequence of notes that is realized at any structural level is always compatible with the sets of alternative notes determined by the sequence of chords at that level. We assume that the listener is aware of this compatibility, which provides redundant information for use in retrieval.

A chord progression may be stated in horizontal form, and so may serve as a string of alphabets associated with a structure. This is illustrated in the third musical example below (see Figure 10).
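The compatibility between notes and chords can be sketched as a simple membership check. The sketch assumes MIDI note numbers (60 = middle C) and represents each chord as a set of pitch classes; the chord table and function names are illustrative assumptions only.

```python
# Sketch only: pitch-class numbers (C = 0) and the chord table are assumptions.
CHORDS_IN_C_MAJOR = {
    "I": {0, 4, 7},   # C, E, G
    "V": {7, 11, 2},  # G, B, D
}


def is_compatible(midi_note, chord):
    """A note can realize a chord if its pitch class belongs to it, in any octave."""
    return midi_note % 12 in CHORDS_IN_C_MAJOR[chord]


def check_level(notes, chords):
    """Check each note of one structural level against the chord aligned with it."""
    return [is_compatible(note, chord) for note, chord in zip(notes, chords)]


print(check_level([60, 64, 67, 62], ["I", "I", "V", "V"]))
print(is_compatible(65, "I"))  # F does not belong to I in C major
```

Because a chord is an abstraction, the check ignores register: any C, E, or G is accepted as a realization of I.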

Some Musical Examples

Here we give three examples to illustrate the system. The example in Figure 8 may be represented on three structural levels.

A = {(*, 3p); V7}

B = {(*, s)}

C = {(p, *); Cr}

S = {A[pr]B[pr]C;

This example illustrates the use of a chord arpeggiation as alphabet at the highest level and the chromatic scale at the lowest level. It also illustrates flexibility in placement of the reference element; in sequences A and B the reference element occurs first in the structure; in sequence C it occurs second.

The example in Figure 9 may also be represented on three structural levels.

inv, 5pr]

B = {(*, n, p); 1}

S = {A[pi](B, 4{(*

(B, {(*)}); 3}b

This example illustrates the use of the operator inv together with pr.

The example in Figure 10 may be represented on four structural levels.

A = {(*,?)}

B = {(*, n); 1, V7}

C = {(*, s)}

S = {A[pr]B[pr]C[pr](D, {(*)}, 2{(*)}); 3}A


Figure 9. Example to illustrate the system. (From Bach, Sinfonia 15, BWV 801.)

This example illustrates the use of a chord progression as alphabet.

The example shown in Figure 11 may be represented as two interleaved sequences of notes, each consisting of two structural levels.

A = {(*, 2p)}

B = {(*, n, p)}

C = {(*, 2s)}

S1 = {A[2pr, inv]B; 3}

S2 = {A[pr]C; 5}

S = S1[alt 1, 1]S2, D

This example illustrates the use of the alternation operation.

Generation of a Pitch Sequence From its Stored Representation

It is assumed that, as reflected in the above formalism, sequence structures and their associated alphabets are retained in parallel at different hierarchical levels. It is further assumed that the observer most commonly generates a sequence of notes from its stored representation in a "top-down" fashion. The reference element is first applied to the highest level, thus realizing a sequence of notes at this level. These notes in turn serve as reference elements for the realization of a sequence of notes at the next-lower level (through the action of a compound operator or operators). This process is continued until the sequence of notes at the lowest level is realized.
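The top-down process can be sketched in a few lines. The sketch assumes MIDI note numbers, treats n, p, and s as steps of +1, -1, and 0 within an alphabet, and lets every note realized at one level serve as a reference element for the level below. The higher-level structure used here is a hypothetical arpeggiation chosen for illustration; the lower-level structure has the same form as {(p, *); Cr}. All function names are assumptions.

```python
# Sketch only: MIDI numbers, the operator table, and all names are assumptions.
STEP = {"n": 1, "p": -1, "s": 0}  # next, predecessor, same

CHROMATIC = list(range(128))
C_MAJOR_TRIAD = [m for m in range(128) if m % 12 in (0, 4, 7)]


def realize(structure, alphabet, reference):
    """Realize one structure in one alphabet, anchoring '*' at the reference note."""
    star = structure.index("*")
    anchor = alphabet.index(reference)
    notes = [None] * len(structure)
    notes[star] = reference
    position = anchor
    for i in range(star + 1, len(structure)):  # elements after the reference
        position += STEP[structure[i]]
        notes[i] = alphabet[position]
    position = anchor
    for i in range(star - 1, -1, -1):          # elements before the reference
        position += STEP[structure[i]]
        notes[i] = alphabet[position]
    return notes


def generate_top_down(levels, reference):
    """levels: (structure, alphabet) pairs, highest level first."""
    current = [reference]
    realized = []
    for structure, alphabet in levels:
        next_level = []
        for ref in current:  # each note is a reference element for the next level
            next_level.extend(realize(structure, alphabet, ref))
        realized.append(next_level)
        current = next_level
    return realized


levels = [
    (["*", "n", "n", "n"], C_MAJOR_TRIAD),  # hypothetical higher-level structure
    (["p", "*"], CHROMATIC),                # same form as {(p, *); Cr}
]
for depth, notes in enumerate(generate_top_down(levels, 60)):
    print(depth, notes)
```

The higher level yields the arpeggiated triad, and the lower level precedes each of its notes with a chromatic lower neighbor.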

Figure 10. Example to illustrate the system. (From Mozart, Sonata, K. 300i.)

Figure 11. Example to illustrate the system. (Sequences S1 and S2 are presented interleaved in time. From Beethoven, Six Variations on the Duet "Nel cor piu non mi sento" from Paisiello's La Molinara.)

This system has the consequence that the notes occurring at the highest level should be recalled best and those occurring only at the lowest level should be recalled least well. This is for two main reasons. First, when retrieval occurs in a top-down fashion, in order to retrieve notes at a lower level, the higher-level notes must already have been retrieved. Thus, if a retrieval error occurs at a higher level, this will be reflected in further errors at all lower levels. Second, regardless of the direction of the retrieval process, once the full sequence of notes has been retrieved, the higher up a note is represented, the more often it is represented. This redundancy again increases the probability of accurate recall for the higher-level notes. Such an emphasis on the higher-level notes, which is a consequence of this system, is in accordance with musical intuitions and also with assumptions generally made by music theorists.
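The redundancy argument can be made concrete with a small count. The sketch assumes a hypothetical two-level realization given as MIDI note numbers, and identifies a note by its pitch alone, which is adequate only because no pitch recurs in this example.

```python
from collections import Counter

# Sketch only: the two levels below are hypothetical MIDI note lists.
levels = [
    [60, 64, 67, 72],                  # higher level
    [59, 60, 63, 64, 66, 67, 71, 72],  # lower level, elaborating the level above
]


def representation_counts(levels):
    """Count on how many hierarchical levels each note is represented."""
    counts = Counter()
    for notes in levels:
        counts.update(set(notes))  # set(): count a level at most once per note
    return counts


counts = representation_counts(levels)
print(sorted(counts.items()))
```

Notes that belong to the higher level are counted twice, whereas the neighbor notes introduced only at the lower level are counted once.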

The Acquisition of a Representation

This section is concerned with the processes whereby the listener acquires a representation from the pattern of sounds that he or she hears. It is assumed that an initial set of groupings is formed on the basis of simple perceptual mechanisms, following which more complex mechanisms are invoked.

One of the most powerful principles involved in grouping a sequence of items is temporal proximity. This has been shown using a variety of stimulus materials (Bower & Springston, 1970; Bower & Winzenz, 1969; Dowling, 1973a; Handel, 1973; McLean & Gregg, 1967; Mueller & Schumann, 1894; Restle, 1972). A study addressed to this issue specifically with regard to pitch sequences was performed by Deutsch (1980). Subjects were presented with sequences of 12 notes which they recalled in musical notation. In the first experiment half of the sequences were structured in accordance with the present model, such that a higher-level subsequence of four elements acted on a lower-level subsequence of three elements. The remaining sequences were unstructured. Sequences were presented either with no temporal segmentation, with segmentation in groups of three (i.e., in accordance with tonal structure), or with segmentation in groups of four (i.e., in conflict with tonal structure). It was found that the level of recall for the structured sequences was very high in the absence of temporal segmentation, and even higher when segmentation was in accordance with tonal structure. The recall level was much lower for structured sequences that were segmented in conflict with tonal structure, as it was for the unstructured sequences. Analyses of serial position curves and transition shift probabilities demonstrated that the subjects were grouping these pitch sequences in accordance with temporal proximity rather than tonal structure when the two were placed in conflict. A second experiment examined the effects of compatible and incompatible segmentation for sequences that were hierarchically structured such that the lower-level subsequences consisted of either groups of three or groups of four. In both cases recall was excellent when temporal segmentation was in accordance with tonal structure, and poor when temporal segmentation conflicted with tonal structure.

This study demonstrates that even with sequences whose tonal structure is so clear as to produce a very high level of recall in the absence of temporal segmentation, segmentation in conflict with tonal structure essentially obliterates the listener's ability to exploit this structure to produce a parsimonious representation. This emphasizes the importance of low-level perceptual grouping in the induction of a sequence representation.

Another principle involved here is proximity along the pitch dimension. There is a strong tendency to group together elements that are proximal in pitch, and to separate those that are farther apart. For this reason, pitch separation between two melodic lines is required to achieve the perception of pseudopolyphony (Dowling, 1973b), and so for the encoding of a representation involving the alternation operation.
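The two segmentation conditions of the first experiment can be sketched by comparing group boundaries. The sketch assumes a 12-note sequence whose tonal structure consists of four groups of three, and simply asks whether every temporal pause coincides with a structural boundary; the function names are hypothetical, and no recall data are modeled.

```python
def boundaries(length, group_size):
    """Positions at which a new group starts (position 0 excluded)."""
    return set(range(group_size, length, group_size))


def segmentation_agrees(length, structural_group, temporal_group):
    """True if every temporal pause falls on a structural group boundary."""
    pauses = boundaries(length, temporal_group)
    return pauses <= boundaries(length, structural_group)


print(segmentation_agrees(12, 3, 3))  # pauses after every third note
print(segmentation_agrees(12, 3, 4))  # pauses after every fourth note
```

With pauses after every fourth note, the pauses at positions 4 and 8 cut across the structural groups, which is the condition in which recall was poor.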

Processing difficulties have also been shown to occur between temporally adjacent notes that are widely separated in pitch (Bregman, 1978; Bregman & Campbell, 1971; Deutsch, 1972; Van Noorden, 1975). Thus, a prerequisite for the formation of coherent perceptual groupings is that the pitch separation between the notes within a group should not be too large. If a sequence of notes is presented such that adjacent subsequences are in different pitch ranges, the listener will tend very strongly to form groupings in accordance with pitch proximity. If the pitch ranges of adjacent subsequences are too far apart, the listener may be unable to integrate the key elements of these subsequences, and so be unable to form higher-order linkages between them. Optimally for our purposes, therefore, adjacent subsequences should be composed of notes that differ somewhat in pitch range, but not so much as to prevent the formation of higher-order linkages between the key elements of these subsequences.4

Perceptual groupings are also likely to be formed on the basis of loudness, timbre, or spatial location. As discussed above for the case of pitch, substantial differences along such dimensions will act as powerful grouping principles; however, if the differences are too large, the listener may be unable to integrate the key elements of the different subsequences. So again, the optimal perceptual condition here is some difference along the given dimension, but not too large a difference.

In considering low-level perceptual factors that lead the listener to choose an element in a subsequence as the dominant element, a similar argument applies. If this element differs from the others along some dimension (e.g., if it is higher, louder, or has a distinctive timbre), it will assume prominence. However, if this difference is too large, it will instead be dissociated from the other elements in the group. (This point has been made by Cooper & Meyer, 1960.) In general, patterns of metrical stress provide strong cues for the formation of tonal hierarchies.

4 This issue is a thorny one, and the reader is referred to Deutsch (in press-a, in press-b) for a discussion of the conditions under which perceptual integration of sequences of notes that are far apart in pitch is made possible. This also includes a discussion of octave equivalence effects.

Other simple perceptual principles are also involved. For example, sequences whose components combine to form a unidirectional pitch change are likely to be perceived as a group. This may be regarded as an example of the principle of good continuation (Divenyi & Hirsh, 1974; McNally & Handel, 1977; Nickerson & Freeman, 1974; Van Noorden, 1975; Warren & Byrnes, 1975). Further, if the pitch contour is repeatedly presented, the listener will tend to form groupings on the basis of this identity of contour. Dowling (1978) has made the point that contour, independent of either interval size or number of steps along a scale, is treated as a perceptual attribute in music.
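The notion of contour as an attribute independent of interval size can be sketched as follows. The sketch assumes MIDI note numbers and two hypothetical melodies; contour is taken to be the sequence of directions of successive pitch changes.

```python
def contour(pitches):
    """Direction of each successive pitch change: +1 up, -1 down, 0 repeated."""
    return [(b > a) - (b < a) for a, b in zip(pitches, pitches[1:])]


melody_a = [60, 62, 64, 62]  # hypothetical MIDI notes, small intervals
melody_b = [60, 67, 72, 65]  # larger intervals, same up-up-down shape
print(contour(melody_a))
print(contour(melody_a) == contour(melody_b))
```

The two melodies share no interval sizes, yet they have the same contour and so could be grouped on that basis.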

In addition to the general, rather primitive perceptual principles we have described, the encoding of a sequence representation must also involve complex processes, in which the listener draws on his expectations about a given musical style. Perhaps the most important of these is the process of key attribution. An extended discussion of how this is achieved is beyond the scope of the present paper. It is sufficient to note here the experimental evidence that key attribution is readily and quickly accomplished, and on the basis of very little information (Cohen, Note 1; Cuddy, Cohen, & Miller, 1979).

Once a key has been attributed, it is assumed that the listener searches for notes that are prominent within the key as candidates for inclusion in higher-level subsequences. It is generally accepted in the theory of tonal music that the first, third, fifth, and eighth scale degrees, forming the tonic triad, have prominence or conceptual priority over the other scale degrees. (Thus, for instance, in the key of C the notes C, E, and G have prominence over the other notes.) The remaining notes in the diatonic scale in turn have prominence over the rest of the notes in the chromatic scale (see also Krumhansl, 1979). Thus, in representations of simple tonal music, the tonic triad is most likely to be traversed at the highest structural level, and the chromatic scale at the lowest level. It is assumed that the listener makes use of this knowledge in assigning notes to different structural levels.

Once such a preliminary mapping has taken place, it is assumed that the listener attempts to form representations in which sequence structures are repeated at any hierarchical level. That is, the more often a structure is perceived as repeating at a given level, the greater the probability that it will be encoded at that level (see also Simon & Sumner, 1968).
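The ordering of prominence described here can be sketched as a three-way ranking. The sketch assumes a major key, MIDI note numbers, and pitch classes measured in semitones above the tonic; the rank values and names are illustrative assumptions, not a claim about how the listener computes them.

```python
# Sketch only: a major key is assumed, and all names are hypothetical.
TONIC_TRIAD = {0, 4, 7}            # first, third, and fifth degrees (and octaves)
DIATONIC = {0, 2, 4, 5, 7, 9, 11}  # major scale, in semitones above the tonic


def prominence_rank(midi_note, tonic_pc):
    """0 = tonic triad, 1 = remaining diatonic notes, 2 = remaining chromatic notes."""
    interval = (midi_note - tonic_pc) % 12
    if interval in TONIC_TRIAD:
        return 0
    if interval in DIATONIC:
        return 1
    return 2


# In C major (tonic pitch class 0): C, D, E, F sharp, G
print([prominence_rank(note, 0) for note in (60, 62, 64, 66, 67)])
```

A listener using such a ranking would try notes of rank 0 first as candidates for the highest structural level.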

So far we have been viewing the listener as generating a single representation for each pitch sequence. However, segments of music are often amenable to more than one analysis, and it can be shown that composers exploit such ambiguities. For example, two adjacent notes may clearly belong to separate groupings when a theme is first presented, but later the relationship between these notes may assume importance. Indeed, it has been argued that such structural ambiguity contributes importantly to interest in music (Lerdahl & Jackendoff, 1977; Meyer, 1973; Narmour, 1977). Given such evidence we assume that the listener often sets up multiple representations in parallel. At any one time, the representation that is most parsimonious, or that is most in accordance with perceptual grouping mechanisms, is most likely to be realized. However, given a change in the stimulus configuration (for example in its temporal patterning) an alternative representation may be realized instead.

Meyer (1973) presents a good example of such structural ambiguity. The theme of the first movement of Mozart's Sonata in A major (part of which is notated above) is given in Figure 12(a). Meyer observes that as this theme is presented, the descending fourths E-B and D-A are not perceived by the listener, since these melodic intervals cross perceptual group boundaries as determined by the configuration as a whole. However, the potential for a representation that exploits the repeated descending fourths is present and is actualized in a late variation, shown in Figure 12(b). Here the pattern of temporal relationships is such as to induce the listener to perceive an alternative representation instead.

Figure 12. Panel a: Theme of first movement of Mozart, Sonata K. 300i. Panel b: Variation exploiting an alternative representation. (Adapted from Meyer, 1973.)

This discussion of multiple representations may be related to the model proposed by Restle (1979) of the perception of motion configurations. Restle points out that a given display may potentially be represented in a large number of different ways, and can be thought of as ambiguous in principle. However, many interpretations, though possible, are not seen. Restle argues that the observer will actualize the interpretation that has the minimum information load. If two or more interpretations have equal and minimal information loads, then both these interpretations will be seen, and the display will be ambiguous in practice.
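The selection rule attributed to Restle can be sketched as a minimum over candidate interpretations. The load values below are hypothetical numbers chosen for illustration; how information load is actually measured is not specified here.

```python
def actualized_interpretations(loads):
    """Return the interpretation(s) that share the minimum information load."""
    minimum = min(loads.values())
    return sorted(name for name, load in loads.items() if load == minimum)


# Hypothetical load values, for illustration only.
print(actualized_interpretations({"a": 5, "b": 3, "c": 7}))  # one winner
print(actualized_interpretations({"a": 3, "b": 3, "c": 7}))  # a tie
```

A single result corresponds to a display that is ambiguous only in principle; more than one result corresponds to a display that is ambiguous in practice.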

One may also view the acquisition of a sequence representation as an ongoing process in which the listener, when presented with an initial sequence of pitch events, generates a set of alternative representations, some of which are confirmed by later pitch events and others of which are discarded. The later events in turn combine with earlier events to form the basis for a set of more elaborate representations; again, some of which are confirmed by later pitch events and others discarded. This process of generating successively more elaborate representations and eliminating earlier ones may be quite prolonged; but ultimately the listener achieves a set of alternative representations and their preference weightings.

This view is in accordance with the "implication-realization" model of Meyer (1973). Meyer argues that an implicative relationship is one in which a pitch event, called the generative event, is patterned in such a way that reasonable inferences can be made as to how the event is to be continued. A pitch event that is implied by a generative event may itself become a generative event at a higher level. Meyer argues that in forming such implications the listener relies in large part on principles such as good continuation at each hierarchical level. For example, a linear pattern (i.e., based on a diatonic scale) at one level may imply a further event, which when realized in turn implies a continuation of an arpeggiated triad at a higher level.

Processing Advantages of the System

The system proposed here has several processing advantages. The first involves redundancy of representation. It has been shown by Restle (1970) and Restle and Brown (1970) that when a sequence of elements is composed of subsequences that are linked together only by rule systems, recall is best for elements at the lowest level, and progressively poorer for elements at progressively higher levels. It follows that if no higher-level sequences of notes were realized in music, we should expect musical segments to be recalled in fragmentary fashion: The listener would be most likely to make errors at the highest-level locations. The present system avoids this problem, since the higher up a note or sequence of notes is represented, the more often it is represented. This has the consequence that higher-level sequences serve to cement lower-level sequences together.

A second processing advantage involves the ability to invoke distinct alphabets at different structural levels. This primarily concerns the process whereby the listener acquires a representation from the pattern of sounds that he hears. The presence of distinct alphabets at different structural levels helps to separate out the sequences of notes associated with each level. Such an advantage is implied in statements by music theorists who advise, for example, that lower-level notes should be chromatically altered under certain circumstances to disambiguate the hierarchical structure of a melody. In the example in Figure 13(a), for instance, the chromatic alterations make the hierarchical structure easy to perceive. However, if these notes were not chromatically altered, as shown in Figure 13(b), the structure of the passage would become ambiguous (Forte, 1974). A similar line of reasoning applies to the chromatic alterations in the example in Figure 1(a).

Figure 13. Clarification of hierarchical structure by chromatic alterations. (Panel a: From Mozart, Symphony in D major, K. 385. Panel b: Same passage with chromatic alterations removed [adapted from Forte, 1974].)

A third advantage to be gained from this system is that it enables sequence structures together with their associated alphabets to be encoded as chunks. Several investigators have shown that for serial recall of a string of items, performance levels are optimal when such a string is grouped by the observer into chunks of three or four items each (Estes, 1972; Wickelgren, 1967). Thus on the present system if a string of operators together with an alphabet were grouped together in chunks of three or four, superior recall would be expected, in comparison with a system in which each operator is encoded independently. When segments of tonal music are notated on the present system, there emerges a very high proportion of chunks of three or four items each (e.g., {(*, p); 1} or {(*, p, n); Cr}). This is exemplified by the examples in the present paper. As pitch sequences become more elaborate, they are represented on a larger number of hierarchical levels, but the basic chunk size does not appear to vary with changes in sequence complexity. This chunking feature therefore serves to reduce memory load.

A further processing advantage that arises from a system in which strings of operators are chunked together is that it enables representations to be created whose parts form configurations that are in accordance with laws of figural goodness (Wertheimer, 1923). For example, a structure consisting of operators of the same type (e.g., n, n², n) will produce a sequence that exhibits good continuation. Evidence has been obtained that pitch sequences are more efficiently perceived when their components combine to produce unidirectional pitch changes than when they do not (Divenyi & Hirsh, 1974; McNally & Handel, 1977; Nickerson & Freeman, 1974; Van Noorden, 1975; Warren & Byrnes, 1975).

Similarly, it has been shown in a number of contexts that sequences are more efficiently perceived when their components are proximal in pitch than when they are spaced farther apart (Bregman, 1978; Bregman & Campbell, 1971; Deutsch, 1975, 1978a; Dowling, 1973b; Van Noorden, 1975). This is a manifestation of the principle of proximity. In all the above work, however, only proximity along a single pitch scale (corresponding to log frequency) was considered. This principle may be extended to the use of scales based on abstract alphabets as well (see also Longuet-Higgins, 1978).

When segments of tonal music are represented in the present system, there emerges a very large proportion of single steps (n's or p's) in the representations. This is exemplified by the examples in the present paper. Double steps (n²'s or p²'s) also sometimes occur; but steps larger than these are rare. This is made possible only through the use of multiple pitch alphabets. For example, if only one alphabet were allowed, the pitch sequence shown in Figure 1 would have to be represented as

{{(*, n, n³, n, n², n, n⁴, n); Cr}; 7}C

However, with the use of the triadic alphabet in conjunction with the chromatic alphabet, this pitch sequence may be represented as

B = {(p, *); Cr}

S = {A[pr]B;

It can be seen that only single steps are employed in this second representation. The present system, therefore, by providing for the simultaneous invocation of distinct alphabets at different structural levels, enables the listener to be presented with melodic patterns of considerable richness and variety, while at the same time enabling an encoding mainly in terms of proximal relationships.

In addition to conforming with the principle of proximity, the fact that one step size is used much more frequently than others also acts to reduce processing load. This point has been made in a related context by Dowling (1978).
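The difference in step sizes between the two encodings can be checked with a short sketch. It takes the chromatic step sizes 1, 3, 1, 2, 1, 4, 1 from the single-alphabet representation, and assumes MIDI note numbers with a starting pitch of 59, chosen here only so that every second note falls on a C major triad. The triad alphabet and all names are hypothetical.

```python
# Sketch only: MIDI numbering, the starting pitch, and all names are assumptions.
C_MAJOR_TRIAD = [m for m in range(128) if m % 12 in (0, 4, 7)]


def decode(start, steps):
    """Rebuild a pitch sequence from a starting pitch and chromatic step sizes."""
    notes = [start]
    for step in steps:
        notes.append(notes[-1] + step)
    return notes


single_alphabet_steps = [1, 3, 1, 2, 1, 4, 1]  # step sizes in the chromatic alphabet
pitches = decode(59, single_alphabet_steps)

# Two-alphabet view: every second note moves by steps in the triad alphabet,
# and each is preceded by its predecessor in the chromatic alphabet.
upper = pitches[1::2]
triad_steps = [
    C_MAJOR_TRIAD.index(b) - C_MAJOR_TRIAD.index(a) for a, b in zip(upper, upper[1:])
]
neighbor_steps = [hi - lo for lo, hi in zip(pitches[0::2], pitches[1::2])]

print(max(single_alphabet_steps), triad_steps, neighbor_steps)
```

In the single chromatic alphabet the largest step spans four semitones, whereas in the two-alphabet view every step is a single step in its own alphabet.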

Discussion

The present model may be related to representations of hierarchical structure proposed by music theorists. The most influential work in this field is that of Heinrich Schenker (1868-1935) who proposed a hierarchical system for tonal music that has points of similarity to the system proposed by Chomsky for linguistics (Chomsky, 1963). (In fact Schenker acknowledged that his ideas were inspired by the work of C. P. E. Bach [1714-1788] who in his Essay on the True Art of Playing Keyboard Instruments detailed the processes by which a simple musical event may be replaced by a more elaborate musical event that expresses the same basic content [Bach, 1949].) In Schenker's system music is regarded as a hierarchy in which pitch events at any given level are considered "prolonged" by sequences of pitch events at the next-lower level. Three basic levels are distinguished (though several hierarchical orderings may be found within each level). First there is the foreground, or surface representation; second there is the middleground; and third there is the background, or Ursatz. The Ursatz is considered to be a prolongation of the triad (Schenker, 1956, 1973).

Schenker's theory is based primarily on the concept that harmonic structure determines melodic structure; as such it has been criticized by other theorists who argue that melody often acts independently of harmony. A further criticism is of the rigid and 'a priori' nature of the Ursatz, which Schenker considered immune to change. In addition, his critics have argued that by positing only one structural possibility for a piece, Schenker's scheme is too inflexible, since multiple interpretations are often indicated. Another criticism is that Schenkerian analysis does not consider the importance of relationships formed within groupings: Only the hierarchical nature of the representation is considered. It should be noted that the present system does not run into any of the above difficulties. For literature related to Schenkerian analysis, see particularly Lerdahl and Jackendoff (1977), Meehan (1979), Meyer (1973), Narmour (1977), Salzer (1962), and Yeston (1977).

The general characteristics of the present hierarchical system may also be compared with those of systems proposed by others for the representation of visual arrays. Winston (1975) has proposed that visual scenes are represented as structures consisting of many embedded levels of organization. Restle (1979) has argued that certain moving configurations can best be represented as hierarchies, in which the motion of a point or points is described with reference to another set of points, which are themselves in motion with reference to a third set of points, and so on.

Bower and Glass (1976) have proposed that pictures are represented as structural hierarchies that are composed of related parts, each part corresponding to relationships among features at a lower level of analysis. They further assume that relationships within each part follow gestalt rules such as proximity and good continuation. As evidence for this, they showed that fragments of a picture that formed good patterns served as strong retrieval cues for redintegrating memory for the entire picture, whereas equally large fragments that did not form good patterns served as weak retrieval cues. Further, memory confusions occurred more often between patterns containing the same structural units than between patterns containing different structural units.

Palmer (1977) has also proposed that visual shapes are represented as a hierarchy of structures, whose parts serve as structures at the next level down in the hierarchy. He also obtained evidence that we tend to form representations in which elements at each structural level are organized in accordance with laws of figural goodness such as proximity. When subjects were asked to divide figures into parts, they chose organizations that were most in accordance with the principle of proximity. Further, verification that a part was contained in a figure was faster, the greater the degree of goodness of the part within the figure. In addition, the time taken to synthesize a figure from two parts was shorter when there was a high degree of goodness of the parts within the figure.

The above findings in the case of vision lead us to speculate that the type of model proposed here may be applied to the internal representation of patterns beyond those of tonal music. For example, an analogous model could be proposed for the internal representation of the environment (Lynch, 1960; Chase & Chi, in press). It is unlikely that tonal music has evolved to accord with an arbitrary set of rules; rather it would be expected to reflect general principles of cognitive organization.

Reference Note

1. Cohen, A. Inferred sets of pitches in melodic perception. Cognitive structure of musical pitch. Symposium presented at the meeting of the Western Psychological Association, San Francisco, April 1978.

References

Attneave, F., & Olson, R. K. Pitch as a medium: A new approach to psychophysical scaling. American Journal of Psychology, 1971, 84, 147-165.

Bach, C. P. E. Essay on the true art of playing keyboard instruments (W. J. Mitchell, Ed. and trans.). New York: Norton, 1949.

Bjork, R. A. All-or-none subprocesses in the learning of complex sequences. Journal of Mathematical Psychology, 1968, 5, 182-195.

Bower, G. H., & Glass, A. L. Structural units and the redintegrative power of picture fragments. Journal of Experimental Psychology: Human Learning and Memory, 1976, 2, 456-466.

Bower, G. H., & Springston, F. Pauses as recoding points in letter series. Journal of Experimental Psychology, 1970, 83, 421-430.

Bower, G. H., & Winzenz, D. Group structure, coding, and memory for digit series. Journal of Experimental Psychology Monograph, 1969, 80(2, Pt. 2).

Bregman, A. S. The formation of auditory streams. In J. Requin (Ed.), Attention and performance VIII. Hillsdale, N.J.: Erlbaum, 1978.

Bregman, A. S., & Campbell, J. Primary auditory stream segregation and perception of order in rapid sequences of tones. Journal of Experimental Psychology, 1971, 89, 244-249.

Chase, W. G., & Chi, M. T. H. Cognitive skill: Implications for spatial skill in large-scale environments. In J. Harvey (Ed.), Cognition, social behavior and the environment. Potomac, Md.: Erlbaum, in press.

Chomsky, N. Formal properties of grammars. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology (Vol. 2). New York: Wiley, 1963.

Cooper, G., & Meyer, L. The rhythmic structure of music. Chicago: University of Chicago Press, 1960.

Craik, F. I. M., & Lockhart, R. S. Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 1972, 11, 671-684.

Cuddy, L. L., & Cohen, A. J. Recognition of transposed melodic sequences. Quarterly Journal of Experimental Psychology, 1976, 28, 255-270.

Cuddy, L. L., Cohen, A. J., & Miller, J. Melody recognition: The experimental application of musical rules. Canadian Journal of Psychology, 1979, 33, 148-157.

Deutsch, D. Music recognition. Psychological Review, 1969, 76, 300-307.

Deutsch, D. Octave generalization and tune recognition. Perception & Psychophysics, 1972, 11, 411-412.

Deutsch, D. Musical illusions. Scientific American, 1975, 233, 92-104.

Deutsch, D. Memory and attention in music. In M. Critchley & R. A. Henson (Eds.), Music and the brain. London: Heinemann, 1977.

Deutsch, D. Delayed pitch comparisons and the principle of proximity. Perception & Psychophysics, 1978, 23, 227-230. (a)

Deutsch, D. Interactive effects in memory for harmonic intervals. Perception & Psychophysics, 1978, 24, 7-10. (b)

Deutsch, D. The processing of structured and unstructured tonal sequences. Perception & Psychophysics, 1980, 28, 381-389.

Deutsch, D. The processing of pitch combinations. In D. Deutsch (Ed.), The psychology of music. New York: Academic Press, in press. (a)

Deutsch, D. Grouping mechanisms in music. In D. Deutsch (Ed.), The psychology of music. New York: Academic Press, in press. (b)

Divenyi, P. L., & Hirsh, I. J. Identification of temporal order in three-tone sequences. Journal of the Acoustical Society of America, 1974, 56, 144-151.

Dowling, W. J. Rhythmic groups and subjective chunks in memory for melodies. Perception & Psychophysics, 1973, 14, 37-40. (a)

Dowling, W. J. The perception of interleaved melodies. Cognitive Psychology, 1973, 5, 322-337. (b)

Dowling, W. J. Scale and contour: Two components of a theory of memory for melodies. Psychological Review, 1978, 85, 342-354.

Ernst, G. W., & Newell, A. GPS: A case study in gen-erality and problem solving. New York: AcademicPress, 1969.

Estes, W. K. An associative basis for coding and or-ganization in memory. In A. W. Melton & E. Martin(Eds.), Coding processes in human memory. Wash-ington, D.C: Winston, 1972.

Forte, A. Tonal harmony in concept and practice (2nded.). New York: Holt, Rinehart & Winston, 1974.

Frances, R. La perception de la musique. Paris: Vrin,1958.

Greeno, J. G., & Simon, H. A. Processes for sequenceproduction. Psychological Review, 1974, 81, 187-196.

Handel, S. Temporal segmentation of repeating auditory patterns. Journal of Experimental Psychology, 1973, 101, 46-54.

Hanson, A. R., & Riseman, E. M. (Eds.). Computer vision systems. New York: Academic Press, 1978.

Kotovsky, K., & Simon, H. A. Empirical tests of a theory of human acquisition of concepts of sequential events. Cognitive Psychology, 1973, 4, 399-424.

Krumhansl, C. L. The psychological representation of musical pitch in a tonal context. Cognitive Psychology, 1979, 11, 346-374.

Krumhansl, C. L., & Shepard, R. N. Quantification of the hierarchy of tonal functions within a diatonic context. Journal of Experimental Psychology: Human Perception and Performance, 1979, 5, 579-594.

Leeuwenberg, E. L. A perceptual coding language for visual and auditory patterns. American Journal of Psychology, 1971, 84, 307-349.

Lerdahl, F., & Jackendoff, R. Toward a formal theory of music. Journal of Music Theory, 1977, 21, 111-172.

Longuet-Higgins, H. C. The perception of music. Interdisciplinary Science Reviews, 1978, 3, 148-156.

Lynch, K. The image of the city. Cambridge, Mass.: Harvard University Press, 1960.

McLean, R. S., & Gregg, L. W. Effects of induced chunking on temporal aspects of serial retention. Journal of Experimental Psychology, 1967, 74, 455-459.

McNally, K. A., & Handel, S. Effect of element composition on streaming and the ordering of repeating sequences. Journal of Experimental Psychology: Human Perception and Performance, 1977, 3, 451-460.

Meehan, J. R. An artificial intelligence approach to tonal music theory. Proceedings of the Annual Conference: Association for Computing Machinery, 1979, 116-120.

Meyer, L. B. Explaining music: Essays and explorations. Berkeley: University of California Press, 1973.

Miller, G. A., & Chomsky, N. Finitary models of language users. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology (Vol. 2). New York: Wiley, 1963.

Miller, G. A., Galanter, E. H., & Pribram, K. H. Plans and the structure of behavior. New York: Holt, Rinehart & Winston, 1960.

Mueller, G. E., & Schumann, F. Experimentelle Beiträge zur Untersuchung des Gedächtnisses. Zeitschrift für Psychologie und Physiologie der Sinnesorgane, 1894, 6, 81-190; 257-339.

Narmour, E. Beyond Schenkerism. Chicago: University of Chicago Press, 1977.

Navon, D. Forest before trees: The precedence of global features in visual perception. Cognitive Psychology, 1977, 9, 353-383.

Nickerson, R. S., & Freeman, B. Discrimination of the order of the components of repeating tone sequences: Effects of frequency separation and extensive practice. Perception & Psychophysics, 1974, 16, 471-477.

Palmer, S. E. Hierarchical structure in perceptual representation. Cognitive Psychology, 1977, 9, 441-474.

Perle, G. Serial composition and atonality (3rd ed.). Berkeley: University of California Press, 1972.

Restle, F. Theory of serial pattern learning: Structural trees. Psychological Review, 1970, 77, 481-495.

Restle, F. Serial patterns: The role of phrasing. Journal of Experimental Psychology, 1972, 92, 385-390.

Restle, F. Coding theory of the perception of motion configurations. Psychological Review, 1979, 86, 1-24.

Restle, F., & Brown, E. Organization of serial pattern learning. In G. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 4). New York: Academic Press, 1970.

Salzer, F. Structural hearing. New York: Dover, 1962.

Schenker, H. Neue musikalische Theorien und Phantasien: Der freie Satz. Vienna: Universal Edition, 1956.

Schenker, H. Harmony (O. Jonas, Ed. and annotator; E. M. Borgese, trans.). Cambridge, Mass.: MIT Press, 1973.

Simon, H. A. Complexity and the representation of patterned sequences of symbols. Psychological Review, 1972, 79, 369-382.

Simon, H. A., & Kotovsky, K. Human acquisition of concepts for sequential patterns. Psychological Review, 1963, 70, 534-546.

Simon, H. A., & Sumner, R. K. Pattern in music. In B. Kleinmuntz (Ed.), Formal representation of human judgment. New York: Wiley, 1968.

Van Noorden, L. P. A. S. Temporal coherence in the perception of tone sequences. Unpublished doctoral dissertation, Technische Hogeschool, Eindhoven, The Netherlands, 1975.

Vitz, P. C., & Todd, T. C. A model of learning for simple repeating binary patterns. Journal of Experimental Psychology, 1967, 75, 108-117.

Vitz, P. C., & Todd, T. C. A coded element model of the perceptual processing of sequential stimuli. Psychological Review, 1969, 76, 433-449.

Warren, R. M., & Byrnes, D. L. Temporal discrimination of recycled tonal sequences: Pattern matching and naming of order by untrained listeners. Journal of the Acoustical Society of America, 1975, 18, 273-280.

Wertheimer, M. Untersuchung zur Lehre von der Gestalt II. Psychologische Forschung, 1923, 4, 301-350.

Wickelgren, W. A. Rehearsal grouping and the hierarchical organization of serial position cues in short-term memory. Quarterly Journal of Experimental Psychology, 1967, 19, 97-102.

Winston, P. H. Learning to identify toy block structures. In R. L. Solso (Ed.), Contemporary issues in cognitive psychology: The Loyola Symposium. Washington, D.C.: Winston, 1973.

Winston, P. H. Learning structural descriptions from examples. In P. H. Winston (Ed.), The psychology of computer vision. New York: McGraw-Hill, 1975.

Yeston, M. (Ed.). Readings in Schenker analysis and other approaches. New Haven, Conn.: Yale University Press, 1977.

Yngve, V. H. A model and an hypothesis for language structure. Proceedings of the American Philosophical Society, 1960, 104, 444-466.

Received December 15, 1980
