Perceived Sound Quality of Sound-reproducing...


Perceived sound quality of sound-reproducing systems

Alf Gabrielsson

Department of Psychology, Uppsala University, Uppsala, Sweden and Department of Technical Audiology, Karolinska Institutet, Stockholm, Sweden

Håkan Sjögren

Department of Technical Audiology, Karolinska Institutet, Stockholm, Sweden

(Received 27 October 1978; revised 19 December 1978)

Perceived sound quality of loudspeakers, headphones, and hearing aids was investigated by multivariate techniques from experimental psychology with the purpose (a) to find out and interpret the meaning of relevant dimensions in perceived sound quality, (b) find out the positions of the investigated systems in these dimensions, (c) explore the relations between the perceptual dimensions and the physical characteristics of the systems, and (d) explore the relations between the perceptual dimensions and overall evaluations of the systems. The resulting dimensions were interpreted as "clearness/distinctness," "sharpness/hardness-softness," "brightness-darkness," "fullness-thinness," "feeling of space," "nearness," "disturbing sounds," and "loudness." Their relations to physical variables were explored by studying the positions of the investigated systems in the respective dimensions. Their relations to overall evaluations were studied, and the implications of the investigations for continued research are discussed.

PACS numbers: 43.66.Lj, 43.88.Md

INTRODUCTION

Equipment for sound reproduction (amplifiers, turntables, tape recorders, loudspeakers, headphones, etc.) is usually described by the manufacturer in terms of physical parameters like power, frequency response, signal-to-noise ratio, various forms of distortion, and so on. For the technically unsophisticated consumer this information is rather useless. Even for the experienced sound engineer such data are not enough to know how the reproduced sound will actually be perceived. The reason for this is that we know too little about the psychophysical relations between the physical parameters and the perceived sound quality.

When trying to describe how different loudspeakers, headphones, etc. sound, people often use a large number of adjectives or other expressions: "clear," "soft," "dark," "hissing," "shrill," "thin," etc., and it is usually obvious which qualities are judged as positive (for instance, "clear") or negative (for instance, "shrill"). In hi-fi magazines and in advertising there also appears an abundance of such expressions in attempts to give an impression of how various equipment actually sounds. These are but a few examples suggesting that perceived sound quality of sound-reproducing systems is a multidimensional phenomenon. It may be assumed that perceived sound quality is constituted by a (limited) number of separate perceptual dimensions, and that it would be possible to give a perceptual description of sound-reproducing systems by stating their positions in such dimensions.

Using this assumption as a starting point, the goals of the research project described below are (a) to find out and interpret the meaning of relevant dimensions entering into perceived sound quality, (b) to find out the positions of the investigated systems in these dimensions, (c) to explore the relations between the perceptual dimensions and the physical characteristics of the systems, and (d) to explore the relations between separate perceptual dimensions and the overall evaluations of the systems given by the listeners. At the present stage most work has been devoted to the first two points, and the investigated systems have included loudspeakers, headphones, and hearing aids. The methods used in the investigations are various forms of multivariate analysis techniques developed in experimental psychology.

Earlier research with similar purposes has been reported by a group of Japanese researchers (Refs. 1-5), Eisler (Ref. 6), McDermott (Ref. 7), Kötter (Ref. 8), Jost (Ref. 9), Gabrielsson, Rosenberg, and Sjögren (Ref. 10), and Staffeldt (Ref. 11). Some comparisons of results are made in Sec. VII.

The present paper represents a summary of a series of experiments which are described in considerably more detail in prepublication reports (Refs. 12-15).

I. METHODS

All experiments below have certain common features as regards stimuli, subjects, judgment methods, and data treatment as follows.

A. Stimuli

The stimuli ("programs") were tape-recorded sections of music, speech, and various sounds from daily life. Each program lasted for about 30 seconds and was as homogeneous as possible within itself with regard to sound level, presence of musical instruments/voices, musical coherence, etc. The programs used within any experiment were chosen to represent widely varying characteristics as regards type of music, instruments, voices, and other sounds.

The programs were played back over loudspeakers, headphones, or hearing aids (generally called "systems" in the following). These were selected for each experiment to represent different characteristics regarding basic construction principles, size, power, frequency response, distortion, and so on.

1019 J. Acoust. Soc. Am. 65(4), Apr. 1979 0001-4966/79/041019-15$00.80 © 1979 Acoustical Society of America 1019

The total number of stimuli in an experiment was the number of programs (P) multiplied by the number of systems (S). The presentation was monophonic (stereophonic, however, in the experiment with headphones). The level of each program was set to correspond approximately to the original level at the recording (the recording conditions were known for most programs) as measured by a precision sound level meter, placed in a position corresponding to that of the listener's head (loudspeaker experiments) or connected with a coupler (the three-volume coupler Brüel & Kjær 4153 for headphones and the IEC 2-cc coupler for hearing aids). The perceived loudness of the different systems reproducing a certain program was equalized as far as possible by matching the output from the systems with a white noise signal input within the octave bands 500 and 1000 Hz as well as in broadband condition (dBA). The equalization was also checked perceptually by the experimenters and by subjects in pilot experiments.

B. Subjects and judgment methods

In each experiment 20-42 normal-hearing subjects were used. For the most part an experiment was performed with two or three subjects at a time. The stimuli (P×S combinations) were presented in a randomized order, different for different groups of subjects. The subjects were instructed to judge the perceived sound quality according to one of the following methods:

1. Adjective ratings

The subjects received lists with a large number of adjectives and were asked to designate how well each adjective characterized the reproduction in question by writing a figure from 0 to 9 for each adjective, where 0 meant that the reproduction had nothing of the quality denoted by the adjective, and 9 that the reproduction had a "maximum" of that quality. There was one list for each P×S combination, with the order of the adjectives differently randomized for each list and each subject. Each subject thus made a total of P×S×A judgments (A for the number of adjectives). In general this required two to four experimental sessions of 1.5-2 h each.

The adjectives were chosen on the basis of results from questionnaires to 40 sound engineers (for loudspeaker and headphone experiments) and to 30 audiologists and 105 people suffering from hearing loss (for experiments with hearing aids). They were given about 200 adjectives and rated them on a certain scale for their suitability as descriptions of how various sound reproductions may be perceived. About 60 adjectives were considered suitable. Most of these were used in the investigations, with some variations between successive experiments based on the results from each preceding experiment.

2. Similarity ratings

In this case the reproductions appeared in pairs (with a silent interval of about one second between the members of the pair), and the subjects were instructed to judge the perceived similarity between the respective two reproductions (systems) on a scale from 100 (perfect similarity) to 0 (minimum similarity). With S systems there are S(S-1)/2 possible pairs, and these multiplied by the number of programs give the total number of cases to be judged. The order of these pairs was randomized, with an interval of about six seconds between each pair. The whole procedure was repeated one or more times (with different randomized orders) to increase the reliability of the judgments. This method was also used in two introductory experiments described earlier (Ref. 10).
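The pair count S(S-1)/2 and the resulting number of similarity judgments can be sketched as follows. This is only an illustration: the function name, the system labels, and the numbers of systems, programs, and repetitions are hypothetical and are not taken from any of the experiments.

```python
from itertools import combinations


def similarity_trials(systems, n_programs, n_repetitions=1):
    """Return (pairs per program, total number of similarity judgments)."""
    # Unordered pairs of distinct systems: S(S-1)/2 of them.
    pairs = list(combinations(systems, 2))
    return len(pairs), len(pairs) * n_programs * n_repetitions


# Hypothetical example: 6 systems, 3 programs, whole procedure run twice.
pairs_per_program, total = similarity_trials(["A", "B", "C", "D", "E", "F"], 3, 2)
print(pairs_per_program, total)  # 15 90
```

Because the pair count grows roughly with the square of S, the number of judgments rises quickly as systems are added.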

3. Free verbal descriptions

As a complement to the above methods the subjects were also asked to describe in their own words how they perceived the sound reproduction for a sample of the actual stimuli. After each experiment they also answered various questions about their judgment principles.

For all methods mentioned above preliminary trials were made before the experiment started, and several relaxation breaks were given. The subjects knew nothing about the type or number of systems appearing in the respective experiment. Some information was given about the programs indicating piece of music, performers, and something about the room where the recording took place.

C. Data treatment

Indices for interrater and intrarater reliabilities of the judgments were obtained using procedures described in Winer (Ref. 16, p. 283).
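As a rough illustration of what an analysis-of-variance reliability index looks like, the sketch below computes two common one-way estimates from a table of ratings (stimuli as rows, raters as columns): the reliability of a single rater and the reliability of the mean over all raters. Whether these are exactly the formulas applied in the experiments is an assumption; the function name and the tiny data set are hypothetical.

```python
def anova_reliability(ratings):
    """One-way ANOVA reliability estimates.

    ratings[i][j] is the rating of stimulus i by rater j (complete table).
    Returns (reliability of a single rater, reliability of the mean of k raters).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Mean square between stimuli and mean square within stimuli (across raters).
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    mean_of_k = 1 - ms_within / ms_between
    return single, mean_of_k


# Hypothetical ratings: three stimuli judged by two raters.
single, mean_of_k = anova_reliability([[1, 2], [3, 4], [5, 6]])
print(round(single, 3), round(mean_of_k, 4))
```

In the setting described here, the rows would be the P×S combinations and the columns the subjects, with one such table per adjective.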

The adjective ratings were subjected to various forms of factor analysis (principal components analysis, realized by the BMD08M computer program). First the arithmetic means of all subjects' ratings were computed for each P×S×A combination. These means were entered into a matrix with P×S combinations as rows and adjectives as columns. Factor analysis was applied to the matrix of correlations between the adjectives over the P×S combinations. (The basic mathematical operation in factor analysis is computing the characteristic roots and vectors (eigenvalues and eigenvectors) of a correlation matrix.) The result is a matrix of factor loadings for the adjectives, obtained after rotating the original solution to a "best" solution according to a "simple structure" criterion. The rotation was made either according to the "varimax" principle (orthogonal rotation) or the "simple-loadings" principle (oblique rotation). The pattern of factor loadings for the adjectives is used for interpreting the meaning of the respective factors, which represent the perceptual dimensions we are looking for. Use is also made of the matrix of factor scores for the P×S combinations, which indicate the positions of the respective P×S combinations within each factor (perceptual dimension). For orientation about factor analysis see Gorsuch (Ref. 17).
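The chain of operations just described (correlation matrix, eigenvalues and eigenvectors, loadings, varimax rotation) can be sketched with NumPy. This is not the BMD08M program; it is a minimal illustration on synthetic data. The function names, the random data, and the choice of two factors are all assumptions made for the example, and factor scores and the oblique rotation are omitted.

```python
import numpy as np


def principal_component_loadings(means, n_factors):
    """Unrotated loadings from the eigenvalues/eigenvectors of the correlation matrix."""
    corr = np.corrcoef(means, rowvar=False)          # adjectives x adjectives
    eigvals, eigvecs = np.linalg.eigh(corr)          # symmetric matrix, ascending roots
    keep = np.argsort(eigvals)[::-1][:n_factors]     # indices of the largest roots
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    explained = eigvals[keep].sum() / corr.shape[0]  # share of total variance
    return loadings, explained


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation using the usual SVD-based iteration."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        gradient = loadings.T @ (rotated ** 3 - rotated @ column_ss / n_vars)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt
        if s.sum() < criterion * (1 + tol):          # no further improvement
            break
        criterion = s.sum()
    return loadings @ rotation


# Hypothetical data: 36 rows (P x S combinations), 6 columns (adjectives),
# generated from two latent dimensions plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(36, 2))
pattern = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
means = latent @ pattern.T + 0.3 * rng.normal(size=(36, 6))

unrotated, explained = principal_component_loadings(means, n_factors=2)
rotated = varimax(unrotated)
print(np.round(rotated, 2))
print(f"variance accounted for: {explained:.1%}")
```

An orthogonal rotation leaves each adjective's communality (the row sum of squared loadings) unchanged; it only redistributes the loadings among the factors so that each adjective tends to load highly on as few factors as possible.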

The similarity ratings were analyzed according to the distance model in multidimensional scaling (Ref. 18) as recently developed in a model dealing with individual differences in multidimensional scaling, INDSCAL (Ref. 19). As applied here, the systems are thought of as points in a Euclidean space of n dimensions. These dimensions represent the perceptual dimensions underlying the similarity judgments. The similarity ratings are taken as indicators of the distances between the systems in the space: the more similar two systems are rated to be, the nearer they should lie in the perceptual space, and vice versa. Different individuals may give different weights to different dimensions, however, and this fact is utilized to provide a unique solution, that is, a unique position of the dimension axes in the n-dimensional space. The perceptual meaning of the dimensions may be interpreted by observing the positions of the systems within the respective dimension and by studying the verbal descriptions of the subjects, especially those subjects who, according to the analysis, give the highest weights to the dimension in question.
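For orientation, the weighted Euclidean distance on which the INDSCAL model rests can be written as below. The symbols are chosen here for illustration and do not appear in the text: x_{jt} is the coordinate of system j on dimension t in the common space, w_{it} is the non-negative weight that subject i gives to dimension t, and d_{jk}^{(i)} is the distance between systems j and k for subject i.

```latex
d_{jk}^{(i)} = \sqrt{\sum_{t=1}^{n} w_{it}\,\bigl(x_{jt} - x_{kt}\bigr)^{2}}
```

Because the weights attach to particular axes, rotating the axes would change the fit of the model, which is why the orientation of the dimensions is determined uniquely in this approach.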

For more details of methods and data treatment see the prepublication reports (Refs. 12-15) and two separate papers (Refs. 20, 21).

Six experiments are described in the following. They do not appear in their chronological order but according to the type of system that was used (loudspeakers, headphones, hearing aids).

II. EXPERIMENT WITH LOUDSPEAKERS

Two experiments with similarity ratings of loudspeakers were reported earlier in this Journal by the authors (Ref. 10). They are followed here by an experiment using adjective ratings and factor analysis.

A. Stimuli and listening conditions

The following five programs were used:

(1) Blomdahl: Music from the ballet "Sisyfos" (1954), performed by The Stockholm Philharmonic Orchestra under Antal Dorati. Recorded in the Concert Hall of Stockholm. Gramophone record: Expo Norr Riks Lp 16. Average level about 95 dB SPL.

(2) Glazunov: "Prelude and Fugue in D minor," the ending chorale, performed by Erik Lundkvist on the organ in the (empty) church of Nätra. Gramophone record: PROPRIUS 7707. Average level about 90 dB SPL.

(3) Honegger: "Sept pièces brèves," the beginning of the seventh piece, performed by Hefts Fischer on the Bolin grand piano in the (empty) concert hall of the school of Ekliden, Nacka. Gramophone record: PROPRIUS 7709. Average level about 90 dB SPL.

(4) Swedish folktune "Tycker du om mig?," performed by The Gothenburg Chamber Choir in the (empty) church of Flatås, Gothenburg. Gramophone record: PROPRIUS 7717. Average level about 85 dB SPL.

(5) Male voice, text read by the experimenter, recorded in an anechoic chamber. Average level about 85 dB SPL.

There were nine different reproducing systems:

A: Electrostatic loudspeaker of high quality

B: Omnidirectional loudspeaker of high quality

Bt+: Loudspeaker B, the treble (t) response of the amplifier increased (+) 6 dB at 10000 Hz

Bt-: Loudspeaker B, the treble response of the amplifier decreased (-) 6 dB at 10000 Hz

Bb+: Loudspeaker B, the bass (b) response of the amplifier increased 10 dB at 100 Hz

Bb-: Loudspeaker B, the bass response of the amplifier decreased 10 dB at 100 Hz

Bdist: Loudspeaker B, 30% quadratic distortion added (the distortion was generated in an analog computer)

Bbdist: Loudspeaker B, 50% quadratic distortion added in the bass region below 300 Hz

C: Radio receiver of medium size and quality (the built-in amplifier was used).

There were thus only three different loudspeakers, but the reproductions over loudspeaker B were varied further as described above. The manipulations in the bass region (Bb+, Bb-, and Bbdist) were used only in connection with programs 1 and 2, which extended into a lower frequency region than the other programs. The remaining six reproductions were used for all five programs. The added distortion corresponded to 30% harmonic distortion at the peak level of the respective program. Since the distortion products vary quadratically with the signal level, the distortion at the average signal level is considerably lower.
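The last point can be made concrete with a small calculation. If the distortion products grow with the square of the signal amplitude, their size relative to the signal grows in proportion to the amplitude, so the relative distortion falls by 1 dB for every 1 dB that the signal lies below the peak. The sketch below assumes a purely quadratic nonlinearity; the function name and the level differences of 10 and 20 dB are hypothetical, since the peak-to-average ratios of the programs are not stated here.

```python
def distortion_at_level(peak_distortion_percent, db_below_peak):
    """Relative quadratic distortion (%) for a signal db_below_peak under the peak level."""
    # Relative distortion is proportional to signal amplitude for a quadratic term.
    amplitude_ratio = 10 ** (-db_below_peak / 20)
    return peak_distortion_percent * amplitude_ratio


# 30% at the peak level; the drops of 10 and 20 dB are illustrative only.
for drop in (0, 10, 20):
    print(drop, round(distortion_at_level(30.0, drop), 1))
```

Under these assumptions a passage 20 dB below the peak would carry about 3% distortion instead of 30%.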

Frequency curves for the loudspeakers appear in Fig. 1. The listening room was the same as in our previous loudspeaker experiments (Ref. 10).

B. Procedure

There were in all 36 different P×S combinations (9 systems for each of programs 1-2, and 6 systems for each of programs 3-5). 55 adjectives were used, see Table I, and thus each subject made 36 × 55 = 1980 judgments. There were 20 male subjects, age 20-51 years (only three of them more than 35 years old), recruited from an association of high-fidelity fans.
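The bookkeeping behind these figures is simple enough to check in a few lines. The variable names are chosen here for illustration; the counts themselves are those given above.

```python
# Number of reproducing systems used with each program (programs 1-5).
systems_per_program = {1: 9, 2: 9, 3: 6, 4: 6, 5: 6}
n_adjectives = 55

n_combinations = sum(systems_per_program.values())  # P x S combinations
n_judgments = n_combinations * n_adjectives         # ratings per subject
print(n_combinations, n_judgments)  # 36 1980
```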

The main parts of the instruction were as follows:

"You will listen to various sections of music and speech. The sections appear several times in the experiment but are played over different equipments for sound reproduction each time. For every such case you shall try to describe how you perceive the sound reproduction by writing a figure from 0 to 9 for each of the adjectives on the respective list. 0 means that the reproduction has nothing of the quality denoted by the adjective. 9 means, on the contrary, that the reproduction has a "maximum" of that quality. For levels between these extremes you use values in between: the more of the quality, the higher number (up to 9); the less of the quality, the lower number (down to 0). Try to spread your judgments over the scale and do not only use a narrow range in the middle. It is very important to observe that the judgments shall refer to the sound reproduction, not to the music (speech) as such."

FIG. 1. Loudspeaker responses measured in a reverberation chamber (loudspeaker A uppermost, B middle, and C lowest). The three curves represent the first, second, and third harmonic of the total radiated power. Test signal: white noise fed through a 30-Hz wide bandpass filter. Zero level: 50 dB re 1 pW for the first harmonic, 40 dB for the second and third harmonic.

TABLE I. Varimax rotated factor loadings for 55 adjectives.

Adjective                                Factor I     Factor II    Factor III
Avlägsen ("distant")                       0.88         0.21        -0.19
Balanserad ("balanced")                   -0.74        -0.58        -0.01
Behaglig ("pleasant")                     -0.70        -0.67        -0.11
Beslöjad ("veiled")                        0.92         0.09         0.21
Brusig ("noisy/hissing")                   0.32         0.31        -0.22
Bullrig ("noisy/rumbling")                 0.27         0.55         0.73
Detaljrik ("rich in details")             -0.91        -0.27         0.01
Diffus ("diffuse")                         0.89         0.30         0.20
Dov ("dull")                               0.55        -0.14         0.76
Dämpad ("subdued/moderated")               0.90        -0.31         0.20
Fyllig ("full/full-toned")                -0.85        -0.27         0.29
Genomskinlig ("transparent")              -0.01         0.31        -0.64
Grumlig ("muddy/confused")                 0.80         0.48         0.28
Grötig ("mushy/thick")                     0.75         0.55         0.25
Gäll ("shrill")                            0.25         0.88        -0.29
Hård ("hard")                              0.27         0.92        -0.09
Ihålig ("hollow")                          0.82         0.46        -0.16
Instängd ("closed/shut up")                0.95         0.18         0.10
Jämn ("uniform/smooth")                   -0.67        -0.63        -0.04
Klar ("clear")                            -0.87        -0.40        -0.16
Kontrastrik ("rich in contrasts")         -0.94        -0.17         0.02
Kraftig ("strong/loud")                   -0.66         0.47         0.47
Kylig ("chilly")                           0.43         0.68        -0.50
Ljus ("bright/light")                     -0.23         0.38        -0.83
Luftig ("airy")                           -0.91        -0.20        -0.16
Matt ("faint/feeble")                      0.96        -0.02        -0.11
Mjuk ("soft")                             -0.39        -0.85         0.11
Mullrande ("rumbling")                     0.17         0.40         0.88
Mustig ("juicy/succulent")                -0.79        -0.11         0.48
Mörk ("dark")                              0.08        -0.21         0.93
Naturtrogen ("true to nature")            -0.84        -0.47         0.01
Nära ("near")                             -0.82        -0.35         0.17
Punktformig ("confined to a point")        0.78         0.30        -0.18
Påträngande ("obtrusive")                  0.26         0.90         0.17
Ren ("clean/pure")                        -0.85        -0.41        -0.08
Rumskänsla ("feeling of room")            -0.90        -0.29         0.01
Skarp ("sharp")                           -0.16         0.85        -0.35
Skrapande ("scraping")                     0.61         0.60        -0.03
Skrikig ("screaming")                      0.40         0.87        -0.11
Skrovlig ("rough/raucous")                 0.75         0.53         0.19
Skrällig ("clashing")                      0.48         0.85         0.00
Skränig ("yelling/vociferating")           0.41         0.89         0.02
Spetsig ("pointed")                        0.27         0.86        -0.35
Sprucken ("cracked")                       0.67         0.67        -0.01
Stark ("loud")                            -0.52         0.68         0.42
Sträv ("harsh")                            0.75         0.49        -0.04
Suddig ("blurred")                         0.86         0.41         0.18
Torr ("dry")                               0.85         0.24        -0.20
Trång ("narrow")                           0.89         0.37         0.04
Tröttande ("tiring")                       0.61         0.74         0.11
Tunn ("thin")                              0.64         0.46        -0.53
Tydlig ("distinct/clear")                 -0.80        -0.53        -0.19
Tät ("dense")                           [illegible]  [illegible]  [illegible]
Vass ("sharp/keen")                        0.24         0.88        -0.26
Öppen ("open")                            -0.92        -0.22        -0.12

C. Results

The interrater reliability for each of the adjectives varied between 0.70 and 0.96 with a median value of 0.85 (however, for two adjectives there were lower reliabilities: "transparent" 0.62, and "dense" 0.47). The results from the factor analysis on the correlations between the adjectives appear in Table I (factor loadings for adjectives) and Table II (factor scores for P×S combinations). Four factors accounted for 90.6% of the total variance (the fourth factor is omitted in the tables since there was only one substantial loading in this factor, see below).

TABLE II. Factor scores for the different reproductions within each program (P). The reproductions are ordered after sign and size of their factor scores at each program within each factor.

Program   Factor I           Factor II          Factor III
P1        C        2.21      Bt+      1.75      A        2.38
          A        0.58      C        1.17      Bb+      2.30
          Bt-      0.36      Bdist    0.99      Bt-      1.69
          Bbdist  -0.02      Bb-      0.99      Bbdist   1.36
          Bb+     -0.08      B        0.96      B        0.91
          Bb-     -0.40      A        0.88      Bdist    0.81
          Bdist   -0.48      Bb+      0.42      Bt+      0.25
          B       -0.54      Bt-      0.33      C        0.04
          Bt+     -1.02      Bbdist   0.01      Bb-     -0.07

P2        C        1.99      C        0.71      A        1.49
          A       -0.51      Bt+      0.58      Bb+      1.31
          Bdist   -0.52      Bb-     -0.40      Bdist    0.24
          Bt-     -0.53      Bdist   -0.45      Bbdist   0.15
          Bb-     -0.76      A       -0.46      Bt-      0.06
          Bt+     -1.12      Bbdist  -0.52      B       -0.02
          Bbdist  -1.15      Bb+     -0.57      Bt+     -0.69
          Bb+     -1.17      B       -0.63      Bb-     -0.70
          B       -1.35      Bt-     -1.11      C       -0.91

P3        C        1.70      C        1.51      A       -0.43
          Bdist    0.21      Bt+      0.93      Bdist   -0.73
          Bt-      0.11      Bdist    0.55      Bt-     -0.81
          A        0.07      A        0.51      C       -0.91
          B       -0.46      B        0.22      B       -1.16
          Bt+     -1.02      Bt-      0.12      Bt+     -1.44

P4        C        1.83      C        0.97      Bdist   -0.25
          A        0.14      Bdist    0.83      Bt-     -0.52
          Bdist    0.07      Bt+      0.43      C       -0.54
          Bt-     -0.50      A        0.04      A       -0.72
          B       -0.94      Bt-     -0.11      Bt+     -1.13
          Bt+     -1.09      B       -0.39      B       -1.15

P5        Bdist    1.94      Bt+     -1.18      A        0.93
          C        1.54      Bdist   -1.38      Bdist    0.55
          A        1.09      C       -1.77      Bt-      0.19
          Bt-      0.51      B       -1.84      B       -0.42
          B       -0.09      A       -1.84      Bt+     -0.78
          Bt+     -0.59      Bt-     -2.27      C       -1.30

Factor I (F I) is interpreted as a general quality factor emphasizing "Clearness/Distinctness," "feeling of space," and "nearness" in the reproduction. As seen in Table I, the highest factor loadings on one side of F I (-0.94 to -0.80) appear for "rich in contrasts," "open," "rich in details," "airy," "feeling of room," "clear," "clean/pure," "full," "true to nature," "near," and "distinct." Moreover, "pleasant" belongs to this side (-0.70). On the other side (0.96 to 0.80) there are "faint/feeble," "closed/shut up," "veiled," "subdued," "diffuse," "narrow," "distant," "blurred," "dry," "hollow," and "muddy/confused." "Tiring" also belongs to this side.

The psychophysical relations behind this factor are suggested by the positions of the P×S combinations in this factor as defined by the factor scores in Table II. In this table the reproductions (systems) are ordered within each program according to the sign and size of their factor scores. It is necessary to compare the positions of the reproductions within each program, since the properties of the programs "as such" also are reflected in the factor scores. It is noted that loudspeaker C is fairly outstanding on the "bad" ("diffuse/closed/distant") side of F I, followed by loudspeaker A or Bdist (in program 5 Bdist is worst). In the case of C this may be due to its relatively narrow bandwidth and high distortion (see Fig. 1). In the case of A the reason is probably its bass boost and relatively high distortion in the bass range. That Bdist appears here is not surprising, of course. The added distortion seems to have the most negative effect for the voice program, while its consequences for a very complex spectrum like that of program 1 are more limited due to masking effects.

On the "good" ("clear/near/open") side loudspeaker B or its variant with increased treble (Bt+) is the leading one. A certain increase of the treble seems to enhance the mentioned qualities in this case (decreasing the treble, as in Bt-, leads in the opposite direction). It is noted that loudspeaker B has relatively low distortion and no bass boost as in loudspeaker A.

Factor II (F II) may be interpreted as "sharpness/hardness-softness." On one side the dominant factor loadings (0.92 to 0.85) appear for "hard," "obtrusive," "yelling," "shrill," "sharp," "screaming," "clashing," and "pointed." "Tiring" also appears here (0.74). On the opposite side "soft" is outstanding (-0.85), followed by "pleasant" (-0.67) and, later, by "true to nature" (-0.47).

In the factor scores for F II there are apparent examples that the properties of the programs "as such" are reflected in these scores. All reproductions of the voice program lie on the "soft" side (negative signs), while all reproductions of program 1 (which is very "aggressive" music) and program 3 ("percussive" piano music in the treble) lie on the "sharp/hard" side (positive signs). However, comparing the positions of the reproductions within each program, it is seen that the "sharpest/hardest" reproductions are generally loudspeaker C, Bt+, Bdist, and Bb-. It seems that increased treble (Bt+), weakened bass reproduction (Bb- and C), and distortion (C, Bdist) are conditions influencing the reproductions in a "sharp/hard" direction. On the other hand, decreased treble (Bt-) and increased bass (Bb+) seem to affect the reproduction in the "soft" direction, as do the broadband reproductions of loudspeakers A and B.

Factor III (F III) is interpreted as a "Brightness-Darkness" dimension. "Bright" has an outstanding factor loading on one side (-0.83), while "dark," "rumbling," "dull," and "noisy/rumbling" are utmost on the other side (0.93 to 0.73). Here again the properties of the programs "as such" may be seen in the factor scores, where all reproductions of program 3 (piano music in the treble) and program 4 ("bright" choir music) belong to the "bright" side (negative signs). Within each program loudspeaker A is extreme on the "dark" side (except at program 4) together with Bb+ at programs 1 and 2. Increased bass response as in those two cases seems to contribute to "darkness." On the other hand, decreased bass response (Bb- and C) and increased treble response (Bt+) probably contribute to "brightness." In general, systems Bt-, Bdist, and Bbdist all lead to "darker" reproductions than the unmanipulated B system.

Factor IV (F IV), finally, is interpreted as a "Disturbance/Noise" factor with only one substantial factor loading: "noisy/hissing" (0.84; all other adjectives had loadings between -0.25 and 0.28). From the corresponding factor scores it was evident that increasing the treble response (Bt+) resulted in a more "noisy/hissing" sound, while reducing the treble response (Bt-) also reduced this disturbing sound.

III. EXPERIMENT WITH HEADPHONES

In this experiment similar judgment and analysis methods were used as in the above experiment. The differences mainly concern the stimuli and listening conditions.

A. Stimuli and listening conditions

The stimuli were five music programs presented stereophonically over each of eight headphones. The programs were as follows:

(1) J.S. Bach: "St. John Passion," the very end of the finale chorale, performed by the Bach choir at Adolf Fredrik, Stockholm. Recorded in the (empty) church of Adolf Fredrik, Stockholm. Gramophone record: PROPRIUS 7741. Sound level 85-97 dBA.

(2) Oscar Peterson's jazz trio (piano, bass, drums), sample from the tune "Something's coming." Recorded in a gramophone studio. Gramophone record: VERVE 2304 062. Sound level 80-90 dBA.

(3) Grieg: "Solveigs sang" from the "Peer Gynt" suite, sung by Grynet Molvig accompanied by piano, bass, and choir. Recorded in the auditorium of Ljungkileskolan, Sweden, synthetic reverberation added afterwards. Gramophone record: PROPRIUS 7739. Sound level 80-88 dBA.

(4) Vivaldi: Excerpt from the "Summer" concerto in "The four seasons," performed by The Academy of St. Martin-in-the-Fields. Gramophone record: ARGO ZRG 654. Sound level 80-90 dBA.

(5) Stravinsky: Excerpt from the end of "The Firebird Suite," performed by the Stockholm Philharmonic Orchestra. Recorded in the Concert Hall of Stockholm. Gramophone record: LJUD (issued by the Swedish Hi-Fi Institute). Sound level 90-97 dBA.

The eight headphones (labeled H1, H2, ..., H8 in the following) represented different technical solutions to the transducer design problem and included electrodynamic, electrostatic, orthodynamic, and piezo-electric systems. They also differed in the way of applying them to the ear, including supra-aural, circum-aural, and open types. Their frequency responses, as measured by the three-volume coupler (Brüel & Kjaer 4153), are given in Fig. 2. Problems concerning the measurement of frequency responses in headphones are discussed in the prepublication report (Ref. 15). The headphones were kept hidden from the subjects, who were not allowed to put on the headphones or adjust them in any way. The experimenter put on and changed headphones, approaching the subject from behind and adjusting the headphones until the subject was satisfied.

FIG. 2. Frequency response of eight headphones measured on IEC coupler, type Brüel & Kjaer 4153.

B. Procedure

There were 20 subjects, 14 males and 6 females, 19-29 years old, representing music students and experienced music listeners. They were not used to listening by headphones and did not consider themselves as high-fidelity fans.

There were [5 programs (P)] × [8 headphones (H)] = 40 P × H combinations. Thirty adjectives were used (see Table III), reduced to that number on the basis of the results in preceding experiments (including those described below). Each subject thus made 40 × 30 = 1200 judgments according to an instruction analogous to that in the loudspeaker experiment above. The data treatment was also similar, but also included various procedures for analysis of variance to test the presence of significant differences between the headphones in the different adjective scales (for details see the prepublication report).
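The size of the judgment task follows directly from the design. As a minimal sketch (the variable names are illustrative and not taken from the study), the combinations and the number of judgments can be enumerated as follows:

```python
from itertools import product

programs = [f"P{i}" for i in range(1, 6)]      # five music programs
headphones = [f"H{i}" for i in range(1, 9)]    # eight headphones
n_adjectives = 30                              # adjective scales (Table III)
n_subjects = 20

# Every program is presented over every headphone.
combinations = list(product(programs, headphones))

judgments_per_subject = len(combinations) * n_adjectives
total_judgments = judgments_per_subject * n_subjects

print(len(combinations))        # 40
print(judgments_per_subject)    # 1200
print(total_judgments)          # 24000
```

The total of 24000 judgments over all subjects is not stated in the text; it is simply 20 subjects times 1200 judgments each.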

C. Results

The inter-rater reliability for most adjectives varied between 0.60 and 0.91 (median 0.75). Furthermore, there were highly significant differences between the headphones in all adjective scales, which also indicated the reliability of the ratings and that a successful selection of adjective scales had been made.
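The reliability coefficient used is not specified in this section. Purely as an illustration, and assuming a Cronbach's alpha type of index with the subjects treated as "items," reliability for one adjective scale could be estimated as sketched below. The function name and the example ratings are hypothetical.

```python
from statistics import variance

def cronbach_alpha(ratings):
    """ratings[s][k]: rating given by subject s to stimulus k on one adjective scale."""
    n_raters = len(ratings)
    # Variance across stimuli, computed separately for each rater.
    rater_variances = [variance(r) for r in ratings]
    # Variance across stimuli of the ratings summed over raters.
    totals = [sum(column) for column in zip(*ratings)]
    total_variance = variance(totals)
    return n_raters / (n_raters - 1) * (1 - sum(rater_variances) / total_variance)

# Hypothetical ratings: 3 subjects x 5 stimuli on a 0-10 scale.
example = [
    [2, 4, 6, 8, 9],
    [3, 4, 5, 9, 8],
    [1, 5, 6, 7, 9],
]
alpha = cronbach_alpha(example)
print(round(alpha, 2))
```

Values near 1 indicate that the subjects order the stimuli in much the same way. Other indices (for instance, average inter-subject correlations) would serve equally well as an illustration.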

In the factor analysis five factors accounted for 85.8% of the total variance. The factor loadings for the adjectives appear in Table III and the factor scores for the P × H combinations in Table IV.

Factor I (F I) is interpreted as "sharpness/hardness-softness," possibly confounded with "loudness." There are a few high factor loadings on each side of the continuum: for "loud," "jarring/grating," "hard," and "sharp/keen" on one side, for "soft" on the other side.

Studying the factor scores for P × H combinations in F I indicates that this factor partly reflects differences between the different programs. The highest factor scores on the "soft" side occur for most reproductions of program 3, which has the lowest sound level among the programs, probably also less energy in the high frequency region. The highest factor scores on the "sharp/hard/loud" side appear for various reproductions of program 5, which has the highest sound level among the programs and a lot of brass instruments and percussion playing in fortissimo. Within each program H6 is the most "soft" one, followed by H5 or H4 (and H8 for program 3), while H3 is fairly outstanding on the "sharp/hard/loud" side. A look at the frequency responses (Fig. 2) suggests that this perceptual character of H3 is related to the prominent peak around 3000-4000 Hz in its frequency curve.
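The program and headphone effects described above can be checked against the Factor I column of Table IV. The sketch below averages the scores per program and per headphone. Averaging is an illustrative summary chosen here, not a procedure reported in the study, and positive scores are taken as the "soft" side because "soft" loads positively on F I in Table III.

```python
from statistics import mean

# Factor I scores from Table IV, keyed by program, then by headphone.
factor_1 = {
    "P1": {"H6": 0.86, "H5": 0.37, "H1": 0.35, "H4": 0.25,
           "H2": 0.23, "H7": -0.11, "H8": -0.48, "H3": -1.42},
    "P2": {"H6": 0.92, "H5": 0.01, "H4": -0.09, "H7": -0.13,
           "H8": -0.18, "H2": -0.24, "H1": -0.50, "H3": -0.96},
    "P3": {"H6": 1.79, "H8": 1.79, "H5": 1.58, "H4": 1.54,
           "H7": 1.53, "H1": 1.28, "H2": 1.27, "H3": 0.33},
    "P4": {"H6": 0.40, "H4": 0.27, "H5": 0.07, "H7": -0.10,
           "H2": -0.12, "H8": -0.13, "H1": -0.37, "H3": -1.82},
    "P5": {"H6": -0.13, "H4": -0.18, "H5": -0.55, "H2": -0.95,
           "H1": -1.10, "H8": -1.38, "H7": -1.40, "H3": -2.53},
}

# Mean score per program (over headphones) and per headphone (over programs).
program_means = {p: mean(scores.values()) for p, scores in factor_1.items()}
headphone_means = {h: mean(factor_1[p][h] for p in factor_1)
                   for h in factor_1["P1"]}

softest_program = max(program_means, key=program_means.get)
sharpest_program = min(program_means, key=program_means.get)
softest_headphone = max(headphone_means, key=headphone_means.get)
sharpest_headphone = min(headphone_means, key=headphone_means.get)

print(softest_program, sharpest_program)      # P3 P5
print(softest_headphone, sharpest_headphone)  # H6 H3
```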

Factor II (F II) is interpreted as "clearness/distinctness." The highest factor loadings on one side of the continuum appear for "clear" and "pure/clean" followed by "true-to-nature" and "feeling of presence." On the opposite side an outstanding high loading occurs for "diffuse."
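The reading of Factor II can be reproduced by sorting the second column of Table III. The sketch below uses the English glosses as dictionary keys, which is a convenience chosen here, and simply ranks the adjectives by loading.

```python
# Factor II loadings from Table III (English glosses used as keys).
factor_2_loadings = {
    "balanced": 0.59, "pleasant": 0.39, "noisy/hissing": -0.10,
    "diffuse": -0.90, "dull": -0.31, "emphasized bass": 0.14,
    "emphasized treble": 0.22, "hissing": 0.04, "full": 0.18,
    "hard": 0.31, "hollow": -0.43, "shut up/closed": -0.35,
    "clear": 0.85, "crackling/crunching": 0.04, "bright/light": 0.07,
    "faint/feeble": -0.55, "soft": -0.06, "rumbling": -0.20,
    "nasal": -0.45, "true to nature": 0.75, "feeling of presence": 0.64,
    "pure/clean": 0.83, "feeling of space": -0.01, "jarring/grating": -0.03,
    "loud": 0.01, "harsh": -0.04, "dry": -0.15, "thin": -0.25,
    "sharp/keen": 0.09, "whistling/whizzing": -0.09,
}

# Rank adjectives from the most positive to the most negative loading.
ranked = sorted(factor_2_loadings, key=factor_2_loadings.get, reverse=True)

clear_side = ranked[:4]     # adjectives defining the "clear" pole
diffuse_side = ranked[-1]   # adjective defining the opposite pole

print(clear_side)
print(diffuse_side)
```

The four highest loadings are those named in the text, and "diffuse" stands alone at the opposite pole.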

The evidence of the factor scores in F II shows that H1 lies highest on the "clear/pure" side within each of the programs, while H5 and H3 lie utmost on the other side (except for H5 at program 3). The order of the remaining headphones varies from program to program. It may be suggested that the relatively more "diffuse" character of H5 is related to the bass boost in its frequency curve. As regards H3 one reason is again probably its pronounced peak around 3000-4000 Hz in the sense that other frequency regions than this are

TABLE III. Factor loadings ("simple loadings" criterion) for 30 adjectives.

                                                    Factors
Adjective                                     I     II    III     IV      V

Balanserad ("balanced")                    0.28   0.59  -0.06  -0.32   0.18
Behaglig ("pleasant")                      0.42   0.39  -0.20  -0.21   0.34
Brusig ("noisy/hissing")                   0.48  -0.10   0.88  -0.01   0.05
Diffus ("diffuse")                         0.12  -0.90  -0.05  -0.17   0.18
Dov ("dull")                               0.18  -0.31  -0.25  -0.71  -0.12
Framhävd bas ("emphasized bass")           0.00   0.14   0.11  -0.93  -0.21
Framhävd diskant ("emphasized treble")    -0.43   0.22   0.39   0.37  -0.16
Fräsande ("hissing")                      -0.04   0.04   0.88   0.07  -0.10
Fyllig ("full/-toned")                     0.22   0.18  -0.13  -0.45   0.47
Hård ("hard")                             -0.76   0.31  -0.12   0.16  -0.31
Ihålig ("hollow")                          0.01  -0.43  -0.10   0.20  -0.50
Instängd ("shut up/closed")               -0.11  -0.35  -0.09  -0.10  -0.71
Klar ("clear")                            -0.21   0.85  -0.02   0.02   0.01
Knastrande ("crackling/crunching")         0.07   0.04   0.95  -0.08  -0.07
Ljus ("bright/light")                     -0.20   0.07   0.18   0.77   0.00
Matt ("faint/feeble")                      0.34  -0.55  -0.15  -0.07  -0.45
Mjuk ("soft")                              0.84  -0.06  -0.04  -0.17   0.12
Mullrande ("rumbling")                    -0.26  -0.20  -0.06  -0.93   0.10
Nasal ("nasal")                           -0.25  -0.45   0.07   0.52  -0.18
Naturtrogen ("true to nature")             0.09   0.75  -0.06  -0.23   0.20
Närvarokänsla ("feeling of presence")      0.23   0.64   0.01  -0.17   0.31
Ren ("pure/clean")                        -0.01   0.83  -0.13   0.13   0.11
Rymdkänsla ("feeling of space")            0.02  -0.01  -0.13   0.07   0.92
Skorrande ("jarring/grating")             -0.83  -0.03   0.09  -0.11  -0.15
Stark ("loud")                            -0.92   0.01  -0.24  -0.08   0.42
Sträv ("harsh")                           -0.49  -0.04   0.30   0.29  -0.17
Torr ("dry")                              -0.20  -0.15   0.33   0.30  -0.44
Tunn ("thin")                             -0.07  -0.25   0.06   0.65  -0.31
Vass ("sharp/keen")                       -0.66   0.09   0.20   0.37  -0.06
Vinande ("whistling/whizzing")            -0.25  -0.09   0.82   0.05   0.18


TABLE IV. Factor scores for eight headphones within each of five programs. The headphones are ordered with respect to the size of their factor scores at each program within each factor.

                                  Factors
          I            II           III          IV           V

P1    H6   0.86    H1   1.07    H3   1.19    H3   2.21    H1   1.82
      H5   0.37    H6   0.95    H1   0.38    H8   1.80    H6   1.47
      H1   0.35    H4   0.35    H7   0.35    H7   0.72    H4   1.23
      H4   0.25    H2  -0.27    H2  -0.20    H6   0.48    H2   0.97
      H2   0.23    H8  -0.48    H8  -1.15    H2   0.31    H7   0.86
      H7  -0.11    H7  -0.55    H5  -1.29    H4   0.12    H5   0.26
      H8  -0.48    H3  -1.54    H4  -1.38    H1  -0.36    H8  -0.55
      H3  -1.42    H5  -2.28    H6  -1.56    H5  -1.81    H3  -0.87

P2    H6   0.92    H1   1.90    H3   1.79    H3   1.58    H1  -0.08
      H5   0.01    H7   0.83    H1   1.15    H2   0.81    H4  -0.49
      H4  -0.09    H4   0.74    H7   0.69    H8   0.76    H7  -0.67
      H7  -0.13    H6   0.60    H8   0.38    H4  -0.28    H5  -1.08
      H8  -0.18    H8   0.24    H2   0.37    H1  -0.32    H6  -1.09
      H2  -0.24    H2   0.21    H4  -0.66    H6  -0.54    H2  -1.45
      H1  -0.50    H5  -0.37    H6  -0.76    H7  -0.57    H8  -1.64
      H3  -0.96    H3  -0.90    H5  -0.94    H5  -1.47    H3  -2.52

P3    H6   1.79    H1   0.73    H7   1.95    H3   1.31    H4   0.91
      H8   1.79    H4   0.27    H1   1.92    H8   0.39    H2   0.81
      H5   1.58    H6   0.21    H3   1.65    H7  -0.06    H1   0.63
      H4   1.54    H5  -0.39    H2   1.16    H2  -0.08    H6   0.43
      H7   1.53    H2  -0.43    H8   0.20    H4  -0.17    H5   0.30
      H1   1.28    H8  -0.55    H4  -0.12    H6  -0.32    H7  -0.08
      H2   1.27    H7  -0.88    H5  -0.45    H1  -1.15    H8  -0.96
      H3   0.33    H3  -1.24    H6  -0.77    H5  -1.55    H3  -1.24

P4    H6   0.40    H1   0.96    H3   1.21    H3   1.58    H1   1.33
      H4   0.27    H8   0.29    H1   0.70    H8   0.66    H7   0.45
      H5   0.07    H2  -0.01    H7   0.61    H7   0.45    H6   0.33
      H7  -0.10    H4  -0.08    H2   0.38    H2   0.32    H4   0.18
      H2  -0.12    H6  -0.12    H4  -0.46    H6  -0.35    H2   0.01
      H8  -0.13    H7  -0.21    H8  -0.68    H4  -0.44    H8  -0.70
      H1  -0.37    H3  -0.69    H5  -1.07    H1  -0.51    H5  -0.74
      H3  -1.82    H5  -2.81    H6  -1.23    H5  -1.96    H3  -1.08

P5    H6  -0.13    H1   2.41    H3   0.81    H3   1.43    H1   1.99
      H4  -0.18    H7   1.09    H1   0.00    H8   0.58    H2   1.17
      H5  -0.55    H2   0.77    H7  -0.13    H2   0.02    H7   0.45
      H2  -0.95    H8   0.59    H8  -0.29    H7  -0.10    H4   0.43
      H1  -1.10    H4   0.48    H2  -0.68    H4  -0.34    H6   0.13
      H8  -1.38    H6   0.46    H5  -0.83    H6  -0.66    H8  -0.03
      H7  -1.40    H3  -0.43    H4  -0.92    H1  -0.72    H5  -0.36
      H3  -2.53    H5  -0.89    H6  -1.33    H5  -1.76    H3  -0.52

suppressed in its reproduction.

Factor III (F III) seems related to various unwanted "disturbing sounds" in the reproduction. High factor loadings occur for "crackling/crunching," "noisy/hissing," "hissing," and "whistling/whizzing."

In the factor scores for this factor there are indications that F III partly reflects characteristics of the programs (or recordings). The reproductions of program 3 lie more towards the "disturbance" side than what is the case for the other programs, especially for program 5. Program 3 represents the lowest sound level among the programs and thus a somewhat lower signal-to-noise ratio. On the other hand program 5 has the highest sound level among the programs, and this fortissimo music effectively masks the tape noise and related phenomena. As regards the headphones, H3 lies most towards the "disturbance" side within programs (except in program 3) followed by H1 and H7, while H6 lies most away from the "disturbance" side (except in program 2) together with H5 or H4. It may be suggested that this perceptual dimension is related to the presence of resonance peaks at higher frequency regions as in the frequency responses of H3, H1, and H7.

Factor IV (F IV) may be labeled "brightness-darkness," possibly with a touch of "fullness." The adjectives "bright" and "thin" have the highest factor loadings on one side, while "emphasized bass," "rumbling," and "dull" dominate the opposite side (note also "full" with a moderately high loading).

In the factor scores for F IV the "bright/thin" side is represented above all by H3, followed by H8, while the opposite "dark/bass" side has H5 as a rather outstanding example within each program. It seems fairly evident that these perceptual characteristics reflect differences in the frequency responses: the peak around 3000-4000 Hz in H3 versus the bass boost in H5.

Factor V (F V) is aptly described by its single high loading on the positive side occurring for "feeling of space." In contrast to this the highest loading on the other side appears for "closed/shut up."

No doubt this factor reflects much of the recording conditions for the different programs. In the factor scores for F V it is striking that reproductions of program 2, which was recorded in a studio, give less "feeling of space" than the other programs which were recorded in big rooms, for instance, program 1 in a church and program 5 in a concert hall. There are, however, also recurring differences between the headphones. H1 gives most "feeling of space" within each of the programs (except for program 3), while H3 is extreme on the "closed/shut up" side, followed by H8 or H5. It may be noted that H1 is the single headphone which is not directly applied against the outer ear.

There were four moderate intercorrelations between the factors in this oblique solution. The "sharpness/hardness-softness" factor (F I) correlated negatively (-0.41) with "brightness-darkness" (F IV). As seen in the factor scores, headphones which belong to the "soft" side of F I also belong to the "darker" side of F IV. The "clearness-distinctness" factor (F II) correlated positively (0.45) with the "feeling of space" factor (F V). It is seen in the factor scores that the more "clear/distinct" headphones in F II in general lie high in the "feeling of space" factor. The "disturbing sounds" factor (F III) correlated positively (0.39) with "brightness-darkness" (F IV): there is a tendency for headphones which lie high in the "disturbance" factor to appear on the "bright" side of F IV. Finally "brightness-darkness" correlated negatively (-0.33) with "feeling of space": there is a slight tendency for headphones on the "bright" side to give less "feeling of space." The remaining intercorrelations between the factors were negligibly low.
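The tendency for "soft" headphones to lie on the "dark" side can be illustrated with the factor scores for a single program. The sketch below computes a plain Pearson correlation between the F I and F IV scores of the eight headphones for program 1 (Table IV). This is only a rough check on one program chosen here for illustration; it is not the factor intercorrelation of the oblique solution reported above, and the helper function is hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Program 1 scores from Table IV, re-indexed by headphone.
f1_p1 = {"H1": 0.35, "H2": 0.23, "H3": -1.42, "H4": 0.25,
         "H5": 0.37, "H6": 0.86, "H7": -0.11, "H8": -0.48}
f4_p1 = {"H1": -0.36, "H2": 0.31, "H3": 2.21, "H4": 0.12,
         "H5": -1.81, "H6": 0.48, "H7": 0.72, "H8": 1.80}

headphones = sorted(f1_p1)
r = pearson([f1_p1[h] for h in headphones], [f4_p1[h] for h in headphones])
print(round(r, 2))
```

A negative value agrees in sign with the reported correlation between F I and F IV.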

IV. EXPERIMENTS WITH HEARING AIDS

A series of experiments with hearing aids is described in a journal for audiologists (Ref. 22). They are briefly reviewed here to put them into their context within our research project and to allow conclusions to be drawn from all experiments together.

Four experiments were made. In three of them the reproductions of the hearing aids were recorded on tape using the IEC 2 cc coupler, and the subjects listened to the tape recordings by means of headphones TDH39. In the fourth experiment, however, the hearing aids were fitted directly into the subjects' individual earmoulds. Normal hearing subjects were used for reasons discussed in Ref. 22, but a few people suffering from hearing loss took part in the fourth experiment.

A. Experiment 1

Six programs were used: two with music, three with speech, and one with traffic noise. They were reproduced by five systems: one narrowband head-worn hearing aid, one broadband body-worn hearing aid, two systems representing pronounced lowpass filtering (6 dB/octave 100-10000 Hz) and highpass filtering (12 dB/octave 100-5000 Hz), respectively, and a broadband reference system realized by a tape copy of the respective master tape. Twenty subjects judged the perceived sound quality of the 30 P × S combinations on 62 adjective scales.

Three factors accounted for 91% of the total variance. The first factor was "sharpness/hardness-softness" with the highpass filtered system and the narrowband hearing aid being "sharp," while the lowpass filtered system and the two broadband systems were "softer." The second factor was "clearness/distinctness" with the broadband reference system as outstanding, while the lowpass filtered system and the narrowband hearing aid were most "nondistinct." The third factor represented a combination of "loudness" and "feeling of space" with the broadband reference system as most "loud/open/airy" and the lowpass filtered system as most "soft/faint/closed."
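To give a feeling for how pronounced the two filter conditions were, the level change across the stated frequency range can be worked out from the slope. The sketch below assumes that the slope given in dB/octave is constant over the whole stated range, which is an interpretation of the parenthetical figures above rather than something the text states; the function name is illustrative.

```python
import math

def total_level_change_db(slope_db_per_octave, f_low_hz, f_high_hz):
    """Level change over a frequency range for a constant slope in dB/octave."""
    octaves = math.log2(f_high_hz / f_low_hz)
    return slope_db_per_octave * octaves

lowpass_change = total_level_change_db(6, 100, 10_000)    # 6 dB/octave, 100-10000 Hz
highpass_change = total_level_change_db(12, 100, 5_000)   # 12 dB/octave, 100-5000 Hz

print(round(lowpass_change, 1))   # 39.9
print(round(highpass_change, 1))  # 67.7
```

Under this assumption the lowpass condition spans about 6.6 octaves and the highpass condition about 5.6 octaves.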

B. Experiment 2

In experiments 2-4 the same eight systems were used, recorded on tape over coupler in experiments 2-3, directly fitted into individual earmoulds in experiment 4. The systems were five different types of body-worn hearing aids, one of which was also used with three different tone control settings ("normal," "low," and "high"). The remaining system was a broadband reference system (tape copy of the respective master tape).

In experiment 2 two music programs and one speech program were used. Twenty-five male subjects listened to all pairwise combinations of the eight systems (28 pairs) within each of the programs and rated the perceived similarity between the two systems in each pair. The similarity ratings were analyzed according to the INDSCAL model. As an example the three-dimensional solution (accounting for 72.5% of the variance) for the speech program is shown in Fig. 3.
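The number of pairs follows from choosing 2 systems out of 8. As a small sketch (variable names are illustrative), the pairs and the resulting number of similarity ratings can be enumerated as follows; the sketch assumes each pair was rated once per subject and program, which is not stated explicitly above.

```python
from itertools import combinations

systems = list(range(1, 9))        # the eight systems, Nos. 1-8
n_programs = 3                     # two music programs and one speech program
n_subjects = 25

# All unordered pairs of different systems.
pairs = list(combinations(systems, 2))

ratings_per_subject = len(pairs) * n_programs
ratings_total = ratings_per_subject * n_subjects

print(len(pairs))             # 28
print(ratings_per_subject)    # 84
print(ratings_total)          # 2100
```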

The interpretation of the dimensions was similar to that for the other two programs. Dimension I was mainly interpreted as "brightness-darkness" combined with "fullness-thinness" (possibly also "sharpness/hardness-softness"). Two narrowband systems (Nos. 3 and 8 in Fig. 3) with peaks at 1000-3000 Hz were extreme on the "bright" side, while a broadband system with certain bass boost (No. 5) was the "darkest" one. Dimension II was labeled "clearness/distinctness" with the broadband reference system (No. 1) as outstanding, while the bass boost of No. 5 seemed to counteract "clearness" in this speech program. In dimension III systems Nos. 3 and 7 were contrasted against No. 6, probably reflecting the relative prominence of the various resonance peaks in the respective systems. No definite perceptual interpretation was given for this dimension.

FIG. 3. Position of eight hearing aid systems in a three-dimensional INDSCAL solution. (Dimension I left-right, dimension II front-rear, dimension III height. Dimension III is transformed to make the lowest coordinate value in this dimension appear slightly above the plane of dimensions I-II.)

C. Experiment 3

The same systems and programs as in experiment 2 were used, and three other programs were added (female speech, traffic noise, dining room sounds). Forty-two subjects rated the perceived sound quality of the 6 P × 8 S = 48 P × S combinations on 40 adjective scales. Three factors accounted for 88% of the total variance. The first factor was "sharpness/hardness-softness" with narrowband and treble-boosted systems belonging to the "sharp/hard" side, while systems with broader frequency range and/or bass boost appeared on the "soft" side. The second factor was "clearness-distinctness" combined with "feeling of space" and "nearness." The broadband reference system was outstanding, while systems of narrowband character or with bass boost appeared less "clear/distinct." The third factor was a blending of "brightness-darkness" and "fullness-thinness" with "bright/thin" systems being narrowband and treble boosted, while "dark/full" systems were represented by broadband systems with some bass boost.

D. Experiment 4

The hearing aids were now listened to as in reality, that is, directly fitted into individual earmoulds (the reference system, however, was listened to by headphones). The same three programs as in experiment 2 were used. The experiment was very time-consuming and tiring. Ten normal hearing subjects and three hearing impaired subjects made ratings of the 3 P × 8 S = 24 P × S combinations on 50 adjective scales. Five factors accounted for 86% of the total variance in the data for normal hearing subjects. The first two factors were "sharpness/hardness-softness" and "clearness/distinctness." The third factor was interpreted as "nearness," how "near" the sound seems to be to the listener. To some extent this factor reflects whether the recording is made near to the sound source or more at distance, but there were also consistent differences between the hearing aids. The fourth factor was "fullness-thinness" with narrowband systems appearing on the "thin" side, while broadband systems in general had more of "fullness." The fifth factor represented various "disturbing sounds" ("noisy/hissing," "crackling," "sparkling," etc.) with certain of the hearing aids more affected by such disturbances than others. The data from the three hearing impaired subjects suggested factors such as "sharpness/hardness-softness," "clearness/distinctness," "feeling of space," and "disturbing sounds." Due to the low number of subjects these results must be considered with great caution.

V. EVALUATIVE JUDGMENTS

What relations are there between perceptual dimensions such as those found above and evaluative judgments concerning the overall quality of the reproductions (systems)? Some information about this can be gained by studying the factor loadings for the two adjectives "pleasant" and "natural/true-to-nature." Both of them refer to some kind of overall evaluation, but from different standpoints: what sounds "pleasant" does not necessarily sound "natural" (in the "high-fidelity" sense), and conversely. The following summary uses evidence from Tables I and III but also corresponding data from the hearing aid experiments not given here.

In the "clearness/distinctness" dimension both "pleasant" and "natural/true-to-nature" have rather high factor loadings on the "clear/distinct" side of the continuum. This is somewhat more pronounced for "natural/true-to-nature" than for "pleasant." "Clearness" seems thus important for both aspects, but probably more for "naturalness" than for "pleasantness" (for instance, a reproduction may sometimes be too "clear" to be "pleasant").

In "sharpness/hardness-softness" both "pleasant" and "natural" appear on the "soft" side, and more so for "pleasant" than for "natural." A "soft" reproduction may thus be "pleasant," but it may sometimes be too "soft" to sound "natural."

In "brightness-darkness" the situation is more varying with "pleasant" and "natural" appearing rather neutrally in the middle or slightly on the "dark" side. In "fullness-thinness" they appear on the "full" side, in the "feeling of space" dimension they belong to the "open/airy" side, and in the "disturbing sounds" dimension they occur, of course, on the "nondisturbance" side. In "nearness" the situation is varying, but in most cases "pleasant" and "natural" appear on the "near" side rather than on the "distant" side, and possibly more so for "natural" than for "pleasant" (a too "near" reproduction may not be "pleasant").

These results should be considered as suggestions to be further investigated in future experiments. They may depend, more or less, on the type of programs/recordings which are used as well as on other factors, too (see further in Discussion). A more detailed study of the relations between each single adjective used in the respective experiment and "pleasant" or "natural" appears in two prepublication reports.

VI. REVIEW OF DIMENSIONS

The results in any single experiment may depend very much on the actual context: which programs and reproductions are used, which judgment methods are used, which adjectives are included, how the factor analysis is applied, etc. To counteract this dependency on context and see which results would be invariant we therefore varied such conditions from experiment to experiment. A summary of the conditions and the interpreted perceptual dimensions is given in Table V. Experiments 1 and 2 in this table were reported earlier (Ref. 10) but are included here for completeness. Experiments 3-8 are those described in this paper.

It is apparent from Table V that there is a limited number of perceptual dimensions appearing more or less constantly in all experiments. The varying number of dimensions (two to five) in different experiments is mainly a function of various context factors (which systems, which adjectives, etc.). The circumstance that certain dimensions sometimes appear separately but sometimes in combination with others is also related to various context factors. It may happen that the selected programs and systems in an experiment "permit" two (or more) perceptual dimensions to appear separately, while the selection in another experiment may be such that there will be a covariance between two (or more) dimensions and so they appear in combination.

In the following a review is made of the dimensions found in these experiments together with suggestions about the underlying psychophysical relations.

A. "Clearness/distinctness"

This dimension refers to descriptions of sound reproductions by adjectives/expressions like "clear," "distinct," "clean/pure," "rich in details," and the like, in contrast to reproductions characterized as "diffuse," "muddy/confused," "blurred," "noisy," "rough," "harsh," sometimes "rumbling," "dull," and "faint."

Systems with broad frequency range, fairly flat frequency response, and low nonlinear distortion are in general rated favorably in "clearness/distinctness." Narrowband systems, systems with marked resonance peak(s), and systems with more distortion get poorer ratings. There are examples that bass boost may be unfavorable (note also the adjectives "rumbling" and "dull" in the enumeration above), probably due to the strong masking effects by low-frequency components. On the other hand a certain emphasis on the middle and high frequency region may be favorable to "clearness." However, too much emphasis on these regions may result in an exaggerated "clearness/distinctness," which sounds unnatural to the normal ear, and may also give a too "sharp" reproduction (as sometimes happens when the treble control of your amplifier is set in its extreme high position; see further about "sharpness" below). This question is of interest for hearing aids in which the frequency response normally has a low frequency cut and very often an emphasis on the treble expected to give an improved intelligibility of speech.

It is noted above that "roughness" and "harshness" are possible opposites to "clearness/distinctness." "Roughness" has recently been related to perception of consonance-dissonance and to timbre of voices and organs (Refs. 23, 24). "Roughness" is said to increase with the number of partials within the same critical band. In the present context we may guess that the more distortion products are generated by a sound-reproducing system, the more sound components (harmonics, combination tones, etc.) will appear within the respective critical bands, and the more "roughness" (among other things) will be perceived.

Of course, an increased "clearness/distinctness" may be obtained by raising the sound level, but only up to a certain limit where overloading of the equipment and/or "psychological overloading" (exceeding the sound level for comfortable listening) is risked.

B. "Sharpness/hardness-softness"

This dimension of perceived sound quality is described by adjectives such as "sharp," "hard," "shrill," "screaming," "pointed," "clashing" (sometimes "loud," see further under Sec. VI H), and the like, in contrast to "soft," "mild," "calm/quiet," "dull," and "subdued."

Systems judged as "sharp/hard" usually have a rather steeply rising frequency response towards the treble or marked resonance peaks at high frequencies (which may be especially important in the most sensitive region of the ear, say 2000-4000 Hz). On the other hand the bass response is more or less suppressed (very marked for certain hearing aids as mentioned above). Distortion products and high sound level also seem to contribute to "sharpness/hardness." Systems rated as "soft" have flatter frequency responses or, to be still "softer," responses emphasizing the bass region and deemphasizing the treble region. A lower sound level also contributes to "softness."

"Sharpness/hardness" is probably related to the "density" dimension described for pure tones (Refs. 25, 26). "Density" is said to increase with increasing frequency and increasing intensity, which is strikingly similar to the fact that "sharpness/hardness" increases with frequency responses rising towards the treble and with higher levels. For perception of sound reproductions it seems more natural to speak of "sharpness/hardness" rather than "density." In fact the Swedish word "tät" ("dense") was not judged as suitable for describing perceived sound quality in the questionnaires for selection of adjectives (see Methods). In an investigation about the timbre of steady sounds Bismarck (Refs. 27, 28) recently found "sharpness" to be a dominant dimension with underlying psychophysical relations very similar to those suggested here.


TABLE V. Summary of conditions and interpreted perceptual dimensions in eight experiments. Abbreviations under Programs: Mus = Music, Sp = Speech, N = Noises; under Judgments: Sim = Similarity ratings, Adj = Adjective ratings; under Subjects: HF = High-fidelity fans, Mus = Musicians, Gen = Listeners in general.

Systems                Programs          Judgments  Subjects      Perceptual dimensions

(1) 5 loudspeakers     5 (Mus)           Sim        HF, Mus, Gen  I. Clearness/distinctness
                                                                  II. Brightness-darkness, sharpness/hardness-softness

(2) 6 "loudspeakers"   3 (Mus)           Sim        HF, Gen       I. Brightness-darkness, sharpness/hardness-softness
                                                                  II. Loudness

(3) 9 "loudspeakers"   5 (Mus, Sp)       Adj        HF            I. Clearness/distinctness, feeling of space, nearness
                                                                  II. Sharpness/hardness-softness
                                                                  III. Brightness-darkness
                                                                  IV. Disturbing sounds

(4) 8 headphones       5 (Mus)           Adj        Mus           I. Sharpness/hardness-softness
                                                                  II. Clearness/distinctness
                                                                  III. Disturbing sounds
                                                                  IV. Brightness-darkness, fullness-thinness
                                                                  V. Feeling of space

(5) 5 "hearing aids"   6 (Mus, Sp, N)    Adj        Gen           I. Sharpness/hardness-softness
                                                                  II. Clearness/distinctness
                                                                  III. Loudness, feeling of space

(6) 8 hearing aids     3 (Mus, Sp)       Sim        Gen           I. Brightness-darkness, fullness-thinness
                                                                  II. Clearness/distinctness
                                                                  III. Dimension related to prominence of resonance peaks

(7) 8 hearing aids     6 (Mus, Sp, N)    Adj        Mixed         I. Sharpness/hardness-softness
                                                                  II. Clearness/distinctness, feeling of space
                                                                  III. Brightness-darkness, fullness-thinness

(8) 8 hearing aids     3 (Mus, Sp)       Adj        Mixed         I. Sharpness/hardness-softness
                                                                  II. Clearness/distinctness, feeling of space
                                                                  III. Nearness
                                                                  IV. Fullness-thinness
                                                                  V. Disturbing sounds


C. "Brightness-darkness"

This dimension is defined by the adjective "bright" contrasted against "dark," "rumbling," "dull," and "emphasized bass."

The psychophysical relations behind this dimension seem very similar to those for "sharpness/hardness-softness." Systems with frequency responses rising towards the treble or with peaks in the treble are judged as "bright," while reducing the treble and/or increasing the bass response results in a "darker" character of the reproduction.

The relation between "sharpness/hardness-softness" and "brightness-darkness" is difficult to analyze. In our experiments they sometimes appear in combination, sometimes separated (Table V). Their relations to physical variables seem similar. With regard to pure tones it has been suggested that "brightness" and "density" are two different words for the same dimension (Ref. 25), since they depend on frequency and intensity in a similar way. That the psychophysical relations are similar does not exclude the possibility that "density" and "brightness" are perceptually different phenomena, however. As regards the present problem it seems plausible that "sharpness/hardness-softness" and "brightness-darkness" refer to two different perceptual dimensions, although their relations to physical variables seem similar. A supplementary hypothesis is that the more steeply the frequency response rises towards the treble, the more natural it is to judge the reproduction as "sharp/hard" rather than "bright" (although it is "bright" too).

D. "Fullness-thinness"

This dimension refers to descriptions of reproductions as "full" versus "thin." It often appears in combination with "brightness-darkness" but sometimes separately. Systems with broad frequency range are rated as having "fullness," particularly if there is also a certain emphasis on the bass region. Narrow-band systems with peaks at high frequencies are judged as "thin." It also seems obvious that "fullness" increases if the sound level is raised, and vice versa. "Fullness" is probably related to the "volume" dimension discussed for pure tones (Ref. 25). "Volume" is said to increase with increased intensity, but decrease with increased frequency, which is reminiscent of what is suggested for "fullness" here.

E. "Feeling of space"

This dimension refers to expressions such as "feeling of space," "feeling of room," "airy," "wide," and "open," as opposed to descriptions such as "closed/shut up," "narrow," and "dry."

The psychophysical relations behind this dimension can only be loosely suggested from the present data. In the loudspeaker experiments the omnidirectional loudspeaker gave most "feeling of space," possibly also in combination with an increased treble response, while the narrow-band radio receiver sounded most "closed/shut up," followed by a loudspeaker with bass boost (Fig. 1). In the headphone experiment the single headphone of open type was considered best in this dimension, while a headphone with marked resonance peak in the treble and another headphone with bass boost were judged as most "closed/shut up." From the hearing aid experiments it is also suggested that narrow-band systems and systems with bass boost sound relatively more "closed/shut up."

It should be noted that monophonic reproductions were used in most experiments. Stereophonic reproduction is, of course, an important factor for giving "feeling of space," but it was not systematically investigated here.

F. "Nearness"

Different sound reproductions may sound more or less "near" to the listener (alternatively more or less "distant"). It is obvious that "nearness" is related to the intensity: the higher the intensity, the "nearer" it sounds, and conversely. The relations to characteristics of the frequency response vary. In some cases broadband systems give more "nearness" than narrow-band systems, possibly also in combination with an increased treble response. However, there are also examples in the opposite direction, and presently this question is left open.

G. "Disturbing sounds"

This dimension refers to characteristics of sound reproductions such as "noisy/hissing," "crackling-crunching," "whistling/whizzing," "wheezing," and others. Systems described by such expressions are generally characterized by an increased response at high frequencies. Reducing the treble response by a treble control also reduces those disturbing sounds.

H. "Loudness"

Loudness is a self-evident dimension in perceived sound quality. As mentioned under Methods, we have tried to equalize the systems in loudness in order to bring other perceptual dimensions into focus (and further to remove the influence of possible differences in loudness when evaluating different systems). The adjective "loud" has then often appeared in the "sharpness/hardness" dimension, and there are reasons to believe, as Bismarck suggests (Ref. 27), that the subjects often used "loud" as more or less synonymous with "sharp," "hard," or "painful."

VII. DISCUSSION

A. Labeling and comparison of dimensions

The list of dimensions above should not be understood as representing a "final" result or a ready system for use in subjective evaluations of sound-reproducing systems. There may be more dimensions in perceived sound quality than those mentioned here; on the other hand there may be redundancy in the present enumeration. Experiments are needed to check the validity of the suggested dimensions and to investigate their relations to physical parameters. Such experiments necessarily include unidimensional scaling of perceptual dimensions and systematic experimental manipulation of various physical characteristics in sound-reproducing systems.

The present labels of the dimensions are provisional and may be changed in the future. There are many problems of language here. It is sometimes said that verbal terms are inadequate for describing sounds: there seem to be no words for what you perceive. Or it can be suspected that different people use words in different ways, so the same word may mean different things to different individuals. For this reason investigators of "timbre" often use multidimensional scaling techniques requiring only judgments of similarity or dissimilarity, avoiding (more or less) other descriptive terms like adjectives (Wedin and Goude, Ref. 29; Plomp, Ref. 30; and Grey, Ref. 31). The present studies of perceived sound quality are very reminiscent of investigations of timbre; one could speak of the "timbre" of various loudspeakers, headphones, etc. We have found it useful to use various methods (similarity ratings, adjective ratings, free verbal descriptions) in combination to get at different aspects of perceived sound quality. The resulting dimensions were given verbal labels believed to have a certain interindividual validity (the interindividual reliability for the judgments in various adjective scales was in general high, see Results). We regard these labels as approximations to be checked and refined in the future and should keep in mind that there may be other perceptual aspects not clarified here.

A related problem is the translation from one language into another language. The dimension labels and other adjectives/expressions have been translated from Swedish into English here as "exactly" as possible (sometimes by using alternative translations), but it is hardly possible to get at every possible shade of meaning.

However, comparison of the present dimensions with those mentioned by other researchers in other countries suggests good agreement. In a study of subjective assessment of multichannel reproduction, Nakayama et al. (Ref. 5) mention "fullness," "clearness," and "depth of image sources" (probably related to "nearness" here). In earlier Japanese studies (Refs. 1-4) there are dimensions such as "noisiness," "softness," and "sensation of low-frequency tone/high frequency tone," the last one probably related to "brightness-darkness" here. Eisler (Ref. 6) mentions, among others, "loudness," "bass boost," and "full treble reproduction" (the last two seemingly related to "brightness-darkness"). McDermott (Ref. 7), in a study of voice-communication circuits, mentions "clarity of speech" and "loudness." Kötter (Ref. 8) and Jost (Ref. 9) both found "volume" (German: "Volumen," probably related to "fullness" here) and "density" (German: "Dichte") as two dominant dimensions. Staffeldt (Ref. 11) mentions "emphasized treble" and "emphasized bass."

Many of the listed dimensions are probably no surprise for sound engineers, high-fidelity fans, audiologists, and others. Terms like "fullness," "brightness," "clearness," "sharpness," etc. are often used in discussions about sound reproduction. There is an intricate question, however, whether the listed dimensions should be considered as independent of each other, or if there is some kind of relation between some of them (for instance, since some of them often appear in combination). This question, as well as other methodological issues, is discussed from various standpoints in another paper (Ref. 21). The practical conclusion of that discussion is that the suggested dimensions should now be carefully investigated in separate experiments to check their validity and their relations to physical variables, which will provide a safer basis for conclusions concerning their independence.

B. Psychophysical relations, interactions

Identification of dimensions in perceived sound quality was the primary objective of these investigations. Their relations to physical characteristics of the systems were explored by noting the positions of the systems in the respective dimensions as given by factor scores in factor analysis and by positions along the dimension axes in the space from multidimensional scaling. The conclusions concerning psychophysical relations therefore have much of a post hoc character. They should be regarded as working hypotheses for future experiments, in which different physical variables of sound-reproducing systems are systematically varied to see their effects on various perceptual dimensions. This also presupposes some kind of adequate (unidimensional) scaling within the respective perceptual dimensions.
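The post hoc approach described above, comparing the positions of systems on a dimension with one of their physical characteristics, might be sketched as a rank correlation. This is only an illustration under stated assumptions: the treble slopes and the "sharpness" positions are invented numbers, the function names are hypothetical, and the choice of Spearman's coefficient is ours rather than a method reported in these investigations.

```python
def ranks(values):
    """Return ranks starting at 1; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data for five systems (not taken from the experiments).
treble_slope_db_per_octave = [-3.0, -1.0, 0.0, 2.0, 4.0]
sharpness_position = [-1.4, -0.2, -0.5, 0.8, 1.3]

rho = spearman(treble_slope_db_per_octave, sharpness_position)
print(f"rank correlation between treble slope and sharpness: {rho:.2f}")
```

A high coefficient in such a sketch would only support a working hypothesis; as noted above, a systematic variation of the physical variable is needed to test it.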

A problem here is the presence of interaction between programs and systems, that is, the judgment of a system may be rather different depending on which program is used for "testing" the system. This dependence on programs has been mentioned earlier, and many concrete examples are given in the pre-publication reports. Since the stimulus reaching the listener's ear is a function both of the program's physical structure and of the characteristics of the reproducing system (room acoustics etc. may also be included), it is not surprising that a certain system may sound rather different when fed with different programs. The selection of proper programs which actually "reveal" differences between different systems is a difficult problem in all listening tests. With increased knowledge about perceptual dimensions and underlying psychophysical relations we may be in a better position to select adequate programs.
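What an interaction between programs and systems means numerically can be illustrated with a small two-way table of mean ratings. The sketch below is an assumption-laden example: the ratings are invented, the function name is hypothetical, and the residual formula is the standard two-way decomposition (cell mean minus system mean minus program mean plus grand mean), not a computation reported in the paper.

```python
# Hypothetical mean ratings on one scale: rows are systems, columns are programs.
ratings = {
    "system_A": {"music": 7.0, "speech": 6.0, "noise": 5.0},
    "system_B": {"music": 5.0, "speech": 6.0, "noise": 7.0},
    "system_C": {"music": 6.0, "speech": 6.0, "noise": 6.0},
}


def interaction_residuals(table):
    """Part of each cell mean not explained by system and program main effects."""
    systems = list(table)
    programs = list(next(iter(table.values())))
    n_cells = len(systems) * len(programs)
    grand = sum(table[s][p] for s in systems for p in programs) / n_cells
    system_mean = {s: sum(table[s][p] for p in programs) / len(programs)
                   for s in systems}
    program_mean = {p: sum(table[s][p] for s in systems) / len(systems)
                    for p in programs}
    return {
        s: {p: table[s][p] - system_mean[s] - program_mean[p] + grand
            for p in programs}
        for s in systems
    }


residuals = interaction_residuals(ratings)
for system, row in residuals.items():
    print(system, row)
```

In this invented table the three systems have identical mean ratings across programs, yet system A is rated above system B for music and below it for noise. A comparison based on a single program would therefore give opposite rankings depending on the program chosen.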

C. Evaluative judgments

The relations between different perceptual dimensions and evaluative judgments like "pleasant" and "natural/true-to-nature" were sketched earlier. One way of quantifying these relations in more detail is by means of multiple regression equations, in which the evaluative overall judgment of a system is considered as a weighted linear function of its position in the different perceptual dimensions (see, for instance, Refs. 1-5). The problem is, however, that the respective weights may be very different depending on the specific context of programs and systems, types of listeners (for instance, "listeners in general" seem to give different weights to dimensions such as "softness" and "disturbing sounds/noise" than high-fidelity fans do; Refs. 5 and 10), types of scaling methods, etc. More reliable information on these questions may be expected from future experiments.
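The weighted linear function mentioned above can be sketched as an ordinary least-squares fit. Everything in the example is assumed for illustration: the positions of six systems on three dimensions, the two sets of weights, and the function name `fit_weights` are hypothetical, and NumPy is used simply as a convenient solver. The two listener groups are given different invented weights to show how the fitted weights depend on context.

```python
import numpy as np

# Hypothetical positions of six systems on three perceptual dimensions
# (columns: clearness, sharpness, disturbing sounds). Invented values.
positions = np.array([
    [ 1.2, -0.3, -0.8],
    [ 0.4,  0.9,  0.5],
    [-0.6,  1.1,  1.0],
    [ 0.9, -0.7, -0.2],
    [-1.1,  0.2,  0.7],
    [-0.8, -1.2, -1.2],
])

# Invented weights for two listener groups, used only to construct example data.
weights_hifi = np.array([0.8, -0.4, -0.6])
weights_general = np.array([0.5, -0.9, -0.2])
intercept_true = 5.0

overall_hifi = intercept_true + positions @ weights_hifi
overall_general = intercept_true + positions @ weights_general


def fit_weights(positions, overall):
    """Least-squares fit of overall = b0 + sum_k w_k * position_k."""
    design = np.column_stack([np.ones(len(positions)), positions])
    coef, *_ = np.linalg.lstsq(design, overall, rcond=None)
    return coef[0], coef[1:]


b0_hifi, w_hifi = fit_weights(positions, overall_hifi)
b0_general, w_general = fit_weights(positions, overall_general)
print("high-fidelity fans:  ", np.round(w_hifi, 3))
print("listeners in general:", np.round(w_general, 3))
```

Because the example data are constructed without noise, the fit simply recovers the weights that were put in. With real ratings the estimates would carry error, and, as the text cautions, they would hold only for the particular programs, systems, listeners, and scaling method used.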

A similar line of reasoning can be applied as regards the relations between physical characteristics of the systems and overall evaluations. It would indeed be very informative (not least for manufacturers) to be able to express overall evaluation of various systems as weighted linear functions (or other kinds of functions) of physical characteristics of the systems. This presupposes still more detailed knowledge about the relations between physical and perceptual-cognitive variables within sound reproduction, which can be gained only by continued research. In the special case of hearing aids this question is even more complicated, since the effects of various types of hearing loss (often very individual ones) are added to the already long list of influencing factors.

ACKNOWLEDGMENTS

The authors express their gratitude to Ulf Rosenberg and Sten-Åke Frykholm, who took part in the planning and realization of the loudspeaker and headphone experiments, respectively, to Bodil Johansson and Björn Lindström, who served as experimenters and took part in many discussions, and to Bertil Johansson for valuable discussions. The investigations were supported by The Swedish Consumer Council, The Swedish National Board for Technical Development, and The Swedish Philips Group.

1. T. Koshikawa, T. Nakayama, and R. Miyagawa, "On the Designing Method of Reproduced Sound Quality Using the Multidimensional Sensory and Emotional scales," 5e Congrès International d'Acoustique, M67, 1-4 (1965).

2. T. Nakayama, R. Miyagawa, and T. Miura, "Methods of Evaluating and Designing Reproduced Sound Quality," Hitachi Rev. 15 (7), 256-262 (1966).

3. R. Miyagawa, T. Nakayama, and T. Miura, "Design of Reproduced Sound Quality by ESP Method," 6th ICA, A-5-14, 129-132 (1968).

4. Y. Kawashima, T. Miura, T. Nakayama, and R. Miyagawa, "Design of Reproduced Sound Quality by ESP Method," Hitachi Rev. 19 (1), 1-9 (1970).

5. T. Nakayama, T. Miura, O. Kosaka, M. Okamoto, and T. Shiga, "Subjective Assessment of Multichannel Reproduction," J. Audio Eng. Soc. 19, 744-751 (1971).

6. H. Eisler, "Measurement of Perceived Acoustic Quality of Sound-Reproducing Systems by Means of Factor Analysis," J. Acoust. Soc. Am. 39, 484-492 (1966).

7. B. J. McDermott, "Multidimensional Analyses of Circuit Quality Judgments," J. Acoust. Soc. Am. 45, 774-781 (1969).

8. E. Kötter, Der Einfluss übertragungstechnischer Faktoren auf das Musikhören (Arno Volk Verlag, Köln, 1968).

9. E. Jost, "Ueber die Klangeigenschaften von Lautsprechern," in Jahrbuch des Staatlichen Instituts für Musikforschung Preussischer Kulturbesitz, edited by D. Droysen (Verlag Merseburger, Berlin, 1972), pp. 175-202.

10. A. Gabrielsson, U. Rosenberg, and H. Sjögren, "Judgments and Dimension Analyses of Perceived Sound Quality of Sound-Reproducing Systems," J. Acoust. Soc. Am. 55, 854-861 (1974).

11. H. Staffeldt, "Correlation between Subjective and Objective Data for Quality Loudspeakers," J. Audio Eng. Soc. 22, 402-415 (1974).

12. A. Gabrielsson, U. Rosenberg, and H. Sjögren, "Adjective Ratings and Dimension Analyses of Perceived Sound Quality of Sound-Reproducing Systems," Reports from the Psychological Laboratory, University of Uppsala, No. 141 (1973).

13. A. Gabrielsson and H. Sjögren, "Adjective Ratings and Dimension Analyses of Perceived Sound Quality of Hearing Aids. I., II., III.," Reports from Technical Audiology, Karolinska Institutet, Stockholm, Nos. 75, 77, and 85 (1974, 1975, 1977).

14. A. Gabrielsson and H. Sjögren, "Similarity Ratings and Dimension Analyses of Perceived Sound Quality of Hearing Aids," Reports from Technical Audiology, Karolinska Institutet, Stockholm, No. 76 (1975).

15. A. Gabrielsson, S. Å. Frykholm, and H. Sjögren, "Adjective Ratings and Dimension Analyses of Perceived Sound Quality of Headphones," Reports from Technical Audiology, Karolinska Institutet, Stockholm, No. 86 (1977).

16. B. J. Winer, Statistical Principles in Experimental Design, 2nd ed. (McGraw-Hill, New York, 1971).

17. R. L. Gorsuch, Factor Analysis (Saunders, Philadelphia, 1974).

18. W. S. Torgerson, Theory and Methods of Scaling (Wiley, New York, 1958).

19. J. D. Carroll and J. J. Chang, "Analysis of Individual Differences in Multidimensional Scaling via an N-way Generalization of 'Eckart-Young' Decomposition," Psychometrika 35, 283-319 (1970).

20. A. Gabrielsson, "An Empirical Comparison Between Some Models for Multidimensional Scaling," Scand. J. Psychol. 15, 73-80 (1974).

21. A. Gabrielsson, "Dimension Analyses of Perceived Sound Quality of Sound-Reproducing Systems," Scand. J. Psychol. (in press).

22. A. Gabrielsson and H. Sjögren, "Perceived Sound Quality of Hearing Aids," Scand. Audiology (in press).

23. E. Terhardt, "Psychoacoustic Evaluation of Musical Sounds," Percept. Psychophys. 23, 483-492 (1978).

24. J. Sundberg, "Singing and Timbre," in Music Room Acoustics (Royal Swedish Academy of Music 17, Stockholm, 1977), pp. 57-81.

25. S. S. Stevens and H. Davis, Hearing: Its Psychology and Physiology (Wiley, New York, 1938).

26. M. Guirao and S. S. Stevens, "Measurement of Auditory Density," J. Acoust. Soc. Am. 36, 1176-1182 (1964).

27. G. von Bismarck, "Timbre of Steady Sounds: A Factorial Investigation of its Verbal Attributes," Acustica 30, 146-159 (1974).

28. G. von Bismarck, "Sharpness as an Attribute of the Timbre of Steady Sounds," Acustica 30, 159-172 (1974).

29. L. Wedin and G. Goude, "Dimension Analysis of the Perception of Instrumental Timbre," Scand. J. Psychol. 13, 228-240 (1972).

30. R. Plomp, Aspects of Tone Sensation (Academic, London, New York, San Francisco, 1976), Chap. 6.

31. J. M. Grey, "Multidimensional Perceptual Scaling of Musical Timbres," J. Acoust. Soc. Am. 61, 1270-1277 (1977).
