Audio Engineering Society 128th Convention Program, 2010 Spring

The AES has launched a new opportunity to recognize student members who author technical papers. The Student Paper Award Competition is based on the preprint manuscripts accepted for the AES convention. Twenty-seven student-authored papers were nominated. The excellent quality of the submissions has made the selection process both challenging and exhilarating. This year we are recognizing two papers.

The award-winning student paper will be honored during the Convention, and the student-authored manuscript will be published in a timely manner in the Journal of the Audio Engineering Society.

Nominees for the Student Paper Award were required to meet the following qualifications:

(a) The paper was accepted for presentation at the AES 128th Convention.
(b) The first author was a student when the work was conducted and the manuscript prepared.
(c) The student author's affiliation listed in the manuscript is an accredited educational institution.
(d) The student will deliver the lecture or poster presentation at the Convention.

* * * * *

The co-winners of the 128th AES Convention Student Paper Award are:

Perceptual Evaluation of Physical Predictors of the Mixing Time in Binaural Room Impulse Responses—Alexander Lindau, Linda Kosanke, Stefan Weinzierl, Technical University of Berlin, Berlin, Germany
Convention Paper 8089

To be presented on Monday, May 24 in Session P17 —Multichannel and Spatial Audio: Part 1

A Subjective Evaluation of the Minimum Audible Channel Separation in Binaural Reproduction Systems through Loudspeakers—Yesenia Lacouture Parodi, Per Rubak, Aalborg University, Aalborg, Denmark
Convention Paper 8104

To be presented on Monday, May 24 in Session P19 —Posters: Psychoacoustics and Listening Tests

* * * * *

Student Event/Career Development
STUDENT SCIENCE SPOT
Saturday, May 22 through Tuesday, May 25
C4-Foyer

Ongoing presentations of students' non-commercial, scientifically challenging work will be given throughout the Convention in an accessible way. They will offer prototypes or demonstrations, including a student poster session. Within the scope of their presentations, the students will have the opportunity to present themselves and their institutes in an appropriate way.

Saturday, May 22 09:00 Room Saint Julien
Technical Committee Meeting on Microphones and Applications

Session P1 Saturday, May 22
09:30 – 12:30 Room C3

HIGH PERFORMANCE AUDIO PROCESSING

Chair: Neil Harris, New Transducers Ltd. (NXT), Cambridge, UK

09:30

P1-1 Model-Driven Development of Audio Processing Applications for Multi-Core Processors—Tiziano Leidi,1 Thierry Heeb,2 Marco Colla,1 Jean-Philippe Thiran3
1ICIMSI-SUPSI, Manno, Switzerland
2Digimath, Sainte-Croix, Switzerland
3EPFL, Lausanne, Switzerland

Chip-level multiprocessors are still very young and available forecasts anticipate a strong evolution for the forthcoming decade. To exploit them, efficient and robust applications have to be built with the appropriate algorithms and software architectures. Model-driven development is able to lower some barriers toward applications that process audio in parallel on multi-cores. It allows using abstractions to simplify and mask complex aspects of the development process and helps avoid inefficiencies and subtle bugs. This paper presents some evolutions of Audio n-Genie, an open-source environment for model-driven development of audio processing applications, which has recently been enhanced with support for parallel processing on multi-cores.
Convention Paper 7961

10:00

P1-2 Real-Time Additive Synthesis with One Million Sinusoids Using a GPU—Lauri Savioja,1,2 Vesa Välimäki,2 Julius O. Smith III3
1NVIDIA Research, Helsinki, Finland
2Aalto University School of Science and Technology, Espoo, Finland
3Stanford University, Palo Alto, CA, USA

AES 128th Convention Program
May 22 – 25, 2010
Novotel London West, London, UK


Additive synthesis is one of the fundamental sound synthesis techniques. It is based on the principle that each sound can be represented as a superposition of sine waves of different frequencies. That task is fully parallelizable and thus suitable for GPU (graphics processing unit) implementation. In this paper we show that it is possible to compute over one million unique sine waves in real time using a current GPU. The performance depends on the applied buffer sizes, but a result close to the maximum is already reachable with a buffer of 500 samples.
Convention Paper 7962
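The superposition principle described in the abstract can be illustrated with a short serial sketch (a plain-Python reference version, not the paper's GPU implementation; the function name and parameters are illustrative):

```python
import math

def additive_synth(partials, sample_rate=48000, n_samples=500):
    """Sum a list of (amplitude, frequency_hz, phase) partials into one
    output buffer. Each partial is independent of the others, which is
    what makes the method parallelize well; this loop is the serial
    reference version."""
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        out.append(sum(a * math.sin(2.0 * math.pi * f * t + ph)
                       for a, f, ph in partials))
    return out

# A 440 Hz tone with two harmonics, one 500-sample buffer:
buf = additive_synth([(1.0, 440.0, 0.0), (0.5, 880.0, 0.0),
                      (0.25, 1320.0, 0.0)])
```

On a GPU, each partial (or each output sample) would map to a thread, with partial results combined in a reduction step.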

10:30

P1-3 A GPGPU Approach to Improved Acoustic Finite Difference Time Domain Calculations—Jamie A. S. Angus, Andrew Caunce, University of Salford, Salford, Greater Manchester, UK

This paper shows how to improve the efficiency and accuracy of Finite Difference Time Domain acoustic simulation by both calculating the differences using spectral methods and performing these calculations on a Graphics Processing Unit (GPU) rather than a CPU. These changes to the calculation method result in an increase in accuracy as well as a reduction in computational expense. The recent advances in the way that GPUs are programmed (for example using CUDA on Nvidia's GPUs) now make them an ideal platform on which to perform scientific computations at very high speeds and very low power consumption.
Convention Paper 7963
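For reference, the conventional time-domain update that such work starts from looks like this in one dimension (standard second-order centered differences; the paper replaces the spatial difference with a spectral method and runs the update on a GPU; all names here are illustrative):

```python
def fdtd_step(p_prev, p_curr, courant=0.5):
    """Advance a 1-D acoustic pressure field by one leapfrog time step:
    p_next = 2*p - p_prev + C^2 * (discrete Laplacian of p).
    Boundary samples are held at zero (fixed ends)."""
    c2 = courant * courant
    p_next = [0.0] * len(p_curr)
    for i in range(1, len(p_curr) - 1):
        lap = p_curr[i - 1] - 2.0 * p_curr[i] + p_curr[i + 1]
        p_next[i] = 2.0 * p_curr[i] - p_prev[i] + c2 * lap
    return p_next

# An impulse spreads outward from the center of a small grid:
state = fdtd_step([0.0] * 5, [0.0, 0.0, 1.0, 0.0, 0.0])
```

Each grid point depends only on its immediate neighbors, so all points of one time step can be updated in parallel.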

11:00

P1-4 Digital Equalization Filter: New Solution to the Frequency Response Near Nyquist and Evaluation by Listening Tests—Thorsten Schmidt,1 Joerg Bitzer2
1Cube-Tec International, Bremen, Germany
2Jade-University of Applied Sciences, Oldenburg, Germany

Current design methods for digital equalization filters face the problem of a frequency response that increasingly deviates from its analog equivalent close to the Nyquist frequency. This paper deals with a new way to design equalization filters that improves this behavior over the entire frequency range between 0 Hz (DC) and Nyquist. The theoretical approach is shown, and examples of low-pass, peak, and shelving filters are compared to state-of-the-art techniques. Listening tests were conducted to verify the audible differences and rate the quality of the different design methods.
Convention Paper 7964

11:30

P1-5 Audio Equalization with Fixed-Pole Parallel Filters: An Efficient Alternative to Complex Smoothing—Balázs Bank, Budapest University of Technology and Economics, Budapest, Hungary

Recently, the fixed-pole design of parallel second-order filters has been proposed to accomplish arbitrary frequency resolution similar to Kautz filters, at 2/3 of their computational cost. This paper relates the parallel filter to the complex smoothing of transfer functions. Complex smoothing is a well-established method for limiting the frequency resolution of audio transfer functions for analysis, modeling, and equalization purposes. It is shown that the parallel filter response is similar to the one obtained by complex smoothing the target response using a Hanning window: a 1/β octave resolution is achieved by using β/2 pole pairs per octave in the parallel filter. Accordingly, the parallel filter can either be used as an efficient implementation of smoothed responses, or it can be designed from the unsmoothed responses directly, eliminating the need for frequency-domain processing. In addition, the theoretical equivalence of parallel filters and Kautz filters is developed, and the formulas for converting between the parameters of the two structures are given. Examples of loudspeaker-room equalization are provided.
Convention Paper 7965
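The resolution rule quoted in the abstract, 1/β-octave resolution from β/2 pole pairs per octave, amounts to placing the pole frequencies on a logarithmic grid. A minimal sketch of such a pole layout (illustrative only, not the author's design code; the function name and frequency range are assumptions):

```python
import math

def fixed_pole_frequencies(f_lo, f_hi, beta):
    """Return pole-pair frequencies placed logarithmically with beta/2
    pairs per octave between f_lo and f_hi, corresponding to a
    1/beta-octave frequency resolution of the resulting parallel filter."""
    pairs_per_octave = beta / 2.0
    n_octaves = math.log2(f_hi / f_lo)
    n_pairs = int(round(n_octaves * pairs_per_octave))
    return [f_lo * 2.0 ** (k / pairs_per_octave) for k in range(n_pairs + 1)]

# beta = 2: one pole pair per octave (half-octave resolution), giving
# 11 pole-pair frequencies over the 10 octaves from 20 Hz to 20.48 kHz.
freqs = fixed_pole_frequencies(20.0, 20480.0, 2)
```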

12:00

P1-6 Rapid and Automated Development of Audio Digital Signal Processing Algorithms for Mobile Devices—David Trainor, APTX, Belfast, N. Ireland, UK

Software applications and programming languages are available to assist audio DSP algorithm developers and mobile device designers, including Matlab/Simulink, C/C++, and assembly languages. These tools provide some assistance for algorithmic experimentation and subsequent refinement to highly efficient embedded software. However, a typical design flow is still highly iterative, with manual software recoding, translation, and optimization. This paper introduces software libraries and design techniques that integrate existing commercial audio algorithm design tools and permit intuitive algorithmic experimentation and automated translation of audio algorithms to efficient embedded software. These techniques have been incorporated into a new software framework, and the operation of this framework is described using the example of a custom audio coding algorithm targeted to a mobile audio device.
Convention Paper 7966

Session P2 Saturday, May 22
09:30 – 12:30 Room C5

SOUND REINFORCEMENT AND ROOM ACOUSTICS

Chair: Glenn Leembruggen, Acoustic Directions Pty Ltd., ICE Design, Sydney, Australia, and University of Sydney, Sydney, Australia

09:30

P2-1 Simultaneous Soundfield Reproduction at Multiple Spatial Regions—Yan Jennifer Wu, Thushara D. Abhayapala, The Australian National University, Canberra, Australia

Reproduction of different sound fields simultaneously at multiple spatial regions is a complex problem in acoustic signal processing. In this paper we present a framework to recreate 2-D sound fields simultaneously at multiple spatial regions using a single loudspeaker array. We propose a novel method of finding an equivalent global sound field that consists of each individual desired sound field, by translating the spatial harmonic coefficients between the coordinate systems. This method makes full use of the available dimensionality of the sound field. Some fundamental limits are also revealed. Specifically, the number of spatial regions that can be reproduced inside the single loudspeaker array is determined by the total dimensionality of each sound field.
Convention Paper 7967

10:00

P2-2 Verification of Geometric Acoustics-Based Auralization Using Room Acoustics Measurement Techniques—Aglaia Foteinou,1 Damian Murphy,1 Anthony Masinton2
1University of York, Heslington, UK
2University of York, King's Manor, York, UK

Geometric acoustics techniques such as ray-tracing, image-source, and variants are commonly used in the simulation and auralization of the acoustics of an enclosed space. The shortcomings of such techniques are well documented, and yet they are still the methods most commonly used and accepted in architectural acoustic design. This work compares impulse response-based objective acoustic measures obtained from a 3-D model of a medieval English church, using a combination of image-source and ray-tracing geometric acoustic methods, with measurements obtained within the actual space. The results are presented in both objective and subjective terms, and include an exploration of optimized boundary materials and source directivity characteristics, with problems and limitations clarified.
Convention Paper 7968

10:30

P2-3 Analysis Tool Development for the Investigation of Low Frequency Room Acoustics by Means of Finite Element Method—Christos Sevastiadis, George Kalliris, George Papanikolaou, Aristotle University of Thessaloniki, Thessaloniki, Greece

The sound wave behavior at low frequencies in small sound recording, control, and reproduction rooms results in resonant sound fields observed in both the frequency and spatial domains. Finite Element Method (FEM) software applications provide potential analysis procedures for acoustics problems, but there is a lack of tools focusing on room acoustics investigation. The present paper is the result of an attempt to develop a room acoustics modeling tool integrated with a FEM. Simple room construction, common acoustical treatments, and multiple source excitations are involved in modal and steady-state analysis procedures in order to simplify the solution of problematic sound fields. The key parameters and features, as well as experimental validation results, are presented.
Convention Paper 7969

11:00

P2-4 Subwoofers in Rooms: Experimental Modal Analysis—Juha Backman, Nokia Corporation, Espoo, Finland

The behavior of a loudspeaker in a room depends fully on the coupling of the loudspeaker to the room modes, which are individually identifiable in the subwoofer frequency range. The modes can be determined through computational methods if the surface properties and room geometry are simple enough, but systems with complex properties have to be analyzed experimentally. The paper describes the use of modal analysis techniques, usually applied only to two-dimensional structures, for three-dimensional spaces to determine experimentally the actual modes and their properties.
Convention Paper 7970

11:30

P2-5 Subwoofer Positioning, Orientation, and Calibration for Large-Scale Sound Reinforcement—Adam Hill,1 Malcolm Hawksford,1 Adam P. Rosenthal,2 Gary Gand2
1University of Essex, Colchester, UK
2Gand Concert Sound, Glenview, IL, USA

It is often difficult to achieve even coverage at low frequencies across a large audience area. To complicate matters, it is desirable to have tight control of the low-frequency levels on the stage. This is generally dealt with by using cardioid subwoofers. While this helps control the stage area, the audience area receives no clear benefit. This paper investigates how careful positioning, orientation, and calibration of a multiple-subwoofer system can provide enhanced low-frequency coverage, both in the audience area and on the stage. The effects of placement underneath, on top of, and in front of the stage are investigated, as well as the performance of systems consisting of both flown and ground-based subwoofers.
Convention Paper 7971

12:00

P2-6 Live Measurements of Ground-Stacked Subwoofers' Performance—Elena Shabalina,1 Mathias Kaiser,1 Janko Ramuscak2
1RWTH Aachen University, Aachen, Germany
2d&b audiotechnik GmbH, Backnang, Germany

Various software-based measurement systems can be used for live sound measurements with music or speech signals in occupied concert halls with sufficient accuracy. However, for some special applications, such as measuring the performance of a ground-stacked bass array component of a multiple-component sound reinforcement system in an occupied hall, these systems cannot be used directly, since it is not possible to run a bass array alone during a concert. A simple method to obtain the subwoofers' impulse response using program signal measurements of a full-range PA system is proposed. The desired impulse response results from the subtraction of full-system impulse responses with and without the subwoofers running. Required measurement conditions and limitations are discussed.
Convention Paper 7972
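The subtraction step the abstract describes can be transcribed directly; a minimal sketch, assuming the two measured responses are time-aligned and equally long (function and variable names are illustrative):

```python
def subwoofer_impulse_response(ir_full, ir_no_subs):
    """Estimate the subwoofers' impulse response as the sample-by-sample
    difference between the full-system response and the response measured
    with the subwoofers muted."""
    if len(ir_full) != len(ir_no_subs):
        raise ValueError("responses must be equally long and time-aligned")
    return [a - b for a, b in zip(ir_full, ir_no_subs)]

# Toy example with three samples:
ir_subs = subwoofer_impulse_response([1.0, 0.5, 0.2], [0.4, 0.1, 0.2])
```

In practice the accuracy of this difference hinges on the measurement conditions the paper discusses (alignment, level matching, and an unchanged full-range system between the two measurements).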


Saturday, May 22 10:00 Room Saint Julien
Technical Committee Meeting on Perception and Subjective Evaluation of Audio Signals

Session P3 Saturday, May 22
10:30 – 12:00 C4-Foyer

POSTERS: RECORDING, PRODUCTION, AND REPRODUCTION—MULTICHANNEL AND SPATIAL AUDIO

10:30

P3-1 21-Channel Surround System Based on Physical Reconstruction of a Three-Dimensional Target Sound Field—Jeongil Seo,1 Jae-hyoun Yoo,1 Kyeongok Kang,1 Filippo M. Fazi2
1ETRI, Daejeon, Korea
2University of Southampton, Southampton, UK

This paper presents a 21-channel sound field reconstruction system based on the physical reconstruction of a three-dimensional target sound field over a pre-defined control volume. According to the virtual sound source position and intensity, each loudspeaker signal is estimated through convolution with an appropriate FIR filter to reconstruct a target sound field. In addition, the gain of the FIR filter is applied only to the mid-frequency band of a sound source signal to prevent aliasing effects and to save computational complexity at the high frequency bands. Also, the whole filter processing is carried out in the frequency domain to suit real-time application. Through subjective listening tests, the proposed system showed better performance in localization in the horizontal plane compared with a conventional panning method.
Convention Paper 7973

10:30

P3-2 Real-Time Implementation of Wave Field Synthesis on NU-Tech Framework Using CUDA Technology—Ariano Lattanzi,1 Emanuele Ciavattini,1 Stefania Cecchi,2 Laura Romoli,2 Fabrizio Ferrandi3
1Leaff Engineering, Ancona, Italy
2Università Politecnica delle Marche, Ancona, Italy
3Politecnico di Milano, Milan, Italy

In this paper we present a novel implementation of a Wave Field Synthesis application based on the emerging NVIDIA Compute Unified Device Architecture (CUDA) technology using the NU-Tech Framework. CUDA technology unlocks the processing power of Graphics Processing Units (GPUs), which are characterized by a highly parallel architecture. A wide range of complex algorithms are being re-written in order to benefit from this new approach. Wave Field Synthesis is a relatively new spatial audio rendering technique that is highly demanding in terms of computational power. We present here results and comparisons between a NU-Tech Plug-In (NUTS) implementing real-time WFS using CUDA libraries and the same algorithm implemented using the Intel Integrated Performance Primitives (IPP) library.
Convention Paper 7974

10:30

P3-3 Investigation of 3-D Audio Rendering with Parametric Array Loudspeakers—Reuben Johannes, Jia-Wei Beh, Woon-Seng Gan, Ee-Leng Tan, Nanyang Technological University, Singapore

This paper investigates the applicability of parametric array loudspeakers to render 3-D audio. Unlike conventional loudspeakers, parametric array loudspeakers are able to produce sound in a highly directional manner, therefore reducing inter-aural crosstalk and room reflections. The investigation is carried out by performing objective evaluation and comparison between parametric array loudspeakers, conventional loudspeakers, and headphones. The objective evaluation includes crosstalk measurement and binaural cue analysis using a binaural hearing model. Additionally, the paper also investigates how the positioning of the parametric array loudspeakers affects 3-D audio rendering.
Convention Paper 7975

10:30

P3-4 Robust Representation of Spatial Sound in Stereo-to-Multichannel Upmix—Se-Woon Jeon,1 Young-Cheol Park,2 Seok-Pil Lee,3 Dae-Hee Youn1
1Yonsei University, Seoul, Korea
2Yonsei University, Gangwon, Korea
3Korea Electronics Technology Institute (KETI), Seoul, Korea

This paper presents a stereo-to-multichannel upmix algorithm based on a source separation method. In conventional upmix algorithms, panning source and ambient components are decomposed or separated by an adaptive algorithm, i.e., least-squares (LS) or least-mean-square (LMS). The separation performance of those algorithms is easily influenced by the primary-to-ambient energy ratio (PAR). Since the PAR is time-varying, it causes energy fluctuation of the separated sound sources. To prevent this problem, we propose a robust separation algorithm using a pseudo-inverse matrix, and we propose a novel post-scaling algorithm to compensate for the influence of interference while considering the desired multichannel format. The performance of the proposed upmix algorithm is confirmed by subjective listening tests in the ITU 3/2 format.
Convention Paper 7976

10:30

P3-5 Ambisonic Decoders: Is Historical Hardware the Future?—Andrew J. Horsburgh, D. Fraser Clark, University of the West of Scotland, Paisley, UK

Ambisonic recordings aim to create full-sphere audio fields through using a multi-capsule microphone and algorithms based on a "metatheory" as proposed by Gerzon. Until recently, Ambisonic decoding was solely implemented using hardware. Recent advances in computing power now allow for software decoders to supersede hardware units. It is therefore of interest to determine which of the hardware or software decoders provide the most accurate decoding of Ambisonic B-format signals. In this paper we present a comparison between hardware and software decoders with respect to their frequency and phase relationships to determine the most accurate reproduction. Results show that software is able to decode the files with little coloration compared to hardware circuits. It is possible to see which implementation of the analog or digital decoders matches the behavioral characteristics of an "ideal" decoder.
Convention Paper 7977

10:30

P3-6 Localization Curves in Stereo Microphone Techniques—Comparison of Calculations and Listening Tests Results—Magdalena Plewa, Grzegorz Pyda, AGH University of Science and Technology, Kraków, Poland

Stereo microphone techniques are a simple and usable solution for reproducing music scenes while maintaining sound direction. To choose a proper microphone setup one needs to know the "recording angle," which can easily be determined from localization curves. Localization curves for different stereo microphone techniques can be determined with the use of calculations that take into consideration interchannel level and time differences. Subjective listening tests were carried out to verify the mentioned calculations. Here we present and discuss the comparison between the calculations and the listening tests.
Convention Paper 7978

10:30

P3-7 Investigation of Robust Panning Functions for 3-D Loudspeaker Setups—Johann-Markus Batke, Florian Keiler, Technicolor Research and Innovation, Hannover, Germany

Accurate localization is a key goal for a spatial audio reproduction system. This paper discusses different approaches for audio playback with full spatial information in three dimensions (3-D). Problems of established methods for 3-D audio playback, like the Ambisonics mode matching approach or Vector Base Amplitude Panning (VBAP), are discussed. A new approach is presented with special attention to the treatment of irregular loudspeaker setups, as they are to be expected in real-world scenarios like living rooms. This new approach leads to better localization of virtual acoustic sources. Listening tests comparing the new approach with standard mode matching and VBAP will be described in a companion paper.
Convention Paper 7979
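As background, the VBAP method mentioned in the abstract computes the panning gains of a loudspeaker pair by expressing the source direction as a linear combination of the loudspeaker direction vectors. A 2-D sketch of the standard formulation (illustrative names; this is not the new approach the paper proposes):

```python
import math

def vbap_2d_gains(src_deg, spk1_deg, spk2_deg):
    """2-D Vector Base Amplitude Panning: solve p = g1*l1 + g2*l2, where
    p is the source direction and l1, l2 are the unit vectors of the
    active loudspeaker pair, then normalize the gains to constant power."""
    def to_vec(d):
        return (math.cos(math.radians(d)), math.sin(math.radians(d)))
    p, l1, l2 = to_vec(src_deg), to_vec(spk1_deg), to_vec(spk2_deg)
    det = l1[0] * l2[1] - l1[1] * l2[0]   # invert the 2x2 base matrix
    g1 = (p[0] * l2[1] - p[1] * l2[0]) / det
    g2 = (l1[0] * p[1] - l1[1] * p[0]) / det
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm

# A source centered between loudspeakers at +30 and -30 degrees
# receives equal gains:
g1, g2 = vbap_2d_gains(0.0, 30.0, -30.0)
```

With irregular setups the loudspeaker base vectors become poorly conditioned for some directions, which is the problem the paper addresses.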

Workshop 1 Saturday, May 22
10:30 – 12:30 Room C6

AUDIO NETWORK CONTROL PROTOCOLS

Chair: Richard Foss, Rhodes University, Grahamstown, South Africa

Panelists: John Grant, Nine Tiles, Cambridge, UK
Robby Gurdan, Universal Media Access Networks (UMAN), Düsseldorf, Germany
Stefan Ledergerber, Harman Pro Audio, Washington, DC, USA
Philip Nye, Engineering Arts, Bournemouth, UK
Andy W. Schmeder, University of California at Berkeley, Berkeley, CA, USA

Digital audio networks have solved a number of problems related to the distribution of audio within a number of contexts, including recording studios, stadiums, convention centers, theaters, and live concerts. They provide cabling ease, better immunity to interference, and enhanced control over audio routing and signal processing when compared to analog solutions. There exist a number of audio network types and also a number of audio network protocols that define the messaging necessary for connection management and control. The problem with this range of protocol solutions is that a large number of professional audio devices are being manufactured without regard to global interoperability. In this workshop a panel of audio network protocol experts will share the features of the audio network protocols that they are familiar with and what features should appear within a common protocol.

Saturday, May 22 11:00 Room Saint Julien
Technical Committee Meeting on Studio Practices and Production

Special Event
AWARDS PRESENTATION AND KEYNOTE ADDRESS
Saturday, May 22, 12:45 – 13:30 Room C3

Opening Remarks:
• Executive Director Roger Furness
• President Diemer de Vries
• Convention Chair Josh Reiss

Program:
• AES Awards Presentation
• Introduction of Keynote Speaker by Convention Chair Josh Reiss
• Keynote Address by Masataka Goto

Awards Presentation
Please join us as the AES presents Special Awards to those who have made outstanding contributions to the Society in such areas as research, scholarship, and publications, as well as other accomplishments that have contributed to the enhancement of our industry. The awardees are:

Honorary Member of AES:
• George Martin

Citation:
• Salvador Castaneda Valdes

Board of Governors Award:
• Jan Abildgaard Pedersen
• Hiroaki Suzuki
• Martin Wöhr

Fellowship Award:
• John Grant
• Bozena Kostek
• Ville Pulkki
• Mark Yonge

Silver Medal Award:
• Ronald Aarts

Keynote Speaker

This year's Keynote Speaker is Masataka Goto. Masataka Goto is the leader of the Media Interaction Group at the National Institute of Advanced Industrial Science and Technology (AIST), Japan. In 1992 he was one of the first to start work on automatic music understanding, and he has since been at the forefront of research in music technologies and music interfaces based on those technologies. Since 1998 he has also worked on speech recognition interfaces. He has published more than 160 papers in refereed journals and international conferences. Over the past 18 years he has received 25 awards, including the Young Scientists' Prize, the Commendation for Science and Technology by the Minister of Education, Culture, Sports, Science and Technology, the Excellence Award in Fundamental Science of the DoCoMo Mobile Science Awards, and the Best Paper Award of the Information Processing Society of Japan (IPSJ). He has served as a committee member of over 60 scientific societies and conferences and was the General Chair of the 10th International Society for Music Information Retrieval Conference (ISMIR 2009) and the Chair of the IPSJ Special Interest Group on Music and Computer (SIGMUS). The title of his speech is "Music Listening in the Future."

Music understanding technologies based on signal processing have the potential to enable new ways of listening to music—for everyone. We have developed several Active Music Listening interfaces to demonstrate how end users can benefit from these new technologies. Active Music Listening aims at allowing the user to understand better the music he or she listens to and to actively influence the listening experience. For example, our Active Music Listening interface "SmartMusicKIOSK" has a chorus-search function that enables the user to access directly his or her favorite part of a song (and to skip others) while viewing a visual representation of its musical structure. The software "LyricSynchroniser" can automatically synchronize song lyrics to a recording and highlights the phrase currently sung. A user can easily follow the current playback position and click on a word in the lyrics to listen to it. Given polyphonic sound mixtures taken from available music recordings, our interfaces thus enrich end-users' music listening experiences and open up new ways of music listening in the future.

Session P4 Saturday, May 22 14:00 – 18:00 Room C3

SPATIAL SIGNAL PROCESSING

Chair: Francis Rumsey

14:00

P4-1 Classification of Time-Frequency Regions in Stereo Audio—Aki Härmä, Philips Research Europe, Eindhoven, The Netherlands

This paper addresses the classification of time-frequency (TF) regions in stereo audio data by the type of mixture the region represents. Detecting the type of mixing is necessary, for example, in source separation, upmixing, and audio manipulation applications. We propose a generic signal model and a method to classify the TF regions into six classes that are different combinations of central, panned, and uncorrelated sources. We give an overview of traditional techniques for comparing frequency-domain data and propose a new approach to classification that is based on measures specially trained for the six classes. The performance of the new measures is studied and demonstrated using synthetic and real audio data.
Convention Paper 7980
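The six-class scheme described in the abstract rests on two per-bin quantities that are straightforward to compute from a stereo STFT: an inter-channel coherence and a panning index. The sketch below is not the authors' method; the thresholds, smoothing constant, and four-way labeling are illustrative stand-ins.

```python
import numpy as np

def smooth(X, alpha=0.8):
    """Recursive (exponential) averaging along the time axis."""
    Y = np.empty_like(X)
    acc = X[:, 0].copy()
    Y[:, 0] = acc
    for n in range(1, X.shape[1]):
        acc = alpha * acc + (1 - alpha) * X[:, n]
        Y[:, n] = acc
    return Y

def classify_tf_bins(L, R, eps=1e-12):
    """Label each STFT bin: 0 central, 1 panned left, 2 panned right,
    3 uncorrelated. Thresholds (0.7, 0.2) are illustrative guesses."""
    num = smooth(L * np.conj(R))                 # smoothed cross-spectrum
    pL, pR = smooth(np.abs(L) ** 2), smooth(np.abs(R) ** 2)
    coh = np.abs(num) / np.sqrt(pL * pR + eps)   # inter-channel coherence
    pan = (pR - pL) / (pL + pR + eps)            # -1 = left ... +1 = right
    labels = np.full(L.shape, 3)                 # default: uncorrelated
    labels[(coh > 0.7) & (np.abs(pan) < 0.2)] = 0
    labels[(coh > 0.7) & (pan <= -0.2)] = 1
    labels[(coh > 0.7) & (pan >= 0.2)] = 2
    return labels

# a hard-right-panned correlated source should be labeled 2 everywhere
demo = np.exp(2j * np.pi * np.random.default_rng(1).random((32, 40)))
print((classify_tf_bins(0.2 * demo, demo) == 2).all())  # True
```

The time smoothing is essential: without averaging, the per-bin coherence of any two signals is identically 1.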

14:30

P4-2 A Comparison of Computational Precedence Models for Source Separation in Reverberant Environments—Christopher Hummersone, Russell Mason, Tim Brookes, University of Surrey, Guildford, UK

Reverberation continues to be problematic in many areas of audio and speech processing, including source separation. The precedence effect is an important psychoacoustic tool utilized by humans to assist in localization by suppressing reflections arising from room boundaries. Numerous computational precedence models have been developed over the years, and all suggest quite different strategies for handling reverberation. However, relatively little work has been done on incorporating precedence into source separation. This paper details a study comparing several computational precedence models and their impact on the performance of a baseline separation algorithm. The models are tested in a range of reverberant rooms and with a range of other mixture parameters. Large differences in the performance of the models are observed. The results show that a model based on interaural coherence produces the greatest performance gain over the baseline algorithm.
Convention Paper 7981

15:00

P4-3 Converting Stereo Microphone Signals Directly to MPEG Surround—Christophe Tournery,1 Christof Faller,1 Fabian Kuech,2 Jürgen Herre2
1Illusonic LLC, Lausanne, Switzerland
2Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

We have previously proposed a way to use stereo microphones with spatial audio coding to record and code surround sound. In this paper we describe further considerations and improvements needed to convert stereo microphone signals directly to MPEG Surround, i.e., a downmix signal plus a bit stream. It is described in detail how to obtain from the microphone channels the information needed for computing MPEG Surround spatial parameters and how to process the microphone signals to transform them into an MPEG Surround compatible downmix.
Convention Paper 7982

15:30

P4-4 Modification of Spatial Information in Coincident-Pair Recordings—Jeremy Wells, University of York, York, UK

A novel method is presented for modifying the spatial information contained in the output from a stereo coincident pair of microphones. The purpose of this method is to provide additional decorrelation of the audio at the left and right replay channels for sound arriving at the sides of a coincident pair, but to retain the imaging accuracy for sounds arriving to the front or rear or where the entire sound field is highly correlated.


Audio Engineering Society 128th Convention Program, 2010 Spring 7

Details of how this is achieved are given and results for different types of sound field are presented.
Convention Paper 7983

16:00

P4-5 Unitary Matrix Design for Diffuse Jot Reverberators—Fritz Menzer, Christof Faller, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

This paper presents different methods for designing unitary mixing matrices for Jot reverberators, with a particular emphasis on cases where no early reflections are to be modeled. Possible applications include diffuse sound reverberators and decorrelators. The trade-off between effective mixing between channels and the number of multiply operations per channel and output sample is investigated, as well as the relationship between the sparseness of powers of the mixing matrix and the sparseness of the impulse response.
Convention Paper 7984
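For readers unfamiliar with Jot reverberators: the feedback matrix must be unitary so that the lossless core of the reverberator preserves energy, and its sparseness governs both per-sample cost and how quickly the channels mix. A minimal illustration of that trade-off, contrasting a Hadamard matrix with a unitary permutation matrix (the choice of N = 8 is arbitrary, and neither matrix is claimed to be one of the paper's designs):

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an orthonormal n x n Hadamard matrix
    (n must be a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

N = 8
H = hadamard(N)
assert np.allclose(H @ H.T, np.eye(N))   # unitary: energy-preserving

# A Hadamard matrix feeds every delay line into every other one in a
# single pass (N*N nonzero entries: dense, fast mixing), whereas a
# circular-shift permutation matrix is also unitary but maximally
# sparse (N entries: cheap per output sample, but slow mixing, so echo
# density in the impulse response builds up much more gradually).
P = np.roll(np.eye(N), 1, axis=0)
assert np.allclose(P @ P.T, np.eye(N))
print(np.count_nonzero(H), np.count_nonzero(P))  # 64 8
```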

16:30

P4-6 Sound Field Indicators for Hearing Activity and Reverberation Time Estimation in Hearing Instruments—Andreas P. Streich,1 Manuela Feilner,2 Alfred Stirnemann,2 Joachim M. Buhmann1
1ETH Zurich, Zurich, Switzerland
2Phonak AG, Stäfa, Switzerland

Sound field indicators (SFI) are proposed as a new feature set to estimate the hearing activity and reverberation time in hearing instruments. SFIs are based on physical measurements of the sound field. A variant thereof, called SFI short-time statistics (SFIst2), is obtained by computing means and standard deviations of SFIs on 10 subframes. To show the utility of these feature sets for the mentioned prediction tasks, experiments are carried out on artificially reverberated recordings of a large variety of sounds encountered in daily life. In a classification scenario where the hearing activity is to be predicted, both SFI and SFIst2 yield clearly superior accuracy even compared to hand-tailored features used in state-of-the-art hearing instruments. For regression on the reverberation time, the SFI-based features yield a lower residual error than standard feature sets and reach the performance of specially designed features. The hearing activity classification is mainly based on the averages of the SFIs, while the standard deviation over subframes is used heavily to predict the reverberation time.
Convention Paper 7985

17:00

P4-7 Stereo-to-Binaural Conversion Using Interaural Coherence Matching—Fritz Menzer, Christof Faller, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

In this paper a method of converting stereo recordings to simulated binaural recordings is presented. The stereo signal is separated into coherent and diffuse sound based on the assumption that the signal comes from a coincident symmetric microphone setup. The coherent part is reproduced using HRTFs, and the diffuse part is reproduced using filters adapting the interaural coherence to the interaural coherence a binaural recording of diffuse sound would have.
Convention Paper 7986

17:30

P4-8 Linear Simulation of Spaced Microphone Arrays Using B-Format Recordings—Andreas Walther, Christof Faller, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

A novel approach for linear post-processing of B-Format recordings is presented. The goal is to simulate spaced microphone arrays by approximating and virtually recording the sound field at the position of each single microphone. The delays occurring in non-coincident recordings are simulated by translating an approximative plane wave representation of the sound field to the positions of the microphones. The directional responses of the spaced microphones are approximated by linear combination of the corresponding translated B-format channels.
Convention Paper 7987

Session P5 Saturday, May 22 14:00 – 18:00 Room C5

LOUDSPEAKERS AND HEADPHONES: PART 1

Chair: Mark Dodd, GP Acoustics, UK

14:00

P5-1 Micro Loudspeaker Behavior versus 6½-Inch Driver, Micro Loudspeaker Parameter Drift—Bo Rohde Pedersen, Aalborg University Esbjerg, Esbjerg, Denmark

This study tested micro loudspeaker behavior from the perspective of loudspeaker parameter drift. The main difference between traditional transducers and micro loudspeakers, apart from their size, is their suspension construction. The suspension generally is a loudspeaker’s most unstable parameter, and the study investigated temperature drift and signal dependency. We investigated three different micro loudspeakers and compared their behavior to that of a typical bass mid-range loudspeaker unit. All linear loudspeaker parameters were measured at different temperatures.
Convention Paper 7988

14:30

P5-2 Modeling a Loudspeaker as a Flexible Spherical Cap on a Rigid Sphere—Ronald Aarts,1,2 Augustus J. E. M. Janssen2
1Philips Research Europe, Eindhoven, The Netherlands

2Technical University Eindhoven, Eindhoven, The Netherlands

It has been argued that the sound radiation of a loudspeaker is modeled realistically by assuming the loudspeaker cabinet to be a rigid sphere with a moving rigid spherical cap. Series expansions, valid in the whole space on and outside the sphere, for the pressure due to a harmonically excited, flexible cap with an axially symmetric velocity distribution are presented. The velocity profile is expanded in functions orthogonal on the cap rather than on the whole sphere. This has the advantage that only a few expansion coefficients are sufficient to accurately describe the velocity profile. An adaptation of the standard solution of the Helmholtz equation to this particular parametrization is required. This is achieved by using recent results on argument scaling in orthogonal Zernike polynomials. The efficacy of the approach is exemplified by calculating various acoustical quantities, with particular attention to certain velocity profiles that vanish at the rim of the cap to a desired degree. These quantities are: the sound pressure, polar response, baffle-step response, sound power, directivity, and acoustic center of the radiator. The associated inverse problem, in which the velocity profile is estimated from pressure measurements around the sphere, is feasible as well since the number of expansion coefficients to be estimated is limited. This is demonstrated with a simulation.
Convention Paper 7989

15:00

P5-3 Closed and Vented Loudspeaker Alignments that Compensate for Nearby Room Boundaries—Jamie A. S. Angus, University of Salford, Salford, Greater Manchester, UK

The purpose of this paper is to present a method of designing loudspeakers in which the presence of the nearest three boundaries is taken into account in the design of the low-frequency speaker. The paper will first review the effects of the three boundaries. It will then discuss how these effects might be compensated for. The paper will then examine the low-frequency behavior of loudspeaker drive units and make suggestions for new alignments that take account of the boundaries. It will conclude with some simulation examples.
Convention Paper 7990
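The boundary effect the paper sets out to compensate can be estimated with a textbook image-source model: each nearby rigid boundary contributes one mirror image of the low-frequency driver, and at wavelengths long compared with the boundary distances the four sources add nearly in phase, lifting the response by up to 12 dB. The sketch below is a generic illustration of that effect, not the paper's alignment method; the distances, positions, and perfectly rigid (+1 reflection coefficient) boundaries are made-up example values.

```python
import numpy as np

C = 343.0  # speed of sound, m/s

def boundary_gain_db(f, listener, source, boundary_dists):
    """Pressure gain (dB) relative to free field at frequency f (Hz).
    boundary_dists: source distances to floor, side wall, rear wall (m).
    Each boundary is modeled as one mirror image of the source."""
    src = np.asarray(source, float)
    lis = np.asarray(listener, float)
    images = [src.copy() for _ in range(3)]
    images[0][2] -= 2 * boundary_dists[0]   # mirrored in the floor (z)
    images[1][1] -= 2 * boundary_dists[1]   # mirrored in the side wall (y)
    images[2][0] -= 2 * boundary_dists[2]   # mirrored in the rear wall (x)
    k = 2 * np.pi * f / C
    p = 0j
    for pos in [src] + images:              # coherent sum of all sources
        r = np.linalg.norm(lis - pos)
        p += np.exp(-1j * k * r) / r
    r0 = np.linalg.norm(lis - src)
    return 20 * np.log10(abs(p) / (1 / r0))

listener = (3.0, 0.0, 0.0)
source = (0.0, 0.0, 0.0)
dists = (0.3, 0.5, 0.2)   # hypothetical source-to-boundary distances
for f in (20, 100, 400):
    print(f, round(boundary_gain_db(f, listener, source, dists), 1))
```

At 20 Hz the gain approaches the full in-phase sum (near +12 dB); as frequency rises the images drift out of phase and the lift collapses into ripple, which is exactly the behavior a boundary-compensated alignment must counteract.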

15:30

P5-4 Point-Source Loudspeaker Reversely-Attached Acoustic Horn: Its Architecture, Acoustic Characteristics, and Application to HRTF Measurements—Takahiro Miura, Teruo Muraoka, Tohru Ifukube, The University of Tokyo, Tokyo, Japan

It is ideal to measure acoustic characteristics with a point-source sound. When a single source is recorded simultaneously by multiple microphones located less than 1 m from the source, it is difficult to regard the loudspeaker as a point source. In this paper we propose a point-source loudspeaker whose radiation diameter is smaller than 2 cm. The loudspeaker is designed by attaching the mouth of a hyperbolic horn to the diaphragm of a loudspeaker unit. Directional patterns of the proposed loudspeaker were measured at a distance of 50 cm from the radiation point in an anechoic chamber. As a result, the differences in directional intensity in the frequency range of 20 – 700 Hz were within 3 dB at any combination of azimuth and elevation. In the frequency range above 700 Hz, differences in azimuthal directional intensity were within 10 dB, while the elevational ones were within 20 dB.
Convention Paper 7991

16:00

P5-5 The Low-Frequency Acoustic Center: Measurement, Theory, and Application—John Vanderkooy, University of Waterloo, Waterloo, Ontario, Canada

At low frequencies the acoustic effect of a loudspeaker becomes simpler as the wavelength of the sound becomes large relative to the cabinet dimensions. One point acoustically acts as the center of the speaker at the lower frequencies. Measurements and acoustic boundary-element simulations verify the concept. Source radiation can be expressed as a multipole expansion, consisting of a spherical monopolar portion and a significant dipolar part, which becomes zero when the acoustic center is chosen as the origin. Theory shows a strong connection between diffraction and the position of the acoustic center. General criteria are presented to give the position of the acoustic center for different geometrical cabinet shapes. Polar plots benefit when the pivot point is chosen to be the acoustic center. For the first of several applications we consider a subwoofer, whose radiation into a room is strongly influenced by the position of the acoustic center. A second application that we consider is the accurate reciprocity calibration of microphones, for which it is necessary to know the position of the acoustic center. A final application is the effective position of the ears on the head at lower frequencies. Calculations and measurements show that the acoustic centers of the ears are well away from the head.
Convention Paper 7992

16:30

P5-6 Prediction of Harmonic Distortion Generated by Electro-Dynamic Loudspeakers Using Cascade of Hammerstein Models—Marc Rebillat,1,2 Romain Hennequin,3 Etienne Corteel,4 Brian F. G. Katz1
1LIMSI-CNRS, Orsay, France
2LMS (CNRS, École Polytechnique), Palaiseau, France

3Institut TELECOM, TELECOM ParisTech, Paris, France

4sonic emotion, Oberglatt (Zurich), Switzerland

Audio rendering systems are always slightly nonlinear. Their nonlinearities must be modeled and measured for quality evaluation and control purposes. Cascades of Hammerstein models describe a large class of nonlinearities. To identify the elements of such a model, a method based on a phase property of exponential sine sweeps is proposed. A complete model of nonlinearities is identified from a single measurement. Cascades of Hammerstein models corresponding to an electro-dynamic loudspeaker are identified this way. Harmonic distortion is afterward predicted using the identified models. Comparisons with classical measurement techniques show that harmonic distortion is accurately predicted by the identified models over the entire audio frequency range for any desired input amplitude.
Convention Paper 7993
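The phase property exploited by such methods is the classic exponential-sine-sweep result: after deconvolution with the inverse sweep, the impulse response associated with the n-th harmonic appears at a known time advance L·ln(n) ahead of the linear response, so a single sweep measurement separates all orders. A numerical sketch of that separation follows; the sweep parameters and the toy polynomial nonlinearity are illustrative, not the paper's setup.

```python
import numpy as np

fs = 8000
T = 2.0
f1, f2 = 20.0, 1000.0
t = np.arange(int(T * fs)) / fs
L = T / np.log(f2 / f1)                  # sweep rate constant
sweep = np.sin(2 * np.pi * f1 * L * (np.exp(t / L) - 1))

# short fades to suppress edge clicks in the deconvolution
nf = int(0.05 * fs)
ramp = 0.5 * (1 - np.cos(np.pi * np.arange(nf) / nf))
sweep[:nf] *= ramp
sweep[-nf:] *= ramp[::-1]

# memoryless polynomial standing in for loudspeaker distortion
y = sweep + 0.3 * sweep**2 + 0.1 * sweep**3

# inverse filter: time-reversed sweep with a -6 dB/octave amplitude tilt
inv = sweep[::-1] * np.exp(-t / L)
h = np.convolve(y, inv)

lin_pos = int(np.argmax(np.abs(h)))      # linear impulse response
adv2 = int(fs * L * np.log(2))           # predicted 2nd-harmonic advance
peak2 = int(np.argmax(np.abs(h[:lin_pos - 1000])))  # strongest earlier peak
print(lin_pos - peak2, adv2)             # measured vs. predicted advance
```

The measured advance of the strongest pre-linear peak should agree closely with L·ln(2), confirming that the second-order contribution is cleanly separated in time.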

17:00

P5-7 Characterizing Studio Monitor Loudspeakers for Auralization—Ben Supper, Focusrite Audio Engineering Ltd., High Wycombe, UK

A method is presented for obtaining the frequency and phase response, directivity pattern, and some of the nonlinear distortion characteristics of studio monitor loudspeakers. Using a specially designed test signal, the impulse response and directivity pattern are measured in a small recording room. A near-field measurement is also taken. An algorithm is presented for combining the near- and far-field responses in order to remove the early reflections of the room. Doppler distortion can be calculated using recorded and measured properties of the loudspeaker. The result is a set of loudspeaker impulse and directional responses that are detailed enough for convincing auralization.
Convention Paper 7994

17:30

P5-8 Automated Design of Loudspeaker Diaphragm Profile by Optimizing the Simulated Radiated Sound Field with Experimental Validation—Patrick Macey,1 Kelvin Griffiths2
1PACSYS Limited, Nottingham, UK
2Harman International Automotive Division, Bridgend, UK

Loudspeaker designers often aim to reduce the acoustic effects of diaphragm resonance on the frequency and power response. However, even in the domain of simulation models, this often requires considerable trial and error to produce a satisfactory outcome. A modeling platform to simulate loudspeakers is presented, which can automatically cycle through vast permutations guided by a carefully considered objective function until convergence is achieved; importantly, the entire iterative effort is handled by the computer. Experience and knowledge are still needed to evaluate a priori which factors are important in achieving a desired result. Once prepared, the basis of a rapid loudspeaker development tool is formed that can return a tangible resource profit when compared to successive manual efforts to optimize loudspeaker components.
Convention Paper 7995

Session P6 Saturday, May 22 14:00 – 15:30 C4-Foyer

POSTERS: AUDIO EQUIPMENT AND EMERGING TECHNOLOGIES

14:00

P6-1 Study and Evaluation of MOSFET Rds(ON)–Impedance Efficiency Losses in High Power Multilevel DCI-NPC Amplifiers—Vicent Sala, German Ruiz, Luis Romeral, UPC-Universitat Politecnica de Catalunya, Terrassa, Spain

This paper justifies the usefulness of multilevel power amplifiers with DCI-NPC (Diode Clamped Inverter–Neutral Point Converter) topology in applications where size and weight must be optimized. These amplifiers can work at high frequencies, thereby reducing the size and weight of the filter elements. However, it is necessary to study, analyze, and evaluate the efficiency losses, because this amplifier has double the number of switching elements. This paper models the behavior of the MOSFET Rds(ON) in a DCI-NPC topology for different conditions.
Convention Paper 7996
Paper presented by German Ruiz.

14:00

P6-2 Modeling Distortion Effects in Class-D Amplifier Filter Inductors—Arnold Knott,1 Tore Stegenborg-Andersen,1 Ole C. Thomsen,1 Dominik Bortis,2 Johann W. Kolar,2 Gerhard Pfaffinger,3 Michael A. E. Andersen1
1Technical University of Denmark, Lyngby, Denmark

2Swiss Federal Institute of Technology in Zurich, Zurich, Switzerland

3Harman/Becker Automotive Systems GmbH, Straubing, Germany

Distortion is generally accepted as a quantifier to judge the quality of audio power amplifiers. In switch-mode power amplifiers various mechanisms influence this performance measure. After giving an overview of those, this paper focuses on the particular effect of the nonlinearity of the output filter components on the audio performance. While the physical reasons for both capacitor- and inductor-induced distortion are given, the practical in-depth demonstration is done for the inductor only. This includes measuring the inductor's performance and modeling it through fitting, resulting in simulation models. The fitted models achieve distortion values between 0.03% and 0.20% as a basis to enable the design of a 200 W amplifier.
Convention Paper 7997

14:00

P6-3 Multilevel DCI-NPC Power Amplifier High-Frequency Distortion Analysis through Parasitic Inductance Dynamic Model—Vicent Sala, German Ruiz, E. López, Luis Romeral, UPC-Universitat Politecnica de Catalunya, Terrassa, Spain

The high-frequency distortion sources in the DCI-NPC (Diode Clamped Inverter–Neutral Point Converter) amplifier topology are studied and analyzed. This justifies the need for a model that contains the different parasitic inductive circuits that this kind of amplifier dynamically presents as a function of the combination of its active transistors. By means of a proposed pattern layout, we present a dynamic model of the parasitic inductances of the Full-Bridge DCI-NPC amplifier, and this is used to propose some simple rules for the optimal design of layouts for these types of amplifiers.


Simulation and experimental results are presented to justify the proposed model, and affirmations and recommendations are given in this paper.
Convention Paper 7998
Paper presented by German Ruiz.

14:00

P6-4 How Much Gain Should a Professional Microphone Preamplifier Have?—Douglas McKinnie, Middle Tennessee State University, Murfreesboro, TN, USA

Many tradeoffs are required in the design of microphone preamplifier circuits. Characteristics such as noise figure, stability, bandwidth, and complexity may be dependent upon the gain of the design. Three factors determine the gain required from a microphone preamp: the sound-pressure level of the sound source, the distance of the microphone from that sound source (within the critical distance), and the sensitivity of the microphone. This paper is an effort to find a probability distribution of the gain settings used with professional microphones. This is done by finding the distribution of maximum SPL in real use and by finding the sensitivity of the most commonly used current and classic microphones.
Convention Paper 7999
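The three factors named in the abstract combine into a simple back-of-envelope gain estimate: SPL at the microphone follows inverse-square spreading (6 dB per doubling of distance, within the critical distance), the sensitivity converts pressure to voltage, and the gain closes the remaining gap to line level. A sketch with made-up example values (the source level, distance, sensitivity, and +4 dBu target are illustrative, not figures from the paper):

```python
import math

P_REF = 20e-6  # 20 micropascals = 0 dB SPL

def required_gain_db(source_spl_db_at_1m, distance_m, sensitivity_mv_pa,
                     target_dbu=4.0):
    """Gain (dB) to bring the mic signal up to a target line level,
    assuming free-field inverse-square spreading."""
    spl_at_mic = source_spl_db_at_1m - 20 * math.log10(distance_m)
    pressure_pa = P_REF * 10 ** (spl_at_mic / 20)
    mic_volts = sensitivity_mv_pa * 1e-3 * pressure_pa
    target_volts = 0.775 * 10 ** (target_dbu / 20)  # dBu is ref 0.775 V
    return 20 * math.log10(target_volts / mic_volts)

# e.g., quiet speech (60 dB SPL at 1 m) on a 10 mV/Pa dynamic mic at 0.5 m
print(round(required_gain_db(60, 0.5, 10.0), 1))  # about 70 dB
```

The example lands near the top of typical preamp gain ranges, which is exactly the kind of data point the paper's probability distribution is meant to capture.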

14:00

P6-5 Equalizing Force Contributions in Transducers with a Partitioned Electrode—Libor Husník, Czech Technical University in Prague, Prague, Czech Republic

A partitioned electrode in an electrostatic transducer presents, among other possibilities, a way of making a transducer with direct D/A conversion. Nevertheless, partitioned electrodes, the sizes of which are proportional to powers of 2 or terms of other convenient series, do not have the corresponding force action on the membrane. The reason is that the membrane does not vibrate in a piston-like mode, and electrode parts close to the membrane periphery do not excite membrane vibrations in the same way as the elements near the center. The aim of this paper is to suggest equalization of the force contributions from different partitioned electrodes by varying their sizes. Principles presented here can also be used for other membrane-electrode arrangements.
Convention Paper 8000

14:00

P6-6 Low-End Device to Convert EEG Waves to MIDI—Adrian Attard Trevisan,1 Lewis Jones2
1St. Martins Institute of Information Technology, Hamrun, Malta

2London Metropolitan University, London, UK

This research provides a simple and portable system that is able to generate MIDI output based on input data collected through an EEG collecting device. The approach is beneficial in many ways: the therapeutic effects of listening to music created from one's own brain waves have been documented in many cases of treating health problems. The approach is influenced by the interface described in the article “Brain-Computer Music Interface for Composition and Performance” by Eduardo Reck Miranda, where different frequency bands trigger corresponding piano notes and the complexity of the signal represents the tempo of the sound. The correspondence between the sounds and the notes has been established through experimental work, in which data from participants in a test group were gathered and analyzed, establishing intervals of brain frequencies for different notes. The study is an active contribution to the field of neurofeedback, providing criteria and tools for assessment.
Convention Paper 8001
Paper will be presented by Lewis Jones.
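The band-to-note mapping described above can be sketched in a few lines: band powers of one EEG frame select the note, and a crude complexity measure sets the tempo. All band edges, note assignments, the sample rate, and the flatness-based complexity below are hypothetical stand-ins, not the authors' calibrated intervals.

```python
import numpy as np

FS = 256  # assumed EEG sample rate (Hz)
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}
NOTE_FOR_BAND = {"delta": 48, "theta": 52, "alpha": 55, "beta": 60}  # MIDI

def frame_to_midi(frame):
    """Return (midi_note, tempo_bpm) for one EEG frame."""
    spec = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), 1 / FS)
    powers = {name: spec[(freqs >= lo) & (freqs < hi)].sum()
              for name, (lo, hi) in BANDS.items()}
    dominant = max(powers, key=powers.get)       # strongest band -> note
    # crude "complexity": spectral flatness of the 1-30 Hz range
    band = spec[(freqs >= 1) & (freqs < 30)] + 1e-12
    flatness = np.exp(np.mean(np.log(band))) / np.mean(band)
    tempo = 60 + 120 * flatness                  # maps 0..1 to 60..180 BPM
    return NOTE_FOR_BAND[dominant], tempo

# a strong 10 Hz (alpha) oscillation should select the alpha note
t = np.arange(FS * 2) / FS
note, tempo = frame_to_midi(np.sin(2 * np.pi * 10 * t))
print(note, round(tempo))
```

A pure oscillation has near-zero flatness, so this toy mapping plays the alpha note at the slowest tempo; noisier, more "complex" frames drive the tempo up, mirroring the rule described in the abstract.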

14:00

P6-7 Implementation and Development of Interfaces for Music Performance through Analysis of Improvised Dance Movements—Richard Hoadley, Anglia Ruskin University, Cambridge, UK

Electronic music, even when designed to be interactive, can lack performance interest and is frequently musically unsophisticated. This is unfortunate because there are many aspects of electronic music that can be interesting, elegant, demonstrative, and musically informative. The use of dancers to interact with prototypical interfaces comprising clusters of sensors generating music algorithmically provides a method of investigating human actions in this environment. This is achieved through collaborative work involving software and hardware designers, composers, sculptors, and choreographers who examine aesthetically and practically the interstices of these disciplines. This paper investigates these interstices.
Convention Paper 8002

14:00

P6-8 Violence Prediction through Emotional Speech—José Higueras-Soler, Roberto Gil-Pita, Enrique Alexandre, Manuel Rosa-Zurera, Universidad de Alcalá, Alcalá de Henares, Madrid, Spain

Preventing violence is an absolute necessity in our society. Whether in homes with a particular risk of domestic violence, or in prisons or schools, there is a need for systems capable of detecting risk situations for preventive purposes. One of the most important factors that precede a violent situation is an emotional state of anger. In this paper we discuss the features required to build decision makers dedicated to the detection of emotional states of anger from speech signals. For this purpose, we present a set of experiments and results with the aim of studying combinations of features extracted from the literature and their effects on the detection performance (the relationship between the probability of detecting anger and the probability of false alarm) of a neural network and a least-squares linear detector.
Convention Paper 8003

14:00

P6-9 FoleySonic: Placing Sounds on a Timeline through Gestures—David Black,1 Kristian Gohlke,1 Jörn Loviscach2
1University of Applied Sciences, Bremen, Bremen, Germany

2University of Applied Sciences, Bielefeld, Bielefeld, Germany

The task of sound placement on video timelines is usually a time-consuming process that requires the sound designer or foley artist to carefully calibrate the position and length of each sound sample. For novice and home video producers, friendlier and more entertaining input methods are needed. We demonstrate a novel approach that harnesses the motion-sensing capabilities of readily available input devices, such as the Nintendo Wii Remote or modern smartphones, to provide intuitive and fluid arrangement of samples on a timeline. Users can watch a video while simultaneously adding sound effects, providing a near real-time workflow. The system leverages the user’s motor skills for enhanced expressiveness and provides a satisfying experience while accelerating the process.
Convention Paper 8004

14:00

P6-10 A Computer-Aided Audio Effect Setup Procedure for Untrained Users—Sebastian Heise,1 Michael Hlatky,1 Jörn Loviscach2
1University of Applied Sciences, Bremen, Bremen, Germany

2University of Applied Sciences, Bielefeld, Bielefeld, Germany

The number of parameters of modern audio effects easily ranges in the dozens. Expert knowledge is required to understand which parameter change results in a desired effect. Yet such sound processors are also making their way into consumer products, where they overburden most users. Hence, we propose a procedure to achieve a desired effect without technical expertise, based on a black-box genetic optimization strategy: users are only confronted with a series of comparisons of two processed examples. Learning from the users’ choices, our software optimizes the parameter settings. We conducted a study on hearing-impaired persons without expert knowledge, who used the system to adjust a third-octave equalizer and a multi-band compressor to improve the intelligibility of a TV set.
Convention Paper 8005
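The comparison-driven loop described in the abstract can be sketched as a simple (1+1) evolution strategy, a deliberate simplification of the genetic strategy the paper uses: the optimizer proposes a mutated setting, asks which of two processed versions is preferred, and keeps the winner. The quadratic "listener" standing in for human judgment and all constants below are illustrative.

```python
import random

def optimize_by_comparison(n_params, prefers, rounds=200, step=0.25, seed=1):
    """(1+1) evolution strategy driven purely by pairwise preferences.
    prefers(a, b) -> True if setting a is preferred over setting b."""
    rng = random.Random(seed)
    current = [rng.uniform(0.0, 1.0) for _ in range(n_params)]
    s = step
    for _ in range(rounds):
        candidate = [min(1.0, max(0.0, x + rng.gauss(0.0, s)))
                     for x in current]
        if prefers(candidate, current):  # one A/B comparison per round
            current = candidate
        s *= 0.99                        # anneal the mutation size
    return current

# stand-in listener: prefers settings closer to a hidden ideal EQ curve
ideal = [0.8, 0.2, 0.5, 0.6]
def listener(a, b):
    dist = lambda p: sum((x - y) ** 2 for x, y in zip(p, ideal))
    return dist(a) < dist(b)

result = optimize_by_comparison(len(ideal), listener)
print([round(x, 2) for x in result])  # should approach the hidden ideal
```

The key property, as in the paper's procedure, is that the optimizer never sees the parameters' meaning or any numeric score; only the binary outcome of each A/B comparison drives it.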

Workshop 2 Saturday, May 22 14:00 – 16:00 Room C2

AES42 AND DIGITAL MICROPHONES

Chair: Helmut Wittek, SCHOEPS Mikrofone GmbH, Karlsruhe, Germany

Panelists: Claudio Becker-Foss, DirectOut, Mittweida, Germany
Stephan Flock, DirectOut, Mittweida, Germany
Christian Langen, Schoeps Mikrofone, Karlsruhe, Germany
Stephan Peus, Georg Neumann GmbH, Berlin, Germany
Gregor Zielinski, Sennheiser, Wedemark, Germany

The AES42 interface for digital microphones is not yet widely used. This may be due to the relatively recent appearance of digital microphone technology, but also to a lack of knowledge of and practice with digital microphones and the corresponding interface. The advantages and disadvantages have to be communicated in an open and neutral way, regardless of commercial interests, on the basis of the actual needs of the engineers.

Along with an available “White Paper” about AES42 and digital microphones, which aims at neutral, in-depth information and was compiled from different authors, the workshop intends to bring to light facts and prejudices on this topic.

Tutorial 1 Saturday, May 22 14:00 – 15:30 Room C6

DO-IT-YOURSELF SEMANTIC AUDIO

Presenter: Jörn Loviscach, Fachhochschule Bielefeld (University of Applied Sciences), Bielefeld, Germany

Content-based music information retrieval (MIR) and similar applications require advanced algorithms that often overburden non-expert developers. However, many building blocks are available—mostly for free—to significantly ease software development, for instance of similarity search methods, or to serve as components for ad-hoc solutions, for instance in forensics or linguistics. This tutorial looks into software libraries/frameworks (e.g., MARSYAS and CLAM), toolboxes (e.g., MIRtoolbox), Web-based services (e.g., Echo Nest), and stand-alone software (e.g., The Sonic Visualizer) that help with the extraction of audio features and/or execute basic machine learning algorithms. Focusing on solutions that require little to no programming in the classical sense, the tutorial’s major part consists of live demos of hand-picked routes to roll one’s own semantic audio application.
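As a taste of what such audio feature extraction involves under the hood, here is a roll-your-own sketch that uses none of the tools named above: spectral centroid and RMS per analysis frame in plain NumPy. The frame size and hop are arbitrary choices, and real MIR toolboxes compute far richer feature sets.

```python
import numpy as np

def features(signal, fs, frame=1024, hop=512):
    """Return a list of (spectral_centroid_hz, rms) per analysis frame."""
    window = np.hanning(frame)
    freqs = np.fft.rfftfreq(frame, 1 / fs)
    out = []
    for start in range(0, len(signal) - frame + 1, hop):
        x = signal[start:start + frame] * window
        mag = np.abs(np.fft.rfft(x))
        # centroid: magnitude-weighted mean frequency of the frame
        centroid = (freqs * mag).sum() / (mag.sum() + 1e-12)
        out.append((centroid, np.sqrt(np.mean(x ** 2))))
    return out

fs = 16000
t = np.arange(fs) / fs
feats = features(np.sin(2 * np.pi * 440 * t), fs)
print(round(np.mean([c for c, _ in feats])))  # centroid near 440 Hz
```

Feeding such per-frame features into an off-the-shelf classifier is essentially what the ad-hoc "semantic audio" pipelines demonstrated in the tutorial automate.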

Saturday, May 22 14:00 Room Saint Julien
Technical Committee Meeting on Coding of Audio Signals

Saturday, May 22 14:00 Room Mouton Cadet
Standards Committee Meeting SC-02-02 Digital Input/Output Interfacing

Saturday, May 22 15:00 Room Saint Julien
Technical Committee Meeting on Archiving, Restoration, and Digital Libraries

Tutorial 2 Saturday, May 22 15:30 – 17:00 Room C1

MASTERING FOR BROADCAST

Presenter: Darcy Proper, Galaxy Studios, Mol, Belgium

Mastering has often had an aura of mystery around it. Those “in the know” have always regarded it as a vital and necessary last step in the process of producing a record. Those who have never experienced it have often had only a vague idea of what good mastering could achieve. However, the loudness race in recent years has put the mastering community under pressure; on one side from the producers or labels who want their product


louder than the competition, and on the other side from the artists or mixers who don’t want their work smashed into a lifeless “brick” and maxed out by excessive use of limiter plug-ins.

Darcy Proper is a multi-Grammy-winning mastering engineer whose golden ears (and hands) have put the finishing touches on a vast array of high-profile records, among many others those from Steely Dan. She will talk about her approach to her work and will also demo examples with various degrees of compression with a legacy broadcast processor as the final piece of gear in the signal chain, simulating a radio broadcast. The audience is then able to experience the effects and artifacts that different compression levels will cause at the consumer's end.

Workshop 3 Saturday, May 22 16:00 – 18:00 Room C6

CALIBRATION OF ANALOG REPRODUCING EQUIPMENT FOR DIGITIZATION PROJECTS

Chair: George Brock-Nannestad, Patent Tactics, Gentofte, Denmark

Panelists: Sean W. Davies, S.W. Davies Ltd., Aylesbury, UK
Andrew Pearson, British Library Sound Archive, London, UK
Peter Posthumus, Post Sound Ltd., Newton Abbot, UK

Analog reproduction equipment sees its last use in the transfer of audio content to digital files. In order to obtain the optimal signal from the carrier (tape and disc) the equipment has to be adjusted. In order to obtain traceability the equipment has to be calibrated. Generations of technicians that did this on a regular basis are vanishing quickly, and making available calibration materials, such as the AES-S001-064 Coarse-groove Calibration disc, is not enough when the know-how for its use is lacking. This workshop aims to redress this situation by providing a firm theoretical underpinning for practical demonstrations, in which the basic principles of calibration and alignment will be discussed along with logging the activity. Keywords are: calibration and alignment of analog tape reproducers, including equipment for tapes that have been encoded, such as Dolby A and Dolby SR. Furthermore, calibration and alignment of gramophone record reproducing equipment will be discussed. Active discussion is encouraged and may spill over into optical sound as found on films.

Saturday, May 22 16:00 Room Saint Julien
Technical Committee Meeting on Hearing and Hearing Loss Prevention

Session P7 Saturday, May 22 16:30 – 18:00 C4-Foyer

POSTERS: LOUDSPEAKERS AND HEADPHONES

16:30

P7-1 Theoretical and Experimental Comparison of Amplitude Modulation Techniques for Parametric Loudspeakers—Wei Ji, Woon-Seng Gan, Peifeng Ji, Nanyang Technological University, Singapore

Due to the self-demodulation property of finite-amplitude ultrasonic waves propagating in air, a highly focused sound beam can be reproduced. This phenomenon is known as the parametric array in air. However, because of the nonlinearity of ultrasonic waves, the reproduced sound wave suffers from high distortion. Several amplitude modulation techniques have been previously proposed to mitigate this problem. To date, no comprehensive study has been carried out to evaluate the performance of these amplitude modulation techniques. This paper provides theoretical and experimental studies of the effectiveness of different amplitude modulation techniques in mitigating the distortion. Objective measurements, such as sound pressure level and total harmonic distortion, are used to compare the performance of these techniques.
Convention Paper 8006
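The far-field behavior that such amplitude modulation techniques exploit is commonly described by Berktay's model, in which the demodulated pressure is proportional to the second time derivative of the squared envelope of the ultrasonic carrier. The sketch below is illustrative only (the paper's actual signal chain and parameters are not given here); it compares conventional DSB-AM with square-root preprocessing under that model:

```python
import numpy as np

fs = 192_000                          # ultrasonic-rate sampling frequency
t = np.arange(0, 0.05, 1/fs)
m = 0.8                               # modulation index
audio = np.sin(2*np.pi*1000*t)        # 1 kHz tone to be reproduced

# Berktay's far-field model: demodulated pressure is proportional to the
# second time derivative of the squared carrier envelope.
def demodulate(envelope, fs):
    return np.gradient(np.gradient(envelope**2, 1/fs), 1/fs)

env_dsb = 1 + m*audio                 # conventional DSB-AM envelope
env_sqrt = np.sqrt(1 + m*audio)       # square-root preprocessing

def thd(x, f0, fs, nharm=5):
    """Crude THD estimate from spectral peaks at the harmonics of f0."""
    spec = np.abs(np.fft.rfft(x*np.hanning(len(x))))
    b = int(round(f0*len(x)/fs))
    fund = spec[b-2:b+3].max()
    harm = [spec[k*b-2:k*b+3].max() for k in range(2, nharm+1)]
    return np.sqrt(sum(h*h for h in harm))/fund
```

Under this model the DSB envelope yields a strong second harmonic (THD on the order of the modulation index), while square-root preprocessing demodulates to an almost pure tone.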

16:30

P7-2 Improvement of Sound Quality by Means of Ultra-Soft Elastomer for the Gel Type Inertia Driven DML Type Transducer—Minsung Cho,1,2 Elena Prokofieva,1 Jordi Munoz,2 Mike Barker1
1Edinburgh Napier University, Edinburgh, UK
2SFX Technologies Ltd., Edinburgh, UK

Unlike standard DML transducers, the gel-type inertia-driven transducer (referred to as the gel transducer in this paper), designed as a mini-woofer DML-type transducer, transfers its pistonic movement to a panel through the gel surround, thereby generating a transverse wave. This mechanism induces maximum movement of the magnet assembly, which boosts the force of the moving voice coil so that sound pressure level increases as the acceleration of the panel increases. This effect is proportional to sound pressure level at low and medium frequencies in the range 50 Hz to 1000 Hz. Furthermore, it is found that the gel surround prevents transverse waves reflected from the panel from interfering with the pistonic waves generated by the transducer. Results of the stiffness testing of the gel surround are presented together with data for the displacement and acceleration of the panel with the gel transducer attached. These data are compared to acoustical outputs.
Convention Paper 8007

16:30

P7-3 Electrical Circuit Model for a Loudspeaker with an Additional Fixed Coil in the Gap—Daniele Ponteggia,1 Marco Carlisi,2 Andrea Manzini3
1Studio Ing. Ponteggia, Terni, Italy
2Ing. Carlisi Marco, Como, Italy
318 Sound - Division of A.E.B. S.r.l., Cavriago, Italy

A previous paper by some of the authors investigated a solution to minimize the loudspeaker inductance based on an additional fixed coil positioned in the gap, with two additional terminals. This device is referred to as A.I.C. (Active Impedance Control). In this paper a suitable electrical circuit model for such a loudspeaker with four terminals and two coils is proposed.
Convention Paper 8008




16:30

P7-4 Measurement of the Nonlinear Distortions in Loudspeakers with a Broadband Noise—Rafal Siczek, Andrzej B. Dobrucki, Wroclaw University of Technology, Wroclaw, Poland

The paper presents the application of digital filters to the measurement of nonlinear distortion in loudspeakers using Wolf's method and broadband noise as the exciting signal. The results of simulating the measurement process for a loudspeaker with various nonlinearities of the Bl factor and suspension stiffness are presented. The influence of various exciting signals, e.g., white and pink noise, as well as of the parameters of the digital filters, has been tested. The results of measurements of an actual loudspeaker are also presented.
Convention Paper 8009

16:30

P7-5 The Fundamentals of Loudspeaker Radiation and Acoustic Quality—Grzegorz P. Matusiak, New Smart System (AC Systems), Poznan, Poland and Smørum, Denmark

The loudspeaker designer faces many challenges when designing a new loudspeaker system. The proposed solution is to replace the mathematically complicated series circuit of the radiation impedance with a new one that can easily be introduced into a loudspeaker circuit over the whole frequency range. It facilitates the design process and enables the designer to learn more about the loudspeaker. The fundamental resonance in air is lower than in vacuum and depends on the radiating surface, as does the acoustic quality, which also depends on frequency: 1/QT = 1/QE + 1/QM + 1/QA. The parameters affect the SPL and efficiency response. Investigated structures: pulsating sphere; baffled circular, square, and rectangular piston; infinite strip; unflanged pipe-piston; and finally the horn (0–infinity Hz).
Convention Paper 8010

16:30

P7-6 Analysis of Low Frequency Audio Reproduction via Multiple Low-Bl Loudspeakers—Theodore Altanis, John Mourjopoulos, University of Patras, Patras, Greece

Small loudspeakers with low force factors (Bl) behave as narrow-bandwidth filters with high quality factors and can be tuned to low resonance frequencies. It is possible to optimize low frequency reproduction (in the subwoofer range) using a number of such drivers. This paper employs a set of such low-Bl loudspeakers with different eigenfrequencies to achieve efficient reproduction of low frequency signals, examines the aspects of such an implementation by comparing the results to an ideal subwoofer system, and also evaluates these systems perceptually.
Convention Paper 8011

16:30

P7-7 Circular Loudspeaker Array with Controllable Directivity—Martin Møller,1 Martin Olsen,1 Finn T. Agerkvist,1 Jakob Dyreby,2 Gert Kudahl Munch2
1Technical University of Denmark, Lyngby, Denmark
2Bang & Olufsen A/S, Struer, Denmark

Specific directivity patterns for circular arrays of loudspeakers can be achieved by utilizing the concept of phase modes, which expands the directivity pattern into a series of circular harmonics. This paper investigates the applicability of this concept to a loudspeaker array on a cylindrical baffle, with a desired directivity pattern specified in the frequency interval of 400–5000 Hz. From the specified frequency-independent directivity pattern, filter transfer functions for each loudspeaker are determined. The sensitivity to various parameters related to practical implementation is also investigated by introducing filter errors for each array element. Measurements on a small-scale model using 2-inch drivers are compared with simulations, showing good agreement between experimental and predicted results.
Convention Paper 8012
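The phase-mode idea can be illustrated numerically: sampling a target pattern at the driver angles and taking a DFT over those angles yields its circular-harmonic coefficients. The sketch below is a simplification that ignores the cylindrical baffle and radial propagation terms treated in the paper; the cardioid target and array size are illustrative assumptions:

```python
import numpy as np

N = 8                                # loudspeakers evenly spaced on a circle
phi = 2*np.pi*np.arange(N)/N         # driver angles

# Target: a frequency-independent cardioid aimed at phi = 0.
target = 0.5*(1 + np.cos(phi))

# Phase-mode expansion: the DFT over the driver angles gives the
# circular-harmonic coefficients of the sampled pattern.
c = np.fft.fft(target)/N             # c[m] is the harmonic of order m

# Reconstruct the pattern on a fine grid from orders |m| <= 1
# (a cardioid contains only orders 0 and +/-1, so this is exact).
theta = np.linspace(0, 2*np.pi, 360, endpoint=False)
recon = np.real(c[0] + c[1]*np.exp(1j*theta) + c[-1]*np.exp(-1j*theta))
```

Because the target contains only low-order harmonics, the eight-element sampling recovers it exactly; band-limiting the pattern to the orders the array can control is the essence of the phase-mode design step.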

Student Event/Career Development
OPENING AND STUDENT DELEGATE ASSEMBLY MEETING – PART 1
Saturday, May 22, 17:00 – 18:00
Room C2

Chair: Miroslav Jakovljevic

Vice Chair: Veit Winkler

The first Student Delegate Assembly (SDA) meeting is the official opening of the Convention’s student program and a great opportunity to meet with fellow students from all corners of the world. This opening meeting of the Student Delegate Assembly will introduce new events and election proceedings, announce candidates for the coming year’s election for the Europe/International Regions, announce the finalists in the recording competition categories, hand out the judges’ sheets to the non-finalists, and announce any upcoming events of the Convention. Students and student sections will be given the opportunity to introduce themselves and their activities, in order to stimulate international contacts. All students and educators are invited to participate in this meeting. Election results and Recording Competition and Poster Awards will be given at the Student Delegate Assembly Meeting–Part 2 on Tuesday, May 25, at 14:00.

Saturday, May 22 14:00 Room Mouton Cadet
Standards Committee Meeting SC-02-08 Audio-File Transfer and Exchange

Special Event
MIXER PARTY
Saturday, May 22, 18:00 – 19:30
C4-Foyer

A Mixer Party will be held on Saturday evening to enable Convention attendees to meet in a social atmosphere to catch up with friends and colleagues from the world of audio. There will be a cash bar.

Special Event
ORGAN RECITAL BY GRAHAM BLYTH
Saturday, May 22, 20:00 – 21:30
Temple Church, Temple, London



Graham Blyth’s traditional organ recital will be given on the organ of the famous Temple Church, a late 12th-century church in London located between Fleet Street and the River Thames, built for and by the Knights Templar as their English headquarters. In modern times, two Inns of Court (Inner Temple and Middle Temple) both use the church. It is famous for its effigy tombs and for being a round church. It was heavily damaged during the Second World War but has been largely restored. The church contains two organs: a chamber organ built by Robin Jennings in 2001 (a three-stop continuo organ), and a four-manual Harrison & Harrison organ. The Harrison organ will be completely renovated from July 2011 until Easter 2013.

The featured works are by Hubert Parry, George Thalben-Ball, Charles Villiers Stanford, Frank Bridge, Percy Whitlock, Louis Vierne, and Léon Boëllmann.

Graham Blyth was born in 1948, began playing the piano aged 4, and received his early musical training as a Junior Exhibitioner at Trinity College of Music in London, England. Subsequently, at Bristol University, he took up conducting, performing Bach’s St. Matthew Passion before he was 21. He holds diplomas in Organ Performance from the Royal College of Organists, the Royal College of Music, and Trinity College of Music. In the late 1980s he renewed his studies with Sulemita Aronowsky for piano and Robert Munns for organ. He gives numerous concerts each year, principally as organist and pianist, but also as a conductor and harpsichord player. He made his international debut with an organ recital at St. Thomas Church, New York in 1993 and since then has played in San Francisco (Grace Cathedral), Los Angeles (Cathedral of Our Lady of Los Angeles), Amsterdam, Copenhagen, Munich (Liebfrauendom), Paris (Madeleine and St. Etienne du Mont), and Berlin. He has lived in Wantage, Oxfordshire, since 1984, where he is currently Artistic Director of Wantage Chamber Concerts and Director of the Wantage Festival of Arts.

He divides his time between being a designer of professional audio equipment (he is a co-founder and Technical Director of Soundcraft) and organ-related activities. In 2006 he was elected a Fellow of the Royal Society of Arts in recognition of his work in product design relating to the performing arts.

Student Event/Career Development
STUDENT PARTY
Saturday, May 22, 20:00 – 22:00
TBD

Information will be given at the Student Delegate Assembly—Part 1.

Session P8 Sunday, May 23 09:00 – 13:00 Room C3

MUSIC ANALYSIS AND PROCESSING

Chair: David Malham, University of York, York, UK

09:00

P8-1 Automatic Detection of Audio Effects in Guitar and Bass Recordings—Michael Stein,1 Jakob Abeßer,1 Christian Dittmar,1 Gerald Schuller2
1Fraunhofer Institute for Digital Media Technology IDMT, Ilmenau, Germany
2Ilmenau University of Technology, Ilmenau, Germany

This paper presents a novel method to detect and distinguish 10 frequently used audio effects in recordings of electric guitar and bass. It is based on spectral analysis of audio segments located in the sustain part of previously detected guitar tones. Overall, 541 spectral, cepstral, and harmonic features are extracted from short-time spectra of the audio segments. Support Vector Machines are used in combination with feature selection and transform techniques for automatic classification based on the extracted feature vectors. With correct classification rates of up to 100 percent for the detection of single effects and 98 percent for the simultaneous distinction of 10 different effects, the method has successfully proven its capability, performing on isolated sounds as well as on multitimbral, stereophonic musical recordings.
Convention Paper 8013
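As a rough illustration of this kind of classification stage (not the authors' implementation, and with synthetic stand-in data rather than real audio features), the sketch below combines simple feature selection with a linear soft-margin SVM trained by sub-gradient descent on 541-dimensional feature vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 541                       # segments x features (541 as in the paper)
X = rng.normal(size=(n, d))
y = np.where(rng.random(n) < 0.5, 1.0, -1.0)   # effect present / absent
X[:, :5] += 1.5 * y[:, None]          # only the first 5 features carry information

# Feature selection: rank features by absolute difference of class means.
score = np.abs(X[y > 0].mean(axis=0) - X[y < 0].mean(axis=0))
sel = np.argsort(score)[-10:]         # keep the 10 highest-scoring features
Xs = X[:, sel]

# Linear soft-margin SVM via batch sub-gradient descent on the hinge loss.
w, b, lam, lr = np.zeros(Xs.shape[1]), 0.0, 1e-3, 0.1
for _ in range(500):
    viol = y * (Xs @ w + b) < 1       # samples violating the margin
    w -= lr * (lam * w - (y[viol][:, None] * Xs[viol]).sum(axis=0) / n)
    b -= lr * (-y[viol].sum() / n)

acc = np.mean(np.sign(Xs @ w + b) == y)
```

The paper's multi-effect case would replace the binary labels with a multi-class scheme (e.g., one-vs-rest) and the synthetic features with the extracted spectral/cepstral/harmonic descriptors.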

09:30

P8-2 Time Domain Emulation of the Clavinet—Stefan Bilbao,1 Matthias Rath2
1University of Edinburgh, Edinburgh, UK
2Technische Universität Berlin, Berlin, Germany

The simulation of classic electromechanical musical instruments and audio effects has seen a great deal of activity in recent years, due in part to great recent increases in computing power. It is now possible to perform full emulations of relatively complex musical instruments in real time, or near real time. In this paper time domain finite difference schemes are applied to the emulation of the Hohner Clavinet, an electromechanical stringed instrument exhibiting special features such as sustained hammer/string contact, pinning of the string to a metal stop, and a distributed damping mechanism. Various issues, including numerical stability, implementation details, and computational cost will be discussed. Simulation results and sound examples will be presented.
Convention Paper 8014
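The core of such time domain finite difference emulation is a scheme like the following leapfrog discretization of the 1-D wave equation; the Clavinet-specific elements (hammer contact, string pinning, distributed damping) are omitted, and all parameter values are illustrative:

```python
import numpy as np

L_str, c, fs = 1.0, 200.0, 44100     # string length (m), wave speed (m/s), sample rate
k = 1.0/fs                           # time step
h = c*k                              # grid spacing chosen at the stability (CFL) limit
N = int(L_str/h)                     # number of grid intervals
lam = c*k/h                          # Courant number (= 1 here)

u_prev = np.zeros(N + 1)
u = np.zeros(N + 1)
u[N//2] = 1e-3                       # impulsive excitation at the string midpoint

out = np.zeros(2000)
for n in range(2000):
    u_next = np.zeros(N + 1)         # fixed (pinned) endpoints stay at zero
    u_next[1:-1] = (2*u[1:-1] - u_prev[1:-1]
                    + lam**2*(u[2:] - 2*u[1:-1] + u[:-2]))
    u_prev, u = u, u_next
    out[n] = u[N//4]                 # read an output sample at a pickup position
```

At lam = 1 the scheme sits exactly on the stability bound for the lossless wave equation, so the output stays bounded; the stiffness, damping, and contact nonlinearities of a real Clavinet model tighten this condition.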

10:00

P8-3 Polyphony Number Estimator for Piano Recordings Using Different Spectral Patterns—Ana M. Barbancho, Isabel Barbancho, Javier Fernandez, Lorenzo J. Tardón, Universidad de Málaga, Málaga, Spain

One of the main tasks of a polyphonic transcription system is the estimation of the number of voices, i.e., the polyphony number. Although the correct estimation of this parameter is very important for polyphonic transcription systems, this task has not been discussed in depth in the known transcription systems. The aim of this paper is to propose a novel estimation method of the polyphony number for piano recordings. This new method is based on the use of two different types of spectral patterns: single-note patterns and composed-note patterns. The usage of composed-note patterns in the estimation of the polyphony number and in the polyphonic detection process has not been previously reported in the literature.
Convention Paper 8015




10:30

P8-4 String Ensemble Vibrato: A Spectroscopic Study—Stijn Mattheij, AVANS University, Breda, The Netherlands

A systematic observation of the presence of ensemble vibrato on early twentieth-century recordings of orchestral works has been carried out by studying spectral line shapes of individual musical notes. Broadening of line shapes was detected in recordings of Beethoven’s Fifth Symphony and Brahms’s Hungarian Dance No. 5; this effect was attributed to ensemble vibrato. From these observations it may be concluded that string ensemble vibrato was common practice in orchestras from the continent throughout the twentieth century. British orchestras did not use much vibrato before 1940.
Convention Paper 8016

11:00

P8-5 Influence of Psychoacoustic Roughness on Musical Intonation Preference—Julián Villegas,1 Michael Cohen,1 Ian Wilson,1 William Martens2
1University of Aizu, Aizu, Japan
2University of Sydney, Sydney, NSW, Australia

An experiment to compare the acceptability of three different music fragments rendered with three different intonations is presented. These preference results were contrasted with those of isolated chords also rendered with the same three intonations. The least rough renditions were found to be those using Twelve-Tone Equal Temperament (12-TET). Just Intonation (JI) renditions were the roughest. A negative correlation between preference and psychoacoustic roughness was also found.
Convention Paper 8017

11:30

P8-6 Music Emotion and Genre Recognition Toward New Affective Music Taxonomy—Jonghwa Kim, Lars Larsen, University of Augsburg, Augsburg, Germany

Exponentially increasing electronic music distribution creates a natural pressure for fine-grained musical metadata. On the basis of the fact that a primary motive for listening to music is its emotional effect, diversion, and the memories it awakens, we propose a novel affective music taxonomy that combines the global music genre taxonomy, e.g., classical, jazz, rock/pop, and rap, with emotion categories such as joy, sadness, anger, and pleasure, in a complementary way. In this paper we deal with all essential stages of an automatic genre/emotion recognition system, i.e., from reasonable music data collection up to performance evaluation of various machine learning algorithms. In particular, a novel classification scheme, called the consecutive dichotomous decomposition tree (CDDT), is presented, which is specifically parameterized for multi-class classification problems with an extremely high number of classes, e.g., sixteen music categories in our case. The average recognition accuracy of 75% for the 16 music categories shows a realistic possibility for the affective music taxonomy we propose.
Convention Paper 8018

12:00

P8-7 Perceptually-Motivated Audio Morphing: Warmth—Duncan Williams, Tim Brookes, University of Surrey, Guildford, UK

A system for morphing the warmth of a sound independently from its other timbral attributes was coded, building on previous work morphing brightness only, and morphing brightness and softness. The new warmth-softness-brightness morpher was perceptually validated using a series of listening tests. A multidimensional scaling analysis of listener responses to paired comparisons showed perceptually orthogonal movement in two dimensions within a warmth-morphed and everything-else-morphed stimulus set. A verbal elicitation experiment showed that listeners’ descriptive labeling of these dimensions was as intended. A further “quality control” experiment provided evidence that no “hidden” timbral attributes were altered in parallel with the intended ones. A complete timbre morpher can now be considered for further work and evaluated using the tri-stage procedure documented here.
Convention Paper 8019

12:30

P8-8 A Novel Envelope-Based Generic Dynamic Range Compression Model—Adam Weisser, Oticon A/S, Smørum, Denmark

A mathematical model is presented that reproduces typical dynamic range compression when given the nominal input envelope of the signal and the compression constants. The model is derived geometrically in a qualitative approach, and the governing differential equation for an arbitrary input and an arbitrary compressor is found. Step responses compare well to the commercial compressors tested. The compression effect on speech, using the general equation in its discrete version, is also demonstrated. The model's applicability is especially appealing for hearing aids, where the input-output curve and time constants of the nonlinear instrument are frequently consulted and the qualitative theoretical effect of compression may be crucial for speech perception.
Convention Paper 8020
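For context, a conventional feed-forward compressor (not the paper's model) can be sketched as one-pole attack/release envelope smoothing followed by a static threshold/ratio gain curve; the threshold, ratio, and time constants below are illustrative:

```python
import numpy as np

def compress(x, fs, threshold_db=-20.0, ratio=4.0,
             attack_ms=5.0, release_ms=50.0):
    """Feed-forward compressor: envelope follower + static gain curve in dB."""
    a_att = np.exp(-1.0/(fs*attack_ms/1000))
    a_rel = np.exp(-1.0/(fs*release_ms/1000))
    env = 0.0
    y = np.empty_like(x)
    for n, xn in enumerate(x):
        level = abs(xn)
        a = a_att if level > env else a_rel      # attack when rising, else release
        env = a*env + (1 - a)*level              # one-pole envelope smoothing
        level_db = 20*np.log10(max(env, 1e-9))
        over = max(level_db - threshold_db, 0.0)
        gain_db = -over*(1 - 1/ratio)            # static compression curve
        y[n] = xn * 10**(gain_db/20)
    return y

fs = 16000
t = np.arange(0, 0.2, 1/fs)
x = np.where(t < 0.1, 0.05, 1.0) * np.sin(2*np.pi*440*t)  # step up in level
y = compress(x, fs)
```

Below the threshold the signal passes unchanged; above it, the static curve attenuates by (1 − 1/ratio) dB per dB of overshoot, with the attack and release constants governing how quickly the gain reacts to the level step.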

Session P9 Sunday, May 23 09:00 – 13:00 Room C5

ROOM AND ARCHITECTURAL ACOUSTICS

Chair: Ben Kok, Consultant, The Netherlands

09:00

P9-1 On the Air Absorption Effects in a Finite Difference Implementation of the Acoustic Diffusion Equation Model—Juan M. Navarro,1 José Escolano,2 José J. Lopez3
1San Antonio's Catholic University of Murcia, Guadalupe, Spain
2University of Jaén, Linares, Spain
3Universidad Politécnica de Valencia, Valencia, Spain

In room-acoustics modeling, atmospheric sound attenuation is a critical phenomenon that has to be taken into account to obtain correct predictions, especially when high frequencies and large enclosures are simulated. A finite difference scheme implementation of the diffusion equation model with a mixed boundary condition is evaluated, including the air absorption within the room. This paper focuses on investigating the performance of this implementation for room-acoustics simulation. In particular, the stability condition is developed to compare the features of the solution both with and without the air absorption term. Moreover, the correct behavior of the numerical implementation has been studied by comparing predicted results using different surface and air absorption coefficients.
Convention Paper 8021

09:30

P9-2 A Comparison of Two Diffuse Boundary Models Based on Finite Difference Schemes—José Escolano,1 Juan M. Navarro,2 Damian T. Murphy,3 Jeremy J. Wells,3 José J. López4
1University of Jaén, Linares, Spain
2San Antonio's Catholic University of Murcia, Guadalupe, Spain
3University of York, York, UK
4Universidad Politécnica de Valencia, Valencia, Spain

In room acoustics, the reflection and scattering of sound waves at the boundaries strongly determines the behavior within the enclosed space itself. Therefore, the accurate simulation of the acoustic characteristics of a boundary is an important part of any room acoustics prediction model, and in particular diffuse reflection at a boundary is one of the most important properties to model correctly for an accurate and perceptually natural result. This paper presents a comparison of two simulation models for the phenomenon of diffuse reflection at a boundary: one based on introducing physical variations at the boundary itself, and another based on the use of a diffusion equation model.
Convention Paper 8022

10:00

P9-3 Considerations for the Optimal Location and Boundary Effects for Loudspeakers in an Automotive Interior—Roger Shively,1 Jeff Bailey,1 Jerôme Halley,2 Lars Kurandt,2 François Malbos,3 Gabriel Ruiz,4 Alfred Svobodnik5
1Harman International, Novi, MI, USA
2Harman International, Karlsbad, Germany
3Harman International, Chateau du Loir, France
4Harman International, Bridgend, Wales, UK
5Harman International, Vienna, Austria

Referencing the authors' earlier work (AES Convention Paper 4245, May 1996) on boundary effects in an automotive vehicle interior, namely the mid-to-high-frequency timbral changes in the sound field caused by reflective, semi-rigid surfaces in close proximity to loudspeakers, this paper reports on the modeling of midsize loudspeakers in the interior of an automobile and gives modeled results for a specific case.
Convention Paper 8023

10:30

P9-4 Acoustics of the Restored Petruzzelli Theater—Marco Facondini,1 Daniele Ponteggia2
1TanAcoustics Studio, Pesaro, Italy
2Studio Ing. Ponteggia, Terni, Italy

The Petruzzelli theater in Bari has recently been restored after the disastrous fire of 1991 seriously damaged the building. The restoration has focused on the aesthetics and functionality of the room, giving particular attention to improving the acoustics of the theater. This paper reviews the acoustical design process, which was carried out using a computer model of the hall together with measurements taken during the restoration. Objective indexes from measurements of the renewed theater are compared with values suggested in the literature and with similar halls.
Convention Paper 8024

11:00

P9-5 Identification of a Room Impulse Response Using a Close-Microphone Reference Signal—Elias Kokkinis, John Mourjopoulos, University of Patras, Patras, Greece

The identification of room impulse responses is very important in many audio signal processing applications. Typical system identification methods require access to the original source signal, which is seldom available in practice, while blind system identification methods require prior knowledge of the statistical properties of the source signal which, for audio applications, is not available. In this paper a new system identification method using a close-microphone reference signal is proposed, and it is shown to accurately identify real room impulse responses.
Convention Paper 8025

11:30

P9-6 Acoustic Impulse Response Measurement Using Speech and Music Signals—John Usher, Barcelona Media, Barcelona, Spain

Continuous measurement of room impulse responses (RIRs) in the presence of an audience has many applications for room acoustics: in-situ loudspeaker/room equalization; teleconferencing; and architectural acoustic diagnostics. A continuous analysis of the RIR is often preferable to a single measurement, especially with non-stationary room characteristics such as those caused by changing atmospheric or audience conditions. This paper discusses the use of adaptive filters updated according to the NLMS algorithm for fast, continuous in-situ RIR acquisition, particularly when the input signal is music or speech. We show that the dual-channel FFT (DCFFT) method has slower convergence and is less robust to colored signals such as music and speech. Data is presented comparing the NLMS and DCFFT methods, and we show that the adaptive filter approach provides RIRs with high accuracy and high robustness to background noise using music or speech signals.
Convention Paper 8026
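The NLMS update at the heart of such a measurement can be sketched as follows; the filter length, step size, and synthetic "room" are illustrative assumptions, with noiseless white noise standing in for a real program signal:

```python
import numpy as np

# NLMS adaptive identification of a short synthetic "room" impulse response.
rng = np.random.default_rng(1)
h_true = rng.normal(size=32) * np.exp(-0.2*np.arange(32))   # decaying target RIR
x = rng.normal(size=20000)                                  # excitation signal
d = np.convolve(x, h_true)[:len(x)]                         # simulated mic signal

L, mu, eps = 32, 0.5, 1e-6
w = np.zeros(L)                        # adaptive filter estimate of the RIR
for n in range(L, len(x)):
    u = x[n-L+1:n+1][::-1]             # most recent L input samples, newest first
    e = d[n] - w @ u                   # a-priori estimation error
    w += mu * e * u / (u @ u + eps)    # normalized LMS update
```

The normalization by the input power (u @ u) is what keeps the step size well-behaved when the excitation level fluctuates, which is the property that matters for colored signals such as speech and music.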

12:00

P9-7 Evaluation of Late Reverberant Fields in Loudspeaker Rendered Virtual Rooms—Wieslaw Woszczyk, Brett Leonard, Doyuen Ko, McGill University, Montreal, Quebec, Canada

Late reverberant decay was isolated from a high-resolution impulse response measured in a church that has excellent acoustics suitable for music recording. The measured multichannel impulse response (IR) was compared to two synthetic decays (SIR) built from Gaussian noise and modeled on the original IR. The synthetic decays were created to be similar to the original response in rate of decay and spectral weighting in order to enable listening comparisons between them. Four different transition points (100 ms, 200 ms, 300 ms, 400 ms) were chosen to crossfade between the original early response and the synthetic decays. Listeners auditioned the synthetic and measured rooms convolved with three anechoic monophonic sources. The results of tests conducted within an immersive surround sound environment with height help to verify whether shaped random noise is a suitable substitute for late reverberation in high-resolution simulations of room acoustics.
Convention Paper 8027
This paper is presented by Brett Leonard.

12:30

P9-8 Finite Difference Room Acoustic Modeling on a General Purpose Graphics Processing Unit—Alexander Southern,1,2 Damian Murphy,1 Guilherme Campos,3 Paulo Dias3
1University of York, York, UK
2Aalto University School of Science and Technology, Aalto, Finland
3University of Aveiro, Aveiro, Portugal

Detailed and convincing walkthrough auralizations of virtual rooms require much processing capability. One method of reducing this requirement is to pre-calculate a dataset of room impulse responses (RIRs) at locations throughout the space. Processing resources may then focus on RIR interpolation and convolution using the dataset as the virtual listening position changes in real time. Recent work identified the suitability of wave-based models over traditional ray-based approaches for walkthrough auralization. Despite the computational saving of wave-based methods for generating the RIR dataset, processing times are still long. This paper presents a wave-based implementation for execution on a general purpose graphics processing unit. Results validate the approach and show that parallelization provides a notable acceleration.
Convention Paper 8028

Session P10 Sunday, May 23 09:00 – 10:30 C4-Foyer

POSTERS: AUDIO PROCESSING—ANALYSIS AND SYNTHESIS OF SOUND

09:00

P10-1 Cellular Automata Sound Synthesis with an Extended Version of the Multitype Voter Model—Jaime Serquera, Eduardo R. Miranda, University of Plymouth, Plymouth, UK

In this paper we report on the synthesis of sounds with cellular automata (CA), specifically with an extended version of the multitype voter model (MVM). Our mapping process is based on DSP analysis of automata evolutions and consists in mapping histograms onto sound spectrograms. This mapping allows a flexible sound design process but, due to the non-deterministic nature of the MVM, such a process acquires its maximum potential after the CA run is finished. Our extended version of the model presents a high degree of predictability and controllability, making the system suitable for an in-advance sound design process with all the advantages that this entails, such as real-time possibilities and performance applications. This research focuses on the synthesis of damped sounds.
Convention Paper 8029

09:00

P10-2 Stereophonic Rendering of Source Distance Using DWM-FDN Artificial Reverberators—Saul Maté-Cid, Hüseyin Hacihabiboglu, Zoran Cvetkovic, King's College London, London, UK

Artificial reverberators are used in audio recording and production to enhance the perception of spaciousness. It is well known that reverberation is a key factor in the perception of the distance of a sound source. The ratio of direct and reverberant energies is one of the most important distance cues. A stereophonic artificial reverberator is proposed that allows panning the perceived distance of a sound source. The proposed reverberator is based on feedback delay network (FDN) reverberators and uses a perceptual model of the direct-to-reverberant (D/R) energy ratio to pan the source distance. The equivalence of FDNs and digital waveguide mesh (DWM) scattering matrices is exploited in order to devise a reverberator relevant in the room acoustics context.
Convention Paper 8030
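A minimal FDN of the kind the proposed reverberator builds on can be sketched as follows; the delay lengths, loop gain, and Hadamard-type mixing matrix are illustrative choices, not the paper's design:

```python
import numpy as np

def fdn_ir(n_samples, delays=(149, 211, 263, 293), g=0.8):
    """Impulse response of a 4-line feedback delay network reverberator."""
    H = np.array([[1,  1,  1,  1],
                  [1, -1,  1, -1],
                  [1,  1, -1, -1],
                  [1, -1, -1,  1]]) / 2.0   # orthogonal (energy-preserving) mixing
    A = g * H                               # loop gain g < 1 makes the tail decay
    bufs = [np.zeros(m) for m in delays]    # circular delay-line buffers
    ptr = [0, 0, 0, 0]
    out = np.zeros(n_samples)
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0          # unit impulse input
        taps = np.array([bufs[i][ptr[i]] for i in range(4)])
        out[n] = taps.sum()                 # sum of delay-line outputs
        fb = A @ taps + x                   # mix and feed back, inject input
        for i in range(4):
            bufs[i][ptr[i]] = fb[i]
            ptr[i] = (ptr[i] + 1) % delays[i]
    return out

ir = fdn_ir(20000)
```

Because the mixing matrix is orthogonal, the decay rate is controlled entirely by g; the mutually non-commensurate delay lengths are what give the tail its density. The paper's contribution sits on top of such a structure, steering the D/R energy ratio per output channel.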

09:00

P10-3 Separation of Music+Effects Sound Track from Several International Versions of the Same Movie—Antoine Liutkus,1 Pierre Leveau2
1Télécom ParisTech, Paris, France
2Audionamix, Paris, France

This paper concerns the separation of the music+effects (ME) track from a movie soundtrack, given the observation of several international versions of the same movie. The approach chosen is strongly inspired by existing stereo audio source separation and especially by spatial filtering algorithms such as DUET, which can extract a constant panned source from a mixture very efficiently. The problem is indeed similar, for we aim here at separating the ME track, which is the common background of all international versions of the movie soundtrack. The algorithm has been adapted to a number of channels greater than 2. Preprocessing techniques have also been proposed to adapt the algorithm to realistic cases. The performance of the algorithm has been evaluated on realistic and synthetic cases.
Convention Paper 8031

09:00

P10-4 A Differential Approach for the Implementation of Superdirective Loudspeaker Array—Jung-Woo Choi, Youngtae Kim, Sangchul Ko, Jungho Kim, SAIT, Samsung Electronics Co. Ltd., Gyeonggi-do, Korea

A loudspeaker arrangement and corresponding analysis method to obtain a robust superdirective beam are proposed. The superdirectivity technique requires precise matching of the sound sources modeled to calculate excitation patterns and those used for the loudspeaker array. To resolve the robustness issue arising from this modeling mismatch error, we show that the overall sensitivity to the model-mismatch error can be reduced by rearranging loudspeaker positions. Specifically, a beam pattern obtained by a conventional optimization technique is represented as a product of robust delay-and-sum patterns and error-sensitive differential patterns. The excitation pattern driving the loudspeaker array is then reformulated such that the error-sensitive pattern is only applied to the outermost loudspeaker elements, and the array design that fits the new excitation pattern is discussed.
Convention Paper 8032

09:00

P10-5 Improving the Performance of Pitch Estimators—Stephen J. Welburn, Mark D. Plumbley, Queen Mary University of London, London, UK

We are looking to use pitch estimators to provide an accurate high-resolution pitch track for resynthesis of musical audio. We found that current evaluation measures such as gross error rate (GER) are not suitable for algorithm selection. In this paper we examine the issues relating to evaluating pitch estimators and use these insights to improve the performance of existing algorithms such as the well-known YIN pitch estimation algorithm.
Convention Paper 8033
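For readers unfamiliar with YIN, its core is the cumulative mean normalized difference function (CMNDF); a minimal sketch follows (real implementations add parabolic interpolation and local-minimum refinement, which this sketch omits):

```python
import math

def yin_period(x, tau_max, threshold=0.1):
    """Minimal sketch of YIN's CMNDF pitch detection: return the first
    lag whose normalized difference dips below `threshold`, i.e., the
    estimated period in samples. A coarse threshold can trigger a sample
    early on pure tones; full YIN refines the estimate afterwards."""
    n = len(x)
    d = [0.0] * (tau_max + 1)
    for tau in range(1, tau_max + 1):
        # Squared difference between the signal and its lagged copy.
        d[tau] = sum((x[j] - x[j + tau]) ** 2 for j in range(n - tau_max))
    running = 0.0
    cmndf = [1.0] * (tau_max + 1)
    best = 1
    for tau in range(1, tau_max + 1):
        running += d[tau]
        cmndf[tau] = d[tau] * tau / running if running > 0 else 1.0
        if cmndf[tau] < threshold:
            return tau
        if cmndf[tau] < cmndf[best]:
            best = tau
    return best  # fall back to the global minimum of the CMNDF
```

For a 200 Hz sine sampled at 8 kHz, the CMNDF collapses at a lag of 40 samples, the true period.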

09:00

P10-6 Reverberation Analysis via Response and Signal Statistics—Eleftheria Georganti, Thomas Zarouchas, John Mourjopoulos, University of Patras, Patras, Greece

This paper examines statistical quantities (i.e., kurtosis, skewness) of room transfer functions and audio signals (anechoic, reverberant, speech, music). Measurements are taken under various reverberation conditions in different real enclosures, ranging from a small office to a large auditorium, and for varying source–receiver positions. Here, the statistical properties of the room responses and signals are examined in the frequency domain. From these properties, the relationship between the spectral statistics of the room transfer function and the corresponding reverberant signal is derived.
Convention Paper 8034

09:00

P10-7 An Investigation of Low-Level Signal Descriptors Characterizing the Noise-Like Nature of an Audio Signal—Christian Uhle, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

This paper presents an overview and an evaluation of low-level features characterizing the noise-like or tone-like nature of an audio signal. Such features are widely used for content classification, segmentation, identification, and coding of audio signals, as well as for blind source separation, speech enhancement, and voice activity detection. Besides the very prominent Spectral Flatness Measure, various alternative descriptors exist. These features are reviewed and the requirements for them are discussed. The features in scope are evaluated using synthetic signals and an exemplary real-world application related to audio content classification, namely voiced–unvoiced discrimination for speech signals and speech detection.
Convention Paper 8035
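The Spectral Flatness Measure named in the abstract is the ratio of the geometric to the arithmetic mean of the power spectrum; a minimal sketch:

```python
import math

def spectral_flatness(power_spectrum):
    """Spectral Flatness Measure: geometric mean over arithmetic mean
    of the power spectrum. Close to 1 for noise-like (flat) spectra,
    close to 0 for tone-like (peaky) spectra. Bins must be strictly
    positive; practical code adds a small epsilon before the log."""
    n = len(power_spectrum)
    geometric = math.exp(sum(math.log(p) for p in power_spectrum) / n)
    arithmetic = sum(power_spectrum) / n
    return geometric / arithmetic
```

A flat spectrum scores 1.0, while a spectrum dominated by a single bin scores near 0, which is exactly the noise-like versus tone-like distinction the paper evaluates.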

09:00

P10-8 Algorithms for Digital Subharmonic Distortion—Zlatko Baracskai,1,2 Ryan Stables1
1Birmingham City University, Birmingham, UK
2University of Birmingham, Birmingham, UK

This paper presents a comparison between existing digital subharmonic generators and a new algorithm developed with the intention of having a more pronounced subharmonic frequency and reduced harmonic, intermodulation, and aliasing distortion. The paper demonstrates that by introducing inversions of a waveform at the minima and maxima instead of the zero crossings, the discontinuities are mitigated and various types of distortion are significantly attenuated.
Convention Paper 8036

Workshop 4 Sunday, May 23
09:00 – 11:00 Room C1

BLU-RAY AS A HIGH RESOLUTION AUDIO FORMAT FOR STEREO AND SURROUND

Chair: Stefan Bock, EBB-msm-studios GmbH, Munich, Germany

Panelists: Simon Heyworth, SAM, Chagford, UK
Morten Lindberg, 2L, Oslo, Norway
Johannes Müller, msm-studios, Munich, Germany
Crispin Murray, Metropolis, London, UK
Ronald Prent, Galaxy Studios, Mol, Belgium

The decision for the Blu-ray disc as the only HD packaged media format, also offering up to 8 channels of uncompressed high resolution audio, has at least eliminated one of the obstacles to getting high-res surround sound music to the market. The concept of utilizing Blu-ray as a pure audio format will be explained, and Blu-ray will be positioned as the successor to both SACD and DVD-A. The operational functionality, and a double concept making it usable both with and without a screen, will be demonstrated by showing a few products that are already on the market.

Tutorial 3 Sunday, May 23
09:00 – 11:00 Room C2

HEARING AND HEARING LOSS PREVENTION

Presenter: Benj Kanters, Columbia College, Chicago, IL, USA

The Hearing Conservation Seminar and HearTomorrow.Org are dedicated to promoting awareness of hearing loss and conservation. This program is specifically targeted at students and professionals in the audio and music industries. Experience has shown that engineers and musicians easily understand the concepts of hearing physiology, as many of the principles and theories are the same as those governing audio and acoustics. Moreover, these people are quick to understand the importance of developing their own safe listening habits, as well as being concerned for the hearing health of their clients and the music-listening public. The tutorial is a 2-hour presentation in three sections: first, an introduction to hearing physiology; second, noise-induced loss; and third, practicing effective and sensible hearing conservation.

Sunday, May 23 09:00 Room Saint Julien
Technical Committee Meeting on Spatial Audio

Sunday, May 23 10:00 Room Saint Julien
Technical Committee Meeting on Transmission and Broadcasting

Workshop 5 Sunday, May 23
11:00 – 12:45 Room C6

APPLICATIONS OF TIME-FREQUENCY PROCESSING IN SPATIAL AUDIO

Chair: Ville Pulkki, Aalto University School of Science and Technology, Aalto, Finland

Panelists: Christof Faller, Illusonic LLC, Lausanne, Switzerland
Jean-Marc Jot, DTS Inc., Scotts Valley, CA, USA
Christian Uhle, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

The time-frequency resolution of human hearing has long been taken into account in perceptual audio codecs. Recently, the spatial resolution of human hearing has also been exploited in time-frequency processing, which has already led to some commercial applications. This workshop covers the capabilities and limitations of human spatial hearing and the audio techniques that exploit these features. Typically the techniques are based on estimating directional information for each auditory frequency channel, which is then used in further processing. The application areas discussed in the workshop include, at least, audio coding, microphone techniques, upmixing, directional microphones, and studio effects.

Special Event
AES/APRS—LIFE IN THE OLD DOGS YET—PART ONE: KEEPING STUDIOS ALIVE
Sunday, May 23, 11:00 – 13:00
Room C2

Moderator: Andrew Leyshon, Nottingham University

Panelists: Mark Anders, Bug Music, UK
Malcolm Atkin, formerly AIR and Sphere Studios
Ian Brenchley, MD, Metropolis Studios
Paul Brown, Paul Brown Management
Jonathan Smith, GM, Abbey Road Studios

This Special Event examines the trends in the professional recording business in recent years and invites a new generation of recording entrepreneurs to share ideas, often driven by the expanding needs of contemporary clients. Professor Andrew Leyshon from Nottingham University presents his thought-provoking analysis of the changing worldwide recording climate. This is followed by a distinguished panel of recording stakeholders discussing current and possible future recording business models.

Student Event/Career Development
CAREER/JOB FAIR
Sunday, May 23, 11:00 – 13:00
C4-Foyer

The Career Fair will feature several companies from the exhibit floor. All attendees of the Convention, students and professionals alike, are welcome to come talk with representatives from the companies and find out more about job and internship opportunities in the audio industry. Bring your resume!

Sunday, May 23 11:00 Room Saint Julien
Technical Committee Meeting on Human Factors in Audio Systems

Sunday, May 23 11:00 Room Mouton Cadet
Standards Committee Meeting SC-05-02 Audio Connectors

Tutorial 4 Sunday, May 23
11:30 – 13:00 Room C1

CANCELED

Sunday, May 23 12:00 Room Saint Julien
Technical Committee Meeting on Signal Processing

Sunday, May 23 12:30 Room Mouton Cadet
Standards Committee Meeting SC-05-05 Grounding and EMC Practices

Sunday, May 23 13:00 Room Saint Julien
Technical Committee Meeting on Audio Recording and Mastering Systems

Student Event/Career Development
BENEFIT OF ACCREDITATION—JAMES
Sunday, May 23, 13:15 – 13:45
Room C5

Moderators: Phil Harding, Dave Ward


20 Audio Engineering Society 128th Convention Program, 2010 Spring

Will a Qualification Get You a Job in Music Production? Not so long ago, a career in music production meant starting at the bottom in a recording studio, pushing a broom and making the tea, slowly absorbing the skills until you eventually graduated from tape op to assistant engineer, finally reaching the rank of producer. Now there are hundreds of degree courses and thousands of graduates looking for jobs in far fewer commercial studios than existed before—has anything changed? Is it worth the investment of putting yourself through college if you then have to start at the bottom again? And are there any jobs anyway?
JAMES is dedicated to making links between Education and the Media Industries. JAMES is the education arm of the Association of Professional Recording Services (APRS), the Music Producers Guild (MPG), and the UK Screen Association. JAMES is made up of dedicated industry professionals who wish to ensure that many years of professional experience are not lost to future generations.

Session P11 Sunday, May 23
13:30 – 18:30 Room C3

NETWORK, INTERNET, AND BROADCAST AUDIO

Chair: Bob Walker, Consultant, UK

13:30

P11-1 A New Technology for the Assisted Mixing of Sport Events: Application to Live Football Broadcasting—Giulio Cengarle, Toni Mateos, Natanael Olaiz, Pau Arumí, Barcelona Media Innovation Centre, Barcelona, Spain

This paper presents a novel application for capturing the sound of the action during a football match by automatically mixing the signals of several microphones placed around the pitch and selecting only those microphones that are close to, or aiming at, the action. The sound engineer is presented with a user interface where he or she can define and dynamically move the point of interest on a screen representing the pitch, while the application controls the faders of the broadcast console. The technology has been applied in the context of a three-dimensional surround sound playback of a Spanish first-division match.
Convention Paper 8037
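A proximity-driven fader control of this kind can be sketched as follows; this is a hypothetical illustration (the on/off gain rule, function names, and parameters are assumptions, since the abstract does not specify the actual control law):

```python
import math

def fader_gains(mic_positions, point_of_interest, n_active=2):
    """Open the faders of the `n_active` microphones closest to the
    point of interest on the pitch and close the rest. A real system
    would crossfade smoothly rather than switch hard, as sketched here."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])
    ranked = sorted(range(len(mic_positions)),
                    key=lambda i: dist(mic_positions[i], point_of_interest))
    active = set(ranked[:n_active])
    return [1.0 if i in active else 0.0 for i in range(len(mic_positions))]
```

Moving the point of interest on the engineer's screen would simply re-evaluate this function and push the resulting gains to the console faders.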

14:00

P11-2 Recovery Time of Redundant Ethernet-Based Networked Audio Systems—Maciej Janiszewski, Piotr Z. Kozlowski, Wroclaw University of Technology, Wroclaw, Poland

Ethernet-based networked audio systems have become more popular among audio system designers. One of the most important features available in a networked audio system is redundancy: the system can recover after different types of failures—cable or even device failure. The redundancy protocols implemented by audio developers differ from the protocols known from computer networks, but both may be used in an Ethernet-based audio system. The most important attribute of redundancy in an audio system is the recovery time. This paper summarizes research that was done at Wroclaw University of Technology. It shows the recovery time after different types of failures, with different network protocols implemented, for all possible network topologies in CobraNet and EtherSound systems.
Convention Paper 8038

14:30

P11-3 Upping the Auntie: A Broadcaster’s Take on Ambisonics—Chris Baume, Anthony Churnside, British Broadcasting Corporation Research & Development, London, UK

This paper considers Ambisonics from a broadcaster’s point of view: to identify barriers preventing its adoption within the broadcast industry and explore the potential advantages were it to be adopted. This paper considers Ambisonics as a potential production and broadcast technology and attempts to assess the impact that the adoption of Ambisonics might have on both production workflows and the audience experience. This is done using two case studies: a large-scale music production of “The Last Night of the Proms” and a smaller scale radio drama production of “The Wonderful Wizard of Oz.” These examples are then used for two subjective listening tests: the first to assess the benefit of representing height allowed by Ambisonics and the second to compare the audience’s enjoyment of first order Ambisonics to stereo and 5.0 mixes.
Convention Paper 8039

15:00

P11-4 Audio-Video Synchronization for Post-Production over Managed Wide-Area Networks—Nathan Brock,1 Michelle Daniels,1 Steve Morris,2 Peter Otto1
1University of California San Diego, La Jolla, CA, USA
2Skywalker Sound, Marin County, CA, USA

A persistent challenge with enabling remote collaboration for cinema post-production is synchronizing audio and video assets. This paper details efforts to guarantee that the sound quality and audio-video synchronization over networked collaborative systems will be measurably the same as that experienced in a traditional facility. This includes establishing a common word-clock source for all digital audio devices on the network, extending transport control and time code to all audio and video assets, adjusting latencies to ensure sample-accurate mixing between remote audio sources, and locking audio and video playback to within quarter-frame accuracy. We will detail our instantiation of these techniques at a demonstration given in December 2009 involving collaboration between a film editor in San Diego and a sound designer in Marin County, California.
Convention Paper 8040

15:30

P11-5 A Proxy Approach for Interoperability and Common Control of Networked Digital Audio Devices—Osedum P. Igumbor, Richard J. Foss, Rhodes University, Grahamstown, South Africa

This paper highlights the challenge that results from the availability of a large number of control protocols within the context of digital audio networks. Devices that conform to different protocols are unable to communicate with one another, even though they might be utilizing the same networking technology (Ethernet, IEEE 1394 serial bus, USB). This paper describes the use of a proxy that allows for high-level device interaction (by sending protocol messages) between networked devices. Furthermore, the proxy allows a common controller to control the disparate networked devices.
Convention Paper 8041

16:00

P11-6 Network Neutral Control over Quality of Service Networks—Philip Foulkes, Richard Foss, Rhodes University, Grahamstown, South Africa

IEEE 1394 (FireWire) and Ethernet Audio/Video Bridging are two networking technologies that allow for the transportation of synchronized, low-latency, real-time audio and video data. Each networking technology has its own methods and techniques for establishing stream connections between the devices that reside on the networks. This paper discusses the interoperability of these two networking technologies via an audio gateway and the use of a common control protocol, AES-X170, to allow for the control of the parameters of these disparate networks. This control is provided by a software patchbay application.
Convention Paper 8042

16:30

P11-7 Relative Importance of Speech and Non-Speech Components in Program Loudness Assessment—Ian Dash,1 Mark Bassett,2 Densil Cabrera2
1Australian Broadcasting Corporation, Sydney, NSW, Australia
2The University of Sydney, Sydney, NSW, Australia

It is commonly assumed in broadcasting and film production that audiences determine soundtrack loudness mainly from the speech component. While intelligibility considerations support this idea indirectly, the literature is very short on direct supporting evidence. A listening test was therefore conducted to test this hypothesis. Results suggest that listeners judge loudness from overall levels rather than speech levels. A secondary trend is that listeners tend to compare like with like: they will compare speech loudness with other speech content rather than with non-speech content, and will compare the loudness of non-speech content with other non-speech content more than with speech content. A recommendation is made on applying this result to informed program loudness control.
Convention Paper 8043

17:00

P11-8 Loudness Normalization in the Age of Portable Media Players—Martin Wolters,1 Harald Mundt,1 Jeffrey Riedmiller2
1Dolby Germany GmbH, Nuremberg, Germany
2Dolby Laboratories Inc., San Francisco, CA, USA

In recent years, the increasing popularity of portable media devices among consumers has created new and unique audio challenges for content creators, distributors, and device manufacturers. Many of the latest devices are capable of supporting a broad range of content types and media formats, including those often associated with high quality (wider dynamic-range) experiences such as HDTV, Blu-ray, or DVD. However, portable media devices are generally challenged in terms of maintaining consistent loudness and intelligibility across varying media and content types on either their internal speaker(s) and/or headphone outputs. This paper proposes a nondestructive method to control playback loudness and dynamic range on portable devices based on a worldwide standard for loudness measurement as defined by the ITU. In addition, the proposed method is compatible with existing playback software and audio content following the Replay Gain (www.replaygain.org) proposal. In the course of the paper the current landscape of loudness levels across varying media and content types is described, and new, nondestructive concepts targeted at addressing consistent loudness and intelligibility for portable media players are introduced.
Convention Paper 8044

17:30

P11-9 Determining an Optimal Gated Loudness Measurement for TV Sound Normalization—Eelco Grimm,1 Esben Skovenborg,2 Gerhard Spikofski3
1Grimm-Audio, Eindhoven, The Netherlands
2tc-electronic, Risskov, Denmark
3Institute of Broadcast Technology, Berlin, Germany

Undesirable loudness jumps are a notorious problem in television broadcast. The solution consists of switching to loudness-based metering and program normalization. In Europe this development has been led by the EBU P/LOUD group, working toward a single target level for loudness normalization applying to all genres of programs. P/LOUD found that loudness normalization as specified by ITU-R BS.1770-1 works fairly well for the majority of broadcast programs. However, it was realized that wide loudness-range programs were not well aligned with other programs when using ITU-R BS.1770-1 directly, but that adding a measurement gate provided a simple yet effective solution. P/LOUD therefore conducted a formal listening experiment to perform a subjective evaluation of different gate parameters. This paper specifies the method of the subjective evaluation and presents the results in terms of preferred gating parameters.
Convention Paper 8154
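A measurement gate of the kind evaluated here works in two stages, similar to what was later standardized in ITU-R BS.1770-2; a sketch over per-block loudness values (function names and the example blocks are illustrative):

```python
import math

def gated_loudness(block_loudness_lufs, absolute_gate=-70.0, relative_gate=-10.0):
    """Two-stage gated loudness: drop blocks below an absolute threshold,
    compute a provisional energy-domain mean, then average only the
    blocks within `relative_gate` LU of that mean. Averaging is done in
    the energy domain, as loudness measurement requires."""
    def mean_lufs(blocks):
        return 10.0 * math.log10(sum(10 ** (b / 10.0) for b in blocks) / len(blocks))
    stage1 = [b for b in block_loudness_lufs if b > absolute_gate]
    threshold = mean_lufs(stage1) + relative_gate
    stage2 = [b for b in stage1 if b > threshold]
    return mean_lufs(stage2)
```

The effect is that long quiet passages no longer drag the measured level down, so wide loudness-range programs align with other programs after normalization.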

18:00

P11-10 Analog or Digital? A Case-Study to Examine Pedagogical Approaches to Recording Studio Practice—Andrew King, University of Hull, Scarborough, North Yorkshire, UK

This paper explores the use of digital and analog mixing consoles in the recording studio over a single drum kit recording session. Previous research has examined contingent learning, problem-solving, and collaborative learning within this environment. However, while there have been empirical investigations into the use of computer-based software and interaction around and within computers, these have not taken into consideration the use of complex recording apparatus. A qualitative case study approach was used in this investigation. Thirty hours of video data were captured and transcribed. A preliminary analysis of the data shows that there are differences between the types of problems encountered by learners when using an analog or a digital mixing console.
Convention Paper 8045

Session P12 Sunday, May 23
14:00 – 17:30 Room C5

NOISE REDUCTION AND SPEECH INTELLIGIBILITY

Chair: Rhonda Wilson, Meridian Audio, Huntingdon, UK

14:00

P12-1 Monolateral and Bilateral Fitting with Different Hearing Aids Directional Configurations—Lorenzo Picinali,1 Silvano Prosser2
1De Montfort University, Leicester, UK
2Università di Ferrara, Ferrara, Italy

Hearing aid bilateral fitting in hearing-impaired subjects raises some problems concerning the interaction of perceptual binaural properties with the directional characteristics of the device. This experiment aims to establish whether, and to what extent, in a sample of 20 normally hearing subjects, binaural changes in the speech-to-noise level ratio (s/n), caused by different symmetrical and asymmetrical microphone configurations and different positions of the speech signal (frontal or lateral), can alter speech recognition performance in noise. Speech Reception Thresholds (SRT) in noise (simulated through an Ambisonic virtual sound field) have been measured monolaterally and bilaterally in order to properly investigate the role of binaural interaction in the perception of reproduced signals.
Convention Paper 8046

14:30

P12-2 Acoustic Echo Cancellation for Wideband Audio and Beyond—Shreyas Paranjpe, Scott Pennock, Phil Hetherington, QNX Software Systems Inc. (Wavemakers), Vancouver, BC, Canada

Speech processing is finally starting the transition to wider bandwidths. The benefits include increased intelligibility and comprehension and a more pleasing communication experience. High quality full-duplex Acoustic Echo Cancellation (AEC) is an integral component of a hands-free speakerphone communication system because it allows participants to converse in a natural manner (and without a headset!) as they would in person. Some high-end Telepresence systems already achieve life-like communication but are computationally demanding and prohibitively expensive. The challenge is to develop a robust Acoustic Echo Canceler (AEC) that processes full-band audio signals with low computational complexity and reasonable memory consumption for an affordable Telepresence experience.
Convention Paper 8047

15:00

P12-3 Adaptive Noise Reduction for Real-Time Applications—Constantin Wiesener,1 Tim Flohrer,2 Alexander Lerch,2 Stefan Weinzierl1
1TU Berlin, Berlin, Germany
2zplane.development, Berlin, Germany

We present a new algorithm for real-time noise reduction of audio signals. In order to derive the noise reduction function, the proposed method adaptively estimates the instantaneous noise spectrum from an autoregressive signal model, as opposed to the widely used approach of using a constant noise spectrum fingerprint. In conjunction with the Ephraim and Malah suppression rule, a significant reduction of both stationary and non-stationary noise can be obtained. The adaptive algorithm is able to work without user interaction and is capable of real-time processing. Furthermore, quality improvements are easily possible by integrating additional processing blocks such as transient preservation.
Convention Paper 8048

15:30

P12-4 Active Noise Reduction in Personal Audio Delivery Systems: Assessment Using Loudness Balance Methods—Paul Darlington, Pierre Guiu, Phitek Systems (Europe) Sarl, Lausanne, Switzerland

Subjective methods developed for rating passive hearing protectors are inappropriate for measuring the active noise reduction of earphones or headphones. Additionally, assessment using objective means may produce misleading results, which do not correlate well with wearer experience and do not encompass human variability. The present paper describes the application of “loudness balance” methods to the estimation of the active attenuation of consumer headphones and earphones. Early results suggest that objective measures of the active reduction of pressure report greater attenuations than those implied by estimates of perceived loudness, particularly at low frequency.
Convention Paper 8049

16:00

P12-5 Mapping Speech Intelligibility in Noisy Rooms—John F. Culling,1 Sam Jelfs,1 Mathieu Lavandier2
1Cardiff University, Cardiff, UK
2Université de Lyon, Vaulx-en-Velin Cedex, France

We have developed an algorithm for accurately predicting the intelligibility of speech in noise in a reverberant environment. The algorithm is based on a development of the equalization-cancellation theory of binaural unmasking, combined with established prediction methods for monaural speech perception in noise. It has been validated against a wide range of empirical data. Acoustic measurements of rooms, known as binaural room impulse responses (BRIRs), are analyzed to predict the intelligibility of a nearby voice masked by any number of steady-state noise maskers in any spatial configuration within the room. This computationally efficient method can be used to generate intelligibility maps of rooms based on the design of the room.
Convention Paper 8050

16:30

P12-6 Further Investigations into Improving STI’s Recognition of the Effects of Poor Frequency Response on Subjective Intelligibility—Glenn Leembruggen,1 Marco Hippler,2 Peter Mapp3
1Acoustic Directions Pty Ltd., ICE Design, Sydney, Australia, and University of Sydney, Sydney, Australia
2University of Applied Sciences Cologne, Cologne, Germany
3Peter Mapp and Associates, Colchester, UK

Previous work has highlighted deficiencies in the ability of the STI metric to satisfactorily recognize the subjective loss of intelligibility that occurs with sound systems having poor frequency responses, particularly in the presence of reverberation. In a recent paper we explored the changes to STI values resulting from a range of dynamic speech spectra taken over differing time lengths with different filter responses. That work included determining the effects on STI values of three alternative spreading functions simulating the ear’s upward masking mechanism. This paper extends that work and explores the effects on STI values of two masking methods used in MPEG-1 audio coding.
Convention Paper 8051

17:00

P12-7 Real-Time Speech-Rate Modification Experiments—Adam Kupryjanow, Andrzej Czyzewski, Gdansk University of Technology, Gdansk, Poland

An algorithm designed for real-time speech time-scale modification (stretching) is proposed. It combines a typical synchronous overlap-and-add based time-scale modification algorithm with signal redundancy detection algorithms that allow parts of the speech signal to be removed and replaced with stretched speech signal fragments. The effectiveness of the signal processing algorithms is examined experimentally, together with the resulting sound quality.
Convention Paper 8052

Workshop 6 Sunday, May 23
14:00 – 15:45 Room C2

HOW DO WE EVALUATE HIGH RESOLUTION FORMATS FOR DIGITAL AUDIO?

Chair: Hans van Maanen, Temporal Coherence, The Netherlands

Panelists: Peter Craven, Algol Applications Ltd., Steyning, UK
Milind Kunchur, University of South Carolina, SC, USA
Thomas Sporer, Fraunhofer Institute for Digital Media Technology IDMT, Ilmenau, Germany
Menno van der Veen, Ir. Bureau Vanderveen, Zwolle, The Netherlands
Wieslaw Woszczyk, McGill University, Montreal, Quebec, Canada

Since the introduction of the high resolution formats for digital audio (e.g., SACD, 192 kHz / 24 bit), there has been discussion about the audibility of these formats compared to the CD format (44.1 kHz / 16 bit). What difference do high sample rate and bit depth make in our perception? Can we hear tones above 20 kHz? Can we perceive quantization errors in 16-bit audio? Does a high sample rate make a difference in our phase resolution? Are we even asking the right questions? Controlled, scientific listening tests have mostly given ambiguous or inconclusive results, yet a large number of consumers, using “high-end” audio equipment, prefer the sound of the “high resolution” formats over the CD. The workshop will start with introductory notes from the panel members, discussing the differences between “analog” and first-generation digital formats, addressing some of the paradoxes of the CD format, presenting results on “circumstantial” evidence and subjective testing, showing results on aspects of human hearing that cannot be explained by the commonly accepted 20 kHz upper limit, and discussing the problems and pitfalls of “scientific” listening tests, where possible illustrated with demonstrations.
These introductory notes should provoke a discussion with the audience about the audibility of the improvements of the “high resolution” formats. We attempt to reach consensus, where possible, regarding what is known and what is not with respect to our ability to perceive the differences between standard and high resolution audio. We further discuss the paradigms of testing for evaluating the quality and perception of high resolution audio, how to structure the tests, how to configure the testing environment, and how to analyze the results.
The outcome of the workshop should also be to find the way forward by identifying the bottlenecks which, at this moment, hamper the further implementation of the “high resolution” formats for “high-end” audio, as these formats create an opportunity for the audio industry as a whole: better sources stimulate the development of better reproduction systems.

Special Event
AES/APRS/MPG PLATINUM PRODUCER BEHIND THE GLASS
Sunday, May 23, 14:00 – 15:45
Room C1

Moderator: Howard Massey

Panelists: Jon Cohen
Jon Kelly
Stephen Lipson
Russ Titelman

An informal conversation with legendary British and American producers about the journey to the intersection where art and commerce meet. Topics will include tips and techniques for eliciting the best performance from an artist; realizing the artist’s vision without busting the budget; and the changing role of the producer in today’s increasingly homegrown environment.
Howard Massey is a leading audio industry consultant, technical writer, and author of Behind the Glass and Behind the Glass Volume II, two collections of in-depth interviews with top record producers and audio engineers widely used in recording schools the world over. He also co-authored legendary Beatles engineer Geoff Emerick’s acclaimed 2006 memoir, Here, There, and Everywhere: My Life Recording the Music of the Beatles.
This event will be prefaced by a special presentation to Sir George Martin OBE.

Student Event/Career Development
EDUCATION FAIR
Sunday, May 23, 14:00 – 16:00 C4-Foyer

Institutions offering studies in audio (from short courses to graduate degrees) will be represented in a "table top" session. Information on each school's respective programs will be made available through displays and academic guidance. There is no charge for schools to participate. Admission is free and open to all Convention attendees.

Sunday, May 23 14:00 Room Saint Julien
Technical Committee Meeting on Electromagnetic Compatibility

Sunday, May 23 15:00 Room Saint Julien
Technical Committee Meeting on Semantic Audio Analysis

Workshop 7 Sunday, May 23
16:00 – 19:00 Room C6

THE WORK OF FORENSIC SPEECH AND AUDIO ANALYSTS

Chair: Peter French, JP French Associates, York, UK; University of York, York, UK

Panelists: Philip Harrison, JP French Associates, York, UK; University of York, York, UK

This workshop provides an illustrated introduction to the various categories of work carried out by forensic speech and audio analysts. In attempting to convey the variety, scope, and flavor of the work, methodological issues, principles, and developments are discussed in respect of key areas of forensic investigation. These include speaker profiling, the analysis of recordings of criminal speech samples in order to assemble a profile of regional, social, ethnic, and other information concerning speakers; speaker comparison testing, the comparison of recorded voice samples using auditory-phonetic and acoustic methods as well as automatic systems to assist with identification; content analysis, the use of phonetic and acoustic techniques to decipher contentious areas of evidential recordings; procedures for evaluating ear-witness testimony; sound propagation testing, the reconstruction of crime scenes and the measurement of sound from given points across distances to assess what might have been heard by witnesses in various positions; and recording authentication, the examination of evidential recordings for signs of tampering or falsification. Some of the points developed are exemplified by recordings and analyses arising from important and high-profile criminal cases. The presentation will be followed by a discussion session where questions and points are invited from those attending.

Tutorial 5 Sunday, May 23
16:00 – 18:00 Room C2

SPATIAL AUDIO REPRODUCTION: FROM THEORY TO PRODUCTION

Presenters: Frank Melchior, IOSONO GmbH, Erfurt, Germany
Sascha Spors, Deutsche Telekom AG Laboratories, Berlin, Germany

Advanced high-resolution spatial sound reproduction systems like Wave Field Synthesis (WFS) and Higher-Order Ambisonics (HOA) are being used increasingly. Consequently, more and more material is being produced for such systems. Established channel-based production processes from stereophony can be applied only to a certain extent. In the future, a paradigm shift toward object-based audio production will have to take place in order to cope with the needs of systems like WFS. This tutorial builds a bridge from the physical foundations of such systems, through their practical implementation, to efficient production processes. The focus lies on WFS; however, the findings will also be applicable to other systems. The tutorial is accompanied by practical examples of object-based productions for WFS.

Student Event/Career Development
RECORDING COMPETITION—PART 1
(WORLD/FOLK, JAZZ/BLUES, POP/ROCK)
Sunday, May 23, 16:00 – 19:00 Room C1

The Student Recording Competition is a highlight at each convention. A distinguished panel of judges participates in critiquing finalists of each category in an interactive presentation and discussion. This event presents stereo recordings in these categories:

• Jazz/Blues 16:00 to 17:00
• World/Folk 17:00 to 18:00
• Pop/Rock 18:00 to 19:00

The top three finalists in each category present a short summary of their techniques and intentions, and play their projects for all who attend. Meritorious awards will be presented at the closing Student Delegate Assembly Meeting (SDA-2) on Tuesday afternoon.

It's a great chance to hear the work of your colleagues from other educational institutions. Everyone learns from the judges' comments even if your project isn't one of the finalists, and it's a great chance to meet other students and faculty.

Judges include: Stereo – Jazz/Blues: Jim Anderson, Dave McLaughlin, Jim Kaiser. Stereo – World/Folk: Darcy Proper, Andres Mayo, Mark Drews. Stereo – Pop/Rock: Gary Gottlieb, Hugh Robjohns, Barry Marshall.

Sunday, May 23 16:00 Room Saint Julien
Technical Committee Meeting on Fiber Optics for Audio

Sunday, May 23 16:00 Room Mouton Cadet
Standards Committee Meeting SC-04-04 Microphone Measurement and Characterization




Session P13 Sunday, May 23
16:30 – 18:00 C4-Foyer

POSTERS: ROOM ACOUSTICS, SOUND REINFORCEMENT, AND INSTRUMENTATION

16:30

P13-1 Adaptive Equalizer for Acoustic Feedback Control—Alexis Favrot, Christof Faller, Illusonic LLC, Lausanne, Switzerland

Acoustic feedback is a recurrent problem in audio applications involving amplified closed-loop systems. The goal of the proposed acoustic feedback control algorithm is to adapt an equalizer, applied to the microphone signal, to prevent feedback before its effect is noticeable. This is achieved by automatically decreasing the gain of an equalizer at frequencies where feedback is likely to occur. The equalization curve is determined using information contained in an adaptively estimated feedback path. A computationally efficient implementation of the proposed algorithm, using the short-time Fourier transform, is described.
Convention Paper 8053
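The bin-wise gain logic described in the abstract can be sketched as follows. This is a minimal illustration of STFT-domain feedback suppression, not the authors' algorithm: the helper name, the stability margin, and the maximum cut depth are all assumptions of our own.

```python
import numpy as np

def feedback_suppression_gains(mic_spectrum, feedback_path, forward_gain,
                               margin_db=3.0, max_cut_db=-12.0):
    """Per-bin equalizer gains that pull the estimated loop gain below unity.

    mic_spectrum  : complex STFT frame of the microphone signal (unused here;
                    a real system would update the path estimate from it)
    feedback_path : complex frequency response of the adaptively estimated
                    loudspeaker-to-microphone path
    forward_gain  : scalar gain of the amplification chain
    """
    loop_gain_db = 20 * np.log10(np.abs(feedback_path) * forward_gain + 1e-12)
    # Cut only where the loop gain is within `margin_db` of instability (0 dB)
    cut_db = np.minimum(0.0, -(loop_gain_db + margin_db))
    cut_db = np.maximum(cut_db, max_cut_db)   # limit the depth of each notch
    return 10 ** (cut_db / 20)
```

Applied frame by frame, such gains would keep the loop gain below unity at the critical frequencies while leaving well-damped bins untouched.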

16:30

P13-2 The Meshotron: A Network of Specialized Hardware Units for 3-D Digital Waveguide Mesh Acoustic Model Parallelization—Guilherme Campos, Sara Barros, University of Aveiro, Aveiro, Portugal

This paper presents the project of a computing network—the Meshotron—specifically designed for large-scale parallelization of three-dimensional Digital Waveguide Mesh (3-D DWM) room acoustic models. It discusses the motivation for the project, its advantages, and the architecture envisaged for the application-specific hardware (ASH) units that form the network. The initial stages involve the development of a software prototype based on the rectangular mesh topology, using appropriate hardware simulation tools, and the design and test of FPGA-based scattering units for air and boundary nodes.
Convention Paper 8054
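The per-node operation being parallelized follows the standard rectangular-mesh DWM junction equations; a minimal sketch of one air-node scattering step (textbook formulation, not the Meshotron hardware design):

```python
import numpy as np

def dwm_scatter(p_in):
    """One scattering step at a 3-D rectangular DWM air node.

    p_in : incoming wave variables from the N = 6 axial neighbors.
    Returns (junction_pressure, p_out), where p_out[i] travels back
    toward neighbor i on the next time step.
    """
    p_junction = 2.0 * np.sum(p_in) / len(p_in)   # p_J = (2/N) * sum(p_i+)
    p_out = p_junction - p_in                      # p_i- = p_J - p_i+
    return p_junction, p_out
```

Because every node performs this same small computation on local data, the update maps naturally onto dedicated FPGA scattering units.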

16:30

P13-3 A Hybrid Approach for Real-Time Room Acoustic Response Simulation—Andrea Primavera,1 Lorenzo Palestini,1 Stefania Cecchi,1 Francesco Piazza,2 Marco Moschetti2
1Università Politecnica delle Marche, Ancona, Italy
2Korg Italy, Osimo, Italy

Reverberation is a well-known effect, particularly important for music listening, especially for recorded and live music. Generally, there are two approaches to artificial reverberation: the desired signal can be obtained by convolving the input signal with either a measured impulse response (IR) or a synthetic one. Taking into account the advantages of both approaches, a hybrid artificial reverberation algorithm is presented. The early reflections are derived from a real IR, truncated at the calculated mixing time, and the reverberation tail is obtained using Moorer's structure. The parameters defining this structure are derived from the analyzed IR, using a minimization criterion based on Simultaneous Perturbation Stochastic Approximation (SPSA). The obtained results showed a high-quality reverberator with a low computational load.
Convention Paper 8055
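As a rough illustration of the hybrid idea (truncated-IR convolution for the early part plus a recursive late tail), here is a sketch. The comb-filter bank is a simplified stand-in for Moorer's comb + allpass structure, and every delay and gain value is an illustrative assumption, not a parameter from the paper.

```python
import numpy as np

def comb(x, delay, g):
    """Feedback comb filter: y[n] = x[n] + g * y[n - delay]."""
    y = x.astype(float).copy()
    for n in range(delay, len(y)):
        y[n] += g * y[n - delay]
    return y

def hybrid_reverb(x, ir, mixing_time, fs, tail_gain=0.3):
    """Early reflections by convolving with the IR truncated at the mixing
    time; late tail from parallel combs with mutually prime delays.
    Assumes mixing_time * fs <= len(ir)."""
    cut = int(mixing_time * fs)              # mixing time in samples
    n_out = len(x) + len(ir)
    out = np.zeros(n_out)
    early = np.convolve(x, ir[:cut])         # early part: direct convolution
    out[:len(early)] += early
    tail_in = np.zeros(n_out)
    tail_in[cut:cut + len(x)] += x           # tail starts at the mixing time
    tail = sum(comb(tail_in, d, 0.7) for d in (1117, 1279, 1423, 1559))
    return out + tail_gain * tail
```

The computational saving comes from convolving only the short early segment while the tail costs one multiply-add per comb per sample.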

16:30

P13-4 Sensor Networks for Measurement of Acoustic Parameters of Classrooms—Suthikshn Kumar, PESIT, Bangalore, India

Measurement of acoustic parameters such as signal-to-noise ratio, reverberation time, and background noise is important in order to optimize classroom configuration and architecture. The classroom learning environment can thus be improved. Lecturers and students find comfort and assurance in knowing that the acoustic parameters of the classroom have been measured and indicate good acoustics. We propose sensor networks for acoustic measurement applications. The sensor network consists of an array of inexpensive sensor nodes that communicate and aggregate the acoustic parameters. The paper presents the sensor node architecture and illustrates the typical usage of the sensor networks for the measurement of acoustic parameters in classrooms.
Convention Paper 8056
[This paper will not be presented but is available for purchase.]
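A node estimating reverberation time could, for example, apply Schroeder backward integration to a measured impulse response. This is the generic textbook method, not the node firmware described in the paper:

```python
import numpy as np

def rt60_schroeder(ir, fs):
    """Estimate RT60 from an impulse response via Schroeder backward
    integration, fitting the -5 to -25 dB portion of the decay curve
    (a T20-style estimate) and extrapolating to -60 dB."""
    energy = np.cumsum(ir[::-1] ** 2)[::-1]            # Schroeder decay curve
    edc_db = 10 * np.log10(energy / energy[0] + 1e-300)
    t = np.arange(len(ir)) / fs
    mask = (edc_db <= -5) & (edc_db >= -25)
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)    # decay rate, dB/s
    return -60.0 / slope
```

Each node would report such scalar parameters rather than raw audio, which keeps network traffic low.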

16:30

P13-5 In Situ Directivity Measurement of Flush-Mounted Loudspeakers in a Non-Environment Listening Room—Daniel Fernández-Comesaña,1 Paúl Rodriguez-Garcia,1 Soledad Torres-Guijarro,2 Antonio Pena3
1Institute of Sound and Vibration Research (ISVR), Southampton, UK
2Laboratorio Oficial de Metroloxía de Galicia (LOMG), Tecnópole Ourense, Spain
3ETSE Telecomunicación, Universidade de Vigo, Vigo, Spain

Directivity is an important parameter for defining the behavior of a loudspeaker. There are many techniques and standards for directivity measurement in anechoic chambers, but in situ measurements of flush-mounted loudspeakers pose some specific problems. This paper develops a procedure to measure directivity under the special conditions of a non-environment listening room, introducing the techniques utilized, the problems found along with the proposed solutions, and discussing the limitations of the process. The existence of reflections, baffling effects due to adjacent walls, and a comparison to theoretical models of the radiation of a piston are discussed.
Convention Paper 8057

16:30

P13-6 Measurement of Loudspeakers without Using an Anechoic Chamber Utilizing Pulse-Train Measurement Method—Teruo Muraoka,1 Takahiro Miura,1 Haruhito Shimura,2 Hiroshi Akino,2 Tohru Ifukube1
1The University of Tokyo, Tokyo, Japan
2Audio-Technica Corporation, Machida, Japan

Generally, loudspeakers are measured in an anechoic chamber. However, such chambers are very expensive, which puts routine acoustical measurements beyond the reach of most sound engineers. The authors therefore devised a new measurement method that requires no anechoic chamber. Traditionally, a loudspeaker is measured in an anechoic chamber by installing it in a proper speaker enclosure and driving it with a swept sinusoidal wave. The loudspeaker's radiated sound is detected with an omnidirectional reference microphone, and the output of the microphone is displayed graphically. If a shotgun microphone is employed instead of an omnidirectional microphone, an anechoic chamber becomes unnecessary. Furthermore, if a pulse train is used as the test signal, noise from the surroundings can be reduced by applying a synchronous-averaging algorithm. Based upon this idea, the authors measured loudspeakers in ordinary acoustic surroundings and obtained data very similar to those from the traditional method.
Convention Paper 8058
Paper presented by Takahiro Miura.
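The synchronous-averaging step can be sketched generically: averaging M repetitions of a periodic excitation (such as a pulse train) leaves the deterministic response intact while uncorrelated noise power drops by a factor of M, i.e., roughly a 10·log10(M) dB SNR gain. A minimal illustration, not the authors' implementation:

```python
import numpy as np

def synchronous_average(recording, period):
    """Average successive periods of a periodic excitation so that
    uncorrelated background noise cancels.

    recording : 1-D array of samples captured at the measurement microphone
    period    : excitation period in samples (must match the pulse train)
    """
    m = len(recording) // period              # number of complete periods
    frames = recording[:m * period].reshape(m, period)
    return frames.mean(axis=0)                # noise power reduced by 1/m
```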

16:30

P13-7 Acoustic Space Sampling and the Grand Piano in a Non-Anechoic Environment: A Recordist-Centric Approach to the Musical Acoustic Study—Grzegorz Sikora,1,2 Brett Leonard,1,2 Martha de Francisco,1,2 Douglas Eck2,3,4
1McGill University, Montreal, Quebec, Canada
2Centre for Interdisciplinary Research in Music, Media, and Technology, Montreal, Quebec, Canada
3Université de Montréal, Montreal, Quebec, Canada
4International Laboratory for Brain, Music, and Sound Research, Montreal, Quebec, Canada

A novel approach to instrument acoustics research is presented in which the instrument is coupled with a room and both are measured as a single acoustical system, in juxtaposition to many anechoic and computer-simulated studies of musical acoustics. The technique is applied to the ubiquitous concert grand piano, where spectral information is gathered through the process of "acoustic space sampling" (AcSS), using more than 1,330 microphones. The physical data is then combined with psychoacoustic predictors to generate a map of timbre. This map is compared to the preference of expert listeners, thereby correlating the physical measures obtained through acoustic space sampling to the application of the recording engineer.
Convention Paper 8059

Sunday, May 23 17:30 Room Mouton Cadet
Standards Committee Meeting SC-04-01 Acoustics and Sound Source Modeling

Sunday, May 23 17:30 Room Mouton Cadet
Standards Committee Meeting SC-03-12 Forensic Audio

Special Event
OPEN HOUSE OF THE TECHNICAL COUNCIL AND THE RICHARD C. HEYSER MEMORIAL LECTURE
Sunday, May 23, 19:00 – 20:30 Room C2

Lecturer: Brian C. J. Moore

The Heyser Series is an endowment for lectures by eminent individuals with outstanding reputations in audio engineering and its related fields. The series is featured twice annually at both the United States and European AES Conventions. Established in May 1999, the Richard C. Heyser Memorial Lecture honors the memory of Richard Heyser, a scientist at the Jet Propulsion Laboratory, who was awarded nine patents in audio and communication techniques and was widely known for his ability to clearly present new and complex technical ideas. Heyser was also an AES governor and AES Silver Medal recipient.

The Richard C. Heyser distinguished lecturer for the 128th AES Convention is Brian C. J. Moore, Professor of Auditory Perception, Department of Experimental Psychology, University of Cambridge. The topic of his lecture is "Hearing Loss and Hearing Aids."

Hearing loss affects more than 10 percent of the adult population in most countries and is especially prevalent among the elderly. The most common form of hearing loss arises from dysfunction of the cochlea in the inner ear. In most cases, the only form of treatment is via hearing aids or (for profound losses) cochlear implants. In this lecture Moore will review some of the perceptual consequences of hearing loss, which involve much more than just loss of sensitivity to weak sounds. He will then describe the signal processing that is performed in hearing aids and will consider the extent to which hearing aids "compensate" for hearing loss. Possible avenues for the future will be discussed.

Brian Moore is Professor of Auditory Perception in the University of Cambridge. He is a Fellow of the Royal Society, the Academy of Medical Sciences, and the Acoustical Society of America, and an Honorary Fellow of the Belgian Society of Audiology and of the British Society of Hearing Aid Audiologists. He is President of the Association of Independent Hearing Healthcare Professionals (UK). He is an Associate Editor of the Journal of the Acoustical Society of America and is a member of the Editorial Boards of Hearing Research, and Audiology and Neurotology. He has written or edited 14 books and over 500 scientific papers and book chapters. In 2003 he was awarded the Acoustical Society of America Silver Medal in physiological and psychological acoustics. In 2004 he was awarded the first "International Award in Hearing" from the American Academy of Audiology. He has twice been awarded the Littler Prize of the British Society of Audiology. In 2008 he received the Award of Merit from the Association for Research in Otolaryngology and the Hugh Knowles Prize for Distinguished Achievement from Northwestern University. He is Wine Steward of Wolfson College Cambridge.

Moore's presentation will be followed by a reception hosted by the AES Technical Council.

Session P14 Monday, May 24
09:00 – 12:00 Room C3

SPATIAL AUDIO PERCEPTION

Chair: Russell Mason, University of Surrey, Guildford, Surrey, UK

09:00

P14-1 Perceptual Assessment of Delay Accuracy and Loudspeaker Misplacement in Wave Field Synthesis—Jens Ahrens, Matthias Geier, Sascha Spors, Technische Universität Berlin, Berlin, Germany

The implementation of simple virtual source models like plane and spherical waves in wave field synthesis (WFS) employs delays that are applied to the input signals. We present a formal experiment evaluating the perceptual consequences of different accuracies of these delays. Closely related to the question of delay accuracy is the accuracy of the loudspeaker positioning. The second part of the presented experiment investigates the perceptual consequences of improperly placed loudspeakers. Dynamic binaural room impulse response-based simulations of a real loudspeaker array are employed, and a static audio scene setup is considered.
Convention Paper 8068
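In the simplest point-source case, the delays in question are propagation times quantized to the sample grid; a sketch under that simplification (geometry only, none of the WFS driving-function amplitude terms, and not the authors' experimental setup):

```python
import numpy as np

C = 343.0  # speed of sound, m/s

def wfs_delays(source_pos, speaker_positions, fs):
    """Per-loudspeaker delays for a WFS spherical-wave (point) source.

    Returns the exact delays, the delays rounded to the sample grid, and
    the worst-case quantization error in microseconds - the kind of delay
    inaccuracy whose audibility the paper evaluates.
    """
    dists = np.linalg.norm(speaker_positions - source_pos, axis=1)
    exact = dists / C                         # seconds
    quantized = np.round(exact * fs) / fs     # nearest-sample delays
    max_err_us = np.max(np.abs(exact - quantized)) * 1e6
    return exact, quantized, max_err_us
```

At fs = 48 kHz the rounding error is bounded by about 10.4 µs per loudspeaker, which frames the delay accuracies being tested perceptually.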

09:30

P14-2 Perceptual Evaluation of Focused Sources in Wave Field Synthesis—Matthias Geier, Hagen Wierstorf, Jens Ahrens, Ina Wechsung, Alexander Raake, Sascha Spors, Technische Universität Berlin, Berlin, Germany

Wave Field Synthesis provides the possibility of reproducing virtual sound sources located between the loudspeaker array and the listener. Such sources are known as focused sources. A previously published study including an informal listening test has shown that the reproduction of focused sources is subject to audible artifacts, especially for large loudspeaker arrays. The combination of the time-reversal nature of focused sources and spatial sampling leads to pre-echoes. The perception of these artifacts is quite different depending on the relative listener position. This paper describes a formal test that was conducted to verify the perceptual relevance of the physical properties found in previous papers.
Convention Paper 8069

10:00

P14-3 Experiments on the Perception of Elevated Sources in Wave-Field Synthesis Using HRTF Cues—Jose J. Lopez,1 Maximo Cobos,1 Basilio Pueo2
1Technical University of Valencia, Valencia, Spain
2University of Alicante, Alicante, Spain

Wave-Field Synthesis (WFS) is a spatial sound reproduction technique that has attracted the interest of many researchers in recent decades. Unfortunately, although WFS has been shown to provide excellent localization accuracy, this property is restricted to sources located on the horizontal plane. Recently, the authors proposed a hybrid system that combines HRTF-based spectral filtering with WFS. This system makes use of the conventional WFS approach to achieve localization in the horizontal plane, whereas elevation effects are simulated by means of spectral elevation cues. This paper provides a review of the proposed method together with a compilation of the latest experiments carried out to evaluate the perception of elevated sources in this novel system.
Convention Paper 8070

10:30

P14-4 Comparison of the Width of Sound Sources in 2-Channel and 3-Channel Sound Reproduction—Munhum Park, Aki Härmä, Steven van de Par, Georgia Tryfou, Philips Research Laboratories, Eindhoven, The Netherlands

In this paper we present the results of listening tests in which the width of the sound stage was compared between conventional 2-channel stereophony and 3-channel reproduction with an additional center loudspeaker. When listeners were seated on the axis of symmetry, there was no significant difference between the two cases. In off-axis positions there was a clear trend that the perceived image width of the noise is influenced by the way an additional interfering stimulus is reproduced, which was found to depend on the correlation of the noise. The results suggest that the spatial attributes of one sound image may be affected by another, for which possible explanations based on the principles of binaural hearing will be discussed.
Convention Paper 8071

11:00

P14-5 Study of the Effect of Source Directivity on the Perception of Sound in a Virtual Free-Field—Sorrel Hoare, Alex Southern, Damian Murphy, University of York, York, UK

In the context of soundscape evaluation, an area that has yet to be explored satisfactorily concerns faithful modeling and auralization of outdoor spaces. A common limitation of popular acoustic modeling techniques relates to source characterization, the aspect of which considered here is source directivity. In terms of auralization, source directivity may be linked to perceived spatial sound quality. In a preliminary listening test a selection of audio samples is auralized in a virtual free-field where the directivity pattern of the source is varied. The results confirm the significance of this characteristic for acoustic perception, thus validating the extension of this research to more complex models.
Convention Paper 8072

11:30

P14-6 Audio Spatialization for The Morning Line—David G. Malham, Tony Myatt, Oliver Larkin, Peter Worth, Matthew Paradis, University of York, York, UK

The Morning Line is a large-scale outdoor sculpture with a multi-dimensional sound system consisting of 41 small weatherproof main loudspeakers and 12 subwoofers, configured in 6 surround arrays, or "rooms," distributed throughout the sculpture. The irregular nature of these arrays necessitated the use of VBAP panning on the sculpture, but when preparing works to be played on it, Ambisonics is more often used. This paper describes the hardware and software developed for the sculpture, theories of spatial audio perception that prompted the design approach, as well as the production facilities provided for the composers whose works are performed on The Morning Line.
Convention Paper 8073
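VBAP, the panning family mentioned above, computes loudspeaker gains by inverting the matrix of loudspeaker direction vectors for the active pair (or triplet in 3-D). A generic textbook sketch of the 2-D pairwise case, not the sculpture's actual software:

```python
import numpy as np

def vbap_2d_gains(source_azimuth, spk_az_1, spk_az_2):
    """Pairwise 2-D VBAP: solve L^T g = p for the active loudspeaker pair,
    then normalize for constant power.  Angles are in radians; the source
    direction is assumed to lie between the two loudspeakers."""
    p = np.array([np.cos(source_azimuth), np.sin(source_azimuth)])
    L = np.array([[np.cos(spk_az_1), np.sin(spk_az_1)],
                  [np.cos(spk_az_2), np.sin(spk_az_2)]])
    g = np.linalg.solve(L.T, p)
    return g / np.linalg.norm(g)        # constant-power normalization
```

Because only the loudspeaker directions enter the matrix, the method copes with irregular arrays like those in the sculpture, which is presumably why it was chosen there.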



Session P15 Monday, May 24
09:00 – 12:00 Room C5

LOUDSPEAKERS AND HEADPHONES: PART 2

Chair: John Vanderkooy, University of Waterloo, Waterloo, Ontario, Canada

09:00

P15-1 Chameleon Subwoofer Arrays—Generalized Theory of Vectored Sources in a Closed Acoustic Space—Adam J. Hill, Malcolm O. J. Hawksford, University of Essex, Colchester, UK

An equalization model is presented that seeks optimal solutions to wide-area low-frequency sound reproduction in closed acoustic spaces. The methodology improves upon conventional wisdom by incorporating a generalized subwoofer array where individual frequency-dependent loudspeaker polar responses are described by complex spherical harmonic frequency-dependent functions. Multi-point system identification is performed using three-dimensional finite-difference time-domain simulation, with optimization applied to seek global equalization represented by a set of orthogonal transfer functions applied to each spherical harmonic of each subwoofer within the array. The system is evaluated within a three-dimensional virtual acoustic space using both time- and frequency-domain metrics.
Convention Paper 8074

09:30

P15-2 Dynamical Measurement of the Effective Radiation Area SD—Wolfgang Klippel,1 Joachim Schlechter2
1University of Dresden, Dresden, Germany
2Klippel GmbH, Dresden, Germany

The effective radiation area SD is one of the most important loudspeaker parameters because it determines the acoustical output (SPL, sound power) and efficiency of the transducer. This parameter is usually derived from the geometrical size of the radiator, considering the diameter of half the surround area. This conventional technique fails for microspeakers and headphone transducers, where the surround geometry is more complicated and the excursion does not vary linearly with radius. The paper discusses new methods for measuring SD more precisely. The first method uses a laser sensor and a microphone to measure the voice coil displacement and the sound pressure generated by the transducer while mounted in a sealed enclosure. The second method uses only the mechanical vibration and geometry of the radiator, measured with a laser triangulation scanner. The paper checks the reliability and reproducibility of the conventional and new methods and discusses the propagation of the measurement error to the T/S parameters, using a test-box perturbation technique, and to other derived parameters (sensitivity).
Convention Paper 8075
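The sealed-enclosure method rests on the low-frequency compliance relation p = rho0 * c0^2 * SD * x / V: the pressure inside a sealed box is proportional to the volume displaced by the diaphragm. A sketch of how SD could be fit from simultaneous pressure and displacement samples; this illustrates only the underlying physics, not the paper's full procedure, and the least-squares fit is our own choice:

```python
import numpy as np

RHO0, C0 = 1.18, 345.0   # density of air (kg/m^3), speed of sound (m/s)

def effective_area_from_sealed_box(pressure, displacement, box_volume):
    """Estimate SD from p = rho0*c0^2 * SD * x / V  =>
    SD = (p/x) * V / (rho0*c0^2), with p/x obtained as a least-squares
    slope over simultaneous microphone and laser-displacement samples."""
    x = np.asarray(displacement, dtype=float)
    p = np.asarray(pressure, dtype=float)
    slope = np.dot(x, p) / np.dot(x, x)       # best-fit p/x
    return slope * box_volume / (RHO0 * C0 ** 2)
```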

10:00

P15-3 Modeling Acoustic Horns with FEA—David J. Murphy,1 Rick Morgans2
1Krix Loudspeakers, Hackham, SA, Australia
2Cyclopic Energy, Adelaide, SA, Australia

Simulations of acoustic horns have been developed using Finite Element Analysis (FEA) in both ANSYS and COMSOL modeling software. The FEA method solves the ideal, linear wave equation, so the effects of nonlinearity and viscosity are not inherently simulated. Quarter models were used in each case, as the acoustic horns were not axisymmetric. The size and shape of the horns have been informed by long-standing design parameters. The simulation results have been compared with acoustic measurements, and discrepancies investigated. While it was found that beam angle (dispersion) characteristics were relatively robust, it was useful in both cases to improve the accuracy of the beam angle, SPL, and impedance simulations by applying frequency-dependent damping to models of the compression driver.
Convention Paper 8076

10:30

P15-4 Electroacoustic Measurements for High Noise Environment Intercom Headsets—Stelios Potirakis,1 Nicolas-Alexander Tatlas,2 Maria Rangoussi1
1Technological Education Institute of Piraeus, Aigaleo-Athens, Greece
2University of Patras, Patras, Greece

Intercom headsets are mandatory communication apparatus in high noise environment (HNE) conditions. Although military intercom headsets are typically used under extreme environmental conditions, a standard performance evaluation method exists only for the earphone elements. A systematic methodology for the measurement and performance evaluation of HNE headsets has recently been proposed based on the use of a Head and Torso Simulator (HATS), addressing both signal reproduction and noise reduction issues. In this paper an improvement of the specific method is proposed concerning the headset electroacoustic reproduction measurements. The proposed enhancement refers to the use of impulse response measurements for the extraction of the specific characteristics as a function of frequency.
Convention Paper 8077

11:00

P15-5 Phase, Polarity, and Delay or Why a Loudspeaker Crossover Is Not a Time Machine—Ian Dash,1 Fergus Fricke2
1Australian Broadcasting Corporation, Sydney, NSW, Australia
2University of Sydney, Sydney, NSW, Australia

Phase delay and group delay functions are well known in lumped system theory, but they are not well understood, particularly in the context of non-minimum-phase lumped systems. This paper presents four paradoxes that arise from an overly literal interpretation of the phase delay function. The paradoxes are very simply illustrated using the example of a first-order loudspeaker crossover. A new system response model is presented to resolve these paradoxes. The model is extended to higher-order systems and to non-minimum-phase systems. Applications and implications for system analysis are discussed.
Convention Paper 8078
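For a concrete feel of the two delay definitions being contrasted, here are the closed-form phase delay (tau_p = -phi/omega) and group delay (tau_g = -dphi/domega) of a first-order low-pass section H(s) = 1/(1 + s/wc). This is a generic textbook example, not the paper's new response model:

```python
import numpy as np

def delays_first_order_lowpass(f, fc):
    """Phase delay and group delay of H(s) = 1/(1 + s/wc), whose phase is
    phi = -atan(w/wc).  Closed forms:
        tau_p = atan(w/wc) / w
        tau_g = (1/wc) / (1 + (w/wc)^2)
    Both converge to 1/wc at low frequency but differ near and above fc."""
    w, wc = 2 * np.pi * np.asarray(f, dtype=float), 2 * np.pi * fc
    tau_p = np.arctan(w / wc) / w               # -phi / omega
    tau_g = (1 / wc) / (1 + (w / wc) ** 2)      # analytic -dphi/domega
    return tau_p, tau_g
```

The divergence of the two curves around the crossover frequency is exactly the region where naive "the filter delays the signal by tau_p" readings produce the kinds of paradoxes the paper discusses.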

11:30

P15-6 Time and Level Localization Curves for a Regularly-Spaced Octagon Loudspeaker Array—Laurent S. R. Simon, Russell Mason, University of Surrey, Guildford, Surrey, UK

Multichannel microphone array designs often use the localization curves that have been derived for 2-0 stereophony. Previous studies showed that side and rear perception of phantom image locations requires somewhat different curves. This paper describes an experiment conducted to determine localization curves using an octagonal loudspeaker setup. Various signals with a range of interchannel time and level differences were produced between pairs of adjacent loudspeakers, and subjects were asked to evaluate the perceived sound event's direction and its locatedness. The results showed that the curves for the side pairs of adjacent loudspeakers are significantly different from those for the front and rear pairs. The resulting curves can be used to derive suitable microphone techniques for this loudspeaker setup.
Convention Paper 8079

Workshop 8 Monday, May 24
09:00 – 11:00 Room C2

INTERACTING WITH SEMANTIC AUDIO—BRIDGING THE GAP BETWEEN HUMANS AND ALGORITHMS

Chair: Michael Hlatky, University of Applied Sciences Bremen, Bremen, Germany

Panelists: Masataka Goto, Media Interaction Group, National Institute of Advanced Industrial Science and Technology, Tsukuba, Japan
Anssi Klapuri, Queen Mary University of London, London, UK
Jörn Loviscach, Fachhochschule Bielefeld, University of Applied Sciences, Bielefeld, Germany
Yves Raimond, BBC Audio & Music Interactive, London, UK

Technologies under the heading of Semantic Audio have undergone a fascinating development in the past few years. Hundreds of algorithms have been developed; the first applications have made their way from research into possible mainstream application. However, the current level of awareness among prospective users and the amount of actual practical use do not seem to live up to the potential of semantic audio technologies. We argue that this is more an issue of interface and interaction than a problem of the robustness of the applied algorithms or a lack of need in audio production. The panelists of this workshop offer ways to improve the usability of semantic audio techniques. They look into current applications in off-the-shelf products, discuss the use in a variety of specialized applications such as custom-tailored archival solutions, demonstrate and showcase their own developments in interfaces for semantic audio, and propose future directions in interface and interaction development for semantic audio technologies ranging from audio file retrieval to intelligent audio effects.

The second half of this workshop includes hands-on interactive experiences provided by the panel.

Workshop 9 Monday, May 24
09:00 – 10:30 Room C6

REDUNDANT NETWORKS FOR AUDIO

Chair: Umberto Zanghieri, ZP Engineering srl, Rome, Italy

Panelists: Marc Brunke, Optocore, Munich, Germany
David Myers, Audinate, Ultimo, NSW, Australia
Michel Quaix, Digigram, Montbonnot, France
Al Walker, Midas Klark Teknik, Kidderminster, UK

Redundancy within a digital network for audio transport has specific requirements when compared to redundancy for general-purpose data networks, and for this reason several practical implementations offer dual network ports. Ideas and solutions from current formats will be presented, detailing requirements specific to usage cases such as digital audio transport for live events or for installations. The discussion will also include non-IP audio networks. Different topologies and switchover issues will be presented, and practical real-world examples will be shown.

Tutorial 6 Monday, May 24 09:00 – 10:15 Room C1

CLASSICAL MUSIC WITH PERSPECTIVE

Presenter: Sabine Maier, Tonmeister, Vienna, Austria

Concerts of classical music, as well as operas, have been a part of broadcast programming since the beginning of television. The aesthetic relationship between sound and picture plays an important part in the satisfactory experience of the consumer. The question of how far the audio perspective (if at all!) should follow the video angle (or vice versa) has always been a subject of discussion among sound engineers and producers. In the course of a diploma work this aspect has been investigated systematically. One excerpt of the famous New Year's Concert (from 2009) has been remixed into four distinctly different versions (in stereo and surround sound). Close to 80 laymen who expressed an interest in classical music had the task of judging these versions, played to the same picture, as to whether they found the audio perspective appropriate to the video or not. In this tutorial the experimental procedure as well as the results will be discussed. Examples of the different mixes will be played.

Monday, May 24 09:00 Room Saint Julien
Technical Committee Meeting on High Resolution Audio

Monday, May 24 10:00 Room Saint Julien
Technical Committee Meeting on Audio Forensics


30 Audio Engineering Society 128th Convention Program, 2010 Spring

Session P16 Monday, May 24 10:30 – 12:00 C4-Foyer

POSTERS: AUDIO PROCESSING—AUDIO CODING AND MACHINE INTERFACE

10:30

P16-1 Trajectory Sampling for Computationally Efficient Reproduction of Moving Sound Sources—Nara Hahn, Keunwoo Choi, Hyunjoo Chung, Koeng-Mo Sung, Seoul National University, Seoul, Korea

Reproducing moving virtual sound sources has been addressed in a number of spatial audio studies. The trajectories of moving sources are relatively oversampled if they are sampled in the temporal domain, and this leads to inefficiency in computing the time delay and amplitude attenuation due to sound propagation. In this paper methods for trajectory sampling are proposed that use the spatial property of the movement and the spectral property of the received signal. These methods reduce the number of sampled positions and reduce the computational complexity. Listening tests were performed to determine the appropriate downsampling rate, which depends not only on the trajectory but also on the frequency content of the source signal.
Convention Paper 8080

10:30

P16-2 Audio Latency Measurement for Desktop Operating Systems with Onboard Soundcards—Yonghao Wang,1,2 Ryan Stables,2 Joshua Reiss1
1Queen Mary University of London, London, UK
2Birmingham City University, Birmingham, UK

Using commodity computers in conjunction with live music digital audio workstations (DAWs) has become increasingly popular in recent years. The latency of these DAW audio processing chains for some applications, such as live audio monitoring, has always been perceived as a problem when DSP audio effects are needed. With "High Definition Audio" being standardized as the onboard soundcard hardware architecture for personal computers, and with advances in audio APIs, low latency and multichannel capability have made their way into home studios. This paper will discuss the results of latency measurements of current popular operating systems and host applications with different audio APIs and audio processing loads.
Convention Paper 8081
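Round-trip latency of an audio chain is often measured with a loopback test: play a known test signal, record it back, and locate the delay by cross-correlation. The sketch below (Python with NumPy, on simulated data) is purely illustrative and is not the authors' measurement setup:

```python
import numpy as np

def measure_loopback_latency(played, recorded, sample_rate):
    """Estimate round-trip latency (in seconds) by cross-correlating the
    played test signal with the loopback recording."""
    # The lag of the cross-correlation peak is the delay in samples.
    corr = np.correlate(recorded, played, mode="full")
    lag = int(np.argmax(corr)) - (len(played) - 1)
    return max(lag, 0) / sample_rate

# Simulated loopback: a noise burst delayed by 441 samples (10 ms at
# 44.1 kHz), with a little measurement noise added.
fs = 44100
rng = np.random.default_rng(0)
burst = rng.standard_normal(2048)
recorded = np.concatenate([np.zeros(441), burst, np.zeros(100)])
recorded = recorded + 0.05 * rng.standard_normal(len(recorded))

latency_ms = measure_loopback_latency(burst, recorded, fs) * 1000
print(round(latency_ms, 2))  # 10.0
```

A noise burst (or a sweep) is preferred over a pure tone here, since a tone's periodic autocorrelation makes the peak ambiguous.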

10:30

P16-3 Error Control Techniques within Multidimensional-Adaptive Audio Coding Algorithms—Neil Smyth, APTX, Belfast, N. Ireland, UK

Multidimensional-adaptive audio coding algorithms can adapt multiple performance measures to the demands of different audio applications in real time. Depending on the transmission or storage environment, audio processing applications require forms of error control to maintain acceptable audio quality. By definition, multidimensional-adaptive audio coding utilizes numerous error detection, correction, and concealment techniques. However, such techniques also have implications for other relevant performance measurements, such as coded bit rate and computational complexity. This paper discusses the signal-processing tools used by a multidimensional-adaptive audio coding algorithm to achieve varying levels of error control while the fundamental structure of the algorithm is also varying. The effects and trade-offs on other coding performance measures will also be discussed.
Convention Paper 8082

10:30

P16-4 Object-Based Audio Coding Using Non-Negative Matrix Factorization for the Spectrogram Representation—Joonas Nikunen, Tuomas Virtanen, Tampere University of Technology, Tampere, Finland

This paper proposes a new object-based audio coding algorithm that uses non-negative matrix factorization (NMF) for the magnitude spectrogram representation, while the phase information is coded separately. The magnitude model is obtained using a perceptually weighted NMF algorithm, which minimizes the noise-to-mask ratio (NMR) of the decomposition and is able to utilize long-term redundancy through an object-based representation. Methods for the quantization and entropy coding of the NMF representation parameters are proposed, and the quality loss is evaluated using the NMR measure. The quantization of the phase information is also studied. Additionally, we propose a sparseness criterion for the NMF algorithm, which is set to favor the gain values having the highest probability and thus the shortest entropy coding word length, resulting in a reduced bit rate.
Convention Paper 8083
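As a rough illustration of the decomposition underlying such a codec, the sketch below runs plain multiplicative-update NMF (Lee–Seung, Euclidean cost) on a toy magnitude spectrogram; the perceptually weighted, NMR-minimizing variant the paper proposes is not reproduced here:

```python
import numpy as np

def nmf(V, rank, n_iter=500, seed=0):
    """Plain multiplicative-update NMF of a non-negative matrix V into
    spectral basis vectors W and time-varying gains H (illustrative
    textbook version, not the paper's perceptually weighted algorithm)."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, rank)) + 1e-9
    H = rng.random((rank, T)) + 1e-9
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

# Toy "magnitude spectrogram" built from two spectral objects.
rng = np.random.default_rng(1)
V = rng.random((32, 2)) @ rng.random((2, 40))
W, H = nmf(V, rank=2)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(err < 0.1)  # the rank-2 factorization reconstructs V closely
```

In an object-based coder, each column of W would model the spectrum of one object and the rows of H its gains over time; those parameters are what get quantized and entropy coded.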

10:30

P16-5 Cross-Layer Rate-Distortion Optimization for Scalable Advanced Audio Coding—Emmanuel Ravelli, Vinay Melkote, Tejaswi Nanjundaswamy, Kenneth Rose, University of California at Santa Barbara, Santa Barbara, CA, USA

Current scalable audio codecs optimize each layer of the bit stream successively and independently, with a straightforward application of the same rate-distortion optimization techniques employed in the non-scalable case. The main drawback of this approach is that the performance of the enhancement layers is significantly worse than that of the non-scalable codec at the same cumulative bit rate. We propose in this paper a novel optimization technique in the Advanced Audio Coding (AAC) framework wherein a cross-layer iterative optimization is performed to select the encoding parameters for each layer with a conscious accounting of rate and distortion costs in all layers, which allows for a trade-off between performance at different layers. Subjective and objective results demonstrate the effectiveness of the proposed approach and provide insights for bridging the gap with the non-scalable codec.
Convention Paper 8084

Technical Program



10:30

P16-6 Issues and Solutions Related to Real-Time TD-PSOLA Implementation—Sylvain Le Beux,1 Boris Doval,2 Christophe d'Alessandro1
1LIMSI-CNRS, Université Paris-Sud XI, Orsay, France
2LAM-IJLRA, Université Paris, Paris, France

This paper presents an adaptation of the TD-PSOLA algorithm for the case where pitch-shifting and time-stretching coefficients need to be processed in real time at every new synthesis pitch mark. In the standard TD-PSOLA algorithm, modification coefficients are defined on the analysis time axis, whereas for real-time applications, pitch and duration control parameters need to be sampled at synthesis time. This paper will establish the theoretical correspondence between both approaches. Another issue related to the real-time context concerns the trade-off between the latency required for processing and the type of analysis window used.
Convention Paper 8085
Paper presented by Boris Doval.

10:30

P16-7 Integrating Musicological Knowledge into a Probabilistic Framework for Chord and Key Extraction—Johan Pauwels, Jean-Pierre Martens, Ghent University, Ghent, Belgium

In this paper a previously developed probabilistic framework for the simultaneous detection of chords and keys in polyphonic audio is further extended and validated. The system behavior is controlled by a small set of carefully defined free parameters. This has permitted us to conduct an experimental study that sheds new light on the relative importance of musicological knowledge in the context of chord extraction. Some of the obtained results are at least surprising and, to our knowledge, never reported as such before.
Convention Paper 8086

10:30

P16-8 A Doubly Sparse Greedy Adaptive Dictionary Learning Algorithm for Music and Large-Scale Data—Maria G. Jafari, Mark D. Plumbley, Queen Mary University of London, London, UK

We consider the extension of the greedy adaptive dictionary learning algorithm that we introduced previously to applications other than speech signals. The algorithm learns a dictionary of sparse atoms, while yielding a sparse representation for the speech signals. We investigate its behavior in the analysis of music signals and propose a different dictionary learning approach that can be applied to large data sets. This facilitates the application of the algorithm to problems that generate large amounts of data, such as multimedia and multichannel application areas.
Convention Paper 8087

Student Event/Career Development
RECORDING COMPETITION—PART 2
(STEREO CLASSICAL, SURROUND CLASSICAL AND NON-CLASSICAL)
Monday, May 24, 10:30 – 13:30
Room C1

The Student Recording Competition is a highlight at each convention. A distinguished panel of judges participates in critiquing finalists of each category in an interactive presentation and discussion. This event presents stereo and surround recordings in these categories:
• Stereo Classical 10:30 to 11:30
• Surround Classical 11:30 to 12:30
• Surround Non-Classical 12:30 to 13:30

The top three finalists in each category present a short summary of their techniques and intentions, and play their projects for all who attend. Meritorious awards will be presented at the closing Student Delegate Assembly Meeting (SDA-2) on Tuesday afternoon. It's a great chance to hear the work of your colleagues from other educational institutions. Everyone learns from the judges' comments even if your project isn't one of the finalists, and it's a great chance to meet other students and faculty.

Judges include: Stereo – Classical: Tony Falkner, Ken Blair, David Bowles. Surround – Classical: Philip Hobbs, Tin Jonker, Bastian Kuijit. Surround – Non-Classical: Ron Prent, Thor Legvold, Bill Crabtree.

Workshop 10 Monday, May 24 11:00 – 13:00 Room C6

A CURRICULUM FOR GAME AUDIO

Chair: Richard Stevens, Leeds Metropolitan University, UK

Panelists: Dan Bardino, Creative Services Manager, Sony Computer Entertainment Europe Limited, London, UK
Andy Farnell, Author of Designing Sound
David Mollerstedt, DICE, Stockholm, Sweden
Dave Raybould, Leeds Metropolitan University, Leeds, UK
Nia Wearn, Staffordshire University, Staffordshire, UK

How do I get work in the games industry? Anyone involved in the discussions that follow this question in forums, conferences, and workshops worldwide will realize that many students in Higher Education who are aiming to enter the sector are not equipped with the knowledge and skills that the industry requires. In this workshop a range of speakers will discuss, and attempt to define, the various roles and related skillsets for audio within the games industry and will outline their personal routes into this field. The panel will also examine the related work of the IASIG Game Audio Education Working Group in light of the recent publication of its Game Audio Curriculum Guidelines draft. This will be a fully interactive workshop inviting debate from the floor alongside discussion from panel members in order to share a range of views on this important topic.

Special Event
AES/APRS—LIFE IN THE OLD DOGS YET—PART TWO: QUALITY AND QUANTITY—A COSTLY DILEMMA
Monday, May 24, 11:00 – 13:00
Room C2

Moderator: George Massenburg, GML LLC

Panelists: Barry Fox
Rob Kelly, Air/Strongroom Studios

Legendary audio polymath George Massenburg presents



his take on the value of aspiring to the heights of quality. This is followed by a panel discussing whether a way can be found to square the circle between the desire to achieve a long-term, multi-format, re-exploitable catalog and short-term production cost-cutting. Clearly, quality is not simply a matter of applying technology. The panel will ponder how the need to sustain realistic investment in recording budgets may be a challenge to many studio clients, and how much recording costs impact upon quality and success.

Monday, May 24 11:00 Room Saint Julien
Technical Committee Meeting on Acoustics and Sound Reinforcement

Monday, May 24 12:00 Room Saint Julien
Technical Committee Meeting on Network Audio Systems

Monday, May 24 12:30 Room Mouton Cadet
Standards Committee Meeting SC-02-01 Digital Audio Measurement Techniques

Monday, May 24 13:00 Room Saint Julien
Technical Committee Meeting on Automotive Audio

Session P17 Monday, May 24 14:00 – 17:30 Room C3

MULTICHANNEL AND SPATIAL AUDIO: PART 1

Chair: Wieslaw Woszczyk, McGill University, Montreal, Quebec, Canada

14:00

P17-1 Individualization of Dynamic Binaural Synthesis by Real Time Manipulation of ITD—Alexander Lindau, Jorgos Estrella, Stefan Weinzierl, Technical University of Berlin, Berlin, Germany

The dynamic binaural synthesis of acoustic environments is usually constrained to the use of non-individual impulse response datasets, measured with dummy heads or head and torso simulators. Thus, fundamental cues for localization such as interaural level differences (ILD) and interaural time differences (ITD) are necessarily corrupted to a certain degree. For ILDs, this is a minor problem, as listeners may swiftly adapt to spectral coloration at least as long as an external reference is not provided. In contrast, ITD errors can be expected to lead to a constant degradation of localization. Hence, a method for the individual customization of dynamic binaural reproduction by means of real-time manipulation of the ITD is proposed. As a prerequisite, subjectively artifact-free techniques for the decomposition of binaural impulse responses into ILD and ITD cues are discussed. Finally, based on listening test results, an anthropometry-based prediction model for individual ITD correction factors is presented. The proposed approach entails further improvements of the auditory quality of real-time binaural synthesis.
Convention Paper 8088

14:30

P17-2 Perceptual Evaluation of Physical Predictors of the Mixing Time in Binaural Room Impulse Responses—Alexander Lindau, Linda Kosanke, Stefan Weinzierl, Technical University of Berlin, Berlin, Germany

The mixing time of room impulse responses denotes the moment when the diffuse reverberation tail begins. A diffuse sound field can physically be defined by (1) equi-distribution of acoustical energy and (2) a uniform acoustical energy flux over the complete solid angle. Accordingly, the perceptual mixing time is the moment when the diffuse tail cannot be distinguished from that of any other position in the room. This provides an opportunity for reducing the length of binaural impulse responses that are dynamically exchanged in virtual acoustic environments (VAEs). Numerous model parameters and empirical features for the prediction of perceptual mixing time in rooms have been proposed. This paper aims at a perceptual evaluation of all potential estimators. Therefore, binaural impulse response data sets were collected with an adjustable head and torso simulator for a representative sample of rectangular rooms. Prediction performance was evaluated by linear regression using results of a listening test where mixing times could be adaptively altered in real time to determine a just audible transition time into a homogeneous diffuse tail. Regression formulae for the perceptual mixing time are presented, conveniently predicting perceptual mixing times to be used in the context of VAEs.
Convention Paper 8089

15:00

P17-3 HRTF Measurements with a Continuously Moving Loudspeaker and Swept Sines—Ville Pulkki, Mikko-Ville Laitinen, Ville Pekka Sivonen, Aalto University School of Science and Technology, Aalto, Finland

An apparatus is described that is designed to measure head-related transfer functions (HRTFs) for audio applications. A broadband, two-driver loudspeaker is rotated around the subject with continuous movement, and responses are measured with a swept-sine technique. Potential error sources are discussed and quantified, and it is shown that the responses are almost identical to responses measured with a static, small single-driver loudspeaker. It is also shown that the method can be used to measure a large number of HRTFs in a relatively short time period.
Convention Paper 8090
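The swept-sine technique mentioned here is commonly implemented with an exponential sweep followed by spectral-division deconvolution. The following generic sketch (not the authors' apparatus) recovers the impulse response of a simulated system, here a pure delay with gain 0.5:

```python
import numpy as np

def exp_sweep(f1, f2, duration, fs):
    """Exponential (logarithmic) sine sweep from f1 to f2 Hz."""
    t = np.arange(int(duration * fs)) / fs
    L = duration / np.log(f2 / f1)
    return np.sin(2 * np.pi * f1 * L * (np.exp(t / L) - 1))

def deconvolve(recording, sweep):
    """Recover an impulse response by dividing spectra (with a tiny
    regularizer to avoid division by zero)."""
    n = len(recording) + len(sweep) - 1
    S = np.fft.rfft(sweep, n)
    R = np.fft.rfft(recording, n)
    return np.fft.irfft(R / (S + 1e-12), n)

fs = 8000
sweep = exp_sweep(50, 3500, 1.0, fs)
# Simulated measurement: the "room" is a 100-sample delay with gain 0.5.
recording = 0.5 * np.concatenate([np.zeros(100), sweep])
ir = deconvolve(recording, sweep)
print(int(np.argmax(np.abs(ir))))  # 100
```

The peak of the recovered impulse response lands at the simulated 100-sample delay; with a real loudspeaker and microphone the same division yields the measured HRTF up to the equipment's own response.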

15:30

P17-4 In Situ Microphone Array Calibration for Parameter Estimation in Directional Audio Coding—Oliver Thiergart, Giovanni Del Galdo, Maja Taseska, Jose Angel Pineda Pardo, Fabian Kuech, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

Directional audio coding (DirAC) provides an efficient representation of spatial sound using a downmix audio signal and parametric information, namely direction of arrival (DOA) and diffuseness of sound. The input to the DirAC analysis consists of B-format signals, usually obtained via microphone arrays. The DirAC parameter estimation is impaired when phase mismatch between the array sensors occurs. We present an approach for in situ microphone array calibration based solely on the DirAC parameters. The algorithm aims at providing consistent parameter estimates rather than matching the sensors explicitly. It neither requires removing the sensors from the array, nor depends on a priori knowledge such as the array size. We further propose a suitable excitation signal to assure robust calibration in reverberant environments.
Convention Paper 8093

16:00

P17-5 Sound Field Recording by Measuring Gradients—Mihailo Kolundzija,1 Christof Faller,1 Martin Vetterli1,2
1Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
2University of California at Berkeley, Berkeley, CA, USA

Gradient-based microphone arrays, the horizontal sound field's plane wave decomposition, and the corresponding circular harmonics decomposition are reviewed. Further, a general relation between directivity patterns of the horizontal sound field gradients and the circular harmonics of any order is derived. A number of example differential microphone arrays are analyzed, including arrays capable of approximating the sound pressure gradients necessary for obtaining the circular harmonics up to order three.
Convention Paper 8092

16:30

P17-6 Evaluation of a Binaural Reproduction System Using Multiple Stereo-Dipoles—Yesenia Lacouture Parodi, Per Rubak, Aalborg University, Aalborg, Denmark

The sweet spot size of different loudspeaker configurations was investigated in a previous study carried out by the authors. Closely spaced loudspeakers showed a wider control area than the standard stereo setup. The sweet spot with respect to head rotations proved to be especially large when the loudspeakers are placed at elevated positions. In this paper we describe a system that uses the characteristics of loudspeakers placed above the listener. The proposed system is comprised of three pairs of closely spaced loudspeakers: one pair placed in front, one placed behind, and one placed above the listener. The system is based on the idea of dividing the sound reproduction into regions to reduce front-back confusions and enhance the virtual experience without the aid of a head tracker. A set of subjective experiments with the intention of evaluating and comparing the performance of the proposed system is also discussed.
Convention Paper 8091

17:00

P17-7 Conditioning of the Problem of a Source Array Design with Inverse Approach—Jeong-Guon Ih,1 Wan-Ho Cho1,2
1KAIST, Daejeon, Korea
2Chuo University, Bunkyo-ku, Tokyo, Japan

An inverse approach based on the acoustical holography concept can be effectively applied to render an acoustic field that achieves a target sound field given as a relative response distribution of sound pressure. To implement this method, the source configuration should be determined a priori, and a meaningful inverse solution of an ill-conditioned transfer matrix should be obtained. To choose efficient source positions that are almost mutually independent, a redundancy detection algorithm, the effective independence method, was employed to decide the proper positions for a given number of sources. In this way, an efficient and stable filter set for a source array controlling the sound field can be obtained. An interior domain with irregularly shaped boundaries was adopted as the target field to control for testing the suggested inverse method.
Convention Paper 8094

Session P18 Monday, May 24 14:00 – 18:00 Room C5

AUDIO CODING AND COMPRESSION

Chair: Jamie Angus, University of Salford, Salford, UK

14:00

P18-1 Identification of Perceptual Attributes for the Assessment of Low Bit Rate Multichannel Audio Codecs—Paulo Marins, Capes, Brasilia, Brazil

This paper describes an experiment that aimed to identify independent perceptual attributes detailing the distortions introduced by low bit rate multichannel audio codecs. This study consisted of a verbal elicitation experiment in which the Repertory Grid Technique (RGT) method was employed. Four attributes were identified and labeled. The independent perceptual attributes uncovered were: "coding and high frequency noise," "spatial image clarity," "scene width," and "tone color." The overall importance of each of the independent perceptual attributes identified was quantified. It was found that "coding and high frequency noise" was the attribute with the highest overall importance, followed by "spatial image clarity," "scene width," and "tone color."
Convention Paper 8095
Paper will not be presented but is available for purchase.

14:30

P18-2 High-Level Sound Coding with Parametric Blocks—Daniel Möhlmann, Otthein Herzog, Universität Bremen, Bremen, Germany

This paper proposes a new parametric encoding model for sound blocks that is specifically designed for manipulation, block-based comparison, and morphing operations. Unlike other spectral models, only the temporal evolution of the dominant tone and its time-varying spectral envelope are encoded, thus greatly reducing perceptual redundancy. All sounds are synthesized from the same set of model parameters, regardless of their length. Therefore, new instances can be created with greater variability than through simple interpolation. A method for creating the parametric blocks from an audio stream through partitioning is also presented. An example of sound morphing is shown and applications of the model are discussed.
Convention Paper 8096

15:00

P18-3 Exploiting High-Level Music Structure for Lossless Audio Compression—Florin Ghido, Tampere University of Technology, Tampere, Finland

We present a novel concept of "noncontiguous" audio segmentation that exploits high-level music structure. Existing lossless audio compressors working in asymmetrical mode divide the audio into quasi-stationary segments of variable length by recursive splitting (MPEG-4 ALS) or by dynamic programming (asymmetrical OptimFROG) before computing a set of linear prediction coefficients for each segment. Instead, we combine several variable-length segments into a group and use a single set of linear prediction coefficients for each group. The optimal algorithm for combining has exponential complexity, and we propose a quadratic-time approximation algorithm. Integrated into asymmetrical OptimFROG, the proposed algorithm obtains up to 1.20% (on average 0.23%) compression improvements with no increase in decoder complexity.
Convention Paper 8097
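The per-segment linear prediction that such lossless coders build on can be illustrated with a least-squares predictor: fit coefficients to a segment, and the energy of the prediction residual indicates how cheaply it can be coded. This is a textbook toy, not OptimFROG's actual (proprietary) predictor design:

```python
import numpy as np

def lagged_matrix(segment, order):
    """Row n holds the previous `order` samples x[n-1], ..., x[n-order]."""
    return np.column_stack([segment[order - k - 1 : len(segment) - k - 1]
                            for k in range(order)])

def lpc_coeffs(segment, order):
    """Least-squares linear prediction coefficients for one segment."""
    coeffs, *_ = np.linalg.lstsq(lagged_matrix(segment, order),
                                 segment[order:], rcond=None)
    return coeffs

def residual_energy(segment, coeffs):
    """Energy of the prediction residual; lower means cheaper to code."""
    res = segment[len(coeffs):] - lagged_matrix(segment, len(coeffs)) @ coeffs
    return float(np.sum(res ** 2))

# A synthetic AR(2) signal is predicted almost perfectly by a
# second-order predictor, leaving a tiny residual.
rng = np.random.default_rng(0)
x = np.zeros(2000)
e = 0.01 * rng.standard_normal(2000)
for n in range(2, 2000):
    x[n] = 1.8 * x[n - 1] - 0.9 * x[n - 2] + e[n]
a = lpc_coeffs(x, 2)
print(residual_energy(x, a) < 0.1 * np.sum(x[2:] ** 2))  # residual ≪ signal energy
```

Grouping several segments under one coefficient set, as the paper proposes, trades a slightly larger residual for fewer coefficients to transmit.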

15:30

P18-4 Interactive Teleconferencing Combining Spatial Audio Object Coding and DirAC Technology—Jürgen Herre, Cornelia Falch, Dirk Mahne, Giovanni del Galdo, Markus Kallinger, Oliver Thiergart, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

The importance of telecommunication continues to grow in our everyday lives. An ambitious goal for developers is to provide the most natural way of audio communication by giving users the impression of being located next to each other. MPEG Spatial Audio Object Coding (SAOC) is a technology for coding, transmitting, and interactively reproducing spatial sound scenes on any conventional multi-loudspeaker setup (e.g., ITU 5.1). This paper describes how Directional Audio Coding (DirAC) can be used as a recording front-end for SAOC-based teleconference systems to capture acoustic scenes and to extract the individual objects (talkers). By introducing a novel DirAC-to-SAOC parameter transcoder, a highly efficient way of combining both technologies is presented that enables interactive, object-based spatial teleconferencing.
Convention Paper 8098

16:00

P18-5 A New Parametric Stereo- and Multichannel Extension for MPEG-4 Enhanced Low Delay AAC (AAC-ELD)—María Luis Valero,1 Andreas Hölzer,2 Markus Schnell,1 Johannes Hilpert,1 Manfred Lutzky,1 Jonas Engdegård,3 Heiko Purnhagen,3 Per Ekstrand,3 Kristofer Kjörling3
1Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany
2DSP Solutions GmbH & Co., Regensburg, Germany
3Dolby Sweden AB, Stockholm, Sweden

ISO/MPEG standardizes two communication codecs with low delay: AAC-LD is a well-established low delay codec for high-quality communication applications such as video conferencing, tele-presence, and Voice over IP. Its successor AAC-ELD offers enhanced bit rate efficiency, making it an ideal solution for broadcast audio gateway codecs. Many existing and upcoming communication applications benefit from the transmission of stereo or multichannel signals at low bit rates. With low delay MPEG Surround, ISO has recently standardized a low delay parametric extension for AAC-LD and AAC-ELD. It is based on MPEG Surround technology with specific adaptation for low delay operation. This extension comes along with significantly improved coding efficiency for transmission of stereo and multichannel signals.
Convention Paper 8099

16:30

P18-6 Efficient Combination of Acoustic Echo Control and Parametric Spatial Audio Coding—Fabian Kuech, Markus Schmidt, Meray Zourub, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

High-quality teleconferencing systems utilize surround sound to provide a natural communication experience. Directional Audio Coding (DirAC) is an efficient parametric approach to capture and reproduce spatial sound. It uses a monophonic audio signal together with parametric spatial cue information. For reproduction, multiple loudspeaker signals are determined based on the DirAC stream. To allow for hands-free operation, multichannel acoustic echo control (AEC) has to be employed. Standard approaches apply multichannel adaptive filtering to address this problem. However, computational complexity constraints and convergence issues inhibit practical applications. This paper proposes an efficient combination of AEC and DirAC by explicitly exploiting its parametric sound field representation. The approach suppresses the echo components in the microphone signals based solely on the single-channel audio signal used for the DirAC synthesis of the loudspeaker signals.
Convention Paper 8100

17:00

P18-7 Sampling Rate Discrimination: 44.1 kHz vs. 88.2 kHz—Amandine Pras, Catherine Guastavino, McGill University, Montreal, Quebec, Canada

It is currently common practice for sound engineers to record digital music using high-resolution formats, and then downsample the files to 44.1 kHz for commercial release. This study aims at investigating whether listeners can perceive differences between musical files recorded at 44.1 kHz and 88.2 kHz with the same analog chain and type of AD converter. Sixteen expert listeners were asked to compare three versions (44.1 kHz, 88.2 kHz, and the 88.2 kHz version down-sampled to 44.1 kHz) of five musical excerpts in a blind ABX task. Overall, participants were able to discriminate between files recorded at 88.2 kHz and their 44.1 kHz down-sampled versions. Furthermore, for the orchestral excerpt, they were able to discriminate between files recorded at 88.2 kHz and files recorded at 44.1 kHz.
Convention Paper 8101
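Discrimination in a blind ABX task is conventionally judged against chance with a one-sided exact binomial test. The helper below is a generic sketch of that standard analysis, not necessarily the statistics used in the paper:

```python
from math import comb

def abx_p_value(correct, trials):
    """One-sided exact binomial test: probability of getting at least
    `correct` answers out of `trials` by pure guessing (p = 0.5 in ABX)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Example: 14 correct out of 16 trials is very unlikely under guessing,
# so discrimination would be considered significant at the 5% level.
print(round(abx_p_value(14, 16), 4))  # 0.0021
```

A result like 10/16, by contrast, gives p ≈ 0.23 and would not support a claim of audible difference.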

17:30

P18-8 Comparison of Multichannel Audio Decoders for Use in Mobile and Handheld Devices—Manish Nema, Ashish Malot, Nokia India Pvt. Ltd., Bangalore, Karnataka, India

Multichannel audio provides an immersive experience to listeners. Consumer demand coupled with technological improvements will drive consumption of high-definition content in mobile and handheld devices. Several multichannel audio coding algorithms are available in the market, both proprietary ones, like Dolby Digital, Dolby Digital Plus, Windows Media Audio Professional (WMA Pro), and Digital Theater Surround High Definition (DTS-HD), and standard ones, like Advanced Audio Coding (AAC) and MPEG Surround. This paper presents the salient features and coding techniques of important multichannel audio decoders and a comparison of these decoders on key parameters like processor complexity, memory requirements, complexity/features for stereo playback, and quality/coding efficiency. The paper also presents a ranking of these multichannel audio decoders on the key parameters in a single table for easy comparison.
Convention Paper 8102
Paper will be presented by Marina Villanueva-Barreiro.

Session P19 Monday, May 24 14:00 – 15:30 C4-Foyer

POSTERS: PSYCHOACOUSTICS AND LISTENING TESTS

14:00

P19-1 Auditory Perception of Dynamic Range in the Nonlinear System—Andrew J. R. Simpson, Simpson Microphones, West Midlands, UK

This paper is concerned with the perception of dynamic range in the nonlinear system. The work is differentiated from the generic investigation of "sound quality," which is usually associated with studies of nonlinear distortion. The proposed hypothesis suggests that distortion products generated within the compressive-type nonlinear system are able to act as a loudness compensator for the associated amplitude compression in the perceived loudness function. The hypothesis is tested using the Time-Varying Loudness model of Glasberg and Moore (2002) and further tested using AB-scale hidden-reference listening test methods. The Loudness Overflow Effect (LOE) is introduced, and bandwidth is shown to be a significant limiting factor. Results and immediate implications are briefly discussed.
Convention Paper 8103

14:00

P19-2 A Subjective Evaluation of the Minimum Audible Channel Separation in Binaural Reproduction Systems through Loudspeakers—Yesenia Lacouture Parodi, Per Rubak, Aalborg University, Aalborg, Denmark

To evaluate the performance of crosstalk cancellation systems, the channel separation is usually used as a parameter. However, no systematic evaluation of the minimum audible channel separation has been found in the literature known to the authors. This paper describes a set of subjective experiments carried out to evaluate the minimum channel separation needed such that binaural signals with crosstalk are perceived to be equal to binaural signals reproduced without crosstalk. A three-alternative forced-choice discrimination experiment with a simple adaptive algorithm using a weighted up-down method was used. The minimum audible channel separation was evaluated for listeners placed at symmetric and asymmetric positions with respect to the loudspeakers. Eight different stimuli placed at two different locations were evaluated. Span angles of 12 and 60 degrees were also simulated. Results indicate that in order to avoid lateralization the channel separation should be below –15 dB for most of the stimuli and around –20 dB for broadband noise.
Convention Paper 8104
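The adaptive procedure described in the abstract — a weighted up-down staircase driving a forced-choice task — can be sketched as follows. This is a minimal illustration of the Kaernbach-style weighted up-down idea; the starting level, step sizes, and response handling are assumptions, not the authors' actual settings.

```python
def weighted_updown(responses, start=-10.0, step_down=0.5, step_up=1.5):
    """Weighted up-down staircase over channel separation in dB.

    Each correct detection of the crosstalk makes the task harder by
    lowering the separation by step_down; each miss raises it by step_up.
    The asymmetric steps make the track converge on the level where the
    listener is correct with probability step_up / (step_up + step_down).
    """
    level = start
    track = [level]
    for correct in responses:
        level += -step_down if correct else step_up
        track.append(level)
    return track
```

With the default steps the staircase targets the 75%-correct point, which is a common choice for discrimination thresholds.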

14:00

P19-3 Evaluation of Speech Intelligibility in Digital Hearing Aids—Lorena Álvarez, Leticia Vaquero, Enrique Alexandre, Lucas Cuadra, Roberto Gil-Pita, University of Alcalá, Alcalá de Henares, Madrid, Spain

This paper explores the feasibility of using the Speech Intelligibility Index (SII) to evaluate the performance of a digital hearing aid. This standardized measure returns a number between zero and unity that can be interpreted as the proportion of the total speech information available to the listener and correlates with the intelligibility of the speech signal. The paper focuses on the use of the SII as a metric with which to compare the performance of two different hearing aids in terms of speech intelligibility. For the purposes of this work, experiments employing data from four real subjects with mild-to-profound hearing losses, when using these different hearing aids, will be carried out. Results will show how the use of the SII can lead to the selection of one hearing aid over others, while avoiding the need for extensive subjective listening tests.
Convention Paper 8105
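At its core the SII is a band-importance-weighted sum of band audibilities. The toy version below shows only that weighted sum; the band counts and importance weights are illustrative, not the tables and level calculations defined by the ANSI S3.5 standard.

```python
def sii(importance, audibility):
    """Speech Intelligibility Index as sum_i I_i * A_i, where I_i are
    band importance weights (summing to 1) and A_i in [0, 1] is the
    audibility of speech cues in band i."""
    if abs(sum(importance) - 1.0) > 1e-9:
        raise ValueError("band importance weights must sum to 1")
    return sum(I * min(1.0, max(0.0, A))
               for I, A in zip(importance, audibility))
```

If only the two lowest of four equally important bands are fully audible, the index is 0.5, i.e., half the speech information is available.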


36 Audio Engineering Society 128th Convention Program, 2010 Spring

14:00

P19-4 Method to Improve Speech Intelligibility in Different Noise Conditions—Rogerio G. Alves, Kuan-Chieh Yen, Michael C. Vartanian, Sameer A. Gadre, CSR - Cambridge Silicon Radio, Auburn Hills, MI, USA

Mobile communication applications have to address various environmental noise situations. In order to improve the quality of voice communication, not only is an effective noise reduction algorithm for the far-end user wanted, but also an algorithm that helps improve intelligibility for the near-end user in different environmental noise situations. The goal of this paper is therefore to improve the overall voice communication experience of mobile device users by introducing a method that improves intelligibility by increasing the perceptual loudness of the received speech signal in accordance with the environmental noise. Due to the characteristics of mobile device applications, a method capable of working in real time with low computational complexity is highly desirable.
Convention Paper 8106

14:00

P19-5 Contemporary Theories of Tinnitus Generation, Diagnosis, and Management Practices—Stamatia Staikoudi, Queen Margaret University, Edinburgh, Scotland, UK

Tinnitus can be defined as the perception of a sound in the head or the ears in the absence of external acoustic stimulation. It is a symptom experienced by more than seven million people across the UK and many more worldwide, including children. Its pitch may vary from individual to individual, and it can be described as ringing, whistling, humming, or buzzing, amongst others. We will be looking at contemporary theories for its generation, current methods of diagnosis, and management practices.
Convention Paper 8107

14:00

P19-6 Analytical and Perceptual Evaluation of Nonlinear Devices for Virtual Bass System—Nay Oo, Woon-Seng Gan, Nanyang Technological University, Singapore

Nonlinear devices (NLDs) are generally used in virtual bass systems (VBS). The prime objective is to extend the low-frequency bandwidth psychoacoustically by generating a series of harmonics in the upper-bass and/or mid-frequency range, where loudspeakers reproduce well. However, these artificially added harmonics introduce intermodulation distortion and may change the timbre of the sound tracks. In this paper nine memoryless NLDs are studied based on objective analysis and subjective listening tests. The objectives of this paper are (1) to quantify the spectral content of NLDs when fed a single tone; (2) to find out which type of NLD is best for psychoacoustic bass enhancement, through subjective listening tests and the objective GedLee nonlinear distortion metric; and (3) to investigate whether there is any correlation between subjective listening test results and objective performance scores.
Convention Paper 8108

Workshop 11 Monday, May 24
14:00 – 15:45 Room C1

SURROUND FOR SPORTS

Presenters: Martin Black, Senior Sound Supervisor & Technical Consultant, BSkyB, UK
Peter Davey, Audio Quality Supervisor at Beijing 2008 Olympics and Vancouver 2010 Olympics
Ian Rosam, 5.1 Audio Quality Supervisor for FIFA World Cup, Euro 2008, Beijing 2008 and Vancouver 2010 Olympics

Surround for Sports is an increasingly important area of multichannel audio. It is a de facto standard for large-scale productions like the Olympics or the football World and European championships.
The presenters, all experienced audio supervisors for such events, will touch on a variety of subjects regarding surround sound design for sports and many practical issues such as: crowd/audience; field-of-play FX; game sounds, e.g., ball kicks; competitors, e.g., curling; referees/umpires, e.g., rugby/tennis; board sounds, e.g., darts/basketball; scoring/timing, e.g., fencing buzzers, boxing time bell; commentators/reporters out of vision; commentators/presenters/reporters in vision. Where those elements should sit in a 5.1 mix will be discussed, as well as: use of the center channel, use of the LFE, HDTV stereo fold-down in a set-top box, Dolby E, metadata, bass management, and, last but not least, localization of sounds with 5.1 and human hearing.

Tutorial 7 Monday, May 24
14:00 – 15:45 Room C6

SCREEN CURRENT INDUCED NOISE (SCIN), SHIELDING, AND GROUNDING—MYTHS VS. REALITY

Presenters: Bruce C. Olson, Olson Sound Design, Brooklyn Park, USA
John Woodgate, J M Woodgate and Associates, Rayleigh, UK

Since the landmark series of AES Journal articles on Screen Current Induced Noise (SCIN), shielding, and the Pin 1 problem was published in June 1995, there has been much discussion of the results and their implications for audio systems. This tutorial will demonstrate SCIN and show a Spice model that helps to explain what is happening. These effects are often confused with grounding and shielding schemes, so we will also explain which interactions are real and which ones are mythical.

Tutorial 8 Monday, May 24
14:00 – 15:45 Room C2

LES PAUL: WE USE HIS INNOVATIONS EVERY DAY

Presenter: Barry Marshall, New England Institute of Art, Brookline, MA, USA

The many contributions of the late Les Paul to the art and technology of recording were often overlooked in media coverage of his passing last year at the age of 94. While his importance as a musician and as an innovator


of the solid body electric guitar certainly deserve wide praise and respect, his developments in recording continue to be used by producers, engineers, and musicians at virtually every recording session. Producer and educator Barry Marshall looks at the career of Les Paul and plays some of his ground-breaking recordings as a sideman, as a solo artist, and as a part of the Les Paul and Mary Ford duo act. Special emphasis in the presentation will be placed on the way that Les’s musicianship and musical instincts drove his technical breakthroughs.

Student Event/Career Development
MENTORING SESSIONS
Monday, May 24, 14:00 – 15:45
C4-Foyer

Students are invited to sign up for an individual meeting with a distinguished mentor from the audio industry. The opportunity to sign up will be given at the end of the opening SDA meeting. Any remaining open spots will be posted in the student area. All students are encouraged to participate in this exciting and rewarding opportunity for individual discussion with industry mentors.

Monday, May 24 14:00 Room Saint Julien
Technical Committee Meeting on Loudspeakers and Headphones

Monday, May 24 14:30 Room Mouton Cadet
Standards Committee Meeting SC-03-06 Digital Library and Archive Systems

Monday, May 24 15:00 Room Saint Julien
Technical Committee Meeting on Audio for Telecommunications

Workshop 12 Monday, May 24
16:00 – 18:00 Room C6

AUDIO HISTORY, ARCHIVING, AND RESTORATION

Chair: Sean W. Davies, S.W. Davies Ltd., Aylesbury, UK

Panelists: Ted Kendall, Sound Restorer, Kington, UK
John Liffen, The Science Museum, London, UK
Will Prentice, The National Sound Archive, British Library, London, UK
Nadia Walaskowitz, Phonogram Archive, Vienna, Austria

This workshop covers: (a) the preservation of historic audio equipment, both as exhibits and as working apparatus required for playback of historic formats; (b) the arrangement and management of archives/collections; (c) the preservation of the audio content for distribution.

Workshop 13 Monday, May 24
16:00 – 18:00 Room C2

LOUDNESS IN BROADCASTING—THE NEW EBU RECOMMENDATION R128

Chair: Andrew Mason, BBC R&D, London, UK

Panelists: Jean-Paul Moerman, Salzbrenner Stagetec Media Group, Buttenheim, Germany
Richard van Everdingen, Dutch Broadcasting Loudness Committee

The EBU group P/LOUD is approaching the final stage of its work, which will result in recommendations that will have a profound effect on any audio production in broadcasting. The gradual switch from peak to loudness normalization, combined with a new maximum true peak level and the use of the descriptor “loudness range,” makes it possible for the first time to fully characterize the audio part of a program. More importantly, it has the potential to solve the most frequent complaint of listeners, that of severe level inconsistencies. This is the first time that the new EBU loudness recommendation R128 is presented in detail, alongside a detailed introduction to the subject as well as practical case studies.

Student Event/Career Development
EDUCATION FORUM PANEL
Monday, May 24, 16:00 – 18:00
Room C6

Chair: Gary Gottlieb

Panelists: Chuck Ainlay
Akira Fukada
George Massenburg
Ulrike Schwarz

How Does It Sound Now? The Evolution of Audio

One day Chet Atkins was playing guitar when a woman approached him. She said, “That guitar sounds beautiful.” Chet immediately quit playing. Staring her in the eyes he asked, “How does it sound now?” The quality of the sound in Chet’s case clearly rested with the player, not the instrument, and the quality of our product ultimately lies with us as engineers and producers, not with the gear we use. The dual significance of this question, “How does it sound now,” informs our discussion, since it addresses both the engineer as the driver and the changes we have seen and heard as our business and methodology have evolved through the decades. Let’s start by exploring the methodology employed by the most successful among us when confronted with new and evolving technology. How do we retain quality and continue to create a product that conforms to our own high standards? Is this even possible for students in today’s market, with the option of true apprenticeships all but gone? How can students develop chops, aesthetics, and technical knowledge? How can they become prepared to thrive and produce a quality product? Can we help give educators the tools they need to address the issue of continuity of quality? This may lead to other conversations about the musicians we work with, the consumers we serve, and the differences and similarities between their standards and our own. How high should your standards be? How should it sound now? How should it sound tomorrow?

Monday, May 24 16:00 Room Saint Julien
Technical Committee Meeting on Audio for Games

Monday, May 24 16:00 Room Mouton Cadet
Standards Committee Meeting SC-03-07 Audio Metadata


Session P20 Monday, May 24
16:30 – 18:00 C4-Foyer

POSTERS: AUDIO CONTENT MANAGEMENT—AUDIO INFORMATION RETRIEVAL

16:30

P20-1 Complexity Scalable Perceptual Tempo Estimation From HE-AAC Encoded Music—Danilo Hollosi,1 Arijit Biswas2
1Ilmenau University of Technology, Ilmenau, Germany
2Dolby Germany GmbH, Nürnberg, Germany

A modulation frequency-based method for perceptual tempo estimation from HE-AAC encoded music is proposed. The method is designed to work in the fully decoded PCM domain; in the intermediate HE-AAC transform domain after partial decoding; and directly in the HE-AAC compressed domain using the Spectral Band Replication (SBR) payload. This offers complexity-scalable solutions. We demonstrate that the SBR payload is an ideal proxy for tempo estimation directly from HE-AAC bit-streams without even decoding them. A perceptual tempo correction stage based on rhythmic features is proposed to correct for octave errors in every domain. Experimental results show that the proposed method significantly outperforms two commercially available systems, both in terms of accuracy and computational speed.
Convention Paper 8109
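In the PCM domain, the underlying idea — tempo as the dominant periodicity of an onset-strength envelope — can be sketched as below. This is a generic baseline, not the paper's SBR-payload method; the frame rate and BPM search range are assumptions.

```python
import numpy as np

def tempo_from_envelope(envelope, frame_rate, bpm_range=(60, 180)):
    """Estimate tempo (BPM) as the strongest autocorrelation lag of an
    onset-strength envelope sampled at frame_rate frames per second."""
    env = envelope - envelope.mean()
    ac = np.correlate(env, env, mode="full")[len(env) - 1:]
    lo = int(round(frame_rate * 60.0 / bpm_range[1]))   # shortest lag
    hi = int(round(frame_rate * 60.0 / bpm_range[0]))   # longest lag
    lag = lo + int(np.argmax(ac[lo:hi + 1]))
    return 60.0 * frame_rate / lag
```

The octave errors the abstract mentions show up here as peaks at half or double the true lag; the paper's perceptual correction stage addresses exactly that ambiguity.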

16:30

P20-2 On the Effect of Reverberation on Musical Instrument Automatic Recognition—Mathieu Barthet, Mark Sandler, Queen Mary University of London, London, UK

This paper investigates the effect of reverberation on the accuracy of a musical instrument recognition model based on Line Spectral Frequencies and K-means clustering. One hundred eighty experiments were conducted by varying the type of music database (isolated notes, solo performances), the stage at which the reverberation is added (learning and/or testing), and the type of reverberation (3 different reverberation times, 10 different dry-wet levels). The performance of the model systematically decreased when reverberation was added at the testing stage (by up to 40%). Conversely, when reverberation was added at the training stage, a 3% increase in performance was observed for the solo performances database. The results suggest that pre-processing the signals with a dereverberation algorithm before classification may be a means to improve musical instrument recognition systems.
Convention Paper 8110

16:30

P20-3 Harmonic Components Extraction in Recorded Piano Tones—Carmine Emanuele Cella, Università di Bologna, Bologna, Italy

It is sometimes desirable, for the purpose of analyzing recorded piano tones, to remove from the original signal the noisy components generated by the hammer strike and by other elements involved in the piano action. In this paper we propose an efficient method to achieve this result, based on adaptive filtering and automatic estimation of fundamental frequency and inharmonicity; the final method, applied to a recorded piano tone, produces two separate signals containing, respectively, the hammer knock and the harmonic components. Some sound examples to listen to for evaluation are available on the web, as specified in the paper.
Convention Paper 8111
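Inharmonicity estimation matters here because a piano's partial series is stretched relative to the exact harmonic series. The standard stiff-string model places the n-th partial at f_n = n·f0·√(1 + B·n²); this sketch shows only that formula (the B value in the usage note is made up, and this is not the paper's estimator).

```python
import math

def partial_frequencies(f0, B, n_partials):
    """Stretched partial series of a stiff string: the n-th partial lies
    at n * f0 * sqrt(1 + B * n^2), where B is the inharmonicity
    coefficient (B = 0 recovers the exact harmonic series)."""
    return [n * f0 * math.sqrt(1.0 + B * n * n)
            for n in range(1, n_partials + 1)]
```

For example, with f0 = 100 Hz and a hypothetical B = 4e-4, the 10th partial falls near 1020 Hz rather than 1000 Hz — which is why harmonic extraction must track the stretched series rather than integer multiples of f0.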

16:30

P20-4 Browsing Sound and Music Libraries by Similarity—Stéphane Dupont,1 Christian Frisson,2 Xavier Siebert,1 Damien Tardieu1
1Université de Mons, Mons, Belgium
2Université Catholique de Louvain, Louvain-la-Neuve, Belgium

This paper presents a prototype tool for browsing through multimedia libraries using content-based multimedia information retrieval techniques. It is composed of several groups of components for multimedia analysis, data mining, and interactive visualization, as well as connection with external hardware controllers. The musical application of this tool uses descriptors of timbre, harmony, and rhythm, and two different approaches for exploring/browsing content. First, a dynamic data-mining mode allows the user to group sounds into clusters according to those different criteria, whose importance can be weighted interactively. In a second mode, sounds that are similar to a query are returned to the user and can be used to further proceed with the search. This approach also borrows from multi-criteria optimization concepts to return a relevant list of similar sounds.
Convention Paper 8112

16:30

P20-5 On the Development and Use of Sound Maps for Environmental Monitoring—Maria Rangoussi,1 Stelios M. Potirakis,1 Ioannis Paraskevas,1 Nicolas–Alexander Tatlas2
1Technological Education Institute of Piraeus, Aigaleo-Athens, Greece
2University of Patras, Patras, Greece

The development, update, and use of sound maps for the monitoring of areas of environmental interest is addressed in this paper. Sound maps constitute a valuable tool for environmental monitoring. They rely on networks of microphones distributed over the area of interest to record and process signals, extract and characterize sound events, and finally form the map; time constraints are imposed by the need for timely information representation. A stepwise methodology is proposed and a series of practical considerations are discussed with the aim of obtaining a multi-layer sound map that is periodically updated and visualizes the sound content of a “scene.” Alternative time-frequency-based features are investigated as to their efficiency within the framework of a hierarchical classification structure.


Convention Paper 8113
Paper presented by Nicolas–Alexander Tatlas

16:30

P20-6 The Effects of Reverberation on Onset Detection Tasks—Thomas Wilmering, György Fazekas, Mark Sandler, Queen Mary University of London, London, UK

The task of onset detection is relevant in various contexts such as music information retrieval and music production, while reverberation has always been an important part of the production process. The effect may be the product of the recording space or it may be artificially added, and, in our context, it is destructive. In this paper we evaluate the effect of reverberation on onset detection tasks. We compare state-of-the-art techniques and show that the algorithms have varying degrees of robustness in the presence of reverberation, depending on the content of the analyzed audio material.
Convention Paper 8114
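One of the standard detection functions such evaluations compare is spectral flux — the frame-to-frame increase in magnitude spectrum. A minimal sketch follows (the frame size, hop, and median threshold are assumptions, not the paper's configuration); reverberation smears energy across frames, which is exactly what degrades this kind of detector.

```python
import numpy as np

def spectral_flux_onsets(x, frame=1024, hop=512, k=1.5):
    """Return indices of frames whose positive spectral change exceeds
    an adaptive (median-based) threshold."""
    win = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    mags = np.array([np.abs(np.fft.rfft(win * x[i * hop:i * hop + frame]))
                     for i in range(n_frames)])
    # half-wave-rectified frame-to-frame difference, summed over bins
    flux = np.maximum(np.diff(mags, axis=0), 0.0).sum(axis=1)
    thresh = k * np.median(flux) + 1e-9
    return [i + 1 for i in range(len(flux)) if flux[i] > thresh]
```

On a silence-then-tone signal, the detector flags the frames where the tone enters and nothing else.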

16:30

P20-7 Segmentation and Discovery of Podcast Content—Steven Hargreaves, Chris Landone, Mark Sandler, Panos Kudumakis, Queen Mary University of London, London, UK

With ever-increasing amounts of radio broadcast material being made available as podcasts, sophisticated methods of enabling the listener to quickly locate material matching their own personal tastes become essential. Given the ability to segment a podcast that may be on the order of one or two hours in duration into individual song previews, the time the listener spends searching for material of interest is minimized. This paper investigates the effectiveness of applying multiple feature extraction techniques to podcast segmentation and describes how such techniques could be exploited by a vast number of digital media delivery platforms in a commercial cloud-based radio recommendation and summarization service.
Convention Paper 8115

Monday, May 24 16:00 Room Mouton Cadet
Standards Committee Meeting SC-03-04 Storage and Handling of Media

Special Event
BANQUET
Monday, May 24, 20:00 – 22:30
Kew Bridge Steam Museum
Green Dragon Lane
Brentford, Middlesex

This year the Banquet will take place at one of London’s most spectacular and dramatic venues—the Kew Bridge Steam Museum. The museum is set within a magnificent and atmospheric Victorian waterworks, where industry sits side by side with elegance. Guests enter via the Grand Junction Engine House and can enjoy a venue with a difference. Dinner is served inside the Steam Hall while the majestic steam engines work around you. The ticket price includes all food and drinks and the bus to the museum and back. Buses will leave Novotel at 19:00 and 19:30.

£60 for AES members and nonmembers. Tickets will be available at the Special Events desk.

Session P21 Tuesday, May 25
09:00 – 13:00 Room C3

MULTICHANNEL AND SPATIAL AUDIO: PART 2

Chair: Ronald Aarts, Philips Research Laboratories, Eindhoven, The Netherlands

09:00

P21-1 Center-Channel Processing in Virtual 3-D Audio Reproduction over Headphones or Loudspeakers—Jean-Marc Jot, Martin Walsh, DTS Inc., Scotts Valley, CA, USA

Virtual 3-D audio processing systems for the spatial enhancement of recordings reproduced over headphones or frontal loudspeakers generally provide a less compelling effect on center-panned sound components. This paper examines this deficiency and presents virtual 3-D audio processing algorithm modifications that provide a compelling spatial enhancement effect over headphones or loudspeakers even for sound components localized in the center of the stereo image, ensure the preservation of the timbre and balance of the original recording, and produce a more stable “phantom center” image over loudspeakers. The proposed improvements are applicable, in particular, in laptop and TV-set audio systems, mobile internet devices, and home theater “soundbar” loudspeakers.
Convention Paper 8116

09:30

P21-2 Parametric Representation of Complex Sources in Reflective Environments—Dylan Menzies, De Montfort University, Leicester, UK

Aspects of source directivity in reflective environments are considered, including the audible effects of directivity and how these can be reproduced. Different methods of encoding and production are presented, leading to a new approach to extend parametric encoding of reverberation, as described in the DirAC and MPEG formats, to include the response to source directivity.
Convention Paper 8118

10:00

P21-3 Analysis and Improvement of Pre-Equalization in 2.5-Dimensional Wave Field Synthesis—Sascha Spors, Jens Ahrens, Technische Universität Berlin, Berlin, Germany

Wave field synthesis (WFS) is a well established high-resolution spatial sound reproduction technique. Typical WFS systems aim at reproduction in a plane using loudspeakers enclosing the plane. This constitutes a so-called 2.5-dimensional reproduction scenario. It has been shown that a spectral correction of the reproduced wave field is required in this context. For WFS this correction is known as the pre-equalization filter. The derivation of WFS is based on a series


of approximations of the physical foundations. This paper investigates the consequences of these approximations on the reproduced sound field and in particular on the pre-equalization filter. An exact solution is provided by the recently presented spectral division method and is employed in order to derive an improved WFS driving function. Furthermore, the effects of spatial sampling and truncation on the pre-equalization are discussed.
Convention Paper 8121

10:30

P21-4 Discrete Wave Field Synthesis Using Fractional Order Filters and Fractional Delays—César D. Salvador, Universidad de San Martin de Porres, Lima, Peru

A discretization of the generalized 2.5D Wave Field Synthesis driving functions is proposed in this paper. Time discretization is applied with special attention to the prefiltering, which involves half-order systems, and to the delaying, which involves fractional-sample delays. Space discretization uses uniformly distributed loudspeakers along arbitrarily shaped contours: visual and numerical comparisons between lines and convex arcs, and between squares and circles, are shown. An immersive soundscape composed of nature sounds is reported as an example. Modeling uses MATLAB and real-time reproduction uses Pure Data. Simulations of synthesized plane and spherical wave fields, in the whole listening area, report a discretization percentage error of less than 1%, using 16 loudspeakers and 5th order IIR prefilters.
Convention Paper 8122
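The fractional-sample delays in the driving functions can be illustrated with first-order (linear) interpolation — a deliberately simple stand-in, since the paper uses proper fractional-delay filters, and the half-order prefilters are not shown here.

```python
import numpy as np

def fractional_delay(x, delay):
    """Delay signal x by a possibly non-integer number of samples
    using linear interpolation between the two nearest samples.
    Samples shifted in from before the signal start are taken as 0."""
    n = int(np.floor(delay))      # integer part of the delay
    frac = delay - n              # fractional part in [0, 1)
    y = np.zeros(len(x))
    for i in range(len(x)):
        a = x[i - n] if 0 <= i - n < len(x) else 0.0
        b = x[i - n - 1] if 0 <= i - n - 1 < len(x) else 0.0
        y[i] = (1.0 - frac) * a + frac * b
    return y
```

A unit impulse delayed by 1.5 samples is split equally across samples 2 and 3, which is the interpolation behavior higher-order fractional-delay filters refine.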

11:00

P21-5 Immersive Virtual Sound Beyond 5.1 Channel Audio—Kangeun Lee, Changyong Son, Dohyung Kim, Samsung Advanced Institute of Technology, Suwon, Korea

In this paper a virtual sound system is introduced for the next generation of multichannel audio. The sound system provides 9.1-channel surround sound via a conventional 5.1 loudspeaker layout and content. In order to deliver 9.1 sound, the system includes channel upmixing and vertical sound localization that can create virtually localized sound anywhere on a spherical surface around the human head. Amplitude panning coefficients are used for channel upmixing, which includes a smoothing technique to reduce the musical noise caused by upmixing. The proposed vertical rendering is based on VBAP (vector base amplitude panning) using three loudspeakers among the 5.1. For the quality test, our upmixing and virtual rendering methods are evaluated on real 9.1 and 5.1 loudspeaker layouts, respectively, and compared with Dolby Pro Logic IIz. The demonstrated performance is superior to the references.
Convention Paper 8117
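For a loudspeaker triplet, VBAP reduces to a 3×3 linear solve: the source direction is expressed as a weighted sum of the three loudspeaker direction vectors, and the weights become the gains. This sketch shows only that core step (speaker directions and energy normalization here are illustrative; the paper's upmix and smoothing stages are not modeled).

```python
import numpy as np

def vbap_gains(source_dir, ls_dirs):
    """VBAP (Pulkki) gains for one triplet: solve g @ L = p, where the
    rows of L are the loudspeaker unit vectors and p is the source
    unit vector, then normalize for constant energy."""
    L = np.array([d / np.linalg.norm(d) for d in ls_dirs])  # 3x3
    p = np.asarray(source_dir, float)
    p = p / np.linalg.norm(p)
    g = p @ np.linalg.inv(L)           # unnormalized gains
    if np.any(g < -1e-9):              # negative gain: source outside triplet
        raise ValueError("source direction not inside loudspeaker triplet")
    return g / np.linalg.norm(g)       # energy normalization
```

A source on the diagonal between three orthogonally placed speakers receives equal gains, as expected by symmetry.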

11:30

P21-6 Acoustical Zooming Based on a Parametric Sound Field Representation—Richard Schultz-Amling, Fabian Kuech, Oliver Thiergart, Markus Kallinger, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

Directional audio coding (DirAC) is a parametric approach to the analysis and reproduction of spatial sound. The DirAC parameters, namely the direction of arrival and the diffuseness of the sound, can be further exploited in modern teleconferencing systems. Based on the directional parameters, we can control a video camera to automatically steer toward the active talker. In order to provide consistency between the visual and acoustical cues, the virtual recording position should match the visual movement. In this paper we present an approach for an acoustical zoom that provides audio rendering following the movement of the visual scene. The algorithm does not rely on a priori information regarding the sound reproduction system, as it operates directly in the DirAC parameter domain.
Convention Paper 8120

12:00

P21-7 SoundDelta: A Study of Audio Augmented Reality Using WiFi-Distributed Ambisonic Cell Rendering—Nicholas Mariette,1 Brian F. G. Katz,1 Khaled Boussetta,2 Olivier Guillerminet3
1LIMSI-CNRS, Orsay, France
2Université Paris 13, Paris, France
3REMU, Paris, France

SoundDelta is an art/research project that produced several public audio augmented reality artworks. These spatial soundscapes comprised virtual sound sources located in a designated terrain such as a town square. Pedestrian users experienced the result as interactive binaural audio by walking through the augmented terrain, using headphones and the SoundDelta mobile device. SoundDelta uses a distributed “Ambisonic cell” architecture that scales efficiently for many users. A server renders Ambisonic audio for fixed user positions, which is streamed wirelessly to mobile users who render a custom, individualized binaural mix for their present position. A spatial cognition mapping experiment was conducted to validate the soundscape perception and compare it with an individual rendering system.
Convention Paper 8123

12:30

P21-8 Surround Sound Panning Technique Based on a Virtual Microphone Array—Filippo M. Fazi,1 Toshiro Yamada,2 Suketu Kamdar,2 Philip A. Nelson,1 Peter Otto2
1University of Southampton, Southampton, UK
2University of California, San Diego, La Jolla, CA, USA

A multichannel panning technique is presented that aims at reproducing a plane wave with an array of loudspeakers. The loudspeaker gains are computed by solving an acoustical inverse problem. The latter involves the inversion of a matrix of transfer functions between the loudspeakers and the elements of a virtual microphone array, the center of which corresponds to the location of the listener. The radius of the virtual microphone array is varied consistently with the frequency, in such a way that the transfer


function matrix is independent of the frequency. As a consequence, the inverse problem is solved for one frequency only, and the loudspeaker coefficients obtained can be implemented using simple gains.
Convention Paper 8119
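The inverse problem has the shape of a standard least-squares solve: a matrix H of loudspeaker-to-virtual-microphone transfer functions, a target plane-wave pressure vector p, and gains g minimizing ||Hg − p||. The free-field sketch below illustrates that shape only; the point-source model, geometry, and absence of regularization are assumptions, and the paper's frequency-dependent array radius is not modeled.

```python
import numpy as np

def panning_gains(ls_pos, mic_pos, wave_dir, k):
    """Least-squares loudspeaker gains g so that point sources at
    ls_pos approximate a unit plane wave (direction wave_dir,
    wavenumber k) at the virtual microphone positions mic_pos.
    Returns the gains and the relative residual of the fit."""
    # free-field Green's function from each loudspeaker to each mic
    d = np.linalg.norm(mic_pos[:, None, :] - ls_pos[None, :, :], axis=2)
    H = np.exp(-1j * k * d) / (4.0 * np.pi * d)
    # target plane-wave pressure sampled at the microphones
    kvec = k * np.asarray(wave_dir, float) / np.linalg.norm(wave_dir)
    p = np.exp(-1j * mic_pos @ kvec)
    g = np.linalg.lstsq(H, p, rcond=None)[0]
    resid = np.linalg.norm(H @ g - p) / np.linalg.norm(p)
    return g, resid
```

In practice such transfer matrices become ill-conditioned as the microphone array shrinks relative to the wavelength, which is one motivation for choosing the array radius as a function of frequency.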

Session P22 Tuesday, May 25
09:00 – 12:00 Room C5

MICROPHONES, CONVERTERS, AND AMPLIFIERS

Chair: Mark Sandler, Queen Mary University of London, London, UK

09:00

P22-1 A Comparison of Phase-Shift Self-Oscillating and Carrier-Based PWM Modulation for Embedded Audio Amplifiers—Alexandre Huffenus,1 Gaël Pillonnet,1 Nacer Abouchi,1 Frédéric Goutti2
1Lyon Institute of Nanotechnology, Villeurbanne, France
2STMicroelectronics, Grenoble, France

This paper compares two modulation schemes for class-D amplifiers: phase-shift self-oscillating (PSSO) and carrier-based pulse width modulation (PWM). Theoretical analysis (modulation, frequency of oscillation, bandwidth, etc.), design procedure, and IC silicon evaluation will be shown for mono and stereo operation (on the same silicon die) for both structures. The design of both architectures uses as many identical building blocks as possible, to provide a fair, “all else being equal,” comparison. THD+N performance and idle consumption went from 0.02% and 5.6 mA in PWM to 0.007% and 5.2 mA in self-oscillating. Other advantages and drawbacks of the self-oscillating structure will be explained and compared to the classical carrier-based PWM one, with a focus on battery-powered applications.
Convention Paper 8127
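The classical carrier-based scheme compares the audio signal against a high-frequency triangle carrier, so the pulse train's duty cycle tracks the instantaneous signal level. A numeric sketch of that comparison (the uniform-sampling scheme and carrier ratio here are illustrative, not the paper's silicon implementation):

```python
import numpy as np

def pwm(signal, carrier_ratio=200):
    """Uniform-sampling carrier-based PWM: each audio sample in [-1, 1]
    is held over one triangle-carrier period; the output is 1 while the
    held sample exceeds the carrier, else 0. The local duty cycle then
    approximates (signal + 1) / 2."""
    held = np.repeat(np.asarray(signal, float), carrier_ratio)
    phase = (np.arange(held.size) % carrier_ratio) / carrier_ratio
    carrier = 4.0 * np.abs(phase - 0.5) - 1.0    # triangle wave in [-1, 1]
    return (held > carrier).astype(float)
```

A self-oscillating modulator produces a comparable pulse train without the explicit carrier, deriving its switching frequency from loop phase shift instead.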

09:30

P22-2 Digital-Input Class-D Audio Amplifier—Hassan Ihs, Christian Dufaza, Primachip SAS, Marseille, France

A digital-input class-D audio amplifier not only converts the digital PCM-coded audio signal directly into power, it also exhibits superior performance with respect to traditional PWM analog class-D amplifiers. In the latter, several analog techniques have to be deployed to combat the many side effects inherent to analog blocks, resulting in complex circuitry. The digital domain provides many degrees of freedom that allow signal non-idealities to be combated at little or no extra cost. As the output to the real world is analog, a digital class-D amplifier requires analog-to-digital conversion (ADC) in feedback to compensate for jitter, signal distortion, and power supply noise. With careful sigma-delta modulation design and a few digital techniques, the class-D loop is stabilized and achieves superior audio performance. The audio signal cascade chain is tremendously simplified, resulting in a significant reduction in area cost and power consumption. For applications that require analog input processing, the digital class-D still accepts analog inputs at no extra cost.
Convention Paper 8128

10:00

P22-3 Digital PWM Amplifier Using Nonlinear Feedback and Predistortion—Peter Craven,1 Larry Hand,2 Brian Attwood,3 Jack Andersen4
1Algol Applications Ltd., Steyning, West Sussex, UK
2Intersil/D2Audio, Austin, TX, USA
3PWM Systems, Crawley, Sussex, UK
4D2Audio, Austin, TX, USA (deceased)

A nonlinear feedback topology is used to reduce the deviations of a practical PWM output stage from ideal theoretical behavior, the theoretical nonlinearity of the PWM process being corrected using predistortion. As the final output is analog, an ADC is needed if the feedback is to be digital, and several problems arise from the practical limitations of commercially available ADCs, including delay and the addition of ultrasonic noise. We show how these problems can be minimized and illustrate the performance of a digital PWM amplifier in which feedback results in a significant reduction of distortion throughout the audio range.
Convention Paper 8129

10:30

P22-4 Microphone Choice: Large or Small, Single or Double?—Martin Schneider, Georg Neumann GmbH, Berlin, Germany

How do large and small diaphragm condenser microphones differ? A common misapprehension is that large capsules necessarily become less directional at low frequencies. It is shown that this is not a question of large or small, but rather of single or double diaphragm design. The different behaviors have a direct impact on the sound engineer's choice and placement of microphone. Likewise, the much debated question of proximity effect with multi-pattern microphones and omnidirectional directivity is discussed.
Convention Paper 8124

11:00

P22-5 Improvements on a Low-Cost Experimental Tetrahedral Ambisonic Microphone—Dan T. Hemingson, Mark J. Sarisky, University of Texas at Austin, Austin, TX, USA

An earlier paper [Hemingson and Sarisky, “A Practical Comparison of Three Tetrahedral Ambisonic Microphones,” 126th AES Convention, Munich, May 2009] compared two low-cost tetrahedral ambisonic microphones, an experimental microphone and a Core Sound TetraMic, using a Soundfield MKV or SPS422B as a standard for comparison. This paper examines improvements to the experimental device, including those suggested in the “future work” section of the original paper. Modifications to the capsules and a redesign of the electronics package made significant improvements in the experimental microphone. Recordings were made in natural environments and of live performances, some simultaneously with the Soundfield standard. Of interest is the use of the low-cost surround microphone for student and experimental education.
Convention Paper 8125

11:30

P22-6 Analysis of the Interaction between Ribbon Motor, Transformer, and Preamplifier and its Application in Ribbon Microphone Design—Julian David, Audio Engineering Associates, Pasadena, CA, USA; University of Applied Sciences Düsseldorf, Düsseldorf, Germany

The transformer in ribbon microphones interacts with the complex impedances of both the ribbon motor and the subsequent preamplifier. This paper presents a test setup that takes the influences of source and load impedances into account for predicting the amplitude and phase response of a specific ribbon microphone design in combination with different transformers. As a result, the effect of the transformer on the amplitude and phase response of the system can be simulated in good approximation by means of a generic low-impedance source. This allows for optimizing the ribbon/transformer/load circuit under laboratory conditions in order to achieve the desired microphone performance.
Convention Paper 8126

Workshop 14 Tuesday, May 25 09:00 – 10:45 Room C6

10 THINGS TO GET RIGHT IN PA AND SOUND REINFORCEMENT

Chair: Peter Mapp, Peter Mapp and Associates, Colchester, UK

Panelists: Mark Bailey, JBL/Harman International, UK
Jason Baird, Martin Audio, High Wycombe, UK
Chris Foreman, Community Professional Loudspeakers, Chester, PA, USA
Ralph Heinz, Renkus Heinz, Foothill Ranch, CA, USA
Glenn Leembruggen, Acoustic Directions, Leichhardt, NSW, Australia

The workshop will discuss the 10 most important things to get right when designing and operating sound reinforcement and PA systems. However, there are many more things to consider than just the ten golden rules, and the order of importance of these often changes depending upon the venue and type of system. The workshop aims to provide a practical approach to sound system design and operation and will be illustrated with many practical examples and case histories. Each workshop panelist has many years of practical experience, and between them they can cover just about any aspect of sound reinforcement and PA system design, operation, and technology. Come along to a workshop that aims to answer questions you never knew you had; of course, to find out the ten most important ones you will need to attend the session.

Workshop 15 Tuesday, May 25 09:00 – 10:45 Room C1

SINGLE-UNIT SURROUND MICROPHONES

Chair: Eddy B. Brixen, EBB-consult, Smørum, Denmark

Panelists: Mikkel Nymand, DPA Microphones, Allerød, Denmark
Mattias Strömberg, Milad, Helsingborg, Sweden
Helmut Wittek, SCHOEPS Mikrofone GmbH, Karlsruhe, Germany

The workshop will present available single-unit surround sound microphones in a kind of shootout. There are a number of these microphones available and more units are on their way. These microphones are based on different principles. However, due to their compact sizes there may or may not be restrictions on their performance. This workshop will present the different products and the ideas and theories behind them.

Tutorial 9 Tuesday, May 25 09:00 – 10:45 Room C2

COMPRESSION FX—USE YOUR POWER FOR GOOD, NOT EVIL

Presenter: Alex U. Case, University of Massachusetts Lowell, MA, USA

Dynamic range compression, so often avoided by the purist and damned by the press, is most enthusiastically embraced by pop music creators. As an audio effect, it can be easily overused. Reined in, it can be difficult to perceive. It is always difficult to describe. As a tool, its controls can be counterintuitive, and its meters and flashing lights uninformative. In this tutorial—rich with audio examples—Case organizes the broad range of iconic effects created by audio compressors as they are used to reduce and control dynamic range, increase perceived loudness, improve intelligibility and articulation, reshape the amplitude envelope, add creative doses of distortion, and extract ambience cues, breaths, squeaks, and rattles. Learn when pop engineers reach for compression, know what sort of parameter settings are used (ratio, threshold, attack, and release), and advance your understanding of what to listen for and which way to tweak.

Tuesday, May 25 09:00 Room Saint Julien
Standards Committee Meeting SC-02-12 Audio Applications of Networks

Session P23 Tuesday, May 25 10:30 – 12:00 C4-Foyer

POSTERS: AUDIO PROCESSING—MUSIC AND SPEECH SIGNAL PROCESSING

10:30

P23-1 Beta Divergence for Clustering in Monaural Blind Source Separation—Martin Spiertz, Volker Gnann, RWTH Aachen University, Aachen, Germany

General purpose audio blind source separation algorithms have to deal with a large dynamic range for the different sources to be separated. In the algorithm used, the mixture is separated into single notes. These notes are clustered to construct the melodies played by the active sources. Non-negative matrix factorization (NMF) leads to good results in clustering the notes according to spectral features. The cost function for the NMF is controlled by the parameter beta. Beta should be adjusted properly depending on the dynamic difference of the sources. The novelty of this paper is to propose a simple unsupervised decision scheme that estimates the optimal parameter beta, increasing the separation quality over a large range of dynamic differences.
Convention Paper 8130
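The beta-controlled NMF cost function mentioned in the abstract can be sketched in a few lines of NumPy. This is a generic textbook formulation of beta-divergence NMF with multiplicative updates, not the authors' code; `nmf_beta` and its parameters are illustrative names:

```python
import numpy as np

def beta_divergence(V, Vhat, beta, eps=1e-12):
    # Summed element-wise beta divergence: beta=2 Euclidean, beta=1 KL,
    # beta=0 Itakura-Saito (the limit cases use their closed forms).
    V, Vhat = V + eps, Vhat + eps
    if beta == 1:
        return np.sum(V * np.log(V / Vhat) - V + Vhat)
    if beta == 0:
        return np.sum(V / Vhat - np.log(V / Vhat) - 1)
    return np.sum((V**beta + (beta - 1) * Vhat**beta
                   - beta * V * Vhat**(beta - 1)) / (beta * (beta - 1)))

def nmf_beta(V, rank, beta=1.0, n_iter=200, seed=0, eps=1e-12):
    # Factor a nonnegative spectrogram V ~ W @ H with multiplicative
    # updates that (for beta in [1, 2]) monotonically decrease the
    # beta divergence.
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, rank)) + eps
    H = rng.random((rank, T)) + eps
    for _ in range(n_iter):
        Vhat = W @ H + eps
        H *= (W.T @ (Vhat**(beta - 2) * V)) / (W.T @ Vhat**(beta - 1) + eps)
        Vhat = W @ H + eps
        W *= ((Vhat**(beta - 2) * V) @ H.T) / (Vhat**(beta - 1) @ H.T + eps)
    return W, H
```

The paper's contribution, the unsupervised choice of beta, would sit on top of such a routine, scoring candidate beta values by the resulting separation quality.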

10:30

P23-2 On the Effects of Room Reverberation in 3-D DOA Estimation Using a Tetrahedral Microphone Array—Maximo Cobos, Jose J. Lopez, Amparo Marti, Universidad Politécnica de Valencia, Valencia, Spain

This paper studies the accuracy of the estimation of the direction of arrival (DOA) of multiple sound sources using a small microphone array. Like other sparsity-based algorithms, the proposed method is able to work in underdetermined scenarios, where the number of sound sources exceeds the number of microphones. Moreover, the tetrahedral shape of the array allows estimation of DOAs in three-dimensional space easily, which is an advantage over other existing approaches. However, since the proposed processing is based on an anechoic signal model, the estimated DOA vectors are severely affected by room reflections. Experiments to analyze the resultant DOA distribution under different room conditions and source arrangements are discussed using both simulations and real recordings.
Convention Paper 8131

10:30

P23-3 Long Term Cepstral Coefficients for Violin Identification—Ewa Lukasik, Poznan University of Technology, Poznan, Poland

Cepstral coefficients in mel scale have proved to be efficient features for speaker and musical instrument recognition. In this paper Long Term Cepstral Coefficients—LTCCs—of solo musical phrases are used as features for identification of individual violins. LTCC represents the envelope of the LTAS—Long Term Average Spectrum—on a linear scale, useful for characterizing the subtleties of violin sound in the frequency domain. Results of the classification of 60 instruments are presented and discussed. It was shown that if experts' knowledge is applied to analyze violin sound, the results may be promising.
Convention Paper 8132
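From the description, the LTCC feature can be sketched as follows (an illustrative reconstruction from the abstract, not the author's implementation): average the linear-frequency magnitude spectra of all frames into the LTAS, then keep the low-order real cepstrum of its logarithm as an envelope description.

```python
import numpy as np

def ltcc(signal, frame=1024, hop=512, n_coeffs=20):
    # Long Term Cepstral Coefficients (sketch): LTAS on a linear
    # frequency scale, then the first real-cepstrum coefficients of its
    # log, which capture the long-term spectral envelope.
    spectra = []
    for start in range(0, len(signal) - frame, hop):
        seg = signal[start:start + frame] * np.hanning(frame)
        spectra.append(np.abs(np.fft.rfft(seg)))
    ltas = np.mean(spectra, axis=0) + 1e-12   # long-term average spectrum
    cepstrum = np.fft.irfft(np.log(ltas))     # real cepstrum of log-LTAS
    return cepstrum[:n_coeffs]
```

The frame length, hop, and number of coefficients here are arbitrary illustrative choices; the paper's values are not given in the abstract.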

10:30

P23-4 Adaptive Source Separation Based on Reliability of Spatial Feature Using Multichannel Acoustic Observations—Mitsunori Mizumachi, Kyushu Institute of Technology, Kitakyushu, Fukuoka, Japan

Separation of sound sources can be achieved by spatial filtering with multichannel acoustic observations. However, the right algorithm has to be prepared for each acoustic scene condition, and it is difficult to provide a suitable algorithm under real acoustic environments. In this paper an adaptive source separation scheme is proposed based on the reliability of a spatial feature, which gives an estimate of direction of arrival (DOA). As confidence measures for DOA estimates, the third and fourth moments of the spatial features are employed to measure how sharp the main lobes of the spatial features are. This paper proposes to selectively use either spatial filters or frequency-selective filters without spatial filtering, depending on the reliability of each DOA estimate.
Convention Paper 8133
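The moment-based reliability idea can be illustrated with a toy spatial spectrum. The normalization below is an assumption (the abstract does not specify one); the point is that a sharp main lobe gives a peaked, far-from-uniform distribution over candidate DOAs and hence higher normalized moments:

```python
import numpy as np

def doa_confidence(angles_deg, spatial_spectrum):
    # Treat the (nonnegative) spatial spectrum as a probability
    # distribution over candidate DOAs and use its third/fourth central
    # moments as sharpness measures: a flat spectrum has low kurtosis,
    # a sharp main lobe a higher one.
    p = np.asarray(spatial_spectrum, dtype=float)
    p = p / p.sum()
    a = np.asarray(angles_deg, dtype=float)
    mu = np.sum(a * p)
    var = np.sum((a - mu) ** 2 * p)
    skew = np.sum((a - mu) ** 3 * p) / var ** 1.5   # normalized 3rd moment
    kurt = np.sum((a - mu) ** 4 * p) / var ** 2     # normalized 4th moment
    doa_estimate = a[np.argmax(p)]
    return doa_estimate, skew, kurt
```

A selection rule like the paper's, applying spatial filtering only when the estimate is reliable, would then threshold the sharpness measure.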

10:30

P23-5 A Heuristic Text-Driven Approach for Applied Phoneme Alignment—Konstantinos Avdelidis, Charalampos Dimoulas, George Kalliris, George Papanikolaou, Aristotle University of Thessaloniki, Thessaloniki, Greece

The paper introduces a phoneme matching algorithm based on a novel concept of functional strategy. In contrast to classic methodologies that focus on convergence to a fixed expected phonemic sequence (EPS), the presented method follows a more realistic approach. Based on text input, a soft EPS is populated taking into consideration the structural and linguistic deviations that may appear in a naturally spoken sequence. The results of the matching process are evaluated using fuzzy inference and consist of both the phoneme transition positions and the actual utterance phonemic content. An overview of convergence quality performance through a series of runs for the Greek language is presented.
Convention Paper 8134

10:30

P23-6 Speech Enhancement with Hybrid Gain Functions—Xuejing Sun, Kuan-Chieh Yen, Cambridge Silicon Radio, Auburn Hills, MI, USA

This paper describes a hybrid gain function for single-channel acoustic noise suppression systems. The proposed gain function combines the Wiener filter and the Minimum Mean Square Error Log Spectral Amplitude estimator (MMSE-LSA) gain functions and selects the respective gain values accordingly. Objective evaluation using a composite measure shows the hybrid gain function yields better results than using either of the two functions alone.
Convention Paper 8135
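The two constituent gain functions have standard closed forms: the Wiener gain xi/(1+xi), and the Ephraim-Malah MMSE-LSA gain, which multiplies the Wiener gain by exp(0.5*E1(v)) with v = xi*gamma/(1+xi). The sketch below uses a crude numerical exponential integral, and the switching rule is purely hypothetical, since the abstract does not state how the hybrid selects between the two:

```python
import numpy as np

def expint_e1(v, upper=50.0, n=20000):
    # Crude numerical E1(v) = integral_v^inf exp(-t)/t dt (trapezoid
    # rule; a library special function would normally be used instead).
    t = np.linspace(v, v + upper, n)
    f = np.exp(-t) / t
    return float(np.sum((f[1:] + f[:-1]) * 0.5 * (t[1] - t[0])))

def wiener_gain(xi):
    # xi: a priori SNR
    return xi / (1.0 + xi)

def mmse_lsa_gain(xi, gamma):
    # gamma: a posteriori SNR
    v = xi * gamma / (1.0 + xi)
    return wiener_gain(xi) * np.exp(0.5 * expint_e1(v))

def hybrid_gain(xi, gamma, switch_db=0.0):
    # Hypothetical selection rule (the paper's actual rule is not given
    # in the abstract): MMSE-LSA below an a priori SNR switch point,
    # Wiener above it.
    if 10.0 * np.log10(xi) < switch_db:
        return mmse_lsa_gain(xi, gamma)
    return wiener_gain(xi)
```

In a full suppressor these per-bin gains would be applied to the noisy spectrum, with xi and gamma tracked by a decision-directed estimator.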

10:30

P23-7 Human Voice Modification Using Instantaneous Complex Frequency—Magdalena Kaniewska, Gdansk University of Technology, Gdansk, Poland

The paper presents the possibilities of changing the human voice by modifying the instantaneous complex frequency (ICF) of the speech signal. The proposed method provides a flexible way of altering voice without the necessity of finding the fundamental frequency and formant positions or detecting voiced and unvoiced fragments of speech. The algorithm is simple and fast. Apart from ICF it uses signal factorization into two factors: one fully characterized by its envelope and the other with positive instantaneous frequency. The ICFs of the factors are modified individually for different sound effects.
Convention Paper 8136

10:30

P23-8 Designing Optimal Phoneme-Wise Fuzzy Cluster Analysis—Konstantinos Avdelidis, Charalampos Dimoulas, George Kalliris, George Papanikolaou, Aristotle University of Thessaloniki, Thessaloniki, Greece

A large number of pattern classification algorithms and methodologies have been proposed for the phoneme recognition task during the last decades. The current paper presents a prototype distance-based fuzzy classifier, optimized for the needs of phoneme recognition. This is accomplished by a specially designed objective function and a respective training strategy. In particular, each phonemic class is represented by a number of arbitrarily shaped clusters that adaptively match the corresponding feature space distribution. The formulation of the approach is capable of delivering a variety of related conclusions based on fuzzy logic arithmetic. An overview of the inference capability is presented in combination with performance results for the Greek language.
Convention Paper 8137

Convention Paper 8138 was withdrawn

Workshop 16 Tuesday, May 25 11:00 – 13:00 Room C6

MPEG UNIFIED SPEECH AND AUDIO CODING

Chair: Ralf Geiger, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

Panelists: Philippe Gournay, University of Sherbrooke / VoiceAge, Quebec, Canada
Max Neuendorf, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany
Lars Villemoes, Dolby Sweden AB, Stockholm, Sweden

Recently the ISO/MPEG standardization group has launched an activity on unified speech and audio coding (USAC). This codec aims at achieving consistently high quality for speech, music, and mixed content over a broad range of bit rates, outperforming current state-of-the-art coders. While low bit rate speech codecs focus on efficient representation of speech signals, they fail for music signals. On the other hand, generic audio codecs are designed for any kind of audio signal, but they tend to show unsatisfactory results for speech signals, especially at low bit rates. The new USAC codec unifies the best of both systems. This workshop provides an overview of the architecture, performance, and applications of this new unified coding scheme. Experts in the fields of speech coding and audio coding will present details on the technical solutions employed to achieve top-notch unified speech and audio coding.

Special Event
AES/APRS/MPG PLATINUM ENGINEER BEHIND THE GLASS
Tuesday, May 25, 11:00 – 13:00
Room C1

Moderator: Howard Massey

Panelists: Chuck Ainlay
Andy Bradfield
Phil Harding
George Massenburg

A look behind the scenes with some of today's top British and American audio engineers that is sure to be lively and controversial. Topics will include new media and online distribution; the application of legacy (analog) technology in an increasingly digital world; and the critical role played by the engineer in driving both the technical and creative aspects of the recording process.
Howard Massey is a leading audio industry consultant, technical writer, and author of Behind the Glass and Behind the Glass Volume II, two collections of in-depth interviews with top record producers and audio engineers widely used in recording schools the world over. He also co-authored legendary Beatles engineer Geoff Emerick's acclaimed 2006 memoir, Here, There, and Everywhere: My Life Recording the Music of the Beatles.

Tuesday, May 25 11:00 Room Saint Julien
Standards Committee Meeting SC-04-03 Loudspeaker Modeling and Measurement

Tutorial 10 Tuesday, May 25 11:15 – 12:15 Room C2

ADR

Presenter: Dave Humphries, Loopsync, Canterbury, UK

ADR is becoming more necessary than ever. Noisy locations, special effects, and dialog changes are a major part of any drama production. Wind noise, generators, lighting chokes, aircraft, traffic, and rain can all be eliminated by good ADR.
In this tutorial Dave Humphries will discuss the reasons why we need to put actors through this and how we can help minimize the need, what a Dialog Editor needs to know, and how to help actors succeed at ADR. He will also demonstrate the art of recording ADR live.

Tuesday, May 25 12:00 Room Mouton Cadet
Technical Council Meeting

Tuesday, May 25 13:00 Room Saint Julien
Standards Committee Meeting SC-03-02 Transfer Technologies

Session P24 Tuesday, May 25 14:00 – 17:30 Room C3

INNOVATIVE APPLICATIONS

Chair: John Dawson, Arcam, Cambridge, UK


14:00

P24-1 The Serendiptichord: A Wearable Instrument for Contemporary Dance Performance—Tim Murray-Browne, Di Mainstone, Nick Bryan-Kinns, Mark D. Plumbley, Queen Mary University of London, London, UK

We describe a novel musical instrument designed for use in contemporary dance performance. This instrument, the Serendiptichord, takes the form of a headpiece plus associated pods that sense movements of the dancer, together with associated audio processing software driven by the sensors. Movements such as translating the pods or shaking the trunk of the headpiece cause selection and modification of sampled sounds. We discuss how we have closely integrated physical form, sensor choice and positioning, and software to avoid issues that otherwise arise with disconnection of the innate physical link between action and sound, leading to an instrument that non-musicians (in this case, dancers) are able to enjoy using immediately.
Convention Paper 8139

14:30

P24-2 A Novel User Interface for Musical Timbre Design—Allan Seago,1 Simon Holland,2 Paul Mulholland2
1London Metropolitan University, London, UK
2Open University, UK

The complex and multidimensional nature of musical timbre is a problem for the design of intuitive interfaces for sound synthesis. A useful approach to the manipulation of timbre involves the creation and subsequent navigation or search of n-dimensional coordinate spaces or timbre spaces. A novel timbre space search strategy is proposed based on weighted centroid localization (WCL). The methodology and results of user testing of two versions of this strategy in three distinctly different timbre spaces are presented and discussed. The paper concludes that this search strategy offers a useful means of locating a desired sound within a suitably configured timbre space.
Convention Paper 8140
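Weighted centroid localization itself is simple: a target position is estimated as the weighted mean of known probe positions. In the timbre-space setting described above, the weights would come from the user's ratings of probe sounds; the sketch below is a generic illustration, not the authors' implementation:

```python
import numpy as np

def weighted_centroid(probe_points, weights):
    # Estimate a target position as the weight-normalized mean of the
    # probe positions (rows of probe_points, one row per probe sound).
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return (np.asarray(probe_points, dtype=float) * w[:, None]).sum(axis=0)
```

An iterative search would place new probes around the current centroid and repeat until the user accepts the resulting sound.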

15:00

P24-3 Bi-Directional Audio-Tactile Interface for Portable Electronic Devices—Neil Harris, New Transducers Ltd. (NXT), Cambridge, UK

When an audio system uses the screen or casework vibrating as the loudspeaker, it can also provide haptic feedback. Just as a loudspeaker may be used reciprocally as a microphone, the haptic feedback aspect of the design may be operated as a touch sensor. This paper considers how to model a basic system embodying these aspects, including the electrical part, with a finite element package. For a piezoelectric exciter, full reciprocal modeling is possible, but for electromagnetic exciters it is not, unless multi-physics simulation is supported. For the latter, a model using only lumped parameter mechanical elements is developed.
Convention Paper 8141

15:30

P24-4 Tactile Music Instrument Recognition for Audio Mixers—Sebastian Merchel, Ercan Altinsoy, Maik Stamm, Dresden University of Technology, Dresden, Germany

Using touch screens for digital audio workstations, particularly audio mixing consoles, is not very common today. One reason is the ease of use and the intuitive tactile feedback that hardware faders, knobs, and buttons provide. Adding tactile feedback to touch screens will largely improve usability. In addition, touch screens can reproduce innovative extra tactile information. This paper investigates several design parameters for the generation of the tactile feedback. The results indicate that musical instruments can be distinguished if tactile feedback is rendered from the audio signal. This helps to improve recognition of an audio signal source that is assigned, e.g., to a specific mixing channel. Applying this knowledge, the use of touch screens in audio applications becomes more intuitive.
Convention Paper 8142

16:00

P24-5 Augmented Reality Audio Editing—Jacques Lemordant, Yohan Lasorsa, INRIA, Rhône-Alpes, France

The concept of augmented reality audio (ARA) characterizes techniques where a physically real sound and voice environment is extended with virtual, geolocalized sound objects. We show that the authoring of an ARA scene can be done through an iterative process composed of two stages: in the first, the author has to move in the rendering zone to apprehend the audio spatialization and the chronology of the audio events; in the second, textual editing of the sequencing of the sound sources and DSP acoustics parameters is done. This authoring process is based on the joint use of two XML languages, OpenStreetMap for maps and A2ML for interactive 3-D audio. A2ML being a format for a cue-oriented interactive audio system, requests for interactive audio services are done through TCDL, a tag-based cue dispatching language. This separation of modeling and audio rendering is similar to what is done for the web of documents with HTML and CSS style sheets.
Convention Paper 8143

16:30

P24-6 Evaluation of a Haptic/Audio System for 3-D Targeting Tasks—Lorenzo Picinali,1 Bob Menelas,2 Brian F. G. Katz,2 Patrick Bourdot2
1De Montfort University, Leicester, UK
2LIMSI-CNRS, Orsay, France

While common user interface designs tend to focus on visual feedback, other sensory channels may be used in order to reduce the cognitive load of the visual one. In this paper non-visual environments are presented in order to investigate how users exploit information delivered through haptic and audio channels. A first experiment is designed to explore the effectiveness of a haptic audio system evaluated in a single target localization task; a virtual magnet metaphor is exploited for the haptic rendering, while a parameter mapping sonification of the distance to the source, combined with 3-D audio spatialization, is used for the audio one. An evaluation is carried out in terms of the effectiveness of separate haptic and auditory feedback versus the combined multimodal feedback.
Convention Paper 8144

17:00

P24-7 Track Displays in DAW Software: Beyond Waveform Views—Kristian Gohlke,1 Michael Hlatky,1 Sebastian Heise,1 David Black,1 Jörn Loviscach2
1Hochschule Bremen (University of Applied Sciences), Bremen, Germany
2Fachhochschule Bielefeld (University of Applied Sciences), Bielefeld, Germany

For decades, digital audio workstation software has displayed the content of audio tracks through bare waveforms. We argue that the same real estate on the computer screen can be used for far more expressive and goal-oriented visualizations. Starting from a range of requirements and use cases, this paper discusses existing techniques from such fields as music visualization and music notation. It presents a number of novel techniques aimed at better fulfilling the needs of the human operator. To this end, the paper draws upon methods from signal processing and music information retrieval as well as computer graphics.
Convention Paper 8145

Session P25 Tuesday, May 25 14:00 – 18:00 Room C5

LISTENING TESTS AND EVALUATION PSYCHOACOUSTICS

Chair: Natanya Ford, Buckinghamshire New University, UK

14:00

P25-1 Toward a Statistically Well-Grounded Evaluation of Listening Tests—Avoiding Pitfalls, Misuse, and Misconceptions—Frederik Nagel,1 Thomas Sporer,2 Peter Sedlmeier3
1Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany
2Fraunhofer Institute for Digital Media Technology IDMT, Ilmenau, Germany
3Technical University of Dresden, Dresden, Germany

Many recent publications in audio research present subjective evaluations of audio quality based on Recommendation ITU-R BS.1534-1 (MUSHRA, MUltiple Stimuli with Hidden Reference and Anchor). This is a very welcome trend because it enables researchers to assess the implications of their developments. The evaluation of listening tests, however, sometimes suffers from an incomplete understanding of the underlying statistics. The present paper aims at identifying the causes of the pitfalls and misconceptions in MUSHRA evaluations. It exemplifies the impact of falsely used or even misused statistics. Subsequently, schemes for evaluating the listeners' judgments that are well grounded in statistical considerations, comprising an understanding of the concepts of statistical power and effect size, are proposed.
Convention Paper 8146
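The effect-size concept the authors emphasize can be made concrete with Cohen's d, the difference of group means divided by the pooled standard deviation (a standard statistical definition, not code from the paper):

```python
import numpy as np

def cohens_d(a, b):
    # Effect size between two independent samples of listener scores:
    # mean difference over the pooled (ddof=1) standard deviation.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) \
                 / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)
```

Reporting an effect size alongside p-values tells the reader whether a statistically significant difference between systems is also large enough to matter perceptually.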

14:30

P25-2 Audibility of Headphone Positioning Variability—Mathieu Paquier, Vincent Koehl, Université de Brest, Plouzané, France

This paper aims at evaluating the audibility of spectral modifications induced by slight but realistic changes in the headphone position over a listener's ears. Recordings have been performed on a dummy head on which four different headphone models were placed eight times each. Musical excerpts and pink noise were played over the headphones and recorded with microphones located at the entrance of the blocked ear canal. These recordings were then presented to listeners over a single test headphone. The subjects had to assess the recordings in a 3I3AFC task to discriminate between the different headphone positions. The results indicate that, whatever the headphone model or the excerpt, the modifications caused by different positions were always perceived.
Convention Paper 8147

15:00

P25-3 Objectivization of Audio-Video Correlation Assessment Experiments—Bartosz Kunka, Bozena Kostek, Gdansk University of Technology, Gdansk, Poland

The purpose of this paper is to present a new method of conducting audio-visual correlation analysis employing a head-motion-free gaze tracking system. First, a review of related work in the domain of sound and vision correlation is presented. Then assumptions concerning audio-visual scene creation are briefly described. The objectivization process of carrying out correlation tests employing a gaze-tracking system is outlined. The gaze tracking system developed at the Multimedia Systems Department is described, and its use for carrying out subjective tests is given. The results of subjective tests examining the relationship between video and audio associated with the video material are presented. Conclusions concerning the new methodology, as well as future work directions, are provided.
Convention Paper 8148

15:30

P25-4 A New Time and Intensity Trade-Off Function for Localization of Natural Sound Sources—Hyunkook Lee, LG Electronics, Seoul, Korea

This paper introduces a new set of psychoacoustic values of interchannel time difference (ICTD) and interchannel intensity difference (ICID) required for 10°, 20°, and 30° localization in conventional stereophonic reproduction, which were obtained using natural sound sources of musical instruments and wideband speech representing different characteristics. It then discusses a new ICID and ICTD trade-off function developed based on the relationship between the psychoacoustic values. The result of a listening test conducted to verify the performance of the proposed method is also presented.
Convention Paper 8149

16:00

P25-5 Effect of Signal-to-Noise Ratio and Visual Context on Environmental Sound Identification—Tifanie Bouchara,1 Bruno L. Giordano,2 Ilja Frissen,2 Brian F. G. Katz,1 Catherine Guastavino2
1LIMSI-CNRS, Orsay, France
2McGill University, Montreal, Quebec, Canada

The recognition of environmental sounds is of main interest for the perception of our environment. This paper investigates whether visual context can counterbalance the impairing effect of signal degradation (signal-to-noise ratio, SNR) on the identification of environmental sounds. SNRs and semantic congruency between sensory modalities, i.e., auditory and visual information, were manipulated. Two categories of sound sources, living and nonliving, were used. The participants' task was to indicate the category of the sound as fast as possible. Increasing SNRs and congruent audiovisual contexts enhanced identification accuracy and shortened reaction times. The results further indicated that living sound sources were recognized more accurately and faster than nonliving sound sources. A preliminary analysis of the acoustical factors mediating participants' responses revealed that the harmonic-to-noise ratio (HNR) of the sound signals was significantly associated with the probability of identifying a sound as living. Further, the extent to which participants' identifications were sensitive to the HNR appeared to be modulated by both SNR and audiovisual congruence.
Convention Paper 8150

16:30

P25-6 The Influence of Individual Audio Impairments on Perceived Video Quality—Leslie Gaston,1 Jon Boley,2 Scott Selter,1 Jeffrey Ratterman1
1University of Colorado Denver, Denver, CO, USA
2LSB Audio LLC, Lafayette, IN, USA

As the audio, video, and related industries work toward establishing standards for subjective measures of audio/video quality, more information is needed to understand subjective audio/video interactions. This paper reports a contribution to this effort that aims to extend previous studies, which show that audio and video quality influence each other and that some audio artifacts affect overall quality more than others. In the current study, these findings are combined in a new experiment designed to reveal how individual impairments of audio affect perceived video quality. Our results show that some audio artifacts enhance the ability to identify video artifacts, while others make discrimination more difficult.
Convention Paper 8151
Paper presented by Scott Selter

17:00

P25-7 Vertical Localization of Sounds with Frequencies Changing over Three Octaves—Eiichi Miyasaka, Tokyo City University, Yokohama, Kanagawa, Japan

Vertical localization was investigated for sounds consisting of 22 tone bursts ascending/descending along the whole-tone scale between C4 (262 Hz) and C7 (2093 Hz). The sounds used were pure tones, one-third octave band noises, and piano tones. The sounds were presented through a fixed loudspeaker (SP-A) set up just in front of listeners, with numbered cards set perpendicularly (case 1) or with seven dummy loudspeakers attached to the numbered cards (case 2). The results show that most observers perceived the locations of the sound images as moving upward from a loudspeaker around SP-A for the ascending sounds, or downward for the descending sounds, in both cases, although the sounds were radiated through the fixed loudspeaker (SP-A).
Convention Paper 8152

17:30

P25-8 Variability in Perceptual Evaluation of HRTFs—David Schönstein,1 Brian F. G. Katz2
1Arkamys, Paris, France
2Université Paris XI, Orsay Cedex, France

The implementation of the head-related transfer function (HRTF) is key to binaural rendering applications. An HRTF evaluation and selection is often required when individual HRTFs are not available. This paper examines the variability in perceptual evaluations of HRTFs using a listening test. A set of six different HRTFs was selected and then used in a listening test based on the standardized MUSHRA method for evaluating audio quality. A total of six subjects participated, each having their own recorded HRTFs available. Subjects performed five repetitions of the listening test. While conclusive HRTF judgments were evident, a significantly large degree of variance was found. The effect of listener expertise on variability in perceptual judgments was also analyzed.
Convention Paper 8153

Session P26 Tuesday, May 25 14:00 – 15:30 C4-Foyer

POSTERS: RECORDING, PRODUCTION, AND REPRODUCTION—MULTICHANNEL AND SPATIAL AUDIO

14:00

P26-1 Evaluation of Virtual Source Localization Using 3-D Loudspeaker Setups—Florian Keiler, Johann-Markus Batke, Technicolor Research and Innovation, Hannover, Germany

This paper evaluates the localization accuracy of different playback methods for 3-D spatial sound using a listening test. The playback methods are characterized by their panning functions, which define the gain for each loudspeaker to play back a sound source positioned at a distinct pair of azimuth and elevation angles. The tested methods are Ambisonics decoding using the mode matching approach, vector base amplitude panning (VBAP), and a newly proposed 3-D robust panning approach. For irregular 3-D loudspeaker setups, as found in home environments, mode matching shows poor localization. The new 3-D robust panning leads to better localization and can also outperform the VBAP technology, depending on the source position and the loudspeaker setup used.
Convention Paper 8060

14:00

P26-2 Optimization of the Localization Performance of Irregular Ambisonic Decoders for Multiple Off-Center Listeners—David Moore,1 Jonathan Wakefield2
1Glasgow Caledonian University, Glasgow, Lanarkshire, UK
2University of Huddersfield, Huddersfield, West Yorkshire, UK

This paper presents a method for optimizing the performance of irregular Ambisonic decoders for multiple off-center listeners. New off-center evaluation criteria are added to a multi-objective fitness function, based on auditory localization theory, which guides a heuristic search algorithm to derive decoder parameter sets for the ITU 5-speaker layout. The new evaluation criteria are based upon Gerzon's Metatheory of Auditory Localization and have been modified to take into account off-center listening positions. The derived decoders exhibit improved theoretical localization performance for off-center listeners. The theoretical results are supported by initial listening test results.
Convention Paper 8061

14:00

P26-3 Vibrational Behavior of High Aspect Ratio Multiactuator Panels—Basilio Pueo,1 Jorge A. López,1 Javier Moralejo,1 José Javier López2
1University of Alicante, Alicante, Spain
2Technical University of Valencia, Valencia, Spain

Multiactuator panels (MAPs) consist of a flat panel of a light and stiff material to which a number of mechanical exciters are attached, creating bending waves that are then radiated as sound fields. MAPs can substitute for traditional dynamic loudspeaker arrays in Wave Field Synthesis (WFS), with added benefits such as a low visual profile and omnidirectional radiation. However, the exciter interaction with the panel, the panel material, and the panel edge boundary conditions are some of the critical points that need to be evaluated and improved. In this paper the structural acoustic behavior of a high aspect ratio MAP is analyzed for two classical edge boundary conditions: free and clamped. For that purpose, the surface velocity over the whole area of the MAP prototype was measured with a laser Doppler vibrometer (LDV), which helped in understanding the sound-generating behavior of the panel.
Convention Paper 8062

128th Convention Papers and CD-ROM
Convention Papers of many of the presentations given at the 128th Convention and a CD-ROM containing the 128th Convention Papers are available from the Audio Engineering Society. Prices follow:

128th CONVENTION PAPERS (single copy)
Member: $5. (USA)

Nonmember: $ 20. (USA)

128th CD-ROM (193 papers)
Member: $190. (USA)

Nonmember: $230. (USA)

128th Convention
2010 May 22–25
London, UK




14:00

P26-4 Design of a Circular Microphone Array for Panoramic Audio Recording and Reproduction: Microphone Directivity—Hüseyin Hacihabiboglu, Enzo De Sena, Zoran Cvetkovic, King’s College London, London, UK

The design of a circularly symmetric multichannel recording and reproduction system is discussed in this paper. The system consists of an array of directional microphones evenly distributed on a circle and a matching array of loudspeakers. The relation between the microphone directivity and the radius of the circular array is established within the context of time-intensity stereophony. The microphone directivity design is identified as a constrained linear least squares optimization problem. Results of a subjective evaluation are presented that indicate the usefulness of the proposed microphone array design technique.
Convention Paper 8063
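The abstract frames directivity design as a constrained linear least squares problem. A minimal sketch of that idea follows; the cosine-series pattern, the hypothetical target response, and the unit on-axis gain constraint are all assumptions, since the abstract does not give the actual cost or constraints.

```python
import numpy as np

def design_directivity(theta, target, order=2):
    """Fit G(theta) = sum_n a_n cos(n*theta) to a target response in
    the least squares sense, subject to the equality constraint
    G(0) = 1 (unit on-axis gain), solved via the KKT system."""
    A = np.stack([np.cos(n * theta) for n in range(order + 1)], axis=1)
    C = np.ones((1, order + 1))                 # G(0) = sum_n a_n
    K = np.block([[A.T @ A, C.T], [C, np.zeros((1, 1))]])
    rhs = np.concatenate([A.T @ target, [1.0]])
    return np.linalg.solve(K, rhs)[:order + 1]  # coefficients a_n

theta = np.linspace(-np.pi, np.pi, 361)
target = np.clip(np.cos(theta), 0.0, None)  # hypothetical target pattern
a = design_directivity(theta, target)
# the fitted pattern meets the on-axis constraint: a.sum() is 1
```

The KKT formulation handles equality constraints exactly; inequality constraints (e.g., bounding rear-lobe gain) would call for a quadratic-programming solver instead.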

14:00

P26-5 Design of a Circular Microphone Array for Panoramic Audio Recording and Reproduction: Array Radius—Enzo De Sena, Hüseyin Hacihabiboglu, Zoran Cvetkovic, King's College London, London, UK

A multichannel audio system proposed by Johnston and Lam aims at perceptual reconstruction of the sound field of an acoustic performance in its original venue. The system employs a circular microphone array, 31 cm in diameter, to capture relevant spatial cues. This design proved to be effective in the rendition of the auditory perspective; however, other studies showed that there is still substantial room for improvement. This paper investigates the impact of the array diameter on the width and naturalness of the auditory images. To this end we propose a method for quantification and prediction of the perceived naturalness. Simulation results support array diameters close to that proposed by Johnston and Lam, in the sense that they achieve optimal naturalness in the center of the listening area, but also suggest that larger arrays might provide a more graceful degradation of naturalness for listening positions away from the center.
Convention Paper 8064

14:00

P26-6 MIAUDIO—Audio Mixture Digital Matrix—David Pedrosa Branco, José Neto Vieira, Iouliia Skliarova, Universidade de Aveiro, Aveiro, Portugal

Electroacoustic music is turning more and more to sound diffusion techniques. Multichannel sound systems like BEAST and SARC are built so that the musician can independently control the intensity of several audio channels. This feature provides the possibility of creating several sound diffusion scenarios, i.e., immersion and the possibility of movement around the audience. The developed system (MIAUDIO) is a real-time sound diffusion system currently able to mix up to 8 audio input channels through 32 output channels. A hardware solution was adopted, using a Field Programmable Gate Array (FPGA) to perform the mixture. The analog audio signals are conditioned, converted to digital format by several analog-to-digital converters, and then sent to the FPGA, which is responsible for performing the mixing algorithm. The host computer connects to the FPGA via USB and supplies the parameters that define the audio mixture. The user thus has independent control over the input levels for each output channel. MIAUDIO was implemented successfully at low cost compared with similar systems. All the channels were tested using a Precision One system with very good results.
Convention Paper 8065
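At its core, the mixing operation described above is a gain-matrix multiply over a block of samples. A host-side sketch in Python (the real system performs this on the FPGA; `mix_block` and the block size are illustrative):

```python
import numpy as np

def mix_block(inputs, gain_matrix):
    """Mix one block of samples: inputs is (n_in, n_samples),
    gain_matrix is (n_out, n_in); returns (n_out, n_samples)."""
    return gain_matrix @ inputs

gains = np.zeros((32, 8))     # 8 inputs -> 32 outputs, as in MIAUDIO
gains[5, 0] = 0.5             # route input 0 to output 5 at -6 dB
x = np.random.randn(8, 256)   # one block of input audio
y = mix_block(x, gains)       # y[5] carries input 0; other rows silent
```

Updating `gains` at run time corresponds to the host sending new mixture parameters over USB while the matrix multiply keeps running per block.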

14:00

P26-7 The Perception of Focused Sources in Wave Field Synthesis as a Function of Listener Angle—Robert Oldfield, Ian Drumm, Jos Hirst, University of Salford, Salford, UK

Wave field synthesis (WFS) is a volumetric sound reproduction technique that allows virtual sources to be positioned anywhere in space. The reproduction of the wave field of these sources means they can be accurately localized even when placed in front of the secondary sources/loudspeakers. Such "focused sources" are very important in WFS as they greatly add to the realism of an auditory scene; however, their perception and localizability change with listener and virtual source position. In this paper we present subjective tests to determine the localization accuracy as a function of angle, defining the subjective "view angle." We also show how improvements can be made through the addition of the first-order image sources.
Convention Paper 8066

14:00

P26-8 Gram-Schmidt-Based Downmixer and Decorrelator in the MPEG Surround Coding—Der-Pei Chen, Hsu-Feng Hsiao, Han-Wen Hsu, Chi-Min Liu, National Chiao Tung University, Hsinchu, Taiwan

MPEG Surround (MPS) coding is an efficient method for multichannel audio coding. In MPS coding, downmixing multichannel signals into a smaller number of channels is an efficient way to achieve a high compression rate in the encoder. In the decoder, an upmixing module combined with the decorrelator is the key module for reconstructing the multichannel signals. This paper considers the design of the downmixer and the decorrelator through the Gram-Schmidt orthogonalization process. The individual and joint effects of the downmixer and decorrelator are verified through intensive subjective and objective quality measures.
Convention Paper 8067
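The Gram-Schmidt process the paper builds on orthogonalizes a set of signals one at a time, keeping from each only the component orthogonal to its predecessors. A generic sketch, not the MPS-specific downmixer/decorrelator design:

```python
import numpy as np

def gram_schmidt(channels):
    """Classical Gram-Schmidt: orthonormalize signals (rows) in order,
    keeping from each only the part orthogonal to its predecessors."""
    ortho = []
    for x in channels:
        v = np.array(x, dtype=float)
        for q in ortho:
            v -= (v @ q) * q          # subtract projection onto q
        n = np.linalg.norm(v)
        if n > 1e-12:                 # skip (near-)dependent signals
            ortho.append(v / n)
    return np.array(ortho)

sig = np.random.randn(3, 1024)  # three channel signals as rows
q = gram_schmidt(sig)           # rows satisfy q @ q.T ~ identity
```

Orthogonality between the downmix and the decorrelator output is what lets the decoder reconstruct channel correlations without double-counting energy.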

Workshop 17 Tuesday, May 25 14:00 – 15:30 Room C1

5.1 INTO 2 WON'T GO—THE PERILS OF FOLD-DOWN IN GAME AUDIO


Chair: Michael Kelly, Sony Computer Entertainment Europe, London, UK

Panelists: Richard Furse, Blue Ripple Sound Limited, London, UK
Simon Goodwin, Codemasters, Southam, UK
Jean-Marc Jot, DTS, Inc., Scotts Valley, CA, USA
Dave Malham, University of York, York, UK

One mixing solution cannot suit mono, stereo, headphone, and various surround configurations. However, games mix and position dozens of sounds on the fly, so they can readily make a custom mix rather than rely upon downmixing or upmixing that penalizes listeners who do not use the default configuration. This workshop explains practical solutions to problems of stereo speaker, headphone, and mono compatibility (including Dolby Pro Logic and 2.1 setups) without detriment to surround. It notes differences between the demands of games and cinema for surround, and the challenges of reconciling de facto (quad+2) and theoretical (ITU 5.1) standard loudspeaker layouts and of playing 5.1-channel content on a 7.1 loudspeaker system.
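The static fold-down the workshop warns about can be illustrated with one conventional set of ITU-R BS.775-style coefficients (a -3 dB weight on center and surrounds; LFE handling varies in practice, and here it is simply discarded):

```python
import numpy as np

K = 1.0 / np.sqrt(2.0)   # -3 dB, the usual center/surround weight

def fold_down_5_1(L, R, C, LFE, Ls, Rs):
    """Static 5.1 -> stereo fold-down (LFE discarded in this sketch).
    Games can avoid this step by re-mixing sources per layout."""
    Lo = L + K * C + K * Ls
    Ro = R + K * C + K * Rs
    return Lo, Ro

frames = np.zeros((6, 128))
frames[2] += 1.0                  # center-channel-only content...
Lo, Ro = fold_down_5_1(*frames)   # ...lands in both outputs at -3 dB
```

Because every listener gets the same fixed coefficients, content mixed for 5.1 can clip, lose dialog clarity, or collapse spatial effects in stereo; a game that re-renders its sources for the active configuration sidesteps all of that.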

Special Event
AES/APRS—LIFE IN THE OLD DOGS YET—PART THREE: AFTER THE BALL—PROTECTING THE CROWN JEWELS
Tuesday, May 25, 14:00 – 15:45
Room C2

Moderator: John Spencer, BMS CHACE

Panelists: Chris Clark, British Library Sound Archive
Tommy D, Producer
Simon Drake, Naim Label Records
Tony Dunne, A&R Coordinator, DECCA Records and UMTV/UMR, UK
Simon Hutchinson, PPL
Paul Jessop, Consultant, IFPI/RIAA
George Massenburg, P&E Wing, NARAS
Crispin Murray, Metropolis Mastering

A fascinating peek into the unspoken worlds of archiving and asset protection. It examines the issues surrounding retrievable formats that promise to future-proof recorded assets and the increasing importance of accurate recording information (metadata). A unique group of experts from the archiving and royalty distribution communities will hear a presentation from John Spencer of BMS CHACE in Nashville, explaining his work with NARAS and the U.S. Library of Congress to establish an information schema for sound recordings and film and TV audio, and will then engage in a group discussion. The discussion then moves on to probably the most important topic to impact the future of the sound and music economies—how to keep what we've got and reward those who made it.

Student Event/Career Development
STUDENT DELEGATE ASSEMBLY MEETING—PART 2
Tuesday, May 25, 14:00 – 15:30
Room C6

At this meeting the SDA will elect a new vice chair. One vote will be cast by the designated representative from each recognized AES student section in the European and International Regions. Judges' comments and awards will be presented for the Recording Competitions. Plans for future student activities at local, regional, and international levels will be summarized.

Tuesday, May 25 14:30 Room Saint Julien
AESSC Plenary Meeting

Tutorial 11 Tuesday, May 25 16:00 – 17:30 Room C6

LOUDSPEAKERS AND HEADPHONES

Presenter: Wolfgang Klippel, Klippel GmbH, Dresden, Germany

Distributed mechanical parameters describe the vibration and geometry of the sound-radiating surface of loudspeaker drive units. These data are the basis for predicting the sound pressure output and for a decomposition of the total vibration into modal and sound-pressure-related components. This analysis separates acoustical from mechanical problems, shows the relationship to the geometry and material properties, and gives indications for practical improvement. The tutorial combines the theoretical background with practical loudspeaker diagnostics, illustrated on various kinds of transducers such as woofers, tweeters, compression drivers, microspeakers, and headphones.
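One generic way to decompose scanned vibration data into dominant spatial shapes is the singular value decomposition. This is a hypothetical sketch with random stand-in data, not Klippel's specific decomposition into modal and sound-pressure-related components:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a laser scan: surface velocity at 200 measurement
# points across 64 analysis frequencies.
velocity = rng.standard_normal((200, 64))

# SVD factors the scan into spatial shapes (columns of U), their
# strengths (s), and their variation over frequency (rows of Vh).
U, s, Vh = np.linalg.svd(velocity, full_matrices=False)

# Keeping the three dominant shapes gives a low-rank "modal" summary
# of the measured vibration.
rank3 = (U[:, :3] * s[:3]) @ Vh[:3]
```

With real scan data, the leading shapes typically correspond to rigid-body (pistonic) motion and the first bending modes, which is what makes such a decomposition diagnostically useful.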
