The Care and Feeding of Loudness Models J. D. (jj) Johnston Chief Scientist Neural Audio Kirkland,...

The Care and Feeding of Loudness Models

J. D. (jj) JohnstonChief ScientistNeural Audio

Kirkland, Washington, USA

What’s the big deal?(why do I yammer on about loudness vs. intensity, anyhow?)

• Language exists to communicate.• If we don’t use the same meanings to the words

(leaving out philosophical issues for now), communication is difficult.

• Hence, a few definitions and words to help us in the discussions.

• This understanding will also help explain some of the reasons why different compression sounds so different.

Intensity – What is it?

• Intensity is the EXTERNAL MEASURED LEVEL OF A SOUND.

• There are many measures of intensity. For now, we will use the common psychoacoustic (not mechanical engineering) definition of SPL, or “sound pressure level”.– There are assumptions built into SPL. Let’s keep it simple

for now and forget them.– SPL, usually measured in dB, is a measure of acoustic

energy in the atmosphere.

What is Loudness?• Loudness is the INTERNAL, SUBJECTIVE EXPERIENCE OF HOW

LOUD A SIGNAL IS.• The term loudness dates back at least to Fletcher, if not

beyond.• Loudness is not intensity• Loudness and intensity can be mostly related by a complex

calculation, however:– Every listener is a bit different– Hearing injuries affect loudness in many ways– Intensity does NOT equal loudness. The relationship is complex, and

could constitute an entire tutorial in and of itself. – In the worst cases, intensity is a very poor substitute for loudness, and

vice versa.

When do we use Intensity?

• Intensity is an OBJECTIVE measure. It measures the actual fluctuations in air pressure in the atmosphere that constitute sound.– We use intensity when that is what we need to

know.– We do NOT use intensity when we want to know

how loud a sound is to a listener.

When do we use Loudness• When we are trying to describe the experience that the

listener has.• When we are trying to estimate psychoacoustic parameters.• When we want to know why somebody is shouting “turn that

**** thing down!” and the level (intensity) isn’t that high.• When we want to know why somebody is shouting “TURN UP

THE SOUND” when the intensity is already excessive.• When we want to match loudness, NOT level, across audio

selections, either full-bandwidth or from remotes/phones.

What is Loudness?

• Loudness is the perceived strength of an audio signal.

• Loudness is something that works differently inside of a critical bandwidth and outside of a critical bandwidth

• Loudness differences between frequencies depends substantially on the presentation intensity at both low and high frequencies.

Everyone’s Seen This.But what does it mean?

Equal Loudness Curves

• Show loudness equivalent between tones presented sequentially

• Doesn’t show what happens when you have multiple frequencies present at once

• Doesn’t account for growth of sensation, that’s a different experiment

What now?

• To make a very long story short, we know some things.– Inside of a critical bandwidth (or ERB), loudness

grows as the 1/3.5 power of the power present in that band.

– When energy is spread over critical bandwidths, the loudness ADDS across frequency outside of a critical bandwidth, but is compressed inside each.

Implications?• Single band weighting filters can’t get it right.– They can get it moderately right for wideband signals

with similar spectrum, where spectrum is smoothed on a critical band basis.• This means that for some “typical signals” they aren’t too far

off.• There’s no mention of time here yet.– Loudness is sensed across frequency at a given time.

• That’s called “partial loudness”• Getting that far is easy.

NOW What?

The time domain

• While a variety of experiments have shown that the sum of partial loudnesses is a good measure of total loudness for a given instant, there is a lack of work on what it means when either partial or total loudness varies over time.

And, that is where we are today.

Some other issues

• In order to get it right, you have to know the intensity at the playback site (i.e. volume control setting, efficiency, acoustics, etc)– Good luck with that.

• Distortion, especially in the upper (70-120Hz) region can throw off loudness measurement by a phenominal amount.

Implications?• Single band weighting filters can’t get it right.– They can get it moderately right for wideband signals

with similar spectrum, where spectrum is smoothed on a critical band basis.• This means that for some “typical signals” they aren’t too far

off.• There’s no mention of time here yet.– Loudness is sensed across frequency at a given time.

• That’s called “partial loudness”• Getting that far is easy.

NOW What?

The time domain

• While a variety of experiments have shown that the sum of partial loudnesses is a good measure of total loudness for a given instant, there is a lack of work on what it means when either partial or total loudness varies over time.

And, that is where we are today.

Some issues

• In order to get it right, you have to know the intensity at the playback site (i.e. volume control setting, efficiency, acoustics, etc)– Good luck with that.– This creates particularly difficult issues below 500Hz.

• Distortion, especially in the upper (70-120Hz) region can throw off loudness measurement by a phenominal amount.

Bass distortion

• Consider a 90 Hz sine wave.– Harmonics at 180 and 270– Each of those harmonics is separated by a critical

bandwidth.

– Remember compression? If the teensy woofer has 20dB SNR (that would be a very good woofer), the total loudness would scale to something like

– 1^1/4 + .01^1/4 +.01^1/4 = 1.63

The point?

• Overall, loudness models for extended periods are still in development.– We don’t know if loudness or annoyance, or

something else, is what people adjust volume controls for

– We don’t know if it’s peak or average, or some of both

– We don’t know if everyone responds in the same fashion

Maybe they don’t all act the same• In a very small (and therefore quite inconclusive) test, we

found:– Some people seemed to set a level balance based on something

like average– Others tended more toward equalizing peaks.

This experiment compared a signal to a “homogenized” signal that had exactly the same average spectrum. The sample size is very small (4-8 subjects), and it can only be taken with an extreme grain of salt at this time.

None the less, responses were quite bimodal, not “random”, to at least some extent.

What Else?

• We don’t know how well people agree on long term vs short term preferences– Some people seem to care about peak– Some care about something kinda-sorta like average

• We need a system that can be adapted at the point of playback, NOT at the source.– Then, just maybe, we might get some dynamic range

back

Bass distortion

• Consider a 90 Hz sine wave.– Harmonics at 180 and 270– Each of those harmonics is separated by a critical

bandwidth.

– Remember compression? If the woofer has 20dB SNR (that would be a very good woofer), the total loudness would scale to something like

– 1^1/4 + .01^1/4 +.01^1/4 = 1.63

The point?

• Overall, loudness models for extended periods are still in development.– We don’t know if loudness or annoyance, or

something else, is what people adjust volume controls for

– We don’t know if it’s peak or average, or some of both

– We don’t know if everyone responds in the same fashion

What Else?

• We don’t know how well people agree on long term vs short term preferences– Some people seem to care about peak– Some care about something kinda-sorta like average

• We need a system that can be adapted at the point of playback, NOT at the source.– Then, just maybe, we might get some dynamic range

back

The Care and Feeding of Loudness Models J. D. (jj) Johnston Chief Scientist Neural Audio Kirkland,...

Documents

Transcript of The Care and Feeding of Loudness Models J. D. (jj) Johnston Chief Scientist Neural Audio Kirkland,...