ARTICULATION_in_Audio_Processing_Feb2015

7/30/2013 1 http://www.linkedin.com/in/gpbrefini/


Audio Processing

• Critical part of an “internet radio” station

– Very important for digital streaming, to avoid artifacts of the compression algorithm

• Must operate on diverse array of files

– Think in grand scale: 80’s music to current day

• Instruments, mastering, music structure, recording technology

• Dynamically adjust to maintain high probability of clarity or articulation of original file structure in spectral and temporal domains while providing a uniform “sound signature” on these diverse files

– Articulation or intelligibility of complex music audio files is preserved and enhanced

• Maintain transient “punch”

• Goal is to make sure “everyone wins” in the mix

– “Muddiness” must be avoided

– Preserve loudness, pitch and timbre

• Purpose

– A consistent “sonic signature” for the station format

Motivation: The Connected Dash


• Delivering Internet audio to the car is hard

– Carriers’ signals are not ubiquitous ... yet

– Every day more people access high-speed content, particularly at rush hour

– Carriers have limited bandwidth

• Dash applications differ from one auto manufacturer to another

• Early studies show humans want Internet radio in car to work like conventional radio

• NO ONE ARGUES: Internet radio is the FUTURE!

– The automobile is the listening “theater”

Good Audio Processing is Multi-Band Processing!

• Perfected by Mike Dorrough (based on Altec-Lansing design of the 1950’s)

– “Monolith” installed at KRLA, 8 band analog processor

– DAP-310, 3 band analog

– Both had phase equalized pass-bands before combining back to composite

• Minimize phase rotations at band edges

• Linear Group Delay!!!!
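The recombination point above can be illustrated with a minimal sketch. It is not the Dorrough design (those units were analog); it only shows why linear phase makes band recombination clean. A symmetric FIR low-pass delays every frequency by the same number of samples, so the high band can be formed as "delayed input minus low band" and the two bands sum back to a pure delay of the input. The function names, the 200 Hz crossover, and the 101-tap length are illustrative assumptions, not values from the slides.

```python
import math

def windowed_sinc_lowpass(cutoff_hz, sample_rate_hz, num_taps):
    """Symmetric (linear-phase) low-pass FIR from a Hamming-windowed sinc."""
    if num_taps < 3 or num_taps % 2 == 0:
        raise ValueError("use an odd tap count >= 3 so the delay is a whole number of samples")
    fc = cutoff_hz / sample_rate_hz          # normalized cutoff in cycles per sample
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]         # scale for unity gain at DC

def fir_filter(taps, signal):
    """Direct-form FIR convolution; output has the same length as the input."""
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, t in enumerate(taps):
            if i - j >= 0:
                acc += t * signal[i - j]
        out.append(acc)
    return out

def split_two_bands(signal, taps):
    """Low band = FIR output; high band = delayed input minus low band."""
    delay = (len(taps) - 1) // 2             # constant group delay of a symmetric FIR
    low = fir_filter(taps, signal)
    delayed = [0.0] * delay + list(signal[:len(signal) - delay])
    high = [d - l for d, l in zip(delayed, low)]
    return low, high, delay
```

With `taps = windowed_sinc_lowpass(200.0, 44100.0, 101)`, summing the two bands returned by `split_two_bands` reproduces the input delayed by 50 samples, with no phase rotation at the band edge. In a real processor each band would be compressed separately before the sum, which is where the constant delay matters.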


“Process for the Stream”

• Streams use “lossy” data compression such as:

– MPEG, Real Audio, Microsoft's MSV2 codec

– Linear 16-bit 44.1 kHz stereo audio stream: ~1.4 Mb/s

– At 128 kb/s the MPEG Layer 3 compression ratio is approximately 11:1

– At 256 kb/s the MPEG Layer 3 compression ratio is about 6:1

• Critical area of these perceptual coding schemes is the high-frequency area

– Maintain consistent amplitude near full scale (FS) for the codec

– Keep the upper spectrum free from clipping distortion or excessive high-frequency processing

– Consistent spectral balance over a wide range of material is a must!
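The compression ratios quoted on this slide follow from simple arithmetic, sketched below. The 16-bit word length is an assumption (the slide does not state it); with it, linear 44.1 kHz stereo PCM works out to about 1.41 Mb/s, which gives roughly 11:1 at 128 kb/s and 5.5:1 at 256 kb/s. The function names are illustrative.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed linear PCM in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

def compression_ratio(pcm_bps, coded_bps):
    """How many times smaller the coded stream is than the PCM source."""
    return pcm_bps / coded_bps

# Assumed source format: 44.1 kHz, 16-bit, stereo
source_bps = pcm_bit_rate(44_100, 16, 2)             # 1,411,200 b/s
ratio_128k = compression_ratio(source_bps, 128_000)  # 11.025
ratio_256k = compression_ratio(source_bps, 256_000)  # 5.5125
```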


Codec Magic: Masking!

• Codecs remove redundant information whose removal humans will not perceive

– Audio spectrum split into 500 bands

– Algorithm models human ear

• CODEC dynamically computes a “best frequency domain fit” where certain signals present can be removed

• CODEC also performs “level masking” taking advantage of how human hearing focuses on what’s going on in the foreground

• Typically only about 20% of the original audio file needs to be transmitted!


Avoiding “watery sound” of Internet Radio

• Coders do not like hard-limited audio: harmonics get squirreled into the pass-band, where the algorithm cannot model them

• RMS is more important than the peak of any waveform

– It is a measure of energy over time

– Normalize RMS relative to FS (peaks can’t exceed 0 dBFS)

• Peak to RMS ratio is critical

• Contemporary Hit Music format uses processing to make it more exciting

– As in movie production: frame by frame, the image is color corrected & exposure corrected

• The better we understand the transmission system and its technical challenges, and the more we can minimize or hide the sonic challenges, the better we sound!
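The RMS and peak-to-RMS points above can be made concrete with a short sketch. It computes RMS, peak, and crest factor (peak-to-RMS ratio in dB) for a block of samples scaled so that full scale is 1.0, and normalizes to a target RMS level while never letting the peak exceed 0 dBFS. The function names and the "gain is capped by the peak ceiling" policy are assumptions for illustration, not the presenter's processing.

```python
import math

def rms(samples):
    """Root-mean-square level: a measure of energy over the block."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def peak(samples):
    """Largest absolute sample value in the block."""
    return max(abs(s) for s in samples)

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB; hard-limited audio pushes this toward 0 dB."""
    level = rms(samples)
    if level == 0.0:
        raise ValueError("crest factor is undefined for silence")
    return 20 * math.log10(peak(samples) / level)

def normalize_to_rms(samples, target_rms_dbfs, peak_ceiling=1.0):
    """Scale toward a target RMS (dBFS), but never let peaks pass the ceiling."""
    level = rms(samples)
    if level == 0.0:
        return list(samples)                       # nothing to scale in silence
    wanted_gain = 10 ** (target_rms_dbfs / 20) / level
    safe_gain = peak_ceiling / peak(samples)       # largest gain that avoids clipping
    gain = min(wanted_gain, safe_gain)
    return [s * gain for s in samples]
```

A sine wave has a crest factor of about 3 dB and a square wave (the limiting case of hard clipping) has 0 dB, so a falling crest factor is a quick warning sign that material has been limited harder than the codec will like.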


http://schedule.sxsw.com/2011/events/event_MP7661 http://www.digido.com

Articulation Processing Example


• Process to bring the kick drum beater slaps forward

– Use linear phase to keep the transient rise/fall times steep

• Bring near-infrasonics forward during transients in the mid-range (2.1 – 6.4 kHz)

– This gives the audio a subtle “thump”
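The slides do not describe how transients are detected, so the following is only a generic sketch of one common approach (a fast envelope follower compared against a slow one), not the proprietary articulation process. While the fast envelope leads the slow one, a transient is in progress and a boost gain is produced; during steady material the gain returns to 1. All time constants, the 3 dB boost, and the function names are hypothetical.

```python
import math

def envelope(samples, sample_rate_hz, attack_s, release_s):
    """One-pole envelope follower with separate attack and release times."""
    attack = math.exp(-1.0 / (attack_s * sample_rate_hz))
    release = math.exp(-1.0 / (release_s * sample_rate_hz))
    env = 0.0
    out = []
    for s in samples:
        level = abs(s)
        coeff = attack if level > env else release
        env = coeff * env + (1.0 - coeff) * level
        out.append(env)
    return out

def transient_gain(samples, sample_rate_hz, boost_db=3.0):
    """Per-sample gain: up to boost_db during onsets, 1.0 on steady signal."""
    fast = envelope(samples, sample_rate_hz, 0.001, 0.050)   # 1 ms attack (assumed)
    slow = envelope(samples, sample_rate_hz, 0.020, 0.050)   # 20 ms attack (assumed)
    max_gain = 10 ** (boost_db / 20)
    gains = []
    for f, s in zip(fast, slow):
        # Fraction by which the fast envelope leads the slow one (0 = steady).
        lead = (f - s) / f if f > 0.0 else 0.0
        lead = min(max(lead, 0.0), 1.0)
        gains.append(1.0 + (max_gain - 1.0) * lead)
    return gains
```

In a multi-band chain this gain would be applied only to the band being brought forward, and the band split would need to be linear phase so the boosted transient keeps its steep rise and fall times.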

Example:

• Modern Hit Music Station format

– Today’s Hits, the 2K’s, the 90’s and the 80’s

• Processing Challenges

– Modern music has

• Very limited dynamic range

• Large bottom

• Digitally corrected vocals

– 80’s music has

• Larger dynamic range

• More traditional instruments, less synthesized

• SPARS code was most likely AAD

• Need processing that makes for a consistent air sound


The Nation’s Hit Music Station!


Radio XL5 (live stream and website links on the original slide)

The Audio Chain

• Mild multiband processing with impact/thump enhancements

– Articulation processing

• Second stage multi-band processing

– More bands

– Clip/Bass distortion correction

– Mild stereo enhancement

• Articulation processing

• Two-band DSP limit/compression

• Analog “fast” compressor

• High-end Soundcards


Processing blocks labeled on the slide: BreakAway Proc, StereoTool Proc, Behringer Digital Proc, Alesis Analog Proc, “Proprietary System” Articulation Proc

Internet “Air Sound” vs FM “Air Sound”

• Internet Radio

– Flat audio processing throughout the air chain

• FM Radio

– Requires pre-emphasis: 17 dB gain at 15 kHz for 75 µs (US radio)!

• Most modern music is highly clipped/limited

– FM pre-emphasis really increases distortion

– Internet audio is flat, so de-clipping algorithms are easy to use
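The 17 dB figure can be checked with the standard first-order pre-emphasis model, whose magnitude response is |1 + j2πfτ|. This is a sketch of the textbook formula, assuming an ideal single-pole network; the function name is illustrative.

```python
import math

def preemphasis_gain_db(freq_hz, time_constant_s=75e-6):
    """Gain of an ideal first-order pre-emphasis network at freq_hz, in dB."""
    w_tau = 2 * math.pi * freq_hz * time_constant_s
    return 10 * math.log10(1 + w_tau ** 2)

# 75 microseconds at 15 kHz gives roughly 17.07 dB of boost.
gain_15k = preemphasis_gain_db(15_000)
```

Because that boost lands on the top octave, any clipping harmonics already present in heavily limited music are raised by the same amount, which is why flat Internet processing is the easier case.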


Summary


• Playback using high quality sources

• Multi-stage, multi-band processors

– Less is more

– Phase & bass correction

– Declipper function is important

– Goal is spectral balance

• Apply multiple articulation/transient “punch” processing at front and back end of chain

– Avoid anything that does not have linear group delay

– Preserve original transient information (spectral components)

• Minimal analog processing if possible

• Use high end sound cards

– A/D and D/A with low-jitter clocks, preferably locked to a common source