MPEG-4 Structured Audio

MPEG-4 Structured Audio Eric D. Scheirer [email protected] Machine Listening Group MIT Media Laboratory Editor, ISO 14496-3 (MPEG-4 Audio) Project Bar-B-Q 1999 Guadalupe River Ranch 15 Oct 1999

Page 1: MPEG-4 Structured Audio

MPEG-4 Structured Audio

Eric D. Scheirer
[email protected]

Machine Listening Group
MIT Media Laboratory
Editor, ISO 14496-3 (MPEG-4 Audio)

Project Bar-B-Q 1999
Guadalupe River Ranch

15 Oct 1999

Page 2: MPEG-4 Structured Audio

MPEG-4 Structured Audio, a New Standard for Interactive Sound, in the Creation of Which Tom White Did Not Run the Whole Show, but Only Played a Small (Though Valuable) Part

Page 3: MPEG-4 Structured Audio

What’s this all about?

MPEG-4 is not just about compression

MPEG-4 shows one way for the IA world to move beyond wavetable synthesis

Page 4: MPEG-4 Structured Audio

Overview

What is MPEG?
What is MPEG-4 Structured Audio?
Why was it created?
How does it work?
How can it be used in IA applications?
What is its current status?
A brief note on MPEG-4 AudioBIFS

Page 5: MPEG-4 Structured Audio

Intellectual property in MPEG-4

Structured Audio and AudioBIFS are free
  All patentable IP has been released to public domain
  No licensing or other costs to build tools & players
  (Standard itself costs $300 for printing/bureaucracy)

SA and AudioBIFS are open standards
  Companies competing through cooperation
  Interoperability makes the whole pie bigger
  MPEG processes for improving/correcting standard
  MIT has no veto over the future of the standard

Page 6: MPEG-4 Structured Audio

What is MPEG?

MPEG is ISO/IEC JTC1 SC29 WG11
  A working group of the Int'l Organization for Standardization (ISO)
  The "Moving Picture Experts Group"

MPEG-1: 1993 (ISO 11172)
  Digital audio/video coding (MP3)

MPEG-2: 1994-7 (ISO 13818)
  Digital coding for broadcast

MPEG-4: 1998 (ISO 14496)
  Object-based, synthetic/natural, interactive coding

Page 7: MPEG-4 Structured Audio

MPEG Marketplace Model

[Diagram: MPEG Committee and MPEG Standard at the center; server-side tools makers and client-side tools makers; authoring tools and playback tools; content developers and content consumers; MPEG content]

Page 8: MPEG-4 Structured Audio

MPEG Marketplace Model

[Same marketplace diagram as the previous slide, with one part marked "This talk"]

Page 9: MPEG-4 Structured Audio

MPEG Marketplace Model

[Same marketplace diagram as the previous slide, with one part marked "The business opportunities"]

Page 10: MPEG-4 Structured Audio

MPEG-4 Audio

High-quality sound
  Based on MPEG-AAC algorithm: twice as good as MP3

Low-bitrate sound
  For WWW and cellular: speech/music as low as 4 kbps

Synthetic sound
  Interface to Text-to-Speech synthesizers
  High-quality audio synthesis with Structured Audio

AudioBIFS
  Mix and postproduce multi-track sound streams

Page 11: MPEG-4 Structured Audio

MPEG-4 Structured Audio

Transmit structured description of sound
Use real-time synthesis to play sound
"PostScript for audio"
Based on new (to MPEG) technology

SAOL: New music synthesis language
SASL: New music control format

A lot of related technology in academia
  Csound, Music-11, SynthScript, Nyquist, CLM, ...

Page 12: MPEG-4 Structured Audio

Standardization goals

Stated goals
  Provide synthetic sound in MPEG-4
  Bring algorithmic synthesis to wider community
    Standardize academic state-of-the-art; don't innovate

Secret goals
  Get new companies to work on synthesis
    Implementation required for full MPEG-4 system
  Set a higher bar for PC sound architecture
    Drive forward the world of sound on PCs!

Page 13: MPEG-4 Structured Audio

MPEG-4 SA decoding process

[Diagram: the bitstream header goes to the SAOL decoder, which configures the reconfigurable synthesis engine. The bitstream goes to the SASL/MIDI decoder, which supplies control parameters to the engine; samples are delivered to the engine as well. The engine outputs multichannel high-quality audio.]
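The decoding flow on this slide can be pictured with a toy model. The sketch below is only an illustration under stated assumptions, not the normative decoder: `ToySynthEngine`, `sine_beep`, the event tuple layout, and the 8 kHz rate are all hypothetical, and a plain Python function stands in for an instrument that would really be written in SAOL. It shows the two-stage idea: the header configures the engine once, then timed control events drive it.

```python
import math

SAMPLE_RATE = 8000  # assumed rate, chosen only to keep the sketch small


def sine_beep(freq, amp, n):
    """Stand-in for a SAOL instrument: n samples of a sine tone."""
    return [amp * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]


class ToySynthEngine:
    """Hypothetical 'reconfigurable synthesis engine'."""

    def __init__(self):
        self.instruments = {}

    def configure(self, header):
        # Stage 1: the header carries the instrument definitions.
        self.instruments = dict(header)

    def render(self, events, total_seconds):
        # Stage 2: control events (start, instrument, duration, freq, amp)
        # trigger instruments, whose output is summed into one buffer.
        out = [0.0] * int(total_seconds * SAMPLE_RATE)
        for start, name, dur, freq, amp in events:
            begin = int(start * SAMPLE_RATE)
            n = min(int(dur * SAMPLE_RATE), len(out) - begin)
            if n <= 0:
                continue
            for i, s in enumerate(self.instruments[name](freq, amp, n)):
                out[begin + i] += s
        return out


header = {"beep": sine_beep}
events = [(0.0, "beep", 0.5, 440.0, 0.5), (0.25, "beep", 0.5, 660.0, 0.25)]

engine = ToySynthEngine()
engine.configure(header)
audio = engine.render(events, total_seconds=1.0)
```

Swapping the `header` dictionary changes what the engine can synthesize without touching the engine itself, which is the sense in which the slide calls it reconfigurable.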

Page 14: MPEG-4 Structured Audio

What SAOL looks like

A C-like language
Based on the Music-N model
Variables hold audio signals
Unit generators do basic functions
Instruments controlled by score or MIDI

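  instr beep(mp, vol) {
    asig wave;                        // audio-rate signal
    ksig env;                         // control-rate signal
    table sig(harm,2048,1,1);         // 2048-point table built from harmonics 1 and 2

    wave = oscil(sig,cpsmidi(mp));    // loop the table at the pitch of MIDI note mp
    env = kline(0,dur*0.05,vol,       // linear envelope: rise over 5% of the note,
                dur*0.6,vol,          // hold for 60%,
                dur*0.35,0);          // fall over the final 35%
    output(wave * env);
  }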

SAOL: Structured Audio Orchestra Language
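For readers who do not know SAOL, the `beep` instrument can be approximated in Python. This is a rough sketch, not a SAOL implementation: the 8 kHz sampling rate is an assumption, the table is left unnormalized, the oscillator does no interpolation, and the envelope is evaluated per sample rather than at a separate control rate. The helper names mirror the SAOL opcodes, but their bodies are one reading of what those opcodes do.

```python
import math

SR = 8000          # assumed audio sampling rate for this sketch
TABLE_LEN = 2048   # table size used on the slide


def cpsmidi(note):
    """MIDI note number to frequency in Hz (note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def harm_table(size, *weights):
    """One cycle of a weighted sum of harmonics (unnormalized here)."""
    return [sum(w * math.sin(2 * math.pi * (h + 1) * i / size)
                for h, w in enumerate(weights))
            for i in range(size)]


def kline(t, points):
    """Piecewise-linear envelope; points = [v0, d1, v1, d2, v2, ...]."""
    value, start = points[0], 0.0
    for j in range(1, len(points) - 1, 2):
        seg, target = points[j], points[j + 1]
        if t < start + seg:
            return value + (target - value) * (t - start) / seg
        value, start = target, start + seg
    return value  # past the last segment: hold the final value


def beep(mp, vol, dur):
    table = harm_table(TABLE_LEN, 1, 1)
    freq = cpsmidi(mp)
    env_points = [0, dur * 0.05, vol, dur * 0.6, vol, dur * 0.35, 0]
    out, phase = [], 0.0
    for i in range(int(dur * SR)):
        wave = table[int(phase) % TABLE_LEN]           # table-lookup oscillator
        out.append(wave * kline(i / SR, env_points))   # apply the envelope
        phase += freq * TABLE_LEN / SR                 # advance by one sample
    return out


samples = beep(mp=69, vol=0.5, dur=1.0)
```

The SAOL version is shorter because the table generator, oscillator, envelope, and sample loop are all supplied by the language.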

Page 15: MPEG-4 Structured Audio

SAOL capabilities

Many nice features built in
  Wavetable manipulation
  FFT/IFFT
  Multitap delay lines
  Arrays of signals
  FIR & IIR filters
  Effects routing
  Granular synthesis
  3-D audio interface
  Dynamic layering and triggering

SAOL is extensible-from-within
  (Allows encapsulation and structured programming)

Any kind of synthesis can be used in SAOL

Page 16: MPEG-4 Structured Audio

Example

“Xanadu” (Joseph Kung)
  60 seconds long, 44 kHz stereo (10.5 MB as WAVE)
  2.2 KB in header
  4.2 KB in bitstream (= 0.07 KB/s, about 0.56 kbps)
  No samples anywhere, only algorithmic synthesis

More than 1200:1 “compression”, no loss of quality
Could be controlled/restructured interactively
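The numbers on this slide can be checked with a few lines of arithmetic. The sketch assumes 44.1 kHz, 16-bit PCM for the WAVE file and 1 KB = 1000 bytes; neither is stated on the slide.

```python
SECONDS = 60
SAMPLE_RATE = 44_100     # assumed: the slide only says "44 kHz"
CHANNELS = 2
BYTES_PER_SAMPLE = 2     # assumed 16-bit PCM

# Size of the uncompressed recording.
wave_bytes = SECONDS * SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE

# Size of the Structured Audio representation (1 KB taken as 1000 bytes).
header_bytes = 2200
stream_bytes = 4200
sa_bytes = header_bytes + stream_bytes

ratio = wave_bytes / sa_bytes

# Streaming rate of the bitstream part alone.
stream_kilobytes_per_s = stream_bytes / 1000 / SECONDS
stream_kilobits_per_s = stream_bytes * 8 / 1000 / SECONDS

print(f"WAVE size: {wave_bytes / 1e6:.1f} MB")
print(f"ratio:     {ratio:.0f}:1")
print(f"bitstream: {stream_kilobytes_per_s:.2f} KB/s = {stream_kilobits_per_s:.2f} kbit/s")
```

Under these assumptions the WAVE file is about 10.6 MB and the ratio is roughly 1650:1, consistent with the slide's "more than 1200:1". The 4.2 KB bitstream spread over 60 seconds is 0.07 kilobytes per second, which is 0.56 kilobits per second.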

Page 17: MPEG-4 Structured Audio

MPEG-MMA relationship

MIDI can control MPEG-4 SA synth
  SASL = more flexible, more tightly coupled

DLS-2 synthesis embedded in SA synth
  Do wavetable in series or parallel with other techniques

“Wavetable-only” profile of MPEG-4
  MIDI + DLS-2 + compressed audio + video (no SAOL)
  Logical path of progression from today to tomorrow

Lots of help from MMA - appreciated!
  MPEG is ready to help in the other direction (MIDI-DLA?)

Page 18: MPEG-4 Structured Audio

Applications ideas

MPEG-4 is not an application!
  It’s a tool - enables functionality and interoperability
  Implementations could be hardware, software, both
  Authoring tools also very important

Use MPEG-4 SA like Staccato Synthcore
Use MPEG-4 SA like Beatnik
Use MPEG-4 SA like Koan
Use MPEG-4 SA for new music applications

Page 19: MPEG-4 Structured Audio

Application example: Gaming

[Diagram: an MPEG-4 enabled sound card receives MPEG-4 synthesis/effects algorithms at startup and MPEG-4 & MIDI controls from the host program (game) at runtime, and outputs multichannel, 3-D, post-processed sound. The algorithms come from MPEG-4 algorithm and sample editors and an MPEG-4 algorithm marketplace.]

Not just music -- parametric sound effects as well
All audio programming and asset development in SAOL
  No host-language audio programming needed
Host APIs (e.g. DirectMusic) can generate controls
  Embedded MPEG-4 side can do this too, if useful

Page 20: MPEG-4 Structured Audio

Current status

Standard and reference software finished
Many implementation projects starting
  Creative Tech Center: Compression & Interactive Audio
  Studer + EPFL: “ThreeDSpace” project
  Hobbyist projects (Java API, ActiveX plugin)
  Others: Be Inc., Sseyo, Kings College, UC Berkeley, Catholic U. Leuven, Q-Team DE, Nokia, ...
  3 complete implementations already!

A few authoring tools projects
Active mailing list for developers

Page 21: MPEG-4 Structured Audio

A brief note on AudioBIFS

BIFS is scene-description part of MPEG-4
  “Binary Format for Scenes”
  Based on VRML, but with many new features

AudioBIFS is the audio mixing part
  Stream audio in multitrack format
  Deliver mixdown instructions in AudioBIFS
  Mixing, spatialization, effects in SAOL, multichannel
  Terminal-adaptive capability
  Candidate for “PC DSP architecture”?

Page 22: MPEG-4 Structured Audio

AudioBIFS - scene graph model

[Diagram: a scene graph, read from the bottom up. A natural decoder and a synthetic decoder each feed an AudioSource node; the AudioSource nodes feed AudioBIFS manipulation nodes, which feed a Sound node.]

Streaming compressed audio & synthesis controls
Decode into raw audio samples
Inject sound into scene graph
Create sound object with AudioBIFS (mixing, filtering, reverb, etc.)
Attach sound to main scene (spatially position if desired)
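The scene graph steps above can be sketched as a small tree of Python objects. This is a toy model, not the AudioBIFS node interface: the class names are borrowed from the node names on the slide for readability, while every field and method here (`samples`, `gains`, `location`, `render`) is a hypothetical stand-in chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AudioSource:
    """Leaf node: already-decoded samples from one stream (hypothetical)."""
    samples: List[float]


@dataclass
class AudioMix:
    """Manipulation node: weighted sum of its children (hypothetical)."""
    children: List[AudioSource]
    gains: List[float]

    def render(self) -> List[float]:
        out = [0.0] * max(len(c.samples) for c in self.children)
        for child, gain in zip(self.children, self.gains):
            for i, s in enumerate(child.samples):
                out[i] += gain * s
        return out


@dataclass
class Sound:
    """Top node: attaches the audio subgraph to the scene at a position."""
    source: AudioMix
    location: Tuple[float, float, float] = (0.0, 0.0, 0.0)

    def render(self) -> List[float]:
        # Spatialization is omitted; the position is only carried along.
        return self.source.render()


natural = AudioSource([0.5, 0.5, 0.5, 0.5])   # e.g. output of a natural decoder
synthetic = AudioSource([1.0, -1.0])          # e.g. output of a synthetic decoder
mix = AudioMix([natural, synthetic], gains=[0.5, 0.25])
sound = Sound(mix, location=(1.0, 0.0, -2.0))
mixed = sound.render()
```

The point of the structure is that the mixdown lives in the scene description rather than in the audio streams, so the same streams can be remixed by sending a different graph.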

Page 23: MPEG-4 Structured Audio

Summary

MPEG-4 Structured Audio
  The international standard for algorithmic sound synthesis

MPEG-4 AudioBIFS
  The international standard for audio postproduction

New market opportunities for
  Hardware/software MPEG-4 players (embedded or not)
  Authoring tools (editors, sequencers)
  Advanced interactive audio content

Page 24: MPEG-4 Structured Audio

What was this all about?

MPEG-4 is not just about compression

MPEG-4 shows one way for the IA world to move beyond wavetable synthesis

Page 25: MPEG-4 Structured Audio

For more information

MPEG home page
  http://www.cselt.it/mpeg
  Requirements, future of MPEG

MPEG-4 SA home page
  http://sound.media.mit.edu/mpeg4
  Draft standard, code, mailing lists, matchmaking

[email protected], technical papers, discussion available