MPEG-7 - Electronic Engineeringee502/MPEG-7.pdf · MPEG-7 Audio Amendment 2 will include extended...


Page 1: MPEG-7 - Electronic Engineeringee502/MPEG-7.pdf · MPEG-7 Audio Amendment 2 will include extended functionality of audio metadata that is complementary to low-level audio descriptors

MPEG-7

• MPEG-7 overview
  – What is…
  – Why?
  – Objectives and scope
  – Main elements and organization

• MPEG-7 Audio
  – Low-level features
  – High-level tools

Page 2

What is MPEG-7?

• "Multimedia Content Description Interface"
• ISO/IEC standard by MPEG (Moving Picture Experts Group)
• Provides metadata for multimedia
• MPEG-1, -2, -4: make content available; MPEG-7: makes content accessible, retrievable, filterable, manageable (via device / computer).
• Multiple degrees of interpretation of the information's meaning
• Support as broad a range of applications as possible.
• A compatible (with existing tech) and extensible standard.

Page 3

Why MPEG-7?

• "The value of information often depends on how easy it can be found, retrieved, accessed, filtered and managed."
• Past: scarcity of digital multimedia sources -> simple access mechanisms
• Now: growing amount of audiovisual information -> identifying and managing it efficiently is becoming more difficult, e.g. "record only news about sport."

Page 4

Why MPEG-7?

• For future multimedia services, content representation and description may have to be addressed jointly.
• Many services dealing with content representation will have to deal first with content description
  – "a non-described content may be useless"
• Need for access only to the content description:
  – New original services (e.g. optimizing personal time)
  – Adaptation to networks and terminal capabilities

Page 5

Application domains

• Broadcast media selection (e.g., radio channel, TV channel).
• Digital libraries (e.g., film, video, audio and radio archives).
• E-Commerce (e.g., personalized advertising).
• Education (e.g., repositories of multimedia courses, multimedia search for support material).
• Home Entertainment (e.g., management of personal multimedia collections, including manipulation of content, e.g. karaoke).
• Journalism (e.g., searching speeches of a certain politician using his name, his voice or his face).
• Multimedia directory services (e.g., yellow pages, GIS).
• Surveillance and remote sensing.

Page 6

MPEG-7 Objectives

Standardize content-based description for various types of audiovisual information

• Independent from media support (encoding and storage)
• Different granularity
  – Low-level features: shape, size, key, tempo changes, ...
  – High-level semantic info: "scene with a barking brown dog on the left and with the sound of passing cars in the background."
• Meaningful in the context of the application
  – Same material -> different types of features and combinations, e.g. timbre vs. loudness

Page 7

MPEG-7 Objectives

• Information about the content
  – The form: e.g. the coding format used
  – Conditions for accessing the material: e.g. intellectual property rights / price
  – Classification: e.g. parental rating
  – Links to other relevant materials
  – The context: e.g. "Olympic Games 1996, final of 200 meter hurdles, men"
• Information present in the content:
  – Combination of low-level and high-level descriptors

Page 8

Scope of the Standard

[Figure: processing chain]

Page 9

An example of architecture

• Pull: (Client queries -> Descriptions repository -> Matched Ds)
• Push: (Filter descriptions -> Programmed actions)
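The two access modes can be illustrated with a toy example. This is only a sketch: the description records, their field names (`genre`, `topic`) and the functions `pull` and `push` are hypothetical and are not part of MPEG-7, which defines the descriptions rather than the search or filter engine.

```python
# Hypothetical description records; the field names are illustrative only.
descriptions = [
    {"id": "clip-1", "genre": "news", "topic": "sport"},
    {"id": "clip-2", "genre": "news", "topic": "politics"},
    {"id": "clip-3", "genre": "film", "topic": "sport"},
]

def matches(description, criteria):
    """True when every key/value pair in criteria appears in the description."""
    return all(description.get(k) == v for k, v in criteria.items())

def pull(repository, query):
    """Pull: the client sends a query and receives the matching descriptions."""
    return [d for d in repository if matches(d, query)]

def push(stream, profile, action):
    """Push: incoming descriptions are filtered and an action is triggered."""
    for d in stream:
        if matches(d, profile):
            action(d)

# "Record only news about sport", expressed once as a query and once as a filter
wanted = {"genre": "news", "topic": "sport"}
found = pull(descriptions, wanted)
recorded = []
push(descriptions, wanted, lambda d: recorded.append(d["id"]))
```

In both modes only the descriptions are inspected, never the audiovisual material itself.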

Page 10

Where are the descriptions from?

• Preservation of existing descriptive data (e.g. scripts) through production/delivery
• Generated automatically by capture devices (e.g. time or GPS location in a camera)
• Extracted automatically & semi-automatically (i.e. with some human assistance)
• Manually produced (e.g. for legacy material such as existing film archives)

Page 11

Main Elements of MPEG-7

• Relationship among elements introduced above.

Page 12

Descriptions

• MPEG-7 approaches the description of content from several viewpoints.
• A set of methods and tools for the different viewpoints of the description (not a monolithic system)
• Interrelated and can be combined in many ways.
• Associated with the content itself: (searching, filtering)
• Location: (document vs. stream)
  – physically located with the material
  – somewhere else on the globe (maybe not)
• Interoperability with other metadata standards: (XML)

Page 13

Major Functionalities

• MPEG-7 Systems
• MPEG-7 Description Definition Language
• MPEG-7 Visual
• MPEG-7 Audio
• MPEG-7 Multimedia Description Schemes
• Reference Software: the eXperimentation Model (test)
• MPEG-7 Conformance (syntax checking)
• MPEG-7 Extraction and use of descriptions (technical report)

Page 14

MPEG-7 Audio

• MPEG-7 Audio provides structures for describing audio content, building upon some basic structures from the Multimedia Description Schemes (MDS).
• Low-level Descriptors:
  – audio features that cut across many applications
• High-level Description Tools:
  – more specific to a set of applications.

Page 15

Low-level Features

Page 16

Low-level Features (details)

• Basic: (temporally sampled scalar values for general use)
  – AudioWaveform Descriptor
    • waveform envelope: (for display purposes).
  – AudioPower Descriptor
    • temporally-smoothed instantaneous power: (quick summary of a signal)
• Silence segment: (no significant sound)
  – aids further segmentation of the audio stream, or serves as a hint not to process a segment
  – Applicable to all kinds of signals
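The Basic descriptors above can be approximated in a few lines. The following is a minimal sketch, not the normative extraction: the non-overlapping framing, the function names and the silence threshold of `1e-4` are all assumptions made for illustration.

```python
import math

def frames(signal, size):
    """Split a list of samples into consecutive non-overlapping frames."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

def waveform_envelope(signal, size):
    """Per-frame (min, max) pairs, enough to draw a compact waveform."""
    return [(min(f), max(f)) for f in frames(signal, size)]

def frame_power(signal, size):
    """Per-frame mean of the squared samples (a smoothed power estimate)."""
    return [sum(x * x for x in f) / len(f) for f in frames(signal, size)]

def silent_frames(signal, size, threshold=1e-4):
    """Flag frames whose power falls below an assumed threshold."""
    return [p < threshold for p in frame_power(signal, size)]

# 100 samples of silence followed by 100 samples of a sine wave
tone = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
signal = [0.0] * 100 + tone
envelope = waveform_envelope(signal, 100)
power = frame_power(signal, 100)
silence = silent_frames(signal, 100)
```

The first frame is flagged as silent and could be skipped by later processing, which is the "hint not to process a segment" mentioned above.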

Page 17

Low-level Features (details)

• Basic Spectral: (single time-frequency analysis of signal)
  – AudioSpectrumEnvelope: (Base class)
    • the short-term power spectrum: (display, synthesize, general-purpose search)
  – AudioSpectrumCentroid:
    • Is the spectrum dominated by high or low frequencies?
  – AudioSpectrumSpread:
    • Is the power spectrum concentrated near the spectral centroid, or spread out over the spectrum?
    • distinguishes pure-tone and noise-like sounds
  – AudioSpectrumFlatness: (the presence of tonal components)
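To make the three spectral measures concrete, here is a simplified sketch that takes a power spectrum as two lists. It assumes a log-frequency axis in octaves relative to 1 kHz for centroid and spread, and computes flatness as the ratio of geometric to arithmetic mean; band edges, windowing and other details of the standard are not reproduced, and the function names are illustrative.

```python
import math

def spectrum_centroid(freqs, powers):
    """Power-weighted mean of log2(f / 1000), i.e. octaves relative to 1 kHz."""
    total = sum(powers)
    return sum(math.log2(f / 1000.0) * p for f, p in zip(freqs, powers)) / total

def spectrum_spread(freqs, powers):
    """RMS deviation of the log-frequency spectrum around its centroid."""
    c = spectrum_centroid(freqs, powers)
    total = sum(powers)
    var = sum((math.log2(f / 1000.0) - c) ** 2 * p
              for f, p in zip(freqs, powers)) / total
    return math.sqrt(var)

def spectrum_flatness(powers):
    """Geometric mean over arithmetic mean; all powers must be positive."""
    n = len(powers)
    geometric = math.exp(sum(math.log(p) for p in powers) / n)
    return geometric / (sum(powers) / n)

freqs = [250.0, 500.0, 1000.0, 2000.0, 4000.0]
flat = [1.0] * 5                       # noise-like: equal power everywhere
tonal = [1e-6, 1e-6, 1.0, 1e-6, 1e-6]  # tone-like: one dominant bin
```

With these inputs the noise-like spectrum has a flatness close to 1 and a large spread, while the tone-like spectrum has a flatness close to 0 and a small spread.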

Page 18

Low-level Features (details)

• Signal Parameters: (periodic or quasi-periodic signals)
  – AudioFundamentalFrequency:
    • "confidence measure", replacing "pitch-tracking"
  – AudioHarmonicity:
    • distinction between sounds with a harmonic / inharmonic / non-harmonic spectrum
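One common way to obtain a fundamental frequency together with a confidence value is normalized autocorrelation. The sketch below uses that technique as a stand-in; it is not claimed to be the extraction method of the standard, and the search range (`f_min`, `f_max`) and function names are assumptions.

```python
import math

def autocorrelation(signal, lag):
    """Normalized autocorrelation of the signal at a given lag."""
    n = len(signal) - lag
    num = sum(signal[i] * signal[i + lag] for i in range(n))
    e1 = sum(signal[i] ** 2 for i in range(n))
    e2 = sum(signal[i + lag] ** 2 for i in range(n))
    denom = math.sqrt(e1 * e2)
    return num / denom if denom > 0 else 0.0

def estimate_f0(signal, sample_rate, f_min=50.0, f_max=500.0):
    """Return (f0 in Hz, confidence in [0, 1]) from the strongest lag."""
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_score = lag_min, -1.0
    for lag in range(lag_min, lag_max + 1):
        score = autocorrelation(signal, lag)
        # The small margin keeps the shortest period when multiples tie
        if score > best_score + 1e-9:
            best_lag, best_score = lag, score
    return sample_rate / best_lag, min(1.0, max(0.0, best_score))

sample_rate = 8000
signal = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
f0, confidence = estimate_f0(signal, sample_rate)
```

A strongly periodic signal yields a confidence near 1; for noise the best autocorrelation peak is weak, so the confidence is low, which is the kind of information a harmonicity measure conveys.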

Page 19

Low-level Features (details)

• Timbral Temporal: (temporal characteristics of segments of sounds, musical timbre)
  – LogAttackTime
  – TemporalCentroid
    • where in time the energy of a signal is focused.
    • Useful when attack times are identical

[Figure: Illustration of log-attack time. Signal envelope(t) plotted against t, with the attack running from T0 to T1.]
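The two Timbral Temporal descriptors can be sketched from a sampled envelope as follows. LogAttackTime is taken as log10(T1 - T0); how T0 and T1 are located is an assumption here (the first samples reaching 2% and 90% of the envelope peak), as are the function and parameter names.

```python
import math

def log_attack_time(envelope, sample_rate, start_frac=0.02, stop_frac=0.9):
    """log10 of the time between attack start (T0) and end (T1), in seconds."""
    peak = max(envelope)
    t0 = next(i for i, v in enumerate(envelope) if v >= start_frac * peak)
    t1 = next(i for i, v in enumerate(envelope) if v >= stop_frac * peak)
    return math.log10(max(t1 - t0, 1) / sample_rate)

def temporal_centroid(envelope, sample_rate):
    """Envelope-weighted mean time, in seconds."""
    total = sum(envelope)
    return sum(i * v for i, v in enumerate(envelope)) / (total * sample_rate)

# Envelope sampled at 100 Hz: linear rise over 0.1 s, then linear decay
rate = 100
envelope = [i / 10 for i in range(11)] + [1 - i / 90 for i in range(1, 91)]
lat = log_attack_time(envelope, rate)
centroid = temporal_centroid(envelope, rate)
```

Two sounds with the same attack but different decay lengths share a LogAttackTime yet differ in TemporalCentroid, which is why the second descriptor is useful when attack times are identical.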

Page 20

Low-level Features (details)

• Timbral Spectral: (spectral features in a linear-frequency space)
  – SpectralCentroid:
    • power-weighted average of the frequency of the bins in the linear power spectrum.
    • distinguishing musical instrument timbres
  – 4 Ds for harmonic regularly-spaced components of signals:
    • HarmonicSpectralCentroid
    • HarmonicSpectralDeviation
    • HarmonicSpectralSpread
    • HarmonicSpectralVariation
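The centroid-style descriptors on this slide reduce to weighted averages. The sketch below shows SpectralCentroid over linear-frequency bins and simplified versions of the harmonic centroid and spread computed from harmonic peak amplitudes; the exact weighting and normalization are assumptions for illustration, and the harmonics are assumed to sit exactly at integer multiples of f0.

```python
def spectral_centroid(freqs, powers):
    """Power-weighted average frequency (Hz) of the linear power spectrum."""
    return sum(f * p for f, p in zip(freqs, powers)) / sum(powers)

def harmonic_spectral_centroid(f0, amplitudes):
    """Amplitude-weighted mean frequency of the harmonics h * f0, h = 1, 2, ..."""
    total = sum(amplitudes)
    return sum((h + 1) * f0 * a for h, a in enumerate(amplitudes)) / total

def harmonic_spectral_spread(f0, amplitudes):
    """Weighted deviation around the harmonic centroid, normalized by it."""
    c = harmonic_spectral_centroid(f0, amplitudes)
    num = sum(a * a * ((h + 1) * f0 - c) ** 2 for h, a in enumerate(amplitudes))
    den = sum(a * a for a in amplitudes)
    return (num / den) ** 0.5 / c

# Two hypothetical tones on the same pitch with different harmonic balance
mellow = [1.0, 0.5, 0.25]
bright = [0.25, 0.5, 1.0]
```

Both tones share f0, so pitch alone cannot separate them; the harmonic centroid is higher for the brighter tone, which is the sense in which these descriptors distinguish instrument timbres.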

Page 21

Low-level Features (details)

• Spectral Basis: (low-dimensional projections of a spectral space to aid compactness and recognition)
  – AudioSpectrumBasis:
    • a series of (time-varying / statistically independent) basis functions derived from the singular value decomposition of a normalized power spectrum.
  – AudioSpectrumProjection:
    • low-dimensional features of a spectrum after projection upon a reduced-rank basis.
  – independent subspaces of a spectrum correlate strongly with different sound sources.
  – Provide more salience using less space.
• Used with the Sound Classification and Indexing Description Tools.
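The idea of a reduced-rank basis can be shown with the smallest possible case, a rank-1 basis. This sketch finds the leading right singular vector by power iteration instead of a full singular value decomposition and skips the spectrum normalization step, so it illustrates the principle only; the function names and the toy spectrogram are assumptions.

```python
import math

def top_basis_vector(frames, iterations=100):
    """Leading right singular vector of a frames x bins matrix (power iteration)."""
    bins = len(frames[0])
    v = [1.0 / math.sqrt(bins)] * bins
    for _ in range(iterations):
        # u = X v, then w = X^T u, then renormalize
        u = [sum(x * y for x, y in zip(row, v)) for row in frames]
        w = [sum(frames[t][k] * u[t] for t in range(len(frames)))
             for k in range(bins)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(frames, basis):
    """One coefficient per frame: the spectrum projected onto the basis vector."""
    return [sum(x * b for x, b in zip(row, basis)) for row in frames]

# Four spectral frames of three bins; every frame is a scaled copy of one shape
shape = [3.0, 4.0, 0.0]
spectrogram = [[g * s for s in shape] for g in (1.0, 2.0, 0.5, 1.5)]
basis = top_basis_vector(spectrogram)
features = project(spectrogram, basis)
```

Twelve spectral values collapse to one basis vector plus four coefficients without loss in this toy case, which is the "more salience using less space" trade-off; real signals need several basis functions.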

Page 22

High-level audio Description Tools (Ds and DSs)

• Exchange some generality for descriptive richness:
  – a smaller set of audio features (as compared to visual features) that may canonically represent a sound without domain-specific knowledge.
• Audio Signature (DS)
• Musical Instrument Timbre
• Melody
• General Sound Recognition and Indexing
• Spoken Content

Page 23

High-level audio Description Tools (details)

• Audio Signature Description Scheme
  – SpectralFlatness Ds
  – a unique content identifier for the purpose of robust automatic identification
  – e.g. audio fingerprinting

Page 24

High-level audio Description Tools (details)

• Musical Instrument Timbre Description Tools
  – HarmonicInstrumentTimbre Ds:
    • LogAttackTime Descriptor
  – PercussiveInstrumentTimbre Ds:
    • SpectralCentroid Descriptor

Page 25

High-level audio Description Tools (details)

• Melody Description Tools:
  – efficient, robust, and expressive melodic similarity matching.
  – MelodyContour Description Scheme:
    • terse, efficient melody contour / rhythm
  – MelodySequence Description Scheme:
    • verbose, complete, expressive melody / rhythm.
    • Interval encoding
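A terse melody contour can be illustrated by quantizing each interval between successive notes into a small number of levels. The sketch below uses five levels from -2 to 2; the thresholds in cents and the use of MIDI note numbers as input are assumptions for illustration, and rhythm is left out.

```python
def contour_step(cents):
    """Quantize an interval in cents to one of five contour levels (-2..2)."""
    if cents <= -250:
        return -2
    if cents <= -50:
        return -1
    if cents < 50:
        return 0
    if cents < 250:
        return 1
    return 2

def melody_contour(midi_notes):
    """Contour of a note sequence given as MIDI note numbers (100 cents each)."""
    return [contour_step(100 * (b - a))
            for a, b in zip(midi_notes, midi_notes[1:])]

# C4 D4 E4 E4 C4 G4
contour = melody_contour([60, 62, 64, 64, 60, 67])
# The same tune transposed up a fourth gives the same contour
transposed = melody_contour([65, 67, 69, 69, 65, 72])
```

Because only interval directions and rough sizes are kept, the representation is invariant to transposition and tolerant of small pitch errors, which suits matching against imprecise queries; a complete interval encoding keeps the exact values instead.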

Page 26

High-level audio Description Tools (details)

• General Sound Recognition and Indexing Description Tools:
  – SoundModel Description Scheme
  – SoundClassificationModel Description Scheme
    • a set of SoundModel DSs -> multi-way classifier
  – SoundModelStatePath Descriptor
    • indices to states generated by a SoundModel of a segment
  – immediately applied to sound effects
  – automatically index and segment sound tracks.
  – Low -> mid -> high level analyses
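The relation between a state-based sound model, its state path and a multi-way classifier can be sketched with a small discrete hidden Markov model. This is a simplification under stated assumptions: the observations are already quantized to symbols 0 and 1, all probabilities are nonzero, and the model names and numbers are invented for the example.

```python
import math

def viterbi(observations, start, trans, emit):
    """Most likely state sequence and its log-probability for a discrete HMM."""
    n_states = len(start)
    score = [math.log(start[s]) + math.log(emit[s][observations[0]])
             for s in range(n_states)]
    back = []
    for obs in observations[1:]:
        prev, score, pointers = score, [], []
        for s in range(n_states):
            best = max(range(n_states),
                       key=lambda r: prev[r] + math.log(trans[r][s]))
            pointers.append(best)
            score.append(prev[best] + math.log(trans[best][s])
                         + math.log(emit[s][obs]))
        back.append(pointers)
    state = max(range(n_states), key=lambda s: score[s])
    best_score = score[state]
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1], best_score

def classify(observations, models):
    """Pick the model name whose best path has the highest log-probability."""
    return max(models, key=lambda name: viterbi(observations, *models[name])[1])

# Hypothetical two-state models: (start, transition, emission) probabilities
models = {
    "dog_bark": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.2, 0.8]]),
    "telephone": ([0.5, 0.5], [[0.1, 0.9], [0.9, 0.1]], [[0.9, 0.1], [0.1, 0.9]]),
}
observed = [0, 1, 0, 1, 0, 1]
label = classify(observed, models)
state_path, _ = viterbi(observed, *models[label])
```

Here `label` plays the role of the classifier output and `state_path` the role of a sequence of state indices for the segment; the rapidly alternating input fits the model whose states tend to alternate.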

Page 27

High-level audio Description Tools (details)

• Spoken Content Description Tools:
  – detailed description of words spoken within an audio stream.
  – indexing into and retrieval of an audio stream
  – indexing of multimedia objects annotated with speech.
• Recall of audio/video data by memorable spoken events.
  – a character or person spoke a particular word
• Spoken Document Retrieval
  – separate spoken documents
• Annotated Media Retrieval
  – photograph retrieved using a spoken annotation
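Indexing into an audio stream by spoken word can be pictured as an inverted index from words to positions. The sketch below assumes a single best word sequence from a speech recognizer, which is simpler than the detailed description the tools are meant to carry; the tuple layout, stream identifiers and function names are hypothetical.

```python
from collections import defaultdict

def build_index(recognized):
    """Map each lower-cased word to the list of (stream id, start time) hits."""
    index = defaultdict(list)
    for stream_id, start_seconds, word in recognized:
        index[word.lower()].append((stream_id, start_seconds))
    return index

def find(index, word):
    """All places where the word was spoken, or an empty list."""
    return index.get(word.lower(), [])

# Hypothetical recognizer output: (stream id, start time in seconds, word)
recognized = [
    ("news-01", 3.2, "Olympic"),
    ("news-01", 3.9, "final"),
    ("photo-17-annotation", 0.4, "beach"),
    ("news-02", 12.5, "final"),
]
index = build_index(recognized)
hits = find(index, "Final")
```

The same lookup covers both use cases on the slide: a hit in a news stream points to a time offset to play from, while a hit in a spoken annotation points to the annotated object, such as a photograph.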


[Figure: Timbre Descriptor Estimation (block diagram). Blocks: Signal, Sliding Analysis Window, STFT, Power Spectrum, f0, Harmonic Peaks Detection, Signal envelope, z-1 (delay). Outputs: SpectralCentroid, Instantaneous HarmonicSpectralCentroid, Instantaneous HarmonicSpectralDeviation, Instantaneous HarmonicSpectralSpread, Instantaneous HarmonicSpectralVariation, LogAttackTime, Temporal Centroid.]

Page 58

MPEG-7 Audio Amendment 2

• will include extended functionality of audio metadata that is complementary to the low-level audio descriptors in ISO/IEC 15938-4,
• providing high-level description tools such as chord pattern and rhythm pattern,
• both of which support compact representation of timbre and rhythm.