Data Formats and Codecs

62
Data Formats and Data Formats and Codecs Codecs 30/8 – 2004 INF 5070 – Media Storage and Distribution Systems

description

INF 5070 – Media Storage and Distribution Systems. Data Formats and Codecs. 30/8 – 2004. Why codecs and formats?. Codecs (coders/decoders) Determine how information is represented Important for servers and distribution systems Required sending speed Amount of loss allowed Buffers required - PowerPoint PPT Presentation

Transcript of Data Formats and Codecs

Page 1: Data Formats and Codecs

Data Formats and Data Formats and CodecsCodecs

30/8 – 2004

INF 5070 – Media Storage and Distribution Systems

Page 2: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Why codecs and formats? Codecs (coders/decoders)

Determine how information is represented Important for servers and distribution systems

Required sending speed Amount of loss allowed Buffers required …

Formats Determine how data is stored Important for servers and distribution systems

Where is the data? Where is the data about the data?

Page 3: Data Formats and Codecs

Media data

Page 4: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Media data Medium: "Thing in the middle“

here: means to distribute and present information Media affect human computer interaction

The mantra of multimedia users Speaking is faster than writing Listening is easier than reading Showing is easier than describing

Page 5: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Dependence of Media Time-independent media

Text Graphics Discrete media

Time-dependent media Audio Video Continuous media

Interdependant media Multimedia

"Continuous" refers to the user’s impression of the data, not necessarily to its representation

Combined video and audio is multimedia - relations must be specified

Page 6: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Dependence of Media Defined by the presentation of the data, not its

representation

Discrete media Text Graphics Video stills (image displayed by pausing a video stream)

Continuous media Audio Video Animation Ticker news (continuously scrolling text)

Multimedia Multiplexed audio and video

Page 7: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Properties of a Multimedia System Flexibility

Provide mechanisms to handle all kinds of media, in particular, discrete and continuous media

A VCR and a desktop publishing system for text and graphics are no multimedia systems

An editor with voice annotation is a multimedia system

Integration Independent media storage Computer-controlled media combination

DefinitionA multimedia system is characterized by the

integrated computer-controlled handling of independent discrete and continuous media

Page 8: Data Formats and Codecs

Coding for distribution

Page 9: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Compression - Necessity

E.g., video sequence 25 images/sec.

PAL standard 3 byte/pixel

YUV (luminance + 2 chrominance values) RGB (red-green-blue values)

Image resolution 640 * 480 pixel Data rate = 640 * 480 * 3 Byte * 25/s = 23040000 byte/s

~ 22 MByte/s Approx. 1/16 stream over Ethernet Approx. 1/2 stream over Fast Ethernet

Compression is necessary

Page 10: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Compression – General Requirements

Dependence on application type: Dialogue mode Retrieval mode

Page 11: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Compression – Mode Dependent Requirements

Dialogue and retrieval mode requirements: Synchronization of audio, video, and

other media

Dialogue mode requirements: End-to-end delay < 150ms Compression and decompression in real-

time Symmetric

Retrieval mode requirements: Fast forward and backward data retrieval Random access within 1/2 s Asymmetric

We look mainly at retrieval mode!

Page 12: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Compression Categories

Page 13: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Basic Encoding Steps

Page 14: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Run-Length Coding Assumption

Long sequences of identical symbols

Example

Page 15: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Bit-Plane Coding Assumption

Even longer sequences of identical bits

Example

1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, … ,0,0 (MSB)

0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0, … ,0,0 (MSB-1)

1,0,1,0,0,1,0,1,1,0,0,1,0,0,0,0, … ,0,0 (MSB-2)

0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0, … ,0,0 (MSB-3)

10,0,6,0,0,3,0,2,2,0,0,2,0,0,1,0, … ,0,0 (absolute)

0,x,1,x,x,1,x,0,0,x,x,1,x,x,0,x, … ,x,x (sign bits)

(0,1)

(2,1)

(0,0)(1,0)(2,0)(1,0)(0,0)(2,1)

(5,0)(8,1)

Up to 20% savings over run-length coding can be achieved

No 0s before a 1End of plane

Page 16: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Huffman Coding Assumption

Some symbols occur more often than others E.g., character frequencies of the English language

Fundamental principle Frequently occurring symbols are coded with shorter

bit strings

Page 17: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Huffman Coding Example

Characters to be encoded: A, B, C, D, E

Probability to occur: p(A)=0.3, p(B)=0.3, p(C)=0.1, p(D)=0.15, p(E)=0.15

Page 18: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Huffman Table and example of application to data stream

Page 19: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

JPEG “JPEG”: Joint Photographic Expert Group

International Standard: For digital compression and coding of continuous-tone still

images: Gray-scale Color

Since 1992

Joint effort of: ISO/IEC JTC1/SC2/WG10 Commission Q.16 of CCITT SGVIII

Compression rate of 1:10 yields reasonable results

Page 20: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

JPEG Very general compression scheme

Independence of Image resolution Image and pixel aspect ratio Color representation Image complexity and statistical characteristics

Well-defined interchange format of encoded data

Implementation in Software only Software and hardware

Page 21: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

JPEG Sequence of compression steps

Different resolutions possible Lossy or lossless mode

lossless compression factor ~1,6:1 Symmetrical codec

Page 22: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

JPEG – Baseline Mode: Quantization Use of quantization tables for the DCT-coefficients

Map interval of real numbers to one integer number Allows to use different granularity for each coefficient

Page 23: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

JPEG – 4 Modes of Compression

Page 24: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Motion JPEG Use series of JPEG frames to encode video

Pro Lossless mode – editing advantage Frame-accurate seeking – editing advantage Arbitrary frame rates – playback advantage Arbitrary frame skipping – playback advantage Scaling through progressive mode – distribution

advantage Min transmission delay = 1/framerate – conferencing

advantage Supported by popular frame grabbers

Contra Series of JPEG-compressed images No standard, no specification

Worse, several competing quasi-standards No relation to audio No inter-frame compression

Page 25: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

H.261 (px64) International Standard

Video codec for video conferences at p x 64kbit/s (ISDN): Real-time encoding/decoding, max. signal delay of 150ms Constant data rate

Intraframe coding DCT as in JPEG baseline mode

Interframe coding, motion estimation

Search of similar macroblock in previous image and compare Position of this macroblock defines motion vector Difference between similar macroblocks

Page 26: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

MPEG (Moving Pictures Expert Group)

International Standard: Compression of audio and video for playback (1.5 Mbit/s): Real-time decoding

Sequence of I-, P-, and B-Frames:

Random access at I-frames at P-frames: i.e. decode previous I-frame first at B-frame: i.e. decode I and P-frames first

Page 27: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

MPEG-2 From MPEG-1 to MPEG-2

Improvement in quality From VCR to TV to HDTV

No CD-ROM based constraints Higher data rates

MPEG-1: about 1.5 MBit/s MPEG-2: 2-100 MBit/s

Evolution 1994: International Standard Also later known as H.262 Prominent role for digital TV in DVB (digital video

broadcasting) and DVD (digital video disk) Commercial MPEG-2 realizations available

Page 28: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

MPEG-2 Beyond MPEG-1:

Higher quality encoding Higher data rates Interleaved modes

Use cases Broadcast quality production

DVB-T: Terrestrial DVB-S: Satellite DVB-C: Cable

Program Stream for post-processing, storage, and DVD distribution

Transport Stream for broadcasting, error resilience

Scaling: Signal to Noise Ration (SNR) scaling - progressive compression

error correcting codes Spatial scaling - several pixel resolutions Temporal scaling - frame dropping

Page 29: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

MPEG-4 MPEG-4 (ISO 14496) originally

Targeted at systems with very scarce resources To support applications like

Mobile communication Videophone and E-mail

Max. data rates and dimensions (roughly) Between 4800 and 64000 bits/s 176 columns x 144 lines x 10 frames/s

Further demand To provide enhanced functionality to allow for

analysis and manipulation of image contents

Page 30: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

MPEG-4

Hence: find standardized ways to Represent units of aural, visual or audiovisual content

audio/visual objects" or AVOs object coding independent of other objects, surroundings and

background natural and synthetic objects

Compose these objects together i.e. creation of compound objects that form audiovisual scenes

Multiplex and synchronize the data associated with AVOs for transportation over network channels providing a QoS (Quality-of-

Service) Interact with the audiovisual scene generated at the decoder’s site

Page 31: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

MPEG-4: Scope Definition of

„System Decoder Model“ specification for decoder implementations

Description language binary syntax of an AV object’s bitstream representation scene description information

Corresponding concepts, tools and algorithms, especially for

content-based compression of simple and compound audiovisual objects

manipulation of objects transmission of objects random access to objects animation scaling error robustness

Page 32: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

MPEG-4: Scope Targeted bit rates for video and audio:

VLBV core „Very Low Bit-rate Video“ 5 - 64 Kbit/s image sequences with CIF resolution and up to 15 frames/s

Higher-quality video 64 Kbit/s - 4 Mbit/s quality like digital TV

Natural audio coding 2 - 64 Kbit/s

Page 33: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

MPEG-4: Video and Image Encoding

Encoding / decoding of Rectangular images and video

coding similar to MPEG-1/2 motion prediction texture coding

Images and video of arbitrary shape as done in conventional approach

8x8 DCT or shape-adaptive DCT plus coding of shape and transparency information

Encoder Must generate timing information

speed of the encoder clock = time base desired decoding times and/or expiration times

by using time stamps attached to the stream Can specify the minimum buffer resources needed for decoding

Page 34: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

MPEG-4: Composition of Scenes Scene description includes:

Tree to define hierarchical relationships between objects

Objects’ positions in space and time by converting the objects’ local coordinate system into a global

coordinate system Attribute value selection

e.g. pitch of sound, color, texture, animation parameters Description based on some VRML concepts

VRML = „Virtual Reality Modeling Language“ Interaction with scenes

e.g. change viewing point, drag object, start/stop streams, select language

Page 35: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

MPEG-4: Example of a Composition

Page 36: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

MPEG-4: Synthetic Objects Visual objects:

Virtual parts of scenes e.g. virtual background

Animation e.g. animated faces

Audio objects: „Text-to-speech“

speech generation from given text and prosodic parameters face animation control

„Score driven synthesis“ music generation from a score more general than MIDI

Special effects

Page 37: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

MPEG-4: Error Handling Mobile communication:

Low bit-rate (< 64 Kbps) Error-prone

MPEG-4 concepts for error handling: Resynchronization

enables receiver to „tune in“ again based on markers within bitstream

Data recovery enables receiver to reconstruct lost data encode data in an error-resilient manner

Error concealment enables receiver to bridge gaps in data e.g. by repeating parts of old frames

Page 38: Data Formats and Codecs

Network-aware coding

Page 39: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Network-aware coding Adapt to reality of the Internet

Content Is created once, off-line Is sent many times, under different circumstances

No guarantees concerning Throughput Jitter Packet loss

Sending rate Must adhere to rules Often: don’t send more than TCP would

Can’t send at the best available encoding rate

Page 40: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Approaches Simulcast Scalable coding

SNR Scalability Temporal Scalability Spatial Scalability Fine Grained Scalability

Multiple Description Coding

Page 41: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

3 simulcast rates

Simulcast Choose a set of sending

rates During content creation

Encode content in best possible quality below that sending rate

During transmission Choose version with the

best admissable quality

Sending rate

Qu

alit

y

Best possi

ble quality at p

ossible se

nding rate

Single rate codec

Page 42: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Scalable coding Typically used as

Layered coding

A base layer Provides basic quality Must always be transferred

One or moreenhancement layers Improve quality Transferred if possible

Sending rate

Qu

alit

y

Best possi

ble quality at p

ossible se

nding rate

Base layer

Enhancement layer

Page 43: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Temporal Scalability Frames can be dropped

In a controlled manner Frame dropping does not violate dependancies Low gain example: B-frame dropping in MPEG-1

Page 44: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Spatial Scalability Idea

Base layer Downsample the original image (code only 1 pixel instead of

4) Send like a lower resolution version

Enhancement layer Subtract base layer pixels from all pixels Send like a normal resolution version

If enhancement layer arrives at client Decode both layers Add layers

72

75 83

6173

-1 -12

2 10

Base layer

Enhancement layerBetter compression due to low values

Less data to code

Page 45: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Spatial Scalability

raw video base layer

enhancement layer

DS - downsamplingDCT – discrete cosine transformationQ – quantizationVLC – variable length coding

-

+

DS DCT Q VLC

DS DCT Q VLC

DCT Q VLC

enhancement layer 2

-

+

Page 46: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

SNR Scalability SNR – signal-to-noise ratio Idea

Base layer Is regularly DCT encoded A lot of data is removed using quantization

Enhancement layer is regularly DCT encoded Run Inverse DCT on quantized base layer Subtract from original DCT encode the result

If enhancement layer arrives at client Add base and enhancement layer before running Inverse

DCT

Page 47: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

SNR Scalability

IQ+

raw video base layer

enhancementlayer

DCT – discrete cosine transformationQ – quantizationIQ – inverse quantizationVLC – variable length coding

DCT Q VLC

Q VLC

-

Page 48: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Fine Grained Scalability Idea

Cut of compressed tail bits of samples Base layer

As in SNR coding Enhancement layer

Use bit-plane coding for enhancement layerinstead of run-level coding

Cut tail bits off until data rate is reached

Page 49: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Fine Grained ScalabilityMSB (0,1)

MSB-1 (2,1)

MSB-2 (0,0)(1,0)(2,0)(1,0)(0,0)(2,1)

MSB-3 (5,0)(8,1)

Sending rate

Qu

alit

y

Best possi

ble quality at p

ossible se

nding rate

Goal of FGS

Page 50: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Fine Grained Scalability

IQ+

raw video base layer

enhancementlayer

DCT – discrete cosine transformation

Q – quantizationIQ – inverse quantizationVLC – variable length codingBC – bitplane coding

DCT Q VLC

Q BC

-

Page 51: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Fine Grained Scalability

+

raw videobase layer

enhancementlayer

DCT

-

Q

Q

IQ

BC

VLC

IQIDCTMotion

Estimation

+

Motion vectors

Page 52: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Multiple Description Coding Idea

Encode data in two streams Each stream has acceptable quality Both streams combined have good quality The redundancy between both streams is low

Problem The same relevant information must exist in both

streams Old problem: started for audio coding in telephony Currently a hot topic

Page 53: Data Formats and Codecs

Multimedia File Formats

Page 54: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Overview File formats

Define the storage of media data on disks Specify synchronization Specify timing Contain metadata

They allow Interchange of data without interpretation

Copying Platform independance

Management Editing Retrieval for presentation

Needed for all asynchronous applications

Page 55: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

File Format Examples

Streaming format File format and wire format are identical MPEG-1, DVI

Streamable format File format specifies wire format(s) MPEG-4, Quicktime, Windows Media, Real Video

Page 56: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

RTP Recorder Solution Pragmatic generic solution

Stores and sends all MBone sessions No interpretation of data Interpretation of network timestamps Derivation of synchronity information

data

indexfile

file

ReceiverReceiver

MBone VCR

Sender

Sender

data

indexfile

file

Page 57: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Stored Motion JPEG Motion JPEG Chunk File Format (UC Berkeley)

Specifies entire clip’s length in s+ns Contains sequence of images Each image in Independent JPEG Group’s JFIF format

AVI MJPEG DIB (Microsoft) Supports audio interleaving Time-stamped data chunks One frame per AVI RIFF data chunk Hack for .le size > 1GB

Quicktime (Apple) Dedicated tracks for interleaving and timing One frame per field Several fields per sample Formats A: full JFIF images, B: QT headers and data only

Page 58: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Quicktime File Format

Run-time choice of tracks availability of codecs bandwidth language

Quicktime file

sample

sample

sample

sample

media

med

ia

med

ia

locationsize order <prop>track

edit list

locationsize order <prop>track

edit list

track track choice track choice

default rateduration copyright author

sample

Page 59: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

MPEG-4 File Format

trak (audio)

trak (BIFS)

trak (OD)

trak (video)

MP4 file

moov

other

1st OD

atoms...

mdat

interleaved, time-orderedBIFS, OD, video, and audioaccess units

MP4 file

mdat

video and audio access unitssome units may be unorderedsome units may be unused

mdat mdatmdatmoov

media file

BIFS access unitssome units may be unorderedsome units may be unused

Page 60: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Other File Formats Real Video

Not published Supports various codecs Supports various encoding formats per .le Supports dynamic selection Supports dynamic scaling ("stream thinning")

AVI AVI is published Uses Resource Interchange File Format (RIFF) Supports various codecs

ASF / Windows Media File Format Submitted as MPEG-4 proposal (but refused) ASF files can include Windows binary code ASF is patented in the USA

Page 61: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

Summary Storage and distribution system must support:

Discrete media such as text and graphics Continuous media such as audio and video Interrelated Multiplexed media

Encoding Format and File Format must be distinguished Separation of file format and wire format Streamable files vs. streaming format

Trend towards Formats that define presentation environments Interaction of encoding format and application Interaction of client and server

Influence on Distribution Systems?

Page 62: Data Formats and Codecs

2004 Carsten Griwodz & Pål Halvorsen

INF 5070 – media servers and distribution systems

References Ralf Steinmetz, Klara Nahrstedt: Multimedia Fundamentals, Volume I: Media Coding and Content

Processing (2nd Edition), Prentice Hall, 2002, ISBN 0130313998 Touradj Ebrahimi (Ed.), Fernando Pereira, The MPEG-4 Book, Prentice Hall, 2002, ISBN

0130616214 Weiping Li, Overview of Fine Granularity Scalability in MPEG-4 Video Standard, IEEE

Transactions on Circuits and Systems for Video Technology, 11(3), Mar. 2001 Vivek K. Goyal, Multiple Description Coding: Compression Meets the Network, IEEE Signal

Processing Magazine, Sep. 2001