Source Coding: Part I of Fundamentals of Source and Video Coding


  • Foundations and Trends® in sample, Vol. 1, No. 1 (2011) 1-217, © 2011 Thomas Wiegand and Heiko Schwarz, DOI: xxxxxx

    Source Coding:

    Part I of Fundamentals of Source and Video Coding

    Thomas Wiegand¹ and Heiko Schwarz²

    ¹ Berlin Institute of Technology and Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Germany, [email protected]

    ² Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Germany, [email protected]

    Abstract

    Digital media technologies have become an integral part of the way we create, communicate, and consume information. At the core of these technologies are the source coding methods described in this text. Based on the fundamentals of information theory and rate distortion theory, the most relevant techniques used in source coding algorithms are described: entropy coding, quantization, and predictive and transform coding. The emphasis is on algorithms that are also used in video coding, which will be described in the other text of this two-part monograph.

  • To our families

  • Contents

    1 Introduction
      1.1 The Communication Problem
      1.2 Scope and Overview of the Text
      1.3 The Source Coding Principle

    2 Random Processes
      2.1 Probability
      2.2 Random Variables
        2.2.1 Continuous Random Variables
        2.2.2 Discrete Random Variables
        2.2.3 Expectation
      2.3 Random Processes
        2.3.1 Markov Processes
        2.3.2 Gaussian Processes
        2.3.3 Gauss-Markov Processes
      2.4 Summary of Random Processes

    3 Lossless Source Coding
      3.1 Classification of Lossless Source Codes
      3.2 Variable-Length Coding for Scalars
        3.2.1 Unique Decodability
        3.2.2 Entropy
        3.2.3 The Huffman Algorithm
        3.2.4 Conditional Huffman Codes
        3.2.5 Adaptive Huffman Codes
      3.3 Variable-Length Coding for Vectors
        3.3.1 Huffman Codes for Fixed-Length Vectors
        3.3.2 Huffman Codes for Variable-Length Vectors
      3.4 Elias Coding and Arithmetic Coding
        3.4.1 Elias Coding
        3.4.2 Arithmetic Coding
      3.5 Probability Interval Partitioning Entropy Coding
      3.6 Comparison of Lossless Coding Techniques
      3.7 Adaptive Coding
      3.8 Summary of Lossless Source Coding

    4 Rate Distortion Theory
      4.1 The Operational Rate Distortion Function
        4.1.1 Distortion
        4.1.2 Rate
        4.1.3 Operational Rate Distortion Function
      4.2 The Information Rate Distortion Function
        4.2.1 Mutual Information
        4.2.2 Information Rate Distortion Function
        4.2.3 Properties of the Rate Distortion Function
      4.3 The Shannon Lower Bound
        4.3.1 Differential Entropy
        4.3.2 Shannon Lower Bound
      4.4 Rate Distortion Function for Gaussian Sources
        4.4.1 Gaussian IID Sources
        4.4.2 Gaussian Sources with Memory
      4.5 Summary of Rate Distortion Theory

    5 Quantization
      5.1 Structure and Performance of Quantizers
      5.2 Scalar Quantization
        5.2.1 Scalar Quantization with Fixed-Length Codes
        5.2.2 Scalar Quantization with Variable-Length Codes
        5.2.3 High-Rate Operational Distortion Rate Functions
        5.2.4 Approximation for Distortion Rate Functions
        5.2.5 Performance Comparison for Gaussian Sources
        5.2.6 Scalar Quantization for Sources with Memory
      5.3 Vector Quantization
        5.3.1 Vector Quantization with Fixed-Length Codes
        5.3.2 Vector Quantization with Variable-Length Codes
        5.3.3 The Vector Quantization Advantage
        5.3.4 Performance and Complexity
      5.4 Summary of Quantization

    6 Predictive Coding
      6.1 Prediction
      6.2 Linear Prediction
      6.3 Optimal Linear Prediction
        6.3.1 One-Step Prediction
        6.3.2 One-Step Prediction for Autoregressive Processes
        6.3.3 Prediction Gain
        6.3.4 Asymptotic Prediction Gain
      6.4 Differential Pulse Code Modulation (DPCM)
        6.4.1 Linear Prediction for DPCM
        6.4.2 Adaptive Differential Pulse Code Modulation
      6.5 Summary of Predictive Coding

    7 Transform Coding
      7.1 Structure of Transform Coding Systems
      7.2 Orthogonal Block Transforms
      7.3 Bit Allocation for Transform Coefficients
        7.3.1 Approximation for Gaussian Sources
        7.3.2 High-Rate Approximation
      7.4 The Karhunen-Loève Transform (KLT)
        7.4.1 On the Optimality of the KLT
        7.4.2 Asymptotic Operational Distortion Rate Function
        7.4.3 Performance for Gauss-Markov Sources
      7.5 Signal-Independent Unitary Transforms
        7.5.1 The Walsh-Hadamard Transform (WHT)
        7.5.2 The Discrete Fourier Transform (DFT)
        7.5.3 The Discrete Cosine Transform (DCT)
      7.6 Transform Coding Example
      7.7 Summary of Transform Coding

    8 Summary

    Acknowledgements

    References

  • 1 Introduction

    The advances in source coding technology, along with the rapid developments and improvements in network infrastructure, storage capacity, and computing power, are enabling an increasing number of multimedia applications. In this text, we describe and analyze fundamental source coding techniques that are found in a variety of multimedia applications, with an emphasis on algorithms that are used in video coding applications. This first part of the text concentrates on the description of fundamental source coding techniques, while the second part describes their application in modern video coding.

    The application areas of digital video today range from multimedia messaging, video telephony, and video conferencing, through mobile TV, wireless and wired Internet video streaming, standard- and high-definition TV broadcasting, and subscription and pay-per-view services, to personal video recorders, digital camcorders, and optical storage media such as the digital versatile disc (DVD) and Blu-ray Disc. Digital video transmission over satellite, cable, and terrestrial channels is typically based on H.222.0/MPEG-2 Systems [37], while wired and wireless real-time conversational services often use H.32x [32, 33, 34] or SIP [64], and mobile transmission systems using the Internet and mobile networks are usually based on RTP/IP [68]. In all these application areas, the same basic principles of video compression are employed.

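    [Figure: block diagram. Scene → Video Capture → Video Encoder → Channel Encoder → Modulator → Channel → Demodulator → Channel Decoder → Video Decoder → Video Display → Human Observer. The video encoder and video decoder are grouped as the video codec; the five blocks from channel encoder to channel decoder are grouped as the error control channel.]

    Fig. 1.1 Typical structure of a video transmission system.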

    The block structure for a typical video transmission scenario is illustrated in Fig. 1.1. The video capture generates a video signal s that is discrete in space and time. Usually, the video capture consists of a camera that projects the 3-dimensional scene onto an image sensor. Cameras typically generate 25 to 60 frames per second. For the considerations in this text, we assume that the video signal s consists of progressively scanned pictures. The video encoder maps the video signal s into the bitstream b. The bitstream is transmitted over the error control channel, and the received bitstream b′ is processed by the video decoder, which reconstructs the decoded video signal s′ and presents it via the video display to the human observer. The visual quality of the decoded video signal s′ as shown on the video display affects the viewing experience of the human observer. This text focuses on the video encoder and decoder, which together are called a video codec.

    The error characteristic of the digital channel can be controlled by the channel encoder, which adds redundancy to the bits at the video encoder output b. The modulator maps the channel encoder output to an analog signal that is suitable for transmission over a physical channel. The demodulator interprets the received analog signal as a digital signal, which is fed into the channel decoder. The channel decoder processes the digital signal and produces the received bitstream b′, which may be identical to b even in the presence of channel noise. The sequence of the five components (channel encoder, modulator, channel, demodulator, and channel decoder) is lumped into one box, which is called the error control channel. According to Shannon's basic work [69, 70], which also laid the groundwork for the subject of this text, the number of transmission errors can be controlled by introducing redundancy at the channel encoder and by introducing delay.
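The effect of channel-encoder redundancy on transmission errors can be illustrated with a small simulation. The following is only a minimal sketch under assumptions that are not part of the text: the channel encoder is taken to be a simple repetition code with majority-vote decoding, and the modulator, physical channel, and demodulator are collapsed into a binary symmetric channel that flips each bit independently. All function names and parameter values are hypothetical and chosen for illustration.

```python
import random


def channel_encode(bits, n):
    """Repetition code (an assumed toy code): send every bit n times, n odd."""
    return [b for b in bits for _ in range(n)]


def noisy_channel(bits, flip_prob, rng):
    """Binary symmetric channel: flip each bit independently with flip_prob."""
    return [b ^ (rng.random() < flip_prob) for b in bits]


def channel_decode(bits, n):
    """Majority vote over each group of n received bits."""
    return [int(sum(bits[i:i + n]) > n // 2) for i in range(0, len(bits), n)]


def residual_error_rate(n, flip_prob=0.1, num_bits=20000, seed=0):
    """Fraction of bits in which the received bitstream differs from b."""
    rng = random.Random(seed)
    b = [rng.randint(0, 1) for _ in range(num_bits)]
    received = noisy_channel(channel_encode(b, n), flip_prob, rng)
    b_received = channel_decode(received, n)
    return sum(x != y for x, y in zip(b, b_received)) / num_bits


if __name__ == "__main__":
    # More repetitions mean more redundancy and fewer residual bit errors.
    for n in (1, 3, 5, 9):
        print(f"repetitions={n}  residual error rate={residual_error_rate(n):.4f}")
```

In this toy setting, increasing the number of repetitions lowers the residual error rate, at the cost of transmitting n channel bits per source bit and of waiting for all n bits before a decision can be made, which loosely mirrors the roles of redundancy and delay mentioned above. Practical channel codes are far more efficient than a repetition code.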

    1.1 The Communication Problem

    The basic communication problem may be posed as conveying source data with the highest fidelity possible without exceeding an available bit rate, or it may be posed as conveying the source data using the lowest bit rate possible while maintaining a specified reproduction fidelity [69]. In either case, a fundamental trade-off is made between bit rate and signal fidelity. The ability of a source coding system to suitably choose