H264 Tutorial
date post
10Dec2015Category
Documents
view
11download
3
Embed Size (px)
description
Transcript of H264 Tutorial
8th Texas Instruments Developer Conference India30 Nov  1 Dec 2005, Bangalore
Dinesh Kumar (Senior Design engineer)Pavan Shastry (Design engineer)
Anirban Basu (Senior Design engineer)MMCodec TI India
Overview of the H.264/AVC
Scope and Context H.264 overview Profiles New Features Comparison with existing standards AVC coding tools
CAVLC CABAC Intra predication Loop Filter
Agenda
Scope and Context H.264/AVC is a joint project of ITU and
MPEG, Standard defines:
Decoder functionality (but not encoder) File and stream structure
It aims to provide highquality compression for various services: IP streaming media (501500 kbps) SDTV and HDTV Broadcast and videoondemand (1  8+
Mbps) DVD Conversational services (
H.264 Encoder block diagram
EntropyCoding
Scaling & Inv. Transform
MotionCompensation
ControlData
Quant.Transf. coeffs
MotionData
Intra/Inter
CoderControl
Decoder
MotionEstimation
Transform/Scal./Quant.

Intraframe Prediction
DeblockingFilter
OutputVideoSignal
ProfilesH.264 has three profiles:
Baseline (IP Video phone, Simple streaming)
Main Profile (Broadcast, VOD)
Extended profile (Streaming profiles)
ProfilesThere are four High Profiles (Fidelity range
extensions) High Profile is to support the 8bit video with 4:2:0 sampling
for applications using high resolution. High 10 Profile is to support the 4:2:0 sampling with up to 10
bits of representation accuracy per sample.
Profiles High 4:2:2 Profile is to support up to 4:2:2
chroma sampling and up to 10 bits per sample. High 4:4:4 Profile is to support up to 4:4:4
chroma sampling, up to 12 bits per sample, and integer residual color transform for coding RGB signal.
Mainly targeting the HD DVD, Broadcast, Editing, Digital still camera market
Features Variable block size
motion compensation.
Quarter sample accuracy.
Unrestricted motion vector.
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
FeaturesSupport for multiple reference pictures. It gives significant compression when motion is periodic in nature.
Features Weighted prediction.
Different weight can be given to the signalEx: X1 and X2 are prediction signals (B  frame).
w1 and w2 are the weights, Then Final predicted value = w1 * x1 + w2 * x2.
Intra prediction. Many new modes Happens in spatial domain
4*4 Transform 16 bit word length transform. Bit exact transform. Can be implemented by add and shift.
Features
Arithmetic coding. Used to create a very powerful technique CABAC
Parameter set structure Loss of few key bits can have severe negative impact Header information handled separately in a more flexible
manner.
Flexible slice size. Increase coding efficiency by reducing the header data.
Features PAFF (Picture adaptive frame/field)
Combine the two fields together and to code them as one single coded frame (frame mode).
Not combine the two fields and to code them as separate coded fields (field mode).
MBAFF (Macro block adaptive frame/field) The decision of field/frame happens at macro
block pair level.
Features NAL unit syntax structure
Each structure Is placed into logical data unit depending on the network.
Features Flexible macro block ordering
Picture can be partitioned into regions (slices) Each region can be decoded independently.
Features Arbitrary slice ordering.
Since each slice can be decoded independently. It can be sent out of order
It reduces end to end delay in some networks.
Redundant pictures Encoder has the flexibility to send redundant
pictures. These pictures can be used during loss of data.
Comparison
4*4,8*8(High Profile)4*4,4*8,8*4,8*88*8 DCTTransform
YesYesNo (Optional)Deblocking Filter
YesNoNoWeighted Prediction
Multiple picturesTwo (Interlace)One pictureReference frame
CAVLC,CABACVLCVLCEntropy coding
Intra Prediction (Spatial Domain)
Ac Prediction (Transform Domain)Ac Prediction (Transform Domain)Intra Prediction
4*4,4*8,8*4,8*8, 8*16,16*8,16*1616*16, 16*8, 8*8 , 4*416*16, 8*8
Prediction Block size
H.264WMV9MPEG4Feature
Comparison
Entropy Coding Tools of H.264
Two modes in H.264
Entropy coding mode = 0 : Residual block coded using Context Adaptive Variable Length Coding (CAVLC) and other syntax elements are coded using ExpGolomb codes.
Entropy coding mode = 1 : Residual block and other syntax elements are coded using Context Adaptive Binary Arithmetic Coding (CABAC).
Baseline Profile has support for mode 0 only. Main Profile and higher profiles support mode 0 and mode 1.
Context Adaptive Variable Length Coding (CAVLC)
Past Deficiencies The usage of fixed VLC tables does not allow an adaptation
to the actual symbol statistics.
Entropy coding such as MPEG2, H.263, MPEG4 (SP) is based on fixed tables of VLCs i.e Probability distribution is static
Since there is a fixed assignment of VLC tables and syntax elements, existing intersymbol redundancies cannot be exploited i.e. no conditional probabilities are used
Context Adaptive VLC of Residual Coefficients
Runlength encoding is used. Number of nonzero coefficients and trailing ones
(coeff_token). Sign of each trailing one. Levels of the remaining nonzero coefficients. Total number of zeros before the last coefficient. Each run of zeros.
Example of encoding a 4x4 residual block
Example of decoding
Example Courtesy: www.vcodex.com (H.264 white papers)
Adapting based on Context Encoding of coefficients
in reverse zigzag order Seven VLC tables Level_VLC0 biased
towards lower magnitude, Level_VLC1 biased towards slightly higher magnitude and so on.
Start with Level_VLC0 table
If the magnitude of the coefficient is larger than a predefined threshold, switch to next table.
Level_VLC648Level_VLC524Level_VLC412Level_VLC36Level_VLC23Level_VLC10Level_VLC0
Threshold to increment
table
Current table
ExpGolomb entropy coding
Variable Length codes with a regular construction Easy to parse Optimal for one / twosided geometric pdfs No need for tables Syntax elements covered mb_type,
sub_mbtype, MVD, ref_idx etc.
ExpGolomb code words Format :
[M zeros] [1] [M bits of info] Code_num = 2M + info 1 For signed numbers another
level of mapping is used 2 + 1 101122 + 0 10101
8 + 0 100010007 4 + 3 10011164 + 2 1001105
4 + 1 10010144 + 0 1001003
1 + 0 110
2M + info 1
CodewordCode_num
53
42
32
21
11
00
Code_numV
Context Adaptive Binary Arithmetic Coding (CABAC)
Arithmetic Coding Basics
From an Information Theory point of view Minimum no of bits for a symbol = log2p, where p is probability of occurrence of the symbol.
Huffman coding takes fractional number of bits, unless the probabilities are powers of .5.
Further, if the probability is more than .5 compulsorily one bit is allocated.
Solution would be to allocate bits for multiple symbols.
Arithmetic coding achieves this by coding a group of symbols together.
Arithmetic Coding Basics Basic Idea: Working
interval is divided based on probability of symbols and the subinterval corresponding to the current symbol is used as interval for the next step.
Example: Source emitting five possible symbols A, E, I, O, U. Sequence of symbols shown is IOUA.
Moving towards CABAC
Usage of adaptive probability models
Restriction to binary arithmetic coding, for Simple and fast adaptation mechanism
Exploiting symbol correlations by using contexts
CABAC Framework
binarization context modeling arithmetic coding
Binarization schemes Unary Binarization: For each unsigned integer valued symbol x>=0,
the code word consists of x 1 bits followed by a terminating 0 bit. Example  U: 6 => 1111110
Truncated unary (TU) code: This is only defined for x with 0 111111111
FixedLength Binarization
kth order ExpGolomb Binarization
Concatenated schemes  TU Code UEGk Code
Context Modeling Encode the bin based on
context info. Four types of context
models : Based on neighboring blocks info
 Based on previous bins Based on scanning position (Significance map)
 Based on previously encoded levels (Transform coefficients)
context template consisting of two neighboring syntax element A and Bto the left and on top of the current syntax element C.
Context Modeling Example
Context selection for mvd


Other bins
5352515047/ 48/ 49 mvdy4645444340/ 41/ 42mvdx
Bin >= 4 &
CAVLC Vs CABAC
Typically CABAC provides 1015 % reduction in bit rate compared to CAVLC, for the same PSNR.
Mobile_D1_90frames
29
30
31
32
33
34
35
2000 3000 4000 5000Bitrate (Kbps)
P
S
N
R
(
Y
)
With CAVLC
With CABAC
INTRA PREDICTION IN H.264
Agenda What i