
FAST MODE DECISION IN H264/AVC VIDEO CODEC

NIRANJAN MULAY (0393251), CHEN GAO (0401840)

(EL6123: PROJECT PRESENTATION)

05/06/2010

Outline:

Introduction to H.264/AVC coding standard
Mode decisions in H.264/AVC
- Intra Block
- Inter Block
RDO algorithm and the need for FMD
FMD (for Intra and Inter)
Literature survey: edge-map based FMD
Study of x264 code and encoding options
Implementation:
- Generation of MB mode statistics file from x264
- Visualize the modes in Matlab
- Intra FMD; Inter FMD
Summary and future work

Introduction to H.264/AVC Coding Standard

The key features of H.264:

Improved Intra prediction: directional spatial prediction

Enhanced temporal prediction:
- Motion compensation with variable block sizes from 4x4 to 16x16: reduces 'prediction error'
- Quarter-pel accurate motion estimation
- Multiple reference frames for motion estimation
- Weighted prediction (for B and P frames)

DCT-like integer transform: no mismatch between encoder and decoder

Introduction to H.264/AVC Coding Standard (Contd)

Efficient entropy coding:
- Uses arithmetic entropy coding, has option for VLC coding
- Context adaptive entropy coding: 2 options, CAVLC and CABAC

Variable size transform (primarily 4x4, with 8x8 in the High profiles; Intra 16x16 MBs add a second-level transform of the DC coefficients):
- Smaller size helps to represent a signal in a locally adaptive manner, which reduces ringing artifacts.
- Generally high frequency => 4x4 and low frequency => larger sizes

In-loop deblocking filter: reduces blocking artifacts, improves quality.

Special error resilience tools

H.264 Intra Modes:

Intra 4x4 : useful for a MB with significant detail

Intra 16x16 : good for coding very smooth areas

(Intra 8x8 chroma: similar to intra 16x16)

I_PCM : no prediction or transform

‘Intra 16x16’:

Mode 0 (vertical): extrapolation from upper samples.
Mode 1 (horizontal): extrapolation from left samples.
Mode 2 (DC): mean of upper and left-hand samples.
Mode 3 (Plane): plane prediction based on a linear spatial interpolation using the upper and left-hand samples of the MB.

‘Intra 4x4’:

Figure: 4x4 luma prediction modes

Intra 4x4 (Contd):

Mode 0: Vertical
Mode 1: Horizontal
Mode 2: DC prediction
Mode 3: Diagonal down-left
Mode 4: Diagonal down-right
Mode 5: Vertical-right
Mode 6: Horizontal-down
Mode 7: Vertical-left
Mode 8: Horizontal-up

H.264 Inter Modes:

Hierarchical Decision

Level-1 (Partition): Compute RD cost for 16x16, 16x8, 8x16, 8x8.
Level-2 (Sub-Partition): If level-1 => 8x8, then compute RD cost of 8x4, 4x8 and 4x4.
Select the optimal block size!

P_Skip Mode
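The two-level decision above can be sketched in Python as follows. This is only an illustration under stated assumptions: `rd_cost` is a hypothetical callable standing in for the encoder's RD-cost evaluation, the cost values are made up, and the sub-partition choice is shown once rather than separately for each 8x8 block.

```python
LEVEL1 = ["16x16", "16x8", "8x16", "8x8"]   # level-1 partitions
LEVEL2 = ["8x8", "8x4", "4x8", "4x4"]       # level-2 sub-partitions of an 8x8 block

def choose_inter_partition(rd_cost):
    """Pick a partition hierarchically.

    rd_cost: hypothetical function mapping a partition name to its RD cost.
    Returns (level1_choice, level2_choice or None).
    """
    best = min(LEVEL1, key=rd_cost)
    if best != "8x8":
        return best, None           # level 2 is only examined when 8x8 wins
    return best, min(LEVEL2, key=rd_cost)

# Made-up costs, purely for illustration.
costs = {"16x16": 120.0, "16x8": 110.0, "8x16": 115.0,
         "8x8": 100.0, "8x4": 105.0, "4x8": 98.0, "4x4": 130.0}
print(choose_inter_partition(costs.get))
```

The point of the hierarchy is that the three sub-partition costs are never computed unless 8x8 already beats the larger partitions.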

RDO Algorithm

Formula:

$$RD\_cost(s, c, MODE \mid Qp) = D + \lambda \cdot R$$

(D: distortion, R: rate, λ: Lagrange multiplier)

Computational Complexity of brute-force RDO:

INTRA block:

Total Modes = 4 (16x16) + 9 (4x4) + 1 (I_PCM) + 4 (chroma_8x8) = 18

Total # of RDO calculations = M8 * (M4*16 + M16)

Theoretical Bound for a MB: 4 x (9x16 + 4) = 592!

INTER block:

Total Modes = [7 + 1 (P_SKIP)] + Intra counterparts

HUGE Computations!! Problem for real-time applications => So, need of FMD!
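The intra bound above can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the reduced counts in the second call (4, 2 and 3 candidates) are the Intra-FMD candidate numbers reported later in this presentation.

```python
BLOCKS_4X4_PER_MB = 16  # a 16x16 macroblock holds sixteen 4x4 luma blocks

def rd_cost(distortion, rate, lam):
    """Lagrangian cost D + lambda * R for one candidate mode."""
    return distortion + lam * rate

def intra_rdo_evaluations(m4=9, m16=4, m8=4):
    """Number of RDO evaluations per MB: M8 * (M4 * 16 + M16)."""
    return m8 * (m4 * BLOCKS_4X4_PER_MB + m16)

print(intra_rdo_evaluations())         # brute force: 592
print(intra_rdo_evaluations(4, 2, 3))  # with FMD candidate counts: 198
```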

FMD-Intra : Edge-Histogram approach

Main Idea: Use prediction in the edge direction

Generate edge map using Sobel operator

Build edge direction histogram

Fast intra mode decision

Generate Edge Map

Sobel Operator (Compute Gradients):

$$dx_{i,j} = p_{i-1,j+1} + 2p_{i,j+1} + p_{i+1,j+1} - p_{i-1,j-1} - 2p_{i,j-1} - p_{i+1,j-1}$$

$$dy_{i,j} = p_{i+1,j-1} + 2p_{i+1,j} + p_{i+1,j+1} - p_{i-1,j-1} - 2p_{i-1,j} - p_{i-1,j+1}$$

$$Amp(D_{i,j}) = |dx_{i,j}| + |dy_{i,j}|$$

$$Ang(D_{i,j}) = \frac{180^\circ}{\pi} \arctan\left(\frac{dy_{i,j}}{dx_{i,j}}\right), \qquad |Ang(D_{i,j})| < 90^\circ$$
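A minimal Python sketch of the edge-map computation follows. It assumes the image is a 2-D list indexed as `p[i][j]` with `i` the row and `j` the column (the slides do not say which index is which), and it only handles interior pixels; the function names are illustrative.

```python
import math

def sobel_edge_vector(p, i, j):
    """Return (dx, dy) at interior pixel (i, j) of the 2-D list p."""
    dx = (p[i-1][j+1] + 2*p[i][j+1] + p[i+1][j+1]
          - p[i-1][j-1] - 2*p[i][j-1] - p[i+1][j-1])
    dy = (p[i+1][j-1] + 2*p[i+1][j] + p[i+1][j+1]
          - p[i-1][j-1] - 2*p[i-1][j] - p[i-1][j+1])
    return dx, dy

def amplitude(dx, dy):
    """Edge amplitude |dx| + |dy|."""
    return abs(dx) + abs(dy)

def angle_degrees(dx, dy):
    """Edge angle in degrees, or None when dx == 0 (arctan undefined)."""
    if dx == 0:
        return None
    return math.degrees(math.atan(dy / dx))

# 3x3 patch with a step between the middle and right columns.
patch = [[0, 0, 10],
         [0, 0, 10],
         [0, 0, 10]]
dx, dy = sobel_edge_vector(patch, 1, 1)
print(dx, dy, amplitude(dx, dy), angle_degrees(dx, dy))
```

Using |dx| + |dy| instead of the Euclidean norm avoids a square root per pixel, which matters when the edge map is computed for every macroblock.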

Edge Direction Histogram for Intra_4x4

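$$\tan(Ang_{i,j}) = \frac{dy_{i,j}}{dx_{i,j}}, \qquad \frac{90^\circ}{8} = 11.25^\circ, \qquad \tan(11.25^\circ) = 0.199$$

$$
\begin{aligned}
&\text{if } (dx_{i,j} = 0 \;\&\; dy_{i,j} \neq 0) \text{ or } (|\tan(Ang_{i,j})| > 5.027): && histo(0) \mathrel{+}= Amp(D_{i,j}) \\
&\text{elseif } |\tan(Ang_{i,j})| < 0.199: && histo(1) \mathrel{+}= Amp(D_{i,j}) \\
&\text{elseif } 0.199 < \tan(Ang_{i,j}) < 0.668: && histo(6) \mathrel{+}= Amp(D_{i,j}) \\
&\text{.....} \\
&\text{elseif } -1.497 < \tan(Ang_{i,j}) < -0.668: && histo(3) \mathrel{+}= Amp(D_{i,j}) \\
&\text{elseif } -0.668 < \tan(Ang_{i,j}) < -0.199: && histo(8) \mathrel{+}= Amp(D_{i,j})
\end{aligned}
$$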
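The binning rule can be sketched in Python as below. The thresholds for modes 0, 1, 6, 3 and 8 are the ones shown on the slide; the three sectors the slide elides are filled in here as an assumption (modes 4, 5 and 7, following the angular order of the 4x4 prediction directions), and the handling of exact boundary values is likewise a guess.

```python
# tan(11.25), tan(33.75), tan(56.25), tan(78.75) in degrees
T1, T2, T3, T4 = 0.199, 0.668, 1.497, 5.027

def mode_for_gradient_4x4(dx, dy):
    """Map one Sobel vector to an Intra 4x4 directional mode (None if no edge)."""
    if dx == 0 and dy == 0:
        return None               # assumption: flat pixels contribute nothing
    if dx == 0:
        return 0                  # vertical (slide)
    t = dy / dx
    if abs(t) > T4:
        return 0                  # vertical (slide)
    if abs(t) < T1:
        return 1                  # horizontal (slide)
    if T1 <= t < T2:
        return 6                  # horizontal-down (slide)
    if T2 <= t < T3:
        return 4                  # diagonal down-right (ASSUMED, elided on slide)
    if T3 <= t <= T4:
        return 5                  # vertical-right (ASSUMED, elided on slide)
    if -T2 < t <= -T1:
        return 8                  # horizontal-up (slide)
    if -T3 < t <= -T2:
        return 3                  # diagonal down-left (slide)
    return 7                      # vertical-left (ASSUMED, elided on slide)

def edge_histogram_4x4(gradients):
    """Accumulate |dx| + |dy| per mode; index 2 (DC) is never filled."""
    histo = [0] * 9
    for dx, dy in gradients:
        mode = mode_for_gradient_4x4(dx, dy)
        if mode is not None:
            histo[mode] += abs(dx) + abs(dy)
    return histo

print(edge_histogram_4x4([(10, 1), (10, 4), (0, 5)]))
```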

FMD for Intra_4x4 Contd…

As per observations in Reference [5]:
- The ideal 4x4 mode is either the primary mode or one of its two neighboring modes
- DC mode (Mode 2) is always evaluated
- Total Modes = 1 (Prime) + 2 (neighbors) + DC = 4
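A short Python sketch of the candidate selection follows. It assumes the histogram is a 9-entry list indexed by mode number, and it assumes that "neighboring" means adjacent in the circular angular order of the directional modes (0, 7, 3, 8, 1, 6, 4, 5); that order and the function name are illustrative, not taken from the slides.

```python
# Directional 4x4 modes in assumed circular angular order.
DIRECTION_ORDER = [0, 7, 3, 8, 1, 6, 4, 5]
DC_MODE = 2

def intra4x4_candidates(histo):
    """Return [prime, neighbor, neighbor, DC] for a 9-entry edge histogram."""
    prime = max(DIRECTION_ORDER, key=lambda m: histo[m])
    k = DIRECTION_ORDER.index(prime)
    left = DIRECTION_ORDER[(k - 1) % len(DIRECTION_ORDER)]
    right = DIRECTION_ORDER[(k + 1) % len(DIRECTION_ORDER)]
    return [prime, left, right, DC_MODE]

# Histogram dominated by mode 6 (horizontal-down).
print(intra4x4_candidates([5, 11, 0, 0, 0, 0, 14, 0, 0]))
```

Only these four modes then go through RDO, instead of all nine.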

Edge Direction Histogram for Intra_16x16

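$$
\begin{aligned}
&\text{if } |\tan(Ang_{i,j})| > 2.414: && histo(0) \mathrel{+}= Amp(D_{i,j}) \\
&\text{elseif } |\tan(Ang_{i,j})| < 0.414: && histo(1) \mathrel{+}= Amp(D_{i,j}) \\
&\text{else}: && histo(3) \mathrel{+}= Amp(D_{i,j})
\end{aligned}
$$

Total Modes = 1 (Prime) + DC = 2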
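The Intra_16x16 rule can be sketched in Python as follows. The thresholds 2.414 and 0.414 are from the slide; treating dx = 0 as vertical mirrors the 4x4 rule and is an assumption, as are the function names.

```python
def mode_for_gradient_16x16(dx, dy):
    """Map one Sobel vector to Intra 16x16 mode 0, 1 or 3 (None if no edge)."""
    if dx == 0 and dy == 0:
        return None          # assumption: flat pixels contribute nothing
    if dx == 0:
        return 0             # assumption: same treatment as in the 4x4 rule
    t = abs(dy / dx)
    if t > 2.414:
        return 0             # vertical
    if t < 0.414:
        return 1             # horizontal
    return 3                 # plane

def intra16x16_candidates(gradients):
    """Return [prime mode, DC] from the Sobel vectors of one macroblock."""
    histo = {0: 0, 1: 0, 3: 0}
    for dx, dy in gradients:
        mode = mode_for_gradient_16x16(dx, dy)
        if mode is not None:
            histo[mode] += abs(dx) + abs(dy)
    prime = max(histo, key=histo.get)
    return [prime, 2]        # mode 2 (DC) is always a candidate

print(intra16x16_candidates([(10, 1), (10, 2), (1, 10)]))
```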

Fast Mode Decision-Inter

Main idea: If we can reasonably decide that an MB is temporally stationary or spatially homogeneous, we can encode the MB using a larger block size and safely skip all other modes!

Stationary Region Determination

Refers to the stillness between consecutive frames in the temporal dimension

Evaluate Zero-MV Diff:

$$Diff = \sum_{i=1,\,j=1}^{16,\,16} \left| M(i,j) - N(i,j) \right|$$

where M is the current MB and N is the co-located MB (zero motion vector) in the previous frame.

If (Diff < Threshold Ts) => "Stationary". So, choose 16x16 mode and skip other sizes!

Threshold Ts = 200 (Reference [6])
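The stationarity test is a sum of absolute differences at zero motion. A minimal Python sketch, assuming each macroblock is given as a 16x16 list of luma values (the function names are illustrative):

```python
TS = 200  # threshold Ts quoted from Reference [6]

def zero_mv_diff(cur_mb, ref_mb):
    """Sum of |M(i,j) - N(i,j)| over two equally sized blocks."""
    return sum(abs(m - n)
               for row_m, row_n in zip(cur_mb, ref_mb)
               for m, n in zip(row_m, row_n))

def is_stationary(cur_mb, ref_mb, ts=TS):
    """True when the zero-MV difference is below the threshold."""
    return zero_mv_diff(cur_mb, ref_mb) < ts

ref = [[100] * 16 for _ in range(16)]
still = [row[:] for row in ref]
still[0][0] = 150                        # one changed pixel: Diff = 50
moved = [[101] * 16 for _ in range(16)]  # every pixel off by 1: Diff = 256
print(is_stationary(still, ref), is_stationary(moved, ref))
```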

Homogeneous Region Determination

Refers to texture similarities inside a single video frame

Edge amplitude computation is already done in fast intra mode decision

$$Amp(D_{i,j}) = |dx_{i,j}| + |dy_{i,j}|$$

$$
H_{r,c} =
\begin{cases}
1, & \sum_{(i,j) \in N \times N} Amp(D_{i,j}) < Thd_H \\
0, & \sum_{(i,j) \in N \times N} Amp(D_{i,j}) \geq Thd_H
\end{cases}
$$

Threshold values (Reference [6]):
- for 16x16 block: 20000
- for 8x8 block: 5000
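The homogeneity flag is a thresholded sum over the edge map that the intra stage already produced. A minimal Python sketch, assuming `amp` is a 2-D list of per-pixel edge amplitudes and that a block is addressed by its top-left corner (both are illustrative choices):

```python
THD_16X16 = 20000  # thresholds quoted from Reference [6]
THD_8X8 = 5000

def is_homogeneous(amp, top, left, size, thd):
    """Return 1 if the summed edge amplitude of the size x size block is below thd, else 0."""
    total = sum(amp[i][j]
                for i in range(top, top + size)
                for j in range(left, left + size))
    return 1 if total < thd else 0

smooth = [[10] * 16 for _ in range(16)]    # 16x16 sum = 2560
busy = [[100] * 16 for _ in range(16)]     # 16x16 sum = 25600
print(is_homogeneous(smooth, 0, 0, 16, THD_16X16),
      is_homogeneous(busy, 0, 0, 16, THD_16X16),
      is_homogeneous(busy, 8, 8, 8, THD_8X8))
```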

Flow Chart of FMD_Inter

Wait... Changing the mode: Theory to Practice!

Implementation & Demo

H.264/AVC Profiles


Q. What is x264?

'x264': Open source H.264/AVC encoder by VideoLAN
'C' code library, Platform: Linux
Optimized as compared to reference JSVM software
Bunch of encoding options!
We finalized the options for "benchmarking" performance of the Non-FMD vs FMD case

E.g.: Command to encode the 'foreman_qcif.yuv' sequence:

…./x264 -o foreman_qcif.264 foreman_qcif.yuv 176x144 --profile baseline --frame 30 --verbose --keyint 15 --min-keyint 15 --no-scenecut --bframes 0 --ref 1 --slices 1 --fps 15 --qp 25 --partitions all --weightp 0 --me esa --subme 7 --no-chroma-me --no-8x8dct --trellis 0 --no-fast-pskip --visualize

x264 Coding Options:

--keyint 15 / --min-keyint 15: Sets GOP size to 15
--bframes 0: Disables B-frames
--slices 1: Sets 1 slice per frame
--ref 1: Only 1 frame can be used as reference
--me esa: Selects exhaustive motion estimation
--no-chroma-me: Ignores chroma in motion estimation
--qp 25: Fixed quantization parameter (constant QP)
--partitions all: Considers all possible partitions
--no-scenecut: Disables adaptive I-frame decision

Implementation I:‘Generation of Mode Statistics’

Intra MB: 3 Types :: I_4x4=0 (11 Modes), I_16x16=2 (4 Modes), I_PCM=3

Inter MB: 3 Types :: P_L0=4, P_8x8=5, P_SKIP=6

P_L0 (Level-1): can have 3 Partitions: D_16x8=14, D_8x16=15, D_16x16=16

P_8x8 (Level-2): has D_8x8 partition and can have 4 Sub-partitions: D_L0_8x8=3, D_L0_4x4=0, D_L0_8x4=1, D_L0_4x8=2

Implementation II: ‘Visualization Utility’

I-Frame
RED: Intra_4x4
CYAN: Intra_16x16

P-Frame
GREEN: P_SKIP
BLUE: P_8x8 (and below)
MAGENTA: P_16x16, P_16x8, P_8x16

Motive: "Seeing is Believing!"
Let's see a Demo…

Key observations:

I-Frame:
- 16x16 size chosen for spatially homogeneous regions
- 4x4 size chosen for an MB with many spatial details/local edges

P-Frame:

Sequence    % of Skipped    % of Inter    % of Intra
Akiyo       78.2            21.8          0
Football    6.6             81            12.4
Foreman     17.5            81.9          0.6

Contd…

Though H.264 allows variable size MC down to 4x4 size…

Real world video sequences:
- A certain percentage of blocks are 'Skipped'
- Spatially homogeneous regions are best compensated with 16x16 (such blocks have similar motion and are very seldom split into smaller blocks)
- Temporally stationary blocks (e.g. stationary background, even with strong edges) are best compensated with 16x16 or P_SKIP

Nonetheless,

Blocks containing motion boundaries or motion in smaller objects benefit from 8x8 or 4x4 MC

Implementation III: FMD Intra in x264

~1000 lines of C code:
- Edge map computation
- Prime mode computation based on histogram
- Modification of mode decision logic in x264

Number of candidate modes in Intra-FMD:

Block Size            Total # of modes    # of modes selected
Luma (Y) 4x4          9                   4
Luma (Y) 16x16        4                   2
Chroma (U,V) 8x8      4                   3 or 2

Results: Intra FMD (All I frames, Qp=25)

Sequence    ΔTIME (%)    ΔPSNR_Y    ΔPSNR_U    ΔPSNR_V    ΔPSNR_AVG
Mobile      -30.22       -0.022     0.007      0.006      -0.016
Akiyo       -35.81       -0.181     -0.091     -0.091     -0.165
Paris       -39.15       -0.067     0.006      -0.02      -0.055
Foreman     -38.88       -0.154     -0.014     -0.042     -0.125
Football    -38.84       -0.084     -0.066     -0.068     -0.198

(ΔPSNR values in dB)

Avg. Time Saving: 36.70%

Avg. PSNR drop: 0.11 dB

Results: Intra FMD (PSNR vs R)

Sequence: Mobile, Coding: All I, Qp= 37,33,29,25

Avg PSNR drop: 0.044 dB, Avg. Increase in R: ~6%, Avg Time Saving: 37.51%

Summary and future work:

To Conclude:
- Learnt x264 code-flow, different encoding options
- Matlab 'mode visualization script' is ready
- Intra-FMD ready, Inter-FMD (in progress)
- Important: FMD framework is ready! Different FMD algorithms can be plugged in to evaluate prime mode selection…

Future Work:
- Inter FMD
- FMD enhancement: analysis of different modes with a conditional probabilistic model

References

[1] URL: http://www.videolan.org/developers/x264.html

[2] Thomas Wiegand, Gary J. Sullivan, "Overview of the H.264/AVC Video Coding Standard", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 7, July 2003

[3] URL: http://www.vcodex.com/files/H.264_overview.pdf White Paper: An Overview of H.264 Advanced Video Coding

[4] Iain E. G. Richardson, "H.264 and MPEG-4 Video Compression", Wiley, 2003

[5] Feng Pan et al., "Fast Mode Decision Algorithm for Intra-prediction in H.264/AVC Video Coding", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 15, No. 7, July 2005

[6] D. Wu et al., "Fast Intermode Decision in H.264/AVC Video Coding", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 15, No. 6, July 2005

[7] Rui Su, Guizhong Liu, Tongyu Zhang, "Fast Mode Decision Algorithm for Intra Prediction in H.264/AVC", ICASSP 2006

Thank you!

Questions?
