
Adaptive slice-level parallelism for H.264/AVC encoding using pre macroblock mode selection

Bongsoo Jung, Byeungwoo Jeon

Journal of Visual Communication and Image Representation 2008


Outline

Introduction

Complexity Analysis

Method: Pre Macroblock Mode Selection; Adaptive Slice-level Parallelism

Experimental Results

Conclusions


Introduction

H.264/AVC achieves high coding efficiency: variable block sizes, multiple reference frames, quarter-pel motion vector accuracy, etc.

The cost is high computational complexity, which can be tackled with complexity reduction algorithms and parallel processing


Introduction

GOP level: simple, but high latency

Frame level: keeps coding efficiency, but the dependence among frames limits the thread scalability

Slice level: slices are encoded independently, but with less coding efficiency

Macroblock level: high dependency


Introduction

MBs in a slice may not have similar computational complexity, which causes unnecessary extra waiting time in some threads.

[Figure: encoding time of slices 0 to 7, one slice per processing unit (PU0 to PU7)]


Main Purpose

Objective: use a parallel algorithm to speed up the H.264/AVC encoder, and maximize the parallelism efficiency by distributing the workload equally.

Method: pre-processing (fast MB mode selection), followed by adaptive slice-level parallelism


Complexity Analysis

Inter prediction modes of MBs in H.264

Intra prediction modes: 4*4, 16*16


Complexity Analysis

The run-time complexity of the H.264/AVC encoder

Test conditions: Pentium IV 2.4 GHz, Foreman_CIF with IPPP structure


Pre Macroblock Mode Selection: Overview

Why? High computational complexity of ME with variable block sizes

Remove unnecessary ME block sizes and the RD calculation of intra prediction modes

This removal leads to: complexity reduction; workload balancing among slices


Pre Macroblock Mode Selection: Inter MB mode selection

MC block sizes in a video sequence: foreground region, 8*8 or smaller; non-moving region, 16*16

High temporal correlation: check the consistency history of block size 16*16 and zero MV

Two measurements: Zero motion consistency (ZMC); Large block consistency (LBC)


Pre Macroblock Mode Selection: Inter MB mode selection

Zero Motion Consistency (ZMC): indicates how long a specified block has had a zero MV consecutively

When a block is encoded in intra mode, ZMC is set to 0

t: frame index; ZMC_0 = 0

(n,m;i,j) indicates a 4*4 block at (n,m) within MB (i,j)

A high value of ZMC means a high probability of belonging to a background region
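The ZMC equation itself did not survive in this transcript, so the following is only a minimal sketch of the counter as the text describes it. It assumes that a non-zero MV, like intra coding, resets the count to 0; the function name and argument layout are hypothetical.

```python
def update_zmc(prev_zmc, mv, is_intra):
    """Return the ZMC of one 4*4 block for the current frame.

    prev_zmc: ZMC of the same block in the previous frame (ZMC_0 = 0).
    mv:       (mvx, mvy) motion vector chosen for the block in this frame.
    is_intra: True when the block was coded in intra mode.
    """
    if is_intra or mv != (0, 0):
        return 0            # the run of zero MVs is broken (assumed reset rule)
    return prev_zmc + 1     # one more consecutive frame with a zero MV


# One block over five frames: three zero MVs, one moving frame, one zero MV.
history = []
zmc = 0  # ZMC_0 = 0
for mv, intra in [((0, 0), False), ((0, 0), False), ((0, 0), False),
                  ((1, 0), False), ((0, 0), False)]:
    zmc = update_zmc(zmc, mv, intra)
    history.append(zmc)
print(history)  # [1, 2, 3, 0, 1]
```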


Pre Macroblock Mode Selection: Inter MB mode selection

Zero Motion Consistency Score (ZMCS): indicates how likely an MB is to be a stationary region

T_MOTION: a threshold value
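The scoring rule is not legible in this transcript. As a hedged illustration only, the sketch below assumes an MB is rated High when every one of its sixteen 4*4 blocks has a ZMC of at least T_MOTION, and Low otherwise. The aggregation by minimum and the function name are assumptions; the default threshold of 4 is the T_MOTION value quoted on a later slide.

```python
T_MOTION = 4  # threshold value quoted later in the slides


def zmc_score(block_zmcs, t_motion=T_MOTION):
    """Classify one MB as 'High' or 'Low' from the ZMC values of its 4*4 blocks.

    Assumption: the MB counts as stationary only if all of its blocks
    have kept a zero MV for at least t_motion consecutive frames.
    """
    return "High" if min(block_zmcs) >= t_motion else "Low"


stationary_mb = [5] * 16          # every block still for 5 frames
moving_mb = [5] * 15 + [0]        # one block moved in the last frame
print(zmc_score(stationary_mb), zmc_score(moving_mb))  # High Low
```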


Pre Macroblock Mode Selection: Inter MB mode selection

Large Block Consistency (LBC): indicates the number of consecutive frames having a 16*16 MC block size at the (i,j)th MB

When a block is encoded in intra mode, LBC is set to 0

bestMode_t(i,j): the best MB mode of the (i,j)th MB in the tth frame

LBC_0 = 0
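A minimal sketch of the LBC counter, written from the description above because the equation is missing from the transcript. It assumes that SKIP and P16*16 are the modes that count as a 16*16 MC block size and that any other best mode, intra included, resets the counter; the mode labels are hypothetical strings.

```python
# Assumption: these best-mode labels are the ones treated as 16*16 MC.
LARGE_BLOCK_MODES = {"SKIP", "P16x16"}


def update_lbc(prev_lbc, best_mode):
    """Return LBC_t(i,j) given LBC_{t-1}(i,j) and bestMode_t(i,j)."""
    if best_mode in LARGE_BLOCK_MODES:
        return prev_lbc + 1   # one more consecutive frame coded as 16*16
    return 0                  # smaller partition or intra mode: restart the count


lbc = 0  # LBC_0 = 0
trace = []
for mode in ["P16x16", "SKIP", "P8x8", "P16x16", "I4x4"]:
    lbc = update_lbc(lbc, mode)
    trace.append(lbc)
print(trace)  # [1, 2, 0, 1, 0]
```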


Pre Macroblock Mode Selection: Inter MB mode selection

Large Block Consistency Score (LBCS): indicates how likely an MB is to be partitioned as 16*16

T_MODE1, T_MODE2: threshold values used to assess the LBC
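With two thresholds the score presumably takes three levels (the slides later refer to LBCS = High, Medium and Low). The sketch below is one plausible mapping, not the rule from the paper: the boundary handling (>= versus >) is an assumption, and the default thresholds are the values 1 and 4 quoted on a later slide.

```python
def lbc_score(lbc, t_mode1=1, t_mode2=4):
    """Map an LBC count to 'Low', 'Medium' or 'High'.

    Assumption: LBC >= t_mode2 is High, t_mode1 <= LBC < t_mode2 is Medium,
    and anything below t_mode1 is Low.
    """
    if lbc >= t_mode2:
        return "High"
    if lbc >= t_mode1:
        return "Medium"
    return "Low"


print([lbc_score(v) for v in (0, 1, 3, 4, 10)])
# ['Low', 'Medium', 'Medium', 'High', 'High']
```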


Pre Macroblock Mode Selection: Inter MB mode selection

An illustration of LBCS


Pre Macroblock Mode Selection: Inter MB mode selection

Conditional probability of MB modes given ZMCS = High

Block sizes other than SKIP and P16*16 are very unlikely to appear (probability less than about 0.04)

This allows early detection of the SKIP and P16*16 modes

T_MOTION = 4


Pre Macroblock Mode Selection: Inter MB mode selection

Joint conditional probability of MB modes given LBCS, with ZMCS = Low

A: LBCS = High, B: LBCS = Medium, C: LBCS = Low

T_MODE1 = 1, T_MODE2 = 4


Pre Macroblock Mode Selection: Pre selective intra mode selection

High computational load of computing the RD costs of intra modes

Compare the temporal correlation with the spatial correlation of the current MB prior to frame coding


Pre Macroblock Mode Selection: Selective intra mode selection

Mean Absolute Temporal Difference (MATD)

Mean Absolute Spatial Difference (MASD)

c_{x,y}: pixel value at location (x,y) of the MB in the current frame

r_{x,y}: pixel value at location (x,y) of the MB in the previous frame

X, Y: horizontal and vertical dimensions of an MB

MASD_H: the MASD between horizontally neighboring pixels

MASD_V: the MASD between vertically neighboring pixels
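The two equations are missing from this transcript, so the sketch below only follows the symbol definitions. Three points are assumptions rather than facts from the paper: each measure is normalized by the number of pixel pairs it sums over, the reference MB is the co-located one in the previous frame, and MASD is taken as the average of MASD_H and MASD_V. All function names are hypothetical.

```python
def matd(cur, ref):
    """Mean absolute difference between the current MB and the previous-frame MB."""
    h, w = len(cur), len(cur[0])
    total = sum(abs(cur[y][x] - ref[y][x]) for y in range(h) for x in range(w))
    return total / (w * h)


def masd_h(cur):
    """Mean absolute difference between horizontally neighboring pixels."""
    h, w = len(cur), len(cur[0])
    total = sum(abs(cur[y][x] - cur[y][x + 1])
                for y in range(h) for x in range(w - 1))
    return total / (h * (w - 1))


def masd_v(cur):
    """Mean absolute difference between vertically neighboring pixels."""
    h, w = len(cur), len(cur[0])
    total = sum(abs(cur[y][x] - cur[y + 1][x])
                for y in range(h - 1) for x in range(w))
    return total / ((h - 1) * w)


def masd(cur):
    """Assumed combination: average of the two directional measures."""
    return (masd_h(cur) + masd_v(cur)) / 2


# Toy 4*4 block: a horizontal ramp that brightened by 1 since the previous frame.
ref_block = [[0, 10, 20, 30] for _ in range(4)]
cur_block = [[1, 11, 21, 31] for _ in range(4)]
print(matd(cur_block, ref_block), masd_h(cur_block), masd_v(cur_block), masd(cur_block))
# 1.0 10.0 0.0 5.0
```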


Pre Macroblock Mode Selection: Selective intra mode selection

Compare MATD and MASD to determine whether the RD costs of intra modes should be calculated for the current MB

A larger w makes skipping the intra mode search easier

A smaller QP will incur more intra modes than a larger QP

w: weighting factor, currently set to 0.6

More temporally correlated than spatially correlated
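The decision inequality is not preserved in this transcript. One form that is consistent with the remarks above (a larger w makes skipping easier) is to skip the intra search when MATD < w * MASD, i.e. when the MB is more temporally than spatially correlated. The sketch below uses that assumed form and ignores the QP dependence the slide mentions; the function name is hypothetical.

```python
def should_check_intra(matd_value, masd_value, w=0.6):
    """Return True when the RD costs of the intra modes should be computed.

    Assumed rule: the intra search is skipped when MATD < w * MASD.
    The QP dependence mentioned on the slide is not modeled here.
    """
    skip_intra = matd_value < w * masd_value
    return not skip_intra


print(should_check_intra(1.0, 5.0))  # False: 1.0 < 0.6 * 5.0, intra search skipped
print(should_check_intra(4.0, 5.0))  # True: 4.0 >= 3.0, intra RD costs are computed
```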


Pre Macroblock Mode Selection: MB mode classification

Decision table of candidate MB mode

A block diagram of MB selection


Adaptive Slice-level Parallelism: Overview

Characteristics: easy to implement; lower overhead of inter-communication among processor units; good scalability; increased bitrate

A slice boundary is defined on the basis of a fixed number of MBs or a fixed number of bits

Hard to decide a slice boundary prior to encoding


Adaptive Slice-level Parallelism: Fixed MB assignment

The number of consecutive MBs in each slice

L: the number of processor units on a multi-core system

M: the total number of MBs in a frame

i: slice index

Example: number of processing units L = 8, sequence resolution is CIF (352*288), M = 22*18 = 396

We can assign about 49 MBs to each slice
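The CIF example can be checked with a few lines. Since 396 / 8 = 49.5 is not an integer, some rounding rule is needed; the one below (the first slices receive one extra MB) is an assumption, because the slide's equation is not in this transcript.

```python
def fixed_mb_assignment(total_mbs, num_units):
    """Split total_mbs consecutive MBs into num_units slices of near-equal size.

    Assumption: when the division is not exact, the first (total_mbs % num_units)
    slices receive one extra MB.
    """
    base, extra = divmod(total_mbs, num_units)
    return [base + 1 if i < extra else base for i in range(num_units)]


mbs_per_frame = (352 // 16) * (288 // 16)   # CIF: 22 * 18 = 396 MBs
sizes = fixed_mb_assignment(mbs_per_frame, 8)
print(mbs_per_frame, sizes)  # 396 [50, 50, 50, 50, 49, 49, 49, 49]
```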


Adaptive Slice-level Parallelism: Fixed MB assignment

The scheduling of slice-level parallelism in eight processor units

[Figure: encoding time of slices 0 to 7 on processing units PU0 to PU7. Panels: ideal case; practical case, where the slowest slice is the bottleneck]


Adaptive Slice-level Parallelism: Fixed MB assignment

The imbalance of computational load distribution

[Figure panels: exhaustive search method; fast ME / fast mode search]


Adaptive Slice-level Parallelism: Fixed MB assignment

Computational load for encoding one frame in slice-level parallelism

Computational load of the tth frame on a single-processor system

C^t_slice(i): the computational load of the ith slice in the tth frame

L: number of slices in a frame


Adaptive Slice-level Parallelism: Fixed MB assignment

The speedup of a multiprocessor system over a single-processor system

To achieve the maximum speedup, the computational loads of the slices should be as similar as possible

This motivates the adaptive slice partition method
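The load and speedup equations are not preserved in this transcript. A common way to model them, used here as an assumption, is that the parallel encoder finishes when its slowest slice finishes, while a single processor needs the sum of all slice loads. Synchronization overhead is ignored and the function name is hypothetical.

```python
def ideal_speedup(slice_loads):
    """Speedup of L processing units over one, given the load of each slice.

    Assumed model: single-processor time = sum of the slice loads,
    parallel time = load of the heaviest slice.
    """
    return sum(slice_loads) / max(slice_loads)


balanced = [10] * 8                       # every slice costs the same
unbalanced = [10] * 7 + [30]              # one slice is three times heavier
print(ideal_speedup(balanced))            # 8.0
print(round(ideal_speedup(unbalanced), 2))  # 3.33
```

In this toy case a single heavy slice cuts the speedup on eight units from 8 to about 3.3, which is the bottleneck that motivates balancing the slice loads.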


Adaptive Slice-level Parallelism: Complexity estimation model

A simple estimation method by utilizing the result of fast MB mode selection

Define the group value g corresponding to the candidate MB modes


Adaptive Slice-level Parallelism: Complexity estimation model

Complexity model

C_{k,CHKintra}(g): complexity cost of the kth MB

g: group index

e_inter: estimated complexity cost of inter mode in g = 1

e_intra: complexity cost according to the intra mode check in g = 1

α1, α2, α3, β1, β2, β3: weighting values of complexity cost


Adaptive Slice-level Parallelism: Complexity estimation model

Relative computational load

$$
C_{k,\mathrm{CHK}_{intra}}(g) =
\begin{cases}
e_{inter} + e_{intra}, & g = 1 \\
\alpha_1 e_{inter} + \beta_1 e_{intra}, & g = 2 \\
\alpha_2 e_{inter} + \beta_2 e_{intra}, & g = 3 \\
\alpha_3 e_{inter} + \beta_3 e_{intra}, & g = 4
\end{cases}
$$

CHKintra = 0: assume e_inter = 1, e_intra = 0; α1 = 2.42, α2 = 3.12, α3 = 5.28

Relative load for g = 1, 2, 3, 4: 1, 2.42, 3.12, 5.28

CHKintra = 1: assume e_inter = 1, e_intra = 3.97; β1 = 0.82, β2 = 0.83, β3 = 0.84

Relative load for g = 1, 2, 3, 4: 4.97, 6.48, 7.23, 9.48
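A minimal sketch of how such a per-MB cost could be evaluated. It assumes the linear form suggested by the symbol definitions (an inter term weighted by α plus an intra term weighted by β, with weight 1 for g = 1) and drops the intra term when the intra check is off. The weights and the default e_inter and e_intra are the values quoted on the slide; the function and table names are hypothetical.

```python
# Weights quoted on the slide; index 0 is group g = 1, which carries no extra weight.
ALPHA = [1.0, 2.42, 3.12, 5.28]   # inter-mode weights for g = 1..4
BETA = [1.0, 0.82, 0.83, 0.84]    # intra-mode weights for g = 1..4


def mb_complexity(g, chk_intra, e_inter=1.0, e_intra=3.97):
    """Estimated relative cost of one MB in group g (1..4).

    Assumption: cost = alpha_g * e_inter + beta_g * e_intra, where the
    intra term is dropped when chk_intra is 0 (no intra RD check).
    """
    if g not in (1, 2, 3, 4):
        raise ValueError("g must be 1, 2, 3 or 4")
    intra_term = BETA[g - 1] * e_intra if chk_intra else 0.0
    return ALPHA[g - 1] * e_inter + intra_term


print([mb_complexity(g, chk_intra=0) for g in (1, 2, 3, 4)])
# [1.0, 2.42, 3.12, 5.28]
```

Keeping the weights in small tables makes it easy to re-fit them for another encoder configuration without touching the cost function.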


Adaptive Slice-level Parallelism: Adaptive MB assignment

The total computational load at the tth frame

$$
\tilde{C}^{t} = \sum_{k=0}^{M-1} C_{k,\mathrm{CHK}_{intra}}(g)
$$

Ideal computational load of each slice for the uniform workload distribution

$$
\tilde{C}^{t}_{slice} = \frac{\tilde{C}^{t}}{L}
$$


Adaptive Slice-level Parallelism: Adaptive MB assignment

MB assignment of slice

Much better than fixed MB assignment in each slice
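The assignment rule itself is not legible in this transcript. The sketch below shows one simple greedy way to place slice boundaries from estimated per-MB costs: MBs are taken in coding order, and slice i is closed as soon as the cumulative cost reaches (i + 1) times the ideal slice load. This is an illustrative assumption, not necessarily the authors' rule, and it expects at least as many MBs as slices.

```python
def adaptive_mb_assignment(mb_costs, num_slices):
    """Return the number of consecutive MBs given to each slice.

    mb_costs:   estimated complexity of each MB in coding order.
    num_slices: number of slices (one per processing unit).
    """
    target = sum(mb_costs) / num_slices   # ideal load of one slice
    sizes = []
    cumulative = 0.0
    count = 0
    for k, cost in enumerate(mb_costs):
        cumulative += cost
        count += 1
        remaining_mbs = len(mb_costs) - (k + 1)
        remaining_slices = num_slices - len(sizes) - 1
        reached = cumulative >= target * (len(sizes) + 1)
        must_close = remaining_mbs == remaining_slices  # keep 1 MB per later slice
        if len(sizes) < num_slices - 1 and (reached or must_close):
            sizes.append(count)
            count = 0
    sizes.append(count)                   # the last slice takes what is left
    return sizes


# Eight cheap MBs followed by four expensive ones, split over three slices.
costs = [1] * 8 + [4] * 4
print(adaptive_mb_assignment(costs, 3))   # [8, 2, 2], each slice has load 8
```

A fixed split of the same frame would give 4 MBs per slice with loads 4, 4 and 16, so the last processing unit would become the bottleneck.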


Adaptive Slice-level Parallelism: Adaptive MB assignment

Entire block diagram


Experimental Results: Overview

Performance comparison between proposed MB mode decision and the conventional method

Comparing adaptive slice-level parallelism with fixed slice-level parallelism


Experimental Results: MB mode selection

Average encoding time saving AST[%]

BDPSNR and BDBR are used to measure the performance against FULL_1Slice

FULL_1Slice: exhaustive method

FMD_1Slice: fast MB mode search method


Experimental Results: Rate distortion curves


Experimental Results

R-D performance compared to one slice per frame (FMD_1Slice)


Experimental Results: Rate distortion curves


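Experimental Results: Slice-level parallelism

Comparing adaptive and fixed slice-level parallelism

Speedup

$$
\mathrm{Speedup}_{FMD\_Fixed} =
\frac{\mathrm{EncTime}(FMD\_1Slice)}
     {\max_i \{\mathrm{EncTime}_{FMD\_Fixed}(\mathrm{slice}_i)\} + \mathrm{OverheadTime}}
$$

$$
\mathrm{Speedup}_{FMD\_Adaptive} =
\frac{\mathrm{EncTime}(FMD\_1Slice)}
     {\max_i \{\mathrm{EncTime}_{FMD\_Adaptive}(\mathrm{slice}_i)\} + \mathrm{OverheadTime}}
$$

EncTime(FMD_1Slice): encoding time of one slice per frame on a single-processor system

MAX_i{EncTime_FMD_Fixed(slice_i)}: the longest encoding time of a slice using fixed mode

MAX_i{EncTime_FMD_Adaptive(slice_i)}: the longest encoding time of a slice using adaptive mode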


Experimental Results: Speedup


Conclusions

Proposed a fast MB mode selection using the consistency history of block size 16*16 and zero MV

Proposed an intra mode selection that compares the temporal and spatial correlation of the current MB

Using these two schemes, they proposed a new adaptive slice-level parallelism to speed up the H.264/AVC encoder

