Quad-Tree Motion Modeling with Leaf Merging

Reji Mathew and David S. Taubman

CSVT 2010

Outline

Introduction
Quad-tree representation
Quad-tree motion modeling
  Motion vector prediction strategies
  Pruning algorithm
  Merging principle
  Motion signaling
  R-D performance results
Hierarchical and polynomial motion modeling
Scalable motion modeling
Conclusion

Quad-tree Representation: Image modeling

The image is recursively divided into smaller regions, each represented by a suitable model.

Sub-optimal: the dependency between neighboring leaf nodes with different parents is not exploited.
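The recursive subdivision described above can be sketched as follows. This is only an illustration: the constant (mean) model per region, the squared-error split test, and the names `build_quadtree`, `min_size` and `max_err` are assumptions made for the example, not details taken from the slides.

```python
def build_quadtree(img, x, y, size, min_size, max_err):
    """Recursively split a square block until a constant model fits well enough."""
    pixels = [img[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    mean = sum(pixels) / len(pixels)
    err = sum((p - mean) ** 2 for p in pixels)  # squared error of the constant model
    node = {"x": x, "y": y, "size": size, "mean": mean, "children": []}
    if err > max_err and size > min_size:
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                node["children"].append(
                    build_quadtree(img, x + dx, y + dy, half, min_size, max_err))
    return node


def leaves(node):
    """Collect the leaf nodes in depth-first order."""
    if not node["children"]:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]


# 4x4 image: flat except for the top-right 2x2 quadrant.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
tree = build_quadtree(image, 0, 0, 4, min_size=1, max_err=0.5)
leaf_blocks = [(n["x"], n["y"], n["size"]) for n in leaves(tree)]
```

Here the root is split once and each 2x2 quadrant becomes a leaf, because the constant model fits every quadrant exactly.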

Quad-tree Representation: Image modeling

Rate-distortion optimization: a Lagrangian cost function (D + λR) is minimized using tree pruning with a leaf merging step.

[1] R. Shukla, P. Dragotti, M. Do, and M. Vetterli, “Rate-distortion optimized tree structure compression algorithms for piecewise polynomial images,” IEEE Trans. Image Process., vol. 14, no. 3, pp. 343–359, Mar. 2005.

Quad-tree Motion Modeling

Motion model: forward-only, backward-only, or bi-directional motion with two reference frames.

Motion vector prediction strategies:
Hierarchical motion coding
H.264 spatial motion vector prediction strategy

Motion models


Pruning Algorithm

Produce a quad-tree structure that minimizes the Lagrangian cost objective Df + λRf.

Given a parent node p, the four children ci, 1 ≤ i ≤ 4, are pruned away if [equation not recovered].

When pruning occurs, [equation not recovered] and [equation not recovered]. Otherwise, [equation not recovered] and [equation not recovered], where [symbol not recovered] = Rp in hierarchical coding and = 0 at all times in spatial coding.

R-D optimally pruned quad-tree: tree pruning yields a globally minimal value of Df + λRf for hierarchical coding, while it is somewhat greedy for spatially predictive coding.
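A bottom-up pruning pass of this kind might look like the sketch below. The exact pruning condition is not preserved in this transcript, so the test used here (prune when the parent's cost as a leaf does not exceed the children's combined cost plus a branch term) is an assumed reading. The branch term equal to Rp in hierarchical coding and 0 in spatial coding follows the slide text. The class `Node` and the function `prune` are hypothetical names.

```python
class Node:
    def __init__(self, D, R, children=None):
        self.D = D                      # distortion if this node is coded as a leaf
        self.R = R                      # rate of this node's own motion
        self.children = children or []  # four children, or empty for a leaf


def prune(node, lam, hierarchical):
    """Bottom-up pruning; returns (Df, Rf) of the pruned subtree rooted at node."""
    if not node.children:
        return node.D, node.R
    child_costs = [prune(c, lam, hierarchical) for c in node.children]
    Df_split = sum(d for d, _ in child_costs)
    # Assumed branch term: the parent's motion is still paid for in hierarchical
    # coding (R_p), but costs nothing in spatial coding (0).
    R_branch = node.R if hierarchical else 0
    Rf_split = sum(r for _, r in child_costs) + R_branch
    if node.D + lam * node.R <= Df_split + lam * Rf_split:
        node.children = []              # prune: the parent becomes a leaf
        return node.D, node.R
    return Df_split, Rf_split


def make_tree():
    """Hypothetical costs: one parent with four identical children."""
    return Node(D=100, R=4, children=[Node(D=22, R=3) for _ in range(4)])


hier_tree = make_tree()
hier_result = prune(hier_tree, lam=1, hierarchical=True)
spatial_tree = make_tree()
spatial_result = prune(spatial_tree, lam=1, hierarchical=False)
```

With these made-up numbers the children are pruned under hierarchical coding, where keeping them also costs the parent's rate, but retained under spatial coding.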


Merging principle

Merging offers the possibility of jointly coding and optimizing neighboring nodes that belong to different parents.

Merge targets are neighboring nodes located at a higher level or at the same level.

Merging is allowed to take place only if it reduces the overall Lagrangian cost.
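The acceptance rule above (merge only when the Lagrangian cost goes down) can be illustrated with a small sketch. It is a simplification: it compares precomputed costs and does not model the joint re-optimization of the merged region's motion. The function name `choose_merge` and its arguments are hypothetical.

```python
def choose_merge(own_cost, merge_costs):
    """Return the index of the best merge target, or None to stay unmerged.

    own_cost:    Lagrangian cost D + lam*R of coding the node on its own.
    merge_costs: Lagrangian cost of the node when merged with each candidate
                 target (distortion under the shared motion plus lam times the
                 merge signaling rate).
    """
    best = None
    best_cost = own_cost
    for i, cost in enumerate(merge_costs):
        if cost < best_cost:          # merge only on a strict cost reduction
            best, best_cost = i, cost
    return best


picked = choose_merge(10.0, [12.0, 8.5, 9.0])   # second candidate is cheapest
stay = choose_merge(10.0, [10.0, 11.0])         # no candidate reduces the cost
```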

The same parent


Motion signaling

Anchor node:

Hierarchical: the only member node of the region that is not signaled as being merged.

Spatial: the first node in the region that is encountered during decoding (the top-left block).


R-D performance results

[Figure: R-D performance plots, annotated with 35%, 25%, 45% and 35%]

Once merging is included, the performance of the hierarchical motion representation can be brought close to that achieved by spatial prediction with merging.

Hierarchical and Polynomial Motion Modeling

Further improve the performance of the hierarchical motion representation by using polynomial motion models:
Formation of larger regions during the merging process
Smoother motion representations

Motion models

The parameters of the motion model are obtained by a weighted least squares fitting procedure.

Pruning phase Merging phase

[symbol not recovered]: motion vector belonging to node b at level k
[symbol not recovered]: motion corresponding to translation, linear and affine flows
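As an illustration of the weighted least squares fitting mentioned above, the sketch below fits an affine flow v = a0 + a1*x + a2*y to a set of motion vector samples by solving the weighted normal equations. The sample format, the weights, and the names `solve3` and `fit_affine` are assumptions for this example. The slides do not say how the weights are chosen or how the system is solved.

```python
def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = sum(M[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (M[r][3] - s) / M[r][r]
    return x


def fit_affine(samples):
    """Weighted least squares fit of v = a0 + a1*x + a2*y per MV component.

    samples: list of (x, y, vx, vy, w) with w the (assumed) sample weight.
    Returns (params_x, params_y), each [a0, a1, a2].
    """
    A = [[0.0] * 3 for _ in range(3)]
    bx = [0.0] * 3
    by = [0.0] * 3
    for x, y, vx, vy, w in samples:
        phi = (1.0, x, y)
        for i in range(3):
            bx[i] += w * phi[i] * vx
            by[i] += w * phi[i] * vy
            for j in range(3):
                A[i][j] += w * phi[i] * phi[j]
    return solve3(A, bx), solve3(A, by)


# Hypothetical samples generated from a known affine flow, with arbitrary weights.
points = [(0, 0, 1), (4, 0, 2), (0, 4, 1), (4, 4, 3), (2, 6, 1)]
samples = [(x, y, 1 + 0.5 * x - 0.25 * y, -2 + 0.1 * x + 0.2 * y, w)
           for x, y, w in points]
params_x, params_y = fit_affine(samples)
```

The sample positions must not all lie on one line, otherwise the normal equations are singular and `solve3` fails with a division by zero.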

Hierarchical and Polynomial Motion Modeling

Motion compensation: generate a set of motion vectors for each descendant at level K (4×4 blocks). These motion vectors depend on the motion model and the central location of block b’.

R-D performance with motion models
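Generating per-block motion vectors from a node's model could be sketched as below, assuming an affine model of the form v = a0 + a1*x + a2*y evaluated at the center of each 4×4 descendant. The parameter layout, the choice of `bx + block / 2` as the block center, and the name `descendant_mvs` are assumptions for the example.

```python
def descendant_mvs(x0, y0, size, params_x, params_y, block=4):
    """Evaluate an affine model at the center of every block x block descendant.

    (x0, y0) is the top-left corner of the node and size its width in pixels.
    params_x / params_y are [a0, a1, a2] for v = a0 + a1*x + a2*y.
    Returns {(bx, by): (vx, vy)} keyed by each descendant's top-left corner.
    """
    mvs = {}
    for by in range(y0, y0 + size, block):
        for bx in range(x0, x0 + size, block):
            cx = bx + block / 2.0   # central location of the descendant block
            cy = by + block / 2.0
            vx = params_x[0] + params_x[1] * cx + params_x[2] * cy
            vy = params_y[0] + params_y[1] * cx + params_y[2] * cy
            mvs[(bx, by)] = (vx, vy)
    return mvs


# A purely translational model gives every descendant the same vector.
translation = descendant_mvs(0, 0, 8, [1.0, 0.0, 0.0], [0.0, 0.0, 0.0])
# A model with a horizontal gradient gives vectors that vary with block position.
gradient = descendant_mvs(0, 0, 8, [0.0, 0.5, 0.0], [0.0, 0.0, 0.0])
```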

Scalability Motion Modeling

Scalability objective: a modified Lagrangian cost function.

When terminating decoding at an intermediate resolution level, motion compensation is performed using leaf nodes that may already be available; in those cases where leaf nodes are not available, information contained in branch nodes is utilized.

[symbol not recovered]: the costs for each level k of the quad-tree
αk: the weights assigned to each level k, with [constraints not recovered]

Leaf node b: contribution to [symbol not recovered]
Branch node b: contribution to [symbol not recovered]

[symbol not recovered]: the total distortion of all nodes for which motion compensation is performed

Level k: terminate


Scalability performance

α0 = α1 = α2 = 0.1, α3 = 0.7
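One plausible reading of the modified cost is a weighted sum of per-level Lagrangian costs, using the level weights quoted above. The formula itself is not preserved in this transcript, so the weighted-sum form, the function name `scalable_cost`, and the per-level cost values below are all assumptions for illustration.

```python
def scalable_cost(level_costs, weights):
    """Weighted combination of per-level Lagrangian costs (assumed form)."""
    if len(level_costs) != len(weights):
        raise ValueError("one weight per quad-tree level is required")
    return sum(a * j for a, j in zip(weights, level_costs))


alphas = [0.1, 0.1, 0.1, 0.7]             # weights quoted on the slide
costs = [400.0, 250.0, 180.0, 120.0]      # hypothetical per-level costs D_k + lam*R_k
total = scalable_cost(costs, alphas)
```

Under this reading, the large weight on the last level means the finest resolution dominates the optimization, while the coarser levels still contribute a small share.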


Residual coding

JPEG2000: full-resolution motion-compensated residual frames
Total rate for coding motion and residual frames


Wavelet-based video encoding results

Integrate the quad-tree motion model with the wavelet-based scalable interactive video (SIV) codec [9].

[9] A. Secker and D. S. Taubman, “Lifting-based invertible motion adaptive transform framework for highly scalable video compression,” IEEE Trans. Image Process., vol. 12, no. 12, pp. 1530–1542, Dec. 2003.

Conclusion

The merging step can be incorporated into quad-tree motion representations for a range of motion modeling contexts.

Reported the R-D performance that can be gained by introducing merging for the two cases of hierarchical and spatially predictive motion coding (such as that employed by H.264).

Reported the benefits of polynomial modeling and hierarchical coding once merging has been incorporated into the conventional quad-tree approach.
