Rochelle 5

  • 7/28/2019 Rochelle 5

    1/44

    Transcoding of an MPEG-2 bit stream to an H.264 bit stream

    What is Transcoding?

    The operation of converting video in one format to another format.

    Need: Compatibility between MPEG-2 and H.264 devices.

    Applications: To adapt the bit rate of a compressed stream to the channel bandwidth, to change the spatial or temporal resolution of a compressed stream, etc.

    Criteria considered in Heterogeneous Transcoding

    Quality of the transcoded stream should be comparable to that obtained by complete decoding and re-encoding with full motion search, and to that of the initial input stream.

    The information in the input bit stream should be re-used as much as possible to reduce multigenerational degradation.

    The computational cost and complexity should be kept minimal.

    MPEG-2 Decoder

    [Block diagram. Labels: MPEG-2 bitstream; Variable Length Decoding; Inverse Scan; Inverse Quantization; Inverse DCT; Motion Compensation; Frame Store Memory; adder (+); Decoded Pels]

    H.264 encoder

    Transcoding Algorithm

    Intra frame coding

                                MPEG-2                                H.264
    Macroblock modes supported  8x8                                   16x16 with 4 directional modes, 4x4 with 9 directional modes
    Type of intra prediction    Fixed prediction of D.C. coefficient  Adaptive directional prediction of 4x4 or 16x16 pixel blocks
    Transform                   8x8 DCT                               4x4 integer transform
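
    To make the "4x4 integer transform" entry concrete, here is a minimal sketch of the H.264 forward core transform using NumPy. The function name and the flat test block are illustrative assumptions, not taken from the slides. The scaling that normally follows the core transform is folded into quantization and is left out here.

```python
import numpy as np

# Forward core matrix of the H.264 4x4 integer transform (integer entries only).
CF = np.array([[1,  1,  1,  1],
               [2,  1, -1, -2],
               [1, -1, -1,  1],
               [1, -2,  2, -1]], dtype=np.int64)


def forward_core_transform_4x4(block):
    """Return CF @ X @ CF.T for a 4x4 block of residual samples."""
    x = np.asarray(block, dtype=np.int64)
    if x.shape != (4, 4):
        raise ValueError("expected a 4x4 block")
    return CF @ x @ CF.T


# Illustrative input: a flat block puts all of its energy in the DC position.
flat = np.full((4, 4), 5)
coeffs = forward_core_transform_4x4(flat)
print(coeffs)
```

    Because every entry of the matrix is a small integer, the transform needs only additions and shifts, which is one reason the 8x8 DCT coefficients of MPEG-2 cannot simply be copied into the H.264 stream.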

    Intra Frame Transcoding

    [Block diagram. Labels: MPEG-2 bit stream; VLC Decode; Inverse Quantize; IDCT; Mode Decision; Spatial Prediction; subtractor (+/-); 4x4 Integer Transform; Quantize; Entropy Coding; H.264 bitstream]

    Complexity of applying mode decisions in transform domain

    Example: Vertical prediction

    Predicted block=
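
    For comparison, forming the predicted block in the pixel domain is only a copy operation. The sketch below shows three of the nine H.264 intra 4x4 modes (0 vertical, 1 horizontal, 2 DC) under the assumption that all neighbouring pixels are available. The function names and sample values are hypothetical and are not the transform-domain expression from the slide.

```python
import numpy as np


def predict_vertical(top):
    """Mode 0: every row of the 4x4 prediction repeats the 4 pixels above the block."""
    return np.tile(np.asarray(top, dtype=np.int64), (4, 1))


def predict_horizontal(left):
    """Mode 1: every column repeats the 4 pixels to the left of the block."""
    return np.tile(np.asarray(left, dtype=np.int64).reshape(4, 1), (1, 4))


def predict_dc(top, left):
    """Mode 2: all 16 samples take the rounded mean of the 8 neighbouring pixels."""
    dc = (int(np.sum(top)) + int(np.sum(left)) + 4) >> 3
    return np.full((4, 4), dc, dtype=np.int64)


# Illustrative data: a block made of vertical stripes that continue the row above it.
top = [10, 20, 30, 40]
left = [10, 10, 10, 10]
block = np.tile(np.array(top, dtype=np.int64), (4, 1))
residual = block - predict_vertical(top)  # all zeros for this block
```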

    Intra modes in H.264/AVC

    [Figure: Directional modes for an intra 4x4 macroblock, with prediction directions labelled 0, 1, 3, 4, 5, 6, 7, 8]

    [Figure: Directional modes for an intra 16x16 macroblock]

    Directional prediction

    Mode decision algorithm

    Why use standard deviation?

    Simple metric

    Can be easily computed, as the transform domain coefficients are already available.
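
    One way to see why the coefficients are enough: for an orthonormal 2-D DCT, the energy of the AC coefficients equals the sum of squared deviations of the pixels from their mean (Parseval's relation). The sketch below assumes dequantized 8x8 coefficients with orthonormal scaling. The helper names and the random test block are illustrative, and the slides do not state that the author computed the metric this way.

```python
import numpy as np


def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m


def std_from_dct(coeffs):
    """Standard deviation of a pixel block, computed only from its DCT coefficients."""
    c = np.asarray(coeffs, dtype=np.float64)
    ac_energy = np.sum(c ** 2) - c[0, 0] ** 2  # drop the DC term
    return float(np.sqrt(max(ac_energy, 0.0) / c.size))


# Illustrative check against the pixel-domain value.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(8, 8)).astype(np.float64)
D = dct_matrix(8)
coeffs = D @ pixels @ D.T
print(std_from_dct(coeffs), pixels.std())
```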

    Post mode decision

    Intra Frame transcoding results

    Subjective Quality of Intra frames

    MPEG-2 input stream. Test clip: Akiyo. Bit rate: 1 Mbps. Spatial resolution: 352x240.

    H.264 bit stream obtained by the proposed method. Bit rate: 768 Kbps.

    H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream. Bit rate: 670 Kbps.

    Subjective Quality of Intra frames

    MPEG-2 input bitstream. Test clip: Foreman. Bit rate: 1 Mbps. Spatial resolution: 352x240.

    H.264 bit stream obtained by the proposed method. Bit rate: 1 Mbps.

    H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream. Bit rate: 1 Mbps.

    Inter frame coding

                                         MPEG-2                   H.264
    MC prediction with ¼ pel accuracy    No, only ½ pel accuracy  Yes
    MC modes                             16x16                    16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4
    Multiple reference prediction        No                       Yes
    Direct modes in B frames             No                       Yes
    Use of B frames as reference frames  No                       Allowed, can be selected by the user

    Inter frame transcoding

    [Block diagram. Labels: MPEG-2 bitstream; VLD; Inverse quantise; IDCT; Sum residuals; MV refinement; Hierarchical mode decisions; Pass parameters; Inter prediction; 4x4 integer transform; Quantise; VLC/CABAC; Inverse; Inverse quantise; Motion compensate; Store as reference frame; Rate control; H.264 bit stream]

    Inter frame transcoding

    Features:

    Motion vector extraction

    Motion vector refinement

    Motion vector reuse

    Hierarchical mode decision

    Motion vector extraction

    Motion vectors can be extracted from the MPEG-2 bit stream after variable length decoding.

    Need for motion vector refinement

    Need:

    The quantization parameters of the incoming bit stream and those selected for the output may differ. When these differences are large, the result is quality degradation.

    MPEG-2 supports certain modes in which no motion information is coded. However, since H.264 supports finer motion estimation block sizes, a small amount of motion may result upon refinement.

    Re-evaluation of the decision to intra code macroblocks in a P frame.

    Improves the accuracy of the motion vectors and helps achieve compatibility between ½ pel MV accuracy in MPEG-2 and ¼ pel MV accuracy in H.264.

    Compensates for field coding to frame coding changes and vice versa

    Motion vector refinement

    MPEG-2 motion vectors are refined over a one pixel window i.e. dx = dy = 3 pixels, in the most recent reference frame in List 0.

    Half pixel and quarter pixel refinement is performed with the defined window.
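
    A minimal sketch of the integer-pel part of such a refinement, assuming the one pixel window means offsets of -1, 0 and +1 around the MPEG-2 vector, and assuming SAD as the matching cost. The function names, the 16x16 block size and the synthetic frames are illustrative. The half and quarter pixel steps would need the H.264 interpolation filters and are not shown.

```python
import numpy as np


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(a.astype(np.int64) - b.astype(np.int64)).sum())


def refine_mv(cur, ref, top, left, mv, radius=1, size=16):
    """Test integer offsets within +/-radius of mv=(dy, dx); return (best_mv, best_sad)."""
    block = cur[top:top + size, left:left + size]
    best_mv, best_cost = None, None
    for dy in range(mv[0] - radius, mv[0] + radius + 1):
        for dx in range(mv[1] - radius, mv[1] + radius + 1):
            y, x = top + dy, left + dx
            # Skip candidates that would read outside the reference frame.
            if y < 0 or x < 0 or y + size > ref.shape[0] or x + size > ref.shape[1]:
                continue
            cost = sad(block, ref[y:y + size, x:x + size])
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost


# Synthetic frames: the current frame is the reference shifted down 2 and right 3,
# so the true vector for interior blocks is (-2, -3).
rng = np.random.default_rng(1)
ref = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(2, 3), axis=(0, 1))

# Pretend the vector extracted from the MPEG-2 stream is off by one pixel in each direction.
best_mv, best_sad = refine_mv(cur, ref, top=16, left=16, mv=(-1, -2))
print(best_mv, best_sad)
```

    Only nine candidates are tested per block, compared with the hundreds a full search would evaluate.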

    Search window size (dx,dy) selection

    Before window size selection, different increasing window sizes were tested to verify the effect of varying the search window size on the PSNR.

    The graph for one such test clip, Akiyo, is as follows.

    It was observed that the PSNR obtained for a one pixel window closely approximated the steady state value, and using a one pixel window provided a good tradeoff between complexity and the PSNR.

    [Figure: Frame window size test (clip Akiyo). x-axis: search window size (dx, dy), ticks 0 to 12; y-axis: PSNR (dB), ticks 42.575 to 42.615]

    Motion vector reuse

    Hierarchical mode decisions

    Coding modes are compared and selected based on the sum of absolute differences (SAD) value. In the full mode decision method, every coding mode is evaluated, the SAD value is computed, and the mode with the minimum SAD value is selected as the best mode. Although this method would give the best results, it is very computationally intensive. For instance, each macroblock in a P frame would have to be evaluated for one 16x16, two 16x8, two 8x16, four 8x8, eight 8x4, eight 4x8 and sixteen 4x4 partitions, plus the intra and skip modes.

    The hierarchical mode decision process makes use of the fact that, after evaluating a mode and the next level of sub-partitioned modes, if sub-partitioning does not reduce the SAD value, then further sub-partitioning need not be evaluated.
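
    The early-termination idea can be sketched as follows. The grouping of partitions into levels, the function name and the SAD figures are assumptions made up for illustration. A real encoder decides sub-8x8 partitions separately for each 8x8 block, which this simplified version ignores.

```python
# Partition levels from coarse to fine (an assumed grouping, for illustration only).
LEVELS = [["16x16"], ["16x8", "8x16"], ["8x8"], ["8x4", "4x8"], ["4x4"]]


def hierarchical_mode_decision(sad_of, levels=LEVELS):
    """Return (best_mode, best_sad, evaluated), stopping once a finer level does not lower the SAD."""
    best_mode, best_sad = None, None
    evaluated = []
    for level in levels:
        level_mode, level_sad = None, None
        for mode in level:
            cost = sad_of(mode)
            evaluated.append(mode)
            if level_sad is None or cost < level_sad:
                level_mode, level_sad = mode, cost
        if best_sad is not None and level_sad >= best_sad:
            break  # sub-partitioning did not help, so finer levels are skipped
        best_mode, best_sad = level_mode, level_sad
    return best_mode, best_sad, evaluated


# Made-up SAD values for one macroblock.
example_sads = {"16x16": 900, "16x8": 700, "8x16": 750,
                "8x8": 720, "8x4": 500, "4x8": 510, "4x4": 480}
mode, cost, tried = hierarchical_mode_decision(example_sads.__getitem__)
print(mode, cost, tried)
```

    In this made-up case the search stops after the 8x8 level, so the 8x4, 4x8 and 4x4 partitions are never evaluated even though they have lower SAD values. This shows the kind of quality risk that is traded for the saving in computation.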

    The top down splitting approach is shown below.

    P frame transcoding Results

    P frame results

    (see previous slide)

    Motion vectors in P frames

    MPEG-2 input stream. Test clip: Akiyo. Bit rate: 1 Mbps. Spatial resolution: 352x240.

    H.264 bit stream obtained by the proposed method. Bit rate: 768 Kbps.

    H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream. Bit rate: 670 Kbps.

    [Figure panels: MPEG-2 motion vectors; H.264 motion vectors after transcoding; H.264 motion vectors after full motion search]

    B frame transcoding results

    [Figure: Comparison of PSNR values for transcoding with and without the hierarchical mode decision for the test clip Akiyo (spatial resolution: 352x240). x-axis: bit rate (kbps), ticks 0 to 3000; y-axis: PSNR (dB), ticks 0 to 60. Series: proposed method; proposed method without hierarchical mode decision]

    [Figure: Comparison of execution time for the proposed method with and without hierarchical mode decision for the test clip Akiyo (spatial resolution: 352x240). x-axis: bit rate (kbps), ticks 0 to 3000; y-axis: execution time (ms), ticks 0 to 7000. Series: proposed method; proposed method without hierarchical mode decision]

    Motion vectors in B frames

    MPEG-2 input stream. Test clip: Akiyo. Bit rate: 1 Mbps. Spatial resolution: 352x240.

    H.264 bit stream obtained by the proposed method. Bit rate: 768 Kbps.

    H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream. Bit rate: 670 Kbps.

    [Figure panels: Forward motion vectors in MPEG-2; Backward motion vectors in MPEG-2; Forward motion vectors in the H.264 transcoded bit stream; Backward motion vectors in the H.264 transcoded bit stream; Forward motion vectors in the H.264 transcoded bit stream; Backward motion vectors in the H.264 transcoded bit stream]

    Mode decisions in B frames

    MPEG-2 input bitstream. Test clip: Akiyo. Bit rate: 1 Mbps. Spatial resolution: 352x240.

    H.264 bit stream obtained by the proposed method. Bit rate: 768 Kbps.

    H.264 bit stream obtained from complete decoding and re-encoding of the input MPEG-2 bit stream. Bit rate: 670 Kbps.

    [Figure panels: 16x16 modes in the MPEG-2 bit stream; 16x16 and sub 16x16 modes in the H.264 transcoded bit stream; 16x16 and sub 16x16 modes in the H.264 re-encoded bit stream]

    Comparison of the Input MPEG-2 bit stream vs. the transcoded H.264 bit stream

    The table below illustrates the comparison between the PSNR of the input MPEG-2 bit stream and the PSNR of the transcoded H.264 bit stream obtained by transcoding 35 frames at 1 Mbps with the IBBPBBP GOP structure.
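
    For readers unfamiliar with the metric, PSNR for 8-bit video is conventionally computed per frame as 10 log10(255^2 / MSE). The sketch below assumes luma frames held as NumPy arrays. The function name and the synthetic frames are illustrative and do not reproduce any measurement from the slides.

```python
import numpy as np


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two frames of equal size."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - tst) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return float(10.0 * np.log10(peak ** 2 / mse))


# Synthetic 352x240 frames: every pixel of the decoded frame is off by one grey level.
original = np.full((240, 352), 128, dtype=np.uint8)
decoded = original + 1
print(psnr(original, decoded))  # MSE is exactly 1 here
```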

    Comparison of the proposed method with the DCT domain transcoder proposed by Chang and Messerschmitt [23]

    [Figure: Comparison of the proposed method (PM), complete decoding and re-encoding (CDRE) and DCT domain transcoding (DDT). x-axis: frame number, ticks 0 to 30; y-axis: PSNR (dB), ticks 30 to 44. Series: PM, DDT, CDRE]

    The graph shown compares the proposed method with DCT domain transcoding [23] and complete decoding and re-encoding of a 1 Mbps MPEG-2 bit stream (test clip Foreman) to an H.264 bit stream with an IBBPBBP GOP structure at a constant bit rate.

    Proposed method transcoded stream

    Proposed method

    Full re-encoding

    References

    [1] J. Youn and M-T. Sun, Motion Vector Refinement for high-performance transcoding, in IEEE Int. Conf. Consumer Electronics, Los Angeles, CA, Vol. 1, Issue 1, pp. 30-40, March 1999.

    [2] J. Xin, C-W. Lin and M-T. Sun, Digital Video Transcoding, Proceedings of the IEEE, Vol. 93, pp. 84-97, Jan. 2005.

    [3] T. Wiegand et al., Overview of the H.264/AVC Video Coding Standard, IEEE Trans. CSVT, Vol. 13, pp. 560-576, July 2003.

    [4] A. Vetro, C. Christopoulos and H. Sun, Video transcoding architectures and techniques: an overview, IEEE Signal Processing Magazine, Vol. 20, pp. 18-29, March 2003.

    [5] H. Kalva, Issues in H.264/MPEG-2 Video Transcoding, IEEE Consumer Communications and Networking Conf., CCNC 2004, pp. 657-659, Jan. 2004.

    [6] Information Technology - Generic coding of moving pictures and associated audio information: Video, ITU-T Rec. H.262 (2000 E).

    [7] B. Haskell, A. Puri and A. Netravali, Digital Video: an introduction to MPEG-2, N.Y.: Chapman and Hall, International Thomson Pub., 1997.

    [8] G. Chen et al., Efficient block size selection for MPEG-2 to H.264 transcoding, Proceedings of the 12th annual ACM International Conference on Multimedia, pp. 300-303, Oct. 2004.

    [9] MPEG-2 software (version 12) from MPEG software simulation group, http://www.mpeg.org/MPEG/MSSG/#source

    [10] H.264 Software (JM9.5) from http://iphome.hhi.de/suehring/tml/download/jm94.zip

    [11] A. Puri, X. Chen and A. Luthra, Video coding using the H.264/MPEG-4 AVC compression standard, Signal Processing: Image Communication, Vol. 19, pp. 793-849, Oct. 2004.

    [12] B. Shen and I. Sethi, Direct feature extraction from compressed images, SPIE Vol. 2670: Storage and Retrieval for Image Databases IV, pp. 404-414, 1996.

    [13] Commercially available transcoders, PSP Video 9, http://www.pspvideo9.com

    [14] K.R. Rao and J. J. Hwang, Techniques and Standards for Image, Video and Audio coding, Upper Saddle River, N.J.: Prentice Hall, 1996.

    [15] M. Ghanbari, Video Coding: an introduction to standard codecs, London, U.K.: Institution of Electrical Engineers, 1999.

    [16] I. E. G. Richardson, H.264 and MPEG-4 video compression: video coding for next generation multimedia, Chichester: Wiley, 2003.

    [17] Test streams obtained from ftp://ftp.tek.com/tv/test/streams/Element/MPEG-Video/525/ and http://www.cipr.rpi.edu/resource/sequences/sif.html

    [18] Y-J. Chuang, Y-C. Huang and J-L. Wu, An efficient block algorithm for splitting an 8x8 DCT into four 4x4 modified DCT used in AVC/H.264, EURASIP 2005, pp. 311-316.

    [19] P. Assunção and M. Ghanbari, Post Processing of MPEG-2 coded video for transmission at lower bit rates, Proc. IEEE ICASSP, pp. 1998-2001, Atlanta, GA, 1996.

    [20] T. Shanableh and M. Ghanbari, Transcoding Architectures for DCT domain heterogeneous video transcoding, Proc. IEEE ICIP, Vol. 1, pp. 433-436, Thessaloniki, Greece, Sept. 2001.

    [21] J. Xin, M.T. Sun and K. Chun, Motion re-estimation for MPEG-2 to MPEG-4 simple profile transcoding, Proc. Int. Workshop Packet Video, Pittsburgh, PA, Apr. 2002.

    [22] D-Y. Chan, S-J. Lin and C-Y. Chang, A rate control scheme using Kalman filtering for H.263, Journal of Visual Communication and Image Representation, Vol. 16, pp. 734-748, Dec. 2005.

    [23] S. Liu and A. Bovik, Foveated embedded DCT domain video transcoding, Journal of Visual Communication and Image Representation, Vol. 16, pp. 643-667, Dec. 2005.

    [24] I. E. G. Richardson, Video codec design: developing image and video compression systems, Chichester: Wiley, 2002.

    [25] G. Sullivan, T. Wiegand and A. Luthra, Draft of Version 4 of H.264/AVC (ITU-T Recommendation H.264 and ISO/IEC 14496-10 (MPEG-4 part 10) Advanced Video Coding), JVT Doc., 14th Meeting: Hong Kong, China, 18-21 Jan. 2005.

    [26] G. F-Escribano et al., Computational complexity reduction of intra frame prediction in MPEG2/H.264 video transcoders, ICME, pp. 707-710, July 2005.
