1 Video Compression. 2 Video Compression Standards JPEG: ISO and ITU-T for compression of still...
1
Video Compression
2
Video Compression Standards
  JPEG: ISO and ITU-T, for compression of still images
  Moving JPEG (MJPEG)
  H.261: ITU-T SG XV, for audiovisual services at p x 64 kbps
  MPEG-1, 2, 4, 7: ISO/IEC JTC1/SC29/WG11, for compression of combined video and audio
  H.263: ITU-T SG XV, for videophone at bit rates below 64 kbps
  JBIG: ISO, for compression of bilevel images
  Non-standardized techniques
    DVI: de facto standard from Intel for storage compression and real-time decompression
    QuickTime: Macintosh
3
Frame/Picture Types
  I frame: intra-coded frame
    Provides points for random access
    Used as a reference for coding other frames
    Coded as in JPEG, except that the quantization threshold values are the same for all DCT components
  P frame: predictively coded frame
    Based on the reference frame (previous I or P frame)
  B frame: bidirectionally predictively coded frame
    Based on the previous and following I and/or P frames
  D frame: DC-coded frame
    Intra-coded frame, neglecting AC coefficients
    Used for fast-forward and rewind modes
4
Group of Pictures (GOP) Structure
5
Display and Transmission Order
  Transmission order and display order may differ
  Reference frames must be transmitted first

  Display order:      1 2 3 4 5 6 7 8 9
                      I B B B P B B B I
  Transmission order: 1 5 2 3 4 9 6 7 8
                      I P B B B I B B B

  (Figure: arrows mark forward prediction and bidirectional prediction.)
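The reordering on this slide can be reproduced with a short sketch. It assumes one simple rule: each reference frame (I or P) is sent as soon as it is reached, followed by the B frames that precede it in display order. The function name `transmission_order` is made up for this illustration.

```python
def transmission_order(display_types):
    """Reorder display-order frame types ('I', 'P', 'B') so that every
    B frame is sent after both of its reference frames."""
    order = []       # (display_index, frame_type) in transmission order
    pending_b = []   # B frames waiting for their following reference
    for index, ftype in enumerate(display_types, start=1):
        if ftype == "B":
            pending_b.append((index, ftype))
        else:
            # An I or P frame is a reference: send it first, then the
            # B frames that are displayed before it.
            order.append((index, ftype))
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)  # trailing B frames with no later reference
    return order


sent = transmission_order(list("IBBBPBBBI"))
print([i for i, _ in sent])           # [1, 5, 2, 3, 4, 9, 6, 7, 8]
print(" ".join(t for _, t in sent))   # I P B B B I B B B
```

A real encoder would also need to handle a sequence that ends in B frames, which this sketch simply appends at the end.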
6
Motion Estimation and Compensation
  Macroblock: the motion compensation unit
  Motion estimation
    Extracts the motion information from a video sequence
    Motion information
      One motion vector for a forward-predicted macroblock
      Two motion vectors for a bidirectionally predicted macroblock
  Motion compensation
    Reconstructs an image using blocks from the previous image along with the motion information, i.e., the motion vectors
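As an illustration of the two steps, here is a minimal sketch of forward motion estimation by exhaustive block matching, followed by motion compensation. The slides do not name a matching criterion or search strategy; the sum of absolute differences (SAD), the full search, the 4 x 4 block size, and the +/-2 pixel search range are all assumptions chosen to keep the example small.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (cx, cy) and the n x n block of `ref` at (rx, ry)."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def estimate_motion(cur, ref, cx, cy, n=4, search=2):
    """Full search: return (dx, dy, cost) of the best match in `ref`
    for the block of `cur` whose top-left corner is (cx, cy)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def compensate(ref, cx, cy, dx, dy, n=4):
    """Rebuild a block from the reference frame and a motion vector."""
    return [row[cx + dx:cx + dx + n] for row in ref[cy + dy:cy + dy + n]]


# Reference frame, and a current frame whose content moved right by 1 pixel.
ref = [[y * 8 + x for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

dx, dy, cost = estimate_motion(cur, ref, 2, 2)
print(dx, dy, cost)  # -1 0 0
```

The motion vector points from the block in the current frame to its match in the reference frame, so content that moved right yields a negative horizontal component.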
7
Implementation Issues
  For P-frames, the encoding of each macroblock (MB) depends on the output of the motion estimation unit
    If the two contents are the same, only the address of the MB in the reference frame is encoded
    If they are very close, both the motion vector and the difference matrices are encoded
    If no close match is found, the MB is encoded in the same way as in an I-frame
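The three cases above amount to a small decision rule. The sketch below assumes the motion estimation unit reports a matching error for its best candidate, and it uses a made-up threshold (`close_threshold`) to decide what counts as "very close"; neither the function name nor the threshold value comes from a standard.

```python
def choose_mb_mode(best_cost, close_threshold=512):
    """Pick how a P-frame macroblock is coded from its best-match cost.

    best_cost: matching error of the best candidate found by motion
    estimation (for example a sum of absolute differences).
    close_threshold: hypothetical tuning value for "very close".
    """
    if best_cost == 0:
        # Identical contents: only the MB address in the reference frame.
        return "address_only"
    if best_cost <= close_threshold:
        # Close match: motion vector plus difference matrices.
        return "vector_and_difference"
    # No close match: code the MB as in an I-frame.
    return "intra"


for cost in (0, 300, 5000):
    print(cost, choose_mb_mode(cost))
```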
8
Implementation Schematics
  Bitstream format
9
Performance
  I-frame: similar to JPEG, 10:1 – 20:1
  P-frame: 20:1
  B-frame: 30:1 – 50:1
10
Video Compression: H.261
11
H.261 Overview
  ITU-T standard for the compression/decompression of digital video (1990)
  Intended to facilitate video conferencing and video phone over ISDN at a rate of p x 64 kbps; p = 1, 2, ..., 30
  Real-time encoding-decoding (150 ms)
  Low-cost VLSI implementation
12
Picture Preparation
  An image: 3 rectangular matrices (components)
    Luminance Y
    Chrominance Cb (blue), Cr (red)
    4:1:1 format
  Image formats
    CIF (Common Intermediate Format): 352 x 288
      Used for video conferencing
      30 fps, progressive scanning
    QCIF (Quarter CIF): 176 x 144
      Used for video telephony
      15 / 7.5 fps, progressive scanning
    QCIF is mandatory; CIF is optional
  Bandwidth requirement of CIF at 15 fps
    Y = 352 x 288 x 8 bits/pixel x 15 frames/s
    Cb + Cr = 2 x 1/4 x Y
    Total: about 18.2 Mbps
    Needs roughly 50:1 compression for transmission at 384 kbps (p = 6)
  I- and P-frames are used in H.261
    3 P-frames between each pair of I-frames
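The CIF bandwidth arithmetic can be checked directly. The sketch below only restates the figures given on the slide (352 x 288, 8 bits per sample, 15 frames/s, each chrominance component at a quarter of the luminance rate, p = 6); the variable names are illustrative.

```python
width, height = 352, 288   # CIF luminance resolution
bits_per_sample = 8
fps = 15

luma_bps = width * height * bits_per_sample * fps   # Y component
chroma_bps = 2 * luma_bps // 4                      # Cb + Cr = 2 x 1/4 x Y
total_bps = luma_bps + chroma_bps

channel_bps = 6 * 64_000                            # p x 64 kbps with p = 6
ratio = total_bps / channel_bps

print(f"uncompressed: {total_bps / 1e6:.2f} Mbps")  # uncompressed: 18.25 Mbps
print(f"required compression: {ratio:.1f}:1")       # required compression: 47.5:1
```

The required ratio comes out just under 50:1 before any allowance for audio or framing overhead on the same channel.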
13
H.261 Encoding Format
  (Figures: frame format, macroblock format, GOB structure)
14
H.261 Video Encoder
15
Entropy Encoding
  Run-length encoding: (run, amplitude) pairs
  Huffman encoding
    Huffman tables are predefined by the H.261 standard
      Table for motion vectors
      Table for quantized DCT coefficients
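To make the (run, amplitude) idea concrete, here is a minimal sketch that turns a list of quantized coefficients, assumed to be already in zig-zag scan order, into pairs where `run` is the number of zeros preceding each nonzero `amplitude`. The `"EOB"` end-of-block marker and the function name are simplifications for this example; the actual standard codes these pairs with its predefined Huffman tables and treats the DC coefficient of intra blocks separately.

```python
def run_length_encode(coefficients):
    """Encode zig-zag ordered coefficients as (run, amplitude) pairs.

    run is the count of zeros before each nonzero amplitude; trailing
    zeros are replaced by a single "EOB" (end of block) marker.
    """
    pairs = []
    run = 0
    for value in coefficients:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")
    return pairs


block = [12, 0, 0, -3, 5, 0, 0, 0, 1, 0, 0, 0]
print(run_length_encode(block))
# [(0, 12), (2, -3), (0, 5), (3, 1), 'EOB']
```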
16
Video Compression: H.263
17
H.263
  Low-bit-rate standard for teleconferencing applications
    Optimizes H.261 to operate below 64 kbps or over a V.34 modem
    2.5 times more compression than H.261
  An extension of H.261
    2 image formats extended to 5 image formats
    Motion-compensated prediction has been refined
    Supports B frames (with only P frames as reference)
  Used in IETF RTSP (Real Time Streaming Protocol)
  Used in RealPlayer G2
18
Picture Preparation
  Digitization formats
    QCIF (Quarter CIF): 176 x 144
      Used for video telephony
      15 / 7.5 fps, progressive scanning
    Sub-QCIF (S-QCIF): 128 x 96
      15 / 7.5 fps, progressive scanning
  Frame types
    I, P, B frames
19
Picture Processing
  Unrestricted motion vectors
    For those pixels of a potential close-match MB that fall outside the frame boundary, the edge pixels themselves are used instead
    If the resulting MB produces a close match, the motion vector is allowed, if necessary, to point outside the frame area
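One way to picture the edge-pixel rule is to clamp every coordinate to the frame before reading the reference picture, so a candidate block that extends past the boundary is filled with the nearest edge pixels. This is a sketch of that idea only; the function name `fetch_block` and the tiny 4 x 4 frame are assumptions for the example.

```python
def fetch_block(ref, x, y, n):
    """Read an n x n block of `ref` with top-left corner (x, y).

    Coordinates outside the frame are clamped, so pixels beyond the
    boundary are replaced by the nearest edge pixel.
    """
    height, width = len(ref), len(ref[0])

    def clamp(value, upper):
        return max(0, min(value, upper))

    return [
        [ref[clamp(y + j, height - 1)][clamp(x + i, width - 1)] for i in range(n)]
        for j in range(n)
    ]


ref = [[row * 4 + col for col in range(4)] for row in range(4)]
# The block starts one pixel above and to the left of the frame.
print(fetch_block(ref, -1, -1, 3))
# [[0, 0, 1], [0, 0, 1], [4, 4, 5]]
```

With this kind of access in place, a block search can evaluate candidates whose motion vectors point outside the frame area.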
20
Error Resilience
  The target network for H.263 is a wireless network or the PSTN, with a relatively high error rate
  Error propagation
    Because of the resulting errors in the motion estimation vectors and motion compensation information, errors within a GOB may propagate to other regions of the frame
  To minimize error propagation
    Error tracking
    Independent segment decoding
    Reference picture selection
21
Error Tracking
  Error detection methods
    Out-of-range motion vectors
    Invalid variable-length codewords
    Out-of-range DCT coefficients
    Excessive number of coefficients within an MB
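The four detection methods can be expressed as simple checks on a decoded macroblock. The sketch below is illustrative only: the numeric limits, the codeword set, and the function name are assumptions for the example and are not taken from the H.263 specification.

```python
def detect_mb_errors(motion_vector, coefficients, codewords, valid_codewords,
                     mv_range=(-16.0, 15.5), coeff_range=(-2048, 2047),
                     max_coefficients=6 * 64):
    """Return the error conditions found in one decoded macroblock.

    All limits are illustrative defaults (assumed: 6 blocks of 64
    coefficients per macroblock), not values quoted from a standard.
    """
    errors = []
    if any(not mv_range[0] <= c <= mv_range[1] for c in motion_vector):
        errors.append("out-of-range motion vector")
    if any(cw not in valid_codewords for cw in codewords):
        errors.append("invalid variable-length codeword")
    if any(not coeff_range[0] <= c <= coeff_range[1] for c in coefficients):
        errors.append("out-of-range DCT coefficient")
    if len(coefficients) > max_coefficients:
        errors.append("excessive number of coefficients")
    return errors


table = {"0", "10", "110"}  # toy codeword table for the example
print(detect_mb_errors((1.5, -2.0), [10, -3], ["10", "110"], table))  # []
print(detect_mb_errors((40.0, 0.0), [5000], ["111"], table))
```

A decoder that finds any of these conditions can report the affected GOB so that the encoder stops using it as a reference.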
22
Independent Segment Decoding
  Each GOB is treated as a separate subvideo that is independent of the other GOBs in the frame
  Motion estimation and compensation are limited to the boundary pixels of a GOB rather than of a frame
  Effect of a GOB being corrupted
  Used with error tracking
23
Reference Picture Selection
  NAK mode
  ACK mode
24
Video Compression: MPEG
MPEG
  MPEG (Moving Picture Experts Group)
    ISO/IEC JTC1/SC29/WG11
    Standard for synchronized video and audio
    Consists of System, Video, Audio, …
      System: for multiplexing and synchronization
  MPEG-1
    ISO/IEC 11172
    Intended for the storage of VHS-quality audio-visual information on CD-ROM at bit rates up to 1.5 Mbps
    Video resolution: SIF (up to 352 x 288 pixels)
    Compressed bandwidth: 1.5 Mbps
      About 1.1 Mbps for video, 128 kbps for audio, remainder for system
    Allows random access, fast forward, rewind
  MPEG-2
    Intended for the recording and transmission of studio-quality audio and video
  MPEG-4
    Initially concerned with a similar range of applications to those of H.263, at very low bit rates (4.8 – 64 kbps)
    Later extended to interactive multimedia applications over the Internet and the various types of entertainment networks
  MPEG-7
    Describes the structure and features of the content of (compressed) multimedia information
    Used in search engines
26
MPEG-1
27
MPEG-1 Frames
  Spatial resolution: 352 x 288 pixels (SIF)
  Progressive scanning with a refresh rate of 30 Hz (for NTSC) and 25 Hz (for PAL)
  The standard allows the use of
    I-frames only
    I- and P-frames only
    I-, P-, and B-frames
    No D-frames are supported
  I-frames are used for random-access functions
  Example sequences
    IBBPBBPBBI… for PAL
    IBBPBBPBBPBBI… for NTSC
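Both example sequences follow the same pattern: an I-frame, then a reference frame every third position with B-frames in between. A small sketch can generate them from two parameters, here called `n` (frames from one I-frame to the next) and `m` (spacing between reference frames). These parameter names and the values 9 and 12 are inferred from the sequences shown, not stated on the slide.

```python
def gop_pattern(n, m):
    """Frame types, in display order, for one group of n frames that
    starts with an I-frame and has a reference frame every m frames."""
    return "".join(
        "I" if i == 0 else "P" if i % m == 0 else "B"
        for i in range(n)
    )


print(gop_pattern(9, 3) + "I")    # IBBPBBPBBI      (PAL example)
print(gop_pattern(12, 3) + "I")   # IBBPBBPBBPBBI   (NTSC example)
```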
28
Use of B Frame
29
Overview
  Compression algorithm is based on H.261
    MB: Y plane 16 x 16; Cb, Cr planes 8 x 8
  Differences from H.261
    Time-stamps (temporal references) enable the decoder to resynchronize more quickly in the event of one or more corrupted or missing MBs
    Introduction of B-frames
    The search window in the reference frame is increased
    To improve the accuracy of the motion vectors, a finer resolution is used
  Typical compression ratios
    I-frame: 10:1
    P-frame: 20:1
    B-frame: 50:1
30
31
MPEG System
  MPEG standard
    Video coding
    Audio coding
    System coding
  Timing and synchronization
    Presentation Time Stamps (PTS)
    Decoding Time Stamps (DTS)
    System Clock Reference (SCR)
32
MPEG-1 Video Bitstream Structure
  (Figures: composition, format)
  GOP layer: video coding unit
    The first picture must be an I frame, for editing
  Picture layer: primary coding unit
  Slice layer: resynchronization unit
  Macroblock layer: motion compensation unit
  Block layer: DCT unit
33
MPEG Frame Structure
MPEG-1
MPEG-2
34
Constrained Parameter Set
  Horizontal size <= 720 pels
  Vertical size <= 576 pels
  Total number of macroblocks/picture <= 396
  Total number of macroblocks/second <= 396*25 = 330*30
  Picture rate <= 30 fps
  Bit rate <= 1.86 Mbps
  Decoder buffer <= 376,832 bits
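These limits can be read as a simple conformance check. The sketch below tests a picture format against the limits listed on the slide, assuming 16 x 16 macroblocks and leaving out the decoder buffer limit; the `LIMITS` table layout and the function name are made up for this illustration.

```python
import math

# Limits as listed on the slide (decoder buffer size omitted here).
LIMITS = {
    "horizontal_size": 720,
    "vertical_size": 576,
    "macroblocks_per_picture": 396,
    "macroblocks_per_second": 396 * 25,   # equals 330 * 30
    "picture_rate": 30,
    "bit_rate": 1_860_000,
}


def constraint_violations(width, height, fps, bit_rate):
    """Return the names of the constrained parameters that are exceeded."""
    macroblocks = math.ceil(width / 16) * math.ceil(height / 16)
    measured = {
        "horizontal_size": width,
        "vertical_size": height,
        "macroblocks_per_picture": macroblocks,
        "macroblocks_per_second": macroblocks * fps,
        "picture_rate": fps,
        "bit_rate": bit_rate,
    }
    return [name for name, limit in LIMITS.items() if measured[name] > limit]


print(constraint_violations(352, 288, 25, 1_500_000))  # []
print(constraint_violations(352, 240, 30, 1_500_000))  # []
print(constraint_violations(720, 576, 25, 1_500_000))
# ['macroblocks_per_picture', 'macroblocks_per_second']
```

The last call shows why the size limits cannot both be reached at once: a 720 x 576 picture has far more than 396 macroblocks.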
35
MPEG Encoding Scheme
36
MPEG Decoding Scheme
37
MPEG-2
38
MPEG-2 Video
  Jointly developed by ISO/IEC (IS 13818-2) and ITU-T (H.262)
  Permits data rates up to 100 Mbps
  Supports interlaced video formats
  Supports HDTV; can be used for video over satellite, cable, and other broadband channels
  Backward compatibility with MPEG-1 and H.261
39
MPEG-1 and MPEG-2

Parameter                    MPEG-1                    MPEG-2
Standardized                 1992                      1994
Main application             Digital video on CD-ROM   Digital TV (and HDTV)
Spatial resolution           SIF format (1/4 TV),      TV (4 x TV),
                             360 x 288 pixels          720 x 576 (1440 x 1152)
Temporal resolution          25/30 frames/s            50/60 fields/s (100/120 fields/s)
Bit rate                     1.5 Mbps                  4 Mbps (20 Mbps)
Quality                      VHS                       NTSC/PAL for TV
Compression ratio over PCM   20-30                     30-40
40
MPEG-2 Profile and Levels
41
Main Profile at Main Level (MP@ML)
  Target application: digital TV broadcasting
  Interlaced scanning: 2 fields
    Field mode: suitable for live sports
    Frame mode: suitable for studio-based programs
42
HDTV
  3 standards
    ATV (Advanced Television) in North America
    DVB (Digital Video Broadcasting) in Europe
    MUSE (Multiple Sub-Nyquist Sampling Encoding) in Japan and the rest of Asia
  ITU-R HDTV specification
    16/9 aspect ratio
    1920 samples/line, 1152 (1080 visible) lines/frame
    Interlaced scanning with 4:2:0 format
  ATV standard: Grand Alliance standard
    ITU-R spec + 1280 x 720, 16/9 aspect ratio
    Video compression: MP@HL
    Audio compression: Dolby AC-3
  DVB standard
    4/3 aspect ratio, 1440 x 1152 (1080 visible)
    Video compression: SSP@H1440 (spatially scalable profile)
  MUSE standard
    16/9 aspect ratio, 1920 x 1034
    Video compression: similar to MP@HL
43
MPEG-4
44
Goal of MPEG-4 (1)
  The initial goal was to refine H.261 with a compression ratio 10 times better, but this failed
  Consequently, the focus shifted to the development of a standard for
    Flexible bitstreams that are scalable for receivers with different capabilities, such as resolution
    Extendable configurations that let transmitters download new applications and algorithms into receivers
    Content-based interactivity for multimedia data access, manipulation, and bitstream editing, and for hybrid natural and synthetic data
    Network independence, so that it can be used with any communication network to provide universal accessibility
45
Goal of MPEG-4 (2)
  MPEG-4 standards for
    Multimedia content generation
    Network interface for multimedia transport
    Interactivity for users
  Content-based interactivity
    Defined by the SNHC (Synthetic and Natural Hybrid Coding) group
      Coding for a synthetic human face and body
      Animation of the face and body
      Media integration of text and graphics
      Texture coding for view-dependent applications
      Static and dynamic mesh coding with texture mapping
      Interface for text-to-speech synthesis and synthetic audio
46
AVO: Audio/Visual Object
  Primitive AVOs
    2D fixed background
    Picture of a walking and talking lady without the background
    Voice associated with that person
  Compound AVO
    e.g., an AVO that contains both the audio and visual components of a talking and walking person
  MPEG-4 treats the audiovisual activities and associated operations, including compression, decompression, multiplexing, and synchronization of audiovisual activities, as objects, similar to OOP
    Viewed as a configuration, communication, and instantiation of classes of objects
  VOP (Video Object Plane)
    A video object at any given time
    The video encoder encodes each VOP separately
47
Content-based Video Coding
48
User Interaction
  User interaction operations with the decoded scene, following the design of the scene’s author:
    Changing the viewing/listening point of the scene by navigating through it
    Dragging objects to different positions
    Triggering a sequence of events by clicking on a specific object, including the starting and stopping of a video stream
    Selecting the desired language when multiple language tracks are available
49
Scalability and Accessibility
  MPEG-4 video object coding supports spatial and temporal scalability
    This allows the receiver to decode only a part of a bitstream and reconstruct images or image sequences
    Good for video delivery over multimedia networks with limited bandwidth
    Good for display at limited resolution due to the receiver’s capability
  Universal accessibility to support various communication media
    MPEG-4 provides error robustness and resilience for noisy environments such as mobile networks
    Supports audio and video compression algorithms in error-prone environments at low bit rates (< 64 kbps)
50
Audio Compression
  Compressed using one of several algorithms, depending on the available bit rate of the transmission channel and the sound quality required, e.g.
    G.723.1 (CELP) for interactive multimedia applications over the Internet
    Dolby AC-3 or MPEG Layer 2 for interactive TV applications over entertainment networks
51
MPEG-4 Encoder/Decoder
  (Figures: VOP encoder, MPEG-4 decoder)
52
Error Resilience Techniques
  Use of fixed-length video packets (VP: 188 B) instead of GOBs
  New variable-length coding (VLC) scheme based on reversible VLCs
  (Figures: conventional GOB approach, using fixed-length VPs)
53
Applications of MPEG-4
  Real-time communication systems
  Mobile computing
  Content-based storage and retrieval
  Streaming video on the Internet
  Collaborative scene visualization
  High-quality broadcasting
  Studio and TV post-production
  Interactive movies, travel guides, computer-based teaching, karaoke
54
MPEG-7: Multimedia Content Description Interface
55
Overview
  Description, identification, and access of AV information
  Used to perform a search for AV information
    Search for pictures using characteristics such as color, texture, or shape of objects
  An MPEG-7 description can be attached to any kind of multimedia material, independent of the format of the representation
  Visual description based on
    Color, texture, sketch, 2D and 3D shape, still images, 3D visual data, spatial composition relations, temporal composition information
  Audio description based on
    Frequency contour, frequency profile, prototypical sound, source of sound, stereo, 5.1-channel, or binaural sound
56
MPEG-7 Applications
  Medical diagnosis
  Home shopping
  Search of video and audio databases
  Architecture, interior design
  Multimedia directory services
57
MHEG
58
Overview
  Standardized by ISO/IEC JTC1/SC29/WG12
  Describes how video is displayed, how audio is replayed, and the means by which a user can interact with the ongoing presentation
  Also addresses the multiplatform issue
    Uses ASN.1 for representing data structures
  More functionality than HTML
    Multimedia handling capabilities such as synchronization of streams, replay speed control, and the user’s interaction with stream events
    Uses 3 spatial coordinates and time to synchronize the presentation
59
MHEG Applications
  Video on demand
  Interactive multimedia service
  Interactive TV