
© De Montfort University, 2003

Digital Video

Howell Istance

School of Computing Technology

De Montfort University


Moving pictures…

• We see a sequence of still images as a continuous movement if rate of presentation is greater than critical flicker fusion frequency (about 40 images/sec)

• Film is shot at 24 frames/second but each frame effectively shown twice in projection giving refresh rate of 48 frames/second

• 3 broadcast TV and video standards

– NTSC – US, Canada

– PAL – Europe, Australia

– SECAM – France, Eastern Europe


Interlacing and Frame rates

• For each system, refresh rate is double broadcast frame rate, by showing half of one frame followed by the other half

• NTSC (30 frames/second), PAL and SECAM (25 frames/second)


Image sizes and Data Rates…

• Consider the amount of data needed to represent a sequence of digital images at NTSC broadcast rate, if 3 bytes are used to represent each pixel value

– 640 * 480 * 3 = 900 KB per image

– 1 second at 30 frames/second = 26 MB

– 1 minute at 30 frames/second = 1.5 GB

• For PAL/SECAM, similar

– 768 * 576 * 3 = 1,296 KB per image

– 1 second at 25 frames/second = 31 MB

– 1 minute at 25 frames/second = 1.85 GB

• Data (transfer) rate = data amount per unit time (e.g. 26 MB/sec for NTSC, 31 MB/sec for PAL)
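The arithmetic on this slide can be reproduced with a few lines of Python. This is only a sketch: the function names are illustrative, and it assumes that K, M and G mean multiples of 1024, which is the reading that matches the 900K figure above.

```python
KB, MB, GB = 1024, 1024 ** 2, 1024 ** 3  # assumption: binary multiples


def frame_bytes(width, height, bytes_per_pixel=3):
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


def data_rate(width, height, fps, bytes_per_pixel=3):
    """Uncompressed data rate in bytes per second."""
    return frame_bytes(width, height, bytes_per_pixel) * fps


# Frame sizes and frame rates as quoted on the slide.
systems = {"NTSC": (640, 480, 30), "PAL/SECAM": (768, 576, 25)}

for name, (width, height, fps) in systems.items():
    per_second = data_rate(width, height, fps)
    print(
        f"{name}: {frame_bytes(width, height) / KB:.0f} KB per frame, "
        f"{per_second / MB:.1f} MB per second, "
        f"{per_second * 60 / GB:.2f} GB per minute"
    )
```

The slide figures are these values rounded down to whole megabytes per second.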


Sites for digitisation and compression

• Camera

– Pass digital data to computer via high-speed interface (IEEE 1394, FireWire); low-grade cameras use USB interface

– +ve: digital data resistant to effect of noise during transfer

– -ve: no user control over compression/quality tradeoff

• Computer

– Analogue data from camera passed to video capture card

– Data susceptible to noise, reduces compression efficiency

– User has greater control over compression

• Playback after decompression via external monitor

• Compressor/decompressor = ‘codec’


Analogue Broadcast Standards

• Field is each interlaced half of a frame

• Fields transmitted at mains (grid) frequency

– NTSC (US) 60 Hz – 30 frames/second

– PAL and SECAM (Europe) 50 Hz – 25 frames/second

• Adding colour to NTSC signal interfered with audio so correction factor applied (1000/1001)

• NTSC field rate = 60 * 1000/1001 = 59.94, giving frame rate of 29.97

• Playback of video on computer monitor not interlaced, lines from each field written into a frame buffer, top to bottom (progressive scanning)

• fast monitor refresh enables lower frame rate without flicker
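As a rough illustration of fields and of writing them back into a frame buffer, the sketch below splits a frame into two fields and weaves them together again. A frame is modelled as a plain list of scan lines; which field counts as the first one is an assumption made here, and the function names are hypothetical.

```python
def split_fields(frame):
    """Split a frame (list of scan lines) into its two interlaced fields.

    Assumption: the first field holds lines 0, 2, 4, ... and the
    second field holds lines 1, 3, 5, ...
    """
    return frame[0::2], frame[1::2]


def weave(first_field, second_field):
    """Write the lines of both fields into one frame buffer, top to bottom."""
    frame = []
    for upper, lower in zip(first_field, second_field):
        frame.append(upper)
        frame.append(lower)
    # With an odd number of lines the first field has one line more.
    if len(first_field) > len(second_field):
        frame.append(first_field[-1])
    return frame


lines = [f"line {n}" for n in range(6)]
first, second = split_fields(lines)
print(first)   # the even-numbered lines
print(second)  # the odd-numbered lines
print(weave(first, second) == lines)
```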


Line/field rates

• NTSC 525 lines/frame, 45 lines contain synchronisation data, 480 picture data

– Represented as 525/59.94

• PAL/SECAM 625 lines/frame, 49 lines contain synchronisation data, 576 picture data

– Represented as 625/50

• Mapping film footage shot at 24 frames/second to video

– NTSC uses 3:2 pulldown; PAL shows 24 frames in 24/25 seconds
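A minimal sketch of 3:2 pulldown, assuming the usual pattern in which successive film frames are held for 3 fields and 2 fields alternately, so that 4 film frames fill 10 fields (5 interlaced video frames). The function names are illustrative only.

```python
from itertools import cycle


def pulldown_3_2(film_frames):
    """Repeat film frames alternately for 3 and 2 fields."""
    fields = []
    for frame, repeats in zip(film_frames, cycle((3, 2))):
        fields.extend([frame] * repeats)
    return fields


def fields_to_frames(fields):
    """Pair consecutive fields into interlaced video frames."""
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields) - 1, 2)]


fields = pulldown_3_2("ABCD")
print(fields)
print(fields_to_frames(fields))

# One second of film (24 frames) fills 60 fields, i.e. 30 video frames.
print(len(pulldown_3_2(range(24))))
```

Some of the resulting video frames mix fields from two different film frames, which is the price paid for fitting 24 frames into a nominal 30.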


Colour Model

• Originally required means of transmitting colour signal which could be ignored by black and white TV receivers

• Separate ‘brightness’ of an image element from its colour

• Y (luminance) = 0.2125R + 0.7154G + 0.0721B

• U = (weighting factor) * (B – Y)

• V = (weighting factor) * (R – Y)

• Analogue TV uses Y’UV

– 3 signals combined into 1 composite signal

• Digital TV uses Y’CbCr: same idea, different weights
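The separation of brightness from colour can be sketched as below. The luminance weights are the ones given on the slide; the slide does not state the U and V weighting factors, so the defaults used here (0.492 and 0.877, values often quoted for analogue Y’UV) are an assumption and are left as parameters.

```python
def rgb_to_yuv(r, g, b, u_weight=0.492, v_weight=0.877):
    """Convert one RGB pixel (components in 0..1) to Y, U, V.

    u_weight and v_weight are assumed values, not taken from the slide.
    """
    y = 0.2125 * r + 0.7154 * g + 0.0721 * b  # luminance, weights from the slide
    u = u_weight * (b - y)                    # blue colour difference
    v = v_weight * (r - y)                    # red colour difference
    return y, u, v


# A grey pixel has (almost exactly) zero colour difference, so a black
# and white receiver loses nothing by ignoring U and V.
print(rgb_to_yuv(0.5, 0.5, 0.5))

# A saturated red pixel has a large positive V and a negative U.
print(rgb_to_yuv(1.0, 0.0, 0.0))
```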


Down sample chrominance components

• All 4 Y’ values preserved

• The 4 (2h2v) Cr values replaced by one Cr value in sample

• Same reduction with Cb values (2h2v = 4:2:0)

[Figure: 4 pixels (3 bytes each) shown as Y’, Cr and Cb values, before and after chrominance downsampling]
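A small sketch of the 2h2v reduction on one chrominance plane. The slide only says that four values are replaced by one; taking their mean is an assumption made here, and the plane is assumed to have even width and height.

```python
def subsample_2h2v(plane):
    """Replace each 2x2 block of a chrominance plane by a single value.

    Assumptions: the replacement value is the mean of the four samples,
    and the plane (a list of rows) has even width and height.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


cr = [
    [100, 102, 50, 50],
    [104, 106, 50, 50],
    [10, 20, 200, 200],
    [30, 40, 200, 200],
]
print(subsample_2h2v(cr))
```

With one byte per sample, a block of 4 pixels shrinks from 12 bytes to 6: four Y’ values plus one Cr and one Cb.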


Sampling Analogue data

• CCIR 601 standard prescribes 720 luminance samples per picture line for both broadcast standards, and 360 samples of each of the two colour difference values

• Chrominance sub-sampling (4:2:2)

• Less bandwidth to transmit colour than luminance

• NTSC frame 720 * 480 pixels

• PAL frame 720 * 576 pixels

• Sampled digital data then has to be compressed for transmission


[Figure: sampled frame, 720 samples per line; 480 lines (NTSC) or 576 lines (PAL/SECAM)]


4:2:2 Chrominance Sub sampling

4 Y luminance samples, 2 Cr chrominance samples, 2 Cb chrominance samples



4 Y luminance samples, 1 Cr chrominance sample, 1 Cb chrominance sample
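The effect of the two sampling patterns on frame size can be worked out as follows. This is a sketch that assumes one byte per sample; the table of samples per group of 4 pixels follows the counts listed above, and the 4:4:4 row (no subsampling) is added for comparison.

```python
# Samples (Y, Cr, Cb) stored for every group of 4 pixels.
SAMPLES_PER_4_PIXELS = {
    "4:4:4": (4, 4, 4),  # no subsampling, for comparison
    "4:2:2": (4, 2, 2),
    "4:2:0": (4, 1, 1),
}


def frame_bytes(width, height, scheme):
    """Bytes per frame, assuming one byte per sample."""
    y, cr, cb = SAMPLES_PER_4_PIXELS[scheme]
    return width * height * (y + cr + cb) // 4


for scheme in SAMPLES_PER_4_PIXELS:
    print(scheme, frame_bytes(720, 480, scheme), frame_bytes(720, 576, scheme))
```

On these assumptions 4:2:2 stores two thirds of the original data and 4:2:0 stores half, before any further compression.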


Standards for captured data

• Hardware codecs used to compress (and decompress) sampled data from camera to storage

• 2 standards emerge here

• DV (consumer, semi-professional equipment)

– Variations DVCAM, DVCPRO concern tape formats

• MPEG-2 (digital broadcast, studio equipment)

– Collection of standards, grouped into profiles and levels

– Most common is ‘main profile at main level’ (MP@ML)

– CCIR 601 scanning, 4:2:0 chrominance subsampling, data rate of 15 megabits/second (1.87 MB/sec)



Compression techniques

• Can’t assume that devices (camera, video card) used to digitise images will be available for playback on end user machine

• Need to provide software codec to apply a compression technique suited to capabilities of end user machine

• All techniques operate on a sequence of bitmapped images

• Video data is normally compressed twice

– When captured (hardware codec); real-time compression needed

– In order to be transmitted (software codec)


Intra- and Inter-frame compression

• Spatial (intra-frame) compression compresses each frame in isolation

– Lossy techniques applied, leading to some loss of image quality

• Temporal (inter-frame) compression calculates and compresses differences between frames in a sequence

– 1 key frame + a succession of (usually) 6 difference frames

– Difference frame contains difference between original frame and preceding key frame or preceding difference frame

• Time to compress may be (much) longer than time to decompress – asymmetric codec

• Fast decompression times important
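The key frame and difference frame idea can be sketched as follows. This is a deliberately simplified, lossless illustration: frames are flat lists of pixel values, each difference frame is taken against the preceding frame, and a key frame is stored every 7 frames (1 key frame + 6 difference frames). Real codecs go on to compress the differences, usually lossily; none of that is shown, and the function names are hypothetical.

```python
def encode(frames, key_interval=7):
    """Store a key frame every key_interval frames, differences otherwise."""
    encoded = []
    previous = None
    for index, frame in enumerate(frames):
        if index % key_interval == 0:
            encoded.append(("key", list(frame)))
        else:
            diff = [cur - prev for cur, prev in zip(frame, previous)]
            encoded.append(("diff", diff))
        previous = frame
    return encoded


def decode(encoded):
    """Rebuild every frame by adding differences to the preceding frame."""
    frames = []
    for kind, data in encoded:
        if kind == "key":
            frames.append(list(data))
        else:
            frames.append([prev + d for prev, d in zip(frames[-1], data)])
    return frames


# A static scene: only one pixel changes from frame to frame.
frames = [[10, 10, 10, n] for n in range(10)]
encoded = encode(frames)
print(encoded[0])  # key frame, stored in full
print(encoded[1])  # difference frame, mostly zeros
print(decode(encoded) == frames)
```

Difference frames for a static scene consist mostly of zeros, which is what makes them cheap to compress.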


Static.avi and moving.avi

Static.avi

• File size: 4.49M bytes

• Total duration: 20.48 seconds

• Average data rate: 224.45K per second

• Image size: 320 x 240

• Pixel depth: 24 bits

• Frame rate: 30.03 fps

• There are 39 keyframes, 537 delta frames.

• There are 39 empty frames.

• Compressor: 'IV50', Indeo® video 5.10

moving.avi

• File size: 7.22M bytes

• Total duration: 23.61 seconds

• Average data rate: 313.14K per second

• Image size: 320 x 240

• Pixel depth: 24 bits

• Frame rate: 30.03 fps

• There are 46 keyframes, 632 delta frames.

• There are 31 empty frames.

• Compressor: 'IV50', Indeo® video 5.10
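The figures in the two listings can be cross-checked against each other. The sketch below takes the first listing as Static.avi and the second as moving.avi (an assumption from the order of the slide title), assumes that key, delta and empty frames together make up the whole clip, and treats M and K as multiples of 1024.

```python
def total_frames(keyframes, delta_frames, empty_frames):
    """Assumption: these three counts together cover every frame."""
    return keyframes + delta_frames + empty_frames


def average_rate_kb(size_mbytes, duration_seconds):
    """Average data rate in K per second, with 1M = 1024K."""
    return size_mbytes * 1024 / duration_seconds


clips = {
    "Static.avi": {"size": 4.49, "duration": 20.48, "frames": (39, 537, 39)},
    "moving.avi": {"size": 7.22, "duration": 23.61, "frames": (46, 632, 31)},
}

for name, clip in clips.items():
    frames = total_frames(*clip["frames"])
    fps = frames / clip["duration"]
    rate = average_rate_kb(clip["size"], clip["duration"])
    print(f"{name}: {frames} frames, {fps:.2f} fps, {rate:.1f}K per second")
```

Both clips come out at about 30.03 frames per second, and the computed data rates agree with the listed ones to within the rounding of the file sizes.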


Digital Video (DV) - cameras

• DV equipment uses similar compression technique to MJPEG

• Chrominance subsampling 4:2:0 (PAL DV; NTSC DV uses 4:1:1)

• Also uses temporal compression

• Has to maintain 3.25 Mbytes/sec data rate (due to demands of DV VTR equipment)

• Quality is varied dynamically – if no or little motion in a sequence, more opportunity for temporal compression to make savings, so less spatial compression applied, giving higher image quality


Motion JPEG (MJPEG) - cards

• Most common approach during capture of analogue video

• Loosely defined way of applying JPEG compression

• JPEG compression applied to each frame, no temporal compression

• Discrete Cosine Transform works just as well on Y’CbCr

• Can specify quality setting – compression vs image quality trade-off

• Typical data rates 3 Mbytes/sec – compression ratio of 7:1, achieved by low- to mid-range capture cards
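To give a feel for the transform step in JPEG-style compression, the sketch below applies a 2-D Discrete Cosine Transform to one 8 x 8 block of sample values and then quantises the coefficients. It is a simplification: the single `step` parameter stands in for the quality setting, whereas JPEG itself uses a table of step sizes, and the entropy coding that follows is not shown.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a sequence of samples."""
    n = len(values)
    result = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(
            v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(values)
        )
        result.append(scale * total)
    return result


def dct_2d(block):
    """2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def quantise(coefficients, step):
    """Divide by a step size and round; a larger step loses more detail."""
    return [[round(c / step) for c in row] for row in coefficients]


# A perfectly flat block: all the energy ends up in the first coefficient.
flat = [[100] * 8 for _ in range(8)]
print(quantise(dct_2d(flat), step=16)[0])
```

Smooth areas produce only a few non-zero coefficients after quantisation, which is where the saving comes from.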


Software codecs

• Four main contenders for compressing video for delivery on CD-ROM or via the internet: Cinepak, Intel Indeo, Sorenson and MPEG-1

• Cinepak, Intel Indeo and Sorenson all use vector quantisation

– Frame divided into small blocks (vectors)

– Code book contains typical block patterns

– Closest approximation to each vector among the code book entries is worked out, and the index into the code book is stored instead of the original vector

– Decompression (fast) obtained by replacing indices from the data stream with code book entries

– Compression (slow) can take as much as 150 times as long as decompression
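A toy version of vector quantisation, to show why compression is slow and decompression fast. The code book here is simply given; how a real codec builds its code book is not covered, and the squared-difference distance is an assumption chosen for simplicity.

```python
def nearest_index(block, codebook):
    """Search the whole code book for the entry closest to the block."""
    def distance(entry):
        return sum((a - b) ** 2 for a, b in zip(block, entry))

    return min(range(len(codebook)), key=lambda i: distance(codebook[i]))


def compress(blocks, codebook):
    """Slow step: one full code book search per block."""
    return [nearest_index(block, codebook) for block in blocks]


def decompress(indices, codebook):
    """Fast step: a simple table lookup per index."""
    return [codebook[i] for i in indices]


# Hypothetical code book of 2x2 blocks, flattened to 4 values each.
codebook = [(0, 0, 0, 0), (255, 255, 255, 255), (0, 255, 0, 255)]
blocks = [(10, 5, 0, 3), (250, 255, 240, 255), (0, 250, 10, 240)]

indices = compress(blocks, codebook)
print(indices)
print(decompress(indices, codebook))
```

The decompressed blocks are the code book patterns, not the originals, so the technique is lossy.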


Software codecs

• Full motion, full screen playback not possible with mid-range processors (decompression not fast enough)

• VHS quality (in terms of lossiness), ¼ frame (320 * 240 pixels) @ 12 frames / second is feasible

• Sorenson codec can compress a video with these parameters to (only) 50 Kbytes/sec

– Within capabilities of multimedia PC or 1x speed CD-ROM
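For scale, the 50 Kbytes/sec figure can be compared with the uncompressed rate for the same parameters. The sketch assumes 3 bytes per pixel and 1K = 1024 bytes, and uses 150 Kbytes/sec as the transfer rate of a 1x CD-ROM drive.

```python
def uncompressed_rate_kb(width, height, fps, bytes_per_pixel=3):
    """Uncompressed data rate in Kbytes per second (1K = 1024 bytes)."""
    return width * height * bytes_per_pixel * fps / 1024


SORENSON_RATE_KB = 50    # figure quoted on the slide
CD_ROM_1X_RATE_KB = 150  # transfer rate of a 1x speed CD-ROM drive

raw = uncompressed_rate_kb(320, 240, 12)
print(f"Uncompressed: {raw:.0f} Kbytes/sec")
print(f"Compression ratio: {raw / SORENSON_RATE_KB:.0f}:1")
print(f"Fits 1x CD-ROM: {SORENSON_RATE_KB <= CD_ROM_1X_RATE_KB}")
```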


Comparison of S/W Codecs

                           Moving (panning camera)   Static (talking head)
Intel Indeo 5.10           7,394K                    4,597K
Cinepak (Quality = 100%)   8,451K (2m 40s)           7,381K
Sorenson 3                 12,199K (40s)             12,393K (35s)
MPEG1                      3,292K (1m 55s)           2,880K (1m 40s)
Original size (20 secs)    67,500K                   67,500K
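The file sizes in the comparison are easier to judge as compression ratios against the 67,500K original. The sketch below only restates the sizes from the comparison and divides; the variable and function names are illustrative.

```python
ORIGINAL_K = 67_500

# (moving clip, static clip) sizes in K, taken from the comparison.
SIZES_K = {
    "Intel Indeo 5.10": (7_394, 4_597),
    "Cinepak": (8_451, 7_381),
    "Sorenson 3": (12_199, 12_393),
    "MPEG1": (3_292, 2_880),
}


def compression_ratio(original, compressed):
    """How many times smaller the compressed file is."""
    return original / compressed


ratios = {
    codec: tuple(round(compression_ratio(ORIGINAL_K, size), 1) for size in sizes)
    for codec, sizes in SIZES_K.items()
}

for codec, (moving, static) in ratios.items():
    print(f"{codec}: moving {moving}:1, static {static}:1")
```

For Indeo, Cinepak and MPEG1 the static clip compresses further than the moving one; for Sorenson 3 the two sizes are nearly the same.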
