Digital Image Representation (nets.rwth-aachen.de)
Lehrstuhl für Informatik 4
Kommunikation und verteilte Systeme
Page 1 - Chapter 2.2: Images and Graphics
Chapter 2: Representation of Multimedia Data
• Audio Technology
• Images and Graphics (Chapter 2.2)
  - Digital image representation
  - Image formats and color models
  - JPEG, JPEG2000
  - Image synthesis and graphics systems
• Video Technology
Chapter 3: Multimedia Systems – Communication Aspects and Services
Chapter 4: Multimedia Systems – Storage Aspects
A digital image is a spatial representation of an object (a 2D or 3D scene, or another image – real or virtual).

Definition of "digital image": Let I, J, K ⊆ Z be finite intervals. Let G ⊂ N0 with |G| < ∞ be the grey-scale level / color depth (the intensity values of a picture element = a pixel) of the image.
(1) A 2D image is a function f: I × J → G
(2) A 3D image is a function f: I × J × K → G
(3) If G = {0,1}, the function is a binary (or bit) image; otherwise it is a pixel image

The resolution depends on the size of I and J (and K) and describes the number of pixels per row and column, respectively.
Example
To display a 525-line television picture (NTSC) without noticeable degradation with a Video Graphics Array (VGA) video controller, 640×480 pixels and 256 discrete grey levels give an array of 307,200 8-bit numbers and a total of 2,457,600 bits.
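The arithmetic in this example can be checked with a few lines of Python (the 640×480 / 8-bit figures are taken from the example above):

```python
# Storage needed for the VGA example above: 640x480 pixels, 256 grey levels.
width, height = 640, 480
bits_per_pixel = 8                 # 256 discrete grey levels = 2^8

pixels = width * height            # number of 8-bit numbers in the array
total_bits = pixels * bits_per_pixel

print(pixels)      # 307200
print(total_bits)  # 2457600
```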
Digital Image Representation
An Image Capturing Format is specified by:
spatial resolution (pixels × pixels) and color encoding (bits per pixel)

Example: captured image of a DVD video with 4:3 picture size:
spatial resolution: 768 × 576 pixels
color encoding: 1 bit (binary image), 8 bit (color or grey scale), 24 bit (RGB color)
Image Representation
An Image Storing Format is a 2-dimensional array of values representing the image as a bitmap or pixmap, respectively; this is also called raster graphics. Each field of a bitmap holds a single binary digit; a field of a pixmap may hold:
• 3 numbers representing the intensities of the red, green, and blue components of the color
• 3 numbers representing indices into tables of red, green, and blue intensities
• a single number as an index into a table of color triples
• a single number as an index into any other data structure that represents a color / color system
Further properties can be assigned to the whole image: width, height, depth, version, etc.
Color Models
Why store values for red, green, and blue?

Color perception in the human brain results from the additive composition of red, green and blue light (the RGB system). The relative intensities of the R, G and B values are transmitted to the monitor, where they are reproduced at every point of the image.
On a computer monitor, each pixel is thus an overlay of these three image tones with different intensities – in this way, any color can be reproduced.

But there is another possible color model: CMYK.
When printing an image, other color components are used – cyan, magenta, yellow, and black (the key component) – which together can also reproduce all colors. Thus, much image processing software and some image storing formats support this model as well.
Another possibility is to use a different representation of the color information by means of the YUV system, where
• Y is the brightness (luminance) information
• U and V are color difference signals (chrominance)
• Y, U and V are functions of R, G and B
Why? Since the human eye is more sensitive to brightness than to chrominance, one can separate the brightness information from the color information and code the more important luminance with more bits than the chrominance – this saves bits in the representation format.
Color Models
Usual scheme:
• Y = 0.30·R + 0.59·G + 0.11·B (the color sensitivity of the human eye is taken into account)
• U = c1·(B − Y); V = c2·(R − Y)
• c1, c2 are constants reflecting perception aspects of the human eye and the human brain

Possible coding:
• YUV signal:
  Y = 0.30·R + 0.59·G + 0.11·B
  U = (B − Y)·0.493 = −0.148·R − 0.29·G + 0.439·B
  V = (R − Y)·0.877 = 0.614·R − 0.517·G − 0.096·B
• This is a system of 3 equations for determining Y, U, V from R, G, B, or for recalculating R, G, B from Y, U, V
• The resolution of Y is more important than the resolution of U and V
• Spend more bits on Y than on U and V (Y : U : V = 4 : 2 : 2)
• The weighting factors in the calculation of the Y signal compensate for the color perception imbalance of the human eye
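The three equations above translate directly into code; a minimal sketch (the function name and tuple interface are my own choice, not part of any standard):

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to YUV using the coefficients from the slide."""
    y = 0.30 * r + 0.59 * g + 0.11 * b   # luminance
    u = 0.493 * (b - y)                  # chrominance: scaled blue difference
    v = 0.877 * (r - y)                  # chrominance: scaled red difference
    return y, u, v

# A neutral grey pixel carries no chrominance: U and V are (almost) zero.
y, u, v = rgb_to_yuv(128, 128, 128)
print(round(y), round(u, 6), round(v, 6))
```

Since the weights 0.30 + 0.59 + 0.11 sum to 1, a grey pixel maps to Y equal to its grey value with vanishing U and V.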
Color Models
Image Formats
Lots of different image formats are in use today, e.g.:
• GIF (Graphics Interchange Format)
  Compressed with basic lossless compression techniques to 20–25% of the original size without loss. Supports palettes of up to 256 colors, chosen from a 24-bit color space.
• BMP (Bitmap)
  Device-independent representation of an image: uses the RGB color model, without compression. Color depth up to 24 bit, with the additional option of specifying a color table to use.
• TIFF (Tagged Image File Format)
  Supports grey levels and the RGB and CMYK color models. Also supports lots of different compression methods. Additionally contains a descriptive part with properties a display should provide to show the image.
• PostScript
  Images are described without reference to specific properties such as resolution. A nice feature for printers, but hard to include in documents where you have to know the image size...
• JPEG (Joint Photographic Experts Group)
  Lots of possible compressions, mostly lossy!
Why Compression?

High-resolution image: e.g. 1024×768 pixels, 24-bit color depth
→ 1024·768·24 = 18,874,368 bits

Image formats like GIF:
• Lossless compression (entropy encoding) for reducing the data amount while keeping the image quality

JPEG:
• Lossy compression – remove some image details to achieve a higher compression rate by suppressing higher frequencies
• Combined with lossless techniques
• Trade-off between file size and quality
• JPEG is a joint standard of ISO and ITU-T
  - In June 1987, an adaptive transformation coding technique based on the DCT was adopted for JPEG
  - In 1992, JPEG became an ISO international standard
JPEG

Implementation
• Independent of image size
• Applicable to any image and pixel aspect ratio

Color representation
• JPEG applies to color and grey-scale still images

Image content
• Of any complexity, with any statistical characteristics

Properties of JPEG
• State of the art regarding compression factor and image quality
• Runs on as many available standard processors as possible
• Compression mechanisms are available as software-only packages or together with specific hardware support – the use of specialized hardware should speed up image decompression
• The encoded data stream has a fixed interchange format
• Fast coding is also used for video sequences: Motion JPEG
How Could We Compress?

Entropy encoding
• The data stream is considered to be a simple digital sequence without semantics
• Lossless coding; the decompression process regenerates the data completely
• Used regardless of the medium's specific characteristics
• Examples: run-length encoding, Huffman encoding, arithmetic encoding

Source encoding
• The semantics of the data are taken into account
• Lossy coding (the encoded data are not identical with the original data)
• The degree of compression depends on the data contents
• Example: the Discrete Cosine Transformation (DCT) as a transformation from the spatial domain into the two-dimensional frequency domain

Hybrid encoding
• Used by most multimedia systems
• Combination of entropy and source encoding
• Examples: JPEG, MPEG, H.261
Compression Steps in JPEG

Uncompressed Image → Image Preparation (pixel; block, MCU) → Image Processing (predictor, DCT) → Quantization (approximation of real numbers by rational numbers) → Entropy Encoding (run-length, Huffman, arithmetic) → Compressed Image

MCU: Minimum Coded Unit
DCT: Discrete Cosine Transform
Compression Steps in JPEG
Image Preparation
• Analog-to-digital conversion
• Division of the image into blocks of N×N pixels
• Suitable structuring and ordering of the image information

Image Processing – Source Encoding
• Transformation from the spatial to the frequency domain using the DCT
• In principle no compression in itself – but computation of new coefficients as input for the compression process

Quantization
• Mapping of real numbers onto rational numbers (approximation)
• A certain loss of precision is in general unavoidable

Entropy Encoding
• Lossless compression of a sequential digital data stream
The Principle
• Without quantization, the encoding gain would be very poor (or nonexistent)
• Transformation and retransformation must be inverse to each other
• Task of the transformation: produce a picture representation which can be encoded with a high reduction gain

Encoding: Original → Transformation → Quantization (get rid of "invisible details", controlled by a quantization table) → Encode (Huffman, run-length encoding) → JPEG Picture

The opposite direction: JPEG Picture → JPEG Decoder → Dequantization (the "details" cannot be reconstructed) → Retransformation → "Original"
Variants of Image Compression
JPEG is not a single format; one can choose from a number of modes:

Lossy sequential DCT-based mode (baseline process)
• Must be supported by every JPEG implementation
• Block, MCU, FDCT, run-length, Huffman

Expanded lossy DCT-based mode
• Enhancement of the baseline process by adding progressive encoding

Lossless mode
• Low compression ratio → "perfect" reconstruction of the original image
• No DCT, but differential encoding by prediction

Hierarchical mode
• Accommodates images of different resolutions
• Selects its algorithms from the three other modes
First Step: Image Preparation

General image model
• Independence from image parameters like size and pixel ratio
• Describes most of the well-known picture representations
• The source picture consists of 1 to 255 components (planes) Ci
• Components may be assigned to RGB or YUV values; for example, C1 may be assigned to the red color information
• Each component Ci can have a different number of superpixels Xi, Yi (a superpixel is a rectangle of pixels which all have the same value)

[Figure: components C1, C2, C3, …, CN; each component Ci is an Xi × Yi array of superpixels]
Picture Preparation - Components

The resolution of the components may be different:

[Figure: three components A (X1 × Y1), B (X2 × Y2) and D (X3 × Y3) with
X1 = 2·X2 = 2·X3 and Y1 = Y2 = Y3]

• A grey-scale image consists (in most cases) of a single component
• An RGB color representation has three components with equal resolution
• YUV color image processing uses Y1 = 4·Y2 = 4·Y3 and X1 = 4·X2 = 4·X3
Image Preparation - Dimensions

The dimensions of a compressed image are defined by
• X (the maximum of all Xi),
• Y (the maximum of all Yi),
• Hi and Vi (the relative horizontal and vertical sampling ratios for each component i) with
  Hi = Xi / min_j Xj and Vi = Yi / min_j Yj
• Hi and Vi must be integers in the range 1 to 4. This restriction is needed for the interleaving of components.

Example: Y = 4 pixels, X = 6 pixels
• C1: X1 = 6, Y1 = 4 → H1 = 2, V1 = 2
• C2: X2 = 6, Y2 = 2 → H2 = 2, V2 = 1
• C3: X3 = 3, Y3 = 2 → H3 = 1, V3 = 1
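Assuming all Xi and Yi divide the minima evenly (as in the example), the sampling ratios can be computed as a small illustrative sketch:

```python
# Components of the example: (X_i, Y_i) = (6, 4), (6, 2), (3, 2).
components = [(6, 4), (6, 2), (3, 2)]

min_x = min(x for x, _ in components)   # min_j X_j = 3
min_y = min(y for _, y in components)   # min_j Y_j = 2

# H_i = X_i / min_j X_j, V_i = Y_i / min_j Y_j
ratios = [(x // min_x, y // min_y) for x, y in components]
print(ratios)  # [(2, 2), (2, 1), (1, 1)]
```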
Image Preparation – Data Ordering

An image is divided into several components which can be processed one by one. But how do we prepare a component for processing?
• Observation for most parts of an image: there is not much difference between the values within a rectangle of N×N pixels
• For further processing, divide each component of the image into blocks of N×N pixels
• Thus, the image is divided into data units (blocks):
  - Lossless mode uses one pixel as one data unit
  - Lossy mode uses blocks of 8×8 pixels (with 8 or 12 bits per pixel)
Image Preparation - Data Ordering

Non-interleaved data ordering:
• The easiest, but not the most convenient, sequence of data processing
• Data units are processed component by component
• For one component, the processing order is left-to-right and top-to-bottom

With the non-interleaved technique, an RGB-encoded image is processed as follows:
• First the red component only
• Then the blue component, followed by the green component
This is (for speed reasons) less suitable than data unit interleaving.
Interleaved Data Ordering

Often more suitable: interleave the data units
• Interleaving means: do not process all blocks component by component, but mix data units from all components
• Interleaved data units of different components are combined into Minimum Coded Units (MCUs)
• If all components have the same resolution
  → an MCU consists of one data unit of each component
• If the components have different resolutions:
  1. For each component, regions of data units are determined; the data units in one region are ordered left-to-right and top-to-bottom
  2. Each component consists of the same number of regions
  3. An MCU consists of one region of each component
• Up to 4 components can be encoded in interleaved mode (according to JPEG)
• Each MCU consists of at most ten data units
Image Preparation - MCUs

MCU example: four components C1, C2, C3, C4, with sampling ratios Hi = xi / min_j xj and Vi = yi / min_j yj:
• C1: H1 = 2, V1 = 2 (data units a00, a01, a10, a11, …)
• C2: H2 = 2, V2 = 1 (data units b00, b01, …)
• C3: H3 = 1, V3 = 2 (data units c00, c10, …)
• C4: H4 = 1, V4 = 1 (data units d00, …)
where aij, bij, cij, dij are the data units of C1, C2, C3, C4, respectively (C1 is a grid of data units with rows 0–3 and columns 0–5).

MCUs: 9 data units per MCU
MCU1 = a00 a01 a10 a11 b00 b01 c00 c10 d00
MCU2 = a02 a03 a12 a13 b02 b03 c01 c11 d01
MCU3 = a04 a05 a14 a15 b04 b05 c02 c12 d02
MCU4 = a20 a21 a30 a31 b10 b11 c20 c30 d10
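The "9 data units per MCU" figure follows from the sampling ratios: in interleaved mode, each MCU contains Hi × Vi data units of component i. A quick check (component names as in the example above):

```python
# (H_i, V_i) per component from the MCU example above.
hv = {"C1": (2, 2), "C2": (2, 1), "C3": (1, 2), "C4": (1, 1)}

units = {name: h * v for name, (h, v) in hv.items()}
total = sum(units.values())

print(units)   # {'C1': 4, 'C2': 2, 'C3': 2, 'C4': 1}
print(total)   # 9 data units per MCU
assert total <= 10   # JPEG allows at most ten data units per MCU
```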
Compression Steps in JPEG

Uncompressed Image → Image Preparation (pixel; block, MCU) → Image Processing (DCT) → Quantization → Entropy Encoding (run-length, Huffman, arithmetic) → Compressed Image

• Result of image preparation: a sequence of 8×8 blocks, whose order is defined by the MCUs
• The samples are encoded with 8 bits per pixel
• Next step: image processing by source encoding
Source Encoding – Transformation

Encoding by transformation: the data are transformed into another mathematical domain which is more suitable for compression.
• The inverse transformation must exist and must be easy to calculate
• The most widely known example is the Fourier transformation:

  F_uv = Σ_{x=0}^{m−1} Σ_{y=0}^{n−1} f_xy · e^{−2πi·ux/m} · e^{−2πi·vy/n}

• The parameters m and n indicate the "granularity"

The most effective transformations for image compression:
• Discrete Cosine Transformation (DCT), which has the general form

  F_uv = δ · Σ_{x=0}^{m−1} Σ_{y=0}^{n−1} f_xy · cos(u·(2x+1)·π / 2m) · cos(v·(2y+1)·π / 2n)

• Fast Fourier Transformation (FFT)
Discrete Cosine Transformation

Let f_xy be a pixel (x, y) in the original picture (0 ≤ x ≤ N−1; 0 ≤ y ≤ N−1).

  F_uv := γ · c_u · c_v · Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f_xy · cos((2x+1)·u·π / 2N) · cos((2y+1)·v·π / 2N),  u, v ∈ {0, …, N−1},

  with c_u = 1/√2 for u = 0 and c_u = 1 for u > 0.

f_xy ↔ space domain (i.e. "geometric")
F_uv ↔ "frequency domain" (indicates how fast the information changes inside the rectangle)

F_00 is the lowest frequency in both directions, i.e. a measure of the average pixel value.

F_uv with a small total frequency (i.e. u+v small) are in general larger than F_uv with a large u+v.
Retransformation: Inverse Cosine Transformation

  f_xy = δ · Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} c_u · c_v · F_uv · cos((2x+1)·u·π / 2N) · cos((2y+1)·v·π / 2N)

The retransformation is the exact inverse of the transformation if δ = γ = 2/N.

Simplest example (just for demonstration): let f_xy = f = constant.

  ⇒ F_00 = γ · (1/√2) · (1/√2) · Σ_x Σ_y f · cos(0) · cos(0) = γ · (N²/2) · f;  all other F_uv = 0

Retransformation:
  f_xy = δ · c_0 · c_0 · F_00 · cos(…) · cos(…)
       = δ · (1/2) · γ · (N²/2) · f · 1 · 1
       = f   (using δ = γ = 2/N)
Example

Transformed values can be much smaller than the original values:

• N = 8 (standard): F_uv = (1/4) · c_u · c_v · Σ_x Σ_y …

• N = 2 (with γ = 2/N = 1):

  F_uv = c_u · c_v · Σ_{x=0}^{1} Σ_{y=0}^{1} f_xy · cos((2x+1)·u·π/4) · cos((2y+1)·v·π/4)

  F_00 = (1/2) · [f_00 + f_01 + f_10 + f_11],  i.e. ≈ 2·f if f_xy ≈ f

  F_01 = c_0 · c_1 · [f_00·cos(π/4) + f_01·cos(3π/4) + f_10·cos(π/4) + f_11·cos(3π/4)]
       = (1/2) · [f_00 − f_01 + f_10 − f_11]

  2 positive + 2 negative terms, i.e. if f_xy ≈ f ⇒ F_01 ≈ 0
Baseline Process - Image Processing

First step of image processing:
• Samples are encoded with 8 bits per pixel; each pixel is an integer in the range [0, 255]
• Pixel values are shifted to the range [−128, 127] (two's-complement representation)
• Data units of 8×8 pixel values are given by f_xy ∈ [−128, 127], where x, y are in the range [0, 7]
• Each data unit is transformed using the Forward DCT (FDCT):

  F_uv = (1/4) · c_u · c_v · Σ_{x=0}^{7} Σ_{y=0}^{7} f_xy · cos((2x+1)·u·π/16) · cos((2y+1)·v·π/16)

  where c_{u/v} = 1/√2 for u/v = 0, c_{u/v} = 1 otherwise, and u, v ∈ [0, 7]

• The cosine expressions are independent of f_xy → fast calculation is possible
• Result: from the 64 values f_xy we get 64 coefficients F_uv in the frequency domain

How can the DCT be useful for JPEG? F_uv for larger values of u and v are often very small!
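A direct (unoptimized) transcription of the FDCT formula; real codecs use fast factorizations, but this sketch is enough to see that a flat block produces a single nonzero coefficient:

```python
import math

def fdct_8x8(f):
    """Forward DCT of one 8x8 block f (values already shifted to [-128, 127])."""
    c = lambda k: 1 / math.sqrt(2) if k == 0 else 1.0
    F = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = sum(f[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / 16)
                    * math.cos((2 * y + 1) * v * math.pi / 16)
                    for x in range(8) for y in range(8))
            F[u][v] = 0.25 * c(u) * c(v) * s
    return F

# A constant block has all its energy in the DC coefficient: F00 = 8 * value.
F = fdct_8x8([[20] * 8 for _ in range(8)])
print(round(F[0][0]))  # 160; all other coefficients are ~0
```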
Meaning of Coefficients

[Figure: an 8×8 block is transformed to frequencies. The coefficients are arranged from low (top left) to high (bottom right) frequency in both dimensions; the block is the weighted sum of the basis patterns belonging to F00, F01, F10, F12, F23, …]
Baseline Process - Image Processing

• Coefficient F00: the DC coefficient
  - Corresponds to the lowest frequency in both dimensions
  - Determines the fundamental color of the data unit of 64 pixels
  - Normally the values of F00 are very similar in neighboring blocks
• The other coefficients (F_uv for u+v > 0): the AC coefficients
  - Non-zero frequency in one or both dimensions
• Reconstruction of the image: Inverse DCT (IDCT). If FDCT and IDCT could be calculated with full precision, the DCT would be lossless
• In practice, precision is restricted (real numbers!), thus the DCT is lossy
  → different implementations of a JPEG decoder may produce different images
• Reason for the transformation:
  - Experience shows that many AC coefficients have a value of almost zero, i.e. they are zero after quantization → entropy encoding may lead to a significant data reduction.
Compression Steps in JPEG

Uncompressed Image → Image Preparation (pixel; block, MCU) → Image Processing (predictor, DCT) → Quantization → Entropy Encoding (run-length, Huffman, arithmetic) → Compressed Image

• Result of image processing: 8×8 blocks of DC/AC coefficients
• Up to now no compression has taken place – that is enabled by quantization
Quantization

How can we enforce that even more values are zero? Answer: by quantization.

Divide F_uv by Quantum_uv = Q_uv and take the nearest integer as the result:

  F^Q_uv = [F_uv / Q_uv]

Example: N = 8, quantization step 2, Q_uv = 2·(u+v) + 3:

  Q = (  3  5  7  9 … 17
         5  7  9        .
         7  9           .
         9              .
         …
        17      …      31 )

Observation: the quantized F^Q_uv are smaller values, and most of them are zero.

Dequantization:

  F*_uv = Q_uv · F^Q_uv   (only an approximation of F_uv)
Baseline Process - Quantization

• Quantization process:
  - Divide the DCT coefficient value F_uv by an integer number Q_uv and round the result to the nearest integer
• Quantization of all DCT coefficients results in a lossy transformation – some image details given by higher frequencies are cut off.
• A JPEG application provides a table with 64 entries, each used for the quantization of one DCT coefficient → each coefficient can be adjusted separately
• A high compression factor is achievable at the expense of image quality
  → large quantization numbers: high data reduction, but the information loss increases
• No default values for the quantization tables are specified in JPEG
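Quantization and dequantization are one line each. As the quantization table, this sketch uses the example table Q_uv = 2·(u+v) + 3 from these slides as a stand-in, since JPEG itself specifies no default tables:

```python
# Example quantization table from the slides: Q_uv = 2*(u+v) + 3.
Q = [[2 * (u + v) + 3 for v in range(8)] for u in range(8)]

def quantize(F):
    # nearest-integer division; this is where the information loss happens
    return [[round(F[u][v] / Q[u][v]) for v in range(8)] for u in range(8)]

def dequantize(FQ):
    # only an approximation of the original coefficients
    return [[FQ[u][v] * Q[u][v] for v in range(8)] for u in range(8)]

F = [[0] * 8 for _ in range(8)]
F[0][0], F[0][1] = 186, -18          # DC and one AC value from the example
FQ = quantize(F)
print(FQ[0][0], FQ[0][1])            # 62 -4
print(dequantize(FQ)[0][1])          # -20 instead of -18: the quality loss
```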
Example

1. Input values from an exemplary grey-scale image:

   140 144 147 140 140 155 179 175
   144 152 140 147 140 148 167 179
   152 155 136 167 163 162 152 172
   168 145 156 160 152 155 136 160
   162 148 156 148 140 136 147 162
   147 167 140 155 155 140 136 162
   136 156 123 167 162 144 140 147
   148 155 136 155 152 147 147 136

   First: subtract 128 from each element. Then: perform the FDCT.
Example

2. FDCT output values F_uv (for space reasons, only the part before the decimal point is shown); F00 = 186 is the DC coefficient:

   186 -18  15  -9  23  -9 -14 -19
    21 -34  26  -9 -11  11  14   7
   -10 -24  -2   6 -18   3 -20  -1
    -8  -5  14 -15  -8  -3  -3   8
    -3  10   8   1 -11  18  18  15
     4  -2 -18   8   8  -4   1  -7
     9   1  -3   4  -1  -7  -1  -2
     0  -8  -2   2   1   4  -6   0

3. Quantization matrix for quality level 2 (Q_uv = 2·(u+v) + 3):

    3  5  7  9 11 13 15 17
    5  7  9 11 13 15 17 19
    7  9 11 13 15 17 19 21
    9 11 13 15 17 19 21 23
   11 13 15 17 19 21 23 25
   13 15 17 19 21 23 25 27
   15 17 19 21 23 25 27 29
   17 19 21 23 25 27 29 31
Example

4. Quantized matrix (effects of quantization):

   62 -4  2 -1  2 -1 -1 -1
    4 -5  3 -1 -1  1  1  0
   -1 -3  0  0 -1  0 -1  0
   -1  0  1 -1  0  0  0  0
    0  1  1  0 -1  1  1  1
    0  0 -1  0  0  0  0  0
    1  0  0  0  0  0  0  0
    0  0  0  0  0  0  0  0

5. F*_uv – reconstruction after dequantization. Indication of the "quality loss": e.g. −20 instead of −18, and elsewhere a coefficient reads −19 where the correct value was −11:

   186 -20  14  -9  22 -13 -15 -17
    20 -35  27 -11 -13  15  17   0
    -7 -27   0   0 -15   0 -19   0
    -9   0  13 -15   0   0   0   0
     0  13  15   0 -19  21  23  25
     0   0 -17   0   0   0   0   0
    15   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0
Example

6. Reconstructed image after performing the inverse DCT:

   146 157 161 155 149 162 181 185
   152 176 148 176 167 176 186 209
   161 183 161 193 188 187 180 203
   169 170 180 178 175 176 169 183
   172 174 175 169 163 161 163 186
   158 188 152 188 181 162 159 190
   143 180 140 186 186 160 155 171
   157 179 159 172 181 167 175 168

   Error in the reconstruction (absolute difference from the input block):

     6 13 14 15  9  7  2 10
     8 24  8 29 27 28 19 30
     9 28 25 26 25 25 28 31
     1 25 24 18 23 21 33 23
    10 26 19 21 23 25 16 24
    11 21 12 33 26 22 23 28
     7 24 17 19 24 16 15 24
     9 24 23 17 29 20 28 32
Problem of Quantization

Cutting off the higher frequencies leads to partly wrong color information
→ the higher the quantization coefficients, the more disturbance there is in an 8×8 block
Result: the edges of the blocks become visible
Compression Steps in JPEG

Uncompressed Image → Image Preparation (pixel; block, MCU) → Image Processing (predictor, DCT) → Quantization → Entropy Encoding (run-length, Huffman, arithmetic) → Compressed Image

• Result of quantization: 8×8 blocks of DC/AC coefficients with lots of zeros
• How can the data be processed and encoded efficiently?
Baseline Process - Entropy Encoding

• Initial step: map each 8×8 block of transformed values F^Q_uv to a 64-element vector which can be further processed by entropy encoding
• The DC coefficients determine the basic color of the data units; the variation between the DC coefficients of successive blocks is typically small
  → the DC coefficient is encoded as the difference between the current coefficient and the previous one
• AC coefficients: the processing order follows a zig-zag sequence, starting next to the DC coefficient (top left) and running diagonally through the AC coefficients towards the higher frequencies (bottom right)
• Coefficients with lower frequencies are encoded first, followed by the higher frequencies. Result: sequences of similar data bytes → efficient entropy encoding
Example

Quantized matrix, traversed in zig-zag ordering:

   62 -3  2 -1  2 -1 -1  1
    4 -5  3 -1 -1  1  1  0
   -1 -3  0  0 -1  0 -1  0
   -1  0  1 -1  0  0  0  0
    0  1  0  0 -1  1  1  1
    0  0 -1  0  0  0  0  0
    0  0  0  0  0  0  0  0
    0  0  0  0  0  0  0  0

DC coefficient: code the coefficient of each block as the difference to that of the previous block.

AC coefficients: consider each block separately and order the data using the zig-zag sequence to achieve long runs of zero values:
-3 4 -1 -5 2 -1 3 -3 -1 0 0 0 -1 2 -1 -1 0 1 1 0 0 0 0 -1 -1 1 -1 1 1 0 0 0 -1 0 0 0 0 0 -1 0 -1 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0

Entropy encoding:
• Run-length encoding of the zero values of the quantized AC coefficients
• Huffman encoding of the DC and AC coefficients
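The zig-zag scan order itself can be generated programmatically; this sketch walks the anti-diagonals u + v = d of the block, alternating direction:

```python
def zigzag_indices(n=8):
    """Zig-zag scan order for an n x n coefficient block, DC position first."""
    order = []
    for d in range(2 * n - 1):                      # anti-diagonal u + v = d
        us = range(d + 1) if d < n else range(d - n + 1, n)
        cells = [(u, d - u) for u in us]
        if d % 2 == 0:                              # even diagonals run upwards
            cells.reverse()
        order.extend(cells)
    return order

zz = zigzag_indices()
print(zz[:6])  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```

Indexing the quantized block with `zz[1:]` (skipping the DC position) yields the AC sequence shown above.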
Run-length Encoding

Run-length encoding is a content-dependent coding technique:
• Sequences of the same byte are replaced by the count of their occurrences
• A special flag byte is used which does not occur in the byte stream itself
• Coding procedure:
  - If a byte occurs at least four consecutive times, the "number of occurrences − 4" (offset 4) is counted
  - The compressed data contain this byte, followed by the special flag and the "number of occurrences − 4"
• As a consequence, 4 to 259 bytes can be represented with three bytes (with a corresponding compression effect)

Example with '!' as the special flag:
Uncompressed sequence: ABCCCCCCCCDEFGGG
Run-length coded sequence: ABC!4DEFGGG

The offset is 4 because for smaller runs there would be no reduction effect; e.g. with offset 3:
• D!0 → DDD (both strings have the same length)
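A toy implementation of this scheme on strings (the count is written as a decimal digit here for readability; a real byte-oriented coder would store it as a single byte, which is what limits runs to 259):

```python
FLAG = "!"  # special flag, assumed not to occur in the data itself

def rle_encode(data):
    """Replace a run of >= 4 equal characters by char + FLAG + (run - 4)."""
    out, i = [], 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        run = j - i
        out.append(data[i] + FLAG + str(run - 4) if run >= 4 else data[i] * run)
        i = j
    return "".join(out)

print(rle_encode("ABCCCCCCCCDEFGGG"))  # ABC!4DEFGGG
```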
Run-length Encoding

JPEG proceeds similarly:
• The zero value is the only one appearing in longer sequences, so a more efficient coding is used that only "compresses" zero sequences: nonzero coefficients are coded together with their run-length, i.e. the number of zeros preceding the nonzero value
• Run-length ∈ {0, …, 15}, i.e. 4 bits for representing the length of a zero sequence
• Coded sequence: run-length, size, amplitude, with
  run-length: number of preceding zero coefficients
  size: number of bits used for representing the following coefficient
  amplitude: value of that following coefficient, using size bits
• Adapting the number of bits for representing a coefficient to its value achieves further compression, because most coefficients for higher frequencies have very small values
• (run-length, size) = (15, 0) means that more than 15 zeros follow each other
• (0, 0) is the EoB symbol (End of Block) and indicates the termination of the current block (EoB is used very frequently)
Example

Size i | Amplitude
   1   | −1 ↔ 1
   2   | −3, −2 ↔ 2, 3
   3   | −7, …, −4 ↔ 4, …, 7
   4   | −15, …, −8 ↔ 8, …, 15
   i   | −2^i + 1, …, −2^(i−1) ↔ 2^(i−1), …, 2^i − 1
  10   | −1023, …, −512 ↔ 512, …, 1023

Amplitudes are stored in a one's-complement-style representation (other representations are possible). The 4-bit codes for size 4:

  0000 = 8    0001 = 9    0010 = 10   0011 = 11
  0100 = 12   0101 = 13   0110 = 14   0111 = 15
  1000 = −15  1001 = −14  1010 = −13  1011 = −12
  1100 = −11  1101 = −10  1110 = −9   1111 = −8

11, for instance, is represented by: size = 4, amplitude = 0011.

The sequence 0 … 0 121 0 … (with 35 zeros before the 121) is encoded by

  15, 0,   15, 0,   5, 7, 57

i.e. 35 zeros in all, followed by a value represented using 7 bits; with 7 bits, 121 is 64 + 57, so the amplitude code is 57.

In a second step, the string may be reduced further by Huffman encoding principles.
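The amplitude representation sketched above (positive values offset by 2^(size−1), negative values as the one's complement of the corresponding positive code) can be written out as follows. This mirrors the slide's table, not the bit encoding of the actual JPEG standard:

```python
def encode_amplitude(v, size):
    """Amplitude code with `size` bits, as in the slide's table."""
    half = 1 << (size - 1)
    mask = (1 << size) - 1
    if v > 0:
        code = v - half                  # e.g. 11 -> 11 - 8 = 3 = 0011
    else:
        code = (~((-v) - half)) & mask   # one's complement of the positive code
    return format(code, "0{}b".format(size))

print(encode_amplitude(11, 4))    # 0011
print(encode_amplitude(-8, 4))    # 1111
print(encode_amplitude(121, 7))   # 0111001  (121 = 64 + 57)
```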
Huffman Encoding

• The Huffman code is an optimal code, using the minimum number of bits for a string of data with given per-character probabilities
• Statistical encoding method:
  - For each character, a probability of occurrence is known to the encoder and the decoder
  - Frequently occurring characters are coded with shorter strings than rarely occurring characters
  - Successive characters are coded independently of each other
• The resulting code is prefix-free → unique decoding is guaranteed
• A binary tree is constructed to determine the Huffman codewords of the characters:
  - The leaves represent the characters that are to be encoded
  - The nodes contain the occurrence probability of the characters belonging to their subtree
  - The edges of the tree are labelled with 0 and 1
Huffman Encoding

Algorithm for computing the Huffman code:
1.) List all characters as well as their frequencies
2.) Select the two list elements with the smallest frequencies and remove them from the list
3.) Make them the leaves of a tree, whereby the probabilities of both elements are added; place the tree into the list
4.) Repeat steps 2 and 3 until the list contains only one element
5.) Mark all edges:
    father → left son with "0"
    father → right son with "1"
The code words result from the paths from the root to the leaves.
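The five steps translate almost literally into code when the "list" is kept as a priority queue (a sketch; tie-breaking between equal frequencies is arbitrary, so only the code lengths are canonical):

```python
import heapq
import itertools

def huffman_code(probs):
    """Build Huffman codewords following the algorithm above: repeatedly merge
    the two least frequent list elements into a tree, then label the edges
    father -> left son with '0' and father -> right son with '1'."""
    tick = itertools.count()              # tie-breaker keeps heap tuples comparable
    heap = [(p, next(tick), ch) for ch, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)   # smallest frequency
        p2, _, right = heapq.heappop(heap)  # second smallest frequency
        heapq.heappush(heap, (p1 + p2, next(tick), (left, right)))
    codes = {}
    def walk(node, prefix):                 # codeword = path from root to leaf
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"     # single-symbol edge case
    walk(heap[0][2], "")
    return codes

codes = huffman_code({"A": 0.27, "B": 0.36, "C": 0.16, "D": 0.14, "E": 0.07})
print({ch: len(w) for ch, w in sorted(codes.items())})
```

With the probabilities of the example on the next slide, this yields codeword lengths 2, 2, 2, 3, 3 for A, B, C, D, E, matching the tree built by hand.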
Huffman Encoding Example
Suppose that the characters A, B, C, D and E occur with the probabilities
p(A) = 0.27, p(B) = 0.36, p(C) = 0.16, p(D) = 0.14, p(E) = 0.07

[Huffman tree: the root p(ADCEB) = 1.00 has children p(CED) = 0.37 (edge 0) and p(AB) = 0.63 (edge 1); p(CED) has children p(C) = 0.16 (edge 0) and p(ED) = 0.21 (edge 1); p(ED) has children p(E) = 0.07 (edge 0) and p(D) = 0.14 (edge 1); p(AB) has children p(A) = 0.27 (edge 0) and p(B) = 0.36 (edge 1)]

Resulting code:

x   w(x)
A   10
B   11
C   00
D   011
E   010
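The construction above can be sketched in Python (a minimal sketch: `heapq` plus a tie-breaking counter stands in for the sorted frequency list; with equal probabilities, other valid codes of the same lengths may result):

```python
import heapq
from itertools import count

def huffman_codes(probs):
    """Build Huffman codewords for {char: probability}, steps 1-5 of the algorithm."""
    tick = count()  # tie-breaker so heap tuples never compare the node itself
    # Step 1: list all characters with their frequencies.
    heap = [(p, next(tick), char) for char, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Steps 2+3: merge the two least frequent elements into one subtree.
        p0, _, left = heapq.heappop(heap)
        p1, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p0 + p1, next(tick), (left, right)))
    # Step 5: walk the tree, labeling left edges "0" and right edges "1".
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

codes = huffman_codes({"A": 0.27, "B": 0.36, "C": 0.16, "D": 0.14, "E": 0.07})
print(codes)  # A and B get 2-bit codes, D and E 3-bit codes
```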
Huffman Encoding in JPEG
Coding of (run-length, size) pairs with run-length ∈ {0, …, 15} and size ∈ {0, …, 10}

• (i,j): i preceding zeros (0 ≤ i ≤ 15) in front of a nonzero value coded with j bits
• The table has 10·16 + 2 = 162 entries with significantly different occurrence probabilities
• EoB is relatively frequent
• ZRL: at least 16 successive zeros, i.e. ZRL = (15,0)
• Some values such as (15,10) are extremely rare: 15 preceding zeros in front of a very large value is practically impossible! The same holds for most of the combinations in the table.
• Thus: Huffman coding of the table entries will lead to significant further compression!
[Table: rows run-length i = 0 … 15, columns size j = 0 … 10, each cell an entry (i,j), e.g. (1,3) or (15,10). The size-0 column holds EoB at run-length 0 and ZRL at run-length 15; its entries for run-lengths 1 … 14 are impossible.]
Huffman Encoding in JPEG
• Different Huffman tables for (run-length, size) are used for different 8×8 blocks, based on their contents
• Thus the coding begins with an HTN (Huffman table number)
• The coding of amplitudes may also change from block to block
• Amplitude codes are stored in the preceding (run-length, size) coding table
An 8×8 block is thus coded as follows:

[VLC, DC coefficient, sequence of (run-length, size, amplitude) for the AC coefficients]

VLC = variable length code: contains the actual HTN + the actual VLI (Variable Length Integer), i.e. the coding method for the next amplitude
Alternative to Huffman: Arithmetic Coding

Characteristics:

• Achieves the same optimality (coding rate) as Huffman coding
• Difference to Huffman: the entire data stream has an assigned probability, composed of the probabilities of the contained characters. Each character is coded taking all previous characters into account.
• The data are coded as an interval of real numbers between 0 and 1. Each value within the interval can be used as the code word.
• The minimum length of the code is determined by the assigned probability.
• Disadvantage: the data stream can only be decoded as a whole.
Arithmetic Coding: Example

Code the data ACAB with pA = 0.5, pB = 0.2, pC = 0.3

Starting from [0, 1), each character narrows the current interval in proportion to the probabilities (sub-intervals taken in the order A, B, C):

• A → [0, 0.5) with pA = 0.5; its sub-intervals: pAA = 0.25, pAB = 0.1, pAC = 0.15
• AC → [0.35, 0.5) with pAC = 0.15; its sub-intervals: pACA = 0.075, pACB = 0.03, pACC = 0.045
• ACA → [0.35, 0.425) with pACA = 0.075; its sub-intervals: pACAA = 0.0375, pACAB = 0.015, pACAC = 0.0225
• ACAB → [0.3875, 0.4025) with pACAB = 0.015

ACAB can be coded by any binary number from the interval [0.3875, 0.4025), with the code length rounded up from −log2(pACAB) = 6.06 to 7 bits, e.g. 0.0110010
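The interval narrowing can be reproduced with a few lines of Python (encoder side only; `arithmetic_interval` is an illustrative name, not a library function):

```python
def arithmetic_interval(data, probs, order):
    """Narrow [0, 1) character by character, as in the ACAB example."""
    low, width = 0.0, 1.0
    for ch in data:
        # Offset of ch's sub-interval = sum of probabilities of the preceding symbols.
        offset = sum(probs[s] for s in order[:order.index(ch)])
        low += offset * width
        width *= probs[ch]
    return low, low + width

low, high = arithmetic_interval("ACAB", {"A": 0.5, "B": 0.2, "C": 0.3}, "ABC")
print(low, high)  # ≈ 0.3875, 0.4025
```

The 7-bit code word 0.0110010 equals 0.390625, which indeed lies inside [0.3875, 0.4025).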
Variants of Image Compression
JPEG is not a single method; rather, one of several modes can be chosen:

Lossy sequential DCT-based mode (baseline process)
• The mode presented so far, but not the only one

Expanded lossy DCT-based mode
• Enhances the baseline process by adding progressive encoding

Lossless mode
• Low compression ratio → "perfect" reconstruction of the original image
• No DCT, but differential encoding

Hierarchical mode
• Accommodates images at different resolutions
• Selects its algorithms from the three other modes
Variants: Expanded Lossy DCT-based Mode

With sequential encoding, as in the baseline process, the whole image is coded and decoded in a single run. An alternative to sequential encoding is progressive encoding, performed in the entropy encoding step.

Two alternatives for progressive encoding are possible:
• Spectral selection: at first, the coefficients of low frequencies are passed to entropy encoding; the coefficients of higher frequencies are processed in successive runs
• Successive approximation: all coefficients are transferred in one run, but the most significant bits are encoded before the less significant bits

12 possible coding alternatives in the expanded mode:
• Using sequential encoding, spectral selection, or successive approximation (3 variants)
• Using Huffman or arithmetic encoding (2 variants)
• Using 8 or 12 bits for representing the samples (2 variants)

Most popular mode: sequential display mode with 8 bits/sample and Huffman encoding
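Successive approximation can be sketched as bit-plane transmission (a simplified sketch: signs are ignored and the real JPEG bitstream syntax is not modeled):

```python
def successive_approximation(coeffs, nbits=8):
    """Split coefficient magnitudes into bit planes, most significant plane first."""
    planes = []
    for bit in range(nbits - 1, -1, -1):
        planes.append([(abs(c) >> bit) & 1 for c in coeffs])
    return planes

def reconstruct(planes, upto, nbits=8):
    """Decode using only the first `upto` planes: a coarse-to-fine refinement."""
    values = [0] * len(planes[0])
    for i, plane in enumerate(planes[:upto]):
        bit = nbits - 1 - i
        values = [v | (b << bit) for v, b in zip(values, plane)]
    return values

planes = successive_approximation([121, 35, 7, 0])
print(reconstruct(planes, 3))  # [96, 32, 0, 0] — only 3 of 8 planes: coarse values
print(reconstruct(planes, 8))  # [121, 35, 7, 0] — all planes: exact magnitudes
```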
Expanded Lossy DCT-based Mode (Example)
Sequential encoding: the image is coded and decoded in a single run
Progressive encoding: the image is coded and decoded in refining steps

[Figure: the same image shown after steps 1, 2 and 3, once with sequential and once with progressive decoding]
Variants: Lossless Mode
Lossless mode uses differential encoding
(Differential encoding is also known as prediction or relative encoding)
• Sequence of values that differ from zero, but which do not differ much from each other
→ encode only the difference w.r.t. the previous value (used also for the DC coefficients)

Differential encoding for still images:
• No DCT/quantization
• Instead: calculation of differences between nearby pixels or pixel groups
• Edges are represented by large values
• Areas with similar luminance and chrominance are represented by small values
• A homogeneous area is represented by a large number of zeros → further compression with run-length encoding is possible, as for the DCT coefficients
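Differential encoding of a pixel row can be sketched as follows (a minimal sketch; function names are illustrative):

```python
def differential_encode(samples):
    """Replace each value by its difference to the previous one (first value kept)."""
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

def differential_decode(diffs):
    """A running sum restores the original values exactly — the mode is lossless."""
    out, total = [], 0
    for d in diffs:
        total += d
        out.append(total)
    return out

row = [100, 102, 101, 101, 101, 140]   # slowly varying area, then an edge
diffs = differential_encode(row)
print(diffs)  # [100, 2, -1, 0, 0, 39]: small values and zeros, one large edge value
```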
Variants: Lossless Mode
• Uses data units of single pixels for image preparation
• Any precision between 2 and 16 bits/pixel can be used
• Image processing and quantization use a predictive technique instead of transformation encoding
• 8 predictors are specified; each predicts pixel X from a combination of the already known adjacent samples A, B and C
  (neighborhood: C above-left, B above, A to the left of the current pixel X)

predictor   predicted value X
0           no prediction
1           A
2           B
3           C
4           A + B − C
5           A + (B − C)/2
6           B + (A − C)/2
7           (A + B)/2

The number of the chosen predictor and the difference between the prediction and the actual value are passed to entropy encoding (Huffman or arithmetic encoding)

Examples:
(4,0): X is exactly given by A + B − C
(7,1): X is (A + B)/2 + 1

The actual predictor should give the best approximation of X by the already known values A, B, C

Pipeline: uncompressed data → predictor → entropy encoder → compressed data
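The predictor table can be sketched in Python (a minimal sketch; integer floor division stands in for the /2 cases, a simplifying assumption):

```python
def predict(a, b, c, predictor):
    """Predictors 0-7: predict pixel X from its left (A), above (B)
    and above-left (C) neighbors."""
    return [0, a, b, c,
            a + b - c,
            a + (b - c) // 2,
            b + (a - c) // 2,
            (a + b) // 2][predictor]

# Neighbors of the current pixel X and its actual value:
a, b, c, x = 100, 104, 102, 103
# Choose the predictor that best approximates X from the known values.
best = min(range(1, 8), key=lambda p: abs(x - predict(a, b, c, p)))
diff = x - predict(a, b, c, best)
print(best, diff)  # predictor number and prediction error go to the entropy encoder
```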
Variants: Hierarchical Mode
The Hierarchical mode uses either the lossy DCT-based algorithms or the lossless compression technique
The idea: encoding of an image at different resolutions
Algorithm:
• The image is initially sampled at a low resolution
• Subsequently, the resolution is raised and the compressed image is subtracted from the previous result
• The process is repeated until the full resolution of the image is obtained in compressed form

Disadvantage:
• Requires substantially more storage capacity

Advantage:
• The compressed image is immediately available at different resolutions
→ scaling becomes cheap
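The hierarchical idea can be sketched in 1D (a lossless sketch under simplifying assumptions: real hierarchical JPEG works on 2D images and subtracts the decoded, i.e. compressed, lower-resolution image rather than the exact one):

```python
def downsample(samples):
    """Halve the resolution by averaging pairs (integer mean as a simplification)."""
    return [(samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples), 2)]

def upsample(samples):
    """Double the resolution by repeating each sample."""
    return [s for s in samples for _ in (0, 1)]

def hierarchical_encode(samples, levels):
    """Encode a coarse base layer plus one residual (difference) layer per level."""
    layers = [samples]
    for _ in range(levels):
        layers.append(downsample(layers[-1]))
    base = layers[-1]                      # low-resolution base layer
    residuals = []
    for fine in reversed(layers[:-1]):     # refine level by level
        residuals.append([x - p for x, p in zip(fine, upsample(base))])
        base = fine
    return layers[-1], residuals

def hierarchical_decode(base, residuals):
    """Each residual layer refines the upsampled previous result."""
    image = base
    for res in residuals:
        image = [p + r for p, r in zip(upsample(image), res)]
    return image

row = [10, 12, 20, 22, 30, 34, 40, 44]
base, residuals = hierarchical_encode(row, 2)
print(base)  # [16, 37] — the image is already viewable at 1/4 resolution
```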