DIP Image compression 1.11.2015.ppt
Image Compression
Digital images are very large and hence occupy more storage space. Because of their size, they also take more bandwidth and more time to upload or download over the Internet.
This makes them inconvenient to store and share. To combat this problem, images are compressed using special techniques.
Compression not only saves storage space but also makes file sharing easier. Image compression applications reduce the size of an image file without causing major degradation to the quality of the image.
Image Compression
Digital images require huge amounts of space for storage and large bandwidths for transmission. A 640 x 480 color image requires close to 1 MB of space.
The goal of image compression is to reduce the amount of data required to represent a digital image: reduce storage requirements and increase transmission rates.
Data ≠ Information
• Data and information are not synonymous terms!
• Data is the means by which information is conveyed.
• Data compression aims to reduce the amount of data required to represent a given quantity of information while preserving as much information as possible.
Image Compression
Aims to reduce the amount of data required to represent images while preserving as much information as possible.
Saves storage space
Increases access speed (transmission rates)
Applications:
Medical imaging
Satellite imagery (photographs of the Earth from orbit, weather forecasting, Earth-resource monitoring)
Digital radiography
Legal aspects
Video conferencing (as a person is talking we want to send the full image, so the rate at which information is sent must be quite fast)
Controlling remote vehicles
Fax (fewer bits are transmitted more efficiently)
Entertainment
Document imaging (scanning printed documents for storage in a computer)
Gray and Color Image Compression Gray Image Compression
A digital grayscale image is represented by 8 bits per pixel (bpp) in its uncompressed form. Each pixel has a value ranging from 0 (black) to 255 (white).
Color Image Compression
A digital color image is represented by 24 bits per pixel in its uncompressed form. Each pixel contains a value representing a red (R), green (G), and blue (B) component scaled between 0 and 255. This format is known as the RGB format.
Image Compression Characteristics
Compression Ratio
Image Quality
Compression Speed
Peak Signal to Noise Ratio
Compression Ratio
C_R = size of original image / size of compressed image
Image Quality
bpp = number of bits in compressed image / number of pixels
Compression Speed: the amount of time required to compress and decompress an image.
PSNR
PSNR = 10 log10 ( MAX^2 / MSE ),  where  MSE = (1 / (M N)) Σ_{m=1}^{M} Σ_{n=1}^{N} ( X(m,n) − Y(m,n) )^2
Where X = original image data, Y = compressed image data, MAX = maximum value that a pixel can have.
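The PSNR formula above translates directly into code. A minimal Python sketch, assuming the images are given as equal-sized 2-D lists of pixel values:

```python
import math

def psnr(X, Y, max_val=255):
    """Peak Signal to Noise Ratio between original X and compressed Y."""
    M, N = len(X), len(X[0])
    # Mean squared error over all M x N pixels
    mse = sum((X[m][n] - Y[m][n]) ** 2
              for m in range(M) for n in range(N)) / (M * N)
    if mse == 0:
        return float('inf')  # identical images
    return 10 * math.log10(max_val ** 2 / mse)

# A uniform error of 10 gray levels gives PSNR of about 28.13 dB
print(psnr([[0, 0], [0, 0]], [[10, 10], [10, 10]]))
```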
Principles Behind Compression
The number of bits actually required to represent an image may be less because of redundancy.
The principle behind compression is to remove redundancy.
Three types of redundancy in digital images:
Spatial redundancy: due to correlation between neighbouring pixel values.
Spectral Redundancy: due to correlation between different color planes.
Temporal Redundancy: due to correlation between adjacent frames in a sequence of images.
Data Redundancy
The relative data redundancy is defined as:
R_D = 1 − 1 / C_R
where C_R = n1 / n2 is the compression ratio, n1 denotes the number of bits in the original data set and n2 denotes the number of bits in the compressed data set.
If the compression ratio C_R = 10 (10:1), we are representing every 10 bits of the original data by 1 bit using some coding technique. Then the redundancy R_D = 1 − 1/10 = 9/10 = 0.9; that means 90% of the data was redundant.
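The two definitions above amount to a couple of one-line helpers (a sketch; n1 and n2 are bit counts):

```python
def compression_ratio(n1, n2):
    """C_R = bits in original data set / bits in compressed data set."""
    return n1 / n2

def relative_redundancy(n1, n2):
    """R_D = 1 - 1/C_R."""
    return 1 - 1 / compression_ratio(n1, n2)

# 10:1 compression -> 90% of the original data was redundant
print(relative_redundancy(10, 1))
```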
Types of Data Redundancy
Coding redundancy
Interpixel (spatial, and temporal in video) redundancy
Psychovisual redundancy
Compression attempts to reduce one or more of these redundancies.
Coding Redundancy
Code: a list of symbols (letters, numbers, bits, etc.)
Code word: a sequence of symbols used to represent a piece of information or an event (e.g., gray levels).
Code word length: number of symbols in each code word
Example: l(rk) = constant length
Example: l(rk) = variable length
Consider the probability of the gray levels:
Interpixel Redundancy
Interpixel redundancy implies that any pixel value can be reasonably predicted by its neighbors (i.e., pixels are correlated). The dependence can be measured with the correlation
f(x) ∘ g(x) = ∫ f(a) g(x + a) da
Interpixel redundancy example: adjacent pixel gray levels (1 byte each to store)
123 120 121 124 126 128 127 123
Differences of adjacent pixels: −3 1 3 2 2 −1 −4 (small values, cheaper to code)
Run-length coding (for binary images):
1111110000111110000000111111000 → (1,6) (0,4) (1,5) (0,7) (1,6) (0,3)
Since runs of 1s and 0s alternate, there is no need to store the bit values separately: 6 4 5 7 6 3, with the convention that the first run is of 1s. Useful for documents, e.g., a letter-like page with some data.
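The run-length example above can be reproduced with a short sketch built on `itertools.groupby`:

```python
from itertools import groupby

def rle_encode(bits):
    """Encode a binary string as (bit, run-length) pairs."""
    return [(int(b), len(list(g))) for b, g in groupby(bits)]

def rle_decode(runs):
    """Rebuild the binary string from (bit, run-length) pairs."""
    return ''.join(str(b) * n for b, n in runs)

s = "1111110000111110000000111111000"
print(rle_encode(s))   # the (1,6) (0,4) (1,5) (0,7) (1,6) (0,3) pairs above
assert rle_decode(rle_encode(s)) == s
```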
Psycho-visual Redundancy
In this case compression is lossy: the compressed image is numerically degraded relative to the original image.
Uses quantization. The human visual system is more sensitive to edges. DCT and predictive transforms are used together with a quantizer.
Uniform quantization from 256 to 16 gray levels gives C.R. = 2.
Compression Model: Encoder and Decoder
The source encoder is responsible for removing redundancy (coding, inter-pixel, psycho-visual)
The channel encoder ensures robustness against channel noise.
Compression Techniques
Lossless Compression
The reconstructed image after compression is numerically identical to the original image. Achieves only a modest amount of compression.
Lossy Compression
The reconstructed image contains degradation relative to the original image because the compression scheme discards redundant information.
Achieves much higher compression; can be visually lossless.
Lossless (Error-Free) Compression
Some applications require no error in compression (medical images, business documents, etc.).
C_R = 2 to 10 can be expected.
Makes use of coding redundancy and interpixel redundancy.
Examples: Huffman codes, LZW, arithmetic coding, run-length coding, lossless predictive coding and bit-plane coding.
Huffman Coding
The most popular technique for removing coding redundancy, due to Huffman (1952).
Huffman coding yields the smallest number of code symbols per source symbol.
The resulting code is optimal when coding one symbol at a time; this is also known as block coding.
The basic aim is to bring the average code length as close as possible to the entropy of the source.
Example: l(rk) = variable length
Consider the probability of the gray levels:
Huffman Coding
Symbol   Probability   Code word
S0       0.4           00
S1       0.2           10
S2       0.2           11
S3       0.1           010
S4       0.1           011

L_avg = Σ_{k=0}^{L−1} l(r_k) p_r(r_k) = 0.4×2 + 0.2×2 + 0.2×2 + 0.1×3 + 0.1×3 = 2.2 bits
H(s) = − Σ_k p_k log2 p_k
L_avg = average code word length, p_r(r_k) = probability of occurrence of symbol r_k, l(r_k) = length of the code word for r_k, H(s) = entropy.
Average information generated per pixel = 2.2 bits/symbol
C.R. = 3/2.2 = 1.36 (compared with a 3-bit fixed-length code for the 5 symbols)
R_D = 1 − 1/C.R. = 0.267, i.e., 26.67% of the data is redundant.
Efficiency
η = H(s) / L_avg
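The Huffman code table above can be derived mechanically. A minimal heap-based construction in Python (a sketch; the code words it emits may differ from the table when probabilities tie, but any optimal tree gives the same average length of 2.2 bits):

```python
import heapq

def huffman_codes(probs):
    """Build a Huffman code for a {symbol: probability} table."""
    # Heap entries: (probability, tie-breaker, {symbol: code-so-far})
    heap = [(p, i, {s: ''}) for i, (s, p) in enumerate(sorted(probs.items()))]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # two least probable nodes
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in c1.items()}
        merged.update({s: '1' + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {'S0': 0.4, 'S1': 0.2, 'S2': 0.2, 'S3': 0.1, 'S4': 0.1}
codes = huffman_codes(probs)
l_avg = sum(p * len(codes[s]) for s, p in probs.items())
print(codes, l_avg)   # L_avg = 2.2 bits/symbol
```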
EXAMPLE: CODING REDUNDANCY, CODE-1 vs CODE-2

r_k     p_r(r_k)   Code 1     L1(r_k)   Code 2   L2(r_k)
r_87    0.25       01010111   8         01       2
r_128   0.47       10000000   8         1        1
r_186   0.25       11000100   8         000      3
r_255   0.03       11111111   8         001      3

r_k: the given intensity values; p_r(r_k): the probability of each intensity value.
L_avg = Σ_{k=0}^{L−1} l(r_k) p_r(r_k)
For Code 2: L_avg = 0.25×2 + 0.47×1 + 0.25×3 + 0.03×3 = 1.81 bits
For Code 1: L_avg = 0.25×8 + 0.47×8 + 0.25×8 + 0.03×8 = 8 bits
For a 256 × 256 image, the total number of bits needed with Code 2 is 256 × 256 × 1.81 = 118,621.
Compression ratio C = b/b' = (256 × 256 × 8) / (256 × 256 × 1.81) = 4.42, so R = 1 − 1/C = 0.774.
Thus 77.4% of the data in the original 8-bit 2-D intensity array is redundant.
Example
Arithmetic Coding
It does not generate an individual code for each character: there is no one-to-one correspondence between source symbols and code words.
A single code word is used for an entire sequence of symbols.
The code defines an interval of real numbers between 0 and 1. As the number of symbols in the message increases, the interval used to represent it becomes smaller, and the number of information units required to represent the interval becomes larger.
Source symbols are not encoded one at a time.
Slower than Huffman coding but typically achieves better compression. Performs well for sequences with low entropy, where Huffman codes lose their efficiency.
It is complex but optimal.
Arithmetic Coding
Source symbol   Probability   Initial sub-range
a1              0.2           [0.0, 0.2)
a2              0.2           [0.2, 0.4)
a3              0.4           [0.4, 0.8)
a4              0.2           [0.8, 1.0)
Arithmetic encoding
If the first symbol is a1, the tag will lie in [0.0, 0.2) and the rest of the unit interval is discarded; this subinterval is then divided in the same proportions as the original one. If the second symbol in the sequence is a2, the tag value is restricted to lie in the interval [0.04, 0.08) (tag intervals for different sequences are always disjoint from each other). We then partition this interval in the same proportions as the original one, and so on.
Arithmetic decoding
1) 0.068 ∈ [0.0, 0.2) => a1; (0.068 − 0.0)/(0.2 − 0.0) = 0.34
2) 0.34 ∈ [0.2, 0.4) => a2; (0.34 − 0.2)/(0.4 − 0.2) = 0.7
3) 0.7 ∈ [0.4, 0.8) => a3; (0.7 − 0.4)/(0.8 − 0.4) = 0.75
4) 0.75 ∈ [0.4, 0.8) => a3; (0.75 − 0.4)/(0.8 − 0.4) = 0.875
5) 0.875 ∈ [0.8, 1.0) => a4
Arithmetic decoding
The decoded sequence: a1 a2 a3 a3 a4
The final code interval for the string 'a1 a2 a3 a3 a4' is [0.06752, 0.0688); any tag inside it (e.g., 0.068) identifies the sequence.
Drawbacks of arithmetic coding:
- precision is a big issue
- an end-of-message flag is needed
Alternative solutions: re-normalization and rounding.
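The interval narrowing and the tag-decoding walk above can be sketched in a few lines of Python. This float-only version is fine for short sequences like this one; real coders re-normalize to avoid the precision problem just mentioned:

```python
RANGES = {'a1': (0.0, 0.2), 'a2': (0.2, 0.4),
          'a3': (0.4, 0.8), 'a4': (0.8, 1.0)}

def ac_encode(symbols):
    """Narrow [low, high) once per symbol; any tag inside identifies the message."""
    low, high = 0.0, 1.0
    for s in symbols:
        lo, hi = RANGES[s]
        span = high - low
        low, high = low + span * lo, low + span * hi
    return low, high

def ac_decode(tag, n):
    """Invert the process: locate the sub-range, then rescale the tag."""
    out = []
    for _ in range(n):
        for s, (lo, hi) in RANGES.items():
            if lo <= tag < hi:
                out.append(s)
                tag = (tag - lo) / (hi - lo)
                break
    return out

low, high = ac_encode(['a1', 'a2', 'a3', 'a3', 'a4'])
print(low, high)              # the interval [0.06752, 0.0688)
print(ac_decode(0.068, 5))    # ['a1', 'a2', 'a3', 'a3', 'a4']
```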
Lempel-Ziv-Welch (LZW) Coding
Removes interpixel redundancy.
Fixed-length coding; used in GIF, TIFF, PDF.
The decoder builds an identical decompression dictionary as it simultaneously decodes the encoded data stream.
Advantage: no need for the probabilities of occurrence of events.
Example: given sequence 000101110010100101
Numerical positions:       1  2  3   4   5    6   7    8    9
Subsequences:              0  1  00  01  011  10  010  100  101
Numerical representation:        11  12  42   21  41   61   62
Binary coded blocks:             0010 0010 1001 0100 1000 1101 1101
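The parsing in the example above is LZ78-style; the closely related LZW scheme named in the title can be sketched as follows, with both sides growing the same dictionary on the fly:

```python
def lzw_compress(data, alphabet=('0', '1')):
    """LZW: emit dictionary indices, growing the dictionary as we go."""
    table = {c: i for i, c in enumerate(alphabet)}
    w, out = '', []
    for c in data:
        if w + c in table:
            w += c                      # keep extending the current match
        else:
            out.append(table[w])        # emit code for the longest match
            table[w + c] = len(table)   # add the new string
            w = c
    if w:
        out.append(table[w])
    return out

def lzw_decompress(codes, alphabet=('0', '1')):
    """Rebuild the same dictionary on the fly while decoding."""
    table = {i: c for i, c in enumerate(alphabet)}
    w = table[codes[0]]
    out = [w]
    for code in codes[1:]:
        entry = table[code] if code in table else w + w[0]  # KwKwK case
        out.append(entry)
        table[len(table)] = w + entry[0]
        w = entry
    return ''.join(out)

s = '000101110010100101'
assert lzw_decompress(lzw_compress(s)) == s
```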
Bit-Plane Encoding
An 8-bit image is decomposed into 8 planes of 1 bit each, encoded independently (MSB ... LSB).
The higher-order bit planes contain the majority of the visually significant data.
Useful for image compression.
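The decomposition is pure bit manipulation. A sketch for a row of 8-bit pixels:

```python
def bit_planes(pixels, bits=8):
    """Split a list of 8-bit pixel values into `bits` binary planes.
    planes[0] is the LSB plane, planes[bits-1] the MSB plane."""
    return [[(p >> b) & 1 for p in pixels] for b in range(bits)]

def from_planes(planes):
    """Recombine the planes into pixel values."""
    return [sum(plane[i] << b for b, plane in enumerate(planes))
            for i in range(len(planes[0]))]

row = [123, 120, 121, 124]
planes = bit_planes(row)
print(planes[7])              # MSB plane: [0, 0, 0, 0], all values < 128
assert from_planes(planes) == row
```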
Image data compression methods:
Predictive coding
Transform coding
Predictive Coding
Lossless and lossy variants.
Eliminates interpixel redundancies in the time domain.
Information already sent or available is used to predict future values, and the difference is coded.
f̂(n) = round( Σ_{i=1}^{m} αᵢ f(n − i) )
Fig.: Predictor
Lossless predictive coding
Consists of an encoder and a decoder, both containing an identical predictor. The predictor generates the anticipated value of each sample based on a specified number of past samples.
Fig.: Lossless Predictive Coding model (a) Encoder (b) Decoder
e(n) = f(n) − f̂(n), where e(n) is the prediction error, f(n) the input signal and f̂(n) the output of the predictor. e(n) is encoded using a variable-length code. At the receiver end, the decoder reconstructs e(n) from the received variable-length code words and performs the inverse operation to decompress or recreate the original input sequence:
f(n) = e(n) + f̂(n)
In many cases the prediction is formed by a linear combination of m previous samples, where m is the order of the linear predictor, round denotes rounding to the nearest integer, and the αᵢ for i = 1, 2, ..., m are the prediction coefficients. It typically achieves about 50% compression (2:1).
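A sketch of the simplest case, a first-order predictor (m = 1, α₁ = 1, i.e., "predict the previous sample"), using the gray-level row from the interpixel-redundancy example:

```python
def predictive_encode(samples):
    """First-order lossless predictive coding: send the first sample,
    then only the prediction errors e(n) = f(n) - f(n-1)."""
    return [samples[0]] + [samples[i] - samples[i - 1]
                           for i in range(1, len(samples))]

def predictive_decode(stream):
    """Invert: f(n) = e(n) + f(n-1)."""
    out = [stream[0]]
    for e in stream[1:]:
        out.append(out[-1] + e)
    return out

row = [123, 120, 121, 124, 126, 128, 127, 123]
errors = predictive_encode(row)
print(errors)   # [123, -3, 1, 3, 2, 2, -1, -4]: small values, cheap to code
assert predictive_decode(errors) == row
```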
Lossy Predictive Coding
Fig.: Lossy Predictive Coding model (a) Encoder (b) Decoder
PREDICTIVE CODING
DPCM coding principles: maximize image compression efficiency by exploiting the spatial redundancy present in an image.
Line-by-line difference of the luminance: many values close to zero => spatial redundancy; the brightness is almost repeated from one point to the next. There is no need to encode all the brightness information, only the new part.
See images: one can "estimate" ("predict") the brightness of the subsequent spatial point based on the brightness of the previous (one or more) spatial points = PREDICTIVE CODING.
DPCM coding principles (continued): BASIC FORMULATION OF DPCM
Let {u(m)} be the image pixels, represented line by line as a vector. Assume we have already encoded u(0), u(1), ..., u(n−1); at the decoder only their decoded versions (original + coding error) are available, denoted u°(0), u°(1), ..., u°(n−1).
To predictively encode the current sample u(n):
(1) estimate (predict) the gray level of the nth sample from the previously encoded neighbor pixels:
û(n) = f( u°(n−1), u°(n−2), ... )
(2) compute the prediction error:
e(n) = u(n) − û(n)
(3) quantize the prediction error e(n) and keep the quantized e°(n); encode e°(n) with PCM and transmit it.
At the decoder: 1) decode to get e°(n); 2) build the same prediction û(n); 3) construct
u°(n) = û(n) + e°(n)
The overall encoding and decoding error is then the quantization error:
u(n) − u°(n) = e(n) − e°(n) = q(n)
PREDICTIVE CODING - continued
Basic DPCM codec
DPCM codec: (a) with distortions; (b) without distortions
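The closed-loop structure above can be sketched with a uniform quantizer in the prediction loop. Because the encoder predicts from the decoded values u°(n−1), the reconstruction error stays bounded by the quantization error q(n) and does not accumulate (step size 4 is an arbitrary choice for illustration):

```python
def dpcm_encode(samples, step=4):
    """DPCM with a uniform quantizer inside the prediction loop."""
    codes = [samples[0]]                   # first sample sent as-is
    recon = samples[0]
    for u in samples[1:]:
        e = u - recon                      # prediction error e(n)
        eq = step * round(e / step)        # quantized error e°(n)
        codes.append(eq)
        recon = recon + eq                 # decoder's u°(n), tracked here too
    return codes

def dpcm_decode(codes):
    out = [codes[0]]
    for eq in codes[1:]:
        out.append(out[-1] + eq)
    return out

row = [123, 120, 121, 124, 126, 128, 127, 123]
recon = dpcm_decode(dpcm_encode(row, step=4))
# Per-sample reconstruction error never exceeds step/2
assert all(abs(a - b) <= 2 for a, b in zip(row, recon))
```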
Lossy Transform Compression
Provides greater compression than predictive methods, although at the expense of greater computation.
A reversible linear transform (e.g., the Fourier transform) is used to map the image into a set of transform coefficients; the goal is to pack as much information as possible into the smallest number of coefficients. The quantizer stage eliminates the coefficients that carry the least information.
Fig.: A transform coding system (a) Encoder (b) Decoder
JPEG Compression
JPEG is an image compression standard which was accepted as an international standard in 1992.
Developed by the Joint Photographic Experts Group of the ISO/IEC for coding and compression of color/grayscale images.
Yields acceptable compression in the 10:1 range.
A scheme for video compression based on JPEG, called Motion JPEG (MJPEG), exists.
Different transform techniques for image compression:
Discrete Fourier Transform (DFT)
Discrete Sine Transform (DST)
The Karhunen-Loeve Transform (KLT)
Discrete Cosine Transform (DCT)
Discrete Wavelet Transform (DWT)
Walsh-Hadamard Transform (WHT)
The choice of a particular transform in a given application depends on the amount of reconstruction error that can be tolerated.
Forward Transform
T(u,v) = Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) g(x,y,u,v)
At the receiving end, the image is reconstructed by taking the inverse transform:
f(x,y) = Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} T(u,v) h(x,y,u,v)
g(x,y,u,v): forward transformation kernel or basis image
h(x,y,u,v): inverse transformation kernel
If the kernel is separable, g(x,y,u,v) = g1(x,u) g2(y,v): the horizontal and vertical axes are independent, so the number of computations is reduced.
Image Transformations
Unitary transformations
Orthogonal and orthonormal basis vectors
How an arbitrary 1-D signal can be represented by a series summation of orthogonal basis vectors
How an arbitrary image can be represented by a series summation of orthogonal basis images
What is Image Transformation?
Image (N x N) --Transform--> Coefficient Matrix --Inverse Transform--> Another Image (N x N)
Image Transformation Applications
Preprocessing: filtering, enhancement, etc.
Data compression
Feature extraction: edge detection, corner detection, etc.
What does Image Transformation do?
It represents a given image as a series summation of a set of unitary matrices (orthogonal basis functions).
Example: For a 1-D signal x(t), this representation can be given as
x(t) = Σ_{n=0}^{∞} C_n a_n(t)
where {a_n(t)} is a set of orthogonal functions.
Unitary Matrix and Basis Images
A matrix A is a unitary matrix if A⁻¹ = A*ᵀ.
Orthogonal/Orthonormal Functions
A set of real-valued continuous functions {a_0(t), a_1(t), ..., a_n(t)} is called orthogonal over the interval t to t+T if
∫_T a_m(t) a_n(t) dt = k if m = n, and 0 if m ≠ n.
If k = 1, the set is called orthonormal.
Where {a_n(t)} is a set of orthogonal functions. For example, plot sin ωt and sin 2ωt over the time period 0 to T: their product integrates to zero over the interval,
∫_0^T sin(ωt) sin(2ωt) dt = 0
Similarly, if we multiply sin 2ωt and sin 3ωt and integrate, we get zero. Hence this particular set {sin ωt, sin 2ωt, sin 3ωt} is an orthogonal basis.
Now consider an example: to calculate the value of C_n in x(t) = Σ_n C_n a_n(t), multiply both sides by a_m(t), integrate over the interval, and expand. By the definition of orthogonality, every integral in the expansion vanishes except the n = m term, which equals k C_m. This is how we get the mth coefficient of any arbitrary function x(t):
C_m = (1/k) ∫_T x(t) a_m(t) dt
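The orthogonality of the sines and the coefficient-recovery formula can be checked numerically. A sketch using trapezoidal integration, with ω = 2π/T and a test signal x(t) = 3 sin ωt + 5 sin 2ωt (both the signal and the panel count N are arbitrary choices for illustration):

```python
import math

T = 1.0
w = 2 * math.pi / T
N = 20000                    # trapezoid panels

def integrate(f, a=0.0, b=T, n=N):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

# Orthogonality: integral of sin(wt) * sin(2wt) over one period is 0
cross = integrate(lambda t: math.sin(w * t) * math.sin(2 * w * t))
print(round(cross, 9))       # ~ 0.0

# Coefficient recovery for x(t) = 3 sin(wt) + 5 sin(2wt)
def x(t):
    return 3 * math.sin(w * t) + 5 * math.sin(2 * w * t)

k = integrate(lambda t: math.sin(w * t) ** 2)         # = T/2
c1 = integrate(lambda t: x(t) * math.sin(w * t)) / k  # recovers C_1 = 3
print(round(c1, 6))          # ~ 3.0
```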
Compression Techniques
The Karhunen Loeve Transform (KLT)
Discrete Cosine Transform (DCT)
Discrete Wavelet Transform
DFT
The 2-dimensional DFT is defined by
F(u,v) = (1/N) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) e^{−j2π(ux+vy)/N}    (1)
The inverse DFT is
f(x,y) = (1/N) Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} F(u,v) e^{j2π(ux+vy)/N}    (2)
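The 2-D DFT definition above can be transcribed directly; this naive O(N⁴) version is for illustration only (in practice an FFT is used). For a constant image, all the energy packs into the DC term F(0,0):

```python
import cmath

def dft2(f):
    """Direct 2-D DFT: F(u,v) = (1/N) sum f(x,y) e^{-j2pi(ux+vy)/N}."""
    N = len(f)
    F = [[0j] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0j
            for x in range(N):
                for y in range(N):
                    s += f[x][y] * cmath.exp(-2j * cmath.pi * (u * x + v * y) / N)
            F[u][v] = s / N
    return F

# Constant 4x4 image: F(0,0) = N * c = 4, every other coefficient ~ 0
img = [[1.0] * 4 for _ in range(4)]
F = dft2(img)
print(abs(F[0][0]), abs(F[1][2]))
```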
Properties of DFT
Fast transform; good energy compaction; however, it requires complex computations. Very useful in digital signal processing: convolution, filtering, image analysis.
Why DCT and not FFT?
The DCT is like the FFT, but it can approximate linear (smoothly varying) signals well with few coefficients.
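The energy-compaction claim can be demonstrated with a 1-D orthonormal DCT-II: for a smooth ramp, keeping only 2 of 8 coefficients already reconstructs the signal to within about one gray level (the ramp values are an arbitrary test signal):

```python
import math

def dct(x):
    """Orthonormal 1-D DCT-II."""
    N = len(x)
    return [(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)) *
            sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
            for k in range(N)]

def idct(X):
    """Inverse (DCT-III with matching normalization)."""
    N = len(X)
    return [X[0] * math.sqrt(1 / N) +
            sum(X[k] * math.sqrt(2 / N) *
                math.cos(math.pi * (n + 0.5) * k / N) for k in range(1, N))
            for n in range(N)]

x = [10, 12, 14, 16, 18, 20, 22, 24]     # smooth ramp
X = dct(x)
# Energy compaction: keep only the first 2 of 8 coefficients
X_trunc = X[:2] + [0.0] * 6
approx = idct(X_trunc)
err = max(abs(a - b) for a, b in zip(x, approx))
print(err)    # small: most of the signal's energy sits in the low coefficients
```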
Output File of Gray Image Using DCT
Original image size (bytes) = 65536, Compressed image size (bytes) = 6382, C.R. = 90%, SNR (dB) = 24.49, Simulation Time (Secs) = 1.21
SNR Performance for Gray Image
Table: SNR values for gray image (all values in dB)
Method   CR = 90%   CR = 85%   CR = 75%
DCT      24.49      26.32      29.26
KLT      28.47      29.23      30.92
DWT      29.06      30.29      31.24
Simulation Time Performance for Gray Image
Table: Simulation time for gray image (all values in seconds)
Method   CR = 90%   CR = 85%   CR = 75%
DCT      1.21       1.15       1.1
DWT      1.34       1.33       1.31
KLT      3.95       3.79       3.68
Output File of Color Image Using DCT
Original image size (bytes) = 196608, Compressed image size (bytes) = 58896, C.R. = 70%, SNR (dB) = 23.05, Simulation Time (Secs) = 3.57
Output File of Color Image Using DCT
Original image size (bytes) = 196608, Compressed image size (bytes) = 18978, C.R. = 90%, SNR (dB) = 17.51, Simulation Time (Secs) = 3.68