DIP Image compression 1.11.2015.ppt
Image Compression
Digital images are very large and hence occupy more storage space. Because of their size, they also take more bandwidth and more time to upload or download over the Internet.
This makes them inconvenient to store and share. To combat this problem, images are compressed using special techniques.
Compression not only saves storage space but also makes file sharing easier. Image compression applications reduce the size of an image file without causing major degradation to the quality of the image.
Image Compression
Digital images require huge amounts of space for storage and large bandwidths for transmission. A 640 x 480 color image requires close to 1 MB of space.
The goal of image compression is to reduce the amount of data required to represent a digital image: reduce storage requirements and increase transmission rates.
Data ≠ Information
• Data and information are not synonymous terms!
• Data is the means by which information is conveyed.
• Data compression aims to reduce the amount of data required to represent a given quantity of information while preserving as much information as possible.
Image Compression
Aims to reduce the amount of data required to represent images while preserving as much information as possible.
Saves storage space
Increases access speed (transmission rates)
Applications:
Medical imaging
Satellite imagery (photographs of the Earth from orbit, weather forecasting, Earth-resource monitoring)
Digital radiography
Legal aspects
Video conferencing (as a person is talking we want to send the full image, so the rate at which information is sent must be quite fast)
Controlling remote vehicles
Fax (fewer bits are transmitted more efficiently)
Entertainment
Document imaging (scanning printed documents for storage in a computer)
Gray and Color Image Compression Gray Image Compression
A digital grayscale image is represented by 8 bits per pixel (bpp) in its uncompressed form. Each pixel has a value ranging from 0 (black) to 255 (white).
Color Image Compression
A digital color image is represented by 24 bits per pixel in its uncompressed form. Each pixel contains a value representing a red (R), green (G), and blue (B) component scaled between 0 and 255. This format is known as the RGB format.
Image Compression Characteristics
Compression Ratio
Image Quality
Compression Speed
Peak Signal to Noise Ratio
Compression Ratio
C_R = size of original image / size of compressed image
Image Quality
bpp = number of bits in compressed image / number of pixels
Compression Speed: the amount of time required to compress and decompress an image.
PSNR
PSNR = 10 log10 ( MAX^2 / MSE ),  where  MSE = (1 / (M N)) Σ_{m=1}^{M} Σ_{n=1}^{N} ( X(m,n) − Y(m,n) )^2
Where X = original image data, Y = compressed image data, MAX = maximum value that a pixel can have.
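The PSNR formula above translates directly into code. A minimal Python sketch, assuming the images are given as equal-sized 2-D lists of pixel values:

```python
import math

def psnr(X, Y, max_val=255):
    """Peak Signal to Noise Ratio between original X and compressed Y."""
    M, N = len(X), len(X[0])
    # Mean squared error over all M x N pixels
    mse = sum((X[m][n] - Y[m][n]) ** 2
              for m in range(M) for n in range(N)) / (M * N)
    if mse == 0:
        return float('inf')  # identical images
    return 10 * math.log10(max_val ** 2 / mse)

# A uniform error of 10 gray levels gives PSNR of about 28.13 dB
print(psnr([[0, 0], [0, 0]], [[10, 10], [10, 10]]))
```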
Principles Behind Compression
The number of bits actually required to represent an image may be less because of redundancy.
The principle behind compression is to remove redundancy.
Three types of redundancy in digital images:
Spatial redundancy: due to correlation between neighbouring pixel values.
Spectral Redundancy: due to correlation between different color planes.
Temporal Redundancy: due to correlation between adjacent frames in a sequence of images.
Data Redundancy
The relative data redundancy is defined as:
R_D = 1 − 1 / C_R
where C_R = n1 / n2 is the compression ratio, n1 denotes the number of bits in the original data set and n2 denotes the number of bits in the compressed data set.
If the compression ratio C_R = 10 (10:1), we are representing every 10 bits of the original data by 1 bit using some coding technique. Then the redundancy R_D = 1 − 1/10 = 9/10 = 0.9; that means 90% of the data was redundant.
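The two definitions above amount to a couple of one-line helpers (a sketch; n1 and n2 are bit counts):

```python
def compression_ratio(n1, n2):
    """C_R = bits in original data set / bits in compressed data set."""
    return n1 / n2

def relative_redundancy(n1, n2):
    """R_D = 1 - 1/C_R."""
    return 1 - 1 / compression_ratio(n1, n2)

# 10:1 compression -> 90% of the original data was redundant
print(relative_redundancy(10, 1))
```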
Types of Data Redundancy
Coding redundancy
Interpixel (spatial, and temporal in video) redundancy
Psychovisual redundancy
Compression attempts to reduce one or more of these redundancies.
Coding Redundancy
Code: a list of symbols (letters, numbers, bits, etc.)
Code word: a sequence of symbols used to represent a piece of information or an event (e.g., gray levels).
Code word length: number of symbols in each code word
Example: l(rk) = constant length
Example: l(rk) = variable length
Consider the probability of the gray levels:
Interpixel Redundancy
Interpixel redundancy implies that any pixel value can be reasonably predicted by its neighbors (i.e., pixels are correlated). The dependence can be measured with the correlation
f(x) ∘ g(x) = ∫ f(a) g(x + a) da
Interpixel redundancy example: adjacent pixel gray levels (1 byte each to store)
123 120 121 124 126 128 127 123
Differences of adjacent pixels: −3 1 3 2 2 −1 −4 (small values, cheaper to code)
Run-length coding (for binary images):
1111110000111110000000111111000 → (1,6) (0,4) (1,5) (0,7) (1,6) (0,3)
Since runs of 1s and 0s alternate, there is no need to store the bit values separately: 6 4 5 7 6 3, with the convention that the first run is of 1s. Useful for documents, e.g., a letter-like page with some data.
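The run-length example above can be reproduced with a short sketch built on `itertools.groupby`:

```python
from itertools import groupby

def rle_encode(bits):
    """Encode a binary string as (bit, run-length) pairs."""
    return [(int(b), len(list(g))) for b, g in groupby(bits)]

def rle_decode(runs):
    """Rebuild the binary string from (bit, run-length) pairs."""
    return ''.join(str(b) * n for b, n in runs)

s = "1111110000111110000000111111000"
print(rle_encode(s))   # the (1,6) (0,4) (1,5) (0,7) (1,6) (0,3) pairs above
assert rle_decode(rle_encode(s)) == s
```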
Psycho-visual Redundancy
In this case compression is lossy: the compressed image is numerically degraded relative to the original image.
Uses quantization. The human visual system is more sensitive to edges. DCT and predictive transforms are used together with a quantizer.
Uniform quantization from 256 to 16 gray levels gives C.R. = 2.
Compression Model: Encoder and Decoder
The source encoder is responsible for removing redundancy (coding, inter-pixel, psycho-visual)
The channel encoder ensures robustness against channel noise.
Compression Techniques
Lossless Compression
The reconstructed image after compression is numerically identical to the original image. Achieves only a modest amount of compression.
Lossy Compression
The reconstructed image contains degradation relative to the original image because the compression scheme discards redundant information.
Achieves much higher compression; can be visually lossless.
Lossless (Error-Free) Compression
Some applications require no error in compression (medical images, business documents, etc.).
C_R = 2 to 10 can be expected.
Makes use of coding redundancy and interpixel redundancy.
Examples: Huffman codes, LZW, arithmetic coding, run-length coding, lossless predictive coding and bit-plane coding.
Huffman Coding
The most popular technique for removing coding redundancy, due to Huffman (1952).
Huffman coding yields the smallest number of code symbols per source symbol.
The resulting code is optimal when coding one symbol at a time; this is also known as block coding.
The basic aim is to bring the average code length as close as possible to the entropy of the source.
Example: l(rk) = variable length
Consider the probability of the gray levels:
Huffman Coding
Symbol   Probability   Code word
S0       0.4           00
S1       0.2           10
S2       0.2           11
S3       0.1           010
S4       0.1           011

L_avg = Σ_{k=0}^{L−1} l(r_k) p_r(r_k) = 0.4×2 + 0.2×2 + 0.2×2 + 0.1×3 + 0.1×3 = 2.2 bits
H(s) = − Σ_k p_k log2 p_k
L_avg = average code word length, p_r(r_k) = probability of occurrence of symbol r_k, l(r_k) = length of the code word for r_k, H(s) = entropy.
Average information generated per pixel = 2.2 bits/symbol
C.R. = 3/2.2 = 1.36 (compared with a 3-bit fixed-length code for the 5 symbols)
R_D = 1 − 1/C.R. = 0.267, i.e., 26.67% of the data is redundant.
Efficiency
η = H(s) / L_avg
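The Huffman code table above can be derived mechanically. A minimal heap-based construction in Python (a sketch; the code words it emits may differ from the table when probabilities tie, but any optimal tree gives the same average length of 2.2 bits):

```python
import heapq

def huffman_codes(probs):
    """Build a Huffman code for a {symbol: probability} table."""
    # Heap entries: (probability, tie-breaker, {symbol: code-so-far})
    heap = [(p, i, {s: ''}) for i, (s, p) in enumerate(sorted(probs.items()))]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # two least probable nodes
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in c1.items()}
        merged.update({s: '1' + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {'S0': 0.4, 'S1': 0.2, 'S2': 0.2, 'S3': 0.1, 'S4': 0.1}
codes = huffman_codes(probs)
l_avg = sum(p * len(codes[s]) for s, p in probs.items())
print(codes, l_avg)   # L_avg = 2.2 bits/symbol
```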
EXAMPLE: CODING REDUNDANCY, CODE-1 vs CODE-2

r_k     p_r(r_k)   Code 1     L1(r_k)   Code 2   L2(r_k)
r_87    0.25       01010111   8         01       2
r_128   0.47       10000000   8         1        1
r_186   0.25       11000100   8         000      3
r_255   0.03       11111111   8         001      3

r_k: the given intensity values; p_r(r_k): the probability of each intensity value.
L_avg = Σ_{k=0}^{L−1} l(r_k) p_r(r_k)
For Code 2: L_avg = 0.25×2 + 0.47×1 + 0.25×3 + 0.03×3 = 1.81 bits
For Code 1: L_avg = 0.25×8 + 0.47×8 + 0.25×8 + 0.03×8 = 8 bits
For a 256 × 256 image, the total number of bits needed with Code 2 is 256 × 256 × 1.81 = 118,621.
Compression ratio C = b/b' = (256 × 256 × 8) / (256 × 256 × 1.81) = 4.42, so R = 1 − 1/C = 0.774.
Thus 77.4% of the data in the original 8-bit 2-D intensity array is redundant.
Example
Arithmetic Coding
It does not generate an individual code for each character: there is no one-to-one correspondence between source symbols and code words.
A single code word is used for an entire sequence of symbols.
The code defines an interval of real numbers between 0 and 1. As the number of symbols in the message increases, the interval used to represent it becomes smaller, and the number of information units required to represent the interval becomes larger.
Source symbols are not encoded one at a time.
Slower than Huffman coding but typically achieves better compression. Performs well for sequences with low entropy, where Huffman codes lose their efficiency.
It is complex but optimal.
Arithmetic Coding
Source symbol   Probability   Initial sub-range
a1              0.2           [0.0, 0.2)
a2              0.2           [0.2, 0.4)
a3              0.4           [0.4, 0.8)
a4              0.2           [0.8, 1.0)
Arithmetic encoding
If the first symbol is a1, the tag will lie in [0.0, 0.2) and the rest of the unit interval is discarded; this subinterval is then divided in the same proportions as the original one. If the second symbol in the sequence is a2, the tag value is restricted to lie in the interval [0.04, 0.08) (tag intervals for different sequences are always disjoint from each other). We then partition this interval in the same proportions as the original one, and so on.
Arithmetic decoding
1) 0.068 ∈ [0.0, 0.2) => a1; (0.068 − 0.0)/(0.2 − 0.0) = 0.34
2) 0.34 ∈ [0.2, 0.4) => a2; (0.34 − 0.2)/(0.4 − 0.2) = 0.7
3) 0.7 ∈ [0.4, 0.8) => a3; (0.7 − 0.4)/(0.8 − 0.4) = 0.75
4) 0.75 ∈ [0.4, 0.8) => a3; (0.75 − 0.4)/(0.8 − 0.4) = 0.875
5) 0.875 ∈ [0.8, 1.0) => a4
Arithmetic decoding
The decoded sequence: a1 a2 a3 a3 a4
The final code interval for the string 'a1 a2 a3 a3 a4' is [0.06752, 0.0688); any tag inside it (e.g., 0.068) identifies the sequence.
Drawbacks of arithmetic coding:
- precision is a big issue
- an end-of-message flag is needed
Alternative solutions: re-normalization and rounding.
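The interval narrowing and the tag-decoding walk above can be sketched in a few lines of Python. This float-only version is fine for short sequences like this one; real coders re-normalize to avoid the precision problem just mentioned:

```python
RANGES = {'a1': (0.0, 0.2), 'a2': (0.2, 0.4),
          'a3': (0.4, 0.8), 'a4': (0.8, 1.0)}

def ac_encode(symbols):
    """Narrow [low, high) once per symbol; any tag inside identifies the message."""
    low, high = 0.0, 1.0
    for s in symbols:
        lo, hi = RANGES[s]
        span = high - low
        low, high = low + span * lo, low + span * hi
    return low, high

def ac_decode(tag, n):
    """Invert the process: locate the sub-range, then rescale the tag."""
    out = []
    for _ in range(n):
        for s, (lo, hi) in RANGES.items():
            if lo <= tag < hi:
                out.append(s)
                tag = (tag - lo) / (hi - lo)
                break
    return out

low, high = ac_encode(['a1', 'a2', 'a3', 'a3', 'a4'])
print(low, high)              # the interval [0.06752, 0.0688)
print(ac_decode(0.068, 5))    # ['a1', 'a2', 'a3', 'a3', 'a4']
```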
Lempel-Ziv-Welch (LZW) Coding
Removes interpixel redundancy.
Fixed-length coding; used in GIF, TIFF, PDF.
The decoder builds an identical decompression dictionary as it simultaneously decodes the encoded data stream.
Advantage: no need for the probabilities of occurrence of events.
Example: given sequence 000101110010100101
Numerical positions:       1  2  3   4   5    6   7    8    9
Subsequences:              0  1  00  01  011  10  010  100  101
Numerical representation:        11  12  42   21  41   61   62
Binary coded blocks:             0010 0010 1001 0100 1000 1101 1101
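The parsing in the example above is LZ78-style; the closely related LZW scheme named in the title can be sketched as follows, with both sides growing the same dictionary on the fly:

```python
def lzw_compress(data, alphabet=('0', '1')):
    """LZW: emit dictionary indices, growing the dictionary as we go."""
    table = {c: i for i, c in enumerate(alphabet)}
    w, out = '', []
    for c in data:
        if w + c in table:
            w += c                      # keep extending the current match
        else:
            out.append(table[w])        # emit code for the longest match
            table[w + c] = len(table)   # add the new string
            w = c
    if w:
        out.append(table[w])
    return out

def lzw_decompress(codes, alphabet=('0', '1')):
    """Rebuild the same dictionary on the fly while decoding."""
    table = {i: c for i, c in enumerate(alphabet)}
    w = table[codes[0]]
    out = [w]
    for code in codes[1:]:
        entry = table[code] if code in table else w + w[0]  # KwKwK case
        out.append(entry)
        table[len(table)] = w + entry[0]
        w = entry
    return ''.join(out)

s = '000101110010100101'
assert lzw_decompress(lzw_compress(s)) == s
```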
Bit-Plane Encoding
An 8-bit image is decomposed into 8 planes of 1 bit each, encoded independently (MSB ... LSB).
The higher-order bit planes contain the majority of the visually significant data.
Useful for image compression.
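The decomposition is pure bit manipulation. A sketch for a row of 8-bit pixels:

```python
def bit_planes(pixels, bits=8):
    """Split a list of 8-bit pixel values into `bits` binary planes.
    planes[0] is the LSB plane, planes[bits-1] the MSB plane."""
    return [[(p >> b) & 1 for p in pixels] for b in range(bits)]

def from_planes(planes):
    """Recombine the planes into pixel values."""
    return [sum(plane[i] << b for b, plane in enumerate(planes))
            for i in range(len(planes[0]))]

row = [123, 120, 121, 124]
planes = bit_planes(row)
print(planes[7])              # MSB plane: [0, 0, 0, 0], all values < 128
assert from_planes(planes) == row
```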
Image data compression methods:
Predictive coding
Transform coding
Predictive Coding
Lossless and lossy variants.
Eliminates interpixel redundancies in the time domain.
Information already sent or available is used to predict future values, and the difference is coded.
f̂(n) = round( Σ_{i=1}^{m} αᵢ f(n − i) )
Fig.: Predictor
Lossless predictive coding
Consists of an encoder and a decoder, both containing an identical predictor. The predictor generates the anticipated value of each sample based on a specified number of past samples.
Fig.: Lossless Predictive Coding model (a) Encoder (b) Decoder
e(n) = f(n) − f̂(n), where e(n) is the prediction error, f(n) the input signal and f̂(n) the output of the predictor. e(n) is encoded using a variable-length code. At the receiver end, the decoder reconstructs e(n) from the received variable-length code words and performs the inverse operation to decompress or recreate the original input sequence:
f(n) = e(n) + f̂(n)
In many cases the prediction is formed by a linear combination of m previous samples, where m is the order of the linear predictor, round denotes rounding to the nearest integer, and the αᵢ for i = 1, 2, ..., m are the prediction coefficients. It typically achieves about 50% compression (2:1).
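A sketch of the simplest case, a first-order predictor (m = 1, α₁ = 1, i.e., "predict the previous sample"), using the gray-level row from the interpixel-redundancy example:

```python
def predictive_encode(samples):
    """First-order lossless predictive coding: send the first sample,
    then only the prediction errors e(n) = f(n) - f(n-1)."""
    return [samples[0]] + [samples[i] - samples[i - 1]
                           for i in range(1, len(samples))]

def predictive_decode(stream):
    """Invert: f(n) = e(n) + f(n-1)."""
    out = [stream[0]]
    for e in stream[1:]:
        out.append(out[-1] + e)
    return out

row = [123, 120, 121, 124, 126, 128, 127, 123]
errors = predictive_encode(row)
print(errors)   # [123, -3, 1, 3, 2, 2, -1, -4]: small values, cheap to code
assert predictive_decode(errors) == row
```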
Lossy Predictive Coding
Fig.: Lossy Predictive Coding model (a) Encoder (b) Decoder
PREDICTIVE CODING
DPCM coding principles: maximize image compression efficiency by exploiting the spatial redundancy present in an image.
Line-by-line difference of the luminance: many values close to zero => spatial redundancy; the brightness is almost repeated from one point to the next. There is no need to encode all the brightness information, only the new part.
See images: one can "estimate" ("predict") the brightness of the subsequent spatial point based on the brightness of the previous (one or more) spatial points = PREDICTIVE CODING.
DPCM coding principles (continued): BASIC FORMULATION OF DPCM
Let {u(m)} be the image pixels, represented line by line as a vector. Assume we have already encoded u(0), u(1), ..., u(n−1); at the decoder only their decoded versions (original + coding error) are available, denoted u°(0), u°(1), ..., u°(n−1).
To predictively encode the current sample u(n):
(1) estimate (predict) the gray level of the nth sample from the previously encoded neighbor pixels:
û(n) = f( u°(n−1), u°(n−2), ... )
(2) compute the prediction error:
e(n) = u(n) − û(n)
(3) quantize the prediction error e(n) and keep the quantized e°(n); encode e°(n) with PCM and transmit it.
At the decoder: 1) decode to get e°(n); 2) build the same prediction û(n); 3) construct
u°(n) = û(n) + e°(n)
The overall encoding and decoding error is then the quantization error:
u(n) − u°(n) = e(n) − e°(n) = q(n)
PREDICTIVE CODING - continued
Basic DPCM codec
DPCM codec: (a) with distortions; (b) without distortions
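The closed-loop structure above can be sketched with a uniform quantizer in the prediction loop. Because the encoder predicts from the decoded values u°(n−1), the reconstruction error stays bounded by the quantization error q(n) and does not accumulate (step size 4 is an arbitrary choice for illustration):

```python
def dpcm_encode(samples, step=4):
    """DPCM with a uniform quantizer inside the prediction loop."""
    codes = [samples[0]]                   # first sample sent as-is
    recon = samples[0]
    for u in samples[1:]:
        e = u - recon                      # prediction error e(n)
        eq = step * round(e / step)        # quantized error e°(n)
        codes.append(eq)
        recon = recon + eq                 # decoder's u°(n), tracked here too
    return codes

def dpcm_decode(codes):
    out = [codes[0]]
    for eq in codes[1:]:
        out.append(out[-1] + eq)
    return out

row = [123, 120, 121, 124, 126, 128, 127, 123]
recon = dpcm_decode(dpcm_encode(row, step=4))
# Per-sample reconstruction error never exceeds step/2
assert all(abs(a - b) <= 2 for a, b in zip(row, recon))
```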
Lossy Transform Compression
Provides greater compression than predictive methods, although at the expense of greater computation.
A reversible linear transform (e.g., the Fourier transform) is used to map the image into a set of transform coefficients; the goal is to pack as much information as possible into the smallest number of coefficients. The quantizer stage eliminates the coefficients that carry the least information.
Fig.: A transform coding system (a) Encoder (b) Decoder
JPEG Compression
JPEG is an image compression standard which was accepted as an international standard in 1992.
Developed by the Joint Photographic Experts Group of the ISO/IEC for coding and compression of color/grayscale images.
Yields acceptable compression in the 10:1 range.
A scheme for video compression based on JPEG, called Motion JPEG (MJPEG), exists.
Different transform techniques for image compression:
Discrete Fourier Transform (DFT)
Discrete Sine Transform (DST)
The Karhunen-Loeve Transform (KLT)
Discrete Cosine Transform (DCT)
Discrete Wavelet Transform (DWT)
Walsh-Hadamard Transform (WHT)
The choice of a particular transform in a given application depends on the amount of reconstruction error that can be tolerated.
Forward Transform
T(u,v) = Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) g(x,y,u,v)
At the receiving end, the image is reconstructed by taking the inverse transform:
f(x,y) = Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} T(u,v) h(x,y,u,v)
g(x,y,u,v): forward transformation kernel or basis image
h(x,y,u,v): inverse transformation kernel
If the kernel is separable, g(x,y,u,v) = g1(x,u) g2(y,v): the horizontal and vertical axes are independent, so the number of computations is reduced.
Image Transformations
Unitary transformations
Orthogonal and orthonormal basis vectors
How an arbitrary 1-D signal can be represented by a series summation of orthogonal basis vectors
How an arbitrary image can be represented by a series summation of orthogonal basis images
What is Image Transformation?
Image (N x N) --Transform--> Coefficient Matrix --Inverse Transform--> Another Image (N x N)
Image Transformation Applications
Preprocessing: filtering, enhancement, etc.
Data compression
Feature extraction: edge detection, corner detection, etc.
What does Image Transformation do?
It represents a given image as a series summation of a set of unitary matrices (orthogonal basis functions).
Example: For a 1-D signal x(t), this representation can be given as
x(t) = Σ_{n=0}^{∞} C_n a_n(t)
where {a_n(t)} is a set of orthogonal functions.
Unitary Matrix and Basis Images
A matrix A is a unitary matrix if A⁻¹ = A*ᵀ.
Orthogonal/Orthonormal Functions
A set of real-valued continuous functions {a_0(t), a_1(t), ..., a_n(t)} is called orthogonal over the interval t to t+T if
∫_T a_m(t) a_n(t) dt = k if m = n, and 0 if m ≠ n.
If k = 1, the set is called orthonormal.
Where {a_n(t)} is a set of orthogonal functions. For example, plot sin ωt and sin 2ωt over the time period 0 to T: their product integrates to zero over the interval,
∫_0^T sin(ωt) sin(2ωt) dt = 0
Similarly, if we multiply sin 2ωt and sin 3ωt and integrate, we get zero. Hence this particular set {sin ωt, sin 2ωt, sin 3ωt} is an orthogonal basis.
Now consider an example: to calculate the value of C_n in x(t) = Σ_n C_n a_n(t), multiply both sides by a_m(t), integrate over the interval, and expand. By the definition of orthogonality, every integral in the expansion vanishes except the n = m term, which equals k C_m. This is how we get the mth coefficient of any arbitrary function x(t):
C_m = (1/k) ∫_T x(t) a_m(t) dt
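The orthogonality of the sines and the coefficient-recovery formula can be checked numerically. A sketch using trapezoidal integration, with ω = 2π/T and a test signal x(t) = 3 sin ωt + 5 sin 2ωt (both the signal and the panel count N are arbitrary choices for illustration):

```python
import math

T = 1.0
w = 2 * math.pi / T
N = 20000                    # trapezoid panels

def integrate(f, a=0.0, b=T, n=N):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

# Orthogonality: integral of sin(wt) * sin(2wt) over one period is 0
cross = integrate(lambda t: math.sin(w * t) * math.sin(2 * w * t))
print(round(cross, 9))       # ~ 0.0

# Coefficient recovery for x(t) = 3 sin(wt) + 5 sin(2wt)
def x(t):
    return 3 * math.sin(w * t) + 5 * math.sin(2 * w * t)

k = integrate(lambda t: math.sin(w * t) ** 2)         # = T/2
c1 = integrate(lambda t: x(t) * math.sin(w * t)) / k  # recovers C_1 = 3
print(round(c1, 6))          # ~ 3.0
```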
Compression Techniques
The Karhunen Loeve Transform (KLT)
Discrete Cosine Transform (DCT)
Discrete Wavelet Transform
DFT
The 2-dimensional DFT is defined by
F(u,v) = (1/N) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) e^{−j2π(ux+vy)/N}    (1)
The inverse DFT is
f(x,y) = (1/N) Σ_{u=0}^{N−1} Σ_{v=0}^{N−1} F(u,v) e^{j2π(ux+vy)/N}    (2)
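The 2-D DFT definition above can be transcribed directly; this naive O(N⁴) version is for illustration only (in practice an FFT is used). For a constant image, all the energy packs into the DC term F(0,0):

```python
import cmath

def dft2(f):
    """Direct 2-D DFT: F(u,v) = (1/N) sum f(x,y) e^{-j2pi(ux+vy)/N}."""
    N = len(f)
    F = [[0j] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0j
            for x in range(N):
                for y in range(N):
                    s += f[x][y] * cmath.exp(-2j * cmath.pi * (u * x + v * y) / N)
            F[u][v] = s / N
    return F

# Constant 4x4 image: F(0,0) = N * c = 4, every other coefficient ~ 0
img = [[1.0] * 4 for _ in range(4)]
F = dft2(img)
print(abs(F[0][0]), abs(F[1][2]))
```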
Properties of DFT
Fast transform; good energy compaction; however, it requires complex computations. Very useful in digital signal processing: convolution, filtering, image analysis.
Why DCT and not FFT?
The DCT is like the FFT, but it can approximate linear (smoothly varying) signals well with few coefficients.
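The energy-compaction claim can be demonstrated with a 1-D orthonormal DCT-II: for a smooth ramp, keeping only 2 of 8 coefficients already reconstructs the signal to within about one gray level (the ramp values are an arbitrary test signal):

```python
import math

def dct(x):
    """Orthonormal 1-D DCT-II."""
    N = len(x)
    return [(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)) *
            sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
            for k in range(N)]

def idct(X):
    """Inverse (DCT-III with matching normalization)."""
    N = len(X)
    return [X[0] * math.sqrt(1 / N) +
            sum(X[k] * math.sqrt(2 / N) *
                math.cos(math.pi * (n + 0.5) * k / N) for k in range(1, N))
            for n in range(N)]

x = [10, 12, 14, 16, 18, 20, 22, 24]     # smooth ramp
X = dct(x)
# Energy compaction: keep only the first 2 of 8 coefficients
X_trunc = X[:2] + [0.0] * 6
approx = idct(X_trunc)
err = max(abs(a - b) for a, b in zip(x, approx))
print(err)    # small: most of the signal's energy sits in the low coefficients
```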
Output File of Gray Image Using DCT
Original image size (bytes) = 65536, Compressed image size (bytes) = 6382, C.R. = 90%, SNR (dB) = 24.49, Simulation Time (Secs) = 1.21
SNR Performance for Gray Image
Table: SNR values for gray image (all values in dB)
Method   CR = 90%   CR = 85%   CR = 75%
DCT      24.49      26.32      29.26
KLT      28.47      29.23      30.92
DWT      29.06      30.29      31.24
Simulation Time Performance for Gray Image
Table: Simulation time for gray image (all values in seconds)
Method   CR = 90%   CR = 85%   CR = 75%
DCT      1.21       1.15       1.1
DWT      1.34       1.33       1.31
KLT      3.95       3.79       3.68
Output File of Color Image Using DCT
Original image size (bytes) = 196608, Compressed image size (bytes) = 58896, C.R. = 70%, SNR (dB) = 23.05, Simulation Time (Secs) = 3.57
Output File of Color Image Using DCT
Original image size (bytes) = 196608, Compressed image size (bytes) = 18978, C.R. = 90%, SNR (dB) = 17.51, Simulation Time (Secs) = 3.68