Understanding JPEG

MIT-CETI Xi’an ‘99

Lecture 10

Ben Walter, Lan Chen, Wei Hu

What is JPEG?

• JPEG is a method for compressing image data so it takes less space to store or transmit across a network.

• JPEG is very efficient. A file that was 1 MB in size could be compressed to as little as 25 KB (1:40)!

• JPEG achieves such good compression ratios because it is lossy - but at reasonable quality settings the loss is barely perceptible.

Overview

• Images contain different frequencies; low frequencies correspond to slowly varying colors, high frequencies correspond to fine detail.

• The low frequencies are much more important than the high frequencies; we can throw away some high frequencies to compress our data!

Overview

• Note that we aren’t talking about the frequencies of light, but about how quickly the light and dark areas vary across the image!

• We need a way to go from the color of pixels, which is essentially a number, to frequencies…

• This way is called the Discrete Cosine Transform (DCT).

A JPEG Encoder

DCT → Quantizer → Entropy Encoder

The Discrete Cosine Transform

[Figure: a 16-pixel line, plotted as Color of Pixel against X Position of Pixel, next to its transform, plotted as Intensity against Frequency.]

[Figure: the same line written as a sum of 16 waves with weights x1, x2, …, x15, x16.]

The 2D DCT

• So far we’ve been talking about one-dimensional images, just one line of the picture… but an image has two dimensions.

• We can talk about frequencies in two dimensions, although it’s much harder to visualize.

Basis

• Remember we saw that every 16-pixel line can be written as the sum of 16 different waves?

• Those 16 waves formed a basis for the set of 16-pixel lines.
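The idea of writing a line as a sum of 16 waves can be sketched in a few lines of Python. This is only an illustration, assuming the orthonormal DCT-II form of the cosine waves; the function names and the sample `line` are hypothetical and do not come from the lecture.

```python
import math

N = 16  # pixels in one line, as on the slides


def basis_wave(k, n=N):
    """Return the k-th orthonormal DCT-II cosine wave, sampled at n pixels."""
    scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    return [scale * math.cos(math.pi * (2 * x + 1) * k / (2 * n)) for x in range(n)]


def dct_1d(line):
    """Weights x_k: how much of each basis wave is present in the line."""
    n = len(line)
    return [sum(p * w for p, w in zip(line, basis_wave(k, n))) for k in range(n)]


def idct_1d(weights):
    """Rebuild the line as the weighted sum of the basis waves."""
    n = len(weights)
    line = [0.0] * n
    for k, weight in enumerate(weights):
        for x, w in enumerate(basis_wave(k, n)):
            line[x] += weight * w
    return line


# Hypothetical 16-pixel line: a smooth ramp with a small bump.
line = [8, 8, 9, 9, 10, 10, 11, 12, 14, 13, 12, 12, 11, 11, 10, 10]
weights = dct_1d(line)     # 16 numbers, one per frequency
rebuilt = idct_1d(weights)  # matches `line` up to rounding error
```

Because the 16 waves form a basis, `idct_1d(dct_1d(line))` gives back the original line; nothing is lost at this stage.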

[Figure: the 16 basis waves, each plotted against pixel position.]

Basis

• When we compress an image with JPEG, we work in blocks of 8x8 pixels. That’s 64 numbers, so there are 64 different basis images.

• This means we can describe any 8x8 image as a combination (a sum) of those 64 images.
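One way to picture the 64 basis images is as products of a vertical wave and a horizontal wave. The sketch below assumes the orthonormal DCT-II waves; `wave`, `basis_image` and `dot` are hypothetical helper names chosen for this example.

```python
import math

N = 8  # JPEG block size


def wave(k, n=N):
    """The k-th orthonormal DCT-II cosine wave, sampled at n pixels."""
    scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    return [scale * math.cos(math.pi * (2 * x + 1) * k / (2 * n)) for x in range(n)]


def basis_image(u, v, n=N):
    """8x8 pattern with vertical frequency u and horizontal frequency v."""
    col, row = wave(u, n), wave(v, n)
    return [[col[y] * row[x] for x in range(n)] for y in range(n)]


def dot(a, b):
    """Sum of pixel-by-pixel products of two images."""
    return sum(p * q for ra, rb in zip(a, b) for p, q in zip(ra, rb))


# One basis image per (u, v) pair: 8 * 8 = 64 of them.
basis = {(u, v): basis_image(u, v) for u in range(N) for v in range(N)}
```

The images in `basis` are mutually orthogonal (`dot` of two different ones is zero up to rounding), which is why the weight of each one in a block can be found independently of the others.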

The 2D DCT

[Figure: two surface plots over the 8x8 block grid.]

Summary

• The Discrete Cosine Transform (DCT) allows us to determine what frequencies make up an image.

• The input to this stage is 8x8 numbers: the values of the pixels.

• The output of this stage is 8x8 numbers that represent how much of each frequency (or how much of each basis image) is in the block.
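The stage described above (8x8 pixel values in, 8x8 frequency weights out) could be sketched as follows. This assumes the orthonormal DCT-II applied first along one direction and then along the other; the helper names and the sample `block` are hypothetical.

```python
import math

N = 8  # JPEG block size


def dct_matrix(n=N):
    """Row k holds the k-th orthonormal DCT-II cosine wave."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
                     for x in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct_2d(block):
    """Pixel values -> frequency weights (C * block * C^T)."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct_2d(coeffs):
    """Frequency weights -> pixel values (C^T * coeffs * C)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


# Hypothetical block: brightness rising smoothly from left to right.
block = [[100 + 10 * x for x in range(N)] for _ in range(N)]
coeffs = dct_2d(block)     # 8x8 numbers, one per basis image
restored = idct_2d(coeffs)  # matches `block` up to rounding error
```

For this smooth block almost all of the weight ends up in the top-left (low-frequency) corner of `coeffs`, which is what the later stages exploit.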

A JPEG Encoder

DCT → Quantizer → Entropy Encoder

Quantization

• So we still have 64 numbers to work with - we haven’t reduced the size at all!

• The reason we wanted the numbers as frequencies is that some frequencies are more important than others.

• The low frequencies are the most important, the high frequencies are not very important (think back to building up the image).

Quantization

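• Before quantization, each frequency coefficient can span a wide range; for 8-bit pixels it can be anywhere from about -1024 to 1023.

• To quantize, we divide each coefficient by a number and round the result, so that the range is reduced. For example, dividing by 8 turns a range of 0 to 255 into 0 to 31. For high frequencies we divide by a larger number.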
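A minimal sketch of this divide-and-round step is shown below. The `divisors` table is made up for illustration (it simply grows with frequency) and is not one of the tables used by real JPEG encoders; the `coeffs` block is likewise hypothetical.

```python
N = 8  # JPEG block size


def quantize(coeffs, divisors):
    """Divide each coefficient by its divisor and round to the nearest integer."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, divisors)]


def dequantize(levels, divisors):
    """Approximate the original coefficients; the rounding error is lost for good."""
    return [[level * q for level, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, divisors)]


# Hypothetical divisors: small for low frequencies, large for high ones.
divisors = [[8 + 4 * (u + v) for v in range(N)] for u in range(N)]

# Hypothetical coefficients that shrink as the frequency rises.
coeffs = [[400.0 / (1 + u + v) ** 2 for v in range(N)] for u in range(N)]

levels = quantize(coeffs, divisors)
approx = dequantize(levels, divisors)
zeros = sum(level == 0 for row in levels for level in row)
```

With these made-up numbers most entries of `levels` become zero, and every value in `approx` differs from the original by at most half of its divisor; larger divisors mean more zeros and larger errors.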

Quantization

• Before we had, say: 134, 113, 145, 117, 32, 11, 17, 5, … 4.

• After quantization, we might have: 116, 55, 55, 30, 1, 0, 0, … 0.

Quantization

Original: 300 kB. Medium Quality JPEG: 75 kB.

Quantization

Original: 300 kB. Low Quality JPEG: 35 kB.

Summary

• The degree of quantization dictates the amount of information “thrown away”.

• If you throw away more information, you will get better compression, but the picture will start to look bad.

• When you adjust the quality of a JPEG save from Photoshop, you are changing the quantization!

A JPEG Encoder

DCT → Quantizer → Entropy Encoder

Entropy Encoding

• Entropy encoding is another stage of compression that relies on statistical properties of the data, e.g. some numbers occurring much more often than others, or long runs of the same number in a row.

• So we take the 64 numbers, do Run Length Encoding, then follow that with Huffman Coding! (Remember yesterday?)
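The Run Length Encoding step can be sketched as below. This is generic run-length encoding, not the exact scheme a real JPEG encoder uses, and the 64-number `quantized` list is hypothetical: it starts like the quantized example from the earlier slide and is padded with zeros as an assumption.

```python
def run_length_encode(values):
    """Collapse each run of equal values into a (value, run_length) pair."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]


def run_length_decode(runs):
    """Expand (value, run_length) pairs back into the full list."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out


# Hypothetical quantized block: a few non-zero values, then all zeros.
quantized = [116, 55, 55, 30, 1] + [0] * 59
runs = run_length_encode(quantized)
# runs == [(116, 1), (55, 2), (30, 1), (1, 1), (0, 59)]
```

Here 64 numbers shrink to 5 pairs, and `run_length_decode(runs)` restores the list exactly; the pairs would then be passed on to Huffman Coding.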

Entropy Encoding

• These compression schemes now work very well, because quantization turns numbers like 132, 117, 78 into numbers more like 31, 31, 15.

• After quantization, the range of numbers is smaller, and there are often large runs of numbers - so it can be highly compressed!

• This is the stage where the data actually gets smaller!
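To see why a smaller range of values helps, here is a rough sketch that counts how many bits a Huffman code would need for two short lists. Both lists are hypothetical, the bit counts ignore the cost of storing the code table, and the function names are made up for this example.

```python
import heapq
from collections import Counter


def huffman_code_lengths(values):
    """Return {symbol: code length in bits} for a Huffman code over `values`."""
    counts = Counter(values)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encoded_bits(values):
    """Total bits needed to write `values` with their Huffman code."""
    lengths = huffman_code_lengths(values)
    return sum(lengths[v] for v in values)


before = [132, 117, 78, 64, 51, 40, 33, 21]  # hypothetical: all values differ
after = [31, 31, 15, 0, 0, 0, 0, 0]          # hypothetical: few distinct values

bits_before = encoded_bits(before)  # 24 bits: 3 bits for each of 8 symbols
bits_after = encoded_bits(after)    # 11 bits: the frequent 0 gets a 1-bit code
```

The second list needs fewer bits because frequent values get short codes; in practice the runs would first be shortened by Run Length Encoding as well.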

Summary

DCT → Quantizer → Entropy Encoder

Summary

• We break up the image into 8x8 blocks.

• We calculate the frequencies in each block; this allows us to identify the important and less important data.

• We throw away some less important data.

• We compress the resulting data.

• The result: ~ 1:40 compression!