MLT Report Vfinal2
8/4/2019 MLT Report Vfinal2
Modulated Lapped Transform based image compression
Course: Data Compression
Students: Paula Popa, Cosmin Caba
2011
Table of Contents
1. Introduction
2. Transform block
3. Quantization block
4. Entropy Encoding Block
4.1 Entropy Encoding Part 1 : Huffman Encoding
4.2 Entropy Encoding Part 2
5. Measurements
5.1. Code length measurements and entropy estimation
5.2. Objective and Subjective assessment
6. Conclusion
7. Bibliography
8. Appendix
1. Introduction
The general purpose of the project is to implement a coder and a decoder and to test the
implemented modules on some test data. The implementation follows the block diagram presented
in figure 1:
Figure 1: Block diagram of encoder
The first module is the Transform block. The purpose of this block is to decorrelate the image.
The second block includes both the Quantization of the transformed matrix and the ZigZag traversal of
the quantized matrix. The result obtained after the ZigZag traversal is coded as a run length vector.
The last block is the Entropy encoding module. The aim of this block is to code the run length vector,
losslessly, into the final bit stream. More details about how these techniques have been implemented
are given in the next chapters.
After the encoding and decoding processes have been performed we evaluate the quality of
the reconstructed image in comparison to the original one. The report also includes several
measurements related to the code length and entropy estimation.
We have chosen to work with the Modulated Lapped Transform (MLT) in order to decorrelate the
initial image. This decision is based on the fact that, as reported in many articles, lapped
transforms perform very well in reducing the blocking effect in the reconstructed image. For
Quantization we have chosen to work with a standard matrix quantization, similar to the JPEG
standard. For the Entropy encoding the Huffman algorithm is used.
We have to mention that the goal of this project is to assess the coding in terms of
compression ratio, coding rate and how close the reconstructed image is to the original one (objective
and subjective assessment). We do not intend to measure the performance of the implemented
algorithms in terms of speed, complexity or memory required; therefore the implementation is not
focused on optimizing the algorithms but only on providing the right results.
The report is structured as follows: first we describe the coding principle according to the
diagram in figure 1, then the implementation and the tests that were carried out are presented,
afterwards the results are discussed, and finally the conclusions are drawn.
Figure 1 diagram: Input image -> T -> Q -> E -> Binary bit stream
2. Transform block
As previously mentioned, the transform used in this block is the Modulated Lapped Transform.
MLT is basically a block transform and it operates on the image similarly to the DCT. A lapped
transform is considered an extension of the regular block transform. The difference is that for lapped
transforms the basis functions are longer than for the DCT, leading to overlapping between neighboring
blocks. Also, the basis functions of the MLT decay to 0 very smoothly, thus reducing the block
artifacts. We have decided to use an 8x16 transform matrix. This is a compromise between quality and
complexity.
Next we will show how the transformation matrix has been built. Let N be the number of basis
functions in the matrix and L be the length of one function. Therefore we have N=8 and L=16 for our
case. We denote the matrix P and the elements of the matrix are computed as follows:
where k=1..8 and n=1..16. h(n) is a window that shapes (modulates) the basis functions in such
a way that both ends have a continuous transition to 0. h(n) is defined as:
The next figure depicts the basis functions computed with the previous formulas for the case of
N=8 and L=16:
Figure 2: MLT basis functions for N=8, L=16
The obtained functions (figure 2) are similar to those provided in references [1] and [2].
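As an illustration of how such a matrix can be built, the following minimal sketch uses the standard MLT definition from the literature with zero-based indices, p(n,k) = h(n) sqrt(2/M) cos[(n + (M+1)/2)(k + 1/2) pi/M] and the sine window h(n) = sin[(n + 1/2) pi/(2M)]. Both the formula convention and the function name `mlt_basis` are assumptions; the indexing used in the report (k=1..8, n=1..16) may differ by an index shift.

```python
import math

def mlt_basis(M=8):
    """Return an M x 2M matrix whose rows are MLT basis functions.

    Assumption: standard zero-based MLT definition with a sine window;
    this is not necessarily the exact convention used in the report.
    """
    P = []
    for k in range(M):
        row = []
        for n in range(2 * M):
            # Sine window: goes smoothly towards 0 at both ends of the function.
            h = math.sin((n + 0.5) * math.pi / (2 * M))
            c = math.cos((n + (M + 1) / 2) * (k + 0.5) * math.pi / M)
            row.append(h * math.sqrt(2.0 / M) * c)
        P.append(row)
    return P

P = mlt_basis(8)  # 8 basis functions of length 16

# Gram matrix of the rows; for an orthonormal set it is the identity.
gram = [[sum(a * b for a, b in zip(P[i], P[j])) for j in range(8)]
        for i in range(8)]
```

The Gram matrix computed at the end is a quick sanity check: the rows of an orthonormal transform have unit norm and are mutually orthogonal.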
achieved because of the distortion that the transform introduces in the image. Only 4 pixels at each of
the four edges are affected by this distortion. We have tried to eliminate, or at least to reduce as
much as possible, the effect of the distortion, but this is the best we obtained. The technique used is
to mirror 4 of the pixels around the edge. This problem appears only for lapped transforms and not for
block transforms like the DCT. Although this may not affect human perception, and thus the subjective
assessment of the reconstructed image, it still affects the objective assessment. When the distortion
is evaluated we will take this imperfection at the borders into consideration as well.
The next figure illustrates the reasons behind this behavior.
Figure 5: Mapping of pixels using MLT
In figure 5 the case of the one-dimensional transform is depicted. Each block of 8 pixels from the
original data is transformed into a block of 8 values. But the input of the transform is 16 pixels,
thus we also take 4 pixels from the left side and 4 from the right side. For the first and last block
of 8 pixels there is a need to add 4 extra pixels on the outer side (the two red blocks). In our
implementation these 4 pixels are mirrored from the first 4 and the last 4 pixels. The blue blocks are
the overlapping ones. They partially overlap with the neighboring 8 pixel blocks. The problem is that
with the added pixels (red blocks) we introduce some additional information into the transformed
image, which cannot be removed again when doing the inverse transformation.
As for the inverse transformation, we compute the inverse transform matrix as the
transpose of the forward transform matrix (we take advantage of the fact that we deal with an
orthonormal transform). This time the input of the transform module is a block of 8 values and
the output is 16 pixels. We still need to map each input block to an output of 8 pixels, thus at the
output we overlap consecutive blocks and sum the overlapped values in order to get the output
sequence. Another technique that we have used in order to minimize the distortion at the borders is to
duplicate the first and the last 8 pixel blocks before doing the inverse transform and to take into
account the influence of these replicated blocks as well when computing the reconstructed data stream.
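The forward and inverse one-dimensional steps described above can be sketched as follows. This is only an illustration under stated assumptions: the basis matrix uses the standard zero-based MLT definition, the names `mlt_forward_1d` and `mlt_inverse_1d` are hypothetical (they are not the report's MLTenc1D and MLTdec1D), and the extra block duplication used by the authors at the borders is not included.

```python
import math

M = 8  # block size; each basis function spans 2 * M = 16 samples

# Assumption: standard MLT basis (sine window), rows are basis functions.
P = [[math.sin((n + 0.5) * math.pi / (2 * M)) * math.sqrt(2.0 / M)
      * math.cos((n + (M + 1) / 2) * (k + 0.5) * math.pi / M)
      for n in range(2 * M)] for k in range(M)]

def mlt_forward_1d(x):
    """Transform a list whose length is a multiple of M into blocks of M coefficients."""
    half = M // 2
    # Mirror 4 samples at each end so the first and last blocks see 16 inputs.
    padded = x[:half][::-1] + list(x) + x[-half:][::-1]
    blocks = []
    for start in range(0, len(x), M):
        window = padded[start:start + 2 * M]
        blocks.append([sum(P[k][n] * window[n] for n in range(2 * M))
                       for k in range(M)])
    return blocks

def mlt_inverse_1d(blocks):
    """Apply the transposed matrix to each block and overlap-add the 16-sample outputs."""
    half = M // 2
    out = [0.0] * (len(blocks) * M + 2 * half)
    for b, coeffs in enumerate(blocks):
        for n in range(2 * M):
            out[b * M + n] += sum(P[k][n] * coeffs[k] for k in range(M))
    return out[half:-half]  # drop the mirrored padding

signal = [float((7 * i) % 23) for i in range(32)]
restored = mlt_inverse_1d(mlt_forward_1d(signal))
```

In this sketch every sample covered by two overlapping blocks is reconstructed exactly, while the first and last 4 samples are covered by a single block only and are therefore not guaranteed to match, which is consistent with the border distortion discussed above. A 2-D version would apply the same routine to every row and then to every column.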
The property of separability has been used to implement the two-dimensional transform (2-D
transform) by first doing a one-dimensional transform on each row of the input image and then a
one-dimensional transform on every column.
The functions that can be used for the forward transform are MLTenc1D (one-dimensional transform)
and MLT_enc (2-D transform), and for the inverse transform MLTdec1D and MLT_dec.
The transformed matrix that we obtained for the input image Lenna has almost all the information
concentrated in the DC coefficient (the first value in an 8x8 block); the rest of the coefficients
decay to 0. The next figure shows the first transformed block of the image Lenna:
Figure 6: 8x8 pixels transformed block
The purpose of the transformation process has been achieved. The pixels are not correlated as
much as they are in the original image. There is still some correlation, which we are going to address
in the next block of the encoder, namely the Quantization block.
3. Quantization block
Compression starts with quantization. Each 8x8 block of the picture, after the MLT
has been applied, is quantized by dividing each element of the 8x8 matrix by the corresponding
element of the quantization matrix and rounding to the nearest integer value. The image quality is
controlled by selecting specific quantization tables. The quality levels range between 1 and 100,
where 1 gives the poorest image quality and the highest compression and 100 gives the best quality and
the lowest compression. In the project we use the quality level 50 matrix because it is a trade-off
between image quality and compression level, providing a very good image quality and high compression.
The quantization matrix used is:
(
)
To obtain a different quality level the matrix Q50 is scaled. For a quality level above 50 (better
quality), Q50 is multiplied by (100 - quality level)/50; for a quality level below 50 (higher
compression), it is multiplied by 50/(quality level). Thus if we want to obtain a quality level of 90,
the Q50 matrix is multiplied by 10/50. The elements of the new matrix are then rounded and clipped so
that they are positive integers in the interval from 1 to 255. For the measurements in the project we
used the Q10, Q30, Q70 and Q90 quantization matrices to compare the differences in image quality with
the image compressed with the standard Q50 matrix.
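The scaling and quantization steps can be sketched as below. Two assumptions are made and should be read as such: the standard JPEG luminance table is used as a stand-in for Q50 (the report only states that its matrix is similar to the JPEG standard), and the common JPEG-style scaling rule is used, i.e. 50/q for quality levels below 50 and (100 - q)/50 for levels of 50 and above. The function names are hypothetical.

```python
import math

# Assumption: standard JPEG luminance table used as a stand-in for Q50.
Q50 = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def scaled_table(quality):
    """Scale Q50 to another quality level, then round and clip to 1..255."""
    scale = 50.0 / quality if quality < 50 else (100.0 - quality) / 50.0
    return [[min(255, max(1, math.floor(q * scale + 0.5))) for q in row]
            for row in Q50]

def quantize(block, table):
    """Divide each coefficient by its table entry and round to the nearest integer."""
    return [[math.floor(block[i][j] / table[i][j] + 0.5) for j in range(8)]
            for i in range(8)]

def dequantize(qblock, table):
    """Approximate inverse used by the decoder; the rounding loss is not recovered."""
    return [[qblock[i][j] * table[i][j] for j in range(8)] for i in range(8)]

block = [[0.0] * 8 for _ in range(8)]
block[0][0] = 160.0  # a block that only has a DC component
q = quantize(block, scaled_table(50))
```

With this rule a higher quality level yields smaller table entries, so fewer coefficients are rounded to zero and the file grows, which matches the trend of the measurements reported later.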
The other quantization matrices Q10 and Q90 will be equal to:
(
)
(
)
Thus after quantization the coefficients situated in the upper-left corner of the image block
have bigger values and correspond to the lower frequencies, to which the human eye is more sensitive.
Most of the remaining coefficients are zero and are not used for the reconstruction of the image
(this is the lossy part of the compression). More measurements have been done for the above
quantization matrices. The difference in image quality can be seen in the pictures.
The quantized coefficients are now prepared for coding. The coefficients are traversed in
ZigZag order, which groups the large number of zeros together so that they compress much better. The DC
coefficients are differentially coded and the AC coefficients are run-length coded. The DC coefficients
are treated separately because they contain a big fraction of the image energy. In the differential
coding, the DC coefficient of the previous block is subtracted from the DC coefficient of the current
block. The difference obtained is then encoded, in order to exploit the spatial correlation of the DC
coefficients of neighboring blocks. Because the difference between DC coefficients is small, fewer bits
are needed for encoding it. After the run length coding we have pairs of the form (symbol, frequency of
symbol), where the frequency of the symbol is the number of consecutive occurrences of a value. The
encoded values are stored in a single vector and then Huffman encoded. This process (ZigZag ordering,
differential and run-length coding) makes the entropy coding easier and more efficient.
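The three steps of this paragraph can be sketched as follows. The function names are hypothetical and the sketch only illustrates the principle; how the report's implementation interleaves DC differences and AC runs in its single vector is not specified here.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    # Positions are grouped by anti-diagonal (row + col); the traversal
    # direction alternates from one diagonal to the next.
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def zigzag_scan(block):
    """Flatten a square block into a list following the zigzag order."""
    return [block[i][j] for i, j in zigzag_order(len(block))]

def dc_differences(dc_values):
    """Differential coding: each DC value minus the previous block's DC value."""
    previous, out = 0, []
    for dc in dc_values:
        out.append(dc - previous)
        previous = dc
    return out

def run_length_encode(values):
    """Collapse runs of equal values into (symbol, number of occurrences) pairs."""
    pairs = []
    for v in values:
        if pairs and pairs[-1][0] == v:
            pairs[-1][1] += 1
        else:
            pairs.append([v, 1])
    return [tuple(p) for p in pairs]

print(run_length_encode([5, 0, 0, 0, 3, 3]))  # [(5, 1), (0, 3), (3, 2)]
```

Because the zigzag order visits the low-frequency positions first, the many zero-valued high-frequency coefficients end up in long runs at the end of the scan, which is what makes the run-length step effective.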
Figure 7 :Left: Zigzag ordering for a 8x8 block; Right: Differential coding [1]
4. Entropy Encoding Block
After quantization, which is the lossy part of the compression, this block aims to compress the data
in a lossless manner. The biggest part of this block is about Huffman encoding. We have chosen Huffman
instead of Arithmetic coding because we found it easier to implement and faster to decode, even though
arithmetic coding can in general reach a compression ratio that is at least as good. The disadvantage
of Huffman is that the codebook may be too large, but this is not our case. The Huffman encoding is
applied to the run length vector that is obtained in the Quantization phase.
4.1 Entropy Encoding Part 1 : Huffman Encoding
For the entropy coding we use the Huffman algorithm, a variable length code which uses fewer bits to
encode the most probable (most frequent) symbols and more bits for the less probable
symbols. The reason for using the Huffman algorithm is that it is much easier to implement and not as
complex as the Arithmetic algorithm. The Arithmetic algorithm is slower and needs more time for
decoding than Huffman, which only requires looking up the codeword corresponding to each symbol in a
table. The Huffman algorithm generates compact, optimal prefix codes, which are stored in a codebook.
The encoded data and the codebook are needed at the decoder to obtain the initial symbols. The symbols
are represented with a prefix-tree code, so the bit sequence that represents a symbol is never
a prefix of the bit sequence that represents another symbol.
In the project we use the static Huffman algorithm and we generate a binary tree by taking the two
least probable symbols and merging them to form a new symbol with a probability equal to the sum of the
probabilities of the two symbols. The process continues until only one symbol remains (a single node,
the root node). As part of the Huffman algorithm we use a function that generates the binary tree and
the code words used to encode the symbols. The Huffman dictionary is created from the binary tree by
adding zeros and ones to the code words of the two least probable symbols at each step, until no
symbols are left. The code words are stored in a vector with as many entries as there are symbols,
sorted according to their corresponding symbol frequency. The lowest weights are in the first
positions, so the most frequent symbols are at the end of the vector. For this reason, when decoding,
the vector of code words is evaluated in descending order, starting from the last position: the most
probable code words are tested first, which saves time.
At the receiver side the decoding process takes place. The bit stream is decoded using the
codebook to recover the initial symbols. From the received bit stream we take a number of
bits equal to the length of a codeword from the dictionary and compare them with that codeword. If the
values are the same we take the corresponding symbol and store it in a vector. The code words from the
dictionary vector are evaluated in decreasing order, to be more efficient. Once we find a symbol we
start again from the last values of the vector, because the most frequent symbols are at
the end of the vector. After we obtain the Huffman decoded values, the symbols are differentially and
run-length decoded and then rearranged from zigzag order back into blocks.
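A compact sketch of static Huffman coding is given below. It is not the report's Matlab implementation: it uses a priority queue instead of a sorted vector, and decoding is done by growing a bit string until it matches a codeword rather than by scanning the dictionary from its end. The function names are hypothetical.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a {symbol: codeword} dictionary from the symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: partial codeword}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least probable nodes; prefix their codewords with 0 and 1.
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

def huffman_encode(symbols, code):
    return "".join(code[s] for s in symbols)

def huffman_decode(bits, code):
    """Prefix-free codes can be decoded greedily, one bit at a time."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out

data = [0] * 5 + [1] * 2 + [2] + [3]
code = huffman_code(data)
bits = huffman_encode(data, code)
```

For this small input the most frequent symbol receives a 1-bit codeword and the two rarest symbols receive 3-bit codewords, so the 9 symbols are coded in 15 bits.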
After encoding the run length vector we must provide the receiver side with some extra
information in order to be able to reconstruct the Huffman codebook identically to the one used in the
encoding side. The algorithm and techniques used for this are described in the next subchapter.
4.2 Entropy Encoding Part 2
At this point we have our picture encoded using the Huffman algorithm. In order to decode this
data at the receiver, we need to have the same codebook. There are many possibilities to make the
codebook available at the receiver side. One of them is to send the codebook from the encoder to the
decoder in such a way that the decoder knows how to read it from the file. Another option is to send
some data that can help the decoder to reconstruct the same codebook that has been used to encode the
run length vector. A format for the data and a protocol have to be established for this case as well.
The solution we have chosen is to send the symbols and the frequencies associated with them
to the decoder so that it can rebuild the codebook. Even if this solution might be slower than sending
the codebook directly, we are not concerned with the speed of the algorithms.
First we describe what the file looks like when all the necessary information has been
written in it. We may think of it as a packet which has a header and a payload. The header contains
the necessary information to rebuild the Huffman codebook and the payload contains the data that
needs to be decoded using this codebook.
Figure 8: File structure
Basically the Header contains the information regarding the symbols and frequencies taken from
the run length vector. The next step is to encode this information in such a way that the decoder knows
how to interpret it. For this we have used a protocol.
Because each symbol might need a different number of bits to be represented, a fixed symbol
length would not be appropriate. Thus we represent each symbol with a variable number of bits, and to
delimit consecutive values we use a fixed sequence of bits: 0 1 1 1 1 0.
Before being encoded using this protocol the information is structured as in the next figure:
Figure 9: Header structure before encoding
The header consists of two rows: one with the symbols and the other with the frequencies.
The columns are sorted so that the symbols are in increasing order, with the value 0 being
somewhere in the middle.
We start by taking the first symbol and transforming it to its binary representation. The result is
written in a vector and then the delimiter 0 1 1 1 1 0 is added to the vector. Next the frequency
associated with this symbol is transformed to binary and added, followed by the delimiter. After this
we move to the next column and keep doing this until the end of the header. We based our protocol on
the fact that the delimiter will not appear in the middle of the data very often. If during the binary
transformation we encounter the delimiter 0 1 1 1 1 0 inside the data, then a 0 is added after each
sequence of three consecutive 1s. At the receiver side, whenever the decoder finds three consecutive
1s it removes the next bit, which should be the 0 we inserted previously, and whenever it finds the
delimiter it reads the preceding bit sequence and does the transformation from binary to decimal,
reconstructing the header as illustrated in figure 9.
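The delimiter and bit stuffing idea can be sketched as follows. This is a simplified variant and not the exact protocol of the report: here a 0 is stuffed after every run of three 1s in every field (not only when the delimiter pattern is encountered), which guarantees that four consecutive 1s only ever occur inside a delimiter. Only magnitudes are written, since the sign is not encoded; the function names are hypothetical.

```python
DELIMITER = "011110"

def stuff(bits):
    """Insert a 0 after every run of three consecutive 1s."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 3:
            out.append("0")
            run = 0
    return "".join(out)

def unstuff(bits):
    """Remove the 0 that follows every run of three consecutive 1s."""
    out, run, skip = [], 0, False
    for b in bits:
        if skip:  # this is the stuffed 0, drop it
            skip = False
            continue
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 3:
            skip, run = True, 0
    return "".join(out)

def encode_header(pairs):
    """Write |symbol| and frequency of each column in binary, each followed by the delimiter."""
    fields = []
    for symbol, frequency in pairs:
        fields += [abs(symbol), frequency]
    return "".join(stuff(format(v, "b")) + DELIMITER for v in fields)

def decode_header(bits):
    """Split on the delimiter and convert each field back to an integer."""
    values = [int(unstuff(f), 2) for f in bits.split(DELIMITER) if f]
    return list(zip(values[0::2], values[1::2]))

header_bits = encode_header([(-7, 15), (0, 3), (30, 1), (63, 2)])
```

Decoding `header_bits` returns the magnitudes and frequencies in their original column order; restoring the signs from the sorted order of the header is a separate step and is not part of this sketch.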
We do not encode the sign of the symbol. For this we take advantage of the header being
sorted. The decoder adds a negative sign in front of the symbols, starting from the first one, until it
finds 0 or it discovers that the order of the symbol magnitudes switches from decreasing to increasing.
At this point we have two vectors with binary values: the first one is encoded using the Huffman
algorithm and represents the picture itself, and the second one represents the information needed to
reconstruct the Huffman codebook and is encoded using the protocol described earlier.
The next logical step is to write these two vectors into a file. When writing to a file in Matlab
we have to specify what data type we are writing (int8, int16, int32, etc.). Each value we
write is represented in the file using the precision of that specific data type. The smallest
precision is that of uint8, with which each value is represented using only 8 bits. The problem is that
the values in our two vectors are bits, so there is a huge amount of wasted capacity because we
represent 1 bit using 8 bits.
In order to address this issue we came up with a solution that takes the bits in blocks of 8 and
transforms each block of 8 bits into the corresponding integer value. The resulting integer is then
written into the file as a uint8 value, thus using 8 bits. In this way we take 8 bits and represent
them in the file using 8 bits. Therefore, before writing the binary header into the file, we transform
it into a vector of integers. The same goes for the binary vector coded using the Huffman algorithm.
The two resulting vectors of integers are finally written into the file.
The decoder has to take the vectors from the file and represent the integer values in binary,
obtaining the binary header and the Huffman binary vector. Then, from the binary header, using the
protocol presented above, it computes the header with the symbols and the frequencies associated with
the symbols. Using this header information the codebook is easily reconstructed and the Huffman
algorithm can be applied in order to obtain the run length vector.
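The packing of bits into 8-bit integers can be sketched as below. The report does not say how a final block with fewer than 8 bits is handled, so this sketch makes an assumption: the last block is padded with zeros and the decoder is told the number of valid bits. The function names are hypothetical.

```python
def pack_bits(bits):
    """Group a list of 0/1 values in blocks of 8 and return one byte per block."""
    # Assumption: the last block is padded with zeros up to 8 bits.
    padded = bits + [0] * (-len(bits) % 8)
    return bytes(int("".join(map(str, padded[i:i + 8])), 2)
                 for i in range(0, len(padded), 8))

def unpack_bits(data, n_bits):
    """Expand each byte into 8 bits (most significant first) and drop the padding."""
    bits = []
    for byte in data:
        bits.extend((byte >> shift) & 1 for shift in range(7, -1, -1))
    return bits[:n_bits]

original = [1, 0, 0, 0, 0, 0, 0, 1, 1]
packed = pack_bits(original)            # two bytes: 129 and 128
restored = unpack_bits(packed, len(original))
```

Stored this way, nine bits occupy two bytes instead of nine, which is the saving the paragraph above describes.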
5. Measurements
The first part of the measurements chapter comprises the results about code length and
estimates of the entropy at different points during the encoding process. The measurements are
performed on the Lenna image.
5.1. Code length measurements and entropy estimation
In the quantization stage we have used 5 different quantization tables, thus we expect the
encoder/decoder modules to perform differently for each case. The most common quantization table,
which is also used in the JPEG standard, is Q50. First we perform some measurements on the test data
for this case (with Q50) and then we compare and relate them to the other cases (Q10, Q30, Q70 and
Q90).
The code length is computed as the final file size expressed in bits divided by the total number
of pixels in the compressed image. The result is the following:
Code length = 215872 bits / 262144 pixels = 0.8235 bits/pixel
Using the code length we may compute the compression ratio as well:
Compression ratio: 8 / 0.8235 = 9.7146
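The two figures above can be reproduced with a few lines. The function names are hypothetical; the image size of 512 x 512 is inferred from the 262144 pixels quoted above, and 8 bits per pixel is taken as the rate of the original image.

```python
def code_length(file_size_bits, n_pixels):
    """Average number of bits spent per pixel in the compressed file."""
    return file_size_bits / n_pixels

def compression_ratio(bits_per_pixel, original_bits_per_pixel=8):
    """Ratio between the original rate and the rate of the compressed image."""
    return original_bits_per_pixel / bits_per_pixel

rate = code_length(215872, 512 * 512)   # about 0.8235 bits/pixel
ratio = compression_ratio(rate)         # about 9.71
```

The ratio computed from the unrounded rate differs from 9.7146 only in the fourth decimal, because the report divides by the rounded value 0.8235.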
We assumed that the initial image needs 8 bits to represent a pixel. In order to evaluate how
good our compression algorithm is, we have to relate the code length to the entropy estimate. There
are many ways of estimating the entropy. Here the entropy is estimated using the probability
model of the source by applying the following formula:
$$H = -\sum_{i} p_i \log_2 p_i$$
where p_i is the estimated probability (relative frequency) of the i-th symbol value.
First we estimate the entropy of the original image without taking into consideration
any correlation between pixels, i.e. we assume i.i.d. pixels. The entropy in this case is
H=7.2185. It is not relevant to compare this entropy to the code length, but if we assume that
initially one pixel is represented using 8 bits (no correlation is exploited and each pixel is coded
individually), thus a rate of 8 bits/pixel, then the entropy estimate is very close to this value.
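An entropy estimate under the i.i.d. assumption only needs the histogram of the values. The following sketch (with the hypothetical name `entropy_estimate`) computes the first-order entropy in bits per symbol from the relative frequencies; it can be applied to the pixels, to the transform coefficients or to the quantized coefficients alike.

```python
import math
from collections import Counter

def entropy_estimate(values):
    """First-order entropy in bits/symbol, using relative frequencies as probabilities."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(entropy_estimate([0, 0, 1, 1]))   # 1.0: two equally likely values
print(entropy_estimate([0, 1, 2, 3]))   # 2.0: four equally likely values
```

Because the estimate treats every value as independent of its neighbors, it cannot reflect any remaining dependence between them, which matters when it is compared with the code length of a coder that does exploit such dependence.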
The purpose of the transform block is to decorrelate the initial image, thus improving the
coding. If we estimate the entropy of the matrix containing the transform coefficients the result is
H=4.3889. In this case the entropy has decreased significantly because the coefficients are not as
correlated to each other as the pixels in the original image. Still, the difference between the code
length and the entropy estimate is very big. The explanation is that a part of the information
contained in the image is discarded by the Quantization block. Quantization is not about improving the
coding gain but only about discarding some of the information which is not so relevant for a good
reconstruction of the original image.
The entropy estimate of the quantized coefficients is H=0.8591. This time we notice that the
entropy estimate is very close to the code length. But the entropy should be the lower limit, so the
code length should be close to but higher than the entropy, whereas here the code length (0.8235
bits/pixel) is slightly lower. The reason is that there is still some correlation that our entropy
estimate has not captured, and this correlation is exploited by the run length coding. The coding
algorithm uses this technique, but our entropy estimate does not capture the effect of run length
encoding.
The code length that we have computed takes into account the run length coding and the extra
information added into the file to help reconstruct the codebook at the receiver, thus it is not a very
precise measure if we intend to evaluate the performance of our Huffman algorithm. In order to focus
only on the Huffman coding part we may consider the run length vector as the input and calculate the
code length relative to the symbols in the run length vector. The next picture depicts the idea:
Figure 10 : Part of encoder from Q50 quantization case
The entropy estimate for the run length vector is written in the figure. The code length may
be computed in two ways: the first is to take into account the header information, thus the whole file
size, and the second is to take into account only the run length vector transformed to binary by the
Huffman algorithm. For the first case the code length is:
C1 = 215872 bits / 64338 symbols = 3.3553 bits/symbol
And for the second case we have:
C2 = 3.3097 bits/symbol
(In Table 1 these two quantities are listed as C2 and C3.)
As we notice, this time the measurement and the entropy estimate comply with the general
rule for Huffman encoding:
$$H \le L < H + 1$$
where L is the average length of the code.
Also, the code length C1 is bigger because of the header information that is written into the file.
C1 is more relevant as a term of comparison than C2 if we want to evaluate the encoder as a whole.
If we think that there are better ways of compressing the header and we only want to evaluate how the
Huffman algorithm performs, without taking into account how well we have compressed the extra
information needed for decoding, then it makes more sense to compare the entropy with C2.
The next table presents the measurements for all the quantization tables used:
                    Q10      Q30      Q50      Q70      Q90
File size (bits)    77384    140904   215872   387992   568632
H1                  0.3486   0.5849   0.8591   1.4653   2.0290
H2                  2.9145   3.1902   3.2596   3.2641   3.2935
C1 (bits/pixel)     0.2952   0.5375   0.8235   1.4801   2.1691
C2 (bits/symbol)    3.0078   3.2994   3.3553   3.3334   3.3644
C3 (bits/symbol)    2.9671   3.2556   3.3097   3.2894   3.3126
Comp. ratio         27.1     14.9     9.7      5.4      3.7
Table 1.
H1- entropy estimation for the Quantized coefficients;
5.2. Objective and Subjective assessment
This subchapter deals with measuring the distortion between the original and the reconstructed
image. This is a way of evaluating how closely the reconstructed image resembles the initial picture.
Another way is to perform a subjective assessment by comparing the two images. Subjective assessment
usually gives better results in the sense that it reflects more closely the quality, and especially the
difference in quality, between the two images. Because it is much more difficult to make such
subjective quality assessments, objective measurements are often considered sufficient.
Here we have performed an objective assessment for the several cases of compression that we
have, but also a subjective assessment. In order to be relevant and precise, a subjective assessment
must be made using a large number of persons. This is not the case for the project we are working on.
In order to objectively assess the difference between one of the reconstructed images and the
original Lenna image we chose to compute the PSNR. There are other methods that can be used,
but they are outside the scope of this report and project. PSNR reflects the difference in quality
between two pictures. A bigger value of the PSNR suggests that the reconstruction is closer to the
original image. The formula used to compute the PSNR is:
$$PSNR = 10 \log_{10} \frac{255^2}{MSE}$$
where MSE is the mean square error between individual pixels in the two images and is computed as:
$$MSE = \frac{1}{R C} \sum_{i=1}^{R} \sum_{j=1}^{C} (x_{ij} - \hat{x}_{ij})^2$$
with x the original image, \hat{x} the reconstructed image, and R and C the number of rows and columns.
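A minimal sketch of the PSNR computation is given below, assuming 8-bit images (peak value 255) stored as lists of rows. The function names `mse` and `psnr` are hypothetical.

```python
import math

def mse(original, reconstructed):
    """Mean square error between two images of the same size."""
    rows, cols = len(original), len(original[0])
    total = sum((original[i][j] - reconstructed[i][j]) ** 2
                for i in range(rows) for j in range(cols))
    return total / (rows * cols)

def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(original, reconstructed)
    return math.inf if error == 0 else 10 * math.log10(peak ** 2 / error)

a = [[10, 20], [30, 40]]
b = [[11, 21], [31, 41]]   # every pixel is off by one, so the MSE is 1
value = psnr(a, b)         # 10 * log10(255^2), roughly 48.13 dB
```

To quantify the effect of the border distortion of the transform, the same function could be applied to the images with the 4 outer rows and columns removed; this is only a suggestion and not necessarily how the report computed its second PSNR.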
A good way of observing the tradeoff between the distortion and the rate of the code is to
create a rate-distortion graph. The next figure (figure 12) illustrates the rate-distortion curve for
the five different quantization tables that are used inside the compression algorithm (Q10 to Q90).
The graph shows that at low rates, thus poor quality, the PSNR is smaller than at higher rates,
which is what we expected. An important thing is the logarithmic shape of the curve. This tells us
that the PSNR increases faster in the low rate (low quality) zone of the curve, until it
reaches a saturation point where, even if the rate, thus the quality, is increased, the PSNR increases
at a slower pace. This somewhat resembles human quality perception. We shall discuss this after the
subjective assessment.
The following PSNR values were calculated for the different quantization matrices for the image
Lenna:
Quality level   PSNR [dB]   rate [bits/pixel]
10              31.48       0.259
30              34.89       0.537
50              36.88       0.822
70              39.10       1.48
90              40.81       2.169
Table 2
Figure 12: Rate-distortion graph
As for the subjective quality assessment, the next five figures depict the reconstructed images
with different quality:
Figure 13: Lenna reconstruction : left -> Q10, right ->Q30
Figure 14: Lenna reconstruction : left -> Q50, right ->Q70
Figure 15: Left: Lenna reconstruction with Q90, Right: Lenna
Looking at the reconstructed images we notice the same pattern of low quality when we use a
lower quality quantization table (Q10 or Q30). The quality of the decoded picture is reflected by the
rate of the code: a low rate leads to poor quality. It can be seen that the reconstruction with Q90 is
almost perfect, but the compression is lower and the file size is bigger. When we use the Q10 and Q30
quantization matrices, the quality is poor, but the compression is higher and the
size of the file is smaller. The use of higher quality level matrices gives better quality images since
quantization with these matrices does not produce as many zero values, which do not contribute to the
reconstruction of the image.
If we were to trace a curve of the subjective assessment it would have a logarithmic shape,
because the quality (in our perception, though it may not be the case for other people) increases
faster when the rate is low. This translates into the fact that the human visual system is more
sensitive to changes when the quality is poor. If the quality is good (i.e. the Q70 or Q90
reconstructions) we cannot distinguish or notice the difference in quality.
As a final conclusion for this subchapter we would say that our PSNR measurement resembles,
at least in its trend, the subjective assessment. In order to obtain a precise graph for the subjective
assessment we should develop a grading system and have many test persons grade the images. In any case
it is reasonable to say that the resulting curve would probably look like the PSNR graph.
There is an issue that we have not addressed so far. It concerns our transform block. We must not
forget that this block, in spite of the fact that it should offer perfect reconstruction, introduces
some distortion at the borders of the reconstructed image. When we computed the PSNR we did not take
this effect into account. If we plot the difference between the new PSNR (PSNR2) and the old PSNR
(PSNR1) against the rate, we get the following graph:
Figure 16: Difference between the PSNRs versus rate
While for low quality images the difference is small, it becomes more relevant for high quality (when
we quantize the transform coefficients using smaller values). Thus the distortion induced by the
transform block becomes relevant when moving towards higher quality of the reconstructed image. This
is valid only for the objective assessment because, when we talk about subjective assessment, the four
pixels affected by distortion at the borders of the image are not noticeable by the eye.
6. Conclusion
For the decorrelation of the image we used the Modulated Lapped Transform (MLT). This
technique is better than the DCT because it eliminates the blocking effect. This is achieved by using basis
functions that decay smoothly to 0 and are longer than in the DCT case, so that the samples of
neighboring blocks overlap. In our case the basis functions have a length of 16 samples and thus the overlap
between neighboring blocks is 4 samples on each side. We showed that even though there are some
imperfections at the borders of the reconstructed image (because we add some samples there), it is almost
identical to the original one: only 4 pixels at each edge are affected by the distortion. By applying the MLT
we decorrelate the pixels.
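The basis functions described above can be illustrated with a short sketch. It assumes the standard MLT definition from Malvar's work (sine window, block size M = 8, hence 16-sample basis functions); the report's own MLTmatrix.m may differ in scaling or sign, and the names mlt_basis and gram are hypothetical. The code checks the two properties that allow reconstruction without blocking: the basis functions of one block are orthonormal, and the overlapping tails of neighboring blocks are orthogonal to each other.

```python
import math

def mlt_basis(M=8):
    """Return the 2M x M MLT matrix P; column k is the k-th basis function."""
    P = [[0.0] * M for _ in range(2 * M)]
    for n in range(2 * M):
        # Sine window: small at both ends of the 2M-sample support.
        window = math.sin((n + 0.5) * math.pi / (2 * M))
        for k in range(M):
            P[n][k] = window * math.sqrt(2.0 / M) * math.cos(
                (n + (M + 1) / 2.0) * (k + 0.5) * math.pi / M)
    return P

def gram(P, shift=0):
    """Inner products between basis functions, one set offset by `shift` samples."""
    rows, M = len(P), len(P[0])
    return [[sum(P[n + shift][k] * P[n][l] for n in range(rows - shift))
             for l in range(M)] for k in range(M)]

M = 8
P = mlt_basis(M)
same_block = gram(P)            # expected: identity matrix
next_block = gram(P, shift=M)   # expected: zero matrix

err_identity = max(abs(same_block[k][l] - (1.0 if k == l else 0.0))
                   for k in range(M) for l in range(M))
err_overlap = max(abs(v) for row in next_block for v in row)
print(err_identity, err_overlap)
```

Both errors come out at the level of floating-point rounding. At the image borders there is no neighboring block to supply the overlapping half, which is consistent with the border distortion discussed in the report.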
For the quantization we used the standard quality level 50, for which we obtained good results.
However, the measurements were also done for the quality levels 10, 30, 70 and 90. For the higher
quality levels, such as 70 and 90, we obtained a good quality image but less compression and thus a bigger
file size. That is because the quantization at the higher quality levels does not discard as much of the
information needed for the reconstruction of the image as the lower levels do.
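The effect of the quality level on the discarded information can be sketched as below. The report's Q10 to Q90 tables are not reproduced in this section, so the sketch assumes the common JPEG convention: the example luminance table from the JPEG standard serves as the level-50 table and is scaled with the rule used by the IJG libjpeg software. The names scaled_table and quantize_dequantize are hypothetical.

```python
# Example luminance table from the JPEG standard, assumed here as the level-50 table.
BASE_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def scaled_table(quality):
    """Scale the base table for a quality level in 1..100 (IJG-style rule)."""
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [[min(255, max(1, (q * scale + 50) // 100)) for q in row]
            for row in BASE_TABLE]

def quantize_dequantize(coefficient, step):
    """Quantize one coefficient with the given step and reconstruct it."""
    return round(coefficient / step) * step

coefficient = 37.0  # an arbitrary transform coefficient at position (0, 0)
for quality in (10, 30, 50, 70, 90):
    step = scaled_table(quality)[0][0]
    print(quality, step, quantize_dequantize(coefficient, step))
```

With this rule the step at position (0, 0) shrinks from 80 at level 10 to 3 at level 90, so the same coefficient is reconstructed as 0 at level 10 and as 36 at level 90. Smaller steps keep more of the coefficient, at the cost of more distinct symbols to encode.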
For the entropy encoding we used the Huffman algorithm because it is simpler and faster
than arithmetic coding. After the MLT, quantization, run length coding and Huffman coding,
the data was written to a file and sent to the decoder. In the header of the file we kept the information
necessary for the reconstruction of the codebook, so that the decoder is able to decode the symbols. The
average code length obtained with the Huffman algorithm was comparable with the entropy. The entropy
was estimated on the original image, after the MLT, after quantization and on the
run length vector. The code length was measured relative to the image, relative to the run length vector
and relative to the run length vector without the header. We noticed that after each step the
estimate of the entropy was better and the Huffman code length was closer to the entropy. That is
because after each step the data is more decorrelated, leading to lower entropy and better
compression.
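The comparison between the average Huffman code length and the entropy estimate can be sketched as follows. This is an illustrative Python version under stated assumptions, not the report's huffman_dict.m: the symbol sequence is made up, the entropy is the first-order estimate from symbol frequencies, and the header cost is ignored.

```python
import heapq
import math
from collections import Counter

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code built from counts."""
    if len(freqs) == 1:
        return {symbol: 1 for symbol in freqs}
    # Heap entries: (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(count, i, {symbol: 0}) for i, (symbol, count) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        count1, _, depths1 = heapq.heappop(heap)
        count2, _, depths2 = heapq.heappop(heap)
        # Merging two subtrees pushes all of their symbols one level deeper.
        merged = {s: d + 1 for s, d in {**depths1, **depths2}.items()}
        heapq.heappush(heap, (count1 + count2, tie, merged))
        tie += 1
    return heap[0][2]

def entropy_bits(freqs):
    """First-order entropy estimate in bits/symbol."""
    total = sum(freqs.values())
    return -sum(c / total * math.log2(c / total) for c in freqs.values())

def average_length(freqs, lengths):
    """Average code length in bits/symbol."""
    total = sum(freqs.values())
    return sum(freqs[s] * lengths[s] for s in freqs) / total

# Hypothetical run length symbols, dominated by zeros as after quantization.
data = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, -1, -1, 2, 5]
freqs = Counter(data)
lengths = huffman_code_lengths(freqs)
H = entropy_bits(freqs)
L = average_length(freqs, lengths)
print(H, L)
```

The average Huffman length always satisfies H <= L < H + 1. In this example the probabilities are powers of 1/2, so the two values coincide; for general data the gap is larger, and the header that carries the symbol frequencies adds to the measured rate.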
The second part of the measurements comprises the rate-distortion graphs. The measurements
were performed for each of the quality levels used. The code rate is higher for the higher
quality levels, and the PSNR grows as the number of bits/pixel increases. The PSNR was measured
in both situations: by omitting the imperfections on the borders (the first and last four pixels)
and by taking the imperfections into account.
When we plotted the difference between the two PSNRs (the error) against the rate, we saw that
the difference grows faster at the higher rates, which means that the border distortion weighs more
in the objective measurement for the higher quality levels than for the lower ones. In the subjective
assessment the affected border pixels are not noticeable.
The compression algorithm used is similar to JPEG, but the MLT is used instead of the
DCT to eliminate the blocking effects. The results obtained are good and the quality of the reconstructed
image is close to that of the original picture.
As for future work, improvements have to be made in order to achieve perfect
reconstruction at the borders of the image. It would also be very interesting to compare the MLT
technique against the DCT to observe how well the overlapping reduces the blocking effect.
7. Bibliography
1) Til Aach: Fourier, Block and Lapped Transforms, Institute for Signal Processing, University of Lübeck;
2) Henrique S. Malvar: Extended Lapped Transforms: Properties, Applications, and Fast Algorithms, IEEE Transactions on Signal Processing, vol. 40, no. 11, 1992;
3) Khalid Sayood: Introduction to Data Compression, Third edition, 2005;
4) Gregory K. Wallace (Multimedia Engineering, Digital Equipment Corporation, Maynard, Massachusetts): The JPEG still compression standard, IEEE Transactions on Consumer Electronics, 1991;
5) Tinku Acharya, Ajoy K. Ray: Image Processing: Principles and Applications, 2005;
6) Cornelius T. Leondes: Database and Data Communication Network Systems, Volume 1, Academic Press, 2002.
8. Appendix
The source code is uploaded in electronic format. Here we give just a short introduction
to each function we have used to encode and decode the image.
The implementation is structured in 2 main parts: the encoding part and the decoding part.
There is a program that computes the transform matrix and plots the basis functions:
MLTmatrix.m.
We have defined the five quantization tables under the names Q10, Q30, Q50, Q70 and Q90.
Each time someone wishes to run the encoder or the decoder, these matrices must be loaded into
the workspace.
8.1 Encoding part
The main script in the encoding part is encoding1.m. In this script we call individual functions
that correspond to the different blocks of the encoder. They are described below:
MLT_enc.m function : inputs (, ); output : coefficients matrix;
quant_enc.m function : inputs (,); output : run length vector of the quantized coefficients matrix;
buildmat.m function : input (); output : header information (symbols and frequencies from the run length vector);
huffman_dict.m function : input (); outputs : codes and symbols associated to each code;
runl2bin.m function : inputs (,,); output : binary representation of the run length vector coded with the Huffman algorithm;
linecode_enc.m function : input (); output : binary coded header information using the protocol described in section 4.2;
putinteger2.m function : input (); output : uint8 vector;
filewrite.m function : input (,,);
8.2 Decoding part
The main script in this part is decoding1.m, which calls the following functions:
fileread.m function : input (,); output : header and run length vector in uint8 data format;
putbits.m function : input (); output : binary representation of the input vector;
linecode_dec.m function : input (); output : header information (symbols and frequencies);
huffman_dict_rebuild.m function : input (); output : codes and symbols for the Huffman algorithm;
bin2runl.m function : input ( ,< codes>, ); output : run length vector;
quant_dec.m function : input (,); output : transform coefficients;
MLT_dec.m function : input (); output : reconstructed image;
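The conversion between a bit stream and a uint8 vector, which putinteger2.m and putbits.m appear to perform on the encoder and decoder side respectively, can be sketched as below. This is a Python illustration only: the names pack_bits and unpack_bits, the most-significant-bit-first order and the zero padding of the last byte are assumptions, since the report does not state how the MATLAB functions handle these details.

```python
def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, MSB first, zero-padding the last byte."""
    padded = list(bits) + [0] * (-len(bits) % 8)
    out = bytearray()
    for start in range(0, len(padded), 8):
        byte = 0
        for bit in padded[start:start + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)

def unpack_bits(data, n_bits):
    """Recover the first n_bits bits from packed bytes (inverse of pack_bits)."""
    bits = []
    for byte in data:
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits[:n_bits]

bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1]   # 11 bits of Huffman-coded data
packed = pack_bits(bits)                    # 2 bytes, the last one padded
restored = unpack_bits(packed, len(bits))
print(list(packed), restored == bits)
```

Because of the padding, the decoder needs the number of valid bits (or an equivalent end marker) to drop the filler bits, which is one reason to carry length information alongside the coded data.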