data compression tech-cs.ppt

  • 7/27/2019 data compression tech-cs.ppt

    National Institute of Science and Technology

    Technical Seminar Presentation 2005

    A Review of Data Compression Techniques

    Presented by Sudeepta Mishra, Roll# CS200117052

    At NIST, Berhampur

    Under the guidance of Mr. Rowdra Ghatak

    Introduction

    Data compression is the process of encoding data so that it takes less storage space or less transmission time than it would if it were not compressed.

    Compression is possible because most real-world data is highly redundant.
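    As a rough illustration of why redundancy matters, the sketch below compresses a repetitive byte string and an equally long string of random bytes with Python's standard zlib module. zlib (DEFLATE) is used only as a convenient stand-in and is not one of the coders discussed in this presentation; the sample inputs are assumptions made for the example.

```python
import os
import zlib

# Two inputs of the same length: one highly redundant, one random.
redundant = b"abcabcabc" * 1000            # 9000 bytes of a repeating pattern
random_bytes = os.urandom(len(redundant))  # 9000 bytes with no pattern to exploit

packed_redundant = zlib.compress(redundant)
packed_random = zlib.compress(random_bytes)

# The redundant input shrinks dramatically; the random one does not.
print(len(redundant), "->", len(packed_redundant))
print(len(random_bytes), "->", len(packed_random))

# Lossless: decompression restores both inputs exactly.
assert zlib.decompress(packed_redundant) == redundant
assert zlib.decompress(packed_random) == random_bytes
```

    The exact sizes depend on the zlib version and on the random bytes drawn, so no specific numbers are claimed here.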

    Different Compression Techniques

    There are two main types of data compression techniques.

    Lossless compression.

    Useful for spreadsheets, text, and executable programs.

    Lossy compression.

    Used for images, movies, and sounds.

    Types of Lossless Data Compression

    Dictionary coders.

        Zip (file format).

        Lempel-Ziv.

    Entropy encoding.

        Huffman coding (simple entropy coding).

    Run-length encoding.

    Dictionary-Based Compression

    Dictionary-based algorithms do not encode single symbols as variable-length bit strings; they encode variable-length strings of symbols as single tokens.

    The tokens form an index into a phrase dictionary.

    If the tokens are smaller than the phrases they replace, compression occurs.

    Types of Dictionary

    Static Dictionary.

    Semi-Adaptive Dictionary.

    Adaptive Dictionary.

    Lempel-Ziv algorithms belong to this category of dictionary coders. The dictionary is built in a single pass, at the same time as the data is encoded.

    The decoder can build up the dictionary in the same way as the encoder while decompressing the data.

    Using an English dictionary, the string:

    A good example of how dictionary based compression works

    gives: 1/1 822/3 674/4 1343/60 928/75 550/32 173/46 421/2

    Using the dictionary as a lookup table, each word is coded as x/y, where x gives the page number and y gives the number of the word on that page. If the dictionary has 2,200 pages with fewer than 256 entries per page, then x requires 12 bits and y requires 8 bits, i.e., 20 bits per word (2.5 bytes per word). Using ASCII coding the above string requires 48 bytes, whereas our encoding requires only 20 bytes.
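    The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the helper name bits_needed is introduced for the example, and the 48-byte figure is reproduced by counting one byte per letter with the spaces left out, which is the reading that matches the slide.

```python
# Parameters taken from the slide.
PAGES = 2200
ENTRIES_PER_PAGE = 256

sentence = "A good example of how dictionary based compression works"
codes = "1/1 822/3 674/4 1343/60 928/75 550/32 173/46 421/2".split()


def bits_needed(count: int) -> int:
    """Bits required to distinguish `count` different values."""
    return max(1, (count - 1).bit_length())


bits_x = bits_needed(PAGES)             # page number
bits_y = bits_needed(ENTRIES_PER_PAGE)  # word position on the page
bits_per_code = bits_x + bits_y

coded_bytes = len(codes) * bits_per_code / 8
letter_bytes = sum(len(word) for word in sentence.split())

print(bits_x, bits_y, bits_per_code)  # 12 8 20
print(letter_bytes, coded_bytes)      # 48 20.0
```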

    Lempel-Ziv

    It is a family of algorithms, stemming from the two algorithms proposed by Jacob Ziv and Abraham Lempel in their landmark papers in 1977 and 1978.

    [Figure: the Lempel-Ziv family tree, rooted at LZ77 and LZ78, with the variants LZR, LZSS, LZH, LZB, LZFG, LZC, LZT, LZMW, LZW and LZJ.]

    LZW Algorithm

    It is an improved version of the LZ78 algorithm.

    Published by Terry Welch in 1984.

    A dictionary that is indexed by codes is used. The dictionary is assumed to be initialized with 256 entries (indexed with ASCII codes 0 through 255) representing the ASCII table.

    The LZW Algorithm (Compression)

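    W = NIL;
    while (there is input) {
        K = next symbol from input;
        if (WK exists in the dictionary) {
            W = WK;
        } else {
            output (index(W));
            add WK to the dictionary;
            W = K;
        }
    }
    /* no input left: emit the code for the phrase still held in W */
    if (W is not NIL) output (index(W));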
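    The pseudocode above can be turned into a short runnable sketch. The function name lzw_compress and the alphabet parameter are assumptions made for illustration: the dictionary is seeded with the given symbols numbered from 1, matching the a/b/c/d example used later in this presentation, whereas a byte-oriented coder would seed it with the 256 codes 0 through 255. The final code for W is emitted once the input runs out.

```python
def lzw_compress(data, alphabet):
    """Return the list of LZW codes for `data`.

    Every symbol of `data` is assumed to appear in `alphabet`; the
    alphabet symbols receive codes 1, 2, 3, ... in the order given.
    """
    dictionary = {symbol: index for index, symbol in enumerate(alphabet, start=1)}
    next_code = len(dictionary) + 1
    w = ""          # current phrase (W in the pseudocode)
    output = []
    for k in data:  # K = next symbol from input
        wk = w + k
        if wk in dictionary:
            w = wk                         # keep extending the phrase
        else:
            output.append(dictionary[w])   # output index(W)
            dictionary[wk] = next_code     # add WK to the dictionary
            next_code += 1
            w = k
    if w:
        output.append(dictionary[w])       # flush the final phrase
    return output


print(lzw_compress("abdabdac", "abcd"))  # [1, 2, 4, 5, 7, 3]
```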

    The LZW Algorithm (Compression) Flow Chart

    [Flow chart: START; W = NULL; IS EOF? (YES: STOP; NO: K = NEXT INPUT); IS WK FOUND? (YES: W = WK; NO: OUTPUT INDEX OF W, ADD WK TO DICTIONARY, W = K); loop back to IS EOF?]

    The LZW Algorithm (Compression) Example

    Input string: a b d a b d a c

    The initial dictionary contains the symbols a, b, c, d with index values 1, 2, 3, 4 respectively.

    Now the input string is read from left to right, starting from a.

    Dictionary: a 1, b 2, c 3, d 4

    The LZW Algorithm (Compression) Example

    W = Null

    K = a

    WK = a

    WK is in the dictionary, so set W = a.

    Input: a b d a b d a c (K is the 1st symbol)

    Output so far: (none)

    Dictionary: a 1, b 2, c 3, d 4

    The LZW Algorithm (Compression) Example

    K = b

    WK = ab

    WK is not in the dictionary.

    Add WK to the dictionary.

    Output the code for a.

    Set W = b.

    Input: a b d a b d a c (K is the 2nd symbol)

    Output so far: 1

    Dictionary: a 1, b 2, c 3, d 4, ab 5

    The LZW Algorithm (Compression) Example

    K = d

    WK = bd

    WK is not in the dictionary.

    Add bd to the dictionary.

    Output the code for b.

    Set W = d.

    Input: a b d a b d a c (K is the 3rd symbol)

    Output so far: 1 2

    Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6

    The LZW Algorithm (Compression) Example

    K = a

    WK = da

    WK is not in the dictionary.

    Add it to the dictionary.

    Output the code for d.

    Set W = a.

    Input: a b d a b d a c (K is the 4th symbol)

    Output so far: 1 2 4

    Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7

    The LZW Algorithm (Compression) Example

    K = b

    WK = ab

    WK is in the dictionary, so set W = ab.

    Input: a b d a b d a c (K is the 5th symbol)

    Output so far: 1 2 4

    Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7

    The LZW Algorithm (Compression) Example

    K = d

    WK = abd

    WK is not in the dictionary.

    Add WK to the dictionary.

    Output the code for W (ab, code 5).

    Set W = d.

    Input: a b d a b d a c (K is the 6th symbol)

    Output so far: 1 2 4 5

    Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7, abd 8

    The LZW Algorithm (Compression) Example

    K = a

    WK = da

    WK is in the dictionary, so set W = da.

    Input: a b d a b d a c (K is the 7th symbol)

    Output so far: 1 2 4 5

    Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7, abd 8

    The LZW Algorithm (Compression) Example

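    K = c

    WK = dac

    WK is not in the dictionary.

    Add WK to the dictionary.

    Output the code for W (da, code 7).

    Set W = c.

    No input is left, so output the code for W (c, code 3).

    Input: a b d a b d a c (K is the 8th symbol)

    Output so far: 1 2 4 5 7

    Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7, abd 8, dac 9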

    The LZW Algorithm (Compression) Example

    The final output string is 1 2 4 5 7 3.

    Stop.

    Input: a b d a b d a c

    Output: 1 2 4 5 7 3

    Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7, abd 8, dac 9

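    LZW Decompression Algorithm

    read a code k;
    output dictionary entry for k;
    w = dictionary entry for k;
    while ( read a code k )
    /* k could be the code of a single character or of a longer phrase. */
    {
        if (k is in the dictionary)
            entry = dictionary entry for k;
        else
            entry = w + w[0];   /* code not defined yet: it can only be w + w[0] */
        output entry;
        add w + entry[0] to dictionary;
        w = entry;
    }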
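    A runnable sketch of the decoder follows. As with the encoder, the name lzw_decompress and the alphabet parameter are assumptions for illustration, and the dictionary is seeded with codes starting at 1 to match the worked example. The sketch also covers one situation the slides do not walk through: a code can arrive one step before the decoder has defined it, in which case the phrase must be w followed by the first symbol of w.

```python
def lzw_decompress(codes, alphabet):
    """Rebuild the original string from a list of LZW codes."""
    # Seed the dictionary the same way the encoder does: codes 1, 2, 3, ...
    dictionary = {index: symbol for index, symbol in enumerate(alphabet, start=1)}
    next_code = len(dictionary) + 1
    if not codes:
        return ""
    w = dictionary[codes[0]]
    output = [w]
    for k in codes[1:]:
        if k in dictionary:
            entry = dictionary[k]
        elif k == next_code:
            # The encoder used a phrase it had only just created.
            entry = w + w[0]
        else:
            raise ValueError(f"invalid LZW code: {k}")
        output.append(entry)
        dictionary[next_code] = w + entry[0]  # add w + entry[0]
        next_code += 1
        w = entry
    return "".join(output)


print(lzw_decompress([1, 2, 4, 5, 7, 3], "abcd"))  # abdabdac
```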

    LZW Decompression Algorithm Flow Chart

    [Flow chart: START; K = INPUT; Output K; W = K; IS EOF? (YES: STOP; NO: K = NEXT INPUT); ENTRY = DICTIONARY INDEX (K); Output ENTRY; ADD W+ENTRY[0] TO DICTIONARY; W = ENTRY; loop back to IS EOF?]

    The LZW Algorithm (Decompression) Example

    K = 1

    Output the entry for K (i.e. a).

    Set W = a.

    Codes: 1 2 4 5 7 3 (K is the 1st code)

    Decoded so far: a

    Dictionary: a 1, b 2, c 3, d 4

    The LZW Algorithm (Decompression) Example

    K = 2

    entry = b

    Output entry.

    Add W + entry[0] to the dictionary (ab, code 5).

    Set W = entry (i.e. b).

    Codes: 1 2 4 5 7 3 (K is the 2nd code)

    Decoded so far: a b

    Dictionary: a 1, b 2, c 3, d 4, ab 5

    The LZW Algorithm (Decompression) Example

    K = 4

    entry = d

    Output entry.

    Add W + entry[0] to the dictionary (bd, code 6).

    Set W = entry (i.e. d).

    Codes: 1 2 4 5 7 3 (K is the 3rd code)

    Decoded so far: a b d

    Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6

    The LZW Algorithm (Decompression) Example

    K = 5

    entry = ab

    Output entry.

    Add W + entry[0] to the dictionary (da, code 7).

    Set W = entry (i.e. ab).

    Codes: 1 2 4 5 7 3 (K is the 4th code)

    Decoded so far: a b d a b

    Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7

    The LZW Algorithm (Decompression) Example

    K = 7

    entry = da

    Output entry.

    Add W + entry[0] to the dictionary (abd, code 8).

    Set W = entry (i.e. da).

    Codes: 1 2 4 5 7 3 (K is the 5th code)

    Decoded so far: a b d a b d a

    Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7, abd 8

    The LZW Algorithm (Decompression) Example

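    K = 3

    entry = c

    Output entry.

    Add W + entry[0] to the dictionary (dac, code 9).

    Set W = entry (i.e. c).

    Codes: 1 2 4 5 7 3 (K is the 6th and last code)

    Decoded so far: a b d a b d a c

    Dictionary: a 1, b 2, c 3, d 4, ab 5, bd 6, da 7, abd 8, dac 9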

    Advantages

    Because LZW is an adaptive dictionary coder, there is no need to transfer the dictionary explicitly.

    It is rebuilt on the decoder side.

    LZW can be made very fast: it grabs a fixed number of bits from the input, so bit parsing is easy and each table lookup is direct.

    Problems with the encoder

    What if we run out of space?

    Keep track of unused entries and replace them using LRU (Least Recently Used).

    Monitor compression performance and flush the dictionary when performance is poor.
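    One way to picture the flush strategy is the sketch below. It uses a deliberately simpler policy than the ones listed on this slide: the dictionary is thrown away as soon as it reaches a fixed size, rather than by LRU replacement or by monitoring the compression ratio. The function names and the max_entries parameter are assumptions for illustration, and max_entries is assumed to be at least the alphabet size. The decoder applies the same size test before handling each code, so both sides flush at the same point without any extra signalling.

```python
def _fresh(alphabet):
    """Initial dictionary: single symbols numbered from 1."""
    return {symbol: i for i, symbol in enumerate(alphabet, start=1)}


def compress_with_reset(data, alphabet, max_entries):
    dictionary = _fresh(alphabet)
    w, output = "", []
    for k in data:
        wk = w + k
        if wk in dictionary:
            w = wk
            continue
        output.append(dictionary[w])
        if len(dictionary) >= max_entries:
            dictionary = _fresh(alphabet)         # out of space: flush
        else:
            dictionary[wk] = len(dictionary) + 1  # normal LZW growth
        w = k
    if w:
        output.append(dictionary[w])
    return output


def decompress_with_reset(codes, alphabet, max_entries):
    if not codes:
        return ""
    dictionary = {i: s for s, i in _fresh(alphabet).items()}
    w = dictionary[codes[0]]
    output = [w]
    for k in codes[1:]:
        if len(dictionary) >= max_entries:
            # Mirror the encoder's flush; the next phrase is a single symbol.
            dictionary = {i: s for s, i in _fresh(alphabet).items()}
            entry = dictionary[k]
        else:
            # No validation of k here; this is only a sketch.
            entry = dictionary[k] if k in dictionary else w + w[0]
            dictionary[len(dictionary) + 1] = w + entry[0]
        output.append(entry)
        w = entry
    return "".join(output)


text = "abdabdacabdabdacaaaabbbb"
packed = compress_with_reset(text, "abcd", max_entries=6)
assert decompress_with_reset(packed, "abcd", max_entries=6) == text
```

    Flushing on a full dictionary is easy to keep in step on both sides; an LRU or performance-triggered flush needs the same care so that the decoder can reproduce every decision the encoder makes.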

    Conclusion

    LZW opened new directions for the development of compression techniques.

    It has been implemented in well-known formats such as Acrobat PDF and in many other compression packages.

    In combination with other compression techniques, it has led to further compression methods such as LZMS.

    Thank You