Data Compression Techniques
Presented by Sudeepta Mishra
Roll# CS200117052
At NIST, Berhampur
Introduction
National Institute of Science and Technology
• Dictionary coders.
– Zip (file format).
– Lempel Ziv.
• Entropy encoding.
– Huffman coding (simple entropy coding).
• Run-length encoding.
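Of the techniques listed above, run-length encoding is the simplest to show in a few lines. The following is a minimal Python sketch, assuming the input is a string and each run is stored as a (symbol, count) pair; the names rle_encode and rle_decode are illustrative and do not come from the slides.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical symbols into a (symbol, count) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(text)]


def rle_decode(pairs):
    """Expand (symbol, count) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in pairs)


pairs = rle_encode("aaabccdddd")
print(pairs)              # [('a', 3), ('b', 1), ('c', 2), ('d', 4)]
print(rle_decode(pairs))  # aaabccdddd
```

This only pays off when the data contains long runs; input without repeats grows, because every symbol gains a count.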
Dictionary-Based Compression
Types of Dictionary
• Static Dictionary.
• Semi-Adaptive Dictionary.
• Adaptive Dictionary.
– Lempel Ziv algorithms belong to this category of
dictionary coders. The dictionary is built in a single
pass while the data is being encoded.
– The decoder can build up the dictionary in the same
way as the encoder while decompressing the data.
Lempel Ziv
[Diagram: members of the Lempel Ziv family: LZ77, LZ78, LZJ, LZR, LZFG, LZW]
LZW Algorithm
W = NIL;
while (there is input) {
    K = next symbol from input;
    if (WK exists in the dictionary) {
        W = WK;
    } else {
        output (index(W));
        add WK to the dictionary;
        W = K;
    }
}
if (W != NIL) output (index(W));  /* no input left: output the code for the last W */
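The pseudocode above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name lzw_encode and the alphabet parameter are illustrative and not part of the slides, the initial symbols receive the codes 1, 2, 3, ... in the order given (matching the worked example), and every input symbol is assumed to be in the initial dictionary.

```python
def lzw_encode(text, alphabet):
    """Return the list of LZW codes for text.

    alphabet: symbols of the initial dictionary, numbered from 1
    (an assumption chosen to match the slides' example).
    """
    dictionary = {symbol: i for i, symbol in enumerate(alphabet, start=1)}
    next_code = len(dictionary) + 1
    codes = []
    w = ""                                # W = NIL
    for k in text:                        # K = next symbol from input
        wk = w + k
        if wk in dictionary:
            w = wk                        # keep extending the current match
        else:
            codes.append(dictionary[w])   # output index(W)
            dictionary[wk] = next_code    # add WK to the dictionary
            next_code += 1
            w = k
    if w:
        codes.append(dictionary[w])       # no input left: output the last W
    return codes


print(lzw_encode("abdabdac", "abcd"))  # [1, 2, 4, 5, 7, 3]
```

A plain dict keyed by string keeps the sketch short; production coders typically cap the dictionary size and pack the codes into a fixed or growing bit width, which is left out here.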
[Flowchart: LZW compression. START; W = NULL; IS EOF? (YES: STOP; NO: K = NEXT INPUT); IS WK FOUND? (YES: W = WK; NO: OUTPUT INDEX OF W, ADD WK TO DICTIONARY, W = K).]
• Input string is a b d a b d a c
• The initial dictionary contains the symbols a, b, c, d with the
index values 1, 2, 3, 4 respectively.
• The input string is read from left to right, starting from a.
Dictionary:
a 1
b 2
c 3
d 4
Input: a b d a b d a c (K is the 1st symbol)
• W = Null
• K = a
• WK = a is in the dictionary, so W = a.
Dictionary:
a 1
b 2
c 3
d 4
Input: a b d a b d a c (K is the 2nd symbol)
• K = b
• WK = ab is not in the dictionary.
• Add WK to the dictionary.
• Output the code for a.
• Set W = b.
Output so far: 1
Dictionary:
a 1   ab 5
b 2
c 3
d 4
Input: a b d a b d a c (K is the 3rd symbol)
• K = d
• WK = bd is not in the dictionary.
• Add bd to the dictionary.
• Output the code for b.
• Set W = d.
Output so far: 1 2
Dictionary:
a 1   ab 5
b 2   bd 6
c 3
d 4
Input: a b d a b d a c (K is the 4th symbol)
• K = a
• WK = da is not in the dictionary.
• Add it to the dictionary.
• Output the code for d.
• Set W = a.
Output so far: 1 2 4
Dictionary:
a 1   ab 5
b 2   bd 6
c 3   da 7
d 4
Input: a b d a b d a c (K is the 5th symbol)
• K = b
• WK = ab is in the dictionary, so W = ab and nothing is output.
Output so far: 1 2 4
Dictionary:
a 1   ab 5
b 2   bd 6
c 3   da 7
d 4
Input: a b d a b d a c (K is the 6th symbol)
• K = d
• WK = abd is not in the dictionary.
• Add WK to the dictionary.
• Output the code for W (ab, code 5).
• Set W = d.
Output so far: 1 2 4 5
Dictionary:
a 1   ab 5
b 2   bd 6
c 3   da 7
d 4   abd 8
Input: a b d a b d a c (K is the 7th symbol)
• K = a
• WK = da is in the dictionary, so W = da and nothing is output.
Output so far: 1 2 4 5
Dictionary:
a 1   ab 5
b 2   bd 6
c 3   da 7
d 4   abd 8
Input: a b d a b d a c (K is the 8th symbol)
• K = c
• WK = dac is not in the dictionary.
• Add WK to the dictionary.
• Output the code for W (da, code 7).
• Set W = c.
• No input is left, so output the code for W (c, code 3).
Output so far: 1 2 4 5 7
Dictionary:
a 1   ab 5    dac 9
b 2   bd 6
c 3   da 7
d 4   abd 8
a 1 ab 5 dac 9
b 2 bd 6
c 3 da 7
d 4 abd 8
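The encoding steps above can be replayed with a short script. This is a sketch, not the presenter's code: lzw_encode_trace is a hypothetical helper that records, for each input symbol, the values of K and WK, the code emitted (if any) and the dictionary entry added (if any). It assumes the initial dictionary a, b, c, d numbered from 1, as in the slides.

```python
def lzw_encode_trace(text, alphabet):
    """Return one row per step: (K, WK, emitted code, added entry)."""
    dictionary = {s: i for i, s in enumerate(alphabet, start=1)}
    rows = []
    w = ""
    for k in text:
        wk = w + k
        if wk in dictionary:
            rows.append((k, wk, None, None))       # match grows, no output
            w = wk
        else:
            code = dictionary[w]
            new_code = len(dictionary) + 1
            dictionary[wk] = new_code
            rows.append((k, wk, code, (wk, new_code)))
            w = k
    if w:
        rows.append(("", w, dictionary[w], None))  # final output at end of input
    return rows


for k, wk, code, added in lzw_encode_trace("abdabdac", "abcd"):
    print(k or "EOF", wk, code, added)
```

Reading the emitted codes down the third column gives 1 2 4 5 7 3, and the fourth column lists the entries ab 5, bd 6, da 7, abd 8 and dac 9 in the order they are added.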
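/* LZW decompression: every k read from the input is a code. */
read a code k;
w = dictionary entry for k;
output w;
while (read a code k) {
    if (k is in the dictionary)
        entry = dictionary entry for k;
    else
        entry = w + w[0];   /* k is the entry that is about to be added */
    output entry;
    add w + entry[0] to the dictionary;
    w = entry;
}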
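A minimal Python sketch of the decoding loop above. The name lzw_decode and the alphabet parameter are illustrative assumptions; codes are numbered from 1 to match the slides. The sketch also handles the case where a code arrives before the decoder has added it to its own dictionary, which the example in these slides never triggers.

```python
def lzw_decode(codes, alphabet):
    """Rebuild the original string from a list of LZW codes."""
    entries = {i: s for i, s in enumerate(alphabet, start=1)}
    next_code = len(entries) + 1
    if not codes:
        return ""
    w = entries[codes[0]]             # the first code is always a single symbol
    out = [w]
    for k in codes[1:]:
        if k in entries:
            entry = entries[k]
        elif k == next_code:
            entry = w + w[0]          # code refers to the entry being built
        else:
            raise ValueError(f"invalid LZW code: {k}")
        out.append(entry)
        entries[next_code] = w + entry[0]   # same entry the encoder added
        next_code += 1
        w = entry
    return "".join(out)


print(lzw_decode([1, 2, 4, 5, 7, 3], "abcd"))  # abdabdac
```

Note that w is set to the whole entry, not only its first symbol; that is what lets the next step add multi-symbol entries such as abd.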
[Flowchart: LZW decompression. START; K = INPUT; Output K; W = K; IS EOF? (YES: STOP; NO: K = NEXT INPUT, Output ENTRY, W = ENTRY).]
Codes: 1 2 4 5 7 3 (K is the 1st code)
• K = 1
• Output the entry for K (i.e. a)
• W = entry for K (i.e. a)
Decoded so far: a
Dictionary:
a 1
b 2
c 3
d 4
Codes: 1 2 4 5 7 3 (K is the 2nd code)
• K = 2
• entry = b
• Output entry
• Add W + entry[0] (i.e. ab) to the dictionary
• W = entry (i.e. b)
Decoded so far: a b
Dictionary:
a 1   ab 5
b 2
c 3
d 4
Codes: 1 2 4 5 7 3 (K is the 3rd code)
• K = 4
• entry = d
• Output entry
• Add W + entry[0] (i.e. bd) to the dictionary
• W = entry (i.e. d)
Decoded so far: a b d
Dictionary:
a 1   ab 5
b 2   bd 6
c 3
d 4
Codes: 1 2 4 5 7 3 (K is the 4th code)
• K = 5
• entry = ab
• Output entry
• Add W + entry[0] (i.e. da) to the dictionary
• W = entry (i.e. ab)
Decoded so far: a b d a b
Dictionary:
a 1   ab 5
b 2   bd 6
c 3   da 7
d 4
Codes: 1 2 4 5 7 3 (K is the 5th code)
• K = 7
• entry = da
• Output entry
• Add W + entry[0] (i.e. abd) to the dictionary
• W = entry (i.e. da)
Decoded so far: a b d a b d a
Dictionary:
a 1   ab 5
b 2   bd 6
c 3   da 7
d 4   abd 8
Codes: 1 2 4 5 7 3 (K is the 6th code)
• K = 3
• entry = c
• Output entry
• Add W + entry[0] (i.e. dac) to the dictionary
• W = entry (i.e. c)
Decoded so far: a b d a b d a c
Dictionary:
a 1   ab 5    dac 9
b 2   bd 6
c 3   da 7
d 4   abd 8
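The decoding steps above can be replayed the same way. This is a sketch under stated assumptions: lzw_decode_trace is a hypothetical helper, codes are numbered from 1 as in the slides, and every code is assumed to be in the dictionary when it arrives, which holds for this example but not for every LZW stream.

```python
def lzw_decode_trace(codes, alphabet):
    """Return (rows, entries); each row is (K, entry output, entry added)."""
    entries = {i: s for i, s in enumerate(alphabet, start=1)}
    w = entries[codes[0]]
    rows = [(codes[0], w, None)]          # first code: output only, nothing added
    for k in codes[1:]:
        entry = entries[k]                # assumes k is already known
        new_code = len(entries) + 1
        entries[new_code] = w + entry[0]  # W + entry[0]
        rows.append((k, entry, (entries[new_code], new_code)))
        w = entry                         # W = entry (the whole string)
    return rows, entries


rows, entries = lzw_decode_trace([1, 2, 4, 5, 7, 3], "abcd")
for row in rows:
    print(row)
print("".join(entry for _, entry, _ in rows))  # abdabdac
```

The dictionary returned in entries ends with the same nine entries as the encoder's final table, which is the sense in which the decoder rebuilds the dictionary without it ever being transmitted.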
Advantages
Conclusion
Thank You