Data compression procedures utilizing the similarity of data
Y Kambayashi, N Nakatsu, S Yajima
Proceedings of the May 4-7, 1981, national computer conference, 1981 • dl.acm.org
In large database systems, we often encounter situations where a set of similar data is to be stored. This paper discusses efficient data compression procedures that utilize the similarity of data. These procedures are suitable for compressing versions of programs, series of data produced in an office, etc.
The procedure to compress one string utilizing regularity of data is as follows:
1. Calculate all maximum repeated substrings in the given string.
2. Since each repeated substring needs to be stored only once, replace the second and later occurrences of the same substring with a code that gives the position of the first occurrence of the substring.
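The two steps above can be illustrated with a minimal sketch. It is not the paper's algorithm: instead of computing all maximum repeated substrings in linear time, it uses a naive greedy search (quadratic or worse) for the longest earlier, non-overlapping occurrence at each position. The names compress_repeats, decompress_repeats, and the min_len threshold are assumptions made for illustration.

```python
def compress_repeats(s, min_len=3):
    """Naive greedy sketch (assumed, not the paper's linear-time method).

    Later occurrences of a repeated substring become (start, length)
    codes pointing at the first occurrence; everything else stays literal.
    """
    out = []
    i = 0
    while i < len(s):
        best_len, best_pos = 0, -1
        # Search the already-encoded prefix s[:i] for the longest match.
        for j in range(i):
            k = 0
            # j + k < i keeps the referenced copy entirely inside the prefix.
            while i + k < len(s) and j + k < i and s[j + k] == s[i + k]:
                k += 1
            if k > best_len:  # strict '>' keeps the earliest position
                best_len, best_pos = k, j
        if best_len >= min_len:
            out.append((best_pos, best_len))
            i += best_len
        else:
            out.append(s[i])
            i += 1
    return out


def decompress_repeats(tokens):
    """Rebuild the string by copying each referenced span from the output so far."""
    chars = []
    for t in tokens:
        if isinstance(t, tuple):
            pos, length = t
            chars.extend(chars[pos:pos + length])
        else:
            chars.append(t)
    return "".join(chars)


tokens = compress_repeats("abcdefabcdefxyz")
print(tokens)
print(decompress_repeats(tokens))
```

In this sketch the second "abcdef" is replaced by a single code (0, 6), while short or unrepeated characters are kept as literals.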
The procedure to compress two strings w1 and w2 utilizing data similarity is as follows:
1. Calculate all maximum common substrings of w1 and w2.
2. Find a minimum cover for w2 using the maximum common substrings contained in w1.
3. Encode w2 by codes, each of which shows a substring of w1.
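A rough sketch of this two-string procedure follows. It is an illustration under stated assumptions, not the paper's method: a naive greedy longest-match scan of w1 stands in for both the maximum-common-substring computation and the minimum-cover step, and it runs in roughly quadratic time rather than linear time. The names encode_against and decode_against are hypothetical, and characters of w2 that do not occur in w1 are simply kept as literals.

```python
def encode_against(w1, w2):
    """Encode w2 as (start, length) codes into w1, plus literals.

    Naive greedy stand-in (assumed) for the paper's minimum-cover step.
    """
    out = []
    i = 0
    while i < len(w2):
        best_len, best_pos = 0, -1
        # Try every start position in w1 and keep the longest match.
        for j in range(len(w1)):
            k = 0
            while (j + k < len(w1) and i + k < len(w2)
                   and w1[j + k] == w2[i + k]):
                k += 1
            if k > best_len:
                best_len, best_pos = k, j
        if best_len > 0:
            out.append((best_pos, best_len))
            i += best_len
        else:
            # Character does not occur in w1: store it as a literal.
            out.append(w2[i])
            i += 1
    return out


def decode_against(w1, tokens):
    """Rebuild w2 from w1 and the token list."""
    parts = []
    for t in tokens:
        if isinstance(t, tuple):
            pos, length = t
            parts.append(w1[pos:pos + length])
        else:
            parts.append(t)
    return "".join(parts)


old = "the quick brown fox"
new = "the slow brown fox!"
codes = encode_against(old, new)
print(codes)
print(decode_against(old, codes) == new)
```

Only w1 and the short code list need to be stored to recover w2, which is the benefit when the two strings are successive versions of the same program or document.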
These procedures are shown to require time proportional only to the total length of the data, and thus they are efficient. Combinations and variations of these two procedures are also discussed in the paper.
