Data compression procedures utilizing the similarity of data

Y Kambayashi, N Nakatsu, S Yajima

Proceedings of the May 4-7, 1981, national computer conference, 1981•dl.acm.org

In large database systems, a set of similar data often has to be stored. This paper discusses
efficient data compression procedures that exploit the similarity of data. These procedures are
suitable for compressing versions of programs, series of data produced in an office, and so on.

The procedure to compress one string utilizing regularity of data is as follows:

1. Calculate all maximum repeated substrings in the given string.

2. Since each repeated substring needs to be stored only once, replace the second and later occurrences of the same substring with a code that gives the position of the first occurrence of the substring.
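The two steps above can be illustrated with a small sketch. This is not the authors' procedure: it substitutes a naive greedy search for the computation of all maximum repeated substrings, so it runs in quadratic time or worse, not the linear time the paper reports. The function names, the token format, and the `min_len` threshold are assumptions made for illustration only.

```python
def compress_repeats(s, min_len=3):
    """Replace later occurrences of a substring by (position, length).

    Naive greedy stand-in: at each position, take the longest substring
    that already occurs entirely inside the text seen so far.
    min_len is a hypothetical threshold below which a copy is not worthwhile.
    """
    tokens = []
    i = 0
    while i < len(s):
        best_pos, best_len = -1, 0
        length = min_len
        # Grow the candidate while it still occurs wholly within s[:i].
        while i + length <= len(s):
            pos = s.find(s[i:i + length], 0, i)  # lowest index = first occurrence
            if pos == -1:
                break
            best_pos, best_len = pos, length
            length += 1
        if best_len >= min_len:
            tokens.append((best_pos, best_len))  # code pointing at first occurrence
            i += best_len
        else:
            tokens.append(s[i])  # literal character
            i += 1
    return tokens


def decompress_repeats(tokens):
    """Rebuild the string by copying from the already decoded prefix."""
    out = ""
    for t in tokens:
        if isinstance(t, tuple):
            pos, length = t
            out += out[pos:pos + length]
        else:
            out += t
    return out
```

Because every code refers to a position inside the already decoded prefix, decoding needs only a single left-to-right pass over the tokens.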

The procedure to compress two strings w₁ and w₂ utilizing data similarity is as follows:

1. Calculate all maximum common substrings of w₁ and w₂.

2. Find a minimum cover for w₂ using the maximum common substrings contained in w₁.

3. Encode w₂ as a sequence of codes, each of which identifies a substring of w₁.
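The three steps above can be sketched as follows. This is only an illustration under stated assumptions: a greedy longest-match scan stands in for both the computation of maximum common substrings and the minimum cover step, and the sketch does not verify that the cover it produces is minimal. It is also far slower than the linear-time procedures the paper describes. The function names, the code format, and `min_len` are hypothetical.

```python
def encode_against(w1, w2, min_len=2):
    """Encode w2 as codes (position in w1, length), plus literals.

    Greedy stand-in for the cover step: at each position of w2, take the
    longest substring that also occurs somewhere in w1.
    min_len is a hypothetical threshold for emitting a code over a literal.
    """
    codes = []
    i = 0
    while i < len(w2):
        best_pos, best_len = -1, 0
        length = 1
        # Extend the match while the candidate is still a substring of w1.
        while i + length <= len(w2):
            pos = w1.find(w2[i:i + length])
            if pos == -1:
                break
            best_pos, best_len = pos, length
            length += 1
        if best_len >= min_len:
            codes.append((best_pos, best_len))  # code identifying a substring of w1
            i += best_len
        else:
            codes.append(w2[i])  # character with no useful match in w1
            i += 1
    return codes


def decode_against(w1, codes):
    """Rebuild w2 from w1 and the code sequence."""
    parts = []
    for c in codes:
        if isinstance(c, tuple):
            pos, length = c
            parts.append(w1[pos:pos + length])
        else:
            parts.append(c)
    return "".join(parts)
```

For two versions of the same text, for example `encode_against("the quick brown fox", "the slow brown fox")`, most of the second string is covered by a few codes into the first, and only the changed characters are stored as literals.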

These procedures are shown to require time only proportional to the total length of data and thus they are efficient. Combinations and variations of these two procedures are also discussed in the paper.
