Huffman coding is an entropy-based compression algorithm that assigns variable-length binary codes to symbols based on their frequency of occurrence. It can be used to compress many types of data, including raster images. The document demonstrates how Huffman coding works by compressing a sample 5x5 image with 8 colors. By building a Huffman tree and assigning shorter codes to more common colors, the 200-bit uncompressed image is compressed to 41 bits, reducing the file size by over 5x. Lossless JPEG compression and RAW photo formats use Huffman coding to compress images without losing any information.
Huffman coding is an entropy-based compression algorithm that assigns variable-length binary codes to symbols based on their frequency of occurrence. It can be used to compress many types of data, including raster images. The document demonstrates how Huffman coding works by compressing a sample 5x5 image with 8 colors. By building a Huffman tree and assigning shorter codes to more common colors, the 200-bit uncompressed image is compressed to 41 bits, reducing the file size by over 5x. Lossless JPEG compression and RAW photo formats use Huffman coding to compress images without losing any information.
Huffman coding is an entropy-based compression algorithm that assigns variable-length binary codes to symbols based on their frequency of occurrence. It can be used to compress many types of data, including raster images. The document demonstrates how Huffman coding works by compressing a sample 5x5 image with 8 colors. By building a Huffman tree and assigning shorter codes to more common colors, the 200-bit uncompressed image is compressed to 41 bits, reducing the file size by over 5x. Lossless JPEG compression and RAW photo formats use Huffman coding to compress images without losing any information.
Huffman coding is an entropy-based compression algorithm that assigns variable-length binary codes to symbols based on their frequency of occurrence. It can be used to compress many types of data, including raster images. The document demonstrates how Huffman coding works by compressing a sample 5x5 image with 8 colors. By building a Huffman tree and assigning shorter codes to more common colors, the 200-bit uncompressed image is compressed to 41 bits, reducing the file size by over 5x. Lossless JPEG compression and RAW photo formats use Huffman coding to compress images without losing any information.
Download as DOCX, PDF, TXT or read online from Scribd
Download as docx, pdf, or txt
You are on page 1of 13
Huffman Coding –
Base of JPEG Image
Compression Huffman coding can be used to compress all sorts of data. It is an entropy-based algorithm that relies on an analysis of the frequency of symbols in an array.
Huffman coding can be
demonstrated most vividly by compressing a raster image. Suppose we have a 5×5 raster image with 8-bit color, i.e. 256 different colors. The uncompressed image will take 5 x 5 x 8 = 200 bits of storage. First, we count up how many times each color occurs in the image. Then we sort the colors in order of decreasing frequency. We end up with a row that looks like
this:
Now we put the colors together by
building a tree such that the colors farthest from the root are the least frequent. The colors are joined in
pairs, with a node forming the
connection. A node can connect either to another node or to a color. In our example, the tree might look like this:
Our result is known as a Huffman
tree. It can be used for encoding and decoding. Each color is encoded as follows. We create codes by moving from the root of the tree to each
color. If we turn right at a node, we
write a 1, and if we turn left – 0. This process yields a Huffman code table in which each symbol is assigned a bit code such that the most frequently occurring symbol has the shortest code, while the least common symbol is given the longest code. The Huffman tree and code table we created are not the only ones possible. An alternative Huffman tree that looks like this could be created for our image:
The corresponding code table
would then be:
Using the variant is preferable in our
example. This is because it provides better compression for our specific image. Because each color has a unique bit code that is not a prefix of any other, the colors can be replaced by their bit codes in the image file. The most frequently occurring color, white, will be represented with just a single bit rather than 8 bits. Black will take two bits. Red and blue will take three. After these replacements are made, the 200-bit image will be compressed to 14 x 1 + 6 x 2 + 3 x 3 + 2 x 3 = 41 bits, which is about 5 bytes compared to 25 bytes in the original image.
Of course, to decode the image the
compressed file must include the code table, which takes up some space. Each bit code derived from the Huffman tree unambiguously identifies a color, so the compression loses no information. This compression technique is used broadly to encode music, images, and certain communication protocols. Typically, a variation of the algorithm is used for improved efficiency. The method described is generally part of general compression algorithms such as Flate-ZIP for images or FLAC for music.
Lossless JPEG compression uses
the Huffman algorithm in its pure form. Lossless JPEG is common in medicine as part of the DICOM standard, which is supported by the major medical equipment manufacturers (for use in ultrasound machines, nuclear resonance imaging machines, MRI machines, and electron microscopes). Variations of the Lossless JPEG algorithm are also used in the RAW format, which is popular among photo enthusiasts because it saves data from a camera’s image sensor without losing information.