81-88 Watson.mj 7/21/99 10:34 AM Page 81

Image Compression Using the Discrete Cosine Transform
Andrew B. Watson, NASA Ames Research Center

The discrete cosine transform (DCT) is a technique for converting a signal into elementary
frequency components. It is widely used in image compression. Here we develop some simple
functions to compute the DCT and to compress images. These functions illustrate the power of
Mathematica in the prototyping of image processing algorithms.

The rapid growth of digital imaging applications, including desktop publishing, multimedia,
teleconferencing, and high-definition television (HDTV), has increased the need for effective
and standardized image compression techniques. Among the emerging standards are JPEG, for
compression of still images [Wallace 1991]; MPEG, for compression of motion video [Puri 1992];
and CCITT H.261 (also known as Px64), for compression of video telephony and teleconferencing.

All three of these standards employ a basic technique known as the discrete cosine transform
(DCT). Developed by Ahmed, Natarajan, and Rao [1974], the DCT is a close relative of the
discrete Fourier transform (DFT). Its application to image compression was pioneered by Chen
and Pratt [1984]. In this article, I will develop some simple functions to compute the DCT and
show how it is used for image compression. We have used these functions in our laboratory to
explore methods of optimizing image compression for the human viewer, using information about
the human visual system [Watson 1993]. The goal of this paper is to illustrate the use of
Mathematica in image processing and to provide the reader with the basic tools for further
exploration of this subject.

The One-Dimensional Discrete Cosine Transform

The discrete cosine transform of a list of n real numbers s(x), x = 0, ..., n - 1, is the list
of length n given by:

\[ S(u) = \sqrt{\frac{2}{n}}\, C(u) \sum_{x=0}^{n-1} s(x) \cos\frac{(2x+1)u\pi}{2n}, \qquad u = 0, \ldots, n-1 \]

where

\[ C(u) = \begin{cases} 2^{-1/2} & \text{for } u = 0 \\ 1 & \text{otherwise} \end{cases} \]

Each element of the transformed list S(u) is the inner (dot) product of the input list s(x)
and a basis vector. The constant factors are chosen so that the basis vectors are orthogonal
and normalized. The eight basis vectors for n = 8 are shown in Figure 1. The DCT can be written
as the product of a vector (the input list) and the n × n orthogonal matrix whose rows are the
basis vectors. This matrix, for n = 8, can be computed as follows:

In[1]:= DCTMatrix =
          Table[ If[ k==0,
                     Sqrt[1/8],
                     Sqrt[2/8] Cos[Pi (2j + 1) k/16] ],
                 {k, 0, 7}, {j, 0, 7}] // N;

We can check that the matrix is orthogonal:

In[2]:= DCTMatrix . Transpose[DCTMatrix] // Chop // MatrixForm
Out[2]//MatrixForm=
        1.  0   0   0   0   0   0   0
        0   1.  0   0   0   0   0   0
        0   0   1.  0   0   0   0   0
        0   0   0   1.  0   0   0   0
        0   0   0   0   1.  0   0   0
        0   0   0   0   0   1.  0   0
        0   0   0   0   0   0   1.  0
        0   0   0   0   0   0   0   1.

Each basis vector corresponds to a sinusoid of a certain frequency:

In[3]:= Show[GraphicsArray[Partition[
          ListPlot[#, PlotRange -> {-.5, .5}, PlotJoined -> True,
                   DisplayFunction -> Identity]&
          /@ DCTMatrix, 2] ]]
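As a cross-check on the definition, the sum for S(u) can be transcribed almost symbol for symbol. The following is a minimal sketch written for clarity rather than speed; the name dctDirect is hypothetical and is not part of the article's code.

```mathematica
(* Direct transcription of the 1D DCT definition for a list of length n.
   dctDirect is a hypothetical helper name, not part of the article's code. *)
dctDirect[s_List] :=
  With[{n = Length[s]},
    Table[
      Sqrt[2/n] * If[u == 0, 1/Sqrt[2], 1] *
        Sum[s[[x + 1]] Cos[(2 x + 1) u Pi/(2 n)], {x, 0, n - 1}],
      {u, 0, n - 1}]]

(* A constant list has only a DC component: Sqrt[1/8] * 8 = 2.82843,
   and every other coefficient vanishes after Chop *)
dctDirect[{1, 1, 1, 1, 1, 1, 1, 1}] // N // Chop
```

For lists of length 8, the result should agree with the product of DCTMatrix and the same list, since each row of DCTMatrix holds the factors that multiply s(x) in the sum.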
Andrew B. Watson is the Senior Scientist for Vision Research at NASA Ames
Research Center in Mountain View, California, where he works on models of
visual perception and their application to the coding, understanding, and display
of visual information. He is the author of over 60 articles on human vision, image
processing, and robotic vision.
FIGURE 1. The eight basis vectors for the discrete cosine transform of length eight.

The list s(x) can be recovered from its transform S(u) by applying the inverse discrete cosine
transform (IDCT):

\[ s(x) = \sqrt{\frac{2}{n}} \sum_{u=0}^{n-1} S(u)\, C(u) \cos\frac{(2x+1)u\pi}{2n}, \qquad x = 0, \ldots, n-1 \]

where C(u) is as defined above. This equation expresses s as a linear combination of the basis
vectors. The coefficients are the elements of the transform S, which may be regarded as
reflecting the amount of each frequency present in the input s.

We generate a list of random numbers to serve as a test input:

In[4]:= input1 = Table[Random[Real, {-1, 1}], {8}]
Out[4]= {0.142689, 0.539381, -0.964253, -0.70434, -0.98625,
         0.789134, -0.368739, -0.175656}

The DCT is computed by matrix multiplication:

In[5]:= output1 = DCTMatrix . input1
Out[5]= {-0.610952, 0.0740846, 0.83188, 0.825302, -0.607786,
         -0.410739, 0.157452, -1.0884}

As noted above, the DCT is closely related to the discrete Fourier transform (DFT). In fact, it
is possible to compute the DCT via the DFT (see [Jain 1989, p. 152]): First create a new list by
extracting the even-indexed elements s(0), s(2), ..., followed by the odd-indexed elements in
reverse order. Then multiply the DFT of this reordered list by so-called "twiddle factors" and
take the real part. We can carry out this process for n = 8 using Mathematica's DFT function.

In[6]:= DCTTwiddleFactors = N @ Join[{1},
          Table[Sqrt[2] Exp[-I Pi k /16], {k, 7}]]
Out[6]= {1., 1.38704 - 0.275899 I, 1.30656 - 0.541196 I,
         1.17588 - 0.785695 I, 1. - 1. I, 0.785695 - 1.17588 I,
         0.541196 - 1.30656 I, 0.275899 - 1.38704 I}

The function to compute the DCT of a list of length n = 8 is then:

In[7]:= DCT[list_] := Re[ DCTTwiddleFactors *
          InverseFourier[N[list[[{1, 3, 5, 7, 8, 6, 4, 2}]]]]]

Note that we use the function InverseFourier to implement what engineers usually call the
forward DFT. Likewise, we use Fourier to implement what is usually called the inverse DFT. The
function N is used to convert integers to reals because (in Version 2.2) Fourier and
InverseFourier are not evaluated numerically when their arguments are all integers. The special
case of a list of zeros needs to be handled separately by overloading the functions, since N of
the integer 0 is an integer 0 and not a real 0.0.

In[8]:= Unprotect[Fourier, InverseFourier];
        Fourier[x:{0 ..}]:= x;
        InverseFourier[x:{0 ..}]:= x;
        Protect[Fourier, InverseFourier];

We apply DCT to our test input and compare it to the earlier result computed by matrix
multiplication. To compare the results, we subtract them and apply the Chop function to
suppress values very close to zero:

In[9]:= DCT[input1]
Out[9]= {-0.610952, 0.0740846, 0.83188, 0.825302, -0.607786,
         -0.410739, 0.157452, -1.0884}

In[10]:= % - output1 // Chop
Out[10]= {0, 0, 0, 0, 0, 0, 0, 0}

The inverse DCT can be computed by multiplication with the inverse of the DCT matrix. We
illustrate this with our previous example:

In[11]:= Inverse[DCTMatrix] . output1
Out[11]= {0.142689, 0.539381, -0.964253, -0.70434, -0.98625,
          0.789134, -0.368739, -0.175656}

In[12]:= % - input1 // Chop
Out[12]= {0, 0, 0, 0, 0, 0, 0, 0}

As you might expect, the IDCT can also be computed via the inverse DFT. The "twiddle factors"
are the complex conjugates of the DCT factors and the reordering is applied at the end rather
than the beginning:

In[13]:= IDCTTwiddleFactors = Conjugate[DCTTwiddleFactors]
Out[13]= {1., 1.38704 + 0.275899 I, 1.30656 + 0.541196 I,
          1.17588 + 0.785695 I, 1. + 1. I, 0.785695 + 1.17588 I,
          0.541196 + 1.30656 I, 0.275899 + 1.38704 I}

In[14]:= IDCT[list_] := Re[Fourier[
           IDCTTwiddleFactors list] ][[{1, 8, 2, 7, 3, 6, 4, 5}]]

For example:

In[15]:= IDCT[DCT[input1]] - input1 // Chop
Out[15]= {0, 0, 0, 0, 0, 0, 0, 0}
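The reordering and twiddle factors used above are written out for n = 8, but the same recipe can be stated for a list of any length n. If v is the reordered list, then S(u) is the real part of Sqrt[2] C(u) Exp[-I Pi u/(2 n)] times the forward DFT of v, where the DFT carries the factor 1/Sqrt[n] that InverseFourier applies by default. The following sketch makes that explicit. The name dctViaDFT is hypothetical, and the code assumes InverseFourier uses the sign and normalization conventions described in the text.

```mathematica
(* Sketch: DCT of a list of length n via the DFT.
   dctViaDFT is a hypothetical name; for n = 8 it reduces to the
   reordering {1, 3, 5, 7, 8, 6, 4, 2} and to DCTTwiddleFactors. *)
dctViaDFT[list_List] :=
  Module[{n = Length[list], order, twiddle},
    (* positions 1, 3, 5, ... followed by the remaining positions reversed *)
    order = Join[Range[1, n, 2], Reverse[Range[2, n, 2]]];
    (* 1 for u = 0, Sqrt[2] Exp[-I Pi u/(2 n)] for u > 0 *)
    twiddle = Join[{1}, Table[Sqrt[2] Exp[-I Pi u/(2 n)], {u, n - 1}]];
    Re[twiddle * InverseFourier[N[list[[order]]]]]]

(* For n = 8 this should agree with the matrix product, up to rounding *)
dctViaDFT[input1] - DCTMatrix . input1 // Chop
```

The fixed-length DCT defined in the article is still the better choice for 8-element blocks, since it avoids rebuilding the twiddle factors on every call.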
The Two-Dimensional DCT

The one-dimensional DCT is useful in processing one-dimensional signals such as speech
waveforms. For analysis of two-dimensional (2D) signals such as images, we need a 2D version of
the DCT. For an n × m matrix s, the 2D DCT is computed in a simple way: The 1D DCT is applied
to each row of s and then to each column of the result. Thus, the transform of s is given by

\[ S(u, v) = \frac{2}{\sqrt{nm}}\, C(u)\, C(v) \sum_{y=0}^{m-1} \sum_{x=0}^{n-1} s(x, y) \cos\frac{(2x+1)u\pi}{2n} \cos\frac{(2y+1)v\pi}{2m}, \]
\[ u = 0, \ldots, n-1; \qquad v = 0, \ldots, m-1 \]

Since the 2D DCT can be computed by applying 1D transforms separately to the rows and columns,
we say that the 2D DCT is separable in the two dimensions.

As in the one-dimensional case, each element S(u, v) of the transform is the inner product of
the input and a basis function, but in this case, the basis functions are n × m matrices. Each
two-dimensional basis matrix is the outer product of two of the one-dimensional basis vectors.
For n = m = 8, the following expression creates an 8 × 8 array of the 8 × 8 basis matrices, a
tensor with dimensions {8, 8, 8, 8}:

In[16]:= DCTTensor = Array[
           Outer[Times, DCTMatrix[[#1]], DCTMatrix[[#2]]]&,
           {8, 8}];

Each basis matrix can be thought of as an image. The 64 basis images in the array are shown in
Figure 2.

The package GraphicsImage.m, included in the electronic supplement, contains the functions
GraphicsImage and ShowImage to create and display a graphics object from a given matrix.
GraphicsImage uses the built-in function Raster to translate a matrix into an array of gray
cells. The matrix elements are scaled so that the image spans the full range of graylevels. An
optional second argument specifies a range of values to occupy the full grayscale; values
outside the range are clipped. The function ShowImage displays the graphics object using Show.

In[17]:= << GraphicsImage.m

In[18]:= Show[GraphicsArray[
           Map[GraphicsImage[#, {-.25, .25}]&,
               Reverse[DCTTensor],
               {2}] ]]

FIGURE 2. The 8 × 8 array of basis images for the 2D discrete cosine transform.

Each basis matrix is characterized by a horizontal and a vertical spatial frequency. The
matrices shown here are arranged left to right and bottom to top in order of increasing
frequencies.

To illustrate the 2D transform, we apply it to an 8 × 8 image of the letter A:

In[19]:= ShowImage[ input2 =
           {{0, 1, 0, 0, 0, 1, 0, 0}, {0, 1, 0, 0, 0, 1, 0, 0},
            {0, 1, 1, 1, 1, 1, 0, 0}, {0, 1, 0, 0, 0, 1, 0, 0},
            {0, 0, 1, 0, 1, 0, 0, 0}, {0, 0, 1, 0, 1, 0, 0, 0},
            {0, 0, 1, 0, 1, 0, 0, 0}, {0, 0, 0, 1, 0, 0, 0, 0}}]

As in the 1D case, it is possible to express the 2D DCT as an array of inner products (a tensor
contraction):

In[20]:= output2 = Array[
           (Plus @@ Flatten[DCTTensor[[#1, #2]] input2])&,
           {8, 8}];
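Because each basis matrix is an outer product of two rows of DCTMatrix, the same contraction can also be written as a pair of ordinary matrix products. The sketch below assumes that DCTMatrix, input2, and output2 have been evaluated as above; the name output2Matrix is introduced here only for illustration.

```mathematica
(* 2D DCT of an 8 x 8 array as two matrix products: DCTMatrix acts on
   the first index from the left and on the second index from the right. *)
output2Matrix = DCTMatrix . input2 . Transpose[DCTMatrix];

(* Should agree with the tensor contraction, up to rounding *)
Max[Abs[output2Matrix - output2]] // Chop

(* Since DCTMatrix is orthogonal, its transpose undoes the transform *)
Max[Abs[Transpose[DCTMatrix] . output2Matrix . DCTMatrix - input2]] // Chop
```

Both comparisons should come out as zero after Chop. The matrix form touches each input element far fewer times than the contraction over all 64 basis images, which is one way to see why separability matters in practice.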
In[21]:= ShowImage[output2]

The pixels in this DCT image describe the proportion of each two-dimensional basis function
present in the input image. The pixels are arranged as in Figure 2, with horizontal and
vertical frequency increasing from left to right and bottom to top, respectively. The brightest
pixel in the lower left corner is known as the DC term, with frequency {0, 0}. It is
proportional to the average of the pixels in the input (for an 8 × 8 block, eight times the
average), and is typically the largest coefficient in the DCT of "natural" images.

A 2D IDCT can also be computed in terms of DCTTensor; we leave this as an exercise for the
reader.

Since the two-dimensional DCT is separable, we can extend our function DCT to the case of
two-dimensional input as follows:

In[22]:= DCT[array_?MatrixQ] :=
           Transpose[DCT /@ Transpose[DCT /@ array] ]

This function assumes that its input is an 8 × 8 matrix. It takes the 1D DCT of each row,
transposes the result, takes the DCT of each new row, and transposes again. This method is more
efficient than computing the tensor contraction shown above, since it exploits the built-in
function InverseFourier.

We compare the result of this function to that obtained using contraction of tensors:

In[23]:= DCT[input2] - output2 // Chop // Abs // Max
Out[23]= 0

The definition of the inverse 2D DCT is straightforward:

In[24]:= IDCT[array_?MatrixQ] :=
           Transpose[IDCT /@ Transpose[IDCT /@ array] ]

As an example, we invert the transform of the letter A:

In[25]:= ShowImage[Chop[IDCT[output2]]]

As noted earlier, the components of the DCT output indicate the magnitude of image components
at various 2D spatial frequencies. To illustrate, we can set the last row and column of the DCT
of the letter A equal to zero:

In[26]:= output2[[8]] = Table[0, {8}];
         Do[output2[[i, 8]] = 0, {i, 8}];

Now take the inverse transform:

In[27]:= ShowImage[Chop[IDCT[output2]]]

The result is a blurred letter A: The highest horizontal and vertical frequencies have been
removed. This is easiest to see when the image is reduced in size so that individual pixels are
not so visible.

2D Blocked DCT

To this point, we have defined functions to compute the DCT of a list of length n = 8 and the
2D DCT of an 8 × 8 array. We have restricted our attention to this case partly for simplicity
of exposition, and partly because when it is used for image compression, the DCT is typically
restricted to this size. Rather than taking the transformation of the image as a whole, the DCT
is applied separately to 8 × 8 blocks of the image. We call this a blocked DCT.
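Read literally, this definition says: cut the image into 8 × 8 blocks and transform each one. A minimal sketch of that reading follows. It assumes the DCTMatrix defined earlier has been evaluated and that both image dimensions are multiples of 8 (Partition silently drops any leftover rows or columns). The name blockDCT is hypothetical and is not part of the article's code.

```mathematica
(* Literal reading of the blocked DCT: cut the image into 8 x 8 blocks and
   transform each block separately. blockDCT is a hypothetical name; the
   result is an array of 8 x 8 coefficient blocks, not a reassembled image. *)
blockDCT[image_?MatrixQ] :=
  Map[DCTMatrix . # . Transpose[DCTMatrix] &,
      Partition[image, {8, 8}], {2}]

(* A 16 x 16 test image yields a 2 x 2 array of 8 x 8 blocks: {2, 2, 8, 8} *)
Dimensions[blockDCT[Table[i + j, {i, 16}, {j, 16}]]]
```

This form keeps the block structure visible, at the cost of leaving the coefficients in a four-dimensional array that would still have to be stitched back into a single image.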
To compute a blocked DCT, we do not actually have to divide the image into blocks. Since the 2D
DCT is separable, we can partition each row into lists of length 8, apply the DCT to them,
rejoin the resulting lists, and then transpose the whole image and repeat the process:

In[28]:= DCT[list_?(Length[#]>8&)] :=
           Join @@ (DCT /@ Partition[list, 8])

It may be worth tracing the progress of this deceptively simple piece of code as it works upon
a 16 × 16 image. First, we observe the order in which Mathematica stores the three rules we
have given for DCT:

In[29]:= ?DCT
Global`DCT
DCT[(array_)?MatrixQ] :=
  Transpose[DCT /@ Transpose[DCT /@ array]]
DCT[(list_)?(Length[#1] > 8 & )] :=
  Apply[Join, DCT /@ Partition[list, 8]]
DCT[list_] :=
  Re[DCTTwiddleFactors*InverseFourier[N[list[[{1, 3, 5,
    7, 8, 6, 4, 2}]]]]]

When evaluating DCT of a 16 × 16 image, Mathematica begins by checking the first rule. It
recognizes that the input is a matrix, and thus invokes the rule and applies DCT to each row.
When DCT is applied to a row of length 16, the second rule comes into play. The row is
partitioned into two lists of length 8, and DCT is applied to each. These applications invoke
the last rule, which simply computes the 1D DCT of the lists of length 8. The two sub-rows are
then rejoined by the second rule. After each row has been transformed in this way, the entire
matrix is transposed by the first rule. The process of partitioning, transforming, and
rejoining each row is then repeated, and the resulting matrix is transposed again.

For a test image, we provide a small 64 × 64 picture of a space shuttle launch. We use the
utility function ReadImageRaw, defined in the package GraphicsImage.m, to read a matrix of
graylevels from a file:

In[30]:= shuttle = ReadImageRaw["shuttle", {64, 64}];

In[31]:= ShowImage[shuttle]

Applying the DCT to this image gives an image consisting of 64 blocks, each a DCT of 8 × 8
pixels:

In[32]:= ShowImage[DCT[shuttle], {-300, 300}]

The lattice of bright dots is formed by the DC coefficients from each of the DCT blocks. To
reduce the dominance of these terms, we display the image with a clipped graylevel range. Note
also the greater activity in the lower left compared to the upper right, which corresponds
mainly to uniform sky.

The inverse DCT of a list of length greater than 8 is defined in the same way as the forward
transform:

In[33]:= IDCT[list_?(Length[#]>8&)] :=
           Join @@ (IDCT /@ Partition[list, 8])

Here is a simple test:

In[34]:= shuttle - IDCT[DCT[shuttle]] // Chop // Abs // Max
Out[34]= 0

Quantization

DCT-based image compression relies on two techniques to reduce the data required to represent
the image. The first is quantization of the image's DCT coefficients; the second is entropy
coding of the quantized coefficients. Quantization is the process of reducing the number of
possible values of a quantity, thereby reducing the number of bits needed to represent it.
Entropy coding is a technique for representing the quantized data as compactly as possible. We
will develop functions to quantize images and to calculate the level of compression provided by
different degrees of quantization. We will not implement the entropy coding required to create
a compressed image file.

A simple example of quantization is the rounding of reals into integers. To represent a real
number between 0 and 7 to some specified precision takes many bits. Rounding the number to the
nearest integer gives a quantity that can be represented by just three bits.

In[35]:= x = Random[Real, {0, 7}]
Out[35]= 5.79566
In[36]:= Round[x]
Out[36]= 6

In this process, we reduce the number of possible values of the quantity (and thus the number
of bits needed to represent it) at the cost of losing information. A "finer" quantization, that
allows more values and loses less information, can be obtained by dividing the number by a
weight factor before rounding:

In[37]:= w = 1/4;
In[38]:= Round[x/w]
Out[38]= 23

Taking a larger value for the weight gives a "coarser" quantization.

Dequantization, which maps the quantized value back into its original range (but not its
original precision), is achieved by multiplying the value by the weight:

In[39]:= w * % // N
Out[39]= 5.75

The quantization error is the change in a quantity after quantization and dequantization. The
largest possible quantization error is half the value of the quantization weight.

In the JPEG image compression standard, each DCT coefficient is quantized using a weight that
depends on the frequencies for that coefficient. The coefficients in each 8 × 8 block are
divided by a corresponding entry of an 8 × 8 quantization matrix, and the result is rounded to
the nearest integer.

In general, higher spatial frequencies are less visible to the human eye than low frequencies.
Therefore, the quantization factors are usually chosen to be larger for the higher frequencies.
The following quantization matrix is widely used for monochrome images and for the luminance
component of a color image. It is given in the JPEG standards documents, yet is not part of the
standard, so I call it the "de facto" matrix:

In[40]:= qLum =
           {{16, 11, 10, 16, 24, 40, 51, 61},
            {12, 12, 14, 19, 26, 58, 60, 55},
            {14, 13, 16, 24, 40, 57, 69, 56},
            {14, 17, 22, 29, 51, 87, 80, 62},
            {18, 22, 37, 56, 68,109,103, 77},
            {24, 35, 55, 64, 81,104,113, 92},
            {49, 64, 78, 87,103,121,120,101},
            {72, 92, 95, 98,112,100,103, 99}};

Displaying the matrix as a grayscale image shows the dependence of the quantization factors on
the frequencies:

In[41]:= ShowImage[qLum]

To implement the quantization process, we must partition the transformed image into 8 × 8
blocks:

In[42]:= BlockImage[image_, blocksize_:{8, 8}] :=
           Partition[image, blocksize] /;
             And @@ IntegerQ /@ (Dimensions[image]/blocksize)

The function UnBlockImage reassembles the blocks into a single image:

In[43]:= UnBlockImage[blocks_] :=
           Partition[
             Flatten[Transpose[blocks, {1, 3, 2}]],
             {Times @@ Dimensions[blocks][[{2, 4}]]}]

For example:

In[44]:= Table[i + 8(j-1), {j, 4}, {i, 6}] // MatrixForm
Out[44]//MatrixForm=
          1   2   3   4   5   6
          9  10  11  12  13  14
         17  18  19  20  21  22
         25  26  27  28  29  30

In[45]:= BlockImage[%, {2, 3}] // MatrixForm
Out[45]//MatrixForm=
          1   2   3      4   5   6
          9  10  11     12  13  14

         17  18  19     20  21  22
         25  26  27     28  29  30

In[46]:= UnBlockImage[%] // MatrixForm
Out[46]//MatrixForm=
          1   2   3   4   5   6
          9  10  11  12  13  14
         17  18  19  20  21  22
         25  26  27  28  29  30
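Before quantization is applied block by block, the scalar steps described in this section can be collected into a few small functions, which also makes the half-weight error bound easy to check. This is only a sketch: the names quantize, dequantize, and quantizationError are hypothetical and do not appear in the article's packages.

```mathematica
(* Scalar quantization with weight w; the helper names are hypothetical *)
quantize[x_, w_] := Round[x/w]
dequantize[q_, w_] := w q
quantizationError[x_, w_] := Abs[x - dequantize[quantize[x, w], w]]

(* The value used in the text: 5.79566 -> 23 -> 5.75, an error of 0.04566 *)
quantizationError[5.79566, 1/4]

(* Largest error over a fine grid of inputs between 0 and 7;
   it should come out at about w/2 = 0.125 *)
Max[quantizationError[#, 1/4] & /@ Range[0, 7, 0.001]]
```

The worst case occurs for inputs that fall midway between two multiples of the weight, which is why doubling the weight doubles the largest possible error.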
Our quantization function blocks the image, divides each
block (element-by-element) by the quantization matrix,
reassembles the blocks, and then rounds the entries to the
nearest integer:
In[47]:= DCTQ[image_, qMatrix_] :=
           Map[(#/qMatrix)&,
               BlockImage[image, Dimensions[qMatrix]],
               {2}] // UnBlockImage // Round

The dequantization function blocks the matrix, multiplies each block by the quantization
factors, and reassembles the matrix:

In[48]:= IDCTQ[image_, qMatrix_] :=
           Map[(# qMatrix)&,
               BlockImage[image, Dimensions[qMatrix]],
               {2}] // UnBlockImage

To show the effect of quantization, we will transform, quantize, and reconstruct our image of
the shuttle using the quantization matrix introduced above:

In[49]:= qshuttle = shuttle //
           DCT // DCTQ[#, qLum]& // IDCTQ[#, qLum]& // IDCT;

For comparison, we show the original image together with the quantized version:

In[50]:= Show[GraphicsArray[
           GraphicsImage[#, {0, 255}]& /@ {shuttle, qshuttle}]]

Note that some artifacts are visible, particularly around high-contrast edges. In the next
section, we will compare the visual effects and the amount of compression obtained from
different degrees of quantization.

Entropy

To measure how much compression is obtained from a quantization matrix, we use a famous theorem
of Claude Shannon [Shannon and Weaver 1949]. The theorem states that for a sequence of symbols
with no correlations beyond first order, no code can be devised to represent the sequence that
uses fewer bits per element of the sequence than the first-order entropy, which is given by

\[ h = -\sum_{i} p_i \log_2(p_i) \]

where p_i is the relative frequency of the ith symbol.

To compute the first-order entropy of a list of numbers, we use the function Frequencies, from
the standard package Statistics`DataManipulation`. This function counts the occurrences of each
distinct element in a list:

In[51]:= Frequencies[list_List] :=
           Map[{Count[list, #], #}&, Union[list]]

In[52]:= Characters["mississippi"]
Out[52]= {m, i, s, s, i, s, s, i, p, p, i}

In[53]:= Frequencies[%]
Out[53]= {{4, i}, {1, m}, {2, p}, {4, s}}

Calculating the first-order entropy is straightforward:

In[54]:= Entropy[list_] :=
           - Plus @@ N[# Log[2, #]]& @
             (First[Transpose[Frequencies[list]]]/Length[list])

For example, the entropy of a list of four distinct symbols is 2, so 2 bits are required to
code each symbol:

In[55]:= Entropy[{"a", "b", "c", "d"}]
Out[55]= 2.

Similarly, 1.82307 bits are required for this longer list:

In[56]:= Entropy[Characters["mississippi"]]
Out[56]= 1.82307

A list with more symbols and fewer repetitions requires more bits per symbol:

In[57]:= Entropy[Characters["california"]]
Out[57]= 2.92193

The appearance of fractional bits may be puzzling to some readers, since we think of a bit as a
minimal, indivisible unit of information. Fractional bits are a natural outcome of the use of
what are called "variable word-length" codes. Consider an image containing 63 pixels with a
graylevel of 255, and one pixel with a graylevel of 0. If we employ a code that uses a symbol of
length 1 bit to represent 255 and a symbol of length 2 bits to represent 0, then we need 65 bits
to represent the image, so the average bit-rate is 65/64 = 1.0156 bits per pixel. The entropy
as calculated above is a lower bound on this average bit-rate.

The compression ratio is another frequently used measure of how effectively an image has been
compressed. It is simply the ratio of the size of the image file before and after compression.
It is equal to the ratio of bit-rates, in bits per pixel, before and after compression. Since
the initial bit-rate is usually 8 bits per pixel and the entropy is our estimate of the
compressed bit-rate, the compression ratio is estimated by dividing 8 by the entropy.
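To make the lower bound concrete, the entropy of the 64-pixel example image can be evaluated with the Entropy function defined in this section, and the compression-ratio estimate wrapped in a one-line helper. The sketch assumes Frequencies and Entropy have been evaluated as above; the names twoLevelImage and EstimatedCompressionRatio are hypothetical.

```mathematica
(* The 64-pixel image from the text: 63 pixels at 255 and one pixel at 0 *)
twoLevelImage = Join[Table[255, {63}], {0}];

(* First-order entropy in bits per pixel; with relative frequencies
   1/64 and 63/64 this comes to roughly 0.116 *)
Entropy[twoLevelImage]

(* Estimated compression ratio for an original stored at 8 bits per pixel.
   EstimatedCompressionRatio is a hypothetical helper name. *)
EstimatedCompressionRatio[data_List] := 8/Entropy[Flatten[data]]
EstimatedCompressionRatio[twoLevelImage]
```

The entropy is far below the 1.0156 bits per pixel achieved by the simple code in the text, which illustrates that the entropy is a bound rather than the rate of any particular code.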
We will use the following function to examine the effects of quantization:

In[58]:= f[image_, qMatrix_] :=
           {Entropy[Flatten[#]], IDCT[IDCTQ[#, qMatrix]]}& @
             DCTQ[DCT[image], qMatrix]

This function transforms and quantizes an image, computes the entropy, and dequantizes and
reconstructs the image. It returns the entropy and the resulting image. A simple way to
experiment with different degrees of quantization is to divide the "de facto" matrix qLum by a
scalar and look at the results for various values of this parameter:

In[59]:= test = f[shuttle, qLum/#]& /@ {1/4, 1/2, 1, 4};

Here are the reconstructed images and the corresponding entropies:

In[60]:= Show[GraphicsArray[
           Partition[
             Apply[
               ShowImage[#2, {0, 255}, PlotLabel -> #1,
                         DisplayFunction -> Identity]&,
               test, 1],
             2] ] ]

Timing

Most of the computation time required to transform, quantize, dequantize, and reconstruct an
image is spent on forward and inverse DCT calculations. Because these transforms are applied to
blocks, the time required is proportional to the size of the image. On a SUN Sparcstation 2,
the timings increase (at a rate of 0.005 second/pixel) from about 20 seconds for a 64² pixel
image to about 320 seconds for 256² pixels.

These times are much longer than for comparable functions written in a low-level language such
as C. For example, a C program performed the same computations in under two seconds for an
image of 256² pixels, more than 100 times faster than our Mathematica functions. However, for
the purposes for which our code was developed, namely education, algorithm development, and
prototyping other applications, the timings are acceptable.

References

Ahmed, N., T. Natarajan, and K. R. Rao. 1974. On image processing and a discrete cosine
  transform. IEEE Transactions on Computers C-23(1): 90–93.
Chen, W. H., and W. K. Pratt. 1984. Scene adaptive coder. IEEE Transactions on Communications
  COM-32: 225–232.
Jain, A. K. 1989. Fundamentals of digital image processing. Prentice Hall: Englewood Cliffs, NJ.
Puri, A. 1992. Video coding using the MPEG-1 compression standard. Society for Information
  Display Digest of Technical Papers 23: 123–126.
Shannon, C. E., and W. Weaver. 1949. The mathematical theory of communication. Urbana:
  University of Illinois Press.
Wallace, G. 1991. The JPEG still picture compression standard. Communications of the ACM
  34(4): 30–44.
Watson, A. B. 1993. DCT quantization matrices visually optimized for individual images.
  Proceedings of the SPIE 1913: 202–216. (Human Vision, Visual Processing, and Digital
  Display IV. Rogowitz ed. SPIE. Bellingham, WA).

Andrew B. Watson
NASA Ames Research Center
[email protected]

The electronic supplement contains the packages DCT.m and GraphicsImage.m, and the image file
shuttle.
