Lossless Recompression of JPEG Images Using Transform Domain Intra Prediction


IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 32, 2023

Chentian Sun, Xiaopeng Fan, Senior Member, IEEE, and Debin Zhao, Member, IEEE

Abstract: JPEG, which was developed 30 years ago, is the most widely used image coding format, especially favored by resource-deficient devices, due to its simplicity and efficiency. With the evolution of the Internet and the popularity of mobile devices, a huge amount of user-generated JPEG images are uploaded to social media sites like Facebook and Flickr or stored in personal computers or notebooks, which leads to an increase in storage cost. However, the performance of JPEG is far from that of state-of-the-art coding methods. Therefore, the lossless recompression of JPEG images urgently needs to be studied, since it can further reduce the storage cost while maintaining the image fidelity. In this paper, a hybrid coding framework for the lossless recompression of JPEG images (LLJPEG) using transform domain intra prediction is proposed, including block partition and intra prediction, transform and quantization, and entropy coding. Specifically, in LLJPEG, intra prediction is first used to obtain a predicted block. Then the predicted block is transformed by DCT and quantized to obtain the predicted coefficients. After that, the predicted coefficients are subtracted from the original coefficients to get the DCT coefficient residuals. Finally, the DCT residuals are entropy coded. In LLJPEG, some new coding tools are proposed for intra prediction and the entropy coding is redesigned. The experiments show that LLJPEG can reduce the storage space by 29.43% and 26.40% on the Kodak and DIV2K datasets respectively without any loss for JPEG images, while maintaining low decoding complexity.

Index Terms: JPEG, recompression, lossless, intra prediction.

Manuscript received 23 September 2021; revised 31 August 2022; accepted 14 November 2022. Date of publication 7 December 2022; date of current version 16 December 2022. This work was supported in part by the National Key Research and Development Program of China under Grant 2021YFF0900500; in part by the National Natural Science Foundation of China (NSFC) under Grant 61972115 and Grant 62272128; and in part by the Media Innovation Laboratory, Architecture and Technology Innovation Department, Huawei Cloud, and the Media Service Product Department, Huawei Cloud. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Marc Antonini. (Corresponding author: Xiaopeng Fan.)

The authors are with the School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China, and also with the Peng Cheng Laboratory, Shenzhen 519055, China (e-mail: [email protected]; [email protected]; [email protected]).

Digital Object Identifier 10.1109/TIP.2022.3226409

1941-0042 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

JPEG is a compression standard for still images developed by the Joint Photographic Experts Group in the 1990s [1], [2]. Currently, JPEG is still the most important and widely used image compression format, due to its simplicity and efficiency [3], [4]. However, its compression performance is far from the state of the art.

Nowadays, with the development of digital devices, it has become a habit for people to use digital photo albums to record their lives, and digital images represented by JPEG have become an important part of social life. More importantly, more and more people use social media websites such as Facebook and Flickr to share and communicate [5], [6]. These websites have to save a large number of JPEG images uploaded by people every day. The massive application of JPEG images inevitably leads to a huge demand for storage resources [7], [8]. This means that how to reduce the storage cost of JPEG images (or files)¹ in the cloud or in personal computers or notebooks is an important problem to be studied.

¹ In this paper, a JPEG image or JPEG file means the generated bitstream of an image after JPEG encoding.

On the one hand, still image compression techniques have achieved great success in compression performance. For example, JPEG2000 [9], which is based on wavelets and arithmetic coding, not only obtains higher quality images, but also provides scalability. BPG [10], which utilizes the HEVC intra coding techniques, achieves similar image quality at only half of the JPEG file size. However, due to their complexity, they are not widely used. Meanwhile, lossless image compression methods have made some progress and are still under development. For example, JPEG-LS [11] applies a simple fixed context model to explore high-order correlation with Golomb type codes. CALIC [12] uses a large number of modeling contexts to condition a non-linear predictor and make it adaptive to varying source statistics. FLIF [13] builds the contexts as the nodes of decision trees with a context-adaptive binary arithmetic coding. The lossless coding techniques of HEVC [23], [24] and VVC [29], [30] explore more complex and efficient intra prediction. It should be noted that all these lossless image compression methods use the high-order spatial correlation to get higher compression performance. However, they are not efficient at losslessly compressing a JPEG file, and can even increase the JPEG file size when a lossless image encoder is applied to compress the decoded JPEG image.

On the other hand, the recompression of JPEG images has attracted attention because it can further reduce the storage space of JPEG files [14]. The recompression of JPEG images can be classified into two categories: lossy and lossless JPEG recompression. In lossy JPEG recompression, TinyPNG [15], Mozjpeg [16], and Guetzli [17], [18] achieve better compression performance by merging colors, changing the scanning order or using a human visual model. However, lossy JPEG recompression will inevitably lead to permanent loss of JPEG images, which is unacceptable especially for some applications such as medical research and criminal


investigation. Therefore, the lossless recompression of JPEG images is highly desirable. Jpegtran [19], [20] is a lossless JPEG recompression tool in which the transform coefficients and the corresponding quantization tables are rotated. Jpegtran achieves about 5% of JPEG image file size reduction. The lossless JPEG recompression method proposed in [21] exploits known image priors for reverse mapping to recover original fine quantization bin indices with deterministic guarantee, which can save about 7.76% of the JPEG storage space. Although some progress has been made in the lossless recompression of JPEG images, there is still room for further improvement.

As indicated in [19], for the lossless recompression of JPEG images, we need to find a way to reduce the redundancy of transform coefficients in JPEG files directly. In this paper, a lossless recompression of JPEG images (LLJPEG) using transform domain intra prediction is proposed. Since JPEG images are stored in the form of transform coefficients, LLJPEG is designed for the lossless recompression of transform coefficients to reduce the storage space of JPEG files. However, unlike the JPEG lossless recompression methods in [7], [8], [19], and [21], LLJPEG is a hybrid coding framework, including block partition and intra prediction, transform and quantization, and entropy coding. Specifically, in LLJPEG, intra prediction is first used to get a predicted block. Then the predicted block is transformed and quantized to obtain the predicted coefficients. After that, the predicted coefficients are subtracted from the original coefficients to get the DCT coefficient residuals. Finally, the DCT residuals are entropy coded. In LLJPEG, some new coding tools are proposed for intra prediction and the entropy coding is redesigned. The experiments show that LLJPEG can reduce the storage space by 29.43% and 26.40% on the Kodak and DIV2K datasets respectively without any loss for JPEG images, while maintaining a low decoding computational complexity.

The contributions of this paper can be summarized as follows:

1) A hybrid coding framework using transform domain intra prediction for the lossless recompression of JPEG images (LLJPEG) is proposed, including block partition and intra prediction, transform and quantization, and entropy coding.
2) Unlike HEVC or VVC, which reduce the redundancy between blocks in the spatial domain, LLJPEG uses intra prediction to reduce the redundancy between blocks in the transform domain.
3) Some new coding tools for intra prediction are proposed to improve the performance, such as the new decision for DC mode and the filtering after intra prediction. In addition, the two-stage intra prediction mode decision is also proposed to reduce the encoding complexity.
4) The entropy coding is redesigned, including the reorganization of transform residuals, the coding of skip flag, and the coefficient scanning order.
5) More importantly, LLJPEG improves the compression performance significantly, saving 29.43% and 26.40% storage space on the Kodak and DIV2K datasets respectively without any loss for JPEG images.

LLJPEG can be easily integrated into a cloud storage system. As shown in Fig. 1, when a user uploads a JPEG image, the cloudlet can use the LLJPEG encoder to recompress the JPEG image and store it in the central cloud. When the JPEG image is accessed again, the cloudlet can use the LLJPEG decoder to quickly recover the original JPEG image from the central cloud for the user to download. From the user's point of view, the image upload/download process and image quality are no different from those of typical cloud storage. From the perspective of service providers, with a small amount of extra computing, the cost of cloud storage is greatly reduced.

Fig. 1. Cloud storage system with LLJPEG.

LLJPEG can also be easily used by individual users to losslessly recompress their JPEG albums in personal computers or notebooks, much as Winzip is used to compress text files. Further, LLJPEG can be used to compress any JPEG image, for example, satellite images [22], medical images, and image databases for deep learning.

The outline of the paper is as follows. The related work is reviewed in Section II. Section III introduces the framework of LLJPEG, including the motivation, the overall design, block partition and intra prediction, transform and quantization, and entropy coding. Finally, the experimental results and conclusions are provided in Sections IV and V, respectively.

II. RELATED WORK

A. Lossy Recompression of JPEG Images

TinyPNG [15] is a high performance lossy JPEG recompression tool, which reduces the number of colors and saves bytes by intelligently selecting colors for merging. Mozjpeg [16] performs lossy JPEG recompression by improving the scanning process of DCT coefficients. Guetzli [17] is a lossy JPEG recompression tool developed by Google for high quality image compression. Guetzli uses a new human visual model [18] to determine which colors and details should be preserved or discarded. Guetzli uses a closed-loop optimizer to improve the JPEG global quantization table and DCT coefficient values, and applies the human visual model to the optimization process. Guetzli obtains more than 30% storage space saving, which is much higher than the others.


B. Lossless Recompression of JPEG Images

In recent years, the lossless recompression of JPEG images has received much attention. Jpegtran [19], [20] is a lossless recompression tool designed for JPEG images. The principle of jpegtran is to rotate the transform coefficients, reverse the sign in the DCT domain, and rotate the corresponding quantization table accordingly. Since the rotation of the coefficient matrix changes its coefficient scanning order, this slightly reduces the compressed file size, thus optimizing the overall compression performance. It should be pointed out that the output of jpegtran is still in JPEG format, which can be easily decoded by a JPEG decoder.

Recently, the lossless recompression of JPEG images has been applied to cloud storage [21] and personal digital album compression [7], [8]. In [21], a lossless recompression method for cloud storage of JPEG images is proposed, which exploits known signal priors, sparsity priors and graph signal smoothness priors for reverse mapping to recover original fine quantization bin indices with deterministic guarantee. For fast reverse mapping, small dictionaries and sparse graphs are used. It can save about 7.76% of the JPEG storage space. Lossless recompression methods for JPEG albums in personal digital devices are also proposed in [7] and [8]. The method in [7] explores the correlation between album images based on incremental structure from motion (SfM). SfM generates inherent geometric relationships between images to take advantage of the redundancy between them. The method in [8] jointly removes the inter-image redundancy in the feature, spatial, and frequency domains. An HEVC-like encoder is used to compress the JPEG photos, which achieves 12% bits saving on average.

III. FRAMEWORK OF THE LOSSLESS RECOMPRESSION OF JPEG IMAGES

A. Motivation

1) Intra Prediction in Hybrid Coding Frameworks: In HEVC [23], [24], the intra prediction includes 33 angle modes, DC mode, and Planar mode [25], [26]. What's more, the Most Probable Mode (MPM) [27] is designed to further improve the performance of intra coding. The quadtree structure adopted by HEVC supports different block sizes from 4×4 to 64×64 [28].

VVC [29], [30] is the latest generation of video coding standard. Compared with HEVC, its number of intra prediction modes is increased to 67 (65 angle modes, DC mode, and Planar mode) [31], [32]. It also introduces some new improvements for intra prediction, such as position-dependent intra prediction combination (PDPC) [33] and mode dependent intra smoothing (MDIS) [34]. Fig. 2 shows the intra prediction modes and the reference pixels in HEVC and VVC.

Fig. 2. Intra prediction modes and the reference pixels in HEVC and VVC.

Although HEVC-Intra and VVC-Intra have very high compression performance, they are designed based on pixel domain prediction and are not efficient at compressing JPEG files losslessly.

2) Intra Prediction in Transform Domain: As indicated in [19], for lossless recompression of JPEG files, we need to find a way to reduce the redundancy of transform coefficients in JPEG files directly. However, the intra prediction in HEVC or VVC uses the spatial correlation between adjacent blocks to reduce the redundancy of the current block. Therefore, we should adapt the intra prediction to the transform domain so that it predicts DCT coefficients.

In order to show that coding DCT coefficient residuals is more conducive to compression than coding the original DCT coefficients, we do the following experiment.

First, for an 8 × 8 block, four HEVC modes (DC, Planar, vertical, and horizontal) are applied in intra prediction and four predicted blocks are obtained. Then, the predicted blocks are transformed by DCT and quantized (based on the quantization table extracted from the JPEG file) to obtain the predicted DCT coefficient blocks. After that, the predicted DCT coefficients in each predicted DCT coefficient block are subtracted from the corresponding original coefficients to get four DCT coefficient residual blocks. Finally, the DCT coefficient residual block with the smallest sum of squared errors is selected.

The distributions are shown in Fig. 3, in which the red lines are the distributions of the original DCT coefficients directly extracted from JPEG files, while the blue lines are the distributions of the DCT coefficient residuals obtained through the above steps. It can be seen that the distributions of DCT coefficient residuals are more concentrated at 0, which is more conducive to compression.

B. Overall Framework of LLJPEG

In this paper, a hybrid coding framework for the lossless recompression of JPEG images using transform domain intra prediction (LLJPEG) is proposed. The overall structures of the LLJPEG encoder and decoder are shown in Fig. 4 and Fig. 5, respectively.

The LLJPEG encoder takes a JPEG file as input. The input JPEG file undergoes the JPEG entropy decoding process to extract the JPEG transform coefficients and the JPEG quantization table required for JPEG recompression. In LLJPEG encoding, the surrounding reference pixels are decoded by inverse quantization and inverse transform. Then, the predicted block is obtained by block partition and intra prediction. After that, the predicted block is transformed and quantized to obtain the predicted transform coefficients. In the following, the predicted coefficients are subtracted from the


Fig. 3. The distribution comparison of DCT coefficient residuals (blue lines) and original DCT coefficients (red lines) on Lena image (512 × 512) with
QF = 80.
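The experiment behind Fig. 3 can be sketched for a single 8 × 8 block as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three predictors are simplified stand-ins for the HEVC DC, vertical, and horizontal modes (Planar is omitted), and the names `dct2`, `to_coeffs`, `predict`, and `best_residual` are hypothetical helpers introduced here.

```python
import math

N = 8  # JPEG transform block size


def dct2(block):
    """Orthonormal 2-D DCT-II of an 8x8 block (list of lists)."""
    def scale(k):
        return math.sqrt((1 if k == 0 else 2) / N)
    cos = [[math.cos((2 * x + 1) * u * math.pi / (2 * N)) for x in range(N)]
           for u in range(N)]
    return [[scale(u) * scale(v) * sum(block[x][y] * cos[u][x] * cos[v][y]
                                       for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def to_coeffs(pixels, qtable):
    """Level-shift, transform, and quantize an 8x8 pixel block."""
    shifted = [[p - 128 for p in row] for row in pixels]
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(dct2(shifted), qtable)]


def predict(mode, top, left):
    """Simplified spatial predictors built from the reference pixels."""
    if mode == "dc":            # flat block at the mean of the references
        mean = round((sum(top) + sum(left)) / (2 * N))
        return [[mean] * N for _ in range(N)]
    if mode == "vertical":      # copy the row above downwards
        return [list(top) for _ in range(N)]
    if mode == "horizontal":    # copy the left column across
        return [[left[r]] * N for r in range(N)]
    raise ValueError(mode)


def best_residual(orig_coeffs, top, left, qtable,
                  modes=("dc", "vertical", "horizontal")):
    """Return (mode, residual block) with the smallest sum of squares."""
    best = None
    for mode in modes:
        pred_coeffs = to_coeffs(predict(mode, top, left), qtable)
        resid = [[o - p for o, p in zip(orow, prow)]
                 for orow, prow in zip(orig_coeffs, pred_coeffs)]
        sse = sum(r * r for row in resid for r in row)
        if best is None or sse < best[0]:
            best = (sse, mode, resid)
    return best[1], best[2]
```

Collecting the selected residuals over all blocks of an image and plotting their histogram next to that of the original coefficients would give a comparison in the spirit of Fig. 3.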

Fig. 4. The overall framework of LLJPEG encoder.

Fig. 5. The overall framework of LLJPEG decoder.
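Why the scheme in Fig. 4 and Fig. 5 is lossless can be shown with a small sketch. It assumes that the decoder can regenerate exactly the same predicted coefficients as the encoder, which holds when prediction uses only already-decoded data. The function names are hypothetical and entropy coding is left out.

```python
def encode_residual(orig_coeffs, pred_coeffs):
    """Encoder side: residual = original - predicted (integer coefficients)."""
    return [[o - p for o, p in zip(orow, prow)]
            for orow, prow in zip(orig_coeffs, pred_coeffs)]


def decode_coeffs(residual, pred_coeffs):
    """Decoder side: original = residual + predicted."""
    return [[r + p for r, p in zip(rrow, prow)]
            for rrow, prow in zip(residual, pred_coeffs)]


# Toy 2x2 example; real blocks are 8x8 quantized DCT coefficients.
orig = [[-36, 12], [3, 0]]
pred = [[-35, 10], [3, 1]]
resid = encode_residual(orig, pred)
restored = decode_coeffs(resid, pred)
```

Because both steps are exact integer operations, `restored` equals `orig` regardless of how good the prediction is; prediction quality only affects how small the residuals are.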

original coefficients to get the DCT coefficient residuals. Finally, the DCT coefficient residuals are encoded by the redesigned entropy coder.

For the LLJPEG decoder, the decoding process mirrors the encoding process step by step. After decoding the transform coefficients of all blocks, the LLJPEG decoder encodes the transform coefficients and the JPEG header information into the output JPEG file using the JPEG entropy encoder.

C. Block Partition and Intra Prediction

1) Block Partition in LLJPEG: The coding block size in JPEG is fixed to 8 × 8, which is the unit of transform, quantization and entropy coding. On the one hand, the fixed coding block size in JPEG cannot adapt to a variety of image contents, so LLJPEG can adopt a more flexible block partition to improve the coding performance. On the other hand, the transform and quantization unit of JPEG is the 8 × 8 block, which means that the transform and quantization unit of LLJPEG must be consistent with JPEG in order to perform intra prediction for the DCT coefficients in a JPEG file.

Based on the above observation, a quadtree structure for block partition is adopted by LLJPEG. In the block partition scheme, a unit called coding block (CU) is defined, the size of which is 8 × 8, 16 × 16, 32 × 32, or 64 × 64. CU is the


Fig. 6. An example of the coding tree structure of LLJPEG.
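The quadtree decision illustrated in Fig. 6 can be sketched as a recursive cost comparison. This is a simplified illustration under assumptions: `cost_fn` is a hypothetical callback returning the coding cost (in bits) of a CU coded without further splitting, and a one-bit partition flag is assumed for every CU larger than 8 × 8.

```python
def best_partition(x, y, size, cost_fn, flag_bits=1, min_size=8):
    """Return (cost, tree) for the cheapest quadtree partition of a CU.

    A leaf is the tuple (x, y, size); a split CU is a list of four subtrees
    in the order top-left, top-right, bottom-left, bottom-right.
    """
    # Cost of coding the CU whole; no flag is needed at the minimum size.
    whole = cost_fn(x, y, size) + (flag_bits if size > min_size else 0)
    if size == min_size:
        return whole, (x, y, size)

    half = size // 2
    split_cost = flag_bits
    children = []
    for dy in (0, half):
        for dx in (0, half):
            cost, tree = best_partition(x + dx, y + dy, half,
                                        cost_fn, flag_bits, min_size)
            split_cost += cost
            children.append(tree)

    if split_cost < whole:
        return split_cost, children
    return whole, (x, y, size)


def toy_cost(x, y, size):
    """Hypothetical cost: only the top-left corner is expensive to code whole."""
    return 4 if (x, y) != (0, 0) or size == 8 else 1000


total, tree = best_partition(0, 0, 64, toy_cost)
```

With this toy cost the encoder keeps splitting only around the expensive top-left corner and leaves the other regions as large CUs, which is the behaviour the coding tree is meant to capture.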

Fig. 7. Block partition of Lena image.

Fig. 8. Flowchart of the rough selection process for intra prediction.
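The rough selection stage ranks candidate modes with the cost of Eq. (1), the sum of absolute differences between predicted and original coefficients plus λ times the mode signalling bits. The sketch below shows only that ranking step. The thresholds on S_DC and S_AC from the flowchart are not reproduced, and `rough_cost`, `rough_select`, and the candidate layout are hypothetical.

```python
def rough_cost(pred_coeffs, orig_coeffs, mode_bits, lam=1):
    """Eq. (1): Sum(abs(C_p - C_o)) + lambda * R(i_mode)."""
    sad = sum(abs(p - o)
              for prow, orow in zip(pred_coeffs, orig_coeffs)
              for p, o in zip(prow, orow))
    return sad + lam * mode_bits


def rough_select(candidates, orig_coeffs, keep=2):
    """Keep the `keep` cheapest modes for the fine selection stage.

    `candidates` maps a mode name to (predicted coefficients, mode bits).
    """
    ranked = sorted(
        candidates,
        key=lambda m: rough_cost(candidates[m][0], orig_coeffs,
                                 candidates[m][1]))
    return ranked[:keep]


# Toy 2x2 coefficient blocks; real blocks hold quantized DCT coefficients.
orig = [[5, 0], [0, 0]]
candidates = {
    "dc": ([[5, 0], [0, 0]], 2),
    "planar": ([[4, 1], [0, 0]], 2),
    "angular_10": ([[0, 0], [0, 0]], 5),
}
shortlist = rough_select(candidates, orig)
```

Only the shortlisted modes (together with the MPM modes) would then go through the complete mode decision process with entropy coding.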


basic unit of intra prediction and entropy coding. Meanwhile, the transform and quantization units in LLJPEG are fixed to 8 × 8 blocks, as the data stored in JPEG files are all DCT coefficients of 8 × 8 blocks. What's more, the minimum size of CU is limited to 8 × 8 correspondingly. An example of the quadtree structure of LLJPEG is shown in Fig. 6.

For the LLJPEG encoder, starting from a 64 × 64 CU, the encoder will traverse each partition of the current CU and calculate the coding cost of each partition. The coding cost of each partition is the sum of the cost of luma and chroma intra prediction as well as the cost of encoding control information. The block partition with the lowest coding cost will be selected for the current CU. A flag indicating the partition mode of the current CU will be sent to the decoder side. Fig. 7 is an example of block partition of the Lena image.

2) Intra Prediction in LLJPEG: The intra prediction of LLJPEG is the same as HEVC intra prediction, which is shown in Fig. 2. According to 35 intra prediction modes, the predicted block is generated from a linear copy of the nearest row and column of the reference pixels.

LLJPEG is a lossless coding framework. Compared with the intra prediction in HEVC and VVC, there are some differences. First, the reference pixels in LLJPEG are directly extracted from the decoded JPEG image. Since the image decoded from the JPEG file is the same as the reconstructed image at the LLJPEG decoder, the reconstructed image is not needed in the LLJPEG encoder. Second, there is no need to further filter the reference pixels. Because there is no distortion in the reference pixels in LLJPEG, reference pixel filtering improves the compression performance little, but increases the complexity. Third, in LLJPEG, it is necessary to determine which mode is the best. We call this the mode decision process (MD process). Unlike MD cost calculation in HEVC and VVC, LLJPEG only considers the size of the bitstream in MD cost calculation, as the distortion is always zero.

3) Two-Stage Intra Prediction Mode Decision: The complete MD cost calculation process in LLJPEG includes four parts: intra prediction, DCT, quantization, and entropy coding. The size of the output bitstream of entropy coding is the MD cost of the current mode.

Considering that there are 35 modes for intra prediction, if each mode undergoes the complete MD process, the encoding complexity will be too high. In order to reduce the complexity, a new fast intra mode decision method is proposed for LLJPEG, which is designed based on the DC and AC coefficient distribution to further reduce the encoding computational complexity.

In LLJPEG, the MD process of intra prediction is divided into two stages: a rough selection stage and a fine selection stage. The flowchart of rough mode selection is shown in Fig. 8, in which $S_{DC}$ and $S_{AC}$ represent the sum of absolute values of the DC and AC coefficients in the corresponding original blocks of the current CU, respectively. The numbers of modes kept (two intra modes in Step 3, four intra modes in Step 4, and only DC and Planar in Step 5) are experimentally selected. When the DC coefficient of the current block is much larger than the sum of the AC coefficients, LLJPEG will directly determine that the DC and Planar modes are selected for fine selection for the current coding block, and skip the subsequent rough selection process.

In addition, in HEVC and VVC, it is necessary to perform a Hadamard transform on the residual block generated by subtracting the predicted block from the original block to estimate the output bitstream size of the current mode. However, in LLJPEG, DCT coefficient residuals can be obtained in the


encoding process directly and accurately, so the Sum of Absolute Hadamard Transformed Difference (SATD) is not needed. The cost of LLJPEG rough mode selection is calculated as follows:

$$\mathrm{cost}_{mode} = \mathrm{Sum}\big(\mathrm{abs}(C_p - C_o)\big) + \lambda \cdot R(i_{mode}) \tag{1}$$

$$C_p = \mathrm{DCT}(B_P) \tag{2}$$

where $\mathrm{cost}_{mode}$ represents the cost of the current intra mode in the rough mode selection stage. $B_P$ is the predicted block obtained by intra prediction. $C_p$ is the predicted DCT coefficient block. $C_o$ is the original DCT coefficient block extracted from JPEG files. $\lambda$ is set to 1, which is experimentally selected. $R(i_{mode})$ is the size of the bitstream needed to encode the intra prediction mode.

After rough selection, several candidate modes are obtained for fine selection. These candidates, combined with the MPM modes, go through a complete MD process to get the best intra prediction mode of the current CU. The experiment on the proposed two-stage intra prediction mode decision will be provided in Section IV-D.

4) New Decision for DC Mode: Considering that there is no distortion in the reference pixels of LLJPEG, we infer that the intra prediction mode of the current block can be determined directly according to the reference pixels in some cases. We calculate the probability of DC mode being selected for different CU sizes when all reference pixels are the same, with quality factor (QF) ranging from 10 to 90, on 24 images of the Kodak dataset. Table I shows the results. It can be seen that the average probability of selecting DC mode in this case is greater than 70%.

TABLE I: HIT RATE OF DC MODE WHEN REFERENCE PIXELS ARE THE SAME

Therefore, when the reference pixels of the current block are all the same, the DC mode is directly determined. What's more, it is not necessary to encode the intra mode flag for the decoder in this case.

5) Filtering After Intra Prediction: After intra prediction, some intra prediction modes may produce discontinuity at the boundary of the predicted blocks. For example, in DC mode, discontinuities may appear on the top and left boundary. In vertical mode, discontinuities may appear on the left boundary; and in horizontal mode, discontinuities may appear on the upper boundary.

In order to reduce the discontinuity, it is necessary to filter the boundary pixels after intra prediction according to the reference pixels. This filtering is only applicable to DC, horizontal modes (mode_9 to mode_11) and vertical modes (mode_25 to mode_27).

Suppose $(i, j)$ are the coordinates of a pixel in the current predicted block. $p(i, j)$ is the predicted pixel and $p'(i, j)$ is its filtered pixel. For vertical modes and horizontal modes, the filtered boundary pixels are calculated as follows, respectively:

$$p'(0, j) = \big(5\,p(0, j) + 2\,(p(-1, j) - p(-1, -1)) + (p(-2, j) - p(-2, -1)) + 4\big) \gg 3 \tag{3}$$

$$p'(i, 0) = \big(5\,p(i, 0) + 2\,(p(i, -1) - p(-1, -1)) + (p(i, -2) - p(-1, -2)) + 4\big) \gg 3 \tag{4}$$

For DC mode, the filtered boundary pixels are calculated as follows:

$$p'(0, 0) = \big(2\,p(-1, -1) + 3\,(p(-1, 0) + p(0, -1)) + 8\,dc + 8\big) \gg 4 \tag{5}$$

$$p'(i, 0) = \big(p(i, -2) + 2\,p(i, -1) + 5\,dc + 4\big) \gg 3 \tag{6}$$

$$p'(0, j) = \big(p(-2, j) + 2\,p(-1, j) + 5\,dc + 4\big) \gg 3 \tag{7}$$

The above filtering method is only applicable to the pixels in the row or column closest to the CU boundary. As a larger coding block usually corresponds to smoother texture, the filtering gain is usually small in this case, so the above filtering method is only used for CUs with size less than 32 × 32, and only for the luma component.

D. Transform and Quantization

In LLJPEG, the size of the predicted block ranges from 8 × 8 to 64 × 64. The predicted block is then divided into several 8 × 8 blocks to ensure that the block size of the transform and quantization is consistent with JPEG. After that, the same DCT and quantization as in the JPEG file are applied to each 8 × 8 block to obtain the predicted transform coefficients. Finally, the transform coefficient residuals are obtained by subtracting the predicted DCT coefficients from the original DCT coefficients. The flowchart of transform and quantization in LLJPEG is shown in Fig. 9.

E. CABAC Based Entropy Coding

1) Reorganization of Transform Coefficient Residuals: In LLJPEG, the basic unit of DCT and quantization is the 8 × 8 block, while the basic unit of entropy coding is the CU, ranging from 8 × 8 to 64 × 64. For better CABAC entropy coding, the reorganization of transform coefficient residuals is needed. The same frequency band is grouped from upper left to lower right, conforming to the trend of energy decrease, which is more conducive to the CABAC entropy coding. Fig. 10 is an example of coefficient residual reorganization for a 16 × 16 CU.

2) Coding of Skip Flag: In LLJPEG, there is a skip flag to indicate whether the coefficient residuals in the current CU are all zeros. If the skip flag is true, the subsequent encoding steps can be skipped.

We calculate the distribution of the number of non-zero coefficients when all reference pixels are the same under different CU sizes on the 24 Kodak dataset images. Table II shows the results.

It can be seen that when all reference pixels are the same, there is about 60% probability that the number of non-zero coefficient residuals of the current CU is equal to 0. Therefore,


Fig. 9. Transform and quantization in an m × m CU. m ranges from 8 to 64, n = (m/8)∗ (m/8).
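The per-CU flow in Fig. 9 can be sketched as follows: the m × m predicted block is cut into n = (m/8)·(m/8) blocks of 8 × 8, each is transformed and quantized, and the result is subtracted from the matching original coefficient block. The raster ordering of the sub-blocks is an assumption, and `transform_quantize` is a hypothetical callback standing in for the JPEG DCT and quantization.

```python
def split_8x8(cu):
    """Split an m x m block (list of lists) into 8x8 blocks in raster order."""
    m = len(cu)
    if m not in (8, 16, 32, 64) or any(len(row) != m for row in cu):
        raise ValueError("CU must be square with side 8, 16, 32, or 64")
    return [[row[c0:c0 + 8] for row in cu[r0:r0 + 8]]
            for r0 in range(0, m, 8)
            for c0 in range(0, m, 8)]


def cu_residuals(pred_cu, orig_coeff_blocks, transform_quantize):
    """Residual coefficient blocks of one CU: original minus predicted."""
    residuals = []
    for pred_block, orig in zip(split_8x8(pred_cu), orig_coeff_blocks):
        pred_coeffs = transform_quantize(pred_block)
        residuals.append([[o - p for o, p in zip(orow, prow)]
                          for orow, prow in zip(orig, pred_coeffs)])
    return residuals


# Usage with a placeholder transform (identity) on a 16x16 CU.
cu = [[r * 16 + c for c in range(16)] for r in range(16)]
blocks = split_8x8(cu)
resid = cu_residuals(cu, blocks, lambda block: block)
```

With the identity placeholder the residuals are all zero; in practice the callback would apply the same DCT and quantization table as the source JPEG file.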

Fig. 10. Reorganization of transform coefficient residuals for a CU with 16 × 16. The left is the coefficient residuals in a 16 × 16 CU. Different colors are
used to represent the coefficient residuals in different 8 × 8 blocks. The right is the coefficient residuals after reorganization.
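One way to read the reorganization in Fig. 10 is sketched below. It assumes that, for a CU made of k × k blocks of 8 × 8, the coefficients of frequency (u, v) from all blocks are gathered into one k × k group placed at group position (u, v). The exact layout used by the authors may differ, and the function names are hypothetical.

```python
def reorganize(cu):
    """Group same-frequency residuals of all 8x8 blocks in an m x m CU."""
    m = len(cu)
    k = m // 8                      # blocks per row/column of the CU
    out = [[0] * m for _ in range(m)]
    for r in range(m):
        for c in range(m):
            block_row, u = divmod(r, 8)   # u: vertical frequency index
            block_col, v = divmod(c, 8)   # v: horizontal frequency index
            out[u * k + block_row][v * k + block_col] = cu[r][c]
    return out


def restore(reorganized):
    """Inverse of `reorganize`, as the decoder would need."""
    m = len(reorganized)
    k = m // 8
    cu = [[0] * m for _ in range(m)]
    for r in range(m):
        for c in range(m):
            block_row, u = divmod(r, 8)
            block_col, v = divmod(c, 8)
            cu[r][c] = reorganized[u * k + block_row][v * k + block_col]
    return cu


cu = [[r * 16 + c for c in range(16)] for r in range(16)]
grouped = reorganize(cu)
```

Under this layout the four DC residuals of a 16 × 16 CU end up in the top-left 2 × 2 corner, so energy decreases from upper left to lower right across the whole CU.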

TABLE II: NUMBER OF NON-ZERO COEFFICIENT RESIDUALS WHEN ALL REFERENCE PIXELS ARE THE SAME UNDER DIFFERENT CU SIZES

TABLE III: THE SELECTION OF SCANNING ORDER

whether all reference pixels are the same can be considered as a context to encode the skip flag. In addition, there is about 25% probability that the number of non-zero coefficient residuals is equal to 1. Another flag is encoded to indicate whether there is only one non-zero coefficient residual in the current CU. If this flag is true, the non-zero coefficient residual is encoded separately, and the subsequent encoding steps are skipped.

F. Coefficient Group and Scanning Order

After the reorganization of transform coefficient residuals, the energy of the upper left corner is often higher than that of the lower right corner. The non-uniform distribution of energy makes it necessary to divide the coefficient residual block into smaller ones to improve the coding performance. In LLJPEG, the coefficient residual block in a CU is divided into several 4 × 4 blocks, each of which is called a coefficient group (CG). Fig. 11 shows the three scanning orders used in LLJPEG for a CG, which are horizontal scanning, vertical scanning and diagonal scanning. The scanning order among the CGs of a CU is the same as the scanning order within a CG.

The selection of which scanning order to use for a CU depends on the intra prediction mode and the CU size, as shown in Table III.

After scanning, the transform coefficient residuals in a CU can be represented by the positions and the amplitudes of non-zero coefficient residuals.

1) Position Coding: When coding the position of non-zero coefficient residuals, the position of the last non-zero coefficient residual in the current CU is encoded first. Then,


TABLE IV: COMPRESSION PERFORMANCE COMPARISON (BITS SAVING) ON KODAK DATASET

TABLE V: COMPRESSION PERFORMANCE COMPARISON (BITS SAVING) ON DIV2K DATASET
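The bits saving reported in these tables is the relative file-size reduction defined in Eq. (8). A minimal sketch of that calculation follows; the function name and the byte counts in the usage line are illustrative and are not results from the paper.

```python
def bits_saving(jpeg_size, recompressed_size):
    """Eq. (8): percentage of the JPEG file size saved by recompression."""
    if jpeg_size <= 0:
        raise ValueError("jpeg_size must be positive")
    return 100.0 * (jpeg_size - recompressed_size) / jpeg_size


# Illustrative numbers only: a 1000-byte JPEG recompressed to 700 bytes.
saving = bits_saving(1000, 700)
```

A negative value means the recompressed file is larger than the original JPEG file.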

the flag of whether there is a non-zero coefficient residual in each CG is encoded. For a CG with non-zero coefficient residuals, a flag is needed for each position in the CG to indicate whether the coefficient residual at that position is non-zero.

Fig. 11. Scanning orders for a CG.

2) Amplitude Coding: The encoding of the amplitudes in a CG is as follows:
Step 1: For each of the first eight non-zero coefficient residuals in the current CG, a flag indicating whether its amplitude is greater than 1 is encoded.
Step 2: In the subset with amplitude greater than 1, a flag indicating whether the current amplitude is greater than 2 is encoded.
Step 3: The signs of all non-zero coefficient residuals in the CG are encoded.
Step 4: The remaining amplitudes are encoded by bypass coding.
After the CABAC entropy coding process designed above, the JPEG image is compressed into a new bit stream by LLJPEG.

IV. EXPERIMENTAL RESULTS
A. Experimental Setup
We test the performance of LLJPEG on the Kodak [35] and DIV2K [36] datasets. The Kodak dataset includes 25 images with a resolution of 768 × 512. The DIV2K dataset consists of 800 training images with a resolution of 1020 × 800 or 1020 × 1020. All the images in RGB format are first converted into YUV444 and compressed by Libjpeg [37] to obtain the JPEG files.
The experiments are conducted on a PC running Windows 10 with an Intel i7 8700k processor and 32GB RAM. The source code of LLJPEG can be downloaded from https://github.com/vilab-sct/LLJPEG.

B. Compression Performance of LLJPEG
We first compare the compression performance of LLJPEG with jpegtran and Winzip on the Kodak and DIV2K datasets. The reason for choosing jpegtran and Winzip for comparison is that jpegtran is the most widely used JPEG lossless recompression method, while Winzip is the most widely used general lossless compression method today, which is designed based on sliding window compression and Huffman coding.
The comparison results are shown in Table IV and Table V. The bits saving is calculated as follows:

$$\text{Bits Saving} = \frac{S_{JPEG} - S_{LL}}{S_{JPEG}} \times 100\% \qquad (8)$$

where $S_{JPEG}$ is the file size of a JPEG file and $S_{LL}$ is the file size after lossless compression.
Table IV shows the compression performance on the Kodak dataset. It can be seen that when QF = 10, the bits saving of LLJPEG, jpegtran, and Winzip is 51.65%, 26.96% and 12.90%, respectively. When QF = 50, the bits saving of LLJPEG, jpegtran, and Winzip is 26.87%, 6.80% and 2.26%, respectively. When QF = 90, the bits saving of LLJPEG, jpegtran, and Winzip is 18.03%, 1.42% and 0.06%, respectively. On average, LLJPEG achieves 29.43% storage space saving, while jpegtran and Winzip achieve 9.31% and 3.98% respectively.

TABLE VI
COMPRESSION PERFORMANCE (BITS SAVING) COMPARISON ON KODAK DATASET WHEN QF = 80

TABLE VII
AVERAGE ENCODING AND DECODING TIME COMPARISON ON KODAK DATASET (IN SECONDS)

TABLE VIII
AVERAGE ENCODING AND DECODING TIME COMPARISON ON DIV2K DATASET (IN SECONDS)
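The bits-saving metric of Eq. (8) can be evaluated with a few lines of code. The sketch below is only an illustration: the function name `bits_saving` and the byte counts are hypothetical and are not taken from the LLJPEG source code or from the tables.

```python
def bits_saving(size_jpeg: int, size_recompressed: int) -> float:
    """Return the bits saving of Eq. (8) in percent.

    size_jpeg         -- size of the original JPEG file (S_JPEG)
    size_recompressed -- size after lossless recompression (S_LL)
    """
    if size_jpeg <= 0:
        raise ValueError("size_jpeg must be positive")
    return (size_jpeg - size_recompressed) / size_jpeg * 100.0


# Hypothetical example: a 100,000-byte JPEG recompressed to 70,570 bytes.
print(f"{bits_saving(100_000, 70_570):.2f}%")
```

A result above zero means the recompressed file is smaller than the JPEG file; a result below zero means the file grew.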
Table V shows the compression performance on the DIV2K dataset. It can be seen that LLJPEG saves 26.40% of storage space on average, while jpegtran and Winzip save 7.82% and 2.96% respectively. The results on the DIV2K dataset are slightly lower than those on the Kodak dataset. According to our observations, the images in the DIV2K dataset have more texture than those in the Kodak dataset, which may affect the compression performance.

Fig. 12. The process of compressing JPEG files using a lossless encoder (VVC-Intra, HEVC-Intra, and CALIC).

Furthermore, we compare LLJPEG with the lossless coding techniques of VVC-Intra and HEVC-Intra, and with CALIC. The process of a counterpart is shown in Fig. 12. The experimental results are also shown in Tables IV and V. We can see that on the Kodak dataset, the average file size increases by about 137%, 175%, and 230% after compression using the lossless coding techniques of VVC-Intra, HEVC-Intra, and CALIC, respectively. On the DIV2K dataset, the average file size increases by about 121%, 158%, and 222%, respectively. The experimental results are easy to understand, because LLJPEG operates on the quantized DCT coefficients, while CALIC and the lossless coding techniques of VVC-Intra and HEVC-Intra operate on the images decoded by the JPEG decoder.
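These size increases can be related to the bits-saving metric of Eq. (8) with a short worked step. It assumes that an increase of about 137% means the recompressed file is 2.37 times the size of the JPEG file; this reading is an interpretation and is not stated explicitly in the text.

$$S_{LL} = (1 + 1.37)\,S_{JPEG} \;\Rightarrow\; \text{Bits Saving} = \frac{S_{JPEG} - 2.37\,S_{JPEG}}{S_{JPEG}} \times 100\% = -137\%$$

Under that assumption, a pixel-domain lossless coder gives a negative bits saving, whereas LLJPEG gives a positive one on the same files.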


TABLE IX
COMPRESSION PERFORMANCE AND COMPLEXITY OF FOUR INTRA PREDICTION MODES, CLOSING THE BLOCK PARTITION, CLOSING THE NEW DC MODE DECISION METHOD, CLOSING THE FILTERING OF PREDICTED BLOCKS ON KODAK DATASET

TABLE X
COMPRESSION PERFORMANCE AND COMPLEXITY OF FOUR INTRA PREDICTION MODES, CLOSING THE BLOCK PARTITION, CLOSING THE NEW DC MODE DECISION METHOD, CLOSING THE FILTERING OF PREDICTED BLOCKS ON DIV2K DATASET

We also compare the compression performance of LLJPEG with the prior-based JPEG lossless recompression method [21] on the Kodak dataset under the quality factor of 80. The experimental results are shown in Table VI. As [21] only provides experimental results on 12 images with QF = 80 and does not provide results on other images or other QFs, Table VI only shows its results on those 12 images. For a complete comparison, we also provide the experimental results for all images in the Kodak dataset using LLJPEG, Winzip and jpegtran. It can be seen that [21] saves 7.76% of the storage space on average, while the storage space saved by LLJPEG is 21.57%.

C. Complexity Comparison
The computational complexity of the above methods is also tested. The experiments are carried out on the Kodak and DIV2K datasets. The results are shown in Tables VII and VIII. Since the output of jpegtran is still in JPEG format, we only provide its encoding time. On the Kodak dataset, the decoding time of LLJPEG is 0.058 seconds, while the encoding time is 1.903 seconds. On the DIV2K dataset, the decoding time of LLJPEG is about 0.101 seconds, while the encoding time is 3.320 seconds. The experimental results on the two datasets show that Winzip has the shortest encoding and decoding time, followed by jpegtran and LLJPEG.

D. Ablation Experiments
In addition, in order to test the performance of LLJPEG's coding tools, four ablation experiments are designed. We limit the intra prediction of LLJPEG to four intra modes (DC mode, Planar mode, mode_26 and mode_10 only), turn off the block partition of LLJPEG, turn off the new decision for DC mode, and turn off the filtering after intra prediction. These experiments are carried out on the Kodak and DIV2K datasets with QF from 90 to 10. The results are shown in Table IX and Table X.
It can be seen that on the Kodak dataset, when only 4 intra prediction modes are used, the encoding time is 66.22% of the original encoding time, and the encoding performance drops by 7.56%. When the block partition is turned off, the encoding time is 94.36%, and the encoding performance decreases by 1.09%. When the new DC mode decision method is turned off, the encoding time is 102.95%, and the encoding performance decreases by 0.55%. When the intra filtering is turned off, the encoding time is 98.16%, and the encoding performance drops by 0.37%. Similar results are also shown on the DIV2K dataset. This indicates that the above tools can improve the coding performance while the time complexity is acceptable.

TABLE XI
PERFORMANCE AND ENCODING TIME COMPARISON OF THE PROPOSED TWO-STAGE INTRA PREDICTION MODE DECISION ON KODAK DATASET

TABLE XII
PERFORMANCE AND ENCODING TIME COMPARISON OF THE PROPOSED TWO-STAGE INTRA PREDICTION MODE DECISION ON DIV2K DATASET

We also tested the performance of the proposed two-stage intra prediction mode decision on the Kodak and DIV2K datasets. The results are shown in Table XI and Table XII. It can be seen that the proposed fast intra mode decision reduces the encoding time by 34.94% with only a 0.19% compression performance loss on the Kodak dataset, and by 36.35% with only a 0.24% compression performance loss on the DIV2K dataset.

V. CONCLUSION
In order to reduce the storage cost of JPEG images in the cloud or in personal computers or notebooks, a hybrid coding framework for recompression of JPEG images (LLJPEG) using transform domain intra prediction is proposed, including block partition and intra prediction, transform and quantization, and entropy coding. In addition, some new tools are proposed for intra prediction and the entropy coding is redesigned. The experiments show that LLJPEG can reduce the storage space by 29.43% and 26.40% on the Kodak and DIV2K datasets respectively without any loss for JPEG images. LLJPEG can be used not only to reduce the cloud storage cost of JPEG images for social media, but also to reduce the storage cost of JPEG albums on the computers or notebooks of personal users. Through fast algorithms and fast implementation, the encoding and decoding time can be further shortened. In future work, we will focus on this.

REFERENCES
[1] G. K. Wallace, “The JPEG still picture compression standard,” Commun. ACM, vol. 34, no. 4, pp. 30–44, Apr. 1991.
[2] G. K. Wallace, “Overview of the JPEG (ISO/CCITT) still image compression standard,” Proc. SPIE, vol. 1244, pp. 220–233, Jun. 1990.
[3] T. Richter and R. Clark, “Why JPEG is not JPEG—Testing a 25 years old standard,” in Proc. Picture Coding Symp. (PCS), Jun. 2018, pp. 1–5.
[4] T. Richter, “JPEG on STEROIDS: Common optimization techniques for JPEG image compression,” in Proc. IEEE Int. Conf. Image Process. (ICIP), Sep. 2016, pp. 61–65.
[5] C. Zhao, S. Ma, and W. Gao, “Thousand to one: An image compression system via cloud search,” in Proc. IEEE 17th Int. Workshop Multimedia Signal Process. (MMSP), Oct. 2015, pp. 1–5.
[6] X. Liu, G. Cheung, X. Wu, and D. Zhao, “Random walk graph Laplacian-based smoothness prior for soft decoding of JPEG images,” IEEE Trans. Image Process., vol. 26, no. 2, pp. 509–524, Feb. 2017.
[7] H. Wu, X. Sun, J. Yang, and F. Wu, “Incremental SfM based lossless compression of JPEG coded photo album,” in Proc. Vis. Commun. Image Process. (VCIP), Dec. 2015, pp. 1–4.
[8] H. Wu, X. Sun, J. Yang, and F. Wu, “Lossless compression of JPEG coded photo albums,” in Proc. IEEE Vis. Commun. Image Process. Conf., Dec. 2014, pp. 538–541.
[9] C. Christopoulos, A. Skodras, and T. Ebrahimi, “The JPEG2000 still image coding system: An overview,” IEEE Trans. Consum. Electron., vol. 46, no. 4, pp. 1103–1127, Nov. 2000.
[10] F. Bellard. (2015). BPG Image Format Website. [Online]. Available: http://bellard.org/bpg/
[11] M. J. Weinberger, G. Seroussi, and G. Sapiro, “The LOCO-I lossless image compression algorithm: Principles and standardization into JPEG-LS,” IEEE Trans. Image Process., vol. 9, no. 8, pp. 1309–1324, Aug. 2000.
[12] X. Wu and N. Memon, “CALIC—A context based adaptive lossless image codec,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), vol. 4, May 1996, pp. 1890–1893.
[13] J. Sneyers. (2021). FLIF Image Format Website. [Online]. Available: http://flif.info/
[14] G. Hudson, A. Léger, B. Niss, and I. Sebestyén, “JPEG at 25: Still going strong,” IEEE Multimedia, vol. 24, no. 2, pp. 96–103, Apr./Jun. 2017.
[15] A. Kyleduo. (2017). Tinypng Project. [Online]. Available: https://tinypng.com/developers
[16] Mozilla. (2019). Mozilla Mozjpeg. [Online]. Available: https://github.com/mozilla/mozjpeg
[17] J. Alakuijala, R. Obryk, O. Stoliarchuk, Z. Szabadka, L. Vandevenne, and J. Wassenberg, “Guetzli: Perceptually guided JPEG encoder,” 2017, arXiv:1703.04421.
[18] M. Hou, P. Shi, D. Pan, and H. Yang, “A speed up method for Guetzli encoder,” in Proc. 2nd Int. Conf. Inf. Syst. Comput. Aided Educ. (ICISCAE), Sep. 2019, pp. 90–94.


[19] H. Kulissen. (2005). Jpegtran Description. [Online]. Available: http://jpegclub.org/articles/Verlustfreie_JPEG_Drehung.pdf
[20] H. Kulissen. (2005). Jpegtran Project. [Online]. Available: https://www.npmjs.com/package/jpegtran
[21] X. Liu, G. Cheung, C. Lin, D. Zhao, and W. Gao, “Prior-based quantization bin matching for cloud storage of JPEG images,” IEEE Trans. Image Process., vol. 27, no. 7, pp. 3222–3235, Jul. 2018.
[22] T.-A. Pham and M. Delalandre, “Effective decompression of JPEG document images,” IEEE Trans. Image Process., vol. 25, no. 8, pp. 3655–3670, Aug. 2016.
[23] G. J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand, “Overview of the high efficiency video coding (HEVC) standard,” IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1649–1668, Dec. 2012.
[24] F. Bossen, B. Bross, K. Sühring, and D. Flynn, “HEVC complexity and implementation analysis,” IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1685–1696, Dec. 2012.
[25] X. Zhang, S. Liu, and S. Lei, “Intra mode coding in HEVC standard,” in Proc. Vis. Commun. Image Process., Nov. 2012, pp. 1–6.
[26] J. Lainema, F. Bossen, W.-J. Han, J. Min, and K. Ugur, “Intra coding of the HEVC standard,” IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1792–1801, Dec. 2012.
[27] L. Zhao, L. Zhang, S. Ma, and D. Zhao, “Fast mode decision algorithm for intra prediction in HEVC,” in Proc. Vis. Commun. Image Process. (VCIP), Nov. 2011, pp. 1–4.
[28] J.-R. Ohm, G. J. Sullivan, H. Schwarz, T. K. Tan, and T. Wiegand, “Comparison of the coding efficiency of video coding standards-including high efficiency video coding (HEVC),” IEEE Trans. Circuits Syst. Video Tech., vol. 22, no. 12, pp. 1669–1684, Dec. 2012.
[29] G. J. Sullivan, “Video coding standards progress report: Joint video experts team launches the versatile video coding project,” SMPTE Motion Imag. J., vol. 127, no. 8, pp. 94–98, Sep. 2018.
[30] F. Pakdaman, M. A. Adelimanesh, M. Gabbouj, and M. R. Hashemi, “Complexity analysis of next-generation VVC encoding and decoding,” in Proc. IEEE Int. Conf. Image Process. (ICIP), Oct. 2020, pp. 3134–3138.
[31] G. Van der Auwera, J. Heo, and A. Filippov, Ce3: Summary Report on Intra Prediction and Mode Coding, document JVET-K0023, 11th JVET Meeting, 2018.
[32] J. Chen, M. Karczewicz, Y.-W. Huang, K. Choi, J.-R. Ohm, and G. J. Sullivan, “The joint exploration model (JEM) for video compression with capability beyond HEVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 30, no. 5, pp. 1208–1225, May 2020.
[33] A. Said, X. Zhao, M. Karczewicz, J. Chen, and F. Zou, “Position dependent prediction combination for intra-frame video coding,” in Proc. IEEE Int. Conf. Image Process. (ICIP), Sep. 2016, pp. 534–538.
[34] B. Bross, J. Chen, J.-R. Ohm, G. J. Sullivan, and Y.-K. Wang, “Developments in international video coding standardization after AVC, with an overview of versatile video coding (VVC),” Proc. IEEE, vol. 109, no. 9, pp. 1463–1493, Sep. 2021.
[35] R. Franzen. (2010). Kodak Dataset. [Online]. Available: http://r0k.us/graphics/kodak/
[36] R. Timofte. (2018). DIV2K Dataset. [Online]. Available: https://data.vision.ee.ethz.ch/cvl/DIV2K/
[37] B. Bfriesen. (2000). Libjpeg Project. [Online]. Available: http://libjpeg.sourceforge.net/

Chentian Sun received the B.S. and M.S. degrees from the School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China, in 2014 and 2016, respectively, where he is currently pursuing the Ph.D. degree. His research interests include data compression, image and video coding, computer vision, and deep learning.

Xiaopeng Fan (Senior Member, IEEE) received the B.S. and M.S. degrees from the Harbin Institute of Technology (HIT), Harbin, China, in 2001 and 2003, respectively, and the Ph.D. degree from The Hong Kong University of Science and Technology (HKUST), Hong Kong, in 2009.
From 2003 to 2005, he was a Software Engineer at Intel Corporation, China. In 2009, he joined HIT, where he is currently a Professor. From 2011 to 2012, he was a Visiting Researcher at Microsoft Research Asia. From 2015 to 2016, he was a Research Assistant Professor at HKUST. Since 2018, he has been with the Peng Cheng Laboratory. He has authored one book and over 150 papers in refereed journals and conference proceedings. His current research interests include video coding and transmission, image processing, and computer vision. He was a recipient of the Outstanding Contributions Award to the Development of IEEE Standard 1857 by IEEE in 2013. He served as the Program Chair for PCM2017, the Chair for IEEE SGC2015, and the Co-Chair for MCSN2015. He has been an Associate Editor of the IEEE 1857 STANDARD: EMPOWERING SMART VIDEO SURVEILLANCE SYSTEMS since 2012.

Debin Zhao (Member, IEEE) received the B.S., M.S., and Ph.D. degrees in computer science from the Harbin Institute of Technology (HIT), China, in 1985, 1988, and 1998, respectively.
In 1987, he joined HIT, where he is currently a Professor with the Department of Computer Science. Since 2018, he has been with the Peng Cheng Laboratory. He has published over 200 technical papers in refereed journals and conference proceedings. His current research interests include image and video coding, compressive sensing, deep networks, and computer vision.
