Research On GAN-based Image Super-Resolution Method
Abstract—Super-resolution (SR) refers to the reconstruction of a high-resolution image from a low-resolution image, which has important application value in object detection, medical imaging, satellite remote sensing and other fields. In recent years, with the rapid development of deep learning, image super-resolution reconstruction methods based on deep learning have made remarkable progress. In this paper, R-SRGAN (Residual Super-Resolution Generative Adversarial Networks) is used to build the model and realize image super-resolution. By adding residual blocks between adjacent convolutional layers of the GAN generator, more detailed information is retained. At the same time, the Wasserstein distance is used as a loss function to enhance the training effect and achieve image super-resolution.

Keywords—Super-Resolution, Image Processing, Generative Adversarial Networks

I. INTRODUCTION
With the rapid development of science and technology, images, as carriers of information, occupy a central position in information transmission and are widely shared. However, due to hardware limitations or environmental interference, the images obtained are often blurred, distorted, or low in resolution, and cannot meet people's needs. High-resolution images, by contrast, have a high pixel density and provide more, and more accurate, information, which meets the needs of practical applications. How to improve image quality to obtain high-resolution images has therefore become a hot research topic.

Image super-resolution (SR), which converts existing low-resolution (LR) images into high-resolution (HR) images using signal processing and image processing methods[1], is widely used in many fields such as medicine, military, remote sensing, and video surveillance.

To address the instability and mode collapse of GAN training, this paper proposes an image super-resolution method based on R-SRGAN, which optimizes the super-resolution effect, enhances the diversity of samples, and improves the quality of the generated images.

II. RELATED WORK
A. Image Super-Resolution
The basic idea of image super-resolution was first proposed by J. L. Harris[2] and J. W. Goodman[3] in 1964 and 1968, respectively. It combines prior knowledge to regain the high-frequency information lost during super-resolution reconstruction, so as to obtain a reconstructed image from a single image. However, this method is not universal, and its image reconstruction and resolution enhancement effects are greatly limited. Traditional image super-resolution methods are mostly interpolation methods: they select an appropriate interpolation basis function from the nearest pixels of the image and calculate the gray value of each point to be interpolated, thereby up-sampling the image and increasing its resolution. With the development of artificial intelligence, a variety of deep learning super-resolution methods have been produced, such as SRCNN (Super-Resolution Convolutional Neural Network)[4], based on convolutional neural networks, and SRGAN[5], based on generative adversarial networks. Methods based on deep learning use a large number of training samples to find the mapping between the low-resolution image and the corresponding high-resolution image.

B. Principles of GANs
Generative Adversarial Networks (GAN)[6] were first proposed by Ian Goodfellow in 2014. GAN is one of the most promising deep learning methods of recent years, using an adversarial game to learn deep features.

GAN consists of a generator G and a discriminator D. The generator G is used to simulate the data distribution. The discriminator D is used to calculate the probability that a sample comes from the training data rather than from G. Given input training data $x$, the goal of the generator G is to learn the distribution $p_g$ of the data $x$. Its input noise distribution is $p_z(z)$, and $G(z; \theta_g)$ represents the mapping from noise to data space, where $\theta_g$ is the parameter of the mapping. Similarly, let $D(x; \theta_d)$ be the mapping function of the discriminator. Its goal is to minimize the discrimination error rate, while the training goal of the generator G is to maximize the probability that the discriminator D makes errors. The objective function of the generative adversarial network is therefore a minimax problem:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \quad (1)$$

However, the original GAN has some problems, such as a difficult training process and a lack of diversity in the generated samples. WGAN[7] addresses these problems. It uses the Earth Mover (EM) distance instead of the JS divergence to measure the distance between real samples and generated samples.
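As a minimal numerical sketch of the two quantities discussed in this subsection, the following pure-Python snippet evaluates an empirical estimate of the value function in Eq. (1) and a WGAN-style critic score gap from lists of network outputs. The function names `gan_value` and `wasserstein_estimate` and the sample scores are illustrative assumptions, not part of the method described in this paper; in practice the scores would come from trained networks D and G.

```python
import math

def gan_value(d_real, d_fake):
    """Empirical estimate of V(D, G) in Eq. (1).

    d_real: discriminator outputs D(x) on real samples, each in (0, 1).
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def wasserstein_estimate(c_real, c_fake):
    """Critic score gap used by WGAN as an estimate of the EM distance.

    c_real / c_fake: unbounded critic scores on real / generated samples.
    The estimate is only meaningful for a (roughly) 1-Lipschitz critic.
    """
    return sum(c_real) / len(c_real) - sum(c_fake) / len(c_fake)

# Hypothetical scores, chosen only for illustration.
confident = gan_value([0.9, 0.8], [0.1, 0.2])   # D separates real from fake well
confused = gan_value([0.5, 0.5], [0.5, 0.5])    # D cannot tell them apart
gap = wasserstein_estimate([1.0, 3.0], [0.0, 1.0])
```

A discriminator at chance level (all outputs 0.5) gives V = -log 4, the value attained at the equilibrium of the minimax game in [6], while a discriminator that separates the two sets yields a larger value. The critic score gap, by contrast, is not bounded by a logarithm and grows with the separation between the two sets of scores.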
Fig. 3. R-SRGAN generator and discriminator structure: (a) Generator; (b) Discriminator.
The loss function consists of two parts, the MSE loss and the adversarial loss:

$$L = \lambda L_{MSE} + (1 - \lambda) L_{adv} \quad (3)$$

B. L2 Loss
For the discriminator, the most intuitive and widely used loss metric is the MSE loss. In the WGAN-based image super-resolution algorithm, we apply MSE to calculate the
Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY WARANGAL. Downloaded on October 01,2023 at 11:37:24 UTC from IEEE Xplore. Restrictions apply.
pixel distance between the generated image and the standard high-resolution image:

$$L_{MSE} = \frac{1}{WH} \sum_{x=1}^{W} \sum_{y=1}^{H} \left( I^{HR}_{x,y} - G(I^{LR})_{x,y} \right)^2 \quad (4)$$

Here, $I^{HR}$ represents the standard high-resolution image, $G(I^{LR})$ is the generated image, and $W$ and $H$ are the image width and height. The MSE loss is a good measure of the pixel difference between the two. As the MSE loss approaches zero, the generated image approaches the original high-resolution image.

C. Adversarial Loss
The adversarial loss is obtained by modifying the loss function of the WGAN generator and having the discriminator perform regression analysis on the generated high-resolution image. In the following formula, $I^{LR}$ represents the low-resolution image input to the generator.

$$L_{adv} = \max \mathbb{E}_{I^{LR} \sim p(I^{LR})} \left[ I^{HR}_{x,y} - D(G(I^{LR})) \right] \quad (5)$$

IV. EXPERIMENT
A. Dataset
The dataset used in this experiment is DIV2K[12]. The DIV2K dataset contains a total of 1000 2K-resolution images, of which 800 are used for training, 100 for validation, and 100 for testing. In addition, to make training more accurate and detailed, we segmented the training images to generate 32220 images for training. For comparison with other results, we used three standard benchmark datasets: Set5[13], Set14[14], and BSD100[15]. The Set5 dataset has 5 images, including infants, women, avatars, birds, and butterflies. The Set14 dataset has 14 images, including men, zebras, etc. BSD100 is relatively rich, including 100 images of planes, people, vases, etc.

After obtaining a 4x high-resolution image through super-resolution reconstruction, we evaluate its quality with the Peak Signal-to-Noise Ratio (PSNR)[16] and Structural Similarity (SSIM)[17]. PSNR, usually expressed in decibels, is a commonly used objective image evaluation index. It is an error-sensitive image quality measure based on the error between corresponding pixels. SSIM is an index that measures the similarity of two images. These two evaluation indexes are the most widely used in the field of image processing because of their simple calculation and clear mathematical meaning. The larger the values of PSNR and SSIM, the higher the image quality and the better the super-resolution performance.

B. Results Comparison
From the objective evaluation, the three algorithms achieve good results on the Set5 and Set14 datasets. As the dataset grows, performance decreases significantly on BSD100. On all datasets, the learning-based SRCNN, SRGAN, and R-SRGAN methods are better than bicubic interpolation, and R-SRGAN is better than SRCNN and SRGAN. Because bicubic interpolation uses the gray values of neighboring pixels to generate the gray values of the pixels to be interpolated, the image it produces is no sharper than the original, whereas SRCNN, SRGAN, and R-SRGAN can better simulate and restore the image by learning the data distribution of the original image.
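PSNR, one of the two evaluation indexes used here, is derived from the pixel-wise mean squared error. The following pure-Python sketch computes both for two equally sized grayscale images given as nested lists. The function names `mse` and `psnr`, the 8-bit peak value of 255, and the toy 2x2 images are assumptions made for illustration; the evaluation protocol used in the experiments (for example color space or border cropping) may differ.

```python
import math

def mse(img_a, img_b):
    """Mean squared error between two equally sized 2-D grayscale images."""
    same_rows = len(img_a) == len(img_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(img_a, img_b)):
        raise ValueError("images must have the same shape")
    total = 0.0
    count = 0
    for row_a, row_b in zip(img_a, img_b):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr(img_a, img_b, peak=255.0):
    """Peak signal-to-noise ratio in decibels; infinite for identical images."""
    err = mse(img_a, img_b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)

# Toy 2x2 images (hypothetical pixel values).
reference = [[50, 100], [150, 200]]
restored = [[52, 98], [151, 199]]
error = mse(reference, restored)      # (4 + 4 + 1 + 1) / 4 = 2.5
quality = psnr(reference, restored)   # roughly 44.15 dB
```

Because PSNR is a decreasing function of the MSE, a network trained only to minimize the MSE loss directly optimizes PSNR, which is one reason the two are usually reported together with a structural measure such as SSIM.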
TABLE I. METHOD RESULTS COMPARISON: BICUBIC INTERPOLATION, SRCNN, SRGAN, R-SRGAN
Image Super-Resolution Using a Generative Adversarial Network. 2017 IEEE Conference on CVPR
[6] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems (NIPS), pages 2672–2680, 2014.
[7] X. Guo, J. Hong, T. Lin, and N. Yang. Relaxed Wasserstein with applications to GANs. arXiv preprint arXiv:1705.07164, 2017.
[8] B. Lim, S. Son, H. Kim, S. Nah, and K. M. Lee. Enhanced deep residual networks for single image super-resolution. In CVPRW, 2017.
[9] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
[10] S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
[11] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations (ICLR), 2015.
[12] Y. Yan, C. Liu, C. Chen, X. Sun, L. Jin, and X. Zhou. Fine-grained Attention and Feature-sharing Generative Adversarial Networks for Single Image Super-Resolution. arXiv preprint arXiv:1911.10773, 2019.
[13] M. Bevilacqua, A. Roumy, C. Guillemot, and M. L. Alberi-Morel. Low-complexity single-image super-resolution based on nonnegative neighbor embedding. In BMVC, 2012.
[14] R. Zeyde, M. Elad, and M. Protter. On single image scale-up using sparse-representations. In Curves and Surfaces, pages 711–730. Springer, 2012.
[15] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In ICCV, 2001.
[16] J. Kim, J. Kwon Lee, and K. Mu Lee. Accurate image super-resolution using very deep convolutional networks. In CVPR, 2016.
[17] N. Takano and G. Alaghband. SRGAN: Training Dataset Matters. arXiv preprint arXiv:1903.09922, 2019.
[Figure panels: Original Image, Bicubic Interpolation, SRCNN, SRGAN, R-SRGAN]