Enhanced Deep Residual Networks for Single Image Super-Resolution
Bee Lim Sanghyun Son Heewon Kim Seungjun Nah Kyoung Mu Lee
Abstract
Recent research on super-resolution has progressed with the development of deep convolutional neural networks (DCNN). In particular, residual learning techniques exhibit improved performance. In this paper, we develop an enhanced deep super-resolution network (EDSR) with performance exceeding those of current state-of-the-art SR methods. The significant performance improvement of our model is due to optimization by removing unnecessary modules in conventional residual networks. The performance is further improved by expanding the model size while we stabilize the training procedure. We also propose a new multi-scale deep super-resolution system (MDSR) and training method, which can reconstruct high-resolution images of different upscaling factors in a single model. The proposed methods show superior performance over the state-of-the-art methods on benchmark datasets and prove their excellence by winning the NTIRE2017 Super-Resolution Challenge [26].
1. Introduction
The image super-resolution (SR) problem, particularly single image super-resolution (SISR), has gained increasing research attention for decades. SISR aims to reconstruct a high-resolution image $I^{SR}$ from a single low-resolution image $I^{LR}$. Generally, the relationship between $I^{LR}$ and the original high-resolution image $I^{HR}$ can vary depending on the situation. Many studies assume that $I^{LR}$ is a bicubic downsampled version of $I^{HR}$, but other degrading factors such as blur, decimation, or noise can also be considered for practical applications.

Figure 1: ×4 super-resolution result of our single-scale SR method (EDSR) compared with existing algorithms, on 0853 from DIV2K [26]: HR (PSNR / SSIM), Bicubic (30.80 dB / 0.9537), VDSR [11] (32.82 dB / 0.9623), SRResNet [14] (34.00 dB / 0.9679), EDSR+ (Ours) (34.78 dB / 0.9708).

Recently, deep neural networks [11, 12, 14] provide significantly improved performance in terms of peak signal-to-noise ratio (PSNR) in the SR problem. However, such networks exhibit limitations in terms of architecture optimality. First, the reconstruction performance of the neural network models is sensitive to minor architectural changes. Also, the same model achieves different levels of performance with different initialization and training techniques. Thus, carefully designed model architectures and sophisticated optimization methods are essential in training the neural networks.

Second, most existing SR algorithms treat super-resolution of different scale factors as independent problems without considering and utilizing mutual relationships among different scales in SR. As such, those algorithms require many scale-specific networks that need to be trained independently to deal with various scales.
Exceptionally, VDSR [11] can handle super-resolution of several scales jointly in a single network. Training the VDSR model with multiple scales boosts the performance substantially and outperforms scale-specific training, implying the redundancy among scale-specific models. Nonetheless, the VDSR-style architecture requires a bicubic interpolated image as the input, which leads to heavier computation time and memory compared to architectures with scale-specific upsampling methods [5, 22, 14].

While SRResNet [14] successfully solved those time and memory issues with good performance, it simply employs the ResNet architecture from He et al. [9] without much modification. However, the original ResNet was proposed to solve higher-level computer vision problems such as image classification and detection. Therefore, applying the ResNet architecture directly to low-level vision problems like super-resolution can be suboptimal.

To solve these problems, based on the SRResNet architecture, we first optimize it by analyzing and removing unnecessary modules to simplify the network architecture. Training a network becomes nontrivial when the model is complex. Thus, we train the network with an appropriate loss function and careful model modification upon training. We experimentally show that the modified scheme produces better results.

Second, we investigate a model training method that transfers knowledge from a model trained at other scales. To utilize scale-independent information during training, we train high-scale models from pre-trained low-scale models. Furthermore, we propose a new multi-scale architecture that shares most of the parameters across different scales. The proposed multi-scale model uses significantly fewer parameters compared with multiple single-scale models but shows comparable performance.

We evaluate our models on the standard benchmark datasets and on the newly provided DIV2K dataset. The proposed single- and multi-scale super-resolution networks show the state-of-the-art performances on all datasets in terms of PSNR and SSIM. Our methods ranked first and second, respectively, in the NTIRE 2017 Super-Resolution Challenge [26].

2. Related Works

To solve the super-resolution problem, early approaches use interpolation techniques based on sampling theory [1, 15, 34]. However, those methods exhibit limitations in predicting detailed, realistic textures. Previous studies [25, 23] adopted natural image statistics to the problem to reconstruct better high-resolution images.

Advanced works aim to learn mapping functions between $I^{LR}$ and $I^{HR}$ image pairs. Those learning methods rely on techniques ranging from neighbor embedding [3, 2, 7, 21] to sparse coding [31, 32, 27, 33]. Yang et al. [30] introduced another approach that clusters the patch spaces and learns the corresponding functions. Some approaches utilize image self-similarities to avoid using external databases [8, 6, 29] and increase the size of the limited internal dictionary by geometric transformation of patches [10].

Recently, the powerful capability of deep neural networks has led to dramatic improvements in SR. Since Dong et al. [4, 5] first proposed a deep learning-based SR method, various CNN architectures have been studied for SR. Kim et al. [11, 12] first introduced the residual network for training much deeper network architectures and achieved superior performance. In particular, they showed that skip connection and recursive convolution alleviate the burden of carrying identity information in the super-resolution network. Similarly to [20], Mao et al. [16] tackled the general image restoration problem with encoder-decoder networks and symmetric skip connections. In [16], they argue that those nested skip connections provide fast and improved convergence.

In many deep learning based super-resolution algorithms, an input image is upsampled via bicubic interpolation before it is fed into the network [4, 11, 12]. Rather than using an interpolated image as an input, training upsampling modules at the very end of the network is also possible, as shown in [5, 22, 14]. By doing so, one can reduce much of the computation without losing model capacity because the size of the features decreases. However, those kinds of approaches have one disadvantage: they cannot deal with the multi-scale problem in a single framework as in VDSR [11]. In this work, we resolve the dilemma of multi-scale training and computational efficiency. We not only exploit the inter-relation of learned features for each scale but also propose a new multi-scale model that efficiently reconstructs high-resolution images for various scales. Furthermore, we develop an appropriate training method that uses multiple scales for both single- and multi-scale models.

Several studies have also focused on the loss functions to better train network models. Mean squared error (MSE) or L2 loss is the most widely used loss function for general image restoration and is also the major performance measure (PSNR) for those problems. However, Zhao et al. [35] reported that training with L2 loss does not guarantee better performance compared to other loss functions in terms of PSNR and SSIM. In their experiments, a network trained with L1 achieved improved performance compared with the network trained with L2.

3. Proposed Methods

In this section, we describe the proposed model architectures. We first analyze a recently published super-resolution network and suggest an enhanced version of the residual network architecture with a simpler structure. We show that our network outperforms the original ones while exhibiting improved computational efficiency. In the following sections, we suggest a single-scale architecture (EDSR) that handles a specific super-resolution scale and a multi-scale architecture (MDSR) that reconstructs various scales of high-resolution images in a single model.
Figure 3: The architecture of the proposed single-scale SR network (EDSR). [Image: Conv → residual blocks (Conv–ReLU–Conv–Mult) → Conv → upsample module (×2/×3: Conv + Shuffle; ×4: two cascaded Conv + Shuffle stages) → Conv.]

3.1. Residual blocks

Recently, residual networks [11, 9, 14] exhibit excellent performance in computer vision problems from low-level to high-level tasks. Although Ledig et al. [14] successfully applied the ResNet architecture to the super-resolution problem with SRResNet, we further improve the performance by employing a better ResNet structure.
Figure 2: Comparison of residual blocks in (a) the original ResNet, (b) SRResNet, and (c) ours (proposed).

In Fig. 2, we compare the building blocks of each network model from the original ResNet [9], SRResNet [14], and our proposed networks. We remove the batch normalization layers from our network, as Nah et al. [19] presented in their image deblurring work. Since batch normalization layers normalize the features, they get rid of range flexibility from the networks; it is therefore better to remove them. We experimentally show that this simple modification increases the performance substantially, as detailed in Sec. 4.

Furthermore, GPU memory usage is also sufficiently reduced since the batch normalization layers consume the same amount of memory as the preceding convolutional layers. Our baseline model without batch normalization layers saves approximately 40% of memory usage during training, compared to SRResNet. Consequently, we can build up a larger model that has better performance than the conventional ResNet structure under limited computational resources.

3.2. Single-scale model

The simplest way to enhance the performance of the network model is to increase the number of parameters. In a convolutional neural network, model performance can be enhanced by stacking many layers or by increasing the number of filters. A general CNN architecture with depth (the number of layers) $B$ and width (the number of feature channels) $F$ occupies roughly $O(BF)$ memory with $O(BF^2)$ parameters. Therefore, increasing $F$ instead of $B$ can maximize the model capacity when considering limited computational resources.
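To make this trade-off concrete, here is a back-of-the-envelope estimate of ours (not a figure from the paper), counting only the two 3×3 convolutions inside each residual block and ignoring biases and the head, tail, and upsampling layers: the residual body holds roughly $B \times 2 \times (3 \times 3 \times F^2) = 18BF^2$ parameters, so $B = 32$ and $F = 256$ give $18 \times 32 \times 256^2 \approx 3.8 \times 10^7$. Doubling $F$ thus quadruples the parameter count, whereas doubling $B$ only doubles it.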
However, we found that increasing the number of feature maps above a certain level would make the training procedure numerically unstable. A similar phenomenon was reported by Szegedy et al. [24]. We resolve this issue by adopting residual scaling [24] with factor 0.1. In each residual block, constant scaling layers are placed after the last convolution layers. These modules stabilize the training procedure greatly when using a large number of filters. In the test phase, this layer can be integrated into the previous convolution layer for computational efficiency.
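The paper's code is written with the Torch7 framework (Sec. 4); purely as an illustration, the following is a minimal PyTorch-style sketch of the residual block of Fig. 2(c) — batch normalization removed, a Conv–ReLU–Conv body, and the constant 0.1 scaling (the Mult layer) after the last convolution. Class and argument names are ours, not the authors':

```python
import torch.nn as nn

class ResBlock(nn.Module):
    """Proposed residual block: Conv-ReLU-Conv with no batch normalization
    and a constant residual scaling applied after the last convolution."""
    def __init__(self, n_feats=256, res_scale=0.1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(n_feats, n_feats, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(n_feats, n_feats, kernel_size=3, padding=1),
        )
        self.res_scale = res_scale  # 0.1 stabilizes training for large F

    def forward(self, x):
        # Scale the residual branch, then add the identity path.
        return x + self.body(x) * self.res_scale
```

At test time, the constant scaling can be folded into the weights of the preceding convolution, as noted above.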
We construct our baseline (single-scale) model with our proposed residual blocks in Fig. 2. The structure is similar to SRResNet [14], but our model does not have ReLU activation layers outside the residual blocks. Also, our baseline model does not have residual scaling layers because we use only 64 feature maps for each convolution layer. In our final single-scale model (EDSR), we expand the baseline model by setting $B = 32$, $F = 256$ with a scaling factor 0.1. The model architecture is displayed in Fig. 3.
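The upsample module at the tail of Fig. 3 is a sub-pixel (pixel shuffle) convolution [22]; here is a hedged sketch in the same PyTorch style, where the ×4 tail is built from two cascaded ×2 stages as drawn in the figure (the function name and defaults are ours):

```python
import torch.nn as nn

def make_upsampler(scale, n_feats=256):
    """Sub-pixel upsampling tail: a convolution expands the channels by
    scale^2, then PixelShuffle rearranges them into a larger feature map."""
    layers = []
    if scale in (2, 3):
        layers += [nn.Conv2d(n_feats, n_feats * scale ** 2, 3, padding=1),
                   nn.PixelShuffle(scale)]
    elif scale == 4:
        for _ in range(2):  # two cascaded x2 stages, as in Fig. 3
            layers += [nn.Conv2d(n_feats, n_feats * 4, 3, padding=1),
                       nn.PixelShuffle(2)]
    else:
        raise ValueError(f"unsupported scale: {scale}")
    return nn.Sequential(*layers)
```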
When training our model for upsampling factors ×3 and ×4, we initialize the model parameters with the pre-trained ×2 network. This pre-training strategy accelerates the training and improves the final performance, as clearly demonstrated in Fig. 4. For upscaling ×4, if we use a pre-trained scale ×2 model (blue line), the training converges much faster than the one started from random initialization (green line).
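In a reimplementation, this pre-training amounts to copying every shape-compatible weight from the ×2 checkpoint and leaving the scale-specific upsampler randomly initialized. A sketch under our assumptions (the checkpoint path and function name are hypothetical):

```python
import torch

def init_from_x2(model, ckpt_path="edsr_x2.pt"):
    """Initialize a x3/x4 model from a pre-trained x2 network.
    Shared layers are copied; layers whose shapes differ (e.g. the
    scale-specific upsampler) keep their random initialization."""
    pretrained = torch.load(ckpt_path, map_location="cpu")
    own = model.state_dict()
    own.update({k: v for k, v in pretrained.items()
                if k in own and own[k].shape == v.shape})
    model.load_state_dict(own)
```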
Figure 4: PSNR (dB) on the DIV2K validation set (×4) over training updates (k), comparing training from the pre-trained ×2 model, from scratch, and the best performance from scratch.

Figure 5: The architecture of the proposed multi-scale SR network (MDSR). [Image: scale-specific pre-processing residual blocks (×2/×3/×4) → shared body of residual blocks → scale-specific upsampling modules (×2/×3/×4).]
Table 2: Performance comparison between architectures on the DIV2K validation set (PSNR(dB) / SSIM). Red indicates the
best performance and blue indicates the second best. EDSR+ and MDSR+ denote self-ensemble versions of EDSR and
MDSR.
We train our model with the ADAM optimizer [13] by setting $\beta_1 = 0.9$, $\beta_2 = 0.999$, and $\epsilon = 10^{-8}$. We set the minibatch size as 16. The learning rate is initialized as $10^{-4}$ and halved at every $2 \times 10^5$ minibatch updates.
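These hyper-parameters map directly onto a modern training loop; a minimal sketch assuming a PyTorch-style optimizer and scheduler (the original implementation uses Torch7, and the stand-in model is ours):

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 3, 3, padding=1)  # stand-in for EDSR/MDSR
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4,
                             betas=(0.9, 0.999), eps=1e-8)
# Halve the learning rate every 2 x 10^5 minibatch updates (batch size 16).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                            step_size=200_000, gamma=0.5)
criterion = nn.L1Loss()  # L1 training loss, as discussed below
```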
For the single-scale models (EDSR), we train the networks as described in Sec. 3.2. The ×2 model is trained from scratch. After the model converges, we use it as a pre-trained network for the other scales.

At each update of training a multi-scale model (MDSR), we construct the minibatch with a randomly selected scale among ×2, ×3 and ×4. Only the modules that correspond to the selected scale are enabled and updated. Hence, scale-specific residual blocks and upsampling modules that correspond to scales other than the selected one are neither enabled nor updated.
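A sketch of one such update under our assumptions, with dictionaries of scale-specific pre-processing and upsampling modules around a shared body (names are illustrative, not from the paper's code):

```python
import random

def mdsr_update(batches, pre, body, post, optimizer, criterion):
    """One MDSR step: pick a random scale and route its minibatch through
    that scale's modules only; unused modules receive no gradients."""
    scale = random.choice([2, 3, 4])
    lr_img, hr_img = batches[scale]      # minibatch for the selected scale
    sr = post[scale](body(pre[scale](lr_img)))
    loss = criterion(sr, hr_img)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                     # parameters without gradients stay fixed
    return loss.item()
```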
We train our networks using L1 loss instead of L2. Minimizing L2 is generally preferred since it maximizes the PSNR. However, based on a series of experiments, we empirically found that L1 loss provides better convergence than L2. The evaluation of this comparison is provided in Sec. 4.4.

We implemented the proposed networks with the Torch7 framework and trained them using NVIDIA Titan X GPUs. It takes 8 days and 4 days to train EDSR and MDSR, respectively. The source code is publicly available online.¹
4.3. Geometric Self-ensemble

In order to maximize the potential performance of our model, we adopt the self-ensemble strategy similarly to [28]. During the test time, we flip and rotate the input image $I^{LR}$ to generate seven augmented inputs $I^{LR}_{n,i} = T_i\left(I^{LR}_n\right)$ for each sample, where $T_i$ represents the 8 geometric transformations including identity. With those augmented low-resolution images, we generate corresponding super-resolved images $\left\{I^{SR}_{n,1}, \cdots, I^{SR}_{n,8}\right\}$ using the networks. We then apply the inverse transform to those output images to get the original geometry, $\tilde{I}^{SR}_{n,i} = T_i^{-1}\left(I^{SR}_{n,i}\right)$. Finally, we average the transformed outputs all together to make the self-ensemble result as follows: $I^{SR}_n = \frac{1}{8}\sum_{i=1}^{8} \tilde{I}^{SR}_{n,i}$.
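A compact sketch of this procedure (our illustration, assuming [N, C, H, W] tensors; the 8 transformations are realized as the four 90° rotations with and without a horizontal flip):

```python
import torch

def geometric_self_ensemble(model, lr_img):
    """Average SR outputs over the 8 geometric transforms T_i,
    applying the inverse transform T_i^{-1} to each output."""
    outputs = []
    for rot in range(4):                  # 0, 90, 180, 270 degrees
        for flip in (False, True):
            x = torch.rot90(lr_img, rot, dims=(2, 3))
            if flip:
                x = torch.flip(x, dims=(3,))
            y = model(x)
            if flip:                      # undo the flip first...
                y = torch.flip(y, dims=(3,))
            y = torch.rot90(y, -rot, dims=(2, 3))  # ...then the rotation
            outputs.append(y)
    return torch.stack(outputs).mean(dim=0)
```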
This self-ensemble method has an advantage over other ensembles as it does not require additional training of separate models. It is especially beneficial when the model size or training time matters. Although the self-ensemble strategy keeps the total number of parameters the same, we notice that it gives approximately the same performance gain compared to a conventional model ensemble method that requires individually trained models. We denote the methods using self-ensemble by adding the '+' postfix to the method name, i.e., EDSR+/MDSR+. Note that geometric self-ensemble is valid only for symmetric downsampling methods such as bicubic downsampling.

¹ https://github.com/LimBee/NTIRE2017
² We confirmed our reproduction is correct by getting comparable results in an individual experiment, using the same settings of the paper [14]. In our experiments, however, it became slightly different to match the settings of our baseline model training. See our codes at https://github.com/LimBee/NTIRE2017.
³ We used the original paper (https://arxiv.org/abs/1609.04802v3) as a reference.

4.4. Evaluation on DIV2K Dataset

We test our proposed networks on the DIV2K dataset. Starting from SRResNet, we gradually change various settings to perform ablation tests. We train SRResNet [14] on our own.²,³ First, we change the loss function from L2 to L1, and then the network architecture is reformed as described in the previous section and summarized in Table 1.

We train all those models with $3 \times 10^5$ updates in this experiment. Evaluation is conducted on the 10 images of the DIV2K validation set, with PSNR and SSIM criteria. For the evaluation, we use full RGB channels and ignore the (6 + scale) pixels from the border.

Table 2 presents the quantitative results. SRResNet trained with L1 gives slightly better results than the original one trained with L2 for all scale factors. Modifications of the network give an even bigger margin of improvements. The last 2 columns of Table 2 show significant performance gains of our final bigger models, EDSR+ and MDSR+, with the geometric self-ensemble technique. Note that our models require much less GPU memory since they do not have batch normalization layers.
Figure 6: Qualitative comparison of our models with other works on ×4 super-resolution (PSNR / SSIM).
img034 from Urban100 [10]: HR, Bicubic (21.41 dB / 0.4810), A+ [27] (22.21 dB / 0.5408), SRCNN [4] (22.33 dB / 0.5461), VDSR [11] (22.62 dB / 0.5657), SRResNet [14] (23.14 dB / 0.5891), EDSR+ (Ours) (23.48 dB / 0.6048), MDSR+ (Ours) (23.46 dB / 0.6039).
img062 from Urban100 [10]: VDSR [11] (20.75 dB / 0.7504), SRResNet [14] (21.70 dB / 0.8054), EDSR+ (Ours) (22.70 dB / 0.8537), MDSR+ (Ours) (22.66 dB / 0.8508).
0869 from DIV2K [26]: VDSR [11] (23.36 dB / 0.8365), SRResNet [14] (23.71 dB / 0.8485), EDSR+ (Ours) (23.89 dB / 0.8563), MDSR+ (Ours) (23.90 dB / 0.8558).
Table 3: Public benchmark test results and DIV2K validation results (PSNR(dB) / SSIM). Red indicates the best
performance and blue indicates the second best. Note that DIV2K validation results are acquired from published demo
codes.
4.5. Benchmark Results

We provide the quantitative evaluation results of our final models (EDSR+, MDSR+) on public benchmark datasets in Table 3. The evaluation of the self-ensemble is also provided in the last two columns. We trained our models using $10^6$ updates with batch size 16. We keep the other settings the same as the baseline models. We compare our models with the state-of-the-art methods including A+ [27], SRCNN [4], VDSR [11], and SRResNet [14]. For comparison, we measure PSNR and SSIM on the y channel and ignore the same amount of pixels as scales from the border. We used MATLAB [18] functions for evaluation. Comparative results on the DIV2K dataset are also provided. Our models exhibit a significant improvement compared to the other methods. The gaps further increase after performing self-ensemble. We also present the qualitative results in Fig. 6. The proposed models successfully reconstruct the detailed textures and edges in the HR images and exhibit better-looking SR outputs compared with the previous works.
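For reference, here is a hedged NumPy sketch of this protocol (our code; the paper itself uses MATLAB [18] functions): convert to the Y channel with the BT.601 weights commonly used in SR evaluation, crop `scale` pixels from each border, and compute PSNR:

```python
import numpy as np

def psnr_y(sr, hr, scale):
    """PSNR on the luminance (Y) channel, ignoring `scale` border pixels.
    `sr` and `hr` are uint8 RGB arrays of shape [H, W, 3]."""
    def to_y(img):  # ITU-R BT.601 luma, a common choice in SR evaluation
        return 16.0 + (65.481 * img[..., 0] + 128.553 * img[..., 1]
                       + 24.966 * img[..., 2]) / 255.0
    y_sr = to_y(sr.astype(np.float64))[scale:-scale, scale:-scale]
    y_hr = to_y(hr.astype(np.float64))[scale:-scale, scale:-scale]
    mse = np.mean((y_sr - y_hr) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)
```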
5. NTIRE2017 SR Challenge

This work was initially proposed for the purpose of participating in the NTIRE2017 Super-Resolution Challenge [26]. The challenge aims to develop a single image super-resolution system with the highest PSNR.

In the challenge, there exist two tracks for different degraders (bicubic, unknown) with three downsample scales (×2, 3, 4) each. Input images for the unknown track are not only downscaled but also suffer from severe blurring. Therefore, more robust mechanisms are required to deal with the second track. We submitted our two SR models (EDSR and MDSR) for each competition and prove that our algorithms are very robust to different downsampling conditions. Some results of our algorithms on the unknown downsampling track are illustrated in Fig. 7. Our methods successfully reconstruct high-resolution images from severely degraded input images. Our proposed EDSR+ and MDSR+ won the first and second places, respectively, with outstanding performances as shown in Table 4.

6. Conclusion

In this paper, we proposed an enhanced super-resolution algorithm. By removing unnecessary modules from the conventional ResNet architecture, we achieve improved results while making our model compact. We also employ residual scaling techniques to stably train large models. Our proposed single-scale model surpasses current models and achieves the state-of-the-art performance.

Furthermore, we develop a multi-scale super-resolution network to reduce the model size and training time. With scale-dependent modules and a shared main network, our multi-scale model can effectively deal with various scales of super-resolution in a unified framework. While the multi-scale model remains compact compared with a set of single-scale models, it shows comparable performance to the single-scale SR model.

Our proposed single-scale and multi-scale models have achieved the top ranks on both the standard benchmark datasets and the DIV2K dataset.
0791 from DIV2K [26]: HR (PSNR / SSIM), Bicubic (22.20 dB / 0.7979), EDSR (Ours) (29.05 dB / 0.9257), MDSR (Ours) (28.96 dB / 0.9244).
0792 from DIV2K [26]: Bicubic (21.59 dB / 0.6846), EDSR (Ours) (27.24 dB / 0.8376), MDSR (Ours) (27.14 dB / 0.8356).
0793 from DIV2K [26]: Bicubic (23.81 dB / 0.8053), EDSR (Ours) (30.94 dB / 0.9318), MDSR (Ours) (30.81 dB / 0.9301).
0797 from DIV2K [26]: Bicubic (19.77 dB / 0.8937), EDSR (Ours) (25.48 dB / 0.9597), MDSR (Ours) (25.38 dB / 0.9590).
Figure 7: Our NTIRE2017 Super-Resolution Challenge results on unknown downscaling ×4 category. In the challenge, we
excluded images from 0791 to 0800 from training for validation. We did not use geometric self-ensemble for unknown
downscaling category.
Table 4: Performance of our methods on the test dataset of NTIRE2017 Super-Resolution Challenge [26]. The results of top
5 methods are displayed for two tracks and six categories. Red indicates the best performance and blue indicates the second
best.
References

[1] J. Allebach and P. W. Wong. Edge-directed interpolation. In ICIP 1996.
[2] M. Bevilacqua, A. Roumy, C. Guillemot, and M. L. Alberi-Morel. Low-complexity single-image super-resolution based on nonnegative neighbor embedding. In BMVC 2012.
[3] H. Chang, D.-Y. Yeung, and Y. Xiong. Super-resolution through neighbor embedding. In CVPR 2004.
[4] C. Dong, C. C. Loy, K. He, and X. Tang. Learning a deep convolutional network for image super-resolution. In ECCV 2014.
[5] C. Dong, C. C. Loy, and X. Tang. Accelerating the super-resolution convolutional neural network. In ECCV 2016.
[6] G. Freedman and R. Fattal. Image and video upscaling from local self-examples. ACM Transactions on Graphics (TOG), 30(2):12, 2011.
[7] X. Gao, K. Zhang, D. Tao, and X. Li. Image super-resolution with sparse neighbor embedding. IEEE Transactions on Image Processing, 21(7):3194–3205, 2012.
[8] D. Glasner, S. Bagon, and M. Irani. Super-resolution from a single image. In ICCV 2009.
[9] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR 2016.
[10] J.-B. Huang, A. Singh, and N. Ahuja. Single image super-resolution from transformed self-exemplars. In CVPR 2015.
[11] J. Kim, J. Kwon Lee, and K. M. Lee. Accurate image super-resolution using very deep convolutional networks. In CVPR 2016.
[12] J. Kim, J. Kwon Lee, and K. M. Lee. Deeply-recursive convolutional network for image super-resolution. In CVPR 2016.
[13] D. Kingma and J. Ba. Adam: A method for stochastic optimization. In ICLR 2015.
[14] C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. arXiv:1609.04802, 2016.
[15] X. Li and M. T. Orchard. New edge-directed interpolation. IEEE Transactions on Image Processing, 10(10):1521–1527, 2001.
[16] X. Mao, C. Shen, and Y.-B. Yang. Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections. In NIPS 2016.
[17] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In ICCV 2001.
[18] MATLAB. Version 9.1.0 (R2016b). The MathWorks Inc., Natick, Massachusetts, 2016.
[19] S. Nah, T. H. Kim, and K. M. Lee. Deep multi-scale convolutional neural network for dynamic scene deblurring. arXiv:1612.02177, 2016.
[20] O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In MICCAI 2015.
[21] S. T. Roweis and L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500):2323–2326, 2000.
[22] W. Shi, J. Caballero, F. Huszár, J. Totz, A. P. Aitken, R. Bishop, D. Rueckert, and Z. Wang. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In CVPR 2016.
[23] J. Sun, Z. Xu, and H.-Y. Shum. Image super-resolution using gradient profile prior. In CVPR 2008.
[24] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. Alemi. Inception-v4, Inception-ResNet and the impact of residual connections on learning. arXiv:1602.07261, 2016.
[25] Y.-W. Tai, S. Liu, M. S. Brown, and S. Lin. Super resolution using edge prior and single image detail synthesis. In CVPR 2010.
[26] R. Timofte, E. Agustsson, L. Van Gool, M.-H. Yang, L. Zhang, et al. NTIRE 2017 challenge on single image super-resolution: Methods and results. In CVPR 2017 Workshops.
[27] R. Timofte, V. De Smet, and L. Van Gool. A+: Adjusted anchored neighborhood regression for fast super-resolution. In ACCV 2014.
[28] R. Timofte, R. Rothe, and L. Van Gool. Seven ways to improve example-based single image super resolution. In CVPR 2016.
[29] Z. Wang, Y. Yang, Z. Wang, S. Chang, J. Yang, and T. S. Huang. Learning super-resolution jointly from external and internal examples. IEEE Transactions on Image Processing, 24(11):4359–4371, 2015.
[30] C.-Y. Yang and M.-H. Yang. Fast direct super-resolution by simple functions. In ICCV 2013.
[31] J. Yang, Z. Wang, Z. Lin, S. Cohen, and T. Huang. Coupled dictionary training for image super-resolution. IEEE Transactions on Image Processing, 21(8):3467–3478, 2012.
[32] J. Yang, J. Wright, T. S. Huang, and Y. Ma. Image super-resolution via sparse representation. IEEE Transactions on Image Processing, 19(11):2861–2873, 2010.
[33] R. Zeyde, M. Elad, and M. Protter. On single image scale-up using sparse-representations. In Proceedings of the International Conference on Curves and Surfaces, 2010.
[34] L. Zhang and X. Wu. An edge-guided image interpolation algorithm via directional filtering and data fusion. IEEE Transactions on Image Processing, 15(8):2226–2238, 2006.
[35] H. Zhao, O. Gallo, I. Frosio, and J. Kautz. Loss functions for neural networks for image processing. arXiv:1511.08861, 2015.