
SAG-GAN: Semi-Supervised Attention-Guided GANs for Data Augmentation on Medical Images


Chang Qi1,2, Junyang Chen3, Guizhi Xu1,2, Zhenghua Xu1,2,†, Thomas Lukasiewicz4, and Yang Liu5

1 State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Hebei University of Technology, China.
2 Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, Hebei University of Technology, China.
3 Department of Computer Science, University of Macau, China.
4 Department of Computer Science, University of Oxford, United Kingdom.
5 College of Computer Science and Technology, Harbin Institute of Technology, China.
† Corresponding author, email: [email protected]

arXiv:2011.07534v1 [eess.IV] 15 Nov 2020

Abstract—Recently, deep learning methods, in particular convolutional neural networks (CNNs), have driven massive breakthroughs across computer vision. A large-scale annotated dataset is also an essential key to a successful training procedure. However, obtaining such datasets in the medical domain is a huge challenge. Towards this, we present a data augmentation method that generates synthetic medical images using cycle-consistent Generative Adversarial Networks (GANs). We add semi-supervised attention modules to generate images with convincing details. We treat tumor images and normal images as two domains. The proposed GAN-based model can generate a tumor image from a normal image and, in turn, can also recover a normal image from a tumor image. Furthermore, we show that the generated medical images can be used to improve the performance of ResNet18 for medical image classification. Our model is applied to three limited datasets of tumor MRI images. We first generate MRI images on the limited datasets, then train three popular classification models to find the best model for tumor classification. Finally, we train classification models using real images with classic data augmentation methods, and classification models using synthetic images. The classification results of those trained models show that the proposed SAG-GAN data augmentation method boosts accuracy and AUC compared with classic data augmentation methods. We believe the proposed data augmentation method can be applied to other medical image domains and improve the accuracy of computer-assisted diagnosis.

Index Terms—Generative adversarial networks (GANs), data augmentation, attention module, medical image processing

I. INTRODUCTION

Over recent years, there has been great progress in the field of Generative Adversarial Networks (GANs) [1] and their extensions. Their wide and successful application to image generation tasks [2]–[4] has attracted growing interest across many communities, including medical imaging. A GAN sets up a minimax game between two neural networks: a generator that produces samples and a discriminator that identifies the source of the samples. In this game, the adversarial loss provided by the discriminator offers a clever way to capture high-dimensional and complex distributions, imposing a higher-order consistency that has proven useful in many settings, such as domain adaptation, data augmentation, and image-to-image translation.

It is widely known that sufficient data is critical to success when training deep learning models for computer vision. Data with high class imbalance or poor diversity leads to poor model performance. This often proves problematic in medical imaging, where abnormal findings are, by definition, uncommon. While traditional data augmentation schemes (e.g., crop, rotation, flip, and translation) can mitigate some of these issues, the augmented images have a distribution similar to that of the original images, leading to limited performance improvement; the diversity that such modifications can bring is relatively small. Motivated by GANs, researchers have tried adding synthetic samples to the training process. GAN-based data augmentation can improve performance by filling in the parts of the distribution not covered by the original images. Since it can generate new but realistic images, it has achieved outstanding performance in medical image analysis [5]–[7].

Despite these efforts, GAN-based data augmentation remains challenging for some medical imaging problems, such as converting a normal image without a tumor into a tumor image. Essentially, this is an image generation problem, but unlike style translation tasks such as synthesizing PET images from CT scans [8] or from MRI [9], the attribute manipulation is more challenging because only some image features must be modified while all others are kept unchanged. A straightforward option is to modify the attribute manually in a high-dimensional space [10], converting the attribute manipulation task into a style transfer task. The other option is to generate tumor images from noise [11] rather than from normal images. However, those solutions define the position and size of the tumors manually, which may break the image prior and cause unacceptable additional false positives in subsequent image processing tasks.

To overcome the issues mentioned above, in this paper we propose a novel GAN-based data augmentation model guided by semi-supervised attention mechanisms. Inspired by CycleGAN [12], the proposed data augmentation model comprises two generators and two discriminators. The cycle consistency constrains the model to change only the image features that need to be modified. Moreover, the model not only generates images by manipulating high-level features but also pushes the generator, through an additional attention module, to locate the areas to translate in each image. The attention modules are trained with both an adversarial loss and a pixel-wise loss, which is why we call them semi-supervised attention mechanisms. We also add spectral normalization [13] to stabilize the training of the discriminator. We evaluate the realism of the generated images through tumor classification results with and without the proposed data augmentation method.

The contributions of this paper are summarized as follows:
• GAN-based data augmentation method for medical images: We propose a novel semi-supervised attention-guided CycleGAN that generates tumors in normal images and recovers normal images from tumor images. It is the first model that generates tumors in normal images naturally, so the image prior is not destroyed.
• Semi-supervised attention mechanism: To the best of our knowledge, we are the first to integrate a semi-supervised attention mechanism into Generative Adversarial Networks. We propose an attention module trained with both an adversarial loss and a pixel-wise loss; the additional pixel-wise loss pushes the attention module to locate tumors as accurately as possible.

II. RELATED WORK

Generative Adversarial Networks (GANs) [1] train two models: a generator G that learns the target data distribution p_data(x), and a discriminator D that assesses whether its input comes from p_data(x) or from G(z). The generator G is trained to maximize the chance that D makes a mistake, while D is trained to maximize the probability of assigning the correct label to both training examples and samples from G. GANs are powerful generative models that have achieved impressive results on many computer vision tasks. The adversarial loss is the key to their success: it forces the model to generate images that are indistinguishable from real images, since the generator is optimized not by a pixel-wise loss but by another network, the discriminator.

Attention-guided GANs address the issue that instance-level correspondences are indistinguishable from the distribution of the target set. The attention module forces the network to pay more attention to the area under the attention map. Mejjati et al. [14] propose an attention mechanism that is jointly trained with the generators and discriminators. Chen et al. propose AttentionGAN [15], in which an extra attention network generates the attention maps. Kastaniotis et al. [16] propose ATA-GAN, in which a teacher network produces the attention maps. Zhang et al. [17] propose the Self-Attention Generative Adversarial Network (SAGAN), which uses the non-local module [18] to produce the attention map. Liang et al. [19] propose a Contrasting GAN that takes the segmentation mask as the attention map. Sun et al. [20] propose an attention GAN that uses an FCN to generate a facial mask for face attribute manipulation.

To the best of our knowledge, two research groups have tried to generate tumor images. Shin et al. [10] tried to transplant tumors from the BRATS dataset into normal MRI images from the ADNI dataset. They proposed a two-stage GAN model for the transformation task. The first stage is an image-to-brain segmentation GAN: the generator produces brain masks with white matter, grey matter, and CSF from the input MRI images, while the discriminator is trained to distinguish real brain masks annotated by doctors from fake brain masks produced by the generator. The second stage is a brain-to-image synthesis GAN: the brain masks generated in stage one are merged with tumor masks, and the merged brain masks are fed to the brain-to-image model to generate abnormal MRI images with brain tumors. However, this two-stage model has some limitations. First, there is no brain mask annotation in dataset A, so the tumor mask merged into the brain mask must be inferred by a model trained on dataset B; but dataset B does not contain tumor information, so the quality of the generated tumor mask is doubtful. Second, the position of the tumor merged into the brain mask is decided by the researchers, which means it is manual and random. However, the tumor location is related to other features of the tumor, such as its size, shape, and degree of malignancy, so placing a tumor at a wrong location given its other features may damage the MRI image prior, causing higher false positive rates and a less robust model.

Han et al. [11] proposed a CPG-GAN model for generating tumors in normal MRI images. The 'condition' in their model is a [0, 1] mask, where 0 stands for the non-tumor area and 1 stands for the area in which tumor features should be generated. However, there are also problems with their work. First, as in Shin et al.'s work [10], the position of the tumor is decided by the researchers, which damages the image prior; their experiments confirm this: the false positives per slice of the detection task increase by 3.52 for an increase in sensitivity of only 0.1. Second, the adversarial loss alone is not enough to generate a realistic tumor image from a normal MRI image.

The goal of our work is to generate abnormal MRI images from normal MRI images to fill out the data distribution. This means the model needs to learn the mapping between the normal domain N and the tumor domain T from unpaired training samples {N_i}_{i=1}^{n} ∈ N and {T_j}_{j=1}^{m} ∈ T. As illustrated in Fig. 1, our data augmentation method includes two mapping functions, G1: Normal → Tumor and G2: Tumor → Normal. Meanwhile, there are two adversarial discriminators: D1 distinguishes between real tumor images {T_r}, tumor images generated from real normal images {T_gr}, and tumor images generated from generated normal images {T_gg}; in the same way, D2 distinguishes between real normal images {N_r}, normal images generated from real tumor images {N_gr}, and normal images generated from generated tumor images {N_gg}. The attention networks in both the generators G1/G2 and the discriminators D1/D2 predict the region of interest in the input image.
[Fig. 1 appears here; its three panels are: 1) Architecture, 2) Channel Attention Module, 3) Attention-Guided Generator.]
Fig. 1. Illustration of our proposed Semi-Supervised Attention-Guided CycleGAN (SAG-GAN). 'L' represents abnormal images with the target object, 'N' represents normal images. G_L/G_N: the attention-guided generators; D_L/D_N: the attention-guided discriminators.
III. PROPOSED MODEL

A. Channel Attention Module

Residual blocks build the whole model. To enhance the model's ability to capture hierarchical patterns, and thereby improve the quality of the image representations, we add an improved Squeeze-and-Excitation (SE) block [21] to the original residual module, following the standard integration design. This goal is achieved by explicitly modeling the interdependencies between the channels in feature space, which allows the model to emphasize channels with informative features and suppress less useful ones.

Specifically, as shown in panel 2 of Fig. 1, the proposed AMSE (Average-Maximum Squeeze-and-Excitation) Residual module consists of two blocks: the transformation function F_t and the AMSE block. F_t takes the image features X ∈ R^{C'×H'×W'} captured by the previous AMSE-Residual block as input and outputs transformed image features U ∈ R^{C×H×W}. At the squeeze stage, a statistic Z ∈ R^C is generated by collapsing U's spatial dimensions H × W. The p-th element of Z is calculated by:

Z(p) = F_{sq}(U) = \frac{1}{H \times W} \sum_{i=0}^{H-1} \sum_{j=0}^{W-1} U(p, i, j) + \max_{i,j} U(p, i, j)    (1)

To learn the nonlinear interaction between channels while maintaining a non-mutually-exclusive relationship, the excitation stage uses a gating mechanism consisting of two fully-connected (FC) layers and one ReLU layer. Given the output Z of the squeeze stage, the output activation Z' ∈ R^C of the excitation stage is:

Z' = F_{ex}(Z)    (2)

The final output X' ∈ R^{C×H×W} of the block is the rescaled U plus the original X:

X' = F_{rescale}(Z', U) + X = Z' \cdot U + X    (3)
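To make the module concrete, the following is a minimal PyTorch sketch of the AMSE-Residual block described by Eqs. (1)–(3). The paper gives only the equations; the class names, the convolution/instance-norm design of F_t, and the reduction ratio r = 16 are assumptions of ours.

import torch
import torch.nn as nn

class AMSEBlock(nn.Module):
    """Average-Maximum Squeeze-and-Excitation gate (Eqs. (1)-(2)).

    Squeeze: global average pooling plus global max pooling, summed.
    Excitation: FC -> ReLU -> FC -> Sigmoid gating over the channels.
    The reduction ratio r = 16 is the SE-Net default, an assumption here.
    """
    def __init__(self, channels, r=16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // r),
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),
            nn.Sigmoid(),
        )

    def forward(self, u):
        b, c, _, _ = u.shape
        z = u.mean(dim=(2, 3)) + u.amax(dim=(2, 3))   # Eq. (1): avg + max squeeze
        z = self.gate(z).view(b, c, 1, 1)             # Eq. (2): channel gating
        return u * z                                  # rescaled features Z' . U

class AMSEResidual(nn.Module):
    """AMSE-Residual module (Eq. (3)): X' = AMSE(F_t(X)) + X.
    The design of F_t is assumed; it must preserve the channel count so
    that the identity shortcut is valid."""
    def __init__(self, channels):
        super().__init__()
        self.ft = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(channels),
        )
        self.amse = AMSEBlock(channels)

    def forward(self, x):
        return self.amse(self.ft(x)) + x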
B. Attention-Guided Generator

Unlike style translation in domain translation tasks, translating between normal images and tumor images requires solving two tasks: 1) locating the area to translate, and 2) performing the proper translation in the located area. We therefore propose two attention networks, A_N and A_T, to achieve this. A_N selects the area in which to generate a tumor so as to maximize the probability that the discriminator makes a mistake and minimize the probability that the generator makes one; A_T locates the tumor and generates a probability map, which guides the generator to recover normal images from tumor images.

In the forward pass, the generated image consists of two parts: the foreground from the generator and the background from the input image. Take the translation from normal samples to tumor samples as an example. First, the normal brain MRI {N_r} ∈ N is fed into the generator G_{N→T}, which maps {N_r} to the target domain T, producing the generated tumor image G_{T'} = G_{N→T}({N_r}). Then, the same input {N_r} is fed into the attention module A_N, producing the attention map M_{N'} = A_N({N_r}). To create the 'foreground' object {T_f'} ∈ T, we apply M_{N'} to G_{T'} via an element-wise product: {T_f'} = M_{N'} ⊙ G_{T'}. Second, the inverse attention map M'_{N'} = 1 − M_{N'} is applied to the input image via an element-wise product to form the background. Thus, the mapped image {T_g} is obtained by:

T_g = \underbrace{M_{N'} \odot G_{T'}}_{\text{Foreground}} + \underbrace{(1 - M_{N'}) \odot N_r}_{\text{Background}}    (4)

We describe only the map F_{N→T}; the inverse map F_{T→N} is defined similarly. Fig. 2 visualizes these processes.
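A minimal sketch of the composition in Eq. (4), assuming `generator` and `attention` are PyTorch modules that output a translated image and a single-channel map in [0, 1], respectively (these interfaces are assumptions; the paper gives no code):

import torch

def attention_guided_translate(x_real, generator, attention):
    """Eq. (4): foreground from the generator, background from the input."""
    g = generator(x_real)        # G_{T'} = G_{N->T}(N_r)
    m = attention(x_real)        # M_{N'} = A_N(N_r), values in [0, 1]
    return m * g + (1.0 - m) * x_real   # foreground + background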
C. Attention-Guided Discriminator

Eq. (4) constrains the generators to modify only the attention regions; the discriminators, however, still consider the whole image. A vanilla discriminator D_T takes the whole generated image {T_g} and the whole real image {T_r} ∈ T as input and tries to distinguish them. We add the attention mechanism to the discriminators so that they consider only the regions inside the attention map. We propose two attention-guided discriminators, which take the attention mask, the generated images, and the real images as inputs. The attention-guided discriminator D_T^A tries to distinguish the fake image masked by the attention map, M_{N'} ⊙ T_g, from the real image masked by the attention map, M_{N'} ⊙ T_r. Similarly, D_N^A tries to distinguish the fake image M_{T'} ⊙ N_g from the real image M_{T'} ⊙ N_r. With this attention-guided method, the discriminators can focus on the most discriminative content.
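In code, the only change with respect to a vanilla discriminator is that its input is masked by the attention map before scoring; a sketch is shown below. Whether the mask and the fake image are detached when updating the discriminator is our assumption, following common GAN practice.

def attention_masked_score(d, m, x):
    """Score an image under the attention mask: D^A(M ⊙ x), as in
    Eqs. (6)-(7). `d` is a discriminator module, `m` an attention map."""
    return d(m * x)

# Typical usage when updating the discriminator (our assumption): the fake
# image and its attention map are detached so that gradients do not flow
# back into the generator.
# fake_score = attention_masked_score(d_t, m_n.detach(), t_fake.detach())
# real_score = attention_masked_score(d_t, m_n.detach(), t_real)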
D. Spectral Normalization

It is widely known that GANs are challenging to train because the objective function of the native GAN is equivalent to the Jensen-Shannon (J-S) divergence between the distribution p_g of the generated data and the distribution p_r of the real data. However, the J-S metric fails to provide a meaningful value when the two distributions are disjoint, so there is no guarantee of convergence to a unique solution with p_r = p_g. WGAN [3] was proposed to replace the J-S divergence of the native GAN with the better-behaved Wasserstein distance, using the Kantorovich-Rubinstein duality to transform the Wasserstein distance problem into the search for an optimal Lipschitz-continuous function. The spectral normalization proposed by Miyato et al. [13] is a more elegant way to make the discriminator satisfy this Lipschitz continuity.
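In PyTorch, spectral normalization is available as torch.nn.utils.spectral_norm, which wraps each weight layer of the discriminator; the layer sizes below are placeholders for illustration, not the paper's exact architecture:

import torch.nn as nn
from torch.nn.utils import spectral_norm

# Wrapping every convolution in spectral normalization constrains the
# discriminator's Lipschitz constant.
disc = nn.Sequential(
    spectral_norm(nn.Conv2d(1, 64, 4, stride=2, padding=1)),
    nn.LeakyReLU(0.2, inplace=True),
    spectral_norm(nn.Conv2d(64, 128, 4, stride=2, padding=1)),
    nn.LeakyReLU(0.2, inplace=True),
    spectral_norm(nn.Conv2d(128, 1, 4, padding=1)),  # patch scores (logits)
)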
E. Semi-Supervised Attention Mechanism

Segmentation annotations are available in our case. For example, the attention map of the tumor → normal translation is exactly the whole tumor region in the segmentation annotation of the tumor. Therefore, we supervise the training of the attention network A_T with segmentation labels. Given a training set {(T_1, M_1), ..., (T_N, M_N)} of N examples, where M_i is the tumor segmentation label, we adopt a pixel-wise loss between the tumor label M_i and the generated attention map M_{T_i'} to reduce changes and constrain the generators. We express this loss as:

L_M(M_i, M_{T_i'}) = \| M_i - M_{T_i'} \|_1    (5)

We adopt the L1 distance as the measurement in the pixel loss and take the tumor segmentation label as the ground-truth annotation for the attention map M_{T_i'}. This added loss makes our model more robust by encouraging the attention maps to be sharp (converging towards a binary map), while the attention mask of normal areas is always zero.
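Eq. (5) is a plain pixel-wise L1 loss between the predicted attention map and the segmentation label; a one-line sketch:

import torch
import torch.nn.functional as F

def attention_supervision_loss(m_pred, m_label):
    """Eq. (5): pixel-wise L1 loss between the attention map predicted
    by A_T and the ground-truth tumor segmentation label."""
    return F.l1_loss(m_pred, m_label)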
F. Optimization Objective

Fig. 2 shows the training losses of the proposed data augmentation model.

[Fig. 2 appears here, showing the Normal-Tumor-Normal and Tumor-Normal-Tumor training paths with their adversarial, cycle-consistency, and pixel-wise losses.]
Fig. 2. The training losses of the proposed GAN-based network.

Attention-Guided Adversarial Loss. The attention-guided adversarial loss is proposed for training the attention-guided discriminators. It can be formulated as follows:

L_{AGAN}^{N}(G_{N \to T}, D_T^A) = \mathbb{E}_{t \sim p_{data}(t)}[\log D_T^A(M_{N'} \odot t)] + \mathbb{E}_{n \sim p_{data}(n)}[\log(1 - D_T^A(M_{N'} \odot G_{N \to T}(n)))]    (6)

where G_{N→T} aims to translate the normal image into a tumor image and to maximize the probability that the discriminator makes a mistake, while D_T^A is trained to distinguish the generated images from the real ones under the attention mask (M_{N'} ⊙ t). That is, G_{N→T} tries to minimize the attention-guided adversarial loss L_{AGAN}^{N}(G_{N→T}, D_T^A), while D_T^A tries to maximize it. The analogous loss for the discriminator D_N^A and the generator G_{T→N} is:

L_{AGAN}^{T}(G_{T \to N}, D_N^A) = \mathbb{E}_{n \sim p_{data}(n)}[\log D_N^A(M_{T'} \odot n)] + \mathbb{E}_{t \sim p_{data}(t)}[\log(1 - D_N^A(M_{T'} \odot G_{T \to N}(t)))]    (7)

Cycle-Consistency Loss. The cycle-consistency loss enforces forward and backward consistency. For example, if a tumor image is translated into a normal image, translating the generated normal image back to the tumor domain should close the cycle. Thus, the cycle-consistency loss is defined as:

L_{cycle}(G_{N \to T}, G_{T \to N}) = \mathbb{E}_{n \sim p_{data}(n)}[\| G_{T \to N}(G_{N \to T}(n)) - n \|_1] + \mathbb{E}_{t \sim p_{data}(t)}[\| G_{N \to T}(G_{T \to N}(t)) - t \|_1]    (8)

Loss Function. We obtain the final objective by combining the adversarial losses, the cycle-consistency loss, and the semi-supervised pixel loss for both the source and target domains:

L(G_{N \to T}, G_{T \to N}, A_N, A_T, D_N^A, D_T^A) = \lambda_{gan}(L_{AGAN}^{N} + L_{AGAN}^{T}) + \lambda_{cyc} L_{cycle}(G_{N \to T}, G_{T \to N}) + L_M(M_i, M_{T_i'})    (9)
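Putting Eqs. (4)–(9) together, the generator-side objective could be sketched as follows. The non-saturating BCE surrogate for the adversarial terms, the default weights λ_gan = 1 and λ_cyc = 10, and the use of the attention composition on the back-translation (our reading of Fig. 2) are all assumptions of ours; the paper states only the combined form of Eq. (9).

import torch
import torch.nn.functional as F

def adv_loss(scores, is_real):
    """BCE GAN loss on raw discriminator logits (assumed surrogate)."""
    target = torch.ones_like(scores) if is_real else torch.zeros_like(scores)
    return F.binary_cross_entropy_with_logits(scores, target)

def generator_objective(n_real, t_real, m_label,
                        g_n2t, g_t2n, a_n, a_t, d_t, d_n,
                        lam_gan=1.0, lam_cyc=10.0):
    """Generator-side loss of Eq. (9), a sketch with assumed interfaces."""
    # Attention-guided translations (Eq. (4)).
    m_n, m_t = a_n(n_real), a_t(t_real)
    t_fake = m_n * g_n2t(n_real) + (1 - m_n) * n_real
    n_fake = m_t * g_t2n(t_real) + (1 - m_t) * t_real

    # Attention-guided adversarial terms (Eqs. (6)-(7)): fool D on the
    # masked fakes.
    l_gan = adv_loss(d_t(m_n * t_fake), True) + \
            adv_loss(d_n(m_t * n_fake), True)

    # Cycle-consistency (Eq. (8)): translate back and compare in L1.
    m_back_t, m_back_n = a_t(t_fake), a_n(n_fake)
    n_cyc = m_back_t * g_t2n(t_fake) + (1 - m_back_t) * t_fake
    t_cyc = m_back_n * g_n2t(n_fake) + (1 - m_back_n) * n_fake
    l_cyc = F.l1_loss(n_cyc, n_real) + F.l1_loss(t_cyc, t_real)

    # Semi-supervised attention loss (Eq. (5)): supervise A_T with the
    # tumor segmentation label.
    l_att = F.l1_loss(m_t, m_label)

    return lam_gan * l_gan + lam_cyc * l_cyc + l_att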
IV. EXPERIMENTS AND RESULTS

We present sets of experiments and results in this section. To evaluate the performance of the proposed CycleGAN-based data augmentation method, we employed a convolutional neural network with deep residual blocks (ResNet18) [22] to compare the classification results obtained using generated tumor images with the classification results obtained using real images. We implemented five models to generate tumor images, as described in Section IV-A2.

For the implementation of the tumor classification model ResNet18 and of the GAN-based data augmentation architecture, we used the PyTorch framework. All training processes were performed on an NVIDIA GeForce GTX 1080 Ti GPU.

A. Dataset Evaluation and Implementation Details

1) Classification: For brain tumor classification, we chose ResNet18 [22] because, among three popular neural network models [22]–[24], our initial experiments showed that ResNet18 took the shortest training time, achieved the best classification performance, and has the lowest number of parameters, making it potentially more portable and less prone to overfitting. In this study, we split the datasets into 70% training, 20% validation, and 10% test images.

We calculated the true positive rate (TPR) and the true negative rate (TNR) to measure the performance of our data augmentation method on the tumor classification task:

TPR = \frac{TP}{TP + FN}    (10)

TNR = \frac{TN}{TN + FP}    (11)

where T stands for a correct classification and, conversely, F stands for a wrong classification; P means the classification result is the tumor category, and N means the classification result is the normal category. So, TP means a tumor image is classified into the tumor category, TN means a normal image is classified into the normal category, FP means a normal image is classified into the tumor category, and FN means a tumor image is classified into the normal category.
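These two rates follow directly from the confusion-matrix counts; a minimal sketch:

def tpr_tnr(tp, fn, tn, fp):
    """Eq. (10) and Eq. (11): true positive rate and true negative rate."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical counts, for illustration only: 90 of 100 tumor slices and
# 180 of 200 normal slices classified correctly.
print(tpr_tnr(tp=90, fn=10, tn=180, fp=20))  # (0.9, 0.9)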
2) Baselines: We compare our model with the leading image-to-image translation model, CycleGAN [12], and with the extension of CycleGAN, the attention-guided CycleGAN [14]. For a fair comparison, we also add the spectral normalization method to those models.

3) Datasets: We use the BraTS dataset provided by Menze et al. [25] to evaluate our data augmentation method. The dataset contains a segmentation mask for each case, and for each case there are four modalities of MRI images: T1, T1c, T2, and FLAIR. There are 322 cases in the BraTS 2019 dataset, of which 20% are assigned as testing data and 10% as validation data. To test the performance of the proposed data augmentation scheme with limited data, we randomly selected one-eighth of the training data as the training set BraTS_S.

B. Evaluation of the Data Augmentation

1) Generated Images by the Proposed SAG-GAN: Fig. 3 illustrates examples of synthetic MR images produced by our data augmentation method. Observing the generated images and the attention maps learned by our model, we can see that our model successfully captures the T1c-specific texture and tumor appearance at the right position.

[Fig. 3 appears here: generated positive and negative samples on BraTS, BraTS_S, and UNS for PGGAN, MSG-GAN, the original images, CycleGAN, UAG-GAN, AGGAN, and ours.]
Fig. 3. Results of synthetic images.

2) Quantitative Results: Table I shows the performance of our data augmentation method on the image classification task. The results in Table I prove our hypothesis that adding generated samples can improve the classification performance.
TABLE I
COMPARISON WITH STATE-OF-THE-ART METHODS

                         Accuracy                       AUC
Method            BraTS    BraTS_S   UNS       BraTS    BraTS_S   UNS
w/o DA            0.9337   0.8334    0.6837    0.9612   0.9306    0.7940
OverSampling      0.9375   0.8418    0.7196    0.9643   0.9328    0.8072
UnderSampling     0.9369   0.8373    0.7285    0.9647   0.9314    0.8192
PGGAN [26]        0.9419   0.8500    0.7331    0.9670   0.9345    0.8199
MSG-GAN [27]      0.9410   0.8530    0.7354    0.9667   0.9350    0.8190
CycleGAN [12]     0.9430   0.8539    0.7387    0.9679   0.9354    0.8208
UAGGAN [14]       0.9467   0.8704    0.7634    0.9689   0.9465    0.8290
AGGAN [28]        0.9460   0.8587    0.7457    0.9682   0.9353    0.8226
Ours              0.9503   0.8759    0.7650    0.9697   0.9530    0.8297
Relative imp. %   +25.03%  +25.51%   +25.70%   +21.91%  +32.27%   +17.33%

V. CONCLUSIONS

This work focuses on generating abnormal images from normal images and recovering normal images from abnormal images with GANs. This GAN-based data augmentation method can enlarge small medical datasets and fill out the data distribution. While recent data augmentation methods for medical images can generate new abnormal samples, they also have some limitations. For example, previous works need masks to tell the generator where to generate the abnormal lesion. It is also hard for GAN-based models to generate large medical images; most generated abnormal samples are small (32 px × 32 px). The data augmentation method we propose can generate abnormal images at real medical image size (in this case, 240 px × 240 px). We expected significant improvements in the quality of the generated abnormal images from incorporating an attention module into both the generator and the discriminator. However, we found that the attention mapping was not robust, since the shape of the abnormal lesion changes between images, so we added a semi-supervised mechanism to stabilize the training procedure. The results show that this approach improves the robustness of the attention modules. Experimental results on three datasets demonstrate that our data augmentation method generates striking results with more convincing details than the state-of-the-art models.

There are several limitations to this work. One possible extension is to move the evaluation from the classification task to the segmentation task; data insufficiency occurs even more often in the field of tumor/lesion segmentation. In the future, we plan to extend our work to other medical domains that can benefit from generated abnormal images to improve training performance.

ACKNOWLEDGMENTS

This work was supported in part by the National Natural Science Foundation of China under grant 61906063, in part by the Natural Science Foundation of Tianjin, China, under grant 19JCQNJC00400, and in part by the Yuanguang Scholar Fund of Hebei University of Technology, China.

REFERENCES

[1] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," in Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.
[2] M. Mirza and S. Osindero, "Conditional generative adversarial nets," arXiv preprint arXiv:1411.1784, 2014.
[3] M. Arjovsky, S. Chintala, and L. Bottou, "Wasserstein generative adversarial networks," in International Conference on Machine Learning, 2017, pp. 214–223.
[4] A. Radford, L. Metz, and S. Chintala, "Unsupervised representation learning with deep convolutional generative adversarial networks," arXiv preprint arXiv:1511.06434, 2015.
[5] S. K. Lim, Y. Loo, N.-T. Tran, N.-M. Cheung, G. Roig, and Y. Elovici, "DOPING: Generative data augmentation for unsupervised anomaly detection with GAN," in 2018 IEEE International Conference on Data Mining (ICDM). IEEE, 2018, pp. 1122–1127.
[6] C. Bowles, L. Chen, R. Guerrero, P. Bentley, R. Gunn, A. Hammers, D. A. Dickie, M. V. Hernández, J. Wardlaw, and D. Rueckert, "GAN augmentation: Augmenting training data using generative adversarial networks," arXiv preprint arXiv:1810.10863, 2018.
[7] A. Madani, M. Moradi, A. Karargyris, and T. Syeda-Mahmood, "Chest x-ray generation and data augmentation for cardiovascular abnormality classification," in Medical Imaging 2018: Image Processing, vol. 10574. International Society for Optics and Photonics, 2018, p. 105741M.
[8] A. Ben-Cohen, E. Klang, S. P. Raskin, M. M. Amitai, and H. Greenspan, "Virtual PET images from CT data using deep convolutional networks: Initial results," in International Workshop on Simulation and Synthesis in Medical Imaging. Springer, 2017, pp. 49–57.
[9] Y. Pan, M. Liu, C. Lian, T. Zhou, Y. Xia, and D. Shen, "Synthesizing missing PET from MRI with cycle-consistent generative adversarial networks for Alzheimer's disease diagnosis," in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2018, pp. 455–463.
[10] H.-C. Shin, N. A. Tenenholtz, J. K. Rogers, C. G. Schwarz, M. L. Senjem, J. L. Gunter, K. P. Andriole, and M. Michalski, "Medical image synthesis for data augmentation and anonymization using generative adversarial networks," in International Workshop on Simulation and Synthesis in Medical Imaging. Springer, 2018, pp. 1–11.
[11] C. Han, K. Murao, S. Satoh, and H. Nakayama, "Learning more with less: GAN-based medical image augmentation," Medical Imaging Technology, vol. 37, no. 3, pp. 137–142, 2019.
[12] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2223–2232.
[13] T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida, "Spectral normalization for generative adversarial networks," arXiv preprint arXiv:1802.05957, 2018.
[14] Y. A. Mejjati, C. Richardt, J. Tompkin, D. Cosker, and K. I. Kim, "Unsupervised attention-guided image-to-image translation," in Advances in Neural Information Processing Systems, 2018, pp. 3693–3703.
[15] X. Chen, C. Xu, X. Yang, and D. Tao, "Attention-GAN for object transfiguration in wild images," in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 164–180.
[16] D. Kastaniotis, I. Ntinou, D. Tsourounis, G. Economou, and S. Fotopoulos, "Attention-aware generative adversarial networks (ATA-GANs)," in 2018 IEEE 13th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP). IEEE, 2018, pp. 1–5.
[17] H. Zhang, I. Goodfellow, D. Metaxas, and A. Odena, "Self-attention generative adversarial networks," arXiv preprint arXiv:1805.08318, 2018.
[18] X. Wang, R. Girshick, A. Gupta, and K. He, "Non-local neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7794–7803.
[19] X. Liang, H. Zhang, and E. P. Xing, "Generative semantic manipulation with contrasting GAN," arXiv preprint arXiv:1708.00315, 2017.
[20] R. Sun, C. Huang, J. Shi, and L. Ma, "Mask-aware photorealistic face attribute manipulation," arXiv preprint arXiv:1804.08882, 2018.
[21] J. Hu, L. Shen, and G. Sun, "Squeeze-and-excitation networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7132–7141.
[22] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[23] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[24] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[25] B. H. Menze, A. Jakab, S. Bauer, J. Kalpathy-Cramer, K. Farahani, J. Kirby, Y. Burren, N. Porz, J. Slotboom, R. Wiest et al., "The multimodal brain tumor image segmentation benchmark (BRATS)," IEEE Transactions on Medical Imaging, vol. 34, no. 10, pp. 1993–2024, 2014.
[26] T. Karras, T. Aila, S. Laine, and J. Lehtinen, "Progressive growing of GANs for improved quality, stability, and variation," arXiv preprint arXiv:1710.10196, 2017.
[27] A. Karnewar and O. Wang, "MSG-GAN: Multi-scale gradient GAN for stable image synthesis," arXiv preprint arXiv:1903.06048, 2019.
[28] H. Tang, D. Xu, N. Sebe, and Y. Yan, "Attention-guided generative adversarial networks for unsupervised image-to-image translation," in 2019 International Joint Conference on Neural Networks (IJCNN). IEEE, 2019, pp. 1–8.
