Object Detection Based On An Adaptive Attention Mechanism: Wei Li, Kai Liu, Lizhe Zhang & Fei Cheng
Object detection is an important component of computer vision. Most of the recent successful object
detection methods are based on convolutional neural networks (CNNs). To improve the performance
of these networks, researchers have designed many different architectures. They found that CNN
performance benefits from carefully increasing the depth and width of the network structure with respect to
the spatial dimension. Some researchers have exploited the cardinality dimension. Others have found
that skip and dense connections were also of benefit to performance. Recently, attention mechanisms
on the channel dimension have gained popularity with researchers. Global average pooling is used in
SENet to generate the input feature vector of the channel-wise attention unit. In this work, we argue
that channel-wise attention can benefit from both global average pooling and global max pooling. We
designed three novel attention units, namely, an adaptive channel-wise attention unit, an adaptive
spatial-wise attention unit and an adaptive domain attention unit, to improve the performance of a
CNN. Instead of concatenating the two attention vectors generated by the two channel-wise attention
sub-units, we weight the two attention vectors based on the output data of those sub-units. We
integrated the proposed mechanism with the YOLOv3 and
MobileNetv2 framework and tested the proposed network on the KITTI and Pascal VOC datasets. The
experimental results show that YOLOv3 with the proposed attention mechanism outperforms the
original YOLOv3 by mAP values of 2.9% and 1.2% on the KITTI and Pascal VOC datasets, respectively.
MobileNetv2 with the proposed attention mechanism outperforms the original MobileNetv2 by an mAP
value of 1.7% on the Pascal VOC dataset.
Object detection is a basic task in computer vision. Many other computer vision tasks, such as object tracking
and image segmentation, can benefit from fast and accurate object detection results. Over the past few decades,
many object detection methods have been proposed. Early methods mainly deployed handcrafted features and
shallow machine learning models. These methods were vulnerable to overfitting and often involved a large
amount of computation. Recently, convolutional neural networks have been used to learn feature representations
from images automatically and have become the dominant approach for object detection.
Convolutional neural networks (CNNs) progressively extract semantic information from input images
through convolution layers and discard pixels on the feature maps that are less informative, forcing the
networks to focus on the more informative pixels along the spatial dimension.
Learning from previous works, the performance of CNNs can be boosted by strengthening the representative
ability of different dimensions. VGG1 and ResNet2 have more layers than previous CNNs such as
AlexNet3. They encode higher-dimensional features so that the classifier can differentiate positive and negative
samples more easily. DenseNet4 uses dense connections to integrate information from different layers, which
also boosts performance. ResNeXt5 and Xception6 also lift performance effectively by expanding the
cardinality dimension. Some works7–9 exploit an attention mechanism on the channel and/or spatial
dimension to improve the performance of CNNs.
Apart from the design schemes of CNNs mentioned above, we aim to improve their representative ability by
strengthening other dimensions, namely, channel-wise attention, spatial-wise attention and domain attention in
a fully data-driven manner. Channel-wise attention performs feature recalibration with respect to the channel
dimension. On the one hand, we obtain channel-wise attention tensors from both global max pooling and global
School of Computer Science and Technology, Xidian University, Xi’an 710071, China. *email: [email protected]
average pooling of the input feature maps. On the other hand, we obtain spatial-wise attention tensors from fully
convolutional layers. Because the proposed attention units are fully data driven, we call them adaptive attention
units. Through adaptive attention, a CNN learns to use global and local information to selectively emphasize
informative features and suppress less useful ones.
Previous works10,11 have proposed ‘adaptive attention’. Lu et al.10 proposed a novel adaptive attention model
(AAM) with a visual sentinel for image captioning. Recent neural models for image captioning usually employ an
encoder–decoder framework with an attention mechanism. In such frameworks, a CNN-based image encoder is
used to extract feature vectors for an input image, and an RNN-based caption decoder is used to generate caption
words recurrently. In contrast to attention-based encoder–decoder frameworks that rely on only a visual
signal or only a language model, the AAM can automatically decide when to look at the image (visual signal) and
when to rely on the language model. Based on the AAM, their method significantly outperforms other state-of-
the-art approaches on COCO and Flickr30k. Huang et al.11 proposed an attention model, namely, adaptive
attention time (AAT), to automatically align the source and the target for image captioning. In contrast to
attention mechanisms that assume a one-to-one or one-to-fixed-number mapping from source image
regions to target caption words, which rarely holds in practice, AAT allows the framework to learn how many attention
steps to take before outputting a caption word at each decoding step (a one-to-many mapping). Experiments also show
that AAT outperforms previously published image captioning models. In the current paper, we
propose an adaptive domain attention unit. It automatically weights the two kinds of channel-wise attention
generated by the two-branch structure without predefined parameters. Because the domain attention generated
by the domain attention unit and the output channel attention depend only on the features of the input data, we
call the output of the proposed module the adaptive channel-wise attention.
YOLOv312 is a CNN with high inference speed and detection accuracy that uses DarkNet53 as its backbone
network. It contains many 1 × 1 kernels to extract important information and strengthen
the nonlinear representation ability. It also utilizes residual connections and multi-scale detectors, which greatly
improve its performance. MobileNetv213 is built upon inverted residual modules (IRMs). Each IRM contains
a depthwise-separable convolution, a point-wise convolution, and an inverted residual connection. Detection
models built upon MobileNetv2 can run extremely fast. In this work, we test the proposed mechanisms on YOLOv3
and MobileNetv2.
The contributions of our work are as follows:
1. We designed three novel adaptive attention units: an adaptive channel-wise attention unit, an adaptive
spatial-wise attention unit and an adaptive domain attention unit.
2. The proposed adaptive attention mechanism is fully data driven, lightweight and easy to apply.
3. We applied the adaptive attention mechanism to YOLOv3 and MobileNetv2. Experiments performed on
the KITTI and Pascal VOC datasets show that the proposed model achieves better performance than the
original YOLOv3 and MobileNetv2 implementations.
Related works
Many CNNs are applied to object detection tasks. These CNNs can be roughly divided into two-stage methods
and single-stage methods. Two-stage methods consist of a stage for region proposal generation and another stage
for positive sample classification and localization. Single-stage methods treat the background as the (c + 1)th class
(c is the number of positive classes) and resolve the object detection task as a regression problem. Single-stage
methods outperform two-stage methods in terms of inference speed by a large margin.
Girshick et al. proposed the R-CNN14. The R-CNN divides the object detection task into two stages. In the first
stage, a selective search method is used to generate thousands of region proposals. In the second stage, the
classification and bounding box regression tasks on these region proposals are performed simultaneously. The
R-CNN was the first work to use a CNN to solve the object detection task. Despite its pioneering aspects, the
R-CNN is time consuming because it processes each region proposal independently. The Fast R-CNN15
remedies this through shared convolution computation. It processes each input image as a whole and obtains the
feature maps of the entire input. ROI pooling is used to obtain same-sized feature maps for each proposal so that
they can be fed into the subsequent fully connected layers. Fast R-CNN is 9 times faster than R-CNN during
training and 213 times faster during testing. Faster R-CNN16 improves the performance of Fast R-CNN by
replacing the selective search module with a region proposal network (RPN). The RPN generates anchor boxes
of different sizes and aspect ratios, which provide the initial positions and scales of the predicted bounding boxes.
The RPN outputs the regressed anchor boxes and a tag indicating whether each box is a positive sample. These
outputs are fed into subsequent networks that act as a multi-class classifier and a bounding box regressor. Faster
R-CNN can be trained end-to-end and has better performance than its predecessors. Many two-stage methods
have been proposed after Faster R-CNN, such as MS-CNN17 and Cascade R-CNN18, many of which are based
on the aforementioned models.
SSD19, YOLOv120 and their derived versions are representative single-stage detectors. SSD is a
fully convolutional network. VGG16 is applied as its backbone network for feature extraction, and it deploys
multi-scale features for object detection. It can be trained end-to-end and had a great impact on succeeding
works. YOLOv1 divides the input image into a 7 × 7 grid. If the centre of an object falls into one of the grid cells,
that cell is responsible for detecting the object. This means that if the centres of two objects fall into
the same cell, only one object can be correctly detected. Furthermore, the last two layers of YOLOv1 are fully
connected layers. As a result, the inputs of YOLOv1 must be resized to the same scale, making YOLOv1 less
flexible. YOLOv221 remedies these defects by constructing a fully convolutional network and introducing
the anchor mechanism. YOLOv2 also exploits the K-means algorithm to select initial sizes and aspect ratios of
anchor boxes. While it is still fast, its detection precision and performance on small objects remain to
be improved. YOLOv3 integrates many advantageous CNN design concepts, such as residual connections,
1 × 1 convolution kernels, and detectors with multi-scale features, which balance precision and speed.
Many single-stage methods have been proposed after SSD, YOLO and their derived versions,
such as DSSD, RSSD and Tiny YOLO, with improved accuracy or inference speed. MobileNets13,22,23
run extremely fast with a slight decrease in accuracy. Building on these design concepts, we improve YOLOv3
through a novel, fully data-driven attention mechanism.
Hu et al. proposed the squeeze-and-excitation (SE) module7. It is a lightweight plug-in module that allows
the network to perform feature recalibration, through which the network learns to use global information to
selectively emphasize informative features and suppress less useful ones. In their work, global information is
obtained by a global average pooling operation. Woo et al. proposed the convolutional block attention
module (CBAM)8. They gather global information through both global average pooling and global max pooling,
because global max pooling gathers finer channel-wise attention. Moreover, they also devised a spatial attention
module based on the inter-spatial relationships of features. Different from channel-wise attention, which focuses
on ‘what’ to attend to, spatial attention focuses on ‘where’ the informative parts are. CBAM applies channel-wise
attention and spatial-wise attention sequentially. Park et al. also exploited both channel and spatial attention
and proposed the bottleneck attention module (BAM)9, which applies channel-wise attention and spatial-wise
attention in parallel.
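To make the recalibration concrete, the following is a minimal PyTorch sketch of an SE-style channel attention block as described above; the reduction ratio of 16 is the value commonly used in SENet and is an assumption here rather than a detail taken from this paper.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """SE-style channel recalibration: squeeze with global average pooling, excite with two FC layers."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)  # global average pooling
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.squeeze(x).view(b, c)          # (B, C) channel descriptor
        w = self.excite(w).view(b, c, 1, 1)     # per-channel weights in (0, 1)
        return x * w                            # recalibrated feature maps
```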
Despite the good design concepts of the above attention modules8,9, they weight global average pooling and global max pooling equally.
Given an input feature map, global average pooling tends to identify the discriminative regions of an object, while
small objects may benefit more from global max pooling, which identifies the global maximum values. Given a set of
input images, the distribution of object sizes may not be uniform; thus, equally weighting global average
pooling and global max pooling may have a negative impact on the detection of both large and small objects.
Based on the above analysis, three novel and fully data-driven attention units, namely, a channel-wise
attention unit, a domain attention unit and a spatial-wise attention unit, are proposed. The proposed domain
attention unit adaptively weights the two attention tensors obtained by the adaptive channel-wise attention units.
For adaptive spatial-wise attention, because the lower layers of a network contain abundant positional information
but little semantic information, and the higher layers contain abundant semantic information but little positional
information, we apply spatial attention only to several lower layers of a detection network and apply channel
attention only to several higher layers of a detection network. The key novelty of our method lies in the domain
attention unit. Notably, our domain attention differs from other models with the same name. Our domain
attention inherits the merits of both global average pooling and global max pooling; the inputs of our domain
attention unit are the outputs of the two sibling squeeze-and-excitation modules. The domain attention in24 is
used for domain adaptation and consists of a fully connected layer and a nonlinear layer. Its input is the feature
vector produced by a global average pooling layer, which is also fed into the SE unit in their work.
Method description
In this section, the proposed network structure is introduced first. Then, the adaptive channel-wise attention,
adaptive domain attention and adaptive spatial-wise attention are described. As shown in Fig. 1, the above modules are
integrated into YOLOv3. Spatial attention modules reside after the first ‘Res’ block in the ‘ARes*N’ blocks, as these
lower layers contain abundant positional information but little semantic information. Channel attention units
and domain attention units reside separately after the remaining seven ‘Res’ blocks. Channel attention units and
domain attention units also reside in each ‘ACBL’ module after each ‘CBL’ module, as these higher layers contain
more semantic information but less positional information. Note that the domain attention units reside inside the
channel attention units. The detailed structure of the adaptive channel attention unit is shown in Fig. 2.
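As a rough illustration of this structure, the sketch below combines two sibling SE-style sub-units, one driven by global average pooling and one by global max pooling, with a domain attention gate that weights the two attention vectors using only their own outputs. The reduction ratio, the softmax gate and the layer sizes are our assumptions; the exact layout is specified only in Fig. 2.

```python
import torch
import torch.nn as nn

class AdaptiveChannelAttention(nn.Module):
    """Two-branch channel attention (GAP and GMP) fused by a data-driven domain attention gate."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()

        def se_branch() -> nn.Sequential:
            return nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
                nn.Sigmoid(),
            )

        self.avg_branch = se_branch()   # channel attention sub-unit fed by global average pooling
        self.max_branch = se_branch()   # channel attention sub-unit fed by global max pooling
        # Domain attention: weights the two attention vectors based on their own outputs.
        self.domain_gate = nn.Sequential(nn.Linear(2 * channels, 2), nn.Softmax(dim=1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        a_avg = self.avg_branch(x.mean(dim=(2, 3)))   # attention vector of the GAP branch, (B, C)
        a_max = self.max_branch(x.amax(dim=(2, 3)))   # attention vector of the GMP branch, (B, C)
        w = self.domain_gate(torch.cat([a_avg, a_max], dim=1))  # (B, 2) adaptive domain weights
        attn = w[:, 0:1] * a_avg + w[:, 1:2] * a_max  # weighted combination, not concatenation
        return x * attn.view(b, c, 1, 1)              # recalibrate the input feature maps
```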
Spatial attention. Different layers of a CNN contain spatial features of objects at different levels.
Lower layers of the CNN mainly contain edge and corner features. As the layers deepen, they contain higher-
dimensional features of objects, such as features of object parts or whole objects. As a result, it is important
to focus computational resources on the most informative positions with respect to the spatial dimensions.
Generally, channel-wise attention resolves what to focus on, whereas spatial attention resolves where to focus. We
designed a spatial-wise attention module to improve the performance of YOLOv3. The proposed spatial
attention module is also lightweight and fully data driven.
Different from9, which generates spatial-wise attention through both global max pooling and global average
pooling across the channel dimension, we generate spatial-wise attention through fully convolutional layers in
a learned manner. The proposed spatial attention module is shown in Fig. 4. It produces a spatial attention map
to recalibrate the features at different spatial locations. Because the spatial attention unit is composed of a 1 × 1
convolution layer and a 3 × 3 convolution layer, the relative positions and receptive fields of pixels on the spatial
attention map are the same as those of the output of the backbone layers. As a result, the pixels of the spatial
attention map only weight the pixels at the same locations of the output feature maps. The 1 × 1 convolution layer
is used to squeeze the feature map across the channel dimension; it also prevents backpropagation from directly
influencing the backbone network. The 3 × 3 convolution layer is used to excite local area responses to amplify
their effect. The spatial-wise attention is generated by the following formula:
Xsp = f3(f1(X))    (4)
where f1 and f3 are the 1 × 1 and 3 × 3 convolution layers with nonlinear functions, respectively.
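A minimal PyTorch sketch of Eq. (4) follows; the number of squeezed channels and the sigmoid used to bound the attention map are assumptions, since the text specifies only the 1 × 1 and 3 × 3 convolution layers with nonlinear functions.

```python
import torch
import torch.nn as nn

class AdaptiveSpatialAttention(nn.Module):
    """Spatial attention map from a 1x1 channel-squeeze convolution and a 3x3 convolution (Eq. 4)."""

    def __init__(self, channels: int, squeezed: int = 16):
        super().__init__()
        self.f1 = nn.Sequential(                              # 1x1 conv: squeeze across the channel dimension
            nn.Conv2d(channels, squeezed, kernel_size=1),
            nn.ReLU(inplace=True),
        )
        self.f3 = nn.Sequential(                              # 3x3 conv: excite local area responses
            nn.Conv2d(squeezed, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x_sp = self.f3(self.f1(x))   # (B, 1, H, W) spatial attention map, same resolution as the input
        return x * x_sp              # each pixel is weighted only by the attention value at its own location
```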
Integrating the attention modules with YOLOv3. YOLOv3 integrates many advantageous CNN design
concepts, such as residual connections, 1 × 1 convolution kernels, and detectors with multi-scale
features. We improve the performance of YOLOv3 by integrating it with the adaptive channel-wise attention,
domain attention and spatial-wise attention proposed in the previous two subsections.
The proposed attention modules are easily implemented in a plug-in manner. We apply spatial attention
only to the lower layers of several modules of YOLOv3, as the spatial dimension of the higher layers is small and
thus they contain little positional information. On the other hand, we apply channel attention only to the higher
layers, as the channel dimension of the lower layers is small and thus they contain little semantic information.
Furthermore, modern CNN-based detectors rely largely on transfer learning. We do not modify the first few layers so that
we can use the pre-trained DarkNet53 model to initialize the first few layers of the proposed network during the training stage.
According to the above analysis, we designed a novel network based on the YOLOv3 model, shown
in Fig. 1. As shown in the figure, adaptive channel-wise attention modules reside in both the ‘ACBL’ modules and
the ‘ARes*N’ blocks, whereas spatial-wise attention modules reside only in the ‘ARes*N’ blocks. The detailed
intra-block connections of ‘ACBL’ and ‘ARes*N’ are shown in Fig. 5. In the next section, we introduce how
to train and evaluate the proposed model in detail.
Method Car Pedestrian Cyclist mAP Inference time (ms) Model size (M) Gflops
YOLOv3 (original) 91.1 74.6 79.0 81.6 47 246.3 112.6
YOLOv3+SE (GMP) 91.4 75.1 81.6 82.7 49 251.6 115.3
YOLOv3+SE (GAP) 91.2 75.5 82.2 83.0 49 251.6 115.3
YOLOv3+SE (GAP and GMP) 91.7 75.6 82.8 83.4 50 257.0 117.2
YOLOv3+ACA (ours) 92.3 77.0 83.4 84.2 50 262.3 119.9
YOLOv3+ACA+ASA (ours) 92.6 77.4 83.6 84.5 51 262.5 120.0
Table 1. Comparison of the proposed model with recent works on KITTI dataset.
Experiment on the KITTI dataset. The KITTI object detection dataset contains 7381 training images with
ground truth labels. It contains 3 categories of objects: cars, pedestrians and cyclists. The objects to be detected
are of various sizes and include many small samples, which makes detection a challenging task. The labels of the KITTI
testing set are not publicly available, so we split the training images in half into a training set and a validation set.
In all the experiments, the training set was used for training, and the validation set was used for testing.
For a fair comparison, we used the same training set and validation set for all experiments.
For the experiment on the KITTI dataset, the training is divided into two stages. In the first stage, the backbone
network is frozen, and only the weights of the network after the conv 52 layer are updated. In the second
stage, the whole network is updated. The Adam optimizer with a default initial learning rate of 0.001 is
used and is re-initialised after the first stage is finished. Both the first stage and the second stage are trained for
40 epochs. We scale the input images to various sizes, such as 320 × 320 and 352 × 352, in the training
stage. In the evaluation stage, the input images are scaled to 544 × 544 to achieve better performance. We
randomly split the 7381 training images in half into a training set and a validation set. The mean average precision
(mAP) results are evaluated on the validation set. The batch size is 6. The weights of the backbone layers of the
model are initialized from the DarkNet53 model pre-trained on ImageNet. We stop training after the 200th epoch.
Data augmentation techniques such as random cropping and flipping are adopted to avoid overfitting. The model
was pre-trained on the COCO dataset27 and fine-tuned on the Pascal VOC dataset.
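The two-stage schedule can be sketched as follows, assuming a PyTorch-style model with a backbone sub-module and a train_one_epoch helper; these names, and treating the conv 52 boundary simply as the backbone/head split, are placeholders rather than details of the authors' released code.

```python
import torch

def two_stage_training(model, train_loader, train_one_epoch, epochs_per_stage=40):
    """Stage 1: train with a frozen backbone; stage 2: fine-tune the whole network."""
    # Stage 1: freeze the backbone so only the layers after it are updated.
    for p in model.backbone.parameters():
        p.requires_grad = False
    optimizer = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=1e-3)
    for _ in range(epochs_per_stage):
        train_one_epoch(model, train_loader, optimizer)

    # Stage 2: unfreeze everything and re-initialise the optimizer.
    for p in model.parameters():
        p.requires_grad = True
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs_per_stage):
        train_one_epoch(model, train_loader, optimizer)
```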
The performance of the models is evaluated based on mAP, inference time, model size and Gflops. We re-
evaluated YOLOv3 and YOLOv3 with SE units for a fair comparison. The other models were trained using the default
settings in the official code of each algorithm. We compared the performance of the original YOLOv3, YOLOv3
with SE modules, and our proposed model. To test the effect of each part, we conducted a control experiment.
First, we test the original YOLOv3 model on the KITTI dataset. Second, we use only the global average pooling
scheme or only the global max pooling scheme to test the effect of each branch. Third, we test the two-branch
structure without adaptive domain attention; this is a special case of adaptive domain attention that weights
the output of each branch of the two-branch structure equally. Fourth, we test YOLOv3 with the proposed adaptive
channel-wise attention module. Last, we test YOLOv3 with both the adaptive channel-wise attention module and
the adaptive spatial-wise attention module. The performance of each configuration is shown in Table 1.
As shown in the table, both branch structures and the adaptive channel-wise attention have positive effects on
the original YOLOv3 model, and the proposed adaptive attention module achieves better performance than the
other methods in the table. The proposed model improves the mAP value by 2.9%, while YOLOv3 with the SE
unit improves the mAP value by 1.4%. The proposed model outperforms the other ones with a small increase in
inference time. In addition, we compare the number of trainable parameters and GFLOPs of the models in Table 2.
GFLOPs for each model are measured with input images of size 544 × 544. As shown in the table, the proposed model
achieves better performance with only a small increase in model size and computational complexity. We believe the
performance improvement is mainly due to the innovative architecture.
We compare the proposed model with recent works (Gaussian YOLOv328, RefineDet29, RFBNet30) and a
two-stage detection model (MS-CNN31) in Table 2.
As shown in the table, the proposed model achieves better performance than most of the other models in
the table. MS-CNN achieves a similar performance to the proposed model, but it runs much slower than the
proposed model.
Experiment on the Pascal VOC dataset. The Pascal VOC dataset covers three computer vision tasks:
classification, detection and segmentation. The object detection task has 20 classes to be detected,
such as people, birds and cats. The VOC 07+12 train and val sets are employed for training, and the VOC 07 test
set is employed for evaluation. In this experiment, the models were pre-trained on the COCO dataset and fine-
tuned on the Pascal VOC dataset, which needs only a few epochs for the training to converge.
The training is divided into two stages. In the first stage, the backbone network is frozen, and only the weights of
the network after the conv 52 layer are updated. In the second stage, the whole network is updated. Both
the first stage and the second stage are trained for 20 epochs. We scale the input images to various sizes,
such as 320 × 320 and 352 × 352, in the training stage. In the evaluation stage, the input images are scaled to
544 × 544. The loss curve of each model is shown in Fig. 6. The red and blue curves depict the
training losses of the proposed model and the original YOLOv3 model, respectively. Figure 6a shows the total loss
curve, which is the summation of the other three losses. As shown in Fig. 6b, the red curve and blue curve almost
coincide with each other, denoting that the proposed method does not much affect the foreground and background
classification. Given a set of anchors with a positive tag, Fig. 6c and d show that the initial prob_loss and
giou_loss of the proposed model are larger than those of the original YOLOv3. This is because the adaptive attention
units add new weights to the model. However, as training proceeds, the red curves descend faster than
the blue curves. Thus, the adaptive attention steers the model towards more informative weights, which is why
YOLOv3 with the adaptive attention mechanism achieves better performance. The PR curves for each class are
shown in Fig. 7. As shown in the sub-figures, in most cases the performance of the original YOLOv3 is worse
than that of the proposed model. Finally, we list the mAP and inference time of the original YOLOv3 and the
proposed model in Table 3.
As shown in the table, the proposed model with adaptive attention modules achieves a better performance
than YOLOv3 with a small increase in inference time. This research sheds new light on the design of attention
mechanism modules.
Experiment on MobileNetv2 with a modified SSD detector. We evaluate the proposed adaptive
attention on MobileNetv2 with a modified SSD detector (Fig. 8). MobileNetv2 is built upon inverted residual
modules (IRMs) (Fig. 9). We added four additional convolution blocks behind the truncated MobileNetv2
backbone network. Each of the four additional convolution blocks is built upon two sequential IRMs. The modified
SSD detector heads are connected behind IRM5_3, the last IRM of the truncated MobileNetv2, and each of
the four additional convolution blocks, giving six detector heads in total. We apply the adaptive spatial-wise
attention module to the last IRM of the truncated MobileNetv2 backbone, which has large feature maps in the
spatial dimension (orange cube in Fig. 8), and apply the adaptive channel-wise attention module to the following
four IRMs, which have more feature map channels (blue cube in Fig. 8).
Figure 8. MobileNetv2 with modified SSD detector model. ASA is adaptive spatial-wise attention. ACA is
adaptive channel-wise attention.
Figure 9. Inverted residual module. ASA or ACA is used to recalibrate the feature maps (blue cube) generated
by the point-wise convolution layer within the IRM module.
Table 4. MobileNetv2 with modified SSD detector evaluation results on the PASCAL VOC dataset.
In the experiment, the adaptive attention module is used to recalibrate the feature maps generated by the point-wise convolution layer within the IRM module (Fig. 9).
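The placement of an attention unit inside an IRM can be sketched as follows; the expansion factor of 6 and the stride-1, shape-preserving layout are assumptions, and AdaptiveChannelAttention/AdaptiveSpatialAttention refer to the earlier sketches rather than the authors' released implementation.

```python
import torch.nn as nn

class IRMWithAttention(nn.Module):
    """Inverted residual module whose point-wise projection output is recalibrated by ACA or ASA."""

    def __init__(self, in_ch: int, out_ch: int, attention: nn.Module, expand: int = 6):
        super().__init__()
        hidden = in_ch * expand
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, hidden, 1, bias=False), nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),  # depthwise convolution
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, out_ch, 1, bias=False), nn.BatchNorm2d(out_ch),    # linear point-wise projection
        )
        self.attention = attention           # an attention module built for out_ch channels
        self.use_residual = in_ch == out_ch  # inverted residual connection when shapes match

    def forward(self, x):
        y = self.attention(self.block(x))    # recalibrate the point-wise convolution output
        return x + y if self.use_residual else y
```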
We test the original model and the model with the adaptive attention modules on the Pascal VOC dataset.
The experimental results are shown in Table 4. We use mAP to evaluate both models. As shown in the table, the
mAP value of the proposed model is 1.7% higher than that of the original model.
Conclusion
In this paper, a novel adaptive attention mechanism was proposed to build attention units that are fully
data driven. Based on this principle, three kinds of attention units, namely, an adaptive channel-wise attention
unit, an adaptive domain attention unit and an adaptive spatial-wise attention unit, were proposed. They are
both lightweight and easy to apply. We applied these adaptive attention units to the YOLOv3 and MobileNetv2
architectures in a plug-in manner. The proposed model was evaluated on the KITTI and Pascal VOC datasets.
The experimental results show that the performance was improved with a small increase in inference time
compared with the original YOLOv3 and MobileNetv2 architectures. We believe the performance improvements
are mainly due to the innovative architecture, which addresses the issues discussed in the introduction.
In the future, we plan to apply the proposed method to other computer vision tasks, such as semantic
segmentation.
References
1. Karen, S., & Andrew, Z. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv:1409.1556. (2014)
2. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Computer Society
Conference on Computer Vision and Pattern Recognition (2015).
3. Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf.
Process. Syst. 25, 84–90 (2012).
4. Huang, G., Liu, Z., Laurens, V. D. M. & Weinberger, K. Q. Densely connected convolutional networks. In Proceedings of the IEEE
Computer Society Conference on Computer Vision and Pattern Recognition (2017).
5. Xie, S., Girshick, R., Dollár, P., Tu, Z., & He, K. Aggregated residual transformations for deep neural networks. In Proceedings of
the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2017).
6. François, C. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Computer Society Confer-
ence on Computer Vision and Pattern Recognition (2016).
7. Hu, J., Shen, L., Albanie, S., Sun, G. & Wu, E. Squeeze-and-excitation networks. In IEEE Transactions on Pattern Analysis and
Machine Intelligence (2018).
8. Woo, S., Park, J., Lee J., & Kweon, I. CBAM: Convolutional block attention module. In European Conference on Computer Vision
(2018).
9. Park, J., Woo, S., Lee J., & Kweon, I. BAM: Bottleneck attention module. In Proceedings of the IEEE Computer Society Conference
on Computer Vision and Pattern Recognition (2018).
10. Lu, J., Xiong, C.M., Parikh, D., & Socher, R. Knowing when to look: Adaptive attention via a visual sentinel for image captioning.
In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2017).
11. Huang, L., Wang, W.M., Xia, Y.X., & Chen, J. Adaptively aligned image captioning via adaptive attention time. NeurIPS (2019).
12. Redmon, J., & Farhadi, A. YOLOv3: An incremental improvement. In Proceedings of the IEEE Computer Society Conference on
Computer Vision and Pattern Recognition (2018).
13. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceed-
ings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2018).
14. Girshick, R., Donahue, J., Darrell, T., & Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation.
In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2014).
15. Girshick, R. Fast R-CNN. In IEEE International Conference on Computer Vision (2015).
16. Ren, S., He, K., Girshick, R. & Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Pro-
ceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2015).
17. Cai, Z., Fan, Q., Feris, R. S., & Vasconcelos, N. A Unified multi-scale deep convolutional neural network for fast object detection.
In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2016).
18. Cai, Z., & Vasconcelos, N., Cascade R-CNN: Delving into high quality object detection. In Proceedings of the IEEE Computer Society
Conference on Computer Vision and Pattern Recognition (2018).
19. Liu, W. et al. SSD: Single shot MultiBox detector. In European Conference on Computer Vision (2016).
20. Redmon, J., Divvala, S., Girshick, R. & Farhadi, A. You only look once: Unified, real-time object detection. In: Proceedings of the
IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2016).
21. Redmon, J. & Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Computer Society Conference on Computer
Vision and Pattern Recognition (2016).
22. Howard, A., et al. MobileNets: Efficient convolutional neural networks for mobile vision applications. In Proceedings of the IEEE
Computer Society Conference on Computer Vision and Pattern Recognition (2017).
23. Howard, A., et al. Searching for MobileNetV3. In Proceedings of the IEEE Computer Society Conference on Computer Vision and
Pattern Recognition (2019).
24. Wang, X., Cai, Z., Gao, D. & Vasconcelos, N. Towards universal object detection by domain attention. In Proceedings of the
IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2019).
25. Geiger, A., Lenz, P. & Urtasun, R. Are we ready for autonomous driving? The kitti vision benchmark suite. In Proceedings of the
IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2012).
26. Everingham, M. & Winn, J. The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Development Kit (2006).
27. Lin, T. Y. et al. Microsoft COCO: Common objects in context. In European Conference on Computer Vision (2015).
28. Choi, J., Chun, D., Kim, H., & Lee, H.J. Gaussian YOLOv3: An accurate and fast object detector using localization uncertainty for
autonomous driving. In IEEE International Conference on Computer Vision (2019).
29. Zhang, S., Wen, L., Bian X., Lei, Z., & Li, S.Z. Single-shot refinement neural network for object detection. In IEEE International
Conference on Computer Vision (2017).
30. Liu, S., et al. Receptive field block net for accurate and fast object detection. In European Conference on Computer Vision (2018).
31. Cai, Z., Fan, Q., Feris, R.S., & Vasconcelos, N. A unified multi-scale deep convolutional neural network for fast object detection.
In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2016).
32. Wu, B., Iandola, F., Jin, P., & Keutzer, K. SqueezeDet: Unified, small, low power fully convolutional neural networks for real-time
object detection for autonomous driving. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern
Recognition (2017).
Acknowledgements
This research work is supported by the National Natural Science Foundation of China under Grant Nos. 61571345
and 91538101, the Research on the Development Strategy of Intelligent Society under Grant No. RW190411, the
Huawei Innovation Research Program under Grant No. 2017050310, the Space Telemetry and Control Innovation
Funding under Grant No. 201705B, and the Key Research and Development Plan of Jiangxi Province under Grant
No. 20181ACE50029. The authors would like to thank the reviewers for their helpful comments and criticism.
Author contributions
W.L. wrote the main manuscript text. K.L. and W.L. conceived and designed the experiments. L.Z.Z. prepared
Figs. 1, 2, 3, 4, 5, 6, 7, 8, 9. F.C. contributed to interpretation of the fundamental theories. All authors discussed
the issues and exchanged views on the manuscript.
Competing interests
The authors declare no competing interests.
Additional information
Correspondence and requests for materials should be addressed to K.L.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license, and indicate if changes were made. The images or other third party material in this
article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons license and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.