4 September 2024 Attention-injective scale aggregation network for crowd counting
Haojie Zou, Yingchun Kuang, Jianqiang Luo, Mingwei Yao, Haoyu Zhou, Sha Yang
Abstract

Crowd counting has gained widespread attention in public safety management, video surveillance, and emergency response. Background interference and variation in head scale nevertheless remain difficult problems. We propose an attention-injective scale aggregation network (ASANet) to address them. ASANet consists of three parts: a shallow feature attention network (SFAN), a multi-level feature aggregation (MLFA) module, and a density map generation (DMG) network. SFAN suppresses the noise introduced by cluttered backgrounds by cross-injecting attention modules into a truncated VGG16 structure. To fully utilize the multi-scale crowd information embedded in feature layers at different positions in the network, the MLFA module densely connects the multi-layer feature maps, which addresses the scale variation problem. In addition, to capture large-scale head information, the DMG network introduces successive dilated convolutional layers that further expand the receptive field of the model, thus improving counting accuracy. We conduct extensive experiments on five public datasets (ShanghaiTech Part_A, ShanghaiTech Part_B, UCF_QNRF, UCF_CC_50, and JHU-Crowd++). The results show that ASANet outperforms most existing methods in counting accuracy and also handles background noise well across different scenes.
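
The claim that successive dilated convolutions expand the receptive field can be checked with simple arithmetic. The sketch below is illustrative only: the function name and the layer configurations are assumptions made for this example, not values taken from the paper. It computes the receptive field along one axis for a stack of convolutional layers, using the standard rule that each layer adds (kernel_size - 1) * dilation input pixels, scaled by the accumulated stride.

```python
def receptive_field(layers):
    """Receptive field (in input pixels, along one axis) of stacked conv layers.

    Each layer is a (kernel_size, dilation, stride) tuple.
    """
    rf = 1    # before any layer, one output position sees one input pixel
    jump = 1  # spacing, in input pixels, between adjacent feature-map positions
    for kernel_size, dilation, stride in layers:
        rf += (kernel_size - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical configurations for comparison (not the ASANet settings):
plain = [(3, 1, 1)] * 3    # three ordinary 3x3 convolutions
dilated = [(3, 2, 1)] * 3  # three 3x3 convolutions with dilation rate 2

print(receptive_field(plain))    # 7
print(receptive_field(dilated))  # 13
```

With the same number of layers and parameters, the dilated stack covers 13 input pixels per axis instead of 7, which is the kind of effect the abstract relies on for capturing large heads.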

© 2024 SPIE and IS&T
Haojie Zou, Yingchun Kuang, Jianqiang Luo, Mingwei Yao, Haoyu Zhou, and Sha Yang "Attention-injective scale aggregation network for crowd counting," Journal of Electronic Imaging 33(5), 053008 (4 September 2024). https://doi.org/10.1117/1.JEI.33.5.053008
Received: 23 April 2024; Accepted: 13 August 2024; Published: 4 September 2024
KEYWORDS
Head

Visualization

Education and training

Convolution

Data modeling

Feature extraction

Quantum networks
