Feature boosting with efficient attention for scene parsing

Singh, Vivek; Sharma, Shailza; Cuzzolin, Fabio

doi:10.1016/j.neucom.2024.128222

arXiv:2402.19250 (cs)

[Submitted on 29 Feb 2024]

Abstract: The complexity of scene parsing grows with the number of object and scene classes, which is higher in unrestricted open scenes. The biggest challenge is to model the spatial relation between scene elements while succeeding in identifying objects at smaller scales. This paper presents a novel feature-boosting network that gathers spatial context from multiple levels of feature extraction and computes the attention weights for each level of representation to generate the final class labels. A novel 'channel attention module' is designed to compute the attention weights, ensuring that features from the relevant extraction stages are boosted while the others are attenuated. The model also learns spatial context information at low resolution to preserve the abstract spatial relationships among scene elements and reduce computation cost. Spatial attention is subsequently concatenated into a final feature set before applying feature boosting. Low-resolution spatial attention features are trained using an auxiliary task that helps the model learn a coarse global scene structure. The proposed model outperforms all state-of-the-art models on both the ADE20K and the Cityscapes datasets.
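The abstract describes weighting each feature-extraction level so that relevant levels are boosted and the rest attenuated. The sketch below illustrates that general idea only: it pools each level to a per-channel descriptor, scores the descriptor, and normalises the scores across levels with a softmax. It is not the paper's channel attention module. The function names, the scoring rule, and `score_weights` (a stand-in for learned parameters) are all assumptions made for illustration.

```python
import math

def global_average_pool(level):
    """Average every channel of one level over its spatial grid.

    `level` is a list of channels; each channel is a list of rows of floats.
    Returns one float per channel.
    """
    pooled = []
    for channel in level:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def level_attention_weights(levels, score_weights):
    """Score each level from its pooled descriptor, then normalise.

    `score_weights` is hypothetical: one weight per channel, shared
    across levels, standing in for parameters a real model would learn.
    """
    scores = []
    for level in levels:
        descriptor = global_average_pool(level)
        scores.append(sum(w * d for w, d in zip(score_weights, descriptor)))
    return softmax(scores)

def boost_features(levels, weights):
    """Scale every value of each level by that level's attention weight."""
    return [
        [[[v * w for v in row] for row in channel] for channel in level]
        for level, w in zip(levels, weights)
    ]

# Toy input: 3 levels, 2 channels each, on a 2x2 spatial grid.
levels = [
    [[[0.1, 0.1], [0.1, 0.1]], [[0.2, 0.2], [0.2, 0.2]]],
    [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]],
    [[[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]],
]
score_weights = [1.0, 1.0]  # hypothetical stand-in for learned parameters

weights = level_attention_weights(levels, score_weights)
boosted = boost_features(levels, weights)
```

In this toy input the second level has the largest pooled response, so it receives the largest weight and the other two levels are scaled down. A trained network would learn the scoring parameters rather than fix them by hand.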

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2402.19250 [cs.CV]
	(or arXiv:2402.19250v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2402.19250
Related DOI:	https://doi.org/10.1016/j.neucom.2024.128222

Submission history

From: Shailza Sharma
[v1] Thu, 29 Feb 2024 15:22:21 UTC (3,764 KB)
