CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoors Object Detection from Multi-view Images

Shen, Guanlin; Huang, Jingwei; Hu, Zhihua; Wang, Bin

arXiv:2403.04198 (cs)

[Submitted on 7 Mar 2024 (v1), last revised 9 Apr 2024 (this version, v2)]

Abstract: This paper introduces CN-RMA, a novel approach for 3D indoor object detection from multi-view images. We observe the key challenge as the ambiguity of image and 3D correspondence without explicit geometry to provide occlusion information. To address this issue, CN-RMA leverages the synergy of 3D reconstruction networks and 3D object detection networks, where the reconstruction network provides a rough Truncated Signed Distance Function (TSDF) and guides image features to vote to 3D space correctly in an end-to-end manner. Specifically, we associate weights to sampled points of each ray through ray marching, representing the contribution of a pixel in an image to corresponding 3D locations. Such weights are determined by the predicted signed distances so that image features vote only to regions near the reconstructed surface. Our method achieves state-of-the-art performance in 3D object detection from multi-view images, as measured by mAP@0.25 and mAP@0.5 on the ScanNet and ARKitScenes datasets. The code and models are released at this https URL.
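
The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weight formula, so the code below swaps in a NeuS-style conversion from signed distance to opacity as an assumption; the function names, the `sharpness` parameter, and the sign convention (positive in front of the surface, negative behind it) are all hypothetical and not taken from the paper or its released code.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def ray_marching_weights(sdf_samples, sharpness=20.0):
    """Per-sample weights along one ray from predicted signed distances.

    Assumption: NeuS-style opacity, not necessarily the paper's formula.
    sdf_samples are ordered from the camera outward; positive values lie
    in front of the surface, negative values behind it.
    """
    n = len(sdf_samples)
    weights = [0.0] * n
    transmittance = 1.0  # fraction of the ray not yet absorbed by a surface
    for i in range(n - 1):
        phi_cur = sigmoid(sharpness * sdf_samples[i])
        phi_next = sigmoid(sharpness * sdf_samples[i + 1])
        # Opacity is large only where the signed distance drops through zero.
        alpha = max((phi_cur - phi_next) / max(phi_cur, 1e-8), 0.0)
        weights[i] = transmittance * alpha
        transmittance *= 1.0 - alpha
    return weights


def vote_features(pixel_feature, weights):
    """Scale one pixel's feature vector by each sample's weight.

    Returns one weighted feature per sampled 3D point on the ray; in a
    full pipeline these would be scattered into a voxel grid.
    """
    return [[w * f for f in pixel_feature] for w in weights]


# Hypothetical ray that crosses a surface between the third and fourth sample.
sdf_along_ray = [0.5, 0.3, 0.1, -0.1, -0.3, -0.5]
weights = ray_marching_weights(sdf_along_ray)
voted = vote_features([1.0, 2.0], weights)
```

In this sketch the weight concentrates on the sample just before the zero crossing, and samples far in front of or behind the surface receive almost nothing, which mirrors the stated goal that image features vote only to regions near the reconstructed surface. Because the transmittance shrinks after the first crossing, points hidden behind a surface are also suppressed, which is one way to encode occlusion.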

Comments:	CVPR 2024 poster paper, 8 pages of main text and 4 pages of supplementary material
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2403.04198 [cs.CV]
	(or arXiv:2403.04198v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2403.04198

Submission history

From: Guanlin Shen
[v1] Thu, 7 Mar 2024 03:59:47 UTC (3,738 KB)
[v2] Tue, 9 Apr 2024 15:07:08 UTC (3,995 KB)
