Exploring the Potential of Multi-Modal AI for Driving Hazard Prediction

Charoenpitaks, Korawat; Nguyen, Van-Quang; Suganuma, Masanori; Takahashi, Masahiro; Niihara, Ryoma; Okatani, Takayuki


arXiv:2310.04671 (cs)

[Submitted on 7 Oct 2023 (v1), last revised 1 Jul 2024 (this version, v4)]


Abstract: This paper addresses the problem of predicting hazards that drivers may encounter while driving a car. We formulate it as the task of anticipating impending accidents from a single input image captured by a car dashcam. Unlike existing approaches to driving hazard prediction, which rely on computational simulations or anomaly detection from videos, this study focuses on high-level inference from static images. The problem requires predicting and reasoning about future events based on uncertain observations, which falls under visual abductive reasoning. To enable research in this understudied area, we create a new dataset, the DHPR (Driving Hazard Prediction and Reasoning) dataset. It consists of 15K dashcam images of street scenes, each associated with a tuple containing the car speed, a hypothesized hazard description, and the visual entities present in the scene. These are provided by human annotators, who identify risky scenes and describe potential accidents that could occur a few seconds later. We present several baseline methods and evaluate their performance on our dataset, identifying remaining issues and discussing future directions. This study contributes to the field by introducing a novel problem formulation and dataset, enabling researchers to explore the potential of multi-modal AI for driving hazard prediction.
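
As an illustration of the per-image annotation tuple described in the abstract (car speed, hypothesized hazard description, visual entities), one record might be represented as in the sketch below. This is not the dataset's actual schema: the class names, the field names, the km/h unit, and the bounding-box representation of a visual entity are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class VisualEntity:
    """Hypothetical representation of one annotated entity in the scene."""
    label: str                               # assumed free-text name, e.g. "pedestrian"
    box: Tuple[float, float, float, float]   # assumed (x_min, y_min, x_max, y_max) in pixels


@dataclass
class HazardRecord:
    """Hypothetical container for one image and its annotation tuple."""
    image_path: str                          # path to the dashcam image (assumed layout)
    speed_kmh: float                         # car speed; the unit is an assumption
    hazard_description: str                  # annotator's hypothesized hazard
    entities: List[VisualEntity] = field(default_factory=list)


def describe(record: HazardRecord) -> str:
    """Format a record as a single human-readable line."""
    names = ", ".join(e.label for e in record.entities) or "none"
    return (
        f"{record.image_path} @ {record.speed_kmh:.0f} km/h | "
        f"entities: {names} | {record.hazard_description}"
    )


if __name__ == "__main__":
    # Invented example values, not taken from the dataset.
    example = HazardRecord(
        image_path="scene_0001.jpg",
        speed_kmh=40.0,
        hazard_description="A pedestrian may step out from behind the parked van.",
        entities=[VisualEntity("pedestrian", (10.0, 20.0, 50.0, 120.0))],
    )
    print(describe(example))
```

Keeping the entities as a list on the record lets a hazard description be paired with zero or more scene regions; whether the released data follows this layout should be checked against the dataset itself.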

Comments:	Main Paper: 11 pages, Supplementary Materials: 25 pages
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2310.04671 [cs.CV]
	(or arXiv:2310.04671v4 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2310.04671
Journal reference:	IEEE Trans. Intell. Veh. (2024) 1-11

Submission history

From: Korawat Charoenpitaks Mr.
[v1] Sat, 7 Oct 2023 03:16:30 UTC (23,355 KB)
[v2] Tue, 10 Oct 2023 02:31:24 UTC (23,355 KB)
[v3] Tue, 27 Feb 2024 14:22:09 UTC (27,549 KB)
[v4] Mon, 1 Jul 2024 09:29:39 UTC (46,442 KB)
