iMotion-LLM: Motion Prediction Instruction Tuning

Felemban, Abdulwahab; Bakr, Eslam Mohamed; Shen, Xiaoqian; Ding, Jian; Mohamed, Abduallah; Elhoseiny, Mohamed

arXiv:2406.06211 (cs)

[Submitted on 10 Jun 2024 (v1), last revised 11 Jun 2024 (this version, v2)]

Abstract: We introduce iMotion-LLM, a multimodal large language model (LLM) with trajectory prediction, tailored to guide interactive multi-agent scenarios. Unlike conventional motion prediction approaches, iMotion-LLM takes textual instructions as key inputs for generating contextually relevant trajectories. We created InstructWaymo by enriching the real-world driving scenarios in the Waymo Open Dataset with textual motion instructions. Leveraging this dataset, iMotion-LLM integrates a pretrained LLM, fine-tuned with LoRA, to translate scene features into the LLM input space. iMotion-LLM offers significant advantages over conventional motion prediction models. First, it can generate trajectories that align with the provided instruction when the instructed direction is feasible. Second, when given an infeasible direction, it can reject the instruction, thereby enhancing safety. These findings act as milestones in empowering autonomous navigation systems to interpret and predict the dynamics of multi-agent environments, laying the groundwork for future advancements in this field.
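
The accept-or-reject behaviour described in the abstract can be illustrated with a small sketch. This is not the paper's labeling procedure or model: the function names `direction_label` and `respond_to_instruction`, the four direction categories, the 30-degree turn threshold, and the 1-metre travel threshold are all assumptions made for illustration. The sketch assumes trajectories are 2D points in metres in a frame where counter-clockwise heading change means a left turn.

```python
import math


def direction_label(trajectory, turn_threshold_deg=30.0, min_travel_m=1.0):
    """Give a coarse direction label to a 2D trajectory of (x, y) points.

    Hypothetical helper: categories and thresholds are illustrative
    assumptions, not values taken from the paper.
    Requires at least two points.
    """
    (x0, y0), (x1, y1) = trajectory[0], trajectory[1]
    (xa, ya), (xb, yb) = trajectory[-2], trajectory[-1]

    # Net displacement between the first and last point.
    if math.hypot(xb - x0, yb - y0) < min_travel_m:
        return "stationary"

    # Heading of the first and of the last segment, in radians.
    start_heading = math.atan2(y1 - y0, x1 - x0)
    end_heading = math.atan2(yb - ya, xb - xa)

    # Heading change in degrees, wrapped into [-180, 180).
    change = math.degrees(end_heading - start_heading)
    change = (change + 180.0) % 360.0 - 180.0

    if change > turn_threshold_deg:
        return "left"
    if change < -turn_threshold_deg:
        return "right"
    return "straight"


def respond_to_instruction(instruction, feasible_directions):
    """Accept the instruction if its direction is feasible, else reject it.

    Hypothetical stand-in for the model's decision; a real system would
    also generate a trajectory for accepted instructions.
    """
    if instruction in feasible_directions:
        return {"decision": "accept", "direction": instruction}
    return {"decision": "reject", "direction": None}


if __name__ == "__main__":
    left_turn = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0), (2.0, 2.0)]
    print(direction_label(left_turn))
    print(respond_to_instruction("right", {"straight", "left"}))
```

In this sketch the set of feasible directions is passed in by the caller; how feasibility is determined for a real scene is outside what the abstract specifies.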

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2406.06211 [cs.CV]
	(or arXiv:2406.06211v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2406.06211

Submission history

From: Abdulwahab Felemban
[v1] Mon, 10 Jun 2024 12:22:06 UTC (10,584 KB)
[v2] Tue, 11 Jun 2024 12:37:23 UTC (10,585 KB)
