Learning non-Markovian Decision-Making from State-only Sequences

Qin, Aoyang; Gao, Feng; Li, Qing; Zhu, Song-Chun; Xie, Sirui


arXiv:2306.15156 (cs)

[Submitted on 27 Jun 2023 (v1), last revised 30 Oct 2023 (this version, v3)]


Abstract: Conventional imitation learning assumes access to the actions of demonstrators, but these motor signals are often unobservable in naturalistic settings. Additionally, sequential decision-making behaviors in these settings can deviate from the assumptions of a standard Markov Decision Process (MDP). To address these challenges, we explore deep generative modeling of state-only sequences with a non-Markov Decision Process (nMDP), where the policy is an energy-based prior in the latent space of the state transition generator. We develop maximum likelihood estimation to achieve model-based imitation, which involves short-run MCMC sampling from the prior and importance sampling for the posterior. The learned model enables decision-making as inference: model-free policy execution is equivalent to prior sampling, and model-based planning is posterior sampling initialized from the policy. We demonstrate the efficacy of the proposed method in a prototypical path planning task with non-Markovian constraints and show that the learned model exhibits strong performance in challenging domains from the MuJoCo suite.
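
As a rough illustration of what "short-run MCMC sampling from the prior" can look like, the sketch below runs a fixed, small number of Langevin steps on a toy energy function. Everything here is an assumption made for illustration: the abstract does not specify the MCMC kernel, so Langevin dynamics is swapped in as one common choice; the quadratic energy `E(z) = 0.5 * ||z - mu||^2` stands in for a learned energy-based prior; and the names `grad_energy` and `short_run_langevin` are hypothetical, not taken from the paper or its code.

```python
import numpy as np


def grad_energy(z, mu):
    # Gradient of the toy energy E(z) = 0.5 * ||z - mu||^2.
    # Assumption: a stand-in for the gradient of a learned energy network.
    return z - mu


def short_run_langevin(z0, mu, n_steps=20, step_size=0.3, rng=None):
    # Run a fixed, small number of Langevin updates from the initial state z0:
    #   z <- z - (s^2 / 2) * grad E(z) + s * eps,   eps ~ N(0, I)
    # The chain is deliberately not run to convergence ("short-run").
    rng = np.random.default_rng() if rng is None else rng
    z = np.array(z0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(z.shape)
        z = z - 0.5 * step_size**2 * grad_energy(z, mu) + step_size * noise
    return z


rng = np.random.default_rng(0)
mu = np.array([2.0, -1.0])             # mode of the toy energy (hypothetical)
z0 = rng.standard_normal((5000, 2))    # chains initialized from N(0, I)
samples = short_run_langevin(z0, mu, rng=rng)
```

Because the chain stops after only 20 steps, the sample mean moves toward `mu` without fully reaching it; that non-convergent behavior is the defining property of a short-run sampler, in contrast to a chain run until it mixes.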

Comments: Accepted at NeurIPS 2023
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2306.15156 [cs.LG] (or arXiv:2306.15156v3 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2306.15156

Submission history

From: Aoyang Qin
[v1] Tue, 27 Jun 2023 02:26:01 UTC (1,528 KB)
[v2] Sat, 1 Jul 2023 08:33:38 UTC (1,528 KB)
[v3] Mon, 30 Oct 2023 06:18:02 UTC (1,650 KB)
