Dec 18, 2023 · We develop a generalized confidence-based imitation learning framework for guiding policy learning, called Confidence-based Inverse soft-Q Learning (CIQL).
Imitation learning attracts much attention for its ability to allow robots to quickly learn human manipulation skills through demonstrations.
Aligning Human Intent From Imperfect Demonstrations With Confidence-Based Inverse Soft-Q Learning. X Bu, W Li, Z Liu, Z Ma, P Huang. IEEE Robotics and ...
Sep 30, 2024 · Aligning Human Intent from Imperfect Demonstrations with Confidence-Based Inverse Soft-Q Learning. Bu, Xizhou, Northwestern Polytechnical ...
Aligning Human Intent From Imperfect Demonstrations With Confidence-Based Inverse Soft-Q Learning · Xizhou Bu, Wenjuan Li, Zhengxiong Liu, Zhiqiang Ma, Panfeng Huang.
We build a sequential decision-making framework to formulate the problem of aligning LLMs using demonstration datasets. Drawing insights from inverse ...
Aug 12, 2024 · We develop a generalized confidence-based imitation learning framework for guiding policy learning, called Confidence-based Inverse soft-Q ...
Aligning Human Intent From Imperfect Demonstrations With Confidence-Based Inverse Soft-Q Learning. Article. Aug 2024. Xizhou Bu · Wenjuan Li ...