Aligning Human Intent From Imperfect Demonstrations With Confidence-Based Inverse Soft-Q Learning.

Aligning Human Intent from Imperfect Demonstrations with Confidence ...

Dec 18, 2023 · We develop a generalized confidence-based imitation learning framework for guiding policy learning, called Confidence-based Inverse soft-Q Learning (CIQL).

Aligning Human Intent From Imperfect Demonstrations With Confidence ...

www.researchgate.net › publication › 38...

Imitation learning attracts much attention for its ability to allow robots to quickly learn human manipulation skills through demonstrations.

Bu Xizhou - Google Scholar

scholar.google.com.hk › citations

Aligning Human Intent From Imperfect Demonstrations With Confidence-Based Inverse Soft-Q Learning. X Bu, W Li, Z Liu, Z Ma, P Huang. IEEE Robotics and ...

ICRA@40 Program | Thursday September 26, 2024

ras.papercept.net › ICRAX24 › program

Sep 30, 2024 · Aligning Human Intent from Imperfect Demonstrations with Confidence-Based Inverse Soft-Q Learning. Bu, Xizhou, Northwestern Polytechnical ...

VILD: Variational Imitation Learning with Diverse-quality Demonstrations

www.semanticscholar.org › paper › VIL...

Aligning Human Intent From Imperfect Demonstrations With Confidence-Based Inverse Soft-Q Learning · Xizhou Bu, Wenjuan Li, Zhengxiong Liu, Zhiqiang Ma, Panfeng Huang.

A Benchmark for Imitation Learning with Human Demonstrations

arxiv-sanity-lite.com › ...

We develop a generalized confidence-based imitation learning framework for guiding policy learning, called Confidence-based Inverse soft-Q Learning (CIQL), as ...

[PDF] Inverse Reinforcement Learning with Multiple Ranked Experts

www.semanticscholar.org › paper › Inve...

Aligning Human Intent From Imperfect Demonstrations With Confidence-Based Inverse Soft-Q Learning · Xizhou Bu, Wenjuan Li, Zhengxiong Liu, Zhiqiang Ma, Panfeng Huang.

Aligning Language Models with Demonstrated Feedback - arxiv-sanity

www.arxiv-sanity-lite.com › ...

We build a sequential decision-making framework to formulate the problem of aligning LLMs using demonstration datasets. Drawing insights from inverse ...

Representation Alignment from Human Feedback for Cross ...

www.aimodels.fyi › papers › arxiv › repr...

Aug 12, 2024 · We develop a generalized confidence-based imitation learning framework for guiding policy learning, called Confidence-based Inverse soft-Q ...

Efficient Reductions for Imitation Learning. - ResearchGate

www.researchgate.net › publication › 22...

Aligning Human Intent From Imperfect Demonstrations With Confidence-Based Inverse Soft-Q Learning. Article. Aug 2024. Xizhou Bu · Wenjuan Li ...