Yaodong Yang 0001
Person information
- unicode name: 杨耀东
- affiliation: Peking University, Institute for AI, Beijing, China
- affiliation (former): King's College London, UK
- affiliation (former): Huawei Technologies, Noah's Ark Lab, UK
- affiliation (PhD): University College London, UK
Other persons with the same name
- Yaodong Yang — disambiguation page
- Yaodong Yang 0002 — Chinese University of Hong Kong, Hong Kong (and 2 more)
- Yaodong Yang 0003 — University of Nebraska - Lincoln, USA
- Yaodong Yang 0004 — University of Science and Technology Beijing, Beijing, China
- Yaodong Yang 0005 — Hefei University of Technology, School of Mathematics, China
- Adam X. Yang (aka: Adam Yang 0002) — University of Bristol, Department of Computer Science, UK
- Adam Yang 0003 — University of Maryland, Department of Computer Science, USA
2020 – today
- 2024
- [j20]Yifan Zhong, Jakub Grudzien Kuba, Xidong Feng, Siyi Hu, Jiaming Ji, Yaodong Yang:
Heterogeneous-Agent Reinforcement Learning. J. Mach. Learn. Res. 25: 32:1-32:67 (2024) - [j19]Dongzi Wang, Fangwei Zhong, Minglong Li, Muning Wen, Yuanxi Peng, Teng Li, Adam Yang:
RoMAT: Role-based multi-agent transformer for generalizable heterogeneous cooperation. Neural Networks 174: 106129 (2024) - [j18]Jie Liu, Yinmin Zhang, Chuming Li, Yaodong Yang, Yu Liu, Wanli Ouyang:
Adaptive pessimism via target Q-value for offline reinforcement learning. Neural Networks 180: 106588 (2024) - [j17]Yuanpei Chen, Yiran Geng, Fangwei Zhong, Jiaming Ji, Jiechuang Jiang, Zongqing Lu, Hao Dong, Yaodong Yang:
Bi-DexHands: Towards Human-Level Bimanual Dexterous Manipulation. IEEE Trans. Pattern Anal. Mach. Intell. 46(5): 2804-2818 (2024) - [j16]Chenguang Wang, Zhouliang Yu, Stephen McAleer, Tianshu Yu, Yaodong Yang:
ASP: Learn a Universal Neural Solver! IEEE Trans. Pattern Anal. Mach. Intell. 46(6): 4102-4114 (2024) - [j15]Yuyang Li, Bo Liu, Yiran Geng, Puhao Li, Yaodong Yang, Yixin Zhu, Tengyu Liu, Siyuan Huang:
Grasp Multiple Objects With One Hand. IEEE Robotics Autom. Lett. 9(5): 4027-4034 (2024) - [j14]Yang Li, Fanglei Sun, Jingchen Hu, Chang Liu, Fan Wu, Kai Li, Ying Wen, Zheng Tian, Yaodong Yang, Jiangcheng Zhu, Zhifeng Chen, Jun Wang, Yang Yang:
Self-Supervised MAFENN for Classifying Low-Labeled Distorted Images Over Mobile Fading Channels. IEEE Trans. Mob. Comput. 23(8): 8077-8091 (2024) - [c68]Yinmin Zhang, Jie Liu, Chuming Li, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang:
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning. AAAI 2024: 16908-16916 - [c67]Sirui Chen, Zhaowei Zhang, Yaodong Yang, Yali Du:
STAS: Spatial-Temporal Return Decomposition for Solving Sparse Rewards Problems in Multi-agent Reinforcement Learning. AAAI 2024: 17337-17345 - [c66]Ceyao Zhang, Kaijie Yang, Siyi Hu, Zihao Wang, Guanghe Li, Yihang Sun, Cheng Zhang, Zhaowei Zhang, Anji Liu, Song-Chun Zhu, Xiaojun Chang, Junge Zhang, Feng Yin, Yitao Liang, Yaodong Yang:
ProAgent: Building Proactive Cooperative Agents with Large Language Models. AAAI 2024: 17591-17599 - [c65]Shaoting Feng, Qinya Li, Yaodong Yang, Fan Wu, Guihai Chen:
GIPUT: Maximizing Photo Coverage Efficiency for UAV Trajectory. APWeb/WAIM (1) 2024: 391-406 - [c64]Le Cong Dinh, David Henry Mguni, Long Tran-Thanh, Jun Wang, Yaodong Yang:
A Summary of Online Markov Decision Processes with Non-oblivious Strategic Adversary. AAMAS 2024: 2830-2832 - [c63]Jieming Cui, Tengyu Liu, Nian Liu, Yaodong Yang, Yixin Zhu, Siyuan Huang:
AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents. CVPR 2024: 852-862 - [c62]Weidong Huang, Jiaming Ji, Chunhe Xia, Borong Zhang, Yaodong Yang:
SafeDreamer: Safe Reinforcement Learning with World Models. ICLR 2024 - [c61]Josef Dai, Xuehai Pan, Ruiyang Sun, Jiaming Ji, Xinbo Xu, Mickel Liu, Yizhou Wang, Yaodong Yang:
Safe RLHF: Safe Reinforcement Learning from Human Feedback. ICLR 2024 - [c60]Simin Li, Jun Guo, Jingqiao Xiu, Ruixiao Xu, Xin Yu, Jiakai Wang, Aishan Liu, Yaodong Yang, Xianglong Liu:
Byzantine Robust Cooperative Multi-Agent Reinforcement Learning as a Bayesian Game. ICLR 2024 - [c59]Jiarong Liu, Yifan Zhong, Siyi Hu, Haobo Fu, Qiang Fu, Xiaojun Chang, Yaodong Yang:
Maximum Entropy Heterogeneous-Agent Reinforcement Learning. ICLR 2024 - [c58]Siyuan Qi, Shuo Chen, Yexin Li, Xiangyu Kong, Junqi Wang, Bangcheng Yang, Pring Wong, Yifan Zhong, Xiaoyuan Zhang, Zhaowei Zhang, Nian Liu, Yaodong Yang, Song-Chun Zhu:
CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents. ICLR 2024 - [c57]Juntao Dai, Yaodong Yang, Qian Zheng, Gang Pan:
Safe Reinforcement Learning using Finite-Horizon Gradient-based Estimation. ICML 2024 - [c56]Yizhe Huang, Anji Liu, Fanqi Kong, Yaodong Yang, Song-Chun Zhu, Xue Feng:
Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning. ICML 2024 - [c55]Ruiqing Chen, Xiaoyuan Zhang, Yali Du, Yifan Zhong, Zheng Tian, Fanglei Sun, Yaodong Yang:
Off-Agent Trust Region Policy Optimization. IJCAI 2024: 3798-3806 - [c54]Yue Zhang, Yaodong Yang, Zhenbo Lu, Wengang Zhou, Houqiang Li:
Remember the Past for Better Future: Memory-Augmented Offline RL. IJCNN 2024: 1-8 - [i105]Siyuan Qi, Shuo Chen, Yexin Li, Xiangyu Kong, Junqi Wang, Bangcheng Yang, Pring Wong, Yifan Zhong, Xiaoyuan Zhang, Zhaowei Zhang, Nian Liu, Wei Wang, Yaodong Yang, Song-Chun Zhu:
CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents. CoRR abs/2401.10568 (2024) - [i104]Yifan Zhong, Chengdong Ma, Xiaoyuan Zhang, Ziran Yang, Qingfu Zhang, Siyuan Qi, Yaodong Yang:
Panacea: Pareto Alignment via Preference Adaptation for LLMs. CoRR abs/2402.02030 (2024) - [i103]Jiaming Ji, Boyuan Chen, Hantao Lou, Donghai Hong, Borong Zhang, Xuehai Pan, Juntao Dai, Yaodong Yang:
Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction. CoRR abs/2402.02416 (2024) - [i102]Tianyi Qiu, Fanzhi Zeng, Jiaming Ji, Dong Yan, Kaile Wang, Jiayi Zhou, Han Yang, Josef Dai, Xuehai Pan, Yaodong Yang:
Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective. CoRR abs/2402.10184 (2024) - [i101]Zhaowei Zhang, Fengshuo Bai, Mingzhi Wang, Haoyang Ye, Chengdong Ma, Yaodong Yang:
Incentive Compatibility for AI Alignment in Sociotechnical Systems: Positions and Prospects. CoRR abs/2402.12907 (2024) - [i100]Naming Liu, Mingzhi Wang, Youzhi Zhang, Yaodong Yang, Bo An, Ying Wen:
Leveraging Team Correlation for Approximating Equilibrium in Two-Team Zero-Sum Games. CoRR abs/2403.00255 (2024) - [i99]Tianhao Wu, Yunchong Gan, Mingdong Wu, Jingbo Cheng, Yaodong Yang, Yixin Zhu, Hao Dong:
UniDexFPM: Universal Dexterous Functional Pre-grasp Manipulation Via Diffusion Policy. CoRR abs/2403.12421 (2024) - [i98]Jieming Cui, Tengyu Liu, Nian Liu, Yaodong Yang, Yixin Zhu, Siyuan Huang:
AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents. CoRR abs/2403.12835 (2024) - [i97]Zhiyu Zhao, Ning Yang, Xue Yan, Haifeng Zhang, Jun Wang, Yaodong Yang:
Correlated Mean Field Imitation Learning. CoRR abs/2404.09324 (2024) - [i96]Fengshuo Bai, Rui Zhao, Hongming Zhang, Sijia Cui, Ying Wen, Yaodong Yang, Bo Xu, Lei Han:
Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation. CoRR abs/2405.18688 (2024) - [i95]Fengshuo Bai, Mingzhi Wang, Zhaowei Zhang, Boyuan Chen, Yinda Xu, Ying Wen, Yaodong Yang:
Efficient Model-agnostic Alignment via Bayesian Persuasion. CoRR abs/2405.18718 (2024) - [i94]Jiesong Lian, Yucong Huang, Mingzhi Wang, Chengdong Ma, Yixue Hao, Ying Wen, Yaodong Yang:
Fusion-PSRO: Nash Policy Fusion for Policy Space Response Oracles. CoRR abs/2405.21027 (2024) - [i93]Jiaming Ji, Kaile Wang, Tianyi Qiu, Boyuan Chen, Jiayi Zhou, Changye Li, Hantao Lou, Yaodong Yang:
Language Models Resist Alignment. CoRR abs/2406.06144 (2024) - [i92]Yizhe Huang, Anji Liu, Fanqi Kong, Yaodong Yang, Song-Chun Zhu, Xue Feng:
Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning. CoRR abs/2406.08002 (2024) - [i91]Josef Dai, Tianle Chen, Xuyao Wang, Ziran Yang, Taiye Chen, Jiaming Ji, Yaodong Yang:
SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset. CoRR abs/2406.14477 (2024) - [i90]Jiaming Ji, Donghai Hong, Borong Zhang, Boyuan Chen, Josef Dai, Boren Zheng, Tianyi Qiu, Boxun Li, Yaodong Yang:
PKU-SafeRLHF: A Safety Alignment Preference Dataset for Llama Family Models. CoRR abs/2406.15513 (2024) - [i89]Tianyi Qiu, Yang Zhang, Xuchuan Huang, Jasmine Xinze Li, Jiaming Ji, Yaodong Yang:
ProgressGym: Alignment with a Millennium of Moral Progress. CoRR abs/2406.20087 (2024) - [i88]Ruize Zhang, Zelai Xu, Chengdong Ma, Chao Yu, Wei-Wei Tu, Shiyu Huang, Deheng Ye, Wenbo Ding, Yaodong Yang, Yu Wang:
A Survey on Self-play Methods in Reinforcement Learning. CoRR abs/2408.01072 (2024) - [i87]Jiayi Zhou, Jiaming Ji, Juntao Dai, Yaodong Yang:
Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback. CoRR abs/2409.00162 (2024) - [i86]Naming Liu, Mingzhi Wang, Xihuai Wang, Weinan Zhang, Yaodong Yang, Youzhi Zhang, Bo An, Ying Wen:
Computing Ex Ante Equilibrium in Heterogeneous Zero-Sum Team Games. CoRR abs/2410.01575 (2024)
- 2023
- [j13]Le Cong Dinh, David Henry Mguni, Long Tran-Thanh, Jun Wang, Yaodong Yang:
Online Markov decision processes with non-oblivious strategic adversary. Auton. Agents Multi Agent Syst. 37(1): 15 (2023) - [j12]Shangding Gu, Jakub Grudzien Kuba, Yuanpei Chen, Yali Du, Long Yang, Alois C. Knoll, Yaodong Yang:
Safe multi-agent reinforcement learning for multi-robot control. Artif. Intell. 319: 103905 (2023) - [j11]Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Hai-Feng Zhang, Weinan Zhang:
Large sequence models for sequential decision-making: a survey. Frontiers Comput. Sci. 17(6): 176349 (2023) - [j10]Linghui Meng, Muning Wen, Chenyang Le, Xiyun Li, Dengpeng Xing, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Yaodong Yang, Bo Xu:
Offline Pre-trained Multi-agent Decision Transformer. Mach. Intell. Res. 20(2): 233-248 (2023) - [j9]Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Yong Yu, Jun Wang, Weinan Zhang:
MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning. J. Mach. Learn. Res. 24: 150:1-150:12 (2023) - [j8]Siyi Hu, Yifan Zhong, Minquan Gao, Weixun Wang, Hao Dong, Xiaodan Liang, Zhihui Li, Xiaojun Chang, Yaodong Yang:
MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library. J. Mach. Learn. Res. 24: 315:1-315:23 (2023) - [j7]Jie Ren, Xidong Feng, Bo Liu, Xuehai Pan, Yao Fu, Luo Mai, Yaodong Yang:
TorchOpt: An Efficient Library for Differentiable Optimization. J. Mach. Learn. Res. 24: 367:1-367:14 (2023) - [j6]Yang Li, Kun Xiong, Yingping Zhang, Jiangcheng Zhu, Stephen Marcus McAleer, Wei Pan, Jun Wang, Zonghong Dai, Yaodong Yang:
JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games. Trans. Mach. Learn. Res. 2023 (2023) - [c53]Chuming Li, Jie Liu, Yinmin Zhang, Yuhong Wei, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang:
ACE: Cooperative Multi-Agent Q-learning with Bidirectional Action-Dependency. AAAI 2023: 8536-8544 - [c52]David Mguni, Taher Jafferjee, Jianhong Wang, Nicolas Perez Nieves, Wenbin Song, Feifei Tong, Matthew E. Taylor, Tianpei Yang, Zipeng Dai, Hui Chen, Jiangcheng Zhu, Kun Shao, Jun Wang, Yaodong Yang:
Learning to Shape Rewards Using a Game of Two Partners. AAAI 2023: 11604-11612 - [c51]Pei Xu, Junge Zhang, Qiyue Yin, Chao Yu, Yaodong Yang, Kaiqi Huang:
Subspace-Aware Exploration for Sparse-Reward Multi-Agent Tasks. AAAI 2023: 11717-11725 - [c50]Zhijian Duan, Wenhan Huang, Dinghuai Zhang, Yali Du, Jun Wang, Yaodong Yang, Xiaotie Deng:
Is Nash Equilibrium Approximator Learnable? AAMAS 2023: 233-241 - [c49]Binghao Huang, Yuanpei Chen, Tianyu Wang, Yuzhe Qin, Yaodong Yang, Nikolay Atanasov, Xiaolong Wang:
Dynamic Handover: Throw and Catch with Bimanual Hands. CoRL 2023: 1887-1902 - [c48]Chuming Li, Ruonan Jia, Jie Liu, Yinmin Zhang, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang:
Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning. ECAI 2023: 1381-1388 - [c47]Weikang Wan, Haoran Geng, Yun Liu, Zikang Shan, Yaodong Yang, Li Yi, He Wang:
UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning. ICCV 2023: 3868-3879 - [c46]Shuang Wu, Jian Yao, Haobo Fu, Ye Tian, Chao Qian, Yaodong Yang, Qiang Fu, Wei Yang:
Quality-Similar Diversity via Population Based Reinforcement Learning. ICLR 2023 - [c45]David Henry Mguni, Haojun Chen, Taher Jafferjee, Jianhong Wang, Longfei Yue, Xidong Feng, Stephen Marcus McAleer, Feifei Tong, Jun Wang, Yaodong Yang:
MANSA: Learning Fast and Slow in Multi-Agent Systems. ICML 2023: 24631-24658 - [c44]Oliver Slumbers, David Henry Mguni, Stefano B. Blumberg, Stephen Marcus McAleer, Yaodong Yang, Jun Wang:
A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems. ICML 2023: 32059-32087 - [c43]Xiaohang Tang, Le Cong Dinh, Stephen Marcus McAleer, Yaodong Yang:
Regret-Minimizing Double Oracle for Extensive-Form Games. ICML 2023: 33599-33615 - [c42]Hanjing Wang, Man-Kit Sit, Congjie He, Ying Wen, Weinan Zhang, Jun Wang, Yaodong Yang, Luo Mai:
GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models. ICML 2023: 36380-36390 - [c41]Yiran Geng, Boshi An, Haoran Geng, Yuanpei Chen, Yaodong Yang, Hao Dong:
RLAfford: End-to-End Affordance Learning for Robotic Manipulation. ICRA 2023: 5880-5886 - [c40]Puhao Li, Tengyu Liu, Yuyang Li, Yiran Geng, Yixin Zhu, Yaodong Yang, Siyuan Huang:
GenDexGrasp: Generalizable Dexterous Grasping. ICRA 2023: 8068-8074 - [c39]Jiaming Ji, Mickel Liu, Josef Dai, Xuehai Pan, Chi Zhang, Ce Bian, Boyuan Chen, Ruiyang Sun, Yizhou Wang, Yaodong Yang:
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset. NeurIPS 2023 - [c38]Jiaming Ji, Borong Zhang, Jiayi Zhou, Xuehai Pan, Weidong Huang, Ruiyang Sun, Yiran Geng, Yifan Zhong, Josef Dai, Yaodong Yang:
Safety Gymnasium: A Unified Safe Reinforcement Learning Benchmark. NeurIPS 2023 - [c37]Stephen McAleer, Gabriele Farina, Gaoyue Zhou, Mingzhi Wang, Yaodong Yang, Tuomas Sandholm:
Team-PSRO for Learning Approximate TMECor in Large Team Games via Cooperative Reinforcement Learning. NeurIPS 2023 - [c36]Mingyu Yang, Yaodong Yang, Zhenbo Lu, Wengang Zhou, Houqiang Li:
Hierarchical Multi-Agent Skill Discovery. NeurIPS 2023 - [c35]Jian Yao, Weiming Liu, Haobo Fu, Yaodong Yang, Stephen McAleer, Qiang Fu, Wei Yang:
Policy Space Diversity for Non-Transitive Games. NeurIPS 2023 - [c34]Youpeng Zhao, Yaodong Yang, Zhenbo Lu, Wengang Zhou, Houqiang Li:
Multi-Agent First Order Constrained Optimization in Policy Space. NeurIPS 2023 - [c33]Huanzhou Zhu, Bo Zhao, Gang Chen, Weifeng Chen, Yijie Chen, Liang Shi, Yaodong Yang, Peter R. Pietzuch, Lei Chen:
MSRL: Distributed Reinforcement Learning with Dataflow Fragments. USENIX ATC 2023: 977-993 - [i85]David Mguni, Taher Jafferjee, Haojun Chen, Jianhong Wang, Long Fei, Xidong Feng, Stephen McAleer, Feifei Tong, Jun Wang, Yaodong Yang:
MANSA: Learning Fast and Slow in Multi-Agent Systems. CoRR abs/2302.05910 (2023) - [i84]Shangding Gu, Alap Kshirsagar, Yali Du, Guang Chen, Yaodong Yang, Jan Peters, Alois C. Knoll:
A Human-Centered Safe Robot Reinforcement Learning Framework with Interactive Behaviors. CoRR abs/2302.13137 (2023) - [i83]Chenguang Wang, Zhouliang Yu, Stephen McAleer, Tianshu Yu, Yaodong Yang:
ASP: Learn a Universal Neural Solver! CoRR abs/2303.00466 (2023) - [i82]Weikang Wan, Haoran Geng, Yun Liu, Zikang Shan, Yaodong Yang, Li Yi, He Wang:
UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning. CoRR abs/2304.00464 (2023) - [i81]Sirui Chen, Zhaowei Zhang, Yali Du, Yaodong Yang:
STAS: Spatial-Temporal Return Decomposition for Multi-agent Reinforcement Learning. CoRR abs/2304.07520 (2023) - [i80]Yifan Zhong, Jakub Grudzien Kuba, Siyi Hu, Jiaming Ji, Yaodong Yang:
Heterogeneous-Agent Reinforcement Learning. CoRR abs/2304.09870 (2023) - [i79]Xiaohang Tang, Le Cong Dinh, Stephen Marcus McAleer, Yaodong Yang:
Regret-Minimizing Double Oracle for Extensive-Form Games. CoRR abs/2304.10498 (2023) - [i78]Jiaming Ji, Jiayi Zhou, Borong Zhang, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, Yaodong Yang:
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research. CoRR abs/2305.09304 (2023) - [i77]Simin Li, Jun Guo, Jingqiao Xiu, Xini Yu, Jiakai Wang, Aishan Liu, Yaodong Yang, Xianglong Liu:
Byzantine Robust Cooperative Multi-Agent Reinforcement Learning as a Bayesian Game. CoRR abs/2305.12872 (2023) - [i76]Zhaowei Zhang, Nian Liu, Siyuan Qi, Ceyao Zhang, Ziqi Rong, Song-Chun Zhu, Shuguang Cui, Yaodong Yang:
Heterogeneous Value Evaluation for Large Language Models. CoRR abs/2305.17147 (2023) - [i75]Yonggang Jin, Chenxu Wang, Liuyu Xiang, Yaodong Yang, Jie Fu, Zhaofeng He:
Deep Reinforcement Learning with Multitask Episodic Memory Based on Task-Conditioned Hypernetwork. CoRR abs/2306.10698 (2023) - [i74]Jiarong Liu, Yifan Zhong, Siyi Hu, Haobo Fu, Qiang Fu, Xiaojun Chang, Yaodong Yang:
Maximum Entropy Heterogeneous-Agent Mirror Learning. CoRR abs/2306.10715 (2023) - [i73]Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Haifeng Zhang, Weinan Zhang:
Large Sequence Models for Sequential Decision-Making: A Survey. CoRR abs/2306.13945 (2023) - [i72]Jian Yao, Weiming Liu, Haobo Fu, Yaodong Yang, Stephen McAleer, Qiang Fu, Wei Yang:
Policy Space Diversity for Non-Transitive Games. CoRR abs/2306.16884 (2023) - [i71]Jiaming Ji, Mickel Liu, Juntao Dai, Xuehai Pan, Chi Zhang, Ce Bian, Boyuan Zhang, Ruiyang Sun, Yizhou Wang, Yaodong Yang:
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset. CoRR abs/2307.04657 (2023) - [i70]Weidong Huang, Jiaming Ji, Borong Zhang, Chunhe Xia, Yaodong Yang:
Safe DreamerV3: Safe Reinforcement Learning with World Models. CoRR abs/2307.07176 (2023) - [i69]Chuming Li, Ruonan Jia, Jie Liu, Yinmin Zhang, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang:
Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning. CoRR abs/2307.12933 (2023) - [i68]Yang Li, Kun Xiong, Yingping Zhang, Jiangcheng Zhu, Stephen McAleer, Wei Pan, Jun Wang, Zonghong Dai, Yaodong Yang:
JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games. CoRR abs/2308.04719 (2023) - [i67]Ceyao Zhang, Kaijie Yang, Siyi Hu, Zihao Wang, Guanghe Li, Yihang Sun, Cheng Zhang, Zhaowei Zhang, Anji Liu, Song-Chun Zhu, Xiaojun Chang, Junge Zhang, Feng Yin, Yitao Liang, Yaodong Yang:
ProAgent: Building Proactive Cooperative AI with Large Language Models. CoRR abs/2308.11339 (2023) - [i66]Jingbang Chen, Yian Wang, Xingwei Qu, Shuangjia Zheng, Yaodong Yang, Hao Dong, Jie Fu:
Mixup-Augmented Meta-Learning for Sample-Efficient Fine-Tuning of Protein Simulators. CoRR abs/2308.15116 (2023) - [i65]Binghao Huang, Yuanpei Chen, Tianyu Wang, Yuzhe Qin, Yaodong Yang, Nikolay Atanasov, Xiaolong Wang:
Dynamic Handover: Throw and Catch with Bimanual Hands. CoRR abs/2309.05655 (2023) - [i64]Chengdong Ma, Ziran Yang, Minquan Gao, Hai Ci, Jun Gao, Xuehai Pan, Yaodong Yang:
Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models. CoRR abs/2310.00322 (2023) - [i63]Zhaowei Zhang, Fengshuo Bai, Jun Gao, Yaodong Yang:
Measuring Value Understanding in Language Models through Discriminator-Critique Gap. CoRR abs/2310.00378 (2023) - [i62]Hanjing Wang, Man-Kit Sit, Congjie He, Ying Wen, Weinan Zhang, Jun Wang, Yaodong Yang, Luo Mai:
GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models. CoRR abs/2310.05205 (2023) - [i61]Simin Li, Ruixiao Xu, Jun Guo, Pu Feng, Jiakai Wang, Aishan Liu, Yaodong Yang, Xianglong Liu, Weifeng Lv:
MIR2: Towards Provably Robust Multi-Agent Reinforcement Learning by Mutual Information Regularization. CoRR abs/2310.09833 (2023) - [i60]Jie Liu, Yinmin Zhang, Chuming Li, Chao Yang, Yaodong Yang, Yu Liu, Wanli Ouyang:
Masked Pretraining for Multi-Agent Decision Making. CoRR abs/2310.11846 (2023) - [i59]Jiaming Ji, Borong Zhang, Jiayi Zhou, Xuehai Pan, Weidong Huang, Ruiyang Sun, Yiran Geng, Yifan Zhong, Juntao Dai, Yaodong Yang:
Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark. CoRR abs/2310.12567 (2023) - [i58]Josef Dai, Xuehai Pan, Ruiyang Sun, Jiaming Ji, Xinbo Xu, Mickel Liu, Yizhou Wang, Yaodong Yang:
Safe RLHF: Safe Reinforcement Learning from Human Feedback. CoRR abs/2310.12773 (2023) - [i57]Yuyang Li, Bo Liu, Yiran Geng, Puhao Li, Yaodong Yang, Yixin Zhu, Tengyu Liu, Siyuan Huang:
Grasp Multiple Objects with One Hand. CoRR abs/2310.15599 (2023) - [i56]Jiaming Ji, Tianyi Qiu, Boyuan Chen, Borong Zhang, Hantao Lou, Kaile Wang, Yawen Duan, Zhonghao He, Jiayi Zhou, Zhaowei Zhang, Fanzhi Zeng, Kwan Yee Ng, Juntao Dai, Xuehai Pan, Aidan O'Gara, Yingshan Lei, Hua Xu, Brian Tse, Jie Fu, Stephen McAleer, Yaodong Yang, Yizhou Wang, Song-Chun Zhu, Yike Guo, Wen Gao:
AI Alignment: A Comprehensive Survey. CoRR abs/2310.19852 (2023) - [i55]Zihao Wang, Shaofei Cai, Anji Liu, Yonggang Jin, Jinbing Hou, Bowei Zhang, Haowei Lin, Zhaofeng He, Zilong Zheng, Yaodong Yang, Xiaojian Ma, Yitao Liang:
JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models. CoRR abs/2311.05997 (2023) - [i54]Yinmin Zhang, Jie Liu, Chuming Li, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang:
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning. CoRR abs/2312.07685 (2023)
- 2022
- [j5]Ricky Sanjaya, Jun Wang, Yaodong Yang:
Measuring the Non-Transitivity in Chess. Algorithms 15(5): 152 (2022) - [j4]Qingduo Zeng, Qiang Zhang, Shancun Liu, Yaodong Yang:
Illiquidity Comovement and Market Crisis. J. Syst. Sci. Complex. 35(5): 1863-1874 (2022) - [j3]Le Cong Dinh, Stephen Marcus McAleer, Zheng Tian, Nicolas Perez Nieves, Oliver Slumbers, David Henry Mguni, Jun Wang, Haitham Bou-Ammar, Yaodong Yang:
Online Double Oracle. Trans. Mach. Learn. Res. 2022 (2022) - [c32]Ying Wen, Hui Chen, Yaodong Yang, Minne Li, Zheng Tian, Xu Chen, Jun Wang:
A Game-Theoretic Approach to Multi-agent Trust Region Optimization. DAI 2022: 74-87 - [c31]Jakub Grudzien Kuba, Ruiqing Chen, Muning Wen, Ying Wen, Fanglei Sun, Jun Wang, Yaodong Yang:
Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning. ICLR 2022 - [c30]David Henry Mguni, Taher Jafferjee, Jianhong Wang, Nicolas Perez Nieves, Oliver Slumbers, Feifei Tong, Yang Li, Jiangcheng Zhu, Yaodong Yang, Jun Wang:
LIGS: Learnable Intrinsic-Reward Generation Selection for Multi-Agent Learning. ICLR 2022 - [c29]Yurong Chen, Xiaotie Deng, Chenchen Li, David Mguni, Jun Wang, Xiang Yan, Yaodong Yang:
On the Convergence of Fictitious Play: A Decomposition Approach. IJCAI 2022: 179-185 - [c28]Yali Du, Chengdong Ma, Yuchen Liu, Runji Lin, Hao Dong, Jun Wang, Yaodong Yang:
Scalable Model-based Policy Optimization for Decentralized Networked Systems. IROS 2022: 9019-9026 - [c27]Bo Liu, Xidong Feng, Jie Ren, Luo Mai, Rui Zhu, Haifeng Zhang, Jun Wang, Yaodong Yang:
A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning. NeurIPS 2022 - [c26]Yuanpei Chen, Tianhao Wu, Shengjie Wang, Xidong Feng, Jiechuan Jiang, Zongqing Lu, Stephen McAleer, Hao Dong, Song-Chun Zhu, Yaodong Yang:
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning. NeurIPS 2022 - [c25]Runze Liu, Fengshuo Bai, Yali Du, Yaodong Yang:
Meta-Reward-Net: Implicitly Differentiable Reward Learning for Preference-based Reinforcement Learning. NeurIPS 2022 - [c24]Xuehai Pan, Mickel Liu, Fangwei Zhong, Yaodong Yang, Song-Chun Zhu, Yizhou Wang:
MATE: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control. NeurIPS 2022 - [c23]Muning Wen, Jakub Grudzien Kuba, Runji Lin, Weinan Zhang, Ying Wen, Jun Wang, Yaodong Yang:
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem. NeurIPS 2022 - [c22]Long Yang, Jiaming Ji, Juntao Dai, Linrui Zhang, Binbin Zhou, Pengfei Li, Yaodong Yang, Gang Pan:
Constrained Update Projection Approach to Safe Policy Optimization. NeurIPS 2022 - [c21]Zhitao Zhu, Shijing Si, Jianzong Wang, Yaodong Yang, Jing Xiao:
Debias the Black-Box: A Fair Ranking Framework via Knowledge Distillation. WISE 2022: 395-405 - [i53]Ming Zhou, Jingxiao Chen, Ying Wen, Weinan Zhang, Yaodong Yang, Yong Yu:
Efficient Policy Space Response Oracles. CoRR abs/2202.00633 (2022) - [i52]Juliusz Krysztof Ziomek, Jun Wang, Yaodong Yang:
Settling the Communication Complexity for Distributed Offline Reinforcement Learning. CoRR abs/2202.04862 (2022) - [i51]Zehao Dou, Jakub Grudzien Kuba, Yaodong Yang:
Understanding Value Decomposition Algorithms in Deep Cooperative Multi-Agent Reinforcement Learning. CoRR abs/2202.04868 (2022) - [i50]Yurong Chen, Xiaotie Deng, Chenchen Li, David Mguni, Jun Wang, Xiang Yan, Yaodong Yang:
On the Convergence of Fictitious Play: A Decomposition Approach. CoRR abs/2205.01469 (2022) - [i49]Shangding Gu, Long Yang, Yali Du, Guang Chen, Florian Walter, Jun Wang, Yaodong Yang, Alois C. Knoll:
A Review of Safe Reinforcement Learning: Methods, Theory and Applications. CoRR abs/2205.10330 (2022) - [i48]Muning Wen, Jakub Grudzien Kuba, Runji Lin, Weinan Zhang, Ying Wen, Jun Wang, Yaodong Yang:
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem. CoRR abs/2205.14953 (2022) - [i47]Oliver Slumbers, David Henry Mguni, Stephen McAleer, Jun Wang, Yaodong Yang:
Learning Risk-Averse Equilibria in Multi-Agent Systems. CoRR abs/2205.15434 (2022) - [i46]Yuanpei Chen, Yaodong Yang, Tianhao Wu, Shengjie Wang, Xidong Feng, Jiechuang Jiang, Stephen Marcus McAleer, Hao Dong, Zongqing Lu, Song-Chun Zhu:
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning. CoRR abs/2206.08686 (2022) - [i45]Yali Du, Chengdong Ma, Yuchen Liu, Runji Lin, Hao Dong, Jun Wang, Yaodong Yang:
Fully Decentralized Model-based Policy Optimization for Networked Systems. CoRR abs/2207.06559 (2022) - [i44]Jakub Grudzien Kuba, Xidong Feng, Shiyao Ding, Hao Dong, Jun Wang, Yaodong Yang:
Heterogeneous-Agent Mirror Learning: A Continuum of Solutions to Cooperative MARL. CoRR abs/2208.01682 (2022) - [i43]Zhitao Zhu, Shijing Si, Jianzong Wang, Yaodong Yang, Jing Xiao:
Debias the Black-box: A Fair Ranking Framework via Knowledge Distillation. CoRR abs/2208.11628 (2022) - [i42]Long Yang, Jiaming Ji, Juntao Dai, Linrui Zhang, Binbin Zhou, Pengfei Li, Yaodong Yang, Gang Pan:
Constrained Update Projection Approach to Safe Policy Optimization. CoRR abs/2209.07089 (2022) - [i41]Yiran Geng, Boshi An, Haoran Geng, Yuanpei Chen, Yaodong Yang, Hao Dong:
End-to-End Affordance Learning for Robotic Manipulation. CoRR abs/2209.12941 (2022) - [i40]Puhao Li, Tengyu Liu, Yuyang Li, Yiran Geng, Yixin Zhu, Yaodong Yang, Siyuan Huang:
GenDexGrasp: Generalizable Dexterous Grasping. CoRR abs/2210.00722 (2022) - [i39]Huanzhou Zhu, Bo Zhao, Gang Chen, Weifeng Chen, Yijie Chen, Liang Shi, Yaodong Yang, Peter R. Pietzuch, Lei Chen:
MSRL: Distributed Reinforcement Learning with Dataflow Fragments. CoRR abs/2210.00882 (2022) - [i38]Siyi Hu, Yifan Zhong, Minquan Gao, Weixun Wang, Hao Dong, Zhihui Li, Xiaodan Liang, Xiaojun Chang, Yaodong Yang:
MARLlib: Extending RLlib for Multi-agent Reinforcement Learning. CoRR abs/2210.13708 (2022) - [i37]Jie Ren, Xidong Feng, Bo Liu, Xuehai Pan, Yao Fu, Luo Mai, Yaodong Yang:
TorchOpt: An Efficient Library for Differentiable Optimization. CoRR abs/2211.06934 (2022) - [i36]Runji Lin, Ye Li, Xidong Feng, Zhaowei Zhang, Xian Hong Wu Fung, Haifeng Zhang, Jun Wang, Yali Du, Yaodong Yang:
Contextual Transformer for Offline Meta Reinforcement Learning. CoRR abs/2211.08016 (2022) - [i35]Chuming Li, Jie Liu, Yinmin Zhang, Yuhong Wei, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang:
ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency. CoRR abs/2211.16068 (2022)
- 2021
- [b1]Yaodong Yang:
Many-agent reinforcement learning. University College London (University of London), UK, 2021 - [c20]David Henry Mguni, Yutong Wu, Yali Du, Yaodong Yang, Ziyi Wang, Minne Li, Ying Wen, Joel Jennings, Jun Wang:
Learning in Nonzero-Sum Stochastic Games with Potentials. ICML 2021: 7688-7699 - [c19]Nicolas Perez Nieves, Yaodong Yang, Oliver Slumbers, David Henry Mguni, Ying Wen, Jun Wang:
Modelling Behavioural Diversity for Learning in Open-Ended Games. ICML 2021: 8514-8524 - [c18]Vittorio Caggiano, Guillaume Durandau, Huawei Wang, Alberto Silvio Chiappa, Alexander Mathis, Pablo Tano, Nisheet Patel, Alexandre Pouget, Pierre Schumacher, Georg Martius, Daniel F. B. Haeufle, Yiran Geng, Boshi An, Yifan Zhong, Jiaming Ji, Yuanpei Chen, Hao Dong, Yaodong Yang, Rahul Siripurapu, Luis Eduardo Ferro Diez, Michael Kopp, Vihang Patil, Sepp Hochreiter, Yuval Tassa, Josh Merel, Randy Schultheis, Seungmoon Song, Massimo Sartori, Vikash Kumar:
MyoChallenge 2022: Learning contact-rich manipulation using a musculoskeletal hand. NeurIPS (Competition and Demos) 2021: 233-250 - [c17]Xiangyu Liu, Hangtian Jia, Ying Wen, Yujing Hu, Yingfeng Chen, Changjie Fan, Zhipeng Hu, Yaodong Yang:
Towards Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games. NeurIPS 2021: 941-952 - [c16]Xidong Feng, Oliver Slumbers, Ziyu Wan, Bo Liu, Stephen McAleer, Ying Wen, Jun Wang, Yaodong Yang:
Neural Auto-Curricula in Two-Player Zero-Sum Games. NeurIPS 2021: 3504-3517 - [c15]Jakub Grudzien Kuba, Muning Wen, Linghui Meng, Shangding Gu, Haifeng Zhang, David Mguni, Jun Wang, Yaodong Yang:
Settling the Variance of Multi-Agent Policy Gradients. NeurIPS 2021: 13458-13470 - [i34]Le Cong Dinh, Yaodong Yang, Zheng Tian, Nicolas Perez Nieves, Oliver Slumbers, David Henry Mguni, Haitham Bou-Ammar, Jun Wang:
Online Double Oracle. CoRR abs/2103.07780 (2021) - [i33]Nicolas Perez Nieves, Yaodong Yang, Oliver Slumbers, David Henry Mguni, Jun Wang:
Modelling Behavioural Diversity for Learning in Open-Ended Games. CoRR abs/2103.07927 (2021) - [i32]David Mguni, Jianhong Wang, Taher Jafferjee, Nicolas Perez Nieves, Wenbin Song, Yaodong Yang, Feifei Tong, Hui Chen, Jiangcheng Zhu, Yali Du, Jun Wang:
Learning to Shape Rewards using a Game of Switching Controls. CoRR abs/2103.09159 (2021) - [i31]David Mguni, Yutong Wu, Yali Du, Yaodong Yang, Ziyi Wang, Minne Li, Ying Wen, Joel Jennings, Jun Wang:
Learning in Nonzero-Sum Stochastic Games with Potentials. CoRR abs/2103.09284 (2021) - [i30]Xidong Feng, Oliver Slumbers, Yaodong Yang, Ziyu Wan, Bo Liu, Stephen McAleer, Ying Wen, Jun Wang:
Discovering Multi-Agent Auto-Curricula in Two-Player Zero-Sum Games. CoRR abs/2106.02745 (2021) - [i29]Xiangyu Liu, Hangtian Jia, Ying Wen, Yaodong Yang, Yujing Hu, Yingfeng Chen, Changjie Fan, Zhipeng Hu:
Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games. CoRR abs/2106.04958 (2021) - [i28]Ying Wen, Hui Chen, Yaodong Yang, Zheng Tian, Minne Li, Xu Chen, Jun Wang:
A Game-Theoretic Approach to Multi-Agent Trust Region Optimization. CoRR abs/2106.06828 (2021) - [i27]Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Weinan Zhang, Jun Wang:
MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning. CoRR abs/2106.07551 (2021) - [i26]Jakub Grudzien Kuba, Muning Wen, Yaodong Yang, Linghui Meng, Shangding Gu, Haifeng Zhang, David Henry Mguni, Jun Wang:
Settling the Variance of Multi-Agent Policy Gradients. CoRR abs/2108.08612 (2021) - [i25]Xiaotie Deng, Yuhao Li, David Henry Mguni, Jun Wang, Yaodong Yang:
On the Complexity of Computing Markov Perfect Equilibrium in General-Sum Stochastic Games. CoRR abs/2109.01795 (2021) - [i24]Yixin Wu, Rui Luo, Chen Zhang, Jun Wang, Yaodong Yang:
Revisiting the Characteristics of Stochastic Gradient Noise and Dynamics. CoRR abs/2109.09833 (2021) - [i23]Jakub Grudzien Kuba, Ruiqing Chen, Muning Wen, Ying Wen, Fanglei Sun, Jun Wang, Yaodong Yang:
Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning. CoRR abs/2109.11251 (2021) - [i22]Shangding Gu, Jakub Grudzien Kuba, Muning Wen, Ruiqing Chen, Ziyan Wang, Zheng Tian, Jun Wang, Alois C. Knoll, Yaodong Yang:
Multi-Agent Constrained Policy Optimisation. CoRR abs/2110.02793 (2021) - [i21]Le Cong Dinh, David Henry Mguni, Long Tran-Thanh, Jun Wang, Yaodong Yang:
Online Markov Decision Processes with Non-oblivious Strategic Adversary. CoRR abs/2110.03604 (2021) - [i20]Ricky Sanjaya, Jun Wang, Yaodong Yang:
Measuring the Non-Transitivity in Chess. CoRR abs/2110.11737 (2021) - [i19]David Mguni, Joel Jennings, Taher Jafferjee, Aivar Sootla, Yaodong Yang, Changmin Yu, Usman Islam, Ziyan Wang, Jun Wang:
DESTA: A Framework for Safe Reinforcement Learning with Markov Games of Intervention. CoRR abs/2110.14468 (2021) - [i18]Chenguang Wang, Yaodong Yang, Oliver Slumbers, Congying Han, Tiande Guo, Haifeng Zhang, Jun Wang:
A Game-Theoretic Approach for Improving Generalization Ability of TSP Solvers. CoRR abs/2110.15105 (2021) - [i17]David Henry Mguni, Taher Jafferjee, Jianhong Wang, Nicolas Perez Nieves, Oliver Slumbers, Feifei Tong, Yang Li, Jiangcheng Zhu, Yaodong Yang, Jun Wang:
LIGS: Learnable Intrinsic-Reward Generation Selection for Multi-Agent Learning. CoRR abs/2112.02618 (2021) - [i16]Linghui Meng, Muning Wen, Yaodong Yang, Chenyang Le, Xiyun Li, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Bo Xu:
Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks. CoRR abs/2112.02845 (2021) - [i15]Bo Liu, Xidong Feng, Haifeng Zhang, Jun Wang, Yaodong Yang:
Settling the Bias and Variance of Meta-Gradient Estimation for Meta-Reinforcement Learning. CoRR abs/2112.15400 (2021) - [i14]Xiaotie Deng, Yuhao Li, David Mguni, Jun Wang, Yaodong Yang:
On the Complexity of Computing Markov Perfect Equilibrium in General-Sum Stochastic Games. Electron. Colloquium Comput. Complex. TR21 (2021)
- 2020
- [j2]Alisa Kim, Yaodong Yang, Stefan Lessmann, Tiejun Ma, Ming-Chien Sung, Johnnie E. V. Johnson:
Can deep learning predict risky retail investors? A case study in financial risk behavior forecasting. Eur. J. Oper. Res. 283(1): 217-234 (2020) - [j1]Qiang Zhang, Chao Wang, Shancun Liu, Yaodong Yang:
Order Execution Probability and Order Queue in Limit Order Markets. J. Syst. Sci. Complex. 33(5): 1545-1557 (2020) - [c14]Haifeng Zhang, Weizhe Chen, Zeren Huang, Minne Li, Yaodong Yang, Weinan Zhang, Jun Wang:
Bi-Level Actor-Critic for Multi-Agent Coordination. AAAI 2020: 7325-7332 - [c13]Yaodong Yang, Rasul Tutunov, Phu Sakulwongtana, Haitham Bou-Ammar:
α^α-Rank: Practically Scaling α-Rank through Stochastic Optimisation. AAMAS 2020: 1575-1583 - [c12]Zhaoqing Peng, Junqi Jin, Lan Luo, Yaodong Yang, Rui Luo, Jun Wang, Weinan Zhang, Miao Xu, Chuan Yu, Tiejian Luo, Han Li, Jian Xu, Kun Gai:
Sequential Advertising Agent with Interpretable User Hidden Intents. AAMAS 2020: 1966-1968 - [c11]Zhaoqing Peng, Junqi Jin, Lan Luo, Yaodong Yang, Rui Luo, Jun Wang, Weinan Zhang, Haiyang Xu, Miao Xu, Chuan Yu, Tiejian Luo, Han Li, Jian Xu, Kun Gai:
Learning to Infer User Hidden States for Online Sequential Advertising. CIKM 2020: 2677-2684 - [c10]Yaodong Yang, Ying Wen, Jun Wang, Liheng Chen, Kun Shao, David Mguni, Weinan Zhang:
Multi-Agent Determinantal Q-Learning. ICML 2020: 10757-10766 - [c9]Ying Wen, Yaodong Yang, Jun Wang:
Modelling Bounded Rationality in Multi-Agent Interactions by Generalized Recursive Reasoning. IJCAI 2020: 414-421 - [c8]Rui Luo, Qiang Zhang, Yaodong Yang, Jun Wang:
Replica-Exchange Nosé-Hoover Dynamics for Bayesian Learning on Large Datasets. NeurIPS 2020 - [i13]Yaodong Yang, Ying Wen, Liheng Chen, Jun Wang, Kun Shao, David Mguni, Weinan Zhang:
Multi-Agent Determinantal Q-Learning. CoRR abs/2006.01482 (2020) - [i12]Zhaoqing Peng, Junqi Jin, Lan Luo, Yaodong Yang, Rui Luo, Jun Wang, Weinan Zhang, Haiyang Xu, Miao Xu, Chuan Yu, Tiejian Luo, Han Li, Jian Xu, Kun Gai:
Learning to Infer User Hidden States for Online Sequential Advertising. CoRR abs/2009.01453 (2020) - [i11]Yaodong Yang, Jun Wang:
An Overview of Multi-Agent Reinforcement Learning from Game Theoretical Perspective. CoRR abs/2011.00583 (2020)
2010 – 2019
- 2019
- [c7]Ming Zhou, Yong Chen, Ying Wen, Yaodong Yang, Yufeng Su, Weinan Zhang, Dell Zhang, Jun Wang:
Factorized Q-learning for large-scale multi-agent systems. DAI 2019: 7:1-7:7 - [c6]Yaodong Yang, Rui Luo, Yuanyuan Liu:
Adversarial Variational Bayes Methods for Tweedie Compound Poisson Mixed Models. ICASSP 2019: 3377-3381 - [c5]Ying Wen, Yaodong Yang, Rui Luo, Jun Wang, Wei Pan:
Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning. ICLR (Poster) 2019 - [c4]Minne Li, Zhiwei (Tony) Qin, Yan Jiao, Yaodong Yang, Jun Wang, Chenxi Wang, Guobin Wu, Jieping Ye:
Efficient Ridesharing Order Dispatching with Mean Field Multi-Agent Reinforcement Learning. WWW 2019: 983-994 - [i10]Ying Wen, Yaodong Yang, Rui Luo, Jun Wang, Wei Pan:
Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning. CoRR abs/1901.09207 (2019) - [i9]Ying Wen, Yaodong Yang, Rui Lu, Jun Wang:
Multi-Agent Generalized Recursive Reasoning. CoRR abs/1901.09216 (2019) - [i8]Minne Li, Zhiwei (Tony) Qin, Yan Jiao, Yaodong Yang, Zhichen Gong, Jun Wang, Chenxi Wang, Guobin Wu, Jieping Ye:
Efficient Ridesharing Order Dispatching with Mean Field Multi-Agent Reinforcement Learning. CoRR abs/1901.11454 (2019) - [i7]Rui Luo, Qiang Zhang, Yaodong Yang, Jun Wang:
Replica-exchange Nosé-Hoover dynamics for Bayesian learning on large datasets. CoRR abs/1905.12569 (2019) - [i6]Haifeng Zhang, Weizhe Chen, Zeren Huang, Minne Li, Yaodong Yang, Weinan Zhang, Jun Wang:
Bi-level Actor-Critic for Multi-agent Coordination. CoRR abs/1909.03510 (2019)
- 2018
- [c3]Yaodong Yang, Lantao Yu, Yiwei Bai, Ying Wen, Weinan Zhang, Jun Wang:
A Study of AI Population Dynamics with Million-agent Reinforcement Learning. AAMAS 2018: 2133-2135 - [c2]Yaodong Yang, Rui Luo, Minne Li, Ming Zhou, Weinan Zhang, Jun Wang:
Mean Field Multi-Agent Reinforcement Learning. ICML 2018: 5567-5576 - [c1]Rui Luo, Jianhong Wang, Yaodong Yang, Jun Wang, Zhanxing Zhu:
Thermostat-assisted continuously-tempered Hamiltonian Monte Carlo for Bayesian learning. NeurIPS 2018: 10696-10705 - [i5]Yaodong Yang, Rui Luo, Minne Li, Ming Zhou, Weinan Zhang, Jun Wang:
Mean Field Multi-Agent Reinforcement Learning. CoRR abs/1802.05438 (2018) - [i4]Yong Chen, Ming Zhou, Ying Wen, Yaodong Yang, Yufeng Su, Weinan Zhang, Dell Zhang, Jun Wang, Han Liu:
Factorized Q-Learning for Large-Scale Multi-Agent Systems. CoRR abs/1809.03738 (2018) - [i3]Qiang Zhang, Rui Luo, Yaodong Yang, Yuanyuan Liu:
Benchmarking Deep Sequential Models on Volatility Predictions for Financial Time Series. CoRR abs/1811.03711 (2018)
- 2017
- [i2]Peng Peng, Quan Yuan, Ying Wen, Yaodong Yang, Zhenkun Tang, Haitao Long, Jun Wang:
Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games. CoRR abs/1703.10069 (2017) - [i1]Yaodong Yang, Lantao Yu, Yiwei Bai, Jun Wang, Weinan Zhang, Ying Wen, Yong Yu:
An Empirical Study of AI Population Dynamics with Million-agent Reinforcement Learning. CoRR abs/1709.04511 (2017)
last updated on 2024-11-08 20:28 CET by the dblp team
all metadata released as open data under CC0 1.0 license