Shengyi Huang

2020 – today

2024
[c5]
  - https://dblp.org/rec/conf/iclr/HuangWCLXO24
Shengyi Huang, Jiayi Weng, Rujikorn Charakorn, Min Lin, Zhongwen Xu, Santiago Ontañón:
Cleanba: A Reproducible and Efficient Distributed Reinforcement Learning Platform. ICLR 2024
[i13]
  - https://dblp.org/rec/journals/corr/abs-2402-03046
Shengyi Huang, Quentin Gallouédec, Florian Felten, Antonin Raffin, Rousslan Fernand Julien Dossa, Yanxiao Zhao, Ryan Sullivan, Viktor Makoviychuk, Denys Makoviichuk, Mohamad H. Danesh, Cyril Roumégous, Jiayi Weng, Chufan Chen, Md Masudur Rahman, João G. M. Araújo, Guorui Quan, Daniel Tan, Timo Klein, Rujikorn Charakorn, Mark Towers, Yann Berthelot, Kinal Mehta, Dipam Chakraborty, Arjun KG, Valentin Charraut, Chang Ye, Zichen Liu, Lucas N. Alegre, Alexander Nikulin, Xiao Hu, Tianlin Liu, Jongwook Choi, Brent Yi:
Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning. CoRR abs/2402.03046 (2024)
[i12]
  - https://dblp.org/rec/journals/corr/abs-2403-17031
Shengyi Huang, Michael Noukhovitch, Arian Hosseini, Kashif Rasul, Weixun Wang, Lewis Tunstall:
The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization. CoRR abs/2403.17031 (2024)
2023
[c4]
  - https://dblp.org/rec/conf/nips/SullivanKHDS23
Ryan Sullivan, Akarsh Kumar, Shengyi Huang, John P. Dickerson, Joseph Suarez:
Reward Scale Robustness for Proximal Policy Optimization via DreamerV3 Tricks. NeurIPS 2023
[i11]
  - https://dblp.org/rec/journals/corr/abs-2310-00036
Shengyi Huang, Jiayi Weng, Rujikorn Charakorn, Min Lin, Zhongwen Xu, Santiago Ontañón:
Cleanba: A Reproducible and Efficient Distributed Reinforcement Learning Platform. CoRR abs/2310.00036 (2023)
[i10]
  - https://dblp.org/rec/journals/corr/abs-2310-16944
Lewis Tunstall, Edward Beeching, Nathan Lambert, Nazneen Rajani, Kashif Rasul, Younes Belkada, Shengyi Huang, Leandro von Werra, Clémentine Fourrier, Nathan Habib, Nathan Sarrazin, Omar Sanseviero, Alexander M. Rush, Thomas Wolf:
Zephyr: Direct Distillation of LM Alignment. CoRR abs/2310.16944 (2023)
[i9]
  - https://dblp.org/rec/journals/corr/abs-2310-17805
Ryan Sullivan, Akarsh Kumar, Shengyi Huang, John P. Dickerson, Joseph Suarez:
Reward Scale Robustness for Proximal Policy Optimization via DreamerV3 Tricks. CoRR abs/2310.17805 (2023)
2022
[j2]
  - https://dblp.org/rec/journals/jmlr/HuangDYBCMA22
Shengyi Huang, Rousslan Fernand Julien Dossa, Chang Ye, Jeff Braga, Dipam Chakraborty, Kinal Mehta, João G. M. Araújo:
CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms. J. Mach. Learn. Res. 23: 274:1-274:18 (2022)
[c3]
  - https://dblp.org/rec/conf/flairs/HuangO22
Shengyi Huang, Santiago Ontañón:
A Closer Look at Invalid Action Masking in Policy Gradient Algorithms. FLAIRS 2022
[c2]
  - https://dblp.org/rec/conf/nips/WengLHLMMLSLJXY22
Jiayi Weng, Min Lin, Shengyi Huang, Bo Liu, Denys Makoviichuk, Viktor Makoviychuk, Zichen Liu, Yufan Song, Ting Luo, Yukun Jiang, Zhongwen Xu, Shuicheng Yan:
EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine. NeurIPS 2022
[i8]
  - https://dblp.org/rec/journals/corr/abs-2205-09123
Shengyi Huang, Anssi Kanervisto, Antonin Raffin, Weixun Wang, Santiago Ontañón, Rousslan Fernand Julien Dossa:
A2C is a special case of PPO. CoRR abs/2205.09123 (2022)
[i7]
  - https://dblp.org/rec/journals/corr/abs-2206-10558
Jiayi Weng, Min Lin, Shengyi Huang, Bo Liu, Denys Makoviichuk, Viktor Makoviychuk, Zichen Liu, Yufan Song, Ting Luo, Yukun Jiang, Zhongwen Xu, Shuicheng Yan:
EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine. CoRR abs/2206.10558 (2022)
2021
[j1]
  - https://dblp.org/rec/journals/access/DossaHO021
Rousslan Fernand Julien Dossa, Shengyi Huang, Santiago Ontañón, Takashi Matsubara:
An Empirical Investigation of Early Stopping Optimizations in Proximal Policy Optimization. IEEE Access 9: 117981-117992 (2021)
[c1]
  - https://dblp.org/rec/conf/cig/HuangO0G21
Shengyi Huang, Santiago Ontañón, Chris Bamford, Lukasz Grela:
Gym-µRTS: Toward Affordable Full Game Real-time Strategy Games Research with Deep Reinforcement Learning. CoG 2021: 1-8
[i6]
  - https://dblp.org/rec/journals/corr/abs-2105-13807
Shengyi Huang, Santiago Ontañón, Chris Bamford, Lukasz Grela:
Gym-μRTS: Toward Affordable Full Game Real-time Strategy Games Research with Deep Reinforcement Learning. CoRR abs/2105.13807 (2021)
[i5]
  - https://dblp.org/rec/journals/corr/abs-2111-08819
Shengyi Huang, Rousslan Fernand Julien Dossa, Chang Ye, Jeff Braga:
CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms. CoRR abs/2111.08819 (2021)
2020
[i4]
  - https://dblp.org/rec/journals/corr/abs-2006-14171
Shengyi Huang, Santiago Ontañón:
A Closer Look at Invalid Action Masking in Policy Gradient Algorithms. CoRR abs/2006.14171 (2020)
[i3]
  - https://dblp.org/rec/journals/corr/abs-2010-03956
Shengyi Huang, Santiago Ontañón:
Action Guidance: Getting the Best of Sparse Rewards and Shaped Rewards for Real-time Strategy Games. CoRR abs/2010.03956 (2020)
[i2]
  - https://dblp.org/rec/journals/corr/abs-2011-06363
Chris Bamford, Shengyi Huang, Simon M. Lucas:
Griddly: A platform for AI research in games. CoRR abs/2011.06363 (2020)

2010 – 2019

2019
[i1]
  - https://dblp.org/rec/journals/corr/abs-1910-12134
Shengyi Huang, Santiago Ontañón:
Comparing Observation and Action Representations for Deep Reinforcement Learning in MicroRTS. CoRR abs/1910.12134 (2019)
