Soft Actor-Critic with Inhibitory Networks for Faster Retraining

Ide, Jaime S.; Mićović, Daria; Guarino, Michael J.; Alcedo, Kevin; Rosenbluth, David; Pope, Adrian P.

arXiv:2202.02918 (cs)

[Submitted on 7 Feb 2022 (v1), last revised 8 Feb 2022 (this version, v2)]

Authors: Jaime S. Ide, Daria Mićović, Michael J. Guarino, Kevin Alcedo, David Rosenbluth, Adrian P. Pope

Abstract: Reusing previously trained models is critical in deep reinforcement learning to speed up training of new agents. However, it is unclear how to acquire new skills when objectives and constraints are in conflict with previously learned skills. Moreover, when retraining, there is an intrinsic conflict between exploiting what has already been learned and exploring new skills. In soft actor-critic (SAC) methods, a temperature parameter can be dynamically adjusted to weight the action entropy and balance the explore $\times$ exploit trade-off. However, controlling a single coefficient can be challenging within the context of retraining, even more so when goals are contradictory. In this work, inspired by neuroscience research, we propose a novel approach using inhibitory networks to allow separate and adaptive state value evaluations, as well as distinct automatic entropy tuning. Ultimately, our approach allows for controlling inhibition to handle conflict between exploiting less risky, acquired behaviors and exploring novel ones to overcome more challenging tasks. We validate our method through experiments in OpenAI Gym environments.

Comments:	16 pages including Appendix
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
Cite as:	arXiv:2202.02918 [cs.LG]
	(or arXiv:2202.02918v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2202.02918

Submission history

From: Jaime Ide
[v1] Mon, 7 Feb 2022 03:10:34 UTC (2,041 KB)
[v2] Tue, 8 Feb 2022 02:38:35 UTC (2,041 KB)
