Enhancing the episodic natural actor-critic algorithm by a regularisation term to stabilize learning of control structures | IEEE Conference Publication | IEEE Xplore