 mat


'I am valued here': the extraordinary film that recreates a disabled boy's rich digital life

The Guardian

The night after their son Mats died aged just 25, Trude and Robert Steen sat on the sofa in their living room in Oslo with their daughter Mia. "Everything was a blur," remembers Trude of that day 10 years ago. "Then Robert said, 'Maybe we should reach out to Mats' friends in World of Warcraft.'" Mats was born with Duchenne muscular dystrophy, a progressive condition that causes the muscles to weaken gradually. He was diagnosed aged four and started using a wheelchair at 10.


The Remarkable Life of Ibelin review – moving tale of disabled gamer's digital double life

The Guardian

It's probably just an accident of scheduling, but this deeply affecting documentary is arriving just when there's a debate raging at the school gates about children's use of smartphones and social media. So while it's undoubtedly troubling how tech platforms set out to addict and exploit young minds, The Remarkable Life of Ibelin provides a fascinating counterargument about how online gaming at least can be a lifeline for some individuals who find themselves isolated in the real world, or IRL as the kids like to say. Born in 1989, Mats Steen started out like many other Norwegian children of his generation: energetic, sweet-natured, unusually pale. However, his parents Robert and Trude soon discovered that he had Duchenne muscular dystrophy, a genetic condition that eroded his ability to move and breathe and which would eventually kill him at the age of 25. By that point in 2014, Robert, Trude and Mats' sister Mia knew that Mats spent hours of his life online playing World of Warcraft using special equipment to accommodate his disability and had been publishing a blog about his life.



Stationary Activations for Uncertainty Calibration in Deep Learning

Neural Information Processing Systems

We introduce a new family of non-linear neural network activation functions that mimic the properties induced by the widely-used Matérn family of kernels in Gaussian process (GP) models. We show an explicit link to the corresponding GP models in the case that the network consists of one infinitely wide hidden layer. In the limit of infinite smoothness the Matérn family results in the RBF kernel, and in this case we recover RBF activations. Matérn activation functions result in similar appealing properties to their counterparts in GP models, and we demonstrate that the local stationarity property together with limited mean-square differentiability shows both good performance and uncertainty calibration in Bayesian deep learning tasks. In particular, local stationarity helps calibrate out-of-distribution (OOD) uncertainty.
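For orientation, the Matérn kernels that these activations are said to mimic have well-known closed forms at half-integer smoothness. The sketch below is not the paper's activation function: it only evaluates the standard unit-variance kernel for ν in {1/2, 3/2, 5/2} next to the RBF kernel, to show the gap to RBF shrinking as smoothness grows. The function names and the lengthscale parameter `ell` are illustrative assumptions.

```python
import math

def matern(r, nu, ell=1.0):
    """Unit-variance Matern kernel at distance r, for nu in {0.5, 1.5, 2.5}."""
    s = abs(r) / ell
    if nu == 0.5:
        return math.exp(-s)
    if nu == 1.5:
        a = math.sqrt(3.0) * s
        return (1.0 + a) * math.exp(-a)
    if nu == 2.5:
        a = math.sqrt(5.0) * s
        return (1.0 + a + a * a / 3.0) * math.exp(-a)
    raise ValueError("this sketch only implements nu in {0.5, 1.5, 2.5}")

def rbf(r, ell=1.0):
    """RBF (squared-exponential) kernel, the infinite-smoothness limit."""
    s = abs(r) / ell
    return math.exp(-0.5 * s * s)

# Gap between each Matern kernel and the RBF kernel at distance 1.
gaps = [abs(rbf(1.0) - matern(1.0, nu)) for nu in (0.5, 1.5, 2.5)]
```

In this sketch `gaps` decreases with ν, which is consistent with the statement that the Matérn family approaches the RBF kernel as smoothness increases.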





Multi-Agent Reinforcement Learning is A Sequence Modeling Problem

Muning Wen

Neural Information Processing Systems

Large sequence models (SMs) such as the GPT series and BERT have displayed outstanding performance and generalization capabilities in natural language processing, vision and, recently, reinforcement learning. A natural follow-up question is how to abstract multi-agent decision making as a sequence modeling problem as well, and so benefit from the rapid development of SMs. In this paper, we introduce a novel architecture named Multi-Agent Transformer (MAT) that effectively casts cooperative multi-agent reinforcement learning (MARL) into SM problems wherein the objective is to map agents' observation sequences to agents' optimal action sequences. Our goal is to build the bridge between MARL and SMs so that the modeling power of modern sequence models can be unleashed for MARL. Central to our MAT is an encoder-decoder architecture which leverages the multi-agent advantage decomposition theorem to transform the joint policy search problem into a sequential decision making process; this renders only linear time complexity for multi-agent problems and, most importantly, endows MAT with a monotonic performance improvement guarantee. Unlike prior art such as Decision Transformer, which fits only pre-collected offline data, MAT is trained by online trial and error from the environment in an on-policy fashion. To validate MAT, we conduct extensive experiments on StarCraft II, Multi-Agent MuJoCo, Dexterous Hands Manipulation, and Google Research Football benchmarks. Results demonstrate that MAT achieves superior performance and data efficiency compared to strong baselines including MAPPO and HAPPO. Furthermore, we demonstrate that MAT is an excellent few-shot learner on unseen tasks regardless of changes in the number of agents.
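The idea behind the advantage decomposition mentioned in the abstract can be illustrated on a toy problem. The sketch below is not MAT: there is no transformer, the joint action-value table `Q` is random and hypothetical, and the remaining agents are assumed to act uniformly at random. It only shows that when each agent, in turn, picks an action with non-negative advantage conditioned on the earlier agents' choices, the per-agent terms sum to the joint advantage, which is therefore non-negative.

```python
import itertools
import random

N_AGENTS, N_ACTIONS = 3, 4
rng = random.Random(0)

# Hypothetical joint action-value table for one fixed observation.
Q = {joint: rng.uniform(-1.0, 1.0)
     for joint in itertools.product(range(N_ACTIONS), repeat=N_AGENTS)}

def partial_value(prefix):
    """Mean of Q over all completions of `prefix` (uniform remaining agents)."""
    rest = N_AGENTS - len(prefix)
    tails = itertools.product(range(N_ACTIONS), repeat=rest)
    values = [Q[tuple(prefix) + tail] for tail in tails]
    return sum(values) / len(values)

def conditional_advantage(prefix, action):
    """Advantage of `action` for the next agent, given earlier agents' actions."""
    return partial_value(list(prefix) + [action]) - partial_value(prefix)

def decode_sequentially():
    """Each agent in turn picks the action with the largest conditional advantage."""
    prefix, terms = [], []
    for _ in range(N_AGENTS):
        best = max(range(N_ACTIONS),
                   key=lambda a: conditional_advantage(prefix, a))
        terms.append(conditional_advantage(prefix, best))
        prefix.append(best)
    return tuple(prefix), terms

joint_action, terms = decode_sequentially()
# Joint advantage relative to the value of acting uniformly at random.
joint_advantage = Q[joint_action] - partial_value([])
```

Each agent here compares only `N_ACTIONS` options given the prefix, rather than searching all `N_ACTIONS ** N_AGENTS` joint actions. The brute-force `partial_value` is exponential and exists only to keep the toy exact; a learned model would have to stand in for it in practice.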



Appendix A Trainability and Generalization

Neural Information Processing Systems

A.1 Trainability

Following previous work (Jacot et al., 2018) on training neural networks in function space instead of parameter space, we analyze the trainability of an over-parameterized model by investigating the evolution of its predictions. Motivated by Arora et al. (2019), we rewrite the equation in terms of norms and eigenpairs [equation missing in source] and then derive a new form of Eq. 9 [equation missing in source]. As we can see, at every step of gradient descent, the model learns the target function faster along the eigen-directions corresponding to the larger eigenvalues.

Further, assuming a squared error loss, we characterize the loss reduction by a directional derivative (Wang et al., 2020) [equation missing in source]. The directional derivative of the loss function is closely related to the eigenspectrum of mNTKs. Arora et al. (2019) have studied the relationship between the projection norm and labels, and they demonstrate that true labels generate better alignment with top [text truncated in source]

Let θ(0) and θ(t) denote the weights initialized from scratch and the weights after t iterations, respectively. Specifically, a greater weight distance implies a larger Rademacher complexity and is thus associated with weaker generalization ability.
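The statement in A.1 that gradient descent learns faster along eigen-directions with larger eigenvalues can be checked on a toy example. The sketch below is an assumption-laden illustration, not the appendix's derivation: it uses a fixed 2x2 kernel matrix `H` (hypothetical, chosen for its simple eigenpairs) and the standard linearized update u ← u − η H (u − y), under which the residual component along an eigenvector with eigenvalue λ shrinks by a factor (1 − ηλ) per step.

```python
import math

# Toy fixed kernel (an assumption for illustration): symmetric, with
# eigenvalues 3 and 1 and eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
H = [[2.0, 1.0], [1.0, 2.0]]
EIGS = [(3.0, (1 / math.sqrt(2), 1 / math.sqrt(2))),
        (1.0, (1 / math.sqrt(2), -1 / math.sqrt(2)))]

def step(u, y, lr):
    """One update u <- u - lr * H (u - y) on the predictions u."""
    r = [u[0] - y[0], u[1] - y[1]]
    return [u[i] - lr * (H[i][0] * r[0] + H[i][1] * r[1]) for i in range(2)]

def residual_projections(u, y):
    """Absolute projections of the residual u - y onto each eigenvector."""
    r = [u[0] - y[0], u[1] - y[1]]
    return [abs(v[0] * r[0] + v[1] * r[1]) for _, v in EIGS]

def run(y, lr=0.1, steps=10):
    """Track residual projections over `steps` updates, starting from u = 0."""
    u = [0.0, 0.0]
    history = [residual_projections(u, y)]
    for _ in range(steps):
        u = step(u, y, lr)
        history.append(residual_projections(u, y))
    return history

history = run(y=[1.0, 0.0])
```

With the target `[1.0, 0.0]`, both projections start equal, and after ten steps the one along the eigenvalue-3 direction is smaller than the one along the eigenvalue-1 direction, matching the per-step factors 0.7 and 0.9.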