Is Attention All What You Need? -- An Empirical Investigation on Convolution-Based Active Memory and Self-Attention

Dowdell, Thomas; Zhang, Hongyu

Computer Science > Machine Learning

arXiv:1912.11959 (cs)

[Submitted on 27 Dec 2019 (v1), last revised 30 Dec 2019 (this version, v2)]

Abstract: The key to a Transformer model is the self-attention mechanism, which allows the model to analyze an entire sequence in a computationally efficient manner. Recent work has suggested the possibility that general attention mechanisms used by RNNs could be replaced by active-memory mechanisms. In this work, we evaluate whether various active-memory mechanisms could replace self-attention in a Transformer. Our experiments suggest that active-memory alone achieves comparable results to the self-attention mechanism for language modelling, but optimal results are mostly achieved by using both active-memory and self-attention mechanisms together. We also note that, for some specific algorithmic tasks, active-memory mechanisms alone outperform both self-attention and a combination of the two.
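
To make the contrast in the abstract concrete, here is a minimal sketch of the two kinds of sequence mixing it compares. This is not the authors' implementation: the abstract does not specify which active-memory mechanisms were evaluated, so the function names, the identity query/key/value projections, and the fixed scalar kernel are all assumptions chosen for illustration. The first function computes single-head scaled dot-product self-attention, where each output position is a content-dependent weighted average of every position. The second applies one convolutional update to all positions at once, which is the general flavour of a convolution-based active memory: every memory cell is rewritten in parallel from a fixed local window.

```python
import math

def self_attention(x):
    """Single-head scaled dot-product self-attention with identity projections.

    x is a list of n vectors (lists of d floats). Queries, keys and values are
    all x itself (an assumption); a real Transformer layer uses learned projections.
    """
    d = len(x[0])
    out = []
    for q in x:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        # Numerically stable softmax over the whole sequence.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Output is a weighted average of all positions in the sequence.
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out

def conv_memory_step(x, kernel):
    """One convolutional update applied to every position at once.

    kernel is an odd-length list of fixed scalar weights (hypothetical values);
    the sequence is zero-padded so the output has the same length as x.
    """
    n, d = len(x), len(x[0])
    half = len(kernel) // 2
    out = []
    for i in range(n):
        row = [0.0] * d
        for offset, w in enumerate(kernel):
            j = i + offset - half
            if 0 <= j < n:  # positions outside the sequence count as zeros
                for c in range(d):
                    row[c] += w * x[j][c]
        out.append(row)
    return out

x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(x)                       # global, content-dependent mixing
updated = conv_memory_step(x, [0.25, 0.5, 0.25])   # local, fixed-weight mixing
```

In this sketch the attention weights change with the input, while the convolution kernel stays the same at every position. A model combining the two, as the abstract reports works best for language modelling, would apply both kinds of mixing, though how the paper combines them is not stated here.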

Comments:	7 pages, 2 figures
Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL); Machine Learning (stat.ML)
MSC classes:	I.2.7
ACM classes:	I.2.7
Cite as:	arXiv:1912.11959 [cs.LG]
	(or arXiv:1912.11959v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1912.11959

Submission history

From: Thomas Dowdell BCom(Hons)
[v1] Fri, 27 Dec 2019 02:01:13 UTC (339 KB)
[v2] Mon, 30 Dec 2019 09:01:18 UTC (339 KB)
