New Approaches to Long Document Summarization: Fourier Transform Based Attention in a Transformer Model

Kiruluta, Andrew; Lemos, Andreas; Lundy, Eric

arXiv:2111.15473 (cs)

[Submitted on 25 Nov 2021]

Abstract: In this work, we extensively redesign the recently introduced method of token mixing using Fourier transforms (FNet) to replace the computationally expensive self-attention mechanism in a full transformer implementation on a long document summarization task (> 512 tokens). As a baseline, we also carried out long document summarization using established methods such as the Longformer and BigBird transformer models, which are capable of processing over 8000 tokens and are currently the state-of-the-art methods for this type of problem. The original FNet paper implemented token mixing in an encoder-only architecture, while abstractive summarization requires both an encoder and a decoder. Since such a pretrained transformer model does not currently exist in the public domain, we implemented a full transformer based on this Fourier token mixing approach in an encoder/decoder architecture, which we trained starting with GloVe embeddings for the individual words in the corpus. We investigated a number of different extensions to the original FNet architecture and evaluated them by their ROUGE F1-score on a summarization task. All modifications performed better on the summarization task than the original FNet encoder used in a transformer architecture.
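
For readers unfamiliar with Fourier token mixing, the sketch below illustrates the basic idea from the original FNet encoder: the self-attention sublayer is replaced by a parameter-free 2D discrete Fourier transform over the sequence and hidden dimensions, of which only the real part is kept. This is a minimal NumPy illustration and not the redesigned encoder/decoder architecture described in the abstract. The function names, the simplified layer normalization without learned scale and shift, and the input sizes are assumptions chosen for illustration.

```python
import numpy as np


def fourier_mix(x: np.ndarray) -> np.ndarray:
    """Parameter-free token mixing in the style of FNet (illustrative).

    x has shape (seq_len, hidden_dim). A 2D DFT is applied across the
    sequence and hidden axes, and only the real part is kept.
    """
    return np.fft.fft2(x).real


def layer_norm(x: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Simplified layer norm over the hidden axis (no learned parameters)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def fnet_style_sublayer(x: np.ndarray) -> np.ndarray:
    """Residual connection around the Fourier mixing step, then layer norm."""
    return layer_norm(x + fourier_mix(x))


# Hypothetical input: 1024 token embeddings of width 64 (a "long" document
# relative to the 512-token limit mentioned in the abstract).
rng = np.random.default_rng(0)
tokens = rng.standard_normal((1024, 64))
mixed = fnet_style_sublayer(tokens)
```

Because the mixing step has no learned weights and can be computed with the fast Fourier transform, its cost grows as O(n log n) in the sequence length n, compared with O(n^2) for standard self-attention. That cost difference is the motivation for applying it to long inputs.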

Comments:	7 pages, 4 figures, 5 tables
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2111.15473 [cs.CL]
	(or arXiv:2111.15473v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2111.15473

Submission history

From: Andrew Kiruluta
[v1] Thu, 25 Nov 2021 18:03:41 UTC (807 KB)
