Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Raffel, Colin; Shazeer, Noam; Roberts, Adam; Lee, Katherine; Narang, Sharan; Matena, Michael; Zhou, Yanqi; Li, Wei; Liu, Peter J.

arXiv:1910.10683 (cs)

[Submitted on 23 Oct 2019 (v1), last revised 19 Sep 2023 (this version, v4)]

Abstract: Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
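As a rough illustration of what "a text-to-text format" can mean in practice, the sketch below casts a few different task types into plain (input string, target string) pairs by prepending a task prefix. The function name `to_text_to_text`, the prefix wording, and the sample sentences are assumptions made for this illustration; they are not taken from the released code.

```python
from typing import Tuple


def to_text_to_text(prefix: str, text: str, target: object) -> Tuple[str, str]:
    """Cast one task instance into an (input text, target text) pair.

    Hypothetical helper: the task is identified only by a textual prefix,
    and the label is rendered as a string, so translation, classification,
    and even numeric scoring all share one input/output interface.
    """
    source = f"{prefix}: {text}"
    return source, str(target)


# Illustrative instances (made-up data, assumed prefix wording).
examples = [
    to_text_to_text("translate English to German", "That is good.", "Das ist gut."),
    to_text_to_text("summarize", "Some long article text.", "A short summary."),
    to_text_to_text("sentiment", "A charming film.", "positive"),
    # A numeric label is emitted as text, so the model still only produces strings.
    to_text_to_text("similarity", "sentence1: A cat sits. sentence2: A cat is sitting.", 3.8),
]

for source, target in examples:
    print(f"{source!r} -> {target!r}")
```

Because every task reduces to mapping one string to another, the same model, training objective, and decoding procedure can in principle be reused across all of them, which is what makes a like-for-like comparison of the factors listed in the abstract possible.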

Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL); Machine Learning (stat.ML)
Cite as:	arXiv:1910.10683 [cs.LG]
	(or arXiv:1910.10683v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1910.10683

Submission history

From: Colin Raffel
[v1] Wed, 23 Oct 2019 17:37:36 UTC (499 KB)
[v2] Thu, 24 Oct 2019 15:13:50 UTC (501 KB)
[v3] Tue, 28 Jul 2020 13:10:01 UTC (257 KB)
[v4] Tue, 19 Sep 2023 15:14:48 UTC (258 KB)
