Mitigating Outlier Activations in Low-Precision Fine-Tuning of Language Models

Ghaffari, Alireza; Yu, Justin; Nejad, Mahsa Ghazvini; Asgharian, Masoud; Chen, Boxing; Nia, Vahid Partovi

arXiv:2312.09211 (cs)

[Submitted on 14 Dec 2023 (v1), last revised 13 Jan 2024 (this version, v3)]

Abstract: Low-precision fine-tuning of language models has gained prominence as a cost-effective and energy-efficient approach to deploying large-scale models in various applications. However, this approach is susceptible to outlier values in the activations. Outliers degrade low-precision fine-tuning because they inflate the scaling factor, which makes smaller values harder to represent. This paper investigates techniques for mitigating outlier activations in low-precision integer fine-tuning of language models. Our proposed approach represents the outlier activation values in 8-bit integers instead of floating-point (FP16) values. The benefit of using integers for outlier values is that it enables operator tiling, which avoids performing 16-bit integer matrix multiplication. We provide theoretical analysis and supporting experiments to demonstrate the effectiveness of our approach in improving the robustness and performance of low-precision fine-tuned language models.
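
The effect of an outlier on the scaling factor, as described in the abstract, can be illustrated with a small sketch. This is not the paper's method: it assumes a plain symmetric per-tensor absmax quantizer to int8, and the helper names (`quantize_absmax_int8`, `dequantize`, `mean_abs_error`) and the synthetic activations are hypothetical choices made only for illustration.

```python
import numpy as np


def quantize_absmax_int8(x):
    """Symmetric per-tensor quantization: the largest magnitude maps to 127.

    Assumed scheme for illustration only; the paper may use a different one.
    """
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize(q, scale):
    """Map int8 codes back to floating point."""
    return q.astype(np.float32) * scale


def mean_abs_error(x, mask):
    """Mean absolute round-trip error over the entries selected by mask."""
    q, scale = quantize_absmax_int8(x)
    x_hat = dequantize(q, scale)
    return float(np.mean(np.abs(x[mask] - x_hat[mask])))


# Synthetic "activations": roughly unit-variance values, no outlier.
rng = np.random.default_rng(0)
activations = rng.normal(0.0, 1.0, size=1024).astype(np.float32)
all_entries = np.ones(activations.shape, dtype=bool)
err_clean = mean_abs_error(activations, all_entries)

# Same tensor with a single large outlier injected at index 0.
with_outlier = activations.copy()
with_outlier[0] = 100.0
non_outlier = all_entries.copy()
non_outlier[0] = False  # measure error only on the ordinary values
err_outlier = mean_abs_error(with_outlier, non_outlier)

print("scale without outlier:", quantize_absmax_int8(activations)[1])
print("scale with outlier:   ", quantize_absmax_int8(with_outlier)[1])
print("error on ordinary values, without outlier:", err_clean)
print("error on ordinary values, with outlier:   ", err_outlier)
```

In this sketch the single outlier sets the scale to about 100/127, so many ordinary values collapse onto the same few integer codes and their round-trip error grows accordingly.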

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2312.09211 [cs.CL]
	(or arXiv:2312.09211v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2312.09211

Submission history

From: Alireza Ghaffari
[v1] Thu, 14 Dec 2023 18:41:32 UTC (677 KB)
[v2] Fri, 15 Dec 2023 14:46:53 UTC (677 KB)
[v3] Sat, 13 Jan 2024 13:52:16 UTC (696 KB)
