Large Language Models Can Self-Improve

Huang, Jiaxin; Gu, Shixiang Shane; Hou, Le; Wu, Yuexin; Wang, Xuezhi; Yu, Hongkun; Han, Jiawei

arXiv:2210.11610 (cs)

[Submitted on 20 Oct 2022 (v1), last revised 25 Oct 2022 (this version, v2)]

Abstract: Large Language Models (LLMs) have achieved excellent performance on various tasks. However, fine-tuning an LLM requires extensive supervision. Humans, on the other hand, may improve their reasoning abilities by self-thinking without external inputs. In this work, we demonstrate that an LLM is also capable of self-improving with only unlabeled datasets. We use a pre-trained LLM to generate "high-confidence" rationale-augmented answers for unlabeled questions using Chain-of-Thought prompting and self-consistency, and fine-tune the LLM using those self-generated solutions as target outputs. We show that our approach improves the general reasoning ability of a 540B-parameter LLM (74.4%->82.1% on GSM8K, 78.2%->83.0% on DROP, 90.0%->94.4% on OpenBookQA, and 63.4%->67.9% on ANLI-A3) and achieves state-of-the-art-level performance, without any ground-truth labels. We conduct ablation studies and show that fine-tuning on reasoning is critical for self-improvement.
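
The data-selection step described in the abstract (sample several Chain-of-Thought rationales per unlabeled question, take the majority answer as in self-consistency, and keep the rationales that agree with it as fine-tuning targets) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `majority_answer` and `build_finetune_set`, the `min_confidence` threshold, and the target string format are all assumptions made for this example, and the sampled rationales are assumed to have been generated elsewhere.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical representation: one sampled reasoning path and its final answer.
Sample = Tuple[str, str]  # (rationale, final_answer)


def majority_answer(samples: List[Sample]) -> Tuple[str, float]:
    """Return the most frequent final answer and the fraction of samples voting for it."""
    counts = Counter(answer for _, answer in samples)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(samples)


def build_finetune_set(
    sampled: Dict[str, List[Sample]],
    min_confidence: float = 0.0,  # assumed knob; not a value taken from the paper
) -> List[Dict[str, str]]:
    """Keep rationales that agree with the majority answer for each question."""
    examples = []
    for question, samples in sampled.items():
        if not samples:
            continue  # nothing was sampled for this question
        answer, confidence = majority_answer(samples)
        if confidence < min_confidence:
            continue  # treat low-agreement questions as unreliable
        for rationale, sample_answer in samples:
            if sample_answer == answer:
                # Assumed target format: rationale followed by the final answer.
                examples.append(
                    {"input": question, "target": f"{rationale} The answer is {answer}."}
                )
    return examples


if __name__ == "__main__":
    demo = {
        "What is 2 + 2?": [("2 plus 2 is 4.", "4"), ("Adding gives 4.", "4"), ("I get 5.", "5")],
        "What is 3 - 2?": [("3 minus 2 is 1.", "1"), ("It is 2.", "2")],
    }
    for example in build_finetune_set(demo, min_confidence=0.6):
        print(example)
```

With the demo data and `min_confidence=0.6`, the first question keeps its two agreeing rationales (two of three votes), while the second question is dropped because its two samples disagree. No ground-truth label is consulted at any point; agreement among the model's own samples stands in for confidence.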

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2210.11610 [cs.CL]
	(or arXiv:2210.11610v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2210.11610

Submission history

From: Jiaxin Huang
[v1] Thu, 20 Oct 2022 21:53:54 UTC (174 KB)
[v2] Tue, 25 Oct 2022 17:45:17 UTC (174 KB)
