Style Vectors for Steering Generative Large Language Model

Konen, Kai; Jentzsch, Sophie; Diallo, Diaoulé; Schütt, Peer; Bensch, Oliver; Baff, Roxanne El; Opitz, Dominik; Hecking, Tobias

arXiv:2402.01618 (cs)

[Submitted on 2 Feb 2024]

Abstract: This research explores strategies for steering the output of large language models (LLMs) towards specific styles, such as sentiment, emotion, or writing style, by adding style vectors to the activations of hidden layers during text generation. We show that style vectors can be computed simply from recorded layer activations for input texts in a specific style, in contrast to more complex training-based approaches. Through a series of experiments, we demonstrate the effectiveness of activation engineering using such style vectors to influence the style of generated text in a nuanced and parameterisable way, distinguishing it from prompt engineering. The presented research constitutes a significant step towards developing more adaptive and effective AI-empowered interactive systems.
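
The idea of computing a style vector from recorded activations and adding it to a hidden layer can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the construction here (mean activation of target-style texts minus mean activation of other texts, scaled by a strength parameter) is an assumption, and the function names `mean_vector`, `style_vector`, and `steer` are hypothetical. The numbers are toy values, not activations from a real model.

```python
from statistics import fmean


def mean_vector(activations):
    """Element-wise mean of a list of equal-length activation vectors."""
    return [fmean(column) for column in zip(*activations)]


def style_vector(target_activations, other_activations):
    """Assumed construction: mean activation for target-style texts
    minus mean activation for the remaining texts."""
    target_mean = mean_vector(target_activations)
    other_mean = mean_vector(other_activations)
    return [t - o for t, o in zip(target_mean, other_mean)]


def steer(hidden_state, vector, strength):
    """Add the scaled style vector to one hidden-layer activation."""
    return [h + strength * v for h, v in zip(hidden_state, vector)]


# Toy 3-dimensional "activations" (made-up numbers, not model outputs).
positive = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
negative = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0]]

v = style_vector(positive, negative)
steered = steer([0.5, 0.5, 0.5], v, strength=0.5)
print(v, steered)
```

In this sketch the `strength` argument plays the role of the tunable parameter: a value of 0 leaves the hidden state unchanged, and larger values push it further along the style direction. With a real model, the addition would be applied to a chosen layer's activations during generation.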

Comments:	Will be published as a Findings paper at EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2402.01618 [cs.CL]
	(or arXiv:2402.01618v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2402.01618

Submission history

From: Sophie Jentzsch
[v1] Fri, 2 Feb 2024 18:31:15 UTC (700 KB)
