Extrinsic Evaluation of Cultural Competence in Large Language Models

Bhatt, Shaily; Diaz, Fernando

arXiv:2406.11565 (cs)

[Submitted on 17 Jun 2024 (v1), last revised 3 Oct 2024 (this version, v3)]

Abstract: Productive interactions between diverse users and language technologies require outputs from the latter to be culturally relevant and sensitive. Prior works have evaluated models' knowledge of cultural norms, values, and artifacts, without considering how this knowledge manifests in downstream applications. In this work, we focus on extrinsic evaluation of cultural competence in two text generation tasks, open-ended question answering and story generation. We quantitatively and qualitatively evaluate model outputs when an explicit cue of culture, specifically nationality, is perturbed in the prompts. Although we find that model outputs do vary across nationalities and feature culturally relevant words, we also find weak correlations between text similarity of outputs for different countries and the cultural values of these countries. Finally, we discuss important considerations in designing comprehensive evaluation of cultural competence in user-facing tasks.
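The correlation analysis mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which text similarity measure, cultural value source, or correlation statistic the authors use, so everything below is an assumption made for illustration: token-set Jaccard overlap stands in for text similarity, negative Euclidean distance between hypothetical per-country value vectors stands in for similarity of cultural values, and Spearman rank correlation is computed over all country pairs. The function names are hypothetical and this is not the authors' implementation.

```python
import math
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-set overlap of two texts (assumed stand-in for text similarity)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def value_similarity(u, v) -> float:
    """Negative Euclidean distance between two value vectors (assumed measure)."""
    return -math.dist(u, v)


def ranks(xs):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tied group
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y) -> float:
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_text_vs_values(outputs, values) -> float:
    """Spearman correlation between pairwise output similarity and value similarity.

    outputs: country -> model output for a prompt with that nationality cue
    values:  country -> numeric vector of cultural values (hypothetical input)
    """
    pairs = list(combinations(sorted(outputs), 2))
    text_sims = [jaccard(outputs[a], outputs[b]) for a, b in pairs]
    value_sims = [value_similarity(values[a], values[b]) for a, b in pairs]
    return pearson(ranks(text_sims), ranks(value_sims))
```

Under these assumptions, a coefficient near zero would correspond to the weak relationship the abstract reports, while a coefficient near one would mean that countries with similar value vectors also receive similar outputs.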

Comments:	Accepted to EMNLP Findings 2024
Subjects:	Computation and Language (cs.CL); Computers and Society (cs.CY)
Cite as:	arXiv:2406.11565 [cs.CL]
	(or arXiv:2406.11565v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2406.11565

Submission history

From: Shaily Bhatt
[v1] Mon, 17 Jun 2024 14:03:27 UTC (455 KB)
[v2] Wed, 19 Jun 2024 05:18:56 UTC (455 KB)
[v3] Thu, 3 Oct 2024 19:28:35 UTC (739 KB)
