Nov 2, 2022 · We introduce a suite of methods and corresponding statistical tests one can use to assess metrics in light of two goals: dialect robustness and dialect awareness.
In this paper, we introduce a suite of methods to assess whether metrics are dialect robust. These methods show that state-of-the-art metrics are not dialect robust.
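To make the idea concrete, here is a minimal sketch of such a robustness check, not the paper's actual protocol: the placeholder token-overlap metric, the toy triples, and the choice of a Wilcoxon signed-rank test are all illustrative assumptions. The intuition is that a dialect-robust metric should score a semantics-preserving dialect variant at least as high as a semantics-changing perturbation of the same reference.

```python
# Minimal sketch of a dialect-robustness check for a text-generation metric.
# Assumptions (not from the paper): `metric` is any (candidate, reference)
# scorer; here a toy token-overlap F1 stands in for BLEU/BLEURT/COMET, and
# the triples below are illustrative, not the paper's benchmark data.
from scipy.stats import wilcoxon

def metric(candidate: str, reference: str) -> float:
    """Placeholder scorer: unigram-overlap F1 over lowercased tokens."""
    cand, ref = set(candidate.lower().split()), set(reference.lower().split())
    overlap = len(cand & ref)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(cand), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

# Each triple: reference, semantics-preserving dialect variant,
# semantics-changing perturbation.
examples = [
    ("She is not going to the market today.",
     "She ain't going to the market today.",       # dialect variant
     "She is going to the market every day."),     # meaning changed
    ("He was playing outside all day.",
     "He been playing outside all day.",
     "He was sleeping inside all day."),
    ("They are coming to the party tonight.",
     "They're coming to the party tonight, innit.",
     "They are leaving the party tonight."),
]

variant_scores = [metric(v, r) for r, v, _ in examples]
perturbed_scores = [metric(p, r) for r, _, p in examples]

# Paired one-sided test: a dialect-robust metric should rank dialect
# variants above semantic perturbations (use many more examples in practice).
stat, p_value = wilcoxon(variant_scores, perturbed_scores, alternative="greater")
print(f"Wilcoxon statistic={stat}, p={p_value:.3f}")
```

Note that even this toy overlap metric can rank a meaning-changing edit above a dialect variant when the variant's surface forms diverge from the reference, which is exactly the failure mode the paper attributes to state-of-the-art metrics.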
TLDR: Evaluation metrics that are not robust to dialect variation make it impossible to tell how well systems perform for many groups of users.
May 9, 2024 · The focus of this paper is to evaluate dialect robustness by comparing performance on US English (USEng) and Indian English (IndEng). All LLMs perform better on USEng.
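A hedged sketch of that comparison, with fabricated per-prompt scores standing in for real LLM results (the variable names and numbers are assumptions, not data from the paper):

```python
# Minimal sketch of a per-dialect comparison. The scores are fabricated
# placeholders for per-prompt accuracies of one LLM on the same prompts
# written in US English vs. Indian English.
from statistics import mean
from scipy.stats import ttest_rel

useng_scores = [0.91, 0.84, 0.88, 0.79, 0.93, 0.86]   # hypothetical
indeng_scores = [0.85, 0.80, 0.86, 0.71, 0.90, 0.78]  # hypothetical

print(f"USEng mean = {mean(useng_scores):.3f}, "
      f"IndEng mean = {mean(indeng_scores):.3f}")

# Paired one-sided t-test: is performance on USEng systematically higher
# than on IndEng for the same prompts?
stat, p = ttest_rel(useng_scores, indeng_scores, alternative="greater")
print(f"t = {stat:.3f}, p = {p:.4f}")
```

Pairing scores by prompt, rather than comparing unpaired dialect-level averages, controls for prompt difficulty so any gap can be attributed to the dialect rather than the task mix.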
Dialect-robust Evaluation of Generated Text. Jiao Sun | Thibault Sellam | Elizabeth Clark | Tu Vu | Timothy Dozat | Dan Garrette | Aditya Siddhant | Jacob Eisenstein | Sebastian Gehrmann
For evaluating machine-generated texts, automatic methods hold the promise of avoiding collection of human judgments, which can be expensive and time-consuming.
Aug 21, 2024 · The paper presents a novel approach to evaluating the dialect robustness of LLMs using a pre-existing dataset of conversational dialogues.