Detecting Subtle Differences between Human and Model Languages Using Spectrum of Relative Likelihood

Xu, Yang; Wang, Yu; An, Hao; Liu, Zhichen; Li, Yongyuan

arXiv:2406.19874 (cs)

[Submitted on 28 Jun 2024 (v1), last revised 9 Oct 2024 (this version, v2)]

Abstract: Human-written and model-generated texts can be distinguished by examining the magnitude of likelihood in language. However, this is becoming increasingly difficult as language models' ability to generate human-like text keeps improving. This study offers a new perspective by using relative likelihood values instead of absolute ones and by extracting useful features from the spectrum view of likelihood for the human-model text detection task. We propose a detection procedure with two classification methods, one supervised and one heuristic-based, which achieves performance competitive with previous zero-shot detection methods and sets a new state of the art on short-text detection. Our method can also reveal subtle differences between human and model languages, which find theoretical roots in psycholinguistics studies. Our code is available at this https URL
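
The idea of moving from absolute to relative likelihood and then looking at a spectrum can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that "relative likelihood" means z-scoring the per-token log-likelihoods within a text and that the "spectrum view" is the magnitude of a discrete Fourier transform of that sequence. Neither detail is stated in the abstract. The function names and the toy log-probabilities are hypothetical, and in practice the scores would come from a language model.

```python
import cmath
import math
from statistics import mean, pstdev


def relative_likelihood(log_probs):
    """Z-score per-token log-likelihoods (assumed normalization, not from the paper)."""
    mu = mean(log_probs)
    sigma = pstdev(log_probs)
    if sigma == 0:
        # A constant sequence carries no relative information.
        return [0.0 for _ in log_probs]
    return [(x - mu) / sigma for x in log_probs]


def magnitude_spectrum(values):
    """Magnitudes of the discrete Fourier transform for frequencies 0..n//2."""
    n = len(values)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(values)
        )
        spectrum.append(abs(coeff) / n)
    return spectrum


# Hypothetical per-token log-probabilities for one short text.
toy_log_probs = [-2.1, -0.4, -3.3, -0.9, -1.7, -0.2, -4.0, -1.1]
features = magnitude_spectrum(relative_likelihood(toy_log_probs))
```

Under these assumptions, `features` is a fixed-length vector for a text of a given length, which could be passed to a supervised classifier or compared against a heuristic threshold. Because the z-scored sequence has zero mean, its zero-frequency component vanishes, so the features reflect only how likelihood fluctuates across the text and not its overall level.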

Comments:	14 pages, 12 figures
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
ACM classes:	I.2.7
Cite as:	arXiv:2406.19874 [cs.CL]
	(or arXiv:2406.19874v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2406.19874

Submission history

From: Yang Xu
[v1] Fri, 28 Jun 2024 12:28:52 UTC (1,111 KB)
[v2] Wed, 9 Oct 2024 09:36:49 UTC (1,125 KB)
