Multi-Label Training for Text-Independent Speaker Identification

Xue, Yuqi

arXiv:2211.07373 (eess)

[Submitted on 14 Nov 2022 (v1), last revised 16 Aug 2024 (this version, v2)]

Abstract: In this paper, we propose a novel strategy for text-independent speaker identification systems: Multi-Label Training (MLT). Instead of the commonly used one-to-one correspondence between speech and speaker label, we divide all the utterances of each speaker into several subgroups, with each subgroup assigned a different set of labels. During identification, a speaker is identified as long as the predicted label matches one of his/her corresponding labels. We found that this method can force the model to distinguish the data more accurately and gains some of the advantages of ensemble learning, while avoiding a significant increase in computation and storage burden. In our experiments, Multi-Label Training achieved better identification performance than common methods, not only in clean conditions but also in noisy conditions with speech enhancement. The proposed strategy can be easily applied to almost all current text-independent speaker identification models to achieve further improvements.
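The labeling scheme described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the round-robin split of utterances into subgroups, and the use of exactly one label per subgroup are all assumptions made for illustration, since the abstract does not specify how subgroups are formed or how many labels each receives.

```python
from collections import defaultdict


def assign_sub_labels(speaker_ids, groups_per_speaker):
    """Map each utterance to a training label that encodes speaker and subgroup.

    Hypothetical helper: utterances of a speaker are split round-robin into
    `groups_per_speaker` subgroups (an assumption, not the paper's rule).
    """
    speakers = sorted(set(speaker_ids))
    index_of = {s: i for i, s in enumerate(speakers)}
    seen = defaultdict(int)  # utterances seen so far, per speaker
    labels = []
    for s in speaker_ids:
        subgroup = seen[s] % groups_per_speaker
        seen[s] += 1
        # The classifier would have len(speakers) * groups_per_speaker outputs.
        labels.append(index_of[s] * groups_per_speaker + subgroup)
    return labels, speakers


def label_to_speaker(label, speakers, groups_per_speaker):
    """Any of a speaker's sub-labels identifies that speaker."""
    return speakers[label // groups_per_speaker]


utterance_speakers = ["alice", "bob", "alice", "alice", "bob", "alice"]
train_labels, speakers = assign_sub_labels(utterance_speakers, groups_per_speaker=2)
print(train_labels)                            # [0, 2, 1, 0, 3, 1]
print(label_to_speaker(3, speakers, 2))        # bob
```

In this sketch the model is trained as an ordinary classifier over the enlarged label set, and the mapping back to speakers is applied only at identification time, which is one way to read "identified as long as the predicted label is the same as one of his/her corresponding labels".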

Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as: arXiv:2211.07373 [eess.AS] (or arXiv:2211.07373v2 [eess.AS] for this version)
DOI: https://doi.org/10.48550/arXiv.2211.07373

Submission history

From: Yuqi Xue
[v1] Mon, 14 Nov 2022 14:07:25 UTC (936 KB)
[v2] Fri, 16 Aug 2024 08:40:12 UTC (942 KB)
