Revisiting Discriminative Entropy Clustering and its relation to K-means

Zhang, Zhongwen; Boykov, Yuri

Computer Science > Machine Learning

arXiv:2301.11405v1 (cs)

[Submitted on 26 Jan 2023 (this version), latest version 29 Sep 2024 (v4)]


Abstract: Maximization of mutual information between the model's input and output is formally related to "decisiveness" and "fairness" of the softmax predictions, motivating such unsupervised entropy-based losses for discriminative neural networks. Recent self-labeling methods based on such losses represent the state of the art in deep clustering. However, some important properties of entropy clustering are not well-known, or even misunderstood. For example, we provide a counterexample to prior claims about equivalence to variance clustering (K-means) and point out technical mistakes in such theories. We discuss the fundamental differences between these discriminative and generative clustering approaches. Moreover, we show the susceptibility of standard entropy clustering to narrow margins and motivate an explicit margin maximization term. We also propose an improved self-labeling loss; it is robust to pseudo-labeling errors and enforces stronger fairness. We develop an EM algorithm for our loss that is significantly faster than the standard alternatives. Our results improve the state-of-the-art on standard benchmarks.
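The link between mutual information, "fairness", and "decisiveness" mentioned in the abstract can be illustrated with the standard empirical decomposition: mutual information equals the entropy of the average prediction (fairness) minus the average entropy of the individual predictions (decisiveness). The following is a minimal sketch of that decomposition only, not the authors' proposed loss or code; the function names `entropy` and `mutual_information` and the toy inputs are assumptions made for illustration.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def mutual_information(predictions):
    """Empirical mutual information between inputs and predicted labels.

    predictions: list of softmax outputs, one probability vector per input.
    Returns (mi, fairness, decisiveness), where
      fairness     = entropy of the average prediction (high when clusters are balanced)
      decisiveness = average entropy of each prediction (low when predictions are confident)
      mi           = fairness - decisiveness
    """
    n = len(predictions)
    k = len(predictions[0])
    mean_pred = [sum(p[j] for p in predictions) / n for j in range(k)]
    fairness = entropy(mean_pred)
    decisiveness = sum(entropy(p) for p in predictions) / n
    return fairness - decisiveness, fairness, decisiveness

# Hypothetical toy predictions for two inputs and two clusters.
confident_balanced = [[1.0, 0.0], [0.0, 1.0]]  # confident and balanced
uncertain = [[0.5, 0.5], [0.5, 0.5]]           # balanced but not confident
collapsed = [[1.0, 0.0], [1.0, 0.0]]           # confident but all in one cluster

for name, preds in [("confident_balanced", confident_balanced),
                    ("uncertain", uncertain),
                    ("collapsed", collapsed)]:
    mi, fair, dec = mutual_information(preds)
    print(f"{name}: MI={mi:.4f} fairness={fair:.4f} decisiveness={dec:.4f}")
```

Under these assumptions, only the confident and balanced predictions reach the maximum mutual information for two clusters (ln 2); the uncertain and collapsed cases both score zero. An unsupervised loss of this family would minimize the negative of this quantity.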

Comments:	8 pages; 4 figures
Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2301.11405 [cs.LG]
	(or arXiv:2301.11405v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2301.11405

Submission history

From: Zhongwen Zhang
[v1] Thu, 26 Jan 2023 20:35:30 UTC (1,154 KB)
[v2] Tue, 23 May 2023 21:25:10 UTC (1,682 KB)
[v3] Fri, 24 May 2024 19:18:42 UTC (7,666 KB)
[v4] Sun, 29 Sep 2024 15:31:37 UTC (9,271 KB)
