Pre-train and Refine: Towards Higher Efficiency in K-Agnostic Community Detection without Quality Degradation

Qin, Meng; Zhang, Chaorui; Gao, Yu; Zhang, Weixi; Yeung, Dit-Yan

arXiv:2405.20277 (cs)

[Submitted on 30 May 2024 (v1), last revised 7 Jun 2024 (this version, v2)]

Abstract: Community detection (CD) is a classic graph inference task that partitions the nodes of a graph into densely connected groups. While many CD methods have been proposed with either impressive quality or impressive efficiency, balancing the two remains a challenge. This study explores the potential of deep graph learning to achieve a better trade-off between the quality and efficiency of K-agnostic CD, where the number of communities K is unknown. We propose PRoCD (Pre-training & Refinement fOr Community Detection), a simple yet effective method that reformulates K-agnostic CD as binary node-pair classification. PRoCD follows a pre-training & refinement paradigm inspired by recent advances in pre-training techniques. We first pre-train PRoCD offline on small synthetic graphs covering various topology properties. Relying on inductive inference across graphs, we then generalize the pre-trained model (with frozen parameters) to large real graphs and use the derived CD results to initialize an existing efficient CD method (e.g., InfoMap), which further refines the quality of the CD results. In addition to benefiting from the model's transfer ability with respect to quality, the online generalization and refinement steps also achieve high inference efficiency, since they involve no time-consuming model optimization. Experiments on public datasets of various scales demonstrate that PRoCD can ensure higher efficiency in K-agnostic CD without significant quality degradation.
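To illustrate why a binary node-pair formulation needs no preset K, the following is a minimal sketch and not the PRoCD implementation. It assumes that some classifier has already labeled certain node pairs as "same community"; here those labels are hand-written stand-ins, and the function name `partition_from_pairs` is hypothetical. The sketch merges positively labeled pairs with a union-find structure, so the number of communities falls out of the pair decisions rather than being supplied as input.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def partition_from_pairs(
    nodes: Sequence[Hashable],
    same_community_pairs: Sequence[Tuple[Hashable, Hashable]],
) -> List[List[Hashable]]:
    """Group nodes by merging every pair labeled 'same community'.

    Illustrative only: the pair labels are assumed to come from some
    binary node-pair classifier, which is not modeled here.
    """
    # Each node starts as the root of its own singleton set.
    parent: Dict[Hashable, Hashable] = {v: v for v in nodes}

    def find(v: Hashable) -> Hashable:
        # Follow parent pointers to the root, halving the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Union the two sets for every positively labeled pair.
    for u, v in same_community_pairs:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_v] = root_u

    # Collect nodes by root; the number of groups is the inferred K.
    groups: Dict[Hashable, List[Hashable]] = {}
    for v in nodes:
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())


# Hand-written stand-ins for classifier output on a six-node toy graph.
nodes = [0, 1, 2, 3, 4, 5]
positive_pairs = [(0, 1), (1, 2), (3, 4)]
communities = partition_from_pairs(nodes, positive_pairs)
print(communities)        # [[0, 1, 2], [3, 4], [5]]
print(len(communities))   # 3
```

In this toy case K = 3 is never passed in; it emerges from the pair labels. The abstract describes using such derived results to initialize an existing method such as InfoMap for refinement, a step this sketch does not attempt.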

Comments:	Accepted by ACM KDD 2024
Subjects:	Social and Information Networks (cs.SI)
Cite as:	arXiv:2405.20277 [cs.SI]
	(or arXiv:2405.20277v2 [cs.SI] for this version)
	https://doi.org/10.48550/arXiv.2405.20277

Submission history

From: Meng Qin
[v1] Thu, 30 May 2024 17:30:04 UTC (928 KB)
[v2] Fri, 7 Jun 2024 04:03:14 UTC (929 KB)
