Exact Recovery in the Stochastic Block Model

Abbe, Emmanuel; Bandeira, Afonso S.; Hall, Georgina

arXiv:1405.3267 (cs)

[Submitted on 13 May 2014 (v1), last revised 27 Oct 2014 (this version, v4)]

Abstract: The stochastic block model (SBM) with two communities, or equivalently the planted bisection model, is a popular random graph model exhibiting cluster behaviour. In the symmetric case, the graph has two equally sized clusters and vertices connect with probability $p$ within clusters and $q$ across clusters. In the past two decades, a large body of literature in statistics and computer science has focused on providing lower bounds on the scaling of $|p-q|$ to ensure exact recovery. In this paper, we identify a sharp threshold phenomenon for exact recovery: if $\alpha=pn/\log(n)$ and $\beta=qn/\log(n)$ are constant (with $\alpha>\beta$), recovering the communities with high probability is possible if $\frac{\alpha+\beta}{2} - \sqrt{\alpha \beta}>1$ and impossible if $\frac{\alpha+\beta}{2} - \sqrt{\alpha \beta}<1$. In particular, this improves the existing bounds. This also sets a new line of sight for efficient clustering algorithms. While maximum likelihood (ML) achieves the optimal threshold (by definition), it is NP-hard in the worst case. This paper proposes an efficient algorithm based on a semidefinite programming relaxation of ML, which is proved to succeed in recovering the communities close to the threshold, while numerical experiments suggest it may achieve the threshold. An efficient algorithm that succeeds all the way down to the threshold is also obtained using a partial recovery algorithm combined with a local improvement procedure.
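
The threshold condition stated in the abstract can be checked numerically. The following is a minimal Python sketch, not code from the paper: the function names `scaled_parameters`, `recovery_margin`, and `exact_recovery_regime` are hypothetical and introduced only for illustration, and the natural logarithm is assumed for $\log(n)$. The abstract does not say what happens when the quantity equals exactly 1, so the sketch reports that case separately.

```python
import math


def scaled_parameters(p: float, q: float, n: int) -> tuple[float, float]:
    """Convert edge probabilities (p, q) on n vertices to (alpha, beta).

    Uses alpha = p*n/log(n) and beta = q*n/log(n), as in the abstract.
    Assumption: log is the natural logarithm.
    """
    scale = n / math.log(n)
    return p * scale, q * scale


def recovery_margin(alpha: float, beta: float) -> float:
    """Return (alpha + beta)/2 - sqrt(alpha*beta) - 1.

    A positive value means the abstract's condition for possible exact
    recovery holds; a negative value means exact recovery is impossible.
    """
    return (alpha + beta) / 2 - math.sqrt(alpha * beta) - 1


def exact_recovery_regime(alpha: float, beta: float) -> str:
    """Classify (alpha, beta) against the threshold from the abstract."""
    margin = recovery_margin(alpha, beta)
    if margin > 0:
        return "possible"
    if margin < 0:
        return "impossible"
    # The abstract makes no claim when the quantity equals exactly 1.
    return "boundary"


if __name__ == "__main__":
    for alpha, beta in [(9.0, 1.0), (4.0, 1.0)]:
        print(alpha, beta, exact_recovery_regime(alpha, beta))
```

For example, $\alpha=9$, $\beta=1$ gives $\frac{9+1}{2}-\sqrt{9}=2>1$, so exact recovery is possible, while $\alpha=4$, $\beta=1$ gives $\frac{4+1}{2}-\sqrt{4}=0.5<1$, so it is impossible. Since $\frac{\alpha+\beta}{2}-\sqrt{\alpha\beta}=\frac{(\sqrt{\alpha}-\sqrt{\beta})^2}{2}$, the condition is equivalent to $\sqrt{\alpha}-\sqrt{\beta}>\sqrt{2}$ when $\alpha>\beta$.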

Comments:	31 pages
Subjects:	Social and Information Networks (cs.SI); Probability (math.PR); Physics and Society (physics.soc-ph)
Cite as:	arXiv:1405.3267 [cs.SI]
	(or arXiv:1405.3267v4 [cs.SI] for this version)
	https://doi.org/10.48550/arXiv.1405.3267

Submission history

From: Georgina Hall
[v1] Tue, 13 May 2014 19:25:51 UTC (22 KB)
[v2] Mon, 19 May 2014 19:41:51 UTC (24 KB)
[v3] Thu, 12 Jun 2014 19:31:43 UTC (287 KB)
[v4] Mon, 27 Oct 2014 20:15:54 UTC (1,167 KB)
