1 Introduction

The widespread use of modern social media has created a huge amount of online social interactions, fostered the formation of communities (e.g., Ghoshal et al., 2021; Ghoshal & Das, 2017; Ghoshal et al., 2019), and facilitated discussions about a variety of topics. Users establish positive relationships, such as friendship, agreement, and support, as well as negative relationships, such as enmity, disagreement, and distrust. The existence of such mixed interactions has led to an ever-growing polarization phenomenon, i.e., a division of the set of users into groups with opposite views on controversial topics (e.g., politics, religion, sport).

In the past few years, we have witnessed a plethora of studies about polarization on social media (Conover et al., 2011; Hohmann et al., 2023; Garimella et al., 2017). Polarization is often distinguished into ideological and affective polarization (Lelkes, 2016): the former refers to increased ideological divergence and reduced dialogue among individuals with differing views, whereas the latter focuses on the affective attitude that individuals show toward others based on their opinions (Hohmann et al., 2023). Other approaches, however, discard any particular qualification of the term polarization, and adopt a graph-theoretic setting where the goal is to discover polarized communities in signed graphs (Bonchi et al., 2019; Tzeng et al., 2020). In this work, we follow the latter line of studies. Remarkably, a key novelty of our work is the exploitation of machine learning, particularly neural network models, for discovering a polarization structure.

Polarization in signed graphs. Signed graphs are graphs whose edges are assigned either a positive or a negative label, denoting whether the interaction depicted by an edge is friendly or antagonistic, respectively (Harary, 1953). Signed graphs are used to model a variety of data and to study numerous (social) phenomena, such as the emergence of polarized discussions in social media, or the analysis of trust/distrust in review platforms (Beigi et al., 2016; Derr et al., 2018; Yao et al., 2022; Zheng et al., 2015). Bonchi et al. (2019) employ signed graphs to define the problem of 2-Polarized-Communities (for short, 2pc), which requires finding two subsets of nodes, generally referred to as communities, of the input signed graph such that there are (R1) mostly positive edges within each community, (R2) mostly negative edges between the two communities, and (R3) the subgraph induced by these two communities is as dense as possible. For instance, assuming that positive (resp. negative) edges denote agreement (resp. disagreement) of social media users w.r.t. a given context of debate, identifying the two polarized communities corresponds to detecting two groups of users, where users of the same group mostly agree with each other, while having divergent opinions with respect to the users of the other group. The two communities are required to be non-overlapping, but they do not necessarily need to cover the entire node set. The rationale of the latter is to match real-world situations, where polarized communities are concealed within a body of other nodes which do not (yet) have a strongly formed opinion and, as such, are neutral in terms of polarization.

Fig. 1 Example signed graph, with solid lines and red dashed lines corresponding to positive and negative edges, respectively. The optimal solution of the relaxed 2pc problem yields, after rounding, one community containing node A only and the other community containing all the remaining nodes. However, the same rounding strategy applied to a suboptimal solution of the relaxed problem leads to node A and nodes D-E forming the two communities, while nodes B-C are detected as neutral: this is not only a more realistic configuration (due to the roles played by nodes B-C) than the communities derived from the optimal solution, but it also turns out to have higher polarity than the latter. For more details on this example, see Example 1 in Section 3.

Motivation: limitations of the state of the art in \(\textsc{2PC}\). The above R1–R3 requirements for the \(\textsc{2PC}\) problem are jointly pursued by maximizing a single objective function, termed polarity. Bonchi et al. (2019) show that maximizing polarity is \(\textbf{NP}\)-hard, but also that a continuous relaxation of that problem is solvable in polynomial time. They exploit this finding to devise algorithms which consist in properly rounding (i.e., discretizing) the optimal solution of the relaxed problem.

\(\underline{Limitation\, 1}\). Although Bonchi et al.’s algorithms are principled and rather effective, they suffer from the fact that deriving a solution to 2pc from the optimal solution of the relaxed problem may be limiting in terms of polarity. In fact, as illustrated in Figure 1, suboptimal solutions to the relaxed problem can lead to better solutions to 2pc after rounding.

\(\underline{Limitation\, 2}\). The polarity function does not require or foster the detection of size-balanced communities. Indeed, maximizing polarity can easily lead to degenerate solutions with a single sufficiently large community and another (almost) empty community, even if the input signed graph does contain “natural” polarized communities that are both non-empty and possibly of comparable size. Several types of social environments, from social media platforms to online forums, from political systems to scientific communities, can benefit from identifying and maintaining balanced polarized communities. By ensuring that communities are diverse while still containing balanced viewpoints, constructive debates can be facilitated, critical thinking can be encouraged, and the individuals’ perspectives can be broadened, thus reducing echo chambers and mitigating the spread of misinformation. Also, companies involved in market research and product development can benefit from gathering insights from diverse consumer groups balanced in size, and ultimately better understand market preferences and anticipate consumer trends.

Therefore, there is a need for methods that can detect balanced polarized communities. In this respect, turning back to the example of Figure 1, the assignment of A to one community, D-E to the other community, and B-C as neutral is also much more balanced than the one derived from the optimal relaxed solution, where node A forms its own community and all the other nodes are assigned to the other community.

The above example also highlights the need for detecting fine-grained polarization phenomena, i.e., polarized communities that may not be apparent in terms of node size or amount of connections involved therein. This is in fact essential to recognize minorities in polarization, which in turn might correspond to harmful situations like isolation, where a small group of individuals is marginalized or isolated by a larger, cohesive group; later in this paper, we will provide an example of such a polarization setting.

Contributions. In this paper, we advance the state of the art in the 2pc problem by properly addressing the above limitations. Specifically, we provide a twofold contribution.

First, targeting Limitation 1, we devise a novel machine-learning approach that allows for soundly and effectively exploring a variety of suboptimal solutions to the relaxed problem, so as to ultimately select the one that leads to the best discrete solution to 2pc after rounding.

Second, to overcome Limitation 2, we devise a generalization of the polarity function, named \(\gamma \)-polarity. When optimizing standard polarity, in fact, 2pc solutions tend to produce strongly imbalanced polarized communities, especially when dealing with large graphs. Our proposed \(\gamma \)-polarity is designed to produce polarized communities that, depending on the setting of \(\gamma \), can be either more balanced or larger than those yielded by standard polarity.

The proposed approach leverages a neural-network-based framework, whose core component is a signed graph neural network (GNN) model, to learn continuous vector representations of the input nodes, for the task of assigning each node a real-valued score between -1 and 1. Such a score is ultimately rounded onto \(\{-1, 0, 1\}\), so as to determine whether the corresponding node is part of one of the two communities (\(-1\) or 1), or is neutral (0). To this purpose, our neural framework is optimized via a loss corresponding to the relaxed polarization function, coupled with a suitable regularization term.

Rationale and benefits of our proposal are as follows:

(\(\textbf{1}\)) A neural approach is well-suited for 2pc due to the compatibility of the continuous relaxation of 2pc with neural-network differentiability. By setting the loss function to the relaxed 2pc objective and performing rounding after each learning step, we bridge the gap between the discrete constraints of the underlying combinatorial-optimization problem and the inherently continuous mathematical framework of neural networks. Also, both the input and the output of (relaxed) 2pc are naturally handled by neural-network building blocks: the input graph by a signed GNN, and the output \([-1, 1]\) score by a \(\tanh \) activation function. Furthermore, external information associated with nodes can easily be integrated into our framework, since GNNs are designed to initialize the hidden node representations (embeddings) with any available node features.

(\(\textbf{2}\)) While simple, our approach is backed by solid machine-learning fundamentals, which make it principled and sound. In fact, training our neural framework via standard gradient descent provides an elegant solution to the aforementioned requirement of exploring a variety of suboptimal solutions to relaxed 2pc. Every training epoch of our framework ends with a rounding step that produces a discrete 2pc solution, and a proper regularization term in the loss enforces the continuous scores to be close to the discrete \(\{-1, 0, 1\}\) values.

(\(\textbf{3}\)) Our framework is lightweight, yet highly versatile and modular, facilitating easy maintenance and updates to keep pace with the latest GNN models and deep learning advancements. Future improvements, like enhanced signed GNNs, can be seamlessly integrated by modifying a single building block. Additionally, our framework allows for seamlessly incorporating additional requirements on the yielded solutions, such as fostering size balance, as we discuss next.

(\(\textbf{4}\)) We effectively address non-size-balanced communities by maximizing \(\gamma \)-polarity, a generalization of standard polarity, using it as a loss in our neural framework. This links our second main contribution (\(\gamma \)-polarity) with our first (neural approach to 2pc).

(\(\textbf{5}\)) Our proposal offers multiple benefits to the research community. It represents the first machine-learning approach to 2pc, which opens the door to further research and improvement. The same applies to \(\gamma \)-polarity, which warrants additional exploration from a combinatorial optimization perspective. Moreover, our work can serve as inspiration for other combinatorial optimization problems, which share the common trait with 2pc that suboptimal relaxed solutions may lead to improved rounded solutions.

Summary and roadmap. Our main contributions in this work can be summarized as follows:

  • We tackle 2pc (Bonchi et al., 2019), i.e., the problem of discovering two polarized communities from an input signed graph (Section 2), and define a novel neural-network-based approach to address it (Section 3).

  • We introduce a generalization of 2pc ’s objective function, termed \(\gamma \)-polarity, which favors size balance among communities, and show how to optimize it within the proposed neural framework (Section 4).

  • We provide extensive experiments on a large variety of real-world and synthetic signed graphs (Section 5). Results (Section 6) attest high accuracy of our approach, its superiority over the state of the art, and the effectiveness of \(\gamma \)-polarity in detecting balanced communities.

Section 7 concludes the paper and discusses future work.

2 Preliminaries and background

Let \(G = (V, E^+, E^-)\) be an undirected signed graph, where V is a set of nodes, and \(E^+, E^- \subseteq V \times V\), \(E^+ \cap E^- = \emptyset \), are sets of positive and negative edges, respectively. We assume an arbitrary order over V, such that nodes are assigned a unique integer ID within \(\{1, \ldots , |V|\}\). With a slight abuse of notation, we interchangeably refer to \(u \in V\) as both the node u itself and the u-th node in the order. This keeps vector/matrix notations simpler. \(\textbf{A}\in \lbrace -1, 0, 1 \rbrace ^{|V| \times |V|}\) is the signed adjacency matrix of G, defined as \(\textbf{A}[u,v] = 1\) if \((u, v) \in E^+\), \(\textbf{A}[u,v] = -1\) if \((u, v) \in E^-\), and \(\textbf{A}[u,v] = 0\) otherwise. Table 1 summarizes the main notations used throughout this paper.
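For concreteness, the signed adjacency matrix can be built as in the following minimal sketch (node IDs here start from 0 for vector indexing, and the helper name `signed_adjacency` is our own illustrative choice):

```python
import numpy as np

def signed_adjacency(n, pos_edges, neg_edges):
    """Build the signed adjacency matrix A in {-1,0,1}^{n x n} of an
    undirected signed graph over nodes 0..n-1."""
    A = np.zeros((n, n), dtype=int)
    for u, v in pos_edges:
        A[u, v] = A[v, u] = 1    # positive edge: A[u,v] = 1 (symmetric)
    for u, v in neg_edges:
        A[u, v] = A[v, u] = -1   # negative edge: A[u,v] = -1 (symmetric)
    return A

# Toy graph: positive edge (0,1), negative edge (1,2)
A = signed_adjacency(3, [(0, 1)], [(1, 2)])
```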

Table 1 Main notations used in this paper

2.1 Problem statement

We deal with the combinatorial-optimization problem of 2-Polarized-Communities (for short, 2pc), originally defined by Bonchi et al. (2019). Given a signed graph \(G = (V, E^+, E^-)\), 2pc finds two disjoint subsets \(S_1, S_2 \subseteq V\) of nodes such that (R1) there are as many positive edges and as few negative edges as possible within \(S_1\) and within \(S_2\); (R2) there are as many negative edges and as few positive edges as possible across \(S_1\) and \(S_2\); and (R3) the subgraph induced by \(S_1 \cup S_2\) is as dense as possible, according to a density defined as the ratio between number of edges and number of nodes.

\(S_1\) and \(S_2\) are interpreted as polarized communities, i.e., groups of users (nodes) who are cohesive in terms of both intra-group positive relationships (edges) and inter-group negative relationships. Nodes included into neither \(S_1\) nor \(S_2\) – denoted \(S_0 = V \setminus (S_1 \cup S_2)\) – form the set of neutral nodes. A partition \(\lbrace S_0, S_1, S_2 \rbrace\) of V can alternatively be represented by a (column) vector \(\textbf{x}\in \lbrace -1, 0, 1 \rbrace ^{|V|}\), whose u-th coordinate is \(\textbf{x}_u = 0\) if \(u \in S_0\), \(\textbf{x}_u = 1\) if \(u \in S_1\), and \(\textbf{x}_u = -1\) if \(u \in S_2\).

The above R1–R3 requirements of 2pc are altogether encoded into a single function, termed polarity:

Definition 1

(Polarity (Bonchi et al., 2019)). Given a vector \(\textbf{x}\in \{-1, 0, 1\}^{|V|}\) and a matrix \(\textbf{A}\in \{-1, 0, 1\}^{|V| \times |V|}\), the polarity \(p(\textbf{x}, \textbf{A})\) of \(\textbf{x}\) with respect to \(\textbf{A}\) is defined as:

$$\begin{aligned} p(\textbf{x}, \textbf{A}) = \frac{\textbf{x}^{\top } \! \textbf{A}~\textbf{x}}{\textbf{x}^\top \textbf{x}}. \end{aligned}$$
(1)

The numerator of \(p(\cdot , \cdot )\) accounts for R1 and R2, while numerator and denominator altogether model R3. In this regard, note that \(\textbf{x}^\top \textbf{x}= |S_1 \cup S_2|\).
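As an illustration, polarity can be computed directly from Equation (1); the following sketch evaluates it on a toy graph with two perfectly polarized pairs of nodes (the graph is our own example, not one from the paper):

```python
import numpy as np

def polarity(x, A):
    """Polarity p(x, A) = (x^T A x) / (x^T x), as in Definition 1.
    Undefined for the all-zero vector."""
    x = np.asarray(x, dtype=float)
    return (x @ A @ x) / (x @ x)

# Toy graph: positive edges inside {0,1} and {2,3},
# negative edges across the two groups.
A = np.array([[ 0,  1, -1, -1],
              [ 1,  0, -1, -1],
              [-1, -1,  0,  1],
              [-1, -1,  1,  0]])
x = np.array([1, 1, -1, -1])   # S1 = {0,1}, S2 = {2,3}
print(polarity(x, A))          # → 3.0
```

Note that polarity is invariant to rescaling of x, which is why the relaxed problem can be solved spectrally.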

The 2pc problem is formulated as follows:

Problem 1

(2pc (Bonchi et al., 2019)). Given a signed graph \(G = (V, E^+, E^-)\) with signed adjacency matrix \(\textbf{A}\), find

$$\begin{aligned} \textbf{x}^* = \mathop {\mathrm {arg\,max}}\limits _{\textbf{x}\in \lbrace -1, 0, 1 \rbrace ^{|V|}} p(\textbf{x}, \textbf{A}). \end{aligned}$$

Relaxing node-to-community assignments to be continuous, i.e., \(\in [-1, 1]\), leads to the following relaxed problem:

Problem 2

(2PC-relaxed (Bonchi et al., 2019)). Given a signed graph \(G = (V, E^+, E^-)\) with signed adjacency matrix \(\textbf{A}\), find

$$\begin{aligned} \textbf{z}^* = \mathop {\mathrm {arg\,max}}\limits _{\textbf{z}\in [-1,1]^{|V|}} p(\textbf{z}, \textbf{A}), \end{aligned}$$

where polarity \(p(\textbf{z}, \textbf{A}) = \textbf{z}^{\top } \! \textbf{A}~\textbf{z}/\textbf{z}^\top \textbf{z}\) of a real-valued vector \(\textbf{z}\in [-1,1]^{|V|}\) is defined the same as Definition 1.

State of the art in 2pc. 2pc is shown to be \(\textbf{NP}\)-hard, while 2PC-relaxed can be solved in polynomial time by finding the eigenvector of the signed adjacency matrix corresponding to the largest eigenvalue (Bonchi et al., 2019). Bonchi et al. (2019) exploit the latter to devise two approximation algorithms for 2pc. The first (deterministic) algorithm simply rounds the optimal solution \(\textbf{z}^*\) to 2PC-relaxed as \(\textbf{x}^*_u = \mathop {\textrm{sgn}}\limits (\textbf{z}^*_u)\), for all \(u \in V\), where \(\mathop {\textrm{sgn}}\limits (\cdot )\) is the sign function. The second (randomized) algorithm sets, for all \(u \in V\), \(\textbf{x}^*_u = \mathop {\textrm{sgn}}\limits (\textbf{z}^*_u)\) if a Bernoulli experiment with success probability \(|\textbf{z}^*_u|\) succeeds, otherwise \(\textbf{x}^*_u=0\).
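The two rounding schemes can be sketched as follows. This is a simplified illustration, not Bonchi et al.'s exact implementation: we take the leading eigenvector of \(\textbf{A}\) as the optimal relaxed solution, and rescale it into \([-1,1]\) for the randomized variant; the function names are ours.

```python
import numpy as np

def deterministic_2pc(A):
    """Deterministic rounding: x*_u = sgn(z*_u), where z* is the
    eigenvector of A for the largest eigenvalue."""
    vals, vecs = np.linalg.eigh(A)       # A is symmetric
    z = vecs[:, np.argmax(vals)]         # leading eigenvector
    return np.sign(z).astype(int)

def randomized_2pc(A, rng):
    """Randomized rounding: keep sgn(z*_u) with probability |z*_u|,
    otherwise set the node to neutral (0)."""
    vals, vecs = np.linalg.eigh(A)
    z = vecs[:, np.argmax(vals)]
    z = z / np.abs(z).max()              # rescale into [-1, 1]
    keep = rng.random(len(z)) < np.abs(z)
    return np.where(keep, np.sign(z), 0).astype(int)

# Toy graph with two perfectly polarized pairs of nodes
A = np.array([[ 0,  1, -1, -1],
              [ 1,  0, -1, -1],
              [-1, -1,  0,  1],
              [-1, -1,  1,  0]])
x = deterministic_2pc(A)    # recovers {0,1} vs {2,3} (up to sign flip)
```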

Tzeng et al. (2020) extend 2pc to a k-community setting, where the goal is to find \(k \!\ge \! 2\) node subsets, each of which is positively connected internally, and negatively connected to the other subsets. Extending our approach to \(k \!>\! 2\) communities is an interesting direction for future work.

2.2 Related works

Besides polarization in signed graphs, which is the focus of our study, it is useful to recall here methods that address related problems.

Representation learning for signed graphs. Graph representation learning is the problem of mapping elements of a graph (e.g., nodes, edges, subgraphs) to numerical vectors (embeddings) such that the similarity between those elements in the graph corresponds to the similarity between their embeddings. The literature on graph representation learning is vast, and includes both shallow approaches, which directly optimize a certain criterion (e.g., d-hop reachability, random-walk co-occurrence), and deep approaches, i.e., those based on graph neural networks (GNNs) (Cai et al., 2018; Wu et al., 2022). Representation learning has been studied for signed graphs as well, including both undirected methods (Derr et al., 2018; Liu et al., 2021; Li et al., 2020; Wang et al., 2017; Kim et al., 2018) and directed methods (Huang et al., 2021, 2019).

In this work, we regard signed graph representation learning as a building block of the proposed framework. Note that our approach is versatile w.r.t. the choice of graph representation learning model; however, creating a custom model for our specific task is beyond the scope of this work.

Clustering signed graphs has also received attention in the literature (Chiang et al., 2012; Cucuringu et al., 2019; He et al., 2022; Kunegis et al., 2010; Mercado et al., 2016).

However, those methods require every node to be part of an output cluster; hence, unlike our approach, they are not designed to detect neutral nodes, which are simply left out of their evaluation. Also, signed graph clustering methods optimize criteria other than polarity. Nonetheless, given their popularity and their relatedness to our problem, we consider some of the most prominent methods in this category in our experimental evaluation (cf. Section 5).

Other patterns in signed graphs. A number of works focus on extracting subgraphs from signed graphs according to measures other than polarity. Ordozgoiti et al. (2020) aim at extracting a maximum-size subgraph which exhibits perfect balance (Harary, 1953), i.e., it can be partitioned into two sets of nodes such that there are only positive intra-group edges and only negative inter-group edges. Despite the name, Ordozgoiti et al.’s notion of balance has nothing to do with size balance of the groups of the output subgraph. Also, the output groups may be arbitrarily sparse, as the goal is to maximize the size of the output subgraph, no matter how dense it is. Xiao et al. (2020) extract subgraphs that maximize signed bipartiteness ratio. That measure mainly differs from polarity as it enforces separation from the identified subgraphs to the rest of the graph, rather than maximizing the density of the subgraphs. Also, Xiao et al. deal with a local setting, where subgraphs are built by expanding two given sets of seed nodes. Niu and Sariyüce (2023) deal with dichotomy, a variant of polarity which is more oriented to cohesiveness. Chu et al. (2016) optimize oppositive cohesiveness, a measure which considers cohesiveness in absolute terms, rather than in relation to the number of nodes (it is not based on any notion of density). Please note that handling measures other than polarity is beyond our scope.

Machine learning for combinatorial optimization leverages machine-learning techniques to solve combinatorial-optimization problems (Cappart et al., 2023; Peng et al., 2021). This research area has focused on problems such as influence maximization (Li et al., 2023), (graph) clustering (Jung & Keuper, 2022; Tsitsulin et al., 2023), community search and detection (Su et al., 2022).

To the best of our knowledge, we are the first to define a machine-learning approach to the 2pc problem.

Fig. 2 Overview of the proposed Neural2PC approach

3 Proposed approach: Neural2PC

Overview. Unlike existing methods (Bonchi et al., 2019) which find the optimal solution \(\textbf{z}^*\) to 2PC-relaxed (Problem 2) directly, we let a neural-network model \(f_{\theta }\) – with parameters \(\theta \) – produce a set \(\{\textbf{z}_e \mid e = 1, \ldots , e_{max}\}\) of feasible solutions to 2PC-relaxed during multiple epochs \(1, \ldots , e_{max}\) of training. All the various \(\textbf{z}_e\) are rounded in order to yield feasible discrete solutions \(\textbf{x}_e\) to 2pc. The best (in terms of polarity, Definition 1) of such \(\textbf{x}_e\) solutions is the definitive output.

The rationale of our neural approach is that it soundly allows for exploring a variety of suboptimal solutions to 2PC-relaxed. As better shown in Example 1 at the end of this section, this favors obtaining ultimate discrete solutions (after rounding) which exhibit higher polarity than the one derived by rounding the optimal solution to 2PC-relaxed.

The ultimate objective in our approach is to find the model parameters \(\theta \) that maximize the polarity of the (relaxed) solutions computed via \(f_{\theta }\) (or, equivalently, minimize a loss defined based on the negative polarity). Note that, as parameter learning proceeds, we expect a deeper exploration of the space of relaxed solutions, and hence a higher likelihood of obtaining an effective discrete solution after rounding. This is confirmed by experimental evidence (Section 6).

The proposed neural approach is named Neural2PC. A graphical illustration of its main components is shown in Figure 2. Next, we delve into its technical details.

Neural model. Our \(f_\theta \) model takes as input a signed graph \(G = (V, E^+, E^-)\) and a matrix \(\textbf{H}_0\in \mathbb {R}^{|V| \times d_{I}}\) containing a \(d_I\)-dimensional (real-valued) feature vector for every node. Should such features not be available, \(\textbf{H}_0\) can be initialized with structural information derived from G (Derr et al., 2018).

The first block of \(f_\theta \) is a (m-layer) signed GNN (Derr et al., 2018; Liu et al., 2021; Li et al., 2020; Wang et al., 2017; Kim et al., 2018) sgnn(\(\cdot \)), with parameters \(\theta _{\textsc {sgnn}}\). sgnn(\(\cdot \)) properly processes G’s topology and (possibly) node features \(\textbf{H}_0\), and outputs a matrix \(\textbf{H}\in \mathbb {R}^{|V| \times d_H} = [\textbf{h}_u \in \mathbb {R}^{d_H}]_{u \in V}\) containing a hidden vector representation \(\textbf{h}_u\) of every node \(u \in V\):

$$\begin{aligned} \textbf{H}= \textsc {sgnn}(G,\textbf{H}_0). \end{aligned}$$
(2)

Then, vector representations produced by sgnn(\(\cdot \)) feed into fully-connected neural-network linear layers nn(\(\cdot \)), with parameters \(\theta _{\textsc {nn}}\). Ultimately, a \(\tanh \) activation function is used to cast the (node-to-community assignment) scores for every node to the desired \([-1, 1]\) range (cf. Problem 2):

$$\begin{aligned} \textbf{z}= \tanh (\textsc {nn}(\textbf{H})). \end{aligned}$$
(3)

As a result, the overall \(f_{\theta }\) model is as follows:

$$\begin{aligned} f_{\theta }(G, \textbf{H}_0) = \tanh (\textsc {nn}(\textsc {sgnn}(G,\textbf{H}_0))), \end{aligned}$$
(4)

and its parameters are \(\theta = \{\theta _{\textsc {sgnn}}, \theta _{\textsc {nn}}\}\).

Loss function. To optimize the model parameters \(\theta \), we employ a loss function \(\mathcal {L}_{\text {2PC}}\) defined as a combination of (the negative of) polarity \(p(\cdot , \cdot )\) (Definition 1) and a proper regularization term. The role of the latter is to enforce the model to produce continuous scores that are as close as possible to the ultimately desired discrete \(\{-1, 0, 1\}\) scores. Specifically, we define the regularization term as the squared L2-norm \(||\cdot ||^2_2\) of a vector \(\mathbf {\rho } \in \mathbb {R}^{|V|}\), whose entries \(\mathbf {\rho }[u]\), for all \(u \in V\), are set to the distance \(\min \lbrace |\textbf{z}[u]|, 1 - |\textbf{z}[u]| \rbrace \) between \(\textbf{z}[u]\) and the closest valid discrete score. The intuition is that minimizing the norm of \(\mathbf {\rho }\) (together with the other loss component) is expected to yield output continuous \(\textbf{z}\) scores not too far from the valid discrete ones.

All in all, given \(\textbf{z}= f_{\theta }(G, \textbf{H}_0)\), the signed adjacency matrix \(\textbf{A}\) of G, and a hyperparameter \(\lambda \in \mathbb {R}\) which properly weighs the importance of the regularization term, the \(\mathcal {L}_{\text {2PC}}\) loss function is defined as:

$$\begin{aligned} \mathcal {L}_{\text {2PC}}(\textbf{z}, \textbf{A}, \lambda ) \ = \ - \underbrace{p(\textbf{z},\textbf{A})}_{\text {polarity}} \quad + \quad \lambda \!\!\!\!\underbrace{ ||\mathbf {\rho }||^2_2}_{\text {regularization}} \end{aligned}$$
(5)
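A minimal numpy sketch of the loss in Equation (5) follows; in practice this would be implemented with an autodiff framework (e.g., PyTorch) so that gradients can flow back to \(\theta \).

```python
import numpy as np

def loss_2pc(z, A, lam):
    """L_2PC(z, A, lambda) = -p(z, A) + lambda * ||rho||_2^2, where
    rho[u] = min(|z[u]|, 1 - |z[u]|) is the distance of z[u] from the
    closest valid discrete score in {-1, 0, 1}."""
    z = np.asarray(z, dtype=float)
    p = (z @ A @ z) / (z @ z)                      # polarity (Definition 1)
    rho = np.minimum(np.abs(z), 1.0 - np.abs(z))   # per-node rounding gap
    return -p + lam * np.sum(rho ** 2)

# Toy graph with two perfectly polarized pairs of nodes
A = np.array([[ 0,  1, -1, -1],
              [ 1,  0, -1, -1],
              [-1, -1,  0,  1],
              [-1, -1,  1,  0]])
```

On an already-discrete z the regularization term vanishes, while scores halfway between 0 and 1 are penalized the most.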

Rounding. To round a continuous solution \(\textbf{z}\in [-1, 1]^{|V|}\) into a valid discrete solution \(\textbf{x}\in \{-1, 0, 1\}^{|V|}\) to 2pc, we borrow the procedure adopted by Bonchi et al. (2019). Specifically, given a threshold \(\tau \in [0,1]\), for all \(u \in V\), \(\textbf{x}[u] = \mathop {\textrm{sgn}}\limits (\textbf{z}[u])\) if \(|\textbf{z}[u]| \ge \tau \), and \(\textbf{x}[u] = 0\) otherwise. To avoid sticking to a single \(\tau \), we follow Bonchi et al. (2019) and try all the thresholds \(\tau \in \lbrace \lceil \textbf{z}[u] \rfloor _{i} \mid u \in V \rbrace \), where \(\lceil \cdot \rfloor _i\) denotes approximating a real number at the i-th decimal digit (we use \(i = 3\)). Formally:

$$\begin{aligned} Z_i = \{\lceil \textbf{z}[u] \rfloor _i \mid u \in V\}, \qquad \forall u \in V: \ \textbf{x}_\tau [u] = \left\{ \!\!\begin{array}{rl} \mathop {\textrm{sgn}}\limits (\textbf{z}[u]), &{} \text{ if } |\textbf{z}[u]| \ge \tau ,\\ 0, &{} \text{ otherwise }. \end{array}\right. \end{aligned}$$
$$\begin{aligned} {\textsc {round}}(\textbf{z}) \ = \!\!\! \mathop {\mathrm {arg\,max}}\limits _{\textbf{x}~\!\in ~\!\{\textbf{x}_\tau ~\!\mid ~\!\tau \in Z_i\}}p(\textbf{x}, \textbf{A}). \end{aligned}$$
(6)
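The round(\(\cdot \)) procedure of Equation (6) can be sketched as follows (a simplified illustration with names of our own choosing; we take absolute values of the rounded scores as candidate thresholds, so that \(\tau \in [0,1]\)):

```python
import numpy as np

def polarity(x, A):
    x = np.asarray(x, dtype=float)
    denom = x @ x
    return (x @ A @ x) / denom if denom > 0 else -np.inf

def round_solution(z, A, i=3):
    """round(z): try every threshold from the scores rounded to i decimal
    digits, keep the discretization with the highest polarity."""
    z = np.asarray(z, dtype=float)
    best_x, best_p = None, -np.inf
    for tau in {round(abs(v), i) for v in z}:      # candidate thresholds Z_i
        x = np.where(np.abs(z) >= tau, np.sign(z), 0).astype(int)
        p = polarity(x, A)
        if p > best_p:
            best_x, best_p = x, p
    return best_x

# Toy graph with two perfectly polarized pairs of nodes
A = np.array([[ 0,  1, -1, -1],
              [ 1,  0, -1, -1],
              [-1, -1,  0,  1],
              [-1, -1,  1,  0]])
x = round_solution([0.9, 0.8, -0.7, -0.6], A)
```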
Algorithm 1 Neural2PC

Algorithm. The algorithm we employ to produce a solution to 2pc simply consists in optimizing the \(\theta =\{\theta _{\textsc {sgnn}}, \theta _{\textsc {nn}}\}\) parameters of the \(f_{\theta }\) neural model end-to-end, via standard gradient descent, for a number \(e_{max}\) of training epochs. Specifically, the algorithm alternates a forward phase, which produces a continuous solution \(\textbf{z}\) given the current \(\theta \) parameters, and a backward phase, where the parameters \(\theta \) are updated via gradient descent on the \(\mathcal {L}_{\text {2PC}}\) loss function, with a certain learning rate \(\alpha \). The continuous solution \(\textbf{z}\) yielded in every epoch is rounded according to the round(\(\cdot \)) procedure described above. The discrete rounded solution with the highest polarity score among all those produced over the various epochs is ultimately output. The reason for performing rounding and evaluating the polarity of the discrete solution in every epoch is that it is hard to know in advance which epoch leads to the best discrete solution (see Section 6).

The proposed algorithm is outlined as Algorithm 1.
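The outer loop of Algorithm 1 can be sketched as follows. Here `model_step` is a hypothetical stand-in for one forward/backward pass of \(f_\theta \): the actual method trains a signed GNN by gradient descent on \(\mathcal {L}_{\text {2PC}}\), whereas this sketch only illustrates the round-every-epoch, keep-the-best logic.

```python
import numpy as np

def polarity(x, A):
    x = np.asarray(x, dtype=float)
    d = x @ x
    return (x @ A @ x) / d if d > 0 else -np.inf

def round_best(z, A, i=3):
    # round(z) as in Equation (6): best sign-threshold discretization
    best, best_p = None, -np.inf
    for tau in {round(abs(v), i) for v in z}:
        x = np.where(np.abs(z) >= tau, np.sign(z), 0).astype(int)
        p = polarity(x, A)
        if p > best_p:
            best, best_p = x, p
    return best

def neural2pc_loop(A, model_step, e_max):
    """Skeleton of Algorithm 1: `model_step(epoch)` returns the epoch's
    relaxed solution z; we round every epoch and keep the best discrete one."""
    best_x, best_p = None, -np.inf
    for epoch in range(e_max):
        z = model_step(epoch)        # forward pass -> z in [-1,1]^|V|
        x = round_best(z, A)         # round every epoch ...
        p = polarity(x, A)
        if p > best_p:               # ... and keep the best discrete solution
            best_x, best_p = x, p
    return best_x, best_p

# Demo on a toy 4-node graph: two positive pairs, negatively linked
A = np.array([[ 0,  1, -1, -1],
              [ 1,  0, -1, -1],
              [-1, -1,  0,  1],
              [-1, -1,  1,  0]])
candidates = [np.array([0.2, 0.2, 0.2, 0.2]),     # poor relaxed solution
              np.array([0.9, 0.8, -0.7, -0.6])]   # better relaxed solution
best_x, best_p = neural2pc_loop(A, lambda e: candidates[e], e_max=2)
```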

Motivating example. The following example shows the relevance of considering suboptimal solutions to 2PC-relaxed, and validates the main motivation of our neural approach.

Example 1

Consider again the toy signed graph in Figure 1. For the sake of vector notation, let the order over node set \(\{{\texttt {A,B,C,D,E}}\}\) correspond to the lexicographic node order, i.e., A corresponds to entry 1 in the vectors, B corresponds to entry 2, and so on. \(\textbf{z}^* = [0.282, -0.282, -0.282, -0.616, -0.616]\) is the optimal solution to 2PC-relaxed on that example graph, and \(\textbf{z}= [0.213, -0.144, -0.144, -0.378, -0.378]\) is a suboptimal solution (yielded by our neural approach). The polarity (Definition 1) of \(\textbf{z}^*\) and \(\textbf{z}\) is 2.372 and 2.363, respectively. Adopting the aforementioned rounding (Equation (6)), \(\textbf{z}^*\) leads to a discrete solution \(\textbf{x}_1 = [1, -1, -1, -1, -1]\) (with threshold \(\tau = 0.282\)). Instead, \(\textbf{z}\) yields \(\textbf{x}_2 = [1, 0, 0, -1, -1]\) (with \(\tau = 0.213\)). The polarity of \(\textbf{x}_2\) is 2, which is higher than the polarity 1.6 of \(\textbf{x}_1\). Note also that \(\textbf{x}_2\) corresponds to the optimal solution to 2pc on the example graph at hand.

Computational complexity. We discuss computational complexity aspects of our proposed approach.

The time cost associated with the SGNN block depends on the particular neural-network architecture. Nonetheless, assuming a sparse input signed graph with edge set \(E = E^+ \cup E^-\), the overall time complexity of a SGNN model with m layers can be regarded as bounded by \(\mathcal {O}(m (|V|d^2 + |E|d))\), where \(d=\max (d_I, d_H)\) (e.g., 64); this reduces to \(\mathcal {O}(|V|d^2 + |E|d)\), since m typically amounts to a few units. The role of the NN module is to transform the vector representations generated by the SGNN module into node-to-community assignment scores. This transformation can straightforwardly be achieved through a simple multi-layer perceptron, whose computational cost is dominated by that of the SGNN module.

The time complexity of the rounding module is \(\mathcal {O}(|Z_{i}|(|V|+|E|))\), as for each potential threshold in \(Z_i\), the polarity of the rounded solution is computed in \(\mathcal {O}(|V|+|E|)\) time to select the optimal discrete solution. It is worth emphasizing that the cost of the rounding module constitutes the sole overhead introduced by us compared to the cost of existing SGNN and NN modules embedded into our approach. By combining the two aforementioned costs for \(e_{max}\) epochs, the overall time complexity of Neural2PC is \(\mathcal {O}(e_{max}((|Z_i|+d^2)|V|+(|Z_i|+d)|E|))\).

Regarding the space complexity of our method, Neural2PC requires \(\mathcal {O}(|V|)\) space to store the intermediate variables \(\textbf{x}\) and \(\textbf{z}\) (lines 3-4 in Algorithm 1). This cost is negligible compared to the space required to store the intermediate node representations, which is \(\mathcal {O}(|V|d_H)\), since we compute and store a vector of \(d_H\) components (cf. Equation (2)) for each node \(v \in V\). Moreover, the space complexity of our method also depends on that of the specific SGNN and NN modules, which in turn depends on the particular models adopted. However, the number of model parameters of SGNN and NN is not directly affected by the size of the input graph, and for large graphs and typical network settings the dominant term is indeed the cost of storing the node representations. Consequently, the overall space complexity of Neural2PC can reasonably be assumed to be \(\mathcal {O}(|V|d_H)\).

4 Balancing the size of the communities

A well-known issue of the polarity measure (Definition 1) is that it favors solutions with size-imbalanced output communities. This may even degenerate into solutions where one of the two communities overwhelms any other polarized formation, leaving the second community empty (Bonchi et al., 2019). Motivated by this, we here devise a generalization of the polarity measure, dubbed \(\gamma \)-polarity, which, by properly tuning its parameter \(\gamma \), yields polarized communities that are more balanced in size.

We define \(\gamma \)-polarity by properly modifying the denominator of the polarity measure, while keeping the numerator the same. Given a node-to-community assignment vector \(\textbf{x}\in \{-1, 0, 1\}^{|V|}\), let \(s_1 = \sum _{u \in V, \textbf{x}[u] < 0} |\textbf{x}[u]|\) and \(s_2 = \sum _{u \in V, \textbf{x}[u] > 0} \textbf{x}[u]\) be the sizes of the two communities, with \(s_{max} = \max \{s_1, s_2\}\), \(s_{min} = \min \{s_1, s_2\}\). The denominator of the polarity measure is equal to the sum of the sizes of the two communities, i.e., to \(\textbf{x}^\top \textbf{x}= s_{max} + s_{min}\). Noticing that \(\textbf{x}^\top \textbf{x}= (s_{max} - s_{min}) + 2~\!s_{min}\), the main intuition behind \(\gamma \)-polarity is to split \(\textbf{x}^\top \textbf{x}\) into the two terms \((s_{max} - s_{min})\) and \(2~\!s_{min}\), and to weigh differently, i.e., by a factor \(\gamma > 0\), the term \((s_{max} - s_{min})\) corresponding to the difference in size between the two communities. This leads to the following formal definition of \(\gamma \)-polarity:

Definition 2

(\(\gamma \)-polarity) Given a vector \(\textbf{x}\in \{-1, 0, 1\}^{|V|}\), a matrix \(\textbf{A}\in \{-1, 0, 1\}^{|V| \times |V|}\), and a real number \(\gamma > 0\), the \(\gamma \)-polarity \(p_\gamma (\textbf{x}, \textbf{A})\) of \(\textbf{x}\) with respect to \(\textbf{A}\) is defined as:

$$\begin{aligned} p_\gamma (\textbf{x}, \textbf{A}) = \frac{\textbf{x}^{\top } \! \textbf{A}~\textbf{x}}{(s_{max} - s_{min})~\!\gamma + 2~\!s_{min}}. \end{aligned}$$
(7)

Note that, if \(\gamma > 1\), the size-difference \((s_{max} - s_{min})\) term is amplified: thus, maximizing \(p_\gamma \) enforces such a term to be small, which corresponds to favoring size balance among communities. The opposite happens if \(\gamma \in (0, 1)\). Instead, \(\gamma = 1\) makes \(\gamma \)-polarity boil down to standard polarity.

The relaxed counterpart of \(\gamma \)-polarity, for a given continuous vector \(\textbf{z}\in [-1, 1]^{|V|}\) is defined by simply replacing \(\textbf{x}\) with \(\textbf{z}\) in Equation (7) (including in the computation of \(s_{max}\) and \(s_{min}\)). Relaxed \(\gamma \)-polarity can be incorporated in the proposed Neural2PC approach by simply replacing the polarity term \(p(\textbf{z}, \textbf{A})\) with a relaxed \(\gamma \)-polarity term \(p_\gamma (\textbf{z}, \textbf{A})\) in the \(\mathcal {L}_{\text {2PC}}\) loss function (Equation (5)).
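For concreteness, \(\gamma \)-polarity of a discrete assignment can be computed as follows. The edge-list representation is our own choice; each undirected edge is counted twice in \(\textbf{x}^\top \textbf{A}~\textbf{x}\):

```python
import numpy as np

def gamma_polarity(x, edges, gamma):
    """p_gamma(x, A) from Definition 2; edges is a signed edge list of
    (u, v, sign) tuples, each undirected edge counted twice in x^T A x."""
    num = 2 * sum(s * x[u] * x[v] for u, v, s in edges)
    s1 = np.sum(np.abs(x[x < 0]))      # size of the first community
    s2 = np.sum(x[x > 0])              # size of the second community
    s_max, s_min = max(s1, s2), min(s1, s2)
    den = (s_max - s_min) * gamma + 2 * s_min
    return num / den if den > 0 else 0.0
```

As noted above, `gamma_polarity(x, edges, 1.0)` recovers the standard polarity, since the denominator then equals \(s_{max} + s_{min} = \textbf{x}^\top \textbf{x}\).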

Example 2

Consider the example graph in Figure 3, where positive edges are depicted by solid lines, while negative edges correspond to red dashed lines. Let \(P_1 = \lbrace \lbrace \texttt {A,B,C,D} \rbrace , \lbrace \texttt {E,F,G,H} \rbrace \rbrace \), \(P_2 = \lbrace \lbrace \texttt {A,B,C,D} \rbrace , \lbrace \texttt {E,F,G,H,I,J,K,L} \rbrace \rbrace \), and \(P_3 = \lbrace \emptyset , \lbrace \texttt {E,F,G,H,I,J,K,L} \rbrace \rbrace \) be three possible pairs of polarized communities.

The basic polarity (Definition 1) of \(P_1\), \(P_2\), and \(P_3\) is \((15 \times 2)/8=3.75\), \((22 \times 2)/12=3.67\), and \((15 \times 2)/8=3.75\), respectively. Although \(P_1\) and \(P_3\) both exhibit the highest polarity, \(P_1\) is much more size-balanced, and thus intuitively preferable.

It is easy to see that using \(\gamma \)-polarity, with any \(\gamma >1\), instead of basic polarity allows for recognizing \(P_1\) as better than \(P_3\) (as \(P_1\) and \(P_3\) have the same \(\gamma \)-polarity numerator, while the \(\gamma \)-polarity denominator for \(P_3\) would be higher than the one for \(P_1\) for any \(\gamma >1\)).
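Plugging the numbers of Example 2 into Definition 2, e.g., with \(\gamma = 2\), makes the preference for \(P_1\) explicit (both pairs share the numerator \(15 \times 2 = 30\)):

```python
def gamma_denominator(s1, s2, gamma):
    """Denominator of gamma-polarity: (s_max - s_min) * gamma + 2 * s_min."""
    s_max, s_min = max(s1, s2), min(s1, s2)
    return (s_max - s_min) * gamma + 2 * s_min

num = 30                                  # x^T A x: 15 agreeing edges, counted twice
p1 = num / gamma_denominator(4, 4, 2.0)   # P1: two communities of 4 nodes each
p3 = num / gamma_denominator(8, 0, 2.0)   # P3: one community of 8 nodes, one empty
```

Here \(p_2(P_1) = 30/8 = 3.75\), while \(p_2(P_3) = 30/16 = 1.875\): the balanced pair wins for any \(\gamma > 1\), since only the denominator of \(P_3\) grows with \(\gamma \).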

Fig. 3
figure 3

Example signed graph where, unlike basic polarity, \(\gamma \)-polarity detects effective polarized communities that are also size-balanced (Example 2)

Detecting fine-grained polarization phenomena within signed networks is crucial, as it is essential for recognizing minorities in polarization, which might correspond to harmful situations such as isolation, as illustrated in the following running example.

Example 3

Consider the example graph in Figure 4-(a), where positive edges are depicted by solid lines, and negative edges correspond to dashed lines. The scenario illustrated in the figure might be regarded as a case of isolation, wherein one node (\(\texttt {A}\)) has a majority of conflicting relations with a clique (i.e., \(\texttt {B, C, D, E}\)), and only a few links with a smaller set of nodes (i.e., \(\texttt {F,G}\)). Therefore, it is desirable to capture a polarization solution wherein the clique is regarded as one group while node \(\texttt {A}\) forms the other, isolated group.

Existing methods fail to identify such a desired solution: indeed, given the inherent bias of the polarity objective towards outputs favoring communities with imbalanced sizes, both Eigen and R-Eigen yield \(\lbrace S_1, \emptyset \rbrace = \lbrace \lbrace \texttt {B, C, D, E, F, G} \rbrace , \emptyset \rbrace \). This also corresponds to an intermediate solution provided by our proposed method equipped with the \(\gamma \)-polarity objective (e.g., \(\gamma = 2.1\)) (cf. Figure 4-(b)), before finally producing the desired output \(\lbrace S_1, S_2 \rbrace = \lbrace \lbrace \texttt {B, C, D, E} \rbrace , \lbrace \texttt {A} \rbrace \rbrace \) at convergence (cf. Figure 4-(c)).

Fig. 4
figure 4

Running example illustrating a scenario of node-isolation detection (Example 3)

5 Experimental methodology

Evaluation goals. We assessed: (1) the accuracy of the proposed Neural2PC (Algorithm 1) and of competitors/baselines on real datasets, and (2) on synthetic datasets; (3) the impact of different signed GNNs when used as a module of Neural2PC; (4) the runtimes of the considered methods; (5) the effectiveness of the individual components of Neural2PC, through an ablation study; and (6) the effectiveness of the \(\gamma \)-polarity measure in yielding communities that are both size-balanced and of high quality.

Real datasets. We selected publicly-available real-world signed graphs, whose characteristics are summarized in Table 2. Bitcoin and Epinions are the trust-distrust networks of users of the Bitcoin OTC trading platform and the Epinions social network, respectively. Cloister contains the esteem/disesteem relations of monks living in a cloister in New England (USA). Congress reports (un/)favorable mentions of politicians speaking in the US Congress. HTribes describes the alliance/enmity relationships among tribes in New Guinea. Slashdot is a friend-foe network collected from the Slashdot technology news site. TwitterRef collects the tweets about the Italian constitutional referendum in 2016, and edge signs express whether two users have the same stance or not. WikiCon contains positive/negative edit conflicts between the users of the English Wikipedia. WikiEle collects the positive/negative votes for electing admins in the English Wikipedia. WikiPol describes the signed interactions of users who have edited the English Wikipedia pages about politics.

Table 2 Main characteristics of real data used in our evaluation. \(E = E^+ \cup E^-\): overall edge set; density: \(|E|/(|V|(|V|-1)/2)\); deg\(^+\), deg\(^-\): avg of positive and negative node degrees, resp.; cc\(^+\), cc\(^-\): #connected components in the subgraph induced by \(E^+\) and \(E^-\), resp.; cc: overall #connected components
Table 3 Main characteristics of m-SSBM-generated network data used in our evaluation. \(E = E^+ \cup E^-\): overall edge set; density: \(|E|/(|V|(|V|-1)/2)\); deg\(^+\), deg\(^-\): avg of positive and negative node degrees, respectively

Synthetic datasets. We also employed synthetic signed graphs in order to test the methods in recovering ground-truth polarized communities. We used the modified signed stochastic block model (m-SSBM) (Tzeng et al. 2020) as a generator. This model has three parameters, namely the total number n of nodes, the size \(n_c = |S_1| = |S_2|\) of the planted polarized communities (all have the same size), and a parameter \(\eta \in [0, 1]\) controlling edge probabilities: (i) an edge within the same group (resp. between the two polarized groups) is drawn as positive (resp. negative) with probability \(1-\eta \), as negative (resp. positive) with probability \(\eta /2\), and is not drawn with probability \(\eta / 2\); (ii) any other edge is drawn as positive or negative, each with probability \(\min (\eta , 1/2)\). Note that the smaller \(\eta \), the lower the noise level. The case with no noise \((\eta = 0)\) corresponds to the “perfect” structure (i.e., all nodes are disconnected except those linked within or across the polarized communities), whereas polarized communities only emerge for \(\eta \le 2/3\), since for \(\eta > 2/3\) the generated graph has more negative edges within the groups and more positive edges between the groups.
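The edge-formation rules above can be sketched as a generator. We assume, as a convention of our own, that the planted communities occupy the first \(2 n_c\) node indices:

```python
import numpy as np

def m_ssbm(n, n_c, eta, rng=None):
    """Sketch of the m-SSBM generator (Tzeng et al., 2020), assuming
    planted communities S1 = {0..n_c-1} and S2 = {n_c..2*n_c-1}.
    Returns a list of signed edges (u, v, sign)."""
    if rng is None:
        rng = np.random.default_rng(0)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            in_s1 = u < n_c and v < n_c
            in_s2 = n_c <= u < 2 * n_c and n_c <= v < 2 * n_c
            across = u < n_c and n_c <= v < 2 * n_c
            r = rng.random()
            if in_s1 or in_s2 or across:
                base = 1 if (in_s1 or in_s2) else -1   # expected sign
                if r < 1 - eta:
                    edges.append((u, v, base))          # expected sign
                elif r < 1 - eta / 2:
                    edges.append((u, v, -base))         # flipped sign (prob eta/2)
                # else: edge absent (prob eta/2)
            else:
                p = min(eta, 0.5)                       # neutral pairs
                if r < p:
                    edges.append((u, v, 1))
                elif r < 2 * p:
                    edges.append((u, v, -1))
    return edges
```

With \(\eta = 0\), the generator outputs exactly the “perfect” structure: all within-group edges positive, all cross-group edges negative, and no edges among neutral nodes.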

We considered different synthetic graphs by varying the number of nodes (n), the community size \(n_c\), and \(\eta \in \lbrace 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6 \rbrace \). For each configuration, we generated 10 different graphs. Table 3 summarizes the main characteristics (average statistics) of the generated network data.

Competing methods. We compare our Neural2PC (Algorithm 1) to the state-of-the-art methods for discovering polarized communities, as well as against non-trivial baselines inspired by methods devised for different yet related problems.

We consider both the methods originally conceived by Bonchi et al. (2019), namely Eigen and its randomized R-Eigen counterpart, as our direct competing methods, since they target the same optimization problem (2pc) we tackle in this work.

Like Bonchi et al. (2019), we also include Pivot, a baseline inspired by a correlation-clustering algorithm (Bansal et al., 2004). For each node \(u \in V\), Pivot identifies u and the nodes sharing a positive edge with u as one cluster, and the nodes sharing a negative edge with u as the other cluster. From the |V| resulting solutions, it returns the one maximizing polarity. We also consider Greedy (Bonchi et al., 2019), a method inspired by a 2-approximation algorithm for densest subgraph (Charikar, 2003). It iteratively removes nodes so as to maximize the difference between positive and negative adjacent edges, until the graph is empty; it then returns the subgraph having the highest polarity among all produced subgraphs. Node-to-cluster assignment is guided by the sign of the components of the eigenvector corresponding to the largest eigenvalue of \(\textbf{A}\).
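A minimal sketch of the Pivot baseline as described above; the edge-list representation and the naive \(\mathcal {O}(|V||E|)\) scan are our own simplifications:

```python
import numpy as np

def pivot(n, edges):
    """For each node u, take u plus its positive neighbors as one
    community and its negative neighbors as the other, then keep the
    assignment with the highest polarity p(x, A) = x^T A x / x^T x."""
    def polarity(x):
        num = 2 * sum(s * x[a] * x[b] for a, b, s in edges)
        den = np.count_nonzero(x)
        return num / den if den > 0 else 0.0

    best_x, best_p = np.zeros(n, dtype=int), -np.inf
    for u in range(n):
        x = np.zeros(n, dtype=int)
        x[u] = 1
        for a, b, s in edges:
            if u in (a, b):
                other = b if a == u else a
                x[other] = 1 if s > 0 else -1
        p = polarity(x)
        if p > best_p:
            best_x, best_p = x, p
    return best_x, best_p
```

The sketch makes the limitation discussed later apparent: each candidate solution is confined to the 1-hop neighborhood of a single pivot node.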

Furthermore, we consider the signed graph clustering algorithms BNC (Chiang et al., 2012), SPONGE (Cucuringu et al., 2019) and SSSNet (He et al., 2022). Since they all require the number k of output clusters (communities), we denote them as BNC(k), SPONGE(k) and SSSNet(k), respectively. As previously done by Tzeng et al. (2020), we consider two variants of these competitors. The first one consists in setting \(k=2\) and returning the 2 detected clusters as polarized communities. The second variant sets \(k = 3\), treats the largest of the 3 detected clusters as the group of neutral nodes, and returns the 2 smallest clusters as polarized communities.

Experimental setting. We instantiate the sgnn(\(\cdot \)) block of our Neural2PC framework with well-established signed GNNs, namely SGCN (Derr et al., 2018), SNEA (Li et al., 2020), SGDNET (Liu et al., 2021). As for SGCN, we consider different variants by varying the neighbor aggregation operator in \(\lbrace \text {mean}, \text {sum}, \text {attention} \rbrace \). We denote such variants SGCNmean, SGCNsum and SGCNatt, respectively. Also, for any considered signed GNN, we add the “-DR” suffix if the proposed loss regularization term is used (Equation (5)). The absence of such a suffix means that \(\lambda \) was set to 0.

All signed GNN models, implemented using PyTorch Geometric and trained on CPUs, share uniform settings, with node representation dimensionality \(d_H = 64\), \(m = 2\) layers, and default values for the other parameters. Concerning SGDNET, we set the number of latent groups to 3. Since no initial node features are available for the selected signed graphs, following previous works (Derr et al., 2018) we used the final embedding of a signed spectral embedding model (Kunegis et al., 2010) as the input feature matrix, with \(d_I=64\). For each configuration of our method and SSSNet, model training was carried out by the Adam optimizer, for \(e_{max} = 300\) epochs, grid-searching the learning rate \(\alpha \) in \(\lbrace 0.01, 0.005, 0.001 \rbrace \) and the regularization factor \(\lambda \) in \(\lbrace 0.1, 0.01, 0.001 \rbrace \). All reported measurements correspond to averages over 30 runs. Details about the execution environment can be found in the Appendix.

6 Results

Neural2PC vs. competitors, real datasets. Table 4 reports the values of polarity (Definition 1) achieved by all compared algorithms on all datasets, and the size of the produced two communities. Concerning our Neural2PC, we only report the results obtained by the best-performing (in terms of polarity) graph representation learning method. As for the graph clustering methods (i.e., SPONGE(k), BNC(k) and SSSNet(k)), we also report the number of desired communities \(k \in \lbrace 2, 3 \rbrace \) which led to the best results in terms of polarity.

As a first remark, our Neural2PC generally proves to be the most competitive method in terms of polarity. Note that the exceptions on the Slashdot, WikiEle and WikiPol datasets (where Greedy surpasses Neural2PC) correspond, however, to a very dense subgraph returned by Greedy as one of the two polarized communities, leaving the second community totally empty, which is clearly undesired in practice. Conversely, our method is able to return two non-empty communities with high polarity on WikiEle and WikiPol.

Table 4 Polarity (Def. 1) and solution size (\(|S_1|;|S_2|\)) of the proposed Neural2PC method vs. competing methods on real datasets. Best results in bold, second-best underlined
Table 5 Edge-agreement ratio of the proposed Neural2PC method vs. competing methods on real datasets
Table 6 Performance of the proposed Neural2PC vs. competing methods on synthetic datasets, in terms of \(F_1\)-score and polarity (Def. 1) as a function of the noise parameter \(\eta \)
Table 7 Performance of the proposed Neural2PC on synthetic datasets with varying number of nodes n and community sizes \(n_c\), in terms of \(F_1\)-score, polarity (Def. 1) and running time (in seconds) as a function of the noise parameter \(\eta \)

Among our competitors, Eigen and R-Eigen, which also address the 2pc problem, both achieve strong polarity results. Eigen outperforms R-Eigen, and on the smallest datasets, Congress and HTribes, our method matches Eigen's solutions regardless of the adopted GNN model. In general, however, our method performs better than both Eigen and R-Eigen.

Pivot overall performs poorly in terms of polarity. This can be explained by the fact that, in identifying polarized groups by exploring local neighborhoods, its search space strongly depends on the neighborhood structure of the nodes. Also, its detected communities tend to be located around high-degree nodes but, in general, the pair of communities with the highest polarity does not necessarily lie around high-degree nodes.

Concerning the BNC and SPONGE methods, when \(k=2\) they perform poorly, as expected, since all neutral nodes are put into one of the two detected communities, resulting in solutions with low polarity. In fact, the two methods often yield solutions where one community is a large group likely including all neutral nodes, while the other consists of few nodes. Conversely, when \(k=3\), even though the methods can use the spare cluster for the neutral nodes, they both perform worse than in the \(k=2\) case, yielding solutions of extremely small size. The same consideration holds for the SSSNet method, although to a lesser extent, as it performs better than SPONGE and BNC. In any case, the three selected clustering methods are consistently outperformed by our method.

We complement our evaluation by analyzing the edge-agreement ratio, which measures the portion of edges in the solution that comply with the polarized structure, i.e., the number of intra-community positive edges inside \(S_1\) and \(S_2\) plus the number of inter-community negative edges, divided by the total number of edges in the subgraph induced by \(S_1 \cup S_2\). Table 5 reports the edge-agreement ratio corresponding to the best-polarity solutions (cf. Table 4). Overall, the edge-agreement ratio is consistently close to or equal to 1, especially for Neural2PC and Eigen, with an average across all datasets of 0.97 and 0.96, respectively. The remaining methods follow a similar, although weaker, trend: this means that the solutions have a coherent polarized structure.
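For concreteness, the edge-agreement ratio just defined can be computed as follows (the edge-list representation and function name are our own):

```python
import numpy as np

def edge_agreement(x, edges):
    """Fraction of edges in the subgraph induced by S1 U S2 that comply
    with the polarized structure: positive edges inside a community, or
    negative edges across the two communities.

    x: assignment vector in {-1, 0, 1}; edges: list of (u, v, sign)."""
    induced = agree = 0
    for u, v, s in edges:
        if x[u] != 0 and x[v] != 0:      # both endpoints in S1 U S2
            induced += 1
            same_side = x[u] == x[v]
            if (s > 0 and same_side) or (s < 0 and not same_side):
                agree += 1
    return agree / induced if induced > 0 else 1.0
```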

Neural2PC vs. competitors, on synthetic datasets. Table 6 shows the \(F_1\)-scores and corresponding polarity scores averaged over the 10 synthetic graphs that were generated for each configuration, by varying the noise parameter \(\eta \) while keeping fixed the network size (\(n=1000\)) and the number of communities (\(n_c=100\)). For our method, we only report the results corresponding to the best-performing (in terms of \(F_1\)-score) signed GNN model, which was SGDNET for all the datasets and noise levels.

From the table, it is evident that our Neural2PC is very robust in handling increasing noise levels, as it mostly outperforms all competitors in terms of both \(F_1\)-score and polarity. The only exception arises for \(\eta = 0.6\), where BNC(2) yields a slightly higher \(F_1\)-score than Neural2PC; however, the polarity of the solution yielded by BNC(2) is \(36\times \) lower, thus resulting in a low-quality solution. Recall that the m-SSBM model generates very noisy graphs for mid-to-high values of \(\eta \), where the ground-truth communities are “hidden” by the graph topology, since there are many intra-community negative edges and many inter-community positive edges. In such cases, a higher polarity is a more reliable indicator of the quality of the discovered solutions; indeed, the polarity score is consistent with the \(F_1\)-score, i.e., high polarity values mostly correspond to high \(F_1\)-scores.

In addition, as shown in Table 7, our Neural2PC keeps a similar behavior and remains robust w.r.t. \(\eta \) for varying numbers of nodes and community sizes (cf. Table 3); again, results correspond to averages over 10 synthetic networks, and the results in the second-to-last row coincide with those for Neural2PC reported in Table 6. Note that, according to the definition of m-SSBM and its edge-formation rules based on \(\eta \), it is not surprising that the impact of \(\eta \), in terms of polarity decrease rate, tends to be higher on larger m-SSBM synthetic networks, given the increased amount of noise introduced by \(\eta \).

Impact of different signed GNNs on Neural2PC. We analyzed polarity and community size values corresponding to all the variants of our Neural2PC method based on different signed GNN models (results are shown in the Appendix).

Our experiments revealed that the polarity of the solutions provided by Neural2PC does not significantly change across the various GNNs, which indicates robustness of our approach in terms of one of its main components.

Fig. 5
figure 5

Execution times (in seconds) of the proposed Neural2PC method vs. competing methods on real-world network datasets

Execution times. Figure 5 shows the average time performance of the methods over the various runs; Table 17 in the Appendix reports detailed information; in particular, for Neural2PC we detail the total running time and, in brackets, the time excluding the rounding steps (i.e., discarding the cumulative time spent executing the step at Line 4 of Algorithm 1 over all epochs).

As expected, the learning-based methods, i.e., SSSNet and our Neural2PC, exhibit the highest running times, which is clearly affected by their number of epochs (\(e_{max} = 300\)). Importantly, we remark that the reported running times of the Neural2PC method refer to its execution on a single CPU, without any parallelization. However, Neural2PC, like any neural-network method, can easily benefit from dedicated hardware (e.g., GPUs) and parallelization, which can drastically improve its execution time. Also, note that the time per epoch of Neural2PC is comparable to the runtime of the fastest method(s).

Among the other methods, SPONGE performs the best, followed by BNC and Eigen. R-Eigen is slightly worse than Eigen due to its randomized nature involving sampling. Among the non-learning-based methods, Pivot and Greedy are inefficient since they extract and evaluate |V| solutions.

In addition, remarks about the scalability of Neural2PC can be drawn from the results on synthetic networks shown in Table 7: for any given value of \(\eta \), the runtime of Neural2PC increases linearly with the size of the network, which accords with our previously discussed analysis of computational aspects.

Ablation study. To assess the effect of the main components of our Neural2PC framework, we conducted an ablation study on the following simplifications of Neural2PC: (i) NN, which discards the sgnn\((\cdot )\) block, hence being composed of the nn\((\cdot )\) block only, and (ii) Direct, which performs a direct optimization of the continuous \(\textbf{z}\) node-to-community assignments by minimizing the \(\mathcal {L}_{\text {2PC}}\) loss function (Equation 5) via projected gradient descent, i.e., at every step, a gradient-descent update followed by projection of the assignment variables onto the range \([-1, 1]\).
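A minimal sketch of the Direct variant, under the simplifying assumptions that the loss reduces to the relaxed polarity \(p(\textbf{z}, \textbf{A}) = \textbf{z}^\top \textbf{A}~\textbf{z}/\textbf{z}^\top \textbf{z}\) (no regularization, i.e., \(\lambda = 0\)) and that a dense adjacency matrix is used; hyperparameters are illustrative:

```python
import numpy as np

def direct_pgd(A, steps=500, lr=0.05, seed=0):
    """Maximize the relaxed polarity p(z, A) = z^T A z / z^T z by
    gradient ascent, projecting z onto [-1, 1]^|V| after every step
    (projected gradient descent on the negated objective)."""
    rng = np.random.default_rng(seed)
    z = rng.uniform(-0.1, 0.1, size=A.shape[0])
    for _ in range(steps):
        zz, zAz = z @ z, z @ A @ z
        # gradient of the Rayleigh-quotient-like objective
        grad = (2 * (A @ z) * zz - 2 * z * zAz) / (zz ** 2)
        z = np.clip(z + lr * grad, -1.0, 1.0)   # ascent step + projection
    return z
```

On a perfectly polarized graph, the continuous solution aligns with the sign pattern of the planted communities, which is precisely what the subsequent rounding step exploits.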

Table 8 Ablation study results for the proposed Neural2PC method

Results of the ablation study are shown in Table 8, in terms of polarity, solution size, and total execution time. The full Neural2PC is confirmed to be necessary to achieve the best polarity on all datasets, or at least polarity comparable with Direct (on TwitterRef and WikiEle); the latter, however, is less efficient than Neural2PC, especially on larger datasets. This can be explained by the fact that the sgnn\((\cdot )\) module in Neural2PC helps produce more similar continuous scores for nodes that are assigned to the same community after rounding. That is, the rounding step of Neural2PC has to test fewer thresholds than Direct, hence it takes less time. Also, concerning the size of the communities, both Neural2PC and Direct yield solutions that involve more nodes than NN.

The above results are complemented by an analysis of the trends of polarity achieved by Neural2PC and its simplifications, by varying the number of training epochs (up to \(e_{max} = 300\)); results are shown in the Appendix. This analysis revealed that the Direct variant often requires more epochs (on average, at least double) than Neural2PC and NN to reach the maximum polarity.

Overall, the outcomes of this ablation study justify the need for all components of the proposed Neural2PC framework.

\(\gamma\)-polarity results. In this experimental stage, we delve into the impact of the value of \(\gamma \) on the size as well as the quality of the solutions provided by Neural2PC equipped with the \(\gamma \)-polarity loss function. Given \(\gamma \), let \(\textbf{x}_\gamma \) be the solution obtained by optimizing the \(\gamma \)-polarity; e.g., \(\textbf{x}_1\) corresponds to the solution obtained by optimizing the \(\mathcal {L}_{\text {2PC}}\) loss. We are interested in evaluating the quality of the \(\textbf{x}_\gamma \) solutions for different values of \(\gamma \), both in terms of \(\gamma \)-polarity (i.e., with the same value of \(\gamma \) used for the training process) and in terms of the standard polarity, i.e., 1-polarity. Also, we analyze the \(\gamma \)-polarity of the \(\textbf{x}_1\) solutions, as well as the size of the discovered polarized communities, by varying \(\gamma \). We selected several values of \(\gamma \) spanning a relatively large interval above 1 (up to 20); for each such value, we also considered its reciprocal, in order to explore a comparable range of values below 1; for instance, the reciprocal of \(\gamma =1.5\) is \(\gamma =0.66\). We conducted this experiment on 5 datasets but, for the sake of simplicity, we only show a subset of these results, for Congress (cf. Table 9) and Slashdot (cf. Table 10): details about these experiments and about other datasets can be found in the Appendix. For both our Neural2PC and the competing methods, we report the results obtained by the best-performing (in terms of \(\gamma \)-polarity) configuration.

Table 9 \(\gamma \)-polarity scores of the proposed Neural2PC method vs. its best-performing competing method, on Congress dataset
Table 10 \(\gamma \)-polarity scores of the proposed Neural2PC method vs. its best-performing competing method, on Slashdot dataset

Looking at the tables, several remarks stand out. First, Neural2PC proves to be the most competitive method in terms of \(\gamma \)-polarity on Congress for any \(\gamma \); on Slashdot, this holds only above a certain value of \(\gamma \) (i.e., 1.5). Second, as expected, the \(\gamma \)-polarity of the solutions produced by the competing methods (as well as by Neural2PC trained with the 1-polarity, i.e., \(p_\gamma (\textbf{x}_1)\)) monotonically decreases as \(\gamma \) increases. This is because, as \(\gamma \) increases, the denominator of \(\gamma \)-polarity increases. On the other hand, the values of \(p_\gamma (\textbf{x}_\gamma )\) exhibit a decreasing, though not monotonic, trend as \(\gamma \) increases.

As for solution size, the two polarized communities yielded by Neural2PC tend to become more balanced (resp. unbalanced) for higher (resp. lower) values of \(\gamma \) until, beyond a certain threshold value, the two communities have the same size (resp., one community is empty). On the contrary, the solutions provided by the competing methods are highly unbalanced in size, especially on Slashdot, where one of the two detected communities is empty.

More interestingly, controlling the balance between the size of the discovered communities is also beneficial in terms of 1-polarity. In particular, this happens on Congress, where favoring polarized communities more balanced in size (i.e., by setting \(\gamma > 1\)) leads to the best solutions in terms of 1-polarity, i.e., \(p_{1}(\textbf{x}_{4.0}) = 6.67 > 6.64 = p_{1}(\textbf{x}_{1}) \) (cf. Table 9).

Fig. 6
figure 6

Solutions yielded by Neural2PC by optimizing \(\gamma \)-polarity (Def. 2), for different values of \(\gamma \), on the Congress dataset

Figure 6 complements the above results by providing insights into the structural properties of the polarized communities detected by Neural2PC (equipped with the SGCNmean GNN) on the Congress dataset. Specifically, for \(\gamma = 0.1\), one community is empty and the other has 31 nodes with 81 intra-community positive edges and no negative edges. With \(\gamma = 0.66\), the empty community becomes a singleton (i.e., one node), and the other has 36 nodes with 93 intra-community positive edges and 11 inter-community negative edges (plus 1 positive edge). For \(\gamma = 1\), there are two communities with 32 and 23 nodes and 83 (resp. 52) internal positive edges, connected by 50 negative and 3 positive edges. Optimizing for 4-polarity results in two communities of 24 nodes each, featuring 57 and 58 intra-community positive edges, connected by 46 negative and 1 positive edge.

Overall, our experiments revealed the practical benefit of \(\gamma \)-polarity since we can inspect the communities obtained for different \(\gamma \) values and keep the ones that best suit our purposes.

7 Conclusion

Summary. In this paper, we advanced the state of the art in 2pc, a well-established combinatorial-optimization problem which aims to discover two polarized communities from a signed graph through maximization of the so-called polarity function. We provided a twofold contribution: (i) a novel neural-network-based approach to 2pc, and (ii) a generalization of the polarity function, \(\gamma \)-polarity, which we suitably incorporate into the proposed neural framework. Notably, we addressed two key limitations of the state of the art in 2pc: the reliance on a single relaxed solution for producing the ultimate discrete solution, and the bias towards size-unbalanced output communities.

Our experimental evaluation, conducted on 10 real-world signed networks and on synthetically generated signed networks, has provided empirical evidence of the meaningfulness and relevance of our Neural2PC against all competing methods for discovering polarized groups. In particular, (i) Neural2PC outperforms all competitors in terms of polarity value, as well as in terms of coherent polarized structure and ability to produce non-empty groups; (ii) the \(\gamma \)-polarity loss makes Neural2PC flexible, which is useful to effectively control the balance of the sizes of the discovered polarized groups; (iii) all components of Neural2PC are justified as contributing to its best polarity performance, and (iv) this performance appears to be relatively robust w.r.t. the different signed GNN models constituting its SGNN component; (v) the runtime of Neural2PC scales linearly with the size of the network.

Future works. We believe that a number of further developments are worthy of investigation. First, we aim to address two main limitations of our framework, namely the number of polarized groups that can be detected, and the separation of the rounding block from the neural-network components. Addressing the former would be a key enabler for extended polarization scenarios, such as multi-party political networks. To this purpose, major interventions would be to revise our approach so as to integrate the extension of the 2pc task to a \(k>2\) polarization setting (Tzeng et al., 2020), and to refine the (\(\gamma \)-)polarity loss and the computation of the continuous node-to-community assignment solutions. The latter limitation could be overcome by integrating the rounding block into the neural architecture: this way, by learning the rounding of the continuous solution, the repeated trials of rounding thresholds \(\tau \) could be avoided, thus also overcoming a limitation of early approximation algorithms for 2pc. Further interesting research paths include devising a custom GNN model for the 2pc task, and delving into the \(\gamma \)-polarity function from a combinatorial-optimization perspective.