
Introduction to Disjoint Set (Union-Find Algorithm)

Last Updated : 26 Feb, 2025

Two sets are called disjoint sets if they don’t have any element in common. The disjoint set data structure is used to store such sets. It supports the following operations:

  • Merging two disjoint sets into a single set using the Union operation.
  • Finding the representative of a disjoint set using the Find operation.
  • Checking whether two elements belong to the same set. We find the representatives of both elements and check whether they are the same.

Consider a situation with a number of persons and the following tasks to be performed on them:

  • Add a new friendship relation, i.e. person x becomes a friend of person y. This merges the set containing x with the set containing y.
  • Find whether individual x is a friend of individual y (directly or indirectly).

Examples: 

We are given 10 individuals: a, b, c, d, e, f, g, h, i, j.

The following relationships are to be added:
a <-> b  
b <-> d
c <-> f
c <-> i
j <-> e
g <-> j

Given queries such as whether a is a friend of d, we basically need to create the following 4 groups and maintain a quickly accessible connection among the items of each group:
G1 = {a, b, d}
G2 = {c, f, i}
G3 = {e, g, j}
G4 = {h}

We then need to find whether x and y belong to the same group, i.e. whether x and y are direct or indirect friends.

This is done by partitioning the individuals into different sets according to the groups they fall into. This method is known as Disjoint Set Union: it maintains a collection of disjoint sets, and each set is represented by one of its members.

To answer the above question, two key points need to be considered:

  • How to resolve sets? Initially, all elements belong to different sets. As the given relations are processed, sets are merged, and one member of each set is selected as its representative.
  • How to check if 2 persons are in the same group? If the representatives of two individuals are the same, then they are friends.
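
The friendship example above can be sketched in a few lines. This is a minimal illustration, not an optimized implementation: the mapping of persons a..j to indices 0..9 and the names `Friends`, `addFriendship`, `areFriends`, `groupCount` and `buildExample` are assumptions made for this sketch, and it relies on the parent array described in the next paragraphs.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Minimal sketch: persons a..j are mapped to indices 0..9 (an assumption
// made for this example; the article does not fix an encoding).
struct Friends {
    std::vector<int> parent;

    explicit Friends(int n) : parent(n) {
        // Initially every person is the representative of their own group.
        std::iota(parent.begin(), parent.end(), 0);
    }

    // Walk up until reaching a person who is their own parent.
    int find(int x) const {
        assert(x >= 0 && x < static_cast<int>(parent.size()));
        while (parent[x] != x) x = parent[x];
        return x;
    }

    // Record a friendship between persons x and y (given as letters).
    void addFriendship(char x, char y) {
        int rx = find(x - 'a');
        int ry = find(y - 'a');
        if (rx != ry) parent[rx] = ry;
    }

    // Direct or indirect friends share a representative.
    bool areFriends(char x, char y) const {
        return find(x - 'a') == find(y - 'a');
    }

    // Number of groups = number of representatives.
    int groupCount() const {
        int count = 0;
        for (int i = 0; i < static_cast<int>(parent.size()); i++)
            if (parent[i] == i) count++;
        return count;
    }
};

// Builds the ten-person example with the six relationships listed above.
Friends buildExample() {
    Friends f(10);
    f.addFriendship('a', 'b');
    f.addFriendship('b', 'd');
    f.addFriendship('c', 'f');
    f.addFriendship('c', 'i');
    f.addFriendship('j', 'e');
    f.addFriendship('g', 'j');
    return f;
}
```

With this sketch, `buildExample().areFriends('a', 'd')` is true, `areFriends('a', 'c')` is false, and `groupCount()` reports the four groups G1 to G4.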

Data Structures used are: 

Array: An array of integers called Parent[]. If we are dealing with N items, the i’th element of the array corresponds to the i’th item: Parent[i] is the parent of item i. These relationships form one or more virtual trees.

Tree: Each tree is a disjoint set. If two elements are in the same tree, then they are in the same disjoint set. The root node (or the topmost node) of each tree is called the representative of the set. Each set always has a single unique representative. A simple rule identifies it: if ‘i’ is the representative of a set, then Parent[i] = i. If i is not the representative of its set, the representative can be found by traveling up the tree until a root is reached.
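
To make the array-as-forest idea concrete, the sketch below takes a hand-written parent array and recovers the trees it encodes. The helper names `representative` and `groupByRepresentative` are hypothetical, and the code assumes the array describes a valid forest (no cycles other than roots pointing to themselves).

```cpp
#include <cassert>
#include <map>
#include <vector>

// Follow parent links from i until reaching a root (an item that is its
// own parent). Assumes the array describes a valid forest without cycles.
int representative(const std::vector<int>& parent, int i) {
    assert(i >= 0 && i < static_cast<int>(parent.size()));
    while (parent[i] != i) {
        i = parent[i];
    }
    return i;
}

// Group all items by their representative: one map entry per tree.
std::map<int, std::vector<int>> groupByRepresentative(const std::vector<int>& parent) {
    std::map<int, std::vector<int>> groups;
    for (int i = 0; i < static_cast<int>(parent.size()); i++) {
        groups[representative(parent, i)].push_back(i);
    }
    return groups;
}
```

For the array `{0, 0, 1, 3, 3, 5}`, item 2 reaches root 0 through item 1, so the three trees are {0, 1, 2}, {3, 4} and {5}.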

Operations on Disjoint Set Data Structures:

1. Find:

The task is to find the representative of the set that contains a given element. The representative is always the root of the tree, so we implement find() by recursively following the parent array until we hit a node that is a root (its own parent).

2. Union: 

The task is to combine two sets into one. It takes two elements as input, finds the representatives of their sets using the Find operation, and then puts one of the two trees under the root node of the other.

C++
#include <iostream>
#include <vector>
using namespace std;

class UnionFind {
    vector<int> parent;
public:
    UnionFind(int size) {
      
        parent.resize(size);
      
        // Initialize the parent array with each 
        // element as its own representative
        for (int i = 0; i < size; i++) {
            parent[i] = i;
        }
    }

    // Find the representative (root) of the
    // set that includes element i
    int find(int i) {
      
        // If i itself is root or representative
        if (parent[i] == i) {
            return i;
        }
      
        // Else recursively find the representative 
        // of the parent
        return find(parent[i]);
    }

    // Unite (merge) the set that includes element 
    // i and the set that includes element j
    void unite(int i, int j) {
      
        // Representative of set containing i
        int irep = find(i);
      
        // Representative of set containing j
        int jrep = find(j);
       
        // Make the representative of i's set
        // be the representative of j's set
        parent[irep] = jrep;
    }
};

int main() {
    int size = 5;
    UnionFind uf(size);
    uf.unite(1, 2);
    uf.unite(3, 4);
    bool inSameSet = (uf.find(1) == uf.find(2));
    cout << "Are 1 and 2 in the same set? " 
         << (inSameSet ? "Yes" : "No") << endl;
    return 0;
}

Output
Are 1 and 2 in the same set? Yes

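The unite() and find() implementations above are naive, and their worst-case time complexity is linear. The trees that represent the sets can become skewed and degenerate into a linked list. For example, calling unite(0, 1), unite(1, 2), ..., unite(n-2, n-1) builds the chain 0 -> 1 -> ... -> n-1, after which find(0) has to follow n-1 parent links.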
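
The linear worst case can be observed directly by counting parent links. The sketch below reuses the naive linking rule (root of i's tree goes under root of j's tree); the `hops` counter and the names `NaiveUnionFind` and `hopsForChain` are additions made for illustration only.

```cpp
#include <cassert>
#include <vector>

// Naive union-find (no optimizations) that also counts how many parent
// links find() follows. The hop counter is an addition for illustration.
struct NaiveUnionFind {
    std::vector<int> parent;
    long long hops = 0;

    explicit NaiveUnionFind(int n) : parent(n) {
        for (int i = 0; i < n; i++) parent[i] = i;
    }

    int find(int i) {
        assert(i >= 0 && i < static_cast<int>(parent.size()));
        while (parent[i] != i) {
            i = parent[i];
            hops++;
        }
        return i;
    }

    // Same rule as the naive unite(): root of i's tree goes under root of j's.
    void unite(int i, int j) {
        int irep = find(i);
        int jrep = find(j);
        parent[irep] = jrep;
    }
};

// Builds the chain 0 -> 1 -> ... -> n-1 and returns the number of links
// a single find(0) has to follow afterwards.
long long hopsForChain(int n) {
    NaiveUnionFind uf(n);
    for (int i = 0; i + 1 < n; i++) uf.unite(i, i + 1);
    uf.hops = 0;  // ignore the work done while building the chain
    uf.find(0);
    return uf.hops;
}
```

Under these assumptions `hopsForChain(n)` returns n - 1, so the cost of one find() grows linearly with the number of elements.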

Optimization (Path Compression and Union by Rank/Size):

The main idea is to reduce the heights of the trees representing the different sets. The two most common methods are:
1) Path Compression
2) Union By Rank

Path Compression (Used to improve find()):

The idea is to flatten the tree when find() is called. When find() is called for an element x, it traverses up from x and returns the root of the tree. Path compression makes the found root the parent of x, so that the intermediate nodes do not have to be traversed again. If x is the root of a subtree, the paths to the root from all nodes under x are shortened as well.

Path compression speeds up the data structure by reducing the height of the trees. It is implemented as a small caching step inside the Find operation: the result of the recursive call is stored back into the parent array. The implementation that follows the Union by Rank discussion below includes it.
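
As a minimal sketch, path compression on a plain parent array can be written as below. The free function `findCompress` is a hypothetical name; this is one common way to write the idea, and the article's own class-based version differs slightly in form.

```cpp
#include <cassert>
#include <vector>

// Recursive find with path compression on a plain parent array.
// Every node visited on the way up is re-pointed directly at the root.
int findCompress(std::vector<int>& parent, int x) {
    assert(x >= 0 && x < static_cast<int>(parent.size()));
    if (parent[x] != x) {
        parent[x] = findCompress(parent, parent[x]);
    }
    return parent[x];
}
```

Starting from the chain `{1, 2, 3, 4, 4}` (0 -> 1 -> 2 -> 3 -> 4), a single call `findCompress(parent, 0)` returns 4 and leaves the array as `{4, 4, 4, 4, 4}`, so every later find() on these elements follows at most one link.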

Union by Rank (Modifications to union())

Rank is an upper bound on the height of the trees representing the different sets. We use an extra array of integers called rank[], of the same size as the parent array Parent[]. If i is the representative of a set, rank[i] is the rank of the tree rooted at i. Rank is the same as height if path compression is not used. With path compression, rank can be more than the actual height.
Recall that in the Union operation, it doesn’t matter for correctness which of the two trees is moved under the other. What we want is to minimize the height of the resulting tree. If we are uniting two trees (or sets), call them left and right, the choice depends on the rank of left and the rank of right:

  • If the rank of left is less than the rank of right, then it’s best to move left under right, because that won’t change the rank of right (while moving right under left would increase the height). In the same way, if the rank of right is less than the rank of left, then we should move right under left.
  • If the ranks are equal, it doesn’t matter which tree goes under the other, but the rank of the result will always be one greater than the rank of the two trees.
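
The two rules can be checked on a small example. The sketch below deliberately omits path compression, so that rank equals tree height and the effect of each merge is easy to inspect; the struct name `RankedSets` and the choice to return the new root from `unite` are assumptions made for this illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Union by rank without path compression, so rank equals tree height here.
struct RankedSets {
    std::vector<int> parent, rank;

    explicit RankedSets(int n) : parent(n), rank(n, 0) {
        for (int i = 0; i < n; i++) parent[i] = i;
    }

    int find(int i) const {
        assert(i >= 0 && i < static_cast<int>(parent.size()));
        while (parent[i] != i) i = parent[i];
        return i;
    }

    // Merges the sets containing x and y and returns the root of the result.
    int unite(int x, int y) {
        int left = find(x), right = find(y);
        if (left == right) return left;
        // Make `left` the root with the larger (or equal) rank.
        if (rank[left] < rank[right]) std::swap(left, right);
        parent[right] = left;
        // Only a merge of equal ranks makes the result one level taller.
        if (rank[left] == rank[right]) rank[left]++;
        return left;
    }
};
```

Merging two singletons gives rank 1, adding a third singleton to that tree leaves the rank at 1, and merging two rank-1 trees gives rank 2.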


C++
#include <iostream>
#include <vector>
using namespace std;

class DisjointUnionSets {
    vector<int> rank, parent;

public:
  
    // Constructor to initialize sets
    DisjointUnionSets(int n) {
        rank.resize(n, 0);
        parent.resize(n);

        // Initially, each element is in its own set
        for (int i = 0; i < n; i++) {
            parent[i] = i;
        }
    }

    // Find the representative of the set that i belongs to,
    // compressing the path from i to the root along the way
    int find(int i) {
        int root = parent[i];
      
        if (parent[root] != root) {
            return parent[i] = find(root);
        }
      
        return root;
    }

    // Union of sets containing x and y
    void unionSets(int x, int y) {
        int xRoot = find(x);
        int yRoot = find(y);

        // If they are in the same set, no need to union
        if (xRoot == yRoot) return;

        // Union by rank
        if (rank[xRoot] < rank[yRoot]) {
            parent[xRoot] = yRoot;
        } else if (rank[yRoot] < rank[xRoot]) {
            parent[yRoot] = xRoot;
        } else {
            parent[yRoot] = xRoot;
            rank[xRoot]++;
        }
    }
};

int main() {
    // Let there be 5 persons with ids 0, 1, 2, 3, and 4
    int n = 5;
    DisjointUnionSets dus(n);

    // 0 is a friend of 2
    dus.unionSets(0, 2);

    // 4 is a friend of 2
    dus.unionSets(4, 2);

    // 3 is a friend of 1
    dus.unionSets(3, 1);

    // Check if 4 is a friend of 0
    if (dus.find(4) == dus.find(0))
        cout << "Yes\n";
    else
        cout << "No\n";

    // Check if 1 is a friend of 0
    if (dus.find(1) == dus.find(0))
        cout << "Yes\n";
    else
        cout << "No\n";

    return 0;
}

Output
Yes
No

Time complexity: O(n) for creating n single-item sets. With both techniques combined (path compression together with union by rank or size), each operation runs in nearly constant time. The amortized time per operation is O(α(n)), where α(n) is the inverse Ackermann function, which grows extremely slowly (it does not exceed 4 for n < 10^600 approximately).

Space complexity: O(n) because we need to store n elements in the Disjoint Set Data Structure.

Union by Size (Alternate of Union by Rank)

We use an array of integers called size[], of the same size as the parent array Parent[]. If i is the representative of a set, size[i] is the number of elements in the tree representing the set.
When uniting two trees (or sets), call them left and right, the choice depends on the size of the left tree and the size of the right tree.

  • If the size of left is less than the size of right, then it’s best to move left under right and increase the size of right by the size of left. In the same way, if the size of right is less than the size of left, then we should move right under left and increase the size of left by the size of right.
  • If the sizes are equal, it doesn’t matter which tree goes under the other.
C++
// C++ program for Union by Size with Path Compression
#include <iostream>
#include <vector>
using namespace std;

class UnionFind {
    vector<int> Parent;
    vector<int> Size;
public:
    UnionFind(int n) {      
        Parent.resize(n);
        for (int i = 0; i < n; i++) {
            Parent[i] = i;
        }

        // Initialize Size array with 1s
        Size.resize(n, 1);
    }

    // Function to find the representative (or the root
    // node) for the set that includes i
    int find(int i) {
        int root = Parent[i];
      
        if (Parent[root] != root) {
            return Parent[i] = find(root);
        }
      
        return root;
    }

    // Unites the set that includes i and the set that
    // includes j by size
    void unionBySize(int i, int j) {
      
        // Find the representatives (or the root nodes) for
        // the set that includes i
        int irep = find(i);

        // And do the same for the set that includes j
        int jrep = find(j);

        // Elements are in the same set, no need to unite
        // anything.
        if (irep == jrep)
            return;

        // Get the size of i’s tree
        int isize = Size[irep];

        // Get the size of j’s tree
        int jsize = Size[jrep];

        // If i’s size is less than j’s size
        if (isize < jsize) {
          
            // Then move i under j
            Parent[irep] = jrep;

            // Increment j's size by i's size
            Size[jrep] += Size[irep];
        }
        // Otherwise j's size is less than or equal to i's size
        else {
            // Then move j under i
            Parent[jrep] = irep;

            // Increment i's size by j's size
            Size[irep] += Size[jrep];
        }
    }
};

int main() {
    int n = 5;
    UnionFind unionFind(n);
    unionFind.unionBySize(0, 1);
    unionFind.unionBySize(2, 3);
    unionFind.unionBySize(0, 4);
    for (int i = 0; i < n; i++) {
        cout << "Element " << i << ": Representative = " 
             << unionFind.find(i) << endl;
    }
    return 0;
}

Output
Element 0: Representative = 0
Element 1: Representative = 0
Element 2: Representative = 2
Element 3: Representative = 2
Element 4: Representative = 0
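
A side benefit of storing sizes is that the number of elements in any set can be read off at its root. The sketch below adds such a query to a union-by-size structure with path compression; the names `SizedUnionFind`, `sz` and `setSize` are assumptions made for this example and are not part of the program above.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct SizedUnionFind {
    std::vector<int> parent;
    std::vector<int> sz;  // sz[r] is meaningful only while r is a root

    explicit SizedUnionFind(int n) : parent(n), sz(n, 1) {
        for (int i = 0; i < n; i++) parent[i] = i;
    }

    // Find with path compression.
    int find(int i) {
        assert(i >= 0 && i < static_cast<int>(parent.size()));
        if (parent[i] != i) parent[i] = find(parent[i]);
        return parent[i];
    }

    // Union by size: the smaller tree goes under the larger one.
    void unite(int i, int j) {
        int a = find(i), b = find(j);
        if (a == b) return;
        if (sz[a] < sz[b]) std::swap(a, b);  // a is now the larger tree
        parent[b] = a;
        sz[a] += sz[b];
    }

    // Number of elements in the set that contains i.
    int setSize(int i) { return sz[find(i)]; }
};
```

After the same three unions as in the program above, (0, 1), (2, 3) and (0, 4), `setSize(4)` is 3 and `setSize(3)` is 2.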

