What Is Union-Find?

Union-Find is a data structure that tracks elements partitioned into disjoint sets (groups that don’t overlap). It supports two operations:

  1. Find(x): Which group does element x belong to? (returns the group’s representative/root)

  2. Union(x, y): Merge the groups containing x and y

Both operations are nearly O(1) with two optimizations: path compression and union by rank.

The Mental Model

Imagine people at a party forming groups:

  • Initially, everyone is in their own group (standing alone)

  • Union(Alice, Bob): Alice and Bob’s groups merge (they start chatting)

  • Find(Alice): "Who is the 'leader' of Alice’s group?" → returns the group’s representative

The key question Union-Find answers efficiently: "Are these two elements in the same group?"

Implementation

class UnionFind:
    def __init__(self, size: int):
        # parent[i] = i means i is its own root (initial state)
        self.parent = list(range(size))
        self.rank = [0] * size

    def find(self, x: int) -> int:
        """Find root with PATH COMPRESSION."""
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # Flatten the tree
        return self.parent[x]

    def union(self, x: int, y: int) -> bool:
        """Merge groups. Returns False if already same group."""
        root_x, root_y = self.find(x), self.find(y)
        if root_x == root_y:
            return False  # Already connected
        # UNION BY RANK: attach shorter tree under taller tree
        if self.rank[root_x] < self.rank[root_y]:
            self.parent[root_x] = root_y
        elif self.rank[root_x] > self.rank[root_y]:
            self.parent[root_y] = root_x
        else:
            self.parent[root_y] = root_x
            self.rank[root_x] += 1
        return True

    def connected(self, x: int, y: int) -> bool:
        """Check if x and y are in the same group. O(α(n)) ≈ O(1)."""
        return self.find(x) == self.find(y)

Why Path Compression + Union by Rank = Nearly O(1)

Path compression: After find(x), every node on the path from x to root becomes a direct child of the root. This flattens the tree, making future finds faster.

Union by rank: Always attaches the shorter tree under the taller tree, preventing the tree from becoming a linked list.

Together, both operations are O(α(n)) amortized, where α is the inverse Ackermann function. This function grows so incredibly slowly that for all practical purposes (n < 10^80, the number of atoms in the universe), α(n) < 5. So it’s effectively O(1).

Complexity

Operation Amortized Complexity

find(x)

O(α(n)) ≈ O(1)

union(x, y)

O(α(n)) ≈ O(1)

connected(x, y)

O(α(n)) ≈ O(1)

Space

O(n)

When to Use Union-Find

  1. Count connected components (Number of Islands — Ch 9)

  2. Detect cycles in undirected graphs (if find(x) == find(y) before union → cycle!)

  3. Kruskal’s MST algorithm (process edges in weight order, union endpoints)

  4. Dynamic connectivity ("are x and y connected after adding this edge?")

Union-Find vs BFS/DFS for connectivity: - BFS/DFS: Better for one-time traversal or when you need the path - Union-Find: Better for INCREMENTAL connectivity (adding edges over time) and counting components

Exercises

  • Redundant Connection — find the cycle-causing edge

  • Number of Islands was covered in Chapter 09 using Union-Find