Kruskal's algorithm

Kruskal's algorithm builds an MST by scanning edges in globally increasing weight. Each edge is accepted if its endpoints lie in different connected components of the forest built so far; otherwise it would close a cycle and is discarded. Components are tracked with a disjoint-set (union–find) structure—near–constant amortized time per operation with path compression and union by rank.

On this page
  • Greedy rule — cheapest edges first, skip cycles
  • Union–find — find, unite, when edges merge components
  • C — qsort, path compression, union by rank
  • Trace — same weighted K4 as MST overview

1) Idea

Sort all edges by non‑decreasing weight. Maintain a spanning forest (initially |V| isolated vertices). For each edge (u, v) in sorted order: if u and v are in different DSU sets, union them and add the edge to the MST; if they already share a root, the edge is heavier than necessary for connecting those vertices and would introduce a cycle—skip it. Stop after |V| − 1 edges for a connected graph.

2) Union–find

  • find(x) — representative (root) of x's set. Path compression flattens the tree while traversing so future finds are faster.
  • union(a, b) — merge sets containing a and b. Union by rank attaches the shorter tree under the taller root to keep depth small.

Together, m operations run in O(m · α(V)) amortized time where α is the inverse Ackermann function—effectively constant for practical V.

3) Example graph

Weighted K4 on vertices 0 … 3 (MST total weight 6):

  0–1 : 1     2–3 : 2     0–3 : 3     1–2 : 4     0–2 : 5     1–3 : 6

4) C — Kruskal

#include <stdio.h>
#include <stdlib.h>

typedef struct { int u, v, w; } Edge;

static int cmp_edge(const void *a, const void *b) {
    Edge *ea = (Edge*)a, *eb = (Edge*)b;
    return ea->w - eb->w;
}

enum { V = 4 };
static int dsu[V];
static int rnk[V];

static int dsu_find(int x) {
    if (dsu[x] != x)
        dsu[x] = dsu_find(dsu[x]);   /* path compression */
    return dsu[x];
}

static void dsu_union(int a, int b) {
    a = dsu_find(a); b = dsu_find(b);
    if (a == b) return;
    if (rnk[a] < rnk[b]) { int t = a; a = b; b = t; }
    dsu[b] = a;
    if (rnk[a] == rnk[b]) rnk[a]++;
}

long long kruskal_mst(Edge *edges, int E) {
    qsort(edges, E, sizeof(Edge), cmp_edge);
    for (int i = 0; i < V; i++) {
        dsu[i] = i;
        rnk[i] = 0;
    }
    long long total = 0;
    int taken = 0;
    for (int i = 0; i < E && taken < V - 1; i++) {
        int u = edges[i].u, v = edges[i].v;
        if (dsu_find(u) != dsu_find(v)) {
            dsu_union(u, v);
            total += edges[i].w;
            taken++;
        }
    }
    return total;
}

int main(void) {
    Edge edges[] = {
        {0,1,1}, {2,3,2}, {0,3,3},
        {1,2,4}, {0,2,5}, {1,3,6}
    };
    int E = (int)(sizeof edges / sizeof edges[0]);
    printf("MST weight %lld\n", kruskal_mst(edges, E));
    return 0;
}

5) Complexity

  • Sorting: O(E log E) (equivalently O(E log V) since E ≤ V2).
  • DSU: O(E · α(V)) amortized for all finds and unions.
  • Overall: dominated by sort for sparse graphs; competitive with Prim when edges are given explicitly as a list.

6) Trace — edges taken

Sorted scan until |V| − 1 edges accepted
Step Edge Decision Components (informal) Sum
10–1 (1)Take{0,1}, {2}, {3}1
22–3 (2)Take{0,1}, {2,3}3
30–3 (3)Take — bridges the two compssingle component6

7) Trace — remaining edges (skipped)

If you continue scanning after the MST is complete, every later edge has both endpoints in the same DSU set—the algorithm would form a cycle.

Would-be cycle edges (same graph, forest already spanning)
Order Edge Weight Reason
41–24find(1) == find(2) — cycle
50–25same component
61–36same component

Key takeaway

Kruskal is edge‑centric: sort once, then merge components with DSU. It shines when edges are readily available as a list. Prim grows one tree from a seed—often better for dense graphs with an adjacency matrix or when you prioritize a heap-friendly structure.

Related: Union-find (disjoint set) · Minimum spanning tree · Prim's algorithm · Cycle detection