
In the graph traversal article, we explored how BFS and DFS visit every reachable vertex in a graph. In the topological sort article, we used those traversals to order vertices in directed acyclic graphs. Both of those techniques operate on unweighted or uniformly weighted edges, where the goal is reachability or ordering. But many real-world problems attach a cost to each edge, and the question shifts from "can we reach every vertex?" to "what is the cheapest way to connect every vertex?"
Consider the problem of laying cable between a set of buildings. Every pair of buildings can be connected, but each potential connection has a different cost depending on distance, terrain, or existing infrastructure. We want every building to be reachable from every other building, but we want to minimize the total cost of cable used. We do not need redundant connections: a single path between any two buildings is sufficient.
This is the Minimum Spanning Tree problem. Given a weighted, undirected, connected graph, find the subset of edges that connects all vertices with the minimum possible total weight, forming a tree (connected and acyclic). The problem appears in network design, circuit wiring, clustering, and approximation algorithms for harder problems like the traveling salesman problem.
This article develops the theoretical foundation that makes MST algorithms correct (the cut property), then presents the two classical algorithms, Kruskal's and Prim's, with their implementations and the data structures that make them efficient.
A spanning tree of a connected undirected graph is a subgraph that includes all vertices and is both connected and acyclic. Because it is a tree, it has exactly V - 1 edges, where V is the number of vertices. A minimum spanning tree (MST) is a spanning tree whose total edge weight is as small as possible.
More formally, given a weight function w that assigns a real-valued weight w(e) to each edge e, the MST is a spanning tree T that minimizes the sum of w(e) over all edges e in T.
A few properties are worth noting immediately.
A connected graph with V vertices and E edges always has at least one spanning tree, and therefore at least one MST. The MST is not necessarily unique: when multiple edges share the same weight, different choices can lead to different spanning trees with the same total cost. However, when all edge weights are distinct, the MST is unique. This is because the cut property (discussed next) selects a unique lightest edge for each cut, leaving no room for alternative choices.
Every MST has exactly V - 1 edges. This follows directly from the definition of a tree: a connected acyclic graph on V vertices has exactly V - 1 edges. Adding any edge to an MST creates exactly one cycle, and removing the heaviest edge from that cycle yields a spanning tree of equal or lesser weight. This cycle property is the dual of the cut property and provides an alternative lens for understanding MST structure.
The cut property is the theoretical backbone of all greedy MST algorithms. It explains why locally greedy decisions lead to a globally optimal solution, something that is far from obvious and distinguishes MST from many other optimization problems where greedy approaches fail.
A cut of a graph is a partition of the vertices into two non-empty sets S and V \ S. An edge crosses the cut if one endpoint is in S and the other is in V \ S. The set of all edges crossing a given cut is called the cut-set.
The cut property states: for any cut of the graph, the lightest edge crossing the cut belongs to every MST (assuming edge weights are distinct; if weights can repeat, at least one lightest crossing edge belongs to some MST).
The proof is by contradiction. Suppose the lightest crossing edge e is not in some MST T. Adding e to T creates a cycle (since T is a spanning tree and adding any edge to a tree creates exactly one cycle). This cycle must contain at least one other edge e' that also crosses the cut, because the cycle enters and exits the set S. Since e is the lightest crossing edge, w(e) < w(e'). Removing e' and keeping e yields a new spanning tree with strictly smaller total weight, contradicting the assumption that T was a minimum spanning tree.
This property is powerful because it is local: it applies to any cut, not just a specific one. Both Kruskal's and Prim's algorithms work by repeatedly identifying cuts and selecting their lightest crossing edges, but they do so in different orders and use different data structures to manage the process.
Kruskal's algorithm builds the MST by processing edges in order of increasing weight. It starts with a forest of isolated vertices (each vertex is its own tree) and greedily adds the cheapest edge that does not create a cycle. After V - 1 edges have been added, the forest has merged into a single spanning tree.
The algorithm can be summarized in three steps:

1. Sort all edges in increasing order of weight.
2. For each edge (u, v) in sorted order, check whether u and v already belong to the same component (`find(u) === find(v)`). If they do, skip the edge (it would create a cycle). If they do not, add the edge to the MST and call `union(u, v)` to merge the two components.
3. Terminate when the MST contains V - 1 edges (or when all edges have been processed).
Kruskal's correctness follows directly from the cut property. When the algorithm considers edge (u, v) and u and v are in different components, there exists a cut separating those two components. Among all edges crossing this cut, (u, v) is the lightest, because edges are processed in sorted order and any lighter crossing edge would have already been processed (and either added or skipped because its endpoints were already connected). The cut property guarantees that this edge belongs to some MST.
The implementation relies on the Union-Find data structure for efficient cycle detection.
As discussed in the Union-Find article, the combination of path compression and union by rank makes each find and union operation run in amortized O(α(n)) time, where α is the inverse Ackermann function, effectively constant for all practical input sizes.
```typescript
class UnionFind {
  parent: number[];
  rank: number[];

  constructor(n: number) {
    this.parent = Array(n);
    this.rank = Array(n).fill(0);
    for (let i = 0; i < n; i++) {
      this.parent[i] = i;
    }
  }

  find(x: number): number {
    if (this.parent[x] !== x) {
      this.parent[x] = this.find(this.parent[x]);
    }
    return this.parent[x];
  }

  union(x: number, y: number): boolean {
    const rootX = this.find(x);
    const rootY = this.find(y);
    if (rootX === rootY) {
      return false;
    }
    if (this.rank[rootX] < this.rank[rootY]) {
      this.parent[rootX] = rootY;
    } else if (this.rank[rootX] > this.rank[rootY]) {
      this.parent[rootY] = rootX;
    } else {
      this.parent[rootY] = rootX;
      this.rank[rootX]++;
    }
    return true;
  }
}
```
```typescript
type Edge = [number, number, number]; // [u, v, weight]

function kruskal(n: number, edges: Edge[]): Edge[] {
  edges.sort((a, b) => a[2] - b[2]);
  const uf = new UnionFind(n);
  const mst: Edge[] = [];
  for (const edge of edges) {
    const [u, v] = edge;
    if (uf.union(u, v)) {
      mst.push(edge);
      if (mst.length === n - 1) {
        break;
      }
    }
  }
  return mst;
}
```
The structure is clean: sort once, then scan the edge list linearly, using Union-Find to test and merge components.
The early termination when mst.length === n - 1 is an optimization that avoids scanning remaining edges once the tree is complete.
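A quick sanity check helps confirm the mechanics. The sketch below inlines a minimal iterative union-find (with path halving rather than the recursive path compression of the class above) so that it runs standalone; the four-vertex graph and its weights are invented for illustration:

```typescript
type Edge = [number, number, number]; // [u, v, weight]

// Self-contained Kruskal sketch with a stripped-down union-find.
function kruskalSketch(n: number, edges: Edge[]): Edge[] {
  const parent = Array.from({ length: n }, (_, i) => i);
  const find = (x: number): number => {
    while (parent[x] !== x) {
      parent[x] = parent[parent[x]]; // path halving
      x = parent[x];
    }
    return x;
  };
  const mst: Edge[] = [];
  // Sort a copy so the caller's edge list is untouched.
  for (const edge of [...edges].sort((a, b) => a[2] - b[2])) {
    const ru = find(edge[0]);
    const rv = find(edge[1]);
    if (ru !== rv) {
      parent[ru] = rv;
      mst.push(edge);
      if (mst.length === n - 1) break;
    }
  }
  return mst;
}

// 4 buildings, 5 candidate cables.
const mst = kruskalSketch(4, [
  [0, 1, 4], [0, 2, 1], [1, 2, 2], [1, 3, 5], [2, 3, 8],
]);
const total = mst.reduce((sum, [, , w]) => sum + w, 0);
console.log(mst, total); // [[0,2,1], [1,2,2], [1,3,5]], total 8
```

The edge (0, 1) of weight 4 is considered third but skipped, because vertices 0 and 1 are already connected through vertex 2 by that point.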
Prim's algorithm builds the MST by growing a single tree from an arbitrary starting vertex. At each step, it adds the cheapest edge that connects a vertex already in the tree to a vertex not yet in the tree. Where Kruskal's algorithm processes edges globally (sorted by weight across the entire graph), Prim's algorithm operates locally, always expanding the frontier of the current partial tree.
The algorithm works as follows:

1. Pick an arbitrary start vertex, mark it as part of the tree, and push all of its edges onto a min-heap keyed by weight.
2. Repeatedly extract the cheapest edge from the heap. If its destination vertex is already in the tree, discard it; otherwise, add the edge to the MST, mark the destination vertex as in the tree, and push its edges to non-tree neighbors onto the heap.
3. Stop once the tree contains V - 1 edges.
Prim's correctness also follows from the cut property. At every step, the vertices already in the partial tree form one side of a cut, and the remaining vertices form the other side. The algorithm selects the lightest edge crossing this cut, which the cut property guarantees belongs to some MST. By induction, every edge added by Prim's belongs to an MST, and after V - 1 edges have been added the result is a complete MST.
The implementation uses the generic Heap class from the heap article as a min-heap
to efficiently retrieve the cheapest frontier edge.
The graph is represented as an adjacency list with weights.
```typescript
type Comparator<T> = (a: T, b: T) => number;

class Heap<T> {
  private readonly heap: T[] = [];
  private readonly compare: Comparator<T>;

  constructor(compare: Comparator<T>) {
    this.compare = compare;
  }

  size(): number {
    return this.heap.length;
  }

  insert(value: T): void {
    this.heap.push(value);
    this.bubbleUp();
  }

  extract(): T | null {
    if (this.heap.length === 0) { return null; }
    if (this.heap.length === 1) { return this.heap.pop()!; }
    const top = this.heap[0];
    this.heap[0] = this.heap.pop()!;
    this.bubbleDown();
    return top;
  }

  private bubbleUp(): void {
    let index = this.heap.length - 1;
    while (index > 0) {
      const parentIndex = Math.floor((index - 1) / 2);
      if (this.compare(this.heap[index], this.heap[parentIndex]) >= 0) { break; }
      [this.heap[index], this.heap[parentIndex]] = [this.heap[parentIndex], this.heap[index]];
      index = parentIndex;
    }
  }

  private bubbleDown(): void {
    let index = 0;
    while (true) {
      const left = 2 * index + 1;
      const right = 2 * index + 2;
      let smallest = index;
      if (left < this.heap.length && this.compare(this.heap[left], this.heap[smallest]) < 0) {
        smallest = left;
      }
      if (right < this.heap.length && this.compare(this.heap[right], this.heap[smallest]) < 0) {
        smallest = right;
      }
      if (smallest === index) { break; }
      [this.heap[index], this.heap[smallest]] = [this.heap[smallest], this.heap[index]];
      index = smallest;
    }
  }
}
```
```typescript
function prim(n: number, adjList: [number, number][][]): [number, number, number][] {
  const inMST = new Array<boolean>(n).fill(false);
  const mst: [number, number, number][] = [];
  const heap = new Heap<[number, number, number]>((a, b) => a[0] - b[0]);

  // Start from vertex 0
  inMST[0] = true;
  for (const [neighbor, weight] of adjList[0]) {
    heap.insert([weight, 0, neighbor]);
  }

  while (heap.size() > 0 && mst.length < n - 1) {
    const [weight, from, to] = heap.extract()!;
    if (inMST[to]) {
      continue;
    }
    inMST[to] = true;
    mst.push([from, to, weight]);
    for (const [neighbor, w] of adjList[to]) {
      if (!inMST[neighbor]) {
        heap.insert([w, to, neighbor]);
      }
    }
  }
  return mst;
}
```
The inMST boolean array tracks which vertices have been absorbed into the growing tree.
When a vertex is added, all its edges to non-tree vertices are pushed onto the heap.
Edges whose destination is already in the tree are simply discarded when popped.
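To see the lazy deletion in action on a concrete graph, here is a standalone sketch of the same algorithm in which a sorted array stands in for the heap (so the snippet runs on its own); the four-vertex graph and weights are invented for illustration:

```typescript
// Lazy Prim on an adjacency list. A sorted array substitutes for the
// min-heap: entries are kept in descending weight order so pop() removes
// the minimum. Fine for a sketch; use a real heap for large inputs.
function primSketch(n: number, adjList: [number, number][][]): [number, number, number][] {
  const inMST = new Array<boolean>(n).fill(false);
  const mst: [number, number, number][] = [];
  const frontier: [number, number, number][] = []; // [weight, from, to]
  const push = (w: number, from: number, to: number) => {
    frontier.push([w, from, to]);
    frontier.sort((a, b) => b[0] - a[0]); // descending; minimum at the end
  };

  inMST[0] = true;
  for (const [v, w] of adjList[0]) push(w, 0, v);

  while (frontier.length > 0 && mst.length < n - 1) {
    const [w, from, to] = frontier.pop()!;
    if (inMST[to]) continue; // lazy deletion of a stale entry
    inMST[to] = true;
    mst.push([from, to, w]);
    for (const [v, wt] of adjList[to]) {
      if (!inMST[v]) push(wt, to, v);
    }
  }
  return mst;
}

// Adjacency list for edges (0,1,4), (0,2,1), (1,2,2), (1,3,5), (2,3,8).
const adj: [number, number][][] = [
  [[1, 4], [2, 1]],
  [[0, 4], [2, 2], [3, 5]],
  [[0, 1], [1, 2], [3, 8]],
  [[1, 5], [2, 8]],
];
const tree = primSketch(4, adj);
console.log(tree.reduce((s, [, , w]) => s + w, 0)); // 8
```

On this graph, the entry [4, 0, 1] becomes stale once vertex 1 is reached more cheaply through vertex 2; it is popped and silently discarded.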
A more sophisticated variant uses a decrease-key operation to update the priority of vertices already in the heap, rather than inserting duplicate entries. This reduces the heap size from O(E) to O(V) and improves the theoretical complexity, but the simpler "lazy deletion" approach shown above is easier to implement and works well in practice for most problem sizes.
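The decrease-key idea can even be sketched without a heap: the classic O(V^2) formulation keeps a key array holding each vertex's cheapest known connection to the tree and finds the minimum by linear scan, so "decrease-key" is just an array write. This is a sketch under the article's assumptions (connected graph, vertices numbered 0 to n - 1), not a reference implementation:

```typescript
// Array-based Prim: key[v] is the cheapest known edge weight connecting v
// to the partial tree; parent[v] records the other endpoint of that edge
// (useful if the tree edges themselves are needed). Returns the total
// MST weight. O(V^2) time, which beats O(E log V) on dense graphs.
function primArray(n: number, adjList: [number, number][][]): number {
  const key = new Array<number>(n).fill(Infinity);
  const parent = new Array<number>(n).fill(-1);
  const inMST = new Array<boolean>(n).fill(false);
  key[0] = 0;
  let total = 0;

  for (let i = 0; i < n; i++) {
    // Extract-min by linear scan over the non-tree vertices.
    let u = -1;
    for (let v = 0; v < n; v++) {
      if (!inMST[v] && (u === -1 || key[v] < key[u])) u = v;
    }
    inMST[u] = true;
    total += key[u];

    // "Decrease-key": relax every edge leaving u.
    for (const [v, w] of adjList[u]) {
      if (!inMST[v] && w < key[v]) {
        key[v] = w;
        parent[v] = u;
      }
    }
  }
  return total;
}
```

Because the scan costs O(V) per extraction regardless of edge count, this variant is the method of choice for dense graphs where E approaches V^2.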
Both algorithms produce a correct MST, but their performance characteristics differ depending on the graph's structure.
Kruskal's algorithm sorts all edges upfront and processes them sequentially. Its running time is dominated by the sort: O(E log E). Since E ≤ V^2, we have log E ≤ 2 log V, so this is also O(E log V). The Union-Find operations contribute O(E α(V)), which is effectively linear. Kruskal's is the natural choice for sparse graphs where E is close to V, because the sorting step is efficient and the algorithm does not require an adjacency list, only an edge list.
Prim's algorithm, with a binary heap, runs in O(E log V) time: each of the E edges may be inserted into and extracted from the heap, and each heap operation takes O(log E) = O(log V). With a Fibonacci heap, the complexity improves to O(E + V log V), which is better than Kruskal's when E is much larger than V (dense graphs). Prim's is the natural choice for dense graphs where E is close to V^2, and it is also preferable when the graph is given as an adjacency list (the common representation for traversal problems).
When the graph is implicit and dense, as in problems where every pair of points can be connected (a complete graph), both algorithms face the challenge of Θ(V^2) edges. Kruskal's must sort all edges, resulting in O(V^2 log V) time. Prim's with a binary heap achieves the same bound but avoids materializing and sorting the full edge list upfront, since it processes edges lazily as the tree grows. For very large complete graphs, Prim's with a Fibonacci heap achieves O(V^2 + V log V) = O(V^2), which is optimal since simply reading all edge weights takes Θ(V^2) time.
In practice, for competitive programming and interview problems, Kruskal's with Union-Find is the more commonly implemented choice because of its simplicity: sort the edges, iterate through them, and use Union-Find for cycle detection. The implementation is short, clean, and easy to adapt to variations (e.g., finding the MST cost without storing the tree, or building the MST from a set of points with Manhattan or Euclidean distances).
| Algorithm | Time Complexity | Space Complexity | Best suited for |
|---|---|---|---|
| Kruskal's (with Union-Find) | O(E log E) | O(V + E) | Sparse graphs, edge-list input |
| Prim's (with binary heap) | O(E log V) | O(V + E) | Dense graphs, adjacency-list input |
| Prim's (with Fibonacci heap) | O(E + V log V) | O(V + E) | Very dense graphs, theoretical optimality |
For Kruskal's, the O(E log E) time comes from sorting the edge list. The Union-Find operations over all edges contribute O(E α(V)), which is effectively O(E). The space is O(V) for the Union-Find structure plus O(E) for storing the edge list.
For Prim's with a binary heap, the O(E log V) time comes from inserting up to E edges into the heap (each insertion and extraction is O(log V) in the decrease-key variant, or O(log E) in the lazy variant, which is asymptotically equivalent). The space is O(V) for the inMST array and the MST itself, plus O(E) for the heap in the worst case.
For Prim's with a Fibonacci heap, the improved O(E + V log V) bound comes from the O(1) amortized decrease-key operation. Each vertex is extracted from the heap at most once (O(V log V) total for extractions), and each edge triggers at most one decrease-key (O(E) total). The Fibonacci heap is rarely used in practice due to its implementation complexity and large constant factors, but it establishes the best known asymptotic bound among classical comparison-based MST algorithms on dense graphs.
| Exercise | Difficulty | Description |
|---|---|---|
| Min Cost to Connect All Points | Medium | Find the minimum cost to connect all points on a 2D plane using Manhattan distance. |
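The Min Cost to Connect All Points exercise reduces directly to Kruskal's on a complete graph of Manhattan distances. One possible sketch (the function name and compact inline union-find are ours, and only the total cost is returned, not the tree itself):

```typescript
// Build all pairwise Manhattan-distance edges, then run Kruskal's.
// For V points this is Θ(V^2) edges and O(V^2 log V) time overall.
function minCostConnectPoints(points: [number, number][]): number {
  const n = points.length;
  const edges: [number, number, number][] = []; // [i, j, distance]
  for (let i = 0; i < n; i++) {
    for (let j = i + 1; j < n; j++) {
      const d = Math.abs(points[i][0] - points[j][0]) +
                Math.abs(points[i][1] - points[j][1]);
      edges.push([i, j, d]);
    }
  }
  edges.sort((a, b) => a[2] - b[2]);

  // Compact union-find with recursive path compression.
  const parent = Array.from({ length: n }, (_, i) => i);
  const find = (x: number): number =>
    parent[x] === x ? x : (parent[x] = find(parent[x]));

  let cost = 0;
  let used = 0;
  for (const [u, v, w] of edges) {
    const ru = find(u);
    const rv = find(v);
    if (ru !== rv) {
      parent[ru] = rv;
      cost += w;
      if (++used === n - 1) break;
    }
  }
  return cost;
}

console.log(minCostConnectPoints([[0, 0], [2, 2], [3, 10], [5, 2], [7, 0]])); // 20
```

The only MST-specific work here is the first loop that materializes the implicit complete graph; everything after it is the standard sort-and-union scan.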