Topological Sort Explained from First Principles
Topological sort explained through two algorithms derived from one invariant, with traced walkthroughs and cycle detection.
- What topological sort solves and when it applies
- How Kahn's algorithm builds the ordering using in degrees
- How DFS post order produces the same result differently
- Why cycle detection falls out of both approaches naturally
If you've searched "topological sort explained" and walked away more confused than before, that's fair. The name sounds abstract, but the problem it solves is concrete and shows up everywhere in software engineering.
Topological sort explained: What it is and why it exists
Topological sort orders the nodes of a directed acyclic graph (DAG) so that for every directed edge from node A to node B, A appears before B in the ordering. It solves one problem. Given a set of tasks with dependencies, it finds an execution order that respects every dependency.
That's the whole idea. If task B depends on task A, then A must come first. A topological sort finds an ordering where that holds for every dependency in the graph simultaneously. The word "topological" is a historical artifact from graph theory that makes the concept sound harder than it is.
In practice, it's just dependency ordering. Build tools use it to decide which modules to compile first. Package managers use it to resolve installation sequences. Course prerequisites follow the same logic.
But there's a catch.
Topological sort only works on directed acyclic graphs. If the graph contains a cycle (A depends on B, B depends on C, C depends on A), no valid ordering exists. You can't place all three before each other simultaneously.
Two algorithms solve this. They look different on the surface, but they're both enforcing the same dependency invariant. Never process a node until every dependency is already processed. The difference is how they enforce it.
Kahn's algorithm: BFS on in degrees
Kahn's algorithm is the more direct of the two. It asks one question. Which nodes have zero dependencies right now? You process those first, update the dependency counts, and repeat.
Suppose you have five modules to compile, with these dependencies:
- Module A has no dependencies
- Module B depends on A
- Module C depends on A
- Module D depends on B and C
- Module E depends on D
The in degree of a node is the number of edges pointing into it, which is the number of dependencies it has. Module A has in degree 0. Module D has in degree 2 because both B and C must finish before D can start.
The algorithm works in three steps. First, compute the in degree of every node. Second, add all zero in degree nodes to a queue. Third, process nodes from the queue one at a time. For each processed node, decrement the in degree of its neighbors, and add any neighbor that reaches zero to the queue.
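A minimal sketch of those three steps in Python, assuming the graph is given as an adjacency dict mapping each node to the nodes that depend on it (function and variable names are illustrative):

```python
from collections import deque

def kahn_topological_sort(graph):
    """Return a topological ordering of a DAG given as {node: [neighbors]}."""
    # Step 1: compute the in degree of every node.
    in_degree = {node: 0 for node in graph}
    for node in graph:
        for neighbor in graph[node]:
            in_degree[neighbor] += 1

    # Step 2: seed the queue with every zero in degree node.
    queue = deque(node for node in graph if in_degree[node] == 0)

    # Step 3: process nodes, decrementing their neighbors' in degrees.
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

    # Fewer nodes in the output than in the graph means a cycle.
    if len(order) < len(graph):
        raise ValueError("graph contains a cycle")
    return order

# The module example: B and C depend on A, D on B and C, E on D.
modules = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
print(kahn_topological_sort(modules))  # → ['A', 'B', 'C', 'D', 'E']
```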
Walk through this with the module example. Initially, only A has in degree 0, so it enters the queue first. After processing A, both B and C drop to in degree 0 and enter the queue. Next, B gets processed and D's in degree drops from 2 to 1. Then C gets processed, bringing D's in degree to 0. Finally D gets processed, E drops to 0, and the algorithm finishes. The final order is A, B, C, D, E. (Or A, C, B, D, E, since B and C are interchangeable.)
The invariant holds at every step. A node only enters the queue when its in degree hits zero, meaning all its dependencies have been processed.
“The ordering isn't something you compute all at once. It emerges one node at a time, as dependencies resolve.”
DFS based topological sort: Post order reversal
The DFS approach works backwards. Instead of asking "what can I process now?", it asks "what do I need last?" and builds the order in reverse.
The key observation is that in a depth first traversal, when you finish processing a node (all its descendants have been fully visited), everything reachable from that node is already handled. If you collect nodes in this post order and then reverse the list, you get a valid topological ordering.
Why does the reversal work? In post order, a node appears after all nodes it can reach. But topological sort needs a node to appear before all nodes it can reach. Reversing flips that relationship exactly.
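A minimal Python sketch of the DFS version, including the in_stack bookkeeping used for cycle detection (names are illustrative):

```python
def dfs_topological_sort(graph):
    """Topological sort via DFS post order; graph is {node: [neighbors]}."""
    visited = {node: False for node in graph}
    in_stack = {node: False for node in graph}  # nodes on the current DFS path
    post_order = []

    def dfs(node):
        visited[node] = True
        in_stack[node] = True
        for neighbor in graph[node]:
            if in_stack[neighbor]:
                # Back edge: the neighbor's recursive call hasn't returned yet.
                raise ValueError("graph contains a cycle")
            if not visited[neighbor]:
                dfs(neighbor)
        in_stack[node] = False
        post_order.append(node)  # a node finishes after all its descendants

    for node in graph:
        if not visited[node]:
            dfs(node)

    post_order.reverse()  # reversed post order is a valid topological order
    return post_order

modules = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
print(dfs_topological_sort(modules))  # → ['A', 'C', 'B', 'D', 'E']
```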
Trace this on the same module graph. Starting DFS from A, we recurse: A goes to B, B goes to D, D goes to E. E has no unvisited neighbors, so E enters post order first. The call returns to D, which finishes and enters post order. The call returns to B, which also finishes. Then A explores C, which finishes. Finally A finishes. The post order list is [E, D, B, C, A]. Reversed, that gives [A, C, B, D, E]. Valid topological order.
Both algorithms produce correct results. They won't always produce the same ordering, because multiple valid orderings exist for any DAG. Any ordering that satisfies all dependency constraints is correct.
Why topological sort fails on cyclic graphs
Topological sort and cycle detection are two sides of the same coin. If a valid ordering exists, the graph is a DAG. If no valid ordering exists, the graph contains a cycle. Neither algorithm needs extra machinery to detect this.
In Kahn's algorithm, a cycle means some nodes will never reach in degree 0. They're stuck waiting on each other indefinitely. After the algorithm finishes, if the output list has fewer nodes than the graph, the missing nodes are caught in a cycle.
In the DFS version, a cycle shows up when you encounter a node that's already on the current DFS path (tracked by the in_stack array). If you're exploring from A and reach a node whose recursive call hasn't returned yet, you've found a back edge, and back edges in directed graphs mean cycles.
A single topological sort handles both compilation ordering and circular dependency detection.
One thing that trips people up in interviews: undirected cycle detection and directed cycle detection aren't the same problem. For undirected graphs, it's enough to check whether you reach an already visited node that isn't the parent. For directed graphs, you need the in_stack versus visited distinction, because a node can be reachable along multiple paths without forming a cycle. Only a node on the current recursive path creates a true cycle, which is why the directed case needs the extra tracking.
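The distinction fits in a few lines. In the sketch below, the diamond graph reaches one node along two paths without forming a cycle, which is exactly the case the visited set alone would misreport (graphs and names are illustrative):

```python
def has_cycle(graph):
    """Detect a cycle in a directed graph via the in_stack / visited split."""
    visited = set()
    in_stack = set()

    def dfs(node):
        visited.add(node)
        in_stack.add(node)
        for neighbor in graph[node]:
            if neighbor in in_stack:  # back edge: neighbor is on the current path
                return True
            if neighbor not in visited and dfs(neighbor):
                return True
            # neighbor visited but not in_stack: reached via another path, no cycle
        in_stack.discard(node)
        return False

    return any(node not in visited and dfs(node) for node in graph)

# The "diamond" reaches D along two paths without a cycle; the second graph
# closes a loop A -> B -> C -> A.
diamond = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
cyclic = {"A": ["B"], "B": ["C"], "C": ["A"]}
print(has_cycle(diamond))  # → False
print(has_cycle(cyclic))   # → True
```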
Topological sort explained through build systems
Consider what happens every time you run a build. Suppose your project has these source files:
- main.py imports server.py and config.py
- server.py imports database.py and auth.py
- auth.py imports config.py
- database.py imports config.py
The dependency graph is a DAG. A build system needs to process config.py first (zero dependencies), then database.py and auth.py (which only depend on config), then server.py, and finally main.py. That's topological sort running behind the scenes.
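Python's standard library has shipped exactly this since 3.9 in the graphlib module. A sketch applying it to the file graph above; note that TopologicalSorter takes each node's predecessors, i.e. its dependencies:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Predecessor form: each file maps to the files it imports.
imports = {
    "main.py": {"server.py", "config.py"},
    "server.py": {"database.py", "auth.py"},
    "auth.py": {"config.py"},
    "database.py": {"config.py"},
}

# static_order() yields a valid topological ordering of the whole graph
# and raises graphlib.CycleError if the imports are circular.
order = list(TopologicalSorter(imports).static_order())
print(order)  # config.py comes first, main.py last
```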
Package managers like pip and npm solve the same problem when resolving installation order. Task schedulers use it for job ordering in data pipelines.
In interviews, the harder part isn't implementing the algorithm. It's recognising when it applies. The problem statement won't say "apply topological sort." It'll describe tasks, prerequisites, and ordering constraints. Your job is to see the DAG underneath the problem description and pick the right approach.
“The interview problem won't mention topological sort by name. It'll describe tasks, dependencies, and constraints. Recognising the DAG underneath is the actual skill.”
What topological sort connects to
Topological sort connects to several graph concepts that build on it. Cycle detection in directed graphs, covered above, is the immediate extension. Shortest path algorithms on DAGs use topological ordering to process nodes in dependency order, which eliminates the need for a priority queue entirely. Strongly connected components extend the idea to graphs with cycles by decomposing them into acyclic meta-graphs that can then be topologically sorted.
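As a sketch of that connection: relaxing edges in a precomputed topological order gives single-source shortest paths on a weighted DAG with no priority queue, because every node's distance is final before its outgoing edges are relaxed (the graph and weights below are illustrative):

```python
def dag_shortest_paths(graph, order, source):
    """Single-source shortest paths on a weighted DAG.

    graph: {node: [(neighbor, weight), ...]}
    order: any valid topological ordering of graph
    """
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    for node in order:
        if dist[node] == float("inf"):
            continue  # not reachable from source
        for neighbor, weight in graph[node]:
            # Relax the edge; dist[node] is already final at this point.
            if dist[node] + weight < dist[neighbor]:
                dist[neighbor] = dist[node] + weight
    return dist

weighted = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)],
            "C": [("D", 1)], "D": []}
print(dag_shortest_paths(weighted, ["A", "B", "C", "D"], "A"))
# → {'A': 0, 'B': 1, 'C': 2, 'D': 3}
```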
DFS based topological sort also reinforces a broader graph skill, showing how recursive traversal and post order processing interact. If you've worked through BFS vs DFS and understand when each traversal applies, topological sort is a natural next step.
For where topological sort fits within the full graph algorithm family, from traversals through shortest paths and max flow, see our complete guide to graph algorithms.
Codeintuition's Graph course covers topological sort across four dedicated lessons. The understanding lesson builds both algorithms from the dependency invariant before you write any code, with 289 illustrations tracing every variable through each step so the mechanics become visual before they become code. The structured learning path covers 75+ patterns the same way. Understand the invariant, learn the identification triggers, then apply with increasing difficulty. Topological sort is one pattern in a sequence that includes cycle detection, connected components, shortest paths, and five graph specific pattern families.
Six months from now, you'll hit a problem that describes task dependencies with constraints and asks for a valid execution order. If you've traced the in degree queue and the post order reversal until both feel like natural consequences of the same invariant, you won't need to recall which algorithm to use. You'll see the structure and build from it.
See topological sort traced step by step
The Graph course builds both algorithms from the dependency invariant with 289 illustrations tracing every variable. Start with the free courses to see derivation first teaching in action.
Both algorithms run in O(V + E) time, where V is the number of vertices and E is the number of edges: each node and each edge gets processed exactly once. Space complexity is also O(V + E) for the adjacency list, the in degree or visited arrays, and the output list. That makes topological sort efficient even for dependency graphs with thousands of nodes.