Topological Sort Explained from First Principles
Topological sort explained through two algorithms derived from one invariant, with traced walkthroughs and cycle detection.
- What topological sort solves and when it applies
- How Kahn's algorithm builds the ordering using in degrees
- How DFS post order produces the same result differently
- Why cycle detection falls out of both approaches naturally
If you've searched "topological sort explained" and walked away more confused than before, that's fair. The name sounds abstract, but the problem it solves is concrete and shows up everywhere in software engineering.
Topological sort explained: What it is and why it exists
Topological sort orders the nodes of a directed acyclic graph (DAG) so that for every directed edge from node A to node B, A appears before B in the ordering. It solves one problem. Given a set of tasks with dependencies, it finds an execution order that respects every dependency.
That's the whole idea. If task B depends on task A, then A must come first. A topological sort finds an ordering where that holds for every dependency in the graph simultaneously. The word "topological" is a historical artifact from graph theory that makes the concept sound harder than it is.
In practice, it's just dependency ordering. Build tools use it to decide which modules to compile first. Package managers use it to resolve installation sequences. Course prerequisites follow the same logic.
But there's a catch.
Topological sort only works on directed acyclic graphs. If the graph contains a cycle (A depends on B, B depends on C, C depends on A), no valid ordering exists. You can't place all three before each other simultaneously.
Two algorithms solve this. They look different on the surface, but they're both enforcing the same dependency invariant. Never process a node until every dependency is already processed. The difference is how they enforce it.
Kahn's algorithm: BFS on in degrees
Kahn's algorithm is the more direct of the two. It asks one question. Which nodes have zero dependencies right now? You process those first, update the dependency counts, and repeat.
Suppose you have five modules to compile, with these dependencies:
- Module A has no dependencies
- Module B depends on A
- Module C depends on A
- Module D depends on B and C
- Module E depends on D
The in degree of a node is the number of edges pointing into it, which is the number of dependencies it has. Module A has in degree 0. Module D has in degree 2 because both B and C must finish before D can start.
The algorithm works in three steps. First, compute the in degree of every node. Second, add all zero in degree nodes to a queue. Third, process nodes from the queue one at a time. For each processed node, decrement the in degree of its neighbors, and add any neighbor that reaches zero to the queue.
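A minimal sketch of those three steps in Python, assuming the graph is given as an adjacency dict mapping each node to the nodes that depend on it (function and variable names are illustrative):

```python
from collections import deque

def kahn_topological_sort(graph):
    """Return a topological ordering of a DAG given as {node: [neighbors]}."""
    # Step 1: compute the in degree of every node.
    in_degree = {node: 0 for node in graph}
    for node in graph:
        for neighbor in graph[node]:
            in_degree[neighbor] += 1

    # Step 2: seed the queue with every zero in degree node.
    queue = deque(node for node in graph if in_degree[node] == 0)

    # Step 3: process nodes, decrementing their neighbors' in degrees.
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

    # Fewer nodes in the output than in the graph means a cycle.
    if len(order) < len(graph):
        raise ValueError("graph contains a cycle")
    return order

# The module example: B and C depend on A, D on B and C, E on D.
modules = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
print(kahn_topological_sort(modules))  # → ['A', 'B', 'C', 'D', 'E']
```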
Walk through this with the module example. Initially, only A has in degree 0, so it enters the queue first. After processing A, both B and C drop to in degree 0 and enter the queue. Next, B gets processed and D's in degree drops from 2 to 1. Then C gets processed, bringing D's in degree to 0. Finally D gets processed, E drops to 0, and the algorithm finishes. The final order is A, B, C, D, E. (Or A, C, B, D, E, since B and C are interchangeable.)
The invariant holds at every step. A node only enters the queue when its in degree hits zero, meaning all its dependencies have been processed.
“The ordering isn't something you compute all at once. It emerges one node at a time, as dependencies resolve.”
DFS based topological sort: Post order reversal
The DFS approach works backwards. Instead of asking "what can I process now?", it asks "what do I need last?" and builds the order in reverse.
The key observation is that in a depth first traversal, when you finish processing a node (all its descendants have been fully visited), everything reachable from that node is already handled. If you collect nodes in this post order and then reverse the list, you get a valid topological ordering.
Why does the reversal work? In post order, a node appears after all nodes it can reach. But topological sort needs a node to appear before all nodes it can reach. Reversing flips that relationship exactly.
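A minimal Python sketch of the DFS version, including the in_stack bookkeeping used for cycle detection (names are illustrative):

```python
def dfs_topological_sort(graph):
    """Topological sort via DFS post order; graph is {node: [neighbors]}."""
    visited = {node: False for node in graph}
    in_stack = {node: False for node in graph}  # nodes on the current DFS path
    post_order = []

    def dfs(node):
        visited[node] = True
        in_stack[node] = True
        for neighbor in graph[node]:
            if in_stack[neighbor]:
                # Back edge: the neighbor's recursive call hasn't returned yet.
                raise ValueError("graph contains a cycle")
            if not visited[neighbor]:
                dfs(neighbor)
        in_stack[node] = False
        post_order.append(node)  # a node finishes after all its descendants

    for node in graph:
        if not visited[node]:
            dfs(node)

    post_order.reverse()  # reversed post order is a valid topological order
    return post_order

modules = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
print(dfs_topological_sort(modules))  # → ['A', 'C', 'B', 'D', 'E']
```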
Trace this on the same module graph. Starting DFS from A, we recurse: A goes to B, B goes to D, D goes to E. E has no unvisited neighbors, so E enters post order first. The call returns to D, which finishes and enters post order. The call returns to B, which also finishes. Then A explores C, which finishes. Finally A finishes. The post order list is [E, D, B, C, A]. Reversed, that gives [A, C, B, D, E]. Valid topological order.
Both algorithms produce correct results. They won't always produce the same ordering, because multiple valid orderings exist for any DAG. Any ordering that satisfies all dependency constraints is correct.
Why topological sort fails on cyclic graphs
Topological sort and cycle detection are two sides of the same coin. If a valid ordering exists, the graph is a DAG. If no valid ordering exists, the graph contains a cycle. Neither algorithm needs extra machinery to detect this.
In Kahn's algorithm, a cycle means some nodes will never reach in degree 0. They're stuck waiting on each other indefinitely. After the algorithm finishes, if the output list has fewer nodes than the graph, the missing nodes are caught in a cycle.
In the DFS version, a cycle shows up when you encounter a node that's already on the current DFS path (tracked by the in_stack array). If you're exploring from A and reach a node whose recursive call hasn't returned yet, you've found a back edge, and back edges in directed graphs mean cycles.
A single topological sort handles both compilation ordering and circular dependency detection.
One thing that trips people up in interviews: undirected cycle detection and directed cycle detection aren't the same problem. For undirected graphs, it's enough to check whether you reach an already visited node that isn't the parent. For directed graphs, you need the in_stack versus visited distinction, because a node can be reachable along multiple paths without forming a cycle. Only a node on the current recursive path creates a true cycle, which is why the directed case needs the extra tracking.
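The distinction fits in a few lines. In the sketch below, the diamond graph reaches one node along two paths without forming a cycle, which is exactly the case the visited set alone would misreport (graphs and names are illustrative):

```python
def has_cycle(graph):
    """Detect a cycle in a directed graph via the in_stack / visited split."""
    visited = set()
    in_stack = set()

    def dfs(node):
        visited.add(node)
        in_stack.add(node)
        for neighbor in graph[node]:
            if neighbor in in_stack:  # back edge: neighbor is on the current path
                return True
            if neighbor not in visited and dfs(neighbor):
                return True
            # neighbor visited but not in_stack: reached via another path, no cycle
        in_stack.discard(node)
        return False

    return any(node not in visited and dfs(node) for node in graph)

# The "diamond" reaches D along two paths without a cycle; the second graph
# closes a loop A -> B -> C -> A.
diamond = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
cyclic = {"A": ["B"], "B": ["C"], "C": ["A"]}
print(has_cycle(diamond))  # → False
print(has_cycle(cyclic))   # → True
```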
Topological sort explained through build systems
Consider what happens every time you run a build. Suppose your project has these source files:
- main.py imports server.py and config.py
- server.py imports database.py and auth.py
- auth.py imports config.py
- database.py imports config.py
The dependency graph is a DAG. A build system needs to process config.py first (zero dependencies), then database.py and auth.py (which only depend on config), then server.py, and finally main.py. That's topological sort running behind the scenes.
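Python's standard library has shipped exactly this since 3.9 in the graphlib module. A sketch applying it to the file graph above; note that TopologicalSorter takes each node's predecessors, i.e. its dependencies:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Predecessor form: each file maps to the files it imports.
imports = {
    "main.py": {"server.py", "config.py"},
    "server.py": {"database.py", "auth.py"},
    "auth.py": {"config.py"},
    "database.py": {"config.py"},
}

# static_order() yields a valid topological ordering of the whole graph
# and raises graphlib.CycleError if the imports are circular.
order = list(TopologicalSorter(imports).static_order())
print(order)  # config.py comes first, main.py last
```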
Package managers like pip and npm solve the same problem when resolving installation order. Task schedulers use it for job ordering in data pipelines.
In interviews, the harder part isn't implementing the algorithm. It's recognising when it applies. The problem statement won't say "apply topological sort." It'll describe tasks, prerequisites, and ordering constraints. Your job is to see the DAG underneath the problem description and pick the right approach.
“The interview problem won't mention topological sort by name. It'll describe tasks, dependencies, and constraints. Recognising the DAG underneath is the actual skill.”
What topological sort connects to
Topological sort connects to several graph concepts that build on it. Cycle detection in directed graphs, covered above, is the immediate extension. Shortest path algorithms on DAGs use topological ordering to process nodes in dependency order, which eliminates the need for a priority queue entirely. Strongly connected components extend the idea to graphs with cycles by decomposing them into acyclic meta-graphs that can then be topologically sorted.
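As a sketch of that connection: relaxing edges in a precomputed topological order gives single-source shortest paths on a weighted DAG with no priority queue, because every node's distance is final before its outgoing edges are relaxed (the graph and weights below are illustrative):

```python
def dag_shortest_paths(graph, order, source):
    """Single-source shortest paths on a weighted DAG.

    graph: {node: [(neighbor, weight), ...]}
    order: any valid topological ordering of graph
    """
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    for node in order:
        if dist[node] == float("inf"):
            continue  # not reachable from source
        for neighbor, weight in graph[node]:
            # Relax the edge; dist[node] is already final at this point.
            if dist[node] + weight < dist[neighbor]:
                dist[neighbor] = dist[node] + weight
    return dist

weighted = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)],
            "C": [("D", 1)], "D": []}
print(dag_shortest_paths(weighted, ["A", "B", "C", "D"], "A"))
# → {'A': 0, 'B': 1, 'C': 2, 'D': 3}
```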
DFS based topological sort also reinforces a broader graph skill, showing how recursive traversal and post order processing interact. If you've worked through BFS vs DFS and understand when each traversal applies, topological sort is a natural next step.
For where topological sort fits within the full graph algorithm family, from traversals through shortest paths and max flow, see our complete guide to graph algorithms.
Codeintuition's Graph course covers topological sort across four dedicated lessons. The understanding lesson builds both algorithms from the dependency invariant before you write any code, with 289 illustrations tracing every variable through each step so the mechanics become visual before they become code. The structured learning path covers 75+ patterns the same way. Understand the invariant, learn the identification triggers, then apply with increasing difficulty. Topological sort is one pattern in a sequence that includes cycle detection, connected components, shortest paths, and five graph specific pattern families.
Six months from now, you'll hit a problem that describes task dependencies with constraints and asks for a valid execution order. If you've traced the in degree queue and the post order reversal until both feel like natural consequences of the same invariant, you won't need to recall which algorithm to use. You'll see the structure and build from it.
See topological sort traced step by step
The Graph course builds both algorithms from the dependency invariant with 289 illustrations tracing every variable. Start with the free courses to see derivation first teaching in action.
Both algorithms run in O(V + E) time, where V is the number of vertices and E is the number of edges: each node and each edge gets processed exactly once. Space complexity is also O(V + E) for the adjacency list, the in degree or visited arrays, and the output list. That makes topological sort efficient even for dependency graphs with thousands of nodes.