Strongly Connected Components Explained
Learn how Kosaraju's and Tarjan's algorithms find strongly connected components, when to use each, and how SCCs appear in coding interviews.
- What strongly connected components are, and why undirected-style connectivity breaks on directed graphs
- How Kosaraju's two pass algorithm finds every SCC in a directed graph
- How Tarjan's single pass algorithm uses low link values to detect SCCs
- When each algorithm fits, and where SCCs appear in interviews
Connected components on undirected graphs feel straightforward, and it's tempting to assume the idea transfers directly to directed graphs. It doesn't. In an undirected graph, reachability is symmetric. If A reaches B, B reaches A. One DFS finds the whole component. Directed graphs break that symmetry, and that's where strongly connected components become the right abstraction.
What strongly connected components actually are
A strongly connected component is a maximal set of vertices in a directed graph where every vertex is reachable from every other vertex. "Maximal" means you can't add another vertex without breaking the all pairs reachability property.
This matters because directed graphs have reachability patterns that a single traversal won't reveal. Consider a graph with vertices {0, 1, 2, 3, 4} and these edges:
- 0 → 1, 1 → 2, 2 → 0 (these three form a cycle)
- 2 → 3, 3 → 4, 4 → 3 (vertices 3 and 4 form a cycle; 2 → 3 connects the two groups)
This graph has two SCCs: {0, 1, 2} and {3, 4}. Within each group, every vertex reaches every other. But vertex 3 can't reach vertex 0, so they can't belong to the same SCC. Notice that the two SCCs aren't isolated from each other. There's an edge from 2 to 3 that crosses the boundary. SCC decomposition doesn't require components to be disconnected. It requires that the internal reachability within each component is complete.
The obvious brute force approach, running a reachability check from every vertex, takes O(V * (V + E)). Both Kosaraju's and Tarjan's algorithms bring that down to O(V + E), which matches the time of a single traversal.
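To make the baseline concrete, here is a minimal sketch of that brute force approach: one BFS reachability check per vertex, then grouping vertices that reach each other. The function name and representation are my own for illustration.

```python
from collections import deque

def brute_force_scc(n, edges):
    """O(V * (V + E)) baseline: one reachability check per vertex."""
    graph = [[] for _ in range(n)]
    for u, v in edges:
        graph[u].append(v)

    def reachable(src):
        # Standard BFS collecting every vertex reachable from src.
        seen = {src}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    q.append(v)
        return seen

    reach = [reachable(u) for u in range(n)]
    # u and v share an SCC iff each reaches the other.
    sccs, assigned = [], [False] * n
    for u in range(n):
        if not assigned[u]:
            comp = [v for v in range(n) if u in reach[v] and v in reach[u]]
            for v in comp:
                assigned[v] = True
            sccs.append(comp)
    return sccs
```

Running it on the example graph above returns the two expected components, `[0, 1, 2]` and `[3, 4]`. The V separate traversals are exactly the waste that Kosaraju's and Tarjan's eliminate.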
Kosaraju's algorithm: Two passes and a transpose
Kosaraju's algorithm is the more intuitive of the two. It relies on a specific property of DFS finish times: if SCC A has an edge to SCC B in the condensation graph, then the last vertex to finish in A finishes after every vertex in B. The algorithm has three steps.
1. Run DFS on the original graph. As each vertex finishes (all its descendants are fully explored), push it onto a stack.
2. Transpose the graph (reverse every edge direction).
3. Pop vertices from the stack. For each unvisited vertex, run DFS on the transposed graph. Every vertex reached belongs to the same SCC.
Why does this work?
The finish time ordering gives you a precise guarantee. When you pop from the stack (highest finish time first), you start in an SCC whose vertices finished after all vertices in any SCC it points to. Transposing the graph reverses that edge, so the second pass DFS can't cross into the downstream SCC. The edge points the wrong way now. Each second pass DFS call is trapped inside one SCC.
That's the whole invariant, and it's why Kosaraju's is correct.
Tracing Kosaraju's on a concrete graph
Walk through the 5-node graph from above: vertices {0, 1, 2, 3, 4}, edges: 0→1, 1→2, 2→0, 2→3, 3→4, 4→3.
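A sketch of the full algorithm in Python, run against that graph (the function name and adjacency-list setup are mine; any standard representation works):

```python
from collections import defaultdict

def kosaraju_scc(n, edges):
    """Return the strongly connected components of a directed graph."""
    graph = defaultdict(list)
    transpose = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        transpose[v].append(u)  # reversed edge for pass 2

    # Pass 1: DFS on the original graph, recording finish order.
    visited = [False] * n
    order = []
    def dfs1(u):
        visited[u] = True
        for v in graph[u]:
            if not visited[v]:
                dfs1(v)
        order.append(u)  # pushed only after all descendants finish
    for u in range(n):
        if not visited[u]:
            dfs1(u)

    # Pass 2: DFS on the transpose, in decreasing finish-time order.
    visited = [False] * n
    sccs = []
    def dfs2(u, comp):
        visited[u] = True
        comp.append(u)
        for v in transpose[u]:
            if not visited[v]:
                dfs2(v, comp)
    for u in reversed(order):
        if not visited[u]:
            comp = []
            dfs2(u, comp)  # trapped inside one SCC
            sccs.append(comp)
    return sccs

edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 3)]
print(kosaraju_scc(5, edges))  # two components: {0, 1, 2} and {3, 4}
```

In pass 1, vertex 0 finishes last, so pass 2 starts there; the reversed edge 3 → 2 is now 2 → 3's mirror pointing back in, so the first second-pass DFS collects exactly {0, 1, 2} before moving on to {3, 4}.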
Graph transposition comes up more often than you'd expect outside SCC detection. Reverse reachability problems ("which nodes can reach this target?") and certain shortest path formulations both use it. Transposing takes O(V + E), so it's cheap enough to reach for whenever you need to flip the direction of a query.
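Transposition itself is a short loop over the edges; a sketch for a dict-of-lists adjacency representation (helper name is mine):

```python
def transpose(graph):
    """Reverse every edge of an adjacency-list graph in O(V + E)."""
    rev = {u: [] for u in graph}
    for u, neighbors in graph.items():
        for v in neighbors:
            # Edge u -> v becomes v -> u in the transpose.
            rev.setdefault(v, []).append(u)
    return rev
```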
Tarjan's algorithm: One pass with a stack
Tarjan's algorithm finds all SCCs without a second pass or a transposed graph. It maintains two values per vertex during a single DFS.
- disc (discovery index): tracks when the vertex was first visited
- low (low link value): tracks the smallest discovery index reachable from the vertex's DFS subtree through back edges
A vertex is the root of an SCC if and only if its low link value equals its discovery index. When you find such a root, everything on the stack above it (inclusive) forms one SCC.
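A sketch of Tarjan's in Python, with the disc/low bookkeeping spelled out (the function name and edge-list input are my choices; the mechanics follow the standard algorithm):

```python
def tarjan_scc(n, edges):
    """Return the SCCs of a directed graph in a single DFS pass."""
    graph = [[] for _ in range(n)]
    for u, v in edges:
        graph[u].append(v)

    disc = [-1] * n          # discovery index; -1 means unvisited
    low = [0] * n            # lowest discovery index reachable from the subtree
    on_stack = [False] * n
    stack = []
    sccs = []
    counter = [0]

    def dfs(u):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        stack.append(u)
        on_stack[u] = True
        for v in graph[u]:
            if disc[v] == -1:
                dfs(v)
                low[u] = min(low[u], low[v])   # propagate low link up the tree
            elif on_stack[v]:
                low[u] = min(low[u], disc[v])  # back edge to a vertex still on the stack
        if low[u] == disc[u]:                  # u is the root of an SCC
            comp = []
            while True:
                w = stack.pop()
                on_stack[w] = False
                comp.append(w)
                if w == u:
                    break
            sccs.append(comp)

    for u in range(n):
        if disc[u] == -1:
            dfs(u)
    return sccs
```

On the 5-node example graph this finds {3, 4} first (its root finishes deeper in the DFS) and then {0, 1, 2}, all in one traversal and with no transposed graph.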
The low value propagation is the core mechanic. When vertex v reaches vertex w through its subtree, and w has a back edge to an ancestor still on the stack, v's low link value drops. That drop tells you v is part of a cycle including that ancestor. Only the topmost vertex in the cycle (the one whose low equals its disc) gets to "claim" the SCC.
The on_stack check is critical. Without it, cross edges to vertices in already processed SCCs would incorrectly lower the low link value, merging components that should stay separate.

When to use Kosaraju's vs Tarjan's
Both algorithms run in O(V + E) time. The differences are practical, not asymptotic.
For interviews, Kosaraju's is usually the stronger choice. The two pass structure gives you natural checkpoints to verify your logic with the interviewer, and the invariant is easier to articulate. Tarjan's is more elegant, but the low link propagation introduces subtle bugs that are hard to catch without a debugger.
For competitive programming or production systems where memory and constant factors matter, Tarjan's is often preferred. It avoids allocating the transposed graph entirely.
Where strongly connected components appear in interviews
SCC decomposition rarely shows up as a standalone interview question. It's almost always a building block inside a harder problem. You won't struggle with the algorithm itself. You'll struggle with recognizing the SCC signal in the problem at all.
- Dependency cycle analysis: Given a set of modules with import dependencies, find all groups of mutually dependent modules. This is direct SCC decomposition on the dependency graph. Build system and package manager questions use this pattern, and the answer reduces to running Kosaraju's or Tarjan's on the import adjacency list.
- 2-SAT problems: The satisfiability of a 2-SAT formula can be determined by building an implication graph and checking whether any variable and its negation belong to the same SCC. If they do, the formula is unsatisfiable. The connection between boolean satisfiability and graph structure isn't obvious at first, but it clicks fast once you trace through an example.
- Condensation to DAG: Many problems become simpler after you collapse the graph into its condensation. "Can every node reach every other node?" reduces to "is there exactly one SCC?" Reachability queries, minimum edge additions, and longest path problems on general directed graphs often start with SCC decomposition as the first step. Once the graph is a DAG, you can apply topological sort, dynamic programming, or simple BFS to finish.
- Critical connections: Finding edges whose removal disconnects part of the graph uses the same DFS based approach as Tarjan's algorithm. Bridge finding and articulation point detection rely on similar low link value mechanics, so if you understand Tarjan's SCC algorithm, you've already done most of the work for that family of problems.
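The condensation step in particular is mechanical once you have the SCC list. A sketch, assuming sccs is the output of either algorithm above (the function and variable names are mine):

```python
def condensation(n, edges, sccs):
    """Collapse each SCC into one node; return component ids and the DAG."""
    comp_id = [0] * n
    for i, comp in enumerate(sccs):
        for u in comp:
            comp_id[u] = i

    # Keep only edges that cross between different components.
    dag = {i: set() for i in range(len(sccs))}
    for u, v in edges:
        if comp_id[u] != comp_id[v]:
            dag[comp_id[u]].add(comp_id[v])
    return comp_id, dag
```

For the 5-node example, the two SCCs collapse to a two-node DAG with a single edge between them, and any DAG technique applies from there.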
The common thread across all of these: the problem involves mutual reachability in a directed graph. If the question mentions circular dependencies, bidirectional paths in a one way system, or collapsing cycles into single units, SCC decomposition is probably the right first step. Once you can spot that signal, the implementation is the easy part.
The Graph course on Codeintuition covers the DFS pattern and connected component identification from first principles. If your DFS foundations aren't solid, SCC algorithms will feel like sequences of steps to memorize rather than constructions you can rebuild.
For a visual walkthrough of how BFS and DFS differ and where each traversal applies, see our BFS vs DFS comparison.
Recognizing the SCC signal
Strongly connected components tie together graph traversal, cycle detection, and reachability. The algorithms aren't hard to memorize. The hard part is recognizing that a problem needs SCC decomposition before you start coding. Mutual reachability constraints on a directed graph point directly to it.
Codeintuition's learning path builds graph algorithms from the traversal patterns up. You don't start with Tarjan's implementation. You start with DFS finish time properties and connected component identification, then construct the SCC algorithms from those foundations. The 289 illustrations in the Graph course trace every variable at every step.
Implementing Kosaraju's from a memorized template is the easy half. The real test is reading a problem about mutual dependencies, circular imports, or reachability constraints and knowing that SCC decomposition is where you start.
Build the DFS foundations that make SCC algorithms click.
Codeintuition's Graph course covers DFS finish time properties and connected component identification from first principles, with 289 illustrations tracing every variable. Start with the free foundational courses.