Strongly Connected Components Explained

Learn how Kosaraju's and Tarjan's algorithms find strongly connected components, when to use each, and how SCCs appear in coding interviews.

10 minutes
Advanced
What you will learn

What strongly connected components are and why undirected connectivity breaks

How Kosaraju's two pass algorithm finds every SCC in a directed graph

How Tarjan's single pass algorithm uses low link values to detect SCCs

When each algorithm fits and where SCCs appear in interviews

Connected components on undirected graphs feel straightforward, and it's tempting to assume the idea transfers directly to directed graphs. It doesn't. In an undirected graph, reachability is symmetric. If A reaches B, B reaches A. One DFS finds the whole component. Directed graphs break that symmetry, and that's where strongly connected components become the right abstraction.

TL;DR
A strongly connected component is a maximal group of vertices where every vertex can reach every other vertex through directed edges. Kosaraju's algorithm finds all SCCs with two DFS passes and a graph transposition. Tarjan's does it in one pass using low link values. Both run in O(V + E).

What strongly connected components actually are

A strongly connected component is a maximal set of vertices in a directed graph where every vertex is reachable from every other vertex. "Maximal" means you can't add another vertex without breaking the all pairs reachability property.

This matters because directed graphs have reachability patterns that a single traversal won't reveal. Consider a graph with vertices {0, 1, 2, 3, 4} and these edges:

  • 0 → 1, 1 → 2, 2 → 0 (these three form a cycle)
  • 2 → 3, 3 → 4, 4 → 3 (3 and 4 form a cycle; the edge 2 → 3 links the two groups)

This graph has two SCCs: {0, 1, 2} and {3, 4}. Within each group, every vertex reaches every other. But vertex 3 can't reach vertex 0, so they can't belong to the same SCC. Notice that the two SCCs aren't isolated from each other. There's an edge from 2 to 3 that crosses the boundary. SCC decomposition doesn't require components to be disconnected. It requires that the internal reachability within each component is complete.

ℹ️ Info
Collapse each SCC into a single node and the resulting graph is always a DAG. This is the condensation graph, and that's why SCC decomposition matters. Problems that are hard on general directed graphs often become tractable once you reduce them to their condensation.

The obvious brute force approach, running a reachability check from every vertex, takes O(V * (V + E)). Both Kosaraju's and Tarjan's algorithms bring that down to O(V + E), which matches the time of a single traversal.
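To see where the O(V * (V + E)) comes from, the brute force can be sketched as one DFS per vertex followed by pairwise grouping. This is purely illustrative (the function name `sccs_bruteforce` is mine), not something you'd run on large graphs:

```python
def sccs_bruteforce(n, edges):
    """O(V * (V + E)) baseline: one DFS per vertex, then mutual-reachability grouping."""
    graph = [[] for _ in range(n)]
    for u, v in edges:
        graph[u].append(v)

    def reachable(src):
        # Iterative DFS collecting every vertex reachable from src.
        seen = {src}
        stack = [src]
        while stack:
            u = stack.pop()
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    reach = [reachable(u) for u in range(n)]  # V traversals, each O(V + E)
    assigned = [False] * n
    sccs = []
    for u in range(n):
        if not assigned[u]:
            # v shares u's SCC exactly when each can reach the other.
            comp = [v for v in range(n) if v in reach[u] and u in reach[v]]
            for v in comp:
                assigned[v] = True
            sccs.append(comp)
    return sccs
```

On the example graph this produces {0, 1, 2} and {3, 4}, the same answer the linear-time algorithms give, just V times slower.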

Kosaraju's algorithm: Two passes and a transpose

Kosaraju's algorithm is the more intuitive of the two. It relies on a specific property of DFS finish times: if SCC A has an edge to SCC B in the condensation graph, then the last vertex to finish in A finishes after every vertex in B. The algorithm has three steps.

  1. Run DFS on the original graph. As each vertex finishes (all its descendants are fully explored), push it onto a stack.
  2. Transpose the graph (reverse every edge direction).
  3. Pop vertices from the stack. For each unvisited vertex, run DFS on the transposed graph. Every vertex reached belongs to the same SCC.

Why does this work?

The finish time ordering gives you a precise guarantee. When you pop from the stack (highest finish time first), you start in an SCC whose vertices finished after all vertices in any SCC it points to. Transposing the graph reverses that edge, so the second pass DFS can't cross into the downstream SCC. The edge points the wrong way now. Each second pass DFS call is trapped inside one SCC.

That's the whole invariant, and it's why Kosaraju's is correct.

Tracing Kosaraju's on a concrete graph

Walk through the 5-node graph from above: vertices {0, 1, 2, 3, 4}, edges: 0→1, 1→2, 2→0, 2→3, 3→4, 4→3.

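A minimal Kosaraju's implementation, applied to this exact graph (names like `kosaraju` and `dfs_finish` are mine, not from any particular library):

```python
from collections import defaultdict
import sys

def kosaraju(n, edges):
    """Return the SCCs of a directed graph with vertices 0..n-1."""
    graph = defaultdict(list)
    transpose = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        transpose[v].append(u)  # build the reversed graph up front

    visited = [False] * n
    finish_stack = []

    def dfs_finish(u):
        visited[u] = True
        for v in graph[u]:
            if not visited[v]:
                dfs_finish(v)
        finish_stack.append(u)  # pushed only after all descendants finish

    for u in range(n):         # pass 1: record finish order
        if not visited[u]:
            dfs_finish(u)

    visited = [False] * n
    sccs = []

    def dfs_collect(u, comp):
        visited[u] = True
        comp.append(u)
        for v in transpose[u]:
            if not visited[v]:
                dfs_collect(v, comp)

    while finish_stack:        # pass 2: highest finish time first, on the transpose
        u = finish_stack.pop()
        if not visited[u]:
            comp = []
            dfs_collect(u, comp)
            sccs.append(comp)
    return sccs

edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 3)]
print([sorted(c) for c in kosaraju(5, edges)])  # [[0, 1, 2], [3, 4]]
```

On this input, vertex 0 has the highest finish time, so the second pass starts there and collects {0, 1, 2} before the stack ever reaches 3 or 4. Note that recursive DFS can hit Python's default recursion limit on deep graphs; `sys.setrecursionlimit` or an iterative DFS handles that.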

Graph transposition comes up more often than you'd expect outside SCC detection. Reverse reachability problems ("which nodes can reach this target?") and certain shortest path formulations both use it. Transposing takes O(V + E), so it's cheap enough to reach for whenever you need to flip the direction of a query.
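Transposition itself is a few lines. A sketch, assuming an adjacency-dict representation (node mapped to its list of successors):

```python
def transpose(graph):
    """Reverse every edge of graph (node -> list of successors) in O(V + E)."""
    rev = {u: [] for u in graph}
    for u, successors in graph.items():
        for v in successors:
            rev.setdefault(v, []).append(u)  # flip each edge u -> v into v -> u
    return rev

print(transpose({0: [1], 1: [2], 2: [0]}))  # {0: [2], 1: [0], 2: [1]}
```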


Tarjan's algorithm: One pass with a stack

Tarjan's algorithm finds all SCCs without a second pass or a transposed graph. It maintains two values per vertex during a single DFS.

  • disc (discovery index): tracks when the vertex was first visited
  • low (low link value): tracks the smallest discovery index reachable from the vertex's DFS subtree through back edges

A vertex is the root of an SCC if and only if its low link value equals its discovery index. When you find such a root, everything on the stack above it (inclusive) forms one SCC.

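A minimal Tarjan's implementation (again, names like `tarjan` are mine), showing the `disc`/`low` bookkeeping and the root check:

```python
from collections import defaultdict

def tarjan(n, edges):
    """Return the SCCs of a directed graph with vertices 0..n-1, single DFS pass."""
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)

    disc = [-1] * n          # discovery index, -1 means unvisited
    low = [0] * n            # lowest discovery index reachable from the subtree
    on_stack = [False] * n
    stack = []
    counter = [0]
    sccs = []

    def dfs(u):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        stack.append(u)
        on_stack[u] = True
        for v in graph[u]:
            if disc[v] == -1:
                dfs(v)                        # tree edge: inherit child's low
                low[u] = min(low[u], low[v])
            elif on_stack[v]:                 # back edge to a vertex still on the stack
                low[u] = min(low[u], disc[v])
        if low[u] == disc[u]:                 # u is the root of an SCC
            comp = []
            while True:
                w = stack.pop()
                on_stack[w] = False
                comp.append(w)
                if w == u:
                    break
            sccs.append(comp)

    for u in range(n):
        if disc[u] == -1:
            dfs(u)
    return sccs

edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 3)]
print([sorted(c) for c in tarjan(5, edges)])  # [[3, 4], [0, 1, 2]]
```

Notice the output order: Tarjan's pops sink SCCs first, so {3, 4} is emitted before {0, 1, 2}, the reverse of the order Kosaraju's second pass discovers them.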

The low value propagation is the core mechanic. When vertex v reaches vertex w through its subtree, and w has a back edge to an ancestor still on the stack, v's low link value drops. That drop tells you v is part of a cycle including that ancestor. Only the topmost vertex in the cycle (the one whose low equals its disc) gets to "claim" the SCC.

💡 Tip
The on_stack check is critical. Without it, cross edges to vertices in already processed SCCs would incorrectly lower the low link value, merging components that should stay separate.

When to use Kosaraju's vs Tarjan's

Both algorithms run in O(V + E) time. The differences are practical, not asymptotic.

Kosaraju's algorithm:

  • Requires two DFS passes and a full graph transposition
  • Uses O(V + E) extra space for the transposed graph
  • Easier to implement correctly under time pressure
  • The invariant (finish time ordering) is more intuitive to explain

Tarjan's algorithm:

  • Single DFS pass, no graph transposition needed
  • Uses O(V) extra space for the stack and index arrays
  • Harder to implement bug free (low link propagation is subtle)
  • Slightly better constant factors in practice

For interviews, Kosaraju's is usually the stronger choice. The two pass structure gives you natural checkpoints to verify your logic with the interviewer, and the invariant is easier to articulate. Tarjan's is more elegant but the low link propagation introduces subtle bugs that are hard to catch without a debugger.

For competitive programming or production systems where memory and constant factors matter, Tarjan's is often preferred. It avoids allocating the transposed graph entirely.

Where strongly connected components appear in interviews

SCC decomposition rarely shows up as a standalone interview question. It's almost always a building block inside a harder problem. You won't struggle with the algorithm itself. You'll struggle with recognizing the SCC signal in the problem at all.

  • Dependency cycle analysis: Given a set of modules with import dependencies, find all groups of mutually dependent modules. This is direct SCC decomposition on the dependency graph. Build system and package manager questions use this pattern, and the answer reduces to running Kosaraju's or Tarjan's on the import adjacency list.
  • 2-SAT problems: The satisfiability of a 2-SAT formula can be determined by building an implication graph and checking whether any variable and its negation belong to the same SCC. If they do, the formula is unsatisfiable. The connection between boolean satisfiability and graph structure isn't obvious at first, but it clicks fast once you trace through an example.
  • Condensation to DAG: Many problems become simpler after you collapse the graph into its condensation. "Can every node reach every other node?" reduces to "is there exactly one SCC?" Reachability queries, minimum edge additions, and longest path problems on general directed graphs often start with SCC decomposition as the first step. Once the graph is a DAG, you can apply topological sort, dynamic programming, or simple BFS to finish.
  • Critical connections: Finding edges whose removal disconnects part of the graph uses the same DFS based approach as Tarjan's algorithm. Bridge finding and articulation point detection rely on similar low link value mechanics, so if you understand Tarjan's SCC algorithm, you've already done most of the work for that family of problems.
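To make the condensation idea concrete, here's a sketch that collapses SCCs into DAG nodes; the `sccs` argument is assumed to come from Kosaraju's or Tarjan's, and `condensation` is a name I'm introducing for illustration:

```python
def condensation(edges, sccs):
    """Collapse each SCC into one node; return (node count, set of DAG edges)."""
    comp_id = {}
    for i, comp in enumerate(sccs):
        for u in comp:
            comp_id[u] = i
    dag_edges = set()
    for u, v in edges:
        if comp_id[u] != comp_id[v]:          # keep only edges crossing SCC boundaries
            dag_edges.add((comp_id[u], comp_id[v]))
    return len(sccs), dag_edges

edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 3)]
print(condensation(edges, [[0, 1, 2], [3, 4]]))  # (2, {(0, 1)})
```

For the example graph, five vertices and six edges condense to a two-node DAG with a single edge, and "can every node reach every other node?" becomes "is the node count 1?".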

The common thread across all of these: the problem involves mutual reachability in a directed graph. If the question mentions circular dependencies, bidirectional paths in a one way system, or collapsing cycles into single units, SCC decomposition is probably the right first step. Once you can spot that signal, the implementation is the easy part.

The Graph course on Codeintuition covers the DFS pattern and connected component identification from first principles. If your DFS foundations aren't solid, SCC algorithms will feel like sequences of steps to memorize rather than constructions you can rebuild.

For a visual walkthrough of how BFS and DFS differ and where each traversal applies, see our BFS vs DFS comparison.

Recognizing the SCC signal

Strongly connected components tie together graph traversal, cycle detection, and reachability. The algorithms aren't hard to memorize. The hard part is recognizing that a problem needs SCC decomposition before you start coding. Mutual reachability constraints on a directed graph point directly to it.

Codeintuition's learning path builds graph algorithms from the traversal patterns up. You don't start with Tarjan's implementation. You start with DFS finish time properties and connected component identification, then construct the SCC algorithms from those foundations. The 289 illustrations in the Graph course trace every variable at every step.

Implementing Kosaraju's from a memorized template is the easy half. The real test is reading a problem about mutual dependencies, circular imports, or reachability constraints and knowing that SCC decomposition is where you start.

Build the DFS foundations that make SCC algorithms click.

Codeintuition's Graph course covers DFS finish time properties and connected component identification from first principles, with 289 illustrations tracing every variable. Start with the free foundational courses.

Frequently asked questions

How do strongly connected components differ from connected components?

Connected components apply to undirected graphs, where reachability is symmetric. Strongly connected components apply to directed graphs, where reachability is unidirectional. An SCC requires that every vertex can reach every other vertex through directed paths.

Do I need Kosaraju's or Tarjan's for undirected graphs?

You don't need them for undirected graphs. Since edges are bidirectional, a simple DFS or BFS from any unvisited vertex already finds the entire connected component. Kosaraju's transpose step adds nothing when every edge goes both ways. Use Kosaraju's or Tarjan's only for directed graphs where reachability asymmetry creates distinct SCCs.

What is the time complexity of these algorithms?

Both algorithms run in O(V + E), where V is the number of vertices and E is the number of edges. Kosaraju's performs two DFS passes plus one transposition, but each operation is O(V + E), so the total stays linear. Tarjan's achieves the same bound in a single DFS pass with constant time stack operations per vertex.

Can you topologically sort a graph that contains cycles?

You can't topologically sort a graph that contains cycles, and SCCs are the maximal cyclic substructures in a directed graph. But if you condense the graph by collapsing each SCC into a single node, the result is a DAG that you can topologically sort. This condense-then-sort technique is standard for ordering tasks or dependencies in graphs with cycles. Kosaraju's algorithm produces SCCs in topological order of the condensation graph, which means you get the topological ordering as a byproduct. Tarjan's produces them in reverse topological order.

Do SCC questions come up in interviews?

SCC questions typically appear at the senior or staff level, embedded inside a harder graph problem rather than asked as a standalone question. Google and Amazon have used problems involving dependency cycle detection and reachability analysis where SCC decomposition is a required step. The skill that matters most isn't memorizing Tarjan's but recognizing when a problem's structure calls for SCC decomposition.