How the call stack works in recursion

See how the call stack works during recursion with a frame-by-frame factorial(4) trace, including variable state at every push and pop.

10 minutes
Beginner
What you will learn

What a stack frame contains and why each field matters

How to trace factorial(4) through four frames with exact variable state

Why recursive depth directly determines O(depth) space complexity

When the call stack breaks and how tail call optimization helps

You're tracing through a recursive function on paper. The first call makes sense. The second makes sense. By the third nested call, you've lost track of which n belongs to which invocation, what's waiting to execute after each return, and why the whole thing produces the right answer. That confusion isn't about recursion itself. It's about what's happening in memory when one function call triggers the next, and how the call stack works during recursion is the physical answer to that question.

The recursion "works," but you can't explain why it works. Most explanations just show the code and wave their hands at "the recursive leap of faith." That's fine until an interviewer asks you to trace through the actual execution.

TL;DR
Every function call creates a stack frame in memory containing its own parameters, local variables, and return address. Recursive calls stack these frames on top of each other. When a call returns, its frame is popped and the previous call resumes exactly where it left off.

What happens when you call a function

The call stack is a region of memory that tracks active function calls. Each call creates a stack frame containing local variables, parameters, and a return address. When the function returns, its frame is popped and the caller's state is restored.

That's the textbook definition, so here's what it means concretely.

When your program calls a function, the runtime doesn't just "jump" to that function's code. Three things happen in sequence. The runtime saves the current position in the calling function so it knows where to come back. It allocates a new block of memory (the stack frame) for the called function's data. Then it transfers control to the called function's first instruction.

This happens for every function call, recursive or not. main calling helper calling print follows the same mechanism. The only difference with recursion is that multiple frames exist for the same function, each holding its own independent copy of every variable. The stack grows with each call and shrinks with each return, and that's the entire model you need to trace any recursion.
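The three steps can be sketched with a plain Python list standing in for the call stack. This is a toy model for illustration, not how a real runtime lays out frames; the function names, the `return_to` strings, and the frame dictionary layout are all hypothetical.

```python
# A toy model of the call stack: each frame records the callee's parameters
# and where the caller should resume. Pushing happens on call, popping on return.
call_stack = []

def call(func_name, params, return_to):
    # Save where to resume and allocate a frame for the callee.
    frame = {"function": func_name, "params": params, "return_to": return_to}
    call_stack.append(frame)
    return frame

def ret():
    # On return, the callee's frame is popped and the caller resumes.
    finished = call_stack.pop()
    return finished["return_to"]

call("main", {}, return_to=None)
call("helper", {"x": 1}, return_to="main, line 12")
call("print", {"msg": "hi"}, return_to="helper, line 3")
print([f["function"] for f in call_stack])  # ['main', 'helper', 'print']
resume_at = ret()                           # pops 'print'
print(resume_at)                            # helper, line 3
```

The same push and pop discipline applies whether the three calls are to three different functions or to the same function three times.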

Inside a stack frame

Each stack frame holds four pieces of information. They're worth knowing individually because each one answers a different question when you're tracing recursion by hand.

Parameters are the values passed into this specific invocation. When factorial(3) calls factorial(2), the new frame's parameter is 2, completely independent of the 3 still sitting in the parent frame. This is why recursive calls don't overwrite each other's input. Local variables work the same way. Each frame gets its own copy, so a variable called result in one frame has nothing to do with result in another frame, even though they share a name in the source code.

Return address tells the runtime where to continue executing after this function finishes. For a recursive call, it typically points to the line right after the recursive invocation, where the current frame's computation resumes. Return value is what the function hands back to its caller. When factorial(2) returns 2, that value gets passed to the frame that called it, which uses it to compute its own result.
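Per-frame independence is easy to verify directly. The sketch below uses a hypothetical countdown helper: each recursive call records its own n and result, and the assertion after the inner call confirms that the frame's locals survived untouched.

```python
# Each recursive call gets its own n and its own result; the log shows
# that no frame overwrites another frame's variables.
frames_seen = []

def countdown(n):
    result = n * 10  # a local variable unique to this frame
    frames_seen.append({"n": n, "result": result})
    if n > 1:
        countdown(n - 1)
    # After the inner call returns, this frame's locals are intact:
    assert result == n * 10

countdown(3)
print(frames_seen)  # [{'n': 3, 'result': 30}, {'n': 2, 'result': 20}, {'n': 1, 'result': 10}]
```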

ℹ️ Info
The return address is the piece most people forget when tracing recursion by hand. Without it, you can't explain why factorial(3) knows to multiply by 3 after the inner call finishes. The address is what preserves the "waiting" computation.

How the call stack handles recursion: tracing factorial(4)

Let's trace through a concrete example:

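The function under trace, reconstructed from the steps described below (base case at n less than or equal to 1):

```python
def factorial(n):
    # Base case: factorial(1) (and factorial(0)) is 1.
    if n <= 1:
        return 1
    # Recursive case: this frame pauses here until factorial(n - 1)
    # returns, then resumes to perform the multiplication.
    return n * factorial(n - 1)

print(factorial(4))  # 24
```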

When you call factorial(4), four frames get created. Each push and pop happens in a predictable order.

Call 1, factorial(4). A frame is pushed with n = 4. The function checks whether 4 is less than or equal to 1. It isn't. The function reaches the return line and needs factorial(3) before it can multiply, so it pauses and makes the next call.

Call 2, factorial(3). A new frame lands on top with n = 3. The same check fails, so this frame also pauses and waits for factorial(2). Call 3, factorial(2) pushes another frame with n = 2, and this one needs factorial(1) before it can proceed.

Call 4, factorial(1). The final frame is pushed with n = 1. The base case hits, so the function returns 1 and this frame gets popped.

Now the stack unwinds. The factorial(2) frame receives 1, computes 2 times 1, and returns 2 as its frame is popped. The factorial(3) frame below it receives that 2, computes 3 times 2, and returns 6. Finally, factorial(4) receives 6, computes 4 times 6, and returns 24 as the last frame is popped. The stack is empty and the final result is 24.
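The same push/pop sequence can be printed by instrumenting the function. Here `factorial_traced` is a hypothetical instrumented copy; the `depth` parameter exists only for logging and is not part of the algorithm.

```python
def factorial_traced(n, depth=1):
    print(f"push factorial({n}), depth {depth}")
    if n <= 1:
        print(f"pop  factorial({n}) -> 1 (base case), depth {depth}")
        return 1
    # The frame pauses here while the inner call runs one level deeper.
    result = n * factorial_traced(n - 1, depth + 1)
    print(f"pop  factorial({n}) -> {result}, depth {depth}")
    return result

factorial_traced(4)  # four pushes in order 4,3,2,1, then pops 1,2,3,4
```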

Call stack state during factorial(4)

1. Push factorial(4): n=4, waiting on factorial(3). Stack depth: 1.
2. Push factorial(3): n=3, waiting on factorial(2). Stack depth: 2.
3. Push factorial(2): n=2, waiting on factorial(1). Stack depth: 3.
4. Push factorial(1): n=1, base case reached. Returns 1. Stack depth: 4 (peak).
5. Pop factorial(2): receives 1, computes 2 times 1 equals 2, returns 2. Stack depth: 3.
6. Pop factorial(3): receives 2, computes 3 times 2 equals 6, returns 6. Stack depth: 2.
7. Pop factorial(4): receives 6, computes 4 times 6 equals 24, returns 24. Stack depth: 1.

At peak depth, four frames exist simultaneously. Each one holds a different value of n and a different "waiting" computation. The stack keeps them from colliding.

“Recursion isn't magic. It's four stack frames, each with its own n, waiting in exactly the right order.”
The mental model that makes recursion traceable

How the call stack gives recursion O(depth) space

Every active call occupies one stack frame. Frames accumulate until a base case triggers the unwind. The maximum number of frames alive at the same time equals the deepest point of the recursion.

For factorial(n), that peak depth is n, which means four frames for factorial(4) and a thousand frames for factorial(1000). Each frame is a fixed-size block containing parameters, locals, and return address, so the total memory consumed by the call stack is proportional to the depth.

This is why recursive algorithms have O(depth) space complexity from the call stack alone, even when they don't allocate extra arrays or data structures. A recursive binary tree traversal visiting n nodes uses O(h) stack space where h is the tree's height, not O(n), because at most h frames exist at once. The distinction between total calls and peak depth matters when you're analyzing how the call stack behaves in branching recursion.
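Peak depth can be measured directly. In this sketch, `active` and `peak` are hypothetical counters bolted onto the function purely for measurement.

```python
active = 0  # frames currently alive
peak = 0    # most frames alive at any one moment

def factorial_counted(n):
    global active, peak
    active += 1                  # a frame was pushed
    peak = max(peak, active)
    result = 1 if n <= 1 else n * factorial_counted(n - 1)
    active -= 1                  # the frame was popped
    return result

factorial_counted(4)
print(peak)  # 4: four frames alive at once, matching O(depth) space
```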

Common misconceptions vs. what actually happens:

Misconception: Recursion uses O(n) space because it makes n calls.
Reality: Space depends on maximum simultaneous frames, not total calls.

Misconception: Each recursive call uses one unit of permanent memory.
Reality: Frames are reclaimed on return and don't accumulate forever.

Misconception: Tail recursion is always better because it uses less memory.
Reality: Tail recursion helps only if the language optimizes it.

A recursive function that branches, like tree traversal, might make n total calls but only have log(n) frames active at once. Earlier branches pop their frames before later branches push theirs.
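The gap between total calls and active frames shows up clearly in a branching traversal. The sketch below walks a hypothetical perfect binary tree of height h (no real tree nodes, just the call structure), with counters added for measurement.

```python
calls = 0   # total invocations
active = 0  # frames currently alive
peak = 0    # most frames alive at once

def traverse(h):
    # Visit a perfect binary tree of height h (a single node when h == 1).
    global calls, active, peak
    calls += 1
    active += 1
    peak = max(peak, active)
    if h > 1:
        traverse(h - 1)  # left subtree: its frames pop before the right's push
        traverse(h - 1)  # right subtree
    active -= 1

traverse(4)
print(calls, peak)  # 15 total calls, but only 4 frames at peak
```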

When the stack breaks

Every runtime environment allocates a fixed-size region for the call stack. On most systems, that's between 1 MB and 8 MB. Each stack frame for a typical function consumes somewhere between 48 and a few hundred bytes, depending on local variable count and language overhead.

That fixed size creates a hard ceiling on recursion depth.

Python's default recursion limit is 1,000, enforced by the interpreter rather than the operating system. Java's practical limit is typically around 5,000 to 10,000 depending on frame size and JVM settings. C++ uses whatever the OS stack allows, usually around 10,000 to 50,000 for simple functions.

When you exceed that limit, you get a stack overflow. The algorithm might be perfectly correct, but the stack ran out of room, and this is why factorial(100000) crashes in most languages even though the math is sound.
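In Python, the overflow surfaces as a catchable RecursionError raised by the interpreter rather than a hard crash. A quick sketch:

```python
import sys

def factorial(n):
    return 1 if n <= 1 else n * factorial(n - 1)

print(sys.getrecursionlimit())  # typically 1000 on a default CPython build

try:
    factorial(100_000)           # the math is sound; the frames are not
except RecursionError as exc:
    print("stack limit hit:", type(exc).__name__)
```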

There's a rabbit hole here about whether languages should optimize tail calls. Some do (Scheme, Haskell, certain Scala scenarios). Most don't (Python, Java, JavaScript in practice). The debate is genuinely interesting and unresolved for mainstream languages, but the practical takeaway is blunt: if your recursion goes deep, don't assume the language will save you. Check the limit or convert to iteration.

⚠️ Warning
Python's default recursion limit of 1,000 is intentionally conservative. You can raise it with sys.setrecursionlimit(), but the OS-level stack size still applies. Increasing the limit without increasing the actual stack allocation just delays the crash.

When the stack stops growing

Tail call optimization is the one scenario where the call stack doesn't grow with depth. When the recursive call is the last operation in the function, with no multiplication or addition after it, the runtime can reuse the current frame instead of pushing a new one. The factorial example above isn't tail-recursive because it still needs to multiply after the inner call returns. A tail-recursive version passes an accumulator:

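A tail-recursive factorial, reconstructed to match the factorial_tail(4, 1) sequence described below:

```python
def factorial_tail(n, acc=1):
    # acc carries the product so far. Nothing remains to do after the
    # recursive call returns, which is what makes it a tail call.
    if n <= 1:
        return acc
    return factorial_tail(n - 1, n * acc)

print(factorial_tail(4, 1))  # 24
```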

In languages that optimize tail calls, this runs in O(1) stack space regardless of n. The frame for factorial_tail(4, 1) gets replaced by factorial_tail(3, 4), then factorial_tail(2, 12), and so on. One frame the entire time, rewritten at each step.

Most languages don't do this optimization, so check before relying on it.
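Since CPython performs no such optimization, the reliable way to get O(1) stack space is to do the frame reuse yourself with a loop. This is a sketch of the mechanical conversion from the accumulator version:

```python
def factorial_iter(n):
    # The loop variables play the role of the single reused frame:
    # each iteration rewrites n and acc exactly as TCO would.
    acc = 1
    while n > 1:
        acc *= n
        n -= 1
    return acc

print(factorial_iter(4))                # 24
print(factorial_iter(5000) > 0)         # runs far past the recursion limit
```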

The call stack changes how you reason about recursion. It turns "this function calls itself and somehow the right answer comes out" into a traceable sequence of frames with concrete state, pushed and popped in a predictable order. Once you can draw the stack for any recursive function, you can trace it, debug it, and walk an interviewer through it without hesitation.

Codeintuition's Recursion course starts from exactly this foundation. The first lessons cover stack frames and stack overflow before introducing any recursive patterns. Four recursion patterns follow, each with identification training so you learn when each pattern applies. The free Arrays and Singly Linked List courses cover 63 lessons and 15 patterns permanently free, building the prerequisite foundation. For the broader framework connecting recursion to trees, backtracking, and dynamic programming, see the complete trees and recursion guide.

A month ago, tracing factorial(4) by hand felt like guessing. Now you can draw four frames, label every parameter and return address, and predict the output before the code runs. That's the difference when an interviewer asks you to walk through your solution out loud.

Do you want to master data structures?

Try our data structures learning path, made of highly visual and interactive courses. Get hands-on experience by solving real problems in a structured manner. All the resources you'll ever need, in one place, for free.

Frequently asked questions

Does the call stack work differently for recursive calls?
It doesn't. The mechanism is identical. Every function call pushes a stack frame with parameters, local variables, and a return address. Recursion just means multiple frames for the same function exist simultaneously, each with independent parameter values.

Why does Python limit recursion depth?
It's a safety measure against infinite recursion consuming all available stack memory. You can change it with sys.setrecursionlimit(), but the underlying OS stack size still applies. Raising the limit without increasing the actual stack allocation risks a segfault instead of a clean error. For recursion beyond a few thousand calls, converting to an iterative approach with an explicit stack is more reliable.
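The explicit-stack conversion looks like this for factorial. It's a sketch: for factorial a plain accumulator loop is simpler, but the pattern of pushing pending work onto a heap-backed list generalizes to any recursion.

```python
def factorial_stack(n):
    # Push the pending multiplicands ourselves instead of letting the
    # call stack hold them; a Python list has no 1,000-frame limit.
    stack = []
    while n > 1:
        stack.append(n)  # what each paused frame would be waiting to multiply
        n -= 1
    result = 1
    while stack:
        result *= stack.pop()  # unwind, exactly like popping frames
    return result

print(factorial_stack(4))  # 24
```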

Can I trace recursion by hand using the call stack?
Yes. Draw one row per active call, writing the parameter values and what computation is waiting for the return value. Push a new row for each call, pop when it returns. The peak height of your drawing is the space complexity.

What's the difference between infinite recursion and a stack overflow?
Infinite recursion means the function never reaches a base case, so it keeps calling itself forever. Stack overflow is the crash that happens when the accumulated frames exceed available stack memory. You can also get a stack overflow from finite recursion if the depth exceeds the stack limit. Calling factorial(100000) in Python is a correct algorithm that overflows. The fix is reducing depth through tail recursion, increasing the stack size, or converting to iteration.

Do Python or Java optimize tail calls?
Neither implements it. Python's creator has stated this is deliberate, to preserve meaningful stack traces for debugging. Java's JVM doesn't guarantee it either, though Scala can optimize specific tail-recursive patterns through compiler transformations.

How do I determine the space complexity of a recursive function?
Count the maximum number of stack frames that can exist at the same time, not the total number of calls. For linear recursion like factorial, that equals n. For binary tree traversal, it equals the tree height h. For branching recursion like Fibonacci, it equals the depth of the deepest call chain (which is n even though total calls grow exponentially). Multiply the frame count by the size of each frame (usually treated as constant), giving you O(max depth) space.
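For branching recursion like Fibonacci, the two counts diverge sharply. A sketch with hypothetical counters added for measurement:

```python
calls = 0   # total invocations (grows exponentially)
active = 0  # frames currently alive
peak = 0    # deepest the stack ever got

def fib(n):
    global calls, active, peak
    calls += 1
    active += 1
    peak = max(peak, active)
    result = n if n < 2 else fib(n - 1) + fib(n - 2)
    active -= 1
    return result

fib(10)
print(calls, peak)  # 177 total calls, but peak depth only 10
```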