Deriving the recurrence relation
We learned about the the Bellman optimality equation and how it is a template that fits all dynamic programming problems in the previous lesson. It tells us the structure that every dynamic programming recurrence has in terms of s, A(s), c(s, a), dp(s) andT(s, a) without committing to what any of these are for any specific problem. To turn the template into a concrete recurrence relation for the problem we are solving, we have to instantiate each of the four pieces by deriving them from the problem.
The Bellman optimality equation talks in terms of s,A(s),c(s, a), dp(s) andT(s, a).
The instantiation of the Bellman equation is what is defined as recurrence design. Given we have already designed a state, the procedure is mechanical and follows five steps. After completing them, the Bellman template, with all its placeholders filled in, becomes the recurrence relation that solves the problem.
Step 1: Define what dp(s) represents
We write down, in one sentence, what the value at state s (dp(s)) represents. Precision is absolutely critical here as ambiguity here propagates into every subsequent step. Phrases like "the first i items" versus "items at index i or earlier," or "at most w" versus "exactly w," can distinguish entirely different problems.
Define precisely what dp(s) means.
Step 2: Enumerate the legal actions
To enumerate the legal actions at a state s, we ask: what choices are available and list them exhaustively and disjointly. The exact form of the action set depends on the problem. For some problems an action could be a single binary choice such as include or exclude an element. For others, it could be a choice from a range, such as a split point or a child to recurse on. For others, it could be a step in a fixed direction, such as moving forward by one index.
Liking the course? Check our discounted plans to continue learning.