Understanding the bottom-up solution to the longest common substring problem
To solve the longest common substring problem using a bottom-up dynamic programming approach, we build solutions iteratively starting from the smallest subproblems and working up to the final answer, storing results in a table as we progress. As with any bottom-up solution, we process subproblems in an order that guarantees every result we depend on has already been computed and stored before we need it.
The bottom-up solution
When implementing the recurrence relation using a top-down algorithm, i and j are indices into s1 and s2, and the base case occurs when either index drops below 0.
For the recurrence relations, if i < 0 or j < 0, the solution is 0, and this serves as the base case.
However, arrays cannot have negative indices. To handle this cleanly in the bottom-up table, we shift the meaning of i and j so that they represent the number of characters considered from each string rather than indices. This way, the base case becomes i == 0 or j == 0, which maps naturally to row 0 and column 0 of the table.
The recurrence relation itself does not change, we still check if the last characters match and either extend or terminate. The only difference is that when
i represents the number of characters considered, the last character of the prefix is at index i - 1 in the string (instead of index i). This is purely a shift in how we index into the table and the strings, not a change in logic.We can use i and j to denote the number of items considered from the start, instead of the index of the last item.
We create a 2D array lcs of dimensions (m + 1) × (n + 1), where lcs[i][j] stores the length of the longest common substring that ends at the last character of the length-i prefix of s1 and the length-j prefix of s2. All entries are initialized to 0.
Liking the course? Check our discounted plans to continue learning.