Dynamic Programming
Overlapping subproblems and optimal substructure — how DP appears in AI coding problems, from edit distance to resource allocation.
Dynamic programming has a reputation problem. It sounds academic, it shows up in competitive programming nightmares, and most people's first encounter with it involves a confusing grid of numbers that somehow produces the right answer. But the core idea is genuinely simple: if you're solving the same subproblem over and over, just save the answer the first time and look it up later.
In AI coding interviews, DP comes up more than you'd expect. Not because interviewers want you to derive obscure recurrences on a whiteboard, but because real software problems are full of optimization under constraints. Edit distance powers text diffing, spell checkers, and DNA sequence alignment. Resource allocation problems decide how to distribute work across servers. Knapsack variants show up any time you're choosing items subject to a budget. These are practical problems, and DP is often the cleanest way to solve them.
Here's the thing interviewers won't say out loud: they don't expect you to have DP solutions memorized. What they want to see is that you can recognize when a problem has DP structure, reason about states and transitions, and verify that a solution is correct. If you're working with an AI assistant, that last part matters even more, because the AI can usually generate a DP solution once you point it in the right direction. Your job is knowing which direction to point.
The two properties
Every DP problem has two properties that make it amenable to this approach. If either one is missing, DP isn't the right tool.
Overlapping subproblems means the same smaller problem gets solved multiple times during the computation. Think about computing Fibonacci numbers recursively. To get fib(5), you need fib(4) and fib(3). But fib(4) itself needs fib(3) and fib(2). So fib(3) gets computed twice. And it gets worse quickly as n grows.
Optimal substructure means the optimal solution to the big problem can be constructed from optimal solutions to smaller problems. For Fibonacci this is trivially true since fib(n) = fib(n-1) + fib(n-2). For more complex problems like shortest paths, it means the shortest path from A to C through B contains the shortest path from A to B.
Picture the recursive call tree for fib(5). fib(3), fib(2), fib(1), and fib(0) each get computed multiple times: fib(2) alone gets computed three times, and the duplication gets worse quickly as n grows. For fib(50), the naive recursion makes on the order of 1.6^50 calls (the call count grows with the golden ratio), far beyond what any machine will finish. That's where caching comes in.
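You can see the duplication directly by instrumenting the naive recursion with a call counter (a quick sketch, not part of any real solution):

```python
from collections import Counter

calls = Counter()

def fib(n):
    calls[n] += 1  # record every time this subproblem is (re)computed
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

result = fib(5)
print(result, dict(calls))
```

Even for an input this small, fib(2) is computed three times and fib(1) five times; the waste grows exponentially with n.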
Top-down: memoization
The most intuitive way to add DP to a recursive solution is memoization. You keep the same recursive structure but store results in a dictionary. Before computing anything, check if you've already solved it.
def fib(n, memo={}):
    if n <= 1:
        return n
    if n in memo:
        return memo[n]
    memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
    return memo[n]

That's it. The recursive call tree that was exploding exponentially now runs in O(n) time because each subproblem is solved exactly once. Every duplicate call becomes a dictionary lookup instead of a full recursive computation. One caveat: the mutable default argument memo={} persists across calls, which happens to be exactly the caching we want here, but it's a classic Python gotcha; the decorator approach below sidesteps it.
Top-down is nice because it feels natural. You write the recursion first, then add caching. It also has the advantage of only solving subproblems that are actually needed, which can matter when the state space is sparse. If your problem only touches a fraction of possible states, memoization avoids computing the rest.
In Python, you can also use @functools.lru_cache as a decorator instead of managing the dictionary yourself. It's cleaner for interview code: just slap @lru_cache(maxsize=None) on the function and the caching happens automatically. Make sure the function arguments are hashable though, so no lists or dicts as parameters.
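A minimal sketch of that decorator approach (on Python 3.9+, @functools.cache is an equivalent spelling):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Same naive recursion as before; the decorator adds the cache.
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(50))  # 12586269025, instantly; the uncached version would never finish
```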
Bottom-up: tabulation
The alternative is to flip the computation around. Instead of starting from the big problem and recursing down, start from the smallest subproblems and build up. You fill in a table iteratively.
def fib(n):
    if n <= 1:
        return n
    dp = [0] * (n + 1)
    dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]
    return dp[n]

Same result, same O(n) time complexity, but no recursion. You don't risk hitting Python's recursion limit, and iterative code is generally faster due to lower function call overhead.
So when should you use which? Top-down is easier to write when the recurrence is complex or the state space is irregular. You just write the recursion and add a cache. Bottom-up is better when you know you'll need all subproblems anyway, or when you want to optimize space (more on that shortly). In an interview, start with whatever feels more natural for the problem. You can always convert later.
Interviewers don't have a strong preference between top-down and bottom-up. Pick whichever one you can explain clearly. If you're unsure, top-down is usually easier to get right on the first attempt because the recursive structure maps directly to the problem definition. Bottom-up requires you to think about traversal order upfront, which is one more thing that can go wrong.
Edit distance: the classic DP problem
If there's one DP problem that shows up in AI coding interviews more than any other, it's edit distance (also called Levenshtein distance). It computes the minimum number of single-character operations needed to transform one string into another. The allowed operations are insert, delete, and replace.
Why does this matter outside of textbook exercises? Text diffing tools use edit distance to compute the minimal set of changes between two files. Spell checkers use it to suggest corrections. Fuzzy search uses it to match approximate queries. DNA sequence alignment is basically edit distance on a different alphabet. These are exactly the kinds of problems that show up in multi-file AI coding challenges.
The recurrence is straightforward once you see it. Let dp[i][j] be the edit distance between the first i characters of string s and the first j characters of string t.
- If s[i-1] == t[j-1], the characters match and no operation is needed: dp[i][j] = dp[i-1][j-1]
- Otherwise, take the minimum of three options:
  - Replace: dp[i-1][j-1] + 1
  - Delete from s: dp[i-1][j] + 1
  - Insert into s: dp[i][j-1] + 1
The base cases are also intuitive. Transforming an empty string to a string of length j requires j insertions, so dp[0][j] = j. Transforming a string of length i to empty requires i deletions, so dp[i][0] = i.
Here's the complete DP table for transforming "KATE" into "SAT". Each cell dp[i][j] holds the edit distance between the first i characters of "KATE" (rows) and the first j characters of "SAT" (columns):

          ""   S   A   T
     ""    0   1   2   3
     K     1   1   2   3
     A     2   2   1   2
     T     3   3   2   1
     E     4   4   3   2

One optimal path runs from dp[0][0] to dp[4][3]: replace K with S (diagonal move, +1), match A and T (two free diagonal moves), then delete E (move down, +1).
The answer is 2: replace K with S, then delete E. (Or equivalently, delete K, delete E, insert S. Multiple optimal paths can exist.)
Here's the implementation:
def edit_distance(s, t):
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(
                    dp[i - 1][j - 1],
                    dp[i - 1][j],
                    dp[i][j - 1],
                )
    return dp[m][n]

Time complexity is O(mn), space complexity is O(mn). For the kinds of string sizes you'll see in an interview, this is perfectly fine. But if you're working on a diff tool processing large files, the space can matter.
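A common follow-up is to recover the actual operations, not just their count. One way is to backtrack through the finished table, asking which neighbor each cell came from. This is a sketch: tie-breaking between equally good paths is arbitrary, and the operation strings are just illustrative.

```python
def edit_ops(s, t):
    # Build the full dp table first (same recurrence as above).
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(dp[i - 1][j - 1], dp[i - 1][j], dp[i][j - 1])
    # Walk back from dp[m][n] to dp[0][0], recording one operation per +1 step.
    ops, i, j = [], m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and s[i - 1] == t[j - 1] and dp[i][j] == dp[i - 1][j - 1]:
            i, j = i - 1, j - 1  # characters match: free diagonal move
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            ops.append(f"replace {s[i - 1]} with {t[j - 1]}")
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops.append(f"delete {s[i - 1]}")
            i -= 1
        else:
            ops.append(f"insert {t[j - 1]}")
            j -= 1
    return ops[::-1]

print(edit_ops("KATE", "SAT"))  # ['replace K with S', 'delete E']
```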
When an interviewer gives you a problem involving "minimum changes," "closest match," or "how similar are these two sequences," your first instinct should be edit distance or some variant. The recurrence might change slightly (maybe insertions and deletions have different costs, or maybe you only allow certain operations), but the structure is almost always the same 2D DP table.
Space optimization: the rolling array
Look at the edit distance recurrence again. Each cell dp[i][j] only depends on three cells: dp[i-1][j-1], dp[i-1][j], and dp[i][j-1]. That means you only ever need the current row and the previous row. You can throw away everything else.
This is the rolling array technique, and it reduces space from O(mn) to O(min(m, n)).
def edit_distance_optimized(s, t):
    if len(s) < len(t):
        s, t = t, s
    m, n = len(s), len(t)
    prev = list(range(n + 1))
    curr = [0] * (n + 1)
    for i in range(1, m + 1):
        curr[0] = i
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                curr[j] = prev[j - 1]
            else:
                curr[j] = 1 + min(prev[j - 1], prev[j], curr[j - 1])
        prev, curr = curr, prev
    return prev[n]

The swap at the end is a common trick. Rather than copying curr into prev, just swap the references. The result ends up in prev after the swap because the last iteration's curr becomes prev.
You won't always need this optimization in an interview, but mentioning it shows you understand the memory implications of your solution. If the interviewer asks "what if the strings are 100,000 characters each?", you have an answer ready.
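If the interviewer pushes further, the two rows can be squeezed into a single array plus one scalar that remembers the diagonal value before it gets overwritten. A sketch of that refinement:

```python
def edit_distance_one_row(s, t):
    # Keep the shorter string as the row to minimize space.
    if len(s) < len(t):
        s, t = t, s
    n = len(t)
    row = list(range(n + 1))  # starts out as dp[0][*]
    for i in range(1, len(s) + 1):
        diag = row[0]  # dp[i-1][j-1] before we overwrite it
        row[0] = i
        for j in range(1, n + 1):
            above = row[j]  # dp[i-1][j], still from the previous row
            if s[i - 1] == t[j - 1]:
                row[j] = diag
            else:
                row[j] = 1 + min(diag, above, row[j - 1])
            diag = above  # slide the saved diagonal forward
    return row[n]

print(edit_distance_one_row("KATE", "SAT"))  # 2
```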
Recognizing DP problems
Pattern recognition is half the battle. Here are the signals that a problem has DP structure:
The problem asks for an optimum. Minimum cost, maximum profit, longest subsequence, shortest path. When you see words like "minimum," "maximum," "longest," "shortest," or "number of ways," think DP.
Choices compound. Each decision affects future options. You're not making independent choices; you're making a sequence of decisions where the best next move depends on what you've already done.
You can define the state. If you can describe the problem as "given that I've processed the first i items and used j resources, what's the best I can do?", you've found your DP state. The recurrence follows naturally from how you transition between states.
Brute force is exponential. If the naive approach tries all possible combinations and the problem size suggests this would be way too slow, there's probably overlap in the subproblems you can exploit.
Not every optimization problem is DP though. If the problem has a greedy property where the locally optimal choice is always globally optimal, you don't need DP. The difference is whether you need to consider multiple options at each step (DP) or whether one option is always dominant (greedy).
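Coin change is a compact illustration of the difference. With denominations [1, 3, 4] and a target of 6, greedy grabs the 4 first and needs three coins, while DP finds 3 + 3. A sketch of both (the greedy variant assumes a 1-coin exists so it always terminates):

```python
def min_coins_dp(coins, amount):
    # DP considers every coin at every amount.
    INF = float("inf")
    dp = [0] + [INF] * amount
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a and dp[a - c] + 1 < dp[a]:
                dp[a] = dp[a - c] + 1
    return dp[amount]

def min_coins_greedy(coins, amount):
    # Greedy always grabs the largest coin that fits.
    count = 0
    for c in sorted(coins, reverse=True):
        count += amount // c
        amount %= c
    return count

print(min_coins_greedy([1, 3, 4], 6))  # 3 coins: 4 + 1 + 1
print(min_coins_dp([1, 3, 4], 6))      # 2 coins: 3 + 3
```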
Communicating your approach
When you identify a DP problem in an interview, walk through these steps out loud:
1. Define the state. What does dp[i] (or dp[i][j]) represent? This is the most important step. If your state definition is wrong, everything that follows will be wrong too.
2. Write the recurrence. How does dp[i][j] relate to smaller subproblems? What choices do you have at each step?
3. Identify base cases. What are the trivially solvable subproblems? For edit distance, it's the empty string cases. For Fibonacci, it's fib(0) and fib(1).
4. Determine traversal order. If you're going bottom-up, in what order do you fill the table so that every cell's dependencies are already computed?
5. State the complexity. Time is usually the number of states multiplied by the work per state. Space is the size of your table (or less, with rolling arrays).
Narrating these steps demonstrates structured thinking, which is exactly what interviewers are evaluating. Even if you get stuck on the recurrence, the interviewer can see that you know how to approach the problem.
You don't need to get this right on the first try. It's perfectly fine to say "I think the state needs an extra dimension" or "wait, that recurrence doesn't account for this case." Interviewers expect iteration. What they don't want to see is silent confusion or blind guessing.
Working with AI on DP problems
DP is an area where AI assistants genuinely shine. Once you've identified the state and the general structure, the AI is typically very good at generating correct recurrences, handling base cases, and writing clean iterative implementations. This is great news because DP code is notoriously easy to get subtly wrong when writing from scratch under time pressure.
But there's a catch. AI-generated DP solutions can have bugs that are hard to spot, and the consequences of those bugs are often wrong answers on specific inputs rather than crashes. Here's what to verify:
Base cases. This is where most DP bugs live. Off-by-one errors in the initialization are extremely common. Check that dp[0], dp[0][0], or whatever your base case is actually represents what you think it does. Run through a tiny example by hand.
Recurrence correctness. Read each transition carefully. Does the AI's recurrence actually match the problem? Sometimes the AI will generate a recurrence for a similar but different problem, like computing longest common substring when you asked for longest common subsequence.
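The subsequence-vs-substring mix-up is concrete enough to show side by side. The recurrences differ in exactly one branch, which is easy to miss when skimming generated code. A sketch of both:

```python
def lcs_subsequence(s, t):
    # Longest common subsequence: on mismatch, carry the best so far forward.
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def lcs_substring(s, t):
    # Longest common contiguous substring: on mismatch, the run resets to 0,
    # and the answer is the max over all cells rather than the corner.
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
                best = max(best, dp[i][j])
    return best

print(lcs_subsequence("ABCBDAB", "BDCABA"))  # 4 (e.g. "BCAB")
print(lcs_substring("ABCBDAB", "BDCABA"))    # 2 (e.g. "AB")
```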
Traversal order. In bottom-up solutions, cells must be filled in an order such that all dependencies are already computed. This is usually left-to-right, top-to-bottom, but for some problems it's different. The AI almost always gets this right, but double check when the dependency pattern is unusual.
Return value. Is the answer at dp[n], dp[m][n], max(dp), or something else? Easy to mix up, especially if the AI defined the state slightly differently than you expected.
AI assistants are good at generating the recurrence once you identify the state, but you need to verify the base cases and transitions yourself. A common failure mode is the AI producing a solution that passes most test cases but fails on edge cases like empty inputs, single-element inputs, or inputs where the optimal answer is zero. Always trace through the smallest non-trivial example by hand before trusting the solution.
A good workflow for DP problems with AI assistance:
- Describe the state and recurrence in plain English in your prompt. Don't just say "solve this with DP." Say "use a 2D DP table where dp[i][j] represents the minimum cost to align the first i characters of s with the first j characters of t."
- Let the AI generate the implementation.
- Trace through a small example yourself. Does dp[1][1] have the right value? What about dp[0][j] and dp[i][0]?
- Run the code on the provided test cases plus at least one edge case you came up with.
- Optimize if needed (rolling array, early termination, etc.).
This approach is faster than writing the DP from scratch and safer than blindly trusting the AI output. The interviewer sees you directing the solution strategy, using the tool efficiently, and verifying the result, which is exactly the workflow they're evaluating.
Common DP patterns in AI coding interviews
Beyond edit distance, a few DP patterns show up frequently in the multi-file, real-world-flavored problems you'll see in AI coding interviews:
Longest common subsequence (LCS). Very similar to edit distance in structure. Used for diffing, version control, and merge conflict resolution. The state is the same 2D table, but the recurrence tracks matches instead of edits.
Knapsack variants. Given items with weights and values and a capacity constraint, maximize value. Shows up whenever there's resource allocation: distributing tasks across workers, selecting features within a budget, or optimizing cache eviction policies.
Interval scheduling / weighted job scheduling. Given jobs with start times, end times, and profits, find the maximum profit subset of non-overlapping jobs. This combines DP with sorting and binary search. It's a natural fit for calendar or scheduling systems.
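A sketch of that combination, assuming jobs are (start, end, profit) tuples and a job may start exactly when another ends:

```python
import bisect

def max_profit(jobs):
    # dp[k] = best profit using only the first k jobs, sorted by end time.
    jobs = sorted(jobs, key=lambda j: j[1])
    ends = [j[1] for j in jobs]
    dp = [0] * (len(jobs) + 1)
    for k, (start, end, profit) in enumerate(jobs, 1):
        # Binary search: rightmost job that ends at or before this one starts.
        prev = bisect.bisect_right(ends, start, 0, k - 1)
        # Either skip this job, or take it plus the best compatible prefix.
        dp[k] = max(dp[k - 1], dp[prev] + profit)
    return dp[len(jobs)]

print(max_profit([(1, 3, 50), (2, 5, 20), (4, 6, 70), (6, 7, 30)]))  # 150
```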
Sequence alignment. A generalization of edit distance where different operations have different costs. Used in bioinformatics but also in matching user input to template patterns.
For each of these, the approach is the same: define the state, write the recurrence, handle base cases, and verify. The specific recurrence changes but the methodology doesn't.
If you're short on prep time, focus on edit distance and knapsack. Those two cover the core mechanics (2D table, multiple choices per state, space optimization) and the patterns transfer to most other DP problems you'll encounter. If you can explain and implement both from scratch, you're in good shape.
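For reference, the classic 0/1 knapsack fits in a few lines once you know the one subtlety: iterate capacity downward so each item is used at most once. A sketch:

```python
def knapsack(weights, values, capacity):
    # dp[c] = best value achievable with total weight at most c.
    dp = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # Downward loop: dp[c - w] still refers to the state *without* this
        # item, so no item can be counted twice.
        for c in range(capacity, w - 1, -1):
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[capacity]

print(knapsack([2, 3, 4], [3, 4, 5], 5))  # 7: take the weight-2 and weight-3 items
```

Iterating capacity upward instead would turn this into the unbounded knapsack, where each item can be taken any number of times; that direction mix-up is one of the most common knapsack bugs.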
Pitfalls to watch for
A few things that trip people up specifically in the AI coding interview context:
Over-engineering the state. Sometimes candidates (or their AI assistants) define DP states with too many dimensions. If your state is dp[i][j][k][l], pause and ask whether all those dimensions are necessary. More dimensions means more complexity, more base cases to get right, and more room for bugs.
Forgetting that DP isn't always the answer. Not every optimization problem needs DP. If the problem has greedy structure, use greedy. If it's a graph problem, use graph algorithms. DP is a specific tool for a specific class of problems. Reaching for it reflexively makes you look like you only have one hammer.
Not testing edge cases. Empty inputs, single elements, inputs where all values are the same, inputs where the answer is zero. These are where DP bugs hide. The AI will often produce code that works beautifully on the happy path and silently returns the wrong answer on edge cases.
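A tiny assertion suite over exactly those cases takes a minute to write and catches most of them. Here's a sketch using the full-table edit distance from earlier, repeated so it runs standalone:

```python
def edit_distance(s, t):
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(dp[i - 1][j - 1], dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

# The edge cases where DP bugs hide.
assert edit_distance("", "") == 0      # both empty
assert edit_distance("", "abc") == 3   # pure insertions
assert edit_distance("abc", "") == 3   # pure deletions
assert edit_distance("a", "a") == 0    # answer is zero
assert edit_distance("a", "b") == 1    # single replace
print("all edge cases pass")
```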
Skipping the explanation. In an AI coding interview, producing correct code is necessary but not sufficient. If you can't explain why the recurrence works, the interviewer has no way to evaluate your understanding. They might even suspect you just got lucky with a good AI prompt. Walk through the logic, even if it takes an extra minute.
Dynamic programming is one of those topics where the gap between "I've seen this before" and "I understand it" feels huge. But the core mechanic, caching repeated work, really is that simple. What makes DP problems hard is defining the right state and recurrence for each specific problem, and that's a skill that improves quickly with practice. You don't need to be a DP wizard going into the interview. You need to know the pattern, be able to recognize it, and work through the specifics with confidence.