Search
⌘K
Common Patterns
Topological Sort
Ordering tasks with dependencies using Kahn's algorithm and DFS — a pattern that appears whenever problems involve build systems, scheduling, or prerequisites.
Dependency problems are everywhere in software. Build systems need to compile files in the right order. Package managers need to install libraries before the things that depend on them. Course registration systems need to enforce prerequisites. Task schedulers need to know what can run in parallel and what has to wait. Spreadsheet engines need to recalculate cells in the right order when formulas reference other cells.
When one of these problems shows up in an AI coding interview, the underlying structure is almost always the same: you have a directed acyclic graph, and you need to find a valid ordering. That's topological sort.
You don't need to walk into the interview with this algorithm memorized. Interviewers running AI-enabled coding sessions aren't testing recall — they're testing whether you can recognize the shape of a problem, reach for the right tool, and work with the AI to implement it correctly. If you've never heard of topological sort before reading this page, that's fine. What matters is that after reading it, you can spot when it applies and guide an AI through the implementation.
The good news is that topological sort is one of those patterns where the concept is more important than the code. Once you understand what it does and why it works, the implementation is fairly mechanical. And mechanical implementations are exactly where AI assistants shine — you just need to give them the right direction.
What is a topological ordering?
A topological ordering is a linear arrangement of nodes in a directed acyclic graph (DAG) where for every directed edge from node A to node B, A appears before B in the ordering. That's it. You're just lining things up so that nothing comes before its dependencies.
Think about a build system. You've got six modules, and some of them depend on others. Module D needs modules B and C to be built first. Module B needs module A. You can't just build them in any random order — you need an ordering that respects all the dependency arrows.
One valid topological ordering here is A, B, C, D, E. Another is A, C, B, D, E. Both are correct — topological sort doesn't produce a unique answer unless the graph happens to have only one valid path. The key constraint is just that every node appears after all of its dependencies.
This is worth internalizing because it comes up in interviews. If the problem says "return any valid ordering," you don't need to worry about which specific ordering your algorithm produces. If it says "return the lexicographically smallest ordering," that's a different problem — you'd use a min-heap instead of a regular queue in Kahn's algorithm. Read the requirements carefully.
The "acyclic" part is crucial. If module A depends on B and B depends on A, there's no valid ordering. Cycles make topological sort impossible. We'll come back to that.
Kahn's Algorithm (BFS approach)
Kahn's algorithm is the most intuitive way to think about topological sort. The idea is simple: find everything that has no dependencies, process it, then remove it from the graph and repeat. It's exactly what you'd do if you were manually figuring out a build order on a whiteboard.
Here's the step-by-step:
- Calculate the in-degree (number of incoming edges) for every node
- Add all nodes with in-degree 0 to a queue — these have no dependencies and are safe to process first
- Pull a node from the queue, add it to the result
- For each neighbor of that node, decrement its in-degree. If the in-degree hits 0, add it to the queue
- Repeat until the queue is empty
If you processed all nodes, you have a valid topological ordering. If some nodes are left over, there's a cycle — we'll get to that.
The "in-degree" terminology sounds academic, but the concept is dead simple: it's just the count of how many things a node depends on. When that count reaches zero, all dependencies are satisfied and the node is ready to go.
Here's the Python implementation:
from collections import deque
def topological_sort_kahn(num_nodes, edges):
adj = [[] for _ in range(num_nodes)]
in_degree = [0] * num_nodes
for src, dst in edges:
adj[src].append(dst)
in_degree[dst] += 1
queue = deque()
for node in range(num_nodes):
if in_degree[node] == 0:
queue.append(node)
result = []
while queue:
node = queue.popleft()
result.append(node)
for neighbor in adj[node]:
in_degree[neighbor] -= 1
if in_degree[neighbor] == 0:
queue.append(neighbor)
if len(result) != num_nodes:
return None
return resultThe time complexity is O(V + E) where V is vertices and E is edges. You visit every node once and process every edge once. Space is also O(V + E) for the adjacency list.
Notice how naturally Kahn's algorithm detects cycles. If the graph has a cycle, those nodes will never reach in-degree 0, so they'll never enter the queue. You just check whether your result contains all nodes at the end. If it doesn't, there's a cycle somewhere. This is one of the cleanest things about Kahn's — cycle detection isn't an extra step, it falls out of the algorithm for free.
If you see words like "prerequisites", "dependencies", "ordering", or "before/after" in a problem description, your pattern-matching instinct should immediately flag topological sort.
DFS-based topological sort
There's another approach that uses depth-first search. Instead of peeling off nodes with no dependencies from the front, you recurse as deep as possible and add nodes to the result on your way back up. The result comes out in reverse order, so you reverse it at the end (or prepend to a list, or use a stack).
The intuition: in a DFS, a node's recursive call finishes only after all its descendants have finished. So if you record nodes in the order they finish, you get a reverse topological ordering.
def topological_sort_dfs(num_nodes, edges):
adj = [[] for _ in range(num_nodes)]
for src, dst in edges:
adj[src].append(dst)
UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
state = [UNVISITED] * num_nodes
result = []
def dfs(node):
if state[node] == IN_PROGRESS:
return False
if state[node] == DONE:
return True
state[node] = IN_PROGRESS
for neighbor in adj[node]:
if not dfs(neighbor):
return False
state[node] = DONE
result.append(node)
return True
for node in range(num_nodes):
if state[node] == UNVISITED:
if not dfs(node):
return None
result.reverse()
return resultThe three-state tracking (UNVISITED, IN_PROGRESS, DONE) is doing double duty here. It prevents revisiting nodes we've already fully processed, and it detects cycles: if we encounter a node that's IN_PROGRESS, we've found a back edge, which means there's a cycle.
Walk through a small example to see why this works. Starting from node A, we mark it IN_PROGRESS and recurse into its neighbors. When we finish all of A's descendants, we mark A as DONE and append it to the result. Because A finishes after everything reachable from A, it ends up later in the result list. Reversing at the end puts it first — exactly where a node with no dependencies belongs.
Kahn's vs. DFS — when to reach for which
Both run in O(V + E) time. Both produce valid topological orderings. The differences are practical, not theoretical.
Kahn's is usually easier to explain and implement correctly. The BFS structure means you process nodes level by level, which maps naturally to "rounds" of work — handy if the problem also asks about parallelism (which tasks can run simultaneously?). It also gives you cycle detection for free without extra bookkeeping.
DFS is more natural if you're already doing graph traversal for other reasons, or if the problem involves path-finding alongside ordering. It's also slightly more compact to write, though the three-state cycle detection adds some mental overhead.
In an interview, either one works. Pick whichever you're more comfortable explaining. If you're not sure, Kahn's tends to be easier to walk through with an interviewer because the queue-based approach is more visual.
Cycle detection
If the graph isn't a DAG — if it has cycles — topological sort is impossible. No valid ordering exists. Think about it: if A depends on B and B depends on A, which one goes first? Neither can. There's no way to break the deadlock.
Detecting cycles is part of most dependency-resolution problems, either explicitly ("report an error if there are circular dependencies") or implicitly (the algorithm just needs to handle it gracefully). In real-world systems, circular dependencies are bugs — package managers refuse to install, build systems throw errors, and compilers complain. In interview problems, cycle detection is often the twist that separates a complete solution from a partial one.
With Kahn's algorithm, cycle detection is built in. If len(result) < num_nodes after the algorithm finishes, some nodes were never added to the queue because their in-degree never reached 0. Those nodes are part of one or more cycles.
With DFS, you detect cycles by finding back edges — encountering a node that's currently in the IN_PROGRESS state during traversal.
If the problem specifically asks you to find the cycle (not just detect it), here's a quick way to extract it:
def find_cycle(num_nodes, edges):
adj = [[] for _ in range(num_nodes)]
for src, dst in edges:
adj[src].append(dst)
UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
state = [UNVISITED] * num_nodes
parent = [-1] * num_nodes
def dfs(node):
state[node] = IN_PROGRESS
for neighbor in adj[node]:
if state[neighbor] == IN_PROGRESS:
cycle = [neighbor, node]
curr = node
while curr != neighbor:
curr = parent[curr]
cycle.append(curr)
cycle.reverse()
return cycle
if state[neighbor] == UNVISITED:
parent[neighbor] = node
result = dfs(neighbor)
if result:
return result
state[node] = DONE
return None
for node in range(num_nodes):
if state[node] == UNVISITED:
cycle = dfs(node)
if cycle:
return cycle
return NoneMost interview problems don't require extracting the full cycle, but knowing it's possible shows depth of understanding if the interviewer pushes you on edge cases. It's also a good thing to mention proactively: "If we detect a cycle, we could extend this to report which nodes are involved." Even if the interviewer doesn't ask you to implement it, flagging that you've thought about it is a strong signal.
When prompting the AI for cycle detection, be explicit about what should happen when a cycle is found. "Return None" is different from "raise an exception" is different from "return the nodes involved in the cycle." Ambiguity here leads to AI output that technically works but doesn't match what the problem requires.
Common variations you'll encounter
Pure "give me a topological ordering" is rare in real interviews. The pattern shows up in disguise, and often the topological sort is just one piece of a larger problem. Here are the variations that come up most often.
Parallel task scheduling. Given tasks with dependencies and durations, find the minimum total time to complete everything assuming unlimited parallelism. This is the critical path problem. You run topological sort, then do a forward pass computing the earliest start time for each task (the maximum completion time among all its dependencies). The answer is the maximum completion time across all tasks. Kahn's is especially natural here because its level-by-level processing maps directly to "rounds" of parallel execution.
Course scheduling with semesters. A variant of parallel scheduling: given course prerequisites and a limit on courses per semester, find the minimum number of semesters. Same idea as above, but you cap the number of nodes you process per "level" in Kahn's algorithm.
Build order with string names. Instead of numbered nodes, you get module names like "auth", "database", "api". The graph is the same — you just need a mapping from names to indices (or use a dictionary-based adjacency list). Don't overthink this part. It's bookkeeping, not algorithm design, and it's exactly the kind of thing the AI handles well.
Alien dictionary. Given a sorted list of words in an unknown alphabet, determine the character ordering. You derive edges by comparing adjacent words character by character, then topological sort the characters. This one is tricky because the graph construction is the hard part, not the sort itself. A good prompt to the AI would describe the edge-extraction logic explicitly: "Compare each adjacent pair of words, find the first differing character, and create a directed edge from the character in the first word to the character in the second word."
Detecting if a unique ordering exists. If at any point during Kahn's algorithm the queue contains more than one node, multiple valid orderings exist. If the queue always has exactly one node, the ordering is unique. This is a common follow-up question.
Longest path in a DAG. Process nodes in topological order, and for each node, update the distances of its neighbors. Unlike shortest path algorithms that work on general graphs, longest path in a DAG is solvable in O(V + E) precisely because topological ordering lets you process each node exactly once, after all paths leading to it are finalized.
The unifying theme across all these variations is the same: model the problem as a directed graph, sort it topologically, then do whatever computation you need in that order. The sort is rarely the whole answer — it's the scaffolding that makes the rest of the answer efficient.
What interviewers expect
Topological sort problems in AI-enabled interviews usually don't announce themselves as "implement topological sort." They show up as build systems, task schedulers, dependency installers, course planners, or compilation pipelines. The first thing the interviewer is evaluating is whether you recognize the underlying structure.
When you see a problem that involves ordering things based on dependencies, say it out loud: "This looks like a dependency graph — I'd model it as a DAG and use topological sort to find a valid ordering." That one sentence tells the interviewer you've identified the pattern, and it gives them confidence you're heading in the right direction.
From there, the interviewer wants to see you think through a few things before writing code:
- Representation: How will you model the graph? Adjacency list is almost always the right call. Mention it explicitly.
- Edge cases: What if there are cycles? What if a node has no dependencies at all? What if the graph is disconnected?
- Algorithm choice: Kahn's or DFS? Either is fine, but have a reason. "I'll use Kahn's because it naturally gives me cycle detection and the level-by-level processing could be useful if we need to identify parallelizable tasks."
When prompting the AI, be specific about the graph structure. Don't say "sort these tasks." Say something like "Build an adjacency list from the dependency pairs, compute in-degrees, and run Kahn's algorithm to produce a topological ordering. Return None if there's a cycle." The more precise your prompt, the less cleanup you'll need afterward.
Interviewers don't expect you to have topological sort memorized. They expect you to recognize when a problem is fundamentally about dependency ordering, choose an appropriate approach, and implement it correctly — with or without the AI's help. Saying "I haven't implemented this recently, but the idea is to repeatedly process nodes with no remaining dependencies" is a perfectly strong signal.
Something that trips people up in AI-enabled interviews specifically: the AI will almost always produce a correct topological sort if you ask for one. The interview isn't about whether the code runs. It's about whether you understand why you're using topological sort, whether you can explain the approach to the interviewer, and whether you handle the graph construction correctly. The AI won't know the right way to parse the problem's input into edges unless you tell it. That translation from problem domain to graph structure is what the interviewer is really watching.
Also worth being deliberate about: verify the AI's output makes sense. If it generates a topological sort that processes a node before its dependency, that's a bug you should catch before running the code. Trace through two or three nodes in the ordering and confirm the dependency arrows all point forward. This kind of quick sanity check takes ten seconds and signals to the interviewer that you're not blindly trusting the AI.
One more thing worth calling out: many dependency problems have a second part. After finding the ordering, you might need to compute the longest path (critical path in a project schedule), determine which tasks can run in parallel, or propagate values through the DAG. Topological sort gives you the foundation — the ordering that makes these downstream computations straightforward, because you're always processing nodes after their dependencies are resolved.
Kahn's vs. DFS — the key tradeoff. Kahn's algorithm gives you one valid ordering and naturally detects cycles (just check if all nodes were processed). DFS also gives you a valid ordering but requires explicit three-state tracking for cycle detection. In practice, Kahn's is easier to reason about for most dependency-resolution problems, while DFS is more natural when you're already traversing the graph for other purposes.
Quick reference
When you're in the interview and you suspect topological sort, here's the mental checklist:
- Confirm it's a DAG problem. Are there directed dependencies? Is ordering the goal? If yes, you're in the right place.
- Build the adjacency list. Parse whatever input format the problem gives you into a directed graph.
- Pick your algorithm. Kahn's for most cases. DFS if you're already doing graph traversal or need reverse post-order for another reason.
- Handle cycles. Either the problem guarantees no cycles (read carefully) or you need to detect and report them.
- Think about what comes next. Topological sort is often step one. The real problem might be parallel scheduling, critical path analysis, or value propagation through the sorted order.
The pattern recognition is the hard part. Once you've identified that a problem is topological sort, the implementation is mechanical — and that's exactly the kind of thing an AI assistant handles well when you give it a clear, specific prompt.
A common mistake in interviews is jumping straight to code without establishing the graph structure first. Before you or the AI write a single line, make sure you can answer: what are the nodes, what are the edges, and where does the input data come from? Getting the graph construction wrong will silently produce a wrong ordering, and debugging a wrong topological sort is much harder than debugging a wrong graph.
Mark as read
Reading Progress
On This Page
Schedule a mock interview
Meet with a FAANG senior+ engineer or manager and learn exactly what it takes to get the job.

Your account is free and you can post anonymously if you choose.