Leetcode 126. Word Ladder II
Find all shortest sequences transforming beginWord to endWord by changing one letter at a time using only words from wordList; treat words as nodes in an unweighted graph (edges connect one-letter differences), use BFS to discover shortest distances/level graph and then backtrack/DFS to enumerate all shortest transformation paths.
Asked at:
Meta
Bloomberg
DESCRIPTION
Find all shortest sequences transforming beginWord to endWord by changing one letter at a time using only words from wordList; treat words as nodes in an unweighted graph (edges connect one-letter differences), use BFS to discover shortest distances/level graph and then backtrack/DFS to enumerate all shortest transformation paths.
Input:
beginWord = 'hit', endWord = 'cog', wordList = ['hot','dot','dog','lot','log','cog']
Output:
[['hit','hot','dot','dog','cog'], ['hit','hot','lot','log','cog']]
Explanation: There are two shortest transformation sequences of length 5. Both change one letter at a time and use only words from wordList.
Constraints:
- 1 <= beginWord.length <= 10
- endWord.length == beginWord.length
- 1 <= wordList.length <= 5000
- wordList[i].length == beginWord.length
- All strings consist of lowercase English letters
- beginWord != endWord
- All words in wordList are unique
Understanding the Problem
Let's understand what we're being asked to do. We have a beginWord like 'hit', an endWord like 'cog', and a wordList like ['hot','dot','dog','lot','log','cog']. We need to find all shortest transformation sequences from 'hit' to 'cog'.
A transformation means changing exactly one letter at a time, and each intermediate word must exist in the wordList. For example: 'hit' -> 'hot' -> 'dot' -> 'dog' -> 'cog' is valid (each step changes one letter).
Important constraint: We want ALL shortest paths, not just one! If there are multiple ways to reach endWord in the minimum number of steps, we need to return all of them.
Key insight: Words are like nodes in a graph, and edges connect words that differ by exactly one letter. The problem becomes: find all shortest paths in an unweighted graph.
Edge cases to consider: What if endWord is not in wordList? (return empty list). What if no transformation sequence exists? What if beginWord equals endWord?
Brute Force Approach
The brute force approach uses pure DFS to explore all possible transformation paths from beginWord to endWord. Starting from beginWord, recursively try transforming to every word in wordList that differs by one letter, marking words as visited to avoid cycles. When endWord is reached, record the path. After exploring all paths, filter to keep only the shortest ones. This approach explores many unnecessary longer paths before finding the optimal length.
Start word ladder algorithm
0 / 317
Test Your Knowledge
Login to take the complexity quiz and track your progress
Complexity Analysis
Time Complexity: O(N^N) In the worst case, we explore all possible permutations of words, leading to exponential time complexity where N is the number of words
Space Complexity: O(N) Recursion stack depth is at most N words in the longest path, plus space for tracking visited words
Building Intuition
In an unweighted graph, BFS discovers nodes level by level, guaranteeing that the first time we reach a node, we've found the shortest distance to it.
Once we know the shortest distance to each word, we can work backwards from endWord to beginWord, only following edges that decrease the distance by 1. This ensures we only traverse shortest paths.
A naive DFS approach would explore all possible paths, including many longer-than-optimal ones, leading to exponential time complexity.
By using BFS first to build a distance map, then DFS to enumerate only shortest paths, we avoid exploring any path that's longer than necessary. This two-phase approach (BFS + DFS) is much more efficient.
Think of this like finding all shortest routes between two cities on a map:
Phase 1 (BFS): Measure the shortest distance from the start city to every other city. Mark each city with its distance number.
Phase 2 (DFS/Backtracking): Starting from the destination, work backwards. At each step, only move to cities whose distance number is exactly 1 less than your current city. This guarantees you're always on a shortest path.
The level-by-level exploration of BFS is what makes this work - it's like ripples spreading outward from a stone dropped in water.
Common Mistakes
Optimal Solution
The optimal approach uses a two-phase strategy: First, run BFS from beginWord to build a distance map showing the shortest distance to each word. This creates a level graph where each word knows its distance from the start. Second, use DFS/backtracking starting from endWord, working backwards to beginWord. At each step, only follow edges to words whose distance is exactly 1 less, ensuring we only traverse shortest paths. This guarantees we enumerate all shortest transformation sequences without exploring longer paths.
Start Word Ladder II: Find all shortest transformation paths
0 / 2298
Test Your Knowledge
Login to take the complexity quiz and track your progress
Complexity Analysis
Time Complexity: O(N * L^2 + P) BFS phase: O(N * L^2) where N is number of words and L is word length (checking all one-letter differences). DFS phase: O(P) where P is the total number of nodes in all shortest paths. Building the graph requires comparing words which takes O(L) per comparison.
Space Complexity: O(N * L) We store the graph adjacency list (O(N * L) for all edges), distance map (O(N)), and recursion stack for DFS (O(L) depth for path length)
What We've Learned
- BFS for shortest path discovery + DFS for path enumeration: When you need to find ALL shortest paths (not just one), use BFS first to build a level/distance map from the target, then backtrack with DFS using only edges that decrease distance by 1 - this two-phase approach guarantees you only explore shortest paths without redundant work.
- Bidirectional BFS optimization: Build the graph structure using BFS from endWord backwards to compute distances to all reachable words - this prunes the search space early by identifying which words can actually reach the target, avoiding exploration of dead-end branches during path reconstruction.
- Level-based graph construction: Treat the word transformation as an unweighted graph where BFS naturally discovers shortest distances - store each word's distance/level from the target, then during DFS only follow edges to words at level-1, ensuring all reconstructed paths have identical optimal length.
- Neighbor generation vs pre-built adjacency list: For word ladder problems, generating neighbors on-the-fly (try all 26 letters at each position and check wordList membership with a Set) is often faster than pre-computing all edges, especially when the word list is large but words are short - O(26*L) per word beats O(N²*L) preprocessing.
- Backtracking with distance constraints: The critical insight is using the distance map as a filter during DFS - only recurse into neighbors whose distance is exactly one less than current word's distance, which prevents cycles and ensures you're always moving toward the source along a shortest path.
- Graph problems with path enumeration: This pattern applies broadly to any problem requiring all optimal solutions in a graph - use BFS/Dijkstra to find optimal distances first, then use constrained DFS/backtracking with the distance map to enumerate only optimal paths (useful in routing, dependency resolution, and game state exploration).
Related Concepts and Problems to Practice
This lesson provides foundational understanding of graph representations and BFS traversal, which is the core algorithm used in Word Ladder II to find shortest paths in an unweighted graph where words are nodes.
hard
This problem requires BFS to find the shortest path in a graph with complex node relationships, similar to Word Ladder II where you must navigate through word transformations. Both problems involve building implicit graphs and finding optimal paths.
medium
After using BFS to find shortest distances in Word Ladder II, backtracking/DFS is needed to enumerate all shortest paths. This problem teaches the backtracking pattern for exploring all valid paths through a search space.
Test Your Understanding
Why is graph the right data structure for this problem?
graph provides the optimal access pattern
It's the only data structure that works
It's the easiest to implement
It uses the least memory
Select an answer to see the explanation
Question Timeline
See when this question was last asked and where, including any notes left by other candidates.
Early December, 2025
Meta
Senior
Mid November, 2025
Meta
Senior
Mid August, 2025
Bloomberg
Senior
Comments
Hello Interview Premium
Your account is free and you can post anonymously if you choose.