⌘K

Tutor

Leetcode 126. Word Ladder II

Find all shortest sequences transforming beginWord to endWord by changing one letter at a time using only words from wordList; treat words as nodes in an unweighted graph (edges connect one-letter differences), use BFS to discover shortest distances/level graph and then backtrack/DFS to enumerate all shortest transformation paths.

Asked at:

DESCRIPTION

Input:

beginWord = 'hit', endWord = 'cog', wordList = ['hot','dot','dog','lot','log','cog']

Output:

[['hit','hot','dot','dog','cog'], ['hit','hot','lot','log','cog']]

Explanation: There are two shortest transformation sequences of length 5. Both change one letter at a time and use only words from wordList.

Constraints:

1 <= beginWord.length <= 10
endWord.length == beginWord.length
1 <= wordList.length <= 5000
wordList[i].length == beginWord.length
All strings consist of lowercase English letters
beginWord != endWord
All words in wordList are unique

Understanding the Problem

Let's understand what we're being asked to do. We have a beginWord like 'hit', an endWord like 'cog', and a wordList like ['hot','dot','dog','lot','log','cog']. We need to find all shortest transformation sequences from 'hit' to 'cog'.

A transformation means changing exactly one letter at a time, and each intermediate word must exist in the wordList. For example: 'hit' -> 'hot' -> 'dot' -> 'dog' -> 'cog' is valid (each step changes one letter).

Important constraint: We want ALL shortest paths, not just one! If there are multiple ways to reach endWord in the minimum number of steps, we need to return all of them.

Key insight: Words are like nodes in a graph, and edges connect words that differ by exactly one letter. The problem becomes: find all shortest paths in an unweighted graph.

Edge cases to consider: What if endWord is not in wordList? (return empty list). What if no transformation sequence exists? What if beginWord equals endWord?

Brute Force Approach

The brute force approach uses pure DFS to explore all possible transformation paths from beginWord to endWord. Starting from beginWord, recursively try transforming to every word in wordList that differs by one letter, marking words as visited to avoid cycles. When endWord is reached, record the path. After exploring all paths, filter to keep only the shortest ones. This approach explores many unnecessary longer paths before finding the optimal length.

Visualization

def word_ladder(begin_word, end_word, word_list):
    """
    Find all shortest transformation sequences from begin_word to end_word.
    Brute force: Use DFS to explore all paths, then filter shortest ones.
    """
    # Convert word list to set for O(1) lookup
    word_set = set(word_list)
    
    # If end_word not in word_list, no solution
    if end_word not in word_set:
        return []
    
    # Helper function to check if two words differ by exactly one letter
    def one_letter_diff(word1, word2):
        if len(word1) != len(word2):
            return False
        diff_count = 0
        for i in range(len(word1)):
            if word1[i] != word2[i]:
                diff_count += 1
                if diff_count > 1:
                    return False
        return diff_count == 1
    
    # Store all paths found
    all_paths = []
    
    # DFS to explore all transformation paths
    def dfs(current_word, path, visited):
        # Base case: reached end_word
        if current_word == end_word:
            all_paths.append(path.copy())
            return
        
        # Try transforming to every word in word_list
        for word in word_set:
            # Skip if already visited (avoid cycles)
            if word in visited:
                continue
            
            # Check if word differs by exactly one letter
            if one_letter_diff(current_word, word):
                # Mark as visited
                visited.add(word)
                path.append(word)
                
                # Recursively explore from this word
                dfs(word, path, visited)
                
                # Backtrack: unmark and remove from path
                path.pop()
                visited.remove(word)
    
    # Start DFS from begin_word
    initial_visited = {begin_word}
    initial_path = [begin_word]
    dfs(begin_word, initial_path, initial_visited)
    
    # Filter to keep only shortest paths
    if not all_paths:
        return []
    
    min_length = min(len(path) for path in all_paths)
    shortest_paths = [path for path in all_paths if len(path) == min_length]
    
    return shortest_paths

Start word ladder algorithm

0 / 317

Test Your Knowledge

Complexity Analysis

Time Complexity: O(N^N) In the worst case, we explore all possible permutations of words, leading to exponential time complexity where N is the number of words

Space Complexity: O(N) Recursion stack depth is at most N words in the longest path, plus space for tracking visited words

Building Intuition

In an unweighted graph, BFS discovers nodes level by level, guaranteeing that the first time we reach a node, we've found the shortest distance to it.

Once we know the shortest distance to each word, we can work backwards from endWord to beginWord, only following edges that decrease the distance by 1. This ensures we only traverse shortest paths.

A naive DFS approach would explore all possible paths, including many longer-than-optimal ones, leading to exponential time complexity.

By using BFS first to build a distance map, then DFS to enumerate only shortest paths, we avoid exploring any path that's longer than necessary. This two-phase approach (BFS + DFS) is much more efficient.

Think of this like finding all shortest routes between two cities on a map:

Phase 1 (BFS): Measure the shortest distance from the start city to every other city. Mark each city with its distance number.

Phase 2 (DFS/Backtracking): Starting from the destination, work backwards. At each step, only move to cities whose distance number is exactly 1 less than your current city. This guarantees you're always on a shortest path.

The level-by-level exploration of BFS is what makes this work - it's like ripples spreading outward from a stone dropped in water.

Common Mistakes

Using only DFS without BFS preprocessing leads to exploring many non-optimal paths, causing exponential time complexity and potential timeout on large inputs

Forgetting to check if endWord exists in wordList before starting - this is a required edge case that should return empty result immediately

Building the graph incorrectly by not checking all one-letter differences, or by allowing transformations to words not in wordList (remember beginWord might not be in the list but intermediate words must be)

In the DFS phase, following edges that don't decrease distance by exactly 1 - this breaks the shortest-path guarantee and includes longer paths in the result

Not handling the case where beginWord equals endWord - should return a single-element path [[beginWord]] as the shortest sequence

Optimal Solution

The optimal approach uses a two-phase strategy: First, run BFS from beginWord to build a distance map showing the shortest distance to each word. This creates a level graph where each word knows its distance from the start. Second, use DFS/backtracking starting from endWord, working backwards to beginWord. At each step, only follow edges to words whose distance is exactly 1 less, ensuring we only traverse shortest paths. This guarantees we enumerate all shortest transformation sequences without exploring longer paths.

Visualization

from collections import deque, defaultdict
def find_ladders(begin_word, end_word, word_list):
    """
    Find all shortest transformation sequences from begin_word to end_word.
    Uses BFS to build distance map, then DFS to enumerate all shortest paths.
    """
    # Convert word_list to set for O(1) lookup
    word_set = set(word_list)
    
    # If end_word not in word_list, no transformation possible
    if end_word not in word_set:
        return []
    
    # BFS Phase: Build distance map showing shortest distance from begin_word
    queue = deque([begin_word])
    distances = {begin_word: 0}
    
    while queue:
        current_word = queue.popleft()
        current_distance = distances[current_word]
        
        # Try all one-letter transformations
        for i in range(len(current_word)):
            for c in 'abcdefghijklmnopqrstuvwxyz':
                if c == current_word[i]:
                    continue
                
                # Generate neighbor word
                neighbor = current_word[:i] + c + current_word[i+1:]
                
                # If neighbor is in word_set and not visited
                if neighbor in word_set and neighbor not in distances:
                    distances[neighbor] = current_distance + 1
                    queue.append(neighbor)
    
    # If end_word not reached, return empty list
    if end_word not in distances:
        return []
    
    # DFS Phase: Backtrack from end_word to begin_word using distance map
    all_paths = []
    current_path = [end_word]
    
    def backtrack(word, path):
        # Base case: reached begin_word
        if word == begin_word:
            all_paths.append(path[::-1])
            return
        
        # Try all one-letter transformations
        for i in range(len(word)):
            for c in 'abcdefghijklmnopqrstuvwxyz':
                if c == word[i]:
                    continue
                
                # Generate neighbor word
                neighbor = word[:i] + c + word[i+1:]
                
                # Only follow edges to words with distance exactly 1 less
                if neighbor in distances and distances[neighbor] == distances[word] - 1:
                    path.append(neighbor)
                    backtrack(neighbor, path)
                    path.pop()
    
    backtrack(end_word, current_path)
    return all_paths

Start Word Ladder II: Find all shortest transformation paths

0 / 2298

Test Your Knowledge

Complexity Analysis

Time Complexity: O(N * L^2 + P) BFS phase: O(N * L^2) where N is number of words and L is word length (checking all one-letter differences). DFS phase: O(P) where P is the total number of nodes in all shortest paths. Building the graph requires comparing words which takes O(L) per comparison.

Space Complexity: O(N * L) We store the graph adjacency list (O(N * L) for all edges), distance map (O(N)), and recursion stack for DFS (O(L) depth for path length)

What We've Learned

BFS for shortest path discovery + DFS for path enumeration: When you need to find ALL shortest paths (not just one), use BFS first to build a level/distance map from the target, then backtrack with DFS using only edges that decrease distance by 1 - this two-phase approach guarantees you only explore shortest paths without redundant work.
Bidirectional BFS optimization: Build the graph structure using BFS from endWord backwards to compute distances to all reachable words - this prunes the search space early by identifying which words can actually reach the target, avoiding exploration of dead-end branches during path reconstruction.
Level-based graph construction: Treat the word transformation as an unweighted graph where BFS naturally discovers shortest distances - store each word's distance/level from the target, then during DFS only follow edges to words at level-1, ensuring all reconstructed paths have identical optimal length.
Neighbor generation vs pre-built adjacency list: For word ladder problems, generating neighbors on-the-fly (try all 26 letters at each position and check wordList membership with a Set) is often faster than pre-computing all edges, especially when the word list is large but words are short - O(26*L) per word beats O(N²*L) preprocessing.
Backtracking with distance constraints: The critical insight is using the distance map as a filter during DFS - only recurse into neighbors whose distance is exactly one less than current word's distance, which prevents cycles and ensures you're always moving toward the source along a shortest path.
Graph problems with path enumeration: This pattern applies broadly to any problem requiring all optimal solutions in a graph - use BFS/Dijkstra to find optimal distances first, then use constrained DFS/backtracking with the distance map to enumerate only optimal paths (useful in routing, dependency resolution, and game state exploration).

Test Your Understanding

Why is graph the right data structure for this problem?

graph provides the optimal access pattern

It's the only data structure that works

It's the easiest to implement

It uses the least memory

Select an answer to see the explanation

Question Timeline

See when this question was last asked and where, including any notes left by other candidates.

Company

Level

Region

Early December, 2025

Comments

Your account is free and you can post anonymously if you choose.

Hello Interview Premium

Recent interview questions

System Design Guided Practice

Exclusive content

Learn More

Leetcode 126. Word Ladder II

DESCRIPTION

Understanding the Problem

Brute Force Approach

Test Your Knowledge

Complexity Analysis

Building Intuition

Common Mistakes

Optimal Solution

Test Your Knowledge

Complexity Analysis

What We've Learned

Related Concepts and Problems to Practice

Test Your Understanding

Question Timeline

Comments

Questions

Learn

Links

Legal

Contact