Limited Time Offer:Up to 25% off Hello Interview Premium

⌘K

Tutor

Get Premium

Leetcode 49. Group Anagrams

Group the input strings into lists of anagrams — i.e., strings that contain the same multiset of letters — and return the groups in any order. Typical solutions compute a canonical key per string (sorted characters or a letter-count signature) and use a hash map to bucket up to 10^4 strings of length ≤100.

Asked at:

Amazon

Oracle

DESCRIPTION

Input:

strs = ["eat", "tea", "tan", "ate", "nat", "bat"]

Output:

[["eat","tea","ate"], ["tan","nat"], ["bat"]]

Explanation: "eat", "tea", and "ate" are anagrams (same letters: e, a, t). "tan" and "nat" are anagrams (same letters: t, a, n). "bat" has no anagrams in the input.

Constraints:

1 <= strs.length <= 10^4
0 <= strs[i].length <= 100
strs[i] consists of lowercase English letters only

Understanding the Problem

Let's understand what we're being asked to do. We have an array of strings like ["eat", "tea", "tan", "ate", "nat", "bat"] and need to group strings that are anagrams of each other.

Two strings are anagrams if they contain the exact same letters with the same frequencies, just rearranged. For example, "eat" and "tea" are anagrams (both have one 'e', one 'a', one 't'), but "eat" and "bat" are not (different letters).

The output should be groups: [["eat","tea","ate"], ["tan","nat"], ["bat"]]. The order of groups doesn't matter, and the order within each group doesn't matter either.

Key constraints: We can have up to 10,000 strings, each up to 100 characters long. All strings contain only lowercase English letters.

Edge cases to consider: What if the input is empty? What if all strings are unique (no anagrams)? What if all strings are anagrams of each other?

Brute Force Approach

The brute force approach compares every string with every other string to check if they're anagrams. For each pair, sort both strings and compare if they're equal. If they match, add them to the same group. This requires nested loops and repeated sorting operations, making it very inefficient for large inputs.

Visualization

def group_anagrams(strs):
    """
    Brute force approach: Compare every string with every other string.
    For each pair, sort both strings and check if they're equal.
    Time: O(n^2 * m log m) where n is number of strings, m is max string length.
    """
    # Track which strings have been grouped
    used = [False] * len(strs)
    result = []
    
    # For each string, find all its anagrams
    for i in range(len(strs)):
        if used[i]:
            continue
        
        # Start a new group with current string
        current_group = [strs[i]]
        used[i] = True
        sorted_i = sorted(strs[i])
        
        # Compare with all remaining strings
        for j in range(i + 1, len(strs)):
            if used[j]:
                continue
            
            # Sort both strings and compare
            sorted_j = sorted(strs[j])
            
            # Check if they're anagrams
            if sorted_i == sorted_j:
                current_group.append(strs[j])
                used[j] = True
        
        result.append(current_group)
    
    return result

Start grouping anagrams using brute force

0 / 67

Test Your Knowledge

Complexity Analysis

Time Complexity: O(n² * k log k) For n strings, we do O(n²) pairwise comparisons, and each comparison requires sorting strings of length k, which takes O(k log k)

Space Complexity: O(n * k) We need to store all n strings and their groups, where each string has length up to k

Building Intuition

Two strings are anagrams if and only if they have the same character frequency. For example, "eat" has frequencies {e:1, a:1, t:1} and "tea" has the same frequencies.

This means we need a way to identify strings with identical character frequencies and group them together. The challenge is: how do we efficiently represent 'same character frequencies' so we can use it as a grouping key?

If we can create a canonical representation (a unique signature) for each string based on its character frequencies, we can use that signature to group anagrams together.

For example, both "eat" and "tea" could map to the same signature (like sorted characters "aet" or a frequency pattern). Then we just need to collect all strings with the same signature into the same group.

Think about organizing a deck of cards by their composition. If you have cards with different arrangements of the same suits and numbers, you'd group them by what they contain, not how they're arranged.

For strings, we can either:

Sort the characters: "eat" → "aet", "tea" → "aet" (same sorted form!)
Count frequencies: "eat" → "e1a1t1", "tea" → "e1a1t1" (same frequency signature!)

Either way gives us a unique key to group by. The key insight is that we need a data structure that can map these keys to groups of strings.

Common Mistakes

Forgetting to handle empty strings - they should form their own group with key being empty or a special marker

Using inefficient key generation - concatenating unsorted characters creates different keys for anagrams

Not initializing hash map values properly - need to create a new list for each new key, not reuse the same list

Assuming sorted string is the only canonical form - character frequency counts work too and can be more efficient for very long strings

Forgetting that the problem asks for groups in any order - no need to sort the final output

Optimal Solution

The optimal approach uses a hash map where the key is a canonical representation of each string (either sorted characters or a frequency count signature), and the value is a list of all strings that map to that key. For each string, compute its canonical key and add it to the corresponding group in the hash map. This allows us to group anagrams in a single pass through the input.

Visualization

def group_anagrams(strs):
    """
    Group strings into anagram lists using sorted characters as canonical key.
    Time: O(n * k log k) where n is number of strings, k is max string length.
    Space: O(n * k) for storing all strings in hash map.
    """
    # Initialize hash map to store anagram groups
    anagram_map = {}
    
    # Process each string in the input
    for s in strs:
        # Compute canonical key by sorting characters
        sorted_chars = ''.join(sorted(s))
        
        # Add string to corresponding anagram group
        if sorted_chars not in anagram_map:
            anagram_map[sorted_chars] = []
        anagram_map[sorted_chars].append(s)
    
    # Return all anagram groups as a list
    return list(anagram_map.values())

Start grouping anagrams

0 / 26

Test Your Knowledge

Complexity Analysis

Time Complexity: O(n * k log k) We process n strings, and for each string of length k, we either sort it (O(k log k)) or count its characters (O(k)). The sorting approach dominates at O(n * k log k)

Space Complexity: O(n * k) The hash map stores all n strings, each of length up to k, plus the keys which are also O(k) length

What We've Learned

Canonical key pattern for grouping: When you need to group items by equivalence (anagrams, permutations, same elements), create a canonical representation as a hash key - either by sorting the elements O(k log k) or using a frequency signature O(k), where k is element size. This transforms a grouping problem into a simple hash map bucketing problem.
Tuple as hash key technique: In Python, use tuple(sorted(string)) or tuple(char_count_array) as dictionary keys since lists aren't hashable. The character count approach creates a 26-element tuple like (1,0,2,0,...) representing letter frequencies, which is more efficient than sorting for longer strings.
HashMap for O(1) grouping: A defaultdict(list) or regular dict allows you to group n strings in O(n·k) time where k is average string length - each string is processed once, and hash lookups/insertions are O(1). Without the hash map, comparing all pairs would be O(n²·k).
Sorting vs counting tradeoff: For strings of length k, sorting takes O(k log k) while character counting takes O(k) but requires O(26) = O(1) extra space. Use counting when k is large (k > 100) or when the alphabet is small and fixed; sorting is simpler for small k or variable alphabets.
Equivalence class pattern recognition: This hash-map grouping technique applies broadly to any problem asking to partition items into equivalence classes: group shifted strings, find duplicate subtrees, cluster similar documents, or deduplicate records - always ask 'what canonical form represents this equivalence?'
String immutability consideration: In languages with immutable strings, avoid repeated string concatenation when building keys. Use ''.join() in Python or StringBuilder in Java for the sorted key, or build the count array directly - this prevents O(k²) hidden costs from string copying.

Test Your Understanding

Why is hashmap the right data structure for this problem?

hashmap provides the optimal access pattern

It's the only data structure that works

It's the easiest to implement

It uses the least memory

Select an answer to see the explanation

Question Timeline

See when this question was last asked and where, including any notes left by other candidates.

Company

Level

Region

Late June, 2025

Freelancer

Staff

Late April, 2025

Oracle

Senior

Given an array of strings strs, group the anagrams together. You can return the answer in any order. Input: strs = ["eat","tea","tan","ate","nat","bat"] Output: [["bat"],["nat","tan"],["ate","eat","tea"]]

Late March, 2025

Amazon

Mid-level

Get Premium to View All 6 Reports

Comments

Your account is free and you can post anonymously if you choose.

Hello Interview Premium

Recent interview questions

System Design Guided Practice

Exclusive content

Learn More

Leetcode 49. Group Anagrams

DESCRIPTION

Understanding the Problem

Brute Force Approach

Test Your Knowledge

Complexity Analysis

Building Intuition

Common Mistakes

Optimal Solution

Test Your Knowledge

Complexity Analysis

What We've Learned

Related Concepts and Problems to Practice

Test Your Understanding

Question Timeline

Comments

Questions

Learn

Links

Legal

Contact