30% off Lifetime Premium Access expires December 2nd!

30% off Lifetime Premium Access expires Dec 2nd!

Your Dashboard

Interview Coaching

Mock Interviews

1:1 Mentorship

Salary Negotiation

Practice

Learn

System Design

ML System Design

Code

Behavioral

Salary Negotiation

Interview Guides

Blog

Community

Interview Questions

Interview Experiences

Peer System Design Library

Ask The Community

Discord

Refer a Friend

Pricing

Become a Coach

⌘K

Pricing Become a Coach

⌘K

Get Premium

Leetcode 528. Random Pick with Weight

Given an array of positive weights, implement pickIndex() to return index i with probability w[i]/sum(w); the typical approach preprocessing prefix sums and using a random number plus binary search maps the random draw to the weighted bucket in O(log n) time after O(n) setup.

Asked at:

DESCRIPTION

Input:


w = [1, 3, 2, 4]

Output:


pickIndex() returns 0 with 10% probability, 1 with 30%, 2 with 20%, 3 with 40%

Explanation: Total weight is 10. Index 0 has weight 1 (1/10 = 10%), index 1 has weight 3 (3/10 = 30%), index 2 has weight 2 (2/10 = 20%), index 3 has weight 4 (4/10 = 40%)

Constraints:

1 <= w.length <= 10^4
1 <= w[i] <= 10^5
pickIndex will be called at most 10^4 times
All weights are positive integers

Understanding the Problem

Let's understand this problem with a concrete example. Given weights [1, 3, 2], we need to implement pickIndex() that randomly returns index 0, 1, or 2, but NOT with equal probability.

Index 1 (weight 3) should be picked 3 times as often as index 0 (weight 1). The probability of picking each index is weight[i] / sum(weights). For [1, 3, 2], total weight is 6, so: index 0 has probability 1/6 (~17%), index 1 has probability 3/6 (50%), index 2 has probability 2/6 (~33%).

If we call pickIndex() 100 times, we'd expect roughly: ~17 picks of index 0, ~50 picks of index 1, ~33 picks of index 2.

Key constraint: The pickIndex() method will be called many times (up to 10^4), so it needs to be efficient.

Edge cases: What if the array has only one element? What if one weight is much larger than others? What if weights are very large numbers?

Brute Force Approach

For each pickIndex() call, generate a random number between 0 and the total weight sum. Then iterate through the weights array, accumulating a running sum until the random number falls within the current weight's range. This approach is simple to understand: we're essentially checking each bucket linearly to see if our random number belongs to it. However, this requires O(n) time for every pick operation, which becomes inefficient when pickIndex() is called many times.

import random

def weighted_pick_brute_force(w):
    """
    Brute force approach: Generate random number and linearly scan
    to find which weight bucket it falls into.
    Time: O(n) per pick, Space: O(1)
    """
    # Calculate total weight sum
    total_weight = sum(w)
    
    # Generate random number between 1 and total_weight (inclusive)
    target = random.randint(1, total_weight)
    
    # Linear scan: accumulate weights until target is reached
    running_sum = 0
    for i in range(len(w)):
        running_sum += w[i]
        
        # Check if target falls within current weight's range
        if target <= running_sum:
            return i
    
    # Fallback (should never reach here)
    return len(w) - 1

Start weighted random pick (brute force)

0 / 10

Test Your Knowledge

Complexity Analysis

Time Complexity: O(n) per pickIndex call We iterate through all n weights to find which range the random number falls into, checking each weight sequentially

Space Complexity: O(1) We only store the total sum and use a running accumulator, requiring constant extra space

Building Intuition

We can transform weights into ranges on a number line. For weights [1, 3, 2], create ranges: index 0 gets range [0, 1) (size 1), index 1 gets range [1, 4) (size 3), index 2 gets range [4, 6) (size 2).

Now if we pick a random number between 0 and 6, it naturally falls into these ranges proportionally! A random number has a 1/6 chance of landing in [0, 1), a 3/6 chance of landing in [1, 4), etc.

By accumulating weights into running sums (prefix sums), we create sorted range boundaries. For weights [1, 3, 2], prefix sums are [1, 4, 6].

A random target of 3.5 falls between boundaries 1 and 4, which corresponds to index 1. Since these boundaries are sorted, we can find which range a random number falls into very efficiently using binary search instead of checking each range linearly.

Imagine a raffle with 100 tickets:

Person A bought tickets 0-9 (10 tickets)
Person B bought tickets 10-69 (60 tickets)
Person C bought tickets 70-99 (30 tickets)

If you draw ticket #45, you need to find whose range it falls into. You could check each person one by one (slow), or since the ranges are sorted by their boundaries [10, 70, 100], you can use binary search to quickly find that 45 falls between 10 and 70, meaning Person B wins.

This is exactly what we do: precompute prefix sums (boundaries), generate a random number (ticket), and binary search to find which bucket (person) it belongs to.

Common Mistakes

Forgetting to handle the edge case where the random number exactly equals a prefix sum boundary - use 'greater than' comparison, not 'greater than or equal'

Using integer division incorrectly when generating random numbers - ensure the random number can be any value in the range [0, totalSum), not just integers

Off-by-one errors in binary search: the target index is where prefixSum[i] is the first value greater than the random number, not greater than or equal

Not considering that weights can be very large (up to 10^5 each) - the total sum could overflow if not using appropriate data types

Rebuilding prefix sums on every pickIndex call instead of precomputing once during initialization, leading to O(n) time per call instead of O(log n)

Optimal Solution

During initialization, precompute prefix sums to create sorted range boundaries. For weights [1, 3, 2], prefix sums are [1, 4, 6]. When pickIndex() is called, generate a random number between 0 and the total sum, then use binary search on the prefix sums to find the smallest index where prefixSum[i] > random. This works because prefix sums create sorted boundaries, and binary search can efficiently locate which range the random number falls into. The preprocessing takes O(n) time once, but each subsequent pick is O(log n).

import random

def weighted_pick(w):
    """
    Perform weighted random pick using prefix sums and binary search.
    Returns index i with probability w[i]/sum(w).
    """
    # Build prefix sums array
    prefix_sums = []
    running_sum = 0
    
    for weight in w:
        running_sum += weight
        prefix_sums.append(running_sum)
    
    # Generate random number in range [1, total_sum]
    target = random.randint(1, running_sum)
    
    # Binary search to find the bucket
    left = 0
    right = len(prefix_sums) - 1
    
    while left < right:
        mid = (left + right) // 2
        if prefix_sums[mid] < target:
            left = mid + 1
        else:
            right = mid
    
    return left

Start weighted random pick with weights: [1, 3, 2, 4]

0 / 19

Test Your Knowledge

Complexity Analysis

Time Complexity: O(n) for initialization, O(log n) per pickIndex call Building prefix sums requires one pass through the array (O(n)). Each pickIndex call uses binary search on the prefix sums array, which takes O(log n) time

Space Complexity: O(n) We store the prefix sums array which requires O(n) space proportional to the number of weights

What We've Learned

Prefix sum transformation for weighted sampling: Convert a weights array into cumulative sums to transform the problem from "pick with probability" into "find which range contains a random number" - this makes O(log n) binary search possible instead of O(n) linear scanning.
Binary search on answer space: Use binary search to find the leftmost index where prefix_sum[i] > random_value, not to search for an exact match - this maps continuous random ranges to discrete bucket indices efficiently.
Random number range mapping: Generate random values in range [1, total_sum] or [0, total_sum) and ensure consistency - an off-by-one error here breaks the probability distribution, making some indices unreachable or over-represented.
Preprocessing vs query time tradeoff: Invest O(n) time and space upfront to build prefix sums so each pickIndex() call runs in O(log n) instead of O(n) - critical when pickIndex() is called many times, amortizing the setup cost.
Weighted random sampling pattern: This prefix-sum + binary-search technique applies to any problem involving non-uniform probability distributions, weighted reservoirs, or proportional selection like load balancing servers by capacity or sampling ads by bid amount.
Boundary condition precision: When using binary search, carefully handle whether to use `random.randint(1, total)` with `prefix[mid] >= target` or `random.random() * total` with `prefix[mid] > target` - mismatched boundary logic causes incorrect probability weights at bucket edges.

Test Your Understanding

Why is array the right data structure for this problem?

array provides the optimal access pattern

It's the only data structure that works

It's the easiest to implement

It uses the least memory

Select an answer to see the explanation

Question Timeline

See when this question was last asked and where, including any notes left by other candidates.

Company

Level

Region

Mid November, 2025

Comments

Your account is free and you can post anonymously if you choose.

Hello Interview Premium

Recent interview questions

System Design Guided Practice

Exclusive content

Learn More

Leetcode 528. Random Pick with Weight

DESCRIPTION

Understanding the Problem

Brute Force Approach

Test Your Knowledge

Complexity Analysis

Building Intuition

Common Mistakes

Optimal Solution

Test Your Knowledge

Complexity Analysis

What We've Learned

Related Concepts and Problems to Practice

Test Your Understanding

Question Timeline

Comments

Questions

Learn

Links

Legal

Contact