Search
⌘K

Leetcode 528. Random Pick with Weight

Given an array of positive weights, implement pickIndex() to return index i with probability w[i]/sum(w); the typical approach preprocessing prefix sums and using a random number plus binary search maps the random draw to the weighted bucket in O(log n) time after O(n) setup.

Asked at:

Meta

Microsoft

Microsoft

Netflix

Netflix


DESCRIPTION

Given an array of positive weights, implement pickIndex() to return index i with probability w[i]/sum(w); the typical approach preprocessing prefix sums and using a random number plus binary search maps the random draw to the weighted bucket in O(log n) time after O(n) setup.

Input:

w = [1, 3, 2, 4]

Output:

pickIndex() returns 0 with 10% probability, 1 with 30%, 2 with 20%, 3 with 40%


Explanation: Total weight is 10. Index 0 has weight 1 (1/10 = 10%), index 1 has weight 3 (3/10 = 30%), index 2 has weight 2 (2/10 = 20%), index 3 has weight 4 (4/10 = 40%)

Constraints:

  • 1 <= w.length <= 10^4
  • 1 <= w[i] <= 10^5
  • pickIndex will be called at most 10^4 times
  • All weights are positive integers

Understanding the Problem

Let's understand this problem with a concrete example. Given weights [1, 3, 2], we need to implement pickIndex() that randomly returns index 0, 1, or 2, but NOT with equal probability.

Index 1 (weight 3) should be picked 3 times as often as index 0 (weight 1). The probability of picking each index is weight[i] / sum(weights). For [1, 3, 2], total weight is 6, so: index 0 has probability 1/6 (~17%), index 1 has probability 3/6 (50%), index 2 has probability 2/6 (~33%).

If we call pickIndex() 100 times, we'd expect roughly: ~17 picks of index 0, ~50 picks of index 1, ~33 picks of index 2.

Key constraint: The pickIndex() method will be called many times (up to 10^4), so it needs to be efficient.

Edge cases: What if the array has only one element? What if one weight is much larger than others? What if weights are very large numbers?

Brute Force Approach

For each pickIndex() call, generate a random number between 0 and the total weight sum. Then iterate through the weights array, accumulating a running sum until the random number falls within the current weight's range. This approach is simple to understand: we're essentially checking each bucket linearly to see if our random number belongs to it. However, this requires O(n) time for every pick operation, which becomes inefficient when pickIndex() is called many times.

|
comma-separated integers
Visualization
import random
def weighted_pick_brute_force(w):
"""
Brute force approach: Generate random number and linearly scan
to find which weight bucket it falls into.
Time: O(n) per pick, Space: O(1)
"""
# Calculate total weight sum
total_weight = sum(w)
# Generate random number between 1 and total_weight (inclusive)
target = random.randint(1, total_weight)
# Linear scan: accumulate weights until target is reached
running_sum = 0
for i in range(len(w)):
running_sum += w[i]
# Check if target falls within current weight's range
if target <= running_sum:
return i
# Fallback (should never reach here)
return len(w) - 1
13240123

Start weighted random pick (brute force)

0 / 12

Test Your Knowledge

Login to take the complexity quiz and track your progress

Complexity Analysis

Time Complexity: O(n) per pickIndex call We iterate through all n weights to find which range the random number falls into, checking each weight sequentially

Space Complexity: O(1) We only store the total sum and use a running accumulator, requiring constant extra space

Building Intuition

We can transform weights into ranges on a number line. For weights [1, 3, 2], create ranges: index 0 gets range [0, 1) (size 1), index 1 gets range [1, 4) (size 3), index 2 gets range [4, 6) (size 2).

Now if we pick a random number between 0 and 6, it naturally falls into these ranges proportionally! A random number has a 1/6 chance of landing in [0, 1), a 3/6 chance of landing in [1, 4), etc.

By accumulating weights into running sums (prefix sums), we create sorted range boundaries. For weights [1, 3, 2], prefix sums are [1, 4, 6].

A random target of 3.5 falls between boundaries 1 and 4, which corresponds to index 1. Since these boundaries are sorted, we can find which range a random number falls into very efficiently using binary search instead of checking each range linearly.

Imagine a raffle with 100 tickets:

  • Person A bought tickets 0-9 (10 tickets)
  • Person B bought tickets 10-69 (60 tickets)
  • Person C bought tickets 70-99 (30 tickets)

If you draw ticket #45, you need to find whose range it falls into. You could check each person one by one (slow), or since the ranges are sorted by their boundaries [10, 70, 100], you can use binary search to quickly find that 45 falls between 10 and 70, meaning Person B wins.

This is exactly what we do: precompute prefix sums (boundaries), generate a random number (ticket), and binary search to find which bucket (person) it belongs to.

Common Mistakes

Optimal Solution

During initialization, precompute prefix sums to create sorted range boundaries. For weights [1, 3, 2], prefix sums are [1, 4, 6]. When pickIndex() is called, generate a random number between 0 and the total sum, then use binary search on the prefix sums to find the smallest index where prefixSum[i] > random. This works because prefix sums create sorted boundaries, and binary search can efficiently locate which range the random number falls into. The preprocessing takes O(n) time once, but each subsequent pick is O(log n).

|
comma-separated integers
Visualization
import random
def weighted_pick(w):
"""
Perform weighted random pick using prefix sums and binary search.
Returns index i with probability w[i]/sum(w).
"""
# Build prefix sums array
prefix_sums = []
running_sum = 0
for weight in w:
running_sum += weight
prefix_sums.append(running_sum)
# Generate random number in range [1, total_sum]
target = random.randint(1, running_sum)
# Binary search to find the bucket
left = 0
right = len(prefix_sums) - 1
while left < right:
mid = (left + right) // 2
if prefix_sums[mid] < target:
left = mid + 1
else:
right = mid
return left
13240123

Start weighted random pick with weights: [1, 3, 2, 4]

0 / 19

Test Your Knowledge

Login to take the complexity quiz and track your progress

Complexity Analysis

Time Complexity: O(n) for initialization, O(log n) per pickIndex call Building prefix sums requires one pass through the array (O(n)). Each pickIndex call uses binary search on the prefix sums array, which takes O(log n) time

Space Complexity: O(n) We store the prefix sums array which requires O(n) space proportional to the number of weights

What We've Learned

  • Prefix sum transformation for weighted sampling: Convert a weights array into cumulative sums to transform the problem from "pick with probability" into "find which range contains a random number" - this makes O(log n) binary search possible instead of O(n) linear scanning.
  • Binary search on answer space: Use binary search to find the leftmost index where prefix_sum[i] > random_value, not to search for an exact match - this maps continuous random ranges to discrete bucket indices efficiently.
  • Random number range mapping: Generate random values in range [1, total_sum] or [0, total_sum) and ensure consistency - an off-by-one error here breaks the probability distribution, making some indices unreachable or over-represented.
  • Preprocessing vs query time tradeoff: Invest O(n) time and space upfront to build prefix sums so each pickIndex() call runs in O(log n) instead of O(n) - critical when pickIndex() is called many times, amortizing the setup cost.
  • Weighted random sampling pattern: This prefix-sum + binary-search technique applies to any problem involving non-uniform probability distributions, weighted reservoirs, or proportional selection like load balancing servers by capacity or sampling ads by bid amount.
  • Boundary condition precision: When using binary search, carefully handle whether to use `random.randint(1, total)` with `prefix[mid] >= target` or `random.random() * total` with `prefix[mid] > target` - mismatched boundary logic causes incorrect probability weights at bucket edges.

Related Concepts and Problems to Practice

Overview
Lesson
Binary Search

Random Pick with Weight uses binary search on prefix sums to map random values to weighted indices. This lesson covers binary search fundamentals which are essential for understanding the O(log n) lookup approach in the weighted random selection problem.

Overview
Lesson
Prefix Sum

The problem requires preprocessing an array into prefix sums to create cumulative weight ranges. This lesson teaches prefix sum techniques which are the core preprocessing step needed before applying binary search in the weighted random pick solution.

Subarray Sum Equals K

medium

Prefix Sum

This problem reinforces prefix sum array construction and querying techniques. While it uses prefix sums differently (with hash maps for range queries), it strengthens understanding of cumulative sum arrays which are fundamental to the weighted random pick preprocessing step.

Test Your Understanding

Why is array the right data structure for this problem?

1

array provides the optimal access pattern

2

It's the only data structure that works

3

It's the easiest to implement

4

It uses the least memory

Select an answer to see the explanation

Question Timeline

See when this question was last asked and where, including any notes left by other candidates.

Early December, 2025

Meta

Senior

Mid November, 2025

Meta

Mid-level

Mid November, 2025

Microsoft

Microsoft

Mid-level

Comments

Your account is free and you can post anonymously if you choose.