Your Dashboard
Interview Coaching
Learn
System Design
ML System Design
Code
Behavioral
Salary Negotiation
Interview Guides
Leetcode 528. Random Pick with Weight
Given an array of positive weights, implement pickIndex() to return index i with probability w[i]/sum(w); the typical approach preprocessing prefix sums and using a random number plus binary search maps the random draw to the weighted bucket in O(log n) time after O(n) setup.
Asked at:
Meta
Netflix
Microsoft
DESCRIPTION
Given an array of positive weights, implement pickIndex() to return index i with probability w[i]/sum(w); the typical approach preprocessing prefix sums and using a random number plus binary search maps the random draw to the weighted bucket in O(log n) time after O(n) setup.
Input:
Output:
Explanation: Total weight is 10. Index 0 has weight 1 (1/10 = 10%), index 1 has weight 3 (3/10 = 30%), index 2 has weight 2 (2/10 = 20%), index 3 has weight 4 (4/10 = 40%)
Constraints:
- 1 <= w.length <= 10^4
- 1 <= w[i] <= 10^5
- pickIndex will be called at most 10^4 times
- All weights are positive integers
Understanding the Problem
Let's understand this problem with a concrete example. Given weights [1, 3, 2], we need to implement pickIndex() that randomly returns index 0, 1, or 2, but NOT with equal probability.
Index 1 (weight 3) should be picked 3 times as often as index 0 (weight 1). The probability of picking each index is weight[i] / sum(weights). For [1, 3, 2], total weight is 6, so: index 0 has probability 1/6 (~17%), index 1 has probability 3/6 (50%), index 2 has probability 2/6 (~33%).
If we call pickIndex() 100 times, we'd expect roughly: ~17 picks of index 0, ~50 picks of index 1, ~33 picks of index 2.
Key constraint: The pickIndex() method will be called many times (up to 10^4), so it needs to be efficient.
Edge cases: What if the array has only one element? What if one weight is much larger than others? What if weights are very large numbers?
Brute Force Approach
For each pickIndex() call, generate a random number between 0 and the total weight sum. Then iterate through the weights array, accumulating a running sum until the random number falls within the current weight's range. This approach is simple to understand: we're essentially checking each bucket linearly to see if our random number belongs to it. However, this requires O(n) time for every pick operation, which becomes inefficient when pickIndex() is called many times.
Start weighted random pick (brute force)
0 / 10
Test Your Knowledge
Login to take the complexity quiz and track your progress
Complexity Analysis
Time Complexity: O(n) per pickIndex call We iterate through all n weights to find which range the random number falls into, checking each weight sequentially
Space Complexity: O(1) We only store the total sum and use a running accumulator, requiring constant extra space
Building Intuition
We can transform weights into ranges on a number line. For weights [1, 3, 2], create ranges: index 0 gets range [0, 1) (size 1), index 1 gets range [1, 4) (size 3), index 2 gets range [4, 6) (size 2).
Now if we pick a random number between 0 and 6, it naturally falls into these ranges proportionally! A random number has a 1/6 chance of landing in [0, 1), a 3/6 chance of landing in [1, 4), etc.
By accumulating weights into running sums (prefix sums), we create sorted range boundaries. For weights [1, 3, 2], prefix sums are [1, 4, 6].
A random target of 3.5 falls between boundaries 1 and 4, which corresponds to index 1. Since these boundaries are sorted, we can find which range a random number falls into very efficiently using binary search instead of checking each range linearly.
Imagine a raffle with 100 tickets:
- Person A bought tickets 0-9 (10 tickets)
- Person B bought tickets 10-69 (60 tickets)
- Person C bought tickets 70-99 (30 tickets)
If you draw ticket #45, you need to find whose range it falls into. You could check each person one by one (slow), or since the ranges are sorted by their boundaries [10, 70, 100], you can use binary search to quickly find that 45 falls between 10 and 70, meaning Person B wins.
This is exactly what we do: precompute prefix sums (boundaries), generate a random number (ticket), and binary search to find which bucket (person) it belongs to.
Common Mistakes
Optimal Solution
During initialization, precompute prefix sums to create sorted range boundaries. For weights [1, 3, 2], prefix sums are [1, 4, 6]. When pickIndex() is called, generate a random number between 0 and the total sum, then use binary search on the prefix sums to find the smallest index where prefixSum[i] > random. This works because prefix sums create sorted boundaries, and binary search can efficiently locate which range the random number falls into. The preprocessing takes O(n) time once, but each subsequent pick is O(log n).
Start weighted random pick with weights: [1, 3, 2, 4]
0 / 19
Test Your Knowledge
Login to take the complexity quiz and track your progress
Complexity Analysis
Time Complexity: O(n) for initialization, O(log n) per pickIndex call Building prefix sums requires one pass through the array (O(n)). Each pickIndex call uses binary search on the prefix sums array, which takes O(log n) time
Space Complexity: O(n) We store the prefix sums array which requires O(n) space proportional to the number of weights
What We've Learned
- Prefix sum transformation for weighted sampling: Convert a weights array into cumulative sums to transform the problem from "pick with probability" into "find which range contains a random number" - this makes O(log n) binary search possible instead of O(n) linear scanning.
- Binary search on answer space: Use binary search to find the leftmost index where prefix_sum[i] > random_value, not to search for an exact match - this maps continuous random ranges to discrete bucket indices efficiently.
- Random number range mapping: Generate random values in range [1, total_sum] or [0, total_sum) and ensure consistency - an off-by-one error here breaks the probability distribution, making some indices unreachable or over-represented.
- Preprocessing vs query time tradeoff: Invest O(n) time and space upfront to build prefix sums so each pickIndex() call runs in O(log n) instead of O(n) - critical when pickIndex() is called many times, amortizing the setup cost.
- Weighted random sampling pattern: This prefix-sum + binary-search technique applies to any problem involving non-uniform probability distributions, weighted reservoirs, or proportional selection like load balancing servers by capacity or sampling ads by bid amount.
- Boundary condition precision: When using binary search, carefully handle whether to use `random.randint(1, total)` with `prefix[mid] >= target` or `random.random() * total` with `prefix[mid] > target` - mismatched boundary logic causes incorrect probability weights at bucket edges.
Related Concepts and Problems to Practice
medium
This problem reinforces prefix sum array construction and querying techniques. While it uses prefix sums differently (with hash maps for range queries), it strengthens understanding of cumulative sum arrays which are fundamental to the weighted random pick preprocessing step.
Test Your Understanding
Why is array the right data structure for this problem?
array provides the optimal access pattern
It's the only data structure that works
It's the easiest to implement
It uses the least memory
Select an answer to see the explanation
Question Timeline
See when this question was last asked and where, including any notes left by other candidates.
Mid November, 2025
Meta
Mid-level
Mid November, 2025
Microsoft
Mid-level
Late October, 2025
Meta
Senior
I was asked a variant of this with cities and population and get the city according to the population probability distribution
Comments
Hello Interview Premium
Your account is free and you can post anonymously if you choose.