Leetcode 973. K Closest Points to Origin
Asked at:
Meta
Amazon
DESCRIPTION
Given N points on the plane and an integer k, return the k points with the smallest Euclidean distance to the origin (order doesn't matter). The core challenge is efficiently selecting the k smallest distances (e.g., via a max-heap or quickselect/partial sort) for up to N = 10^4 points.
Input:
points = [[1,3], [-2,2], [5,8], [0,1]], k = 2
Output:
[[0,1], [-2,2]]
Explanation: Distances: [1,3] → √10 ≈ 3.16, [-2,2] → √8 ≈ 2.83, [5,8] → √89 ≈ 9.43, [0,1] → 1. The two smallest distances are 1 and √8, so we return [[0,1], [-2,2]].
Constraints:
- 1 <= k <= points.length <= 10^4
- -10^4 < points[i][0], points[i][1] < 10^4
- All points are unique
- Order of returned points does not matter
Understanding the Problem
Let's understand what we're being asked to do. We have N points on a 2D plane, like [[1,3], [-2,2], [5,8], [0,1]], and we need to find the k points closest to the origin (0,0).
For example, if k=2, we need to return the 2 points with the smallest Euclidean distance to (0,0). The distance from point (x,y) to origin is √(x² + y²). For point [1,3], distance is √(1² + 3²) = √10 ≈ 3.16. For point [0,1], distance is √(0² + 1²) = 1.
Important constraint: We can return the k points in any order - we don't need to sort them by distance, just identify which k points are closest.
Edge cases to consider: What if k equals N (return all points)? What if two points have the same distance (both should be considered)? What if k=1 (just find the single closest point)?
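These cases are easy to check against a one-line reference implementation (a minimal sketch using Python's built-in sorted; the name k_closest_ref is ours, not from the problem):

```python
def k_closest_ref(points, k):
    # Sort by squared distance to the origin, then take the first k
    return sorted(points, key=lambda p: p[0] ** 2 + p[1] ** 2)[:k]

points = [[1, 3], [-2, 2], [5, 8], [0, 1]]
k_closest_ref(points, 2)            # → [[0, 1], [-2, 2]]
k_closest_ref(points, 1)            # k=1: → [[0, 1]], the single closest
k_closest_ref(points, len(points))  # k=N: returns all four points
```

Because all four distances here are distinct, the output order is fully determined; with ties, Python's stable sort keeps tied points in input order, and all tied candidates remain eligible.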
Brute Force Approach
The straightforward approach is to calculate the Euclidean distance for every point, sort all N points by their distances, and then return the first k points from the sorted array. This approach is simple to implement and guarantees correctness, but it does unnecessary work by sorting all points when we only need the k smallest.
def k_closest_points(points, k):
    """Brute force: calculate the distance for every point, sort all
    N points, then return the first k. Inefficient because we sort
    all points when we only need the k smallest."""
    # Calculate squared Euclidean distance for each point
    # (squared distance gives the same ordering, no sqrt needed)
    distances = []
    for point in points:
        x, y = point[0], point[1]
        dist = x * x + y * y
        distances.append((dist, point))

    # Sort ALL points by distance (inefficient - sorts all N points)
    distances.sort(key=lambda item: item[0])

    # Extract the first k points from the sorted array
    result = []
    for i in range(k):
        result.append(distances[i][1])
    return result
Complexity Analysis
Time Complexity: O(n log n). Calculating distances takes O(n), and sorting all N points takes O(n log n), which dominates.
Space Complexity: O(n). We need O(n) space to store the distance-point pairs for sorting.
Building Intuition
We don't need to fully sort all N points by distance - we only need to identify the k smallest distances.
This is a selection problem, not a sorting problem. If we have 10,000 points and k=10, we shouldn't waste time sorting all 10,000 points when we only care about the top 10.
By maintaining a collection of the k best candidates seen so far, we can process each point once and make a decision: is this point better than the worst of our current k candidates?
This transforms an O(n log n) sorting problem into an O(n log k) selection problem. For large N and small k, this is a massive improvement!
Imagine you're a talent scout looking for the top 10 athletes out of 10,000 candidates. You don't need to rank all 10,000 - you just need to keep track of your current top 10.
As you evaluate each new candidate, you compare them to the worst performer in your current top 10. If the new candidate is better, they replace that worst performer. This way, you only maintain a small collection of the best candidates, not the entire pool.
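Python's standard library implements exactly this scout strategy. A minimal sketch with heapq.nsmallest, which internally maintains a bounded heap of the k best candidates:

```python
import heapq

points = [[1, 3], [-2, 2], [5, 8], [0, 1]]

# Keep only the 2 best "candidates", ranked by squared distance to origin
top_2 = heapq.nsmallest(2, points, key=lambda p: p[0] ** 2 + p[1] ** 2)
# → [[0, 1], [-2, 2]]
```

nsmallest returns its results in ascending key order, which is stronger than the problem requires; in an interview you would typically be asked to build the bounded heap yourself, as shown in the optimal solution below.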
Optimal Solution
The optimal approach uses a max-heap of size k to maintain the k closest points seen so far. For each point, calculate its distance and compare it to the maximum distance in the heap (the root). If the new point is closer, remove the farthest point from the heap and add the new point. This way, we only maintain k elements in the heap at any time, and after processing all points, the heap contains exactly the k closest points.
import heapq

def k_closest_points(points, k):
    """Find k closest points to origin using a max-heap.
    Time: O(N log k), Space: O(k)"""
    # Max-heap of the k closest points seen so far
    # (store negated distances, since Python's heapq is a min-heap)
    max_heap = []

    # Process each point
    for x, y in points:
        # Calculate squared distance (avoid sqrt for efficiency)
        dist_squared = x * x + y * y

        # Add to heap with negated distance (max-heap simulation)
        heapq.heappush(max_heap, (-dist_squared, [x, y]))

        # Keep only the k closest points: evict the farthest
        if len(max_heap) > k:
            heapq.heappop(max_heap)

    # Extract the points from the heap
    result = [point for _, point in max_heap]
    return result
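A slightly tighter variant of the same idea (a sketch; the function name k_closest_points_v2 is ours) fills the heap to size k first, then uses heapq.heappushpop to combine the push and the eviction into a single O(log k) operation, so the heap never transiently holds k+1 elements:

```python
import heapq

def k_closest_points_v2(points, k):
    max_heap = []
    for x, y in points:
        d = -(x * x + y * y)  # negate: heapq is a min-heap
        if len(max_heap) < k:
            heapq.heappush(max_heap, (d, [x, y]))
        else:
            # Push the new point and pop the current farthest in one step;
            # if the new point is the farthest, it pops right back out
            heapq.heappushpop(max_heap, (d, [x, y]))
    return [point for _, point in max_heap]
```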
Complexity Analysis
Time Complexity: O(n log k). We process each of the n points once, and each heap operation (insert/remove) takes O(log k), giving O(n log k) total time.
Space Complexity: O(k). The max-heap stores at most k points at any time.
What We've Learned
- Max-heap for top-K smallest problems: Use a max-heap of size k (not min-heap) to find k smallest elements - the heap's root is always the largest among your k candidates, making it easy to decide if a new element should replace it. This inverts intuition but is the correct approach for "k smallest" problems.
- Distance comparison optimization: Avoid computing square roots for Euclidean distance comparisons - since sqrt is monotonic, comparing `x²+y²` gives the same ordering as comparing `√(x²+y²)`, saving expensive floating-point operations on every comparison.
- Heap vs Quickselect trade-off: Max-heap gives O(N log k) which is better when k << N, while quickselect gives O(N) average case but O(N²) worst case. For k close to N/2, quickselect shines; for small k (like k=5 from N=10000), max-heap is more predictable and often faster in practice.
- Custom comparator pitfall: When using language-built heaps, ensure your comparator matches the heap type - Python's heapq is min-heap only, so negate distances or use tuples like `(-distance, point)` to simulate max-heap behavior. Java's PriorityQueue needs explicit `Collections.reverseOrder()` for max-heap.
- Partial sorting pattern recognition: Whenever you need "top k" or "k smallest/largest" from unsorted data and don't need full sorting, think heap-based selection or quickselect - full sorting O(N log N) wastes work when k << N since you only care about k elements.
- Streaming data application: This max-heap approach extends to real-time scenarios like finding k nearest neighbors in recommendation systems, monitoring top-k anomalies in network traffic, or maintaining leaderboards - the heap efficiently handles continuous updates without reprocessing all data.
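The quickselect alternative mentioned above can be sketched as follows (a minimal in-place version with a random pivot; the function name is ours). It partially partitions the array so the k closest points end up in the first k slots, in no particular order:

```python
import random

def k_closest_quickselect(points, k):
    """Average O(N), worst case O(N^2); reorders points in place."""
    dist = lambda p: p[0] * p[0] + p[1] * p[1]

    def partition(lo, hi):
        # Random pivot guards against adversarial inputs
        pivot_idx = random.randint(lo, hi)
        points[pivot_idx], points[hi] = points[hi], points[pivot_idx]
        pivot = dist(points[hi])
        store = lo
        for i in range(lo, hi):
            if dist(points[i]) < pivot:
                points[store], points[i] = points[i], points[store]
                store += 1
        points[store], points[hi] = points[hi], points[store]
        return store  # pivot's final index

    lo, hi = 0, len(points) - 1
    while lo < hi:
        p = partition(lo, hi)
        if p == k:        # first k slots now hold the k closest
            break
        elif p < k:
            lo = p + 1    # need more: recurse into the right part
        else:
            hi = p - 1    # too many: recurse into the left part
    return points[:k]
```

Note that unlike the heap version, this needs all points in memory at once, which is why the heap approach is preferred for streaming data.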
Related Concepts and Problems to Practice
- (medium) This is the same core problem; it teaches the pattern of using a max-heap to efficiently find the k smallest elements, the primary technique needed here.
- (medium) This problem uses the same heap-based selection pattern but finds the kth largest element instead of the k smallest. It reinforces maintaining a heap of size k to select elements by a comparison criterion.
- (medium) This problem extends the k-closest pattern by finding elements closest to a target value rather than the origin, practicing the same heap-based selection with distance calculations.
Question Timeline
See when this question was last asked and where, including any notes left by other candidates.
- Late October, 2025: Meta (Senior)
- Mid October, 2025: Meta (Senior)
- Late September, 2025: Meta (Staff)