Rate Limiter
Understanding the Problem
🚦 What is a Rate Limiter?
A rate limiter controls how many requests a client can make to an API within a specific time window. When a request comes in, the rate limiter checks if the client has exceeded their quota. If they're under the limit, the request proceeds. If they've hit the cap, the request gets rejected. This protects APIs from abuse and ensures fair resource allocation across clients.
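To make that allow/deny flow concrete, here is a minimal sketch of our own using a fixed-window counter, one of the simplest rate limiting approaches. The class and parameter names (`FixedWindowLimiter`, `window_seconds`) are illustrative, not part of the interview prompt:

```python
import time

class FixedWindowLimiter:
    """Toy limiter: at most `limit` requests per client per `window_seconds`."""

    def __init__(self, limit, window_seconds=1.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self.counts = {}  # client_id -> (window_start, request_count)

    def allow(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        start, count = self.counts.get(client_id, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0   # window expired: start a fresh one
        if count >= self.limit:
            return False            # quota exhausted: reject
        self.counts[client_id] = (start, count + 1)
        return True                 # under the limit: proceed

limiter = FixedWindowLimiter(limit=2)
print([limiter.allow("alice", now=0.0) for _ in range(3)])  # [True, True, False]
```

The third request in the same one-second window is rejected; once the window rolls over, the counter resets and requests flow again.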
Requirements
When the interview starts, you'll get something that looks like this:
"You're building an in-memory rate limiter for an API gateway. The system receives configuration from an external service that provides rate limiting rules per endpoint. Each endpoint can have its own limit with a specific algorithm. Here's an example configuration for one endpoint:

{
  "endpoint": "/search",
  "algorithm": "TokenBucket",
  "algoConfig": {
    "capacity": 1000,
    "refillRatePerSecond": 10
  }
}

This config allows bursts up to 1000 requests, refilling at 10 requests per second. Your job is to build the in-memory rate limiter that enforces these rules."
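To get a feel for what capacity and refillRatePerSecond imply, here is a rough token bucket sketch of our own; `TokenBucket` and `try_consume` are assumed names, not given in the prompt:

```python
class TokenBucket:
    """Sketch: allow bursts up to `capacity`, refilling `refill_rate` tokens/second."""

    def __init__(self, capacity, refill_rate):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)  # start full, so a full burst is allowed
        self.last_refill = 0.0

    def try_consume(self, now):
        # Refill proportionally to elapsed time, capped at capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=1000, refill_rate=10)
burst = [bucket.try_consume(now=0.0) for _ in range(1001)]
print(burst.count(True))  # 1000 allowed at once; the 1001st is rejected
```

With the example config, a client can burn through 1000 requests instantly, then trickle in at 10 per second as tokens refill.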
Clarifying Questions
While this gave us a decent understanding of the problem, you'll want to spend some time asking the interviewer clarifying questions to ensure you have a complete understanding of the system you are building. Here's how the conversation might play out:
You: "I see the configuration includes algorithm-specific parameters. Are there different parameter sets for different algorithms?"

Interviewer: "Yeah, different algorithms need different parameters. So the algoConfig object always exists, but the parameters inside it vary."
Good. Now you know the config is heterogeneous. Different algorithms need different parameters.
You: "When a request comes in, what information do we receive? Client ID and endpoint, or something else?"

Interviewer: "Yes, exactly. Each request provides a client ID and an endpoint. The client ID is just a string that uniquely identifies who's making the request."
You: "What should we return when checking a request? Just allowed/denied, or more detail?"

Interviewer: "Return three things: whether it's allowed, how many requests remain in their quota, and if denied, when they can retry."
Now you know the return type needs structure, not just a boolean.
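A natural way to model that structured result is a small value object. This is our own sketch, not the article's implementation; the field names mirror the requirement the interviewer just described (allowed, remaining, retry-after):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class RateLimitResult:
    allowed: bool
    remaining: int                 # requests left in the client's quota
    retry_after_ms: Optional[int]  # set only when the request is denied

denied = RateLimitResult(allowed=False, remaining=0, retry_after_ms=250)
granted = RateLimitResult(allowed=True, remaining=41, retry_after_ms=None)
print(denied.allowed, granted.remaining)  # False 41
```

Making the result immutable (frozen) keeps callers from mutating a shared response object.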
You: "What happens if a request comes in for an endpoint we don't have configuration for?"

Interviewer: "Good question. Fall back to a default configuration. Don't reject requests just because we're missing config."
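That fallback behavior can be sketched as a simple lookup with a default. The values in DEFAULT_CONFIG below are invented for illustration; the prompt doesn't specify what the default limit should be:

```python
# Hypothetical default; the actual numbers would come from the team.
DEFAULT_CONFIG = {"algorithm": "TokenBucket",
                  "algoConfig": {"capacity": 100, "refillRatePerSecond": 1}}

configs = {
    "/search": {"algorithm": "TokenBucket",
                "algoConfig": {"capacity": 1000, "refillRatePerSecond": 10}},
}

def config_for(endpoint):
    # Unknown endpoints fall back to the default rather than being rejected.
    return configs.get(endpoint, DEFAULT_CONFIG)

print(config_for("/search")["algoConfig"]["capacity"])   # 1000
print(config_for("/unknown")["algoConfig"]["capacity"])  # 100
```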
You: "Should the system handle concurrent requests from multiple threads?"

Interviewer: "Don't worry about it to start. We'll get to it if we have time."
This is a common interviewer pattern. They'll say "don't worry about X" to let you start with something simple and get the core logic working first. It's usually not because they don't care about that aspect; it's because they want to see if you can build a clean foundation, then extend it later. If you finish the basic implementation with time to spare, they'll often circle back with "now how would you handle X?" That's your cue to discuss how you'd extend the design. We cover common extensions (including thread safety) in the extensibility section at the end.
You: "Just to clarify scope, are we building distributed rate limiting across multiple servers, or single-process in-memory?"

Interviewer: "Single process, in-memory. Keep it simple."
That's a huge simplification. No network coordination, no shared state across machines.
Designing a distributed rate limiter is a common system design interview question; you can find our breakdown in Design a Distributed Rate Limiter.
You: "And the configuration: is it dynamic, or loaded once at startup?"

Interviewer: "Loaded at startup. Don't worry about hot-reloading config while the system is running."
Perfect. You've scoped out what not to build.
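Since the configuration is loaded once at startup, the limiter can parse it in its constructor and build an endpoint-keyed lookup. A hedged sketch, assuming the config arrives as a JSON array of per-endpoint rules:

```python
import json

RAW_CONFIG = """[
  {"endpoint": "/search",
   "algorithm": "TokenBucket",
   "algoConfig": {"capacity": 1000, "refillRatePerSecond": 10}}
]"""

class RateLimiter:
    def __init__(self, raw_json):
        # Parse once at construction time; no hot-reloading afterwards.
        rules = json.loads(raw_json)
        self.rules_by_endpoint = {rule["endpoint"]: rule for rule in rules}

limiter = RateLimiter(RAW_CONFIG)
print(limiter.rules_by_endpoint["/search"]["algorithm"])  # TokenBucket
```

Keying the rules by endpoint up front makes the per-request lookup O(1) and keeps all parsing out of the hot path.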
Final Requirements
After that back-and-forth, here's what you'd write on the whiteboard:
Requirements:
1. Configuration is provided at startup (loaded once)
2. System receives requests with (clientId: string, endpoint: string)
3. Each endpoint has a configuration specifying:
- Algorithm to use (e.g., "TokenBucket", "SlidingWindowLog")
- Algorithm-specific parameters (e.g., capacity, refillRatePerSecond for Token Bucket)
4. System enforces rate limits by checking clientId against the endpoint's configuration
5. Return structured result: (allowed: boolean, remaining: int, retryAfterMs: long | null)
6. If endpoint has no configuration, use a default limit
Out of scope:
- Distributed rate limiting (Redis, coordination)
- Dynamic configuration updates
- Metrics and monitoring
- Config validation beyond basic checks

Core Entities and Relationships
Now that the requirements are clear, we need to figure out what objects make up the system. The trick is scanning the requirements for nouns that represent things with behavior or state. We start by considering each of the nouns in our requirements as candidates, pruning until we have a list of entities that make sense to model.
Class Design
RateLimiter
LimiterFactory
Limiter
RateLimitResult
Final Class Design
Implementation
LimiterFactory
RateLimiter
Rate Limiting Algorithms
TokenBucketLimiter
Complete Code Implementation
Verification
Extensibility
1. "How would you add a new rate limiting algorithm?"
2. "How would you handle dynamic configuration updates?"
3. "How would you handle thread safety for concurrent requests?"
4. "How would you handle memory growth from tracking many clients?"
What is Expected at Each Level?
Junior
Mid-level
Senior
