Rate Limiter
Understanding the Problem
🚦 What is a Rate Limiter?
A rate limiter controls how many requests a client can make to an API within a specific time window. When a request comes in, the rate limiter checks if the client has exceeded their quota. If they're under the limit, the request proceeds. If they've hit the cap, the request gets rejected. This protects APIs from abuse and ensures fair resource allocation across clients.
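To make that allow/deny flow concrete, here is a minimal sketch of our own using a fixed-window counter, one of the simplest rate limiting approaches. The class and parameter names (`FixedWindowLimiter`, `window_seconds`) are illustrative, not part of the interview prompt:

```python
import time

class FixedWindowLimiter:
    """Toy limiter: at most `limit` requests per client per `window_seconds`."""

    def __init__(self, limit, window_seconds=1.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self.counts = {}  # client_id -> (window_start, request_count)

    def allow(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        start, count = self.counts.get(client_id, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0   # window expired: start a fresh one
        if count >= self.limit:
            return False            # quota exhausted: reject
        self.counts[client_id] = (start, count + 1)
        return True                 # under the limit: proceed

limiter = FixedWindowLimiter(limit=2)
print([limiter.allow("alice", now=0.0) for _ in range(3)])  # [True, True, False]
```

The third request in the same one-second window is rejected; once the window rolls over, the counter resets and requests flow again.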
Requirements
When the interview starts, you'll get something that looks like this:
"You're building an in-memory rate limiter for an API gateway. The system receives configuration from an external service that provides rate limiting rules per endpoint. Each endpoint can have its own limit with a specific algorithm. Here's an example configuration for one endpoint:

{
  "endpoint": "/search",
  "algorithm": "TokenBucket",
  "algoConfig": {
    "capacity": 1000,
    "refillRatePerSecond": 10
  }
}

This config allows bursts up to 1000 requests, refilling at 10 requests per second. Your job is to build the in-memory rate limiter that enforces these rules."
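To get a feel for what capacity and refillRatePerSecond imply, here is a rough token bucket sketch of our own; `TokenBucket` and `try_consume` are assumed names, not given in the prompt:

```python
class TokenBucket:
    """Sketch: allow bursts up to `capacity`, refilling `refill_rate` tokens/second."""

    def __init__(self, capacity, refill_rate):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)  # start full, so a full burst is allowed
        self.last_refill = 0.0

    def try_consume(self, now):
        # Refill proportionally to elapsed time, capped at capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=1000, refill_rate=10)
burst = [bucket.try_consume(now=0.0) for _ in range(1001)]
print(burst.count(True))  # 1000 allowed at once; the 1001st is rejected
```

With the example config, a client can burn through 1000 requests instantly, then trickle in at 10 per second as tokens refill.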
Clarifying Questions
While this gave us a decent understanding of the problem, you'll want to spend some time asking the interviewer clarifying questions to ensure you have a complete understanding of the system you are building. Here's how the conversation might play out:
You: "I see the configuration includes algorithm-specific parameters. Are there different parameter sets for different algorithms?"

Interviewer: "Yeah, different algorithms need different parameters. So the algoConfig object always exists, but the parameters inside it vary."
Good. Now you know the config is heterogeneous. Different algorithms need different parameters.
You: "When a request comes in, what information do we receive? Client ID and endpoint, or something else?"

Interviewer: "Yes, exactly. Each request provides a client ID and an endpoint. The client ID is just a string that uniquely identifies who's making the request."
You: "What should we return when checking a request? Just allowed/denied, or more detail?"

Interviewer: "Return three things: whether it's allowed, how many requests remain in their quota, and if denied, when they can retry."
Now you know the return type needs structure, not just a boolean.
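A natural way to model that structured result is a small value object. This is our own sketch, not the article's implementation; the field names mirror the requirement the interviewer just described (allowed, remaining, retry-after):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class RateLimitResult:
    allowed: bool
    remaining: int                 # requests left in the client's quota
    retry_after_ms: Optional[int]  # set only when the request is denied

denied = RateLimitResult(allowed=False, remaining=0, retry_after_ms=250)
granted = RateLimitResult(allowed=True, remaining=41, retry_after_ms=None)
print(denied.allowed, granted.remaining)  # False 41
```

Making the result immutable (frozen) keeps callers from mutating a shared response object.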
You: "What happens if a request comes in for an endpoint we don't have configuration for?"

Interviewer: "Good question. Fall back to a default configuration. Don't reject requests just because we're missing config."
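That fallback behavior can be sketched as a simple lookup with a default. The values in DEFAULT_CONFIG below are invented for illustration; the prompt doesn't specify what the default limit should be:

```python
# Hypothetical default; the actual numbers would come from the team.
DEFAULT_CONFIG = {"algorithm": "TokenBucket",
                  "algoConfig": {"capacity": 100, "refillRatePerSecond": 1}}

configs = {
    "/search": {"algorithm": "TokenBucket",
                "algoConfig": {"capacity": 1000, "refillRatePerSecond": 10}},
}

def config_for(endpoint):
    # Unknown endpoints fall back to the default rather than being rejected.
    return configs.get(endpoint, DEFAULT_CONFIG)

print(config_for("/search")["algoConfig"]["capacity"])   # 1000
print(config_for("/unknown")["algoConfig"]["capacity"])  # 100
```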
You: "Should the system handle concurrent requests from multiple threads?"

Interviewer: "Don't worry about it to start. We'll get to it if we have time."
This is a common interviewer pattern. They'll say "don't worry about X" to let you start with something simple and get the core logic working first. It's usually not because they don't care about that aspect; it's because they want to see if you can build a clean foundation, then extend it later. If you finish the basic implementation with time to spare, they'll often circle back with "now how would you handle X?" That's your cue to discuss how you'd extend the design. We cover common extensions (including thread safety) in the extensibility section at the end.
You: "Just to clarify scope, are we building distributed rate limiting across multiple servers, or single-process in-memory?"

Interviewer: "Single process, in-memory. Keep it simple."
That's a huge simplification. No network coordination, no shared state across machines.
Designing a distributed rate limiter is a common system design interview question; you can find our breakdown in Design a Distributed Rate Limiter.
You: "And the configuration: is it dynamic, or loaded once at startup?"

Interviewer: "Loaded at startup. Don't worry about hot-reloading config while the system is running."
Perfect. You've scoped out what not to build.
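Since the configuration is loaded once at startup, the limiter can parse it in its constructor and build an endpoint-keyed lookup. A hedged sketch, assuming the config arrives as a JSON array of per-endpoint rules:

```python
import json

RAW_CONFIG = """[
  {"endpoint": "/search",
   "algorithm": "TokenBucket",
   "algoConfig": {"capacity": 1000, "refillRatePerSecond": 10}}
]"""

class RateLimiter:
    def __init__(self, raw_json):
        # Parse once at construction time; no hot-reloading afterwards.
        rules = json.loads(raw_json)
        self.rules_by_endpoint = {rule["endpoint"]: rule for rule in rules}

limiter = RateLimiter(RAW_CONFIG)
print(limiter.rules_by_endpoint["/search"]["algorithm"])  # TokenBucket
```

Keying the rules by endpoint up front makes the per-request lookup O(1) and keeps all parsing out of the hot path.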
Final Requirements
After that back-and-forth, here's what you'd write on the whiteboard:
Requirements:
1. Configuration is provided at startup (loaded once)
2. System receives requests with (clientId: string, endpoint: string)
3. Each endpoint has a configuration specifying:
- Algorithm to use (e.g., "TokenBucket", "SlidingWindowLog")
- Algorithm-specific parameters (e.g., capacity, refillRatePerSecond for Token Bucket)
4. System enforces rate limits by checking clientId against the endpoint's configuration
5. Return structured result: (allowed: boolean, remaining: int, retryAfterMs: long | null)
6. If endpoint has no configuration, use a default limit
Out of scope:
- Distributed rate limiting (Redis, coordination)
- Dynamic configuration updates
- Metrics and monitoring
- Config validation beyond basic checks

Core Entities and Relationships
Now that the requirements are clear, we need to figure out what objects make up the system. The trick is scanning the requirements for nouns that represent things with behavior or state. We start by considering each of the nouns in our requirements as candidates, pruning until we have a list of entities that make sense to model.
Class Design
RateLimiter
LimiterFactory
Limiter
RateLimitResult
Final Class Design
Implementation
LimiterFactory
RateLimiter
Rate Limiting Algorithms
TokenBucketLimiter
Complete Code Implementation
Verification
Extensibility
1. "How would you add a new rate limiting algorithm?"
2. "How would you handle dynamic configuration updates?"
3. "How would you handle thread safety for concurrent requests?"
4. "How would you handle memory growth from tracking many clients?"
What is Expected at Each Level?
Junior
Mid-level
Senior
