The System Design Interview: Building a URL Shortening Service like Bit.ly
By Evan King
Aug 13, 2023
System design interviews can be intimidating, especially if you're like most candidates who have never designed large-scale systems before. Perhaps even more stressful, the last time you interviewed might have been several years ago, and that was for a junior position where system design interviews weren't even part of the process. So, facing this for the first time, it's natural that you might feel apprehensive.
But there's no need to worry. Take a deep breath; I'm here to help.
As a former Staff Engineer at Meta, I conducted hundreds of interviews. Over time, I've seen exactly what makes the difference between getting hired and falling short, and I'm here to let you in on a few of those secrets.
In this blog, we're going to delve into one of the most common system design interview questions out there: designing a URL shortening service like bit.ly. Whether you're a novice or looking to brush up on your skills, this guide should equip you with the tools you need to excel in your next interview.
What is a System Design interview?
Let's start from the beginning. Unlike other interview formats, System Design interviews are unique in that they are largely candidate-driven. You have the wheel, so it's important to know where you're heading!
You should follow this simple roadmap:
- Understand the Requirements: Before you get started, you need to ensure you know exactly what it is you're building. This includes defining functional and non-functional requirements, asking questions, and estimating the scale of the system.
- High-level Design: Often using a whiteboard (virtual in many cases), you'll draw an overview of the system, outlining its architecture and the interactions between components, providing a general understanding of the structure.
- Data Model: Next, you'll want to define the data model and the relationships between the data, including the design of the database schema and selecting the type of database best suited for your system.
- Core Components: These are the heart and soul of your system. Identify the primary modules and functions that are vital to your solution, describing how they interact and collaborate to fulfill the system's purpose.
- Scalability: Explain how the system can grow, outlining methods to handle increased loads, identifying potential bottlenecks, and addressing strategies to ensure that your design can expand to meet future demands.
- Security, Monitoring & Testing: Outline the measures to ensure data integrity, privacy, and protection from threats, and describe how you'll monitor system performance and health, along with the testing approaches to make sure everything works as intended.
Designing a URL Shortening Service like Bit.ly
Ok, now that the housekeeping is out of the way, let's get down to business.
It's an early Monday morning, the coffee's just been drained from your mug, and you've fired up your laptop to boot up Zoom. The time has come for your first System Design interview. You join the call at 9 a.m. sharp, and after exchanging some pleasantries, your interviewer gets to the point: "So, for today's interview, I want you to design a URL shortening service, something similar to bit.ly." It's go time.
If you want to practice this very question with your own AI interviewer, navigate to Mock Interviews in the top nav, then click "URL Shortening Service like Bit.ly." I highly recommend it!
Start by understanding the requirements
Before we start calculating the square root of pi, first things first, let's get the requirements down pat. What is our system supposed to do? In a system design interview, an interviewer won't always explicitly mention every requirement. It's up to you, as the interviewee, to dig out these details through targeted questions. Making sure that you understand the requirements correctly is an essential exercise and one that an interviewer wants to see you perform.
For this example, let's walk through a likely set of functional requirements for our URL shortening service.
- Our service should be able to generate a unique short URL for each long URL submitted. These short URLs will be much easier to share, especially on social media platforms.
- When users access a short URL, the service should redirect them to the original long URL.
- Our service should ensure the uniqueness of the generated short URLs. No two long URLs should map to the same short URL.
- We should support analytics, showing users how often their short URL is accessed.
- We may even want to support custom vanity URLs. Instead of creating a random hash for the short URL, we'll allow users to provide us with a short string to use instead.
So far so good. Now moving onto our non-functional requirements:
- Our service needs to be highly scalable. This is due to its incredibly high potential usage, especially if our service gains popularity on social media.
- Our service also needs to be highly available. Our users will expect almost instantaneous redirection from the short URL to the original one. We simply can't afford downtime or slow responses.
- It's important that our system has low latency for redirection. The time it takes to redirect a user from the short URL to the original long URL should be in the same order of magnitude as directly navigating to the original URL.
The list we just walked through is not exhaustive, and during an actual interview, you should tailor your questions based on the prompts from your interviewer. But for the purpose of our exercise today, this list will cover our bases.
Estimating usage and system capacity
Now that we know what our system is expected to do, let's try to estimate how much it needs to do it. We need to gauge the number of users and the amount of storage our system will require.
Let's assume our platform became wildly popular and we managed to capture a 25% market share of social media users who share shortened links. If there are around 1 billion daily active users across all social media platforms and about 1% of them post shortened links each day, that's 10 million daily link posters; a 25% share puts our Daily Active Users (DAU) in the ballpark of 2.5M.
For storage requirements, let's do some quick math. Assume each record in our database is around 1KB: the original URL (about 100 bytes), the short URL (about 8 bytes), and some metadata and analytics (say 500 bytes), rounded up for headroom. At 2.5M new records per day, that's roughly 2.5GB per day, or a bit over 900GB per year. Keeping in mind backups, redundancy, and future growth, we can safely estimate system storage requirements at about 1TB per year.
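These estimates can be sanity-checked with a few lines of arithmetic. The inputs below are the assumptions from this section, not measurements:

```python
# Back-of-envelope estimate using the assumptions above:
# 1B social media DAU, 1% share shortened links daily, we capture 25%,
# and each stored record is roughly 1KB.
social_dau = 1_000_000_000
sharing_rate = 0.01      # fraction of users posting shortened links each day
market_share = 0.25      # fraction of those users on our service
record_size_kb = 1       # original URL + short URL + metadata/analytics

daily_active_users = social_dau * sharing_rate * market_share  # 2.5 million
daily_storage_gb = daily_active_users * record_size_kb / 1_000_000  # ~2.5 GB/day
yearly_storage_tb = daily_storage_gb * 365 / 1_000  # ~0.91 TB/year, round to 1TB
```

Rounding up to 1TB per year leaves headroom for backups, redundancy, and growth.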
Let's just take a moment here. Interviewers love to throw curve-balls, and one might come in the form of the CAP theorem. You might get asked where to put emphasis for our system: consistency or availability? For a URL shortening service like ours, we prioritize availability. Users expect our service to be up and running, and to respond quickly. It's okay if our system takes a while to become consistent (URL mappings might exhibit slight inconsistencies for a brief period).
In the majority of cases, especially for non-financial systems, availability is prioritized over consistency. When in doubt, favor availability.
High-level design and API endpoints
Now comes the fun part! We have a broad understanding of what our system is supposed to do and how heavily it'll be used. It's time now to draw the major components of our system.
We'll be dealing with a lot of terms – User Interface, API Layer, Business Logic Layer, Distributed Database, Distributed Cache, and Load Balancer. Don't let the jargon scare you away! Each term just describes a single piece of the puzzle.
On the frontend, the User Interface is responsible for taking user input (the long, unwieldy URLs) and returning the short, pretty URLs. The API Layer acts as the communication channel between the UI and the core logic of the application, processing requests and responses. The main component of our system is the Business Logic Layer (or application servers), which houses the core functionality - generating the short URLs and maintaining the mapping between short and long URLs. We'll store this mapping in a distributed database.
To enhance performance, a distributed cache stores frequently accessed URLs, while a load balancer distributes requests across multiple servers, ensuring no single server gets overwhelmed. To complete the architecture, we include a dedicated service for monitoring, logging, and analytics. Our logs are securely housed in S3, and our analytics service writes directly to our main NoSQL database for swift access.
Analytics data is often stored in a SQL database due to its robust support for structured data and ability to handle complex queries. In this case, we opted for simplicity, making the system more maintainable and aligning with the specific requirements of our analytics needs (where the analytics should be simple, not requiring complex aggregations). This choice would be a great tradeoff to discuss with your interviewer! Proposing a polyglot approach (using both SQL and NoSQL) can be impressive, but it's even better when you can thoughtfully identify that such an approach may be overkill for the given scenario, demonstrating a nuanced understanding of system design principles.
What about the API?
While interviewing, translating requirements into API endpoints might seem a bit too nitty-gritty, but it's an exercise that demonstrates how detail-oriented you are in understanding and explaining requirements. Endpoints like POST /shorten, GET /:shortUrl, GET /:shortUrl/details, POST /users, and GET /users/:id would be essential to a system like this, and clearly plotting out requests, responses, and the purpose behind each endpoint will show your interviewer a clear picture of your thinking process.
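To make the endpoints concrete, here is a sketch of what the request/response contracts might look like. The field names and example values are illustrative assumptions, not a real bit.ly API:

```python
# Hypothetical request/response shapes for the core endpoints.
# All field names and example values here are illustrative.
api_contract = {
    "POST /shorten": {
        "request": {
            "longUrl": "https://example.com/some/very/long/path",
            "customAlias": None,  # optional vanity URL string
        },
        "response": {"shortUrl": "https://bit.ly/aB3xY9z"},
    },
    "GET /:shortUrl": {
        # No JSON body: the server answers with a 302 redirect to the
        # original long URL, or a 404 if the code is unknown.
        "response": "302 redirect (404 if not found)",
    },
    "GET /:shortUrl/details": {
        "response": {"longUrl": "...", "clicks": 1024, "createdAt": "..."},
    },
}
```

Sketching the contract like this shows the interviewer you've thought about what each endpoint actually exchanges, not just its name.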
Feel good so far? Great, because we're just starting to scrape beneath the surface!
Choosing the right database
Once we have our system's core functionality down, we need to set our sights on the right database. If you're having heart palpitations trying to decide between SQL and NoSQL, don't! For our purposes, we go NoSQL. Why? We need a database that provides high performance and low latency, and a key-value store like DynamoDB or Redis gives us exactly that. A key-value store allows for easy scalability and is ideal for simple key-value mappings (shortened URL to original URL). This makes it preferable to a SQL database, which is best suited to complex data models and queries.
But we're not just going to slap a NoSQL database into our system and call it a day. To make sure our system stays on its feet when things get rocky, we need to ensure high data availability, efficient replication, and accurate synchronization. Planning for redundancy is a must to keep our service highly available and responsive. In the case of data replication, while the nature of key-value stores generally means eventual consistency, an asynchronous replication strategy would be a good fit for our system or any other read-heavy systems. It ensures lower latency, which is a critical factor for a URL shortening service. The difference between updated states of data across different nodes (known as synchronization delay) must be kept within acceptable bounds to uphold the quality of service. Monitoring and managing this delay gets one step easier with tools like Amazon DynamoDB which offer built-in support for eventual consistency.
Calling out specific technologies (e.g., DynamoDB) can impress the interviewer, but tread carefully. Only mention technologies you have decent familiarity with. You may face follow-up questions like, "Why DynamoDB instead of Cassandra?" or, even more challenging, "How would you model your access patterns around DynamoDB's global secondary indexes, compared to Cassandra's query-first table design?" Being prepared to discuss these specific differences and how you might address or leverage them in your design will showcase your deep understanding and ability to think strategically about technology choices. But if you don't have the necessary experience, you're setting yourself up to get egg on your face.
Generating short URLs and combatting potential collisions
Let's get into the nitty-gritty of generating shorter and unique URLs. After all, it's what our service is all about. The good news? You don’t need to reinvent the wheel. A combination of hashing and encoding techniques gets this done for us.
We start by creating a hash of the input URL using a hashing algorithm like MD5 or SHA-256. This hash then gets encoded using a URL-friendly character set, such as Base62 (which uses alphanumeric characters: A-Z, a-z, and 0-9). We truncate the encoded hash to a defined length, like 6 or 8 characters, and voila, we have a shortened URL!
Here is an illustration of how this could look in Python for reference. Note that it's very unlikely an interviewer will ask you to actually write any of the code.
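A minimal sketch, assuming MD5 for the hash, Base62 for the encoding, and a 7-character code (function names and the length are illustrative choices):

```python
import hashlib

BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def base62_encode(num: int) -> str:
    """Encode a non-negative integer using the Base62 alphabet."""
    if num == 0:
        return BASE62[0]
    chars = []
    while num > 0:
        num, rem = divmod(num, 62)
        chars.append(BASE62[rem])
    return "".join(reversed(chars))


def shorten(long_url: str, length: int = 7) -> str:
    """Hash the URL with MD5, Base62-encode the digest, then truncate."""
    digest = hashlib.md5(long_url.encode()).hexdigest()
    return base62_encode(int(digest, 16))[:length]
```

Because the hash is deterministic, the same long URL always yields the same short code, which is a nice property for idempotent writes.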
Now, I know what you're thinking: what if our hashing and encoding generate the same short URL for two different long URLs? We've got it covered. Thanks to the "write if not exists" operation in most NoSQL key-value stores, our system will attempt to store the newly generated shortened URL only if it does not already exist in the database. If our attempt fails because of a collision, we regenerate the hash using a different hashing function or by appending a counter to the original URL before rehashing. We keep this up until our shortened URL is as unique as our users.
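The collision-handling retry loop could be sketched like this, with a Python dict's `setdefault` standing in for a key-value store's "write if not exists" conditional put. For brevity this sketch truncates hex digests rather than Base62-encoding, and the counter-appending scheme is one illustrative choice:

```python
import hashlib


def shorten_with_retry(long_url: str, store: dict, length: int = 7,
                       max_attempts: int = 5) -> str:
    """Generate a short code; on collision, append a counter and rehash.

    `store` stands in for a key-value database. dict.setdefault writes
    only when the key is absent, mimicking "write if not exists".
    """
    for attempt in range(max_attempts):
        candidate = long_url if attempt == 0 else f"{long_url}#{attempt}"
        code = hashlib.md5(candidate.encode()).hexdigest()[:length]
        # Conditional write: succeeds only if the code is unused, or
        # already maps to this same URL (an idempotent re-shorten).
        existing = store.setdefault(code, long_url)
        if existing == long_url:
            return code
    raise RuntimeError("could not find a collision-free short code")
```

In a real system the conditional write would be a single atomic database operation (e.g., a DynamoDB put with an `attribute_not_exists` condition), so two concurrent writers can't both claim the same code.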
Let's talk redirection
Assuming we have generated a unique URL for each of our users, the bread and butter of our service comes down to redirecting them to the original URL when the short one is accessed. Seems simple, no? Stay with me.
We conduct a database lookup when the shortened URL is accessed. Upon retrieving the original URL, we respond with an HTTP status of 301 (Moved Permanently) or 302 (Found, a temporary redirect), and the user gets sent to the original URL. (We'll opt for 302 here, so we can capture analytics about how often the short URL is accessed.) In the rare case that our database has no record of the short URL, we return a 404 (Not Found) error.
The interviewer may very well ask you which HTTP status you'd use for the redirect and, unlike many subjective questions you'll be asked in the interview, there is a right answer this time! If you mentioned analytics in your requirements, you'll need to opt for 302 so that the browser does not cache the redirect. This allows each redirect to be logged and analyzed, enabling the collection of valuable information such as how many times the short URL was accessed, and from which geographical locations, devices, or referring sites.
Using a 301 (Moved Permanently) in this context would be incorrect because most browsers cache this redirect, bypassing the server on subsequent requests and preventing the capture of any analytics.
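The redirect path boils down to a lookup and a status code. A framework-agnostic sketch, with a dict standing in for the database lookup (names are illustrative):

```python
def handle_redirect(short_code: str, store: dict):
    """Resolve a short code to an HTTP response tuple (status, location).

    Returns 302 with the long URL so browsers don't cache the redirect
    and every hit reaches our servers for analytics; unknown codes 404.
    """
    long_url = store.get(short_code)
    if long_url is None:
        return 404, None
    # An analytics hook would go here: count the hit, log the
    # referrer, geography, device, etc.
    return 302, long_url
```

In a real web framework the tuple would become a redirect response with a `Location` header, but the decision logic is exactly this small.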
Ensuring scalability and identifying your bottleneck
Let’s circle back to scale - Can our service handle the rigors of widespread usage? This is a question we need to answer by focusing on horizontal scaling and caching.
As our user base grows, we need to distribute data across multiple servers to accommodate the higher load. Here's where our handy-dandy NoSQL database shines, due to its propensity for horizontal scaling. For an even distribution of incoming requests across servers, a trusty load balancer is our superhero.
However, just having more servers isn’t going to help if our bottleneck is read operations on the database. If our service is read-heavy, hammering those read operations on the database may mean an unhappy user. For this, we look to the superhero's sidekick - Caching. A caching layer between our app and database reduces read operations, as frequently accessed URLs will be stored and retrieved from the cache. Presto, happier users, less frustrated engineers!
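The cache-aside pattern described above can be sketched with a small in-memory LRU standing in for a distributed cache like Redis. This is a single-process sketch, not production code, and the class and function names are illustrative:

```python
from collections import OrderedDict


class LruCache:
    """Tiny LRU cache standing in for a distributed cache like Redis."""

    def __init__(self, capacity: int = 1000):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict least recently used


def resolve(short_code: str, cache: LruCache, db: dict):
    """Cache-aside lookup: try the cache first, fall back to the DB."""
    url = cache.get(short_code)
    if url is None:
        url = db.get(short_code)  # the expensive read we want to avoid
        if url is not None:
            cache.put(short_code, url)
    return url
```

Because URL access tends to be heavily skewed toward a small set of hot links, even a modest cache absorbs most reads before they hit the database.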
Hungry for more? We can serve our increasing dataset by employing consistent hashing, a technique to distribute the data evenly across multiple database nodes, making for quick data retrieval. With these strategies coupled together, our system should be able to handle even the Mondayest of Monday loads.
Remember the potential performance bottleneck we talked about earlier? That pesky latency? Here's where the dynamic duo of caching and read replicas come back into play. Caching can store and provide quick access to frequently accessed URLs, significantly reducing reads. Under extreme loads, read replicas of our database can be created. These replicas distribute load across multiple nodes, thus preventing individual servers from collapsing under the weight.
Securing the system and monitoring performance
In designing our URL shortening service, two key security measures would be URL Validation and Rate Limiting.
Before our system goes gangbusters creating short links, we need to ensure users aren’t trying to circulate malicious URLs. For this, we check the domain of the long URL against a blacklist of known harmful websites and scan it for common vulnerabilities. Additionally, we provide users with a preview of their destination before they redirect, reducing the risk of them landing on harmful websites.
To further prevent misuse of our service through spamming or DDoS attacks, we would also implement rate limiting. This restrains the number of requests a single client can make in a specific time frame.
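A minimal fixed-window rate limiter illustrates the idea. This in-process sketch is for illustration only; a real deployment would keep the counters in a shared store such as Redis so every server enforces the same limit:

```python
import time
from collections import defaultdict


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per client per `window` seconds."""

    def __init__(self, limit: int = 100, window: float = 60.0):
        self.limit = limit
        self.window = window
        self._counts = defaultdict(int)

    def allow(self, client_id: str, now=None) -> bool:
        """Return True if the request is within the client's budget."""
        now = time.monotonic() if now is None else now
        # All requests in the same window share one counter bucket.
        bucket = (client_id, int(now // self.window))
        if self._counts[bucket] >= self.limit:
            return False
        self._counts[bucket] += 1
        return True
```

Fixed windows are the simplest scheme to explain in an interview; you can then mention sliding windows or token buckets as refinements that smooth out bursts at window boundaries.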
All this fantastic information is useless unless we keep an eye on how we're doing. We monitor two main performance metrics: response times and error rates. Using real-time analytics for response times and application and server logs for error rates, we set up regular health checks to ensure our components are functioning optimally. A comprehensive dashboard provides an at-a-glance overview of system performance, enabling the team to monitor the system effectively.
While explaining your security and monitoring strategies, don't forget to showcase your system's robustness, reliability, and user-centric approach.
Testing functionality and reliability
To the final frontier - testing. The aim is simple: Make sure everything works as it should!
To ensure that the URL shortening process, the integration between components, and the handling of user requirements all work perfectly, we conduct Unit Testing, Integration Testing, and System Testing. Together these are powerful tools to ensure functionality and reliability. But to truly understand our system's strength, we must subject it to stress.
Load Testing, the act of simulating heavy traffic and testing our system’s performance, will help identify bottlenecks and confirm our system's ability to handle large volumes of users. Stress testing, on the other hand, tests our system's behavior under extreme conditions, revealing breaking points, and allowing for improvements in our system's resilience.
By using an automated testing framework, we can conduct regular health checks, detect early bugs, and ensure faster deployments of functionalities. A few testing iterations and you'll have a system as crisp as a freshly ironed shirt.
And there you have it! You should now have a pretty solid understanding of what a System Design interview is and how you would go about building a URL shortener. But remember, this process isn't about memorizing the perfect answers; it's about understanding the thought process behind designing scalable, reliable systems.
You can practice this exact question (and dozens of others) with Hello Interview AI. You'll be able to answer real system design interview questions on an interactive whiteboard and receive instant AI feedback!
Rehearse, revise, and be ready to ace your next system design interview.
Evan, Co-founder of Hello Interview and former Tech Lead at Meta, possesses a unique vantage point, having been on both sides of the tech hiring process. With a track record of conducting hundreds of interviews and securing offers from top tech companies himself, he is now on a mission to help others do the same.