Design a Matchmaking Service for Multiplayer Games
Design a matchmaking service that allows users to join waiting queues, get grouped by skill level into teams of 16, and be allocated to new game servers for multiplayer sessions.
Asked at:
Roblox
A multiplayer matchmaking service is the control plane behind games like Fortnite, Valorant, or Roblox experiences that lets players press Play, wait in a queue, get grouped with similarly skilled players (e.g., 16 per match), and then jump together into a fresh game server. It orchestrates queues by game/region/skill, forms fair teams, reserves players atomically, and allocates server capacity so everyone loads in at the same time. Interviewers ask this to see if you can design a high-throughput, low-latency coordination system under contention. They’re probing your ability to partition hot queues, avoid global locks, manage idempotent multi-step workflows (queue → reserve → allocate server), balance fairness vs. wait time, and deliver real-time updates at scale. Expect tradeoffs, failure handling, and capacity/backpressure thinking, not just a data model.
Common Functional Requirements
Most candidates end up covering this set of core functionalities
Users should be able to join a matchmaking queue for a specific game, region, and mode, while being restricted to one active queue at a time.
Users should be able to cancel or leave the queue at any time and receive prompt confirmation that their state has been updated.
Users should be grouped into fair 16‑player matches by skill (and latency/region), with matching rules that can relax over time to keep wait times reasonable.
Users should receive real-time status updates (queued, forming match, assigned server) and the final server details to join when a match is allocated.
Common Deep Dives
Common follow-up questions interviewers like to ask for this question
This is the core quality vs. latency trade-off. Interviewers want to see dynamic policies that adapt to a player's wait time and the current queue distribution, not a single fixed rule that starves outliers.
- You could bucket by MMR/Elo bands within a region, then widen the acceptable skill window and include nearby regions as a player's queue time increases to meet SLOs.
- You could track queue age percentiles per bucket and merge adjacent buckets when tails age, ensuring you don't leave edge-skill players stranded.
- You could tune per-game policies (band size, widen rate, max radius/region hop) using live histograms of skill distribution and regional capacity.
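The widening idea above can be sketched in a few lines. This is a minimal illustration, not a prescribed policy: `WidenPolicy`, its default numbers, and the rule that two players are compatible when their skill gap fits the wider of their two windows are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WidenPolicy:
    # All three values are hypothetical tuning knobs, not from the question.
    base_window: int = 5        # starting tolerance, in skill points (1-100 scale)
    widen_per_sec: float = 0.5  # extra tolerance gained per second of queue time
    max_window: int = 40        # cap so matches never become arbitrary


def skill_window(policy: WidenPolicy, waited_sec: float) -> int:
    """Acceptable +/- skill gap for a player who has waited `waited_sec`."""
    widened = policy.base_window + policy.widen_per_sec * waited_sec
    return min(policy.max_window, int(widened))


def compatible(policy: WidenPolicy, skill_a: int, waited_a: float,
               skill_b: int, waited_b: float) -> bool:
    """Assumed rule: the longer-waiting player's wider window decides."""
    window = max(skill_window(policy, waited_a), skill_window(policy, waited_b))
    return abs(skill_a - skill_b) <= window
```

Using the wider window favors short wait times for long-waiting players; using the narrower one would favor match quality for new arrivals. Which side to pick is exactly the trade-off this deep dive is about.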
Multiple workers will read hot queues concurrently. You need a safe way to pop 16 players, mark them reserved, and publish a match without races or global locks.
- You could shard queues by game/region/skill-band and use atomic operations (e.g., Redis Lua/MULTI) to pop a batch and move players to a reserved set with a matchId and TTL.
- You could make every downstream step idempotent by carrying matchId and user reservation tokens; retries check reservation state to avoid duplicates.
- You could use a small, per-shard coordinator (or consistent hashing) so no two workers contend on the same shard key, keeping throughput high.
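A rough in-memory sketch of the pop-and-reserve step follows. It assumes a per-shard `threading.Lock` as a stand-in for the atomicity a Redis Lua script would provide; `QueueShard` and its methods are hypothetical names, and the reservation TTL is left out for brevity.

```python
import threading
import uuid
from collections import deque

MATCH_SIZE = 16


class QueueShard:
    """One shard (e.g. one game/region/skill-band). The lock is per shard,
    so workers on different shards never contend with each other."""

    def __init__(self):
        self._lock = threading.Lock()
        self._waiting = deque()   # user ids in arrival order
        self._state = {}          # user_id -> None (waiting) or match_id (reserved)

    def join(self, user_id):
        with self._lock:
            if user_id in self._state:
                return False      # already queued or reserved: one active queue only
            self._state[user_id] = None
            self._waiting.append(user_id)
            return True

    def reserve_batch(self):
        """Atomically pop MATCH_SIZE players and mark them reserved.
        Returns (match_id, players), or None if the shard is too small."""
        with self._lock:
            if len(self._waiting) < MATCH_SIZE:
                return None
            match_id = uuid.uuid4().hex
            players = [self._waiting.popleft() for _ in range(MATCH_SIZE)]
            for user_id in players:
                self._state[user_id] = match_id
            return match_id, players
```

Because the size check, the pop, and the reservation write all happen under one lock, two workers can never hand the same player to two matches. Downstream steps would carry the returned `match_id` so retries can be detected.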
Server allocation is a separate control plane that must be fast and predictable. Interviewers look for warm capacity, leasing, and backpressure when pools run dry.
- You could maintain a warm pool of empty servers per game/region and lease one via a lightweight allocator API; if empty, queue a provisioning request and put the match in 'awaiting server' with a timeout.
- You could apply backpressure by slowing match formation or expanding acceptable regions when allocator latency/queue depth crosses thresholds; autoscale warm pools from metrics.
- You could make allocations leases with expirations so crashed flows return servers to the pool automatically.
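One way to picture leasing with expirations is the sketch below. `WarmPool`, its method names, and the 30-second default TTL are assumptions for illustration; a real allocator would sit behind an API and persist its leases.

```python
import time


class WarmPool:
    """Hypothetical warm pool for one game/region with expiring leases."""

    def __init__(self, server_ids, lease_ttl_sec=30.0, clock=time.monotonic):
        self._free = list(server_ids)
        self._leases = {}            # server_id -> (match_id, expires_at)
        self._ttl = lease_ttl_sec
        self._clock = clock          # injectable so expiry can be tested

    def _reap_expired(self):
        now = self._clock()
        for server_id, (_, expires_at) in list(self._leases.items()):
            if expires_at <= now:
                # The owning flow crashed or stalled: return the server.
                del self._leases[server_id]
                self._free.append(server_id)

    def lease(self, match_id):
        """Return a server id, or None so the caller can park the match
        in 'awaiting server'. Retries with the same match_id are idempotent."""
        self._reap_expired()
        for server_id, (owner, _) in self._leases.items():
            if owner == match_id:
                return server_id
        if not self._free:
            return None
        server_id = self._free.pop()
        self._leases[server_id] = (match_id, self._clock() + self._ttl)
        return server_id

    def confirm(self, server_id, match_id):
        """Players joined: the lease is consumed and the server leaves the pool."""
        self._reap_expired()
        lease = self._leases.get(server_id)
        if lease is None or lease[0] != match_id:
            return False
        del self._leases[server_id]
        return True
```

A `None` from `lease` is the signal that could drive backpressure: the caller might slow match formation or trigger provisioning rather than spin on the allocator.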
Real users churn, lose connections, or change their minds. Your workflow needs explicit states, timeouts, and push updates so players aren't stranded and capacity isn't wasted.
- You could model a state machine (queued, reserved, assigned, confirmed) with TTLs/heartbeats; expired reservations release slots and optionally backfill from a standby list.
- You could make join/leave/confirm idempotent using client tokens so reconnecting clients can resume their reservation or receive a clean cancel.
- You could push real-time updates via WebSockets or server-sent events per match/user to avoid heavy polling and improve UX; fall back to short-polling if needed.
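The state machine above could be written down as an explicit transition table, roughly as follows. The `CANCELLED` state and the fall-back edges to `QUEUED` (for an expired reservation or a lost server) are assumptions added for the sketch; the four named states come from the text.

```python
from enum import Enum


class TicketState(Enum):
    QUEUED = "queued"
    RESERVED = "reserved"
    ASSIGNED = "assigned"
    CONFIRMED = "confirmed"
    CANCELLED = "cancelled"   # assumed terminal state for leave/disconnect


# Which moves are legal from each state. Falling back to QUEUED models an
# expired reservation TTL or a failed server allocation (an assumption).
ALLOWED = {
    TicketState.QUEUED: {TicketState.RESERVED, TicketState.CANCELLED},
    TicketState.RESERVED: {TicketState.ASSIGNED, TicketState.QUEUED,
                           TicketState.CANCELLED},
    TicketState.ASSIGNED: {TicketState.CONFIRMED, TicketState.QUEUED,
                           TicketState.CANCELLED},
    TicketState.CONFIRMED: set(),
    TicketState.CANCELLED: set(),
}


def transition(current: TicketState, target: TicketState) -> TicketState:
    """Apply a transition; repeating the current state is a no-op so that
    client retries after a reconnect stay idempotent."""
    if target == current:
        return current
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Keeping the table explicit makes it easy to reject stale messages, such as a late confirm arriving after a cancel, instead of silently corrupting a player's state.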
Relevant Patterns
Relevant patterns that you should know for this question
Hot queues will see hundreds of joins per second. Sharding by game/region/skill, atomic batch pops, and idempotent operations prevent double-assignment and eliminate global locks while keeping throughput predictable.
Matchmaking spans multiple steps—queue, reserve players, allocate a server, confirm/join—and each can fail. A saga-like state machine with TTLs and compensations ensures stuck reservations are reclaimed and matches complete reliably.
Players expect immediate feedback on queue position and when to join the server. Pub/sub and push channels reduce polling pressure and keep user experience responsive during formation and allocation.
Similar Problems to Practice
Related problems to practice for this question
Like matching riders and drivers, you balance proximity/latency and quality (skill vs. ETA), avoid double-assignments under concurrency, and handle cancellations with backfill—all under bursty, regional demand.
Forming a match resembles batching jobs; allocating a game server is like assigning a worker. Both require multi-step orchestration, leases/TTLs, idempotency, and backpressure to keep the system stable during spikes.
Scored matching with dynamic widening of acceptance criteria over time mirrors skill-based matchmaking trade-offs between match quality and wait time, especially for users at the distribution tails.
Question Timeline
See when this question was last asked and where, including any notes left by other candidates.
Late September, 2025
Roblox
Senior
Early May, 2025
Roblox
Staff
Roblox hosts millions of multiplayer game sessions daily. It is planning to release a matchmaking service for game developers such that users can join a waiting queue, be grouped together by skill level, and join a new gameserver at the same time. A popular game might have 500k concurrent players, with hundreds of players joining per second. A user may have a skill level, e.g. userId : skill (1-100). Design a service to help organize these users into groups of 16 before allocating them a new empty gameserver. A user may only queue for 1 game at a time.
You may assume the following ID types exist:
long GameID
long UserID
long ServerId
And the following table:
Table UserSkill
long GameId
long UserId
int skill
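A quick back-of-envelope pass over these numbers might look like the sketch below. The 500k and 16 figures come from the prompt; the 300 joins per second is an assumed stand-in for "hundreds", and the server count assumes every concurrent player sits in a full 16-player session.

```python
MATCH_SIZE = 16                # from the prompt
CONCURRENT_PLAYERS = 500_000   # from the prompt
ASSUMED_JOINS_PER_SEC = 300    # assumption: one reading of "hundreds per second"

# Upper bound on live game servers if all players are in full sessions.
servers_at_peak = CONCURRENT_PLAYERS // MATCH_SIZE

# Steady-state match formation rate, and so server allocations per second.
matches_per_sec = ASSUMED_JOINS_PER_SEC / MATCH_SIZE

print(servers_at_peak)   # 31250
print(matches_per_sec)   # 18.75
```

Under these assumptions the allocator needs to hand out on the order of twenty servers per second for one popular game, which gives a first target for sizing the warm pool.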