Protocol Choices
Transport choice
- TCP: Default for most designs; reliable, ordered delivery with flow and congestion control.
- UDP: Use when low latency matters and some loss is fine; no delivery or ordering guarantees.
API style
- REST: Default for interviews and public APIs; simple, flexible, but less performant than binary RPC.
- GraphQL: Use for flexible clients with changing views; solves over-fetching and under-fetching.
- gRPC: Use for internal service calls when performance matters; binary and more strongly typed than JSON.
Realtime channel
- SSE: Use for server push over HTTP; good for notifications, but connections can drop, so clients must reconnect.
- WebSockets: Use for high-frequency bidirectional traffic; avoid unless persistent stateful connections are needed.
- WebRTC: Use mainly for audio/video calling; niche, peer-to-peer, and painful to get right.
Load balancing
Where balancing happens
- Client-side load balancing: Best for controlled internal clients, like microservices, that can keep backend lists fresh.
- DNS: Acts like client-side load balancing; good for many clients, but backend changes take effect only as cached records expire, so updates are bounded by TTL.
- Dedicated load balancer: Use when clients should not know backend hosts or when routing updates must be fast.
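A minimal sketch of client-side load balancing, assuming plain round-robin selection. The class and method names are hypothetical, and the source of the fresh backend list (service discovery, config push, and so on) is left out.

```python
import threading


class RoundRobinBalancer:
    """Client-side balancer that cycles through a refreshable backend list."""

    def __init__(self, backends):
        self._lock = threading.Lock()
        self._backends = list(backends)
        self._index = 0

    def update_backends(self, backends):
        # Called whenever the client learns of a new backend list
        # (the discovery mechanism itself is assumed, not shown).
        with self._lock:
            self._backends = list(backends)
            self._index = 0

    def next_backend(self):
        # Pick the next backend in rotation; the lock keeps the counter
        # consistent when several threads issue requests at once.
        with self._lock:
            if not self._backends:
                raise RuntimeError("no backends available")
            backend = self._backends[self._index % len(self._backends)]
            self._index += 1
            return backend


balancer = RoundRobinBalancer(["10.0.0.1:8080", "10.0.0.2:8080"])
first = balancer.next_backend()
second = balancer.next_backend()
```

The client owns the list, which is why this approach fits controlled internal callers: a stale list sends traffic to hosts that no longer exist.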
Layer 4 vs Layer 7
- Layer 4 load balancer: Routes by IP and port without inspecting the payload; good for WebSockets and other persistent-connection protocols.
- Layer 7 load balancer: Terminates HTTP and routes by URL, headers, or cookies; default for HTTP-based traffic.
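Layer 7 routing can be pictured as an ordered rule table checked against each request. The sketch below is illustrative only: the rule format, pool names, and the `pick_pool` function are assumptions, not any real load balancer's configuration.

```python
def pick_pool(path, headers, rules, default_pool):
    """Return the pool of the first rule that matches the request."""
    for rule in rules:
        prefix = rule.get("path_prefix")
        header = rule.get("header")  # (name, value) pair, or None
        if prefix is not None and not path.startswith(prefix):
            continue
        # Exact-match header lookup for brevity; real HTTP header names
        # are case-insensitive.
        if header is not None and headers.get(header[0]) != header[1]:
            continue
        return rule["pool"]
    return default_pool


# Hypothetical rule table: first match wins.
RULES = [
    {"path_prefix": "/api/", "pool": "api-servers"},
    {"header": ("X-Canary", "1"), "pool": "canary-servers"},
    {"path_prefix": "/static/", "pool": "static-servers"},
]

pool = pick_pool("/api/users", {}, RULES, default_pool="web-servers")
```

A Layer 4 balancer cannot make these decisions because it never parses the HTTP request.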
Failure handling
- Timeouts: Set explicit limits on network calls so slow dependencies do not stall the whole system.
- Retries with exponential backoff: Use for transient failures; add jitter so clients do not retry in lockstep.
- Idempotency key: Required for retried writes with side effects, like payments, to avoid duplicate execution.
- Circuit breakers: Open after repeated failures, fail fast, and probe recovery later to prevent cascades.
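The circuit breaker behavior described above (open after repeated failures, fail fast, probe later) can be sketched as follows. This is a single-threaded illustration under assumed defaults; the class name, threshold, and timeout values are hypothetical, and the clock is injectable only to make the behavior easy to exercise.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Open: fail fast instead of hitting the sick dependency.
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: fall through and let one probe call run.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            probe_failed = self._opened_at is not None
            if probe_failed or self._failures >= self.failure_threshold:
                # Open the circuit, or re-open it after a failed probe.
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A production breaker would also need locking and a limit on concurrent probe calls; both are omitted here.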
Latency and Locality
Key latency numbers
- Nearby server request: <1ms.
- NY to London round trip: >80ms.
- Theoretical NY-London minimum: around 56ms.
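The theoretical minimum can be reproduced with back-of-the-envelope arithmetic. The inputs below are assumptions not stated in these notes: a great-circle distance of roughly 5,570 km and light traveling through fiber at about two-thirds of its vacuum speed.

```python
# Assumed inputs (approximations, not taken from the notes).
NY_LONDON_KM = 5_570             # approximate great-circle distance
FIBER_SPEED_KM_PER_S = 200_000   # roughly two-thirds of c in vacuum


def min_round_trip_ms(distance_km, speed_km_per_s=FIBER_SPEED_KM_PER_S):
    """Lower bound on round-trip time from propagation delay alone."""
    one_way_s = distance_km / speed_km_per_s
    return 2 * one_way_s * 1000


print(round(min_round_trip_ms(NY_LONDON_KM), 1))  # 55.7
```

Real cables do not follow the great-circle path, and routers add queuing and processing delay, which is why measured round trips land above this bound.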
Placement strategy
- CDN: Use for globally read-heavy, cacheable content so edge servers answer requests close to users.
- Regional partitioning: Use when queries are region-local; keep services and databases co-located with local users.
Interview defaults
- Use TCP + HTTPS: The default stack; deviate only when requirements clearly push you elsewhere.
- Use REST for public APIs: Reach for something else only when flexibility, binary efficiency, or streaming is core.
- Use REST internally too: Bring up gRPC only when service-to-service performance matters.
- Don't jump to WebSockets: Prove persistent, bidirectional communication is needed first.
- Retry with exponential backoff: The reliability answer interviewers expect; senior-level follow-ups may ask about jitter.
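A minimal sketch of retrying with exponential backoff and jitter, assuming the "full jitter" variant (sleep a random time between zero and the current backoff cap). The function name, defaults, and the set of retryable exceptions are illustrative choices; `sleep` and `rng` are injectable only so the behavior can be checked without waiting.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=0.1, max_delay=5.0,
                       retry_on=(ConnectionError, TimeoutError),
                       sleep=time.sleep, rng=random.random):
    """Call func, retrying transient failures with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The cap doubles each attempt, up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: a random delay in [0, cap) keeps clients
            # from retrying in lockstep.
            sleep(rng() * cap)
```

Only wrap calls that are safe to repeat; a retried write with side effects still needs an idempotency key on the server side.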
