Shard Decision
When To Shard
Prove the bottleneck
Identify storage, write, or read limits before proposing sharding.
Check storage limits
A single database eventually caps out; Amazon Aurora maxes around 256 TiB.
Check throughput limits
Sustained load around 50K writes/s, or 100M DAU each issuing many queries, can justify distributing load.
Avoid premature sharding
Sharding adds shard keys, routing, hotspots, rebalancing, and consistency work.
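The checks above can be turned into a rough back-of-envelope test. This is only a sketch: the storage and write limits reuse the figures from this section, while the read limit, the 70% headroom factor, and all names (Workload, SingleNodeLimits, bottlenecks) are hypothetical and should be replaced with numbers measured on your own database.

```python
from dataclasses import dataclass

TIB = 1024 ** 4


@dataclass(frozen=True)
class Workload:
    storage_bytes: int
    peak_writes_per_s: int
    peak_reads_per_s: int


@dataclass(frozen=True)
class SingleNodeLimits:
    # Assumed limits for illustration; measure your own system.
    max_storage_bytes: int = 256 * TIB
    max_writes_per_s: int = 50_000
    max_reads_per_s: int = 500_000  # hypothetical, not from the notes


def bottlenecks(w, limits=SingleNodeLimits(), headroom=0.7):
    """Return which single-node limits the workload exceeds at the given headroom."""
    found = []
    if w.storage_bytes > headroom * limits.max_storage_bytes:
        found.append("storage")
    if w.peak_writes_per_s > headroom * limits.max_writes_per_s:
        found.append("writes")
    if w.peak_reads_per_s > headroom * limits.max_reads_per_s:
        found.append("reads")
    return found


small = Workload(2 * TIB, 3_000, 20_000)
large = Workload(300 * TIB, 80_000, 20_000)
print(bottlenecks(small))  # [] -> no proven bottleneck, do not shard yet
print(bottlenecks(large))  # ['storage', 'writes']
```

An empty result means no bottleneck has been proven, which is the case where sharding would be premature.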
Split Types
- Partitioning: Splits data within one database instance; improves scans and maintenance.
- Sharding: Splits data across multiple machines; scales storage and read/write throughput.
Shard Key
Pick high cardinality
user_id gives millions of values; is_premium gives only two.
Prefer even distribution
Avoid country if 90% of users are in the US.
Match query patterns
Common reads and writes should hit one shard, such as a user's data by user_id.
Avoid time-only keys
Range-sharding on created_at sends all new writes to the latest shard.
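The cardinality and distribution rules can be checked on a data sample before committing to a key. A minimal sketch, assuming a made-up sample of 1000 users (90% in the US, 5% premium); key_stats is a hypothetical helper, not part of any library.

```python
from collections import Counter


def key_stats(rows, key):
    """Return (cardinality, share of rows held by the most common value)."""
    counts = Counter(row[key] for row in rows)
    top_share = max(counts.values()) / len(rows)
    return len(counts), top_share


# Hypothetical sample: 1000 users, 90% in the US, 5% premium.
rows = [
    {
        "user_id": i,
        "country": "US" if i % 10 else "DE",
        "is_premium": i % 20 == 0,
    }
    for i in range(1000)
]

for key in ("user_id", "country", "is_premium"):
    cardinality, top_share = key_stats(rows, key)
    print(f"{key}: cardinality={cardinality}, largest value holds {top_share:.1%}")
# user_id: cardinality=1000, largest value holds 0.1%
# country: cardinality=2, largest value holds 90.0%
# is_premium: cardinality=2, largest value holds 95.0%
```

A good candidate has high cardinality and a small largest-value share; in this sample only user_id qualifies.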
Distribution
Shard Assignment
- Hash-Based Sharding: Default for most designs; evenly distributes keys but needs a resharding plan.
- Range-Based Sharding: Simple and supports efficient range scans, but skewed ranges create hotspots.
- Directory-Based Sharding: Flexible for moving hot users, but every request adds a lookup and critical dependency.
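The three assignment strategies can be sketched as three small routing functions. Everything here is an assumption for illustration: four shards, alphabetical range boundaries, and a directory that pins one hypothetical hot user to an extra shard and falls back to hashing for everyone else (a pure directory design would hold an entry for every key).

```python
import bisect
import hashlib

NUM_SHARDS = 4


def stable_hash(key: str) -> int:
    """Deterministic hash; Python's built-in hash() is salted per process."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


def hash_shard(user_id: str) -> int:
    # Hash-based: even spread, but changing NUM_SHARDS remaps most keys.
    return stable_hash(user_id) % NUM_SHARDS


# Range-based: exclusive upper bounds for shards 0..2; shard 3 takes the rest.
RANGE_BOUNDS = ["g", "n", "t"]


def range_shard(username: str) -> int:
    # a-f -> 0, g-m -> 1, n-s -> 2, t-z -> 3
    return bisect.bisect_right(RANGE_BOUNDS, username[:1].lower())


# Directory-based: explicit lookup table (an in-memory dict stands in for
# what would be a separate, highly available lookup service).
DIRECTORY = {"celebrity_42": 4}  # hypothetical hot user moved to an extra shard


def directory_shard(user_id: str) -> int:
    # Every request pays for this lookup; unknown users fall back to hashing.
    return DIRECTORY.get(user_id, hash_shard(user_id))


print(range_shard("alice"), range_shard("mallory"), range_shard("zed"))  # 0 1 3
print(directory_shard("celebrity_42"))  # 4
```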
Resharding Methods
- Consistent hashing: Use with hash sharding; minimizes data movement when adding or removing shards.
- Simple modulo: Changing hash(key) % N from 4 to 5 remaps about 80% of keys; only keys with hash % 20 < 4 keep their shard.
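The difference between the two methods can be measured directly. The sketch below is one simple way to build a consistent hash ring (sorted hash points with virtual nodes); the HashRing class, the choice of 100 virtual nodes per shard, and the key format are assumptions for illustration, not any particular database's implementation.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash derived from SHA-256."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, shards, vnodes=100):
        # Each shard owns `vnodes` points on the ring to smooth the distribution.
        self._ring = sorted(
            (stable_hash(f"{shard}#{v}"), shard)
            for shard in shards
            for v in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping around.
        i = bisect.bisect_right(self._points, stable_hash(key)) % len(self._points)
        return self._ring[i][1]


def moved_fraction(before, after, keys):
    """Fraction of keys whose shard differs between two routing functions."""
    return sum(before(k) != after(k) for k in keys) / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
old_ring = HashRing([f"shard-{i}" for i in range(4)])
new_ring = HashRing([f"shard-{i}" for i in range(5)])

ring_moved = moved_fraction(old_ring.shard_for, new_ring.shard_for, keys)
modulo_moved = moved_fraction(
    lambda k: stable_hash(k) % 4, lambda k: stable_hash(k) % 5, keys
)
print(f"consistent hashing moved {ring_moved:.0%}, modulo moved {modulo_moved:.0%}")
```

With the ring, going from 4 to 5 shards should move roughly one fifth of the keys, and every moved key lands on the new shard; with modulo, roughly four fifths move.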
Sharding Pitfalls
Hot Spots
- Celebrity problem: A hot user can drive 1000x more traffic to one shard; isolate hot keys.
- Compound shard keys: hash(user_id + date) spreads one hot user's data across shards over time, at the cost of fanning out reads of that user's full history.
- Dynamic shard splitting: MongoDB balancer splits and migrates chunks to maintain balance.
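A compound shard key can be sketched as follows. The shard count, the key format "user_id:date", and the user name are assumptions for illustration.

```python
import hashlib
from datetime import date, timedelta

NUM_SHARDS = 8


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash derived from SHA-256."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


def shard_for(user_id: str, day: date) -> int:
    # Compound key: the same user maps to a different shard depending on the day.
    return stable_hash(f"{user_id}:{day.isoformat()}") % NUM_SHARDS


start = date(2024, 1, 1)
shards_used = {
    shard_for("celebrity_42", start + timedelta(days=d)) for d in range(30)
}
print(f"30 days of one user's writes touched {len(shards_used)} of {NUM_SHARDS} shards")
```

Within a single day the user's writes still go to one shard, so this helps most when the load accumulates over time rather than arriving in one burst.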
Fan-Out Reads
- Cache results: Cache global results for 5 minutes when real-time accuracy is not required.
- Denormalize data: Duplicate related data onto the same shard to avoid common cross-shard reads.
- Precompute globals: Use background jobs for trending content instead of querying all shards per request.
- Accept rare fan-out: Infrequent admin totals can query all shards and aggregate.
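Caching a fan-out result can be sketched with a small time-to-live cache in front of the query that touches every shard. The in-memory dictionaries standing in for shards, the TTLCache class, and top_posts_fan_out are hypothetical; a real system would typically use a shared cache rather than process memory.

```python
import time


class TTLCache:
    """Holds one value and recomputes it once the TTL has elapsed."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None

    def get_or_compute(self, compute):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = compute()
            self._expires_at = now + self._ttl
        return self._value


# Hypothetical shards: each maps a post id to its like count.
SHARDS = [{"p1": 10, "p2": 3}, {"p3": 25}, {"p4": 7, "p5": 1}]


def top_posts_fan_out(n=2):
    """Query every shard, then merge: the expensive path."""
    merged = [item for shard in SHARDS for item in shard.items()]
    return sorted(merged, key=lambda kv: kv[1], reverse=True)[:n]


trending = TTLCache(ttl_seconds=300)  # 5 minutes
print(trending.get_or_compute(top_posts_fan_out))  # [('p3', 25), ('p1', 10)]
```

Repeated calls within the 5-minute window return the stored result without touching any shard.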
Distributed Writes
- Avoid cross-shard transactions: Best solution; keep all of a user's data on their shard.
- Saga pattern: Use independent steps with compensating actions when multiple shards must coordinate.
- Two-phase commit (2PC): Guarantees consistency but is slow, fragile, and usually avoided.
- Eventual consistency: Works for denormalized counts that can differ briefly and converge.
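The saga pattern can be sketched as a list of (action, compensation) pairs that are undone in reverse order when a step fails. This is a simplified in-memory illustration: the shard dictionaries, run_saga, and transfer are hypothetical, and a production saga would also need a durable log and idempotent steps, which are omitted here.

```python
class SagaFailed(Exception):
    pass


def run_saga(steps):
    """Run (action, compensation) pairs in order; undo completed steps on failure."""
    done = []
    for action, compensate in steps:
        try:
            action()
        except Exception as exc:
            for undo in reversed(done):
                undo()
            raise SagaFailed(str(exc)) from exc
        done.append(compensate)


# Hypothetical balances living on two different shards.
shard_a = {"alice": 100}
shard_b = {"bob": 20}


def debit(shard, user, amount):
    if shard[user] < amount:
        raise ValueError(f"insufficient funds for {user}")
    shard[user] -= amount


def credit(shard, user, amount):
    shard[user] += amount  # raises KeyError if the user is not on this shard


def transfer(to_user, amount):
    run_saga([
        (lambda: debit(shard_a, "alice", amount),
         lambda: credit(shard_a, "alice", amount)),
        (lambda: credit(shard_b, to_user, amount),
         lambda: debit(shard_b, to_user, amount)),
    ])


transfer("bob", 30)
print(shard_a, shard_b)  # {'alice': 70} {'bob': 50}

try:
    transfer("carol", 10)  # second step fails, so the debit is compensated
except SagaFailed:
    print(shard_a)  # {'alice': 70}
```

Between the debit and its compensation, other readers can briefly see the intermediate balance; that is the consistency trade-off a saga makes compared with 2PC.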
Managed Sharding
- Cassandra: Uses a partitioner with virtual nodes to map partition keys to token ranges.
- DynamoDB: Hashes partition keys to internal partitions and splits them as data or traffic grows.
- MongoDB: Uses range-based chunks; hashed shard keys are ranges over the hash space.
- Vitess and Citus: SQL sharding layers for MySQL or PostgreSQL that handle routing and resharding.
