Data Modeling

Interview Standards

Show minimum schema

Include key fields, relationships, and index or partition notes beside the database.

Tie choices to requirements

Say how data volume, access pattern, or consistency need caused each schema choice.

Stay domain-specific

Use users, tweets, follows for Twitter-like systems, not abstract entity names.

Do not over-model

Good enough is clear and functional; no full normalization or complete schema diagram expected.
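As a rough illustration of what a "minimum schema" might look like for a Twitter-like system, the sketch below uses Python's built-in `sqlite3` module. The table names, column names, and index name are assumptions made for this example; an interview whiteboard version would normally be just the field lists and the index note.

```python
import sqlite3

# Hypothetical minimal schema: key fields, relationships, and one index note.
# All names here are illustrative assumptions.
SCHEMA = """
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
CREATE TABLE tweets (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    body       TEXT NOT NULL,
    created_at TEXT NOT NULL
);
CREATE TABLE follows (
    follower_id INTEGER NOT NULL REFERENCES users(id),
    followee_id INTEGER NOT NULL REFERENCES users(id),
    PRIMARY KEY (follower_id, followee_id)
);
-- Index note: supports "recent tweets by one user".
CREATE INDEX idx_tweets_user_created ON tweets (user_id, created_at);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# List the tables that were created.
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)  # ['follows', 'tweets', 'users']
```

Three tables and one index are usually enough to discuss the design; further columns can be added only when a requirement calls for them.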

Database Choice

Relational Databases (SQL)

Default for most designs: choose a relational database such as PostgreSQL when entities are clear and you need joins and ACID transactions.

Document Databases

Fit changing, nested, or varied records, but complicate updates.

Key-Value Stores

Fit caches, sessions, feature flags, and exact-key lookups; query capability is limited.

Wide-Column Databases

Fit very high write volumes, time-series data, and append-heavy analytics.

Graph Databases

Almost never needed in interviews, because traversal gains rarely justify the operational complexity.

Schema Drivers

Data volume

Decides whether data fits one store or must split across systems with separate schemas.

Access patterns

Most important driver; derive required queries from each API endpoint.
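One way to make "derive required queries from each API endpoint" concrete is to write the endpoint-to-query mapping down explicitly. The sketch below is a minimal illustration using `sqlite3`; the endpoints, table layout, and sample rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
CREATE TABLE posts (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    body       TEXT NOT NULL,
    created_at TEXT NOT NULL
);
INSERT INTO users VALUES (1, 'ada'), (2, 'lin');
INSERT INTO posts VALUES
    (10, 1, 'first',  '2024-01-01T00:00:00Z'),
    (11, 1, 'second', '2024-01-02T00:00:00Z'),
    (12, 2, 'hello',  '2024-01-03T00:00:00Z');
""")

# Each (hypothetical) endpoint maps to the query the schema must serve.
QUERIES = {
    "GET /users/{id}": "SELECT id, username FROM users WHERE id = ?",
    "GET /users/{id}/posts": (
        "SELECT id, body FROM posts WHERE user_id = ? ORDER BY created_at DESC"
    ),
}

def handle(endpoint, user_id):
    """Run the query that backs the given endpoint for one user."""
    return conn.execute(QUERIES[endpoint], (user_id,)).fetchall()

print(handle("GET /users/{id}/posts", 1))  # [(11, 'second'), (10, 'first')]
```

Reading the `WHERE` and `ORDER BY` clauses of each query is then a direct way to decide which columns need indexes.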

Consistency requirements

Strong consistency favors one ACID database; eventual consistency allows distribution.
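To illustrate why strong consistency is simplest inside a single ACID database, the following sketch uses a `sqlite3` transaction that either applies both updates or neither. The `accounts` table and `transfer` helper are hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (
    id      INTEGER PRIMARY KEY,
    balance INTEGER NOT NULL CHECK (balance >= 0)
);
INSERT INTO accounts VALUES (1, 100), (2, 0);
""")

def balances():
    return conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()

def transfer(src, dst, amount):
    """Move funds atomically; returns False if a constraint rejects the write."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

print(transfer(1, 2, 150), balances())  # False [(1, 100), (2, 0)]
print(transfer(1, 2, 40), balances())   # True [(1, 60), (2, 40)]
```

In the rejected transfer the credit to account 2 is rolled back along with the failed debit, which is the guarantee that becomes harder to provide once the two rows live in different systems.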

Keys and Relationships

Key Types

Primary keys should be stable, system-generated IDs. Foreign keys enforce referential integrity at the cost of validating each write. Constraints protect data quality, also at the cost of write overhead.
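The trade-off can be seen in a small sketch: with foreign keys enforced, the database validates every insert and rejects orphaned rows. This example uses `sqlite3`, where enforcement has to be switched on per connection; the tables and the `insert_post` helper are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL UNIQUE);
CREATE TABLE posts (
    id      INTEGER PRIMARY KEY,               -- system-generated ID
    user_id INTEGER NOT NULL REFERENCES users(id),
    body    TEXT NOT NULL
);
INSERT INTO users (id, username) VALUES (1, 'ada');
""")

def insert_post(user_id, body):
    """Return True if the row was stored, False if a constraint rejected it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO posts (user_id, body) VALUES (?, ?)", (user_id, body)
            )
        return True
    except sqlite3.IntegrityError:
        return False

print(insert_post(1, "hello"))     # True: user 1 exists
print(insert_post(999, "orphan"))  # False: no such user, write is rejected
```

The lookup the database performs to validate `user_id` on each insert is the write validation cost mentioned above.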

Relationship Cardinalities

One-to-many models parent-child ownership. Many-to-many uses a linking table, such as likes. One-to-one is rare and can often be merged into a single table.
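A minimal sketch of both common cardinalities, assuming hypothetical `users`, `posts`, and `likes` tables: `posts.user_id` gives the one-to-many relationship, and `likes` is the linking table whose composite primary key allows each user to like a post at most once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY);
-- One-to-many: each post belongs to exactly one user.
CREATE TABLE posts (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id)
);
-- Many-to-many: a linking table with one row per (user, post) pair.
CREATE TABLE likes (
    user_id INTEGER NOT NULL REFERENCES users(id),
    post_id INTEGER NOT NULL REFERENCES posts(id),
    PRIMARY KEY (user_id, post_id)
);
INSERT INTO users VALUES (1), (2);
INSERT INTO posts VALUES (10, 1);
""")

def like(user_id, post_id):
    """Record a like; repeating the same like is silently ignored."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO likes VALUES (?, ?)", (user_id, post_id)
        )

def like_count(post_id):
    row = conn.execute(
        "SELECT COUNT(*) FROM likes WHERE post_id = ?", (post_id,)
    ).fetchone()
    return row[0]

like(1, 10)
like(2, 10)
like(2, 10)  # duplicate, ignored because of the composite primary key
print(like_count(10))  # 2
```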

Access Indexes

An index on posts.user_id supports fetching all posts by one user (for example, GET /users/{id}/posts). An index on posts.created_at supports loading recent posts in chronological order. A composite index on (user_id, created_at) supports fetching a user's recent posts without scanning all posts.
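The effect of the composite index can be checked with a query plan. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` through `sqlite3`; the index name is an assumption, and the exact plan wording varies between SQLite versions, so the example only checks whether the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL,
    body       TEXT NOT NULL,
    created_at TEXT NOT NULL
);
""")

# "A user's recent posts": filter by user_id, order by created_at.
QUERY = (
    "SELECT id, body FROM posts "
    "WHERE user_id = ? ORDER BY created_at DESC LIMIT 10"
)

def plan(sql):
    """Return the plan's detail text (fourth column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (1,)).fetchall()
    return " | ".join(row[3] for row in rows)

before = plan(QUERY)
conn.execute("CREATE INDEX idx_posts_user_created ON posts (user_id, created_at)")
after = plan(QUERY)

print("before:", before)  # a full scan of posts
print("after: ", after)   # a search that names idx_posts_user_created
```

Because the index is ordered by `user_id` and then `created_at`, the database can locate one user's rows and read them already sorted, rather than scanning and sorting the whole table.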

Normalization Choice

Start normalized by default, storing each fact once to avoid update anomalies. Denormalize selectively for analytics, audit trails, event logs, search, and heavily read-optimized systems. A denormalized cache keeps the source of truth clean while the cache holds precomputed joins or aggregations for fast reads.
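The denormalized-cache idea can be sketched in plain Python: the normalized set of likes remains the source of truth, while a separate counter holds a precomputed aggregate for fast reads. The data structures and function names are hypothetical stand-ins for a database table and a cache.

```python
from collections import Counter

# Source of truth: one (user_id, post_id) fact per like, stored once.
likes = {(1, 10), (2, 10), (1, 11)}

def rebuild_cache():
    """Recompute the aggregate from the source of truth."""
    return Counter(post_id for _, post_id in likes)

# Denormalized cache: precomputed like counts per post.
like_count_cache = rebuild_cache()

def add_like(user_id, post_id):
    """Write to the source of truth first, then update the cached count."""
    if (user_id, post_id) in likes:
        return  # already recorded; keep the cache consistent
    likes.add((user_id, post_id))
    like_count_cache[post_id] += 1

add_like(3, 10)
add_like(3, 10)  # duplicate, no effect
print(like_count_cache[10])                 # 3
print(like_count_cache == rebuild_cache())  # True
```

Since the cache can always be rebuilt from the normalized data, a stale or lost cache costs read latency but never correctness.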

Sharding Choice

Shard Key Selection

Shard by access pattern by default when sharding is needed, so that related data stays together on one shard. Avoid time-range sharding in write-heavy systems, because all current writes hit the latest shard.
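A small sketch may help contrast the two choices. It assumes a hypothetical cluster of four shards, a hash of `user_id` as the access-pattern shard key, and a monthly bucket as the time-range key; none of these specifics come from the text above.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size

def shard_for_user(user_id: int) -> int:
    """Hash the user ID so all of one user's rows land on the same shard."""
    digest = hashlib.sha256(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

def shard_for_time(created_at: str) -> str:
    """Time-range sharding: one shard per month, e.g. '2024-03'."""
    return created_at[:7]

# Writes arriving "now" from many different users.
writes = [(user_id, "2024-03-15T12:00:00Z") for user_id in range(1000)]

user_shards = {shard_for_user(user_id) for user_id, _ in writes}
time_shards = {shard_for_time(created_at) for _, created_at in writes}

print(len(user_shards))  # writes are spread over the user-keyed shards
print(len(time_shards))  # 1: every current write hits the latest time shard
```

With the user-based key, a query for one user's posts goes to a single shard, while the time-based key concentrates all current write traffic on one shard.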

Sharding Constraints

Avoid cross-shard queries, because querying multiple shards and merging the results is expensive and complex. Treat the shard key as permanent, because it affects every query and usually cannot be changed without migrating the data.
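The cost difference can be sketched with in-memory lists standing in for shards. The modulo routing, the post layout, and the function names below are simplifying assumptions for illustration only.

```python
import heapq

NUM_SHARDS = 3  # hypothetical
shards = [[] for _ in range(NUM_SHARDS)]

def shard_for(user_id):
    # Simple modulo routing, used here only to keep the example short.
    return user_id % NUM_SHARDS

def insert(post):
    shards[shard_for(post["user_id"])].append(post)

def posts_by_user(user_id):
    """Query on the shard key: exactly one shard is touched."""
    shard = shards[shard_for(user_id)]
    rows = [p for p in shard if p["user_id"] == user_id]
    return rows, 1

def latest_posts(limit):
    """Query not on the shard key: ask every shard, then merge the results."""
    partials = [
        sorted(shard, key=lambda p: p["created_at"], reverse=True)[:limit]
        for shard in shards
    ]
    merged = heapq.nlargest(
        limit,
        (p for part in partials for p in part),
        key=lambda p: p["created_at"],
    )
    return merged, len(shards)

for post_id in range(1, 7):
    insert({"id": post_id, "user_id": (post_id - 1) % 3 + 1, "created_at": post_id})

rows, touched = posts_by_user(1)
print([p["id"] for p in rows], touched)  # [1, 4] 1

rows, touched = latest_posts(2)
print([p["id"] for p in rows], touched)  # [6, 5] 3
```

The second query touches every shard and needs a merge step, so its cost grows with the number of shards, whereas the shard-key query stays on a single shard.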
