Vector Databases
Learn how vector databases power similarity search, recommendations, and AI applications in system design.
If you've been paying attention to anything in tech over the past few years, you've noticed "embeddings" everywhere. Search engines that actually understand what you mean. Recommendation systems that surface eerily relevant content. Chatbots that can retrieve information from massive document collections. All of these rely on the same underlying primitive: finding things that are similar to other things, fast.
This isn't actually a new concept. Vector databases and their related techniques have been used in recommendation systems for a long time. But the rise of new machine learning techniques has amplified what vector databases can do, and that unlocks a cool new set of infra applications.
Traditional databases are great at exact lookups. Give me the user with ID 12345. Find all orders placed on January 1st. But ask a traditional database "find me documents similar to this one" and you're in trouble. That's where vector databases come in.
This deep dive will cover what vector databases are, how they work under the hood, and most importantly, how to use them effectively in a system design interview. We'll go deep on the indexing algorithms that make similarity search fast, but we'll also be practical about when you actually need a dedicated vector database versus when a simple extension to your existing database will do the job.
If the detail here feels intimidating, skip to the applications section at the end and work backwards. Most system design interviews won't cover vector databases. Those that do usually care less about whether you know the internals and more about whether you know how and where to use them.
What's a Vector Anyway?
Before we talk about databases that store vectors, we need to understand what we're actually storing.
A vector (or embedding) is just an array of numbers that represents something. That "something" could be a word, a sentence, an image, a user, a product, or really anything you can feed into a machine learning model. The magic is that similar things end up with similar vectors.
// Two sentences that mean similar things
"The cat sat on the mat" → [0.12, -0.34, 0.78, ..., 0.45] // 1536 numbers
"A feline rested on a rug" → [0.11, -0.32, 0.79, ..., 0.44] // very similar!
// A sentence with different meaning
"The stock market crashed" → [-0.89, 0.12, -0.45, ..., 0.23] // very different
The typical embedding has somewhere between 128 and 1536 dimensions (OpenAI's text-embedding-3-large uses 3072). Each dimension captures some aspect of the meaning, though the individual dimensions aren't usually interpretable by humans. What matters is that the geometric relationships between vectors reflect semantic relationships between the things they represent.
Vector Similarity with just 2 dimensions for visualization (real embeddings have many more!)
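To make the geometric intuition concrete, here is a minimal Python sketch that compares made-up 2-dimensional vectors using cosine similarity (the cosine of the angle between two vectors). The vector values and the helper names cosine_similarity and most_similar are assumptions invented for illustration; a real embedding model would produce hundreds or thousands of dimensions, and the exact numbers would differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 2-D "embeddings", invented for illustration only.
# Real embeddings come from a model and have far more dimensions.
EMBEDDINGS = {
    "The cat sat on the mat": [0.90, 0.40],
    "A feline rested on a rug": [0.85, 0.45],
    "The stock market crashed": [-0.70, 0.60],
}


def most_similar(query, embeddings):
    """Brute-force scan: return the other sentence whose vector is closest to the query's."""
    query_vec = embeddings[query]
    scored = [
        (cosine_similarity(query_vec, vec), text)
        for text, vec in embeddings.items()
        if text != query
    ]
    return max(scored)[1]


if __name__ == "__main__":
    query = "The cat sat on the mat"
    for text, vec in EMBEDDINGS.items():
        print(f"{cosine_similarity(EMBEDDINGS[query], vec):+.3f}  {text}")
    print("Closest match:", most_similar(query, EMBEDDINGS))
```

With these toy values, the two cat sentences point in nearly the same direction while the stock market sentence points elsewhere, so the scan picks the feline sentence. Note that this compares the query against every stored vector, which is fine for three items but is exactly the cost that becomes a problem at scale.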
Similarity Metrics
The Nearest Neighbor Problem
How Vector Databases Work
Indexing Strategies
HNSW (Hierarchical Navigable Small World)
IVF (Inverted File Index)
Locality Sensitive Hashing (LSH)
Annoy
Filtering and Hybrid Search
Inserts, Updates, and Index Maintenance
Vector Database Options
Vector Extensions for Traditional DBs and Stores (Start Here)
Purpose-Built Vector DBs (When You Need Scale)
Using Vector Databases in Your Interview
Common Interview Scenarios
Architecture Patterns
Key Design Decisions to Discuss
Numbers to Know
Gotchas and Limitations
Summary