Limited Time Offer:Up to 20% off Hello Interview Premium
Up to 20% off Hello Interview Premium 🎉
Hello Interview
Your Dashboard
System Design
Code
Low Level Design
Behavioral
AI Coding
New
ML System Design
Salary Negotiation
Interview Guides
Blog
System Design
Low Level Design
AI Coding
Behavioral
New
Interview Questions
Success Stories
System Design
Low-Level Design
New
Ask The Community
Interview Experiences
Discord
Mock Interviews
1:1 Mentorship
Refer a Friend
Pricing
Sign in / Sign up
Search
⌘K
Pricing

Tutor

In-Memory Database: Implement SQL-Like Operations

Asked at:

OpenAI


DESCRIPTION

Build an in-memory database that supports SQL-like operations including INSERT, SELECT with WHERE clauses, and ORDER BY functionality. Implement efficient data storage structures, a query parser to evaluate conditions, and inverted indexes to optimize search performance. For example, after inserting records like {id: 1, name: 'Alice', age: 30}, you should be able to query SELECT * WHERE age > 25 ORDER BY name and retrieve matching records efficiently.

Input:

INSERT {id: 1, name: 'Alice', age: 30}
INSERT {id: 2, name: 'Bob', age: 25}
SELECT * WHERE age > 25 ORDER BY name ASC

Output:

[{id: 1, name: 'Alice', age: 30}]


Explanation: Only Alice matches the condition age > 25, returned in ascending order by name

Constraints:

  • Support INSERT operations with arbitrary key-value pairs
  • Support SELECT with multiple WHERE conditions (AND logic)
  • Support ORDER BY with ASC/DESC sorting on any field
  • Implement inverted indexes for efficient field-based lookups
  • Handle comparison operators: =, !=, <, >, <=, >=
  • Records can have different schemas (flexible fields)

Understanding the Problem

The challenge involves three interconnected components: storage, querying, and optimization. You need a flexible data structure to store records with varying schemas, a parser to interpret SQL-like query syntax and evaluate conditions, and inverted indexes to avoid full table scans. The complexity lies in efficiently filtering records based on multiple conditions while maintaining sorted results. Balancing memory usage with query performance requires careful index design.

Building Intuition

A naive approach would store all records in a list and scan every record for each query, checking conditions one by one—this becomes O(n) per query. A better approach uses inverted indexes that map field values to record IDs, allowing direct lookup of matching records in O(1) for equality checks. For example, an index on age would map {30: [record1], 25: [record2]}, letting you instantly find all records with age = 30 without scanning.

Real-world databases handle millions of records where full table scans are prohibitively expensive. Inverted indexes are the foundation of search engines and database query optimizers, enabling sub-second response times. Understanding this pattern teaches you how to trade memory for speed—a critical skill in system design.

Common Pitfalls

Pitfall: Re-parsing query strings on every execution → Fix: Parse once and cache the query plan

Pitfall: Building indexes for every field upfront → Fix: Create indexes lazily or only for frequently queried fields

Pitfall: Not handling missing fields in records → Fix: Treat missing fields as null and handle gracefully in comparisons

Pitfall: Sorting entire dataset before filtering → Fix: Filter first to reduce the dataset, then sort

Implementation

Data Storage Layer with Inverted Indexes

Implement the core storage engine using a list to hold records and inverted indexes (hash maps) to map field values to record IDs. Each index tracks which records contain specific values for a field, enabling O(1) lookups. For example, inserting {id: 1, age: 30} updates the age index to map 30 → [1]. This section focuses solely on storage and index maintenance during insertions.

Solution
Python
Language
class Database:
def __init__(self):
self.records = []
self.inverted_indexes = {}
def insert(self, record):
record_id = len(self.records)
self.records.append(record)
for field, value in record.items():
if field not in self.inverted_indexes:
self.inverted_indexes[field] = {}
if value not in self.inverted_indexes[field]:
self.inverted_indexes[field][value] = []
self.inverted_indexes[field][value].append(record_id)
Query Parser and Condition Evaluator

Build a query parser that extracts WHERE conditions and ORDER BY clauses from query strings, then implements a condition evaluator to check if records match criteria. Parse conditions into structured objects with field, operator, and value components. For example, age > 25 becomes {field: 'age', operator: '>', value: 25}. This section uses the storage layer from Section 1 but focuses on parsing and evaluation logic.

Solution
Python
Language
class QueryParser:
def __init__(self):
self.operators = {'=': lambda a, b: a == b, '>': lambda a, b: a > b, '<': lambda a, b: a < b, '>=': lambda a, b: a >= b, '<=': lambda a, b: a <= b, '!=': lambda a, b: a != b}
def parse_query(self, query):
conditions, order_by = [], None
if 'WHERE' in query:
where_part = query.split('WHERE')[1].split('ORDER BY')[0].strip()
for cond in where_part.split('AND'):
cond = cond.strip()
for op in ['>=', '<=', '!=', '=', '>', '<']:
if op in cond:
field, value = cond.split(op)
conditions.append({'field': field.strip(), 'operator': op, 'value': self._cast_value(value.strip())})
break
if 'ORDER BY' in query:
order_by = query.split('ORDER BY')[1].strip()
return conditions, order_by
def _cast_value(self, value):
if value.isdigit(): return int(value)
try: return float(value)
except: return value.strip("'\"")
def evaluate(self, record, conditions):
return all(self.operators[c['operator']](record.get(c['field']), c['value']) for c in conditions)
Query Executor with Index Optimization

Implement the query execution engine that combines the parser (Section 2) and storage layer (Section 1) to process SELECT queries efficiently. Use inverted indexes to fetch candidate records for equality conditions, then filter with remaining conditions and apply sorting. For example, WHERE age = 30 AND name = 'Alice' first looks up age index, then filters by name. This section orchestrates the complete query flow.

Solution
Python
Language
class QueryExecutor:
def __init__(self, database, parser):
self.db = database
self.parser = parser
def execute(self, query):
conditions, order_by = self.parser.parse_query(query)
candidate_ids = self._get_candidates_from_index(conditions)
results = [self.db.records[rid] for rid in candidate_ids if self.parser.evaluate(self.db.records[rid], conditions)]
if order_by:
results.sort(key=lambda r: r.get(order_by, 0))
return results
def _get_candidates_from_index(self, conditions):
eq_conditions = [c for c in conditions if c['operator'] == '=']
if not eq_conditions:
return range(len(self.db.records))
field, value = eq_conditions[0]['field'], eq_conditions[0]['value']
return self.db.inverted_indexes.get(field, {}).get(value, [])
Complete Integration and Usage

Combine Section 1's storage engine, Section 2's parser, and Section 3's executor into a unified Database class with a simple API. Provide methods like insert(record) and query(sql_string) that internally coordinate all components. Show end-to-end examples of inserting multiple records and executing complex queries with multiple conditions and sorting, demonstrating how the inverted indexes accelerate lookups compared to full scans.

Solution
Python
Language
class InMemoryDatabase:
def __init__(self):
self.storage = Database()
self.parser = QueryParser()
self.executor = QueryExecutor(self.storage, self.parser)
def insert(self, record):
self.storage.insert(record)
def query(self, sql_string):
return self.executor.execute(sql_string)
# Usage example
db = InMemoryDatabase()
db.insert({'id': 1, 'name': 'Alice', 'age': 30, 'city': 'NYC'})
db.insert({'id': 2, 'name': 'Bob', 'age': 25, 'city': 'LA'})
db.insert({'id': 3, 'name': 'Charlie', 'age': 35, 'city': 'NYC'})
results = db.query("SELECT * WHERE city = 'NYC' AND age >= 30 ORDER BY age")

What We've Learned

  • Pattern: Inverted indexes trade memory for query speed by pre-computing field-to-record mappings
  • Use Case: Essential for search engines, databases, and any system requiring fast filtered lookups
  • Optimization: Filter with indexed fields first to minimize the candidate set before applying expensive operations
  • Design: Separate parsing, storage, and execution concerns for maintainable and testable code

Problems to Practice

Implement Trie Methods

medium

Trie

Building a Trie involves implementing data structure methods similar to an in-memory database. Both require efficient data storage structures, insert operations, and query functionality with optimized search performance through specialized indexing.

Practice
Subarray Sum Equals K

medium

Prefix Sum

This problem uses hashmaps to optimize query operations, similar to how inverted indexes work in databases. It demonstrates efficient data retrieval using hash-based lookups and condition evaluation, which are core concepts in database query optimization.

Practice
Overview
Lesson
Trie

This lesson provides foundational understanding of Trie data structures, which share similarities with database indexing strategies. Understanding Tries helps grasp how efficient search structures work, which is essential for implementing inverted indexes in an in-memory database.

Learn

Question Timeline

See when this question was last asked and where, including any notes left by other candidates.

Company
​
Level
All Regions
Region

Late March, 2026

OpenAI

Mid-level

Early March, 2026

OpenAI

Mid-level

Late January, 2026

OpenAI

Mid-level

Get Premium to View All 10+ Reports

Your account is free and you can post anonymously if you choose.

Hello Interview Premium

Recent interview questions
System Design Guided Practice
Exclusive content
Learn More
Questions
Meta SWE Interview QuestionsAmazon SWE Interview QuestionsGoogle SWE Interview QuestionsOpenAI SWE Interview QuestionsEngineering Manager (EM) Interview Questions
Learn
Learn System DesignLearn DSALearn BehavioralLearn ML System DesignLearn Low Level DesignGuided Practice
Links
FAQPricingGift PremiumHello Interview Premium
Legal
Terms and ConditionsPrivacy PolicySecurity
Contact
About UsProduct Support

7511 Greenwood Ave North Unit #4238 Seattle WA 98103


© 2026 Optick Labs Inc. All rights reserved.