
Murrdb: A columnar in-memory cache for AI inference workloads. A faster Redis/RocksDB replacement, optimized for batch low-latency zero-copy reads and writes.
This
README.mdis 99%[^1] human written.
[^1]: Used only for grammar and syntax checking.
What is Murr?

Murr is a caching layer for ML/AI data serving that sits between your batch data pipelines and inference apps:
- Tiered storage: hot data lives in memory, cold data stays on disk with S3-based replication. It's 2026, RAM is expensive - keep only the hot stuff there.
- Batch-in, batch-out: native batch reads and writes over columnar storage, with no per-row overhead. Dumping 1GB Parquet/Arrow files into the ingestion API is a perfectly valid use case.
# yes this works for batch writes
curl -d @0000.parquet -H "Content-Type: application/vnd.apache.parquet" \
-XPUT http://localhost:8080/api/v1/table/yolo/write
- Zero-copy wire protocol: no conversion needed when building
np.ndarray,pd.DataFrameorpt.Tensorfrom API responses. Sure, Redis is fast, but parsing its replies is not (especially in Python!).
=
# look mom, zero copy!
- Stateless: Murr is not a database - all state is persisted on S3. When a Redis node gets restarted, you're cooked. Murr just self-bootstraps from block storage.
Murr shines when:
- your data is heavy and tabular: that giant Parquet dump on S3 your AI inference or ML prep job produces? Perfect fit.
- reads are batched: pulling 100 columns across 1000 documents your agent wants to analyze? Great!
- you care about costs: sure, Redis with 1TB of RAM will work fine, but disk/S3 offloading is operationally simpler and way cheaper.
Short quickstart (see full example):
uv pip install murrdb
and then
= # embedded local instance
# fetch columns for a batch of document keys
=
# Output:
# score category
# 0 0.95 ml
# 1 0.72 infra
# 2 0.68 ops
Why Murr?
TLDR: latency, simplicity, cost -- pick two. Murrdb tries to nail all three: fastest, cheapest, and easiest to operate. A bold claim, I know.

For the typical use case of read N datapoints across M documents (an agent reading document attributes, an ML ranker fetching feature values), on top of being the fastest, Murrdb:
- vs Redis: is persistent (S3 is the new filesystem) and can offload cold data to local NVMe.
- vs embedded RocksDB: no need to build data sync between producer jobs and inference nodes yourself. Murrdb was designed to be distributed from the start.
- vs DynamoDB: roughly 10x cheaper, since you only pay for CPU/RAM, not per query.
Not being a general-purpose database, it tries to be friendly to the everyday pain points of ML/AI engineers:
- First-class Python support:
pip install murrdb, then map to/from Numpy/Pandas/Polars/Pytorch arrays with zero copy. - Sparse columns: when a column has no data, it takes up zero bytes. Unlike the packed feature blob approach, where null columns aren't actually null.
Why NOT Murr?
Murr is not a general-purpose database:
- OLTP workloads: if you have relations, transactions, and per-row reads/writes, go with Postgres.
- Analytics: aggregating over entire tables to produce reports? Pick Clickhouse, BigQuery, or Snowflake.
- General-purpose caching: need to cache user session data for a web app? Use Redis.
- Feature store: yes, it kinda looks like one — but Murrdb doesn't govern how you compute and store your data. Murr is an online serving layer, and can be a part of both internal feature stores and open-source ones like Feast, Hopsworks, and Databricks Feature Store.
[!WARNING] Murr is still in its early days and may not be stable enough for your use case yet. But it's improving quickly.
Quickstart
=
# define table schema
=
# write a batch of documents
=
# fetch specific columns for a few keys
=
# Output:
# score category
# 0 0.95 ml
# 1 0.72 infra
# 2 0.68 ops
Benchmarks
Full benchmark suite with reproduction steps: murrdb/murr-benchmark.
We benchmark a typical ML Ranking use case: 100M rows, 10 float32 columns, 1000 random key lookups per iteration. The suite includes two complementary harnesses:
- Rust (Criterion) — measures raw service throughput as time-to-last-byte. Reads
select_rowsrandom keys per iteration and consumes raw response bytes without decoding. This isolates the storage/network layer and shows the theoretical ceiling of each backend. - Python (pyperf) — measures end-to-end latency as experienced by a Python ML client. Performs the same random-key reads but includes full protocol decoding and conversion into a
pd.DataFrame. This captures the real cost a user pays: protocol parsing, byte deserialization, and DataFrame construction.
Backends and data layouts tested:
- murr (columnar, Arrow IPC) — native columnar format with zero-copy reads and projection pushdown.
- Redis blob — all features packed into a single 40-byte
MGETblob. Compact and cache-friendly, but always reads all columns. - Redis HSET — Feast-style hash-per-row: each feature is a separate HSET field. Flexible, but per-field overhead adds up.
- RocksDB blob — embedded key-value store with the same packed binary layout as Redis blob.
- PostgreSQL blob — BYTEA column with packed features.
- PostgreSQL col-per-feature — explicit typed columns, one per feature.
Rust time-to-last-byte
All backends run on the same machine; container-backed ones use Docker via testcontainers. Memory is measured via Docker cgroup stats (container backends) or /proc/self/statm (embedded backends) as a before/after delta around the data load phase.
| Engine | Layout | Disk | Memory | Ingestion | p95 read latency |
|---|---|---|---|---|---|
| murr 0.1.8 | columnar | 4.8 GiB | 9.5 GiB | 2.76M rows/s | 443 us |
| Redis 8.6.1 | blob | 1.3 GiB | 10.6 GiB | 1.31M rows/s | 998 us |
| Redis 8.6.1 | HSET | 8.2 GiB | 21.2 GiB | 381K rows/s | 4.30 ms |
| RocksDB | blob | 4.3 GiB | 2.5 GiB | 2.40M rows/s | 3.85 ms |
| PostgreSQL 17 | blob | 12.8 GiB | 13.7 GiB | 283K rows/s | 9.75 ms |
| PostgreSQL 17 | col-per-feature | 12.7 GiB | 13.5 GiB | 138K rows/s | 8.79 ms |
Python end-to-end
Measures full round-trip latency including protocol decoding and pd.DataFrame conversion. Ingestion throughput includes Python-side serialization and batch writes.
| Engine | Layout | Ingestion | Read latency |
|---|---|---|---|
| murr 0.1.8 | columnar | 2.34M rows/s | 1.38 ms |
| Redis 8.6.1 | blob | 136K rows/s | 2.42 ms |
| Redis 8.6.1 | HSET | 61K rows/s | 9.39 ms |
| RocksDB | blob | 622K rows/s | 4.90 ms |
| PostgreSQL 17 | blob | 356K rows/s | 10.8 ms |
| PostgreSQL 17 | col-per-feature | 143K rows/s | 10.6 ms |
Murr is ~2.3x faster than the best Redis layout (MGET with packed blobs) on raw read latency, and ~17x faster on Python end-to-end ingestion throughput.
Roadmap
No ETAs, but at least you can see where things stand:
- HTTP API
- Arrow Flight gRPC API
- API for data ingestion
- Storage Directory interface (which is heavily inspired by Apache Lucene)
- Segment read/writes (again, inspired by Apache Lucene)
- Python embedded murrdb, so we can make a cool demo
- Benchmarking harness: Redis support, Feast and feature-blob styles
- Win at your own benchmark (this was surprisingly hard btw)
- Support for
utf8,float32andfloat64datatypes - Python remote API client (sync + async)
- Docker image
- Support most popular Arrow numerical types (signed/unsigned int 8/16/32/64, float 16, date-time)
- Array datatypes (e.g. Arrow
list), so you can store embeddings - Sparse columns
- Add RocksDB and Postgres to the benchmark harness
- Apache Iceberg and the very popular
parquet dump on S3data catalog support
Development
License
Apache 2.0