Memvid
The Rust engine behind Memvid. Store everything in a single .mv2 file.
What is this?
Memvid is a library for building AI memory systems. It packs documents, embeddings, full-text search indices, and a write-ahead log into a single portable file. No database setup. No sidecar files. Just one .mv2 file you can copy anywhere.
Why "Frames"?
Memvid borrows from video encoding. Just as video files store sequential frames that can be played, seeked, and edited, Memvid stores your data as an append-only sequence of frames.
Each frame contains your content (text, PDF, image, audio) plus metadata, timestamps, and checksums. Frames group into segments for efficient compression and parallel indexing.
This design gives you:
- Append-only simplicity: New data never corrupts existing frames
- Time-travel queries: Search your memory as it existed at any point
- Timeline playback: Browse frames chronologically like scrubbing through video
- Crash safety: Incomplete writes don't affect committed frames
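To make the frame model above concrete, here is a minimal, self-contained sketch of an append-only log with per-frame checksums and an "as of" query. It is a conceptual illustration only: `Frame`, `FrameLog`, `search_as_of`, and the FNV-1a checksum are hypothetical stand-ins and are not Memvid's types or on-disk format.

```rust
// Conceptual sketch only: these types are hypothetical, not Memvid's API.

/// One immutable record in the log.
#[derive(Debug, Clone)]
struct Frame {
    id: u64,
    timestamp: u64, // caller-supplied, e.g. seconds since the Unix epoch
    payload: Vec<u8>,
    checksum: u32,
}

/// 32-bit FNV-1a hash, standing in for a real checksum.
fn fnv1a(data: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5;
    for &byte in data {
        hash ^= u32::from(byte);
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

#[derive(Default)]
struct FrameLog {
    frames: Vec<Frame>,
}

impl FrameLog {
    /// Append a frame and return its id; existing frames are never modified.
    fn append(&mut self, timestamp: u64, payload: &[u8]) -> u64 {
        let id = self.frames.len() as u64;
        self.frames.push(Frame {
            id,
            timestamp,
            payload: payload.to_vec(),
            checksum: fnv1a(payload),
        });
        id
    }

    /// Time-travel view: only frames written at or before `as_of`.
    fn as_of(&self, as_of: u64) -> impl Iterator<Item = &Frame> {
        self.frames.iter().filter(move |f| f.timestamp <= as_of)
    }

    /// Substring search restricted to the log as it existed at `as_of`.
    fn search_as_of(&self, needle: &str, as_of: u64) -> Vec<u64> {
        self.as_of(as_of)
            .filter(|f| String::from_utf8_lossy(&f.payload).contains(needle))
            .map(|f| f.id)
            .collect()
    }

    /// Recompute every checksum and report whether all frames are intact.
    fn verify(&self) -> bool {
        self.frames.iter().all(|f| fnv1a(&f.payload) == f.checksum)
    }
}
```

Because frames are only ever appended, a query "as of" an earlier time is just a filter over the same sequence, with no snapshots or copies involved.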
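use memvid_core::{Memvid, PutOptions, SearchRequest};
// File name, metadata values, and query below are illustrative
// Create a memory file
let mut mem = Memvid::create("knowledge.mv2")?;
// Add documents with metadata
let opts = PutOptions::builder()
    .title("Meeting Notes")
    .uri("mv2://meetings/2024-01-15")
    .tag("project", "alpha")
    .build();
mem.put_bytes_with_options(b"Q4 planning discussion...", opts)?;
mem.commit()?;
// Search
let response = mem.search(SearchRequest {
    query: "planning".into(),
    top_k: 10,
    snippet_chars: 200,
    ..Default::default()
})?;
for hit in response.hits {
    println!("{}: {}", hit.title.unwrap_or_default(), hit.text);
}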
Installation
[dependencies]
memvid-core = { version = "2.0", features = ["lex", "vec", "temporal_track", "parallel_segments"] }
Features
| Feature | Default | What it does |
|---|---|---|
| lex | yes | Full-text search with BM25 ranking (Tantivy) |
| vec | no | Vector similarity search (HNSW + ONNX embeddings) |
| temporal_track | no | Parse natural language dates ("last Tuesday") |
| parallel_segments | no | Multi-threaded ingestion for large imports |
| pdfium | no | PDF text extraction |
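The lex feature ranks matches with BM25. As a rough illustration of what that ranking computes, here is the widely used formulation of the score one query term contributes to one document. The constants k1 = 1.2 and b = 0.75 are common defaults and an assumption here, and the function name `bm25_term` is made up for this sketch; Tantivy's implementation may differ in detail.

```rust
/// BM25 contribution of a single query term to a single document's score.
///
/// tf             - occurrences of the term in the document
/// doc_len        - length of the document in tokens
/// avg_doc_len    - average document length across the collection
/// n_docs         - number of documents in the collection
/// docs_with_term - number of documents containing the term
fn bm25_term(tf: f64, doc_len: f64, avg_doc_len: f64, n_docs: f64, docs_with_term: f64) -> f64 {
    const K1: f64 = 1.2; // term-frequency saturation (assumed default)
    const B: f64 = 0.75; // strength of length normalization (assumed default)

    // Rare terms get a larger inverse document frequency.
    let idf = (1.0 + (n_docs - docs_with_term + 0.5) / (docs_with_term + 0.5)).ln();
    // Documents longer than average are penalized.
    let norm = K1 * (1.0 - B + B * doc_len / avg_doc_len);

    idf * tf * (K1 + 1.0) / (tf + norm)
}
```

A document's score for a multi-term query is the sum of this value over the query terms.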
Core API
Create and Open
// New file (the file name is illustrative)
let mut mem = Memvid::create("data.mv2")?;
// Open existing (read-write)
let mut mem = Memvid::open("data.mv2")?;
// Open read-only (no lock contention)
let mem = Memvid::open_read_only("data.mv2")?;
Put Documents
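// Simple (all literal values here are illustrative)
mem.put_bytes(b"Hello, world")?;
// With metadata
let opts = PutOptions::builder()
    .title("My Document")
    .uri("mv2://docs/1")
    .tag("category", "notes")
    .search_text("text to index in place of the raw bytes")
    .build();
mem.put_bytes_with_options(b"Document body", opts)?;
// Don't forget to commit
mem.commit()?;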
Search
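// Query and limits are illustrative
let response = mem.search(SearchRequest {
    query: "planning".into(),
    top_k: 10,
    snippet_chars: 200,
    ..Default::default()
})?;
println!("Found {} hits", response.hits.len());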
Timeline
Browse documents chronologically:
use std::num::NonZeroU64;
let entries = mem.timeline?;
Stats and Verification
let stats = mem.stats()?;
println!;
// Verify integrity
let report = Memvid::verify("knowledge.mv2", true)?; // true = deep check; file name is illustrative
File Format
Everything lives in the .mv2 file:
┌────────────────────────────┐
│ Header (4KB) │ Magic, version, capacity
├────────────────────────────┤
│ Embedded WAL (1-64MB) │ Crash recovery
├────────────────────────────┤
│ Data Segments │ Compressed frames
├────────────────────────────┤
│ Lex Index │ Tantivy full-text
├────────────────────────────┤
│ Vec Index │ HNSW vectors
├────────────────────────────┤
│ Time Index │ Chronological ordering
├────────────────────────────┤
│ TOC (Footer) │ Segment offsets
└────────────────────────────┘
No .wal, .lock, .shm, or any other files. Ever.
See MV2_SPEC.md for the complete file format specification.
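As a rough picture of how a reader might validate the fixed-size header before touching anything else in the file, here is a small sketch. The 4KB size and the three fields (magic, version, capacity) come from the diagram above; everything else is an assumption made for illustration, including the field offsets, integer widths, little-endian encoding, and the placeholder magic value `DEMO`. The real layout is whatever MV2_SPEC.md says.

```rust
// Hypothetical layout for illustration; the real offsets and magic value
// are defined in MV2_SPEC.md, not here.
const HEADER_SIZE: usize = 4096;
const MAGIC: [u8; 4] = *b"DEMO"; // placeholder, not the real .mv2 magic

#[derive(Debug, PartialEq)]
struct Header {
    version: u16,
    capacity: u64,
}

#[derive(Debug, PartialEq)]
enum HeaderError {
    TooShort,
    BadMagic,
}

/// Validate and decode the first HEADER_SIZE bytes of a file.
fn parse_header(bytes: &[u8]) -> Result<Header, HeaderError> {
    if bytes.len() < HEADER_SIZE {
        return Err(HeaderError::TooShort);
    }
    if &bytes[..4] != MAGIC.as_slice() {
        return Err(HeaderError::BadMagic);
    }
    // Assumed offsets: version at bytes 4..6, capacity at bytes 8..16.
    let version = u16::from_le_bytes([bytes[4], bytes[5]]);
    let mut cap = [0u8; 8];
    cap.copy_from_slice(&bytes[8..16]);
    Ok(Header { version, capacity: u64::from_le_bytes(cap) })
}

/// Encode a header into a zero-padded HEADER_SIZE buffer.
fn encode_header(header: &Header) -> Vec<u8> {
    let mut buf = vec![0u8; HEADER_SIZE];
    buf[..4].copy_from_slice(&MAGIC);
    buf[4..6].copy_from_slice(&header.version.to_le_bytes());
    buf[8..16].copy_from_slice(&header.capacity.to_le_bytes());
    buf
}
```

Checking the length and magic first means a truncated or foreign file is rejected before any offsets from it are trusted.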
Benchmarks
Measured on an Apple M1 Pro with 50K documents:
| Operation | Time |
|---|---|
| Search (single term) | 0.8ms |
| Search (multi-term) | 1.2ms |
| Cold start + first search | 190ms |
| Concurrent readers (8x) | 3.5ms total |
Run benchmarks yourself:
Examples
See examples/ for more.
Feature Compatibility
Files remember which features were enabled when created. Opening a file requires matching features:
# Check what a file needs
|
If you created a file with the CLI (which enables everything), open it with all features:
memvid-core = { version = "2.0", features = ["lex", "vec", "temporal_track"] }
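One way to picture the matching rule in this section is as a bitmask check: a file can be opened when every feature it needs is present in the current build. The sketch below is purely illustrative. How .mv2 files actually record their features is not described here, so the one-bit-per-feature encoding and the names `missing_features` and `can_open` are assumptions.

```rust
// Hypothetical encoding: one bit per feature.
const LEX: u8 = 1 << 0;
const VEC: u8 = 1 << 1;
const TEMPORAL_TRACK: u8 = 1 << 2;

/// Features the file needs that this build lacks (0 if none are missing).
fn missing_features(file_needs: u8, build_has: u8) -> u8 {
    file_needs & !build_has
}

/// A file can be opened when the build covers everything the file needs.
fn can_open(file_needs: u8, build_has: u8) -> bool {
    missing_features(file_needs, build_has) == 0
}
```

Under this model a build with extra features can still open a file that needs fewer, while a file created with everything enabled requires a build with everything enabled.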
Logging
Uses tracing. Configure in your app:
tracing_subscriber::fmt()
    .with_env_filter("memvid_core=debug") // filter directive is illustrative
    .init();
License
Apache 2.0