# Memvid
The Rust engine behind Memvid. Store everything in a single `.mv2` file.
[crates.io](https://crates.io/crates/memvid-core)
[docs.rs](https://docs.rs/memvid-core)
## What is this?
`Memvid` is a library for building AI memory systems. It packs documents, embeddings, full-text search indices, and a write-ahead log into a single portable file. No database setup. No sidecar files. Just one `.mv2` file you can copy anywhere.
## Why "Frames"?
Memvid borrows from video encoding. Just as video files store sequential frames that can be played, seeked, and edited, Memvid stores your data as an append-only sequence of **frames**.
Each frame contains your content (text, PDF, image, audio) plus metadata, timestamps, and checksums. Frames group into segments for efficient compression and parallel indexing.
This design gives you:
- **Append-only simplicity**: New data never corrupts existing frames
- **Time-travel queries**: Search your memory as it existed at any point
- **Timeline playback**: Browse frames chronologically like scrubbing through video
- **Crash safety**: Incomplete writes don't affect committed frames
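To make the frame model concrete, here is a toy in-memory sketch of an append-only log with staged writes, checksums, and an as-of query. It is not Memvid's implementation: `Frame`, `FrameLog`, and the FNV-1a checksum are stand-ins invented for illustration, and the real on-disk layout is whatever the format spec defines.

```rust
/// One record in the toy log. Field names are illustrative, not Memvid's layout.
#[derive(Debug, Clone)]
struct Frame {
    id: u64,
    timestamp: u64, // Unix seconds
    payload: Vec<u8>,
    checksum: u64,
}

/// FNV-1a (64-bit), used here only as a simple stand-in checksum.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

#[derive(Default)]
struct FrameLog {
    committed: Vec<Frame>,
    pending: Vec<Frame>,
}

impl FrameLog {
    /// Stage a frame. It stays invisible to readers until `commit`.
    fn append(&mut self, timestamp: u64, payload: &[u8]) -> u64 {
        let id = (self.committed.len() + self.pending.len()) as u64;
        self.pending.push(Frame {
            id,
            timestamp,
            payload: payload.to_vec(),
            checksum: fnv1a(payload),
        });
        id
    }

    /// Publish staged frames by appending them. Existing frames are never rewritten.
    fn commit(&mut self) {
        self.committed.append(&mut self.pending);
    }

    /// Time travel: committed frames written at or before `ts`.
    fn as_of(&self, ts: u64) -> Vec<&Frame> {
        self.committed.iter().filter(|f| f.timestamp <= ts).collect()
    }

    /// Recompute every checksum and report whether all committed frames are intact.
    fn verify(&self) -> bool {
        self.committed.iter().all(|f| fnv1a(&f.payload) == f.checksum)
    }
}

fn main() {
    let mut log = FrameLog::default();
    log.append(100, b"first note");
    log.commit();
    log.append(200, b"second note");
    log.commit();
    log.append(300, b"staged only"); // simulates a crash before commit

    for frame in log.as_of(150) {
        println!("frame {} at t={}", frame.id, frame.timestamp);
    }
    println!("committed: {}, intact: {}", log.committed.len(), log.verify());
}
```

In this sketch the third frame never reaches `committed`, which is the property the crash-safety bullet describes: a write that does not finish leaves earlier frames untouched.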
## Quick Start
```rust
use memvid_core::{Memvid, PutOptions, SearchRequest};

// Create a memory file
let mut mem = Memvid::create("knowledge.mv2")?;

// Add documents with metadata
let opts = PutOptions::builder()
    .title("Meeting Notes")
    .uri("mv2://meetings/2024-01-15")
    .tag("project", "alpha")
    .build();
mem.put_bytes_with_options(b"Q4 planning discussion...", opts)?;
mem.commit()?;

// Search
let response = mem.search(SearchRequest {
    query: "planning".into(),
    top_k: 10,
    snippet_chars: 200,
    ..Default::default()
})?;
for hit in response.hits {
    println!("{}: {}", hit.title.unwrap_or_default(), hit.text);
}
```
## Installation
```toml
[dependencies]
memvid-core = { version = "2.0", features = ["lex", "vec", "temporal_track", "parallel_segments"] }
```
### Features
| Feature | Default | Description |
|---------|---------|-------------|
| `lex` | yes | Full-text search with BM25 ranking (Tantivy) |
| `vec` | no | Vector similarity search (HNSW + ONNX embeddings) |
| `temporal_track` | no | Parse natural language dates ("last Tuesday") |
| `parallel_segments` | no | Multi-threaded ingestion for large imports |
| `pdfium` | no | PDF text extraction |
## Core API
### Create and Open
```rust
// New file
let mut mem = Memvid::create("data.mv2")?;
// Open existing (read-write)
let mut mem = Memvid::open("data.mv2")?;
// Open read-only (no lock contention)
let mem = Memvid::open_read_only("data.mv2")?;
```
### Put Documents
```rust
// Simple
mem.put_bytes(b"Some text content")?;
// With metadata
let opts = PutOptions::builder()
    .title("API Reference")
    .uri("mv2://docs/api")
    .tag("version", "2.0")
    .search_text("custom text for indexing".into())
    .build();
mem.put_bytes_with_options(content, opts)?;
// Don't forget to commit
mem.commit()?;
```
### Search
```rust
let response = mem.search(SearchRequest {
    query: "distributed systems".into(),
    top_k: 50,
    snippet_chars: 200,
    scope: Some("mv2://docs/".into()), // optional: filter by URI prefix
    ..Default::default()
})?;
println!("Found {} results in {}ms", response.total_hits, response.elapsed_ms);
```
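The `scope` field narrows results to a URI prefix. How Memvid applies it internally is not covered here; the sketch below only illustrates prefix semantics, using a hypothetical `Hit` struct and `filter_by_scope` helper that are not part of the crate.

```rust
/// Hypothetical hit type for illustration; not Memvid's own result type.
struct Hit {
    uri: String,
    score: f32,
}

/// Keep hits whose URI starts with `scope`; `None` keeps everything.
fn filter_by_scope<'a>(hits: &'a [Hit], scope: Option<&str>) -> Vec<&'a Hit> {
    hits.iter()
        .filter(|h| scope.map_or(true, |prefix| h.uri.starts_with(prefix)))
        .collect()
}

fn main() {
    let hits = vec![
        Hit { uri: "mv2://docs/api".to_string(), score: 1.4 },
        Hit { uri: "mv2://docs-old/api".to_string(), score: 1.1 },
        Hit { uri: "mv2://meetings/2024-01-15".to_string(), score: 0.7 },
    ];

    for hit in filter_by_scope(&hits, Some("mv2://docs/")) {
        println!("{} ({})", hit.uri, hit.score);
    }
}
```

With plain prefix matching, the trailing slash matters: `"mv2://docs/"` skips `mv2://docs-old/api`, while `"mv2://docs"` would include it.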
### Timeline
Browse documents chronologically:
```rust
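use std::num::NonZeroU64;
// Assumed to be re-exported at the crate root, like `SearchRequest`
use memvid_core::TimelineQuery;

let entries = mem.timeline(TimelineQuery {
    limit: NonZeroU64::new(100),
    since: Some(1706745600), // Unix timestamp
    until: None,
    reverse: false,
    temporal: None,
})?;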
```
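The sketch below models how `since`, `until`, `reverse`, and `limit` could combine over plain timestamps. It rests on assumptions: both bounds are treated as inclusive, `reverse` is taken to mean newest first, and `Query` and `select` are hypothetical names, not Memvid types. Check the crate docs for the actual semantics.

```rust
use std::num::NonZeroU64;

/// Hypothetical mirror of the query fields shown above (minus `temporal`).
struct Query {
    limit: Option<NonZeroU64>,
    since: Option<i64>,
    until: Option<i64>,
    reverse: bool,
}

/// Filter by the time window, order chronologically, then apply the limit.
fn select(timestamps: &[i64], q: &Query) -> Vec<i64> {
    let mut out: Vec<i64> = timestamps
        .iter()
        .copied()
        .filter(|&t| q.since.map_or(true, |s| t >= s))
        .filter(|&t| q.until.map_or(true, |u| t <= u))
        .collect();
    out.sort_unstable();
    if q.reverse {
        out.reverse();
    }
    if let Some(n) = q.limit {
        // Clamp rather than truncate the value on 32-bit targets.
        let n = if n.get() > usize::MAX as u64 { usize::MAX } else { n.get() as usize };
        out.truncate(n);
    }
    out
}

fn main() {
    let timestamps = [1706745600, 1706659200, 1706832000, 1706918400];
    let query = Query {
        limit: NonZeroU64::new(2),
        since: Some(1706745600),
        until: None,
        reverse: true,
    };
    println!("{:?}", select(&timestamps, &query));
}
```

`NonZeroU64::new` returns an `Option`, which is why the `limit` field can take its result directly: `NonZeroU64::new(0)` is `None`, so a zero limit cannot be expressed by accident.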
### Stats and Verification
```rust
let stats = mem.stats()?;
println!("Frames: {}, Lex index: {}", stats.frame_count, stats.has_lex_index);
// Verify integrity
let report = Memvid::verify("data.mv2", true)?; // true = deep check
```
## File Format
Everything lives in the `.mv2` file:
```
┌────────────────────────────┐
│ Header (4KB) │ Magic, version, capacity
├────────────────────────────┤
│ Embedded WAL (1-64MB) │ Crash recovery
├────────────────────────────┤
│ Data Segments │ Compressed frames
├────────────────────────────┤
│ Lex Index │ Tantivy full-text
├────────────────────────────┤
│ Vec Index │ HNSW vectors
├────────────────────────────┤
│ Time Index │ Chronological ordering
├────────────────────────────┤
│ TOC (Footer) │ Segment offsets
└────────────────────────────┘
```
No `.wal`, `.lock`, `.shm`, or any other files. Ever.
See [MV2_SPEC.md](MV2_SPEC.md) for the complete file format specification.
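As a rough idea of what reading a fixed-size header involves, here is a sketch that validates a magic value and decodes two fields from a 4KB buffer. Only the 4KB size and the field names come from the diagram above. The magic bytes (`DEMO`), the byte offsets, the field widths, and the little-endian encoding are all invented for this example; the real values are in the spec.

```rust
const HEADER_LEN: usize = 4096; // the diagram gives a 4KB header
const MAGIC: &[u8; 4] = b"DEMO"; // hypothetical; NOT the real .mv2 magic

#[derive(Debug, PartialEq)]
struct Header {
    version: u16,
    capacity: u64,
}

#[derive(Debug, PartialEq)]
enum HeaderError {
    TooShort,
    BadMagic,
}

/// Decode a header from the start of `buf`, using made-up offsets.
fn parse_header(buf: &[u8]) -> Result<Header, HeaderError> {
    if buf.len() < HEADER_LEN {
        return Err(HeaderError::TooShort);
    }
    if buf[..4] != MAGIC[..] {
        return Err(HeaderError::BadMagic);
    }
    let version = u16::from_le_bytes([buf[4], buf[5]]);
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&buf[8..16]);
    let capacity = u64::from_le_bytes(raw);
    Ok(Header { version, capacity })
}

fn main() {
    // Build a header in memory instead of reading a real file.
    let mut buf = vec![0u8; HEADER_LEN];
    buf[..4].copy_from_slice(MAGIC);
    buf[4..6].copy_from_slice(&2u16.to_le_bytes());
    buf[8..16].copy_from_slice(&(64u64 * 1024 * 1024).to_le_bytes());

    println!("{:?}", parse_header(&buf));
}
```

A fixed-size header at offset zero lets a reader reject a foreign or truncated file before touching the WAL or any index.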
## Benchmarks
Run on Apple M1 Pro with 50K documents:
| Operation | Time |
|-----------|------|
| Search (single term) | 0.8ms |
| Search (multi-term) | 1.2ms |
| Cold start + first search | 190ms |
| Concurrent readers (8x) | 3.5ms total |
Run benchmarks yourself:
```bash
cd crates/memvid-core/benchmarks
cargo bench
```
## Examples
```bash
cargo run --example basic_usage
```
See [`examples/`](examples/) for more.
## Feature Compatibility
Files remember which features were enabled when created. Opening a file requires matching features:
```bash
# Check what a file needs
```
If you created a file with the CLI (which enables everything), open it with all features:
```toml
memvid-core = { version = "2.0", features = ["lex", "vec", "temporal_track"] }
```
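One way a compatibility check like this could work is a bitmask comparison between the features a file records and the features compiled into the reader. The sketch below is an assumption about the general technique, not Memvid's actual mechanism: the bit assignments and the `missing_features` helper are hypothetical.

```rust
// Hypothetical bit assignments, one per Cargo feature.
const LEX: u32 = 1 << 0;
const VEC: u32 = 1 << 1;
const TEMPORAL_TRACK: u32 = 1 << 2;

const NAMES: [(u32, &str); 3] = [
    (LEX, "lex"),
    (VEC, "vec"),
    (TEMPORAL_TRACK, "temporal_track"),
];

/// Names of features the file requires that the reader was built without.
fn missing_features(required: u32, enabled: u32) -> Vec<&'static str> {
    NAMES
        .iter()
        .filter(|&&(bit, _)| (required & bit) != 0 && (enabled & bit) == 0)
        .map(|&(_, name)| name)
        .collect()
}

fn main() {
    let file_requires = LEX | VEC | TEMPORAL_TRACK; // e.g. written with everything on
    let reader_has = LEX; // e.g. a build with default features only

    let missing = missing_features(file_requires, reader_has);
    if missing.is_empty() {
        println!("compatible");
    } else {
        println!("enable these features to open the file: {}", missing.join(", "));
    }
}
```

Reporting the missing feature names, rather than a bare failure, tells the caller exactly which entries to add to the `features` list in `Cargo.toml`.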
## Logging
Uses `tracing`. Configure in your app:
```rust
tracing_subscriber::fmt()
    .with_env_filter("memvid_core=warn")
    .init();
```
## License
Apache 2.0