# OmenDB
Embedded vector database for Python and Node.js. No server, no setup, just install.
## Quick Start
The typical workflow: create a persistent database (stored as a single `./mydb.omen` file), add vectors with metadata, run a nearest-neighbor search, and optionally narrow results with a metadata filter.
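A minimal sketch of that workflow in Python. The `omendb` module name and every call signature below are illustrative assumptions, not confirmed API; consult the package's actual documentation:

```python
# pip install omendb  (assumed package name)
import omendb

# Create database (persistent) - creates ./mydb.omen file
db = omendb.open("./mydb.omen", dim=384)

# Add vectors with metadata
db.set("doc1", [0.1] * 384, metadata={"topic": "news"})
db.set("doc2", [0.2] * 384, metadata={"topic": "sports"})

# Search
results = db.search([0.1] * 384, k=5)

# Filtered search
results = db.search([0.1] * 384, k=5, filter={"topic": "news"})
```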
## Features
- Embedded - Runs in-process, no server needed
- Persistent - Data survives restarts automatically
- Filtered search - Query by metadata with JSON-style filters
- Hybrid search - Combine vector similarity with BM25 text search
- Quantization - 4x smaller indexes with minimal recall loss
## Platforms
| Platform | Status |
|---|---|
| Linux (x86_64, ARM64) | Supported |
| macOS (Intel, Apple Silicon) | Supported |
| Windows (x86_64) | Experimental |
## API

### Database
- Open or create a persistent database
- Open an in-memory (ephemeral) database

### CRUD
- Insert or update vectors
- Get by ID
- Batch get by IDs
- Delete by IDs
- Delete by metadata filter
- Update metadata only

### Iteration
- Number of vectors (`len(db)` or an equivalent count method)
- Count of vectors matching a filter
- Iterate all IDs (lazy)
- Get all items as a list
- Check whether an ID exists (also via the `in` operator)
- Iterate all items (lazy)

### Search
- Vector search
- Filtered search
- Distance-bounded search (e.g. only results with distance <= 0.5)
- Batch search (parallel)
- Hybrid search, weighted between vector and text (e.g. 70% vector, 30% text), with optional separate scores; requires a text field on vectors
- Text-only BM25 search

### Persistence
- Flush to disk
- Close and release file locks
- Remove deleted records and reclaim space
- Reorder the graph for cache locality
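The surface above can be sketched as follows. Every name and signature here (`omendb.open`, `db.set`, `db.search`, and so on) is an illustrative assumption, not confirmed API:

```python
import omendb

db = omendb.open("./mydb.omen", dim=4)   # open or create
mem = omendb.open(":memory:", dim=4)     # in-memory (ephemeral)

# CRUD
db.set("a", [0.1, 0.2, 0.3, 0.4], metadata={"topic": "news"})
item = db.get("a")                       # get by ID
db.update_metadata("a", {"topic": "sports"})
db.delete(["a"])                         # delete by IDs

# Iteration
n = len(db)                              # number of vectors
present = "a" in db                      # same as exists()
for item in db:                          # iterate all items (lazy)
    ...

# Search
hits = db.search([0.1, 0.2, 0.3, 0.4], k=10, filter={"topic": "news"},
                 max_distance=0.5)

# Persistence
db.flush()                               # flush to disk
db.close()                               # close and release file locks
```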
## Distance Filtering

Use `max_distance` to filter out low-relevance results (prevents "context rot" in RAG): only hits within the distance cutoff, e.g. `max_distance=0.5`, are returned, and the cutoff can be combined with a metadata filter.
This ensures your RAG pipeline only receives highly relevant context, avoiding distractors that can hurt LLM performance.
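The effect of a distance cutoff can be shown without OmenDB itself; this pure-Python sketch applies a `max_distance`-style threshold to brute-force L2 results:

```python
import math

def l2(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

docs = {
    "relevant-1": [0.9, 0.1],
    "relevant-2": [0.8, 0.2],
    "distractor": [0.0, 1.0],
}
query = [1.0, 0.0]

# Rank everything by distance, then drop results beyond the cutoff,
# mimicking a search with max_distance=0.5.
ranked = sorted((l2(v, query), doc_id) for doc_id, v in docs.items())
kept = [doc_id for dist, doc_id in ranked if dist <= 0.5]
print(kept)  # the distractor (distance ~1.41) is filtered out
```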
## Filters

Supported operators:

- Equality: shorthand and explicit forms
- Comparison: not equal, greater than, greater or equal, less than, less or equal
- Membership: value in list, string contains
- Logical: AND, OR
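A tiny evaluator illustrates how these operator families compose. The operator spellings (`$gt`, `$in`, `$and`, ...) are assumptions modeled on common JSON query syntax, not confirmed OmenDB syntax:

```python
# Hypothetical operator names; OmenDB's exact spellings may differ.
OPS = {
    "$ne":       lambda v, arg: v != arg,
    "$gt":       lambda v, arg: v > arg,
    "$gte":      lambda v, arg: v >= arg,
    "$lt":       lambda v, arg: v < arg,
    "$lte":      lambda v, arg: v <= arg,
    "$in":       lambda v, arg: v in arg,
    "$contains": lambda v, arg: arg in v,
}

def matches(metadata, flt):
    """Return True if metadata satisfies a JSON-style filter."""
    for key, cond in flt.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in cond):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in cond):
                return False
        elif isinstance(cond, dict):  # explicit operator form
            if not all(OPS[op](metadata.get(key), arg) for op, arg in cond.items()):
                return False
        elif metadata.get(key) != cond:  # equality shorthand
            return False
    return True

meta = {"topic": "news", "year": 2024}
print(matches(meta, {"topic": "news"}))                      # True
print(matches(meta, {"year": {"$gte": 2020, "$lt": 2025}}))  # True
print(matches(meta, {"$or": [{"topic": "sports"},
                             {"year": {"$in": [2024]}}]}))   # True
```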
## Configuration

Quantization options:

- `True` or `"sq8"`: SQ8, ~4x smaller, ~99% recall (recommended)
- `None`/`False`: full precision (default)

Distance metric options:

- `"l2"` or `"euclidean"`: Euclidean distance (default)
- `"cosine"`: cosine distance (1 - cosine similarity)
- `"dot"` or `"ip"`: inner product (for MIPS)
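The three metrics relate to one another as shown in this pure-Python sketch, which computes each for the same vector pair (no OmenDB API involved):

```python
import math

def l2(a, b):
    """Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot(a, b):
    """Inner product (higher = more similar, used for MIPS)."""
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 - cosine similarity, so lower = more similar."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return 1.0 - dot(a, b) / norm

a, b = [1.0, 0.0], [1.0, 1.0]
print(round(l2(a, b), 4))               # 1.0
print(round(dot(a, b), 4))              # 1.0
print(round(cosine_distance(a, b), 4))  # 0.2929
```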
The database can be used as a context manager, which flushes automatically on exit.

Hybrid search accepts an `alpha` weight (0 = pure text, 1 = pure vector, default 0.5). For debugging and tuning, results can also carry separate keyword and semantic scores, e.g. `{"id": "...", "score": 0.85, "keyword_score": 0.92, "semantic_score": 0.78}`.
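The alpha blend can be sketched as a simple convex combination; OmenDB's exact score normalization is not documented here, so this only conveys the general idea:

```python
def hybrid_score(semantic_score, keyword_score, alpha=0.5):
    # alpha = 1.0 -> pure vector (semantic), alpha = 0.0 -> pure text (keyword)
    return alpha * semantic_score + (1.0 - alpha) * keyword_score

# With the separate scores from the example result above:
s = hybrid_score(semantic_score=0.78, keyword_score=0.92, alpha=0.5)
print(round(s, 2))  # 0.85
```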
## Performance

10K vectors, Apple M3 Max (m=16, ef=100, k=10), measured 2026-01-20:
| Dimension | Single QPS | Batch QPS | Speedup |
|---|---|---|---|
| 128D | 11,542 | 82,015 | 7.1x |
| 768D | 3,531 | 26,254 | 7.4x |
| 1536D | 1,825 | 7,579 | 4.2x |
SIFT-1M (1M vectors, 128D, m=16, ef=100, k=10):
| Machine | QPS | Recall |
|---|---|---|
| i9-13900KF | 4,591 | 98.6% |
| Apple M3 Max | 3,216 | 98.4% |
Quantization reduces memory with minimal recall loss:
| Mode | Compression | Use Case |
|---|---|---|
| f32 | 1x | Default, highest recall |
| sq8 | 4x | Recommended for most users |
SQ8 is enabled by passing `True` or `"sq8"` as the quantization option when opening the database.
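Scalar quantization to 8 bits (SQ8) is the standard technique behind the 4x figure: each float32 component is mapped to a single uint8 code. A minimal pure-Python illustration (not OmenDB's actual implementation):

```python
def sq8_encode(vec):
    """Map each float to one byte using a per-vector min/max range."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255.0 or 1.0  # avoid zero scale for constant vectors
    codes = [round((x - lo) / scale) for x in vec]  # one uint8 per dimension
    return codes, lo, scale

def sq8_decode(codes, lo, scale):
    """Reconstruct approximate floats from the byte codes."""
    return [lo + c * scale for c in codes]

vec = [0.12, -0.5, 0.33, 0.9]
codes, lo, scale = sq8_encode(vec)
approx = sq8_decode(codes, lo, scale)

# 4 bytes (f32) -> 1 byte (u8) per dimension: 4x compression,
# with reconstruction error bounded by half a quantization step.
print(max(abs(a - b) for a, b in zip(vec, approx)))
```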
- Parameters: m=16, ef_construction=100, ef_search=100
- Batch: Uses Rayon for parallel search across all cores
- Recall: Validated against brute-force ground truth on SIFT/GloVe
- Reproduce:
  - Quick (10K): `uv run python benchmarks/run.py`
  - SIFT-1M: `uv run python benchmarks/ann_dataset_test.py --dataset sift-128-euclidean`
## Tuning ef_search for High Dimensions

The `ef_search` parameter controls the recall/speed tradeoff at query time. Higher values explore more candidates, improving recall but slowing search.
Rules of thumb:

- `ef_search` must be >= k (the number of results requested)
- For 128D embeddings: ef=100 usually achieves 90%+ recall
- For 768D+ embeddings: increase to ef=200-400 for better recall
- If recall drops at scale (50K+ vectors), increase both ef_search and ef_construction
Runtime tuning: the current `ef_search` value can be read and changed on an open database (raise it for better recall at some speed cost, lower it for speed at some recall cost), and it can also be overridden per query.
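As a sketch, assuming an `ef_search` attribute on the database handle and a per-query keyword argument (both are assumptions, not confirmed API):

```python
# Assumed attribute and keyword names, for illustration only.
print(db.ef_search)        # check current value, e.g. 100
db.ef_search = 300         # increase for better recall (slower)
db.ef_search = 64          # decrease for speed (may reduce recall)

# Per-query override
hits = db.search(query_vec, k=10, ef_search=200)
```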
Recommended settings by use case:
| Use Case | ef_search | Expected Recall |
|---|---|---|
| Fast search (128D) | 64 | ~85% |
| Balanced (default) | 100 | ~90% |
| High recall (768D+) | 200-300 | ~95%+ |
| Maximum recall | 500+ | ~98%+ |
## Examples

See complete working examples:

- `python/examples/quickstart.py`: minimal Python example
- `python/examples/basic.py`: CRUD operations and persistence
- `python/examples/filters.py`: all filter operators
- `python/examples/rag.py`: RAG workflow with mock embeddings
- `node/examples/quickstart.js`: minimal Node.js example
## Integrations

### LangChain
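A vector-store adapter would typically follow the standard LangChain pattern sketched below; the `OmenDBVectorStore` class name and its import path are hypothetical, chosen only to illustrate that pattern:

```python
# Hypothetical adapter names; check the OmenDB docs for the real ones.
from langchain_openai import OpenAIEmbeddings
from omendb.integrations.langchain import OmenDBVectorStore  # assumed path

store = OmenDBVectorStore(path="./mydb.omen", embedding=OpenAIEmbeddings())
store.add_texts(["OmenDB runs in-process"], metadatas=[{"topic": "db"}])
docs = store.similarity_search("embedded vector database", k=3)
```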
### LlamaIndex
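Similarly for LlamaIndex, the `OmenDBVectorStore` name and import path below are hypothetical; the `StorageContext`/`VectorStoreIndex` pattern itself is standard LlamaIndex usage:

```python
from llama_index.core import Document, StorageContext, VectorStoreIndex
from omendb.integrations.llamaindex import OmenDBVectorStore  # assumed path

vector_store = OmenDBVectorStore(path="./mydb.omen")  # assumed class
storage = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    [Document(text="OmenDB is an embedded vector database.")],
    storage_context=storage,
)
answer = index.as_query_engine().query("What is OmenDB?")
```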
## License
Elastic License 2.0 - Free to use, modify, and embed. The only restriction: you can't offer OmenDB as a managed service to third parties.