# OmenDB
Embedded vector database for Python and Node.js. No server, no setup, just install.
## Quick Start
A minimal sketch; the `omendb` package name and the `open`/`add`/`search` call names are assumptions:
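```python
import omendb  # assumed package name

# Create database (persistent) - creates ./mydb.omen file
db = omendb.open("./mydb.omen", dim=4)  # `dim` is an assumed parameter

# Add vectors with metadata
db.add([
    {"id": "a", "vector": [0.1, 0.2, 0.3, 0.4], "metadata": {"topic": "news"}},
    {"id": "b", "vector": [0.4, 0.3, 0.2, 0.1], "metadata": {"topic": "sports"}},
])

# Search
results = db.search([0.1, 0.2, 0.3, 0.4], k=5)

# Filtered search
results = db.search([0.1, 0.2, 0.3, 0.4], k=5, filter={"topic": "news"})
```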
## Features
- Embedded - Runs in-process, no server needed
- Persistent - Data survives restarts automatically
- Filtered search - Query by metadata with JSON-style filters
- Hybrid search - Combine vector similarity with BM25 text search
- Quantization - 4x smaller indexes with minimal recall loss
## Platforms
| Platform | Status |
|---|---|
| Linux (x86_64, ARM64) | Supported |
| macOS (Intel, Apple Silicon) | Supported |
| Windows (x86_64) | Experimental |
## API
An overview sketch: the comments describe the documented operations, while the call names themselves (`omendb.open`, `db.add`, `db.search`, `db.flush`, and the rest) are assumptions:
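```python
import omendb  # assumed package name

vec = [0.0] * 768
query_vec = [0.0] * 768
query_vecs = [query_vec, query_vec]

# Database
db = omendb.open("./mydb.omen", dim=768)  # Open or create
mem = omendb.open(":memory:", dim=768)    # In-memory (ephemeral)

# CRUD
db.add([{"id": "a", "vector": vec, "metadata": {"topic": "news"}}])  # Insert/update vectors
item = db.get("a")                          # Get by ID
items = db.get_many(["a", "b"])             # Batch get by IDs
db.update_metadata("a", {"topic": "tech"})  # Update metadata only
db.delete(["a", "b"])                       # Delete by IDs
db.delete(filter={"topic": "news"})         # Delete by metadata filter

# Iteration
len(db)                             # Number of vectors
db.count()                          # Same as len(db)
db.count(filter={"topic": "news"})  # Count matching filter
db.ids()                            # Iterate all IDs (lazy)
db.items()                          # Get all items as list
db.exists("a")                      # Check if ID exists
"a" in db                           # Same as exists()
for item in db: ...                 # Iterate all items (lazy)

# Search
db.search(query_vec, k=10)                            # Vector search
db.search(query_vec, k=10, filter={"topic": "news"})  # Filtered search
db.search(query_vec, k=10, max_distance=0.5)          # Only results with distance <= 0.5
db.search_batch(query_vecs, k=10)                     # Batch search (parallel)
db.search(query_vec, text="breaking news", k=10)      # Hybrid search (requires text field in vectors)
db.search(query_vec, text="breaking news", k=10, alpha=0.7)           # 70% vector, 30% text
db.search(query_vec, text="breaking news", k=10, return_scores=True)  # Return separate scores
db.search(text="breaking news", k=10)                 # Text-only BM25

# Persistence
db.flush()     # Flush to disk
db.close()     # Close and release file locks
db.compact()   # Remove deleted records, reclaim space
db.optimize()  # Reorder graph for cache locality
```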
### Distance Filtering

Use `max_distance` to filter out low-relevance results (prevents "context rot" in RAG). A sketch, assuming the same `db.search` API as above:
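```python
# Only return results with distance <= 0.5
results = db.search(query_vec, k=10, max_distance=0.5)

# Combine with metadata filter
results = db.search(query_vec, k=10, max_distance=0.5, filter={"topic": "news"})
```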
This ensures your RAG pipeline only receives highly relevant context, avoiding distractors that can hurt LLM performance.
### Filters

Filters are JSON-style dictionaries covering equality (shorthand and explicit forms), comparison (not equal, greater/less than, greater/less or equal), membership (in list, string contains), and logical AND/OR. A sketch, assuming MongoDB-style operator spellings (the `$`-prefixed names are assumptions):
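```python
# Equality
{"topic": "news"}                       # Shorthand
{"topic": {"$eq": "news"}}              # Explicit

# Comparison
{"year": {"$ne": 2020}}                 # Not equal
{"year": {"$gt": 2020}}                 # Greater than
{"year": {"$gte": 2020}}                # Greater or equal
{"year": {"$lt": 2020}}                 # Less than
{"year": {"$lte": 2020}}                # Less or equal

# Membership
{"topic": {"$in": ["news", "sports"]}}  # In list
{"title": {"$contains": "omen"}}        # String contains

# Logical
{"$and": [{"topic": "news"}, {"year": {"$gte": 2020}}]}  # AND
{"$or": [{"topic": "news"}, {"topic": "sports"}]}        # OR
```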
## Configuration

A sketch of the documented options; the parameter names (`quantization`, `metric`, `dim`, `alpha`, `return_scores`) are assumptions:
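```python
import omendb  # assumed package name

db = omendb.open("./mydb.omen", dim=768, quantization="sq8", metric="cosine")

# Quantization options:
# - True or "sq8": SQ8, ~4x smaller, ~99% recall (recommended)
# - None/False: Full precision (default)

# Distance metric options:
# - "l2" or "euclidean": Euclidean distance (default)
# - "cosine": Cosine distance (1 - cosine similarity)
# - "dot" or "ip": Inner product (for MIPS)

# Context manager (auto-flush on exit)
with omendb.open("./mydb.omen", dim=768) as db:
    query_vec = [0.0] * 768

    # Hybrid search with alpha (0=text, 1=vector, default=0.5)
    results = db.search(query_vec, text="earnings report", k=10, alpha=0.7)

    # Get separate keyword and semantic scores for debugging/tuning
    results = db.search(query_vec, text="earnings report", k=10, return_scores=True)
    # Each result: {"id": "...", "score": 0.85, "keyword_score": 0.92, "semantic_score": 0.78}
```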
## Performance
10K vectors, Apple M3 Max (m=16, ef=100, k=10). Measured 2026-01-20:
| Dimension | Single QPS | Batch QPS | Speedup |
|---|---|---|---|
| 128D | 11,542 | 82,015 | 7.1x |
| 768D | 3,531 | 26,254 | 7.4x |
| 1536D | 1,825 | 7,579 | 4.2x |
SIFT-1M (1M vectors, 128D, m=16, ef=100, k=10):
| Machine | QPS | Recall |
|---|---|---|
| i9-13900KF | 4,591 | 98.6% |
| Apple M3 Max | 3,216 | 98.4% |
Quantization reduces memory with minimal recall loss:
| Mode | Compression | Use Case |
|---|---|---|
| f32 | 1x | Default, highest recall |
| sq8 | 4x | Recommended for most users |
To enable SQ8, pass the assumed `quantization` parameter when opening the database:
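```python
db = omendb.open("./mydb.omen", dim=768, quantization="sq8")  # Enable SQ8
```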
- Parameters: m=16, ef_construction=100, ef_search=100
- Batch: Uses Rayon for parallel search across all cores
- Recall: Validated against brute-force ground truth on SIFT/GloVe
- Reproduce:
  - Quick (10K): `uv run python benchmarks/run.py`
  - SIFT-1M: `uv run python benchmarks/ann_dataset_test.py --dataset sift-128-euclidean`
### Tuning `ef_search` for High Dimensions
The `ef_search` parameter controls the recall/speed tradeoff at query time. Higher values explore more candidates, improving recall but slowing search.

Rules of thumb:

- `ef_search` must be >= `k` (the number of results requested)
- For 128D embeddings: `ef=100` usually achieves 90%+ recall
- For 768D+ embeddings: increase to `ef=200-400` for better recall
- If recall drops at scale (50K+ vectors), increase both `ef_search` and `ef_construction`
Runtime tuning (a sketch, assuming a mutable `db.ef_search` attribute and a per-query `ef_search` argument):
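```python
# Check current value
print(db.ef_search)  # 100

# Increase for better recall (slower)
db.ef_search = 300

# Decrease for speed (may reduce recall)
db.ef_search = 64

# Per-query override
results = db.search(query_vec, k=10, ef_search=200)
```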
Recommended settings by use case:
| Use Case | ef_search | Expected Recall |
|---|---|---|
| Fast search (128D) | 64 | ~85% |
| Balanced (default) | 100 | ~90% |
| High recall (768D+) | 200-300 | ~95%+ |
| Maximum recall | 500+ | ~98%+ |
## Examples
See complete working examples:
- `python/examples/quickstart.py` - Minimal Python example
- `python/examples/basic.py` - CRUD operations and persistence
- `python/examples/filters.py` - All filter operators
- `python/examples/rag.py` - RAG workflow with mock embeddings
- `node/examples/quickstart.js` - Minimal Node.js example
## Integrations
### LangChain
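A minimal sketch, assuming a hypothetical `OmenDBVectorStore` wrapper (the class name and import path are assumptions):

```python
# Hypothetical class name and import path - check the package docs for the real ones
from omendb.integrations.langchain import OmenDBVectorStore

from langchain_openai import OpenAIEmbeddings

store = OmenDBVectorStore(path="./mydb.omen", embedding=OpenAIEmbeddings())
store.add_texts(["OmenDB is an embedded vector database."])
docs = store.similarity_search("What is OmenDB?", k=3)
```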
### LlamaIndex
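A minimal sketch, again assuming a hypothetical `OmenDBVectorStore` wrapper for LlamaIndex (the class name and import path are assumptions):

```python
# Hypothetical class name and import path - check the package docs for the real ones
from omendb.integrations.llama_index import OmenDBVectorStore

from llama_index.core import Document, StorageContext, VectorStoreIndex

vector_store = OmenDBVectorStore(path="./mydb.omen")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    [Document(text="OmenDB is an embedded vector database.")],
    storage_context=storage_context,
)
nodes = index.as_retriever(similarity_top_k=3).retrieve("What is OmenDB?")
```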
## License
Elastic License 2.0 - Free to use, modify, and embed. The only restriction: you can't offer OmenDB as a managed service to third parties.