OmenDB
Embedded vector database for Python and Node.js. No server, no setup, just install.
- 7,600 QPS single / 64,000 QPS batch search, 99.8% recall (SIFT-100K)
- 60K vec/s insert throughput
- SQ8 quantization (4x compression, 99.8% recall, 2x faster search)
- ACORN-1 predicate-aware filtered search
- Hybrid search -- BM25 text + vector with RRF fusion
- Multi-vector -- ColBERT/MaxSim with MUVERA and token pooling
- Auto-embedding -- pass a function, store documents, search with strings
Quick Start
Python
With auto-embedding -- pass an embedding function, work with documents and strings:
Define your embedding function (OpenAI, sentence-transformers, etc.), open the database with it, add documents (auto-embedded on insert), then search with plain text (auto-embedded at query time). See `python/examples/embedding_fn.py` for the full code.
With vectors, bring your own embeddings: open the database, insert precomputed vectors with IDs and metadata, and query with a vector. See `python/examples/quickstart.py` for the full code.
Node.js
With auto-embedding, construct the database with `embeddingFn` and work with documents and strings; with manual vectors, insert and query with vectors directly. The API mirrors Python; see `node/examples/quickstart.js` and `node/examples/embedding_fn.js` for the full code.
Features
- HNSW graph indexing -- SIMD-accelerated distance computation
- ACORN-1 filtered search -- predicate-aware graph traversal, 37.79x speedup over post-filtering
- SQ8 quantization -- 4x compression, 99.8% recall, 2x faster search
- BM25 text search -- full-text search via Tantivy
- Hybrid search -- RRF fusion of vector + text results
- Multi-vector / ColBERT -- MUVERA + MaxSim scoring for token-level retrieval
- Token pooling -- k-means clustering, 50% storage reduction for multi-vector
- Auto-embedding -- `embedding_fn` (Python) / `embeddingFn` (Node.js) for document-in, text-query workflows
- Collections -- namespaced sub-databases within a single file
- Persistence -- WAL + atomic checkpoints
- O(1) lazy delete + compaction -- deleted records cleaned up in background
- Segment-based architecture -- background merging for sustained write throughput
- Context manager (Python) / `close()` (Node.js) for resource cleanup
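SQ8 is standard scalar quantization: each float32 dimension is linearly mapped to an unsigned 8-bit code, which is where the 4x compression comes from. A minimal pure-Python sketch of the idea (illustrative only, not OmenDB's internal implementation):

```python
def sq8_train(vectors):
    """Compute per-dimension min/max ranges from training vectors."""
    dims = len(vectors[0])
    lo = [min(v[d] for v in vectors) for d in range(dims)]
    hi = [max(v[d] for v in vectors) for d in range(dims)]
    return lo, hi

def sq8_encode(v, lo, hi):
    """Map each float32 component to an integer code in [0, 255]."""
    codes = []
    for x, a, b in zip(v, lo, hi):
        scale = (b - a) or 1.0  # guard against constant dimensions
        codes.append(round(255 * (min(max(x, a), b) - a) / scale))
    return codes

def sq8_decode(codes, lo, hi):
    """Reconstruct approximate float values from the 8-bit codes."""
    return [a + (b - a) * c / 255 for c, a, b in zip(codes, lo, hi)]
```

The small rounding error introduced here is what accounts for the ~1% recall loss quoted above; running distance computations on the compact int8 codes is what enables the faster search.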
Platforms
| Platform | Status |
|---|---|
| Linux (x86_64, ARM64) | Supported |
| macOS (Intel, Apple Silicon) | Supported |
API Reference
Python
The Python API surface (full signatures are shown in `python/examples/`):

- Database: open with auto-embedding, with manual vectors, or in-memory
- CRUD: insert/update (vectors or documents), single insert, get by ID, batch get, delete by IDs, delete by metadata filter, update fields
- Search: vector or string query, filtered search (ACORN-1), distance threshold, batch search (parallel)
- Hybrid search: string query (auto-embeds both), text-only BM25
- Iteration: count, filtered count, lazy ID iterator, all items (loads to memory), lazy iteration, existence check
- Collections: create/get collection, list collections, delete collection
- Persistence: flush to disk, close, remove deleted records, reorder for cache locality, merge databases
- Config: get/set search quality (`ef_search`), vector dimensionality, database statistics
Node.js
The Node.js API mirrors the Python surface: database construction (with or without `embeddingFn`), CRUD, search (including filtered and batch), hybrid search, collections, and persistence. Write and search calls are generally async (`await`).
Configuration
Quantization options:
- `True` or `"sq8"`: SQ8, ~4x smaller, ~99% recall (recommended)
- `None`/`False`: full precision (default)

Distance metric options:
- `"l2"` or `"euclidean"`: Euclidean distance (default)
- `"cosine"`: cosine distance (1 - cosine similarity)
- `"dot"` or `"ip"`: inner product (for MIPS)

The Python context manager auto-flushes on exit.
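The three metrics follow the usual definitions; a plain-Python sketch (independent of OmenDB's SIMD kernels):

```python
import math

def l2(a, b):
    """Euclidean distance -- the "l2"/"euclidean" option."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cosine similarity -- the "cosine" option."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

def inner_product(a, b):
    """Inner product -- the "dot"/"ip" option; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))
```

Note the orientation difference: for l2 and cosine, smaller is better; for inner product, larger is better, which is why it is listed for MIPS (maximum inner-product search).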
Distance Filtering
Use max_distance to filter out low-relevance results (prevents "context rot" in RAG):
Pass `max_distance` (e.g. 0.5) to return only results at that distance or closer; it composes with metadata filters.
This ensures your RAG pipeline only receives highly relevant context, avoiding distractors that can hurt LLM performance.
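The cutoff itself is simple set filtering; the sketch below (plain Python, with hypothetical result dicts) shows what the threshold amounts to:

```python
def apply_max_distance(results, max_distance):
    """Keep only hits whose distance is at or below the threshold."""
    return [r for r in results if r["distance"] <= max_distance]

# Hypothetical search output: lower distance = more relevant
hits = [
    {"id": "a", "distance": 0.2},
    {"id": "b", "distance": 0.45},
    {"id": "c", "distance": 0.9},  # weak match -- dropped by the cutoff
]
relevant = apply_max_distance(hits, 0.5)
```

The practical benefit is that a fixed-k search always returns k results, even when only two are actually on-topic; the distance cutoff trims the rest before they reach the LLM.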
Filters
- Equality: shorthand and explicit forms
- Comparison: not equal, greater than, greater or equal, less than, less or equal
- Membership: in list, string contains
- Logical: AND, OR

See `python/examples/filters.py` for the exact syntax of every operator.
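Filter evaluation against per-record metadata can be sketched in plain Python. The Mongo-style operator names below (`$gt`, `$in`, ...) are illustrative assumptions, not necessarily OmenDB's spelling; the real syntax is in `python/examples/filters.py`:

```python
def matches(metadata, filt):
    """Evaluate a Mongo-style filter dict against one record's metadata.
    Operator names here are illustrative, not OmenDB's actual syntax."""
    ops = {
        "$ne": lambda v, t: v != t,
        "$gt": lambda v, t: v > t,
        "$gte": lambda v, t: v >= t,
        "$lt": lambda v, t: v < t,
        "$lte": lambda v, t: v <= t,
        "$in": lambda v, t: v in t,
        "$contains": lambda v, t: t in v,
    }
    for key, cond in filt.items():
        if key == "$and":
            if not all(matches(metadata, c) for c in cond):
                return False
        elif key == "$or":
            if not any(matches(metadata, c) for c in cond):
                return False
        elif isinstance(cond, dict):  # explicit operator form
            if not all(ops[op](metadata.get(key), t) for op, t in cond.items()):
                return False
        elif metadata.get(key) != cond:  # shorthand equality
            return False
    return True
```

ACORN-1 applies a predicate like this during graph traversal rather than after it, which is where the speedup over post-filtering comes from.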
Hybrid Search
Combine vector similarity with BM25 full-text search using RRF fusion:
With `embedding_fn`, pass a single string and it is used as both the vector query (auto-embedded) and the text query; with manual vectors, supply the query vector yourself. Tune `alpha` to weight the fusion: 0 = text only, 1 = vector only, default 0.5. For debugging and tuning, results can carry separate keyword and semantic scores, e.g. `{"id": "...", "score": 0.85, "keyword_score": 0.92, "semantic_score": 0.78}`. Text-only BM25 search is also available.
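RRF fuses the two result lists by rank rather than by raw score, which sidesteps the problem that BM25 scores and vector distances live on incompatible scales. A minimal weighted-RRF sketch (the constant `k=60` is the conventional choice from the RRF literature, not necessarily what OmenDB uses):

```python
def rrf_fuse(vector_ids, text_ids, alpha=0.5, k=60):
    """Weighted reciprocal-rank fusion of two ranked ID lists.
    alpha=1.0 -> vector only, alpha=0.0 -> text only."""
    scores = {}
    for rank, doc_id in enumerate(vector_ids):
        scores[doc_id] = scores.get(doc_id, 0.0) + alpha / (k + rank + 1)
    for rank, doc_id in enumerate(text_ids):
        scores[doc_id] = scores.get(doc_id, 0.0) + (1 - alpha) / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

A document that appears in both lists accumulates contributions from each, so agreement between the keyword and semantic sides pushes it to the top.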
Multi-vector (ColBERT)
MUVERA with MaxSim scoring for ColBERT-style token-level retrieval. Token pooling via k-means reduces storage by 50%.
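MaxSim scores a document by summing, over query tokens, the best-matching document-token similarity; MUVERA's role is to reduce these multi-vector comparisons to cheap single-vector candidate retrieval before exact MaxSim re-scoring. A plain-Python sketch of the scoring step:

```python
def maxsim(query_tokens, doc_tokens):
    """ColBERT-style MaxSim: for each query token embedding, take the
    maximum dot product over all document token embeddings, then sum."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)
```

Because each query token independently picks its best document token, MaxSim rewards documents that cover all parts of the query, not just its dominant topic.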
Performance
Authoritative baseline: SIFT-100K · 128D · M=16 · ef_construction=100 · ef_search=100 · k=10 · Fedora i9-13900KF (5-run median)
| Mode | Build | Single | Batch | Recall@10 |
|---|---|---|---|---|
| fp32 | 24,881 v/s | 2,324 QPS | 39,905 QPS | 99.8% |
| SQ8 | pending refreshed Linux run | pending | pending | pending |
Batch search uses Rayon for parallel execution across all cores. Scales to 1M+ vectors. Apple Silicon runs are still useful for local reference, but Fedora/Linux medians are the authoritative comparison baseline.
Filtered search (ACORN-1, 10% selectivity): predicate-aware graph traversal, no post-filter overhead.
- Dataset: SIFT-100K (real 128D embeddings, not random vectors)
- Parameters: M=16, ef_construction=100, ef_search=100, k=10
- Batch: parallel via Rayon
- Recall: validated against brute-force ground truth
- Authoritative runs: Fedora/Linux medians from `cd python && uv run python benchmark.py --publish`
- Local reproduction: `cd python && uv run python benchmark.py`
- Synthetic sweeps: `uv run python benchmark.py --full` is exploratory and not comparable to SIFT history
- Current Apple M3 Max reference: fp32 59,789 v/s, 7,644 QPS, 64,570 QPS, 99.8%; SQ8 59,905 v/s, 15,403 QPS, 95,442 QPS, 99.8%
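Recall@10 in the table is measured against brute-force ground truth; the metric itself is just top-k set overlap (a sketch of the measurement, not the benchmark script):

```python
def recall_at_k(retrieved, ground_truth, k=10):
    """Fraction of the true top-k neighbors found in the retrieved top-k."""
    return len(set(retrieved[:k]) & set(ground_truth[:k])) / k
```

Averaging this over every query in SIFT-100K yields the recall figures reported above.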
Tuning
The ef_search parameter controls the recall/speed tradeoff at query time. Higher values explore more candidates, improving recall but slowing search.
Rules of thumb:
- `ef_search` must be >= k (number of results requested)
- For 128D embeddings: ef=100 usually achieves 90%+ recall
- For 768D+ embeddings: increase to ef=200-400 for better recall
- If recall drops at scale (50K+), increase both ef_search and ef_construction
Runtime tuning:
`ef_search` can be read at runtime (default 100), raised (e.g. to 200) for better recall at the cost of speed, lowered (e.g. to 50) for speed at the possible cost of recall, or overridden per query.
Recommended settings by use case:
| Use Case | ef_search | Expected Recall |
|---|---|---|
| Fast search (128D) | 64 | ~85% |
| Balanced (default) | 100 | ~90% |
| High recall (768D+) | 200-300 | ~95%+ |
| Maximum recall | 500+ | ~98%+ |
Examples
See complete working examples:
- `python/examples/quickstart.py` -- Minimal Python example
- `python/examples/basic.py` -- CRUD operations and persistence
- `python/examples/filters.py` -- All filter operators
- `python/examples/rag.py` -- RAG workflow with mock embeddings
- `python/examples/embedding_fn.py` -- Auto-embedding with embedding_fn
- `python/examples/quantization.py` -- SQ8 quantization
- `node/examples/quickstart.js` -- Minimal Node.js example
- `node/examples/embedding_fn.js` -- Auto-embedding with embeddingFn
- `node/examples/multivector.ts` -- Multi-vector / ColBERT
Integrations
LangChain
LangChain integration requires Python 3.10+.
LlamaIndex
License
Elastic License 2.0 -- Free to use, modify, and embed. The only restriction: you can't offer OmenDB as a managed service to third parties.