§VectorLite
A tiny, in-process Rust vector store with built-in embeddings for sub-millisecond semantic search.
VectorLite is a high-performance, in-memory vector database optimized for AI agent and edge workloads.
It co-locates model inference (via Candle) with a low-latency vector index, making it ideal for session-scoped, single-instance, or privacy-sensitive environments.
§Why VectorLite?
| Feature | Description |
|---|---|
| Sub-millisecond search | In-memory HNSW or flat search tuned for real-time agent loops. |
| Built-in embeddings | Runs all-MiniLM-L6-v2 locally using Candle, or any other model of your choice. No external API calls. |
| Single-binary simplicity | No dependencies, no servers to orchestrate. Start instantly via CLI or Docker. |
| Session-scoped collections | Perfect for ephemeral agent sessions or sidecars. |
| Thread-safe concurrency | RwLock-based access and atomic ID generation for multi-threaded workloads. |
| Instant persistence | Save or restore collection snapshots in one call (see the persistence sketch below). |
VectorLite trades distributed scalability for deterministic performance, perfect for use cases where latency matters more than millions of vectors.
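The snapshot feature maps onto the re-exported `save_collection_to_file` and `load_collection_from_file` helpers. A minimal sketch, assuming the helpers take a collection plus a file path and that the client exposes a collection lookup; the `get_collection` accessor and the exact signatures below are assumptions, so check the `persistence` and `client` modules for the real API:

```rust
use vectorlite::{
    EmbeddingGenerator, IndexType, VectorLiteClient,
    load_collection_from_file, save_collection_to_file,
};

fn snapshot_roundtrip() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = VectorLiteClient::new(Box::new(EmbeddingGenerator::new()?));
    client.create_collection("quotes", IndexType::HNSW)?;

    // Hypothetical accessor: how a `Collection` handle is obtained may differ.
    let collection = client.get_collection("quotes").expect("collection exists");

    // Snapshot to disk, then restore it later, each in a single call.
    save_collection_to_file(&collection, "quotes.snapshot")?;
    let _restored = load_collection_from_file("quotes.snapshot")?;
    Ok(())
}
```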
§When to Use It
| Scenario | Why VectorLite fits |
|---|---|
| AI agent sessions | Keep short-lived embeddings per conversation. No network latency. |
| Edge or embedded AI | Run fully offline with model + index in one binary. |
| Realtime search / personalization | Sub-ms search for pre-computed embeddings. |
| Local prototyping & CI | Rust-native, no external services. |
| Single-tenant microservices | Lightweight sidecar for semantic capabilities. |
§Key Features
- In-memory storage for zero-latency access patterns
- Native Rust ML models using the Candle framework with a pluggable embedder architecture (see the sketch after this list)
- Thread-safe concurrency with RwLock per collection and atomic ID generation
- HNSW indexing for approximate nearest neighbor search with configurable accuracy
- HTTP API for easy integration with AI agents and other services
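Because the client is constructed from a boxed embedder (see the Quick Start below), swapping in a different model means implementing the re-exported `EmbeddingFunction` trait. The sketch below assumes the trait reduces to a single text-to-vector method; the method name `embed`, its signature, and the error type are assumptions rather than the crate's actual trait definition, so treat this as a shape to adapt:

```rust
use vectorlite::{EmbeddingFunction, VectorLiteClient};

// Hypothetical embedder backed by whatever model you prefer.
struct MyEmbedder;

impl EmbeddingFunction for MyEmbedder {
    // Assumed method name and signature; see the `embeddings` module for the real trait.
    fn embed(&self, _text: &str) -> Result<Vec<f32>, Box<dyn std::error::Error>> {
        // Return a fixed-size embedding; 384 matches all-MiniLM-L6-v2's output dimension.
        Ok(vec![0.0; 384])
    }
}

fn main() {
    // Assumes `VectorLiteClient::new` accepts any boxed `EmbeddingFunction`.
    let _client = VectorLiteClient::new(Box::new(MyEmbedder));
}
```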
§Quick Start
```rust
use vectorlite::{VectorLiteClient, EmbeddingGenerator, IndexType, SimilarityMetric};
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = VectorLiteClient::new(Box::new(EmbeddingGenerator::new()?));
    client.create_collection("quotes", IndexType::HNSW)?;

    let id = client.add_text_to_collection(
        "quotes",
        "I just want to lie on the beach and eat hot dogs",
        Some(json!({
            "author": "Kevin Malone",
            "tags": ["the-office", "s3:e23"],
            "year": 2005,
        }))
    )?;

    let results = client.search_text_in_collection(
        "quotes",
        "beach games",
        3,
        SimilarityMetric::Cosine,
    )?;

    for result in &results {
        println!("ID: {}, Score: {:.4}, Text: {:?}", result.id, result.score, result.text);
    }
    Ok(())
}
```

§Index Types
§FlatIndex
- Complexity: O(n) search, O(1) insert
- Memory: Linear with dataset size
- Use Case: Small datasets (< 10K vectors) or exact search requirements
§HNSWIndex
- Complexity: O(log n) search, O(log n) insert
- Memory: ~2-3x vector size due to graph structure
- Use Case: Large datasets with approximate search tolerance
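The index type is chosen per collection when it is created. A short sketch, assuming the `IndexType` enum has a flat variant alongside the `HNSW` one used in the Quick Start (the `Flat` variant name is an assumption):

```rust
use vectorlite::{EmbeddingGenerator, IndexType, VectorLiteClient};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = VectorLiteClient::new(Box::new(EmbeddingGenerator::new()?));

    // Exact, brute-force search for a small collection.
    client.create_collection("small_exact", IndexType::Flat)?;

    // Approximate search that stays fast as the collection grows.
    client.create_collection("large_approx", IndexType::HNSW)?;
    Ok(())
}
```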
§Similarity Metrics
- Cosine: Default for normalized embeddings, scale-invariant
- Euclidean: Geometric distance, sensitive to vector magnitude
- Manhattan: L1 norm, robust to outliers
- Dot Product: Raw similarity, requires consistent vector scaling
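The metrics are also available as standalone helper functions (`cosine_similarity`, `dot_product`, `euclidean_similarity`, `manhattan_similarity`) for comparing raw vectors outside a collection. A small sketch, assuming each helper takes two float slices and returns an `f32` score (the exact signatures are assumptions):

```rust
use vectorlite::{cosine_similarity, dot_product, euclidean_similarity, manhattan_similarity};

fn main() {
    let a = vec![0.1_f32, 0.9, 0.3];
    let b = vec![0.2_f32, 0.8, 0.4];

    // Assumed signatures: (&[f32], &[f32]) -> f32.
    println!("cosine:    {:.4}", cosine_similarity(&a, &b));
    println!("dot:       {:.4}", dot_product(&a, &b));
    println!("euclidean: {:.4}", euclidean_similarity(&a, &b));
    println!("manhattan: {:.4}", manhattan_similarity(&a, &b));
}
```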
§HTTP Server
```rust
use vectorlite::{VectorLiteClient, EmbeddingGenerator, start_server};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = VectorLiteClient::new(Box::new(EmbeddingGenerator::new()?));
    start_server(client, "127.0.0.1", 3001).await?;
    Ok(())
}
```

§Configuration Profiles
```sh
# Balanced (default)
cargo build

# Memory-constrained environments
cargo build --features memory-optimized

# High-precision search
cargo build --features high-accuracy
```

Re-exports§
- pub use index::flat::FlatIndex;
- pub use index::hnsw::HNSWIndex;
- pub use embeddings::EmbeddingGenerator;
- pub use embeddings::EmbeddingFunction;
- pub use client::VectorLiteClient;
- pub use client::Collection;
- pub use client::Settings;
- pub use client::IndexType;
- pub use server::create_app;
- pub use server::start_server;
- pub use persistence::PersistenceError;
- pub use persistence::save_collection_to_file;
- pub use persistence::load_collection_from_file;
Modules§
- client: Client Module
- embeddings: Embeddings Module
- errors: Error Types Module
- index: Index Module
- persistence: Persistence Module
- server: HTTP Server Module
Structs§
- SearchResult: Search result containing a vector ID, similarity score, original text, and optional metadata
- Vector: Represents a vector with an ID, floating-point values, and original text
Enums§
- SimilarityMetric: Similarity metrics for vector comparison
- VectorIndexWrapper: Wrapper enum for different vector index implementations
Constants§
- DEFAULT_VECTOR_DIMENSION: Default vector dimension for embedding models
Traits§
- VectorIndex: Trait for vector indexing implementations
Functions§
- cosine_similarity: Calculate cosine similarity between two vectors
- dot_product: Calculate dot product between two vectors
- euclidean_similarity: Calculate Euclidean similarity between two vectors
- manhattan_similarity: Calculate Manhattan similarity between two vectors