Crate vectorlite

§VectorLite

A tiny, in-process Rust vector store with built-in embeddings for sub-millisecond semantic search.

VectorLite is a high-performance, in-memory vector database optimized for AI agent and edge workloads.
It co-locates model inference (via Candle) with a low-latency vector index, making it ideal for session-scoped, single-instance, or privacy-sensitive environments.

§Why VectorLite?

  • Sub-millisecond search: In-memory HNSW or flat search tuned for real-time agent loops.
  • Built-in embeddings: Runs all-MiniLM-L6-v2 locally using Candle, or any other model of your choice. No external API calls.
  • Single-binary simplicity: No dependencies, no servers to orchestrate. Start instantly via CLI or Docker.
  • Session-scoped collections: Perfect for ephemeral agent sessions or sidecars.
  • Thread-safe concurrency: RwLock-based access and atomic ID generation for multi-threaded workloads.
  • Instant persistence: Save or restore collection snapshots in one call.

VectorLite trades distributed scalability for deterministic performance; it targets use cases where low latency matters more than scaling to millions of vectors.

§When to Use It

  • AI agent sessions: Keep short-lived embeddings per conversation. No network latency.
  • Edge or embedded AI: Run fully offline with model + index in one binary.
  • Realtime search / personalization: Sub-ms search for pre-computed embeddings.
  • Local prototyping & CI: Rust-native, no external services.
  • Single-tenant microservices: Lightweight sidecar for semantic capabilities.

§Key Features

  • In-memory storage for zero-latency access patterns
  • Native Rust ML models using the Candle framework with a pluggable architecture
  • Thread-safe concurrency with RwLock per collection and atomic ID generation
  • HNSW indexing for approximate nearest neighbor search with configurable accuracy
  • HTTP API for easy integration with AI agents and other services

§Quick Start

use vectorlite::{VectorLiteClient, EmbeddingGenerator, IndexType, SimilarityMetric};
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Build a client around the bundled embedding model (all-MiniLM-L6-v2 via Candle).
    let mut client = VectorLiteClient::new(Box::new(EmbeddingGenerator::new()?));

    // Create a collection backed by an HNSW index.
    client.create_collection("quotes", IndexType::HNSW)?;

    // Embed a document and store it with optional JSON metadata.
    let id = client.add_text_to_collection(
        "quotes",
        "I just want to lie on the beach and eat hot dogs",
        Some(json!({
            "author": "Kevin Malone",
            "tags": ["the-office", "s3:e23"],
            "year": 2005,
        }))
    )?;

    // Embed the query and return the top 3 matches by cosine similarity.
    let results = client.search_text_in_collection(
        "quotes",
        "beach games",
        3,
        SimilarityMetric::Cosine,
    )?;

    for result in &results {
        println!("ID: {}, Score: {:.4}, Text: {:?}", result.id, result.score, result.text);
    }

    Ok(())
}

§Index Types

§FlatIndex

  • Complexity: O(n) search, O(1) insert
  • Memory: Linear with dataset size
  • Use Case: Small datasets (< 10K vectors) or exact search requirements

§HNSWIndex

  • Complexity: O(log n) search, O(log n) insert
  • Memory: ~2-3x vector size due to graph structure
  • Use Case: Large datasets with approximate search tolerance
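
The index type is selected per collection at creation time. The sketch below chooses between the two; note that only IndexType::HNSW appears in the Quick Start, so the IndexType::Flat variant name here is an assumption inferred from the re-exported FlatIndex type.

use vectorlite::{VectorLiteClient, EmbeddingGenerator, IndexType};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = VectorLiteClient::new(Box::new(EmbeddingGenerator::new()?));

    // Exact, brute-force search: suited to small collections (< 10K vectors).
    // The `Flat` variant name is assumed from the `FlatIndex` type.
    client.create_collection("small_exact", IndexType::Flat)?;

    // Approximate nearest-neighbor search for larger collections.
    client.create_collection("large_approx", IndexType::HNSW)?;

    Ok(())
}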

§Similarity Metrics

  • Cosine: Default for normalized embeddings, scale-invariant
  • Euclidean: Geometric distance, sensitive to vector magnitude
  • Manhattan: L1 norm, robust to outliers
  • Dot Product: Raw similarity, requires consistent vector scaling
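
The metric is supplied per query through SimilarityMetric. Below is a minimal sketch comparing two metrics on the same query; only the Cosine variant appears elsewhere in this documentation, so the Euclidean variant name (and, by extension, the other variant names) is an assumption based on the list above.

use vectorlite::{VectorLiteClient, EmbeddingGenerator, IndexType, SimilarityMetric};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = VectorLiteClient::new(Box::new(EmbeddingGenerator::new()?));
    client.create_collection("quotes", IndexType::HNSW)?;
    client.add_text_to_collection("quotes", "I just want to lie on the beach and eat hot dogs", None)?;

    // Same query under two metrics; `Euclidean` is an assumed variant name.
    let by_cosine = client.search_text_in_collection("quotes", "beach games", 3, SimilarityMetric::Cosine)?;
    let by_euclidean = client.search_text_in_collection("quotes", "beach games", 3, SimilarityMetric::Euclidean)?;

    println!("cosine:    {:?}", by_cosine.first().map(|r| r.score));
    println!("euclidean: {:?}", by_euclidean.first().map(|r| r.score));

    Ok(())
}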

§HTTP Server

use vectorlite::{VectorLiteClient, EmbeddingGenerator, start_server};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Build a client, then expose it over HTTP on 127.0.0.1:3001.
    let client = VectorLiteClient::new(Box::new(EmbeddingGenerator::new()?));
    start_server(client, "127.0.0.1", 3001).await?;
    Ok(())
}

§Configuration Profiles

# Balanced (default)
cargo build

# Memory-constrained environments
cargo build --features memory-optimized

# High-precision search
cargo build --features high-accuracy
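
§Persistence

Collection snapshots can be saved to disk and restored through the re-exported persistence helpers. The sketch below is illustrative only: the exact signatures of save_collection_to_file and load_collection_from_file are not shown in this documentation, and the get_collection accessor used to obtain a Collection handle is hypothetical.

use vectorlite::{VectorLiteClient, EmbeddingGenerator, IndexType,
                 save_collection_to_file, load_collection_from_file};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = VectorLiteClient::new(Box::new(EmbeddingGenerator::new()?));
    client.create_collection("quotes", IndexType::HNSW)?;
    client.add_text_to_collection("quotes", "I just want to lie on the beach and eat hot dogs", None)?;

    // Hypothetical accessor (not documented here), assumed to return a
    // reference to the stored `Collection`.
    let collection = client.get_collection("quotes");

    // Assumed signatures: save a snapshot to a path, then restore it.
    save_collection_to_file(collection, "quotes.snapshot")?;
    let _restored = load_collection_from_file("quotes.snapshot")?;

    Ok(())
}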

Re-exports§

pub use index::flat::FlatIndex;
pub use index::hnsw::HNSWIndex;
pub use embeddings::EmbeddingGenerator;
pub use embeddings::EmbeddingFunction;
pub use client::VectorLiteClient;
pub use client::Collection;
pub use client::Settings;
pub use client::IndexType;
pub use server::create_app;
pub use server::start_server;
pub use persistence::PersistenceError;
pub use persistence::save_collection_to_file;
pub use persistence::load_collection_from_file;

Modules§

client
Client Module
embeddings
Embeddings Module
errors
Error Types Module
index
Index Module
persistence
Persistence Module
server
HTTP Server Module

Structs§

SearchResult
Search result containing a vector ID, similarity score, original text, and optional metadata
Vector
Represents a vector with an ID, floating-point values, and original text

Enums§

SimilarityMetric
Similarity metrics for vector comparison
VectorIndexWrapper
Wrapper enum for different vector index implementations

Constants§

DEFAULT_VECTOR_DIMENSION
Default vector dimension for embedding models

Traits§

VectorIndex
Trait for vector indexing implementations

Functions§

cosine_similarity
Calculate cosine similarity between two vectors
dot_product
Calculate dot product between two vectors
euclidean_similarity
Calculate Euclidean similarity between two vectors
manhattan_similarity
Calculate Manhattan similarity between two vectors