§VectorLite
A tiny, in-process Rust vector store with built-in embeddings for sub-millisecond semantic search.
VectorLite is a high-performance, in-memory vector database optimized for AI agent and edge workloads.
It co-locates model inference (via Candle) with a low-latency vector index, making it ideal for session-scoped, single-instance, or privacy-sensitive environments.
§Why VectorLite?
| Feature | Description |
|---|---|
| Sub-millisecond search | In-memory HNSW or flat search tuned for real-time agent loops. |
| Built-in embeddings | Runs all-MiniLM-L6-v2 locally using Candle, or any other model of your choice. No external API calls. |
| Single-binary simplicity | No dependencies, no servers to orchestrate. Start instantly via CLI or Docker. |
| Session-scoped collections | Perfect for ephemeral agent sessions or sidecars. |
| Thread-safe concurrency | RwLock-based access and atomic ID generation for multi-threaded workloads. |
| Instant persistence | Save or restore collection snapshots in one call (see the persistence sketch below). |
VectorLite trades distributed scalability for deterministic performance, perfect for use cases where latency matters more than millions of vectors.
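The snapshot feature maps onto the re-exported `save_collection_to_file` and `load_collection_from_file` helpers. A minimal sketch, assuming the helpers take a collection plus a file path and that the client exposes a collection lookup; the `get_collection` accessor and the exact signatures below are assumptions, so check the `persistence` and `client` modules for the real API:

```rust
use vectorlite::{
    EmbeddingGenerator, IndexType, VectorLiteClient,
    load_collection_from_file, save_collection_to_file,
};

fn snapshot_roundtrip() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = VectorLiteClient::new(Box::new(EmbeddingGenerator::new()?));
    client.create_collection("quotes", IndexType::HNSW)?;

    // Hypothetical accessor: how a `Collection` handle is obtained may differ.
    let collection = client.get_collection("quotes").expect("collection exists");

    // Snapshot to disk, then restore it later, each in a single call.
    save_collection_to_file(&collection, "quotes.snapshot")?;
    let _restored = load_collection_from_file("quotes.snapshot")?;
    Ok(())
}
```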
§When to Use It
| Scenario | Why VectorLite fits |
|---|---|
| AI agent sessions | Keep short-lived embeddings per conversation. No network latency. |
| Edge or embedded AI | Run fully offline with model + index in one binary. |
| Realtime search / personalization | Sub-ms search for pre-computed embeddings. |
| Local prototyping & CI | Rust-native, no external services. |
| Single-tenant microservices | Lightweight sidecar for semantic capabilities. |
§Key Features
- In-memory storage for zero-latency access patterns
- Native Rust ML models using the Candle framework with a pluggable embedder architecture (see the sketch after this list)
- Thread-safe concurrency with RwLock per collection and atomic ID generation
- HNSW indexing for approximate nearest neighbor search with configurable accuracy
- HTTP API for easy integration with AI agents and other services
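Because the client is constructed from a boxed embedder (see the Quick Start below), swapping in a different model means implementing the re-exported `EmbeddingFunction` trait. The sketch below assumes the trait reduces to a single text-to-vector method; the method name `embed`, its signature, and the error type are assumptions rather than the crate's actual trait definition, so treat this as a shape to adapt:

```rust
use vectorlite::{EmbeddingFunction, VectorLiteClient};

// Hypothetical embedder backed by whatever model you prefer.
struct MyEmbedder;

impl EmbeddingFunction for MyEmbedder {
    // Assumed method name and signature; see the `embeddings` module for the real trait.
    fn embed(&self, _text: &str) -> Result<Vec<f32>, Box<dyn std::error::Error>> {
        // Return a fixed-size embedding; 384 matches all-MiniLM-L6-v2's output dimension.
        Ok(vec![0.0; 384])
    }
}

fn main() {
    // Assumes `VectorLiteClient::new` accepts any boxed `EmbeddingFunction`.
    let _client = VectorLiteClient::new(Box::new(MyEmbedder));
}
```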
§Quick Start
```rust
use vectorlite::{VectorLiteClient, EmbeddingGenerator, IndexType, SimilarityMetric};
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = VectorLiteClient::new(Box::new(EmbeddingGenerator::new()?));
    client.create_collection("quotes", IndexType::HNSW)?;

    let id = client.add_text_to_collection(
        "quotes",
        "I just want to lie on the beach and eat hot dogs",
        Some(json!({
            "author": "Kevin Malone",
            "tags": ["the-office", "s3:e23"],
            "year": 2005,
        }))
    )?;

    let results = client.search_text_in_collection(
        "quotes",
        "beach games",
        3,
        SimilarityMetric::Cosine,
    )?;

    for result in &results {
        println!("ID: {}, Score: {:.4}, Text: {:?}", result.id, result.score, result.text);
    }
    Ok(())
}
```

§Index Types
§FlatIndex
- Complexity: O(n) search, O(1) insert
- Memory: Linear with dataset size
- Use Case: Small datasets (< 10K vectors) or exact search requirements
§HNSWIndex
- Complexity: O(log n) search, O(log n) insert
- Memory: ~2-3x vector size due to graph structure
- Use Case: Large datasets with approximate search tolerance
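The index type is chosen per collection when it is created. A short sketch, assuming the `IndexType` enum has a flat variant alongside the `HNSW` one used in the Quick Start (the `Flat` variant name is an assumption):

```rust
use vectorlite::{EmbeddingGenerator, IndexType, VectorLiteClient};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = VectorLiteClient::new(Box::new(EmbeddingGenerator::new()?));

    // Exact, brute-force search for a small collection.
    client.create_collection("small_exact", IndexType::Flat)?;

    // Approximate search that stays fast as the collection grows.
    client.create_collection("large_approx", IndexType::HNSW)?;
    Ok(())
}
```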
§Similarity Metrics
- Cosine: Default for normalized embeddings, scale-invariant
- Euclidean: Geometric distance, sensitive to vector magnitude
- Manhattan: L1 norm, robust to outliers
- Dot Product: Raw similarity, requires consistent vector scaling
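The metrics are also available as standalone helper functions (`cosine_similarity`, `dot_product`, `euclidean_similarity`, `manhattan_similarity`) for comparing raw vectors outside a collection. A small sketch, assuming each helper takes two float slices and returns an `f32` score (the exact signatures are assumptions):

```rust
use vectorlite::{cosine_similarity, dot_product, euclidean_similarity, manhattan_similarity};

fn main() {
    let a = vec![0.1_f32, 0.9, 0.3];
    let b = vec![0.2_f32, 0.8, 0.4];

    // Assumed signatures: (&[f32], &[f32]) -> f32.
    println!("cosine:    {:.4}", cosine_similarity(&a, &b));
    println!("dot:       {:.4}", dot_product(&a, &b));
    println!("euclidean: {:.4}", euclidean_similarity(&a, &b));
    println!("manhattan: {:.4}", manhattan_similarity(&a, &b));
}
```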
§HTTP Server
```rust
use vectorlite::{VectorLiteClient, EmbeddingGenerator, start_server};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = VectorLiteClient::new(Box::new(EmbeddingGenerator::new()?));
    start_server(client, "127.0.0.1", 3001).await?;
    Ok(())
}
```

§Configuration Profiles
```sh
# Balanced (default)
cargo build

# Memory-constrained environments
cargo build --features memory-optimized

# High-precision search
cargo build --features high-accuracy
```

Re-exports§
- pub use index::flat::FlatIndex;
- pub use index::hnsw::HNSWIndex;
- pub use embeddings::EmbeddingGenerator;
- pub use embeddings::EmbeddingFunction;
- pub use client::VectorLiteClient;
- pub use client::Collection;
- pub use client::Settings;
- pub use client::IndexType;
- pub use server::create_app;
- pub use server::start_server;
- pub use persistence::PersistenceError;
- pub use persistence::save_collection_to_file;
- pub use persistence::load_collection_from_file;
Modules§
- client: Client Module
- embeddings: Embeddings Module
- errors: Error Types Module
- index: Index Module
- persistence: Persistence Module
- server: HTTP Server Module
Structs§
- SearchResult: Search result containing a vector ID, similarity score, original text, and optional metadata
- Vector: Represents a vector with an ID, floating-point values, and original text
Enums§
- SimilarityMetric: Similarity metrics for vector comparison
- VectorIndexWrapper: Wrapper enum for different vector index implementations
Constants§
- DEFAULT_VECTOR_DIMENSION: Default vector dimension for embedding models
Traits§
- VectorIndex: Trait for vector indexing implementations
Functions§
- cosine_similarity: Calculate cosine similarity between two vectors
- dot_product: Calculate dot product between two vectors
- euclidean_similarity: Calculate Euclidean similarity between two vectors
- manhattan_similarity: Calculate Manhattan similarity between two vectors