# VectorLite

A high-performance, in-memory vector database optimized for AI agent workloads, with an HTTP API and thread-safe concurrency.
## Overview
VectorLite is designed for single-instance, low-latency vector operations in AI agent environments. It prioritizes sub-millisecond search performance over distributed scalability, making it ideal for:
- AI Agent Sessions: Session-scoped vector storage with fast retrieval
- Real-time Search: Sub-millisecond response requirements for pre-computed embeddings
- Prototype Development: Rapid iteration without infrastructure complexity
- Single-tenant Applications: No multi-tenancy isolation requirements
## Key Features
- In-memory storage for zero-latency access patterns
- Native Rust ML models using the Candle framework with a pluggable architecture; bring your own embedding model (defaults to all-MiniLM-L6-v2)
- Thread-safe concurrency with RwLock per collection and atomic ID generation
- HNSW indexing for approximate nearest neighbor search with configurable accuracy
- Collection persistence via the VectorLite Collection (VLC) file format for saving/loading collections
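The concurrency model can be pictured as one `RwLock` per collection plus an `AtomicU64` ID counter. The types below are illustrative stand-ins, not VectorLite's actual internals:

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;

// Illustrative stand-ins for VectorLite's internal types.
struct Collection {
    vectors: Vec<(u64, Vec<f32>)>,
}

struct Store {
    // One RwLock per collection: many concurrent readers, exclusive writers,
    // and a write to one collection never blocks reads on another.
    collections: HashMap<String, RwLock<Collection>>,
    // Lock-free, monotonically increasing ID generation.
    next_id: AtomicU64,
}

impl Store {
    fn insert(&self, name: &str, v: Vec<f32>) -> Option<u64> {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        let mut coll = self.collections.get(name)?.write().unwrap();
        coll.vectors.push((id, v));
        Some(id)
    }
}

fn main() {
    let mut collections = HashMap::new();
    collections.insert("docs".to_string(), RwLock::new(Collection { vectors: Vec::new() }));
    let store = Store { collections, next_id: AtomicU64::new(0) };

    let a = store.insert("docs", vec![0.1, 0.2]).unwrap();
    let b = store.insert("docs", vec![0.3, 0.4]).unwrap();
    assert_ne!(a, b); // every insert gets a unique ID
    println!("ids: {a} {b}");
}
```

Reads take the cheap shared lock, so searches on a hot collection proceed in parallel while inserts briefly take the exclusive lock.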
## HTTP API
The server exposes a RESTful interface covering four groups of operations:

- Health check
- Collection management
- Vector operations
- Persistence operations
## Index Types

### Flat
- Complexity: O(n) search, O(1) insert
- Memory: Linear with dataset size
- Use Case: Small datasets (< 10K vectors) or exact search requirements
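A flat index is just a linear scan: score the query against every stored vector and keep the best k. A minimal sketch of that O(n) search (helper names are mine, not VectorLite's):

```rust
// Brute-force k-NN over a flat list: O(n) per query, O(1) insert.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

fn flat_search(data: &[(u64, Vec<f32>)], query: &[f32], k: usize) -> Vec<u64> {
    let mut scored: Vec<(u64, f32)> = data
        .iter()
        .map(|(id, v)| (*id, cosine(v, query)))
        .collect();
    // Sort by similarity, best first; a bounded heap would avoid the full sort.
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.into_iter().take(k).map(|(id, _)| id).collect()
}

fn main() {
    let data = vec![
        (1, vec![1.0, 0.0]),
        (2, vec![0.0, 1.0]),
        (3, vec![0.7, 0.7]),
    ];
    let top = flat_search(&data, &[1.0, 0.1], 2);
    println!("{top:?}"); // nearest IDs first
}
```

Because nothing is approximated, results are exact, which is why flat remains the right choice for small collections.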
### HNSW
- Complexity: O(log n) search, O(log n) insert
- Memory: ~2-3x vector size due to graph structure
- Use Case: Large datasets with approximate search tolerance
See the Hierarchical Navigable Small World (HNSW) paper for details.
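The O(log n) behavior comes from HNSW's layered graph: each inserted vector is assigned a random top layer with an exponentially decaying distribution. A sketch of the standard level-assignment rule from the paper (not VectorLite's exact code):

```rust
// HNSW level assignment: level = floor(-ln(u) * mL), u uniform in (0, 1].
// mL is conventionally 1 / ln(M), where M is the max connections per node.
fn assign_level(u: f64, m: usize) -> usize {
    let ml = 1.0 / (m as f64).ln();
    (-u.ln() * ml).floor() as usize
}

fn main() {
    let m = 16;
    // Most draws land on layer 0; higher layers get exponentially rarer,
    // which is what produces the logarithmic search hierarchy.
    assert_eq!(assign_level(0.9, m), 0);
    assert_eq!(assign_level(0.5, m), 0);
    // A very rare draw yields a high entry layer.
    assert!(assign_level(1e-6, m) >= 4);
    println!("level for u = 1e-6: {}", assign_level(1e-6, m));
}
```

The "configurable accuracy" mentioned above is mostly `ef`-style candidate-list sizes: larger lists explore more of the graph per query, trading latency for recall.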
## ML Model Integration

### Built-in Embedding Models
- all-MiniLM-L6-v2: Default 384-dimensional model for general-purpose text
- Candle Framework: Native Rust ML inference with CPU/GPU acceleration
- Pluggable Architecture: Easy integration of custom embedding models
- Memory Efficient: Models loaded once and shared across requests
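The pluggable architecture can be pictured as a trait boundary: anything that maps text to a fixed-dimension vector can serve as the model. The trait and method names below are assumptions for illustration, not VectorLite's actual API:

```rust
// Hypothetical trait boundary for pluggable embedding models.
trait EmbeddingModel: Send + Sync {
    fn dimension(&self) -> usize;
    fn embed(&self, text: &str) -> Vec<f32>;
}

// A mock model, handy for tests that should not load real weights.
struct MockModel;

impl EmbeddingModel for MockModel {
    fn dimension(&self) -> usize {
        384 // match all-MiniLM-L6-v2's output size
    }
    fn embed(&self, text: &str) -> Vec<f32> {
        // Deterministic fake embedding derived from byte values.
        let mut v = vec![0.0f32; self.dimension()];
        for (i, b) in text.bytes().enumerate() {
            v[i % 384] += b as f32 / 255.0;
        }
        v
    }
}

fn main() {
    // Holding the model behind a shared pointer lets one loaded
    // instance serve every request, as the feature list describes.
    let model: std::sync::Arc<dyn EmbeddingModel> = std::sync::Arc::new(MockModel);
    let e = model.embed("hello");
    assert_eq!(e.len(), 384);
    println!("dim = {}", e.len());
}
```

A Candle-backed implementation would satisfy the same trait, which is what makes swapping in a custom model a local change.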
### Similarity Metrics
- Cosine: Default for normalized embeddings, scale-invariant
- Euclidean: Geometric distance, sensitive to vector magnitude
- Manhattan: L1 norm, robust to outliers
- Dot Product: Raw similarity, requires consistent vector scaling
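All four metrics are one-liners over paired components; a sketch (my definitions, not VectorLite's internals):

```rust
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

fn manhattan(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    // Scale-invariant: dividing by both norms removes magnitude,
    // which is why it suits normalized embeddings.
    dot(a, b) / (dot(a, a).sqrt() * dot(b, b).sqrt())
}

fn main() {
    let a = [1.0, 0.0];
    let b = [0.0, 1.0];
    assert_eq!(dot(&a, &b), 0.0);
    assert_eq!(manhattan(&a, &b), 2.0);
    assert!((euclidean(&a, &b) - 2.0f32.sqrt()).abs() < 1e-6);
    assert!(cosine(&a, &b).abs() < 1e-6); // orthogonal vectors
    println!("metrics ok");
}
```

Note that for unit-normalized vectors, cosine and dot product rank results identically; the distinction only matters when magnitudes vary.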
## Configuration Profiles

Three profiles cover the common trade-offs:

- Balanced (default)
- Memory-constrained environments
- High-precision search
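These profiles map naturally onto HNSW tuning knobs (M, ef_construction, ef_search). The struct and the concrete values below are illustrative assumptions, not VectorLite's actual configuration:

```rust
// Illustrative HNSW tuning profiles; names and values are assumptions.
#[derive(Debug, PartialEq)]
struct HnswConfig {
    m: usize,               // max connections per node (memory vs. recall)
    ef_construction: usize, // candidate list size at build time
    ef_search: usize,       // candidate list size at query time
}

impl HnswConfig {
    fn balanced() -> Self {
        Self { m: 16, ef_construction: 200, ef_search: 50 }
    }
    fn memory_constrained() -> Self {
        // Smaller M shrinks the graph at some cost in recall.
        Self { m: 8, ef_construction: 100, ef_search: 40 }
    }
    fn high_precision() -> Self {
        // Larger candidate lists trade latency for accuracy.
        Self { m: 32, ef_construction: 400, ef_search: 200 }
    }
}

fn main() {
    println!("{:?}", HnswConfig::balanced());
    println!("{:?}", HnswConfig::memory_constrained());
    println!("{:?}", HnswConfig::high_precision());
}
```

The direction of the trade-off is what matters: M drives memory, ef_construction drives build quality, and ef_search drives per-query recall versus latency.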
## Getting Started
The quick-start snippet below is reconstructed; where the original omitted import paths and arguments, the names shown are placeholders.

```rust
use vectorlite::VectorLite; // import path is a placeholder

// Create client with embedding function
let client = VectorLite::new(embedding_fn);

// Create collection
client.create_collection("docs")?;

// Add text (auto-generates embedding and ID)
let id = client.add_text_to_collection("docs", "VectorLite stores session memory")?;

// Search
let results = client.search_text_in_collection("docs", "session memory", 5)?;
```
## HTTP Server Example
The original snippet survives only in outline; module and function names below are placeholders.

```rust
use vectorlite::server; // import path is a placeholder

// Async entry point (runtime attribute assumed)
#[tokio::main]
async fn main() {
    server::run("127.0.0.1:8080").await; // function name illustrative
}
```
## CLI Usage
Start the server with optional collection loading:
```bash
# Start empty server
cargo run --release

# Start with pre-loaded collection (flag name illustrative)
cargo run --release -- --load collection.vlc
```
## Testing

Run tests with mock embeddings (CI-friendly, no model files required):

```bash
cargo test
```

Run tests with real ML models (requires downloaded models):

```bash
# feature flag name is illustrative
cargo test --features real-models
```
### Download ML Model

This step downloads the BERT-based embedding model files needed for real embedding generation.
## License
Apache 2.0 License - see LICENSE for details.
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.