VectorLite
A high-performance, in-memory vector database optimized for AI agent workloads, with an HTTP API and thread-safe concurrency.
Overview
VectorLite is designed for single-instance, low-latency vector operations in AI agent environments. It prioritizes sub-millisecond search performance over distributed scalability, making it ideal for:
- AI Agent Sessions: Session-scoped vector storage with fast retrieval
- Real-time Search: Sub-millisecond response requirements
- Prototype Development: Rapid iteration without infrastructure complexity
- Single-tenant Applications: No multi-tenancy isolation requirements
Key Features
- In-memory storage for low-latency access, with no disk I/O on the query path
- Native Rust ML models using the Candle framework with a pluggable architecture. Bring your own embedding model (defaults to all-MiniLM-L6-v2)
- Thread-safe concurrency with RwLock per collection and atomic ID generation
- HNSW indexing for approximate nearest neighbor search with configurable accuracy
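The concurrency model in the list above (an RwLock per collection plus atomic ID generation) can be sketched with the standard library alone. This is a minimal illustration, not VectorLite's actual implementation: the `Collection` type, its fields, and the `concurrent_insert` helper are hypothetical names chosen for the example.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, RwLock};

/// Hypothetical collection type (an assumption, not VectorLite's real struct).
pub struct Collection {
    /// Lock-free counter, so ID allocation never waits on the map lock.
    next_id: AtomicU64,
    /// Many concurrent readers or one writer at a time.
    vectors: RwLock<HashMap<u64, Vec<f32>>>,
}

impl Collection {
    pub fn new() -> Self {
        Collection {
            next_id: AtomicU64::new(0),
            vectors: RwLock::new(HashMap::new()),
        }
    }

    /// Allocates a unique ID atomically, then takes the write lock to store the vector.
    pub fn insert(&self, vector: Vec<f32>) -> u64 {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        self.vectors.write().unwrap().insert(id, vector);
        id
    }

    /// Takes only the read lock, so it can run alongside other readers.
    pub fn len(&self) -> usize {
        self.vectors.read().unwrap().len()
    }
}

/// Spawns `threads` writers that each insert `per_thread` vectors,
/// and returns every allocated ID in sorted order.
pub fn concurrent_insert(col: Arc<Collection>, threads: usize, per_thread: usize) -> Vec<u64> {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let col = Arc::clone(&col);
            std::thread::spawn(move || {
                (0..per_thread)
                    .map(|_| col.insert(vec![0.0; 4]))
                    .collect::<Vec<u64>>()
            })
        })
        .collect();
    let mut ids: Vec<u64> = handles
        .into_iter()
        .flat_map(|h| h.join().unwrap())
        .collect();
    ids.sort_unstable();
    ids
}
```

Because each collection would own its lock in this design, writers to one collection would not block readers of another. The atomic counter keeps IDs unique even when several writers race.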
HTTP API
RESTful interface optimized for AI agent integration:
# Health check
# Collection management
# Vector operations
Index Types
FlatIndex
- Complexity: O(n) search, O(1) insert
- Memory: Linear with dataset size
- Use Case: Small datasets (< 10K vectors) or exact search requirements
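To make the O(n) search and O(1) insert figures concrete, here is a rough sketch of a brute-force index. It is an illustration under assumptions, not the project's `FlatIndex`: the struct layout, method names, and the choice of squared Euclidean distance are all made up for the example.

```rust
/// Hypothetical brute-force index (an assumption, not VectorLite's real FlatIndex).
pub struct FlatIndex {
    dim: usize,
    ids: Vec<u64>,
    /// Vectors stored back to back in row-major order.
    data: Vec<f32>,
}

impl FlatIndex {
    pub fn new(dim: usize) -> Self {
        assert!(dim > 0, "dimension must be positive");
        FlatIndex { dim, ids: Vec::new(), data: Vec::new() }
    }

    /// Amortized O(1) in the number of stored vectors: append only, no reordering.
    pub fn insert(&mut self, id: u64, vector: &[f32]) {
        assert_eq!(vector.len(), self.dim, "dimension mismatch");
        self.ids.push(id);
        self.data.extend_from_slice(vector);
    }

    /// Exact search: scores every stored vector, then returns the `k` closest
    /// as (id, squared Euclidean distance), nearest first.
    pub fn search(&self, query: &[f32], k: usize) -> Vec<(u64, f32)> {
        assert_eq!(query.len(), self.dim, "dimension mismatch");
        let mut scored: Vec<(u64, f32)> = self
            .ids
            .iter()
            .zip(self.data.chunks_exact(self.dim))
            .map(|(&id, row)| {
                let dist: f32 = row
                    .iter()
                    .zip(query)
                    .map(|(a, b)| (a - b) * (a - b))
                    .sum();
                (id, dist)
            })
            .collect();
        scored.sort_by(|a, b| a.1.total_cmp(&b.1));
        scored.truncate(k);
        scored
    }
}
```

The scan touches every vector once, which is why this approach gives exact results but only stays fast for small datasets.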
HNSWIndex
- Complexity: approximately O(log n) search and O(log n) insert on average
- Memory: ~2-3x the raw vector data, due to the graph structure
- Use Case: Large datasets with approximate search tolerance
ML Model Integration
Built-in Embedding Models
- all-MiniLM-L6-v2: Default 384-dimensional model for general-purpose text
- Candle Framework: Native Rust ML inference with CPU/GPU acceleration
- Pluggable Architecture: Easy integration of custom embedding models
- Memory Efficient: Models loaded once and shared across requests
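One way a pluggable embedding layer with a shared, load-once model might look is a trait object behind an `Arc`. The sketch below is purely illustrative: the `Embedder` trait, `ByteBucketEmbedder`, and `embed_on_threads` are hypothetical names, and the toy embedder is a stand-in that does not run a real model such as all-MiniLM-L6-v2.

```rust
use std::sync::Arc;

/// Hypothetical plug-in point (an assumption, not VectorLite's real trait).
pub trait Embedder: Send + Sync {
    fn dimension(&self) -> usize;
    fn embed(&self, text: &str) -> Vec<f32>;
}

/// Toy stand-in for a model: counts bytes into `dim` buckets and
/// L2-normalizes the result. It carries no semantic meaning.
pub struct ByteBucketEmbedder {
    dim: usize,
}

impl ByteBucketEmbedder {
    pub fn new(dim: usize) -> Self {
        assert!(dim > 0, "dimension must be positive");
        ByteBucketEmbedder { dim }
    }
}

impl Embedder for ByteBucketEmbedder {
    fn dimension(&self) -> usize {
        self.dim
    }

    fn embed(&self, text: &str) -> Vec<f32> {
        let mut v = vec![0.0f32; self.dim];
        for b in text.bytes() {
            v[b as usize % self.dim] += 1.0;
        }
        let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 0.0 {
            for x in v.iter_mut() {
                *x /= norm;
            }
        }
        v
    }
}

/// Embeds each text on its own thread. Every thread clones the `Arc`,
/// so all of them share the single model instance.
pub fn embed_on_threads(model: Arc<dyn Embedder>, texts: Vec<String>) -> Vec<Vec<f32>> {
    let handles: Vec<_> = texts
        .into_iter()
        .map(|t| {
            let model = Arc::clone(&model);
            std::thread::spawn(move || model.embed(&t))
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Swapping in a different model would then mean providing another `Embedder` implementation, while callers keep holding an `Arc<dyn Embedder>`.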
Similarity Metrics
- Cosine: Default for normalized embeddings, scale-invariant
- Euclidean: Geometric distance, sensitive to vector magnitude
- Manhattan: L1 distance, less sensitive to outliers than Euclidean
- Dot Product: Raw similarity, requires consistent vector scaling
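The four metrics above have standard definitions, sketched here as plain functions over `f32` slices. The function names are chosen for this example and are not taken from VectorLite's API. Both slices are assumed to have the same length.

```rust
/// Dot product: sum of element-wise products. Grows with vector magnitude.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine similarity: dot product divided by the product of the L2 norms.
/// Returns 0.0 when either vector has zero length, to avoid dividing by zero.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let denom = dot(a, a).sqrt() * dot(b, b).sqrt();
    if denom == 0.0 {
        0.0
    } else {
        dot(a, b) / denom
    }
}

/// Euclidean (L2) distance: square root of the summed squared differences.
pub fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Manhattan (L1) distance: sum of absolute differences.
pub fn manhattan_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}
```

For example, `[1.0, 0.0]` and `[2.0, 0.0]` point in the same direction, so their cosine similarity is 1 regardless of length, while their dot product (2) and Euclidean distance (1) both reflect the difference in magnitude.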
Configuration Profiles
# Balanced (default)
# Memory-constrained environments
# High-precision search
Getting Started
use ;
// Create client with embedding function
let client = new;
// Create collection
client.create_collection?;
// Add text (auto-generates embedding and ID)
let id = client.add_text_to_collection?;
// Search
let results = client.search_text_in_collection?;
HTTP Server Example
use ;
async
License
Apache 2.0 License - see LICENSE for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.