# VectorLite
A tiny, in-process Rust vector store with built-in embeddings for sub-millisecond semantic search.
VectorLite is a high-performance, in-memory vector database optimized for AI agent and edge workloads.
It co-locates model inference (via Candle) with a low-latency vector index, making it ideal for session-scoped, single-instance, or privacy-sensitive environments.
## Why VectorLite?
| Feature | Description |
|---|---|
| Sub-millisecond search | In-memory HNSW or flat search tuned for real-time agent loops. |
| Built-in embeddings | Runs all-MiniLM-L6-v2 locally using Candle, or any other model of your choice. No external API calls. |
| Single-binary simplicity | No runtime dependencies, no servers to orchestrate. Start instantly via CLI or Docker. |
| Session-scoped collections | Perfect for ephemeral agent sessions or sidecars. |
| Thread-safe concurrency | RwLock-based access and atomic ID generation for multi-threaded workloads. |
| Instant persistence | Save or restore collection snapshots in one call. |
VectorLite trades distributed scalability for deterministic performance, making it a fit for use cases where latency matters more than scaling to millions of vectors.
## When to Use It
| Scenario | Why VectorLite fits |
|---|---|
| AI agent sessions | Keep short-lived embeddings per conversation. No network latency. |
| Edge or embedded AI | Run fully offline with model + index in one binary. |
| Realtime search / personalization | Sub-ms search for pre-computed embeddings. |
| Local prototyping & CI | Rust-native, no external services. |
| Single-tenant microservices | Lightweight sidecar for semantic capabilities. |
## Quick Start

### Run from Source
Build and run with Cargo. The `--load` flag shown below for preloading a collection snapshot is hypothetical; check `--help` for the actual options:
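```bash
# Build and start the server
cargo run --release

# Start with a preloaded collection (hypothetical --load flag)
cargo run --release -- --load ./collection.vlc
```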
### Run with Docker
With default settings (the image tag `vectorlite` and port 8080 below are assumptions):
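```bash
docker build -t vectorlite .
docker run --rm -p 8080:8080 vectorlite
```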
With a different embedding model and memory-optimized HNSW (the environment variable names here are hypothetical):
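```bash
# Environment variable names are hypothetical; check the image docs
docker run --rm -p 8080:8080 \
  -e EMBEDDING_MODEL="sentence-transformers/paraphrase-MiniLM-L3-v2" \
  -e HNSW_PROFILE="memory-optimized" \
  vectorlite
```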
## HTTP API Overview
| Operation | Method & Endpoint | Body |
|---|---|---|
| Health | `GET /health` | – |
| List collections | `GET /collections` | – |
| Create collection | `POST /collections` | `{"name": "docs", "index_type": "hnsw"}` |
| Delete collection | `DELETE /collections/{name}` | – |
| Add text | `POST /collections/{name}/text` | `{"text": "Hello world", "metadata": {...}}` |
| Search (text) | `POST /collections/{name}/search/text` | `{"query": "hello", "k": 5}` |
| Get vector | `GET /collections/{name}/vectors/{id}` | – |
| Delete vector | `DELETE /collections/{name}/vectors/{id}` | – |
| Save collection | `POST /collections/{name}/save` | `{"file_path": "./collection.vlc"}` |
| Load collection | `POST /collections/load` | `{"file_path": "./collection.vlc", "collection_name": "restored"}` |
## Index Types
| Index | Search Complexity | Insert | Use Case |
|---|---|---|---|
| Flat | O(n) | O(1) | Small datasets (<10K vectors) or exact search |
| HNSW | O(log n) | O(log n) | Larger datasets or approximate search |
See [Hierarchical Navigable Small World](https://en.wikipedia.org/wiki/Hierarchical_navigable_small_world) for background on the HNSW algorithm.
### Configuration profiles for HNSW
| Profile | Features | Use Case |
|---|---|---|
| default | balanced | general workloads |
| memory-optimized | reduced precision, smaller graph | constrained devices |
| high-accuracy | higher recall, more memory | offline re-ranking or research |
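If profiles are selectable at collection creation, the request might look like this (the `hnsw_profile` field is hypothetical; the endpoint itself is documented above):

```bash
curl -X POST http://localhost:8080/collections \
  -H 'Content-Type: application/json' \
  -d '{"name": "docs", "index_type": "hnsw", "hnsw_profile": "memory-optimized"}'
```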
## Similarity Metrics
- Cosine: Default for normalized embeddings, scale-invariant
- Euclidean: Geometric distance, sensitive to vector magnitude
- Manhattan: L1 norm, robust to outliers
- Dot Product: Raw similarity, requires consistent vector scaling
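For reference, cosine similarity is the dot product of two vectors divided by the product of their magnitudes. A minimal standalone computation (illustrative, not the library's internal code):

```rust
// Cosine similarity: dot(a, b) / (|a| * |b|)
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    // Same direction -> 1.0, orthogonal -> 0.0
    println!("{:.3}", cosine(&[1.0, 0.0], &[0.707, 0.707])); // ~0.707
}
```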
## Rust SDK Example

The crate can also be embedded directly in a Rust program. The sketch below assumes hypothetical type and method names (`VectorLite::new`, `add_text`, `search_text`); check the crate docs for the actual API. Metadata uses `serde_json::json!`.
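```rust
// Hypothetical API sketch: every VectorLite name below is an assumption,
// not the crate's documented surface.
use serde_json::json;
use vectorlite::{IndexType, SimilarityMetric, VectorLite};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // In-process store: HNSW index with cosine similarity (assumed constructor)
    let mut store = VectorLite::new(IndexType::Hnsw, SimilarityMetric::Cosine);

    // Embed and insert a document with JSON metadata
    store.add_text("Hello world", json!({"source": "readme"}))?;

    // Top-5 semantic search over the stored vectors
    for hit in store.search_text("hello", 5)? {
        println!("id={} score={:.3}", hit.id, hit.score);
    }
    Ok(())
}
```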
## Testing
Run tests with mock embeddings (CI-friendly, no model files required; plain `cargo test` is assumed to be the mock path):
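```bash
# Mock embeddings are assumed to be the default test configuration
cargo test
```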
Run tests with local models (the `local-models` feature name below is an assumption):
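```bash
# Hypothetical feature name; check Cargo.toml for the actual gate
cargo test --features local-models
```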
## Download ML Model
Real embedding generation requires the BERT-based model files locally. One way to fetch them is with the Hugging Face CLI (a general-purpose tool, not a project-specific script):
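```bash
# Fetch the three required files into ./models/all-MiniLM-L6-v2/
huggingface-cli download sentence-transformers/all-MiniLM-L6-v2 \
  config.json pytorch_model.bin tokenizer.json \
  --local-dir ./models/all-MiniLM-L6-v2
```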
The model files must be present in the `./models/{model-name}/` directory with the required files:

- `config.json`
- `pytorch_model.bin`
- `tokenizer.json`
### Using a different model

You can override the default embedding model at compile time using the `custom-model` feature:
```bash
# Compile-time override of the embedding model
DEFAULT_EMBEDDING_MODEL="sentence-transformers/paraphrase-MiniLM-L3-v2" \
  cargo build --features custom-model
```
## License
Apache 2.0 License - see LICENSE for details.
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.