Ruvector Core
High-performance Rust vector database engine with HNSW indexing, quantization, and SIMD optimizations.
ruvector-core is the foundational Rust library powering Ruvector, a next-generation vector database built for extreme performance and universal deployment. This crate provides the core vector database engine with state-of-the-art algorithms optimized for modern hardware.
Why Ruvector Core?
- Blazing Fast: <0.5ms p50 query latency with HNSW indexing
- Memory Efficient: 4-32x compression via quantization techniques
- High Accuracy: 95%+ recall with HNSW + Product Quantization
- SIMD Accelerated: Hardware-optimized distance calculations using SimSIMD
- Minimal Dependencies: Few external dependencies, pure Rust implementation
- Production Ready: Battle-tested algorithms with comprehensive benchmarks
Features
Core Capabilities
- HNSW Indexing: Hierarchical Navigable Small World graphs for O(log n) approximate nearest neighbor search
- Multiple Distance Metrics: Euclidean, Cosine, Dot Product, Manhattan
- Advanced Quantization: Scalar (4x), Product (8-32x), and Binary (32x) quantization
- SIMD Optimizations: Hardware-accelerated distance calculations via simsimd
- Zero-Copy I/O: Memory-mapped storage for instant loading
- Concurrent Operations: Lock-free data structures and parallel batch processing
- Flexible Storage: Persistent storage with redb and memory-mapped files
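The four distance metrics listed above can be illustrated with plain scalar Rust. This is only a reference sketch: the function names are hypothetical and are not taken from the ruvector-core API, and the crate itself relies on simsimd for hardware-accelerated versions.

```rust
/// Dot product of two equal-length vectors.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Euclidean (L2) distance: square root of the summed squared differences.
pub fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y) * (x - y))
        .sum::<f32>()
        .sqrt()
}

/// Manhattan (L1) distance: sum of absolute differences.
pub fn manhattan(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

/// Cosine distance, defined here as 1 - cosine similarity.
/// Returns 1.0 when either vector has zero length (a choice made for this sketch).
pub fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let norm_a = dot(a, a).sqrt();
    let norm_b = dot(b, b).sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 1.0;
    }
    1.0 - dot(a, b) / (norm_a * norm_b)
}
```

Scalar loops like these are useful as a correctness baseline when comparing against a SIMD-accelerated implementation.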
Advanced Features
- Hybrid Search: Combine dense vector search with sparse BM25 text search
- Filtered Search: Apply metadata filters during vector search
- MMR Diversification: Maximal Marginal Relevance for diverse result sets
- Conformal Prediction: Uncertainty quantification for search results
- Product Quantization: Memory-efficient vector compression with high accuracy
- Cache Optimization: Multi-level caching for improved performance
- Lock-Free Indexing: High-concurrency operations without blocking
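To make the MMR item above concrete, here is a minimal sketch of greedy Maximal Marginal Relevance selection. It is not the crate's implementation: `mmr_select`, its parameters, and the use of cosine similarity as the redundancy measure are assumptions made for illustration. Each step picks the candidate that maximizes `lambda * relevance - (1 - lambda) * max_similarity_to_selected`.

```rust
/// Cosine similarity; returns 0.0 if either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Greedy MMR: returns the indices of up to `k` candidates, in selection order.
/// `relevance[i]` is the query relevance of `candidates[i]` (higher is better).
/// `lambda = 1.0` ranks purely by relevance; lower values favor diversity.
pub fn mmr_select(
    relevance: &[f32],
    candidates: &[Vec<f32>],
    k: usize,
    lambda: f32,
) -> Vec<usize> {
    assert_eq!(relevance.len(), candidates.len());
    let mut selected: Vec<usize> = Vec::new();
    let mut remaining: Vec<usize> = (0..candidates.len()).collect();

    while selected.len() < k && !remaining.is_empty() {
        let mut best_pos = 0;
        let mut best_score = f32::NEG_INFINITY;
        for (pos, &i) in remaining.iter().enumerate() {
            // Highest similarity to any already-selected item, floored at 0.0
            // (so it is 0.0 while nothing has been selected yet).
            let redundancy = selected
                .iter()
                .map(|&j| cosine_similarity(&candidates[i], &candidates[j]))
                .fold(0.0_f32, f32::max);
            let score = lambda * relevance[i] - (1.0 - lambda) * redundancy;
            if score > best_score {
                best_score = score;
                best_pos = pos;
            }
        }
        selected.push(remaining.remove(best_pos));
    }
    selected
}
```

With a near-duplicate pair among the candidates, a `lambda` of 0.5 skips the duplicate in favor of a less relevant but different result, while a `lambda` of 1.0 keeps the plain relevance order.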
Installation
Add ruvector-core to your Cargo.toml:
[dependencies]
ruvector-core = "0.1.0"
Feature Flags
[dependencies]
ruvector-core = { version = "0.1.0", features = ["simd", "uuid-support"] }
Available features:
- simd (default): Enable SIMD-optimized distance calculations
- uuid-support (default): Enable UUID generation for vector IDs
Quick Start
Basic Usage
use ;
Batch Operations
use ;
// Insert multiple vectors efficiently
let entries = vec!;
let ids = db.insert_batch?;
println!;
With Metadata Filtering
use std::collections::HashMap;
use serde_json::json;
// Insert with metadata
db.insert?;
// Search with metadata filter
let results = db.search?;
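The idea behind metadata-filtered search can be sketched without the crate at all. The example below is a brute-force illustration under stated assumptions: `Entry`, `filtered_search`, and the string-valued metadata map are hypothetical stand-ins, not ruvector-core types (the snippet above suggests the crate uses JSON values for metadata).

```rust
use std::collections::HashMap;

/// Hypothetical stored entry used only for this sketch.
pub struct Entry {
    pub id: String,
    pub vector: Vec<f32>,
    pub metadata: HashMap<String, String>,
}

/// Squared Euclidean distance; it orders results the same way as true L2 distance.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Returns up to `k` (id, distance) pairs, nearest first, considering only
/// entries whose metadata contains `key` with exactly `value`.
pub fn filtered_search<'a>(
    entries: &'a [Entry],
    query: &[f32],
    k: usize,
    key: &str,
    value: &str,
) -> Vec<(&'a str, f32)> {
    let mut hits: Vec<(&'a str, f32)> = entries
        .iter()
        // Apply the metadata predicate before computing any distance.
        .filter(|e| e.metadata.get(key).map(String::as_str) == Some(value))
        .map(|e| (e.id.as_str(), squared_l2(&e.vector, query)))
        .collect();
    hits.sort_by(|a, b| a.1.total_cmp(&b.1));
    hits.truncate(k);
    hits
}
```

Filtering before scoring means excluded entries never compete for one of the `k` result slots, which is the main difference from filtering an already-truncated result list.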
HNSW Configuration
use ;
let mut options = DbOptions::default();
options.dimensions = 384;
options.distance_metric = Cosine;
// Configure HNSW index parameters
options.hnsw_config = Some;
let db = new?;
Quantization
use ;
let mut options = DbOptions::default();
options.dimensions = 384;
// Enable scalar quantization (4x compression)
options.quantization = Some;
// Or product quantization (8-32x compression)
options.quantization = Some;
let db = new?;
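Scalar quantization reaches its 4x compression by storing each 32-bit float as a single byte. The following is a minimal sketch of that idea, assuming a per-vector min/max range; the type and function names are hypothetical and the crate's actual scheme may differ.

```rust
/// A vector compressed to one byte per component, plus the two floats
/// needed to map codes back to approximate values.
pub struct ScalarQuantized {
    pub codes: Vec<u8>,
    pub min: f32,
    pub scale: f32,
}

/// Maps each component linearly from [min, max] onto the 256 levels 0..=255.
pub fn quantize(v: &[f32]) -> ScalarQuantized {
    let min = v.iter().copied().fold(f32::INFINITY, f32::min);
    let max = v.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    // Fall back to a scale of 1.0 for constant (or empty) vectors to avoid dividing by zero.
    let scale = if max > min { (max - min) / 255.0 } else { 1.0 };
    let codes = v
        .iter()
        .map(|&x| ((x - min) / scale).round() as u8)
        .collect();
    ScalarQuantized { codes, min, scale }
}

/// Reconstructs approximate f32 values from the byte codes.
pub fn dequantize(q: &ScalarQuantized) -> Vec<f32> {
    q.codes.iter().map(|&c| q.min + c as f32 * q.scale).collect()
}
```

The reconstruction error per component is bounded by roughly half of `scale`, which is why recall drops only slightly for vectors with a modest value range.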
API Overview
Core Types
// Main database interface
// Vector entry with optional ID and metadata
// Search query parameters
// Search result with score
Main Operations
Distance Metrics
Advanced Features
// Hybrid search (dense + sparse)
use ;
let hybrid = new;
// Filtered search with expressions
use ;
let filtered = new;
let expr = And;
// MMR diversification
use ;
let mmr = new;
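One common way to combine dense and sparse results is a weighted sum of normalized scores. The sketch below assumes that approach; the source does not state which fusion method the crate uses, and `fuse_scores`, `alpha`, and the min-max normalization are all hypothetical choices for illustration.

```rust
use std::collections::HashMap;

/// Rescales scores to [0, 1]. If all scores are equal, each becomes 1.0.
fn min_max_normalize(scores: &HashMap<String, f32>) -> HashMap<String, f32> {
    let min = scores.values().copied().fold(f32::INFINITY, f32::min);
    let max = scores.values().copied().fold(f32::NEG_INFINITY, f32::max);
    scores
        .iter()
        .map(|(id, &s)| {
            let norm = if max > min { (s - min) / (max - min) } else { 1.0 };
            (id.clone(), norm)
        })
        .collect()
}

/// Fuses dense similarity scores and sparse (e.g. BM25) scores, both
/// "higher is better", as `alpha * dense + (1 - alpha) * sparse`.
/// A document missing from one list contributes 0.0 for that list.
/// Returns (id, fused score) pairs sorted best first, ties broken by id.
pub fn fuse_scores(
    dense: &HashMap<String, f32>,
    sparse: &HashMap<String, f32>,
    alpha: f32,
) -> Vec<(String, f32)> {
    let dense = min_max_normalize(dense);
    let sparse = min_max_normalize(sparse);
    let mut fused: HashMap<String, f32> = HashMap::new();
    for (id, s) in &dense {
        *fused.entry(id.clone()).or_insert(0.0) += alpha * s;
    }
    for (id, s) in &sparse {
        *fused.entry(id.clone()).or_insert(0.0) += (1.0 - alpha) * s;
    }
    let mut ranked: Vec<(String, f32)> = fused.into_iter().collect();
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked
}
```

Normalizing first matters because BM25 scores are unbounded while cosine similarities are not, so a raw weighted sum would let one signal dominate.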
Performance Characteristics
Latency (Single Query)
| Operation | Flat Index | HNSW Index |
|---|---|---|
| Search (1K vectors) | ~0.1ms | ~0.2ms |
| Search (100K vectors) | ~10ms | ~0.5ms |
| Search (1M vectors) | ~100ms | <1ms |
| Insert | ~0.1ms | ~1ms |
| Batch (1000) | ~50ms | ~500ms |
Memory Usage (1M Vectors, 384 Dimensions)
| Configuration | Memory | Recall |
|---|---|---|
| Full Precision (f32) | ~1.5GB | 100% |
| Scalar Quantization | ~400MB | 98% |
| Product Quantization | ~200MB | 95% |
| Binary Quantization | ~50MB | 85% |
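The memory figures above follow from simple arithmetic on the raw vector payload. The sketch below reproduces that arithmetic; the helper names are hypothetical, index and metadata overhead are ignored, and the product quantization case assumes one byte per subspace code, which the source does not specify.

```rust
/// Full precision: 4 bytes (one f32) per component.
pub fn full_precision_bytes(n: u64, dims: u64) -> u64 {
    n * dims * 4
}

/// Scalar quantization: 1 byte per component (4x smaller than f32).
pub fn scalar_quantized_bytes(n: u64, dims: u64) -> u64 {
    n * dims
}

/// Product quantization, assuming one byte per subspace code
/// (codebook storage not included).
pub fn product_quantized_bytes(n: u64, subspaces: u64) -> u64 {
    n * subspaces
}

/// Binary quantization: 1 bit per component, rounded up to whole bytes.
pub fn binary_quantized_bytes(n: u64, dims: u64) -> u64 {
    n * ((dims + 7) / 8)
}
```

For 1M vectors of 384 dimensions this gives 1,536,000,000 bytes at full precision, 384,000,000 bytes with scalar quantization, and 48,000,000 bytes with binary quantization, in line with the approximate values in the table. An 8x product quantization setting would correspond to 192 one-byte codes per vector, or 192,000,000 bytes.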
Throughput (Queries Per Second)
| Configuration | QPS | Latency (p50) |
|---|---|---|
| Single Thread | ~2,000 | ~0.5ms |
| Multi-Thread (8 cores) | ~50,000 | <0.5ms |
| With SIMD | ~80,000 | <0.3ms |
| With Quantization | ~100,000 | <0.2ms |
Configuration Guide
For Maximum Accuracy
let options = DbOptions ;
For Maximum Speed
let options = DbOptions ;
For Balanced Performance
let options = DbOptions::default(); // Recommended defaults
Building and Testing
Build
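# Build with default features
cargo build --release
# Build without SIMD
cargo build --release --no-default-features --features uuid-support
# Build for specific target with optimizations
RUSTFLAGS="-C target-cpu=native" cargo build --release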
Testing
# Run all tests
cargo test
# Run with specific features
cargo test --features simd
# Run with logging
RUST_LOG=debug cargo test
Benchmarks
# Run all benchmarks
cargo bench
# Run specific benchmark
cargo bench --bench hnsw_search
# Run with features
cargo bench --features simd
Available benchmarks:
- distance_metrics - SIMD-optimized distance calculations
- hnsw_search - HNSW index search performance
- quantization_bench - Quantization techniques
- batch_operations - Batch insert/search operations
- comprehensive_bench - Full system benchmarks
Documentation
Complete Ruvector Documentation
This crate is part of the larger Ruvector project:
- Main README - Complete project overview
- Getting Started Guide - Quick start tutorial
- Rust API Reference - Detailed API documentation
- Advanced Features Guide - Quantization, indexing, tuning
- Performance Tuning - Optimization strategies
- Benchmarking Guide - Running benchmarks
API Documentation
Generate and view the full API documentation:
cargo doc --open
Related Crates
ruvector-core is the foundation for platform-specific bindings:
- ruvector-node - Node.js bindings via NAPI-RS
- ruvector-wasm - WebAssembly bindings for browsers
- ruvector-cli - Command-line interface
- ruvector-bench - Performance benchmarks
Contributing
We welcome contributions! See the main Contributing Guidelines for details.
Areas for Contribution
- Bug fixes and stability improvements
- New distance metrics or quantization techniques
- Performance optimizations
- Additional test coverage
- Documentation and examples
Comparison
Why Ruvector Core vs. Alternatives?
| Feature | Ruvector Core | hnswlib-rs | faiss-rs | qdrant |
|---|---|---|---|---|
| Pure Rust | โ | โ | โ (C++) | โ |
| SIMD | โ SimSIMD | โ | โ | โ |
| Quantization | โ Multiple | โ | โ | โ |
| Zero-Copy I/O | โ | โ | โ | โ |
| Metadata Filter | โ | โ | โ | โ |
| Hybrid Search | โ | โ | โ | โ |
| P50 Latency | <0.5ms | ~1ms | ~0.5ms | ~1ms |
| Dependencies | Minimal | Minimal | Heavy | Heavy |
License
MIT License - see LICENSE for details.
Acknowledgments
Built with state-of-the-art algorithms and libraries:
- hnsw_rs - HNSW implementation
- simsimd - SIMD distance calculations
- redb - Embedded database
- rayon - Data parallelism
- memmap2 - Memory-mapped files
Part of Ruvector • Built by rUv
Documentation • Crates.io • GitHub