Router Core
High-performance vector database and neural routing inference engine built in Rust.
Core engine powering Ruvector's intelligent request distribution, model selection, and sub-millisecond vector similarity search. Combines advanced indexing algorithms with SIMD-optimized distance calculations for maximum performance.
Overview
Router Core is the foundation of Ruvector's vector database capabilities, providing:
- Neural Routing: Intelligent request distribution across multiple models and endpoints
- Vector Database: High-performance storage and retrieval with HNSW indexing
- Model Selection: Adaptive routing strategies for multi-model AI systems
- SIMD Acceleration: Hardware-optimized vector operations via simsimd
- Memory Efficiency: Advanced quantization techniques (4-32x compression)
- Lean Dependencies: Pure Rust implementation with a minimal external dependency footprint
Key Features
Core Capabilities
- Sub-Millisecond Search: <0.5ms p50 latency with HNSW indexing
- HNSW Indexing: Hierarchical Navigable Small World for fast approximate nearest neighbor search
- Multiple Distance Metrics: Euclidean, Cosine, Dot Product, Manhattan
- Advanced Quantization: Scalar (4x), Product (8-16x), Binary (32x) compression
- SIMD Optimizations: Hardware-accelerated distance calculations
- Zero-Copy I/O: Memory-mapped files for efficient data access
- Thread-Safe: Concurrent read/write operations with minimal locking
- Persistent Storage: Durable vector storage with redb backend
Neural Routing Features
- Intelligent Request Distribution: Route queries to optimal model endpoints
- Load Balancing: Distribute workload across multiple inference servers
- Model Selection: Automatically select best model based on query characteristics
- Adaptive Strategies: Learn and optimize routing decisions over time
- Latency Optimization: Minimize end-to-end inference time
- Failover Support: Automatic fallback to backup endpoints
Installation

Add to your Cargo.toml:

```toml
[dependencies]
router-core = "0.1.0"
```

Or use the full ruvector package:

```toml
[dependencies]
ruvector = "0.1.0"
```
Quick Start

Basic Vector Database

```rust
use router_core::{VectorDB, VectorEntry, SearchQuery, DistanceMetric};
use std::collections::HashMap;

// Create database with builder pattern
// (argument values and struct fields shown here are illustrative)
let db = VectorDB::builder()
    .dimensions(384)                         // Vector dimensions
    .distance_metric(DistanceMetric::Cosine)
    .hnsw_m(16)                              // HNSW connections per node
    .hnsw_ef_construction(200)               // Construction accuracy
    .storage_path("vectors.db")
    .build()?;

// Insert vectors
let entry = VectorEntry {
    id: "doc-1".to_string(),
    vector: vec![0.1; 384],
    metadata: HashMap::new(),
};
db.insert(entry)?;

// Search for similar vectors
let query = SearchQuery {
    vector: vec![0.1; 384],
    k: 10,
    ..Default::default()
};
let results = db.search(query)?;

for result in results {
    println!("{}: {:.4}", result.id, result.score);
}
```
Batch Operations

```rust
use router_core::VectorEntry;

// Insert multiple vectors efficiently
let entries: Vec<VectorEntry> = (0..10_000)
    .map(|i| VectorEntry {
        id: format!("doc-{i}"),
        vector: vec![0.1; 384],
        metadata: HashMap::new(),
    })
    .collect();

// Batch insert (much faster than individual inserts)
db.insert_batch(entries)?;

// Check statistics (field names illustrative)
let stats = db.stats();
println!("Vectors: {}", stats.vector_count);
println!("Memory: {} MB", stats.memory_usage_mb);
```
Advanced Configuration

```rust
use router_core::{VectorDB, DistanceMetric, QuantizationType};

let db = VectorDB::builder()
    .dimensions(1536)                        // Larger embeddings
    .max_elements(10_000_000)                // 10M vectors
    .distance_metric(DistanceMetric::Cosine) // Cosine similarity
    .hnsw_m(32)                              // More connections = higher recall
    .hnsw_ef_construction(400)               // Higher accuracy during build
    .hnsw_ef_search(100)                     // Search-time accuracy
    .quantization(QuantizationType::Scalar)  // 4x memory compression
    .mmap_vectors(true)                      // Memory-mapped storage
    .storage_path("vectors.db")
    .build()?;
```
Neural Routing Strategies

Router Core supports multiple routing strategies for intelligent request distribution (router type and method names below are illustrative):

1. Round-Robin Routing

Simple load balancing across endpoints:

```rust
use router_core::routing::RoundRobinRouter;

let router = RoundRobinRouter::new(endpoints);
let endpoint = router.select_endpoint()?;
```

2. Latency-Based Routing

Route to fastest available endpoint:

```rust
use router_core::routing::LatencyRouter;

let router = LatencyRouter::new(endpoints);
```

3. Semantic Routing

Route based on query similarity to model specializations:

```rust
use router_core::routing::SemanticRouter;

// Define model specializations with example vectors
let models = vec![
    ("code-model", code_example_vectors),
    ("math-model", math_example_vectors),
];
let router = SemanticRouter::new(models);

// Routes to most appropriate model based on query vector
let endpoint = router.select_endpoint(&query_vector)?;
```

4. Adaptive Routing

Learn optimal routing decisions over time:

```rust
use router_core::routing::AdaptiveRouter;

let mut router = AdaptiveRouter::new(endpoints);

// Router learns from feedback
router.record_request(&endpoint, latency_ms, success)?;

// Routing improves with more data
let best_endpoint = router.select_endpoint(&query_vector)?;
```
Distance Metrics

Router Core supports multiple distance metrics with SIMD optimization:

Cosine Similarity

Best for normalized embeddings (recommended for most AI applications):

```rust
use router_core::{calculate_distance, DistanceMetric};

let a = vec![1.0, 2.0, 3.0];
let b = vec![4.0, 5.0, 6.0];

let dist = calculate_distance(&a, &b, DistanceMetric::Cosine)?;
// Returns 1 - cosine_similarity (0 = identical, 2 = opposite)
```

Euclidean Distance (L2)

Measures absolute geometric distance:

```rust
let dist = calculate_distance(&a, &b, DistanceMetric::Euclidean)?;
// Returns sqrt(sum((a[i] - b[i])^2))
```

Dot Product

Fast similarity for pre-normalized vectors:

```rust
let dist = calculate_distance(&a, &b, DistanceMetric::DotProduct)?;
// Returns -sum(a[i] * b[i]) (negated for distance)
```

Manhattan Distance (L1)

Sum of absolute differences:

```rust
let dist = calculate_distance(&a, &b, DistanceMetric::Manhattan)?;
// Returns sum(|a[i] - b[i]|)
```
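As a sanity check on the formulas in the comments above, the four metrics can be written in a few lines of plain Rust. These are local helper functions for illustration, not the crate's SIMD-accelerated API:

```rust
// Cosine distance: 1 - (a · b) / (|a| |b|)
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (na * nb)
}

// Euclidean (L2): sqrt of summed squared differences
fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

// Dot product, negated so that smaller = more similar
fn dot_product_distance(a: &[f32], b: &[f32]) -> f32 {
    -a.iter().zip(b).map(|(x, y)| x * y).sum::<f32>()
}

// Manhattan (L1): summed absolute differences
fn manhattan_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

fn main() {
    let a = [1.0, 0.0];
    let b = [0.0, 1.0];
    assert!((cosine_distance(&a, &b) - 1.0).abs() < 1e-6); // orthogonal
    assert!((euclidean_distance(&a, &b) - 2f32.sqrt()).abs() < 1e-6);
    assert_eq!(dot_product_distance(&a, &b), 0.0);
    assert_eq!(manhattan_distance(&a, &b), 2.0);
}
```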
Quantization Techniques

Reduce memory usage with minimal accuracy loss:

Scalar Quantization (4x compression)

Compress float32 to int8:

```rust
use router_core::{VectorDB, QuantizationType};

let db = VectorDB::builder()
    .dimensions(384)
    .quantization(QuantizationType::Scalar)
    .build()?;

// Automatic quantization on insert
// 384 dims × 4 bytes = 1536 bytes → 384 bytes + overhead
```
Product Quantization (8-16x compression)

Divide vector into subspaces and quantize independently:

```rust
let db = VectorDB::builder()
    .dimensions(384)
    .quantization(QuantizationType::Product { subspaces: 8 })
    .build()?;

// 384 dims × 4 bytes = 1536 bytes → 8 bytes + overhead
```
Binary Quantization (32x compression)

Compress to 1 bit per dimension:

```rust
let db = VectorDB::builder()
    .dimensions(384)
    .quantization(QuantizationType::Binary)
    .build()?;

// 384 dims × 4 bytes = 1536 bytes → 48 bytes + overhead
// Fast Hamming distance for similarity
```
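The idea behind binary quantization and Hamming distance can be sketched in plain Rust. The bit-packing layout here (one sign bit per dimension, packed into u64 words) is an assumption for illustration, not the crate's internal format:

```rust
// Pack the sign of each component into one bit per dimension.
fn binarize(v: &[f32]) -> Vec<u64> {
    let mut words = vec![0u64; (v.len() + 63) / 64];
    for (i, &x) in v.iter().enumerate() {
        if x > 0.0 {
            words[i / 64] |= 1u64 << (i % 64); // set bit for positive component
        }
    }
    words
}

// Hamming distance = number of differing bits, via XOR + popcount.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

fn main() {
    let a = binarize(&[0.9, -0.2, 0.4, -0.7]);
    let b = binarize(&[0.8, 0.1, -0.3, -0.6]);
    assert_eq!(hamming(&a, &b), 2); // dims 1 and 2 flip sign

    // A 384-dim vector packs into 6 u64 words = 48 bytes, as above.
    assert_eq!(binarize(&vec![1.0; 384]).len() * 8, 48);
}
```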
Compression Ratio Comparison

```rust
use router_core::calculate_compression_ratio;

let dims = 384;

let none_ratio = calculate_compression_ratio(QuantizationType::None, dims);
// 1x - no compression

let scalar_ratio = calculate_compression_ratio(QuantizationType::Scalar, dims);
// ~4x compression

let product_ratio = calculate_compression_ratio(QuantizationType::Product { subspaces: 8 }, dims);
// ~8-16x compression

let binary_ratio = calculate_compression_ratio(QuantizationType::Binary, dims);
// ~32x compression
```
HNSW Index Configuration

Tune the HNSW index for your performance/accuracy requirements (parameter values below are illustrative starting points):

M Parameter (Connections per Node)

Controls graph connectivity and search accuracy:

```rust
// Low M = faster build, less memory, lower recall
let db_fast = VectorDB::builder()
    .hnsw_m(8)    // Minimal connections
    .build()?;

// Medium M = balanced (default)
let db_balanced = VectorDB::builder()
    .hnsw_m(16)   // Default setting
    .build()?;

// High M = slower build, more memory, higher recall
let db_accurate = VectorDB::builder()
    .hnsw_m(64)   // Maximum accuracy
    .build()?;
```
ef_construction (Build-Time Accuracy)

Controls accuracy during index construction:

```rust
// Fast build, lower recall
let db_fast = VectorDB::builder()
    .hnsw_ef_construction(100)
    .build()?;

// Balanced (default)
let db_balanced = VectorDB::builder()
    .hnsw_ef_construction(200)
    .build()?;

// Slow build, maximum recall
let db_accurate = VectorDB::builder()
    .hnsw_ef_construction(800)
    .build()?;
```
ef_search (Query-Time Accuracy)

Can be adjusted per query for dynamic performance/accuracy tradeoff:

```rust
// Fast search, lower recall
let query_fast = SearchQuery {
    vector: query_vector.clone(),
    k: 10,
    ef_search: Some(50),
    ..Default::default()
};

// Accurate search
let query_accurate = SearchQuery {
    vector: query_vector,
    k: 10,
    ef_search: Some(400),
    ..Default::default()
};
```
Use Cases

Multi-Model AI Systems

Route queries to specialized models based on content:

```rust
// Route code questions to code model, math to math model, etc.
let router = SemanticRouter::new(model_specializations);
let best_model = router.route(&query_embedding)?;
```

Load Balancing

Distribute inference load across multiple servers:

```rust
// Balance load across 10 GPU servers
let router = RoundRobinRouter::new(gpu_endpoints);
let endpoint = router.next_endpoint()?;
```
RAG (Retrieval-Augmented Generation)

Fast context retrieval for LLMs:

```rust
// Store document embeddings
for doc in documents {
    db.insert(VectorEntry {
        id: doc.id.clone(),
        vector: embed_model.encode(&doc.text)?,
        metadata: HashMap::new(),
    })?;
}

// Retrieve relevant context for query
let query_embedding = embed_model.encode(&user_question)?;
let context_docs = db.search(SearchQuery {
    vector: query_embedding,
    k: 5,
    ..Default::default()
})?;
```
Semantic Search

Build intelligent search engines:

```rust
// Index product catalog
for product in catalog {
    db.insert(VectorEntry {
        id: product.id.clone(),
        vector: encode_product(&product)?,
        metadata: product.metadata.clone(),
    })?;
}

// Search by natural language
let search_embedding = encode_query("waterproof hiking boots")?;
let results = db.search(SearchQuery {
    vector: search_embedding,
    k: 20,
    ..Default::default()
})?;
```
Agent Memory Systems

Store and retrieve agent experiences:

```rust
// Store agent observations (sketch; field names illustrative)
db.insert(VectorEntry {
    id: observation.id.clone(),
    vector: encode_observation(&observation)?,
    metadata: observation.metadata.clone(),
})?;
```
Configuration Guide

Optimizing for Different Workloads

The parameter values below are illustrative starting points; tune against your own data.

High Throughput (Batch Processing)

```rust
let db = VectorDB::builder()
    .dimensions(384)
    .hnsw_m(16)                              // Lower M for faster queries
    .hnsw_ef_construction(100)               // Faster build
    .hnsw_ef_search(50)                      // Lower default search accuracy
    .quantization(QuantizationType::Scalar)  // Compress for speed
    .mmap_vectors(true)                      // Reduce memory pressure
    .build()?;
```
High Accuracy (Research/Analysis)

```rust
let db = VectorDB::builder()
    .dimensions(384)
    .hnsw_m(64)                              // Maximum connections
    .hnsw_ef_construction(800)               // High build accuracy
    .hnsw_ef_search(400)                     // High search accuracy
    .quantization(QuantizationType::None)    // No compression
    .build()?;
```
Memory Constrained (Edge Devices)

```rust
let db = VectorDB::builder()
    .dimensions(256)                         // Smaller embeddings
    .max_elements(100_000)                   // Limit dataset size
    .hnsw_m(8)                               // Fewer connections
    .quantization(QuantizationType::Binary)  // 32x compression
    .mmap_vectors(true)                      // Use disk instead of RAM
    .build()?;
```
Balanced (Production Default)

```rust
let db = VectorDB::builder()
    .dimensions(384)
    .hnsw_m(32)
    .hnsw_ef_construction(200)
    .hnsw_ef_search(100)
    .quantization(QuantizationType::Scalar)
    .mmap_vectors(true)
    .build()?;
```
Performance Characteristics

Latency Benchmarks

| Configuration        | Query Latency (p50) | Recall@10 |
|----------------------|---------------------|-----------|
| Uncompressed, M=64   | 0.3ms               | 98.5%     |
| Scalar Quant, M=32   | 0.4ms               | 96.2%     |
| Product Quant, M=32  | 0.5ms               | 94.8%     |
| Binary Quant, M=16   | 0.6ms               | 91.3%     |
Memory Usage (1M vectors @ 384 dims)

| Quantization          | Memory Usage | Compression Ratio |
|-----------------------|--------------|-------------------|
| None (float32)        | 1536 MB      | 1x                |
| Scalar (int8)         | 392 MB       | 3.9x              |
| Product (8 subspaces) | 120 MB       | 12.8x             |
| Binary (1 bit/dim)    | 52 MB        | 29.5x             |
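The raw vector payloads behind these figures are easy to check; the gap between the raw sizes computed below and the table's numbers is index and metadata overhead:

```rust
// Raw vector storage for n vectors at a given bytes-per-vector,
// reported in MB (decimal, 10^6 bytes), excluding index overhead.
fn raw_mb(vectors: u64, bytes_per_vector: u64) -> u64 {
    vectors * bytes_per_vector / 1_000_000
}

fn main() {
    let n = 1_000_000;
    assert_eq!(raw_mb(n, 384 * 4), 1536); // float32: 4 bytes per dim
    assert_eq!(raw_mb(n, 384), 384);      // int8 scalar: 1 byte per dim
    assert_eq!(raw_mb(n, 384 / 8), 48);   // binary: 1 bit per dim
}
```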
Throughput (1M vectors)

| Operation     | Throughput | Notes            |
|---------------|------------|------------------|
| Single Insert | ~100K/sec  | Sequential       |
| Batch Insert  | ~500K/sec  | Parallel (rayon) |
| Query (k=10)  | ~50K QPS   | ef_search=100    |
| Query (k=100) | ~20K QPS   | ef_search=100    |
Integration with Vector Database

Router Core integrates seamlessly with the main Ruvector database:

```rust
use ruvector_core::VectorDB as MainDB;
use router_core::VectorDB as RouterDB;

// Use router-core for specialized routing logic
let router_db = RouterDB::builder()
    .dimensions(384)
    .build()?;

// Or use main ruvector-core for full features
let main_db = MainDB::builder()
    .dimensions(384)
    .build()?;

// Both share the same API!
```
Building and Testing

Build

```bash
# Build library
cargo build --release

# Build with all features
cargo build --release --all-features

# Build static library
cargo rustc --release -- --crate-type staticlib
```

Test

```bash
# Run all tests
cargo test

# Run a specific test
cargo test test_name

# Run with logging
RUST_LOG=debug cargo test -- --nocapture
```

Benchmark

```bash
# Run benchmarks
cargo bench

# Run a specific benchmark
cargo bench --bench <name>

# Criterion writes detailed reports to target/criterion/
```
API Documentation

Core Types

- VectorDB: Main database interface
- VectorEntry: Vector with ID, data, and metadata
- SearchQuery: Query parameters for similarity search
- SearchResult: Search result with ID, score, and metadata
- DistanceMetric: Enum for distance calculation methods
- QuantizationType: Enum for compression methods

Key Methods

```rust
// VectorDB (signatures abridged; see the API reference for exact types)
fn insert(&self, entry: VectorEntry) -> Result<()>;
fn insert_batch(&self, entries: Vec<VectorEntry>) -> Result<()>;
fn search(&self, query: SearchQuery) -> Result<Vec<SearchResult>>;
fn stats(&self) -> DatabaseStats;
```
Links
- Main Repository: github.com/ruvnet/ruvector
- Documentation: docs/README.md
- API Reference: docs/api/RUST_API.md
- Performance Guide: docs/optimization/PERFORMANCE_TUNING_GUIDE.md
- Examples: examples/
Related Crates

- ruvector-core: Full-featured vector database (superset of router-core)
- ruvector-node: Node.js bindings via NAPI-RS
- ruvector-wasm: WebAssembly bindings for browsers
- router-cli: Command-line interface for router operations
- router-ffi: Foreign function interface for C/C++
- router-wasm: WebAssembly bindings for router
Contributing

Contributions are welcome! Please see the contributing guidelines in the main repository.
License
MIT License - see LICENSE for details.
Acknowledgments
Built with battle-tested technologies:
- HNSW: Hierarchical Navigable Small World algorithm
- Product Quantization: Memory-efficient vector compression
- simsimd: SIMD-accelerated similarity computations
- redb: Embedded database for persistent storage
- rayon: Data parallelism for batch operations
- parking_lot: High-performance synchronization primitives
Part of the Ruvector ecosystem
Built by rUv • Production Ready • MIT Licensed
Documentation • API Reference • Examples • Benchmarks