Shardex - Vector Search Engine
A high-performance memory-mapped vector search engine using the ApiThing pattern for consistent, type-safe operations.
Quick Start
use ;
use ;
use ApiOperation;
Features
- Consistent API: All operations use the ApiThing pattern for predictable, composable interactions
- Type Safety: Parameter objects with validation prevent common errors and provide clear interfaces
- Shared Context: Efficient resource management through centralized state
- Memory-mapped storage for zero-copy operations and fast startup
- ACID transactions via write-ahead logging
- Incremental updates without full index rebuilds
- Document text storage with snippet extraction
- Performance monitoring and detailed statistics
- Crash recovery from unexpected shutdowns
- Dynamic shard management with automatic splitting
Architecture
Shardex is built around three core concepts:
- ShardexContext: Shared state and resource management
- Operations: Types implementing the ApiOperation trait
- Parameters: Type-safe input objects for each operation
All operations follow the same pattern:
use ApiOperation;
let result = execute?;
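The shape of this pattern can be sketched without depending on Shardex itself. Everything below is a hypothetical stand-in: the `Operation` trait only approximates what an ApiOperation-style trait might look like, and `Context`, `StoreText`, `GetText`, and their parameter structs are illustrative names rather than Shardex types.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an ApiOperation-style trait.
/// The real trait's signature may differ.
pub trait Operation<C, P> {
    type Output;
    type Error;
    fn execute(context: &mut C, params: &P) -> Result<Self::Output, Self::Error>;
}

/// Hypothetical shared context: owns all state that operations touch.
#[derive(Default)]
pub struct Context {
    documents: HashMap<u64, String>,
}

/// Parameter object for the illustrative StoreText operation.
pub struct StoreTextParams {
    pub document_id: u64,
    pub text: String,
}

/// Parameter object for the illustrative GetText operation.
pub struct GetTextParams {
    pub document_id: u64,
}

pub struct StoreText;
pub struct GetText;

impl Operation<Context, StoreTextParams> for StoreText {
    type Output = ();
    type Error = String;

    fn execute(context: &mut Context, params: &StoreTextParams) -> Result<(), String> {
        // Validate the parameters before mutating shared state.
        if params.text.is_empty() {
            return Err("text must not be empty".to_string());
        }
        context.documents.insert(params.document_id, params.text.clone());
        Ok(())
    }
}

impl Operation<Context, GetTextParams> for GetText {
    type Output = String;
    type Error = String;

    fn execute(context: &mut Context, params: &GetTextParams) -> Result<String, String> {
        context
            .documents
            .get(&params.document_id)
            .cloned()
            .ok_or_else(|| format!("document {} not found", params.document_id))
    }
}

/// Runs two operations against one shared context.
pub fn round_trip() -> Result<String, String> {
    let mut context = Context::default();
    let store = StoreTextParams { document_id: 1, text: "hello".to_string() };
    StoreText::execute(&mut context, &store)?;
    GetText::execute(&mut context, &GetTextParams { document_id: 1 })
}
```

Because every operation takes the same two arguments, a test can build one context, run a sequence of operations against it, and assert on each result in turn.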
Core Operations
Index Management
- CreateIndex - Create new index
Document Operations
- AddPostings - Add vector postings
- StoreDocumentText - Store document text
- BatchStoreDocumentText - Batch text storage
Search Operations
- Search - Vector similarity search
- GetDocumentText - Retrieve document text
- ExtractSnippet - Extract text snippets
Maintenance Operations
- Flush - Flush pending operations
- GetStats - Index statistics
- GetPerformanceStats - Performance metrics
Migration Guide
From Previous API
Old pattern:
let config = new.directory_path;
let mut index = create.await?;
index.add_postings.await?;
let results = index.search.await?;
New pattern:
let mut context = new;
let params = builder
.directory_path
.build?;
execute?;
execute?;
let results = execute?;
Benefits of the new API:
- Type Safety: Typed parameter objects catch misuse at compile time, and builder validation rejects invalid values before an operation runs
- Consistency: All operations follow the same execute pattern
- Composability: Operations can be easily chained and tested
- Resource Efficiency: Single context manages all resources
For a complete migration guide, see MIGRATION.md.
Configuration
Shardex exposes configuration options for tuning index layout, write batching, and search behavior:
use ;
let params = builder
.directory_path
.vector_size // Vector dimensions
.shard_size // Max vectors per shard
.batch_write_interval_ms // WAL batch frequency
.default_slop_factor // Search breadth
.bloom_filter_size // Bloom filter size
.build?;
let mut context = new;
execute?;
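To illustrate why `build` returns a `Result`, here is a minimal sketch of a validating builder. The field names follow the builder calls above, but the field types, the default values, and the validation rules are all assumptions made for illustration; `IndexParams` and `IndexParamsBuilder` are hypothetical names, not Shardex types.

```rust
use std::path::PathBuf;

/// Hypothetical parameter object; types and rules are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct IndexParams {
    pub directory_path: PathBuf,
    pub vector_size: usize,
    pub shard_size: usize,
    pub batch_write_interval_ms: u64,
    pub default_slop_factor: usize,
    pub bloom_filter_size: usize,
}

pub struct IndexParamsBuilder {
    directory_path: Option<PathBuf>,
    vector_size: usize,
    shard_size: usize,
    batch_write_interval_ms: u64,
    default_slop_factor: usize,
    bloom_filter_size: usize,
}

impl IndexParamsBuilder {
    pub fn new() -> Self {
        // Illustrative defaults only; not Shardex's real defaults.
        Self {
            directory_path: None,
            vector_size: 384,
            shard_size: 10_000,
            batch_write_interval_ms: 100,
            default_slop_factor: 3,
            bloom_filter_size: 1024,
        }
    }

    pub fn directory_path(mut self, path: impl Into<PathBuf>) -> Self {
        self.directory_path = Some(path.into());
        self
    }

    pub fn vector_size(mut self, dimensions: usize) -> Self {
        self.vector_size = dimensions;
        self
    }

    pub fn shard_size(mut self, max_vectors: usize) -> Self {
        self.shard_size = max_vectors;
        self
    }

    pub fn batch_write_interval_ms(mut self, interval: u64) -> Self {
        self.batch_write_interval_ms = interval;
        self
    }

    pub fn default_slop_factor(mut self, slop: usize) -> Self {
        self.default_slop_factor = slop;
        self
    }

    pub fn bloom_filter_size(mut self, size: usize) -> Self {
        self.bloom_filter_size = size;
        self
    }

    /// Checks every field once, so operations can trust the parameters.
    pub fn build(self) -> Result<IndexParams, String> {
        let directory_path = self
            .directory_path
            .ok_or_else(|| "directory_path is required".to_string())?;
        if self.vector_size == 0 {
            return Err("vector_size must be greater than zero".to_string());
        }
        if self.shard_size == 0 {
            return Err("shard_size must be greater than zero".to_string());
        }
        if self.default_slop_factor == 0 {
            return Err("default_slop_factor must be greater than zero".to_string());
        }
        Ok(IndexParams {
            directory_path,
            vector_size: self.vector_size,
            shard_size: self.shard_size,
            batch_write_interval_ms: self.batch_write_interval_ms,
            default_slop_factor: self.default_slop_factor,
            bloom_filter_size: self.bloom_filter_size,
        })
    }
}
```

Validating in `build` keeps the checks in one place: an operation that receives an `IndexParams` value never has to re-check it.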
Examples
The repository includes comprehensive examples demonstrating the ApiThing pattern:
- basic_usage.rs - Core operations and patterns
- configuration.rs - Advanced configuration options
- batch_operations.rs - High-throughput processing
- document_text_basic.rs - Text storage and retrieval
- document_text_advanced.rs - Advanced text operations
- monitoring.rs - Statistics and performance monitoring
- error_handling.rs - Robust error handling strategies
Run examples with Cargo's --example flag, passing the file name without the .rs extension.
Performance
Shardex delivers high performance through:
- Zero-copy operations via memory mapping
- Parallel shard search for multi-core utilization
- Configurable search breadth (slop factor) for speed vs. accuracy
- Bloom filter acceleration for document operations
- SIMD-optimized distance calculations
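The speed versus accuracy trade-off behind the slop factor can be illustrated with a small sketch. It assumes that each shard is summarized by a centroid and that the slop factor is the number of nearest shards scanned per query; this is one plausible reading, not a description of Shardex's internals. `Shard`, `squared_distance`, and `search` are hypothetical names.

```rust
/// A shard groups vectors around a centroid (hypothetical layout).
pub struct Shard {
    pub centroid: Vec<f32>,
    /// (document id, embedding) pairs stored in this shard.
    pub vectors: Vec<(u64, Vec<f32>)>,
}

/// Squared Euclidean distance between two equal-length vectors.
fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Returns up to `k` (document id, squared distance) pairs, nearest first,
/// scanning only the `slop_factor` shards whose centroids are closest
/// to the query.
pub fn search(shards: &[Shard], query: &[f32], k: usize, slop_factor: usize) -> Vec<(u64, f32)> {
    // Rank shards by how close their centroid is to the query.
    let mut order: Vec<&Shard> = shards.iter().collect();
    order.sort_by(|a, b| {
        squared_distance(&a.centroid, query).total_cmp(&squared_distance(&b.centroid, query))
    });

    // Scan only the nearest `slop_factor` shards.
    let mut hits: Vec<(u64, f32)> = order
        .into_iter()
        .take(slop_factor)
        .flat_map(|shard| shard.vectors.iter())
        .map(|(id, vector)| (*id, squared_distance(vector, query)))
        .collect();

    hits.sort_by(|a, b| a.1.total_cmp(&b.1));
    hits.truncate(k);
    hits
}
```

A small slop factor touches fewer shards and returns sooner, but it can miss a close vector that sits in a shard whose centroid is farther away. Raising it widens the scan and recovers such matches at the cost of more distance computations.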
Benchmarks (1M vectors, 384 dimensions, Intel i7, 32GB RAM):
- Search latency: <5ms p95 for k=10
- Indexing throughput: >10,000 vectors/second
- Memory usage: ~2-3GB for 1M vectors
- Startup time: <100ms (memory-mapped loading)
Performance varies based on hardware, data characteristics, and configuration. Run your own benchmarks for accurate measurements.
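For a quick latency measurement of your own, a sketch along these lines may help. It uses only the standard library; `measure` and `percentile` are hypothetical helper names, and the closure passed to `measure` stands in for whatever search call you want to time.

```rust
use std::time::{Duration, Instant};

/// Times `runs` invocations of `query` and returns one latency per call.
pub fn measure<F: FnMut()>(runs: usize, mut query: F) -> Vec<Duration> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            query();
            start.elapsed()
        })
        .collect()
}

/// Returns the requested percentile (0 to 100) using the nearest-rank
/// method, or None if there are no samples or the percentile is out of range.
/// Sorts the samples in place.
pub fn percentile(samples: &mut [Duration], percent: usize) -> Option<Duration> {
    if samples.is_empty() || percent > 100 {
        return None;
    }
    samples.sort();
    // Nearest rank: ceil(percent / 100 * n), clamped to at least 1.
    let rank = (percent * samples.len() + 99) / 100;
    Some(samples[rank.max(1) - 1])
}
```

Calling `measure` with a few thousand runs and then `percentile(&mut samples, 95)` gives a p95 figure comparable in kind to the one quoted above. Discarding some warm-up runs first is usually worthwhile, since the first queries against a memory-mapped index pay for page faults.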
Requirements
- Rust 1.70+
- ApiThing for the operation pattern
- Sufficient disk space for index files
- Memory mapping support (Linux, macOS, Windows)
Documentation
- Migration Guide - Converting from previous API versions
Contributing
Contributions are welcome!
License
This project is licensed under the MIT License - see the LICENSE file for details.