# lethe-core-rust
[crates.io](https://crates.io/crates/lethe-core-rust) · [docs.rs](https://docs.rs/lethe-core-rust) · [MIT license](https://opensource.org/licenses/MIT)
A high-performance hybrid retrieval engine that combines BM25 lexical search with vector similarity using z-score fusion. Lethe Core provides state-of-the-art context selection for conversational AI and retrieval-augmented generation (RAG) systems.
## Features
- **Hybrid Retrieval**: Combines BM25 lexical search with vector similarity for optimal relevance
- **Z-Score Fusion**: Normalizes and fuses scores using statistical z-score transformation (α=0.5, β=0.5)
- **Hero Configuration**: Pre-tuned parameters achieving parity with the SPLADE baseline
- **Gamma Boosting**: Context-aware score boosting for code, errors, and technical content
- **Chunking Pipeline**: Intelligent text segmentation with sentence-level granularity
- **Async-First**: Built on Tokio for high-performance concurrent operations
- **Extensible**: Modular architecture supporting custom embedding services and repositories
## Quick Start
Add this to your `Cargo.toml`:
```toml
[dependencies]
lethe-core-rust = "0.1.0"
```
### Basic Usage
```rust
use lethe_core_rust::{get_hero_config, apply_zscore_fusion, Candidate};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Get the hero configuration (optimal for splade parity)
    let config = get_hero_config();
    println!(
        "Hero config: α={}, β={}, k_final={}",
        config.alpha, config.beta, config.k_final
    );

    // Example BM25 candidates
    let bm25_candidates = vec![
        Candidate {
            doc_id: "doc1".to_string(),
            score: 0.8,
            text: Some("Rust async programming tutorial".to_string()),
            kind: Some("bm25".to_string()),
        },
        Candidate {
            doc_id: "doc2".to_string(),
            score: 0.6,
            text: Some("Python async examples".to_string()),
            kind: Some("bm25".to_string()),
        },
    ];

    // Example vector candidates
    let vector_candidates = vec![
        Candidate {
            doc_id: "doc1".to_string(),
            score: 0.9,
            text: Some("Rust async programming tutorial".to_string()),
            kind: Some("vector".to_string()),
        },
        Candidate {
            doc_id: "doc3".to_string(),
            score: 0.7,
            text: Some("Async programming concepts".to_string()),
            kind: Some("vector".to_string()),
        },
    ];

    // Apply z-score fusion with hero configuration (α=0.5)
    let results = apply_zscore_fusion(bm25_candidates, vector_candidates, 0.5);

    println!("Hybrid results:");
    for (i, result) in results.iter().enumerate() {
        println!("{}. {} (score: {:.3})", i + 1, result.doc_id, result.score);
    }

    Ok(())
}
```
### Advanced Usage with Full Pipeline
```rust
use lethe_core_rust::{
    HybridRetrievalService, HybridRetrievalConfig,
    ChunkingService, ChunkingConfig,
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create chunking service
    let chunking_config = ChunkingConfig::default();
    let chunking_service = ChunkingService::new(chunking_config);

    // Chunk some text
    let text = "This is a sample document about Rust programming. \
                Rust is a systems programming language. \
                It provides memory safety without garbage collection.";
    let chunks = chunking_service
        .chunk_text(text, "session-123", uuid::Uuid::new_v4(), 0)
        .await?;
    println!("Created {} chunks", chunks.len());

    // Set up hybrid retrieval with hero configuration
    let config = HybridRetrievalConfig::hero();
    let _service = HybridRetrievalService::mock_for_testing();

    println!("Hero configuration loaded:");
    println!("  α (BM25 weight): {}", config.alpha);
    println!("  β (Vector weight): {}", config.beta);
    println!("  k_initial (pool size): {}", config.k_initial);
    println!("  k_final (results): {}", config.k_final);
    println!("  Diversification: {}", config.diversify_method);

    Ok(())
}
```
## Z-Score Fusion Algorithm
Lethe Core implements a z-score fusion algorithm that normalizes and combines scores from different retrieval methods (a minimal sketch of the math follows these steps):
1. **Score Normalization**: BM25 scores are normalized to [0,1], cosine similarities from [-1,1] to [0,1]
2. **Z-Score Calculation**: Each score set is converted to z-scores (mean=0, std=1)
3. **Weighted Fusion**: Combined using `hybrid_score = α * z_bm25 + β * z_vector`
4. **Gamma Boosting**: Context-aware multipliers for code, error, and technical content
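For intuition, here is a minimal, self-contained sketch of the fusion math. It is not the crate's implementation (use `apply_zscore_fusion` in real code), it assumes the two score lists are already aligned per document, and `zscores`/`fuse` are illustrative helper names:

```rust
/// Standardize a score set to mean 0 and (population) standard deviation 1.
fn zscores(scores: &[f64]) -> Vec<f64> {
    let n = scores.len() as f64;
    let mean = scores.iter().sum::<f64>() / n;
    let var = scores.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / n;
    let std = var.sqrt().max(f64::EPSILON); // guard against a zero-variance set
    scores.iter().map(|s| (s - mean) / std).collect()
}

/// Weighted fusion of aligned z-score vectors: hybrid_score = α * z_bm25 + β * z_vector.
fn fuse(z_bm25: &[f64], z_vector: &[f64], alpha: f64, beta: f64) -> Vec<f64> {
    z_bm25
        .iter()
        .zip(z_vector)
        .map(|(b, v)| alpha * b + beta * v)
        .collect()
}

fn main() {
    // Toy scores for three documents scored by both retrievers.
    let z_b = zscores(&[0.8, 0.6, 0.1]);
    let z_v = zscores(&[0.9, 0.2, 0.7]);
    println!("{:?}", fuse(&z_b, &z_v, 0.5, 0.5));
}
```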
### Hero Configuration
The hero configuration provides pre-tuned parameters validated against the SPLADE baseline (a customization sketch follows this list):
- **α = 0.5, β = 0.5**: Equal weighting of lexical and semantic signals
- **k_initial = 200**: Large candidate pool for comprehensive coverage
- **k_final = 5**: Focused results optimized for the Recall@5 metric
- **Diversification = "splade"**: Advanced diversification matching baseline method
- **Gamma boosting disabled**: Clean z-score fusion without content-type multipliers
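If you need to deviate from these defaults, a custom configuration can start from `HybridRetrievalConfig::hero()` and override individual fields. The struct-update syntax below assumes the config fields are public; treat it as a sketch rather than the definitive API:

```rust
use lethe_core_rust::HybridRetrievalConfig;

fn main() {
    // Sketch only: assumes the config struct's fields are public so that
    // struct-update syntax compiles; adjust to the crate's actual API if not.
    let config = HybridRetrievalConfig {
        k_final: 10, // widen the final result set beyond the hero default of 5
        ..HybridRetrievalConfig::hero()
    };
    assert_eq!(config.alpha, 0.5);
    assert_eq!(config.beta, 0.5);
    println!("k_initial={}, k_final={}", config.k_initial, config.k_final);
}
```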
## Performance Characteristics
- **Concurrent Retrieval**: BM25 and vector search run in parallel using `tokio::try_join!` (pattern sketched after this list)
- **Memory Efficient**: Streaming chunking and lazy evaluation
- **Scalable**: Handles large document collections with efficient indexing
- **Fast Fusion**: O(n) z-score calculation and fusion
- **Low Latency**: Sub-millisecond fusion for typical candidate sets
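The concurrency pattern is plain `tokio::try_join!`. In the sketch below, `bm25_search` and `vector_search` are hypothetical stand-ins rather than methods of `HybridRetrievalService`; only the parallel-await structure is the point:

```rust
use tokio::try_join;

// Hypothetical stand-ins for the two retrieval legs; the crate's own service
// methods will differ, but the concurrency pattern is the same.
async fn bm25_search(query: &str) -> Result<Vec<String>, std::io::Error> {
    Ok(vec![format!("bm25 hit for {query}")])
}

async fn vector_search(query: &str) -> Result<Vec<String>, std::io::Error> {
    Ok(vec![format!("vector hit for {query}")])
}

#[tokio::main]
async fn main() -> Result<(), std::io::Error> {
    // Run both legs concurrently; try_join! returns early on the first error.
    let (bm25_hits, vector_hits) =
        try_join!(bm25_search("rust async"), vector_search("rust async"))?;
    println!("{} + {} candidates to fuse", bm25_hits.len(), vector_hits.len());
    Ok(())
}
```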
## Feature Flags
```toml
[dependencies]
lethe-core-rust = { version = "0.1.0", features = ["ollama"] }
```
- **default**: Basic functionality with fallback embedding service
- **fallback**: Fallback embedding service for testing (included in default)
- **ollama**: Integration with Ollama embedding service
## Architecture
Lethe Core follows a modular architecture:
- **`lethe-shared`**: Common types, errors, and utilities
- **`lethe-domain`**: Core business logic and services
- **`lethe-infrastructure`**: External integrations and adapters
### Key Types
- **`Candidate`**: Search result with document ID, score, and metadata
- **`HybridRetrievalConfig`**: Configuration for retrieval parameters
- **`HybridRetrievalService`**: Main service orchestrating hybrid search
- **`Chunk`**: Text segment with tokenization and metadata
- **`EmbeddingVector`**: Vector representation with similarity operations
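As a reference point for the similarity side, here is plain cosine similarity plus the [-1, 1] → [0, 1] rescaling described in the fusion pipeline above. This is illustrative only and not `EmbeddingVector`'s actual API:

```rust
// Plain cosine similarity for illustration; EmbeddingVector's own methods may differ.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

fn main() {
    let sim = cosine(&[1.0, 0.0, 1.0], &[0.5, 0.5, 1.0]);
    // Rescale from [-1, 1] to [0, 1] before z-scoring, as in the fusion pipeline.
    let normalized = (sim + 1.0) / 2.0;
    println!("cosine = {sim:.3}, normalized = {normalized:.3}");
}
```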
## Examples
The repository includes several examples:
- **Basic Usage**: Simple z-score fusion example
- **Hero Configuration**: Using pre-tuned optimal parameters
- **Custom Config**: Advanced configuration with custom parameters
- **Full Pipeline**: End-to-end chunking and retrieval
## Testing
Run the test suite:
```bash
cargo test
```
Run with logging enabled:
```bash
RUST_LOG=debug cargo test -- --nocapture
```
## Golden Parity Snapshots
The hero configuration has been validated against golden parity snapshots to ensure consistent behavior with the TypeScript implementation (a property-test sketch follows the list). Key validation points:
- Z-score mathematical properties (mean ≈ 0, std ≈ 1)
- Fusion calculation accuracy to 10 decimal places
- Candidate ordering and scoring consistency
- Performance parity with splade baseline
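The first point can be reproduced with a small standalone test. The `zscores` helper below mirrors the illustrative sketch from the fusion section and is not the crate's API:

```rust
// Illustrative property test: z-scores of any non-constant sample should have
// mean ≈ 0 and (population) standard deviation ≈ 1.
fn zscores(scores: &[f64]) -> Vec<f64> {
    let n = scores.len() as f64;
    let mean = scores.iter().sum::<f64>() / n;
    let std = (scores.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / n).sqrt();
    scores.iter().map(|s| (s - mean) / std).collect()
}

#[test]
fn zscores_are_standardized() {
    let z = zscores(&[0.8, 0.6, 0.1, 0.9, 0.3]);
    let n = z.len() as f64;
    let mean = z.iter().sum::<f64>() / n;
    let std = (z.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n).sqrt();
    assert!(mean.abs() < 1e-10);
    assert!((std - 1.0).abs() < 1e-10);
}
```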
## Contributing
Contributions are welcome! Please ensure:
1. All tests pass: `cargo test`
2. Code is formatted: `cargo fmt`
3. Linting passes: `cargo clippy`
4. Documentation builds: `cargo doc --no-deps`
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Related Projects
- [Lethe](https://github.com/nrice/lethe) - The main Lethe project
- [SPLADE](https://github.com/naver/splade) - Sparse lexical and expansion model baseline
## Citation
If you use Lethe Core in your research, please cite:
```bibtex
@software{lethe_core_rust,
  title  = {Lethe Core: High-Performance Hybrid Retrieval with Z-Score Fusion},
  author = {Nathan Rice},
  year   = {2024},
  url    = {https://github.com/nrice/lethe}
}
```