# lethe-core-rust

[![Crates.io](https://img.shields.io/crates/v/lethe-core-rust.svg)](https://crates.io/crates/lethe-core-rust)
[![Documentation](https://docs.rs/lethe-core-rust/badge.svg)](https://docs.rs/lethe-core-rust)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

A high-performance hybrid retrieval engine that combines BM25 lexical search with vector similarity using z-score fusion. Lethe Core provides state-of-the-art context selection for conversational AI and retrieval-augmented generation (RAG) systems.

## Features

- **Hybrid Retrieval**: Combines BM25 lexical search with vector similarity for optimal relevance
- **Z-Score Fusion**: Normalizes and fuses scores using statistical z-score transformation (α=0.5, β=0.5)
- **Hero Configuration**: Pre-tuned parameters achieving parity with the SPLADE baseline
- **Gamma Boosting**: Context-aware score boosting for code, errors, and technical content
- **Chunking Pipeline**: Intelligent text segmentation with sentence-level granularity
- **Async-First**: Built on Tokio for high-performance concurrent operations
- **Extensible**: Modular architecture supporting custom embedding services and repositories

## Quick Start

Add this to your `Cargo.toml`:

```toml
[dependencies]
lethe-core-rust = "0.1.0"
```

### Basic Usage

```rust
use lethe_core_rust::{get_hero_config, apply_zscore_fusion, Candidate};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Get the hero configuration (optimal for splade parity)
    let config = get_hero_config();
    println!("Hero config: α={}, β={}, k_final={}", 
             config.alpha, config.beta, config.k_final);
    
    // Example BM25 candidates
    let bm25_candidates = vec![
        Candidate {
            doc_id: "doc1".to_string(),
            score: 0.8,
            text: Some("Rust async programming tutorial".to_string()),
            kind: Some("bm25".to_string()),
        },
        Candidate {
            doc_id: "doc2".to_string(), 
            score: 0.6,
            text: Some("Python async examples".to_string()),
            kind: Some("bm25".to_string()),
        },
    ];
    
    // Example vector candidates
    let vector_candidates = vec![
        Candidate {
            doc_id: "doc1".to_string(),
            score: 0.9,
            text: Some("Rust async programming tutorial".to_string()),
            kind: Some("vector".to_string()),
        },
        Candidate {
            doc_id: "doc3".to_string(),
            score: 0.7,
            text: Some("Async programming concepts".to_string()),
            kind: Some("vector".to_string()),
        },
    ];
    
    // Apply z-score fusion with hero configuration (α=0.5)
    let results = apply_zscore_fusion(bm25_candidates, vector_candidates, 0.5);
    
    println!("Hybrid results:");
    for (i, result) in results.iter().enumerate() {
        println!("{}. {} (score: {:.3})", i + 1, result.doc_id, result.score);
    }
    
    Ok(())
}
```

### Advanced Usage with Full Pipeline

```rust
use lethe_core_rust::{
    HybridRetrievalService, HybridRetrievalConfig, 
    ChunkingService, ChunkingConfig
};
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create chunking service
    let chunking_config = ChunkingConfig::default();
    let chunking_service = ChunkingService::new(chunking_config);
    
    // Chunk some text
    let text = "This is a sample document about Rust programming. \
                Rust is a systems programming language. \
                It provides memory safety without garbage collection.";
                
    let chunks = chunking_service.chunk_text(
        text,
        "session-123", 
        uuid::Uuid::new_v4(),
        0
    ).await?;
    
    println!("Created {} chunks", chunks.len());
    
    // Set up hybrid retrieval with hero configuration
    let config = HybridRetrievalConfig::hero();
    let service = HybridRetrievalService::mock_for_testing();
    
    println!("Hero configuration loaded:");
    println!("  α (BM25 weight): {}", config.alpha);
    println!("  β (Vector weight): {}", config.beta);
    println!("  k_initial (pool size): {}", config.k_initial);
    println!("  k_final (results): {}", config.k_final);
    println!("  Diversification: {}", config.diversify_method);
    
    Ok(())
}
```

## Z-Score Fusion Algorithm

Lethe Core implements a z-score fusion algorithm that normalizes and combines scores from different retrieval methods:

1. **Score Normalization**: BM25 scores are normalized to [0,1], cosine similarities from [-1,1] to [0,1]
2. **Z-Score Calculation**: Each score set is converted to z-scores (mean=0, std=1)
3. **Weighted Fusion**: Combined using `hybrid_score = α * z_bm25 + β * z_vector`
4. **Gamma Boosting**: Context-aware multipliers for code, error, and technical content
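The steps above can be sketched with plain functions. This is an illustrative, self-contained reimplementation, not the crate's API: `normalize_cosine`, `zscores`, and `fuse` are hypothetical names, and for simplicity the fusion assumes the two score lists are already aligned per document, whereas the crate merges candidates by `doc_id`.

```rust
/// Step 1: map a cosine similarity from [-1, 1] into [0, 1].
fn normalize_cosine(s: f64) -> f64 {
    (s + 1.0) / 2.0
}

/// Step 2: convert a score set to z-scores (mean = 0, std = 1).
fn zscores(scores: &[f64]) -> Vec<f64> {
    let n = scores.len() as f64;
    let mean = scores.iter().sum::<f64>() / n;
    let var = scores.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / n;
    let std = var.sqrt().max(f64::EPSILON); // guard against zero variance
    scores.iter().map(|s| (s - mean) / std).collect()
}

/// Step 3: weighted fusion, hybrid = alpha * z_bm25 + beta * z_vector.
fn fuse(bm25: &[f64], vector: &[f64], alpha: f64, beta: f64) -> Vec<f64> {
    let (zb, zv) = (zscores(bm25), zscores(vector));
    zb.iter().zip(zv.iter()).map(|(b, v)| alpha * b + beta * v).collect()
}
```

Because both inputs are standardized before mixing, neither retrieval method can dominate the fused score merely by producing larger raw magnitudes; only the α/β weights control the balance.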

### Hero Configuration

The hero configuration provides pre-tuned parameters validated against the SPLADE baseline:

- **α = 0.5, β = 0.5**: Equal weighting of lexical and semantic signals
- **k_initial = 200**: Large candidate pool for comprehensive coverage
- **k_final = 5**: Focused results optimized for Recall@5 metrics
- **Diversification = "splade"**: Advanced diversification matching baseline method
- **Gamma boosting disabled**: Clean z-score fusion without content-type multipliers
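Expressed as a plain struct, the hero parameters look like the following. This is a standalone mirror of the values above; the field names follow the README's earlier example and are not a verified definition of `HybridRetrievalConfig`.

```rust
/// Standalone mirror of the hero parameters listed above (hypothetical type).
struct HeroParams {
    alpha: f64,                     // BM25 weight
    beta: f64,                      // vector weight
    k_initial: usize,               // candidate pool size
    k_final: usize,                 // number of returned results
    diversify_method: &'static str, // diversification strategy
    gamma_boost_enabled: bool,      // context-aware multipliers on/off
}

/// Build the hero parameter set described in the bullets above.
fn hero() -> HeroParams {
    HeroParams {
        alpha: 0.5,
        beta: 0.5,
        k_initial: 200,
        k_final: 5,
        diversify_method: "splade",
        gamma_boost_enabled: false, // clean z-score fusion
    }
}
```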

## Performance Characteristics

- **Concurrent Retrieval**: BM25 and vector search run in parallel using `tokio::try_join!`
- **Memory Efficient**: Streaming chunking and lazy evaluation
- **Scalable**: Handles large document collections with efficient indexing
- **Fast Fusion**: O(n) z-score calculation and fusion
- **Low Latency**: Sub-millisecond fusion for typical candidate sets

## Feature Flags

```toml
[dependencies]
lethe-core-rust = { version = "0.1.0", features = ["ollama"] }
```

- **default**: Basic functionality with fallback embedding service
- **fallback**: Fallback embedding service for testing (included in default)
- **ollama**: Integration with Ollama embedding service

## Architecture

Lethe Core follows a modular architecture:

- **`lethe-shared`**: Common types, errors, and utilities
- **`lethe-domain`**: Core business logic and services
- **`lethe-infrastructure`**: External integrations and adapters

### Key Types

- **`Candidate`**: Search result with document ID, score, and metadata
- **`HybridRetrievalConfig`**: Configuration for retrieval parameters
- **`HybridRetrievalService`**: Main service orchestrating hybrid search
- **`Chunk`**: Text segment with tokenization and metadata
- **`EmbeddingVector`**: Vector representation with similarity operations

## Examples

The repository includes several examples:

- **Basic Usage**: Simple z-score fusion example
- **Hero Configuration**: Using pre-tuned optimal parameters
- **Custom Config**: Advanced configuration with custom parameters
- **Full Pipeline**: End-to-end chunking and retrieval

## Testing

Run the test suite:

```bash
cargo test
```

Run with logging enabled:

```bash
RUST_LOG=debug cargo test -- --nocapture
```

## Golden Parity Snapshots

The hero configuration has been validated against golden parity snapshots, ensuring consistent behavior with the TypeScript implementation. Key validation points:

- Z-score mathematical properties (mean ≈ 0, std ≈ 1)
- Fusion calculation accuracy to 10 decimal places
- Candidate ordering and scoring consistency
- Performance parity with the SPLADE baseline
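The first validation point can be checked mechanically. This standalone sketch reimplements the z-score transform rather than calling the crate, and `has_zscore_properties` is a hypothetical helper showing the shape of such a parity check:

```rust
/// z-score transform (reimplemented here so the check is self-contained).
fn zscores(scores: &[f64]) -> Vec<f64> {
    let n = scores.len() as f64;
    let mean = scores.iter().sum::<f64>() / n;
    let std = (scores.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / n)
        .sqrt()
        .max(f64::EPSILON);
    scores.iter().map(|s| (s - mean) / std).collect()
}

/// Parity-style check: a z-scored set should have mean ~ 0 and std ~ 1.
fn has_zscore_properties(z: &[f64], tol: f64) -> bool {
    let n = z.len() as f64;
    let mean = z.iter().sum::<f64>() / n;
    let std = (z.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n).sqrt();
    mean.abs() < tol && (std - 1.0).abs() < tol
}
```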

## Contributing

Contributions are welcome! Please ensure:

1. All tests pass: `cargo test`
2. Code is formatted: `cargo fmt`
3. Linting passes: `cargo clippy`
4. Documentation builds: `cargo doc --no-deps`

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Related Projects

- [Lethe](https://github.com/nrice/lethe) - The main Lethe project
- [SPLADE](https://github.com/naver/splade) - Sparse lexical and expansion model baseline

## Citation

If you use Lethe Core in your research, please cite:

```bibtex
@software{lethe_core_rust,
  title={Lethe Core: High-Performance Hybrid Retrieval with Z-Score Fusion},
  author={Nathan Rice},
  year={2024},
  url={https://github.com/nrice/lethe}
}
```