Laurus : Lexical Augmented Unified Retrieval Using Semantics
Laurus is a search core library written in Rust, designed for Information Retrieval with Semantics.
Laurus provides the foundational mechanisms essential for advanced search capabilities:
- Lexical search primitives for precise, exact-match retrieval
- Vector-based similarity search for deep semantic understanding
- Hybrid scoring and ranking to synthesize multiple signals into coherent results
Rather than a monolithic search engine, Laurus is a composable search core: a set of modular building blocks that can be embedded into applications, extended with custom logic, or orchestrated within distributed systems.
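To make the vector-based similarity primitive listed above more concrete, here is a minimal sketch of exact (brute-force) nearest-neighbor search using cosine similarity. It is illustrative only and does not use the Laurus API: the function names `cosine_similarity` and `top_k` are hypothetical, and a real index such as HNSW returns approximate neighbors far more efficiently than this linear scan.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns `None` if the lengths differ, the input is empty,
/// or either vector has zero magnitude.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Brute-force top-k search: score every document vector against the query
/// and keep the `k` most similar ones, highest score first.
/// Documents whose vectors cannot be compared with the query are skipped.
pub fn top_k(query: &[f32], docs: &[(u64, Vec<f32>)], k: usize) -> Vec<(u64, f32)> {
    let mut scored: Vec<(u64, f32)> = docs
        .iter()
        .filter_map(|(id, v)| cosine_similarity(query, v).map(|s| (*id, s)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

Calling `top_k(&query, &docs, 10)` with a slice of `(id, vector)` pairs returns up to ten `(id, score)` pairs ordered by decreasing similarity.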
Documentation
Comprehensive documentation is available in the docs/ directory and online at https://mosuka.github.io/laurus/:
- Getting Started: Installation and basic usage.
- Architecture: System architecture overview.
- Core Concepts: Schema, Analysis, Embeddings, and Storage.
- Indexing: Lexical and Vector indexing.
- Search: Lexical, Vector, and Hybrid search.
- Advanced Features: Query DSL, ID Management, Persistence, and Deletions.
- API Reference
Features
- Pure Rust Implementation: Memory-safe and fast, built on zero-cost abstractions.
- Hybrid Search: Seamlessly combine BM25 lexical search with HNSW vector search using configurable fusion strategies.
- Multimodal Capabilities: Native support for text-to-image and image-to-image search via CLIP embeddings.
- Rich Query DSL: Term, phrase, boolean, fuzzy, wildcard, range, and geographic queries.
- Flexible Analysis: Configurable pipelines for tokenization, normalization, and stemming (including CJK support).
- Pluggable Storage: Interfaces for in-memory, file-system, and memory-mapped storage backends.
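As an illustration of what a fusion strategy for hybrid search can look like, the sketch below implements Reciprocal Rank Fusion (RRF), a common technique for merging a lexical ranking and a vector ranking without having to normalize their raw scores. This is a standalone example, not the Laurus API: the function name `reciprocal_rank_fusion` and its signature are assumptions made for illustration, and the strategies Laurus actually ships are described in its documentation.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion.
/// Each document receives 1 / (k + rank) from every ranking it appears in,
/// with ranks starting at 1. A larger `k` flattens the advantage of top ranks;
/// 60 is a commonly used value.
pub fn reciprocal_rank_fusion(rankings: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for ranking in rankings {
        for (i, doc_id) in ranking.iter().enumerate() {
            let rank = i as f64 + 1.0;
            *scores.entry(*doc_id).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Sort by descending fused score; break ties by ascending id so the
    // output order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

Given a lexical ranking `[1, 2, 3]` and a vector ranking `[3, 1, 4]`, documents 1 and 3 appear in both lists and therefore rise above documents 2 and 4, which each appear in only one.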
Quick Start
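The quick-start program configures in-memory storage (MemoryStorageConfig) and runs in an async context. For complete, runnable code, see the Quickstart example in the laurus/examples/ directory.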
Examples
You can find usage examples in the laurus/examples/ directory:
- Quickstart - Basic full-text search
- Lexical Search - All query types (Term, Phrase, Boolean, Fuzzy, Wildcard, Range, Geo, Span)
- Vector Search - Semantic similarity search with embeddings
- Hybrid Search - Combining lexical and vector search with fusion
- Multimodal Search - Text-to-image and image-to-image search
- Synonym Graph Filter - Synonym expansion in analysis pipeline
- Candle Embedder - Local BERT embeddings
- OpenAI Embedder - Cloud-based embeddings
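To give a rough idea of what synonym expansion in an analysis pipeline involves, the following sketch tokenizes text, lowercases each token, and then appends synonyms from a lookup table. It is a simplified illustration and not the Laurus implementation: `tokenize` and `expand_synonyms` are hypothetical names, and a real synonym graph filter also tracks token positions and handles multi-word synonyms, which this sketch does not.

```rust
use std::collections::HashMap;

/// Split text on non-alphanumeric characters and lowercase each token.
pub fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Emit every token followed by its synonyms, if the table has any.
/// Only single-token synonyms are supported in this sketch.
pub fn expand_synonyms(
    tokens: &[String],
    synonyms: &HashMap<&str, Vec<&str>>,
) -> Vec<String> {
    let mut out = Vec::with_capacity(tokens.len());
    for token in tokens {
        out.push(token.clone());
        if let Some(alts) = synonyms.get(token.as_str()) {
            out.extend(alts.iter().map(|s| s.to_string()));
        }
    }
    out
}
```

With a table mapping "fast" to "quick", a document containing "fast" is indexed under both terms, so a query for either word can match it.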
Contributing
We welcome contributions!
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.