Laurus : Lexical Augmented Unified Retrieval Using Semantics
Laurus is a composable search core library written in Rust. The name stands for Lexical Augmented Unified Retrieval Using Semantics. Rather than a monolithic engine, Laurus provides modular building blocks for embedding search into any application:
- Lexical search primitives for precise, exact-match retrieval
- Vector-based similarity search for deep semantic understanding
- Hybrid scoring and ranking to synthesize multiple signals into coherent results
These building blocks can be embedded into applications, extended with custom logic, or orchestrated within distributed systems.
Documentation
Comprehensive documentation is available online:
- English: https://mosuka.github.io/laurus/
- Japanese (日本語): https://mosuka.github.io/laurus/ja/
Contents
- Getting Started
- Core Concepts
- Schema & Fields
- Text Analysis
- Embeddings
- Storage
- Indexing (Lexical / Vector)
- Search (Lexical / Vector / Hybrid)
- Query DSL
- Crate Guides
  - laurus (Library): Engine, Scoring, Faceting, Highlighting, Spelling Correction, Persistence & WAL
  - laurus-cli: Command-line interface, REPL, Schema Format
  - laurus-server: gRPC server, HTTP Gateway, Configuration
- Development
- API Reference (docs.rs)
Features
- Pure Rust Implementation: Memory safety and high performance through zero-cost abstractions.
- Hybrid Search: Seamlessly combine BM25 lexical search with HNSW vector search using configurable fusion strategies.
- Multimodal Capabilities: Native support for text-to-image and image-to-image search via CLIP embeddings.
- Rich Query DSL: Term, phrase, boolean, fuzzy, wildcard, range, geographic, and span queries.
- Flexible Analysis: Configurable pipelines for tokenization, normalization, and stemming (including CJK support via Lindera).
- Pluggable Storage: Interfaces for in-memory, file-system, and memory-mapped storage backends.
- Scoring & Ranking: BM25 scoring with customizable fusion strategies for hybrid results.
- Faceting & Highlighting: Built-in support for faceted navigation and search result highlighting.
- Spelling Correction: Suggest corrections for misspelled query terms.
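To make the idea of a fusion strategy for hybrid results concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a lexical ranking and a vector ranking. This section does not say which fusion strategies Laurus provides, so the technique, the function name `reciprocal_rank_fusion`, and the constant `k = 60` are illustrative assumptions and not part of the Laurus API.

```rust
use std::collections::HashMap;

/// Hypothetical helper, not a Laurus function.
/// Each ranked list contributes 1 / (k + rank) to a document's fused score,
/// where rank is the 1-based position of the document in that list.
fn reciprocal_rank_fusion(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (position, doc_id) in ranking.iter().enumerate() {
            let rank = (position + 1) as f64;
            *scores.entry(doc_id.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties are broken by document id so the
    // output order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}

fn main() {
    // Made-up rankings standing in for a BM25 result list and an HNSW result list.
    let lexical = vec!["doc1", "doc2", "doc3"];
    let vector = vec!["doc3", "doc1", "doc4"];

    let fused = reciprocal_rank_fusion(&[lexical, vector], 60.0);
    for (id, score) in &fused {
        println!("{id}: {score:.5}");
    }
}
```

Documents that appear near the top of both lists (here `doc1` and `doc3`) rise above documents found by only one retriever. RRF works on ranks alone, so BM25 scores and vector distances never have to be normalized onto a common scale.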
Workspace Structure
Laurus is organized as a Cargo workspace with 3 crates:
| Crate | Description |
|---|---|
| laurus | Core search library: schema, analysis, indexing, search, and storage |
| laurus-cli | Command-line interface with REPL for interactive search |
| laurus-server | gRPC server with HTTP gateway for deploying Laurus as a service |
Feature Flags
The laurus crate provides optional feature flags for embedding support:
| Feature | Description |
|---|---|
| embeddings-candle | Local BERT embeddings via Candle |
| embeddings-openai | Cloud-based embeddings via the OpenAI API |
| embeddings-multimodal | CLIP-based multimodal (text + image) embeddings |
| embeddings-all | Enable all embedding backends |
Quick Start
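A minimal program uses in-memory storage (MemoryStorageConfig) and runs in an async function. The complete source is the quickstart example in the laurus/examples/ directory.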
Examples
You can find usage examples in the laurus/examples/ directory:
| Example | Description | Feature Flag |
|---|---|---|
| quickstart | Basic full-text search | — |
| lexical_search | All query types (Term, Phrase, Boolean, Fuzzy, Wildcard, Range, Geo, Span) | — |
| vector_search | Semantic similarity search with embeddings | — |
| hybrid_search | Combining lexical and vector search with fusion | — |
| synonym_graph_filter | Synonym expansion in analysis pipeline | — |
| search_with_candle | Local BERT embeddings via Candle | embeddings-candle |
| search_with_openai | Cloud-based embeddings via OpenAI | embeddings-openai |
| multimodal_search | Text-to-image and image-to-image search | embeddings-multimodal |
Contributing
We welcome contributions!
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.