laurus
Core search engine library for the Laurus project. Provides lexical search (keyword matching via inverted index), vector search (semantic similarity via embeddings), and hybrid search (combining both) through a unified API.
Features
- Lexical Search -- Full-text search powered by an inverted index with BM25 scoring
- Vector Search -- Approximate nearest neighbor (ANN) search using Flat, HNSW, or IVF indexes
- Hybrid Search -- Combine lexical and vector results with fusion algorithms (RRF, WeightedSum)
- Text Analysis -- Pluggable analyzer pipeline: tokenizers, filters, stemmers, synonyms (including CJK support via Lindera)
- Embeddings -- Built-in support for Candle (local BERT/CLIP), OpenAI API, or custom embedders
- Pluggable Storage -- In-memory, file-based, or memory-mapped backends
- Faceting & Highlighting -- Faceted navigation and search result highlighting
- Spelling Correction -- Suggest corrections for misspelled query terms
- Write-Ahead Log -- Durability via WAL with automatic recovery on restart
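As an illustration of the RRF fusion named in the feature list, here is a minimal, self-contained sketch of Reciprocal Rank Fusion. It is not taken from the Laurus source: the function name `rrf_fuse`, the `u64` document IDs, and the smoothing constant `k` (commonly set to 60) are assumptions made for the example, and Laurus's own implementation may differ.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of the Laurus API): fuses several ranked
/// lists of document IDs with Reciprocal Rank Fusion.
/// Each document receives the sum of 1 / (k + rank) over every list it
/// appears in, where rank is 1-based.
pub fn rrf_fuse(rankings: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for ranking in rankings {
        for (position, doc_id) in ranking.iter().enumerate() {
            // `position` is 0-based, so add 1 to get the 1-based rank.
            let rank = position as f64 + 1.0;
            *scores.entry(*doc_id).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties are broken by ascending document ID
    // so the output order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

Because RRF uses only rank positions, it needs no score normalization, which is convenient when BM25 scores and vector similarities live on different scales.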
Installation
# Lexical search only (no embedding)
[dependencies]
laurus = "0.2"
# With local BERT embeddings
[dependencies]
laurus = { version = "0.2", features = ["embeddings-candle"] }
# All embedding backends
[dependencies]
laurus = { version = "0.2", features = ["embeddings-all"] }
Feature Flags
| Feature | Description |
|---|---|
| embeddings-candle | Local BERT embeddings via Candle |
| embeddings-openai | Cloud-based embeddings via the OpenAI API |
| embeddings-multimodal | CLIP-based multimodal (text + image) embeddings |
| embeddings-all | Enable all embedding backends |
Quick Start
use ;
use MemoryStorageConfig;
use ;
use ;
async
Key Types
| Type | Module | Description |
|---|---|---|
| Engine | engine | Unified search engine coordinating lexical and vector search |
| Schema | engine | Field definitions and routing configuration |
| Document | data | Collection of named field values |
| SearchRequestBuilder | engine | Builder for unified search requests (lexical, vector, or hybrid) |
| FusionAlgorithm | engine | Result merging strategy (RRF or WeightedSum) |
| LaurusError | error | Comprehensive error type with variants for each subsystem |
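To make the WeightedSum strategy listed for FusionAlgorithm more concrete, the sketch below shows one common way to build a weighted-sum fusion: min-max normalize each result list to [0, 1], then add the weighted scores per document. Everything here is an assumption for illustration, including the names `min_max_normalize` and `weighted_sum_fuse`, the `u64` document IDs, and the choice of min-max normalization; the source does not say how Laurus normalizes or weights scores.

```rust
use std::collections::HashMap;

/// Hypothetical helper: rescales raw scores to [0, 1] with min-max
/// normalization. If all scores are equal, each one maps to 1.0.
fn min_max_normalize(scores: &[(u64, f64)]) -> HashMap<u64, f64> {
    let min = scores.iter().map(|s| s.1).fold(f64::INFINITY, f64::min);
    let max = scores.iter().map(|s| s.1).fold(f64::NEG_INFINITY, f64::max);
    let range = max - min;
    scores
        .iter()
        .map(|&(id, s)| {
            let normalized = if range > 0.0 { (s - min) / range } else { 1.0 };
            (id, normalized)
        })
        .collect()
}

/// Hypothetical helper (not part of the Laurus API): combines lexical and
/// vector results as a weighted sum of their normalized scores.
/// A document missing from one list contributes 0 for that list.
pub fn weighted_sum_fuse(
    lexical: &[(u64, f64)],
    vector: &[(u64, f64)],
    lexical_weight: f64,
    vector_weight: f64,
) -> Vec<(u64, f64)> {
    let mut fused: HashMap<u64, f64> = HashMap::new();
    for (id, s) in min_max_normalize(lexical) {
        *fused.entry(id).or_insert(0.0) += lexical_weight * s;
    }
    for (id, s) in min_max_normalize(vector) {
        *fused.entry(id).or_insert(0.0) += vector_weight * s;
    }
    let mut out: Vec<(u64, f64)> = fused.into_iter().collect();
    // Highest combined score first; ties broken by ascending document ID.
    out.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    out
}
```

Unlike rank-based fusion, a weighted sum keeps the magnitude of each score, so the result depends on how the two score scales are normalized before they are combined.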
Examples
Usage examples are in the examples/ directory:
| Example | Description | Feature Flag |
|---|---|---|
| quickstart | Basic full-text search | -- |
| lexical_search | All query types (Term, Phrase, Boolean, Fuzzy, Wildcard, Range, Geo, Span) | -- |
| vector_search | Semantic similarity search with embeddings | -- |
| hybrid_search | Combining lexical and vector search with fusion | -- |
| synonym_graph_filter | Synonym expansion in analysis pipeline | -- |
| search_with_candle | Local BERT embeddings via Candle | embeddings-candle |
| search_with_openai | Cloud-based embeddings via OpenAI | embeddings-openai |
| multimodal_search | Text-to-image and image-to-image search | embeddings-multimodal |
Documentation
License
This project is licensed under the MIT License - see the LICENSE file for details.