Searus
A flexible, multi-modal search engine library for Rust.
Overview
Searus is a search engine library that provides several search strategies out of the box:
- Semantic Search - BM25-based text search with configurable field rules
- Tag-based Search - Exact and fuzzy tag matching
- Fuzzy Search - String similarity matching using Jaro-Winkler similarity
- Vector Search - Nearest neighbor search with embeddings (via index adapters)
- Multi-modal Search - Combine multiple search strategies with weighted scoring
Features
- 🚀 Fast and Lightweight - Zero-cost abstractions with minimal dependencies
- 🔧 Flexible Configuration - Fine-tune search behavior with semantic rules
- 🎯 Multi-Strategy - Combine different search methods with custom weights
- 📊 Score Transparency - Detailed per-field scores and match explanations
- 🔌 Pluggable Storage - Bring your own index with the IndexAdapter trait
- 🎨 Type-Safe - Generic over your document types with serde support
Installation
Add this to your Cargo.toml:
[]
= "0.0.3"
Quick Start
use *;
use SemanticSearch;
use ;
Search Strategies
Semantic Search
BM25-based text search with configurable field rules and matching strategies:
use *;
use SemanticSearch;
let rules = builder
.field
.field
.field
.build;
let searcher = new;
Matching Strategies:
- Matcher::BM25 - Full BM25 scoring with IDF
- Matcher::Tokenized - Simple term frequency matching
- Matcher::Exact - Case-insensitive exact string matching
- Matcher::Fuzzy - Delegated to FuzzySearch
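To make the difference between BM25 scoring and plain term frequency concrete, here is a minimal, standalone sketch of the textbook BM25 formula. It is not Searus's implementation: the names `Bm25Params`, `tokenize`, and `bm25_scores` are hypothetical, the tokenizer is deliberately simplistic, and `k1 = 1.2`, `b = 0.75` are common defaults rather than values taken from the library.

```rust
use std::collections::HashMap;

/// BM25 tuning parameters (illustrative; not Searus types).
pub struct Bm25Params {
    pub k1: f64, // term-frequency saturation
    pub b: f64,  // strength of document-length normalization
}

/// Lowercase and split on non-alphanumeric characters (a simplified tokenizer).
pub fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Score every document in `docs` against `query` with BM25.
pub fn bm25_scores(docs: &[&str], query: &str, p: &Bm25Params) -> Vec<f64> {
    if docs.is_empty() {
        return Vec::new();
    }
    let tokenized: Vec<Vec<String>> = docs.iter().map(|d| tokenize(d)).collect();
    let n = tokenized.len() as f64;
    let avg_len = tokenized.iter().map(|d| d.len()).sum::<usize>() as f64 / n;

    // Document frequency: in how many documents does each term occur?
    let mut df: HashMap<&str, usize> = HashMap::new();
    for doc in &tokenized {
        let mut unique: Vec<&str> = doc.iter().map(|s| s.as_str()).collect();
        unique.sort_unstable();
        unique.dedup();
        for term in unique {
            *df.entry(term).or_insert(0) += 1;
        }
    }

    let query_terms = tokenize(query);
    tokenized
        .iter()
        .map(|doc| {
            let len = doc.len() as f64;
            query_terms
                .iter()
                .map(|q| {
                    let tf = doc.iter().filter(|t| *t == q).count() as f64;
                    if tf == 0.0 {
                        return 0.0;
                    }
                    let n_q = *df.get(q.as_str()).unwrap_or(&0) as f64;
                    // Non-negative IDF variant: ln(1 + (N - n + 0.5) / (n + 0.5)).
                    let idf = ((n - n_q + 0.5) / (n_q + 0.5) + 1.0).ln();
                    idf * tf * (p.k1 + 1.0)
                        / (tf + p.k1 * (1.0 - p.b + p.b * len / avg_len))
                })
                .sum::<f64>()
        })
        .collect()
}
```

Calling `bm25_scores(&docs, "rust", &Bm25Params { k1: 1.2, b: 0.75 })` returns one score per document. Documents that never mention the term score 0.0, and repeated occurrences raise the score with diminishing returns.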
Tag-based Search
Match documents by tags with configurable field names:
use TaggedSearch;
// Default field name is "tags"
let tag_searcher = new;
// Or specify a custom field
let tag_searcher = with_field;
let query = builder
.tags
.build;
Tag Relationship Trees (TRT)
Enhance tag-based search by defining relationships between tags. This allows queries for a parent tag (e.g., "programming") to automatically include results for child tags (e.g., "rust", "python").
use ;
use HashMap;
// Define tag relationships
let nodes = vec!;
let trt = new;
// Configure searcher with TRT
let tag_searcher = new.with_trt;
// Query with TRT expansion (depth 1)
let query = builder
.tags
.with_trt
.build;
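The idea of depth-limited expansion can be illustrated independently of the library. The sketch below is an assumption about how such expansion might work, not the Searus TRT API: `TagTree` and `expand_tags` are hypothetical names, and the tree is modeled as a plain parent-to-children map.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical stand-in for a tag relationship tree: parent tag -> child tags.
pub type TagTree = HashMap<String, Vec<String>>;

/// Expand `tags` with their descendants, walking at most `depth` levels down.
/// Uses breadth-first traversal; the visited set also guards against cycles.
pub fn expand_tags(tree: &TagTree, tags: &[&str], depth: usize) -> HashSet<String> {
    let mut expanded: HashSet<String> = tags.iter().map(|t| t.to_string()).collect();
    let mut queue: VecDeque<(String, usize)> =
        tags.iter().map(|t| (t.to_string(), 0)).collect();

    while let Some((tag, level)) = queue.pop_front() {
        if level >= depth {
            continue; // do not descend below the requested depth
        }
        if let Some(children) = tree.get(&tag) {
            for child in children {
                // `insert` returns false for tags that were already seen.
                if expanded.insert(child.clone()) {
                    queue.push_back((child.clone(), level + 1));
                }
            }
        }
    }
    expanded
}
```

With a tree in which "programming" has children "rust" and "python", expanding `["programming"]` at depth 1 yields all three tags, while depth 0 leaves the query unchanged.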
Fuzzy Search
String matching based on Jaro-Winkler similarity:
use FuzzySearch;
let fuzzy_searcher = new
.with_threshold; // Minimum similarity: 0.0 to 1.0
let query = builder
.text // Will match "programming"
.build;
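For readers unfamiliar with the metric, here is a small standalone sketch of Jaro-Winkler similarity using only the standard library. It is meant to show what a threshold between 0.0 and 1.0 is compared against. It is not the code Searus runs (the library credits the strsim crate for fuzzy matching); the function names and the 0.1 prefix scaling factor with a 4-character prefix cap are the conventional textbook choices, assumed here.

```rust
/// Jaro similarity in [0.0, 1.0], computed over Unicode scalar values.
pub fn jaro(a: &str, b: &str) -> f64 {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    if a.is_empty() && b.is_empty() {
        return 1.0;
    }
    if a.is_empty() || b.is_empty() {
        return 0.0;
    }

    // Two characters match only if equal and no further apart than `window`.
    let window = (a.len().max(b.len()) / 2).saturating_sub(1);
    let mut a_matched = vec![false; a.len()];
    let mut b_matched = vec![false; b.len()];
    let mut matches = 0usize;
    for i in 0..a.len() {
        let lo = i.saturating_sub(window);
        let hi = (i + window + 1).min(b.len());
        for j in lo..hi {
            if !b_matched[j] && a[i] == b[j] {
                a_matched[i] = true;
                b_matched[j] = true;
                matches += 1;
                break;
            }
        }
    }
    if matches == 0 {
        return 0.0;
    }

    // Count matched characters that line up in a different order.
    let mut out_of_order = 0usize;
    let mut j = 0;
    for i in 0..a.len() {
        if !a_matched[i] {
            continue;
        }
        while !b_matched[j] {
            j += 1;
        }
        if a[i] != b[j] {
            out_of_order += 1;
        }
        j += 1;
    }

    let m = matches as f64;
    let t = (out_of_order / 2) as f64; // transpositions
    (m / a.len() as f64 + m / b.len() as f64 + (m - t) / m) / 3.0
}

/// Jaro-Winkler: boosts the Jaro score for a shared prefix of up to 4 characters.
pub fn jaro_winkler(a: &str, b: &str) -> f64 {
    let sim = jaro(a, b);
    let prefix = a
        .chars()
        .zip(b.chars())
        .take(4)
        .take_while(|(x, y)| x == y)
        .count() as f64;
    sim + prefix * 0.1 * (1.0 - sim)
}
```

A misspelling such as "programing" scores well above 0.9 against "programming" under this metric, which is why a fairly high threshold still tolerates small typos.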
Multi-Strategy Search
Combine multiple searchers with custom weights:
use *;
use ;
let semantic_rules = builder
.field
.field
.build;
let engine = builder
.with
.with
.with
.build;
let query = builder
.text
.tags
.options
.build;
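One plausible way to read "weighted scoring" is a weighted sum of per-searcher scores for each document. The sketch below illustrates that idea only; how Searus actually merges and normalizes scores is not specified here, and `Scores` and `combine` are hypothetical names.

```rust
use std::collections::HashMap;

/// One searcher's output: document id -> score, assumed normalized to [0, 1].
pub type Scores = HashMap<String, f64>;

/// Merge per-searcher scores into a ranked list using a weighted sum.
/// Each entry of `results` pairs a searcher weight with that searcher's scores.
pub fn combine(results: &[(f64, Scores)]) -> Vec<(String, f64)> {
    let mut total: HashMap<String, f64> = HashMap::new();
    for (weight, scores) in results {
        for (id, score) in scores {
            *total.entry(id.clone()).or_insert(0.0) += weight * score;
        }
    }
    let mut ranked: Vec<(String, f64)> = total.into_iter().collect();
    // Highest score first; ties broken by id so the order is deterministic.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked
}
```

A document found by only one searcher still appears in the output, but it contributes nothing for the searchers that missed it, so documents matched by several strategies tend to rank higher.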
Extensions
Customize the search lifecycle with the SearusExtension trait. Extensions can intercept queries, modify items, and alter results.
use *;
;
// Register extension in the engine
let engine: = builder
.with
.with_extension
.build;
Custom Searchers
Implement your own search strategies by implementing the Searcher trait.
use *;
;
This allows you to plug in any algorithm (e.g., TF-IDF, LSH, experimental models) and combine it with built-in searchers.
Optimization
For large datasets (100k+ entities), consider these optimization strategies:
- Precomputation: Pre-tokenize text and pre-compute embeddings.
- Parallelism: Enable the parallel feature to use rayon for concurrent search execution.
- Early Filtering: Apply cheap filters (tags, exact matches) before expensive semantic or vector searches.
- Approximate Nearest Neighbors (ANN): Use an IndexAdapter that supports ANN (e.g., HNSW) instead of brute-force KNN.
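As an illustration of the early-filtering strategy, the following sketch runs a cheap tag check first and only invokes the expensive scoring function on the documents that survive. It is independent of the Searus API; `Doc` and `filter_then_score` are hypothetical names used for this example.

```rust
/// Minimal document type for the example (not a Searus type).
pub struct Doc {
    pub id: u32,
    pub tags: Vec<String>,
    pub text: String,
}

/// Apply the cheap tag filter first, then call `expensive_score`
/// only for documents that carry `required_tag`.
pub fn filter_then_score<F>(
    docs: &[Doc],
    required_tag: &str,
    mut expensive_score: F,
) -> Vec<(u32, f64)>
where
    F: FnMut(&Doc) -> f64,
{
    docs.iter()
        .filter(|d| d.tags.iter().any(|t| t == required_tag))
        .map(|d| (d.id, expensive_score(d)))
        .collect()
}
```

Because iterator adapters are lazy, the scoring closure never runs for documents rejected by the tag check, so its cost scales with the filtered set rather than the whole corpus.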
Index Adapters
Searus supports pluggable storage backends through the IndexAdapter trait:
use ;
// Built-in in-memory index
let mut index: = new;
index.put.unwrap;
// Find nearest neighbors
let neighbors = index.knn;
Implement IndexAdapter for your own storage backend (e.g., PostgreSQL, Redis, Qdrant).
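For orientation, this is roughly what a brute-force nearest-neighbor lookup does, which is the baseline any ANN-capable backend improves on. The sketch is not the IndexAdapter trait or the built-in in-memory index: `cosine` and `knn` are hypothetical free functions, and cosine similarity is an assumed choice of metric.

```rust
/// Cosine similarity; returns 0.0 for mismatched lengths or zero-norm vectors.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Exhaustive k-nearest-neighbor search: compares the query with every item,
/// so each lookup costs O(n * d) for n items of dimension d.
pub fn knn<'a>(
    items: &'a [(String, Vec<f32>)],
    query: &[f32],
    k: usize,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = items
        .iter()
        .map(|(id, v)| (id.as_str(), cosine(v, query)))
        .collect();
    // Most similar first.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

The linear scan is fine for small collections. Past that point, an adapter backed by an ANN structure trades a little recall for much lower query latency.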
Embeddings
Searus provides traits for embedding providers:
use ;
// Built-in stub embedder for testing
let embedder = new; // 384-dimensional vectors
let embedding = embedder.embed?;
// Implement TextEmbedder for your own provider (OpenAI, Cohere, local models, etc.)
Query Options
Fine-tune your search with query options:
let query = builder
.text
.tags
.options
.filters
.build;
Score Transparency
Searus provides detailed scoring information:
for result in results
Examples
Run the included examples:
# Basic semantic search
# Multi-strategy search
# Time check
# Filters example
# Tagged TRT search
Roadmap
- Multithreaded Operations: Run all search operations in parallel.
- Filter Expressions: Range queries, boolean logic, and complex filtering.
- Async Operations: Asynchronous entity search logic.
- Geospatial Search: Location-based querying.
- Image Search: Image-to-image and text-to-image search using embeddings.
- Persistent Storage: Disk-backed index adapters (e.g., using sled or rocksdb).
- Distributed Search: Sharding and clustering for massive datasets.
- Performance: SIMD optimizations and advanced caching strategies.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- BM25 implementation inspired by search engine research
- Fuzzy matching powered by the excellent strsim crate
- Text tokenization using unicode-segmentation