§Searus: A Flexible, Multi-Modal Search Engine Library for Rust
Searus is a powerful, adaptable search library designed to provide a unified interface for various search strategies. Whether you need full-text search, semantic understanding, tag-based filtering, or fuzzy matching, Searus offers a cohesive solution. It is particularly well-suited for applications that require combining multiple search modalities to deliver nuanced and relevant results.
The library is built with flexibility in mind, allowing you to compose different searchers, define custom scoring rules, and extend functionality with hooks into the search lifecycle.
§Key Features
- Multi-Modal Search: Combine different searchers (e.g., `SemanticSearch`, `TaggedSearch`, `FuzzySearch`) in a single query.
- Configurable Ranking: Use `SemanticRules` to define field-specific weights, priorities, and search methods (like BM25 or exact matching).
- Extensible Architecture: Implement the `Searcher` trait for custom searchers, or use `SearusExtension` to modify queries and results.
- Filtering: Apply complex, field-based filters to refine search results before or after the search process.
- Tag Relationship Trees (TRT): Expand tag-based queries to include related tags, enabling more comprehensive searches.
- Parallel Execution: Speed up searches with the optional `parallel` feature flag.
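To make the Tag Relationship Tree idea concrete, here is a minimal, self-contained sketch of tag expansion. It does not use Searus's API: `TagTree`, `relate`, and `expand` are hypothetical names invented for illustration, and the breadth-first strategy with a depth limit is an assumption about how such an expansion could work, not a description of the library's internals.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Hypothetical tag graph: each tag maps to its directly related tags.
/// This is an illustration only, not a Searus type.
struct TagTree {
    related: HashMap<String, Vec<String>>,
}

impl TagTree {
    fn new() -> Self {
        Self { related: HashMap::new() }
    }

    /// Record that `child` is related to (for example, narrower than) `parent`.
    fn relate(&mut self, parent: &str, child: &str) {
        self.related
            .entry(parent.to_string())
            .or_default()
            .push(child.to_string());
    }

    /// Expand the query tags breadth-first, following at most `max_depth` hops.
    fn expand(&self, tags: &[&str], max_depth: usize) -> BTreeSet<String> {
        let mut seen: BTreeSet<String> = tags.iter().map(|t| t.to_string()).collect();
        let mut queue: VecDeque<(String, usize)> =
            tags.iter().map(|t| (t.to_string(), 0)).collect();

        while let Some((tag, depth)) = queue.pop_front() {
            if depth == max_depth {
                continue; // do not follow relations past the depth limit
            }
            if let Some(children) = self.related.get(&tag) {
                for child in children {
                    // `insert` returns false for tags we have already visited.
                    if seen.insert(child.clone()) {
                        queue.push_back((child.clone(), depth + 1));
                    }
                }
            }
        }
        seen
    }
}

fn main() {
    let mut tree = TagTree::new();
    tree.relate("programming", "rust");
    tree.relate("programming", "python");
    tree.relate("rust", "cargo");

    // One hop reaches the direct relations of "programming" only.
    println!("{:?}", tree.expand(&["programming"], 1));
    // Two hops also reach "cargo" through "rust".
    println!("{:?}", tree.expand(&["programming"], 2));
}
```

A query for the tag "programming" would then match items tagged with any tag in the expanded set. The depth limit keeps a broad tag from pulling in the entire graph.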
§Feature Flags
- `semantic` (default): Enables semantic search capabilities (BM25).
- `fuzzy` (default): Enables fuzzy search capabilities.
- `tagged` (default): Enables tag-based search capabilities.
- `parallel`: Enables parallel execution using `rayon`.
- `serde`: Enables serialization support (required for most features).
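As a sketch of how these flags might be selected, the fragment below uses standard Cargo syntax to turn off the default features and opt in to a subset. The feature names come from the list above. The particular combination, and whether the crate builds with only these features enabled, are assumptions to check against the crate's own manifest.

```toml
[dependencies]
# Assumption: only BM25-based semantic search is needed, so the default
# `fuzzy` and `tagged` features are switched off.
searus = { version = "0.1.0", default-features = false, features = ["semantic", "serde"] }

# Alternative (use instead of the line above): keep the defaults and add
# parallel execution.
# searus = { version = "0.1.0", features = ["parallel"] }
```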
§Getting Started
Here’s a quick example of how to set up a semantic search engine for a collection of blog posts.
First, add Searus to your Cargo.toml:
[dependencies]
searus = "0.1.0" # Replace with the latest version
serde = { version = "1.0", features = ["derive"] }
Now, you can create a search engine and query your data:
use searus::prelude::*;
use searus::searchers::SemanticSearch;
use serde::{Deserialize, Serialize};
// Define the data structure to be searched.
// It must derive `Serialize`, `Deserialize`, and `Clone`.
#[derive(Debug, Clone, Serialize, Deserialize)]
struct Post {
    id: u32,
    title: String,
    content: String,
    author: String,
}
fn main() {
    // 1. Create a collection of documents to search.
    let posts = vec![
        Post {
            id: 1,
            title: "Getting Started with Rust".to_string(),
            content: "Rust is a systems programming language.".to_string(),
            author: "Alice".to_string(),
        },
        Post {
            id: 2,
            title: "Building a Search Engine in Rust".to_string(),
            content: "Learn how to build a search engine.".to_string(),
            author: "Bob".to_string(),
        },
    ];
    // 2. Define semantic rules for searching fields.
    // Here, we prioritize matches in 'title' over 'content'.
    let rules = SemanticRules::builder()
        .field("title", FieldRule::bm25().priority(2))
        .field("content", FieldRule::bm25().priority(1))
        .build();
    // 3. Create a searcher. `SemanticSearch` is great for text-based queries.
    let semantic_searcher = SemanticSearch::new(rules);
    // 4. Build the search engine and register the searcher.
    let engine = SearusEngine::builder()
        .with(Box::new(semantic_searcher))
        .build();
    // 5. Construct a query.
    let query = Query::builder()
        .text("rust programming")
        .options(SearchOptions::default().limit(1))
        .build();
    // 6. Execute the search.
    let results = engine.search(&posts, &query);
    // 7. Print the results.
    println!("Query: \"rust programming\"");
    for result in results {
        println!(
            "Found post: \"{}\" by {} (Score: {:.3})",
            result.item.title, result.item.author, result.score
        );
    }
}
This example demonstrates the basic workflow: defining data, configuring rules, building an engine, and executing a query. For more advanced use cases, such as combining multiple searchers or using filters, see the documentation for `SearusEngine`, `Query`, and the specific `Searcher` implementations.
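For intuition about what the `bm25()` field rules and their priorities might compute, the following standalone sketch scores the two example posts with the textbook BM25 formula and then combines the per-field scores. It does not call Searus. `tokenize`, `bm25_scores`, and `combine` are hypothetical helpers, `k1 = 1.2` and `b = 0.75` are conventional defaults, and treating a field's priority as a multiplicative weight is an assumption; the library may combine fields differently.

```rust
/// Lowercase the text and split it on non-alphanumeric characters.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Textbook BM25 score of every document in `docs` for `query`.
fn bm25_scores(docs: &[&str], query: &str, k1: f64, b: f64) -> Vec<f64> {
    let tokenized: Vec<Vec<String>> = docs.iter().map(|d| tokenize(d)).collect();
    if tokenized.is_empty() {
        return Vec::new();
    }
    let n = tokenized.len() as f64;
    let avg_len = tokenized.iter().map(|d| d.len()).sum::<usize>() as f64 / n;

    let mut scores = vec![0.0; tokenized.len()];
    for term in tokenize(query) {
        // Document frequency: how many documents contain the term.
        let df = tokenized.iter().filter(|d| d.contains(&term)).count() as f64;
        let idf = ((n - df + 0.5) / (df + 0.5) + 1.0).ln();
        for (i, doc) in tokenized.iter().enumerate() {
            let tf = doc.iter().filter(|t| **t == term).count() as f64;
            if tf == 0.0 {
                continue;
            }
            // Length normalization: longer-than-average documents are penalized.
            let norm = k1 * (1.0 - b + b * doc.len() as f64 / avg_len);
            scores[i] += idf * tf * (k1 + 1.0) / (tf + norm);
        }
    }
    scores
}

/// Assumed combination rule: weighted sum of per-field scores.
fn combine(fields: &[(f64, Vec<f64>)]) -> Vec<f64> {
    let len = fields.first().map_or(0, |(_, s)| s.len());
    let mut total = vec![0.0; len];
    for (weight, scores) in fields {
        for (slot, score) in total.iter_mut().zip(scores) {
            *slot += weight * score;
        }
    }
    total
}

fn main() {
    let titles = ["Getting Started with Rust", "Building a Search Engine in Rust"];
    let contents = [
        "Rust is a systems programming language.",
        "Learn how to build a search engine.",
    ];
    let query = "rust programming";

    let title_scores = bm25_scores(&titles, query, 1.2, 0.75);
    let content_scores = bm25_scores(&contents, query, 1.2, 0.75);
    // Title weighted 2, content weighted 1, mirroring the priorities above.
    let total = combine(&[(2.0, title_scores), (1.0, content_scores)]);
    for (i, score) in total.iter().enumerate() {
        println!("post {} -> {:.3}", i + 1, score);
    }
}
```

Under these assumptions the first post ranks higher: it matches "rust" in both fields and is the only one containing "programming".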
§Modules
- context: Provides the `SearchContext`, which holds the state of the items being searched. Context provided to searchers during a search operation.
- embeddings: Contains components for generating embeddings, used in vector or semantic search. (Currently experimental). Provides abstractions for generating embeddings from text and images.
- engine: The core `SearusEngine`, which orchestrates the search process across multiple searchers. The main search engine that coordinates multiple searchers.
- extension: Defines the `SearusExtension` trait for hooking into the search lifecycle to modify queries or results. Defines the extension system for Searus.
- filter: Provides powerful filtering capabilities with `FilterExpr` to refine search results. Defines the structures for building filter expressions for queries.
- index: Defines indexing structures for optimizing search performance. (Currently includes in-memory adapters). Provides abstractions and implementations for indexing and storing data.
- prelude: Convenient re-exports for common types and traits.
- rules: Implements the `SemanticRules` and `FieldRule` for fine-grained control over text-based searching. A domain-specific language (DSL) for configuring semantic text search.
- searcher: Contains the fundamental `Searcher` trait and the multi-searcher implementation. The `Searcher` trait, which defines the interface for search plugins.
- searchers: A collection of built-in `Searcher` implementations, including `SemanticSearch`, `TaggedSearch`, and `FuzzySearch`. A collection of built-in `Searcher` implementations.
- types: Defines the core data structures used throughout the library, such as `Query`, `SearusMatch`, and `SearchOptions`. Core data types for the Searus search engine.
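The relationship between the `searcher` and `engine` modules (pluggable searchers coordinated by one engine) can be sketched in a few lines. Everything below is hypothetical: `ToySearcher`, `ExactSearcher`, `PrefixSearcher`, and `ToyEngine` are stand-ins invented for illustration, and summing scores per item is only one plausible merge strategy. The real `Searcher` trait and `SearusEngine` have their own signatures, documented in their modules.

```rust
use std::collections::HashMap;

/// Hypothetical plugin interface; not Searus's real `Searcher` trait.
trait ToySearcher {
    /// Returns `(item index, score)` pairs for the items that match.
    fn search(&self, items: &[String], query: &str) -> Vec<(usize, f64)>;
}

/// Scores 1.0 for items equal to the query (ASCII case-insensitive).
struct ExactSearcher;

/// Scores 0.5 for items that start with the query (case-insensitive).
struct PrefixSearcher;

impl ToySearcher for ExactSearcher {
    fn search(&self, items: &[String], query: &str) -> Vec<(usize, f64)> {
        items
            .iter()
            .enumerate()
            .filter(|(_, item)| item.eq_ignore_ascii_case(query))
            .map(|(i, _)| (i, 1.0))
            .collect()
    }
}

impl ToySearcher for PrefixSearcher {
    fn search(&self, items: &[String], query: &str) -> Vec<(usize, f64)> {
        let q = query.to_lowercase();
        items
            .iter()
            .enumerate()
            .filter(|(_, item)| item.to_lowercase().starts_with(&q))
            .map(|(i, _)| (i, 0.5))
            .collect()
    }
}

/// Hypothetical engine: runs every searcher and sums the scores per item.
struct ToyEngine {
    searchers: Vec<Box<dyn ToySearcher>>,
}

impl ToyEngine {
    fn search(&self, items: &[String], query: &str, limit: usize) -> Vec<(usize, f64)> {
        let mut totals: HashMap<usize, f64> = HashMap::new();
        for searcher in &self.searchers {
            for (idx, score) in searcher.search(items, query) {
                *totals.entry(idx).or_insert(0.0) += score;
            }
        }
        let mut ranked: Vec<(usize, f64)> = totals.into_iter().collect();
        // Highest score first; ties are broken by item index for determinism.
        ranked.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
        ranked.truncate(limit);
        ranked
    }
}

fn main() {
    let items: Vec<String> = ["rust", "rustacean", "python"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    let engine = ToyEngine {
        searchers: vec![Box::new(ExactSearcher), Box::new(PrefixSearcher)],
    };
    for (idx, score) in engine.search(&items, "rust", 10) {
        println!("{} -> {:.1}", items[idx], score);
    }
}
```

Boxed trait objects let the engine hold searchers of different concrete types in one list, which matches the `Box::new(semantic_searcher)` registration shown in the Getting Started example.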