probly-search
A full-text search library, written in Rust, optimized for insertion speed, that provides full control over the scoring calculations.
This started as a port of the Node library NDX.
Demo
Recipe (title) search with 50k documents.
https://quantleaf.github.io/probly-search-demo/
Features
- Three ways to do scoring:
  - BM25 ranking function to rank matching documents. This is the same ranking function that Lucene has used by default since version 6.0.0.
  - zero-to-one, a scoring function unique to this library that provides a normalized score bounded by 0 and 1. Well suited to matching titles/labels against queries.
  - Ability to fully customize your own scoring function by implementing the ScoreCalculator trait.
- Trie-based dynamic inverted index.
- Multiple-field full-text indexing and searching.
- Per-field score boosting.
- Configurable tokenizer.
- Free-text queries with query expansion.
- Fast allocation, but latent deletion (removed documents are only purged completely when the index is vacuumed).
- WASM compatible.
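To make the BM25 and per-field boosting items above more concrete, here is a minimal sketch of the classic BM25 formula with the Lucene-style IDF. It is not this library's implementation: `Bm25Params`, `idf`, and `bm25_term_score` are hypothetical names chosen for illustration, and the `boost` parameter is assumed to be a simple multiplier on the field's score.

```rust
/// Parameters of the classic BM25 formula.
/// Hypothetical struct for illustration, not a type from this library.
#[derive(Clone, Copy)]
pub struct Bm25Params {
    /// Term-frequency saturation (1.2 is a common default).
    pub k1: f64,
    /// Strength of field-length normalization (0.75 is a common default).
    pub b: f64,
}

/// Inverse document frequency in the Lucene style, which is never negative:
/// ln(1 + (N - df + 0.5) / (df + 0.5)).
pub fn idf(total_docs: usize, docs_with_term: usize) -> f64 {
    let n = total_docs as f64;
    let df = docs_with_term as f64;
    (1.0 + (n - df + 0.5) / (df + 0.5)).ln()
}

/// BM25 contribution of one query term to one field of one document.
/// `boost` is an assumed per-field multiplier.
pub fn bm25_term_score(
    params: Bm25Params,
    term_freq: f64,
    field_len: f64,
    avg_field_len: f64,
    total_docs: usize,
    docs_with_term: usize,
    boost: f64,
) -> f64 {
    // Longer-than-average fields are penalized, shorter ones rewarded.
    let norm = 1.0 - params.b + params.b * (field_len / avg_field_len);
    // Saturating term-frequency component.
    let tf = term_freq * (params.k1 + 1.0) / (term_freq + params.k1 * norm);
    boost * idf(total_docs, docs_with_term) * tf
}
```

With two documents, one of which contains the term once in a field of average length, the term-frequency component is 1 and the score reduces to the IDF, ln 2. Doubling `boost` doubles the score.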
Documentation
Adding, Removing and Searching documents
See Integration tests.
Use this library with WASM
See the recipe search demo project.
A basic example
Create an index for documents that have 2 fields, query the documents, and then remove a document.
use std::collections::HashSet;
use ;
// A white space tokenizer
// We have to provide extraction functions for the fields we want to index
// Title
// Description
// Create index with 2 fields
let mut index = new;
// Create docs from a custom Doc struct
let doc_1 = Doc ;
let doc_2 = Doc ;
// Add documents to index
index.add_document;
index.add_document;
// Search, expected 2 results
let mut result = index.query;
assert_eq!;
assert_eq!;
assert_eq!;
// Remove documents from index
index.remove_document;
// Vacuum to remove completely
index.vacuum;
// Search, expect 1 result
result = index.query;
assert_eq!;
assert_eq!;
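The remove-then-vacuum sequence above reflects the "latent deletion" feature. One way such a scheme could work is sketched below with a toy inverted index that marks removed keys with a tombstone and only rewrites its posting lists on `vacuum`. `ToyIndex` and all of its methods are hypothetical and far simpler than the library's trie-based index; the sketch only illustrates the general idea of deferring the expensive cleanup.

```rust
use std::collections::{HashMap, HashSet};

/// Toy inverted index with deferred ("latent") deletion.
/// Hypothetical type for illustration, not part of this library.
#[derive(Default)]
pub struct ToyIndex {
    /// Maps each term to the keys of the documents that contain it.
    postings: HashMap<String, Vec<usize>>,
    /// Keys marked as removed but still physically present in `postings`.
    removed: HashSet<usize>,
}

impl ToyIndex {
    /// Index a document by splitting its text on whitespace.
    pub fn add_document(&mut self, key: usize, text: &str) {
        for term in text.split_whitespace() {
            let list = self.postings.entry(term.to_lowercase()).or_default();
            if !list.contains(&key) {
                list.push(key);
            }
        }
    }

    /// Cheap removal: only records a tombstone for the key.
    pub fn remove_document(&mut self, key: usize) {
        self.removed.insert(key);
    }

    /// Return the keys matching a single term, skipping tombstoned keys.
    pub fn query(&self, term: &str) -> Vec<usize> {
        self.postings
            .get(&term.to_lowercase())
            .map(|list| {
                list.iter()
                    .copied()
                    .filter(|k| !self.removed.contains(k))
                    .collect()
            })
            .unwrap_or_default()
    }

    /// Expensive cleanup: drop tombstoned entries and empty posting lists.
    pub fn vacuum(&mut self) {
        let removed = std::mem::take(&mut self.removed);
        for list in self.postings.values_mut() {
            list.retain(|k| !removed.contains(k));
        }
        self.postings.retain(|_, list| !list.is_empty());
    }

    /// Number of (term, key) pairs physically stored.
    pub fn posting_entries(&self) -> usize {
        self.postings.values().map(Vec::len).sum()
    }
}
```

In this sketch a removed document stops appearing in query results immediately, while the memory it occupies is reclaimed only when `vacuum` runs.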
For more query examples, see the source tests for the BM25 and zero-to-one implementations.
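The comments in the basic example mention a whitespace tokenizer and one extraction function per indexed field. A minimal standalone sketch of what such helpers might look like follows. The `Doc` struct, its field names, and the function signatures are assumptions made for illustration; the exact types the library expects may differ between versions.

```rust
use std::borrow::Cow;

/// Hypothetical document type with the two fields to index.
pub struct Doc {
    pub id: usize,
    pub title: String,
    pub description: String,
}

/// Whitespace tokenizer: splits on runs of whitespace and borrows
/// each token from the input instead of copying it.
pub fn tokenizer(s: &str) -> Vec<Cow<'_, str>> {
    s.split_whitespace().map(Cow::from).collect()
}

/// Extraction function for the title field.
pub fn title_extract(d: &Doc) -> Vec<&str> {
    vec![d.title.as_str()]
}

/// Extraction function for the description field.
pub fn description_extract(d: &Doc) -> Vec<&str> {
    vec![d.description.as_str()]
}
```

Returning `Cow` lets a tokenizer hand back borrowed slices when the text is used as-is, and owned strings when it needs to normalize a token (for example, by lowercasing it).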
Testing
Run all tests with
cargo test
Benchmark
Run all benchmarks with
cargo bench