Crate searchy

§searchy: an embeddable in-memory search engine

An in-memory search index that is built and queried using a bag-of-words model, with cosine similarity scoring based on tf-idf. It supports multiple search terms, permissions, text suggestions (for spelling errors), and evaluating common arithmetic expressions directly to values. The design aims to reduce the memory footprint and, where possible, to speed up updates and searches.
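To make the scoring model concrete, here is a minimal sketch of bag-of-words ranking with tf-idf weights and cosine similarity. It does not use searchy and is not its implementation: the function names, the tokenizer, and the smoothed idf formula are all assumptions chosen for illustration, and the crate's actual weighting and memory layout may differ.

```rust
use std::collections::HashMap;

/// Splits text into lowercase alphanumeric tokens (the "bag of words").
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Counts how often each term occurs in a text.
fn term_counts(text: &str) -> HashMap<String, f64> {
    let mut counts = HashMap::new();
    for token in tokenize(text) {
        *counts.entry(token).or_insert(0.0) += 1.0;
    }
    counts
}

/// Counts, for every term, the number of documents that contain it.
fn document_frequencies(docs: &[&str]) -> HashMap<String, f64> {
    let mut df = HashMap::new();
    for doc in docs {
        for term in term_counts(doc).into_keys() {
            *df.entry(term).or_insert(0.0) += 1.0;
        }
    }
    df
}

/// Weights raw term counts by a smoothed idf: ln((1 + N) / (1 + df)) + 1.
/// (Assumed variant; other idf formulas are equally common.)
fn tfidf(text: &str, df: &HashMap<String, f64>, n_docs: usize) -> HashMap<String, f64> {
    term_counts(text)
        .into_iter()
        .map(|(term, tf)| {
            let d = df.get(&term).copied().unwrap_or(0.0);
            let idf = ((1.0 + n_docs as f64) / (1.0 + d)).ln() + 1.0;
            (term, tf * idf)
        })
        .collect()
}

/// Cosine of the angle between two sparse vectors; 0.0 if either is empty.
fn cosine(a: &HashMap<String, f64>, b: &HashMap<String, f64>) -> f64 {
    let dot: f64 = a.iter().filter_map(|(t, w)| b.get(t).map(|v| w * v)).sum();
    let norm = |v: &HashMap<String, f64>| v.values().map(|w| w * w).sum::<f64>().sqrt();
    let denom = norm(a) * norm(b);
    if denom == 0.0 { 0.0 } else { dot / denom }
}

/// Returns (document index, score) pairs with a non-zero score, best match first.
fn rank(docs: &[&str], query: &str) -> Vec<(usize, f64)> {
    let df = document_frequencies(docs);
    let q = tfidf(query, &df, docs.len());
    let mut scored: Vec<(usize, f64)> = docs
        .iter()
        .enumerate()
        .map(|(i, d)| (i, cosine(&q, &tfidf(d, &df, docs.len()))))
        .filter(|&(_, s)| s > 0.0)
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored
}
```

Because cosine similarity normalizes by vector length, a short document dominated by the query term outranks a longer one that merely mentions it.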

Features:

  • expression evaluation (1+2 will result in 3);
  • small in-memory representation using two allocations (after building the index);
  • cheap updates to the search index;
  • spelling suggestions based on the search index;
  • summary sentence in search results (based on Luhn);
  • filtering of results by group (role-based access control).
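The Luhn-based summary sentence can be illustrated with a simplified, standalone sketch. It is not the crate's `defining_sentence_luhn`: every name below, the stop-word list passed by the caller, the minimum frequency of 2, and the naive sentence splitting are assumptions. The idea is that words occurring often in the text count as significant, and each sentence is scored by its densest cluster of significant words (count squared divided by cluster length).

```rust
use std::collections::{HashMap, HashSet};

/// Luhn allowed up to four insignificant words between significant ones in a cluster.
const MAX_GAP: usize = 4;

fn words(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// Words that occur at least `min_count` times and are not stop words.
fn significant_words(text: &str, stop: &[&str], min_count: usize) -> HashSet<String> {
    let mut counts: HashMap<String, usize> = HashMap::new();
    for w in words(text) {
        *counts.entry(w).or_insert(0) += 1;
    }
    counts
        .into_iter()
        .filter(|(w, c)| *c >= min_count && !stop.contains(&w.as_str()))
        .map(|(w, _)| w)
        .collect()
}

/// Best cluster score of a sentence: (significant words in cluster)^2 / cluster length.
fn sentence_score(sentence: &str, significant: &HashSet<String>) -> f64 {
    // Word positions of the significant words, in increasing order.
    let positions: Vec<usize> = words(sentence)
        .iter()
        .enumerate()
        .filter(|(_, w)| significant.contains(*w))
        .map(|(i, _)| i)
        .collect();
    let mut best = 0.0_f64;
    let mut start = 0; // index into `positions` where the current cluster begins
    for i in 0..positions.len() {
        let last = i + 1 == positions.len();
        // A cluster ends at the final significant word or before a gap larger than MAX_GAP.
        if last || positions[i + 1] - positions[i] - 1 > MAX_GAP {
            let count = (i - start + 1) as f64;
            let span = (positions[i] - positions[start] + 1) as f64;
            best = best.max(count * count / span);
            start = i + 1;
        }
    }
    best
}

/// Picks the highest-scoring sentence, splitting naively on '.', '!' and '?'.
fn summary_sentence<'a>(text: &'a str, stop: &[&str]) -> Option<&'a str> {
    let significant = significant_words(text, stop, 2);
    let mut best: Option<(&'a str, f64)> = None;
    for s in text.split(|c: char| c == '.' || c == '!' || c == '?') {
        let s = s.trim();
        let score = sentence_score(s, &significant);
        if score > 0.0 && best.map_or(true, |(_, b)| score > b) {
            best = Some((s, score));
        }
    }
    best.map(|(s, _)| s)
}
```

A production version would need proper sentence segmentation and a tuned significance threshold; the sketch only shows the scoring idea.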

§Minimal example

use searchy::*;
use expry::*;
 
// MemoryPool avoids small allocations.
pool!(scope);

// Add documents
let mut builder = SearchBuilder::new();
let doc_id = builder.add_document("url".into(), "foo bar text", &mut scope);
// doc_id can be used to later remove the document from the search index
 
// Build search index
let index: SearchIndex = builder.into();
const MAX_DOCS: usize = 1024;
let results = index.search("query text", MAX_DOCS);
eprintln!("RESULTS: {}x in {}ms", results.docs, results.duration);
for ScoredDocumentInfo { doc_id: _, score, info } in results.entries {
    eprintln!("{} -> {}", score, info);
}
 
// do some mutations to the search index
let mut builder = SearchBuilder::from(index);
builder.remove_document(doc_id);

Macros§

clear
pool
rewind

Structs§

MemoryPool
MemoryScope
ScopedArrayBuilder
ScopedStringBuilder
ScoredDocumentInfo
SearchBuilder
Builder to construct a SearchIndex. Can be constructed fresh or from an existing SearchIndex.
SearchIndex
SearchResult

Functions§

defining_sentence_luhn
Implementation based on H. P. Luhn, "The Automatic Creation of Literature Abstracts", published in 1958.
extract_links
from_hex
from_hex_u8
html_escape_inside_attribute_u8
html_escape_outside_attribute_u8
Escaping for inside the script HTML tag.
resolve_relative_links
to_hex
url_encode_u8
Encodes a string so it can be used as any part of a URL. To escape a whole URL instead (e.g. the string http://example.org), use url_escape_u8.
url_escape
url_escape_u8
Escapes a whole URL so that the result contains only valid URL characters. To encode an arbitrary string as part of a URL instead (e.g. a form value), use url_encode_u8.
url_unescape_u8
url_unescape_u8_to_scope
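The difference between encoding a URL part and escaping a whole URL, as described for url_encode_u8 and url_escape_u8 above, can be sketched without the crate. The helpers below (`encode_component`, `escape_url`, `percent_encode`) are hypothetical and do not mirror searchy's signatures; the character sets follow RFC 3986, and whether the crate uses exactly these sets is an assumption.

```rust
/// Percent-encodes every byte for which `keep` returns false.
/// `keep` must only accept ASCII bytes, so pushing them as chars is safe.
fn percent_encode(input: &str, keep: impl Fn(u8) -> bool) -> String {
    let mut out = String::with_capacity(input.len());
    for &b in input.as_bytes() {
        if keep(b) {
            out.push(b as char);
        } else {
            out.push_str(&format!("%{:02X}", b));
        }
    }
    out
}

/// RFC 3986 "unreserved" characters, safe anywhere in a URL.
fn is_unreserved(b: u8) -> bool {
    b.is_ascii_alphanumeric() || matches!(b, b'-' | b'.' | b'_' | b'~')
}

/// RFC 3986 "reserved" characters, which give a URL its structure.
fn is_reserved(b: u8) -> bool {
    b":/?#[]@!$&'()*+,;=".contains(&b)
}

/// Encodes an arbitrary string for use inside one URL part (e.g. a query value):
/// everything except unreserved characters is percent-encoded.
fn encode_component(s: &str) -> String {
    percent_encode(s, is_unreserved)
}

/// Escapes a whole URL: structural characters and existing '%' escapes are kept,
/// only bytes that may not appear in a URL at all are percent-encoded.
fn escape_url(s: &str) -> String {
    percent_encode(s, |b| is_unreserved(b) || is_reserved(b) || b == b'%')
}
```

Applying `encode_component` to a whole URL would destroy its structure by encoding `:` and `/`, while applying `escape_url` to a form value would leave `&` and `=` intact and change the meaning of the query string.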

Type Aliases§

DefaultDocumentInfo
DocumentFrequency
DocumentId
ReplaceFn
SmallStr