§searchy: an embeddable in-memory search engine
An in-memory search index that is built and queried using a bag-of-words model, with results scored by cosine similarity over tf-idf weights. It supports multi-term queries, permissions, spelling suggestions, and direct evaluation of common arithmetic expressions. The design aims to reduce the memory footprint and, where possible, to speed up updates and searches.
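To make the scoring model concrete, here is a minimal sketch of bag-of-words ranking with tf-idf weights and cosine similarity. It is not searchy's implementation: the function names (`tokenize`, `term_counts`, `idf`, `tfidf`, `cosine`, `rank`), the smoothed idf formula, and the use of `HashMap` are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Split text into lowercase alphanumeric tokens (a simple bag-of-words tokenizer).
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Count how often each term occurs in a token list.
fn term_counts(tokens: &[String]) -> HashMap<String, f64> {
    let mut counts = HashMap::new();
    for t in tokens {
        *counts.entry(t.clone()).or_insert(0.0) += 1.0;
    }
    counts
}

/// Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1.
/// The smoothing is an assumption of this sketch.
fn idf(term: &str, docs: &[HashMap<String, f64>]) -> f64 {
    let df = docs.iter().filter(|d| d.contains_key(term)).count() as f64;
    ((1.0 + docs.len() as f64) / (1.0 + df)).ln() + 1.0
}

/// Weight each term count by its idf to get a sparse tf-idf vector.
fn tfidf(counts: &HashMap<String, f64>, docs: &[HashMap<String, f64>]) -> HashMap<String, f64> {
    counts
        .iter()
        .map(|(t, tf)| (t.clone(), tf * idf(t, docs)))
        .collect()
}

/// Cosine similarity between two sparse vectors (0.0 if either is empty).
fn cosine(a: &HashMap<String, f64>, b: &HashMap<String, f64>) -> f64 {
    let dot: f64 = a.iter().filter_map(|(t, w)| b.get(t).map(|v| w * v)).sum();
    let norm = |v: &HashMap<String, f64>| v.values().map(|w| w * w).sum::<f64>().sqrt();
    let denom = norm(a) * norm(b);
    if denom == 0.0 { 0.0 } else { dot / denom }
}

/// Rank documents against a query; returns (document index, score), best first.
/// Documents that share no term with the query are dropped.
fn rank(query: &str, texts: &[&str]) -> Vec<(usize, f64)> {
    let docs: Vec<HashMap<String, f64>> =
        texts.iter().map(|t| term_counts(&tokenize(t))).collect();
    let q = tfidf(&term_counts(&tokenize(query)), &docs);
    let mut scored: Vec<(usize, f64)> = docs
        .iter()
        .enumerate()
        .map(|(i, d)| (i, cosine(&q, &tfidf(d, &docs))))
        .filter(|(_, s)| *s > 0.0)
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored
}
```

Calling `rank("foo bar", &["foo bar text", "other words", "bar baz"])` places the first document ahead of the third and omits the second, because the first matches both query terms and the second matches none.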
Features:
- expression evaluation (1+2 evaluates to 3);
- small in-memory representation using two allocations (after the index is built);
- cheap updates to the search index;
- spelling suggestions based on the search index;
- summary sentence in search results (based on Luhn);
- filtering of results by group (role-based access control).
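As an illustration of how spelling suggestions can be derived from the terms already present in an index, the sketch below picks the vocabulary word with the smallest Levenshtein edit distance to a misspelled query term. This is one common technique, not necessarily the one searchy uses; `edit_distance`, `suggest`, and the `max_distance` cutoff are hypothetical names introduced here.

```rust
/// Levenshtein edit distance between two strings, computed over chars.
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(ca != cb);
            let delete = prev[j + 1] + 1;
            let insert = cur[j] + 1;
            cur.push(substitute.min(delete).min(insert));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Suggest the closest vocabulary word within `max_distance` edits.
/// Returns None if the word is already known or nothing is close enough.
fn suggest<'a>(word: &str, vocabulary: &[&'a str], max_distance: usize) -> Option<&'a str> {
    if vocabulary.contains(&word) {
        return None;
    }
    vocabulary
        .iter()
        .copied()
        .map(|v| (edit_distance(word, v), v))
        .filter(|(d, _)| *d <= max_distance)
        .min_by_key(|(d, _)| *d)
        .map(|(_, v)| v)
}
```

With the vocabulary `["search", "index", "text"]` and a cutoff of 2, the misspelling `serach` maps to `search`, while a correctly spelled term yields no suggestion.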
§Minimal example
use searchy::*;
use expry::*;
// MemoryPool avoids small allocations.
pool!(scope);
// Add documents
let mut builder = SearchBuilder::new();
let doc_id = builder.add_document("url".into(), "foo bar text", &mut scope);
// doc_id can be used to later remove the document from the search index
// Build search index
let index: SearchIndex = builder.into();
const MAX_DOCS: usize = 1024;
let results = index.search("query text", MAX_DOCS);
eprintln!("RESULTS: {}x in {}ms", results.docs, results.duration);
for ScoredDocumentInfo { doc_id: _, score, info } in results.entries {
    eprintln!("{} -> {}", score, info);
}
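// Mutate the search index: convert it back into a builder and remove the document
let mut builder = SearchBuilder::from(index);
builder.remove_document(doc_id);
// Rebuild the index so the removal takes effect in later searches
let index: SearchIndex = builder.into();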
Macros§
Structs§
- MemoryPool
- MemoryScope
- ScopedArrayBuilder
- ScopedStringBuilder
- ScoredDocumentInfo
- SearchBuilder - Builder to construct a SearchIndex. Can be constructed fresh or from an existing SearchIndex.
- SearchIndex
- SearchResult
Functions§
- defining_sentence_luhn - Implementation based on H.P. Luhn, "The Automatic Creation of Literature Abstracts", published in 1958.
- extract_links
- from_hex
- from_hex_u8
- html_escape_inside_attribute_u8
- html_escape_outside_attribute_u8 - Escaping for inside the script HTML tag.
- resolve_relative_links
- to_hex
- url_encode_u8 - Encodes a part of a URL so that it can be used in any part of a URL. To escape a whole URL (e.g. the string http://example.org), use url_escape_u8 instead.
- url_escape
- url_escape_u8 - Escapes a whole URL so that the result contains only valid URL characters. To encode an arbitrary string as part of a URL (e.g. in a form), use url_encode_u8 instead.
- url_unescape_u8
- url_unescape_u8_to_scope
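The difference between encoding a URL part and escaping a whole URL can be sketched with two small percent-encoding helpers. These are illustrative stand-ins, not the signatures or exact character sets of `url_encode_u8` and `url_escape_u8`: the names `encode_component` and `escape_whole_url` are hypothetical, the character classes follow RFC 3986, and leaving `%` untouched in the whole-URL case is an assumption of this sketch.

```rust
/// Percent-encode every byte except RFC 3986 "unreserved" characters.
/// Suitable for a single component, such as a query value from a form.
fn encode_component(input: &str) -> String {
    let mut out = String::new();
    for &b in input.as_bytes() {
        if b.is_ascii_alphanumeric() || b"-._~".contains(&b) {
            out.push(b as char);
        } else {
            out.push_str(&format!("%{:02X}", b));
        }
    }
    out
}

/// Percent-encode only bytes that may not appear in a URL at all,
/// leaving delimiters such as ':', '/', '?' and '&' untouched.
/// Assumption: '%' is kept so that existing escapes are not escaped twice.
fn escape_whole_url(input: &str) -> String {
    const KEPT: &[u8] = b"-._~:/?#[]@!$&'()*+,;=%";
    let mut out = String::new();
    for &b in input.as_bytes() {
        if b.is_ascii_alphanumeric() || KEPT.contains(&b) {
            out.push(b as char);
        } else {
            out.push_str(&format!("%{:02X}", b));
        }
    }
    out
}
```

Applied to `http://example.org`, `encode_component` produces `http%3A%2F%2Fexample.org`, which is only useful as a value inside another URL, whereas `escape_whole_url` returns the string unchanged because every character is already valid in a URL.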