probly-search 0.1.0

A lightweight full-text search engine with a fully customizable scoring function

probly-search

A lightweight full-text search library that provides full control over the scoring calculations. It is intended for creating small, short-lived indices.

This library started as a port of the Node library NDX.

Documentation

Documentation is under development.

Features

  • Multi-field full-text indexing and searching.
  • Per-field score boosting.
  • BM25 ranking function to rank matching documents. The same ranking function that is used by default in Lucene >= 6.0.0.
  • Ability to fully customize your own scoring function.
  • Trie-based dynamic inverted index.
  • Configurable tokenizer and term filter.
  • Free text queries with query expansion.
  • Small memory footprint, optimized for mobile devices.
  • Serializable index.
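
To make the BM25 feature more concrete, here is a minimal sketch of the Lucene-style BM25 formula for a single term in a single field. It is a standalone illustration, not the crate's own code: the function name `bm25_term_score` and its parameters are hypothetical, and the values 1.2 and 0.75 for `k1` and `b` are the commonly used defaults rather than values confirmed for this library.

```rust
/// BM25 score of one term in one field of one document.
/// Hypothetical helper for illustration; not part of probly-search.
fn bm25_term_score(
    term_freq: f64,      // occurrences of the term in this field
    field_len: f64,      // number of tokens in this field
    avg_field_len: f64,  // average length of this field across the index
    doc_count: f64,      // number of indexed documents
    docs_with_term: f64, // number of documents containing the term
    k1: f64,             // term-frequency saturation (commonly 1.2)
    b: f64,              // length-normalization strength (commonly 0.75)
) -> f64 {
    // Inverse document frequency: rarer terms contribute more.
    let idf = (1.0 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5)).ln();
    // Longer-than-average fields are penalized, shorter ones are favored.
    let length_norm = 1.0 - b + b * (field_len / avg_field_len);
    // Term frequency saturates as it grows instead of increasing linearly.
    let tf = term_freq * (k1 + 1.0) / (term_freq + k1 * length_norm);
    idf * tf
}
```

With two documents, one of which contains the term once in a three-token field of average length, `bm25_term_score(1.0, 3.0, 3.0, 2.0, 1.0, 1.2, 0.75)` evaluates to ln 2 ≈ 0.6931. That is consistent with the score asserted in the Example section, although it does not by itself show that the library uses exactly this form.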

Example

Create an index for documents with two fields, add two documents to it, then run a query that matches one of them:

// Index over documents with two fields (title and text).
let mut idx: Index<usize> = create_index(2);
let docs = vec![
    Doc {
        id: 1,
        title: "a b c".to_string(),
        text: "hello world".to_string(),
    },
    Doc {
        id: 2,
        title: "c d e".to_string(),
        text: "lorem ipsum".to_string(),
    },
];
// Add each document, passing one extractor per field.
for doc in docs {
    add_document_to_index(
        &mut idx,
        &[title_extract, text_extract],
        tokenizer,
        filter,
        doc.id,
        doc,
    );
}
// Search for "a" with equal boost on both fields and BM25 scoring.
let result = query(
    &mut idx,
    &vec![1., 1.],
    &score::default::bm25::default(),
    tokenizer,
    filter,
    None,
    &"a",
);
// Only document 1 contains the term "a".
assert_eq!(result.len(), 1);
assert!(approx_equal(
    result.get(0).unwrap().score,
    0.6931471805599453,
    8
));
assert_eq!(result.get(0).unwrap().key, 1);

Go through source tests for more examples.
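
The example above uses a `Doc` type and several helpers (`tokenizer`, `filter`, `title_extract`, `text_extract`, `approx_equal`) without defining them. The sketch below shows one plausible shape for them. Every signature here is an assumption made for illustration; check the source tests for the definitions the crate actually expects.

```rust
// Hypothetical definitions; the signatures are assumptions, not the crate's documented API.
struct Doc {
    id: usize,
    title: String,
    text: String,
}

// Split a field value into terms on whitespace.
fn tokenizer(s: &str) -> Vec<String> {
    s.split_whitespace().map(|t| t.to_string()).collect()
}

// Normalize a term: keep alphanumeric characters and lowercase them.
fn filter(s: &str) -> String {
    s.chars()
        .filter(|c| c.is_alphanumeric())
        .flat_map(|c| c.to_lowercase())
        .collect()
}

// Field extractors: one per indexed field, in the order passed to the index.
fn title_extract(d: &Doc) -> Option<&str> {
    Some(d.title.as_str())
}

fn text_extract(d: &Doc) -> Option<&str> {
    Some(d.text.as_str())
}

// Compare two floats up to `decimal_places` digits after the decimal point.
fn approx_equal(a: f64, b: f64, decimal_places: u8) -> bool {
    let tolerance = 10f64.powi(-(decimal_places as i32));
    (a - b).abs() < tolerance
}
```

Keeping the tokenizer and the term filter as plain functions means the same pair can be passed both when indexing and when querying, so query terms are normalized the same way as indexed terms.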

License

MIT