probly-search 0.1.1

A lightweight full-text search engine with a fully customizable scoring function

probly-search

A lightweight full-text search library that provides full control over the scoring calculations. Intended for creating small, short-lived indices.

This library started as a port of the Node library NDX.

Features

  • Multi-field full-text indexing and searching.
  • Per-field score boosting.
  • BM25 ranking function to rank matching documents, the same ranking function that Lucene uses by default since 6.0.0.
  • Ability to fully customize your own scoring function by implementing the ScoreCalculator trait.
  • Trie-based dynamic inverted index.
  • Configurable tokenizer and term filter.
  • Free text queries with query expansion.
  • Small memory footprint, optimized for mobile devices.
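
As a rough illustration of the BM25 ranking mentioned above, here is a minimal, self-contained sketch of the textbook (Lucene-style) formula for a single term in a single field. It is not the library's implementation: the function names are hypothetical, and the constants k1 = 1.2 and b = 0.75 used in the note below are the commonly used defaults, assumed here rather than taken from the library.

```rust
/// Lucene-style BM25 inverse document frequency (illustrative, not library code).
/// `doc_count` is the number of indexed documents,
/// `doc_freq` the number of documents containing the term.
fn bm25_idf(doc_count: f64, doc_freq: f64) -> f64 {
    (1.0 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5)).ln()
}

/// BM25 score contribution of one term in one field.
/// `tf` is the term frequency in the field, `field_len` the field length in
/// tokens, `avg_field_len` the average length of that field over all documents.
/// `k1` and `b` are the usual BM25 tuning constants; `boost` is the field boost.
fn bm25_term_score(
    tf: f64,
    field_len: f64,
    avg_field_len: f64,
    doc_count: f64,
    doc_freq: f64,
    k1: f64,
    b: f64,
    boost: f64,
) -> f64 {
    // Length normalization: fields longer than average are penalized.
    let norm = 1.0 - b + b * field_len / avg_field_len;
    // Saturating term-frequency component.
    let tf_part = tf * (k1 + 1.0) / (tf + k1 * norm);
    bm25_idf(doc_count, doc_freq) * tf_part * boost
}
```

For two documents, a term that occurs once in one of them, and a three-token field whose average length is also three, this sketch gives ln 2 ≈ 0.693 with k1 = 1.2, b = 0.75 and a boost of 1. That agrees with the score expected in the example further down, although it does not prove the library uses exactly these constants.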

Documentation

Documentation is under development. For now, read the tests in the source.

Example

Create an index for documents with two fields, index two documents, then query for one of them using the BM25 scoring function:

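// `Doc`, `title_extract`, `text_extract`, `tokenizer`, `filter` and
// `approx_equal` are not defined in this snippet and must be in scope.

// Create an index with two fields (title and text).
let mut idx: Index<usize> = create_index(2);
let docs = vec![
    Doc {
        id: 1,
        title: "a b c".to_string(),
        text: "hello world".to_string(),
    },
    Doc {
        id: 2,
        title: "c d e".to_string(),
        text: "lorem ipsum".to_string(),
    },
];
// Index each document under its id, with one extractor per field.
for doc in docs {
    add_document_to_index(
        &mut idx,
        &[title_extract, text_extract],
        tokenizer,
        filter,
        doc.id,
        doc,
    );
}
// Query for "a" with equal boost on both fields, scored with BM25.
let result = query(
    &mut idx,
    &vec![1., 1.],
    &score::default::bm25::default(),
    tokenizer,
    filter,
    None,
    &"a",
);
// Only document 1 contains the term "a".
assert_eq!(result.len(), 1);
assert!(approx_equal(
    result.get(0).unwrap().score,
    0.6931471805599453,
    8
));
assert_eq!(result.get(0).unwrap().key, 1);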

See the tests in the source for more examples.
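
The example relies on several helpers it does not define: the Doc type, the two field extractors, the tokenizer, the term filter and approx_equal. The sketch below shows one plausible shape for them. All signatures here are assumptions made for illustration; the exact types that version 0.1.1 expects for these callbacks may differ, so check the source tests before copying them.

```rust
/// Hypothetical document type with two indexed fields.
struct Doc {
    id: usize,
    title: String,
    text: String,
}

/// Field extractors: one per indexed field (assumed signature).
fn title_extract(d: &Doc) -> Option<&str> {
    Some(d.title.as_str())
}

fn text_extract(d: &Doc) -> Option<&str> {
    Some(d.text.as_str())
}

/// Whitespace tokenizer (assumed signature).
fn tokenizer(s: &str) -> Vec<String> {
    s.split_whitespace().map(str::to_owned).collect()
}

/// Term filter: strips leading and trailing non-alphanumeric characters
/// and lowercases the term (assumed signature).
fn filter(s: &str) -> String {
    s.trim_matches(|c: char| !c.is_alphanumeric()).to_lowercase()
}

/// Returns true if `a` and `b` differ by less than 10^-dp.
fn approx_equal(a: f64, b: f64, dp: u8) -> bool {
    (a - b).abs() < 10f64.powi(-(dp as i32))
}
```

A tokenizer and filter pair like this keeps indexing and querying consistent, since the same two functions are passed to both add_document_to_index and query in the example.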

License

MIT