Module tantivy::termdict

The term dictionary's main role is to associate each of the sorted Terms with a TermInfo struct that contains meta-information about the term.

Internally, the term dictionary relies on the fst crate to store a sorted mapping that associates each term with its rank in lexicographical order. For instance, in a dictionary containing the sorted terms “abba”, “bjork”, “blur” and “donovan”, the TermOrdinals are respectively 0, 1, 2, and 3.
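The notion of a term ordinal can be illustrated without the fst crate. The following is a minimal sketch that uses a sorted Vec and a binary search as a stand-in for the FST; the function names and the local TermOrdinal alias are hypothetical and are not tantivy's API.

```rust
/// Hypothetical alias for this sketch: the rank of a term in the sorted list of terms.
type TermOrdinal = u64;

/// Sorts and deduplicates raw terms so that a term's index is its ordinal.
fn build_sorted_terms(mut terms: Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    terms.sort(); // byte-wise lexicographical order
    terms.dedup();
    terms
}

/// Returns the ordinal of `term`, or `None` if the term is absent.
/// A sorted Vec plus binary search stands in for the FST here.
fn term_ord(sorted_terms: &[Vec<u8>], term: &[u8]) -> Option<TermOrdinal> {
    sorted_terms
        .binary_search_by(|probe| probe.as_slice().cmp(term))
        .ok()
        .map(|idx| idx as TermOrdinal)
}
```

With the four terms from the example above, looking up “blur” yields ordinal 2 regardless of the order in which the terms were supplied.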

For u64-terms, tantivy explicitly uses a BigEndian representation to ensure that the lexicographical order matches the natural order of integers.
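To see why the byte order matters, here is a small sketch using only the standard library. The helper name is hypothetical; it only shows that comparing big-endian byte arrays agrees with comparing the integers themselves.

```rust
/// Encodes a u64 as 8 big-endian bytes (most significant byte first).
/// Hypothetical helper for illustration, not tantivy's API.
fn u64_to_term_bytes(val: u64) -> [u8; 8] {
    val.to_be_bytes()
}

/// Checks that byte-wise comparison of the encodings agrees with integer comparison.
fn order_is_preserved(a: u64, b: u64) -> bool {
    a.cmp(&b) == u64_to_term_bytes(a).cmp(&u64_to_term_bytes(b))
}
```

A little-endian encoding would not have this property: 255 encodes as `[255, 0, ...]` and 256 as `[0, 1, ...]`, so 255 would sort after 256.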

i64-terms are transformed to u64 using a continuous mapping val ⟶ val - i64::MIN and then treated as a u64.
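The mapping can be sketched with wrapping arithmetic, since val - i64::MIN overflows an i64 for non-negative values. The function names below are assumptions made for this example; the result is the same as flipping the sign bit.

```rust
/// Maps an i64 to a u64 so that the order of the integers is preserved.
/// Equivalent to `(val as u64) ^ (1 << 63)`.
fn i64_to_u64(val: i64) -> u64 {
    // wrapping_sub avoids the overflow panic a plain subtraction would hit in debug builds
    val.wrapping_sub(i64::MIN) as u64
}

/// Inverse of `i64_to_u64`.
fn u64_to_i64(val: u64) -> i64 {
    (val as i64).wrapping_add(i64::MIN)
}
```

Under this mapping i64::MIN becomes 0, zero becomes 2^63, and i64::MAX becomes u64::MAX.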

f64-terms are transformed to u64 using a mapping that preserves order, and are then treated as u64.
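The text does not spell out the mapping. One common order-preserving technique, shown here as an assumption rather than as tantivy's exact implementation, works on the IEEE 754 bit pattern: set the sign bit of non-negative values and flip every bit of negative values. The function names are hypothetical, and NaN is not given a meaningful position.

```rust
/// Maps an f64 to a u64 such that `a < b` implies `f64_to_u64(a) < f64_to_u64(b)`
/// for non-NaN inputs. A common bit trick, assumed here for illustration.
fn f64_to_u64(val: f64) -> u64 {
    let bits = val.to_bits();
    if bits >> 63 == 0 {
        // Non-negative: set the sign bit so these sort above all negative values.
        bits ^ (1u64 << 63)
    } else {
        // Negative: flip all bits so that larger magnitudes sort lower.
        !bits
    }
}

/// Inverse of `f64_to_u64`.
fn u64_to_f64(val: u64) -> f64 {
    let bits = if val >> 63 == 1 {
        val ^ (1u64 << 63)
    } else {
        !val
    };
    f64::from_bits(bits)
}
```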

A second data structure makes it possible to access a term's TermInfo.
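The two-step lookup can be sketched as follows, assuming the second structure is addressed by term ordinal. The ToyTermDictionary type, the simplified TermInfo fields, and the plain Vec storage are all hypothetical stand-ins chosen for this example; tantivy's own types are more elaborate.

```rust
/// Simplified, hypothetical stand-in for the per-term meta-information.
#[derive(Debug, Clone, PartialEq)]
struct TermInfo {
    doc_freq: u32,
    postings_offset: usize,
}

/// Toy dictionary: sorted terms give the ordinal, a parallel Vec gives the TermInfo.
struct ToyTermDictionary {
    sorted_terms: Vec<Vec<u8>>,
    term_infos: Vec<TermInfo>,
}

impl ToyTermDictionary {
    /// Builds the dictionary from entries that are already sorted by term.
    fn from_sorted(entries: Vec<(Vec<u8>, TermInfo)>) -> Self {
        debug_assert!(entries.windows(2).all(|w| w[0].0 < w[1].0));
        let (sorted_terms, term_infos): (Vec<Vec<u8>>, Vec<TermInfo>) =
            entries.into_iter().unzip();
        Self { sorted_terms, term_infos }
    }

    /// Step 1: term -> ordinal (binary search stands in for the FST).
    fn term_ord(&self, term: &[u8]) -> Option<u64> {
        self.sorted_terms
            .binary_search_by(|probe| probe.as_slice().cmp(term))
            .ok()
            .map(|idx| idx as u64)
    }

    /// Step 2: ordinal -> TermInfo, read from the second structure.
    fn term_info_from_ord(&self, ord: u64) -> Option<&TermInfo> {
        self.term_infos.get(ord as usize)
    }

    /// Both steps combined.
    fn get(&self, term: &[u8]) -> Option<&TermInfo> {
        self.term_ord(term)
            .and_then(|ord| self.term_info_from_ord(ord))
    }
}
```

Keeping the TermInfo values out of the term mapping means the mapping only has to store one integer per term, and the meta-information can be laid out independently.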

Structs§

  • A TermDictionary wrapping either an FST based dictionary or a SSTable based one.
  • A TermDictionaryBuilder wrapping either an FST or a SSTable dictionary builder.
  • Given a list of sorted term streams, returns an iterator over sorted unique terms.
  • TermStreamer acts as a cursor over a range of terms of a segment. Terms are guaranteed to be sorted.

Type Aliases§

  • Position of the term in the sorted list of terms.