The term dictionary's main role is to associate each of the sorted Terms with a TermInfo struct that contains some meta-information about the term.
Internally, the term dictionary relies on the fst crate to store a sorted mapping that associates each term with its rank in the lexicographical order.
For instance, in a dictionary containing the sorted terms “abba”, “bjork”, “blur” and “donovan”, the TermOrdinals are respectively 0, 1, 2, and 3.
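The relationship between a sorted term list and term ordinals can be illustrated with a minimal sketch. It uses a plain sorted slice and a binary search rather than the fst crate, and the function name `term_ordinal` is a hypothetical helper, not part of tantivy's API.

```rust
/// Returns the ordinal of `term`, i.e. its rank in lexicographical order,
/// or `None` if the term is absent.
///
/// Hypothetical helper for illustration only: `sorted_terms` must already be
/// sorted and free of duplicates, which is what a term dictionary guarantees.
fn term_ordinal(sorted_terms: &[&str], term: &str) -> Option<u64> {
    sorted_terms
        .binary_search(&term) // `&str` compares byte-wise, i.e. lexicographically
        .ok()
        .map(|idx| idx as u64)
}
```

With the terms from the example above, looking up “blur” yields ordinal 2, and a term that is not in the dictionary yields `None`.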
For u64-terms, tantivy explicitly uses a BigEndian representation to ensure that the lexicographical order of the encoded bytes matches the natural order of integers.
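The effect of the byte order can be checked with the standard library alone. The sketch below is not tantivy code; `encode_u64` and `sort_by_encoded_bytes` are hypothetical names used only to show why big-endian bytes sort like the integers they encode.

```rust
/// Encodes a `u64` as 8 big-endian bytes (most significant byte first).
/// Hypothetical helper for illustration.
fn encode_u64(val: u64) -> [u8; 8] {
    val.to_be_bytes()
}

/// Sorts values by comparing their encoded bytes lexicographically.
/// With a big-endian encoding the result equals a plain numeric sort.
fn sort_by_encoded_bytes(mut vals: Vec<u64>) -> Vec<u64> {
    vals.sort_by_key(|&v| encode_u64(v));
    vals
}
```

A little-endian encoding would not have this property: 255 encodes to a byte string starting with `0xFF`, while 256 starts with `0x00`, so 256 would sort before 255.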
i64-terms are transformed to u64 using a continuous mapping val ⟶ val - i64::MIN and then treated as a u64.
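The mapping val ⟶ val - i64::MIN can be sketched as follows. The function names are hypothetical, and the subtraction is written with wrapping arithmetic because the mathematical result does not fit in an i64 for non-negative inputs; the outcome is the same as flipping the sign bit.

```rust
/// Maps an `i64` to a `u64` so that the natural order is preserved:
/// `i64::MIN` maps to 0 and `i64::MAX` maps to `u64::MAX`.
/// Hypothetical helper for illustration.
fn i64_to_u64(val: i64) -> u64 {
    // `val - i64::MIN` modulo 2^64, reinterpreted as unsigned.
    val.wrapping_sub(i64::MIN) as u64
}

/// Inverse of `i64_to_u64`.
fn u64_to_i64(val: u64) -> i64 {
    (val as i64).wrapping_add(i64::MIN)
}
```

Under this sketch, -1 maps to 2^63 - 1 and 0 maps to 2^63, so negative values sort before non-negative ones once encoded as u64.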
f64-terms are transformed to u64 using a mapping that preserves order, and are then treated as u64.
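The exact mapping is not spelled out here. One common order-preserving construction, shown below as an assumption rather than as tantivy's definitive implementation, works on the IEEE 754 bit pattern: non-negative floats get their sign bit set, and negative floats get all their bits inverted. The function names are hypothetical.

```rust
const SIGN_BIT: u64 = 1u64 << 63;

/// Maps an `f64` to a `u64` such that `a < b` implies `map(a) < map(b)`
/// for all non-NaN inputs. Sketch of one common technique.
fn f64_to_u64(val: f64) -> u64 {
    let bits = val.to_bits();
    if val.is_sign_positive() {
        // Non-negative floats: move them into the upper half of the u64 range.
        bits ^ SIGN_BIT
    } else {
        // Negative floats: a larger magnitude means a larger bit pattern,
        // so invert all bits to reverse their order.
        !bits
    }
}

/// Inverse of `f64_to_u64`.
fn u64_to_f64(val: u64) -> f64 {
    if val & SIGN_BIT != 0 {
        f64::from_bits(val ^ SIGN_BIT)
    } else {
        f64::from_bits(!val)
    }
}
```

In this sketch -0.0 maps just below 0.0, and NaN values receive some position in the u64 range even though they have no meaningful order as floats.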
A second data structure makes it possible to access the TermInfo of a term.
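The two-step lookup described above (term to ordinal, then ordinal to meta-information) can be sketched with two parallel vectors. Everything in this block is a hypothetical stand-in: `ToyTermInfo`, its `doc_freq` field, and `ToyTermDictionary` are illustrative names, and the real dictionary uses the fst crate or an SSTable rather than a sorted `Vec`.

```rust
/// Hypothetical stand-in for the per-term meta-information.
#[derive(Debug, Clone, PartialEq)]
struct ToyTermInfo {
    doc_freq: u32,
}

/// Toy dictionary: `sorted_terms[i]` is the term with ordinal `i`,
/// and `term_infos[i]` is the meta-information for that ordinal.
struct ToyTermDictionary {
    sorted_terms: Vec<String>,
    term_infos: Vec<ToyTermInfo>,
}

impl ToyTermDictionary {
    /// Step 1: resolve the term to its ordinal.
    fn term_ord(&self, term: &str) -> Option<usize> {
        self.sorted_terms
            .binary_search_by(|t| t.as_str().cmp(term))
            .ok()
    }

    /// Step 2: use the ordinal to index the second data structure.
    fn get(&self, term: &str) -> Option<&ToyTermInfo> {
        let ord = self.term_ord(term)?;
        self.term_infos.get(ord)
    }
}
```

Keeping the meta-information in a separate, ordinal-indexed structure means the sorted mapping only has to store one integer per term.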
Structs§
- A TermDictionary wrapping either an FST-based dictionary or an SSTable-based one.
- A TermDictionaryBuilder wrapping either an FST or an SSTable dictionary builder.
- Given a list of sorted term streams, returns an iterator over sorted unique terms.
- TermStreamer acts as a cursor over a range of terms of a segment. Terms are guaranteed to be sorted.
Type Aliases§
- Position of the term in the sorted list of terms.