§Collectors
Collectors define the information you want to extract from the documents matching the queries. In tantivy jargon, we call this information your search “fruit”.
Your fruit could for instance be the count of matching documents, the top-K documents, or facet counts.
At some point in your code, you will trigger the actual search operation by calling Searcher::search().
This call will look like this:
let fruit = searcher.search(&query, &collector)?;
Here the type of fruit is actually determined as an associated type of the collector (Collector::Fruit).
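To illustrate how an associated type can drive the return type of a search call, here is a minimal std-only sketch. It is a toy model, not tantivy's actual trait: the simplified `Collector` signature, the `FirstDocs` collector, and the free function `search` are all assumptions made for illustration.

```rust
// Toy model only: the names mirror the concepts above, not tantivy's real signatures.

/// A collector consumes the matching document ids and produces a "fruit".
trait Collector {
    /// The result type, chosen by each collector implementation.
    type Fruit;
    fn collect(&self, matching_docs: &[u32]) -> Self::Fruit;
}

/// Counts the matching documents; its fruit is a `usize`.
struct Count;

impl Collector for Count {
    type Fruit = usize;
    fn collect(&self, matching_docs: &[u32]) -> usize {
        matching_docs.len()
    }
}

/// Hypothetical collector keeping the first `limit` matches; its fruit is a `Vec<u32>`.
struct FirstDocs {
    limit: usize,
}

impl Collector for FirstDocs {
    type Fruit = Vec<u32>;
    fn collect(&self, matching_docs: &[u32]) -> Vec<u32> {
        matching_docs.iter().copied().take(self.limit).collect()
    }
}

/// Stand-in for a search call: the return type is whatever `C::Fruit` is.
fn search<C: Collector>(matching_docs: &[u32], collector: &C) -> C::Fruit {
    collector.collect(matching_docs)
}
```

With these definitions, `search(&docs, &Count)` has type `usize`, while `search(&docs, &FirstDocs { limit: 2 })` has type `Vec<u32>`. The caller never names the fruit type explicitly.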
§Combining several collectors
A rich search experience often requires running several collectors on your search query. For instance:
- selecting the top-K products matching your query
- counting the matching documents
- computing several facets
- computing statistics about the matching product prices
A simple and efficient way to do that is to pass your collectors as one tuple.
The resulting Fruit will then be a typed tuple with each collector’s original fruits in their respective position.
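use tantivy::collector::{Count, TopDocs};
use tantivy::{DocAddress, Score};
// The fruit is a tuple holding each collector's fruit, in the same order as the collectors.
let (doc_count, top_docs): (usize, Vec<(Score, DocAddress)>) =
    searcher.search(&query, &(Count, TopDocs::with_limit(2)))?;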
The Collector trait is implemented for tuples of up to 4 collectors.
If you have more than 4 collectors, you can either group them into tuples of tuples (a,(b,(c,d))), or rely on MultiCollector.
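The tuple mechanism, including nesting, can be sketched with a toy std-only model. This is not tantivy's implementation: the streaming `collect`/`harvest` methods, the `MaxDoc` collector, and the `search` function are assumptions chosen to show how a pair of collectors can itself be a collector.

```rust
// Toy model only: a simplified streaming collector, not tantivy's real trait.

trait Collector {
    type Fruit;
    /// Called once per matching document.
    fn collect(&mut self, doc: u32);
    /// Consumes the collector and returns its result.
    fn harvest(self) -> Self::Fruit;
}

struct Count(usize);

impl Collector for Count {
    type Fruit = usize;
    fn collect(&mut self, _doc: u32) {
        self.0 += 1;
    }
    fn harvest(self) -> usize {
        self.0
    }
}

/// Hypothetical collector tracking the largest matching document id.
struct MaxDoc(Option<u32>);

impl Collector for MaxDoc {
    type Fruit = Option<u32>;
    fn collect(&mut self, doc: u32) {
        self.0 = Some(self.0.map_or(doc, |m| m.max(doc)));
    }
    fn harvest(self) -> Option<u32> {
        self.0
    }
}

// A pair of collectors is itself a collector whose fruit is the pair of fruits.
// Every document is forwarded to both members in a single pass.
impl<A: Collector, B: Collector> Collector for (A, B) {
    type Fruit = (A::Fruit, B::Fruit);
    fn collect(&mut self, doc: u32) {
        self.0.collect(doc);
        self.1.collect(doc);
    }
    fn harvest(self) -> Self::Fruit {
        (self.0.harvest(), self.1.harvest())
    }
}

/// Stand-in for a search call over an already computed list of matches.
fn search<C: Collector>(matching_docs: &[u32], mut collector: C) -> C::Fruit {
    for &doc in matching_docs {
        collector.collect(doc);
    }
    collector.harvest()
}
```

Because the pair implementation is generic, a nested tuple such as `(Count(0), (MaxDoc(None), Count(0)))` is also a collector, and its fruit has the matching nested shape `(usize, (Option<u32>, usize))`. All dispatch is resolved at compile time, which is what makes the approach cheap.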
§Combining several collectors dynamically
Combining collectors into a tuple is a zero-cost abstraction: everything happens as if you had manually implemented a single collector combining all of their features.
Unfortunately, it requires you to know your collector types at compile time.
If, on the other hand, the collectors depend on some query parameter, you can rely on MultiCollector.
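The trade-off behind run-time combination can be sketched with boxed trait objects. The code below is a std-only toy and does not reproduce MultiCollector's API: `DynCollector`, `build_collectors`, and the use of `std::any::Any` to carry type-erased fruits are assumptions made for illustration (tantivy itself hands back typed fruits through FruitHandle and MultiFruit).

```rust
use std::any::Any;

// Toy model only: collectors chosen at run time, with type-erased fruits.

trait DynCollector {
    fn collect(&mut self, doc: u32);
    /// Returns the fruit behind `dyn Any`, since its type is unknown statically.
    fn harvest(self: Box<Self>) -> Box<dyn Any>;
}

struct Count(usize);

impl DynCollector for Count {
    fn collect(&mut self, _doc: u32) {
        self.0 += 1;
    }
    fn harvest(self: Box<Self>) -> Box<dyn Any> {
        Box::new(self.0)
    }
}

/// Hypothetical collector tracking the largest matching document id.
struct MaxDoc(Option<u32>);

impl DynCollector for MaxDoc {
    fn collect(&mut self, doc: u32) {
        self.0 = Some(self.0.map_or(doc, |m| m.max(doc)));
    }
    fn harvest(self: Box<Self>) -> Box<dyn Any> {
        Box::new(self.0)
    }
}

/// Builds the collector list from hypothetical request flags known only at run time.
fn build_collectors(want_count: bool, want_max: bool) -> Vec<Box<dyn DynCollector>> {
    let mut collectors: Vec<Box<dyn DynCollector>> = Vec::new();
    if want_count {
        collectors.push(Box::new(Count(0)));
    }
    if want_max {
        collectors.push(Box::new(MaxDoc(None)));
    }
    collectors
}

/// Feeds every document to every collector, then gathers the erased fruits in order.
fn search(docs: &[u32], mut collectors: Vec<Box<dyn DynCollector>>) -> Vec<Box<dyn Any>> {
    for &doc in docs {
        for collector in collectors.iter_mut() {
            collector.collect(doc);
        }
    }
    collectors.into_iter().map(|c| c.harvest()).collect()
}
```

Each fruit then has to be recovered with a downcast such as `fruits[0].downcast_ref::<usize>()`. The dynamic dispatch and the downcast are the cost paid for not knowing the collector types at compile time, which is why the tuple form is preferable when the types are known.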
§Implementing your own collectors
See the custom_collector example.
Structs§
- A variant of the FilterCollector specialized for bytes fast fields, i.e. it transparently wraps an inner Collector but filters documents based on the result of applying the predicate to the bytes fast field.
- Contains a feature (field, score, etc.) of a document along with the document address.
- The CountCollector collector only counts how many documents match the query.
- Collector that returns the set of DocAddress that match the query.
- Collector for faceting.
- Intermediary result of the FacetCollector that stores the facet counts for all the segments.
- The FilterCollector filters docs using a fast field value and a predicate.
- FruitHandle stores a reference to the corresponding collector inside MultiCollector.
- Histogram builds a histogram of the values of a fast field for the collected DocSet.
- MultiCollector makes it possible to collect on more than one collector. It should only be used for use cases where the Collector types are unknown at compile time.
- MultiFruit keeps Fruits from every nested Collector.
- The TopDocs collector keeps track of the top K documents sorted by their score.
- Fast TopN computation.
Traits§
- Collectors are in charge of collecting and retaining relevant information from the documents found and scored by the query.
- CustomScorer makes it possible to define any kind of score.
- A custom segment scorer makes it possible to define any kind of score for a given document belonging to a specific segment.
- Fruit is the type for the result of our collection, e.g. usize for the Count collector.
- A ScoreSegmentTweaker makes it possible to modify the default score for a given document belonging to a specific segment.
- ScoreTweaker makes it possible to tweak the score emitted by the scorer into another one.
- The SegmentCollector is the trait in charge of defining the collect operation at the scale of the segment.
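The split between a collector and its per-segment counterpart can be sketched as follows. This is a std-only toy with deliberately simplified signatures: the method names `for_segment` and `merge_fruits`, the `Child` associated type, and the representation of a segment as a list of `(doc, score)` pairs are assumptions for illustration and should be checked against the actual trait definitions.

```rust
// Toy model only: simplified signatures, no real index or segment reader.

/// Collects documents within a single segment.
trait SegmentCollector {
    type Fruit;
    fn collect(&mut self, doc: u32, score: f32);
    fn harvest(self) -> Self::Fruit;
}

/// Creates one child per segment, then merges the per-segment fruits.
trait Collector {
    type Fruit;
    type Child: SegmentCollector;
    fn for_segment(&self, segment_ord: u32) -> Self::Child;
    fn merge_fruits(
        &self,
        segment_fruits: Vec<<Self::Child as SegmentCollector>::Fruit>,
    ) -> Self::Fruit;
}

struct Count;
struct SegmentCount(usize);

impl SegmentCollector for SegmentCount {
    type Fruit = usize;
    fn collect(&mut self, _doc: u32, _score: f32) {
        self.0 += 1;
    }
    fn harvest(self) -> usize {
        self.0
    }
}

impl Collector for Count {
    type Fruit = usize;
    type Child = SegmentCount;
    fn for_segment(&self, _segment_ord: u32) -> SegmentCount {
        SegmentCount(0)
    }
    fn merge_fruits(&self, segment_fruits: Vec<usize>) -> usize {
        // The global count is the sum of the per-segment counts.
        segment_fruits.into_iter().sum()
    }
}

/// Stand-in for a search: each inner Vec holds the (doc, score) matches of one segment.
fn search<C: Collector>(segments: &[Vec<(u32, f32)>], collector: &C) -> C::Fruit {
    let mut fruits = Vec::new();
    for (ord, segment) in segments.iter().enumerate() {
        let mut child = collector.for_segment(ord as u32);
        for &(doc, score) in segment {
            child.collect(doc, score);
        }
        fruits.push(child.harvest());
    }
    collector.merge_fruits(fruits)
}
```

Keeping the per-segment state in a separate child object means each segment can be processed independently, with the merge step being the only place where results from different segments meet.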