§Collectors
Collectors define the information you want to extract from the documents matching the queries. In tantivy jargon, we call this information your search “fruit”.
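Your fruit could for instance be:
- the count of matching documents
- the top 10 documents, by relevancy or by a fast field
- facet counts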
At some point in your code, you will trigger the actual search operation by calling
Searcher::search().
This call will look like this:
let fruit = searcher.search(&query, &collector)?;
Here the type of fruit is actually determined as an associated type of the collector
(Collector::Fruit).
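To make the role of the associated type concrete, here is a minimal, std-only sketch. It is a simplified model and not tantivy's actual trait definitions: `Hit`, `TopScores`, the free function `search`, and the single-method `Collector` trait are hypothetical names chosen for illustration, and the `Count` below is a toy re-implementation.

```rust
/// Hypothetical stand-in for a matching document: (doc id, relevance score).
type Hit = (u32, f32);

/// Simplified model of a collector (NOT tantivy's real trait): the
/// associated type `Fruit` declares what a search with it returns.
trait Collector {
    type Fruit;
    fn collect(&self, hits: &[Hit]) -> Self::Fruit;
}

/// Toy collector whose fruit is the number of matching documents.
struct Count;

impl Collector for Count {
    type Fruit = usize;
    fn collect(&self, hits: &[Hit]) -> usize {
        hits.len()
    }
}

/// Toy collector whose fruit is the `limit` best-scoring hits, best first.
struct TopScores {
    limit: usize,
}

impl Collector for TopScores {
    type Fruit = Vec<Hit>;
    fn collect(&self, hits: &[Hit]) -> Vec<Hit> {
        let mut sorted = hits.to_vec();
        // Sort by descending score.
        sorted.sort_by(|a, b| b.1.total_cmp(&a.1));
        sorted.truncate(self.limit);
        sorted
    }
}

/// The return type is not fixed: it is whatever `C::Fruit` is.
fn search<C: Collector>(hits: &[Hit], collector: &C) -> C::Fruit {
    collector.collect(hits)
}
```

With these definitions, `search(&hits, &Count)` has type `usize`, while `search(&hits, &TopScores { limit: 2 })` has type `Vec<Hit>`, without any annotation at the call site.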
§Combining several collectors
A rich search experience often requires running several collectors on your search query. For instance:
- selecting the top-K products matching your query
- counting the matching documents
- computing several facets
- computing statistics about the matching product prices
A simple and efficient way to do that is to pass your collectors as one tuple.
The resulting Fruit will then be a typed tuple with each collector’s original fruits
in their respective position.
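use tantivy::collector::{Count, TopDocs};
use tantivy::{DocAddress, Score};

// The fruit is a tuple: Count's fruit first, then TopDocs' fruit.
let (doc_count, top_docs): (usize, Vec<(Score, DocAddress)>) =
    searcher.search(&query, &(Count, TopDocs::with_limit(2)))?;
The Collector trait is implemented for tuples of up to 4 collectors.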
If you have more than 4 collectors, you can either group them into
tuples of tuples (a,(b,(c,d))), or rely on MultiCollector.
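The way nested tuples produce nested fruits can be sketched with a std-only toy model. This is an illustration under assumptions, not tantivy's implementation: the single-method `Collector` trait, `Hit`, `MaxScore`, and `search` are hypothetical, and unlike a real combined collector, each toy collector scans the hits separately.

```rust
/// Hypothetical stand-in for a matching document: (doc id, relevance score).
type Hit = (u32, f32);

/// Simplified collector model (NOT tantivy's real trait).
trait Collector {
    type Fruit;
    fn collect(&self, hits: &[Hit]) -> Self::Fruit;
}

/// Toy collector: number of matching documents.
struct Count;

impl Collector for Count {
    type Fruit = usize;
    fn collect(&self, hits: &[Hit]) -> usize {
        hits.len()
    }
}

/// Toy collector: highest score among the hits, if any.
struct MaxScore;

impl Collector for MaxScore {
    type Fruit = Option<f32>;
    fn collect(&self, hits: &[Hit]) -> Option<f32> {
        hits.iter()
            .map(|&(_, score)| score)
            .fold(None, |best: Option<f32>, s| Some(best.map_or(s, |b| b.max(s))))
    }
}

/// A pair of collectors is itself a collector; its fruit is the pair of
/// fruits, each in the position of the collector that produced it.
impl<A: Collector, B: Collector> Collector for (A, B) {
    type Fruit = (A::Fruit, B::Fruit);
    fn collect(&self, hits: &[Hit]) -> Self::Fruit {
        (self.0.collect(hits), self.1.collect(hits))
    }
}

fn search<C: Collector>(hits: &[Hit], collector: &C) -> C::Fruit {
    collector.collect(hits)
}
```

Because a pair is itself a collector, `(Count, (MaxScore, Count))` is accepted even though only pairs are implemented here, and its fruit has the matching shape `(usize, (Option<f32>, usize))`.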
§Combining several collectors dynamically
Combining collectors into a tuple is a zero-cost abstraction: everything happens as if you had manually implemented a single collector combining all of their features.
Unfortunately, it requires you to know your collector types at compile time.
If, on the other hand, the collectors depend on some query parameter,
you can rely on MultiCollector.
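The general idea behind choosing collectors at run time can be sketched with a std-only toy model: collectors are stored as trait objects, their fruits are type-erased, and a typed handle recovers each fruit afterwards. Everything below (`DynamicCollectors`, `Handle`, `ErasedCollector`, `MaxDocId`, the single-method `Collector` trait) is a hypothetical name for illustration; tantivy's own MultiCollector may be implemented differently.

```rust
use std::any::Any;
use std::marker::PhantomData;

/// Hypothetical stand-in for a matching document: (doc id, relevance score).
type Hit = (u32, f32);

/// Simplified collector model (NOT tantivy's real trait).
trait Collector {
    type Fruit: 'static;
    fn collect(&self, hits: &[Hit]) -> Self::Fruit;
}

/// Object-safe view of a collector: the fruit is type-erased so that
/// collectors of different types can live in one `Vec`.
trait ErasedCollector {
    fn collect_erased(&self, hits: &[Hit]) -> Box<dyn Any>;
}

impl<C: Collector> ErasedCollector for C {
    fn collect_erased(&self, hits: &[Hit]) -> Box<dyn Any> {
        Box::new(self.collect(hits))
    }
}

/// Remembers the position and the fruit type of a registered collector.
struct Handle<F> {
    index: usize,
    _fruit: PhantomData<F>,
}

impl<F: 'static> Handle<F> {
    /// Takes this collector's fruit out of the results and restores its type.
    fn extract(self, fruits: &mut [Option<Box<dyn Any>>]) -> F {
        let boxed = fruits[self.index].take().expect("fruit already extracted");
        *boxed
            .downcast::<F>()
            .unwrap_or_else(|_| panic!("handle used with fruits from another search"))
    }
}

/// A set of collectors that is only known at run time.
#[derive(Default)]
struct DynamicCollectors {
    collectors: Vec<Box<dyn ErasedCollector>>,
}

impl DynamicCollectors {
    fn add<C: Collector + 'static>(&mut self, collector: C) -> Handle<C::Fruit> {
        self.collectors.push(Box::new(collector));
        Handle { index: self.collectors.len() - 1, _fruit: PhantomData }
    }

    fn search(&self, hits: &[Hit]) -> Vec<Option<Box<dyn Any>>> {
        self.collectors.iter().map(|c| Some(c.collect_erased(hits))).collect()
    }
}

/// Toy collector: number of matching documents.
struct Count;
impl Collector for Count {
    type Fruit = usize;
    fn collect(&self, hits: &[Hit]) -> usize { hits.len() }
}

/// Toy collector: largest matching doc id, if any.
struct MaxDocId;
impl Collector for MaxDocId {
    type Fruit = Option<u32>;
    fn collect(&self, hits: &[Hit]) -> Option<u32> {
        hits.iter().map(|&(doc, _)| doc).max()
    }
}
```

A caller would register collectors conditionally, for example `if want_max { Some(collectors.add(MaxDocId)) } else { None }`, run `collectors.search(&hits)`, and then call `extract` on each handle it kept. The boxing and downcasting are the run-time cost that the tuple approach avoids.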
§Implementing your own collectors
See the custom_collector example.
Structs§
- BytesFilterCollector: A variant of the FilterCollector specialized for bytes fast fields.
- ComparableDoc: Contains a feature (field, score, etc.) of a document along with the document address.
- Count: The CountCollector collector only counts how many documents match the query.
- DocSetCollector: Collector that returns the set of DocAddress values matching the query.
- FacetCollector: Collector for faceting.
- FacetCounts: Intermediary result of the FacetCollector that stores the facet counts for all the segments.
- FilterCollector: The FilterCollector filters docs using a fast field value and a predicate.
- FruitHandle: FruitHandle stores a reference to the corresponding collector inside MultiCollector.
- HistogramCollector: Builds a histogram of the values of a fast field for the collected DocSet.
- MultiCollector: MultiCollector makes it possible to collect on more than one collector. It should only be used when the Collector types are unknown at compile time.
- MultiFruit: MultiFruit keeps the Fruits from every nested Collector.
- TopDocs: The TopDocs collector keeps track of the top K documents sorted by their score.
- TopNComputer: Fast TopN computation.
Traits§
- Collector: Collectors are in charge of collecting and retaining relevant information from the documents found and scored by the query.
- CustomScorer: CustomScorer makes it possible to define any kind of score.
- CustomSegmentScorer: A custom segment scorer makes it possible to define any kind of score for a given document belonging to a specific segment.
- Fruit: Fruit is the type for the result of our collection, e.g. usize for the Count collector.
- ScoreSegmentTweaker: A ScoreSegmentTweaker makes it possible to modify the default score for a given document belonging to a specific segment.
- ScoreTweaker: ScoreTweaker makes it possible to tweak the score emitted by the scorer into another one.
- SegmentCollector: The SegmentCollector is the trait in charge of defining the collect operation at the scale of the segment.