[][src]Module tique::conditional_collector

Top-K Collectors, with ordering and condition support.

This is a collection of collectors that provide top docs rank functionality very similar to tantivy::TopDocs, with added support for declaring the ordering (ascending or descending) and collection-time conditions.

let collector =
    TopCollector::<Score, Descending, _>::new(10, condition_for_segment);

NOTE: Usually the score type (Score above, a f32) is inferred so there's no need to specify it.

Ordering Support

When constructing a top collector you must specify how to actually order the items: in ascending or descending order.

You simply choose Ascending or Descending and let the compiler know:

let collector =
    TopCollector::<Score, Ascending, _>::new(limit, condition_for_segment);

Condition Support

A "condition" is simply a way to tell the collector that a document is a valid candidate to the top. It behaves just like a query filter would, but does not limit the candidates before the collector sees them.

This is a valid condition that accepts everything:

let condition_for_segment = true;

Generally speaking, a condition is anything that implements the ConditionForSegment trait and you can use closures as a shortcut:

let condition_for_segment = move |reader: &SegmentReader| {
    // Fetch useful stuff from the `reader`, then:
    move |segment_id, doc_id, score, is_ascending| {
        // Express whatever logic you want
        true
    }
};

let collector =
    TopCollector::<Score, Ascending, _>::new(limit, condition_for_segment);

Aside: Pagination with Constant Memory

If you've been using tantivy for a while, you're probably used to seeing tuples like (T, DocAddress) (T is usually tantivy::Score, but changes if you customize the score somehow).

You can also use these tuples as a condition and they act like a cursor for pagination, so when you do something like:

let limit = 10;
let condition_for_segment = (0.42, DocAddress(0, 1));
let collector =
    TopCollector::<_, Descending, _>::new(limit, condition_for_segment);

What you are asking for is the top limit documents that appear after (because you chose the Descending order) documents that scored 0.42 at whatever query you throw at it (and in case multiple docs score the name, the collector knows to break even by the DocAddress).

The results that you get after your search will contain more (T, DocAddress) tuples you can use to keep pagination going without ever having to increase limit.

Check examples/conditional_collector_tutorial.rs for more details.

Structs

Ascending

Marker to create a TopCollector in ascending order

CollectionResult

The basic result type, containing the top selected items and additional metadata.

Descending

Marker to create a TopCollector in descending order

TopCollector

A TopCollector like tantivy's, with added support for ordering and conditions.

Traits

CheckCondition

The condition that gets checked before collection. In order for a document to appear in the results it must first return true for check.

ConditionForSegment

A trait that allows defining arbitrary conditions to be checked before considering a matching document for inclusion in the top results.