1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
/// A TopKSelector could be either online or offline.
/// Here we store distances and we report the k smallest distances.
/// This means that for some metrics (such as dot product) you want to
/// store negative distances to get the closest vector.
///
/// We report the distances and ids of the top k items
/// The id is just the timestamp (from 0 to number of items inserted so far)
/// of the item. You must be able to remap those timestamps to the original ids
/// if needed.
/// We adopt this strategy for two reasons:
/// - The value of k is small so we can afford the remapping;
/// - The number of distances to be checked is large so we want to save the
/// time needed to create a vector of original ids and copy them.
///
/// An online selector, such as an implementation of a Heap,
/// updates the data structure after every `push`.
/// The current top k values can be reported efficiently after every push.
///
/// An offline selector (e.g., quickselect) may just collect every pushed
/// distances without doing anything. Then it can spend spends more time
/// (e.g., linear time) in computing the `topk` distances.
///
/// An online selector may be faster if a lot of distance are processed
/// at once.
pub use HeapFaiss;