A lightweight research library for managing Approximate Nearest Neighbor search datasets.
It offers the following features:
- Storage of dense, sparse, and dense-sparse vector sets;
- Storage of query sets with ground-truth (i.e., exact nearest neighbors) according to different metrics;
- Basic functionality such as computing recall given a retrieved set; and,
- Serialization into and deserialization from HDF5 file format.
Find out more on crates.io.
Example usage
It is straightforward to read an ANN dataset. The code snippet below gives a concise example.
use ;
// Load the dataset.
let dataset = read
.expect;
// Get a reference to the data points.
let data_points: & = dataset.get_data_points;
// Get the test query set.
let test: & = dataset.get_test_query_set
.expect;
let test_queries: & = test.get_points;
let gt: &GroundTruth = test.get_ground_truth
.expect;
// Compute recall, assuming `retrieved_set` is &[Vec<usize>],
// where the `i`-th entry is a list of ids of retrieved points
// for the `i`-th query.
let recall = gt.mean_recall;