mih-rs
Rust implementation of multi-index hashing (MIH) for neighbor searches on 64-bit codes in the Hamming space, described in the paper
Norouzi, Punjani, and Fleet, Fast exact search in Hamming space with multi-index hashing, IEEE TPAMI, 36(6):1107– 1119, 2014.
Features
-
Two types of neighbor searches: mih-rs provides the two search operations:
- Range search finds neighbor codes whose Hamming distances to a given code are within a radius.
- Top-K search finds the top-K codes that are closest to a given code.
-
Fast and memory-efficient implementation: The data structure is built on sparse hash tables, following the original implementation.
-
Parameter free: mih-rs automatically sets an optimal parameter of MIH depending on a given database (although you can also set this manually).
Example
use Index;
Benchmark
timeperf_topk.rs
offers the benchmark of top-K search for MIH and LinearSearch algorithms.
The following table shows the result of average search times in milliseconds per query, in the settings:
- Database: N random codes from a uniform distribution.
- Query set: 100 random codes from a uniform distribution.
- Machine: Mac Pro (Late 2013) of 6 core Intel Xeon E5 @3.5 GHz with 32 GB of RAM.
Algorithm | N=10,000 | N=100,000 | N=1,000,000 | N=10,000,000 |
---|---|---|---|---|
MIH (K=1) | 0.1 | 0.4 | 1.7 | 6.5 |
MIH (K=10) | 0.2 | 0.8 | 4.0 | 13.6 |
MIH (K=100) | 0.4 | 1.6 | 7.9 | 32.0 |
LinearSearch | 0.4 | 4.9 | 56.1 | 701.4 |
Licensing
This library is free software provided under MIT.