Index

Struct Index 

Source
pub struct Index<T: CodeInt> { /* private fields */ }
Expand description

Multi-index hashing for neighbor searches on binary codes in the Hamming space.

Index implements the multi-index hashing proposed by Norouzi et al., which provides fast and memory-efficient neighbor searches on binary codes in the Hamming space.

The following two search options are supported:

  • Range search finds neighbor codes whose Hamming distances to a given code are within a radius.
  • Top-K search finds the top-K codes that are closest to a given code.

§Arguments

Index takes a generic type parameter of CodeInt to represent a binary code.

§Examples

use mih_rs::Index;

// Database of codes
let codes: Vec<u64> = vec![
    0b1111111111111111111111011111111111111111111111111011101111111111, // #zeros = 3
    0b1111111111111111111111111111111101111111111011111111111111111111, // #zeros = 2
    0b1111111011011101111111111111111101111111111111111111111111111111, // #zeros = 4
    0b1111111111111101111111111111111111111000111111111110001111111110, // #zeros = 8
    0b1101111111111111111111111111111111111111111111111111111111111111, // #zeros = 1
    0b1111111111111111101111111011111111111111111101001110111111111111, // #zeros = 6
    0b1111111111111111111111111111111111101111111111111111011111111111, // #zeros = 2
    0b1110110101011011011111111111111101111111111111111000011111111111, // #zeros = 11
];

// Query code
let qcode: u64 = 0b1111111111111111111111111111111111111111111111111111111111111111; // #zeros = 0

// Construct the index
let index = Index::new(codes).unwrap();

// Find the ids of neighbor codes whose Hamming distances are within 2
let mut searcher = index.range_searcher();
let answers = searcher.run(qcode, 2);
assert_eq!(answers, vec![1, 4, 6]);

// Find the ids of the top-4 nearest neighbor codes
let mut searcher = index.topk_searcher();
let answers = searcher.run(qcode, 4);
assert_eq!(answers, vec![4, 1, 6, 0]);

// Serialization/Deserialization
let mut data = vec![];
index.serialize_into(&mut data).unwrap();
let other = Index::<u64>::deserialize_from(&data[..]).unwrap();
assert_eq!(index, other);

Implementations§

Source§

impl<T: CodeInt> Index<T>

Source

pub fn new(codes: Vec<T>) -> Result<Self>

Builds an index from binary codes. The number of blocks for multi-index is set to the optimal one estimated from the number of input codes. The input database codes is stolen, but the reference can be gotten with Index::codes().

§Arguments
  • codes: Vector of binary codes of type CodeInt.
§Errors

anyhow::Error will be returned when

  • the codes is empty, or
  • the number of entries in codes is more than u32::max_value().
Source

pub fn with_blocks(codes: Vec<T>, num_blocks: usize) -> Result<Self>

Builds an index from binary codes with a manually specified number of blocks. The input database codes is stolen, but the reference can be gotten with Index::codes().

§Arguments
  • codes: Vector of binary codes of type CodeInt.
  • num_blocks: The number of blocks for multi-index.
§Errors

anyhow::Error will be returned when

  • the codes is empty,
  • the number of entries in codes is more than u32::max_value(), or
  • num_blocks is less than 2 or more than the number of dimensions in a binary code.
Source

pub fn range_searcher(&self) -> RangeSearcher<'_, T>

Returns a searcher RangeSearcher to find neighbor codes whose Hamming distances to a query code are within a query radius.

§Examples
use mih_rs::Index;

let codes: Vec<u64> = vec![
    0b1111111111111111111111011111111111111111111111111011101111111111, // #zeros = 3
    0b1111111111111111111111111111111101111111111011111111111111111111, // #zeros = 2
    0b1111111011011101111111111111111101111111111111111111111111111111, // #zeros = 4
    0b1111111111111101111111111111111111111000111111111110001111111110, // #zeros = 8
    0b1101111111111111111111111111111111111111111111111111111111111111, // #zeros = 1
    0b1111111111111111101111111011111111111111111101001110111111111111, // #zeros = 6
    0b1111111111111111111111111111111111101111111111111111011111111111, // #zeros = 2
    0b1110110101011011011111111111111101111111111111111000011111111111, // #zeros = 11
];

let index = Index::new(codes).unwrap();
let mut searcher = index.range_searcher();

let qcode: u64 = 0b1111111111111111111111111111111111111111111111111111111111111111; // #zeros = 0
let answers = searcher.run(qcode, 2);
assert_eq!(answers, vec![1, 4, 6]);
Source

pub fn topk_searcher(&self) -> TopkSearcher<'_, T>

Returns a searcher TopkSearcher to find top-K codes that are closest to a query code.

§Examples
use mih_rs::Index;

let codes: Vec<u64> = vec![
    0b1111111111111111111111011111111111111111111111111011101111111111, // #zeros = 3
    0b1111111111111111111111111111111101111111111011111111111111111111, // #zeros = 2
    0b1111111011011101111111111111111101111111111111111111111111111111, // #zeros = 4
    0b1111111111111101111111111111111111111000111111111110001111111110, // #zeros = 8
    0b1101111111111111111111111111111111111111111111111111111111111111, // #zeros = 1
    0b1111111111111111101111111011111111111111111101001110111111111111, // #zeros = 6
    0b1111111111111111111111111111111111101111111111111111011111111111, // #zeros = 2
    0b1110110101011011011111111111111101111111111111111000011111111111, // #zeros = 11
];

let index = Index::new(codes).unwrap();
let mut searcher = index.topk_searcher();

let qcode: u64 = 0b1111111111111111111111111111111111111111111111111111111111111111; // #zeros = 0
let answers = searcher.run(qcode, 4);
assert_eq!(answers, vec![4, 1, 6, 0]);
Source

pub fn codes(&self) -> &[T]

Gets the reference of the input database.

§Examples
use mih_rs::Index;

let codes: Vec<u64> = vec![
    0b1111111111111111111111011111111111111111111111111011101111111111, // #zeros = 3
    0b1111111111111111111111111111111101111111111011111111111111111111, // #zeros = 2
    0b1111111011011101111111111111111101111111111111111111111111111111, // #zeros = 4
    0b1111111111111101111111111111111111111000111111111110001111111110, // #zeros = 8
    0b1101111111111111111111111111111111111111111111111111111111111111, // #zeros = 1
    0b1111111111111111101111111011111111111111111101001110111111111111, // #zeros = 6
    0b1111111111111111111111111111111111101111111111111111011111111111, // #zeros = 2
    0b1110110101011011011111111111111101111111111111111000011111111111, // #zeros = 11
];

let index = Index::new(codes.clone()).unwrap();
assert_eq!(codes, index.codes());
Source

pub fn num_blocks(&self) -> usize

Gets the number of defined blocks in multi-index.

Source

pub fn serialize_into<W: Write>(&self, writer: W) -> Result<()>

Serializes the index into the file.

Source

pub fn deserialize_from<R: Read>(reader: R) -> Result<Self>

Deserializes the index from the file.

Trait Implementations§

Source§

impl<T: Clone + CodeInt> Clone for Index<T>

Source§

fn clone(&self) -> Index<T>

Returns a duplicate of the value. Read more
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl<T: Debug + CodeInt> Debug for Index<T>

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl<T: PartialEq + CodeInt> PartialEq for Index<T>

Source§

fn eq(&self, other: &Index<T>) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
Source§

impl<T: Eq + CodeInt> Eq for Index<T>

Source§

impl<T: CodeInt> StructuralPartialEq for Index<T>

Auto Trait Implementations§

§

impl<T> Freeze for Index<T>

§

impl<T> RefUnwindSafe for Index<T>
where T: RefUnwindSafe,

§

impl<T> Send for Index<T>
where T: Send,

§

impl<T> Sync for Index<T>
where T: Sync,

§

impl<T> Unpin for Index<T>
where T: Unpin,

§

impl<T> UnwindSafe for Index<T>
where T: UnwindSafe,

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.