Trait finalfusion::subword::NGramsIndices

pub trait NGramsIndices {
    fn ngrams_indices<I>(
        &self,
        min_n: usize,
        max_n: usize,
        indexer: &I
    ) -> Vec<(&str, u64)>
    where
        I: Indexer;
}

A trait for getting the n-grams of a string together with their indices.

N-gram indexing assigns an identifier to each subword (n-gram) of a string. A subword is indexed by computing its hash and then mapping the hash to a bucket.

Since a non-perfect hash function is used, multiple subwords can map to the same index.
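The hash-then-bucket step can be illustrated with a minimal sketch. It is not finalfusion's implementation: `bucket_index` is a hypothetical helper, `DefaultHasher` from the standard library stands in for whatever hash function a real indexer uses, and keeping the low bits of the hash is only one possible way to map a hash to a bucket.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Map an n-gram to one of `2^buckets_exp` buckets.
///
/// Hypothetical helper, not part of finalfusion. `DefaultHasher` is an
/// assumption made for illustration; a real indexer may hash differently.
fn bucket_index(ngram: &str, buckets_exp: u32) -> u64 {
    assert!(buckets_exp <= 64, "the bucket exponent cannot exceed 64");

    // Step 1: hash the n-gram to a 64-bit value.
    let mut hasher = DefaultHasher::new();
    ngram.hash(&mut hasher);
    let hash = hasher.finish();

    // Step 2: keep the lowest `buckets_exp` bits of the hash.
    // `1u64 << 64` would overflow, so the full-width case is handled apart.
    let mask = if buckets_exp == 64 {
        u64::MAX
    } else {
        (1u64 << buckets_exp) - 1
    };
    hash & mask
}
```

With `buckets_exp = 1` there are only two buckets, so any three distinct n-grams must produce at least one shared index. This is the collision behaviour described above, in miniature.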

Required methods

fn ngrams_indices<I>(
    &self,
    min_n: usize,
    max_n: usize,
    indexer: &I
) -> Vec<(&str, u64)> where
    I: Indexer

Return the n-grams of a string together with their indices.

The n-grams have lengths in the range [min_n, max_n] and are mapped by the indexer to indices into 2^buckets_exp buckets.

The largest possible bucket exponent is 64.
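To make the shape of the return value concrete, here is a rough sketch of enumerating the n-grams whose length lies in [min_n, max_n] and pairing each with an index. It is written against the standard library only and is not the crate's implementation: `ngrams_with_indices` and the `index_of` closure are hypothetical names, the closure stands in for the `Indexer` argument, lengths are assumed to be counted in characters, and the output order is a choice made by this sketch.

```rust
/// Collect every n-gram of `word` whose length in characters lies in
/// `[min_n, max_n]`, paired with the index that `index_of` assigns to it.
///
/// Hypothetical stand-in for illustration: `index_of` plays the role of
/// the indexer and is not finalfusion's `Indexer` trait.
fn ngrams_with_indices<F>(
    word: &str,
    min_n: usize,
    max_n: usize,
    index_of: F,
) -> Vec<(&str, u64)>
where
    F: Fn(&str) -> u64,
{
    assert!(min_n >= 1 && min_n <= max_n, "need 1 <= min_n <= max_n");

    // Byte offsets of every character boundary, including the end of the
    // string, so that slicing never splits a multi-byte character.
    let bounds: Vec<usize> = word
        .char_indices()
        .map(|(i, _)| i)
        .chain(std::iter::once(word.len()))
        .collect();
    let n_chars = bounds.len() - 1;

    let mut out = Vec::new();
    for start in 0..n_chars {
        for n in min_n..=max_n {
            // Longer n-grams from this start would run past the end.
            if start + n > n_chars {
                break;
            }
            let ngram = &word[bounds[start]..bounds[start + n]];
            out.push((ngram, index_of(ngram)));
        }
    }
    out
}
```

For example, calling `ngrams_with_indices("abc", 2, 3, |g: &str| g.len() as u64)` walks the start positions left to right and yields the pairs for `"ab"`, `"abc"`, and `"bc"`. The toy closure simply returns the byte length, where a real indexer would return a bucket index.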


Implementations on Foreign Types

impl NGramsIndices for str


Implementors
