BlockHashPositionArrayImplUnchecked

Trait BlockHashPositionArrayImplUnchecked 

Source
pub unsafe trait BlockHashPositionArrayImplUnchecked: BlockHashPositionArrayData {
    // Required methods
    unsafe fn is_equiv_unchecked(&self, other: &[u8]) -> bool;
    unsafe fn has_common_substring_unchecked(&self, other: &[u8]) -> bool;
    unsafe fn edit_distance_unchecked(&self, other: &[u8]) -> u32;
    unsafe fn score_strings_raw_unchecked(&self, other: &[u8]) -> u32;
    unsafe fn score_strings_unchecked(
        &self,
        other: &[u8],
        log_block_size: u8,
    ) -> u32;
}
Available on crate feature unchecked only.
Expand description

The implementation of the block hash position array (unchecked; immutable).

This trait is automatically implemented for all types implementing the BlockHashPositionArrayData trait.

§Safety

This trait contains unsafe methods and need to comply with constraints described in each method.

§Compatibility Notice

This trait is going to be completely private on the next major release. If you need to experiment with internal hashing functions, just vendor the source code for your needs.

Required Methods§

Source

unsafe fn is_equiv_unchecked(&self, other: &[u8]) -> bool

Compare whether two block hashes are equivalent.

§Safety

If they are not satisfied, it will return a meaningless value.

Source

unsafe fn has_common_substring_unchecked(&self, other: &[u8]) -> bool

Checks whether two given strings have common substrings with a length of block_hash::MIN_LCS_FOR_COMPARISON.

§Algorithm Implemented (By Default)

This function implements an extension to the algorithm described in [Baeza-Yates and Gonnet, 1992] (doi:10.1145/135239.135243) to find a fixed-length common substring. The original algorithm is the Backward Shift-Add algorithm for the k-LCF problem as described in [Hirvola, 2016] (which searches the longest common substring with up to k errors under the Hamming distance).

This algorithm is modified:

  • to search only perfect matches (up to 0 errors; k = 0),
  • to return as soon as possible if it finds a common substring and
  • to share the position array representation with BlockHashPositionArrayImpl::edit_distance() by reversing a pattern from the original paper (the original algorithm used reverse “incidence matrix”).
§Safety

If the condition above is not satisfied, it will return a meaningless value.

Source

unsafe fn edit_distance_unchecked(&self, other: &[u8]) -> u32

Computes the edit distance between two given strings.

Specifically, it computes the Longest Common Subsequence (LCS) distance, allowing character addition and deletion as two primitive operations (in cost 1).

§Algorithm Implemented (By Default)

This method implements the longest common subsequence length (LCS length or LLCS) algorithm as in [Hyyrö, 2004] and then converts the LCS length to the LCS distance using the basic relation between them.

§Safety

If they are not satisfied, it will return a meaningless distance.

Source

unsafe fn score_strings_raw_unchecked(&self, other: &[u8]) -> u32

Compare two block hashes and computes the similarity score without capping.

This method does not “cap” the score to prevent exaggerating the matches that are not meaningful enough, making this function block size independent.

§Safety

If they are not satisfied, it will return a meaningless score.

Source

unsafe fn score_strings_unchecked( &self, other: &[u8], log_block_size: u8, ) -> u32

Compare two block hashes and computes the similarity score.

This method “caps” the score to prevent exaggerating the matches that are not meaningful enough. This behavior depends on the block size (score cap gets higher when block size gets higher).

§Safety

If they are not satisfied, it will return a meaningless score.

Implementors§

Source§

impl<T> BlockHashPositionArrayImplUnchecked for T
where T: BlockHashPositionArrayImplInternal,