pub const MIN_LCS_FOR_COMPARISON: usize = 7;

The minimum length of a common substring required to compute the edit distance between two block hashes.

To score the similarity between two block hashes with the same block size, ssdeep first requires that the two block hashes be similar enough. Specifically, ssdeep requires that they have a common substring of length MIN_LCS_FOR_COMPARISON or longer, which reduces the possibility of false matches by chance.

If no such common substring is found, the low-level block hash comparison method returns zero (meaning not similar).

Finding such common substrings is a special case of finding a longest common substring (LCS).

For instance, these two strings:

  • +r/kcOpEYXB+0ZJ
  • 7ocOpEYXB+0ZF29

have a common substring cOpEYXB+0Z (length 10), which is long enough (≧ MIN_LCS_FOR_COMPARISON) for the edit distance, and from it the similarity score, to be computed.
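
The check described above can be sketched in a few lines. This is an illustrative sketch only, not the crate's actual implementation: the helper names `has_common_substring` and `longest_common_substring_len` are hypothetical, the constant is redeclared locally with the value shown in the declaration, and the naive window-set and dynamic-programming approaches were chosen for clarity rather than speed.

```rust
use std::collections::HashSet;

// Local copy of the constant for this sketch (value taken from the declaration above).
const MIN_LCS_FOR_COMPARISON: usize = 7;

/// Hypothetical helper: returns true if `a` and `b` share a substring
/// of at least `min_len` bytes.
fn has_common_substring(a: &[u8], b: &[u8], min_len: usize) -> bool {
    if min_len == 0 {
        // The empty substring is common to any two strings.
        return true;
    }
    if a.len() < min_len || b.len() < min_len {
        return false;
    }
    // Any common substring of length >= min_len contains a common
    // substring of length exactly min_len, so fixed-size windows suffice.
    let windows: HashSet<&[u8]> = a.windows(min_len).collect();
    b.windows(min_len).any(|w| windows.contains(w))
}

/// Hypothetical helper: length of the longest common substring,
/// computed with a simple O(n * m) dynamic-programming table.
fn longest_common_substring_len(a: &[u8], b: &[u8]) -> usize {
    let mut prev = vec![0usize; b.len() + 1];
    let mut best = 0;
    for &x in a {
        let mut curr = vec![0usize; b.len() + 1];
        for (j, &y) in b.iter().enumerate() {
            if x == y {
                // Extend the common run ending at the previous byte pair.
                curr[j + 1] = prev[j] + 1;
                best = best.max(curr[j + 1]);
            }
        }
        prev = curr;
    }
    best
}
```

For the two example strings, `has_common_substring(b"+r/kcOpEYXB+0ZJ", b"7ocOpEYXB+0ZF29", MIN_LCS_FOR_COMPARISON)` returns `true`, and `longest_common_substring_len` reports 10. The threshold check only needs windows of exactly MIN_LCS_FOR_COMPARISON bytes, which is why it is a special case of the general LCS problem.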

See also: “Fuzzy Hash Comparison” section of FuzzyHashData