#[repr(align(8))]
pub struct FuzzyHashDualData<const S1: usize, const S2: usize, const C1: usize, const C2: usize>
where
    BlockHashSize<S1>: ConstrainedBlockHashSize,
    BlockHashSize<S2>: ConstrainedBlockHashSize,
    BlockHashSizes<S1, S2>: ConstrainedBlockHashSizes,
    ReconstructionBlockSize<S1, C1>: ConstrainedReconstructionBlockSize,
    ReconstructionBlockSize<S2, C2>: ConstrainedReconstructionBlockSize,
{ /* private fields */ }
An efficient compressed fuzzy hash representation, containing both normalized and raw block hash contents.
This struct contains a normalized fuzzy hash object and opaque data to perform “reverse normalization” afterwards.
In the current design, it achieves a compression ratio of about 5 / 8 (compared to storing two fuzzy hash objects, one normalized and one raw).
With this, you can compare many fuzzy hashes efficiently while preserving the original string representation without requiring too much memory.
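To make the idea of "reverse normalization" concrete, the following is a toy model only. It assumes that normalization shortens every run of more than three identical characters to three, which matches the raw and normalized strings in the Examples section. The `ToyDual` type, the `dropped` field and the helper functions are hypothetical and exist only for illustration; the crate keeps its reverse normalization data opaque and very likely encodes it differently.

```rust
/// Longest run of one character that normalization keeps.
const MAX_RUN: usize = 3;

/// Toy stand-in for a dual hash (hypothetical, not the crate's layout).
#[derive(Debug, PartialEq)]
struct ToyDual {
    normalized: String,
    /// For each run of exactly `MAX_RUN` characters in `normalized`,
    /// the number of characters that were dropped from the raw run.
    dropped: Vec<usize>,
}

/// Splits a string into (character, run length) pairs.
fn runs(s: &str) -> Vec<(char, usize)> {
    let mut out: Vec<(char, usize)> = Vec::new();
    for c in s.chars() {
        if let Some((last, n)) = out.last_mut() {
            if *last == c {
                *n += 1;
                continue;
            }
        }
        out.push((c, 1));
    }
    out
}

/// Normalizes `raw` and records what is needed to undo the normalization.
fn compress(raw: &str) -> ToyDual {
    let mut normalized = String::new();
    let mut dropped = Vec::new();
    for (c, n) in runs(raw) {
        normalized.extend(std::iter::repeat(c).take(n.min(MAX_RUN)));
        if n >= MAX_RUN {
            dropped.push(n - MAX_RUN);
        }
    }
    ToyDual { normalized, dropped }
}

/// Rebuilds the raw string from the normalized string and the recorded data.
fn restore(dual: &ToyDual) -> String {
    let mut raw = String::new();
    let mut dropped = dual.dropped.iter().copied();
    for (c, n) in runs(&dual.normalized) {
        let extra = if n == MAX_RUN { dropped.next().unwrap_or(0) } else { 0 };
        raw.extend(std::iter::repeat(c).take(n + extra));
    }
    raw
}

fn main() {
    // Block hash 1 of the raw hash string used in the Examples section.
    let raw = "+ySwl5P+C5IxJ845HYV5sxOH/cccccccei";
    let dual = compress(raw);
    println!("{} + {:?}", dual.normalized, dual.dropped);
    assert_eq!(restore(&dual), raw);
}
```

The point of the sketch is that the normalized string stays directly usable for comparison, while only a small amount of extra data is needed to recover the raw string.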
Some methods accept AsRef to the normalized FuzzyHashData.
In such cases, it is possible to pass this object directly (e.g. FuzzyHashCompareTarget::compare()).
§Ordering
Sorting objects of this type will result in the following order.
- Two equivalent FuzzyHashDualData objects are considered equal (and the underlying sorting algorithm decides the ordering of equivalent objects).
- Two different FuzzyHashDualData objects with different normalized FuzzyHashData objects (inside) will be ordered in the same order as the underlying FuzzyHashData.
- Two different FuzzyHashDualData objects with the same normalized FuzzyHashData objects (inside) will be ordered in an implementation-defined manner.
The implementation-defined order is not currently guaranteed to be stable. For instance, different versions of this crate may order them differently. However, it is guaranteed to be deterministic, so you can expect the same order within the same version of this crate.
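If an order that does not change between crate versions is needed, one option is to supply an explicit tie-breaker instead of relying on the implementation-defined part. The sketch below is a toy model: `ToyDual`, `toy` and `cmp_version_independent` are hypothetical names, and plain string comparison stands in for the real ordering of FuzzyHashData, which this page does not specify.

```rust
use std::cmp::Ordering;

/// Toy stand-in holding both string forms (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
struct ToyDual {
    normalized: String,
    raw: String,
}

/// Convenience constructor for the toy type.
fn toy(normalized: &str, raw: &str) -> ToyDual {
    ToyDual { normalized: normalized.to_string(), raw: raw.to_string() }
}

/// Orders by the normalized form first; ties are broken by the raw string,
/// so the result does not depend on any implementation-defined order.
fn cmp_version_independent(a: &ToyDual, b: &ToyDual) -> Ordering {
    a.normalized
        .cmp(&b.normalized)
        .then_with(|| a.raw.cmp(&b.raw))
}

fn main() {
    let mut hashes = vec![
        toy("3:aaa:b", "3:aaaaa:b"),
        toy("3:aaa:b", "3:aaaa:b"),
        toy("3:a:b", "3:a:b"),
    ];
    hashes.sort_by(cmp_version_independent);
    for h in &hashes {
        println!("{} (raw: {})", h.normalized, h.raw);
    }
}
```

With the real types, the same idea could be applied by comparing the normalized objects first and the raw string forms second, at the cost of decompressing the raw form during sorting.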
§Safety
Generic parameters of this type should not be considered stable, because some of them exist only to work around current restrictions of Rust’s const generics (these will be resolved once the generic_const_exprs feature is stabilized).
Do not use FuzzyHashDualData directly.
Instead, use instantiations of this generic type:
- DualFuzzyHash (sufficient in most cases)
- LongDualFuzzyHash
§Examples
// Requires either the "alloc" feature or std environment on your crate
// to use the `to_string()` method (default enabled).
use ssdeep::{DualFuzzyHash, FuzzyHash, RawFuzzyHash};
let hash_str_raw = "12288:+ySwl5P+C5IxJ845HYV5sxOH/cccccccei:+Klhav84a5sxJ";
let hash_str_norm = "12288:+ySwl5P+C5IxJ845HYV5sxOH/cccei:+Klhav84a5sxJ";
let dual_hash: DualFuzzyHash = str::parse(hash_str_raw).unwrap();
// This object can effectively contain both
// normalized and raw fuzzy hash representations.
assert_eq!(dual_hash.to_raw_form().to_string(), hash_str_raw);
assert_eq!(dual_hash.to_normalized().to_string(), hash_str_norm);
let another_hash: FuzzyHash = str::parse(
    "12288:+yUwldx+C5IxJ845HYV5sxOH/cccccccex:+glvav84a5sxK"
).unwrap();
// You can directly compare a DualFuzzyHash against a FuzzyHash.
//
// This is almost as fast as comparison between two FuzzyHash objects
// because the native representation inside DualFuzzyHash
// is a FuzzyHash object.
assert_eq!(another_hash.compare(dual_hash), 88);
// But DualFuzzyHash is not a drop-in replacement for FuzzyHash.
// You need to use `as_normalized()` to compare a FuzzyHash against
// a DualFuzzyHash (direct comparison may be provided in a later version).
assert_eq!(dual_hash.as_normalized().compare(&another_hash), 88);
Implementations§
impl<const S1: usize, const S2: usize, const C1: usize, const C2: usize> FuzzyHashDualData<S1, S2, C1, C2>where
BlockHashSize<S1>: ConstrainedBlockHashSize,
BlockHashSize<S2>: ConstrainedBlockHashSize,
BlockHashSizes<S1, S2>: ConstrainedBlockHashSizes,
ReconstructionBlockSize<S1, C1>: ConstrainedReconstructionBlockSize,
ReconstructionBlockSize<S2, C2>: ConstrainedReconstructionBlockSize,
pub const MAX_BLOCK_HASH_SIZE_1: usize = FuzzyHashData<S1, S2, true>::MAX_BLOCK_HASH_SIZE_1
The maximum size of the block hash 1.
This value is the same as the underlying fuzzy hash type.
pub const MAX_BLOCK_HASH_SIZE_2: usize = FuzzyHashData<S1, S2, true>::MAX_BLOCK_HASH_SIZE_2
The maximum size of the block hash 2.
This value is the same as the underlying fuzzy hash type.
pub const IS_NORMALIZED_FORM: bool = false
Denotes whether the fuzzy type only contains a normalized form.
In this type, it is always false.
pub const IS_LONG_FORM: bool = FuzzyHashData<S1, S2, true>::IS_LONG_FORM
Denotes whether the fuzzy type can contain a non-truncated fuzzy hash.
This value is the same as the underlying fuzzy hash type.
pub const MAX_LEN_IN_STR: usize = FuzzyHashData<S1, S2, true>::MAX_LEN_IN_STR
The maximum length in the string representation.
This value is the same as the underlying fuzzy hash type.
pub fn new() -> Self
Creates a new fuzzy hash object with empty contents.
This is equivalent to the fuzzy hash string "3::".
pub fn init_from_raw_form(&mut self, hash: &FuzzyHashData<S1, S2, false>)
Initialize the object from a raw fuzzy hash.
pub unsafe fn new_from_internals_near_raw_unchecked(
    log_block_size: u8,
    block_hash_1: &[u8],
    block_hash_2: &[u8],
) -> Self
Available on crate feature unchecked only.
Creates a new fuzzy hash object with internal contents (with raw block size).
§Safety
- block_hash_1 and block_hash_2 must have valid lengths.
- Elements of block_hash_1 and block_hash_2 must consist of valid Base64 indices.
- log_block_size must hold a valid base-2 logarithm form of a block size.
If they are not satisfied, the resulting object will be corrupted.
pub fn new_from_internals_near_raw(
    log_block_size: u8,
    block_hash_1: &[u8],
    block_hash_2: &[u8],
) -> Self
Creates a new fuzzy hash object with internal contents (with raw block size).
Because this function assumes that you know the fuzzy hash internals, it panics when you fail to satisfy fuzzy hash constraints.
§Usage Constraints
- block_hash_1 and block_hash_2 must have valid lengths.
- Elements of block_hash_1 and block_hash_2 must consist of valid Base64 indices.
- log_block_size must hold a valid base-2 logarithm form of a block size.
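The three constraints above can be expressed as a single predicate. The following is a minimal sketch, not the crate's own check: the `Limits` struct, the values in `ASSUMED_LIMITS` and the function name are assumptions made for illustration, and the real limits depend on the instantiation (see MAX_BLOCK_HASH_SIZE_1 and MAX_BLOCK_HASH_SIZE_2). The only fact relied on is that a Base64 index is a value below 64.

```rust
/// Number of distinct Base64 indices.
const BASE64_INDEX_COUNT: u8 = 64;

/// Hypothetical limits; the real values come from the chosen instantiation.
struct Limits {
    max_len_1: usize,
    max_len_2: usize,
    num_log_block_sizes: u8,
}

/// Assumed example values, used only to make this sketch runnable.
const ASSUMED_LIMITS: Limits = Limits {
    max_len_1: 64,
    max_len_2: 32,
    num_log_block_sizes: 31,
};

/// Returns true if all three usage constraints hold under `limits`.
fn satisfies_constraints(
    limits: &Limits,
    log_block_size: u8,
    block_hash_1: &[u8],
    block_hash_2: &[u8],
) -> bool {
    let valid_indices = |bh: &[u8]| bh.iter().all(|&i| i < BASE64_INDEX_COUNT);
    log_block_size < limits.num_log_block_sizes
        && block_hash_1.len() <= limits.max_len_1
        && block_hash_2.len() <= limits.max_len_2
        && valid_indices(block_hash_1)
        && valid_indices(block_hash_2)
}

fn main() {
    // Index 64 is outside the Base64 range, so the second call fails.
    println!("{}", satisfies_constraints(&ASSUMED_LIMITS, 0, &[0, 63], &[]));
    println!("{}", satisfies_constraints(&ASSUMED_LIMITS, 0, &[64], &[]));
}
```

Running a check of this kind before calling the panicking constructor lets a caller reject bad input with an error of its own instead of a panic.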
pub unsafe fn new_from_internals_unchecked(
    block_size: u32,
    block_hash_1: &[u8],
    block_hash_2: &[u8],
) -> Self
Available on crate feature unchecked only.
Creates a new fuzzy hash object with internal contents.
§Safety
- block_hash_1 and block_hash_2 must have valid lengths.
- Elements of block_hash_1 and block_hash_2 must consist of valid Base64 indices.
- block_size must hold a valid block size.
If they are not satisfied, the resulting object will be corrupted.
pub fn new_from_internals(
    block_size: u32,
    block_hash_1: &[u8],
    block_hash_2: &[u8],
) -> Self
Creates a new fuzzy hash object with internal contents.
Because this function assumes that you know the fuzzy hash internals, it panics when you fail to satisfy fuzzy hash constraints.
§Usage Constraints
- block_hash_1 and block_hash_2 must have valid lengths.
- Elements of block_hash_1 and block_hash_2 must consist of valid Base64 indices.
- block_size must hold a valid block size.
pub fn log_block_size(&self) -> u8
The base-2 logarithm form of the block size.
See also: “Block Size” section of FuzzyHashData
pub fn block_size(&self) -> u32
The block size of the fuzzy hash.
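The relation between log_block_size() and block_size() can be sketched as follows, assuming the usual ssdeep convention that the block size is 3 multiplied by a power of two (the empty hash "3::" corresponds to exponent 0, and 12288 in the Examples section to exponent 12). The function names and the MIN_BLOCK_SIZE constant are hypothetical, and the sketch only guards against u32 overflow; it does not restate the crate's own upper bound on valid exponents.

```rust
/// Smallest block size, assumed here to be 3 as in "3::".
const MIN_BLOCK_SIZE: u32 = 3;

/// Converts a base-2 logarithm form into a block size, if it fits in u32.
fn block_size_from_log(log_block_size: u8) -> Option<u32> {
    1u32.checked_shl(u32::from(log_block_size))
        .and_then(|factor| MIN_BLOCK_SIZE.checked_mul(factor))
}

/// Converts a block size back into its base-2 logarithm form.
/// Returns None when the value is not 3 times a power of two.
fn log_from_block_size(block_size: u32) -> Option<u8> {
    if block_size % MIN_BLOCK_SIZE != 0 {
        return None;
    }
    let factor = block_size / MIN_BLOCK_SIZE;
    if factor.is_power_of_two() {
        Some(factor.trailing_zeros() as u8)
    } else {
        None
    }
}

fn main() {
    println!("{:?}", block_size_from_log(12));
    println!("{:?}", log_from_block_size(12288));
}
```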
pub fn as_normalized(&self) -> &FuzzyHashData<S1, S2, true>
A reference to the normalized fuzzy hash.
Note that this operation is cheap because this type contains the normalized object directly.
pub fn from_raw_form(hash: &FuzzyHashData<S1, S2, false>) -> Self
Constructs an object from a raw fuzzy hash.
pub fn from_normalized(hash: &FuzzyHashData<S1, S2, true>) -> Self
Constructs an object from a normalized fuzzy hash.
pub fn into_mut_raw_form(&self, hash: &mut FuzzyHashData<S1, S2, false>)
Decompresses a raw variant of the fuzzy hash and stores into an existing object.
pub fn to_raw_form(&self) -> FuzzyHashData<S1, S2, false>
Decompresses and generates a raw variant of the fuzzy hash.
Based on the normalized fuzzy hash representation and the “reverse normalization” data, this method reconstructs the original raw variant of the fuzzy hash.
pub fn to_normalized(&self) -> FuzzyHashData<S1, S2, true>
Returns a clone of the normalized fuzzy hash.
Where possible, as_normalized() or AsRef::as_ref() should be used instead.
pub fn to_normalized_string(&self) -> String
Available on crate feature alloc only.
Converts the fuzzy hash to the string (normalized form).
This method returns the string corresponding to the normalized form.
pub fn to_raw_form_string(&self) -> String
Available on crate feature alloc only.
Converts the fuzzy hash to the string (raw form).
This method returns the string corresponding to the raw (non-normalized) form.
pub fn from_bytes_with_last_index(
    str: &[u8],
    index: &mut usize,
) -> Result<Self, ParseError>
Parses a fuzzy hash from the given bytes (a slice of u8) of a string representation.
If the parser succeeds, it also updates the index argument to the first index not used to construct the fuzzy hash, which is either the end of the string or the position of the ',' character that separates the fuzzy hash from the file name field.
If the parser fails, index is not updated.
The behavior of this method is affected by the strict-parser feature.
For more information, see The Strict Parser.
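The index contract described above can be illustrated with a deliberately simplified stand-in parser. This is not the crate's parser: `Parts` and `parse_with_last_index` are hypothetical, no Base64 or length validation is performed, and the strict-parser behavior is not modeled. It only shows how the index lands on the ',' (or the end of the input) on success and stays untouched on failure.

```rust
/// Simplified stand-in for the parsed fields (hypothetical type).
#[derive(Debug, PartialEq)]
struct Parts<'a> {
    block_size: u32,
    block_hash_1: &'a [u8],
    block_hash_2: &'a [u8],
}

/// Parses `block_size:block_hash_1:block_hash_2[,file name]`.
/// On success, `index` is set to the first byte not consumed by the hash.
/// On failure, `index` is left as it was.
fn parse_with_last_index<'a>(s: &'a [u8], index: &mut usize) -> Option<Parts<'a>> {
    // The hash part ends at the first ',' or at the end of the input.
    let end = s.iter().position(|&b| b == b',').unwrap_or(s.len());
    let mut fields = s[..end].splitn(3, |&b| b == b':');
    let block_size = std::str::from_utf8(fields.next()?).ok()?.parse().ok()?;
    let block_hash_1 = fields.next()?;
    let block_hash_2 = fields.next()?;
    // Only update the index once every field has been read successfully.
    *index = end;
    Some(Parts { block_size, block_hash_1, block_hash_2 })
}

fn main() {
    let input = b"3:abc:de,\"a.txt\"";
    let mut index = 0;
    let parts = parse_with_last_index(input, &mut index);
    // The remainder starting at `index` is the ',' and the file name field.
    println!("{:?}, rest = {:?}", parts, &input[index..]);
}
```

A caller that reads lines of the form `hash,"file name"` can continue parsing the file name from `index` without scanning the hash part again.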
pub fn from_bytes(str: &[u8]) -> Result<Self, ParseError>
Parses a fuzzy hash from the given bytes (a slice of u8) of a string representation.
The behavior of this method is affected by the strict-parser feature.
For more information, see The Strict Parser.
pub fn normalize_in_place(&mut self)
Normalize the fuzzy hash in place.
After calling this method, self will be normalized.
In this implementation, it clears all “reverse normalization” data.
See also: “Normalization” section of FuzzyHashData
pub fn is_normalized(&self) -> bool
Returns whether the dual fuzzy hash is normalized.
pub fn is_valid(&self) -> bool
Performs full validity checking of the internal structure.
The primary purpose of this method is debugging, and it should always return true unless:
- There is a bug in this crate, corrupting this structure,
- Memory corruption has occurred somewhere else, or
- An unsafe function to construct this object is misused.
Because of its purpose, this method is not designed to be fast.
Note that this method is only relevant to users when the unchecked feature is enabled, but it is public regardless of the enabled features because it is not unsafe or unchecked in any way.
§Safety: No Panic Guarantee
This method is guaranteed to be panic-free as long as the underlying memory region corresponding to self is sound.
In other words, it will not panic by itself, whatever data this object contains.
Trait Implementations§
impl<const S1: usize, const S2: usize, const C1: usize, const C2: usize> AsRef<FuzzyHashData<S1, S2, true>> for FuzzyHashDualData<S1, S2, C1, C2>where
BlockHashSize<S1>: ConstrainedBlockHashSize,
BlockHashSize<S2>: ConstrainedBlockHashSize,
BlockHashSizes<S1, S2>: ConstrainedBlockHashSizes,
ReconstructionBlockSize<S1, C1>: ConstrainedReconstructionBlockSize,
ReconstructionBlockSize<S2, C2>: ConstrainedReconstructionBlockSize,
fn as_ref(&self) -> &FuzzyHashData<S1, S2, true>
impl<const S1: usize, const S2: usize, const C1: usize, const C2: usize> Clone for FuzzyHashDualData<S1, S2, C1, C2>where
BlockHashSize<S1>: ConstrainedBlockHashSize,
BlockHashSize<S2>: ConstrainedBlockHashSize,
BlockHashSizes<S1, S2>: ConstrainedBlockHashSizes,
ReconstructionBlockSize<S1, C1>: ConstrainedReconstructionBlockSize,
ReconstructionBlockSize<S2, C2>: ConstrainedReconstructionBlockSize,
fn clone(&self) -> FuzzyHashDualData<S1, S2, C1, C2>
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.