pub struct DedupConfig {
pub database_path: PathBuf,
pub perceptual_threshold: f64,
pub ssim_threshold: f64,
pub histogram_threshold: f64,
pub feature_match_threshold: usize,
pub audio_threshold: f64,
pub metadata_threshold: f64,
pub parallel: bool,
pub sample_frames: usize,
pub chunk_size: usize,
pub thumbnail_resolution: usize,
pub bloom_prescreen: bool,
pub bloom_capacity: usize,
pub bloom_fpr: f32,
pub use_lsh: bool,
pub lsh_num_tables: usize,
pub lsh_bits_per_table: usize,
pub lsh_seed: u64,
}
Configuration for deduplication.
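
As a rough illustration of how the fields fit together, the sketch below fills in every field and checks the ranges stated in the field documentation. The struct is a local mirror of the declaration above so that the snippet compiles on its own; in real code the type would be imported from the crate. `example_config` and `validate` are hypothetical helpers, and every value except `thumbnail_resolution = 8` is an illustrative assumption, not a documented default.

```rust
use std::path::PathBuf;

// Local mirror of the documented struct so the sketch is self-contained.
#[derive(Clone, Debug)]
pub struct DedupConfig {
    pub database_path: PathBuf,
    pub perceptual_threshold: f64,
    pub ssim_threshold: f64,
    pub histogram_threshold: f64,
    pub feature_match_threshold: usize,
    pub audio_threshold: f64,
    pub metadata_threshold: f64,
    pub parallel: bool,
    pub sample_frames: usize,
    pub chunk_size: usize,
    pub thumbnail_resolution: usize,
    pub bloom_prescreen: bool,
    pub bloom_capacity: usize,
    pub bloom_fpr: f32,
    pub use_lsh: bool,
    pub lsh_num_tables: usize,
    pub lsh_bits_per_table: usize,
    pub lsh_seed: u64,
}

// Hypothetical starting values; only thumbnail_resolution = 8 is documented.
pub fn example_config() -> DedupConfig {
    DedupConfig {
        database_path: PathBuf::from("dedup.db"),
        perceptual_threshold: 0.95,
        ssim_threshold: 0.90,
        histogram_threshold: 0.90,
        feature_match_threshold: 50,
        audio_threshold: 0.90,
        metadata_threshold: 0.80,
        parallel: true,
        sample_frames: 10,
        chunk_size: 4096,
        thumbnail_resolution: 8,
        bloom_prescreen: true,
        bloom_capacity: 100_000,
        bloom_fpr: 0.01,
        use_lsh: false,
        lsh_num_tables: 8,
        lsh_bits_per_table: 16,
        lsh_seed: 42,
    }
}

// Hypothetical helper: enforces only the constraints the field docs state.
pub fn validate(cfg: &DedupConfig) -> Result<(), String> {
    let thresholds = [
        ("perceptual_threshold", cfg.perceptual_threshold),
        ("ssim_threshold", cfg.ssim_threshold),
        ("histogram_threshold", cfg.histogram_threshold),
        ("audio_threshold", cfg.audio_threshold),
        ("metadata_threshold", cfg.metadata_threshold),
    ];
    for (name, value) in thresholds {
        // NaN fails the range check as well, which is the desired outcome.
        if !(0.0..=1.0).contains(&value) {
            return Err(format!("{name} must be in 0.0..=1.0, got {value}"));
        }
    }
    if cfg.thumbnail_resolution < 4 {
        return Err(format!(
            "thumbnail_resolution must be >= 4, got {}",
            cfg.thumbnail_resolution
        ));
    }
    Ok(())
}
```

Calling `validate(&example_config())` returns `Ok(())`. Setting `thumbnail_resolution` to 3, or any similarity threshold outside 0.0-1.0, produces an `Err` naming the offending field.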
Fields

database_path: PathBuf
Database path

perceptual_threshold: f64
Perceptual hash similarity threshold (0.0-1.0)

ssim_threshold: f64
SSIM similarity threshold (0.0-1.0)

histogram_threshold: f64
Histogram similarity threshold (0.0-1.0)

feature_match_threshold: usize
Feature match threshold (minimum number of matches)

audio_threshold: f64
Audio fingerprint similarity threshold (0.0-1.0)

metadata_threshold: f64
Metadata similarity threshold (0.0-1.0)

parallel: bool
Enable parallel processing

sample_frames: usize
Number of frames to sample for video analysis

chunk_size: usize
Chunk size for content-based chunking (bytes)

thumbnail_resolution: usize
Thumbnail resolution for SSIM duplicate detection.
Specifies both the width and the height of the grayscale thumbnail used for SSIM comparison. Must be >= 4. Default is 8 (i.e. 8x8 = 64 pixels). Higher values give more accurate SSIM at the cost of storage and CPU.
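
To make the storage trade-off concrete, here is a minimal sketch of how the pixel count grows with the resolution. `thumbnail_pixels` is a hypothetical helper, not part of the crate; it applies only the documented minimum of 4.

```rust
// Pixel count of a square grayscale thumbnail with the given side length.
// Returns None when the side is below the documented minimum of 4,
// or when the multiplication would overflow usize.
pub fn thumbnail_pixels(resolution: usize) -> Option<usize> {
    if resolution < 4 {
        return None;
    }
    resolution.checked_mul(resolution)
}
```

Doubling the resolution from 8 to 16 quadruples the pixel count from 64 to 256, so both per-item storage and per-comparison work grow quadratically with this setting.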

bloom_prescreen: bool
Enable bloom filter pre-screening before expensive perceptual comparisons.
When enabled, a bloom filter is used to quickly reject items whose content hash is already known to be unique, avoiding expensive pairwise perceptual hash comparisons.

bloom_capacity: usize
Expected capacity for the bloom filter pre-screener.

bloom_fpr: f32
False positive rate for the bloom filter pre-screener.
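
How `bloom_capacity` and `bloom_fpr` translate into memory is not stated here. As a rough guide, the sketch below applies the textbook Bloom filter sizing formulas, m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. `bloom_parameters` is a hypothetical helper, and the crate's pre-screener may size its filter differently.

```rust
// Textbook Bloom filter sizing (an assumption about the pre-screener,
// not its confirmed implementation).
// Returns (number of bits, number of hash functions), or None for
// a zero capacity or a false positive rate outside (0, 1).
pub fn bloom_parameters(capacity: usize, fpr: f32) -> Option<(usize, u32)> {
    let p = f64::from(fpr);
    if capacity == 0 || !(p > 0.0 && p < 1.0) {
        return None;
    }
    let n = capacity as f64;
    let ln2 = std::f64::consts::LN_2;
    // m = -n * ln(p) / (ln 2)^2, rounded up to a whole bit.
    let bits = (-n * p.ln() / (ln2 * ln2)).ceil();
    // k = (m / n) * ln 2, rounded to the nearest integer, at least 1.
    let hashes = ((bits / n) * ln2).round().max(1.0);
    Some((bits as usize, hashes as u32))
}
```

Under these formulas, a capacity of 1,000,000 items at a 1% false positive rate needs roughly 9.6 million bits (about 1.2 MB) and 7 hash functions. Lowering `bloom_fpr` increases both figures.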

use_lsh: bool
Use LSH acceleration for perceptual hash deduplication.
When enabled, find_perceptual_duplicates() uses a BitLshIndex instead of O(n^2) pairwise comparison. This provides sub-quadratic performance for large libraries at the cost of slightly reduced recall.

lsh_num_tables: usize
Number of LSH hash tables (more = better recall, more memory).

lsh_bits_per_table: usize
Bits sampled per LSH table (fewer = more candidates = better recall).

lsh_seed: u64
Deterministic seed for LSH projections.
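
The recall trade-offs described for `lsh_num_tables` and `lsh_bits_per_table` follow from the standard bit-sampling LSH analysis. The sketch below assumes each table samples its bit positions uniformly and independently (with replacement) from the hash; the internals of BitLshIndex are not shown here, so treat this as an approximation. `candidate_probability` is a hypothetical helper, and the 64-bit hash width used in the example is an assumption.

```rust
// Probability that two hashes at the given Hamming distance share a bucket
// in at least one table, under idealized bit-sampling LSH:
//   per-bit agreement  s = 1 - distance / hash_bits
//   per-table match    s^bits_per_table
//   any-table match    1 - (1 - s^bits_per_table)^num_tables
pub fn candidate_probability(
    distance: u32,
    hash_bits: u32,
    bits_per_table: usize,
    num_tables: usize,
) -> f64 {
    assert!(hash_bits > 0 && distance <= hash_bits);
    let per_bit = 1.0 - f64::from(distance) / f64::from(hash_bits);
    let per_table = per_bit.powi(bits_per_table as i32);
    1.0 - (1.0 - per_table).powi(num_tables as i32)
}
```

For two 64-bit hashes that differ in 8 bits, raising the table count from 8 to 16 (at 16 bits per table) increases the chance that the pair becomes a candidate, as does lowering the bits per table from 16 to 8 (at 8 tables). Identical hashes always collide.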

Trait Implementations

impl Clone for DedupConfig
fn clone(&self) -> DedupConfig
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.

impl Debug for DedupConfig

Auto Trait Implementations
impl Freeze for DedupConfig
impl RefUnwindSafe for DedupConfig
impl Send for DedupConfig
impl Sync for DedupConfig
impl Unpin for DedupConfig
impl UnsafeUnpin for DedupConfig
impl UnwindSafe for DedupConfig

Blanket Implementations

impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for T where T: Clone

impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.