pub struct StreamDedupIndex { /* private fields */ }
Index of stream fingerprints for near-duplicate detection.
Files are added by name via ingest; duplicates are
retrieved via find_duplicates.
Implementations
impl StreamDedupIndex
pub fn new(config: StreamChunkerConfig) -> Self
Create a new, empty index with the given chunker config.
pub fn ingest<R: Read>(
    &mut self,
    name: &str,
    reader: R,
) -> Result<StreamFingerprint>
Ingest a stream and store its fingerprint under name.
Returns the computed StreamFingerprint.
Errors
Propagates io::Error from reader.
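The internals of ingest are not shown on this page, so the following is only a minimal sketch of how a reader could be turned into a fingerprint while propagating io::Error. It assumes the fingerprint is a set of chunk hashes and uses fixed-size chunks for simplicity; the real chunking is governed by StreamChunkerConfig and may differ. The names fingerprint_stream, ChunkHashes, and chunk_size are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::io::{self, Read};

/// Hypothetical stand-in for a stream fingerprint: the set of chunk hashes.
pub type ChunkHashes = HashSet<u64>;

/// Reads `reader` to the end in fixed-size chunks and hashes each chunk.
/// Fixed-size chunking is an assumption made for this sketch only.
pub fn fingerprint_stream<R: Read>(mut reader: R, chunk_size: usize) -> io::Result<ChunkHashes> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let mut hashes = ChunkHashes::new();
    let mut buf = vec![0u8; chunk_size];
    loop {
        // Fill the buffer as far as possible; a short fill means end of stream.
        let mut filled = 0;
        while filled < chunk_size {
            let n = match reader.read(&mut buf[filled..]) {
                Ok(n) => n,
                // Retry reads that were merely interrupted.
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                // Any other I/O error is returned to the caller unchanged.
                Err(e) => return Err(e),
            };
            if n == 0 {
                break;
            }
            filled += n;
        }
        if filled == 0 {
            break;
        }
        let mut hasher = DefaultHasher::new();
        buf[..filled].hash(&mut hasher);
        hashes.insert(hasher.finish());
        if filled < chunk_size {
            break;
        }
    }
    Ok(hashes)
}
```

Because identical chunks hash to the same value, a stream that repeats a chunk contributes that chunk to the set only once.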
pub fn jaccard_similarity(
    &self,
    a: &StreamFingerprint,
    b: &StreamFingerprint,
) -> f64
Compute the Jaccard similarity between two fingerprints.
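For reference, the Jaccard similarity of two sets A and B is |A ∩ B| / |A ∪ B|. The layout of StreamFingerprint is private, so the sketch below assumes a fingerprint can be viewed as a set of u64 chunk hashes; the free function jaccard is a hypothetical illustration, not this crate's implementation.

```rust
use std::collections::HashSet;

/// Jaccard similarity |A ∩ B| / |A ∪ B| of two sets of chunk hashes.
/// Treating two empty sets as identical (1.0) is an assumption of this sketch.
pub fn jaccard(a: &HashSet<u64>, b: &HashSet<u64>) -> f64 {
    if a.is_empty() && b.is_empty() {
        return 1.0;
    }
    let intersection = a.intersection(b).count();
    // |A ∪ B| = |A| + |B| - |A ∩ B|
    let union = a.len() + b.len() - intersection;
    intersection as f64 / union as f64
}
```

For example, the sets {1, 2, 3} and {2, 3, 4} share two of four distinct elements, giving a similarity of 0.5.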
pub fn find_duplicates(&self, threshold: f64) -> Vec<(String, String, f64)>
Find all pairs of indexed entries whose Jaccard similarity exceeds
threshold.
Returns a list of (name_a, name_b, similarity) tuples sorted by
descending similarity.
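One way such a search could work is a plain pairwise scan, sketched below. This is an illustration under assumptions, not the crate's code: the index is modeled as a BTreeMap from name to a set of chunk hashes, "exceeds" is read as strictly greater than threshold, and both helper functions are hypothetical.

```rust
use std::collections::{BTreeMap, HashSet};

/// Jaccard similarity of two hash sets (two empty sets count as identical).
fn jaccard(a: &HashSet<u64>, b: &HashSet<u64>) -> f64 {
    if a.is_empty() && b.is_empty() {
        return 1.0;
    }
    let intersection = a.intersection(b).count();
    intersection as f64 / (a.len() + b.len() - intersection) as f64
}

/// Compares every unordered pair of entries once and keeps those whose
/// similarity is strictly greater than `threshold`, highest similarity first.
pub fn find_duplicates(
    index: &BTreeMap<String, HashSet<u64>>,
    threshold: f64,
) -> Vec<(String, String, f64)> {
    let entries: Vec<(&String, &HashSet<u64>)> = index.iter().collect();
    let mut pairs = Vec::new();
    for i in 0..entries.len() {
        for j in (i + 1)..entries.len() {
            let (name_a, fp_a) = entries[i];
            let (name_b, fp_b) = entries[j];
            let sim = jaccard(fp_a, fp_b);
            if sim > threshold {
                pairs.push((name_a.clone(), name_b.clone(), sim));
            }
        }
    }
    // Sort by descending similarity; the sort is stable, so ties keep scan order.
    pairs.sort_by(|x, y| y.2.total_cmp(&x.2));
    pairs
}
```

The scan is quadratic in the number of entries, which is acceptable for small indexes; larger ones would typically need a candidate-pruning step such as MinHash with locality-sensitive hashing.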
pub fn get(&self, name: &str) -> Option<&StreamFingerprint>
Retrieve the fingerprint stored under name, if any.
Trait Implementations
Auto Trait Implementations
impl Freeze for StreamDedupIndex
impl RefUnwindSafe for StreamDedupIndex
impl Send for StreamDedupIndex
impl Sync for StreamDedupIndex
impl Unpin for StreamDedupIndex
impl UnsafeUnpin for StreamDedupIndex
impl UnwindSafe for StreamDedupIndex
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true.
Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self> otherwise.