pub struct FileGroup<F> {
    pub file_len: FileLen,
    pub file_hash: FileHash,
    pub files: Vec<F>,
}
A group of files that have something in common, e.g. same size or same hash
Fields§
file_len: FileLen
Length of each file
file_hash: FileHash
Hash of a part or the whole of the file
files: Vec<F>
Group of files with the same length and hash
Implementations§
impl<F> FileGroup<F>
pub fn file_count(&self) -> usize
Returns the count of all files in the group
pub fn total_size(&self) -> FileLen
Returns the total size of all files in the group
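As a rough illustration of these two accessors, the sketch below uses a simplified stand-in for the struct. Replacing `FileLen` and `FileHash` with plain integers, and computing the total as length times count, are assumptions that follow from every file in a group having the same length; `sample_group` is a hypothetical helper.

```rust
// Simplified stand-in for the documented struct. Using `u64` and `u128` in place
// of `FileLen` and `FileHash` is an assumption made for this sketch.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
struct FileGroup<F> {
    file_len: u64,
    file_hash: u128,
    files: Vec<F>,
}

impl<F> FileGroup<F> {
    /// Number of files in the group.
    fn file_count(&self) -> usize {
        self.files.len()
    }

    /// Every file has the same length, so the total is length times count.
    fn total_size(&self) -> u64 {
        self.file_len * self.file_count() as u64
    }
}

/// Builds a group of three hypothetical 1024-byte files.
fn sample_group() -> FileGroup<&'static str> {
    FileGroup {
        file_len: 1024,
        file_hash: 0xabcd,
        files: vec!["a.txt", "b.txt", "c.txt"],
    }
}
```

For the sample group, `file_count()` is 3 and `total_size()` is 3072.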
pub fn map<R>(self, f: impl Fn(F) -> R) -> FileGroup<R>
Maps the list of files in the group. Preserves the group file len and hash.
pub fn filter_map<R>(self, f: impl Fn(F) -> Option<R>) -> FileGroup<R>
Transforms files into a different type, filtering out files that cannot be transformed.
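The sketch below shows how `map` and `filter_map` could behave on a simplified stand-in type. The integer fields and the `parse_and_double` helper are assumptions for illustration only; the real implementation may differ.

```rust
// Simplified stand-in; integer fields replace `FileLen` and `FileHash` (assumption).
#[derive(Debug, PartialEq)]
struct FileGroup<F> {
    file_len: u64,
    file_hash: u128,
    files: Vec<F>,
}

impl<F> FileGroup<F> {
    /// Converts every file, keeping the group's length and hash.
    fn map<R>(self, f: impl Fn(F) -> R) -> FileGroup<R> {
        FileGroup {
            file_len: self.file_len,
            file_hash: self.file_hash,
            files: self.files.into_iter().map(f).collect(),
        }
    }

    /// Converts every file and drops those for which `f` returns `None`.
    fn filter_map<R>(self, f: impl Fn(F) -> Option<R>) -> FileGroup<R> {
        FileGroup {
            file_len: self.file_len,
            file_hash: self.file_hash,
            files: self.files.into_iter().filter_map(f).collect(),
        }
    }
}

/// Hypothetical helper: keeps entries that parse as numbers, then doubles them.
fn parse_and_double(group: FileGroup<&str>) -> FileGroup<u32> {
    group
        .filter_map(|s| s.parse::<u32>().ok())
        .map(|n| n * 2)
}
```

Both methods consume the group, which lets them move the files into the new group without cloning.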
pub fn try_map_all<R: Debug, E: Debug>(self, f: impl Fn(F) -> Result<R, E>) -> Result<FileGroup<R>, Vec<E>>
Tries to map each file with the given fallible function. Does not stop processing on the first failure. If mapping any of the files fails, returns a vector of the errors.
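A minimal sketch of this "collect every error" behaviour, written against a simplified stand-in type. The integer fields and the `parse_all` helper are assumptions, and the `Debug` bounds of the real signature are omitted here.

```rust
// Simplified stand-in; integer fields replace `FileLen` and `FileHash` (assumption).
#[derive(Debug, PartialEq)]
struct FileGroup<F> {
    file_len: u64,
    file_hash: u128,
    files: Vec<F>,
}

impl<F> FileGroup<F> {
    /// Applies `f` to every file. All files are processed even after a failure,
    /// so the caller sees every error at once.
    fn try_map_all<R, E>(self, f: impl Fn(F) -> Result<R, E>) -> Result<FileGroup<R>, Vec<E>> {
        let mut files = Vec::new();
        let mut errors = Vec::new();
        for file in self.files {
            match f(file) {
                Ok(mapped) => files.push(mapped),
                Err(e) => errors.push(e),
            }
        }
        if errors.is_empty() {
            Ok(FileGroup {
                file_len: self.file_len,
                file_hash: self.file_hash,
                files,
            })
        } else {
            Err(errors)
        }
    }
}

/// Hypothetical helper: parses every entry as `u32`.
fn parse_all(inputs: Vec<&str>) -> Result<FileGroup<u32>, Vec<std::num::ParseIntError>> {
    let group = FileGroup { file_len: 0, file_hash: 0, files: inputs };
    group.try_map_all(|s| s.parse::<u32>())
}
```

With inputs `["1", "x", "y"]` the helper returns `Err` holding two errors, one per unparsable entry, rather than stopping at `"x"`.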
pub fn flat_map<R, I>(self, f: impl Fn(F) -> I) -> FileGroup<R>
where
    I: IntoIterator<Item = R>,
Flat maps the list of files in the group. Preserves the group file len and hash.
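To illustrate, here is a sketch of `flat_map` on a simplified stand-in type. The integer fields and the `all_paths` helper, which treats each entry as a list of paths belonging to one file, are assumptions.

```rust
// Simplified stand-in; integer fields replace `FileLen` and `FileHash` (assumption).
#[derive(Debug, PartialEq)]
struct FileGroup<F> {
    file_len: u64,
    file_hash: u128,
    files: Vec<F>,
}

impl<F> FileGroup<F> {
    /// Replaces each file with zero or more items and concatenates the results.
    fn flat_map<R, I>(self, f: impl Fn(F) -> I) -> FileGroup<R>
    where
        I: IntoIterator<Item = R>,
    {
        FileGroup {
            file_len: self.file_len,
            file_hash: self.file_hash,
            files: self.files.into_iter().flat_map(f).collect(),
        }
    }
}

/// Hypothetical helper: each entry lists all paths of one file; flatten them.
fn all_paths(group: FileGroup<Vec<&str>>) -> FileGroup<&str> {
    group.flat_map(|links| links)
}
```

An entry that expands to an empty collection simply disappears from the resulting group.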
pub fn partition_by_key<K: Eq + Hash>(self, key_fn: impl Fn(&F) -> K) -> Vec<FileGroup<F>>
Splits the group into one or more groups based on the key function applied to each file. Files with the same key are placed in the same group. The key is computed only once per item. File len and file hash are preserved.
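One way to get this behaviour is to bucket the files in a `HashMap` keyed by the result of `key_fn`. The sketch below assumes that approach on a simplified stand-in type; the real implementation may use a different data structure, and the order of the returned groups is unspecified here. `by_extension` is a hypothetical helper.

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Simplified stand-in; integer fields replace `FileLen` and `FileHash` (assumption).
#[derive(Debug, PartialEq)]
struct FileGroup<F> {
    file_len: u64,
    file_hash: u128,
    files: Vec<F>,
}

impl<F> FileGroup<F> {
    /// Splits the group so that files with equal keys end up together.
    fn partition_by_key<K: Eq + Hash>(self, key_fn: impl Fn(&F) -> K) -> Vec<FileGroup<F>> {
        let mut buckets: HashMap<K, Vec<F>> = HashMap::new();
        for file in self.files {
            // The key function is called exactly once per file.
            buckets.entry(key_fn(&file)).or_default().push(file);
        }
        let (file_len, file_hash) = (self.file_len, self.file_hash);
        buckets
            .into_values()
            .map(|files| FileGroup { file_len, file_hash, files })
            .collect()
    }
}

/// Hypothetical helper: partitions paths by the text after the last dot.
fn by_extension(group: FileGroup<&str>) -> Vec<FileGroup<&str>> {
    group.partition_by_key(|path| path.rsplit('.').next().unwrap_or("").to_string())
}
```

Because `HashMap` iteration order is arbitrary, callers of this sketch that need a stable order should sort the returned groups.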
impl<F: AsRef<FileId>> FileGroup<F>
pub fn unique_count(&self) -> usize
Returns the number of files with distinct identifiers. Files must be sorted by id.
pub fn unique_size(&self) -> FileLen
Returns the total size of data in files with distinct identifiers. Files must be sorted by id.
pub fn sort_by_id(&mut self)
Sorts the files in this group by their identifiers.
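The requirement that files be sorted by id suggests that distinct identifiers are counted by comparing neighbours. The sketch below assumes exactly that; `FileId` with `device` and `inode` fields, the `FileInfo` type and the helper functions are hypothetical stand-ins, and the hash field is left out for brevity.

```rust
// Hypothetical stand-in for a file identifier (device plus inode number).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct FileId {
    device: u64,
    inode: u64,
}

// Hypothetical file entry that exposes its id through `AsRef<FileId>`.
#[allow(dead_code)]
struct FileInfo {
    path: &'static str,
    id: FileId,
}

impl AsRef<FileId> for FileInfo {
    fn as_ref(&self) -> &FileId {
        &self.id
    }
}

// Simplified stand-in for the group; the hash field is omitted here.
struct FileGroup<F> {
    file_len: u64,
    files: Vec<F>,
}

impl<F: AsRef<FileId>> FileGroup<F> {
    fn sort_by_id(&mut self) {
        self.files.sort_by_key(|f| {
            let id: &FileId = f.as_ref();
            *id
        });
    }

    /// Counts runs of equal ids; only correct when files are sorted by id.
    fn unique_count(&self) -> usize {
        let mut count = 0;
        let mut prev: Option<&FileId> = None;
        for f in &self.files {
            let id: &FileId = f.as_ref();
            if prev != Some(id) {
                count += 1;
            }
            prev = Some(id);
        }
        count
    }

    fn unique_size(&self) -> u64 {
        self.file_len * self.unique_count() as u64
    }
}

fn file(path: &'static str, inode: u64) -> FileInfo {
    FileInfo { path, id: FileId { device: 1, inode } }
}

/// Two of the three entries are hard links to the same inode.
fn hard_link_demo() -> (usize, u64) {
    let mut group = FileGroup {
        file_len: 100,
        files: vec![file("a", 1), file("b", 2), file("a-link", 1)],
    };
    group.sort_by_id();
    (group.unique_count(), group.unique_size())
}
```

Without the `sort_by_id` call, the two entries with inode 1 would not be adjacent and this sketch would count three distinct files instead of two.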
impl<F: AsRef<Path> + AsRef<FileId>> FileGroup<F>
pub fn matches(&self, filter: &FileGroupFilter) -> bool
Returns true if the file group should be forwarded to the next grouping stage, because the number of duplicate files is higher than the maximum allowed number of replicas.
This method always returns true if the user searches for underreplicated files (filter.replication is Replication::Underreplicated). This is because even if the number of replicas is currently higher than the maximum number of allowed replicas, the group can be split in later stages and the number of replicas in the group may drop.
pub fn matches_strictly(&self, filter: &FileGroupFilter) -> bool
Returns true if the file group should be included in the final report.
The number of replicas in the group must be appropriate for the condition specified in filter.replication.
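The difference between the lenient `matches` check and the strict `matches_strictly` check can be sketched as follows. The types are hypothetical stand-ins: the replica limits carried by the `Replication` variants are an assumption, and the sketch counts every file as one replica, ignoring roots and file identifiers.

```rust
// Hypothetical stand-ins. The variant payloads (replica limits) are an assumption.
enum Replication {
    Underreplicated(usize),
    Overreplicated(usize),
}

struct FileGroupFilter {
    replication: Replication,
}

struct FileGroup<F> {
    files: Vec<F>,
}

impl<F> FileGroup<F> {
    /// Lenient check used between grouping stages.
    fn matches(&self, filter: &FileGroupFilter) -> bool {
        match filter.replication {
            Replication::Overreplicated(max) => self.files.len() > max,
            // A later split can only shrink the group, so it is never dropped early.
            Replication::Underreplicated(_) => true,
        }
    }

    /// Strict check used when building the final report.
    fn matches_strictly(&self, filter: &FileGroupFilter) -> bool {
        match filter.replication {
            Replication::Overreplicated(max) => self.files.len() > max,
            Replication::Underreplicated(min) => self.files.len() < min,
        }
    }
}
```

In this sketch a group of three files passes `matches` for `Underreplicated(2)` but fails `matches_strictly`, since it already has more than the required two replicas.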
pub fn missing_count(&self, filter: &FileGroupFilter) -> usize
Returns the number of missing file replicas.
This is the difference between the desired minimum number of replicas given by filter.replication and the number of files in the group.
If the number of files is greater than the minimum number of replicas, or if filter.replication is set to Replication::Overreplicated, 0 is returned.
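The rule above amounts to a saturating subtraction. The sketch below expresses it as a free function; the `Replication` stand-in with a replica limit in each variant and the plain `file_count` parameter are assumptions made for illustration.

```rust
// Hypothetical stand-in. The variant payloads (replica limits) are an assumption.
enum Replication {
    Underreplicated(usize),
    Overreplicated(usize),
}

/// Number of replicas still needed to reach the desired minimum.
fn missing_count(file_count: usize, replication: &Replication) -> usize {
    match replication {
        // Saturating subtraction yields 0 when the group already has enough files.
        Replication::Underreplicated(min) => min.saturating_sub(file_count),
        // Searching for overreplicated files never reports missing replicas.
        Replication::Overreplicated(_) => 0,
    }
}
```

For example, a group of two files with a desired minimum of five replicas is missing three.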
pub fn redundant_count(&self, filter: &FileGroupFilter) -> usize
Returns the highest number of redundant files that could be removed from the group.
If filter.roots are empty, the difference between the total number of files in the group and the desired maximum number of replicas controlled by filter.replication is returned.
If filter.roots are not empty, then files in the group are split into subgroups first, where each subgroup shares one of the roots. If the number of subgroups N is larger than the allowed number of replicas r, the last N - r subgroups are considered redundant. The total number of files in redundant subgroups is returned.
If the result would be negative in any of the above cases or if filter.replication is set to Replication::Underreplicated, 0 is returned.
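The counting rules above can be sketched as a free function over plain string paths. Everything here is a hypothetical simplification: roots are matched as string prefixes, roots are assumed not to overlap, subgroups are ordered by root, and the `Replication` variants are assumed to carry the replica limit.

```rust
// Hypothetical stand-in. The variant payloads (replica limits) are an assumption.
enum Replication {
    Underreplicated(usize),
    Overreplicated(usize),
}

/// Highest number of files that could be removed while keeping `r` replicas.
fn redundant_count(files: &[&str], roots: &[&str], replication: &Replication) -> usize {
    let max_replicas = match replication {
        Replication::Overreplicated(r) => *r,
        Replication::Underreplicated(_) => return 0,
    };
    if roots.is_empty() {
        // No roots: every file beyond the allowed maximum is redundant.
        return files.len().saturating_sub(max_replicas);
    }
    // One subgroup size per root that contains at least one file, in root order.
    let subgroup_sizes: Vec<usize> = roots
        .iter()
        .map(|root| files.iter().filter(|f| f.starts_with(*root)).count())
        .filter(|&n| n > 0)
        .collect();
    // Keep the first `max_replicas` subgroups; the remaining N - r are redundant.
    subgroup_sizes.iter().skip(max_replicas).sum()
}
```

With files spread over three roots in subgroups of sizes 2, 1 and 2 and one allowed replica, the last two subgroups are redundant, giving 3 files. With no roots, the same five files give 5 - 1 = 4.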
pub fn reported_count(&self, filter: &FileGroupFilter) -> usize
Returns either the number of redundant files or the number of missing files, depending on the type of search.
pub fn sort_by_path(&mut self, root_paths: &[Path])
Sorts the files by their path names. If filter requires grouping by roots, then groups are kept together.
Trait Implementations§
impl<'de, F> Deserialize<'de> for FileGroup<F>
where
    F: Deserialize<'de>,
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
impl<F: PartialEq> PartialEq for FileGroup<F>
impl<F: Eq> Eq for FileGroup<F>
impl<F> StructuralEq for FileGroup<F>
impl<F> StructuralPartialEq for FileGroup<F>
Auto Trait Implementations§
impl<F> RefUnwindSafe for FileGroup<F>
where
    F: RefUnwindSafe,
impl<F> Send for FileGroup<F>
where
    F: Send,
impl<F> Sync for FileGroup<F>
where
    F: Sync,
impl<F> Unpin for FileGroup<F>
where
    F: Unpin,
impl<F> UnwindSafe for FileGroup<F>
where
    F: UnwindSafe,
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<Q, K> Equivalent<K> for Q
fn equivalent(&self, key: &K) -> bool
Compare self to key and return true if they are equal.