pub struct Corpus {
    pub name: String,
    pub root_path: PathBuf,
    pub images: Vec<CorpusImage>,
    pub metadata: CorpusMetadata,
}
A corpus of test images.
Fields
name: String - Name of the corpus.
root_path: PathBuf - Root path of the corpus.
images: Vec<CorpusImage> - Images in the corpus.
metadata: CorpusMetadata - Metadata about the corpus.
Implementations
impl Corpus
pub fn new(name: impl Into<String>, root_path: impl Into<PathBuf>) -> Self
Create a new empty corpus.
pub fn discover(path: impl AsRef<Path>) -> Result<Self>
Discover images in a directory.
Recursively scans the directory for supported image formats (PNG, JPEG, WebP, AVIF).
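A recursive scan of this kind can be sketched with the standard library alone. This is an illustration, not the crate's implementation: `scan_images`, `is_supported_image`, and the `SUPPORTED_EXTENSIONS` list are hypothetical names, and matching on the file extension (case-insensitively, with both `jpg` and `jpeg` accepted) is an assumption about how "supported format" is decided.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Extensions treated as supported images in this sketch (an assumption;
/// the crate's real list and matching rules may differ).
const SUPPORTED_EXTENSIONS: [&str; 5] = ["png", "jpg", "jpeg", "webp", "avif"];

/// Returns true if the path ends in one of the extensions above,
/// compared case-insensitively.
fn is_supported_image(path: &Path) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| SUPPORTED_EXTENSIONS.contains(&ext.to_ascii_lowercase().as_str()))
        .unwrap_or(false)
}

/// Hypothetical helper: walks `dir` recursively and appends every
/// supported image path to `found`.
fn scan_images(dir: &Path, found: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            // Descend into subdirectories.
            scan_images(&path, found)?;
        } else if is_supported_image(&path) {
            found.push(path);
        }
    }
    Ok(())
}
```

The order returned by `read_dir` is platform-dependent, so a caller that needs a stable listing would probably sort `found` afterwards.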
pub fn save(&self, path: impl AsRef<Path>) -> Result<()>
Save the corpus to a JSON manifest file.
pub fn filter_category(&self, category: ImageCategory) -> Vec<&CorpusImage>
Filter images by category.
pub fn filter_format(&self, format: &str) -> Vec<&CorpusImage>
Filter images by format.
pub fn filter_min_size(&self, min_width: u32, min_height: u32) -> Vec<&CorpusImage>
Filter images by minimum dimensions.
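As a rough illustration of a minimum-dimension filter, the sketch below uses a stand-in `ImageEntry` type in place of `CorpusImage`, whose fields are not shown on this page. It assumes each image records a `width` and `height`, and that an image passes only when both values meet their minimums; the crate may define the comparison differently.

```rust
/// Stand-in for the crate's `CorpusImage`; the `width` and `height`
/// fields are assumptions made for this sketch.
struct ImageEntry {
    name: String,
    width: u32,
    height: u32,
}

/// Keeps the images whose width and height both meet the given minimums.
/// Returns borrowed references, mirroring the `Vec<&CorpusImage>` shape.
fn filter_min_size(images: &[ImageEntry], min_width: u32, min_height: u32) -> Vec<&ImageEntry> {
    images
        .iter()
        .filter(|img| img.width >= min_width && img.height >= min_height)
        .collect()
}
```

Returning references rather than clones keeps the filter cheap, and the borrow checker ensures the results do not outlive the image list they point into.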
pub fn split(&self, train_ratio: f64) -> (Vec<&CorpusImage>, Vec<&CorpusImage>)
Split the corpus into training and validation sets.
Uses a deterministic split based on checksum to ensure reproducibility.
Arguments
train_ratio - Fraction of images to include in the training set (0.0 to 1.0).
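One way a checksum-based deterministic split could work is sketched below. The page does not say how the checksum maps to a set, so everything here is an assumption: the stand-in `ImageEntry` type, the string checksum, the FNV-1a hash (picked only because its output is fixed across runs and platforms), and the 10,000-bucket threshold rule.

```rust
/// Stand-in for `CorpusImage`; the checksum is assumed to be a string.
struct ImageEntry {
    path: String,
    checksum: String,
}

/// FNV-1a 64-bit hash, used here because it gives the same value on
/// every run and platform.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Hypothetical split: maps each checksum to one of 10,000 buckets and
/// sends buckets below the threshold to training, the rest to validation.
fn split_by_checksum(
    images: &[ImageEntry],
    train_ratio: f64,
) -> (Vec<&ImageEntry>, Vec<&ImageEntry>) {
    let ratio = train_ratio.clamp(0.0, 1.0);
    let threshold = (ratio * 10_000.0).round() as u64;
    images
        .iter()
        .partition(|img| fnv1a64(img.checksum.as_bytes()) % 10_000 < threshold)
}
```

With this scheme an image's assignment depends only on its own checksum, so adding or removing other images does not move it between sets. The trade-off is that the realized training fraction only approximates `train_ratio`, especially for small corpora.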
pub fn compute_checksums(&mut self) -> Result<usize>
Compute checksums for all images that don’t have them.
pub fn find_duplicates(&self) -> Vec<Vec<&CorpusImage>>
Find duplicate images by checksum.
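Grouping by checksum might look like the sketch below. The stand-in `ImageEntry` type and its `Option<String>` checksum are assumptions (the optional checksum is suggested by `compute_checksums` filling in missing values), as is the choice to skip images without a checksum instead of grouping them together.

```rust
use std::collections::BTreeMap;

/// Stand-in for `CorpusImage`; the optional string checksum is an
/// assumption made for this sketch.
struct ImageEntry {
    path: String,
    checksum: Option<String>,
}

/// Groups images by checksum and returns only the groups with more than
/// one member. Images without a checksum are skipped, not treated as
/// duplicates of each other.
fn find_duplicates(images: &[ImageEntry]) -> Vec<Vec<&ImageEntry>> {
    // A BTreeMap keeps the groups in a stable, checksum-sorted order.
    let mut groups: BTreeMap<&str, Vec<&ImageEntry>> = BTreeMap::new();
    for img in images {
        if let Some(sum) = img.checksum.as_deref() {
            groups.entry(sum).or_default().push(img);
        }
    }
    groups
        .into_values()
        .filter(|group| group.len() > 1)
        .collect()
}
```

Under these assumptions, checksums would need to be computed before duplicates are searched for, since images that still lack one are invisible to the grouping.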
pub fn update_category_counts(&mut self)
Update category counts in metadata.
pub fn stats(&self) -> CorpusStats
Get statistics about the corpus.
Trait Implementations
impl<'de> Deserialize<'de> for Corpus
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Deserialize this value from the given Serde deserializer.
Auto Trait Implementations
impl Freeze for Corpus
impl RefUnwindSafe for Corpus
impl Send for Corpus
impl Sync for Corpus
impl Unpin for Corpus
impl UnwindSafe for Corpus
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.