Struct Iris 

pub struct Iris { /* private fields */ }

A struct representing the Iris dataset with lazy loading.

The dataset is not loaded until you call one of the data accessor methods. Once loaded, the data is cached for subsequent accesses.
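The load-once-then-cache behavior described above can be sketched with `std::sync::OnceLock`. This is a minimal illustration under stated assumptions, not the crate's actual implementation: `LazyTable`, `values`, and `is_loaded` are hypothetical names, and the loader closure stands in for the real download and parse step.

```rust
use std::cell::Cell;
use std::sync::OnceLock;

/// Hypothetical stand-in for a lazily loaded dataset (not the crate's type).
struct LazyTable {
    storage_dir: String,
    cache: OnceLock<Vec<f64>>,
}

impl LazyTable {
    /// Cheap constructor: records the directory and loads nothing.
    fn new(storage_dir: &str) -> Self {
        Self {
            storage_dir: storage_dir.to_string(),
            cache: OnceLock::new(),
        }
    }

    /// Returns the cached values, running `load` only if nothing is cached yet.
    fn values(&self, load: impl FnOnce(&str) -> Vec<f64>) -> &Vec<f64> {
        self.cache.get_or_init(|| load(&self.storage_dir))
    }

    /// Reports whether the data has been loaded.
    fn is_loaded(&self) -> bool {
        self.cache.get().is_some()
    }
}

fn main() {
    let table = LazyTable::new("./iris");
    assert!(!table.is_loaded()); // constructing the value loads nothing

    // Count how often the (stand-in) loader actually runs.
    let loads = Cell::new(0);
    let loader = |_dir: &str| {
        loads.set(loads.get() + 1);
        vec![5.1, 3.5, 1.4, 0.2]
    };

    let first = table.values(&loader).clone();
    let second = table.values(&loader).clone();
    assert_eq!(first, second);
    assert_eq!(loads.get(), 1); // the second access was served from the cache
}
```

Because the accessor takes `&self`, callers can hold several shared references to the same value while the cache is filled on demand.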

§About Dataset

The Iris dataset is a classic benchmark for classification tasks. It contains three iris species with 50 samples each (150 samples in total), along with four measurements per flower. One species is linearly separable from the other two, which are not linearly separable from each other.
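To make the separability remark concrete: the separable species is commonly reported to be setosa, and a single threshold on petal length is usually enough to isolate it. The sketch below assumes that, uses a hypothetical helper `is_setosa_by_petal_length`, and picks 2.5 cm as an illustrative cut-off. The three rows are the first record of each species in the UCI file, with labels written in the short form this struct returns.

```rust
/// Hypothetical rule: treat a flower as setosa when its petal length
/// is below `threshold_cm`. A one-feature threshold is a linear separator.
fn is_setosa_by_petal_length(petal_length_cm: f64, threshold_cm: f64) -> bool {
    petal_length_cm < threshold_cm
}

fn main() {
    // Columns: sepal length, sepal width, petal length, petal width (all in cm).
    let samples = [
        ([5.1, 3.5, 1.4, 0.2], "setosa"),
        ([7.0, 3.2, 4.7, 1.4], "versicolor"),
        ([6.3, 3.3, 6.0, 2.5], "virginica"),
    ];

    for (features, species) in samples {
        // Index 2 is the petal length column.
        let predicted = is_setosa_by_petal_length(features[2], 2.5);
        assert_eq!(predicted, species == "setosa");
    }
}
```

No comparable single cut-off separates versicolor from virginica, which is why the dataset remains a useful classification exercise.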

Features:

  • sepal length in cm
  • sepal width in cm
  • petal length in cm
  • petal width in cm

Labels:

  • species name (as &'static str): "setosa", "versicolor", "virginica"

See more information at https://archive.ics.uci.edu/dataset/53/iris

§Citation

R. A. Fisher. “Iris,” UCI Machine Learning Repository, [Online]. Available: https://doi.org/10.24432/C56C76

§Thread Safety

Iris automatically implements Send and Sync because all of its fields do, so a single instance can be shared across threads. The internal Dataset ensures that lazy initialization is thread-safe.
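Thread-safe lazy initialization can be illustrated with the standard library alone. The sketch below is an assumption about the general technique, not a description of the internal Dataset: `SharedLazy` and `run_concurrently` are hypothetical names, and the initializer simply builds 150 placeholder values.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

/// Hypothetical lazily initialized holder standing in for an internal cache.
struct SharedLazy {
    cell: OnceLock<Vec<u32>>,
    init_calls: AtomicUsize,
}

impl SharedLazy {
    fn new() -> Self {
        Self {
            cell: OnceLock::new(),
            init_calls: AtomicUsize::new(0),
        }
    }

    /// Initializes the value on first use; later callers get the cached data.
    fn get(&self) -> &Vec<u32> {
        self.cell.get_or_init(|| {
            self.init_calls.fetch_add(1, Ordering::SeqCst);
            (0..150).collect()
        })
    }
}

/// Reads the shared value from `threads` threads and returns how many
/// times the initializer actually ran.
fn run_concurrently(threads: usize) -> usize {
    let shared = SharedLazy::new();
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| assert_eq!(shared.get().len(), 150));
        }
    });
    shared.init_calls.load(Ordering::SeqCst)
}

fn main() {
    // Eight threads race on the first access, yet the initializer runs once.
    assert_eq!(run_concurrently(8), 1);
}
```

`OnceLock::get_or_init` guarantees that only one initializer executes even when several threads call it at the same time, which is the property the section above relies on.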

§Example

use dataset_core::datasets::iris::Iris;

let download_dir = "./iris"; // the code will create the directory if it doesn't exist

let dataset = Iris::new(download_dir);
let features = dataset.features().unwrap();
let labels = dataset.labels().unwrap();

let (features, labels) = dataset.data().unwrap(); // alternatively, get features and labels in one call
// the accessors return references; use `.to_owned()` to get owned, mutable copies
let mut features_owned = features.to_owned();
let mut labels_owned = labels.to_owned();

// Example: modify the owned copies (the cached originals are unchanged)
features_owned[[0, 0]] = 5.5;
labels_owned[0] = "setosa-modified";

assert_eq!(features.shape(), &[150, 4]);
assert_eq!(labels.len(), 150);

// clean up: remove the downloaded files (optional)
std::fs::remove_dir_all(download_dir).unwrap();

Implementations§

impl Iris

pub fn new(storage_dir: &str) -> Self

Create a new Iris instance without loading data.

The dataset will be loaded lazily when you first call any data accessor method. This is a lightweight operation that only stores the storage directory.

§Parameters
  • storage_dir - Directory where the dataset will be stored.
§Returns
  • Self - Iris instance ready for lazy loading.

pub fn features(&self) -> Result<&Array2<f64>, DatasetError>

Get a reference to the feature matrix.

This method triggers lazy loading on first call. Subsequent calls return the cached data instantly.

§Returns
  • &Array2<f64> - Reference to feature matrix with shape (150, 4) containing:
    • sepal length in cm
    • sepal width in cm
    • petal length in cm
    • petal width in cm
§Errors

Returns DatasetError if:

  • Download fails due to network issues
  • File extraction or I/O operations fail
  • Data format is invalid (wrong number of columns, unparseable values, or invalid labels)
  • Dataset size doesn’t match expected dimensions (150 samples, 4 features)
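The format and size checks listed above can be sketched as follows. This is a simplified illustration, not the crate's parser: `parse_record` and `parse_all` are hypothetical names, the comma-separated layout is an assumption, and errors are plain `String`s rather than `DatasetError`.

```rust
/// Hypothetical parser for one comma-separated record:
/// four numeric fields followed by a species name.
fn parse_record(line: &str) -> Result<([f64; 4], String), String> {
    let fields: Vec<&str> = line.trim().split(',').collect();
    if fields.len() != 5 {
        return Err(format!("expected 5 columns, found {}", fields.len()));
    }
    let mut features = [0.0; 4];
    for (slot, raw) in features.iter_mut().zip(&fields[..4]) {
        *slot = raw
            .trim()
            .parse::<f64>()
            .map_err(|e| format!("unparseable value {raw:?}: {e}"))?;
    }
    Ok((features, fields[4].trim().to_string()))
}

/// Parses every non-empty line, then checks the total row count.
fn parse_all(text: &str, expected_rows: usize) -> Result<Vec<([f64; 4], String)>, String> {
    let rows = text
        .lines()
        .filter(|l| !l.trim().is_empty())
        .map(parse_record)
        .collect::<Result<Vec<_>, _>>()?;
    if rows.len() != expected_rows {
        return Err(format!("expected {expected_rows} rows, found {}", rows.len()));
    }
    Ok(rows)
}

fn main() {
    let good = "5.1,3.5,1.4,0.2,setosa\n7.0,3.2,4.7,1.4,versicolor\n";
    assert_eq!(parse_all(good, 2).unwrap().len(), 2);

    assert!(parse_record("5.1,3.5,1.4").is_err()); // wrong number of columns
    assert!(parse_record("5.1,abc,1.4,0.2,setosa").is_err()); // unparseable value
    assert!(parse_all(good, 150).is_err()); // size does not match expectation
}
```

Checking the row count after parsing catches truncated or partially downloaded files that would otherwise parse line by line without complaint.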

pub fn labels(&self) -> Result<&Array1<&'static str>, DatasetError>

Get a reference to the labels vector.

This method triggers lazy loading on first call. Subsequent calls return the cached data instantly.

§Returns
  • &Array1<&'static str> - Reference to labels vector with shape (150,) containing species names ("setosa", "versicolor", "virginica")
§Errors

Returns DatasetError if:

  • Download fails due to network issues
  • File extraction or I/O operations fail
  • Data format is invalid (wrong number of columns, unparseable values, or invalid labels)
  • Dataset size doesn’t match expected dimensions (150 samples)
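One plausible way to obtain `&'static str` labels and reject invalid ones is to map each raw label onto a fixed table of species names. The sketch below is an assumption, not the crate's code: `canonical_label` is a hypothetical helper, and accepting an optional "Iris-" prefix is a guess about the raw file format.

```rust
/// The three species names listed in the documentation.
static SPECIES: [&str; 3] = ["setosa", "versicolor", "virginica"];

/// Hypothetical helper: maps a raw label onto one of the static species
/// names, or returns `None` when the label is not recognized.
fn canonical_label(raw: &str) -> Option<&'static str> {
    let name = raw.trim();
    // Assumption: raw labels may carry an "Iris-" prefix.
    let name = name.strip_prefix("Iris-").unwrap_or(name);
    SPECIES.iter().copied().find(|s| *s == name)
}

fn main() {
    assert_eq!(canonical_label("Iris-setosa"), Some("setosa"));
    assert_eq!(canonical_label(" virginica "), Some("virginica"));
    assert_eq!(canonical_label("tulip"), None); // would be reported as an invalid label
}
```

Returning entries from a static table means the labels do not borrow from the file buffer, so they stay valid for the lifetime of the program.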

pub fn data(&self) -> Result<(&Array2<f64>, &Array1<&'static str>), DatasetError>

Get both features and labels as references.

This method triggers lazy loading on first call. Subsequent calls return the cached data instantly.

§Returns
  • &Array2<f64> - Reference to feature matrix with shape (150, 4) containing:
    • sepal length in cm
    • sepal width in cm
    • petal length in cm
    • petal width in cm
  • &Array1<&'static str> - Reference to labels vector with shape (150,) containing species names ("setosa", "versicolor", "virginica")
§Errors

Returns DatasetError if:

  • Download fails due to network issues
  • File extraction or I/O operations fail
  • Data format is invalid (wrong number of columns, unparseable values, or invalid labels)
  • Dataset size doesn’t match expected dimensions (150 samples, 4 features)

Trait Implementations§

impl Debug for Iris

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

impl !Freeze for Iris

impl RefUnwindSafe for Iris

impl Send for Iris

impl Sync for Iris

impl Unpin for Iris

impl UnsafeUnpin for Iris

impl UnwindSafe for Iris

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> Same for T

type Output = T

Should always be Self

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.