pub struct Dataset<XT: Number, YT: TargetValue> {
pub x: DMatrix<XT>,
pub y: DVector<YT>,
}
Fields§
x: DMatrix<XT>
y: DVector<YT>
Implementations§
impl<XT: Number, YT: TargetValue> Dataset<XT, YT>
Implementation of a generic dataset structure.
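This structure represents a dataset consisting of input features (x) and target values (y).
It provides methods for standardizing the features, splitting the data into training and testing sets, partitioning rows on a feature threshold, and drawing random samples.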
§Type Parameters
XT: The type of the input features.
YT: The type of the target values.
§Examples
use nalgebra::{DMatrix, DVector};
use rusty_ai::data::dataset::Dataset;
use rand::prelude::*;
// Define a dataset with input features of type f64 and target values of type u32
let x = DMatrix::from_row_slice(3, 2, &[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]);
let y = DVector::from_vec(vec![0, 1, 0]);
let dataset = Dataset::new(x, y);
// Split the dataset into training and testing sets
let (mut train_set, test_set) = dataset.train_test_split(0.8, Some(42)).unwrap();
// Standardize the input features of the dataset
train_set.standardize();
// Split the dataset based on a threshold value
let (left_set, right_set) = dataset.split_on_threshold(0, 3.5);
// Sample a subset of the dataset
let sample_set = dataset.samples(2, Some(123));
pub fn into_parts(&self) -> (&DMatrix<XT>, &DVector<YT>)
Splits the dataset into its constituent parts.
§Returns
A tuple containing references to the input features and target values of the dataset.
pub fn is_not_empty(&self) -> bool
pub fn standardize(&mut self)
where
    XT: RealNumber,
Standardizes the input features of the dataset.
This method calculates the mean and standard deviation of each input feature and standardizes the values by subtracting the mean and dividing by the standard deviation.
§Requirements
The input features (XT) must implement the RealNumber trait.
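The calculation described above can be sketched in plain Rust. This is an illustration under stated assumptions, not the crate's implementation: `standardize_columns` is a hypothetical helper, `Vec<Vec<f64>>` stands in for `DMatrix<XT>`, the population standard deviation (dividing by n) is assumed, and a constant column is only centered to avoid dividing by zero.

```rust
/// Standardizes each column of a row-major table in place.
/// Hypothetical helper: plain vectors stand in for `DMatrix<XT>`.
fn standardize_columns(rows: &mut [Vec<f64>]) {
    if rows.is_empty() {
        return;
    }
    let n = rows.len() as f64;
    let n_cols = rows[0].len();
    for col in 0..n_cols {
        // Mean of this feature over all rows.
        let mean = rows.iter().map(|r| r[col]).sum::<f64>() / n;
        // Assumption: population variance (divide by n, not n - 1).
        let variance = rows.iter().map(|r| (r[col] - mean).powi(2)).sum::<f64>() / n;
        let std_dev = variance.sqrt();
        for row in rows.iter_mut() {
            // Assumption: a constant column (std_dev == 0) is only centered.
            row[col] = if std_dev > 0.0 {
                (row[col] - mean) / std_dev
            } else {
                row[col] - mean
            };
        }
    }
}
```

After the call, every non-constant column has mean 0 and unit variance, which is the property `standardize` is documented to provide.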
pub fn train_test_split(
    &self,
    train_size: f64,
    seed: Option<u64>,
) -> Result<(Self, Self), Box<dyn Error>>
Splits the dataset into training and testing sets.
§Arguments
train_size - The proportion of the dataset to use for training. Should be between 0.0 and 1.0.
seed - An optional seed value for the random number generator.
§Returns
A result containing the training and testing datasets, or an error if the train size is invalid.
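A seeded split of this kind is commonly built by shuffling the row indices and cutting the shuffled list at the requested proportion. The sketch below shows that idea only; it is not taken from the crate. `split_indices` and `splitmix64` are hypothetical names, the SplitMix64 generator is a stand-in for whatever RNG the crate uses, the bounds on `train_size` are assumed to be exclusive, and the seed is a plain `u64` rather than `Option<u64>`.

```rust
/// SplitMix64 step: a small deterministic generator used here only as a
/// stand-in for a seeded RNG (assumption, not the crate's generator).
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Returns (train indices, test indices) for a dataset with `n_rows` rows.
/// Hypothetical helper; assumes `train_size` must lie strictly in (0, 1).
fn split_indices(
    n_rows: usize,
    train_size: f64,
    seed: u64,
) -> Result<(Vec<usize>, Vec<usize>), String> {
    if !(train_size > 0.0 && train_size < 1.0) {
        return Err(format!("invalid train_size: {}", train_size));
    }
    let mut indices: Vec<usize> = (0..n_rows).collect();
    let mut state = seed;
    // Fisher-Yates shuffle; the modulo introduces a negligible bias that is
    // acceptable for a sketch.
    for i in (1..n_rows).rev() {
        let j = (splitmix64(&mut state) % (i as u64 + 1)) as usize;
        indices.swap(i, j);
    }
    // The first `n_train` shuffled indices form the training set.
    let n_train = (n_rows as f64 * train_size).floor() as usize;
    let test = indices.split_off(n_train);
    Ok((indices, test))
}
```

Because the shuffle depends only on the seed, calling the function twice with the same arguments yields the same partition, which is the purpose of the `seed` argument.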
pub fn split_on_threshold(
    &self,
    feature_index: usize,
    threshold: XT,
) -> (Self, Self)
Splits the dataset based on a threshold value.
This method partitions the dataset into two subsets based on the specified feature index and threshold value. The left subset contains rows where the feature value is less than or equal to the threshold, while the right subset contains rows where the feature value is greater than the threshold.
§Arguments
feature_index - The index of the feature to split on.
threshold - The threshold value for the split.
§Returns
A tuple containing the left and right subsets of the dataset.
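The partition rule can be illustrated without nalgebra. In this sketch `split_rows_on_threshold` and the `Subset` alias are hypothetical, and `Vec<Vec<f64>>` with `Vec<u32>` stand in for `DMatrix<XT>` and `DVector<YT>`; only the `<=` versus `>` rule comes from the description above.

```rust
/// Rows and their matching targets; a stand-in for `Dataset<f64, u32>`.
type Subset = (Vec<Vec<f64>>, Vec<u32>);

/// Splits rows into (left, right): left holds rows whose selected feature is
/// less than or equal to `threshold`, right holds the remaining rows.
fn split_rows_on_threshold(
    rows: &[Vec<f64>],
    targets: &[u32],
    feature_index: usize,
    threshold: f64,
) -> (Subset, Subset) {
    let mut left: Subset = (Vec::new(), Vec::new());
    let mut right: Subset = (Vec::new(), Vec::new());
    for (row, &target) in rows.iter().zip(targets) {
        // Pick the destination subset, then copy the row and its target.
        let side = if row[feature_index] <= threshold {
            &mut left
        } else {
            &mut right
        };
        side.0.push(row.clone());
        side.1.push(target);
    }
    (left, right)
}
```

With the rows `[1.0, 2.0]`, `[3.0, 4.0]`, `[5.0, 6.0]`, feature index 0 and threshold 3.5, the first two rows go left and the last row goes right.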
pub fn samples(&self, sample_size: usize, seed: Option<u64>) -> Self
Samples a subset of the dataset.
This method randomly selects a specified number of rows from the dataset to create a new subset.
§Arguments
sample_size - The number of rows to sample.
seed - An optional seed value for the random number generator.
§Returns
A new dataset containing the sampled subset.
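The documentation does not say whether rows are drawn with or without replacement. The sketch below assumes sampling with replacement (bootstrap style) and should be read as one possible interpretation. `sample_indices` and `splitmix64` are hypothetical names, and the generator is a stand-in for the crate's seeded RNG.

```rust
/// SplitMix64 step, used only as a small deterministic stand-in RNG.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Draws `sample_size` row indices from `0..n_rows`.
/// Assumption: sampling is with replacement, so an index may repeat.
fn sample_indices(n_rows: usize, sample_size: usize, seed: u64) -> Vec<usize> {
    if n_rows == 0 {
        // Nothing to sample from.
        return Vec::new();
    }
    let mut state = seed;
    (0..sample_size)
        .map(|_| (splitmix64(&mut state) % n_rows as u64) as usize)
        .collect()
}
```

The returned indices would then be used to copy the matching rows of x and entries of y into the new dataset.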
Trait Implementations§
Auto Trait Implementations§
impl<XT, YT> Freeze for Dataset<XT, YT>
impl<XT, YT> RefUnwindSafe for Dataset<XT, YT>
where
    XT: RefUnwindSafe,
    YT: RefUnwindSafe,
impl<XT, YT> Send for Dataset<XT, YT>
impl<XT, YT> Sync for Dataset<XT, YT>
impl<XT, YT> Unpin for Dataset<XT, YT>
impl<XT, YT> UnwindSafe for Dataset<XT, YT>
where
    XT: UnwindSafe,
    YT: UnwindSafe,
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.