Dataset

Struct Dataset 

pub struct Dataset<XT: Number, YT: TargetValue> {
    pub x: DMatrix<XT>,
    pub y: DVector<YT>,
}

Fields§

§x: DMatrix<XT>
§y: DVector<YT>

Implementations§

impl<XT: Number, YT: TargetValue> Dataset<XT, YT>

Implementation of a generic dataset structure.

This structure represents a dataset consisting of input features (x) and target values (y). It provides various methods for manipulating and analyzing the dataset.

§Type Parameters

  • XT: The type of the input features.
  • YT: The type of the target values.

§Examples

use nalgebra::{DMatrix, DVector};
use rusty_ai::data::dataset::Dataset;

// Define a dataset with input features of type f64 and target values of type u32
let x = DMatrix::from_row_slice(3, 2, &[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]);
let y = DVector::from_vec(vec![0u32, 1, 0]);
let dataset = Dataset::new(x, y);

// Split the dataset into training and testing sets
let (mut train_set, test_set) = dataset.train_test_split(0.8, Some(42)).unwrap();

// Standardize the input features of the dataset
train_set.standardize();

// Split the dataset based on a threshold value
let (left_set, right_set) = dataset.split_on_threshold(0, 3.5);

// Sample a subset of the dataset
let sample_set = dataset.samples(2, Some(123));

pub fn new(x: DMatrix<XT>, y: DVector<YT>) -> Self

Creates a new dataset with the given input features and target values.

§Arguments

  • x - The input features of the dataset.
  • y - The target values of the dataset.

§Returns

A new Dataset instance.

pub fn into_parts(&self) -> (&DMatrix<XT>, &DVector<YT>)

Borrows the dataset's constituent parts without consuming the dataset.

§Returns

A tuple containing references to the input features and target values of the dataset.

pub fn is_not_empty(&self) -> bool

Checks if the dataset is not empty.

§Returns

true if the dataset is not empty, false otherwise.

pub fn nrows(&self) -> usize

Returns the number of rows in the dataset.

§Returns

The number of rows in the dataset.

pub fn standardize(&mut self)
where XT: RealNumber,

Standardizes the input features of the dataset.

This method calculates the mean and standard deviation of each input feature and standardizes the values by subtracting the mean and dividing by the standard deviation.

§Requirements

The input features (XT) must implement the RealNumber trait.
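The per-feature computation described above can be sketched without any crates. This is an illustration on a plain row-major `Vec<Vec<f64>>`, not the crate's implementation: the function name `standardize_columns` is hypothetical, and the choice of population standard deviation (dividing by n) and the handling of zero-variance columns are assumptions the documentation does not settle.

```rust
/// Standardizes each column of a row-major table in place (z-score).
///
/// Assumptions (not confirmed by the crate's docs):
/// - the population standard deviation is used (divide by n, not n - 1);
/// - a column with zero variance is only centered, to avoid dividing by zero.
fn standardize_columns(rows: &mut [Vec<f64>]) {
    let n = rows.len();
    if n == 0 {
        return;
    }
    let n_features = rows[0].len();
    for j in 0..n_features {
        // Mean and variance of column j.
        let mean = rows.iter().map(|r| r[j]).sum::<f64>() / n as f64;
        let var = rows.iter().map(|r| (r[j] - mean).powi(2)).sum::<f64>() / n as f64;
        let std = var.sqrt();
        for r in rows.iter_mut() {
            r[j] -= mean;
            if std > 0.0 {
                r[j] /= std;
            }
        }
    }
}
```

After the call, every column with non-zero variance has mean 0 and standard deviation 1 under the stated convention.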

pub fn train_test_split(&self, train_size: f64, seed: Option<u64>) -> Result<(Self, Self), Box<dyn Error>>

Splits the dataset into training and testing sets.

§Arguments

  • train_size - The proportion of the dataset to use for training. Should be between 0.0 and 1.0.
  • seed - An optional seed value for the random number generator.

§Returns

A result containing the training and testing datasets, or an error if the train size is invalid.
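A seeded shuffle-then-cut split can be sketched on row indices alone. Everything here is illustrative: `XorShift64` is a small stand-in generator so the sketch needs no crates (the crate itself presumably uses `rand`), `split_indices` is a hypothetical name, and both the inclusive range check and the use of `floor` for the training count are assumptions.

```rust
/// Tiny xorshift64 generator, a stand-in for a real seeded RNG.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Returns (train_indices, test_indices) for a dataset with `n_rows` rows.
/// Assumption: `train_size` outside [0.0, 1.0] (or NaN) is the "invalid" case.
fn split_indices(
    n_rows: usize,
    train_size: f64,
    seed: u64,
) -> Result<(Vec<usize>, Vec<usize>), String> {
    if !(0.0..=1.0).contains(&train_size) {
        return Err(format!("train_size must be in [0.0, 1.0], got {}", train_size));
    }
    let mut indices: Vec<usize> = (0..n_rows).collect();
    let mut rng = XorShift64::new(seed);
    // Fisher-Yates shuffle driven by the seeded generator.
    for i in (1..n_rows).rev() {
        let j = (rng.next_u64() % (i as u64 + 1)) as usize;
        indices.swap(i, j);
    }
    // Assumption: the training count is rounded down.
    let n_train = (n_rows as f64 * train_size).floor() as usize;
    let test = indices.split_off(n_train);
    Ok((indices, test))
}
```

Calling `split_indices(10, 0.8, 42)` yields 8 training indices and 2 test indices, and the same seed always reproduces the same partition.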

pub fn split_on_threshold(&self, feature_index: usize, threshold: XT) -> (Self, Self)

Splits the dataset based on a threshold value.

This method partitions the dataset into two subsets based on the specified feature index and threshold value. The left subset contains rows where the feature value is less than or equal to the threshold, while the right subset contains rows where the feature value is greater than the threshold.

§Arguments

  • feature_index - The index of the feature to split on.
  • threshold - The threshold value for the split.

§Returns

A tuple containing the left and right subsets of the dataset.
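The partition rule (values less than or equal to the threshold go left, greater values go right) can be sketched on plain vectors. The function name `split_rows_on_threshold` and the `Vec`-based row layout are hypothetical stand-ins for the crate's `DMatrix`/`DVector` storage; only the comparison rule is taken from the description above.

```rust
/// Partitions row-major samples on one feature.
/// Rows whose value at `feature_index` is <= `threshold` go to the left subset,
/// all other rows go to the right subset. Each target stays with its row.
fn split_rows_on_threshold(
    rows: &[Vec<f64>],
    targets: &[u32],
    feature_index: usize,
    threshold: f64,
) -> ((Vec<Vec<f64>>, Vec<u32>), (Vec<Vec<f64>>, Vec<u32>)) {
    let mut left = (Vec::new(), Vec::new());
    let mut right = (Vec::new(), Vec::new());
    for (row, &target) in rows.iter().zip(targets) {
        let side = if row[feature_index] <= threshold {
            &mut left
        } else {
            &mut right
        };
        side.0.push(row.clone());
        side.1.push(target);
    }
    (left, right)
}
```

With the three-row dataset from the Examples section, splitting feature 0 at 3.5 puts the rows starting with 1.0 and 3.0 on the left and the row starting with 5.0 on the right.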

pub fn samples(&self, sample_size: usize, seed: Option<u64>) -> Self

Samples a subset of the dataset.

This method randomly selects a specified number of rows from the dataset to create a new subset.

§Arguments

  • sample_size - The number of rows to sample.
  • seed - An optional seed value for the random number generator.

§Returns

A new dataset containing the sampled subset.
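Seeded row sampling can be sketched as drawing row indices. The documentation does not say whether rows are drawn with or without replacement; this sketch assumes sampling with replacement (bootstrap style), which is only one plausible reading. `Lcg64` and `sample_indices` are hypothetical names, and the generator is a crate-free stand-in for a real seeded RNG.

```rust
/// Minimal 64-bit linear congruential generator, a stand-in for a real seeded RNG.
struct Lcg64(u64);

impl Lcg64 {
    fn next_u64(&mut self) -> u64 {
        // Multiplier and increment from Knuth's MMIX generator.
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Use the high bits, which are better distributed than the low bits.
        self.0 >> 33
    }
}

/// Draws `sample_size` row indices from `0..n_rows`.
/// Assumption: sampling is done with replacement, so indices may repeat.
fn sample_indices(n_rows: usize, sample_size: usize, seed: u64) -> Vec<usize> {
    if n_rows == 0 {
        return Vec::new();
    }
    let mut rng = Lcg64(seed);
    (0..sample_size)
        .map(|_| (rng.next_u64() % n_rows as u64) as usize)
        .collect()
}
```

The returned indices would then be used to copy the matching rows of x and entries of y into the new dataset; the same seed reproduces the same draw.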

Trait Implementations§

impl<XT: Number, YT: TargetValue> Debug for Dataset<XT, YT>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

impl<XT, YT> Freeze for Dataset<XT, YT>

impl<XT, YT> RefUnwindSafe for Dataset<XT, YT>

impl<XT, YT> Send for Dataset<XT, YT>

impl<XT, YT> Sync for Dataset<XT, YT>

impl<XT, YT> Unpin for Dataset<XT, YT>
where XT: Unpin, YT: Unpin,

impl<XT, YT> UnwindSafe for Dataset<XT, YT>
where XT: UnwindSafe, YT: UnwindSafe,

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> Same for T

type Output = T

Should always be Self

impl<SS, SP> SupersetOf<SS> for SP
where SS: SubsetOf<SP>,

fn to_subset(&self) -> Option<SS>

The inverse inclusion map: attempts to construct self from the equivalent element of its superset.

fn is_in_subset(&self) -> bool

Checks if self is actually part of its subset T (and can be converted to it).

fn to_subset_unchecked(&self) -> SS

Use with care! Same as self.to_subset but without any property checks. Always succeeds.

fn from_subset(element: &SS) -> SP

The inclusion map: converts self to the equivalent element of its superset.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V