Struct DataSet 

pub struct DataSet {
    pub x: Array2<f64>,
    pub y: Array1<f64>,
    pub feature_names: Vec<String>,
    pub target_name: String,
}

A dataset of input features x (shape [n_rows, n_vars]) and targets y (shape [n_rows]).

Fields§

§x: Array2<f64>

Feature matrix, one row per observation.

§y: Array1<f64>

Target vector, one entry per observation.

§feature_names: Vec<String>

Names of the feature columns (length n_vars).

§target_name: String

Name of the target column.

Implementations§

impl DataSet

pub fn from_arrays(x: Array2<f64>, y: Array1<f64>) -> Result<Self>

Build a dataset from in-memory arrays.

§Errors

Returns PhopError::ShapeMismatch if the row counts of x and y differ.

pub fn n_vars(&self) -> usize

Number of feature variables.

pub fn len(&self) -> usize

Number of observations.

pub fn is_empty(&self) -> bool

Returns true if the dataset contains no observations.

pub fn standardized(&self) -> (DataSet, Standardizer)

Produce a z-scored copy of the dataset together with the Standardizer that maps predictions back to the original target units.

Each feature column and the target are centered and scaled to unit variance; constant columns (zero variance) are left centered with a unit scale so the transform stays finite.
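As a rough illustration of the zero-variance rule described above, the following std-only sketch z-scores a single column. The names ColumnStats, zscore_column and unscale are hypothetical and are not the crate's Standardizer; the use of the population variance (dividing by n) is also an assumption, since the page does not say which estimator is used.

```rust
/// Mean and scale recorded for one column.
/// Hypothetical helper type for illustration; not the crate's `Standardizer`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ColumnStats {
    mean: f64,
    scale: f64,
}

/// Z-score one column in place and return the statistics that were applied.
/// A zero-variance column is centered but divided by 1.0, so no NaN or
/// infinity can appear in the output.
fn zscore_column(values: &mut [f64]) -> ColumnStats {
    if values.is_empty() {
        return ColumnStats { mean: 0.0, scale: 1.0 };
    }
    let n = values.len() as f64;
    let mean = values.iter().sum::<f64>() / n;
    // Assumption: population variance (divide by n, not n - 1).
    let var = values.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
    let std = var.sqrt();
    let scale = if std > 0.0 { std } else { 1.0 };
    for v in values.iter_mut() {
        *v = (*v - mean) / scale;
    }
    ColumnStats { mean, scale }
}

/// Map a standardized value back to the original units.
fn unscale(z: f64, stats: ColumnStats) -> f64 {
    z * stats.scale + stats.mean
}
```

Keeping the mean and scale next to the transformed data is what allows a prediction made in standardized units to be converted back with unscale.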

pub fn select(&self, rows: &[usize]) -> Result<DataSet>

Build a sub-dataset from the given row indices (used by minibatching).

§Errors

Returns PhopError::ShapeMismatch if any index is out of range.
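The bounds check described above might look roughly like the sketch below. It works on plain Vec data instead of ndarray arrays, the function name select_rows is hypothetical, and a String stands in for PhopError::ShapeMismatch. Whether the real method accepts repeated indices is not stated on this page; the sketch allows them.

```rust
/// Copy the rows listed in `rows` out of a row-major feature table and its
/// target vector. Hypothetical stand-in for illustration only.
/// The String error stands in for the documented PhopError::ShapeMismatch.
fn select_rows(
    x: &[Vec<f64>],
    y: &[f64],
    rows: &[usize],
) -> Result<(Vec<Vec<f64>>, Vec<f64>), String> {
    let n_rows = x.len().min(y.len());
    let mut sub_x = Vec::with_capacity(rows.len());
    let mut sub_y = Vec::with_capacity(rows.len());
    for &r in rows {
        // Reject the whole request as soon as one index is out of range.
        if r >= n_rows {
            return Err(format!("row index {r} out of range for {n_rows} rows"));
        }
        sub_x.push(x[r].clone());
        sub_y.push(y[r]);
    }
    Ok((sub_x, sub_y))
}
```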

pub fn minibatches(&self, size: usize, seed: u64) -> Vec<DataSet>

Partition the data axis into shuffled minibatches of (at most) size rows.

The shuffle is seeded for reproducibility. Minibatching bounds per-step memory by letting the optimizer consume the data in chunks (Risk T3 mitigation). A size of 0, or a size greater than or equal to the row count, yields a single batch containing all rows.
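One way to picture the seeded partition is the std-only sketch below, which returns row indices instead of DataSet values. The function minibatch_indices and the SplitMix64 generator are assumptions made for illustration; the page does not say which random number generator the crate uses, nor whether the single all-rows batch is shuffled (it is left in order here).

```rust
/// Tiny SplitMix64 generator so the sketch needs no external crate.
/// Assumption: the real implementation may use a different RNG entirely.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Split `0..n_rows` into shuffled batches of at most `size` indices.
/// Hypothetical helper; the documented method returns `Vec<DataSet>`.
fn minibatch_indices(n_rows: usize, size: usize, seed: u64) -> Vec<Vec<usize>> {
    let mut idx: Vec<usize> = (0..n_rows).collect();
    // A size of 0, or a size >= n_rows, yields one batch with every row.
    if size == 0 || size >= n_rows {
        return vec![idx];
    }
    // Fisher-Yates shuffle driven by the seeded generator.
    // The modulo introduces a negligible bias that is ignored in this sketch.
    let mut rng = SplitMix64(seed);
    for i in (1..n_rows).rev() {
        let j = (rng.next_u64() % (i as u64 + 1)) as usize;
        idx.swap(i, j);
    }
    // The last chunk may be shorter than `size`.
    idx.chunks(size).map(|chunk| chunk.to_vec()).collect()
}
```

Calling the function twice with the same seed gives identical batches, which is the reproducibility property the description refers to.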

pub fn from_csv<P: AsRef<Path>>(path: P) -> Result<Self>

Load a dataset from a CSV file.

The file is expected to have a header row. The last column is taken as the target y and all preceding columns as features x; use from_csv_with_target to choose a different target column.

§Errors

Returns an error if the file cannot be read or parsed, or if it has fewer than two columns.

pub fn from_csv_with_target<P: AsRef<Path>>(path: P, target: Option<usize>) -> Result<Self>

Load a dataset from a CSV file, optionally choosing which column is the target.

target is a 0-based column index; None selects the last column. All other columns become features x, preserving their header order.

§Errors

Returns an error if the file cannot be read or parsed, has fewer than two columns, or if target is out of range.
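The column split described above can be sketched on in-memory text as follows. The function split_csv and the Parsed alias are hypothetical, the parser is a naive comma split with no support for quoted fields, and String errors stand in for whatever error type the crate returns.

```rust
/// (feature_names, target_name, feature rows, targets).
/// Hypothetical alias for this sketch only.
type Parsed = (Vec<String>, String, Vec<Vec<f64>>, Vec<f64>);

/// Split headered CSV text into features and a target column.
/// `target` is a 0-based column index; `None` selects the last column.
/// Naive parser: assumes no quoted fields or embedded commas.
fn split_csv(text: &str, target: Option<usize>) -> Result<Parsed, String> {
    let mut lines = text.lines().filter(|l| !l.trim().is_empty());
    let header: Vec<String> = lines
        .next()
        .ok_or("missing header row")?
        .split(',')
        .map(|s| s.trim().to_string())
        .collect();
    let n_cols = header.len();
    if n_cols < 2 {
        return Err("need at least two columns".to_string());
    }
    let t = target.unwrap_or(n_cols - 1);
    if t >= n_cols {
        return Err(format!("target index {t} out of range for {n_cols} columns"));
    }
    let mut x = Vec::new();
    let mut y = Vec::new();
    for line in lines {
        let cells: Vec<&str> = line.split(',').collect();
        if cells.len() != n_cols {
            return Err(format!("row has {} cells, expected {}", cells.len(), n_cols));
        }
        let mut row = Vec::with_capacity(n_cols - 1);
        for (i, cell) in cells.iter().enumerate() {
            let v = cell
                .trim()
                .parse::<f64>()
                .map_err(|e| format!("bad number {cell:?}: {e}"))?;
            // The target cell goes to y; every other cell keeps header order in x.
            if i == t {
                y.push(v);
            } else {
                row.push(v);
            }
        }
        x.push(row);
    }
    let target_name = header[t].clone();
    let feature_names: Vec<String> = header
        .iter()
        .enumerate()
        .filter(|(i, _)| *i != t)
        .map(|(_, h)| h.clone())
        .collect();
    Ok((feature_names, target_name, x, y))
}
```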

pub fn from_csv_columns<P: AsRef<Path>>(path: P, features: &[usize], target: usize) -> Result<Self>

Load a dataset from a CSV file selecting an explicit subset of feature columns and a target column (all 0-based indices). Feature columns appear in the order given.

§Errors

Returns an error if the file cannot be read or parsed, has fewer than two columns, any index is out of range, or the target appears among the features.
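The index checks listed above can be illustrated on a single parsed row. The function pick_columns is a hypothetical helper, not part of the crate, and its String errors stand in for the crate's own error type.

```rust
/// Pick feature cells (in the order given) and the target cell from one row.
/// Hypothetical helper for illustration; all indices are 0-based.
fn pick_columns(
    row: &[f64],
    features: &[usize],
    target: usize,
) -> Result<(Vec<f64>, f64), String> {
    let n_cols = row.len();
    if n_cols < 2 {
        return Err("need at least two columns".to_string());
    }
    if target >= n_cols {
        return Err(format!("target index {target} out of range"));
    }
    // The target must not be reused as a feature.
    if features.contains(&target) {
        return Err(format!("target column {target} also listed as a feature"));
    }
    if let Some(&bad) = features.iter().find(|&&f| f >= n_cols) {
        return Err(format!("feature index {bad} out of range"));
    }
    // Feature order follows `features`, not the file's column order.
    let x: Vec<f64> = features.iter().map(|&f| row[f]).collect();
    Ok((x, row[target]))
}
```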

impl DataSet

pub fn to_dimensionless(&self, feature_dims: &[Dimension]) -> Result<(DataSet, Vec<Vec<i32>>)>

Reduce the features to their dimensionless Buckingham-π groups, given each feature’s Dimension. The new feature columns are the monomials ∏ xᵢ^{eᵢ} for each π-group; the target is left unchanged. The π-group exponent vectors are returned alongside.

§Errors

Returns PhopError::ShapeMismatch if feature_dims.len() != n_vars, or PhopError::NotConverged if the inputs are dimensionally independent (no π-groups exist).
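To make the monomial step concrete, the sketch below evaluates ∏ xᵢ^{eᵢ} for each row, given exponent vectors that are already known. It does not derive the π-groups from the dimensions (that null-space computation is not described on this page), and the function name pi_features is hypothetical.

```rust
/// Evaluate each π-group monomial ∏ xᵢ^{eᵢ} for every row of `x`.
/// `exponents` holds one integer exponent vector per π-group, with one
/// entry per original feature. Hypothetical helper for illustration only.
fn pi_features(x: &[Vec<f64>], exponents: &[Vec<i32>]) -> Vec<Vec<f64>> {
    x.iter()
        .map(|row| {
            exponents
                .iter()
                .map(|e| {
                    // Multiply the features raised to their exponents.
                    row.iter()
                        .zip(e)
                        .map(|(xi, &ei)| xi.powi(ei))
                        .product::<f64>()
                })
                .collect()
        })
        .collect()
}
```

For a pendulum with features (length, gravitational acceleration, period), the exponent vector [-1, 1, 2] gives the single dimensionless group g·T²/L, so three dimensional columns collapse into one dimensionless column.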

Trait Implementations§

Source§

impl Clone for DataSet

Source§

fn clone(&self) -> DataSet

Returns a duplicate of the value.
1.0.0 (const: unstable) · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
Source§

impl Debug for DataSet

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
where ST: ?Sized, DT: ?Sized,

Source§

impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
where ST: ?Sized, DT: ?Sized,

Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

impl<T> Pointable for T

Source§

const ALIGN: usize

The alignment of pointer.
Source§

type Init = T

The type for initializers.
Source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer. Read more
Source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer. Read more
Source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer. Read more
Source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer. Read more
Source§

impl<T> Read<Exclusive, BecauseExclusive> for T
where T: ?Sized,

Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<SS, SP> SupersetOf<SS> for SP
where SS: SubsetOf<SP>,

Source§

fn to_subset(&self) -> Option<SS>

The inverse inclusion map: attempts to construct self from the equivalent element of its superset. Read more
Source§

fn is_in_subset(&self) -> bool

Checks if self is actually part of its subset T (and can be converted to it).
Source§

fn to_subset_unchecked(&self) -> SS

Use with care! Same as self.to_subset but without any property checks. Always succeeds.
Source§

fn from_subset(element: &SS) -> SP

The inclusion map: converts self to the equivalent element of its superset.
Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V