Struct DataSet

pub struct DataSet {
    pub x: Array2<f64>,
    pub y: Array1<f64>,
    pub feature_names: Vec<String>,
    pub target_name: String,
}

A dataset of input features x (shape [n_rows, n_vars]) and targets y (shape [n_rows]).

Fields§

§x: Array2<f64>

Feature matrix, one row per observation.

§y: Array1<f64>

Target vector, one entry per observation.

§feature_names: Vec<String>

Names of the feature columns (length n_vars).

§target_name: String

Name of the target column.

Implementations§

impl DataSet

pub fn from_arrays(x: Array2<f64>, y: Array1<f64>) -> Result<Self>

Build a dataset from in-memory arrays.

§Errors

Returns PhopError::ShapeMismatch if the row counts of x and y differ.
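The row-count rule above can be illustrated with a minimal sketch that uses plain vectors in place of Array2<f64> and Array1<f64>. The names SketchError and check_rows are hypothetical, and the payload of the ShapeMismatch variant is an assumption; the crate's real error type is not shown on this page.

```rust
/// Hypothetical stand-in for the crate's error type. Only the variant named
/// in the docs is modeled, and its fields are an assumption.
#[derive(Debug, PartialEq)]
enum SketchError {
    ShapeMismatch { x_rows: usize, y_len: usize },
}

/// Row-count check in the spirit of `from_arrays`: one target per feature row.
fn check_rows(x: &[Vec<f64>], y: &[f64]) -> Result<(), SketchError> {
    if x.len() != y.len() {
        return Err(SketchError::ShapeMismatch {
            x_rows: x.len(),
            y_len: y.len(),
        });
    }
    Ok(())
}
```

Two feature rows paired with two targets pass the check; one feature row paired with two targets is rejected.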

pub fn n_vars(&self) -> usize

Number of feature variables.

pub fn len(&self) -> usize

Number of observations.

pub fn is_empty(&self) -> bool

Whether the dataset is empty.

pub fn standardized(&self) -> (DataSet, Standardizer)

Produce a z-scored copy of the dataset together with the Standardizer that maps predictions back to the original target units.

Each feature column and the target are centered and scaled to unit variance. Constant (zero-variance) columns are centered and given a unit scale, so the transform stays finite.
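As a rough illustration of the z-scoring described above, the sketch below standardizes a single column and maps a value back to original units. It is not the crate's implementation: zscore and unscale are hypothetical names, the column is assumed non-empty, and the population variance (divide by n) is an assumption, since the docs do not say whether n or n - 1 is used.

```rust
/// Z-score one non-empty column: returns (scaled values, mean, scale).
/// Assumption: population variance (divide by n).
fn zscore(col: &[f64]) -> (Vec<f64>, f64, f64) {
    let n = col.len() as f64;
    let mean = col.iter().sum::<f64>() / n;
    let var = col.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
    let std = var.sqrt();
    // Constant column: keep it centered but use a unit scale,
    // which avoids dividing by zero.
    let scale = if std > 0.0 { std } else { 1.0 };
    let scaled: Vec<f64> = col.iter().map(|v| (v - mean) / scale).collect();
    (scaled, mean, scale)
}

/// Map a value from standardized units back to the original units.
fn unscale(z: f64, mean: f64, scale: f64) -> f64 {
    z * scale + mean
}
```

Keeping the mean and scale alongside the scaled data is what makes the inverse mapping possible, which is presumably the role the returned Standardizer plays for the target.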

pub fn select(&self, rows: &[usize]) -> Result<DataSet>

Build a sub-dataset from the given row indices (used by minibatching).

§Errors

Returns PhopError::ShapeMismatch if any index is out of range.
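A minimal sketch of row selection with up-front index validation, again on plain vectors. SketchError and select_rows are hypothetical names, x and y are assumed to have the same number of rows, and the choice to preserve the order of rows and allow repeated indices is an assumption that the docs do not confirm.

```rust
/// Hypothetical stand-in for the crate's error type; the fields are assumed.
#[derive(Debug, PartialEq)]
enum SketchError {
    ShapeMismatch { index: usize, n_rows: usize },
}

/// Copy the requested rows of `x` and `y`, in the order given.
fn select_rows(
    x: &[Vec<f64>],
    y: &[f64],
    rows: &[usize],
) -> Result<(Vec<Vec<f64>>, Vec<f64>), SketchError> {
    let n_rows = y.len();
    // Validate every index before copying anything.
    if let Some(&bad) = rows.iter().find(|&&r| r >= n_rows) {
        return Err(SketchError::ShapeMismatch { index: bad, n_rows });
    }
    let xs: Vec<Vec<f64>> = rows.iter().map(|&r| x[r].clone()).collect();
    let ys: Vec<f64> = rows.iter().map(|&r| y[r]).collect();
    Ok((xs, ys))
}
```

Checking all indices first means a bad index never produces a partially built result.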

pub fn minibatches(&self, size: usize, seed: u64) -> Vec<DataSet>

Partition the rows into shuffled minibatches of at most size rows each.

The shuffle is seeded for reproducibility. Minibatching bounds per-step memory by letting the optimizer consume the data in chunks (Risk T3 mitigation). A size of 0, or any size >= the row count, yields a single batch containing all rows.
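The partitioning behavior can be sketched on row indices alone. This is an illustration under stated assumptions, not the crate's code: the docs do not name the random number generator or shuffle algorithm, so a SplitMix64 generator and a Fisher-Yates pass are swapped in here to keep the example free of external crates, and next_u64 and minibatch_indices are hypothetical names. Whether the single-batch edge case is also shuffled is not documented; this sketch leaves it in original order.

```rust
/// One SplitMix64 step: a small deterministic generator used only so the
/// sketch needs no external crates.
fn next_u64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Shuffle the row indices with a seeded Fisher-Yates pass, then cut them
/// into chunks of at most `size` indices.
fn minibatch_indices(n_rows: usize, size: usize, seed: u64) -> Vec<Vec<usize>> {
    let mut idx: Vec<usize> = (0..n_rows).collect();
    // Documented edge case: size 0 or size >= n_rows gives one batch.
    if size == 0 || size >= n_rows {
        return vec![idx];
    }
    let mut state = seed;
    for i in (1..n_rows).rev() {
        // Modulo bias is ignored for brevity.
        let j = (next_u64(&mut state) % (i as u64 + 1)) as usize;
        idx.swap(i, j);
    }
    idx.chunks(size).map(|c| c.to_vec()).collect()
}
```

Calling minibatch_indices twice with the same seed gives identical batches, and every row index appears in exactly one batch; each index list could then be passed to a row-selection step such as select.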

pub fn from_csv<P: AsRef<Path>>(path: P) -> Result<Self>

Load a dataset from a CSV file.

The file is expected to have a header row. The last column is taken as the target y and all preceding columns as features x; use from_csv_with_target to choose a different target column.

§Errors

Returns an error if the file cannot be read, parsed, or has fewer than two columns.

pub fn from_csv_with_target<P: AsRef<Path>>(path: P, target: Option<usize>) -> Result<Self>

Load a dataset from a CSV file, optionally choosing which column is the target.

target is a 0-based column index; None selects the last column. All other columns become features x, preserving their header order.

§Errors

Returns an error if the file cannot be read or parsed, has fewer than two columns, or if target is out of range.
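The column-splitting rule can be sketched on in-memory text. This is a simplified illustration, not the crate's parser: split_csv and Parsed are hypothetical names, and the sketch assumes comma-separated input without quoting, numeric cells only, and a plain String as the error type.

```rust
/// Parsed pieces: (feature names, target name, feature rows, targets).
type Parsed = (Vec<String>, String, Vec<Vec<f64>>, Vec<f64>);

/// Split header-led CSV text into features and one target column.
/// Assumptions: comma-separated, no quoting, every cell parses as f64.
fn split_csv(text: &str, target: Option<usize>) -> Result<Parsed, String> {
    let mut lines = text.lines().filter(|l| !l.trim().is_empty());
    let header: Vec<&str> = lines
        .next()
        .ok_or("empty input")?
        .split(',')
        .map(str::trim)
        .collect();
    if header.len() < 2 {
        return Err("need at least two columns".to_string());
    }
    // `None` selects the last column.
    let t = target.unwrap_or(header.len() - 1);
    if t >= header.len() {
        return Err(format!("target index {} out of range", t));
    }
    // All other columns become features, in header order.
    let names: Vec<String> = header
        .iter()
        .enumerate()
        .filter(|(i, _)| *i != t)
        .map(|(_, h)| h.to_string())
        .collect();
    let mut x: Vec<Vec<f64>> = Vec::new();
    let mut y: Vec<f64> = Vec::new();
    for line in lines {
        let row = line
            .split(',')
            .map(|c| c.trim().parse::<f64>().map_err(|e| e.to_string()))
            .collect::<Result<Vec<f64>, String>>()?;
        if row.len() != header.len() {
            return Err("row length differs from header".to_string());
        }
        y.push(row[t]);
        x.push(
            row.iter()
                .enumerate()
                .filter(|(i, _)| *i != t)
                .map(|(_, v)| *v)
                .collect(),
        );
    }
    Ok((names, header[t].to_string(), x, y))
}
```

With the header a,b,c, passing None makes c the target and a, b the features, while Some(0) makes a the target and b, c the features.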

pub fn from_csv_columns<P: AsRef<Path>>(path: P, features: &[usize], target: usize) -> Result<Self>

Load a dataset from a CSV file selecting an explicit subset of feature columns and a target column (all 0-based indices). Feature columns appear in the order given.

§Errors

Returns an error if the file cannot be read or parsed, has fewer than two columns, if any index is out of range, or if the target appears among the features.
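The index rules listed above can be sketched as a validation step followed by a per-row pick. Both check_columns and pick are hypothetical helpers, and the String error type is an assumption; the sketch only mirrors the documented conditions.

```rust
/// Validate an explicit feature/target selection against a column count.
fn check_columns(n_cols: usize, features: &[usize], target: usize) -> Result<(), String> {
    if n_cols < 2 {
        return Err("need at least two columns".to_string());
    }
    if target >= n_cols {
        return Err(format!("target index {} out of range", target));
    }
    for &f in features {
        if f >= n_cols {
            return Err(format!("feature index {} out of range", f));
        }
        if f == target {
            return Err(format!("column {} is both feature and target", f));
        }
    }
    Ok(())
}

/// Pick the selected cells out of one parsed row.
/// Features keep the order in which their indices were given.
fn pick(row: &[f64], features: &[usize], target: usize) -> (Vec<f64>, f64) {
    (features.iter().map(|&f| row[f]).collect(), row[target])
}
```

For a four-column row, features [2, 0] with target 3 yield the third and first cells as features, in that order, and the fourth cell as the target.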

impl DataSet

pub fn to_dimensionless(&self, feature_dims: &[Dimension]) -> Result<(DataSet, Vec<Vec<i32>>)>

Reduce the features to their dimensionless Buckingham-π groups, given each feature’s Dimension. The new feature columns are the monomials ∏ xᵢ^{eᵢ} for each π-group; the target is left unchanged. The π-group exponent vectors are returned alongside.

§Errors

Returns PhopError::ShapeMismatch if feature_dims.len() != n_vars, or PhopError::NotConverged if the inputs are dimensionally independent (no π-groups exist).
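To make the monomial step concrete, the sketch below evaluates ∏ xᵢ^{eᵢ} for given exponent vectors and checks that a group is dimensionless. It does not derive the π-groups themselves (that requires a null-space computation on the dimension matrix). The Dim encoding as (mass, length, time) exponents and the names is_dimensionless, pi_value and to_pi_features are all assumptions, since the crate's Dimension type is not shown on this page.

```rust
/// Hypothetical encoding of a dimension as exponents of (mass, length, time).
type Dim = [i32; 3];

/// A group is dimensionless when, for every base dimension,
/// the exponent-weighted sum over the features is zero.
fn is_dimensionless(dims: &[Dim], exponents: &[i32]) -> bool {
    (0..3).all(|k| {
        dims.iter()
            .zip(exponents)
            .map(|(d, &e)| d[k] * e)
            .sum::<i32>()
            == 0
    })
}

/// Evaluate one monomial, the product of x_i raised to e_i, for a single row.
fn pi_value(row: &[f64], exponents: &[i32]) -> f64 {
    row.iter().zip(exponents).map(|(x, &e)| x.powi(e)).product()
}

/// Replace each row's features by its pi-group values, one column per group.
fn to_pi_features(x: &[Vec<f64>], groups: &[Vec<i32>]) -> Vec<Vec<f64>> {
    x.iter()
        .map(|row| groups.iter().map(|g| pi_value(row, g)).collect())
        .collect()
}
```

For a pendulum with features length, gravitational acceleration and period, the exponent vector [-1, 1, 2] gives the dimensionless group g·T²/L, so three feature columns collapse to one.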

Trait Implementations§

impl Clone for DataSet

fn clone(&self) -> DataSet

Returns a duplicate of the value.

1.0.0 (const: unstable)

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for DataSet

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

impl UnwindSafe for DataSet

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
where ST: ?Sized, DT: ?Sized,

impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
where ST: ?Sized, DT: ?Sized,

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
