pub struct DatasetSplit {
pub train: ArrowDataset,
pub test: ArrowDataset,
pub validation: Option<ArrowDataset>,
}
Dataset split with optional validation set
Fields§
train: ArrowDataset
Training dataset (required)
test: ArrowDataset
Test/holdout dataset (required)
validation: Option<ArrowDataset>
Validation dataset (optional)
Implementations§
impl DatasetSplit
pub fn new(train: ArrowDataset, test: ArrowDataset) -> Self
Create train/test split (no validation)
pub fn with_validation(train: ArrowDataset, test: ArrowDataset, validation: ArrowDataset) -> Self
Create train/test/validation split
pub fn train(&self) -> &ArrowDataset
Get training data
pub fn test(&self) -> &ArrowDataset
Get test data
pub fn validation(&self) -> Option<&ArrowDataset>
Get validation data (if present)
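The constructors and accessors above can be sketched with stand-in types. In this sketch `Rows`, `Split`, and `sizes` are hypothetical names introduced for illustration; `Rows` replaces `ArrowDataset` so the example compiles without the crate, and the bodies are an assumption about how such accessors are typically written, not the crate's actual implementation.

```rust
// Stand-in for ArrowDataset (hypothetical; the real type comes from the crate).
#[derive(Clone, Debug, PartialEq)]
pub struct Rows(pub Vec<u32>);

// Mirrors the shape of DatasetSplit: two required sets, one optional.
#[derive(Clone, Debug)]
pub struct Split {
    train: Rows,
    test: Rows,
    validation: Option<Rows>,
}

impl Split {
    // Train/test split with no validation set.
    pub fn new(train: Rows, test: Rows) -> Self {
        Self { train, test, validation: None }
    }

    // Train/test/validation split.
    pub fn with_validation(train: Rows, test: Rows, validation: Rows) -> Self {
        Self { train, test, validation: Some(validation) }
    }

    pub fn train(&self) -> &Rows {
        &self.train
    }

    pub fn test(&self) -> &Rows {
        &self.test
    }

    // Option<Rows> -> Option<&Rows>, so callers borrow instead of cloning.
    pub fn validation(&self) -> Option<&Rows> {
        self.validation.as_ref()
    }
}

// Row counts per set; the validation count is None when no set is present.
pub fn sizes(split: &Split) -> (usize, usize, Option<usize>) {
    (
        split.train().0.len(),
        split.test().0.len(),
        split.validation().map(|v| v.0.len()),
    )
}
```

Because `validation()` returns an `Option`, callers have to handle the missing case explicitly, for example with `if let Some(val) = split.validation() { ... }`.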
pub fn from_ratios(dataset: &ArrowDataset, train_ratio: f64, test_ratio: f64, val_ratio: Option<f64>, seed: Option<u64>) -> Result<Self>
Split dataset by ratios
§Arguments
dataset - Source dataset to split
train_ratio - Fraction for training (0.0 to 1.0)
test_ratio - Fraction for testing (0.0 to 1.0)
val_ratio - Optional fraction for validation
seed - Optional random seed for shuffling
§Errors
Returns an error if the ratios do not sum to 1.0 or the dataset is empty.
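A ratio-based split of this kind can be sketched on row indices alone. Everything in this sketch is an assumption made for illustration: `split_indices`, `IndexSplit`, and `SplitError` are hypothetical names, the tolerance of 1e-9 on the ratio sum, the rounding rule, the SplitMix64 shuffle, and the choice to skip shuffling when no seed is given are all guesses, and the crate's actual behaviour may differ on each point.

```rust
#[derive(Debug, PartialEq)]
pub enum SplitError {
    Empty,
    BadRatios,
}

#[derive(Debug, PartialEq)]
pub struct IndexSplit {
    pub train: Vec<usize>,
    pub test: Vec<usize>,
    pub validation: Option<Vec<usize>>,
}

// One step of the SplitMix64 generator (std-only, deterministic).
fn next_u64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

pub fn split_indices(
    n: usize,
    train_ratio: f64,
    test_ratio: f64,
    val_ratio: Option<f64>,
    seed: Option<u64>,
) -> Result<IndexSplit, SplitError> {
    if n == 0 {
        return Err(SplitError::Empty);
    }
    // Each ratio must lie in [0, 1] and together they must sum to 1.0.
    let ratios = [train_ratio, test_ratio, val_ratio.unwrap_or(0.0)];
    let in_range = ratios.iter().all(|r| (0.0..=1.0).contains(r));
    let total: f64 = ratios.iter().sum();
    if !in_range || (total - 1.0).abs() > 1e-9 {
        return Err(SplitError::BadRatios);
    }

    // Fisher-Yates shuffle, applied only when a seed is supplied (assumption).
    let mut idx: Vec<usize> = (0..n).collect();
    if let Some(seed) = seed {
        let mut state = seed;
        for i in (1..n).rev() {
            let j = (next_u64(&mut state) % (i as u64 + 1)) as usize;
            idx.swap(i, j);
        }
    }

    // Train and validation sizes are rounded; test takes the remainder.
    let n_train = ((n as f64 * train_ratio).round() as usize).min(n);
    let n_val = val_ratio
        .map(|r| ((n as f64 * r).round() as usize).min(n - n_train))
        .unwrap_or(0);

    let mut rest = idx.split_off(n_train); // idx now holds the train rows
    let test = rest.split_off(n_val); // rest now holds the validation rows
    Ok(IndexSplit {
        train: idx,
        test,
        validation: val_ratio.map(|_| rest),
    })
}
```

Giving the remainder to the test set guarantees that every row lands in exactly one partition, even when rounding makes the individual counts inexact.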
pub fn stratified(dataset: &ArrowDataset, label_column: &str, train_ratio: f64, test_ratio: f64, val_ratio: Option<f64>, seed: Option<u64>) -> Result<Self>
Stratified split preserving label distribution
§Arguments
dataset - Source dataset to split
label_column - Name of the label/target column
train_ratio - Fraction for training
test_ratio - Fraction for testing
val_ratio - Optional fraction for validation
seed - Optional random seed
§Errors
Returns an error if the label column is not found or the ratios are invalid.
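The idea behind a stratified split is to partition each label group separately, so that every output set keeps roughly the same label proportions as the source. The sketch below shows this on a plain slice of labels. `stratified_indices` is a hypothetical helper, not part of the crate; it omits the column lookup, the validation set, and seeded shuffling, and its rounding rule and error messages are assumptions.

```rust
use std::collections::BTreeMap;

// Returns (train_indices, test_indices), splitting each label group by ratio.
pub fn stratified_indices(
    labels: &[&str],
    train_ratio: f64,
    test_ratio: f64,
) -> Result<(Vec<usize>, Vec<usize>), String> {
    if labels.is_empty() {
        return Err("dataset is empty".to_string());
    }
    // Written with positive comparisons so NaN ratios are rejected as well.
    let valid = train_ratio >= 0.0
        && test_ratio >= 0.0
        && (train_ratio + test_ratio - 1.0).abs() <= 1e-9;
    if !valid {
        return Err("ratios must be non-negative and sum to 1.0".to_string());
    }

    // Group row indices by label; BTreeMap keeps the group order deterministic.
    let mut groups: BTreeMap<&str, Vec<usize>> = BTreeMap::new();
    for (i, label) in labels.iter().enumerate() {
        groups.entry(*label).or_default().push(i);
    }

    // Split every group on its own, then concatenate the pieces.
    let (mut train, mut test) = (Vec::new(), Vec::new());
    for rows in groups.values() {
        let n_train = ((rows.len() as f64 * train_ratio).round() as usize).min(rows.len());
        train.extend_from_slice(&rows[..n_train]);
        test.extend_from_slice(&rows[n_train..]);
    }
    Ok((train, test))
}
```

With labels `["a", "b", "a", "b", "a", "a"]` and a 0.5/0.5 split, group `a` contributes two rows to each side and group `b` contributes one, so both sides keep the 2:1 ratio of `a` to `b`.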
Trait Implementations§
impl Clone for DatasetSplit
fn clone(&self) -> DatasetSplit
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations§
impl Freeze for DatasetSplit
impl !RefUnwindSafe for DatasetSplit
impl Send for DatasetSplit
impl Sync for DatasetSplit
impl Unpin for DatasetSplit
impl UnsafeUnpin for DatasetSplit
impl !UnwindSafe for DatasetSplit
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T where T: Clone
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
Creates a shared type from an unshared type.