Module transform

Data transforms for alimentar.

Transforms apply operations to RecordBatches, enabling data preprocessing pipelines. All transforms are composable and can be chained together.
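
To illustrate what composable, chainable transforms look like, here is a minimal sketch that does not use alimentar or Arrow at all. `Batch` is a toy stand-in for a RecordBatch, and the `Transform` trait shape, `SkipRows`, `TakeRows`, `FilterRows`, `ChainOf`, and `example_pipeline` are all hypothetical names chosen for this example; the crate's real signatures may differ.

```rust
// Toy stand-in for an Arrow RecordBatch: a single column of integers.
// (Hypothetical type; alimentar operates on real RecordBatches.)
#[derive(Debug, Clone, PartialEq)]
struct Batch {
    values: Vec<i64>,
}

// Assumed trait shape: one batch in, one batch (or an error) out.
trait Transform {
    fn apply(&self, batch: Batch) -> Result<Batch, String>;
}

// Drops the first `n` rows, in the spirit of `Skip`.
struct SkipRows(usize);

impl Transform for SkipRows {
    fn apply(&self, batch: Batch) -> Result<Batch, String> {
        Ok(Batch { values: batch.values.into_iter().skip(self.0).collect() })
    }
}

// Keeps only the first `n` rows, in the spirit of `Take`.
struct TakeRows(usize);

impl Transform for TakeRows {
    fn apply(&self, batch: Batch) -> Result<Batch, String> {
        Ok(Batch { values: batch.values.into_iter().take(self.0).collect() })
    }
}

// Keeps rows for which the predicate returns true, in the spirit of `Filter`.
struct FilterRows<F: Fn(i64) -> bool>(F);

impl<F: Fn(i64) -> bool> Transform for FilterRows<F> {
    fn apply(&self, batch: Batch) -> Result<Batch, String> {
        Ok(Batch { values: batch.values.into_iter().filter(|v| (self.0)(*v)).collect() })
    }
}

// Runs its steps in order and stops at the first error.
// Because the chain implements `Transform` itself, chains can be nested.
struct ChainOf {
    steps: Vec<Box<dyn Transform>>,
}

impl Transform for ChainOf {
    fn apply(&self, batch: Batch) -> Result<Batch, String> {
        self.steps.iter().try_fold(batch, |acc, step| step.apply(acc))
    }
}

// Example pipeline: skip one row, keep even values, then keep two rows.
fn example_pipeline() -> ChainOf {
    ChainOf {
        steps: vec![
            Box::new(SkipRows(1)),
            Box::new(FilterRows(|v: i64| v % 2 == 0)),
            Box::new(TakeRows(2)),
        ],
    }
}
```

Having the chain implement the same trait as its steps is what makes the pipeline composable: a caller that accepts any transform also accepts a whole pipeline.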

Structs

Cast
A transform that casts columns to different data types.
Chain
A chain of transforms applied in sequence.
Drop
A transform that drops (removes) specified columns from a RecordBatch.
FillNull
A transform that fills null values in specified columns.
Filter
A transform that filters rows based on a predicate.
Fim
Fill-in-the-Middle transform for code training data.
FimTokens
Configuration for FIM sentinel tokens.
Map
A transform that applies a function to each RecordBatch.
Normalize
A transform that normalizes numeric columns.
Rename
A transform that renames columns in a RecordBatch.
Sample
A transform that randomly samples rows from a RecordBatch.
Select
A transform that selects specific columns from a RecordBatch.
Shuffle
A transform that shuffles rows in a RecordBatch.
Skip
A transform that skips the first N rows from a RecordBatch.
Sort
A transform that sorts rows by one or more columns.
Take
A transform that takes the first N rows from a RecordBatch.
Unique
A transform that removes duplicate rows based on specified columns.
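
For readers unfamiliar with Fill-in-the-Middle, the sketch below shows the general idea: a document is split into prefix, middle, and suffix, and the middle is moved to the end behind sentinel tokens so a left-to-right model learns to infill. Everything here is an assumption for illustration: the `Sentinels` fields, the `Layout` variants (two orderings commonly described for FIM data), and the `to_fim` function are hypothetical and are not taken from alimentar's `Fim`, `FimTokens`, or `FimFormat`.

```rust
// Hypothetical sentinel-token configuration (field names are assumptions).
struct Sentinels {
    prefix: String,
    suffix: String,
    middle: String,
}

// Two orderings commonly described for FIM training data (assumed variants).
enum Layout {
    Psm, // prefix, suffix, middle
    Spm, // suffix, prefix, middle
}

// Splits `text` at byte offsets `a <= b` and re-emits it with the middle span
// moved to the end. Returns `None` if the offsets are out of range, reversed,
// or do not fall on UTF-8 character boundaries.
fn to_fim(text: &str, a: usize, b: usize, tokens: &Sentinels, layout: Layout) -> Option<String> {
    if a > b || b > text.len() || !text.is_char_boundary(a) || !text.is_char_boundary(b) {
        return None;
    }
    let (prefix, rest) = text.split_at(a);
    let (middle, suffix) = rest.split_at(b - a);
    let out = match layout {
        Layout::Psm => format!(
            "{}{}{}{}{}{}",
            tokens.prefix, prefix, tokens.suffix, suffix, tokens.middle, middle
        ),
        Layout::Spm => format!(
            "{}{}{}{}{}{}",
            tokens.suffix, suffix, tokens.prefix, prefix, tokens.middle, middle
        ),
    };
    Some(out)
}
```

The split offsets are explicit parameters here to keep the example deterministic; FIM pipelines typically choose them at random for each document.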

Enums

FillStrategy
Strategy for filling null values.
FimFormat
FIM format variant.
NormMethod
Normalization method for numeric columns.
SortOrder
Sort order for the Sort transform.
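
As a rough illustration of what a normalization method selects between, the sketch below implements two widely used schemes, min-max scaling and z-score standardization, over a plain slice of `f64`. The `Method` enum and `normalize` function are hypothetical; the variants actually offered by `NormMethod`, and how it treats nulls or constant columns, are not stated here and may differ.

```rust
// Hypothetical set of methods; alimentar's `NormMethod` may offer others.
enum Method {
    MinMax, // rescale to the range [0, 1]
    ZScore, // shift to mean 0 and scale to standard deviation 1
}

// Normalizes `values` with the chosen method.
// A constant column (zero range or zero deviation) maps to all zeros,
// which is a choice made for this sketch to avoid dividing by zero.
fn normalize(values: &[f64], method: Method) -> Vec<f64> {
    if values.is_empty() {
        return Vec::new();
    }
    match method {
        Method::MinMax => {
            let min = values.iter().cloned().fold(f64::INFINITY, f64::min);
            let max = values.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
            let range = max - min;
            values
                .iter()
                .map(|v| if range == 0.0 { 0.0 } else { (v - min) / range })
                .collect()
        }
        Method::ZScore => {
            let n = values.len() as f64;
            let mean = values.iter().sum::<f64>() / n;
            // Population variance (divides by n, not n - 1).
            let var = values.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
            let std = var.sqrt();
            values
                .iter()
                .map(|v| if std == 0.0 { 0.0 } else { (v - mean) / std })
                .collect()
        }
    }
}
```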

Traits

Transform
A transform that can be applied to RecordBatches.