Data transforms for alimentar.
Transforms apply operations to RecordBatches, enabling data preprocessing pipelines. All transforms are composable and can be chained together.
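As a rough illustration of what chaining means here, the sketch below composes three small transforms into one pipeline. It does not use alimentar's actual API: `Batch` is a plain `Vec<i64>` standing in for a RecordBatch, and the trait signature, the `then` builder method, and the tuple-struct constructors are all assumptions made for this example.

```rust
/// Stand-in for a RecordBatch: a single column of integers (assumption).
type Batch = Vec<i64>;

/// Hypothetical trait shape; the crate's real `Transform` trait may differ.
trait Transform {
    fn apply(&self, batch: Batch) -> Result<Batch, String>;
}

/// Keeps the first `n` rows.
struct Take(usize);
impl Transform for Take {
    fn apply(&self, batch: Batch) -> Result<Batch, String> {
        Ok(batch.into_iter().take(self.0).collect())
    }
}

/// Drops the first `n` rows.
struct Skip(usize);
impl Transform for Skip {
    fn apply(&self, batch: Batch) -> Result<Batch, String> {
        Ok(batch.into_iter().skip(self.0).collect())
    }
}

/// Keeps only the rows for which the predicate returns true.
struct Filter<F: Fn(&i64) -> bool>(F);
impl<F: Fn(&i64) -> bool> Transform for Filter<F> {
    fn apply(&self, batch: Batch) -> Result<Batch, String> {
        Ok(batch.into_iter().filter(|v| (self.0)(v)).collect())
    }
}

/// Applies its steps in order, feeding each output into the next step.
struct Chain {
    steps: Vec<Box<dyn Transform>>,
}

impl Chain {
    fn new() -> Self {
        Chain { steps: Vec::new() }
    }

    /// Hypothetical builder method that appends a step to the pipeline.
    fn then<T: Transform + 'static>(mut self, step: T) -> Self {
        self.steps.push(Box::new(step));
        self
    }
}

/// A chain is itself a transform, so chains can be nested inside chains.
impl Transform for Chain {
    fn apply(&self, batch: Batch) -> Result<Batch, String> {
        // Stops at the first step that returns an error.
        self.steps.iter().try_fold(batch, |b, step| step.apply(b))
    }
}
```

Under these assumptions, `Chain::new().then(Skip(1)).then(Filter(|v: &i64| v % 2 == 0)).then(Take(2))` applied to the rows 1 through 6 skips the first row, keeps the even values, and then takes the first two of what remains.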
Structs§
- Cast
- A transform that casts columns to different data types.
- Chain
- A chain of transforms applied in sequence.
- Drop
- A transform that drops (removes) specified columns from a RecordBatch.
- FillNull
- A transform that fills null values in specified columns.
- Filter
- A transform that filters rows based on a predicate.
- Fim
- Fill-in-the-Middle transform for code training data.
- FimTokens
- Configuration for FIM sentinel tokens.
- Map
- A transform that applies a function to each RecordBatch.
- Normalize
- A transform that normalizes numeric columns.
- Rename
- A transform that renames columns in a RecordBatch.
- Sample
- A transform that randomly samples rows from a RecordBatch.
- Select
- A transform that selects specific columns from a RecordBatch.
- Shuffle
- A transform that shuffles rows in a RecordBatch.
- Skip
- A transform that skips the first N rows from a RecordBatch.
- Sort
- A transform that sorts rows by one or more columns.
- Take
- A transform that takes the first N rows from a RecordBatch.
- Unique
- A transform that removes duplicate rows based on specified columns.
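For readers unfamiliar with Fill-in-the-Middle, the sketch below shows the general idea behind the commonly used prefix-suffix-middle (PSM) layout: a document is cut at two points and rearranged so that the middle span comes last, with sentinel tokens marking each part. This is not the crate's `Fim` implementation; the `Sentinels` struct, its field names, and the `fim_psm` function are hypothetical, and the sentinel strings actually used by `FimTokens` may differ.

```rust
/// Hypothetical sentinel configuration; not the crate's `FimTokens`.
struct Sentinels {
    prefix: String,
    suffix: String,
    middle: String,
}

/// Rearranges `text` into PSM order, treating bytes `start..end` as the middle.
/// Returns `None` if the range is invalid or does not fall on char boundaries.
fn fim_psm(text: &str, start: usize, end: usize, tokens: &Sentinels) -> Option<String> {
    if start > end || !text.is_char_boundary(start) || !text.is_char_boundary(end) {
        return None;
    }
    let (prefix, rest) = text.split_at(start);
    let (middle, suffix) = rest.split_at(end - start);
    Some(format!(
        "{}{}{}{}{}{}",
        tokens.prefix, prefix, tokens.suffix, suffix, tokens.middle, middle
    ))
}
```

A model trained on such examples sees the prefix and suffix first and learns to generate the middle. In practice the split points are usually chosen at random per document, which this sketch leaves to the caller.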
Enums§
- FillStrategy
- Strategy for filling null values.
- FimFormat
- FIM format variant.
- NormMethod
- Normalization method for numeric columns.
- SortOrder
- Sort order for the Sort transform.
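To make "normalization method" concrete, here is a minimal sketch of two widely used methods, min-max scaling and z-score standardization, over a plain slice of `f64`. The `Method` enum and `normalize` function are hypothetical names for this example; this page does not list the variants of `NormMethod`, so they may not match.

```rust
/// Hypothetical method selector; the crate's enum variants may differ.
enum Method {
    /// Rescales values linearly into the range [0, 1].
    MinMax,
    /// Centers values on the mean and divides by the population standard deviation.
    ZScore,
}

/// Normalizes `values` with the chosen method.
/// A constant column maps to all zeros to avoid dividing by zero.
fn normalize(values: &[f64], method: Method) -> Vec<f64> {
    if values.is_empty() {
        return Vec::new();
    }
    match method {
        Method::MinMax => {
            let min = values.iter().cloned().fold(f64::INFINITY, f64::min);
            let max = values.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
            let range = max - min;
            values
                .iter()
                .map(|v| if range == 0.0 { 0.0 } else { (v - min) / range })
                .collect()
        }
        Method::ZScore => {
            let n = values.len() as f64;
            let mean = values.iter().sum::<f64>() / n;
            let variance = values.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
            let std_dev = variance.sqrt();
            values
                .iter()
                .map(|v| if std_dev == 0.0 { 0.0 } else { (v - mean) / std_dev })
                .collect()
        }
    }
}
```

Min-max scaling preserves the shape of the distribution but is sensitive to outliers, while z-score standardization yields zero mean and unit variance; which one fits depends on the downstream model.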
Traits§
- Transform
- A transform that can be applied to RecordBatches.