

Struct ColumnFrame 

Source
pub struct ColumnFrame {
    pub index: KeyIndex,
    pub data_frame: Vec<TypedDataArray>,
}

Column-oriented in-memory store for candidate data.

Each column is a TypedData — a tagged union over the supported primitives (bool, u8/u32/u64, i32/i64, f32/f64, String) with a Generic fallback for heterogeneous values. Columns are looked up by Key through the embedded KeyIndex.

Most operations work directly against the typed columns (zero-copy where possible). Use ColumnFrame::select to materialize a row-major Array2<DataValue> when you need a 2-D view.
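To make the column-oriented layout concrete, here is a minimal standalone sketch. The types `Value`, `Column`, and `Frame` are hypothetical stand-ins invented for illustration; they are not the crate's `DataValue`, `TypedData`, or `ColumnFrame`, and the real storage may differ.

```rust
// Hypothetical stand-ins for illustration; NOT the crate's real types.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    I32(i32),
    F64(f64),
    Str(String),
}

/// One column: natively typed storage, or a heterogeneous fallback.
#[derive(Debug, Clone, PartialEq)]
enum Column {
    I32(Vec<i32>),
    F64(Vec<f64>),
    Generic(Vec<Value>),
}

impl Column {
    fn len(&self) -> usize {
        match self {
            Column::I32(v) => v.len(),
            Column::F64(v) => v.len(),
            Column::Generic(v) => v.len(),
        }
    }

    /// Builds an owned value for one cell; out-of-range rows yield Null.
    fn value_at(&self, row: usize) -> Value {
        match self {
            Column::I32(v) => v.get(row).map_or(Value::Null, |x| Value::I32(*x)),
            Column::F64(v) => v.get(row).map_or(Value::Null, |x| Value::F64(*x)),
            Column::Generic(v) => v.get(row).cloned().unwrap_or(Value::Null),
        }
    }
}

/// Column-oriented frame: names[i] labels columns[i].
struct Frame {
    names: Vec<String>,
    columns: Vec<Column>,
}

impl Frame {
    /// Zero-copy access: borrows the typed column.
    fn column(&self, name: &str) -> Option<&Column> {
        let pos = self.names.iter().position(|n| n == name)?;
        self.columns.get(pos)
    }

    /// Materializes rows: each outer Vec is one row, one cell per column.
    fn rows(&self) -> Vec<Vec<Value>> {
        let nrows = self.columns.first().map_or(0, Column::len);
        (0..nrows)
            .map(|r| self.columns.iter().map(|c| c.value_at(r)).collect())
            .collect()
    }
}
```

Borrowing a column costs nothing, while building rows allocates one owned value per cell. That is the trade-off the paragraph above describes.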

Fields§

§index: KeyIndex

§data_frame: Vec<TypedDataArray>

Implementations§

Source§

impl ColumnFrame

Source

pub fn new<K, V>(index: K, data_frame: Vec<V>) -> Self

Creates a new ColumnFrame from a column index and column-oriented data.

Each element of data_frame becomes one column and must convert into a TypedData. The number of entries must match the number of keys in index.

Accepted column shapes out of the box include Vec<DataValue>, Vec<T> for every supported primitive, Array1<DataValue>, and raw TypedData values.

The column data is stored as-is — no coercion between the value type and the key’s declared dtype happens here. Use ColumnFrame::new_coerced or Self::try_fix_dtype if you want the column promoted to the key’s dtype.

Source

pub fn new_coerced<K, V>(index: K, data_frame: Vec<V>) -> Self

Creates a new ColumnFrame and coerces each column to the dtype declared by its matching Key.

This is a convenience constructor for the common case where the input data is generic (Vec<DataValue>) but the keys carry authoritative dtypes.
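The difference between storing a column as-is and coercing it to a declared dtype can be sketched as follows. `Value`, `DType`, `Column`, and `coerce` are hypothetical names; the fallback policy shown (stay generic when any value does not fit) is an assumption, not the crate's documented behavior.

```rust
// Hypothetical stand-ins for illustration; NOT the crate's real types.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    I64(i64),
    F64(f64),
    Str(String),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum DType {
    I64,
    F64,
    Unknown,
}

#[derive(Debug, Clone, PartialEq)]
enum Column {
    I64(Vec<i64>),
    F64(Vec<f64>),
    Generic(Vec<Value>),
}

/// Promotes generic values to the declared dtype.
/// Assumption: if any value does not fit, the column stays generic.
fn coerce(values: Vec<Value>, declared: DType) -> Column {
    match declared {
        DType::I64 => {
            let typed: Option<Vec<i64>> = values
                .iter()
                .map(|v| match v {
                    Value::I64(x) => Some(*x),
                    _ => None,
                })
                .collect();
            match typed {
                Some(t) => Column::I64(t),
                None => Column::Generic(values),
            }
        }
        DType::F64 => {
            let typed: Option<Vec<f64>> = values
                .iter()
                .map(|v| match v {
                    Value::F64(x) => Some(*x),
                    Value::I64(x) => Some(*x as f64), // widen integers
                    _ => None,
                })
                .collect();
            match typed {
                Some(t) => Column::F64(t),
                None => Column::Generic(values),
            }
        }
        // No declared dtype: keep the data exactly as given.
        DType::Unknown => Column::Generic(values),
    }
}
```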

Source

pub fn keys(&self) -> &[Key]

Returns the ordered slice of column Keys in this frame.

Source

pub fn nrows(&self) -> usize

Returns the number of rows in this frame.

Source

pub fn ncolumns(&self) -> usize

Returns the number of columns in this frame.

Source

pub fn is_empty(&self) -> bool

Returns true if the frame contains no rows.

Source

pub fn shrink(&mut self)

Compacts internal storage to reduce memory usage after mutations that may have left over-allocated capacity.

Source

pub fn try_fix_dtype_for_keys(&mut self, force: bool) -> Result<(), Error>

Tries to fix the dtype of each key based on the data stored in its column. If a key's dtype is crate::DataType::Unknown, it is replaced with the dtype detected from the column's DataValues. NOTE: when force is true, the detected dtype is applied even if the key's dtype is already known.

Source

pub fn try_fix_dtype(&mut self) -> Result<(), Error>

Attempts to fix the dtype of every column by inspecting stored values.

Columns whose data does not match their declared dtype are cast in-place. Returns Err with a list of (key, message) pairs for columns that failed casting.

Source

pub fn get_column(&self, key: &Key) -> Result<&TypedDataArray, Error>

Returns a shared reference to the column identified by key.

Source

pub fn get_column_mut( &mut self, key: &Key, ) -> Result<&mut TypedDataArray, Error>

Returns a mutable reference to the column identified by key.

Source

pub fn get_column_by_idx(&self, idx: usize) -> Result<&TypedDataArray, Error>

Returns a shared reference to the column at positional idx.

Source

pub fn get_column_by_idx_mut( &mut self, idx: usize, ) -> Result<&mut TypedDataArray, Error>

Returns a mutable reference to the column at positional idx.

Source

pub fn get_row(&self, idx: usize) -> Result<Vec<DataValue>, Error>

Returns a single row as a Vec<DataValue>, one element per column.

Source

pub fn try_fix_column_by_key(&mut self, key: &Key) -> Result<(), Error>

Casts all values in the column identified by key to match the key’s declared dtype.

Returns an error if the key is not found in the frame.

Source

pub fn enforce_dtype_for_column( &mut self, key: &str, dtype: DataType, ) -> Result<(), Error>

Forces all values in the named column to the specified dtype, updating both the stored data and the column’s Key metadata.

Returns an error if the column name does not exist.

Source

pub fn rename_key(&mut self, old: &str, new: Key) -> Result<(), Error>

Renames a column from old (matched by name) to new.

The new Key can carry a different DataType, which effectively re-declares the column’s type without casting existing values.

Source

pub fn add_alias(&mut self, key: &str, alias: &str) -> Result<(), Error>

Registers alias as an alternative name for the column named key.

After aliasing, select and friends accept both the original name and the alias.
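One way to picture aliasing is a name lookup in which several names map to the same column position. The sketch below uses a hypothetical `AliasIndex` type; how the crate's KeyIndex stores aliases, and whether it rejects name collisions, is not stated here and is assumed.

```rust
use std::collections::HashMap;

/// Hypothetical name index; NOT the crate's KeyIndex.
struct AliasIndex {
    names: Vec<String>,             // canonical names, in column order
    lookup: HashMap<String, usize>, // canonical names and aliases -> position
}

impl AliasIndex {
    fn new(names: &[&str]) -> Self {
        let names: Vec<String> = names.iter().map(|n| n.to_string()).collect();
        let lookup: HashMap<String, usize> = names
            .iter()
            .cloned()
            .enumerate()
            .map(|(i, n)| (n, i))
            .collect();
        AliasIndex { names, lookup }
    }

    /// Registers `alias` for the column currently reachable as `key`.
    /// Assumption: an alias that collides with an existing name is rejected.
    fn add_alias(&mut self, key: &str, alias: &str) -> Result<(), String> {
        let pos = *self
            .lookup
            .get(key)
            .ok_or_else(|| format!("column not found: {key}"))?;
        if self.lookup.contains_key(alias) {
            return Err(format!("name already in use: {alias}"));
        }
        self.lookup.insert(alias.to_string(), pos);
        Ok(())
    }

    /// Resolves either a canonical name or an alias to a column position.
    fn position(&self, name: &str) -> Option<usize> {
        self.lookup.get(name).copied()
    }

    /// Returns the canonical name behind any accepted name.
    fn canonical(&self, name: &str) -> Option<&str> {
        self.position(name).map(|pos| self.names[pos].as_str())
    }
}
```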

Source

pub fn select_transposed_typed<D: Extract>(&self, keys: &[Key]) -> Vec<Vec<D>>

Returns the requested columns as Vec<Vec<D>> in row-major order — each outer Vec is one row, each inner Vec holds one cell per selected key. Values are coerced to D via Extract.

Despite the name, the result is not transposed. Returns an empty Vec if no key resolves.

Source

pub fn select_transposed( &self, keys: Option<&[Key]>, ) -> Result<Array2<DataValue>, Error>

Selects columns and stacks them into a 2-D array of shape (ncols, nrows) — i.e. each row of the output is one column from the frame.

As the name suggests, this method returns the column-as-row layout; Self::select returns the row-major (nrows, ncols) form.

When keys is None, every column from the KeyIndex is included. Returns an empty Array2 if no key resolves.
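The two layouts differ only in which axis comes first. The sketch below shows both with plain nested `Vec`s in place of `Array2`; the helper names are hypothetical and all columns are assumed to have equal length.

```rust
/// Column-as-row layout: result[c][r], shape (ncols, nrows).
fn columns_as_rows(columns: &[Vec<i32>]) -> Vec<Vec<i32>> {
    columns.to_vec()
}

/// Row-major layout: result[r][c], shape (nrows, ncols).
/// Assumes every column has the same length.
fn row_major(columns: &[Vec<i32>]) -> Vec<Vec<i32>> {
    let nrows = columns.first().map_or(0, Vec::len);
    (0..nrows)
        .map(|r| columns.iter().map(|col| col[r]).collect())
        .collect()
}
```

For two columns of three values each, `columns_as_rows` yields a 2 x 3 result and `row_major` a 3 x 2 result, with `t[c][r] == r_major[r][c]` for every cell.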

Source

pub fn select_column(&self, key: &Key) -> Option<Array1<DataValue>>

👎Deprecated: allocates O(n); use get_column() for zero-copy typed access

Selects a whole column from the ColumnFrame by the given key as an owned Array1<DataValue>.

Returns None if the key is missing. Typed columns are materialized on the fly; for zero-copy access to typed data, use Self::get_column to obtain the TypedDataArray directly.

Source

pub fn apply_function<F>(&mut self, keys: &[Key], func: F) -> Result<(), Error>
where F: FnMut(&[Key], &mut ColumnFrame) -> Result<(), Error>,

Applies a user-defined closure to this frame.

The closure receives the provided keys and a mutable reference to the frame, allowing arbitrary in-place transformations.

Source

pub fn validate_entry_access( &self, column: &Key, row_index: usize, ) -> Result<usize, Error>

Validates that (column, row_index) is in range and returns the corresponding column position.

§Errors
Source

pub fn get_by_row_index( &self, column: &Key, row_index: usize, ) -> Option<DataValue>

Returns an owned DataValue for the given column and row index, or None if either is out of range.

Typed columns allocate a DataValue on the fly. For zero-copy typed access see Self::get_column + TypedData::as_slice_i32 (and its siblings for other primitive types).

Source

pub fn set_by_row_index( &mut self, column: &Key, row_index: usize, value: DataValue, ) -> Result<(), Error>

Sets the value at (column, row_index), coercing value to the column’s dtype via Extract.

Returns Error::NotFound if the column is missing or Error::IndexOutOfRange if the row index is out of range.
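The validate-then-write flow can be sketched as below. `AccessError` and `Frame` are hypothetical; only the two failure cases (missing column, row out of range) come from the description above, and the error payloads are invented for illustration.

```rust
/// Hypothetical error type mirroring the two failure cases described above.
#[derive(Debug, PartialEq)]
enum AccessError {
    NotFound(String),
    IndexOutOfRange { index: usize, len: usize },
}

/// Hypothetical frame with integer columns only.
struct Frame {
    names: Vec<String>,
    columns: Vec<Vec<i64>>,
}

impl Frame {
    /// Checks that (column, row) is addressable; returns the column position.
    fn validate(&self, column: &str, row: usize) -> Result<usize, AccessError> {
        let pos = self
            .names
            .iter()
            .position(|n| n == column)
            .ok_or_else(|| AccessError::NotFound(column.to_string()))?;
        let len = self.columns[pos].len();
        if row >= len {
            return Err(AccessError::IndexOutOfRange { index: row, len });
        }
        Ok(pos)
    }

    /// Writes one cell after validation; the frame is untouched on error.
    fn set(&mut self, column: &str, row: usize, value: i64) -> Result<(), AccessError> {
        let pos = self.validate(column, row)?;
        self.columns[pos][row] = value;
        Ok(())
    }
}
```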

Source

pub fn select_as_map( &self, keys: Option<&[Key]>, ) -> HashMap<Key, Vec<DataValue>>

Returns the requested columns as a HashMap<Key, Vec<DataValue>>.

When keys is None, every column from the KeyIndex is included. Keys absent from the frame are simply omitted from the result; an empty map is returned when no key resolves.

Source

pub fn select(&self, keys: Option<&[Key]>) -> Array2<DataValue>

Materializes the requested columns into a row-major Array2<DataValue> of shape (nrows, ncols).

When keys is None, every column from the KeyIndex is included. Keys absent from the frame produce a Null-filled column at the corresponding output position. Returns an empty Array2 if no key resolves.

This always allocates a fresh buffer: values are cloned into the output array. For zero-copy single-column access, use Self::get_column, which borrows the TypedData directly (Self::select_column is deprecated and allocates O(n)).
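The handling of absent keys can be illustrated with a small standalone function. `Cell` and this free-standing `select` are hypothetical; the store is a plain `HashMap` rather than the crate's column storage.

```rust
use std::collections::HashMap;

/// Hypothetical cell type; NOT the crate's DataValue.
#[derive(Debug, Clone, PartialEq)]
enum Cell {
    Null,
    Int(i64),
}

/// Builds a row-major (nrows, ncols) result for the requested keys.
/// Missing keys produce a Null-filled column at their output position;
/// if no key resolves at all, the result is empty.
fn select(store: &HashMap<&str, Vec<i64>>, nrows: usize, keys: &[&str]) -> Vec<Vec<Cell>> {
    if !keys.iter().any(|k| store.contains_key(k)) {
        return Vec::new();
    }
    (0..nrows)
        .map(|r| {
            keys.iter()
                .map(|k| match store.get(k) {
                    Some(col) => col.get(r).map_or(Cell::Null, |v| Cell::Int(*v)),
                    None => Cell::Null,
                })
                .collect()
        })
        .collect()
}
```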

Source

pub fn select_vec_view( &self, keys: Option<&[Key]>, ) -> Result<Vec<Option<&TypedDataArray>>, Error>

Returns selected columns as a Vec of borrowed column references.

Each entry of the returned Vec corresponds to one requested column, in the same order as keys. Missing keys yield None; present keys yield Some(&TypedData) borrowed directly from the column store. When keys is None every column is borrowed in storage order.

§Errors

Returns Error::EmptyData when keys resolves to an empty or entirely unknown key set.

§Examples
use trs_dataframe::{column_frame, Key};

let cf = column_frame! { "a" => [1i32, 2i32], "b" => [3i32, 4i32] };
let cols = cf.select_vec_view(Some(&["a".into()])).unwrap();
assert_eq!(cols.len(), 1);
assert_eq!(cols[0].as_ref().unwrap().len(), 2);
Source

pub fn select_typed_columns( &self, keys: Option<&[Key]>, ) -> Result<Vec<TypedDataArray>, Error>

Returns selected columns as a Vec<TypedDataArray>, preserving the native primitive storage and null bitmap when available.

Source

pub fn select_view(&self, keys: Option<&[Key]>) -> Result<MaybeView<'_>, Error>

Returns selected columns wrapped in a MaybeView.

Each column is materialized as an owned Array1<DataValue> and then stacked along axis 0, producing an owned Array2 of shape (ncols, nrows). When keys is None, every column from the KeyIndex is included.

Use MaybeView::row_view on the result to obtain a uniform (nrows, ncols) view regardless of which variant was returned.

§Errors

Returns Error::EmptyData when keys resolves to an empty or entirely unknown key set.

§Examples
use trs_dataframe::{column_frame, Key};

let cf = column_frame! { "x" => [10i32, 20i32], "y" => [30i32, 40i32] };
let view = cf.select_view(Some(&["x".into(), "y".into()])).unwrap();
// row_view() gives a (nrows, ncols) view — shape is [2, 2] here.
assert_eq!(view.row_view().nrows(), 2);
Source

pub fn select_typed<T: Extract + Clone>( &self, keys: Option<&[Key]>, ) -> Array2<T>

Returns selected columns as a typed 2D array, converting each DataValue via the Extract trait.

This is the typed counterpart of select. If keys is None, all columns are returned. The data is in column-major (Fortran) order internally but the shape is (nrows, ncols) — indexing and row iteration work identically to a row-major array.

§Type coercion

The Extract trait performs best-effort numeric coercion (e.g. I32 -> f64). Values that cannot be meaningfully converted yield the type’s default (0 for numbers, false for bool, empty string for String).
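The fall-back-to-default behavior can be sketched with a small trait. `ExtractLike` and `Value` are hypothetical and only mimic the description above; which conversions the real Extract trait treats as meaningful is an assumption.

```rust
// Hypothetical stand-ins for illustration; NOT the crate's DataValue / Extract.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    I32(i32),
    F64(f64),
    Str(String),
}

trait ExtractLike: Sized {
    /// Converts a value, falling back to the type's default.
    fn extract(v: &Value) -> Self;
}

impl ExtractLike for f64 {
    fn extract(v: &Value) -> Self {
        match v {
            Value::I32(x) => f64::from(*x), // lossless numeric widening
            Value::F64(x) => *x,
            _ => 0.0, // not convertible in this sketch: default
        }
    }
}

impl ExtractLike for bool {
    fn extract(v: &Value) -> Self {
        match v {
            Value::Bool(b) => *b,
            _ => false,
        }
    }
}

impl ExtractLike for String {
    fn extract(v: &Value) -> Self {
        match v {
            Value::Str(s) => s.clone(),
            _ => String::new(),
        }
    }
}

/// Converts a whole column of generic values into a typed Vec.
fn extract_column<T: ExtractLike>(values: &[Value]) -> Vec<T> {
    values.iter().map(T::extract).collect()
}
```

Because a failed conversion silently becomes the default, a 0 in the output can mean either a real zero or an unconvertible value. Check the column's dtype first when that distinction matters.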

Source

pub fn push<C: CandidateData>(&mut self, row_candidate: C) -> Result<(), Error>

Pushes the row candidate into the ColumnFrame. If a column from the candidate is not found, this method adds that column to the ColumnFrame.

This operation clones all values from the candidate. For batch operations, consider using Self::extend instead which can be more efficient.

Source

pub fn remove_column(&mut self, keys: &[Key]) -> Result<Self, Error>

Removes the specified columns from this frame and returns them as a new ColumnFrame.

The remaining columns stay in self with their data intact.

Source

pub fn extend(&mut self, other: Self) -> Result<(), Error>

Extends the ColumnFrame with the data from the other ColumnFrame. If the current KeyIndex is empty, the ColumnFrame is replaced with the other ColumnFrame. If the other KeyIndex is empty, nothing happens. If the KeyIndex of the other data frame is longer than the current one, Error::DataSetSizeDoesntMatch is returned. If a Key from the other data frame is not present in the current ColumnFrame, the KeyIndex is extended and the column is added to the current ColumnFrame.

Source

pub fn replace(&mut self, other: Self) -> Result<(), Error>

Replaces the ColumnFrame with the other ColumnFrame. If the current KeyIndex is empty, the ColumnFrame is replaced with the other ColumnFrame. If the other KeyIndex is empty, nothing happens. If the KeyIndex of the other data frame does not match the current one, Error::DataSetSizeDoesntMatch is returned. If a Key from the other data frame is not present in the current ColumnFrame, the KeyIndex is extended and the column is added to the current ColumnFrame.

Source

pub fn join_by_id_inner( &mut self, right: Self, keys: &[Key], ) -> Result<(), Error>

Joins the candidates by the keys given in the JoinRelation::JoinById struct. This function builds an index over the keys and then joins the candidates by matching key values.

Source

pub fn add_single_column<K, V>( &mut self, key: K, column: V, ) -> Result<(), Error>
where K: Into<Key>, V: Into<TypedDataArray>,

Adds a single column to the current ColumnFrame.

The column may be provided as anything that converts into a TypedData — e.g. Vec<DataValue>, a typed primitive Vec<T>, an Array1<DataValue>, or a raw TypedData. If the column’s dtype is Generic but key declares a concrete dtype, the column is coerced to match.

Errors:

Source

pub fn add_columns(&mut self, other: Self) -> Result<(), Error>

Adds the columns from the other ColumnFrame to the current ColumnFrame. If the current KeyIndex is empty, the ColumnFrame is replaced with the other ColumnFrame. If the other KeyIndex is empty, nothing happens.

Source

pub fn broadcast(&mut self, other: Self) -> Result<(), Error>

Broadcasts the data from the other ColumnFrame to the current ColumnFrame. If the current KeyIndex is empty, the ColumnFrame is replaced with the other ColumnFrame. If the other KeyIndex is empty, nothing happens. If the other data frame has more than one row, Error::CannotBroadcast is returned.
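The broadcast rules can be sketched on a toy frame. `Frame` here is hypothetical and holds integer columns only. Overwriting an existing column of the same name is an assumption, since the behavior for shared keys is not described.

```rust
/// Hypothetical frame with integer columns; NOT the crate's ColumnFrame.
#[derive(Debug, Clone, PartialEq)]
struct Frame {
    names: Vec<String>,
    columns: Vec<Vec<i64>>,
}

impl Frame {
    fn nrows(&self) -> usize {
        self.columns.first().map_or(0, Vec::len)
    }

    /// Repeats the single row of `other` across every row of `self`.
    fn broadcast(&mut self, other: Frame) -> Result<(), String> {
        if other.names.is_empty() {
            return Ok(()); // nothing to broadcast
        }
        if self.names.is_empty() {
            *self = other; // empty target: take the other frame as-is
            return Ok(());
        }
        if other.nrows() > 1 {
            return Err("cannot broadcast: other has more than one row".to_string());
        }
        let nrows = self.nrows();
        for (name, col) in other.names.into_iter().zip(other.columns) {
            // Columns without any value are skipped in this sketch.
            let Some(&value) = col.first() else { continue };
            let repeated = vec![value; nrows];
            match self.names.iter().position(|n| *n == name) {
                // Assumption: a shared name overwrites the existing column.
                Some(pos) => self.columns[pos] = repeated,
                None => {
                    self.names.push(name);
                    self.columns.push(repeated);
                }
            }
        }
        Ok(())
    }
}
```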

Source

pub fn cartesian_product(&mut self, other: Self) -> Result<(), Error>

Replaces self with the Cartesian product of self and other — every row of self paired with every row of other.

Existing column order in self is preserved; columns that exist only in other are appended at the end.
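A Cartesian product over column storage amounts to repeating each left value and tiling each right column. The sketch below uses a hypothetical `Frame`. The pairing order (left row `i` with right row `j` at output row `i * m + j`) and the skipping of shared names are assumptions.

```rust
/// Hypothetical frame with integer columns; NOT the crate's ColumnFrame.
#[derive(Debug, Clone, PartialEq)]
struct Frame {
    names: Vec<String>,
    columns: Vec<Vec<i64>>,
}

impl Frame {
    fn nrows(&self) -> usize {
        self.columns.first().map_or(0, Vec::len)
    }

    /// Replaces self with every row of self paired with every row of other.
    fn cartesian_product(&mut self, other: &Frame) {
        let (n, m) = (self.nrows(), other.nrows());
        // Left columns: each value is repeated m times in a row.
        for col in &mut self.columns {
            *col = col
                .iter()
                .flat_map(|v| std::iter::repeat(*v).take(m))
                .collect();
        }
        // Right columns: the whole column is tiled n times and appended.
        for (name, col) in other.names.iter().zip(&other.columns) {
            if self.names.contains(name) {
                continue; // assumption: shared names keep the left column
            }
            let tiled: Vec<i64> = (0..n).flat_map(|_| col.iter().copied()).collect();
            self.names.push(name.clone());
            self.columns.push(tiled);
        }
    }
}
```

The output has `n * m` rows, so memory grows multiplicatively with both inputs.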

Source

pub fn join( &mut self, right: Self, join_type: &JoinRelation, ) -> Result<(), Error>

Joins the candidates with the other candidates according to the JoinRelation policy:
For JoinBy::AddColumns, the columns are added to the existing structure via Self::add_columns.
For JoinBy::Replace, the columns are replaced with the new columns.
For JoinBy::Extend, the candidates are extended via Self::extend.
For JoinBy::Broadcast, each candidate is extended with the values of the other candidates via Self::broadcast.
For JoinBy::CartesianProduct, the candidates are multiplied by the other candidates.
For JoinBy::JoinById, the candidates are joined by the keys in the JoinRelation::JoinById struct; see Self::join_by_id_inner.

Source

pub fn get_single_column(&self, key: &Key) -> Option<Array1<DataValue>>

👎Deprecated: allocates O(n); use get_column() for zero-copy typed access

Returns a single column materialized as an owned Array1<DataValue>.

Typed columns allocate a DataValue per element on the fly. For zero-copy typed access, use Self::get_column to obtain the underlying TypedData directly.

Source

pub fn get_single_column_typed<T: Extract>( &self, key: &Key, ) -> Option<Array1<T>>

Returns a column extracted into a typed Array1<T>, where each DataValue is converted via the Extract trait. Returns None if the key does not exist in the frame.

§Type coercion

The Extract trait performs best-effort numeric coercion (e.g. I32 -> f64). Values that cannot be meaningfully converted yield the type’s default (0 for numbers, false for bool, empty string for String).

Source

pub fn sorted(&self, key: &Key) -> Result<SortedDataFrame<'_>, Error>

Returns a SortedDataFrame view sorted ascending by the given column.

Null values are pushed to the end. Ties are broken by original row order. Use SortedDataFrame::topn to efficiently extract the first/last N rows.
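The ordering rules (ascending, nulls last, ties in original row order) can be reproduced with a stable index sort. This is a minimal sketch with hypothetical helper names; it does not show how SortedDataFrame is implemented.

```rust
use std::cmp::Ordering;

/// Returns row indices sorted ascending by value, with None (null) last.
/// `sort_by` is stable, so equal values keep their original row order.
fn sorted_indices(column: &[Option<i64>]) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..column.len()).collect();
    idx.sort_by(|&a, &b| match (column[a], column[b]) {
        (Some(x), Some(y)) => x.cmp(&y),
        (Some(_), None) => Ordering::Less,
        (None, Some(_)) => Ordering::Greater,
        (None, None) => Ordering::Equal,
    });
    idx
}

/// First n entries of an already sorted index (fewer if it is shorter).
fn top_n(sorted: &[usize], n: usize) -> &[usize] {
    &sorted[..n.min(sorted.len())]
}
```

Sorting indices rather than rows leaves the column data in place, which is what makes a borrowed sorted view cheap to build.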

Source

pub fn filter(&self, filter: &FilterRules) -> Result<Self, Error>

Returns a new ColumnFrame containing only rows that match the filter expression.

The filter is applied against each row and matching row indices are collected, de-duplicated, and used to build the result.
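The collect, de-duplicate, rebuild sequence can be sketched as follows. `Frame` is hypothetical, and modeling the filter as a list of alternative row predicates is an assumption made so that a row can match more than once; FilterRules itself is not shown here.

```rust
use std::collections::BTreeSet;

/// Hypothetical frame with integer columns; NOT the crate's ColumnFrame.
#[derive(Debug, Clone, PartialEq)]
struct Frame {
    names: Vec<String>,
    columns: Vec<Vec<i64>>,
}

impl Frame {
    fn nrows(&self) -> usize {
        self.columns.first().map_or(0, Vec::len)
    }

    /// Keeps rows matching at least one rule (assumed OR semantics).
    fn filter(&self, rules: &[fn(&[i64]) -> bool]) -> Frame {
        // A BTreeSet de-duplicates indices and keeps them in row order.
        let mut keep = BTreeSet::new();
        for rule in rules {
            for r in 0..self.nrows() {
                let row: Vec<i64> = self.columns.iter().map(|c| c[r]).collect();
                if rule(&row) {
                    keep.insert(r);
                }
            }
        }
        // Rebuild every column from the surviving row indices.
        let columns = self
            .columns
            .iter()
            .map(|c| keep.iter().map(|&r| c[r]).collect())
            .collect();
        Frame { names: self.names.clone(), columns }
    }
}
```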

Trait Implementations§

Source§

impl AddAssign<HashMap<String, DataValue>> for ColumnFrame

Source§

fn add_assign(&mut self, rhs: HashMap<String, DataValue>)

Performs the += operation. Read more
Source§

impl AddAssign for ColumnFrame

Source§

fn add_assign(&mut self, rhs: Self)

Performs the += operation. Read more
Source§

impl Clone for ColumnFrame

Source§

fn clone(&self) -> ColumnFrame

Returns a duplicate of the value. Read more
1.0.0 (const: unstable) · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl Debug for ColumnFrame

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl Default for ColumnFrame

Source§

fn default() -> ColumnFrame

Returns the “default value” for a type. Read more
Source§

impl<'de> Deserialize<'de> for ColumnFrame

Custom Deserialize: accept row-major Array2<DataValue> and convert it to the column-oriented TypedData storage, promoting each column to its detected dtype where possible.

Source§

fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer. Read more
Source§

impl Display for ColumnFrame

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl DivAssign<HashMap<String, DataValue>> for ColumnFrame

Source§

fn div_assign(&mut self, rhs: HashMap<String, DataValue>)

Performs the /= operation. Read more
Source§

impl DivAssign for ColumnFrame

Source§

fn div_assign(&mut self, rhs: Self)

Performs the /= operation. Read more
Source§

impl From<ColumnFrame> for DataFrame

Source§

fn from(dataframe: ColumnFrame) -> Self

Converts to this type from the input type.
Source§

impl From<HashMap<String, ArrayBase<OwnedRepr<DataValue>, Dim<[usize; 1]>>>> for ColumnFrame

Source§

fn from(dataframe: HashMap<String, Array1<DataValue>>) -> Self

Converts to this type from the input type.
Source§

impl From<HashMap<String, Vec<DataValue>>> for ColumnFrame

NOTE: Because the iteration order of a HashMap is random, the keys are sorted to give a deterministic column order.

Source§

fn from(dataframe: HashMap<String, Vec<DataValue>>) -> Self

Converts to this type from the input type.
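The reason for sorting is that `HashMap` iteration order can change between runs, which would otherwise make column positions unpredictable. A minimal sketch of the idea, using a hypothetical helper and plain `i64` columns:

```rust
use std::collections::HashMap;

/// Turns a map of named columns into a list with a deterministic order
/// by sorting on the column name.
fn columns_in_sorted_order(map: HashMap<String, Vec<i64>>) -> Vec<(String, Vec<i64>)> {
    let mut entries: Vec<(String, Vec<i64>)> = map.into_iter().collect();
    entries.sort_by(|a, b| a.0.cmp(&b.0));
    entries
}
```

If a specific column order matters, building the frame from an ordered `Vec<(Key, Vec<DataValue>)>` avoids relying on the sort.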
Source§

impl From<SizedHashMap<SmartString<LazyCompact>, Vec<DataValue>>> for ColumnFrame

Source§

fn from(dataframe: MLChefMap) -> Self

Converts to this type from the input type.
Source§

impl From<Vec<(Key, Vec<DataValue>)>> for ColumnFrame

Source§

fn from(dataframe: Vec<(Key, Vec<DataValue>)>) -> Self

Converts to this type from the input type.
Source§

impl From<Vec<HashMap<Key, DataValue>>> for ColumnFrame

NOTE: Because the iteration order of a HashMap is random, the keys are sorted to give a deterministic column order.

Source§

fn from(dataframe: Vec<HashMap<Key, DataValue>>) -> Self

Converts to this type from the input type.
Source§

impl From<Vec<SizedHashMap<Key, DataValue>>> for ColumnFrame

NOTE: Because the iteration order of a hash map is random, the keys are sorted to give a deterministic column order.

Source§

fn from(dataframe: Vec<HashMap<Key, DataValue>>) -> Self

Converts to this type from the input type.
Source§

impl MulAssign<HashMap<String, DataValue>> for ColumnFrame

Source§

fn mul_assign(&mut self, rhs: HashMap<String, DataValue>)

Performs the *= operation. Read more
Source§

impl MulAssign for ColumnFrame

Source§

fn mul_assign(&mut self, rhs: Self)

Performs the *= operation. Read more
Source§

impl PartialEq for ColumnFrame

Source§

fn eq(&self, other: &ColumnFrame) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 (const: unstable) · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
Source§

impl Serialize for ColumnFrame

Source§

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer. Read more
Source§

impl SubAssign<HashMap<String, DataValue>> for ColumnFrame

Source§

fn sub_assign(&mut self, rhs: HashMap<String, DataValue>)

Performs the -= operation. Read more
Source§

impl SubAssign for ColumnFrame

Source§

fn sub_assign(&mut self, rhs: Self)

Performs the -= operation. Read more
Source§

impl StructuralPartialEq for ColumnFrame

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper. Read more
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T> ToString for T
where T: Display + ?Sized,

Source§

fn to_string(&self) -> String

Converts the given value to a String. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,

Source§

impl<T> Ungil for T
where T: Send,