pub struct ColumnFrame {
    pub index: KeyIndex,
    pub data_frame: Vec<TypedDataArray>,
}
Column-oriented in-memory store for candidate data.
Each column is a TypedData — a tagged union over the supported
primitives (bool, u8/u32/u64, i32/i64, f32/f64, String)
with a Generic fallback for heterogeneous values. Columns are looked up
by Key through the embedded KeyIndex.
Most operations work directly against the typed columns (zero-copy where
possible). Use ColumnFrame::select to materialize a row-major
Array2<DataValue> when you need a 2-D view.
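The column-oriented layout described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: `Cell`, `Column`, and `to_rows` are hypothetical stand-ins for `DataValue`, `TypedData`, and a row-major materialization, and the sketch assumes all columns have the same length.

```rust
/// Hypothetical stand-in for the crate's `DataValue`: one dynamically typed cell.
#[derive(Debug, Clone, PartialEq)]
enum Cell {
    I32(i32),
    F64(f64),
    Text(String),
}

/// Hypothetical stand-in for `TypedData`: each variant owns one
/// homogeneous, contiguous buffer.
#[derive(Debug, Clone, PartialEq)]
enum Column {
    I32(Vec<i32>),
    F64(Vec<f64>),
    Text(Vec<String>),
}

impl Column {
    /// Number of rows stored in this column.
    fn len(&self) -> usize {
        match self {
            Column::I32(v) => v.len(),
            Column::F64(v) => v.len(),
            Column::Text(v) => v.len(),
        }
    }

    /// Zero-copy typed access: borrows the buffer when the variant matches.
    fn as_slice_i32(&self) -> Option<&[i32]> {
        match self {
            Column::I32(v) => Some(v.as_slice()),
            _ => None,
        }
    }

    /// Builds one dynamically typed cell for the requested row (allocates for text).
    fn cell(&self, row: usize) -> Option<Cell> {
        match self {
            Column::I32(v) => v.get(row).copied().map(Cell::I32),
            Column::F64(v) => v.get(row).copied().map(Cell::F64),
            Column::Text(v) => v.get(row).cloned().map(Cell::Text),
        }
    }
}

/// Materializes a row-major table: one inner `Vec` per row.
/// Assumes every column has the same length as the first one.
fn to_rows(columns: &[Column]) -> Vec<Vec<Cell>> {
    let nrows = columns.first().map_or(0, Column::len);
    (0..nrows)
        .map(|r| columns.iter().filter_map(|c| c.cell(r)).collect())
        .collect()
}
```

Borrowing a typed slice touches no per-cell allocation, while `to_rows` clones every value, which is the trade-off the paragraph above refers to.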
Fields
index: KeyIndex
data_frame: Vec<TypedDataArray>
Implementations
impl ColumnFrame
pub fn new<K, V>(index: K, data_frame: Vec<V>) -> Self
Creates a new ColumnFrame from a column index and column-oriented
data.
Each element of data_frame becomes one column and must convert into a
TypedData. The number of entries must match the number of keys in
index.
Accepted column shapes out of the box include Vec<DataValue>,
Vec<T> for every supported primitive, Array1<DataValue>, and raw
TypedData values.
The column data is stored as-is — no coercion between the value type
and the key’s declared dtype happens here. Use
ColumnFrame::new_coerced or Self::try_fix_dtype if you want the
column promoted to the key’s dtype.
pub fn new_coerced<K, V>(index: K, data_frame: Vec<V>) -> Self
Creates a new ColumnFrame and coerces each column to the dtype
declared by its matching Key.
This is a convenience constructor for the common case where the input
data is generic (Vec<DataValue>) but the keys carry authoritative
dtypes.
pub fn shrink(&mut self)
Compacts internal storage to reduce memory usage after mutations that may have left over-allocated capacity.
pub fn try_fix_dtype_for_keys(&mut self, force: bool) -> Result<(), Error>
Tries to fix each key’s dtype based on the data stored in its column. If the dtype is crate::DataType::Unknown,
it is replaced with the dtype inferred from the column’s DataValue entries.
NOTE: setting force applies the inferred dtype even if the key’s dtype is already known.
pub fn try_fix_dtype(&mut self) -> Result<(), Error>
Attempts to fix the dtype of every column by inspecting stored values.
Columns whose data does not match their declared dtype are cast in-place.
Returns Err with a list of (key, message) pairs for columns that failed casting.
pub fn get_column(&self, key: &Key) -> Result<&TypedDataArray, Error>
Returns a shared reference to the column identified by key.
pub fn get_column_mut(&mut self, key: &Key) -> Result<&mut TypedDataArray, Error>
Returns a mutable reference to the column identified by key.
pub fn get_column_by_idx(&self, idx: usize) -> Result<&TypedDataArray, Error>
Returns a shared reference to the column at positional idx.
pub fn get_column_by_idx_mut(&mut self, idx: usize) -> Result<&mut TypedDataArray, Error>
Returns a mutable reference to the column at positional idx.
pub fn get_row(&self, idx: usize) -> Result<Vec<DataValue>, Error>
Returns a single row as a Vec<DataValue>, one element per column.
pub fn try_fix_column_by_key(&mut self, key: &Key) -> Result<(), Error>
Casts all values in the column identified by key to match the key’s declared dtype.
Returns an error if the key is not found in the frame.
pub fn enforce_dtype_for_column(&mut self, key: &str, dtype: DataType) -> Result<(), Error>
Forces all values in the named column to the specified dtype, updating
both the stored data and the column’s Key metadata.
Returns an error if the column name does not exist.
pub fn add_alias(&mut self, key: &str, alias: &str) -> Result<(), Error>
Registers alias as an alternative name for the column named key.
After aliasing, select and friends accept both the original name and the alias.
pub fn select_transposed_typed<D: Extract>(&self, keys: &[Key]) -> Vec<Vec<D>>
Returns the requested columns as Vec<Vec<D>> in row-major order —
each outer Vec is one row, each inner Vec holds one cell per
selected key. Values are coerced to D via Extract.
Despite the name, the result is not transposed. Returns an empty
Vec if no key resolves.
pub fn select_transposed(&self, keys: Option<&[Key]>) -> Result<Array2<DataValue>, Error>
Selects columns and stacks them into a 2-D array of shape
(ncols, nrows) — i.e. each row of the output is one column from the
frame.
Despite the name, this is the only method that returns the
column-as-row layout; Self::select returns the row-major
(nrows, ncols) form.
When keys is None, every column from the KeyIndex is included.
Returns an empty Array2 if no key resolves.
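The two output shapes are easy to mix up, so here is a small sketch of both layouts using nested `Vec`s instead of `Array2`. The helper names (`columns_as_rows`, `row_major`, `shape`) are hypothetical and only illustrate the shapes; they are not part of the crate, and equal-length columns are assumed.

```rust
/// Column-as-row layout: each column becomes one row of the output,
/// so the shape is (ncols, nrows).
fn columns_as_rows(columns: &[Vec<i64>]) -> Vec<Vec<i64>> {
    columns.to_vec()
}

/// Row-major layout: each output row holds one cell per column,
/// so the shape is (nrows, ncols). Assumes equal-length columns.
fn row_major(columns: &[Vec<i64>]) -> Vec<Vec<i64>> {
    let nrows = columns.first().map_or(0, Vec::len);
    (0..nrows)
        .map(|r| columns.iter().map(|c| c[r]).collect())
        .collect()
}

/// Reports (outer length, inner length) of a rectangular nested Vec.
fn shape(table: &[Vec<i64>]) -> (usize, usize) {
    (table.len(), table.first().map_or(0, Vec::len))
}
```

For a frame with two columns and three rows, the first helper yields shape (2, 3) and the second yields (3, 2).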
pub fn select_column(&self, key: &Key) -> Option<Array1<DataValue>>
Deprecated: allocates O(n); use get_column() for zero-copy typed access
Selects a whole column from the ColumnFrame by the given key as an
owned Array1<DataValue>.
Returns None if the key is missing. Typed columns are materialized on
the fly; for zero-copy access to typed data, use
Self::get_column to obtain the TypedDataArray directly.
pub fn apply_function<F>(&mut self, keys: &[Key], func: F) -> Result<(), Error>
Applies a user-defined closure to this frame.
The closure receives the provided keys and a mutable reference to the
frame, allowing arbitrary in-place transformations.
pub fn validate_entry_access(&self, column: &Key, row_index: usize) -> Result<usize, Error>
Validates that (column, row_index) is in range and returns the
corresponding column position.
Errors
Error::NotFound if column is not present in the frame.
Error::IndexOutOfRange if row_index >= nrows.
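The two error cases can be sketched as follows. This is an illustration under assumptions: `AccessError` is a hypothetical error type, columns are identified by plain string names rather than `Key`, and the order of the checks (column first, then row) is a guess rather than documented behavior.

```rust
/// Hypothetical error type mirroring the two documented failure modes.
#[derive(Debug, PartialEq)]
enum AccessError {
    NotFound(String),
    IndexOutOfRange { row_index: usize, nrows: usize },
}

/// Returns the column position if both the column and the row are valid.
fn validate_entry_access(
    column_names: &[&str],
    nrows: usize,
    column: &str,
    row_index: usize,
) -> Result<usize, AccessError> {
    // Resolve the column name to its position first.
    let position = column_names
        .iter()
        .position(|name| *name == column)
        .ok_or_else(|| AccessError::NotFound(column.to_string()))?;
    // Then check the row bound.
    if row_index >= nrows {
        return Err(AccessError::IndexOutOfRange { row_index, nrows });
    }
    Ok(position)
}
```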
pub fn get_by_row_index(&self, column: &Key, row_index: usize) -> Option<DataValue>
Returns an owned DataValue for the given column and row index, or
None if either is out of range.
Typed columns allocate a DataValue on the fly. For zero-copy typed
access see Self::get_column + TypedData::as_slice_i32 (and its
siblings for other primitive types).
pub fn set_by_row_index(&mut self, column: &Key, row_index: usize, value: DataValue) -> Result<(), Error>
Sets the value at (column, row_index), coercing value to the
column’s dtype via Extract.
Returns Error::NotFound if the column is missing or
Error::IndexOutOfRange if the row index is out of range.
pub fn select_as_map(&self, keys: Option<&[Key]>) -> HashMap<Key, Vec<DataValue>>
Returns the requested columns as a HashMap<Key, Vec<DataValue>>.
When keys is None, every column from the KeyIndex is included.
Keys absent from the frame are simply omitted from the result; an empty
map is returned when no key resolves.
pub fn select(&self, keys: Option<&[Key]>) -> Array2<DataValue>
Materializes the requested columns into a row-major
Array2<DataValue> of shape (nrows, ncols).
When keys is None, every column from the KeyIndex is included.
Keys absent from the frame produce a Null-filled column at the
corresponding output position. Returns an empty Array2 if no key
resolves.
This always allocates a fresh buffer: values are cloned into the
output array. For zero-copy single-column access, use
Self::get_column, which returns the TypedData directly
(Self::select_column is deprecated and allocates).
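The handling of missing keys can be illustrated with a small model. This is a sketch, not the crate's code: the frame is modeled as a `HashMap` of integer columns, `None` plays the role of the Null value, and `select_rows` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Builds a row-major (nrows, keys.len()) table. Keys that are absent
/// from the frame yield a column filled with `None`. If no key
/// resolves at all, the result is empty.
fn select_rows(
    frame: &HashMap<&str, Vec<i64>>,
    nrows: usize,
    keys: &[&str],
) -> Vec<Vec<Option<i64>>> {
    if !keys.iter().any(|k| frame.contains_key(k)) {
        return Vec::new();
    }
    (0..nrows)
        .map(|r| {
            keys.iter()
                // A missing key or a short column both produce `None`.
                .map(|k| frame.get(k).and_then(|col| col.get(r).copied()))
                .collect()
        })
        .collect()
}
```

Note the contrast with select_as_map, where absent keys are omitted instead of being filled.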
pub fn select_vec_view(&self, keys: Option<&[Key]>) -> Result<Vec<Option<&TypedDataArray>>, Error>
Returns the selected columns as a Vec of borrowed column references.
Each entry of the returned Vec corresponds to one requested column,
in the same order as keys. Missing keys yield None; present keys
yield Some(&TypedData) borrowed directly from the column store. When
keys is None every column is borrowed in storage order.
Errors
Returns Error::EmptyData when keys resolves to an empty or
entirely unknown key set.
Examples
use trs_dataframe::{column_frame, Key};
let cf = column_frame! { "a" => [1i32, 2i32], "b" => [3i32, 4i32] };
let cols = cf.select_vec_view(Some(&["a".into()])).unwrap();
assert_eq!(cols.len(), 1);
assert_eq!(cols[0].as_ref().unwrap().len(), 2);
pub fn select_typed_columns(&self, keys: Option<&[Key]>) -> Result<Vec<TypedDataArray>, Error>
Returns selected columns as a Vec<TypedDataArray>, preserving the
native primitive storage and null bitmap when available.
pub fn select_view(&self, keys: Option<&[Key]>) -> Result<MaybeView<'_>, Error>
Returns selected columns wrapped in a MaybeView.
Each column is materialized as an owned Array1<DataValue> and then
stacked along axis 0, producing an owned Array2 of shape
(ncols, nrows). When keys is None, every column from the
KeyIndex is included.
Use MaybeView::row_view on the result to obtain a uniform
(nrows, ncols) view regardless of which variant was returned.
Errors
Returns Error::EmptyData when keys resolves to an empty or
entirely unknown key set.
Examples
use trs_dataframe::{column_frame, Key};
let cf = column_frame! { "x" => [10i32, 20i32], "y" => [30i32, 40i32] };
let view = cf.select_view(Some(&["x".into(), "y".into()])).unwrap();
// row_view() gives a (nrows, ncols) view; the shape is [2, 2] here.
assert_eq!(view.row_view().nrows(), 2);
pub fn select_typed<T: Extract + Clone>(&self, keys: Option<&[Key]>) -> Array2<T>
Returns selected columns as a typed 2D array, converting each DataValue
via the Extract trait.
This is the typed counterpart of select. If keys is None,
all columns are returned. The data is in column-major (Fortran) order
internally but the shape is (nrows, ncols) — indexing and row
iteration work identically to a row-major array.
Type coercion
The Extract trait performs best-effort numeric coercion (e.g. I32 -> f64).
Values that cannot be meaningfully converted yield the type’s default
(0 for numbers, false for bool, empty string for String).
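The fall-back-to-default behavior can be sketched with a small trait. Everything here is hypothetical: `Value` stands in for `DataValue`, `Coerce` for `Extract`, and the exact set of conversions the real trait considers meaningful is not known from this page.

```rust
/// Hypothetical stand-in for `DataValue`.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    I32(i32),
    F64(f64),
    Bool(bool),
    Text(String),
    Null,
}

/// Hypothetical analogue of `Extract`: best-effort conversion that
/// falls back to `Default::default()` when no conversion applies.
trait Coerce: Sized + Default {
    fn try_coerce(value: &Value) -> Option<Self>;

    fn coerce(value: &Value) -> Self {
        Self::try_coerce(value).unwrap_or_default()
    }
}

impl Coerce for f64 {
    fn try_coerce(value: &Value) -> Option<Self> {
        match value {
            Value::I32(v) => Some(f64::from(*v)), // lossless widening
            Value::F64(v) => Some(*v),
            _ => None,
        }
    }
}

impl Coerce for String {
    fn try_coerce(value: &Value) -> Option<Self> {
        match value {
            Value::Text(s) => Some(s.clone()),
            _ => None,
        }
    }
}

/// Converts a whole column of dynamic values into a typed vector.
fn coerce_column<T: Coerce>(column: &[Value]) -> Vec<T> {
    column.iter().map(T::coerce).collect()
}
```

One consequence worth keeping in mind: a default of 0 or an empty string is indistinguishable from a genuine value, so callers that care about failed conversions need to check the source column first.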
pub fn push<C: CandidateData>(&mut self, row_candidate: C) -> Result<(), Error>
Pushes the row candidate into the ColumnFrame.
If a column is not found, this method adds it to the ColumnFrame.
This operation clones all values from the candidate. For batch operations,
consider using Self::extend instead, which can be more efficient.
pub fn remove_column(&mut self, keys: &[Key]) -> Result<Self, Error>
Removes the specified columns from this frame and returns them as a new ColumnFrame.
The remaining columns stay in self with their data intact.
pub fn extend(&mut self, other: Self) -> Result<(), Error>
Extends the ColumnFrame with the data from the other ColumnFrame.
If the current KeyIndex is empty, the ColumnFrame is replaced with the other ColumnFrame.
If the other KeyIndex is empty, nothing happens.
If the KeyIndex of the other data frame is longer than the current one,
Error::DataSetSizeDoesntMatch is returned.
If a Key from the other data frame is not present in the current ColumnFrame, the KeyIndex is extended and the column is added to the current ColumnFrame.
pub fn replace(&mut self, other: Self) -> Result<(), Error>
Replaces the ColumnFrame with the other ColumnFrame.
If the current KeyIndex is empty, the ColumnFrame is replaced with the other ColumnFrame.
If the other KeyIndex is empty, nothing happens.
If the KeyIndex of the other data frame does not match the current one, Error::DataSetSizeDoesntMatch is returned.
If a Key from the other data frame is not present in the current ColumnFrame, the KeyIndex is extended and the column is added to the current ColumnFrame.
pub fn join_by_id_inner(&mut self, right: Self, keys: &[Key]) -> Result<(), Error>
Joins the candidates by the keys in the JoinRelation::JoinById struct.
This function creates Index for the keys and then joins the candidates by the keys.
pub fn add_single_column<K, V>(&mut self, key: K, column: V) -> Result<(), Error>
Adds a single column to the current ColumnFrame.
The column may be provided as anything that converts into a
TypedData — e.g. Vec<DataValue>, a typed primitive Vec<T>, an
Array1<DataValue>, or a raw TypedData. If the column’s dtype is
Generic but key declares a concrete dtype, the column is coerced to
match.
Errors:
Error::ColumnAlreadyExists if key is already present.
Error::DataSetSizeDoesntMatch if the column length doesn’t match the current row count (and the frame is non-empty).
pub fn add_columns(&mut self, other: Self) -> Result<(), Error>
Adds the columns from the other ColumnFrame to the current ColumnFrame.
If the current KeyIndex is empty, the ColumnFrame is replaced with the other ColumnFrame.
If the other KeyIndex is empty, nothing happens.
pub fn broadcast(&mut self, other: Self) -> Result<(), Error>
Broadcasts the data from the other ColumnFrame to the current ColumnFrame.
If the current KeyIndex is empty, the ColumnFrame is replaced with the other ColumnFrame.
If the other KeyIndex is empty, nothing happens.
If the other data frame has more than one row, Error::CannotBroadcast is returned.
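A minimal sketch of the broadcast rule, using row-major nested `Vec`s for readability. The function below is hypothetical and models only two of the documented cases plus the single-row broadcast itself; the "current frame is empty" replacement case is left out, and the error is a plain `String` rather than the crate's `Error::CannotBroadcast`.

```rust
/// Appends the single row of `other` to every row of `rows`.
/// An empty `other` is a no-op. More than one row is rejected.
fn broadcast(rows: &mut [Vec<i64>], other: &[Vec<i64>]) -> Result<(), String> {
    match other {
        // Nothing to broadcast.
        [] => Ok(()),
        // Exactly one row: repeat its cells on every existing row.
        [single] => {
            for row in rows.iter_mut() {
                row.extend_from_slice(single);
            }
            Ok(())
        }
        // Several rows cannot be broadcast unambiguously.
        _ => Err(format!("cannot broadcast {} rows", other.len())),
    }
}
```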
pub fn cartesian_product(&mut self, other: Self) -> Result<(), Error>
Replaces self with the Cartesian product of self and other —
every row of self paired with every row of other.
Existing column order in self is preserved; columns that exist only
in other are appended at the end.
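The row pairing can be sketched as a pair of nested loops. This is an illustration only: rows are modeled as `Vec<i64>`, the function name is hypothetical, and the output order (all pairings for the first row of the left side, then the second, and so on) is an assumption, since the ordering is not stated here.

```rust
/// Pairs every row of `left` with every row of `right`.
/// Cells from `left` come first in each output row, followed by the
/// cells from `right`, so the left column order is preserved.
fn cartesian_product(left: &[Vec<i64>], right: &[Vec<i64>]) -> Vec<Vec<i64>> {
    let mut out = Vec::with_capacity(left.len() * right.len());
    for l in left {
        for r in right {
            let mut row = l.clone();
            row.extend_from_slice(r);
            out.push(row);
        }
    }
    out
}
```

The result has `left.len() * right.len()` rows, so the output grows quickly for large inputs.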
pub fn join(&mut self, right: Self, join_type: &JoinRelation) -> Result<(), Error>
Joins the candidates with the other candidates according to the JoinRelation policy:
For JoinBy::AddColumns, the columns are added to the existing structure via Self::add_columns.
For JoinBy::Replace, the columns are replaced with the new columns.
For JoinBy::Extend, the candidates are extended via Self::extend.
For JoinBy::Broadcast, each candidate is extended with the values of the other candidates via Self::broadcast.
For JoinBy::CartesianProduct, the candidates are multiplied by the other candidates.
For JoinBy::JoinById, the candidates are joined by the keys in the JoinRelation::JoinById struct; see Self::join_by_id_inner.
pub fn get_single_column(&self, key: &Key) -> Option<Array1<DataValue>>
Deprecated: allocates O(n); use get_column() for zero-copy typed access
Returns a single column materialized as an owned Array1<DataValue>.
Typed columns allocate a DataValue per element on the fly. For
zero-copy typed access, use Self::get_column to obtain the
underlying TypedData directly.
pub fn get_single_column_typed<T: Extract>(&self, key: &Key) -> Option<Array1<T>>
Returns a column extracted into a typed Array1<T>, where each
DataValue is converted via the Extract trait. Returns None if
the key does not exist in the frame.
Type coercion
The Extract trait performs best-effort numeric coercion (e.g.
I32 -> f64). Values that cannot be meaningfully converted yield the
type’s default (0 for numbers, false for bool, empty string for
String).
pub fn sorted(&self, key: &Key) -> Result<SortedDataFrame<'_>, Error>
Returns a SortedDataFrame view sorted ascending by the given column.
Null values are pushed to the end. Ties are broken by original row order.
Use SortedDataFrame::topn to efficiently
extract the first/last N rows.
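The ordering rules (ascending, nulls last, ties kept in original row order) can be sketched with a stable sort over row indices. The function name is hypothetical and the column is modeled as `Option<i64>`, with `None` standing in for a null value.

```rust
use std::cmp::Ordering;

/// Returns row indices in ascending order of the column's values.
/// `None` sorts after every `Some`. `sort_by` is stable, so equal
/// values keep their original relative order.
fn sorted_row_order(column: &[Option<i64>]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..column.len()).collect();
    order.sort_by(|&a, &b| match (column[a], column[b]) {
        (Some(x), Some(y)) => x.cmp(&y),
        (Some(_), None) => Ordering::Less,
        (None, Some(_)) => Ordering::Greater,
        (None, None) => Ordering::Equal,
    });
    order
}
```

Sorting an index vector rather than the data itself keeps the underlying column untouched, which fits a borrowed sorted view; taking the first or last N entries of the index vector then gives a top-N selection.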
pub fn filter(&self, filter: &FilterRules) -> Result<Self, Error>
Returns a new ColumnFrame containing only rows that match the filter expression.
The filter is applied against each row and matching row indices are collected, de-duplicated, and used to build the result.
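The collect, de-duplicate, and rebuild steps can be sketched as follows. This is a guess at the shape of the algorithm, not the crate's implementation: rules are modeled as plain function pointers instead of `FilterRules`, a row is kept if any rule matches it, and the result is assumed to come back in ascending row order.

```rust
use std::collections::BTreeSet;

/// Collects the indices of rows matched by at least one rule.
/// The `BTreeSet` removes duplicates (a row matched by several rules
/// appears once) and yields indices in ascending order.
fn matching_rows(column: &[i64], rules: &[fn(i64) -> bool]) -> Vec<usize> {
    let mut hits = BTreeSet::new();
    for rule in rules {
        for (idx, &value) in column.iter().enumerate() {
            if rule(value) {
                hits.insert(idx);
            }
        }
    }
    hits.into_iter().collect()
}

/// Builds the filtered column from the selected row indices.
fn take_rows(column: &[i64], rows: &[usize]) -> Vec<i64> {
    rows.iter().map(|&r| column[r]).collect()
}
```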
Trait Implementations
impl AddAssign for ColumnFrame
fn add_assign(&mut self, rhs: Self)
Performs the += operation.
impl Clone for ColumnFrame
fn clone(&self) -> ColumnFrame
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for ColumnFrame
impl Default for ColumnFrame
fn default() -> ColumnFrame
impl<'de> Deserialize<'de> for ColumnFrame
Custom Deserialize: accept row-major Array2<DataValue> and convert it to
the column-oriented TypedData storage, promoting each column to its
detected dtype where possible.
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
    D: Deserializer<'de>,
impl Display for ColumnFrame
impl DivAssign for ColumnFrame
fn div_assign(&mut self, rhs: Self)
Performs the /= operation.
impl From<ColumnFrame> for DataFrame
fn from(dataframe: ColumnFrame) -> Self
impl From<HashMap<String, Vec<DataValue>>> for ColumnFrame
NOTE: Because the key order in a hash map is not deterministic, the keys are sorted.
impl From<SizedHashMap<SmartString<LazyCompact>, Vec<DataValue>>> for ColumnFrame
impl From<Vec<HashMap<Key, DataValue>>> for ColumnFrame
NOTE: Because the key order in a hash map is not deterministic, the keys are sorted.
impl From<Vec<SizedHashMap<Key, DataValue>>> for ColumnFrame
NOTE: Because the key order in a hash map is not deterministic, the keys are sorted.
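The reason for sorting can be shown with the standard `HashMap` alone. The function below is a hypothetical sketch of turning a map of columns into a deterministic column order; it assumes plain lexicographic ordering of the string keys, which may differ from how the crate orders `Key` values.

```rust
use std::collections::HashMap;

/// Turns a map of named columns into a list with a stable order.
/// `HashMap` iteration order can differ between runs, so the columns
/// are sorted by name before use.
fn columns_in_sorted_key_order(map: HashMap<String, Vec<i64>>) -> Vec<(String, Vec<i64>)> {
    let mut columns: Vec<(String, Vec<i64>)> = map.into_iter().collect();
    columns.sort_by(|a, b| a.0.cmp(&b.0));
    columns
}
```

Without the sort, two frames built from equal maps could end up with different column positions, which would make positional access unreliable.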
impl MulAssign for ColumnFrame
fn mul_assign(&mut self, rhs: Self)
Performs the *= operation.
impl PartialEq for ColumnFrame
fn eq(&self, other: &ColumnFrame) -> bool
Tests for self and other values to be equal, and is used by ==.
impl Serialize for ColumnFrame
impl SubAssign for ColumnFrame
fn sub_assign(&mut self, rhs: Self)
Performs the -= operation.