pub struct FctColumn {
pub levels: Vec<String>,
pub data: Vec<u16>,
}
A compact categorical column: stores u16 indices into a levels table.
Invariant: data[i] < levels.len() for all i where the bitmap bit is set.
Null rows (in NullableFactor) may carry index 0, so callers must check the bitmap before interpreting data[i].
Fields
levels: Vec<String>
Mapping from index → level string. Order = first occurrence of each string in the source column (deterministic, no hashing).
data: Vec<u16>
One u16 per row. Value is the index into levels.
Implementations
impl FctColumn
pub fn encode(strings: &[String]) -> Result<Self, TidyError>
Encode a string column into a FctColumn.
Level order = first-occurrence in strings.
Returns Err if more than 65,535 distinct strings are found.
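The first-occurrence rule and the capacity check can be illustrated with a standalone sketch. This is not the crate's implementation: `encode_first_occurrence`, `EncodeError`, and `MAX_LEVELS` are hypothetical names, the limit is taken literally from the sentence above, and a `BTreeMap` is used for lookup only as one way to avoid hashing.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the crate's error type.
#[derive(Debug, PartialEq)]
pub enum EncodeError {
    CapacityExceeded,
}

/// Assumed limit, read literally from "more than 65,535 distinct strings".
const MAX_LEVELS: usize = 65_535;

/// Encode strings into (levels, data), where levels are stored in
/// first-occurrence order and data holds one u16 index per row.
pub fn encode_first_occurrence(
    strings: &[String],
) -> Result<(Vec<String>, Vec<u16>), EncodeError> {
    let mut levels: Vec<String> = Vec::new();
    // Ordered map used only to look up an existing index; the level order
    // itself is determined by the push order into `levels`.
    let mut index_of: BTreeMap<&str, u16> = BTreeMap::new();
    let mut data: Vec<u16> = Vec::with_capacity(strings.len());

    for s in strings {
        let idx = match index_of.get(s.as_str()) {
            Some(&i) => i,
            None => {
                if levels.len() >= MAX_LEVELS {
                    return Err(EncodeError::CapacityExceeded);
                }
                let i = levels.len() as u16;
                levels.push(s.clone());
                index_of.insert(s.as_str(), i);
                i
            }
        };
        data.push(idx);
    }
    Ok((levels, data))
}
```

Under these assumptions, encoding `["b", "a", "b", "c", "a"]` yields levels `["b", "a", "c"]` and data `[0, 1, 0, 2, 1]`.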
pub fn encode_from_view(view: &TidyView, col: &str) -> Result<Self, TidyError>
Encode a Column::Str from a TidyView column (respects mask & projection).
pub fn fct_lump(&self, n: usize) -> Result<Self, TidyError>
Lump all but the top-n most frequent levels into “Other”.
Tie-breaking: equal-frequency levels keep first-occurrence order in the top-n selection.
Edge cases:
- n = 0 → all levels become “Other” (one level total)
- n ≥ nlevels → no lumping, returns self.clone()
- “Other” already present → renamed to “Other_” (iterate until unique)
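The selection and edge cases above can be sketched on a plain (levels, data) pair. This is an illustration under assumptions, not the crate's code: `lump_top_n` is a hypothetical name, kept levels are assumed to retain their original relative order with the lumped bucket placed last, and the "iterate until unique" rule is read as appending `_` to the bucket name until it collides with no kept level.

```rust
/// Sketch: keep the `n` most frequent levels and lump the rest into one bucket.
/// Assumes every index in `data` is a valid index into `levels`.
pub fn lump_top_n(levels: &[String], data: &[u16], n: usize) -> (Vec<String>, Vec<u16>) {
    // n >= nlevels: nothing to lump.
    if n >= levels.len() {
        return (levels.to_vec(), data.to_vec());
    }

    // Count rows per level.
    let mut counts = vec![0usize; levels.len()];
    for &idx in data {
        counts[idx as usize] += 1;
    }

    // Rank levels by descending count. The sort is stable, so levels with
    // equal counts stay in first-occurrence order.
    let mut ranked: Vec<usize> = (0..levels.len()).collect();
    ranked.sort_by(|&a, &b| counts[b].cmp(&counts[a]));
    let mut keep = vec![false; levels.len()];
    for &i in ranked.iter().take(n) {
        keep[i] = true;
    }

    // Kept levels in their original order (assumption), bucket appended last.
    let mut new_levels: Vec<String> = levels
        .iter()
        .enumerate()
        .filter(|(i, _)| keep[*i])
        .map(|(_, l)| l.clone())
        .collect();
    let other_idx = new_levels.len() as u16;

    // Assumed reading of the rename rule: append '_' until the name is unique.
    let mut other = String::from("Other");
    while new_levels.contains(&other) {
        other.push('_');
    }
    new_levels.push(other);

    // Old index -> new index.
    let mut next = 0u16;
    let remap: Vec<u16> = keep
        .iter()
        .map(|&k| {
            if k {
                let i = next;
                next += 1;
                i
            } else {
                other_idx
            }
        })
        .collect();

    let new_data = data.iter().map(|&d| remap[d as usize]).collect();
    (new_levels, new_data)
}
```

With levels `a, b, c, d` occurring 3, 1, 3, and 2 times, `n = 2` keeps `a` and `c`, and `n = 1` keeps only `a` because it appears first among the tied levels.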
pub fn fct_reorder(&self, summary_vals: &[f64], descending: bool) -> Result<Self, TidyError>
Reorder levels by a numeric summary column from the same frame.
summary_vals[i] is the numeric value for level i.
Ascending = smallest summary value first.
NaN sorts LAST (same rule as arrange).
Tie-breaking: stable sort (original level order preserved within ties).
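The ordering rules above can be sketched as a stable sort over level indices. This is a minimal illustration, not the crate's implementation: `reorder_levels` is a hypothetical name, a length mismatch is reported as `None` rather than a `TidyError`, and NaN is assumed to sort last in both ascending and descending order.

```rust
use std::cmp::Ordering;

/// Sketch: reorder levels by one summary value per level and remap the data.
/// Assumes every index in `data` is a valid index into `levels`.
pub fn reorder_levels(
    levels: &[String],
    data: &[u16],
    summary_vals: &[f64],
    descending: bool,
) -> Option<(Vec<String>, Vec<u16>)> {
    if summary_vals.len() != levels.len() {
        return None;
    }

    // order[new_index] = old_index
    let mut order: Vec<usize> = (0..levels.len()).collect();
    // sort_by is stable, so ties keep the original level order.
    order.sort_by(|&a, &b| {
        let (x, y) = (summary_vals[a], summary_vals[b]);
        match (x.is_nan(), y.is_nan()) {
            (true, true) => Ordering::Equal,
            (true, false) => Ordering::Greater, // NaN after any number
            (false, true) => Ordering::Less,
            (false, false) => {
                let ord = x.partial_cmp(&y).unwrap_or(Ordering::Equal);
                if descending { ord.reverse() } else { ord }
            }
        }
    });

    // Invert the permutation: remap[old_index] = new_index.
    let mut remap = vec![0u16; levels.len()];
    for (new_idx, &old_idx) in order.iter().enumerate() {
        remap[old_idx] = new_idx as u16;
    }

    let new_levels = order.iter().map(|&i| levels[i].clone()).collect();
    let new_data = data.iter().map(|&d| remap[d as usize]).collect();
    Some((new_levels, new_data))
}
```

For levels `a, b, c, d` with summaries `2.0, NaN, 1.0, 2.0`, ascending order gives `c, a, d, b`: `a` stays ahead of `d` because the tie preserves the original order, and `b` goes last because its summary is NaN.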
pub fn fct_reorder_by_col(&self, numeric_col: &Column, descending: bool) -> Result<Self, TidyError>
Convenience: compute per-level mean of a numeric column, then reorder.
numeric_col must be Column::Float or Column::Int and same length as self.
NaN values in the numeric column are excluded from the mean; if all rows
for a level are NaN the level gets summary NaN (sorts last).
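The per-level mean described above might look like the following sketch. It works on a plain `&[f64]` rather than the crate's `Column` type, `per_level_mean` is a hypothetical name, and a length mismatch is reported as `None` purely for illustration.

```rust
/// Sketch: mean of `values` grouped by level index, skipping NaN entries.
/// A level whose rows are all NaN (or that has no rows at all) gets NaN.
/// Assumes every index in `data` is less than `n_levels`.
pub fn per_level_mean(n_levels: usize, data: &[u16], values: &[f64]) -> Option<Vec<f64>> {
    if data.len() != values.len() {
        return None;
    }

    let mut sums = vec![0.0f64; n_levels];
    let mut counts = vec![0usize; n_levels];
    for (&idx, &v) in data.iter().zip(values) {
        if !v.is_nan() {
            sums[idx as usize] += v;
            counts[idx as usize] += 1;
        }
    }

    Some(
        sums.iter()
            .zip(&counts)
            .map(|(&s, &c)| if c == 0 { f64::NAN } else { s / c as f64 })
            .collect(),
    )
}
```

The resulting vector has one entry per level, so it has the shape that `fct_reorder` expects for `summary_vals`.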
pub fn fct_collapse(&self, mapping: &[(&str, &str)]) -> Result<Self, TidyError>
Collapse multiple old levels into single new level names.
mapping: slice of (old_level_name, new_level_name).
- Levels not in mapping keep their original name.
- Multiple old levels can map to the same new name → merged into one index.
- Output level order: first-occurrence of each NEW name, following the original first-occurrence order of OLD levels.
- Data buffer is rebuilt (O(N) remap) only when indices actually change. The levels Vec is rebuilt O(L) regardless.
- Empty mapping → returns self.clone().
Capacity: collapsing can never produce more levels than the original column has, so the result always fits in u16 and CapacityExceeded cannot occur from fct_collapse.
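The collapse rules above can be sketched on a plain (levels, data) pair. This is an illustration only: `collapse_levels` is a hypothetical name, the first matching mapping entry is assumed to win if an old level is listed twice, and unlike the behavior described above the sketch always rebuilds the data buffer.

```rust
/// Sketch: rename levels according to `mapping` and merge levels that end up
/// with the same new name. Assumes every index in `data` is valid.
pub fn collapse_levels(
    levels: &[String],
    data: &[u16],
    mapping: &[(&str, &str)],
) -> (Vec<String>, Vec<u16>) {
    // Empty mapping: nothing changes.
    if mapping.is_empty() {
        return (levels.to_vec(), data.to_vec());
    }

    let mut new_levels: Vec<String> = Vec::new();
    // remap[old_index] = new_index
    let mut remap: Vec<u16> = Vec::with_capacity(levels.len());

    // Walk old levels in their original order so each new name first appears
    // at the position of the earliest old level that maps to it.
    for old in levels {
        let new_name = mapping
            .iter()
            .find(|(from, _)| *from == old.as_str())
            .map(|(_, to)| *to)
            .unwrap_or(old.as_str()); // unmapped levels keep their name

        let idx = match new_levels.iter().position(|l| l == new_name) {
            Some(i) => i,
            None => {
                new_levels.push(new_name.to_string());
                new_levels.len() - 1
            }
        };
        remap.push(idx as u16);
    }

    let new_data = data.iter().map(|&d| remap[d as usize]).collect();
    (new_levels, new_data)
}
```

Each old level contributes at most one new level, so the output level count never exceeds the input level count, which matches the capacity note above.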
pub fn to_str_column(&self) -> Column
Decode all rows back into a Column::Str.
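Decoding is the inverse of encoding: each row's index is looked up in the levels table. The sketch below uses a hypothetical `decode` function that returns plain strings instead of a `Column::Str`, and reports an out-of-range index as `None` instead of assuming the invariant holds.

```rust
/// Sketch: map each u16 index back to its level string.
/// Returns None if any index is out of range for `levels`.
pub fn decode(levels: &[String], data: &[u16]) -> Option<Vec<String>> {
    data.iter()
        .map(|&i| levels.get(i as usize).cloned())
        .collect()
}
```

Collecting an iterator of `Option<String>` into `Option<Vec<String>>` stops at the first missing level, so a corrupted index surfaces as `None` rather than a panic.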
Trait Implementations
Auto Trait Implementations
impl Freeze for FctColumn
impl RefUnwindSafe for FctColumn
impl Send for FctColumn
impl Sync for FctColumn
impl Unpin for FctColumn
impl UnsafeUnpin for FctColumn
impl UnwindSafe for FctColumn
Blanket Implementations
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.