pub struct TextColumnStats {
    pub min_len: usize,
    pub max_len: usize,
    pub mean_len: f64,
    pub p50_len: usize,
    pub p95_len: usize,
    pub p99_len: usize,
    pub empty_count: usize,
    pub preamble_count: usize,
    pub total: usize,
}
Statistics for a text (string) column — useful for ML classification audits.
Fields
min_len: usize
Minimum character length
max_len: usize
Maximum character length
mean_len: f64
Mean character length
p50_len: usize
Median character length (P50)
p95_len: usize
95th percentile character length
p99_len: usize
99th percentile character length
empty_count: usize
Number of empty or whitespace-only strings
preamble_count: usize
Number of strings matching a preamble pattern (e.g., “#!/”)
total: usize
Total valid (non-null) strings
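The field descriptions above can be illustrated with a minimal, self-contained sketch. It is not this crate's implementation: `TextStatsSketch`, `nearest_rank`, and `text_stats_sketch` are hypothetical names introduced here, `None` entries stand in for null cells, lengths are counted in Unicode scalar values via `chars().count()`, and percentiles use the nearest-rank definition. The real `TextColumnStats` may differ on any of these points.

```rust
/// Local mirror of the documented fields, used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct TextStatsSketch {
    pub min_len: usize,
    pub max_len: usize,
    pub mean_len: f64,
    pub p50_len: usize,
    pub p95_len: usize,
    pub p99_len: usize,
    pub empty_count: usize,
    pub preamble_count: usize,
    pub total: usize,
}

/// Nearest-rank percentile over an ascending-sorted, non-empty slice.
/// Assumption: the real crate may use a different percentile definition.
fn nearest_rank(sorted: &[usize], pct: usize) -> usize {
    let n = sorted.len();
    // rank = ceil(pct * n / 100), clamped to 1..=n, then made 0-based.
    let rank = (pct * n + 99) / 100;
    sorted[rank.clamp(1, n) - 1]
}

/// Computes the statistics for a column of optional strings.
/// Returns `None` when the column holds no non-null strings.
pub fn text_stats_sketch(
    values: &[Option<&str>],
    preamble_prefix: Option<&str>,
) -> Option<TextStatsSketch> {
    // Drop nulls; only valid strings contribute to `total`.
    let strings: Vec<&str> = values.iter().flatten().copied().collect();
    if strings.is_empty() {
        return None;
    }

    // Character lengths, sorted so min, max, and percentiles are index lookups.
    let mut lens: Vec<usize> = strings.iter().map(|s| s.chars().count()).collect();
    lens.sort_unstable();

    let total = lens.len();
    let sum: usize = lens.iter().sum();

    // Empty or whitespace-only strings.
    let empty_count = strings.iter().filter(|s| s.trim().is_empty()).count();

    // Strings starting with the optional preamble prefix (e.g. "#!/").
    let preamble_count = match preamble_prefix {
        Some(prefix) => strings.iter().filter(|s| s.starts_with(prefix)).count(),
        None => 0,
    };

    Some(TextStatsSketch {
        min_len: lens[0],
        max_len: lens[total - 1],
        mean_len: sum as f64 / total as f64,
        p50_len: nearest_rank(&lens, 50),
        p95_len: nearest_rank(&lens, 95),
        p99_len: nearest_rank(&lens, 99),
        empty_count,
        preamble_count,
        total,
    })
}
```

Calling `text_stats_sketch(&[Some("#!/bin/sh"), Some(""), None], Some("#!/"))` counts two valid strings, one of them empty and one matching the preamble prefix. In a classification audit, a large gap between `p50_len` and `p99_len` or a high `empty_count` relative to `total` is usually the first thing worth inspecting.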
Implementations
impl TextColumnStats
pub fn from_dataset(
    dataset: &ArrowDataset,
    column: &str,
    preamble_prefix: Option<&str>,
) -> Result<Self>
Trait Implementations
impl Clone for TextColumnStats
fn clone(&self) -> TextColumnStats
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations
impl Freeze for TextColumnStats
impl RefUnwindSafe for TextColumnStats
impl Send for TextColumnStats
impl Sync for TextColumnStats
impl Unpin for TextColumnStats
impl UnsafeUnpin for TextColumnStats
impl UnwindSafe for TextColumnStats
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
Creates a shared type from an unshared type.