Module transformations

Various transformation constructors.

The different crate::core::Transformation implementations in this module are accessed by calling the appropriate constructor function. Constructors are named in the form make_xxx(), where xxx indicates what the resulting Transformation does.
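The naming convention can be illustrated with a toy constructor. This is only a sketch: `ToyTransformation`, its `function` and `stability` fields, and `make_toy_clamp` are hypothetical stand-ins written for this example, not the crate's actual `Transformation` type or API.

```rust
/// Hypothetical stand-in for a transformation: a function over the data
/// plus a stability constant. Not the crate's real `Transformation` type.
struct ToyTransformation {
    function: Box<dyn Fn(&[i64]) -> Vec<i64>>,
    /// Assumed meaning: how much the output distance can grow per unit of
    /// input distance. Element-wise clamping is 1-stable.
    stability: u32,
}

/// Constructor named in the `make_xxx` form: the name says what the
/// resulting transformation does, and invalid arguments are rejected up front.
fn make_toy_clamp(lower: i64, upper: i64) -> Result<ToyTransformation, String> {
    if lower > upper {
        return Err(format!("lower bound {} exceeds upper bound {}", lower, upper));
    }
    Ok(ToyTransformation {
        function: Box::new(move |data: &[i64]| -> Vec<i64> {
            data.iter().map(|v| (*v).clamp(lower, upper)).collect()
        }),
        stability: 1,
    })
}
```

Under these assumptions, the caller builds the transformation once with `make_toy_clamp(0, 10)` and then invokes its `function` on data; argument validation happens in the constructor rather than on every call.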

Re-exports

pub use quantile_score_candidates::*;

Modules

b_ary_tree 🔒
cast 🔒
cast_metric 🔒
clamp 🔒
count 🔒
count_cdf 🔒
covariance 🔒
dataframe 🔒
impute 🔒
index 🔒
lipschitz_mul 🔒
make_stable_expr 🔒
make_stable_lazyframe 🔒
manipulation 🔒
mean 🔒
quantile_score_candidates
resize 🔒
sum 🔒
sum_of_squared_deviations 🔒
variance 🔒

Structs

DataFrameDomain
Pairwise
Marker type to represent pairwise, or cascading summation
Sequential
Marker type to represent sequential, or recursive summation
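The two summation strategies named by the `Pairwise` and `Sequential` markers can be sketched in plain Rust. This is a minimal illustration of the general techniques; the function names `sequential_sum` and `pairwise_sum` are chosen for this example and are not taken from the crate.

```rust
/// Sequential (recursive) summation: accumulate left to right.
/// Floating-point rounding error can grow linearly with the input length.
fn sequential_sum(data: &[f64]) -> f64 {
    data.iter().fold(0.0, |acc, x| acc + x)
}

/// Pairwise (cascading) summation: split the slice in half, sum each half
/// recursively, then add the two partial sums. Rounding error grows roughly
/// logarithmically with the input length.
fn pairwise_sum(data: &[f64]) -> f64 {
    match data.len() {
        0 => 0.0,
        1 => data[0],
        n => {
            let (left, right) = data.split_at(n / 2);
            pairwise_sum(left) + pairwise_sum(right)
        }
    }
}
```

Both functions return `10.0` for `[1.0, 2.0, 3.0, 4.0]`; they differ only in the order of additions, which matters for floating-point rounding on long inputs.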

Traits

BAryTreeMetric
DatasetDomain
A Domain representing a dataset.
DatasetMetric
DropNullDomain
Utility trait to drop null values from a dataset, regardless of the representation of nullity.
ImputeConstantDomain
Utility trait to impute with a constant, regardless of the representation of nullity.
RowByRowDomain
StableDslPlan
StableExpr
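To illustrate what "regardless of the representation of nullity" can mean for `DropNullDomain` and `ImputeConstantDomain`, here is a sketch in which floats use NaN as null and other types use `Option`. The trait `MaybeNull` and the free functions `drop_null` and `impute_constant` are hypothetical names for this example; the crate's own traits are defined over domains and may look quite different.

```rust
/// Hypothetical trait: a value that may be null under its own representation.
trait MaybeNull {
    type Inner;
    /// Returns `Some(inner)` when the value is non-null, else `None`.
    fn into_non_null(self) -> Option<Self::Inner>;
}

/// Floats represent null as NaN.
impl MaybeNull for f64 {
    type Inner = f64;
    fn into_non_null(self) -> Option<f64> {
        if self.is_nan() { None } else { Some(self) }
    }
}

/// Any other type can represent null by wrapping in `Option`.
impl<T> MaybeNull for Option<T> {
    type Inner = T;
    fn into_non_null(self) -> Option<T> {
        self
    }
}

/// Drop null elements, whichever representation the element type uses.
fn drop_null<T: MaybeNull>(data: Vec<T>) -> Vec<T::Inner> {
    data.into_iter().filter_map(|v| v.into_non_null()).collect()
}

/// Replace null elements with `constant`.
fn impute_constant<T: MaybeNull>(data: Vec<T>, constant: T::Inner) -> Vec<T::Inner>
where
    T::Inner: Clone,
{
    data.into_iter()
        .map(|v| v.into_non_null().unwrap_or_else(|| constant.clone()))
        .collect()
}
```

With this design the same `drop_null` call works on `Vec<f64>` and `Vec<Option<i32>>`, which is the kind of uniformity the utility traits above are described as providing.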

Functions

choose_branching_factor
Returns an approximation of the ideal branching_factor for a dataset of a given size, chosen to minimize error in CDF and quantile estimates based on b-ary trees.
make_b_ary_tree
Expand a vector of counts into a b-ary tree of counts, where each branch is the sum of its b immediate children.
make_bounded_float_checked_sum
Make a Transformation that computes the sum of bounded data with known dataset size.
make_bounded_float_ordered_sum
Make a Transformation that computes the sum of bounded floats with known ordering.
make_bounded_int_monotonic_sum
Make a Transformation that computes the sum of bounded ints, where all values share the same sign.
make_bounded_int_ordered_sum
Make a Transformation that computes the sum of bounded ints. You may need to use make_ordered_random to impose an ordering on the data.
make_bounded_int_split_sum
Make a Transformation that computes the sum of bounded ints. Adds the saturating sum of the positives to the saturating sum of the negatives.
make_cast
Make a Transformation that casts a vector of data from type TIA to type TOA. For each element, failure to parse results in None, else Some(out).
make_cast_default
Make a Transformation that casts a vector of data from type TIA to type TOA. Any element that fails to cast is filled with default.
make_cast_inherent
Make a Transformation that casts a vector of data from type TIA to type TOA, where TOA can represent nullity. If the cast fails, fill with TOA's null value.
make_cdf
Postprocess a noisy array of float summary counts into a cumulative distribution.
make_clamp
Make a Transformation that clamps numeric data in Vec<TA> to bounds.
make_consistent_b_ary_tree
Postprocessor that makes a noisy b-ary tree internally consistent, and returns the leaf layer.
make_count
Make a Transformation that computes a count of the number of records in data.
make_count_by
Make a Transformation that computes the count of each unique value in data. This assumes that the category set is unknown.
make_count_by_categories
Make a Transformation that computes the number of times each category appears in the data. This assumes that the category set is known.
make_count_distinct
Make a Transformation that computes a count of the number of unique, distinct records in data.
make_create_dataframe (Deprecated)
Make a Transformation that constructs a dataframe from a Vec<Vec<String>> (a vector of records).
make_df_cast_default (Deprecated)
Make a Transformation that casts the elements in a column in a dataframe from type TIA to type TOA. If cast fails, fill with default.
make_df_is_equal (Deprecated)
Make a Transformation that checks if each element in a column in a dataframe is equivalent to value.
make_drop_null
Make a Transformation that drops null values.
make_find
Find the index of a data value in a set of categories.
make_find_bin
Make a transformation that finds the bin index in a monotonically increasing vector of edges.
make_identity
Make a Transformation representing the identity function.
make_impute_constant
Make a Transformation that replaces null/None data with constant.
make_impute_uniform_float
Make a Transformation that replaces NaN values in Vec<TA> with uniformly distributed floats within bounds.
make_index
Make a transformation that treats each element as an index into a vector of categories.
make_is_equal
Make a Transformation that checks if each element is equal to value.
make_is_null
Make a Transformation that checks if each element in a vector is null or nan.
make_lipschitz_float_mul
Make a transformation that multiplies an aggregate by a constant.
make_mean
Make a Transformation that computes the mean of bounded data.
make_metric_bounded
Make a Transformation that converts the unbounded dataset metric MI to the respective bounded dataset metric with a no-op.
make_metric_unbounded
Make a Transformation that converts the bounded dataset metric MI to the respective unbounded dataset metric with a no-op.
make_ordered_random
Make a Transformation that converts the unordered dataset metric SymmetricDistance to the respective ordered dataset metric InsertDeleteDistance by assigning a random permutation.
make_quantiles_from_counts
Postprocess a noisy array of summary counts into quantiles.
make_resize
Make a Transformation that either truncates or imputes records with constant to match a provided size.
make_select_column (Deprecated)
Make a Transformation that retrieves the column key from a dataframe as Vec<TOA>.
make_sized_bounded_covariance
make_sized_bounded_float_checked_sum
Make a Transformation that computes the sum of bounded floats with known dataset size.
make_sized_bounded_float_ordered_sum
Make a Transformation that computes the sum of bounded floats with known ordering and dataset size.
make_sized_bounded_int_checked_sum
Make a Transformation that computes the sum of bounded ints. The effective range is reduced, as (bounds * size) must not overflow.
make_sized_bounded_int_monotonic_sum
Make a Transformation that computes the sum of bounded ints, where all values share the same sign.
make_sized_bounded_int_ordered_sum
Make a Transformation that computes the sum of bounded ints with known dataset size.
make_sized_bounded_int_split_sum
Make a Transformation that computes the sum of bounded ints with known dataset size.
make_split_dataframe (Deprecated)
Make a Transformation that splits each record in a String into a Vec<Vec<String>>, and loads the resulting table into a dataframe keyed by col_names.
make_split_lines
Make a Transformation that takes a string and splits it into a Vec<String> of its lines.
make_split_records
Make a Transformation that splits each record in a Vec<String> into a Vec<Vec<String>>.
make_stable_expr
Create a stable transformation from an Expr.
make_stable_lazyframe
Create a stable transformation from a LazyFrame.
make_subset_by (Deprecated)
Make a Transformation that subsets a dataframe by a boolean column.
make_sum
Make a Transformation that computes the sum of bounded data. Use make_clamp to bound data.
make_sum_of_squared_deviations
Make a Transformation that computes the sum of squared deviations of bounded data.
make_unordered
Make a Transformation that converts the ordered dataset metric MI to the respective unordered dataset metric with a no-op.
make_variance
Make a Transformation that computes the variance of bounded data.
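The b-ary tree described under make_b_ary_tree, where each branch is the sum of its b immediate children, can be sketched as follows. This is an illustration under stated assumptions: the function name `b_ary_tree`, the layer-by-layer return type, and the handling of a ragged final group (treated as if padded with zeros) are choices made for this example and may not match the crate's output layout.

```rust
/// Build a b-ary tree of counts over `leaves`.
/// Returns the layers from the root (index 0) down to the leaves.
/// Assumption: a final group with fewer than `b` children is summed as-is,
/// which is equivalent to padding the layer with zero counts.
fn b_ary_tree(leaves: &[u64], b: usize) -> Vec<Vec<u64>> {
    assert!(b >= 2, "branching factor must be at least 2");
    let mut layers = vec![leaves.to_vec()];
    // Keep collapsing the most recent layer until a single root remains.
    while layers.last().map_or(0, |layer| layer.len()) > 1 {
        let parent: Vec<u64> = layers
            .last()
            .unwrap()
            .chunks(b)
            .map(|children| children.iter().sum::<u64>())
            .collect();
        layers.push(parent);
    }
    layers.reverse();
    layers
}
```

For leaves `[1, 2, 3, 4, 5]` and `b = 2`, the layers are `[15]`, `[10, 5]`, `[3, 7, 5]`, and `[1, 2, 3, 4, 5]`. Any range of leaves can then be covered by a small number of nodes, which is why such trees are useful for CDF and quantile estimates.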

Type Aliases

DataFrame