Crate datafusion_comet_spark_expr

Re-exports

- pub use hash_funcs::*;

Modules

Macros
- create_hashes_internal - Creates hash values for every row, based on the values in the columns.
- downcast_compute_op
- hash_array
- hash_array_boolean
- hash_array_decimal
- hash_array_primitive
- hash_array_primitive_float
- hash_array_small_decimal
- test_hashes_internal
- test_hashes_with_nulls
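The per-row, per-column update that create_hashes_internal performs can be sketched as below. The mix combiner and the plain u32 columns here are simplified stand-ins (names are illustrative, not the crate's API), and the combiner is not the Spark-compatible hash Comet actually implements; the sketch only shows the update order: every column's value is folded into the corresponding row's running hash.

```rust
// Simplified sketch: fold each column's value into that row's running hash.
// `mix` is a stand-in combiner, NOT the Spark-compatible hash Comet uses.
fn mix(hash: u32, value: u32) -> u32 {
    hash.rotate_left(5)
        .wrapping_add(value)
        .wrapping_mul(0x9e37_79b9)
}

fn create_hashes(columns: &[Vec<u32>], hashes: &mut [u32]) {
    for col in columns {
        for (row, &v) in col.iter().enumerate() {
            hashes[row] = mix(hashes[row], v);
        }
    }
}

fn main() {
    let cols = vec![vec![1u32, 2, 3], vec![10, 20, 30]];
    let mut hashes = vec![42u32; 3]; // one seed-initialized hash per row
    create_hashes(&cols, &mut hashes);
    assert_ne!(hashes[0], hashes[1]); // different rows hash differently
    println!("{hashes:?}");
}
```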
Structs
- ArrayInsert
- Avg - AVG aggregate expression
- AvgDecimal - AVG aggregate expression
- Cast
- CheckOverflow - This is from Spark's CheckOverflow expression. Spark's CheckOverflow expression rounds decimals to a given scale and checks whether the decimals can fit in a given precision. As the cast kernel already rounds decimals, Comet's CheckOverflow expression only checks whether the decimals can fit in the precision.
- Contains
- Correlation - CORR aggregate expression. The implementation is mostly the same as DataFusion's; the reason we have our own is that DataFusion uses UInt64 for the count state field, while Spark uses Double. We have also added null_on_divide_by_zero to be consistent with Spark's implementation.
- Covariance - COVAR_SAMP and COVAR_POP aggregate expression. The implementation is mostly the same as DataFusion's; the reason we have our own is that DataFusion uses UInt64 for the count state field, while Spark uses Double.
- CreateNamedStruct
- EndsWith
- GetArrayStructFields
- GetStructField
- IfExpr - IfExpr is a wrapper around CaseExpr, because IF(a, b, c) is semantically equivalent to CASE WHEN a THEN b ELSE c END.
- Like
- ListExtract
- NegativeExpr - Negative expression
- NormalizeNaNAndZero
- RLike - Implementation of the RLIKE operator.
- RandExpr
- SparkBitwiseCount
- SparkBitwiseGet
- SparkBitwiseNot
- SparkCastOptions - Spark cast options
- SparkChrFunc - Spark-compatible chr expression
- SparkDateTrunc
- SparkHour
- SparkMinute
- SparkSecond
- StartsWith
- Stddev - STDDEV and STDDEV_SAMP (standard deviation) aggregate expression. The implementation is mostly the same as DataFusion's; the reason we have our own is that DataFusion uses UInt64 for the count state field, while Spark uses Double. We have also added null_on_divide_by_zero to be consistent with Spark's implementation.
- StringSpaceExpr
- SubstringExpr
- SumDecimal
- TimestampTruncExpr
- ToJson - to_json function
- UnboundColumn - This is similar to UnKnownColumn in DataFusion, but it has a data type. It is only used when the column is not bound to a schema, for example, the inputs to aggregation functions in a final aggregation. In that case, we cannot bind the aggregation functions to the input schema, which in Spark consists of grouping columns and aggregate buffer attributes (DataFusion has a different design). But when creating certain aggregation functions, we need to know their input data types. As UnKnownColumn doesn't carry a data type, we implement UnboundColumn to carry it.
- Variance - VAR_SAMP and VAR_POP aggregate expression. The implementation is mostly the same as DataFusion's; the reason we have our own is that DataFusion uses UInt64 for the count state field, while Spark uses Double. We have also added null_on_divide_by_zero to be consistent with Spark's implementation.
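Several of the aggregate expressions above (Correlation, Covariance, Stddev, Variance) share the null_on_divide_by_zero flag and the Double-typed count. A minimal sketch of what that flag means, using sample variance and illustrative names rather than the crate's actual API:

```rust
// Illustrative sketch, not the crate's API: sample variance with Spark's
// `null_on_divide_by_zero` behavior. The count is kept as f64 (Double),
// matching the Spark semantics described above.
fn sample_variance(values: &[f64], null_on_divide_by_zero: bool) -> Option<f64> {
    let count = values.len() as f64; // Spark tracks count as Double, not UInt64
    if count <= 1.0 {
        // The divisor (count - 1) would be zero: return NULL or NaN per the flag.
        return if null_on_divide_by_zero { None } else { Some(f64::NAN) };
    }
    let mean = values.iter().sum::<f64>() / count;
    let m2: f64 = values.iter().map(|v| (v - mean) * (v - mean)).sum();
    Some(m2 / (count - 1.0))
}

fn main() {
    assert_eq!(sample_variance(&[1.0, 2.0, 3.0], true), Some(1.0));
    assert_eq!(sample_variance(&[5.0], true), None); // NULL on divide-by-zero
    assert!(sample_variance(&[5.0], false).unwrap().is_nan());
    println!("ok");
}
```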
Enums

- EvalMode - Spark supports three evaluation modes when evaluating expressions. They affect the behavior when processing input values that are invalid or would result in an error, such as divide-by-zero errors, and also affect behavior when converting between types.
- SparkError
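The three evaluation modes can be illustrated with integer division by zero. This is a sketch of Spark's documented semantics (LEGACY and TRY yield NULL, ANSI raises an error), not the crate's actual EvalMode definition:

```rust
// Illustrative sketch of the three modes applied to divide-by-zero;
// this is not the crate's actual EvalMode implementation.
#[derive(Clone, Copy, Debug)]
enum EvalMode {
    Legacy, // pre-ANSI Spark behavior: invalid input yields NULL
    Ansi,   // ANSI SQL behavior: raise an error
    Try,    // like Ansi, but errors become NULL instead of failing the query
}

fn divide(a: i64, b: i64, mode: EvalMode) -> Result<Option<i64>, String> {
    if b == 0 {
        return match mode {
            EvalMode::Legacy | EvalMode::Try => Ok(None), // NULL result
            EvalMode::Ansi => Err("DIVIDE_BY_ZERO".to_string()),
        };
    }
    Ok(Some(a / b))
}

fn main() {
    assert_eq!(divide(6, 3, EvalMode::Legacy), Ok(Some(2)));
    assert_eq!(divide(1, 0, EvalMode::Legacy), Ok(None));
    assert!(divide(1, 0, EvalMode::Ansi).is_err());
    assert_eq!(divide(1, 0, EvalMode::Try), Ok(None));
    println!("ok");
}
```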
Functions

- create_comet_physical_fun - Create a physical scalar function.
- create_negate_expr
- register_all_comet_functions - Registers all custom UDFs
- spark_array_repeat
- spark_cast - Spark-compatible cast implementation. Defers to DataFusion's cast where that is known to be compatible, and returns an error when an unsupported, non-DataFusion-compatible cast is requested.
- spark_ceil - ceil function that simulates the Spark ceil expression
- spark_date_add
- spark_date_sub
- spark_decimal_div
- spark_decimal_integral_div
- spark_floor - floor function that simulates the Spark floor expression
- spark_hex - Spark-compatible hex function
- spark_isnan - Spark-compatible isnan expression
- spark_make_decimal - Spark-compatible MakeDecimal expression (internal to the Spark optimizer)
- spark_read_side_padding - Similar to DataFusion's rpad, but does not truncate when the string is already longer than length
- spark_round - round function that simulates the Spark round expression
- spark_rpad - Custom rpad, because DataFusion's rpad has differences in Unicode handling
- spark_unhex - Spark-compatible unhex expression
- spark_unscaled_value - Spark-compatible UnscaledValue expression (internal to the Spark optimizer)
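To make the padding-related differences concrete, here is a sketch of the read-side padding behavior described for spark_read_side_padding: pad with spaces up to a target length counted in characters rather than bytes (the kind of Unicode distinction mentioned for spark_rpad), and never truncate an already-long input. The function name and signature are illustrative, not the crate's actual API.

```rust
// Illustrative sketch only: pads to `len` characters, never truncates.
fn read_side_pad(s: &str, len: usize) -> String {
    let n = s.chars().count(); // count Unicode scalar values, not bytes
    if n >= len {
        s.to_string() // already long enough: returned unchanged (no truncation)
    } else {
        let mut out = String::from(s);
        out.extend(std::iter::repeat(' ').take(len - n));
        out
    }
}

fn main() {
    assert_eq!(read_side_pad("ab", 4), "ab  ");
    assert_eq!(read_side_pad("hello", 3), "hello"); // a plain rpad would truncate to "hel"
    assert_eq!(read_side_pad("héé", 5), "héé  "); // 3 chars (5 bytes): pads by chars
    println!("ok");
}
```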