Crate datafusion_comet_spark_expr

Re-exports§

pub use hash_funcs::*;

Modules§

cast
hash_funcs
test_common
timezone
utils

Macros§

create_hashes_internal
Creates hash values for every row, based on the values in the columns.
downcast_compute_op
hash_array
hash_array_boolean
hash_array_decimal
hash_array_primitive
hash_array_primitive_float
hash_array_small_decimal
test_hashes_internal
test_hashes_with_nulls

Structs§

ArrayInsert
Avg
AVG aggregate expression
AvgDecimal
AVG aggregate expression
BitwiseNotExpr
BitwiseNot expression
Cast
CheckOverflow
This mirrors Spark’s CheckOverflow expression. Spark’s CheckOverflow rounds decimals to the given scale and checks whether they fit in the given precision. Because the cast kernel already rounds decimals, Comet’s CheckOverflow only checks whether the decimals fit in the precision.
Contains
Correlation
CORR aggregate expression. The implementation is mostly the same as DataFusion’s. We keep our own implementation because DataFusion uses UInt64 for the state_field count, while Spark uses Double for count. We have also added null_on_divide_by_zero to be consistent with Spark’s implementation.
Covariance
COVAR_SAMP and COVAR_POP aggregate expression. The implementation is mostly the same as DataFusion’s. We keep our own implementation because DataFusion uses UInt64 for the state_field count, while Spark uses Double for count.
CreateNamedStruct
DateTruncExpr
EndsWith
GetArrayStructFields
GetStructField
HourExpr
IfExpr
IfExpr is a wrapper around CaseExpr, because IF(a, b, c) is semantically equivalent to CASE WHEN a THEN b ELSE c END.
Like
ListExtract
MinuteExpr
NegativeExpr
Negative expression
NormalizeNaNAndZero
RLike
Implementation of RLIKE operator.
SecondExpr
SparkCastOptions
Spark cast options
SparkChrFunc
Spark-compatible chr expression
StartsWith
Stddev
STDDEV and STDDEV_SAMP (standard deviation) aggregate expression. The implementation is mostly the same as DataFusion’s. We keep our own implementation because DataFusion uses UInt64 for the state_field count, while Spark uses Double for count. We have also added null_on_divide_by_zero to be consistent with Spark’s implementation.
StringSpaceExpr
SubstringExpr
SumDecimal
TimestampTruncExpr
ToJson
to_json function
UnboundColumn
This is similar to UnKnownColumn in DataFusion, but it carries a data type. It is used only when the column is not bound to a schema, for example for the inputs to aggregation functions in the final aggregation. In that case, we cannot bind the aggregation functions to the input schema, which in Spark consists of grouping columns and aggregate buffer attributes (DataFusion has a different design). However, when creating certain aggregation functions, we need to know their input data types. Because UnKnownColumn doesn’t have a data type, we implement UnboundColumn to carry it.
Variance
VAR_SAMP and VAR_POP aggregate expression. The implementation is mostly the same as DataFusion’s. We keep our own implementation because DataFusion uses UInt64 for the state_field count, while Spark uses Double for count. We have also added null_on_divide_by_zero to be consistent with Spark’s implementation.
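
The role of null_on_divide_by_zero in the statistical aggregates above can be illustrated with a small standalone sketch. It does not use this crate's types: the function name `var_samp_sketch`, its signature, and the choice of NaN as the non-null fallback are assumptions made for illustration only. The sketch keeps the row count as an `f64`, echoing the Double count mentioned above, and shows the one-row case where the sample-variance divisor (count - 1) is zero.

```rust
/// Hypothetical sketch of a sample variance over nullable inputs.
/// The count is kept as f64 (a Double), and `null_on_divide_by_zero`
/// is modelled as a plain bool; neither mirrors the crate's real API.
fn var_samp_sketch(values: &[Option<f64>], null_on_divide_by_zero: bool) -> Option<f64> {
    // Welford's online update: count, running mean, sum of squared deviations.
    let (mut count, mut mean, mut m2) = (0.0_f64, 0.0_f64, 0.0_f64);
    for v in values.iter().flatten() {
        count += 1.0;
        let delta = v - mean;
        mean += delta / count;
        m2 += delta * (v - mean);
    }
    if count == 0.0 {
        // No non-null input rows: the result is null.
        return None;
    }
    if count == 1.0 {
        // The divisor (count - 1) is zero. Assumption: return null when the
        // flag is set, NaN otherwise.
        return if null_on_divide_by_zero { None } else { Some(f64::NAN) };
    }
    Some(m2 / (count - 1.0))
}
```

With a single non-null row, the flag decides between a null result and a NaN result; with two or more rows it has no effect.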

Enums§

EvalMode
Spark supports three evaluation modes when evaluating expressions. The mode affects behavior when processing input values that are invalid or would result in an error, such as divide-by-zero errors, and also affects behavior when converting between types.
SparkError
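
To make the effect of an evaluation mode concrete, here is a minimal standalone sketch of narrowing a 64-bit integer to 32 bits under three modes. The enum `ModeSketch`, its variant names (chosen after Spark's LEGACY, ANSI, and TRY modes), and the function `narrow_to_i32` are hypothetical and are not taken from this page. The per-mode behavior shown (wrap, error, null) is an assumption about how such modes typically differ on overflow.

```rust
/// Hypothetical stand-in for an evaluation mode; not the crate's EvalMode.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ModeSketch {
    Legacy,
    Ansi,
    Try,
}

/// Narrow an i64 to an i32. `Ok(None)` models a SQL NULL result.
fn narrow_to_i32(value: i64, mode: ModeSketch) -> Result<Option<i32>, String> {
    match i32::try_from(value) {
        // In-range values behave the same in every mode.
        Ok(v) => Ok(Some(v)),
        Err(_) => match mode {
            // Assumed legacy behavior: keep the low 32 bits (wraps around).
            ModeSketch::Legacy => Ok(Some(value as i32)),
            // Assumed ANSI behavior: report an error.
            ModeSketch::Ansi => Err(format!("value {} is out of range for i32", value)),
            // Assumed try behavior: produce NULL instead of an error.
            ModeSketch::Try => Ok(None),
        },
    }
}
```

The same out-of-range input therefore yields a wrapped value, an error, or a null depending only on the mode, which is why the mode needs to be threaded through expressions such as casts.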

Functions§

bitwise_not
create_comet_physical_fun
Create a physical scalar function.
create_negate_expr
spark_cast
Spark-compatible cast implementation. Defers to DataFusion’s cast where that is known to be compatible, and returns an error when the requested cast is neither supported nor DataFusion-compatible.
spark_ceil
ceil function that simulates Spark ceil expression
spark_date_add
spark_date_sub
spark_decimal_div
spark_decimal_integral_div
spark_floor
floor function that simulates Spark floor expression
spark_hex
Spark-compatible hex function
spark_isnan
Spark-compatible isnan expression
spark_make_decimal
Spark-compatible MakeDecimal expression (internal to Spark optimizer)
spark_read_side_padding
Similar to DataFusion’s rpad, but does not truncate when the string is already longer than length
spark_round
round function that simulates Spark round expression
spark_rpad
Custom rpad because DataFusion’s rpad handles Unicode differently
spark_unhex
Spark-compatible unhex expression
spark_unscaled_value
Spark-compatible UnscaledValue expression (internal to Spark optimizer)
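
The "pad but never truncate" behavior described for spark_read_side_padding above can be sketched on a single string as follows. This is not the crate's implementation, which is not shown on this page: the function name `pad_without_truncation` is hypothetical, and measuring length in Unicode scalar values (`chars()`) and padding with ASCII spaces are assumptions.

```rust
/// Hypothetical sketch: right-pad `s` with spaces up to `length` characters,
/// leaving strings that are already at least that long untouched.
fn pad_without_truncation(s: &str, length: usize) -> String {
    // Assumption: length is counted in Unicode scalar values, not bytes.
    let char_count = s.chars().count();
    if char_count >= length {
        // Already long enough: return unchanged rather than truncating.
        return s.to_string();
    }
    let missing = length - char_count;
    let mut out = String::with_capacity(s.len() + missing);
    out.push_str(s);
    out.extend(std::iter::repeat(' ').take(missing));
    out
}
```

A truncating rpad would shorten "abcdef" to "abcd" for a target length of 4, while this sketch returns "abcdef" unchanged.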

Type Aliases§

SparkResult