Crate datafusion_comet_spark_expr

Re-exports§

pub use crate::DateTruncExpr;
pub use crate::HourExpr;
pub use crate::MinuteExpr;
pub use crate::SecondExpr;
pub use crate::TimestampTruncExpr;
pub use hash_funcs::*;

Modules§

Macros§

create_hashes_internal
Creates hash values for every row, based on the values in the columns.
hash_array
hash_array_boolean
hash_array_decimal
hash_array_primitive
hash_array_primitive_float
test_hashes_internal
test_hashes_with_nulls

Structs§

ArrayInsert
Avg
AVG aggregate expression
AvgDecimal
AVG aggregate expression
BitwiseNotExpr
BitwiseNot expression
Cast
CheckOverflow
This corresponds to Spark’s CheckOverflow expression. Spark’s CheckOverflow rounds decimals to the given scale and checks whether they fit in the given precision. Because the cast kernel already rounds decimals, Comet’s CheckOverflow only checks whether the decimals fit in the precision.
Contains
Correlation
CORR aggregate expression. The implementation is mostly the same as DataFusion’s. We maintain our own implementation because DataFusion uses UInt64 for the state_field count, while Spark uses Double. We have also added null_on_divide_by_zero to be consistent with Spark’s implementation.
Covariance
COVAR_SAMP and COVAR_POP aggregate expression. The implementation is mostly the same as DataFusion’s. We maintain our own implementation because DataFusion uses UInt64 for the state_field count, while Spark uses Double.
CreateNamedStruct
DateTruncExpr
EndsWith
GetArrayStructFields
GetStructField
HourExpr
IfExpr
IfExpr is a wrapper around CaseExpr, because IF(a, b, c) is semantically equivalent to CASE WHEN a THEN b ELSE c END.
Like
ListExtract
MinuteExpr
NegativeExpr
Negative expression
NormalizeNaNAndZero
RLike
Implementation of RLIKE operator.
SecondExpr
SparkCastOptions
Spark cast options
SparkSchemaAdapterFactory
An implementation of DataFusion’s SchemaAdapterFactory that uses a Spark-compatible cast implementation.
StartsWith
Stddev
STDDEV and STDDEV_SAMP (standard deviation) aggregate expression. The implementation is mostly the same as DataFusion’s. We maintain our own implementation because DataFusion uses UInt64 for the state_field count, while Spark uses Double. We have also added null_on_divide_by_zero to be consistent with Spark’s implementation.
StringSpaceExpr
SubstringExpr
SumDecimal
TimestampTruncExpr
ToJson
to_json function
UnboundColumn
This is similar to UnKnownColumn in DataFusion, but it carries a data type. It is only used when a column is not bound to a schema, for example the inputs to aggregation functions in final aggregation. In that case, we cannot bind the aggregation functions to the input schema, which in Spark consists of the grouping columns and the aggregate buffer attributes (DataFusion has a different design). However, creating certain aggregation functions requires knowing their input data types. Because UnKnownColumn does not have a data type, we implement UnboundColumn to carry it.
Variance
VAR_SAMP and VAR_POP aggregate expression. The implementation is mostly the same as DataFusion’s. We maintain our own implementation because DataFusion uses UInt64 for the state_field count, while Spark uses Double. We have also added null_on_divide_by_zero to be consistent with Spark’s implementation.
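
The two Spark-specific details named in the aggregate descriptions above (a Double count and null_on_divide_by_zero) can be illustrated with a minimal standalone sketch. It is not the crate's code: `VarianceState`, `update`, and `var_samp` are hypothetical names, the running update is Welford's algorithm chosen for illustration, and returning NaN when the flag is off is an assumption.

```rust
/// Hypothetical accumulator for illustration; not the crate's `Variance` type.
#[derive(Debug, Default)]
struct VarianceState {
    count: f64, // kept as a Double, as the description says Spark does
    mean: f64,  // running mean of the values seen so far
    m2: f64,    // running sum of squared deviations from the mean
}

impl VarianceState {
    /// Folds one non-null value into the state (Welford's update).
    fn update(&mut self, x: f64) {
        self.count += 1.0;
        let delta = x - self.mean;
        self.mean += delta / self.count;
        self.m2 += delta * (x - self.mean);
    }

    /// Sample variance. `None` models a SQL NULL result.
    fn var_samp(&self, null_on_divide_by_zero: bool) -> Option<f64> {
        if self.count == 0.0 {
            // No input rows: NULL.
            None
        } else if self.count == 1.0 {
            // The divisor (count - 1) is zero. Assumption: NULL when the
            // flag is set, NaN otherwise.
            if null_on_divide_by_zero {
                None
            } else {
                Some(f64::NAN)
            }
        } else {
            Some(self.m2 / (self.count - 1.0))
        }
    }
}

fn main() {
    let mut state = VarianceState::default();
    for x in [1.0, 2.0, 3.0, 4.0] {
        state.update(x);
    }
    println!("var_samp over four values: {:?}", state.var_samp(true));

    let mut single = VarianceState::default();
    single.update(7.0);
    println!("single row, flag on:  {:?}", single.var_samp(true));
    println!("single row, flag off: {:?}", single.var_samp(false));
}
```

Keeping the count as `f64` means the intermediate state has the same shape as the Double buffer described above, which is the stated reason for not reusing DataFusion's accumulator directly.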

Enums§

EvalMode
Spark supports three evaluation modes for expressions. The mode affects the behavior when processing input values that are invalid or would result in an error, such as divide-by-zero errors, and also affects the behavior when converting between types.
SparkError
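
As a rough illustration of how an evaluation mode can change the outcome of the same operation, here is a standalone sketch. The `Mode` enum and `add_i32` function are hypothetical and not the crate's API; the variant names are assumed from Spark's LEGACY, ANSI, and TRY modes, and 32-bit integer addition is used only because overflow makes the three behaviors easy to tell apart.

```rust
/// Hypothetical stand-in for an evaluation mode; variant names are assumed
/// from Spark's LEGACY, ANSI, and TRY modes.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Legacy,
    Ansi,
    Try,
}

/// Adds two 32-bit integers under the given mode.
/// `Ok(None)` models a SQL NULL result; `Err` models a query error.
fn add_i32(a: i32, b: i32, mode: Mode) -> Result<Option<i32>, String> {
    match mode {
        // Legacy: overflow silently wraps around.
        Mode::Legacy => Ok(Some(a.wrapping_add(b))),
        // Ansi: overflow is reported as an error.
        Mode::Ansi => a
            .checked_add(b)
            .map(Some)
            .ok_or_else(|| format!("integer overflow in {} + {}", a, b)),
        // Try: overflow becomes NULL instead of an error.
        Mode::Try => Ok(a.checked_add(b)),
    }
}

fn main() {
    for mode in [Mode::Legacy, Mode::Ansi, Mode::Try] {
        println!("{:?}: {:?}", mode, add_i32(i32::MAX, 1, mode));
    }
}
```

All three modes agree on inputs that do not overflow; they differ only in how the failing case is surfaced.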

Functions§

bitwise_not
create_comet_physical_fun
Create a physical scalar function.
create_negate_expr
spark_cast
Spark-compatible cast implementation. Defers to DataFusion’s cast where that is known to be compatible, and returns an error when the requested cast is neither supported nor DataFusion-compatible.
spark_date_add
spark_date_sub
spark_isnan
Spark-compatible isnan expression
spark_read_side_padding
Similar to DataFusion’s rpad, but does not truncate when the string is already longer than the given length.
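
The difference from a truncating rpad can be sketched in a few lines. This is a standalone illustration rather than the crate's implementation: `read_side_pad` is a hypothetical helper, padding with a space character is an assumption, and length is counted in Unicode scalar values for simplicity.

```rust
/// Hypothetical helper: right-pads `s` with spaces up to `length` characters,
/// but returns `s` unchanged when it is already at least that long.
fn read_side_pad(s: &str, length: usize) -> String {
    let char_count = s.chars().count();
    if char_count >= length {
        // Unlike a truncating rpad, longer input is returned as-is.
        return s.to_string();
    }
    let missing = length - char_count;
    let mut out = String::with_capacity(s.len() + missing);
    out.push_str(s);
    out.extend(std::iter::repeat(' ').take(missing));
    out
}

fn main() {
    println!("[{}]", read_side_pad("ab", 4));
    println!("[{}]", read_side_pad("abcdef", 4));
}
```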

Type Aliases§

SparkResult
