Struct ExprSimplifier

pub struct ExprSimplifier<S> { /* private fields */ }

This structure provides the API for expression simplification.

Provides simplification information based on DFSchema and ExecutionProps. This is the default implementation used by DataFusion.

For example:

use arrow::datatypes::{Schema, Field, DataType};
use datafusion_expr::{col, lit};
use datafusion_common::{DataFusionError, ToDFSchema};
use datafusion_expr::execution_props::ExecutionProps;
use datafusion_expr::simplify::SimplifyContext;
use datafusion_optimizer::simplify_expressions::ExprSimplifier;

// Create the schema
let schema = Schema::new(vec![
    Field::new("b", DataType::Int64, false),
  ])
  .to_dfschema_ref().unwrap();

// Create the simplifier
let props = ExecutionProps::new();
let context = SimplifyContext::new(&props)
   .with_schema(schema);
let simplifier = ExprSimplifier::new(context);

// Use the simplifier

// b < 2 or (1 > 3)
let expr = col("b").lt(lit(2)).or(lit(1).gt(lit(3)));

// b < 2
let simplified = simplifier.simplify(expr).unwrap();
assert_eq!(simplified, col("b").lt(lit(2)));

Implementations§

impl<S> ExprSimplifier<S>
where S: SimplifyInfo,

pub fn new(info: S) -> ExprSimplifier<S>

Create a new ExprSimplifier with the given info such as an instance of SimplifyContext. See simplify for an example.

pub fn simplify(&self, expr: Expr) -> Result<Expr, DataFusionError>

Simplifies this Expr as much as possible, evaluating constants and applying algebraic simplifications.

The expression's types must match what its operators expect; otherwise, evaluation may return an error. See coerce for a function that applies the necessary type coercion.

§Example:

b > 2 OR b > 2

can be rewritten as

b > 2

use arrow::datatypes::DataType;
use datafusion_expr::{col, lit, Expr};
use datafusion_common::Result;
use datafusion_expr::execution_props::ExecutionProps;
use datafusion_expr::simplify::SimplifyContext;
use datafusion_expr::simplify::SimplifyInfo;
use datafusion_optimizer::simplify_expressions::ExprSimplifier;
use datafusion_common::DFSchema;
use std::sync::Arc;

/// Simple implementation that provides `ExprSimplifier` with the information it needs.
/// See SimplifyContext for a structure that does this.
#[derive(Default)]
struct Info {
  execution_props: ExecutionProps,
}

impl SimplifyInfo for Info {
  fn is_boolean_type(&self, expr: &Expr) -> Result<bool> {
    Ok(false)
  }
  fn nullable(&self, expr: &Expr) -> Result<bool> {
    Ok(true)
  }
  fn execution_props(&self) -> &ExecutionProps {
    &self.execution_props
  }
  fn get_data_type(&self, expr: &Expr) -> Result<DataType> {
    Ok(DataType::Int32)
  }
}

// Create the simplifier
let simplifier = ExprSimplifier::new(Info::default());

// b > 2
let b_gt_2 = col("b").gt(lit(2));

// (b > 2) OR (b > 2)
let expr = b_gt_2.clone().or(b_gt_2.clone());

// (b > 2) OR (b > 2) --> (b > 2)
let expr = simplifier.simplify(expr).unwrap();
assert_eq!(expr, b_gt_2);
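
The two kinds of rewrite named above, constant evaluation and algebraic simplification, can be illustrated with a small standalone model. This is only a sketch: `ToyExpr` and `toy_simplify` are hypothetical names invented for this illustration, they use nothing from DataFusion, and they cover only `>` and `OR`.

```rust
/// Toy expression tree; a hypothetical stand-in, NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum ToyExpr {
    Col(&'static str),
    Int(i64),
    Bool(bool),
    Gt(Box<ToyExpr>, Box<ToyExpr>),
    Or(Box<ToyExpr>, Box<ToyExpr>),
}

/// One bottom-up pass: simplify the children first, then the node itself.
fn toy_simplify(expr: ToyExpr) -> ToyExpr {
    use ToyExpr::*;
    match expr {
        Gt(l, r) => match (toy_simplify(*l), toy_simplify(*r)) {
            // Constant evaluation: both operands are literals.
            (Int(a), Int(b)) => Bool(a > b),
            (l, r) => Gt(Box::new(l), Box::new(r)),
        },
        Or(l, r) => match (toy_simplify(*l), toy_simplify(*r)) {
            // `true OR x` and `x OR true` are always `true`.
            (Bool(true), _) | (_, Bool(true)) => Bool(true),
            // `false OR x` and `x OR false` reduce to `x`.
            (Bool(false), x) | (x, Bool(false)) => x,
            // Idempotence: `x OR x` reduces to `x`.
            (l, r) if l == r => l,
            (l, r) => Or(Box::new(l), Box::new(r)),
        },
        // Columns and literals are already as simple as they can be.
        other => other,
    }
}
```

In this model, `(b > 2) OR (b > 2)` reduces to `b > 2` through the idempotence rule, and `(b > 2) OR (1 > 3)` reduces to `b > 2` because `1 > 3` is first folded to `false`.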

pub fn simplify_with_cycle_count(&self, expr: Expr) -> Result<(Expr, u32), DataFusionError>

Like Self::simplify, simplifies this Expr as much as possible, evaluating constants and applying algebraic simplifications. Additionally returns a u32 representing the number of simplification cycles performed, which can be useful for testing optimizations.

See Self::simplify for details and usage examples.

pub fn coerce(&self, expr: Expr, schema: &DFSchema) -> Result<Expr, DataFusionError>

Apply type coercion to an Expr so that it can be evaluated as a PhysicalExpr.

See the type coercion module documentation for more details on type coercion.

pub fn with_guarantees(self, guarantees: Vec<(Expr, NullableInterval)>) -> ExprSimplifier<S>

Input guarantees about the values of columns.

The guarantees can simplify expressions. For example, if a column x is guaranteed to be 3, then the expression x > 1 can be replaced by the literal true.

The guarantees are provided as a Vec<(Expr, NullableInterval)>, where the Expr is a column reference and the NullableInterval is an interval representing the known possible values of that column.

use arrow::datatypes::{DataType, Field, Schema};
use datafusion_expr::{col, lit, Expr};
use datafusion_expr::interval_arithmetic::{Interval, NullableInterval};
use datafusion_common::{Result, ScalarValue, ToDFSchema};
use datafusion_expr::execution_props::ExecutionProps;
use datafusion_expr::simplify::SimplifyContext;
use datafusion_optimizer::simplify_expressions::ExprSimplifier;

let schema = Schema::new(vec![
  Field::new("x", DataType::Int64, false),
  Field::new("y", DataType::UInt32, false),
  Field::new("z", DataType::Int64, false),
  ])
  .to_dfschema_ref().unwrap();

// Create the simplifier
let props = ExecutionProps::new();
let context = SimplifyContext::new(&props)
   .with_schema(schema);

// Expression: (x >= 3) AND (y + 2 < 10) AND (z > 5)
let expr_x = col("x").gt_eq(lit(3_i64));
let expr_y = (col("y") + lit(2_u32)).lt(lit(10_u32));
let expr_z = col("z").gt(lit(5_i64));
let expr = expr_x.and(expr_y).and(expr_z.clone());

let guarantees = vec![
   // x ∈ [3, 5]
   (
       col("x"),
       NullableInterval::NotNull {
           values: Interval::make(Some(3_i64), Some(5_i64)).unwrap()
       }
   ),
   // y = 3
   (col("y"), NullableInterval::from(ScalarValue::UInt32(Some(3)))),
];
let simplifier = ExprSimplifier::new(context).with_guarantees(guarantees);
let output = simplifier.simplify(expr).unwrap();
// Expression becomes: true AND true AND (z > 5), which simplifies to
// z > 5.
assert_eq!(output, expr_z);
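
The reasoning behind replacing a predicate with a literal can be sketched without any DataFusion types. The following is a minimal model, assuming a non-null integer column and a closed range; `ToyInterval` and `decide_gt_eq` are hypothetical names for this illustration and do not reflect how `NullableInterval` is implemented.

```rust
/// Closed integer range `[lo, hi]`; a hypothetical stand-in for a guarantee
/// on a non-null column.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ToyInterval {
    lo: i64,
    hi: i64,
}

/// Tries to decide `column >= bound` using only the guaranteed range.
/// Returns `Some(true)` or `Some(false)` when the range settles the answer,
/// and `None` when the predicate must be kept and evaluated per row.
fn decide_gt_eq(guarantee: ToyInterval, bound: i64) -> Option<bool> {
    if guarantee.lo >= bound {
        // Every possible value satisfies the predicate.
        Some(true)
    } else if guarantee.hi < bound {
        // No possible value satisfies the predicate.
        Some(false)
    } else {
        // The range straddles the bound, so nothing can be concluded.
        None
    }
}
```

With the guarantee x ∈ [3, 5], `x >= 3` is decided as `Some(true)`, `x >= 6` as `Some(false)`, and `x >= 4` stays undecided (`None`).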

pub fn with_canonicalize(self, canonicalize: bool) -> ExprSimplifier<S>

Sets whether the Canonicalizer is applied before simplification.

If true (the default), the expression will be rewritten to canonical form before simplification. This is useful to ensure that the simplifier can apply all possible simplifications.

Some expressions, such as those in some Joins, cannot be canonicalized without changing their meaning. In these cases, canonicalization should be disabled.

use arrow::datatypes::{DataType, Field, Schema};
use datafusion_expr::{col, lit, Expr};
use datafusion_expr::interval_arithmetic::{Interval, NullableInterval};
use datafusion_common::{Result, ScalarValue, ToDFSchema};
use datafusion_expr::execution_props::ExecutionProps;
use datafusion_expr::simplify::SimplifyContext;
use datafusion_optimizer::simplify_expressions::ExprSimplifier;

let schema = Schema::new(vec![
  Field::new("a", DataType::Int64, false),
  Field::new("b", DataType::Int64, false),
  Field::new("c", DataType::Int64, false),
  ])
  .to_dfschema_ref().unwrap();

// Create the simplifier
let props = ExecutionProps::new();
let context = SimplifyContext::new(&props)
   .with_schema(schema);
let simplifier = ExprSimplifier::new(context);

// Expression: a = c AND 1 = b
let expr = col("a").eq(col("c")).and(lit(1).eq(col("b")));

// With canonicalization, the expression is rewritten to canonical form
// (though it is no simpler in this case):
let canonical = simplifier.simplify(expr.clone()).unwrap();
// Expression has been rewritten to: (c = a AND b = 1)
assert_eq!(canonical, col("c").eq(col("a")).and(col("b").eq(lit(1))));

// If canonicalization is disabled, the expression is not changed
let non_canonicalized = simplifier
  .with_canonicalize(false)
  .simplify(expr.clone())
  .unwrap();

assert_eq!(non_canonicalized, expr);
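
One canonicalization rule visible in the example above is that a literal on the left of a comparison moves to the right (`1 = b` becomes `b = 1`). The sketch below models only that rule, under the assumption that the comparison operator must be mirrored when the operands swap; `Operand`, `CmpOp`, and `literal_to_right` are hypothetical names, and the ordering of column-to-column comparisons is deliberately left out.

```rust
/// Operand of a comparison; a hypothetical stand-in, NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Operand {
    Column(&'static str),
    Literal(i64),
}

/// Comparison operators covered by this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CmpOp {
    Lt,
    Gt,
    Eq,
}

impl CmpOp {
    /// Operator that preserves the meaning once the operands swap sides.
    fn swapped(self) -> CmpOp {
        match self {
            CmpOp::Lt => CmpOp::Gt,
            CmpOp::Gt => CmpOp::Lt,
            CmpOp::Eq => CmpOp::Eq,
        }
    }
}

/// Rewrites `literal op column` as `column op' literal`; other shapes are
/// returned unchanged.
fn literal_to_right(left: Operand, op: CmpOp, right: Operand) -> (Operand, CmpOp, Operand) {
    let literal_on_left = matches!(
        (&left, &right),
        (Operand::Literal(_), Operand::Column(_))
    );
    if literal_on_left {
        (right, op.swapped(), left)
    } else {
        (left, op, right)
    }
}
```

Putting comparisons into one shape means later rules only need to match `column op literal`, which is one reason a canonical form can expose more simplifications.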

pub fn with_max_cycles(self, max_simplifier_cycles: u32) -> ExprSimplifier<S>

Specifies the maximum number of simplification cycles to run.

The simplifier can perform multiple passes of simplification. This is because the output of one simplification step can allow more optimizations in another simplification step. For example, constant evaluation can allow more expression simplifications, and expression simplifications can allow more constant evaluations.

This method specifies the maximum number of allowed iteration cycles before the simplifier returns an Expr output. However, it does not always perform the maximum number of cycles. The simplifier will attempt to detect when an Expr is unchanged by all the simplification passes, and return early. This avoids wasting time on unnecessary Expr tree traversals.

If no maximum is specified, the value of DEFAULT_MAX_SIMPLIFIER_CYCLES is used instead.

use arrow::datatypes::{DataType, Field, Schema};
use datafusion_expr::{col, lit, Expr};
use datafusion_common::{Result, ScalarValue, ToDFSchema};
use datafusion_expr::execution_props::ExecutionProps;
use datafusion_expr::simplify::SimplifyContext;
use datafusion_optimizer::simplify_expressions::ExprSimplifier;

let schema = Schema::new(vec![
  Field::new("a", DataType::Int64, false),
  ])
  .to_dfschema_ref().unwrap();

// Create the simplifier
let props = ExecutionProps::new();
let context = SimplifyContext::new(&props)
   .with_schema(schema);
let simplifier = ExprSimplifier::new(context);

// Expression: a IS NOT NULL
let expr = col("a").is_not_null();

// When using default maximum cycles, 2 cycles will be performed.
let (simplified_expr, count) = simplifier.simplify_with_cycle_count(expr.clone()).unwrap();
assert_eq!(simplified_expr, lit(true));
// 2 cycles were executed, but only 1 was needed
assert_eq!(count, 2);

// Only 1 simplification pass is necessary here, so we can set the maximum cycles to 1.
let (simplified_expr, count) = simplifier.with_max_cycles(1).simplify_with_cycle_count(expr.clone()).unwrap();
// The expression is again simplified to the literal `true`
assert_eq!(simplified_expr, lit(true));
// Only 1 cycle was executed
assert_eq!(count, 1);
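
The interplay between a cycle limit and early termination can be modelled as a generic fixed-point loop. This is a sketch under the assumption that "unchanged" is detected by comparing the value before and after a pass; `run_to_fixed_point` and `strip_double_not` are hypothetical names and not part of DataFusion.

```rust
/// Repeatedly applies `pass` until the value stops changing or `max_cycles`
/// passes have run. Returns the final value and the number of passes executed.
fn run_to_fixed_point<T, F>(mut value: T, max_cycles: u32, pass: F) -> (T, u32)
where
    T: Clone + PartialEq,
    F: Fn(T) -> T,
{
    let mut cycles = 0;
    while cycles < max_cycles {
        let next = pass(value.clone());
        cycles += 1;
        let unchanged = next == value;
        value = next;
        if unchanged {
            // A full pass changed nothing, so further passes are pointless.
            break;
        }
    }
    (value, cycles)
}

/// Toy pass: removes one leading `NOT NOT ` pair per call.
fn strip_double_not(s: String) -> String {
    match s.strip_prefix("NOT NOT ") {
        Some(rest) => rest.to_string(),
        None => s,
    }
}
```

In this model, `"NOT NOT a"` with a limit of 3 ends as `("a", 2)`: the first pass does the work and the second only confirms that nothing changed. With a limit of 1 the result is `("a", 1)`, which mirrors the pattern in the example above where capping the cycles skips the confirming pass.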

Auto Trait Implementations§

§

impl<S> Freeze for ExprSimplifier<S>
where S: Freeze,

§

impl<S> !RefUnwindSafe for ExprSimplifier<S>

§

impl<S> Send for ExprSimplifier<S>
where S: Send,

§

impl<S> Sync for ExprSimplifier<S>
where S: Sync,

§

impl<S> Unpin for ExprSimplifier<S>
where S: Unpin,

§

impl<S> !UnwindSafe for ExprSimplifier<S>

Blanket Implementations§

Source§

impl<T> AlignerFor<1> for T

Source§

type Aligner = AlignTo1<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> AlignerFor<1024> for T

Source§

type Aligner = AlignTo1024<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> AlignerFor<128> for T

Source§

type Aligner = AlignTo128<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> AlignerFor<16> for T

Source§

type Aligner = AlignTo16<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> AlignerFor<16384> for T

Source§

type Aligner = AlignTo16384<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> AlignerFor<2> for T

Source§

type Aligner = AlignTo2<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> AlignerFor<2048> for T

Source§

type Aligner = AlignTo2048<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> AlignerFor<256> for T

Source§

type Aligner = AlignTo256<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> AlignerFor<32> for T

Source§

type Aligner = AlignTo32<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> AlignerFor<32768> for T

Source§

type Aligner = AlignTo32768<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> AlignerFor<4> for T

Source§

type Aligner = AlignTo4<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> AlignerFor<4096> for T

Source§

type Aligner = AlignTo4096<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> AlignerFor<512> for T

Source§

type Aligner = AlignTo512<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> AlignerFor<64> for T

Source§

type Aligner = AlignTo64<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> AlignerFor<8> for T

Source§

type Aligner = AlignTo8<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> AlignerFor<8192> for T

Source§

type Aligner = AlignTo8192<T>

The AlignTo* type which aligns Self to ALIGNMENT.
Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper. Read more
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

impl<S> ROExtAcc for S

Source§

fn f_get<F>(&self, offset: FieldOffset<S, F, Aligned>) -> &F

Gets a reference to a field, determined by offset. Read more
Source§

fn f_get_mut<F>(&mut self, offset: FieldOffset<S, F, Aligned>) -> &mut F

Gets a mutable reference to a field, determined by offset. Read more
Source§

fn f_get_ptr<F, A>(&self, offset: FieldOffset<S, F, A>) -> *const F

Gets a const pointer to a field, the field is determined by offset. Read more
Source§

fn f_get_mut_ptr<F, A>(&mut self, offset: FieldOffset<S, F, A>) -> *mut F

Gets a mutable pointer to a field, determined by offset. Read more
Source§

impl<S> ROExtOps<Aligned> for S

Source§

fn f_replace<F>(&mut self, offset: FieldOffset<S, F, Aligned>, value: F) -> F

Replaces a field (determined by offset) with value, returning the previous value of the field. Read more
Source§

fn f_swap<F>(&mut self, offset: FieldOffset<S, F, Aligned>, right: &mut S)

Swaps a field (determined by offset) with the same field in right. Read more
Source§

fn f_get_copy<F>(&self, offset: FieldOffset<S, F, Aligned>) -> F
where F: Copy,

Gets a copy of a field (determined by offset). Read more
Source§

impl<S> ROExtOps<Unaligned> for S

Source§

fn f_replace<F>(&mut self, offset: FieldOffset<S, F, Unaligned>, value: F) -> F

Replaces a field (determined by offset) with value, returning the previous value of the field. Read more
Source§

fn f_swap<F>(&mut self, offset: FieldOffset<S, F, Unaligned>, right: &mut S)

Swaps a field (determined by offset) with the same field in right. Read more
Source§

fn f_get_copy<F>(&self, offset: FieldOffset<S, F, Unaligned>) -> F
where F: Copy,

Gets a copy of a field (determined by offset). Read more
Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<T> SelfOps for T
where T: ?Sized,

Source§

fn eq_id(&self, other: &Self) -> bool

Compares the address of self with the address of other. Read more
Source§

fn piped<F, U>(self, f: F) -> U
where F: FnOnce(Self) -> U, Self: Sized,

Emulates the pipeline operator, allowing method syntax in more places. Read more
Source§

fn piped_ref<'a, F, U>(&'a self, f: F) -> U
where F: FnOnce(&'a Self) -> U,

The same as piped, except that the function takes &Self. Useful for functions that take &Self instead of Self. Read more
Source§

fn piped_mut<'a, F, U>(&'a mut self, f: F) -> U
where F: FnOnce(&'a mut Self) -> U,

The same as piped, except that the function takes &mut Self. Useful for functions that take &mut Self instead of Self.
Source§

fn mutated<F>(self, f: F) -> Self
where F: FnOnce(&mut Self), Self: Sized,

Mutates self using a closure taking self by mutable reference, passing it along the method chain. Read more
Source§

fn observe<F>(self, f: F) -> Self
where F: FnOnce(&Self), Self: Sized,

Observes the value of self, passing it along unmodified. Useful in long method chains. Read more
Source§

fn into_<T>(self) -> T
where Self: Into<T>,

Performs a conversion with Into, using the turbofish .into_::<_>() syntax. Read more
Source§

fn as_ref_<T>(&self) -> &T
where Self: AsRef<T>, T: ?Sized,

Performs a reference to reference conversion with AsRef, using the turbofish .as_ref_::<_>() syntax. Read more
Source§

fn as_mut_<T>(&mut self) -> &mut T
where Self: AsMut<T>, T: ?Sized,

Performs a mutable reference to mutable reference conversion with AsMut, using the turbofish .as_mut_::<_>() syntax. Read more
Source§

fn drop_(self)
where Self: Sized,

Drops self using method notation. Alternative to std::mem::drop. Read more
Source§

impl<This> TransmuteElement for This
where This: ?Sized,

Source§

unsafe fn transmute_element<T>(self) -> Self::TransmutedPtr
where Self: CanTransmuteElement<T>,

Transmutes the element type of this pointer. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> TypeIdentity for T
where T: ?Sized,

Source§

type Type = T

This is always Self.
Source§

fn into_type(self) -> Self::Type
where Self: Sized, Self::Type: Sized,

Converts a value back to the original type.
Source§

fn as_type(&self) -> &Self::Type

Converts a reference back to the original type.
Source§

fn as_type_mut(&mut self) -> &mut Self::Type

Converts a mutable reference back to the original type.
Source§

fn into_type_box(self: Box<Self>) -> Box<Self::Type>

Converts a box back to the original type.
Source§

fn into_type_arc(this: Arc<Self>) -> Arc<Self::Type>

Converts an Arc back to the original type. Read more
Source§

fn into_type_rc(this: Rc<Self>) -> Rc<Self::Type>

Converts an Rc back to the original type. Read more
Source§

fn from_type(this: Self::Type) -> Self
where Self: Sized, Self::Type: Sized,

Converts a value back to the original type.
Source§

fn from_type_ref(this: &Self::Type) -> &Self

Converts a reference back to the original type.
Source§

fn from_type_mut(this: &mut Self::Type) -> &mut Self

Converts a mutable reference back to the original type.
Source§

fn from_type_box(this: Box<Self::Type>) -> Box<Self>

Converts a box back to the original type.
Source§

fn from_type_arc(this: Arc<Self::Type>) -> Arc<Self>

Converts an Arc back to the original type.
Source§

fn from_type_rc(this: Rc<Self::Type>) -> Rc<Self>

Converts an Rc back to the original type.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V

Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

impl<T> ErasedDestructor for T
where T: 'static,

Source§

impl<T> Ungil for T
where T: Send,