ClickHouseSessionContext

Struct ClickHouseSessionContext 

Source
pub struct ClickHouseSessionContext { /* private fields */ }
Expand description

Wrapper for SessionContext which allows running arbitrary ClickHouse functions.

Implementations§

Source§

impl ClickHouseSessionContext

Source

pub fn new( ctx: SessionContext, extension_planners: Option<Vec<Arc<dyn ExtensionPlanner + Send + Sync>>>, ) -> Self

Source

pub fn with_expr_planner(self, expr_planner: Arc<dyn ExprPlanner>) -> Self

Source

pub fn into_session_context(self) -> SessionContext

Source

pub async fn sql(&self, sql: &str) -> Result<DataFrame>

§Errors

Returns an error if the SQL query is invalid or if the query execution fails.

Source

pub async fn sql_with_options( &self, sql: &str, options: SQLOptions, ) -> Result<DataFrame>

§Errors

Returns an error if the SQL query is invalid or if the query execution fails.

Source

pub async fn statement_to_plan( &self, state: &SessionState, statement: Statement, ) -> Result<LogicalPlan>

§Errors
  • Returns an error if the SQL query is invalid or if the query execution fails.

Methods from Deref<Target = SessionContext>§

Source

pub async fn read_csv<P>( &self, table_paths: P, options: CsvReadOptions<'_>, ) -> Result<DataFrame, DataFusionError>
where P: DataFilePaths,

Creates a DataFrame for reading a CSV data source.

For more control such as reading multiple files, you can use read_table with a super::ListingTable.

Example usage is given below:

use datafusion::prelude::*;
let ctx = SessionContext::new();
// You can read a single file using `read_csv`
let df = ctx.read_csv("tests/data/example.csv", CsvReadOptions::new()).await?;
// you can also read multiple files:
let df = ctx.read_csv(vec!["tests/data/example.csv", "tests/data/example.csv"], CsvReadOptions::new()).await?;
Source

pub async fn register_csv( &self, table_ref: impl Into<TableReference>, table_path: impl AsRef<str>, options: CsvReadOptions<'_>, ) -> Result<(), DataFusionError>

Registers a CSV file as a table which can referenced from SQL statements executed against this context.

Source

pub async fn write_csv( &self, plan: Arc<dyn ExecutionPlan>, path: impl AsRef<str>, ) -> Result<(), DataFusionError>

Executes a query and writes the results to a partitioned CSV file.

Source

pub async fn read_json<P>( &self, table_paths: P, options: NdJsonReadOptions<'_>, ) -> Result<DataFrame, DataFusionError>
where P: DataFilePaths,

Creates a DataFrame for reading an JSON data source.

For more control such as reading multiple files, you can use read_table with a super::ListingTable.

For an example, see read_csv

Source

pub async fn register_json( &self, table_ref: impl Into<TableReference>, table_path: impl AsRef<str>, options: NdJsonReadOptions<'_>, ) -> Result<(), DataFusionError>

Registers a JSON file as a table that it can be referenced from SQL statements executed against this context.

Source

pub async fn write_json( &self, plan: Arc<dyn ExecutionPlan>, path: impl AsRef<str>, ) -> Result<(), DataFusionError>

Executes a query and writes the results to a partitioned JSON file.

Source

pub async fn read_parquet<P>( &self, table_paths: P, options: ParquetReadOptions<'_>, ) -> Result<DataFrame, DataFusionError>
where P: DataFilePaths,

Creates a DataFrame for reading a Parquet data source.

For more control such as reading multiple files, you can use read_table with a super::ListingTable.

For an example, see read_csv

§Note: Statistics

NOTE: by default, statistics are collected when reading the Parquet files This can slow down the initial DataFrame creation while greatly accelerating queries with certain filters.

To disable statistics collection, set the config option datafusion.execution.collect_statistics to false. See ConfigOptions and ExecutionOptions::collect_statistics for more details.

Source

pub async fn register_parquet( &self, table_ref: impl Into<TableReference>, table_path: impl AsRef<str>, options: ParquetReadOptions<'_>, ) -> Result<(), DataFusionError>

Registers a Parquet file as a table that can be referenced from SQL statements executed against this context.

§Note: Statistics

Statistics are not collected by default. See read_parquet for more details and how to enable them.

Source

pub async fn write_parquet( &self, plan: Arc<dyn ExecutionPlan>, path: impl AsRef<str>, writer_properties: Option<WriterProperties>, ) -> Result<(), DataFusionError>

Executes a query and writes the results to a partitioned Parquet file.

Source

pub async fn refresh_catalogs(&self) -> Result<(), DataFusionError>

Finds any ListingSchemaProviders and instructs them to reload tables from “disk”

Source

pub fn session_start_time(&self) -> DateTime<Utc>

Returns the time this SessionContext was created

Source

pub fn add_optimizer_rule( &self, optimizer_rule: Arc<dyn OptimizerRule + Send + Sync>, )

Adds an optimizer rule to the end of the existing rules.

See SessionState for more control of when the rule is applied.

Source

pub fn add_analyzer_rule( &self, analyzer_rule: Arc<dyn AnalyzerRule + Send + Sync>, )

Adds an analyzer rule to the end of the existing rules.

See SessionState for more control of when the rule is applied.

Source

pub fn register_object_store( &self, url: &Url, object_store: Arc<dyn ObjectStore>, ) -> Option<Arc<dyn ObjectStore>>

Registers an ObjectStore to be used with a specific URL prefix.

See RuntimeEnv::register_object_store for more details.

§Example: register a local object store for the “file://” URL prefix
let object_store_url = ObjectStoreUrl::parse("file://").unwrap();
let object_store = object_store::local::LocalFileSystem::new();
let ctx = SessionContext::new();
// All files with the file:// url prefix will be read from the local file system
ctx.register_object_store(object_store_url.as_ref(), Arc::new(object_store));
Source

pub fn register_batch( &self, table_name: &str, batch: RecordBatch, ) -> Result<Option<Arc<dyn TableProvider>>, DataFusionError>

Registers the RecordBatch as the specified table name

Source

pub fn runtime_env(&self) -> Arc<RuntimeEnv>

Return the RuntimeEnv used to run queries with this SessionContext

Source

pub fn session_id(&self) -> String

Returns an id that uniquely identifies this SessionContext.

Source

pub fn table_factory( &self, file_type: &str, ) -> Option<Arc<dyn TableProviderFactory>>

Return the TableProviderFactory that is registered for the specified file type, if any.

Source

pub fn enable_ident_normalization(&self) -> bool

Return the enable_ident_normalization of this Session

Source

pub fn copied_config(&self) -> SessionConfig

Return a copied version of config for this Session

Source

pub fn copied_table_options(&self) -> TableOptions

Return a copied version of table options for this Session

Source

pub async fn sql(&self, sql: &str) -> Result<DataFrame, DataFusionError>

Creates a DataFrame from SQL query text.

Note: This API implements DDL statements such as CREATE TABLE and CREATE VIEW and DML statements such as INSERT INTO with in-memory default implementations. See Self::sql_with_options.

§Example: Running SQL queries

See the example on Self

§Example: Creating a Table with SQL
use datafusion::prelude::*;
let ctx = SessionContext::new();
ctx
  .sql("CREATE TABLE foo (x INTEGER)")
  .await?
  .collect()
  .await?;
assert!(ctx.table_exist("foo").unwrap());
Source

pub async fn sql_with_options( &self, sql: &str, options: SQLOptions, ) -> Result<DataFrame, DataFusionError>

Creates a DataFrame from SQL query text, first validating that the queries are allowed by options

§Example: Preventing Creating a Table with SQL

If you want to avoid creating tables, or modifying data or the session, set SQLOptions appropriately:

use datafusion::prelude::*;
let ctx = SessionContext::new();
let options = SQLOptions::new()
  .with_allow_ddl(false);
let err = ctx.sql_with_options("CREATE TABLE foo (x INTEGER)", options)
  .await
  .unwrap_err();
assert!(
  err.to_string().starts_with("Error during planning: DDL not supported: CreateMemoryTable")
);
Source

pub fn parse_sql_expr( &self, sql: &str, df_schema: &DFSchema, ) -> Result<Expr, DataFusionError>

Creates logical expressions from SQL query text.

§Example: Parsing SQL queries
// datafusion will parse number as i64 first.
let sql = "a > 10";
let expected = col("a").gt(lit(10 as i64));
// provide type information that `a` is an Int32
let schema = Schema::new(vec![Field::new("a", DataType::Int32, true)]);
let df_schema = DFSchema::try_from(schema).unwrap();
let expr = SessionContext::new()
 .parse_sql_expr(sql, &df_schema)?;
assert_eq!(expected, expr);
Source

pub async fn execute_logical_plan( &self, plan: LogicalPlan, ) -> Result<DataFrame, DataFusionError>

Execute the LogicalPlan, return a DataFrame. This API is not featured limited (so all SQL such as CREATE TABLE and COPY will be run).

If you wish to limit the type of plan that can be run from SQL, see Self::sql_with_options and SQLOptions::verify_plan.

Source

pub fn create_physical_expr( &self, expr: Expr, df_schema: &DFSchema, ) -> Result<Arc<dyn PhysicalExpr>, DataFusionError>

Create a PhysicalExpr from an Expr after applying type coercion and function rewrites.

Note: The expression is not simplified or otherwise optimized: a = 1 + 2 will not be simplified to a = 3 as this is a more involved process. See the expr_api example for how to simplify expressions.

§Example
// a = 1 (i64)
let expr = col("a").eq(lit(1i64));
// provide type information that `a` is an Int32
let schema = Schema::new(vec![Field::new("a", DataType::Int32, true)]);
let df_schema = DFSchema::try_from(schema).unwrap();
// Create a PhysicalExpr. Note DataFusion automatically coerces (casts) `1i64` to `1i32`
let physical_expr = SessionContext::new()
  .create_physical_expr(expr, &df_schema).unwrap();
§See Also
Source

pub fn register_variable( &self, variable_type: VarType, provider: Arc<dyn VarProvider + Send + Sync>, )

Registers a variable provider within this context.

Source

pub fn register_udtf(&self, name: &str, fun: Arc<dyn TableFunctionImpl>)

Register a table UDF with this context

Source

pub fn register_udf(&self, f: ScalarUDF)

Registers a scalar UDF within this context.

Note in SQL queries, function names are looked up using lowercase unless the query uses quotes. For example,

  • SELECT MY_FUNC(x)... will look for a function named "my_func"
  • SELECT "my_FUNC"(x) will look for a function named "my_FUNC"

Any functions registered with the udf name or its aliases will be overwritten with this new function

Source

pub fn register_udaf(&self, f: AggregateUDF)

Registers an aggregate UDF within this context.

Note in SQL queries, aggregate names are looked up using lowercase unless the query uses quotes. For example,

  • SELECT MY_UDAF(x)... will look for an aggregate named "my_udaf"
  • SELECT "my_UDAF"(x) will look for an aggregate named "my_UDAF"
Source

pub fn register_udwf(&self, f: WindowUDF)

Registers a window UDF within this context.

Note in SQL queries, window function names are looked up using lowercase unless the query uses quotes. For example,

  • SELECT MY_UDWF(x)... will look for a window function named "my_udwf"
  • SELECT "my_UDWF"(x) will look for a window function named "my_UDWF"
Source

pub fn deregister_udf(&self, name: &str)

Deregisters a UDF within this context.

Source

pub fn deregister_udaf(&self, name: &str)

Deregisters a UDAF within this context.

Source

pub fn deregister_udwf(&self, name: &str)

Deregisters a UDWF within this context.

Source

pub fn deregister_udtf(&self, name: &str)

Deregisters a UDTF within this context.

Source

pub async fn read_arrow<P>( &self, table_paths: P, options: ArrowReadOptions<'_>, ) -> Result<DataFrame, DataFusionError>
where P: DataFilePaths,

Creates a DataFrame for reading an Arrow data source.

For more control such as reading multiple files, you can use read_table with a ListingTable.

For an example, see read_csv

Source

pub fn read_empty(&self) -> Result<DataFrame, DataFusionError>

Creates an empty DataFrame.

Source

pub fn read_table( &self, provider: Arc<dyn TableProvider>, ) -> Result<DataFrame, DataFusionError>

Creates a DataFrame for a TableProvider such as a ListingTable or a custom user defined provider.

Source

pub fn read_batch( &self, batch: RecordBatch, ) -> Result<DataFrame, DataFusionError>

Creates a DataFrame for reading a RecordBatch

Source

pub fn read_batches( &self, batches: impl IntoIterator<Item = RecordBatch>, ) -> Result<DataFrame, DataFusionError>

Create a DataFrame for reading a [Vec[RecordBatch]]

Source

pub async fn register_listing_table( &self, table_ref: impl Into<TableReference>, table_path: impl AsRef<str>, options: ListingOptions, provided_schema: Option<Arc<Schema>>, sql_definition: Option<String>, ) -> Result<(), DataFusionError>

Registers a ListingTable that can assemble multiple files from locations in an ObjectStore instance into a single table.

This method is async because it might need to resolve the schema.

Source

pub async fn register_arrow( &self, name: &str, table_path: &str, options: ArrowReadOptions<'_>, ) -> Result<(), DataFusionError>

Registers an Arrow file as a table that can be referenced from SQL statements executed against this context.

Source

pub fn register_catalog( &self, name: impl Into<String>, catalog: Arc<dyn CatalogProvider>, ) -> Option<Arc<dyn CatalogProvider>>

Registers a named catalog using a custom CatalogProvider so that it can be referenced from SQL statements executed against this context.

Returns the CatalogProvider previously registered for this name, if any

Source

pub fn catalog_names(&self) -> Vec<String>

Retrieves the list of available catalog names.

Source

pub fn catalog(&self, name: &str) -> Option<Arc<dyn CatalogProvider>>

Retrieves a CatalogProvider instance by name

Source

pub fn register_table( &self, table_ref: impl Into<TableReference>, provider: Arc<dyn TableProvider>, ) -> Result<Option<Arc<dyn TableProvider>>, DataFusionError>

Registers a TableProvider as a table that can be referenced from SQL statements executed against this context.

If a table of the same name was already registered, returns “Table already exists” error.

Source

pub fn deregister_table( &self, table_ref: impl Into<TableReference>, ) -> Result<Option<Arc<dyn TableProvider>>, DataFusionError>

Deregisters the given table.

Returns the registered provider, if any

Source

pub fn table_exist( &self, table_ref: impl Into<TableReference>, ) -> Result<bool, DataFusionError>

Return true if the specified table exists in the schema provider.

Source

pub async fn table( &self, table_ref: impl Into<TableReference>, ) -> Result<DataFrame, DataFusionError>

Retrieves a DataFrame representing a table previously registered by calling the register_table function.

Returns an error if no table has been registered with the provided reference.

Source

pub fn table_function( &self, name: &str, ) -> Result<Arc<TableFunction>, DataFusionError>

Retrieves a TableFunction reference by name.

Returns an error if no table function has been registered with the provided name.

Source

pub async fn table_provider( &self, table_ref: impl Into<TableReference>, ) -> Result<Arc<dyn TableProvider>, DataFusionError>

Return a TableProvider for the specified table.

Source

pub fn task_ctx(&self) -> Arc<TaskContext>

Get a new TaskContext to run in this session

Source

pub fn state(&self) -> SessionState

Return a new SessionState suitable for executing a single query.

Notes:

  1. query_execution_start_time is set to the current time for the returned state.

  2. The returned state is not shared with the current session state and this changes to the returned SessionState such as changing ConfigOptions will not be reflected in this SessionContext.

Source

pub fn state_ref(&self) -> Arc<RwLock<RawRwLock, SessionState>>

Get reference to SessionState

Source

pub fn state_weak_ref(&self) -> Weak<RwLock<RawRwLock, SessionState>>

Get weak reference to SessionState

Source

pub fn register_catalog_list(&self, catalog_list: Arc<dyn CatalogProviderList>)

Source

pub fn register_table_options_extension<T>(&self, extension: T)
where T: ConfigExtension,

Registers a ConfigExtension as a table option extension that can be referenced from SQL statements executed against this context.

Trait Implementations§

Source§

impl Clone for ClickHouseSessionContext

Source§

fn clone(&self) -> ClickHouseSessionContext

Returns a duplicate of the value. Read more
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl Deref for ClickHouseSessionContext

Source§

type Target = SessionContext

The resulting type after dereferencing.
Source§

fn deref(&self) -> &Self::Target

Dereferences the value.
Source§

impl From<&SessionContext> for ClickHouseSessionContext

Source§

fn from(inner: &SessionContext) -> Self

Converts to this type from the input type.
Source§

impl From<SessionContext> for ClickHouseSessionContext

Source§

fn from(inner: SessionContext) -> Self

Converts to this type from the input type.

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper. Read more
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

impl<P, T> Receiver for P
where P: Deref<Target = T> + ?Sized, T: ?Sized,

Source§

type Target = T

🔬This is a nightly-only experimental API. (arbitrary_self_types)
The target type on which the method may be called.
Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V

Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

impl<T> ErasedDestructor for T
where T: 'static,