Struct datafusion::execution::context::SessionContext

impl SessionContext

pub fn new() -> Self

Creates a new SessionContext using the default SessionConfig.

pub async fn refresh_catalogs(&self) -> Result<()>

Finds any ListingSchemaProviders and instructs them to reload their tables from “disk”.

pub fn with_config(config: SessionConfig) -> Self

Creates a new SessionContext using the provided SessionConfig and a new RuntimeEnv.

See Self::with_config_rt for more details on resource limits.

pub fn with_config_rt(config: SessionConfig, runtime: Arc<RuntimeEnv>) -> Self

Creates a new SessionContext using the provided SessionConfig and a RuntimeEnv.

Resource Limits

By default, each new SessionContext creates a new RuntimeEnv, and therefore will not enforce memory or disk limits for queries run on different SessionContexts.

To enforce resource limits (e.g. to limit the total amount of memory used) across all DataFusion queries in a process, every SessionContext should be configured with the same RuntimeEnv.
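The effect of sharing one RuntimeEnv can be modelled without DataFusion. The sketch below is only an illustration using the standard library: `ToyRuntime`, `ToySession`, `try_reserve`, and `run_query` are hypothetical stand-ins, not DataFusion types or methods. It shows why a limit is only enforced across sessions when they hold the same `Arc`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a runtime environment that owns one memory budget.
struct ToyRuntime {
    limit: usize,
    used: AtomicUsize,
}

impl ToyRuntime {
    fn new(limit: usize) -> Arc<Self> {
        Arc::new(Self { limit, used: AtomicUsize::new(0) })
    }

    /// Reserves `bytes` if the budget allows it; returns false otherwise.
    fn try_reserve(&self, bytes: usize) -> bool {
        self.used
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |used| {
                // Reject the update when the new total would exceed the limit.
                used.checked_add(bytes).filter(|total| *total <= self.limit)
            })
            .is_ok()
    }
}

/// Hypothetical stand-in for a session that runs queries against a runtime.
struct ToySession {
    runtime: Arc<ToyRuntime>,
}

impl ToySession {
    fn with_runtime(runtime: Arc<ToyRuntime>) -> Self {
        Self { runtime }
    }

    /// Pretends to run a query that needs `bytes_needed` bytes of memory.
    fn run_query(&self, bytes_needed: usize) -> bool {
        self.runtime.try_reserve(bytes_needed)
    }
}
```

In this model, two sessions built from clones of one `Arc<ToyRuntime>` compete for a single budget, while two sessions that each receive a fresh runtime never see each other's usage. That mirrors the difference between passing one shared RuntimeEnv to with_config_rt and letting each context create its own.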

pub fn with_state(state: SessionState) -> Self

Creates a new SessionContext using the provided SessionState

pub fn session_start_time(&self) -> DateTime<Utc>

Returns the time this SessionContext was created

pub fn register_batch( &self, table_name: &str, batch: RecordBatch ) -> Result<Option<Arc<dyn TableProvider>>>

Registers the RecordBatch as the specified table name

pub fn runtime_env(&self) -> Arc<RuntimeEnv>

Return the RuntimeEnv used to run queries with this SessionContext

pub fn session_id(&self) -> String

Returns an id that uniquely identifies this SessionContext.

pub fn table_factory( &self, file_type: &str ) -> Option<Arc<dyn TableProviderFactory>>

Return the TableProviderFactory that is registered for the specified file type, if any.

pub fn enable_ident_normalization(&self) -> bool

Returns whether identifier normalization (the enable_ident_normalization setting) is enabled for this session.

pub fn copied_config(&self) -> SessionConfig

Return a copied version of config for this Session

pub async fn sql(&self, sql: &str) -> Result<DataFrame>

Creates a DataFrame from SQL query text.

Note: This API implements DDL statements such as CREATE TABLE and CREATE VIEW and DML statements such as INSERT INTO with in-memory default implementations. See Self::sql_with_options.

Example: Running SQL queries

See the example on Self

Example: Creating a Table with SQL

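use datafusion::error::Result;
use datafusion::prelude::*;

// Assumes the tokio runtime is available as a dependency.
#[tokio::main]
async fn main() -> Result<()> {
    let ctx = SessionContext::new();
    // Plan the statement, then execute it by collecting the (empty) result.
    ctx.sql("CREATE TABLE foo (x INTEGER)")
        .await?
        .collect()
        .await?;
    assert!(ctx.table_exist("foo").unwrap());
    Ok(())
}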

pub async fn sql_with_options( &self, sql: &str, options: SQLOptions ) -> Result<DataFrame>

Creates a DataFrame from SQL query text, first validating that the query is allowed by options.

Example: Preventing Creating a Table with SQL

If you want to avoid creating tables, or modifying data or the session, set SQLOptions appropriately:

use datafusion::prelude::*;

// Assumes the tokio runtime is available as a dependency.
#[tokio::main]
async fn main() {
    let ctx = SessionContext::new();
    // Disallow DDL statements such as CREATE TABLE.
    let options = SQLOptions::new()
        .with_allow_ddl(false);
    let err = ctx.sql_with_options("CREATE TABLE foo (x INTEGER)", options)
        .await
        .unwrap_err();
    assert!(
        err.to_string().starts_with("Error during planning: DDL not supported: CreateMemoryTable")
    );
}

pub async fn execute_logical_plan(&self, plan: LogicalPlan) -> Result<DataFrame>

Executes the LogicalPlan and returns a DataFrame. This API is not feature limited (so all SQL such as CREATE TABLE and COPY will be run).

If you wish to limit the type of plan that can be run from SQL, see Self::sql_with_options and SQLOptions::verify_plan.

pub fn register_variable( &self, variable_type: VarType, provider: Arc<dyn VarProvider + Send + Sync> )

Registers a variable provider within this context.

pub fn register_udf(&self, f: ScalarUDF)

Registers a scalar UDF within this context.

Note: in SQL queries, function names are looked up in lowercase unless the name is quoted. For example:

SELECT MY_FUNC(x)... will look for a function named "my_func"
SELECT "my_FUNC"(x) will look for a function named "my_FUNC"
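The lookup rule above can be sketched as a small helper. This is an illustrative model written against the standard library only; `normalize_ident` is a hypothetical name and is not DataFusion's implementation, which works on parsed SQL identifiers rather than raw strings.

```rust
/// Illustrative model of SQL identifier normalization (not DataFusion's code).
/// Unquoted names are folded to lowercase, while a name wrapped in double
/// quotes keeps its exact spelling, with the surrounding quotes removed.
fn normalize_ident(raw: &str) -> String {
    match raw.strip_prefix('"').and_then(|rest| rest.strip_suffix('"')) {
        // Quoted: preserve the case exactly as written.
        Some(quoted) => quoted.to_string(),
        // Unquoted: fold to lowercase before looking the name up.
        None => raw.to_lowercase(),
    }
}
```

Under this model, a function registered as "my_func" is found by both `MY_FUNC` and `my_func`, whereas a function registered with mixed case, such as "my_FUNC", can only be reached by quoting its name in the query.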

pub fn register_udaf(&self, f: AggregateUDF)

Registers an aggregate UDF within this context.

Note: in SQL queries, aggregate names are looked up in lowercase unless the name is quoted. For example:

SELECT MY_UDAF(x)... will look for an aggregate named "my_udaf"
SELECT "my_UDAF"(x) will look for an aggregate named "my_UDAF"

pub fn register_udwf(&self, f: WindowUDF)

Registers a window UDF within this context.

Note: in SQL queries, window function names are looked up in lowercase unless the name is quoted. For example:

SELECT MY_UDWF(x)... will look for a window function named "my_udwf"
SELECT "my_UDWF"(x) will look for a window function named "my_UDWF"

pub async fn read_avro<P: DataFilePaths>( &self, table_paths: P, options: AvroReadOptions<'_> ) -> Result<DataFrame>

Creates a DataFrame for reading an Avro data source.

For more control such as reading multiple files, you can use read_table with a ListingTable.

For an example, see read_csv

pub async fn read_json<P: DataFilePaths>( &self, table_paths: P, options: NdJsonReadOptions<'_> ) -> Result<DataFrame>

Creates a DataFrame for reading a JSON data source.

For more control such as reading multiple files, you can use read_table with a ListingTable.

For an example, see read_csv

pub async fn read_arrow<P: DataFilePaths>( &self, table_paths: P, options: ArrowReadOptions<'_> ) -> Result<DataFrame>

Creates a DataFrame for reading an Arrow data source.

For more control such as reading multiple files, you can use read_table with a ListingTable.

For an example, see read_csv

pub fn read_empty(&self) -> Result<DataFrame>

Creates an empty DataFrame.

pub async fn read_csv<P: DataFilePaths>( &self, table_paths: P, options: CsvReadOptions<'_> ) -> Result<DataFrame>

Creates a DataFrame for reading a CSV data source.

For more control such as reading multiple files, you can use read_table with a ListingTable.

Example usage is given below:

use datafusion::prelude::*;

// Assumes the tokio runtime is available as a dependency.
#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
    let ctx = SessionContext::new();
    // You can read a single file using `read_csv`
    let df = ctx.read_csv("tests/data/example.csv", CsvReadOptions::new()).await?;
    // You can also read multiple files by passing a Vec of paths:
    let df = ctx.read_csv(vec!["tests/data/example.csv", "tests/data/example.csv"], CsvReadOptions::new()).await?;
    Ok(())
}

pub async fn read_parquet<P: DataFilePaths>( &self, table_paths: P, options: ParquetReadOptions<'_> ) -> Result<DataFrame>

Creates a DataFrame for reading a Parquet data source.

For more control such as reading multiple files, you can use read_table with a ListingTable.

For an example, see read_csv

pub fn read_table(&self, provider: Arc<dyn TableProvider>) -> Result<DataFrame>

Creates a DataFrame for a TableProvider such as a ListingTable or a custom user defined provider.

pub fn read_batch(&self, batch: RecordBatch) -> Result<DataFrame>

Creates a DataFrame for reading a RecordBatch

pub async fn register_listing_table( &self, name: &str, table_path: impl AsRef<str>, options: ListingOptions, provided_schema: Option<SchemaRef>, sql_definition: Option<String> ) -> Result<()>

Registers a ListingTable that can assemble multiple files from locations in an ObjectStore instance into a single table.

This method is async because it might need to resolve the schema.

pub async fn register_csv( &self, name: &str, table_path: &str, options: CsvReadOptions<'_> ) -> Result<()>

Registers a CSV file as a table that can be referenced from SQL statements executed against this context.

pub async fn register_json( &self, name: &str, table_path: &str, options: NdJsonReadOptions<'_> ) -> Result<()>

Registers a JSON file as a table that can be referenced from SQL statements executed against this context.

pub async fn register_parquet( &self, name: &str, table_path: &str, options: ParquetReadOptions<'_> ) -> Result<()>

Registers a Parquet file as a table that can be referenced from SQL statements executed against this context.

pub async fn register_avro( &self, name: &str, table_path: &str, options: AvroReadOptions<'_> ) -> Result<()>

Registers an Avro file as a table that can be referenced from SQL statements executed against this context.

pub async fn register_arrow( &self, name: &str, table_path: &str, options: ArrowReadOptions<'_> ) -> Result<()>

Registers an Arrow file as a table that can be referenced from SQL statements executed against this context.

pub fn register_catalog( &self, name: impl Into<String>, catalog: Arc<dyn CatalogProvider> ) -> Option<Arc<dyn CatalogProvider>>

Registers a named catalog using a custom CatalogProvider so that it can be referenced from SQL statements executed against this context.

Returns the CatalogProvider previously registered for this name, if any

pub fn catalog_names(&self) -> Vec<String>

Retrieves the list of available catalog names.

pub fn catalog(&self, name: &str) -> Option<Arc<dyn CatalogProvider>>

Retrieves a CatalogProvider instance by name

pub fn register_table<'a>( &'a self, table_ref: impl Into<TableReference<'a>>, provider: Arc<dyn TableProvider> ) -> Result<Option<Arc<dyn TableProvider>>>

Registers a TableProvider as a table that can be referenced from SQL statements executed against this context.

Returns the TableProvider previously registered for this reference, if any

pub fn deregister_table<'a>( &'a self, table_ref: impl Into<TableReference<'a>> ) -> Result<Option<Arc<dyn TableProvider>>>

Deregisters the given table.

Returns the registered provider, if any

pub fn table_exist<'a>( &'a self, table_ref: impl Into<TableReference<'a>> ) -> Result<bool>

Return true if the specified table exists in the schema provider.

pub async fn table<'a>( &self, table_ref: impl Into<TableReference<'a>> ) -> Result<DataFrame>

Retrieves a DataFrame representing a table previously registered by calling the register_table function.

Returns an error if no table has been registered with the provided reference.

pub async fn table_provider<'a>( &self, table_ref: impl Into<TableReference<'a>> ) -> Result<Arc<dyn TableProvider>>

Return a TableProvider for the specified table.

pub fn tables(&self) -> Result<HashSet<String>>

Deprecated since 23.0.0: Please use the catalog provider interface (SessionContext::catalog) to examine available catalogs, schemas, and tables

Returns the set of available tables in the default catalog and schema.

Use table to get a specific table.

pub fn optimize(&self, plan: &LogicalPlan) -> Result<LogicalPlan>

Deprecated since 23.0.0: Use SessionState::optimize to ensure a consistent state for planning and execution

Optimizes the logical plan by applying optimizer rules.

pub async fn create_physical_plan( &self, logical_plan: &LogicalPlan ) -> Result<Arc<dyn ExecutionPlan>>

Deprecated since 23.0.0: Use SessionState::create_physical_plan or DataFrame::create_physical_plan to ensure a consistent state for planning and execution

Creates a physical plan from a logical plan.

pub async fn write_csv( &self, plan: Arc<dyn ExecutionPlan>, path: impl AsRef<str> ) -> Result<()>

Executes a query and writes the results to a partitioned CSV file.

pub async fn write_json( &self, plan: Arc<dyn ExecutionPlan>, path: impl AsRef<str> ) -> Result<()>

Executes a query and writes the results to a partitioned JSON file.

pub async fn write_parquet( &self, plan: Arc<dyn ExecutionPlan>, path: impl AsRef<str>, writer_properties: Option<WriterProperties> ) -> Result<()>

Executes a query and writes the results to a partitioned Parquet file.

pub fn task_ctx(&self) -> Arc<TaskContext>

Get a new TaskContext to run in this session

pub fn state(&self) -> SessionState

Snapshots the SessionState of this SessionContext, setting the query_execution_start_time to the current time.

pub fn state_weak_ref(&self) -> Weak<RwLock<SessionState>>

Get weak reference to SessionState
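A weak reference does not keep the state alive, so callers must upgrade it before use and handle the case where the owner is gone. The sketch below illustrates that pattern with the standard library's `Arc`, `Weak`, and `RwLock`. `ToyState` and `read_session_id` are hypothetical, and the RwLock in the signature above may be a different implementation than `std::sync::RwLock`.

```rust
use std::sync::{Arc, RwLock, Weak};

/// Hypothetical stand-in for the session state guarded by the lock.
struct ToyState {
    session_id: String,
}

/// Reads the session id through a weak handle. Returns None when the last
/// strong reference has been dropped, or when the lock is poisoned.
fn read_session_id(handle: &Weak<RwLock<ToyState>>) -> Option<String> {
    // `upgrade` yields a strong reference only while the state is still alive.
    let strong = handle.upgrade()?;
    let guard = strong.read().ok()?;
    Some(guard.session_id.clone())
}
```

Holding a `Weak` rather than an `Arc` lets long-lived components refer back to the session without creating a reference cycle that would prevent the state from being freed.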

pub fn register_catalog_list(&mut self, catalog_list: Arc<dyn CatalogList>)

Register CatalogList in SessionState

Trait Implementations

impl Clone for SessionContext

fn clone(&self) -> SessionContext

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Default for SessionContext

fn default() -> Self

Returns the “default value” for a type.

impl From<&SessionContext> for TaskContext

Create a new task context instance from SessionContext

fn from(session: &SessionContext) -> Self

Converts to this type from the input type.

impl FunctionRegistry for SessionContext

fn udfs(&self) -> HashSet<String>

Set of all available udfs.

fn udf(&self, name: &str) -> Result<Arc<ScalarUDF>>

Returns a reference to the udf named name.

fn udaf(&self, name: &str) -> Result<Arc<AggregateUDF>>

Returns a reference to the udaf named name.

fn udwf(&self, name: &str) -> Result<Arc<WindowUDF>>

Returns a reference to the udwf named name.

Auto Trait Implementations

impl !UnwindSafe for SessionContext

Blanket Implementations

impl<T> Any for T where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> Same<T> for T

type Output = T

Should always be Self

impl<T> ToOwned for T where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T where V: MultiLane<T>,