Integrations between Vortex and DataFusion.
The crate exposes two main entry points:
- VortexFormatFactory for the file-based integration used by SQL, CREATE EXTERNAL TABLE, and ListingTable.
- v2 for direct integration from an existing Vortex DataSourceRef.
§Registering The File Format
Most applications register VortexFormatFactory with a DataFusion
SessionContext and then let DataFusion create VortexFormat and
VortexSource instances as queries are planned:
use std::sync::Arc;
use datafusion::datasource::provider::DefaultTableFactory;
use datafusion::execution::SessionStateBuilder;
use datafusion::prelude::SessionContext;
use datafusion_common::GetExt;
use vortex_datafusion::VortexFormatFactory;
let factory = Arc::new(VortexFormatFactory::new());
// Register a table factory keyed by the upper-cased file extension.
let mut state_builder = SessionStateBuilder::new()
    .with_default_features()
    .with_table_factory(
        factory.get_ext().to_uppercase(),
        Arc::new(DefaultTableFactory::new()),
    );
// Add the Vortex format to the builder's file formats, if it has any.
if let Some(file_formats) = state_builder.file_formats() {
    file_formats.push(factory.clone() as _);
}
let ctx = SessionContext::new_with_state(state_builder.build()).enable_url_table();
ctx.sql(
    "CREATE EXTERNAL TABLE metrics (service VARCHAR, value BIGINT) \
     STORED AS vortex LOCATION 'file:///tmp/metrics/'",
)
.await?;

§Registering An Existing Vortex Data Source
If your application already has a Vortex DataSourceRef, use
v2::VortexTable to register it directly with DataFusion:
use std::sync::Arc;
use arrow_schema::Schema;
use datafusion::prelude::SessionContext;
use vortex::VortexSessionDefault;
use vortex::scan::DataSourceRef;
use vortex::session::VortexSession;
use vortex_datafusion::v2::VortexTable;
// `data_source` is the application's existing `DataSourceRef`.
let table = Arc::new(VortexTable::new(
    data_source,
    VortexSession::default(),
    Arc::new(Schema::empty()),
));
let ctx = SessionContext::new();
ctx.register_table("vortex_data", table)?;

Modules§
- metrics: Helpers for extracting Vortex scan metrics from DataFusion execution plans.
- reader: Factory for creating VortexReadAt instances from PartitionedFiles.
- v2: Direct DataFusion integration for an existing Vortex DataSourceRef.
Structs§
- VortexAccessPlan: Additional Vortex-specific scan constraints attached to a PartitionedFile.
- VortexFormat: DataFusion FileFormat implementation for .vortex files.
- VortexFormatFactory: Registration entry point for the file-backed Vortex integration.
- VortexSource: File scan implementation for reading one or more .vortex files.
- VortexTableOptions: Options to configure VortexFormat and VortexSource.
Traits§
- ExpressionConvertor: Trait for converting DataFusion expressions to Vortex ones.
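
To illustrate what converting one engine's expressions into another's can involve, here is a minimal, self-contained sketch. It is a toy model only. `SourceExpr`, `TargetExpr`, `ToyConvertor`, `DefaultToyConvertor`, and `split_pushdown` are hypothetical names invented for this example, and none of them mirror the real `ExpressionConvertor` trait or the DataFusion and Vortex expression types. The assumption shown is that a convertor translates what it can and declines the rest, so that unconverted predicates stay with the host engine.

```rust
// Toy stand-ins for a host-engine and a storage-engine expression tree.
// These are NOT the DataFusion or Vortex types.
#[derive(Debug, Clone, PartialEq)]
enum SourceExpr {
    Column(String),
    Literal(i64),
    Gt(Box<SourceExpr>, Box<SourceExpr>),
    // An operator the target engine is assumed not to support.
    Regex(Box<SourceExpr>, String),
}

#[derive(Debug, Clone, PartialEq)]
enum TargetExpr {
    Field(String),
    Const(i64),
    Greater(Box<TargetExpr>, Box<TargetExpr>),
}

/// Hypothetical trait shape: report support, then convert.
trait ToyConvertor {
    fn can_convert(&self, expr: &SourceExpr) -> bool;
    fn convert(&self, expr: &SourceExpr) -> Option<TargetExpr>;
}

struct DefaultToyConvertor;

impl ToyConvertor for DefaultToyConvertor {
    fn can_convert(&self, expr: &SourceExpr) -> bool {
        self.convert(expr).is_some()
    }

    fn convert(&self, expr: &SourceExpr) -> Option<TargetExpr> {
        match expr {
            SourceExpr::Column(name) => Some(TargetExpr::Field(name.clone())),
            SourceExpr::Literal(v) => Some(TargetExpr::Const(*v)),
            // Conversion is recursive: if either side fails, the whole node fails.
            SourceExpr::Gt(l, r) => Some(TargetExpr::Greater(
                Box::new(self.convert(l)?),
                Box::new(self.convert(r)?),
            )),
            SourceExpr::Regex(..) => None,
        }
    }
}

/// Split predicates into those pushed down and those left to the host engine.
fn split_pushdown(
    conv: &dyn ToyConvertor,
    preds: &[SourceExpr],
) -> (Vec<TargetExpr>, Vec<SourceExpr>) {
    let mut pushed = Vec::new();
    let mut residual = Vec::new();
    for p in preds {
        match conv.convert(p) {
            Some(t) => pushed.push(t),
            None => residual.push(p.clone()),
        }
    }
    (pushed, residual)
}
```

Returning `Option` keeps an unsupported node from being silently dropped: the caller decides whether to evaluate it elsewhere or reject the query.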