Crate vortex_datafusion

Integrations between Vortex and DataFusion.

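The crate exposes two main entry points: VortexFormatFactory, which registers the file-backed integration for .vortex files, and v2::VortexTable, which registers an existing Vortex DataSourceRef directly with DataFusion.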

§Registering The File Format

Most applications register VortexFormatFactory with a DataFusion SessionContext and then let DataFusion create VortexFormat and VortexSource instances as queries are planned:

use std::sync::Arc;

use datafusion::datasource::provider::DefaultTableFactory;
use datafusion::execution::SessionStateBuilder;
use datafusion::prelude::SessionContext;
use datafusion_common::GetExt;
use vortex_datafusion::VortexFormatFactory;

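// Create the factory once; it is reused for the table-factory key and the file-format list.
let factory = Arc::new(VortexFormatFactory::new());
// Register a table factory under the upper-cased extension returned by `get_ext()`.
let mut state_builder = SessionStateBuilder::new()
    .with_default_features()
    .with_table_factory(
        factory.get_ext().to_uppercase(),
        Arc::new(DefaultTableFactory::new()),
    );

// Append the Vortex format factory to the builder's existing file formats.
if let Some(file_formats) = state_builder.file_formats() {
    file_formats.push(factory.clone() as _);
}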

let ctx = SessionContext::new_with_state(state_builder.build()).enable_url_table();
ctx.sql(
    "CREATE EXTERNAL TABLE metrics (service VARCHAR, value BIGINT) \
     STORED AS vortex LOCATION 'file:///tmp/metrics/'",
)
.await?;

§Registering An Existing Vortex Data Source

If your application already has a Vortex DataSourceRef, use v2::VortexTable to register it directly with DataFusion:

use std::sync::Arc;

use arrow_schema::Schema;
use datafusion::prelude::SessionContext;
use vortex::VortexSessionDefault;
use vortex::scan::DataSourceRef;
use vortex::session::VortexSession;
use vortex_datafusion::v2::VortexTable;

// `data_source` is the DataSourceRef your application already holds; it is not constructed here.
let table = Arc::new(VortexTable::new(
    data_source,
    VortexSession::default(),
    Arc::new(Schema::empty()),
));

let ctx = SessionContext::new();
ctx.register_table("vortex_data", table)?;

Modules§

metrics
Helpers for extracting Vortex scan metrics from DataFusion execution plans.
reader
Factory for creating VortexReadAt instances from PartitionedFiles.
v2
Direct DataFusion integration for an existing Vortex DataSourceRef.

Structs§

VortexAccessPlan
Additional Vortex-specific scan constraints attached to a PartitionedFile.
VortexFormat
DataFusion FileFormat implementation for .vortex files.
VortexFormatFactory
Registration entry point for the file-backed Vortex integration.
VortexSource
File scan implementation for reading one or more .vortex files.
VortexTableOptions
Options to configure VortexFormat and VortexSource.

Traits§

ExpressionConvertor
Trait for converting DataFusion expressions to Vortex ones.