Skip to main content

Module datafusion

Module datafusion 

Source
Expand description

DataFusion integration for SQL processing

This module provides the integration layer between LaminarDB’s push-based streaming engine and DataFusion’s pull-based SQL query execution.

§Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    Ring 2: Query Planning                        │
│  SQL Query → SessionContext → LogicalPlan → ExecutionPlan       │
│                                      │                          │
│                            StreamingScanExec                    │
│                                      │                          │
│                              ┌───────▼──────┐                   │
│                              │ StreamBridge │ (tokio channel)   │
│                              └───────▲──────┘                   │
├──────────────────────────────────────┼──────────────────────────┤
│                    Ring 0: Hot Path   │                          │
│                                      │                          │
│  Source → Reactor.poll() ────────────┘                          │
│              (Events with RecordBatch data)                     │
└─────────────────────────────────────────────────────────────────┘

§Components

§Usage

use laminar_sql::datafusion::{
    create_streaming_context, ChannelStreamSource, StreamingTableProvider,
};
use std::sync::Arc;

// Create a streaming context
let ctx = create_streaming_context();

// Create a channel source
let schema = Arc::new(Schema::new(vec![
    Field::new("id", DataType::Int64, false),
    Field::new("value", DataType::Float64, true),
]));
let source = Arc::new(ChannelStreamSource::new(schema));
let sender = source.sender();

// Register as a table
let provider = StreamingTableProvider::new("events", source);
ctx.register_table("events", Arc::new(provider))?;

// Push data from the Reactor
sender.send(batch).await?;

// Execute SQL queries
let df = ctx.sql("SELECT * FROM events WHERE value > 100").await?;

Re-exports§

pub use aggregate_bridge::create_aggregate_factory;
pub use aggregate_bridge::lookup_aggregate_udf;
pub use aggregate_bridge::result_to_scalar_value;
pub use aggregate_bridge::scalar_value_to_result;
pub use aggregate_bridge::DataFusionAccumulatorAdapter;
pub use aggregate_bridge::DataFusionAggregateFactory;
pub use execute::execute_streaming_sql;
pub use execute::DdlResult;
pub use execute::QueryResult;
pub use execute::StreamingSqlResult;
pub use watermark_udf::WatermarkUdf;
pub use window_udf::HopWindowStart;
pub use window_udf::SessionWindowStart;
pub use window_udf::TumbleWindowStart;

Modules§

aggregate_bridge
F075: DataFusion aggregate bridge for streaming aggregation.
execute
End-to-end streaming SQL execution End-to-end streaming SQL execution (F005B)
watermark_udf
Watermark UDF for current watermark access Watermark UDF for DataFusion integration (F005B)
window_udf
Window function UDFs (TUMBLE, HOP, SESSION) Window function UDFs for DataFusion integration (F005B)

Structs§

BridgeSender
A cloneable sender for pushing RecordBatch instances into a bridge.
BridgeStream
A stream that pulls RecordBatch instances from the bridge.
ChannelStreamSource
A streaming source that receives data through a channel.
SortColumn
Declares a column’s sort ordering for a streaming source.
StreamBridge
A bridge that connects push-based data producers with pull-based consumers.
StreamingScanExec
A DataFusion execution plan that scans from a streaming source.
StreamingTableProvider
A DataFusion table provider backed by a streaming source.

Enums§

BridgeSendError
Error when sending a batch to the bridge.
BridgeTrySendError
Error when trying to send a batch without blocking.

Traits§

StreamSource
A source of streaming data for DataFusion queries.

Functions§

create_streaming_context
Creates a DataFusion session context configured for streaming queries.
register_streaming_functions
Registers LaminarDB streaming UDFs with a session context.
register_streaming_functions_with_watermark
Registers streaming UDFs with a live watermark source.

Type Aliases§

StreamSourceRef
A shared reference to a stream source.