DataFusion integration for SQL processing
This module provides the integration layer between LaminarDB’s push-based
streaming engine and DataFusion’s pull-based SQL query execution.
§Architecture
┌──────────────────────────────────────────────────────────────┐
│ Ring 2: Query Planning                                       │
│ SQL Query → SessionContext → LogicalPlan → ExecutionPlan     │
│                                      │                       │
│                              StreamingScanExec               │
│                                      │                       │
│                              ┌───────▼──────┐                │
│                              │ StreamBridge │ (tokio channel)│
│                              └───────▲──────┘                │
├──────────────────────────────────────┼───────────────────────┤
│ Ring 0: Hot Path                     │                       │
│                                      │                       │
│ Source → Reactor.poll() ─────────────┘                       │
│ (Events with RecordBatch data)                               │
└──────────────────────────────────────────────────────────────┘

§Components
- StreamSource: Trait for streaming data sources
- StreamBridge: Channel-based push-to-pull bridge
- StreamingScanExec: DataFusion execution plan for streaming scans
- StreamingTableProvider: DataFusion table provider for streaming sources
- ChannelStreamSource: Concrete source using channels
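The push-to-pull inversion that StreamBridge performs can be sketched with a plain std::sync::mpsc channel. This is a simplified stand-in: the real bridge carries Arrow RecordBatches over a tokio channel, while here Batch is just a Vec and the producer is an ordinary thread.

```rust
use std::sync::mpsc;
use std::thread;

// Simplified stand-in for a RecordBatch payload.
type Batch = Vec<i64>;

fn main() {
    let (sender, receiver) = mpsc::channel::<Batch>();

    // Push side: the producer (Ring 0 hot path) sends batches into the
    // channel whenever data arrives, without waiting for a consumer.
    let producer = thread::spawn(move || {
        sender.send(vec![1, 2, 3]).unwrap();
        sender.send(vec![4, 5]).unwrap();
        // Dropping the sender signals end-of-stream to the pull side.
    });

    // Pull side: the consumer (Ring 2 query execution) blocks until a
    // batch is available, turning push semantics into pull semantics.
    let mut total_rows = 0;
    for batch in receiver {
        total_rows += batch.len();
    }
    producer.join().unwrap();
    println!("rows seen: {total_rows}"); // prints "rows seen: 5"
}
```

The key design point the sketch illustrates is that neither side blocks the other's control flow: the hot path only enqueues, and the query side only dequeues.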
§Usage
use laminar_sql::datafusion::{
    create_streaming_context, ChannelStreamSource, StreamingTableProvider,
};
// Schema types come from the arrow crate re-exported by DataFusion.
use datafusion::arrow::datatypes::{DataType, Field, Schema};
use std::sync::Arc;

// Create a streaming context
let ctx = create_streaming_context();

// Create a channel source
let schema = Arc::new(Schema::new(vec![
    Field::new("id", DataType::Int64, false),
    Field::new("value", DataType::Float64, true),
]));
let source = Arc::new(ChannelStreamSource::new(schema));
let sender = source.sender();

// Register as a table
let provider = StreamingTableProvider::new("events", source);
ctx.register_table("events", Arc::new(provider))?;

// Push data from the Reactor (`batch` is a RecordBatch produced upstream)
sender.send(batch).await?;

// Execute SQL queries
let df = ctx.sql("SELECT * FROM events WHERE value > 100").await?;

Re-exports§
pub use aggregate_bridge::create_aggregate_factory;
pub use aggregate_bridge::lookup_aggregate_udf;
pub use aggregate_bridge::result_to_scalar_value;
pub use aggregate_bridge::scalar_value_to_result;
pub use aggregate_bridge::DataFusionAccumulatorAdapter;
pub use aggregate_bridge::DataFusionAggregateFactory;
pub use execute::execute_streaming_sql;
pub use execute::DdlResult;
pub use execute::QueryResult;
pub use execute::StreamingSqlResult;
pub use watermark_udf::WatermarkUdf;
pub use window_udf::HopWindowStart;
pub use window_udf::SessionWindowStart;
pub use window_udf::TumbleWindowStart;
Modules§
- aggregate_bridge - F075: DataFusion aggregate bridge for streaming aggregation.
- execute - End-to-end streaming SQL execution (F005B)
- watermark_udf - Watermark UDF for current watermark access (DataFusion integration, F005B)
- window_udf - Window function UDFs (TUMBLE, HOP, SESSION) for DataFusion integration (F005B)
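The window UDFs map event timestamps to window boundaries. For TUMBLE this is the usual floor-to-window-size arithmetic; the sketch below shows only that arithmetic (the actual TumbleWindowStart UDF evaluates over Arrow arrays inside DataFusion, and the function name here is illustrative):

```rust
// Tumbling windows partition time into fixed, non-overlapping intervals.
// A timestamp's window start is the timestamp floored to a window boundary.
fn tumble_window_start(ts_ms: i64, size_ms: i64) -> i64 {
    // rem_euclid keeps the result correct for timestamps before the epoch.
    ts_ms - ts_ms.rem_euclid(size_ms)
}

fn main() {
    // 10-second tumbling windows.
    assert_eq!(tumble_window_start(12_345, 10_000), 10_000);
    assert_eq!(tumble_window_start(9_999, 10_000), 0);
    println!("ok");
}
```

HOP windows generalize this: with hop < size each row falls into size/hop overlapping windows, and SESSION windows are instead delimited by gaps in activity.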
Structs§
- BridgeSender - A cloneable sender for pushing RecordBatch instances into a bridge.
- BridgeStream - A stream that pulls RecordBatch instances from the bridge.
- ChannelStreamSource - A streaming source that receives data through a channel.
- SortColumn - Declares a column’s sort ordering for a streaming source.
- StreamBridge - A bridge that connects push-based data producers with pull-based consumers.
- StreamingScanExec - A DataFusion execution plan that scans from a streaming source.
- StreamingTableProvider - A DataFusion table provider backed by a streaming source.
Enums§
- BridgeSendError - Error when sending a batch to the bridge.
- BridgeTrySendError - Error when trying to send a batch without blocking.
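The split between the two error types mirrors standard bounded-channel semantics: a blocking send fails only when the consumer is gone, while a non-blocking try-send can also fail because the channel is full. A std-only sketch of the backpressure case (the real bridge moves RecordBatches over a tokio channel; the types here are stand-ins):

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

fn main() {
    // Bounded channel with capacity 1, standing in for a bounded bridge.
    let (tx, rx) = sync_channel::<u32>(1);

    tx.try_send(1).unwrap(); // fills the buffer
    match tx.try_send(2) {
        // Analogous to the "full" case of BridgeTrySendError: the batch is
        // handed back, so the hot path can retry or drop it rather than block.
        Err(TrySendError::Full(batch)) => println!("backpressure, kept batch {batch}"),
        // Analogous to the disconnected case: the pull side has gone away.
        Err(TrySendError::Disconnected(_)) => println!("consumer gone"),
        Ok(()) => println!("sent"),
    }
    drop(rx);
}
```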
Traits§
- StreamSource - A source of streaming data for DataFusion queries.
Functions§
- create_streaming_context - Creates a DataFusion session context configured for streaming queries.
- register_streaming_functions - Registers LaminarDB streaming UDFs with a session context.
- register_streaming_functions_with_watermark - Registers streaming UDFs with a live watermark source.
Type Aliases§
- StreamSourceRef - A shared reference to a stream source.