# Dataflow-rs
A high-performance workflow engine for building data processing pipelines in Rust with zero-overhead JSONLogic evaluation.
Dataflow-rs is a Rust library for creating high-performance data processing pipelines with pre-compiled JSONLogic and zero runtime overhead. It features an async-first architecture that separates compilation from execution, ensuring predictable low-latency performance. Whether you're building REST APIs, processing Kafka streams, or creating sophisticated data transformation pipelines, Dataflow-rs provides enterprise-grade performance with minimal complexity.
## 🚀 Key Features
- Async-First Architecture: Native async/await support with Tokio for high-throughput processing.
- Zero Runtime Compilation: All JSONLogic expressions pre-compiled at startup for optimal performance.
- Execution Tracing: Step-by-step debugging with message snapshots after each task.
- Built-in Functions: Parse (JSON/XML), Map, Validate, and Publish (JSON/XML) for complete data pipelines.
- Dynamic Workflows: Use JSONLogic to control workflow execution based on your data.
- Extensible: Easily add your own custom async processing steps (tasks) to the engine.
- WebAssembly Support: Run workflows in the browser with @goplasmatic/dataflow-wasm.
- React UI Components: Visualize and debug workflows with @goplasmatic/dataflow-ui.
- Auditing: Keep track of every change applied to your data as it moves through the pipeline.
## 🏁 Getting Started
Here's a quick example to get you up and running.
### 1. Add to Cargo.toml

```toml
[dependencies]
dataflow-rs = "2.0"
tokio = { version = "1", features = ["rt-multi-thread", "macros"] }
serde_json = "1.0"
```
### 2. Create a Workflow
Workflows are defined in JSON and consist of a series of tasks.
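For example, a minimal workflow with a condition and two tasks might look like this (field names such as `condition`, `tasks`, and `function.input` are sketched from the concepts in this README; see the crate's bundled examples for the authoritative schema):

```json
{
  "id": "user_onboarding",
  "name": "User Onboarding",
  "condition": { "==": [{ "var": "metadata.source" }, "api"] },
  "tasks": [
    { "id": "parse_input", "function": { "name": "parse_json" } },
    {
      "id": "map_fields",
      "function": {
        "name": "map",
        "input": {
          "mappings": [
            { "path": "data.full_name", "logic": { "var": "payload.name" } }
          ]
        }
      }
    }
  ]
}
```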
### 3. Run the Engine

```rust
// Sketch of the v2 API; exact constructor and method names may differ
// slightly between versions, so check the crate docs and examples.
use dataflow_rs::{Engine, Workflow};
use dataflow_rs::engine::message::Message;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Parse the workflow JSON from step 2; all of its JSONLogic is
    // compiled here, once, before any message is processed.
    let workflow = Workflow::from_json(include_str!("workflow.json"))?;

    // Workflows are immutable after engine creation; no custom functions here.
    let engine = Engine::new(vec![workflow], None);

    // Wrap an incoming payload in a Message and run it through the engine.
    let mut message = Message::new(&json!({ "name": "Ada" }));
    engine.process_message(&mut message).await?;

    println!("{}", message.data);
    Ok(())
}
```
## ✨ Core Concepts
- Engine: Async-first engine with pre-compiled logic and immutable workflows.
- Workflow: A sequence of tasks executed in order, with JSONLogic conditions.
- Task: A single async processing step with optional conditions.
- Message: The data structure flowing through workflows, with `data`, `metadata`, `temp_data`, `payload`, and an audit trail.
- ExecutionTrace: Step-by-step debugging with message snapshots after each task execution.
## 🏗️ Architecture
The v2.0 architecture uses an async-first design with pre-compiled JSONLogic for optimal performance:
### Compilation Phase (Startup)
- All JSONLogic expressions compiled once when the Engine is created
- Compiled logic cached with Arc for zero-copy sharing
- Validates all expressions early, failing fast on errors
### Execution Phase (Runtime)
- Engine orchestrates async message processing through workflows
- Built-in functions execute with pre-compiled logic (zero compilation overhead)
- `process_message()` for normal execution, `process_message_with_trace()` for debugging
- Each task can be async, enabling I/O operations without blocking
### Key Design Decisions
- Async-First: Native async/await with Tokio for high-throughput processing
- Immutable Workflows: All workflows defined at engine creation
- Pre-compilation: All parsing/compilation done once at startup
- Execution Tracing: Optional step-by-step debugging with message snapshots
## ⚡ Performance
Dataflow-rs achieves optimal performance through architectural improvements:
- Pre-Compilation: All JSONLogic compiled at startup, zero runtime overhead
- Arc-Wrapped Logic: Zero-copy sharing of compiled expressions
- Context Arc Caching: 50% improvement via cached Arc context
- Async I/O: Non-blocking operations for external services
- Predictable Latency: No runtime allocations for logic evaluation
Run the included examples with `cargo run --release --example <name>` to test performance on your hardware.
## 🛠️ Custom Functions

You can extend the engine with your own custom logic by implementing the `AsyncFunctionHandler` trait:

```rust
// Sketch of a custom handler; the exact trait signature (error type, the
// DataLogic parameter, the `(status, changes)` return) may differ between
// versions, so check the crate docs.
use async_trait::async_trait;
use dataflow_rs::engine::{AsyncFunctionHandler, error::Result, message::{Change, Message}};
use datalogic_rs::DataLogic;
use serde_json::{json, Value};
use std::collections::HashMap;
use std::sync::Arc;

// Illustrative handler name; yours can do any async work (HTTP calls, DB lookups, ...).
struct StatisticsFunction;

#[async_trait]
impl AsyncFunctionHandler for StatisticsFunction {
    async fn execute(
        &self,
        message: &mut Message,
        _input: &Value,
        _data_logic: Arc<DataLogic>,
    ) -> Result<(usize, Vec<Change>)> {
        // Do your async work here, then record the result on the message.
        message.data["stats"] = json!({ "processed": true });
        // Return an HTTP-style status code and the changes for the audit trail.
        Ok((200, vec![]))
    }
}

// Register when creating the engine:
let mut custom_functions: HashMap<String, Arc<dyn AsyncFunctionHandler>> = HashMap::new();
custom_functions.insert("statistics".to_string(), Arc::new(StatisticsFunction));
let engine = Engine::new(workflows, Some(custom_functions));
```
## 📦 Built-in Functions

| Function | Purpose | Modifies Data |
|---|---|---|
| `parse_json` | Parse JSON from payload into data context | Yes |
| `parse_xml` | Parse XML string into JSON data structure | Yes |
| `map` | Data transformation using JSONLogic | Yes |
| `validation` | Rule-based data validation | No (read-only) |
| `publish_json` | Serialize data to JSON string | Yes |
| `publish_xml` | Serialize data to XML string | Yes |
## 🌐 Related Packages
| Package | Description |
|---|---|
| @goplasmatic/dataflow-wasm | WebAssembly bindings for browser execution |
| @goplasmatic/dataflow-ui | React components for workflow visualization |
## 🤝 Contributing
We welcome contributions! Feel free to fork the repository, make your changes, and submit a pull request. Please make sure to add tests for any new features.
## 🏢 About Plasmatic
Dataflow-rs is developed by the team at Plasmatic. We're passionate about building open-source tools for data processing.
## 📄 License
This project is licensed under the Apache License, Version 2.0. See the LICENSE file for more details.