Dataflow-rs
A high-performance workflow engine for building data processing pipelines in Rust with zero-overhead JSONLogic evaluation.
Dataflow-rs is a Rust library for creating high-performance data processing pipelines with pre-compiled JSONLogic and zero runtime overhead. It features a modular architecture that separates compilation from execution, ensuring predictable low-latency performance. Whether you're building REST APIs, processing Kafka streams, or creating sophisticated data transformation pipelines, Dataflow-rs provides enterprise-grade performance with minimal complexity.
🚀 Key Features
- Zero Runtime Compilation: All JSONLogic expressions pre-compiled at startup for optimal performance.
- Modular Architecture: Clear separation between compilation (LogicCompiler) and execution (InternalExecutor).
- Direct DataLogic Instantiation: Each engine has its own DataLogic instance for zero contention.
- Immutable Workflows: Workflows compiled once at initialization for predictable performance.
- Dynamic Workflows: Use JSONLogic to control workflow execution based on your data.
- Extensible: Easily add your own custom processing steps (tasks) to the engine.
- Built-in Functions: Comes with thread-safe implementations of HTTP requests, data mapping, and validation.
- Resilient: Built-in error handling and retry mechanisms to handle transient failures.
- Auditing: Keep track of all the changes that happen to your data as it moves through the pipeline.
🏁 Getting Started
Here's a quick example to get you up and running.
1. Add to Cargo.toml
[]
= "1.0"
= { = "1.0", = ["full"] }
= "1.0"
2. Create a Workflow
Workflows are defined in JSON and consist of a series of tasks.
3. Run the Engine
use ;
use Message;
use json;
async
4. Parallel Processing with Multiple Engines
For parallel processing, create multiple engine instances across threads:
use ;
use Message;
use json;
use Arc;
use JoinSet;
async
✨ Core Concepts
- Engine: High-performance engine with pre-compiled logic and immutable workflows.
- LogicCompiler: Compiles all JSONLogic expressions at initialization for zero runtime overhead.
- InternalExecutor: Executes built-in functions using pre-compiled logic from the cache.
- Workflow: A sequence of tasks executed in order, with conditions accessing only metadata.
- Task: A single processing step with optional JSONLogic conditions.
- Message: The data structure flowing through workflows with audit trail support.
🏗️ Architecture
The v3.0 architecture focuses on simplicity and performance through clear separation of concerns:
Compilation Phase (Startup)
- LogicCompiler compiles all JSONLogic expressions from workflows and tasks
- Creates an indexed cache of compiled logic for O(1) runtime access
- Validates all logic expressions early, failing fast on errors
- Stores compiled logic in contiguous memory for cache efficiency
Execution Phase (Runtime)
- Engine orchestrates message processing through immutable workflows
- InternalExecutor evaluates conditions and executes built-in functions
- Uses compiled logic from cache - zero compilation overhead at runtime
- Direct DataLogic instantiation eliminates any locking or contention
Key Design Decisions
- Immutable Workflows: All workflows defined at engine creation, cannot be modified
- Pre-compilation: All expensive parsing/compilation done once at startup
- Direct Instantiation: Each engine owns its DataLogic instance directly
- Modular Design: Clear boundaries between compilation, execution, and orchestration
⚡ Performance
Dataflow-rs achieves optimal performance through architectural improvements:
- Pre-Compilation: All JSONLogic compiled at startup, zero runtime overhead
- Cache-Friendly: Compiled logic stored contiguously in memory
- Direct Instantiation: DataLogic instances created directly without locking
- Predictable Latency: No runtime allocations for logic evaluation
- Modular Design: Clear separation of compilation and execution phases
Run the included benchmark to test performance on your hardware:
🛠️ Custom Functions
You can extend the engine with your own custom logic by implementing the FunctionHandler
trait:
use ;
use Value;
;
// Register when creating the engine:
let mut custom_functions = new;
custom_functions.insert;
let engine = new;
🤝 Contributing
We welcome contributions! Feel free to fork the repository, make your changes, and submit a pull request. Please make sure to add tests for any new features.
🏢 About Plasmatic
Dataflow-rs is developed by the team at Plasmatic. We're passionate about building open-source tools for data processing.
📄 License
This project is licensed under the Apache License, Version 2.0. See the LICENSE file for more details.