dataflow-rs 2.0.4

A lightweight, rule-driven workflow engine for building powerful data processing pipelines and nanoservices in Rust. Extend it with your custom tasks to create robust, maintainable services.
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Dataflow-rs is a lightweight, rule-driven workflow engine for building data processing pipelines and nanoservices in Rust. It provides an async-first execution model with pre-compiled JSONLogic for high performance.
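
To illustrate what "pre-compiled" means here, the following is a minimal std-only sketch of the compile-once, evaluate-many idea. It does not use the crate's JSONLogic compiler: `CompiledRule`, `compile`, and `evaluate` are hypothetical names, and the `key == value` syntax is a toy stand-in for a real JSONLogic expression.

```rust
use std::collections::HashMap;

/// A pre-parsed equality rule, standing in for a compiled JSONLogic expression.
/// Hypothetical type for illustration; not part of dataflow-rs.
#[derive(Debug, Clone, PartialEq)]
pub struct CompiledRule {
    key: String,
    expected: String,
}

impl CompiledRule {
    /// Parses `key == value` once, so malformed rules fail at startup
    /// instead of while a message is being processed.
    pub fn compile(source: &str) -> Result<Self, String> {
        let (key, expected) = source
            .split_once("==")
            .ok_or_else(|| format!("expected `key == value`, got `{}`", source))?;
        let (key, expected) = (key.trim(), expected.trim());
        if key.is_empty() || expected.is_empty() {
            return Err(format!("empty operand in `{}`", source));
        }
        Ok(Self {
            key: key.to_string(),
            expected: expected.to_string(),
        })
    }

    /// Per-message evaluation is a map lookup and a comparison; no parsing happens here.
    pub fn evaluate(&self, metadata: &HashMap<String, String>) -> bool {
        metadata.get(&self.key) == Some(&self.expected)
    }
}
```

A caller would invoke `CompiledRule::compile("type == order")` once when workflows are loaded and then call `evaluate` for every incoming message.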

### Core Architecture

- **Engine**: Central async component that processes messages through workflows sequentially
- **Workflow**: Collection of tasks with JSONLogic conditions (the conditions can only access `metadata` fields)
- **Task**: Individual processing units that implement the `AsyncFunctionHandler` trait
- **Message**: Data structure containing `data`, `payload`, `metadata`, `temp_data`, audit trail, and errors
- **Built-in Functions**: Data mapping/transformation and validation

### Key Design Patterns

- **Sequential Workflow Processing**: Workflows execute sequentially to allow dependencies between workflows
- **Pre-compiled JSONLogic**: All logic expressions are compiled at startup, so no parsing or compilation happens while messages are processed
- **Retry Mechanisms**: Configurable retry policies with exponential backoff for transient failures
- **Audit Trails**: Automatic change tracking for debugging and monitoring
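
The retry pattern above can be sketched with the standard library alone. This is a rough illustration of capped exponential backoff, not the crate's retry configuration: `RetryPolicy`, its fields, and `run_with_retry` are hypothetical names, and the sketch is synchronous, with the sleep function injected so it can be exercised without waiting.

```rust
use std::time::Duration;

/// Hypothetical retry policy; the field names are illustrative, not the crate's API.
#[derive(Debug, Clone, Copy)]
pub struct RetryPolicy {
    pub max_retries: u32,
    pub base_delay: Duration,
    pub max_delay: Duration,
}

impl RetryPolicy {
    /// Delay before retry number `attempt` (0-based):
    /// `base_delay * 2^attempt`, capped at `max_delay`.
    pub fn delay_for(&self, attempt: u32) -> Duration {
        let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
        self.base_delay
            .checked_mul(factor)
            .unwrap_or(self.max_delay)
            .min(self.max_delay)
    }
}

/// Calls `op` until it succeeds or `max_retries` retries have been used.
/// `sleep` is passed in so callers choose how to wait (and tests can record delays).
pub fn run_with_retry<T, E>(
    policy: &RetryPolicy,
    mut op: impl FnMut(u32) -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Retries exhausted: surface the last error.
            Err(e) if attempt >= policy.max_retries => return Err(e),
            Err(_) => {
                sleep(policy.delay_for(attempt));
                attempt += 1;
            }
        }
    }
}
```

In blocking code, `std::thread::sleep` can be passed as the `sleep` argument. An async engine would await a timer instead, which this sketch does not model.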

## Development Commands

### Build and Test
```bash
# Build the project
cargo build

# Run all tests
cargo test

# Run tests with output
cargo test -- --nocapture

# Run examples
cargo run --example benchmark            # Performance benchmark
cargo run --example custom_function      # Custom function implementation
cargo run --example complete_workflow    # Complete workflow example
```

### Code Quality
The project uses standard Rust tooling:
```bash
# Format code
cargo fmt

# Lint code
cargo clippy

# Type-check without producing binaries
cargo check
```

### Release Process
The project uses GitHub Actions for automated releases via `cargo-release` on pushes to the `main` branch.

## Code Structure

### Core Engine (`src/engine/`)
- `mod.rs`: Main Engine implementation with async message processing
- `compiler.rs`: JSONLogic compilation and caching
- `executor.rs`: Internal function execution
- `workflow_executor.rs`: Workflow orchestration
- `task_executor.rs`: Task execution
- `message.rs`: Message structure with data, metadata, and audit trail
- `workflow.rs`: Workflow definition and validation
- `task.rs`: Task structure and Function definition
- `error.rs`: Comprehensive error types (DataflowError, ErrorInfo)
- `utils.rs`: Helper utilities for data manipulation

### Built-in Functions (`src/engine/functions/`)
- `map.rs`: Data transformation using JSONLogic (supports array notation)
- `validation.rs`: Rule-based validation with custom error messages
- `mod.rs`: Registration and management of built-in functions

### Key Implementation Details

- **Workflow Conditions**: Can ONLY access `metadata` fields, not `data` fields
- **Task Dependencies**: Tasks within workflows execute sequentially, allowing later tasks to depend on earlier results
- **Error Handling**: Workflows can continue processing despite individual task failures when `continue_on_error` is enabled
- **Custom Functions**: Implement the `AsyncFunctionHandler` trait, whose async `execute()` returns `Result<(usize, Vec<Change>)>`
- **Structure Preservation**: DataLogic instances are configured with `with_preserve_structure()` to maintain object structure in JSONLogic operations
- **Async-First**: Engine uses async/await for all operations and runs on the Tokio runtime
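
The first three points above can be sketched as a simplified, synchronous model. None of the types below are the crate's own: `Message`, `Workflow`, `Condition`, `Task`, and `process` are hypothetical stand-ins, with string maps in place of JSON values. The sketch also assumes that a failing task stops processing when `continue_on_error` is off.

```rust
use std::collections::HashMap;

pub type Metadata = HashMap<String, String>;
/// A condition receives only `metadata`, so by construction it cannot read `data`.
pub type Condition = fn(&Metadata) -> bool;
pub type Task = fn(&mut Message) -> Result<(), String>;

/// Simplified stand-in for the engine's message.
#[derive(Debug, Default)]
pub struct Message {
    pub data: HashMap<String, String>,
    pub metadata: Metadata,
    pub errors: Vec<String>,
}

pub struct Workflow {
    pub id: String,
    pub condition: Condition,
    pub tasks: Vec<Task>,
    pub continue_on_error: bool,
}

/// Runs workflows in order; within a workflow, tasks run in order, so a
/// later task can rely on what an earlier task wrote into the message.
pub fn process(workflows: &[Workflow], msg: &mut Message) -> Result<(), String> {
    for wf in workflows {
        if !(wf.condition)(&msg.metadata) {
            continue; // condition not met: skip the whole workflow
        }
        for task in &wf.tasks {
            if let Err(e) = task(&mut *msg) {
                // Record the failure on the message either way.
                msg.errors.push(format!("{}: {}", wf.id, e));
                if !wf.continue_on_error {
                    return Err(e); // assumption: stop at the first failure
                }
            }
        }
    }
    Ok(())
}
```

Passing only `&Metadata` to the condition makes the metadata-only restriction a property of the types rather than a convention that callers have to remember.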

### Testing Patterns

The test suite demonstrates:
- Custom async function handler implementation
- Workflow engine integration testing
- Message processing verification
- Data mapping and transformation patterns

When extending the engine:
1. Implement `AsyncFunctionHandler` for custom tasks
2. Register functions with the engine constructor or `register_task_function()`
3. Use `Change` structs to track modifications for audit trails
4. Handle errors appropriately and return proper status codes
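
The four steps above can be sketched with a synchronous, std-only stand-in. The real `AsyncFunctionHandler` trait is async and its exact signature is not shown in this file, so everything below is hypothetical: `FunctionHandler`, `Registry`, `Uppercase`, the fields of `Change`, and the use of `200` as a success status.

```rust
use std::collections::HashMap;

/// One recorded modification; the field names are assumptions for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Change {
    pub path: String,
    pub old_value: Option<String>,
    pub new_value: Option<String>,
}

#[derive(Debug, Default)]
pub struct Message {
    pub data: HashMap<String, String>,
    pub audit_trail: Vec<Change>,
}

/// Synchronous stand-in for the crate's async handler trait.
pub trait FunctionHandler {
    fn execute(&self, message: &mut Message) -> Result<(usize, Vec<Change>), String>;
}

/// Step 1: a custom task that upper-cases one field of `data`.
pub struct Uppercase {
    pub field: String,
}

impl FunctionHandler for Uppercase {
    fn execute(&self, message: &mut Message) -> Result<(usize, Vec<Change>), String> {
        // Step 4: report a missing field as an error instead of panicking.
        let old = message
            .data
            .get(&self.field)
            .cloned()
            .ok_or_else(|| format!("missing field `{}`", self.field))?;
        let new = old.to_uppercase();
        message.data.insert(self.field.clone(), new.clone());
        // Step 3: describe the modification so it can be audited.
        let change = Change {
            path: format!("data.{}", self.field),
            old_value: Some(old),
            new_value: Some(new),
        };
        Ok((200, vec![change])) // 200 as "success" is an assumption
    }
}

/// Step 2: a name-to-handler registry, standing in for engine registration.
#[derive(Default)]
pub struct Registry {
    handlers: HashMap<String, Box<dyn FunctionHandler>>,
}

impl Registry {
    pub fn register(&mut self, name: &str, handler: Box<dyn FunctionHandler>) {
        self.handlers.insert(name.to_string(), handler);
    }

    /// Looks up the handler, runs it, and appends its changes to the audit trail.
    pub fn run(&self, name: &str, message: &mut Message) -> Result<usize, String> {
        let handler = self
            .handlers
            .get(name)
            .ok_or_else(|| format!("unknown function `{}`", name))?;
        let (status, changes) = handler.execute(message)?;
        message.audit_trail.extend(changes);
        Ok(status)
    }
}
```

Returning the changes from `execute()` rather than writing them to the audit trail inside the handler keeps the bookkeeping in one place, which is one plausible reason for the `(usize, Vec<Change>)` return shape.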