pipeflow 0.0.4

A lightweight, configuration-driven data pipeline framework
# Pipeline Examples

[Chinese documentation (中文文档)](README_CN.md)

This directory contains example pipeline configurations demonstrating pipeflow features.

## Examples by Complexity

### Beginner

| File                          | Description                                                                         |
| ----------------------------- | ----------------------------------------------------------------------------------- |
| `http_to_console.yaml`        | Basic HTTP polling with console output. Comprehensive comments explain all options. |
| `http_server_to_console.yaml` | HTTP webhook input with console output.                                             |
| `file_to_console.yaml`        | Read lines from a file and print them to the console.                               |
| `sql_to_console.yaml`         | Poll a SQLite database and print each row to the console.                           |
| `dlq_handling.yaml`           | Dead Letter Queue setup for error message handling.                                 |
| `redis_get_set.yaml`          | Redis GET polling with SET + TTL output.                                            |
| `notify.yaml`                 | Route system notifications to email, Telegram, and webhook sinks.                   |

### Intermediate

| File                          | Description                                                                              |
| ----------------------------- | ---------------------------------------------------------------------------------------- |
| `http_transform_to_file.yaml` | HTTP polling with filter + remap transforms, file output in multiple formats.            |
| `mixed_topology.yaml`         | Multi-source fan-out to multiple sinks demonstrating DAG topology.                       |
| `http_to_sqlite.yaml`         | HTTP polling to SQLite with deterministic hash id + SQL UPSERT (insert-only created_at). |

### Advanced / Real-World

| File                     | Description                                                                   |
| ------------------------ | ----------------------------------------------------------------------------- |
| `crypto_redis_postgres/` | Real-world crypto metrics pipeline to Redis and Postgres with tagged metrics. |

## How to Run

```bash
# Run a single config file
cargo run -- run examples/http_to_console.yaml

# Run with verbose/debug output
cargo run -- -v run examples/http_to_console.yaml

# Validate configuration without running
cargo run -- config validate examples/http_to_console.yaml

# Show merged + normalized configuration
cargo run -- config show examples/http_to_console.yaml --format yaml

# Run from a directory (merges all YAML files)
cargo run -- run examples/crypto_redis_postgres/
```
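
To check every single-file example in one pass, the `config validate` command above can be wrapped in a loop. This is a minimal sketch rather than part of pipeflow: `validate_examples` and the `PIPEFLOW` override variable are hypothetical names, and the loop assumes `config validate` exits non-zero when a file is invalid. It only covers top-level `.yaml` files, so directory-based examples such as `crypto_redis_postgres/` are skipped.

```bash
# Validate each top-level YAML config in a directory and count failures.
# PIPEFLOW is a hypothetical override (e.g. an installed binary);
# by default the function falls back to `cargo run --`.
validate_examples() {
  local dir="${1:-examples}"
  local failed=0
  local file
  for file in "$dir"/*.yaml; do
    [ -e "$file" ] || continue   # skip when the glob matches nothing
    echo "==> $file"
    # Left unquoted on purpose so "cargo run --" splits into separate words.
    if ! ${PIPEFLOW:-cargo run --} config validate "$file"; then
      failed=$((failed + 1))
    fi
  done
  echo "$failed config(s) failed validation"
  [ "$failed" -eq 0 ]            # non-zero exit status if anything failed
}
```

Calling `validate_examples examples` from the repository root runs the check. The final status makes the function usable as a CI step.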

## Configuration Tips

### Buffer Size Tuning

The `output_buffer_size` parameter controls the broadcast channel capacity:

```yaml
system:
  output_buffer_size: 1024 # Default value
```

Guidelines:

- **Low throughput** (< 100 msg/s): keep the default of 1024
- **High throughput** (> 1000 msg/s): increase to 4096 or higher
- **Memory constrained**: reduce to 256-512
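
One way to reason about these numbers is to estimate how long a stalled consumer can fall behind before the buffer is full. The sketch below assumes the buffer holds at most `output_buffer_size` messages and that messages arrive at a steady rate. What pipeflow does once the buffer is full is not covered here. `buffer_headroom_seconds` is a hypothetical helper, not a pipeflow command.

```bash
# Rough slack estimate: seconds of backlog a buffer can absorb
# while a consumer is stalled. Integer division rounds down.
buffer_headroom_seconds() {
  local capacity="$1"   # output_buffer_size
  local rate="$2"       # expected messages per second (must be > 0)
  echo $(( capacity / rate ))
}

buffer_headroom_seconds 1024 100    # default buffer at 100 msg/s
buffer_headroom_seconds 1024 1000   # default buffer at 1000 msg/s
buffer_headroom_seconds 4096 1000   # enlarged buffer at 1000 msg/s
```

The three calls print 10, 1, and 4. Under these assumptions the default gives about ten seconds of slack at 100 msg/s but only about one second at 1000 msg/s, which is why a larger buffer is suggested for high-throughput sources.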

Per-source override:

```yaml
sources:
  - id: high_frequency_source
    type: http_client
    output_buffer_size: 4096 # Override for this source only
```
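
To see which buffer sizes are in effect after a per-source override, the `config show` command from How to Run can be filtered. This sketch assumes the merged output lists `output_buffer_size` under that key, which is not confirmed here. `show_buffer_sizes` and the `PIPEFLOW` override variable are hypothetical names.

```bash
# Print every output_buffer_size line (with its line number) from the
# merged configuration. PIPEFLOW is a hypothetical override; by default
# the function falls back to `cargo run --`.
show_buffer_sizes() {
  local config="$1"
  # Left unquoted on purpose so "cargo run --" splits into separate words.
  ${PIPEFLOW:-cargo run --} config show "$config" --format yaml \
    | grep -n 'output_buffer_size'
}
```

For example, `show_buffer_sizes examples/http_to_console.yaml` would list the system-wide value and any per-source values, provided the key appears in the output.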

### Console Output Formats

```yaml
sinks:
  - id: console_out
    type: console
    config:
      format: pretty # Default: indented JSON with metadata
      # format: json  # Compact single-line JSON
      # format: text  # Payload only, no JSON formatting
```

### File Output Formats

```yaml
sinks:
  - id: file_out
    type: file
    config:
      path: "./output.jsonl"
      format: jsonl # Default: JSON Lines
      # format: tsv   # Tab-separated values
      # format: csv   # Comma-separated values
      append: true # Append to existing file (default)
      include_header: false # Header row for TSV/CSV
```