dag_exec

Sync DAG executor for CPU-heavy pipelines: bounded parallelism + partial evaluation (std-only).

Why

When you have a dependency graph of expensive work (hashing, decoding, proof steps, indexing stages), you often want:

compute only requested outputs (prune everything else)
run in parallel without bringing an async runtime
control backpressure (cap in-flight work)

dag_exec is a small, std-only crate focused on that niche.

Features

Partial evaluation: computes only requested outputs (prunes unused subgraph)
Sequential executor: Kahn-style topo scheduling + cycle detection
Parallel executor (std threads):
- worker pool with RAII shutdown
- bounded dispatch via max_in_flight + per-worker bounded queues (worker_queue_cap)
Build-time validation: missing deps, duplicate keys, empty graph

Status

Published as v0.1.0. API may change while pre-1.0.

Minimal example

use dag_exec::{DagBuilder, Executor, ExecutorConfig};
use std::sync::Arc;

let mut b = DagBuilder::<String, i32, ()>::new();
b.add_source("a".into(), 1)?;
b.add_source("b".into(), 2)?;
b.add_task("c".into(), vec!["a".into(), "b".into()], |xs: &[Arc<i32>]| Ok(*xs[0] + *xs[1]))?;
let dag = b.build()?;

let exec = Executor::new(ExecutorConfig::default());
let out = exec.run_sequential(&dag, ["c".to_string()])?;
assert_eq!(*out["c"], 3);
# Ok::<(), dag_exec::BuildError<String>>(())

Examples

cargo run --example pipeline
cargo run --example merkle
cargo run --release --example rollup
DAG_EXEC_MAX_WORKERS=4 DAG_EXEC_HASH_ITERS=200000 cargo run --release --example rollup

Benchmarks

cargo bench

Notes:

chain_* has no inherent parallelism (measures overhead).
fanout_*_heavy demonstrates parallel speedups for CPU-heavy nodes.

Design notes

max_in_flight bounds queued + running work in the parallel scheduler.
Physical maximum is n_workers * (worker_queue_cap + 1); effective cap is the minimum of both.

ExecutorConfig knobs

max_workers: worker threads in the parallel executor (default: available CPU threads).
max_in_flight: global bound on queued + running tasks (backpressure).
worker_queue_cap: per-worker queue capacity (bounded sync_channel); physical max in-flight is max_workers * (worker_queue_cap + 1) and the effective cap is min(max_in_flight, physical_max).

Tip: for benchmarks/demos, set DAG_EXEC_MAX_WORKERS=4 to avoid oversubscribing small DAGs.

Notes on performance

For tiny per-node work, the parallel executor can be slower (thread + channel overhead dominates).
Use DAG_EXEC_HASH_ITERS in rollup/pipeline examples to simulate CPU-heavy nodes and see speedups.
The pruning win (partial evaluation) is deterministic: fewer requested outputs => fewer executed nodes.

Error hardening + invariant tightening (see issue #6)
More docs/diagrams and additional DAG examples (as needed)

dag_exec 0.1.1