aetherdsp-core 0.1.4

Hard real-time modular DSP engine — lock-free graph scheduler, generational arena, and buffer pool
aether-core

Hard real-time modular DSP engine for Rust.

64-sample buffer · 48 kHz · ≤1.33 ms deadline · Zero allocations · Lock-free
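
The 1.33 ms deadline is simply the duration of one block: 64 samples divided by 48,000 samples per second. A small sketch of that arithmetic (the function name is illustrative and not part of the crate):

```rust
/// Time budget for processing one audio block, in milliseconds.
/// `buffer_size` is the block length in samples.
fn block_deadline_ms(buffer_size: usize, sample_rate_hz: f64) -> f64 {
    buffer_size as f64 / sample_rate_hz * 1_000.0
}
```

Every node in the graph has to finish inside that budget on every tick, which is why the rules below rule out anything with unbounded latency.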

What's in this crate

| Module | Description |
|---|---|
| arena | Generational arena: O(1) insert/remove, no fragmentation |
| graph | DAG audio graph with topological sort (Kahn's algorithm) |
| scheduler | Lock-free RT scheduler: processes the sorted node array each tick |
| buffer_pool | Pre-allocated audio buffer pool, zero RT allocation |
| param | Smoothed parameters with per-sample interpolation |
| command | SPSC command ring: control thread → RT thread mutations |
| node | DspNode trait: implement this to create a processing node |
| state | Serializable node state for save/restore |
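
To illustrate what "generational arena" means for the arena module, here is a minimal sketch. It is not the crate's implementation; the names `Handle`, `Slot`, and `Arena` are assumptions made for this example. Each slot carries a generation counter, so a handle to a removed node stops resolving once the slot is reused, and both insert and remove touch only the head of a free list.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Handle { index: u32, generation: u32 }

enum Slot<T> {
    Occupied { generation: u32, value: T },
    Free { generation: u32, next_free: Option<u32> },
}

pub struct Arena<T> { slots: Vec<Slot<T>>, free_head: Option<u32> }

impl<T> Arena<T> {
    /// Allocates every slot up front and threads them onto a free list.
    pub fn with_capacity(capacity: u32) -> Self {
        let slots = (0..capacity)
            .map(|i| Slot::Free {
                generation: 0,
                next_free: if i + 1 < capacity { Some(i + 1) } else { None },
            })
            .collect();
        Self { slots, free_head: if capacity > 0 { Some(0) } else { None } }
    }

    /// O(1): pops the free-list head. Returns None when the arena is full.
    pub fn insert(&mut self, value: T) -> Option<Handle> {
        let index = self.free_head?;
        let slot = &mut self.slots[index as usize];
        let (generation, next_free) = match slot {
            Slot::Free { generation, next_free } => (*generation, *next_free),
            Slot::Occupied { .. } => unreachable!("free list points at an occupied slot"),
        };
        *slot = Slot::Occupied { generation, value };
        self.free_head = next_free;
        Some(Handle { index, generation })
    }

    /// O(1): frees the slot and bumps its generation so old handles go stale.
    pub fn remove(&mut self, handle: Handle) -> Option<T> {
        let slot = self.slots.get_mut(handle.index as usize)?;
        match slot {
            Slot::Occupied { generation, .. } if *generation == handle.generation => {}
            _ => return None,
        }
        let freed = Slot::Free {
            generation: handle.generation.wrapping_add(1),
            next_free: self.free_head,
        };
        let old = std::mem::replace(slot, freed);
        self.free_head = Some(handle.index);
        match old {
            Slot::Occupied { value, .. } => Some(value),
            Slot::Free { .. } => None,
        }
    }

    /// Resolves a handle only if its generation still matches the slot.
    pub fn get(&self, handle: Handle) -> Option<&T> {
        match self.slots.get(handle.index as usize)? {
            Slot::Occupied { generation, value } if *generation == handle.generation => Some(value),
            _ => None,
        }
    }
}
```

Because all slots are created in `with_capacity`, neither `insert` nor `remove` allocates afterwards in this sketch.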

Real-time guarantees

| Rule | Enforcement |
|---|---|
| No heap allocation in RT thread | Pre-allocated arena + buffer pool |
| No locks in RT thread | SPSC ring buffer (ringbuf) |
| No I/O in RT thread | All I/O on control/tokio threads |
| Bounded execution | Flat topo-sorted array, ≤32 commands/tick |
| No recursion | Iterative Kahn's sort, iterative execution |
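
The "≤32 commands/tick" bound can be pictured as a capped drain loop. The sketch below is an assumption about the general shape, not the scheduler's code: a `VecDeque` stands in for the SPSC ring, and the `Command` enum is hypothetical.

```rust
use std::collections::VecDeque;

const MAX_COMMANDS_PER_TICK: usize = 32;

// Hypothetical command set, for illustration only.
#[derive(Debug, PartialEq)]
enum Command {
    SetGain(f32),
    Mute,
}

/// Applies at most MAX_COMMANDS_PER_TICK queued commands and returns how many ran.
/// Anything left in the queue waits for the next tick.
fn drain_commands(queue: &mut VecDeque<Command>, gain: &mut f32) -> usize {
    let mut applied = 0;
    while applied < MAX_COMMANDS_PER_TICK {
        match queue.pop_front() {
            Some(Command::SetGain(g)) => *gain = g,
            Some(Command::Mute) => *gain = 0.0,
            None => break,
        }
        applied += 1;
    }
    applied
}
```

Capping the loop keeps the per-tick cost of control traffic constant in the worst case, even if the control thread floods the queue.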

Feature flags

This crate supports optional features to reduce compile times and binary size:

[dependencies]
aetherdsp-core = { version = "0.1", default-features = false, features = ["std"] }

| Feature | Default | Description |
|---|---|---|
| std | yes | Standard library support (required) |
| parallel | yes | Parallel node execution via Rayon |
| serde | yes | Serialization support for graph snapshots |

Examples:

# Minimal build (no parallel, no serde)
aetherdsp-core = { version = "0.1", default-features = false, features = ["std"] }

# Parallel execution only
aetherdsp-core = { version = "0.1", default-features = false, features = ["std", "parallel"] }

# All features (default)
aetherdsp-core = "0.1"

Performance impact:

  • Disabling parallel falls back to sequential node execution (slower for large graphs)
  • Disabling serde removes graph snapshot serialization (smaller binary)

Quick start

use aether_core::{
    scheduler::Scheduler,
    node::DspNode,
    param::ParamBlock,
    BUFFER_SIZE, MAX_INPUTS,
};

struct Gain { amount: f32 }

impl DspNode for Gain {
    fn process(
        &mut self,
        inputs: &[Option<&[f32; BUFFER_SIZE]>; MAX_INPUTS],
        output: &mut [f32; BUFFER_SIZE],
        _params: &mut ParamBlock,
        _sample_rate: f32,
    ) {
        let silence = [0.0f32; BUFFER_SIZE];
        let input = inputs[0].unwrap_or(&silence);
        for (o, i) in output.iter_mut().zip(input.iter()) {
            *o = i * self.amount;
        }
    }
    fn type_name(&self) -> &'static str { "Gain" }
}

// Build a graph and run it
let mut sched = Scheduler::new(48_000.0);
let id = sched.graph.add_node(Box::new(Gain { amount: 0.5 })).unwrap();
sched.graph.set_output_node(id);
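
The Gain node above ignores its parameters and applies a fixed factor. As a rough illustration of what "per-sample interpolation" in the param module could look like, here is a standalone linear-ramp sketch. `SmoothedParam` and `apply_gain` are hypothetical names, and the real `ParamBlock` API may differ.

```rust
const BUFFER_SIZE: usize = 64;

/// A value that moves toward its target in equal per-sample steps.
struct SmoothedParam {
    current: f32,
    target: f32,
    step: f32,
    remaining: u32,
}

impl SmoothedParam {
    fn new(value: f32) -> Self {
        Self { current: value, target: value, step: 0.0, remaining: 0 }
    }

    /// Starts a linear ramp to `target` over `ramp_samples` samples.
    fn set_target(&mut self, target: f32, ramp_samples: u32) {
        self.target = target;
        if ramp_samples == 0 {
            self.current = target;
            self.step = 0.0;
            self.remaining = 0;
        } else {
            self.step = (target - self.current) / ramp_samples as f32;
            self.remaining = ramp_samples;
        }
    }

    /// Advances one sample and returns the smoothed value.
    fn next_value(&mut self) -> f32 {
        if self.remaining > 0 {
            self.current += self.step;
            self.remaining -= 1;
            if self.remaining == 0 {
                // Snap to the target so rounding error does not accumulate.
                self.current = self.target;
            }
        }
        self.current
    }
}

/// Multiplies each input sample by the smoothed gain for that sample.
fn apply_gain(
    input: &[f32; BUFFER_SIZE],
    output: &mut [f32; BUFFER_SIZE],
    gain: &mut SmoothedParam,
) {
    for (o, i) in output.iter_mut().zip(input.iter()) {
        *o = *i * gain.next_value();
    }
}
```

Ramping the gain instead of jumping to the new value avoids the audible click a step change would produce.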

Performance characteristics

| Metric | Value | Notes |
|---|---|---|
| Latency | 1.33 ms | 64 samples @ 48 kHz |
| Throughput | 1000+ nodes | < 100 µs processing time |
| Memory | ~2.5 MB | Pre-allocated arena + buffer pool |
| CPU (idle) | < 1% | Single core, empty graph |
| CPU (100 nodes) | 5-10% | Single core, simple nodes |
| CPU (1000 nodes) | 15-25% | Multi-core, parallel execution |
| Allocation | 0 bytes | Zero allocation in RT thread |
| Lock contention | None | Lock-free SPSC ring |

Benchmark results:

| Benchmark | Result | Comparison |
|---|---|---|
| param_fill_buffer_64 | 51.7 ns | 4× faster than std |
| Arena insert/remove ×1000 | < 5 µs | O(1) operations |
| Scheduler (1000 noop nodes) | < 100 µs | > 10,000 nodes/ms |
| Parallel vs Sequential | 3-4× faster | 4+ cores |

Test environment: AMD Ryzen 9 5950X, 64GB RAM, Windows 11

Run benchmarks: cargo bench -p aetherdsp-core

Comparison with other engines

Feature AetherDSP dasp fundsp cpal
Lock-free
Parallel execution
Runtime graph edits N/A
Generational arena N/A
Zero allocation ⚠️ Partial ⚠️ Partial
Topological sort N/A
Parameter smoothing N/A
Tuning systems N/A
Graph type Runtime Compile-time Compile-time N/A
Learning curve Medium Low Medium Low
Use case DAW/synth DSP research Audio FX Audio I/O

When to use AetherDSP:

  • Building a DAW, plugin host, or modular synthesizer
  • Need runtime graph mutations (add/remove nodes while playing)
  • Large graphs (100+ nodes) that benefit from parallel execution
  • Hard real-time requirements (no allocation, no locks)

When to use alternatives:

  • dasp: Simple DSP research, prototyping, learning
  • fundsp: Audio effects, compile-time graph optimization
  • cpal: Just need audio I/O, no graph processing

Common pitfalls

❌ DON'T: Allocate in process()

impl DspNode for BadNode {
    fn process(&mut self, ...) {
        let buffer = vec![0.0; 1024]; // ❌ HEAP ALLOCATION!
        // This will cause audio glitches
    }
}

✅ DO: Pre-allocate in new()

struct GoodNode {
    buffer: Vec<f32>,
}

impl GoodNode {
    fn new() -> Self {
        Self {
            buffer: vec![0.0; 1024] // ✅ Allocated once
        }
    }
}

impl DspNode for GoodNode {
    fn process(&mut self, ...) {
        self.buffer.fill(0.0); // ✅ Reuse existing buffer
    }
}

❌ DON'T: Use Mutex in RT thread

use std::sync::{Arc, Mutex};

struct BadNode {
    shared: Arc<Mutex<Vec<f32>>>, // ❌ CAN BLOCK!
}

impl DspNode for BadNode {
    fn process(&mut self, ...) {
        let data = self.shared.lock().unwrap(); // ❌ MAY BLOCK THE AUDIO THREAD (priority inversion)
    }
}

✅ DO: Use lock-free structures

use std::sync::Arc;
use arc_swap::ArcSwap;

struct GoodNode {
    shared: Arc<ArcSwap<Vec<f32>>>, // ✅ Lock-free
}

impl DspNode for GoodNode {
    fn process(&mut self, ...) {
        let data = self.shared.load(); // ✅ No blocking
    }
}
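
For a single scalar such as a gain or cutoff value, a plain atomic is a std-only alternative to ArcSwap. This is a different technique from the one shown above, sketched here with a hypothetical `SharedParam` type: the f32 is stored as its bit pattern in an `AtomicU32`, so neither side ever takes a lock.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;

/// An f32 shared between the control thread and the RT thread.
#[derive(Clone)]
struct SharedParam(Arc<AtomicU32>);

impl SharedParam {
    fn new(value: f32) -> Self {
        Self(Arc::new(AtomicU32::new(value.to_bits())))
    }

    /// Called from the control thread.
    fn set(&self, value: f32) {
        self.0.store(value.to_bits(), Ordering::Relaxed);
    }

    /// Called from process(): a single atomic load, no blocking.
    fn get(&self) -> f32 {
        f32::from_bits(self.0.load(Ordering::Relaxed))
    }
}
```

Relaxed ordering is enough here because the value stands on its own; if the RT thread had to observe other memory written before the update, Release/Acquire would be needed instead.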

❌ DON'T: Do I/O in process()

impl DspNode for BadNode {
    fn process(&mut self, ...) {
        std::fs::write("output.wav", data); // ❌ BLOCKS!
        println!("Processing..."); // ❌ BLOCKS!
    }
}

✅ DO: Send data to another thread

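use ringbuf::{traits::Producer, HeapProd};

struct GoodNode {
    // Producer half of a ringbuf::HeapRb (type name as of ringbuf 0.4);
    // the Producer trait provides try_push.
    sender: HeapProd<f32>,
}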

impl DspNode for GoodNode {
    fn process(&mut self, ...) {
        // ✅ Non-blocking send to I/O thread
        for &sample in output.iter() {
            let _ = self.sender.try_push(sample);
        }
    }
}
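
To show why a single-producer/single-consumer ring needs no lock, here is a minimal std-only sketch. It is not how ringbuf is implemented; `sample_ring`, `SampleProducer`, and `SampleConsumer` are names invented for this example. Samples are stored as f32 bit patterns in atomics, which keeps the whole thing in safe Rust.

```rust
use std::sync::atomic::{AtomicU32, AtomicUsize, Ordering};
use std::sync::Arc;

struct Ring {
    slots: Box<[AtomicU32]>, // f32 bit patterns
    head: AtomicUsize,       // total samples popped (written by the consumer only)
    tail: AtomicUsize,       // total samples pushed (written by the producer only)
}

pub struct SampleProducer(Arc<Ring>);
pub struct SampleConsumer(Arc<Ring>);

/// Allocates the ring once, outside the RT thread.
/// Assumption: the counters never wrap (true in practice on 64-bit targets).
pub fn sample_ring(capacity: usize) -> (SampleProducer, SampleConsumer) {
    assert!(capacity > 0, "capacity must be non-zero");
    let ring = Arc::new(Ring {
        slots: (0..capacity).map(|_| AtomicU32::new(0)).collect(),
        head: AtomicUsize::new(0),
        tail: AtomicUsize::new(0),
    });
    (SampleProducer(ring.clone()), SampleConsumer(ring))
}

impl SampleProducer {
    /// Never blocks. Returns false, dropping the sample, when the ring is full.
    pub fn try_push(&mut self, sample: f32) -> bool {
        let ring = &*self.0;
        let tail = ring.tail.load(Ordering::Relaxed);
        let head = ring.head.load(Ordering::Acquire);
        if tail.wrapping_sub(head) == ring.slots.len() {
            return false;
        }
        ring.slots[tail % ring.slots.len()].store(sample.to_bits(), Ordering::Relaxed);
        // Release publishes the slot write to the consumer.
        ring.tail.store(tail.wrapping_add(1), Ordering::Release);
        true
    }
}

impl SampleConsumer {
    /// Never blocks. Returns None when the ring is empty.
    pub fn try_pop(&mut self) -> Option<f32> {
        let ring = &*self.0;
        let head = ring.head.load(Ordering::Relaxed);
        let tail = ring.tail.load(Ordering::Acquire);
        if head == tail {
            return None;
        }
        let bits = ring.slots[head % ring.slots.len()].load(Ordering::Relaxed);
        // Release tells the producer this slot may be overwritten.
        ring.head.store(head.wrapping_add(1), Ordering::Release);
        Some(f32::from_bits(bits))
    }
}
```

When the ring is full, the RT side drops the sample instead of waiting, which matches the `let _ = self.sender.try_push(sample)` pattern above.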

❌ DON'T: Use unbounded loops

impl DspNode for BadNode {
    fn process(&mut self, ...) {
        while self.condition { // ❌ UNBOUNDED!
            // Could run forever
        }
    }
}

✅ DO: Use bounded iterations

impl DspNode for GoodNode {
    fn process(&mut self, ...) {
        for i in 0..BUFFER_SIZE { // ✅ Bounded
            output[i] = self.compute(i);
        }
    }
}

FAQ

General

Q: What is AetherDSP?
A: A hard real-time modular DSP engine for building DAWs, plugin hosts, and synthesizers in Rust.

Q: Is it production-ready?
A: Yes. It's used in production for audio applications. All RT safety guarantees are enforced.

Q: What's the minimum Rust version?
A: Rust 1.70+ (2021 edition)

Real-Time Safety

Q: Can I use std::sync::Mutex in a node?
A: No. Mutexes can block, violating RT safety. Use arc_swap::ArcSwap or other lock-free structures.

Q: Can I allocate memory in process()?
A: No. All allocations must happen in new() or on the control thread. Reuse buffers in process().

Q: Can I do file I/O in process()?
A: No. Use a lock-free ring buffer to send data to an I/O thread.

Q: Can I use println!() for debugging?
A: No. println!() can block. Use a lock-free logging library or send debug data to another thread.

Performance

Q: How many nodes can I run?
A: 1000+ nodes at < 100 µs processing time. Actual limit depends on node complexity and CPU.

Q: Does parallel execution help?
A: Yes, for large graphs (100+ nodes). Nodes at the same BFS level run in parallel via Rayon.

Q: What's the latency?
A: 1.33 ms @ 48 kHz (64 samples). Configurable via the BUFFER_SIZE constant.

Q: How much memory does it use?
A: ~2.5 MB for arena + buffer pool. Scales with MAX_NODES and MAX_BUFFERS.

Graph Mutations

Q: Can I add/remove nodes while audio is playing?
A: Yes. Send commands via the SPSC ring buffer. Changes apply at the next audio block.

Q: How many commands can I send per block?
A: Up to MAX_COMMANDS_PER_TICK (32 by default). Excess commands are processed next block.

Q: What happens if I remove a node that's connected?
A: All connections to/from that node are automatically disconnected.

Q: Can I create cycles in the graph?
A: No. The graph is a DAG (directed acyclic graph). Cycles are rejected by topological sort.
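
As an illustration of how Kahn's algorithm detects a cycle, here is a generic sketch over plain node indices (the function name and edge representation are assumptions, not the graph module's API). If a cycle exists, the nodes on it never reach in-degree zero, so the resulting order comes out short.

```rust
use std::collections::VecDeque;

/// Returns a processing order, or None if the edges contain a cycle.
/// Edges are (from, to) pairs of node indices below `node_count`.
fn topo_sort(node_count: usize, edges: &[(usize, usize)]) -> Option<Vec<usize>> {
    let mut indegree = vec![0usize; node_count];
    let mut successors = vec![Vec::new(); node_count];
    for &(from, to) in edges {
        successors[from].push(to);
        indegree[to] += 1;
    }

    // Start with every node that has no inputs.
    let mut ready: VecDeque<usize> =
        (0..node_count).filter(|&n| indegree[n] == 0).collect();
    let mut order = Vec::with_capacity(node_count);

    // Iterative: no recursion, each edge is visited exactly once.
    while let Some(node) = ready.pop_front() {
        order.push(node);
        for &next in &successors[node] {
            indegree[next] -= 1;
            if indegree[next] == 0 {
                ready.push_back(next);
            }
        }
    }

    if order.len() == node_count { Some(order) } else { None }
}
```

This sketch allocates, so it would belong on the control thread, with only the resulting flat order handed to the RT side.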

Debugging

Q: How do I debug audio glitches?
A: 1) Check for allocations (use a memory profiler), 2) Check for locks (use a thread profiler), 3) Check execution time (use benchmarks), 4) Verify topological order.
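
One way to check for allocations without an external profiler is a counting global allocator in a test build. The sketch below uses only the standard library; `CountingAlloc` and `allocations_during` are names made up for this example, and it is meant for tests rather than shipping builds.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::cell::Cell;

thread_local! {
    // Per-thread counter; const-initialized so touching it never allocates.
    static ALLOCS: Cell<usize> = const { Cell::new(0) };
}

/// Forwards to the system allocator and counts allocations on the calling thread.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // try_with: ignore the counter if the thread is shutting down.
        let _ = ALLOCS.try_with(|c| c.set(c.get() + 1));
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns how many heap allocations it made on this thread.
fn allocations_during<F: FnOnce()>(f: F) -> usize {
    let before = ALLOCS.with(|c| c.get());
    f();
    ALLOCS.with(|c| c.get()) - before
}
```

Wrapping a node's process() call in `allocations_during` and asserting the result is zero turns the "no allocation" rule into a regression test.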

Q: How do I visualize the graph?
A: Use GraphSnapshot (requires serde feature) to serialize the graph, then visualize with a tool like Graphviz.
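
The Graphviz side of that can be as small as emitting DOT text. The sketch below takes node names and (from, to) index pairs as input; how those are obtained from a GraphSnapshot is not shown here, and `to_dot` is a hypothetical helper, not part of the crate.

```rust
use std::fmt::Write;

/// Renders nodes and edges as Graphviz DOT text.
/// Labels are written verbatim, so names containing quotes would need escaping.
fn to_dot(node_names: &[&str], edges: &[(usize, usize)]) -> String {
    let mut dot = String::from("digraph aether {\n");
    for (id, name) in node_names.iter().enumerate() {
        writeln!(dot, "    n{id} [label=\"{name}\"];").unwrap();
    }
    for (from, to) in edges {
        writeln!(dot, "    n{from} -> n{to};").unwrap();
    }
    dot.push_str("}\n");
    dot
}
```

Writing the returned string to a file and running `dot -Tsvg` on it produces a picture of the signal flow. This builds a String, so it should run on the control thread only.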

Q: How do I profile performance?
A: Use cargo bench for microbenchmarks, or perf/VTune for system-level profiling.

Q: Why is my graph not producing sound?
A: 1) Check that output_node is set, 2) Verify connections are correct, 3) Check that nodes are processing (add debug output), 4) Verify sample rate matches your audio device.

Advanced

Q: Can I use this in a plugin (VST/AU/CLAP)?
A: Yes. See the aether-plugin crate for plugin wrappers.

Q: Can I use this with CPAL?
A: Yes. See examples/cpal_integration.rs for a complete example.

Q: Can I use custom tuning systems?
A: Yes. See the aether-tuning crate for microtonal support (Ethiopian, Arabic, Indian scales).

Q: Can I save/restore graph state?
A: Yes. Use GraphSnapshot (requires serde feature) to serialize/deserialize.

Q: Can I use this in embedded systems?
A: Partially. Disable the parallel and serde features for a smaller binary. The std feature is still required; full no_std support is planned.

Q: How do I contribute?
A: See CONTRIBUTING.md for guidelines. PRs welcome!

License

MIT — see LICENSE