aether-core

Hard real-time modular DSP engine for Rust.

64-sample buffer · 48 kHz · ≤1.33 ms deadline · Zero allocations · Lock-free
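
The deadline figure above follows directly from the buffer length and the sample rate. As a minimal sketch (the helper name `buffer_deadline_ms` is illustrative and not part of the crate's API):

```rust
/// Time budget for one audio block, in milliseconds.
/// Hypothetical helper: block length divided by sample rate, scaled to ms.
pub fn buffer_deadline_ms(buffer_size: usize, sample_rate_hz: f64) -> f64 {
    buffer_size as f64 / sample_rate_hz * 1_000.0
}
```

With 64 samples at 48 kHz this gives roughly 1.33 ms, which is the time every node in the graph has to share per block.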

What's in this crate

Module	Description
`arena`	Generational arena — O(1) insert/remove, no fragmentation
`graph`	DAG audio graph with topological sort (Kahn's algorithm)
`scheduler`	Lock-free RT scheduler — processes sorted node array each tick
`buffer_pool`	Pre-allocated audio buffer pool, zero RT allocation
`param`	Smoothed parameters with per-sample interpolation
`command`	SPSC command ring — control thread → RT thread mutations
`node`	`DspNode` trait — implement this to create a processing node
`state`	Serializable node state for save/restore
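
To illustrate what "smoothed parameters with per-sample interpolation" can look like, here is a small std-only sketch of a linear ramp. The type `SmoothedParam` and its methods are assumptions made for this example; the crate's `param` module may use a different smoothing curve and API.

```rust
/// Illustrative linear-ramp smoother (not the crate's own type).
pub struct SmoothedParam {
    current: f32,
    target: f32,
    step: f32,
    remaining: u32,
}

impl SmoothedParam {
    pub fn new(value: f32) -> Self {
        Self { current: value, target: value, step: 0.0, remaining: 0 }
    }

    /// Start a ramp that reaches `target` after `ramp_samples` samples.
    pub fn set_target(&mut self, target: f32, ramp_samples: u32) {
        self.target = target;
        if ramp_samples == 0 {
            // No ramp requested: jump immediately.
            self.current = target;
            self.step = 0.0;
            self.remaining = 0;
        } else {
            self.step = (target - self.current) / ramp_samples as f32;
            self.remaining = ramp_samples;
        }
    }

    /// Advance by one sample and return the interpolated value.
    pub fn next_sample(&mut self) -> f32 {
        if self.remaining > 0 {
            self.current += self.step;
            self.remaining -= 1;
            if self.remaining == 0 {
                // Snap to the target so rounding error does not accumulate.
                self.current = self.target;
            }
        }
        self.current
    }

    /// Fill a whole block with per-sample values, without allocating.
    pub fn fill(&mut self, out: &mut [f32]) {
        for o in out.iter_mut() {
            *o = self.next_sample();
        }
    }
}
```

Because the value changes a little on every sample rather than once per block, an abrupt parameter change from the control thread does not produce an audible step.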

Real-time guarantees

Rule	Enforcement
No heap allocation in RT thread	Pre-allocated arena + buffer pool
No locks in RT thread	SPSC ring buffer (`ringbuf`)
No I/O in RT thread	All I/O on control/tokio threads
Bounded execution	Flat topo-sorted array, ≤32 commands/tick
No recursion	Iterative Kahn's sort, iterative execution
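
The "pre-allocated arena" rule can be pictured with a small generational arena. This is a std-only sketch under the assumption of a fixed capacity chosen at construction; `Arena`, `Handle`, and their methods are illustrative names, not the crate's `arena` API.

```rust
/// Handle into the arena: a slot index plus the generation it was issued for.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Handle {
    index: u32,
    generation: u32,
}

struct Slot<T> {
    generation: u32,
    value: Option<T>,
}

/// Fixed-capacity generational arena (illustrative sketch).
pub struct Arena<T> {
    slots: Vec<Slot<T>>,
    free: Vec<u32>, // stack of vacant slot indices
}

impl<T> Arena<T> {
    /// All memory is reserved here, so insert/remove do not allocate later.
    pub fn with_capacity(capacity: u32) -> Self {
        let slots = (0..capacity)
            .map(|_| Slot { generation: 0, value: None })
            .collect();
        let free = (0..capacity).rev().collect();
        Self { slots, free }
    }

    /// O(1): pop a free slot. Returns `None` when the arena is full.
    pub fn insert(&mut self, value: T) -> Option<Handle> {
        let index = self.free.pop()?;
        let slot = &mut self.slots[index as usize];
        slot.value = Some(value);
        Some(Handle { index, generation: slot.generation })
    }

    /// O(1): vacate the slot and bump its generation so old handles go stale.
    pub fn remove(&mut self, handle: Handle) -> Option<T> {
        let slot = self.slots.get_mut(handle.index as usize)?;
        if slot.generation != handle.generation {
            return None;
        }
        let value = slot.value.take()?;
        slot.generation = slot.generation.wrapping_add(1);
        self.free.push(handle.index); // never exceeds the reserved capacity
        Some(value)
    }

    /// Look up a value; a stale handle yields `None` instead of the wrong node.
    pub fn get(&self, handle: Handle) -> Option<&T> {
        let slot = self.slots.get(handle.index as usize)?;
        if slot.generation == handle.generation {
            slot.value.as_ref()
        } else {
            None
        }
    }
}
```

The generation counter is what makes slot reuse safe: a handle kept after its node was removed no longer matches the slot and is rejected.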

Feature flags

This crate supports optional features to reduce compile times and binary size:

[dependencies]
aetherdsp-core = { version = "0.1", default-features = false, features = ["std"] }

Feature	Default	Description
`std`	✅	Standard library support (required)
`parallel`	✅	Parallel node execution via Rayon
`serde`	✅	Serialization support for graph snapshots

Examples:

# Minimal build (no parallel, no serde)
aetherdsp-core = { version = "0.1", default-features = false, features = ["std"] }

# Parallel execution only
aetherdsp-core = { version = "0.1", default-features = false, features = ["std", "parallel"] }

# All features (default)
aetherdsp-core = "0.1"

Performance impact:

Disabling parallel falls back to sequential node execution (slower for large graphs)
Disabling serde removes graph snapshot serialization (smaller binary)

Quick start

use aether_core::{
    scheduler::Scheduler,
    node::DspNode,
    param::ParamBlock,
    BUFFER_SIZE, MAX_INPUTS,
};

struct Gain { amount: f32 }

impl DspNode for Gain {
    fn process(
        &mut self,
        inputs: &[Option<&[f32; BUFFER_SIZE]>; MAX_INPUTS],
        output: &mut [f32; BUFFER_SIZE],
        _params: &mut ParamBlock,
        _sample_rate: f32,
    ) {
        let silence = [0.0f32; BUFFER_SIZE];
        let input = inputs[0].unwrap_or(&silence);
        for (o, i) in output.iter_mut().zip(input.iter()) {
            *o = i * self.amount;
        }
    }
    fn type_name(&self) -> &'static str { "Gain" }
}

// Build a graph and run it
let mut sched = Scheduler::new(48_000.0);
let id = sched.graph.add_node(Box::new(Gain { amount: 0.5 })).unwrap();
sched.graph.set_output_node(id);

Performance characteristics

Metric	Value	Notes
Latency	1.33 ms	64 samples @ 48 kHz
Throughput	1000+ nodes	< 100 µs processing time
Memory	~2.5 MB	Pre-allocated arena + buffer pool
CPU (idle)	< 1%	Single core, empty graph
CPU (100 nodes)	5-10%	Single core, simple nodes
CPU (1000 nodes)	15-25%	Multi-core, parallel execution
Allocation	0 bytes	Zero allocation in RT thread
Lock contention	None	Lock-free SPSC ring

Benchmark results:

Benchmark	Result	Comparison
`param_fill_buffer_64`	51.7 ns	4× faster than std
Arena insert/remove ×1000	< 5 µs	O(1) operations
Scheduler (1000 noop nodes)	< 100 µs	> 10 nodes/µs (over 10 million nodes/sec)
Parallel vs Sequential	3-4× faster	4+ cores

Test environment: AMD Ryzen 9 5950X, 64GB RAM, Windows 11

Run benchmarks: cargo bench -p aetherdsp-core

Comparison with other engines

Feature	AetherDSP	dasp	fundsp	cpal
Lock-free	✅	❌	❌	❌
Parallel execution	✅	❌	❌	❌
Runtime graph edits	✅	❌	❌	N/A
Generational arena	✅	❌	❌	N/A
Zero allocation	✅	⚠️ Partial	⚠️ Partial	✅
Topological sort	✅	❌	✅	N/A
Parameter smoothing	✅	❌	✅	N/A
Tuning systems	✅	❌	❌	N/A
Graph type	Runtime	Compile-time	Compile-time	N/A
Learning curve	Medium	Low	Medium	Low
Use case	DAW/synth	DSP research	Audio FX	Audio I/O

When to use AetherDSP:

Building a DAW, plugin host, or modular synthesizer
Need runtime graph mutations (add/remove nodes while playing)
Large graphs (100+ nodes) that benefit from parallel execution
Hard real-time requirements (no allocation, no locks)

When to use alternatives:

dasp: Simple DSP research, prototyping, learning
fundsp: Audio effects, compile-time graph optimization
cpal: Just need audio I/O, no graph processing

Common pitfalls

❌ DON'T: Allocate in process()

impl DspNode for BadNode {
    fn process(&mut self, ...) {
        let buffer = vec![0.0; 1024]; // ❌ HEAP ALLOCATION!
        // This will cause audio glitches
    }
}

✅ DO: Pre-allocate in new()

struct GoodNode {
    buffer: Vec<f32>,
}

impl GoodNode {
    fn new() -> Self {
        Self {
            buffer: vec![0.0; 1024] // ✅ Allocated once
        }
    }
}

impl DspNode for GoodNode {
    fn process(&mut self, ...) {
        self.buffer.fill(0.0); // ✅ Reuse existing buffer
    }
}

❌ DON'T: Use Mutex in RT thread

use std::sync::{Arc, Mutex};

struct BadNode {
    shared: Arc<Mutex<Vec<f32>>>, // ❌ CAN BLOCK!
}

impl DspNode for BadNode {
    fn process(&mut self, ...) {
        let data = self.shared.lock().unwrap(); // ❌ BLOCKS if another thread holds the lock!
    }
}

✅ DO: Use lock-free structures

use std::sync::Arc;
use arc_swap::ArcSwap;

struct GoodNode {
    shared: Arc<ArcSwap<Vec<f32>>>, // ✅ Lock-free
}

impl DspNode for GoodNode {
    fn process(&mut self, ...) {
        let data = self.shared.load(); // ✅ No blocking
    }
}

❌ DON'T: Do I/O in process()

impl DspNode for BadNode {
    fn process(&mut self, ...) {
        std::fs::write("output.wav", data); // ❌ BLOCKS!
        println!("Processing..."); // ❌ BLOCKS!
    }
}

✅ DO: Send data to another thread

use ringbuf::{traits::Producer, HeapProd};

struct GoodNode {
    sender: HeapProd<f32>, // producer half of a ringbuf::HeapRb<f32> (ringbuf 0.4)
}

impl DspNode for GoodNode {
    fn process(&mut self, ...) {
        // ✅ Non-blocking send to I/O thread
        for &sample in output.iter() {
            let _ = self.sender.try_push(sample);
        }
    }
}
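
To make "non-blocking send" concrete without pulling in a dependency, here is a std-only sketch of a single-producer/single-consumer ring for `f32` samples. `SampleRing` is a hypothetical type written for this illustration; in real code the `ringbuf` crate used above is the better choice.

```rust
use std::sync::atomic::{AtomicU32, AtomicUsize, Ordering};

/// Minimal SPSC ring for `f32` samples (illustrative sketch).
/// Correct only with exactly one pushing thread and one popping thread.
/// `N` should be a power of two so indices stay consistent if counters wrap.
pub struct SampleRing<const N: usize> {
    slots: [AtomicU32; N], // samples stored as f32 bit patterns
    head: AtomicUsize,     // next position to read (written by the consumer)
    tail: AtomicUsize,     // next position to write (written by the producer)
}

impl<const N: usize> SampleRing<N> {
    pub fn new() -> Self {
        Self {
            slots: std::array::from_fn(|_| AtomicU32::new(0)),
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
        }
    }

    /// Producer side: never blocks. Returns `Err(sample)` when the ring is full.
    pub fn try_push(&self, sample: f32) -> Result<(), f32> {
        let tail = self.tail.load(Ordering::Relaxed);
        let head = self.head.load(Ordering::Acquire);
        if tail.wrapping_sub(head) == N {
            return Err(sample); // full: drop or count the overrun, but do not wait
        }
        self.slots[tail % N].store(sample.to_bits(), Ordering::Relaxed);
        // Release publishes the slot write to the consumer.
        self.tail.store(tail.wrapping_add(1), Ordering::Release);
        Ok(())
    }

    /// Consumer side: never blocks. Returns `None` when the ring is empty.
    pub fn try_pop(&self) -> Option<f32> {
        let head = self.head.load(Ordering::Relaxed);
        let tail = self.tail.load(Ordering::Acquire);
        if head == tail {
            return None;
        }
        let bits = self.slots[head % N].load(Ordering::Relaxed);
        // Release hands the slot back to the producer.
        self.head.store(head.wrapping_add(1), Ordering::Release);
        Some(f32::from_bits(bits))
    }
}
```

The key property for the RT thread is that `try_push` returns immediately in both the success and the full case, so a slow I/O thread can cost dropped samples but never a missed audio deadline.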

❌ DON'T: Use unbounded loops

impl DspNode for BadNode {
    fn process(&mut self, ...) {
        while self.condition { // ❌ UNBOUNDED!
            // Could run forever
        }
    }
}

✅ DO: Use bounded iterations

impl DspNode for GoodNode {
    fn process(&mut self, ...) {
        for i in 0..BUFFER_SIZE { // ✅ Bounded
            output[i] = self.compute(i);
        }
    }
}

FAQ

General

Q: What is AetherDSP?
A: A hard real-time modular DSP engine for building DAWs, plugin hosts, and synthesizers in Rust.

Q: Is it production-ready?
A: Yes. It's used in production for audio applications. All RT safety guarantees are enforced.

Q: What's the minimum Rust version?
A: Rust 1.70+ (2021 edition)

Real-Time Safety

Q: Can I use std::sync::Mutex in a node?
A: No. Mutexes can block, violating RT safety. Use arc_swap::ArcSwap or other lock-free structures.

Q: Can I allocate memory in process()?
A: No. All allocations must happen in new() or on the control thread. Reuse buffers in process().

Q: Can I do file I/O in process()?
A: No. Use a lock-free ring buffer to send data to an I/O thread.

Q: Can I use println!() for debugging?
A: No. println!() can block. Use a lock-free logging library or send debug data to another thread.

Performance

Q: How many nodes can I run?
A: 1000+ nodes at < 100 µs processing time. Actual limit depends on node complexity and CPU.

Q: Does parallel execution help?
A: Yes, for large graphs (100+ nodes). Nodes at the same BFS level run in parallel via Rayon.
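
One plausible reading of "level" is the length of the longest path from any source node. The sketch below groups an already topologically sorted node list into such levels; `group_by_level` and the `(from, to)` edge representation are assumptions for illustration, and the crate's scheduler may compute its levels differently.

```rust
/// Group nodes into levels, given a valid topological `order` of nodes
/// `0..order.len()` and the graph's `(from, to)` edges.
/// Nodes in the same level have no path between them, so they could be
/// processed concurrently. Intended for the control thread: it allocates.
pub fn group_by_level(order: &[usize], edges: &[(usize, usize)]) -> Vec<Vec<usize>> {
    let n = order.len();
    let mut successors: Vec<Vec<usize>> = vec![Vec::new(); n];
    for &(from, to) in edges {
        successors[from].push(to);
    }

    // Visiting nodes in topological order guarantees that a node's level is
    // final before it is propagated to its successors.
    let mut level = vec![0usize; n];
    for &node in order {
        for &succ in &successors[node] {
            level[succ] = level[succ].max(level[node] + 1);
        }
    }

    let depth = level.iter().copied().max().map_or(0, |m| m + 1);
    let mut groups: Vec<Vec<usize>> = vec![Vec::new(); depth];
    for &node in order {
        groups[level[node]].push(node);
    }
    groups
}
```

For a diamond-shaped graph (one source feeding two nodes that both feed a sink) this yields three levels, and only the middle level offers any parallelism, which is why small or mostly serial graphs gain little from the `parallel` feature.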

Q: What's the latency?
A: 1.33 ms @ 48 kHz (64 samples). Configurable via BUFFER_SIZE constant.

Q: How much memory does it use?
A: ~2.5 MB for arena + buffer pool. Scales with MAX_NODES and MAX_BUFFERS.

Graph Mutations

Q: Can I add/remove nodes while audio is playing?
A: Yes. Send commands via the SPSC ring buffer. Changes apply at the next audio block.

Q: How many commands can I send per block?
A: Up to MAX_COMMANDS_PER_TICK (32 by default). Excess commands are processed next block.
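
The per-block cap can be sketched as a drain loop that stops after a fixed number of commands. In this illustration a `VecDeque` stands in for the SPSC command ring, and the `Command` enum and `drain_commands` function are hypothetical; only the limit of 32 comes from the text above.

```rust
use std::collections::VecDeque;

pub const MAX_COMMANDS_PER_TICK: usize = 32;

/// Hypothetical command set, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Command {
    SetGain(f32),
    Mute,
}

/// Apply at most `MAX_COMMANDS_PER_TICK` queued commands and return how many
/// were applied. Anything left in the queue waits for the next block.
pub fn drain_commands(queue: &mut VecDeque<Command>, gain: &mut f32) -> usize {
    let mut applied = 0;
    while applied < MAX_COMMANDS_PER_TICK {
        match queue.pop_front() {
            Some(Command::SetGain(g)) => *gain = g,
            Some(Command::Mute) => *gain = 0.0,
            None => break, // queue is empty
        }
        applied += 1;
    }
    applied
}
```

Capping the loop keeps the worst-case time spent on commands constant per block, no matter how quickly the control thread enqueues them.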

Q: What happens if I remove a node that's connected?
A: All connections to/from that node are automatically disconnected.
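
Conceptually, automatic disconnection amounts to dropping every edge that touches the removed node. A minimal sketch, assuming edges are stored as `(from, to)` index pairs (the crate's internal representation may differ, and `disconnect_node` is an illustrative name):

```rust
/// Remove every edge going into or out of `node`; returns how many were dropped.
/// `Vec::retain` filters in place and does not allocate.
pub fn disconnect_node(edges: &mut Vec<(usize, usize)>, node: usize) -> usize {
    let before = edges.len();
    edges.retain(|&(from, to)| from != node && to != node);
    before - edges.len()
}
```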

Q: Can I create cycles in the graph?
A: No. The graph is a DAG (directed acyclic graph). Cycles are rejected by topological sort.
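
Cycle rejection falls out of Kahn's algorithm naturally: nodes on a cycle never reach an in-degree of zero, so the sorted output comes up short. The following std-only sketch shows the idea; `topo_sort`, `CycleError`, and the index-pair edge format are assumptions for this example rather than the crate's API.

```rust
use std::collections::VecDeque;

#[derive(Debug, PartialEq)]
pub struct CycleError;

/// Iterative Kahn's algorithm over nodes `0..node_count`.
/// Returns a processing order, or `CycleError` if the edges contain a cycle.
pub fn topo_sort(
    node_count: usize,
    edges: &[(usize, usize)],
) -> Result<Vec<usize>, CycleError> {
    let mut indegree = vec![0usize; node_count];
    let mut successors: Vec<Vec<usize>> = vec![Vec::new(); node_count];
    for &(from, to) in edges {
        successors[from].push(to);
        indegree[to] += 1;
    }

    // Start with every node that has no inputs.
    let mut ready: VecDeque<usize> =
        (0..node_count).filter(|&n| indegree[n] == 0).collect();
    let mut order = Vec::with_capacity(node_count);

    while let Some(node) = ready.pop_front() {
        order.push(node);
        for &succ in &successors[node] {
            indegree[succ] -= 1;
            if indegree[succ] == 0 {
                ready.push_back(succ);
            }
        }
    }

    // Any node left unsorted is part of, or downstream of, a cycle.
    if order.len() == node_count {
        Ok(order)
    } else {
        Err(CycleError)
    }
}
```

Since the sort allocates, a design like this would run it on the control thread when the graph changes and hand only the resulting flat order to the RT thread.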

Debugging

Q: How do I debug audio glitches?
A: 1) Check for allocations (use a memory profiler), 2) Check for locks (use a thread profiler), 3) Check execution time (use benchmarks), 4) Verify topological order.

Q: How do I visualize the graph?
A: Use GraphSnapshot (requires serde feature) to serialize the graph, then visualize with a tool like Graphviz.
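
As a rough sketch of the Graphviz step, the function below turns node names and edges into DOT text. The inputs are plain slices chosen for illustration; how those are extracted from a `GraphSnapshot` is not shown here, and `to_dot` is a hypothetical helper.

```rust
use std::fmt::Write;

/// Render nodes and `(from, to)` edges as a Graphviz DOT digraph.
/// Node names are used as labels verbatim, so they should not contain quotes.
pub fn to_dot(names: &[&str], edges: &[(usize, usize)]) -> String {
    let mut dot = String::from("digraph aether {\n");
    for (i, name) in names.iter().enumerate() {
        // Writing to a String cannot fail, so the result is ignored.
        let _ = writeln!(dot, "    n{i} [label=\"{name}\"];");
    }
    for &(from, to) in edges {
        let _ = writeln!(dot, "    n{from} -> n{to};");
    }
    dot.push_str("}\n");
    dot
}
```

Saving the returned string to a file such as graph.dot and running `dot -Tsvg graph.dot -o graph.svg` produces a picture of the signal flow.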

Q: How do I profile performance?
A: Use cargo bench for microbenchmarks, or perf/VTune for system-level profiling.

Q: Why is my graph not producing sound?
A: 1) Check that output_node is set, 2) Verify connections are correct, 3) Check that nodes are processing (add debug output), 4) Verify sample rate matches your audio device.

Advanced

Q: Can I use this in a plugin (VST/AU/CLAP)?
A: Yes. See the aether-plugin crate for plugin wrappers.

Q: Can I use this with CPAL?
A: Yes. See examples/cpal_integration.rs for a complete example.

Q: Can I use custom tuning systems?
A: Yes. See the aether-tuning crate for microtonal support (Ethiopian, Arabic, Indian scales).

Q: Can I save/restore graph state?
A: Yes. Use GraphSnapshot (requires serde feature) to serialize/deserialize.

Q: Can I use this in embedded systems?
A: Partially. Disable the parallel and serde features for a smaller binary. The std feature is still required; full no_std support is planned.

Q: How do I contribute?
A: See CONTRIBUTING.md for guidelines. PRs welcome!

Resources

Documentation: https://docs.rs/aetherdsp-core
Examples: examples/
Migration Guide: MIGRATION.md
Benchmarks: cargo bench -p aetherdsp-core
Issues: https://github.com/1yos/aether-dsp/issues
Discussions: https://github.com/1yos/aether-dsp/discussions

License

MIT — see LICENSE

aetherdsp-core 0.1.4