# Architecture


*Read this in other languages: [日本語](ja/architecture.md).*

This document describes the framework's implemented architecture: the
module layout, the four-layer model, and the core abstractions users
implement against. The "In scope" feature set from
[`overview.md`](overview.md) is complete; see
[`decisions/0001-ci-verification-strategy.md`](decisions/0001-ci-verification-strategy.md)
and [`testing.md`](testing.md) for the verification strategy.

## Module layout


The framework crate is an `rlib` only. The four `Dll*` COM entry
points are emitted into the *consumer* crate's root by the
`register_apo!` / `register_aec_apo!` macros, so the framework itself
does not produce a `cdylib` — that avoids the parallel-link race
(`rust-lang/cargo#6313`) that a dual `rlib` + `cdylib` artefact would
hit. Each reference APO under `examples/` is its own `cdylib`.

```
tympan-apo/
├── src/
│   ├── lib.rs            # Re-exports; public API surface
│   ├── apo.rs            # ProcessingObject trait, ProcessInput,
│   │                     #   ApoCategory, SystemEffect
│   ├── instance.rs       # ApoInstance<T> + AnyApoInstance: the
│   │                     #   framework-side lifecycle wrapper
│   ├── buffer.rs         # BufferFlags, ConnectionProperty
│   ├── format.rs         # Format, FormatNegotiation,
│   │                     #   WAVEFORMATEX(TENSIBLE) conversions
│   ├── error.rs          # HResult wrapper + APO HRESULT constants
│   ├── clsid.rs          # Clsid (cross-platform GUID)
│   ├── inf.rs            # INF file generator
│   ├── fx_properties.rs  # FxProperties endpoint-binding helpers
│   ├── macros.rs         # register_apo! / register_aec_apo!
│   ├── raw/              # Low-level COM bindings (Windows-only)
│   │   ├── mod.rs
│   │   ├── abi.rs            # Compile-time ABI invariants
│   │   ├── class_factory.rs  # ApoVTable + ApoClassFactory
│   │   ├── instance_com.rs   # ApoInstanceCom: IAudioProcessingObject
│   │   │                     #   family + IAudioSystemEffects v1/v2/v3
│   │   ├── dispatch.rs       # Shared COM method bodies
│   │   ├── media_type.rs     # IAudioMediaType <-> Format bridge
│   │   ├── reg_properties.rs # APO_REG_PROPERTIES payload builder
│   │   ├── register.rs       # HKCU CLSID registry write/clear
│   │   └── exports.rs        # Dll* dispatch helpers
│   ├── realtime/         # Realtime-safe primitives (cross-platform)
│   │   ├── mod.rs
│   │   ├── context.rs    # RealtimeContext marker type
│   │   ├── ring.rs       # Lock-free SPSC ring buffer
│   │   ├── state.rs      # StateCell lifecycle state machine
│   │   └── refcount.rs   # Atomic COM-style refcount
│   └── aec/              # Windows 11 AEC APO support
│       │                 #   (Windows + `aec` feature)
│       ├── mod.rs            # AecProcessingObject, AecApoInstance<T>,
│       │                     #   AnyAecApoInstance, AuxiliaryInputBuffer
│       ├── class_factory.rs  # AecApoVTable + AecApoClassFactory
│       ├── instance_com.rs   # AecApoInstanceCom: the nine AEC IIDs
│       └── exports.rs        # AEC Dll* dispatch helpers
├── examples/
│   ├── passthrough.rs    # Trivial APO: copies input to output
│   ├── gain.rs           # Fixed linear gain; per-instance state
│   └── aec_scaffold.rs   # AEC APO skeleton (requires `aec` feature)
└── tests/
    ├── realtime_safety.rs    # assert_no_alloc guard on the RT path
    ├── register_apo.rs       # Macro-emitted export wiring
    ├── tier3_lifecycle.rs    # In-process COM activation (SISO)
    └── tier3_aec_lifecycle.rs # In-process COM activation (AEC)
```

## Layer model


The framework is split into four conceptual layers, each isolated by a
module boundary.

### Layer 1: `raw` — COM bindings


Windows-only (`#[cfg(windows)]`).

- Sole consumer of the `windows` / `windows-core` crates' APO
  interface types and the sole owner of `windows_core::implement`-based
  vtable construction.
- `instance_com::ApoInstanceCom` bridges `Arc<dyn AnyApoInstance>` to
  the `IAudioProcessingObject` family
  (`IAudioProcessingObject`, `IAudioProcessingObjectConfiguration`,
  `IAudioProcessingObjectRT`) plus `IAudioSystemEffects` v1/v2/v3.
- `dispatch` hoists the COM method bodies into free functions over
  `&dyn AnyApoInstance` so the SISO and AEC carriers stay in lock-step
  without copy-pasted impls.
- `class_factory` exposes `ApoVTable` (a CLSID + metadata + creator
  fn) and the `IClassFactory` that mints instances from it.
- `exports` supplies the reusable bodies the macro-emitted `Dll*`
  entry points call into; `register` writes the
  `HKCU\Software\Classes\CLSID\{…}` subtree; `reg_properties` builds
  the variable-length `APO_REG_PROPERTIES` payload; `media_type`
  bridges `IAudioMediaType` to `Format`; `abi` holds compile-time
  `size_of` / `align_of` assertions guarding `windows-rs` layout drift.
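
As a rough illustration of the kind of guard `abi` holds, the sketch below pins the layout of a stand-in struct at compile time. `WaveFormatExLayout` is a hypothetical name used only here; it mirrors the 18-byte packed `WAVEFORMATEX` from the Windows headers and is not the framework's or `windows-rs`'s own type.

```rust
use core::mem::{align_of, size_of};

// Hypothetical stand-in that mirrors the packed Windows `WAVEFORMATEX`.
// It is not the type the framework or `windows-rs` defines.
#[repr(C, packed(1))]
pub struct WaveFormatExLayout {
    pub format_tag: u16,
    pub channels: u16,
    pub samples_per_sec: u32,
    pub avg_bytes_per_sec: u32,
    pub block_align: u16,
    pub bits_per_sample: u16,
    pub cb_size: u16,
}

// Evaluated during compilation: if a dependency upgrade changed the layout,
// the build would fail here instead of corrupting memory at run time.
const _: () = assert!(size_of::<WaveFormatExLayout>() == 18);
const _: () = assert!(align_of::<WaveFormatExLayout>() == 1);
```

Without `packed(1)` the same fields would be padded to 20 bytes, which is the sort of silent drift a compile-time assertion catches.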

Users of `tympan-apo` are not expected to touch this module. It is
`pub` for advanced users and the framework's own test harness.

### Layer 2: `realtime` — zero-allocation primitives


Cross-platform — the realtime invariants do not depend on Windows
APIs, and unit-testing them on any host is more valuable than gating
them behind `#[cfg(windows)]`.

- No allocator use, no `std::sync::Mutex`, no `std::collections`.
- `RealtimeContext` — a zero-sized marker required as a parameter for
  any function safe to call from the realtime `APOProcess` path. It
  cannot be constructed by user code (the framework hands one out by
  reference from its `process` harness), so its presence in a call
  stack is a compile-time witness of realtime safety.
- `ring` — a lock-free single-producer / single-consumer ring buffer.
  `Producer` / `Consumer` are `Send` but not `Sync`; capacity is fixed
  at construction, so `try_push` / `try_pop` are wait-free and
  heap-touch-free.
- `state` provides `StateCell`, the atomic lifecycle state machine
  (`Uninitialized → Initialized → Locked`), with bad transitions
  surfaced as `TransitionError` rather than silent corruption.
- `refcount` provides `Refcount`, the wait-free atomic counter behind the
  COM `IUnknown` `AddRef` / `Release` contract.
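
The lifecycle state machine in the list above can be sketched with a single atomic and `compare_exchange`. This is a minimal illustration under assumptions: the method names (`get`, `transition`) and the fields of `TransitionError` are invented for the example and may not match the framework's `StateCell`.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum State {
    Uninitialized = 0,
    Initialized = 1,
    Locked = 2,
}

// Hypothetical error shape: which state was required and which was found.
#[derive(Debug, PartialEq, Eq)]
pub struct TransitionError {
    pub expected: State,
    pub found: State,
}

pub struct StateCell(AtomicU8);

impl StateCell {
    pub const fn new() -> Self {
        StateCell(AtomicU8::new(State::Uninitialized as u8))
    }

    pub fn get(&self) -> State {
        decode(self.0.load(Ordering::Acquire))
    }

    /// Moves `from -> to` in one atomic step, or reports the state that was
    /// actually observed. No allocation and no lock is involved.
    pub fn transition(&self, from: State, to: State) -> Result<(), TransitionError> {
        self.0
            .compare_exchange(from as u8, to as u8, Ordering::AcqRel, Ordering::Acquire)
            .map(|_| ())
            .map_err(|found| TransitionError {
                expected: from,
                found: decode(found),
            })
    }
}

// Only the three discriminants above are ever stored in the cell.
fn decode(raw: u8) -> State {
    match raw {
        0 => State::Uninitialized,
        1 => State::Initialized,
        _ => State::Locked,
    }
}
```

Attempting `Initialized → Locked` on a fresh cell returns an error that names `Uninitialized` as the state found, so an out-of-order host call is reported instead of being applied.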

### Layer 3: Public API — safe, idiomatic


This is the layer the large majority of users interact with. It lives
in the crate root and the cross-platform modules `apo`, `buffer`,
`clsid`, `error`, `format`, `instance`, `inf`, and `fx_properties`.

- `ProcessingObject` — the trait users implement (see below).
- `ApoInstance<T>` / `AnyApoInstance` — the framework-side wrapper that
  combines a `StateCell`, a `Refcount`, and an `UnsafeCell<T>` into the
  single object handed to the audio engine. `AnyApoInstance` is the
  type-erased view the COM bridge dispatches through.
- `Format` / `FormatNegotiation` — PCM stream description and the
  Accept / Suggest negotiation result.
- `ProcessInput` / `BufferFlags` / `ConnectionProperty` — the
  per-buffer payload and host flag words.
- `Clsid` / `HResult` — cross-platform GUID and HRESULT value types,
  layout-compatible with their `windows-core` counterparts.
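
The split between `ApoInstance<T>` and `AnyApoInstance` follows a common Rust pattern: a trait with associated constants and a `Sized` bound cannot be used as a trait object, so a generic wrapper implements a second, object-safe trait on its behalf. The sketch below shows only that pattern. `Effect`, `AnyInstance`, `Instance`, and `create` are hypothetical stand-ins, and the real wrapper also carries the state cell, the refcount, and interior mutability, which are omitted here.

```rust
use std::sync::Arc;

// Stand-in for a user-facing trait. The associated constant and the `Sized`
// bound mean `dyn Effect` is not allowed.
pub trait Effect: Sized + Send + Sync {
    const NAME: &'static str;
    fn new() -> Self;
    fn gain(&self) -> f32;
}

// Object-safe view: methods only, so `dyn AnyInstance` is allowed.
pub trait AnyInstance: Send + Sync {
    fn name(&self) -> &'static str;
    fn gain(&self) -> f32;
}

// Generic wrapper that forwards to the concrete effect type.
pub struct Instance<T> {
    inner: T,
}

impl<T: Effect> AnyInstance for Instance<T> {
    fn name(&self) -> &'static str {
        T::NAME
    }
    fn gain(&self) -> f32 {
        self.inner.gain()
    }
}

// The concrete type is erased at this boundary.
pub fn create<T: Effect + 'static>() -> Arc<dyn AnyInstance> {
    Arc::new(Instance { inner: T::new() })
}

pub struct Half;
impl Effect for Half {
    const NAME: &'static str = "half";
    fn new() -> Self {
        Half
    }
    fn gain(&self) -> f32 {
        0.5
    }
}

pub struct Unity;
impl Effect for Unity {
    const NAME: &'static str = "unity";
    fn new() -> Self {
        Unity
    }
    fn gain(&self) -> f32 {
        1.0
    }
}
```

Code on the COM side can then hold `Arc<dyn AnyInstance>` values for different effect types in one collection without being generic itself.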

### Layer 4: `aec` — Windows 11 AEC APO support


Gated on `#[cfg(all(windows, feature = "aec"))]` so non-AEC plugins do
not pull in the Windows 11 SDK surface.

- `AecProcessingObject` — extension trait over `ProcessingObject`
  adding the auxiliary-input lifecycle hooks (`add_aux_input`,
  `remove_aux_input`, `is_aux_format_supported`, `accept_aux_input`).
- `AecApoInstance<T>` / `AnyAecApoInstance` — the AEC wrapper, built on
  top of `ApoInstance<T>` so the SISO state machine is reused.
- `AuxiliaryInputBuffer` — the per-buffer reference-signal payload
  delivered to `accept_aux_input` on the realtime thread.
- `class_factory` / `instance_com` / `exports` — the AEC counterparts
  of the `raw` carriers. `AecApoInstanceCom` advertises nine COM
  interfaces: the six SISO interfaces plus
  `IApoAcousticEchoCancellation`, `IApoAuxiliaryInputConfiguration`,
  and `IApoAuxiliaryInputRT`.

## Core abstractions


### `ProcessingObject`


The top-level trait implemented by consumers. Each implementor is one
CLSID-identified APO. The framework's COM harness constructs the type
via `new`, drives the format-negotiation / `LockForProcess` /
`APOProcess` / `UnlockForProcess` sequence, and routes the audio
engine's calls into the trait methods.

```text
pub trait ProcessingObject: Sized + Send {
    const CLSID: Clsid;
    const NAME: &'static str;
    const COPYRIGHT: &'static str;
    const CATEGORY: ApoCategory;          // Sfx / Mfx / Efx

    fn new() -> Self;

    // Format negotiation — defaults accept any IEEE-float32 stream
    // and Suggest a float32 alternative for anything else.
    fn is_input_format_supported(&self, format: &Format) -> FormatNegotiation { … }
    fn is_output_format_supported(&self, format: &Format) -> FormatNegotiation { … }

    // System-effect enumeration / toggling (IAudioSystemEffects2/3).
    // Defaults: no enumerable effects, no-op toggle.
    fn system_effects(&self) -> &[SystemEffect] { &[] }
    fn set_system_effect_state(&mut self, id: &Clsid, state: SystemEffectState) { … }

    // Lifecycle. Pre-allocate in lock_for_process; release in unlock.
    fn lock_for_process(&mut self, input: &Format, output: &Format)
        -> Result<(), HResult> { Ok(()) }
    fn unlock_for_process(&mut self) {}

    // Realtime: allocation-free, lock-free, no syscalls.
    fn process(
        &mut self,
        rt: &RealtimeContext,
        input: ProcessInput<'_>,
        output: &mut [f32],
    ) -> BufferFlags;
}
```

`process` is the only required method past `new` and the associated
constants; everything else has a sensible default. The return value
becomes the `u32BufferFlags` of the host's output
`APO_CONNECTION_PROPERTY`.
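
To make the shape of a `process` body concrete, here is a standalone sketch of a fixed-gain effect. It does not use the framework's types: `Flags` is a stand-in for `BufferFlags` (its numeric values follow the Windows `APO_BUFFER_FLAGS` enumeration), a plain `&[f32]` stands in for `ProcessInput<'_>`, and the `RealtimeContext` parameter is left out. Reporting an all-zero output as silent is a choice made for the example, not documented framework behaviour.

```rust
/// Stand-in for the framework's flag type; discriminants follow the Windows
/// `APO_BUFFER_FLAGS` enumeration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Flags {
    Invalid = 0,
    Valid = 1,
    Silent = 2,
}

pub struct Gain {
    linear: f32,
}

impl Gain {
    pub fn new(linear: f32) -> Self {
        Gain { linear }
    }

    /// Allocation-free and lock-free: only the two slices are touched.
    pub fn process(&mut self, input: &[f32], output: &mut [f32]) -> Flags {
        // Process the frames both buffers have in common.
        let n = input.len().min(output.len());
        let mut silent = true;
        for (out, sample) in output[..n].iter_mut().zip(&input[..n]) {
            *out = sample * self.linear;
            silent &= *out == 0.0;
        }
        if silent {
            Flags::Silent
        } else {
            Flags::Valid
        }
    }
}
```

Any buffers or coefficient tables such an effect needs would be sized in `lock_for_process`, so the loop itself never has to grow anything.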

The framework emits the COM in-process server entry points via a
macro:

```text
tympan_apo::register_apo!(MyApo);
```

This expands, in the calling crate's root, to the `ApoVTable` static,
a one-entry registry, and the four `#[no_mangle]` `Dll*` exports
(`DllGetClassObject`, `DllCanUnloadNow`, `DllRegisterServer`,
`DllUnregisterServer`) wired to the dispatch helpers in `raw::exports`.
It must be called exactly once per `cdylib` because the emitted
symbols have fixed names.
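
As one example of what a dispatch helper behind these exports might look like, the sketch below implements the usual COM convention for `DllCanUnloadNow`: return `S_OK` when no objects are alive and `S_FALSE` otherwise. The function and static names are hypothetical, and whether `raw::exports` tracks liveness exactly this way is an assumption.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

pub const S_OK: i32 = 0;
pub const S_FALSE: i32 = 1;

// Hypothetical module-wide count of live COM objects.
static LIVE_OBJECTS: AtomicUsize = AtomicUsize::new(0);

/// Called when an instance or class factory is created.
pub fn object_created() {
    LIVE_OBJECTS.fetch_add(1, Ordering::AcqRel);
}

/// Called when the last reference to an object is released.
pub fn object_destroyed() {
    LIVE_OBJECTS.fetch_sub(1, Ordering::AcqRel);
}

/// Body a `DllCanUnloadNow` export could delegate to.
pub fn can_unload_now() -> i32 {
    if LIVE_OBJECTS.load(Ordering::Acquire) == 0 {
        S_OK
    } else {
        S_FALSE
    }
}
```

Keeping the logic in an ordinary function and leaving only a thin exported wrapper to the macro lets the behaviour be unit-tested without loading a DLL.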

### `Format` and format negotiation


`Format` mirrors `WAVEFORMATEX` plus the `WAVEFORMATEXTENSIBLE`
extension (channel mask, valid-bits-per-sample, sub-format). Typed
constructors (`pcm_int16`, `pcm_int24`, `pcm_int32`, `pcm_float32`,
`pcm_float64`) produce the base variant; `with_extensible` opts into
the extensible wire format and fills a default channel mask.
`raw::media_type` converts to and from the host's `IAudioMediaType`.

```text
fn is_input_format_supported(&self, format: &Format) -> FormatNegotiation {
    if format.sample_rate() == 48_000 && format.channels() == 1 {
        FormatNegotiation::Accept
    } else {
        FormatNegotiation::Suggest(Format::pcm_float32(48_000, 1))
    }
}
```
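
For readers unfamiliar with `WAVEFORMATEX`, two of its fields are derived from the others for PCM data: `nBlockAlign` is the size of one frame in bytes, and `nAvgBytesPerSec` is the frame size times the sample rate. The sketch below computes both on a stand-in struct. `PcmFormat` and its methods are hypothetical and are not the framework's `Format` API.

```rust
// Hypothetical stand-in carrying only the fields needed for the arithmetic.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PcmFormat {
    pub sample_rate: u32,
    pub channels: u16,
    pub bits_per_sample: u16,
}

impl PcmFormat {
    /// `nBlockAlign`: bytes per frame, i.e. one sample for every channel.
    pub fn block_align(&self) -> u16 {
        self.channels * (self.bits_per_sample / 8)
    }

    /// `nAvgBytesPerSec`: bytes per frame times frames per second.
    pub fn avg_bytes_per_sec(&self) -> u32 {
        self.sample_rate * u32::from(self.block_align())
    }
}
```

A 48 kHz mono float32 stream therefore has a 4-byte frame and a data rate of 192,000 bytes per second.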

### `RealtimeContext`


A zero-sized marker that compile-checks realtime safety. The framework
passes one by reference from its `APOProcess` harness into
`ProcessingObject::process`; it has no fields and no user-reachable
constructor (tests use the crate-private `new_unchecked`).
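
The witness idea can be shown in a few lines of plain Rust. In this sketch the module `framework`, the token type `RtToken`, and the harness `run_process` are all hypothetical stand-ins: the private field is what prevents code outside the module from building a token, so any function that takes `&RtToken` can only be reached through the harness.

```rust
pub mod framework {
    /// Zero-sized token. The private field blocks construction outside this
    /// module, so holding a reference proves the harness is on the stack.
    pub struct RtToken {
        _private: (),
    }

    /// Stand-in for the process harness: the only place a token is made.
    pub fn run_process<R>(callback: impl FnOnce(&RtToken) -> R) -> R {
        let token = RtToken { _private: () };
        callback(&token)
    }
}

use framework::RtToken;

/// Requiring `&RtToken` marks this helper as intended for the realtime path.
pub fn apply_gain(_rt: &RtToken, samples: &mut [f32], gain: f32) {
    for s in samples.iter_mut() {
        *s *= gain;
    }
}

/// Drives the helper through the harness and returns the scaled buffer.
pub fn demo() -> [f32; 3] {
    let mut buf = [1.0, 2.0, 3.0];
    framework::run_process(|rt| apply_gain(rt, &mut buf, 2.0));
    buf
}
```

The token checks where a function is called from, not what the function does, so it complements rather than replaces an allocation guard in tests.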

### `aec::AecProcessingObject`


Extension trait for AEC APOs. Adds the auxiliary-input (reference
stream) lifecycle on top of `ProcessingObject`:

```text
pub trait AecProcessingObject: ProcessingObject {
    fn add_aux_input(&mut self, id: u32, format: &Format, init_data: &[u8])
        -> Result<(), HResult> { Ok(()) }
    fn remove_aux_input(&mut self, id: u32) {}
    fn is_aux_format_supported(&self, format: &Format) -> FormatNegotiation { … }
    fn accept_aux_input(&mut self, rt: &RealtimeContext, input: AuxiliaryInputBuffer<'_>) {}
}
```

All four methods have defaults, so an implementor overrides only what
its echo-cancellation algorithm needs. `accept_aux_input` runs on the
realtime thread and carries the same allocation-free / lock-free
constraints as `process`.

## Cross-cutting concerns


### CLSID allocation


APOs are identified by COM Class IDs. `Clsid` is a cross-platform,
`#[repr(C)]`, GUID-layout-compatible type with `from_u128` /
`from_parts` constructors so authors can declare and unit-test CLSIDs
on any host. `Clsid::NIL` is the sentinel COM rejects as
`CLASS_E_CLASSNOTAVAILABLE`.
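
A GUID-layout-compatible value type needs nothing platform-specific, which is what makes it testable on any host. The sketch below uses a stand-in `Guid` struct; the field split in `from_u128` follows the convention `windows-core` uses for its own GUID type, and it is an assumption that `Clsid::from_u128` and `Clsid::from_parts` behave the same way.

```rust
// Hypothetical stand-in with the standard 4-2-2-8 GUID field layout.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Guid {
    data1: u32,
    data2: u16,
    data3: u16,
    data4: [u8; 8],
}

impl Guid {
    pub const NIL: Guid = Guid::from_u128(0);

    /// Splits the 128-bit value from the most significant bits downwards.
    pub const fn from_u128(value: u128) -> Guid {
        Guid {
            data1: (value >> 96) as u32,
            data2: (value >> 80) as u16,
            data3: (value >> 64) as u16,
            data4: (value as u64).to_be_bytes(),
        }
    }

    pub const fn from_parts(data1: u32, data2: u16, data3: u16, data4: [u8; 8]) -> Guid {
        Guid { data1, data2, data3, data4 }
    }
}
```

Because both constructors are `const fn`, a CLSID declared this way can sit in an associated constant such as `ProcessingObject::CLSID`.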

### Registration


Three layers of registration helper, increasing in platform
specificity:

- `raw::register` backs `DllRegisterServer` / `DllUnregisterServer`,
  which write and clear the `HKCU\Software\Classes\CLSID\{…}` subtree,
  so `regsvr32 /n /i:user` works without administrative privilege.
- `inf` exposes `generate(&InfConfig)`, which emits a minimal INF for
  production drops that integrate with the Windows componentization
  model.
- `fx_properties` — binds a registered CLSID to a specific audio
  endpoint by writing the `FxProperties` subtree under
  `HKLM\…\MMDevices\Audio`. Requires elevation.
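
For the per-user COM registration in the first bullet, the key name is the CLSID in its braced, upper-case string form. The sketch below only builds that string and a key path; it does not touch the registry. The helper names are hypothetical, and the `InprocServer32` subkey is the standard COM location for an in-process server's DLL path rather than something this document states the framework writes.

```rust
/// Formats a 128-bit CLSID as `{XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}`.
pub fn clsid_braced(value: u128) -> String {
    // 32 upper-case hex digits, zero-padded, then split 8-4-4-4-12.
    let hex = format!("{value:032X}");
    format!(
        "{{{}-{}-{}-{}-{}}}",
        &hex[0..8],
        &hex[8..12],
        &hex[12..16],
        &hex[16..20],
        &hex[20..32]
    )
}

/// Key path, relative to `HKCU`, under which a per-user in-process server
/// is conventionally registered.
pub fn inproc_server_key(value: u128) -> String {
    format!(r"Software\Classes\CLSID\{}\InprocServer32", clsid_braced(value))
}
```

Keeping path construction separate from the registry write keeps this part testable on non-Windows hosts.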

### Realtime logging


Realtime code cannot log via `tracing` or `log` (both allocate). The
`realtime::ring` SPSC buffer is the substrate for the "log from the
realtime thread, drain off-thread" pattern: push a small `Copy` event
from `process`, drain it from a non-realtime thread.
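
The pattern can be sketched with a small single-producer / single-consumer ring built from atomics. This is an illustration, not the `realtime::ring` implementation: the event type is fixed to `u32`, the capacity must be a power of two, and the names `ring`, `Producer`, and `Consumer` are reused only for readability. The one allocation happens in `ring`, before the realtime path starts.

```rust
use std::sync::atomic::{AtomicU32, AtomicUsize, Ordering};
use std::sync::Arc;

struct Shared {
    slots: Box<[AtomicU32]>,
    head: AtomicUsize, // next slot to pop; written only by the consumer
    tail: AtomicUsize, // next slot to push; written only by the producer
}

pub struct Producer(Arc<Shared>);
pub struct Consumer(Arc<Shared>);

/// Allocates the ring once, up front. `capacity` must be a power of two.
pub fn ring(capacity: usize) -> (Producer, Consumer) {
    assert!(capacity.is_power_of_two());
    let shared = Arc::new(Shared {
        slots: (0..capacity).map(|_| AtomicU32::new(0)).collect(),
        head: AtomicUsize::new(0),
        tail: AtomicUsize::new(0),
    });
    (Producer(Arc::clone(&shared)), Consumer(shared))
}

impl Producer {
    /// Realtime side: never blocks or allocates; returns `false` when full.
    pub fn try_push(&mut self, event: u32) -> bool {
        let s = &*self.0;
        let tail = s.tail.load(Ordering::Relaxed);
        let head = s.head.load(Ordering::Acquire);
        if tail.wrapping_sub(head) == s.slots.len() {
            return false;
        }
        s.slots[tail & (s.slots.len() - 1)].store(event, Ordering::Relaxed);
        // Release publishes the slot write to the consumer.
        s.tail.store(tail.wrapping_add(1), Ordering::Release);
        true
    }
}

impl Consumer {
    /// Non-realtime side: returns `None` when the ring is empty.
    pub fn try_pop(&mut self) -> Option<u32> {
        let s = &*self.0;
        let head = s.head.load(Ordering::Relaxed);
        let tail = s.tail.load(Ordering::Acquire);
        if head == tail {
            return None;
        }
        let event = s.slots[head & (s.slots.len() - 1)].load(Ordering::Relaxed);
        // Release hands the slot back to the producer.
        s.head.store(head.wrapping_add(1), Ordering::Release);
        Some(event)
    }
}
```

When the ring is full the producer drops the event instead of waiting, which is the trade-off that keeps the realtime thread from ever blocking on the logger.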

## Resolved design decisions


The questions that were open during the design phase have since been
settled:

- **Aggregation.** APOs are single-input single-output (with optional
  aux inputs in AEC mode). The framework enforces SISO at the type
  level and the class factory rejects aggregation with
  `CLASS_E_NOAGGREGATION`.
- **Minimum Rust and Windows versions.** MSRV is Rust 1.80, matching the
  `windows` crate. The non-AEC path targets Windows 10+; the `aec`
  feature targets Windows 11 23H2+ and is gated so non-AEC builds do
  not require the newer SDK.
- **AEC reference stream.** The reference (loopback) signal is
  delivered through `IApoAuxiliaryInputRT::AcceptInput` and surfaced
  to user code as `AuxiliaryInputBuffer`. The framework does not open
  its own WASAPI loopback.
- **Signal-processing modes.** APOs declare their slot via
  `ApoCategory` (`Sfx` / `Mfx` / `Efx`); they are otherwise
  mode-agnostic at the framework layer.
- **Dynamic effect on/off.** `IAudioSystemEffects2` /
  `IAudioSystemEffects3` are implemented: `ProcessingObject::system_effects`
  advertises the effect list and `set_system_effect_state` receives
  the engine's toggle calls.

## Known limitations


- `IApoAuxiliaryInputRT::AcceptInput` infers the aux buffer geometry
  from the primary input's locked format. AEC APOs whose aux input
  uses a different format than the primary input need explicit
  per-aux-input format tracking, which is not yet implemented.
- `raw::reg_properties` advertises a fixed interface list per carrier
  (three SISO IIDs, or nine for the AEC carrier). Widening it for an
  APO with a bespoke interface set would require a code change.
- Tier 4 verification — driving a real `audiodg.exe` — cannot run on
  GitHub-hosted runners; it is a manual / self-hosted step. See
  [`testing.md`](testing.md).