audio_samples 1.0.12

# Plotting + Educational — Roadmap

## Architecture decision: two new crates

The advanced features in this roadmap — streaming playback with a synced playhead, WASM
recomputation, A/B difference spectrograms, filter Bode plots — pull in dependencies and
capabilities that have no place in the core `audio_samples` crate. Equally, the educational
layer depends on rich plotting, which depends on the core. The correct shape is a dependency
chain across the ecosystem:

```
audio_samples_io       audio_samples_streaming
       │                        │
       └────────────┬───────────┘
                    ▼
              audio_samples          (core: types, traits, algorithms)
                    │
                    ▼
         audio_samples_plotting      (all plot types, themes, DSP overlays)
                    │
                    ▼
         audio_samples_education     (educational HTML documents, WASM, playback)
```

`audio_samples_plotting` depends on `audio_samples` for `AudioSamples<T>`, `StandardSample`,
and the algorithm traits. It owns every plot type, theme, and DSP-overlay computation.

`audio_samples_education` depends on `audio_samples_plotting` for waveform/spectrogram/filter
rendering and on `audio_samples` for the explain-chain primitives. It will eventually depend on
`audio_samples_io` for inline base64 audio encoding and `audio_samples_streaming` for Web Audio
playback.

`audio_samples` drops the `plotting` and `educational` features entirely after migration. Users
who want visualization: `audio_samples_plotting`. Users who want the educational layer:
`audio_samples_education`.

---

## Phase 0 — Crate extraction

**Goal:** Establish the two new crates and migrate without breaking the existing public API.

### 0.1 — Create `audio_samples_plotting`

- New crate in the workspace: `crates/audio_samples_plotting/`
- Move `src/operations/plotting/` → `audio_samples_plotting/src/`
- Move the `AudioPlotting` trait from `audio_samples::operations::traits` into
  `audio_samples_plotting`
- `audio_samples` drops its in-tree plotting implementation; the `plotting` feature survives
  only as the shim described in the next item
- Re-export `audio_samples_plotting` publicly from `audio_samples` behind the optional
  `plotting` feature as a thin compatibility shim until the next major version, at which point
  the feature is removed and remaining users migrate to the new crate

### 0.2 — Create `audio_samples_education`

- New crate in the workspace: `crates/audio_samples_education/`
- Move `src/educational/` (including `src/educational/processing.rs`) → `audio_samples_education/src/`
- Move the `Explainable`, `ExplainMode`, `Explaining` re-exports
- `audio_samples` drops the `educational` feature
- Same thin shim strategy as above for compatibility

### 0.3 — Feature flags in the new crates

`audio_samples_plotting` feature flags:
- `transforms` — spectrogram types (mel, gammatone, CQT, chroma)
- `filters` — filter response plots (Bode, pole-zero, group delay)
- `static-plots` — static image export via `plotly_static`
- `html-view` — `.show()` opens in browser
- `wasm` — WASM-compatible rendering path (no filesystem, no browser-open)

`audio_samples_education` feature flags:
- `plotting` (required, on by default) — waveform/spectrogram panels per step
- `audio-playback` — inline base64 audio + Web Audio synced playhead
- `wasm-dsp` — compile DSP kernels to WASM for live parameter recomputation
- `offline` — inline CDN dependencies for offline use

---

## Phase 1 — Plotting primitives: scale and audio-native foundations

**Goal:** Make the plotting layer correct before making it pretty. Everything in this phase
is a prerequisite for phases 2–5.

### 1.1 — Peak/envelope decimation (highest priority of the entire roadmap)

A 3-minute stereo file at 48 kHz is ~8.6 M samples per channel (~17 M in total). Handing that to
Plotly freezes the browser or silently renders garbage. The existing `decimate_waveform` with LTTB
is insufficient: LTTB is designed for line charts and will drop transient peaks. The correct
algorithm for waveforms is **min/max binning**: divide the signal into N bins (N = output pixel
columns, ~2000 for a 1920-wide plot), and for each bin record `(min, max, rms)`. Draw each bin as
a vertical segment from min to max; overlay an RMS envelope. At that bin count the result is
visually indistinguishable from the full-resolution draw and preserves every transient peak.

- Replace `decimate_waveform` with `envelope_decimate(signal, n_bins) -> Vec<(f64, f64, f64)>`
  returning `(min, max, rms)` per bin
- Render each bin as a filled area trace (min–max) with an RMS line on top
- Make this automatic and zoom-aware: hook into Plotly's `plotly_relayout` event (via embedded
  JS), detect changes to `xaxis.range[0]`/`xaxis.range[1]`, recompute bins for the visible
  window, and push new trace data via `Plotly.restyle`. This makes zoom reveal more detail
  without ever re-sending the full signal to the DOM.
- Apply to both standalone waveforms and the educational before/after panels

### 1.2 — Design system: `PlotTheme` and baked Plotly defaults

- `pub enum PlotTheme { Midnight, Slate, Amber, Light, Custom(PlotThemeConfig) }`
- `PlotThemeConfig`: bg, surface, border, accent, accent2, text, text_muted, channel_colors
- Default theme: `Midnight` — matching the educational document so `.show()` and the
  educational doc share a visual language
- All `create_*_plot` functions accept a `PlotTheme` and apply it to the Plotly `Layout`
  (paper_bgcolor, plot_bgcolor, font color/family, gridcolor, linecolor, tickcolor)
- Define a canonical per-channel color palette per theme:
  `[#637aff, #38bdf8, #34d399, #f59e0b, #f87171, ...]` — the same values already in the
  educational CSS, now the single source of truth
- Wire up `FontSizes`: apply `font_sizes.title` to layout title, `font_sizes.axis_labels` to
  axis titles, `font_sizes.ticks` to tick fonts
- `Layout::auto_size(true)` on all plots by default

### 1.3 — Time axis formatting

Raw float seconds is wrong for audio. The time axis must:
- Format ticks as `mm:ss.mmm` for durations ≥ 1 s, `ss.mmm` for < 1 s
- Offer a toggle between wall-clock time and sample-index
- Use sufficiently fine tick density to be useful at both full-view and zoomed-in resolutions
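
A tick formatter for the first bullet might look like the sketch below. The function name `format_time_tick` and its two-argument signature are hypothetical; the roadmap only fixes the `mm:ss.mmm` / `ss.mmm` output formats.

```rust
/// Format a tick position (in seconds) for an audio time axis.
/// `total_duration` selects the format: `mm:ss.mmm` at or above 1 s, `ss.mmm` below.
/// Hypothetical helper; not an existing `audio_samples` function.
pub fn format_time_tick(seconds: f64, total_duration: f64) -> String {
    // Work in integer milliseconds so rounding can never yield "59.1000".
    let total_ms = (seconds.max(0.0) * 1000.0).round() as u64;
    let ms = total_ms % 1000;
    if total_duration >= 1.0 {
        let secs = (total_ms / 1000) % 60;
        let mins = total_ms / 60_000;
        format!("{:02}:{:02}.{:03}", mins, secs, ms)
    } else {
        format!("{:02}.{:03}", total_ms / 1000, ms)
    }
}
```

Rounding once to whole milliseconds before splitting into fields avoids carry bugs at minute boundaries (for example 59.9996 s becomes `01:00.000` rather than `00:60.000`).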

### 1.4 — dBFS y-axis for waveform

`AudioStatistics` already has dBFS utilities. The waveform params should expose:
```rust
pub enum AmplitudeScale { Linear, DbFs }
```
When `DbFs`: convert samples via `20 * log10(|sample|)`, clamp at a configurable floor
(default -120 dBFS), label the axis accordingly. Useful for anything where dynamic range
matters more than waveform shape.
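
A minimal sketch of the conversion, assuming samples are normalised so that full scale is 1.0. The `scale_sample` helper is hypothetical; only the `AmplitudeScale` enum and the -120 dBFS default floor come from this roadmap.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AmplitudeScale {
    Linear,
    DbFs,
}

/// Map one sample onto the chosen y-axis scale.
/// Assumption: full scale is 1.0, so 0 dBFS corresponds to |sample| == 1.0.
pub fn scale_sample(sample: f64, scale: AmplitudeScale, floor_db: f64) -> f64 {
    match scale {
        AmplitudeScale::Linear => sample,
        AmplitudeScale::DbFs => {
            let magnitude = sample.abs();
            if magnitude == 0.0 {
                // log10(0) is -inf; report the floor instead.
                floor_db
            } else {
                (20.0 * magnitude.log10()).max(floor_db)
            }
        }
    }
}
```

Note that the dBFS view discards sign, so it shows level over time rather than waveform shape.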

### 1.5 — Spectrogram quality

The existing spectrogram is underspecified on the axes most important to its legibility:

**Frequency axis options.** A linear-frequency spectrogram is close to useless for music and
speech: all the perceptually relevant structure is crammed into the bottom eighth of the axis.
Add a `FreqAxisScale` option:
```rust
pub enum FreqAxisScale { Linear, Log, Mel, Bark, Erb }
```
This is separate from the existing `SpectrogramType` (which controls the magnitude encoding and
spectrogram algorithm). `FreqAxisScale` controls only how the axis is labelled and the bins are
warped for display.
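
One way to implement the display warp is a per-variant mapping from Hz to axis position, sketched below. The `warp` method is hypothetical, and the roadmap does not say which formulas to use: this sketch assumes the HTK-style mel formula, Traunmüller's Bark approximation, and the Glasberg and Moore ERB-rate scale.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FreqAxisScale {
    Linear,
    Log,
    Mel,
    Bark,
    Erb,
}

impl FreqAxisScale {
    /// Map a frequency in Hz to a position on the display axis.
    /// All variants are monotonic in `hz`, so bin order is preserved.
    pub fn warp(self, hz: f64) -> f64 {
        match self {
            FreqAxisScale::Linear => hz,
            // Clamp to 1 Hz so the DC bin does not map to -infinity.
            FreqAxisScale::Log => hz.max(1.0).log10(),
            // HTK-style mel (assumed choice).
            FreqAxisScale::Mel => 2595.0 * (1.0 + hz / 700.0).log10(),
            // Traunmueller's Bark approximation (assumed choice).
            FreqAxisScale::Bark => 26.81 * hz / (1960.0 + hz) - 0.53,
            // Glasberg & Moore ERB-rate scale (assumed choice).
            FreqAxisScale::Erb => 21.4 * (1.0 + 0.00437 * hz).log10(),
        }
    }
}
```

Tick labels would still be printed in Hz; only their positions come from `warp`.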

**Magnitude-to-color mapping.** Spectrograms must default to dB scale with:
- Explicit reference level (0 dBFS by convention)
- Configurable floor (default -80 dB)
- Top-dB clamp (default 0 dB)
- Perceptually uniform colormap (magma or viridis) as the default — not Plotly's default
  colorscale
The dynamic range clamp is the single biggest determinant of whether a spectrogram is readable.

**Colorbar with dB units.** The colorbar title should show "dB (re 0 dBFS)" with the floor and
ceiling values at the ends.

### 1.6 — Filter analysis plots

Phase 4.3 (educational) plans to add explain texts for `butterworth_lowpass/highpass`,
`chebyshev_i`, and `apply_iir_filter`. The natural visualization for a filter is not a
before/after waveform but its frequency response. These plotting primitives must exist in
`audio_samples_plotting` before 4.3 can be meaningful.

Required plot types:
- **`FilterResponsePlot`** — magnitude Bode plot (dB vs Hz, log x-axis) and phase Bode plot
  (degrees vs Hz), optionally combined in a two-row subplot
- **`PoleZeroPlot`** — unit circle with poles (×) and zeros (○) on the complex plane
- **`ImpulseResponsePlot`** — time-domain impulse/step response
- **`GroupDelayPlot`** — group delay (samples or ms) vs Hz

Compute from `IirFilter` coefficients directly: frequency-sample the transfer function
`H(e^jω)` at 1024 points across [0, Nyquist].
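
The frequency sampling can be sketched without a complex-number dependency. This assumes the filter is available as feed-forward (`b`) and feedback (`a`) coefficient slices with `H(z) = Σ b[n] z^-n / Σ a[n] z^-n`; how `IirFilter` actually stores its coefficients is not stated here, so treat the signature as hypothetical.

```rust
use std::f64::consts::PI;

/// Evaluate sum(c[n] * e^{-j*omega*n}) and return (real, imaginary).
fn eval_poly(coeffs: &[f64], omega: f64) -> (f64, f64) {
    coeffs.iter().enumerate().fold((0.0, 0.0), |(re, im), (n, &c)| {
        let phase = -omega * n as f64;
        (re + c * phase.cos(), im + c * phase.sin())
    })
}

/// Sample H(e^{j*omega}) at `n_points` evenly spaced frequencies over [0, Nyquist].
/// Returns `(frequency_hz, magnitude_db, phase_degrees)` per point.
pub fn frequency_response(
    b: &[f64],
    a: &[f64],
    sample_rate: f64,
    n_points: usize,
) -> Vec<(f64, f64, f64)> {
    let nyquist = sample_rate / 2.0;
    let last = (n_points.max(2) - 1) as f64;
    (0..n_points)
        .map(|k| {
            let omega = PI * k as f64 / last;
            let (nr, ni) = eval_poly(b, omega);
            let (dr, di) = eval_poly(a, omega);
            // Complex division: (nr + j*ni) / (dr + j*di).
            let den = dr * dr + di * di;
            let h_re = (nr * dr + ni * di) / den;
            let h_im = (ni * dr - nr * di) / den;
            let magnitude = (h_re * h_re + h_im * h_im).sqrt();
            // Clamp before the log so exact zeros plot at -240 dB instead of -inf.
            let magnitude_db = 20.0 * magnitude.max(1e-12).log10();
            (omega / PI * nyquist, magnitude_db, h_im.atan2(h_re).to_degrees())
        })
        .collect()
}
```

The phase returned here is wrapped to (-180°, 180°]; a Bode plot would typically unwrap it before drawing.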

### 1.7 — CQT and chromagram as distinct plot types

`chromagram` and `mel_spectrogram` are listed in Phase 4.3 as operations to add explain texts
for, but they are not generic spectrograms. A chromagram has 12 pitch classes on the y-axis, a
circular pitch structure, and different colormap requirements (the pitch class axis wraps). A
CQT has logarithmically spaced frequency bins with a fixed number of bins per octave.

- **`ChromagramPlot`** — heatmap with pitch-class labels (C, C#, D, …, B), y-axis circularity
  implied by label order, viridis default colormap
- Add `CqtSpectrogram` as a `SpectrogramType` variant with the correct bin layout

### 1.8 — Annotations and labeled regions primitive

A reusable primitive used by onset/beat/segment visualization across all plot types:

```rust
pub struct TimeRegion {
    pub start: f64,            // seconds
    pub end: Option<f64>,      // None = point marker
    pub label: Option<String>,
    pub color: Option<String>,
}
```

All plot types that have a time axis accept `Vec<TimeRegion>` via a shared `add_regions` method.
This replaces the ad-hoc `add_vline`/`add_shaded_region` per-type duplication and is the
primitive for onset/beat/segment annotations on spectrograms (currently missing).
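
A shared `add_regions` would first need to decide how each region is drawn. The sketch below shows one way to classify regions; the `RegionShape` enum and the `point`, `span`, and `shape` helpers are hypothetical additions for illustration, and only the `TimeRegion` fields come from this roadmap.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct TimeRegion {
    pub start: f64,            // seconds
    pub end: Option<f64>,      // None = point marker
    pub label: Option<String>,
    pub color: Option<String>,
}

/// How a region should be rendered (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RegionShape {
    Marker { at: f64 },
    Span { start: f64, end: f64 },
}

impl TimeRegion {
    /// Point marker, e.g. an onset or a beat.
    pub fn point(at: f64) -> Self {
        Self { start: at, end: None, label: None, color: None }
    }

    /// Shaded span, e.g. a segment.
    pub fn span(start: f64, end: f64) -> Self {
        Self { start, end: Some(end), label: None, color: None }
    }

    /// Decide between a vertical line and a shaded rectangle.
    /// Reversed bounds are normalised rather than rejected.
    pub fn shape(&self) -> RegionShape {
        match self.end {
            None => RegionShape::Marker { at: self.start },
            Some(end) => RegionShape::Span {
                start: self.start.min(end),
                end: self.start.max(end),
            },
        }
    }
}
```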

---

## Phase 2 — API simplification

**Goal:** Common cases need zero config; power cases stay possible.

### 2.1 — Zero-arg shortcuts on `AudioSamples`

```rust
audio.plot()          // -> AudioSampleResult<WaveformPlot>
audio.spectrogram()   // -> AudioSampleResult<SpectrogramPlot>
audio.spectrum()      // -> AudioSampleResult<MagnitudeSpectrumPlot>
```

Convenience wrappers on the existing trait methods with `PlotTheme::Midnight` and sensible
per-type defaults. No new behaviour, just less friction.

### 2.2 — Auto-computing overlay methods on `WaveformPlot`

Current pattern requires two steps: compute then add. Add high-level companions:

```rust
plot.with_rms_envelope(&audio)                       // sensible default window/hop
plot.with_rms_envelope_params(&audio, window, hop)   // explicit params
plot.with_peak_envelope(&audio)
plot.with_zcr_overlay(&audio)
plot.with_onset_markers(&audio, &config)             // compute + add
plot.with_beat_markers(&audio, &config)
```

The low-level `add_*` methods remain for users who pre-compute their own data.

### 2.3 — Replace `CompositePlot` iframe approach

The current implementation base64-encodes each sub-plot into an `<iframe>`, meaning sub-plots
are not interactive together (no shared hover, no linked zoom). Replace with Plotly's native
subplot system: each plot's traces are added to a shared `Plot` with grid/domain positioning.
`PlotComponent::requires_shared_x_axis()` already exists to guide this.

### 2.4 — Analysis dashboard

Built on the fixed `CompositePlot`:
```rust
audio.analysis_dashboard()  // -> AudioSampleResult<CompositePlot>
```
Default layout: waveform (with min/max envelope from 1.1) on top, mel-dB spectrogram in the
middle, magnitude spectrum at the bottom. Shared x (time) axis for waveform and spectrogram.

### 2.5 — Clean up reserved/unused fields

- `line_style: Option<String>` — implement (Plotly: solid/dash/dot/dashdot) or remove
- `window_type` in `MagnitudeSpectrumParams` — wire up for FFT windowing or remove
- `frame_position` — implement frame-based spectrum or remove
- Static export `width`/`height`/`scale` — expose as config on `PlotUtils::save`

---

## Phase 3 — More audio-native plot types

**Goal:** Cover the standard analysis views a serious audio library is expected to have.

### 3.1 — A/B and difference views

- **Waveform overlay**: render two `AudioSamples` on the same axes with distinct colors, a
  `difference` trace (A − B), and a legend. Entry point:
  `WaveformPlot::compare(audio_a, audio_b, params)`
- **Difference spectrogram**: compute both spectrograms, subtract in dB, render with a
  diverging colormap (blue=A louder, red=B louder, white=equal). Used in educational
  before/after at the spectral level.
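
The per-bin subtraction for the difference spectrogram can be sketched as below. The helper name and the shared floor parameter are assumptions; the sign convention follows the colormap description above, with positive values meaning A is louder.

```rust
/// Per-bin level difference in dB between two magnitude frames.
/// Positive = A louder, negative = B louder, 0 = equal.
/// Both inputs are clamped to `floor_db` first, so two silent bins compare as equal
/// instead of producing NaN from (-inf) - (-inf).
pub fn db_difference(mag_a: &[f64], mag_b: &[f64], floor_db: f64) -> Vec<f64> {
    let to_db = |m: f64| {
        if m > 0.0 {
            (20.0 * m.log10()).max(floor_db)
        } else {
            floor_db
        }
    };
    mag_a
        .iter()
        .zip(mag_b)
        .map(|(&a, &b)| to_db(a) - to_db(b))
        .collect()
}
```

Centering the diverging colormap on 0 dB keeps unchanged regions white regardless of the overall level.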

### 3.2 — Loudness / metering over time

- LUFS momentary (400 ms), short-term (3 s), integrated (full file) per EBU R128
- True-peak (4× oversampled) per ITU-R BS.1770
- Loudness range (LRA)
- Plot type: `LoudnessMeterPlot` — time series of momentary and short-term LUFS with
  integrated as a horizontal reference line; dBTP overlay optional

### 3.3 — Stereo field visualization

- **`GoniometerPlot`** — L vs R Lissajous scatter, updated as a rolling window; reveals
  stereo width, phase issues, and mono compatibility
- **Inter-channel correlation over time** — windowed Pearson r between L and R; +1 = mono,
  0 = uncorrelated, -1 = out-of-phase
- **Mid/side decomposition view** — M = (L+R)/2, S = (L−R)/2, plotted as two waveforms
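
The mid/side and correlation computations are small enough to sketch directly. Function names and the `&[f64]` channel representation are assumptions; a windowed correlation plot would call `pearson_r` once per window.

```rust
/// Mid/side decomposition: M = (L + R) / 2, S = (L - R) / 2.
pub fn mid_side(left: &[f64], right: &[f64]) -> (Vec<f64>, Vec<f64>) {
    left.iter()
        .zip(right)
        .map(|(&l, &r)| ((l + r) / 2.0, (l - r) / 2.0))
        .unzip()
}

/// Pearson correlation between two channels over one window.
/// Returns `None` when either channel has zero variance (e.g. silence),
/// where the coefficient is undefined.
pub fn pearson_r(left: &[f64], right: &[f64]) -> Option<f64> {
    let n = left.len().min(right.len());
    if n == 0 {
        return None;
    }
    let (left, right) = (&left[..n], &right[..n]);
    let mean_l = left.iter().sum::<f64>() / n as f64;
    let mean_r = right.iter().sum::<f64>() / n as f64;
    let (mut cov, mut var_l, mut var_r) = (0.0, 0.0, 0.0);
    for (&l, &r) in left.iter().zip(right) {
        let (dl, dr) = (l - mean_l, r - mean_r);
        cov += dl * dr;
        var_l += dl * dl;
        var_r += dr * dr;
    }
    let denom = (var_l * var_r).sqrt();
    if denom == 0.0 {
        None
    } else {
        Some(cov / denom)
    }
}
```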

### 3.4 — Spectrogram overlays

Overlaying MIR output on a spectrogram is where analysis output becomes legible:
- f0/pitch track (add_pitch_track)
- Onset and beat markers (via `TimeRegion` from 1.8)
- Formant tracks (F1, F2, F3)
- Harmonic series lines from a detected f0

### 3.5 — Phase spectrogram and group delay

- Phase spectrogram: angle of complex STFT bins, rendered with a cyclic colormap (HSV/twilight)
- Instantaneous frequency: time derivative of the unwrapped phase, more readable than raw phase
- Group delay: `−dφ/dω`, rendered as a separate heatmap or line overlay

---

## Phase 4 — Educational structural improvements

**Goal:** Make `audio_samples_education` robust, extensible, and honest about its dependencies.

### 4.1 — Replace raw-string parsing with `ExplanationData`

The current approach of encoding `[operation: Name]\n[formula: LaTeX]` into a plain string then
parsing with `strip_prefix` is the most fragile part of the system. Replace with:

```rust
pub struct ExplanationData {
    pub operation: String,
    pub formula_latex: Option<String>,
    pub prose: String,
    pub visual_type: VisualType,
    pub code: Option<String>,          // exact Rust call for this step
}

pub enum VisualType {
    Waveform,
    Spectrogram,
    FrequencyResponse,   // Bode plot — requires audio_samples_plotting filter plots
    Chromagram,          // requires ChromagramPlot
    Spectrum,            // magnitude spectrum overlay
    Difference,          // A/B difference view
    None,
}
```

`VisualType` is now wide enough for all operations targeted in 4.3. Each `explain_*` function
returns `ExplanationData` directly. The renderer consumes structured data, not parsed text.

### 4.2 — Remove the unsafe pointer cast

`render_visual_block` casts `*const dyn ExplainDisplay as *const AudioSamplesVisual`. This is
unsound. Fix by making `Explanation::visual` a concrete `Option<AudioSamplesVisual>` rather than
`Box<dyn ExplainDisplay>`. Since `audio_samples_education` will own both the `explainable`
integration and `AudioSamplesVisual`, this is straightforward.

### 4.3 — Extend explain texts to more operations

With `ExplanationData` (4.1) and the plotting primitives (Phase 1) in place, covering new
operations is mechanical:

- `AudioIirFiltering`: `butterworth_lowpass/highpass/bandpass`, `chebyshev_i`,
  `apply_iir_filter` — `VisualType::FrequencyResponse` (Bode) as the visual
- `AudioEditing`: `trim`, `pad`, `fade_in`, `fade_out`, `concatenate` — `VisualType::Waveform`
- `AudioTransforms`: `stft` — `VisualType::Spectrogram`; `mel_spectrogram` — `VisualType::Spectrogram`; `chromagram` — `VisualType::Chromagram`

### 4.4 — Statistics comparison block per step

Each card shows a before/after table below the formula:

```
           Before    After
Peak:      0.80      1.00
RMS:       0.32      0.40
Duration:  2.00 s    2.00 s
```

For spectral operations, also show spectral centroid and bandwidth.
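
The scalar rows of that table, and a one-line delta summary, could be computed along these lines. `StepStats`, `step_stats`, `percent_delta`, and `stats_delta_line` are hypothetical names; the crate's existing `AudioStatistics` utilities may already cover the peak and RMS parts.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct StepStats {
    pub peak: f64,
    pub rms: f64,
}

/// Peak (max absolute sample) and RMS of one channel.
pub fn step_stats(samples: &[f64]) -> StepStats {
    if samples.is_empty() {
        return StepStats { peak: 0.0, rms: 0.0 };
    }
    let peak = samples.iter().fold(0.0_f64, |p, &s| p.max(s.abs()));
    let rms = (samples.iter().map(|s| s * s).sum::<f64>() / samples.len() as f64).sqrt();
    StepStats { peak, rms }
}

/// Relative change in percent; `None` when the "before" value is zero.
pub fn percent_delta(before: f64, after: f64) -> Option<f64> {
    if before == 0.0 {
        None
    } else {
        Some((after - before) / before * 100.0)
    }
}

/// One-line summary such as "peak +25%, RMS +25%".
pub fn stats_delta_line(before: StepStats, after: StepStats) -> String {
    let pct = |d: Option<f64>| d.map_or("n/a".to_string(), |v| format!("{:+.0}%", v));
    format!(
        "peak {}, RMS {}",
        pct(percent_delta(before.peak, after.peak)),
        pct(percent_delta(before.rms, after.rms))
    )
}
```

Applying a gain of 1.25, as in the table above (0.80 to 1.00, 0.32 to 0.40), yields a +25% delta on both rows.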

### 4.5 — Show-the-code per step

The `code: Option<String>` field in `ExplanationData` (4.1) drives a per-card code block with
a copy button. Shows the exact `audio_samples` Rust call that produced the step. Makes the
document a reproduction recipe, not just an illustration.

### 4.6 — Spectral before/after overlay

The scalar stats table (4.4) cannot convey *where* in frequency an operation acted.
Add an optional magnitude-spectrum overlay card: before and after spectra on the same axes,
difference spectrum highlighted. Driven by `VisualType::Spectrum` and `VisualType::Difference`.

### 4.7 — Hover-definition glossary

Wrap DSP vocabulary (windowing, leakage, Nyquist, dBFS, LUFS, etc.) in `<span class="gloss">`
elements. A small JS tooltip shows a one-sentence definition on hover. The glossary is defined
once in the template; term highlighting is automatic via a lookup over known terms.

---

## Phase 5 — Educational UI and rich features

**Goal:** Turn the document from a static snapshot into a learning tool.

### 5.1 — Embedded audio playback with synced playhead (highest-value educational feature)

Encode the `AudioSamples` at each step as a base64 WAV `<audio>` element inlined in the HTML.
Wire a `timeupdate` listener to sweep a vertical playhead cursor across the waveform and
spectrogram panels in sync with playback. The before/after comparison cards become before/after
audio the user can A/B by ear. This is worth more than the sidebar, collapsible cards, and
linked brushing combined.

Depends on: `audio_samples_io` for WAV encoding (base64 inline data URI).
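
To make the data-URI idea concrete, here is a dependency-free sketch that writes a mono 16-bit PCM WAV and base64-encodes it. It is a stand-in for illustration only: the real encoder is expected to come from `audio_samples_io`, and all three function names are hypothetical.

```rust
/// Encode mono samples in [-1, 1] as a 16-bit PCM WAV byte buffer (44-byte header).
pub fn encode_wav_mono16(samples: &[f64], sample_rate: u32) -> Vec<u8> {
    let data_len = (samples.len() * 2) as u32;
    let mut out = Vec::with_capacity(44 + samples.len() * 2);
    out.extend_from_slice(b"RIFF");
    out.extend_from_slice(&(36 + data_len).to_le_bytes()); // RIFF chunk size
    out.extend_from_slice(b"WAVE");
    out.extend_from_slice(b"fmt ");
    out.extend_from_slice(&16u32.to_le_bytes()); // fmt chunk size
    out.extend_from_slice(&1u16.to_le_bytes()); // format tag: PCM
    out.extend_from_slice(&1u16.to_le_bytes()); // channels: mono
    out.extend_from_slice(&sample_rate.to_le_bytes());
    out.extend_from_slice(&(sample_rate * 2).to_le_bytes()); // byte rate
    out.extend_from_slice(&2u16.to_le_bytes()); // block align
    out.extend_from_slice(&16u16.to_le_bytes()); // bits per sample
    out.extend_from_slice(b"data");
    out.extend_from_slice(&data_len.to_le_bytes());
    for &s in samples {
        let v = (s.clamp(-1.0, 1.0) * 32767.0).round() as i16;
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

/// Standard base64 (RFC 4648 alphabet, with `=` padding).
pub fn base64_encode(bytes: &[u8]) -> String {
    const TABLE: &[u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = ((chunk[0] as u32) << 16) | (b1 << 8) | b2;
        out.push(TABLE[(n >> 18) as usize & 63] as char);
        out.push(TABLE[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { TABLE[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { TABLE[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Value for the `src` attribute of an inline `<audio>` element.
pub fn audio_data_uri(samples: &[f64], sample_rate: u32) -> String {
    format!("data:audio/wav;base64,{}", base64_encode(&encode_wav_mono16(samples, sample_rate)))
}
```

Base64 inflates the payload by roughly a third, so long files at every step of a chain will make the HTML document correspondingly large.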

### 5.2 — Step navigation sidebar

For chains of 5+ operations: a sticky left sidebar with step names and operation types. Clicking
jumps to the card. The timeline structure is already in place; the sidebar is ~30 lines of JS.

### 5.3 — Collapsible cards

Cards expand/collapse by clicking the header. Collapsed state shows step number, operation
name, and a 1-line stats delta (`peak +25%, RMS +25%`). Makes long documents scannable.

### 5.4 — CSS/JS injection API

```rust
pub struct ExplainConfig {
    pub title: String,
    pub default_theme: PlotTheme,
    pub custom_css: Option<String>,
    pub custom_js: Option<String>,
}
```

`custom_css` overrides any CSS variable, color, or layout. Changing the accent colour or font
requires 2 lines. `ExplainConfig::new(title)` is the zero-friction entry point.
Replace `render_explanation_document(explanations, title)` with
`render_explanation_document(explanations, &ExplainConfig)`.
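
A sketch of the zero-friction constructor follows. The `PlotTheme` here is a simplified stand-in without the `Custom` variant, the `with_css` builder method is a hypothetical convenience, and the `--accent` variable name in the usage note is illustrative rather than a confirmed name from the template.

```rust
/// Simplified stand-in for the `PlotTheme` from 1.2 (no `Custom` variant here).
#[derive(Debug, Clone, PartialEq)]
pub enum PlotTheme {
    Midnight,
    Slate,
    Amber,
    Light,
}

#[derive(Debug, Clone)]
pub struct ExplainConfig {
    pub title: String,
    pub default_theme: PlotTheme,
    pub custom_css: Option<String>,
    pub custom_js: Option<String>,
}

impl ExplainConfig {
    /// Title only; everything else takes the documented defaults.
    pub fn new(title: impl Into<String>) -> Self {
        Self {
            title: title.into(),
            default_theme: PlotTheme::Midnight,
            custom_css: None,
            custom_js: None,
        }
    }

    /// Hypothetical builder-style setter for CSS overrides.
    pub fn with_css(mut self, css: impl Into<String>) -> Self {
        self.custom_css = Some(css.into());
        self
    }
}
```

With this shape, an accent change would read `ExplainConfig::new("Demo").with_css(":root { --accent: #f59e0b; }")`.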

### 5.5 — Linked brush/selection

Dragging a region on the "before" waveform highlights the same time range on the "after"
waveform and seeks the audio playback to that region. ~50 lines of Plotly JS event handling.

### 5.6 — Precomputed parameter sweeps (cheap WASM alternative)

Render the operation at several parameter values and present as small multiples or a slider over
precomputed frames. Example: window-size sweep for STFT (128/512/2048/8192 samples) shown as
four side-by-side spectrograms. Most of the pedagogical value of live WASM at a fraction of the
complexity. This is the recommended stepping stone.

---

## Phase 6 — Ambitious / long-term

### 6.1 — WASM live recomputation

Compile DSP kernels to WASM (`wasm-bindgen`). Embed parameter controls (sliders, dropdowns) in
the educational card. Changing a slider reruns the operation in-browser and redraws the plots
without a Rust rebuild. This turns a static explainer into an interactive instrument.

Architecturally: DSP kernels in `audio_samples` are pure functions on slices — they compile to
WASM cleanly. `audio_samples_education` with the `wasm-dsp` feature links the WASM binary into
the HTML at document generation time.

### 6.2 — Offline self-contained output

`render_explanation_document` already produces CDN-linked HTML. Add an `ExplainConfig::offline`
flag that fetches and inlines KaTeX, Plotly, and fonts at document generation time. Works in
combination with 5.1 since audio is already inlined.

### 6.3 — Export formats

- PDF export via headless Chrome/Chromium (for reports and papers)
- PNG export of individual cards (for slides)
- Jupyter notebook export (`.ipynb`) so the explanation chain becomes a reproducible notebook

---

## Dependency order

```
Phase 0  (crate extraction)
   │
   ├──► Phase 1  (plotting primitives — scale, theme, spectrogram quality, filter plots, chroma)
   │         │
   │         └──► Phase 2  (API simplification, analysis dashboard)
   │                   │
   │                   └──► Phase 3  (more plot types: A/B, loudness, stereo field, phase)
   │
   └──► Phase 4  (educational structure — depends on Phase 1 for VisualType primitives)
             │
             └──► Phase 5  (educational UI — depends on Phase 4 for ExplainConfig/ExplanationData)
                       │
                       └──► Phase 6  (WASM, offline, export — depends on Phase 5)
```

Phases 1 and 4 both depend on Phase 0 (crate extraction). Phase 4's structural work (4.1, 4.2)
does not depend on Phase 1 and can proceed in parallel once the crate boundaries are established.
Phase 3 requires Phase 2. Phase 5 requires Phase 4. Phase 6 requires Phase 5.

The exception is any Phase 4 item that renders a Phase 1 primitive. The decimation work (1.1)
must land before the educational before/after panels can handle full-length files, and the
filter and chromagram plot types (1.6, 1.7) must land before the 4.3 operations that use them:
`VisualType::FrequencyResponse` and `VisualType::Chromagram` in 4.1 are hollow until those
primitives exist. Do not start Phase 4 operations that use those variants until the
corresponding Phase 1 items are complete.