# Plotting + Educational — Roadmap
## Architecture decision: two new crates
The advanced features in this roadmap — streaming playback with a synced playhead, WASM
recomputation, A/B difference spectrograms, filter Bode plots — pull in dependencies and
capabilities that have no place in the core `audio_samples` crate. Equally, the educational
layer depends on rich plotting, which depends on the core. The correct shape is a dependency
chain across the ecosystem:
```
audio_samples_io        audio_samples_streaming
        │                          │
        └────────────┬─────────────┘
                     ▼
              audio_samples            (core: types, traits, algorithms)
                     │
                     ▼
          audio_samples_plotting       (all plot types, themes, DSP overlays)
                     │
                     ▼
          audio_samples_education      (educational HTML documents, WASM, playback)
```
`audio_samples_plotting` depends on `audio_samples` for `AudioSamples<T>`, `StandardSample`,
and the algorithm traits. It owns every plot type, theme, and DSP-overlay computation.
`audio_samples_education` depends on `audio_samples_plotting` for waveform/spectrogram/filter
rendering and on `audio_samples` for the explain-chain primitives. It will eventually depend on
`audio_samples_io` for inline base64 audio encoding and `audio_samples_streaming` for Web Audio
playback.
`audio_samples` drops the `plotting` and `educational` features entirely after migration. Users
who want visualization: `audio_samples_plotting`. Users who want the educational layer:
`audio_samples_education`.
---
## Phase 0 — Crate extraction
**Goal:** Establish the two new crates and migrate without breaking the existing public API.
### 0.1 — Create `audio_samples_plotting`
- New crate in the workspace: `crates/audio_samples_plotting/`
- Move `src/operations/plotting/` → `audio_samples_plotting/src/`
- Move the `AudioPlotting` trait from `audio_samples::operations::traits` into
`audio_samples_plotting`
- `audio_samples` stops shipping its own plotting code; existing users who depended on the
`plotting` feature migrate to the new crate
- Until the next major version, `audio_samples` keeps an optional `plotting` feature that only
re-exports `audio_samples_plotting` as a thin compatibility shim
### 0.2 — Create `audio_samples_education`
- New crate in the workspace: `crates/audio_samples_education/`
- Move `src/educational/` and `src/educational/processing.rs` → `audio_samples_education/src/`
- Move the `Explainable`, `ExplainMode`, `Explaining` re-exports
- `audio_samples` drops the `educational` feature
- Same thin shim strategy as above for compatibility
### 0.3 — Feature flags in the new crates
`audio_samples_plotting` feature flags:
- `transforms` — spectrogram types (mel, gammatone, CQT, chroma)
- `filters` — filter response plots (Bode, pole-zero, group delay)
- `static-plots` — static image export via `plotly_static`
- `html-view` — `.show()` opens in browser
- `wasm` — WASM-compatible rendering path (no filesystem, no browser-open)
`audio_samples_education` feature flags:
- `plotting` (required, on by default) — waveform/spectrogram panels per step
- `audio-playback` — inline base64 audio + Web Audio synced playhead
- `wasm-dsp` — compile DSP kernels to WASM for live parameter recomputation
- `offline` — inline CDN dependencies for offline use
---
## Phase 1 — Plotting primitives: scale and audio-native foundations
**Goal:** Make the plotting layer correct before making it pretty. Everything in this phase
is a prerequisite for phases 2–5.
### 1.1 — Peak/envelope decimation (highest priority of the entire roadmap)
A 3-minute stereo file at 48 kHz is ~8.6 M samples per channel (~17 M in total). Handing that to
Plotly freezes the browser or silently renders garbage. The existing `decimate_waveform` with
LTTB (Largest-Triangle-Three-Buckets) is insufficient: LTTB is designed for line charts and can
drop transient peaks. The correct algorithm for waveforms is **min/max binning**: divide the
signal into N bins (N = output pixel columns, ~2000 for a 1920-wide plot), and for each bin record
`(min, max, rms)`. Draw each bin as a vertical segment from min to max; overlay an RMS envelope.
At one bin per pixel column this is visually indistinguishable from the full-resolution draw and
preserves every transient peak.
- Replace `decimate_waveform` with `envelope_decimate(signal, n_bins) -> Vec<(f64, f64, f64)>`
returning `(min, max, rms)` per bin
- Render each bin as a filled area trace (min–max) with an RMS line on top
- Make this automatic and zoom-aware: hook into Plotly's `plotly_relayout` event (via embedded
JS), detect `xaxis.range[0]`/`xaxis.range[1]` changes, recompute bins for the visible window, and
push new trace data via `Plotly.restyle`. This makes zoom reveal more detail without ever
re-sending the full signal to the DOM.
- Apply to both standalone waveforms and the educational before/after panels
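The min/max binning step described above can be sketched as follows. This is a minimal illustration, assuming mono `f64` samples in a slice; the signature follows the `envelope_decimate` proposal, but the bin-edge arithmetic and the empty-input behaviour are assumptions, not the crate's actual implementation.

```rust
/// Reduces `signal` to at most `n_bins` summaries of `(min, max, rms)`.
/// Sketch only: assumes mono f64 samples; returns an empty Vec for empty input.
pub fn envelope_decimate(signal: &[f64], n_bins: usize) -> Vec<(f64, f64, f64)> {
    if signal.is_empty() || n_bins == 0 {
        return Vec::new();
    }
    // Never produce more bins than samples, so every bin is non-empty.
    let n_bins = n_bins.min(signal.len());
    let mut out = Vec::with_capacity(n_bins);
    for b in 0..n_bins {
        // Integer bin edges that tile the whole signal without gaps or overlap.
        let start = b * signal.len() / n_bins;
        let end = (b + 1) * signal.len() / n_bins;
        let chunk = &signal[start..end];

        let mut min = f64::INFINITY;
        let mut max = f64::NEG_INFINITY;
        let mut sum_sq = 0.0;
        for &s in chunk {
            min = min.min(s);
            max = max.max(s);
            sum_sq += s * s;
        }
        let rms = (sum_sq / chunk.len() as f64).sqrt();
        out.push((min, max, rms));
    }
    out
}
```

For the zoom-aware path, the same function would be called on the sub-slice covering the visible `xaxis.range`, so the bin count stays tied to the pixel width while the time span per bin shrinks.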
### 1.2 — Design system: `PlotTheme` and baked Plotly defaults
- `pub enum PlotTheme { Midnight, Slate, Amber, Light, Custom(PlotThemeConfig) }`
- `PlotThemeConfig`: bg, surface, border, accent, accent2, text, text_muted, channel_colors
- Default theme: `Midnight` — matching the educational document so `.show()` and the
educational doc share a visual language
- All `create_*_plot` functions accept a `PlotTheme` and apply it to the Plotly `Layout`
(paper_bgcolor, plot_bgcolor, font color/family, gridcolor, linecolor, tickcolor)
- Define a canonical per-channel color palette per theme:
`[#637aff, #38bdf8, #34d399, #f59e0b, #f87171, ...]` — the same values already in the
educational CSS, now the single source of truth
- Wire up `FontSizes`: apply `font_sizes.title` to layout title, `font_sizes.axis_labels` to
axis titles, `font_sizes.ticks` to tick fonts
- `Layout::auto_size(true)` on all plots by default
### 1.3 — Time axis formatting
Raw float seconds is wrong for audio. The time axis must:
- Format ticks as `mm:ss.mmm` for durations ≥ 1 s, `ss.mmm` for < 1 s
- Offer a toggle between elapsed time and sample index
- Adapt tick density to the visible range so ticks stay useful at both full-view and zoomed-in
resolutions
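A tick formatter for the rule above might look like the following sketch. The function name `format_time_tick` and the choice to switch format on the total duration are assumptions for illustration; working in integer milliseconds avoids ticks such as `00:59.1000` caused by float rounding.

```rust
/// Formats a tick located at `seconds` as `mm:ss.mmm` when the plotted duration
/// is at least one second, and as `ss.mmm` otherwise.
/// Hypothetical helper: the name and signature are not part of the existing API.
pub fn format_time_tick(seconds: f64, total_duration: f64) -> String {
    // Round once to whole milliseconds, then split with integer arithmetic.
    let total_ms = (seconds.max(0.0) * 1000.0).round() as u64;
    let ms = total_ms % 1000;
    let secs = (total_ms / 1000) % 60;
    let mins = total_ms / 60_000;
    if total_duration >= 1.0 {
        format!("{:02}:{:02}.{:03}", mins, secs, ms)
    } else {
        format!("{:02}.{:03}", secs, ms)
    }
}
```

The resulting strings would be supplied to Plotly as explicit tick text alongside the numeric tick values.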
### 1.4 — dBFS y-axis for waveform
`AudioStatistics` already has dBFS utilities. The waveform params should expose:
```rust
pub enum AmplitudeScale { Linear, DbFs }
```
When `DbFs`: convert samples via `20 * log10(|sample|)`, clamp at a configurable floor
(default -120 dBFS), label the axis accordingly. Useful for anything where dynamic range
matters more than waveform shape.
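The conversion is small enough to sketch directly. This assumes samples are normalised so that full scale is 1.0; the function name `to_dbfs` is illustrative and not necessarily the helper that `AudioStatistics` exposes.

```rust
/// Converts one sample to dBFS, clamped at `floor_db`.
/// Assumes full scale is 1.0; silence (0.0) maps to the floor instead of -inf.
pub fn to_dbfs(sample: f64, floor_db: f64) -> f64 {
    let magnitude = sample.abs();
    if magnitude == 0.0 {
        return floor_db;
    }
    (20.0 * magnitude.log10()).max(floor_db)
}
```

With the default floor, `to_dbfs(0.5, -120.0)` is roughly -6.02 dBFS and anything quieter than 1e-6 of full scale is drawn at the floor.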
### 1.5 — Spectrogram quality
The existing spectrogram is underspecified on the axes most important to its legibility:
**Frequency axis options.** A linear-frequency spectrogram is close to useless for music and
speech: all the perceptually relevant structure is crammed into the bottom eighth of the axis.
Add a `FreqAxisScale` option:
```rust
pub enum FreqAxisScale { Linear, Log, Mel, Bark, Erb }
```
This is separate from the existing `SpectrogramType` (which controls the magnitude encoding and
spectrogram algorithm). `FreqAxisScale` controls only how the axis is labelled and the bins are
warped for display.
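One way to implement the warp is a single Hz-to-axis-position function per variant, sketched below. The specific formulas are assumptions, since the roadmap does not fix them: HTK-style mel, Traunmüller's Bark approximation, and the Glasberg and Moore ERB-rate scale. The 1 Hz floor for `Log` is likewise an illustrative choice to keep DC finite.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FreqAxisScale { Linear, Log, Mel, Bark, Erb }

/// Lowest frequency used for the log axis (assumption: avoids log10(0) at DC).
const MIN_LOG_HZ: f64 = 1.0;

/// Maps a frequency in Hz to its position on the chosen display axis.
/// Hypothetical helper; the formula choices are assumptions, not settled API.
pub fn warp_hz(scale: FreqAxisScale, hz: f64) -> f64 {
    match scale {
        FreqAxisScale::Linear => hz,
        FreqAxisScale::Log => hz.max(MIN_LOG_HZ).log10(),
        // HTK-style mel scale.
        FreqAxisScale::Mel => 2595.0 * (1.0 + hz / 700.0).log10(),
        // Traunmüller (1990) approximation of the Bark scale.
        FreqAxisScale::Bark => 26.81 * hz / (1960.0 + hz) - 0.53,
        // ERB-rate scale after Glasberg and Moore (1990).
        FreqAxisScale::Erb => 21.4 * (1.0 + 0.00437 * hz).log10(),
    }
}
```

Because every variant is monotonic in `hz`, the same function can position both the bin rows and the tick labels, which keeps the two consistent.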
**Magnitude-to-color mapping.** Spectrograms must default to dB scale with:
- Explicit reference level (0 dBFS by convention)
- Configurable floor (default -80 dB)
- Configurable ceiling (top-dB clamp, default 0 dB)
- Perceptually uniform colormap (magma or viridis) as the default — not Plotly's default
colorscale
The dynamic range clamp is the single biggest determinant of whether a spectrogram is readable.
**Colorbar with dB units.** The colorbar title should show "dB (re 0 dBFS)" with the floor and
ceiling values at the ends.
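The reference/floor/ceiling mapping can be sketched as a single normalisation step that turns a linear magnitude into a position in `[0, 1]` on the colormap. The function name and the use of amplitude (`20 * log10`) instead of power are assumptions for illustration.

```rust
/// Maps a linear magnitude to a colormap position in [0, 1].
/// Sketch only: requires `floor_db < ceil_db`; non-positive input maps to the floor.
pub fn magnitude_to_color_position(
    magnitude: f64,
    reference: f64,
    floor_db: f64,
    ceil_db: f64,
) -> f64 {
    let db = if magnitude > 0.0 && reference > 0.0 {
        20.0 * (magnitude / reference).log10()
    } else {
        floor_db
    };
    // Clamp to the displayed dynamic range, then rescale linearly.
    let clamped = db.clamp(floor_db, ceil_db);
    (clamped - floor_db) / (ceil_db - floor_db)
}
```

With the defaults from the list above (reference 1.0, floor -80 dB, ceiling 0 dB), a bin at -40 dB lands at the midpoint of the colormap.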
### 1.6 — Filter analysis plots
Phase 4.3 (educational) plans to add explain texts for `butterworth_lowpass/highpass`,
`chebyshev_i`, and `apply_iir_filter`. The natural visualization for a filter is not a
before/after waveform but its frequency response. These plotting primitives must exist in
`audio_samples_plotting` before 4.3 can be meaningful.
Required plot types:
- **`FilterResponsePlot`** — magnitude Bode plot (dB vs Hz, log x-axis) and phase Bode plot
(degrees vs Hz), optionally combined in a two-row subplot
- **`PoleZeroPlot`** — unit circle with poles (×) and zeros (○) on the complex plane
- **`ImpulseResponsePlot`** — time-domain impulse/step response
- **`GroupDelayPlot`** — group delay (samples or ms) vs Hz
Compute from `IirFilter` coefficients directly: frequency-sample the transfer function
`H(e^jω)` at 1024 points across [0, Nyquist].
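Sampling the transfer function needs only the numerator and denominator coefficients. The sketch below assumes plain `b`/`a` coefficient slices in direct form, since the layout of `IirFilter` is not specified here; `ResponsePoint` and `frequency_response` are hypothetical names. It assumes the denominator is non-zero on the unit circle, which holds for any stable filter.

```rust
use std::f64::consts::PI;

/// One sample of the frequency response (hypothetical type).
pub struct ResponsePoint {
    pub hz: f64,
    pub magnitude: f64, // linear; convert with 20*log10 for the Bode magnitude plot
    pub phase_deg: f64,
}

/// Evaluates sum_k c_k * z^-k at z = e^{j*omega}; returns (real, imaginary).
fn eval_poly(coeffs: &[f64], omega: f64) -> (f64, f64) {
    coeffs.iter().enumerate().fold((0.0, 0.0), |(re, im), (k, &c)| {
        let angle = omega * k as f64;
        (re + c * angle.cos(), im - c * angle.sin())
    })
}

/// Samples H(e^{j*omega}) = B/A at `n_points` frequencies from 0 to Nyquist inclusive.
pub fn frequency_response(
    b: &[f64],
    a: &[f64],
    n_points: usize,
    sample_rate: f64,
) -> Vec<ResponsePoint> {
    let last = n_points.saturating_sub(1).max(1) as f64;
    (0..n_points)
        .map(|i| {
            let omega = PI * i as f64 / last;
            let (nr, ni) = eval_poly(b, omega);
            let (dr, di) = eval_poly(a, omega);
            // Complex division: multiply by the conjugate of the denominator.
            let d2 = dr * dr + di * di;
            let re = (nr * dr + ni * di) / d2;
            let im = (ni * dr - nr * di) / d2;
            ResponsePoint {
                hz: omega / (2.0 * PI) * sample_rate,
                magnitude: re.hypot(im),
                phase_deg: im.atan2(re).to_degrees(),
            }
        })
        .collect()
}
```

A two-tap average (`b = [0.5, 0.5]`, `a = [1.0]`) makes a convenient sanity check: unity gain at DC, about -3 dB at a quarter of the sample rate, and a null at Nyquist.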
### 1.7 — CQT and chromagram as distinct plot types
`chromagram` and `mel_spectrogram` are listed in Phase 4.3 as operations to add explain texts
for, but they are not generic spectrograms. A chromagram has 12 pitch classes on the y-axis, a
circular pitch structure, and different colormap requirements (the pitch class axis wraps). A
CQT has logarithmically spaced frequency bins with a fixed number of bins per octave.
- **`ChromagramPlot`** — heatmap with pitch-class labels (C, C#, D, …, B), y-axis circularity
implied by label order, viridis default colormap
- Add `CqtSpectrogram` as a `SpectrogramType` variant with the correct bin layout
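The two axis layouts can be made concrete with a short sketch. It assumes equal temperament with A4 = 440 Hz for the pitch-class labels and a geometric bin spacing for the CQT; both function names are hypothetical, and `bins_per_octave` is assumed to be greater than zero.

```rust
/// Pitch-class labels in the order proposed for the chromagram y-axis.
pub const PITCH_CLASSES: [&str; 12] =
    ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"];

/// Centre frequency of each CQT bin: f_k = f_min * 2^(k / bins_per_octave).
pub fn cqt_center_frequencies(f_min: f64, bins_per_octave: usize, n_bins: usize) -> Vec<f64> {
    (0..n_bins)
        .map(|k| f_min * 2f64.powf(k as f64 / bins_per_octave as f64))
        .collect()
}

/// Nearest pitch class for a frequency, assuming A4 = 440 Hz (MIDI note 69).
pub fn pitch_class_of(hz: f64) -> &'static str {
    let midi = (69.0 + 12.0 * (hz / 440.0).log2()).round() as i64;
    // rem_euclid keeps the index in 0..12 even for notes below MIDI 0.
    PITCH_CLASSES[midi.rem_euclid(12) as usize]
}
```

With 12 bins per octave, bin `k + 12` is exactly one octave above bin `k`, which is why folding CQT bins modulo 12 yields the chromagram rows.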
### 1.8 — Annotations and labeled regions primitive
A reusable primitive used by onset/beat/segment visualization across all plot types:
```rust
pub struct TimeRegion {
pub start: f64, // seconds
pub end: Option<f64>, // None = point marker
pub label: Option<String>,
pub color: Option<String>,
}
```
All plot types that have a time axis accept `Vec<TimeRegion>` via a shared `add_regions` method.
This replaces the ad-hoc `add_vline`/`add_shaded_region` per-type duplication and is the
primitive for onset/beat/segment annotations on spectrograms (currently missing).
---
## Phase 2 — API simplification
**Goal:** Common cases need zero config; power cases stay possible.
### 2.1 — Zero-arg shortcuts on `AudioSamples`
```rust
audio.plot() // -> AudioSampleResult<WaveformPlot>
audio.spectrogram() // -> AudioSampleResult<SpectrogramPlot>
audio.spectrum() // -> AudioSampleResult<MagnitudeSpectrumPlot>
```
Convenience wrappers on the existing trait methods with `PlotTheme::Midnight` and sensible
per-type defaults. No new behaviour, just less friction.
### 2.2 — Auto-computing overlay methods on `WaveformPlot`
Current pattern requires two steps: compute then add. Add high-level companions:
```rust
plot.with_rms_envelope(&audio) // sensible default window/hop
plot.with_rms_envelope_params(&audio, window, hop) // explicit params
plot.with_peak_envelope(&audio)
plot.with_zcr_overlay(&audio)
plot.with_onset_markers(&audio, &config) // compute + add
plot.with_beat_markers(&audio, &config)
```
The low-level `add_*` methods remain for users who pre-compute their own data.
### 2.3 — Replace `CompositePlot` iframe approach
The current implementation base64-encodes each sub-plot into an `<iframe>`, meaning sub-plots
are not interactive together (no shared hover, no linked zoom). Replace with Plotly's native
subplot system: each plot's traces are added to a shared `Plot` with grid/domain positioning.
`PlotComponent::requires_shared_x_axis()` already exists to guide this.
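Domain positioning for vertically stacked subplots reduces to splitting the `[0, 1]` paper range into rows. The helper below is a sketch with a hypothetical name; it returns `(bottom, top)` pairs, top row first, which is the shape a y-axis `domain` expects.

```rust
/// Splits the vertical paper range [0, 1] into `n_rows` stacked domains
/// separated by `gap`. Returns `(bottom, top)` per row, top row first.
/// Hypothetical helper; assumes `gap * (n_rows - 1) < 1`.
pub fn stacked_row_domains(n_rows: usize, gap: f64) -> Vec<(f64, f64)> {
    if n_rows == 0 {
        return Vec::new();
    }
    let n = n_rows as f64;
    let height = (1.0 - gap * (n - 1.0)) / n;
    (0..n_rows)
        .map(|row| {
            let top = 1.0 - row as f64 * (height + gap);
            (top - height, top)
        })
        .collect()
}
```

Rows for which `requires_shared_x_axis()` is true would reference the same x-axis, so zooming one row zooms them all.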
### 2.4 — Analysis dashboard
Built on the fixed `CompositePlot`:
```rust
audio.analysis_dashboard() // -> AudioSampleResult<CompositePlot>
```
Default layout: waveform (with min/max envelope from 1.1) on top, mel-dB spectrogram in the
middle, magnitude spectrum at the bottom. Shared x (time) axis for waveform and spectrogram.
### 2.5 — Clean up reserved/unused fields
- `line_style: Option<String>` — implement (Plotly: solid/dash/dot/dashdot) or remove
- `window_type` in `MagnitudeSpectrumParams` — wire up for FFT windowing or remove
- `frame_position` — implement frame-based spectrum or remove
- Static export `width`/`height`/`scale` — expose as config on `PlotUtils::save`
---
## Phase 3 — More audio-native plot types
**Goal:** Cover the standard analysis views a serious audio library is expected to have.
### 3.1 — A/B and difference views
- **Waveform overlay**: render two `AudioSamples` on the same axes with distinct colors, a
`difference` trace (A − B), and a legend. Entry point:
`WaveformPlot::compare(audio_a, audio_b, params)`
- **Difference spectrogram**: compute both spectrograms, subtract in dB, render with a
diverging colormap (blue=A louder, red=B louder, white=equal). Used in educational
before/after at the spectral level.
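The dB subtraction for the difference spectrogram can be sketched as below. It assumes both inputs are linear-magnitude spectrograms stored as `Vec<Vec<f64>>` with identical shapes, and it applies a floor before subtracting so that near-silent bins do not produce extreme differences. The names are illustrative.

```rust
/// Linear magnitude to dB with a floor (silence maps to the floor).
fn to_db(magnitude: f64, floor_db: f64) -> f64 {
    if magnitude > 0.0 {
        (20.0 * magnitude.log10()).max(floor_db)
    } else {
        floor_db
    }
}

/// Per-bin dB(A) - dB(B). Positive values mean A is louder.
/// Returns `None` when the two spectrograms differ in shape.
pub fn difference_db(a: &[Vec<f64>], b: &[Vec<f64>], floor_db: f64) -> Option<Vec<Vec<f64>>> {
    if a.len() != b.len() {
        return None;
    }
    a.iter()
        .zip(b)
        .map(|(row_a, row_b)| {
            if row_a.len() != row_b.len() {
                return None;
            }
            Some(
                row_a
                    .iter()
                    .zip(row_b)
                    .map(|(&x, &y)| to_db(x, floor_db) - to_db(y, floor_db))
                    .collect::<Vec<f64>>(),
            )
        })
        .collect()
}
```

For the diverging colormap, the colour range should be symmetric around zero so that equal bins map to the neutral midpoint.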
### 3.2 — Loudness / metering over time
- LUFS momentary (400 ms), short-term (3 s), integrated (full file) per EBU R128
- True-peak (4× oversampled) per ITU-R BS.1770
- Loudness range (LRA)
- Plot type: `LoudnessMeterPlot` — time series of momentary and short-term LUFS with
integrated as a horizontal reference line; dBTP overlay optional
### 3.3 — Stereo field visualization
- **`GoniometerPlot`** — L vs R Lissajous scatter, updated as a rolling window; reveals
stereo width, phase issues, and mono compatibility
- **Inter-channel correlation over time** — windowed Pearson r between L and R; +1 = mono,
0 = uncorrelated, -1 = out-of-phase
- **Mid/side decomposition view** — M = (L+R)/2, S = (L−R)/2, plotted as two waveforms
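The mid/side split and the windowed correlation are both a few lines of arithmetic. The sketch below assumes de-interleaved `f64` channel slices; the function names are hypothetical, and windows with zero variance in either channel are reported as `0.0`, which is an illustrative choice.

```rust
/// M = (L + R) / 2, S = (L - R) / 2, sample by sample.
pub fn mid_side(left: &[f64], right: &[f64]) -> (Vec<f64>, Vec<f64>) {
    left.iter()
        .zip(right)
        .map(|(&l, &r)| ((l + r) / 2.0, (l - r) / 2.0))
        .unzip()
}

/// Pearson correlation of two equal-length windows; `None` if undefined.
pub fn pearson(left: &[f64], right: &[f64]) -> Option<f64> {
    let n = left.len().min(right.len());
    if n == 0 {
        return None;
    }
    let mean_l = left[..n].iter().sum::<f64>() / n as f64;
    let mean_r = right[..n].iter().sum::<f64>() / n as f64;
    let (mut cov, mut var_l, mut var_r) = (0.0, 0.0, 0.0);
    for (&l, &r) in left[..n].iter().zip(&right[..n]) {
        let (dl, dr) = (l - mean_l, r - mean_r);
        cov += dl * dr;
        var_l += dl * dl;
        var_r += dr * dr;
    }
    let denom = (var_l * var_r).sqrt();
    if denom == 0.0 { None } else { Some(cov / denom) }
}

/// Correlation per window of `window` samples, advancing by `hop` samples.
pub fn windowed_correlation(left: &[f64], right: &[f64], window: usize, hop: usize) -> Vec<f64> {
    let n = left.len().min(right.len());
    if window == 0 || hop == 0 || n < window {
        return Vec::new();
    }
    (0..=n - window)
        .step_by(hop)
        .map(|s| pearson(&left[s..s + window], &right[s..s + window]).unwrap_or(0.0))
        .collect()
}
```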
### 3.4 — Spectrogram overlays
Overlaying MIR output on a spectrogram is where analysis output becomes legible:
- f0/pitch track (`add_pitch_track`)
- Onset and beat markers (via `TimeRegion` from 1.8)
- Formant tracks (F1, F2, F3)
- Harmonic series lines from a detected f0
### 3.5 — Phase spectrogram and group delay
- Phase spectrogram: angle of complex STFT bins, rendered with a cyclic colormap (HSV/twilight)
- Instantaneous frequency: time derivative of the unwrapped phase, more readable than raw phase
- Group delay: `−dφ/dω`, rendered as a separate heatmap or line overlay
---
## Phase 4 — Educational structural improvements
**Goal:** Make `audio_samples_education` robust, extensible, and honest about its dependencies.
### 4.1 — Replace raw-string parsing with `ExplanationData`
The current approach of encoding `[operation: Name]\n[formula: LaTeX]` into a plain string then
parsing with `strip_prefix` is the most fragile part of the system. Replace with:
```rust
pub struct ExplanationData {
pub operation: String,
pub formula_latex: Option<String>,
pub prose: String,
pub visual_type: VisualType,
pub code: Option<String>, // exact Rust call for this step
}
pub enum VisualType {
Waveform,
Spectrogram,
FrequencyResponse, // Bode plot — requires audio_samples_plotting filter plots
Chromagram, // requires ChromagramPlot
Spectrum, // magnitude spectrum overlay
Difference, // A/B difference view
None,
}
```
`VisualType` is now wide enough for all operations targeted in 4.3. Each `explain_*` function
returns `ExplanationData` directly. The renderer consumes structured data, not parsed text.
### 4.2 — Remove the unsafe pointer cast
`render_visual_block` casts `*const dyn ExplainDisplay as *const AudioSamplesVisual`. This is
unsound. Fix by making `Explanation::visual` a concrete `Option<AudioSamplesVisual>` rather than
`Box<dyn ExplainDisplay>`. Since `audio_samples_education` will own both the `explainable`
integration and `AudioSamplesVisual`, this is straightforward.
### 4.3 — Extend explain texts to more operations
With `ExplanationData` (4.1) and the plotting primitives (Phase 1) in place, covering new
operations is mechanical:
- `AudioIirFiltering`: `butterworth_lowpass/highpass/bandpass`, `chebyshev_i`,
`apply_iir_filter` — `VisualType::FrequencyResponse` (Bode) as the visual
- `AudioEditing`: `trim`, `pad`, `fade_in`, `fade_out`, `concatenate` — `VisualType::Waveform`
- `AudioTransforms`: `stft` — `VisualType::Spectrogram`; `mel_spectrogram` — `VisualType::Spectrogram`; `chromagram` — `VisualType::Chromagram`
### 4.4 — Statistics comparison block per step
Each card shows a before/after table below the formula:
```
          Before    After
Peak:     0.80      1.00
RMS:      0.32      0.40
Duration: 2.00 s    2.00 s
```
For spectral operations, also show spectral centroid and bandwidth.
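The scalar rows of the table come from two passes over the samples. The sketch below assumes mono `f64` samples; `StepStats`, `step_stats`, and `stats_row` are hypothetical names, and the column widths simply mirror the layout shown above.

```rust
/// Scalar statistics shown on each card (hypothetical type).
pub struct StepStats {
    pub peak: f64,
    pub rms: f64,
    pub duration_s: f64,
}

/// Computes peak, RMS, and duration for one step's output.
pub fn step_stats(samples: &[f64], sample_rate: f64) -> StepStats {
    let peak = samples.iter().fold(0.0_f64, |acc, &s| acc.max(s.abs()));
    let rms = if samples.is_empty() {
        0.0
    } else {
        (samples.iter().map(|s| s * s).sum::<f64>() / samples.len() as f64).sqrt()
    };
    StepStats { peak, rms, duration_s: samples.len() as f64 / sample_rate }
}

/// Formats one before/after row with fixed-width, left-aligned columns.
pub fn stats_row(label: &str, before: f64, after: f64) -> String {
    format!("{:<10}{:<10.2}{:.2}", label, before, after)
}
```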
### 4.5 — Show-the-code per step
The `code: Option<String>` field in `ExplanationData` (4.1) drives a per-card code block with
a copy button. Shows the exact `audio_samples` Rust call that produced the step. Makes the
document a reproduction recipe, not just an illustration.
### 4.6 — Spectral before/after overlay
The scalar stats table (4.4) cannot convey *where* in frequency an operation acted.
Add an optional magnitude-spectrum overlay card: before and after spectra on the same axes,
difference spectrum highlighted. Driven by `VisualType::Spectrum` and `VisualType::Difference`.
### 4.7 — Hover-definition glossary
Wrap DSP vocabulary (windowing, leakage, Nyquist, dBFS, LUFS, etc.) in `<span class="gloss">`
elements. A small JS tooltip shows a one-sentence definition on hover. The glossary is defined
once in the template; term highlighting is automatic via a lookup over known terms.
---
## Phase 5 — Educational UI and rich features
**Goal:** Turn the document from a static snapshot into a learning tool.
### 5.1 — Embedded audio playback with synced playhead (highest-value educational feature)
Encode the `AudioSamples` at each step as a base64 WAV `<audio>` element inlined in the HTML.
Wire a `timeupdate` listener to sweep a vertical playhead cursor across the waveform and
spectrogram panels in sync with playback. The before/after comparison cards become before/after
audio the user can A/B by ear. This is worth more than the sidebar, collapsible cards, and
linked brushing combined.
Depends on: `audio_samples_io` for WAV encoding (base64 inline data URI).
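To make the inline payload concrete, the sketch below builds a 16-bit PCM WAV in memory and wraps it in a base64 data URI using only the standard library. In the real crate the WAV bytes would come from `audio_samples_io`; everything here (function names, PCM16 output, interleaved `f64` input in [-1, 1]) is an assumption for illustration.

```rust
/// Standard base64 (RFC 4648 alphabet) with `=` padding.
pub fn base64_encode(bytes: &[u8]) -> String {
    const TABLE: &[u8; 64] =
        b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(TABLE[(n >> 18) as usize & 63] as char);
        out.push(TABLE[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { TABLE[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { TABLE[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Minimal 44-byte-header PCM16 WAV from interleaved samples in [-1, 1].
pub fn wav_bytes_pcm16(samples: &[f64], sample_rate: u32, channels: u16) -> Vec<u8> {
    let data_len = (samples.len() * 2) as u32;
    let block_align = channels * 2;
    let mut w = Vec::with_capacity(44 + data_len as usize);
    w.extend_from_slice(b"RIFF");
    w.extend_from_slice(&(36 + data_len).to_le_bytes());
    w.extend_from_slice(b"WAVE");
    w.extend_from_slice(b"fmt ");
    w.extend_from_slice(&16u32.to_le_bytes()); // fmt chunk size
    w.extend_from_slice(&1u16.to_le_bytes()); // format tag: PCM
    w.extend_from_slice(&channels.to_le_bytes());
    w.extend_from_slice(&sample_rate.to_le_bytes());
    w.extend_from_slice(&(sample_rate * block_align as u32).to_le_bytes()); // byte rate
    w.extend_from_slice(&block_align.to_le_bytes());
    w.extend_from_slice(&16u16.to_le_bytes()); // bits per sample
    w.extend_from_slice(b"data");
    w.extend_from_slice(&data_len.to_le_bytes());
    for &s in samples {
        let v = (s.clamp(-1.0, 1.0) * 32767.0).round() as i16;
        w.extend_from_slice(&v.to_le_bytes());
    }
    w
}

/// Value for the `src` attribute of an inline `<audio>` element.
pub fn wav_data_uri(samples: &[f64], sample_rate: u32, channels: u16) -> String {
    let wav = wav_bytes_pcm16(samples, sample_rate, channels);
    format!("data:audio/wav;base64,{}", base64_encode(&wav))
}
```

Base64 inflates the payload by about a third, so long files may warrant downsampling or a lower bit depth before inlining.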
### 5.2 — Step navigation sidebar
For chains of 5+ operations: a sticky left sidebar with step names and operation types. Clicking
jumps to the card. The timeline structure is already in place; the sidebar is ~30 lines of JS.
### 5.3 — Collapsible cards
Cards expand/collapse by clicking the header. Collapsed state shows step number, operation
name, and a 1-line stats delta (`peak +25%, RMS +25%`). Makes long documents scannable.
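The one-line delta is a relative change per statistic. A minimal sketch, with hypothetical function names and an `n/a` fallback (an assumption) for the case where the "before" value is zero:

```rust
/// Relative change in percent, or `None` when `before` is zero.
pub fn percent_delta(before: f64, after: f64) -> Option<f64> {
    if before == 0.0 {
        None
    } else {
        Some((after - before) / before * 100.0)
    }
}

/// Formats a single statistic, e.g. `peak +25%`.
fn fmt_delta(name: &str, before: f64, after: f64) -> String {
    match percent_delta(before, after) {
        // `{:+.0}` prints an explicit sign and rounds to a whole percent.
        Some(p) => format!("{} {:+.0}%", name, p),
        None => format!("{} n/a", name),
    }
}

/// Collapsed-card summary from `(before, after)` pairs for peak and RMS.
pub fn stats_delta_line(peak: (f64, f64), rms: (f64, f64)) -> String {
    format!(
        "{}, {}",
        fmt_delta("peak", peak.0, peak.1),
        fmt_delta("RMS", rms.0, rms.1)
    )
}
```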
### 5.4 — CSS/JS injection API
```rust
pub struct ExplainConfig {
pub title: String,
pub default_theme: PlotTheme,
pub custom_css: Option<String>,
pub custom_js: Option<String>,
}
```
`custom_css` overrides any CSS variable, color, or layout. Changing the accent colour or font
requires 2 lines. `ExplainConfig::new(title)` is the zero-friction entry point.
Replace `render_explanation_document(explanations, title)` with
`render_explanation_document(explanations, &ExplainConfig)`.
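A constructor plus chainable setters would keep the common case to one call. The sketch below is an assumption about how `ExplainConfig::new` could look: `PlotTheme` is reduced to a two-variant stand-in, the `with_*` methods are hypothetical, and the `--accent` CSS variable name is illustrative.

```rust
/// Reduced stand-in for the real `PlotTheme` (only two variants shown).
#[derive(Debug, Clone, PartialEq)]
pub enum PlotTheme {
    Midnight,
    Light,
}

pub struct ExplainConfig {
    pub title: String,
    pub default_theme: PlotTheme,
    pub custom_css: Option<String>,
    pub custom_js: Option<String>,
}

impl ExplainConfig {
    /// Zero-friction entry point: only the title is required.
    pub fn new(title: impl Into<String>) -> Self {
        Self {
            title: title.into(),
            default_theme: PlotTheme::Midnight,
            custom_css: None,
            custom_js: None,
        }
    }

    /// Hypothetical builder-style setter for the theme.
    pub fn with_theme(mut self, theme: PlotTheme) -> Self {
        self.default_theme = theme;
        self
    }

    /// Hypothetical builder-style setter for CSS overrides.
    pub fn with_custom_css(mut self, css: impl Into<String>) -> Self {
        self.custom_css = Some(css.into());
        self
    }

    /// Hypothetical builder-style setter for extra JS.
    pub fn with_custom_js(mut self, js: impl Into<String>) -> Self {
        self.custom_js = Some(js.into());
        self
    }
}
```

Changing the accent colour then reads as `ExplainConfig::new("Demo").with_custom_css(":root { --accent: #f59e0b; }")`.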
### 5.5 — Linked brush/selection
Dragging a region on the "before" waveform highlights the same time range on the "after"
waveform and seeks the audio playback to that region. ~50 lines of Plotly JS event handling.
### 5.6 — Precomputed parameter sweeps (cheap WASM alternative)
Render the operation at several parameter values and present as small multiples or a slider over
precomputed frames. Example: window-size sweep for STFT (128/512/2048/8192 samples) shown as
four side-by-side spectrograms. Most of the pedagogical value of live WASM at a fraction of the
complexity. This is the recommended stepping stone.
---
## Phase 6 — Ambitious / long-term
### 6.1 — WASM live recomputation
Compile DSP kernels to WASM (`wasm-bindgen`). Embed parameter controls (sliders, dropdowns) in
the educational card. Changing a slider reruns the operation in-browser and redraws the plots
without a Rust rebuild. This turns a static explainer into an interactive instrument.
Architecturally: DSP kernels in `audio_samples` are pure functions on slices — they compile to
WASM cleanly. `audio_samples_education` with the `wasm-dsp` feature links the WASM binary into
the HTML at document generation time.
### 6.2 — Offline self-contained output
`render_explanation_document` already produces CDN-linked HTML. Add an `ExplainConfig::offline`
flag that fetches and inlines KaTeX, Plotly, and fonts at document generation time. Works in
combination with 5.1 since audio is already inlined.
### 6.3 — Export formats
- PDF export via headless Chrome/Chromium (for reports and papers)
- PNG export of individual cards (for slides)
- Jupyter notebook export (`.ipynb`) so the explanation chain becomes a reproducible notebook
---
## Dependency order
```
Phase 0 (crate extraction)
 │
 ├──► Phase 1 (plotting primitives: scale, theme, spectrogram quality, filter plots, chroma)
 │      │
 │      └──► Phase 2 (API simplification, analysis dashboard)
 │             │
 │             └──► Phase 3 (more plot types: A/B, loudness, stereo field, phase)
 │
 └──► Phase 4 (educational structure; depends on Phase 1 for VisualType primitives)
        │
        └──► Phase 5 (educational UI; depends on Phase 4 for ExplainConfig/ExplanationData)
               │
               └──► Phase 6 (WASM, offline, export; depends on Phase 5)
```
Phases 1 and 4 both have a hard dependency on Phase 0 (crate extraction). Beyond that, Phase 4
work that does not rely on new plot types can proceed in parallel with Phase 1 once the crate
boundaries are established.
Phase 3 requires Phase 2. Phase 5 requires Phase 4. Phase 6 requires Phase 5.
The decimation work (1.1) and the filter/chromagram plot types (1.6, 1.7) must land before the
Phase 4 items that consume them: `VisualType::FrequencyResponse` and `VisualType::Chromagram` in
4.1 are hollow until those primitives exist. Do not start Phase 4 operations that use those
variants until the corresponding Phase 1 items are complete.