audio_samples 1.0.2

A typed audio processing library for Rust that treats audio as a first-class, invariant-preserving object rather than an unstructured numeric buffer.

AudioSamples

Fast, simple, and expressive audio in Rust



Overview

Most audio libraries expose samples as raw numeric buffers. In Python, audio is typically represented as a NumPy array whose dtype is explicit, but whose meaning is not: sample rate, amplitude range, memory interleaving, and PCM versus floating-point semantics are tracked externally, if at all. In Rust, the situation is reversed but not resolved. Libraries provide fast and safe low-level primitives, yet users are still responsible for managing raw buffers, writing ad hoc conversion code, and manually preserving invariants across crates.

AudioSamples closes this gap with a strongly typed audio representation that encodes sample format, numeric domain, channel structure, and layout in the type system. All operations preserve or explicitly update these invariants, supporting both exploratory workflows and system-level use without requiring users to remember hidden conventions or reimplement common audio logic.
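To make the idea concrete, here is a minimal sketch of what invariant-carrying audio types look like in plain Rust. These are hypothetical illustrative types, not the crate's actual definitions: the point is that facts like "the sample rate is nonzero" live in the value itself instead of in external convention.

```rust
use std::num::NonZeroU32;

// Hypothetical sketch (not the crate's real types): the sample rate travels
// with the samples and cannot be zero by construction, so derived quantities
// such as duration are always well-defined.
#[derive(Debug, Clone, PartialEq)]
struct TypedAudio {
    samples: Vec<f32>,       // one channel, normalized to [-1.0, 1.0]
    sample_rate: NonZeroU32, // nonzero is guaranteed by the type
}

impl TypedAudio {
    fn duration_secs(&self) -> f64 {
        self.samples.len() as f64 / self.sample_rate.get() as f64
    }
}

fn main() {
    let audio = TypedAudio {
        samples: vec![0.0; 44_100],
        sample_rate: NonZeroU32::new(44_100).unwrap(),
    };
    println!("{}", audio.duration_secs()); // 1
}
```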


Installation

cargo add audio_samples

The default feature set (bare-bones) includes only the core types and traits. Add features for the operations you need — see Features.


Quick Start

Generating and mixing signals

use audio_samples::{sample_rate, AudioTypeConversion, cosine_wave, sine_wave};
use std::time::Duration;

fn main() {
    let sr = sample_rate!(44_100);
    let duration = Duration::from_secs_f64(1.0);

    // Generate a 440 Hz sine wave as i16 PCM, then convert to f32
    let float_sine = sine_wave::<i16>(440.0, duration, sr, 0.5).as_f32();

    // Mix with a 220 Hz cosine wave
    let cosine = cosine_wave::<f32>(220.0, duration, sr, 0.5);
    let mixed = float_sine + cosine;
}

Spectral transforms

Enable the transforms feature:

cargo add audio_samples --features transforms

use audio_samples::{AudioSamples, AudioTransforms, nzu, sample_rate, sine_wave};
use spectrograms::{ChromaParams, CqtParams, MfccParams, StftParams, WindowType};
use std::time::Duration;

fn main() -> audio_samples::AudioSampleResult<()> {
    let sr = sample_rate!(44100);
    let audio: AudioSamples<'static, f64> =
        sine_wave::<f64>(440.0, Duration::from_millis(200), sr, 0.8);

    let fft = audio.fft(nzu!(8192))?;

    let stft_params = StftParams::new(nzu!(1024), nzu!(256), WindowType::Hanning, true)?;
    let stft = audio.stft(&stft_params)?;
    let mfcc = audio.mfcc(&stft_params, nzu!(40), &MfccParams::speech_standard())?;
    let chroma = audio.chromagram(&stft_params, &ChromaParams::music_standard())?;
    let (_freqs, _psd) = audio.power_spectral_density(nzu!(1024), 0.5)?;
    let _cqt = audio.constant_q_transform(
        &CqtParams::new(nzu!(12), nzu!(7), 32.7)?,
        nzu!(256),
    )?;

    // Round-trip via inverse STFT
    let _reconstructed = AudioSamples::<f64>::istft(stft)?;
    Ok(())
}
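The StftParams above (window 1024, hop 256, Hanning) fix two things: the window shape and the number of frames the STFT produces. The sketch below uses the common symmetric Hann definition and the no-padding frame-count formula; the crate may use the periodic window variant or pad the signal, so treat this as illustration only.

```rust
use std::f64::consts::PI;

// Symmetric Hann/Hanning window: w[i] = 0.5 * (1 - cos(2*pi*i / (n-1))).
fn hann(n: usize) -> Vec<f64> {
    (0..n)
        .map(|i| 0.5 * (1.0 - (2.0 * PI * i as f64 / (n - 1) as f64).cos()))
        .collect()
}

// Frame count for window size w and hop h over len samples, without padding:
// floor((len - w) / h) + 1.
fn stft_frames(len: usize, w: usize, h: usize) -> usize {
    (len - w) / h + 1
}

fn main() {
    let w = hann(1024);
    assert_eq!(w[0], 0.0); // window tapers to zero at the edges
    // 200 ms at 44.1 kHz = 8820 samples; window 1024, hop 256:
    assert_eq!(stft_frames(8820, 1024, 256), 31);
}
```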

Creating AudioSamples

Creating an AudioSamples value is fallible because validity requires a consistent buffer length, channel count, and sample rate. The sample_rate! and non_empty_vec! macros enforce their respective invariants at construction:

use audio_samples::{AudioSamples, sample_rate};
use non_empty_slice::non_empty_vec;

let audio = AudioSamples::from_mono_vec(
    non_empty_vec![0.1f32, 0.2, 0.3],
    sample_rate!(44100),
);

For multi-channel audio:

use audio_samples::{AudioSamples, sample_rate};
use ndarray::array;

let stereo = AudioSamples::<f32>::new_multi_channel(
    array![[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    sample_rate!(44100),
).unwrap();

Features

The default feature is bare-bones — the core types and traits with no optional dependencies. Enable features as needed:

Core operations

| Feature | Description |
|---|---|
| statistics | Descriptive statistics: peak, RMS, mean, variance |
| processing | Normalization, scaling, clipping (requires statistics) |
| editing | Trim, pad, reverse, perturb, concatenate (requires statistics, random-generation) |
| channels | Interleave/deinterleave, mono↔stereo conversion |
| iir-filtering | IIR filter design and application |
| parametric-eq | Parametric EQ bands (requires iir-filtering) |
| dynamic-range | Compression, limiting, expansion |
| envelopes | Amplitude, RMS, and attack-decay envelopes |
| vad | Voice activity detection |
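As a concrete example of what the channels feature covers, interleaving converts planar per-channel buffers [L L L], [R R R] into frame order [L R L R L R]. The sketch below shows the operation on plain slices; the crate performs it on its own AudioSamples type.

```rust
// Interleave two planar channels into frame order, and back again.
fn interleave(left: &[f32], right: &[f32]) -> Vec<f32> {
    left.iter().zip(right).flat_map(|(&l, &r)| [l, r]).collect()
}

fn deinterleave(frames: &[f32]) -> (Vec<f32>, Vec<f32>) {
    let left = frames.iter().step_by(2).copied().collect();
    let right = frames.iter().skip(1).step_by(2).copied().collect();
    (left, right)
}

fn main() {
    let inter = interleave(&[0.1, 0.2], &[0.3, 0.4]);
    assert_eq!(inter, vec![0.1, 0.3, 0.2, 0.4]);
    assert_eq!(deinterleave(&inter), (vec![0.1, 0.2], vec![0.3, 0.4]));
}
```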

Spectral and analysis

| Feature | Description |
|---|---|
| transforms | FFT, STFT, MFCC, chromagram, CQT, PSD |
| pitch-analysis | YIN and autocorrelation pitch detection (requires transforms) |
| onset-detection | Onset detection (requires transforms, peak-picking, processing) |
| beat-tracking | Beat tracking |
| peak-picking | Peak picking on onset envelopes |
| decomposition | Audio decomposition (requires onset-detection) |
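To illustrate the idea behind the peak-picking feature: a minimal peak picker marks a sample as a peak when it exceeds a threshold and both neighbors. Production onset peak picking typically adds adaptive thresholds and a minimum inter-peak distance, so this is only a sketch of the core test.

```rust
// Minimal local-maximum peak picker over an onset envelope:
// index i is a peak if envelope[i] exceeds the threshold and both neighbors.
fn pick_peaks(envelope: &[f32], threshold: f32) -> Vec<usize> {
    (1..envelope.len().saturating_sub(1))
        .filter(|&i| {
            envelope[i] > threshold
                && envelope[i] > envelope[i - 1]
                && envelope[i] > envelope[i + 1]
        })
        .collect()
}

fn main() {
    let env = [0.0, 0.9, 0.1, 0.2, 0.8, 0.3, 0.0];
    assert_eq!(pick_peaks(&env, 0.5), vec![1, 4]);
}
```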

Utility

| Feature | Description |
|---|---|
| resampling | Sample-rate conversion via rubato |
| random-generation | Noise and random audio generation |
| fixed-size-audio | Fixed-size buffer support (no heap allocation) |
| plotting | Interactive HTML plots via plotly |
| static-plots | PNG/SVG export (requires plotting — see PLOTTING.md) |
| simd | SIMD acceleration (nightly only) |
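For intuition about what sample-rate conversion does, here is a naive linear-interpolation resampler. The resampling feature uses rubato's band-limited resamplers, which avoid the aliasing this sketch permits; the code below only illustrates the index arithmetic.

```rust
// Naive linear-interpolation resampling from src_rate to dst_rate.
// Each output index maps back to a fractional position in the input,
// and the value is interpolated between the two surrounding samples.
fn resample_linear(input: &[f32], src_rate: u32, dst_rate: u32) -> Vec<f32> {
    let out_len = (input.len() as u64 * dst_rate as u64 / src_rate as u64) as usize;
    (0..out_len)
        .map(|i| {
            let pos = i as f64 * src_rate as f64 / dst_rate as f64;
            let j = pos as usize;
            let frac = (pos - j as f64) as f32;
            let a = input[j];
            let b = input[(j + 1).min(input.len() - 1)];
            a + (b - a) * frac
        })
        .collect()
}

fn main() {
    // Upsample 2x: midpoints appear between the original samples.
    let out = resample_linear(&[0.0, 1.0, 0.0], 1, 2);
    assert_eq!(out, vec![0.0, 0.5, 1.0, 0.5, 0.0, 0.0]);
}
```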

Bundles

| Feature | Description |
|---|---|
| full | All features |
| full_no_plotting | All features except plotting |

Documentation

Full API documentation: https://docs.rs/audio_samples


Examples

The repository includes runnable examples in examples/. Each is self-contained and annotated with the required feature flags.

Additional demos:


Companion Crates


License

MIT License


Citing

If you use AudioSamples in research, please cite:

@inproceedings{geraghty2026audio,
  author    = {Geraghty, Jack and Golpayegani, Fatemeh and Hines, Andrew},
  title     = {Audio Made Simple: A Modern Framework for Audio Processing},
  booktitle = {ACM Multimedia Systems Conference 2026 (MMSys '26)},
  year      = {2026},
  month     = apr,
  publisher = {ACM},
  address   = {Hong Kong, Hong Kong},
  doi       = {10.1145/3793853.3799811},
  note      = {Accepted for publication}
}

Contributing

Contributions are welcome. Please submit a pull request and see CONTRIBUTING.md for guidance.