Crate spectrograms

§Spectrograms - FFT-Based Computations

High-performance FFT-based computations for audio and image processing.

§Overview

This library provides:

  • 1D FFTs: For time-series and audio signals
  • 2D FFTs: For images and spatial data
  • Spectrograms: Time-frequency representations (STFT, Mel, ERB, CQT)
  • Image operations: Convolution, filtering, edge detection
  • Two backends: RealFFT (pure Rust) or FFTW (fastest)
  • Plan-based API: Reusable plans for batch processing

§Domain Organization

The library is organized by application domain:

  • audio - Audio processing (spectrograms, MFCC, chroma, pitch analysis)
  • image - Image processing (convolution, filtering, frequency analysis)
  • fft - Core FFT operations (1D and 2D transforms)

All functionality is also exported at the crate root for convenience.

§Audio Processing

Compute various types of spectrograms:

  • Linear-frequency spectrograms
  • Mel-frequency spectrograms
  • ERB spectrograms
  • Logarithmic-frequency spectrograms
  • CQT (Constant-Q Transform)
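
To make the difference between a linear and a Mel frequency axis concrete, here is a minimal std-only sketch of the widely used HTK-style Hz/mel mapping. It is an illustration, not this crate's implementation: the function names are hypothetical, and the crate may use a different Mel variant (for example Slaney's).

```rust
/// Convert Hz to mels with the HTK-style formula.
/// Assumption: the crate may use another variant of the Mel scale.
pub fn hz_to_mel(hz: f64) -> f64 {
    2595.0 * (1.0 + hz / 700.0).log10()
}

/// Inverse of `hz_to_mel`.
pub fn mel_to_hz(mel: f64) -> f64 {
    700.0 * (10f64.powf(mel / 2595.0) - 1.0)
}

/// Centre frequencies (Hz) of `n_bands` bands spaced evenly on the mel axis
/// between `f_min` and `f_max`, excluding the two edge points.
pub fn mel_centres(n_bands: usize, f_min: f64, f_max: f64) -> Vec<f64> {
    let (m_lo, m_hi) = (hz_to_mel(f_min), hz_to_mel(f_max));
    let step = (m_hi - m_lo) / (n_bands as f64 + 1.0);
    (1..=n_bands)
        .map(|k| mel_to_hz(m_lo + step * k as f64))
        .collect()
}
```

Calling `mel_centres(80, 0.0, 8000.0)` yields 80 increasing centre frequencies whose spacing in Hz widens towards 8 kHz, which is why a Mel spectrogram resolves low frequencies more finely than high ones.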

With multiple amplitude scales:

  • Power (|X|²)
  • Magnitude (|X|)
  • Decibels (10·log₁₀(power))
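
The three scales are simple functions of a complex bin X = re + i·im. The sketch below uses only the standard library and hypothetical helper names. The `floor_db` clamp is an assumption, modelled on the `-80.0` passed to `LogParams::new` in the examples further down; the crate's exact floor handling is not shown on this page.

```rust
/// Power of a complex bin: |X|^2.
pub fn power(re: f64, im: f64) -> f64 {
    re * re + im * im
}

/// Magnitude of a complex bin: |X|.
pub fn magnitude(re: f64, im: f64) -> f64 {
    re.hypot(im)
}

/// Decibels from power: 10 * log10(power), clamped from below.
/// Assumption: `floor_db` is a lower bound applied after the logarithm.
pub fn power_to_db(power: f64, floor_db: f64) -> f64 {
    if power <= 0.0 {
        return floor_db; // log10 is undefined here, so fall back to the floor
    }
    (10.0 * power.log10()).max(floor_db)
}
```

For the bin 3 + 4i, the power is 25 and the magnitude is 5. A power of 100 maps to 20 dB, while a power of 1e-12 (nominally -120 dB) is clamped to a floor of -80 dB.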

§Image Processing

Frequency-domain operations for images:

  • 2D FFT and inverse FFT
  • Convolution via FFT (faster for large kernels)
  • Spatial filtering (low-pass, high-pass, band-pass)
  • Edge detection
  • Sharpening and blurring

§Features

  • Two FFT backends: RealFFT (default, pure Rust) or FFTW (fastest performance)
  • Plan-based computation: Reuse FFT plans for efficient batch processing
  • Comprehensive window functions: Hanning, Hamming, Blackman, Kaiser, Gaussian, etc.
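  • Type-safe API: Frequency scale and amplitude scale are part of the spectrogram type (e.g. MelPowerSpectrogram vs. MelDbSpectrogram), so mismatches are caught at compile time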
  • Zero-copy design: Efficient memory usage with minimal allocations
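
As an illustration of the window functions listed above, the following std-only sketch computes a Hann ("Hanning") window in its two common forms. The function names are hypothetical, and which form the crate's `hanning_window` returns is not stated on this page.

```rust
use std::f64::consts::PI;

/// Symmetric Hann window: both end points are zero (typical for filter design).
pub fn hann_symmetric(n: usize) -> Vec<f64> {
    match n {
        0 => Vec::new(),
        1 => vec![1.0],
        _ => (0..n)
            .map(|i| 0.5 - 0.5 * (2.0 * PI * i as f64 / (n as f64 - 1.0)).cos())
            .collect(),
    }
}

/// Periodic Hann window: one period of the cosine over `n` samples
/// (typical for STFT frames that overlap and add).
pub fn hann_periodic(n: usize) -> Vec<f64> {
    (0..n)
        .map(|i| 0.5 - 0.5 * (2.0 * PI * i as f64 / n as f64).cos())
        .collect()
}
```

`hann_symmetric(5)` gives approximately `[0, 0.5, 1, 0.5, 0]`, and `hann_periodic(4)` gives approximately `[0, 0.5, 1, 0.5]`.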

§Quick Start

§Audio: Compute a Mel Spectrogram

use spectrograms::*;
use std::f64::consts::PI;
use non_empty_slice::NonEmptyVec;

// Generate a sine wave at 440 Hz
let sample_rate = 16000.0;
let samples_vec: Vec<f64> = (0..16000)
    .map(|i| (2.0 * PI * 440.0 * i as f64 / sample_rate).sin())
    .collect();
let samples = NonEmptyVec::new(samples_vec).unwrap();

// Set up parameters
let stft = StftParams::new(nzu!(512), nzu!(256), WindowType::Hanning, true)?;
let params = SpectrogramParams::new(stft, sample_rate)?;
let mel = MelParams::new(nzu!(80), 0.0, 8000.0)?;

// Compute Mel spectrogram
let spec = MelPowerSpectrogram::compute(samples.as_ref(), &params, &mel, None)?;
println!("Computed {} bins x {} frames", spec.n_bins(), spec.n_frames());
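
To estimate the shape such a call produces, the sketch below computes frame and bin counts under two common STFT framing conventions. These are general conventions, not a statement about this crate: the helper names are hypothetical, and whether the fourth argument of `StftParams::new` toggles centring is an assumption.

```rust
/// Frames when no padding is applied: only full windows are used.
pub fn n_frames_uncentered(n_samples: usize, n_fft: usize, hop: usize) -> usize {
    if hop == 0 || n_samples < n_fft {
        return 0;
    }
    1 + (n_samples - n_fft) / hop
}

/// Frames when the signal is padded by n_fft / 2 on each side ("centred").
pub fn n_frames_centered(n_samples: usize, n_fft: usize, hop: usize) -> usize {
    n_frames_uncentered(n_samples + 2 * (n_fft / 2), n_fft, hop)
}

/// Non-redundant frequency bins of a real-input FFT of length `n_fft`.
pub fn n_freq_bins(n_fft: usize) -> usize {
    n_fft / 2 + 1
}
```

With 16000 samples, a 512-point FFT and a hop of 256, this gives 61 frames uncentred or 63 centred, and 257 linear frequency bins per frame. The Mel filterbank above then reduces those 257 bins to the 80 requested bands.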

§Image: Apply Gaussian Blur via FFT

use spectrograms::image_ops::*;
use spectrograms::nzu;
use ndarray::Array2;

// Create a 256x256 image
let image = Array2::<f64>::from_shape_fn((256, 256), |(i, j)| {
    ((i as f64 - 128.0).powi(2) + (j as f64 - 128.0).powi(2)).sqrt()
});

// Apply Gaussian blur
let kernel = gaussian_kernel_2d(nzu!(9), 2.0)?;
let blurred = convolve_fft(&image.view(), &kernel.view())?;
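
For intuition about what a 9x9 kernel with sigma 2.0 contains, here is a std-only sketch that builds a Gaussian kernel normalised to sum to 1. It is a hypothetical stand-in, not the crate's `gaussian_kernel_2d`, whose normalisation and return type may differ.

```rust
/// Build a `size` x `size` Gaussian kernel centred on the middle cell,
/// scaled so that all weights sum to 1 (blurring then preserves mean brightness).
pub fn gaussian_kernel(size: usize, sigma: f64) -> Vec<Vec<f64>> {
    assert!(size > 0 && sigma > 0.0, "size and sigma must be positive");
    let centre = (size as f64 - 1.0) / 2.0;
    let mut kernel = vec![vec![0.0_f64; size]; size];
    let mut sum = 0.0;
    for (i, row) in kernel.iter_mut().enumerate() {
        for (j, cell) in row.iter_mut().enumerate() {
            let (dy, dx) = (i as f64 - centre, j as f64 - centre);
            *cell = (-(dx * dx + dy * dy) / (2.0 * sigma * sigma)).exp();
            sum += *cell;
        }
    }
    // Normalise so the weights form a weighted average.
    for row in kernel.iter_mut() {
        for cell in row.iter_mut() {
            *cell /= sum;
        }
    }
    kernel
}
```

The largest weight sits at the centre cell `[4][4]` of a 9x9 kernel, and the weights fall off symmetrically with distance from it.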

§General: 2D FFT

use spectrograms::fft2d::*;
use ndarray::Array2;

let data = Array2::<f64>::zeros((128, 128));
let spectrum = fft2d(&data.view())?;
let power = power_spectrum_2d(&data.view())?;

§Feature Flags

The library requires exactly one FFT backend:

  • realfft (default): Pure-Rust FFT implementation, no system dependencies
  • fftw: Uses FFTW C library for fastest performance (requires system install)

§Examples

§Mel Spectrogram

use spectrograms::*;
use non_empty_slice::non_empty_vec;

let samples = non_empty_vec![0.0; nzu!(16000)];

let stft = StftParams::new(nzu!(512), nzu!(256), WindowType::Hanning, true)?;
let params = SpectrogramParams::new(stft, 16000.0)?;
let mel = MelParams::new(nzu!(80), 0.0, 8000.0)?;
let db = LogParams::new(-80.0)?;

let spec = MelDbSpectrogram::compute(samples.as_ref(), &params, &mel, Some(&db))?;

§Efficient Batch Processing

use spectrograms::*;
use non_empty_slice::non_empty_vec;

let signals = vec![non_empty_vec![0.0; nzu!(16000)], non_empty_vec![0.0; nzu!(16000)]];

let stft = StftParams::new(nzu!(512), nzu!(256), WindowType::Hanning, true)?;
let params = SpectrogramParams::new(stft, 16000.0)?;

// Create plan once, reuse for all signals
let planner = SpectrogramPlanner::new();
let mut plan = planner.linear_plan::<Power>(&params, None)?;

for signal in &signals {
    let spec = plan.compute(&signal)?;
    // Process spec...
}
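
The benefit of a plan comes from doing setup work once and reusing buffers across calls. The sketch below shows that pattern with a hypothetical `FramePlan` type using only the standard library. It is not the crate's `SpectrogramPlan`, and a sum of squares stands in for the per-frame FFT work.

```rust
use std::f64::consts::PI;

/// Hypothetical plan: the window is computed once and the scratch buffer
/// is reused for every frame of every signal.
pub struct FramePlan {
    window: Vec<f64>,
    scratch: Vec<f64>,
    hop: usize,
}

impl FramePlan {
    pub fn new(n_fft: usize, hop: usize) -> Self {
        assert!(n_fft > 0 && hop > 0, "n_fft and hop must be non-zero");
        // Periodic Hann window, precomputed at plan creation.
        let window: Vec<f64> = (0..n_fft)
            .map(|i| 0.5 - 0.5 * (2.0 * PI * i as f64 / n_fft as f64).cos())
            .collect();
        FramePlan { window, scratch: vec![0.0; n_fft], hop }
    }

    /// Energy of each windowed frame (placeholder for the per-frame transform).
    pub fn frame_energies(&mut self, signal: &[f64]) -> Vec<f64> {
        let n_fft = self.window.len();
        let mut out = Vec::new();
        let mut start = 0;
        while start + n_fft <= signal.len() {
            let frame = &signal[start..start + n_fft];
            for ((dst, &x), &w) in self.scratch.iter_mut().zip(frame).zip(&self.window) {
                *dst = x * w; // write into the reused buffer, no new allocation
            }
            out.push(self.scratch.iter().map(|v| v * v).sum::<f64>());
            start += self.hop;
        }
        out
    }
}
```

One `FramePlan` can be created up front and `frame_energies` called for each signal in a batch. Only the output vector is allocated per call.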

Re-exports§

pub use fft2d::*;
pub use image_ops::*;

Modules§

audio
Audio processing utilities (spectrograms, MFCC, chroma, etc.)
fft
Core FFT operations (1D and 2D)
fft2d
2D FFT operations for image and spatial data processing.
image
Image processing utilities (convolution, filtering, etc.)
image_ops
Image processing operations using 2D FFTs.

Macros§

nzu

Structs§

Axes
Spectrogram axes container.
ChromaParams
Chroma feature parameters.
Chromagram
Chromagram representation with 12 pitch classes.
CqtParams
CQT parameters
CqtResult
CQT result containing complex frequency bins and metadata.
ErbParams
ERB filterbank parameters
FftPlanner
A reusable FFT planner for efficient repeated FFT operations.
FrequencyAxis
InnerPlanner
A planner is used to create FFTs. It caches results internally, so when making more than one FFT it is advisable to reuse the same planner.
LogHzParams
Logarithmic frequency scale parameters
LogParams
MelParams
Mel filter bank parameters
Mfcc
MFCC features representation.
MfccParams
MFCC computation parameters.
RealFftInversePlan
Complex-to-Real Inverse FFT Plan
RealFftInversePlan2d
RealFftPlan
Real-to-Complex FFT Plan
RealFftPlan2d
2D Real-to-Complex FFT Plan
RealFftPlanner
RealFftPlanner
Spectrogram
Spectrogram structure holding the computed spectrogram data and metadata.
SpectrogramParams
Spectrogram computation parameters.
SpectrogramParamsBuilder
Builder for SpectrogramParams.
SpectrogramPlan
A spectrogram plan is the compiled, reusable execution object.
SpectrogramPlanner
A planner is an object that can build spectrogram plans.
StftParams
STFT parameters for spectrogram computation.
StftParamsBuilder
Builder for StftParams.
StftPlan
STFT plan containing reusable FFT plan and buffers.
StftResult
STFT (Short-Time Fourier Transform) result containing complex frequency bins.

Enums§

ChromaNorm
Normalization strategy for chroma features.
Cqt
Constant-Q Transform frequency scale
Decibels
Decibel amplitude scale
Erb
ERB/gammatone frequency scale
LinearHz
Linear frequency scale
LogHz
Logarithmic frequency scale
Magnitude
Magnitude amplitude scale
Mel
Mel frequency scale
MelNorm
Mel filterbank normalization strategy.
Power
Power amplitude scale
SpectrogramError
Represents errors that can occur in the spectrogram library.
WindowType
Window functions for spectral analysis and filtering.

Constants§

N_CHROMA
Number of pitch classes in Western music.

Traits§

AmpScaleSpec
Marker trait so we can specialise behaviour by AmpScale.
C2rPlan
A planned complex-to-real inverse FFT for a fixed transform length.
C2rPlanner
Planner that can construct inverse FFT plans.
ComplexToReal
An inverse FFT that takes a complex spectrum of length N/2+1 and transforms it to a real-valued signal of length N.
R2cPlan
A planned real-to-complex FFT for a fixed transform length.
R2cPlanner
Planner that can construct FFT plans.
RealToComplex
A forward FFT that takes a real-valued input signal of length N and transforms it to a complex spectrum of length N/2+1.
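
The N/2+1 output length follows from the conjugate symmetry of a real signal's spectrum: bins above N/2 mirror those below and carry no new information. The naive O(N²) sketch below makes this concrete with the standard library only. The names are hypothetical and it is not how either backend computes the transform.

```rust
use std::f64::consts::PI;

/// Number of non-redundant bins for a real input of length `n`.
pub fn r2c_len(n: usize) -> usize {
    n / 2 + 1
}

/// Naive DFT of a real signal, returning only bins 0..=N/2 as (re, im) pairs.
pub fn naive_rdft(x: &[f64]) -> Vec<(f64, f64)> {
    let n = x.len();
    if n == 0 {
        return Vec::new();
    }
    (0..r2c_len(n))
        .map(|k| {
            x.iter().enumerate().fold((0.0, 0.0), |(re, im), (t, &v)| {
                // X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/N)
                let angle = -2.0 * PI * (k * t) as f64 / n as f64;
                (re + v * angle.cos(), im + v * angle.sin())
            })
        })
        .collect()
}
```

For an 8-sample cosine with exactly one cycle, the result has 5 bins, and all of the energy lands in bin 1 with a real part of N/2 = 4.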

Functions§

blackman_window
chromagram
Compute chromagram directly from audio samples.
chromagram_from_spectrogram
Compute chromagram from a magnitude or power spectrogram.
cqt
Compute the Constant-Q Transform (CQT) of a signal.
fft
Compute the real-to-complex FFT of a real-valued signal.
gaussian_window
hamming_window
hanning_window
irfft
Compute the inverse real FFT (complex-to-real IFFT).
istft
Reconstruct a time-domain signal from its STFT using overlap-add.
kaiser_window
magnitude_spectrum
Compute the magnitude spectrum of a signal (|X|).
make_window
Generate window function samples.
mfcc
Compute MFCCs directly from audio samples.
mfcc_from_log_mel
Compute MFCCs from a log mel spectrogram.
power_spectrum
Compute the power spectrum of a signal (|X|²).
r2c_output_size
Output size for a real-to-complex FFT of length n.
rectangular_window
rfft
Compute the FFT of a real-valued signal.
stft
Compute the Short-Time Fourier Transform (STFT) of a signal.

Type Aliases§

CqtDbSpectrogram
CqtMagnitudeSpectrogram
CqtPowerSpectrogram
CqtSpectrogram
ErbDbSpectrogram
ErbMagnitudeSpectrogram
ErbPowerSpectrogram
ErbSpectrogram
Gammatone
GammatoneDbSpectrogram
GammatoneMagnitudeSpectrogram
GammatoneParams
GammatonePowerSpectrogram
GammatoneSpectrogram
LinearDbSpectrogram
LinearMagnitudeSpectrogram
LinearPowerSpectrogram
LinearSpectrogram
LogHzDbSpectrogram
LogHzMagnitudeSpectrogram
LogHzPowerSpectrogram
LogHzSpectrogram
LogMelSpectrogram
MelDbSpectrogram
MelMagnitudeSpectrogram
MelPowerSpectrogram
MelSpectrogram
SpectrogramResult