§Spectrograms - FFT-Based Computations
High-performance FFT-based computations for audio and image processing.
§Overview
This library provides:
- 1D FFTs: For time-series and audio signals
- 2D FFTs: For images and spatial data
- Spectrograms: Time-frequency representations (STFT, Mel, ERB, CQT)
- Image operations: Convolution, filtering, edge detection
- Two backends: RealFFT (pure Rust) or FFTW (fastest)
- Plan-based API: Reusable plans for batch processing
§Domain Organization
The library is organized by application domain:
- audio: Audio processing (spectrograms, MFCC, chroma, pitch analysis)
- image: Image processing (convolution, filtering, frequency analysis)
- fft: Core FFT operations (1D and 2D transforms)
All functionality is also exported at the crate root for convenience.
§Audio Processing
Compute various types of spectrograms:
- Linear-frequency spectrograms
- Mel-frequency spectrograms
- ERB spectrograms
- Logarithmic-frequency spectrograms
- CQT (Constant-Q Transform)
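As an illustration of what a Mel-frequency axis means, the sketch below spaces band centres evenly on the mel scale. It assumes the HTK-style conversion formula; the crate may use a different convention. The helper names (`hz_to_mel`, `mel_to_hz`, `mel_centres`) are hypothetical and not part of this library.

```rust
/// Convert Hz to mels with the HTK-style formula.
/// Assumption: this crate may use another mel convention.
fn hz_to_mel(hz: f64) -> f64 {
    2595.0 * (1.0 + hz / 700.0).log10()
}

/// Inverse of `hz_to_mel`.
fn mel_to_hz(mel: f64) -> f64 {
    700.0 * (10f64.powf(mel / 2595.0) - 1.0)
}

/// Centre frequencies (Hz) of `n_mels` bands spaced evenly on the mel axis
/// between `f_min` and `f_max`.
fn mel_centres(n_mels: usize, f_min: f64, f_max: f64) -> Vec<f64> {
    let (m_lo, m_hi) = (hz_to_mel(f_min), hz_to_mel(f_max));
    // n_mels triangular filters need n_mels + 2 edge points;
    // the centres are the interior points.
    let step = (m_hi - m_lo) / (n_mels as f64 + 1.0);
    (1..=n_mels)
        .map(|k| mel_to_hz(m_lo + step * k as f64))
        .collect()
}

fn main() {
    let centres = mel_centres(80, 0.0, 8000.0);
    println!("first = {:.1} Hz, last = {:.1} Hz", centres[0], centres[79]);
}
```

The centres crowd together at low frequencies and spread out at high frequencies, which is the property that distinguishes a Mel spectrogram from a linear-frequency one.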
With multiple amplitude scales:
- Power (|X|²)
- Magnitude (|X|)
- Decibels (10·log₁₀(power))
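The three amplitude scales can be sketched for a single complex bin as follows. This is a standalone illustration, not the crate's implementation; the function names are hypothetical, and clamping to a floor in dB is an assumption about how a floor value might be applied.

```rust
/// Power |X|² of a complex bin given as (re, im).
fn power(re: f64, im: f64) -> f64 {
    re * re + im * im
}

/// Magnitude |X| of a complex bin.
fn magnitude(re: f64, im: f64) -> f64 {
    power(re, im).sqrt()
}

/// Decibels: 10·log10(power), clamped from below at `floor_db`.
/// Assumption: the clamp is illustrative; it avoids -inf for zero power.
fn power_to_db(power: f64, floor_db: f64) -> f64 {
    if power <= 0.0 {
        return floor_db;
    }
    (10.0 * power.log10()).max(floor_db)
}

fn main() {
    let (re, im) = (3.0, 4.0);
    let p = power(re, im);
    println!("power = {p}, magnitude = {}", magnitude(re, im));
    println!("dB = {:.2}", power_to_db(p, -80.0));
}
```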
§Image Processing
Frequency-domain operations for images:
- 2D FFT and inverse FFT
- Convolution via FFT (faster for large kernels)
- Spatial filtering (low-pass, high-pass, band-pass)
- Edge detection
- Sharpening and blurring
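Convolution via FFT rests on the convolution theorem: zero-pad both inputs, transform, multiply pointwise, and transform back. The sketch below shows this in 1D with a naive O(n²) DFT standing in for a real FFT, so it demonstrates the identity rather than the speed-up. All names are hypothetical and none of this is the crate's code. Both inputs are assumed to be non-empty.

```rust
use std::f64::consts::PI;

type Complex = (f64, f64);

/// Naive DFT (O(n²)), used here in place of a real FFT for clarity.
fn dft(input: &[Complex], inverse: bool) -> Vec<Complex> {
    let n = input.len();
    let sign = if inverse { 1.0 } else { -1.0 };
    let mut out = vec![(0.0, 0.0); n];
    for (k, o) in out.iter_mut().enumerate() {
        for (t, &(re, im)) in input.iter().enumerate() {
            let angle = sign * 2.0 * PI * ((k * t) % n) as f64 / n as f64;
            let (s, c) = angle.sin_cos();
            // (re + i·im) · (c + i·s)
            o.0 += re * c - im * s;
            o.1 += re * s + im * c;
        }
        if inverse {
            o.0 /= n as f64;
            o.1 /= n as f64;
        }
    }
    out
}

/// Direct (spatial-domain) linear convolution.
fn convolve_direct(signal: &[f64], kernel: &[f64]) -> Vec<f64> {
    let mut out = vec![0.0; signal.len() + kernel.len() - 1];
    for (i, &s) in signal.iter().enumerate() {
        for (j, &k) in kernel.iter().enumerate() {
            out[i + j] += s * k;
        }
    }
    out
}

/// Frequency-domain convolution: pad, transform, multiply, invert.
fn convolve_via_dft(signal: &[f64], kernel: &[f64]) -> Vec<f64> {
    let n = signal.len() + kernel.len() - 1;
    // Zero-pad to the full output length so circular convolution
    // matches linear convolution.
    let pad = |x: &[f64]| -> Vec<Complex> {
        (0..n).map(|i| (x.get(i).copied().unwrap_or(0.0), 0.0)).collect()
    };
    let a = dft(&pad(signal), false);
    let b = dft(&pad(kernel), false);
    let product: Vec<Complex> = a
        .iter()
        .zip(&b)
        .map(|(&(ar, ai), &(br, bi))| (ar * br - ai * bi, ar * bi + ai * br))
        .collect();
    dft(&product, true).into_iter().map(|(re, _)| re).collect()
}

fn main() {
    let signal = [1.0, 2.0, 3.0, 4.0];
    let kernel = [0.25, 0.5, 0.25];
    println!("direct:  {:?}", convolve_direct(&signal, &kernel));
    println!("via DFT: {:?}", convolve_via_dft(&signal, &kernel));
}
```

With a true FFT the transform cost drops to O(n log n), which is why the frequency-domain route pays off once kernels grow large. The 2D case applies the same steps along both image axes.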
§Features
- Two FFT backends: RealFFT (default, pure Rust) or FFTW (fastest performance)
- Plan-based computation: Reuse FFT plans for efficient batch processing
- Comprehensive window functions: Hanning, Hamming, Blackman, Kaiser, Gaussian, etc.
- Type-safe API: Compile-time guarantees for spectrogram types
- Zero-copy design: Efficient memory usage with minimal allocations
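The idea behind plan-based computation can be sketched without any FFT at all: do the expensive setup once, keep the buffers, and reuse them for every input. The hypothetical `WindowPlan` below precomputes Hann (Hanning) coefficients and a scratch buffer. It is not a type from this crate, and the symmetric window definition is an assumption.

```rust
use std::f64::consts::PI;

/// Hypothetical plan object: precomputed window coefficients plus a
/// reusable scratch buffer. Not part of this crate's API.
struct WindowPlan {
    coeffs: Vec<f64>,
    scratch: Vec<f64>,
}

impl WindowPlan {
    /// Build the plan once. Uses the symmetric Hann definition
    /// w[n] = 0.5 - 0.5·cos(2πn / (N - 1)) (an assumption).
    fn new(len: usize) -> Self {
        assert!(len >= 2, "window length must be at least 2");
        let denom = (len - 1) as f64;
        let coeffs = (0..len)
            .map(|n| 0.5 - 0.5 * (2.0 * PI * n as f64 / denom).cos())
            .collect();
        Self { coeffs, scratch: vec![0.0; len] }
    }

    /// Window one frame into the scratch buffer; no allocation per call.
    fn apply(&mut self, frame: &[f64]) -> &[f64] {
        assert_eq!(frame.len(), self.coeffs.len(), "frame length mismatch");
        for ((out, &x), &w) in self.scratch.iter_mut().zip(frame).zip(&self.coeffs) {
            *out = x * w;
        }
        &self.scratch
    }
}

fn main() {
    // Create the plan once, reuse it for every frame.
    let mut plan = WindowPlan::new(512);
    let frames = vec![vec![1.0; 512]; 4];
    for frame in &frames {
        let windowed = plan.apply(frame);
        println!("centre sample = {:.3}", windowed[256]);
    }
}
```

Returning a borrowed slice rather than a new `Vec` keeps the per-frame cost to one pass over the data, which is the same trade a reusable FFT plan makes.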
§Quick Start
§Audio: Compute a Mel Spectrogram
use spectrograms::*;
use std::f64::consts::PI;
use non_empty_slice::NonEmptyVec;
// Generate a sine wave at 440 Hz
let sample_rate = 16000.0;
let samples_vec: Vec<f64> = (0..16000)
    .map(|i| (2.0 * PI * 440.0 * i as f64 / sample_rate).sin())
    .collect();
let samples = NonEmptyVec::new(samples_vec).unwrap();
// Set up parameters
let stft = StftParams::new(nzu!(512), nzu!(256), WindowType::Hanning, true)?;
let params = SpectrogramParams::new(stft, sample_rate)?;
let mel = MelParams::new(nzu!(80), 0.0, 8000.0)?;
// Compute Mel spectrogram
let spec = MelPowerSpectrogram::compute(samples.as_ref(), &params, &mel, None)?;
println!("Computed {} bins x {} frames", spec.n_bins(), spec.n_frames());
§Image: Apply Gaussian Blur via FFT
use spectrograms::image_ops::*;
use spectrograms::nzu;
use ndarray::Array2;
// Create a 256x256 image
let image = Array2::<f64>::from_shape_fn((256, 256), |(i, j)| {
    ((i as f64 - 128.0).powi(2) + (j as f64 - 128.0).powi(2)).sqrt()
});
// Apply Gaussian blur
let kernel = gaussian_kernel_2d(nzu!(9), 2.0)?;
let blurred = convolve_fft(&image.view(), &kernel.view())?;
§General: 2D FFT
use spectrograms::fft2d::*;
use ndarray::Array2;
let data = Array2::<f64>::zeros((128, 128));
let spectrum = fft2d(&data.view())?;
let power = power_spectrum_2d(&data.view())?;
§Feature Flags
The library requires exactly one FFT backend:
- realfft (default): Pure-Rust FFT implementation, no system dependencies
- fftw: Uses the FFTW C library for fastest performance (requires system install)
§Examples
§Mel Spectrogram
use spectrograms::*;
use non_empty_slice::non_empty_vec;
let samples = non_empty_vec![0.0; nzu!(16000)];
let stft = StftParams::new(nzu!(512), nzu!(256), WindowType::Hanning, true)?;
let params = SpectrogramParams::new(stft, 16000.0)?;
let mel = MelParams::new(nzu!(80), 0.0, 8000.0)?;
let db = LogParams::new(-80.0)?;
let spec = MelDbSpectrogram::compute(samples.as_ref(), &params, &mel, Some(&db))?;
§Efficient Batch Processing
use spectrograms::*;
use non_empty_slice::non_empty_vec;
let signals = vec![non_empty_vec![0.0; nzu!(16000)], non_empty_vec![0.0; nzu!(16000)]];
let stft = StftParams::new(nzu!(512), nzu!(256), WindowType::Hanning, true)?;
let params = SpectrogramParams::new(stft, 16000.0)?;
// Create plan once, reuse for all signals
let planner = SpectrogramPlanner::new();
let mut plan = planner.linear_plan::<Power>(&params, None)?;
for signal in &signals {
    let spec = plan.compute(&signal)?;
    // Process spec...
}
Re-exports§
Modules§
- audio: Audio processing utilities (spectrograms, MFCC, chroma, etc.)
- fft: Core FFT operations (1D and 2D)
- fft2d: 2D FFT operations for image and spatial data processing.
- image: Image processing utilities (convolution, filtering, etc.)
- image_ops: Image processing operations using 2D FFTs.
Macros§
Structs§
- Axes: Spectrogram axes container.
- ChromaParams: Chroma feature parameters.
- Chromagram: Chromagram representation with 12 pitch classes.
- CqtParams: CQT parameters
- CqtResult: CQT result containing complex frequency bins and metadata.
- ErbParams: ERB filterbank parameters
- FftPlanner: A reusable FFT planner for efficient repeated FFT operations.
- FrequencyAxis
- InnerPlanner: A planner is used to create FFTs. It caches results internally, so when making more than one FFT it is advisable to reuse the same planner.
- LogHzParams: Logarithmic frequency scale parameters
- LogParams
- MelParams: Mel filter bank parameters
- Mfcc: MFCC features representation.
- MfccParams: MFCC computation parameters.
- RealFftInversePlan: Complex-to-Real Inverse FFT Plan
- RealFftInversePlan2d
- RealFftPlan: Real-to-Complex FFT Plan
- RealFftPlan2d: 2D Real-to-Complex FFT Plan
- RealFftPlanner: RealFftPlanner
- Spectrogram: Spectrogram structure holding the computed spectrogram data and metadata.
- SpectrogramParams: Spectrogram computation parameters.
- SpectrogramParamsBuilder: Builder for SpectrogramParams.
- SpectrogramPlan: A spectrogram plan is the compiled, reusable execution object.
- SpectrogramPlanner: A planner is an object that can build spectrogram plans.
- StftParams: STFT parameters for spectrogram computation.
- StftParamsBuilder: Builder for StftParams.
- StftPlan: STFT plan containing reusable FFT plan and buffers.
- StftResult: STFT (Short-Time Fourier Transform) result containing complex frequency bins.
Enums§
- ChromaNorm: Normalization strategy for chroma features.
- Cqt: Constant-Q Transform frequency scale
- Decibels: Decibel amplitude scale
- Erb: ERB/gammatone frequency scale
- LinearHz: Linear frequency scale
- LogHz: Logarithmic frequency scale
- Magnitude: Magnitude amplitude scale
- Mel: Mel frequency scale
- MelNorm: Mel filterbank normalization strategy.
- Power: Power amplitude scale
- SpectrogramError: Represents errors that can occur in the spectrogram library.
- WindowType: Window functions for spectral analysis and filtering.
Constants§
- N_CHROMA: Number of pitch classes in Western music.
Traits§
- AmpScaleSpec: Marker trait so we can specialise behaviour by AmpScale.
- C2rPlan: A planned complex-to-real inverse FFT for a fixed transform length.
- C2rPlanner: Planner that can construct inverse FFT plans.
- ComplexToReal: An inverse FFT that takes a complex spectrum of length N/2+1 and transforms it to a real-valued signal of length N.
- R2cPlan: A planned real-to-complex FFT for a fixed transform length.
- R2cPlanner: Planner that can construct FFT plans.
- RealToComplex: A forward FFT that takes a real-valued input signal of length N and transforms it to a complex spectrum of length N/2+1.
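The N/2+1 length quoted for the real-to-complex traits follows from Hermitian symmetry: for real input, bin N-k is the complex conjugate of bin k, so only bins 0 through N/2 carry independent information. A minimal sketch of the bookkeeping, using hypothetical helper names that are not this crate's functions:

```rust
/// Number of complex bins kept for a real input of length `n`.
/// Hypothetical helper that mirrors the "N/2+1" rule (integer division).
fn r2c_len(n: usize) -> usize {
    n / 2 + 1
}

/// Centre frequency in Hz of bin `k` for an `n`-point transform.
fn bin_frequency(k: usize, n: usize, sample_rate: f64) -> f64 {
    k as f64 * sample_rate / n as f64
}

fn main() {
    let (n, sample_rate) = (512, 16000.0);
    let bins = r2c_len(n);
    println!("{n}-point real FFT -> {bins} complex bins");
    println!(
        "last bin sits at {} Hz (Nyquist)",
        bin_frequency(bins - 1, n, sample_rate)
    );
}
```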
Functions§
- blackman_window
- chromagram: Compute chromagram directly from audio samples.
- chromagram_from_spectrogram: Compute chromagram from a magnitude or power spectrogram.
- cqt: Compute the Constant-Q Transform (CQT) of a signal.
- fft: Compute the real-to-complex FFT of a real-valued signal.
- gaussian_window
- hamming_window
- hanning_window
- irfft: Compute the inverse real FFT (complex-to-real IFFT).
- istft: Reconstruct a time-domain signal from its STFT using overlap-add.
- kaiser_window
- magnitude_spectrum: Compute the magnitude spectrum of a signal (|X|).
- make_window: Generate window function samples.
- mfcc: Compute MFCCs directly from audio samples.
- mfcc_from_log_mel: Compute MFCCs from a log mel spectrogram.
- power_spectrum: Compute the power spectrum of a signal (|X|²).
- r2c_output_size: Output size for a real-to-complex FFT of length n.
- rectangular_window
- rfft: Compute the real-valued FFT of a signal.
- stft: Compute the Short-Time Fourier Transform (STFT) of a signal.
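For the stft entry above, the number of output frames depends on the FFT size, the hop size, and whether frames are centred on the samples. The sketch below counts frames under one common convention, where centring pads n_fft/2 samples on each side. Whether this crate uses that convention is an assumption, and `frame_count` is a hypothetical helper.

```rust
/// Number of STFT frames for a signal of `n_samples`.
/// Assumption: centred framing pads n_fft/2 samples on each side, and
/// trailing samples that do not fill a whole frame are dropped.
fn frame_count(n_samples: usize, n_fft: usize, hop: usize, centre: bool) -> usize {
    assert!(n_fft > 0 && hop > 0, "n_fft and hop must be non-zero");
    let padded = if centre {
        n_samples + 2 * (n_fft / 2)
    } else {
        n_samples
    };
    if padded < n_fft {
        0
    } else {
        1 + (padded - n_fft) / hop
    }
}

fn main() {
    // Same sizes as the Quick Start example: 16000 samples, n_fft 512, hop 256.
    println!("centred:   {}", frame_count(16000, 512, 256, true));
    println!("uncentred: {}", frame_count(16000, 512, 256, false));
}
```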
Type Aliases§
- CqtDbSpectrogram
- CqtMagnitudeSpectrogram
- CqtPowerSpectrogram
- CqtSpectrogram
- ErbDbSpectrogram
- ErbMagnitudeSpectrogram
- ErbPowerSpectrogram
- ErbSpectrogram
- Gammatone
- GammatoneDbSpectrogram
- GammatoneMagnitudeSpectrogram
- GammatoneParams
- GammatonePowerSpectrogram
- GammatoneSpectrogram
- LinearDbSpectrogram
- LinearMagnitudeSpectrogram
- LinearPowerSpectrogram
- LinearSpectrogram
- LogHzDbSpectrogram
- LogHzMagnitudeSpectrogram
- LogHzPowerSpectrogram
- LogHzSpectrogram
- LogMelSpectrogram
- MelDbSpectrogram
- MelMagnitudeSpectrogram
- MelPowerSpectrogram
- MelSpectrogram
- SpectrogramResult