pYIN [1] is a pitch (fundamental frequency) detection algorithm.
This crate provides a pitch estimate for each frame of the audio signal, together with the probability that the frame is voiced.
This crate also provides a C language FFI and an executable binary. For details of the executable binary, refer to the repo.
The implementation is based on librosa v0.9.1. For easy translation from Python + NumPy to Rust, the implementation is written on top of the ndarray crate.
Usage
use ndarray::prelude::*;
use pyin::{PYINExecutor, PadMode};
let fmin = 60f64; // minimum frequency in Hz
let fmax = 600f64; // maximum frequency in Hz
let sr = 24000; // sampling rate of audio data in Hz
let frame_length = 2048; // frame length in samples
let (win_length, hop_length, resolution) = (None, None, None); // None to use default values
let mut pyin_exec = PYINExecutor::new(fmin, fmax, sr, frame_length, win_length, hop_length, resolution);
// let wav: CowArray<f64, Ix1> = ...;
let fill_unvoiced = f64::NAN;
let center = true; // If true, the first sample in wav becomes the center of the first frame.
let pad_mode = PadMode::Constant(0.); // Zero-padding is applied on both sides of the signal (only if center is true).
// f0 (Array1<f64>) contains the pitch estimate in Hz. (NAN if unvoiced)
// voiced_flag (Array1<bool>) contains whether the frame is voiced or not.
// voiced_prob (Array1<f64>) contains the probability that the frame is voiced.
let (f0, voiced_flag, voiced_prob) = pyin_exec.pyin(wav, center, pad_mode);
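The returned arrays hold one value per frame, so a common next step is to attach a time to each frame and to summarize the pitch over voiced frames only. The following is a minimal sketch of that post-processing, not part of this crate: `frame_times` and `mean_voiced_f0` are hypothetical helper names, the inputs are plain slices (for example obtained with `to_vec()` on the `Array1` results), and the hop length must be supplied by the caller because the crate's default value is not stated here. The timing formula assumes `center` is true, so that frame `i` is centered on sample `i * hop_length`.

```rust
/// Time in seconds of the center of each frame.
/// Assumes `center = true`, i.e. frame `i` is centered on sample `i * hop_length`.
/// (Hypothetical helper, not part of the pyin crate.)
fn frame_times(n_frames: usize, hop_length: usize, sr: u32) -> Vec<f64> {
    (0..n_frames)
        .map(|i| (i * hop_length) as f64 / sr as f64)
        .collect()
}

/// Mean pitch in Hz over voiced frames only.
/// Unvoiced frames (which may hold NaN in `f0`) are skipped via `voiced_flag`.
/// Returns `None` when no frame is voiced.
/// (Hypothetical helper, not part of the pyin crate.)
fn mean_voiced_f0(f0: &[f64], voiced_flag: &[bool]) -> Option<f64> {
    let voiced: Vec<f64> = f0
        .iter()
        .zip(voiced_flag)
        .filter(|&(_, &is_voiced)| is_voiced)
        .map(|(&freq, _)| freq)
        .collect();
    if voiced.is_empty() {
        None
    } else {
        Some(voiced.iter().sum::<f64>() / voiced.len() as f64)
    }
}
```

For example, `mean_voiced_f0(&f0.to_vec(), &voiced_flag.to_vec())` would give the average pitch of the voiced part of the signal, and `frame_times(f0.len(), hop_length, sr)` the matching time axis, where `hop_length` is whatever hop length the executor was actually configured with.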
[1] Mauch, Matthias, and Simon Dixon. “pYIN: A fundamental frequency estimator using probabilistic threshold distributions.” 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2014.
Structs
PYINExecutor: pYIN algorithm executor.