Crate pyin


pYIN [1] is a pitch (fundamental frequency) detection algorithm.

For each frame of the audio signal, this crate provides a pitch estimate and the probability that the frame is voiced.

This crate also provides a C language FFI and an executable binary. For details on the executable binary, refer to the repo.

The implementation is based on librosa v0.9.1. To make translation from Python + NumPy to Rust easier, it is written on top of the ndarray crate.

§Usage

use ndarray::prelude::*;
use pyin::{PYINExecutor, Framing, PadMode};

let fmin = 60f64;  // minimum frequency in Hz
let fmax = 600f64; // maximum frequency in Hz
let sr = 24000;    // sampling rate of audio data in Hz
let frame_length = 2048; // frame length in samples
let (win_length, hop_length, resolution) = (None, None, None);  // None to use default values
let mut pyin_exec = PYINExecutor::new(fmin, fmax, sr, frame_length, win_length, hop_length, resolution);

let wav: Vec<f64> = (0..24000).map(|i| (2. * std::f64::consts::PI * (i as f64) / 200.).sin()).collect();
let fill_unvoiced = f64::NAN;
let framing = Framing::Center(PadMode::Constant(0.));  // Frames are centered, so the signal is zero-padded on both sides.

// timestamp (Array1<f64>) contains the timestamp (in seconds) of each frame.
// f0 (Array1<f64>) contains the pitch estimate in Hz (NAN if unvoiced).
// voiced_flag (Array1<bool>) indicates whether each frame is voiced.
// voiced_prob (Array1<f64>) contains the probability that each frame is voiced.
let (timestamp, f0, voiced_flag, voiced_prob) = pyin_exec.pyin(&wav, fill_unvoiced, framing);
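The test signal above is a sine with a period of 200 samples, so at sr = 24000 its frequency is 24000 / 200 = 120 Hz, which lies between fmin and fmax. The sketch below shows one possible way to summarize the outputs once they are available as plain slices (for example via Array1::as_slice or to_vec). It assumes that unvoiced frames are filled with NAN, as in the usage above. The helper names mean_voiced_f0 and voiced_ratio are hypothetical and not part of this crate, and the sample values in main are made up for illustration.

```rust
/// Mean of the voiced pitch estimates in Hz.
/// Assumes unvoiced frames were filled with NaN; returns None if no frame is voiced.
fn mean_voiced_f0(f0: &[f64]) -> Option<f64> {
    let voiced: Vec<f64> = f0.iter().copied().filter(|x| !x.is_nan()).collect();
    if voiced.is_empty() {
        None
    } else {
        Some(voiced.iter().sum::<f64>() / voiced.len() as f64)
    }
}

/// Fraction of frames flagged as voiced (0.0 for an empty input).
fn voiced_ratio(voiced_flag: &[bool]) -> f64 {
    if voiced_flag.is_empty() {
        return 0.0;
    }
    let n_voiced = voiced_flag.iter().filter(|&&v| v).count();
    n_voiced as f64 / voiced_flag.len() as f64
}

fn main() {
    // Illustrative values only, not real output of the crate.
    let f0 = [f64::NAN, 119.5, 120.5, f64::NAN];
    let voiced_flag = [false, true, true, false];

    println!("mean voiced f0: {:?}", mean_voiced_f0(&f0));
    println!("voiced ratio: {}", voiced_ratio(&voiced_flag));
}
```

Filtering on NaN rather than on voiced_flag keeps the helper independent of the flag array. If a different fill_unvoiced value is used, filtering by voiced_flag would be the safer choice.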

[1] Mauch, Matthias, and Simon Dixon. “pYIN: A fundamental frequency estimator using probabilistic threshold distributions.” 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2014.

Structs§

PYINExecutor
pYIN algorithm executor.

Enums§

Framing
Represents where the first frame starts.
PadMode
Padding mode
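
To illustrate how centered framing with constant padding affects the number of frames, here is a minimal sketch. It assumes librosa-style centering, where frame_length / 2 samples are padded on each side of the signal, and a hop length of 512 samples (librosa's default of frame_length / 4 for a frame length of 2048). Whether this crate uses exactly these conventions is an assumption. The functions pad_constant and num_frames are hypothetical helpers, not part of this crate's API.

```rust
/// Pads `signal` with `pad` copies of `value` on both sides (constant padding).
fn pad_constant(signal: &[f64], pad: usize, value: f64) -> Vec<f64> {
    let mut out = Vec::with_capacity(signal.len() + 2 * pad);
    out.extend(std::iter::repeat(value).take(pad));
    out.extend_from_slice(signal);
    out.extend(std::iter::repeat(value).take(pad));
    out
}

/// Number of full frames of `frame_length` samples taken every `hop_length` samples.
fn num_frames(n_samples: usize, frame_length: usize, hop_length: usize) -> usize {
    if n_samples < frame_length {
        0
    } else {
        1 + (n_samples - frame_length) / hop_length
    }
}

fn main() {
    let signal = vec![1.0; 24000]; // one second of audio at 24000 Hz
    let (frame_length, hop_length) = (2048, 512); // hop length is an assumed default

    // Centered framing (assumed): pad frame_length / 2 zeros on each side.
    let padded = pad_constant(&signal, frame_length / 2, 0.0);

    println!("centered: {} frames", num_frames(padded.len(), frame_length, hop_length));
    println!("unpadded: {} frames", num_frames(signal.len(), frame_length, hop_length));
}
```

Under these assumptions the centered case yields 47 frames and the unpadded case 43, because padding lets the first frame be centered on sample 0 instead of starting there.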

Functions§

pyin
C lang FFI for pYIN