
The McLeod pitch detection algorithm follows the paper A Smarter Way To Find Pitch. It is efficient and improves on basic autocorrelation.

The algorithm is based on finding peaks of the normalized square difference function. Let $S=(s_0,s_1,\ldots,s_N)$ be a discrete signal. The square difference function at lag $t$ is defined by $$ d'(t) = \sum_{i=0}^{N-t} (s_i-s_{i+t})^2. $$ This function is close to zero when the signal “lines up” with a copy of itself shifted by $t$ samples. However, close is a relative term: the value of $d'(t)$ depends on volume, which should not affect the pitch of the signal. For this reason, the normalized square difference function, $n'(t)$, is computed: $$ n'(t) = \frac{d'(t)}{\sum_{i=0}^{N-t} (s_i^2+s_{i+t}^2) }. $$ The algorithm then searches for the first local minimum of $n'(t)$ below a given threshold, called the clarity threshold.
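The two definitions above can be sketched directly in Rust. This is a naive, quadratic-time illustration of the formulas, not this module's implementation; the function names `square_difference` and `normalized_square_difference` are hypothetical, and returning `1.0` for a silent window is an assumption made here to avoid dividing by zero.

```rust
/// Square difference function d'(t) of `s` at lag `t` (hypothetical helper).
fn square_difference(s: &[f64], t: usize) -> f64 {
    // Pair s[i] with s[i + t] for every i where both indices are valid.
    s.iter()
        .zip(s.iter().skip(t))
        .map(|(a, b)| (a - b).powi(2))
        .sum()
}

/// Normalized square difference function n'(t) of `s` at lag `t`
/// (hypothetical helper).
fn normalized_square_difference(s: &[f64], t: usize) -> f64 {
    // Denominator: sum of s[i]^2 + s[i + t]^2 over the same index range.
    let energy: f64 = s
        .iter()
        .zip(s.iter().skip(t))
        .map(|(a, b)| a * a + b * b)
        .sum();
    if energy == 0.0 {
        // Assumption: a silent window carries no evidence of periodicity,
        // so report a value that is far from the "lined up" value of 0.
        return 1.0;
    }
    square_difference(s, t) / energy
}
```

For the period-4 signal `[0, 1, 0, -1, 0, 1, 0, -1]`, this sketch gives $n'(4)=0$ (the signal lines up with itself) and $n'(2)=2$ (the shifted copy is exactly inverted). Scaling the signal by 10 changes $d'(2)$ from 12 to 1200 but leaves $n'(2)$ unchanged, which is the point of the normalization.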

Implementation

As outlined in A Smarter Way To Find Pitch, an FFT is used to greatly speed up the computation of the normalized square difference function. The implementation also rearranges the formula algebraically and searches for the peaks of $1-n'(t)$, rather than the minima of $n'(t)$.
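One way to see why $1-n'(t)$ is convenient is to expand the square in $d'(t)$. The symbols $m'(t)$ and $r'(t)$ are notation introduced here for illustration; the rearrangement is a sketch of the kind of algebra involved, not a description of the exact code path. $$ d'(t) = \sum_{i=0}^{N-t} (s_i^2+s_{i+t}^2) - 2\sum_{i=0}^{N-t} s_i s_{i+t} = m'(t) - 2r'(t), $$ so that $$ 1-n'(t) = 1 - \frac{m'(t)-2r'(t)}{m'(t)} = \frac{2r'(t)}{m'(t)}. $$ Here $r'(t)=\sum_{i=0}^{N-t} s_i s_{i+t}$ is an autocorrelation, which is the kind of sum an FFT can evaluate for all lags at once, and $m'(t)=\sum_{i=0}^{N-t}(s_i^2+s_{i+t}^2)$ is the denominator of $n'(t)$. A minimum of $n'(t)$ near $0$ corresponds to a peak of $1-n'(t)$ near $1$.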

After a peak is found, quadratic interpolation is applied to further refine the estimate.
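A minimal sketch of quadratic (parabolic) interpolation is shown below, assuming the peak was found at an integer index of a sampled curve. It fits a parabola through the peak sample and its two neighbours and returns the vertex. The function name `refine_peak` and its fallback behaviour at the edges are hypothetical and are not taken from this module.

```rust
/// Refine a peak at index `i` of `values` by fitting a parabola through
/// the three samples around it. Returns (refined position, refined height).
/// Hypothetical helper; panics if `i` is out of bounds.
fn refine_peak(values: &[f64], i: usize) -> (f64, f64) {
    // Without both neighbours there is nothing to fit.
    if i == 0 || i + 1 >= values.len() {
        return (i as f64, values[i]);
    }
    let (a, b, c) = (values[i - 1], values[i], values[i + 1]);

    // Twice the curvature of the parabola through (-1, a), (0, b), (1, c).
    let denom = a - 2.0 * b + c;
    if denom == 0.0 {
        // The three points are collinear, so the parabola has no vertex.
        return (i as f64, b);
    }

    // Vertex offset relative to index i, and the parabola's value there.
    let offset = 0.5 * (a - c) / denom;
    let height = b - 0.25 * (a - c) * offset;
    (i as f64 + offset, height)
}
```

When the samples come from an exact parabola, the vertex is recovered exactly. For real data the result is only an estimate, but it allows the reported lag to fall between sample indices.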

Structs