Autocorrelation is one of the most basic forms of pitch detection. Let $S=(s_0,s_1,\ldots,s_N)$ be a discrete signal. Then the autocorrelation function of $S$ at lag $t$ is $$ A_t(S) = \sum_{i=0}^{N-t} s_i s_{i+t}. $$ The autocorrelation function is largest when $t=0$. Subsequent peaks indicate lags at which the signal is particularly well aligned with itself. Thus, peaks of $A_t(S)$ for $t>0$ are good candidates for the fundamental period of $S$; dividing the sample rate by such a lag gives the corresponding candidate fundamental frequency.
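
The definition above can be evaluated directly. The following is a minimal sketch, not this crate's implementation: the function names `autocorrelation` and `estimate_period` are hypothetical, and the peak-picking rule (skip the lobe around $t=0$, then take the largest value in the first half of the lags) is an assumption chosen for simplicity.

```rust
/// Direct O(N^2) evaluation of A_t(S) = sum_{i=0}^{N-t} s_i * s_{i+t}
/// for every lag t, where `signal.len()` equals N + 1.
fn autocorrelation(signal: &[f64]) -> Vec<f64> {
    let n = signal.len();
    (0..n)
        .map(|t| (0..n - t).map(|i| signal[i] * signal[i + t]).sum())
        .collect()
}

/// Hypothetical helper: returns the lag (in samples) of the strongest
/// autocorrelation peak away from t = 0, or `None` if the autocorrelation
/// never becomes negative (for example, for a constant signal).
fn estimate_period(signal: &[f64]) -> Option<usize> {
    let acf = autocorrelation(signal);
    // Assumption: the lobe around t = 0 ends at the first negative value.
    let start = acf.iter().position(|&a| a < 0.0)?;
    // Assumption: only the first half of the lags is searched, since large
    // lags are summed over few samples and are less reliable.
    let end = acf.len() / 2;
    (start..end).max_by(|&a, &b| acf[a].total_cmp(&acf[b]))
}
```

Given a lag `t` returned by `estimate_period` and a sample rate `sample_rate` in Hz, the candidate fundamental frequency is `sample_rate / t as f64`.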

Unfortunately, autocorrelation-based pitch detection is prone to octave errors, since a signal may “line up” with itself better when shifted by a multiple of its fundamental period than when shifted by the period itself. Further, autocorrelation is a poor choice for situations where the fundamental frequency may not be the loudest frequency (which is common in telephone speech and for certain types of instruments).
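
To see why larger shifts compete with the true one, consider an idealized case (an assumption made only for illustration): $S$ repeats exactly every $T$ samples, so that $s_{i+T}=s_i$. Then for every integer $k\ge 1$ with $kT\le N$, $$ A_{kT}(S) = \sum_{i=0}^{N-kT} s_i s_{i+kT} = \sum_{i=0}^{N-kT} s_i^2. $$ Every shift by a multiple of $T$ therefore yields a sum of squares, and the peaks differ only in how many terms they contain. With noise or a slightly drifting pitch, the peak at $2T$ can exceed the peak at $T$, and a detector that picks the largest peak then reports a pitch one octave too low.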

Implementation

Rather than computing the autocorrelation function directly, which takes $O(N^2)$ operations, the implementation uses an FFT, which takes $O(N\log N)$ operations and provides a dramatic speed increase for large buffers.
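
The FFT route relies on the fact that the autocorrelation is the inverse transform of the power spectrum. The sketch below illustrates the idea with a small hand-written radix-2 FFT so that it needs no dependencies; it is an assumption-laden illustration, not this crate's code, and the names `fft` and `autocorrelation_fft` are hypothetical. A production implementation would normally use an optimized FFT library instead.

```rust
use std::f64::consts::PI;

/// Recursive radix-2 FFT on (re, im) pairs; `buf.len()` must be a power of two.
/// `inverse` flips the twiddle sign; scaling by 1/len is left to the caller.
fn fft(buf: &mut [(f64, f64)], inverse: bool) {
    let n = buf.len();
    if n <= 1 {
        return;
    }
    // Split into even- and odd-indexed samples and transform each half.
    let mut even: Vec<(f64, f64)> = buf.iter().step_by(2).copied().collect();
    let mut odd: Vec<(f64, f64)> = buf.iter().skip(1).step_by(2).copied().collect();
    fft(&mut even, inverse);
    fft(&mut odd, inverse);
    let sign = if inverse { 1.0 } else { -1.0 };
    for k in 0..n / 2 {
        let angle = sign * 2.0 * PI * k as f64 / n as f64;
        let (w_re, w_im) = (angle.cos(), angle.sin());
        // Multiply the odd-half term by the twiddle factor.
        let (o_re, o_im) = odd[k];
        let (t_re, t_im) = (w_re * o_re - w_im * o_im, w_re * o_im + w_im * o_re);
        let (e_re, e_im) = even[k];
        buf[k] = (e_re + t_re, e_im + t_im);
        buf[k + n / 2] = (e_re - t_re, e_im - t_im);
    }
}

/// Autocorrelation for lags 0..signal.len(), computed as the inverse FFT
/// of the power spectrum of the zero-padded signal.
fn autocorrelation_fft(signal: &[f64]) -> Vec<f64> {
    let n = signal.len();
    if n == 0 {
        return Vec::new();
    }
    // Zero-pad to at least 2n so the circular correlation produced by the
    // FFT matches the linear (non-wrapping) sum in the definition.
    let size = (2 * n).next_power_of_two();
    let mut buf: Vec<(f64, f64)> = signal.iter().map(|&s| (s, 0.0)).collect();
    buf.resize(size, (0.0, 0.0));
    fft(&mut buf, false);
    // Replace each bin by its squared magnitude (the power spectrum).
    for x in buf.iter_mut() {
        *x = (x.0 * x.0 + x.1 * x.1, 0.0);
    }
    fft(&mut buf, true);
    // Keep the first n lags and apply the 1/size inverse-transform scaling.
    buf.iter().take(n).map(|x| x.0 / size as f64).collect()
}
```

Up to floating-point rounding, the result agrees with the direct sum: for the signal `[1.0, 2.0, 3.0, 4.0]` both give the lags $30, 20, 11, 4$.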

Structs