earshot 0.1.0

Ridiculously fast voice activity detection in pure #![no_std] Rust
Repository: pykeio/earshot
Owner: decahedron1

Earshot

Ridiculously fast, only slightly bad voice activity detection in pure Rust. Port of the famous WebRTC VAD.

Features

  • #![no_std], doesn't even require alloc
    • Internal buffers can get pretty big when stored on the stack, so the alloc feature is enabled by default, which allocates them on the heap instead.
  • Stupidly fast; uses only fixed-point arithmetic
    • Achieves a real-time factor (RTF: processing time divided by audio duration) of ~3e-4 with 30 ms 48 kHz frames and ~3e-5 with 30 ms 8 kHz frames.
    • By comparison, Silero VAD v4 with ort achieves an RTF of ~3e-3 with 60 ms 16 kHz frames.
  • Okay accuracy
    • Great at distinguishing between silence and noise, but not between noise and speech.
    • Earshot provides alternative models with slight accuracy gains compared to the base WebRTC model.
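
To put the RTF figures above in concrete terms, here is a minimal sketch of the arithmetic, assuming the usual definition of RTF as processing time divided by the duration of the audio processed. The helper names are hypothetical and are not part of Earshot's API. The per-frame times are derived from the approximate RTFs quoted above, not measured.

```rust
/// Number of samples in one frame of `frame_ms` milliseconds at `sample_rate_hz`.
/// (Hypothetical helper for illustration; not part of Earshot.)
fn frame_samples(sample_rate_hz: u32, frame_ms: u32) -> u32 {
    sample_rate_hz * frame_ms / 1000
}

/// Real-time factor: processing time divided by the duration of audio processed.
/// Values below 1.0 mean faster than real time.
fn real_time_factor(processing_secs: f64, audio_secs: f64) -> f64 {
    processing_secs / audio_secs
}

/// Processing time per frame, in microseconds, implied by a given RTF.
fn micros_per_frame(rtf: f64, frame_ms: u32) -> f64 {
    // frame_ms * 1000 is the frame duration in microseconds.
    rtf * f64::from(frame_ms) * 1000.0
}
```

Under these assumptions, a 30 ms frame holds 1440 samples at 48 kHz and 240 samples at 8 kHz. An RTF of ~3e-4 works out to roughly 9 µs of processing per 30 ms frame, and ~3e-5 to roughly 0.9 µs. The Silero figure of ~3e-3 on 60 ms frames corresponds to roughly 180 µs per frame.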