parakeet-rs
Rust bindings for NVIDIA's Parakeet ASR model via ONNX Runtime.
GPU support (optional):
parakeet-rs = { version = "0.1", features = ["cuda"] }
Note: CoreML often doesn't work with this model, so stick with CPU or CUDA. That said, it's incredibly fast on my Mac M3 (16 GB) compared to Whisper on Metal :-)
Usage
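use parakeet_rs::Parakeet;
// "." assumes the model files sit in the working directory; "audio.wav" is a placeholder path.
let mut parakeet = Parakeet::from_pretrained(".")?;
let text = parakeet.transcribe("audio.wav")?;
println!("{}", text);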
GPU:
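// ExecutionConfig and ExecutionProvider are the assumed type names; check the crate docs.
use parakeet_rs::{ExecutionConfig, ExecutionProvider, Parakeet};
let config = ExecutionConfig::new()
    .with_execution_provider(ExecutionProvider::Cuda);
// "." assumes the model files sit in the working directory.
let mut parakeet = Parakeet::from_pretrained_with_config(".", config)?;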
Model Files
Put these in your working directory:
- model.onnx / model.onnx_data
- config.json
- preprocessor_config.json
- tokenizer.json / tokenizer_config.json
- special_tokens_map.json
Download the model files from HuggingFace.
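Before loading the model, it can help to confirm that the files are actually in place. The following is a minimal sketch using only the standard library; `MODEL_FILES` and `missing_model_files` are hypothetical helpers, not part of parakeet-rs, and the sketch assumes that every file named in the list above is required.

```rust
use std::path::Path;

/// File names taken from the "Model Files" list.
/// Assumption: every listed file must be present.
pub const MODEL_FILES: [&str; 7] = [
    "model.onnx",
    "model.onnx_data",
    "config.json",
    "preprocessor_config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
];

/// Return the names of expected model files that are not found in `dir`.
pub fn missing_model_files(dir: &Path) -> Vec<&'static str> {
    MODEL_FILES
        .iter()
        .copied()
        .filter(|name| !dir.join(name).is_file())
        .collect()
}
```

Calling `missing_model_files(Path::new("."))` and printing the result before loading the model gives a clearer error than a failed load.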
Audio Format
- WAV files, 16kHz, mono
- 16-bit PCM or 32-bit float
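The format requirements above can be checked up front by reading the WAV header. This is a rough sketch with no dependencies, not how parakeet-rs itself validates input; `WavFormat`, `parse_wav_format`, and `is_supported` are hypothetical names, and the parser only handles plain `fmt ` chunks (format tag 1 for integer PCM, 3 for IEEE float), not WAVE_FORMAT_EXTENSIBLE.

```rust
/// The fields of a WAV `fmt ` chunk that matter for this check.
#[derive(Debug, PartialEq)]
pub struct WavFormat {
    pub audio_format: u16, // 1 = integer PCM, 3 = IEEE float
    pub channels: u16,
    pub sample_rate: u32,
    pub bits_per_sample: u16,
}

/// Walk the RIFF chunks and return the contents of the `fmt ` chunk.
pub fn parse_wav_format(bytes: &[u8]) -> Result<WavFormat, String> {
    if bytes.len() < 12 || &bytes[0..4] != b"RIFF" || &bytes[8..12] != b"WAVE" {
        return Err("not a RIFF/WAVE file".to_string());
    }
    let mut pos = 12;
    while pos + 8 <= bytes.len() {
        let id = &bytes[pos..pos + 4];
        let size = u32::from_le_bytes([
            bytes[pos + 4],
            bytes[pos + 5],
            bytes[pos + 6],
            bytes[pos + 7],
        ]) as usize;
        let body = pos + 8;
        if id == b"fmt " {
            if size < 16 || body + 16 > bytes.len() {
                return Err("truncated fmt chunk".to_string());
            }
            let f = &bytes[body..body + 16];
            return Ok(WavFormat {
                audio_format: u16::from_le_bytes([f[0], f[1]]),
                channels: u16::from_le_bytes([f[2], f[3]]),
                sample_rate: u32::from_le_bytes([f[4], f[5], f[6], f[7]]),
                bits_per_sample: u16::from_le_bytes([f[14], f[15]]),
            });
        }
        // Chunks are padded to an even number of bytes.
        pos = body + size + (size & 1);
    }
    Err("no fmt chunk found".to_string())
}

/// True for 16 kHz mono audio stored as 16-bit PCM or 32-bit float.
pub fn is_supported(fmt: &WavFormat) -> bool {
    let sample_ok = matches!(
        (fmt.audio_format, fmt.bits_per_sample),
        (1, 16) | (3, 32)
    );
    fmt.sample_rate == 16_000 && fmt.channels == 1 && sample_ok
}
```

Reading the file with `std::fs::read` and passing the bytes to `parse_wav_format` is enough for a quick pre-flight check; files that fail `is_supported` need resampling or downmixing first.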
Examples
Basic:
With speaker diarization (requires pyannote models):
API
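// Load model ("." assumes the model files are in the working directory)
let mut parakeet = Parakeet::from_pretrained(".")?;
// Single file (placeholder path)
let text = parakeet.transcribe("audio.wav")?;
// Batch (placeholder paths)
let files = vec!["audio1.wav", "audio2.wav"];
let results = parakeet.transcribe_batch(&files)?;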
What it does
- Transcribes speech to text w/ punctuation & capitalization
Note: This uses the CTC-based Parakeet model (nvidia/parakeet-ctc-0.6b):
- English only
- No timestamps (CTC limitation); pair it with pyannote for diarization (see the example)
License
This Rust codebase: MIT OR Apache-2.0
FYI: The Parakeet ONNX models (downloaded separately from HuggingFace) are licensed under CC-BY-4.0 by NVIDIA. This library does not distribute the models.