parakeet-rs
Rust bindings for NVIDIA's Parakeet ASR model via ONNX Runtime.
GPU support (optional):
parakeet-rs = { version = "0.1", features = ["cuda"] }
Note: CoreML often doesn't work with this model, so stick with CPU or CUDA. That said, it's incredibly fast on my Mac M3 (16 GB) compared to Whisper on Metal :-)
Usage
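use parakeet_rs::Parakeet;

// Load the model from the directory that holds the model files
// (here, the working directory).
let mut parakeet = Parakeet::from_pretrained(".")?;
let result = parakeet.transcribe("audio.wav")?;
println!("{}", result.text);

// Access token-level timestamps
for token in result.tokens {
    // use each token's text and timestamps here
}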
GPU:
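use parakeet_rs::{ExecutionConfig, ExecutionProvider, Parakeet};

// Select the CUDA execution provider (requires the "cuda" feature).
let config = ExecutionConfig::new()
    .with_execution_provider(ExecutionProvider::Cuda);
let mut parakeet = Parakeet::from_pretrained_with_config(".", config)?;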
Model Files
Put these in your working directory:
- model.onnx / model.onnx_data
- config.json
- preprocessor_config.json
- tokenizer.json / tokenizer_config.json
- special_tokens_map.json
Get the model files from HuggingFace.
Audio Format
- WAV files, 16kHz, mono
- 16-bit PCM or 32-bit float
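To check a file against these requirements before transcribing, something like the following sketch could help. It uses only the standard library and is not part of parakeet-rs: `WavFormat`, `parse_wav_format`, and `is_supported` are hypothetical names chosen for illustration. It reads the standard RIFF/WAVE `fmt ` chunk and assumes plain PCM (format tag 1) or IEEE float (format tag 3); files that use the extensible format tag would be rejected by this check.

```rust
/// Fields of a WAV file's `fmt ` chunk that matter for the check.
/// (Hypothetical helper type, not part of parakeet-rs.)
#[derive(Debug, PartialEq)]
pub struct WavFormat {
    pub audio_format: u16, // 1 = integer PCM, 3 = IEEE float
    pub channels: u16,
    pub sample_rate: u32,
    pub bits_per_sample: u16,
}

/// Parse the RIFF/WAVE header and return the `fmt ` chunk fields.
pub fn parse_wav_format(bytes: &[u8]) -> Result<WavFormat, String> {
    if bytes.len() < 12 || &bytes[0..4] != b"RIFF" || &bytes[8..12] != b"WAVE" {
        return Err("not a RIFF/WAVE file".to_string());
    }
    let mut pos = 12;
    // Walk the chunk list until the `fmt ` chunk is found.
    while pos + 8 <= bytes.len() {
        let id = &bytes[pos..pos + 4];
        let size = u32::from_le_bytes([
            bytes[pos + 4],
            bytes[pos + 5],
            bytes[pos + 6],
            bytes[pos + 7],
        ]) as usize;
        let body = pos + 8;
        if id == b"fmt " {
            if size < 16 || body + 16 > bytes.len() {
                return Err("truncated fmt chunk".to_string());
            }
            let u16_at = |o: usize| u16::from_le_bytes([bytes[body + o], bytes[body + o + 1]]);
            let sample_rate = u32::from_le_bytes([
                bytes[body + 4],
                bytes[body + 5],
                bytes[body + 6],
                bytes[body + 7],
            ]);
            return Ok(WavFormat {
                audio_format: u16_at(0),
                channels: u16_at(2),
                sample_rate,
                bits_per_sample: u16_at(14),
            });
        }
        // Chunks are padded to an even number of bytes.
        pos = body + size + (size & 1);
    }
    Err("no fmt chunk found".to_string())
}

/// True for 16 kHz mono audio stored as 16-bit PCM or 32-bit float.
pub fn is_supported(f: &WavFormat) -> bool {
    let sample_ok = matches!((f.audio_format, f.bits_per_sample), (1, 16) | (3, 32));
    f.sample_rate == 16_000 && f.channels == 1 && sample_ok
}
```

A caller could read the file with `std::fs::read`, pass the bytes to `parse_wav_format`, and resample or downmix with an external tool whenever `is_supported` returns false.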
Examples
Basic:
w/ speaker diarization (needs pyannote models):
API
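// Load model
let mut parakeet = Parakeet::from_pretrained(".")?;

// Transcribe (returns text + token timestamps)
let result = parakeet.transcribe("audio.wav")?;
println!("{}", result.text);

// Access timestamps
for token in &result.tokens {
    // use each token's text and timestamps here
}

// Batch
let results = parakeet.transcribe_batch(&["audio1.wav", "audio2.wav"])?;
for result in results {
    println!("{}", result.text);
}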
What it does
- Transcribes speech to text w/ punctuation & capitalization
Note: This uses the CTC-based Parakeet model (nvidia/parakeet-ctc-0.6b):
- English only
- Token-level timestamps supported (CTC frame-level output)
- For word-level timestamps & speaker diarization, use with pyannote (see example)
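To illustrate why a CTC model yields token-level timestamps, here is a minimal sketch of greedy CTC collapsing. It is a generic illustration, not the code parakeet-rs runs: `TokenSpan`, `ctc_collapse`, the blank id, and the frame duration `frame_secs` are all hypothetical, and the real frame duration depends on the model's feature extractor. The idea is that the model emits one best token id per audio frame. Consecutive repeats are merged, blank frames are dropped, and each surviving run keeps the time span of its frames.

```rust
/// A decoded token with the time span of the frames that produced it.
/// (Hypothetical type for illustration only.)
#[derive(Debug, PartialEq)]
pub struct TokenSpan {
    pub id: usize,
    pub start: f32, // seconds
    pub end: f32,   // seconds
}

/// Greedy CTC collapse: merge runs of the same id, then drop blanks.
/// `frame_ids` holds the best-scoring token id for each frame.
pub fn ctc_collapse(frame_ids: &[usize], blank: usize, frame_secs: f32) -> Vec<TokenSpan> {
    let mut spans = Vec::new();
    let mut i = 0;
    while i < frame_ids.len() {
        let id = frame_ids[i];
        // Extend the run while the same id repeats.
        let mut j = i + 1;
        while j < frame_ids.len() && frame_ids[j] == id {
            j += 1;
        }
        // Blank runs only separate tokens; they produce no output.
        if id != blank {
            spans.push(TokenSpan {
                id,
                start: i as f32 * frame_secs,
                end: j as f32 * frame_secs,
            });
        }
        i = j;
    }
    spans
}
```

For example, with blank id 0 and a made-up frame duration of 0.5 s, the frame sequence `[0, 0, 5, 5, 0, 5, 7, 7, 7, 0]` collapses to three tokens: id 5 over 1.0 to 2.0 s, id 5 again over 2.5 to 3.0 s (the blank between them keeps the repeat distinct), and id 7 over 3.0 to 4.5 s. These spans cover individual tokens, which is why word-level timing needs an extra grouping step.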
License
This Rust codebase: MIT OR Apache-2.0
FYI: The Parakeet ONNX models (downloaded separately from HuggingFace) are licensed under CC-BY-4.0 by NVIDIA. This library does not distribute the models.