Crate vosk

Safe FFI bindings around the Vosk API Speech Recognition Toolkit.

Basic usage:
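
A minimal sketch of the usual streaming loop, written without the crate so that it stays self-contained: audio is handed to the recognizer one chunk at a time. `feed_in_chunks` and the closure are hypothetical helpers introduced here for illustration, and the 16-bit PCM sample type is an assumption. With the real crate, the closure body would forward each chunk to one of the Recognizer's accept_waveform methods; check the exact signatures against the Recognizer documentation.

```rust
/// Hypothetical helper (not part of the crate): split `samples` into chunks of
/// at most `chunk_len` samples and pass each chunk to `accept`.
/// Returns the number of chunks that were fed.
fn feed_in_chunks<F>(samples: &[i16], chunk_len: usize, mut accept: F) -> usize
where
    F: FnMut(&[i16]),
{
    assert!(chunk_len > 0, "chunk_len must be non-zero");
    let mut fed = 0;
    for chunk in samples.chunks(chunk_len) {
        accept(chunk);
        fed += 1;
    }
    fed
}

fn main() {
    // Placeholder audio: 250 samples of silence (assumed 16-bit mono PCM).
    let samples = vec![0i16; 250];
    let mut total = 0usize;
    let fed = feed_in_chunks(&samples, 100, |chunk| {
        // Stand-in for forwarding the chunk to a recognizer.
        total += chunk.len();
    });
    println!("fed {} chunks, {} samples", fed, total);
}
```

Passing the per-chunk step as a closure keeps the chunking logic independent of the recognizer, so the same loop can be exercised in tests without a model on disk.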

Structs

Alternative
An alternative transcript in a CompleteResultMultiple.
BatchModel (feature: batch)
The same as Model, but uses a CUDA-enabled NVIDIA GPU and dynamic batching to achieve higher throughput.
BatchRecognizer (feature: batch)
The main object that processes data using GPU inference. Takes audio as input and returns decoded information such as words, confidences, times, and other metadata.
CompleteResultMultiple
Recognition result if Recognizer::set_max_alternatives is passed a non-zero value.
CompleteResultSingle
Recognition result if Recognizer::set_max_alternatives is passed zero (the default).
Model
Model that stores all the data required for recognition.
PartialResult
Result returned by Recognizer::partial_result. The result may change after processing more data as decoding is not yet complete.
Recognizer
The main object that processes data. Takes audio as input and returns decoded information such as words, confidences, times, and other metadata.
SpeakerInfo
Data useful for speaker identification.
SpeakerModel
The same as Model but contains the data for speaker identification.
Word
A single word in a CompleteResultSingle and metadata about it.
WordInAlternative
A single word in an Alternative and metadata about it.

Enums

AcceptWaveformError
Possible errors that accept_waveform methods might return.
CompleteResult
Different results that can be returned from Recognizer::result and Recognizer::final_result.
DecodingState
State of the decoding after processing a chunk of data.
LogLevel
Log level for Kaldi messages.
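
The split between a single result and a result with several alternatives (see CompleteResult above) can be sketched with a two-variant enum. The types below are simplified stand-ins invented for this illustration: their names, fields, and the `best_text` helper are assumptions, not the crate's actual definitions.

```rust
// Simplified stand-ins: names and fields are assumptions for illustration,
// not the crate's actual definitions.
#[derive(Debug, Clone, PartialEq)]
struct SingleSketch {
    text: String,
}

#[derive(Debug, Clone, PartialEq)]
struct AlternativeSketch {
    text: String,
    confidence: f32,
}

#[derive(Debug, Clone, PartialEq)]
enum ResultSketch {
    // Shape produced when no alternatives were requested.
    Single(SingleSketch),
    // Shape produced when a non-zero number of alternatives was requested.
    Multiple(Vec<AlternativeSketch>),
}

/// Return the most likely transcript regardless of which shape was produced.
/// Yields `None` only for an empty list of alternatives.
fn best_text(result: &ResultSketch) -> Option<&str> {
    match result {
        ResultSketch::Single(single) => Some(single.text.as_str()),
        ResultSketch::Multiple(alternatives) => alternatives
            .iter()
            .max_by(|a, b| a.confidence.total_cmp(&b.confidence))
            .map(|best| best.text.as_str()),
    }
}

fn main() {
    let single = ResultSketch::Single(SingleSketch {
        text: "hello world".to_string(),
    });
    let multiple = ResultSketch::Multiple(vec![
        AlternativeSketch { text: "hello word".to_string(), confidence: 0.4 },
        AlternativeSketch { text: "hello world".to_string(), confidence: 0.9 },
    ]);
    println!("{:?}", best_text(&single));
    println!("{:?}", best_text(&multiple));
}
```

Matching on the enum forces callers to handle both shapes, which is why the choice made through set_max_alternatives determines which variant to expect.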

Functions

gpu_init (feature: batch)
Initializes CUDA, automatically selecting a device and allowing multithreading. Must be called once from the main thread.
gpu_thread_init (feature: batch)
Initializes the CUDA device in a multi-threaded environment. Must be called for each thread.
set_log_level
Sets the log level for Kaldi messages.