//! Safe Rust bindings for the public sherpa-onnx inference APIs.
//!
//! This crate wraps the sherpa-onnx C API with RAII-owned Rust types and
//! idiomatic configuration structs. The main feature families are:
//!
//! - offline ASR through [`OfflineRecognizer`]
//! - streaming ASR through [`OnlineRecognizer`]
//! - offline text-to-speech through [`OfflineTts`]
//! - voice activity detection through [`VoiceActivityDetector`]
//! - speaker embeddings and diarization
//! - online punctuation
//! - offline and streaming speech denoising
//! - audio tagging
//! - WAV I/O helpers through [`Wave`] and [`write()`]
//!
//! # Setup
//!
//! This crate now links statically by default. If `SHERPA_ONNX_LIB_DIR` is not
//! set, the build script downloads a matching prebuilt `-lib` archive from
//! [GitHub releases](https://github.com/k2-fsa/sherpa-onnx/releases) and uses
//! it automatically during the build.
//!
//! In other words, the default setup for most users is simply:
//!
//! ```toml
//! sherpa-onnx = "1.13.1"
//! ```
//!
//! If you want shared libraries instead, disable the default feature and enable
//! `shared`:
//!
//! ```toml
//! sherpa-onnx = { version = "1.13.1", default-features = false, features = ["shared"] }
//! ```
//!
//! For advanced use cases, set `SHERPA_ONNX_LIB_DIR` to a directory that already
//! contains sherpa-onnx libraries:
//!
//! ```bash
//! export SHERPA_ONNX_LIB_DIR=/path/to/sherpa-onnx/lib
//! ```
//!
//! That override works for both static and shared builds.
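//!
//! The lookup order described above can be sketched roughly as follows. This
//! is an illustration only, not the crate's actual build script:
//! `resolve_lib_dir`, the fallback directory name, and the choice to treat an
//! empty value as unset are all assumptions made for the example.
//!
//! ```rust
//! use std::path::PathBuf;
//!
//! /// Hypothetical helper: pick the directory that holds the native libraries.
//! /// `override_dir` stands in for the value of `SHERPA_ONNX_LIB_DIR`, and
//! /// `download_dir` for wherever a prebuilt archive was unpacked.
//! fn resolve_lib_dir(override_dir: Option<&str>, download_dir: &str) -> PathBuf {
//!     match override_dir {
//!         // A non-empty override wins, for static and shared builds alike.
//!         Some(dir) if !dir.is_empty() => PathBuf::from(dir),
//!         // Otherwise fall back to the downloaded archive (assumed location).
//!         _ => PathBuf::from(download_dir),
//!     }
//! }
//!
//! fn main() {
//!     let from_env = std::env::var("SHERPA_ONNX_LIB_DIR").ok();
//!     let dir = resolve_lib_dir(from_env.as_deref(), "target/sherpa-onnx-prebuilt");
//!     // Standard Cargo build-script directive that adds a native search path.
//!     println!("cargo:rustc-link-search=native={}", dir.display());
//! }
//! ```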
//!
//! Shared mode is also intended to work out of the box for normal users:
//!
//! - Linux and macOS: the build script adds both absolute and relative rpath
//!   entries automatically, and copies the required shared runtime libraries
//!   next to Cargo-generated binaries and examples.
//! - Windows: the build script copies the required DLLs next to the generated
//!   binaries automatically when using shared libraries.
//!
//! So most users do not need to manually set `LD_LIBRARY_PATH` or
//! `DYLD_LIBRARY_PATH`.
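//!
//! For readers unfamiliar with rpath handling, the sketch below shows one way
//! a build script could emit an absolute and a relative rpath entry. It is not
//! taken from this crate's build script: `rpath_directives` is a hypothetical
//! helper, and the use of `$ORIGIN` and `@loader_path` as the relative entries
//! is an assumption based on the usual Linux and macOS linker conventions.
//!
//! ```rust
//! /// Hypothetical helper: build the Cargo directives for both rpath entries.
//! fn rpath_directives(target_os: &str, lib_dir: &str) -> Vec<String> {
//!     // Relative entry: resolved against the location of the loading binary.
//!     let relative = match target_os {
//!         "linux" => "$ORIGIN",
//!         "macos" => "@loader_path",
//!         // Windows has no rpath; DLLs are found next to the executable.
//!         _ => return Vec::new(),
//!     };
//!     vec![
//!         // Absolute entry: points straight at the library directory.
//!         format!("cargo:rustc-link-arg=-Wl,-rpath,{lib_dir}"),
//!         format!("cargo:rustc-link-arg=-Wl,-rpath,{relative}"),
//!     ]
//! }
//!
//! fn main() {
//!     // Cargo sets CARGO_CFG_TARGET_OS for build scripts; fall back to the host OS.
//!     let os = std::env::var("CARGO_CFG_TARGET_OS")
//!         .unwrap_or_else(|_| std::env::consts::OS.to_string());
//!     for line in rpath_directives(&os, "/path/to/sherpa-onnx/lib") {
//!         println!("{line}");
//!     }
//! }
//! ```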
//!
//! Example `v1.13.1` archives used by the build script:
//!
//! Default static archives:
//!
//! - Linux x86_64:
//!   [sherpa-onnx-v1.13.1-linux-x64-static-lib.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.13.1/sherpa-onnx-v1.13.1-linux-x64-static-lib.tar.bz2)
//! - Linux aarch64:
//!   [sherpa-onnx-v1.13.1-linux-aarch64-static-lib.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.13.1/sherpa-onnx-v1.13.1-linux-aarch64-static-lib.tar.bz2)
//! - macOS x86_64:
//!   [sherpa-onnx-v1.13.1-osx-x64-static-lib.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.13.1/sherpa-onnx-v1.13.1-osx-x64-static-lib.tar.bz2)
//! - macOS arm64:
//!   [sherpa-onnx-v1.13.1-osx-arm64-static-lib.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.13.1/sherpa-onnx-v1.13.1-osx-arm64-static-lib.tar.bz2)
//! - Windows x64:
//!   [sherpa-onnx-v1.13.1-win-x64-static-MT-Release-lib.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.13.1/sherpa-onnx-v1.13.1-win-x64-static-MT-Release-lib.tar.bz2)
//!
//! Optional shared archives:
//!
//! - Linux x86_64:
//!   [sherpa-onnx-v1.13.1-linux-x64-shared-lib.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.13.1/sherpa-onnx-v1.13.1-linux-x64-shared-lib.tar.bz2)
//! - Linux aarch64:
//!   [sherpa-onnx-v1.13.1-linux-aarch64-shared-cpu-lib.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.13.1/sherpa-onnx-v1.13.1-linux-aarch64-shared-cpu-lib.tar.bz2)
//! - macOS x86_64:
//!   [sherpa-onnx-v1.13.1-osx-x64-shared-lib.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.13.1/sherpa-onnx-v1.13.1-osx-x64-shared-lib.tar.bz2)
//! - macOS arm64:
//!   [sherpa-onnx-v1.13.1-osx-arm64-shared-lib.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.13.1/sherpa-onnx-v1.13.1-osx-arm64-shared-lib.tar.bz2)
//! - Windows x64:
//!   [sherpa-onnx-v1.13.1-win-x64-shared-MT-Release-lib.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.13.1/sherpa-onnx-v1.13.1-win-x64-shared-MT-Release-lib.tar.bz2)
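//!
//! The naming scheme of the archives listed above can be summarized in a small
//! function. This is a sketch that only mirrors the ten names in the lists;
//! `archive_name` is a hypothetical helper, and the real build script may
//! select archives differently or support more targets.
//!
//! ```rust
//! /// Hypothetical helper: map a target and linkage to a listed archive name.
//! /// `os` and `arch` use the spelling of `std::env::consts::{OS, ARCH}`.
//! fn archive_name(version: &str, os: &str, arch: &str, shared: bool) -> Option<String> {
//!     let linkage = if shared { "shared" } else { "static" };
//!     let platform = match (os, arch, shared) {
//!         ("linux", "x86_64", _) => format!("linux-x64-{linkage}"),
//!         // The shared Linux aarch64 archive carries an extra `-cpu` suffix.
//!         ("linux", "aarch64", true) => "linux-aarch64-shared-cpu".to_string(),
//!         ("linux", "aarch64", false) => "linux-aarch64-static".to_string(),
//!         ("macos", "x86_64", _) => format!("osx-x64-{linkage}"),
//!         ("macos", "aarch64", _) => format!("osx-arm64-{linkage}"),
//!         ("windows", "x86_64", _) => format!("win-x64-{linkage}-MT-Release"),
//!         // Anything else is not in the lists above.
//!         _ => return None,
//!     };
//!     Some(format!("sherpa-onnx-v{version}-{platform}-lib.tar.bz2"))
//! }
//!
//! fn main() {
//!     let (os, arch) = (std::env::consts::OS, std::env::consts::ARCH);
//!     match archive_name("1.13.1", os, arch, false) {
//!         Some(name) => println!(
//!             "https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.13.1/{name}"
//!         ),
//!         None => println!("no archive listed for {os}/{arch}"),
//!     }
//! }
//! ```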
//!
//! # How the Rust API is organized
//!
//! Most APIs follow the same pattern:
//!
//! 1. Start with a `*Config` value and fill the fields for exactly one model
//!    family.
//! 2. Call `create()` to construct the runtime object.
//! 3. Create a stream if the API is stream-based.
//! 4. Feed audio or text, then fetch results with the provided accessor methods.
//!
//! All runtime wrappers automatically free their underlying C resources on drop.
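//!
//! The config, `create()`, and drop steps above can be illustrated with
//! stand-in types that need no model files. `DemoConfig`, `DemoRecognizer`,
//! and `LIVE_HANDLES` are hypothetical and not part of this crate; where this
//! sketch decrements a counter, a real wrapper would release its C handle.
//!
//! ```rust
//! use std::sync::atomic::{AtomicUsize, Ordering};
//!
//! // Counts live handles; stands in for resources owned by the C library.
//! static LIVE_HANDLES: AtomicUsize = AtomicUsize::new(0);
//!
//! #[derive(Default)]
//! struct DemoConfig {
//!     model: Option<String>,
//! }
//!
//! struct DemoRecognizer {
//!     model: String,
//! }
//!
//! impl DemoRecognizer {
//!     // Mirrors the `create()` step: reject an unfilled config, then acquire.
//!     fn create(config: &DemoConfig) -> Option<Self> {
//!         let model = config.model.clone()?;
//!         LIVE_HANDLES.fetch_add(1, Ordering::SeqCst);
//!         Some(Self { model })
//!     }
//! }
//!
//! impl Drop for DemoRecognizer {
//!     // Runs automatically when the value goes out of scope.
//!     fn drop(&mut self) {
//!         LIVE_HANDLES.fetch_sub(1, Ordering::SeqCst);
//!     }
//! }
//!
//! fn main() {
//!     let mut config = DemoConfig::default();
//!     config.model = Some("model.onnx".into());
//!     {
//!         let recognizer = DemoRecognizer::create(&config).expect("create recognizer");
//!         println!("loaded {}", recognizer.model);
//!         assert_eq!(LIVE_HANDLES.load(Ordering::SeqCst), 1);
//!     }
//!     // The handle was released when `recognizer` left the inner scope.
//!     assert_eq!(LIVE_HANDLES.load(Ordering::SeqCst), 0);
//! }
//! ```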
//!
//! # Examples
//!
//! The repository contains end-to-end Rust examples under
//! [`rust-api-examples/examples/`](https://github.com/k2-fsa/sherpa-onnx/tree/master/rust-api-examples/examples).
//! Good entry points are:
//!
//! - [`sense_voice.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/sense_voice.rs)
//! - [`nemo_parakeet.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/nemo_parakeet.rs)
//! - [`streaming_zipformer.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/streaming_zipformer.rs)
//! - [`pocket_tts.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/pocket_tts.rs)
//! - [`silero_vad_remove_silence.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/silero_vad_remove_silence.rs)
//! - [`online_punctuation.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/online_punctuation.rs)
//! - [`offline_punctuation.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/offline_punctuation.rs)
//! - [`keyword_spotter.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/keyword_spotter.rs)
//! - [`spoken_language_identification.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/spoken_language_identification.rs)
//! - [`offline_speaker_diarization.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/offline_speaker_diarization.rs)
//! - [`speaker_embedding_manager.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/speaker_embedding_manager.rs)
//!
//! # Offline recognition example
//!
//! ```no_run
//! use sherpa_onnx::{
//!     OfflineRecognizer, OfflineRecognizerConfig, OfflineSenseVoiceModelConfig, Wave,
//! };
//!
//! // Load the audio to transcribe.
//! let wave = Wave::read("./test.wav").expect("read wave");
//!
//! // Fill the fields for one model family (SenseVoice here) and its token table.
//! let mut config = OfflineRecognizerConfig::default();
//! config.model_config.sense_voice = OfflineSenseVoiceModelConfig {
//!     model: Some(
//!         "./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8/model.int8.onnx".into(),
//!     ),
//!     language: Some("auto".into()),
//!     use_itn: true,
//! };
//! config.model_config.tokens = Some(
//!     "./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8/tokens.txt".into(),
//! );
//!
//! // Create the recognizer, feed the whole file to a stream, then decode it.
//! let recognizer = OfflineRecognizer::create(&config).expect("create recognizer");
//! let stream = recognizer.create_stream();
//! stream.accept_waveform(wave.sample_rate(), wave.samples());
//! recognizer.decode(&stream);
//!
//! // Fetch the result from the stream and print the recognized text.
//! let result = stream.get_result().expect("result");
//! println!("{}", result.text);
//! ```
//!
//! # Streaming recognition example
//!
//! ```no_run
//! use sherpa_onnx::{OnlineRecognizer, OnlineRecognizerConfig, Wave};
//!
//! // Load the audio to transcribe.
//! let wave = Wave::read("./test.wav").expect("read wave");
//!
//! // Point the transducer fields at the encoder, decoder, and joiner models.
//! let mut config = OnlineRecognizerConfig::default();
//! config.model_config.transducer.encoder = Some(
//!     "./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx".into(),
//! );
//! config.model_config.transducer.decoder = Some(
//!     "./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx".into(),
//! );
//! config.model_config.transducer.joiner = Some(
//!     "./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx".into(),
//! );
//! config.model_config.tokens = Some(
//!     "./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt".into(),
//! );
//! config.enable_endpoint = true;
//! config.decoding_method = Some("greedy_search".into());
//!
//! // Feed the audio, mark the end of input, and decode while the stream is ready.
//! let recognizer = OnlineRecognizer::create(&config).expect("create recognizer");
//! let stream = recognizer.create_stream();
//! stream.accept_waveform(wave.sample_rate(), wave.samples());
//! stream.input_finished();
//! while recognizer.is_ready(&stream) {
//!     recognizer.decode(&stream);
//! }
//! ```
//!
//! # TTS example
//!
//! ```no_run
//! use sherpa_onnx::{OfflineTts, OfflineTtsConfig, OfflineTtsModelConfig, OfflineTtsPocketModelConfig};
//!
//! // Set only the Pocket TTS fields; every other field keeps its default value.
//! let config = OfflineTtsConfig {
//!     model: OfflineTtsModelConfig {
//!         pocket: OfflineTtsPocketModelConfig {
//!             lm_flow: Some("./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx".into()),
//!             lm_main: Some("./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx".into()),
//!             encoder: Some("./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx".into()),
//!             decoder: Some("./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx".into()),
//!             text_conditioner: Some(
//!                 "./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx".into(),
//!             ),
//!             vocab_json: Some("./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json".into()),
//!             token_scores_json: Some(
//!                 "./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json".into(),
//!             ),
//!             ..Default::default()
//!         },
//!         ..Default::default()
//!     },
//!     ..Default::default()
//! };
//!
//! // Create the engine and print its sample rate.
//! let tts = OfflineTts::create(&config).expect("create tts");
//! println!("{}", tts.sample_rate());
//! ```
mod audio_tagging;
mod display;
mod kws;
mod offline_asr;
mod offline_punctuation;
mod offline_speaker_diarization;
mod offline_speech_denoiser;
mod online_asr;
mod online_punctuation;
mod online_speech_denoiser;
mod resampler;
mod speaker_embedding;
mod spoken_language_identification;
mod tts;
mod utils;
mod vad;
mod wave;

pub use audio_tagging::*;
pub use display::*;
pub use kws::*;
pub use offline_asr::*;
pub use offline_punctuation::*;
pub use offline_speaker_diarization::*;
pub use offline_speech_denoiser::*;
pub use online_asr::*;
pub use online_punctuation::*;
pub use online_speech_denoiser::*;
pub use resampler::*;
pub use speaker_embedding::*;
pub use spoken_language_identification::*;
pub use tts::*;
pub use utils::*;
pub use vad::*;
pub use wave::*;