//! SenseVoice ONNX transcription engine.
//!
//! This module provides transcription using the SenseVoice/FunASR model via ONNX Runtime.
//! SenseVoice is a CTC-based speech recognition model with built-in language detection,
//! emotion recognition, and audio event detection.
//!
//! # Model Architecture
//!
//! SenseVoice uses a CTC encoder with special prefix tokens:
//! - Processes audio as log-Mel filterbank (FBANK) features → low frame rate (LFR) frame stacking → cepstral mean and variance normalization (CMVN)
//! - Emits language, emotion, and audio event labels alongside the transcribed text
//!
//! # Model Format
//!
//! Expects a directory containing:
//! - `model.onnx` - The SenseVoice encoder model
//! - `tokens.txt` - Token vocabulary (ID-to-symbol mapping)
//!
//! # Supported Languages
//!
//! Chinese (Mandarin), English, Japanese, Korean, Cantonese, or auto-detect.
//!
//! # Audio Requirements
//!
//! - Sample rate: 16 kHz
//! - Format: Mono, 16-bit PCM
//!
//! # Example
//!
//! ```rust,no_run
//! use std::path::PathBuf;
//! use transcribe_rs::{TranscriptionEngine, engines::sense_voice::{SenseVoiceEngine, SenseVoiceModelParams}};
//!
//! let mut engine = SenseVoiceEngine::new();
//! engine.load_model_with_params(
//!     &PathBuf::from("models/sense-voice"),
//!     SenseVoiceModelParams::default(),
//! )?;
//!
//! let result = engine.transcribe_file(&PathBuf::from("audio.wav"), None)?;
//! println!("Transcription: {}", result.text);
//! # Ok::<(), Box<dyn std::error::Error>>(())
//! ```
pub use ;
pub use SenseVoiceError;