//! # wavekat-asr
//!
//! A streaming ASR trait surface that wraps one or more speech-to-text
//! backends behind a common Rust API. It follows the same pattern as
//! [`wavekat-vad`] and [`wavekat-turn`].
//!
//! [`wavekat-vad`]: https://crates.io/crates/wavekat-vad
//! [`wavekat-turn`]: https://crates.io/crates/wavekat-turn
//!
//! # Status
//!
//! This crate is pre-1.0. The trait surface may change as more
//! backends land, so pin to an exact patch version.
//!
//! The bundled backend is [`backends::sherpa_onnx`] (behind the
//! `sherpa-onnx` Cargo feature): a local streaming Zipformer that
//! auto-downloads its model from HuggingFace on first use.
pub use AsrError;
pub use AudioFrame;
/// Which side of a two-channel call the audio (or transcript) belongs to.
///
/// The daemon tees both RTP directions through one ASR instance, so every
/// event needs to carry the channel it came from.
/// One transcript event emitted by a [`StreamingAsr`] backend.
/// A streaming ASR session.
///
/// Implementations are expected to:
///
/// - Accept any [`AudioFrame`] sample rate; resample internally.
/// - Be `Send` so the daemon can move them between tasks.
/// - Emit [`TranscriptEvent`]s via the receiver returned at construction
///   time (see backend docs for the constructor shape).
///
/// The trait is intentionally tiny in `0.0.1`. Expect additions
/// (per-utterance reset, hot-swappable config, metric hooks) as real
/// backends land in later releases.