1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
//! Voice-pipeline orchestration — the architecture-agnostic
//! VAD + STT + LLM + TTS + audio-out composition surface, ported from
//! [`mlx_audio.sts.voice_pipeline`][vp].
//!
//! Per the no per-model arch porting rule and the
//! mirror-reference-structure rule, mlxrs lifts the
//! *orchestration shape* mlx-audio's [`VoicePipeline`][vp-class]
//! exposes as a typed trait surface every caller composes:
//!
//! - [`mod@config`] — [`VoicePipelineConfig`] (the typed argument
//! bundle) + [`LatencyProfile`] (typed analogue of mlx-audio's
//! `latency_profile: str`).
//! - [`mod@chunker`] — [`AudioChunker`] trait + [`FixedSizeAudioChunker`]
//! default + [`PreRollBuffer`] (the sample-buffer primitives).
//! - [`mod@barge_in`] — [`BargeInDetector`] trait +
//! [`EnergyBargeInDetector`] (the user-overlap decision shape).
//! - [`mod@turn_taking`] — [`TurnTakingPolicy`] trait +
//! [`SilenceTurnTakingPolicy`] (the turn-end decision shape).
//! - [`mod@voice_pipeline`] — [`VoicePipeline`] trait (the high-level
//! `run(mic_input, audio_out)` shape).
//!
//! - [`mod@orchestrator`] — [`VoiceSession`] (the default
//! [`VoicePipeline`] implementor that composes every trait
//! surface together into one synchronous mic-iterator-driven
//! loop) plus the four per-step adapter traits
//! [`orchestrator::VadFrameAdapter`] /
//! [`orchestrator::SttTurnAdapter`] /
//! [`orchestrator::LlmResponderAdapter`] /
//! [`orchestrator::TtsStreamAdapter`] the orchestrator needs (a
//! streaming view over the whole-utterance shapes the existing
//! [`crate::audio::vad::VadModel`] /
//! [`crate::audio::stt::model::Model`] /
//! [`crate::lm::model::Model`] /
//! [`crate::audio::tts::model::TtsModel`] trait surfaces
//! expose).
//!
//! ## Scope cuts (explicit)
//!
//! mlx-audio's `voice_pipeline.py` carries 1500+ lines of
//! `asyncio` orchestration, real-time logging, and per-architecture
//! state (Voxtral streaming sessions, smart-turn endpoint
//! detectors). Per the project's match-official-binding-design
//! and no per-model arch porting rules, mlxrs ports only
//! the **shape** of that orchestration as composable traits + a
//! synchronous default `VoiceSession`. Out-of-scope:
//!
//! - **`sounddevice` mic capture** — mic input is an iterator the
//! caller supplies (a `cpal::Stream` consumer, a file reader, a
//! unit-test fixture). The orchestrator is sink-agnostic on the
//! input side too.
//! - **`asyncio` worker / `MLXWorkScheduler`** — the orchestrator
//! runs synchronously; a caller who needs the async fan-out wires
//! their own runtime around the per-frame `step` call.
//! - **The per-model `VoxtralRealtimeTranscriber` /
//! `SileroSpeechGate` / `SmartTurnEndpointDetector` /
//! `LocalLLMResponseEngine` / `PocketTTSResponder` /
//! `AudioOutputStream` classes** — these wrap concrete
//! architectures (Voxtral STT, Silero VAD, etc.). mlxrs's no-arch
//! rule pushes them to user code; the orchestrator instead consumes
//! the architecture-agnostic [`crate::audio::stt::model::Model`] /
//! [`crate::audio::vad::VadModel`] /
//! [`crate::audio::tts::model::TtsModel`] / [`crate::lm::model::Model`]
//! / [`crate::audio::playback::AudioOutputStream`] trait surfaces.
//! - **The full barge-in / echo-correlation state machine** — mlxrs
//! ports the **decision shape** ([`BargeInDetector`]) but not the
//! full echo-cancellation pipeline mlx-audio's
//! `_handle_speech_started` / `_maybe_confirm_barge_candidate`
//! build. The orchestrator wires the trait's `detect` call; the
//! confirmation / suppression dance lives in user code that wraps
//! the orchestrator's per-step events.
//!
//! [vp]: https://github.com/Blaizzy/mlx-audio/blob/main/mlx_audio/sts/voice_pipeline.py
//! [vp-class]: https://github.com/Blaizzy/mlx-audio/blob/main/mlx_audio/sts/voice_pipeline.py#L570-L572
//! [orch-session]: https://github.com/Blaizzy/mlx-audio/blob/main/mlx_audio/sts/voice_pipeline.py#L570-L572
pub use ;
pub use ;
pub use ;
pub use ;
pub use ;
pub use VoicePipeline;