//! Speech-to-text provider abstraction.
use async_trait::async_trait;
/// One-shot speech-to-text. v1 is buffer-based: the caller hands over a
/// complete utterance, bounded on the client by push-to-talk (PTT) or
/// voice-activity detection (VAD), and gets the final transcript back as a
/// single string.
///
/// Streaming transcription (partial hypotheses while the user is still
/// speaking) is intentionally out of scope for v1: neither of our planned
/// initial providers (whisper-rs, OpenAI Whisper API) exposes a true
/// streaming surface that would matter at the round-trip latencies we
/// already accept for sub-10-second utterances.