Module voice

Voice capture: microphone-to-WAV pipeline.

This module is the library half of omni-dev voice capture; the CLI entry point lives in crate::cli::voice. It is intentionally CLI-free so that the audio pipeline (source → mixdown → resample → idle-detect → trim → write) can be unit-tested against fixture WAVs without a real microphone.
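To make two of the pipeline stages concrete, here is a minimal sketch of a mixdown step and a trailing-silence trim. The function names, signatures, and the simple amplitude threshold are assumptions made for illustration; they are not taken from this crate's wav or idle modules, which may work differently.

```rust
/// Hypothetical mixdown: average each interleaved frame down to one mono sample.
/// Any incomplete trailing frame is ignored.
fn mixdown_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    assert!(channels > 0, "channel count must be non-zero");
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}

/// Hypothetical trim: drop trailing samples whose absolute amplitude is
/// below `threshold`. Returns an empty slice if every sample is quiet.
fn trim_trailing_silence(samples: &[f32], threshold: f32) -> &[f32] {
    let end = samples
        .iter()
        .rposition(|s| s.abs() >= threshold)
        .map_or(0, |i| i + 1);
    &samples[..end]
}
```

Because both functions take plain slices, they can be exercised with in-memory fixtures and no audio device, which is the property the CLI-free design is after.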

The AudioSource trait in audio is the test seam: production code uses audio::CpalAudioSource, tests use audio::FileAudioSource. See ADR-0031 for the rationale.
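The test-seam idea can be sketched as follows. The trait below, SampleSource, and the FixtureSource type are hypothetical stand-ins; the real AudioSource signature is not shown on this page and may differ. The point is only that pipeline code depends on a trait, so a test can substitute a deterministic source for the microphone.

```rust
/// Hypothetical stand-in for a source trait; not the crate's `AudioSource`.
trait SampleSource {
    /// Fill `buf` with samples and return how many were written.
    /// A return value of 0 signals end of stream.
    fn read(&mut self, buf: &mut [f32]) -> usize;
}

/// In-memory source playing the role a fixture-backed source plays in tests.
struct FixtureSource {
    samples: Vec<f32>,
    pos: usize,
}

impl SampleSource for FixtureSource {
    fn read(&mut self, buf: &mut [f32]) -> usize {
        let n = buf.len().min(self.samples.len() - self.pos);
        buf[..n].copy_from_slice(&self.samples[self.pos..self.pos + n]);
        self.pos += n;
        n
    }
}

/// Pipeline-side code sees only the trait, so it never needs a real device.
fn drain(source: &mut dyn SampleSource) -> Vec<f32> {
    let mut out = Vec::new();
    let mut buf = [0.0_f32; 4];
    loop {
        let n = source.read(&mut buf);
        if n == 0 {
            break;
        }
        out.extend_from_slice(&buf[..n]);
    }
    out
}
```

A test would construct a FixtureSource from known samples and assert on what drain returns; production code would pass a device-backed implementation of the same trait.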

Re-exports

pub use audio::AudioSource;
pub use audio::CpalAudioSource;
pub use audio::FileAudioSource;
pub use capture::install_ctrl_c_handler;
pub use capture::run_capture;
pub use capture::CaptureOpts;
pub use capture::CaptureSummary;
pub use capture::TerminationReason;
pub use clock::Clock;
pub use clock::FixedClock;
pub use clock::SystemClock;
pub use det::CountingUlidRng;
pub use det::SystemUlidRng;
pub use det::UlidRng;
pub use factory::create_default_transcriber;
pub use factory::VoiceOpts;
pub use paths::captures_dir;
pub use paths::omni_dev_voice_root;
pub use paths::speaker_file;
pub use paths::speakers_dir;
pub use render::detect_format;
pub use render::render_jsonl;
pub use render::render_markdown;
pub use render::OutputFormat;
pub use speaker::cosine;
pub use speaker::l2_normalise;
pub use speaker::EnrolledSpeaker;
pub use speaker::WespeakerEmbedder;
pub use speaker::MIN_EMBED_SAMPLES;
pub use transcriber::AudioChunk;
pub use transcriber::AudioInput;
pub use transcriber::EndpointKind;
pub use transcriber::EventId;
pub use transcriber::EventStream;
pub use transcriber::SpeakerId;
pub use transcriber::Transcriber;
pub use transcriber::TranscriptEvent;
pub use transcriber::VecAudioInput;
pub use transcriber::Word;

Modules

audio: Audio source abstraction.
backends: Transcriber backends.
capture: End-to-end capture pipeline orchestrator.
clock: Pluggable wall clock for deterministic timestamps in tests.
det: Pluggable RNG for EventId (ULID) generation.
events: Reflection event schema (the events.jsonl contract from #799).
factory: Backend factory for crate::voice::Transcriber.
features: Kaldi-style FBANK (log-mel filterbank) feature extraction.
idle: Idle (silence) detection and trailing-silence trimming.
models: Model storage convention and path resolution.
paths: User-state directory helpers for the voice subsystem.
reconcile: Pure reconciliation of events.jsonl into materialised markdown.
reflect: voice reflect — transcript-to-events Claude consumer.
render: Rendering helpers for crate::voice::TranscriptEvent streams.
review: voice review driver — wraps the pure crate::voice::reconcile function with the I/O the CLI needs.
session: ~/.omni-dev/voice/<id>/ session directory I/O.
speaker: Speaker embedding via tract-onnx + wespeaker, plus the persisted enrolled-speaker JSON shape.
transcriber: Transcriber trait and event types per issues #799 and #801.
wav: Mixdown, resampling, and WAV writing.
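As an illustration of the kind of vector math the speaker module's re-exported cosine and l2_normalise helpers suggest, here is a minimal sketch of L2 normalisation and cosine similarity over embedding vectors. The names l2_normalise_sketch and cosine_sketch, their signatures, and the handling of zero vectors are assumptions; the crate's own functions may differ.

```rust
/// Hypothetical helper: scale `v` in place to unit L2 norm.
/// A zero vector is left unchanged.
fn l2_normalise_sketch(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Hypothetical helper: cosine similarity of two equal-length vectors.
/// Returns 0.0 if either vector has zero norm.
fn cosine_sketch(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}
```

When both embeddings are already unit-length, the cosine reduces to a plain dot product, which is one common reason to normalise embeddings before storing them.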
