whis-core 0.6.0

Core library for whis voice-to-text functionality

Features

  • Audio recording — capture microphone input via cpal
  • Multi-provider transcription — OpenAI, Mistral, Groq, Deepgram, ElevenLabs, or local Whisper
  • Parallel processing — split long recordings into chunks for parallel transcription
  • LLM post-processing — clean up transcriptions using Ollama
  • Clipboard — copy results to system clipboard (X11, Wayland, Flatpak)
  • Config management — persistent settings in ~/.config/whis/
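The parallel-processing idea above can be illustrated with a small standard-library sketch. It is not whis-core's implementation: `transcribe_chunk` and `transcribe_in_parallel` are hypothetical names, the "transcription" is a placeholder string, and fixed-size byte chunks stand in for however the crate actually splits a recording.

```rust
use std::thread;

/// Hypothetical stand-in for a provider call. A real implementation would
/// send the chunk to a transcription backend and return its text.
fn transcribe_chunk(index: usize, data: &[u8]) -> String {
    format!("[chunk {index}: {} bytes]", data.len())
}

/// Split `audio` into chunks of at most `chunk_size` bytes, transcribe each
/// chunk on its own scoped thread, and join the results in chunk order.
fn transcribe_in_parallel(audio: &[u8], chunk_size: usize) -> String {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    thread::scope(|scope| {
        // Spawn every worker first so the chunks are processed concurrently.
        let handles: Vec<_> = audio
            .chunks(chunk_size)
            .enumerate()
            .map(|(i, chunk)| scope.spawn(move || transcribe_chunk(i, chunk)))
            .collect();

        // Joining in spawn order keeps the text in recording order,
        // regardless of which thread finishes first.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("transcription thread panicked"))
            .collect::<Vec<_>>()
            .join(" ")
    })
}
```

Calling `transcribe_in_parallel(&audio, 4)` on a 10-byte buffer produces three chunks (4, 4, and 2 bytes). The point of the design is that results are reassembled by chunk index rather than by completion time.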

Usage

use whis_core::{
    AudioRecorder, TranscriptionProvider, RecordingOutput,
    transcribe_audio, copy_to_clipboard, ClipboardMethod,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Configure provider and API key
    let provider = TranscriptionProvider::OpenAI;
    let api_key = std::env::var("OPENAI_API_KEY")?;

    // Record audio
    let mut recorder = AudioRecorder::new()?;
    recorder.start_recording()?;
    // ... wait for user input ...
    let output = recorder.finalize_recording()?;

    // Extract audio data from RecordingOutput
    let audio_data = match output {
        RecordingOutput::Single(data) => data,
        RecordingOutput::Chunked(chunks) => {
            // Only the first chunk is transcribed here to keep the example short.
            // For chunked audio, use parallel_transcribe to process every chunk.
            chunks
                .into_iter()
                .next()
                .ok_or("recording produced no audio chunks")?
                .data
        }
    };

    // Transcribe
    let text = transcribe_audio(&provider, &api_key, None, audio_data)?;

    // Copy to clipboard
    copy_to_clipboard(&text, ClipboardMethod::Auto)?;

    Ok(())
}
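The usage example keeps only the first chunk of a chunked recording. A minimal sketch of handling both variants without discarding audio follows. The types are local stand-ins, not the crate's definitions: only the variant names and the `data` field come from the example above, the `Vec<u8>` payload type is an assumption, and `into_segments` is a hypothetical helper.

```rust
/// Stand-in for whis_core::AudioChunk. Only the `data` field is taken from
/// the usage example; its type (Vec<u8>) is an assumption.
struct AudioChunk {
    data: Vec<u8>,
}

/// Stand-in for whis_core::RecordingOutput with the two variants shown above.
enum RecordingOutput {
    Single(Vec<u8>),
    Chunked(Vec<AudioChunk>),
}

/// Normalize either variant into an ordered list of audio segments, so the
/// caller can transcribe every segment instead of only the first one.
fn into_segments(output: RecordingOutput) -> Vec<Vec<u8>> {
    match output {
        RecordingOutput::Single(data) => vec![data],
        RecordingOutput::Chunked(chunks) => {
            chunks.into_iter().map(|chunk| chunk.data).collect()
        }
    }
}
```

An empty `Chunked` recording yields an empty list rather than a panic, so the caller decides how to treat "nothing was recorded".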

Feature Flags

Feature           Default  Description
ffmpeg            Yes      Desktop audio encoding via FFmpeg subprocess
clipboard         Yes      Clipboard support via arboard/xclip/wl-copy
local-whisper     Yes      Local whisper.cpp transcription (requires model)
embedded-encoder  No       Mobile MP3 encoding via mp3lame (no FFmpeg)
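
Features are selected in the dependent crate's Cargo.toml. The fragment below is a sketch of a build without FFmpeg, assuming the crate is pulled from crates.io at the version this page describes. Turning off the defaults also drops clipboard and local-whisper, so list any of those you still need.

```toml
[dependencies]
# Disable the default features (ffmpeg, clipboard, local-whisper) and
# opt in to the mp3lame-based encoder instead.
whis-core = { version = "0.6.0", default-features = false, features = ["embedded-encoder"] }
```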

Modules

Module           Description
audio            AudioRecorder, AudioChunk, RecordingOutput, recording utilities
transcribe       Single-file and parallel chunked transcription
provider         Provider registry and TranscriptionBackend trait
config           TranscriptionProvider enum (OpenAI, Mistral, Groq, etc.)
settings         User preferences (provider, API keys, language, hotkeys)
preset           Named configuration presets
post_processing  LLM-based transcription cleanup
ollama           Ollama client for local LLM post-processing
clipboard        System clipboard operations with multiple backends
model            Whisper model management
state            Recording state machine
verbose          Debug logging utilities

License

MIT