speak. type. done.
whisrs
Linux-first voice-to-text dictation tool, written in Rust.
Speech-to-text for Wayland, X11, Hyprland, Sway, GNOME, and KDE. Press a hotkey, speak, and your words appear at the cursor. Works with any app, any window manager, any desktop environment. Supports cloud transcription (Groq, Deepgram, OpenAI) and fully offline local transcription via whisper.cpp. Fast, private, open source.
Why whisrs?
Dictation tools like Wispr Flow and Superwhisper are not available on Linux. xhisper proved the concept works, but I kept running into limitations. whisrs takes that idea and rebuilds it in Rust as a single async process with native keyboard layout support, window tracking, and multiple transcription backends.
Installation
Quick install (any distro)
|
Or clone and run locally:
&& &&
The install script handles everything: detects your distro, installs system dependencies, builds the project, and runs interactive setup.
After install, press your hotkey to start recording, press again to stop. Text appears at your cursor.
Pre-built binary (Linux x86_64)
Each tagged release publishes a tarball on GitHub Releases with both whisrs and whisrsd plus the contrib files (udev rule, systemd unit, man pages).
# Full build (cloud + local whisper.cpp)
# Or the minimal build (cloud backends only — smaller, no whisper.cpp)
&&
| Variant | Includes local whisper.cpp | Tarball |
|---|---|---|
| full build | yes | whisrs-linux-x86_64.tar.gz |
| minimal build | no (cloud backends only) | whisrs-linux-x86_64-minimal.tar.gz |
Arch Linux (AUR)
After install, run whisrs setup to configure your backend, API keys, permissions, and keybindings.
Cargo
Requires system dependencies: alsa-lib, libxkbcommon, clang, cmake.
After install, run whisrs setup.
Nix
Or add to your flake inputs:
inputs.whisrs.url = "github:y0sif/whisrs";
Manual install
1. Dependencies
# Arch Linux
# Debian/Ubuntu
# Fedora
2. Build
3. Setup
The interactive setup will walk you through backend selection, API keys / model download, microphone test, uinput permissions, systemd service, and keybindings.
4. Bind a hotkey
Example for Hyprland (~/.config/hypr/hyprland.conf):
bind = $mainMod, W, exec, whisrs toggle
Example for Sway (~/.config/sway/config):
bindsym $mod+w exec whisrs toggle
Transcription Backends
| Backend | Type | Streaming | Cost | Best for |
|---|---|---|---|---|
| Groq | Cloud | Batch | Free tier available | Getting started, budget use |
| Deepgram Streaming | Cloud (WebSocket) | True streaming | $200 free credit | Streaming with free credits |
| Deepgram REST | Cloud | Batch | $200 free credit | Simple, 60+ languages |
| OpenAI Realtime | Cloud (WebSocket) | True streaming | Paid | Best UX, text as you speak |
| OpenAI REST | Cloud | Batch | Paid | Simple fallback |
| Local whisper.cpp | Local (CPU/GPU) | Sliding window | Free | Privacy, offline use |
Groq is the default. For fully offline use, run whisrs setup and select Local > whisper.cpp. The recommended model is base.en (142 MB, ~388 MB RAM); use tiny.en (75 MB) on low-end hardware or small.en (466 MB) for higher accuracy.
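To make the table easier to reason about, here is a minimal Rust sketch that maps the backend identifiers accepted in the config file (groq, deepgram-streaming, deepgram, openai-realtime, openai, local-whisper) to the delivery mode listed in the Streaming column. The type and function names (`Backend`, `Delivery`, `delivery`, `is_local`) are hypothetical and chosen for illustration; they are not taken from the whisrs source, and the pairing of each identifier with a table row is an assumption based on the names.

```rust
use std::str::FromStr;

/// How transcribed text arrives, mirroring the "Streaming" column of the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Delivery {
    Batch,
    TrueStreaming,
    SlidingWindow,
}

/// Hypothetical enum for the six backends; Groq is marked as the default
/// because the README states that Groq is the default backend.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum Backend {
    #[default]
    Groq,
    DeepgramStreaming,
    Deepgram,
    OpenAiRealtime,
    OpenAi,
    LocalWhisper,
}

impl Backend {
    /// Delivery mode per the table (assumed identifier-to-row pairing).
    pub fn delivery(self) -> Delivery {
        match self {
            Backend::Groq | Backend::Deepgram | Backend::OpenAi => Delivery::Batch,
            Backend::DeepgramStreaming | Backend::OpenAiRealtime => Delivery::TrueStreaming,
            Backend::LocalWhisper => Delivery::SlidingWindow,
        }
    }

    /// Only the whisper.cpp backend runs without a network connection.
    pub fn is_local(self) -> bool {
        matches!(self, Backend::LocalWhisper)
    }
}

impl FromStr for Backend {
    type Err = String;

    /// Parse the identifier strings shown in the example config comment.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim() {
            "groq" => Ok(Backend::Groq),
            "deepgram-streaming" => Ok(Backend::DeepgramStreaming),
            "deepgram" => Ok(Backend::Deepgram),
            "openai-realtime" => Ok(Backend::OpenAiRealtime),
            "openai" => Ok(Backend::OpenAi),
            "local-whisper" => Ok(Backend::LocalWhisper),
            other => Err(format!("unknown backend: {other}")),
        }
    }
}
```

With this sketch, `"openai-realtime".parse::<Backend>()` yields a backend whose `delivery()` is `Delivery::TrueStreaming`, and an unrecognised identifier produces an error instead of silently falling back to the default.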
Configuration
Config file: ~/.config/whisrs/config.toml — whisrs setup writes a working file. A minimal example:
[]
= "groq" # groq | deepgram-streaming | deepgram | openai-realtime | openai | local-whisper
= "en" # ISO 639-1 or "auto"
= false # bottom-screen recording overlay
[]
= "gsk_..."
Env-var overrides: WHISRS_GROQ_API_KEY, WHISRS_DEEPGRAM_API_KEY, WHISRS_OPENAI_API_KEY.
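As an illustration of what "override" means here, the following Rust sketch resolves an API key by preferring the environment variable and falling back to the value from config.toml. The three variable names come from the line above; everything else (the function names, the provider strings, and the rule that a blank variable is ignored) is an assumption made for this example and may not match how whisrs resolves keys internally.

```rust
use std::env;

/// Name of the override variable for a cloud provider, or None for
/// providers that need no key. Provider strings are illustrative.
pub fn api_key_env_var(provider: &str) -> Option<&'static str> {
    match provider {
        "groq" => Some("WHISRS_GROQ_API_KEY"),
        "deepgram" => Some("WHISRS_DEEPGRAM_API_KEY"),
        "openai" => Some("WHISRS_OPENAI_API_KEY"),
        _ => None,
    }
}

/// Resolve a key using a caller-supplied lookup, so the precedence rule can
/// be exercised without touching the real process environment.
/// Assumed rule: a non-blank environment value wins over the config value.
pub fn resolve_api_key_with<F>(
    lookup: F,
    provider: &str,
    config_key: Option<&str>,
) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    api_key_env_var(provider)
        .and_then(|name| lookup(name))
        .filter(|value| !value.trim().is_empty())
        .or_else(|| {
            config_key
                .filter(|key| !key.is_empty())
                .map(str::to_owned)
        })
}

/// Convenience wrapper that reads the real environment.
pub fn resolve_api_key(provider: &str, config_key: Option<&str>) -> Option<String> {
    resolve_api_key_with(|name| env::var(name).ok(), provider, config_key)
}
```

Calling `resolve_api_key("groq", Some("gsk_..."))` returns the contents of WHISRS_GROQ_API_KEY when that variable is set and non-blank, and the config value otherwise. Keeping the key in the environment is a common way to avoid committing secrets alongside a shared config file.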
For the full reference (overlay, [input], [llm], [hotkeys], GNOME extension setup), see docs/configuration.md.
CLI Commands
whisrs setup # Interactive onboarding
whisrs toggle # Start/stop recording
whisrs cancel # Cancel recording, discard audio
whisrs status # Query daemon state
whisrs command # Command mode: select text + speak instruction → LLM rewrite
whisrs log # Show recent transcription history
whisrs log -n 5 # Show last 5 entries
whisrs log --clear # Clear all history
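The toggle and cancel commands above imply a small state machine in the daemon. The Rust sketch below models it under stated assumptions: the state names (`Idle`, `Recording`, `Transcribing`), the `Effect` values, and the choice to ignore commands that arrive in other states are all hypothetical and only illustrate the behaviour described in this README, not the actual daemon code.

```rust
/// Hypothetical daemon states, roughly what `whisrs status` might report.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Idle,
    Recording,
    Transcribing,
}

/// The two commands that change recording state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Command {
    Toggle,
    Cancel,
}

/// Side effect the daemon would perform for a transition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Effect {
    StartCapture,
    StopAndTranscribe,
    DiscardAudio,
    Nothing,
}

/// Pure transition function: given the current state and a command,
/// return the next state and the effect to run.
pub fn step(state: State, command: Command) -> (State, Effect) {
    match (state, command) {
        // First hotkey press: start recording.
        (State::Idle, Command::Toggle) => (State::Recording, Effect::StartCapture),
        // Second hotkey press: stop and send the audio for transcription.
        (State::Recording, Command::Toggle) => {
            (State::Transcribing, Effect::StopAndTranscribe)
        }
        // Cancel while recording: drop the audio, type nothing.
        (State::Recording, Command::Cancel) => (State::Idle, Effect::DiscardAudio),
        // Assumption: anything else is ignored and leaves the state unchanged.
        (other, _) => (other, Effect::Nothing),
    }
}

/// Called when the backend has returned text and it has been typed.
pub fn transcription_done(state: State) -> State {
    match state {
        State::Transcribing => State::Idle,
        other => other,
    }
}
```

Keeping the transition logic as a pure function makes it straightforward to test the hotkey behaviour without a microphone or a running daemon.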
Supported Environments
| Component | Support |
|---|---|
| Hyprland | Tested by maintainer and community (Arch Linux) |
| Sway / i3 | Implemented; additional reports welcome |
| X11 (any WM) | Tested by community on Ubuntu 24.04 (Xorg) |
| GNOME Wayland | Tested by community on Ubuntu 24.04 and Arch (mutter); overlay via the bundled GNOME Shell extension |
| KDE Wayland | Implemented via D-Bus; reports welcome |
| Audio | PipeWire, PulseAudio, ALSA (auto-detected via cpal) |
| Distros | Confirmed on Arch Linux and Ubuntu 24.04; should work on any Linux distribution that provides the system dependencies listed above |
Note: whisrs is daily-driven on Hyprland (Arch Linux), with community confirmation on GNOME Wayland (Ubuntu 24.04 + Arch) and Xorg (Ubuntu 24.04). Sway, i3, and KDE reports are still wanted — if you use whisrs there, please open an issue with what works and what doesn't.
Project Status
whisrs is functional and usable for daily dictation. Streaming transcription, command mode, multi-language support, system tray, OSD overlay, layout-aware injection (incl. AltGr + dead keys), and packaging for AUR / Nix / crates.io all ship today. Local Vosk and Parakeet backends are next.
Per-release details: docs/version-roadmap.md.
Troubleshooting
Contributing
The biggest way to help right now:
- Test on your compositor: Sway, i3, KDE, GNOME. Report what works and what doesn't.
- Test on your distro: Ubuntu, Fedora, NixOS, and others. Report build issues and missing dependencies.
- Bug reports: if text goes to the wrong window, characters get dropped, or audio isn't captured, open an issue.
See CONTRIBUTING.md for development setup and project structure.
How whisrs Compares
FAQ
License
MIT