STT CLI (Speech-to-Text Command Line Interface)
A command-line tool for real-time speech-to-text transcription using AI transcription providers (Groq and OpenAI).
Features
- Real-time audio capture from microphone
- Support for multiple transcription providers:
  - Groq (using whisper-large-v3)
  - OpenAI (using Whisper)
- Efficient audio processing with proper chunking
- Configurable activation modes: always-on or hotkey
- Clean shutdown handling with Ctrl+C
- Optional text insertion, auto-capitalization, and auto-punctuation
Installation
via cargo
Manual way
- Make sure you have Rust installed on your system. If not, install it from rustup.rs.
- Clone the repository.
- Build the project.
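The commands for both installation routes are not shown above. As a rough sketch, they could be wrapped as follows. It assumes the crate is published on crates.io under the name stt-cli and that the repository has already been cloned; the function names are hypothetical, and the repository URL is deliberately left out because it is not given here.

```shell
# Fail early with a readable message when the Rust toolchain is missing.
require_cargo() {
    if ! command -v cargo >/dev/null 2>&1; then
        echo "cargo not found; install Rust from rustup.rs first" >&2
        return 1
    fi
}

# Route 1: install from crates.io (the crate name "stt-cli" is an assumption).
install_via_cargo() {
    require_cargo && cargo install stt-cli
}

# Route 2: build a release binary inside an existing checkout.
# $1 is the path to the cloned repository.
build_from_source() {
    require_cargo && (cd "$1" && cargo build --release)
}
```

With a standard Cargo layout, `build_from_source` leaves the binary under `target/release/` inside the checkout.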
Usage
Run the CLI with any of the options listed below.
Command-Line Options
Flag / Option | Description |
---|---|
-d, --device <DEVICE> | Audio device name to use |
-m, --mode <MODE> | Transcription activation mode [default: always-on] [possible values: always-on, hotkey] |
-k, --hotkey <HOTKEY> | Hotkey for toggling recording (when in hotkey mode) [default: ctrl+space] |
--data-dir <DATA_DIR> | Directory to store data [default: data_dir] |
--debug | Enable debug mode |
-t, --transcription-provider <TRANSCRIPTION_PROVIDER> | Transcription provider [default: groq] [possible values: groq, open-ai] |
--enable-text-insertion | Enable text insertion at cursor position |
--auto-capitalize | Automatically capitalize first letter of transcribed text |
--auto-punctuate | Automatically add trailing punctuation if missing |
-h, --help | Print help |
-V, --version | Print version |
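To make the two text post-processing options concrete, here is a minimal shell sketch of what --auto-capitalize and --auto-punctuate are described as doing. It is an illustration only, not the tool's implementation: the function names are hypothetical, and the choice of a period as the appended mark is an assumption.

```shell
# Illustration only: mimics the described behavior, not stt-cli's actual code.

# Uppercase the first character of $1 (ASCII-oriented).
auto_capitalize() {
    first=$(printf '%s' "$1" | cut -c1 | tr '[:lower:]' '[:upper:]')
    rest=$(printf '%s' "$1" | cut -c2-)
    printf '%s%s\n' "$first" "$rest"
}

# Append a period unless $1 already ends in '.', '!' or '?'.
# Which mark the real tool appends is an assumption here.
auto_punctuate() {
    case "$1" in
        "")      printf '\n' ;;
        *[.!?])  printf '%s\n' "$1" ;;
        *)       printf '%s.\n' "$1" ;;
    esac
}

# prints: Hello, this is a test.
auto_punctuate "$(auto_capitalize "hello, this is a test")"
```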
Examples
# Using Groq
# Using OpenAI
# Using hotkey mode with a custom hotkey
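The commands for the three cases above are not shown. As a sketch built only from the flags in the options table, they might look like the wrappers below. The binary name stt-cli is an assumption based on the crate name, the wrapper function names are hypothetical, and the custom hotkey string is assumed to follow the same modifier+key format as the default ctrl+space.

```shell
# Binary name is an assumption (taken from the crate name); override if needed.
STT_BIN="${STT_BIN:-stt-cli}"

# Using Groq (also the documented default provider)
stt_groq() { "$STT_BIN" --transcription-provider groq "$@"; }

# Using OpenAI
stt_openai() { "$STT_BIN" --transcription-provider open-ai "$@"; }

# Using hotkey mode with a custom hotkey
# (the "ctrl+shift+space" spelling is assumed to follow the default's format)
stt_hotkey() { "$STT_BIN" --mode hotkey --hotkey "ctrl+shift+space" "$@"; }
```

Extra flags pass straight through, so `stt_groq --auto-capitalize --auto-punctuate` would add the text post-processing options to the Groq invocation.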
Environment Variables
Before running the application, set the API key for the provider you plan to use:
- Groq
- OpenAI
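The exact variable names are not given here. The sketch below assumes the conventional names GROQ_API_KEY and OPENAI_API_KEY, which should be confirmed against the project before use, and adds a small hypothetical helper, require_key, that checks a key is present before the tool is launched.

```shell
# Variable names are assumptions; confirm them against the project's docs.
export GROQ_API_KEY="your-groq-api-key"
export OPENAI_API_KEY="your-openai-api-key"

# Return non-zero if the variable whose NAME is given in $1 is unset or empty.
require_key() {
    eval "key_value=\${$1:-}"
    if [ -z "$key_value" ]; then
        echo "error: $1 is not set" >&2
        return 1
    fi
}

require_key GROQ_API_KEY && echo "Groq key found"
```

Only the key for the selected provider should be needed, so a launcher script could call `require_key` for that one variable and exit early with a clear message if it is missing.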
Expected Output
When running the application, you'll see:
- Initialization messages for audio device setup
- Real-time transcription of your speech
- Status messages for audio processing and API requests
Example:
Initializing audio device...
Audio capture started. Speak into your microphone.
[Transcription] "Hello, this is a test of the speech to text system."
...
Press Ctrl+C to gracefully stop the application.
Contributing
Contributions are welcome! Please feel free to submit an issue.
License
This project is licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.