
# STT CLI (Speech-to-Text Command Line Interface)

A command-line tool for real-time speech-to-text transcription, backed by the Groq or OpenAI transcription APIs.

## Features

- Real-time audio capture from microphone
- Support for multiple transcription providers:
  - Groq (using whisper-large-v3)
  - OpenAI (using Whisper)
- Efficient audio processing with proper chunking
- Configurable activation modes: always-on or hotkey
- Clean shutdown handling with Ctrl+C
- Optional text insertion, auto-capitalization, and auto-punctuation

## Installation

### Via Cargo

```bash
cargo install stt-cli
```

### From source

1. Make sure Rust is installed on your system. If it is not, install it from [rustup.rs](https://rustup.rs).

2. Clone the repository:
   ```bash
   git clone https://github.com/TwistingTwists/stt-cli
   cd stt-cli
   ```

3. Build the project (Cargo writes the release binary to `target/release/`):
   ```bash
   cargo build --release
   ```

## Usage

Run the CLI with your desired options:

```bash
stt-cli [OPTIONS]
```

### Command-Line Options

| Flag / Option                     | Description                                                           |
|-----------------------------------|-----------------------------------------------------------------------|
| `-d, --device <DEVICE>`           | Audio device name to use                                              |
| `-m, --mode <MODE>`               | Transcription activation mode [default: always-on] [possible values: always-on, hotkey] |
| `-k, --hotkey <HOTKEY>`           | Hotkey for toggling recording (when in hotkey mode) [default: ctrl+space] |
| `    --data-dir <DATA_DIR>`       | Directory to store data [default: data_dir]                           |
| `    --debug`                     | Enable debug mode                                                     |
| `-t, --transcription-provider <TRANSCRIPTION_PROVIDER>` | Transcription provider [default: groq] [possible values: groq, open-ai] |
| `    --enable-text-insertion`     | Enable text insertion at cursor position                              |
| `    --auto-capitalize`           | Automatically capitalize first letter of transcribed text              |
| `    --auto-punctuate`            | Automatically add trailing punctuation if missing                      |
| `-h, --help`                      | Print help                                                            |
| `-V, --version`                   | Print version                                                         |

### Examples

```bash
# Using Groq
stt-cli -t groq

# Using OpenAI
stt-cli -t open-ai

# Using hotkey mode with a custom hotkey
stt-cli -m hotkey -k "alt+r"
```
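
The options in the table can also be bundled into a small shell function so that a preferred setup is one command away. The sketch below is only an illustration: the function name `dictate` is hypothetical and not part of stt-cli, and it assumes the listed flags can be combined freely, which the table suggests but does not state.

```bash
# Hypothetical wrapper (the name `dictate` is not part of stt-cli).
# Runs stt-cli in hotkey mode with text insertion, auto-capitalization,
# and auto-punctuation enabled. Any extra arguments (for example
# `-t open-ai`) are passed through unchanged.
dictate() {
  stt-cli \
    --mode hotkey \
    --hotkey "ctrl+space" \
    --enable-text-insertion \
    --auto-capitalize \
    --auto-punctuate \
    "$@"
}
```

With this function defined in your shell profile, `dictate -t open-ai` would start the same setup with OpenAI as the provider.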

### Environment Variables

Before running the application, make sure to set up the required API keys:

- For Groq:
  ```bash
  export GROQ_API_KEY='your-groq-api-key'
  ```

- For OpenAI:
  ```bash
  export OPENAI_API_KEY='your-openai-api-key'
  ```
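
A launch script can check that the key for the chosen provider is present before starting the tool. The following is a minimal sketch, not something stt-cli ships: the function name `require_api_key` is hypothetical, while the provider names and variable names are the ones listed in this README.

```bash
# Hypothetical helper (not part of stt-cli): map a provider name to the
# environment variable this README lists for it, and fail if it is unset.
require_api_key() {
  case "$1" in
    groq)    key_var="GROQ_API_KEY" ;;
    open-ai) key_var="OPENAI_API_KEY" ;;
    *) echo "unknown provider: $1" >&2; return 2 ;;
  esac
  # Look up the value of the variable whose name is stored in key_var.
  # eval is safe here because key_var can only be one of the two fixed names above.
  eval "key_value=\${$key_var:-}"
  if [ -z "$key_value" ]; then
    echo "$key_var is not set" >&2
    return 1
  fi
  # Print the variable name on success so callers can log which key was found.
  echo "$key_var"
}
```

One way to use it is `require_api_key groq >/dev/null && stt-cli -t groq`, which only starts the tool when the key is set.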

## Expected Output

When running the application, you'll see:
1. Initialization messages for audio device setup
2. Real-time transcription of your speech
3. Status messages for audio processing and API requests

Example:
```
Initializing audio device...
Audio capture started. Speak into your microphone.
[Transcription] "Hello, this is a test of the speech to text system."
...
```
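
If you only want the transcribed text, the status lines can be filtered out. The sketch below assumes that transcription lines look exactly like the example above (`[Transcription] "..."`) and that they are written to standard output; the real format may differ, and the function name `extract_transcripts` is hypothetical.

```bash
# Hypothetical filter: reads stt-cli output on stdin, keeps only lines shaped like
#   [Transcription] "some text"
# and prints the text without the prefix and surrounding quotes.
extract_transcripts() {
  sed -n 's/^\[Transcription] "\(.*\)"$/\1/p'
}
```

Under those assumptions, `stt-cli -t groq | extract_transcripts >> notes.txt` would append each transcribed sentence to `notes.txt`.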

Press Ctrl+C to gracefully stop the application.

## Contributing

Contributions are welcome! Feel free to open an [issue](https://github.com/TwistingTwists/stt-cli/issues/new).

## License

This project is licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.