# transcribe-cli

`transcribe-cli` is a Rust command-line transcription pipeline built on Whisper and CTranslate2.

It supports:

- CPU-optimized transcription
- optional NVIDIA CUDA execution
- automatic Whisper model download into `models/`
- local media files or `http/https` media URLs
- streaming transcription modes
- model cleanup commands

## Install

From crates.io:

```bash
cargo install transcribe-cli --locked
```

From a local checkout:

```bash
cargo install --path . --locked
```

With CUDA support:

```bash
CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda CT2_CUDA_ARCH_LIST=Auto cargo install transcribe-cli --locked --features cuda
```

With CUDA + cuDNN support:

```bash
CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda CT2_CUDA_ARCH_LIST=Auto cargo install transcribe-cli --locked --features cudnn
```

## Usage

```bash
transcribe-cli --model small audio.mp3
transcribe-cli --model medium --stream audio.flac
transcribe-cli --model small movie.mp4
transcribe-cli --model tiny https://example.com/audio.wav
```
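By default, models are downloaded into `models/` next to the executable (see Notes). To keep them somewhere else, point `--models-dir` at a directory of your choice; the cache path below is just an example:

```bash
# Store Whisper models in a shared cache directory instead of ./models
transcribe-cli --model small --models-dir ~/.cache/transcribe-cli/models audio.mp3
```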

## Features

- `cuda`: enable CUDA support with dynamic loading
- `cuda-static`: enable static CUDA support
- `cuda-dynamic-loading`: alias for the dynamic CUDA path
- `cudnn`: enable cuDNN on top of CUDA
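Features are selected at install time via Cargo's `--features` flag. As a sketch, a statically linked CUDA build would combine the `cuda-static` feature with the same environment variables used in the install examples above (assuming your toolkit lives under `/usr/local/cuda`):

```bash
# Statically link CUDA instead of using the dynamic-loading path
CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda CT2_CUDA_ARCH_LIST=Auto \
  cargo install transcribe-cli --locked --features cuda-static
```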

## Notes

- Whisper models are downloaded automatically on first use.
- Media files are decoded through the built-in Rust pipeline; no external `ffmpeg` dependency is required.
- Video containers work when they include an audio track in a codec supported by `symphonia`.
- By default models are stored in `models/` next to the executable unless `--models-dir` is set.
- Whisper decoding is handled in-project through a local wrapper around CTranslate2 `sys::Whisper` and Hugging Face `tokenizers`.
- `cuda` and `cudnn` build CTranslate2 from source through `ct2rs`, so the NVIDIA driver alone is not enough: a CUDA Toolkit install is required.
- `cudnn` also requires the cuDNN development files. `ct2rs` looks for `cuda.h` under `$CUDA_TOOLKIT_ROOT_DIR/include` and for `cudnn.h` plus `libcudnn` under the same CUDA root.
- `ct2rs` defaults to `CUDA_ARCH_LIST=Common`, which can include architectures removed from newer CUDA toolkits such as CUDA 13.x. This project sets `CT2_CUDA_ARCH_LIST=Auto` by default to avoid `nvcc fatal: Unsupported gpu architecture 'compute_53'`.
- If auto-detection is not what you want, override it explicitly with `CUDA_ARCH_LIST=8.6` or another value supported by your GPU and CUDA toolkit.
- `--locked` is recommended for `cargo install` so published installs use the crate's resolved dependency set instead of newer patch releases.
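Putting the last two notes together, an install pinned to a specific GPU architecture (here `8.6`, i.e. an Ampere-class card; substitute the value supported by your GPU and toolkit) might look like:

```bash
# Override architecture auto-detection with an explicit compute capability
CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda CUDA_ARCH_LIST=8.6 \
  cargo install transcribe-cli --locked --features cuda
```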