# ragcli
[![CI](https://github.com/mfmezger/ragcli/actions/workflows/cargo-test.yml/badge.svg)](https://github.com/mfmezger/ragcli/actions/workflows/cargo-test.yml)
[![Rust](https://img.shields.io/badge/rust-stable-orange.svg)](https://www.rust-lang.org/)
[![crates.io](https://img.shields.io/crates/v/ragcli.svg)](https://crates.io/crates/ragcli)
`ragcli` is a small local RAG CLI written in Rust.
It indexes local files into a persistent LanceDB store, uses Ollama for embeddings and generation, and stays intentionally simple so the whole flow is easy to inspect and extend.
## Features
- local text, Markdown, HTML, CSV/TSV, source code, PDF, and image indexing
- local embedding and generation through Ollama
- hybrid retrieval with LanceDB vector search + BM25 full-text search
- persistent per-store data under `~/.config/ragcli/<name>`
- idempotent re-indexing by `source_path`
- compatibility checks for embedding model + chunk settings
- `doctor` checks for store state, Ollama reachability, and installed models
- `stat` summarizes indexed content, approximate embedded token volume, and store disk usage
- `sources`/`ls`, `delete`, `clear`, and `prune` help inspect and maintain indexed content
- query modes with inspectable retrieval output via `--mode`, `--show-plan`, `--show-scores`, `--show-citations`, and `--show-trace`
## Quick Start
Install [`ragcli` from crates.io](https://crates.io/crates/ragcli):
```bash
cargo install ragcli
```
Then:
1. Start Ollama.
2. Pull the embedding and chat models.
3. Index a local file or folder.
4. Query your local corpus.
```bash
ollama pull nomic-embed-text-v2-moe:latest
ollama pull qwen3.5:4b
ragcli --help
ragcli doctor
ragcli stat
ragcli index ./docs
ragcli query "What is this project about?"
```
If `ragcli` is not found after installation, make sure `~/.cargo/bin` is on your `PATH`.
When developing from a local checkout, you can also run commands with `cargo run --`, or install the checkout with `cargo install --path . --force`.
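If you need to add the directory manually, a POSIX-compatible snippet (suitable for your shell profile) is:

```shell
# Prepend Cargo's default bin directory to PATH if it is not already there.
case ":$PATH:" in
  *":$HOME/.cargo/bin:"*) ;;  # already on PATH, nothing to do
  *) export PATH="$HOME/.cargo/bin:$PATH" ;;
esac
```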
## Commands
Index a directory or file:
```bash
ragcli index ./docs
ragcli index ./My_Neighbor_Totoro.pdf
ragcli index ./images
ragcli index . --exclude '**/target/**' --exclude '**/.git/**'
```
Use a named store:
```bash
ragcli index ./docs --name work
ragcli query "Summarize the notes" --name work
```
Inspect retrieved context before generation:
```bash
ragcli query "What is My Neighbor Totoro about?" --show-context
ragcli query "Summarize this file" --source ./notes/today.md
ragcli query "What happens on this page?" --source ./My_Neighbor_Totoro.pdf --page 3
ragcli query "What changed in the docs?" --path-prefix ./docs/
ragcli query "Summarize the markdown notes" --format markdown
ragcli query "What is this project about?" --mode hybrid --show-plan
ragcli query "Summarize the release notes" --mode agentic --show-trace
ragcli query "Which file mentions Totoro?" --show-citations --show-scores
```
Check local setup:
```bash
ragcli doctor
ragcli doctor --json
```
Inspect what is already embedded:
```bash
ragcli stat
ragcli stat --json
ragcli sources
ragcli ls
```
Remove or clean indexed content:
```bash
ragcli delete ./notes/today.md
ragcli prune
ragcli prune --json
ragcli prune --apply
ragcli clear --yes
```
## Configuration
Config lives at:
```text
~/.config/ragcli/<name>/config.toml
```
Default config:
```toml
[ollama]
base_url = "http://localhost:11434"
[models]
embed = "nomic-embed-text-v2-moe:latest"
chat = "qwen3.5:4b"
vision = "qwen3.5:4b"
[chunk]
size = 1000
overlap = 200
```
Effective runtime values can be overridden with environment variables:
```bash
export RAGCLI_OLLAMA_URL=http://localhost:11434
export RAGCLI_EMBED_MODEL=nomic-embed-text-v2-moe:latest
export RAGCLI_CHAT_MODEL=qwen3.5:4b
export RAGCLI_VISION_MODEL=qwen3.5:4b
```
You can inspect and update config without editing TOML by hand:
```bash
ragcli config show
ragcli config show --json
ragcli config set models.embed nomic-embed-text-v2-moe:latest
ragcli config set ollama.base_url http://localhost:11434
```
Supported indexable formats currently include:
- plain text: `.txt`, `.rst`
- Markdown: `.md`, `.markdown`
- HTML: `.html`, `.htm`
- tabular text: `.csv`, `.tsv`
- source/config text: `.rs`, `.py`, `.js`, `.ts`, `.tsx`, `.jsx`, `.go`, `.java`, `.c`, `.cc`, `.cpp`, `.cxx`, `.h`, `.hpp`, `.sh`, `.bash`, `.toml`, `.yaml`, `.yml`, `.json`
- PDF: `.pdf`
- images: `.png`, `.jpg`, `.jpeg`, `.webp`
## Telemetry
`ragcli` can optionally export tracing spans over OTLP while keeping the normal stderr log output.
OTLP export is disabled unless `OTEL_EXPORTER_OTLP_ENDPOINT` is set.
Currently supported protocols:
- `http/protobuf` (default when `OTEL_EXPORTER_OTLP_PROTOCOL` is unset)
- `grpc`
Common environment variables:
```bash
export OTEL_SERVICE_NAME=ragcli
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_EXPORTER_OTLP_TIMEOUT=10000
# optional
export OTEL_EXPORTER_OTLP_HEADERS="api_key=..."
```
Collector example:
```bash
export OTEL_SERVICE_NAME=ragcli
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
ragcli doctor
```
Phoenix local example:
```bash
export OTEL_SERVICE_NAME=ragcli
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:6006
ragcli query "What is this project about?"
```
For `http/protobuf`, `ragcli` appends `/v1/traces` only when the configured endpoint has no path (for example `http://localhost:4318`). If you provide a custom path such as `http://localhost:6006/ingest`, that path is used as-is.
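The rule can be sketched as a small shell function (illustrative only, not the actual implementation):

```shell
# Append /v1/traces only when the endpoint URL has no path component.
resolve_traces_url() {
  url="$1"
  rest="${url#*://}"          # strip the scheme, leaving host[:port][/path]
  case "$rest" in
    */*) echo "$url" ;;             # custom path: used as-is
    *)   echo "$url/v1/traces" ;;   # no path: default traces path appended
  esac
}

resolve_traces_url "http://localhost:4318"         # → http://localhost:4318/v1/traces
resolve_traces_url "http://localhost:6006/ingest"  # → http://localhost:6006/ingest
```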
You can inspect the resolved telemetry settings with:
```bash
ragcli doctor
ragcli doctor --json
```
`doctor` reports whether telemetry is enabled, the resolved service name, protocol, endpoint, timeout, whether OTLP headers are configured, and any telemetry configuration parse error. Header values are never printed.
By default, exported spans include operational metadata such as command names, query/index execution details, model names, endpoint hosts, durations, and request/response sizes. They do **not** include prompt bodies, retrieved source content, image bytes, or OTLP header values.
## Storage
Each store lives under:
```text
~/.config/ragcli/<name>/
lancedb/
meta/
cache/
models/
config.toml
```
`meta/store.toml` records the embedding model, embedding dimension, chunk settings, and store schema version used to build the store.
When upgrading across store schema changes, reindex into a fresh store or remove the old store before indexing again.
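As an illustration, such a metadata file could look roughly like this (the field names here are hypothetical, not the actual `meta/store.toml` schema):

```toml
# Illustrative sketch only -- shows the kind of metadata recorded,
# not the real field names.
embed_model = "nomic-embed-text-v2-moe:latest"
embed_dim = 768
chunk_size = 1000
chunk_overlap = 200
schema_version = 1
```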
## Behavior Notes
- Re-indexing replaces existing rows for the same `source_path`.
- `sources`/`ls` lists indexed paths with per-source format, chunk count, character count, token estimate, and page count when applicable.
- `delete <path>` removes one indexed source path.
- `prune` previews rows whose stored `source_path` no longer exists on disk; add `--apply` to remove them.
- `clear` removes all indexed rows for the selected store and requires `--yes`.
- Text and source files are decoded lossily, so non-UTF-8 files do not abort indexing.
- Hidden files and directories are skipped during directory traversal unless `--include-hidden` is set.
- `index --exclude <glob>` can be repeated to skip unwanted files or directories.
- HTML is converted to readable text before chunking, and CSV/TSV rows are flattened into labeled text.
- Images are captioned with an Ollama vision model at index time and stored as text for retrieval.
- Queries support `--mode naive|hybrid|agentic|local|global|mix`; `agentic` runs the iterative Ralph-style retrieval loop, while `local`, `global`, and `mix` use distinct placeholder graph-mode paths that still fall back to hybrid retrieval until graph indexing lands.
- Queries use LanceDB hybrid search: semantic nearest-neighbor search plus BM25 full-text search on `chunk_text`.
- Query-time retrieval filters support `--source`, `--path-prefix`, `--page`, and `--format`.
- Querying refuses to mix a store with a different embedding model than the one used to build it.
- Ollama chat requests are sent with `think: false` to reduce hangs with reasoning-capable models.
## Development
Build:
```bash
cargo build
```
Verify:
```bash
cargo check
cargo test --all-targets
cargo fmt -- --check
```
Coverage:
```bash
cargo install cargo-llvm-cov
rustup component add llvm-tools-preview
cargo llvm-cov --all-targets --fail-under-lines 80
cargo llvm-cov --all-targets --html
```
If you use [Task](https://taskfile.dev/), the repo also includes [Taskfile.yml](Taskfile.yml) with shortcuts for the common workflows:
```bash
task build
task check
task test
task coverage
task coverage-html
task coverage-lcov
task changelog
task changelog-preview
task release -- patch
task release-execute -- patch
task fmt
task doctor
task stat
task sources
task delete -- ./notes/today.md
task clear -- --yes
task prune
task prune -- --apply
```
Task arguments can be forwarded to CLI tasks with `--`, for example:
```bash
task index -- ./docs
task query -- "What is this project about?"
```
## CI and Coverage
GitHub Actions runs both `cargo test --all-targets` and a separate coverage job using `cargo-llvm-cov` for pushes to `main` and pull requests targeting `main`.
The coverage job fails if line coverage drops below 80%:
```bash
cargo llvm-cov --all-targets --fail-under-lines 80
```
To make this a merge requirement on GitHub, set the `Coverage (80% required)` status check as a required check in the branch protection rule for `main`.
Releases are configured with `cargo-release` through [`Cargo.toml`](Cargo.toml). The repository is set up to:
- only release from `main`
- create tags as `v<version>`
- regenerate [`CHANGELOG.md`](CHANGELOG.md) with [`git-cliff`](https://git-cliff.org/) before the release commit
- default to `cargo release` dry runs unless you pass `--execute`
- skip automatic crates.io publishing during `cargo release` so publishing can be done explicitly with `cargo publish`
Typical usage:
```bash
cargo install cargo-release
cargo install git-cliff
cargo release patch
cargo release patch --execute
cargo publish
```