ragcli 0.2.0

CLI for local RAG

ragcli


ragcli is a small local RAG CLI written in Rust.

It indexes local files into a persistent LanceDB store, uses Ollama for embeddings and generation, and stays intentionally simple so the whole flow is easy to inspect and extend.

Features

  • local text, Markdown, HTML, CSV/TSV, source code, PDF, and image indexing
  • local embedding and generation through Ollama
  • hybrid retrieval with LanceDB vector search + BM25 full-text search
  • persistent per-store data under ~/.config/ragcli/<name>
  • idempotent re-indexing by source_path
  • compatibility checks for embedding model + chunk settings
  • doctor checks for store state, Ollama reachability, and installed models
  • stat summarizes indexed content, approximate embedded token volume, and store disk usage
  • sources/ls, delete, clear, and prune help inspect and maintain indexed content
  • query modes with inspectable retrieval output via --mode, --show-plan, --show-scores, --show-citations, and --show-trace

Quick Start

Install ragcli from crates.io:

cargo install ragcli

Then:

  1. Start Ollama.
  2. Pull the embedding and chat models.
  3. Index a local file or folder.
  4. Query your local corpus.

ollama pull nomic-embed-text-v2-moe:latest
ollama pull qwen3.5:4b

ragcli --help
ragcli doctor
ragcli stat
ragcli index ./docs
ragcli query "What is this project about?"

If ragcli is not found after installation, make sure ~/.cargo/bin is on your PATH.

When developing from a local checkout, you can also run commands with cargo run --, or install the checkout with cargo install --path . --force.

Commands

Index a directory or file:

ragcli index ./docs
ragcli index ./My_Neighbor_Totoro.pdf
ragcli index ./images
ragcli index . --exclude '**/target/**' --exclude '**/.git/**'

Use a named store:

ragcli index ./docs --name work
ragcli query "Summarize the notes" --name work

Inspect retrieved context, filter retrieval, and choose a query mode:

ragcli query "What is My Neighbor Totoro about?" --show-context
ragcli query "Summarize this file" --source ./notes/today.md
ragcli query "What happens on this page?" --source ./My_Neighbor_Totoro.pdf --page 3
ragcli query "What changed in the docs?" --path-prefix ./docs/
ragcli query "Summarize the markdown notes" --format markdown
ragcli query "What is this project about?" --mode hybrid --show-plan
ragcli query "Summarize the release notes" --mode agentic --show-trace
ragcli query "Which file mentions Totoro?" --show-citations --show-scores

Check local setup:

ragcli doctor
ragcli doctor --json

Inspect what is already embedded:

ragcli stat
ragcli stat --json
ragcli sources
ragcli ls

Remove or clean indexed content:

ragcli delete ./notes/today.md
ragcli prune
ragcli prune --json
ragcli prune --apply
ragcli clear --yes

Configuration

Config lives at:

~/.config/ragcli/<name>/config.toml

Default config:

[ollama]
base_url = "http://localhost:11434"

[models]
embed = "nomic-embed-text-v2-moe:latest"
chat = "qwen3.5:4b"
vision = "qwen3.5:4b"

[chunk]
size = 1000
overlap = 200
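
The [chunk] settings describe a sliding window over each document. The following is a minimal sketch of that idea, not ragcli's actual chunker. It assumes size and overlap are counted in characters (the README does not state the unit), and the function name chunk_text is hypothetical.

```rust
/// Split `text` into windows of at most `size` characters, where each window
/// starts `size - overlap` characters after the previous one.
/// Assumption: the unit is Unicode scalar values (chars), not tokens or bytes.
fn chunk_text(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(size > 0, "size must be positive");
    assert!(overlap < size, "overlap must be smaller than size");

    let chars: Vec<char> = text.chars().collect();
    let step = size - overlap;
    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;

    while start < chars.len() {
        let end = usize::min(start + size, chars.len());
        chunks.push(chars[start..end].iter().collect());
        // Stop once a window reaches the end of the text.
        if end == chars.len() {
            break;
        }
        start += step;
    }
    chunks
}
```

Under these assumptions, the default size = 1000 and overlap = 200 would advance the window by 800 characters per chunk, so neighboring chunks share 200 characters of context.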

Effective runtime values can be overridden with environment variables:

export RAGCLI_OLLAMA_URL=http://localhost:11434
export RAGCLI_EMBED_MODEL=nomic-embed-text-v2-moe:latest
export RAGCLI_CHAT_MODEL=qwen3.5:4b
export RAGCLI_VISION_MODEL=qwen3.5:4b
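
The override order described above can be sketched as follows. This is an illustration only: the helper names resolve and effective_embed_model are hypothetical, and treating an empty variable as unset is an assumption rather than documented ragcli behavior.

```rust
/// Pick the effective value for one setting: a non-empty environment
/// override wins, otherwise the value read from config.toml is used.
/// Assumption: an empty or whitespace-only variable counts as unset.
fn resolve(env_value: Option<String>, config_value: &str) -> String {
    match env_value {
        Some(v) if !v.trim().is_empty() => v,
        _ => config_value.to_string(),
    }
}

/// Resolve the embedding model name for the current run.
fn effective_embed_model(config_value: &str) -> String {
    resolve(std::env::var("RAGCLI_EMBED_MODEL").ok(), config_value)
}
```

Keeping the environment lookup separate from the decision logic makes the precedence rule easy to test without touching the process environment.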

You can inspect and update config without editing TOML by hand:

ragcli config show
ragcli config show --json
ragcli config set models.embed nomic-embed-text-v2-moe:latest
ragcli config set ollama.base_url http://localhost:11434

Supported indexable formats currently include:

  • plain text: .txt, .rst
  • Markdown: .md, .markdown
  • HTML: .html, .htm
  • tabular text: .csv, .tsv
  • source/config text: .rs, .py, .js, .ts, .tsx, .jsx, .go, .java, .c, .cc, .cpp, .cxx, .h, .hpp, .sh, .bash, .toml, .yaml, .yml, .json
  • PDF: .pdf
  • images: .png, .jpg, .jpeg, .webp

Telemetry

ragcli can optionally export tracing spans over OTLP while keeping the normal stderr log output.

OTLP export is disabled unless OTEL_EXPORTER_OTLP_ENDPOINT is set.

Currently supported protocols:

  • http/protobuf (default when OTEL_EXPORTER_OTLP_PROTOCOL is unset)
  • grpc

Common environment variables:

export OTEL_SERVICE_NAME=ragcli
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_EXPORTER_OTLP_TIMEOUT=10000
# optional
export OTEL_EXPORTER_OTLP_HEADERS="api_key=..."

Collector example:

export OTEL_SERVICE_NAME=ragcli
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
ragcli doctor

Phoenix local example:

export OTEL_SERVICE_NAME=ragcli
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:6006
ragcli query "What is this project about?"

For http/protobuf, ragcli appends /v1/traces only when the configured endpoint has no path (for example http://localhost:4318). If you provide a custom path such as http://localhost:6006/ingest, that path is used as-is.
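
The endpoint rule above can be sketched as a small string function. This is a hedged illustration, not ragcli's implementation: the name resolve_traces_url is hypothetical, and treating a bare trailing "/" as "no path" is an assumption the README does not confirm.

```rust
/// Return the URL that traces would be sent to for http/protobuf.
/// Assumption: a bare trailing "/" counts as "no path".
fn resolve_traces_url(endpoint: &str) -> String {
    // Skip the scheme ("http://") so its slashes are not mistaken for a path.
    let after_scheme = match endpoint.find("://") {
        Some(idx) => &endpoint[idx + 3..],
        None => endpoint,
    };
    // Everything from the first "/" after the host onward is the path.
    let path = match after_scheme.find('/') {
        Some(idx) => &after_scheme[idx..],
        None => "",
    };
    if path.is_empty() || path == "/" {
        // No path configured: append the default OTLP traces path.
        format!("{}/v1/traces", endpoint.trim_end_matches('/'))
    } else {
        // A custom path is used as-is.
        endpoint.to_string()
    }
}
```

With this sketch, http://localhost:4318 resolves to http://localhost:4318/v1/traces, while http://localhost:6006/ingest is left unchanged.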

You can inspect the resolved telemetry settings with:

ragcli doctor
ragcli doctor --json

doctor reports whether telemetry is enabled, the resolved service name, protocol, endpoint, timeout, whether OTLP headers are configured, and any telemetry configuration parse error. Header values are never printed.

By default, exported spans include operational metadata such as command names, query/index execution details, model names, endpoint hosts, durations, and request/response sizes. They do not include prompt bodies, retrieved source content, image bytes, or OTLP header values.

Storage

Each store lives under:

~/.config/ragcli/<name>/
  lancedb/
  meta/
  cache/
  models/
  config.toml

meta/store.toml records the embedding model, embedding dimension, chunk settings, and store schema version used to build the store.

When upgrading across store schema changes, reindex into a fresh store or remove the old store before indexing again.
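
A compatibility check against the recorded store metadata might look like the sketch below. The struct, its field names, and the function check_compat are all hypothetical, and the README does not list exactly which mismatches are rejected, so rejecting every mismatch here is an assumption.

```rust
/// Hypothetical in-memory view of meta/store.toml; field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct StoreMeta {
    embed_model: String,
    embed_dim: usize,
    chunk_size: usize,
    chunk_overlap: usize,
    schema_version: u32,
}

/// Compare the settings recorded in the store with the settings for the
/// current run and describe the first mismatch, if any.
fn check_compat(store: &StoreMeta, current: &StoreMeta) -> Result<(), String> {
    if store.schema_version != current.schema_version {
        return Err(format!(
            "store schema version {} does not match {}; reindex into a fresh store",
            store.schema_version, current.schema_version
        ));
    }
    if store.embed_model != current.embed_model || store.embed_dim != current.embed_dim {
        return Err(format!(
            "store was built with embedding model {} ({} dims), not {} ({} dims)",
            store.embed_model, store.embed_dim, current.embed_model, current.embed_dim
        ));
    }
    if store.chunk_size != current.chunk_size || store.chunk_overlap != current.chunk_overlap {
        return Err(format!(
            "store was built with chunk size {} and overlap {}, not {} and {}",
            store.chunk_size, store.chunk_overlap, current.chunk_size, current.chunk_overlap
        ));
    }
    Ok(())
}
```

Checking the schema version first keeps the error message aligned with the upgrade advice above: a schema mismatch calls for a fresh store, regardless of the other settings.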

Behavior Notes

  • Re-indexing replaces existing rows for the same source_path.
  • sources/ls lists indexed paths with per-source format, chunk count, character count, token estimate, and page count when applicable.
  • delete <path> removes one indexed source path.
  • prune previews rows whose stored source_path no longer exists on disk; add --apply to remove them.
  • clear removes all indexed rows for the selected store and requires --yes.
  • Text and source files are decoded lossily, so non-UTF-8 files do not abort indexing.
  • Hidden files and directories are skipped during directory traversal unless --include-hidden is set.
  • index --exclude <glob> can be repeated to skip unwanted files or directories.
  • HTML is converted to readable text before chunking, and CSV/TSV rows are flattened into labeled text.
  • Images are captioned with an Ollama vision model at index time and stored as text for retrieval.
  • Queries support --mode naive|hybrid|agentic|local|global|mix. agentic runs the iterative Ralph-style retrieval loop. local, global, and mix take distinct placeholder graph-mode code paths, but all three fall back to hybrid retrieval until graph indexing lands.
  • Queries use LanceDB hybrid search: semantic nearest-neighbor search plus BM25 full-text search on chunk_text.
  • Query-time retrieval filters support --source, --path-prefix, --page, and --format.
  • Querying is refused when the configured embedding model differs from the one used to build the store.
  • Ollama chat requests are sent with think: false to reduce hangs with reasoning-capable models.
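
The lossy decoding mentioned in the notes above can be illustrated with the Rust standard library. This is a minimal sketch of the technique; the helper name decode_lossy is hypothetical and the README does not show how ragcli performs the decoding.

```rust
/// Decode raw file bytes as text, replacing each invalid UTF-8 sequence with
/// U+FFFD (the replacement character) instead of returning an error.
fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

For example, a Latin-1 encoded file containing the byte 0xE9 would still be indexed, with that byte shown as the replacement character in the stored text.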

Development

Build:

cargo build

Verify:

cargo check
cargo test --all-targets
cargo fmt -- --check

Coverage:

cargo install cargo-llvm-cov
rustup component add llvm-tools-preview
cargo llvm-cov --all-targets --fail-under-lines 80
cargo llvm-cov --all-targets --html

If you use Task, the repo also includes Taskfile.yml with shortcuts for the common workflows:

task build
task check
task test
task coverage
task coverage-html
task coverage-lcov
task changelog
task changelog-preview
task release -- patch
task release-execute -- patch
task fmt
task doctor
task stat
task sources
task delete -- ./notes/today.md
task clear -- --yes
task prune
task prune -- --apply

Task arguments can be forwarded to CLI tasks with --, for example:

task index -- ./docs
task query -- "What is this project about?"

CI and Coverage

On pushes to main and on pull requests targeting main, GitHub Actions runs cargo test --all-targets and a separate coverage job that uses cargo-llvm-cov.

The coverage job fails if line coverage drops below 80%:

cargo llvm-cov --all-targets --fail-under-lines 80

To make this a merge requirement on GitHub, set the Coverage (80% required) status check as a required check in the branch protection rule for main.

Releases are configured with cargo-release through Cargo.toml. The repository is set up to:

  • only release from main
  • create tags as v<version>
  • regenerate CHANGELOG.md with git-cliff before the release commit
  • default to cargo release dry runs unless you pass --execute
  • skip automatic crates.io publishing during cargo release so publishing can be done explicitly with cargo publish

Typical usage:

cargo install cargo-release
cargo install git-cliff
cargo release patch
cargo release patch --execute
cargo publish