ragcli
ragcli is a small local RAG CLI written in Rust.
It indexes local files into a persistent LanceDB store, uses Ollama for embeddings and generation, and stays intentionally simple so the whole flow is easy to inspect and extend.
Features
- local text, Markdown, HTML, CSV/TSV, source code, PDF, and image indexing
- local embedding and generation through Ollama
- hybrid retrieval with LanceDB vector search + BM25 full-text search
- persistent per-store data under ~/.config/ragcli/<name>
- idempotent re-indexing by source_path
- compatibility checks for embedding model + chunk settings
- doctor checks for store state, Ollama reachability, and installed models
- stat summarizes indexed content, approximate embedded token volume, and store disk usage
- sources/ls, delete, clear, and prune help inspect and maintain indexed content
- query modes with inspectable retrieval output via --mode, --show-plan, --show-scores, --show-citations, and --show-trace
Quick Start
Install ragcli from crates.io:
Then:
- Start Ollama.
- Pull the embedding and chat models.
- Index a local file or folder.
- Query your local corpus.
If ragcli is not found after installation, make sure ~/.cargo/bin is on your PATH.
When developing from a local checkout, you can also run commands with cargo run --, or install the checkout with cargo install --path . --force.
Commands
Index a directory or file:
Use a named store:
Inspect retrieved context before generation:
Check local setup:
Inspect what is already embedded:
Remove or clean indexed content:
Configuration
Config lives at:
~/.config/ragcli/<name>/config.toml
Default config:
[]
= "http://localhost:11434"
[]
= "nomic-embed-text-v2-moe:latest"
= "qwen3.5:4b"
= "qwen3.5:4b"
[]
= 1000
= 200
Effective runtime values can be overridden with environment variables:
You can inspect and update config without editing TOML by hand:
Supported indexable formats currently include:
- plain text: .txt, .rst
- Markdown: .md, .markdown
- HTML: .html, .htm
- tabular text: .csv, .tsv
- source/config text: .rs, .py, .js, .ts, .tsx, .jsx, .go, .java, .c, .cc, .cpp, .cxx, .h, .hpp, .sh, .bash, .toml, .yaml, .yml, .json
- PDF: .pdf
- images: .png, .jpg, .jpeg, .webp
Telemetry
ragcli can optionally export tracing spans over OTLP while keeping the normal stderr log output.
OTLP export is disabled unless OTEL_EXPORTER_OTLP_ENDPOINT is set.
Currently supported protocols:
- http/protobuf (default when OTEL_EXPORTER_OTLP_PROTOCOL is unset)
- grpc
Common environment variables:
Collector example:
Phoenix local example:
For http/protobuf, ragcli appends /v1/traces only when the configured endpoint has no path (for example http://localhost:4318). If you provide a custom path such as http://localhost:6006/ingest, that path is used as-is.
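The path rule above can be sketched as a small standalone function. This is an illustration of the described behavior, not ragcli's actual code: the name resolve_traces_endpoint is hypothetical, a bare trailing slash is assumed to count as "no path", and query strings are not handled.

```rust
/// Hypothetical helper: resolve the URL used for http/protobuf trace export.
/// Appends `/v1/traces` only when the endpoint carries no path of its own.
fn resolve_traces_endpoint(endpoint: &str) -> String {
    // Split off the scheme so the "//" after it is not mistaken for a path.
    let (scheme, rest) = match endpoint.split_once("://") {
        Some((scheme, rest)) => (Some(scheme), rest),
        None => (None, endpoint),
    };

    // Everything after the first '/' that follows the host[:port] is the path.
    let (authority, path) = match rest.split_once('/') {
        Some((authority, path)) => (authority, path),
        None => (rest, ""),
    };

    // A custom path such as `/ingest` is used as-is.
    if !path.is_empty() {
        return endpoint.to_string();
    }

    // Assumption: an empty path or a bare trailing slash means "no path".
    match scheme {
        Some(scheme) => format!("{scheme}://{authority}/v1/traces"),
        None => format!("{authority}/v1/traces"),
    }
}
```

With this sketch, http://localhost:4318 resolves to http://localhost:4318/v1/traces, while http://localhost:6006/ingest is returned unchanged.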
You can inspect the resolved telemetry settings with:
doctor reports whether telemetry is enabled, the resolved service name, protocol, endpoint, timeout, whether OTLP headers are configured, and any telemetry configuration parse error. Header values are never printed.
By default, exported spans include operational metadata such as command names, query/index execution details, model names, endpoint hosts, durations, and request/response sizes. They do not include prompt bodies, retrieved source content, image bytes, or OTLP header values.
Storage
Each store lives under:
~/.config/ragcli/<name>/
  lancedb/
  meta/
  cache/
  models/
  config.toml
meta/store.toml records the embedding model, embedding dimension, chunk settings, and store schema version used to build the store.
When upgrading across store schema changes, reindex into a fresh store or remove the old store before indexing again.
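To illustrate how recorded store metadata can drive a compatibility check, here is a minimal sketch. The StoreMeta struct, its field names, and check_compatible are hypothetical and do not reflect the actual layout of meta/store.toml; in particular, modeling "chunk settings" as a size and an overlap is an assumption.

```rust
/// Hypothetical in-memory view of what a store records about how it was built.
#[derive(Debug, Clone, PartialEq, Eq)]
struct StoreMeta {
    embed_model: String,
    embed_dim: usize,
    chunk_size: usize,    // assumed chunk setting
    chunk_overlap: usize, // assumed chunk setting
    schema_version: u32,
}

/// Record a human-readable problem when a recorded value differs from the current one.
fn diff<T: PartialEq + std::fmt::Debug>(name: &str, stored: &T, current: &T, out: &mut Vec<String>) {
    if stored != current {
        out.push(format!(
            "{name}: store was built with {stored:?}, current config has {current:?}"
        ));
    }
}

/// Compare the recorded metadata with the current settings.
/// Returns every mismatch at once so the user can fix them in one pass.
fn check_compatible(stored: &StoreMeta, current: &StoreMeta) -> Result<(), Vec<String>> {
    let mut problems = Vec::new();
    diff("embedding model", &stored.embed_model, &current.embed_model, &mut problems);
    diff("embedding dimension", &stored.embed_dim, &current.embed_dim, &mut problems);
    diff("chunk size", &stored.chunk_size, &current.chunk_size, &mut problems);
    diff("chunk overlap", &stored.chunk_overlap, &current.chunk_overlap, &mut problems);
    diff("schema version", &stored.schema_version, &current.schema_version, &mut problems);
    if problems.is_empty() { Ok(()) } else { Err(problems) }
}
```

Collecting all mismatches instead of failing on the first one is a design choice of this sketch; it makes it easier to decide whether to adjust the config or reindex into a fresh store.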
Behavior Notes
- Re-indexing replaces existing rows for the same source_path.
- sources/ls lists indexed paths with per-source format, chunk count, character count, token estimate, and page count when applicable.
- delete <path> removes one indexed source path.
- prune previews rows whose stored source_path no longer exists on disk; add --apply to remove them.
- clear removes all indexed rows for the selected store and requires --yes.
- Text and source files are decoded lossily, so non-UTF-8 files do not abort indexing.
- Hidden files and directories are skipped during directory traversal unless --include-hidden is set.
- index --exclude <glob> can be repeated to skip unwanted files or directories.
- HTML is converted to readable text before chunking, and CSV/TSV rows are flattened into labeled text.
- Images are captioned with an Ollama vision model at index time and stored as text for retrieval.
- Queries support --mode naive|hybrid|agentic|local|global|mix; agentic runs the iterative Ralph-style retrieval loop, while local, global, and mix use distinct placeholder graph-mode paths that still fall back to hybrid retrieval until graph indexing lands.
- Queries use LanceDB hybrid search: semantic nearest-neighbor search plus BM25 full-text search on chunk_text.
- Query-time retrieval filters support --source, --path-prefix, --page, and --format.
- Querying refuses to mix a store with a different embedding model than the one used to build it.
- Ollama chat requests are sent with think: false to reduce hangs with reasoning-capable models.
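As an illustration of the lossy decoding mentioned in the notes above, the following sketch uses the standard library's String::from_utf8_lossy. The function name decode_lossy_counted is hypothetical, and whether ragcli decodes files exactly this way is an assumption.

```rust
/// Hypothetical helper: decode bytes as UTF-8 without ever failing.
/// Invalid byte sequences become U+FFFD (the replacement character).
/// Also returns how many replacement characters the decoded text contains,
/// which can be useful for a warning. Note that a file which legitimately
/// contains U+FFFD is counted as well.
fn decode_lossy_counted(bytes: &[u8]) -> (String, usize) {
    let text = String::from_utf8_lossy(bytes).into_owned();
    let replaced = text
        .chars()
        .filter(|&c| c == char::REPLACEMENT_CHARACTER)
        .count();
    (text, replaced)
}
```

For example, the Latin-1 bytes b"caf\xe9 menu" decode to "caf\u{FFFD} menu" with one replacement, so a file in a legacy encoding is still indexed with most of its text intact.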
Development
Build:
Verify:
Coverage:
If you use Task, the repo also includes Taskfile.yml with shortcuts for the common workflows:
Task arguments can be forwarded to CLI tasks with --, for example:
CI and Coverage
GitHub Actions runs both cargo test --all-targets and a separate coverage job using cargo-llvm-cov for pushes to main and pull requests targeting main.
The coverage job fails if line coverage drops below 80%:
To make this a merge requirement on GitHub, set the Coverage (80% required) status check as a required check in the branch protection rule for main.
Releases are configured with cargo-release through Cargo.toml. The repository is set up to:
- only release from main
- create tags as v<version>
- regenerate CHANGELOG.md with git-cliff before the release commit
- default to cargo release dry runs unless you pass --execute
- skip automatic crates.io publishing during cargo release so publishing can be done explicitly with cargo publish
Typical usage: