# visual-rubric

[CI workflow](.forgejo/workflows/ci.yaml) [Nix flake](flake.nix) [crates.io](https://crates.io/crates/visual-rubric)

`visual-rubric` runs AI-assisted rubric checks against screenshots through
`codex-acp`. It is intended for local visual UX review loops where deterministic
tests can prove structure and screenshots can catch layout, hierarchy, and
readability regressions.

The crate exposes:

- `evaluate_image_rubric_with_options` for one-off screenshot checks.
- `evaluate_image_rubric_with_config` when callers need a custom `codex-acp`
  binary, environment, or working directory.
- `RubricPool` for repeated checks with process reuse, retry backoff, quota
  detection, and worker recycling.
- `BatchRubricRun` for caller-provided asset batches with changed-file
  selection, partial-error reports, ACP log capture, and optional issue
  classification hooks.
- The `visual-rubric` CLI for image checks, local static hosting, screenshot
  capture, and advisory audit reports.

Feature flags:

- `fake-codex-acp` builds the `fake-codex-acp` test helper binary. It is off by
  default and does not change the library API.

Project-specific judgment belongs in the caller-provided `system_prompt`; the
default prompt only covers generic screenshot breakage such as clipped text,
overlapping controls, blank regions, illegible contrast, and visibly broken
layout.

```sh
visual-rubric \
  --image site-desktop.png \
  --question "Does the install flow make prerequisites, commands, and next steps clear?" \
  --system-prompt "You are auditing a software project website install section."
```
The same form is available as an explicit subcommand:
```sh
visual-rubric image \
  --image site-desktop.png \
  --question "Does the install section stay readable?"
```
For generated assets, callers can keep project-specific discovery downstream
and let the crate own generic batch mechanics:
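```rust
use visual_rubric::{
    AssetSnapshot, BatchRubricConfig, BatchRubricRun, PoolConfig, SelectionMode,
    diff_snapshots,
};

// Assumes the crate's error type converts into `Box<dyn Error>` via `?`.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Placeholder hash for brevity: it only sees the byte length, so same-size
    // edits go undetected. Use a real content hash in practice.
    let stable_hash = |bytes: &[u8]| format!("{:x}", bytes.len());

    // Capture two snapshots and compute what changed between them.
    let before = AssetSnapshot::capture(["before.png"], stable_hash)?;
    let after = AssetSnapshot::capture(["after.png"], stable_hash)?;
    let changes = diff_snapshots(&before, &after);

    // Evaluate only the assets selected by the diff.
    let report = BatchRubricRun::new(BatchRubricConfig {
        pool: PoolConfig::default(),
        question: "Does this image pass visual QA?".to_owned(),
        selection_mode: SelectionMode::ChangedOnly,
        classifier: None,
    })
    .run(&changes);
    // Inspect `report` here.
    Ok(())
}
```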
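
The idea behind changed-file selection can be pictured with a small standard-library sketch. This is not the crate's implementation: `Snapshot`, `changed_paths`, and the literal hashes are hypothetical and chosen only for illustration, and a real snapshot would hash file contents rather than use fixed strings.

```rust
use std::collections::BTreeMap;

/// Hypothetical snapshot type: maps an asset path to a content hash.
type Snapshot = BTreeMap<String, String>;

/// Returns paths that are new in `after` or whose hash differs from `before`.
/// Deleted paths are skipped in this sketch: there is no image left to check.
fn changed_paths(before: &Snapshot, after: &Snapshot) -> Vec<String> {
    after
        .iter()
        .filter(|(path, hash)| before.get(*path) != Some(*hash))
        .map(|(path, _)| path.clone())
        .collect()
}

/// Builds a snapshot from literal (path, hash) pairs for the example.
fn snapshot(pairs: &[(&str, &str)]) -> Snapshot {
    pairs
        .iter()
        .map(|(path, hash)| (path.to_string(), hash.to_string()))
        .collect()
}

fn main() {
    let before = snapshot(&[("home.png", "aa11"), ("install.png", "bb22")]);
    let after = snapshot(&[
        ("home.png", "aa11"),
        ("install.png", "cc33"),
        ("docs.png", "dd44"),
    ]);

    // Only modified or newly added assets would be sent for rubric evaluation.
    for path in changed_paths(&before, &after) {
        println!("{path}");
    }
}
```

Here `home.png` is unchanged and is skipped, while `install.png` (modified) and `docs.png` (new) are selected.
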
For local website iteration, serve a static directory, capture browser
screenshots, and write a report:
```sh
visual-rubric audit \
  --root website/public \
  --path __audit/install.html \
  --browser chromium \
  --browser-arg=--disable-dev-shm-usage \
  --wait-ms 250 \
  --capture-retries 1 \
  --viewport desktop=1440x1100 \
  --viewport mobile=390x1800 \
  --question "Does this install section make the next action obvious?" \
  --report target/visual-rubric/report.json
```
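
The `--viewport` values in the command above follow a `name=WIDTHxHEIGHT` shape. As a rough illustration of how such a value could be split into its parts, here is a standard-library sketch. `Viewport` and `parse_viewport` are hypothetical names, and the CLI's actual parsing and validation rules may differ.

```rust
/// Hypothetical representation of one `--viewport name=WIDTHxHEIGHT` value.
#[derive(Debug, PartialEq)]
struct Viewport {
    name: String,
    width: u32,
    height: u32,
}

/// Parses `name=WIDTHxHEIGHT`; returns `None` for malformed input.
fn parse_viewport(spec: &str) -> Option<Viewport> {
    let (name, size) = spec.split_once('=')?;
    let (width, height) = size.split_once('x')?;
    if name.is_empty() {
        return None;
    }
    Some(Viewport {
        name: name.to_owned(),
        width: width.parse().ok()?,
        height: height.parse().ok()?,
    })
}

fn main() {
    for spec in ["desktop=1440x1100", "mobile=390x1800", "broken"] {
        match parse_viewport(spec) {
            Some(v) => println!("{}: {} by {} px", v.name, v.width, v.height),
            None => println!("{spec}: not a valid viewport"),
        }
    }
}
```
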
Audit reports are versioned JSON. They include an aggregate status, capture URL,
elapsed time, effective high-level options, and one rubric result per screenshot.
Use `--fail-on-rubric` when CI should fail on rubric failures or rubric errors.
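
The `--fail-on-rubric` behavior amounts to a gate over the per-screenshot results. The sketch below works under stated assumptions: the `Outcome` enum, the `should_fail` function, and the exit code of 1 are hypothetical, and the report's real status values are not documented here.

```rust
/// Hypothetical per-screenshot outcome; the real report schema may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Outcome {
    Pass,
    Fail,
    Error,
}

/// With the flag set, any rubric failure or rubric error fails the run.
/// Without it, this sketch treats rubric results as advisory only.
fn should_fail(outcomes: &[Outcome], fail_on_rubric: bool) -> bool {
    fail_on_rubric && outcomes.iter().any(|o| *o != Outcome::Pass)
}

fn main() {
    let outcomes = [Outcome::Pass, Outcome::Fail, Outcome::Error];
    // Assumed convention: a non-zero exit code signals failure to CI.
    let exit_code = if should_fail(&outcomes, true) { 1 } else { 0 };
    println!("exit code: {exit_code}");
}
```
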

For Home Manager-managed setups, keep backend selection in
`~/.config/visual-rubric/config.toml` and run:
```sh
visual-rubric configured \
  --image site-desktop.png \
  --question "Does this page stay readable?"
```
Set `mode = "direct"` for direct `codex-acp` screenshot review, or
`mode = "pipeline"` for Qwen3-VL extraction followed by ACP rubric scoring.
`--mode direct` and `--mode pipeline` override the TOML mode for one run.

For manual inspection without rubric evaluation:
```sh
visual-rubric serve --root website/public --port 1111
```
The model must return strict JSON:
```json
{ "verdict": "pass", "reason": "short reason", "anomalies": [] }
```