agent-image-diff 0.2.0

Structured image diff with JSON output for agent workflows

agent-image-diff

Perceptual image diffing CLI that outputs structured JSON. Built for LLM agents that compare screenshots in iterative loops: output is compact by default to minimise context window usage.

Install

npm install -g agent-image-diff

Or with Cargo:

cargo install agent-image-diff

Quick Start

Compare two images:

agent-image-diff baseline.png candidate.png
{"match":false,"diff_percentage":1.0,"regions":[{"id":1,"bounding_box":{"x":20,"y":30,"width":10,"height":10},"label":"color-change"}]}

Images match:

agent-image-diff expected.png actual.png
{"match":true,"diff_percentage":0.0,"regions":[]}

Output Schema

Default (compact)

Optimised for agents. Single line, only actionable fields.

Field                   Type     Description
match                   bool     true if no changed regions were detected
diff_percentage         float    Percentage of pixels that differ, rounded to 1 decimal place
regions                 array    List of changed regions, largest first
regions[].id            int      Region identifier (1-indexed)
regions[].bounding_box  object   {x, y, width, height} in pixels, measured from the top-left corner
regions[].label         string   One of: added, removed, color-change, content-change
dimension_mismatch      object?  Present only when images have different dimensions

Verbose (-v)

Adds diagnostic fields for debugging. Not recommended for agent loops.

agent-image-diff baseline.png candidate.png -v --pretty
{
  "dimensions": { "width": 100, "height": 100 },
  "stats": {
    "changed_pixels": 100,
    "total_pixels": 10000,
    "diff_percentage": 1.0,
    "region_count": 1,
    "antialiased_pixels": 0
  },
  "match": false,
  "regions": [
    {
      "id": 1,
      "bounding_box": { "x": 20, "y": 30, "width": 10, "height": 10 },
      "pixel_count": 100,
      "avg_delta": 0.628,
      "max_delta": 0.628,
      "label": "color-change"
    }
  ]
}

Additional verbose fields: dimensions, stats.*, regions[].pixel_count, regions[].avg_delta, regions[].max_delta.
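
The figures in the verbose example are consistent with diff_percentage being changed_pixels divided by total_pixels, times 100, rounded to one decimal place. A minimal sketch of that arithmetic, assuming this is how the value is derived (the helper name diff_percentage is hypothetical and not part of the crate's API):

```rust
/// Hypothetical helper: percentage of changed pixels, rounded to 1 decimal place.
/// Assumes diff_percentage = changed_pixels / total_pixels * 100.
fn diff_percentage(changed_pixels: u64, total_pixels: u64) -> f64 {
    if total_pixels == 0 {
        return 0.0; // avoid dividing by zero for empty images
    }
    let pct = changed_pixels as f64 * 100.0 / total_pixels as f64;
    (pct * 10.0).round() / 10.0
}

fn main() {
    // Values from the verbose example: 100 changed pixels in a 100x100 image.
    println!("{:.1}", diff_percentage(100, 10_000)); // prints 1.0
}
```

An agent that only needs a coarse "how much changed" signal can read diff_percentage from the compact output and skip the stats block entirely.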

Summary (-f summary)

Human-readable text output.

Images differ.
  Dimensions: 100x100
  Changed pixels: 100 (1.00%)
  Regions: 1

  Region #1 [color-change]: 10x10 at (20,30) — 100 pixels, avg delta 0.628, max delta 0.628

Quiet (-q)

Suppresses all stdout. Still writes the diff image if -o is provided.

Region Labels

Label           Meaning
added           Content present in candidate but not baseline (new element appeared)
removed         Content present in baseline but not candidate (element disappeared)
color-change    Same structure, different colours (theme change, hover state)
content-change  Structure differs (text changed, layout shifted)
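
Because the label is one of four fixed strings, a consumer can parse it into a closed enum and reject anything unexpected. The sketch below is illustrative only: RegionLabel is a hypothetical type defined here, and the crate's own label type may look different.

```rust
use std::fmt;
use std::str::FromStr;

/// The four region labels from the table above, as a closed set.
/// (Illustrative type; not taken from the crate's API.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RegionLabel {
    Added,
    Removed,
    ColorChange,
    ContentChange,
}

impl RegionLabel {
    /// The exact string used in the JSON output.
    fn as_str(self) -> &'static str {
        match self {
            Self::Added => "added",
            Self::Removed => "removed",
            Self::ColorChange => "color-change",
            Self::ContentChange => "content-change",
        }
    }
}

impl FromStr for RegionLabel {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "added" => Ok(Self::Added),
            "removed" => Ok(Self::Removed),
            "color-change" => Ok(Self::ColorChange),
            "content-change" => Ok(Self::ContentChange),
            other => Err(format!("unknown region label: {other}")),
        }
    }
}

impl fmt::Display for RegionLabel {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

fn main() {
    let label: RegionLabel = "color-change".parse().expect("known label");
    println!("{label}"); // prints color-change
}
```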

Visual Diff Image

Generate a diff image alongside any output format:

agent-image-diff baseline.png candidate.png -o diff.png

The diff image shows the candidate with semi-transparent colour overlays and borders on each changed region.

Parameters

Flag                Default  Description
-t, --threshold     0.1      Colour difference tolerance (0.0 = exact, 1.0 = match everything)
--denoise           25       Remove noise clusters smaller than N pixels before analysis
--dilate            0        Expand diff mask by N pixels to bridge nearby changes
--merge-distance    50       Merge regions within N pixels of each other
--min-region-size   25       Filter out regions smaller than N pixels
--connectivity      8        Pixel connectivity for clustering: 4 (cross) or 8 (with diagonals)
--detect-antialias  true     Ignore anti-aliased edge pixels
-f, --format        json     Output format: json, summary, image
-o, --output        (none)   Write visual diff image to path
-v, --verbose       off      Include all diagnostic fields in JSON
--pretty            off      Pretty-print JSON
-q, --quiet         off      Suppress stdout
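
To make the --connectivity choice concrete, here is a small self-contained sketch of connected component counting on a boolean diff mask. It is a generic flood fill written for illustration, not the crate's implementation, and count_components is a hypothetical name. Two pixels that touch only at a corner form two regions under 4-connectivity and one region under 8-connectivity.

```rust
/// Count connected components of `true` cells in a row-major mask.
/// `connectivity` is 4 (up/down/left/right) or 8 (adds the diagonals).
/// Generic flood fill for illustration; not the crate's own code.
fn count_components(mask: &[bool], width: usize, height: usize, connectivity: u8) -> usize {
    const CROSS: [(i32, i32); 4] = [(0, -1), (-1, 0), (1, 0), (0, 1)];
    const DIAGONALS: [(i32, i32); 4] = [(-1, -1), (1, -1), (-1, 1), (1, 1)];
    assert_eq!(mask.len(), width * height, "mask must be width * height cells");

    // Diagonal neighbours are only visited in 8-connectivity mode.
    let diagonals: &[(i32, i32)] = if connectivity == 8 { &DIAGONALS } else { &[] };
    let mut visited = vec![false; mask.len()];
    let mut count = 0;

    for start in 0..mask.len() {
        if !mask[start] || visited[start] {
            continue;
        }
        // Unvisited changed pixel: start a new component and flood-fill it.
        count += 1;
        visited[start] = true;
        let mut stack = vec![start];
        while let Some(idx) = stack.pop() {
            let (x, y) = ((idx % width) as i32, (idx / width) as i32);
            for &(dx, dy) in CROSS.iter().chain(diagonals) {
                let (nx, ny) = (x + dx, y + dy);
                if nx < 0 || ny < 0 || nx >= width as i32 || ny >= height as i32 {
                    continue;
                }
                let n = ny as usize * width + nx as usize;
                if mask[n] && !visited[n] {
                    visited[n] = true;
                    stack.push(n);
                }
            }
        }
    }
    count
}

fn main() {
    // 2x2 mask with changed pixels on the main diagonal only.
    let mask = [true, false, false, true];
    println!("4-connectivity: {}", count_components(&mask, 2, 2, 4)); // prints 4-connectivity: 2
    println!("8-connectivity: {}", count_components(&mask, 2, 2, 8)); // prints 8-connectivity: 1
}
```

In practice this suggests that 4 may split diagonal strokes and thin slanted edges into several regions, while the default of 8 keeps them together.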

Recipes

Strict pixel-exact comparison (CI screenshot tests):

agent-image-diff baseline.png candidate.png -t 0.0 --denoise 0

Web UI screenshots (the defaults already tolerate font rendering noise):

agent-image-diff baseline.png candidate.png

Detailed debugging (all fields, formatted):

agent-image-diff baseline.png candidate.png -v --pretty

Diff image only (no JSON):

agent-image-diff baseline.png candidate.png -q -o diff.png

Library Usage

use agent_image_diff::{diff_images, DiffOptions};

fn main() {
    // Load both images and convert them to 8-bit RGBA buffers.
    let baseline = image::open("baseline.png").expect("failed to open baseline.png").to_rgba8();
    let candidate = image::open("candidate.png").expect("failed to open candidate.png").to_rgba8();

    // Run the diff with the default options.
    let result = diff_images(&baseline, &candidate, DiffOptions::default());

    // Report each changed region when the images do not match.
    if !result.is_match {
        for region in &result.regions {
            println!("Region {} [{}]: {}x{} at ({},{})",
                region.id, region.label,
                region.bounding_box.width, region.bounding_box.height,
                region.bounding_box.x, region.bounding_box.y,
            );
        }
    }
}

How It Works

  1. Compare — Per-pixel YIQ perceptual colour distance with configurable threshold
  2. Anti-alias filter — Detects and excludes anti-aliased edge pixels
  3. Denoise — Removes small noise clusters (font rendering artifacts, compression noise)
  4. Dilate — Optional morphological expansion to bridge nearby changed pixels
  5. Cluster — Connected component labelling groups changed pixels into regions
  6. Merge — Combines regions within --merge-distance pixels of each other
  7. Classify — Labels each region as added, removed, color-change, or content-change
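
Step 1 can be illustrated with a short sketch of a YIQ colour distance and threshold test. It assumes the RGB-to-YIQ coefficients, channel weights, and 35215 normalisation constant used by pixelmatch-style tools; the text above does not state that this crate uses the same values. Alpha is ignored, and all function names are illustrative.

```rust
/// Convert 8-bit RGB to YIQ (luma plus two chroma components).
/// Coefficients follow the pixelmatch convention (an assumption here).
fn rgb_to_yiq(r: u8, g: u8, b: u8) -> (f64, f64, f64) {
    let (r, g, b) = (r as f64, g as f64, b as f64);
    let y = 0.29889531 * r + 0.58662247 * g + 0.11448223 * b;
    let i = 0.59597799 * r - 0.27417610 * g - 0.32180189 * b;
    let q = 0.21147017 * r - 0.52261711 * g + 0.31114694 * b;
    (y, i, q)
}

/// Weighted squared YIQ distance between two RGB pixels, divided by 35215,
/// the constant pixelmatch treats as the largest possible value.
fn yiq_delta(a: [u8; 3], b: [u8; 3]) -> f64 {
    const MAX_DELTA: f64 = 35215.0;
    let (y1, i1, q1) = rgb_to_yiq(a[0], a[1], a[2]);
    let (y2, i2, q2) = rgb_to_yiq(b[0], b[1], b[2]);
    let (dy, di, dq) = (y1 - y2, i1 - i2, q1 - q2);
    (0.5053 * dy * dy + 0.299 * di * di + 0.1957 * dq * dq) / MAX_DELTA
}

/// A pixel counts as changed when its normalised delta exceeds threshold squared.
/// With threshold 0.0 any difference counts; with 1.0 nothing does.
fn is_changed(a: [u8; 3], b: [u8; 3], threshold: f64) -> bool {
    yiq_delta(a, b) > threshold * threshold
}

fn main() {
    let grey = [100, 100, 100];
    let nearly_grey = [101, 101, 101];
    // A one-level brightness shift is below the default threshold of 0.1 ...
    println!("{}", is_changed(grey, nearly_grey, 0.1)); // prints false
    // ... but is flagged by a pixel-exact comparison.
    println!("{}", is_changed(grey, nearly_grey, 0.0)); // prints true
}
```

Weighting luma more heavily than chroma is what makes the distance "perceptual": a brightness shift of a given size counts for more than a hue shift of the same size.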