aiseo
Agent-first CLI for SEO, GEO (generative engine optimisation), and AEO (answer engine optimisation) audits.
Built on agent-cli-framework. The binary is the interface: every command emits a JSON envelope when piped, coloured human output in a terminal, and aiseo agent-info returns the full machine-readable capability manifest.
This CLI is designed for coding agents (Claude Code, Codex CLI, Gemini CLI) to use as a tool. Humans get a colour fallback, but the surface is shaped around what an agent needs to make decisions about a page.
Install
Then drop the wrapper skill into your AI agent platforms:
This writes a small SKILL.md into ~/.claude/skills/aiseo/, ~/.codex/skills/aiseo/, and ~/.gemini/skills/aiseo/. The skill itself is a one-screen pointer — the documentation lives inside the binary via aiseo agent-info.
Use
Four user commands:
# or any previous suggestion still present
Compose with anything that emits HTML or Markdown
`aiseo audit -` reads from stdin. HTML vs Markdown is sniffed from the first non-whitespace character. This lets the CLI plug into any extraction tool without growing its own crawler:
# Plain curl for static pages
|
# Via the `search` CLI for JS-heavy or anti-bot pages (firecrawl, stealth, browserless)
| |
# Pre-deploy: pipe a built page through aiseo as part of the build
|
Markup-only signals (meta tags, OG, schema, freshness) require raw HTML — most extraction tools strip these, so the markup half of the audit will come back empty. Content signals (keywords, entities, voice, position-bias) work on either.
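The format sniffing mentioned above can be pictured with a few lines of Python. This is a minimal sketch, not the binary's actual code: the function name `sniff_format` is hypothetical, and the rule that a leading `<` means HTML is an assumption (the README only says the first non-whitespace character decides).

```python
def sniff_format(text: str) -> str:
    """Guess the input format from the first non-whitespace character.

    Assumption: a leading '<' means HTML; anything else (including
    empty input) is treated as Markdown.
    """
    stripped = text.lstrip()
    return "html" if stripped.startswith("<") else "markdown"
```

A one-character check like this would keep stdin handling cheap, at the cost of misreading Markdown documents that open with an inline HTML tag.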
What audit returns
A single typed JSON envelope. Agents read sub-objects directly; humans read the suggestion list.
- `meta`: `<title>`, description, keywords, author, canonical
- `open_graph`, `twitter_card`: social preview surfaces
- `schema_types`: every `@type` found in JSON-LD blocks (e.g. `["Article", "FAQPage"]`)
- `content`: H1/H2/H3 lists, word count, presence flags
- `keywords`: `{ primary[], questions[], density{} }`
- `entities`: `{ people[{name, credentials?}], organizations[] }`
- `evidence`: `{ stat_count, quote_count, unsupported_claims[] }`
- `voice`: `{ featured_snippet_candidate, speakable_eligible, avg_sentence_words }`
- `ai_slop`: `{ signals[], density_per_1000_words, verdict }`. Regex-only LLM-writing detector (negation pivots, `delve` family, `tapestry` cluster, false-conclusion openers, bold-colon headers, etc.). Em-dashes are deliberately NOT flagged: they are a broken signal in 2026, and Boris-style British English uses them. `verdict` is `clean | suspicious | likely_ai`, by confidence-weighted density per 1000 words.
- `information_gain`: `{ score: 0..10, counts, samples[] }`. Google's March-2026 Core Update made Information Gain a dominant content-quality signal. Counts named-source quotes, sample-size disclosures (`n=…`), year-over-year deltas, first-person evidence (`we analysed…`), methodology disclosure, numbered citations. Below 5 starts deducting from the score; below 2 hits hard.
- `metatext`: `{ signals[], heading_skeleton: { jaccard, matched[] }, weighted_score_per_1000_words, verdict }`. Catches the agent-speaking-instead-of-content class of slop the lexical detector misses: process narration ("I'll start by…"), self-identification ("As an AI…"), closing pleasantries ("Hope this helps"), bracket asides, markdown envelopes, hedge stacks, sycophancy ("Great question!"). Plus a novel heading-skeleton Jaccard detector that flags pages whose outline matches the canonical AI table-of-contents (Introduction / Background / Key Features / FAQ / Conclusion). Position-weighted: openers in the first 15% and pleasantries in the last 20% weigh more.
- `copy_precision`: `{ score: 0..10, counts, densities, verdict }`. Positive score. Velvet-glove discipline: rewards tight prose, penalises filler words (`very`, `really`, `actually`), `-ly` adverbs, hedged modals, empty-emphasis adjectives (`crucial`, `key`, `vital`), throat-clearing openers, low sentence-length variance, long Latinate words, absent concrete nouns. Verdicts: `tight` (≥8), `mid` (5..7), `padded` (<5), or `insufficient_content` when the body is too short (<20 words) to assess honestly.
- `design_slop`: `{ findings[], counts, verdict }`. Visual / typographic / colour tells of AI-generated UIs in 2026. Catches the indigo-CTA cluster, gradient text, the AI purple/violet palette, `cubic-bezier` overshoot bouncing, monotonous spacing, `everything-centered`, single-font pages, the overused-font list (Inter, Geist, Fraunces, Plus Jakarta Sans, etc.), shadcn's unmodified `:root` palette, next-forge defaults, and the 2026 Claude/Anthropic Fraunces + warm-brown italic wave. Pattern bank ported from Paul Bakaus's `pbakaus/impeccable` (Apache-2.0); see `NOTICE`.
- `position_bias`: word-offset percentages of TL;DR / first stat / first credential, with warnings when high-leverage signals sit past the first 30% of the page
- `freshness`: `dateModified`, `datePublished`, days since last modification, year mentions, plus the first `<time datetime>` value, the visible "Updated …" label, and a `schema_vs_visible_severity` (`none`/`mild`/`severe`) that flags pages whose JSON-LD claims a fresher date than the visible text
- `performance`: `{ inline_style_bytes, inline_script_bytes, external_stylesheets, external_scripts, render_blocking_estimate, lazy_loaded_images, eager_below_fold_estimate, font_links }`. Markup-only Core Web Vitals proxies; no headless browser needed.
- `link_graph`: `{ internal_count, external_count, authority_external, broken_anchors[], nofollow_external_pct }`. Catches orphan pages, missing authority citations, broken in-page anchors.
- `score`: rough 0–100, weighted toward AI-citation surface
- `score_breakdown`: per-component deductions: `{ name, deducted, reason }[]`
- `suggestions`: flat list of concrete next actions
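Three of the heuristics above are simple enough to sketch in Python: the heading-skeleton Jaccard score, the 30% position-bias warning, and the `copy_precision` verdict bands. The function names are hypothetical and the details are assumptions rather than the binary's real logic: headings are normalised by lowercasing and trimming, offsets are counted in words from the start of the page, and a non-integer score between 7 and 8 is treated as `mid`.

```python
# Canonical AI table-of-contents named in the list above.
CANONICAL_OUTLINE = {"introduction", "background", "key features", "faq", "conclusion"}


def heading_skeleton(headings):
    """Jaccard similarity between a page's headings and the canonical outline."""
    found = {h.strip().lower() for h in headings if h.strip()}
    matched = sorted(found & CANONICAL_OUTLINE)
    union = found | CANONICAL_OUTLINE  # never empty: the canonical set has 5 items
    return {"jaccard": len(matched) / len(union), "matched": matched}


def late_signals(word_offsets, total_words, limit_pct=30.0):
    """Names of signals whose first occurrence sits past limit_pct of the page."""
    if total_words <= 0:
        return []
    return [
        name
        for name, offset in word_offsets.items()
        if 100.0 * offset / total_words > limit_pct
    ]


def copy_precision_verdict(score, word_count):
    """Map a 0..10 score to the verdict bands listed above."""
    if word_count < 20:
        return "insufficient_content"  # too short to assess honestly
    if score >= 8:
        return "tight"
    if score >= 5:
        return "mid"
    return "padded"
```

For instance, a page whose headings are Introduction, Background, Pricing, FAQ, and Conclusion shares four of six distinct headings with the canonical outline, so `heading_skeleton` reports a Jaccard of 4/6. The threshold at which the real detector flags a page is not stated here, so the sketch returns the raw ratio only.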
Output formats
| `--out` extension | What you get | Use case |
|---|---|---|
| `.json` | Pretty JSON envelope | Programmatic, default |
| `.html` | Self-contained printable report, serif typography, no JS, no CDN | Share with a client, print, attach to a PR |
| `.sarif` | SARIF 2.1.0 | GitHub Code Scanning annotations |
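For readers unfamiliar with SARIF, the sketch below shows the general shape of a minimal SARIF 2.1.0 log that a code-scanning consumer accepts. It is not aiseo's actual output: the function name, the `aiseo/suggestion-N` rule ids, and the choice of `warning` as the level are all assumptions made for illustration.

```python
def suggestions_to_sarif(suggestions, artifact_uri):
    """Wrap plain-text suggestions in a minimal SARIF 2.1.0 log."""
    results = [
        {
            "ruleId": f"aiseo/suggestion-{i}",  # hypothetical rule id scheme
            "level": "warning",                 # assumed severity
            "message": {"text": text},
            "locations": [
                {"physicalLocation": {"artifactLocation": {"uri": artifact_uri}}}
            ],
        }
        for i, text in enumerate(suggestions, start=1)
    ]
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": "aiseo"}}, "results": results}],
    }
```

Each result carries a message and a file location, which is what lets a scanner attach the finding to a path in a pull request.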
Closing the loop with verify
LLMs and coding agents often claim work they did not finish. verify is the gate:
# ...agent edits page.html...
Returns a typed diff:
The exit code is 1 if any previous suggestion is still present or a new suggestion has appeared (a regression). The audit data is written to stdout regardless; the gate only flips the exit code.
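The gate logic can be sketched in Python as a set comparison between the suggestions before and after an edit. This is an illustration under assumptions, not the binary's implementation: the `VerifyDiff` type, its field names, and the idea that suggestions are compared as exact strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VerifyDiff:
    fixed: list          # present before, gone now
    still_present: list  # present before and still present
    regressed: list      # absent before, present now

    @property
    def exit_code(self) -> int:
        # The gate fails if anything is unfixed or newly introduced.
        return 1 if self.still_present or self.regressed else 0


def diff_suggestions(before, after) -> VerifyDiff:
    """Compare two suggestion lists, preserving their original order."""
    before_set, after_set = set(before), set(after)
    return VerifyDiff(
        fixed=[s for s in before if s not in after_set],
        still_present=[s for s in before if s in after_set],
        regressed=[s for s in after if s not in before_set],
    )
```

Keeping the diff separate from the exit code mirrors the behaviour described above: the data is always available, and the gate is a single derived bit.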
Why this exists
The Python skill at claude-skill-seo-geo-optimizer was a 13-script bundle stitched together with regex and urllib. It worked, but the install footprint was big, the parsing was brittle, and it was ergonomically wrong for an agent: too many subcommands, too much documentation in the skill file, and no way to call it without Python on the box.
aiseo collapses the same surface into a single binary, parses HTML with scraper instead of regex, returns honest typed JSON, and ships zero documentation in the skill — the binary describes itself.
Research basis (May 2026)
The audit's heuristics are grounded in the post-Gemini-3 AI search landscape:
- Position bias — first ~30% of a page captures ~44% of AI-search citations (iPullRank, AIBoost 2026).
- Schema — useful for entity clarity and rich results, but only ~+2.4% AI Mode citation lift in the Ahrefs 1,885-page test. Don't oversell it.
- Freshness — Perplexity and the post-Gemini-3 AI Overviews both lift recently-modified content.
- Named credentials and primary-source citations — held up in the AgenticGEO and Citation Selection vs Absorption studies (2026).
- FAQ rich results — retired by Google on 7 May 2026, but FAQ schema still matters for Bing, Brave, DuckDuckGo, and as a structural signal for AI platforms.
Exit codes
- `0`: success
- `1`: transient (IO / network); retry
- `2`: config error; fix setup
- `3`: bad input; fix arguments
- `4`: rate limited; wait and retry
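An agent wrapper might turn these codes into a next step along the following lines. The action labels and function names are hypothetical. The sketch covers only the generic table above; `verify` also uses exit code 1 as its gate signal, so a caller would presumably need to interpret that code per command.

```python
# Exit code -> suggested next step, following the list above.
EXIT_ACTIONS = {
    0: "done",
    1: "retry",           # transient IO / network failure
    2: "fix_setup",       # config error
    3: "fix_arguments",   # bad input
    4: "wait_and_retry",  # rate limited
}


def next_action(exit_code: int) -> str:
    """Return the suggested action, or 'unknown' for unlisted codes."""
    return EXIT_ACTIONS.get(exit_code, "unknown")


def should_retry(exit_code: int) -> bool:
    """Only transient failures and rate limits are worth retrying unchanged."""
    return exit_code in (1, 4)
```

Separating retryable codes from those that need a change in setup or arguments stops an agent from looping on a failure it cannot fix by waiting.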
Licence
MIT. © Boris Djordjevic, 199 Biotechnologies.