harn-rules
The declarative structural rule engine for Harn: the Rust core behind
harn rules / lint / codemod surfaces. Part of the Rule Engine program
(Epic A, harn#2827).
A rule says what to match and optionally how to rewrite it. The
engine compiles the rule against the tree-sitter machinery in harn-hostlib
and produces matches with metavariable bindings — the structural complement
to regex/glob search.
This crate ships the atomic matching tier (harn#2832), the relational + composite algebra (harn#2833), the predicate + rewrite layer (harn#2834), the safety + idempotency gate (harn#2835), and the whole-project scan lifecycle (harn#2836), with Harn-only semantic capture metadata for resolved bindings and simple static types (harn#2882).
Rule shape (TOML)
id = "destructure-with-defaults"
language = "typescript"
severity = "warning" # info | warning (default) | error
message = "Collapse `?.x ?? default`"
fix = "{ $KEY: $SRC }" # presence makes the rule a codemod
[rule] # the matcher block; keep it LAST
pattern = "$SRC?.$KEY ?? $DEFAULT" # one of: pattern | kind | regex
Key ordering: because [rule] opens a TOML table, every scalar field (id, language, severity, message, fix) must appear before it.
A rule's kind is derived from its shape: a fix makes it a codemod; a
message with no fix makes it a lint; a bare matcher is a search.
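The shape-to-kind rule above can be sketched as a small function. This is only an illustration: `RuleKind` and `derive_kind` are hypothetical names, not the crate's API.

```rust
/// Hypothetical model of the three rule kinds; not the crate's real type.
#[derive(Debug, PartialEq)]
enum RuleKind {
    Search,
    Lint,
    Codemod,
}

/// Derive the kind from which optional fields are present.
fn derive_kind(message: Option<&str>, fix: Option<&str>) -> RuleKind {
    match (message, fix) {
        // Any fix makes the rule a codemod, with or without a message.
        (_, Some(_)) => RuleKind::Codemod,
        // A message and no fix is a lint.
        (Some(_), None) => RuleKind::Lint,
        // A bare matcher is a search.
        (None, None) => RuleKind::Search,
    }
}

fn main() {
    println!("{:?}", derive_kind(Some("Collapse"), Some("{ $KEY: $SRC }")));
    println!("{:?}", derive_kind(Some("Avoid this call"), None));
    println!("{:?}", derive_kind(None, None));
}
```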
Atomic matcher forms
- pattern: a code snippet in the target grammar with $VAR metavariable holes. Compiled to a tree-sitter query: each $VAR becomes a capture, the snippet's operators/keywords are matched literally (so ?? ≠ ||), and a repeated $VAR unifies (must bind identical text). Variadic $$$ holes land with the relational tier (#2833).
- kind: a bare tree-sitter node kind (e.g. "call_expression").
- regex: a regular expression over the source text.
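The unification rule for a repeated $VAR can be sketched independently of tree-sitter. The sketch assumes captures arrive as (name, text) pairs; `bind` and `unify` are hypothetical helpers, not functions from this crate.

```rust
use std::collections::HashMap;

/// Record `name -> text`. A repeated metavariable must bind identical text.
fn bind<'a>(bindings: &mut HashMap<&'a str, &'a str>, name: &'a str, text: &'a str) -> bool {
    match bindings.get(name) {
        Some(existing) => *existing == text,
        None => {
            bindings.insert(name, text);
            true
        }
    }
}

/// Unify all captures of one candidate match, or reject the candidate.
fn unify<'a>(captures: &[(&'a str, &'a str)]) -> Option<HashMap<&'a str, &'a str>> {
    let mut bindings = HashMap::new();
    for &(name, text) in captures {
        if !bind(&mut bindings, name, text) {
            return None;
        }
    }
    Some(bindings)
}

fn main() {
    // Pattern `$A == $A` against `x == x`: both holes bind "x".
    println!("{:?}", unify(&[("A", "x"), ("A", "x")]));
    // Against `x == y`: the second binding conflicts, so there is no match.
    println!("{:?}", unify(&[("A", "x"), ("A", "y")]));
}
```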
A metavar-free pattern is a literal pattern: foo() matches calls to
foo specifically (every non-metavar identifier/literal is constrained to
its exact text).
A metavar can carry a typed $VAR:kind constraint (#2839) so it binds
only to nodes of a syntactic class: log($ARG:identifier) matches log(x)
but not log(f()). :kind is a semantic alias (expr/expression,
stmt/statement, ty/type, ident/identifier, resolved to the
grammar's supertype) or an exact tree-sitter kind. A constraint that names no
kind in the target grammar is a compile error — the supertype aliases exist in
some grammars (expression in TypeScript/JS/Python) but not others
(Rust/Go), where an exact kind is used instead.
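A rough sketch of how a typed hole might be parsed and its alias resolved. `parse_metavar`, `canonical_kind`, and `resolve_kind` are hypothetical, and the grammar kind lists in `main` are made-up stand-ins rather than real grammar contents.

```rust
/// Split a hole such as `$ARG:identifier` into its name and optional kind.
fn parse_metavar(hole: &str) -> Option<(&str, Option<&str>)> {
    let body = hole.strip_prefix('$')?;
    match body.split_once(':') {
        Some((name, kind)) if !name.is_empty() && !kind.is_empty() => Some((name, Some(kind))),
        Some(_) => None, // `$:kind` or `$ARG:` is malformed
        None if !body.is_empty() => Some((body, None)),
        None => None,
    }
}

/// Expand the short aliases; anything else is taken as an exact kind.
fn canonical_kind(kind: &str) -> &str {
    match kind {
        "expr" => "expression",
        "stmt" => "statement",
        "ty" => "type",
        "ident" => "identifier",
        other => other,
    }
}

/// A constraint that names no kind in the target grammar is an error.
fn resolve_kind<'a>(kind: &'a str, grammar_kinds: &[&str]) -> Result<&'a str, String> {
    let canonical = canonical_kind(kind);
    if grammar_kinds.contains(&canonical) {
        Ok(canonical)
    } else {
        Err(format!("unknown kind `{}` in target grammar", kind))
    }
}

fn main() {
    // Stand-in kind lists, for illustration only.
    let with_supertype = ["expression", "identifier"];
    let without_supertype = ["call_expression", "identifier"];
    println!("{:?}", parse_metavar("$ARG:identifier"));
    println!("{:?}", resolve_kind("expr", &with_supertype));
    println!("{:?}", resolve_kind("expr", &without_supertype));
}
```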
Relational + composite algebra
Beyond the atomic leaf, a rule node can add relational and composite keys — all ANDed. A node matches iff its atomic part matches and every other key holds:
[rule]
pattern = "let $NAME = $SRC?.$KEY ?? $DEF"
[rule.inside] # ancestor must match this sub-rule
kind = "statement_block"
stopBy = "end" # neighbor (default) | end | <rule>
[rule.not.inside] # composite `not` of a relational `inside`
kind = "try_statement"
stopBy = "end"
- Relational: inside (ancestor), has (descendant), follows / precedes (siblings), each a sub-rule tuned by stopBy and field (restrict to a tree-sitter field).
- Composite: all / any (lists of sub-rules), not (a sub-rule), and matches (reference a [utils.NAME] utility rule by id).
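To make the inside relation concrete, here is a toy ancestor walk over a flat node table. It assumes that a stopBy of neighbor checks only the direct parent and that end walks to the root; that reading, and the `Node`, `StopBy`, and `inside` names, are assumptions of this sketch, and the `<rule>` stop form is left out.

```rust
/// Toy syntax node: a kind plus the index of its parent.
struct Node {
    kind: &'static str,
    parent: Option<usize>,
}

#[derive(Clone, Copy)]
enum StopBy {
    Neighbor, // assumed: look at the direct parent only
    End,      // assumed: keep walking up to the root
}

/// Does `start` have an ancestor of kind `wanted` within the stopBy range?
fn inside(nodes: &[Node], start: usize, wanted: &str, stop_by: StopBy) -> bool {
    let mut current = nodes[start].parent;
    while let Some(idx) = current {
        if nodes[idx].kind == wanted {
            return true;
        }
        match stop_by {
            StopBy::Neighbor => return false,
            StopBy::End => current = nodes[idx].parent,
        }
    }
    false
}

fn main() {
    // program > try_statement > statement_block > lexical_declaration
    let nodes = [
        Node { kind: "program", parent: None },
        Node { kind: "try_statement", parent: Some(0) },
        Node { kind: "statement_block", parent: Some(1) },
        Node { kind: "lexical_declaration", parent: Some(2) },
    ];
    let in_block = inside(&nodes, 3, "statement_block", StopBy::End);
    let in_try = inside(&nodes, 3, "try_statement", StopBy::End);
    // All keys are ANDed: inside a block AND not inside a try.
    println!("rule holds: {}", in_block && !in_try);
}
```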
where constraints, transform, and fix
A rule can narrow matches with where predicates, synthesize new metavars
with transform, and rewrite with fix:
id = "snakeify-getters"
language = "typescript"
fix = "$SNAKE()" # interpolates $VAR / ${VAR} (and $$ -> $)
[rule]
pattern = "$FN()"
[[]] # keep only matches that pass every predicate
= "FN"
= "^get[A-Z]" # or: comparison = { op = ">", value = 100 }
# or: pattern = "..." (recursive sub-pattern)
[] # derive a new metavar before fixing
= "FN"
= "snake" # or: replace = { regex, by } / substring = { start, end }
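A minimal sketch of the two steps this rule combines: a snake-case transform and fix interpolation with $VAR, ${VAR}, and $$. `to_snake` and `interpolate` are hypothetical helpers. Leaving an unknown variable as literal text is an assumption here, not documented engine behavior.

```rust
use std::collections::HashMap;

/// camelCase -> snake_case for ASCII identifiers.
fn to_snake(name: &str) -> String {
    let mut out = String::new();
    for (i, ch) in name.chars().enumerate() {
        if ch.is_ascii_uppercase() {
            if i > 0 {
                out.push('_');
            }
            out.push(ch.to_ascii_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}

/// Substitute `$VAR` and `${VAR}`; `$$` yields a literal `$`.
fn interpolate(template: &str, vars: &HashMap<String, String>) -> String {
    let chars: Vec<char> = template.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        if chars[i] != '$' {
            out.push(chars[i]);
            i += 1;
            continue;
        }
        if chars.get(i + 1) == Some(&'$') {
            out.push('$');
            i += 2;
            continue;
        }
        let braced = chars.get(i + 1) == Some(&'{');
        let start = if braced { i + 2 } else { i + 1 };
        let mut end = start;
        while end < chars.len()
            && (chars[end].is_ascii_uppercase() || chars[end].is_ascii_digit() || chars[end] == '_')
        {
            end += 1;
        }
        let name: String = chars[start..end].iter().collect();
        let closed = !braced || chars.get(end) == Some(&'}');
        match vars.get(&name) {
            Some(value) if closed && !name.is_empty() => {
                out.push_str(value);
                i = if braced { end + 1 } else { end };
            }
            // Assumption: an unknown or malformed hole stays literal.
            _ => {
                out.push('$');
                i += 1;
            }
        }
    }
    out
}

fn main() {
    let mut vars = HashMap::new();
    vars.insert("FN".to_string(), "getUserName".to_string());
    // Derive SNAKE from FN before interpolating the fix.
    let snake = to_snake(&vars["FN"]);
    vars.insert("SNAKE".to_string(), snake);
    println!("{}", interpolate("$SNAKE()", &vars));
}
```

The braced form matters when the variable is followed by characters that could extend its name, as in `${SNAKE}_v2`.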
For Harn rules, captures are also enriched with semantic metadata when the
engine can resolve the node to a local declaration/binding or infer a simple
type from an annotation/literal. The string captures stay in captures; the
metadata is exposed separately as capture_metadata.
id = "global-target-call"
language = "harn"
[rule]
pattern = "$FN($ARG)"
[[]]
= "FN"
resolvesTo = { name = "target", kind = "fn", line = 1 } # 1-based line
[[]]
= "ARG"
= "int"
resolvesTo accepts any subset of id, name, kind, line, and column;
id is <kind>:<name>@<line>:<column> using 1-based line/column. This is a
Harn-only first cut: cross-language name/type resolvers are intentionally not
invented here.
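The id format and the "any subset" matching can be sketched as follows. `Resolved` and `ResolvesTo` are hypothetical structs that mirror the field names above, not the crate's types.

```rust
/// What a capture resolved to (hypothetical shape).
struct Resolved {
    kind: String,
    name: String,
    line: u32,   // 1-based
    column: u32, // 1-based
}

impl Resolved {
    /// `<kind>:<name>@<line>:<column>`
    fn id(&self) -> String {
        format!("{}:{}@{}:{}", self.kind, self.name, self.line, self.column)
    }
}

/// A predicate with any subset of the fields set; unset fields match anything.
#[derive(Default)]
struct ResolvesTo {
    id: Option<String>,
    name: Option<String>,
    kind: Option<String>,
    line: Option<u32>,
    column: Option<u32>,
}

impl ResolvesTo {
    fn matches(&self, r: &Resolved) -> bool {
        self.id.as_ref().map_or(true, |v| *v == r.id())
            && self.name.as_ref().map_or(true, |v| *v == r.name)
            && self.kind.as_ref().map_or(true, |v| *v == r.kind)
            && self.line.map_or(true, |v| v == r.line)
            && self.column.map_or(true, |v| v == r.column)
    }
}

fn main() {
    let target = Resolved { kind: "fn".into(), name: "target".into(), line: 1, column: 4 };
    let pred = ResolvesTo {
        name: Some("target".into()),
        kind: Some("fn".into()),
        line: Some(1),
        ..Default::default()
    };
    println!("{} matches: {}", target.id(), pred.matches(&target));
}
```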
CompiledRule::apply(source) runs the rule, drops matches that fail any
constraint, interpolates each match's fix (from its captured + transformed
metavars), and splices the replacements in — format-preserving, the same
byte-splice guarantee as ast.batch_apply. It returns the rewritten source
plus the per-match edits; the caller decides whether to write.
Safety, applicability, and idempotency
A rule declares a safety tier — format-only → behavior-preserving →
scope-local (default) → surface-changing → capability-changing →
needs-human. The two safest map to machine-applicable; the rest are
suggestions (opt-in). The gate:
- apply always computes the preview (and reports safety, applicability, and whether the fix is idempotent).
- auto_apply refuses anything above behavior-preserving, so the runner never silently applies a risky fix.
- apply_checked additionally fails if the fix is not idempotent (re-running it produces further changes; it never reaches a fixed point).
- diagnostics(source) emits one diagnostic per match (message, severity, span, applicability, interpolated fix): the mapping surface the linter and LSP convert into LintDiagnostic / FixEdit.
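A sketch of the gate logic, assuming the tiers are ordered as listed. `Safety`, `Applicability`, `gate_auto_apply`, and `is_idempotent` are hypothetical stand-ins, and a fix is modeled as a plain function from source to source.

```rust
/// Tiers in increasing order of risk; derive(Ord) follows declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Safety {
    FormatOnly,
    BehaviorPreserving,
    ScopeLocal,
    SurfaceChanging,
    CapabilityChanging,
    NeedsHuman,
}

#[derive(Debug, PartialEq)]
enum Applicability {
    MachineApplicable,
    Suggestion,
}

/// The two safest tiers are machine-applicable; the rest are suggestions.
fn applicability(safety: Safety) -> Applicability {
    if safety <= Safety::BehaviorPreserving {
        Applicability::MachineApplicable
    } else {
        Applicability::Suggestion
    }
}

/// Refuse to apply anything above behavior-preserving.
fn gate_auto_apply(safety: Safety, source: &str, fix: impl Fn(&str) -> String) -> Result<String, String> {
    if applicability(safety) != Applicability::MachineApplicable {
        return Err(format!("refusing to auto-apply a {:?} fix", safety));
    }
    Ok(fix(source))
}

/// A fix is idempotent here if a second run changes nothing.
fn is_idempotent(source: &str, fix: impl Fn(&str) -> String) -> bool {
    let once = fix(source);
    fix(&once) == once
}

fn main() {
    let rename = |s: &str| s.replace("var ", "let ");
    let wrap = |s: &str| s.replace("x", "(x)");
    println!("{:?}", gate_auto_apply(Safety::BehaviorPreserving, "var a", rename));
    println!("{:?}", gate_auto_apply(Safety::ScopeLocal, "var a", rename));
    println!("wrap idempotent: {}", is_idempotent("x", wrap));
}
```

The `wrap` closure shows why the check matters: each run nests another pair of parentheses, so it never reaches a fixed point.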
Usage
use ;
let rule = from_toml_str?;
let compiled = compile?;
for m in compiled.run?
Load from disk with load_rule_file(path) or load_rule_dir(dir).
Whole-project lifecycle
For rules that must see the whole repo before editing — or that create /
delete files (import insertion, codegen, dead-code removal) — implement a
ScanningRecipe (OpenRewrite-style): a deterministic, path-sorted scan
pass folds every file into a typed accumulator, then a generate pass turns
that state into a set of FileChanges (Edit / Create / Delete).
use ;
// Run a declarative codemod across a project (per-file, no scan state):
let run = run_recipe?;
for change in &run.changes
run_recipe returns the changes; the caller (a CLI, the staged filesystem)
decides whether to write and harn fmt them.
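The scan-then-generate shape can be sketched with a toy trait. Everything here is hypothetical: the trait signature, the `FileChange` fields, the `run` driver, and the `import "..."` line syntax are inventions for illustration, not the crate's ScanningRecipe API or Harn syntax.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, PartialEq)]
enum FileChange {
    Edit { path: String, new_text: String },
    Create { path: String, text: String },
    Delete { path: String },
}

/// Two-pass recipe: fold every file into an accumulator, then emit changes.
trait ScanningRecipe {
    type Acc: Default;
    fn scan(&self, acc: &mut Self::Acc, path: &str, text: &str);
    fn generate(&self, acc: &Self::Acc, files: &BTreeMap<String, String>) -> Vec<FileChange>;
}

/// A BTreeMap iterates in key order, so the scan is path-sorted and deterministic.
fn run<R: ScanningRecipe>(recipe: &R, files: &BTreeMap<String, String>) -> Vec<FileChange> {
    let mut acc = R::Acc::default();
    for (path, text) in files {
        recipe.scan(&mut acc, path, text);
    }
    recipe.generate(&acc, files)
}

/// Toy recipe: delete files that no other file imports.
struct RemoveUnimported;

impl ScanningRecipe for RemoveUnimported {
    type Acc = BTreeSet<String>;

    fn scan(&self, acc: &mut Self::Acc, _path: &str, text: &str) {
        for line in text.lines() {
            // Made-up import syntax: `import "file"`.
            if let Some(rest) = line.strip_prefix("import ") {
                acc.insert(rest.trim().trim_matches('"').to_string());
            }
        }
    }

    fn generate(&self, acc: &Self::Acc, files: &BTreeMap<String, String>) -> Vec<FileChange> {
        files
            .keys()
            .filter(|p| p.as_str() != "main.harn" && !acc.contains(p.as_str()))
            .map(|p| FileChange::Delete { path: p.clone() })
            .collect()
    }
}

fn main() {
    let mut files = BTreeMap::new();
    files.insert("main.harn".to_string(), "import \"util.harn\"\nmain()\n".to_string());
    files.insert("util.harn".to_string(), "fn helper() {}\n".to_string());
    files.insert("old.harn".to_string(), "fn unused() {}\n".to_string());
    for change in run(&RemoveUnimported, &files) {
        println!("{:?}", change);
    }
}
```

The recipe cannot decide anything about `old.harn` until every file has been scanned, which is the reason for separating the two passes.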
Data tables (report-only)
data_table(rule, files) runs a rule across a project without editing and
returns a columnar DataTable — one row per match (path, position, text,
metavar bindings) plus a metrics summary (total findings, files, per-file
counts). It serializes to JSON for inventory / impact analysis / audit:
let table = data_table?;
println!; // { rule_id, columns, rows, summary }
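As a rough picture of the report shape, the sketch below collects one row per match and derives the summary metrics. `Row`, `Summary`, `collect_rows`, and `summarize` are hypothetical. A plain substring search stands in for a real rule, and JSON serialization is omitted.

```rust
use std::collections::BTreeMap;

/// One row per match (metavar bindings omitted in this sketch).
#[derive(Debug)]
struct Row {
    path: String,
    line: usize,   // 1-based
    column: usize, // 1-based, counted in bytes
    text: String,
}

#[derive(Debug)]
struct Summary {
    total_findings: usize,
    files: usize,
    per_file: BTreeMap<String, usize>,
}

/// Stand-in matcher: every occurrence of `needle` becomes a row.
fn collect_rows(files: &BTreeMap<String, String>, needle: &str) -> Vec<Row> {
    let mut rows = Vec::new();
    for (path, source) in files {
        for (i, line) in source.lines().enumerate() {
            for (col, hit) in line.match_indices(needle) {
                rows.push(Row {
                    path: path.clone(),
                    line: i + 1,
                    column: col + 1,
                    text: hit.to_string(),
                });
            }
        }
    }
    rows
}

fn summarize(rows: &[Row]) -> Summary {
    let mut per_file = BTreeMap::new();
    for row in rows {
        *per_file.entry(row.path.clone()).or_insert(0) += 1;
    }
    Summary { total_findings: rows.len(), files: per_file.len(), per_file }
}

fn main() {
    let mut files = BTreeMap::new();
    files.insert("a.ts".to_string(), "foo(); foo();\nbar();\n".to_string());
    files.insert("b.ts".to_string(), "foo();\n".to_string());
    let rows = collect_rows(&files, "foo(");
    for row in &rows {
        println!("{}:{}:{} {}", row.path, row.line, row.column, row.text);
    }
    println!("{:?}", summarize(&rows));
}
```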