texform-transform 0.1.0

Profile-based AST transform engine for TeXForm (internal; use the texform crate)
# Transform Rules

This directory stores concrete transform rules.

## Attribute Markers

Do not add one-off transform rules for declarative-scope commands such as
`\bf`, `\rm`, `\large`, or `\displaystyle`, or for registered prefix wrappers
such as `\mathbf` and `\textbf`. These markers are handled by the dedicated
LowerAttributes phase, which runs both before and after normal rule execution.
The data source for that phase is `src/lower_attributes/data.yaml`.

Ordinary rewrite rules can assume registered attribute markers have already
been lowered to the canonical form by the time they run.

## Adding a New Rule

1. Create a new `.rs` file under the package/level/group layout, for example
   `base/standard/over_family/over_to_frac.rs` or
   `physics/standard/trace_alias/trace_to_tr.rs`.
2. Define and export exactly one `pub static MY_RULE: MyRuleType`, where the
   static's name is the UPPER_SNAKE_CASE form of the file stem.
3. That's it: the build script auto-discovers the file and registers it.

No manual edits to `mod.rs` are required. The `build.rs` at the crate root
scans this directory at compile time, generates a standard Rust module tree in
`generated.rs`, and aggregates every rule constant into `ALL_RULES`.
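
As a rough illustration of the naming convention in step 2, the sketch below derives the expected static name from a file stem. `expected_static_name` is a hypothetical helper, not part of `build.rs`, and it assumes that UPPER_SNAKE_CASE simply means ASCII-uppercasing the stem.

```rust
/// Hypothetical helper (not the real build script): derive the static name
/// that a rule file is expected to export.
/// Assumption: UPPER_SNAKE_CASE is the ASCII-uppercased file stem.
fn expected_static_name(file_stem: &str) -> String {
    file_stem.to_ascii_uppercase()
}
```

Under that assumption, `over_to_frac.rs` is expected to export `OVER_TO_FRAC` and `trace_to_tr.rs` is expected to export `TRACE_TO_TR`.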

## File Layout

Each rule must live in a single file.

Keep the following pieces together in that file:

1. The rule type itself
2. Small rule-local helpers
3. Inline tests under `#[cfg(test)] mod tests`

`mod.rs` contains only `include!("generated.rs")`; the generated file is
tracked so release verification can confirm it is up to date without modifying
the package source. Do not edit the generated registry by hand.

Rules use this directory structure:

```text
<package>/<level>/<directory_group>/<rule_file_stem>.rs
```

- `package` is the owning rule package, such as `base`, `ams`, or `physics`.
- `level` is the normalization level: `standard`, `expand`, `drop`, or
  `equiv`.
- `directory_group` is the Rust module path segment for a human-readable group.
  It must be snake_case. It is not part of `RuleMeta` and does not affect
  scheduling.
- `rule_file_stem` is the rule id/rule name with `-` converted to `_`.
  Uppercase letters from the rule id are preserved, for example
  `implies-to-Longrightarrow` becomes `implies_to_Longrightarrow.rs`.

Rule group directories must use standard snake_case Rust module names. Rule
file stems must be valid Rust module names and may contain ASCII uppercase
letters when the rule id does. Group names written as dash slugs elsewhere are
converted to snake_case Rust path segments here.

The rule key remains `<package>/<name>`, independent of the directory group.
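
The naming rules above can be sketched as a few string conversions. The function names are hypothetical and exist only for illustration; the sketch also assumes that group slugs are plain ASCII, so replacing dashes and lowercasing is enough to reach snake_case.

```rust
/// Convert a rule id such as "over-to-frac" into its file stem.
/// Dashes become underscores; uppercase letters are preserved.
fn rule_file_stem(rule_id: &str) -> String {
    rule_id.replace('-', "_")
}

/// Convert a dash-slug group name into a snake_case module path segment.
/// Assumption: slugs are ASCII, so dash replacement plus lowercasing suffices.
fn directory_group(group_slug: &str) -> String {
    group_slug.replace('-', "_").to_ascii_lowercase()
}

/// Build `<package>/<level>/<directory_group>/<rule_file_stem>.rs`.
fn rule_path(package: &str, level: &str, group_slug: &str, rule_id: &str) -> String {
    format!(
        "{}/{}/{}/{}.rs",
        package,
        level,
        directory_group(group_slug),
        rule_file_stem(rule_id)
    )
}

/// The rule key is `<package>/<name>`; the directory group plays no part.
fn rule_key(package: &str, rule_id: &str) -> String {
    format!("{}/{}", package, rule_id)
}
```

For example, `rule_path("base", "standard", "over-family", "over-to-frac")` yields `base/standard/over_family/over_to_frac.rs`, while `rule_key("base", "over-to-frac")` stays `base/over-to-frac` regardless of the group.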

Prefer defining metadata as a function-local static inside `meta()`:

```rust
fn meta(&self) -> &'static RuleMeta {
    static META: RuleMeta = RuleMeta { /* ... */ };
    &META
}
```

This keeps the metadata physically close to the trait implementation without
adding another file-level symbol for every rule.

For repeated rule shells, prefer the crate-private authoring macros:

```rust
use crate::rewrite::{alias_rule, char_targets, cmd_targets, define_rule, env_targets};
```

These macros are intentionally local to `texform-transform`; they are
ergonomics helpers for builtin rules, not a public rule-definition API.

## Scheduling With `triggers`

`RuleMeta.triggers` is the required scheduling entry list. The engine attempts
the rule only on nodes matching `triggers`.

For ordinary single-target rules, set `triggers` to the eliminated target. Use
a smaller trigger list when a rule consumes multiple targets but has a smaller
natural entry point. Examples include owner-command structures such as
`\buildrel ... \over ...` and `\root ... \of ...`.

Do not use `triggers` to hide dependencies. Every trigger target must also
appear in `eliminates` or `touches`.

```rust
triggers: cmd_targets![&base::cmd::OVER],
consumes: RuleConsumes {
    eliminates: cmd_targets![&base::cmd::OVER],
    touches: &[],
},
```

```rust
triggers: cmd_targets![&base::cmd::BUILDREL],
consumes: RuleConsumes {
    eliminates: cmd_targets![&base::cmd::BUILDREL],
    touches: cmd_targets![&base::cmd::OVER],
},
```

In the second example, `cmd:over` is a touched separator inside the structure,
not a global eliminated-form contract owned by `buildrel-expand`.
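
The constraint that every trigger must also appear in `eliminates` or `touches` can be sketched as a small check. `Target` and `Scheduling` are simplified stand-ins invented for this example, not the real `RuleTarget` and `RuleMeta` types, and the crate may enforce the constraint differently.

```rust
/// Simplified stand-in for a rule target: a kind tag plus a name.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Target {
    Command(&'static str),
    Environment(&'static str),
    Character(&'static str),
}

/// Hypothetical, trimmed-down view of a rule's scheduling fields.
struct Scheduling<'a> {
    triggers: &'a [Target],
    eliminates: &'a [Target],
    touches: &'a [Target],
}

/// Return every trigger that is declared in neither `eliminates` nor `touches`.
/// An empty result means the trigger list hides no dependencies.
fn undeclared_triggers(s: &Scheduling) -> Vec<Target> {
    s.triggers
        .iter()
        .copied()
        .filter(|t| !s.eliminates.contains(t) && !s.touches.contains(t))
        .collect()
}
```

With the `buildrel` metadata shown earlier the result is empty. A rule that triggered on `over` while declaring only `buildrel` would be reported.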

## Builtin Record Imports

Always import builtin records through an explicit package module:

```rust
use texform_knowledge::builtin::base;
use texform_knowledge::builtin::ams;
use texform_knowledge::builtin::bboldx;
```

When referencing builtin records in `consumes` or `produces`, always use
the package-qualified path:

```rust
RuleTarget::Command(&base::cmd::FRAC)
RuleTarget::Environment(&ams::env::ALIGN)
RuleTarget::Character(&bboldx::chars::BBDOTLESSI)
```

The target contract is package-insensitive: each target means `kind + name`.
The package-qualified Rust path exists because `RuleTarget` stores a concrete
builtin record reference. For each `kind + name`, choose the first package that
defines that record in texform package import order.
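
A minimal sketch of the "first package in import order" choice follows. The `Record` tuple, the function name, and the sample data in the usage note are assumptions made for illustration; the real import order and record tables live in texform and are not reproduced here.

```rust
/// Hypothetical record table entry: (package, kind, name).
type Record = (&'static str, &'static str, &'static str);

/// Pick the first package, in import order, that defines `kind + name`.
/// Returns `None` when no imported package defines the record.
fn first_defining_package<'a>(
    import_order: &[&'a str],
    records: &[Record],
    kind: &str,
    name: &str,
) -> Option<&'a str> {
    import_order.iter().copied().find(|pkg| {
        records
            .iter()
            .any(|(p, k, n)| p == pkg && *k == kind && *n == name)
    })
}
```

If two packages both define a command with the same name, the one listed earlier in `import_order` is the one whose Rust path should appear in the rule metadata.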

## Package Variants

Do not duplicate same-name package variants in rule metadata. `RuleConsumes`
and `RuleProduces` are interpreted as `kind + name`, so each target appears once:

```rust
use texform_knowledge::builtin::base;

consumes: RuleConsumes {
    eliminates: cmd_targets![&base::cmd::OVER],
    touches: &[],
},
produces: RuleProduces {
    targets: cmd_targets![&base::cmd::FRAC],
},
```

If the same command, environment, or character name exists in multiple packages,
choose the first builtin record by texform package import order.
`enabled_by_packages` declares which input packages make the rule loadable; it
does not constrain which package supplies a produced target.

Package-specific split decisions are based on structural signatures:

1. Commands use `CommandKind + argspec.source`
2. Environments use `argspec.source + body_mode`
3. Same signature means one rule with all matching packages in
   `enabled_by_packages`
4. Different signatures mean separate rules
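
The split decision for commands can be sketched as grouping package variants by signature. `CommandSignature` is a stand-in with string fields, and the argspec strings in the usage note are placeholders, not real texform argspec syntax.

```rust
use std::collections::BTreeMap;

/// Hypothetical structural signature of a command: kind plus argspec source.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct CommandSignature {
    kind: &'static str,
    argspec_source: &'static str,
}

/// Group the packages that define a same-name command by signature.
/// Each group corresponds to one rule, and its package list is what would
/// go into that rule's `enabled_by_packages`.
fn split_by_signature(
    variants: &[(&'static str, CommandSignature)],
) -> BTreeMap<CommandSignature, Vec<&'static str>> {
    let mut groups: BTreeMap<CommandSignature, Vec<&'static str>> = BTreeMap::new();
    for (package, signature) in variants {
        groups.entry(signature.clone()).or_default().push(*package);
    }
    groups
}
```

Two packages that share a signature land in one group and so share one rule; a third package with a different argspec gets its own group and therefore its own rule.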

The transform plan collapses every target to `RuleTargetKey` (`kind + name`) for
topological sort, cleanup-boundary checks, mutation filtering, and
eliminated-form derivation.
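
To make the collapse concrete, the sketch below drops the package from each target and deduplicates the remainder. The tuple aliases are stand-ins for the real `RuleTarget` and `RuleTargetKey` types, whose actual definitions are not shown in this document.

```rust
use std::collections::BTreeSet;

/// Stand-in for a package-qualified target: (package, kind, name).
type PackagedTarget = (&'static str, &'static str, &'static str);

/// Stand-in for `RuleTargetKey`: (kind, name), with the package dropped.
type TargetKey = (&'static str, &'static str);

/// Collapse package-qualified targets to their `kind + name` keys.
/// Same-name variants from different packages map to a single key.
fn collapse_targets(targets: &[PackagedTarget]) -> BTreeSet<TargetKey> {
    targets
        .iter()
        .map(|&(_package, kind, name)| (kind, name))
        .collect()
}
```

This is why listing the same command once per package adds nothing: both entries collapse to the same key before planning.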

## `define_rule!`

Use `define_rule!` when the rule metadata is regular but the AST rewrite logic
still needs ordinary Rust code:

```rust
define_rule! {
    pub static OVER_TO_FRAC: OverToFracRule {
        key: Base / "over-to-frac",
        level: Standard,
        summary: "Rewrite infix \\over into prefix \\frac",
        fidelity: Approximate,
        enabled_by_packages: [Base],
        consumes: RuleConsumes {
            eliminates: cmd_targets![&base::cmd::OVER],
            touches: &[],
        },
        produces: RuleProduces {
            targets: cmd_targets![&base::cmd::FRAC],
        },
        apply(rule, cx, node_id) {
            // normal Rust body
        }
    }
}
```

Prefer this macro for rules that:

1. Rebuild nodes or subtrees
2. Need shape validation with `RuleContext`
3. Need bespoke matching logic beyond simple rename canonicalization

The inline form exposes `Self::KEY`, and rule bodies should bind a scoped
context with `cx.for_rule(Self::KEY)` for shape checks and argument extraction.
The explicit rule variable remains available for rare cases that need the full
rule value.

When IDE navigation matters more than keeping the body inline, use the
`apply_fn: path` variant and move the rewrite code into a normal function.

## Choosing `level` and `fidelity`

Classify a rule by asking which profile first accepts its output as a suitable
product, then declare the rule's fidelity independently. `fidelity` does not
determine `level`: an `Equiv` rule may still be `Full` when its output is
pixel-identical but too expanded to serve as a corpus label, as with fenced
matrix environment expansion.

A rule's `fidelity` is the worst-case render-fidelity guarantee over its
declared input domain and must not fall below its level's floor. See
`RuleFidelity` in the `texform-transform` README for the fidelity ladder, the
per-level floor table, and how to document a rule whose worst case is rarer
than its usual behavior (such as `displaylines-to-gather-env`).

## `alias_rule!`

Use `alias_rule!` only for prefix-command canonicalization where aliases and
the canonical command share the same `allowed_mode` and `argspec.source`, and
the rule only renames the command:

```rust
alias_rule! {
    pub static TRACE_TO_TR: TraceToTrRule {
        key: Physics / "trace-to-tr",
        level: Standard,
        summary: "Canonicalize \\Tr, \\trace, and \\Trace into \\tr",
        fidelity: Full,
        enabled_by_packages: [Physics],
        canonical: &physics::cmd::TR,
        aliases: [
            &physics::cmd::TR_2,
            &physics::cmd::TRACE,
            &physics::cmd::TRACE_2,
        ],
    }
}
```

`alias_rule!` enforces only structural invariants:

1. Canonical and alias commands must all be `Prefix`
2. `allowed_mode` must match
3. `argspec.source` must match
4. The alias list must be non-empty and must not contain the canonical command
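
The four invariants can be sketched as a standalone check. `CommandRecord` and `CommandKind` here are simplified stand-ins with string fields, and the variant names other than `Prefix` are assumptions. How and when the real macro performs these checks is not described in this document.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CommandKind {
    Prefix,
    Infix,
    Declarative,
}

/// Simplified stand-in for a builtin command record.
#[derive(Debug, Clone, Copy)]
struct CommandRecord {
    name: &'static str,
    kind: CommandKind,
    allowed_mode: &'static str,
    argspec_source: &'static str,
}

/// Check the structural invariants listed above; returns the first violation.
fn check_alias_invariants(
    canonical: &CommandRecord,
    aliases: &[CommandRecord],
) -> Result<(), String> {
    if aliases.is_empty() {
        return Err("alias list must be non-empty".to_string());
    }
    if canonical.kind != CommandKind::Prefix {
        return Err(format!("canonical `{}` is not Prefix", canonical.name));
    }
    for alias in aliases {
        if alias.name == canonical.name {
            return Err(format!("`{}` is listed as its own alias", alias.name));
        }
        if alias.kind != CommandKind::Prefix {
            return Err(format!("alias `{}` is not Prefix", alias.name));
        }
        if alias.allowed_mode != canonical.allowed_mode {
            return Err(format!("alias `{}` has a different allowed_mode", alias.name));
        }
        if alias.argspec_source != canonical.argspec_source {
            return Err(format!("alias `{}` has a different argspec.source", alias.name));
        }
    }
    Ok(())
}
```

Note that nothing in this check looks at what the commands mean; it only compares their shape, which matches the "structural invariants only" wording above.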

`alias_rule!` declares aliases as eliminated commands and the canonical command
as the produced command. The engine attempts the rule when the current node
matches one of the alias command names.

Do not use `alias_rule!` for:

1. Package-variant handling of same-name commands
2. Character-backed commands such as base `\Re`
3. Infix, declarative, or environment canonicalization
4. Rules that need any AST surgery beyond renaming a prefix command

## Sugar Macros

Use the small metadata helpers when they reduce noise:

```rust
cmd_targets![&base::cmd::FRAC]
env_targets![&ams::env::ALIGN]
char_targets![&bboldx::chars::BBDOTLESSI]
```

These macros only wrap builtin paths into `RuleTarget::*` arrays. They do not
infer package variants, enabled packages, canonical forms, or any other rule
semantics.

## Shared Helper Imports

For shared transform helpers, import the specific functions you use:

```rust
use crate::rewrite::helpers::{mandatory_content_slot, prefix_command_node};
```

Use `RuleContext` helpers for node matching and shape checks when possible:

```rust
let Some(infix) = cx.match_infix(node_id, &base::cmd::OVER) else {
    return Ok(RuleEffect::Skipped);
};
let subject = infix.subject();
cx.for_rule(Self::KEY).expect_no_args(infix.args, &subject)?;
```

Preferred style:

1. Keep package prefixes for builtin records, such as `base::cmd::OVER`
2. Import shared constructor helpers directly, such as `prefix_command_node` and `mandatory_content_slot`
3. Prefer `RuleContext` match helpers and scoped shape helpers over open-coded `match` + repeated error construction

## Transform Profiles

Use `BuildConfig` to select a profile and narrow rules in tests and examples:

```rust
let context = TransformContext::from_build_config(
    BuildConfig::profile(Profile::Authoring).only_rule_for_tests(OVER_TO_FRAC.meta().key),
    &parse_ctx,
)?;
let report = context.run(&mut ast, &parse_ctx)?;
```

Profiles select rewrite rules by normalization level before rule-specific filters are applied:

- `Authoring` includes `NormalizationLevel::Standard`
- `Faithful` includes `NormalizationLevel::Standard` and `NormalizationLevel::Expand`
- `Corpus` includes `NormalizationLevel::Standard`, `NormalizationLevel::Expand`, and
  `NormalizationLevel::Drop`
- `Equiv` includes all levels

Rule-specific filters are allowlists or denylists inside the selected profile;
they do not enable an `expand`, `drop`, or `equiv` rule when the profile level
set excludes it.
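
The selection order described above (profile level set first, rule filter second) can be sketched as follows. `Level`, `Profile`, and `RuleFilter` are stand-ins written for this example; they mirror the names in this section but are not the crate's actual types, and `RuleFilter` in particular is an assumed shape for the allowlist and denylist filters.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Standard,
    Expand,
    Drop,
    Equiv,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Profile {
    Authoring,
    Faithful,
    Corpus,
    Equiv,
}

impl Profile {
    /// Normalization levels included by each profile, as listed above.
    fn levels(self) -> &'static [Level] {
        match self {
            Profile::Authoring => &[Level::Standard],
            Profile::Faithful => &[Level::Standard, Level::Expand],
            Profile::Corpus => &[Level::Standard, Level::Expand, Level::Drop],
            Profile::Equiv => &[Level::Standard, Level::Expand, Level::Drop, Level::Equiv],
        }
    }
}

/// Hypothetical rule filter applied inside the selected profile.
enum RuleFilter<'a> {
    All,
    Only(&'a [&'a str]),
    Except(&'a [&'a str]),
}

/// A rule is selected only if the profile includes its level AND the
/// filter admits its key. The filter can never widen the level set.
fn is_selected(profile: Profile, level: Level, key: &str, filter: &RuleFilter) -> bool {
    if !profile.levels().contains(&level) {
        return false;
    }
    match filter {
        RuleFilter::All => true,
        RuleFilter::Only(keys) => keys.contains(&key),
        RuleFilter::Except(keys) => !keys.contains(&key),
    }
}
```

In this sketch, allowlisting an `Expand` rule under `Profile::Authoring` still returns `false`, because the level check runs before the filter is consulted.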