Axis 2: tool-call trajectory divergence.
For each response, extract the sequence of tool-call tokens that capture both the structural shape (tool name + sorted arg keys) AND the argument values (8-byte digest of canonical-JSON input). Compare baseline vs candidate sequences with Levenshtein edit distance. Normalize by max(len(baseline_seq), len(candidate_seq)) so the metric is in [0, 1].
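The comparison step can be sketched as follows. This is a minimal illustration rather than the crate's actual code: the function names `levenshtein` and `trajectory_divergence` are hypothetical, unit edit costs are assumed, and treating two empty sequences as zero divergence is an assumption the description above does not state.

```rust
/// Levenshtein edit distance between two token sequences, with unit cost
/// for insertion, deletion, and substitution (assumed cost model).
fn levenshtein<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    // prev[j] holds the distance between the processed prefix of `a`
    // and the first j tokens of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, x) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, y) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(x != y);
            let delete = prev[j + 1] + 1;
            let insert = curr[j] + 1;
            curr.push(substitute.min(delete).min(insert));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Edit distance normalized by the longer sequence, so the result is in [0, 1].
fn trajectory_divergence<T: PartialEq>(baseline: &[T], candidate: &[T]) -> f64 {
    let longest = baseline.len().max(candidate.len());
    if longest == 0 {
        // Assumption: two empty trajectories are treated as identical.
        return 0.0;
    }
    levenshtein(baseline, candidate) as f64 / longest as f64
}
```

For example, a baseline of three tool-call tokens against a candidate that drops the middle one has edit distance 1 and a normalized divergence of 1/3.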
Why include the value digest: a sequence that calls the same tools
in the same order with different argument values is a real
behavioural change (e.g. delete_user(id="alice") vs
delete_user(id="bob")). Without the value digest the per-axis
trajectory metric reports zero divergence on this case — even
though the alignment-based first-divergence detector picks it up
via its W_ARGS component. The value digest brings the per-axis
number in line with the alignment finding.
The digest is the leading 8 bytes (16 hex chars) of SHA-256 over
the canonical-JSON serialisation of the input object. By the
birthday bound p ≈ n²/2^65 for a 64-bit digest, the collision
probability at 16 hex chars is ~2.7e-14 for 1000 tool calls,
which is negligible for any realistic agent trace.
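To illustrate why the value digest matters, the sketch below builds a shape-only token and a shape-plus-value token for one call. Everything in it is hypothetical: the `ToolCall` type, the token layout, and the helper names are not taken from the crate. In particular, `std`'s `DefaultHasher` is used only as a dependency-free stand-in for the truncated SHA-256 digest, and the key/value loop is a simplification of canonical-JSON serialisation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical tool call: a name plus string-valued arguments.
/// BTreeMap iterates in sorted key order, which keeps tokens stable.
struct ToolCall {
    name: String,
    args: BTreeMap<String, String>,
}

/// Convenience constructor for the examples.
fn tool_call(name: &str, args: &[(&str, &str)]) -> ToolCall {
    ToolCall {
        name: name.to_string(),
        args: args
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect(),
    }
}

/// Shape-only token: tool name + sorted argument keys.
fn shape_token(call: &ToolCall) -> String {
    let keys: Vec<&str> = call.args.keys().map(String::as_str).collect();
    format!("{}({})", call.name, keys.join(","))
}

/// Shape + value token. STAND-IN: DefaultHasher replaces the truncated
/// SHA-256 digest here; it happens to yield 64 bits (16 hex chars) too.
fn value_token(call: &ToolCall) -> String {
    let mut hasher = DefaultHasher::new();
    for (key, value) in &call.args {
        key.hash(&mut hasher);
        value.hash(&mut hasher);
    }
    format!("{}#{:016x}", shape_token(call), hasher.finish())
}
```

With these helpers, `delete_user(id="alice")` and `delete_user(id="bob")` produce the same shape token but different value tokens, so only the value-aware sequence registers an edit.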
§Coverage cross-references
What this axis catches:
- Tool added / dropped / reordered (structural)
- Tool argument keys added / dropped (schema)
- Tool argument values changed (digest mismatch, v2.7+)
What it does NOT catch:
- Same tool sequence + same arg values + different RESPONSE
  text: that’s a content regression visible on the semantic
  axis (axis 1) and via the v2.7+
  text_chars_log / numeric_token_density / error_token_flag
  dimensions of shadow.statistical.fingerprint (Hotelling T²).
- Tool sequence policy violations (“verify before refund”,
  “no execute_sql without preview”): the LTLf checker
  (shadow.ltl) with must_call_before / no_call rules.
- First moment of regression: the alignment module
  (shadow_core::diff::alignment) walks both traces and points to
  the exact turn where divergence began, with kind classification
  (Structural / Decision / Style).
Functions§
- compute: Compute the tool-trajectory axis.