# Demand Propagation TODO

This is the remaining work from the demand-propagated tape streaming plan.

## Consolidation

- [x] Use the shared `plan::demand` model in symbolic pipeline normalization.
- [x] Move demand propagation adapters out of parser-facing `parse::chain_ir`.
- [x] Move chain operator spec/demand tests out of `parse::chain_ir`; parser chain IR now only owns operator identity.
- [x] Treat unknown or unregistered builtins as conservative barriers for demand propagation.
- [ ] Keep builtin planning facts centralized in `builtins::defs` / `builtins::registry`.

## Precision

- [x] Add field-level projection demand, e.g. `Projection(fields)`.
- [x] Split coarse value needs into count-only, exists-only, predicate-only, numeric-only, projection, and whole-value needs.
- [x] Make pure one-to-one map delay an explicit physical-plan annotation.
- [ ] Let delayed projection compose through chained maps and flush only at predicates, barriers, or final materialization. Symbolic/view paths compose today, and a durable projection slot exists via payload/source annotations, but executor-wide flushing is not yet implemented.
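
As a rough illustration of the split value needs listed above, the sketch below models them as an enum with a conservative merge. The `ValueNeed` type, its variant names, and the `merge`/`fields` helpers are assumptions made for this example; the real `plan::demand` model may be shaped differently.

```rust
use std::collections::BTreeSet;

/// Hypothetical value-need kinds; jetro's real `plan::demand` types may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ValueNeed {
    CountOnly,
    ExistsOnly,
    PredicateOnly,
    NumericOnly,
    Projection(BTreeSet<String>),
    Whole,
}

impl ValueNeed {
    /// Merge two needs placed on the same value by different consumers.
    /// Anything this sketch cannot reason about widens to `Whole`.
    pub fn merge(self, other: ValueNeed) -> ValueNeed {
        match (self, other) {
            (a, b) if a == b => a,
            // Two projections need the union of their fields.
            (ValueNeed::Projection(mut a), ValueNeed::Projection(b)) => {
                a.extend(b);
                ValueNeed::Projection(a)
            }
            // Counting rows also answers whether any row exists.
            (ValueNeed::CountOnly, ValueNeed::ExistsOnly)
            | (ValueNeed::ExistsOnly, ValueNeed::CountOnly) => ValueNeed::CountOnly,
            _ => ValueNeed::Whole,
        }
    }

    /// Convenience constructor for a field-level projection need.
    pub fn fields(names: &[&str]) -> ValueNeed {
        ValueNeed::Projection(names.iter().map(|n| n.to_string()).collect())
    }
}
```

Widening to `Whole` on any unrecognized pair mirrors the conservative-barrier rule used for unknown builtins: precision is lost, but correctness is not.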

## Source Capabilities

- [x] Attach source capabilities to each physical source instead of relying on a single view-array profile.
- [x] Keep direct first, last, nth, reverse, bounded-prefix, and materialized fallback selection in one capability-driven path.
- [x] Add explicit fallback-boundary annotations so correctness-preserving fallbacks are visible in plans.

## Builtins

- [ ] Add demand metadata for more scalar/object helpers where it is safe.
- [ ] Keep `flat_map`, full `sort`, `group_by`, `unique`, and unknown builtins conservative unless metadata proves a bounded strategy.
- [ ] Add bounded top-k support only for ordering-safe `sort(...).take(n)` / suffix combinations.
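
One possible shape for the bounded top-k item is sketched below. It assumes an ascending, stable `sort` followed by `take(n)`, and uses a max-heap of `(key, index)` pairs so that ties keep the earlier row. The function name `top_k_by_key` is hypothetical and is not an existing jetro API.

```rust
use std::collections::BinaryHeap;

/// Returns the rows a stable ascending sort by `key` followed by `take(n)`
/// would return, while holding at most `n + 1` entries at a time.
pub fn top_k_by_key<T: Clone, K: Ord>(rows: &[T], n: usize, key: impl Fn(&T) -> K) -> Vec<T> {
    let mut heap: BinaryHeap<(K, usize)> = BinaryHeap::new();
    for (i, row) in rows.iter().enumerate() {
        // The index breaks ties: among equal keys the later row is evicted first,
        // which matches what a stable sort would keep.
        heap.push((key(row), i));
        if heap.len() > n {
            heap.pop(); // evict the current largest (key, index) pair
        }
    }
    heap.into_sorted_vec()
        .into_iter()
        .map(|(_, i)| rows[i].clone())
        .collect()
}
```

This runs in O(m log n) time for m rows instead of the O(m log m) of a full sort. It is only a valid substitute when nothing between `sort` and `take` depends on the full ordered sequence, which is what the suffix legality check is meant to establish.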

## Tests And Benchmarks

- [x] Add call-count tests for `map(test).first()`, `map(test).last()`, and `map(test).take(n)`.
- [x] Add access-count tests for indexed tape first, last, nth, and fallback scans.
- [x] Add unsafe-chain regression tests for `filter(...).last()`, `take_while(...).last()`, `drop_while(...).last()`, `flat_map(...).last()`, and ordered suffixes.
- [x] Add benchmarks for large-array `map().last()`, `filter().last()`, and `sort/take/map` chains.

## High-Performance Algorithmic Plan

The next phase should optimize for asymptotic wins and low materialization, not just add more builtin metadata.

### Core Plan

- [x] Add field-level projection demand, including nested field paths.
- [x] Split demand into at least two lanes:
  - [x] `scan_need`: fields/value form needed while scanning, filtering, sorting, or selecting.
  - [x] `result_need`: fields/value form needed only for emitted winners.
- [x] Add explicit late projection slots to the physical plan. Planner metadata now exposes the lanes and stores them on `Pipeline`.
- [ ] Compose chained pure one-to-one maps into pending projection kernels. Symbolic composition exists; plan-level projection slots remain to be done.
- [ ] Flush pending projections only when required by predicates, barriers, mutation/update semantics, or final materialization.
- [ ] Keep high-performance logic generic; do not add query-shape fusions like `map_last` or `filter_last`.
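
To make the pending-projection idea concrete, here is a minimal sketch in which chained field maps are composed into a single path and applied only when a row is flushed. The `Json` stand-in type, `PendingProjection`, `then_field`, and `flush` are all assumptions made for illustration; they are not the planner's actual types.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value; not jetro's value type.
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Num(f64),
    Str(String),
    Obj(BTreeMap<String, Json>),
}

/// Hypothetical pending projection: a field path that is planned but not yet run.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct PendingProjection {
    path: Vec<String>,
}

impl PendingProjection {
    /// Compose one more pure field map onto the pending path without touching any row.
    pub fn then_field(mut self, field: &str) -> Self {
        self.path.push(field.to_string());
        self
    }

    /// Flush: apply the whole composed path to one selected row.
    /// Returns `None` if a step is missing or hits a non-object value.
    pub fn flush<'a>(&self, row: &'a Json) -> Option<&'a Json> {
        let mut cur = row;
        for key in &self.path {
            match cur {
                Json::Obj(map) => cur = map.get(key)?,
                _ => return None,
            }
        }
        Some(cur)
    }
}
```

Under this model, something like `map(author).map(name)` becomes one pending path of two steps, and `flush` runs once per emitted winner rather than once per scanned row.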

### Source And Tape Capabilities

- [x] Attach concrete capabilities to every physical source:
  - [x] indexed array child access
  - [x] reverse iteration
  - [x] bounded prefix iteration
  - [ ] field-by-key reads without object materialization
  - [ ] subtree skipping
  - [ ] selected-row-only materialization
  - [x] materialized fallback
- [x] Choose access mode from source capability plus propagated demand:
  - [x] direct first/nth/last child
  - [x] reverse scan until enough outputs
  - [x] bounded forward scan
  - [x] full scan fallback
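
The capability-plus-demand selection above can be pictured as a single decision function. The sketch below is an assumption-laden simplification: `SourceCaps`, `Demand`, `AccessMode`, and `choose_access` are hypothetical names, and the real planner tracks more state than three booleans.

```rust
/// Hypothetical capability flags attached to one physical source.
#[derive(Debug, Clone, Copy, Default)]
pub struct SourceCaps {
    pub indexed_child: bool,
    pub reverse_iter: bool,
    pub bounded_prefix: bool,
}

/// Hypothetical demand that has been propagated back to the source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Demand {
    /// One row at a known position; only legal when every stage is one-to-one.
    Position(usize),
    /// Stop after this many outputs when scanning from the front.
    FirstOutputs(usize),
    /// Stop after this many outputs when scanning from the back.
    LastOutputs(usize),
    /// Every row is needed.
    All,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AccessMode {
    DirectChild,
    ReverseScan,
    BoundedForwardScan,
    FullScan,
}

/// Pick the cheapest access mode the source supports for the given demand,
/// falling back to a full scan whenever a capability is missing.
pub fn choose_access(caps: SourceCaps, demand: Demand) -> AccessMode {
    match demand {
        Demand::Position(_) if caps.indexed_child => AccessMode::DirectChild,
        Demand::LastOutputs(_) if caps.reverse_iter => AccessMode::ReverseScan,
        Demand::FirstOutputs(_) if caps.bounded_prefix => AccessMode::BoundedForwardScan,
        _ => AccessMode::FullScan,
    }
}
```

Keeping the fallback as the final arm means a source that advertises no capabilities still produces a correct plan, just a slower one.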

### Generic Algorithms

- [ ] Implement positional selection with pending projection.
- [ ] Implement scan-until-output with separate predicate and result needs.
- [ ] Implement reverse-scan-until-output with separate predicate and result needs.
- [ ] Implement predicate scan that projects only selected rows.
- [ ] Implement bounded top-k followed by late projection.
- [ ] Add cost/benefit guards so small arrays avoid expensive planning overhead.
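
A minimal sketch of reverse-scan-until-output with separate predicate and result needs follows. It assumes an in-memory slice as the source and plain closures for the two lanes; the function name `last_matching` is hypothetical, and this is not how the executor is necessarily structured.

```rust
/// Scan from the back, evaluate only the predicate on each visited row,
/// and run the projection once, on the selected row.
pub fn last_matching<T, R>(
    rows: &[T],
    pred: impl Fn(&T) -> bool,
    project: impl Fn(&T) -> R,
) -> Option<R> {
    rows.iter()
        .rev()
        .find(|row| pred(*row)) // scan lane: predicate only
        .map(|row| project(row)) // result lane: projection on the winner only
}
```

For a query shaped like `filter(price > 20).map(isbn).last()`, `pred` would read `price` and `project` would read `isbn`. The scan stops at the first row from the back that passes the predicate, so rows before it are never visited.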

### Target Behaviors

- [x] `$.books.map(isbn).last()` selects the last row first, then projects `isbn` once.
- [x] `$.books.filter(price > 20).map(isbn).last()` scans using `price`, selects the semantic winner, then projects `isbn` once.
- [x] `$.books.sort(-score).filter(price > 20).map(isbn).last()` uses an ordering-aware bounded/lazy strategy only when suffix legality proves it safe.
- [x] `$.books.sort(-score).drop_while(name.contains("_test")).filter(price > 20).map(isbn).last()` preserves `drop_while` prefix semantics, then filters/selects/projects lazily after the prefix boundary.

### Immediate Next Step

- [x] Implement two-lane field demand plus explicit pending projections.

## Functional Batched Update TODO

High-performance functional writes must be planned as one update batch, not as a sequence of
full-document patches. The target syntax is functional, not jq-style operators.

Status: release-critical work is complete. Remaining unchecked items in this
section are intentionally postponed architecture cleanup / future semantics
design, not blockers for the current release.

### Target Syntax

- [x] `$.books[*].update({ tags: tags.append("test"), reviewed: true })`
- [x] `$.books[* if year > 1980].update({ tags: tags.append("modern") })`
- [x] `$.update({ "books[*].tags": @.append("test"), active: false })`
- [x] `$.update({ "books[* if year > 1980].tags": @.append("modern") })`
- [x] `$.update({ "books[*].tmp": DELETE, meta.updated_at: $.now })`
- [x] Existing `.set/.modify/.delete/.unset` chain writes accept wildcard paths before wildcard read expansion.

### Semantics

- [x] Result is the full updated root document.
- [x] RHS expressions read from the original snapshot, not partially updated values for selected-object `.update`.
- [x] Selected-object `.update({...})` evaluates local fields against the selected object.
- [x] Root-level `$.update({...})` evaluates `@` as the current target value.
- [x] `$` always reads the original root.
- [x] Per-field `when` guards are supported for selected-object `.update`.
- [x] Selected-object `.update` RHS/guards support scoped expressions including lambdas, comprehensions, pipelines, nested patches, and match arms.
- [x] Overlapping writes are source-order deterministic; later writes win.
- [x] Non-object selected values have explicit behavior, following existing patch object materialization.
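
The snapshot-read and later-writes-win rules can be illustrated with a two-phase apply over a flat object. This is a simplified sketch: the `Obj` alias, the `Op` enum, and `apply_update` are assumptions made for the example, and values are restricted to integers to keep it short.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a selected object: field name to integer value.
pub type Obj = BTreeMap<String, i64>;

/// Hypothetical update op; the RHS is a function of the original object.
pub enum Op {
    Set(&'static str, fn(&Obj) -> i64),
    Delete(&'static str),
}

pub fn apply_update(original: &Obj, ops: &[Op]) -> Obj {
    // Phase 1: evaluate every RHS against the original snapshot,
    // so no op can observe a partially updated object.
    let planned: Vec<(&str, Option<i64>)> = ops
        .iter()
        .map(|op| match op {
            Op::Set(key, rhs) => (*key, Some(rhs(original))),
            Op::Delete(key) => (*key, None),
        })
        .collect();

    // Phase 2: apply in source order; a later write to the same key wins.
    let mut out = original.clone();
    for (key, value) in planned {
        match value {
            Some(v) => {
                out.insert(key.to_string(), v);
            }
            None => {
                out.remove(key);
            }
        }
    }
    out
}
```

Separating evaluation from application is what makes a pair of updates such as `a: b + 1` and `b: a + 1` order-independent in what they read, while still being deterministic in what they write.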

### Planner And IR

- [x] Add a first-class `UpdateBatch` AST node instead of immediately expanding into many patches.
- [x] Add first-class `UpdateBatch` planner/physical IR nodes instead of compiling through patch compatibility lowering.
- [x] Preserve selector, update mode, ordered ops, conditions, and RHS expressions in the planner.
- [x] Mark `UpdateBatch` as a materialization boundary only at final root output.
- [x] Build an update trie grouped by shared path prefixes.
- [x] Analyze dependencies for selectors, wildcard filters, guards, dynamic indexes, and RHS expressions.
- [x] Record per-selected-object reads separately from root reads.
- [ ] POSTPONED: Reject or split unsafe batches only when semantics require it.
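
As a sketch of the update trie grouped by shared path prefixes, the structure below stores each op's source-order index at the node where its path ends. `UpdateTrie`, its fields, and its methods are hypothetical names for this example, and path steps are plain strings rather than the planner's selector types.

```rust
use std::collections::BTreeMap;

/// Hypothetical update trie: one node per path step, shared across ops.
#[derive(Debug, Default)]
pub struct UpdateTrie {
    /// Source-order indices of the ops that terminate at this node.
    pub ops: Vec<usize>,
    pub children: BTreeMap<String, UpdateTrie>,
}

impl UpdateTrie {
    /// Insert one op's path, reusing any prefix nodes that already exist.
    pub fn insert(&mut self, path: &[&str], op_index: usize) {
        let mut node = self;
        for step in path {
            node = node.children.entry(step.to_string()).or_default();
        }
        node.ops.push(op_index);
    }

    /// Total number of nodes, including the root.
    pub fn node_count(&self) -> usize {
        1 + self.children.values().map(|c| c.node_count()).sum::<usize>()
    }
}

/// Build a trie from op paths given in source order.
pub fn build_trie(paths: &[Vec<&str>]) -> UpdateTrie {
    let mut root = UpdateTrie::default();
    for (i, path) in paths.iter().enumerate() {
        root.insert(path, i);
    }
    root
}
```

With the paths `books[*].tags`, `books[*].reviewed`, and `books[*].tmp`, the `books` and `[*]` nodes exist once each, so an executor walking the trie visits the shared prefix a single time.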

### Clean Architecture Constraints

- [ ] POSTPONED: Keep one authoritative update semantics layer; do not duplicate write behavior across parser, planner, VM, and executors.
- [ ] POSTPONED: Parser should only recognize syntax and produce/update AST shape; no execution-policy decisions in parser code.
- [x] Planner owns grouping, dependency analysis, materialization boundaries, and update-trie construction.
- [ ] POSTPONED: Executors consume planned update ops and should not rediscover query shapes or add handwritten per-query fusions.
- [x] Existing patch execution and new functional update execution should share path traversal, conflict resolution, guard evaluation, and structural-sharing helpers.
- [x] Keep write terminals (`set`, `modify`, `delete`, `unset`, future `append`) as thin lowering adapters into the same `UpdateBatch`/patch core.
- [ ] POSTPONED: Isolate path selection/traversal from mutation application so view/read traversal can be reused for writes.
- [ ] POSTPONED: Prefer small traits or strategy objects only at stable boundaries: path traversal, RHS evaluation, mutation application, and materialization.
- [ ] POSTPONED: Avoid inheritance-style enum sprawl; add new variants only when they represent real domain concepts.
- [ ] POSTPONED: Keep builtin metadata as the source of truth for purity, arity, barrier behavior, and update/write-terminal classification.
- [x] Add tests at architecture boundaries: parser lowering, planner batch/trie shape, executor semantics, and public API behavior.
- [ ] POSTPONED: Delete or refactor older duplicated patch-fusion helpers once `UpdateBatch` takes over their responsibility.

### Execution Algorithm

- [x] Traverse each shared selector prefix once.
- [x] Clone the root shell once.
- [x] Clone each touched array/object ancestor once using `Arc::make_mut`.
- [x] Clone each selected object once even when multiple fields are updated.
- [x] Evaluate all RHS values against the original selected snapshot plus original root.
- [x] Apply writes/deletes in source order to the cloned target.
- [x] Preserve untouched subtrees by `Arc`.
- [x] Return full root only once after all updates.
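
The clone-once behavior above relies on `Arc::make_mut`, which clones the pointee only when it is shared. The sketch below shows the pattern on a small hand-written document type; `Root`, `Book`, and `append_tag_where` are assumptions for illustration and do not reflect jetro's value representation.

```rust
use std::sync::Arc;

#[derive(Clone, Debug)]
pub struct Book {
    pub title: String,
    pub year: u32,
    pub tags: Vec<String>,
}

#[derive(Clone, Debug)]
pub struct Root {
    pub books: Arc<Vec<Arc<Book>>>,
    pub meta: Arc<String>,
}

/// Append `tag` to every book matching `pred`, sharing everything untouched.
pub fn append_tag_where(root: &Root, pred: impl Fn(&Book) -> bool, tag: &str) -> Root {
    // Select against the original snapshot before any mutation.
    let mut selected = Vec::new();
    for (i, book) in root.books.iter().enumerate() {
        if pred(&**book) {
            selected.push(i);
        }
    }

    // Cloning the root shell only bumps the reference counts of its fields.
    let mut out = root.clone();
    if selected.is_empty() {
        return out;
    }

    // The array is shared with `root`, so this clones the vector of pointers once.
    let books = Arc::make_mut(&mut out.books);
    for i in selected {
        // Each selected book is shared with `root`, so it is cloned once here.
        let book = Arc::make_mut(&mut books[i]);
        book.tags.push(tag.to_string());
    }
    out
}
```

After the call, untouched books and `meta` in the result point at the same allocations as in the input, which can be checked with `Arc::ptr_eq`.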

### Builtin Integration

- [x] Treat `append`, `remove`, `unique`, etc. as normal RHS value kernels inside update batches.
- [ ] POSTPONED: Add optional write-terminal lowering for rooted `path.append(v)`, `path.extend(xs)`, `path.prepend(v)`, and `path.remove_at(i)`.
- [x] Extend chain-write path lowering to accept wildcard and wildcard-filter steps.
- [x] Lower `.set/.modify/.delete/.unset/.merge/.deep_merge` into the batch planner, preserving current public behavior.

### Performance Tests

- [x] Multi-field update over `books[*]` clones the books array once and each changed book once.
- [x] Sibling root updates share one root clone.
- [x] `books[*].tags`, `books[*].reviewed`, and `books[*].tmp` share the `books[*]` traversal.
- [x] Untouched large subtrees remain structurally shared.
- [x] RHS root reads are hoisted/cached when invariant.
- [x] No full-document materialization occurs between fields in one update batch.
- [x] Benchmarks cover large-array wildcard update, filtered update, unrelated root-path batch, and nested update.