
# Texture Array Batching Implementation Plan

Type: Implementation contract.

Closes the open `[ ]` checkboxes at
[`docs/checklists/state-of-art-threejs-replacement-plan.md`](../checklists/state-of-art-threejs-replacement-plan.md)
line 778 ("texture arrays and texture-array batching remain open") and
line 866 ("GPU batching for alpha/material ordering remains part of the
GPU material gate"), driven by the `Capabilities::texture_arrays`
field that already ships per Phase 1F step 1.

## Status

Closed. All four commits have landed.

* Commit 1: `MaterialBatchPlan` + `RendererStats::material_batch_layers`
  detect array-batching opportunities at prepare time.
* Commit 2: GPU now allocates a shared `texture_2d_array<f32>` per role
  with N+1 layers (layer 0 reserved for the synthetic fallback slot)
  when `batchable` is true; the per-material fallback uses 1-layer arrays.
* Commit 3: WGSL fragment uses `texture_2d_array<f32>` and routes
  sampling through `MaterialUniform::material_layer_index`. The bind
  group layout's uniform binding uses `has_dynamic_offset: true`, so
  a single shared bind group services every draw with dynamic offsets
  instead of N bind-group switches.
* Commit 4:
  `tests/m1_geometry_materials.rs::texture_array_batching_collapses_to_single_bind`
  proves `RendererStats::material_bind_groups = 1` when the renderer
  collapses two compatible materials, and falls back to
  `material_count + 1` when dimensions mismatch. WebGL2 keeps its
  independent per-material GLSL pipeline (`webgl2_program.rs`), which
  was unchanged by the array work; `Capabilities::texture_arrays`
  already reports `max_texture_array_layers` per Phase 1F step 1.

## Why this is multi-commit work

A clean implementation touches every layer of the renderer:

1. **prepare layer** — material grouping by `(sampler, format,
   width, height)` per texture role; per-material `array_layer_index`
   assignment; fallback diagnostics when materials are not
   group-compatible.
2. **GPU resource layer** (`src/render/gpu/materials.rs`) —
   replace per-material 2D textures with shared `texture_2d_array`
   per role; per-material upload to a specific layer; allocation
   accounting in `RendererStats`.
3. **uniform layer** — material uniform buffer becomes one buffer
   with `min_uniform_buffer_offset_alignment` stride per material
   so the bind group binds the array textures once and uses a
   dynamic offset to select per-material uniform data.
4. **bind group layer** — collapse N per-material bind groups into
   1 shared bind group; the shader's `@group(1)` bindings become
   array-typed; the dynamic offset on the uniform binding picks the
   correct material per draw.
5. **WGSL** — `texture_2d<f32>` → `texture_2d_array<f32>` on every
   material texture binding; `MaterialUniform` gains a layer-index
   field; sampling adds the layer argument.
6. **WebGL2** (`src/render/gpu/webgl2_*`) — capability gate keeps
   the per-material path on WebGL2 lanes that do not expose
   `sampler2DArray`; the renderer falls back gracefully and reports
   `texture_arrays = Degraded` on those lanes.
7. **doctor truth** — new substrings under `ARCH-RENDER-TRUTH` and
   `ARCH-M3A-SCENE-IMPORT` lock the array-typed bindings and the
   layer-index uniform field so a regression cannot ship.
8. **visual fixtures** — every existing visual proof keeps its
   pinned RGBA hash; mixed-dimension materials must continue to
   render byte-identical to today's per-material path through the
   fallback gate.

## Implementation roadmap

The work splits into four commits, each individually shippable:

### Commit 1 — `MaterialBatchPlan` computation + diagnostics

`src/render/prepare/resources.rs`:
- Add `MaterialBatchPlan { batchable: bool, layer_count: u32,
  incompatible_role: Option<MaterialTextureRole>,
  incompatible_reason: Option<MaterialBatchIncompatibility> }`.
- Compute at prepare time from `&[PreparedMaterialSlot]`. All
  materials must share `(sampler, format, width, height)` for
  every role; record the first incompatibility otherwise.
- Add `RendererStats::material_batch_layers: u32` (= layer count
  when batchable, 0 otherwise).

Tests:
- One material → batchable, layer_count = 1.
- Two same-dim materials → batchable, layer_count = 2.
- Two materials with different base-color dimensions → not
  batchable, incompatible_role = BaseColor.
- Two materials with different samplers → not batchable.

This is a prepare-side observability step. The GPU still uses the
per-material path. The plan is exposed for downstream consumers
(test harnesses, capability auditors) to verify the renderer
detects batching opportunities.
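
The compatibility check described above can be sketched in a few lines. This is a minimal illustration, not the crate's implementation: `RoleTexture`, the role names other than `BaseColor`, the `MaterialBatchIncompatibility` variants, the shape of `PreparedMaterialSlot`, and the helper names `plan_material_batch` and `material_batch_layers` are all assumptions made for the example. It also assumes an empty slot list is reported as not batchable, which the plan does not specify.

```rust
/// Roles assumed for illustration; only `BaseColor` is named in the plan.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MaterialTextureRole { BaseColor, Normal, MetallicRoughness, Occlusion, Emissive }

const ROLES: [MaterialTextureRole; 5] = [
    MaterialTextureRole::BaseColor,
    MaterialTextureRole::Normal,
    MaterialTextureRole::MetallicRoughness,
    MaterialTextureRole::Occlusion,
    MaterialTextureRole::Emissive,
];

/// Hypothetical reason codes; the real enum's variants are not shown in the plan.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MaterialBatchIncompatibility { Sampler, Format, Dimensions }

/// Hypothetical per-role descriptor: opaque sampler and format ids plus size.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RoleTexture { pub sampler: u32, pub format: u32, pub width: u32, pub height: u32 }

/// Stand-in for the real `PreparedMaterialSlot`: one descriptor per role.
#[derive(Clone, Debug)]
pub struct PreparedMaterialSlot { pub roles: [RoleTexture; 5] }

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct MaterialBatchPlan {
    pub batchable: bool,
    pub layer_count: u32,
    pub incompatible_role: Option<MaterialTextureRole>,
    pub incompatible_reason: Option<MaterialBatchIncompatibility>,
}

/// Returns why two role textures cannot share an array, if they cannot.
fn compare(a: RoleTexture, b: RoleTexture) -> Option<MaterialBatchIncompatibility> {
    if a.sampler != b.sampler {
        Some(MaterialBatchIncompatibility::Sampler)
    } else if a.format != b.format {
        Some(MaterialBatchIncompatibility::Format)
    } else if (a.width, a.height) != (b.width, b.height) {
        Some(MaterialBatchIncompatibility::Dimensions)
    } else {
        None
    }
}

fn not_batchable(
    role: Option<MaterialTextureRole>,
    reason: Option<MaterialBatchIncompatibility>,
) -> MaterialBatchPlan {
    MaterialBatchPlan { batchable: false, layer_count: 0, incompatible_role: role, incompatible_reason: reason }
}

/// Compares every slot against the first one and records the first mismatch.
pub fn plan_material_batch(slots: &[PreparedMaterialSlot]) -> MaterialBatchPlan {
    // Assumption: no materials means nothing to batch.
    let Some((first, rest)) = slots.split_first() else {
        return not_batchable(None, None);
    };
    for slot in rest {
        for (i, role) in ROLES.iter().enumerate() {
            if let Some(reason) = compare(first.roles[i], slot.roles[i]) {
                return not_batchable(Some(*role), Some(reason));
            }
        }
    }
    MaterialBatchPlan {
        batchable: true,
        layer_count: slots.len() as u32,
        incompatible_role: None,
        incompatible_reason: None,
    }
}

/// Value reported as `RendererStats::material_batch_layers`.
pub fn material_batch_layers(plan: &MaterialBatchPlan) -> u32 {
    if plan.batchable { plan.layer_count } else { 0 }
}
```

Comparing every slot against the first keeps the check linear in the number of materials, which fits the "one group or fall back" policy.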

### Commit 2 — GPU array texture allocation under the new plan

`src/render/gpu/materials.rs`:
- When `MaterialBatchPlan::batchable`: allocate one
  `texture_2d_array<f32>` per role with `layer_count` layers.
  Upload each material's role bytes into its assigned layer.
- When not batchable: keep the current per-material path. Both paths
  coexist.
- New `MaterialBatchResources` struct holds the array textures +
  shared uniform buffer (one buffer with `material_count *
  align(MATERIAL_UNIFORM_BYTE_LEN, 256)` bytes).

The bind-group layout still has 11 entries (5 textures + 5 samplers +
1 uniform). The textures become `texture_2d_array<f32>` with
`filterable: true` for color roles and `filterable: false` for
data roles. The uniform binding gains `has_dynamic_offset: true`
and `min_binding_size: NonZero::new(MATERIAL_UNIFORM_BYTE_LEN)`.
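
The shared uniform buffer sizing and the per-material dynamic offset follow from one alignment helper. The sketch below is illustrative only: the value of `MATERIAL_UNIFORM_BYTE_LEN` is a placeholder, and the function names are hypothetical. It relies on the alignment being a power of two, which holds for `min_uniform_buffer_offset_alignment`, and returns offsets as `u32` because wgpu dynamic offsets are `u32`.

```rust
/// Placeholder size for the example; the real constant lives in the crate.
pub const MATERIAL_UNIFORM_BYTE_LEN: u64 = 80;

/// Rounds `value` up to the next multiple of `alignment` (a power of two).
pub fn align_up(value: u64, alignment: u64) -> u64 {
    assert!(alignment.is_power_of_two(), "alignment must be a power of two");
    (value + alignment - 1) & !(alignment - 1)
}

/// Byte distance between two consecutive materials in the shared buffer.
pub fn material_uniform_stride(min_offset_alignment: u32) -> u64 {
    align_up(MATERIAL_UNIFORM_BYTE_LEN, u64::from(min_offset_alignment))
}

/// Total size of the shared buffer for `material_count` materials.
pub fn material_uniform_buffer_size(material_count: u64, min_offset_alignment: u32) -> u64 {
    material_count * material_uniform_stride(min_offset_alignment)
}

/// Dynamic offset passed alongside the bind group for one material.
/// Panics if the offset does not fit in `u32`.
pub fn material_dynamic_offset(material_index: u32, min_offset_alignment: u32) -> u32 {
    let offset = u64::from(material_index) * material_uniform_stride(min_offset_alignment);
    u32::try_from(offset).expect("dynamic offset exceeds u32")
}
```

With a 256-byte alignment and the placeholder length above, each material occupies a 256-byte slot, so three materials need a 768-byte buffer and material 2 starts at offset 512.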

Tests:
- Two same-dim materials produce a 2-layer array; layers 0 and 1
  contain the expected per-material bytes (verified through
  `wgpu`'s `copy_texture_to_buffer` debugging path).

### Commit 3 — WGSL switch + bind group collapse

`src/render/gpu/output.rs`:
- Each material texture binding switches type to
  `texture_2d_array<f32>`.
- `MaterialUniform` adds `material_layer_index: vec4<u32>` (using
  one slot for layer, three slots for future per-role layer
  flexibility + 16-byte alignment).
- Sampling becomes `textureSample(base_color_texture,
  base_color_sampler, transformed_uv,
  i32(material.material_layer_index.x))` (and equivalents for the
  other 4 roles).

`src/render/gpu/draw.rs`:
- Collapse N per-material `set_bind_group(1, ...)` calls into ONE.
  The pass binds the shared material bind group ONCE; each
  draw_batch sets its dynamic offset on the material uniform via
  the existing draw-uniform `set_bind_group(2, ..., &[draw_offset])`
  pattern (extended to also pass the material offset).
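
To make the collapse concrete, the sketch below records bind-group calls on a mock pass instead of a real `wgpu::RenderPass`. Everything here is hypothetical scaffolding (`RecordingPass`, `BindGroupId`, `DrawBatch`, the id values). It assumes a wgpu-style API in which dynamic offsets are passed as an argument to `set_bind_group`, so the batched path keeps rebinding the same shared group with a different offset rather than switching groups. How `RendererStats::material_bind_groups` is actually counted is not shown in this plan; counting distinct groups is an assumption.

```rust
use std::collections::BTreeSet;

/// Opaque stand-in for a GPU bind group handle.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct BindGroupId(pub u32);

/// Mock pass that only records `(group index, bind group, dynamic offsets)`.
#[derive(Default)]
pub struct RecordingPass {
    pub calls: Vec<(u32, BindGroupId, Vec<u32>)>,
}

impl RecordingPass {
    pub fn set_bind_group(&mut self, index: u32, group: BindGroupId, offsets: &[u32]) {
        self.calls.push((index, group, offsets.to_vec()));
    }

    /// Number of distinct bind groups ever bound at `index`.
    pub fn distinct_groups_at(&self, index: u32) -> usize {
        self.calls
            .iter()
            .filter(|call| call.0 == index)
            .map(|call| call.1)
            .collect::<BTreeSet<_>>()
            .len()
    }
}

pub struct DrawBatch {
    pub material_index: u32,
    pub draw_offset: u32,
}

const SHARED_MATERIAL_GROUP: BindGroupId = BindGroupId(1);
const DRAW_GROUP: BindGroupId = BindGroupId(2);

/// Per-material path: every batch binds its own material bind group.
pub fn record_per_material(pass: &mut RecordingPass, batches: &[DrawBatch]) {
    for batch in batches {
        pass.set_bind_group(1, BindGroupId(100 + batch.material_index), &[]);
        pass.set_bind_group(2, DRAW_GROUP, &[batch.draw_offset]);
    }
}

/// Batched path: one shared group, selected per draw by a dynamic offset.
pub fn record_batched(pass: &mut RecordingPass, batches: &[DrawBatch], stride: u32) {
    for batch in batches {
        pass.set_bind_group(1, SHARED_MATERIAL_GROUP, &[batch.material_index * stride]);
        pass.set_bind_group(2, DRAW_GROUP, &[batch.draw_offset]);
    }
}
```

For two batches using materials 0 and 1, the per-material recording touches two distinct groups at index 1, while the batched recording touches one and varies only the offset.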

Doctor:
- `ARCH-RENDER-TRUTH` requires the new `texture_2d_array<f32>`
  substring, the `material_layer_index` field, and the
  `textureSample(base_color_texture, base_color_sampler,
  transformed_uv, i32(material.material_layer_index.x))` form.
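
A substring lock of this kind can be sketched as a plain string check. The example is an assumption about how such a gate might look, not the doctor's actual code: `missing_substrings`, `REQUIRED_WGSL_SUBSTRINGS`, and the `EXAMPLE_WGSL` fragment (including its binding indices and the `base_color_factor` field) are hypothetical. The sampling call sits on one line in the fragment because a raw substring match is sensitive to line wrapping.

```rust
/// Substrings the plan says `ARCH-RENDER-TRUTH` must find in the shader.
pub const REQUIRED_WGSL_SUBSTRINGS: [&str; 3] = [
    "texture_2d_array<f32>",
    "material_layer_index",
    "textureSample(base_color_texture, base_color_sampler, transformed_uv, i32(material.material_layer_index.x))",
];

/// Illustrative shader fragment; bindings and extra fields are made up.
pub const EXAMPLE_WGSL: &str = r#"
struct MaterialUniform {
    base_color_factor: vec4<f32>,
    material_layer_index: vec4<u32>,
}

@group(1) @binding(0) var base_color_texture: texture_2d_array<f32>;
@group(1) @binding(1) var base_color_sampler: sampler;
@group(1) @binding(10) var<uniform> material: MaterialUniform;

@fragment
fn fs_main(@location(0) transformed_uv: vec2<f32>) -> @location(0) vec4<f32> {
    let texel = textureSample(base_color_texture, base_color_sampler, transformed_uv, i32(material.material_layer_index.x));
    return texel * material.base_color_factor;
}
"#;

/// Returns every required substring that does not occur in `wgsl`.
pub fn missing_substrings<'a>(wgsl: &str, required: &[&'a str]) -> Vec<&'a str> {
    required
        .iter()
        .copied()
        .filter(|needle| !wgsl.contains(*needle))
        .collect()
}
```

An empty result means the shader still carries the array-typed binding, the layer-index field, and the layered sampling call; reverting any binding to `texture_2d<f32>` would not be caught unless every array-typed binding were removed, so a stricter gate might count occurrences instead.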

### Commit 4 — WebGL2 fallback + capability gate flip

`src/render/gpu/webgl2_program.rs`:
- WebGL2 keeps the per-material path. The capability stays
  `Capabilities::texture_arrays = Supported` for backends that
  emit array bindings; WebGL2 reports the per-material fallback
  via `RendererStats::material_batch_layers = 0` even when the
  batch plan would batch.
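
The gate amounts to a small decision function. The sketch below assumes a hypothetical `Backend` enum and helper name; only the rule itself (WebGL2 reports 0 layers even when the plan is batchable) comes from this plan.

```rust
/// Hypothetical backend tag; the crate's real backend type is not shown here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Backend {
    /// Any backend that emits array-typed bindings.
    ArrayCapable,
    /// WebGL2 lane, which keeps the per-material GLSL pipeline.
    WebGl2,
}

/// Value reported as `RendererStats::material_batch_layers`.
pub fn reported_material_batch_layers(backend: Backend, batchable: bool, layer_count: u32) -> u32 {
    match backend {
        // WebGL2 always takes the per-material fallback.
        Backend::WebGl2 => 0,
        Backend::ArrayCapable if batchable => layer_count,
        Backend::ArrayCapable => 0,
    }
}
```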

Tests:
- `tests/m1_geometry_materials.rs::texture_array_batching_collapses_to_single_bind`
  asserts the headless GPU renderer with two compatible
  materials reports exactly one material bind-group switch per
  pass instead of two.
- `tests/m8_visual_proof.rs::m8-multi-material-array` extends to
  ≥4 distinct materials, renders identically to the per-bind path
  (within the existing `max_abs_diff` tolerance), and emits a
  new TOML entry recording `material_batch_layers: 4`.

## Capability flip

`Capabilities::texture_arrays` keeps the value `Supported`; what
flips is the evidence behind it, from `Supported` with no runtime
use to `Supported` backed by measured renderer-stats evidence, once
commit 4's tests close. No RFC or capability-table change is
required, since the field already reports `Supported` per Phase 1F
step 1; the master plan checkbox at line 778 closes when the
visual fixture matrix at commit 4 lands.

## Tradeoffs explicitly considered

- **Texture deduplication vs array batching**: deduplication
  (same texture handle reused across materials → uploaded once)
  is a simpler memory optimization that does not require WGSL
  changes. It is orthogonal to array batching and can land
  separately without blocking this plan. Texture array batching
  is what closes the `Capabilities::texture_arrays = Supported`
  truth-claim.
- **Mixed-dimension materials**: a future commit could group
  materials into multiple compatibility groups (one
  `texture_2d_array` per group). For v1.0 the simpler "all
  materials share one group or fall back" policy is sufficient
  per the master plan and avoids draw-call regrouping.
- **WebGPU `binding_array<texture_2d<f32>, N>`**: the Vulkan-style
  bindless path is faster but lands later than `texture_2d_array`
  in the wgpu surface. Not v1.0 scope.
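
For the mixed-dimension tradeoff above, the deferred multi-group option would start from a grouping step like the one sketched here. This is out of v1.0 scope and purely illustrative: `CompatKey` collapses the per-role `(sampler, format, width, height)` tuple into a single key with opaque ids, and `group_by_compatibility` is a hypothetical helper.

```rust
use std::collections::BTreeMap;

/// `(sampler id, format id, width, height)`; ids are opaque placeholders.
pub type CompatKey = (u32, u32, u32, u32);

/// Groups material indices by compatibility key, in ascending key order.
/// Each resulting group would map to one `texture_2d_array`.
pub fn group_by_compatibility(keys: &[CompatKey]) -> Vec<(CompatKey, Vec<usize>)> {
    let mut groups: BTreeMap<CompatKey, Vec<usize>> = BTreeMap::new();
    for (material_index, key) in keys.iter().enumerate() {
        groups.entry(*key).or_default().push(material_index);
    }
    groups.into_iter().collect()
}
```

Draws would then need regrouping by compatibility group, which is the cost the v1.0 policy avoids.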

## Closure evidence

Each commit lands with:
- the local gate stack green (fmt, clippy, test, doctor, wasm32),
- the master plan box updated with a pointer to the test that
  proves the contract,
- a CHANGELOG `[Unreleased]` entry.

The `Capabilities::texture_arrays = Supported` claim closes when
commit 4's `texture_array_batching_collapses_to_single_bind` test
passes on every claimed lane and the master plan's line 778 box
moves to `[x]` with that test cited as evidence.