//! Compact per-edge-sample shadow-visibility texture (Plan B Stage 5b-shadow,
//! docs/plans/deferred-shared-prep-pass.md).
//!
//! Under MSAA, `cs_prep_edge` fills this texture (per edge pixel × MSAA sample ×
//! `ceil(K/4)` packed slot-layers) and `cs_edge` reads it instead of inline-
//! sampling shadow maps — which is what lets the ~50 KB `sample_shadow_*` block
//! drop from the MSAA opaque module (the MSAA analog of Stage 4).
//!
//! **Why a TEXTURE, not a storage buffer.** The unified MSAA opaque module's
//! `cs_edge` already binds 10 storage buffers (the macOS Metal baseline
//! `maxStorageBuffersPerShaderStage` cap — see `edge_buffers.rs` and
//! `edge_bind_group.rs`). Adding an 11th storage buffer would exceed the cap.
//! A sampled/storage TEXTURE does not count against that limit, so the compact
//! buffer is an `Rgba8Unorm` `texture_2d_array`: `cs_prep_edge` writes it as a
//! storage texture, `cs_edge` reads it with `textureLoad` (no sampler).
//!
//! **Keying.** Flat index `idx = edge_pixel_id * MAX_EDGE_SHADOW_SAMPLES +
//! sample`, mapped to 2D as `(idx % EDGE_SHADOW_TEX_WIDTH, idx /
//! EDGE_SHADOW_TEX_WIDTH)`; the packed slot group `slot / 4` selects the array
//! layer. Both `cs_prep_edge` (write) and `cs_edge` (read) compute the identical
//! mapping (the WGSL `EDGE_SHADOW_TEX_WIDTH` const is rendered from the same
//! `EDGE_SHADOW_TEX_WIDTH` value below).
//!
//! **Size.** `EDGE_SHADOW_TEX_WIDTH × height × layers` texels, where `height =
//! ceil(max_edge_budget * MAX_EDGE_SHADOW_SAMPLES / EDGE_SHADOW_TEX_WIDTH)` and
//! `layers = ceil(K/4)`. At the 512K desktop budget with K≤4, that is 512K × 4
//! samples / 4096 = 512 rows, so 4096 × 512 × 1 × 4 B = 8 MiB, the spec's
//! target. Only allocated under prep + MSAA.
//!
//! ── PREP-VS-RECOMPUTE RULE (why shadows get an edge buffer but UV/vcolor don't) ──
//!
//! Prep exists to materialize material-INDEPENDENT per-pixel work once so the slim
//! per-material kernel READS it instead of recomputing. That only pays when the
//! materialized work is expensive enough to beat the cost of writing it here and
//! reading it back there — AND/OR when caching it lets bulky code drop out of every
//! specialized material module. Cheap work is re-derived in the wrapper instead.
//!
//! * Shadow visibility → PREP (this buffer). Shadow sampling is the expensive
//! `sample_shadow_*` block; caching it per edge-sample lets that ~50 KB of code
//! drop from the MSAA opaque module entirely (interior reads the full-screen
//! prep buffer, edges read THIS one → neither recomputes → the code is gone;
//! that's the bulk of the -53 KB MSAA win). Easily worth the write+read.
//!
//! * Edge-sample UV0 / vertex-color → RECOMPUTE (deliberately NOT an edge buffer).
//! The edge arm in `cs_shade` already holds the per-sample triangle + barycentric
//! in-register (it needs them to shade at all), so the UV/vcolor lerp there is a
//! few buffer reads — cheaper than computing the same thing in `cs_prep_edge`,
//! writing it, and reading it back, plus the ~16-48 MB the buffer would cost.
//! There is also no bulky code to evict (the recompute helper is ~10 lines).
//! Same call as world-position, which prep also deliberately never materializes
//! (re-projected from depth on demand). See `helpers/texture_uvs.wgsl` +
//! `helpers/vertex_color_attrib.wgsl`, and `docs/SHADER_GUIDELINES.md`.
//!
//! Either way this is invisible to material authors: they call an accessor
//! (`texture_uv` / `material_uv` / `input.world_position`); the accessor picks
//! prep-read vs recompute under the hood.
use ;
/// Fixed width (texels) of the compact edge-shadow texture. The flat edge-sample
/// index wraps at this width. MUST equal the WGSL `EDGE_SHADOW_TEX_WIDTH` const
/// (rendered from this value via the prep compute template). Chosen so that even
/// the 24-bit `MAX_EDGE_BUDGET` ceiling keeps `height` within `maxTextureDimension2D`
/// for the desktop/mobile defaults.
pub const EDGE_SHADOW_TEX_WIDTH: u32 = 4096;
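// Illustrative sketch only (not part of the original module): the "Keying"
// mapping from the module docs, written out on the CPU side so it can be
// checked by hand. The function name and the explicit `tex_width` /
// `max_samples` parameters are assumptions made so the sketch stands alone;
// the real mapping is computed in WGSL by `cs_prep_edge` and `cs_edge`.
//
// Returns `(x, y, layer)` for one edge sample and one shadow slot.
#[allow(dead_code)]
fn edge_shadow_texel_sketch(
    edge_pixel_id: u32,
    sample: u32,
    slot: u32,
    tex_width: u32,
    max_samples: u32,
) -> (u32, u32, u32) {
    // Flat per-edge-sample index: edge_pixel_id * MAX_EDGE_SHADOW_SAMPLES + sample.
    let idx = edge_pixel_id * max_samples + sample;
    // Wrap the flat index into 2D at the fixed texture width.
    let x = idx % tex_width;
    let y = idx / tex_width;
    // Slots are packed four per layer, so the slot group selects the array layer.
    let layer = slot / 4;
    (x, y, layer)
}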
/// MSAA samples per edge pixel the compact buffer reserves a slot for. MSAA-4
/// today; mirrors the WGSL `MAX_EDGE_SHADOW_SAMPLES`.
pub const MAX_EDGE_SHADOW_SAMPLES: u32 = 4;
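// Illustrative sketch only (not part of the original module): the "Size"
// arithmetic from the module docs. Both function names and their parameter
// lists are assumptions chosen so the sketch is self-contained; the module's
// own sizing code may be organized differently.
//
// Rows needed to hold `max_edge_budget * max_samples` texels at `tex_width`
// columns, i.e. ceil(budget * samples / width).
#[allow(dead_code)]
fn edge_shadow_tex_height_sketch(max_edge_budget: u32, max_samples: u32, tex_width: u32) -> u32 {
    // Widen to u64 so the product cannot overflow before the division.
    let texels = u64::from(max_edge_budget) * u64::from(max_samples);
    let width = u64::from(tex_width);
    ((texels + width - 1) / width) as u32
}

// Total byte size, assuming 4 bytes per texel (an 8-bit RGBA format) and
// ceil(K/4) array layers for K shadow slots.
#[allow(dead_code)]
fn edge_shadow_tex_bytes_sketch(
    max_edge_budget: u32,
    max_samples: u32,
    tex_width: u32,
    shadow_slots: u32,
) -> u64 {
    let height = edge_shadow_tex_height_sketch(max_edge_budget, max_samples, tex_width);
    let layers = (shadow_slots + 3) / 4;
    u64::from(tex_width) * u64::from(height) * u64::from(layers) * 4
}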
/// The compact per-edge-sample shadow-visibility texture + its sampled/array
/// view. Owned by the prep module; allocated only when prep is enabled AND MSAA
/// is on. Sized from `max_edge_budget` × samples × `shadow_visibility_layers`.
/// Height (rows) needed to hold `max_edge_budget * MAX_EDGE_SHADOW_SAMPLES`
/// texels at `EDGE_SHADOW_TEX_WIDTH` columns.