wirm 4.0.0-rc3

A lightweight WebAssembly Transformation Library for the Component Model
//! ## Scope Tracking and Stable Identity
//!
//! This module defines the infrastructure used to safely track **nested index
//! spaces** across parsing, instrumentation, and encoding phases of a
//! WebAssembly *component*.
//!
//! WebAssembly components introduce hierarchical index scopes: components may
//! contain subcomponents, instances, types, and other constructs that form
//! their own index spaces. Additionally, `(outer ...)` references allow inner
//! scopes to refer to indices defined in enclosing scopes. Correctly resolving
//! these relationships at encode time therefore requires an explicit model of
//! scope nesting rather than a single flat index map.
//!
//! At the same time, this crate supports **component instrumentation**, meaning
//! the IR may be visited, transformed, and encoded in an order that does *not*
//! correspond to the original parse order. As a result, index resolution cannot
//! rely on traversal order alone.
//!
//! To address these constraints, this module separates **identity** from
//! **ownership** using a central registry and a small set of carefully enforced
//! invariants.
//!
//! ---
//!
//! ### `IndexScopeRegistry`
//!
//! `IndexScopeRegistry` is a shared registry that maps *IR node identity* to the
//! index scope (`ScopeId`) that the node owns or inhabits. This mapping is
//! established during parsing and maintained throughout the lifetime of the IR.
//!
//! The registry supports **two identity mechanisms**, depending on the kind of
//! node being tracked:
//!
//! #### Component scopes (special-cased)
//!
//! Components are identified by a stable `ComponentId`, assigned when the
//! component is parsed or created. Component scopes are registered and looked
//! up **by `ComponentId`**, rather than by pointer.
//!
//! This reflects the fact that components:
//! - May be stored in a central registry
//! - Are visited via an explicit *component ID stack* during traversal
//! - Do not rely on memory address stability for identity
//!
//! By using `ComponentId` as the identity key, component scope lookup remains
//! robust even as components are nested, traversed out of order, or referenced
//! indirectly.
//!
//! #### All other scoped IR nodes
//!
//! All non-component nodes that introduce or inhabit scopes (e.g. component types,
//! core types, etc.) are tracked using **raw pointers** (type-erased
//! `NonNull<()>` values built from a reference to the node) as identity keys.
//!
//! These nodes are stored in append-only, stable allocations (`Box<T>`
//! inside append-only vectors), ensuring that their addresses remain
//! valid for the lifetime of the component graph.
//!
//! Raw pointers are used **only for identity comparison**; they are never
//! dereferenced.
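//!
//! A minimal, self-contained sketch of the two identity mechanisms described
//! above. `CompId`, `Scope`, `Node`, and `Registry` are hypothetical stand-ins,
//! not this crate's API; the sketch assumes only that nodes live in `Box`es
//! held by a vector that never drops or replaces them.
//!
//! ```rust
//! use std::collections::HashMap;
//! use std::ptr::NonNull;
//!
//! // Hypothetical stand-ins for the crate's component and scope ids.
//! #[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
//! struct CompId(u32);
//! #[derive(Clone, Copy, Debug, PartialEq, Eq)]
//! struct Scope(u32);
//!
//! // Hypothetical scoped IR node (non-zero-sized, so addresses are distinct).
//! struct Node {
//!     _payload: u32,
//! }
//!
//! #[derive(Default)]
//! struct Registry {
//!     by_ptr: HashMap<NonNull<()>, Scope>,
//!     by_comp: HashMap<CompId, Scope>,
//! }
//!
//! impl Registry {
//!     // The address is an opaque key; it is never dereferenced.
//!     fn register<T>(&mut self, node: &T, scope: Scope) {
//!         self.by_ptr.insert(NonNull::from(node).cast::<()>(), scope);
//!     }
//!     fn scope_of<T>(&self, node: &T) -> Option<Scope> {
//!         self.by_ptr.get(&NonNull::from(node).cast::<()>()).copied()
//!     }
//! }
//!
//! fn demo() -> (Option<Scope>, Option<Scope>, Option<Scope>) {
//!     let mut reg = Registry::default();
//!     let mut nodes: Vec<Box<Node>> = Vec::new();
//!
//!     nodes.push(Box::new(Node { _payload: 0 }));
//!     reg.register(&*nodes[0], Scope(1));
//!     reg.by_comp.insert(CompId(0), Scope(0));
//!
//!     // Growing the vector moves the `Box` handles, not the heap allocations
//!     // they point to, so the registered address remains a valid key.
//!     for i in 1..1024 {
//!         nodes.push(Box::new(Node { _payload: i }));
//!     }
//!
//!     let unregistered = Node { _payload: 0 };
//!     (
//!         reg.scope_of(&*nodes[0]),
//!         reg.by_comp.get(&CompId(0)).copied(),
//!         reg.scope_of(&unregistered),
//!     )
//! }
//!
//! fn main() {
//!     let (boxed, comp, other) = demo();
//!     assert_eq!(boxed, Some(Scope(1)));
//!     assert_eq!(comp, Some(Scope(0)));
//!     assert_eq!(other, None);
//! }
//! ```
//!
//! Storing nodes inline in the vector would not work in this sketch: a
//! reallocation would change their addresses and silently invalidate the keys.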
//!
//! ---
//!
//! ### Scope Resolution During Encoding
//!
//! During encoding, scopes are resolved dynamically using two stacks:
//!
//! - A **component ID stack**, tracking which component is currently being
//!   traversed
//! - A **scope stack**, tracking nested index spaces within that component
//!
//! When an IR node needs to resolve its associated scope:
//!
//! - If the node is a component, the current `ComponentId` is used to query the
//!   registry
//! - Otherwise, the node’s pointer identity is used to retrieve its `ScopeId`
//!
//! This design allows correct resolution of arbitrarily nested constructs such
//! as deeply nested components, instances, and `(outer ...)` references without
//! encoding traversal order into the registry itself.
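//!
//! The two-stack traversal can be sketched as follows. This is an illustration
//! under stated assumptions, not the crate's encoder: `EncodeCtx`, `CompId`,
//! and `Scope` are hypothetical names, and the sketch assumes that an
//! `(outer ...)` depth of 0 denotes the innermost active scope.
//!
//! ```rust
//! // Hypothetical stand-ins for the crate's component and scope ids.
//! #[derive(Clone, Copy, Debug, PartialEq, Eq)]
//! struct CompId(u32);
//! #[derive(Clone, Copy, Debug, PartialEq, Eq)]
//! struct Scope(u32);
//!
//! #[derive(Default)]
//! struct EncodeCtx {
//!     comp_stack: Vec<CompId>,
//!     scope_stack: Vec<Scope>,
//! }
//!
//! impl EncodeCtx {
//!     // Entering a component pushes onto both stacks.
//!     fn enter_component(&mut self, id: CompId, scope: Scope) {
//!         self.comp_stack.push(id);
//!         self.scope_stack.push(scope);
//!     }
//!     fn exit_component(&mut self) {
//!         self.comp_stack.pop();
//!         self.scope_stack.pop();
//!     }
//!     // Other scoped constructs only touch the scope stack.
//!     fn enter_scope(&mut self, scope: Scope) {
//!         self.scope_stack.push(scope);
//!     }
//!     fn exit_scope(&mut self) {
//!         self.scope_stack.pop();
//!     }
//!     fn current_component(&self) -> Option<CompId> {
//!         self.comp_stack.last().copied()
//!     }
//!     // Scope found `depth` levels out from the innermost one, if any.
//!     fn outer(&self, depth: usize) -> Option<Scope> {
//!         let len = self.scope_stack.len();
//!         len.checked_sub(depth + 1).map(|i| self.scope_stack[i])
//!     }
//! }
//!
//! fn demo() -> (Option<CompId>, Vec<Option<Scope>>) {
//!     let mut ctx = EncodeCtx::default();
//!     ctx.enter_component(CompId(0), Scope(0)); // root component
//!     ctx.enter_component(CompId(1), Scope(1)); // nested component
//!     ctx.enter_scope(Scope(2)); // e.g. an instance type inside it
//!
//!     let comp = ctx.current_component();
//!     let seen = vec![ctx.outer(0), ctx.outer(1), ctx.outer(2), ctx.outer(3)];
//!
//!     ctx.exit_scope();
//!     ctx.exit_component();
//!     ctx.exit_component();
//!     (comp, seen)
//! }
//!
//! fn main() {
//!     let (comp, seen) = demo();
//!     assert_eq!(comp, Some(CompId(1)));
//!     assert_eq!(
//!         seen,
//!         vec![Some(Scope(2)), Some(Scope(1)), Some(Scope(0)), None]
//!     );
//! }
//! ```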
//!
//! ---
//!
//! ### Safety and Invariants
//!
//! This design relies on the following invariants:
//!
//! - Each component is assigned a unique `ComponentId` that remains stable for
//!   its lifetime.
//! - All non-component IR nodes that participate in scoping are allocated in
//!   stable memory (e.g. boxed and stored in append-only vectors).
//! - IR nodes are never moved or removed after registration with the
//!   `IndexScopeRegistry`.
//! - `IndexScopeRegistry` entries are created during parsing and may be extended
//!   during instrumentation, but are never removed.
//! - Raw pointer usage is confined strictly to identity comparison; no pointer
//!   is ever dereferenced.
//!
//! These constraints allow the system to use low-level identity mechanisms in a
//! controlled, domain-specific way while preserving correctness and debuggability.
//!
//! ---
//!
//! ### Design Tradeoffs
//!
//! This approach deliberately favors:
//!
//! - Explicit scope modeling over implicit traversal order
//! - Stable identity over borrow-driven lifetimes
//! - Append-only IR construction over in-place mutation
//!
//! While this introduces some bookkeeping and indirection, it ensures that index
//! correctness is enforced structurally and remains robust in the presence of
//! instrumentation, reordering, and future extensions to the component model.
//!
//! In short: **index correctness is enforced structurally, not procedurally**.
//!
//! ## Why `ScopeOwnerKind` Exists
//!
//! In the IR, multiple wrapper structs may reference the same underlying
//! scoped node. For example, a user-facing struct might contain a field
//! pointing to a `CoreType` that is also stored directly in a component's
//! internal vectors. Without additional tracking, the scope resolution logic
//! would see the same pointer twice and could enter the scope it identifies a
//! second time.
//!
//! `ScopeOwnerKind` is used to **disambiguate these cases**. Each entry in the
//! scope registry records the kind of construct that owns the scope (for
//! example `CoreTypeModule` or `ComponentTypeInstance`), and `scope_entry`
//! returns an entry only when the querying node reports that same kind through
//! `GetScopeKind::scope_kind`. Node types that do not introduce a scope report
//! `Unregistered`, so a lookup through them does not match an owner's entry.
//!
//! This ensures that the same scope is **never entered twice**, preventing
//! double-counting or incorrect index resolution during encoding.
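//!
//! A minimal sketch of how a kind check can separate two nodes that share an
//! address. Everything here (`Kind`, `GetKind`, `CoreTy`, `Wrapper`,
//! `Registry`) is a hypothetical stand-in, and embedding the node as the first
//! field of a `#[repr(C)]` wrapper is only one assumed way such a collision can
//! arise.
//!
//! ```rust
//! use std::collections::HashMap;
//! use std::ptr::NonNull;
//!
//! #[derive(Clone, Copy, Debug, PartialEq, Eq)]
//! enum Kind {
//!     CoreTypeModule,
//!     Unregistered,
//! }
//!
//! trait GetKind {
//!     // Nodes that do not introduce a scope keep this default.
//!     fn kind(&self) -> Kind {
//!         Kind::Unregistered
//!     }
//! }
//!
//! struct CoreTy {
//!     _payload: u32,
//! }
//! #[repr(C)] // `inner` sits at offset 0, so wrapper and field share an address
//! struct Wrapper {
//!     inner: CoreTy,
//!     _extra: u32,
//! }
//!
//! impl GetKind for CoreTy {
//!     fn kind(&self) -> Kind {
//!         Kind::CoreTypeModule
//!     }
//! }
//! impl GetKind for Wrapper {}
//!
//! #[derive(Default)]
//! struct Registry {
//!     entries: HashMap<NonNull<()>, (u32, Kind)>,
//! }
//!
//! impl Registry {
//!     fn register<T: GetKind>(&mut self, node: &T, scope: u32) {
//!         let key = NonNull::from(node).cast::<()>();
//!         self.entries.insert(key, (scope, node.kind()));
//!     }
//!     // A hit counts only if the stored kind matches the querying node's kind.
//!     fn scope_of<T: GetKind>(&self, node: &T) -> Option<u32> {
//!         let (scope, kind) = self.entries.get(&NonNull::from(node).cast::<()>())?;
//!         (*kind == node.kind()).then_some(*scope)
//!     }
//! }
//!
//! fn demo() -> (bool, Option<u32>, Option<u32>) {
//!     let w = Box::new(Wrapper { inner: CoreTy { _payload: 7 }, _extra: 0 });
//!     let mut reg = Registry::default();
//!     reg.register(&w.inner, 3);
//!
//!     let same_addr = std::ptr::eq(
//!         &*w as *const Wrapper as *const (),
//!         &w.inner as *const CoreTy as *const (),
//!     );
//!     (same_addr, reg.scope_of(&w.inner), reg.scope_of(&*w))
//! }
//!
//! fn main() {
//!     let (same_addr, owner, wrapper) = demo();
//!     assert!(same_addr);
//!     assert_eq!(owner, Some(3));
//!     assert_eq!(wrapper, None);
//! }
//! ```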

use crate::ir::component::idx_spaces::ScopeId;
use crate::ir::id::ComponentId;
use crate::ir::types::CustomSection;
use crate::{Component, Module};
use std::cell::RefCell;
use std::collections::HashMap;
use std::ptr::NonNull;
use std::rc::Rc;
use wasmparser::{
    CanonicalFunction, CanonicalOption, ComponentAlias, ComponentDefinedType, ComponentExport,
    ComponentFuncType, ComponentImport, ComponentInstance, ComponentInstantiationArg,
    ComponentStartFunction, ComponentType, ComponentTypeDeclaration, ComponentTypeRef,
    ComponentValType, CompositeInnerType, CompositeType, CoreType, Export, FieldType, FuncType,
    Import, Instance, InstanceTypeDeclaration, InstantiationArg, ModuleTypeDeclaration,
    PrimitiveValType, RecGroup, RefType, StorageType, StructType, SubType, TypeRef, ValType,
    VariantCase,
};

/// ## Scope Tracking and Index Resolution
///
/// WebAssembly components introduce **nested index spaces**: components may
/// contain subcomponents, instances, types, and other constructs that define
/// their own indices. Inner scopes may also reference indices defined in
/// enclosing scopes via `(outer ...)`.
///
/// Because this crate supports **instrumentation and transformation** of
/// components, the order in which the IR is visited and encoded may differ from
/// the original parse order. As a result, index resolution cannot rely on
/// traversal order alone.
///
/// This module provides the infrastructure that ensures **correct and stable
/// index resolution** across parsing, instrumentation, and encoding.
///
/// ---
///
/// ### The Core Idea
///
/// Each IR node that participates in indexing is associated with a logical
/// **scope**. These associations are recorded once and later queried during
/// encoding.
///
/// The system guarantees that:
///
/// - Index scopes are assigned explicitly, not inferred from traversal order
/// - Nested scopes are resolved correctly, even under reordering or
///   instrumentation
/// - Encoding always uses the correct index space for the node being emitted
///
/// ---
///
/// ### Component Scopes
///
/// Components are identified by a stable **component ID** assigned when the
/// component is created or parsed.
///
/// Component scopes are registered and resolved using this ID rather than by
/// memory identity. During traversal, encoding maintains a **stack of component
/// IDs** representing the current nesting of components.
///
/// This makes component scope resolution:
///
/// - Independent of ownership or storage layout
/// - Robust to reordering and nested traversal
/// - Explicit and easy to reason about
///
/// ---
///
/// ### Scopes Within Components
///
/// All other scoped IR nodes—such as instances, type declarations, aliases, and
/// similar constructs—are associated with scopes relative to their enclosing
/// component.
///
/// During encoding, a **scope stack** tracks the currently active index spaces
/// as traversal enters and exits nested constructs. When an IR node needs to
/// resolve an index, its associated scope is retrieved and interpreted relative
/// to the current stack.
///
/// This allows deeply nested structures and `(outer ...)` references to be
/// encoded correctly without baking traversal assumptions into the IR.
///
/// ---
///
/// ### What This Enables
///
/// This design ensures that:
///
/// - Instrumentation can reorder or inject IR nodes without breaking index
///   correctness
/// - Encoding logic remains simple and declarative
/// - Index resolution remains correct for arbitrarily nested components
///
/// Users of the library do not need to manage scopes manually—scope tracking is
/// handled transparently as part of parsing and encoding.
///
/// ---
///
/// ### Design Philosophy
///
/// The scope system is intentionally explicit and conservative. Rather than
/// inferring meaning from traversal order, it records the structure of index
/// spaces directly and resolves them mechanically at encode time.
///
/// In short: **index correctness is enforced structurally, not procedurally**.
#[derive(Default, Debug)]
pub(crate) struct IndexScopeRegistry {
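    /// Scopes of non-component nodes, keyed by the node's type-erased address.
    /// The pointer is an identity key only and is never dereferenced.
    pub(crate) node_scopes: HashMap<NonNull<()>, ScopeEntry>,
    /// Scopes of components, keyed by their stable `ComponentId`.
    pub(crate) comp_scopes: HashMap<ComponentId, ScopeId>,
}
impl IndexScopeRegistry {
    /// Associates `node` with `space`, keyed by the node's address.
    ///
    /// In debug builds this asserts that the node's kind introduces a scope
    /// and that the address has not been registered before.
    pub fn register<T: GetScopeKind>(&mut self, node: &T, space: ScopeId) {
        let ptr = NonNull::from(node).cast::<()>();
        let kind = node.scope_kind();
        debug_assert_ne!(
            kind,
            ScopeOwnerKind::Unregistered,
            "attempted to register an unscoped node"
        );

        let old = self.node_scopes.insert(ptr, ScopeEntry { space, kind });

        debug_assert!(old.is_none(), "node registered twice: {:p}", node);
    }

    /// Returns the entry registered at `node`'s address, but only if the
    /// recorded kind matches `node.scope_kind()`; otherwise returns `None`.
    pub fn scope_entry<T: GetScopeKind>(&self, node: &T) -> Option<ScopeEntry> {
        let ptr = NonNull::from(node).cast::<()>();

        if let Some(entry) = self.node_scopes.get(&ptr) {
            if entry.kind == node.scope_kind() {
                return Some(*entry);
            }
        }
        None
    }
    /// Associates the component `comp_id` with `space`, replacing any
    /// previous entry for that ID.
    pub fn register_comp(&mut self, comp_id: ComponentId, space: ScopeId) {
        self.comp_scopes.insert(comp_id, space);
    }
    /// Returns the scope registered for the component `comp_id`, if any.
    pub fn scope_of_comp(&self, comp_id: ComponentId) -> Option<ScopeId> {
        self.comp_scopes.get(&comp_id).copied()
    }
}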

/// Shared, mutable handle to the registry. Any IR node may hold a clone of this
/// handle so that instrumentation can read and update the index scope mappings.
pub(crate) type RegistryHandle = Rc<RefCell<IndexScopeRegistry>>;

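/// The scope recorded for a registered node, together with the kind of
/// construct that owns it.
#[derive(Debug, Clone, Copy)]
pub struct ScopeEntry {
    pub space: ScopeId,
    pub kind: ScopeOwnerKind,
}

/// The kind of construct that owns a scope. Extend with new variants as more
/// scope-introducing constructs are supported.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScopeOwnerKind {
    /// A `(component ...)`
    Component,

    /// A core `(core type (module ...))`
    CoreTypeModule,

    /// A `(component type (component ...))`
    ComponentTypeComponent,
    /// A `(component type (instance ...))`
    ComponentTypeInstance,

    /// The node does not introduce a scope and must not be registered.
    Unregistered,
}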

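/// Reports which kind of scope, if any, an IR node introduces.
///
/// The default implementation returns `ScopeOwnerKind::Unregistered`, so only
/// node types that introduce a scope need to override `scope_kind`.
pub trait GetScopeKind {
    fn scope_kind(&self) -> ScopeOwnerKind {
        ScopeOwnerKind::Unregistered
    }
}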
impl GetScopeKind for Component<'_> {
    fn scope_kind(&self) -> ScopeOwnerKind {
        ScopeOwnerKind::Component
    }
}
impl GetScopeKind for CoreType<'_> {
    fn scope_kind(&self) -> ScopeOwnerKind {
        match self {
            CoreType::Module(_) => ScopeOwnerKind::CoreTypeModule,
            // other variants that do NOT introduce scopes should never be registered
            _ => ScopeOwnerKind::Unregistered,
        }
    }
}
impl GetScopeKind for ComponentType<'_> {
    fn scope_kind(&self) -> ScopeOwnerKind {
        match self {
            ComponentType::Component(_) => ScopeOwnerKind::ComponentTypeComponent,
            ComponentType::Instance(_) => ScopeOwnerKind::ComponentTypeInstance,
            ComponentType::Defined(_) | ComponentType::Func(_) | ComponentType::Resource { .. } => {
                ScopeOwnerKind::Unregistered
            }
        }
    }
}
impl GetScopeKind for Module<'_> {}
impl GetScopeKind for ComponentTypeRef {}
impl GetScopeKind for ComponentDefinedType<'_> {}
impl GetScopeKind for ComponentFuncType<'_> {}
impl GetScopeKind for ComponentTypeDeclaration<'_> {}
impl GetScopeKind for InstanceTypeDeclaration<'_> {}
impl GetScopeKind for ComponentInstance<'_> {}
impl GetScopeKind for CanonicalFunction {}
impl GetScopeKind for ComponentAlias<'_> {}
impl GetScopeKind for ComponentImport<'_> {}
impl GetScopeKind for ComponentExport<'_> {}
impl GetScopeKind for Instance<'_> {}
impl GetScopeKind for ComponentStartFunction {}
impl GetScopeKind for CustomSection<'_> {}
impl GetScopeKind for ValType {}
impl GetScopeKind for ComponentInstantiationArg<'_> {}
impl GetScopeKind for CanonicalOption {}
impl GetScopeKind for ComponentValType {}
impl GetScopeKind for InstantiationArg<'_> {}
impl GetScopeKind for Export<'_> {}
impl GetScopeKind for PrimitiveValType {}
impl GetScopeKind for VariantCase<'_> {}
impl GetScopeKind for CompositeInnerType {}
impl GetScopeKind for FuncType {}
impl GetScopeKind for FieldType {}
impl GetScopeKind for StructType {}
impl GetScopeKind for CompositeType {}
impl GetScopeKind for StorageType {}
impl GetScopeKind for RefType {}
impl GetScopeKind for RecGroup {}
impl GetScopeKind for ModuleTypeDeclaration<'_> {}
impl GetScopeKind for Import<'_> {}
impl GetScopeKind for TypeRef {}
impl GetScopeKind for SubType {}

/// Assert that a node is registered in the `IndexScopeRegistry` at this point.
/// Panics if the node is not found.
/// This helps with debugging issues where a node may have been moved and
/// no longer upholds the invariants required by the scope lookup mechanism.
/// The check runs only in debug builds, not in release builds, because it is
/// wrapped in a `debug_assert!`.
#[macro_export]
macro_rules! assert_registered {
    ($registry:expr, $node:expr) => {{
        debug_assert!(
            $registry.borrow().scope_entry($node).is_some(),
            "Debug assertion failed: node is not registered in ScopeRegistry: {:?}",
            $node
        );
    }};
}
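/// Assert that a node is registered in the `IndexScopeRegistry` under the
/// expected scope ID. Panics if the node is not registered or if its recorded
/// scope differs from `$scope_id`. Like `assert_registered!`, the check runs
/// only in debug builds because it is wrapped in a `debug_assert_eq!`.
#[macro_export]
macro_rules! assert_registered_with_id {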
    ($registry:expr, $node:expr, $scope_id:expr) => {{
        debug_assert_eq!(
            $scope_id,
            $registry
                .borrow()
                .scope_entry($node)
                .expect(concat!(
                    "Debug assertion failed: node is not registered in ScopeRegistry: ",
                    stringify!($node)
                ))
                .space
        );
    }};
}

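/// Lookup table from `ComponentId` to a borrowed component, covering a root
/// component and all of its nested subcomponents.
#[derive(Clone, Debug)]
pub struct ComponentStore<'a> {
    components: HashMap<ComponentId, &'a Component<'a>>,
}
impl<'a> ComponentStore<'a> {
    /// Returns the component registered under `id`.
    ///
    /// # Panics
    ///
    /// Panics if no component with this ID is in the store.
    pub fn get(&self, id: &ComponentId) -> &'a Component<'a> {
        self.components.get(id).unwrap()
    }
}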

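/// Builds a `ComponentStore` by recursively walking `root` and every nested
/// subcomponent, recording each one under its `ComponentId`.
pub fn build_component_store<'a>(root: &'a Component<'a>) -> ComponentStore<'a> {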
    let mut map = HashMap::new();

    fn walk<'a>(comp: &'a Component<'a>, map: &mut HashMap<ComponentId, &'a Component<'a>>) {
        map.insert(comp.id, comp);
        for child in comp.components.iter() {
            walk(child, map);
        }
    }

    walk(root, &mut map);

    ComponentStore { components: map }
}