webnn-graph 0.3.0

Simple DSL for WebNN graphs
# Dynamic Dimensions Guide

A practical guide for choosing dimension override values when converting ONNX models to WebNN format.

## Table of Contents

- [Understanding Dynamic Dimensions](#understanding-dynamic-dimensions)
- [Inspection Methods](#inspection-methods)
- [Common Values by Model Type](#common-values-by-model-type)
- [Decision-Making Process](#decision-making-process)
- [Troubleshooting](#troubleshooting)

## Understanding Dynamic Dimensions

### What Are Dynamic Dimensions?

ONNX models often use symbolic dimensions (like `batch_size`, `sequence_length`) instead of fixed
numbers. This allows the same model to handle different input sizes.

Example ONNX input shape:
```
input_ids: [batch_size, sequence_length]  # Dynamic
```

In many models, shape-driving expressions must become static for conversion:
```
input_ids: [1, 128]  # Static
```
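As a conceptual illustration only (this is not how `webnn-graph` is implemented), substituting overrides into a symbolic shape can be sketched as below. The `resolve_shape` helper is a hypothetical name introduced for this example.

```python
from typing import Dict, List, Union

Dim = Union[int, str]  # an int is already static, a str is a symbolic name


def resolve_shape(shape: List[Dim], overrides: Dict[str, int]) -> List[int]:
    """Replace each symbolic dimension with its override value (hypothetical helper)."""
    resolved = []
    missing = []
    for dim in shape:
        if isinstance(dim, int):
            resolved.append(dim)
        elif dim in overrides:
            resolved.append(overrides[dim])
        else:
            missing.append(dim)
    if missing:
        # Mirrors the idea that every shape-driving symbol needs a concrete value.
        raise ValueError(f"no override for: {', '.join(missing)}")
    return resolved


static_shape = resolve_shape(
    ["batch_size", "sequence_length"],
    {"batch_size": 1, "sequence_length": 128},
)
print(static_shape)  # [1, 128]
```

A shape with a symbol that has no override raises an error, which is the same situation the converter reports when it asks for an explicit `--override-dim`.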

### Why Provide Overrides?

WebNN runs in browsers and on edge devices, where:
- Memory must be allocated upfront
- Shape-driving expressions (for example, reshape targets) must be resolvable
- Performance is optimized for specific sizes

`webnn-graph` can preserve unresolved symbolic **input metadata** in v2 graphs with
`--experimental-dynamic-inputs`, but conversion still needs concrete values when dynamic shape math
cannot be folded.

## Inspection Methods

### Method 1: Using Python + ONNX

```bash
# Install the onnxslim CLI (a Python package), then print a summary of the model
pip install onnxslim
onnxslim --inspect model.onnx
```

### Method 2: Using Netron

1. Install Netron: `pip install netron`
2. Open model: `netron model.onnx`
3. Click on input nodes to see shape information
4. Look for dimension parameters like `batch_size`, `seq_len`, etc.

### Method 3: Check Model Documentation

Many model pages on Hugging Face state the expected input dimensions:

```bash
# Visit the model page
# https://huggingface.co/<org>/<model-name>

# Look for:
# - "Model Details" section
# - "max_seq_length" in config
# - Example usage code
```

### Method 4: Check the Sidecar File

webnn-graph supports automatic dimension discovery via `.dims.json` files:

```bash
# If model.onnx has a model.dims.json file, check it:
cat model.dims.json
```

Example content:
```json
{
  "freeDimensionOverrides": {
    "batch_size": 1,
    "sequence_length": 128
  }
}
```
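Before relying on a sidecar file, it can help to sanity-check its contents. The sketch below parses the JSON shown above and verifies that each override is a positive integer. The `read_overrides` helper is hypothetical, and the positive-integer rule is an assumption made for this example rather than a documented `webnn-graph` requirement.

```python
import json
from typing import Dict


def read_overrides(dims_json_text: str) -> Dict[str, int]:
    """Parse sidecar text and check every override value (hypothetical helper)."""
    config = json.loads(dims_json_text)
    overrides = config.get("freeDimensionOverrides", {})
    for name, value in overrides.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        # Assumption: a usable override is an integer >= 1.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(
                f"override for {name!r} must be a positive integer, got {value!r}"
            )
    return overrides


example = '{"freeDimensionOverrides": {"batch_size": 1, "sequence_length": 128}}'
print(read_overrides(example))  # {'batch_size': 1, 'sequence_length': 128}
```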

## Common Values by Model Type

### Text / NLP Models (Transformers, BERT, GPT)

**Typical dimensions:**
```bash
--override-dim batch_size=1
--override-dim sequence_length=<128|256|512>
```

**Choosing sequence_length:**

| Model Type | Recommended | Max | Use Case |
|------------|-------------|-----|----------|
| Sentence embeddings (MiniLM, MPNet) | 128 | 256 | Sentences, titles, short text |
| Document classification (BERT) | 256 | 512 | Paragraphs, articles |
| Question answering (BERT, RoBERTa) | 384 | 512 | Q&A pairs with context |
| Text generation (GPT-2) | 512 | 1024 | Long-form generation |
| Long-document (Longformer) | 1024 | 4096 | Full documents |

**How to determine:**
```python
# Check tokenizer max length
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("model-name")
print(f"Max length: {tokenizer.model_max_length}")
```

**Common dimension parameter names:**
- `batch_size`, `batch`, `N`, `B`
- `sequence_length`, `seq_len`, `max_len`, `T`, `L`
- `hidden_size`, `hidden_dim` (usually fixed, not dynamic)

### Vision Models (CNNs, Vision Transformers)

**Typical dimensions:**
```bash
--override-dim batch_size=1
--override-dim height=224
--override-dim width=224
```

**Standard image sizes by architecture:**

| Architecture | Size | Notes |
|--------------|------|-------|
| ResNet-50/101 | 224×224 | ImageNet standard |
| EfficientNet-B0 | 224×224 | Scales up with variants |
| EfficientNet-B7 | 600×600 | Higher accuracy, slower |
| Vision Transformer (ViT-B) | 224×224 or 384×384 | Two common variants |
| MobileNet V2/V3 | 224×224 | Mobile-optimized |
| YOLO (object detection) | 416×416 or 640×640 | Detection-specific |
| Semantic segmentation | 512×512 or 1024×1024 | Full-resolution |

**How to determine:**
```python
# Check preprocessing configuration
from transformers import AutoImageProcessor
processor = AutoImageProcessor.from_pretrained("model-name")
print(f"Size: {processor.size}")
```

**Common dimension parameter names:**
- `batch_size`, `batch`, `N`, `B`
- `height`, `H`, `image_height`
- `width`, `W`, `image_width`
- `channels`, `C` (usually 3 for RGB, 1 for grayscale)

### Audio Models (Speech Recognition, Audio Classification)

**Typical dimensions:**
```bash
--override-dim batch_size=1
--override-dim sequence_length=<varies>  # Based on audio duration
```

**Determining sequence_length for audio:**
```python
# Formula: sequence_length = sample_rate * duration_seconds / hop_length

# Example for Wav2Vec2 (16kHz, 10 seconds)
sample_rate = 16000
duration = 10  # seconds
hop_length = 320  # model-specific
sequence_length = (sample_rate * duration) // hop_length
# Result: ~500 for 10-second audio
```

**Common values:**
- Whisper: Processes 30-second chunks → sequence_length based on mel spectrogram frames
- Wav2Vec2: Variable based on audio duration
- Audio classification: Often 16000 samples (1 second at 16kHz)

## Decision-Making Process

### Step-by-Step Guide

#### Step 1: Identify Dynamic Dimensions That Need Values

Try converting without overrides first:

```bash
./webnn-graph convert-onnx --input model.onnx
```

If conversion cannot resolve the required dimensions, the error output lists the overrides to set:
```
Error: unresolved dynamic dimension(s) require explicit overrides:
 - input 'input_ids' dim 'batch_size': --override-dim batch_size=<value>
 - input 'input_ids' dim 'sequence_length': --override-dim sequence_length=<value>
```
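If you script the conversion, the dimension names can be pulled out of that message. This is a minimal sketch that assumes the error text has exactly the format shown above and that dimension names consist of letters, digits, and underscores. `unresolved_dims` is a hypothetical helper, not part of `webnn-graph`.

```python
import re
from typing import List

# Sample text copied from the error message shown above.
ERROR_TEXT = """\
Error: unresolved dynamic dimension(s) require explicit overrides:
 - input 'input_ids' dim 'batch_size': --override-dim batch_size=<value>
 - input 'input_ids' dim 'sequence_length': --override-dim sequence_length=<value>
"""

# Matches the suggested flag at the end of each bullet line.
SUGGESTION = re.compile(r"--override-dim (\w+)=<value>")


def unresolved_dims(error_text: str) -> List[str]:
    """Return unique dimension names in the order they are reported."""
    names: List[str] = []
    for name in SUGGESTION.findall(error_text):
        if name not in names:
            names.append(name)
    return names


print(unresolved_dims(ERROR_TEXT))  # ['batch_size', 'sequence_length']
```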

#### Step 2: Determine Model Type

Look at the model filename or documentation:
- `*bert*`, `*roberta*`, `*gpt*` → Text model
- `*resnet*`, `*efficientnet*`, `*vit*` → Vision model
- `*wav2vec*`, `*whisper*` → Audio model
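The filename patterns above can be written as a small lookup. This is a rough heuristic sketch: `PATTERNS` and `guess_model_type` are hypothetical names, and substring matching will miss models whose filenames do not mention the architecture, so treat an `unknown` result as a cue to check the documentation.

```python
from fnmatch import fnmatch

# Filename patterns taken from the list above; extend as needed.
PATTERNS = {
    "text": ["*bert*", "*roberta*", "*gpt*"],
    "vision": ["*resnet*", "*efficientnet*", "*vit*"],
    "audio": ["*wav2vec*", "*whisper*"],
}


def guess_model_type(filename: str) -> str:
    """Guess text/vision/audio from a filename, or return 'unknown'."""
    name = filename.lower()  # the patterns are lowercase, so normalize first
    for model_type, patterns in PATTERNS.items():
        if any(fnmatch(name, pattern) for pattern in patterns):
            return model_type
    return "unknown"


for filename in ("resnet50.onnx", "whisper-tiny.onnx", "all-MiniLM-L12-v2.onnx"):
    print(filename, "->", guess_model_type(filename))
```

The last filename contains none of the patterns and falls through to `unknown`, even though it is a text model.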

#### Step 3: Start with Conservative Values

**Always start with:**
- `batch_size=1` (single inference)
- Smallest reasonable size for other dimensions

**Why start small?**
- Faster conversion and testing
- Less memory usage
- Easier to debug
- Can always increase later

#### Step 4: Look Up Standard Values

Use the tables in [Common Values by Model Type](#common-values-by-model-type) above.

#### Step 5: Test and Iterate

```bash
# Test with initial values
./webnn-graph convert-onnx \
  --input model.onnx \
  --override-dim batch_size=1 \
  --override-dim sequence_length=128

# If successful, test inference
# If shapes are wrong, adjust and retry
```
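When iterating over several candidate values, it may be convenient to generate the command line programmatically. The sketch below only builds and prints the commands, using the flags shown in this guide; `convert_command` is a hypothetical helper, and the `./webnn-graph` path is assumed to match your setup.

```python
import shlex
from typing import Dict, List


def convert_command(
    model: str, overrides: Dict[str, int], optimize: bool = False
) -> List[str]:
    """Build the argv list for one conversion attempt (hypothetical helper)."""
    argv = ["./webnn-graph", "convert-onnx", "--input", model]
    if optimize:
        argv.append("--optimize")
    for name, value in overrides.items():
        argv += ["--override-dim", f"{name}={value}"]
    return argv


# Try the smallest candidate first, then the next one up.
for seq_len in (128, 256):
    argv = convert_command(
        "model.onnx", {"batch_size": 1, "sequence_length": seq_len}
    )
    print(shlex.join(argv))
    # To actually run it: subprocess.run(argv, check=True)
```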

### Example Workflows

#### Example 1: BERT Sentence Embeddings

```bash
# Model: sentence-transformers/all-MiniLM-L12-v2
# Task: Generate sentence embeddings

# 1. Check documentation
# Hugging Face says: max_seq_length = 256

# 2. Choose conservative value
# Most sentences < 128 tokens, so start there

# 3. Convert
./webnn-graph convert-onnx \
  --input all-MiniLM-L12-v2.onnx \
  --override-dim batch_size=1 \
  --override-dim sequence_length=128 \
  --optimize

# 4. If you need longer sequences, increase
./webnn-graph convert-onnx \
  --input all-MiniLM-L12-v2.onnx \
  --override-dim batch_size=1 \
  --override-dim sequence_length=256 \
  --optimize
```

#### Example 2: ResNet Image Classification

```bash
# Model: ResNet-50
# Task: Image classification

# 1. Standard size for ResNet is 224×224
# 2. No need to check - this is well-known

# 3. Convert
./webnn-graph convert-onnx \
  --input resnet50.onnx \
  --override-dim batch_size=1 \
  --override-dim height=224 \
  --override-dim width=224
```

#### Example 3: Unknown Custom Model

```bash
# 1. Inspect the model
python -c "
import onnx
model = onnx.load('custom_model.onnx')
for inp in model.graph.input:
    print(inp.name, inp.type.tensor_type.shape)
"

# Output shows: data [N, 3, H, W]
# This is an image model (3 channels)

# 2. Try standard image sizes
./webnn-graph convert-onnx \
  --input custom_model.onnx \
  --override-dim N=1 \
  --override-dim H=224 \
  --override-dim W=224

# 3. If conversion fails, check error messages
# 4. Try other common sizes: 256, 299, 384, 512
```

## Troubleshooting

### Problem: "Dynamic dimensions require explicit overrides"

**Solution:** Some model paths still need concrete values for symbolic dimensions.

```bash
# Error shows which dimensions need values
Error: unresolved dynamic dimension(s) require explicit overrides:
 - input 'input' dim 'height': --override-dim height=<value>

# Provide the missing dimensions
./webnn-graph convert-onnx \
  --input model.onnx \
  --override-dim height=224 \
  --override-dim width=224
```

### Problem: "Shape inference failed"

**Cause:** The dimension values you provided are incompatible with the model's operations.

**Solution:**

1. Check if the model has specific size requirements:
```python
# Some models only work with specific sizes
# e.g., YOLO expects multiples of 32
```

2. Try multiples of common factors:
```bash
# For CNNs: try 224, 256, 288, 320, 384, 416, 512
# For transformers: try 128, 256, 384, 512
```

3. Enable optimization to help with shape inference:
```bash
./webnn-graph convert-onnx \
  --input model.onnx \
  --optimize \
  --override-dim ...
```
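For models that expect sizes divisible by a stride (the multiples-of-32 case mentioned above), a candidate size can be rounded up before retrying. This is a minimal sketch; `round_up_to_multiple` is a hypothetical helper, and the factor of 32 is only a default that you should confirm for your model.

```python
def round_up_to_multiple(size: int, factor: int = 32) -> int:
    """Return the smallest multiple of `factor` that is >= size."""
    return ((size + factor - 1) // factor) * factor


for size in (224, 300, 416, 500):
    print(size, "->", round_up_to_multiple(size))
# 224 -> 224
# 300 -> 320
# 416 -> 416
# 500 -> 512
```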

### Problem: "Constant folding failed"

**Cause:** The model uses dynamic operations that can't be resolved with the given dimensions.

**Solution:**

1. Make sure you've provided ALL dynamic dimensions:
```bash
# Check for missing dimensions
python -c "
import onnx
model = onnx.load('model.onnx')
for inp in model.graph.input:
    for dim in inp.type.tensor_type.shape.dim:
        if dim.dim_param:
            print(f'Dynamic: {dim.dim_param}')
"
```

2. Use the `--optimize` flag:
```bash
./webnn-graph convert-onnx \
  --input model.onnx \
  --optimize \
  --override-dim batch_size=1 \
  --override-dim sequence_length=128
```

### Problem: Conversion succeeds but inference fails

**Cause:** The dimension values do not match your actual input data, typically because they are too small.

**Solution:**

1. Check your actual input size:
```javascript
// In JavaScript
console.log("Input shape:", inputData.shape);
```

2. Increase dimensions to match:
```bash
# If your inputs are 256 tokens but you used 128
./webnn-graph convert-onnx \
  --input model.onnx \
  --override-dim sequence_length=256  # Increased from 128
```
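Because the converted graph has a fixed `sequence_length`, every input has to be brought to exactly that length. The padding logic is sketched here in Python; the same steps apply in whatever language prepares your inputs. `fit_to_length` is a hypothetical helper, and `pad_id=0` is an assumption (check your tokenizer's padding token ID).

```python
from typing import List, Tuple


def fit_to_length(
    token_ids: List[int], length: int, pad_id: int = 0
) -> Tuple[List[int], List[int]]:
    """Truncate or right-pad token_ids to exactly `length` and build the mask."""
    kept = token_ids[:length]
    padding = length - len(kept)
    input_ids = kept + [pad_id] * padding
    # 1 marks real tokens, 0 marks padding.
    attention_mask = [1] * len(kept) + [0] * padding
    return input_ids, attention_mask


ids, mask = fit_to_length([101, 7592, 2088, 102], length=6)
print(ids)   # [101, 7592, 2088, 102, 0, 0]
print(mask)  # [1, 1, 1, 1, 0, 0]
```

Plain truncation also drops any trailing special token, so a tokenizer's own truncation option is usually the better choice when one is available.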

### Problem: Out of memory during conversion

**Cause:** Dimension values are too large.

**Solution:**

1. Reduce to smaller values:
```bash
# Instead of 512, try 256
# Instead of 256, try 128
```

2. Use batch_size=1 (not higher):
```bash
--override-dim batch_size=1  # Use 1 for browser/edge inference
```

## Best Practices

### 1. Always Start with batch_size=1

For inference in WebNN (browsers/edge devices):
```bash
--override-dim batch_size=1
```

Only increase batch size if you're doing batch processing server-side.

### 2. Match Your Use Case

Choose dimensions based on your actual inputs:
- Short texts (tweets, titles): `sequence_length=128`
- Medium texts (articles): `sequence_length=256`
- Long texts (documents): `sequence_length=512+`
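One way to ground this choice in data is to measure token counts on a sample of your inputs and pick the smallest standard length that covers most of them. The sketch below assumes you already have a list of token counts; `pick_sequence_length`, the candidate list, and the coverage threshold are illustrative choices, not recommendations from `webnn-graph`.

```python
import math
from typing import Sequence


def pick_sequence_length(
    token_counts: Sequence[int],
    candidates: Sequence[int] = (128, 256, 512),
    coverage_percent: int = 95,
) -> int:
    """Smallest candidate that fits at least coverage_percent of the samples."""
    ordered = sorted(candidates)
    needed = math.ceil(len(token_counts) * coverage_percent / 100)
    for length in ordered:
        fitting = sum(1 for count in token_counts if count <= length)
        if fitting >= needed:
            return length
    # Nothing reaches the target coverage; fall back to the largest candidate.
    return ordered[-1]


# Hypothetical token counts measured on ten sample inputs.
counts = [12, 30, 45, 60, 80, 95, 110, 120, 150, 300]
print(pick_sequence_length(counts, coverage_percent=80))  # 128
print(pick_sequence_length(counts, coverage_percent=90))  # 256
```

Inputs longer than the chosen length still need to be truncated before inference.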

### 3. Consider Memory Constraints

Larger dimensions = more memory:
```
Memory ∝ batch_size × sequence_length × hidden_size
```

For browser inference, prefer smaller dimensions.
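To make the proportionality concrete, the size of a single activation tensor can be estimated as below. This is a back-of-the-envelope sketch: `activation_bytes` is a hypothetical helper, `hidden_size=384` and 4-byte float32 elements are assumptions for the example, and a real model holds many such tensors plus attention scores, which grow with the square of `sequence_length`.

```python
def activation_bytes(
    batch_size: int,
    sequence_length: int,
    hidden_size: int,
    bytes_per_element: int = 4,  # assumption: float32
) -> int:
    """Bytes needed for one [batch, sequence, hidden] tensor."""
    return batch_size * sequence_length * hidden_size * bytes_per_element


for seq_len in (128, 256, 512):
    size = activation_bytes(1, seq_len, 384)
    print(f"sequence_length={seq_len}: {size // 1024} KiB per activation tensor")
# sequence_length=128: 192 KiB per activation tensor
# sequence_length=256: 384 KiB per activation tensor
# sequence_length=512: 768 KiB per activation tensor
```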

### 4. Use Standard Values When Possible

Standard sizes are widely used and well-tested:
- Images: 224, 256, 384, 512
- Text: 128, 256, 512, 1024

All of these are multiples of 32, and most are powers of 2.

### 5. Document Your Choices

Create a `.dims.json` file alongside your model:

```json
{
  "freeDimensionOverrides": {
    "batch_size": 1,
    "sequence_length": 128
  },
  "notes": "128 tokens handles 95% of our sentences. Max length is 256 if needed."
}
```

### 6. Test with Real Data

After conversion, test with actual inputs:
```javascript
// Verify converted model works with your data
const result = await context.compute(graph, {
  input_ids: actualInputIds,  // Your real input
  attention_mask: actualMask
});
```

## Quick Reference

### Text Models
```bash
--override-dim batch_size=1 --override-dim sequence_length=128
```

### Vision Models
```bash
--override-dim batch_size=1 --override-dim height=224 --override-dim width=224
```

### Unknown Model
```bash
# 1. Try without overrides (see error message)
# 2. Check model documentation
# 3. Start with smallest reasonable values
# 4. Iterate based on errors/results
```

## Additional Resources

- [ONNX Model Zoo](https://github.com/onnx/models) - Standard models with known shapes
- [Hugging Face Model Hub](https://huggingface.co/models) - Model documentation
- [Netron](https://netron.app/) - Visual model inspector
- [WebNN Specification](https://www.w3.org/TR/webnn/) - WebNN requirements

## Summary

**The decision process:**
1. Try converting without overrides and capture unresolved dimensions from the error
2. Determine model type (text/vision/audio)
3. Look up standard values for that model type
4. Start with conservative (small) values
5. Test and iterate as needed

**Most common pattern:**
```bash
./webnn-graph convert-onnx \
  --input model.onnx \
  --optimize \
  --override-dim batch_size=1 \
  --override-dim <other_dims>=<standard_value>
```

Remember: **batch_size=1** is almost always correct for inference!