arborium 2.1.1

Tree-sitter syntax highlighting with HTML rendering and WASM support
# arborium

Batteries-included [tree-sitter](https://tree-sitter.github.io/tree-sitter/) grammar collection with HTML rendering and WASM support.

[![Crates.io](https://img.shields.io/crates/v/arborium.svg)](https://crates.io/crates/arborium)
[![Documentation](https://docs.rs/arborium/badge.svg)](https://docs.rs/arborium)
[![License](https://img.shields.io/crates/l/arborium.svg)](LICENSE-MIT)

## Features

- **69 language grammars** included out of the box
- **67 permissively licensed** (MIT/Apache-2.0/CC0/Unlicense) grammars enabled by default
- **WASM support** with custom allocator fix
- **Feature flags** for fine-grained control over included languages

## Usage

```toml
[dependencies]
arborium = "1.1"
```

By default, all permissively-licensed grammars are included. To select specific languages:

```toml
[dependencies]
arborium = { version = "1.1", default-features = false, features = ["lang-rust", "lang-javascript"] }
```

## Browser Usage

Arborium can be used in the browser in three ways:

### Option 1: Drop-in Script (Easiest)

Add a single script tag and arborium auto-highlights all code blocks:

```html
<script src="https://cdn.jsdelivr.net/npm/@arborium/arborium/dist/arborium.iife.js"></script>
```

That's it! Arborium will:
- Auto-detect languages from `class="language-*"` or `data-lang="*"` attributes
- Load grammar WASM plugins on-demand from jsDelivr CDN
- Inject the default theme CSS
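
The auto-detection step above can be sketched as a small pure function. This is only an illustration, not arborium's actual implementation: the helper name `detectLanguage` is hypothetical, and giving `data-lang` priority over the class name is an assumption.

```typescript
// Hypothetical helper for illustration; not part of the arborium API.
// Resolves a language name from an element's class attribute and data-lang value.
function detectLanguage(className: string, dataLang?: string | null): string | null {
  // Assumption: an explicit data-lang attribute wins over the class name.
  const explicit = dataLang?.trim();
  if (explicit) return explicit.toLowerCase();

  // Otherwise use the first class token of the form "language-<name>".
  const prefix = "language-";
  for (const token of className.split(/\s+/)) {
    if (token.startsWith(prefix) && token.length > prefix.length) {
      return token.slice(prefix.length).toLowerCase();
    }
  }
  return null; // no language hint found
}

console.log(detectLanguage("hljs language-rust")); // prints "rust"
console.log(detectLanguage("", "TOML"));           // prints "toml"
console.log(detectLanguage("plain"));              // prints null
```

In a page, the inputs would come from `element.className` and `element.getAttribute('data-lang')`.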

**Configuration via data attributes:**

```html
<script
  src="https://cdn.jsdelivr.net/npm/@arborium/arborium/dist/arborium.iife.js"
  data-theme="mocha"
  data-selector="pre code"
  data-manual
></script>
```

**Configuration via JavaScript:**

```html
<script>
  window.Arborium = {
    theme: 'tokyo-night',
    selector: 'pre code, .highlight',
    cdn: 'jsdelivr',  // or 'unpkg' or a custom URL
    version: '1', // or 'latest'
  };
</script>
<script src="https://cdn.jsdelivr.net/npm/@arborium/arborium/dist/arborium.iife.js"></script>
```

**Manual highlighting:**

```html
<script src="..." data-manual></script>
<script>
  // Highlight all code blocks
  arborium.highlightAll();

  // Highlight a specific element
  arborium.highlightElement(document.querySelector('code'), 'rust');
</script>
```

### Option 2: ESM Module (Programmatic)

For bundlers (Vite, webpack, etc.) or ESM-native environments:

```bash
npm install @arborium/arborium
```

```typescript
import { loadGrammar, highlight } from '@arborium/arborium';

const code = 'fn main() { println!("Hello!"); }';

// Load a grammar (fetched from CDN on first use)
const grammar = await loadGrammar('rust');

// Highlight code with the loaded grammar
const html = grammar.highlight(code);

// Or use the convenience function
const sameHtml = await highlight('rust', code);
```

### Option 3: Compile Rust to WASM (Maximum Control)

For complete control and offline-first apps, compile the Rust crate directly to WASM:

```toml
[dependencies]
arborium = { version = "1.1", default-features = false, features = ["lang-rust", "lang-javascript"] }
```

```bash
# Requires LLVM with WASM support (see FAQ below)
cargo build --target wasm32-unknown-unknown
```

This embeds selected grammars directly in your WASM binary - no CDN required at runtime.

### Themes

Arborium includes 32 built-in themes from popular color schemes.

**Dark themes:** `catppuccin-mocha`, `catppuccin-macchiato`, `catppuccin-frappe`, `dracula`, `tokyo-night`, `nord`, `one-dark`, `github-dark`, `gruvbox-dark`, `monokai`, `kanagawa-dragon`, `rose-pine-moon`, `ayu-dark`, `solarized-dark`, `ef-melissa-dark`, `melange-dark`, `cobalt2`, `zenburn`, `desert256`, `rustdoc-dark`, `rustdoc-ayu`

**Light themes:** `catppuccin-latte`, `github-light`, `gruvbox-light`, `ayu-light`, `solarized-light`, `melange-light`, `light-owl`, `lucius-light`, `dayfox`, `alabaster`, `rustdoc-light`

Import theme CSS:
```html
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@arborium/arborium/dist/themes/tokyo-night.css">
```

Or let the IIFE bundle auto-inject it via the `data-theme` attribute.
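
If you build the stylesheet link programmatically, a helper along these lines may be useful. It is a sketch under assumptions: `themeCssUrl` is a hypothetical name, the theme set is copied from the lists above, and the optional version pin relies on jsDelivr's `package@version` URL syntax rather than anything arborium-specific.

```typescript
// Theme names copied from the dark and light lists above.
const THEMES: ReadonlySet<string> = new Set([
  "catppuccin-mocha", "catppuccin-macchiato", "catppuccin-frappe", "dracula",
  "tokyo-night", "nord", "one-dark", "github-dark", "gruvbox-dark", "monokai",
  "kanagawa-dragon", "rose-pine-moon", "ayu-dark", "solarized-dark",
  "ef-melissa-dark", "melange-dark", "cobalt2", "zenburn", "desert256",
  "rustdoc-dark", "rustdoc-ayu",
  "catppuccin-latte", "github-light", "gruvbox-light", "ayu-light",
  "solarized-light", "melange-light", "light-owl", "lucius-light", "dayfox",
  "alabaster", "rustdoc-light",
]);

// Hypothetical helper: builds the jsDelivr URL for a built-in theme stylesheet.
function themeCssUrl(theme: string, version?: string): string {
  if (!THEMES.has(theme)) {
    throw new Error(`Unknown theme: ${theme}`);
  }
  // jsDelivr accepts an optional "@<version>" suffix after the package name.
  const pkg = version ? `@arborium/arborium@${version}` : "@arborium/arborium";
  return `https://cdn.jsdelivr.net/npm/${pkg}/dist/themes/${theme}.css`;
}

console.log(themeCssUrl("tokyo-night"));
// prints "https://cdn.jsdelivr.net/npm/@arborium/arborium/dist/themes/tokyo-night.css"
```

Rejecting unknown names early avoids a silent 404 on the stylesheet request.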

**Theme Attribution:** All themes are adaptations of color schemes from their original projects. See the [arborium-theme crate README](https://github.com/bearcove/arborium/tree/main/crates/arborium-theme#built-in-themes) for full attribution and source links.

## Feature Flags

### Grammar Collections

| Feature | Description |
|---------|-------------|
| `mit-grammars` | All permissively licensed grammars (MIT, Apache-2.0, CC0, Unlicense) - **default** |
| `gpl-grammars` | GPL-licensed grammars (copyleft - may affect your project's license) |
| `all-grammars` | All grammars including GPL |

### Permissive Grammars (67)

These grammars use permissive licenses (MIT, Apache-2.0, CC0, Unlicense) and are included by default.

| Feature | Language | License | Source |
|---------|----------|---------|--------|
| `lang-asm` | Assembly | MIT | [tree-sitter-asm](https://github.com/RubixDev/tree-sitter-asm) |
| `lang-awk` | AWK | MIT | [tree-sitter-awk](https://github.com/Beaglefoot/tree-sitter-awk) |
| `lang-bash` | Bash | MIT | [tree-sitter-bash](https://github.com/tree-sitter/tree-sitter-bash) |
| `lang-batch` | Batch | MIT | [tree-sitter-batch](https://github.com/davidevofficial/tree-sitter-batch) |
| `lang-c` | C | MIT | [tree-sitter-c](https://github.com/tree-sitter/tree-sitter-c) |
| `lang-c-sharp` | C# | MIT | [tree-sitter-c-sharp](https://github.com/tree-sitter/tree-sitter-c-sharp) |
| `lang-caddy` | Caddyfile | MIT | [tree-sitter-caddy](https://github.com/Samonitari/tree-sitter-caddy) |
| `lang-capnp` | Cap'n Proto | MIT | [tree-sitter-capnp](https://github.com/tree-sitter-grammars/tree-sitter-capnp) |
| `lang-clojure` | Clojure | Unlicense | [tree-sitter-clojure](https://github.com/sogaiu/tree-sitter-clojure) |
| `lang-cpp` | C++ | MIT | [tree-sitter-cpp](https://github.com/tree-sitter/tree-sitter-cpp) |
| `lang-css` | CSS | MIT | [tree-sitter-css](https://github.com/tree-sitter/tree-sitter-css) |
| `lang-dart` | Dart | MIT | [tree-sitter-dart](https://github.com/UserNobody14/tree-sitter-dart) |
| `lang-devicetree` | Device Tree | MIT | [tree-sitter-devicetree](https://github.com/joelspadin/tree-sitter-devicetree) |
| `lang-diff` | Diff | MIT | [tree-sitter-diff](https://github.com/the-mikedavis/tree-sitter-diff) |
| `lang-dockerfile` | Dockerfile | MIT | [tree-sitter-dockerfile](https://github.com/camdencheek/tree-sitter-dockerfile) |
| `lang-elixir` | Elixir | Apache-2.0 | [tree-sitter-elixir](https://github.com/elixir-lang/tree-sitter-elixir) |
| `lang-elm` | Elm | MIT | [tree-sitter-elm](https://github.com/elm-tooling/tree-sitter-elm) |
| `lang-fsharp` | F# | MIT | [tree-sitter-fsharp](https://github.com/ionide/tree-sitter-fsharp) |
| `lang-gleam` | Gleam | Apache-2.0 | [tree-sitter-gleam](https://github.com/gleam-lang/tree-sitter-gleam) |
| `lang-glsl` | GLSL | MIT | [tree-sitter-glsl](https://github.com/tree-sitter-grammars/tree-sitter-glsl) |
| `lang-go` | Go | MIT | [tree-sitter-go](https://github.com/tree-sitter/tree-sitter-go) |
| `lang-haskell` | Haskell | MIT | [tree-sitter-haskell](https://github.com/tree-sitter/tree-sitter-haskell) |
| `lang-hcl` | HCL (Terraform) | Apache-2.0 | [tree-sitter-hcl](https://github.com/tree-sitter-grammars/tree-sitter-hcl) |
| `lang-hlsl` | HLSL | MIT | [tree-sitter-hlsl](https://github.com/tree-sitter-grammars/tree-sitter-hlsl) |
| `lang-html` | HTML | MIT | [tree-sitter-html](https://github.com/tree-sitter/tree-sitter-html) |
| `lang-idris` | Idris | MIT | [tree-sitter-idris](https://github.com/kayhide/tree-sitter-idris) |
| `lang-ini` | INI | Apache-2.0 | [tree-sitter-ini](https://github.com/justinmk/tree-sitter-ini) |
| `lang-java` | Java | MIT | [tree-sitter-java](https://github.com/tree-sitter/tree-sitter-java) |
| `lang-javascript` | JavaScript | MIT | [tree-sitter-javascript](https://github.com/tree-sitter/tree-sitter-javascript) |
| `lang-jq` | jq | MIT | [tree-sitter-jq](https://github.com/flurie/tree-sitter-jq) |
| `lang-json` | JSON | MIT | [tree-sitter-json](https://github.com/tree-sitter/tree-sitter-json) |
| `lang-lean` | Lean | MIT | [tree-sitter-lean](https://github.com/Julian/tree-sitter-lean) |
| `lang-lua` | Lua | MIT | [tree-sitter-lua](https://github.com/tree-sitter-grammars/tree-sitter-lua) |
| `lang-markdown` | Markdown | MIT | [tree-sitter-markdown](https://github.com/tree-sitter-grammars/tree-sitter-markdown) |
| `lang-meson` | Meson | MIT | [tree-sitter-meson](https://github.com/tree-sitter-grammars/tree-sitter-meson) |
| `lang-nix` | Nix | MIT | [tree-sitter-nix](https://github.com/nix-community/tree-sitter-nix) |
| `lang-objc` | Objective-C | MIT | [tree-sitter-objc](https://github.com/tree-sitter-grammars/tree-sitter-objc) |
| `lang-perl` | Perl | MIT | [tree-sitter-perl](https://github.com/tree-sitter-perl/tree-sitter-perl) |
| `lang-php` | PHP | MIT | [tree-sitter-php](https://github.com/tree-sitter/tree-sitter-php) |
| `lang-postscript` | PostScript | MIT | [tree-sitter-postscript](https://github.com/smoeding/tree-sitter-postscript) |
| `lang-powershell` | PowerShell | MIT | [tree-sitter-powershell](https://github.com/airbus-cert/tree-sitter-powershell) |
| `lang-prolog` | Prolog | MIT | [tree-sitter-prolog](https://codeberg.org/foxy/tree-sitter-prolog) |
| `lang-python` | Python | MIT | [tree-sitter-python](https://github.com/tree-sitter/tree-sitter-python) |
| `lang-r` | R | MIT | [tree-sitter-r](https://github.com/r-lib/tree-sitter-r) |
| `lang-rescript` | ReScript | MIT | [tree-sitter-rescript](https://github.com/rescript-lang/tree-sitter-rescript) |
| `lang-ron` | RON | MIT OR Apache-2.0 | [tree-sitter-ron](https://github.com/tree-sitter-grammars/tree-sitter-ron) |
| `lang-rust` | Rust | MIT | [tree-sitter-rust](https://codeberg.org/grammar-orchard/tree-sitter-rust-orchard) |
| `lang-scala` | Scala | MIT | [tree-sitter-scala](https://github.com/tree-sitter/tree-sitter-scala) |
| `lang-scss` | SCSS | MIT | [tree-sitter-scss](https://github.com/serenadeai/tree-sitter-scss) |
| `lang-sql` | SQL | MIT | [tree-sitter-sql](https://github.com/DerekStride/tree-sitter-sql) |
| `lang-starlark` | Starlark | MIT | [tree-sitter-starlark](https://github.com/tree-sitter-grammars/tree-sitter-starlark) |
| `lang-svelte` | Svelte | MIT | [tree-sitter-svelte](https://github.com/tree-sitter-grammars/tree-sitter-svelte) |
| `lang-thrift` | Thrift | MIT | [tree-sitter-thrift](https://github.com/tree-sitter-grammars/tree-sitter-thrift) |
| `lang-tlaplus` | TLA+ | MIT | [tree-sitter-tlaplus](https://github.com/tlaplus-community/tree-sitter-tlaplus) |
| `lang-toml` | TOML | MIT | [tree-sitter-toml](https://github.com/tree-sitter-grammars/tree-sitter-toml) |
| `lang-typescript` | TypeScript | MIT | [tree-sitter-typescript](https://github.com/tree-sitter/tree-sitter-typescript) |
| `lang-vb` | Visual Basic | MIT | [tree-sitter-vb](https://github.com/CodeAnt-AI/tree-sitter-vb-dotnet) |
| `lang-verilog` | Verilog | MIT | [tree-sitter-verilog](https://github.com/tree-sitter/tree-sitter-verilog) |
| `lang-vhdl` | VHDL | MIT | [tree-sitter-vhdl](https://github.com/alemuller/tree-sitter-vhdl) |
| `lang-vim` | Vimscript | MIT | [tree-sitter-vim](https://github.com/tree-sitter-grammars/tree-sitter-vim) |
| `lang-vue` | Vue | MIT | [tree-sitter-vue](https://github.com/tree-sitter-grammars/tree-sitter-vue) |
| `lang-wasm` | WebAssembly | Apache-2.0 | [tree-sitter-wasm](https://github.com/wasm-lsp/tree-sitter-wasm) |
| `lang-x86asm` | x86 Assembly | MIT | local |
| `lang-xml` | XML | MIT | [tree-sitter-xml](https://github.com/tree-sitter-grammars/tree-sitter-xml) |
| `lang-yaml` | YAML | MIT | [tree-sitter-yaml](https://github.com/tree-sitter-grammars/tree-sitter-yaml) |
| `lang-zig` | Zig | MIT | [tree-sitter-zig](https://github.com/tree-sitter-grammars/tree-sitter-zig) |
| `lang-zsh` | Zsh | MIT | [tree-sitter-zsh](https://github.com/georgeharker/tree-sitter-zsh) |

### GPL-Licensed Grammars (2)

These grammars are **not included by default** due to their copyleft license.
Enabling them may have implications for your project's licensing.

| Feature | Language | License | Source |
|---------|----------|---------|--------|
| `lang-jinja2` | Jinja2 | GPL-3.0 | [tree-sitter-jinja2](https://github.com/dbt-labs/tree-sitter-jinja2) |
| `lang-nginx` | nginx | GPL-3.0 | [tree-sitter-nginx](https://gitlab.com/joncoole/tree-sitter-nginx) |

## HTML Tag Reference

Arborium renders syntax highlighting using custom HTML elements. When highlighting code, it wraps spans of text with tags like `<a-k>`, `<a-f>`, etc. These tags are styled by the theme CSS you choose.
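
To make the wrapping concrete, here is a minimal sketch of how tagged spans could be turned into this kind of markup. The `Span` shape and the `renderSpans` and `escapeHtml` helpers are hypothetical and only illustrate the output format; arborium's own renderer may differ, and escaping the text is an assumption made here so the result stays valid HTML.

```typescript
// Hypothetical span shape: a short tag suffix (e.g. "k" for <a-k>) or null for plain text.
interface Span {
  tag: string | null;
  text: string;
}

// Escape the characters that would otherwise be interpreted as markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Wrap each tagged span in its custom element; untagged text is emitted as-is (escaped).
function renderSpans(spans: Span[]): string {
  return spans
    .map(({ tag, text }) => {
      const safe = escapeHtml(text);
      return tag ? `<a-${tag}>${safe}</a-${tag}>` : safe;
    })
    .join("");
}

const exampleHtml = renderSpans([
  { tag: "k", text: "fn" },
  { tag: null, text: " " },
  { tag: "f", text: "main" },
  { tag: "p", text: "()" },
]);
console.log(exampleHtml); // prints "<a-k>fn</a-k> <a-f>main</a-f><a-p>()</a-p>"
```

Because the element names are short and carry no attributes, the markup overhead per token stays small, and theme CSS can target the element names directly.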

### Tag Mappings

Each tag corresponds to a semantic code element. Here's the complete reference:

| Tag | Element Type | Description |
|-----|--------------|-------------|
| `<a-k>` | **Keyword** | Language keywords (if, else, while, class, fn, etc.) |
| `<a-f>` | **Function** | Function names and method calls |
| `<a-s>` | **String** | String literals and character literals |
| `<a-c>` | **Comment** | Comments (line and block) |
| `<a-t>` | **Type** | Type names and type annotations |
| `<a-v>` | **Variable** | Variable names and identifiers |
| `<a-co>` | **Constant** | Constants and boolean literals |
| `<a-n>` | **Number** | Numeric literals (integers and floats) |
| `<a-o>` | **Operator** | Operators (+, -, *, /, &&, etc.) |
| `<a-p>` | **Punctuation** | Delimiters and punctuation (parentheses, brackets, commas) |
| `<a-pr>` | **Property** | Object properties and struct fields |
| `<a-at>` | **Attribute** | Attributes and annotations (@, #[derive], etc.) |
| `<a-tg>` | **Tag** | HTML/XML tags |
| `<a-m>` | **Macro** | Macro names and invocations |
| `<a-l>` | **Label** | Labels and goto targets |
| `<a-ns>` | **Namespace** | Namespaces and modules |
| `<a-cr>` | **Constructor** | Constructor functions and type constructors |

### Markup Tags (Markdown, etc.)

| Tag | Element Type | Description |
|-----|--------------|-------------|
| `<a-tt>` | **Title** | Headings and titles |
| `<a-st>` | **Strong** | Bold text |
| `<a-em>` | **Emphasis** | Italic text |
| `<a-tu>` | **Link** | URLs and hyperlinks |
| `<a-tl>` | **Literal** | Code blocks and inline code |
| `<a-tx>` | **Strikethrough** | Strikethrough text |

### Diff Tags

| Tag | Element Type | Description |
|-----|--------------|-------------|
| `<a-da>` | **Diff Add** | Added lines in diffs |
| `<a-dd>` | **Diff Delete** | Deleted lines in diffs |

### Special Tags

| Tag | Element Type | Description |
|-----|--------------|-------------|
| `<a-eb>` | **Embedded** | Embedded language content |
| `<a-er>` | **Error** | Syntax errors |

### How It Works

Arborium uses tree-sitter grammars to parse code and identify semantic elements. Multiple capture names from tree-sitter queries (like `@keyword.function`, `@keyword.import`, `@conditional`) all map to the same theme slot. For example:

- `@keyword`, `@keyword.function`, `@include`, `@conditional` → all become `<a-k>` (keyword)
- `@function`, `@function.builtin`, `@method` → all become `<a-f>` (function)
- `@comment`, `@comment.documentation` → all become `<a-c>` (comment)

Adjacent spans with the same tag are automatically merged into a single element for efficiency.
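
The two steps described above, mapping capture names to tags and merging neighbours, can be sketched as follows. This is an illustration under stated assumptions rather than arborium's actual code: the table holds only the captures listed above, the fallback from a dotted capture to its parent (so `@keyword.import` resolves via `keyword`) is assumed, and the `TaggedSpan` shape with `start`/`end` offsets is hypothetical. Spans are assumed to be sorted and non-overlapping, and "adjacent" is taken to mean that one span ends exactly where the next begins.

```typescript
// Illustrative subset of the capture-to-tag table, taken from the examples above.
const CAPTURE_TO_TAG = new Map<string, string>([
  ["keyword", "k"], ["keyword.function", "k"], ["include", "k"], ["conditional", "k"],
  ["function", "f"], ["function.builtin", "f"], ["method", "f"],
  ["comment", "c"], ["comment.documentation", "c"],
]);

// Resolve a capture such as "@keyword.import" to a tag suffix.
// Assumption: unknown dotted captures fall back to their parent name.
function tagForCapture(capture: string): string | null {
  let name = capture.replace(/^@/, "");
  while (name.length > 0) {
    const tag = CAPTURE_TO_TAG.get(name);
    if (tag !== undefined) return tag;
    const dot = name.lastIndexOf(".");
    if (dot < 0) break;
    name = name.slice(0, dot); // strip the last ".segment" and retry
  }
  return null;
}

// Hypothetical span shape: half-open [start, end) offsets plus a tag suffix.
interface TaggedSpan {
  start: number;
  end: number;
  tag: string;
}

// Merge spans that touch and share a tag; the input must be sorted by start offset.
function mergeAdjacent(spans: TaggedSpan[]): TaggedSpan[] {
  const merged: TaggedSpan[] = [];
  for (const span of spans) {
    const last = merged[merged.length - 1];
    if (last && last.tag === span.tag && last.end === span.start) {
      last.end = span.end; // extend the previous span instead of emitting a new one
    } else {
      merged.push({ ...span }); // copy so the caller's objects are not mutated
    }
  }
  return merged;
}

console.log(tagForCapture("@keyword.import")); // prints "k"
console.log(mergeAdjacent([
  { start: 0, end: 2, tag: "c" },
  { start: 2, end: 5, tag: "c" },
  { start: 6, end: 8, tag: "k" },
]).length); // prints 2
```

A `Map` is used instead of a plain object so that capture names like `constructor` cannot collide with inherited object properties.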

### Styling Example

To create a custom theme, target these elements in your CSS:

```css
/* Keywords in blue */
a-k { color: #569cd6; }

/* Functions in yellow */
a-f { color: #dcdcaa; }

/* Strings in orange */
a-s { color: #ce9178; }

/* Comments in green */
a-c { color: #6a9955; font-style: italic; }

/* Types in cyan */
a-t { color: #4ec9b0; }
```

See the [included themes](https://github.com/bearcove/arborium/tree/main/packages/arborium/src/themes) for more examples.
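
If you prefer to keep a custom theme as data, a small generator can emit rules in the same shape as the CSS above. This is a sketch only: `Palette` and `paletteToCss` are hypothetical names, not part of arborium, and the keys are the tag suffixes from the tag reference (for example `k` for `<a-k>`).

```typescript
// Hypothetical palette shape: tag suffix -> style settings.
type Palette = Record<string, { color: string; italic?: boolean }>;

// Emit one CSS rule per tag, e.g. "a-k { color: #569cd6; }".
function paletteToCss(palette: Palette): string {
  return Object.entries(palette)
    .map(([tag, style]) => {
      const declarations = [`color: ${style.color};`];
      if (style.italic) declarations.push("font-style: italic;");
      return `a-${tag} { ${declarations.join(" ")} }`;
    })
    .join("\n");
}

const customCss = paletteToCss({
  k: { color: "#569cd6" },
  c: { color: "#6a9955", italic: true },
});
console.log(customCss);
// prints:
// a-k { color: #569cd6; }
// a-c { color: #6a9955; font-style: italic; }
```

The resulting string can be written to a `.css` file or injected through a `<style>` element.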

## Sponsors

Thanks to all individual sponsors:

<p>
<a href="https://github.com/sponsors/fasterthanlime">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="./static/sponsors-v3/github-dark.svg">
<img src="./static/sponsors-v3/github-light.svg" height="40" alt="GitHub Sponsors">
</picture>
</a>
<a href="https://patreon.com/fasterthanlime">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="./static/sponsors-v3/patreon-dark.svg">
<img src="./static/sponsors-v3/patreon-light.svg" height="40" alt="Patreon">
</picture>
</a>
</p>

...along with corporate sponsors:

<p>
<a href="https://zed.dev">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="./static/sponsors-v3/zed-dark.svg">
<img src="./static/sponsors-v3/zed-light.svg" height="40" alt="Zed">
</picture>
</a>
<a href="https://depot.dev?utm_source=arborium">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="./static/sponsors-v3/depot-dark.svg">
<img src="./static/sponsors-v3/depot-light.svg" height="40" alt="Depot">
</picture>
</a>
</p>

## License

This project is dual-licensed under [MIT](LICENSE-MIT) OR [Apache-2.0](LICENSE-APACHE).

The bundled grammar sources retain their original licenses - see [LICENSES.md](LICENSES.md) for details.

## WASM Support

Arborium supports building for `wasm32-unknown-unknown`. This requires compiling C code (tree-sitter core and grammar parsers) to WebAssembly.

### macOS

On macOS, the built-in Apple clang does **not** support the `wasm32-unknown-unknown` target. You need to install LLVM via Homebrew:

```bash
brew install llvm
```

Then ensure the Homebrew LLVM is in your PATH when building:

```bash
export PATH="$(brew --prefix llvm)/bin:$PATH"
cargo build --target wasm32-unknown-unknown
```

## FAQ

### Build fails with "No available targets are compatible with triple wasm32-unknown-unknown"

**Error message:**
```
error: unable to create target: 'No available targets are compatible with triple "wasm32-unknown-unknown"'
```

**Cause:** You're using Apple's built-in clang, which doesn't include the WebAssembly backend.

**Solution:** Install LLVM via Homebrew and use it instead:

```bash
brew install llvm
export PATH="$(brew --prefix llvm)/bin:$PATH"
cargo build --target wasm32-unknown-unknown
```

You may want to add the PATH export to your shell profile (`.zshrc`, `.bashrc`, etc.) or use a tool like [direnv](https://direnv.net/) to set it per-project.

## Development

This project uses `cargo xtask` for most development and release tasks.

For detailed architecture, workflows, publishing order, and layout, see `DEVELOP.md`.

For a quick overview of available commands, run:

```bash
cargo xtask help
```