arborium 2.4.0

Tree-sitter syntax highlighting with HTML rendering and WASM support

arborium

Batteries-included tree-sitter grammar collection with HTML rendering and WASM support.


Features

  • 77 language grammars included out of the box
  • 76 permissively licensed (MIT/Apache-2.0/CC0/Unlicense) grammars enabled by default
  • WASM support with custom allocator fix
  • Feature flags for fine-grained control over included languages

Usage

[dependencies]
arborium = "2.4.0"

By default, all permissively-licensed grammars are included. To select specific languages:

[dependencies]
arborium = { version = "2.4.0", default-features = false, features = ["lang-rust", "lang-javascript"] }

Browser Usage

Arborium can be used in the browser in three ways:

Option 1: Drop-in Script (Easiest)

Add a single script tag and arborium auto-highlights all code blocks:

<script src="https://cdn.jsdelivr.net/npm/@arborium/arborium/dist/arborium.iife.js"></script>

That's it! Arborium will:

  • Auto-detect languages from class="language-*" or data-lang="*" attributes
  • Load grammar WASM plugins on-demand from jsDelivr CDN
  • Inject the default theme CSS
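The auto-detection step can be pictured with a small sketch. This is not arborium's actual code: `detectLanguage` is a hypothetical helper, and the rule that `data-lang` takes precedence over a `language-*` class is an assumption made for illustration.

```javascript
// Hypothetical helper, not part of the arborium API.
// Resolves a language name from an element's class string and data-lang value.
function detectLanguage(className, dataLang) {
  // Assumption: an explicit data-lang attribute wins over a class name.
  if (typeof dataLang === 'string' && dataLang.trim() !== '') {
    return dataLang.trim().toLowerCase();
  }
  // Otherwise use the first class of the form "language-<name>".
  for (const cls of String(className || '').split(/\s+/)) {
    const match = /^language-(.+)$/.exec(cls);
    if (match) {
      return match[1].toLowerCase();
    }
  }
  return null; // no language hint found
}
```

In a browser, such a helper could be called as `detectLanguage(el.className, el.dataset.lang)` for each element matched by the configured selector.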

Configuration via data attributes:

<script
  src="https://cdn.jsdelivr.net/npm/@arborium/arborium/dist/arborium.iife.js"
  data-theme="mocha"
  data-selector="pre code"
  data-manual
></script>

Configuration via JavaScript:

<script>
  window.Arborium = {
    theme: 'tokyo-night',
    selector: 'pre code, .highlight',
    cdn: 'jsdelivr',  // or 'unpkg' or a custom URL
    version: '1', // or 'latest'
  };
</script>
<script src="https://cdn.jsdelivr.net/npm/@arborium/arborium/dist/arborium.iife.js"></script>

Manual highlighting:

<script src="..." data-manual></script>
<script>
  // Highlight all code blocks
  arborium.highlightAll();

  // Highlight a specific element
  arborium.highlightElement(document.querySelector('code'), 'rust');
</script>

Option 2: ESM Module (Programmatic)

For bundlers (Vite, webpack, etc.) or ESM-native environments:

npm install @arborium/arborium

import { loadGrammar, highlight } from '@arborium/arborium';

const code = 'fn main() { println!("Hello!"); }';

// Load a grammar (fetched from CDN on first use)
const grammar = await loadGrammar('rust');

// Highlight code with the loaded grammar
const html = grammar.highlight(code);

// Or use the convenience function
const htmlFromHelper = await highlight('rust', code);

Option 3: Compile Rust to WASM (Maximum Control)

For complete control and offline-first apps, compile the Rust crate directly to WASM:

[dependencies]
arborium = { version = "2.4.0", default-features = false, features = ["lang-rust", "lang-javascript"] }

# Requires LLVM with WASM support (see WASM Support below)
cargo build --target wasm32-unknown-unknown

This embeds selected grammars directly in your WASM binary - no CDN required at runtime.

Themes

Arborium includes 32 built-in themes from popular color schemes.

Dark themes: catppuccin-mocha, catppuccin-macchiato, catppuccin-frappe, dracula, tokyo-night, nord, one-dark, github-dark, gruvbox-dark, monokai, kanagawa-dragon, rose-pine-moon, ayu-dark, solarized-dark, ef-melissa-dark, melange-dark, cobalt2, zenburn, desert256, rustdoc-dark, rustdoc-ayu

Light themes: catppuccin-latte, github-light, gruvbox-light, ayu-light, solarized-light, melange-light, light-owl, lucius-light, dayfox, alabaster, rustdoc-light

Import theme CSS:

<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@arborium/arborium/dist/themes/tokyo-night.css">

Or let the IIFE bundle auto-inject it via the data-theme attribute.
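When loading theme CSS programmatically, the stylesheet URL can be built from the theme name. The sketch below assumes every theme listed above is published at the same path pattern as the tokyo-night link (only that one URL appears in this README); `themeCssUrl` is a hypothetical helper, not part of the arborium API.

```javascript
// Theme names copied from the lists above.
const DARK_THEMES = [
  'catppuccin-mocha', 'catppuccin-macchiato', 'catppuccin-frappe', 'dracula',
  'tokyo-night', 'nord', 'one-dark', 'github-dark', 'gruvbox-dark', 'monokai',
  'kanagawa-dragon', 'rose-pine-moon', 'ayu-dark', 'solarized-dark',
  'ef-melissa-dark', 'melange-dark', 'cobalt2', 'zenburn', 'desert256',
  'rustdoc-dark', 'rustdoc-ayu',
];
const LIGHT_THEMES = [
  'catppuccin-latte', 'github-light', 'gruvbox-light', 'ayu-light',
  'solarized-light', 'melange-light', 'light-owl', 'lucius-light', 'dayfox',
  'alabaster', 'rustdoc-light',
];

// Hypothetical helper: returns the jsDelivr URL for a theme stylesheet.
// Assumes each theme lives at dist/themes/<name>.css, like tokyo-night.
function themeCssUrl(theme) {
  if (!DARK_THEMES.includes(theme) && !LIGHT_THEMES.includes(theme)) {
    throw new Error(`Unknown theme: ${theme}`);
  }
  return `https://cdn.jsdelivr.net/npm/@arborium/arborium/dist/themes/${theme}.css`;
}
```

Rejecting unknown names up front avoids a silent 404 on the stylesheet request.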

Theme Attribution: All themes are adaptations of color schemes from their original projects. See the arborium-theme crate README for full attribution and source links.

Feature Flags

Grammar Collections

| Feature | Description |
| mit-grammars | All permissively licensed grammars (MIT, Apache-2.0, CC0) - default |
| gpl-grammars | GPL-licensed grammars (copyleft - may affect your project's license) |
| all-grammars | All grammars including GPL |

Permissive Grammars (76)

These grammars use permissive licenses (MIT, Apache-2.0, CC0, Unlicense) and are included by default.

| Feature | Language | License | Source |

| lang-asciidoc | Asciidoc | Apache-2.0 | tree-sitter-asciidoc |

| lang-asm | Assembly | MIT | tree-sitter-asm |

| lang-awk | AWK | MIT | tree-sitter-awk |

| lang-bash | Bash | MIT | tree-sitter-bash |

| lang-batch | Batch | MIT | tree-sitter-batch |

| lang-c | C | MIT | tree-sitter-c |

| lang-c-sharp | C# | MIT | tree-sitter-c-sharp |

| lang-caddy | Caddyfile | MIT | tree-sitter-caddy |

| lang-capnp | Cap'n Proto | MIT | tree-sitter-capnp |

| lang-clojure | Clojure | Unlicense | tree-sitter-clojure |

| lang-cpp | C++ | MIT | tree-sitter-cpp |

| lang-css | CSS | MIT | tree-sitter-css |

| lang-dart | Dart | MIT | tree-sitter-dart |

| lang-devicetree | Device Tree | MIT | tree-sitter-devicetree |

| lang-diff | Diff | MIT | tree-sitter-diff |

| lang-dockerfile | Dockerfile | MIT | tree-sitter-dockerfile |

| lang-elixir | Elixir | Apache-2.0 | tree-sitter-elixir |

| lang-elm | Elm | MIT | tree-sitter-elm |

| lang-fsharp | F# | MIT | tree-sitter-fsharp |

| lang-gleam | Gleam | Apache-2.0 | tree-sitter-gleam |

| lang-glsl | GLSL | MIT | tree-sitter-glsl |

| lang-go | Go | MIT | tree-sitter-go |

| lang-groovy | Groovy | MIT | tree-sitter-groovy |

| lang-haskell | Haskell | MIT | tree-sitter-haskell |

| lang-hcl | HCL | Apache-2.0 | tree-sitter-hcl |

| lang-hlsl | HLSL | MIT | tree-sitter-hlsl |

| lang-html | HTML | MIT | tree-sitter-html |

| lang-idris | Idris | MIT | tree-sitter-idris |

| lang-ini | INI | Apache-2.0 | tree-sitter-ini |

| lang-java | Java | MIT | tree-sitter-java |

| lang-javascript | JavaScript | MIT | tree-sitter-javascript |

| lang-jinja2 | Jinja2 | Apache-2.0 | tree-sitter-jinja2 |

| lang-jq | jq | MIT | tree-sitter-jq |

| lang-json | JSON | MIT | tree-sitter-json |

| lang-kotlin | Kotlin | MIT | tree-sitter-kotlin |

| lang-lean | Lean | MIT | tree-sitter-lean |

| lang-lua | Lua | MIT | tree-sitter-lua |

| lang-markdown | Markdown | MIT | tree-sitter-markdown |

| lang-meson | Meson | MIT | tree-sitter-meson |

| lang-nix | Nix | MIT | tree-sitter-nix |

| lang-objc | Objective-C | MIT | tree-sitter-objc |

| lang-ocaml | OCaml | MIT | tree-sitter-ocaml |

| lang-perl | Perl | MIT | tree-sitter-perl |

| lang-php | PHP | MIT | tree-sitter-php |

| lang-postscript | PostScript | MIT | tree-sitter-postscript |

| lang-powershell | PowerShell | MIT | tree-sitter-powershell |

| lang-prolog | Prolog | MIT | tree-sitter-prolog |

| lang-python | Python | MIT | tree-sitter-python |

| lang-r | R | MIT | tree-sitter-r |

| lang-rescript | ReScript | MIT | tree-sitter-rescript |

| lang-ron | RON | MIT OR Apache-2.0 | tree-sitter-ron |

| lang-rust | Rust | MIT | tree-sitter-rust |

| lang-scala | Scala | MIT | tree-sitter-scala |

| lang-scheme | Scheme | MIT | tree-sitter-scheme |

| lang-scss | SCSS | MIT | tree-sitter-scss |

| lang-sql | SQL | MIT | tree-sitter-sql |

| lang-starlark | Starlark | MIT | tree-sitter-starlark |

| lang-svelte | Svelte | MIT | tree-sitter-svelte |

| lang-swift | Swift | MIT | tree-sitter-swift |

| lang-thrift | Thrift | MIT | tree-sitter-thrift |

| lang-tlaplus | TLA+ | MIT | tree-sitter-tlaplus |

| lang-toml | TOML | MIT | tree-sitter-toml |

| lang-tsx | TSX | MIT | tree-sitter-tsx |

| lang-typescript | TypeScript | MIT | tree-sitter-typescript |

| lang-vb | Visual Basic | MIT | tree-sitter-vb |

| lang-verilog | Verilog | MIT | tree-sitter-verilog |

| lang-vhdl | VHDL | MIT | tree-sitter-vhdl |

| lang-vim | Vimscript | MIT | tree-sitter-vim |

| lang-vue | Vue | MIT | tree-sitter-vue |

| lang-wit | WIT | Apache-2.0 WITH LLVM-exception | tree-sitter-wit |

| lang-x86asm | x86 Assembly | MIT | local |

| lang-xml | XML | MIT | tree-sitter-xml |

| lang-yaml | YAML | MIT | tree-sitter-yaml |

| lang-yuri | Yuri | Apache-2.0 | tree-sitter-yuri |

| lang-zig | Zig | MIT | tree-sitter-zig |

| lang-zsh | Zsh | MIT | tree-sitter-zsh |

GPL-Licensed Grammars (1)

These grammars are not included by default due to their copyleft license. Enabling them may have implications for your project's licensing.

| Feature | Language | License | Source |

| lang-nginx | nginx | GPL-3.0 | tree-sitter-nginx |

HTML Tag Reference

Arborium renders syntax highlighting using custom HTML elements. When highlighting code, it wraps spans of text with tags like <a-k>, <a-f>, etc. These tags are styled by the theme CSS you choose.
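To make the wrapping concrete, here is a minimal sketch of how highlighted spans could be turned into this kind of markup. It is not arborium's renderer: `renderSpans`, `escapeHtml`, and the `{ start, end, tag }` span shape are assumptions chosen for illustration.

```javascript
// Escape characters that are special in HTML text content.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical renderer. `spans` is a list of { start, end, tag } objects,
// sorted by start and non-overlapping, with offsets into `source`.
// `tag` is the short suffix, e.g. 'k' for <a-k>.
function renderSpans(source, spans) {
  let html = '';
  let pos = 0;
  for (const { start, end, tag } of spans) {
    html += escapeHtml(source.slice(pos, start)); // unstyled text before the span
    html += `<a-${tag}>${escapeHtml(source.slice(start, end))}</a-${tag}>`;
    pos = end;
  }
  return html + escapeHtml(source.slice(pos)); // trailing unstyled text
}
```

For the input `'fn main'` with a keyword span over `fn` and a function span over `main`, this produces `<a-k>fn</a-k> <a-f>main</a-f>`.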

Tag Mappings

Each tag corresponds to a semantic code element. Here's the complete reference:

| Tag | Element Type | Description |
| <a-k> | Keyword | Language keywords (if, else, while, class, fn, etc.) |
| <a-f> | Function | Function names and method calls |
| <a-s> | String | String literals and character literals |
| <a-c> | Comment | Comments (line and block) |
| <a-t> | Type | Type names and type annotations |
| <a-v> | Variable | Variable names and identifiers |
| <a-co> | Constant | Constants and boolean literals |
| <a-n> | Number | Numeric literals (integers and floats) |
| <a-o> | Operator | Operators (+, -, *, /, &&, etc.) |
| <a-p> | Punctuation | Delimiters and punctuation (parentheses, brackets, commas) |
| <a-pr> | Property | Object properties and struct fields |
| <a-at> | Attribute | Attributes and annotations (@, #[derive], etc.) |
| <a-tg> | Tag | HTML/XML tags |
| <a-m> | Macro | Macro names and invocations |
| <a-l> | Label | Labels and goto targets |
| <a-ns> | Namespace | Namespaces and modules |
| <a-cr> | Constructor | Constructor functions and type constructors |

Markup Tags (Markdown, etc.)

| Tag | Element Type | Description |
| <a-tt> | Title | Headings and titles |
| <a-st> | Strong | Bold text |
| <a-em> | Emphasis | Italic text |
| <a-tu> | Link | URLs and hyperlinks |
| <a-tl> | Literal | Code blocks and inline code |
| <a-tx> | Strikethrough | Strikethrough text |

Diff Tags

| Tag | Element Type | Description |
| <a-da> | Diff Add | Added lines in diffs |
| <a-dd> | Diff Delete | Deleted lines in diffs |

Special Tags

| Tag | Element Type | Description |
| <a-eb> | Embedded | Embedded language content |
| <a-er> | Error | Syntax errors |

How It Works

Arborium uses tree-sitter grammars to parse code and identify semantic elements. Multiple capture names from tree-sitter queries (like @keyword.function, @keyword.import, @conditional) all map to the same theme slot. For example:

  • @keyword, @keyword.function, @include, @conditional → all become <a-k> (keyword)
  • @function, @function.builtin, @method → all become <a-f> (function)
  • @comment, @comment.documentation → all become <a-c> (comment)
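
The many-to-one mapping above can be sketched as a lookup table. The table below contains only the captures named in this section, and the fallback rule (dropping trailing `.suffix` segments until a known name is found) is an assumption for illustration, not arborium's documented behavior; `tagForCapture` is a hypothetical name.

```javascript
// Built only from the examples listed above; the real table is larger.
const CAPTURE_TO_TAG = new Map([
  ['keyword', 'k'],
  ['include', 'k'],
  ['conditional', 'k'],
  ['function', 'f'],
  ['method', 'f'],
  ['comment', 'c'],
]);

// Hypothetical lookup: try the full capture name, then strip trailing
// ".suffix" segments (assumed fallback) until a known name is found.
function tagForCapture(capture) {
  let name = capture.startsWith('@') ? capture.slice(1) : capture;
  while (name !== '') {
    if (CAPTURE_TO_TAG.has(name)) {
      return `a-${CAPTURE_TO_TAG.get(name)}`;
    }
    const dot = name.lastIndexOf('.');
    if (dot === -1) {
      break; // nothing left to strip
    }
    name = name.slice(0, dot);
  }
  return null; // capture has no theme slot in this sketch
}
```

With this table, `@keyword.function` and `@conditional` both resolve to `a-k`, while an unlisted capture resolves to `null`.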

Adjacent spans with the same tag are automatically merged into a single element for efficiency.
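
A minimal sketch of that merging step follows. It assumes spans are `{ start, end, tag }` objects sorted by start, and treats two spans as adjacent when one ends exactly where the next begins; both the span shape and that definition of adjacency are assumptions, and `mergeAdjacentSpans` is a hypothetical name.

```javascript
// Hypothetical span shape: { start, end, tag } with offsets into the source.
// Assumes spans are sorted by start and do not overlap.
function mergeAdjacentSpans(spans) {
  const merged = [];
  for (const span of spans) {
    const last = merged[merged.length - 1];
    if (last && last.tag === span.tag && last.end === span.start) {
      last.end = span.end; // extend the previous span instead of adding a new one
    } else {
      merged.push({ ...span }); // copy so the input is not mutated
    }
  }
  return merged;
}
```

Merging before rendering means fewer elements in the output HTML, which keeps the markup smaller without changing how it looks.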

Styling Example

To create a custom theme, target these elements in your CSS:

/* Keywords in blue */
a-k { color: #569cd6; }

/* Functions in yellow */
a-f { color: #dcdcaa; }

/* Strings in orange */
a-s { color: #ce9178; }

/* Comments in green */
a-c { color: #6a9955; font-style: italic; }

/* Types in cyan */
a-t { color: #4ec9b0; }

See the included themes for more examples.

License

This project is dual-licensed under MIT OR Apache-2.0.

The bundled grammar sources retain their original licenses - see LICENSES.md for details.

WASM Support

Arborium supports building for wasm32-unknown-unknown. This requires compiling C code (tree-sitter core and grammar parsers) to WebAssembly.

macOS

On macOS, the built-in Apple clang does not support the wasm32-unknown-unknown target. You need to install LLVM via Homebrew:

brew install llvm

Then ensure the Homebrew LLVM is in your PATH when building:

export PATH="$(brew --prefix llvm)/bin:$PATH"
cargo build --target wasm32-unknown-unknown

Development

This project uses cargo xtask for most development and release tasks.

For detailed architecture, workflows, publishing order, and layout, see DEVELOP.md.

For a quick overview of available commands, run:

cargo xtask help