Gonidium

Gonidium is a DSL for symbolic expression pipelines used in deep-learning style math (especially scalar / elementwise kernels).

Why a DSL?

Deep-learning frameworks often need a small, predictable language to represent and optimise math kernels. A typical workflow is:

Parse compact surface syntax into a typed IR.
Run algebraic simplification / fusion (e-graph).
Interpret for validation or emit plain Python kernels for integration.

Gonidium focuses on that workflow and intentionally keeps the language small.

Design Principles (Current)

Domain-focused: elementwise math + common DL functions (exp, log, sigmoid, relu, floor, ceil, ...).
Closed-world output: after lowering/optimisation, the result is always expressible using the supported operators/builtins.
Typed IR first: a TypedDag is the semantic contract; surface syntax is just a frontend.
Explicit annotations + implicit promotion: parameters carry explicit dtype annotations; expression types are inferred via promotion rules.
Diagnostics are part of the API: stable short codes + configurable severities.

Non-goals (by design):

General CAS (e.g. symbolic integration), rational exactness, or arbitrary user-defined functions.
Modules/package system.

Quick Start

Build and run the CLI:

cargo build
cargo run -- repl

Run a one-off expression (interpreting when inputs are provided):

cargo run -- run -e "|x: f32| x + 1" --inputs "1"

If you omit --inputs, run prints the inferred signature and a simplified expression instead of evaluating.

Python bindings are also available from the same crate via maturin:

uvx maturin develop
uv run python -c "import gonidium; print(gonidium.simplify_expr('|x: f32|\\nexp(x)'))"

The Python package exposes a small public facade rather than asking callers to import the native extension directly:

gonidium.__version__
gonidium.GonidiumError
gonidium.simplify_expr(source, optimize=False) -> simplified single-output expression
gonidium.diff(source, variable, optimize=False) -> simplified symbolic derivative
gonidium.emit_python(source, function_name='_kernel', optimize=False) -> full Python kernel source
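
The three facade functions can be called together as in the sketch below. It uses only the signatures listed above. The summarize helper and its api parameter are hypothetical (they are not part of Gonidium); the parameter exists only so the helper can be tried against a stand-in object when the native extension has not been built.

```python
def summarize(source, variable, api=None):
    """Collect the simplified form, a derivative, and kernel source for one program.

    `api` is a hypothetical injection point: any object offering simplify_expr,
    diff and emit_python with the signatures listed in this README.
    """
    if api is None:
        # Requires the bindings to be installed, e.g. via `uvx maturin develop`.
        import gonidium as api
    return {
        "simplified": api.simplify_expr(source),
        "derivative": api.diff(source, variable),
        "kernel": api.emit_python(source, function_name="_kernel"),
    }
```

With the bindings installed, summarize("|x: f32|\nexp(x)", "x") should return the three results for the same program as the quick-start one-liner.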

For Rust, the root crate exposes a small stable facade plus the main compile pipeline:

root facade: diff, simplify_expr, emit_python
pipeline: parse_dsl, lower, optimize, interp_eval
diagnostics: Diag, DiagCode, DiagConfig, Severity
advanced differentiation internals: gonidium::experimental::*
backend / config / repl helpers live under their modules, e.g. gonidium::backend, gonidium::config, gonidium::repl

Language At a Glance

|a: f32, b: f64|
q1 = a + 1
q2 = b + 1
q1 * q2

|x: f32|
if x > 0.0 then x else 0.0

|x: f32, w: f32, b: f32|
sigmoid(x * w + b)
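
For readers who want something to validate against, the three programs above compute roughly what the plain-Python functions below compute. This is a hand-written reference sketch, not the output of emit_python; the function names are made up for illustration, and Python floats are f64, so results can differ from f32 evaluation in the last few bits.

```python
import math


def product_of_increments(a, b):
    # Mirrors: q1 = a + 1; q2 = b + 1; q1 * q2
    q1 = a + 1
    q2 = b + 1
    return q1 * q2


def relu(x):
    # Mirrors: if x > 0.0 then x else 0.0
    return x if x > 0.0 else 0.0


def dense_sigmoid(x, w, b):
    # Mirrors: sigmoid(x * w + b), written in a numerically stable form
    z = x * w + b
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```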

Key syntax notes:

expr @ dtype is postfix cast sugar, equivalent to cast<dtype>(expr).
// is integer floor division (rounds toward negative infinity); % follows floor-division remainder semantics, so a non-zero remainder takes the sign of the divisor.
Comparisons do not chain: a < b < c is rejected.
Line comments start with #.
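
The difference between floor division and truncating division only shows up with negative operands. The sketch below contrasts the two conventions in Python, whose own // and % on integers are floor-based; it assumes Gonidium's integer semantics match that convention, as the note above states. The helper names are illustrative only.

```python
def floor_divmod(a, b):
    """Quotient rounds toward negative infinity; remainder has the divisor's sign."""
    q = a // b
    return q, a - b * q


def trunc_divmod(a, b):
    """Quotient rounds toward zero (C-style); remainder has the dividend's sign."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q, a - b * q


for a, b in [(7, 2), (-7, 2), (7, -2)]:
    print(a, b, "floor:", floor_divmod(a, b), "trunc:", trunc_divmod(a, b))
```

In both conventions a == b * q + r holds; they differ only in which way q is rounded.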

Full grammar/spec: docs/grammar.md.

Type System Snapshot

Supported dtypes: bool, u8/u16/u32/u64, i8/i16/i32/i64, f16/bf16/f32/f64, c64/c128.
Default literal typing (when no @dtype):
- int: smallest signed int that fits
- float/complex: smallest IEEE type that represents the value exactly (promotes to wider if needed)
Promotion inserts explicit Cast nodes in the typed DAG; each promotion emits an info diagnostic (configurable).
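
The default literal rules can be approximated in a few lines of Python. This is a sketch of the stated rules, not Gonidium's lowering code: the function names are hypothetical, the out-of-range behaviour (raising ValueError) is an assumption, and "represents the value exactly" is modelled as a round trip through the IEEE binary16 and binary32 encodings with f64 as the fallback.

```python
import struct

_SIGNED = [("i8", 8), ("i16", 16), ("i32", 32), ("i64", 64)]


def default_int_dtype(value):
    """Smallest signed integer dtype whose range contains `value`."""
    for name, bits in _SIGNED:
        if -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
            return name
    raise ValueError(f"{value} does not fit in i64")  # assumed behaviour


def default_float_dtype(value):
    """Smallest of f16/f32 that round-trips `value` exactly, else f64."""
    for name, fmt in (("f16", "<e"), ("f32", "<f")):
        try:
            roundtrip = struct.unpack(fmt, struct.pack(fmt, value))[0]
        except OverflowError:
            continue  # magnitude too large for this width
        if roundtrip == value:
            return name
    return "f64"
```

Under these assumptions 0.5 fits f16, while 0.1 has no exact binary representation at any width and falls through to f64.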

Details: docs/type-system.md.

Diagnostics

Diagnostics have stable short codes (e.g. L301) and configurable severities.

Spec: docs/diagnostics.md
Config file: gonidium.toml (project root)
Config doc: docs/config.md

REPL

Start:

cargo run -- repl

REPL v1 accepts three kinds of single-line input:

Input declaration: x: f32
Assignment: t1 = x + 1
Expression: x + 1
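
One way to tell the three kinds of input apart is sketched below. It is an illustration only and does not reflect how the REPL is implemented: the identifier pattern, the classify function, and the choice to raise ValueError on an unknown dtype are all assumptions. The dtype names are the supported dtypes listed in the type system section.

```python
import re

DTYPES = {
    "bool", "u8", "u16", "u32", "u64", "i8", "i16", "i32", "i64",
    "f16", "bf16", "f32", "f64", "c64", "c128",
}

_DECL = re.compile(r"^([A-Za-z_]\w*)\s*:\s*(\w+)$")
# `=(?!=)` avoids mistaking a possible `==` operator for an assignment.
_ASSIGN = re.compile(r"^([A-Za-z_]\w*)\s*=(?!=)\s*(.+)$")


def classify(line):
    """Return a tuple tagging the line as declaration, assignment or expression."""
    text = line.strip()
    m = _DECL.match(text)
    if m:
        name, dtype = m.groups()
        if dtype not in DTYPES:
            raise ValueError(f"unknown dtype: {dtype}")
        return ("declaration", name, dtype)
    m = _ASSIGN.match(text)
    if m:
        return ("assignment", m.group(1), m.group(2))
    return ("expression", text)
```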

Output policy:

If it can be interpreted: prints value@dtype
Otherwise: prints expr @ dtype (optionally simplified)
Prints diagnostics produced by this line (promotions, precision loss, etc.)

Architecture Overview

Pipeline (conceptual layers):

source
  -> parse (chumsky)
  -> AST (untyped)
  -> lower (type inference + literal rules + checks)
  -> TypedDag
  -> const_fold (typed)
  -> strip_types (TypedDag -> RecExpr<MathLang> + TypeMap)
  -> egg runner (algebraic rewrites + fusion)
  -> extractor (FusionCost)
  -> restore_types (bottom-up)
  -> TypedDag (optimised)
  -> backend: interpret / codegen

TypedDag layer: typing, checking, const-folding, explicit-vs-implicit cast marking.
Opt (e-graph) layer: type-erased algebraic optimisation and fusion selection.
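
To make the const_fold and rewrite stages concrete, here is a toy version over tuple-based expression trees. It is a deliberately simplified sketch: the tree encoding and function names are invented for illustration, types are ignored, and a single greedy bottom-up pass stands in for the e-graph, which instead keeps many equivalent forms and picks one with a cost function at extraction time.

```python
import operator

_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def _is_const(e):
    return isinstance(e, (int, float))


def const_fold(expr):
    """Evaluate constant subtrees bottom-up. Leaves are numbers or variable names."""
    if not isinstance(expr, tuple):
        return expr
    op, lhs, rhs = expr
    lhs, rhs = const_fold(lhs), const_fold(rhs)
    if _is_const(lhs) and _is_const(rhs):
        return _OPS[op](lhs, rhs)
    return (op, lhs, rhs)


def rewrite_mul_one(expr):
    """Apply the identity x * 1 -> x (and 1 * x -> x) bottom-up."""
    if not isinstance(expr, tuple):
        return expr
    op, lhs, rhs = expr
    lhs, rhs = rewrite_mul_one(lhs), rewrite_mul_one(rhs)
    if op == "*" and _is_const(rhs) and rhs == 1:
        return lhs
    if op == "*" and _is_const(lhs) and lhs == 1:
        return rhs
    return (op, lhs, rhs)


def toy_optimize(expr):
    return rewrite_mul_one(const_fold(expr))
```

For example, toy_optimize(("+", ("*", ("-", 3, 2), "x"), ("*", 2, 4))) folds 3 - 2 to 1 and 2 * 4 to 8, then drops the multiplication by 1.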

Embedding (Kernel Composition)

If you embed Gonidium as a kernel IR in another framework:

Parse each source with parse_dsl to obtain a FuncDef.
Compose graphs with compose / compose_with_diags.
Optionally normalise parameter names with rename_params.
Run lower, then optimize, then hand the result to your own Backend implementation, the default PythonBackend, or interp_eval.

compose_with_diags emits C201 (compose-symbol-reuse) when a symbol name appears on both sides and is treated as the same variable.
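
Before composing, a host framework may want to know which names would be unified. The sketch below works purely on lists of parameter names and does not call Gonidium; shared_symbols and rename_apart are hypothetical helpers that illustrate the name-level idea behind C201 and rename_params, not their actual signatures or behaviour.

```python
def shared_symbols(left_params, right_params):
    """Names present on both sides, i.e. candidates for a C201 diagnostic."""
    right = set(right_params)
    return [name for name in left_params if name in right]


def rename_apart(params, taken, suffix="_rhs"):
    """Map each name in `params` to a name that does not clash with `taken`."""
    used = set(taken) | set(params)
    mapping = {}
    for name in params:
        if name in taken:
            fresh = name + suffix
            while fresh in used:  # keep extending until the name is unused
                fresh += "_"
            used.add(fresh)
            mapping[name] = fresh
        else:
            mapping[name] = name
    return mapping
```

Renaming apart is only needed when the two sides use the same name for different quantities; when the reuse is intentional, the shared name can be left alone.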

Project Layout

docs/       language + diagnostic + config docs
src/parse/  lexer + chumsky parser
src/ir/     TypedDag + dtypes + lowering
src/opt/    e-graph language + rewrite rules + extraction + roundtrip
src/backend/ interpreter + codegen backends
tests/      parser/lower/opt/backend integration tests

Status / Roadmap

Implemented (v0.0.1):

Parser + grammar-driven precedence
Lowering: literal typing, promotion + explicit Cast insertion, range/precision checks
Optimisation: const-fold + e-graph rewrites + fusion cost extraction
Backends: interpreter, Python codegen
REPL + configurable diagnostics

Planned directions (non-binding):

More builtins and rewrite coverage (focus: DL kernels)
Better backend docs and stability guarantees for embedding
Improved tooling (format/check, richer trace/visualisation)
