hypothalamus 0.5.0

An optimizing Brainfuck AOT compiler with an LLVM IR backend
# Hypothalamus

[![Crates.io Version](https://img.shields.io/crates/v/hypothalamus.svg)](https://crates.io/crates/hypothalamus)
[![Crates.io Downloads](https://img.shields.io/crates/d/hypothalamus.svg)](https://crates.io/crates/hypothalamus)
[![CI](https://github.com/Aspenini/hypothalamus/actions/workflows/ci.yml/badge.svg)](https://github.com/Aspenini/hypothalamus/actions/workflows/ci.yml)

Hypothalamus is a Brainfuck ahead-of-time compiler with an LLVM IR backend. It
parses Brainfuck source, lowers it into an optimized Brainfuck-specific IR,
emits LLVM IR, and can use `clang` to lower that IR to an executable, object
file, or assembly for any target your LLVM toolchain supports. It can also
execute the generated LLVM IR directly through `lli`.

The language behavior follows Daniel B. Cristofani's
[Brainfuck reference](https://brainfuck.org/brainfuck.html), except that this
project uses `.bf` as its conventional source extension:

- Only the eight Brainfuck commands are meaningful; all other bytes are ignored.
- The tape defaults to 30,000 zeroed byte cells.
- Cell arithmetic wraps modulo 256.
- `[` and `]` must match and nest correctly.
- `.` writes one byte through `putchar`.
- `,` reads one byte through `getchar`; EOF leaves the current cell unchanged.
- Moving the pointer outside the configured tape is undefined behavior, matching
  the reference's "unpredictable" boundary behavior.
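
The rules above can be sketched as a small reference interpreter. This is an illustration only, not Hypothalamus's own code: `run_bf` and its signature are hypothetical, input comes from a byte slice instead of `getchar`, and output is collected into a `Vec<u8>` instead of going through `putchar`.

```rust
/// Runs `source` against `input` and returns the bytes written by `.`.
/// Hypothetical helper for illustration; unmatched brackets yield an error.
/// Pointer moves are not range-checked beyond Rust's own indexing checks.
fn run_bf(source: &str, input: &[u8], tape_size: usize) -> Result<Vec<u8>, String> {
    // Only the eight commands are meaningful; every other byte is ignored.
    let code: Vec<u8> = source
        .bytes()
        .filter(|b| b"+-<>[].,".contains(b))
        .collect();

    // Match brackets up front so `[` and `]` can jump directly.
    let mut jump = vec![0usize; code.len()];
    let mut stack = Vec::new();
    for (i, &op) in code.iter().enumerate() {
        match op {
            b'[' => stack.push(i),
            b']' => {
                let open = stack
                    .pop()
                    .ok_or_else(|| format!("unmatched ']' at command {}", i))?;
                jump[open] = i;
                jump[i] = open;
            }
            _ => {}
        }
    }
    if let Some(open) = stack.pop() {
        return Err(format!("unmatched '[' at command {}", open));
    }

    let mut tape = vec![0u8; tape_size]; // zeroed byte cells
    let mut ptr = 0usize;
    let mut pc = 0usize;
    let mut input = input.iter();
    let mut output = Vec::new();
    while pc < code.len() {
        match code[pc] {
            b'+' => tape[ptr] = tape[ptr].wrapping_add(1), // wraps modulo 256
            b'-' => tape[ptr] = tape[ptr].wrapping_sub(1),
            b'>' => ptr += 1,
            b'<' => ptr -= 1,
            b'.' => output.push(tape[ptr]),
            b',' => {
                // At EOF the current cell is left unchanged.
                if let Some(&byte) = input.next() {
                    tape[ptr] = byte;
                }
            }
            b'[' => {
                if tape[ptr] == 0 {
                    pc = jump[pc];
                }
            }
            b']' => {
                if tape[ptr] != 0 {
                    pc = jump[pc];
                }
            }
            _ => unreachable!(),
        }
        pc += 1;
    }
    Ok(output)
}
```

For example, `run_bf(",+.", b"A", 30000)` reads `A`, increments the cell, and writes it back out, while `run_bf("[", b"", 30000)` returns an error because the bracket never closes.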

## Optimizations

Hypothalamus performs Brainfuck-specific optimizations before LLVM emission:

- folds adjacent cell additions and pointer moves;
- removes no-op arithmetic and movement;
- turns clear loops such as `[-]` and `[+]` into direct stores;
- combines arithmetic at fixed pointer offsets;
- removes dead arithmetic before clears;
- turns scan loops such as `[>]`, `[<]`, `[>>]`, and `[<<]` into explicit scan operations;
- turns transfer and multiply-transfer loops such as `[->+<]` and
  `[->+++>++<<]` into straight-line multiply/add operations.
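
As a rough illustration of the loop rewrites listed above, the sketch below classifies the body of a simple, non-nested loop (the text between `[` and `]`). It is an assumption about how such an analysis could look, not the project's actual IR: `LoopKind`, `classify_loop`, and `apply_transfer` are hypothetical names. Accumulating per-offset deltas also folds adjacent additions and pointer moves as a side effect.

```rust
use std::collections::BTreeMap;

/// What a simple loop body reduces to in this sketch (hypothetical type).
#[derive(Debug, PartialEq)]
enum LoopKind {
    /// `[-]` or `[+]`: store zero into the current cell.
    Clear,
    /// `[>]`, `[<<]`, ...: step by `stride` until a zero cell is found.
    Scan { stride: isize },
    /// `[->+<]`, `[->+++>++<<]`, ...: for each `(offset, factor)`, add
    /// `factor * current_cell` to the cell at `offset`, then clear the cell.
    Transfer(Vec<(isize, u8)>),
}

/// Classifies a loop body given without its surrounding brackets.
/// Returns `None` when the body is not one of the simple shapes above.
fn classify_loop(body: &str) -> Option<LoopKind> {
    let mut offset: isize = 0;
    let mut deltas: BTreeMap<isize, u8> = BTreeMap::new();
    for byte in body.bytes() {
        match byte {
            b'>' => offset += 1,
            b'<' => offset -= 1,
            b'+' => {
                let d = deltas.entry(offset).or_insert(0);
                *d = d.wrapping_add(1);
            }
            b'-' => {
                let d = deltas.entry(offset).or_insert(0);
                *d = d.wrapping_sub(1);
            }
            // Nested loops and I/O are out of scope for this sketch.
            b'[' | b']' | b'.' | b',' => return None,
            _ => {} // non-command bytes are ignored
        }
    }
    deltas.retain(|_, d| *d != 0); // drop arithmetic that cancels out

    if deltas.is_empty() {
        // Pure pointer movement: a scan if the pointer actually moves.
        return if offset != 0 {
            Some(LoopKind::Scan { stride: offset })
        } else {
            None
        };
    }
    if offset != 0 {
        return None; // the pointer must return to where the loop started
    }
    match deltas.remove(&0) {
        Some(1) | Some(255) if deltas.is_empty() => Some(LoopKind::Clear),
        // The counter must decrease by exactly one per iteration.
        Some(255) => Some(LoopKind::Transfer(deltas.into_iter().collect())),
        _ => None,
    }
}

/// Straight-line equivalent of a transfer loop, with wrapping arithmetic.
fn apply_transfer(tape: &mut [u8], ptr: usize, targets: &[(isize, u8)]) {
    let count = tape[ptr];
    for &(offset, factor) in targets {
        let index = (ptr as isize + offset) as usize;
        tape[index] = tape[index].wrapping_add(count.wrapping_mul(factor));
    }
    tape[ptr] = 0;
}
```

With this sketch, `classify_loop("->+++>++<<")` yields a `Transfer` with factor 3 at offset 1 and factor 2 at offset 2, so the loop can be replaced by two multiply/add operations and a store of zero.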

Native builds pass `-O2` to `clang` by default. To choose another LLVM
optimization level, use `--opt-level` or one of the shorthand flags `-O0`,
`-O1`, `-O2`, `-O3`, `-Os`, or `-Oz`. When debugging, pass `--bounds-check` to
make generated programs trap on out-of-range tape access instead of relying on
Brainfuck's usual undefined boundary behavior.
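
Conceptually, a bounds-checked build validates the tape position before touching a cell. The sketch below models only that check. `checked_index` is a hypothetical helper, and where exactly Hypothalamus places its checks in the generated IR is not described here.

```rust
/// Returns the tape index for `position`, or `None` when it falls outside a
/// tape of `tape_size` cells. A bounds-checked program would trap on `None`.
fn checked_index(position: isize, tape_size: usize) -> Option<usize> {
    if position >= 0 && (position as usize) < tape_size {
        Some(position as usize)
    } else {
        None
    }
}
```

With the default tape of 30,000 cells, positions `0` through `29999` are valid, while `-1` and `30000` would trap.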

## Freestanding Payloads

Hypothalamus can emit freestanding Brainfuck payloads for kernels, boot demos,
ROM targets, or other no-libc environments. Freestanding output uses a stable
ABI so a separate runtime can call pure Brainfuck code. Target presets may add
LLVM triples, CPU flags, default output kinds, or complete image builders.

In freestanding mode:

- the generated entry point is `void @bf_main()` by default;
- `.` calls `void @bf_putchar(i8)`;
- `,` calls `i32 @bf_getchar()`, where `-1` means EOF and leaves the cell unchanged;
- no hosted `main`, `putchar`, or `getchar` symbols are emitted.
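
A runtime written in Rust could provide the two hooks roughly as follows. This is a host-side sketch under stated assumptions: the `OUTPUT` and `INPUT` buffers and the `store_input` helper are hypothetical stand-ins for a real device and for the generated code, and a real freestanding runtime would be `no_std` and would export the hooks unmangled (`#[no_mangle]`, spelled `#[unsafe(no_mangle)]` in the 2024 edition).

```rust
use std::sync::Mutex;

// Hypothetical in-memory buffers standing in for a device such as a serial port.
static OUTPUT: Mutex<Vec<u8>> = Mutex::new(Vec::new());
static INPUT: Mutex<Vec<u8>> = Mutex::new(Vec::new());

/// Output hook with the shape of `void @bf_putchar(i8)`.
/// A real runtime must also export this symbol without name mangling.
pub extern "C" fn bf_putchar(byte: u8) {
    OUTPUT.lock().unwrap().push(byte);
}

/// Input hook with the shape of `i32 @bf_getchar()`:
/// returns the next byte as 0..=255, or -1 once input is exhausted.
pub extern "C" fn bf_getchar() -> i32 {
    let mut pending = INPUT.lock().unwrap();
    if pending.is_empty() {
        -1
    } else {
        i32::from(pending.remove(0))
    }
}

/// Models how `,` treats the hook's return value: -1 (EOF) leaves the
/// cell unchanged, any other value is stored as a byte.
fn store_input(cell: &mut u8, value: i32) {
    if value != -1 {
        *cell = value as u8;
    }
}
```

Returning `i32` rather than a byte is what lets the hook signal EOF with `-1` without colliding with any valid byte value.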

Compile a Brainfuck payload to a freestanding object:

```sh
cargo run -- --target x86_64-none examples/hello.bf -o kernel_bf.o
```

Custom runtime symbol names are available when your boot/runtime layer uses a
different ABI:

```sh
cargo run -- --target x86_64-none --entry kernel_bf_main \
  --putchar-symbol serial_write_byte \
  --getchar-symbol serial_read_byte \
  --emit llvm-ir examples/hello.bf -o kernel_bf.ll
```

Raw LLVM triples still work. Use `--freestanding` when a raw triple should use
the freestanding ABI:

```sh
cargo run -- --freestanding --target x86_64-unknown-none examples/hello.bf -o kernel_bf.o
```

Target runtime notes live in `examples/runtimes/`.

## Target Presets

Run:

```sh
cargo run -- --list-targets
```

Built-in presets:

| Target | Runtime | Default | Notes |
| --- | --- | --- | --- |
| `native` | hosted | executable | host LLVM default target |
| `x86_64-none` | freestanding | object | caller-provided x86_64 runtime |
| `i386-none` | freestanding | object | tiny 32-bit x86 boot/runtime layers |
| `nds-arm9` | freestanding | object | Nintendo DS ARM9 payloads |
| `gba` | freestanding | image | complete target image with built-in runtime |

Target presets only choose the LLVM triple, runtime ABI, default emit kind,
extra LLVM-driver flags, and optional image-builder integration. Runtime and
toolchain notes for platform targets live in `examples/runtimes/`.
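
One way to picture a preset is as a small record of defaults keyed by name. The sketch below mirrors only the runtime and default-output columns of the table above. The `Preset` type, its fields, and `find_preset` are hypothetical, and LLVM triples and driver flags are left out because they are not listed here.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Runtime {
    Hosted,
    Freestanding,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Emit {
    Executable,
    Object,
    Image,
}

/// Hypothetical record of the defaults a preset selects.
#[derive(Debug)]
struct Preset {
    name: &'static str,
    runtime: Runtime,
    default_emit: Emit,
}

// Values copied from the built-in preset table.
const PRESETS: [Preset; 5] = [
    Preset { name: "native", runtime: Runtime::Hosted, default_emit: Emit::Executable },
    Preset { name: "x86_64-none", runtime: Runtime::Freestanding, default_emit: Emit::Object },
    Preset { name: "i386-none", runtime: Runtime::Freestanding, default_emit: Emit::Object },
    Preset { name: "nds-arm9", runtime: Runtime::Freestanding, default_emit: Emit::Object },
    Preset { name: "gba", runtime: Runtime::Freestanding, default_emit: Emit::Image },
];

/// Looks up a preset by name. Anything that is not a preset name
/// (for example a raw LLVM triple) yields `None`.
fn find_preset(name: &str) -> Option<&'static Preset> {
    PRESETS.iter().find(|preset| preset.name == name)
}
```

In this model, `find_preset("gba")` returns the freestanding preset that defaults to a complete image, and a raw triple such as `x86_64-unknown-linux-gnu` is not found and would be handled separately.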

## Build

```sh
cargo build --release
```

The compiler itself has no third-party runtime Rust dependencies. Native code
emission requires a `clang` executable or compatible LLVM driver.

Because Hypothalamus is a plain Rust binary, the compiler can also be cross-built through
Cargo for any Rust target available in your toolchain:

```sh
cargo build --release --target x86_64-unknown-linux-gnu
```

## Usage

Compile a Brainfuck program to a native executable:

```sh
cargo run -- examples/hello.bf -o hello
./hello
```

Emit LLVM IR:

```sh
cargo run -- --emit llvm-ir examples/hello.bf -o hello.ll
```

Emit an object file for another LLVM target:

```sh
cargo run -- --emit obj --target x86_64-unknown-linux-gnu examples/hello.bf -o hello.o
```

Emit assembly:

```sh
cargo run -- --emit asm examples/hello.bf -o hello.s
```

Run the program through LLVM's JIT-capable `lli` tool:

```sh
cargo run -- --emit jit examples/hello.bf
```

Compile the project's own Brainfuck-in-Brainfuck interpreter fixture:

```sh
cargo run -- -O3 examples/interpreter.bf -o bfi
printf ',+.!A' | ./bfi
```

Emit a freestanding object for a tiny boot/runtime layer to link:

```sh
cargo run -- --target i386-none examples/hello.bf -o hello_bf.o
```

Emit a complete target image when the preset supports one:

```sh
cargo run -- --target gba examples/hello.bf -o hello.gba
```

BFOS, the Brainfuck-native cartridge OS demo built with Hypothalamus, lives at
[Aspenini/BFOS](https://github.com/Aspenini/BFOS). Hypothalamus stays generic:
it just compiles the generated Brainfuck source BFOS gives it.

If your LLVM tools are not on `PATH`, pass `--cc <path>` for `clang` or
`--lli <path>` for `lli`.

Cross-compiling a full executable requires the target linker, C runtime, and
sysroot that your selected `clang --target=<triple>` needs. Emitting LLVM IR,
assembly, or object files works with fewer target runtime assumptions. Complete
target images can require target-specific tools.

## CLI

```text
hypothalamus [OPTIONS] <INPUT>

Options:
  -o, --output <PATH>       Output path. Use '-' with --emit llvm-ir for stdout
      --emit <KIND>         exe, obj, asm, llvm-ir, jit, or image [default: target-specific]
      --jit, --run          Execute the generated LLVM IR with lli
      --target <TARGET>     Target preset or raw LLVM triple [default: native]
      --list-targets        Print built-in target presets
      --tape-size <CELLS>   Tape cell count [default: 30000]
      --bounds-check        Trap on out-of-range tape access
      --freestanding        Emit a callable Brainfuck payload for freestanding runtimes
      --entry <SYMBOL>      Freestanding entry function [default: bf_main]
      --putchar-symbol <S>  Freestanding output hook: void (i8) [default: bf_putchar]
      --getchar-symbol <S>  Freestanding input hook: i32 () [default: bf_getchar]
      --opt-level <LEVEL>   clang optimization level: 0, 1, 2, 3, s, or z [default: 2]
      --cc <PATH>           clang-compatible LLVM driver [default: clang]
      --lli <PATH>          LLVM lli executable for --emit jit [default: lli]
      --keep-ll             Keep generated LLVM IR beside the output
  -h, --help                Print help
      --version             Print version
```