
NeuronBox

Run local AI work (training, fine-tuning, inference, benchmarks) without reinventing glue code every week.

Website: neuronbox.dev

You describe the project once in neuron.yaml: where weights live (Hugging Face–style id, a folder on disk, or a single file), which Python stack you need, GPU expectations, and the script to run. NeuronBox builds or reuses a hashed virtualenv under ~/.neuronbox/store/envs/, wires NEURONBOX_* environment variables into your process, and talks to a Unix-socket daemon (neurond) so neuron stats and neuron dashboard show live sessions, host/GPU probes, and (when your code reports them) tokens per second. For hard isolation, use neuron run --oci with runtime.mode: oci in the manifest—Docker is used only on that path.

neuron with no subcommand opens a short getting-started screen; neuron help lists every command.

Scope: NeuronBox is a working local stack: CLI, neurond, newline-delimited JSON on a Unix socket, terminal dashboard, and a global model store. It is not a hosted multi-tenant cloud. Version 0.1.0 is early but usable; APIs and CLI flags may still evolve.

License: GNU Affero General Public License v3 (open source). SPDX identifier in manifests: AGPL-3.0-only. If you cannot meet AGPL obligations (e.g. closed-source SaaS), you need a commercial license — see LICENSING.md and contact neuronbox.contact@proton.me.


Why NeuronBox (at a glance)

  1. Declare the job, not the plumbing: gpu.min_vram, runtime.packages, model.source, and entrypoint live in neuron.yaml instead of ad-hoc CUDA matrices and one-off volume maps.

  2. Model store, not a 50 GB image layer: weights are first-class artifacts in ~/.neuronbox/store, shared across projects, with paths exposed via NEURONBOX_MODEL_DIR (and related vars). neuron pull fetches Hugging Face–style org/model trees into that store (see neuron pull --help for aliases and local paths).

  3. Hot-swap for iteration: neuron swap updates daemon state and ~/.neuronbox/swap_signal.json; neuron serve runs a long-lived Python worker that can react without cold-starting your whole stack for every weight change.

  4. One view of the machine: neuron host inspect and neuron gpu list summarize Metal, ROCm, CUDA, and optional NVML so laptops and Linux servers share one mental model.
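The hot-swap loop in point 3 can be sketched from the worker's side. This is illustrative only: the `"model"` field name below is an assumption, and the authoritative schema for the signal file is specs/swap-signal.schema.json.

```python
import json
import os
import time
from pathlib import Path

SWAP_SIGNAL = Path.home() / ".neuronbox" / "swap_signal.json"

def read_swap_signal(path=SWAP_SIGNAL):
    """Return the requested model from the swap signal file, or None.

    The "model" key is a guess for illustration; consult
    specs/swap-signal.schema.json for the actual field names.
    """
    try:
        with open(path) as f:
            return json.load(f).get("model")
    except (FileNotFoundError, json.JSONDecodeError):
        return None

def watch(poll_seconds=1.0, path=SWAP_SIGNAL):
    """Poll the signal file's mtime and yield each newly requested model."""
    last_mtime = None
    while True:
        try:
            mtime = os.stat(path).st_mtime
        except FileNotFoundError:
            mtime = None
        if mtime is not None and mtime != last_mtime:
            last_mtime = mtime
            model = read_swap_signal(path)
            if model:
                yield model  # the caller reloads weights here
        time.sleep(poll_seconds)
```

A real worker run by neuron serve would swap weights inside the loop instead of yielding; the point is that only the weights reload, not the whole process.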


Tutorial: end-to-end

1. Build the binaries

From your clone:

cd neuronbox
cargo build -p neuronbox-cli -p neuronbox-runtime

You need target/debug/neuron and target/debug/neurond side by side (or set NEUROND_PATH to the daemon binary). Add target/debug to PATH if you like.

./target/debug/neuron          # welcome screen
./target/debug/neuron help     # full command list

2. Create a project

mkdir ~/my-llm-project && cd ~/my-llm-project
/path/to/neuronbox/target/debug/neuron init

Edit neuron.yaml: model, entrypoint, runtime.packages, gpu.min_vram, runtime.mode (host vs oci), etc. Schema: specs/neuron.yaml.schema.json.

The template sets entrypoint: train.py — create that script (or change entrypoint to your own file) before neuron run.
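A hypothetical manifest might look like the sketch below. Field names follow the prose in this README, but the exact shape (including whether `source: hub` is a valid value) should be checked against specs/neuron.yaml.schema.json.

```yaml
# neuron.yaml -- illustrative sketch; validate against specs/neuron.yaml.schema.json
model:
  source: hub            # or "local" with model.name set to a path on disk
  name: org/model
entrypoint: train.py
runtime:
  mode: host             # "oci" to opt into Docker isolation via neuron run --oci
  python: "3.11"
  packages:
    - torch
    - transformers
gpu:
  min_vram: 12gb         # soft check; neuron run warns, it does not block
env:
  HF_HOME: ~/.cache/huggingface
```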

3. Get weights

  • Hub-style id (one slash, no colon):

    /path/to/neuronbox/target/debug/neuron pull org/model
    

    Artifacts land under ~/.neuronbox/store by default.

  • Local tree or file (.gguf, .safetensors, …): set model.source: local and model.name to the path; no pull step.

  • Container images are not pulled by neuron pull. Use docker pull yourself, or neuron oci prepare when building a runc bundle (docs/OCI_AND_DOCKER.md).

Optional: HF_TOKEN in the environment for private Hub repos.

4. Run the entrypoint

/path/to/neuronbox/target/debug/neuron run

From the directory that contains neuron.yaml, or point at another manifest:

neuron run -f path/to/neuron.yaml

neuron run resolves the model (pull if needed for Hub ids), ensures the hashed venv, sets NEURONBOX_MODEL_DIR, NEURONBOX_SESSION_NAME, NEURONBOX_SESSION_VRAM_MB, and related vars, then spawns your entrypoint script. It registers the child with neurond and unregisters when the process exits.
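A minimal entrypoint can pick up that wiring like this. Only the variable names come from this README; the surrounding code is an illustrative sketch, not NeuronBox's own loader.

```python
import os
from pathlib import Path

def resolve_weights() -> Path:
    """Locate weights via the NEURONBOX_* variables that neuron run exports.

    NEURONBOX_MODEL_PATH is set when the manifest names a single weights
    file; otherwise NEURONBOX_MODEL_DIR points at the resolved model tree.
    """
    single_file = os.environ.get("NEURONBOX_MODEL_PATH")
    if single_file:
        return Path(single_file)
    return Path(os.environ["NEURONBOX_MODEL_DIR"])

def main() -> None:
    weights = resolve_weights()
    session = os.environ.get("NEURONBOX_SESSION_NAME", "unknown")
    print(f"[{session}] loading weights from {weights}")
    # ... real training / inference code would go here ...
```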

Shortcut: neuron run org/model with a single HF-style argument only pulls and prints where the model lives—you still need a neuron.yaml and entrypoint to execute code.

neuron run tries to start neurond if the socket is down (best effort). If stats / dashboard cannot connect, run neuron daemon in another terminal.

5. Watch the machine

/path/to/neuronbox/target/debug/neuron dashboard   # TUI: sessions + charts + host/GPU
/path/to/neuronbox/target/debug/neuron stats         # plain-text snapshot

Default socket: ~/.neuronbox/neuron.sock, overridable with NEURONBOX_SOCKET.


How a run works

  • Virtualenv: the path under store/envs/ is a hash of runtime.python, runtime.cuda, and runtime.packages, so the same manifest shape maps to the same env. An optional requirements.lock in that env dir plus neuron lock gives pinned installs.

  • Installer: prefers uv pip install when uv is on PATH, otherwise pip. Empty packages and no CUDA/ROCm extra index means no pip invocation at all.

  • Model path: NEURONBOX_MODEL_DIR points at the resolved tree (store or local); NEURONBOX_MODEL_PATH is set when the manifest points at a single weights file.

  • Soft VRAM check: if gpu.min_vram is set and the host reports GPU memory, neuron run can warn when estimates exceed what looks available (non-blocking).

  • Child environment: inherited PYTHONPATH is removed unless you set PYTHONPATH under env: in neuron.yaml, which keeps IDE-injected paths from breaking the venv's numpy/torch.
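The hashed-env idea can be approximated as follows. The real implementation lives in the Rust runtime; the exact hash inputs, encoding, and truncation here are assumptions for illustration.

```python
import hashlib
import json
from typing import Optional

def env_dir_name(python: str, cuda: Optional[str], packages: list) -> str:
    """Derive a stable directory name from the manifest's runtime block.

    The idea: identical (python, cuda, packages) triples map to the same
    env under ~/.neuronbox/store/envs/, so projects share virtualenvs.
    """
    key = json.dumps(
        {"python": python, "cuda": cuda, "packages": sorted(packages)},
        sort_keys=True,
    )
    return hashlib.sha256(key.encode()).hexdigest()[:16]
```

Sorting the package list makes the hash order-independent; whether NeuronBox itself normalizes order this way is not stated in this README.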

Daemon, sessions, and throughput

neurond keeps an in-memory registry of sessions (name, PID, estimated VRAM, optional tokens_per_sec). neuron run sends register_session after spawn and unregister_session after exit.

To refresh throughput for the dashboard and neuron stats, send another register_session line with the same PID and an updated tokens_per_sec (see specs/daemon-sessions.md). neuron run exports NEURONBOX_SESSION_NAME and NEURONBOX_SESSION_VRAM_MB so your script can match the initial registration.
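As a sketch, a script could re-register itself with an updated throughput figure roughly like this. The message field names ("cmd", "name", "pid", "vram_mb", "tokens_per_sec") are a best guess from this README; the authoritative protocol is in specs/daemon-sessions.md and runtime/src/protocol.rs.

```python
import json
import os
import socket

def session_update_line(tps: float) -> str:
    """Build one newline-delimited JSON register_session message.

    Field names here are assumptions; check specs/daemon-sessions.md
    for the real protocol before relying on them.
    """
    return json.dumps({
        "cmd": "register_session",
        "name": os.environ.get("NEURONBOX_SESSION_NAME", "manual"),
        "pid": os.getpid(),
        "vram_mb": int(os.environ.get("NEURONBOX_SESSION_VRAM_MB", "0")),
        "tokens_per_sec": tps,
    }) + "\n"

def report_tokens_per_sec(tps: float) -> None:
    """Push the update to neurond over its Unix socket."""
    sock_path = os.environ.get(
        "NEURONBOX_SOCKET", os.path.expanduser("~/.neuronbox/neuron.sock")
    )
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        s.sendall(session_update_line(tps).encode())
```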

Protocol types: runtime/src/protocol.rs.


Dashboard and demo mode

  • neuron dashboard — real Stats from the daemon, HostProbe for OS/arch/backends/GPUs, ~10 Hz UI refresh for session table and throughput history (history is client-side, not stored in the daemon).

  • neuron dashboard --demo (Unix) — starts synthetic sessions (helper sleep PIDs), animated tok/s, a mock swap model, and optional synthetic VRAM styling. Quit with q / Esc so the demo task can unregister. For cosmetic gauges on real hardware without fake sessions, you can set NEURONBOX_DEMO_SYNTHETIC_METRICS=1 (see docs/CLI_UX.md).


Use cases

  • Training, LoRA, eval, batch inference: one manifest ties code + weights + Python + GPU; same commands on laptop or server.
  • Large models and shared disks: central store; projects reference paths instead of duplicating trees.
  • Reproducible envs: hashed env dirs + neuron lock / requirements.lock.
  • Visibility: daemon + dashboard / stats for sessions and reported tok/s.
  • Optional isolation: neuron run --oci when runtime.mode: oci and you want Docker mounts + NVIDIA toolkit without hand-written docker run.
  • Mixed hardware: neuron host inspect / neuron gpu list for support and CI notes.

NeuronBox vs Docker

  • Primary unit: Docker works in images and containers; NeuronBox works in neuron.yaml plus host paths.
  • Strength: Docker brings portability, isolation, and orchestration; NeuronBox brings fast iteration on metal (hashed venvs, model store, one command to run the manifest).
  • ML weights: with Docker you map volumes yourself; NeuronBox pulls and stores them natively with NEURONBOX_* wiring.
  • When to prefer Docker alone: production parity, K8s.
  • When NeuronBox helps: daily local work; Docker only when you opt into OCI.

CLI reference

  • neuron: welcome screen
  • neuron help: full help
  • neuron init: create neuron.yaml in the current directory
  • neuron pull <id>: fetch ML artifacts (HF-style org/model, a configured alias, or a local path) into the store
  • neuron run: run the entrypoint from neuron.yaml (host by default)
  • neuron run -f FILE: use another manifest path
  • neuron run --gpu 0: set CUDA_VISIBLE_DEVICES for the child
  • neuron run --vram 12gb: CLI VRAM hint for the session record
  • neuron run --oci: force the Docker OCI path (requires runtime.mode: oci alignment; Linux + NVIDIA for GPU containers)
  • neuron run org/model: pull-only shortcut when a single HF-like arg is given
  • neuron serve [-f FILE]: long-lived worker + swap signal (same venv resolution as run)
  • neuron swap MODEL: update the daemon's active model and swap_signal.json
  • neuron stats: text snapshot of sessions, GPU lines, and swap state
  • neuron dashboard: full-screen TUI
  • neuron dashboard --demo: TUI + built-in mock load (Unix)
  • neuron host inspect: JSON HostSnapshot
  • neuron gpu list: detected GPUs
  • neuron model list: store index
  • neuron lock [-f FILE]: write requirements.lock into the hashed env (uv pip compile)
  • neuron daemon: run neurond in the foreground
  • neuron oci prepare: build a runc bundle (Docker on host for rootfs export)
  • neuron oci runc: run runc against a prepared bundle

Breaking changes (older workflows)

neuron pull is not for Docker image tags. neuron ps / stop / rm and neuron run as a generic docker run proxy are removed—use docker directly. NeuronBox keeps Docker under neuron oci … and neuron run --oci only.


Environment variables

  • NEURONBOX_SOCKET: Unix socket path for neurond (default ~/.neuronbox/neuron.sock)
  • NEUROND_PATH: path to neurond if it is not beside neuron
  • HF_TOKEN: authenticated Hub downloads for neuron pull
  • NEURONBOX_DEMO_SYNTHETIC_METRICS: 1 / true / yes enables extra synthetic styling in the dashboard (optional)
  • NEURONBOX_DISABLE_VRAM_WATCH: disables the daemon's VRAM watch path (e.g. for demo spawn)
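The "1 / true / yes" convention can be handled with a tiny helper; this is a sketch of the convention, not NeuronBox's actual flag parser.

```python
import os

def env_flag(name: str, environ=None) -> bool:
    """True when the variable is set to one of the truthy markers 1/true/yes."""
    value = (environ if environ is not None else os.environ).get(name, "")
    return value.strip().lower() in {"1", "true", "yes"}
```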

Set per-project secrets and flags under env: in neuron.yaml (applied to run / serve children).


Prerequisites and build

  • Rust (workspace; see rust-toolchain.toml if present)
  • Python 3 on PATH (version should match runtime.python in your manifest when possible)
  • uv (optional, recommended for faster pip installs)
  • GPU tooling (optional): NVIDIA, AMD, or Apple Silicon; see neuron host inspect

cargo build --workspace

Linux + NVIDIA (richer reporting when linked):

cargo build -p neuronbox-cli --features nvml
cargo build -p neuronbox-runtime --features nvml

Outputs: target/debug/neuron, target/debug/neurond (or release/).

cargo install --path cli

installs neuron; install or copy neurond accordingly, or rely on NEUROND_PATH.


References

  • docs/CLI_UX.md: welcome screen, theme, dashboard behavior
  • docs/OCI_AND_DOCKER.md: when Docker runs
  • specs/neuron.yaml.schema.json: manifest schema
  • specs/swap-signal.schema.json: swap signal file
  • specs/daemon-sessions.md: socket protocol, sessions, tok/s updates
  • docs/MULTI_GPU.md: multi-GPU / DDP
  • docs/GPU_VRAM.md: VRAM, NVML, NEURONBOX_DISABLE_VRAM_WATCH
  • docs/SECURITY.md: socket trust, limits, model trust
  • specs/examples/: example YAML snippets

Repository layout

  • cli/ — neuron binary; cli/scripts/serve_worker.py (used by neuron serve)
  • runtime/ — shared library + neurond
  • specs/ — JSON Schema, protocol docs, YAML examples
  • sdk/ — optional Python client for the daemon socket (sdk/neuronbox/client.py); pip install -e sdk/ from the repo root if you want it on your PYTHONPATH

Contributing

Small changes welcome. Before opening a PR:

cargo fmt --all
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace

By contributing, you agree your contributions are licensed under the same terms as the project (AGPL v3 for the open-source distribution; see LICENSING.md). For security-sensitive issues, see docs/SECURITY.md.