atomr-infer-cli

The atomr-infer serve binary. Boots an actor system, applies every [[deployment]] in your project file, mounts the gateway.

Quick start

cargo run -p atomr-infer-cli --features all-remote -- \
    serve --config examples/remote_only_demo/demo.toml

…and curl http://127.0.0.1:8080/v1/chat/completions against it.

Subcommands

Subcommand	What it does
`atomr-infer serve --config <path>`	Parse the project file, build the actor system, register every deployment, mount the gateway, wait for `Ctrl+C`.
`atomr-infer status --config <path>`	Print the deployments in the project file (validate without running).
`atomr-infer cost-report`	Per-deployment cost — talks to a running `MetricsActor`. (Phase 6 stub.)
`atomr-infer rotate-credentials <name>`	Triggers `RemoteSessionActor::rebuild` on the named deployment. (Phase 6 stub.)

Project file (TOML)

[cluster]
name = "production"
bind = "0.0.0.0:8080"

[[deployment]]
name     = "gpt-4o-mini"
model    = "gpt-4o-mini"
runtime  = "open_ai"
replicas = 2

[deployment.serving]
max_concurrent        = 50
on_capacity_exhausted = "queue"     # queue | reject | fallback

[[deployment]]
name     = "tinyllama-local"
model    = "TinyLlama-1.1B-Chat-Q4_0"
runtime  = "candle"
gpus     = 1
replicas = 1

Build profiles

Build	Use case
`cargo build -p atomr-infer-cli --no-default-features --features remote-only`	Pure-remote router; no GPU deps in the binary.
`cargo build -p atomr-infer-cli --features all-remote`	All four remote providers + pipeline.
`cargo build -p atomr-infer-cli --features default-prod`	The doc's recommended production preset.

atomr-infer-cli 0.8.0

atomr-infer-cli

Quick start

Subcommands

Project file (TOML)

Build profiles