atomr-infer-cli 0.8.0

atomr-infer serve CLI binary. Boots the actor system, applies project-file [[deployment]] entries, and mounts the gateway.
Documentation

atomr-infer-cli

The atomr-infer serve binary. Boots an actor system, applies every [[deployment]] in your project file, mounts the gateway.

Quick start

cargo run -p atomr-infer-cli --features all-remote -- \
    serve --config examples/remote_only_demo/demo.toml

…and curl http://127.0.0.1:8080/v1/chat/completions against it.

Subcommands

Subcommand What it does
atomr-infer serve --config <path> Parse the project file, build the actor system, register every deployment, mount the gateway, wait for Ctrl+C.
atomr-infer status --config <path> Print the deployments in the project file (validate without running).
atomr-infer cost-report Per-deployment cost — talks to a running MetricsActor. (Phase 6 stub.)
atomr-infer rotate-credentials <name> Triggers RemoteSessionActor::rebuild on the named deployment. (Phase 6 stub.)

Project file (TOML)

[cluster]
name = "production"
bind = "0.0.0.0:8080"

[[deployment]]
name     = "gpt-4o-mini"
model    = "gpt-4o-mini"
runtime  = "open_ai"
replicas = 2

[deployment.serving]
max_concurrent        = 50
on_capacity_exhausted = "queue"     # queue | reject | fallback

[[deployment]]
name     = "tinyllama-local"
model    = "TinyLlama-1.1B-Chat-Q4_0"
runtime  = "candle"
gpus     = 1
replicas = 1

Build profiles

Build Use case
cargo build -p atomr-infer-cli --no-default-features --features remote-only Pure-remote router; no GPU deps in the binary.
cargo build -p atomr-infer-cli --features all-remote All four remote providers + pipeline.
cargo build -p atomr-infer-cli --features default-prod The doc's recommended production preset.