# atomr-infer-cli
> The `atomr-infer serve` binary. Boots an actor system, applies every
> `[[deployment]]` in your project file, mounts the gateway.
## Quick start
```sh
cargo run -p atomr-infer-cli --features all-remote -- \
serve --config examples/remote_only_demo/demo.toml
```
…and `curl http://127.0.0.1:8080/v1/chat/completions` against it.
## Subcommands
| `atomr-infer serve --config <path>` | Parse the project file, build the actor system, register every deployment, mount the gateway, wait for `Ctrl+C`. |
| `atomr-infer status --config <path>` | Print the deployments in the project file (validate without running). |
| `atomr-infer cost-report` | Per-deployment cost — talks to a running `MetricsActor`. *(Phase 6 stub.)* |
| `atomr-infer rotate-credentials <name>` | Triggers `RemoteSessionActor::rebuild` on the named deployment. *(Phase 6 stub.)* |
## Project file (TOML)
```toml
[cluster]
name = "production"
bind = "0.0.0.0:8080"
[[deployment]]
name = "gpt-4o-mini"
model = "gpt-4o-mini"
runtime = "open_ai"
replicas = 2
[deployment.serving]
max_concurrent = 50
[[deployment]]
name = "tinyllama-local"
model = "TinyLlama-1.1B-Chat-Q4_0"
runtime = "candle"
gpus = 1
replicas = 1
```
## Build profiles
| `cargo build -p atomr-infer-cli --no-default-features --features remote-only` | Pure-remote router; no GPU deps in the binary. |
| `cargo build -p atomr-infer-cli --features all-remote` | All four remote providers + pipeline. |
| `cargo build -p atomr-infer-cli --features default-prod` | The doc's recommended production preset. |