oxibonsai-serve
Status: Stable — Version: 0.1.4 — Tests: 161 passing
Standalone OpenAI-compatible inference server for OxiBonsai.
Binary crate providing an HTTP server with /v1/chat/completions endpoint,
configurable host/port, model path, sampling parameters, and structured logging.
Uses pure std::env argument parsing — no clap dependency. Delegates the
engine and HTTP stack to oxibonsai-runtime.
Part of the OxiBonsai project.
Usage
# Install
# Start server
# With options
Options
| Flag | Default | Description |
|---|---|---|
--model <PATH> |
required | Path to GGUF model file |
--host <HOST> |
0.0.0.0 |
Bind address |
--port <PORT> |
8080 |
Bind port |
--tokenizer <PATH> |
auto | Optional tokenizer path |
--max-tokens <N> |
256 |
Default max tokens |
--temperature <F> |
0.7 |
Sampling temperature |
--seed <N> |
42 |
RNG seed |
--log-level <LEVEL> |
info |
error/warn/info/debug/trace |
--auth-token <TOKEN> |
optional | Bearer auth token for request authentication |
License
Apache-2.0 — COOLJAPAN OU