Model Service CLI
A command-line interface (CLI) to interact with the Model Service Orchestrator/Proxy. This tool allows you to register, unregister, and list model services managed by the backend server.
Features
- Register: Register a new model service (e.g., a vLLM instance) with the orchestrator, specifying its model name and address.
- Unregister: Remove a previously registered model service from the orchestrator using its index number or address.
- List: Display all currently registered model services in a clean table format with index numbers for easy reference.
Prerequisites
- Rust Toolchain: You need Rust and Cargo installed to build the CLI. Visit rust-lang.org for installation instructions.
- Running Backend Server: The Axum-based Model Service Orchestrator/Proxy must be running and accessible. By default, this CLI expects the server at http://127.0.0.1:11450.
Building
- Clone the repository (if the project is kept in one) and change into its directory.
- Build the CLI:
  - For a development build, run cargo build. The executable will be in ./target/debug/llmproxy.
  - For a release (optimized) build, run cargo build --release. The executable will be in ./target/release/llmproxy.
Usage
The general command structure is:
llmproxy <COMMAND> [OPTIONS]
You can get help for the main command or any subcommand:
llmproxy --help
llmproxy <COMMAND> --help
Commands
1. register
Registers a new model service with the orchestrator.
Options:
--model-name <MODEL_NAME>: The name of the model being served (e.g., "Qwen/Qwen2-7B-Instruct"). (Required)
--addr <ADDR>: The address (host:port) of the model service (e.g., "localhost:8001"). (Required)
Example:
llmproxy register --model-name "meta-llama/Llama-2-7b-chat-hf" --addr "127.0.0.1:8001"
Expected Output (Success):
✔ Registered meta-llama/Llama-2-7b-chat-hf at 127.0.0.1:8001
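Since --addr is described as a host:port pair, a client could reject malformed values before contacting the server. The following is a minimal sketch of such a check; the helper name split_host_port is hypothetical and not taken from the llmproxy source, and IPv6 literals are deliberately out of scope.

```rust
/// Hypothetical helper (not from the llmproxy source): split "host:port"
/// into a host string and a numeric port.
fn split_host_port(addr: &str) -> Result<(String, u16), String> {
    // Split on the last ':' so the port is always the trailing component.
    let (host, port) = addr
        .rsplit_once(':')
        .ok_or_else(|| format!("expected host:port, got '{}'", addr))?;
    if host.is_empty() {
        return Err(format!("missing host in '{}'", addr));
    }
    // Parsing as u16 rejects non-numeric ports and values above 65535.
    let port: u16 = port
        .parse()
        .map_err(|_| format!("invalid port in '{}'", addr))?;
    Ok((host.to_string(), port))
}
```

With this sketch, "localhost:8001" yields the pair ("localhost", 8001), while "localhost", ":8001", and "localhost:99999" are all rejected.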
2. unregister
Unregisters an existing model service from the orchestrator using its index number or address.
Arguments:
<TARGET>: Service index (e.g., 1, 2, 3) or address (e.g., localhost:8001). (Required)
Examples:
# Unregister by index number (most convenient)
llmproxy unregister 1
# Unregister by address (backward compatible)
llmproxy unregister "localhost:8001"
Expected Output (Success):
✔ Unregistered service #1 (127.0.0.1:8001)
or:
✔ Unregistered service at 127.0.0.1:8001
Error Examples:
✖ Index 5 not found. Only 2 services are registered.
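The index-or-address behavior of <TARGET> can be sketched as a small two-step resolution: try to read the argument as a number, otherwise treat it as an address. This is an illustration only; the Target enum and both function names are assumptions, and the real CLI may resolve targets differently (for example, on the server side).

```rust
/// Hypothetical representation of the <TARGET> argument.
#[derive(Debug, PartialEq)]
enum Target {
    Index(usize),
    Addr(String),
}

/// Anything that parses as an unsigned integer is treated as an index;
/// everything else is assumed to be an address.
fn parse_target(raw: &str) -> Target {
    match raw.parse::<usize>() {
        Ok(n) => Target::Index(n),
        Err(_) => Target::Addr(raw.to_string()),
    }
}

/// Resolve a target to an address, given the registered addresses in
/// listing order. Indices are 1-based, matching the labels #1, #2, ...
fn resolve_target(target: &Target, registered: &[String]) -> Result<String, String> {
    match target {
        Target::Index(n) => {
            if *n >= 1 && *n <= registered.len() {
                Ok(registered[*n - 1].clone())
            } else {
                Err(format!(
                    "Index {} not found. Only {} services are registered.",
                    n,
                    registered.len()
                ))
            }
        }
        // Addresses are passed through unchanged in this sketch.
        Target::Addr(addr) => Ok(addr.clone()),
    }
}
```

Resolving the index first keeps `llmproxy unregister 1` short while leaving address-based calls working as before.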
3. list
Lists all currently registered model services in a clean table format with index numbers.
Example:
llmproxy list
Expected Output:
✔ 2 registered services
Label  Model                          Address
#1     meta-llama/Llama-2-7b-chat-hf  10.150.10.75:18012
#2     Qwen/Qwen2-7B-Instruct         127.0.0.1:8001
💡 You can unregister services by index or address:
→ llmproxy unregister 1
→ llmproxy unregister "localhost:8001"
When no services are registered:
ℹ No model services are currently registered
→ Use llmproxy register --model-name <MODEL> --addr <ADDRESS> to register a new service
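To illustrate how a listing like the one above could be produced, here is a minimal sketch that pads the model column to its widest entry. The function render_table is hypothetical; the actual CLI may use a table library, colors, or different spacing, and this sketch does not handle singular/plural wording.

```rust
/// Hypothetical renderer: takes (model, address) pairs in registration
/// order and returns the listing as a single string.
fn render_table(services: &[(&str, &str)]) -> String {
    if services.is_empty() {
        return "ℹ No model services are currently registered\n".to_string();
    }
    // Width of the model column: the longest model name, or the header itself.
    let width = services
        .iter()
        .map(|(model, _)| model.chars().count())
        .max()
        .unwrap_or(0)
        .max("Model".len());

    let mut out = format!("✔ {} registered services\n", services.len());
    out.push_str(&format!(
        "{:<6}{:<width$}  {}\n",
        "Label", "Model", "Address",
        width = width
    ));
    for (i, (model, addr)) in services.iter().enumerate() {
        // Labels are 1-based so they match the indices accepted by unregister.
        let label = format!("#{}", i + 1);
        out.push_str(&format!(
            "{:<6}{:<width$}  {}\n",
            label, model, addr,
            width = width
        ));
    }
    out
}
```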
Backend Server
This CLI tool is a client for the Axum-based backend server. Ensure the server is running and configured correctly (defaulting to http://127.0.0.1:11450). The server is responsible for:
- Maintaining the list of active model services.
- Proxying incoming requests to the appropriate registered model service based on the model field in the request body.
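The two responsibilities above amount to keeping a registry and looking up a target by model name. The sketch below shows one way this could look in memory. It is not the server's actual implementation: the Registry and Service types are hypothetical, the model name is assumed to have already been extracted from the request body, and the "first match wins" rule is an assumption.

```rust
/// Hypothetical in-memory record of one registered model service.
struct Service {
    model_name: String,
    addr: String,
}

/// Hypothetical registry kept by the server.
#[derive(Default)]
struct Registry {
    services: Vec<Service>,
}

impl Registry {
    /// Add a service; registration order determines listing order.
    fn register(&mut self, model_name: &str, addr: &str) {
        self.services.push(Service {
            model_name: model_name.to_string(),
            addr: addr.to_string(),
        });
    }

    /// Remove every service at the given address; returns true if any was removed.
    fn unregister(&mut self, addr: &str) -> bool {
        let before = self.services.len();
        self.services.retain(|s| s.addr != addr);
        self.services.len() != before
    }

    /// Pick the address to proxy to for a request's model value.
    /// Assumption: the first registered service with a matching name wins.
    fn route(&self, model: &str) -> Option<&str> {
        self.services
            .iter()
            .find(|s| s.model_name == model)
            .map(|s| s.addr.as_str())
    }
}
```

A request whose model value matches no registered service gets None here; how the real server responds in that case is not specified in this document.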
Troubleshooting
- Connection Refused: Ensure the backend server is running and accessible at http://127.0.0.1:11450 (or the configured address if you modify the BASE_URL in the CLI source).
✖ Cannot connect to llmproxyd server
→ Make sure the server is running on http://127.0.0.1:11450
→ Start it with: llmproxyd
- Invalid Index: When using numeric indices, ensure they are within the valid range:
✖ Index 5 not found. Only 2 services are registered.
- Server Errors: The CLI provides clear error messages with context and suggestions for resolution.
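When diagnosing a refused connection, it can help to confirm separately that something is listening on the server port. The following is a small sketch using only the Rust standard library; server_reachable is a hypothetical helper, not part of the CLI, and it only checks that a TCP connection can be opened, not that the listener is actually llmproxyd.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Hypothetical helper: returns true if a TCP connection to `addr`
/// (an "ip:port" string such as "127.0.0.1:11450") succeeds within `timeout`.
fn server_reachable(addr: &str, timeout: Duration) -> bool {
    match addr.parse::<SocketAddr>() {
        // connect_timeout needs a concrete socket address, so host names
        // such as "localhost" are not resolved in this sketch.
        Ok(sock) => TcpStream::connect_timeout(&sock, timeout).is_ok(),
        Err(_) => false,
    }
}
```

For the default setup, a call such as server_reachable("127.0.0.1:11450", Duration::from_secs(2)) returning false suggests the server is not running or is bound to a different address.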