Model Service CLI
A command-line interface (CLI) to interact with the Model Service Orchestrator/Proxy. This tool allows you to register, unregister, and list model services managed by the backend server.
Features
- Register: Register a new model service (e.g., a vLLM instance) with the orchestrator, specifying its model name and address.
- Unregister: Remove a previously registered model service from the orchestrator using its address.
- List: Display all currently registered model services, showing their model names and addresses.
Prerequisites
- Rust Toolchain: You need Rust and Cargo installed to build the CLI. Visit rust-lang.org for installation instructions.
- Running Backend Server: The Axum-based Model Service Orchestrator/Proxy must be running and accessible. By default, this CLI expects the server to be at http://127.0.0.1:11450.
Building
- Clone the repository (if the project is kept in one).
- Build the CLI:
  - For a development build, run cargo build. The executable will be in ./target/debug/llmproxy.
  - For a release (optimized) build, run cargo build --release. The executable will be in ./target/release/llmproxy.
Usage
The general command structure is llmproxy <COMMAND> [OPTIONS], where <COMMAND> is register, unregister, or list.
Help is available for the main command and for each subcommand.
Commands
1. register
Registers a new model service with the orchestrator.
Options:
- --model-name <MODEL_NAME>: The name of the model being served (e.g., "Qwen/Qwen2-7B-Instruct"). (Required)
- --addr <ADDR>: The address (host:port) of the model service (e.g., "localhost:8001"). (Required)
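The host:port form expected by --addr can be checked before a request is sent. The following is a minimal sketch, not the CLI's actual code: the function name parse_host_port is hypothetical, and the assumption is that any non-empty host followed by a valid 16-bit port is acceptable.

```rust
/// Splits an address of the form `host:port` into its two parts.
/// Hypothetical helper: returns `None` when the colon is missing,
/// the host is empty, or the port is not a valid 16-bit number.
fn parse_host_port(addr: &str) -> Option<(&str, u16)> {
    // Split at the last colon so the port is always the final component.
    let (host, port) = addr.rsplit_once(':')?;
    if host.is_empty() {
        return None;
    }
    // `u16` parsing rejects non-numeric text and values above 65535.
    let port = port.parse::<u16>().ok()?;
    Some((host, port))
}
```

Under these assumptions, "localhost:8001" yields the host "localhost" and the port 8001, while "localhost" alone is rejected.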
Example:
llmproxy register --model-name "Qwen/Qwen2-7B-Instruct" --addr "localhost:8001"
Expected Output (Success):
Success (201 Created): Server registered successfully
or if already registered:
Success (200 OK): Server already registered
2. unregister
Unregisters an existing model service from the orchestrator using its address.
Options:
- --addr <ADDR>: The address (host:port) of the model service to unregister (e.g., "127.0.0.1:8001"). (Required)
Example:
llmproxy unregister --addr "127.0.0.1:8001"
Expected Output (Success):
Success (200 OK): Server unregistered successfully
or if not found:
Failed (404 Not Found): Server not found
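The status codes shown for register and unregister suggest a registry keyed by service address. The sketch below illustrates that behaviour under stated assumptions; it is not the server's implementation. The Registry type and its methods are hypothetical, and the choice to keep the existing entry when an address is registered twice is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory registry, keyed by service address.
#[derive(Default)]
struct Registry {
    services: BTreeMap<String, String>, // addr -> model name
}

impl Registry {
    /// Returns 201 when the address is new and 200 when it was already
    /// present. Assumption: an existing entry is left unchanged.
    fn register(&mut self, model_name: &str, addr: &str) -> u16 {
        if self.services.contains_key(addr) {
            200
        } else {
            self.services
                .insert(addr.to_string(), model_name.to_string());
            201
        }
    }

    /// Returns 200 when the address was removed and 404 when it was unknown.
    fn unregister(&mut self, addr: &str) -> u16 {
        if self.services.remove(addr).is_some() {
            200
        } else {
            404
        }
    }
}
```

Keying by address rather than by model name matches the fact that unregister takes only --addr.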
3. list
Lists all currently registered model services.
Example:
llmproxy list
Expected Output:
Registered model services (2):
- Model: Qwen/Qwen2-7B-Instruct, Addr: 127.0.0.1:8001
- Model: Llama3-8B, Addr: 127.0.0.1:8002
or if none are registered:
No model services registered.
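The two output shapes above can be produced by a small formatting routine. This is a sketch only: format_list is a hypothetical name, and the input is assumed to be a slice of (model name, address) pairs already obtained from the server.

```rust
/// Formats (model name, address) pairs in the style of the `list` output.
/// Hypothetical helper, not taken from the CLI source.
fn format_list(services: &[(&str, &str)]) -> String {
    if services.is_empty() {
        return "No model services registered.".to_string();
    }
    // Header line carries the number of registered services.
    let mut out = format!("Registered model services ({}):", services.len());
    for (model, addr) in services {
        out.push_str(&format!("\n- Model: {}, Addr: {}", model, addr));
    }
    out
}
```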
Backend Server
This CLI tool is a client for the Axum-based backend server. Ensure the server is running and configured correctly (defaulting to http://127.0.0.1:11450). The server is responsible for:
- Maintaining the list of active model services.
- Proxying incoming requests to the appropriate registered model service based on the model field in the request body.
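Routing by the model field amounts to a lookup from model name to service address. The sketch below shows one way this could work, assuming the model name has already been extracted from the request body. The function route is hypothetical, and returning the first match is an assumption: how the server chooses among several services for the same model is not described here.

```rust
/// Returns the address of the first registered service that serves `model`,
/// or `None` when no service matches. Hypothetical helper.
fn route<'a>(services: &[(&'a str, &'a str)], model: &str) -> Option<&'a str> {
    services
        .iter()
        // Each entry is a (model name, address) pair.
        .find(|(name, _)| *name == model)
        .map(|(_, addr)| *addr)
}
```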
Troubleshooting
- Connection Refused: Ensure the backend server is running and accessible at http://127.0.0.1:11450 (or the configured address if you modify the BASE_URL in the CLI source).
- Unexpected JSON Errors: Verify that the backend server's API responses match what the CLI expects.
- Failed to parse server response: This could indicate an issue with the server's response format or a network problem. The CLI will attempt to print the raw body, which might give clues.
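When diagnosing a refused connection, it can help to confirm that something is listening on the server's port at all. The following sketch uses only the Rust standard library; server_reachable is a hypothetical helper, not part of the CLI, and it accepts only IP:port addresses such as 127.0.0.1:11450 (hostnames are not resolved).

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true when a TCP connection to `addr` succeeds within `timeout`.
/// Hypothetical helper: `addr` must be an IP:port pair; anything that does
/// not parse as a socket address is reported as unreachable.
fn server_reachable(addr: &str, timeout: Duration) -> bool {
    match addr.parse::<SocketAddr>() {
        Ok(socket_addr) => TcpStream::connect_timeout(&socket_addr, timeout).is_ok(),
        Err(_) => false,
    }
}
```

A false result for 127.0.0.1:11450 points to the server not running or listening elsewhere, rather than to a problem in the response format.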