Model Service CLI


A command-line interface (CLI) to interact with the Model Service Orchestrator/Proxy. This tool allows you to register, unregister, and list model services managed by the backend server.

Features

  • Register: Register a new model service (e.g., a vLLM instance) with the orchestrator, specifying its model name and address.
  • Unregister: Remove a previously registered model service from the orchestrator using its address.
  • List: Display all currently registered model services, showing their model names and addresses.

Prerequisites

  1. Rust Toolchain: You need Rust and Cargo installed to build the CLI. Visit rust-lang.org for installation instructions.
  2. Running Backend Server: The Axum-based Model Service Orchestrator/Proxy must be running and accessible. By default, this CLI expects the server to be at http://127.0.0.1:11450.

Building

  1. Clone the repository (if you don't already have it locally):
    git clone <your-repo-url>
    cd <repository-name>
    
  2. Build the CLI:
    • For a development build:
      cargo build
      
      The executable will be in ./target/debug/llmproxy.
    • For a release (optimized) build:
      cargo build --release
      
      The executable will be in ./target/release/llmproxy.
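
If the crate is published on crates.io, you can also skip the manual build and install the binary directly with Cargo (assuming the published crate and binary are both named llmproxy):

cargo install llmproxy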

Usage

The general command structure is:

./path/to/llmproxy <COMMAND> [OPTIONS]

You can get help for the main command or any subcommand:

./path/to/llmproxy --help
./path/to/llmproxy register --help

Commands

1. register

Registers a new model service with the orchestrator.

Options:

  • --model-name <MODEL_NAME>: The name of the model being served (e.g., "Qwen/Qwen2-7B-Instruct"). (Required)
  • --addr <ADDR>: The address (host:port) of the model service (e.g., "localhost:8001"). (Required)

Example:

./target/debug/llmproxy register --model-name "Qwen/Qwen2-7B-Instruct" --addr "127.0.0.1:8001"

Expected Output (Success):

Success (201 Created): Server registered successfully

or if already registered:

Success (200 OK): Server already registered
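
Under the hood, the CLI sends an HTTP request to the backend server. A roughly equivalent raw request might look like the sketch below; the /register path and the JSON field names are assumptions, so check the server source for the actual API:

curl -i -X POST http://127.0.0.1:11450/register \
  -H "Content-Type: application/json" \
  -d '{"model_name": "Qwen/Qwen2-7B-Instruct", "addr": "127.0.0.1:8001"}'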

2. unregister

Unregisters an existing model service from the orchestrator using its address.

Options:

  • --addr <ADDR>: The address (host:port) of the model service to unregister (e.g., "127.0.0.1:8001"). (Required)

Example:

./target/debug/llmproxy unregister --addr "127.0.0.1:8001"

Expected Output (Success):

Success (200 OK): Server unregistered successfully

or if not found:

Failed (404 Not Found): Server not found

3. list

Lists all currently registered model services.

Example:

./target/debug/llmproxy list

Expected Output:

Registered model services (2):
  - Model: Qwen/Qwen2-7B-Instruct, Addr: 127.0.0.1:8001
  - Model: Llama3-8B, Addr: 127.0.0.1:8002

or if none are registered:

No model services registered.
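
The two-entry listing shown above could be produced by a sequence like this (model names and addresses are illustrative):

./target/debug/llmproxy register --model-name "Qwen/Qwen2-7B-Instruct" --addr "127.0.0.1:8001"
./target/debug/llmproxy register --model-name "Llama3-8B" --addr "127.0.0.1:8002"
./target/debug/llmproxy list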

Backend Server

This CLI tool is a client for the Axum-based backend server. Ensure the server is running and configured correctly (defaulting to http://127.0.0.1:11450). The server is responsible for:

  • Maintaining the list of active model services.
  • Proxying incoming requests to the appropriate registered model service based on the model field in the request body.

Start the server with:

cargo run --release --bin llmproxyd
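
Once the server and at least one model service are running, clients send requests to the proxy instead of the service directly. Assuming the registered backends expose an OpenAI-compatible API (as vLLM instances do) and that the proxy forwards request paths unchanged (an assumption worth verifying against the server source), a proxied request might look like:

curl http://127.0.0.1:11450/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen/Qwen2-7B-Instruct", "messages": [{"role": "user", "content": "Hello"}]}'

The model field in the request body is what the proxy matches against the registered model names.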

Troubleshooting

  • Connection Refused: Ensure the backend server is running and accessible at http://127.0.0.1:11450 (or the configured address if you modify the BASE_URL in the CLI source).
  • Unexpected JSON Errors: Verify that the backend server's API responses match what the CLI expects.
  • Failed to parse server response: This could indicate an issue with the server's response format or a network problem. The CLI attempts to print the raw response body, which may offer clues.
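
For connection issues, a plain curl against the base URL is a quick first check that something is listening (any HTTP response, even a 404, means the server is up; "Connection refused" means it is not):

curl -v http://127.0.0.1:11450/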