# lmcpp 0.1.1

Rust bindings for `llama.cpp`'s server, with a managed toolchain, typed endpoints, and UDS/HTTP support.
<!-- cargo-rdme start -->

lmcpp – `llama.cpp`'s [`llama-server`](https://github.com/ggml-org/llama.cpp/tree/master/tools/server) for Rust
=============================================================================================================

## Fully Managed
- **Automated Toolchain** – Downloads, builds, and manages the `llama.cpp` toolchain with [`LmcppToolChain`].  
- **Supported Platforms** – Linux, macOS, and Windows with CPU, CUDA, and Metal support.  
- **Multiple Versions** – Each release tag and backend is cached separately, so multiple versions of `llama.cpp` can be installed side by side (see the sketch below).
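
For standalone toolchain management (outside the launcher), a minimal sketch might look like the following; `release_tag` is a hypothetical builder method shown for illustration, not a confirmed part of the [`LmcppToolChain`] API:

```rust
use lmcpp::*;

fn main() -> LmcppResult<()> {
    // Hypothetical sketch: `release_tag` is an assumed method name, not a
    // confirmed part of the LmcppToolChain API. Each tag/backend is cached
    // separately, so multiple versions can coexist.
    let _toolchain = LmcppToolChain::builder()
        .release_tag("b4600")
        .build();
    Ok(())
}
```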

## Blazing Fast UDS
- **UDS IPC** – Talks to `llama-server` over Unix domain sockets on Linux, macOS, and Windows.  
- **Fast!** – Is it faster than HTTP? Yes. Is it *measurably* faster? Maybe.

## Fully Typed / Fully Documented
- **Server Args** – *All* `llama-server` arguments are implemented by [`ServerArgs`].  
- **Endpoints** – Every endpoint has dedicated request and response types.
- **Good Docs** – Every parameter was researched to improve upon the original `llama-server` documentation.

## CLI Tools & Web UI
- **`lmcpp-toolchain-cli`** – Manage the `llama.cpp` toolchain: download, build, cache.  
- **`lmcpp-server-cli`**    – Start, stop, and list servers.  
- **Easy Web UI** – Use [`LmcppServerLauncher::webui`] to start with HTTP *and* the Web UI enabled (see the sketch below).
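
A minimal Rust sketch of the Web UI path, assuming [`LmcppServerLauncher::webui`] is a drop-in alternative to `load` on the builder (the exact signature may differ; check the crate docs):

```rust
use lmcpp::*;

fn main() -> LmcppResult<()> {
    // Assumption: `webui()` finishes the builder like `load()`, but with HTTP
    // and the Web UI enabled; verify the exact signature in the crate docs.
    let _server = LmcppServerLauncher::builder()
        .server_args(ServerArgs::builder().build())
        .webui()?;
    Ok(())
}
```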

---

```rust
use lmcpp::*;

fn main() -> LmcppResult<()> {
    // Downloads/builds the toolchain, fetches the model, and spawns llama-server.
    let server = LmcppServerLauncher::builder()
        .server_args(
            ServerArgs::builder()
                .hf_repo("bartowski/google_gemma-3-1b-it-qat-GGUF")?
                .build(),
        )
        .load()?;

    // Send a typed completion request; the builder is passed directly.
    let res = server.completion(
        CompletionRequest::builder()
            .prompt("Tell me a joke about Rust.")
            .n_predict(64),
    )?;

    println!("Completion response: {:#?}", res.content);
    Ok(())
}
```

```sh
# With the default model:
cargo run --bin lmcpp-server-cli -- --webui
# Or with a specific model from a URL:
cargo run --bin lmcpp-server-cli -- --webui -u https://huggingface.co/bartowski/google_gemma-3-1b-it-qat-GGUF/blob/main/google_gemma-3-1b-it-qat-Q4_K_M.gguf
# Or with a specific local model:
cargo run --bin lmcpp-server-cli -- --webui -l /path/to/local/model.gguf
```

---

## How It Works

```text
Your Rust App
      ├─→ LmcppToolChain        (downloads / builds / caches)
      │         ↓
      ├─→ LmcppServerLauncher   (spawns & monitors)
      │         ↓
      └─→ LmcppServer           (typed handle over UDS or HTTP)
                ├─→ completion()       → text generation
                └─→ other endpoints    → embeddings, tokenize, props, …
```

---

### Endpoints ⇄ Typed Helpers
| HTTP Route          | Helper on `LmcppServer` | Request type            | Response type          |
|---------------------|-------------------------|-------------------------|------------------------|
| `POST /completion`  | `completion()`          | [`CompletionRequest`]   | [`CompletionResponse`] |
| `POST /infill`      | `infill()`              | [`InfillRequest`]       | [`CompletionResponse`] |
| `POST /embeddings`  | `embeddings()`          | [`EmbeddingsRequest`]   | [`EmbeddingsResponse`] |
| `POST /tokenize`    | `tokenize()`            | [`TokenizeRequest`]     | [`TokenizeResponse`]   |
| `POST /detokenize`  | `detokenize()`          | [`DetokenizeRequest`]   | [`DetokenizeResponse`] |
| `GET  /props`       | `props()`               | –                       | [`PropsResponse`]      |
| *custom*            | `status()` ¹            | –                       | [`ServerStatus`]       |
| *OpenAI*            | `open_ai_v1_*()`        | [`serde_json::Value`]   | [`serde_json::Value`]  |

¹ Internal helper for server health.
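
The other helpers follow the same call shape as `completion()` in the example above. As a sketch, a tokenize call might look like the following; `llama-server`'s `/tokenize` endpoint takes a `content` field, but the exact [`TokenizeRequest`] builder methods are assumptions here:

```rust
use lmcpp::*;

fn main() -> LmcppResult<()> {
    let server = LmcppServerLauncher::builder()
        .server_args(
            ServerArgs::builder()
                .hf_repo("bartowski/google_gemma-3-1b-it-qat-GGUF")?
                .build(),
        )
        .load()?;

    // Assumed builder method: `content` mirrors llama-server's /tokenize field.
    let tokens = server.tokenize(
        TokenizeRequest::builder().content("Hello from lmcpp!"),
    )?;

    println!("Tokenize response: {:#?}", tokens);
    Ok(())
}
```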

---
## Supported Platforms
| Platform    | CPU | CUDA | Metal | Binary Sources     |
|-------------|-----|------|-------|--------------------|
| Linux x64   | ✅  | ✅   | –     | Pre-built + Source |
| macOS ARM   | ✅  | –    | ✅    | Pre-built + Source |
| macOS x64   | ✅  | –    | ✅    | Pre-built + Source |
| Windows x64 | ✅  | ✅   | –     | Pre-built + Source |

---

<!-- cargo-rdme end -->