//! lmcpp – `llama.cpp`'s [`llama-server`](https://github.com/ggml-org/llama.cpp/tree/master/tools/server) for Rust
//! =============================================================================================================
//!
//! ## Fully Managed
//! - **Automated Toolchain** – Downloads, builds, and manages the `llama.cpp` toolchain with [`LmcppToolChain`].
//! - **Supported Platforms** – Linux, macOS, and Windows with CPU, CUDA, and Metal support.
//! - **Multiple Versions** – Each release tag and backend is cached separately, allowing you to install multiple versions of `llama.cpp`.
//!
//! ## Blazing Fast UDS
//! - **UDS IPC** – Talks to `llama-server` over Unix domain sockets on Linux, macOS, and Windows.
//! - **Fast!** – Is it faster than HTTP? Yes. Is it *measurably* faster? Maybe.
//!
//! ## Fully Typed / Fully Documented
//! - **Server Args** – *All* `llama-server` arguments implemented by [`ServerArgs`].
//! - **Endpoints** – Each endpoint has request and response types defined.
//! - **Good Docs** – Every parameter was researched to improve upon the original `llama-server` documentation.
//!
//! ## CLI Tools & Web UI
//! - **`lmcpp-toolchain-cli`** – Manage the `llama.cpp` toolchain: download, build, cache.
//! - **`lmcpp-server-cli`** – Start, stop, and list servers.
//! - **Easy Web UI** – Use [`LmcppServerLauncher::webui`] to start with HTTP *and* the Web UI enabled.
//!
//! ---
//!
//! ```rust,no_run
//! use lmcpp::*;
//!
//! fn main() -> LmcppResult<()> {
//!     let server = LmcppServerLauncher::builder()
//!         .server_args(
//!             ServerArgs::builder()
//!                 .hf_repo("bartowski/google_gemma-3-1b-it-qat-GGUF")?
//!                 .build(),
//!         )
//!         .load()?;
//!
//!     let res = server.completion(
//!         CompletionRequest::builder()
//!             .prompt("Tell me a joke about Rust.")
//!             .n_predict(64),
//!     )?;
//!
//!     println!("Completion response: {:#?}", res.content);
//!     Ok(())
//! }
//! ```
//!
//! ```sh
//! # With default model
//! cargo run --bin lmcpp-server-cli -- --webui
//! # Or with a specific model from URL:
//! cargo run --bin lmcpp-server-cli -- --webui -u https://huggingface.co/bartowski/google_gemma-3-1b-it-qat-GGUF/blob/main/google_gemma-3-1b-it-qat-Q4_K_M.gguf
//! # Or with a specific local model:
//! cargo run --bin lmcpp-server-cli -- --webui -l /path/to/local/model.gguf
//! ```
//!
//! ---
//!
//! ## How It Works
//!
//! ```text
//! Your Rust App
//!      │
//!      ├─→ LmcppToolChain      (downloads / builds / caches)
//!      │         ↓
//!      ├─→ LmcppServerLauncher (spawns & monitors)
//!      │         ↓
//!      └─→ LmcppServer         (typed handle over UDS)
//!               │
//!               ├─→ completion()     → text generation
//!               └─→ other endpoints  → tokenize, embeddings, props, …
//! ```
//!
//! ---
//!
//! ### Endpoints ⇄ Typed Helpers
//! | HTTP Route | Helper on `LmcppServer` | Request type | Response type |
//! |---------------------|-------------------------|-------------------------|------------------------|
//! | `POST /completion` | `completion()` | [`CompletionRequest`] | [`CompletionResponse`] |
//! | `POST /infill` | `infill()` | [`InfillRequest`] | [`CompletionResponse`] |
//! | `POST /embeddings` | `embeddings()` | [`EmbeddingsRequest`] | [`EmbeddingsResponse`] |
//! | `POST /tokenize` | `tokenize()` | [`TokenizeRequest`] | [`TokenizeResponse`] |
//! | `POST /detokenize` | `detokenize()` | [`DetokenizeRequest`] | [`DetokenizeResponse`] |
//! | `GET /props` | `props()` | – | [`PropsResponse`] |
//! | *custom* | `status()` ¹ | – | [`ServerStatus`] |
//! | *OpenAI* | `open_ai_v1_*()` | [`serde_json::Value`] | [`serde_json::Value`] |
//!
//! ¹ Internal helper for server health.
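//!
//! The non-completion helpers follow the same builder-in, typed-response-out
//! pattern as `completion()`. Below is a minimal tokenize/detokenize round-trip
//! sketch; the request field names (`content`, `tokens`) and the response field
//! (`tokens`) are illustrative assumptions, not confirmed API — consult the
//! [`TokenizeRequest`] and [`DetokenizeRequest`] docs for the exact shapes:
//!
//! ```rust,no_run
//! use lmcpp::*;
//!
//! fn main() -> LmcppResult<()> {
//!     let server = LmcppServerLauncher::builder()
//!         .server_args(
//!             ServerArgs::builder()
//!                 .hf_repo("bartowski/google_gemma-3-1b-it-qat-GGUF")?
//!                 .build(),
//!         )
//!         .load()?;
//!
//!     // Text → tokens (field name `content` is an assumption).
//!     let tokenized = server.tokenize(
//!         TokenizeRequest::builder().content("Hello, Rust!"),
//!     )?;
//!
//!     // Tokens → text (field names `tokens` are assumptions).
//!     let text = server.detokenize(
//!         DetokenizeRequest::builder().tokens(tokenized.tokens),
//!     )?;
//!
//!     println!("Round-trip text: {:?}", text);
//!     Ok(())
//! }
//! ```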
//!
//! ---
//! ## Supported Platforms
//! | Platform | CPU | CUDA | Metal | Binary Sources |
//! |------------|-----|------|-------|----------------------|
//! | Linux x64 | ✅ | ✅ | – | Pre-built + Source |
//! | macOS ARM | ✅ | – | ✅ | Pre-built + Source |
//! | macOS x64 | ✅ | – | ✅ | Pre-built + Source |
//! | Windows x64| ✅ | ✅ | – | Pre-built + Source |
//!
//! ---
use ;
pub use ;
pub use ;
pub use ;