§Server – Handle
Runtime handle for a llama.cpp server process started by this crate.
The handle owns the child process and exposes a typed client so the rest of
the crate can talk to the model without caring how it was launched.
§Core ideas
- Process isolation – the server lives in a separate process, keeping crashes and memory leaks away from the host application.
- Deterministic boot – start-up distinguishes loading, running and unhealthy/offline states, and fails fast when the server reports a model other than the one requested.
- Time budgets – callers choose how patient to be for downloads, model loading and individual polling retries.
Dropping LmcppServer stops the external process automatically.
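The stop-on-drop behaviour can be illustrated with a small guard around `std::process::Child`. This is a minimal sketch under stated assumptions, not the crate's implementation: `ChildGuard`, `spawn`, `pid` and `is_running` are hypothetical names, and the real `LmcppServer` may shut the server down differently (for example with a graceful signal before killing).

```rust
use std::process::{Child, Command, Stdio};

/// Hypothetical guard type; the crate's real `LmcppServer` may differ.
pub struct ChildGuard {
    child: Child,
}

impl ChildGuard {
    /// Spawn `program` with `args` and take ownership of the child process.
    pub fn spawn(program: &str, args: &[&str]) -> std::io::Result<Self> {
        let child = Command::new(program)
            .args(args)
            .stdin(Stdio::null())
            .stdout(Stdio::null())
            .stderr(Stdio::null())
            .spawn()?;
        Ok(Self { child })
    }

    /// OS process id of the owned child.
    pub fn pid(&self) -> u32 {
        self.child.id()
    }

    /// Returns true while the child has not exited.
    pub fn is_running(&mut self) -> bool {
        matches!(self.child.try_wait(), Ok(None))
    }
}

impl Drop for ChildGuard {
    fn drop(&mut self) {
        // Ignore errors: the process may already have exited.
        let _ = self.child.kill();
        // Reap the process so it does not linger as a zombie.
        let _ = self.child.wait();
    }
}
```

Because the clean-up lives in `Drop`, the child is also stopped when the owner goes out of scope early, for instance on an error path that returns with `?`.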
Structs§
- DownloadBudget - Maximum time the caller is willing to wait for a model download to complete when the server launches with a remote source (HF repo / URL).
- LmcppServer - Handle representing a live llama.cpp server process.
- LoadBudget - Maximum time the caller is willing to wait for the model to load after the server binary has started.
- RetryDelay - Back‑off applied between repeated health probes during start‑up.
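How a load budget and a retry delay might interact during start-up can be sketched as a bounded polling loop. The types below are hypothetical stand-ins that reuse the names `LoadBudget` and `RetryDelay` for readability; this page does not show the real types' fields or constructors, and `Probe` and `wait_until_ready` are invented for illustration.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical stand-in: total time allowed for the model to load.
#[derive(Clone, Copy, Debug)]
pub struct LoadBudget(pub Duration);

/// Hypothetical stand-in: pause between two health probes.
#[derive(Clone, Copy, Debug)]
pub struct RetryDelay(pub Duration);

/// Outcome of a single probe (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum Probe {
    Ready,
    NotYet,
}

/// Call `probe` until it reports `Ready` or the budget is spent.
/// Returns the number of probes made on success, or the elapsed time on timeout.
pub fn wait_until_ready<F>(
    budget: LoadBudget,
    delay: RetryDelay,
    mut probe: F,
) -> Result<u32, Duration>
where
    F: FnMut() -> Probe,
{
    let start = Instant::now();
    let mut attempts = 0;
    loop {
        attempts += 1;
        if probe() == Probe::Ready {
            return Ok(attempts);
        }
        let elapsed = start.elapsed();
        if elapsed >= budget.0 {
            return Err(elapsed);
        }
        // Never sleep past the deadline.
        sleep(delay.0.min(budget.0 - elapsed));
    }
}
```

Wrapping the durations in distinct types means a download budget cannot be passed where a retry delay is expected, which is presumably why the crate exposes separate structs rather than bare `Duration` values.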
Enums§
- ServerStatus - Observable state of a llama.cpp server obtained via /health and /props.
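The mapping from probe results to a status can be sketched as a pure function. Everything here is an assumption made for illustration: the `Status` enum and its variants are hypothetical and need not match the real `ServerStatus`, and the sketch assumes that /health answers 503 while the model is loading and 200 once it is ready, with the model identifier taken from /props.

```rust
/// Hypothetical status type; the real `ServerStatus` variants may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Status {
    Offline,
    Loading,
    Running { model: String },
    Unhealthy(String),
}

/// Classify a server from two probe results.
/// `health_code` is the HTTP status of /health, or `None` if the connection failed.
/// `props_model` is the model identifier reported by /props, if any.
pub fn classify(health_code: Option<u16>, props_model: Option<&str>) -> Status {
    match (health_code, props_model) {
        // No connection at all: nothing is listening.
        (None, _) => Status::Offline,
        // Assumption: 503 means the model is still loading.
        (Some(503), _) => Status::Loading,
        (Some(200), Some(model)) => Status::Running {
            model: model.to_string(),
        },
        (Some(200), None) => {
            Status::Unhealthy("healthy, but /props reported no model".to_string())
        }
        (Some(code), _) => Status::Unhealthy(format!("unexpected /health status {}", code)),
    }
}
```

Keeping the classification free of I/O makes each state easy to test without a running server.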
Functions§
- model_ids_match - Decide whether two user‑supplied model identifiers refer to the same model file.
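This page does not state the rule `model_ids_match` applies. As a purely illustrative sketch, one plausible rule is to compare only the final path segment, ignoring ASCII case and an optional `.gguf` extension. The names `normalize` and `ids_match_sketch` are hypothetical, and the crate's actual comparison may be stricter or looser.

```rust
/// Hypothetical normalisation: keep the last path segment,
/// lower-case it and drop a trailing ".gguf".
fn normalize(id: &str) -> String {
    let last = id
        .trim()
        .rsplit(|c: char| c == '/' || c == '\\')
        .next()
        .unwrap_or("");
    let lower = last.to_ascii_lowercase();
    lower.strip_suffix(".gguf").unwrap_or(&lower).to_string()
}

/// Illustrative stand-in for `model_ids_match`; not the crate's actual rule.
pub fn ids_match_sketch(a: &str, b: &str) -> bool {
    let (a, b) = (normalize(a), normalize(b));
    // Two empty identifiers are treated as "unknown", not as a match.
    !a.is_empty() && a == b
}
```

Under this rule a relative path, an absolute path and a bare file name that all end in the same file name compare equal, which suits the fail-fast check at start-up when the server reports its model in a different form from the one the caller supplied.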