Module handle

§Server – Handle

Runtime handle for a llama.cpp server process started by this crate.
The handle owns the child process and exposes a typed client so the rest of the crate can talk to the model without caring how it was launched.

§Core ideas

Process isolation – the server lives in a separate process, keeping crashes and memory leaks away from the host application.
Deterministic boot – the handle distinguishes loading, running and unhealthy/offline states, and fails fast when the server reports a different model than the one requested.
Time budgets – callers choose how patient to be for downloads, model loading and individual polling retries.
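The time-budget idea can be illustrated with a small polling loop. This is only a sketch under stated assumptions: `wait_until_ready` and the `probe` closure are hypothetical names, and the two `Duration` parameters merely stand in for the roles that `LoadBudget` and `RetryDelay` play; the crate's actual start-up logic is not shown on this page.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Poll `probe` until it returns `true` or `load_budget` is spent.
/// Hypothetical helper; not part of the crate's documented API.
pub fn wait_until_ready<F>(
    mut probe: F,
    load_budget: Duration,
    retry_delay: Duration,
) -> Result<(), String>
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + load_budget;
    loop {
        if probe() {
            return Ok(());
        }
        let now = Instant::now();
        if now >= deadline {
            return Err(format!("server not ready within {:?}", load_budget));
        }
        // Wait between probes, but never sleep past the deadline.
        sleep(retry_delay.min(deadline - now));
    }
}
```

A caller would pass a closure that performs one health probe and returns `true` once the server reports it is ready. Keeping the budget and the delay as separate parameters lets a caller be patient overall while still probing frequently.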

Dropping LmcppServer stops the external process automatically.
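Stop-on-drop behaviour of this kind is commonly built on Rust's `Drop` trait. The sketch below shows the general pattern with a hypothetical `ProcessGuard` type wrapping `std::process::Child`; it is an assumption about how such a handle could work, not the implementation of `LmcppServer`.

```rust
use std::io;
use std::process::{Child, Command, Stdio};

/// Hypothetical stand-in for a handle that owns a child process.
pub struct ProcessGuard {
    child: Child,
}

impl ProcessGuard {
    /// Spawn `cmd` with its standard streams detached and take ownership of it.
    pub fn spawn(mut cmd: Command) -> io::Result<Self> {
        let child = cmd
            .stdin(Stdio::null())
            .stdout(Stdio::null())
            .stderr(Stdio::null())
            .spawn()?;
        Ok(Self { child })
    }

    /// OS-assigned process id of the child.
    pub fn id(&self) -> u32 {
        self.child.id()
    }
}

impl Drop for ProcessGuard {
    fn drop(&mut self) {
        // Errors are ignored: the process may already have exited.
        let _ = self.child.kill();
        // Reap the child so it does not linger as a zombie.
        let _ = self.child.wait();
    }
}
```

With this pattern the child process is terminated whenever the guard goes out of scope, including on early returns and panics that unwind.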

Structs§

DownloadBudget: Maximum time the caller is willing to wait for a model download to complete when the server launches with a remote source (HF repo / URL).
LmcppServer: Handle representing a live llama.cpp server process.
LoadBudget: Maximum time the caller is willing to wait for the model to load after the server binary has started.
RetryDelay: Back‑off applied between repeated health probes during start‑up.

Enums§

ServerStatus: Observable state of a llama.cpp server obtained via /health and /props.
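As a rough illustration of how a status could be derived from the two endpoints, the sketch below classifies a probe result. Everything here is an assumption: the `Status` enum and `classify` function are hypothetical, the variants of the real `ServerStatus` are not listed on this page, and the mapping of HTTP 200 to "ready" and 503 to "still loading" reflects typical llama.cpp server behaviour rather than anything this crate documents.

```rust
/// Hypothetical status type; the crate's `ServerStatus` may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Status {
    /// The server answered but the model is not ready yet.
    Loading,
    /// The server is healthy and reports the given model.
    Running { model: String },
    /// No usable answer from the server.
    Offline,
}

/// Combine the HTTP status code of a `/health` probe with the model
/// identifier read from `/props` (both `None` if the request failed).
pub fn classify(health_code: Option<u16>, model_from_props: Option<&str>) -> Status {
    match (health_code, model_from_props) {
        (Some(200), Some(model)) => Status::Running {
            model: model.to_string(),
        },
        (Some(503), _) => Status::Loading,
        // Connection errors, unexpected codes, or a healthy server
        // without a readable model are all treated as offline here.
        _ => Status::Offline,
    }
}
```

Carrying the reported model inside the running variant is one way to support the fail-fast check described under Core ideas: the caller can compare it against the model it asked for.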

Functions§

model_ids_match: Decide whether two user‑supplied model identifiers refer to the same model file.
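The matching rules used by `model_ids_match` are not described on this page. Purely as an illustration of the problem, the sketch below compares only the final path component of each identifier, ignoring ASCII case. The function name `file_names_match` and this rule are assumptions, and the real function may behave differently.

```rust
/// Hypothetical comparison: two identifiers match when their last
/// path component is the same, ignoring ASCII case.
pub fn file_names_match(a: &str, b: &str) -> bool {
    // Extract the text after the last '/' or '\\', lower-cased.
    fn file_name(id: &str) -> String {
        id.trim()
            .rsplit(|c| c == '/' || c == '\\')
            .next()
            .unwrap_or("")
            .to_ascii_lowercase()
    }

    let (a, b) = (file_name(a), file_name(b));
    // Empty names (for example a trailing slash) never match.
    !a.is_empty() && a == b
}
```

Under this rule `"/models/Qwen.gguf"` and `"qwen.gguf"` would be treated as the same file. The sketch does not handle URL query strings or repository-style identifiers.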
