Module handle


§Server – Handle

Runtime handle for a llama.cpp server process started by this crate.
The handle owns the child process and exposes a typed client so the rest of the crate can talk to the model without caring how it was launched.

§Core ideas

  • Process isolation – the server lives in a separate process, keeping crashes and memory leaks away from the host application.
  • Deterministic boot – start-up distinguishes loading, running and unhealthy/offline states, and fails fast when the server reports a model other than the one requested.
  • Time budgets – callers choose how patient to be for downloads, model loading and individual polling retries.

Dropping LmcppServer stops the external process automatically.
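The drop behaviour described above can be illustrated with a minimal sketch built only on the standard library. `ProcessGuard`, `spawn`, `is_running` and `stop` are hypothetical names chosen for this example and are not the crate's API; the sketch only assumes that the handle owns a `std::process::Child`.

```rust
use std::process::{Child, Command, Stdio};

/// Hypothetical stand-in for a server handle: it owns the child process.
struct ProcessGuard {
    child: Child,
}

impl ProcessGuard {
    /// Spawn the given command with all standard streams detached.
    fn spawn(mut cmd: Command) -> std::io::Result<Self> {
        let child = cmd
            .stdin(Stdio::null())
            .stdout(Stdio::null())
            .stderr(Stdio::null())
            .spawn()?;
        Ok(Self { child })
    }

    /// `try_wait` returns `Ok(None)` while the process is still alive.
    fn is_running(&mut self) -> bool {
        matches!(self.child.try_wait(), Ok(None))
    }

    /// Kill the process and reap it. Errors are ignored because the
    /// process may already have exited on its own.
    fn stop(&mut self) {
        let _ = self.child.kill();
        let _ = self.child.wait();
    }
}

impl Drop for ProcessGuard {
    fn drop(&mut self) {
        // Letting the guard go out of scope is enough to stop the process.
        self.stop();
    }
}
```

Tying the shutdown to `Drop` means an early return or a panic that unwinds still stops the external process, which is the property the sentence above relies on.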

Structs§

DownloadBudget
Maximum time the caller is willing to wait for a model download to complete when the server launches with a remote source (HF repo / URL).
LmcppServer
Handle representing a live llama.cpp server process.
LoadBudget
Maximum time the caller is willing to wait for the model to load after the server binary has started.
RetryDelay
Back‑off applied between repeated health probes during start‑up.
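
How a load budget and a retry delay might interact during start-up can be sketched as a polling loop. This is an illustration under assumptions, not the crate's implementation: `wait_until_ready` is a hypothetical helper, the budgets are modelled as plain `Duration` values, and the health probe is abstracted as a closure that returns `true` once the server is ready.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Poll `probe` until it reports readiness or `budget` is exhausted.
/// Returns the number of probes made on success.
fn wait_until_ready<F>(
    mut probe: F,
    budget: Duration,
    retry_delay: Duration,
) -> Result<u32, String>
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + budget;
    let mut attempts: u32 = 0;
    loop {
        attempts += 1;
        if probe() {
            return Ok(attempts);
        }
        let now = Instant::now();
        if now >= deadline {
            return Err(format!(
                "not ready after {attempts} probes within {budget:?}"
            ));
        }
        // Never sleep past the deadline, so the budget is respected.
        sleep(retry_delay.min(deadline.saturating_duration_since(now)));
    }
}
```

Keeping the overall budget separate from the per-probe delay lets a caller be patient with a slow model load while still probing frequently.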

Enums§

ServerStatus
Observable state of a llama.cpp server obtained via /health and /props.
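
One way to picture the mapping from probe results to a status is the sketch below. The variant names, the `classify` function and its inputs are hypothetical and are not taken from the crate. The sketch assumes that `/health` answers 200 once the model is loaded and 503 while it is still loading, and that a model identifier can be read from `/props`.

```rust
/// Hypothetical status type; the real `ServerStatus` variants may differ.
#[derive(Debug, PartialEq, Eq)]
enum StatusSketch {
    Loading,
    Running { model: String },
    Offline,
}

/// `health_code` is the HTTP status from /health (None if unreachable);
/// `props_model` is the model identifier read from /props, if any.
fn classify(health_code: Option<u16>, props_model: Option<&str>) -> StatusSketch {
    match (health_code, props_model) {
        (Some(200), Some(model)) => StatusSketch::Running {
            model: model.to_string(),
        },
        (Some(503), _) => StatusSketch::Loading,
        // Unreachable, unexpected status, or healthy without a model id:
        // this sketch treats all of these as unhealthy/offline.
        _ => StatusSketch::Offline,
    }
}
```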

Functions§

model_ids_match
Decide whether two user‑supplied model identifiers refer to the same model file.
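
The page does not state the comparison rule used by `model_ids_match`. As a purely illustrative assumption, the sketch below treats two identifiers as equal when their final path components match case-insensitively. `same_model_file` and `file_name_lower` are hypothetical helpers, not the crate's function.

```rust
use std::path::Path;

/// Lower-cased final path component, or None if the id has none.
fn file_name_lower(id: &str) -> Option<String> {
    Path::new(id)
        .file_name()
        .map(|name| name.to_string_lossy().to_lowercase())
}

/// Assumed rule: identifiers match when their file names are equal,
/// ignoring directories and letter case.
fn same_model_file(a: &str, b: &str) -> bool {
    match (file_name_lower(a), file_name_lower(b)) {
        (Some(x), Some(y)) => x == y,
        _ => false,
    }
}
```

A check of this kind is what allows start-up to fail fast when a server that is already running holds a different model from the one the caller asked for.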