Crate inferd_client


Rust client for the inferd local-inference daemon.

The wire protocol is NDJSON (newline-delimited JSON) over a Unix socket, a Windows named pipe, or loopback TCP. The spec is frozen as protocol v1; see docs/protocol-v1.md in the inferd repository.
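As a rough illustration of what NDJSON framing means on the wire, the sketch below writes and reads newline-terminated frames using only the standard library. It is not the crate's implementation: `write_frame` and `read_frame` are hypothetical helpers, the JSON payloads are arbitrary, and it assumes the size cap applies to the payload without its trailing newline (the protocol spec is the authority on that detail).

```rust
use std::io::{self, BufRead, Read, Write};

/// 64 MiB, the figure documented for MAX_FRAME_BYTES on this page.
const MAX_FRAME_BYTES: usize = 64 * 1024 * 1024;

/// Hypothetical helper: write one already-serialized JSON value as a frame,
/// i.e. the JSON text followed by a single '\n'.
fn write_frame<W: Write>(out: &mut W, json: &str) -> io::Result<()> {
    if json.contains('\n') {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame contains a newline"));
    }
    out.write_all(json.as_bytes())?;
    out.write_all(b"\n")
}

/// Hypothetical helper: read one frame, returning Ok(None) on clean end of stream.
/// `max_bytes` is a parameter so the cap can be exercised with small inputs.
fn read_frame<R: BufRead>(input: &mut R, max_bytes: usize) -> io::Result<Option<String>> {
    let mut buf = Vec::new();
    // Read at most cap + 1 bytes so an oversized line cannot grow the buffer unbounded.
    let limit = max_bytes as u64 + 1;
    let n = input.by_ref().take(limit).read_until(b'\n', &mut buf)?;
    if n == 0 {
        return Ok(None);
    }
    if buf.last() == Some(&b'\n') {
        buf.pop(); // strip the frame terminator
    } else if buf.len() <= max_bytes {
        // No terminator and not over the cap: the peer hung up mid-frame.
        return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "stream ended mid-frame"));
    }
    if buf.len() > max_bytes {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame exceeds size cap"));
    }
    String::from_utf8(buf)
        .map(Some)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

In practice Client handles framing for you; a sketch like this is mainly useful when inspecting traffic or writing a test double for the daemon.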

There are two patterns for waiting for the daemon to come up; choose based on whether you need progress UX:

  • Pattern A (passive): dial_and_wait_ready retries connect against the inference transport with exponential backoff. Successful connect is the ready signal because the daemon’s inference socket only exists when the backend is ready (THREAT_MODEL F-13 in the upstream repo). Standard Postgres/Redis/etcd client shape.
  • Pattern B (active): AdminClient subscribes to the admin socket and yields lifecycle events (starting/loading_model/ready/restarting/draining). Use this for installer GUIs, dashboards, or middleware that wants to display download progress during first-boot bootstrap.
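The retry shape behind Pattern A can be sketched with the standard library alone. This is a synchronous illustration, not the crate's code: `backoff_delay` and `retry_until_ready` are hypothetical names, the 10 ms base and 1 s ceiling are assumed values, and on timeout it simply returns the last dial error, whereas the real dial_and_wait_ready is async and reports a WaitError.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}

/// Call `dial` until it succeeds, a non-transient error occurs, or the next
/// sleep would run past the deadline. A successful dial is the ready signal.
fn retry_until_ready<T, E>(
    timeout: Duration,
    mut dial: impl FnMut() -> Result<T, E>,
    is_transient: impl Fn(&E) -> bool,
) -> Result<T, E> {
    let deadline = Instant::now() + timeout;
    let mut attempt = 0u32;
    loop {
        match dial() {
            Ok(conn) => return Ok(conn),
            // Permanent failures bubble up immediately instead of being retried.
            Err(err) if !is_transient(&err) => return Err(err),
            Err(err) => {
                // Assumed tuning values: 10 ms base delay, 1 s ceiling.
                let delay = backoff_delay(
                    attempt,
                    Duration::from_millis(10),
                    Duration::from_secs(1),
                );
                if Instant::now() + delay >= deadline {
                    return Err(err);
                }
                sleep(delay);
                attempt += 1;
            }
        }
    }
}
```

Capping the delay keeps the worst-case lag between the daemon becoming ready and the client noticing bounded by the ceiling rather than by the total elapsed wait.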

§Quickstart

use inferd_client::{Client, Request, Message, Role, Response};
use tokio_stream::StreamExt;

let mut client = inferd_client::dial_and_wait_ready(
    std::time::Duration::from_secs(30),
    || Client::dial_tcp("127.0.0.1:47321"),
)
.await?;

let mut stream = client.generate(Request {
    id: "demo-1".into(),
    messages: vec![Message {
        role: Role::User,
        content: "hello".into(),
    }],
    ..Default::default()
})
.await?;

while let Some(frame) = stream.next().await {
    match frame? {
        Response::Token { content, .. } => print!("{content}"),
        Response::Done { stop_reason, backend, .. } => {
            println!("\n[done; backend={backend}, stop={stop_reason:?}]");
        }
        Response::Error { code, message, .. } => {
            eprintln!("[error {code:?}: {message}]");
        }
        Response::Status { .. } => {}
    }
}

Structs§

AdminClient
Subscriber for the inferd admin socket.
AdminEvent
One frame off the admin socket. Fields not relevant to the current status/phase are absent (or default) per the spec’s flattened wire shape.
Client
Inference-socket client.
ImageTokenBudget
Re-exports from inferd-proto so consumers don’t need a separate inferd-proto dep for the wire types. The proto crate IS the version-pin contract for protocol compatibility — inferd-client 0.1 always uses inferd-proto 0.1. Image-token budget; one of VALID_IMAGE_TOKEN_BUDGETS. Wraps a u32 so constructors can enforce the enum at the type level.
Message
One conversation turn carried in Request::messages.
Request
The inference request envelope sent by clients.
Resolved
Request with all defaults applied and validation completed. Backends receive this; they never see the optional-shaped wire form.
Usage
Token-count usage report carried on done frames.

Enums§

ClientError
Errors produced by the inference client.
ErrorCode
Machine-readable error classification carried on error response frames.
ProtoError
Errors produced by the proto crate while parsing or validating frames.
Response
One frame on the response NDJSON stream.
Role
Conversation role attached to each message.
StopReason
Why a generation ended. Carried on done frames.
WaitError
Errors produced by dial_and_wait_ready.

Constants§

MAX_FRAME_BYTES
Hard cap on a single NDJSON frame in bytes (64 MiB).
VALID_IMAGE_TOKEN_BUDGETS
The set of image_token_budget values accepted by the daemon. Any other value is rejected with ErrorCode::InvalidRequest.
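The idea of wrapping a u32 so that only an allowed value can be constructed might look like the sketch below. Everything here is illustrative: `EXAMPLE_BUDGETS` holds placeholder numbers rather than the contents of VALID_IMAGE_TOKEN_BUDGETS, and `Budget` with its `new` and `get` methods is a hypothetical stand-in, not the API of ImageTokenBudget.

```rust
/// Placeholder values for illustration ONLY; consult VALID_IMAGE_TOKEN_BUDGETS
/// for the set the daemon actually accepts.
const EXAMPLE_BUDGETS: [u32; 3] = [64, 128, 256];

/// Hypothetical newtype: the inner u32 is private, so the only way to obtain a
/// `Budget` is through `new`, which enforces membership in the allowed set.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Budget(u32);

impl Budget {
    /// Returns the rejected value as the error so callers can report it.
    fn new(value: u32) -> Result<Self, u32> {
        if EXAMPLE_BUDGETS.contains(&value) {
            Ok(Budget(value))
        } else {
            Err(value)
        }
    }

    /// Read back the validated value.
    fn get(self) -> u32 {
        self.0
    }
}
```

Validating at construction moves the failure to the client side, before a request is sent, instead of surfacing later as an ErrorCode::InvalidRequest frame from the daemon.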

Functions§

default_admin_addr
Default admin endpoint path per platform. Mirrors the daemon’s endpoint::default_admin_addr so clients can reach the spec’d default without hard-coding it.
dial_and_wait_ready
Pattern A passive readiness: retry connect against the inference transport until success or timeout elapses. Successful connect is the ready signal — the daemon’s inference socket only exists when the backend is ready per THREAT_MODEL F-13.
is_transient_dial_error
Returns true if err is the kind of transient connect failure that the daemon’s F-13 ready-gating produces during bring-up (the inference socket doesn’t exist yet). Permanent errors (permission denied, malformed addr) return false and bubble up immediately rather than spamming retries.
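A classifier in the spirit of is_transient_dial_error could be sketched as follows. The mapping is an assumption, not the crate's actual logic: it treats NotFound (a Unix socket path that does not exist yet) and ConnectionRefused (no listener on the loopback TCP port yet) as transient, and everything else, including PermissionDenied and InvalidInput, as permanent. `looks_transient` is a hypothetical name.

```rust
use std::io::{Error, ErrorKind};

/// Hypothetical classifier: true for connect failures that plausibly mean
/// "the daemon has not created its inference endpoint yet".
fn looks_transient(err: &Error) -> bool {
    matches!(
        err.kind(),
        // Assumed mapping: missing socket path, or nothing listening on the port.
        ErrorKind::NotFound | ErrorKind::ConnectionRefused
    )
}
```

Keeping the transient set narrow means misconfiguration such as a bad address or wrong permissions fails fast instead of burning the whole timeout on retries.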

Type Aliases§

FrameStream
Stream of Response frames yielded by Client::generate.