Struct Cli

pub struct Cli {
    pub backend: BackendKind,
    pub lock: PathBuf,
    pub tcp: Option<String>,
    pub uds: Option<PathBuf>,
    pub pipe: Option<String>,
    pub group: Option<String>,
    pub active_permits: usize,
    pub queue_depth: usize,
    pub ready_timeout_secs: u64,
    pub model_path: Option<PathBuf>,
    pub model_sha256: Option<String>,
    pub n_ctx: u32,
    pub n_gpu_layers: i32,
    pub openai_base_url: Option<String>,
    pub openai_api_key: Option<String>,
    pub openai_model: Option<String>,
    pub openai_timeout_secs: u64,
    pub bedrock_region: Option<String>,
    pub bedrock_model_id: Option<String>,
    pub bedrock_bearer_token: Option<String>,
    pub bedrock_endpoint: Option<String>,
    pub bedrock_timeout_secs: u64,
    pub api_key: Option<String>,
    pub config: Option<PathBuf>,
    pub admin_addr: Option<PathBuf>,
    pub v2: bool,
    pub v2_addr: Option<PathBuf>,
    pub v2_tcp: Option<String>,
    pub embed: bool,
    pub embed_addr: Option<PathBuf>,
    pub embed_tcp: Option<String>,
}

Top-level CLI for inferd-daemon.

Fields§

§backend: BackendKind

Backend to load at startup.

§lock: PathBuf

Path to the single-instance lock file. The lock is held for the lifetime of the daemon process.

§tcp: Option<String>

Loopback TCP bind address. Mutually exclusive with --uds and --pipe.

§uds: Option<PathBuf>

Unix domain socket path. Mutually exclusive with --tcp and --pipe. Unix only.

§pipe: Option<String>

Windows named pipe path (e.g. \\.\pipe\inferd-infer). Mutually exclusive with --tcp and --uds. Windows only.

§group: Option<String>

Group name for the UDS (Unix only). Ignored on other transports.

§active_permits: usize

Number of generations served concurrently. The v0.1 invariant is 1; values above 1 are reserved for v0.2 continuous-batching backends.

§queue_depth: usize

Maximum number of requests allowed to wait in the queue. Submissions beyond this limit fail immediately with code: queue_full.
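The fail-fast behaviour described above can be sketched as a bounded queue that rejects rather than blocks. This is only an illustration: the `Admission` type and its `submit` method are hypothetical names, and the sketch ignores how the daemon hands waiting work to its active permits.

```rust
use std::collections::VecDeque;

/// Hypothetical bounded waiting queue; not the daemon's actual type.
struct Admission<T> {
    waiting: VecDeque<T>,
    queue_depth: usize,
}

impl<T> Admission<T> {
    fn new(queue_depth: usize) -> Self {
        Admission { waiting: VecDeque::new(), queue_depth }
    }

    /// Enqueue a job, or reject it at once when the queue is already full.
    /// The error string mirrors the `queue_full` code named in the docs.
    fn submit(&mut self, job: T) -> Result<(), &'static str> {
        if self.waiting.len() >= self.queue_depth {
            return Err("queue_full");
        }
        self.waiting.push_back(job);
        Ok(())
    }
}
```

Rejecting immediately keeps latency bounded for callers, who can retry or back off instead of waiting behind an unbounded backlog.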

§ready_timeout_secs: u64

Seconds to wait for the backend to report ready before failing startup.

§model_path: Option<PathBuf>

Path to the GGUF model file. Required when --backend llamacpp.

§model_sha256: Option<String>

Optional expected SHA-256 of the model file as a hex string (64 chars). When present, the daemon verifies the file before loading it, comparing the digests with subtle::ConstantTimeEq (THREAT_MODEL F-5).
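As a rough illustration of the two checks implied here, the sketch below validates the 64-character hex shape and compares two byte strings without exiting early on the first mismatch. Both function names are hypothetical. The hand-rolled comparison only shows the idea: the daemon is documented to use subtle::ConstantTimeEq, and a plain loop like this one carries no guarantee against compiler optimizations.

```rust
/// True when `s` has the shape of a hex-encoded SHA-256 digest:
/// exactly 64 ASCII hex digits.
fn is_sha256_hex(s: &str) -> bool {
    s.len() == 64 && s.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Compare two byte slices by accumulating differences instead of
/// returning at the first mismatch. Illustration only; prefer a vetted
/// crate such as `subtle` in real code.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y;
    }
    diff == 0
}
```

Whether the daemon normalizes hex case before comparing is not stated in the source, so a caller of this sketch would need to decide that separately.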

§n_ctx: u32

Llama.cpp context window in tokens. Default 8192.

§n_gpu_layers: i32

Llama.cpp GPU layer offload count. 0 = CPU-only. GPU support requires the cuda/metal/vulkan/rocm cargo feature at build time.

§openai_base_url: Option<String>

Base URL of the upstream OpenAI-compat endpoint, no trailing slash and no path (the adapter appends /v1/chat/completions). Required when --backend openai-compat. Examples: https://api.openai.com, http://localhost:11434, https://openrouter.ai.

§openai_api_key: Option<String>

Bearer token for the OpenAI-compat upstream. Sent as Authorization: Bearer <value>. Pass an empty string to skip the header entirely for self-hosted endpoints. Resolves from --openai-api-key, then INFERD_OPENAI_API_KEY, then OPENAI_API_KEY (the de-facto env name most providers’ SDKs already use).
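The resolution order above can be sketched as a chain of fallbacks. The function names are hypothetical, and `lookup` stands in for an environment read such as `std::env::var(name).ok()` so the order can be exercised without touching the process environment. The sketch assumes an explicitly passed empty string counts as a resolved value, which is what lets it suppress the header.

```rust
/// Resolve the key from the flag first, then the two environment names.
/// `lookup` is a stand-in for reading an environment variable.
fn resolve_openai_api_key(
    flag: Option<&str>,
    lookup: impl Fn(&str) -> Option<String>,
) -> Option<String> {
    flag.map(str::to_owned)
        .or_else(|| lookup("INFERD_OPENAI_API_KEY"))
        .or_else(|| lookup("OPENAI_API_KEY"))
}

/// Build the Authorization header value, or None when the key is empty
/// and the header should be skipped entirely.
fn authorization_header(key: &str) -> Option<String> {
    if key.is_empty() {
        None
    } else {
        Some(format!("Bearer {key}"))
    }
}
```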

§openai_model: Option<String>

Upstream model identifier echoed in the request model field — provider-specific (e.g. gpt-4o-mini, llama3.1:8b, meta-llama/Meta-Llama-3-70B-Instruct). Required when --backend openai-compat.

§openai_timeout_secs: u64

Total request timeout for OpenAI-compat calls, in seconds. Default 300 (5 minutes) — long enough for a slow first-token from a cold cloud model, short enough to surface stuck requests rather than hang forever.

§bedrock_region: Option<String>

AWS region the Bedrock endpoint lives in, e.g. us-east-1, eu-central-1. Required when --backend bedrock-invoke. Used for both the endpoint host and SigV4 signing scope.

§bedrock_model_id: Option<String>

Bedrock model id (URL-encoded by the adapter), e.g. anthropic.claude-3-5-sonnet-20241022-v2:0. Required when --backend bedrock-invoke.

§bedrock_bearer_token: Option<String>

Pre-issued Bedrock bearer token (the AWS_BEARER_TOKEN_BEDROCK format, which AWS rolled out in 2025-06). When set, the adapter sends Authorization: Bearer <value> and skips SigV4. When unset, the adapter falls back to SigV4 signing with the standard AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY (+ optional AWS_SESSION_TOKEN) chain.

§bedrock_endpoint: Option<String>

Override the Bedrock endpoint host. When empty or absent, the host defaults to bedrock-runtime.<region>.amazonaws.com. Useful for VPC endpoints and integration tests.
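A minimal sketch of that defaulting rule, assuming the override is treated as a bare host name. The function name `bedrock_host` is hypothetical and not taken from the adapter.

```rust
/// Pick the Bedrock host: a non-empty override wins, otherwise fall back
/// to the regional default named in the field documentation.
fn bedrock_host(region: &str, endpoint_override: Option<&str>) -> String {
    match endpoint_override {
        Some(host) if !host.is_empty() => host.to_owned(),
        _ => format!("bedrock-runtime.{region}.amazonaws.com"),
    }
}
```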

§bedrock_timeout_secs: u64

Total request timeout for Bedrock calls, in seconds. Default 300 (5 minutes).

§api_key: Option<String>

Optional pre-shared API key. When set, TCP clients MUST send {"type":"auth","key":"<this value>"} as their first NDJSON frame on the connection or the daemon closes the connection. UDS and named-pipe transports ignore this — kernel-attested peer credentials (F-7) do the work there.

Comparison is constant-time. THREAT_MODEL F-8.
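From a TCP client's side, the first frame could be produced roughly as follows. This is a sketch using only the standard library: `json_escape` and `auth_frame` are hypothetical helpers, and a real client would more likely use a JSON library. The trailing newline reflects the NDJSON framing, where each frame is one line.

```rust
/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the auth frame that must be the first line sent on a TCP connection.
fn auth_frame(key: &str) -> String {
    format!("{{\"type\":\"auth\",\"key\":\"{}\"}}\n", json_escape(key))
}
```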

§config: Option<PathBuf>

Path to the operator JSON config file. Default ~/.inferd/config.json. When present, fetch + auto-pull are driven from it; CLI flags (--model-path, --model-sha256, --n-ctx, --n-gpu-layers) override config-file values when both are supplied. When absent, the daemon falls back to CLI-flag-only operation (dev mode).
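The override rule for the four model flags can be pictured as a field-by-field merge in which a CLI value, when supplied, wins over the config file. The `ModelSettings` struct and `merge` function are hypothetical, and modelling every field as an `Option` is an assumption: the real `Cli` stores `n_ctx` and `n_gpu_layers` as plain integers, so how the daemon detects "supplied" is not shown in the source.

```rust
use std::path::PathBuf;

/// Hypothetical view of the settings that both sources can provide.
#[derive(Debug, Clone, Default, PartialEq)]
struct ModelSettings {
    model_path: Option<PathBuf>,
    model_sha256: Option<String>,
    n_ctx: Option<u32>,
    n_gpu_layers: Option<i32>,
}

/// Prefer the CLI value for each field, falling back to the config file.
fn merge(cli: ModelSettings, config: ModelSettings) -> ModelSettings {
    ModelSettings {
        model_path: cli.model_path.or(config.model_path),
        model_sha256: cli.model_sha256.or(config.model_sha256),
        n_ctx: cli.n_ctx.or(config.n_ctx),
        n_gpu_layers: cli.n_gpu_layers.or(config.n_gpu_layers),
    }
}
```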

§admin_addr: Option<PathBuf>

Admin endpoint path. Defaults per-platform to the path documented in docs/protocol-v1.md §“Admin endpoint” — e.g. /run/inferd/admin.sock on Linux, \\.\pipe\inferd-admin on Windows. Override for tests / non-default deployments.

§v2: bool

Enable the v2 inference endpoint per ADR 0015. v2 binds on a separate socket from v1: infer.v2.sock on Unix / \\.\pipe\inferd-infer-v2 on Windows. v1 stays on its own socket and is unaffected.

Phase 1B: the v2 endpoint accepts and validates v2 requests but returns Error{code:internal, message:"v2 generation not implemented"} because the Backend trait does not yet expose generate_v2. Use this to integration-test middleware that will speak v2 once Phase 2A lands.

§v2_addr: Option<PathBuf>

Override the default v2 inference endpoint path. Mirrors --uds / --pipe for v2; on Linux/macOS this is a UDS path, on Windows a named-pipe path. Has no effect unless --v2 is also set.

§v2_tcp: Option<String>

Loopback TCP bind address for the v2 endpoint. Mutually exclusive with --v2-addr. Useful for tests that don’t want the platform default (UDS / named pipe). Has no effect unless --v2 is also set.

§embed: bool

Enable the embed inference endpoint per ADR 0017. The embed endpoint binds on a separate socket from v1/v2: infer.embed.sock on Unix / \\.\pipe\inferd-infer-embed on Windows. Has no effect unless the active backend’s capabilities().embed is true (capability-driven binding).

§embed_addr: Option<PathBuf>

Override the default embed inference endpoint path. Mirrors --uds / --pipe for embed; on Linux/macOS this is a UDS path, on Windows a named-pipe path. Has no effect unless --embed is also set.

§embed_tcp: Option<String>

Loopback TCP bind address for the embed endpoint. Mutually exclusive with --embed-addr. Has no effect unless --embed is also set.

Implementations§

impl Cli

pub fn require_one_transport(&self) -> Result<(), &'static str>
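The method body is not shown on this page. Given the mutual-exclusion notes on --tcp, --uds, and --pipe, a check of this kind might look like the sketch below. The `Transports` struct is a hypothetical stand-in for `Cli`, the error messages are invented, and treating "no transport selected" as an error is an assumption drawn from the method name.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in carrying only the three transport fields of `Cli`.
struct Transports {
    tcp: Option<String>,
    uds: Option<PathBuf>,
    pipe: Option<String>,
}

impl Transports {
    /// Succeed only when exactly one transport flag was supplied.
    fn require_one_transport(&self) -> Result<(), &'static str> {
        let selected = [self.tcp.is_some(), self.uds.is_some(), self.pipe.is_some()]
            .iter()
            .filter(|&&set| set)
            .count();
        match selected {
            1 => Ok(()),
            0 => Err("one of --tcp, --uds, or --pipe is required"),
            _ => Err("--tcp, --uds, and --pipe are mutually exclusive"),
        }
    }
}
```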

Trait Implementations§

impl Args for Cli

fn group_id() -> Option<Id>

fn augment_args<'b>(__clap_app: Command) -> Command

fn augment_args_for_update<'b>(__clap_app: Command) -> Command

impl CommandFactory for Cli

fn command<'b>() -> Command

fn command_for_update<'b>() -> Command

impl Debug for Cli

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl FromArgMatches for Cli

fn from_arg_matches(__clap_arg_matches: &ArgMatches) -> Result<Self, Error>

fn from_arg_matches_mut(__clap_arg_matches: &mut ArgMatches) -> Result<Self, Error>

fn update_from_arg_matches(&mut self, __clap_arg_matches: &ArgMatches) -> Result<(), Error>

fn update_from_arg_matches_mut(&mut self, __clap_arg_matches: &mut ArgMatches) -> Result<(), Error>

impl Parser for Cli

fn parse() -> Self

fn try_parse() -> Result<Self, Error>

fn parse_from<I, T>(itr: I) -> Self
where I: IntoIterator<Item = T>, T: Into<OsString> + Clone,

fn try_parse_from<I, T>(itr: I) -> Result<Self, Error>
where I: IntoIterator<Item = T>, T: Into<OsString> + Clone,

fn update_from<I, T>(&mut self, itr: I)
where I: IntoIterator<Item = T>, T: Into<OsString> + Clone,

fn try_update_from<I, T>(&mut self, itr: I) -> Result<(), Error>
where I: IntoIterator<Item = T>, T: Into<OsString> + Clone,

Auto Trait Implementations§

impl Freeze for Cli

impl RefUnwindSafe for Cli

impl Send for Cli

impl Sync for Cli

impl Unpin for Cli

impl UnsafeUnpin for Cli

impl UnwindSafe for Cli

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> From<T> for T

fn from(t: T) -> T

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

fn in_current_span(self) -> Instrumented<Self>

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

impl<T> Same for T

type Output = T

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

fn with_current_subscriber(self) -> WithDispatch<Self>
