pub enum ServeCommands {
    Plan {
        model: String,
        gpu: bool,
        batch_size: usize,
        seq_len: usize,
        format: String,
        quant: Option<String>,
    },
    Run {},
}
Inference server subcommands (plan/run).
apr serve plan computes a VRAM budget and throughput estimates and verifies
contracts before a server is started. apr serve run launches the server.
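As a rough illustration of how code downstream of argument parsing might branch on the two subcommands, the sketch below mirrors the enum locally (so it compiles without clap) and matches on each variant. The `describe` helper is hypothetical and is not part of the crate; it only shows the destructuring pattern.

```rust
/// Local mirror of the enum declared above, so this sketch is self-contained.
#[derive(Debug)]
pub enum ServeCommands {
    Plan {
        model: String,
        gpu: bool,
        batch_size: usize,
        seq_len: usize,
        format: String,
        quant: Option<String>,
    },
    Run {},
}

/// Hypothetical helper (not part of the crate): summarize the requested action.
pub fn describe(cmd: &ServeCommands) -> String {
    match cmd {
        // Destructure every field of the `Plan` variant by reference.
        ServeCommands::Plan { model, gpu, batch_size, seq_len, format, quant } => format!(
            "plan model={model} gpu={gpu} batch_size={batch_size} seq_len={seq_len} format={format} quant={}",
            // `quant` is optional; fall back to a placeholder when absent.
            quant.as_deref().unwrap_or("none")
        ),
        // `Run` carries no fields in the declaration above.
        ServeCommands::Run {} => "run".to_string(),
    }
}
```

In the real binary the enum value would presumably come from clap's derived parser rather than being constructed by hand.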
Variants
Plan
Pre-flight inference capacity plan (VRAM budget, roofline, contracts)
Inspects model metadata, detects GPU hardware, and produces a capacity plan showing whether the model fits in VRAM with the requested batch size. No weights are loaded; only file headers are inspected.
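To make the "fits in VRAM" check concrete, here is a back-of-the-envelope estimate. It is a sketch under stated assumptions, not the planner's actual algorithm: the `ModelMeta` struct, the function names, and the budget formula (quantized weights plus KV cache, ignoring activations and runtime overhead) are all hypothetical.

```rust
/// Hypothetical metadata of the kind that could be read from a model header.
pub struct ModelMeta {
    pub param_count: u64,
    pub num_layers: u64,
    pub num_kv_heads: u64,
    pub head_dim: u64,
}

/// Bytes needed to hold the weights at `bits_per_param` bits each (rounded up).
pub fn weight_bytes(meta: &ModelMeta, bits_per_param: u64) -> u64 {
    (meta.param_count * bits_per_param + 7) / 8
}

/// Bytes for the KV cache: keys and values (the factor 2) for every layer,
/// KV head, head dimension, position, and sequence in the batch.
pub fn kv_cache_bytes(meta: &ModelMeta, batch_size: u64, seq_len: u64, bytes_per_elem: u64) -> u64 {
    2 * meta.num_layers
        * meta.num_kv_heads
        * meta.head_dim
        * seq_len
        * batch_size
        * bytes_per_elem
}

/// Assumed fit test: weights plus KV cache must not exceed available VRAM.
/// Activations, scratch buffers, and framework overhead are deliberately ignored.
pub fn fits_in_vram(
    meta: &ModelMeta,
    bits_per_param: u64,
    batch_size: u64,
    seq_len: u64,
    kv_bytes_per_elem: u64,
    vram_bytes: u64,
) -> bool {
    let total = weight_bytes(meta, bits_per_param)
        + kv_cache_bytes(meta, batch_size, seq_len, kv_bytes_per_elem);
    total <= vram_bytes
}
```

The point of the sketch is that batch size and sequence length scale the KV cache linearly, which is why both appear as inputs to the plan; a real planner would add safety margins on top.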
Accepts local files (.gguf, .apr, .safetensors) or HuggingFace repo IDs (hf://org/repo or org/repo). For HF repos, only the ~2KB config.json is fetched, so no weights are downloaded.
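The accepted input forms listed above could be told apart roughly as follows. This is a minimal sketch assuming a simple precedence (explicit hf:// prefix, then known file extension, then bare org/repo); the `ModelRef` type and `classify_model_ref` function are hypothetical, and the tool's real disambiguation rules may differ.

```rust
use std::path::Path;

/// Hypothetical classification of the `model` argument.
#[derive(Debug, PartialEq)]
pub enum ModelRef {
    /// A path ending in .gguf, .apr, or .safetensors.
    LocalFile(String),
    /// A HuggingFace repo ID in "org/repo" form (any "hf://" prefix removed).
    HfRepo(String),
    /// Anything this sketch does not recognize.
    Unknown(String),
}

pub fn classify_model_ref(input: &str) -> ModelRef {
    // 1. An explicit scheme wins outright.
    if let Some(repo) = input.strip_prefix("hf://") {
        return ModelRef::HfRepo(repo.to_string());
    }
    // 2. A known model-file extension marks a local file.
    let ext = Path::new(input).extension().and_then(|e| e.to_str());
    if matches!(ext, Some("gguf" | "apr" | "safetensors")) {
        return ModelRef::LocalFile(input.to_string());
    }
    // 3. Otherwise accept a bare "org/repo": exactly one slash, both parts non-empty.
    let mut parts = input.split('/');
    match (parts.next(), parts.next(), parts.next()) {
        (Some(org), Some(repo), None) if !org.is_empty() && !repo.is_empty() => {
            ModelRef::HfRepo(input.to_string())
        }
        _ => ModelRef::Unknown(input.to_string()),
    }
}
```

Under these assumptions, `classify_model_ref("hf://org/repo")` and `classify_model_ref("org/repo")` both yield `ModelRef::HfRepo("org/repo")`, while `classify_model_ref("models/llama.gguf")` yields a `LocalFile`.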
Fields
model: String
gpu: bool
batch_size: usize
seq_len: usize
format: String
quant: Option<String>
Run
Start inference server (REST API, streaming, metrics)
Trait Implementations
impl Debug for ServeCommands
impl FromArgMatches for ServeCommands
fn from_arg_matches(__clap_arg_matches: &ArgMatches) -> Result<Self, Error>
fn from_arg_matches_mut(__clap_arg_matches: &mut ArgMatches) -> Result<Self, Error>
fn update_from_arg_matches(&mut self, __clap_arg_matches: &ArgMatches) -> Result<(), Error>
Assign values from ArgMatches to self.
fn update_from_arg_matches_mut<'b>(&mut self, __clap_arg_matches: &mut ArgMatches) -> Result<(), Error>
Assign values from ArgMatches to self.
impl Subcommand for ServeCommands
fn augment_subcommands<'b>(__clap_app: Command) -> Command
fn augment_subcommands_for_update<'b>(__clap_app: Command) -> Command
Append to Command so it can instantiate self via FromArgMatches::update_from_arg_matches_mut.
fn has_subcommand(__clap_name: &str) -> bool
Test whether Self can parse a specific subcommand.
Auto Trait Implementations
impl Freeze for ServeCommands
impl RefUnwindSafe for ServeCommands
impl Send for ServeCommands
impl Sync for ServeCommands
impl Unpin for ServeCommands
impl UnsafeUnpin for ServeCommands
impl UnwindSafe for ServeCommands
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
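The practical effect of this blanket impl is that any owned value, ServeCommands included, can be passed where a generic `BorrowMut` bound is expected. The sketch below uses a `Vec<u32>` instead of the enum to stay short; `push_one` is a hypothetical helper written only for this illustration.

```rust
use std::borrow::BorrowMut;

/// Accepts anything that can lend out a `&mut Vec<u32>`: an owned `Vec<u32>`
/// (via the blanket `impl BorrowMut<T> for T`) or a `&mut Vec<u32>`.
pub fn push_one<B: BorrowMut<Vec<u32>>>(mut buf: B) -> B {
    // `borrow_mut` yields `&mut Vec<u32>` regardless of how `buf` is held.
    buf.borrow_mut().push(1);
    buf
}
```

Calling `push_one(vec![0])` returns the owned vector with `1` appended, and calling it with `&mut v` mutates `v` in place.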
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.