pub struct ModelDeploymentCard {
pub display_name: String,
pub model_info: Option<ModelInfoType>,
pub tokenizer: Option<TokenizerKind>,
pub prompt_formatter: Option<PromptFormatterArtifact>,
pub chat_template_file: Option<PromptFormatterArtifact>,
pub gen_config: Option<GenerationConfig>,
pub prompt_context: Option<Vec<PromptContextMixin>>,
pub last_published: Option<DateTime<Utc>>,
pub revision: u64,
pub context_length: u32,
pub kv_cache_block_size: u32,
pub migration_limit: u32,
pub user_data: Option<Value>,
pub runtime_config: ModelRuntimeConfig,
/* private fields */
}
Fields
display_name: String
Human readable model name, e.g. “Meta Llama 3.1 8B Instruct”
model_info: Option<ModelInfoType>
Model information
tokenizer: Option<TokenizerKind>
Tokenizer configuration
prompt_formatter: Option<PromptFormatterArtifact>
Prompt Formatter configuration
chat_template_file: Option<PromptFormatterArtifact>
The chat template may be stored as a separate file instead of in prompt_formatter.
gen_config: Option<GenerationConfig>
Generation config - default sampling params
prompt_context: Option<Vec<PromptContextMixin>>
Prompt Formatter Config
last_published: Option<DateTime<Utc>>
When this card was last advertised by a worker. None if not yet published.
revision: u64
Incrementing count of how many times we published this card
context_length: u32
Max context (in number of tokens) this model can handle
kv_cache_block_size: u32
Size of a KV cache block; currently vLLM only. Passed to the engine and the KV router.
migration_limit: u32
How many times a request can be migrated to another worker if the HTTP server lost connection to the current worker.
user_data: Option<Value>
User-defined metadata for custom worker behavior
runtime_config: ModelRuntimeConfig
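To make the relationship between context_length and kv_cache_block_size concrete, the sketch below computes how many KV cache blocks a full-length context would occupy. It assumes the block size is counted in tokens, which the field descriptions above do not state explicitly. The function name blocks_for_context is hypothetical and not part of this crate.

```rust
/// Number of KV cache blocks needed to hold `context_length` tokens,
/// assuming `kv_cache_block_size` is a token count (an assumption).
/// Returns `None` when the block size is zero.
fn blocks_for_context(context_length: u32, kv_cache_block_size: u32) -> Option<u32> {
    if kv_cache_block_size == 0 {
        return None;
    }
    // Ceiling division: a partially filled last block still occupies a block.
    Some(context_length.div_ceil(kv_cache_block_size))
}
```

Under these assumptions, a context of 8192 tokens with a block size of 16 needs 512 blocks, and a context of 100 tokens needs 7.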
Implementations
impl ModelDeploymentCard
pub fn builder() -> ModelDeploymentCardBuilder
pub fn with_name_only(name: &str) -> ModelDeploymentCard
Create a ModelDeploymentCard where only the name is filled in.
Single-process setups don’t need an MDC to communicate model details, but it simplifies the code to assume we always have one. This is how you get one in those cases. A quasi-null object: https://en.wikipedia.org/wiki/Null_object_pattern
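The null-object idea described above can be sketched with a stand-in type. The Card struct and the describe function below are hypothetical simplifications for illustration, not the crate's actual definitions; only the method names with_name_only and has_tokenizer are taken from this page.

```rust
/// Stand-in for the real card: only the fields needed for this illustration.
#[derive(Debug, Default)]
struct Card {
    display_name: String,
    tokenizer: Option<String>, // the real field is Option<TokenizerKind>
}

impl Card {
    /// Placeholder card: the name is set, everything else stays at its default.
    fn with_name_only(name: &str) -> Card {
        Card {
            display_name: name.to_string(),
            ..Default::default()
        }
    }

    /// True only for a full card that carries a tokenizer.
    fn has_tokenizer(&self) -> bool {
        self.tokenizer.is_some()
    }
}

/// Callers take a `&Card` unconditionally instead of an `Option<&Card>`.
fn describe(card: &Card) -> String {
    if card.has_tokenizer() {
        format!("{} (full card)", card.display_name)
    } else {
        format!("{} (placeholder)", card.display_name)
    }
}
```

The point of the design is that code paths never branch on whether a card exists at all; they only ask the card what it contains.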
pub fn expiry_check_period() -> Duration
How often we should check whether a model deployment card has expired because its workers are gone
pub fn load_from_json_file<P: AsRef<Path>>(file: P) -> Result<Self>
Load a model deployment card from a JSON file
pub fn load_from_json_str(contents: &str) -> Result<Self, Error>
Load a model deployment card from a JSON string
pub fn save_to_json_file(&self, file: &str) -> Result<(), Error>
Save the model deployment card to a JSON file
pub fn slug(&self) -> &Slug
pub fn to_json(&self) -> Result<String, Error>
Serialize the model deployment card to a JSON string
pub fn mdcsum(&self) -> String
pub fn is_expired(&self) -> bool
Was this card last published a long time ago, suggesting the worker is gone?
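An expiry check of this kind can be sketched as a comparison between the last publish time and a maximum age. The code below is a minimal illustration under stated assumptions: the MAX_AGE threshold is hypothetical (the real value is not given on this page), std::time::SystemTime stands in for DateTime<Utc>, and treating a never-published card as not expired is a guess rather than documented behavior.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical staleness threshold; the real value is not stated on this page.
const MAX_AGE: Duration = Duration::from_secs(300);

/// `last_published` mirrors the field of the same name, with `SystemTime`
/// standing in for `DateTime<Utc>`. `now` is passed in to keep this testable.
fn is_expired(last_published: Option<SystemTime>, now: SystemTime) -> bool {
    match last_published {
        // Never published: treated as not expired in this sketch (an assumption).
        None => false,
        // `duration_since` fails if `t` is later than `now`; treat that as fresh.
        Some(t) => now
            .duration_since(t)
            .map(|age| age > MAX_AGE)
            .unwrap_or(false),
    }
}
```

Passing the current time as a parameter, rather than reading the clock inside the function, is a choice made here so the logic can be exercised deterministically.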
pub fn has_tokenizer(&self) -> bool
Is this a full model card with tokenizer?
There are cases where we have a placeholder card (see with_name_only).
pub fn tokenizer_hf(&self) -> Result<HfTokenizer>
pub fn is_gguf(&self) -> bool
pub async fn move_to_nats(&mut self, nats_client: Client) -> Result<()>
Move the files this MDC uses into the NATS object store. Updates the URIs to point to NATS.
pub async fn delete_from_nats(&mut self, nats_client: Client) -> Result<()>
Delete this card from the key-value store and its URLs from the object store
pub fn set_name(&mut self, name: &str)
Allow user to override the name we register this model under.
Corresponds to vllm’s --served-model-name.
pub fn load_from_disk(config_path: impl AsRef<Path>, custom_template_path: Option<&Path>) -> Result<ModelDeploymentCard>
Build an in-memory ModelDeploymentCard from either:
- a folder containing config.json, tokenizer.json and token_config.json
- a GGUF file
With an optional custom template.
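The two accepted inputs suggest a dispatch on what config_path points at. The sketch below shows one way such a decision could look. The ModelSource enum and classify function are hypothetical names introduced for illustration, and detecting GGUF by file extension is an assumption; the crate may inspect the file differently.

```rust
use std::path::Path;

/// Hypothetical classification of what `config_path` points at.
#[derive(Debug, PartialEq)]
enum ModelSource {
    /// A directory expected to hold config.json and the tokenizer files.
    Folder,
    /// A single file whose extension is `gguf` (case-insensitive).
    Gguf,
    /// Anything else; a real loader would presumably return an error here.
    Unknown,
}

fn classify(config_path: &Path) -> ModelSource {
    if config_path.is_dir() {
        ModelSource::Folder
    } else if config_path
        .extension()
        .is_some_and(|ext| ext.eq_ignore_ascii_case("gguf"))
    {
        ModelSource::Gguf
    } else {
        ModelSource::Unknown
    }
}
```

This sketch only looks at the path itself. It does not verify that the folder actually contains the listed JSON files or that the GGUF file exists and is readable.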
pub async fn load_from_store(model_slug: &Slug, drt: &DistributedRuntime) -> Result<Option<Self>>
Load a ModelDeploymentCard from the storage that the DistributedRuntime is configured to use. The card should be fully local and ready to use when the call returns.
Trait Implementations
impl Clone for ModelDeploymentCard
fn clone(&self) -> ModelDeploymentCard
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for ModelDeploymentCard
impl Default for ModelDeploymentCard
fn default() -> ModelDeploymentCard
impl<'de> Deserialize<'de> for ModelDeploymentCard
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,
impl Display for ModelDeploymentCard
impl Serialize for ModelDeploymentCard
Auto Trait Implementations
impl Freeze for ModelDeploymentCard
impl RefUnwindSafe for ModelDeploymentCard
impl Send for ModelDeploymentCard
impl Sync for ModelDeploymentCard
impl Unpin for ModelDeploymentCard
impl UnwindSafe for ModelDeploymentCard
Blanket Implementations
impl<T> BorrowMut<T> for T
where T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> IntoRequest<T> for T
fn into_request(self) -> Request<T>
Wrap the input message T in a tonic::Request
impl<T, U> OverflowingInto<U> for T
where U: OverflowingFrom<T>,
fn overflowing_into(self) -> (U, bool)
impl<T> Paint for T
where T: ?Sized,
fn fg(&self, value: Color) -> Painted<&T>
Returns a styled value derived from self with the foreground set to value.
This method should be used rarely. Instead, prefer to use color-specific builder methods like red() and green(), which have the same functionality but are pithier.
Example
Set foreground color to white using fg():
use yansi::{Paint, Color};
painted.fg(Color::White);
Set foreground color to white using white().
use yansi::Paint;
painted.white();
fn bright_black(&self) -> Painted<&T>
fn bright_red(&self) -> Painted<&T>
fn bright_green(&self) -> Painted<&T>
fn bright_yellow(&self) -> Painted<&T>
fn bright_blue(&self) -> Painted<&T>
fn bright_magenta(&self) -> Painted<&T>
fn bright_cyan(&self) -> Painted<&T>
fn bright_white(&self) -> Painted<&T>
fn bg(&self, value: Color) -> Painted<&T>
Returns a styled value derived from self with the background set to value.
This method should be used rarely. Instead, prefer to use color-specific builder methods like on_red() and on_green(), which have the same functionality but are pithier.
Example
Set background color to red using bg():
use yansi::{Paint, Color};
painted.bg(Color::Red);
Set background color to red using on_red().
use yansi::Paint;
painted.on_red();
fn on_primary(&self) -> Painted<&T>
fn on_magenta(&self) -> Painted<&T>
fn on_bright_black(&self) -> Painted<&T>
fn on_bright_red(&self) -> Painted<&T>
fn on_bright_green(&self) -> Painted<&T>
fn on_bright_yellow(&self) -> Painted<&T>
fn on_bright_blue(&self) -> Painted<&T>
fn on_bright_magenta(&self) -> Painted<&T>
fn on_bright_cyan(&self) -> Painted<&T>
fn on_bright_white(&self) -> Painted<&T>
fn attr(&self, value: Attribute) -> Painted<&T>
Enables the styling Attribute value.
This method should be used rarely. Instead, prefer to use attribute-specific builder methods like bold() and underline(), which have the same functionality but are pithier.
Example
Make text bold using attr():
use yansi::{Paint, Attribute};
painted.attr(Attribute::Bold);
Make text bold using bold().
use yansi::Paint;
painted.bold();
fn rapid_blink(&self) -> Painted<&T>
fn quirk(&self, value: Quirk) -> Painted<&T>
Enables the yansi Quirk value.
This method should be used rarely. Instead, prefer to use quirk-specific builder methods like mask() and wrap(), which have the same functionality but are pithier.
Example
Enable wrapping using .quirk():
use yansi::{Paint, Quirk};
painted.quirk(Quirk::Wrap);
Enable wrapping using wrap().
use yansi::Paint;
painted.wrap();
fn clear(&self) -> Painted<&T>
Deprecated since 1.0.1: renamed to resetting() due to conflicts with Vec::clear(). The clear() method will be removed in a future release.
fn whenever(&self, value: Condition) -> Painted<&T>
Conditionally enable styling based on whether the Condition value applies. Replaces any previous condition.
See the crate level docs for more details.
Example
Enable styling painted only when both stdout and stderr are TTYs:
use yansi::{Paint, Condition};
painted.red().on_yellow().whenever(Condition::STDOUTERR_ARE_TTY);
impl<T> Pointable for T
impl<T> PolicyExt for T
where T: ?Sized,
impl<T, U> RoundingInto<U> for T
where U: RoundingFrom<T>,
fn rounding_into(self, rm: RoundingMode) -> (U, Ordering)
impl<T, U> SaturatingInto<U> for T
where U: SaturatingFrom<T>,
fn saturating_into(self) -> U
impl<T> Serialize for T
fn erased_serialize(&self, serializer: &mut dyn Serializer) -> Result<(), Error>
fn do_erased_serialize(&self, serializer: &mut dyn Serializer) -> Result<(), ErrorImpl>
impl<T> ToCompactString for T
where T: Display,
fn try_to_compact_string(&self) -> Result<CompactString, ToCompactStringError>
Fallible version of ToCompactString::to_compact_string()
fn to_compact_string(&self) -> CompactString
Converts the given value to a CompactString.
impl<T> ToDebugString for T
where T: Debug,
fn to_debug_string(&self) -> String
impl<T> ToStringFallible for T
where T: Display,
fn try_to_string(&self) -> Result<String, TryReserveError>
ToString::to_string, but without panic on OOM.