inference_lab::config::hardware

Struct HardwareConfig

pub struct HardwareConfig {
    pub name: String,
    pub compute_flops: f64,
    pub memory_bandwidth: f64,
    pub memory_capacity: u64,
    pub kv_cache_capacity: u64,
    pub gpu_memory_utilization: f64,
    pub bytes_per_param: u32,
    pub compute_bound_threshold: u32,
}

Fields§

§name: String

Accelerator name (e.g., “H100”, “A100”)

§compute_flops: f64

Compute throughput in FLOP/s for a specific precision (e.g., bf16)

§memory_bandwidth: f64

Memory bandwidth in bytes/sec

§memory_capacity: u64

Total memory capacity in bytes

§kv_cache_capacity: u64

KV cache capacity in bytes (a subset of memory_capacity). If not specified, it is calculated from gpu_memory_utilization.

§gpu_memory_utilization: f64

Fraction of GPU memory to use (vLLM default: 0.9). Used to calculate kv_cache_capacity if that field is not explicitly set.

§bytes_per_param: u32

Number of bytes per parameter (1 for fp8, 2 for bf16)

§compute_bound_threshold: u32

Compute-bound threshold, derived from the FLOPS-to-bandwidth ratio. Calculated as: bytes_per_param * compute_flops / memory_bandwidth
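As a rough illustration of the formula above, the following sketch evaluates it for round, made-up numbers. The free function `threshold_from_ratio` is a hypothetical helper, not part of this crate, and the truncating cast to u32 is an assumption about how the f64 result is stored in the u32 field.

```rust
/// Hypothetical helper mirroring the documented formula:
/// bytes_per_param * compute_flops / memory_bandwidth.
/// Assumption: the f64 result is truncated when stored as u32.
fn threshold_from_ratio(bytes_per_param: u32, compute_flops: f64, memory_bandwidth: f64) -> u32 {
    (bytes_per_param as f64 * compute_flops / memory_bandwidth) as u32
}
```

With illustrative values of 1.0e15 FLOP/s and 2.0e12 bytes/sec, `threshold_from_ratio(2, 1.0e15, 2.0e12)` gives 1000 for bf16, and `threshold_from_ratio(1, 1.0e15, 2.0e12)` gives 500 for fp8. These inputs are not the specifications of any real accelerator.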

Implementations§

impl HardwareConfig

pub fn compute_threshold(&mut self)

Calculate and set the compute-bound threshold

pub fn compute_kv_cache_capacity(&mut self, model_size_bytes: u64)

Calculate KV cache capacity if not explicitly set. Formula: (memory_capacity * gpu_memory_utilization) - model_size_bytes. This matches vLLM’s behavior: requested_memory - non_kv_cache_memory.
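The formula above can be sketched as a standalone function. `kv_capacity_sketch` is a hypothetical name, not part of this crate. Clamping to zero with `saturating_sub` when the model is larger than the requested memory is an assumption; the documentation does not say how the actual method handles that case.

```rust
/// Hypothetical sketch of the documented formula:
/// (memory_capacity * gpu_memory_utilization) - model_size_bytes.
fn kv_capacity_sketch(
    memory_capacity: u64,
    gpu_memory_utilization: f64,
    model_size_bytes: u64,
) -> u64 {
    // Memory requested from the device (vLLM's "requested_memory").
    let requested = (memory_capacity as f64 * gpu_memory_utilization) as u64;
    // Assumption: clamp to zero instead of underflowing when the model does not fit.
    requested.saturating_sub(model_size_bytes)
}
```

For example, with 80,000,000,000 bytes of memory, a utilization of 0.9, and a 16,000,000,000-byte model, the sketch returns 56,000,000,000 bytes for the KV cache.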

pub fn with_threshold(self) -> Self

Initialize with threshold pre-computed
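The difference between `compute_threshold(&mut self)` and `with_threshold(self) -> Self` is that the latter consumes the value and returns it, which allows chaining at construction time. The sketch below shows that pattern on a stand-in type. `HardwareSketch` is hypothetical and carries only the fields the threshold needs; the method bodies are assumptions based on the documented formula, not the crate's source.

```rust
/// Hypothetical stand-in for HardwareConfig with only the relevant fields.
#[derive(Clone, Debug)]
struct HardwareSketch {
    compute_flops: f64,
    memory_bandwidth: f64,
    bytes_per_param: u32,
    compute_bound_threshold: u32,
}

impl HardwareSketch {
    /// In-place variant: recomputes the threshold on an existing value.
    fn compute_threshold(&mut self) {
        let ratio = self.bytes_per_param as f64 * self.compute_flops / self.memory_bandwidth;
        self.compute_bound_threshold = ratio as u32; // assumption: truncating cast
    }

    /// Consuming variant: returns the value with the threshold filled in.
    fn with_threshold(mut self) -> Self {
        self.compute_threshold();
        self
    }
}
```

A value built as `HardwareSketch { compute_flops: 1.0e15, memory_bandwidth: 2.0e12, bytes_per_param: 2, compute_bound_threshold: 0 }.with_threshold()` then has its threshold populated without a separate mutable binding.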

Trait Implementations§

impl Clone for HardwareConfig

fn clone(&self) -> HardwareConfig

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for HardwareConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'de> Deserialize<'de> for HardwareConfig

fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

Auto Trait Implementations§

impl Freeze for HardwareConfig

impl RefUnwindSafe for HardwareConfig

impl Send for HardwareConfig

impl Sync for HardwareConfig

impl Unpin for HardwareConfig

impl UnwindSafe for HardwareConfig

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,