inference-lab 0.6.1
License: MIT
Owner: fergusfinn
Module inference_lab::compute::arithmetic
Functions
flops_for_tokens
Calculate FLOPS for a given number of tokens. Formula: FLOPS = 2 * num_tokens * active_parameters + attention_flops. For MoE models, uses active_parameters (not total), since only some experts are activated. Includes both matmul and attention FLOPs.
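The formula above can be sketched as a small standalone function. This is only an illustration under stated assumptions: the name `flops_for_tokens_sketch`, its parameter list, and the choice to pass `attention_flops` in as a precomputed value are hypothetical and are not taken from the crate's actual signature, which this page does not show.

```rust
/// Hypothetical sketch of the documented formula:
/// FLOPS = 2 * num_tokens * active_parameters + attention_flops.
///
/// Assumption: `attention_flops` is computed elsewhere and passed in,
/// because the page does not say how the crate derives it.
fn flops_for_tokens_sketch(num_tokens: u64, active_parameters: u64, attention_flops: f64) -> f64 {
    // Each parameter contributes one multiply and one add per token,
    // hence the factor of 2 for the matmul term.
    let matmul_flops = 2.0 * num_tokens as f64 * active_parameters as f64;
    matmul_flops + attention_flops
}

/// Illustrative MoE case: only the active parameters are counted,
/// not the total parameter count. The numbers are made up.
fn moe_example_flops() -> f64 {
    let active_parameters: u64 = 13_000_000_000; // active per token (assumed)
    let _total_parameters: u64 = 47_000_000_000; // deliberately unused
    flops_for_tokens_sketch(1_000, active_parameters, 0.0)
}
```

Using `f64` for the result avoids integer overflow on large models and long batches, at the cost of exactness for very large values.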
kv_cache_bytes
Calculate memory transfer bytes for the KV cache for a given sequence length. Formula: kv_bytes = kv_cache_bytes_per_token * seq_len.
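As a rough sketch of this formula, the function below multiplies a per-token KV footprint by the sequence length. The names `kv_cache_bytes_sketch` and `kv_bytes_per_token_sketch` are hypothetical. The per-token helper uses a common convention (keys plus values, per layer, per KV head) that this page does not confirm for the crate, so treat it as an assumption.

```rust
/// Hypothetical helper: a common way to estimate KV cache bytes per token.
/// Assumption: one key and one value vector are stored per layer and per
/// KV head (the leading factor of 2). The crate may derive this differently.
fn kv_bytes_per_token_sketch(
    num_layers: u64,
    num_kv_heads: u64,
    head_dim: u64,
    bytes_per_element: u64,
) -> u64 {
    2 * num_layers * num_kv_heads * head_dim * bytes_per_element
}

/// Hypothetical sketch of the documented formula:
/// kv_bytes = kv_cache_bytes_per_token * seq_len.
fn kv_cache_bytes_sketch(kv_cache_bytes_per_token: u64, seq_len: u64) -> u64 {
    kv_cache_bytes_per_token * seq_len
}
```

For example, with 32 layers, 8 KV heads, a head dimension of 128, and 2-byte elements, the helper gives 131,072 bytes per token, so a 4,096-token sequence would transfer 536,870,912 bytes (512 MiB) of KV cache.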
model_weight_bytes
Calculate memory transfer bytes for model weights. Formula: weight_bytes = num_parameters * bytes_per_param.
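A minimal sketch of the weight formula follows. The function name and the integer parameter types are assumptions for illustration; the crate's real signature is not shown on this page.

```rust
/// Hypothetical sketch of the documented formula:
/// weight_bytes = num_parameters * bytes_per_param.
///
/// Assumption: `bytes_per_param` is a whole number of bytes
/// (for example 2 for 16-bit weights, 1 for 8-bit weights).
fn model_weight_bytes_sketch(num_parameters: u64, bytes_per_param: u64) -> u64 {
    num_parameters * bytes_per_param
}
```

Under these assumptions, a 7-billion-parameter model stored in 16-bit precision would need 14,000,000,000 bytes of weight traffic per full pass over the weights.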
total_memory_transfer
Calculate total memory transfer bytes for an iteration. Formula: total_bytes = model_weights + sum(kv_cache for each request).
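The per-iteration total can be sketched by combining the weight term with the per-request KV cache terms. Everything here is illustrative: `total_memory_transfer_sketch` and its parameters are hypothetical, and representing requests as a slice of sequence lengths is an assumption made to keep the example self-contained.

```rust
/// Hypothetical sketch of the documented formula:
/// total_bytes = model_weights + sum(kv_cache for each request).
///
/// Assumptions: every request shares the same per-token KV footprint, and
/// each request is described only by its current sequence length.
fn total_memory_transfer_sketch(
    model_weight_bytes: u64,
    kv_cache_bytes_per_token: u64,
    request_seq_lens: &[u64],
) -> u64 {
    // KV cache bytes read for each request in the batch, summed.
    let kv_total: u64 = request_seq_lens
        .iter()
        .map(|&seq_len| kv_cache_bytes_per_token * seq_len)
        .sum();

    // The weights are counted once per iteration, regardless of batch size.
    model_weight_bytes + kv_total
}
```

With 14,000,000,000 bytes of weights, 131,072 KV bytes per token, and two requests of 1,024 and 2,048 tokens, the sketch returns 14,402,653,184 bytes. Because the weight term is paid once per iteration, larger batches spread that cost over more requests.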