§boostr
ML framework built on numr — attention, quantization, model architectures.
boostr extends numr’s foundational numerical computing with ML-specific operations, quantized tensor support, and model building blocks. It uses numr’s runtime, tensors, and ops directly — no reimplementation, no wrappers.
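The "no wrappers" point refers to Rust's extension-trait pattern: new methods are attached to an existing type from outside its defining crate, so boostr's ML ops can be called directly on numr's clients. The sketch below illustrates the pattern only; `SoftmaxExt` and the use of `Vec<f32>` as a stand-in tensor are hypothetical, not boostr's or numr's actual API.

```rust
// Extension-trait pattern: add an ML op to an existing type without
// wrapping it. `Vec<f32>` stands in for a backend tensor type;
// `SoftmaxExt` is an illustrative name, not boostr's API.
trait SoftmaxExt {
    fn softmax(&self) -> Vec<f32>;
}

impl SoftmaxExt for Vec<f32> {
    fn softmax(&self) -> Vec<f32> {
        // Subtract the max for numerical stability, then normalize.
        let max = self.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let exps: Vec<f32> = self.iter().map(|x| (x - max).exp()).collect();
        let sum: f32 = exps.iter().sum();
        exps.iter().map(|e| e / sum).collect()
    }
}

fn main() {
    let logits = vec![1.0_f32, 2.0, 3.0];
    // The extension method is called directly on the existing type.
    let probs = logits.softmax();
    assert!((probs.iter().sum::<f32>() - 1.0).abs() < 1e-6);
}
```

Bringing the trait into scope (`use boostr::AttentionOps;`-style) is all a caller needs; the underlying type is untouched.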
§Relationship to numr
┌─────────────────────────────────────────────────────────┐
│ boostr                                 ◄── YOU ARE HERE │
│ (attention, RoPE, MoE, quantization, model loaders)     │
└──────────────────────────┬──────────────────────────────┘
                           │
┌──────────────────────────┴──────────────────────────────┐
│ numr                                                    │
│ (tensors, ops, runtime, autograd, linalg, FFT)          │
└─────────────────────────────────────────────────────────┘
§Design
- Extension traits: ML ops (AttentionOps, RoPEOps) implemented on numr’s clients
- QuantTensor: Separate type for block-quantized data (GGUF formats)
- impl_generic: Composite ops composed from numr primitives, same on all backends
- Custom kernels: Dequant, quantized matmul, fused attention (SIMD/PTX/WGSL)
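To make the QuantTensor bullet concrete, here is a hedged sketch of block quantization in the spirit of GGUF's Q8_0 layout: values are split into 32-element blocks, each stored as int8 codes plus one per-block scale. All names (`BlockQ8`, `quantize`, `dequantize`) are illustrative, not the actual QuantTensor API, and real GGUF blocks store the scale as f16.

```rust
// Q8_0-style block quantization: 32 values per block, one scale,
// int8 codes. Illustrative only; not boostr's QuantTensor layout.
const BLOCK: usize = 32;

struct BlockQ8 {
    scale: f32,        // per-block scale: max(|x|) / 127
    codes: [i8; BLOCK],
}

fn quantize(block: &[f32; BLOCK]) -> BlockQ8 {
    let amax = block.iter().fold(0.0_f32, |m, &x| m.max(x.abs()));
    let scale = if amax == 0.0 { 0.0 } else { amax / 127.0 };
    let inv = if scale == 0.0 { 0.0 } else { 1.0 / scale };
    let mut codes = [0i8; BLOCK];
    for (c, &x) in codes.iter_mut().zip(block) {
        *c = (x * inv).round().clamp(-127.0, 127.0) as i8;
    }
    BlockQ8 { scale, codes }
}

fn dequantize(b: &BlockQ8) -> [f32; BLOCK] {
    let mut out = [0.0_f32; BLOCK];
    for (o, &c) in out.iter_mut().zip(&b.codes) {
        *o = c as f32 * b.scale;
    }
    out
}

fn main() {
    let mut x = [0.0_f32; BLOCK];
    for (i, v) in x.iter_mut().enumerate() {
        *v = (i as f32 - 16.0) / 4.0;
    }
    let y = dequantize(&quantize(&x));
    // Round-trip error is bounded by half a quantization step.
    let scale = quantize(&x).scale;
    assert!(x.iter().zip(&y).all(|(a, b)| (a - b).abs() <= scale * 0.5 + 1e-6));
}
```

A dequant kernel (SIMD/PTX/WGSL, per the bullet above) is just `dequantize` vectorized over many blocks; quantized matmul fuses the `code * scale` expansion into the dot product instead of materializing f32.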
Re-exports§
pub use nn::Init;
pub use nn::VarBuilder;
pub use nn::VarMap;
pub use nn::Weight;
pub use ops::AttentionOps;
pub use ops::DeviceGrammarDfa;
pub use ops::FlashAttentionOps;
pub use ops::FusedFp8TrainingOps;
pub use ops::FusedOptimizerOps;
pub use ops::FusedQkvOps;
pub use ops::GrammarDfaOps;
pub use ops::KvCacheOps;
pub use ops::MlaOps;
pub use ops::PagedAttentionOps;
pub use ops::RoPEOps;
pub use ops::SamplingOps;
pub use ops::var_flash_attention;
pub use quant::DecomposedQuantLinear;
pub use quant::DecomposedQuantMethod;
pub use quant::DecomposedQuantTensor;
pub use quant::DequantOps;
pub use quant::FusedQuantOps;
pub use quant::QuantFormat;
pub use quant::QuantMatmulOps;
pub use quant::QuantTensor;
pub use model::ExpertWeights;
pub use format::GgufTokenizer;
pub use model::encoder::EmbeddingPipeline;
pub use model::encoder::Encoder;
pub use model::encoder::EncoderClient;
pub use model::encoder::EncoderConfig;
pub use model::encoder::Pooling;
Modules§
- autograd - Automatic differentiation (autograd)
- data
- distributed
- error - boostr error types
- format
- inference
- model
- nn
- ops
- optimizer
- quant
- runtime - Runtime backends for tensor computation
- tensor - Tensor types and operations
- trainer
Structs§
- CpuClient - CPU client for operation dispatch
- CpuDevice - CPU device (there’s only one: the host CPU)
- CpuRuntime - CPU compute runtime
- Tensor - N-dimensional array stored on a compute device
Traits§
- ActivationOps - Activation operations
- BinaryOps - Element-wise binary operations on tensors.
- ConvOps - Convolution operations.
- IndexingOps - Indexing operations
- NormalizationOps - Normalization operations
- Runtime - Core trait for compute backends
- RuntimeClient - Trait for runtime clients that handle operation dispatch
- ScalarOps - Scalar operations trait for tensor-scalar operations
- TensorOps - Core tensor operations trait
- TypeConversionOps - Type conversion operations
- UnaryOps - Unary operations
Type Aliases§
- NumrResult - Result type alias using numr’s Error