
Crate boostr


§boostr

ML framework built on numr — attention, quantization, model architectures.

boostr extends numr’s foundational numerical computing with ML-specific operations, quantized tensor support, and model building blocks. It uses numr’s runtime, tensors, and ops directly — no reimplementation, no wrappers.
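The "extension traits, no wrappers" design can be sketched in plain Rust. The names below (`Client`, `matmul`, `MlOpsExt`, `scaled_dot`) are illustrative stand-ins, not boostr's or numr's actual API: an ML op is added to the base client type via a trait implementation that composes the base crate's primitives, so the extended method is callable directly on the original client.

```rust
/// Stand-in for a base runtime client (numr's role in this pattern).
/// All names here are hypothetical, for illustration only.
struct Client;

impl Client {
    /// Stand-in for a primitive op provided by the base crate
    /// (a 1x1 "matmul" keeps the example self-contained).
    fn matmul(&self, a: &[f32], b: &[f32]) -> Vec<f32> {
        vec![a[0] * b[0]]
    }
}

/// Extension trait: a composite ML op built from base primitives.
trait MlOpsExt {
    fn scaled_dot(&self, q: &[f32], k: &[f32], scale: f32) -> Vec<f32>;
}

impl MlOpsExt for Client {
    fn scaled_dot(&self, q: &[f32], k: &[f32], scale: f32) -> Vec<f32> {
        // Composed entirely from the base primitive: no wrapper type,
        // no reimplementation of matmul.
        self.matmul(q, k).into_iter().map(|x| x * scale).collect()
    }
}

fn main() {
    let c = Client;
    // The extension method is available directly on the base client.
    let out = c.scaled_dot(&[2.0], &[3.0], 0.5);
    assert_eq!(out, vec![3.0]);
}
```

Because the trait is implemented for the base client itself, callers keep one object and simply bring the trait into scope, which is how the `AttentionOps`/`RoPEOps` style of API avoids wrapper types.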

§Relationship to numr

┌─────────────────────────────────────────────────────────┐
│                 boostr ◄── YOU ARE HERE                 │
│   (attention, RoPE, MoE, quantization, model loaders)   │
└────────────────────────────┬────────────────────────────┘
                             │
┌────────────────────────────┴────────────────────────────┐
│                          numr                           │
│     (tensors, ops, runtime, autograd, linalg, FFT)      │
└─────────────────────────────────────────────────────────┘

§Design

  • Extension traits: ML ops (AttentionOps, RoPEOps) implemented on numr’s clients
  • QuantTensor: Separate type for block-quantized data (GGUF formats)
  • impl_generic: Composite ops composed from numr primitives, same on all backends
  • Custom kernels: Dequant, quantized matmul, fused attention (SIMD/PTX/WGSL)
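To make the `QuantTensor` bullet concrete, here is a self-contained sketch of block quantization in the style of GGUF's Q8_0 format (32 values per block, one f32 scale per block). This is an assumption-laden illustration of the technique, not boostr's actual `QuantTensor` layout or API.

```rust
// Sketch of Q8_0-style block quantization: each block of 32 f32 values is
// stored as 32 i8 codes plus a single f32 scale. boostr's real QuantTensor
// may differ in layout and naming.

const BLOCK: usize = 32;

struct BlockQ8 {
    scale: f32,
    qs: [i8; BLOCK],
}

fn quantize(block: &[f32; BLOCK]) -> BlockQ8 {
    // Per-block scale: largest absolute value maps to 127.
    let amax = block.iter().fold(0f32, |m, &x| m.max(x.abs()));
    let scale = if amax == 0.0 { 1.0 } else { amax / 127.0 };
    let mut qs = [0i8; BLOCK];
    for (q, &x) in qs.iter_mut().zip(block) {
        *q = (x / scale).round().clamp(-127.0, 127.0) as i8;
    }
    BlockQ8 { scale, qs }
}

fn dequantize(b: &BlockQ8) -> [f32; BLOCK] {
    let mut out = [0f32; BLOCK];
    for (o, &q) in out.iter_mut().zip(&b.qs) {
        *o = q as f32 * b.scale;
    }
    out
}

fn main() {
    let mut xs = [0f32; BLOCK];
    for (i, x) in xs.iter_mut().enumerate() {
        *x = (i as f32 - 16.0) * 0.1; // values in [-1.6, 1.5]
    }
    let q = quantize(&xs);
    let back = dequantize(&q);
    // Round-trip error is bounded by half the quantization step.
    for (a, b) in xs.iter().zip(&back) {
        assert!((a - b).abs() <= q.scale * 0.5 + 1e-6);
    }
}
```

The per-block scale is what the custom dequant and quantized-matmul kernels mentioned above exploit: a matmul can multiply integer codes and fold in one scale per block instead of dequantizing every element up front.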

Re-exports§

pub use nn::Init;
pub use nn::VarBuilder;
pub use nn::VarMap;
pub use nn::Weight;
pub use ops::AttentionOps;
pub use ops::DeviceGrammarDfa;
pub use ops::FlashAttentionOps;
pub use ops::FusedFp8TrainingOps;
pub use ops::FusedOptimizerOps;
pub use ops::FusedQkvOps;
pub use ops::GrammarDfaOps;
pub use ops::KvCacheOps;
pub use ops::MlaOps;
pub use ops::PagedAttentionOps;
pub use ops::RoPEOps;
pub use ops::SamplingOps;
pub use ops::var_flash_attention;
pub use quant::DecomposedQuantLinear;
pub use quant::DecomposedQuantMethod;
pub use quant::DecomposedQuantTensor;
pub use quant::DequantOps;
pub use quant::FusedQuantOps;
pub use quant::QuantFormat;
pub use quant::QuantMatmulOps;
pub use quant::QuantTensor;
pub use model::ExpertWeights;
pub use format::GgufTokenizer;
pub use model::encoder::EmbeddingPipeline;
pub use model::encoder::Encoder;
pub use model::encoder::EncoderClient;
pub use model::encoder::EncoderConfig;
pub use model::encoder::Pooling;

Modules§

autograd
Automatic differentiation (autograd)
data
distributed
error
boostr error types
format
inference
model
nn
ops
optimizer
quant
runtime
Runtime backends for tensor computation
tensor
Tensor types and operations
trainer

Structs§

CpuClient
CPU client for operation dispatch
CpuDevice
CPU device (there’s only one: the host CPU)
CpuRuntime
CPU compute runtime
Tensor
N-dimensional array stored on a compute device

Enums§

DType
Data types supported by numr tensors
NumrError
Errors that can occur in numr operations

Traits§

ActivationOps
Activation operations
BinaryOps
Element-wise binary operations on tensors
ConvOps
Convolution operations
IndexingOps
Indexing operations
NormalizationOps
Normalization operations
Runtime
Core trait for compute backends
RuntimeClient
Trait for runtime clients that handle operation dispatch
ScalarOps
Scalar operations trait for tensor-scalar operations
TensorOps
Core tensor operations trait
TypeConversionOps
Type conversion operations
UnaryOps
Unary operations

Type Aliases§

NumrResult
Result type alias using numr’s Error