WeightLoader trait — unified interface for loading tensor/linear weights
into a specific backend.
Implementations (landing in Phase B):
- SafeTensorsLoader: reads .safetensors files, returns DenseLinear unless quantize_config.json indicates GPTQ/AWQ, in which case it returns GptqLinear/AwqLinear.
- GgufLoader: reads .gguf files, returns GgufLinear.
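The selection rule described for SafeTensorsLoader can be sketched as a small dispatch function. This is only an illustration: LinearKind and select_kind are hypothetical names, and the sketch assumes the quantization method has already been extracted from quantize_config.json as an optional string (how that file is parsed is not specified here).

```rust
/// Hypothetical tag for the kind of linear layer a loader would build.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum LinearKind {
    Dense,
    Gptq,
    Awq,
}

/// Pick a layer kind from an optional quantization method string.
/// Assumption: `None` means no quantize_config.json was found, so the
/// weights are treated as dense. Unrecognised methods yield `None`
/// rather than silently falling back to dense.
pub fn select_kind(quant_method: Option<&str>) -> Option<LinearKind> {
    match quant_method.map(|m| m.to_ascii_lowercase()).as_deref() {
        None => Some(LinearKind::Dense),
        Some("gptq") => Some(LinearKind::Gptq),
        Some("awq") => Some(LinearKind::Awq),
        Some(_) => None,
    }
}
```

Returning an Option for unknown methods is a design choice of this sketch; a real loader might prefer a dedicated error type.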
The trait is generic over B: Backend so the loader can materialise
tensors directly into backend-native buffers (zero-copy on Apple Silicon
shared memory, dtoh/htod for CUDA, etc.).
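A minimal sketch of what a loader trait generic over B: Backend might look like. The trait shapes, method names (upload, load_tensor), and the CpuBackend and InMemoryLoader types are assumptions made for illustration and are not this crate's actual definitions.

```rust
use std::collections::HashMap;

/// Hypothetical backend abstraction: it decides which buffer type
/// tensors are materialised into.
pub trait Backend {
    type Buffer;
    /// Copy (or map) host data into a backend-native buffer.
    fn upload(&self, host: &[f32]) -> Self::Buffer;
}

/// Hypothetical CPU backend whose native buffer is a plain Vec<f32>.
pub struct CpuBackend;

impl Backend for CpuBackend {
    type Buffer = Vec<f32>;
    fn upload(&self, host: &[f32]) -> Vec<f32> {
        host.to_vec()
    }
}

/// Sketch of a loader trait generic over the backend, so the result is
/// already in the backend's own buffer type.
pub trait WeightLoader<B: Backend> {
    fn load_tensor(&self, backend: &B, name: &str) -> Option<B::Buffer>;
}

/// Toy loader backed by an in-memory map, standing in for a file reader.
pub struct InMemoryLoader {
    tensors: HashMap<String, Vec<f32>>,
}

impl InMemoryLoader {
    pub fn new() -> Self {
        InMemoryLoader { tensors: HashMap::new() }
    }

    pub fn insert(&mut self, name: &str, data: Vec<f32>) {
        self.tensors.insert(name.to_string(), data);
    }
}

impl<B: Backend> WeightLoader<B> for InMemoryLoader {
    fn load_tensor(&self, backend: &B, name: &str) -> Option<B::Buffer> {
        // Delegate materialisation to the backend.
        self.tensors.get(name).map(|t| backend.upload(t))
    }
}
```

Because the buffer type is an associated type of the backend, a shared-memory backend could return a view over mapped memory while a CUDA backend could return a device allocation, without changing the loader's signature.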
Structs
- PrefixedLoader: Adapter that prepends a fixed prefix to every tensor name before delegating to an underlying loader.
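The adapter pattern described for PrefixedLoader can be sketched as follows. TensorSource, MapSource, and PrefixedLoaderSketch are hypothetical names, the trait is simplified (no backend parameter), and the sketch assumes the prefix is concatenated verbatim with no separator added; the real struct may differ on all of these points.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical stand-in for a loader trait.
pub trait TensorSource {
    fn get(&self, name: &str) -> Option<Vec<f32>>;
}

/// Toy source backed by a map from full tensor name to data.
pub struct MapSource(pub HashMap<String, Vec<f32>>);

impl TensorSource for MapSource {
    fn get(&self, name: &str) -> Option<Vec<f32>> {
        self.0.get(name).cloned()
    }
}

/// Sketch of a prefixing adapter: it wraps any source and rewrites
/// the requested name before delegating.
pub struct PrefixedLoaderSketch<L> {
    prefix: String,
    inner: L,
}

impl<L> PrefixedLoaderSketch<L> {
    pub fn new(prefix: impl Into<String>, inner: L) -> Self {
        PrefixedLoaderSketch { prefix: prefix.into(), inner }
    }
}

impl<L: TensorSource> TensorSource for PrefixedLoaderSketch<L> {
    fn get(&self, name: &str) -> Option<Vec<f32>> {
        // Assumption: plain concatenation; the caller supplies any
        // trailing separator as part of the prefix.
        let full = format!("{}{}", self.prefix, name);
        self.inner.get(&full)
    }
}
```

Since the adapter implements the same trait it wraps, adapters can be nested, so a per-layer loader can be built from a per-model loader by stacking prefixes.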