rlx-models-core 0.2.1

Shared config, weight loading, and compile helpers for RLX model crates
# rlx-models-core

Shared config, weight loading, compile profiles, and packed GGUF prefill helpers for RLX model crates (published on crates.io as **`rlx-models-core`**; import as `rlx_core`).

**Version 0.2.1** adds packed GGUF support:

| API | Role |
|-----|------|
| [`packed_gguf_compile_guard`](src/flow_bridge.rs) | Applies `RLX_DISABLE_MPSGRAPH` on Metal and `RLX_MLX_MODE=eager` on MLX during compile |
| [`compile_options_for_packed_gguf_prefill_with_profile`](src/flow_bridge.rs) | Turns fusion off on wgpu/CUDA/ROCm to work around `FusedResidualRmsNorm` gaps |
| [`packed_gguf_execution_device`](src/flow_bridge.rs) | Routes MLX/wgpu/CUDA packed prefill to CPU when needed |

Used by `rlx-llama32`, `rlx-qwen3`, `rlx-gemma`, and `rlx-minicpm5`.
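
The compile guard in the table above applies environment settings only during compile. The sketch below illustrates that general pattern (a scoped environment override that is undone on drop) using only the standard library. It is not the crate's implementation: `EnvOverride` and `with_override` are hypothetical names, and whether `packed_gguf_compile_guard` restores previous values in this way is an assumption.

```rust
use std::env;
use std::ffi::OsString;

/// Hypothetical scoped override (not part of `rlx-models-core`): sets an
/// environment variable and restores the previous state when dropped.
struct EnvOverride {
    key: &'static str,
    previous: Option<OsString>,
}

impl EnvOverride {
    fn set(key: &'static str, value: &str) -> Self {
        // Remember the old value so it can be put back later.
        let previous = env::var_os(key);
        // `set_var` is an unsafe fn in the 2024 edition because mutating the
        // environment is not thread-safe; call this before spawning threads.
        unsafe { env::set_var(key, value) };
        EnvOverride { key, previous }
    }
}

impl Drop for EnvOverride {
    fn drop(&mut self) {
        match self.previous.take() {
            // The variable existed before: restore its old value.
            Some(old) => unsafe { env::set_var(self.key, old) },
            // The variable was unset before: remove it again.
            None => unsafe { env::remove_var(self.key) },
        }
    }
}

/// Runs `f` with `key=value` in the environment, then undoes the override.
fn with_override<T>(key: &'static str, value: &str, f: impl FnOnce() -> T) -> T {
    let _guard = EnvOverride::set(key, value);
    f()
}
```

Under these assumptions, a call such as `with_override("RLX_MLX_MODE", "eager", || compile())` (where `compile` stands in for whatever compile step the caller runs) would keep the override from leaking into later, non-packed compiles in the same process.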

## See also

- [README.md](../../README.md)
- [AGENTS.md](../../AGENTS.md)