rlx-mlx-sys 0.2.4

Low-level MLX C++ build + C ABI shim for RLX (vendored mlx, NVRTC-free).
docs.rs failed to build rlx-mlx-sys-0.2.4
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.

rlx-mlx-sys

Vendored MLX C++ (git submodule at vendor/mlx) plus the rlx_mlx_shim C ABI built via CMake + cc in build.rs. Consumed by rlx-mlx; not meant for direct use outside RLX unless you accept the shim API stability policy (none yet).

After clone:

git submodule update --init rlx-mlx-sys/vendor/mlx

Linux compile times

Backend First build (approx.) Opt-in
CPU only (default) ~5–10 min always
CUDA (nvcc kernels) ~45–90 min RLX_MLX_CUDA=1 or --features cuda

CPU-only is the default even when CUDA/cuDNN are installed. Auto-building the CUDA backend added an hour+ to every clean cargo build on WSL rigs.

Faster Linux builds

  1. Install system LAPACK headers (avoids bootstrapping OpenBLAS into OUT_DIR):

    sudo apt-get install libblas-dev liblapack-dev liblapacke-dev
    
  2. Use debug cargo profile for iteration — build.rs maps cargo build → CMake Debug (much faster nvcc when CUDA is enabled).

  3. Install ccache (MLX enables it automatically when found):

    sudo apt-get install ccache
    
  4. Pin parallel nvcc jobs if WSL OOMs or thrashes:

    RLX_MLX_JOBS=4 cargo build -p rlx-mlx-sys --features cuda
    
  5. Pin GPU arch when cross-compiling or CI has no GPU:

    RLX_MLX_CUDA_ARCH=89 RLX_MLX_CUDA=1 cargo build -p rlx-mlx --features cuda
    
  6. Runtime device selection (after build): RLX_MLX_DEVICE=cpu|gpu.

Full guide: docs/benchmarks/mlx-linux.md.

macOS / Windows

Requires macOS + Xcode (xcrun metal) for Metal kernels. CPU MLX builds on Linux and Windows; on Linux, lapacke.h must be present (liblapacke-dev) or build.rs bootstraps OpenBLAS into OUT_DIR.

License

GPL-3.0-only.