Borsalino
Thin GPU compute abstraction for the Industrial Algebra ecosystem.
One trait, two backends, zero ceremony.
Write WGSL compute kernels. Dispatch them synchronously on Metal or Vulkan. Read results back. No bind groups, no pipeline layouts, no descriptor sets, no async runtime.
Quick Start
use GpuBackend;
// WGSL compute kernel
let wgsl = r#"
@group(0) @binding(0) var<storage, read> input: array<f32>;
@group(0) @binding(1) var<storage, read_write> output: array<f32>;
@compute @workgroup_size(256)
fn add_one(@builtin(global_invocation_id) gid: vec3<u32>) {
let i = gid.x;
output[i] = input[i] + 1.0;
}
"#;
let gpu = init?;
let pipeline = gpu.compile?;
let input = gpu.create_buffer?;
let output = gpu.?;
gpu.dispatch?;
let result: = gpu.read_buffer?;
assert_eq!;
Run with:
Backends
| Backend | Platform | Feature | Status |
|---|---|---|---|
| Metal | macOS (Apple Silicon) | metal |
✅ Active — raw objc_msgSend FFI |
| Vulkan | Linux, Windows | vulkan |
✅ Active — raw ash FFI |
| Stub | Any | (none) | Returns NoBackend — safe fallback |
Features
| Feature | What it enables |
|---|---|
metal |
Metal backend (macOS only) |
vulkan |
Vulkan backend via ash (Linux / Windows) |
verify |
karpal-verify 0.5 GPU obligation bundles (SMT, Lean, Kani export) |
Architecture
GpuBackend trait (7 methods)
│
├── MetalBackend (metal.rs)
│ ├── naga WGSL → MSL translation
│ └── objc_msgSend FFI (19 selectors, 0 Metal crate deps)
│
└── VulkanBackend (vulkan.rs)
├── naga WGSL → SPIR-V translation
└── ash FFI (Vulkan 1.3)
Opaque handle types (ComputePipeline, GpuBuffer) carry raw pointers and
backend-specific drop functions — no coupling between lib.rs and backend
modules.
Shader Language
Kernels are authored in WGSL (WebGPU Shading Language). Borsalino translates to each backend's native format via naga:
- Metal: WGSL → MSL → Metal compiler
- Vulkan: WGSL → SPIR-V → vkCreateComputePipelines
Buffer bindings use @group(0) @binding(N) in WGSL, mapped to dispatch
buffer position: buffers[0] → @binding(0), buffers[1] → @binding(1).
Memory Strategy
Borsalino auto-detects your hardware and picks the optimal memory layout:
| GPU type | Detection | Memory | Behaviour |
|---|---|---|---|
| Apple Silicon M-series | Unified | Host-visible, coherent | Zero-copy between CPU and GPU |
| AMD integrated (APU) | Unified | Host-visible, coherent | Zero-copy |
| NVIDIA Grace Blackwell (GB10) | Unified | Host-visible, coherent | Zero-copy |
| NVIDIA RTX / AMD RDNA / Intel Arc | Discrete (auto) | Device-local VRAM + staging | Automatic PCIe transfers |
For explicit control:
use MemoryStrategy;
let gpu = init?; // auto-detect
let gpu = init_device_local?; // force VRAM
let gpu = init_with_strategy?; // force unified
Batched Dispatch
Chain multiple dispatches into a single command buffer with
dispatch_many:
use ;
gpu.dispatch_many?;
Batching amortises command-buffer allocation overhead. On RTX 5080, 256 dispatches per buffer drops per-dispatch latency from 37 us to 0.5 us (75x faster). On GB10: 46 us to 1.0 us (46x faster). Peak throughput: 577 GFLOPS (RTX 5080, 1M elements batched).
See BENCHMARKS.md for full cross-platform performance data.
Verification
GPU safety properties are encoded as karpal-verify 0.5 obligation bundles
(feature verify, fetches from crates.io):
use ;
let bundle = add_one_obligations;
assert!;
Bundles export to SMT-LIB2, Lean 4, and Kani verification backends.
Examples
| Example | Description | Run with |
|---|---|---|
hello_compute |
add_one kernel on 4 elements | cargo run --example hello_compute --features vulkan |
saxpy |
SAXPY (a·x + y) on 1024 elements | cargo run --example saxpy --features vulkan |
bench |
Cross-platform GPU benchmarks | cargo run --example bench --features vulkan --release |
dispatch_profile |
Per-component dispatch cost profiling | cargo run --example dispatch_profile --features vulkan --release |
Testing
# Vulkan backend (Linux / Windows)
# Verification obligations
# Both
# Metal backend (macOS only)
License
AGPL-3.0-only OR Commercial. Copyright (C) 2026 Industrial Algebra.
Dual-licensed: use under AGPL v3 for open-source projects, or obtain a commercial license for proprietary use. See LICENSE and LICENSE-COMMERCIAL.