Copyright 2026 0xClandestine, Ekryski, TheTom, Ambisphaeric
SPDX-License-Identifier: Apache-2.0
MetalTile facade crate.
metaltile re-exports the DSL macros, compile-time placeholder types, IR/codegen crates,
and runtime entry points used to define and launch #[kernel] functions.
§Quick start
use metaltile::prelude::*;

// Pack f32 values into native-endian bytes for the byte-buffer launch API.
fn encode_f32s(values: &[f32]) -> Vec<u8> {
    let mut bytes = Vec::with_capacity(values.len() * core::mem::size_of::<f32>());
    for value in values {
        bytes.extend_from_slice(&value.to_ne_bytes());
    }
    bytes
}

// Unpack native-endian bytes back into f32 values.
fn decode_f32s(bytes: &[u8]) -> Vec<f32> {
    bytes
        .chunks_exact(core::mem::size_of::<f32>())
        .map(|chunk| f32::from_ne_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
        .collect()
}

#[kernel]
fn vector_add(a: Tensor<f32>, b: Tensor<f32>, c: Tensor<f32>) {
    let idx = program_id::<0>();
    store(c[idx], load(a[idx]) + load(b[idx]));
}

let ctx = Context::new()?;
// Skip the example on machines without a GPU.
if !ctx.has_gpu() {
    return Ok(());
}

// 256 elements fill exactly one 256-thread group.
let n = 256usize;
let a: Vec<f32> = (0..n).map(|i| i as f32).collect();
let b = vec![1.0f32; n];

let result = vector_add::launch(&ctx)
    .input("a", encode_f32s(&a))
    .input("b", encode_f32s(&b))
    .input("c", vec![0; a.len() * core::mem::size_of::<f32>()])
    .dispatch()?;

let c = decode_f32s(result.outputs.get("c").expect("output buffer"));
assert_eq!(c[0], 1.0);
assert_eq!(c[255], 256.0);
§Writing kernels
- Import metaltile::prelude::* anywhere you define #[kernel] functions.
- Output tensors are identified by parameter name today. The facade treats c, out, and output as writable outputs.
- launch(&ctx).input(name, bytes) binds raw Vec<u8> buffers by parameter name, and DispatchResult returns output bytes under the same output name.
- Tensor<T, Shape> and #[constexpr] annotate IR/codegen metadata, but the current launch builder is still byte-buffer oriented.
- #[constexpr] parameters become extra constant-buffer bindings in generated MSL. The facade launch builder does not bind them automatically yet, so dispatch examples currently avoid constexpr-dependent kernels.
- The current elementwise launch path sizes its grid from the output buffer and uses 256-thread groups without inserting a tail guard, so examples should use element counts that match that dispatch shape.
- KernelMode starts as core::ir::KernelMode::Elementwise and may be adjusted by later lowering/codegen passes based on the IR shape and ops you emit.
- Use <kernel>::build_kernel_ir() or metaltile::codegen::msl::MslGenerator to inspect the generated IR and MSL before dispatching.
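The notes above on raw byte buffers and 256-thread groups can be illustrated with a small, standalone sketch that prepares buffers whose element count is a whole number of threadgroups. It does not use any metaltile API. The helper names (padded_len, encode_padded_f32s, zeroed_output, decode_first_f32s) are hypothetical, and it assumes that "matching the dispatch shape" means an element count that is a multiple of 256, and that zero-padding inputs and output is acceptable for the kernel in question.

```rust
// Threadgroup width taken from the note above; treat it as an assumption
// about the current elementwise launch path, not a stable constant.
const THREADS_PER_GROUP: usize = 256;

// Hypothetical helper: round an element count up to a whole number of threadgroups.
fn padded_len(n: usize) -> usize {
    (n + THREADS_PER_GROUP - 1) / THREADS_PER_GROUP * THREADS_PER_GROUP
}

// Hypothetical helper: encode f32 values as native-endian bytes and
// zero-pad the buffer so its element count is a multiple of 256.
fn encode_padded_f32s(values: &[f32]) -> Vec<u8> {
    let byte_len = padded_len(values.len()) * std::mem::size_of::<f32>();
    let mut bytes = Vec::with_capacity(byte_len);
    for value in values {
        bytes.extend_from_slice(&value.to_ne_bytes());
    }
    // All-zero bytes decode to 0.0f32, so the padding elements are zeros.
    bytes.resize(byte_len, 0);
    bytes
}

// Hypothetical helper: zeroed output buffer with the same padded element count,
// since the grid is sized from the output buffer.
fn zeroed_output(n: usize) -> Vec<u8> {
    vec![0u8; padded_len(n) * std::mem::size_of::<f32>()]
}

// Hypothetical helper: decode only the first `n` elements, dropping the padding.
fn decode_first_f32s(bytes: &[u8], n: usize) -> Vec<f32> {
    bytes
        .chunks_exact(std::mem::size_of::<f32>())
        .take(n)
        .map(|chunk| f32::from_ne_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
        .collect()
}
```

With these helpers, 300 input values would be padded to 512 elements (2048 bytes), and the caller would read back only the first 300 results. Whether padding is appropriate depends on the kernel: it is harmless for an elementwise add, but a kernel whose result depends on every element in the buffer would need a different approach.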
Re-exports§
pub use prelude::Tensor;
pub use metaltile_codegen as codegen;
pub use metaltile_core as core;
Modules§
- prelude
- Re-exports and placeholder DSL items for #[kernel] functions.
Macros§
- shape
- Proc macros and helper macros used by kernel definitions. Construct a Shape from dimension expressions.
- tile
- Proc macros and helper macros used by kernel definitions. Construct a 2D tile shape.
Structs§
- Context
- Runtime context, dispatch result, and top-level runtime error.
- DispatchResult
- Runtime context, dispatch result, and top-level runtime error.
Enums§
- CodegenError
- Error returned by metaltile::codegen helpers.
- MetalTileError
- Runtime context, dispatch result, and top-level runtime error.
Constants§
- VERSION
- Crate version from Cargo.toml.
Functions§
- version
- Return the crate version from Cargo.toml.
Attribute Macros§
- bench_kernel
- Proc macros and helper macros used by kernel definitions. Registers a #[kernel] function for automatic benchmarking.
- constexpr
- Proc macros and helper macros used by kernel definitions.
- kernel
- Proc macros and helper macros used by kernel definitions.
- scalar
- Proc macros and helper macros used by kernel definitions.
- strided
- Proc macros and helper macros used by kernel definitions.