Crate baracuda_cutensor

Safe Rust wrappers for NVIDIA cuTENSOR (v2 API).

cuTENSOR is NVIDIA’s high-performance tensor-primitive library — einsum-style contractions, element-wise ops, reductions, and permutations. This crate wraps the full v2 host API surface.

§Concepts

§Example — D = α · A ⊗ B + β · C (matmul via contraction)

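In Einstein notation, D[m,n] = A[m,k] · B[k,n]. Each mode gets an integer ID; k appears in both A and B but not in D, so it is the contracted (summed) index. The ID values are arbitrary as long as each mode has a distinct integer that is used consistently across operands.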

use baracuda_cutensor::*;

let handle = Handle::new()?;
let m = 64i64; let n = 64i64; let k = 32i64;
let a = TensorDescriptor::new(&handle, &[m, k], None, DataType::F32, 128)?;
let b = TensorDescriptor::new(&handle, &[k, n], None, DataType::F32, 128)?;
let c = TensorDescriptor::new(&handle, &[m, n], None, DataType::F32, 128)?;
let modes_a = &[0i32, 2]; // A[m, k]: m = 0, k = 2
let modes_b = &[2i32, 1]; // B[k, n]: k = 2, n = 1
let modes_c = &[0i32, 1]; // C[m, n]; also passed for D below
let op = unsafe {
    Contraction::new(&handle, &a, modes_a, &b, modes_b, &c, modes_c, &c, modes_c,
        core::ptr::null())
}?;
let pref = PlanPreference::default_for(&handle)?;
let ws = op.estimate_workspace(&pref, WorkspaceKind::Default)?;
let plan = Plan::new(&op, &pref, ws)?;
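
The mode rule used in the example above can be sketched in plain Rust: a mode is contracted when it appears in both A and B but not in D. `contracted_modes` is a hypothetical helper written for illustration, not part of this crate.

```rust
/// Returns the modes that are summed over in a contraction: those present
/// in both A and B but absent from D.
/// Hypothetical helper for illustration; not part of baracuda_cutensor.
fn contracted_modes(modes_a: &[i32], modes_b: &[i32], modes_d: &[i32]) -> Vec<i32> {
    modes_a
        .iter()
        .copied()
        .filter(|m| modes_b.contains(m) && !modes_d.contains(m))
        .collect()
}
```

With the mode lists from the example, `contracted_modes(&[0, 2], &[2, 1], &[0, 1])` yields `[2]`, the ID chosen for k.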

§Example — reduce along an axis (sum over k)

D[m] = Σ_k A[m, k]. Modes present in A but absent from D are reduced with the chosen BinaryOp (Add for sum).

use baracuda_cutensor::*;

let handle = Handle::new()?;
let m = 128i64; let k = 64i64;
let a = TensorDescriptor::new(&handle, &[m, k], None, DataType::F32, 128)?;
let d = TensorDescriptor::new(&handle, &[m],    None, DataType::F32, 128)?;

let modes_a = &[0i32, 1]; // [m, k]
let modes_d = &[0i32];     // [m]
let op = unsafe {
    Reduction::new(&handle, &a, modes_a, &d, modes_d, &d, modes_d,
        BinaryOp::Add, core::ptr::null())
}?;
let pref = PlanPreference::default_for(&handle)?;
let ws = op.estimate_workspace(&pref, WorkspaceKind::Default)?;
let _plan = Plan::new(&op, &pref, ws)?;
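
To sanity-check a reduction on small inputs, a CPU reference can help. The sketch below computes D[m] = Σ_k A[m, k] on a flat buffer. It assumes a packed layout with the first mode (m) varying fastest, which is cuTENSOR's convention when no strides are given; that passing `None` for strides in this crate selects that layout is an assumption. `reduce_sum_over_k` is a hypothetical helper, not part of the crate.

```rust
/// CPU reference for D[m] = sum over k of A[m, k].
/// Assumes A is packed with mode m varying fastest, so A[m, k] lives at
/// index m + k * m_extent. Hypothetical helper for illustration only.
fn reduce_sum_over_k(a: &[f32], m_extent: usize, k_extent: usize) -> Vec<f32> {
    assert_eq!(a.len(), m_extent * k_extent, "buffer size must match extents");
    let mut d = vec![0.0f32; m_extent];
    for k in 0..k_extent {
        for m in 0..m_extent {
            d[m] += a[m + k * m_extent];
        }
    }
    d
}
```

For a 2 × 3 input stored as `[1, 2, 3, 4, 5, 6]`, row 0 holds 1, 3, 5 and row 1 holds 2, 4, 6, so the result is `[9.0, 12.0]`.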

§Example — element-wise D = A + C via ElementwiseBinary

Every operand uses the same modes, so nothing is contracted or reduced: the operation is a single fused per-element pass, with an optional unary pre-op on each input.

use baracuda_cutensor::*;

let handle = Handle::new()?;
let n = 1024i64;
let a = TensorDescriptor::new(&handle, &[n], None, DataType::F32, 128)?;
let c = TensorDescriptor::new(&handle, &[n], None, DataType::F32, 128)?;
let d = TensorDescriptor::new(&handle, &[n], None, DataType::F32, 128)?;

let modes = &[0i32];
let op = unsafe {
    ElementwiseBinary::new(
        &handle,
        &a, modes, UnaryOp::Identity,
        &c, modes, UnaryOp::Identity,
        &d, modes,
        BinaryOp::Add,
        core::ptr::null(),
    )
}?;
let pref = PlanPreference::default_for(&handle)?;
let _plan = Plan::new(&op, &pref, /* workspace */ 0)?;
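
The formula this crate documents for ElementwiseBinary, D = (α * op_a(A)) op_ac (γ * op_c(C)), can be mirrored on the CPU for testing. This is a minimal sketch: `elementwise_binary_ref` is a hypothetical helper, and plain closures stand in for `UnaryOp` and `BinaryOp`.

```rust
/// CPU reference for D = (alpha * op_a(A)) op_ac (gamma * op_c(C)),
/// applied element by element to operands with identical extents.
/// Hypothetical helper; closures stand in for UnaryOp and BinaryOp.
fn elementwise_binary_ref(
    alpha: f32,
    a: &[f32],
    op_a: impl Fn(f32) -> f32,
    gamma: f32,
    c: &[f32],
    op_c: impl Fn(f32) -> f32,
    op_ac: impl Fn(f32, f32) -> f32,
) -> Vec<f32> {
    assert_eq!(a.len(), c.len(), "operands must have identical extents");
    a.iter()
        .zip(c)
        .map(|(&x, &y)| op_ac(alpha * op_a(x), gamma * op_c(y)))
        .collect()
}
```

The D = A + C case from the example corresponds to α = γ = 1, identity closures for both unary ops, and `|x, y| x + y` for the binary op.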

Structs§

BlockSparseContraction
Block-sparse contraction: the A operand is block-sparse, B/C/D dense.
BlockSparseTensorDescriptor
A block-sparse tensor descriptor (cuTENSOR 2.x). Used on the A operand of a BlockSparseContraction.
ComputeDescriptor
A custom compute descriptor. Prefer the pre-defined ones (Handle::compute_desc_32f, …) unless you need attribute customization.
Contraction
A contraction op: D[mD] = α * op_a(A[mA]) * op_b(B[mB]) + β * op_c(C[mC]).
ElementwiseBinary
Elementwise binary op: D[mD] = (α * op_a(A[mA])) op_ac (γ * op_c(C[mC])).
ElementwiseTrinary
Elementwise trinary op: D[mD] = ((α * op_a(A[mA])) op_ab (β * op_b(B[mB]))) op_abc (γ * op_c(C[mC])).
Handle
cuTENSOR library handle.
OperationDescriptor
An un-compiled operation descriptor. Users typically create these through constructors on Contraction, Reduction, ElementwiseBinary, ElementwiseTrinary, or Permutation.
Permutation
Tensor permutation (axis shuffle + optional unary op): B[mB] = α * op_a(A[mA]).
Plan
A compiled operation plan. Dispatch to the matching execute method based on the op kind that built it.
PlanPreference
Plan preferences — algorithm selection + JIT mode.
Reduction
A reduction op: D[mD] = reduce(A[mA]) with user-chosen reduce op.
TensorDescriptor
A tensor descriptor: modes + extents + dtype + stride layout.
TrinaryContraction
A trinary contraction op: E[mE] = α·op_a(A)·op_b(B)·op_c(C) + β·op_d(D).

Enums§

BinaryOp
Binary combining operator (used between operands in elementwise / reduction ops).
DataType
Element dtype for tensor descriptors.
UnaryOp
Per-operand unary operator (applied to A/B/C before the main op).
WorkspaceKind
Workspace-size preference tier.

Functions§

cudart_version
cuTENSOR’s view of the CUDART version it was built against.
force_disable_logging
Force-disable all cuTENSOR logging (tightest possible quiet).
open_log_file
Open a log file path for cuTENSOR output.
probe
Verify cuTENSOR is loadable on this host.
set_log_level
Set the cuTENSOR logger verbosity (0 = off, 1 = error, 2 = trace).
set_log_mask
Bitmask of log categories (API calls, hints, traces, …). Full value list in cuTENSOR headers.
version
Encoded integer version from cutensorGetVersion. Decode as major = v / 10000, minor = (v / 100) % 100, patch = v % 100.
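
The decoding rule given for version can be written as a small helper. This is a sketch: `decode_version` is a hypothetical name, and the `u64` argument type is an assumption, since the return type of `version` is not shown here.

```rust
/// Splits an encoded cuTENSOR version into (major, minor, patch) using
/// major = v / 10000, minor = (v / 100) % 100, patch = v % 100.
/// Hypothetical helper; the integer type is an assumption.
fn decode_version(v: u64) -> (u64, u64, u64) {
    (v / 10000, (v / 100) % 100, v % 100)
}
```

For example, `decode_version(20301)` returns `(2, 3, 1)`.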

Type Aliases§

Error
Error type for cuTENSOR operations.
Result
Result alias.