Crate kn_cuda_eval

A Cuda (GPU) executor for neural network graphs from the kn_graph crate. The core type is CudaExecutor.

This crate is part of the Kyanite project; see its readme for more information. See system-requirements for how to set up the Cuda libraries.

§Quick demo

// load and optimize the graph
let graph = load_graph_from_onnx_path("test.onnx", false)?;
let graph = optimize_graph(&graph, Default::default());

// select a device
let device = CudaDevice::new(0).unwrap();

// build an executor
let batch_size = 8;
let mut executor = CudaExecutor::new(device, &graph, batch_size);

// evaluate the graph with some inputs, get the outputs back
let inputs = [DTensor::F32(Tensor::zeros(vec![batch_size, 16]))];
let outputs: &[DTensor] = executor.evaluate(&inputs);
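
The demo builds the executor for a single batch size and feeds it an input of shape [batch_size, 16]. Assuming the executor expects exactly that batch dimension on every call (an assumption, since the text here does not state it), a caller with an arbitrary number of rows may need to split the data into fixed-size batches and zero-pad the last one. The sketch below shows one way to do that using only the standard library. The helper name pad_into_batches is hypothetical and is not part of kn_cuda_eval.

```rust
/// Split `rows` (each holding `width` values) into batches of exactly
/// `batch_size` rows, zero-padding the final batch if it is short.
/// Returns one (flat row-major buffer, number of real rows) pair per batch.
/// Hypothetical helper for illustration; not part of kn_cuda_eval.
fn pad_into_batches(
    rows: &[Vec<f32>],
    batch_size: usize,
    width: usize,
) -> Vec<(Vec<f32>, usize)> {
    assert!(batch_size > 0, "batch_size must be positive");
    rows.chunks(batch_size)
        .map(|chunk| {
            let mut flat = Vec::with_capacity(batch_size * width);
            for row in chunk {
                assert_eq!(row.len(), width, "every row must have `width` values");
                flat.extend_from_slice(row);
            }
            // Pad with zeros so every buffer holds batch_size * width values.
            flat.resize(batch_size * width, 0.0);
            (flat, chunk.len())
        })
        .collect()
}

fn main() {
    // 19 rows of 16 values each, grouped into batches of 8 rows.
    let rows: Vec<Vec<f32>> = (0..19).map(|i| vec![i as f32; 16]).collect();
    let batches = pad_into_batches(&rows, 8, 16);
    for (flat, real) in &batches {
        println!("buffer of {} values, {} real rows", flat.len(), real);
    }
}
```

Each buffer would then be wrapped in a tensor of shape [batch_size, 16] before evaluation, and only the first `real` rows of the corresponding output would be kept. Padding with zeros is a design choice for this sketch; whether padded rows are harmless depends on the network being evaluated.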

Modules§

autokernel: The autokernel infrastructure and specific kernels.
device_tensor: On-device tensor data structure.
executor: The main executor type and the compiler for it.
offset_tensor: Tensor utility.
shape: Shape utilities.
tester: Testing and debugging infrastructure.
util: Miscellaneous utilities.

Structs§

CudaDevice: A cuda device index. Re-exported here for convenience, so an explicit dependency on the kn_cuda_sys crate is often not needed.
