Crate candle_core

ML framework for Rust

use candle_core::{Tensor, DType, Device};

let a = Tensor::arange(0f32, 6f32, &Device::Cpu)?.reshape((2, 3))?;
let b = Tensor::arange(0f32, 12f32, &Device::Cpu)?.reshape((3, 4))?;
let c = a.matmul(&b)?;

§Features

  • Simple syntax (looks and feels like PyTorch)
  • CPU and CUDA backends (with Apple-silicon/Metal support)
  • Serverless (CPU-only), small and fast deployments
  • Model training
  • Distributed computing (NCCL)
  • Models out of the box (Llama, Whisper, Falcon, …)

§FAQ

  • Why Candle?

Candle stems from the need to reduce binary size in order to make serverless inference possible: full frameworks such as PyTorch are very large, whereas Candle keeps the whole engine small enough for lightweight deployments.

It also simply removes Python from production workloads: Python can add real overhead in more complex workflows, and the GIL is a notorious source of headaches.

Rust is cool, and a lot of the HF ecosystem already has Rust crates, for example safetensors and tokenizers.

§Other Crates

Candle consists of a number of crates. This crate holds the core common data structures, but you may also wish to look at the docs for the other crates.

Re-exports§

pub use cpu_backend::CpuStorage;
pub use cpu_backend::CpuStorageRef;
pub use error::Context;
pub use error::Error;
pub use error::Result;
pub use layout::Layout;
pub use shape::Shape;
pub use shape::D;
pub use streaming::StreamTensor;
pub use streaming::StreamingBinOp;
pub use streaming::StreamingModule;
pub use dummy_cuda_backend as cuda;
pub use cuda::CudaDevice;
pub use cuda::CudaStorage;

Modules§

backend
Traits to Define Backend Behavior
backprop
Methods for backpropagation of gradients.
conv
1D and 2D Convolutions
cpu
Traits and methods for CPU-backed Tensors
cpu_backend
Implementation of Backend Fns for CPU
display
Pretty printing of tensors
dummy_cuda_backend
Implementation of the CUDA backend when CUDA support has not been compiled in.
error
Candle-specific Error and Result
layout
Tensor Layouts including contiguous or sparse strides
npy
Numpy support for tensors.
op
Tensor Operation Enums and Traits
pickle
Just enough pickle support to be able to read PyTorch checkpoints.
quantized
Code for GGML and GGUF files
safetensors
Module to load safetensor files into CPU/GPU memory.
scalar
TensorScalar Enum and Trait
shape
The shape of a tensor is a tuple with the size of each of its dimensions.
streaming
StreamTensor, useful for streaming ops.
test_utils
utils
Useful functions for checking features.

Macros§

bail
map_dtype
test_device

Structs§

DTypeParseError
MetalDevice
MetalStorage
StridedIndex
An iterator over offset positions for items of an N-dimensional array stored in a flat buffer using some potential strides.
Tensor
The core struct for manipulating tensors.
TensorId
Unique identifier for tensors.
UgIOp1
Var
A variable is a wrapper around a tensor; however, variables can have their content modified, whereas tensors are immutable.

Enums§

DType
The different types of elements allowed in tensors.
Device
Cpu, Cuda, or Metal
DeviceLocation
A DeviceLocation represents a physical device whereas multiple Device can live on the same location (typically for cuda devices).
MetalError
Storage
StridedBlocks
TensorIndexer
Generic structure used to index a slice of the tensor

Traits§

CustomOp1
Unary ops that can be defined in user-land.
CustomOp2
CustomOp3
FloatDType
IndexOp
Trait used to implement multiple signatures for ease of use of the slicing of a tensor
InplaceOp1
Unary ops that can be defined in user-land. These ops work in place and as such back-prop is unsupported.
InplaceOp2
InplaceOp3
IntDType
Module
Defining a module with forward method using a single argument.
ModuleT
A single forward method using a single tensor argument and a flag to separate the training and evaluation behaviors.
NdArray
ToUsize2
WithDType