Crate tenflowers

§TenfloweRS - Pure Rust Deep Learning Framework

TenfloweRS is a comprehensive machine learning framework implemented in pure Rust, providing TensorFlow-compatible APIs with Rust’s safety and performance guarantees. Built on the robust SciRS2 scientific computing ecosystem, TenfloweRS offers:

  • Production-Ready: Full-featured neural networks, training, and deployment
  • High Performance: GPU acceleration, SIMD optimization, mixed precision
  • Type Safety: Rust’s type system prevents common ML bugs at compile time
  • Cross-Platform: CPU, GPU (CUDA, Metal, Vulkan), and WebGPU support
  • Ecosystem Integration: Seamless integration with SciRS2, NumRS2, and OptiRS

§Quick Start

§Basic Tensor Operations

use tenflowers::prelude::*;

// Create tensors
let a = Tensor::<f32>::zeros(&[2, 3]);
let b = Tensor::<f32>::ones(&[2, 3]);

// Arithmetic operations
let c = ops::add(&a, &b)?;
let d = ops::mul(&a, &b)?;

// Matrix multiplication
let x = Tensor::<f32>::ones(&[2, 3]);
let y = Tensor::<f32>::ones(&[3, 4]);
let z = ops::matmul(&x, &y)?;

§Building Neural Networks

use tenflowers::prelude::*;

// Create a simple feedforward network
let model = Sequential::<f32>::new(vec![])
    .add(Box::new(Dense::new(784, 128, true).with_activation("relu".to_string())))
    .add(Box::new(Dense::new(128, 10, true).with_activation("sigmoid".to_string())));

// Forward pass
let input = Tensor::zeros(&[32, 784]);
let output = model.forward(&input)?;

§Training Models

use tenflowers::prelude::*;

// Create model and data
let model = Sequential::<f32>::new(vec![])
    .add(Box::new(Dense::new(10, 64, true).with_activation("relu".to_string())))
    .add(Box::new(Dense::new(64, 3, true)));
let x_train = Tensor::<f32>::zeros(&[100, 10]);
let y_train = Tensor::<f32>::zeros(&[100, 3]);

// Create an optimizer (no loss function is constructed in this snippet)
let optimizer = SGD::<f32>::new(0.01);
// Training loop would go here using Trainer
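The `Trainer` API itself is not shown on this page, so as a rough non-compiling sketch of what a manual loop could look like: `mse_loss`, `trainable_variables`, and `apply_gradients` are assumed names for illustration, not confirmed tenflowers APIs.

```rust
// Sketch only; the named helpers below are assumptions, not the actual API.
let mut optimizer = SGD::<f32>::new(0.01);
for _epoch in 0..10 {
    let mut tape = GradientTape::new();
    let x = tape.watch(x_train.clone());
    let predictions = model.forward(&x)?;                 // forward pass
    let loss = ops::mse_loss(&predictions, &y_train)?;    // assumed loss helper
    let gradients = tape.gradient(&[loss], &model.trainable_variables())?;
    optimizer.apply_gradients(&gradients)?;               // assumed update method
}
```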

§GPU Acceleration

use tenflowers::prelude::*;

// Move computation to GPU
let device = Device::try_gpu(0)?;
let gpu_tensor = Tensor::<f32>::zeros(&[1000, 1000]).to_device(device)?;
let result = ops::matmul(&gpu_tensor, &gpu_tensor)?;

§Automatic Differentiation

use tenflowers::prelude::*;

let mut tape = GradientTape::new();

// Create tracked tensors
let x = tape.watch(Tensor::<f32>::ones(&[2, 2]));
let y = tape.watch(Tensor::<f32>::ones(&[2, 2]));

// Compute a value that depends on x and y, then take its gradients
let z = ops::add(&x, &y)?;
let gradients = tape.gradient(&[z], &[x, y])?;

§Data Loading

use tenflowers::prelude::*;
use tenflowers::dataset::RandomSampler;

// Load dataset
let dataset: CsvDataset<f32> = CsvDatasetBuilder::new()
    .from_path("data.csv")
    .has_header(true)
    .build()?;

// Create data loader with batching and shuffling
let loader = DataLoaderBuilder::new(dataset)
    .batch_size(32)
    .num_workers(4)
    .build(RandomSampler::new());

// Iterate through batches
for batch in loader.iter() {
    let (features, labels) = batch?.into_collated()?;
    // Training step...
}

§Architecture

TenfloweRS is organized into several focused crates:

  • core: Tensor operations and device management
  • autograd: Automatic differentiation engine
  • neural: Neural network layers and models
  • dataset: Data loading and preprocessing

§Feature Flags

§Default Features

  • std: Standard library support
  • parallel: Parallel execution via Rayon

§GPU Acceleration

  • gpu: GPU acceleration via WGPU (Metal, Vulkan, DirectX, WebGPU)
  • cuda: CUDA support (Linux/Windows only)
  • cudnn: cuDNN support (requires CUDA)
  • opencl: OpenCL support
  • metal: Metal support (macOS only)
  • rocm: ROCm support (AMD GPUs)
  • nccl: NCCL for distributed GPU training

§BLAS Acceleration

  • blas: Generic BLAS support
  • blas-openblas: OpenBLAS acceleration
  • blas-mkl: Intel MKL acceleration
  • blas-accelerate: Apple Accelerate framework (macOS only)

§Performance & Optimization

  • simd: SIMD vectorization optimizations

§Serialization & I/O

  • serialize: Serialization support (JSON, MessagePack)
  • compression: Compression support for checkpoints
  • onnx: ONNX model import/export

§Platform Support

  • wasm: WebAssembly support

§Development

  • autograd: Automatic differentiation support
  • benchmark: Benchmarking utilities

§Language Bindings

  • python: Python bindings via PyO3

§Convenience

  • full: Enable most features (gpu, blas-openblas, simd, serialize, compression, onnx, autograd, python)
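To enable a set of these features, list them in your `Cargo.toml`; for example (the version below is a placeholder, pin the release you actually use):

```toml
[dependencies]
tenflowers = { version = "0.1", features = ["gpu", "simd", "serialize"] }
```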

§SciRS2 Integration

TenfloweRS is built on top of the SciRS2 ecosystem:

TenfloweRS (Deep Learning Framework)
    ↓ builds upon
OptiRS (ML Optimization)
    ↓ builds upon
SciRS2 (Scientific Computing Foundation)

This integration provides:

  • Advanced numerical operations via scirs2-core
  • Automatic differentiation via scirs2-autograd
  • Neural network abstractions via scirs2-neural
  • Optimized algorithms via optirs

Re-exports§

pub use tenflowers_autograd as autograd;
pub use tenflowers_core as core;
pub use tenflowers_dataset as dataset;
pub use tenflowers_neural as neural;

Modules§

  • common: Common types and utilities
  • data: Data pipeline and dataset utilities
  • macros: Convenience macros for the TenfloweRS framework
  • nn: Neural network layers, activations, and models
  • optim: Optimization algorithms
  • prelude: Prelude module for convenient imports

Macros§

  • tensor: Creates a 1D [Tensor] from a list of values.
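For example, a minimal sketch of the macro's use (assuming the element type is inferred from the literals, which is not confirmed here):

```rust
let t = tensor![1.0f32, 2.0, 3.0]; // a 1-D tensor with three elements
```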

Structs§

  • VersionInfo: Structured version metadata for the TenfloweRS framework.

Constants§

  • VERSION: The version of the TenfloweRS framework.

Functions§

  • version: Returns the version string of TenfloweRS.
  • version_info: Returns structured version metadata populated at compile time.