Crate ghostflow_core


GhostFlow Core - High-performance tensor operations

This crate provides the foundational tensor type and operations for the GhostFlow ML framework.

§Phase 4 Optimizations (Beat JAX!)

  • Operation fusion engine
  • JIT compilation
  • Memory layout optimization
  • Custom optimized kernels
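As a rough illustration of what operation fusion buys, the sketch below compares an unfused `mul -> add -> relu` chain with a single fused pass. It is plain Rust and does not use the crate's `FusionEngine`, whose API is not shown on this page; the function names are hypothetical.

```rust
/// Unfused: three passes over the data and two temporary buffers.
/// (Hypothetical helper, not part of ghostflow_core.)
fn mul_add_relu_unfused(a: &[f32], b: &[f32], c: &[f32]) -> Vec<f32> {
    let prod: Vec<f32> = a.iter().zip(b).map(|(x, y)| x * y).collect();
    let sum: Vec<f32> = prod.iter().zip(c).map(|(p, z)| p + z).collect();
    sum.iter().map(|v| v.max(0.0)).collect()
}

/// Fused: one pass, no intermediate allocations.
/// (Hypothetical helper, not part of ghostflow_core.)
fn mul_add_relu_fused(a: &[f32], b: &[f32], c: &[f32]) -> Vec<f32> {
    a.iter()
        .zip(b)
        .zip(c)
        .map(|((x, y), z)| (x * y + z).max(0.0))
        .collect()
}
```

Both functions compute the same values. The fused version reads each input once and writes the output once, which is the kind of memory-traffic saving a fusion engine typically aims for.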

Re-exports§

pub use dtype::DType;
pub use shape::Shape;
pub use shape::Strides;
pub use storage::Storage;
pub use tensor::Tensor;
pub use device::Device;
pub use device::Cpu;
pub use error::GhostError;
pub use error::Result;
pub use serialize::StateDict;
pub use serialize::save_state_dict;
pub use serialize::load_state_dict;
pub use serialize::Serializable;
pub use sparse::SparseTensorCOO;
pub use sparse::SparseTensorCSR;
pub use sparse::SparseTensorCSC;
pub use hardware::HardwareBackend;
pub use hardware::HardwareDevice;
pub use hardware::HardwareOps;
pub use hardware::ElementwiseOp;
pub use hardware::list_devices;
pub use fusion::FusionEngine;
pub use fusion::ComputeGraph;
pub use fusion::FusionPattern;
pub use layout::LayoutOptimizer;
pub use layout::MemoryLayout;
pub use layout::DeviceInfo;
pub use simd_ops::simd_add_f32;
pub use simd_ops::simd_mul_f32;
pub use simd_ops::simd_dot_f32;
pub use simd_ops::simd_relu_f32;
pub use memory::MemoryPool;
pub use memory::MemoryStats;
pub use memory::MemoryLayoutOptimizer;
pub use memory::TrackedAllocator;
pub use profiler::Profiler;
pub use profiler::ProfileScope;
pub use profiler::Benchmark;
pub use profiler::BenchmarkResult;
pub use profiler::global_profiler;
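
The signatures of the re-exported `simd_*_f32` functions are not shown on this page. As a sketch of the general technique only, the following standalone function computes a dot product with several independent accumulators, a pattern that compilers can often auto-vectorize. The name `dot_f32_lanes` and the lane count are assumptions made for illustration.

```rust
/// Number of independent accumulators (an illustrative choice).
const LANES: usize = 8;

/// Dot product using LANES partial sums, then a scalar tail.
/// (Hypothetical helper, not the crate's simd_dot_f32.)
fn dot_f32_lanes(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "inputs must have equal length");
    let mut acc = [0.0f32; LANES];
    let mut a_chunks = a.chunks_exact(LANES);
    let mut b_chunks = b.chunks_exact(LANES);
    // Main loop: each lane accumulates independently of the others.
    for (ca, cb) in (&mut a_chunks).zip(&mut b_chunks) {
        for lane in 0..LANES {
            acc[lane] += ca[lane] * cb[lane];
        }
    }
    // Tail: the elements that did not fill a whole chunk.
    let tail: f32 = a_chunks
        .remainder()
        .iter()
        .zip(b_chunks.remainder())
        .map(|(x, y)| x * y)
        .sum();
    acc.iter().sum::<f32>() + tail
}
```

Because the partial sums are combined in a different order than a naive loop, results can differ by rounding for general floating-point inputs.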

Modules§

device
Device abstraction for CPU/GPU execution
dtype
Data types supported by GhostFlow tensors
error
Error types for GhostFlow
fusion
Kernel fusion engine for optimizing computation graphs
hardware
Hardware abstraction layer
layout
Memory layout optimizer
memory
Memory optimization utilities
metal
Metal backend for Apple Silicon
neon
ARM NEON SIMD optimizations
ops
Tensor operations
prelude
Prelude for convenient imports
profiler
Profiling tools for performance analysis
rocm
ROCm (AMD GPU) backend
serialize
Model serialization and deserialization
shape
Shape and stride handling for tensors
simd_ops
Advanced SIMD optimizations for tensor operations
sparse
Sparse tensor operations
storage
Storage backend for tensor data
tensor
Core Tensor type - the foundation of GhostFlow
tensor_ops
Tensor operations trait extensions
tpu
TPU (Tensor Processing Unit) backend
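
To make the shape and stride handling mentioned for the `shape` module concrete, here is a minimal standalone sketch of row-major strides and flat-offset computation. It assumes a dense, contiguous layout and does not use the crate's `Shape` or `Strides` types, whose methods are not listed here; both function names are hypothetical.

```rust
/// Row-major (C-order) strides, in elements, for a dense tensor.
/// (Hypothetical helper, not part of ghostflow_core.)
fn contiguous_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![1; shape.len()];
    // Walk from the second-to-last dimension back to the first.
    for i in (0..shape.len().saturating_sub(1)).rev() {
        strides[i] = strides[i + 1] * shape[i + 1];
    }
    strides
}

/// Flat element offset of a multi-dimensional index.
/// (Hypothetical helper, not part of ghostflow_core.)
fn flat_offset(index: &[usize], strides: &[usize]) -> usize {
    index.iter().zip(strides).map(|(i, s)| i * s).sum()
}
```

For a shape of `[2, 3, 4]` this gives strides `[12, 4, 1]`, so the element at index `[1, 2, 3]` sits at flat offset 23.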

Macros§

profile
Profile a code block
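
The syntax accepted by `profile!` is not documented on this page. As an assumption-laden sketch of how a block-profiling macro can work, the hypothetical `profile_sketch!` below times a block with `std::time::Instant`, prints the label, and returns the block's value together with the elapsed time.

```rust
/// Times a block and returns `(value, elapsed)`.
/// (Hypothetical macro, not the crate's `profile!`.)
macro_rules! profile_sketch {
    ($label:expr, $body:block) => {{
        let start = std::time::Instant::now();
        let value = $body;
        let elapsed = start.elapsed();
        // Report to stderr so the timing does not mix with program output.
        eprintln!("{}: {:?}", $label, elapsed);
        (value, elapsed)
    }};
}
```

A call such as `profile_sketch!("sum", { (1..=100u64).sum::<u64>() })` would evaluate the block once and yield its result alongside a `Duration`.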