
Crate tensorlogic_scirs_backend


SciRS2-backed executor (CPU/SIMD/GPU via features).

Version: 0.1.0 | Status: Production Ready

This crate provides a production-ready implementation of the TensorLogic execution traits using the SciRS2 scientific computing library.

§Core Features

§Execution Engine

  • Forward pass: Tensor operations (einsum, element-wise, reductions)
  • Backward pass: Automatic differentiation with stored intermediate values
  • Gradient checking: Numeric verification for correctness
  • Batch execution: Parallel processing support for multiple inputs
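To make the gradient-checking idea concrete, here is a minimal, self-contained sketch of central-difference gradient verification, the technique the `gradient_check` module provides. The function name `numeric_grad` and the scalar setting are illustrative, not the crate's actual API, which operates on tensors:

```rust
/// Numerically estimate df/dx at `x` with a central difference.
fn numeric_grad(f: impl Fn(f64) -> f64, x: f64, eps: f64) -> f64 {
    (f(x + eps) - f(x - eps)) / (2.0 * eps)
}

fn main() {
    // Analytical gradient of f(x) = x^2 is 2x; compare it against the
    // numeric estimate at x = 3.
    let f = |x: f64| x * x;
    let analytic = 2.0 * 3.0;
    let numeric = numeric_grad(f, 3.0, 1e-5);
    assert!((analytic - numeric).abs() < 1e-6);
    println!("analytic={analytic}, numeric={numeric}");
}
```

The backend applies the same comparison element-wise over tensor gradients produced by the backward pass.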

§Performance

  • Memory pooling: Efficient tensor allocation with shape-based reuse
  • Operation fusion: Graph analysis to identify and exploit fusion opportunities
  • SIMD support: Vectorized operations via feature flags
  • Profiling: Detailed performance monitoring and tracing
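The shape-based reuse strategy behind the memory pooling bullet can be sketched as a buffer pool keyed by tensor shape. This is an illustration of the idea, not the `memory_pool` module's API; the `ShapePool` type and its methods are hypothetical:

```rust
use std::collections::HashMap;

// Minimal shape-keyed buffer pool: buffers released under a shape key are
// handed back out for later allocations of the same shape.
struct ShapePool {
    free: HashMap<Vec<usize>, Vec<Vec<f64>>>,
}

impl ShapePool {
    fn new() -> Self {
        Self { free: HashMap::new() }
    }

    /// Reuse a zeroed buffer of the requested shape if one is available,
    /// otherwise allocate a fresh one.
    fn acquire(&mut self, shape: &[usize]) -> Vec<f64> {
        if let Some(mut buf) = self.free.get_mut(shape).and_then(|v| v.pop()) {
            buf.fill(0.0);
            buf
        } else {
            vec![0.0; shape.iter().product()]
        }
    }

    /// Return a buffer to the pool for later reuse under its shape key.
    fn release(&mut self, shape: Vec<usize>, buf: Vec<f64>) {
        self.free.entry(shape).or_default().push(buf);
    }
}

fn main() {
    let mut pool = ShapePool::new();
    let a = pool.acquire(&[2, 3]);
    assert_eq!(a.len(), 6);
    pool.release(vec![2, 3], a);
    // A second acquire of the same shape reuses the released buffer
    // instead of allocating.
    let b = pool.acquire(&[2, 3]);
    assert_eq!(b.len(), 6);
}
```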

§Reliability

  • Error handling: Comprehensive error types with detailed context
  • Execution tracing: Multi-level debugging and operation tracking
  • Numerical stability: Fallback mechanisms for NaN/Inf handling
  • Shape validation: Runtime shape inference and verification
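The NaN/Inf fallback mechanism can be illustrated with a small sketch in the spirit of `fallback::sanitize_tensor`: non-finite entries are replaced with a substitute value so downstream operations keep receiving valid numbers. The `sanitize` function below is hypothetical and operates on a flat slice rather than the crate's tensor type:

```rust
/// Replace every NaN/Inf entry with `replacement`, returning how many
/// entries were fixed.
fn sanitize(data: &mut [f64], replacement: f64) -> usize {
    let mut fixed = 0;
    for x in data.iter_mut() {
        if !x.is_finite() {
            *x = replacement;
            fixed += 1;
        }
    }
    fixed
}

fn main() {
    let mut v = vec![1.0, f64::NAN, f64::INFINITY, -2.0];
    let fixed = sanitize(&mut v, 0.0);
    assert_eq!(fixed, 2);
    assert!(v.iter().all(|x| x.is_finite()));
}
```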

§Testing

  • 104 tests: Including unit, integration, and property-based tests
  • Property tests: Mathematical properties verified with proptest
  • Gradient tests: Numeric gradient checking for autodiff correctness
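As a flavor of what the property-based tests check, here is a hand-rolled version of two softmax invariants one might verify with proptest: the output sums to 1, and a numerically stable softmax is invariant to shifting its input by a constant. This is an illustrative sketch, not a test from the crate:

```rust
/// Numerically stable softmax: subtract the max before exponentiating.
fn softmax(xs: &[f64]) -> Vec<f64> {
    let max = xs.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = xs.iter().map(|x| (x - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

fn main() {
    let xs = [1.0, 2.0, 3.0];
    let shifted: Vec<f64> = xs.iter().map(|x| x + 100.0).collect();
    let p = softmax(&xs);
    let q = softmax(&shifted);
    // Property 1: probabilities sum to 1.
    assert!((p.iter().sum::<f64>() - 1.0).abs() < 1e-12);
    // Property 2: shift invariance.
    assert!(p.iter().zip(&q).all(|(a, b)| (a - b).abs() < 1e-12));
}
```

A proptest-based test would assert the same properties over randomly generated inputs rather than a fixed array.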

§Module Organization

  • executor: Core Scirs2Exec implementation
  • autodiff: Backward pass and gradient computation
  • gradient_ops: Advanced gradient operations (STE, Gumbel-Softmax, soft quantifiers)
  • error: Comprehensive error types and validation
  • fallback: Numerical stability and NaN/Inf handling
  • tracing: Execution debugging and performance tracking
  • memory_pool: Efficient tensor allocation
  • fusion: Operation fusion analysis
  • gradient_check: Numeric gradient verification
  • shape_inference: Runtime shape validation
  • batch_executor: Parallel batch processing
  • profiled_executor: Performance profiling wrapper
  • capabilities: Runtime capability detection
  • dependency_analyzer: Graph dependency analysis for parallel execution
  • parallel_executor: Multi-threaded parallel execution using Rayon
  • device: Device management (CPU/GPU selection)
  • execution_mode: Execution mode abstractions (Eager/Graph/JIT)
  • precision: Precision control (f32/f64/mixed)
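The straight-through estimator (STE) mentioned under gradient_ops handles non-differentiable thresholding: the forward pass applies a hard step, while the backward pass lets the upstream gradient pass through as if the step were the identity. The sketch below is illustrative; the function names and the clipping-window rule are assumptions, not the signatures of `gradient_ops::ste_threshold`:

```rust
/// Forward: hard, non-differentiable threshold.
fn ste_forward(x: f64, threshold: f64) -> f64 {
    if x >= threshold { 1.0 } else { 0.0 }
}

/// Backward: pass the upstream gradient through unchanged near the
/// threshold, and zero it out far away (a common STE clipping variant).
fn ste_backward(x: f64, threshold: f64, upstream: f64, window: f64) -> f64 {
    if (x - threshold).abs() <= window { upstream } else { 0.0 }
}

fn main() {
    assert_eq!(ste_forward(0.7, 0.5), 1.0);
    assert_eq!(ste_forward(0.2, 0.5), 0.0);
    // Gradient flows straight through near the threshold...
    assert_eq!(ste_backward(0.6, 0.5, 2.0, 1.0), 2.0);
    // ...and is clipped far away from it.
    assert_eq!(ste_backward(5.0, 0.5, 2.0, 1.0), 0.0);
}
```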

Re-exports§

pub use activations::elu;
pub use activations::gelu;
pub use activations::gelu_approx;
pub use activations::gelu_scalar;
pub use activations::hardsigmoid;
pub use activations::hardswish;
pub use activations::leaky_relu;
pub use activations::log_softmax;
pub use activations::mish;
pub use activations::prelu;
pub use activations::relu;
pub use activations::relu6;
pub use activations::relu_grad;
pub use activations::relu_scalar;
pub use activations::selu;
pub use activations::sigmoid;
pub use activations::sigmoid_grad;
pub use activations::sigmoid_scalar;
pub use activations::silu;
pub use activations::softmax;
pub use activations::softplus;
pub use activations::softsign;
pub use activations::swish;
pub use activations::swish_scalar;
pub use activations::tanh_activation;
pub use activations::tanh_grad;
pub use activations::ActivationBenchmark;
pub use activations::ActivationError;
pub use activations::ActivationType;
pub use attention::attention_entropy;
pub use attention::chunked_attention;
pub use attention::scaled_dot_product_attention;
pub use attention::stable_softmax;
pub use attention::AttentionConfig;
pub use attention::AttentionError;
pub use attention::AttentionOutput;
pub use attention::MultiHeadAttention;
pub use attention_grad::attention_backward;
pub use attention_grad::multihead_attention_backward;
pub use attention_grad::softmax_backward;
pub use attention_grad::AttentionGradients;
pub use attention_grad::MultiHeadAttentionGrad;
pub use batch_executor::ParallelBatchExecutor;
pub use blocked_sparse::blocked_sparse_add;
pub use blocked_sparse::blocked_sparse_dense_mm;
pub use blocked_sparse::blocked_sparse_mm;
pub use blocked_sparse::blocked_sparse_scale;
pub use blocked_sparse::BlockSparsityPattern;
pub use blocked_sparse::BlockSparsityStats;
pub use blocked_sparse::BlockedSparseDynTensor;
pub use blocked_sparse::BlockedSparseError;
pub use blocked_sparse::BlockedSparseTensor;
pub use checkpoint::Checkpoint;
pub use checkpoint::CheckpointConfig;
pub use checkpoint::CheckpointManager;
pub use checkpoint::CheckpointMetadata;
pub use comparison::abs_diff;
pub use comparison::assert_tensors_close;
pub use comparison::compare_tensors;
pub use comparison::count_non_finite;
pub use comparison::is_finite;
pub use comparison::ComparisonError;
pub use comparison::ComparisonResult;
pub use comparison::Tolerance;
pub use convolution::col2im;
pub use convolution::conv1d;
pub use convolution::conv2d;
pub use convolution::conv_transpose2d;
pub use convolution::depthwise_conv2d;
pub use convolution::im2col;
pub use convolution::ConvConfig;
pub use convolution::ConvError;
pub use convolution::ConvStats;
pub use cuda_detect::cuda_device_count;
pub use cuda_detect::cuda_devices_to_device_list;
pub use cuda_detect::detect_cuda_devices;
pub use cuda_detect::is_cuda_available;
pub use cuda_detect::CudaDeviceInfo;
pub use custom_ops::BinaryCustomOp;
pub use custom_ops::CustomOp;
pub use custom_ops::CustomOpContext;
pub use custom_ops::EluOp;
pub use custom_ops::GeluOp;
pub use custom_ops::HardSigmoidOp;
pub use custom_ops::HardSwishOp;
pub use custom_ops::LeakyReluOp;
pub use custom_ops::MishOp;
pub use custom_ops::OpRegistry;
pub use custom_ops::SoftplusOp;
pub use custom_ops::SwishOp;
pub use decomposition::cp_als;
pub use decomposition::fold;
pub use decomposition::hosvd;
pub use decomposition::truncated_svd;
pub use decomposition::tucker1;
pub use decomposition::unfold;
pub use decomposition::CpDecomposition;
pub use decomposition::DecompositionError;
pub use decomposition::HosvdResult;
pub use decomposition::TruncatedSvd;
pub use decomposition::Tucker1Result;
pub use dependency_analyzer::DependencyAnalysis;
pub use dependency_analyzer::DependencyStats;
pub use dependency_analyzer::OperationDependency;
pub use device::Device;
pub use device::DeviceError;
pub use device::DeviceType;
pub use device::SystemDeviceManager;
pub use device_manager::DeviceConfig;
pub use device_manager::DeviceManager;
pub use device_manager::DeviceSelector;
pub use device_manager::HeuristicSelector;
pub use device_manager::OpDescriptor;
pub use device_manager::OpKind;
pub use error::NumericalError;
pub use error::NumericalErrorKind;
pub use error::ShapeMismatchError;
pub use error::TlBackendError;
pub use error::TlBackendResult;
pub use execution_mode::CompilationStats;
pub use execution_mode::CompiledGraph;
pub use execution_mode::ExecutionConfig;
pub use execution_mode::ExecutionMode;
pub use execution_mode::MemoryPlan;
pub use execution_mode::OptimizationConfig;
pub use executor_f32::Scirs2Exec32;
pub use executor_f32::Scirs2Tensor32;
pub use fallback::is_valid;
pub use fallback::sanitize_tensor;
pub use fallback::FallbackConfig;
pub use gather_scatter::gather;
pub use gather_scatter::gather_nd;
pub use gather_scatter::masked_fill;
pub use gather_scatter::masked_select;
pub use gather_scatter::scatter_add;
pub use gather_scatter::scatter_max;
pub use gather_scatter::scatter_min;
pub use gather_scatter::top_k;
pub use gather_scatter::GatherScatterError;
pub use gather_scatter::IndexStats;
pub use geometric_ops::gcn_layer;
pub use geometric_ops::graph_laplacian;
pub use geometric_ops::mat_mul;
pub use geometric_ops::sph_harm;
pub use geometric_ops::spherical_harmonics;
pub use geometric_ops::AdjacencyMatrix;
pub use geometric_ops::GcnActivation;
pub use geometric_ops::GeoError;
pub use geometric_ops::LaplacianMatrix;
pub use geometric_ops::LaplacianType;
pub use geometric_ops::Rotation3;
pub use gpu_readiness::assess_gpu_readiness;
pub use gpu_readiness::generate_recommendations;
pub use gpu_readiness::recommend_batch_size;
pub use gpu_readiness::GpuCapability;
pub use gpu_readiness::GpuReadinessReport;
pub use gpu_readiness::WorkloadProfile;
pub use gradient_ops::gumbel_softmax;
pub use gradient_ops::gumbel_softmax_backward;
pub use gradient_ops::soft_exists;
pub use gradient_ops::soft_exists_backward;
pub use gradient_ops::soft_forall;
pub use gradient_ops::soft_forall_backward;
pub use gradient_ops::ste_threshold;
pub use gradient_ops::ste_threshold_backward;
pub use gradient_ops::GumbelSoftmaxConfig;
pub use gradient_ops::QuantifierMode;
pub use gradient_ops::SteConfig;
pub use graph_optimizer::GraphOptimizer;
pub use graph_optimizer::GraphOptimizerBuilder;
pub use graph_optimizer::OptimizationPass;
pub use graph_optimizer::OptimizationStats;
pub use inplace_ops::can_execute_inplace;
pub use inplace_ops::is_shape_preserving;
pub use inplace_ops::InplaceExecutor;
pub use inplace_ops::InplaceStats;
pub use lazy::EvaluationPlan;
pub use lazy::LazyEinsumGraph;
pub use lazy::LazyExecutor;
pub use lazy::LazyStats;
pub use lazy::LazyTensor;
pub use lazy::NodeMemoryEstimate;
pub use memory_profiler::AllocationRecord;
pub use memory_profiler::AtomicMemoryCounter;
pub use memory_profiler::MemoryProfiler;
pub use memory_profiler::MemoryStats as ProfilerMemoryStats;
pub use metrics::format_bytes;
pub use metrics::shared_metrics;
pub use metrics::AtomicMetrics;
pub use metrics::MemoryStats;
pub use metrics::MetricsCollector;
pub use metrics::MetricsConfig;
pub use metrics::MetricsSummary;
pub use metrics::OperationRecord;
pub use metrics::OperationStats;
pub use metrics::SharedMetrics;
pub use metrics::ThroughputStats;
pub use parallel_executor::ParallelConfig;
pub use parallel_executor::ParallelScirs2Exec;
pub use parallel_executor::ParallelStats;
pub use pooling::adaptive_avg_pool;
pub use pooling::avg_pool;
pub use pooling::global_avg_pool;
pub use pooling::global_max_pool;
pub use pooling::lp_pool;
pub use pooling::max_pool;
pub use pooling::max_pool_with_indices;
pub use pooling::max_unpool;
pub use pooling::PoolConfig;
pub use pooling::PoolingError;
pub use pooling::PoolingStats;
pub use precision::ComputePrecision;
pub use precision::Precision;
pub use precision::PrecisionConfig;
pub use precision::Scalar;
pub use precision_cast::cast_f32_to_f64;
pub use precision_cast::cast_f64_to_f32;
pub use precision_cast::DualPrecisionBridge;
pub use profiled_executor::ProfiledScirs2Exec;
pub use quantization::calibrate_quantization;
pub use quantization::QatConfig;
pub use quantization::QuantizationGranularity;
pub use quantization::QuantizationParams;
pub use quantization::QuantizationScheme;
pub use quantization::QuantizationStats;
pub use quantization::QuantizationType;
pub use quantization::QuantizedTensor;
pub use recurrent::gru_sequence;
pub use recurrent::lstm_sequence;
pub use recurrent::rnn_sequence;
pub use recurrent::GruCell;
pub use recurrent::LstmCell;
pub use recurrent::LstmState;
pub use recurrent::RecurrentError;
pub use recurrent::RecurrentStats;
pub use recurrent::RnnCell;
pub use scoring::log_sum_exp;
pub use scoring::weighted_soft_exists;
pub use scoring::weighted_soft_forall;
pub use scoring::LogSpaceAggregator;
pub use scoring::ScoringConfig;
pub use scoring::ScoringError;
pub use scoring::ScoringMode;
pub use scoring::WeightedQuantifier;
pub use shape_inference::validate_tensor_shapes;
pub use shape_inference::Scirs2ShapeInference;
pub use signal_ops::apply_window;
pub use signal_ops::dct;
pub use signal_ops::dft;
pub use signal_ops::fir_filter;
pub use signal_ops::hz_to_mel;
pub use signal_ops::idct;
pub use signal_ops::idft;
pub use signal_ops::istft;
pub use signal_ops::mel_filterbank;
pub use signal_ops::mel_to_hz;
pub use signal_ops::stft;
pub use signal_ops::window;
pub use signal_ops::Complex;
pub use signal_ops::FirFilter;
pub use signal_ops::SignalError;
pub use signal_ops::StftResult;
pub use signal_ops::WindowType;
pub use tensor_io::load_tensor;
pub use tensor_io::load_tensors;
pub use tensor_io::read_header;
pub use tensor_io::read_tensor;
pub use tensor_io::save_tensor;
pub use tensor_io::save_tensors;
pub use tensor_io::write_tensor;
pub use tensor_io::TensorHeader;
pub use tensor_io::TensorIoError;
pub use tensor_loss::LossReduction;
pub use tensor_loss::TensorBCELoss;
pub use tensor_loss::TensorCosineEmbeddingLoss;
pub use tensor_loss::TensorCrossEntropyLoss;
pub use tensor_loss::TensorFocalLoss;
pub use tensor_loss::TensorHuberLoss;
pub use tensor_loss::TensorKLDivLoss;
pub use tensor_loss::TensorLoss;
pub use tensor_loss::TensorLossConfig;
pub use tensor_loss::TensorLossError;
pub use tensor_loss::TensorLossOutput;
pub use tensor_loss::TensorLossRegistry;
pub use tensor_loss::TensorMseLoss;
pub use tracing::ExecutionTracer;
pub use tracing::TraceEvent;
pub use tracing::TraceLevel;

Modules§

activations
Activation functions for neural network layers.
attention
Numerically stable attention operations for TensorLogic.
attention_grad
Backward pass for attention operations.
batch_executor
Batch execution support for parallel processing.
blocked_sparse
Blocked Sparse Row (BSR) format tensor operations.
capabilities
Backend capability detection and reporting.
checkpoint
Checkpoint and resume functionality for training workflows.
comparison
Tensor comparison utilities for testing and validation.
convolution
Convolution operations for neural network tensor processing.
cuda_detect
CUDA device detection utilities.
custom_ops
Custom operations infrastructure with dynamic registration.
decomposition
Tensor decomposition algorithms for the SciRS2 backend.
dependency_analyzer
Dependency analysis for parallel execution of EinsumGraph operations.
device
Device management for tensor computations.
device_manager
Operation-level device selection and management.
error
Comprehensive error types for tensorlogic-scirs-backend.
execution_mode
Execution mode abstractions for different execution strategies.
executor_f32
SciRS2 f32 executor implementation.
fallback
Fallback mechanisms for numerical stability.
fusion
Operation fusion for improved performance.
gather_scatter
Gather/Scatter operations for tensor indexing and selection.
geometric_ops
Geometric deep learning operations.
gpu_readiness
GPU readiness assessment framework.
gradient_check
Numeric gradient checking utilities for verifying analytical gradients.
gradient_ops
Advanced gradient operations for non-differentiable logical operations.
graph_optimizer
Graph optimization passes for improved execution performance.
inplace_ops
In-place operations for memory optimization.
lazy
Lazy evaluation for large EinsumGraphs.
memory_pool
Memory pooling for efficient tensor allocation.
memory_profiler
Memory profiling utilities for TensorLogic.
metrics
Comprehensive performance monitoring and metrics collection.
parallel_executor
Parallel executor implementation using Rayon for multi-threaded execution.
pooling
Pooling operations for neural network tensor processing.
precision
Precision control for tensor computations.
precision_cast
Utilities for casting between f32 and f64 tensors, and a dual-precision bridge.
profiled_executor
Performance profiling support for execution monitoring.
quantization
Quantization infrastructure for TensorLogic.
recurrent
Recurrent neural network cells: RNN, LSTM, GRU.
scoring
Log-space scoring aggregation and weighted quantifiers.
shape_inference
Shape inference and validation support.
signal_ops
Signal processing operations for audio and time-series data.
tensor_io
Tensor binary serialization and deserialization.
tensor_loss
Tensor-level loss functions operating on ArrayD<f64> with optional gradient output.
tracing
Execution tracing and debugging support.

Structs§

ForwardTape
Stores intermediate values from the forward pass for gradient computation.
Scirs2Exec
Core SciRS2-backed executor implementing the TensorLogic execution traits.
Type Aliases§

Scirs2Tensor