Crate tensorlogic_scirs_backend

SciRS2-backed executor (CPU/SIMD/GPU via features).

Version: 0.1.0-beta.1 | Status: Production Ready

This crate provides a production-ready implementation of the TensorLogic execution traits using the SciRS2 scientific computing library.
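
For orientation, the sketch below names re-exported entry points from this page and outlines the intended flow. Only the `use` paths are taken from the re-export list below; the commented-out calls are assumed, unverified API shapes (see the executor, device, and execution_mode modules for the real signatures).

```rust
// Orientation sketch only: the `use` paths come from this crate's re-exports;
// the commented-out calls are assumed API shapes, not verified signatures.
#[allow(unused_imports)]
use tensorlogic_scirs_backend::{Device, ExecutionConfig, ForwardTape, Scirs2Exec};

fn main() {
    // 1. Choose a device and execution configuration
    //    (constructors assumed; see the `device` and `execution_mode` modules):
    // let device = Device::cpu();
    // let config = ExecutionConfig::default();

    // 2. Build the SciRS2-backed executor and run a forward pass over an
    //    einsum graph; intermediate values are recorded in a ForwardTape.
    // let exec = Scirs2Exec::new();

    // 3. Run the backward pass to obtain gradients, optionally verifying
    //    them with the gradient_check utilities.
}
```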

§Core Features

§Execution Engine

  • Forward pass: Tensor operations (einsum, element-wise, reductions)
  • Backward pass: Automatic differentiation with stored intermediate values
  • Gradient checking: Numeric verification of analytical gradients for correctness (see the sketch after this list)
  • Batch execution: Parallel processing support for multiple inputs
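
Gradient checking compares analytical gradients against finite differences. The self-contained sketch below illustrates that technique with a central difference; it is not the gradient_check module's actual API (see that module for the real utilities).

```rust
/// Central-difference estimate of df/dx_i: (f(x + h·e_i) - f(x - h·e_i)) / (2h).
/// Self-contained illustration of the gradient-checking technique; the crate's
/// gradient_check module provides its own utilities for this.
fn numeric_grad(f: impl Fn(&[f64]) -> f64, x: &[f64], i: usize, h: f64) -> f64 {
    let (mut plus, mut minus) = (x.to_vec(), x.to_vec());
    plus[i] += h;
    minus[i] -= h;
    (f(&plus) - f(&minus)) / (2.0 * h)
}

fn main() {
    // f(x) = sum(x_i^2) has the analytical gradient 2 * x_i.
    let f = |x: &[f64]| x.iter().map(|v| v * v).sum::<f64>();
    let x = [1.0, -2.0, 3.0];
    for i in 0..x.len() {
        let analytic = 2.0 * x[i];
        let numeric = numeric_grad(f, &x, i, 1e-5);
        assert!((analytic - numeric).abs() < 1e-6, "gradient mismatch at {i}");
    }
}
```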

§Performance

  • Memory pooling: Efficient tensor allocation with shape-based buffer reuse (sketched after this list)
  • Operation fusion: Analysis of the operation graph for fusion opportunities to improve performance
  • SIMD support: Vectorized operations via feature flags
  • Profiling: Detailed performance monitoring and tracing
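
The shape-based reuse strategy behind the memory pool can be pictured as a map from tensor shape to a bin of free buffers, as in the minimal sketch below. This illustrates the strategy named above, not the memory_pool module's actual types.

```rust
use std::collections::HashMap;

/// Minimal shape-keyed buffer pool: released buffers are binned by shape so a
/// later request for the same shape can reuse one instead of reallocating.
/// Illustration of the strategy only, not the crate's memory_pool API.
struct ShapePool {
    free: HashMap<Vec<usize>, Vec<Vec<f64>>>,
}

impl ShapePool {
    fn new() -> Self {
        Self { free: HashMap::new() }
    }

    /// Reuse a free buffer of this shape if available, otherwise allocate.
    fn acquire(&mut self, shape: &[usize]) -> Vec<f64> {
        let len: usize = shape.iter().product();
        self.free
            .get_mut(shape)
            .and_then(|bin| bin.pop())
            .unwrap_or_else(|| vec![0.0; len])
    }

    /// Return a buffer to the pool under its shape key.
    fn release(&mut self, shape: &[usize], buf: Vec<f64>) {
        self.free.entry(shape.to_vec()).or_default().push(buf);
    }
}

fn main() {
    let mut pool = ShapePool::new();
    let a = pool.acquire(&[2, 3]);
    pool.release(&[2, 3], a);
    // The next request for shape [2, 3] reuses the buffer released above.
    let b = pool.acquire(&[2, 3]);
    assert_eq!(b.len(), 6);
}
```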

§Reliability

  • Error handling: Comprehensive error types with detailed context
  • Execution tracing: Multi-level debugging and operation tracking
  • Numerical stability: Fallback mechanisms for NaN/Inf handling (illustrated after this list)
  • Shape validation: Runtime shape inference and verification
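
The NaN/Inf fallback idea is sketched below as a standalone function that replaces non-finite values with finite substitutes. The fallback module re-exports is_valid and sanitize_tensor for this purpose; the function here is an independent illustration, not the crate's implementation.

```rust
/// Replace non-finite values: NaN becomes a chosen default, +/-inf is clamped
/// to a finite magnitude with the original sign. Independent illustration of
/// the fallback idea; the crate's sanitize_tensor may behave differently.
fn sanitize(values: &mut [f64], nan_replacement: f64, clamp: f64) {
    for v in values.iter_mut() {
        if v.is_nan() {
            *v = nan_replacement;
        } else if v.is_infinite() {
            *v = clamp.copysign(*v);
        }
    }
}

fn main() {
    let mut t = [1.0, f64::NAN, f64::INFINITY, f64::NEG_INFINITY, -2.5];
    sanitize(&mut t, 0.0, 1e6);
    assert!(t.iter().all(|v| v.is_finite()));
    assert_eq!(t, [1.0, 0.0, 1e6, -1e6, -2.5]);
}
```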

§Testing

  • 104 tests: Including unit, integration, and property-based tests
  • Property tests: Mathematical properties verified with proptest
  • Gradient tests: Numeric gradient checking for autodiff correctness

§Module Organization

  • executor: Core Scirs2Exec implementation
  • autodiff: Backward pass and gradient computation
  • gradient_ops: Advanced gradient operations (STE, Gumbel-Softmax, soft quantifiers); the STE idea is sketched after this list
  • error: Comprehensive error types and validation
  • fallback: Numerical stability and NaN/Inf handling
  • tracing: Execution debugging and performance tracking
  • memory_pool: Efficient tensor allocation
  • fusion: Operation fusion analysis
  • gradient_check: Numeric gradient verification
  • shape_inference: Runtime shape validation
  • batch_executor: Parallel batch processing
  • profiled_executor: Performance profiling wrapper
  • capabilities: Runtime capability detection
  • dependency_analyzer: Graph dependency analysis for parallel execution
  • parallel_executor: Multi-threaded parallel execution using Rayon
  • device: Device management (CPU/GPU selection)
  • execution_mode: Execution mode abstractions (Eager/Graph/JIT)
  • precision: Precision control (f32/f64/mixed)
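
Among the gradient_ops techniques, the straight-through estimator (STE) is the simplest to picture: the forward pass applies a hard, non-differentiable threshold, while the backward pass lets the upstream gradient through (here gated to inputs near the threshold). The sketch below illustrates that idea only; the crate's ste_threshold / ste_threshold_backward and SteConfig define the actual API.

```rust
/// STE forward: hard 0/1 threshold (non-differentiable step function).
/// Illustration of the technique; not the crate's ste_threshold signature.
fn ste_forward(x: &[f64], threshold: f64) -> Vec<f64> {
    x.iter().map(|&v| if v >= threshold { 1.0 } else { 0.0 }).collect()
}

/// STE backward: pass the upstream gradient through unchanged, optionally
/// gated to a window around the threshold to limit where gradients flow.
fn ste_backward(x: &[f64], upstream: &[f64], threshold: f64, window: f64) -> Vec<f64> {
    x.iter()
        .zip(upstream)
        .map(|(&xi, &g)| if (xi - threshold).abs() <= window { g } else { 0.0 })
        .collect()
}

fn main() {
    let x = [0.2, 0.45, 0.9];
    assert_eq!(ste_forward(&x, 0.5), [0.0, 0.0, 1.0]);

    // Gradients flow through only where the input is within 0.25 of the threshold.
    assert_eq!(ste_backward(&x, &[1.0, 1.0, 1.0], 0.5, 0.25), [0.0, 1.0, 0.0]);
}
```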

Re-exports§

pub use batch_executor::ParallelBatchExecutor;
pub use checkpoint::Checkpoint;
pub use checkpoint::CheckpointConfig;
pub use checkpoint::CheckpointManager;
pub use checkpoint::CheckpointMetadata;
pub use cuda_detect::cuda_device_count;
pub use cuda_detect::cuda_devices_to_device_list;
pub use cuda_detect::detect_cuda_devices;
pub use cuda_detect::is_cuda_available;
pub use cuda_detect::CudaDeviceInfo;
pub use custom_ops::BinaryCustomOp;
pub use custom_ops::CustomOp;
pub use custom_ops::CustomOpContext;
pub use custom_ops::EluOp;
pub use custom_ops::GeluOp;
pub use custom_ops::HardSigmoidOp;
pub use custom_ops::HardSwishOp;
pub use custom_ops::LeakyReluOp;
pub use custom_ops::MishOp;
pub use custom_ops::OpRegistry;
pub use custom_ops::SoftplusOp;
pub use custom_ops::SwishOp;
pub use dependency_analyzer::DependencyAnalysis;
pub use dependency_analyzer::DependencyStats;
pub use dependency_analyzer::OperationDependency;
pub use device::Device;
pub use device::DeviceError;
pub use device::DeviceManager;
pub use device::DeviceType;
pub use error::NumericalError;
pub use error::NumericalErrorKind;
pub use error::ShapeMismatchError;
pub use error::TlBackendError;
pub use error::TlBackendResult;
pub use execution_mode::CompilationStats;
pub use execution_mode::CompiledGraph;
pub use execution_mode::ExecutionConfig;
pub use execution_mode::ExecutionMode;
pub use execution_mode::MemoryPlan;
pub use execution_mode::OptimizationConfig;
pub use fallback::is_valid;
pub use fallback::sanitize_tensor;
pub use fallback::FallbackConfig;
pub use gpu_readiness::assess_gpu_readiness;
pub use gpu_readiness::generate_recommendations;
pub use gpu_readiness::recommend_batch_size;
pub use gpu_readiness::GpuCapability;
pub use gpu_readiness::GpuReadinessReport;
pub use gpu_readiness::WorkloadProfile;
pub use gradient_ops::gumbel_softmax;
pub use gradient_ops::gumbel_softmax_backward;
pub use gradient_ops::soft_exists;
pub use gradient_ops::soft_exists_backward;
pub use gradient_ops::soft_forall;
pub use gradient_ops::soft_forall_backward;
pub use gradient_ops::ste_threshold;
pub use gradient_ops::ste_threshold_backward;
pub use gradient_ops::GumbelSoftmaxConfig;
pub use gradient_ops::QuantifierMode;
pub use gradient_ops::SteConfig;
pub use graph_optimizer::GraphOptimizer;
pub use graph_optimizer::GraphOptimizerBuilder;
pub use graph_optimizer::OptimizationPass;
pub use graph_optimizer::OptimizationStats;
pub use inplace_ops::can_execute_inplace;
pub use inplace_ops::is_shape_preserving;
pub use inplace_ops::InplaceExecutor;
pub use inplace_ops::InplaceStats;
pub use memory_profiler::AllocationRecord;
pub use memory_profiler::AtomicMemoryCounter;
pub use memory_profiler::MemoryProfiler;
pub use memory_profiler::MemoryStats as ProfilerMemoryStats;
pub use metrics::format_bytes;
pub use metrics::shared_metrics;
pub use metrics::AtomicMetrics;
pub use metrics::MemoryStats;
pub use metrics::MetricsCollector;
pub use metrics::MetricsConfig;
pub use metrics::MetricsSummary;
pub use metrics::OperationRecord;
pub use metrics::OperationStats;
pub use metrics::SharedMetrics;
pub use metrics::ThroughputStats;
pub use parallel_executor::ParallelConfig;
pub use parallel_executor::ParallelScirs2Exec;
pub use parallel_executor::ParallelStats;
pub use precision::ComputePrecision;
pub use precision::Precision;
pub use precision::PrecisionConfig;
pub use precision::Scalar;
pub use profiled_executor::ProfiledScirs2Exec;
pub use quantization::calibrate_quantization;
pub use quantization::QatConfig;
pub use quantization::QuantizationGranularity;
pub use quantization::QuantizationParams;
pub use quantization::QuantizationScheme;
pub use quantization::QuantizationStats;
pub use quantization::QuantizationType;
pub use quantization::QuantizedTensor;
pub use shape_inference::validate_tensor_shapes;
pub use shape_inference::Scirs2ShapeInference;
pub use tracing::ExecutionTracer;
pub use tracing::TraceEvent;
pub use tracing::TraceLevel;

Modules§

batch_executor: Batch execution support for parallel processing.
capabilities: Backend capability detection and reporting.
checkpoint: Checkpoint and resume functionality for training workflows.
cuda_detect: CUDA device detection utilities.
custom_ops: Custom operations infrastructure with dynamic registration.
dependency_analyzer: Dependency analysis for parallel execution of EinsumGraph operations.
device: Device management for tensor computations.
error: Comprehensive error types for tensorlogic-scirs-backend.
execution_mode: Execution mode abstractions for different execution strategies.
fallback: Fallback mechanisms for numerical stability.
fusion: Operation fusion for improved performance.
gpu_readiness: GPU readiness assessment framework.
gradient_check: Numeric gradient checking utilities for verifying analytical gradients.
gradient_ops: Advanced gradient operations for non-differentiable logical operations.
graph_optimizer: Graph optimization passes for improved execution performance.
inplace_ops: In-place operations for memory optimization.
memory_pool: Memory pooling for efficient tensor allocation.
memory_profiler: Memory profiling utilities for TensorLogic.
metrics: Comprehensive performance monitoring and metrics collection.
parallel_executor: Parallel executor implementation using Rayon for multi-threaded execution.
precision: Precision control for tensor computations.
profiled_executor: Performance profiling support for execution monitoring.
quantization: Quantization infrastructure for TensorLogic.
shape_inference: Shape inference and validation support.
tracing: Execution tracing and debugging support.

Structs§

ForwardTape: Stores intermediate values from the forward pass for gradient computation.
Scirs2Exec: The core SciRS2-backed executor implementation.
Type Aliases§

Scirs2Tensor