SciRS2-backed executor (CPU/SIMD/GPU via features).
Version: 0.1.0-beta.1 | Status: Production Ready
This crate provides a production-ready implementation of the TensorLogic execution traits using the SciRS2 scientific computing library.
§Core Features
§Execution Engine
- Forward pass: Tensor operations (einsum, element-wise, reductions)
- Backward pass: Automatic differentiation with stored intermediate values
- Gradient checking: Numeric verification for correctness
- Batch execution: Parallel processing support for multiple inputs
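The gradient checking mentioned above can be illustrated with a central-difference estimate that is compared against an analytic gradient. This is a minimal standalone sketch using only the standard library; `numeric_gradient` and `max_abs_error` are hypothetical names chosen for illustration and are not taken from the crate's gradient_check module.

```rust
/// Central-difference estimate of the gradient of `f` at `x`.
/// (Hypothetical helper; not the crate's API.)
fn numeric_gradient<F: Fn(&[f64]) -> f64>(f: F, x: &[f64], eps: f64) -> Vec<f64> {
    let mut point = x.to_vec();
    let mut grad = Vec::with_capacity(x.len());
    for i in 0..x.len() {
        let original = point[i];
        // Perturb one coordinate in each direction and restore it afterwards.
        point[i] = original + eps;
        let plus = f(&point);
        point[i] = original - eps;
        let minus = f(&point);
        point[i] = original;
        grad.push((plus - minus) / (2.0 * eps));
    }
    grad
}

/// Largest absolute difference between an analytic and a numeric gradient.
fn max_abs_error(analytic: &[f64], numeric: &[f64]) -> f64 {
    analytic
        .iter()
        .zip(numeric)
        .map(|(a, n)| (a - n).abs())
        .fold(0.0, f64::max)
}
```

A check would typically pass when `max_abs_error` stays below a tolerance chosen for the precision in use; the right tolerance depends on `eps` and on the scale of the function.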
§Performance
- Memory pooling: Efficient tensor allocation with shape-based reuse
- Operation fusion: Analysis and optimization opportunities
- SIMD support: Vectorized operations via feature flags
- Profiling: Detailed performance monitoring and tracing
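Shape-based reuse, as named in the memory pooling bullet, can be sketched as a map from shape to a stack of released buffers. The `ShapePool` type below is a hypothetical illustration built on the standard library only; the crate's memory_pool module may be organized quite differently.

```rust
use std::collections::HashMap;

/// Hypothetical pool that recycles flat f64 buffers keyed by tensor shape.
#[derive(Default)]
struct ShapePool {
    free: HashMap<Vec<usize>, Vec<Vec<f64>>>,
    hits: usize,
    misses: usize,
}

impl ShapePool {
    /// Returns a zeroed buffer for `shape`, reusing a released one if available.
    fn acquire(&mut self, shape: &[usize]) -> Vec<f64> {
        let reused = self.free.get_mut(shape).and_then(|bufs| bufs.pop());
        match reused {
            Some(mut buf) => {
                self.hits += 1;
                buf.fill(0.0); // clear stale contents before handing it out
                buf
            }
            None => {
                self.misses += 1;
                vec![0.0; shape.iter().product()]
            }
        }
    }

    /// Hands a buffer back to the pool so a later `acquire` can reuse it.
    fn release(&mut self, shape: &[usize], buf: Vec<f64>) {
        self.free.entry(shape.to_vec()).or_default().push(buf);
    }
}
```

Keying on the full shape rather than the element count keeps the sketch simple; a pool keyed on length alone would reuse more buffers at the cost of tracking shapes separately.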
§Reliability
- Error handling: Comprehensive error types with detailed context
- Execution tracing: Multi-level debugging and operation tracking
- Numerical stability: Fallback mechanisms for NaN/Inf handling
- Shape validation: Runtime shape inference and verification
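One plausible form of a NaN/Inf fallback is to replace non-finite entries with bounded substitutes and report how many were changed. The functions below are a hedged sketch with hypothetical names (`all_finite`, `sanitize_in_place`); the signatures of the crate's own `is_valid` and `sanitize_tensor` are not shown on this page and may differ.

```rust
/// True when every value is finite (neither NaN nor infinite).
fn all_finite(values: &[f64]) -> bool {
    values.iter().all(|v| v.is_finite())
}

/// Replaces NaN with `nan_value` and +/-infinity with +/-`limit`.
/// Assumes `limit` is positive. Returns the number of replaced entries.
fn sanitize_in_place(values: &mut [f64], nan_value: f64, limit: f64) -> usize {
    let mut replaced = 0;
    for v in values.iter_mut() {
        if v.is_nan() {
            *v = nan_value;
            replaced += 1;
        } else if v.is_infinite() {
            // Keep the sign of the original infinity.
            *v = limit.copysign(*v);
            replaced += 1;
        }
    }
    replaced
}
```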
§Testing
- 104 tests: Including unit, integration, and property-based tests
- Property tests: Mathematical properties verified with proptest
- Gradient tests: Numeric gradient checking for autodiff correctness
§Module Organization
- executor: Core Scirs2Exec implementation
- autodiff: Backward pass and gradient computation
- gradient_ops: Advanced gradient operations (STE, Gumbel-Softmax, soft quantifiers)
- error: Comprehensive error types and validation
- fallback: Numerical stability and NaN/Inf handling
- tracing: Execution debugging and performance tracking
- memory_pool: Efficient tensor allocation
- fusion: Operation fusion analysis
- gradient_check: Numeric gradient verification
- shape_inference: Runtime shape validation
- batch_executor: Parallel batch processing
- profiled_executor: Performance profiling wrapper
- capabilities: Runtime capability detection
- dependency_analyzer: Graph dependency analysis for parallel execution
- parallel_executor: Multi-threaded parallel execution using Rayon
- device: Device management (CPU/GPU selection)
- execution_mode: Execution mode abstractions (Eager/Graph/JIT)
- precision: Precision control (f32/f64/mixed)
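To make the gradient_ops entry more concrete, here is a sketch of two of the ideas it names: a straight-through estimator (STE) for a hard threshold, and soft quantifiers over truth values in [0, 1]. The product-based relaxation of the quantifiers is an assumption picked for illustration, one of several common choices, and all function names are hypothetical rather than the crate's `ste_threshold` or `soft_exists` signatures.

```rust
/// Forward pass of a hard threshold: 1.0 where x > threshold, else 0.0.
fn ste_forward(x: &[f64], threshold: f64) -> Vec<f64> {
    x.iter().map(|&v| if v > threshold { 1.0 } else { 0.0 }).collect()
}

/// Straight-through backward pass: the step function has zero gradient
/// almost everywhere, so the upstream gradient is passed through unchanged.
fn ste_backward(grad_output: &[f64]) -> Vec<f64> {
    grad_output.to_vec()
}

/// Soft "exists" under a product relaxation (assumed): 1 - prod(1 - p_i).
fn soft_exists_prob(p: &[f64]) -> f64 {
    1.0 - p.iter().map(|&v| 1.0 - v).product::<f64>()
}

/// Soft "forall" under the same relaxation: prod(p_i).
fn soft_forall_prob(p: &[f64]) -> f64 {
    p.iter().product()
}
```

Both quantifier forms are differentiable in every `p_i`, which is what allows them to sit inside a backward pass where a hard any/all would not.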
Re-exports§
pub use batch_executor::ParallelBatchExecutor;
pub use checkpoint::Checkpoint;
pub use checkpoint::CheckpointConfig;
pub use checkpoint::CheckpointManager;
pub use checkpoint::CheckpointMetadata;
pub use cuda_detect::cuda_device_count;
pub use cuda_detect::cuda_devices_to_device_list;
pub use cuda_detect::detect_cuda_devices;
pub use cuda_detect::is_cuda_available;
pub use cuda_detect::CudaDeviceInfo;
pub use custom_ops::BinaryCustomOp;
pub use custom_ops::CustomOp;
pub use custom_ops::CustomOpContext;
pub use custom_ops::EluOp;
pub use custom_ops::GeluOp;
pub use custom_ops::HardSigmoidOp;
pub use custom_ops::HardSwishOp;
pub use custom_ops::LeakyReluOp;
pub use custom_ops::MishOp;
pub use custom_ops::OpRegistry;
pub use custom_ops::SoftplusOp;
pub use custom_ops::SwishOp;
pub use dependency_analyzer::DependencyAnalysis;
pub use dependency_analyzer::DependencyStats;
pub use dependency_analyzer::OperationDependency;
pub use device::Device;
pub use device::DeviceError;
pub use device::DeviceManager;
pub use device::DeviceType;
pub use error::NumericalError;
pub use error::NumericalErrorKind;
pub use error::ShapeMismatchError;
pub use error::TlBackendError;
pub use error::TlBackendResult;
pub use execution_mode::CompilationStats;
pub use execution_mode::CompiledGraph;
pub use execution_mode::ExecutionConfig;
pub use execution_mode::ExecutionMode;
pub use execution_mode::MemoryPlan;
pub use execution_mode::OptimizationConfig;
pub use fallback::is_valid;
pub use fallback::sanitize_tensor;
pub use fallback::FallbackConfig;
pub use gpu_readiness::assess_gpu_readiness;
pub use gpu_readiness::generate_recommendations;
pub use gpu_readiness::recommend_batch_size;
pub use gpu_readiness::GpuCapability;
pub use gpu_readiness::GpuReadinessReport;
pub use gpu_readiness::WorkloadProfile;
pub use gradient_ops::gumbel_softmax;
pub use gradient_ops::gumbel_softmax_backward;
pub use gradient_ops::soft_exists;
pub use gradient_ops::soft_exists_backward;
pub use gradient_ops::soft_forall;
pub use gradient_ops::soft_forall_backward;
pub use gradient_ops::ste_threshold;
pub use gradient_ops::ste_threshold_backward;
pub use gradient_ops::GumbelSoftmaxConfig;
pub use gradient_ops::QuantifierMode;
pub use gradient_ops::SteConfig;
pub use graph_optimizer::GraphOptimizer;
pub use graph_optimizer::GraphOptimizerBuilder;
pub use graph_optimizer::OptimizationPass;
pub use graph_optimizer::OptimizationStats;
pub use inplace_ops::can_execute_inplace;
pub use inplace_ops::is_shape_preserving;
pub use inplace_ops::InplaceExecutor;
pub use inplace_ops::InplaceStats;
pub use memory_profiler::AllocationRecord;
pub use memory_profiler::AtomicMemoryCounter;
pub use memory_profiler::MemoryProfiler;
pub use memory_profiler::MemoryStats as ProfilerMemoryStats;
pub use metrics::format_bytes;
pub use metrics::AtomicMetrics;
pub use metrics::MemoryStats;
pub use metrics::MetricsCollector;
pub use metrics::MetricsConfig;
pub use metrics::MetricsSummary;
pub use metrics::OperationRecord;
pub use metrics::OperationStats;
pub use metrics::ThroughputStats;
pub use parallel_executor::ParallelConfig;
pub use parallel_executor::ParallelScirs2Exec;
pub use parallel_executor::ParallelStats;
pub use precision::ComputePrecision;
pub use precision::Precision;
pub use precision::PrecisionConfig;
pub use precision::Scalar;
pub use profiled_executor::ProfiledScirs2Exec;
pub use quantization::calibrate_quantization;
pub use quantization::QatConfig;
pub use quantization::QuantizationGranularity;
pub use quantization::QuantizationParams;
pub use quantization::QuantizationScheme;
pub use quantization::QuantizationStats;
pub use quantization::QuantizationType;
pub use quantization::QuantizedTensor;
pub use shape_inference::validate_tensor_shapes;
pub use shape_inference::Scirs2ShapeInference;
pub use tracing::ExecutionTracer;
pub use tracing::TraceEvent;
pub use tracing::TraceLevel;
Modules§
- batch_executor: Batch execution support for parallel processing.
- capabilities: Backend capability detection and reporting.
- checkpoint: Checkpoint and resume functionality for training workflows.
- cuda_detect: CUDA device detection utilities.
- custom_ops: Custom operations infrastructure with dynamic registration.
- dependency_analyzer: Dependency analysis for parallel execution of EinsumGraph operations.
- device: Device management for tensor computations.
- error: Comprehensive error types for tensorlogic-scirs-backend.
- execution_mode: Execution mode abstractions for different execution strategies.
- fallback: Fallback mechanisms for numerical stability.
- fusion: Operation fusion for improved performance.
- gpu_readiness: GPU Readiness Framework
- gradient_check: Numeric gradient checking utilities for verifying analytical gradients.
- gradient_ops: Advanced gradient operations for non-differentiable logical operations.
- graph_optimizer: Graph optimization passes for improved execution performance.
- inplace_ops: In-place operations for memory optimization.
- memory_pool: Memory pooling for efficient tensor allocation.
- memory_profiler: Memory Profiling Utilities for TensorLogic
- metrics: Comprehensive performance monitoring and metrics collection.
- parallel_executor: Parallel executor implementation using Rayon for multi-threaded execution.
- precision: Precision control for tensor computations.
- profiled_executor: Performance profiling support for execution monitoring.
- quantization: Quantization Infrastructure for TensorLogic
- shape_inference: Shape inference and validation support.
- tracing: Execution tracing and debugging support.
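The dependency analysis for parallel execution described in the module list can be pictured as grouping operations into levels, where every operation in a level depends only on operations in earlier levels and can therefore run concurrently with its level-mates. The function below is a hypothetical standalone sketch, not the dependency_analyzer API; it assumes the operations are already listed in topological order, so each dependency index is smaller than the index of the operation that uses it.

```rust
/// Groups operations into levels that could execute in parallel.
/// `deps[i]` lists the indices of the operations that operation `i` reads from.
/// Assumes every index in `deps[i]` is less than `i` (topological order).
fn dependency_levels(deps: &[Vec<usize>]) -> Vec<Vec<usize>> {
    let mut level = vec![0usize; deps.len()];
    for (i, d) in deps.iter().enumerate() {
        // An operation sits one level below its deepest dependency.
        let l = d.iter().map(|&j| level[j] + 1).max().unwrap_or(0);
        level[i] = l;
    }
    let depth = level.iter().max().map_or(0, |&m| m + 1);
    let mut groups: Vec<Vec<usize>> = vec![Vec::new(); depth];
    for (i, &l) in level.iter().enumerate() {
        groups[l].push(i);
    }
    groups
}
```

A scheduler built on this idea would run each level to completion before starting the next, for example by handing the operations of one level to a thread pool.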
Structs§
- ForwardTape: Stores intermediate values from the forward pass for gradient computation.
- Scirs2Exec