
Crate torsh_backend


Unified backend implementation for ToRSh

This crate provides a unified backend system that integrates with SciRS2’s compute backends. All backend implementations are included in this single crate and selected via feature flags.

§Features

  • cpu (default): CPU backend with SIMD optimizations via scirs2-core
  • cuda: NVIDIA GPU backend via scirs2-core’s CUDA support
  • metal: Apple GPU backend via scirs2-core’s Metal/MPS support
  • rocm: AMD GPU backend (when available in scirs2-core)
  • webgpu: WebGPU backend (when available in scirs2-core)

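Selecting a backend is therefore a Cargo feature choice rather than an API call. A minimal sketch of what this might look like in a downstream crate's Cargo.toml (the dependency name matches this crate; the version numbers are placeholders, not real releases):

```toml
# Default build: pulls in the "cpu" feature (SIMD-optimized CPU backend).
[dependencies]
torsh-backend = "0.1"  # placeholder version

# Alternative: opt into the CUDA backend alongside the CPU fallback
# by disabling defaults and listing features explicitly:
#
# [dependencies]
# torsh-backend = { version = "0.1", default-features = false, features = ["cpu", "cuda"] }
```

Since all backends live in this one crate, enabling an extra feature adds a backend without changing any dependency paths in downstream code.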
Re-exports§

pub use adaptive_kernel_selection::AdaptiveKernelSelector;
pub use adaptive_kernel_selection::AdaptiveSelectionConfig;
pub use adaptive_kernel_selection::BenchmarkResult;
pub use adaptive_kernel_selection::BenchmarkResults;
pub use adaptive_kernel_selection::CustomKernel;
pub use adaptive_kernel_selection::HybridConfig;
pub use adaptive_kernel_selection::KernelCharacteristics;
pub use adaptive_kernel_selection::KernelConstraints;
pub use adaptive_kernel_selection::KernelExecutor;
pub use adaptive_kernel_selection::KernelImplementation;
pub use adaptive_kernel_selection::KernelInputs;
pub use adaptive_kernel_selection::KernelOutputs;
pub use adaptive_kernel_selection::KernelParameter;
pub use adaptive_kernel_selection::KernelPerformanceRecord;
pub use adaptive_kernel_selection::KernelRegistry;
pub use adaptive_kernel_selection::KernelSelection;
pub use adaptive_kernel_selection::KernelUsageStats;
pub use adaptive_kernel_selection::KernelVariant;
pub use adaptive_kernel_selection::MLBasedConfig;
pub use adaptive_kernel_selection::MLModelType;
pub use adaptive_kernel_selection::MLTrainingParams;
pub use adaptive_kernel_selection::PerformanceTracker;
pub use adaptive_kernel_selection::ResourceRequirements;
pub use adaptive_kernel_selection::ScalabilityCharacteristics;
pub use adaptive_kernel_selection::ScalingBehavior;
pub use adaptive_kernel_selection::ScoreBasedConfig;
pub use adaptive_kernel_selection::SelectionAccuracyTracker;
pub use adaptive_kernel_selection::SelectionAlgorithm;
pub use adaptive_kernel_selection::SelectionReason;
pub use adaptive_kernel_selection::SelectionStatistics;
pub use backend::Backend;
pub use backend::BackendCapabilities;
pub use backend::BackendCore;
pub use backend::BackendDeviceManager;
pub use backend::BackendExecutor;
pub use backend::BackendExtension;
pub use backend::BackendExtensionRegistry;
pub use backend::BackendFactory;
pub use backend::BackendLifecycle;
pub use backend::BackendOperations;
pub use backend::BackendOps;
pub use backend::BackendPlugin;
pub use backend::BackendRegistry;
pub use backend::BackendResourceManager;
pub use backend::BackendType;
pub use backend::CapabilityValue;
pub use backend::DeviceEnumerator;
pub use backend::ExecutionModel;
pub use backend::ExtendedCapabilities;
pub use backend::HardwareFeature;
pub use backend::MemoryHierarchy;
pub use backend::OperationsBundle;
pub use backend::PerformanceHints;
pub use backend::PluginMetadata;
pub use backend::PrecisionMode;
pub use backend::ResourceLimits;
pub use backend::ResourceStatistics;
pub use backend::ResourceUsage;
pub use backend::ScopedResource;
pub use buffer::Buffer;
pub use buffer::BufferDescriptor;
pub use buffer::BufferHandle;
pub use buffer::BufferUsage;
pub use buffer::BufferView;
pub use buffer::MemoryLocation;
pub use convolution::algorithms as conv_algorithms;
pub use convolution::ConvolutionAlgorithm;
pub use convolution::ConvolutionConfig;
pub use convolution::ConvolutionOps;
pub use convolution::ConvolutionPerformanceHints;
pub use convolution::ConvolutionType;
pub use convolution::DefaultConvolutionOps;
pub use convolution::PaddingMode;
pub use cross_backend_transfer::CrossBackendTransferManager;
pub use cross_backend_validation::compare_f32_values;
pub use cross_backend_validation::compare_f64_values;
pub use cross_backend_validation::run_cross_backend_validation;
pub use cross_backend_validation::CrossBackendValidator;
pub use device::Device;
pub use device::DeviceConfiguration;
pub use device::DeviceDiscovery;
pub use device::DeviceFeature;
pub use device::DeviceInfo;
pub use device::DeviceManager;
pub use device::DevicePerformanceInfo;
pub use device::DeviceRequirements;
pub use device::DeviceType;
pub use device::DeviceUtils;
pub use error::ErrorCategory;
pub use error::ErrorContext;
pub use error::ErrorSeverity;
pub use fft::convenience as fft_convenience;
pub use fft::DefaultFftExecutor;
pub use fft::DefaultFftOps;
pub use fft::FftDirection;
pub use fft::FftExecutor;
pub use fft::FftNormalization;
pub use fft::FftOps;
pub use fft::FftPlan;
pub use fft::FftType;
pub use hardware_optimization_tests::run_hardware_optimization_tests;
pub use hardware_optimization_tests::run_lightweight_hardware_tests;
pub use hardware_optimization_tests::HardwareOptimizationTester;
pub use kernel::Kernel;
pub use kernel::KernelDescriptor;
pub use kernel::KernelHandle;
pub use kernel::KernelLaunchConfig;
pub use kernel::KernelMetadata;
pub use memory::AccessPattern;
pub use memory::AllocationHint;
pub use memory::AllocationLifetime;
pub use memory::AllocationStrategy;
pub use memory::CompactionResult;
pub use memory::DefragmentationPolicy;
pub use memory::DefragmentationPriority;
pub use memory::DefragmentationResult;
pub use memory::DefragmentationStrategy;
pub use memory::FragmentationInfo;
pub use memory::FragmentationSeverity;
pub use memory::FreeListPool;
pub use memory::LeakReport;
pub use memory::LeakSeverity;
pub use memory::LeakType;
pub use memory::MemoryAdvice;
pub use memory::MemoryManager;
pub use memory::MemoryManagerFactory;
pub use memory::MemoryPool;
pub use memory::MemoryPoolConfig;
pub use memory::MemoryStats;
pub use memory::PoolStats;
pub use memory_defrag::CompactionPlan;
pub use memory_defrag::DefragmentationManager;
pub use memory_defrag::DefragmentationRequest;
pub use memory_defrag::DefragmentationStats;
pub use memory_defrag::DefragmentationTask;
pub use memory_defrag::MemoryBlock;
pub use memory_defrag::MemoryLayout;
pub use memory_defrag::TaskStatus;
pub use memory_profiler::AccessType;
pub use memory_profiler::AllocationContext;
pub use memory_profiler::AllocationUsageStats;
pub use memory_profiler::HintSeverity;
pub use memory_profiler::MemoryAllocation;
pub use memory_profiler::MemoryPressureEvent;
pub use memory_profiler::MemoryProfiler;
pub use memory_profiler::MemoryProfilerConfig;
pub use memory_profiler::MemorySnapshot;
pub use memory_profiler::MemoryType;
pub use memory_profiler::PerformanceHint;
pub use memory_profiler::PerformanceHintType;
pub use memory_profiler::PressureLevel;
pub use performance_modeling::AnomalyDetector;
pub use performance_modeling::AnomalySeverity;
pub use performance_modeling::AnomalyType;
pub use performance_modeling::ComplexityClass;
pub use performance_modeling::CorrelationAnalyzer;
pub use performance_modeling::CorrelationResult;
pub use performance_modeling::EnvironmentalFactors;
pub use performance_modeling::ModelAccuracy;
pub use performance_modeling::ModelComplexity;
pub use performance_modeling::ModelTrainingResult;
pub use performance_modeling::PatternType;
pub use performance_modeling::PerformanceAnomaly;
pub use performance_modeling::PerformanceCharacteristics;
pub use performance_modeling::PerformanceMeasurement;
pub use performance_modeling::PerformanceModel;
pub use performance_modeling::PerformanceReport;
pub use performance_modeling::PerformanceSample;
pub use performance_modeling::PerformanceTrend;
pub use performance_modeling::RealtimeStatistics;
pub use performance_modeling::RuntimeMonitor;
pub use performance_modeling::RuntimePerformanceModeler;
pub use performance_modeling::TrendDirection;
pub use performance_modeling::WorkloadPattern;
pub use performance_tuning::analyze_workload_optimization_opportunities;
pub use performance_tuning::create_default_constraints;
pub use performance_tuning::create_default_system_state;
pub use performance_tuning::create_energy_budget_constraints;
pub use performance_tuning::create_image_processing_workload;
pub use performance_tuning::create_ml_inference_workload;
pub use performance_tuning::create_ml_training_workload;
pub use performance_tuning::create_performance_optimized_system_state;
pub use performance_tuning::create_power_efficient_system_state;
pub use performance_tuning::create_realtime_constraints;
pub use performance_tuning::create_sample_workload;
pub use performance_tuning::create_throughput_constraints;
pub use performance_tuning::new_coordinator;
pub use performance_tuning::recommend_backend;
pub use performance_tuning::AccessPattern as PerfAccessPattern;
pub use performance_tuning::ActualPerformance;
pub use performance_tuning::BackendTuningStrategy;
pub use performance_tuning::DataType;
pub use performance_tuning::GlobalPerformanceStats;
pub use performance_tuning::MemoryAllocationStrategy;
pub use performance_tuning::NumaTopologyState;
pub use performance_tuning::OperationType;
pub use performance_tuning::OptimizationLevel;
pub use performance_tuning::PerformanceFeedback;
pub use performance_tuning::PerformancePrediction;
pub use performance_tuning::PerformanceTuningCoordinator;
pub use performance_tuning::PowerEfficiencyMode;
pub use performance_tuning::PowerState;
pub use performance_tuning::SchedulingStrategy;
pub use performance_tuning::StrategyMetrics;
pub use performance_tuning::SystemState;
pub use performance_tuning::ThermalState;
pub use performance_tuning::TuningConstraints;
pub use performance_tuning::TuningParameters;
pub use performance_tuning::TuningRecommendation;
pub use performance_tuning::TuningValue;
pub use performance_tuning::WorkloadCharacteristics;
pub use profiler::Profiler;
pub use profiler::ProfilerEvent;
pub use profiler::ProfilerStats;
pub use profiler::SimpleProfiler;
pub use quantization::CalibrationMethod;
pub use quantization::QuantizationCalibrator;
pub use quantization::QuantizationHardwareFeatures;
pub use quantization::QuantizationOps;
pub use quantization::QuantizationParams;
pub use quantization::QuantizationScheme;
pub use quantization::QuantizedDType;
pub use quantization::QuantizedTensor;
pub use quantization::SimdQuantizationOps;
pub use rnn::activations as rnn_activations;
pub use rnn::cells as rnn_cells;
pub use rnn::DefaultRnnOps;
pub use rnn::RnnActivation;
pub use rnn::RnnCellType;
pub use rnn::RnnConfig;
pub use rnn::RnnDirection;
pub use rnn::RnnOps;
pub use rnn::RnnOutput;
pub use rnn::RnnPerformanceHints;
pub use sparse_ops::DefaultSparseOps;
pub use sparse_ops::SparseFormat;
pub use sparse_ops::SparseFormatConverter;
pub use sparse_ops::SparseMatrix;
pub use sparse_ops::SparseOperation;
pub use sparse_ops::SparseOps;
pub use sparse_ops::SparseOptimizationHints;
pub use unified_memory_pool::CpuMemoryPool;
pub use unified_memory_pool::CudaMemoryPool;
pub use unified_memory_pool::MetalMemoryPool;
pub use unified_memory_pool::RocmMemoryPool;
pub use unified_memory_pool::UnifiedMemoryPool;
pub use unified_memory_pool::WebGpuMemoryPool;
pub use version_compat::BackendDependency;
pub use version_compat::CompatibilityReport;
pub use version_compat::DependencyStatus;
pub use version_compat::Version;
pub use version_compat::VersionCompatibilityChecker;
pub use version_compat::VersionError;
pub use version_compat::VersionErrorContextExt;
pub use version_compat::VersionRange;
pub use zero_copy::TransferDirection;
pub use zero_copy::TransferMode;
pub use zero_copy::ZeroCopyCapabilities;
pub use zero_copy::ZeroCopyManager;
pub use zero_copy::ZeroCopyStats;
pub use zero_copy::ZeroCopyTransfer;
pub use cpu::prepare_tensor_data;
pub use cpu::prepare_tensor_data_mut;
pub use cpu::SciRS2CpuBackend;

Modules§

adaptive_kernel_selection
Adaptive kernel selection based on input characteristics
backend
Core backend trait and implementations
buffer
Buffer management and memory operations
convolution
Convolution operations for all backends
cpu
CPU backend implementation for ToRSh
cross_backend_transfer
Cross-backend memory transfer optimization
cross_backend_validation
Cross-backend correctness validation tests
deadlock_prevention
Deadlock prevention utilities for backend operations
device
Device abstraction and management
error
Unified backend error handling using TorshError
fft
Fast Fourier Transform operations for all backends
hardware_optimization_tests
Hardware-specific optimization testing
introspection
Backend Capability Introspection System
jit_compiler
Just-In-Time (JIT) Kernel Compilation System
kernel
Compute kernel abstraction and management
kernel_generation
Kernel generation module structure.
memory
Memory management abstractions
memory_defrag
Advanced memory defragmentation and compaction strategies for ToRSh backends
memory_profiler
Comprehensive memory profiling system with SciRS2 integration
performance_modeling
Runtime performance modeling and prediction system
performance_tuning
Backend-specific performance tuning strategies
prelude
Prelude module for convenient imports
profiler
Performance profiling and monitoring
property_tests
Property-Based Testing for Backend Mathematical Correctness
quantization
Comprehensive quantization module for ToRSh backend
rnn
Recurrent Neural Network operations for all backends
sparse_ops
Comprehensive sparse operations support for ToRSh backends
unified_memory_pool
Unified memory pool system across all backends
version_compat
Comprehensive error checking and version compatibility for ToRSh backends
zero_copy
Zero-copy memory transfer implementations for ToRSh backends

Macros§

backend_error
Macros for simplified error creation
compute_error
error_with_location
Macro for adding location context to errors
memory_error
memory_profiler
Convenience macros for common operations
performance_hint
Convenience macro for creating performance hints
profile_scope
Macro for creating scoped profiling events

Structs§

BackendBuilder
Unified backend builder

Enums§

BackendError
Backend-specific error types

Constants§

VERSION
VERSION_MAJOR
VERSION_MINOR
VERSION_PATCH

Functions§

auto
Create a backend with automatic selection
available_backends
List available backend types
cpu
Create a CPU backend
cuda
Create a CUDA backend
device_count
Get device count for a specific backend type
enumerate_all_devices
Comprehensive device enumeration across all available backends
find_best_device
Find the best available device based on selection criteria
is_available
Check if any GPU backend is available (always false without CUDA feature)
metal
Create a Metal backend

Type Aliases§

BackendResult
Result type alias for ToRSh operations
BufferError
Buffer error type (alias to BackendError)