Unified backend implementation for ToRSh
This crate provides a unified backend system that integrates with SciRS2’s compute backends. All backend implementations are included in this single crate and selected via feature flags.
§Features
- cpu (default): CPU backend with SIMD optimizations via scirs2-core
- cuda: NVIDIA GPU backend via scirs2-core’s CUDA support
- metal: Apple GPU backend via scirs2-core’s Metal/MPS support
- rocm: AMD GPU backend (when available in scirs2-core)
- webgpu: WebGPU backend (when available in scirs2-core)
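As a rough illustration of how feature-flag selection generally works in Rust, the sketch below uses `cfg!` to report which backends were compiled in. The feature names are taken from the list above; the `feature_table` and `compiled_backends` helpers are hypothetical and are not part of this crate's API.

```rust
/// Hypothetical helper: pairs each feature name with whether it was
/// enabled at compile time. Not part of the crate's API.
// The allow is only needed when building outside Cargo, where the
// compiler does not know the declared feature names.
#[allow(unexpected_cfgs)]
fn feature_table() -> [(&'static str, bool); 5] {
    [
        ("cpu", cfg!(feature = "cpu")),
        ("cuda", cfg!(feature = "cuda")),
        ("metal", cfg!(feature = "metal")),
        ("rocm", cfg!(feature = "rocm")),
        ("webgpu", cfg!(feature = "webgpu")),
    ]
}

/// Hypothetical helper: names of the backends whose feature is enabled.
fn compiled_backends() -> Vec<&'static str> {
    feature_table()
        .iter()
        .filter(|(_, enabled)| *enabled)
        .map(|(name, _)| *name)
        .collect()
}

fn main() {
    for (name, enabled) in feature_table() {
        println!("{name}: {}", if enabled { "enabled" } else { "disabled" });
    }
    println!("compiled in: {:?}", compiled_backends());
}
```

Because `cfg!` is resolved at compile time, a backend whose feature is off costs nothing at runtime; the list returned here depends entirely on the features passed to the build.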
Re-exports§
- pub use adaptive_kernel_selection::AdaptiveKernelSelector;
- pub use adaptive_kernel_selection::AdaptiveSelectionConfig;
- pub use adaptive_kernel_selection::BenchmarkResult;
- pub use adaptive_kernel_selection::BenchmarkResults;
- pub use adaptive_kernel_selection::CustomKernel;
- pub use adaptive_kernel_selection::HybridConfig;
- pub use adaptive_kernel_selection::KernelCharacteristics;
- pub use adaptive_kernel_selection::KernelConstraints;
- pub use adaptive_kernel_selection::KernelExecutor;
- pub use adaptive_kernel_selection::KernelImplementation;
- pub use adaptive_kernel_selection::KernelInputs;
- pub use adaptive_kernel_selection::KernelOutputs;
- pub use adaptive_kernel_selection::KernelParameter;
- pub use adaptive_kernel_selection::KernelPerformanceRecord;
- pub use adaptive_kernel_selection::KernelRegistry;
- pub use adaptive_kernel_selection::KernelSelection;
- pub use adaptive_kernel_selection::KernelUsageStats;
- pub use adaptive_kernel_selection::KernelVariant;
- pub use adaptive_kernel_selection::MLBasedConfig;
- pub use adaptive_kernel_selection::MLModelType;
- pub use adaptive_kernel_selection::MLTrainingParams;
- pub use adaptive_kernel_selection::PerformanceTracker;
- pub use adaptive_kernel_selection::ResourceRequirements;
- pub use adaptive_kernel_selection::ScalabilityCharacteristics;
- pub use adaptive_kernel_selection::ScalingBehavior;
- pub use adaptive_kernel_selection::ScoreBasedConfig;
- pub use adaptive_kernel_selection::SelectionAccuracyTracker;
- pub use adaptive_kernel_selection::SelectionAlgorithm;
- pub use adaptive_kernel_selection::SelectionReason;
- pub use adaptive_kernel_selection::SelectionStatistics;
- pub use backend::Backend;
- pub use backend::BackendCapabilities;
- pub use backend::BackendCore;
- pub use backend::BackendDeviceManager;
- pub use backend::BackendExecutor;
- pub use backend::BackendExtension;
- pub use backend::BackendExtensionRegistry;
- pub use backend::BackendFactory;
- pub use backend::BackendLifecycle;
- pub use backend::BackendOperations;
- pub use backend::BackendOps;
- pub use backend::BackendPlugin;
- pub use backend::BackendRegistry;
- pub use backend::BackendResourceManager;
- pub use backend::BackendType;
- pub use backend::CapabilityValue;
- pub use backend::DeviceEnumerator;
- pub use backend::ExecutionModel;
- pub use backend::ExtendedCapabilities;
- pub use backend::HardwareFeature;
- pub use backend::MemoryHierarchy;
- pub use backend::OperationsBundle;
- pub use backend::PerformanceHints;
- pub use backend::PluginMetadata;
- pub use backend::PrecisionMode;
- pub use backend::ResourceLimits;
- pub use backend::ResourceStatistics;
- pub use backend::ResourceUsage;
- pub use backend::ScopedResource;
- pub use buffer::Buffer;
- pub use buffer::BufferDescriptor;
- pub use buffer::BufferHandle;
- pub use buffer::BufferUsage;
- pub use buffer::BufferView;
- pub use buffer::MemoryLocation;
- pub use convolution::algorithms as conv_algorithms;
- pub use convolution::ConvolutionAlgorithm;
- pub use convolution::ConvolutionConfig;
- pub use convolution::ConvolutionOps;
- pub use convolution::ConvolutionPerformanceHints;
- pub use convolution::ConvolutionType;
- pub use convolution::DefaultConvolutionOps;
- pub use convolution::PaddingMode;
- pub use cross_backend_transfer::CrossBackendTransferManager;
- pub use cross_backend_validation::compare_f32_values;
- pub use cross_backend_validation::compare_f64_values;
- pub use cross_backend_validation::run_cross_backend_validation;
- pub use cross_backend_validation::CrossBackendValidator;
- pub use device::Device;
- pub use device::DeviceConfiguration;
- pub use device::DeviceDiscovery;
- pub use device::DeviceFeature;
- pub use device::DeviceInfo;
- pub use device::DeviceManager;
- pub use device::DevicePerformanceInfo;
- pub use device::DeviceRequirements;
- pub use device::DeviceType;
- pub use device::DeviceUtils;
- pub use error::ErrorCategory;
- pub use error::ErrorContext;
- pub use error::ErrorSeverity;
- pub use fft::convenience as fft_convenience;
- pub use fft::DefaultFftExecutor;
- pub use fft::DefaultFftOps;
- pub use fft::FftDirection;
- pub use fft::FftExecutor;
- pub use fft::FftNormalization;
- pub use fft::FftOps;
- pub use fft::FftPlan;
- pub use fft::FftType;
- pub use hardware_optimization_tests::run_hardware_optimization_tests;
- pub use hardware_optimization_tests::run_lightweight_hardware_tests;
- pub use hardware_optimization_tests::HardwareOptimizationTester;
- pub use kernel::Kernel;
- pub use kernel::KernelDescriptor;
- pub use kernel::KernelHandle;
- pub use kernel::KernelLaunchConfig;
- pub use kernel::KernelMetadata;
- pub use memory::AccessPattern;
- pub use memory::AllocationHint;
- pub use memory::AllocationLifetime;
- pub use memory::AllocationStrategy;
- pub use memory::CompactionResult;
- pub use memory::DefragmentationPolicy;
- pub use memory::DefragmentationPriority;
- pub use memory::DefragmentationResult;
- pub use memory::DefragmentationStrategy;
- pub use memory::FragmentationInfo;
- pub use memory::FragmentationSeverity;
- pub use memory::FreeListPool;
- pub use memory::LeakReport;
- pub use memory::LeakSeverity;
- pub use memory::LeakType;
- pub use memory::MemoryAdvice;
- pub use memory::MemoryManager;
- pub use memory::MemoryManagerFactory;
- pub use memory::MemoryPool;
- pub use memory::MemoryPoolConfig;
- pub use memory::MemoryStats;
- pub use memory::PoolStats;
- pub use memory_defrag::CompactionPlan;
- pub use memory_defrag::DefragmentationManager;
- pub use memory_defrag::DefragmentationRequest;
- pub use memory_defrag::DefragmentationStats;
- pub use memory_defrag::DefragmentationTask;
- pub use memory_defrag::MemoryBlock;
- pub use memory_defrag::MemoryLayout;
- pub use memory_defrag::TaskStatus;
- pub use memory_profiler::AccessType;
- pub use memory_profiler::AllocationContext;
- pub use memory_profiler::AllocationUsageStats;
- pub use memory_profiler::HintSeverity;
- pub use memory_profiler::MemoryAllocation;
- pub use memory_profiler::MemoryPressureEvent;
- pub use memory_profiler::MemoryProfiler;
- pub use memory_profiler::MemoryProfilerConfig;
- pub use memory_profiler::MemorySnapshot;
- pub use memory_profiler::MemoryType;
- pub use memory_profiler::PerformanceHint;
- pub use memory_profiler::PerformanceHintType;
- pub use memory_profiler::PressureLevel;
- pub use performance_modeling::AnomalyDetector;
- pub use performance_modeling::AnomalySeverity;
- pub use performance_modeling::AnomalyType;
- pub use performance_modeling::ComplexityClass;
- pub use performance_modeling::CorrelationAnalyzer;
- pub use performance_modeling::CorrelationResult;
- pub use performance_modeling::EnvironmentalFactors;
- pub use performance_modeling::ModelAccuracy;
- pub use performance_modeling::ModelComplexity;
- pub use performance_modeling::ModelTrainingResult;
- pub use performance_modeling::PatternType;
- pub use performance_modeling::PerformanceAnomaly;
- pub use performance_modeling::PerformanceCharacteristics;
- pub use performance_modeling::PerformanceMeasurement;
- pub use performance_modeling::PerformanceModel;
- pub use performance_modeling::PerformanceReport;
- pub use performance_modeling::PerformanceSample;
- pub use performance_modeling::PerformanceTrend;
- pub use performance_modeling::RealtimeStatistics;
- pub use performance_modeling::RuntimeMonitor;
- pub use performance_modeling::RuntimePerformanceModeler;
- pub use performance_modeling::TrendDirection;
- pub use performance_modeling::WorkloadPattern;
- pub use performance_tuning::analyze_workload_optimization_opportunities;
- pub use performance_tuning::create_default_constraints;
- pub use performance_tuning::create_default_system_state;
- pub use performance_tuning::create_energy_budget_constraints;
- pub use performance_tuning::create_image_processing_workload;
- pub use performance_tuning::create_ml_inference_workload;
- pub use performance_tuning::create_ml_training_workload;
- pub use performance_tuning::create_performance_optimized_system_state;
- pub use performance_tuning::create_power_efficient_system_state;
- pub use performance_tuning::create_realtime_constraints;
- pub use performance_tuning::create_sample_workload;
- pub use performance_tuning::create_throughput_constraints;
- pub use performance_tuning::new_coordinator;
- pub use performance_tuning::recommend_backend;
- pub use performance_tuning::AccessPattern as PerfAccessPattern;
- pub use performance_tuning::ActualPerformance;
- pub use performance_tuning::BackendTuningStrategy;
- pub use performance_tuning::DataType;
- pub use performance_tuning::GlobalPerformanceStats;
- pub use performance_tuning::MemoryAllocationStrategy;
- pub use performance_tuning::NumaTopologyState;
- pub use performance_tuning::OperationType;
- pub use performance_tuning::OptimizationLevel;
- pub use performance_tuning::PerformanceFeedback;
- pub use performance_tuning::PerformancePrediction;
- pub use performance_tuning::PerformanceTuningCoordinator;
- pub use performance_tuning::PowerEfficiencyMode;
- pub use performance_tuning::PowerState;
- pub use performance_tuning::SchedulingStrategy;
- pub use performance_tuning::StrategyMetrics;
- pub use performance_tuning::SystemState;
- pub use performance_tuning::ThermalState;
- pub use performance_tuning::TuningConstraints;
- pub use performance_tuning::TuningParameters;
- pub use performance_tuning::TuningRecommendation;
- pub use performance_tuning::TuningValue;
- pub use performance_tuning::WorkloadCharacteristics;
- pub use profiler::Profiler;
- pub use profiler::ProfilerEvent;
- pub use profiler::ProfilerStats;
- pub use profiler::SimpleProfiler;
- pub use quantization::CalibrationMethod;
- pub use quantization::QuantizationCalibrator;
- pub use quantization::QuantizationHardwareFeatures;
- pub use quantization::QuantizationOps;
- pub use quantization::QuantizationParams;
- pub use quantization::QuantizationScheme;
- pub use quantization::QuantizedDType;
- pub use quantization::QuantizedTensor;
- pub use quantization::SimdQuantizationOps;
- pub use rnn::activations as rnn_activations;
- pub use rnn::cells as rnn_cells;
- pub use rnn::DefaultRnnOps;
- pub use rnn::RnnActivation;
- pub use rnn::RnnCellType;
- pub use rnn::RnnConfig;
- pub use rnn::RnnDirection;
- pub use rnn::RnnOps;
- pub use rnn::RnnOutput;
- pub use rnn::RnnPerformanceHints;
- pub use sparse_ops::DefaultSparseOps;
- pub use sparse_ops::SparseFormat;
- pub use sparse_ops::SparseFormatConverter;
- pub use sparse_ops::SparseMatrix;
- pub use sparse_ops::SparseOperation;
- pub use sparse_ops::SparseOps;
- pub use sparse_ops::SparseOptimizationHints;
- pub use unified_memory_pool::CpuMemoryPool;
- pub use unified_memory_pool::CudaMemoryPool;
- pub use unified_memory_pool::MetalMemoryPool;
- pub use unified_memory_pool::RocmMemoryPool;
- pub use unified_memory_pool::UnifiedMemoryPool;
- pub use unified_memory_pool::WebGpuMemoryPool;
- pub use version_compat::BackendDependency;
- pub use version_compat::CompatibilityReport;
- pub use version_compat::DependencyStatus;
- pub use version_compat::Version;
- pub use version_compat::VersionCompatibilityChecker;
- pub use version_compat::VersionError;
- pub use version_compat::VersionErrorContextExt;
- pub use version_compat::VersionRange;
- pub use zero_copy::TransferDirection;
- pub use zero_copy::TransferMode;
- pub use zero_copy::ZeroCopyCapabilities;
- pub use zero_copy::ZeroCopyManager;
- pub use zero_copy::ZeroCopyStats;
- pub use zero_copy::ZeroCopyTransfer;
- pub use cpu::prepare_tensor_data;
- pub use cpu::prepare_tensor_data_mut;
- pub use cpu::SciRS2CpuBackend;
Modules§
- adaptive_kernel_selection
- Adaptive kernel selection based on input characteristics
- backend
- Core backend trait and implementations
- buffer
- Buffer management and memory operations
- convolution
- Convolution operations for all backends
- cpu
- CPU backend implementation for ToRSh
- cross_backend_transfer
- Cross-backend memory transfer optimization
- cross_backend_validation
- Cross-backend correctness validation tests
- deadlock_prevention
- Deadlock prevention utilities for backend operations
- device
- Device abstraction and management
- error
- Unified backend error handling using TorshError
- fft
- Fast Fourier Transform operations for all backends
- hardware_optimization_tests
- Hardware-specific optimization testing
- introspection
- Backend Capability Introspection System
- jit_compiler
- Just-In-Time (JIT) Kernel Compilation System
- kernel
- Compute kernel abstraction and management
- kernel_generation
- Kernel generation module structure
- memory
- Memory management abstractions
- memory_defrag
- Advanced memory defragmentation and compaction strategies for ToRSh backends
- memory_profiler
- Comprehensive memory profiling system with SciRS2 integration
- performance_modeling
- Runtime performance modeling and prediction system
- performance_tuning
- Backend-specific performance tuning strategies
- prelude
- Prelude module for convenient imports
- profiler
- Performance profiling and monitoring
- property_tests
- Property-Based Testing for Backend Mathematical Correctness
- quantization
- Comprehensive quantization module for ToRSh backend
- rnn
- Recurrent Neural Network operations for all backends
- sparse_ops
- Comprehensive sparse operations support for ToRSh backends
- unified_memory_pool
- Unified memory pool system across all backends
- version_compat
- Comprehensive error checking and version compatibility for ToRSh backends
- zero_copy
- Zero-copy memory transfer implementations for ToRSh backends
Macros§
- backend_error
- Macros for simplified error creation
- compute_error
- error_with_location
- Macro for adding location context to errors
- memory_error
- memory_profiler
- Convenience macros for common operations
- performance_hint
- Convenience macro for creating performance hints
- profile_scope
- Macro for creating scoped profiling events
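Scoped profiling macros in Rust are commonly built on an RAII guard that records an event when it is dropped. The sketch below shows that general pattern only; `ScopeTimer`, `timed_scope!`, `EVENTS`, and `recorded_labels` are hypothetical names, and nothing here is claimed to match how `profile_scope` is actually implemented in this crate.

```rust
use std::cell::RefCell;
use std::time::Instant;

thread_local! {
    // Hypothetical per-thread event log: (label, elapsed nanoseconds).
    static EVENTS: RefCell<Vec<(&'static str, u128)>> = RefCell::new(Vec::new());
}

/// Hypothetical guard: notes the start time on creation and records
/// an event when it goes out of scope.
struct ScopeTimer {
    label: &'static str,
    start: Instant,
}

impl ScopeTimer {
    fn new(label: &'static str) -> Self {
        ScopeTimer { label, start: Instant::now() }
    }
}

impl Drop for ScopeTimer {
    fn drop(&mut self) {
        let nanos = self.start.elapsed().as_nanos();
        EVENTS.with(|events| events.borrow_mut().push((self.label, nanos)));
    }
}

/// Hypothetical macro, deliberately not named `profile_scope!`.
/// The guard lives until the end of the enclosing block.
macro_rules! timed_scope {
    ($label:expr) => {
        let _scope_timer = ScopeTimer::new($label);
    };
}

/// Labels of the events recorded so far on this thread, in order.
fn recorded_labels() -> Vec<&'static str> {
    EVENTS.with(|events| events.borrow().iter().map(|(label, _)| *label).collect())
}

fn main() {
    {
        timed_scope!("outer");
        {
            timed_scope!("inner");
        }
    }
    // The inner guard is dropped first, so its event is recorded first.
    println!("{:?}", recorded_labels());
}
```

Tying the event to `Drop` means the scope is closed on every exit path, including early returns and `?` propagation, without any explicit "end" call.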
Structs§
- BackendBuilder
- Unified backend builder
Enums§
- BackendError
- Backend-specific error types
Constants§
Functions§
- auto
- Create a backend with automatic selection
- available_backends
- List available backend types
- cpu
- Create a CPU backend
- cuda
- Create a CUDA backend
- device_count
- Get device count for a specific backend type
- enumerate_all_devices
- Comprehensive device enumeration across all available backends
- find_best_device
- Find the best available device based on selection criteria
- is_available
- Check if any GPU backend is available (always false without CUDA feature)
- metal
- Create a Metal backend
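Selecting a "best" device from an enumerated list usually comes down to filtering by hard requirements and then ranking what remains. The sketch below illustrates that idea with stand-in types; `Kind`, `DeviceDesc`, `Criteria`, and `pick_best` are hypothetical and do not reflect the signatures of `enumerate_all_devices` or `find_best_device`, which this page does not show.

```rust
/// Hypothetical device kinds for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Cpu,
    Cuda,
    Metal,
}

/// Hypothetical device description.
#[derive(Debug, Clone, PartialEq)]
struct DeviceDesc {
    kind: Kind,
    index: usize,
    memory_bytes: u64,
}

/// Hypothetical selection criteria.
struct Criteria {
    min_memory_bytes: u64,
    prefer_gpu: bool,
}

/// Drop devices that miss the memory requirement, then rank the rest:
/// GPUs first when `prefer_gpu` is set, larger memory as the tie-breaker.
fn pick_best<'a>(devices: &'a [DeviceDesc], criteria: &Criteria) -> Option<&'a DeviceDesc> {
    devices
        .iter()
        .filter(|d| d.memory_bytes >= criteria.min_memory_bytes)
        .max_by_key(|d| {
            let gpu_bonus = criteria.prefer_gpu && d.kind != Kind::Cpu;
            (gpu_bonus, d.memory_bytes)
        })
}

fn main() {
    const GIB: u64 = 1 << 30;
    let devices = vec![
        DeviceDesc { kind: Kind::Cpu, index: 0, memory_bytes: 64 * GIB },
        DeviceDesc { kind: Kind::Cuda, index: 0, memory_bytes: 16 * GIB },
        DeviceDesc { kind: Kind::Metal, index: 0, memory_bytes: 24 * GIB },
    ];
    let criteria = Criteria { min_memory_bytes: 8 * GIB, prefer_gpu: true };
    match pick_best(&devices, &criteria) {
        Some(d) => println!("selected {:?} #{}", d.kind, d.index),
        None => println!("no device satisfies the criteria"),
    }
}
```

Returning `Option` keeps the "nothing qualifies" case explicit, so a caller can fall back to another strategy instead of silently receiving an unsuitable device.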
Type Aliases§
- BackendResult
- Result type alias for ToRSh operations
- BufferError
- Buffer error type (alias to BackendError)