
Crate torsh_backend


Unified backend implementation for ToRSh

This crate provides a unified backend system that integrates with SciRS2’s compute backends. All backend implementations are included in this single crate and selected via feature flags.

§Features

  • cpu (default): CPU backend with SIMD optimizations via scirs2-core
  • cuda: NVIDIA GPU backend via scirs2-core’s CUDA support
  • metal: Apple GPU backend via scirs2-core’s Metal/MPS support
  • rocm: AMD GPU backend (when available in scirs2-core)
  • webgpu: WebGPU backend (when available in scirs2-core)

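Selecting a backend is therefore a Cargo feature choice rather than an API call. A minimal sketch of what this might look like in a downstream crate's Cargo.toml (the dependency name matches this crate; the version numbers are placeholders, not real releases):

```toml
# Default build: pulls in the "cpu" feature (SIMD-optimized CPU backend).
[dependencies]
torsh-backend = "0.1"  # placeholder version

# Alternative: opt into the CUDA backend alongside the CPU fallback
# by disabling defaults and listing features explicitly:
#
# [dependencies]
# torsh-backend = { version = "0.1", default-features = false, features = ["cpu", "cuda"] }
```

Since all backends live in this one crate, enabling an extra feature adds a backend without changing any dependency paths in downstream code.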
Re-exports§

pub use adaptive_kernel_selection::AdaptiveKernelSelector;
pub use adaptive_kernel_selection::AdaptiveSelectionConfig;
pub use adaptive_kernel_selection::BenchmarkResult;
pub use adaptive_kernel_selection::BenchmarkResults;
pub use adaptive_kernel_selection::CustomKernel;
pub use adaptive_kernel_selection::HybridConfig;
pub use adaptive_kernel_selection::KernelCharacteristics;
pub use adaptive_kernel_selection::KernelConstraints;
pub use adaptive_kernel_selection::KernelExecutor;
pub use adaptive_kernel_selection::KernelImplementation;
pub use adaptive_kernel_selection::KernelInputs;
pub use adaptive_kernel_selection::KernelOutputs;
pub use adaptive_kernel_selection::KernelParameter;
pub use adaptive_kernel_selection::KernelPerformanceRecord;
pub use adaptive_kernel_selection::KernelRegistry;
pub use adaptive_kernel_selection::KernelSelection;
pub use adaptive_kernel_selection::KernelUsageStats;
pub use adaptive_kernel_selection::KernelVariant;
pub use adaptive_kernel_selection::MLBasedConfig;
pub use adaptive_kernel_selection::MLModelType;
pub use adaptive_kernel_selection::MLTrainingParams;
pub use adaptive_kernel_selection::PerformanceTracker;
pub use adaptive_kernel_selection::ResourceRequirements;
pub use adaptive_kernel_selection::ScalabilityCharacteristics;
pub use adaptive_kernel_selection::ScalingBehavior;
pub use adaptive_kernel_selection::ScoreBasedConfig;
pub use adaptive_kernel_selection::SelectionAccuracyTracker;
pub use adaptive_kernel_selection::SelectionAlgorithm;
pub use adaptive_kernel_selection::SelectionReason;
pub use adaptive_kernel_selection::SelectionStatistics;
pub use backend::Backend;
pub use backend::BackendCapabilities;
pub use backend::BackendCore;
pub use backend::BackendDeviceManager;
pub use backend::BackendExecutor;
pub use backend::BackendExtension;
pub use backend::BackendExtensionRegistry;
pub use backend::BackendFactory;
pub use backend::BackendLifecycle;
pub use backend::BackendOperations;
pub use backend::BackendOps;
pub use backend::BackendPlugin;
pub use backend::BackendRegistry;
pub use backend::BackendResourceManager;
pub use backend::BackendType;
pub use backend::CapabilityValue;
pub use backend::DeviceEnumerator;
pub use backend::ExecutionModel;
pub use backend::ExtendedCapabilities;
pub use backend::HardwareFeature;
pub use backend::MemoryHierarchy;
pub use backend::OperationsBundle;
pub use backend::PerformanceHints;
pub use backend::PluginMetadata;
pub use backend::PrecisionMode;
pub use backend::ResourceLimits;
pub use backend::ResourceStatistics;
pub use backend::ResourceUsage;
pub use backend::ScopedResource;
pub use buffer::Buffer;
pub use buffer::BufferDescriptor;
pub use buffer::BufferHandle;
pub use buffer::BufferUsage;
pub use buffer::BufferView;
pub use buffer::MemoryLocation;
pub use convolution::algorithms as conv_algorithms;
pub use convolution::ConvolutionAlgorithm;
pub use convolution::ConvolutionConfig;
pub use convolution::ConvolutionOps;
pub use convolution::ConvolutionPerformanceHints;
pub use convolution::ConvolutionType;
pub use convolution::DefaultConvolutionOps;
pub use convolution::PaddingMode;
pub use cross_backend_transfer::CrossBackendTransferManager;
pub use cross_backend_validation::compare_f32_values;
pub use cross_backend_validation::compare_f64_values;
pub use cross_backend_validation::run_cross_backend_validation;
pub use cross_backend_validation::CrossBackendValidator;
pub use device::Device;
pub use device::DeviceConfiguration;
pub use device::DeviceDiscovery;
pub use device::DeviceFeature;
pub use device::DeviceInfo;
pub use device::DeviceManager;
pub use device::DevicePerformanceInfo;
pub use device::DeviceRequirements;
pub use device::DeviceType;
pub use device::DeviceUtils;
pub use error::ErrorCategory;
pub use error::ErrorContext;
pub use error::ErrorSeverity;
pub use fft::convenience as fft_convenience;
pub use fft::DefaultFftExecutor;
pub use fft::DefaultFftOps;
pub use fft::FftDirection;
pub use fft::FftExecutor;
pub use fft::FftNormalization;
pub use fft::FftOps;
pub use fft::FftPlan;
pub use fft::FftType;
pub use hardware_optimization_tests::run_hardware_optimization_tests;
pub use hardware_optimization_tests::run_lightweight_hardware_tests;
pub use hardware_optimization_tests::HardwareOptimizationTester;
pub use kernel::Kernel;
pub use kernel::KernelDescriptor;
pub use kernel::KernelHandle;
pub use kernel::KernelLaunchConfig;
pub use kernel::KernelMetadata;
pub use memory::AccessPattern;
pub use memory::AllocationHint;
pub use memory::AllocationLifetime;
pub use memory::AllocationStrategy;
pub use memory::CompactionResult;
pub use memory::DefragmentationPolicy;
pub use memory::DefragmentationPriority;
pub use memory::DefragmentationResult;
pub use memory::DefragmentationStrategy;
pub use memory::FragmentationInfo;
pub use memory::FragmentationSeverity;
pub use memory::FreeListPool;
pub use memory::LeakReport;
pub use memory::LeakSeverity;
pub use memory::LeakType;
pub use memory::MemoryAdvice;
pub use memory::MemoryManager;
pub use memory::MemoryManagerFactory;
pub use memory::MemoryPool;
pub use memory::MemoryPoolConfig;
pub use memory::MemoryStats;
pub use memory::PoolStats;
pub use memory_defrag::CompactionPlan;
pub use memory_defrag::DefragmentationManager;
pub use memory_defrag::DefragmentationRequest;
pub use memory_defrag::DefragmentationStats;
pub use memory_defrag::DefragmentationTask;
pub use memory_defrag::MemoryBlock;
pub use memory_defrag::MemoryLayout;
pub use memory_defrag::TaskStatus;
pub use memory_profiler::AccessType;
pub use memory_profiler::AllocationContext;
pub use memory_profiler::AllocationUsageStats;
pub use memory_profiler::HintSeverity;
pub use memory_profiler::MemoryAllocation;
pub use memory_profiler::MemoryPressureEvent;
pub use memory_profiler::MemoryProfiler;
pub use memory_profiler::MemoryProfilerConfig;
pub use memory_profiler::MemorySnapshot;
pub use memory_profiler::MemoryType;
pub use memory_profiler::PerformanceHint;
pub use memory_profiler::PerformanceHintType;
pub use memory_profiler::PressureLevel;
pub use performance_modeling::AnomalyDetector;
pub use performance_modeling::AnomalySeverity;
pub use performance_modeling::AnomalyType;
pub use performance_modeling::ComplexityClass;
pub use performance_modeling::CorrelationAnalyzer;
pub use performance_modeling::CorrelationResult;
pub use performance_modeling::EnvironmentalFactors;
pub use performance_modeling::ModelAccuracy;
pub use performance_modeling::ModelComplexity;
pub use performance_modeling::ModelTrainingResult;
pub use performance_modeling::PatternType;
pub use performance_modeling::PerformanceAnomaly;
pub use performance_modeling::PerformanceCharacteristics;
pub use performance_modeling::PerformanceMeasurement;
pub use performance_modeling::PerformanceModel;
pub use performance_modeling::PerformanceReport;
pub use performance_modeling::PerformanceSample;
pub use performance_modeling::PerformanceTrend;
pub use performance_modeling::RealtimeStatistics;
pub use performance_modeling::RuntimeMonitor;
pub use performance_modeling::RuntimePerformanceModeler;
pub use performance_modeling::TrendDirection;
pub use performance_modeling::WorkloadPattern;
pub use performance_tuning::analyze_workload_optimization_opportunities;
pub use performance_tuning::create_default_constraints;
pub use performance_tuning::create_default_system_state;
pub use performance_tuning::create_energy_budget_constraints;
pub use performance_tuning::create_image_processing_workload;
pub use performance_tuning::create_ml_inference_workload;
pub use performance_tuning::create_ml_training_workload;
pub use performance_tuning::create_performance_optimized_system_state;
pub use performance_tuning::create_power_efficient_system_state;
pub use performance_tuning::create_realtime_constraints;
pub use performance_tuning::create_sample_workload;
pub use performance_tuning::create_throughput_constraints;
pub use performance_tuning::new_coordinator;
pub use performance_tuning::recommend_backend;
pub use performance_tuning::AccessPattern as PerfAccessPattern;
pub use performance_tuning::ActualPerformance;
pub use performance_tuning::BackendTuningStrategy;
pub use performance_tuning::DataType;
pub use performance_tuning::GlobalPerformanceStats;
pub use performance_tuning::MemoryAllocationStrategy;
pub use performance_tuning::NumaTopologyState;
pub use performance_tuning::OperationType;
pub use performance_tuning::OptimizationLevel;
pub use performance_tuning::PerformanceFeedback;
pub use performance_tuning::PerformancePrediction;
pub use performance_tuning::PerformanceTuningCoordinator;
pub use performance_tuning::PowerEfficiencyMode;
pub use performance_tuning::PowerState;
pub use performance_tuning::SchedulingStrategy;
pub use performance_tuning::StrategyMetrics;
pub use performance_tuning::SystemState;
pub use performance_tuning::ThermalState;
pub use performance_tuning::TuningConstraints;
pub use performance_tuning::TuningParameters;
pub use performance_tuning::TuningRecommendation;
pub use performance_tuning::TuningValue;
pub use performance_tuning::WorkloadCharacteristics;
pub use profiler::Profiler;
pub use profiler::ProfilerEvent;
pub use profiler::ProfilerStats;
pub use profiler::SimpleProfiler;
pub use quantization::CalibrationMethod;
pub use quantization::QuantizationCalibrator;
pub use quantization::QuantizationHardwareFeatures;
pub use quantization::QuantizationOps;
pub use quantization::QuantizationParams;
pub use quantization::QuantizationScheme;
pub use quantization::QuantizedDType;
pub use quantization::QuantizedTensor;
pub use quantization::SimdQuantizationOps;
pub use rnn::activations as rnn_activations;
pub use rnn::cells as rnn_cells;
pub use rnn::DefaultRnnOps;
pub use rnn::RnnActivation;
pub use rnn::RnnCellType;
pub use rnn::RnnConfig;
pub use rnn::RnnDirection;
pub use rnn::RnnOps;
pub use rnn::RnnOutput;
pub use rnn::RnnPerformanceHints;
pub use sparse_ops::DefaultSparseOps;
pub use sparse_ops::SparseFormat;
pub use sparse_ops::SparseFormatConverter;
pub use sparse_ops::SparseMatrix;
pub use sparse_ops::SparseOperation;
pub use sparse_ops::SparseOps;
pub use sparse_ops::SparseOptimizationHints;
pub use unified_memory_pool::CpuMemoryPool;
pub use unified_memory_pool::CudaMemoryPool;
pub use unified_memory_pool::MetalMemoryPool;
pub use unified_memory_pool::RocmMemoryPool;
pub use unified_memory_pool::UnifiedMemoryPool;
pub use unified_memory_pool::WebGpuMemoryPool;
pub use version_compat::BackendDependency;
pub use version_compat::CompatibilityReport;
pub use version_compat::DependencyStatus;
pub use version_compat::Version;
pub use version_compat::VersionCompatibilityChecker;
pub use version_compat::VersionError;
pub use version_compat::VersionErrorContextExt;
pub use version_compat::VersionRange;
pub use zero_copy::TransferDirection;
pub use zero_copy::TransferMode;
pub use zero_copy::ZeroCopyCapabilities;
pub use zero_copy::ZeroCopyManager;
pub use zero_copy::ZeroCopyStats;
pub use zero_copy::ZeroCopyTransfer;
pub use cpu::prepare_tensor_data;
pub use cpu::prepare_tensor_data_mut;
pub use cpu::SciRS2CpuBackend;

Modules§

adaptive_kernel_selection
Adaptive kernel selection based on input characteristics
backend
Core backend trait and implementations
buffer
Buffer management and memory operations
convolution
Convolution operations for all backends
cpu
CPU backend implementation for ToRSh
cross_backend_transfer
Cross-backend memory transfer optimization
cross_backend_validation
Cross-backend correctness validation tests
deadlock_prevention
Deadlock prevention utilities for backend operations
device
Device abstraction and management
error
Unified backend error handling using TorshError
fft
Fast Fourier Transform operations for all backends
hardware_optimization_tests
Hardware-specific optimization testing
introspection
Backend Capability Introspection System
jit_compiler
Just-In-Time (JIT) Kernel Compilation System
kernel
Compute kernel abstraction and management
kernel_generation
Kernel generation module structure.
memory
Memory management abstractions
memory_defrag
Advanced memory defragmentation and compaction strategies for ToRSh backends
memory_profiler
Comprehensive memory profiling system with SciRS2 integration
performance_modeling
Runtime performance modeling and prediction system
performance_tuning
Backend-specific performance tuning strategies
prelude
Prelude module for convenient imports
profiler
Performance profiling and monitoring
property_tests
Property-Based Testing for Backend Mathematical Correctness
quantization
Comprehensive quantization module for ToRSh backend
rnn
Recurrent Neural Network operations for all backends
sparse_ops
Comprehensive sparse operations support for ToRSh backends
unified_memory_pool
Unified memory pool system across all backends
version_compat
Comprehensive error checking and version compatibility for ToRSh backends
zero_copy
Zero-copy memory transfer implementations for ToRSh backends

Macros§

backend_error
Macros for simplified error creation
compute_error
error_with_location
Macro for adding location context to errors
memory_error
memory_profiler
Convenience macros for common operations
performance_hint
Convenience macro for creating performance hints
profile_scope
Macro for creating scoped profiling events

Structs§

BackendBuilder
Unified backend builder

Enums§

BackendError
Backend-specific error types

Constants§

VERSION
VERSION_MAJOR
VERSION_MINOR
VERSION_PATCH

Functions§

auto
Create a backend with automatic selection
available_backends
List available backend types
cpu
Create a CPU backend
cuda
Create a CUDA backend
device_count
Get device count for a specific backend type
enumerate_all_devices
Comprehensive device enumeration across all available backends
find_best_device
Find the best available device based on selection criteria
is_available
Check if any GPU backend is available (always false without CUDA feature)
metal
Create a Metal backend

Type Aliases§

BackendResult
Result type alias for ToRSh operations
BufferError
Buffer error type (alias to BackendError)