§kizzasi-core
Core SSM (State Space Model) engine for Kizzasi AGSP.
Implements linear-time State Space Models (Mamba/S4/RWKV) for efficient processing of continuous signal streams, with O(1) complexity per inference step.
§COOLJAPAN Ecosystem
This crate follows KIZZASI_POLICY.md and uses scirs2-core for all array and numerical operations.
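For orientation, here is a minimal sketch of stepwise inference using the re-exported types listed below. It is hypothetical: the constructors (KizzasiConfig::default, SelectiveSSM::new), the step method, and the assumption that Array1 is an f32 vector type are illustrative guesses, not the crate's confirmed API.

use kizzasi_core::{Array1, CoreResult, KizzasiConfig, SelectiveSSM};

fn main() -> CoreResult<()> {
    // Engine configuration (a Default implementation is assumed).
    let config = KizzasiConfig::default();

    // Build a selective (Mamba-style) SSM from the configuration
    // (hypothetical constructor).
    let mut ssm = SelectiveSSM::new(&config)?;

    // Autoregressive rollout: each call consumes one sample, updates the
    // hidden state in O(1) time, and returns the predicted next sample
    // (hypothetical step method).
    let mut sample: Array1 = Array1::from(vec![0.0; 16]);
    for _ in 0..128 {
        sample = ssm.step(&sample)?;
    }
    Ok(())
}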
Re-exports§
pub use attention::{GatedLinearAttention, MultiHeadSSMAttention, MultiHeadSSMConfig};
pub use conv::{CausalConv1d, DepthwiseCausalConv1d, DilatedCausalConv1d, DilatedStack, ShortConv};
pub use dataloader::{BatchIterator, DataLoaderConfig, TimeSeriesAugmentation, TimeSeriesDataLoader};
pub use device::{get_best_device, is_cuda_available, is_metal_available, list_devices, DeviceConfig, DeviceInfo, DeviceType};
pub use efficient_attention::{EfficientAttentionConfig, EfficientMultiHeadAttention, FusedAttentionKernel};
pub use embedded_alloc::{BumpAllocator, EmbeddedAllocator, FixedPool, StackAllocator, StackGuard};
pub use flash_attention::{flash_attention_fused, FlashAttention, FlashAttentionConfig};
pub use gpu_utils::{GPUMemoryPool, MemoryStats, TensorPrefetch, TensorTransfer, TransferBatch};
pub use h3::{DiagonalSSM, H3Config, H3Layer, H3Model, ShiftSSM};
pub use kernel_fusion::{fused_ffn_gelu, fused_layernorm_gelu, fused_layernorm_silu, fused_linear_activation, fused_multihead_output, fused_qkv_projection, fused_quantize_dequantize, fused_softmax_attend, fused_ssm_step};
pub use lora::{LoRAAdapter, LoRAConfig, LoRALayer};
pub use mamba2::{Mamba2Config, Mamba2Layer, Mamba2Model};
pub use metrics::{MetricsLogger, MetricsSummary, TrainingMetrics};
pub use nn::{gelu, gelu_fast, layer_norm, leaky_relu, log_softmax, relu, rms_norm, sigmoid, silu, softmax, tanh, Activation, ActivationType, GatedLinearUnit, LayerNorm, NormType};
pub use optimizations::{acquire_workspace, ilp, prefetch, release_workspace, CacheAligned, DiscretizationCache, SSMWorkspace, WorkspaceGuard};
pub use parallel::{BatchProcessor, ParallelConfig};
pub use pool::{ArrayPool, MultiArrayPool, PoolStats, PooledArray};
pub use profiling::{CounterStats, MemoryProfiler, PerfCounter, ProfilerMemoryStats, ProfilingSession, ScopeTimer, Timer};
pub use pruning::{GradientPruner, PruningConfig, PruningGranularity, PruningMask, PruningStrategy, StructuredPruner};
pub use pytorch_compat::{detect_checkpoint_architecture, load_pytorch_checkpoint, PyTorchCheckpoint, PyTorchConverter, WeightMapping};
pub use quantization::{DynamicQuantizer, QuantizationParams, QuantizationScheme, QuantizationType, QuantizedTensor};
pub use retnet::{MultiScaleRetention, RetNetConfig, RetNetLayer, RetNetModel};
pub use rwkv7::{ChannelMixing, RWKV7Config, RWKV7Layer, RWKV7Model, TimeMixing};
pub use s4d::{S4DConfig, S4DLayer, S4DModel};
pub use s5::{S5Config, S5Layer, S5Model};
pub use scan::{parallel_scan, parallel_ssm_batch, parallel_ssm_scan, segmented_scan, AssociativeOp, SSMElement, SSMScanOp};
pub use scheduler::{ConstantScheduler, CosineScheduler, ExponentialScheduler, LRScheduler, LinearScheduler, OneCycleScheduler, PolynomialScheduler, StepScheduler};
pub use sequences::{apply_mask, masked_mean, masked_sum, pad_sequences, PackedSequence, PaddingStrategy, SequenceMask};
pub use training::{CheckpointMetadata, ConstraintLoss, Loss, MixedPrecision, SchedulerType, TrainableSSM, Trainer, TrainingConfig};
pub use weights::{WeightFormat, WeightLoadConfig, WeightLoader, WeightPruner};
Modules§
- attention
- Multi-head SSM Attention mechanisms
- conv
- Causal convolution implementations for SSM architectures
- dataloader
- DataLoader for time-series training
- device
- Device selection and GPU acceleration utilities
- efficient_attention
- Memory-efficient attention implementations
- embedded_alloc
- Embedded-friendly allocator for no_std environments
- fixed_point
- Fixed-Point Arithmetic for Embedded Systems
- flash_attention
- Flash-Attention-2 Implementation
- gpu_utils
- GPU memory management and tensor transfer utilities
- h3
- H3 (Hungry Hungry Hippos) Architecture
- kernel_fusion
- Fused Kernel Optimizations
- lora
- LoRA (Low-Rank Adaptation) Support
- mamba2
- Mamba-2 SSD (State Space Duality)
- metrics
- Training metrics and logging utilities
- nn
- Neural network building blocks: normalization and activation functions
- numerics
- Numerical stability utilities for SSM computations
- optimizations
- Performance optimizations for kizzasi-core
- parallel
- Parallel computation utilities for multi-layer SSM processing
- pool
- Memory pooling for allocation reuse
- profiling
- Performance profiling utilities for kizzasi-core
- pruning
- Structured Pruning
- pytorch_compat
- PyTorch Compatibility Layer
- quantization
- Dynamic Quantization
- retnet
- RetNet: Retention Networks for Multi-Scale Sequence Modeling
- rwkv7
- RWKV-7 Architecture
- s4d
- S4D: Diagonal Structured State Space Model
- s5
- S5 (Simplified State Space Layers)
- scan
- Parallel Scan Algorithms for SSMs
- scheduler
- Learning rate schedulers for training (see the sketch after this list)
- sequences
- Variable-length Sequence Handling
- simd
- SIMD-optimized operations for high-performance matrix computations
- simd_avx512
- AVX-512 SIMD Optimizations
- simd_neon
- ARM NEON SIMD Optimizations
- training
- Training infrastructure for SSM models
- weights
- Weight management for SSM models
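As a concrete illustration of one of these modules, the sketch below exercises the scheduler module referenced above. The CosineScheduler constructor arguments and the get_lr method are assumptions made for the example; consult the module documentation for the real interface.

use kizzasi_core::{CosineScheduler, LRScheduler};

fn main() {
    // Hypothetical constructor: base LR, minimum LR, total training steps.
    let scheduler = CosineScheduler::new(1e-3, 1e-5, 10_000);

    // Query the learning rate at a few points in training
    // (hypothetical get_lr method on the LRScheduler trait).
    for step in [0usize, 2_500, 5_000, 10_000] {
        let lr = scheduler.get_lr(step);
        println!("step {step}: lr = {lr:.6}");
    }
}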
Macros§
- profile_memory
- Macro to profile memory usage of a block
- time_block
- Macro to time a block of code
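Both macros are intended for ad hoc measurement of a code block. The usage below guesses at their shape, a label plus a block that yields the block's value; the actual macro syntax may differ.

use kizzasi_core::{profile_memory, time_block};

fn main() {
    // Time a block of code and keep its result (assumed syntax).
    let sum: u64 = time_block!("sum", {
        (0..1_000_000u64).sum()
    });

    // Profile the memory used while running a block (assumed syntax).
    let buffer = profile_memory!("alloc", {
        vec![0u8; 1 << 20]
    });

    println!("sum = {sum}, buffer = {} bytes", buffer.len());
}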
Structs§
- ContinuousEmbedding
- Continuous embedding layer for signal values
- HiddenState
- Represents the hidden state of the SSM
- KizzasiConfig
- Configuration for the Kizzasi AGSP engine
- SelectiveSSM
- Selective State Space Model (Mamba-style)
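To suggest how these structs relate, the sketch below embeds a raw signal value and allocates a hidden state. Every constructor, method, and config field shown (new, embed, zeros, model_dim, state_dim) is hypothetical and only illustrates the intended role of each type.

use kizzasi_core::{Array1, ContinuousEmbedding, HiddenState, KizzasiConfig};

fn main() {
    let config = KizzasiConfig::default();

    // Map a raw scalar signal value into the model's embedding space
    // (hypothetical constructor and method).
    let embedding = ContinuousEmbedding::new(config.model_dim);
    let embedded: Array1 = embedding.embed(0.42);

    // A zero-initialized hidden state to thread through SSM steps
    // (hypothetical constructor and accessor).
    let state = HiddenState::zeros(config.state_dim);

    println!("embedded len = {}, state dim = {}", embedded.len(), state.dim());
}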
Enums§
Traits§
- SignalPredictor
- Core trait for autoregressive signal prediction
- StateSpaceModel
- Trait for state space model implementations
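Both traits let downstream code stay generic over the concrete architecture. The helper below sketches that idea by driving any SignalPredictor autoregressively; the predict_next method name and signature are assumptions for illustration.

use kizzasi_core::{Array1, CoreResult, SignalPredictor};

// Roll a predictor forward for a fixed number of steps, feeding each
// prediction back in as the next input (method name assumed).
fn roll_out<P: SignalPredictor>(
    predictor: &mut P,
    seed: Array1,
    steps: usize,
) -> CoreResult<Vec<Array1>> {
    let mut current = seed;
    let mut outputs = Vec::with_capacity(steps);
    for _ in 0..steps {
        let next = predictor.predict_next(&current)?;
        outputs.push(next.clone());
        current = next;
    }
    Ok(outputs)
}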
Type Aliases§
- Array1
- One-dimensional array
- CoreResult
- Result type alias for core operations