
Crate yscv_model

Model definitions, losses, checkpoints, and training helpers for yscv.

Modules§

tcp_transport
TCP-based transport for multi-node gradient exchange.

Structs§

AdaptiveAvgPool2dLayer
Adaptive average pooling: output a fixed spatial size regardless of input size.
AdaptiveMaxPool2dLayer
Adaptive max pooling: output a fixed spatial size.
AllReduceAggregator
All-reduce aggregator: averages gradients across all workers via ring reduce.
AnchorFreeHead
Anchor-free detection head (FCOS-style, inference-mode, NHWC).
ArchitectureConfig
Describes the shape of a model architecture.
AvgPool2dLayer
2D average-pooling layer (NHWC layout).
Batch
One deterministic batch from dataset iterator.
BatchCollector
Collects individual samples into batches for efficient processing.
BatchIterOptions
Controls mini-batch order, truncation behavior, and optional per-batch regularization.
BatchNorm2dLayer
2D batch normalization layer (NHWC layout).
BestModelCheckpoint
Saves model weights when monitored metric improves.
CenterCrop
Crop the center region of an image tensor. Input shape: [H, W, C]. Output shape: [size, size, C].
CnnTrainConfig
Configuration for high-level CNN training.
Compose
Chains multiple transforms sequentially.
CompressedGradient
Compressed gradient: stores only the top-k elements by magnitude.
Conv1dLayer
1D convolution layer (NLC layout: [batch, length, channels]).
Conv2dLayer
2D convolution layer (NHWC layout).
Conv3dLayer
3D convolution layer (BDHWC layout).
ConvTranspose2dLayer
Transposed 2D convolution layer (NHWC layout).
CrossAttention
Cross-attention: query from decoder, key/value from encoder output.
CutMixConfig
Controls per-batch region replacement interpolation for image tensors.
DataLoader
Parallel data loader that prefetches batches using worker threads.
DataLoaderBatch
A batch of samples produced by the data loader.
DataLoaderConfig
Configuration for the parallel data loader.
DataLoaderIter
Iterator over batches produced by worker threads.
DataParallelConfig
Configuration for data-parallel distributed training.
DatasetSplit
Deterministic dataset split produced by split_by_counts / split_by_ratio.
DeformableConv2dLayer
Deformable 2D convolution layer (NHWC layout).
DepthwiseConv2dLayer
Depthwise 2D convolution layer (NHWC layout).
DistributedConfig
Identifies a worker inside a distributed training group.
DropoutLayer
Dropout layer (training vs eval mode).
DynamicBatchConfig
Dynamic batching configuration for inference.
DynamicLossScaler
State for dynamic loss scaling during mixed-precision training.
EarlyStopping
Early stopping to halt training when a metric stops improving.
EmbeddingLayer
Embedding lookup table: maps integer indices to dense vectors.
EpochMetrics
Metrics for one training epoch.
EpochTrainOptions
Epoch-level training controls for batch order and preprocessing.
ExponentialMovingAverage
Exponential Moving Average of model parameters.
FeedForward
Feed-forward network: Linear(d_model, d_ff) -> ReLU -> Linear(d_ff, d_model).
FeedForwardLayer
Feed-forward layer wrapping FeedForward.
FlattenLayer
Flatten layer: reshapes NHWC [N, H, W, C] to [N, H*W*C] for dense layer input.
FpnNeck
Feature Pyramid Network lateral + top-down pathway (inference-mode, NHWC).
GELULayer
GELU activation layer.
GaussianBlur
Apply Gaussian blur to an image tensor. Input shape: [H, W, C].
GlobalAvgPool2dLayer
Global average pooling: NHWC [N,H,W,C] -> [N,1,1,C].
GroupNormLayer
Group normalization: divides channels into groups and normalizes within each group.
GruCell
GRU cell: update and reset gates.
GruLayer
GRU layer wrapping gru_forward_sequence.
HubEntry
Registry entry for a pretrained model.
ImageAugmentationPipeline
Ordered per-sample augmentation pipeline for NHWC mini-batch data.
InProcessTransport
In-process transport backed by mpsc channels (for testing).
InferencePipeline
Builder-style inference pipeline that wraps a SequentialModel with optional pre- and post-processing closures.
InstanceNormLayer
Instance normalization (normalizes per-sample per-channel).
LayerNormLayer
Layer normalization over the last dimension.
LeakyReLULayer
Stateless LeakyReLU layer with configurable negative slope.
LinearLayer
Dense linear layer: y = x @ weight + bias.
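The dense-layer formula above can be sketched in plain Rust. This is an illustration of `y = x @ weight + bias` only, not the crate's `LinearLayer` API; shapes follow the description, with `x: [batch, in]`, `weight: [in, out]`, `bias: [out]`:

```rust
// Minimal sketch of the dense-layer math: y = x @ weight + bias.
fn linear(x: &[Vec<f32>], w: &[Vec<f32>], b: &[f32]) -> Vec<Vec<f32>> {
    x.iter()
        .map(|row| {
            (0..b.len())
                .map(|j| b[j] + row.iter().enumerate().map(|(i, xi)| xi * w[i][j]).sum::<f32>())
                .collect()
        })
        .collect()
}

fn main() {
    let x = vec![vec![1.0, 2.0]];                 // [1, 2]
    let w = vec![vec![1.0, 0.0], vec![0.0, 1.0]]; // identity weight [2, 2]
    let b = vec![0.5, -0.5];
    assert_eq!(linear(&x, &w, &b), vec![vec![1.5, 1.5]]);
}
```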
LocalAggregator
No-op aggregator for single-machine training (API uniformity).
LoraConfig
LoRA configuration.
LoraLinear
A LoRA adapter for a linear layer.
LrFinderConfig
Configuration for LR range test.
LrFinderResult
Result of an LR range test.
LstmCell
LSTM cell: standard gates (input, forget, cell, output).
LstmLayer
LSTM layer wrapping lstm_forward_sequence.
MaskHead
Mask prediction head for instance segmentation (Mask R-CNN style).
MaxPool2dLayer
2D max-pooling layer (NHWC layout).
MbConvBlock
MBConv block (EfficientNet / MobileNetV2 inverted residual, inference-mode).
MetricsLogger
Logs training metrics to a CSV file and prints a summary line to stdout.
MiniBatchIter
Deterministic sequential mini-batch iterator.
MishLayer
Mish activation layer.
MixUpConfig
Controls per-batch sample/label interpolation for regularized training.
MixedPrecisionConfig
Mixed-precision training configuration.
ModelHub
Model hub for downloading and caching pretrained weights.
ModelZoo
File-based pretrained model registry.
MultiHeadAttention
Multi-head attention weights.
MultiHeadAttentionConfig
Multi-head attention configuration.
MultiHeadAttentionLayer
Multi-head attention layer wrapping MultiHeadAttention.
Normalize
Normalize channels: (x - mean) / std.
PReLULayer
PReLU activation layer. Uses per-channel or single alpha for the negative slope.
ParameterServer
Centralized parameter server: rank 0 collects, averages, and broadcasts gradients (or parameters).
PatchEmbedding
Patch embedding layer for Vision Transformer.
PerChannelQuantResult
Per-channel symmetric quantization for conv weights [KH, KW, C_in, C_out].
PermuteDims
Permute dimensions.
PipelineParallelConfig
Configuration for pipeline-parallel training.
PipelineStage
Pipeline parallelism: split a sequential model across multiple stages.
PixelShuffleLayer
Pixel shuffle / sub-pixel convolution: rearranges [N, H, W, C*r^2] -> [N, H*r, W*r, C].
PrunedTensor
Result of magnitude-based weight pruning.
QuantizedTensor
Quantized tensor representation: INT8 values + per-tensor scale + zero-point.
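The INT8 + scale + zero-point representation can be sketched with the usual affine quantization identities; the rounding and clamping scheme here is an assumption, not necessarily what `QuantizedTensor` implements:

```rust
// Affine INT8 quantization sketch: q = round(x / scale) + zero_point,
// dequantized as x' = (q - zero_point) * scale.
fn quantize(x: f32, scale: f32, zero_point: i32) -> i8 {
    ((x / scale).round() as i32 + zero_point).clamp(-128, 127) as i8
}

fn dequantize(q: i8, scale: f32, zero_point: i32) -> f32 {
    (q as i32 - zero_point) as f32 * scale
}

fn main() {
    let (scale, zp) = (0.1f32, 0);
    let q = quantize(1.23, scale, zp);
    assert_eq!(q, 12);
    // Round-trip error is bounded by scale / 2.
    assert!((dequantize(q, scale, zp) - 1.23).abs() <= scale / 2.0);
}
```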
RandomHorizontalFlip
Randomly flip horizontally with probability p. Uses xorshift64 PRNG seeded at construction. Input shape: [H, W, C].
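The xorshift64 PRNG named in the description can be sketched as follows; the shift constants (13, 7, 17) are the standard Marsaglia choice and are an assumption about the crate's implementation:

```rust
// One step of a Marsaglia xorshift64 generator (full period 2^64 - 1
// for nonzero state).
fn xorshift64(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}

fn main() {
    let mut s = 42u64; // seed fixed at construction => deterministic draws
    let a = xorshift64(&mut s);
    let b = xorshift64(&mut s);
    assert_ne!(a, b);
    // Map a draw to [0, 1) to compare against the flip probability p.
    let u = (xorshift64(&mut s) >> 11) as f64 / (1u64 << 53) as f64;
    assert!((0.0..1.0).contains(&u));
}
```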
RandomSampler
A sampler that yields indices in a random (deterministic) order.
ReLULayer
Stateless ReLU layer.
ResidualBlock
Residual block: runs input through a sequence of layers, then adds the original input as a skip connection (output = layers(input) + input).
Resize
Resize image tensor to target height and width using bilinear interpolation. Input shape: [H, W, C].
RnnCell
Vanilla RNN cell: h_t = tanh(x_t @ W_ih + h_{t-1} @ W_hh + b).
RnnLayer
RNN layer wrapping rnn_forward_sequence.
SafeTensorFile
A parsed SafeTensors file backed by an in-memory byte buffer.
ScaleValues
Scale f32 values by a constant factor.
ScheduledEpochMetrics
Metrics for one scheduler-driven epoch.
SchedulerTrainOptions
Scheduler-driven epoch training controls.
SeparableConv2dLayer
Separable 2D convolution layer (NHWC layout).
SequentialCheckpoint
Serializable sequential model checkpoint.
SequentialModel
Ordered stack of layers executed one-by-one.
SequentialSampler
A sampler that yields indices in sequential order.
SiLULayer
SiLU (Swish) activation layer.
SigmoidLayer
Stateless sigmoid activation layer.
SoftmaxLayer
Softmax layer over the last dimension.
SqueezeExciteBlock
Squeeze-and-Excitation block (inference-mode).
StreamingDataLoader
A data loader that lazily reads batches from disk, using a background thread to prefetch the next batch while the current batch is being processed.
SupervisedCsvConfig
Configuration for parsing/loading supervised CSV datasets.
SupervisedDataset
Supervised dataset with aligned input/target sample axis at position 0.
SupervisedImageFolderConfig
Configuration for loading supervised image-folder datasets.
SupervisedImageFolderLoadResult
Result payload for image-folder dataset loading with explicit class mapping.
SupervisedImageManifestConfig
Configuration for parsing/loading supervised image-manifest CSV datasets.
SupervisedJsonlConfig
Configuration for parsing/loading supervised JSONL datasets.
TanhLayer
Stateless tanh activation layer.
TcpAllReduceAggregator
Wrapper that uses a TcpTransport for gradient aggregation.
TcpTransport
TCP-based transport for multi-node gradient exchange.
TensorBoardCallback
Training callback that logs scalar metrics to TensorBoard event files.
TensorBoardWriter
Writes TensorBoard-compatible event files in TFRecord format.
TensorInfo
Per-tensor metadata extracted from the SafeTensors JSON header.
TensorSnapshot
Serializable tensor snapshot used in model checkpoints.
TopKCompressor
Top-K gradient compressor: keeps only the top ratio fraction of gradients.
TrainResult
Training result returned after fitting.
Trainer
High-level trainer that wraps optimizer + loss + callbacks configuration.
TrainerConfig
High-level training configuration.
TrainingLog
Records per-epoch training metrics.
TransformerDecoder
Stack of TransformerDecoderBlock layers.
TransformerDecoderBlock
Single transformer decoder block: masked self-attention → cross-attention → FFN, each sub-layer wrapped with residual connection and layer normalization.
TransformerEncoderBlock
Transformer encoder block: MHA -> Add&Norm -> FFN -> Add&Norm.
TransformerEncoderLayer
Transformer encoder layer wrapping TransformerEncoderBlock.
UNetDecoderStage
UNet decoder stage (inference-mode, NHWC).
UNetEncoderStage
UNet encoder stage (inference-mode, NHWC).
UpsampleLayer
Upsample layer: nearest or bilinear upsampling.
VisionTransformer
Vision Transformer (ViT) for image classification (inference-mode).
WeightedRandomSampler
Weighted random sampler: draws num_samples indices with probability proportional to weights.

Enums§

ImageAugmentationOp
Per-sample image augmentations for rank-4 NHWC training tensors.
ImageFolderTargetMode
Target mode for loading supervised image-folder datasets.
LayerCheckpoint
Serializable layer checkpoint payload.
LossKind
Which loss function to use.
ModelArchitecture
Known model architectures in the zoo.
ModelError
Errors returned by model-layer assembly, checkpoints, and training helpers.
ModelLayer
MonitorMode
Mode for metric monitoring.
NodeRole
Describes whether this node is the coordinator (rank 0) or a worker.
OptimizerKind
Which optimizer to use.
OptimizerType
Optimizer selection for multi-epoch CNN training.
QuantMode
Quantization mode.
SafeTensorDType
Supported element types in a SafeTensors file.
SamplingPolicy
Sample-order policy used by BatchIterOptions.
SupervisedLoss
Configures supervised-loss function used by train-step and train-epoch helpers.

Constants§

CRATE_ID

Traits§

GradientAggregator
Strategy for combining gradients across distributed workers.
TrainingCallback
Trait for training callbacks invoked after each epoch.
Transform
Trait for deterministic tensor transforms (preprocessing).
Transport
Byte-level communication primitive used by aggregation strategies.

Functions§

accumulate_gradients
Adds source gradients into the existing gradients of the given nodes.
adam_state_from_map
Restore Adam/AdamW state from a string-keyed map.
adam_state_to_map
Flatten Adam/AdamW state into a string-keyed map for serialization.
add_bottleneck_block
Adds a MobileNetV2-style inverted bottleneck block to a SequentialModel.
add_residual_block
Adds a ResNet-style residual block to a SequentialModel (inference-mode).
apply_pruning_mask
Apply a binary mask to weights (element-wise multiply).
batched_inference
Splits a large input into batches, runs inference, and reassembles.
bce_loss
Binary cross-entropy loss for predictions already passed through sigmoid. bce = -mean(target * log(pred) + (1 - target) * log(1 - pred)).
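The documented formula can be checked with a minimal sketch (illustration of the math only, not the crate's `bce_loss` signature):

```rust
// bce = -mean(t * ln(p) + (1 - t) * ln(1 - p)), with p already in (0, 1).
fn bce_loss(pred: &[f32], target: &[f32]) -> f32 {
    let n = pred.len() as f32;
    pred.iter()
        .zip(target)
        .map(|(p, t)| -(t * p.ln() + (1.0 - t) * (1.0 - p).ln()))
        .sum::<f32>()
        / n
}

fn main() {
    // A maximally uncertain prediction costs -ln(0.5) = ln 2 per sample.
    let loss = bce_loss(&[0.5, 0.5], &[1.0, 0.0]);
    assert!((loss - std::f32::consts::LN_2).abs() < 1e-6);
}
```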
bilstm_forward_sequence
Bidirectional LSTM: runs forward and backward LSTMs, concatenates outputs.
build_alexnet
Builds a simple AlexNet-style conv stack.
build_classifier
Builds a full classifier with a custom number of output classes.
build_feature_extractor
Builds a backbone (feature extractor) without the final classifier head.
build_mobilenet_v2
Builds a MobileNetV2-style model using inverted bottleneck blocks.
build_resnet
Builds a ResNet-family model: stem + residual stages + global-avg-pool + linear head.
build_resnet_custom
Builds a ResNet with per-stage block counts (bypasses the single-count helper).
build_resnet_feature_extractor
Builds a ResNet-like feature extractor (no final classifier).
build_simple_cnn_classifier
Builds a simple CNN classifier architecture for NHWC input.
build_vgg
Builds a VGG-style sequential conv network.
cast_params_for_forward
Convert model parameters from master precision to forward precision.
cast_to_master
Cast a list of tensors back to master dtype for gradient accumulation.
checkpoint_from_json
Parses a SequentialCheckpoint from its JSON representation.
checkpoint_to_json
Serializes a SequentialCheckpoint to JSON.
collect_gradients
Collects the current gradients for a set of nodes as owned tensors.
compress_gradients
Compress gradients by keeping only top-k% elements by magnitude.
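Top-k sparsification can be sketched as keeping the k largest-magnitude entries as (index, value) pairs; everything else decompresses back to zero. This is an assumed layout, not `CompressedGradient`'s actual representation:

```rust
// Keep the k largest-|g| entries as sparse (index, value) pairs.
fn compress_topk(g: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut idx: Vec<usize> = (0..g.len()).collect();
    idx.sort_by(|&a, &b| g[b].abs().partial_cmp(&g[a].abs()).unwrap());
    let mut kept: Vec<(usize, f32)> = idx.into_iter().take(k).map(|i| (i, g[i])).collect();
    kept.sort_by_key(|&(i, _)| i);
    kept
}

// Scatter the kept pairs back into a dense, mostly-zero vector.
fn decompress(pairs: &[(usize, f32)], len: usize) -> Vec<f32> {
    let mut out = vec![0.0; len];
    for &(i, v) in pairs {
        out[i] = v;
    }
    out
}

fn main() {
    let g = vec![0.1, -3.0, 0.2, 2.5];
    let c = compress_topk(&g, 2);
    assert_eq!(c, vec![(1, -3.0), (3, 2.5)]);
    assert_eq!(decompress(&c, 4), vec![0.0, -3.0, 0.0, 2.5]);
}
```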
constant
Fill a tensor with a constant value.
contrastive_loss
Contrastive loss for siamese networks.
cosine_embedding_loss
Cosine embedding loss.
cross_entropy_loss
Cross-entropy loss from raw logits. Computes nll_loss(log_softmax(logits), targets).
ctc_loss
CTC (Connectionist Temporal Classification) loss.
decompress_gradients
Decompress gradients back to full tensors.
default_cache_dir
Returns the default cache directory for downloaded model weights.
dequantize_weights
Dequantize a set of quantized weights back to f32 tensors.
dice_loss
Dice loss for segmentation.
distillation_loss
Knowledge distillation loss (Hinton et al., 2015).
distributed_train_step
Performs a single distributed training step: forward, backward, aggregate, update.
export_sequential_to_onnx
Exports a SequentialModel to an ONNX protobuf byte vector.
export_sequential_to_onnx_file
Exports a SequentialModel to an ONNX file.
focal_loss
Focal loss for imbalanced classification.
fuse_conv_bn
Fuse Conv2d + BatchNorm2d into a single Conv2d with adjusted weights and bias.
gather_shards
Reassemble shards (produced by shard_tensor) back into a single tensor.
generate_causal_mask
Generates a causal (lower-triangular) attention mask, returned as a [seq_len, seq_len] tensor.
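The lower-triangular mask can be sketched as follows; the 0.0 / -inf convention (add the mask to attention scores before softmax) is an assumption, as the crate may instead use a 0/1 convention:

```rust
// Causal mask: 0.0 where a query may attend (key index <= query index),
// -inf where the future is blocked.
fn causal_mask(seq_len: usize) -> Vec<Vec<f32>> {
    (0..seq_len)
        .map(|q| {
            (0..seq_len)
                .map(|k| if k <= q { 0.0 } else { f32::NEG_INFINITY })
                .collect()
        })
        .collect()
}

fn main() {
    let m = causal_mask(3);
    assert_eq!(m[0][0], 0.0);        // a position always sees itself
    assert!(m[0][1].is_infinite());  // future position blocked
    assert_eq!(m[2][1], 0.0);        // past position visible
}
```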
generate_padding_mask
Generates a padding mask for batched sequences of different lengths. lengths gives the actual length of each sequence in the batch; max_len is the maximum (padded) sequence length. Returns a [batch, max_len] tensor.
gru_forward_sequence
Runs a GRU cell over a sequence [batch, seq_len, input_size].
hinge_loss
Mean hinge loss: mean(max(0, margin - prediction * target)).
huber_loss
Mean Huber loss: mean(0.5 * min(|e|, delta)^2 + delta * max(|e| - delta, 0)), where e = prediction - target.
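The documented piecewise formula can be sketched directly; note that for |e| <= delta it reduces to the quadratic 0.5 * e^2 and beyond delta it grows linearly:

```rust
// mean(0.5 * min(|e|, delta)^2 + delta * max(|e| - delta, 0)), e = pred - target.
fn huber_loss(pred: &[f32], target: &[f32], delta: f32) -> f32 {
    let n = pred.len() as f32;
    pred.iter()
        .zip(target)
        .map(|(p, t)| {
            let e = (p - t).abs();
            0.5 * e.min(delta).powi(2) + delta * (e - delta).max(0.0)
        })
        .sum::<f32>()
        / n
}

fn main() {
    // |e| = 0.5 <= delta = 1.0 -> quadratic branch: 0.5 * 0.25 = 0.125
    assert!((huber_loss(&[0.5], &[0.0], 1.0) - 0.125).abs() < 1e-6);
    // |e| = 3.0 >  delta = 1.0 -> linear branch: 1.0 * 3.0 - 0.5 = 2.5
    assert!((huber_loss(&[3.0], &[0.0], 1.0) - 2.5).abs() < 1e-6);
}
```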
infer_batch
Batch inference on a SequentialModel (tensor mode, no autograd graph).
infer_batch_graph
Runs inference through the autograd graph and returns the output tensor value.
inspect_weights
Lists tensor names and shapes from a weight file without loading data.
kaiming_normal
Kaiming (He) normal initialization.
kaiming_uniform
Kaiming (He) uniform initialization.
kl_div_loss
KL divergence loss.
label_smoothing_cross_entropy
Cross-entropy with label smoothing.
load_state_dict
Load all tensors from a SafeTensors file into a name-to-tensor map.
load_supervised_dataset_csv_file
Loads supervised training samples from a CSV file.
load_supervised_dataset_jsonl_file
Loads supervised training samples from a JSONL file.
load_supervised_image_folder_dataset
Loads supervised training samples from an image-folder classification tree.
load_supervised_image_folder_dataset_with_classes
Loads supervised training samples from an image-folder classification tree and returns class mapping.
load_supervised_image_manifest_csv_file
Loads supervised training image-manifest CSV from file.
load_training_checkpoint
Load a full training checkpoint, splitting model weights from optimizer state.
load_weights
Loads named tensors from a binary weight file.
loopback_pair
Create a loopback TCP transport pair for testing.
lr_range_test
Run an LR range test.
lstm_forward_sequence
Runs an LSTM cell over a sequence [batch, seq_len, input_size].
mae_loss
Mean absolute error loss: mean(abs(prediction - target)).
mixed_precision_train_step
Runs a mixed-precision forward+backward step.
mse_loss
Mean squared error loss: mean((prediction - target)^2).
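A one-line sketch of the formula (illustration only, not the crate's tensor-based signature):

```rust
// mean((prediction - target)^2)
fn mse_loss(pred: &[f32], target: &[f32]) -> f32 {
    pred.iter().zip(target).map(|(p, t)| (p - t).powi(2)).sum::<f32>() / pred.len() as f32
}

fn main() {
    // Errors 0 and 2 -> mean of 0 and 4 is 2.
    assert_eq!(mse_loss(&[1.0, 2.0], &[1.0, 4.0]), 2.0);
}
```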
nll_loss
Negative log-likelihood loss from log-probabilities. Expects log_probs shape [batch, classes] and targets shape [batch, 1] where targets contain class indices as f32.
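The shape convention in the description (log-probs [batch, classes], targets as f32 class indices) can be sketched as picking the negative log-probability of each sample's target class and averaging:

```rust
// NLL from log-probabilities; targets hold class indices stored as f32.
fn nll_loss(log_probs: &[Vec<f32>], targets: &[f32]) -> f32 {
    let n = log_probs.len() as f32;
    log_probs
        .iter()
        .zip(targets)
        .map(|(row, &t)| -row[t as usize])
        .sum::<f32>()
        / n
}

fn main() {
    // Uniform 2-class log-probs: loss is -ln(0.5) = ln 2 regardless of target.
    let lp = vec![vec![0.5f32.ln(), 0.5f32.ln()]];
    assert!((nll_loss(&lp, &[1.0]) - std::f32::consts::LN_2).abs() < 1e-6);
}
```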
optimize_sequential
Scan a SequentialModel and fuse Conv2d + BatchNorm2d patterns.
orthogonal
Orthogonal initialization via QR decomposition (simplified Gram-Schmidt).
parse_supervised_dataset_csv
Parses supervised training samples from CSV text into a SupervisedDataset.
parse_supervised_dataset_jsonl
Parses supervised training samples from JSONL text into a SupervisedDataset.
parse_supervised_image_manifest_csv
Parses supervised training image-manifest CSV into a SupervisedDataset.
prune_magnitude
Prune weights by magnitude: zero out the smallest sparsity fraction.
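Magnitude pruning can be sketched as sorting indices by |weight| and zeroing the smallest `sparsity` fraction; the rounding of the cutoff count is an assumption:

```rust
// Zero out the floor(len * sparsity) smallest-magnitude weights in place.
fn prune_magnitude(weights: &mut [f32], sparsity: f32) {
    let k = (weights.len() as f32 * sparsity).floor() as usize;
    let mut idx: Vec<usize> = (0..weights.len()).collect();
    idx.sort_by(|&a, &b| weights[a].abs().partial_cmp(&weights[b].abs()).unwrap());
    for &i in idx.iter().take(k) {
        weights[i] = 0.0;
    }
}

fn main() {
    let mut w = vec![0.1, -2.0, 0.05, 1.5];
    prune_magnitude(&mut w, 0.5); // drop the two smallest-magnitude entries
    assert_eq!(w, vec![0.0, -2.0, 0.0, 1.5]);
}
```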
quantize_per_channel
Per-channel symmetric quantization of conv weights [KH, KW, C_in, C_out].
quantize_weights
Quantize all weight tensors in a model checkpoint for storage/inference.
quantized_matmul
Quantized matmul: dequantize -> f32 matmul -> re-quantize.
remap_state_dict
Remap an entire state dict from timm names to yscv names.
rnn_forward_sequence
Runs an RNN cell over a sequence [batch, seq_len, input_size].
save_training_checkpoint
Save a full training checkpoint: model weights + optimizer state.
save_weights
Saves a named set of tensors to a binary file (safetensors-like format).
scale_gradients
Scales gradients of the given nodes by a scalar factor.
scaled_dot_product_attention
Scaled dot-product attention: softmax(Q @ K^T / sqrt(d_k)) @ V.
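The formula can be sketched for a single query row against a set of key/value rows (plain-Rust illustration of softmax(q·K^T / sqrt(d_k)) @ V, not the crate's batched API):

```rust
fn softmax(v: &[f32]) -> Vec<f32> {
    let m = v.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let e: Vec<f32> = v.iter().map(|x| (x - m).exp()).collect();
    let s: f32 = e.iter().sum();
    e.iter().map(|x| x / s).collect()
}

// attention(q, K, V) = softmax(q . K^T / sqrt(d_k)) @ V for one query row.
fn attention(q: &[f32], keys: &[Vec<f32>], values: &[Vec<f32>]) -> Vec<f32> {
    let d_k = q.len() as f32;
    let scores: Vec<f32> = keys
        .iter()
        .map(|k| q.iter().zip(k).map(|(a, b)| a * b).sum::<f32>() / d_k.sqrt())
        .collect();
    let w = softmax(&scores);
    let dim = values[0].len();
    (0..dim)
        .map(|j| w.iter().zip(values).map(|(wi, v)| wi * v[j]).sum::<f32>())
        .collect()
}

fn main() {
    let keys = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let values = vec![vec![10.0, 0.0], vec![0.0, 10.0]];
    // A query aligned with the first key attends mostly to the first value.
    let out = attention(&[5.0, 0.0], &keys, &values);
    assert!(out[0] > out[1]);
}
```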
sgd_state_from_map
Restore SGD velocity buffers from a string-keyed map.
sgd_state_to_map
Flatten SGD velocity buffers into a string-keyed map for serialization.
shard_tensor
Shard a tensor along its first dimension into num_shards roughly equal parts.
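The "roughly equal parts" split can be sketched as contiguous row ranges, with the first `n % num_shards` shards taking one extra row; how the crate distributes the remainder is an assumption:

```rust
// Split n rows into num_shards contiguous (start, end) ranges.
fn shard_rows(n: usize, num_shards: usize) -> Vec<(usize, usize)> {
    let base = n / num_shards;
    let extra = n % num_shards;
    let mut start = 0;
    (0..num_shards)
        .map(|i| {
            let len = base + if i < extra { 1 } else { 0 };
            let r = (start, start + len);
            start += len;
            r
        })
        .collect()
}

fn main() {
    let shards = shard_rows(10, 3);
    assert_eq!(shards, vec![(0, 4), (4, 7), (7, 10)]);
    // Concatenating the ranges in order recovers all 10 rows (cf. gather_shards).
    assert_eq!(shards.iter().map(|(a, b)| b - a).sum::<usize>(), 10);
}
```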
smooth_l1_loss
Smooth L1 loss (detection-style parameterization of Huber loss).
split_into_stages
Split a model with num_layers layers into num_stages roughly equal stages.
timm_to_yscv_name
Translate a timm/PyTorch weight name to the corresponding yscv name.
train_cnn_epoch_adam
One-call CNN training epoch with Adam optimizer.
train_cnn_epoch_adamw
One-call CNN training epoch with AdamW optimizer.
train_cnn_epoch_sgd
One-call CNN training epoch: register params, forward, loss, backward, update, sync.
train_cnn_epochs
Runs multiple CNN training epochs, returning per-epoch metrics.
train_epoch_adam
Deterministic one-epoch Adam train loop over sequential mini-batches.
train_epoch_adam_with_loss
Deterministic one-epoch Adam train loop with configurable supervised loss.
train_epoch_adam_with_options
Deterministic one-epoch Adam train loop with configurable batch iterator options.
train_epoch_adam_with_options_and_loss
Deterministic one-epoch Adam train loop with configurable batch iterator options and loss.
train_epoch_adamw
Deterministic one-epoch AdamW train loop over sequential mini-batches.
train_epoch_adamw_with_loss
Deterministic one-epoch AdamW train loop with configurable supervised loss.
train_epoch_adamw_with_options
Deterministic one-epoch AdamW train loop with configurable batch iterator options.
train_epoch_adamw_with_options_and_loss
Deterministic one-epoch AdamW train loop with configurable batch iterator options and loss.
train_epoch_distributed
Train one epoch with distributed gradient synchronization.
train_epoch_distributed_sgd
Convenience wrapper: train one distributed epoch over a SequentialModel and SupervisedDataset with SGD.
train_epoch_rmsprop
Deterministic one-epoch RMSProp train loop over sequential mini-batches.
train_epoch_rmsprop_with_loss
Deterministic one-epoch RMSProp train loop with configurable supervised loss.
train_epoch_rmsprop_with_options
Deterministic one-epoch RMSProp train loop with configurable batch iterator options.
train_epoch_rmsprop_with_options_and_loss
Deterministic one-epoch RMSProp train loop with configurable batch iterator options and loss.
train_epoch_sgd
Deterministic one-epoch train loop over sequential mini-batches.
train_epoch_sgd_with_loss
Deterministic one-epoch train loop with configurable supervised loss.
train_epoch_sgd_with_options
Deterministic one-epoch train loop with configurable batch iterator options.
train_epoch_sgd_with_options_and_loss
Deterministic one-epoch train loop with configurable batch iterator options and loss.
train_epochs_adam_with_scheduler
Runs multiple Adam epochs and advances scheduler after each epoch.
train_epochs_adam_with_scheduler_and_loss
Runs multiple Adam epochs with configurable supervised loss and advances scheduler after each epoch.
train_epochs_adamw_with_scheduler
Runs multiple AdamW epochs and advances scheduler after each epoch.
train_epochs_adamw_with_scheduler_and_loss
Runs multiple AdamW epochs with configurable supervised loss and advances scheduler after each epoch.
train_epochs_rmsprop_with_scheduler
Runs multiple RMSProp epochs and advances scheduler after each epoch.
train_epochs_rmsprop_with_scheduler_and_loss
Runs multiple RMSProp epochs with configurable supervised loss and advances scheduler after each epoch.
train_epochs_sgd_with_scheduler
Runs multiple SGD epochs and advances scheduler after each epoch.
train_epochs_sgd_with_scheduler_and_loss
Runs multiple SGD epochs with configurable supervised loss and advances scheduler after each epoch.
train_epochs_with_callbacks
Train for multiple epochs with callbacks.
train_step_adam
Runs one full train step: loss forward, backward, and Adam updates.
train_step_adam_with_accumulation
Runs one training step with gradient accumulation across multiple micro-batches using the Adam optimizer.
train_step_adam_with_loss
Runs one full train step: configured loss forward, backward, and Adam updates.
train_step_adamw
Runs one full train step: loss forward, backward, and AdamW updates.
train_step_adamw_with_accumulation
Runs one training step with gradient accumulation across multiple micro-batches using the AdamW optimizer.
train_step_adamw_with_loss
Runs one full train step: configured loss forward, backward, and AdamW updates.
train_step_rmsprop
Runs one full train step: loss forward, backward, and RMSProp updates.
train_step_rmsprop_with_accumulation
Runs one training step with gradient accumulation across multiple micro-batches using the RMSProp optimizer.
train_step_rmsprop_with_loss
Runs one full train step: configured loss forward, backward, and RMSProp updates.
train_step_sgd
Runs one full train step: loss forward, backward, and SGD updates.
train_step_sgd_with_accumulation
Runs one training step with gradient accumulation across multiple micro-batches.
train_step_sgd_with_loss
Runs one full train step: configured loss forward, backward, and SGD updates.
triplet_loss
Triplet loss for metric learning.
xavier_normal
Xavier (Glorot) normal initialization.
xavier_uniform
Xavier (Glorot) uniform initialization.