Module utils


Utility functions for attention mechanisms.

This module provides common utilities like softmax, masking, and numerical stability helpers used across attention implementations.
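As a sketch of the numerical-stability idea behind these helpers (the actual signatures in this module may differ), a stable softmax subtracts the maximum value before exponentiating so large inputs do not overflow:

```rust
/// Hypothetical sketch of a numerically stable softmax; not the module's
/// exact API. Subtracting the max keeps every exponent <= 0, so exp()
/// never overflows even for large scores.
fn stable_softmax(xs: &[f64]) -> Vec<f64> {
    let max = xs.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = xs.iter().map(|x| (x - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

fn main() {
    // Naive exp(1000.0) would overflow to infinity; the shifted version is fine.
    let probs = stable_softmax(&[1000.0, 1001.0, 1002.0]);
    assert!((probs.iter().sum::<f64>() - 1.0).abs() < 1e-12);
    assert!(probs[2] > probs[1] && probs[1] > probs[0]);
}
```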

Functions

add_vectors
Adds two vectors element-wise.
apply_causal_mask
Applies causal masking to attention scores.
apply_dropout
Applies dropout to a vector during training.
dot_product
Computes dot product between two vectors.
l2_norm
Computes L2 norm of a vector.
masked_softmax
Computes softmax with masking support.
normalize_vector
Normalizes a vector to unit length.
scale_vector
Scales a vector by a scalar value.
softmax
Computes softmax over a slice of values.
stable_softmax
Stable softmax that returns a Vec directly (no Result); used by the sparse, moe, and graph modules.
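The causal-masking and masked-softmax helpers listed above typically work together: masked positions are set to negative infinity so they receive zero probability after the softmax. The following is a hypothetical sketch of that pattern, not this module's exact signatures:

```rust
/// Hypothetical sketch: block attention to positions after the query
/// position by setting their scores to -inf.
fn apply_causal_mask(scores: &mut [f64], query_pos: usize) {
    for (j, s) in scores.iter_mut().enumerate() {
        if j > query_pos {
            *s = f64::NEG_INFINITY;
        }
    }
}

/// Hypothetical sketch of a mask-aware softmax: exp(-inf - max) == 0,
/// so fully masked positions contribute zero probability. Assumes at
/// least one score is finite.
fn masked_softmax(scores: &[f64]) -> Vec<f64> {
    let max = scores.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = scores.iter().map(|s| (s - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

fn main() {
    let mut scores = vec![0.2, 0.5, 0.9, 1.3];
    apply_causal_mask(&mut scores, 1);
    let probs = masked_softmax(&scores);
    // Positions 2 and 3 are after the query position, so they get zero weight.
    assert_eq!(probs[2], 0.0);
    assert_eq!(probs[3], 0.0);
    assert!((probs.iter().sum::<f64>() - 1.0).abs() < 1e-12);
}
```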