GGUF-quantized EmbeddingGemma model internals.
Implements the full forward pass: token embedding → transformer layers (bidirectional attention) → mean pooling → dense projections → L2 normalization.
Structs§
- EmbeddingGemmaModel - The full EmbeddingGemma model loaded from GGUF + Dense layer safetensors.
Functions§
- l2_normalize - L2-normalizes a tensor along the last dimension.
- mean_pool - Mean-pools token embeddings over the sequence dimension, respecting an attention mask.
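
The two helper functions listed above can be illustrated with a minimal, dependency-free sketch. This is not the crate's implementation: the real functions operate on tensors and their signatures are not shown on this page, so the names `mean_pool_sketch` and `l2_normalize_sketch`, the `Vec<f32>` layout, and the epsilon guards are all assumptions made for illustration.

```rust
// Illustrative stand-ins only (hypothetical names and types); the crate's
// real `mean_pool` and `l2_normalize` work on tensors.

/// Mean-pools token embeddings over the sequence dimension.
/// `tokens` is laid out as [seq_len][hidden]; `mask[i]` is 1.0 for a real
/// token and 0.0 for padding, so padded positions do not affect the mean.
fn mean_pool_sketch(tokens: &[Vec<f32>], mask: &[f32]) -> Vec<f32> {
    let hidden = tokens.first().map_or(0, |t| t.len());
    let mut sum = vec![0.0f32; hidden];
    let mut count = 0.0f32;
    for (tok, &m) in tokens.iter().zip(mask) {
        for (s, &x) in sum.iter_mut().zip(tok) {
            *s += x * m; // masked positions contribute zero
        }
        count += m;
    }
    // Assumed guard: avoid dividing by zero for an all-padding sequence.
    let denom = count.max(1e-9);
    sum.iter().map(|s| s / denom).collect()
}

/// Scales a vector to unit Euclidean length (the "last dimension" of a
/// single pooled embedding).
fn l2_normalize_sketch(v: &[f32]) -> Vec<f32> {
    // Assumed guard: clamp the norm so a zero vector does not produce NaN.
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt().max(1e-12);
    v.iter().map(|x| x / norm).collect()
}

fn main() {
    // Three token embeddings; the third is padding and is masked out.
    let tokens = vec![vec![3.0, 0.0], vec![3.0, 8.0], vec![100.0, 100.0]];
    let mask = [1.0, 1.0, 0.0];

    let pooled = mean_pool_sketch(&tokens, &mask);
    let unit = l2_normalize_sketch(&pooled);
    println!("pooled = {:?}", pooled);
    println!("unit   = {:?}", unit);
}
```

In the forward pass described at the top of this page, the dense projections sit between these two steps; the sketch omits them and chains pooling directly into normalization only to keep the example short.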