ClassicBert architecture (BGE-small-en-v1.5).
12-layer BERT with learned position embeddings, GELU activation, fused QKV projections, and CLS pooling. This is the original BERT architecture used by BGE-small.
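Two of the features named above, fused QKV projections and CLS pooling, can be illustrated on plain slices. The following is a minimal sketch, not the crate's implementation: the function names `split_qkv` and `cls_pool`, the `[q | k | v]` memory layout, and the row-major `[seq_len, hidden]` layout are all assumptions made for illustration.

```rust
/// Split one fused QKV output row of length `3 * hidden` into (q, k, v).
/// Assumption: the fused projection stores its output as [q | k | v].
fn split_qkv(fused: &[f32], hidden: usize) -> (&[f32], &[f32], &[f32]) {
    assert_eq!(fused.len(), 3 * hidden, "fused row must hold q, k and v");
    let (q, rest) = fused.split_at(hidden);
    let (k, v) = rest.split_at(hidden);
    (q, k, v)
}

/// CLS pooling: use the hidden state of the first token ([CLS]) as the
/// sentence embedding. Assumption: `hidden_states` is row-major
/// with shape [seq_len, hidden].
fn cls_pool(hidden_states: &[f32], hidden: usize) -> &[f32] {
    assert!(hidden_states.len() >= hidden, "need at least one token");
    &hidden_states[..hidden]
}

fn main() {
    // One token, hidden size 2: the fused row holds q, k, v back to back.
    let fused = [1.0f32, 2.0, 3.0, 4.0, 5.0, 6.0];
    let (q, k, v) = split_qkv(&fused, 2);
    println!("q={:?} k={:?} v={:?}", q, k, v);

    // Two tokens, hidden size 2: the first row is the [CLS] token.
    let states = [0.1f32, 0.2, 0.3, 0.4];
    println!("pooled={:?}", cls_pool(&states, 2));
}
```

Fusing the three projections lets one matrix multiplication produce Q, K, and V together; the split afterwards is only a matter of slicing.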
Weight structures are generic over the tensor type T, which is Driver::Tensor when wired to a backend. The ModelArch implementation composes Driver primitives into the full forward pass.
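The pattern of weights generic over a tensor type, with a forward pass built from driver primitives, might look roughly like the sketch below. Everything here is hypothetical: the `Driver` and `ModelArch` traits are simplified stand-ins with invented methods (`add`, `forward`), and the `Toy*` types and `CpuDriver` do not exist in the crate. Only the idea of an associated `Tensor` type on the driver is taken from the description above.

```rust
/// Hypothetical stand-in for the crate's `Driver` trait. Only the associated
/// `Tensor` type comes from the documentation; `add` is an invented primitive.
trait Driver {
    type Tensor;
    /// Element-wise addition of two tensors.
    fn add(&self, a: &Self::Tensor, b: &Self::Tensor) -> Self::Tensor;
}

/// Toy per-layer weights, generic over the tensor type `T`.
struct ToyLayerWeights<T> {
    bias: T,
}

/// Toy full-model weights, generic over the tensor type `T`.
struct ToyWeights<T> {
    layers: Vec<ToyLayerWeights<T>>,
}

/// Hypothetical stand-in for `ModelArch`: a forward pass over a driver.
trait ModelArch<D: Driver> {
    fn forward(&self, driver: &D, input: D::Tensor) -> D::Tensor;
}

/// The architecture owns weights whose tensor type is fixed by the backend.
struct ToyArch<D: Driver> {
    weights: ToyWeights<D::Tensor>,
}

impl<D: Driver> ModelArch<D> for ToyArch<D> {
    fn forward(&self, driver: &D, input: D::Tensor) -> D::Tensor {
        // Compose driver primitives layer by layer. A real encoder layer
        // would run attention and a feed-forward block here instead.
        let mut x = input;
        for layer in &self.weights.layers {
            x = driver.add(&x, &layer.bias);
        }
        x
    }
}

/// A trivial CPU backend whose tensors are plain vectors.
struct CpuDriver;

impl Driver for CpuDriver {
    type Tensor = Vec<f32>;
    fn add(&self, a: &Vec<f32>, b: &Vec<f32>) -> Vec<f32> {
        a.iter().zip(b).map(|(x, y)| x + y).collect()
    }
}

fn main() {
    let weights = ToyWeights {
        layers: vec![
            ToyLayerWeights { bias: vec![1.0, 1.0] },
            ToyLayerWeights { bias: vec![0.5, -0.5] },
        ],
    };
    let arch: ToyArch<CpuDriver> = ToyArch { weights };
    let out = arch.forward(&CpuDriver, vec![0.0, 2.0]);
    println!("{:?}", out); // prints [1.5, 2.5]
}
```

Because the weight structs never name a concrete tensor type, the same architecture code can be reused with any backend that supplies its own `Tensor`.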
Structs
- ClassicBertArch: ClassicBert architecture: BGE-small-en-v1.5.
- ClassicBertLayerWeights: Weights for one ClassicBert encoder layer.
- ClassicBertWeights: Full ClassicBert model weights, generic over tensor type.