RLX-backed text and image embedding models.
Migrated from burnembed — compiles BERT / NomicBERT / NomicVision graphs
via rlx-runtime and exposes tier-0 inference helpers.
use rlx_models::embed::{Pooling, RlxBertModel, BertTokenizer, embed_with_rlx};
let tok = BertTokenizer::from_dir(model_dir, 512)?;
let mut model = RlxBertModel::load(&config, &weights)?;
let vecs = embed_with_rlx(&mut model, &tok, &["hello", "world"], Pooling::Mean)?;

Structs§
- BertTokenizer - Wrapper around HuggingFace tokenizer configured for BERT-style encoding.
- ImageModelInfo - Metadata for an image embedding model.
- ModelInfo - Metadata for an embedding model.
- RlxBertModel - RLX-compiled BERT model ready for inference.
- RlxEmbed - High-level embedding model that auto-detects BERT / NomicBERT / NomicVision.
- RlxNomicModel - RLX-compiled NomicBERT with shape-bucketed compile cache.
- RlxVisionModel - RLX-compiled NomicVision encoder (patch preprocess host-side, trunk on RLX).
- TokenizedBatch - Output of batch tokenization: token IDs, attention masks, and token type IDs.
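
To make the three parallel outputs of batch tokenization concrete, here is a minimal sketch of padding a batch of already-tokenized sequences. The names `PaddedBatch` and `pad_batch`, the field layout, and the pad ID are hypothetical illustrations; they are not taken from rlx_models, and the real TokenizedBatch may be laid out differently.

```rust
/// Hypothetical padded batch. This is NOT the actual `TokenizedBatch`
/// definition; it only illustrates the three parallel outputs.
struct PaddedBatch {
    input_ids: Vec<Vec<u32>>,
    attention_mask: Vec<Vec<u32>>,
    token_type_ids: Vec<Vec<u32>>,
}

/// Pad every sequence to the longest one in the batch, capped at `max_len`.
/// Longer sequences are truncated naively (a real tokenizer would usually
/// keep the trailing special token).
fn pad_batch(seqs: &[Vec<u32>], max_len: usize, pad_id: u32) -> PaddedBatch {
    let width = seqs
        .iter()
        .map(|s| s.len())
        .max()
        .unwrap_or(0)
        .min(max_len);

    let mut batch = PaddedBatch {
        input_ids: Vec::new(),
        attention_mask: Vec::new(),
        token_type_ids: Vec::new(),
    };

    for s in seqs {
        let kept = s.len().min(width);
        let mut ids = s[..kept].to_vec();
        let mut mask = vec![1u32; kept]; // 1 marks a real token
        ids.resize(width, pad_id);
        mask.resize(width, 0); // 0 marks padding
        batch.input_ids.push(ids);
        batch.attention_mask.push(mask);
        // Single-segment input: every token type ID is 0.
        batch.token_type_ids.push(vec![0u32; width]);
    }
    batch
}
```

The attention mask is what lets a later pooling step skip the padded positions.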
Enums§
- Arch - Detected embedding architecture from config.json.
- EmbeddingModel - Supported text embedding models.
- ImageEmbeddingModel - Supported image embedding models.
- ModelArch - Model architecture type.
- Pooling - Pooling strategy for reducing token hidden states to one vector per sequence.
Functions§
- assemble_vision_hidden - Assemble encoder input [batch, seq, hidden] from NCHW pixels + preprocess weights.
- compile_model - Compile an embedding graph for the given batch/seq on device.
- compile_model_cpu - Compile on CPU (convenience for tests and default RlxEmbed::from_dir).
- default_pooling - Default pooling heuristic from HuggingFace repo id.
- detect_arch - Detect architecture from config.json fields.
- embed_with_rlx - Embed texts with a compiled BERT model: tokenize, forward, pool, L2-normalize.
- l2_normalize_in_place - L2-normalize a vector in place (matches fastembed: divide by norm + 1e-12).
- models_map - Get the global model registry.
- pool_embeddings - Pool [batch, seq, hidden] hidden states into [batch, hidden] and L2-normalize.
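
The pooling and normalization steps described above can be sketched in plain Rust. This is a standalone illustration, not the crate's implementation: the function names `l2_normalize` and `mean_pool`, the flat row-major layout, and the `u32` mask type are assumptions. Only the "divide by norm + 1e-12" rule comes from the item description, and only mean pooling is shown.

```rust
/// L2-normalize `v` in place by dividing each element by (norm + 1e-12).
/// The epsilon keeps an all-zero vector from producing NaN.
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    let denom = norm + 1e-12;
    for x in v.iter_mut() {
        *x /= denom;
    }
}

/// Mean-pool hidden states stored row-major as [batch, seq, dim] into
/// [batch, dim], skipping positions whose attention mask is 0, then
/// L2-normalize each pooled vector. (Assumed layout and signature.)
fn mean_pool(
    hidden: &[f32],
    mask: &[u32],
    batch: usize,
    seq: usize,
    dim: usize,
) -> Vec<Vec<f32>> {
    assert_eq!(hidden.len(), batch * seq * dim);
    assert_eq!(mask.len(), batch * seq);

    let mut out = Vec::with_capacity(batch);
    for b in 0..batch {
        let mut acc = vec![0.0f32; dim];
        let mut count = 0.0f32;
        for s in 0..seq {
            if mask[b * seq + s] == 0 {
                continue; // padding token: excluded from the mean
            }
            count += 1.0;
            let start = (b * seq + s) * dim;
            for (a, h) in acc.iter_mut().zip(&hidden[start..start + dim]) {
                *a += *h;
            }
        }
        if count > 0.0 {
            for a in acc.iter_mut() {
                *a /= count;
            }
        }
        l2_normalize(&mut acc);
        out.push(acc);
    }
    out
}
```

For example, two unmasked tokens `[3, 0]` and `[0, 4]` average to `[1.5, 2.0]`, which has norm 2.5 and normalizes to roughly `[0.6, 0.8]`; a third token with mask 0 does not change the result.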