Crate rlx_embed

RLX-backed text and image embedding models.

Migrated from burnembed. The crate compiles BERT, NomicBERT, and NomicVision graphs via rlx-runtime and exposes tier-0 inference helpers.

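use rlx_models::embed::{BertTokenizer, Pooling, RlxBertModel, embed_with_rlx};

// Load the tokenizer from the model directory.
let tok = BertTokenizer::from_dir(model_dir, 512)?;
// Load the model from its config and weights.
let mut model = RlxBertModel::load(&config, &weights)?;
// Tokenize, run the forward pass, mean-pool, and L2-normalize.
let vecs = embed_with_rlx(&mut model, &tok, &["hello", "world"], Pooling::Mean)?;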

Structs

BertTokenizer
Wrapper around a HuggingFace tokenizer configured for BERT-style encoding.
ImageModelInfo
Metadata for an image embedding model.
ModelInfo
Metadata for an embedding model.
RlxBertModel
RLX-compiled BERT model ready for inference.
RlxEmbed
High-level embedding model that auto-detects BERT / NomicBERT / NomicVision.
RlxNomicModel
RLX-compiled NomicBERT with shape-bucketed compile cache.
RlxVisionModel
RLX-compiled NomicVision encoder (patch preprocess host-side, trunk on RLX).
TokenizedBatch
Output of batch tokenization: token IDs, attention masks, and token type IDs.
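The three tensors named in the TokenizedBatch description can be illustrated with a minimal, standalone sketch of batch padding. This is not the crate's implementation: the `PaddedBatch` struct, its field names, the `pad_batch` function, and the choice of 0 as the padding ID are all assumptions made for illustration. Truncation here simply cuts the tail, whereas a real tokenizer would normally preserve the trailing special token.

```rust
/// Hypothetical stand-in for a tokenized batch. Field names are assumptions,
/// not the actual layout of the crate's `TokenizedBatch`.
#[derive(Debug, PartialEq)]
pub struct PaddedBatch {
    pub input_ids: Vec<Vec<u32>>,
    pub attention_mask: Vec<Vec<u32>>,
    pub token_type_ids: Vec<Vec<u32>>,
}

/// Pad (or naively truncate) each sequence of token IDs to a common length.
/// `pad_id` is supplied by the caller; 0 is the usual [PAD] ID in BERT vocabularies.
pub fn pad_batch(seqs: &[Vec<u32>], max_len: usize, pad_id: u32) -> PaddedBatch {
    // Target length: the longest sequence in the batch, capped at `max_len`.
    let target = seqs.iter().map(|s| s.len()).max().unwrap_or(0).min(max_len);

    let mut input_ids = Vec::with_capacity(seqs.len());
    let mut attention_mask = Vec::with_capacity(seqs.len());
    let mut token_type_ids = Vec::with_capacity(seqs.len());

    for seq in seqs {
        let kept = seq.len().min(target);

        // Real tokens first, then padding up to the target length.
        let mut ids = seq[..kept].to_vec();
        ids.resize(target, pad_id);

        // 1 marks a real token, 0 marks padding.
        let mut mask = vec![1u32; kept];
        mask.resize(target, 0);

        input_ids.push(ids);
        attention_mask.push(mask);
        // Single-segment inputs use token type 0 at every position.
        token_type_ids.push(vec![0u32; target]);
    }

    PaddedBatch { input_ids, attention_mask, token_type_ids }
}
```

The attention mask is what lets a later pooling step ignore padded positions, so the padded rows do not dilute the sequence embedding.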

Enums

Arch
Detected embedding architecture from config.json.
EmbeddingModel
Supported text embedding models.
ImageEmbeddingModel
Supported image embedding models.
ModelArch
Model architecture type.
Pooling
Pooling strategy for reducing token hidden states to one vector per sequence.
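To make the Pooling description concrete, here is a small standalone sketch of how token hidden states for one sequence might be reduced to a single vector. It does not use the crate's types: `PoolKind` and `pool_sequence` are hypothetical names, and the `Cls` variant is an assumption (only `Pooling::Mean` appears in the crate example above). Mean pooling is shown in its common masked form, which averages only over positions whose attention mask is nonzero.

```rust
/// Hypothetical pooling selector. `Mean` mirrors the `Pooling::Mean` used in the
/// crate example; `Cls` is an assumed variant, included for comparison.
#[derive(Clone, Copy, Debug)]
pub enum PoolKind {
    Mean,
    Cls,
}

/// Reduce one sequence of hidden states to a single vector.
///
/// `hidden` is a flat, row-major `[seq, hidden_size]` buffer and `mask` has one
/// entry per token (1 = real token, 0 = padding).
pub fn pool_sequence(hidden: &[f32], mask: &[u32], hidden_size: usize, kind: PoolKind) -> Vec<f32> {
    assert!(!mask.is_empty(), "sequence must contain at least one token");
    assert_eq!(hidden.len(), mask.len() * hidden_size, "shape mismatch");

    match kind {
        // CLS pooling: take the hidden state of the first token.
        PoolKind::Cls => hidden[..hidden_size].to_vec(),
        // Masked mean pooling: average the hidden states of non-padding tokens.
        PoolKind::Mean => {
            let mut out = vec![0.0f32; hidden_size];
            let mut count = 0.0f32;
            for (t, &m) in mask.iter().enumerate() {
                if m == 0 {
                    continue; // skip padded positions
                }
                count += 1.0;
                let row = &hidden[t * hidden_size..(t + 1) * hidden_size];
                for (o, &h) in out.iter_mut().zip(row) {
                    *o += h;
                }
            }
            // Guard against an all-padding row to avoid dividing by zero.
            let denom = count.max(1.0);
            for o in &mut out {
                *o /= denom;
            }
            out
        }
    }
}
```

With a three-token sequence whose last position is padding, the mean variant averages only the first two rows, so large values in the padded row have no effect on the result.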

Functions

assemble_vision_hidden
Assemble encoder input [batch, seq, hidden] from NCHW pixels + preprocess weights.
compile_model
Compile an embedding graph for the given batch size and sequence length on the target device.
compile_model_cpu
Compile on CPU (convenience for tests and default RlxEmbed::from_dir).
default_pooling
Heuristic that picks the default pooling strategy from a HuggingFace repo ID.
detect_arch
Detect architecture from config.json fields.
embed_with_rlx
Embed texts with a compiled BERT model: tokenize, forward, pool, L2-normalize.
l2_normalize_in_place
L2-normalize a vector in place (matches fastembed: divide by norm + 1e-12).
models_map
Get the global model registry.
pool_embeddings
Pool [batch, seq, hidden] hidden states into [batch, hidden] and L2-normalize.
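The normalization rule quoted for l2_normalize_in_place (divide by norm + 1e-12) can be written out as a short standalone sketch. The names `l2_normalize_sketch` and `dot` are hypothetical and this is not the crate's code; it only follows the formula stated in the description. Adding the small constant to the norm keeps an all-zero vector from producing NaN values.

```rust
/// Scale `v` in place so that its Euclidean norm is approximately 1.
/// Follows the documented rule: divide each element by (norm + 1e-12).
pub fn l2_normalize_sketch(v: &mut [f32]) {
    // Euclidean norm: square root of the sum of squares.
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    // The epsilon keeps the denominator nonzero for an all-zero input.
    let denom = norm + 1e-12;
    for x in v.iter_mut() {
        *x /= denom;
    }
}

/// Dot product of two equal-length vectors. For unit-length inputs this
/// equals their cosine similarity.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

Because normalized embeddings have unit length, comparing two of them needs only `dot`, with no further division by their norms.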