docbert-pylate

Rust library for late-interaction (ColBERT) model inference, used by docbert for query and document encoding.

This crate is a vendored, Rust-only fork of pylate-rs. The upstream Python, WebAssembly, and npm packaging layers have been removed — docbert-pylate is consumed exclusively as a library from inside this workspace and is not intended to be published as a standalone crate.

What it provides

A ColBERT model loaded from a Hugging Face repo or a local directory.
BERT and ModernBERT backbones via Candle.
Query and document encoding with batched, rayon-parallel CPU execution and optional CUDA / Metal / MKL / Accelerate backends.
Hierarchical token pooling for document embeddings.

Acceleration features

Feature	Backend
(default)	Standard CPU
`accelerate`	Apple CPU (macOS)
`mkl`	Intel CPU (MKL)
`metal`	Apple GPU (M-series)
`cuda`	NVIDIA GPU (CUDA)

Features are propagated from docbert / docbert-core — see the top-level docbert crate for the user-facing build options.

License

MIT — same as upstream pylate-rs.

docbert-pylate 0.9.0

docbert-pylate

What it provides

Acceleration features

License