docbert-pylate
Rust library for late-interaction (ColBERT) model inference, used by
docbert for query and document encoding.
This crate is a vendored, Rust-only fork of pylate-rs.
The upstream Python, WebAssembly, and npm packaging layers have been removed —
docbert-pylate is consumed exclusively as a library from inside this workspace
and is not intended to be published as a standalone crate.
What it provides
- A
ColBERTmodel loaded from a Hugging Face repo or a local directory. - BERT and ModernBERT backbones via Candle.
- Query and document encoding with batched, rayon-parallel CPU execution and optional CUDA / Metal / MKL / Accelerate backends.
- Hierarchical token pooling for document embeddings.
Acceleration features
| Feature | Backend |
|---|---|
| (default) | Standard CPU |
accelerate |
Apple CPU (macOS) |
mkl |
Intel CPU (MKL) |
metal |
Apple GPU (M-series) |
cuda |
NVIDIA GPU (CUDA) |
Features are propagated from docbert / docbert-core — see the top-level
docbert crate for the user-facing build options.
License
MIT — same as upstream pylate-rs.