Crate flash_rerank

Flash-Rerank – Blazing-fast neural reranking engine.

Provides cross-encoder and ColBERT inference via ONNX Runtime with TensorRT, CUDA, and CPU execution providers.

Re-exports

pub use types::CacheMetadata;
pub use types::Device;
pub use types::ModelConfig;
pub use types::ModelFile;
pub use types::ModelManifest;
pub use types::Precision;
pub use types::RerankConfig;
pub use types::RerankRequest;
pub use types::RerankResult;
pub use types::ScorerType;

Modules

batch
Dynamic request batching for GPU inference.
calibrate
cascade
engine
Inference backends for cross-encoder and ColBERT models.
fusion
models
Model download, caching, and lifecycle management.
multi_gpu
tokenize
types

Enums

Error
Top-level error type for flash_rerank.

Functions

rerank
Convenience function: create an OrtScorer from a model directory and score documents.
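
The signature of `rerank` is not shown on this page, so the following is only a sketch of the general score-then-sort pattern that such a convenience function wraps. Every name here (`Scored`, `rerank_with`, `term_overlap`) is hypothetical, and the toy term-overlap scorer merely stands in for a real ONNX-backed cross-encoder.

```rust
// Hypothetical stand-in for a reranked hit; the fields of the crate's
// real RerankResult type are not shown on this page.
#[derive(Debug, Clone, PartialEq)]
pub struct Scored {
    pub index: usize, // position of the document in the input slice
    pub score: f32,   // relevance score, higher is better
}

// Hypothetical helper: score every document against the query,
// sort by descending score, and keep the best `top_k`.
pub fn rerank_with<F>(query: &str, documents: &[&str], top_k: usize, scorer: F) -> Vec<Scored>
where
    F: Fn(&str, &str) -> f32,
{
    let mut scored: Vec<Scored> = documents
        .iter()
        .enumerate()
        .map(|(index, &doc)| Scored { index, score: scorer(query, doc) })
        .collect();
    // total_cmp defines a total order on f32, so NaN scores cannot panic the sort.
    scored.sort_by(|a, b| b.score.total_cmp(&a.score));
    scored.truncate(top_k);
    scored
}

// Toy scorer (an assumption, not a neural model): the fraction of
// whitespace-separated query terms that occur in the document.
pub fn term_overlap(query: &str, doc: &str) -> f32 {
    let terms: Vec<&str> = query.split_whitespace().collect();
    if terms.is_empty() {
        return 0.0;
    }
    let hits = terms.iter().filter(|t| doc.contains(**t)).count();
    hits as f32 / terms.len() as f32
}
```

Taking the scorer as a closure keeps the sorting and truncation logic independent of the backend, which is one plausible way to separate it from model loading.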

Type Aliases

Result
Result alias for flash_rerank operations.
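
Crate-level `Result` aliases in Rust conventionally fix the error parameter to the crate's own `Error` type. A minimal sketch of that pattern follows; the `InvalidInput` variant and the `parse_top_k` helper are hypothetical and are not taken from flash_rerank, whose actual `Error` variants are not listed on this page.

```rust
mod sketch {
    use std::fmt;

    // Hypothetical error enum standing in for the crate's top-level Error.
    #[derive(Debug, PartialEq)]
    pub enum Error {
        InvalidInput(String),
    }

    impl fmt::Display for Error {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            match self {
                Error::InvalidInput(msg) => write!(f, "invalid input: {msg}"),
            }
        }
    }

    impl std::error::Error for Error {}

    // The usual shape of a crate-level alias: only the success type varies.
    pub type Result<T> = std::result::Result<T, Error>;

    // Hypothetical helper showing the alias in a signature.
    pub fn parse_top_k(raw: &str) -> Result<usize> {
        raw.trim()
            .parse::<usize>()
            .map_err(|e| Error::InvalidInput(format!("top_k {raw:?}: {e}")))
    }
}
```

With an alias of this shape, fallible functions can be written as `Result<T>` rather than repeating the error type in every signature.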