Trait LabelEncoder 

pub trait LabelEncoder: Send + Sync {
    // Required methods
    fn encode_label(&self, label: &str) -> Result<Vec<f32>>;
    fn encode_labels(&self, labels: &[&str]) -> Result<Vec<f32>>;
    fn hidden_dim(&self) -> usize;
}
Label encoder trait for encoding entity type descriptions.

§Motivation

Zero-shot NER works by encoding entity type descriptions into the same vector space as text spans. Instead of training separate classifiers for each entity type, we compute similarity between spans and label embeddings.

This enables:

  • Unlimited entity types at inference (no retraining needed)
  • Faster inference when labels are pre-computed
  • Better generalization to unseen entity types via semantic similarity
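
The similarity step described above can be sketched as follows. This is a minimal illustration, not the crate's actual scoring code: it assumes cosine similarity as the metric, assumes label embeddings are stored in one flat buffer with `hidden_dim` values per label, and the helper names `cosine_similarity` and `best_label` are hypothetical.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns 0.0 if either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same length");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Score one span embedding against every label embedding in a flat
/// buffer (assumed layout: `hidden_dim` consecutive values per label)
/// and return the index and score of the best-matching label.
fn best_label(span: &[f32], label_embeddings: &[f32], hidden_dim: usize) -> Option<(usize, f32)> {
    if hidden_dim == 0 {
        return None;
    }
    label_embeddings
        .chunks_exact(hidden_dim)
        .map(|label| cosine_similarity(span, label))
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(&b.1))
}
```

Because the label embeddings do not depend on the input text, they can be computed once and reused for every span, which is where the inference speed-up comes from.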

§Research Alignment

From GLiNER bi-encoder (knowledgator/modern-gliner-bi-base-v1.0):

“textual encoder is ModernBERT-base and entity label encoder is sentence transformer - BGE-small-en.”

§Example

use anno::LabelEncoder;

fn setup_custom_types(encoder: &dyn LabelEncoder) {
    // Encode rich descriptions for better matching
    let labels = &[
        "a named individual human being",
        "a company, institution, or organized group",
        "a geographical location, city, country, or region",
    ];

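    // One flat buffer holding labels.len() * encoder.hidden_dim() values
    let embeddings = encoder
        .encode_labels(labels)
        .expect("label encoding failed");
    debug_assert_eq!(embeddings.len(), labels.len() * encoder.hidden_dim());
    // Store embeddings in SemanticRegistry for fast lookup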
}

Required Methods§

fn encode_label(&self, label: &str) -> Result<Vec<f32>>

Encode a single label description.

§Arguments
  • label - Label description (e.g., “a named individual human being”)

fn encode_labels(&self, labels: &[&str]) -> Result<Vec<f32>>

Encode multiple labels.

§Arguments
  • labels - Label descriptions
§Returns

Flattened embeddings with logical shape [num_labels, hidden_dim], returned as a single Vec<f32> of length num_labels * hidden_dim.
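
A flat buffer like this can be viewed as one row per label. The sketch below assumes a row-major layout (label `i` occupies elements `i * hidden_dim` up to `(i + 1) * hidden_dim`), which is the natural reading of the shape above but is not stated explicitly in these docs. The helpers `label_row` and `rows_by_label` are hypothetical and not part of the crate.

```rust
/// Borrow the embedding of label `index` from a flat buffer,
/// assuming row-major layout. Returns None if the row is out of range.
fn label_row(flat: &[f32], hidden_dim: usize, index: usize) -> Option<&[f32]> {
    let start = index.checked_mul(hidden_dim)?;
    let end = start.checked_add(hidden_dim)?;
    flat.get(start..end)
}

/// Pair each label string with its embedding row.
/// Returns None if the buffer length does not match labels.len() * hidden_dim.
fn rows_by_label<'a>(
    labels: &[&'a str],
    flat: &'a [f32],
    hidden_dim: usize,
) -> Option<Vec<(&'a str, &'a [f32])>> {
    if hidden_dim == 0 || flat.len() != labels.len() * hidden_dim {
        return None;
    }
    Some(
        labels
            .iter()
            .copied()
            .zip(flat.chunks_exact(hidden_dim))
            .collect(),
    )
}
```

Checking the buffer length against `labels.len() * hidden_dim` up front catches an encoder whose output does not match its reported `hidden_dim()`.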

fn hidden_dim(&self) -> usize

Get the hidden dimension, i.e. the number of f32 values in each label embedding.
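
To show how the three required methods fit together, here is a toy implementor. It is a sketch under stated assumptions: the trait is redeclared locally so the block compiles on its own, `EncodeResult` is a stand-in for the crate's own `Result` alias (whose error type is not shown on this page), and `ByteHistogramEncoder` is a hypothetical encoder that counts bytes into buckets. It produces no meaningful semantics; a real implementation would wrap a sentence-embedding model.

```rust
// Stand-in for the crate's `Result` alias (assumption: error type unknown here).
type EncodeResult<T> = std::result::Result<T, String>;

// Local mirror of the trait so this sketch is self-contained.
// In real code, `use anno::LabelEncoder;` instead.
trait LabelEncoder: Send + Sync {
    fn encode_label(&self, label: &str) -> EncodeResult<Vec<f32>>;
    fn encode_labels(&self, labels: &[&str]) -> EncodeResult<Vec<f32>>;
    fn hidden_dim(&self) -> usize;
}

/// Hypothetical toy encoder: a histogram of byte values modulo `dim`.
struct ByteHistogramEncoder {
    dim: usize,
}

impl LabelEncoder for ByteHistogramEncoder {
    fn encode_label(&self, label: &str) -> EncodeResult<Vec<f32>> {
        if self.dim == 0 {
            return Err("hidden_dim must be non-zero".to_string());
        }
        let mut embedding = vec![0.0_f32; self.dim];
        for byte in label.bytes() {
            embedding[byte as usize % self.dim] += 1.0;
        }
        Ok(embedding)
    }

    fn encode_labels(&self, labels: &[&str]) -> EncodeResult<Vec<f32>> {
        // Concatenate per-label embeddings into one flat buffer.
        let mut flat = Vec::with_capacity(labels.len() * self.dim);
        for label in labels {
            flat.extend(self.encode_label(label)?);
        }
        Ok(flat)
    }

    fn hidden_dim(&self) -> usize {
        self.dim
    }
}
```

Building `encode_labels` on top of `encode_label` keeps the two methods consistent. A model-backed encoder would more likely batch the labels through the model in a single forward pass.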

Implementors§