Struct NERExtractor 

pub struct NERExtractor { /* private fields */ }

NER extractor with fallback support.

This is the recommended way to use NER in anno. It handles:

  • Backend selection based on available features
  • Graceful fallback when ML models fail
  • Hybrid mode combining ML and patterns

§Example

use anno::backends::extractor::NERExtractor;

// Automatic selection (best available)
let extractor = NERExtractor::best_available();

// Explicit backend
let extractor = NERExtractor::with_bert_onnx("protectai/bert-base-NER-onnx")?;

// Extract entities
let entities = extractor.extract("Apple announced new iPhone in Cupertino.", None)?;

Implementations§

impl NERExtractor

pub fn new(primary: Option<Arc<dyn Model>>, backend_type: BackendType) -> Self

Create an extractor from an optional primary model and an explicit backend type.

pub fn pattern_only() -> Self

Create with regex-based backend only.

Limited to structured entities: DATE, TIME, MONEY, PERCENT, EMAIL, URL, PHONE.
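
To illustrate what format-based matching of structured entities means, here is a minimal sketch that recognizes PERCENT and EMAIL tokens using only the standard library. It is not anno's implementation, which is regex-based; `StructuredKind`, `classify_token`, and `find_structured` are hypothetical names, and the rules are deliberately simplistic.

```rust
/// Hypothetical label set for this sketch; not anno's `EntityType`.
#[derive(Debug, PartialEq)]
enum StructuredKind {
    Percent,
    Email,
}

/// Classify a single token by its surface format alone (no context, no model).
fn classify_token(token: &str) -> Option<StructuredKind> {
    // PERCENT: digits with optional decimal points, followed by '%'.
    if let Some(number) = token.strip_suffix('%') {
        let digits_only = number.chars().all(|c| c.is_ascii_digit() || c == '.');
        let has_digit = number.chars().any(|c| c.is_ascii_digit());
        if digits_only && has_digit {
            return Some(StructuredKind::Percent);
        }
    }
    // EMAIL: non-empty local part, one '@', and a dotted domain.
    if let Some((local, domain)) = token.split_once('@') {
        let domain_ok = domain.contains('.')
            && !domain.starts_with('.')
            && !domain.ends_with('.')
            && !domain.contains('@');
        if !local.is_empty() && domain_ok {
            return Some(StructuredKind::Email);
        }
    }
    None
}

/// Scan whitespace-separated tokens, dropping trailing sentence punctuation.
fn find_structured(text: &str) -> Vec<(&str, StructuredKind)> {
    text.split_whitespace()
        .map(|tok| tok.trim_end_matches(|c: char| matches!(c, '.' | ',' | ';' | '!' | '?')))
        .filter_map(|tok| classify_token(tok).map(|kind| (tok, kind)))
        .collect()
}
```

Calling `find_structured` on a sentence returns only the tokens whose shape matches one of the rules; a word such as "Apple" is never returned, which is why a pattern-only extractor cannot find people, organizations, or locations.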

pub fn best_available() -> Self

Create the best available NER extractor.

Tries backends in priority order:

  1. GLiNER (if onnx feature enabled) - zero-shot
  2. BERT ONNX (if onnx feature enabled) - reliable fixed-type NER
  3. Candle (if candle feature enabled) - Rust-native inference
  4. RegexNER (always) - structured entities only
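
The priority order above can be sketched as a loop over candidates that returns the first one that loads. This is an illustration only, not anno's code: the `Backend` enum and `select_backend` are hypothetical, the feature flags are passed as runtime booleans so the sketch can be tested (a real crate would more likely check them at compile time, for example with `cfg!(feature = "onnx")`), and treating a failed load as "try the next backend" is an assumption.

```rust
/// Hypothetical backend tags for this sketch; anno's `BackendType` may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Backend {
    Gliner,
    BertOnnx,
    Candle,
    Regex,
}

/// Return the first enabled backend, in priority order, whose loader succeeds.
/// The regex backend needs no model, so it is the unconditional last resort.
fn select_backend(
    onnx_enabled: bool,
    candle_enabled: bool,
    try_load: impl Fn(Backend) -> Result<(), String>,
) -> Backend {
    let candidates = [
        (Backend::Gliner, onnx_enabled),
        (Backend::BertOnnx, onnx_enabled),
        (Backend::Candle, candle_enabled),
    ];
    for (backend, enabled) in candidates {
        // Skip backends whose feature is off; skip those that fail to load.
        if enabled && try_load(backend).is_ok() {
            return backend;
        }
    }
    Backend::Regex
}
```

Because the last candidate always succeeds, the selection itself cannot fail, which is consistent with `best_available()` returning `Self` rather than `Result<Self>`.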

pub fn fast() -> Self

Create the fastest available NER extractor.

Prioritizes speed over accuracy:

  1. GLiNER small (if onnx feature) - fast zero-shot
  2. RegexNER (always)

pub fn best_quality() -> Self

Create the highest quality NER extractor.

Prioritizes accuracy over speed:

  1. GLiNER large (if onnx feature) - highest accuracy
  2. GLiNER medium (if onnx feature) - fallback
  3. BERT ONNX (if onnx feature) - reliable
  4. RegexNER (always)

pub fn with_bert_onnx(model_name: &str) -> Result<Self>

Create with BERT ONNX backend.

Uses standard BERT models fine-tuned for NER with BIO tagging. Reliable and widely tested, but limited to fixed entity types.
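
For readers unfamiliar with BIO tagging, the sketch below shows how a tag sequence is typically decoded into entities: `B-X` begins an entity of type X, `I-X` continues it, and `O` marks a token outside any entity. It is a generic illustration, not anno's decoder; `decode_bio` and `flush` are hypothetical names, and the sketch works on whole words, whereas BERT models emit tags per sub-word piece.

```rust
/// Close the entity being built, if any, and append it to `out`.
fn flush(current: &mut Option<(Vec<&str>, String)>, out: &mut Vec<(String, String)>) {
    if let Some((words, label)) = current.take() {
        out.push((words.join(" "), label));
    }
}

/// Decode parallel `tokens` and BIO `tags` into (entity_text, entity_type) pairs.
fn decode_bio(tokens: &[&str], tags: &[&str]) -> Vec<(String, String)> {
    let mut entities = Vec::new();
    let mut current: Option<(Vec<&str>, String)> = None;

    for (token, tag) in tokens.iter().zip(tags.iter()) {
        if let Some(label) = tag.strip_prefix("B-") {
            // B- always starts a new entity, closing any open one.
            flush(&mut current, &mut entities);
            current = Some((vec![*token], label.to_string()));
        } else if let Some(label) = tag.strip_prefix("I-") {
            // I- extends the open entity only if the types agree.
            let continues = matches!(&current, Some((_, open)) if open.as_str() == label);
            if continues {
                if let Some((words, _)) = current.as_mut() {
                    words.push(*token);
                }
            } else {
                // Lenient choice: a stray I- tag starts a new entity.
                flush(&mut current, &mut entities);
                current = Some((vec![*token], label.to_string()));
            }
        } else {
            // "O" (or any unknown tag) ends the open entity.
            flush(&mut current, &mut entities);
        }
    }
    flush(&mut current, &mut entities);
    entities
}
```

The set of labels that can follow `B-` and `I-` is fixed when the model is fine-tuned, which is why this backend is limited to fixed entity types.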

§Arguments
  • model_name - HuggingFace model identifier (e.g., “protectai/bert-base-NER-onnx”)

pub fn with_gliner(model_name: &str) -> Result<Self>

Create with GLiNER backend (zero-shot NER).

GLiNER is the recommended backend for best accuracy on named entities. It supports zero-shot NER (any entity type without retraining).

§Arguments
  • model_name - HuggingFace model identifier (e.g., “onnx-community/gliner_small-v2.1”)

pub fn with_candle(_model_name: &str) -> Result<Self>

Stub used when the candle feature is disabled.

pub fn extract(&self, text: &str, language: Option<&str>) -> Result<Vec<Entity>>

Extract entities with automatic fallback.

Tries the primary ML backend first and falls back to patterns if it fails.
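
The fallback behavior can be pictured with the following self-contained sketch. It does not use anno's types: `Entity`, `Backend`, and `FallbackExtractor` here are hypothetical stand-ins, and discarding the primary backend's error before falling back is an assumption about the behavior described above.

```rust
/// Hypothetical entity record for this sketch; anno's `Entity` has its own fields.
#[derive(Debug, Clone, PartialEq)]
struct Entity {
    text: String,
    label: String,
}

/// Hypothetical stand-in for a backend: any function from text to entities.
type Backend = Box<dyn Fn(&str) -> Result<Vec<Entity>, String>>;

struct FallbackExtractor {
    /// Optional ML backend, tried first when present.
    primary: Option<Backend>,
    /// Pattern backend, used when the primary is absent or fails.
    fallback: Backend,
}

impl FallbackExtractor {
    fn extract(&self, text: &str) -> Result<Vec<Entity>, String> {
        if let Some(primary) = &self.primary {
            if let Ok(entities) = primary(text) {
                return Ok(entities);
            }
            // The primary failed: ignore its error and use the fallback.
        }
        (self.fallback)(text)
    }
}
```

With this shape, a caller sees an error only if the fallback itself fails; a missing or broken ML model degrades the result to pattern entities instead of aborting the call.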

pub fn extract_hybrid(&self, text: &str, language: Option<&str>) -> Result<Vec<Entity>>

Extract entities using hybrid strategy.

Combines an ML model (for semantic entities) with patterns (for structured entities):

  • ML: Person, Organization, Location (context-dependent)
  • Patterns: Date, Money, Percent, Email, URL, Phone (format-based)

This combines the strengths of both approaches:

  • High F1 on ambiguous entities (via ML)
  • 100% precision on pattern entities (via patterns)
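
One way to combine the two result sets is sketched below. The documentation does not say how anno resolves conflicts, so the rule used here (pattern matches win when an ML span overlaps them) is an assumption, and `Span`, `overlaps`, and `merge_hybrid` are hypothetical names rather than anno APIs.

```rust
/// Hypothetical entity span for this sketch: byte offsets with exclusive `end`.
#[derive(Debug, Clone, PartialEq)]
struct Span {
    start: usize,
    end: usize,
    label: String,
}

/// Two half-open ranges overlap when each starts before the other ends.
fn overlaps(a: &Span, b: &Span) -> bool {
    a.start < b.end && b.start < a.end
}

/// Keep every pattern span, plus each ML span that overlaps no pattern span,
/// and return the result ordered by position in the text.
fn merge_hybrid(ml: Vec<Span>, patterns: Vec<Span>) -> Vec<Span> {
    let mut merged: Vec<Span> = ml
        .into_iter()
        .filter(|m| !patterns.iter().any(|p| overlaps(m, p)))
        .collect();
    merged.extend(patterns);
    merged.sort_by_key(|s| s.start);
    merged
}
```

Under this rule, an ML prediction that disagrees with a format-based match is dropped, while context-dependent entities such as organizations pass through untouched.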

pub fn backend_type(&self) -> BackendType

Get the active backend type.

pub fn active_backend_name(&self) -> &'static str

Get the name of the active backend.

pub fn has_ml_backend(&self) -> bool

Check if ML backend is available.

pub fn supports_zero_shot(&self) -> bool

Check if this extractor supports zero-shot NER.

Trait Implementations§

impl Model for NERExtractor

fn extract_entities(&self, text: &str, language: Option<&str>) -> Result<Vec<Entity>>

Extract entities from text.

fn supported_types(&self) -> Vec<EntityType>

Get supported entity types.

fn is_available(&self) -> bool

Check if model is available and ready.

fn name(&self) -> &'static str

Get the model name/identifier.

fn description(&self) -> &'static str

Get a description of the model.

fn capabilities(&self) -> ModelCapabilities

Get capability summary for this model.

fn version(&self) -> String

Get a version identifier for the model configuration/weights.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> PolicyExt for T
where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.