pub struct NERExtractor { /* private fields */ }
NER extractor with fallback support.
This is the recommended way to use NER in anno. It handles:
- Backend selection based on available features
- Graceful fallback when ML models fail
- Hybrid mode combining ML and patterns
§Example
use anno::backends::extractor::NERExtractor;
// Automatic selection (best available)
let extractor = NERExtractor::best_available();
// Explicit backend
let extractor = NERExtractor::with_bert_onnx("protectai/bert-base-NER-onnx")?;
// Extract entities
let entities = extractor.extract("Apple announced new iPhone in Cupertino.", None)?;
Implementations§
impl NERExtractor
pub fn new(primary: Option<Arc<dyn Model>>, backend_type: BackendType) -> Self
Create with an explicit primary model and backend type.
pub fn pattern_only() -> Self
Create with regex-based backend only.
Limited to structured entities: DATE, TIME, MONEY, PERCENT, EMAIL, URL, PHONE.
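To illustrate why a pattern backend only covers structured entities, here is a minimal, standard-library-only sketch that detects PERCENT values purely from their format. It is not the crate's implementation: the source describes a regex-based backend, whereas this sketch swaps in plain string scanning, and `find_percents` is a hypothetical helper.

```rust
/// Hypothetical helper: returns whitespace-separated tokens that look like
/// percentages, such as "12%" or "3.5%". Detection relies only on the shape
/// of the token, not on context.
fn find_percents(text: &str) -> Vec<&str> {
    text.split_whitespace()
        // Drop trailing sentence punctuation, e.g. the final '.' in "3.5%."
        .map(|tok| tok.trim_end_matches(|c: char| matches!(c, '.' | ',' | ';' | ':' | '!' | '?')))
        .filter(|tok| match tok.strip_suffix('%') {
            // The part before '%' must consist of digits and dots and parse as a number.
            Some(num) => {
                num.chars().all(|c| c.is_ascii_digit() || c == '.')
                    && num.parse::<f64>().is_ok()
            }
            None => false,
        })
        .collect()
}
```

A format check like this can recognise "12%" anywhere, but it has no way to decide whether "Apple" is a company or a fruit, which is why semantic types need an ML backend.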
pub fn best_available() -> Self
Create the best available NER extractor.
Tries backends in priority order:
- GLiNER (if onnx feature enabled) - zero-shot
- BERT ONNX (if onnx feature enabled) - reliable fixed-type NER
- Candle (if candle feature enabled) - Rust-native inference
- RegexNER (always) - structured entities only
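The priority order above can be sketched as a first-match search over candidate backends. This is an illustration under assumptions, not the crate's code: the `Backend` enum, `select_backend`, and the `try_load` callback are hypothetical, and the boolean flags stand in for what real code would likely obtain from `cfg!(feature = "onnx")` and `cfg!(feature = "candle")`.

```rust
/// Hypothetical backend labels mirroring the priority list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    Gliner,
    BertOnnx,
    Candle,
    Regex,
}

/// Returns the first enabled backend whose loader succeeds, or `Regex`
/// when none does. `try_load` is a stand-in for real model loading.
fn select_backend(
    onnx_enabled: bool,
    candle_enabled: bool,
    try_load: impl Fn(Backend) -> bool,
) -> Backend {
    // Candidates in priority order, paired with their feature flag.
    let candidates = [
        (Backend::Gliner, onnx_enabled),
        (Backend::BertOnnx, onnx_enabled),
        (Backend::Candle, candle_enabled),
    ];
    candidates
        .iter()
        .filter(|(_, enabled)| *enabled)
        .map(|(backend, _)| *backend)
        .find(|backend| try_load(*backend))
        // The pattern backend needs no model, so it is always available.
        .unwrap_or(Backend::Regex)
}
```

Because the regex backend is the unconditional last resort, a selection of this shape can return `Self` rather than `Result<Self>`, which matches the signature of `best_available`.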
pub fn fast() -> Self
Create the fastest available NER extractor.
Prioritizes speed over accuracy:
- GLiNER small (if onnx feature) - fast zero-shot
- RegexNER (always)
pub fn best_quality() -> Self
Create the highest quality NER extractor.
Prioritizes accuracy over speed:
- GLiNER large (if onnx feature) - highest accuracy
- GLiNER medium (if onnx feature) - fallback
- BERT ONNX (if onnx feature) - reliable
- RegexNER (always)
pub fn with_bert_onnx(model_name: &str) -> Result<Self>
Create with BERT ONNX backend.
Uses standard BERT models fine-tuned for NER with BIO tagging. Reliable and widely tested, but limited to fixed entity types.
§Arguments
model_name - HuggingFace model identifier (e.g., “protectai/bert-base-NER-onnx”)
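For readers unfamiliar with BIO tagging, the following sketch shows how per-token tags (B-X begins an entity, I-X continues it, O is outside any entity) can be grouped into entities. It is a simplified illustration, not the crate's decoder: the `Entity` struct and `decode_bio` are hypothetical, and subword tokenization and character offsets are ignored.

```rust
/// Hypothetical, simplified entity: a label plus the joined token text.
#[derive(Debug, PartialEq)]
struct Entity {
    label: String,
    text: String,
}

/// Groups BIO tags into entities, one tag per token.
fn decode_bio(tokens: &[&str], tags: &[&str]) -> Vec<Entity> {
    let mut entities: Vec<Entity> = Vec::new();
    // True while the most recent entity may still be extended by an I- tag.
    let mut open = false;
    for (token, tag) in tokens.iter().zip(tags.iter()) {
        if let Some(label) = tag.strip_prefix("B-") {
            entities.push(Entity { label: label.to_string(), text: token.to_string() });
            open = true;
        } else if let Some(label) = tag.strip_prefix("I-") {
            match entities.last_mut() {
                // Continue the open entity when the label matches.
                Some(last) if open && last.label == label => {
                    last.text.push(' ');
                    last.text.push_str(token);
                }
                // An I- tag without a matching open entity starts a new one.
                _ => {
                    entities.push(Entity { label: label.to_string(), text: token.to_string() });
                    open = true;
                }
            }
        } else {
            // "O" (or any other tag) closes the current entity.
            open = false;
        }
    }
    entities
}
```

The labels come from the tag set the model was trained on, which is why this kind of backend is limited to fixed entity types.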
pub fn with_gliner(model_name: &str) -> Result<Self>
Create with GLiNER backend (zero-shot NER).
GLiNER is the recommended backend for best accuracy on named entities. It supports zero-shot NER (any entity type without retraining).
§Arguments
model_name - HuggingFace model identifier (e.g., “onnx-community/gliner_small-v2.1”)
pub fn with_candle(_model_name: &str) -> Result<Self>
Stub used when the candle feature is disabled.
pub fn extract(&self, text: &str, language: Option<&str>) -> Result<Vec<Entity>>
Extract entities with automatic fallback.
Tries the primary ML backend first and falls back to patterns if it fails.
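The fallback behaviour described here can be sketched with plain function pointers. Everything in this block is an assumption made for illustration (`Entity`, `Backend`, `failing_model`, `pattern_backend`, `extract_with_fallback`); the crate's real types and error handling may differ.

```rust
/// Hypothetical, simplified entity.
#[derive(Debug, Clone, PartialEq)]
struct Entity {
    text: String,
    label: &'static str,
}

/// Hypothetical backend signature: text in, entities or an error message out.
type Backend = fn(&str) -> Result<Vec<Entity>, String>;

/// Stand-in for an ML backend that fails at inference time.
fn failing_model(_text: &str) -> Result<Vec<Entity>, String> {
    Err("model unavailable".to_string())
}

/// Stand-in for a pattern backend: labels any token containing '@' as EMAIL.
fn pattern_backend(text: &str) -> Result<Vec<Entity>, String> {
    Ok(text
        .split_whitespace()
        .filter(|tok| tok.contains('@'))
        .map(|tok| Entity { text: tok.to_string(), label: "EMAIL" })
        .collect())
}

/// Uses the primary backend when present and successful; otherwise
/// returns whatever the fallback produces.
fn extract_with_fallback(
    primary: Option<Backend>,
    fallback: Backend,
    text: &str,
) -> Result<Vec<Entity>, String> {
    if let Some(model) = primary {
        if let Ok(entities) = model(text) {
            return Ok(entities);
        }
    }
    fallback(text)
}
```

In this sketch the primary error is silently discarded. Whether to log it or attach it to the result is a design choice the sketch leaves open.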
pub fn extract_hybrid(&self, text: &str, language: Option<&str>) -> Result<Vec<Entity>>
Extract entities using hybrid strategy.
Combines ML model (for semantic entities) with patterns (for structured entities):
- ML: Person, Organization, Location (context-dependent)
- Patterns: Date, Money, Percent, Email, URL, Phone (format-based)
This gets the best of both worlds:
- High F1 on ambiguous entities (via ML)
- 100% precision on pattern entities (via patterns)
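One way to combine the two result sets is to merge spans and resolve overlaps. The sketch below assumes pattern matches win whenever they overlap an ML span. That conflict policy, the `Span` struct, and `merge_hybrid` are hypothetical and are not confirmed by the source.

```rust
/// Hypothetical entity span; `end` is exclusive.
#[derive(Debug, Clone, PartialEq)]
struct Span {
    start: usize,
    end: usize,
    label: &'static str,
}

/// True when the two half-open ranges share at least one position.
fn overlaps(a: &Span, b: &Span) -> bool {
    a.start < b.end && b.start < a.end
}

/// Keeps every pattern span, adds ML spans that do not overlap any
/// pattern span, and returns the result ordered by start offset.
fn merge_hybrid(ml: Vec<Span>, patterns: Vec<Span>) -> Vec<Span> {
    let mut merged = patterns;
    let kept_ml: Vec<Span> = ml
        .into_iter()
        .filter(|m| !merged.iter().any(|p| overlaps(m, p)))
        .collect();
    merged.extend(kept_ml);
    merged.sort_by_key(|s| s.start);
    merged
}
```

Preferring patterns on overlap reflects the idea that format-based matches are the more precise signal for structured types, while the ML model contributes the context-dependent types that patterns cannot see.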
pub fn backend_type(&self) -> BackendType
Get the active backend type.
pub fn active_backend_name(&self) -> &'static str
Get the name of the active backend.
pub fn has_ml_backend(&self) -> bool
Check if an ML backend is available.
pub fn supports_zero_shot(&self) -> bool
Check if this extractor supports zero-shot NER.
Trait Implementations§
impl Model for NERExtractor
fn extract_entities(&self, text: &str, language: Option<&str>) -> Result<Vec<Entity>>
fn supported_types(&self) -> Vec<EntityType>
fn is_available(&self) -> bool
fn description(&self) -> &'static str
fn capabilities(&self) -> ModelCapabilities
Auto Trait Implementations§
impl Freeze for NERExtractor
impl !RefUnwindSafe for NERExtractor
impl Send for NERExtractor
impl Sync for NERExtractor
impl Unpin for NERExtractor
impl UnsafeUnpin for NERExtractor
impl !UnwindSafe for NERExtractor
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.