#[non_exhaustive]
pub enum ExtractionMethod {
    Pattern,
    Neural,
    Lexicon,
    SoftLexicon,
    GatedEnsemble,
    Consensus,
    Heuristic,
    Unknown,
    Rule,
    ML,
    Ensemble,
}
Extraction method used to identify an entity.
§Research Context
Different extraction methods have different strengths:
| Method | Precision | Recall | Generalization | Use Case |
|---|---|---|---|---|
| Pattern | Very High | Low | N/A (format-based) | Dates, emails, money |
| Neural | High | High | Good | General NER |
| Lexicon | Very High | Low | None | Closed-domain entities |
| SoftLexicon | Medium | High | Good for rare types | Low-resource NER |
| GatedEnsemble | Highest | Highest | Contextual | Short texts, domain shift |
See docs/ for repo-local notes and entry points.
Variants (Non-exhaustive)§
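This enum is marked as non-exhaustive. Additional variants may be added in future releases, so a match on it outside the defining crate must include a wildcard arm.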
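Because the enum is non-exhaustive, downstream code has to be ready for variants it does not know about. The following is a minimal sketch, not the crate's own code: it declares a local stand-in enum with a subset of the variants so the snippet compiles on its own, and the `describe` helper is hypothetical.

```rust
// Local stand-in for illustration only; real code would import the enum from the crate.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExtractionMethod {
    Pattern,
    Neural,
    Lexicon,
    Heuristic,
    Unknown,
}

/// Hypothetical helper: maps a method to a short human-readable label.
pub fn describe(method: ExtractionMethod) -> &'static str {
    match method {
        ExtractionMethod::Pattern => "regex pattern",
        ExtractionMethod::Neural => "neural model",
        ExtractionMethod::Lexicon => "exact lexicon lookup",
        // Wildcard arm: required in downstream crates, and it also covers
        // any variant added in a later release.
        _ => "other",
    }
}
```

Inside the defining crate the compiler does not enforce the wildcard arm, which is why the stand-in compiles either way; the requirement applies to crates that depend on the enum.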
Pattern
Regex pattern matching (high precision for structured data like dates, money). Does not generalize: it detects only format-based entities.
Neural
Neural model inference (BERT, GLiNER, etc.). The recommended default for general NER. Generalizes to unseen entities.
Lexicon
Exact lexicon/gazetteer lookup (deprecated approach). High precision on known entities, zero recall on novel entities. Only use for closed domains (stock tickers, medical codes).
SoftLexicon
Embedding-based soft lexicon matching. Useful for low-resource languages and rare entity types. See: Rijhwani et al. (2020) “Soft Gazetteers for Low-Resource NER”
GatedEnsemble
Gated ensemble: neural + lexicon with learned weighting. Model learns when to trust lexicon vs. context. See: Meng et al. (2021) “GEMNET: Effective Gated Gazetteer Representations”
Consensus
Multiple methods agreed on this entity (high confidence).
Heuristic
Heuristic-based extraction (capitalization, word shape, context). Used by heuristic backends that don’t use neural models.
Unknown
Unknown or unspecified extraction method.
Rule
Legacy rule-based extraction (for backward compatibility).
ML
Legacy alias for Neural (for backward compatibility).
Ensemble
Legacy alias for Consensus (for backward compatibility).
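Code that groups or compares results by method may want to fold the legacy aliases into the variants they stand for. A minimal sketch under stated assumptions: the enum below is a local stand-in mirroring the variants listed above, and `canonical` is a hypothetical helper that is not part of the documented API.

```rust
// Local stand-in mirroring the documented variants; import the real enum in practice.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExtractionMethod {
    Pattern,
    Neural,
    Lexicon,
    SoftLexicon,
    GatedEnsemble,
    Consensus,
    Heuristic,
    Unknown,
    Rule,
    ML,
    Ensemble,
}

/// Hypothetical helper: replaces a legacy alias with the variant it aliases.
pub fn canonical(method: ExtractionMethod) -> ExtractionMethod {
    match method {
        ExtractionMethod::ML => ExtractionMethod::Neural,
        ExtractionMethod::Ensemble => ExtractionMethod::Consensus,
        // `Rule` is documented as legacy but not as an alias of another
        // variant, so it is returned unchanged along with everything else.
        other => other,
    }
}
```

Leaving `Rule` untouched is a deliberate choice here: the documentation does not say which current variant, if any, it corresponds to.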
Implementations§
impl ExtractionMethod
pub const fn is_calibrated(&self) -> bool
Returns true if this extraction method produces confidence scores that are intended to be read as probabilities, which makes them suitable for calibration analysis (ECE, Brier score, etc.).
§Calibrated Methods
- Neural: Softmax outputs are intended to be probabilistic (though may need temperature scaling for true calibration)
- GatedEnsemble: Produces learned probability estimates
- SoftLexicon: Embedding similarity is pseudo-probabilistic
§Uncalibrated Methods
- Pattern: Binary (match/no-match); confidence is typically hardcoded
- Heuristic: Arbitrary scores from hand-crafted rules
- Lexicon: Binary exact match
- Consensus: Agreement count, not a probability
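The split above matters when computing calibration metrics: scores from uncalibrated methods should be left out. The sketch below computes a Brier score over calibrated predictions only. It is an illustration under assumptions: the enum and the free function `is_calibrated` are local stand-ins that mirror the two lists above, and the `Prediction` struct and `brier_score` function are hypothetical.

```rust
// Local stand-in covering the variants named in the two lists above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExtractionMethod {
    Pattern,
    Neural,
    Lexicon,
    SoftLexicon,
    GatedEnsemble,
    Consensus,
    Heuristic,
}

/// Stand-in mirroring the documented calibrated/uncalibrated lists.
pub const fn is_calibrated(method: ExtractionMethod) -> bool {
    matches!(
        method,
        ExtractionMethod::Neural
            | ExtractionMethod::GatedEnsemble
            | ExtractionMethod::SoftLexicon
    )
}

/// Hypothetical record: one prediction, its confidence, and whether it was correct.
pub struct Prediction {
    pub method: ExtractionMethod,
    pub confidence: f64,
    pub correct: bool,
}

/// Brier score: mean squared difference between confidence and the 0/1 outcome,
/// taken over predictions from calibrated methods only.
/// Returns `None` when no prediction qualifies.
pub fn brier_score(predictions: &[Prediction]) -> Option<f64> {
    let mut sum = 0.0;
    let mut count = 0usize;
    for p in predictions.iter().filter(|p| is_calibrated(p.method)) {
        let outcome = if p.correct { 1.0 } else { 0.0 };
        sum += (p.confidence - outcome).powi(2);
        count += 1;
    }
    if count == 0 {
        None
    } else {
        Some(sum / count as f64)
    }
}
```

Returning `None` for an empty selection avoids reporting a misleading score of zero when every prediction came from an uncalibrated method.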
§Example
use anno_core::ExtractionMethod;
assert!(ExtractionMethod::Neural.is_calibrated());
assert!(!ExtractionMethod::Pattern.is_calibrated());
assert!(!ExtractionMethod::Heuristic.is_calibrated());
pub const fn confidence_interpretation(&self) -> &'static str
Returns the confidence interpretation for this extraction method.
This helps users understand what the confidence score means:
- "probability": Score approximates P(correct)
- "heuristic_score": Score is a non-probabilistic quality measure
- "binary": Score is 0 or 1 (or a fixed value for matches)
Trait Implementations§
impl Clone for ExtractionMethod
fn clone(&self) -> ExtractionMethod
1.0.0 · fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for ExtractionMethod
impl Default for ExtractionMethod
fn default() -> ExtractionMethod
impl<'de> Deserialize<'de> for ExtractionMethod
fn deserialize<__D>(
    __deserializer: __D,
) -> Result<ExtractionMethod, <__D as Deserializer<'de>>::Error>
where
    __D: Deserializer<'de>,
impl Display for ExtractionMethod
impl Hash for ExtractionMethod
impl PartialEq for ExtractionMethod
impl Serialize for ExtractionMethod
fn serialize<__S>(
    &self,
    __serializer: __S,
) -> Result<<__S as Serializer>::Ok, <__S as Serializer>::Error>
where
    __S: Serializer,
impl Copy for ExtractionMethod
impl Eq for ExtractionMethod
impl StructuralPartialEq for ExtractionMethod
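Since the enum implements Copy, Eq, and Hash, it can serve directly as a map key, for example to tally how many entities each method produced. A minimal sketch, assuming a local stand-in enum with the same derives (a subset of variants) and a hypothetical `count_by_method` helper:

```rust
use std::collections::HashMap;

// Local stand-in with the traits listed above; import the real enum in practice.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ExtractionMethod {
    Pattern,
    Neural,
    Lexicon,
    Heuristic,
}

/// Hypothetical helper: counts occurrences of each extraction method.
pub fn count_by_method(methods: &[ExtractionMethod]) -> HashMap<ExtractionMethod, usize> {
    let mut counts = HashMap::new();
    // `Copy` lets the loop take each key by value without cloning.
    for &method in methods {
        *counts.entry(method).or_insert(0) += 1;
    }
    counts
}
```

Methods that never occur are simply absent from the map, so callers should use `get` rather than indexing.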
Auto Trait Implementations§
impl Freeze for ExtractionMethod
impl RefUnwindSafe for ExtractionMethod
impl Send for ExtractionMethod
impl Sync for ExtractionMethod
impl Unpin for ExtractionMethod
impl UnsafeUnpin for ExtractionMethod
impl UnwindSafe for ExtractionMethod
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<Q, K> Equivalent<K> for Q
impl<Q, K> Equivalent<K> for Q
fn equivalent(&self, key: &K) -> bool
Compare self to key and return true if they are equal.
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<T> PolicyExt for T
where
    T: ?Sized,
impl<T> ToCompactString for T
where
    T: Display,
fn try_to_compact_string(&self) -> Result<CompactString, ToCompactStringError>
Fallible version of ToCompactString::to_compact_string().
fn to_compact_string(&self) -> CompactString
Converts the given value to a CompactString.
impl<T> ToStringFallible for T
where
    T: Display,
fn try_to_string(&self) -> Result<String, TryReserveError>
ToString::to_string, but without panic on OOM.