anno
Extract named entities, relations, coreference chains, and PII from unstructured text. Fixed entity types (PER/ORG/LOC/MISC) or zero-shot custom labels.
Dual-licensed under MIT or Apache-2.0. MSRV: 1.85.
Quickstart
[dependencies]
anno = "0.3.9"
let entities = extract?;
for e in &entities
// Sophie Wilson [PER] (0,13) 0.95
// ARM [ORG] (27,30) 0.90
# Ok::
Filter results with prelude:
use *;
let people: = entities.of_type.collect;
let confident: = entities.above_confidence.collect;
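The filtering idea can be sketched without the crate at all. The `Entity` struct and the two free functions below are hypothetical stand-ins, not the crate's own types or prelude methods, and treating "above" as `>=` is an assumption of this sketch.

```rust
// Hypothetical stand-in for an extracted entity; not the crate's own type.
#[derive(Debug, Clone, PartialEq)]
struct Entity {
    text: String,
    label: String,
    start: usize, // character offset, inclusive
    end: usize,   // character offset, exclusive
    confidence: f64,
}

/// Entities whose label equals `label`.
fn of_type<'a>(
    entities: &'a [Entity],
    label: &'a str,
) -> impl Iterator<Item = &'a Entity> + 'a {
    entities.iter().filter(move |e| e.label == label)
}

/// Entities whose confidence is at least `threshold` (assumed inclusive).
fn above_confidence(entities: &[Entity], threshold: f64) -> impl Iterator<Item = &Entity> + '_ {
    entities.iter().filter(move |e| e.confidence >= threshold)
}

/// Sample data mirroring the quickstart output comments.
fn sample() -> Vec<Entity> {
    vec![
        Entity { text: "Sophie Wilson".into(), label: "PER".into(), start: 0, end: 13, confidence: 0.95 },
        Entity { text: "ARM".into(), label: "ORG".into(), start: 27, end: 30, confidence: 0.90 },
    ]
}
```

Both helpers return lazy iterators, so a caller can chain them or `collect()` into a `Vec<&Entity>` as needed.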
For backend control, construct a model directly:
use ;
let m = default;
let ents = m.extract_entities?;
# Ok::
StackedNER::default() selects the best available backend at runtime: BERT or NuNER (if onnx enabled and models cached), then GLiNER, falling back to heuristic + pattern extraction. Set ANNO_NO_DOWNLOADS=1 or HF_HUB_OFFLINE=1 to force cached-only behavior.
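As a rough illustration of that priority order, here is a minimal sketch of a fallback chain. The `Backend` enum, the `Availability` struct, and both functions are hypothetical names invented for this example; only the two environment variable names and the ordering come from the paragraph above. How the crate actually probes for cached models is not shown.

```rust
/// Hypothetical backend tiers, in the priority order described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    BertOrNuner,
    Gliner,
    HeuristicAndPattern,
}

/// Hypothetical snapshot of what is usable at runtime.
struct Availability {
    onnx_enabled: bool,
    bert_or_nuner_cached: bool,
    gliner_available: bool,
}

/// True when either variable is set to "1". The lookup is injected so the
/// function can be exercised without touching the real process environment.
fn offline_requested(lookup: impl Fn(&str) -> Option<String>) -> bool {
    ["ANNO_NO_DOWNLOADS", "HF_HUB_OFFLINE"]
        .iter()
        .any(|&name| lookup(name).as_deref() == Some("1"))
}

/// Walk the tiers from most to least capable and return the first usable one.
fn select_backend(a: &Availability) -> Backend {
    if a.onnx_enabled && a.bert_or_nuner_cached {
        Backend::BertOrNuner
    } else if a.gliner_available {
        Backend::Gliner
    } else {
        Backend::HeuristicAndPattern
    }
}
```

In a real program the lookup would be `|k| std::env::var(k).ok()`. The last tier needs no model files, which is why the chain always yields a usable backend.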
Zero-shot custom types via GLiNER:
use GLiNEROnnx;
let m = new?;
let ents = m.extract?;
for e in &ents
// drug: Aspirin
// symptom: headaches
# Ok::
Custom backends
AnyModel wraps a closure into a Model, bypassing the sealed trait when you need to plug in an external NER system:
use ;
let model = new;
let ents = model.extract_entities?;
# Ok::
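The closure-wrapping pattern itself is small enough to sketch in plain Rust. Everything below (`Span`, `Extractor`, `ClosureModel`, `toy_external_ner`) is a hypothetical illustration of the adapter idea, not the crate's `AnyModel` or `Model` definitions.

```rust
/// Hypothetical minimal span type, defined only for this illustration.
#[derive(Debug, Clone, PartialEq)]
struct Span {
    start: usize, // character offset, inclusive
    end: usize,   // character offset, exclusive
    label: String,
    confidence: f64,
}

/// Hypothetical extraction trait standing in for a sealed model trait.
trait Extractor {
    fn extract_entities(&self, text: &str) -> Result<Vec<Span>, String>;
}

/// Adapter that turns any matching closure or function into an `Extractor`.
struct ClosureModel<F> {
    name: String,
    f: F,
}

impl<F> ClosureModel<F>
where
    F: Fn(&str) -> Result<Vec<Span>, String>,
{
    fn new(name: &str, f: F) -> Self {
        Self { name: name.to_string(), f }
    }
}

impl<F> Extractor for ClosureModel<F>
where
    F: Fn(&str) -> Result<Vec<Span>, String>,
{
    fn extract_entities(&self, text: &str) -> Result<Vec<Span>, String> {
        (self.f)(text)
    }
}

/// Toy "external system": tags every occurrence of "ARM" as ORG.
fn toy_external_ner(text: &str) -> Result<Vec<Span>, String> {
    Ok(text
        .match_indices("ARM")
        .map(|(byte_idx, m)| {
            // match_indices yields byte positions; convert to character offsets.
            let start = text[..byte_idx].chars().count();
            Span { start, end: start + m.chars().count(), label: "ORG".to_string(), confidence: 1.0 }
        })
        .collect())
}
```

`ClosureModel::new("toy", toy_external_ner)` then behaves like any other `Extractor`, so code written against the trait does not need to know the result came from an external system.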
What it does
Named entity recognition. Spans (start, end, type, confidence) with character offsets (Unicode scalar values, not bytes). Fixed taxonomies (PER/ORG/LOC/MISC) or caller-defined labels for zero-shot extraction [1, 2].
Coreference resolution. Group mentions into clusters tracking the same referent. Rule-based sieves (SimpleCorefResolver), neural (FCoref, 78.5 F1 on CoNLL-2012 [3]), and mention-ranking (MentionRankingCoref).
Structured patterns. Dates, monetary amounts, emails, URLs, phone numbers via deterministic regex grammars.
Relation extraction. (head, relation, tail) triples via RelationCapable backends (gliner2, tplinker). Other backends produce co-occurrence edges for graph export.
PII detection. Classify NER entities as PII and scan for structured patterns (SSN, credit card, IBAN, email, phone). Redact or pseudonymize in one call:
use ;
let text = "John Smith's SSN is 123-45-6789.";
let m = default;
let ents = m.extract_entities?;
let mut pii_ents: = ents.iter.filter_map.collect;
pii_ents.extend;
let redacted = redact;
// "[REDACTED]'s SSN is [REDACTED]."
# Ok::
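To make the redaction step concrete, the following is a minimal sketch of span-based masking over character offsets. `redact_spans` is a hypothetical helper written for this example, not the crate's `redact` function, and merging overlapping or touching spans into a single mask is a choice made here.

```rust
/// Replace each `[start, end)` character span in `text` with `mask`.
/// Overlapping or touching spans are merged so each region is masked once.
fn redact_spans(text: &str, spans: &[(usize, usize)], mask: &str) -> String {
    // Drop empty spans and sort by start offset.
    let mut sorted: Vec<(usize, usize)> =
        spans.iter().copied().filter(|(s, e)| s < e).collect();
    sorted.sort();

    // Merge spans that overlap or touch.
    let mut merged: Vec<(usize, usize)> = Vec::new();
    for (s, e) in sorted {
        match merged.last_mut() {
            Some(last) if s <= last.1 => last.1 = last.1.max(e),
            _ => merged.push((s, e)),
        }
    }

    let mut out = String::with_capacity(text.len());
    let mut k = 0; // index of the next span that has not fully passed
    for (i, ch) in text.chars().enumerate() {
        // Skip spans that ended at or before this character.
        while k < merged.len() && merged[k].1 <= i {
            k += 1;
        }
        match merged.get(k) {
            Some(&(s, _)) if i == s => out.push_str(mask), // span start: emit mask once
            Some(&(s, _)) if i > s => {}                   // inside span: drop character
            _ => out.push(ch),                             // outside any span: keep
        }
    }
    out
}
```

Because the loop walks `chars()` rather than bytes, the offsets stay valid for text containing multi-byte characters.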
Export. Brat standoff, CoNLL BIO tags, JSONL, N-Triples, JSON-LD, and graph CSV via pure functions in anno::export.
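For readers unfamiliar with BIO tagging, here is a minimal sketch of how character spans map to per-token tags. `tokenize` and `to_bio` are hypothetical helpers, not functions from `anno::export`; the whitespace tokenizer is deliberately naive (punctuation stays attached to words) and entities are assumed not to overlap.

```rust
/// Whitespace tokens with their `[start, end)` character offsets.
fn tokenize(text: &str) -> Vec<(usize, usize, String)> {
    let mut tokens = Vec::new();
    let mut current = String::new();
    let mut start = 0;
    for (i, ch) in text.chars().enumerate() {
        if ch.is_whitespace() {
            if !current.is_empty() {
                tokens.push((start, i, std::mem::take(&mut current)));
            }
        } else {
            if current.is_empty() {
                start = i;
            }
            current.push(ch);
        }
    }
    if !current.is_empty() {
        let end = start + current.chars().count();
        tokens.push((start, end, current));
    }
    tokens
}

/// One "token<TAB>tag" line per token. `entities` holds
/// (start, end, label) character spans, assumed non-overlapping.
fn to_bio(text: &str, entities: &[(usize, usize, &str)]) -> Vec<String> {
    tokenize(text)
        .into_iter()
        .map(|(ts, te, tok)| {
            // A token belongs to an entity if their character ranges overlap.
            let hit = entities.iter().find(|(es, ee, _)| ts < *ee && *es < te);
            match hit {
                // The token containing the entity's first character opens it.
                Some((es, _, label)) if ts <= *es => format!("{tok}\tB-{label}"),
                // Any later overlapping token continues it.
                Some((_, _, label)) => format!("{tok}\tI-{label}"),
                None => format!("{tok}\tO"),
            }
        })
        .collect()
}
```

With the spans `(0, 13, "PER")` and `(27, 30, "ORG")` on the sentence "Sophie Wilson designed the ARM processor.", the two name tokens get `B-PER` and `I-PER`, `ARM` gets `B-ORG`, and everything else is `O`.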
Backends
| Backend | Feature | Zero-shot | Status | Reference |
|---|---|---|---|---|
| `stacked` (default) | -- | -- | stable | -- |
| `gliner` | `onnx` | Yes | stable | Zaratiana et al. [5] |
| `gliner2` | `onnx` | Yes | beta | [11] |
| `nuner` | `onnx` | Yes | stable | Bogdanov et al. [6] |
| `bert_onnx` | `onnx` | No | beta | Devlin et al. [8] |
| `w2ner` | `onnx` | No | beta | Li et al. [7] |
| `tplinker` | `onnx` | No | beta | Wang et al. [10] |
| `glirel` | `onnx` | Yes | beta | -- |
| `gliner_poly` | `onnx` | Yes | beta | -- |
| `gliner_candle` | `candle` | Yes | beta | -- |
| `candle_ner` | `candle` | No | beta | -- |
| `pattern` | -- | N/A | stable | -- |
| `heuristic` | -- | No | stable | -- |
| `crf` | -- | No | stable | Lafferty et al. [9] |
| `hmm` | -- | No | stable | Rabiner [12] |
| `ensemble` | -- | No | beta | -- |
| `bilstm_crf` | -- | No | beta | -- |
| `universal_ner` | `llm` | Yes | beta | -- |
See BACKENDS.md for details, default models, and WIP backends.
ML backends are feature-gated (onnx or candle). Weights download from HuggingFace on first use.
Feature flags
| Feature | Default | Description |
|---|---|---|
| `onnx` | Yes | ONNX Runtime backends via ort |
| `candle` | No | Pure-Rust backends (no C++ runtime) |
| `metal` | No | Metal GPU acceleration (enables candle) |
| `cuda` | No | CUDA GPU acceleration (enables candle) |
| `analysis` | No | Coref metrics, cluster encoders |
| `schema` | No | JSON Schema for output types |
| `llm` | No | LLM-based extraction (OpenRouter, Anthropic, Groq, Gemini, Ollama) |
| `production` | No | parking_lot locks + tracing instrumentation |
| `bundled-crf-weights` | No | Embed trained CRF weights in binary |
| `bundled-hmm-params` | No | Embed HMM parameters in binary |
CLI
# PER:1 "Lynn Conway"
# ORG:2 "IBM" "Xerox PARC"
# LOC:1 "California"
# drug:1 "Aspirin" symptom:2 "headaches" "fever"
# Coreference: "Sophie Wilson" -> "She"
JSON output with --format json. Batch processing with anno batch. Graph export (N-Triples, JSON-LD, CSV) with anno export --features graph.
Coreference
| Backend | Type | Quality | Speed |
|---|---|---|---|
| `SimpleCorefResolver` | Rule-based (9 sieves) | Low | Fast |
| `FCoref` | Neural (DistilRoBERTa) | 78.5 F1 [3] | Medium |
| `MentionRankingCoref` | Mention-ranking | Medium | Medium |
FCoref requires a one-time model export: uv run scripts/export_fcoref.py (from a repo clone).
RAG preprocessing (rag::resolve_for_rag(), rag::preprocess()): rewrites pronouns for self-contained chunks after splitting. Always available (no feature flag required).
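The pronoun-rewriting idea can be sketched as follows. This is not the implementation behind `rag::resolve_for_rag()`; `resolve_pronouns`, `char_slice`, the `Mention` alias, and the short `PRONOUNS` list are all hypothetical, and treating the first mention in each cluster as the canonical name is an assumption of this sketch.

```rust
/// A mention as a `[start, end)` character span.
type Mention = (usize, usize);

/// Illustrative subject/object pronouns only; possessives are left untouched.
const PRONOUNS: [&str; 6] = ["he", "she", "it", "they", "him", "them"];

/// Text covered by a character span.
fn char_slice(text: &str, (start, end): Mention) -> String {
    text.chars().skip(start).take(end.saturating_sub(start)).collect()
}

/// Replace pronoun mentions with the first mention of their cluster,
/// so a chunk cut after the antecedent still names the referent.
fn resolve_pronouns(text: &str, clusters: &[Vec<Mention>]) -> String {
    // Collect (span, replacement) pairs for every pronoun mention.
    let mut edits: Vec<(Mention, String)> = Vec::new();
    for cluster in clusters {
        let Some(&head) = cluster.first() else { continue };
        let canonical = char_slice(text, head);
        for &m in &cluster[1..] {
            let surface = char_slice(text, m).to_lowercase();
            if PRONOUNS.contains(&surface.as_str()) {
                edits.push((m, canonical.clone()));
            }
        }
    }
    edits.sort_by_key(|(m, _)| m.0);

    let mut out = String::new();
    let mut pos = 0; // next character offset not yet copied
    for ((start, end), replacement) in edits {
        if start < pos {
            continue; // skip edits that overlap an earlier one
        }
        out.extend(text.chars().skip(pos).take(start - pos));
        out.push_str(&replacement);
        pos = end;
    }
    out.extend(text.chars().skip(pos));
    out
}
```

Given the cluster `[(0, 13), (42, 45)]` over "Sophie Wilson designed the ARM processor. She wrote the instruction set.", the second sentence becomes self-contained because "She" is replaced by "Sophie Wilson".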
Scope
Inference-time extraction. Training pipelines are out of scope -- use upstream frameworks and export ONNX weights.
Troubleshooting
- ONNX linking errors: use `default-features = false` for builds without C++, or check `ORT_DYLIB_PATH`.
- Model downloads: set `HF_HUB_OFFLINE=1` for cached-only mode behind firewalls.
- Feature errors: most backends are gated behind `onnx` or `candle`.
- Offset mismatches: all spans use character offsets, not byte offsets. See CONTRACT.md.
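Offset mismatches usually show up when a character span is used to index a `&str` directly, since Rust string slicing works in bytes. A minimal sketch of the conversion, using only the standard library (`char_range_to_bytes` and `slice_chars` are hypothetical helper names, not part of the crate):

```rust
/// Convert a `[start, end)` range of character offsets (Unicode scalar
/// values) into the equivalent byte range, or `None` if out of bounds.
fn char_range_to_bytes(text: &str, start: usize, end: usize) -> Option<(usize, usize)> {
    if start > end {
        return None;
    }
    // Byte index of every character boundary, including the end of the string.
    let mut boundaries = text
        .char_indices()
        .map(|(byte_idx, _)| byte_idx)
        .chain(std::iter::once(text.len()));
    let byte_start = boundaries.nth(start)?;
    let byte_end = if end == start {
        byte_start
    } else {
        // `nth` already consumed `start + 1` boundaries, so step the rest of the way.
        boundaries.nth(end - start - 1)?
    };
    Some((byte_start, byte_end))
}

/// Slice `text` by character offsets instead of byte offsets.
fn slice_chars(text: &str, start: usize, end: usize) -> Option<&str> {
    let (b0, b1) = char_range_to_bytes(text, start, end)?;
    text.get(b0..b1)
}
```

For ASCII-only input the two offset systems coincide, which is why the mismatch tends to appear only once text contains accented or non-Latin characters.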
Documentation
- QUICKSTART -- getting started
- CONTRACT -- offset semantics, scope
- BACKENDS -- backend details, feature flags
- ARCHITECTURE -- crate layout
- REFERENCES -- full bibliography
- API docs
References
[1] Grishman & Sundheim, COLING 1996. [2] Tjong Kim Sang & De Meulder, CoNLL 2003. [3] Otmazgin et al., AACL 2022 (F-COREF). [4] Jurafsky & Martin, SLP3 2024. [5] Zaratiana et al., NAACL 2024 (GLiNER). [6] Bogdanov et al., 2024 (NuNER). [7] Li et al., AAAI 2022 (W2NER). [8] Devlin et al., NAACL 2019 (BERT). [9] Lafferty et al., ICML 2001 (CRF). [10] Wang et al., COLING 2020 (TPLinker). [11] Zaratiana et al., 2025 (GLiNER2). [12] Rabiner, Proc. IEEE 1989 (HMM).
Full list: docs/REFERENCES.md. Citeable via CITATION.cff.
License
Dual-licensed under MIT or Apache-2.0.