pub struct Identity {
    pub id: IdentityId,
    pub canonical_name: String,
    pub entity_type: Option<TypeLabel>,
    pub kb_id: Option<String>,
    pub kb_name: Option<String>,
    pub description: Option<String>,
    pub embedding: Option<Vec<f32>>,
    pub aliases: Vec<String>,
    pub confidence: f32,
    pub source: Option<IdentitySource>,
}
A global identity: a real-world entity linked to a knowledge base.
§The Modal Gap
There’s a fundamental representational gap between:
- Text mentions: Contextual, variable surface forms (“Marie Curie”, “she”, “the scientist”)
- KB entities: Canonical, static representations (Q7186 in Wikidata)
Bridging this gap requires:
- Learning aligned embeddings (text encoder ↔ KB encoder)
- Type consistency constraints
- Cross-encoder re-ranking for hard cases
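The first two requirements above can be sketched as a small candidate-ranking step: filter KB candidates by type, then rank the survivors by cosine similarity between a mention embedding and each entity embedding. This is only an illustration under stated assumptions. `Candidate`, `best_candidate`, `demo_candidates`, the string type labels, the placeholder IDs other than Q7186, and the toy two-dimensional vectors are all hypothetical and not part of this crate, and cross-encoder re-ranking is not shown.

```rust
/// Cosine similarity between two vectors.
/// Returns `None` if the lengths differ, a vector is empty, or a norm is zero.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Hypothetical KB candidate (not a type from this crate).
struct Candidate {
    kb_id: String,
    entity_type: String,
    embedding: Vec<f32>,
}

/// Keep only candidates whose type matches the mention (type consistency),
/// then return the KB id and score of the most similar one.
fn best_candidate<'a>(
    mention_embedding: &[f32],
    mention_type: &str,
    candidates: &'a [Candidate],
) -> Option<(&'a str, f32)> {
    candidates
        .iter()
        .filter(|c| c.entity_type == mention_type)
        .filter_map(|c| {
            cosine_similarity(mention_embedding, &c.embedding)
                .map(|score| (c.kb_id.as_str(), score))
        })
        .max_by(|a, b| a.1.total_cmp(&b.1))
}

/// Toy candidates; only "Q7186" comes from the text above, the rest are placeholders.
fn demo_candidates() -> Vec<Candidate> {
    vec![
        Candidate { kb_id: "Q7186".into(), entity_type: "person".into(), embedding: vec![1.0, 0.0] },
        Candidate { kb_id: "Q_OTHER_PERSON".into(), entity_type: "person".into(), embedding: vec![0.0, 1.0] },
        Candidate { kb_id: "Q_SOME_PLACE".into(), entity_type: "place".into(), embedding: vec![0.9, 0.1] },
    ]
}

fn main() {
    let candidates = demo_candidates();
    // The "place" candidate is the closest vector but is excluded by the type filter.
    if let Some((kb_id, score)) = best_candidate(&[0.9, 0.1], "person", &candidates) {
        println!("best match: {kb_id} (score {score:.3})");
    }
}
```

The type filter runs before similarity scoring, so a candidate of the wrong type cannot win even if its embedding is the nearest one.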
§Design Philosophy
Identities are the “global truth” that tracks point to. They represent:
- A canonical name and description
- A knowledge base reference (if available)
- An embedding in the entity space (for similarity/clustering)
Identities can exist without KB links (for novel entities not in the KB).
Fields§
id: IdentityId
Unique identifier.
canonical_name: String
Canonical name (the “official” name).
entity_type: Option<TypeLabel>
Entity type/category. Stored as a TypeLabel to support both core and custom (domain) labels.
kb_id: Option<String>
Knowledge base reference (e.g., “Q7186” for Wikidata).
kb_name: Option<String>
Knowledge base name (e.g., “wikidata”, “umls”).
description: Option<String>
Description from the knowledge base.
embedding: Option<Vec<f32>>
Entity embedding in the KB/entity space. This is aligned with the text encoder space for similarity computation.
aliases: Vec<String>
Alias names (other known surface forms).
confidence: f32
Confidence that this identity is correctly resolved.
source: Option<IdentitySource>
Source of identity formation (how it was created).
Implementations§
impl Identity
pub fn new(id: impl Into<IdentityId>, canonical_name: impl Into<String>) -> Self
Create a new identity.
pub fn from_kb(
    id: impl Into<IdentityId>,
    canonical_name: impl Into<String>,
    kb_name: impl Into<String>,
    kb_id: impl Into<String>,
) -> Self
Create an identity from a knowledge base entry.
pub const fn id(&self) -> IdentityId
Get the identity’s unique identifier.
pub fn canonical_name(&self) -> &str
Get the canonical name.
pub const fn confidence(&self) -> f32
Get the confidence score.
pub fn set_confidence(&mut self, confidence: f32)
Set the confidence score.
pub fn source(&self) -> Option<&IdentitySource>
Get the identity source.
pub fn with_embedding(self, embedding: Vec<f32>) -> Self
Set the embedding.
pub fn with_type(self, entity_type: impl Into<String>) -> Self
Set the entity type from a string.
For new code, prefer Self::with_type_label, which provides type safety.
pub fn with_type_label(self, label: TypeLabel) -> Self
Set the entity type using a type-safe label.
This is the preferred method for new code as it provides type safety
and integrates with the core EntityType taxonomy.
pub fn type_label(&self) -> Option<TypeLabel>
Get the entity type as a type-safe label.
This converts the internal string representation to a TypeLabel,
attempting to parse it as a core EntityType first.
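The "core first, custom otherwise" parsing order described here can be sketched roughly as follows. The `EntityType` and `TypeLabel` definitions in this block are local stand-ins written for illustration: the variants, the case-insensitive matching, and the `parse_type_label` helper are assumptions, not the crate's actual definitions.

```rust
use std::str::FromStr;

/// Stand-in for a core taxonomy; the variants are assumed for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
enum EntityType {
    Person,
    Organization,
    Location,
}

impl FromStr for EntityType {
    type Err = ();

    /// Case-insensitive match against the known core names (an assumption).
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "person" => Ok(EntityType::Person),
            "organization" => Ok(EntityType::Organization),
            "location" => Ok(EntityType::Location),
            _ => Err(()),
        }
    }
}

/// Stand-in label type: either a core type or a free-form domain label.
#[derive(Debug, Clone, PartialEq, Eq)]
enum TypeLabel {
    Core(EntityType),
    Custom(String),
}

/// Hypothetical helper: try the core taxonomy first, fall back to a custom label.
fn parse_type_label(raw: &str) -> TypeLabel {
    match raw.parse::<EntityType>() {
        Ok(core) => TypeLabel::Core(core),
        Err(()) => TypeLabel::Custom(raw.to_string()),
    }
}

fn main() {
    println!("{:?}", parse_type_label("person")); // prints: Core(Person)
    println!("{:?}", parse_type_label("gene")); // prints: Custom("gene")
}
```

With this shape, unknown domain labels are never rejected; they simply land in the custom variant.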
pub fn with_description(self, description: impl Into<String>) -> Self
Set the description.
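The constructors and `with_*` builders listed above are designed to be chained. Because the crate path and the definition of `IdentityId` are not shown on this page, the sketch below uses a local stand-in `Identity` that mirrors a subset of the documented signatures. The `u64` newtype for `IdentityId`, the default confidence of `1.0`, the example description and embedding values, and the `marie_curie` helper are all assumptions made for illustration.

```rust
/// Stand-in for the crate's `IdentityId`; the `u64` newtype is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct IdentityId(u64);

impl From<u64> for IdentityId {
    fn from(raw: u64) -> Self {
        IdentityId(raw)
    }
}

/// Local stand-in with a subset of the documented fields.
#[derive(Debug, Clone)]
struct Identity {
    id: IdentityId,
    canonical_name: String,
    kb_id: Option<String>,
    kb_name: Option<String>,
    description: Option<String>,
    embedding: Option<Vec<f32>>,
    confidence: f32,
}

impl Identity {
    /// Identity without a KB link (e.g. a novel entity).
    fn new(id: impl Into<IdentityId>, canonical_name: impl Into<String>) -> Self {
        Self {
            id: id.into(),
            canonical_name: canonical_name.into(),
            kb_id: None,
            kb_name: None,
            description: None,
            embedding: None,
            confidence: 1.0, // assumed default; the real default is not documented here
        }
    }

    /// Identity linked to a knowledge base entry.
    fn from_kb(
        id: impl Into<IdentityId>,
        canonical_name: impl Into<String>,
        kb_name: impl Into<String>,
        kb_id: impl Into<String>,
    ) -> Self {
        let mut identity = Self::new(id, canonical_name);
        identity.kb_name = Some(kb_name.into());
        identity.kb_id = Some(kb_id.into());
        identity
    }

    /// Builder-style setter: consumes `self` and returns the updated value.
    fn with_embedding(mut self, embedding: Vec<f32>) -> Self {
        self.embedding = Some(embedding);
        self
    }

    /// Builder-style setter for the description.
    fn with_description(mut self, description: impl Into<String>) -> Self {
        self.description = Some(description.into());
        self
    }
}

/// Hypothetical helper that builds a KB-linked identity by chaining builders.
fn marie_curie() -> Identity {
    Identity::from_kb(1u64, "Marie Curie", "wikidata", "Q7186")
        .with_description("Physicist and chemist")
        .with_embedding(vec![0.1, 0.2, 0.3]) // toy vector, not a real embedding
}

fn main() {
    let linked = marie_curie();
    // A novel entity with no KB entry keeps `kb_id` and `kb_name` as `None`.
    let novel = Identity::new(2u64, "Unlisted Startup Inc.");
    println!("{linked:?}");
    println!("{novel:?}");
}
```

Taking `self` by value in the `with_*` methods is what allows the chained style; `set_confidence`, by contrast, takes `&mut self` and mutates an existing identity in place.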