An Annotation represents a particular instance of annotation and is the central
concept of the model. Annotations can be considered the primary nodes of the graph model.
The instance of annotation is strictly decoupled from the data, or key/value pair, of the
annotation (AnnotationData). After all, multiple instances can be annotated
with the same label (multiple annotations may share the same annotation data).
Moreover, an Annotation can have multiple annotation data associated with it.
The result is that multiple annotations with the exact same content require less storage
space, and searching and indexing are facilitated.
AnnotationData holds the actual content of an annotation: a key/value pair (the
term feature is regularly seen for this in certain annotation paradigms).
AnnotationData is deliberately decoupled from the actual Annotation
instances so multiple annotation instances can point to the same content
without causing any overhead in storage. Moreover, it facilitates indexing and
searching. The annotation data is part of an AnnotationDataSet, which
effectively defines a certain user-defined vocabulary.
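The decoupling described above can be illustrated with a minimal sketch. All types and method names here (`DataHandle`, `DataSet`, `intern`) are hypothetical simplifications for illustration and are not the library's actual definitions; the point is only that annotations refer to shared key/value content by handle instead of owning a copy.

```rust
// Hypothetical, simplified types; NOT the library's real definitions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DataHandle(usize);

#[derive(Debug, PartialEq)]
pub struct AnnotationData {
    pub key: String,
    pub value: String,
}

#[derive(Debug)]
pub struct Annotation {
    pub id: String,
    // An annotation refers to its content by handle and may carry several.
    pub data: Vec<DataHandle>,
}

#[derive(Default)]
pub struct DataSet {
    data: Vec<AnnotationData>,
}

impl DataSet {
    /// Returns the handle of an existing identical key/value pair,
    /// or stores a new pair. A linear scan keeps the sketch short.
    pub fn intern(&mut self, key: &str, value: &str) -> DataHandle {
        if let Some(pos) = self
            .data
            .iter()
            .position(|d| d.key == key && d.value == value)
        {
            return DataHandle(pos);
        }
        self.data.push(AnnotationData {
            key: key.to_string(),
            value: value.to_string(),
        });
        DataHandle(self.data.len() - 1)
    }

    /// Looks up the content behind a handle.
    pub fn get(&self, handle: DataHandle) -> Option<&AnnotationData> {
        self.data.get(handle.0)
    }

    /// Number of distinct key/value pairs stored.
    pub fn len(&self) -> usize {
        self.data.len()
    }
}
```

Two annotations that are both labelled with the same key and value end up holding the same handle, so the pair is stored once regardless of how many annotations use it.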
This is the builder for AnnotationData. It contains public IDs or handles that will be resolved.
It is usually not instantiated directly but used via the [AnnotationBuilder.with_data()], [AnnotationBuilder.insert_data()] or [AnnotationDataSet.with_data()] methods.
It also does not have its own build() method but is resolved via the aforementioned methods.
An AnnotationDataSet stores the keys (DataKey) and values (AnnotationData, which in turn
encapsulates DataValue) that are used by annotations.
It effectively defines a certain vocabulary, i.e. key/value pairs.
The AnnotationDataSet does not store the crate::annotation::Annotation instances
themselves; those are in the AnnotationStore. The datasets themselves are also held by the
AnnotationStore.
An AnnotationStore is an unordered collection of annotations, resources and
annotation data sets. It can be seen as the root of the graph model and the glue
that holds everything together. It is the entry point for any STAM model.
This holds the configuration. It is not limited to configuring a single part of the model, but unifies all in a single configuration.
The DataKey class defines a vocabulary field; it belongs to a certain AnnotationDataSet.
An AnnotationData in turn makes reference to a DataKey and assigns it a value.
This iterator is produced by [TextResource.find_text_regex()] and searches a text based on regular expressions.
This match structure is returned by the FindRegexIter iterator, which is in turn produced by
[TextResource.find_text_regex()] and searches a text based on regular expressions.
This structure represents a single regular-expression match of the iterator on the text.
A map from public IDs to internal IDs, implemented as a HashMap.
It is used to resolve public IDs to internal ones.
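As a rough illustration of such an ID map, the sketch below wraps a standard `HashMap`. The `Handle` type and the `bind` and `resolve` method names are assumptions made for this example; they are not taken from the library.

```rust
use std::collections::HashMap;

// Hypothetical internal ID type; the real handle types may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Handle(pub u32);

#[derive(Default)]
pub struct IdMap {
    map: HashMap<String, Handle>,
}

impl IdMap {
    /// Registers a public ID for a handle.
    /// Returns false (and changes nothing) if the ID is already taken.
    pub fn bind(&mut self, id: &str, handle: Handle) -> bool {
        if self.map.contains_key(id) {
            return false;
        }
        self.map.insert(id.to_string(), handle);
        true
    }

    /// Resolves a public ID to its internal handle, if known.
    pub fn resolve(&self, id: &str) -> Option<Handle> {
        self.map.get(id).copied()
    }
}
```

Resolving once and then working with the small copyable handle avoids repeated string hashing in inner loops.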
Text selection offset. Specifies begin and end offsets to select a range of a text, via two
Cursor instances. The end-point is non-inclusive.
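A small sketch may help to show what a non-inclusive end means in practice. The `Cursor` variants (`BeginAligned`, `EndAligned`) and the `select` function below are assumptions for illustration only; the sketch resolves both cursors to absolute codepoint positions and then extracts the range.

```rust
// Hypothetical mirror of the concepts; variant and function names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Cursor {
    /// Counted from the start of the text (0-indexed).
    BeginAligned(usize),
    /// Zero or negative, counted back from the end of the text.
    EndAligned(isize),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Offset {
    pub begin: Cursor,
    pub end: Cursor,
}

/// Turns a cursor into an absolute codepoint position, or None if out of bounds.
fn absolute(cursor: Cursor, textlen: usize) -> Option<usize> {
    match cursor {
        Cursor::BeginAligned(n) if n <= textlen => Some(n),
        Cursor::EndAligned(n) if n <= 0 && n.unsigned_abs() <= textlen => {
            Some(textlen - n.unsigned_abs())
        }
        _ => None,
    }
}

/// Selects the text in [begin, end): the end position itself is excluded.
pub fn select(text: &str, offset: Offset) -> Option<String> {
    let textlen = text.chars().count();
    let begin = absolute(offset.begin, textlen)?;
    let end = absolute(offset.end, textlen)?;
    if begin > end {
        return None;
    }
    Some(text.chars().skip(begin).take(end - begin).collect())
}
```

Because the end is exclusive, the length of a selection is simply `end - begin`, and two adjacent selections can share a boundary without overlapping.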
A compiled regular expression for matching Unicode strings.
Match multiple (possibly overlapping) regular expressions in a single scan.
This models relations, or ‘edges’ in graph terminology, between handles. It acts as a reverse index and is used for various purposes.
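One possible shape for such a reverse index is sketched below, with plain `usize` values standing in for handles. The struct layout and the `insert` and `get` methods are assumptions for illustration; the actual implementation may well use a different container.

```rust
use std::collections::BTreeMap;

// Hypothetical sketch: handles are plain usize values here.
#[derive(Default)]
pub struct RelationMap {
    // Maps a source handle to all target handles it is related to.
    edges: BTreeMap<usize, Vec<usize>>,
}

impl RelationMap {
    /// Records an edge from one handle to another.
    pub fn insert(&mut self, from: usize, to: usize) {
        self.edges.entry(from).or_default().push(to);
    }

    /// Returns all targets related to `from` (empty if there are none).
    pub fn get(&self, from: usize) -> &[usize] {
        self.edges.get(&from).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

A typical use would be mapping an annotation data handle to every annotation that references it, so that "which annotations carry this label?" becomes a single lookup instead of a scan.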
Iterator that returns the selector itself, plus all selectors under it (recursively).
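The recursive traversal can be sketched with an explicit stack. The `Selector` enum below is a deliberately reduced, hypothetical stand-in (three variants with invented fields), not the library's real enum; it only serves to show a depth-first iterator that yields the root first and then everything beneath it.

```rust
// Hypothetical, reduced selector type for illustration only.
#[derive(Debug, PartialEq)]
pub enum Selector {
    Resource(String),
    Text { resource: String, begin: usize, end: usize },
    Composite(Vec<Selector>),
}

pub struct SelectorIter<'a> {
    // Selectors still to be visited; the top of the stack is visited next.
    stack: Vec<&'a Selector>,
}

impl<'a> SelectorIter<'a> {
    pub fn new(root: &'a Selector) -> Self {
        Self { stack: vec![root] }
    }
}

impl<'a> Iterator for SelectorIter<'a> {
    type Item = &'a Selector;

    fn next(&mut self) -> Option<Self::Item> {
        let current = self.stack.pop()?;
        if let Selector::Composite(children) = current {
            // Push in reverse so children are yielded in their original order.
            self.stack.extend(children.iter().rev());
        }
        Some(current)
    }
}
```

Using an explicit stack instead of recursion keeps the iterator lazy: callers can stop early without the whole tree having been walked.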
This is the iterator to iterate over a Store; it is created by the iter() method from the
StoreFor<T> trait.
It produces a reference to each item wrapped in a fat pointer (WrappedItem<T>) that also
contains a reference to the store and immediately implements various methods for working
with the type.
Mutable variant of StoreIter<T>, but unlike that one this does not wrap results in a fat pointer but returns them directly, ready for mutation.
This holds the textual resource to be annotated. It holds the full text in memory.
Corresponds to a slice of the text. This only contains minimal
information, i.e. the begin offset and end offset.
This is similar to Offset, but that one uses cursors which may
be relative. TextSelection specifies an offset in more absolute terms.
This iterator is used for iterating over TextSelections in a resource in a sorted fashion
using the so-called position index.
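How a position index can yield text selections in sorted order is sketched below using an ordered map keyed on the begin offset. This is an assumption about one plausible design, with invented names (`PositionIndex`, `insert`, `iter`) and selections reduced to `(begin, end)` pairs; it does not claim to reflect the library's internal layout.

```rust
use std::collections::BTreeMap;

// Hypothetical sketch: a selection is just a (begin, end) pair here.
#[derive(Default)]
pub struct PositionIndex {
    // Ordered by begin offset; each bucket holds selections starting there.
    by_begin: BTreeMap<usize, Vec<(usize, usize)>>,
}

impl PositionIndex {
    /// Adds a selection, keeping each bucket sorted by end offset.
    pub fn insert(&mut self, begin: usize, end: usize) {
        let bucket = self.by_begin.entry(begin).or_default();
        bucket.push((begin, end));
        bucket.sort();
    }

    /// Iterates over all selections in textual order (by begin, then end).
    pub fn iter(&self) -> impl Iterator<Item = (usize, usize)> + '_ {
        self.by_begin.values().flatten().copied()
    }
}
```

Since the map is ordered, iteration needs no sorting step at query time, whatever order the selections were inserted in.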
A TextSelectionSet holds one or more TextSelection items and a reference to the TextResource from which they’re drawn.
All text selections in a set must reference the same resource, which implies they are comparable.
Helper structure that contains a store and a reference to self. Mostly for internal use.
A cursor points to a specific point in a text. It is
used to select offsets. Units are Unicode codepoints (not bytes!)
and are 0-indexed.
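The difference between codepoints and bytes matters in Rust because string slicing works on byte positions. The helper below is a hypothetical utility, not part of the library, that converts a 0-indexed codepoint position into the byte position needed for slicing, using only the standard library.

```rust
/// Converts a 0-indexed codepoint position into a byte position.
/// A position equal to the number of codepoints maps to the end of the text;
/// anything beyond that yields None.
pub fn codepoint_to_byte(text: &str, pos: usize) -> Option<usize> {
    text.char_indices()
        .map(|(byte, _)| byte)
        // One extra position so the (exclusive) end of the text is addressable.
        .chain(std::iter::once(text.len()))
        .nth(pos)
}
```

For a text containing a two-byte character, codepoint positions after that character no longer coincide with byte positions, which is why slicing directly with codepoint offsets would select the wrong text or panic on a character boundary.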
Item offers various ways of referring to a data structure of type T in the core STAM model.
It abstracts over public IDs (both owned and borrowed), handles, and references.
A Selector identifies the target of an annotation and the part of the
target that the annotation applies to. Selectors can be considered the labelled edges of the graph model, tying all nodes together.
There are multiple types of selectors, all captured in this enum.
A SelectorBuilder is a recipe that, when applied, identifies the target of an annotation and the part of the
target that the annotation applies to. Applying it produces a Selector, which you can do via [AnnotationStore.selector].
See Selector; this is a simplified variant that carries only the type, not the target.
This enum groups the different kinds of errors that this STAM library can produce.
The TextSelectionOperator, simply put, allows comparison of two [TextSelection] instances. It allows testing for all kinds of spatial relations (as embodied by this enum) in which two [TextSelection] instances can be.
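To make "spatial relations" concrete, here is a minimal sketch of such comparisons on absolute offsets with a non-inclusive end. The variant names (`Equals`, `Overlaps`, `Embeds`, `Embedded`, `Precedes`, `Succeeds`) and the `test` function are assumptions chosen for this example; the real enum may offer different or additional relations.

```rust
// Hypothetical, simplified types for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TextSelection {
    pub begin: usize,
    pub end: usize, // non-inclusive
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Operator {
    Equals,
    Overlaps,
    Embeds,
    Embedded,
    Precedes,
    Succeeds,
}

/// Tests whether relation `op` holds between `a` and `b`.
pub fn test(a: TextSelection, op: Operator, b: TextSelection) -> bool {
    match op {
        Operator::Equals => a == b,
        // The two ranges share at least one position.
        Operator::Overlaps => a.begin < b.end && b.begin < a.end,
        // `a` fully contains `b`.
        Operator::Embeds => a.begin <= b.begin && b.end <= a.end,
        // `a` is fully contained in `b`.
        Operator::Embedded => b.begin <= a.begin && a.end <= b.end,
        // `a` ends before (or exactly where) `b` begins.
        Operator::Precedes => a.end <= b.begin,
        // `a` begins after (or exactly where) `b` ends.
        Operator::Succeeds => b.end <= a.begin,
    }
}
```

Note that with an exclusive end, two selections that merely touch (one ends where the other begins) precede or succeed each other but do not overlap.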
This is a smart pointer that encapsulates both the item and the store that owns it.
It allows the item to have some more introspection as it knows who its immediate parent is.
It is used for example in serialization.
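The idea of pairing an item with its owning store can be sketched as follows. The `Store` type, its fields, and the `get` and `store` methods are hypothetical and only illustrate the pattern: the wrapper dereferences to the item while still giving access to the parent.

```rust
use std::ops::Deref;

// Hypothetical store type for illustration only.
pub struct Store {
    pub name: String,
    pub items: Vec<String>,
}

/// Pairs a borrowed item with a reference to the store that owns it.
pub struct WrappedItem<'a, T> {
    item: &'a T,
    store: &'a Store,
}

impl<'a, T> Deref for WrappedItem<'a, T> {
    type Target = T;

    // Lets the wrapper be used as if it were the item itself.
    fn deref(&self) -> &T {
        self.item
    }
}

impl<'a, T> WrappedItem<'a, T> {
    /// The store that owns the wrapped item.
    pub fn store(&self) -> &'a Store {
        self.store
    }
}

impl Store {
    /// Returns the item at `index`, wrapped together with this store.
    pub fn get(&self, index: usize) -> Option<WrappedItem<'_, String>> {
        self.items
            .get(index)
            .map(|item| WrappedItem { item, store: self })
    }
}
```

Because the wrapper carries the store reference, methods on it can resolve handles or look up related items without the caller having to pass the store around separately.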