Crate stam

Structs

  • Annotation represents a particular instance of annotation and is the central concept of the model. Annotations can be considered the primary nodes of the graph model. The instance of annotation is strictly decoupled from the data or key/value of the annotation (AnnotationData). After all, multiple instances can be annotated with the same label (multiple annotations may share the same annotation data). Moreover, an Annotation can have multiple annotation data associated with it. As a result, multiple annotations with the exact same content require less storage space, and searching and indexing are facilitated.
  • This is the build recipe for Annotation. It contains public IDs or handles that will be resolved when the actual Annotation is built. The building is done by passing this to AnnotationStore::annotate().
  • AnnotationData holds the actual content of an annotation: a key/value pair (the term "feature" is regularly used for this in certain annotation paradigms). AnnotationData is deliberately decoupled from the actual Annotation instances so multiple annotation instances can point to the same content without causing any overhead in storage. Moreover, it facilitates indexing and searching. The annotation data is part of an AnnotationDataSet, which effectively defines a certain user-defined vocabulary.
  • This is the build recipe for AnnotationData. It contains public IDs or handles that will be resolved. It is usually not instantiated directly but used via the AnnotationBuilder.with_data(), AnnotationBuilder.insert_data() or AnnotationDataSet.with_data() methods.
  • An AnnotationDataSet stores the keys (DataKey) and values (AnnotationData, which in turn encapsulates DataValue) that are used by annotations. It effectively defines a certain vocabulary, i.e. key/value pairs. The AnnotationDataSet does not store the crate::annotation::Annotation instances themselves; those are in the AnnotationStore. The datasets themselves are also held by the AnnotationStore.
  • An AnnotationStore is an unordered collection of annotations, resources and annotation data sets. It can be seen as the root of the graph model and the glue that holds everything together. It is the entry point for any STAM model.
  • This holds the configuration for the AnnotationStore.
  • The DataKey class defines a vocabulary field. It belongs to a certain AnnotationDataSet. An AnnotationData in turn makes reference to a DataKey and assigns it a value.
  • A map from public IDs to internal IDs, implemented as a HashMap. Used to resolve public IDs to internal ones.
  • Text selection offset. Specifies begin and end offsets to select a range of a text, via two Cursor instances. The end-point is non-inclusive.
  • A compiled regular expression for matching Unicode strings.
  • Match multiple (possibly overlapping) regular expressions in a single scan.
  • This models relations, or 'edges' in graph terminology, between handles. It acts as a reverse index and is used for various purposes.
  • Iterator that returns the selector itself, plus all selectors under it (recursively).
  • This is the iterator to iterate over a Store. It is created by the iter() method from the StoreFor<T> trait.
  • This holds the textual resource to be annotated. It holds the full text in memory.
  • Corresponds to a slice of the text. This only contains minimal information, i.e. the begin offset and end offset. This is similar to Offset, but that one uses cursors which may be relative. TextSelection specifies an offset in more absolute terms.
  • This iterator is used for iterating over TextSelections in a resource in a sorted fashion using the so-called position index.
  • This is a smart pointer that encapsulates both the item and the store that owns it. It allows the item to have some more introspection as it knows who its immediate parent is. It is used for example in serialization.
  • Helper structure that contains a store and a reference to self. Mostly for internal use.
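
The decoupling of annotations from their content, and the resolution of public IDs to internal ones, can be illustrated with a minimal self-contained sketch. It does not use this crate: all names here (DataHandle, Data, Note, DataSet, intern, resolve) are hypothetical stand-ins chosen for illustration, and the real types may be organised quite differently.

```rust
use std::collections::HashMap;

/// Hypothetical handle: a lightweight, copyable index into a store.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct DataHandle(usize);

/// Stand-in for annotation content: a key/value pair.
#[derive(Debug, PartialEq)]
struct Data {
    key: String,
    value: String,
}

/// Stand-in for an annotation: it refers to its content by handle only.
#[derive(Debug)]
struct Note {
    data: Vec<DataHandle>,
}

/// Stand-in for a data set: owns the content plus a public-ID map.
#[derive(Default)]
struct DataSet {
    data: Vec<Data>,
    idmap: HashMap<String, DataHandle>,
}

impl DataSet {
    /// Store content under a public ID, or reuse the handle if the ID is known.
    fn intern(&mut self, public_id: &str, key: &str, value: &str) -> DataHandle {
        if let Some(handle) = self.idmap.get(public_id) {
            return *handle;
        }
        let handle = DataHandle(self.data.len());
        self.data.push(Data {
            key: key.to_string(),
            value: value.to_string(),
        });
        self.idmap.insert(public_id.to_string(), handle);
        handle
    }

    /// Resolve a public ID to its internal handle, if it exists.
    fn resolve(&self, public_id: &str) -> Option<DataHandle> {
        self.idmap.get(public_id).copied()
    }
}

fn main() {
    let mut set = DataSet::default();
    let noun = set.intern("pos-noun", "pos", "noun");
    let again = set.intern("pos-noun", "pos", "noun");

    // Two annotation instances share one stored key/value pair.
    let a = Note { data: vec![noun] };
    let b = Note { data: vec![again] };
    assert_eq!(a.data, b.data);
    assert_eq!(set.data.len(), 1);
    println!("{:?} -> {:?}", set.resolve("pos-noun"), set.data[0]);
}
```

Because the annotations hold only handles, adding a thousand more annotations labelled "noun" would grow the list of annotations but not the stored content.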

Enums

  • This is either a public ID or a Handle.
  • A cursor points to a specific point in a text. Used to select offsets. Units are Unicode codepoints (not bytes!) and are 0-indexed.
  • A Selector identifies the target of an annotation and the part of the target that the annotation applies to. Selectors can be considered the labelled edges of the graph model, tying all nodes together. There are multiple types of selectors, all captured in this enum.
  • A SelectorBuilder is a recipe that, when applied, identifies the target of an annotation and the part of the target that the annotation applies to. It produces a Selector, which can be done via AnnotationStore.selector.
  • See Selector; this is a simplified variant that carries only the type, not the target.
  • This enum groups the different kinds of errors that this STAM library can produce.
  • The TextSelectionOperator, simply put, allows comparison of two TextSelection instances. It allows testing for all kinds of spatial relations (as embodied by this enum) in which two TextSelection instances can be.
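
The cursor and offset semantics described above (codepoint units, 0-indexing, non-inclusive end, possibly relative cursors) can be sketched as follows. This is a standalone illustration, not the crate's implementation: SketchCursor, SketchOffset, resolve and select are hypothetical names, and the assumption that a relative cursor counts back from the end of the text is mine.

```rust
/// Stand-in cursor, in Unicode codepoints (assumed alignment modes).
#[derive(Clone, Copy, Debug)]
enum SketchCursor {
    /// Counted from the start of the text, 0-indexed.
    FromBegin(usize),
    /// Counted back from the end of the text (0 means the very end).
    FromEnd(usize),
}

/// Stand-in offset: two cursors, the end being non-inclusive.
#[derive(Clone, Copy, Debug)]
struct SketchOffset {
    begin: SketchCursor,
    end: SketchCursor,
}

/// Turn a cursor into an absolute codepoint position, if it is in range.
fn resolve(cursor: SketchCursor, len: usize) -> Option<usize> {
    match cursor {
        SketchCursor::FromBegin(n) if n <= len => Some(n),
        SketchCursor::FromEnd(n) if n <= len => Some(len - n),
        _ => None,
    }
}

/// Select the text covered by an offset, counting codepoints rather than bytes.
fn select<'a>(text: &'a str, offset: &SketchOffset) -> Option<&'a str> {
    let len = text.chars().count();
    let begin = resolve(offset.begin, len)?;
    let end = resolve(offset.end, len)?;
    if begin > end {
        return None;
    }
    // Map a codepoint position to the byte position needed for slicing.
    let byte_at = |cp: usize| {
        text.char_indices()
            .nth(cp)
            .map(|(byte, _)| byte)
            .unwrap_or(text.len())
    };
    Some(&text[byte_at(begin)..byte_at(end)])
}

fn main() {
    let text = "héllo wörld";
    let first_word = SketchOffset {
        begin: SketchCursor::FromBegin(0),
        end: SketchCursor::FromBegin(5),
    };
    let last_word = SketchOffset {
        begin: SketchCursor::FromEnd(5),
        end: SketchCursor::FromEnd(0),
    };
    assert_eq!(select(text, &first_word), Some("héllo"));
    assert_eq!(select(text, &last_word), Some("wörld"));
}
```

Slicing by byte position directly would give wrong or panicking results here, since "é" and "ö" each occupy two bytes in UTF-8; that is the practical reason for stating the unit explicitly.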

Traits

  • The handle trait is implemented on various handle types. What they have in common is that they refer to the internal ID of a Storable item in a Store by index. Types implementing this are lightweight and do not borrow anything; they can be passed and copied freely. This is a sealed trait, not implementable outside this crate.
  • This trait is implemented by types that can return a Selector to themselves.
  • This trait is implemented on types that provide storage for a certain other generic type (T). It requires the types to also implement GetStore and HasIdMap. It is a sealed trait, not implementable outside this crate.
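
As a rough illustration of how a sealed handle trait and a storage trait generic over T can fit together, consider the sketch below. It is written from scratch using the common Rust sealed-trait pattern; SketchHandle, SketchStoreFor, ItemHandle, Words and the Vec<Option<T>> slot layout are all assumptions made for this example and are not taken from the crate's source.

```rust
// A private module keeps `Sealed` unnameable for downstream code,
// so only this crate can implement traits that require it.
mod sealed {
    pub trait Sealed {}
}

/// Stand-in handle trait: a copyable index that borrows nothing.
pub trait SketchHandle: sealed::Sealed + Copy {
    fn new(index: usize) -> Self;
    fn index(self) -> usize;
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ItemHandle(usize);

impl sealed::Sealed for ItemHandle {}

impl SketchHandle for ItemHandle {
    fn new(index: usize) -> Self {
        ItemHandle(index)
    }
    fn index(self) -> usize {
        self.0
    }
}

/// Stand-in storage trait for items of type `T` (assumed slot layout).
pub trait SketchStoreFor<T> {
    type Handle: SketchHandle;

    fn items(&self) -> &Vec<Option<T>>;
    fn items_mut(&mut self) -> &mut Vec<Option<T>>;

    /// Append an item and return the handle that refers to it.
    fn insert(&mut self, item: T) -> Self::Handle {
        let handle = <Self::Handle as SketchHandle>::new(self.items().len());
        self.items_mut().push(Some(item));
        handle
    }

    /// Look an item up by handle; empty or missing slots yield `None`.
    fn get(&self, handle: Self::Handle) -> Option<&T> {
        self.items()
            .get(handle.index())
            .and_then(|slot| slot.as_ref())
    }
}

/// A hypothetical owner of a store field.
pub struct Words {
    slots: Vec<Option<String>>,
}

impl SketchStoreFor<String> for Words {
    type Handle = ItemHandle;

    fn items(&self) -> &Vec<Option<String>> {
        &self.slots
    }
    fn items_mut(&mut self) -> &mut Vec<Option<String>> {
        &mut self.slots
    }
}

fn main() {
    let mut store = Words { slots: Vec::new() };
    let handle = store.insert("hello".to_string());
    let copy = handle; // handles are Copy, so `handle` stays usable
    assert_eq!(store.get(handle).map(String::as_str), Some("hello"));
    assert_eq!(copy.index(), 0);
}
```

The design choice worth noting is that the handle carries no reference to the store, which is what lets it be passed and copied freely; the cost is that a handle is only meaningful together with the store that issued it.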

Type Definitions

  • Type for Store elements. The struct that owns a field of this type should implement the trait StoreFor.