Struct stam::TextResource

pub struct TextResource { /* private fields */ }

This holds the textual resource to be annotated. It holds the full text in memory.

The text SHOULD be in Unicode Normalization Form C (NFC) but MAY be in another Unicode normalization form.

Implementations§

source§

impl TextResource

source

pub fn new(id: String, config: Config) -> Self

Instantiates a new, completely empty TextResource.

source

pub fn builder() -> TextResourceBuilder

source

pub fn from_file(filename: &str, config: Config) -> Result<Self, StamError>

Creates a new TextResource from a file. The text will be loaded into memory entirely.

source

pub fn with_file( self, filename: &str, config: Config ) -> Result<Self, StamError>

Loads a text for the TextResource from file (STAM JSON or plain text). The text will be loaded into memory entirely. The use of [Self.from_file()] is preferred instead. This method can be dangerous if it modifies any existing text of a resource.

source

pub fn with_filename(self, filename: &str) -> Self

Sets the filename for writing and forces a write to it when the underlying store is serialized. CAUTION: This method does not load the file, so it will overwrite any existing file!

source

pub fn with_string(self, text: String) -> Self

Sets the text of the TextResource from a string, kept in memory entirely. The use of [Self.from_string()] is preferred instead. This method can be dangerous if it modifies any existing text of a resource.

source

pub fn to_txt_file(&self, filename: &str) -> Result<(), StamError>

Writes a plain text file

source

pub fn from_string(id: String, text: String, config: Config) -> Self

Creates a new TextResource from a string, kept in memory entirely.

source

pub fn known_textselection( &self, offset: &Offset ) -> Result<Option<TextSelectionHandle>, StamError>

Finds a known text selection, as specified by the offset. Known textselections are associated with an annotation. Returns a handle. Use the higher-level method [Self.textselection()] instead if you want to return a textselection regardless of whether it’s known or not.

source

pub fn textselections( &self ) -> impl Iterator<Item = WrappedItem<'_, TextSelection>>

Returns an unsorted iterator over all textselections in this resource. Use this only if order doesn’t matter. For a sorted version, use Self::iter() or Self::range() instead.

source

pub fn textselections_len(&self) -> usize

source

pub fn range<'a>(&'a self, begin: usize, end: usize) -> TextSelectionIter<'a>

Returns a sorted double-ended iterator over a range of textselections. It yields all textselections that either start or end in this range (depending on the direction you’re iterating in).
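
The idea of a sorted, double-ended range query can be sketched with the standard library alone. This is a minimal illustration, not stam's internal implementation: the `BeginIndex` type and the functions `begins_in_range` and `begins_in_range_rev` are hypothetical names, and the sketch only covers the forward case where selections are looked up by their begin position.

```rust
use std::collections::BTreeMap;

/// Hypothetical index: begin position -> end positions of the selections starting there.
type BeginIndex = BTreeMap<usize, Vec<usize>>;

/// Collects (begin, end) pairs whose begin lies in `begin..end`, in ascending order.
fn begins_in_range(index: &BeginIndex, begin: usize, end: usize) -> Vec<(usize, usize)> {
    if begin > end {
        // BTreeMap::range panics on an inverted range, so bail out early.
        return Vec::new();
    }
    index
        .range(begin..end)
        .flat_map(|(b, ends)| ends.iter().map(move |e| (*b, *e)))
        .collect()
}

/// The same query walked from the back, which a double-ended iterator allows.
fn begins_in_range_rev(index: &BeginIndex, begin: usize, end: usize) -> Vec<(usize, usize)> {
    if begin > end {
        return Vec::new();
    }
    index
        .range(begin..end)
        .rev()
        .flat_map(|(b, ends)| ends.iter().map(move |e| (*b, *e)))
        .collect()
}
```

A sorted map keeps the positions ordered at insertion time, so a range query needs no separate sorting step and can be consumed from either end.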

source

pub fn iter<'a>(&'a self) -> TextSelectionIter<'a>

Returns a sorted double-ended iterator over all textselections in this resource. For an unsorted (slightly more performant) iterator, use TextResource::textselections() instead.

source

pub fn positions<'a>( &'a self, mode: PositionMode ) -> Box<dyn Iterator<Item = &'a usize> + 'a>

Returns a sorted iterator over all absolute positions (begin-aligned cursors) that are in use. By passing a PositionMode parameter you can specify whether you want only positions where a textselection begins, ends, or both.
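
To make the effect of the mode parameter concrete, here is a small std-only sketch. It assumes a hypothetical index that records, per position, how many selections begin and end there; the names `Mode`, `Entry`, `build_index` and `positions` are illustrative and are not taken from stam.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a position mode: which boundaries count.
#[derive(Clone, Copy)]
enum Mode {
    Begin,
    End,
    Both,
}

/// Hypothetical index entry: number of selections beginning or ending at a position.
#[derive(Default)]
struct Entry {
    begins: usize,
    ends: usize,
}

/// Builds the index from (begin, end) pairs; the BTreeMap keeps positions sorted.
fn build_index(selections: &[(usize, usize)]) -> BTreeMap<usize, Entry> {
    let mut index: BTreeMap<usize, Entry> = BTreeMap::new();
    for &(begin, end) in selections {
        index.entry(begin).or_default().begins += 1;
        index.entry(end).or_default().ends += 1;
    }
    index
}

/// Returns the sorted positions that match the requested mode.
fn positions(index: &BTreeMap<usize, Entry>, mode: Mode) -> Vec<usize> {
    index
        .iter()
        .filter(|(_, entry)| match mode {
            Mode::Begin => entry.begins > 0,
            Mode::End => entry.ends > 0,
            Mode::Both => true,
        })
        .map(|(pos, _)| *pos)
        .collect()
}
```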

source

pub fn position(&self, index: usize) -> Option<&PositionIndexItem>

Looks up a position (unicode point) in the PositionIndex. Low-level function. Only works for positions at which a TextSelection starts or ends (non-inclusive); returns None otherwise.

source

pub fn positionindex_len(&self) -> usize

Returns the number of positions in the positionindex

source§

impl TextResource

source

pub fn textselections_by_operator_ref<'store, 'q>( &'store self, operator: TextSelectionOperator, refset: &'q TextSelectionSet ) -> FindTextSelectionsIter<'store, 'q>

Applies a TextSelectionOperator to find text selections. This is a low-level method. Use Self::find_textselections() instead.

source

pub fn textselections_by_operator<'store>( &'store self, operator: TextSelectionOperator, refset: TextSelectionSet ) -> FindTextSelectionsOwnedIter<'store>

source

pub fn find_textselections_ref<'store, 'q>( &'store self, operator: TextSelectionOperator, refset: &'q TextSelectionSet ) -> impl Iterator<Item = WrappedItem<'store, TextSelection>> + 'q where 'store: 'q,

Finds textselections by applying a text selection operator (TextSelectionOperator) to one or more querying textselections (in a [`TextSelectionSet`]). Returns an iterator over all matching text selections in the resource, as [`WrappedItem`].

source

pub fn find_textselections<'store>( &'store self, operator: TextSelectionOperator, refset: TextSelectionSet ) -> impl Iterator<Item = WrappedItem<'store, TextSelection>>

Finds textselections by applying a text selection operator (TextSelectionOperator) to one or more querying textselections (in a [`TextSelectionSet`]). Returns an iterator over all matching text selections in the resource, as [`WrappedItem`].

Trait Implementations§

source§

impl AssociatedFile for TextResource

source§

fn filename(&self) -> Option<&str>

Get the filename for stand-off file specified using @include (if any)

source§

fn set_filename(&mut self, filename: &str) -> &mut Self

Set the filename for stand-off file specified using @include (if any)

source§

fn with_filename(self, filename: &str) -> Self where Self: Sized,

source§

fn filename_without_extension(&self) -> Option<&str>

Returns the filename without (known!) extension. The extension must be a known extension used by STAM for this to work.
source§

fn filename_without_workdir(&self) -> Option<&str>

Serializes the filename ready for use with STAM JSON’s @include or STAM CSV. It basically only strips the workdir component, if any.
source§

impl Clone for TextResource

source§

fn clone(&self) -> TextResource

Returns a copy of the value. Read more
1.0.0 · source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
source§

impl Configurable for TextResource

source§

fn config(&self) -> &Config

source§

fn config_mut(&mut self) -> &mut Config

source§

fn set_config(&mut self, config: Config) -> &mut Self

Setter to associate a configuration
source§

fn with_config(self, config: Config) -> Self

Builder pattern to associate a configuration
source§

impl Debug for TextResource

source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
source§

impl<'de> Deserialize<'de> for TextResource

source§

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error> where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer. Read more
source§

impl PartialEq<TextResource> for TextResource

source§

fn eq(&self, other: &TextResource) -> bool

This method tests for self and other values to be equal, and is used by ==.
1.0.0 · source§

fn ne(&self, other: &Rhs) -> bool

This method tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
source§

impl SelfSelector for TextResource

source§

fn selector(&self) -> Result<Selector, StamError>

Returns a selector to this resource

source§

impl Serialize for TextResource

source§

fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error> where S: Serializer,

Serialize this value into the given Serde serializer. Read more
source§

impl Storable for TextResource

§

type HandleType = TextResourceHandle

§

type StoreType = AnnotationStore

source§

fn id(&self) -> Option<&str>

Get the public ID
source§

fn handle(&self) -> Option<TextResourceHandle>

Retrieve the internal (numeric) id. For any type T used in StoreFor<T>, this may be None only in the initial stage when it is still unbound to a store.
source§

fn with_id(self, id: String) -> Self

Builder pattern to set the public Id
source§

fn set_handle(&mut self, handle: TextResourceHandle)

Set the internal ID. May only be called once (though currently not enforced).
source§

fn carries_id() -> bool

Does this type support an ID?
source§

fn handle_or_err(&self) -> Result<Self::HandleType, StamError>

Like Self::handle() but returns a StamError::Unbound error if there is no internal id.
source§

fn id_or_err(&self) -> Result<&str, StamError>

Like Self::id() but returns a StamError::NoIdError error if there is no public id.
source§

fn wrap_in<'store>( &'store self, store: &'store Self::StoreType ) -> Result<WrappedItem<'store, Self>, StamError> where Self: Sized,

Returns a wrapped reference to this item and the store that owns it. This allows for some more introspection on the part of the item. This is the reverse of StoreFor<T>::wrap().
source§

fn wrap_owned_in<'store>( self, store: &'store Self::StoreType ) -> Result<WrappedItem<'store, Self>, StamError> where Self: Sized,

Returns a wrapped reference to this item and the store that owns it. This allows for some more introspection on the part of the item. This is the reverse of StoreFor<T>::wrap_owned().
source§

fn bound(&mut self)

Callback function that is called after an item is bound to a store
source§

fn generate_id(self, idmap: Option<&mut IdMap<Self::HandleType>>) -> Self where Self: Sized,

Generate a random ID in a given idmap (adds it to the map and assigns it to the item)
source§

impl StoreFor<TextResource> for AnnotationStore

source§

fn store(&self) -> &Store<TextResource>

Get a reference to the entire store for the associated type

source§

fn store_mut(&mut self) -> &mut Store<TextResource>

Get a mutable reference to the entire store for the associated type

source§

fn idmap(&self) -> Option<&IdMap<TextResourceHandle>>

Get a reference to the id map for the associated type, mapping global ids to internal ids

source§

fn idmap_mut(&mut self) -> Option<&mut IdMap<TextResourceHandle>>

Get a mutable reference to the id map for the associated type, mapping global ids to internal ids

source§

fn preinsert(&self, item: &mut TextResource) -> Result<(), StamError>

Called prior to inserting an item into the store. If it returns an error, the insert will be cancelled. Allows for bookkeeping such as inheriting configuration parameters from the parent to the item.

source§

fn preremove(&mut self, handle: TextResourceHandle) -> Result<(), StamError>

Called before the item is removed from the store. Updates the relation maps; there is no need to call this manually.

source§

fn store_typeinfo() -> &'static str

source§

fn insert(&mut self, item: T) -> Result<T::HandleType, StamError>

Adds an item to the store. Returns a handle to it upon success.
source§

fn inserted(&mut self, handle: T::HandleType) -> Result<(), StamError>

Called after an item was inserted into the store. Allows the store to do further bookkeeping, like updating relation maps.
source§

fn add(self, item: T) -> Result<Self, StamError> where Self: Sized,

source§

fn has<'a, 'b>(&'a self, item: &Item<'b, T>) -> bool

Returns true if the store has the item
source§

unsafe fn get_unchecked(&self, handle: T::HandleType) -> Option<&T>

Get a reference to an item from the store, by handle, without checking validity. Read more
source§

fn get<'a, 'b>(&'a self, item: &Item<'b, T>) -> Result<&'a T, StamError>

Get a reference to an item from the store
source§

fn get_mut<'a, 'b>(&mut self, item: &Item<'b, T>) -> Result<&mut T, StamError>

Get a mutable reference to an item from the store by internal ID
source§

fn remove(&mut self, handle: T::HandleType) -> Result<(), StamError>

Removes an item by handle, returns an error if the item has dependencies and can’t be removed
source§

fn resolve_id(&self, id: &str) -> Result<T::HandleType, StamError>

Resolves an ID to a handle. You usually don’t want to call this directly.
source§

fn owns(&self, item: &T) -> Option<bool>

Tests if the item is owned by the store, returns None if ownership is unknown
source§

fn iter<'a>(&'a self) -> StoreIter<'a, T> where T: Storable<StoreType = Self>,

Iterate over the store
source§

fn iter_mut<'a>(&'a mut self) -> StoreIterMut<'a, T>

Iterate over the store, mutably
source§

fn next_handle(&self) -> T::HandleType

Return the internal id that will be assigned for the next item to the store
source§

fn last_handle(&self) -> T::HandleType

Return the internal id that was assigned to the last inserted item
source§

fn bind(&mut self, item: T) -> Result<T, StamError>

This binds an item to the store PRIOR to it being actually added. You should never need to call this directly (it can only be called once per item anyway).
source§

fn wrap<'a>(&'a self, item: &'a T) -> Result<WrappedItem<'_, T>, StamError> where T: Storable<StoreType = Self>,

Wraps the item in a smart pointer that also holds a reference to this store. This method performs some extra checks to verify that the item is indeed owned by the store and returns an error if not.
source§

fn wrap_owned<'a>(&'a self, item: T) -> Result<WrappedItem<'_, T>, StamError> where T: Storable<StoreType = Self>,

Wraps the item in a smart pointer that also holds a reference to this store. Ownership is retained with this method, i.e. the store does NOT own the item.
source§

fn wrap_store<'a>(&'a self) -> WrappedStore<'_, T, Self> where Self: Sized,

Wraps the entire store along with a reference to self. Low-level method that you won’t need.
source§

impl StoreFor<TextSelection> for TextResource

source§

fn store(&self) -> &Store<TextSelection>

Get a reference to the entire store for the associated type

source§

fn store_mut(&mut self) -> &mut Store<TextSelection>

Get a mutable reference to the entire store for the associated type

source§

fn idmap(&self) -> Option<&IdMap<TextSelectionHandle>>

Get a reference to the id map for the associated type, mapping global ids to internal ids

source§

fn idmap_mut(&mut self) -> Option<&mut IdMap<TextSelectionHandle>>

Get a mutable reference to the id map for the associated type, mapping global ids to internal ids

source§

fn store_typeinfo() -> &'static str

source§

fn inserted(&mut self, handle: TextSelectionHandle) -> Result<(), StamError>

Called after an item was inserted into the store. Allows the store to do further bookkeeping, like updating relation maps.
source§

fn insert(&mut self, item: T) -> Result<T::HandleType, StamError>

Adds an item to the store. Returns a handle to it upon success.
source§

fn preinsert(&self, item: &mut T) -> Result<(), StamError>

Called prior to inserting an item into the store. If it returns an error, the insert will be cancelled. Allows for bookkeeping such as inheriting configuration parameters from the parent to the item.
source§

fn add(self, item: T) -> Result<Self, StamError> where Self: Sized,

source§

fn has<'a, 'b>(&'a self, item: &Item<'b, T>) -> bool

Returns true if the store has the item
source§

unsafe fn get_unchecked(&self, handle: T::HandleType) -> Option<&T>

Get a reference to an item from the store, by handle, without checking validity. Read more
source§

fn get<'a, 'b>(&'a self, item: &Item<'b, T>) -> Result<&'a T, StamError>

Get a reference to an item from the store
source§

fn get_mut<'a, 'b>(&mut self, item: &Item<'b, T>) -> Result<&mut T, StamError>

Get a mutable reference to an item from the store by internal ID
source§

fn remove(&mut self, handle: T::HandleType) -> Result<(), StamError>

Removes an item by handle, returns an error if the item has dependencies and can’t be removed
source§

fn preremove(&mut self, handle: T::HandleType) -> Result<(), StamError>

Called before an item is removed from the store. Allows the store to do further bookkeeping, like updating relation maps.
source§

fn resolve_id(&self, id: &str) -> Result<T::HandleType, StamError>

Resolves an ID to a handle. You usually don’t want to call this directly.
source§

fn owns(&self, item: &T) -> Option<bool>

Tests if the item is owned by the store, returns None if ownership is unknown
source§

fn iter<'a>(&'a self) -> StoreIter<'a, T> where T: Storable<StoreType = Self>,

Iterate over the store
source§

fn iter_mut<'a>(&'a mut self) -> StoreIterMut<'a, T>

Iterate over the store, mutably
source§

fn next_handle(&self) -> T::HandleType

Return the internal id that will be assigned for the next item to the store
source§

fn last_handle(&self) -> T::HandleType

Return the internal id that was assigned to the last inserted item
source§

fn bind(&mut self, item: T) -> Result<T, StamError>

This binds an item to the store PRIOR to it being actually added. You should never need to call this directly (it can only be called once per item anyway).
source§

fn wrap<'a>(&'a self, item: &'a T) -> Result<WrappedItem<'_, T>, StamError> where T: Storable<StoreType = Self>,

Wraps the item in a smart pointer that also holds a reference to this store. This method performs some extra checks to verify that the item is indeed owned by the store and returns an error if not.
source§

fn wrap_owned<'a>(&'a self, item: T) -> Result<WrappedItem<'_, T>, StamError> where T: Storable<StoreType = Self>,

Wraps the item in a smart pointer that also holds a reference to this store. Ownership is retained with this method, i.e. the store does NOT own the item.
source§

fn wrap_store<'a>(&'a self) -> WrappedStore<'_, T, Self> where Self: Sized,

Wraps the entire store along with a reference to self. Low-level method that you won’t need.
source§

impl<'store> Text<'store, 'store> for TextResource

source§

fn textlen(&self) -> usize

Returns the length of the text in unicode points. For bytes, use self.text().len() instead.
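
The difference between the two lengths is easy to see on a plain `&str`. The sketch below uses only the standard library; `len_in_points` and `len_in_bytes` are illustrative helper names, not part of stam.

```rust
/// Length in Unicode code points (what a character-based offset counts).
fn len_in_points(text: &str) -> usize {
    text.chars().count()
}

/// Length in UTF-8 bytes (what `str::len` reports).
fn len_in_bytes(text: &str) -> usize {
    text.len()
}
```

For pure ASCII text both values coincide; they diverge as soon as the text contains characters that need more than one byte in UTF-8.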

source§

fn text(&'store self) -> &'store str

Returns a reference to the full text of this resource

source§

fn text_by_offset( &'store self, offset: &Offset ) -> Result<&'store str, StamError>

Returns a string reference to a slice of text as specified by the offset

source§

fn utf8byte(&self, abscursor: usize) -> Result<usize, StamError>

Resolves a begin-aligned cursor to a UTF-8 byte position. If you have a Cursor instance, pass it through [Self.beginaligned_cursor()] first.
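
As an illustration of what such a conversion involves, the following std-only sketch maps a character position to a byte offset with a plain linear scan. The function name `char_to_byte` is hypothetical, and the scan says nothing about how stam itself performs the lookup.

```rust
/// Maps a character position to its UTF-8 byte offset.
/// A position equal to the number of characters maps to `text.len()`
/// (the end of the text); anything beyond that yields `None`.
fn char_to_byte(text: &str, charpos: usize) -> Option<usize> {
    text.char_indices()
        .map(|(byte, _)| byte)
        .chain(std::iter::once(text.len()))
        .nth(charpos)
}
```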

source§

fn utf8byte_to_charpos(&self, bytecursor: usize) -> Result<usize, StamError>

Converts a UTF-8 byte position to a unicode point. This is O(n) and not as efficient as the reverse operation in [`utf8byte()`].
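
The linear cost comes from having to count every character before the byte position. A minimal std-only sketch of that counting, with the hypothetical name `byte_to_char`:

```rust
/// Maps a UTF-8 byte offset to a character position by counting the
/// characters in front of it. Returns `None` if the offset is out of
/// bounds or falls in the middle of a multi-byte character.
fn byte_to_char(text: &str, bytepos: usize) -> Option<usize> {
    if !text.is_char_boundary(bytepos) {
        return None;
    }
    Some(text[..bytepos].chars().count())
}
```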

source§

fn textselection( &'store self, offset: &Offset ) -> Result<WrappedItem<'store, TextSelection>, StamError>

Returns a [`TextSelection`] that corresponds to the offset. If the TextSelection exists, the existing one will be returned. If it doesn’t exist yet, a new one will be returned, and it won’t have a handle, nor will it be added to the store automatically.

The TextSelection is returned in a far pointer (WrappedItem) that also contains a reference to the underlying store.

Use [Self::has_textselection()] instead if you want to limit to existing text selections (i.e. those pertaining to annotations) only.

source§

fn find_text_regex<'regex>( &'store self, expressions: &'regex [Regex], precompiledset: Option<&RegexSet>, allow_overlap: bool ) -> Result<FindRegexIter<'store, 'regex>, StamError>

Searches the text using one or more regular expressions. Returns an iterator over TextSelections along with the matching expression; this is held by the [`FindRegexMatch`] struct.

Passing multiple regular expressions at once is more efficient than calling this function anew for each one. If capture groups are used in the regular expression, only those parts will be returned (the rest is context). If none are used, the entire expression is returned.

The allow_overlap parameter determines if the matching expressions are allowed to overlap. If you are doing some form of tokenisation, you also likely want this set to false. All of this only matters if you supply multiple regular expressions.

Results are returned in the exact order they are found in the text.

source§

fn find_text<'fragment>( &'store self, fragment: &'fragment str ) -> FindTextIter<'store, 'fragment>

Searches for the specified text fragment. Returns an iterator to iterate over all matches in the text. The iterator returns TextSelection items.

This search is case sensitive; use [Self.find_text_nocase()] to search case insensitively. For more complex and powerful searching, use [Self.find_text_regex()] instead.

If you want to search only a subpart of the text, extract a [`TextSelection`] first with [Self.textselection()] and then run `find_text()` on that instead.
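
A case-sensitive fragment search that reports character offsets can be sketched with the standard library as follows. This is only an illustration of the concept: `find_all` is a hypothetical helper, it returns plain `(begin, end)` pairs rather than TextSelection items, and it reports non-overlapping matches, which is an assumption of the sketch rather than a documented property of `find_text()`.

```rust
/// Returns (begin, end) character offsets for every non-overlapping,
/// case-sensitive occurrence of `fragment` in `text`.
fn find_all(text: &str, fragment: &str) -> Vec<(usize, usize)> {
    if fragment.is_empty() {
        return Vec::new();
    }
    let fragment_len = fragment.chars().count();
    text.match_indices(fragment)
        .map(|(byte, _)| {
            // Convert the byte offset of the match to a character offset.
            let begin = text[..byte].chars().count();
            (begin, begin + fragment_len)
        })
        .collect()
}
```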

source§

fn find_text_nocase(&'store self, fragment: &str) -> FindNoCaseTextIter<'store>

Searches for the specified text fragment. Returns an iterator to iterate over all matches in the text. The iterator returns TextSelection items.

This search is case insensitive; use [Self.find_text()] to search case sensitively. This variant is slightly less performant than the exact variant. For more complex and powerful searching, use [Self.find_text_regex()] instead.

If you want to search only a subpart of the text, extract a [`TextSelection`] first with [Self.textselection()] and then run `find_text_nocase()` on that instead.
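
One way to see why a case-insensitive search costs more is to compare characters one by one after lowercasing them. The sketch below does exactly that with the standard library; `find_all_nocase` is a hypothetical helper, it works on character offsets, and unlike a typical substring search it also reports overlapping matches. None of this is claimed to mirror stam's implementation.

```rust
/// Returns (begin, end) character offsets of every case-insensitive
/// occurrence of `fragment` in `text`, including overlapping ones.
fn find_all_nocase(text: &str, fragment: &str) -> Vec<(usize, usize)> {
    let hay: Vec<char> = text.chars().collect();
    let needle: Vec<char> = fragment.chars().collect();
    if needle.is_empty() || needle.len() > hay.len() {
        return Vec::new();
    }
    // Two characters match if their lowercase expansions are identical.
    let same = |a: char, b: char| a.to_lowercase().eq(b.to_lowercase());
    (0..=hay.len() - needle.len())
        .filter(|&start| {
            needle
                .iter()
                .enumerate()
                .all(|(i, &n)| same(hay[start + i], n))
        })
        .map(|start| (start, start + needle.len()))
        .collect()
}
```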

source§

fn subslice_utf8_offset(&self, subslice: &str) -> Option<usize>

Finds the utf-8 byte position where the specified text subslice begins
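
A common way to locate a subslice inside its parent string is to compare the two pointers. The following std-only sketch shows that technique under the assumption that the subslice really was borrowed from the text; `subslice_offset` is a hypothetical name and the sketch is not taken from stam's source.

```rust
/// Returns the byte offset of `subslice` within `text`, or `None` if the
/// subslice does not point into the memory occupied by `text`.
fn subslice_offset(text: &str, subslice: &str) -> Option<usize> {
    let base = text.as_ptr() as usize;
    let sub = subslice.as_ptr() as usize;
    if sub < base || sub + subslice.len() > base + text.len() {
        return None;
    }
    Some(sub - base)
}
```

Note that this compares memory addresses, not contents: a separately allocated string with identical characters is not considered a subslice.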

source§

fn absolute_cursor(&self, cursor: usize) -> usize

Resolves a begin-aligned cursor to an absolute cursor (i.e. relative to the TextResource).
source§

fn split_text<'b>(&'store self, delimiter: &'b str) -> SplitTextIter<'store, 'b>

Returns an iterator of [`TextSelection`] instances that represent partitions of the text given the specified delimiter. No text is modified.
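
Splitting without modifying the text amounts to computing offsets into it. A minimal std-only sketch, using byte offsets and the hypothetical name `split_offsets` (stam returns TextSelection instances instead of raw pairs):

```rust
/// Returns the (begin, end) byte offsets of the partitions of `text`
/// separated by `delimiter`. The text itself is left untouched.
fn split_offsets(text: &str, delimiter: &str) -> Vec<(usize, usize)> {
    if delimiter.is_empty() {
        // Treat an empty delimiter as "no split" in this sketch.
        return vec![(0, text.len())];
    }
    let mut result = Vec::new();
    let mut begin = 0;
    for (pos, matched) in text.match_indices(delimiter) {
        result.push((begin, pos));
        begin = pos + matched.len();
    }
    result.push((begin, text.len()));
    result
}
```

Each pair can be used to slice the original text, for example `&text[3..5]`, so no copies of the partitions are needed.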
source§

fn find_text_sequence<'fragment, F>( &'slf self, fragments: &'fragment [&'fragment str], allow_skip_char: F, case_sensitive: bool ) -> Option<Vec<WrappedItem<'store, TextSelection>>> where F: Fn(char) -> bool,

Searches for multiple text fragments in sequence. Returns a vector with (wrapped) TextSelection instances.
source§

fn trim_text( &'slf self, chars: &[char] ) -> Result<WrappedItem<'store, TextSelection>, StamError>

Trims all occurrences of any character in chars from both the beginning and end of the text, returning a smaller TextSelection. No text is modified.
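
Trimming without modifying the text can be expressed as a pair of offsets into the original. The sketch below uses byte offsets and standard-library trimming; `trim_offsets` is a hypothetical helper and not stam's implementation.

```rust
/// Returns the (begin, end) byte offsets that remain after trimming any of
/// `chars` from both ends of `text`. The text itself is not changed.
fn trim_offsets(text: &str, chars: &[char]) -> (usize, usize) {
    let without_prefix = text.trim_start_matches(chars);
    let begin = text.len() - without_prefix.len();
    let trimmed = without_prefix.trim_end_matches(chars);
    (begin, begin + trimmed.len())
}
```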
source§

fn beginaligned_cursor(&self, cursor: &Cursor) -> Result<usize, StamError>

Resolves a cursor to a begin aligned cursor, resolving all relative end-aligned positions
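
The resolution step can be sketched as follows. The `Cursor` enum here is a hypothetical stand-in defined for this example, with a begin-aligned variant counting from the start and an end-aligned variant holding zero or a negative distance from the end; the function `begin_aligned` and its `Option` return type are likewise assumptions of the sketch.

```rust
/// Hypothetical stand-in for a cursor: either counted from the start of
/// the text, or as a non-positive distance from its end.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Cursor {
    BeginAligned(usize),
    EndAligned(isize),
}

/// Resolves a cursor to a begin-aligned position within a text of
/// `textlen` characters. Returns `None` if the cursor is out of bounds.
fn begin_aligned(cursor: Cursor, textlen: usize) -> Option<usize> {
    match cursor {
        Cursor::BeginAligned(pos) if pos <= textlen => Some(pos),
        Cursor::BeginAligned(_) => None,
        Cursor::EndAligned(distance) if distance <= 0 => {
            textlen.checked_sub(distance.unsigned_abs())
        }
        Cursor::EndAligned(_) => None,
    }
}
```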
source§

fn absolute_offset(&self, offset: &Offset) -> Result<Offset, StamError>

Resolves a relative offset (relative to another TextSelection) to an absolute one (in terms of the underlying TextResource)
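
In the simplest case, with begin-aligned positions only, making an offset absolute means shifting it by the begin position of the selection it is relative to. A minimal sketch under that assumption, with the hypothetical name `to_absolute` and plain `(begin, end)` pairs instead of Offset values:

```rust
/// Shifts a (begin, end) offset that is relative to `parent` into the
/// coordinate space of the whole text. Returns `None` if the relative
/// offset is inverted or reaches beyond the end of the parent.
fn to_absolute(parent: (usize, usize), relative: (usize, usize)) -> Option<(usize, usize)> {
    let (parent_begin, parent_end) = parent;
    let (rel_begin, rel_end) = relative;
    let begin = parent_begin.checked_add(rel_begin)?;
    let end = parent_begin.checked_add(rel_end)?;
    if rel_begin <= rel_end && end <= parent_end {
        Some((begin, end))
    } else {
        None
    }
}
```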
source§

impl ToJson for TextResource

source§

fn to_json_writer<W>(&self, writer: W, compact: bool) -> Result<(), StamError> where W: Write,

Writes a serialisation (choose a dataformat) to any writer. Lower-level function.
source§

fn to_json_file(&self, filename: &str, config: &Config) -> Result<(), StamError>

Writes this structure to a file. The actual dataformat can be set via config; the default is STAM JSON.
source§

fn to_json_string(&self, config: &Config) -> Result<String, StamError>

Serializes this structure to one string. The actual dataformat can be set via config; the default is STAM JSON. If config is not specified, an attempt to fetch the AnnotationStore’s initial config is made.
source§

impl TryFrom<TextResourceBuilder> for TextResource

§

type Error = StamError

The type returned in the event of a conversion error.
source§

fn try_from(builder: TextResourceBuilder) -> Result<Self, StamError>

Performs the conversion.
source§

impl TypeInfo for TextResource

Auto Trait Implementations§

Blanket Implementations§

source§

impl<T> Any for T where T: 'static + ?Sized,

source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
source§

impl<T> Borrow<T> for T where T: ?Sized,

source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
source§

impl<T> BorrowMut<T> for T where T: ?Sized,

source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
source§

impl<T> From<T> for T

source§

fn from(t: T) -> T

Returns the argument unchanged.

source§

impl<T, U> Into<U> for T where U: From<T>,

source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

source§

impl<T> ToOwned for T where T: Clone,

§

type Owned = T

The resulting type after obtaining ownership.
source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
source§

impl<T, U> TryFrom<U> for T where U: Into<T>,

§

type Error = Infallible

The type returned in the event of a conversion error.
source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
source§

impl<T, U> TryInto<U> for T where U: TryFrom<T>,

§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
§

impl<V, T> VZip<V> for T where V: MultiLane<T>,

§

fn vzip(self) -> V

source§

impl<T> DeserializeOwned for T where T: for<'de> Deserialize<'de>,