pub struct ReadonlyRefgetStore { /* private fields */ }
Core refget store with &self read methods, suitable for Arc sharing in servers.
Mutating methods are used during the setup/loading phase; once wrapped in Arc,
only &self reads are accessible, making concurrent access thread-safe.
Holds a global sequence_store with all sequences (across collections) deduplicated. This allows lookup by sequence digest directly (bypassing collection information). Also holds a collections hashmap, to provide lookup by collection+name.
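The two-phase pattern described above (mutate during setup, then share read-only behind an Arc) can be sketched as follows. This is a minimal illustration only: `ToyStore`, its fields, and `serve_lengths` are hypothetical stand-ins, since this page does not show how a ReadonlyRefgetStore is constructed.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the store: digest -> sequence bytes.
struct ToyStore {
    sequences: HashMap<String, Vec<u8>>,
}

impl ToyStore {
    fn new() -> Self {
        ToyStore { sequences: HashMap::new() }
    }

    /// Setup-phase mutation: requires exclusive access (`&mut self`).
    fn add_sequence(&mut self, digest: &str, data: &[u8]) {
        self.sequences.insert(digest.to_string(), data.to_vec());
    }

    /// Serving-phase read: only needs shared access (`&self`).
    fn sequence_len(&self, digest: &str) -> Option<usize> {
        self.sequences.get(digest).map(|s| s.len())
    }
}

/// Build the store mutably, then wrap it in an `Arc` and read from several threads.
fn serve_lengths(workers: usize) -> Vec<Option<usize>> {
    let mut store = ToyStore::new();
    store.add_sequence("digest-a", b"ACGTACGT");

    // Once the Arc is cloned, each holder can only reach `&ToyStore`.
    let shared = Arc::new(store);

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let store = Arc::clone(&shared);
            thread::spawn(move || store.sequence_len("digest-a"))
        })
        .collect();

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Because the read path takes `&self`, no lock is needed in this sketch; a design that must mutate after sharing would need something like a `RwLock` around the store instead.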
Implementations
impl ReadonlyRefgetStore
pub fn store_exists<P: AsRef<Path>>(path: P) -> bool
Check whether a valid RefgetStore exists at the given path.
pub fn set_encoding_mode(&mut self, new_mode: StorageMode)
Change the storage mode, re-encoding/decoding existing sequences as needed.
pub fn enable_encoding(&mut self)
Enable 2-bit encoding for space efficiency.
pub fn disable_encoding(&mut self)
Disable encoding, use raw byte storage.
pub fn enable_persistence<P: AsRef<Path>>(&mut self, path: P) -> Result<()>
Enable disk persistence for this store.
pub fn disable_persistence(&mut self)
Disable disk persistence for this store.
pub fn is_persisting(&self) -> bool
Check if persistence to disk is enabled.
pub fn add_sequence<T: Into<Option<[u8; 48]>>>(&mut self, sequence_record: SequenceRecord, collection_digest: T, force: bool) -> Result<()>
Adds a sequence to the store.
pub fn add_sequence_collection(&mut self, collection: SequenceCollection) -> Result<()>
Adds a collection, and all sequences in it, to the store.
pub fn add_sequence_collection_force(&mut self, collection: SequenceCollection) -> Result<()>
Adds a collection, overwriting existing data.
pub fn add_sequence_record(&mut self, sr: SequenceRecord, force: bool) -> Result<()>
Adds a SequenceRecord directly to the store without collection association.
pub fn sequence_digests(&self) -> impl Iterator<Item = [u8; 48]> + '_
Returns an iterator over all sequence digests in the store.
pub fn sequence_metadata(&self) -> impl Iterator<Item = &SequenceMetadata> + '_
Returns an iterator over sequence metadata for all sequences in the store.
pub fn total_disk_size(&self) -> usize
Calculate the total disk size of all sequences in the store.
pub fn actual_disk_usage(&self) -> usize
Returns the actual disk usage of the store directory.
pub fn list_collections(&self, page: usize, page_size: usize, filters: &[(&str, &str)]) -> Result<PagedResult<SequenceCollectionMetadata>>
List collections with pagination and optional attribute filtering.
pub fn get_collection_metadata<K: AsRef<[u8]>>(&self, collection_digest: K) -> Option<&SequenceCollectionMetadata>
Get metadata for a single collection by digest (no sequence data).
pub fn get_collection(&self, collection_digest: &str) -> Result<SequenceCollection>
Get a collection with all its sequences loaded.
pub fn remove_collection(&mut self, digest: &str, remove_orphan_sequences: bool) -> Result<bool>
Remove a collection from the store.
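Because sequences are deduplicated across collections, removing a collection raises the question of what to do with sequences no other collection references. The sketch below shows one plausible reading of remove_orphan_sequences. The `ToyCatalog` type, its fields, and the exact semantics (including the plain bool return, where the real method returns Result<bool>) are assumptions for illustration, not the crate's implementation.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in: a deduplicated sequence map plus collections
/// that reference sequences by digest.
struct ToyCatalog {
    sequences: HashMap<String, Vec<u8>>,
    collections: HashMap<String, Vec<String>>,
}

impl ToyCatalog {
    /// Returns true if the collection existed and was removed.
    /// Assumed semantics: with `remove_orphan_sequences`, also drop member
    /// sequences that no remaining collection references.
    fn remove_collection(&mut self, digest: &str, remove_orphan_sequences: bool) -> bool {
        let Some(members) = self.collections.remove(digest) else {
            return false;
        };
        if remove_orphan_sequences {
            // Digests still referenced by the collections that remain.
            let still_used: HashSet<&String> = self.collections.values().flatten().collect();
            let orphans: Vec<String> = members
                .into_iter()
                .filter(|d| !still_used.contains(d))
                .collect();
            for d in orphans {
                self.sequences.remove(&d);
            }
        }
        true
    }
}
```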
pub fn import_collection(&mut self, source: &ReadonlyRefgetStore, digest: &str) -> Result<()>
Import a single collection (with all its sequences, aliases, and FHR metadata) from another store into this store.
The source store must have the collection loaded (call
load_collection() or load_all_collections() first).
pub fn list_sequences(&self) -> Vec<SequenceMetadata>
List all sequences in the store (metadata only, no sequence data).
pub fn get_sequence_metadata<K: AsRef<[u8]>>(&self, seq_digest: K) -> Option<&SequenceMetadata>
Get metadata for a single sequence by digest (no sequence data).
pub fn get_sequence<K: AsRef<[u8]>>(&self, seq_digest: K) -> Result<&SequenceRecord>
Get a sequence by its SHA512t24u digest.
pub fn ensure_decoded<K: AsRef<[u8]>>(&mut self, seq_digest: K) -> Result<()>
Ensure a sequence is loaded and decoded into the decoded cache.
pub fn clear_decoded_cache(&mut self)
Clear the decoded sequence cache to reclaim memory.
pub fn sequence_bytes<K: AsRef<[u8]>>(&self, seq_digest: K) -> Option<&[u8]>
Get decoded sequence bytes from the cache.
pub fn get_sequence_by_name<K: AsRef<[u8]>>(&self, collection_digest: K, sequence_name: &str) -> Result<&SequenceRecord>
Get a sequence by collection digest and name.
pub fn load_all_collections(&mut self) -> Result<()>
Eagerly load all Stub collections to Full.
pub fn load_all_sequences(&mut self) -> Result<()>
Eagerly load all Stub sequences to Full.
pub fn load_collection(&mut self, digest: &str) -> Result<()>
Load a single collection by digest.
pub fn load_sequence(&mut self, digest: &str) -> Result<()>
Load a single sequence by digest.
pub fn iter_collections(&self) -> impl Iterator<Item = SequenceCollection> + '_
Iterate over all collections with their sequences loaded.
pub fn iter_sequences(&self) -> impl Iterator<Item = SequenceRecord> + '_
Iterate over all sequences with their data loaded.
pub fn is_collection_loaded<K: AsRef<[u8]>>(&self, collection_digest: K) -> bool
Check if a collection is fully loaded.
pub fn local_path(&self) -> Option<&PathBuf>
Returns the local path where the store is located (if any).
pub fn remote_source(&self) -> Option<&str>
Returns the remote source URL (if any).
pub fn storage_mode(&self) -> StorageMode
Returns the storage mode used by this store.
pub fn get_substring<K: AsRef<[u8]>>(&self, sha512_digest: K, start: usize, end: usize) -> Result<String>
Retrieves a substring from an encoded sequence by its SHA512t24u digest.
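To illustrate how a substring can be read from 2-bit-encoded data without decoding the whole sequence, here is a small sketch. The bit layout (A=0, C=1, G=2, T=3, four bases per byte, first base in the highest bits) and the half-open [start, end) range are assumptions made for this example; the page does not document the store's actual encoding or range convention, and both function names are hypothetical.

```rust
/// Pack an ACGT-only sequence at 2 bits per base, four bases per byte,
/// first base in the highest bits. (Assumed layout, for illustration.)
fn encode_2bit(seq: &[u8]) -> Option<Vec<u8>> {
    let mut out = vec![0u8; (seq.len() + 3) / 4];
    for (i, &base) in seq.iter().enumerate() {
        let code = match base {
            b'A' => 0u8,
            b'C' => 1,
            b'G' => 2,
            b'T' => 3,
            _ => return None, // other symbols would need a wider alphabet
        };
        out[i / 4] |= code << (6 - 2 * (i % 4));
    }
    Some(out)
}

/// Decode the half-open range [start, end), touching only the bytes it needs.
fn substring_2bit(packed: &[u8], seq_len: usize, start: usize, end: usize) -> Option<String> {
    // Reject inverted or out-of-range requests and inconsistent lengths.
    if start > end || end > seq_len || seq_len > packed.len() * 4 {
        return None;
    }
    const BASES: [char; 4] = ['A', 'C', 'G', 'T'];
    let s = (start..end)
        .map(|i| {
            let code = (packed[i / 4] >> (6 - 2 * (i % 4))) & 0b11;
            BASES[code as usize]
        })
        .collect();
    Some(s)
}
```

The sequence length has to be tracked separately from the packed bytes, because the last byte may contain padding bits when the length is not a multiple of four.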
pub fn write_store_to_dir<P: AsRef<Path>>(&self, root_path: P, seqdata_path_template: Option<&str>) -> Result<()>
Write a RefgetStore object to a directory.
pub fn stats(&self) -> StoreStats
Returns statistics about the store.
pub fn available_alias_namespaces(&self) -> AvailableAliases<'_>
List alias namespaces available on this store (from manifest).
impl ReadonlyRefgetStore
pub fn add_sequence_alias(&mut self, namespace: &str, alias: &str, digest: &str) -> Result<()>
Add a sequence alias and persist to disk if applicable.
pub fn get_sequence_metadata_by_alias(&self, namespace: &str, alias: &str) -> Option<&SequenceMetadata>
Resolve a sequence alias to sequence metadata (no data loading).
pub fn get_sequence_by_alias(&self, namespace: &str, alias: &str) -> Result<&SequenceRecord>
Resolve a sequence alias and return the loaded sequence record.
pub fn get_aliases_for_sequence(&self, digest: &str) -> Vec<(String, String)>
Reverse lookup: find all aliases pointing to this sequence digest.
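The forward and reverse alias lookups can be pictured with a namespaced map. The sketch below is an assumption-laden illustration: `AliasTable` is a hypothetical type, and treating the returned tuples as (namespace, alias) pairs is a guess based on the signature, not something this page states.

```rust
use std::collections::BTreeMap;

/// Hypothetical alias table: namespace -> alias -> digest.
#[derive(Default)]
struct AliasTable {
    namespaces: BTreeMap<String, BTreeMap<String, String>>,
}

impl AliasTable {
    fn add(&mut self, namespace: &str, alias: &str, digest: &str) {
        self.namespaces
            .entry(namespace.to_string())
            .or_default()
            .insert(alias.to_string(), digest.to_string());
    }

    /// Forward lookup: (namespace, alias) -> digest.
    fn resolve(&self, namespace: &str, alias: &str) -> Option<&str> {
        self.namespaces.get(namespace)?.get(alias).map(String::as_str)
    }

    /// Reverse lookup: every (namespace, alias) pair that points at `digest`.
    fn aliases_for(&self, digest: &str) -> Vec<(String, String)> {
        let mut found = Vec::new();
        for (ns, aliases) in &self.namespaces {
            for (alias, target) in aliases {
                if target.as_str() == digest {
                    found.push((ns.clone(), alias.clone()));
                }
            }
        }
        found
    }
}
```

The reverse lookup here is a linear scan over all namespaces; a store with many aliases might keep a second digest-to-alias index instead.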
pub fn list_sequence_alias_namespaces(&self) -> Vec<String>
List all sequence alias namespaces.
pub fn list_sequence_aliases(&self, namespace: &str) -> Option<Vec<String>>
List all aliases in a sequence alias namespace.
pub fn remove_sequence_alias(&mut self, namespace: &str, alias: &str) -> Result<bool>
Remove a single sequence alias.
pub fn load_sequence_aliases(&mut self, namespace: &str, path: &str) -> Result<usize>
Load sequence aliases from a TSV file into a namespace.
pub fn add_collection_alias(&mut self, namespace: &str, alias: &str, digest: &str) -> Result<()>
Add a collection alias and persist to disk if applicable.
pub fn get_collection_metadata_by_alias(&self, namespace: &str, alias: &str) -> Option<&SequenceCollectionMetadata>
Resolve a collection alias to collection metadata.
pub fn get_collection_by_alias(&self, namespace: &str, alias: &str) -> Result<SequenceCollection>
Resolve a collection alias and return the loaded collection.
pub fn get_aliases_for_collection(&self, digest: &str) -> Vec<(String, String)>
Reverse lookup: find all aliases pointing to this collection digest.
pub fn list_collection_alias_namespaces(&self) -> Vec<String>
List all collection alias namespaces.
pub fn list_collection_aliases(&self, namespace: &str) -> Option<Vec<String>>
List all aliases in a collection alias namespace.
impl ReadonlyRefgetStore
pub fn set_fhr_metadata(&mut self, collection_digest: &str, metadata: FhrMetadata) -> Result<()>
Set FHR metadata for a collection.
pub fn get_fhr_metadata(&self, collection_digest: &str) -> Option<&FhrMetadata>
Get FHR metadata for a collection. Returns None if missing.
pub fn remove_fhr_metadata(&mut self, collection_digest: &str) -> bool
Remove FHR metadata for a collection.
pub fn list_fhr_metadata(&self) -> Vec<String>
List all collection digests that have FHR metadata.
impl ReadonlyRefgetStore
pub fn add_sequence_collection_from_fasta<P: AsRef<Path>>(&mut self, file_path: P, opts: FastaImportOptions<'_>) -> Result<(SequenceCollectionMetadata, bool)>
Import a FASTA file into the store using a multithreaded pipeline.
After the pipeline finishes, computes collection metadata, registers the collection, and inserts all sequences.
impl ReadonlyRefgetStore
pub fn substrings_from_regions<'a, K: AsRef<[u8]>>(&'a self, collection_digest: K, bed_file_path: &str) -> Result<SubstringsFromRegions<'a, K>>
Get an iterator over substrings defined by BED file regions.
pub fn export_fasta_from_regions<K: AsRef<[u8]>>(&self, collection_digest: K, bed_file_path: &str, output_file_path: &str) -> Result<()>
Export sequences from BED file regions to a FASTA file.
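As a rough picture of what region-based extraction involves, the sketch below reads BED-style lines (tab-separated name, start, end, with 0-based half-open coordinates as in the BED format) and slices in-memory sequences. It is a simplified stand-in: `substrings_from_bed` is a hypothetical helper working on plain strings, it silently skips lines it cannot use, and how the real methods report such lines is not documented on this page.

```rust
use std::collections::HashMap;

/// For each usable BED line, return a "name:start-end" label and the slice
/// of the named sequence covering [start, end).
fn substrings_from_bed<'a>(
    seqs: &'a HashMap<String, String>,
    bed: &str,
) -> Vec<(String, &'a str)> {
    let mut out = Vec::new();
    for line in bed.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue; // blank line or comment
        }
        let mut cols = line.split('\t');
        let (Some(chrom), Some(start), Some(end)) = (cols.next(), cols.next(), cols.next())
        else {
            continue; // fewer than three columns
        };
        let (Ok(start), Ok(end)) = (start.parse::<usize>(), end.parse::<usize>()) else {
            continue; // non-numeric coordinates
        };
        let Some(seq) = seqs.get(chrom) else {
            continue; // sequence name not present
        };
        // `str::get` returns None for out-of-range or inverted ranges.
        if let Some(sub) = seq.get(start..end) {
            out.push((format!("{chrom}:{start}-{end}"), sub));
        }
    }
    out
}
```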
impl ReadonlyRefgetStore
pub fn enable_ancillary_digests(&mut self)
Enable computation and storage of ancillary digests (nlp, snlp, sorted_sequences).
pub fn disable_ancillary_digests(&mut self)
Disable computation and storage of ancillary digests.
pub fn has_ancillary_digests(&self) -> bool
Returns whether ancillary digests are enabled.
pub fn has_attribute_index(&self) -> bool
Returns whether the on-disk attribute index is enabled.
pub fn get_collection_level1(&self, digest: &str) -> Result<CollectionLevel1>
Get collection at level 1 representation (attribute digests with spec field names). This is a lightweight operation that only reads metadata, no loading needed.
pub fn get_collection_level2(&self, digest: &str) -> Result<CollectionLevel2>
Get collection at level 2 representation (full arrays, spec format). May need to load the collection from disk/remote.
pub fn compare(&self, digest_a: &str, digest_b: &str) -> Result<SeqColComparison>
Compare two collections by digest. Both must be preloaded.
pub fn compare_with_level2(&self, digest_a: &str, external: &CollectionLevel2) -> Result<SeqColComparison>
Compare a stored collection (by digest) against an externally-provided level-2 body.
Used for the seqcol spec POST /comparison/:digest1 endpoint where the client
submits a local collection as JSON rather than referencing a stored digest.
The returned SeqColComparison has digests.a set to the stored collection’s
digest and digests.b set to None because the external collection has no
server-side digest.
pub fn find_collections_by_attribute(&self, attr_name: &str, attr_digest: &str) -> Result<Vec<String>>
Find all collections with a specific attribute digest.
Dispatches to indexed lookup (if attribute_index enabled) or brute-force metadata scan (default).
Supported attr_name values: “names”, “lengths”, “sequences”, “name_length_pairs”, “sorted_name_length_pairs”, “sorted_sequences”
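The brute-force path can be thought of as a linear scan over per-collection metadata, comparing the requested attribute's digest against each entry. The sketch below assumes a hypothetical `ToyCollectionMeta` with one digest field per attribute and covers only three of the attribute names; the real metadata layout is not shown on this page.

```rust
/// Hypothetical per-collection metadata: one digest per attribute.
struct ToyCollectionMeta {
    digest: String,
    names_digest: String,
    lengths_digest: String,
    sequences_digest: String,
}

/// Map an attribute name to the matching digest field, if supported.
fn attr_value<'a>(m: &'a ToyCollectionMeta, attr_name: &str) -> Option<&'a str> {
    match attr_name {
        "names" => Some(m.names_digest.as_str()),
        "lengths" => Some(m.lengths_digest.as_str()),
        "sequences" => Some(m.sequences_digest.as_str()),
        _ => None,
    }
}

/// Scan every collection and keep those whose attribute digest matches.
/// Unsupported attribute names simply match nothing in this sketch.
fn find_by_attribute<'a>(
    metas: &'a [ToyCollectionMeta],
    attr_name: &str,
    attr_digest: &str,
) -> Vec<&'a str> {
    metas
        .iter()
        .filter(|m| attr_value(m, attr_name) == Some(attr_digest))
        .map(|m| m.digest.as_str())
        .collect()
}
```

A scan like this costs time proportional to the number of collections per query, which is the trade-off an attribute index would be meant to remove.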
pub fn get_attribute(&self, attr_name: &str, attr_digest: &str) -> Result<Option<Value>>
Get the raw attribute array for a given attribute digest. Finds a collection with this attribute (via search), loads it, and extracts the array.
Supported attr_name values: “names”, “lengths”, “sequences”, “name_length_pairs”, “sorted_name_length_pairs”, “sorted_sequences”
Returns the array as a serde_json::Value (array of strings or numbers). Returns Ok(None) if no collection has this attribute digest.
pub fn enable_attribute_index(&mut self)
Enable indexed attribute lookup (not yet implemented).
Note: The indexed lookup feature is planned for a future release.
Enabling this will cause find_collections_by_attribute() to return
a “not implemented” error until the feature is complete.
pub fn disable_attribute_index(&mut self)
Disable indexed attribute lookup, using brute-force scan instead.
pub fn collection_count(&self) -> usize
Total number of collections in the store.