pub struct SegmentedText { /* private fields */ }
Provider for texts partitioned into segments at known cumulative
end positions. lim_at(p) binary-searches the sorted ends list
and returns the distance from p to the next boundary.
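The lookup described above can be sketched as a free function. This is a minimal illustration, not the crate's actual implementation: `lim_at_sketch` is a hypothetical name, and the sketch assumes each entry of `ends` is an exclusive end position, so a position equal to an end belongs to the next segment.

```rust
/// Hypothetical stand-in for `SegmentedText::lim_at`.
/// `ends` holds the sorted, strictly increasing cumulative end positions
/// (assumed exclusive: the segment covers positions up to `end - 1`).
fn lim_at_sketch(ends: &[u64], p: usize) -> usize {
    // Binary search for the first end strictly greater than p.
    let i = ends.partition_point(|&end| end <= p as u64);
    match ends.get(i) {
        // Distance from p to that boundary.
        Some(&end) => (end - p as u64) as usize,
        // p is at or past end-of-text: nothing left to compare.
        None => 0,
    }
}
```

With `ends = [4, 9, 12]`, position 3 is the last symbol of the first segment (limit 1) and position 4 is the first symbol of the second segment (limit 5).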
Storage cost is 8 × n_segments bytes (the cumulative-ends
Vec<u64>). For a 50 K-junction SA index on a 6 GB genome that
is 400 KB total — vs the 750 MB a packed bitmap would need,
and the 6 GB an extra-byte-per-symbol u16 text would need.
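The figures above follow from simple arithmetic, reproduced here as a check. The function names are illustrative only, and decimal units are assumed (1 KB = 1000 bytes, 6 GB = 6 × 10^9 symbols).

```rust
/// Bytes used by the cumulative-ends list: one u64 per segment.
fn ends_bytes(n_segments: u64) -> u64 {
    n_segments * std::mem::size_of::<u64>() as u64
}

/// Bytes used by a packed bitmap with one bit per text position.
fn bitmap_bytes(text_len: u64) -> u64 {
    (text_len + 7) / 8
}

/// Extra bytes if every symbol is widened from u8 to u16.
fn widened_text_extra_bytes(text_len: u64) -> u64 {
    text_len
}
```

For 50,000 segments and a 6,000,000,000-symbol text these give 400,000 bytes, 750,000,000 bytes, and 6,000,000,000 bytes respectively.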
Lookup is O(log n_segments): about 16 comparisons even at
50 K segments. The merge can cache lim_p/lim_q across LCP
calls so the cost amortises to roughly one binary search per
output record.
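The merge itself is not shown on this page, so the following is only one possible reading of the caching idea. It assumes a two-way merge of sorted suffix positions in which a limit is looked up once when a cursor first reaches a position and reused until that cursor advances. All names here (`lim_at`, `cmp_bounded`, `merge_with_cached_limits`) are hypothetical.

```rust
use std::cmp::Ordering;

/// Hypothetical limit lookup: distance from `p` to the next segment end.
fn lim_at(ends: &[u64], p: usize) -> usize {
    let i = ends.partition_point(|&e| e <= p as u64);
    ends.get(i).map_or(0, |&e| e as usize - p)
}

/// Compare two suffixes over at most min(lim_p, lim_q) symbols, then
/// break ties with "shorter suffix is smaller".
fn cmp_bounded(text: &[u8], p: usize, lim_p: usize, q: usize, lim_q: usize) -> Ordering {
    let n = lim_p.min(lim_q);
    text[p..p + n].cmp(&text[q..q + n]).then(lim_p.cmp(&lim_q))
}

/// Merge two sorted position lists. Returns the merged list and the
/// number of limit lookups performed.
fn merge_with_cached_limits(
    text: &[u8],
    ends: &[u64],
    a: &[usize],
    b: &[usize],
) -> (Vec<usize>, usize) {
    let mut out = Vec::with_capacity(a.len() + b.len());
    let mut lookups = 0usize;
    let (mut i, mut j) = (0usize, 0usize);
    // Cached limits for the current head of each input; None means stale.
    let mut lim_p: Option<usize> = None;
    let mut lim_q: Option<usize> = None;
    while i < a.len() && j < b.len() {
        let lp = *lim_p.get_or_insert_with(|| {
            lookups += 1;
            lim_at(ends, a[i])
        });
        let lq = *lim_q.get_or_insert_with(|| {
            lookups += 1;
            lim_at(ends, b[j])
        });
        if cmp_bounded(text, a[i], lp, b[j], lq) != Ordering::Greater {
            out.push(a[i]);
            i += 1;
            lim_p = None; // only the side that advanced needs a new lookup
        } else {
            out.push(b[j]);
            j += 1;
            lim_q = None;
        }
    }
    out.extend_from_slice(&a[i..]);
    out.extend_from_slice(&b[j..]);
    (out, lookups)
}
```

Because only the side that advanced is looked up again, the number of lookups in this sketch never exceeds the number of output records.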
Two constructors:
from_lengths takes per-segment lengths and builds the cumulative-ends list internally. Most ergonomic when the caller already has [chr_len_0, chr_len_1, …].
from_ends takes the sorted cumulative ends directly. Useful when the caller already has them, e.g. STAR’s chr_start[] table.
Both constructors require the segments to cover the whole text
(sum(lengths) == text_len, or ends.last() == Some(text_len)).
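The relationship between the two inputs is a running sum. The sketch below shows how a list of per-segment lengths could be turned into cumulative ends. `ends_from_lengths` is a hypothetical helper, and panicking on a mismatched total is an assumption: this page does not say how the real constructors report a violated precondition.

```rust
/// Hypothetical helper: turn per-segment lengths into cumulative ends.
/// Panics if the lengths do not sum to `text_len` (assumed behaviour).
fn ends_from_lengths(text_len: usize, lengths: &[usize]) -> Vec<u64> {
    let mut ends = Vec::with_capacity(lengths.len());
    let mut acc: u64 = 0;
    for &len in lengths {
        acc += len as u64;
        ends.push(acc); // end position of this segment, exclusive
    }
    assert_eq!(acc, text_len as u64, "segment lengths must sum to text_len");
    ends
}
```

For example, lengths `[4, 5, 3]` over a 12-symbol text give ends `[4, 9, 12]`, which is the form `from_ends` accepts directly.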
Implementations
impl SegmentedText
pub fn from_lengths(text_len: usize, lengths: &[usize]) -> Self
Build from per-segment lengths. The sum must equal text_len.
pub fn from_ends(text_len: usize, ends: Vec<u64>) -> Self
Build from sorted, strictly-increasing cumulative end positions.
ends.last() must equal text_len.
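A caller can check these preconditions before constructing. The following is a minimal sketch with a hypothetical `check_ends` helper. It is not part of the documented API, and the error type is chosen only for illustration.

```rust
/// Hypothetical pre-check for the `from_ends` preconditions.
fn check_ends(text_len: usize, ends: &[u64]) -> Result<(), String> {
    // Strictly increasing: every adjacent pair must satisfy a < b.
    if !ends.windows(2).all(|w| w[0] < w[1]) {
        return Err("ends must be strictly increasing".to_string());
    }
    // The last end must coincide with the end of the text.
    // An empty list fails here because last() is None.
    if ends.last().copied() != Some(text_len as u64) {
        return Err("ends.last() must equal text_len".to_string());
    }
    Ok(())
}
```

Note that an empty `ends` list is rejected by this sketch even for an empty text, since the stated condition is `ends.last() == Some(text_len)`.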
pub fn n_segments(&self) -> usize
Number of segments.
Trait Implementations
impl Clone for SegmentedText
fn clone(&self) -> SegmentedText
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for SegmentedText
impl LimitProvider for SegmentedText
fn lim_at(&self, p: usize) -> usize
Limit at position p, in symbols, i.e. the number of comparable
symbols before the next segment boundary or end-of-text. Must
be at most text.len() - p.
fn boundary_order(&self, p_a: usize, lim_a: usize, p_b: usize, lim_b: usize) -> Ordering
lim_a.cmp(&lim_b): “shorter-suffix-is-smaller”, the standard
generalised-SA / multi-string-SA convention, and what a
Vec<&str> sort with &str ordering produces.
Auto Trait Implementations
impl Freeze for SegmentedText
impl RefUnwindSafe for SegmentedText
impl Send for SegmentedText
impl Sync for SegmentedText
impl Unpin for SegmentedText
impl UnsafeUnpin for SegmentedText
impl UnwindSafe for SegmentedText
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.