pub struct ExternalSorter { /* private fields */ }
External sorter for minimizer tuples
Manages RAM-bounded sorting with temp file spillover and k-way merge.
Implementations§
impl ExternalSorter
pub fn new(
    tmp_dir: impl AsRef<Path>,
    ram_limit_gib: usize,
    num_threads: usize,
    verbose: bool,
) -> Result<Self>
Create a new external sorter
pub fn buffer_size_per_thread(&self) -> usize
Calculate buffer size per thread in number of tuples
Formula: (ram_limit * GiB) / (2 * sizeof(tuple) * num_threads)
The factor of 2 accounts for temporary memory during parallel sort.
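As a worked instance of the formula above, the following sketch evaluates it for explicit inputs. The free function and its `tuple_size` parameter are hypothetical: the real method takes only `&self`, and the size of MinimizerTupleExternal is not stated on this page, so the 16-byte figure in the usage note is an assumption chosen only to make the arithmetic easy to follow.

```rust
/// Bytes in one GiB.
const GIB: usize = 1 << 30;

/// Tuples per thread, following the documented formula:
/// (ram_limit * GiB) / (2 * sizeof(tuple) * num_threads).
///
/// `tuple_size` is passed in explicitly because the real size of
/// `MinimizerTupleExternal` is not given here (assumption).
/// The arithmetic assumes a 64-bit `usize`.
fn buffer_size_per_thread(ram_limit_gib: usize, tuple_size: usize, num_threads: usize) -> usize {
    (ram_limit_gib * GIB) / (2 * tuple_size * num_threads)
}
```

With an 8 GiB limit, a hypothetical 16-byte tuple, and 4 threads, `buffer_size_per_thread(8, 16, 4)` evaluates to 2^33 / 2^7 = 2^26 = 67,108,864 tuples per thread.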
pub fn sort_and_flush(
    &self,
    buffer: &mut Vec<MinimizerTupleExternal>,
) -> Result<u64>
Sort a buffer and flush to a temp file
Returns the file ID. Thread-safe via atomic counter.
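The note about thread safety suggests that file IDs are handed out from a shared atomic counter. A minimal sketch of that pattern follows; `FileIdAllocator` and `allocate_concurrently` are hypothetical names for illustration, not part of this crate, and the actual sorter may differ (for example in the memory ordering it uses).

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the sorter's file ID allocator.
struct FileIdAllocator {
    next_id: AtomicU64,
}

impl FileIdAllocator {
    fn new() -> Self {
        Self { next_id: AtomicU64::new(0) }
    }

    /// Reserve a unique ID. `fetch_add` atomically increments the counter
    /// and returns the previous value, so no two callers see the same ID.
    fn next(&self) -> u64 {
        self.next_id.fetch_add(1, Ordering::Relaxed)
    }
}

/// Spawn `num_threads` workers that each reserve `per_thread` IDs,
/// then return every reserved ID in ascending order.
fn allocate_concurrently(num_threads: usize, per_thread: usize) -> Vec<u64> {
    let alloc = Arc::new(FileIdAllocator::new());
    let handles: Vec<_> = (0..num_threads)
        .map(|_| {
            let alloc = Arc::clone(&alloc);
            thread::spawn(move || (0..per_thread).map(|_| alloc.next()).collect::<Vec<u64>>())
        })
        .collect();
    let mut ids: Vec<u64> = handles
        .into_iter()
        .flat_map(|h| h.join().expect("worker panicked"))
        .collect();
    ids.sort_unstable();
    ids
}
```

Because every ID is unique, each worker can write its sorted buffer to a distinctly named temp file without taking a lock.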
pub fn merge(&self) -> Result<MergeResult>
Merge all temp files into final sorted output
Returns statistics: (num_minimizers, num_positions, num_super_kmers)
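The k-way merge idea can be sketched with a min-heap that holds one candidate per sorted run. This is an illustration under simplifying assumptions, not the crate's implementation: the runs are in-memory vectors of `u64` rather than temp files of tuples, no statistics are gathered, and `k_way_merge` is a hypothetical helper.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merge already-sorted runs into one sorted vector.
/// Each heap entry is (value, run index, position within that run);
/// `Reverse` turns the standard max-heap into a min-heap.
fn k_way_merge(runs: &[Vec<u64>]) -> Vec<u64> {
    let mut heap = BinaryHeap::new();

    // Seed the heap with the first element of every non-empty run.
    for (run_idx, run) in runs.iter().enumerate() {
        if let Some(&first) = run.first() {
            heap.push(Reverse((first, run_idx, 0usize)));
        }
    }

    let total: usize = runs.iter().map(Vec::len).sum();
    let mut merged = Vec::with_capacity(total);

    // Repeatedly take the smallest candidate and refill from the same run.
    while let Some(Reverse((value, run_idx, pos))) = heap.pop() {
        merged.push(value);
        if let Some(&next) = runs[run_idx].get(pos + 1) {
            heap.push(Reverse((next, run_idx, pos + 1)));
        }
    }
    merged
}
```

The heap never holds more than one entry per run, so memory use for the merge itself grows with the number of runs rather than with the total number of elements.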
pub fn read_merged_tuples(&self) -> Result<Vec<MinimizerTuple>>
Read merged tuples into memory as internal MinimizerTuples
Call this after merge() to get the final sorted tuples.
For large datasets, prefer open_merged_file() to avoid full materialization.
pub fn open_merged_file(&self) -> Result<FileTuples>
Open the merged file for sequential buffered access.
Returns a FileTuples handle that provides sequential access via
buffered readers. Each scan pass opens a fresh BufReader, keeping
RSS proportional to the buffer size (~4 MB) instead of the file size.
Important: The returned FileTuples takes over responsibility for
cleaning up the merged file.
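The "fresh BufReader per scan pass" pattern described above can be sketched with the standard library alone. The record layout (little-endian `u64` values) and the helpers `write_records` and `scan_sum` are assumptions made for illustration; the actual FileTuples type and its on-disk format are not shown on this page.

```rust
use std::fs::File;
use std::io::{self, BufReader, BufWriter, Read, Write};
use std::path::Path;

/// Buffer capacity for each pass; 4 MiB mirrors the "~4 MB" figure above.
const SCAN_BUFFER_BYTES: usize = 4 * 1024 * 1024;

/// Write records as little-endian u64 values (hypothetical on-disk layout).
fn write_records(path: &Path, records: &[u64]) -> io::Result<()> {
    let mut writer = BufWriter::new(File::create(path)?);
    for r in records {
        writer.write_all(&r.to_le_bytes())?;
    }
    writer.flush()
}

/// One scan pass: open a fresh BufReader and fold over every record.
/// Only the reader's buffer is resident, never the whole file.
fn scan_sum(path: &Path) -> io::Result<u64> {
    let mut reader = BufReader::with_capacity(SCAN_BUFFER_BYTES, File::open(path)?);
    let mut bytes = [0u8; 8];
    let mut sum = 0u64;
    loop {
        match reader.read_exact(&mut bytes) {
            Ok(()) => sum = sum.wrapping_add(u64::from_le_bytes(bytes)),
            // End of file (a truncated trailing record is also treated as the end).
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
    }
    Ok(sum)
}
```

Calling `scan_sum` twice on the same path performs two independent passes, each with its own reader and buffer, which is the property that keeps resident memory tied to the buffer size.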
pub fn remove_merged_file(&self) -> Result<()>
Remove the merged file (cleanup)
Trait Implementations§
Auto Trait Implementations§
impl !Freeze for ExternalSorter
impl RefUnwindSafe for ExternalSorter
impl Send for ExternalSorter
impl Sync for ExternalSorter
impl Unpin for ExternalSorter
impl UnsafeUnpin for ExternalSorter
impl UnwindSafe for ExternalSorter
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T, U> CastableInto<U> for T
where
    U: CastableFrom<T>,
impl<T> DowncastableFrom<T> for T
fn downcast_from(value: T) -> T
impl<T, U> DowncastableInto<U> for T
where
    U: DowncastableFrom<T>,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.