//!
//! # BINSEQ
//!
//! The `binseq` library provides efficient APIs for working with the [BINSEQ](https://www.biorxiv.org/content/10.1101/2025.04.08.647863v1) file format family.
//!
//! It offers methods to read and write BINSEQ files, providing:
//!
//! - Compact multi-bit encoding and decoding of nucleotide sequences through [`bitnuc`](https://docs.rs/bitnuc/latest/bitnuc/)
//! - Memory-mapped file access for efficient reading ([`bq::MmapReader`] and [`vbq::MmapReader`])
//! - Parallel processing capabilities for arbitrary tasks through the [`ParallelProcessor`] trait
//! - Configurable [`Policy`] for handling invalid nucleotides
//! - Support for both single and paired-end sequences
//! - Optional sequence headers/identifiers (VBQ format)
//! - A common [`BinseqRecord`] trait for representing records from both `.bq` and `.vbq` files
//! - A unified [`BinseqReader`] enum for processing records from both `.bq` and `.vbq` files
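//!
//! To make the 2-bit encoding idea concrete, here is a minimal, standard-library-only
//! sketch. It is an illustration and not the `bitnuc` implementation: the function names
//! (`encode_2bit`, `decode_2bit`), the `A=0, C=1, G=2, T=3` mapping, and the choice to
//! reject invalid bases outright are all assumptions made for this example.
//!
//! ```rust
//! /// Pack up to 32 bases into a `u64`, 2 bits per base (assumed mapping: A=0, C=1, G=2, T=3).
//! /// Returns `None` for any other byte or for sequences longer than 32 bases.
//! fn encode_2bit(seq: &[u8]) -> Option<u64> {
//!     if seq.len() > 32 {
//!         return None;
//!     }
//!     let mut packed = 0u64;
//!     for (i, &base) in seq.iter().enumerate() {
//!         let bits: u64 = match base {
//!             b'A' => 0,
//!             b'C' => 1,
//!             b'G' => 2,
//!             b'T' => 3,
//!             _ => return None, // hypothetical handling: reject invalid nucleotides
//!         };
//!         // The first base occupies the lowest two bits.
//!         packed |= bits << (2 * i);
//!     }
//!     Some(packed)
//! }
//!
//! /// Unpack `len` bases from a value produced by `encode_2bit`.
//! fn decode_2bit(packed: u64, len: usize) -> Vec<u8> {
//!     (0..len)
//!         .map(|i| b"ACGT"[((packed >> (2 * i)) & 0b11) as usize])
//!         .collect()
//! }
//!
//! fn main() {
//!     let seq = b"ACGTTGCA";
//!     let packed = encode_2bit(seq).expect("only A, C, G, T are encodable");
//!     // Eight bases fit in 16 bits and round-trip unchanged.
//!     assert_eq!(decode_2bit(packed, seq.len()), seq);
//!     // An ambiguous base cannot be represented in 2 bits.
//!     assert!(encode_2bit(b"ACGN").is_none());
//! }
//! ```
//!
//! A 2-bit code has no room for ambiguous bases such as `N`, which is why a policy for
//! invalid nucleotides is needed; this sketch simply returns `None` in that case.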
//!
//! ## Recent VBQ Format Changes (v0.7.0+)
//!
//! The VBQ format has undergone significant improvements:
//!
//! - **Embedded Index**: VBQ files now contain their index data embedded at the end of the file,
//! eliminating separate `.vqi` index files and improving portability.
//! - **Headers Support**: Optional sequence identifiers/headers can be stored with each record.
//! - **Extended Capacity**: u64 indexing supports files with more than 4 billion records.
//! - **Multi-bit Encoding**: Support for both 2-bit and 4-bit nucleotide encodings.
//!
//! Legacy VBQ files are automatically migrated to the new format when accessed.
//!
//! ## Crate Organization
//!
//! This library is split into three major parts:
//!
//! The [`bq`] and [`vbq`] modules provide tools for reading and writing `BQ` and `VBQ` files respectively.
//! The traits and utilities shared across the library are available at the top level of the crate.
//!
//! ## Example: Memory-mapped Access
//!
//! ```
//! use binseq::Result;
//! use binseq::prelude::*;
//!
//! #[derive(Clone, Default)]
//! pub struct Processor {
//!     // Define fields here
//! }
//!
//! impl ParallelProcessor for Processor {
//!     fn process_record<B: BinseqRecord>(&mut self, record: B) -> Result<()> {
//!         // Implement per-record logic here
//!         Ok(())
//!     }
//!
//!     fn on_batch_complete(&mut self) -> Result<()> {
//!         // Implement per-batch logic here
//!         Ok(())
//!     }
//! }
//!
//! fn main() -> Result<()> {
//!     // provide an input path (*.bq or *.vbq)
//!     let path = "./data/subset.bq";
//!
//!     // open a reader
//!     let reader = BinseqReader::new(path)?;
//!
//!     // initialize a processor
//!     let processor = Processor::default();
//!
//!     // process the records in parallel with 8 threads
//!     reader.process_parallel(processor, 8)?;
//!     Ok(())
//! }
//! ```
/// BQ - fixed length records, no quality scores
/// Error definitions
/// Parallel processing
/// Invalid nucleotide policy
/// Record trait shared between BINSEQ variants
/// VBQ - Variable length records, optional quality scores, compressed blocks
/// Prelude - Commonly used types and traits
/// Context - Reusable state for parallel processing
pub use Context;
pub use ;
pub use ;
pub use ;
pub use BinseqRecord;
/// Re-export `bitnuc::BitSize`
pub use bitnuc::BitSize;