//!
//! # BINSEQ
//!
//! The `binseq` library provides efficient APIs for working with the [BINSEQ](https://www.biorxiv.org/content/10.1101/2025.04.08.647863v2) file format family.
//!
//! It offers methods to read and write BINSEQ files, providing:
//!
//! - Compact multi-bit encoding and decoding of nucleotide sequences through [`bitnuc`](https://docs.rs/bitnuc/latest/bitnuc/)
//! - Support for both single and paired-end sequences
//! - Abstract [`BinseqRecord`] trait for representing records from all variants
//! - Abstract [`BinseqReader`] enum for processing records from all variants
//! - Abstract [`BinseqWriter`] enum for writing records to all variants
//! - Parallel processing capabilities for arbitrary tasks through the [`ParallelProcessor`] trait
//! - Configurable [`Policy`] for handling invalid nucleotides in BQ and VBQ (CBQ natively supports `N` nucleotides)
//!
//! ## Recent additions (v0.9.0)
//!
//! ### New variant: CBQ
//! **[`cbq`]** is a new variant of BINSEQ that solves many of the pain points around VBQ.
//! The CBQ format is a columnar-block-based format that offers improved compression and faster processing speeds compared to VBQ.
//! It natively supports `N` nucleotides and avoids the need for additional 4-bit encoding.
//!
//! ### Improved interface for writing records
//! **[`BinseqWriter`]** provides a unified interface for writing records generically to BINSEQ files.
//! It builds on the new [`SequencingRecord`], which provides a cleaner builder API for writing records to BINSEQ files.
//!
//! ## Recent VBQ Format Changes (v0.7.0+)
//!
//! The VBQ format has undergone significant improvements:
//!
//! - **Embedded Index**: VBQ files now contain their index data embedded at the end of the file,
//!   improving portability.
//! - **Headers Support**: Optional sequence identifiers/headers can be stored with each record.
//! - **Extended Capacity**: u64 indexing supports files with more than 4 billion records.
//! - **Multi-bit Encoding**: Support for both 2-bit and 4-bit nucleotide encodings.
//!
//! Legacy VBQ files are automatically migrated to the new format when accessed.
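//!
//! To illustrate the 2-bit encoding mentioned above, the sketch below packs a short nucleotide
//! sequence into a single `u64` and unpacks it again. It is a standalone illustration, not the
//! `binseq` or `bitnuc` API: the `pack_2bit` and `unpack_2bit` helpers are hypothetical, and the
//! `A=0, C=1, G=2, T=3` mapping with the first base in the lowest bits is an assumption made for
//! this example. It also shows why an `N` cannot be stored in two bits, which is the case the
//! invalid-nucleotide [`Policy`] and the 4-bit encoding exist to handle.
//!
//! ```rust
//! /// Hypothetical helper (not a `binseq` or `bitnuc` function): packs up to 32 bases
//! /// into a `u64`, two bits per base, first base in the lowest bits.
//! /// Returns `None` if the sequence is longer than 32 bases or contains
//! /// anything other than `A`, `C`, `G`, or `T`.
//! fn pack_2bit(seq: &[u8]) -> Option<u64> {
//!     if seq.len() > 32 {
//!         return None;
//!     }
//!     let mut packed = 0u64;
//!     for (i, &base) in seq.iter().enumerate() {
//!         let bits: u64 = match base {
//!             b'A' => 0b00,
//!             b'C' => 0b01,
//!             b'G' => 0b10,
//!             b'T' => 0b11,
//!             _ => return None, // e.g. `N` has no 2-bit representation
//!         };
//!         packed |= bits << (2 * i);
//!     }
//!     Some(packed)
//! }
//!
//! /// Hypothetical inverse of `pack_2bit`: recovers `len` bases (at most 32).
//! fn unpack_2bit(packed: u64, len: usize) -> Vec<u8> {
//!     assert!(len <= 32, "a u64 holds at most 32 two-bit bases");
//!     (0..len)
//!         .map(|i| match (packed >> (2 * i)) & 0b11 {
//!             0b00 => b'A',
//!             0b01 => b'C',
//!             0b10 => b'G',
//!             _ => b'T',
//!         })
//!         .collect()
//! }
//!
//! fn main() {
//!     let seq = b"ACGT";
//!     let packed = pack_2bit(seq).expect("sequence contains only A, C, G, T");
//!     assert_eq!(packed, 0b11_10_01_00);
//!     assert_eq!(unpack_2bit(packed, seq.len()), seq.to_vec());
//!     // A sequence with `N` cannot be packed with two bits per base.
//!     assert_eq!(pack_2bit(b"ACGN"), None);
//! }
//! ```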
//!
//! # Example: Memory-mapped Access
//!
//! ```
//! use binseq::Result;
//! use binseq::prelude::*;
//!
//! #[derive(Clone, Default)]
//! pub struct Processor {
//!     // Define fields here
//! }
//!
//! impl ParallelProcessor for Processor {
//!     fn process_record<B: BinseqRecord>(&mut self, record: B) -> Result<()> {
//!         // Implement per-record logic here
//!         Ok(())
//!     }
//!
//!     fn on_batch_complete(&mut self) -> Result<()> {
//!         // Implement per-batch logic here
//!         Ok(())
//!     }
//! }
//!
//! fn main() -> Result<()> {
//!     // provide an input path (*.bq or *.vbq)
//!     let path = "./data/subset.bq";
//!
//!     // open a reader
//!     let reader = BinseqReader::new(path)?;
//!
//!     // initialize a processor
//!     let processor = Processor::default();
//!
//!     // process the records in parallel with 8 threads
//!     reader.process_parallel(processor, 8)?;
//!     Ok(())
//! }
//! ```
/// BQ - fixed length records, no quality scores
/// Error definitions
/// Parallel processing
/// Invalid nucleotide policy
/// Record types and traits shared between BINSEQ variants
/// VBQ - Variable length records, optional quality scores, compressed blocks
/// CBQ - Columnar variable length records, optional quality scores and headers
/// Prelude - Commonly used types and traits
/// Write operations generic over the BINSEQ variant
/// Utilities for working with BINSEQ files
pub use ;
pub use ;
pub use ;
pub use ;
pub use ;
/// Re-export `bitnuc::BitSize`
pub use BitSize;
/// Default quality score byte for BINSEQ readers without quality scores
/// (`b'?'` is ASCII 63, i.e. Q30 assuming Phred+33 encoding)
pub const DEFAULT_QUALITY_SCORE: u8 = b'?';