§chaintools
A high-performance library for parsing chain files, which describe pairwise alignments between sequences commonly used in genomics. The library provides zero-copy parsing to minimize memory allocations and maximize performance when working with large alignment datasets.
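To illustrate what zero-copy parsing means for this format, here is a minimal standard-library sketch that parses a single chain header line into a struct whose string fields borrow directly from the input. It is not the crate's implementation: `HeaderView` and `parse_header` are hypothetical names introduced for this example, and the field order follows the UCSC chain header layout (`chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id`).

```rust
/// Borrowed view of one chain header line (hypothetical type, not part of chaintools).
/// Every `&str` field points into the original input, so no string is copied.
#[derive(Debug, PartialEq)]
struct HeaderView<'a> {
    score: i64,
    t_name: &'a str,
    t_size: u64,
    t_strand: &'a str,
    t_start: u64,
    t_end: u64,
    q_name: &'a str,
    q_size: u64,
    q_strand: &'a str,
    q_start: u64,
    q_end: u64,
    id: &'a str,
}

/// Parses a header line, returning `None` if a field is missing or malformed.
fn parse_header(line: &str) -> Option<HeaderView<'_>> {
    let mut f = line.split_ascii_whitespace();
    if f.next()? != "chain" {
        return None;
    }
    // Struct fields are initialized in the order written, matching the header layout.
    Some(HeaderView {
        score: f.next()?.parse().ok()?,
        t_name: f.next()?,
        t_size: f.next()?.parse().ok()?,
        t_strand: f.next()?,
        t_start: f.next()?.parse().ok()?,
        t_end: f.next()?.parse().ok()?,
        q_name: f.next()?,
        q_size: f.next()?.parse().ok()?,
        q_strand: f.next()?,
        q_start: f.next()?.parse().ok()?,
        q_end: f.next()?.parse().ok()?,
        id: f.next()?,
    })
}
```

Because the returned view carries the lifetime of `line`, the compiler guarantees the input buffer outlives every parsed header, which is the property that lets a zero-copy parser avoid per-field allocations.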
§Features
- Zero-copy parsing: All string data is referenced without allocation for maximum performance
- Memory mapping: Optional mmap support for efficient handling of large files
- Parallel processing: Multi-threaded parsing via rayon with the parallel feature
- Streaming: Low-memory streaming parser suitable for stdin and pipes
- Indexing: Random access to individual chains with the index feature
- Compression: Built-in gzip support with the gzip feature
- Writing: Chain and metadata writers available without sequence support
- Feature-gated dependencies: Minimal footprint by enabling only needed features
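The streaming idea in the list above can be sketched with the standard library alone. The following is an illustrative sketch, not the crate's `StreamingReader`: `ChainSummary` and `stream_summaries` are hypothetical names, and the code assumes the usual chain layout of a `chain ...` header followed by block lines (`size dt dq`, with a final `size` line). Only one record's summary is held in memory at a time, so it works on stdin or a pipe.

```rust
use std::io::{self, BufRead};

/// Summary of one chain record (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct ChainSummary {
    score: i64,
    aligned_bases: u64,
}

fn invalid(msg: &'static str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg)
}

/// Reads chain records one at a time from any buffered reader and hands
/// each finished record to `on_chain`. Memory use is bounded by one line
/// plus one summary, regardless of file size.
fn stream_summaries<R: BufRead>(
    reader: R,
    mut on_chain: impl FnMut(ChainSummary),
) -> io::Result<()> {
    let mut current: Option<ChainSummary> = None;
    for line in reader.lines() {
        let line = line?;
        let mut fields = line.split_ascii_whitespace();
        match fields.next() {
            // A new header closes the previous record, if any.
            Some("chain") => {
                if let Some(done) = current.take() {
                    on_chain(done);
                }
                let score = fields
                    .next()
                    .and_then(|s| s.parse::<i64>().ok())
                    .ok_or_else(|| invalid("bad chain header"))?;
                current = Some(ChainSummary { score, aligned_bases: 0 });
            }
            // Comment or metadata lines are skipped.
            Some(tok) if tok.starts_with('#') => {}
            // Block line: the first field is the ungapped block size.
            Some(size) => {
                if let Some(c) = current.as_mut() {
                    let n: u64 = size.parse().map_err(|_| invalid("bad block size"))?;
                    c.aligned_bases += n;
                }
            }
            // Blank line between records.
            None => {}
        }
    }
    if let Some(done) = current.take() {
        on_chain(done);
    }
    Ok(())
}
```

Passing `std::io::stdin().lock()` as `reader` would process piped input. Unlike a zero-copy parser, this sketch allocates one `String` per line; it trades that cost for constant memory use.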
§Quick Start
use chaintools::Reader;

// Load a chain file (automatically uses mmap when available)
let reader = Reader::<chaintools::Chain>::from_path("example.chain")?;

// Iterate over all chains
for chain in reader.chains() {
    println!("Chain {}: score={}", chain.id, chain.score);
}
§Examples
§Streaming large files
use chaintools::io::stream::StreamingReader;

// Stream from a file (low memory usage)
let mut reader = StreamingReader::from_path("large.chain")?;
while let Some(chain) = reader.next_chain()? {
    println!("Processing chain with score: {}", chain.score);
    // Process chain without loading entire file into memory
}
§Parallel processing (parallel feature)
use chaintools::Reader;

// Parse large files faster using multiple threads
let reader = Reader::<chaintools::Chain>::from_path_parallel("huge.chain")?;
println!("Parsed {} chains in parallel", reader.len());
§Random access with indexing (index feature)
use chaintools::ChainIndex;

// Build an index for fast random access
let index = ChainIndex::from_path("example.chain")?;

// Access specific chains without parsing the entire file
if let Some(chain_bytes) = index.chain_bytes(0) {
    println!("First chain is {} bytes", chain_bytes.len());
}
println!("Index contains {} chains", index.len());
§Feature flags
- mmap: Memory mapping support for efficient handling of large files
- gzip: Built-in gzip compression support
- index: Random access indexing for chains
- parallel: Multi-threaded parsing with rayon
- write: Marker feature for writer-only dependents; writers are exported unconditionally
- sequence: Sequence loading and scoring support
- default: Enables mmap
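As background for the index flag, the general idea behind random access to chains can be sketched as a table of byte offsets, one per record. This is a standard-library illustration under the assumption that every record starts with a line beginning `chain `; `build_offsets` and `record_bytes` are hypothetical helpers, not the crate's `ChainIndex` API.

```rust
/// Scans `data` once and records the byte offset of every line that
/// starts a chain record (hypothetical helper for this sketch).
fn build_offsets(data: &[u8]) -> Vec<usize> {
    let mut offsets = Vec::new();
    let mut pos = 0;
    // `split_inclusive` keeps the trailing newline, so line lengths sum to `data.len()`.
    for line in data.split_inclusive(|&b| b == b'\n') {
        if line.starts_with(b"chain ") {
            offsets.push(pos);
        }
        pos += line.len();
    }
    offsets
}

/// Returns the raw bytes of the `i`-th record without parsing any other record.
/// A record runs from its own offset up to the next offset, or to the end of input.
fn record_bytes<'a>(data: &'a [u8], offsets: &[usize], i: usize) -> Option<&'a [u8]> {
    let start = *offsets.get(i)?;
    let end = offsets.get(i + 1).copied().unwrap_or(data.len());
    Some(&data[start..end])
}
```

Building the table costs one linear scan; after that, fetching any record is a slice operation, which pairs naturally with a memory-mapped file because only the touched pages need to be read.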
§Installation
Add this to your Cargo.toml:
[dependencies]
chaintools = { version = "0.0.2", features = ["mmap", "gzip"] }
Re-exports§
pub use model::block::AbsoluteBlock;
pub use model::block::Block;
pub use model::block::BlockSlice;
pub use model::block::absolute_to_dense_blocks;
pub use model::chain::Chain;
pub use model::chain::Strand;
pub use model::error::ChainError;
pub use io::reader::Reader;
pub use io::storage::ByteSlice;
pub use io::stream::OwnedChain;
pub use io::stream::OwnedChainHeader;
pub use io::stream::OwnedChainParts;
pub use io::stream::StreamItem;
pub use io::stream::StreamingReader;
pub use io::writer::write_chain_dense;
pub use io::writer::write_chain_dense_with_id;
pub use io::writer::write_chain_header;
pub use io::writer::write_chain_header_with_id;
pub use io::writer::write_dense_blocks;
pub use io::writer::write_metadata_line;
pub use io::writer::write_metadata_lines;