Crate chaintools

§chaintools

A high-performance library for parsing chain files, a format commonly used in genomics to describe pairwise alignments between sequences. Parsing is zero-copy, which minimizes memory allocations and keeps performance high on large alignment datasets.
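To make the zero-copy idea concrete, here is a minimal, std-only sketch of parsing a single chain header line so that every text field borrows from the input instead of being copied. It assumes the standard UCSC header layout (`chain score tName tSize tStrand tStart tEnd qName qSize qStrand qStart qEnd id`). `HeaderRef` and `parse_header` are hypothetical names used for illustration only; they are not part of chaintools and do not reflect how the crate is implemented.

```rust
/// Hypothetical borrowed view of one chain header line (not a chaintools type).
/// Text fields are `&str` slices into the original input, so nothing is copied.
#[derive(Debug, PartialEq)]
struct HeaderRef<'a> {
    score: u64,
    target_name: &'a str,
    target_size: u64,
    target_strand: &'a str,
    target_start: u64,
    target_end: u64,
    query_name: &'a str,
    query_size: u64,
    query_strand: &'a str,
    query_start: u64,
    query_end: u64,
    id: &'a str,
}

/// Parse a header line, returning `None` if it is not a well-formed header.
fn parse_header(line: &str) -> Option<HeaderRef<'_>> {
    let mut fields = line.split_ascii_whitespace();
    if fields.next()? != "chain" {
        return None;
    }
    // Struct fields are evaluated in the order written, which matches
    // the field order of the header line.
    Some(HeaderRef {
        score: fields.next()?.parse().ok()?,
        target_name: fields.next()?,
        target_size: fields.next()?.parse().ok()?,
        target_strand: fields.next()?,
        target_start: fields.next()?.parse().ok()?,
        target_end: fields.next()?.parse().ok()?,
        query_name: fields.next()?,
        query_size: fields.next()?.parse().ok()?,
        query_strand: fields.next()?,
        query_start: fields.next()?.parse().ok()?,
        query_end: fields.next()?.parse().ok()?,
        id: fields.next()?,
    })
}

fn main() {
    let line = "chain 4900 chrY 58368225 + 25985403 25985638 chr5 151006098 - 43257292 43257528 1";
    let header = parse_header(line).expect("valid header line");
    println!(
        "{} -> {} (score {})",
        header.target_name, header.query_name, header.score
    );
}
```

Because `HeaderRef` carries the lifetime of the input, the borrow checker ensures the parsed view cannot outlive the buffer it points into. That is the usual trade-off of zero-copy designs.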

§Features

  • Zero-copy parsing: All string data is referenced without allocation for maximum performance
  • Memory mapping: Optional mmap support for efficient handling of large files
  • Parallel processing: Multi-threaded parsing (via rayon) with the parallel feature
  • Streaming: Low-memory streaming parser suitable for stdin and pipes
  • Indexing: Random access to individual chains with the index feature
  • Compression: Built-in gzip support with the gzip feature
  • Writing: Chain and metadata writers are always exported and do not require the sequence feature
  • Feature-gated dependencies: Minimal footprint by enabling only needed features

§Quick Start

use chaintools::Reader;

// Load a chain file (automatically uses mmap when available)
let reader = Reader::<chaintools::Chain>::from_path("example.chain")?;

// Iterate over all chains
for chain in reader.chains() {
    println!("Chain {}: score={}", chain.id, chain.score);
}

§Examples

§Streaming large files

use chaintools::io::stream::StreamingReader;

// Stream from a file (low memory usage)
let mut reader = StreamingReader::from_path("large.chain")?;

while let Some(chain) = reader.next_chain()? {
    println!("Processing chain with score: {}", chain.score);
    // Process chain without loading entire file into memory
}

§Parallel processing (parallel feature)

use chaintools::Reader;

// Parse large files faster using multiple threads
let reader = Reader::<chaintools::Chain>::from_path_parallel("huge.chain")?;

println!("Parsed {} chains in parallel", reader.len());

§Random access with indexing (index feature)

use chaintools::ChainIndex;

// Build an index for fast random access
let index = ChainIndex::from_path("example.chain")?;

// Access specific chains without parsing the entire file
if let Some(chain_bytes) = index.chain_bytes(0) {
    println!("First chain is {} bytes", chain_bytes.len());
}

println!("Index contains {} chains", index.len());
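As a rough illustration of what a byte-offset index over a chain file involves, the std-only sketch below scans a buffer once and records the byte range of each record, so that any single chain can later be sliced out without re-reading the rest. It assumes each record starts on a line beginning with `chain `. `chain_ranges` is a hypothetical helper and is not how `ChainIndex` is necessarily implemented.

```rust
use std::ops::Range;

/// Hypothetical helper (not part of chaintools): byte range of every chain
/// record in `data`. A record starts at a line beginning with "chain " and
/// runs up to the next such line or the end of the buffer.
fn chain_ranges(data: &[u8]) -> Vec<Range<usize>> {
    // First pass: collect the starting offset of every header line.
    let mut starts = Vec::new();
    let mut pos = 0;
    for line in data.split_inclusive(|&b| b == b'\n') {
        if line.starts_with(b"chain ") {
            starts.push(pos);
        }
        pos += line.len();
    }

    // Second pass: each record ends where the next one begins.
    let mut ranges = Vec::with_capacity(starts.len());
    for (i, &start) in starts.iter().enumerate() {
        let end = starts.get(i + 1).copied().unwrap_or(data.len());
        ranges.push(start..end);
    }
    ranges
}

fn main() {
    let data: &[u8] = b"chain 100 chr1 1000 + 0 30 chr2 900 + 5 35 1\n10 5 5\n15\n\n\
chain 50 chr1 1000 + 100 120 chr3 800 - 0 20 2\n20\n";

    let ranges = chain_ranges(data);
    println!("Index contains {} chains", ranges.len());

    // Random access: slice out the second record without parsing the first.
    let second = &data[ranges[1].clone()];
    println!("Second chain is {} bytes", second.len());
}
```

In this sketch the blank separator line after a record is counted as part of that record's range. A real index might trim it or store additional metadata per entry.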

§Feature flags

  • mmap: Memory mapping support for efficient handling of large files
  • gzip: Built-in gzip compression support
  • index: Random access indexing for chains
  • parallel: Multi-threaded parsing with rayon
  • write: Marker feature for writer-only dependents; writers are exported unconditionally
  • sequence: Sequence loading and scoring support
  • default: Enables mmap

§Installation

Add this to your Cargo.toml:

[dependencies]
chaintools = { version = "0.0.2", features = ["mmap", "gzip"] }
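
Since the default feature set enables mmap, a dependent that wants the smallest footprint could opt out using Cargo's standard `default-features` key. The following is a sketch that assumes the feature names listed under Feature flags; the choice of `gzip` and `index` is only an example.

```toml
[dependencies]
# Opt out of the default `mmap` feature and enable only what is needed.
chaintools = { version = "0.0.2", default-features = false, features = ["gzip", "index"] }
```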

Re-exports§

pub use model::block::AbsoluteBlock;
pub use model::block::Block;
pub use model::block::BlockSlice;
pub use model::block::absolute_to_dense_blocks;
pub use model::chain::Chain;
pub use model::chain::Strand;
pub use model::error::ChainError;
pub use io::reader::Reader;
pub use io::storage::ByteSlice;
pub use io::stream::OwnedChain;
pub use io::stream::OwnedChainHeader;
pub use io::stream::OwnedChainParts;
pub use io::stream::StreamItem;
pub use io::stream::StreamingReader;
pub use io::writer::write_chain_dense;
pub use io::writer::write_chain_dense_with_id;
pub use io::writer::write_chain_header;
pub use io::writer::write_chain_header_with_id;
pub use io::writer::write_dense_blocks;
pub use io::writer::write_metadata_line;
pub use io::writer::write_metadata_lines;

Modules§

io
Chain input/output.
model
Core chain data model.
parser