Crate pod5lib

Pure Rust library for reading and writing POD5 nanopore sequencing files — Oxford Nanopore Technologies’ successor to FAST5.

POD5 files embed three Apache Arrow IPC tables (Reads, Signal, RunInfo) in a binary container with a FlatBuffers footer. Signal data is compressed with VBZ, a four-stage pipeline: delta encoding, zigzag encoding, StreamVByte packing of 16-bit values (SVB16), then zstd.
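
To make the first two VBZ stages concrete, here is a minimal sketch of delta and zigzag encoding over `i16` samples. It is an illustration only, not this crate's implementation: the function names are hypothetical, and the SVB16 and zstd stages are omitted.

```rust
/// Delta-encode: each output is the difference from the previous sample
/// (the first sample is taken relative to 0). Wrapping arithmetic keeps
/// the transform reversible even when a difference overflows i16.
fn delta_encode(samples: &[i16]) -> Vec<i16> {
    let mut prev = 0i16;
    samples
        .iter()
        .map(|&s| {
            let d = s.wrapping_sub(prev);
            prev = s;
            d
        })
        .collect()
}

/// Inverse of `delta_encode`: a running (wrapping) sum of the deltas.
fn delta_decode(deltas: &[i16]) -> Vec<i16> {
    let mut prev = 0i16;
    deltas
        .iter()
        .map(|&d| {
            prev = prev.wrapping_add(d);
            prev
        })
        .collect()
}

/// Zigzag-encode: map signed values to unsigned so that small magnitudes
/// of either sign become small numbers (0, -1, 1, -2 -> 0, 1, 2, 3).
fn zigzag_encode(v: i16) -> u16 {
    ((v << 1) ^ (v >> 15)) as u16
}

/// Inverse of `zigzag_encode`.
fn zigzag_decode(u: u16) -> i16 {
    ((u >> 1) as i16) ^ -((u & 1) as i16)
}

/// Apply both stages; the result is what a byte-packing stage would consume.
fn delta_zigzag(samples: &[i16]) -> Vec<u16> {
    delta_encode(samples).into_iter().map(zigzag_encode).collect()
}
```

Nanopore signal tends to change slowly between adjacent samples, so the deltas are small, and zigzag keeps small negative deltas small as unsigned values. That is what lets a variable-byte packer and zstd shrink the stream further.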

§Reading

use pod5lib::Reader;

let reader = Reader::open("sample.pod5").unwrap();
println!("{} reads", reader.len());

for read in reader.reads_iter() {
    let adc = read.signal().unwrap();      // raw i16 ADC samples
    let pa  = read.signal_pa().unwrap();   // calibrated picoamps (f32)
    println!("{} — {} samples", read.read_id_str(), adc.len());
}

// Random access by UUID string — O(1)
if let Some(read) = reader.get_read_by_id("a1b2c3d4-0506-0708-090a-0b0c0d0e0f10") {
    let _ = read.signal_pa().unwrap();
}
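
The difference between `signal()` and `signal_pa()` above is the ADC to picoamp calibration. The sketch below shows the conversion as it is usually defined for POD5 data, `pA = (adc + offset) * scale`. The `DemoCalibration` type and its field names are hypothetical stand-ins; the crate's own `Calibration` struct may be laid out differently.

```rust
/// Hypothetical stand-in for the crate's `Calibration`; the field names
/// are assumptions made for this example.
#[derive(Debug, Clone, Copy)]
struct DemoCalibration {
    /// Offset added to each raw ADC value before scaling.
    offset: f32,
    /// Picoamps per ADC unit.
    scale: f32,
}

impl DemoCalibration {
    /// Convert raw ADC samples to picoamps: (adc + offset) * scale.
    fn to_picoamps(&self, adc: &[i16]) -> Vec<f32> {
        adc.iter()
            .map(|&s| (f32::from(s) + self.offset) * self.scale)
            .collect()
    }
}
```

For example, with `offset = 5.0` and `scale = 0.5`, the ADC values `[0, 10, -10]` map to `[2.5, 7.5, -2.5]` pA.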

§Writing

Use Writer for small files or when all reads are already in memory. Use StreamingWriter for large files: it VBZ-compresses each read’s signal as soon as the read is written and does not retain decompressed samples after each StreamingWriter::write_read call returns.

use pod5lib::{Reader, StreamingWriter};

let src = Reader::open("input.pod5").unwrap();
let mut writer = StreamingWriter::create("output.pod5").unwrap();
writer.file_identifier = src.file_identifier().to_string();

for read in src.reads_iter() {
    let signal = read.signal().unwrap();
    writer.write_read(read, &signal).unwrap();
}
writer.finish().unwrap();

Structs§

Calibration
ADC → picoamp calibration parameters.
EndReason
How and why a read ended.
ParReadsIter
Parallel reads iterator returned by Reader::par_reads_iter.
Pore
Physical pore location on the flow cell.
ReadRecord
All metadata and signal for one nanopore read.
Reader
Open and read a POD5 file.
RunInfo
Per-experiment run context (one entry per MinKNOW acquisition).
ShiftScalePair
Optional (shift, scale) pair; f32::NAN when unavailable.
StreamingWriter
Memory-efficient, threaded POD5 writer.
Writer
Accumulate reads in memory and write a POD5 file in one shot.

Enums§

EndReasonKind
Why a read was terminated.
Error
All errors that can be produced by this library.

Constants§

DEFAULT_SIGNAL_BATCH_SIZE
Default number of reads accumulated per rayon compression burst and per Arrow signal batch. It matches CHANNEL_DEPTH, so one burst fills the channel exactly and maximises pipeline overlap. Raise it for higher throughput at the cost of resident memory (RSS); below 16, rayon under-utilises the CPU pool.

Functions§

uuid_to_str
Format 16 raw bytes into the canonical xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx UUID string.
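
As an illustration of the canonical layout, the following sketch formats 16 raw bytes as lowercase hex in 8-4-4-4-12 groups. The function `format_uuid` is a hypothetical example written for this page; the actual signature and implementation of `uuid_to_str` are not shown here and may differ.

```rust
/// Hypothetical helper: render 16 bytes as a canonical UUID string,
/// with hyphens inserted before bytes 4, 6, 8 and 10.
fn format_uuid(bytes: &[u8; 16]) -> String {
    let mut out = String::with_capacity(36);
    for (i, b) in bytes.iter().enumerate() {
        if matches!(i, 4 | 6 | 8 | 10) {
            out.push('-');
        }
        // Two lowercase hex digits per byte, zero-padded.
        out.push_str(&format!("{:02x}", b));
    }
    out
}
```

Given the bytes `a1 b2 c3 d4 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10`, this produces `a1b2c3d4-0506-0708-090a-0b0c0d0e0f10`, the same form used in the random-access example under Reading.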

Type Aliases§

Result
Crate-wide Result type — equivalent to std::result::Result<T, Error>.