Pure Rust library for reading and writing POD5 nanopore sequencing files — Oxford Nanopore Technologies’ successor to FAST5.
POD5 files embed three Apache Arrow IPC tables (Reads, Signal, RunInfo) in a binary container with a FlatBuffers footer. Signal data is compressed with VBZ (delta + zigzag + SVB16 + zstd).
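To make the first two VBZ stages concrete, here is a minimal sketch of delta plus zigzag encoding on i16 samples. It is not this crate's implementation: the function names are hypothetical, the SVB16 and zstd stages are left out, and a starting predecessor of 0 is an assumption made for illustration.

```rust
/// Map a signed delta to an unsigned value so that small magnitudes
/// (positive or negative) become small numbers: 0, -1, 1, -2 -> 0, 1, 2, 3.
fn zigzag_encode(d: i16) -> u16 {
    ((d << 1) ^ (d >> 15)) as u16
}

/// Inverse of `zigzag_encode`.
fn zigzag_decode(z: u16) -> i16 {
    ((z >> 1) as i16) ^ -((z & 1) as i16)
}

/// Delta-encode the samples (each value minus its predecessor, starting
/// from an assumed predecessor of 0), then zigzag-encode each delta.
fn delta_zigzag_encode(samples: &[i16]) -> Vec<u16> {
    let mut prev = 0i16;
    samples
        .iter()
        .map(|&s| {
            let delta = s.wrapping_sub(prev);
            prev = s;
            zigzag_encode(delta)
        })
        .collect()
}

/// Undo `delta_zigzag_encode` by decoding each delta and accumulating.
fn delta_zigzag_decode(encoded: &[u16]) -> Vec<i16> {
    let mut prev = 0i16;
    encoded
        .iter()
        .map(|&z| {
            prev = prev.wrapping_add(zigzag_decode(z));
            prev
        })
        .collect()
}
```

Because neighbouring ADC samples tend to be close in value, most encoded values are small, which is what lets a variable-byte stage followed by a general-purpose compressor shrink the data further.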
§Reading
use pod5lib::Reader;
let reader = Reader::open("sample.pod5").unwrap();
println!("{} reads", reader.len());
for read in reader.reads_iter() {
    let adc = read.signal().unwrap(); // raw i16 ADC samples
    let pa = read.signal_pa().unwrap(); // calibrated picoamps (f32)
    println!("{}: {} samples", read.read_id_str(), adc.len());
}
// Random access by UUID string, O(1)
if let Some(read) = reader.get_read_by_id("a1b2c3d4-0506-0708-090a-0b0c0d0e0f10") {
    let _ = read.signal_pa().unwrap();
}
§Writing
Use Writer for small files or when all reads are already in memory.
Use StreamingWriter for large files — it VBZ-compresses each read’s signal
immediately and does not hold decompressed samples beyond each StreamingWriter::write_read call.
use pod5lib::{Reader, StreamingWriter};
let src = Reader::open("input.pod5").unwrap();
let mut writer = StreamingWriter::create("output.pod5").unwrap();
writer.file_identifier = src.file_identifier().to_string();
for read in src.reads_iter() {
    let signal = read.signal().unwrap();
    writer.write_read(read, &signal).unwrap();
}
writer.finish().unwrap();
Structs§
- Calibration
- ADC → picoamp calibration parameters.
- EndReason
- How and why a read ended.
- ParReadsIter
- Parallel reads iterator returned by Reader::par_reads_iter.
- Pore
- Physical pore location on the flow cell.
- ReadRecord
- All metadata and signal for one nanopore read.
- Reader
- Open and read a POD5 file.
- RunInfo
- Per-experiment run context (one entry per MinKNOW acquisition).
- ShiftScalePair
- Optional (shift, scale) pair; f32::NAN when unavailable.
- StreamingWriter
- Memory-efficient, threaded POD5 writer.
- Writer
- Accumulate reads in memory and write a POD5 file in one shot.
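As an illustration of what the Calibration parameters are for, the sketch below converts raw ADC samples to picoamps using the POD5 convention pA = (adc + offset) * scale. The `CalibrationSketch` type, its field names, and the `to_picoamps` method are hypothetical stand-ins; the crate's actual Calibration fields are not shown on this page and may differ.

```rust
/// Hypothetical stand-in for the crate's calibration parameters.
#[derive(Debug, Clone, Copy)]
struct CalibrationSketch {
    /// Offset added to each raw ADC value before scaling.
    offset: f32,
    /// Picoamps per ADC unit.
    scale: f32,
}

impl CalibrationSketch {
    /// Convert raw i16 ADC samples to picoamps: (adc + offset) * scale.
    fn to_picoamps(&self, adc: &[i16]) -> Vec<f32> {
        adc.iter()
            .map(|&sample| (f32::from(sample) + self.offset) * self.scale)
            .collect()
    }
}
```

With an offset of 10.0 and a scale of 0.5, a raw sample of 10 maps to (10 + 10) * 0.5 = 10.0 pA.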
Enums§
- EndReasonKind
- Why a read was terminated.
- Error
- All errors that can be produced by this library.
Constants§
- DEFAULT_SIGNAL_BATCH_SIZE
- Default number of reads accumulated per rayon compression burst and Arrow signal batch. Matches CHANNEL_DEPTH so one burst fills the channel exactly, maximising pipeline overlap. Tune upward for throughput at the cost of RSS; below 16 rayon under-utilises the CPU pool.
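The relationship between batch size and channel depth can be sketched with a bounded channel from the standard library. This is an assumption-laden illustration, not the crate's writer: the constant values, `compress_stub`, and `run_pipeline` are hypothetical, the compression step is a byte-copy placeholder, and a sequential map stands in for the rayon burst.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

// Hypothetical values; the crate's real CHANNEL_DEPTH is not shown on this page.
const CHANNEL_DEPTH: usize = 4;
const BATCH_SIZE: usize = CHANNEL_DEPTH;

/// Placeholder for signal compression: just emits little-endian bytes.
fn compress_stub(read: &[i16]) -> Vec<u8> {
    read.iter().flat_map(|s| s.to_le_bytes()).collect()
}

/// Compress reads in bursts of BATCH_SIZE and hand them to a consumer
/// thread over a bounded channel. Returns the total bytes the consumer saw.
fn run_pipeline(reads: Vec<Vec<i16>>) -> usize {
    let (tx, rx) = sync_channel::<Vec<u8>>(CHANNEL_DEPTH);

    // Consumer: stands in for the thread that appends blocks to the file.
    let consumer = thread::spawn(move || {
        let mut total = 0usize;
        for block in rx {
            total += block.len();
        }
        total
    });

    for batch in reads.chunks(BATCH_SIZE) {
        // One burst: compress the whole batch (a parallel map in a real writer).
        let compressed: Vec<Vec<u8>> = batch.iter().map(|r| compress_stub(r)).collect();
        // With BATCH_SIZE == CHANNEL_DEPTH, one burst fits in an empty channel
        // without blocking, so the next burst can start while the consumer drains.
        for block in compressed {
            tx.send(block).expect("consumer hung up");
        }
    }

    drop(tx); // close the channel so the consumer loop ends
    consumer.join().expect("consumer panicked")
}
```

A larger batch keeps more compressed blocks in flight at once, which is the memory cost the constant's description refers to.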
Functions§
- uuid_to_str
- Format 16 raw bytes into the canonical xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx UUID string.
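For reference, the canonical 8-4-4-4-12 layout can be produced from 16 raw bytes as in the sketch below. `format_uuid` is a hypothetical helper written for illustration; it is not the source of uuid_to_str, whose exact signature is not shown on this page.

```rust
use std::fmt::Write;

/// Format 16 raw bytes as lowercase hex with dashes after bytes 4, 6, 8 and 10.
fn format_uuid(bytes: &[u8; 16]) -> String {
    let mut out = String::with_capacity(36);
    for (i, byte) in bytes.iter().enumerate() {
        if matches!(i, 4 | 6 | 8 | 10) {
            out.push('-');
        }
        write!(out, "{:02x}", byte).expect("writing to a String cannot fail");
    }
    out
}
```

Passing the bytes a1 b2 c3 d4 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 yields the same read ID used in the reading example, a1b2c3d4-0506-0708-090a-0b0c0d0e0f10.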