bigWig signal → per-region score matrix, matching deeptools computeMatrix
reference-point and scale-regions output.
§Output format (deeptools heatmapper.save_matrix)
A gzipped file whose first line is @ followed by a JSON dict of the
parameters (no spaces; keys in deeptools’ fixed order; the per-sample
“special” params — upstream, downstream, body, bin size, ref point,
unscaled 5/3 prime — are emitted as one-element lists). Every subsequent
line is one region: chrom, comma-joined exon starts, comma-joined exon
ends, name, score, strand, then the per-bin signal values formatted with
Python %f (six decimals; missing → nan).
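The value formatting described above can be sketched as follows. This is a minimal illustration, not the crate's code: `format_value` and `format_region_line` are hypothetical names, and the tab separator between fields is an assumption of the sketch. The one non-obvious point is that Rust's formatter prints `NaN`, so the lowercase `nan` that Python's `%f` produces has to be special-cased.

```rust
/// Format one signal value like Python's `%f`: six decimals, with missing
/// data spelled `nan` (Rust's own formatter would print `NaN`).
fn format_value(v: f64) -> String {
    if v.is_nan() {
        "nan".to_string()
    } else {
        format!("{:.6}", v)
    }
}

/// Comma-join a list of coordinates, e.g. [100, 300] -> "100,300".
fn join_coords(xs: &[u64]) -> String {
    xs.iter()
        .map(|x| x.to_string())
        .collect::<Vec<_>>()
        .join(",")
}

/// Build one region line: six descriptive fields, then the per-bin values.
/// ASSUMPTION: fields are tab-separated.
fn format_region_line(
    chrom: &str,
    exon_starts: &[u64],
    exon_ends: &[u64],
    name: &str,
    score: &str,
    strand: char,
    values: &[f64],
) -> String {
    let mut fields: Vec<String> = vec![
        chrom.to_string(),
        join_coords(exon_starts),
        join_coords(exon_ends),
        name.to_string(),
        score.to_string(),
        strand.to_string(),
    ];
    fields.extend(values.iter().map(|&v| format_value(v)));
    fields.join("\t")
}
```

Under these assumptions, a two-exon region with values `0.0` and NaN would end in `0.000000` and `nan`.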
§Per-region binning (deeptools coverage_from_big_wig + coverage_from_array)
For each region a reference point is chosen by mode and strand, two flank
spans are laid out around it, the bigWig is read per-base (NaN where the
file carries no data or the span runs off the chromosome), each flank is
partitioned into bins by numpy.linspace(start, end, nbins, endpoint=False)
truncated to int, and each bin’s value is the NaN-masked mean of its bases.
Minus-strand regions read the flanks swapped and reverse the final row.
With missing data as zero, NaN bases become 0 before averaging.
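The partitioning and averaging steps above might look like the sketch below. The function names are hypothetical, the edge arithmetic is an approximation of what numpy.linspace computes, and the guard that gives every bin at least one base is an assumption of this sketch. It also assumes `end >= start`.

```rust
/// Bin edges as in numpy.linspace(start, end, nbins, endpoint=False),
/// truncated to integers, with `end` appended as the closing edge.
/// ASSUMPTION: end >= start.
fn bin_edges(start: usize, end: usize, nbins: usize) -> Vec<usize> {
    let step = (end - start) as f64 / nbins as f64;
    let mut edges: Vec<usize> = (0..nbins)
        .map(|i| (start as f64 + i as f64 * step) as usize)
        .collect();
    edges.push(end);
    edges
}

/// Mean of the non-NaN values, or NaN when nothing remains.
/// With `missing_as_zero`, NaN bases count as 0 instead of being masked.
fn nan_mean(values: &[f64], missing_as_zero: bool) -> f64 {
    let mut sum = 0.0;
    let mut n = 0usize;
    for &v in values {
        if v.is_nan() {
            if missing_as_zero {
                n += 1; // contributes 0 to the sum
            }
        } else {
            sum += v;
            n += 1;
        }
    }
    if n == 0 {
        f64::NAN
    } else {
        sum / n as f64
    }
}

/// Average a per-base signal into `nbins` bins.
fn bin_means(per_base: &[f64], nbins: usize, missing_as_zero: bool) -> Vec<f64> {
    let edges = bin_edges(0, per_base.len(), nbins);
    edges
        .windows(2)
        .map(|w| {
            let lo = w[0];
            // ASSUMPTION: each bin spans at least one base, clamped to the data.
            let hi = w[1].max(lo + 1).min(per_base.len());
            nan_mean(&per_base[lo..hi], missing_as_zero)
        })
        .collect()
}
```

For ten bases split into four bins, the truncated edges are 0, 2, 5, 7, 10, so the bins cover 2, 3, 2 and 3 bases respectively.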
§reference-point spans (b = upstream, a = downstream, refpoint rp)
- plus strand: left flank [rp-b, rp] → b/binSize bins; right flank [rp, rp+a] → a/binSize bins.
- minus strand: left flank [rp-a, rp] → a/binSize bins; right flank [rp, rp+b] → b/binSize bins; the row is then reversed.
rp is start (TSS), end (TES) or (start+end)/2 (center) for the plus
strand; end (TSS), start (TES) or (start+end)/2 (center) for minus.
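The reference-point rules above can be written out as a small sketch. `Anchor`, `Flanks` and both functions are hypothetical names used only for illustration, and the integer division for the center is an assumption.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Anchor {
    Tss,
    Tes,
    Center,
}

/// The two flank spans around a reference point, with their bin counts.
#[derive(Debug, PartialEq)]
struct Flanks {
    left: (i64, i64),
    left_bins: i64,
    right: (i64, i64),
    right_bins: i64,
    /// Whether the finished row must be reversed (minus strand).
    reverse: bool,
}

/// Pick the reference point by anchor and strand.
fn reference_point(start: i64, end: i64, minus: bool, anchor: Anchor) -> i64 {
    match (anchor, minus) {
        (Anchor::Tss, false) | (Anchor::Tes, true) => start,
        (Anchor::Tes, false) | (Anchor::Tss, true) => end,
        // ASSUMPTION: integer division for the midpoint.
        (Anchor::Center, _) => (start + end) / 2,
    }
}

/// Lay out the flanks: b = upstream, a = downstream. On the minus strand the
/// two lengths trade places and the row is flagged for reversal.
fn reference_point_flanks(rp: i64, b: i64, a: i64, bin_size: i64, minus: bool) -> Flanks {
    let (before, after) = if minus { (a, b) } else { (b, a) };
    Flanks {
        left: (rp - before, rp),
        left_bins: before / bin_size,
        right: (rp, rp + after),
        right_bins: after / bin_size,
        reverse: minus,
    }
}
```

For a region 1000..2000 with b = 500, a = 300 and a bin size of 10, the plus-strand TSS is 1000 with flanks [500, 1000] and [1000, 1300]; the minus-strand TSS is 2000 with flanks [1700, 2000] and [2000, 2500].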
§scale-regions spans
upstream flank [start-b, start] (b/binSize bins), the region body
[start, end] scaled to body/binSize bins, downstream flank [end, end+a]
(a/binSize bins). Minus strand swaps the up/down flanks and reverses.
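A sketch of the scale-regions layout follows, returning the three zones as (start, end, bin count) triples. `scale_regions_zones` and `finish_row` are hypothetical helpers, and the minus-strand handling reflects one reading of "swaps the up/down flanks": the flank lengths trade places, as in reference-point mode.

```rust
/// Zones for scale-regions mode as (start, end, nbins) triples:
/// left flank, scaled body, right flank. b = upstream, a = downstream.
/// ASSUMPTION: on the minus strand the flank lengths trade places.
fn scale_regions_zones(
    start: i64,
    end: i64,
    b: i64,
    a: i64,
    body: i64,
    bin_size: i64,
    minus: bool,
) -> [(i64, i64, i64); 3] {
    let (before, after) = if minus { (a, b) } else { (b, a) };
    [
        (start - before, start, before / bin_size),
        (start, end, body / bin_size),
        (end, end + after, after / bin_size),
    ]
}

/// Reverse the finished row for minus-strand regions so that bins always
/// run from upstream to downstream.
fn finish_row(mut row: Vec<f64>, minus: bool) -> Vec<f64> {
    if minus {
        row.reverse();
    }
    row
}
```

Note that the body zone always gets body/binSize bins regardless of the region's actual length, which is what makes regions of different sizes comparable column by column.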
Structs§
- MatrixParams
- Knobs that drive matrix layout and value computation, mirroring the deeptools parameter dict that ends up in the gzipped header.
- Region
- One BED6 region. score is kept as the literal BED field, so a "." stays "." while a numeric value is re-emitted as deeptools’ float (0 → 0.0).
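The score rule for a region can be illustrated with a short sketch. `format_score` is a hypothetical helper, not part of this crate's API, and it relies on Rust's `{:?}` float formatting happening to match Python's float repr for ordinary values; very large or small magnitudes and non-finite values may print differently and are not handled here.

```rust
/// Re-emit a BED score field: numeric text becomes a float string
/// ("0" -> "0.0"), anything else (such as ".") is passed through verbatim.
/// ASSUMPTION: `{:?}` matches Python's float repr only for ordinary values.
fn format_score(raw: &str) -> String {
    match raw.parse::<f64>() {
        Ok(v) => format!("{:?}", v),
        Err(_) => raw.to_string(),
    }
}
```

Keeping the raw field as a string until output time is what lets the placeholder "." survive unchanged.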
Enums§
- BinAvg
- The averaging statistic applied within each bin (deeptools --averageTypeBins).
- Mode
- Which subcommand layout to build.
- RefPoint
- Which point of each region anchors the flanks (reference-point mode).
Functions§
- compute_matrix
- Compute the matrix and write the gzipped deeptools-format file.
- read_bed
- Parse a BED file into BED6 regions. #-delimited multi-group BEDs are not supported (single “genes” group only); a # line is a hard error so we never silently mis-group.
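The hard-error policy for # lines can be sketched as a per-line parser. `Bed6` and `parse_bed_line` are hypothetical stand-ins rather than this crate's types, and the error messages and tab-only splitting are assumptions of the sketch.

```rust
/// Hypothetical BED6 record; the score stays as the literal text field.
#[derive(Debug, PartialEq)]
struct Bed6 {
    chrom: String,
    start: u64,
    end: u64,
    name: String,
    score: String,
    strand: char,
}

/// Parse one tab-separated BED6 line. A line starting with `#` is rejected
/// outright instead of being skipped, so group separators cannot be
/// silently ignored.
fn parse_bed_line(line: &str) -> Result<Bed6, String> {
    if line.starts_with('#') {
        return Err("'#' group separators are not supported".to_string());
    }
    let f: Vec<&str> = line.split('\t').collect();
    if f.len() < 6 {
        return Err(format!("expected 6 fields, found {}", f.len()));
    }
    let start = f[1]
        .parse::<u64>()
        .map_err(|e| format!("bad start {:?}: {}", f[1], e))?;
    let end = f[2]
        .parse::<u64>()
        .map_err(|e| format!("bad end {:?}: {}", f[2], e))?;
    let strand = f[5]
        .chars()
        .next()
        .ok_or_else(|| "empty strand field".to_string())?;
    Ok(Bed6 {
        chrom: f[0].to_string(),
        start,
        end,
        name: f[3].to_string(),
        score: f[4].to_string(),
        strand,
    })
}
```

Failing loudly here trades convenience for safety: a multi-group BED is refused up front rather than flattened into one group without warning.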