pub struct RegionSet {
    pub regions: Vec<Region>,
    pub header: Option<String>,
    pub path: Option<PathBuf>,
}
Representation of an interval (region) set file, such as a BED file.
Fields
regions: Vec<Region>
header: Option<String>
path: Option<PathBuf>
Implementations
impl RegionSet
pub fn identifier(&self) -> String
Calculate the identifier for this RegionSet.
This function does not sort the regions; the identifier is computed from the first three columns in their current (unsorted) order.
Returns
A String containing the RegionSet identifier.
pub fn file_digest(&self) -> String
pub fn iter_chroms(&self) -> impl Iterator<Item = &String>
Iterate over the unique chromosomes present in the RegionSet.
pub fn iter_chr_regions<'a>(
    &'a self,
    chr: &'a str,
) -> impl Iterator<Item = &'a Region>
pub fn sort(&mut self)
Sort the regions by the first three columns. Sorting happens in place, so the original order is overwritten.
pub fn region_widths(&self) -> Vec<u32>
Calculate the width of every region.
pub fn mean_region_width(&self) -> f64
Calculate the mean region width across the whole RegionSet.
pub fn calc_mid_points(&self) -> HashMap<String, Vec<u32>>
Calculate the midpoint of each region and return a HashMap of midpoints keyed by chromosome.
pub fn calc_mid_points_with_mode(
    &self,
    mode: CoordinateMode,
) -> HashMap<String, Vec<u32>>
Calculate midpoints using the specified coordinate convention.
See Region::mid_point_with_mode for details on how each mode computes the midpoint.
pub fn get_max_end_per_chr(&self) -> HashMap<String, u32>
Get the furthest region end position for each chromosome.
pub fn nucleotides_length(&self) -> u32
Get total nucleotide count
impl RegionSet
pub fn reduce(&self) -> RegionSet
Merge overlapping and adjacent intervals per chromosome.
Sorts by (chr, start), then sweeps to merge intervals where
next.start <= current.end. Returns a minimal set of non-overlapping regions.
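The sort-then-sweep merge described above can be sketched as follows. This is a minimal illustration, not the crate's implementation: `Interval` and `reduce_intervals` are hypothetical stand-ins for `Region` and `reduce`, and the sketch assumes plain `u32` start and end coordinates.

```rust
/// Hypothetical stand-in for a genomic interval (not the crate's `Region`).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Interval {
    pub chr: String,
    pub start: u32,
    pub end: u32,
}

/// Sort by (chr, start), then sweep once, merging an interval into the
/// previous one whenever it is on the same chromosome and
/// `next.start <= current.end` (overlapping or adjacent).
pub fn reduce_intervals(mut intervals: Vec<Interval>) -> Vec<Interval> {
    intervals.sort_by(|a, b| a.chr.cmp(&b.chr).then(a.start.cmp(&b.start)));

    let mut merged: Vec<Interval> = Vec::with_capacity(intervals.len());
    for iv in intervals {
        if let Some(cur) = merged.last_mut() {
            if cur.chr == iv.chr && iv.start <= cur.end {
                // Extend the current run; `max` handles nested intervals.
                cur.end = cur.end.max(iv.end);
                continue;
            }
        }
        // New chromosome or a true gap: start a new merged interval.
        merged.push(iv);
    }
    merged
}
```

Because the input is sorted first, a single pass is enough: every interval either extends the most recent merged interval or starts a new one.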
pub fn concat(&self, other: &RegionSet) -> RegionSet
Combine two region sets without merging overlapping intervals.
pub fn concat_into(self, other: RegionSet) -> RegionSet
Combine two region sets without merging overlapping intervals, consuming both sets.
Like RegionSet::concat, but takes ownership of self and other
so the backing Vec<Region>s are moved instead of cloned. Prefer this
when neither input is needed afterward. As with concat, the resulting
set has no header or path (it is a pure-regions set).
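The difference between the borrowing and the consuming variant can be illustrated on plain vectors. This is a sketch under assumptions: `concat_cloned` and `concat_moved` are hypothetical helper names, and the crate's actual bodies may differ.

```rust
/// Borrowing variant: both inputs stay usable, so every element is cloned.
pub fn concat_cloned<T: Clone>(a: &[T], b: &[T]) -> Vec<T> {
    let mut out = Vec::with_capacity(a.len() + b.len());
    out.extend_from_slice(a);
    out.extend_from_slice(b);
    out
}

/// Consuming variant: takes ownership of both vectors and moves the
/// elements of `b` onto the end of `a`, without cloning any element.
pub fn concat_moved<T>(mut a: Vec<T>, mut b: Vec<T>) -> Vec<T> {
    a.append(&mut b);
    a
}
```

The consuming form reuses the first vector's allocation where capacity allows, which is why it is preferable when neither input is needed afterward.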
pub fn union(&self, other: &RegionSet) -> RegionSet
Merge two region sets into a minimal non-overlapping set.
Equivalent to self.concat(other).reduce().
pub fn union_into(self, other: RegionSet) -> RegionSet
Merge two region sets into a minimal non-overlapping set, consuming both.
Equivalent to self.concat_into(other).reduce(). Saves the concat-stage
clones that RegionSet::union incurs; reduce still allocates internally.
pub fn trim(&self, chrom_sizes: &HashMap<String, u32>) -> RegionSet
Clamp regions to chromosome boundaries.
pub fn gaps(&self, chrom_sizes: &HashMap<String, u32>) -> RegionSet
Return the gaps between regions per chromosome, bounded by chromosome sizes.
Reduces the input first, then emits intervals that tile the peak-free
regions of each chromosome listed in chrom_sizes:
- a leading gap from position 0 to the first region’s start (omitted if the first region starts at 0),
- an inter-region gap between each consecutive pair of reduced regions,
- a trailing gap from the last region’s end to the chromosome size (omitted if the last region already reaches the chromosome end, or extends past it due to assembly mismatch),
- a full-chromosome gap 0..chrom_size for any chromosome in chrom_sizes that has no regions at all.
Regions on chromosomes not present in chrom_sizes are skipped.
Regions that extend past the stated chromosome size are clipped to
chrom_size when computing the trailing gap, matching the
clipping behavior of trim().
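The gap rules above can be sketched for a single chromosome. This is an illustration only: `gaps_on_chrom` is a hypothetical helper, regions are modelled as half-open `(start, end)` pairs, and the input is assumed to be already reduced (sorted and non-overlapping).

```rust
/// Gaps on one chromosome. `regions` must be sorted and non-overlapping
/// (already reduced); coordinates are assumed half-open `(start, end)`.
pub fn gaps_on_chrom(regions: &[(u32, u32)], chrom_size: u32) -> Vec<(u32, u32)> {
    let mut gaps = Vec::new();
    let mut cursor = 0u32; // end of the covered prefix so far

    for &(start, end) in regions {
        // Clip regions that extend past the stated chromosome size.
        let start = start.min(chrom_size);
        let end = end.min(chrom_size);
        if start > cursor {
            // Leading gap (cursor == 0) or inter-region gap.
            gaps.push((cursor, start));
        }
        cursor = cursor.max(end);
    }

    // Trailing gap, or the full-chromosome gap when there are no regions.
    if cursor < chrom_size {
        gaps.push((cursor, chrom_size));
    }
    gaps
}
```

A full per-chromosome version would group regions by chromosome, skip chromosomes absent from `chrom_sizes`, and call a helper like this once per entry in `chrom_sizes`.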
pub fn flank(&self, width: u32, use_start: bool, both: bool) -> RegionSet
Generate flanking regions.
pub fn resize(&self, width: u32, fix: &str) -> RegionSet
Resize regions to a fixed width, anchored at start, end, or center.
pub fn narrow(
    &self,
    start: Option<u32>,
    end: Option<u32>,
    width: Option<u32>,
) -> RegionSet
Narrow each region by specifying a relative sub-range within it.
pub fn promoters(&self, upstream: u32, downstream: u32) -> RegionSet
Generate promoter regions relative to each region’s start position.
pub fn pintersect(&self, other: &RegionSet) -> RegionSet
Pairwise intersection of two region sets by index position.
pub fn disjoin(&self) -> RegionSet
Break all regions into non-overlapping disjoint pieces.
Internal boundaries (starts and ends of overlapping input regions) split
the covered intervals into non-overlapping pieces. Only pieces that are
covered by at least one input region are emitted; gaps between disjoint
regions are never filled. This matches the semantics of R’s
GenomicRanges disjoin.
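The boundary-splitting idea can be sketched for one chromosome as follows. This is a simple quadratic illustration, not the crate's algorithm: `disjoin_on_chrom` is a hypothetical helper and regions are assumed to be half-open `(start, end)` pairs.

```rust
/// Split the covered stretches of one chromosome at every input boundary.
/// Pieces not covered by any input region are dropped.
pub fn disjoin_on_chrom(regions: &[(u32, u32)]) -> Vec<(u32, u32)> {
    // Every start and end is a potential cut point.
    let mut bounds: Vec<u32> = regions.iter().flat_map(|&(s, e)| [s, e]).collect();
    bounds.sort_unstable();
    bounds.dedup();

    bounds
        .windows(2)
        .map(|w| (w[0], w[1]))
        // Keep a piece only if some input region fully contains it.
        .filter(|&(a, b)| regions.iter().any(|&(s, e)| s <= a && b <= e))
        .collect()
}
```

For example, `(1, 10)` and `(5, 15)` yield the pieces `(1, 5)`, `(5, 10)`, and `(10, 15)`, while the uncovered stretch before a later region such as `(20, 30)` is not emitted.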
Trait Implementations
impl IntervalSetOps for RegionSet
fn setdiff(&self, other: &RegionSet) -> RegionSet
Set difference: removes the parts of self that overlap with other.
fn intersect(&self, other: &RegionSet) -> RegionSet
fn jaccard(&self, other: &RegionSet) -> f64
Jaccard similarity: |intersection| / |union|.
fn overlap_coefficient(&self, other: &RegionSet) -> f64
Overlap coefficient: |intersection| / min(|self|, |other|).
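The two ratios above can be sketched on a single chromosome. This is an illustration under stated assumptions: sizes are measured in covered base pairs (the page does not say whether the crate counts base pairs or regions), both inputs are already reduced (sorted, non-overlapping, half-open), all helper names are hypothetical, and returning 0.0 for an empty denominator is a choice made by this sketch.

```rust
/// Total covered length of a reduced interval list.
fn covered(regions: &[(u32, u32)]) -> u64 {
    regions.iter().map(|&(s, e)| u64::from(e - s)).sum()
}

/// Length of the intersection of two reduced interval lists,
/// computed with a two-pointer sweep.
fn intersection_len(a: &[(u32, u32)], b: &[(u32, u32)]) -> u64 {
    let (mut i, mut j, mut total) = (0usize, 0usize, 0u64);
    while i < a.len() && j < b.len() {
        let lo = a[i].0.max(b[j].0);
        let hi = a[i].1.min(b[j].1);
        if lo < hi {
            total += u64::from(hi - lo);
        }
        // Advance whichever interval ends first.
        if a[i].1 < b[j].1 {
            i += 1;
        } else {
            j += 1;
        }
    }
    total
}

/// |intersection| / |union|, with |union| = |a| + |b| - |intersection|.
pub fn jaccard_bp(a: &[(u32, u32)], b: &[(u32, u32)]) -> f64 {
    let inter = intersection_len(a, b);
    let union = covered(a) + covered(b) - inter;
    if union == 0 { 0.0 } else { inter as f64 / union as f64 }
}

/// |intersection| / min(|a|, |b|).
pub fn overlap_coefficient_bp(a: &[(u32, u32)], b: &[(u32, u32)]) -> f64 {
    let inter = intersection_len(a, b);
    let smaller = covered(a).min(covered(b));
    if smaller == 0 { 0.0 } else { inter as f64 / smaller as f64 }
}
```

The overlap coefficient reaches 1.0 whenever the smaller set is fully contained in the larger one, whereas the Jaccard index reaches 1.0 only when both sets cover exactly the same positions.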