Expand description
Core infrastructure for high-performance genomic interval overlap operations in Rust.
This crate provides efficient data structures and algorithms for finding overlapping intervals in genomic data. It is part of the gtars project, which provides tools for working with genomic interval data in Rust, Python, and R.
§Features
- Fast overlap queries: Efficiently find all intervals that overlap with a query interval
- Iterator-based API: Memory-efficient iteration over overlapping intervals
- Thread-safe: All data structures implement
SendandSyncfor concurrent access
All overlap computation logic should live here. Higher-level modules (scoring, tokenizers) wrap this functionality for their specific use cases but should not reimplement overlap algorithms.
§Quick Start
use gtars_overlaprs::{AIList, Overlapper, Interval};
// create some genomic intervals (e.g., ChIP-seq peaks)
let intervals = vec![
Interval { start: 100u32, end: 200, val: "gene1" },
Interval { start: 150, end: 300, val: "gene2" },
Interval { start: 400, end: 500, val: "gene3" },
];
// build the AIList data structure
let ailist = AIList::build(intervals);
// query for overlapping intervals
let overlaps = ailist.find(180, 250);
assert_eq!(overlaps.len(), 2); // gene1 and gene2 overlap
// or use an iterator for memory-efficient processing
for interval in ailist.find_iter(180, 250) {
println!("Found overlap: {:?}", interval);
}§Performance
The AIList data structure is optimized for queries on genomic-scale datasets and provides
excellent performance for typical genomic interval overlap operations. It uses a decomposition
strategy to handle intervals efficiently, particularly when dealing with high-coverage regions
common in genomic data.
§Examples
§Finding all genes that overlap a query region
use gtars_overlaprs::{AIList, Overlapper, Interval};
let genes = vec![
Interval { start: 1000u32, end: 2000, val: "BRCA1" },
Interval { start: 3000, end: 4000, val: "TP53" },
Interval { start: 5000, end: 6000, val: "EGFR" },
];
let gene_index = AIList::build(genes);
// query a specific region (e.g., chr17:1500-3500)
let overlapping_genes: Vec<&str> = gene_index
.find_iter(1500, 3500)
.map(|interval| interval.val)
.collect();
println!("Genes in region: {:?}", overlapping_genes);Re-exports§
pub use self::ailist::AIList;pub use self::bits::Bits;pub use self::traits::Overlapper;
Modules§
- ailist
- Augmented Interval List implementation.
- bits
- Binary Interval Search implementation.
- consts
- Constants used throughout the crate.
- multi_
chrom_ overlapper - Genome-wide interval indexing.
- traits
- Core traits for overlap operations.
Structs§
- Interval
- Represent a range from [start, end) Inclusive start, exclusive of end
Enums§
- Overlapper
Type - The type of overlap data structure to use.