Crate scailist
This module provides an implementation of an AIList in which the number of sublists scales dynamically.
Features
- Consistently fast. The way the input intervals are decomposed diminishes the effects of super containment.
- Parallel friendly. Queries run on an immutable structure, even for seek.
- Consumer / Adapter paradigm: an iterator is returned.
Details:
Please see the paper.
Most interaction with this crate will be through the ScAIList struct. The main method is [`find`](struct.ScAIList.html#method.find).
The overlap function assumes a zero-based genomic coordinate system. Ranges are half-open, [start, stop): the stop position is excluded for both the queries and the Intervals.
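The half-open convention above can be sketched as a small overlap test. This is only an illustration: `overlaps` and `overlap_len` are hypothetical helper names, not functions exported by this crate, and `u32` coordinates are assumed.

```rust
/// Returns true when the half-open ranges [a_start, a_end) and
/// [b_start, b_end) share at least one position.
/// Hypothetical helper, not part of the scailist API.
fn overlaps(a_start: u32, a_end: u32, b_start: u32, b_end: u32) -> bool {
    // The stop positions are exclusive, so both comparisons are strict:
    // ranges that merely touch (a_end == b_start) do not overlap.
    a_start < b_end && a_end > b_start
}

/// Number of positions shared by the two half-open ranges (0 if disjoint).
/// Hypothetical helper, not part of the scailist API.
fn overlap_len(a_start: u32, a_end: u32, b_start: u32, b_end: u32) -> u32 {
    a_end.min(b_end).saturating_sub(a_start.max(b_start))
}
```

Under this convention an interval [5, 7) overlaps the query [6, 11) by one position, while [5, 7) and [7, 9) do not overlap at all.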
ScAIList is composed of four primary parts: a main interval list, which holds all the intervals after they have been decomposed; a component indices list, which holds the start index of each sublist post-decomposition; a component lengths list, which holds the length of each component; and finally a max_ends list, which holds, for each interval, the maximum end seen so far within its sublist.
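To make the four parts concrete, here is a minimal sketch of how such a layout could be assembled from sublists that have already been decomposed. The names `Iv`, `Layout`, `build_layout`, `comp_idxs` and `comp_lens` are assumptions made for illustration; the crate's actual field names and construction code may differ.

```rust
/// Hypothetical interval type for this sketch (half-open: [start, end)).
#[derive(Debug, Clone, PartialEq)]
struct Iv {
    start: u32,
    end: u32,
}

/// Hypothetical flat layout mirroring the four parts described above.
struct Layout {
    intervals: Vec<Iv>,    // all intervals, stored component by component
    comp_idxs: Vec<usize>, // start index of each component in `intervals`
    comp_lens: Vec<usize>, // number of intervals in each component
    max_ends: Vec<u32>,    // running max end, restarted for each component
}

/// Flattens already-decomposed components into the layout.
fn build_layout(components: Vec<Vec<Iv>>) -> Layout {
    let mut intervals = Vec::new();
    let mut comp_idxs = Vec::new();
    let mut comp_lens = Vec::new();
    let mut max_ends = Vec::new();
    for mut comp in components {
        // Each component is kept sorted by start position.
        comp.sort_by_key(|iv| iv.start);
        comp_idxs.push(intervals.len());
        comp_lens.push(comp.len());
        // The running maximum is relative to the current component only.
        let mut running = 0u32;
        for iv in comp {
            running = running.max(iv.end);
            max_ends.push(running);
            intervals.push(iv);
        }
    }
    Layout { intervals, comp_idxs, comp_lens, max_ends }
}
```

Because the running maximum never decreases within a component, a query can stop scanning a component as soon as that maximum drops to or below the query start.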
The decomposition step is achieved by walking the list of intervals and recursively (with a cap) extracting intervals that overlap a given number of other intervals within a certain distance from it. The unique development in this implementation is to make the cap dynamic.
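The decomposition can be sketched roughly as follows. This is a simplified illustration, not the crate's code: the containment criterion, the `min_cov` and `lookahead` parameters, and the function names are all assumptions, and the cap (`max_comps`) is passed in as a fixed value rather than scaled dynamically as this crate does.

```rust
/// Hypothetical interval type for this sketch (half-open: [start, end)).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Iv {
    start: u32,
    end: u32,
}

/// One pass over `list` (which must be sorted by start). An interval is
/// extracted when it fully contains at least `min_cov` of the next
/// `lookahead` intervals; otherwise it is kept. Returns (kept, extracted).
fn split_once(list: &[Iv], min_cov: usize, lookahead: usize) -> (Vec<Iv>, Vec<Iv>) {
    let mut kept = Vec::new();
    let mut extracted = Vec::new();
    for (i, iv) in list.iter().enumerate() {
        // Later intervals start at or after `iv.start`, so `other.end <= iv.end`
        // means `other` is contained in `iv`.
        let covered = list[i + 1..]
            .iter()
            .take(lookahead)
            .filter(|other| other.end <= iv.end)
            .count();
        if covered >= min_cov {
            extracted.push(*iv);
        } else {
            kept.push(*iv);
        }
    }
    (kept, extracted)
}

/// Repeats the split on the extracted intervals until nothing more is
/// extracted or `max_comps` components exist (the cap, fixed in this sketch).
fn decompose(mut list: Vec<Iv>, min_cov: usize, lookahead: usize, max_comps: usize) -> Vec<Vec<Iv>> {
    // A threshold of zero would extract every interval, so clamp it to 1.
    let min_cov = min_cov.max(1);
    list.sort_by_key(|iv| iv.start);
    let mut components = Vec::new();
    while components.len() + 1 < max_comps {
        let (kept, extracted) = split_once(&list, min_cov, lookahead);
        if extracted.is_empty() {
            break;
        }
        components.push(kept);
        list = extracted;
    }
    // Whatever remains forms the final component.
    components.push(list);
    components
}
```

In this sketch, one long interval that contains several short ones ends up alone in a second component, so it no longer inflates the running maximum end of the first component.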
Examples
    use scailist::{Interval, ScAIList};
    use std::cmp;
    type Iv = Interval<u32>;

    // create some fake data
    let data: Vec<Iv> = (0..20).step_by(5).map(|x| Iv{start: x, end: x + 2, val: 0}).collect();
    println!("{:#?}", data);

    // make lapper structure
    let laps = ScAIList::new(data, None);
    assert_eq!(laps.find(6, 11).next(), Some(&Iv{start: 5, end: 7, val: 0}));

    let mut sim: u32 = 0;
    // Calculate the overlap between the query and the found intervals, sum total overlap
    for i in (0..10).step_by(3) {
        sim += laps
            .find(i, i + 2)
            .map(|iv| cmp::min(i + 2, iv.end) - cmp::max(i, iv.start))
            .sum::<u32>();
    }
    assert_eq!(sim, 4);
Structs
Interval | Hold the start and stop of each sublist |
IterFind | Find Iterator |
IterScAIList | ScAIList Iterator |
ScAIList | This is the main object of this repo, see associated methods |