rsomics-tabix
Build a coordinate index (.tbi / .csi) for a bgzip-compressed,
position-sorted tab-delimited file, and query it by region — a Rust port of
htslib tabix. Enables fast region-restricted access to large GFF/BED/SAM/VCF
and arbitrary tab files.
Install
Usage
| flag | meaning | default |
|---|---|---|
| -p, --preset gff\|bed\|sam\|vcf | input format preset | gff |
| -s, --seq-col INT | 1-based sequence-name column | from preset |
| -b, --begin-col INT | 1-based region-begin column | from preset |
| -e, --end-col INT | 1-based region-end column (0 = same as begin) | from preset |
| -S, --skip-lines INT | header lines to skip | 0 |
| -c, --comment CHAR | comment line marker | # |
| -0, --zero-based | coordinates are 0-based (BED-style) | off |
| -C, --csi | emit a .csi index instead of .tbi | tbi |
| -l, --list-chroms | list sequence names in the index | off |
| -f, --force | overwrite an existing index | off |
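
To make the column and coordinate options concrete, here is a minimal sketch of how a data line might be turned into a 0-based, half-open interval. It is not this crate's code: `parse_interval` and its parameters are hypothetical names chosen for illustration, and it only covers the generic case where the end comes from a column. The SAM and VCF presets derive the end position from the record itself, which is not modelled here.

```rust
/// Hypothetical helper (not part of this crate's API): map one tab-delimited
/// line to (sequence name, start, end) as a 0-based, half-open interval.
///
/// Column numbers are 1-based, mirroring --seq-col / --begin-col / --end-col.
/// `end_col == 0` means "same as begin", i.e. a single-position record.
pub fn parse_interval(
    line: &str,
    seq_col: usize,
    begin_col: usize,
    end_col: usize,
    zero_based: bool,
) -> Option<(String, u64, u64)> {
    let fields: Vec<&str> = line.split('\t').collect();

    // A column number of 0 is invalid for seq/begin, so checked_sub rejects it.
    let seq = fields.get(seq_col.checked_sub(1)?)?.to_string();
    let begin_raw = fields.get(begin_col.checked_sub(1)?)?.parse::<u64>().ok()?;

    // 1-based starts are shifted down by one; 0-based (BED-style) starts are kept.
    let start = if zero_based {
        begin_raw
    } else {
        begin_raw.checked_sub(1)?
    };

    // A 1-based inclusive end and a 0-based exclusive end are the same number,
    // so the end column is used as-is in both cases.
    let end = if end_col == 0 {
        start + 1
    } else {
        fields.get(end_col - 1)?.parse::<u64>().ok()?
    };

    Some((seq, start, end))
}
```

Under these assumptions, a GFF line with begin 100 and end 200 and a BED line with begin 99 and end 200 describe the same interval, so both map to the same start and end.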
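The preset constants match htslib's tbx_conf_*, where sc, bc, and ec are the
1-based sequence, begin, and end columns and the last character is the
comment/header marker: gff {sc=1,bc=4,ec=5,#}, bed {sc=1,bc=2,ec=3,#, 0-based},
sam {sc=3,bc=4,@}, vcf {sc=1,bc=2,#}.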
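
The preset values listed above can be written out as a small lookup table. This is an illustrative sketch only: the `Preset` struct, its field names, and the `preset` function are hypothetical and are not taken from this crate or from noodles. The column numbers and marker characters are the ones stated above; `end_col: 0` stands for "no end column" in the sam and vcf presets.

```rust
/// Hypothetical representation of a column preset (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Preset {
    pub seq_col: usize,   // sc: 1-based sequence-name column
    pub begin_col: usize, // bc: 1-based region-begin column
    pub end_col: usize,   // ec: 1-based region-end column, 0 = none
    pub comment: char,    // comment/header line marker
    pub zero_based: bool, // true for BED-style 0-based coordinates
}

/// Look up a preset by the name accepted by -p / --preset.
pub fn preset(name: &str) -> Option<Preset> {
    let p = match name {
        "gff" => Preset { seq_col: 1, begin_col: 4, end_col: 5, comment: '#', zero_based: false },
        "bed" => Preset { seq_col: 1, begin_col: 2, end_col: 3, comment: '#', zero_based: true },
        "sam" => Preset { seq_col: 3, begin_col: 4, end_col: 0, comment: '@', zero_based: false },
        "vcf" => Preset { seq_col: 1, begin_col: 2, end_col: 0, comment: '#', zero_based: false },
        _ => return None,
    };
    Some(p)
}
```

In a design like this, explicit -s/-b/-e/-c/-0 options would simply overwrite individual fields of the selected preset, which matches the "from preset" defaults in the table.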
Origin
Independent Rust reimplementation of htslib tabix based on the public BED/GFF/
SAM/VCF formats, the CSI/TBI index format specifications, htslib's MIT-licensed
tbx.c / tbx.h (preset column layouts and the tbx_parse1 coordinate logic),
and black-box testing against the tabix binary.
Index construction uses noodles
(csi / tabix / bgzf, pure Rust, Quadrant ①). BGZF inflate uses the bundled
libdeflate that htslib also uses.
License: MIT OR Apache-2.0. Upstream credit: htslib / tabix (MIT/Expat), Li, Bioinformatics 2011, doi:10.1093/bioinformatics/btq671.