ref_solver/parsing/mod.rs
//! Parsers for extracting sequence dictionaries from various file formats.
//!
//! This module provides parsers for:
//!
//! - **SAM/BAM/CRAM files**: Extract `@SQ` lines from alignment file headers
//! - **Picard .dict files**: Parse sequence dictionary files
//! - **FASTA index (.fai) files**: Parse FASTA index files
//! - **NCBI assembly reports**: Parse NCBI assembly reports with multiple naming conventions
//! - **VCF headers**: Extract `##contig` lines from VCF files
//! - **TSV/CSV files**: Parse tabular contig definitions
//!
//! ## Example
//!
//! ```rust,no_run
//! use ref_solver::parsing::sam::{parse_file, parse_header_text};
//! use std::path::Path;
//!
//! // Parse from a BAM file
//! let query = parse_file(Path::new("sample.bam")).unwrap();
//!
//! // Or parse from raw header text. `LN` must be a plain decimal integer:
//! // SAM headers do not allow digit separators such as `_`.
//! let header = "@SQ\tSN:chr1\tLN:248956422\tM5:6aef897c3d6ff0c78aff06ac189178dd\n";
//! let query = parse_header_text(header).unwrap();
//! ```
//!
//! ## Supported Tags
//!
//! From SAM `@SQ` lines, the following tags are extracted:
//!
//! | Tag | Description | Required |
//! |-----|-------------|----------|
//! | SN | Sequence name | Yes |
//! | LN | Sequence length | Yes |
//! | M5 | MD5 checksum | No |
//! | AS | Assembly identifier | No |
//! | UR | URI for sequence | No |
//! | SP | Species | No |
//! | AN | Alternate names (aliases) | No |
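//!
//! As a rough illustration of how the required and optional tags in the table
//! could map onto a parsed record, the sketch below splits a single `@SQ` line
//! into its tags. `SqRecord` and `parse_sq_line` are hypothetical names used
//! only for this example and are not part of this crate's API; the real
//! parsers in the `sam` module may behave differently.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! /// Hypothetical record holding a subset of the tags listed above.
//! #[derive(Debug, PartialEq)]
//! struct SqRecord {
//!     name: String,         // SN (required)
//!     length: u64,          // LN (required)
//!     md5: Option<String>,  // M5 (optional)
//!     aliases: Vec<String>, // AN (optional, comma-separated)
//! }
//!
//! /// Hypothetical helper: returns `None` if the line is not an `@SQ` line
//! /// or if a required tag is missing or malformed.
//! fn parse_sq_line(line: &str) -> Option<SqRecord> {
//!     let mut fields = line.trim_end().split('\t');
//!     if fields.next()? != "@SQ" {
//!         return None;
//!     }
//!     // Split each `TAG:value` field at the first colon only, so values
//!     // that contain colons themselves (such as URIs) stay intact.
//!     let tags: HashMap<&str, &str> =
//!         fields.filter_map(|field| field.split_once(':')).collect();
//!     Some(SqRecord {
//!         name: tags.get("SN")?.to_string(),
//!         length: tags.get("LN")?.parse().ok()?,
//!         md5: tags.get("M5").map(|s| s.to_string()),
//!         aliases: tags
//!             .get("AN")
//!             .map(|s| s.split(',').map(String::from).collect())
//!             .unwrap_or_default(),
//!     })
//! }
//! ```
//!
//! In this sketch a line without `SN` or `LN`, or with a non-numeric `LN`,
//! yields `None`, while absent optional tags become `None` or an empty list.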

pub mod dict;
pub mod fai;
pub mod fasta;
pub mod ncbi_report;
pub mod sam;
pub mod tsv;
pub mod vcf;