Crate seqwish


§seqwish - A variation graph inducer

Seqwish builds variation graphs from pairwise sequence alignments. It transforms a collection of sequences and their all-to-all alignments into a graph representation that captures the variation between the sequences.

§Overview

The algorithm proceeds in several stages:

  1. Sequence Indexing - Load and index input sequences
  2. Alignment Processing - Parse and index PAF alignments
  3. Transitive Closure - Compute equivalence classes of aligned positions
  4. Node Compaction - Merge non-bifurcating regions into single nodes
  5. Link Derivation - Extract edges between nodes
  6. GFA Emission - Output the variation graph in GFA format
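
To make stage 3 concrete, here is a minimal sketch of how pairwise matches can be closed transitively into equivalence classes with a union-find (disjoint-set) structure. It is an illustration only: the `DisjointSet` type, the `closure_classes` function, and the flat `usize` coordinate space are assumptions made for this example, not the crate's API. The `dset64` and `transclosure` modules presumably do this in parallel and with disk-backed storage.

```rust
use std::collections::BTreeMap;

/// Minimal union-find over positions 0..n (hypothetical helper type).
struct DisjointSet {
    parent: Vec<usize>,
}

impl DisjointSet {
    fn new(n: usize) -> Self {
        DisjointSet { parent: (0..n).collect() }
    }

    /// Find the representative of `x`, halving the path as we walk up.
    fn find(&mut self, mut x: usize) -> usize {
        while self.parent[x] != x {
            let grandparent = self.parent[self.parent[x]];
            self.parent[x] = grandparent;
            x = grandparent;
        }
        x
    }

    /// Merge the classes containing `a` and `b`.
    fn union(&mut self, a: usize, b: usize) {
        let (ra, rb) = (self.find(a), self.find(b));
        if ra != rb {
            // Keep the smaller root so the representative is the class minimum.
            let (lo, hi) = if ra < rb { (ra, rb) } else { (rb, ra) };
            self.parent[hi] = lo;
        }
    }
}

/// Group positions into equivalence classes given pairwise matches.
/// `matches` holds (query_pos, target_pos) pairs in an assumed
/// concatenated coordinate space of length `n`.
fn closure_classes(n: usize, matches: &[(usize, usize)]) -> Vec<Vec<usize>> {
    let mut ds = DisjointSet::new(n);
    for &(q, t) in matches {
        ds.union(q, t);
    }
    // Collect members per representative, ordered by representative.
    let mut classes: BTreeMap<usize, Vec<usize>> = BTreeMap::new();
    for p in 0..n {
        let root = ds.find(p);
        classes.entry(root).or_default().push(p);
    }
    classes.into_values().collect()
}
```

Calling `closure_classes(6, &[(0, 3), (3, 5), (1, 4)])` places positions 0, 3, and 5 in one class even though 0 and 5 were never aligned directly. This is the sense in which the closure is transitive.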

§Example

use seqwish::seqindex::SeqIndex;
use std::sync::{Arc, Mutex};

// Build a sequence index
let mut seqidx = SeqIndex::new();
seqidx.build_index("sequences.fa").unwrap();

§Command-line Usage

seqwish -s sequences.fa -p alignments.paf -g output.gfa
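
The `-p` input is a PAF file. As a rough illustration of what each record carries, the sketch below parses the twelve mandatory tab-separated PAF columns plus an optional `cg:Z:` CIGAR tag. `PafRecord` and `parse_paf_line` are hypothetical names for this example and are not the crate's `paf` module types. The field layout follows the PAF format as commonly documented.

```rust
/// Hypothetical plain-Rust view of one PAF record (not the crate's type).
#[derive(Debug)]
struct PafRecord {
    query_name: String,
    query_len: u64,
    query_start: u64,
    query_end: u64,
    same_strand: bool,
    target_name: String,
    target_len: u64,
    target_start: u64,
    target_end: u64,
    num_matches: u64,
    block_len: u64,
    mapq: u8,
    cigar: Option<String>,
}

/// Parse one PAF line; returns None if a mandatory column is missing or malformed.
fn parse_paf_line(line: &str) -> Option<PafRecord> {
    let f: Vec<&str> = line.trim_end().split('\t').collect();
    if f.len() < 12 {
        return None;
    }
    // Column 5 is the relative strand: "+" or "-".
    let same_strand = match f[4] {
        "+" => true,
        "-" => false,
        _ => return None,
    };
    // Optional tags follow the mandatory columns; the CIGAR uses the cg:Z: tag.
    let cigar = f[12..]
        .iter()
        .find_map(|tag| tag.strip_prefix("cg:Z:").map(|c| c.to_string()));
    Some(PafRecord {
        query_name: f[0].to_string(),
        query_len: f[1].parse().ok()?,
        query_start: f[2].parse().ok()?,
        query_end: f[3].parse().ok()?,
        same_strand,
        target_name: f[5].to_string(),
        target_len: f[6].parse().ok()?,
        target_start: f[7].parse().ok()?,
        target_end: f[8].parse().ok()?,
        num_matches: f[9].parse().ok()?,
        block_len: f[10].parse().ok()?,
        mapq: f[11].parse().ok()?,
        cigar,
    })
}
```

The base-level matches that graph induction needs come from the CIGAR string, so alignments are expected to carry the `cg:Z:` tag; for example, minimap2 emits it when run with `-c`.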

§Features

  • Memory-safe parallel processing
  • Disk-backed data structures for scalability
  • Produces GFA v1.0 format output
  • Compatible with standard pangenome tools

Modules§

alignments
cigar
compact
dna
dset64
dset64_asm
dset64_unsafe
gfa
intervaltree
Generic interval tree abstraction
links
mmap
paf
pos
seqindex
sxs
tempfile
time
transclosure
utils
version

Structs§

AlnIITreeHandle
Opaque handle to Alignment IITree (uses Mutex for writing)
CigarHandle
Opaque handle to CIGAR vector
IITreeHandle
Opaque handle to IITree (for node/path iitrees that use RwLock)
PafRowHandle
Opaque handle to a parsed PAF row
SeqIndexHandle
Opaque handle to SeqIndex
SxsHandle
Opaque handle to a parsed SXS alignment
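
The handle types above follow the usual opaque-pointer pattern for exposing Rust values across a C boundary: a constructor returns a raw pointer, accessors check it before use, and a matching free function reclaims it. The sketch below shows that pattern in isolation. `OpsHandle` and the `ops_*` functions are hypothetical stand-ins, not the crate's definitions, and a real export would also carry a no-mangle attribute, omitted here for brevity.

```rust
/// Hypothetical payload standing in for something like a parsed CIGAR vector.
pub struct OpsHandle {
    ops: Vec<(u64, u8)>,
}

/// Allocate a handle on the heap and hand ownership to the caller.
pub extern "C" fn ops_new() -> *mut OpsHandle {
    Box::into_raw(Box::new(OpsHandle { ops: Vec::new() }))
}

/// Append one (length, operation) pair. Returns false on a NULL handle.
///
/// Safety: `h` must be NULL or a live pointer obtained from `ops_new`.
pub unsafe extern "C" fn ops_push(h: *mut OpsHandle, len: u64, op: u8) -> bool {
    match unsafe { h.as_mut() } {
        Some(handle) => {
            handle.ops.push((len, op));
            true
        }
        None => false,
    }
}

/// Number of stored operations; 0 for a NULL handle.
///
/// Safety: `h` must be NULL or a live pointer obtained from `ops_new`.
pub unsafe extern "C" fn ops_length(h: *const OpsHandle) -> usize {
    unsafe { h.as_ref() }.map_or(0, |handle| handle.ops.len())
}

/// Reclaim the allocation. Passing NULL is a no-op.
///
/// Safety: `h` must be NULL or a pointer from `ops_new` that has not
/// already been freed.
pub unsafe extern "C" fn ops_free(h: *mut OpsHandle) {
    if !h.is_null() {
        // Rebuild the Box so Rust drops the value and frees the memory.
        drop(unsafe { Box::from_raw(h) });
    }
}
```

The caller never sees the struct layout, so the Rust side can change its internals freely. The price is that every handle must be released exactly once through its matching free function.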

Functions§

cigar_free
Free CIGAR handle
cigar_from_string
Parse a CIGAR string and return a handle to the CIGAR vector. Returns NULL on error. Must be freed with cigar_free.
cigar_get_op
Get the operation at the given index. Returns false if the index is out of bounds.
cigar_length
Get number of operations in CIGAR
cigar_to_string
Convert a CIGAR vector to a string. Returns a C string that must be freed with temp_file_free_string.
compact_compact_nodes
Compact nodes by marking boundaries in the graph
dna_complement
Get complement of a single DNA base
dna_reverse_complement
Reverse complement a DNA sequence (allocates new string that must be freed)
dna_reverse_complement_in_place
Reverse complement a DNA sequence in place
file_exists
Check if a file exists
handy_parameter
Parse a number with optional suffix (k, m, g)
keep_sparse
Determine if a match should be kept based on sparsification factor
match_hash
Hash function for match parameters
mmap_close_rust
Close a memory-mapped file
mmap_open_rust
Open a file and memory-map it. Returns the file size on success, 0 on error. The buffer pointer and file descriptor are written to the provided pointers.
paf_row_alignment_block_length
paf_row_cigar
paf_row_free
Free a PAF row handle
paf_row_mapping_quality
paf_row_num_matches
paf_row_parse
Parse a PAF row from a C string line. Returns NULL if parsing fails.
paf_row_query_end
paf_row_query_sequence_length
paf_row_query_sequence_name
paf_row_query_start
paf_row_query_target_same_strand
paf_row_target_end
paf_row_target_sequence_length
paf_row_target_sequence_name
paf_row_target_start
parse_paf_spec
Parse a PAF spec string, calling the callback for each (filename, weight) pair. Callback signature: void callback(void* user_data, const char* filename, uint64_t weight)
pos_decr_pos
Decrement position
pos_decr_pos_by
Decrement position by N
pos_incr_pos
Increment position
pos_incr_pos_by
Increment position by N
pos_is_rev
Check if position is reverse
pos_make_pos_t
Create a position from offset and orientation
pos_offset
Extract offset from position
pos_rev_pos_t
Reverse position orientation
pos_to_string_c
Convert position to string (returns C string that must be freed)
seqwish_rust_add
Simple test function to verify FFI is working
seqwish_rust_version
Returns the version string of the Rust component
sxs_cigar
sxs_free
Free an SXS handle
sxs_is_good
sxs_is_reverse
sxs_mapping_quality
sxs_new
Create a new empty SXS alignment
sxs_num_matches
sxs_parse_lines
Parse an SXS alignment from an array of C strings (lines). Returns NULL if parsing fails.
sxs_query_end
sxs_query_sequence_name
sxs_query_start
sxs_target_end
sxs_target_sequence_name
sxs_target_start
temp_file_create
Create a temporary file. Returns a C string that must be freed with temp_file_free_string. Returns NULL on error.
temp_file_free_string
Free a string returned by temp_file functions
temp_file_get_dir
Get temp directory. Returns a C string that must be freed with temp_file_free_string.
temp_file_remove
Remove a temporary file
temp_file_set_dir
Set temp directory
temp_file_set_keep_temp
Set whether to keep temp files
time_since_epoch_ms
Get milliseconds since Unix epoch
transclosure_compute
Compute transitive closures for variation graph construction
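
As an illustration of what the `dna_*` helpers compute, here is a minimal safe-Rust sketch of base complementation and reverse complementation, both allocating and in place. The function names and the handling of non-ACGT bytes (passed through unchanged) are choices made for this example; the crate's exported functions work on C strings and may treat ambiguity codes differently.

```rust
/// Complement a single DNA base, preserving case.
/// Bytes other than A, C, G, T (for example N) are returned unchanged;
/// that is a choice of this sketch.
fn complement(base: u8) -> u8 {
    match base {
        b'A' => b'T',
        b'C' => b'G',
        b'G' => b'C',
        b'T' => b'A',
        b'a' => b't',
        b'c' => b'g',
        b'g' => b'c',
        b't' => b'a',
        other => other,
    }
}

/// Return the reverse complement as a newly allocated buffer.
fn reverse_complement(seq: &[u8]) -> Vec<u8> {
    seq.iter().rev().map(|&b| complement(b)).collect()
}

/// Reverse complement a sequence without allocating.
fn reverse_complement_in_place(seq: &mut [u8]) {
    seq.reverse();
    for b in seq.iter_mut() {
        *b = complement(*b);
    }
}
```

Applying either function twice returns the original sequence, which makes for a quick sanity check when wiring such helpers through an FFI layer.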