§cgdist - High-performance SNP/indel-level distance calculator for core genome MLST analysis
This library calculates genetic distances between bacterial samples from core genome MLST (cgMLST) data, with an emphasis on performance. It supports multiple hashing algorithms and is compatible with chewBBACA allele calling.
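For orientation, the simplest distance over cgMLST profiles is an allele-level (Hamming) count: the number of loci at which two samples both have a call and the calls differ. The sketch below illustrates that baseline only. It is not this crate's implementation, which works at the SNP/indel level, and the `Allele` type and `allelic_distance` function are hypothetical names introduced for the example.

```rust
/// Allele call at one locus; `None` marks a missing call (for example "-").
/// Hypothetical type for illustration, not part of the cgdist API.
type Allele = Option<u32>;

/// Count the loci where both samples have a call and the calls differ.
/// Loci missing in either sample are skipped. Returns `None` when fewer
/// than `min_loci` loci are shared, since the distance is then unreliable.
fn allelic_distance(a: &[Allele], b: &[Allele], min_loci: usize) -> Option<usize> {
    let mut shared = 0;
    let mut differing = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        if let (Some(x), Some(y)) = (x, y) {
            shared += 1;
            if x != y {
                differing += 1;
            }
        }
    }
    if shared >= min_loci {
        Some(differing)
    } else {
        None
    }
}
```

With profiles `[1, 4, missing, 7]` and `[1, 5, 2, 9]` and `min_loci = 1`, three loci are shared and two of them differ, so the function returns `Some(2)`.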
§Features
- High performance: Optimized parallel processing and caching
- Plugin system: Support for CRC32, SHA256, MD5, and custom hashers
- Multiple formats: TSV, CSV, PHYLIP, NEXUS output formats
- Flexible filtering: Sample and loci filtering with regex and file lists
- Quality control: Configurable thresholds for data completeness
- chewBBACA compatible: Full backward compatibility with existing workflows
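To make the CRC32 option concrete, here is a minimal sketch of the standard CRC-32 checksum (IEEE 802.3, reflected polynomial `0xEDB88320`) applied to a byte string such as an allele sequence. It is written bit by bit, without a lookup table, for readability. The function name is hypothetical, and whether the crate normalizes sequences (case, strand) before hashing is not stated in this documentation, so treat this as an illustration of the checksum rather than of the crate's hasher.

```rust
/// Bitwise CRC-32 (IEEE 802.3), reflected form with polynomial 0xEDB88320.
/// Hypothetical helper for illustration, not part of the cgdist API.
fn crc32(data: &[u8]) -> u32 {
    // Standard initial value: all bits set.
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // `mask` is all ones when the low bit is set, otherwise zero,
            // so the polynomial is XORed in only when a 1 bit is shifted out.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    // Standard final step: invert all bits.
    !crc
}
```

The conventional check value for this algorithm is `crc32(b"123456789") == 0xCBF43926`, which is a quick way to confirm that an implementation matches the standard variant.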
§Basic Usage
use cgdist::prelude::*;
// Load allelic profiles with CRC32 hasher (chewBBACA compatible)
let matrix = AllelicMatrix::from_file_with_hasher(
    std::path::Path::new("profiles.tsv"),
    "-",     // missing character
    "crc32", // hasher type
    0.0,     // sample threshold
    0.0,     // locus threshold
    None, None, None, None, // filters
    None, None, None, None,
)?;
// Calculate distances
let engine = DistanceEngine::new(AlignmentConfig::default(), "crc32".to_string());
let distances = calculate_distance_matrix(
    &matrix.samples,
    &matrix.loci_names,
    &engine,
    DistanceMode::SnpsOnly,
    0,     // min loci
    false, // hamming fallback
);
Re-exports§
pub use cli::Args;
pub use cli::ValidationResult;
pub use core::AlignmentConfig;
pub use core::DistanceEngine;
pub use core::DistanceMode;
pub use data::AllelicMatrix;
pub use data::AllelicProfile;
pub use data::SequenceDatabase;
pub use data::SequenceInfo;
pub use hashers::AlleleHash;
pub use hashers::AlleleHashPair;
pub use hashers::AlleleHasher;
pub use hashers::HasherRegistry;
Modules§
Constants§
- VERSION
- Library version
Functions§
- get_info
- Get library information