A high-performance, zero-dependency Rust library for inferring genetic sex from summarized variant data.
The algorithm consumes an iterator of VariantInfo structs in a single
pass, counting valid and heterozygous observations across autosomes and sex
chromosomes. Metrics are normalized by platform-level “attempted” locus
counts that the caller must provide via PlatformDefinition, making the
library resilient to platform density and sample quality differences.
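As a rough illustration of the single-pass idea, the sketch below counts observations per chromosome class and then divides by the caller-supplied attempted counts. It is not the crate's implementation: `Chrom`, `Obs`, `Counts`, `accumulate`, and `y_to_autosome_ratio` are hypothetical names, and the ratio shown is only an assumed example of a normalized metric.

```rust
// Hypothetical stand-ins for illustration; these are NOT the crate's types.
#[derive(Clone, Copy)]
enum Chrom {
    Autosome,
    X,
    Y,
}

struct Obs {
    chrom: Chrom,
    is_heterozygous: bool,
}

#[derive(Default)]
struct Counts {
    auto_valid: u64,
    auto_het: u64,
    x_valid: u64,
    x_het: u64,
    y_valid: u64,
}

/// Single pass over the observations: every item is counted as valid for its
/// chromosome class, and heterozygous calls are tallied separately.
fn accumulate(obs: impl Iterator<Item = Obs>) -> Counts {
    let mut c = Counts::default();
    for o in obs {
        match o.chrom {
            Chrom::Autosome => {
                c.auto_valid += 1;
                if o.is_heterozygous {
                    c.auto_het += 1;
                }
            }
            Chrom::X => {
                c.x_valid += 1;
                if o.is_heterozygous {
                    c.x_het += 1;
                }
            }
            Chrom::Y => c.y_valid += 1,
        }
    }
    c
}

/// Assumed example metric: Y call rate divided by autosomal call rate.
/// Dividing by the autosomal rate cancels overall sample quality, and dividing
/// each count by its attempted total cancels platform density.
/// Returns None when the ratio is undefined.
fn y_to_autosome_ratio(
    c: &Counts,
    n_attempted_autosomes: u64,
    n_attempted_y_nonpar: u64,
) -> Option<f64> {
    if n_attempted_autosomes == 0 || n_attempted_y_nonpar == 0 || c.auto_valid == 0 {
        return None;
    }
    let auto_rate = c.auto_valid as f64 / n_attempted_autosomes as f64;
    let y_rate = c.y_valid as f64 / n_attempted_y_nonpar as f64;
    Some(y_rate / auto_rate)
}
```

Under these assumptions, a sample with uniformly poor call rates still yields a Y ratio near 1 when Y is present, because both rates drop together.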
§Example
use infer_sex::{
    Chromosome, DecisionThresholds, GenomeBuild, InferenceConfig, InferenceResult,
    InferredSex, PlatformDefinition, SexInferenceAccumulator, VariantInfo,
};

let config = InferenceConfig {
    build: GenomeBuild::Build38,
    platform: PlatformDefinition {
        n_attempted_autosomes: 2_000,
        n_attempted_y_nonpar: 1_000,
    },
    thresholds: Some(DecisionThresholds::default()),
};
let mut acc = SexInferenceAccumulator::new(config);
let variants = vec![
    // Autosomal signal for normalization.
    VariantInfo { chrom: Chromosome::Autosome, pos: 1_000_000, is_heterozygous: true },
    VariantInfo { chrom: Chromosome::Autosome, pos: 2_000_000, is_heterozygous: false },
    // X non-PAR heterozygosity (diploid X implies female).
    VariantInfo { chrom: Chromosome::X, pos: 10_000_000, is_heterozygous: true },
    VariantInfo { chrom: Chromosome::X, pos: 20_000_000, is_heterozygous: true },
];
for v in &variants {
    acc.process_variant(v);
}
let result: InferenceResult = acc.finish().expect("valid platform counts");
assert_eq!(result.final_call, InferredSex::Female);
println!("Report: {:?}", result.report);
The library returns InferredSex::Male, InferredSex::Female, or
InferredSex::Indeterminate when no sex-chromosome evidence is observed. If
you do not supply DecisionThresholds, a built-in default heuristic is
used to derive the call while still exposing the underlying metrics for
custom downstream logic.
§Platform definitions (n_attempted_*)
The attempted locus counts must match the exact loci that will be streamed into
process_variant. A common pattern is to pre-scan a BIM (or similar) file:
use infer_sex::PlatformDefinition;

struct BimRow { chrom: String, pos: u64 }

fn derive_platform_from_bim(rows: impl Iterator<Item = BimRow>) -> PlatformDefinition {
    let mut auto = 0u64;
    let mut y_nonpar = 0u64;
    fn is_in_y_par(_pos: u64) -> bool { unimplemented!("project-specific PAR check") }
    for row in rows {
        match row.chrom.as_str() {
            "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" | "10" | "11" | "12"
            | "13" | "14" | "15" | "16" | "17" | "18" | "19" | "20" | "21" | "22" => {
                auto += 1;
            }
            "Y" => {
                if !is_in_y_par(row.pos) {
                    y_nonpar += 1;
                }
            }
            _ => {}
        }
    }
    PlatformDefinition {
        n_attempted_autosomes: auto,
        n_attempted_y_nonpar: y_nonpar,
    }
}
The variant stream passed to SexInferenceAccumulator must be derived from the
same locus set; down-sampling autosomes for speed requires that
n_attempted_autosomes reflect the down-sampled set.
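One way to keep the two in sync is to down-sample first and derive the attempted counts from the resulting set, so the same list feeds both the counts and the stream. The sketch below is self-contained and does not use the crate: `Locus`, `downsample`, `attempted_counts`, and `is_autosome` are hypothetical helpers, and the chrY PAR intervals are the commonly cited GRCh38 coordinates, which you should verify against the reference build you actually use.

```rust
// Hypothetical stand-in for a parsed BIM row; not part of the crate.
#[derive(Clone, Debug, PartialEq)]
struct Locus {
    chrom: String,
    pos: u64,
}

// Assumption: commonly cited GRCh38 chrY pseudoautosomal intervals
// (1-based, inclusive). Other builds use different coordinates.
const Y_PAR_GRCH38: [(u64, u64); 2] = [(10_001, 2_781_479), (56_887_903, 57_217_415)];

fn is_in_y_par(pos: u64) -> bool {
    Y_PAR_GRCH38
        .iter()
        .any(|&(start, end)| (start..=end).contains(&pos))
}

/// Treats numeric chromosome labels 1 through 22 as autosomes.
fn is_autosome(chrom: &str) -> bool {
    matches!(chrom.parse::<u8>(), Ok(1..=22))
}

/// Keeps every `stride`-th autosomal locus and all non-autosomal loci.
/// A stride of 0 is treated as 1 (no down-sampling).
fn downsample(loci: &[Locus], stride: usize) -> Vec<Locus> {
    let stride = stride.max(1);
    let mut seen_auto = 0usize;
    loci.iter()
        .filter(|l| {
            if is_autosome(&l.chrom) {
                let keep = seen_auto % stride == 0;
                seen_auto += 1;
                keep
            } else {
                true
            }
        })
        .cloned()
        .collect()
}

/// Returns (autosomal loci, Y non-PAR loci) for exactly the set passed in.
fn attempted_counts(loci: &[Locus]) -> (u64, u64) {
    let mut auto = 0u64;
    let mut y_nonpar = 0u64;
    for l in loci {
        if is_autosome(&l.chrom) {
            auto += 1;
        } else if l.chrom == "Y" && !is_in_y_par(l.pos) {
            y_nonpar += 1;
        }
    }
    (auto, y_nonpar)
}
```

With this arrangement, the tuple returned by `attempted_counts(&kept)` would populate n_attempted_autosomes and n_attempted_y_nonpar, and only variants at loci in `kept` would be streamed to the accumulator.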
Structs§
- AlgorithmConstants
- DecisionThresholds
- EvidenceReport
- InferenceConfig
- InferenceResult
- PlatformDefinition
- SexInferenceAccumulator
- VariantInfo