Crate varforge

VarForge: synthetic cancer sequencing data generator.

Generates realistic FASTQ and BAM files with controlled mutations, tumour parameters, UMI tags, and cfDNA fragment profiles for benchmarking bioinformatics tools.

Modules

artifacts
Sequencing artifact simulation: FFPE deamination, oxidative damage, and PCR duplicates.
cli
Command-line interface definitions for VarForge.
core
Core simulation primitives: types, coverage, fragment sampling, quality models, and the read engine.
editor
BAM editing engine for spiking variants into existing sequencing data.
io
Input and output: YAML config parsing, FASTQ and BAM writers, reference genome access, VCF reading and writing, and the simulation manifest.
seq_utils
Sequence utility functions shared across modules.
tumour
Tumour model: clonal tree construction and cancer cell fraction assignment.
umi
UMI (unique molecular identifier) support: barcode generation and PCR family simulation.
variants
Variant generation and spike-in: SNVs, indels, MNVs, SVs, CNVs, and mutational signatures.
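
As a rough illustration of how the tumour parameters and cancer cell fraction mentioned above relate to the allele fraction a simulated variant would show in the reads, the sketch below applies the standard purity and copy-number relationship. The function `expected_vaf` and all of its parameter names are hypothetical and are not taken from the varforge API; the crate's own model may differ.

```rust
/// Expected variant allele fraction (VAF) of a somatic variant.
///
/// Hypothetical helper for illustration only; not part of the varforge API.
/// - `purity`: fraction of cells in the sample that are tumour cells (0..=1)
/// - `ccf`: cancer cell fraction, the share of tumour cells carrying the variant
/// - `multiplicity`: number of mutated copies per variant-carrying tumour cell
/// - `tumour_cn` / `normal_cn`: total copy number at the locus in tumour / normal cells
fn expected_vaf(
    purity: f64,
    ccf: f64,
    multiplicity: f64,
    tumour_cn: f64,
    normal_cn: f64,
) -> f64 {
    // Average number of mutated copies per cell in the mixed sample.
    let mutant_copies = purity * ccf * multiplicity;
    // Average number of copies of the locus per cell in the mixed sample.
    let total_copies = purity * tumour_cn + (1.0 - purity) * normal_cn;
    mutant_copies / total_copies
}

fn main() {
    // Clonal heterozygous SNV in a diploid region at 60% purity.
    let clonal = expected_vaf(0.6, 1.0, 1.0, 2.0, 2.0);
    // Same locus, but the variant is present in only half of the tumour cells.
    let subclonal = expected_vaf(0.6, 0.5, 1.0, 2.0, 2.0);
    println!("clonal VAF = {clonal:.3}, subclonal VAF = {subclonal:.3}");
}
```

Under these assumptions, a subclonal variant at half the cancer cell fraction shows half the allele fraction of a clonal one at the same locus, which is the kind of controlled signal a benchmark can check a variant caller against.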