datasynth-config
Configuration schema, validation, and industry presets for synthetic data generation.
Overview
datasynth-config provides the configuration layer for SyntheticData:
- Schema Definition: Complete YAML configuration schema
- Validation: Bounds checking, constraint validation, distribution sum verification
- Industry Presets: Pre-configured settings for common industries
- Complexity Levels: Small, medium, and large organization profiles
Configuration Sections
| Section | Description |
|---|---|
global |
Industry, dates, seed, performance settings |
companies |
Company codes, currencies, volume weights |
chart_of_accounts |
COA complexity and structure |
transactions |
Line items, amounts, sources, temporal patterns |
master_data |
Vendors, customers, materials, assets, employees |
document_flows |
P2P, O2C configuration |
intercompany |
IC transaction types and transfer pricing |
balance |
Opening balances, trial balance generation |
subledger |
AR, AP, FA, inventory settings |
fx |
Currency and exchange rate settings |
period_close |
Close tasks and schedules |
fraud |
Fraud injection rates and types |
internal_controls |
SOX controls and SoD rules |
anomaly_injection |
Anomaly rates and labeling |
data_quality |
Missing values, typos, duplicates |
graph_export |
ML graph export formats |
output |
Output format and compression |
Industry Presets
| Industry | Description |
|---|---|
manufacturing |
Heavy P2P, inventory, fixed assets |
retail |
High O2C volume, seasonal patterns |
financial_services |
Complex intercompany, high controls |
healthcare |
Regulatory focus, seasonal insurance |
technology |
SaaS revenue patterns, R&D capitalization |
Usage
use ;
// Load and validate configuration
let config = from_yaml_file?;
let validator = new;
validator.validate?;
// Use industry preset
let config = preset_manufacturing;
Validation Rules
period_months: 1-120 (max 10 years)compression_level: 1-9 when enabled- All rate/percentage fields: 0.0-1.0
- Approval thresholds: strictly ascending order
- Distribution sums: must equal 1.0 (±0.01 tolerance)
License
Apache-2.0 - See LICENSE for details.