datasynth-core 0.2.3

Core domain models, traits, and distributions for synthetic accounting data generation
Documentation

datasynth-core

Core domain models, traits, and distributions for synthetic accounting data generation.

Overview

datasynth-core provides the foundational building blocks for the SyntheticData workspace:

  • Domain Models: Journal entries, chart of accounts, master data, documents, anomalies
  • Statistical Distributions: Line item sampling, amount generation, temporal patterns
  • Core Traits: Generator and Sink interfaces for extensibility
  • Template System: File-based templates for regional/sector customization
  • Infrastructure: UUID factory, memory guard, GL account constants

Key Components

Domain Models (models/)

Module Description
journal_entry.rs Journal entry header and balanced line items
chart_of_accounts.rs Hierarchical GL accounts with account types
master_data.rs Enhanced vendors, customers with payment behavior
documents.rs Purchase orders, invoices, goods receipts, payments
temporal.rs Bi-temporal data model for audit trails
anomaly.rs Anomaly types and labels for ML training
internal_control.rs SOX 404 control definitions

Statistical Distributions (distributions/)

Distribution Description
LineItemSampler Empirical distribution (60.68% two-line, 88% even counts)
AmountSampler Log-normal with round-number bias, Benford compliance
TemporalSampler Seasonality patterns with industry integration
BenfordSampler First-digit distribution following P(d) = log10(1 + 1/d)

Infrastructure

Component Description
uuid_factory.rs Deterministic FNV-1a hash-based UUID generation
memory_guard.rs Cross-platform memory tracking with soft/hard limits
accounts.rs Centralized GL control account numbers
templates/ YAML/JSON template loading and merging

Usage

use datasynth_core::models::{JournalEntry, JournalEntryLine};
use datasynth_core::distributions::AmountSampler;

// Create a balanced journal entry
let mut entry = JournalEntry::new(header);
entry.add_line(JournalEntryLine::debit("1100", amount, "AR Invoice"));
entry.add_line(JournalEntryLine::credit("4000", amount, "Revenue"));

// Sample realistic amounts
let sampler = AmountSampler::new(seed);
let amount = sampler.sample_benford_compliant(1000.0, 100000.0);

License

Apache-2.0 - See LICENSE for details.