datasynth-graph
Graph and network export for synthetic accounting data, in ML-ready formats.
Overview
datasynth-graph provides graph construction and export capabilities:
- Graph Builders: Transaction, approval, and entity relationship graphs
- ML Export: PyTorch Geometric, Neo4j, and DGL formats
- Feature Engineering: Temporal, amount, structural, and categorical features
- Data Splits: Train/validation/test split generation
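As a rough illustration of the Data Splits item, a train/validation/test split can be represented as three boolean masks over node (or edge) indices. The sketch below is not the crate's API: `SplitMasks`, `split_masks`, the ratio parameters, and the seeded xorshift shuffle are all assumptions chosen to keep the example dependency-free.

```rust
/// Hypothetical container, not part of the datasynth-graph API.
pub struct SplitMasks {
    pub train: Vec<bool>,
    pub val: Vec<bool>,
    pub test: Vec<bool>,
}

/// Assigns each of `n` items to exactly one of train/val/test.
/// The remainder after `train_ratio` and `val_ratio` goes to test.
pub fn split_masks(n: usize, train_ratio: f64, val_ratio: f64, seed: u64) -> SplitMasks {
    assert!(train_ratio >= 0.0 && val_ratio >= 0.0 && train_ratio + val_ratio <= 1.0);

    // Fisher-Yates shuffle driven by a small xorshift64 generator,
    // so the same seed always yields the same split.
    let mut state = seed.max(1); // xorshift must not start at zero
    let mut next = move || {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        state
    };
    let mut order: Vec<usize> = (0..n).collect();
    for i in (1..n).rev() {
        let j = (next() % (i as u64 + 1)) as usize;
        order.swap(i, j);
    }

    let n_train = (n as f64 * train_ratio).floor() as usize;
    let n_val = (n as f64 * val_ratio).floor() as usize;

    let mut masks = SplitMasks {
        train: vec![false; n],
        val: vec![false; n],
        test: vec![false; n],
    };
    // Position in the shuffled order decides which mask an item lands in.
    for (rank, &idx) in order.iter().enumerate() {
        if rank < n_train {
            masks.train[idx] = true;
        } else if rank < n_train + n_val {
            masks.val[idx] = true;
        } else {
            masks.test[idx] = true;
        }
    }
    masks
}
```

Calling `split_masks(100, 0.5, 0.25, 42)` would mark 50 items for training, 25 for validation, and 25 for testing, and repeating the call with the same seed reproduces the same assignment.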
Graph Types
| Graph | Nodes | Edges | Use Case |
|---|---|---|---|
| Transaction Network | Accounts, Entities | Transactions | Anomaly detection |
| Approval Network | Users | Approvals | Segregation of duties (SoD) analysis |
| Entity Relationship | Legal Entities | Ownership | Consolidation analysis |
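To make the node and edge columns concrete, here is a minimal sketch of how a transaction network might be assembled: accounts become nodes and each transaction becomes a directed edge. The `Transaction` and `TransactionGraph` types and their fields are hypothetical and only illustrate the idea; the crate's real builders may look quite different.

```rust
use std::collections::HashMap;

/// Hypothetical input record: one posting that moves `amount`
/// from one account to another.
pub struct Transaction {
    pub from_account: String,
    pub to_account: String,
    pub amount: f64,
}

/// Minimal directed graph: node names indexed by id,
/// plus (source id, target id, amount) edges.
#[derive(Default)]
pub struct TransactionGraph {
    pub nodes: Vec<String>,
    pub edges: Vec<(usize, usize, f64)>,
    index: HashMap<String, usize>,
}

impl TransactionGraph {
    /// Returns the node id for `account`, creating the node on first sight.
    fn node_id(&mut self, account: &str) -> usize {
        if let Some(&id) = self.index.get(account) {
            return id;
        }
        let id = self.nodes.len();
        self.nodes.push(account.to_string());
        self.index.insert(account.to_string(), id);
        id
    }

    /// Builds the graph by adding one edge per transaction.
    pub fn from_transactions(txs: &[Transaction]) -> Self {
        let mut g = Self::default();
        for tx in txs {
            let src = g.node_id(&tx.from_account);
            let dst = g.node_id(&tx.to_account);
            g.edges.push((src, dst, tx.amount));
        }
        g
    }
}
```

Node ids are assigned in order of first appearance, which keeps them dense (0..num_nodes) and therefore directly usable as row indices in a feature matrix.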
Export Formats
PyTorch Geometric
graphs/transaction_network/pytorch_geometric/
├── node_features.pt # [num_nodes, num_features]
├── edge_index.pt # [2, num_edges]
├── edge_attr.pt # [num_edges, num_edge_features]
├── labels.pt # [num_nodes] or [num_edges]
├── train_mask.pt # Boolean mask, training split
├── val_mask.pt # Boolean mask, validation split
└── test_mask.pt # Boolean mask, test split
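The `[2, num_edges]` shape of `edge_index.pt` is the coordinate (COO) layout that PyTorch Geometric expects: row 0 holds source node ids and row 1 holds target node ids. The sketch below shows that layout using plain vectors; `to_edge_index` and `flatten_row_major` are hypothetical helper names, and writing the actual `.pt` file is out of scope here.

```rust
/// Hypothetical helper: converts (source, target) pairs into the
/// two-row COO layout, row 0 = sources, row 1 = targets.
pub fn to_edge_index(edges: &[(usize, usize)]) -> [Vec<i64>; 2] {
    let sources = edges.iter().map(|&(s, _)| s as i64).collect();
    let targets = edges.iter().map(|&(_, t)| t as i64).collect();
    [sources, targets]
}

/// Flattens the two rows into one row-major buffer of length 2 * num_edges,
/// which is how a contiguous [2, num_edges] tensor is laid out in memory.
pub fn flatten_row_major(edge_index: &[Vec<i64>; 2]) -> Vec<i64> {
    edge_index.iter().flatten().copied().collect()
}
```

For the edges `(0, 1)`, `(1, 2)`, `(0, 2)`, the two rows would be `[0, 1, 0]` and `[1, 2, 2]`. Edge attributes in `edge_attr.pt` follow the same column order, so column `k` of the index and row `k` of the attributes describe the same edge.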
Neo4j
graphs/entity_relationship/neo4j/
├── nodes_account.csv
├── nodes_entity.csv
├── edges_transaction.csv
└── import.cypher
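Node and edge CSV files like the ones listed above need careful quoting, because account or entity names can contain commas and quotes. The following is a minimal sketch of RFC 4180 style field escaping; `csv_field` and `csv_row` are hypothetical helpers, and the column layout the crate actually writes is not specified in this README.

```rust
/// Quotes a CSV field when it contains a comma, quote, or line break,
/// doubling any embedded quotes (RFC 4180 style).
pub fn csv_field(value: &str) -> String {
    if value.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", value.replace('"', "\"\""))
    } else {
        value.to_string()
    }
}

/// Joins already-ordered fields into one CSV line (without a trailing newline).
pub fn csv_row(fields: &[&str]) -> String {
    fields
        .iter()
        .map(|f| csv_field(f))
        .collect::<Vec<_>>()
        .join(",")
}
```

A row such as `csv_row(&["1000", "Cash, petty", "asset"])` keeps the embedded comma inside a quoted field, so a CSV importer reads three columns rather than four.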
Features
| Category | Features |
|---|---|
| Temporal | weekday, period, is_month_end, is_quarter_end, is_year_end |
| Amount | log(amount), benford_probability, is_round_number |
| Structural | line_count, unique_accounts, has_intercompany |
| Categorical | business_process (one-hot), source_type (one-hot) |
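The amount features in the table can be sketched as follows. Benford's law gives the expected probability of a leading digit d as log10(1 + 1/d). Everything else here is an assumption made for illustration: the names `AmountFeatures`, `leading_digit`, `benford_probability`, and `amount_features` are hypothetical, the logarithm is taken as the natural log, and "round number" is interpreted as a whole multiple of 100.

```rust
/// Hypothetical feature record for one transaction amount.
pub struct AmountFeatures {
    pub log_amount: f64,
    pub benford_probability: f64,
    pub is_round_number: bool,
}

/// First significant decimal digit of a positive amount, e.g. 0.042 -> 4.
pub fn leading_digit(amount: f64) -> Option<u32> {
    if !amount.is_finite() || amount <= 0.0 {
        return None;
    }
    // Scientific notation puts the first significant digit first: "4.2e-2".
    format!("{:e}", amount).chars().next()?.to_digit(10)
}

/// Benford's law: P(d) = log10(1 + 1/d) for a leading digit d in 1..=9.
pub fn benford_probability(digit: u32) -> f64 {
    (1.0 + 1.0 / digit as f64).log10()
}

/// Computes the three amount features; returns None for non-positive
/// or non-finite amounts, where the logarithm is undefined.
pub fn amount_features(amount: f64) -> Option<AmountFeatures> {
    let digit = leading_digit(amount)?;
    Some(AmountFeatures {
        // Assumption: natural logarithm; the table only says log(amount).
        log_amount: amount.ln(),
        benford_probability: benford_probability(digit),
        // Assumption: "round" means a whole multiple of 100.
        is_round_number: amount % 100.0 == 0.0,
    })
}
```

Under these assumptions, an amount of 2500.0 has leading digit 2, a Benford probability of log10(1.5), and is flagged as a round number. Amounts whose leading digits are rare under Benford's law, combined with the round-number flag, are the kind of signal an anomaly detection model might pick up on.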
Usage
use ;
let builder = new;
let graph = builder.build?;
let exporter = new;
exporter.export?;
License
Apache-2.0 - See LICENSE for details.