seedfaker-core 0.1.0-alpha.1

Core library for seedfaker — deterministic synthetic generator for realistic, correlated, and noisy test records

seedfaker

Deterministic synthetic generator for realistic, correlated, and noisy test records across 68 locales.
Rust CLI + Python + Node.js + Go + PHP + Ruby + MCP.

$ seedfaker name email phone --ctx strict --locale en -n 5 --seed hero --until 2025
Janet Marsh     janet.marsh@inbox.com           +1 (957) 226-4272
Emma Hines      hinesy2@caltech.edu             (779) 640-3402
Amy Schwartz    aschwartzl@yahoo.com            +1-566-391-4136
Ronald Elliott  elliottronaldr@wellsfargo.com   557-470-1277
Cruz Hoffman    hot.hoffman65@slack.com         (432) 491-2668

Same seed — identical output, every run, every machine. Adding or removing fields does not change existing ones. 214 fields. 68 locales. Native scripts.

Install

cargo install seedfaker                        # Rust CLI
pip install seedfaker                          # Python (PyO3 native)
npm install @opendsr/seedfaker                 # Node.js (NAPI native)
go get github.com/opendsr-std/seedfaker-go     # Go (CGO)
composer require opendsr/seedfaker             # PHP (FFI)
gem install seedfaker                          # Ruby (FFI)

What it does

Generation

214 fields across 17 groups with modifiers, ranges, and transforms. 68 locales with native scripts:

$ seedfaker phone:e164 amount:usd credit-card:space -n 3 --seed readme --until 2025
+47412578114     $793.66   3715 236662 87984
+3118148237758   $123.30   4174 0785 8323 6433
+4901707888425   $473.87   3736 553912 88602

$ seedfaker name -l be --abc native -n 3 --seed readme --until 2025
Лявон Леўшкін
Валянціна Асіпенка
Камілія Анікееў

$ seedfaker name -l ja --abc native -n 3 --seed readme --until 2025
石本 和彦
楠木 康夫
川田 朋恵

Data quality

--ctx strict locks all fields in a record to one identity — email follows name, phone matches locale:

$ seedfaker name email phone --ctx strict --locale en -n 3 --seed docs --until 2025
Eric Martin       eric.martin.xxy@okta.com         498-944-8646
Rayyan Shelton    scroll.rayyan7721@aol.com         584-542-1839
Kimberly Harvey   kimberlyloot03@aol.com            439-347-4269
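
The idea behind correlated fields can be sketched in a few lines of Python. This is an illustration only, not seedfaker's implementation: the `derive` helper, the toy dictionaries, and the email pattern are all assumptions made for the example. The point is that the email is built from the identity already chosen for the record instead of being drawn independently.

```python
import hashlib

# Toy dictionaries (hypothetical; the real tool ships its own data).
FIRST = ["Eric", "Rayyan", "Kimberly"]
LAST = ["Martin", "Shelton", "Harvey"]
DOMAINS = ["example.com", "example.org"]


def derive(seed: str, record_number: int, field_name: str) -> int:
    """Hypothetical helper: stable 64-bit integer per (seed, record, field)."""
    key = f"{seed}\x00{record_number}\x00{field_name}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


def strict_record(seed: str, n: int) -> dict[str, str]:
    """Pick one identity, then build dependent fields from it."""
    first = FIRST[derive(seed, n, "first") % len(FIRST)]
    last = LAST[derive(seed, n, "last") % len(LAST)]
    domain = DOMAINS[derive(seed, n, "domain") % len(DOMAINS)]
    # The email reuses the chosen name, so the two fields cannot disagree.
    email = f"{first}.{last}@{domain}".lower()
    return {"name": f"{first} {last}", "email": email}


rec = strict_record("docs", 0)
assert rec["email"].startswith(rec["name"].split()[0].lower())
```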

--corrupt degrades values — OCR errors, mojibake, truncation, field swaps:

$ seedfaker name email --corrupt mid -n 5 --seed demo --until 2025
Paulina Laca                                    im.ivana@eunet.rso6Wzw
Irene MichaelidesFmfL Irene MichaelidesFm fL    sigitas.staniulis@protonmail.com
Elv!ra C@stro Gonz@13z                          imhannes@omv.com
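
A rough Python sketch of seeded corruption follows. It is not seedfaker's algorithm: the three corruption kinds, the lookalike table, and the `derive` helper are assumptions chosen to resemble the sample output above. Each corruption is a pure function, and the choice of which one to apply comes from a hash, so the same seed always damages the same value in the same way.

```python
import hashlib

# Hypothetical lookalike table, loosely modeled on OCR-style confusions.
LOOKALIKES = {"a": "@", "e": "3", "i": "!", "l": "1", "o": "0"}


def derive(seed: str, record_number: int, field_name: str) -> int:
    """Hypothetical helper: stable 64-bit integer per (seed, record, field)."""
    key = f"{seed}\x00{record_number}\x00{field_name}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


def truncate(value: str, roll: int) -> str:
    """Keep a non-empty prefix whose length depends on the roll."""
    return value[: 1 + roll % len(value)] if value else value


def lookalike(value: str) -> str:
    """Swap selected lowercase letters for similar-looking symbols."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in value)


def mojibake(value: str) -> str:
    """Decode UTF-8 bytes as Latin-1, the classic encoding mix-up."""
    return value.encode("utf-8").decode("latin-1")


def corrupt(value: str, seed: str, n: int, field: str) -> str:
    """Deterministically leave the value clean or apply one corruption."""
    roll = derive(seed, n, f"{field}:corrupt")
    kind = roll % 4
    if kind == 0:
        return value
    if kind == 1:
        return truncate(value, roll >> 2)
    if kind == 2:
        return lookalike(value)
    return mojibake(value)
```

ASCII-only values pass through `mojibake` unchanged; the damage shows up only on non-ASCII characters, as in `mojibake("Léa")`, which yields `LÃ©a`.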

Workflows

Expressions — arithmetic between columns:

$ seedfaker price=amount:1..500:plain qty=integer:1..20 "total=price*qty" --seed shop --format csv -n 5 --until 2025
price,qty,total
424.49,14,5942.86
459.67,3,1379.01
309.44,12,3713.28
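
The CSV above can be checked independently. The short Python sketch below recomputes `price*qty` for each row with exact decimal arithmetic; the function name is made up for the example, and the data is copied from the output shown.

```python
import csv
import io
from decimal import Decimal

SAMPLE = """price,qty,total
424.49,14,5942.86
459.67,3,1379.01
309.44,12,3713.28
"""


def recompute_totals(text: str) -> list[Decimal]:
    """Return price * qty for every row, using exact decimal arithmetic."""
    rows = csv.DictReader(io.StringIO(text))
    return [Decimal(row["price"]) * int(row["qty"]) for row in rows]


expected = [Decimal("5942.86"), Decimal("1379.01"), Decimal("3713.28")]
assert recompute_totals(SAMPLE) == expected
```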

Templates — free-form output with conditionals and loops:

$ seedfaker name email -t '{{name}} <{{email}}>' --seed demo -n 3 --until 2025
Paulina Laca <im.ivana@eunet.rs>
Irene Michaelides <sigitas.staniulis@protonmail.com>
Elvira Castro Gonzalez <imhannes@omv.com>
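
For readers unfamiliar with `{{field}}` placeholders, here is a minimal Python sketch of plain substitution. It covers only the placeholder case, not conditionals or loops, and it is an assumption about the general technique rather than a description of seedfaker's template engine.

```python
import re

# Matches {{name}}, {{ email }}, {{credit-card}} and similar placeholders.
PLACEHOLDER = re.compile(r"\{\{\s*([\w-]+)\s*\}\}")


def render(template: str, record: dict[str, str]) -> str:
    """Replace each {{field}} with the matching value from the record."""
    return PLACEHOLDER.sub(lambda m: str(record[m.group(1)]), template)


row = {"name": "Paulina Laca", "email": "im.ivana@eunet.rs"}
assert render("{{name}} <{{email}}>", row) == "Paulina Laca <im.ivana@eunet.rs>"
```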

13 presets for common formats. Replace PII in existing data. Stream to pipes:

seedfaker run nginx -n 0 --rate 5000 --seed demo | kafka-console-producer --bootstrap-server localhost:9092 --topic logs   # broker address is an example
seedfaker name email --format sql=users -n 10000 --seed ci --until 2025 | psql mydb

Determinism

Each field is independently derived from (seed, record_number, field_name). This means:

  • Same seed — identical output, byte-for-byte
  • Adding field B does not change field A
  • Output format (CSV, JSONL, SQL) does not affect values
  • Record N is always the same regardless of -n
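
The properties listed above follow from keying every value on its own (seed, record_number, field_name) triple. The Python sketch below illustrates the principle. It is not seedfaker's actual derivation: the SHA-256 key, the toy name list, and the function names are assumptions made for the example.

```python
import hashlib

FIRST_NAMES = ["Janet", "Emma", "Amy", "Ronald", "Cruz"]  # toy dictionary


def derive(seed: str, record_number: int, field_name: str) -> int:
    """Map (seed, record_number, field_name) to a stable 64-bit integer."""
    key = f"{seed}\x00{record_number}\x00{field_name}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


def record(seed: str, n: int, fields: list[str]) -> dict[str, object]:
    """Build record n; each field depends only on its own derived value."""
    out: dict[str, object] = {}
    for field in fields:
        value = derive(seed, n, field)
        if field == "name":
            out[field] = FIRST_NAMES[value % len(FIRST_NAMES)]
        else:
            out[field] = value % 1000  # stand-in for any other field type
    return out


# Adding a field does not change existing ones, and record 7 does not
# depend on how many records were requested before or after it.
assert record("demo", 7, ["name"])["name"] == record("demo", 7, ["name", "qty"])["name"]
```

Because no value depends on a shared random stream, there is no state that another field, record, or output format could advance.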

Important: --until defaults to the current system time. Without pinning it, date and timestamp fields will shift across runs. Always use --until with --seed for full reproducibility:

seedfaker name email --seed demo --until 2025 -n 3
# Paulina Laca    im.ivana@eunet.rs
# Run again — same output. Change seed — different output.

A warning is printed when --seed is set without --until (suppress with -q). See determinism.
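
To see why an unpinned upper bound breaks reproducibility, consider the Python sketch below. It assumes, purely for illustration, that a timestamp is produced by stepping back a seeded offset from the `until` bound; the window size and helper names are invented for the example and do not describe seedfaker's internals.

```python
import hashlib
from datetime import datetime, timedelta, timezone


def derive(seed: str, record_number: int, field_name: str) -> int:
    """Hypothetical helper: stable 64-bit integer per (seed, record, field)."""
    key = f"{seed}\x00{record_number}\x00{field_name}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


def fake_timestamp(seed: str, n: int, until: datetime, window_days: int = 3650) -> datetime:
    """Step back a seeded number of seconds from the upper bound."""
    offset = derive(seed, n, "timestamp") % (window_days * 86400)
    return until - timedelta(seconds=offset)


# An arbitrary fixed instant standing in for a pinned upper bound.
pinned = datetime(2025, 1, 1, tzinfo=timezone.utc)
assert fake_timestamp("demo", 0, pinned) == fake_timestamp("demo", 0, pinned)

# Moving the bound by one day moves the result by one day, even though
# the seed is unchanged. A bound of "now" therefore drifts on every run.
later = pinned + timedelta(days=1)
assert fake_timestamp("demo", 0, later) - fake_timestamp("demo", 0, pinned) == timedelta(days=1)
```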

How it compares

                Traditional faker   seedfaker
Determinism     Random by default   Byte-identical with --seed
Correlation     Fields independent  Email follows name, phone follows locale
Distribution    Uniform             Realistic (names from dictionaries, tiered amounts)
Data quality    Always clean        15 corruption types, 4 levels
Scale           Library calls       CLI streaming, configs, SQL/CSV/JSONL/template
Cross-language  Language-specific   Same output: CLI = Python = Node.js = Go

Performance

Rust core with native bindings — no subprocess, no serialization overhead.

Runtime         3 fields   10 fields   20 fields
CLI (100K)      0.049s     0.142s      0.304s
Python (10K)    0.010s     0.028s      0.058s
Node.js (10K)   0.015s     0.064s      0.194s

Per-field benchmarks · CLI tiers · vs competitors · uniqueness

Documentation

Start here     Quick start
CLI            Commands, flags, and formats
Fields         214 fields: syntax, ranges, modifiers · Full reference
Configs        YAML configs, templates, presets · Expressions and aggregators
Features       Context · Corruption · Replace · Streaming
Integrations   Library API (Python, Node.js, Go, PHP, Ruby) · MCP server

Safety

Generated data is synthetic. Not for authentication, identity verification, or compliance. Passwords are deterministic — never use them for real authentication. See password modifiers.

License

MIT