seedfaker
Deterministic synthetic generator for realistic, correlated, and noisy test records across 65+ locales.
Rust CLI + Python + Node.js + Go + PHP + Ruby + MCP.
Same seed — identical output, every run, every machine. Adding or removing fields does not change existing ones. 214 fields. 68 locales. Native scripts.
Install
What it does
Generation
214 fields across 17 groups with modifiers, ranges, and transforms. 68 locales with native scripts:
Data quality
--ctx strict locks all fields in a record to one identity — email follows name, phone matches locale:
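The idea of locking a record to one identity can be sketched in Python. This is a minimal illustration of correlated generation, not seedfaker's implementation: the `LOCALES` table, the `make_identity` function, and the example domains are all hypothetical names invented here.

```python
import hashlib
import random

# Hypothetical per-locale data, invented for this sketch only.
LOCALES = {
    "en": {"first": ["Alice", "Brian"], "last": ["Carter", "Doyle"],
           "dial": "+1", "domain": "example.com"},
    "de": {"first": ["Anna", "Lukas"], "last": ["Becker", "Vogel"],
           "dial": "+49", "domain": "example.de"},
}


def make_identity(seed: str, record: int) -> dict:
    """Build one record whose fields all follow the same identity."""
    digest = hashlib.sha256(f"{seed}:{record}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))

    # Pick the identity first, then derive the dependent fields from it.
    locale = rng.choice(sorted(LOCALES))
    data = LOCALES[locale]
    first = rng.choice(data["first"])
    last = rng.choice(data["last"])

    return {
        "locale": locale,
        "name": f"{first} {last}",
        # Email is built from the chosen name, so it can never disagree.
        "email": f"{first}.{last}@{data['domain']}".lower(),
        # Phone prefix comes from the same locale as the name.
        "phone": f"{data['dial']} {rng.randrange(10**9):09d}",
    }


if __name__ == "__main__":
    print(make_identity("demo", 0))
```

The point of the design is ordering: the identity (locale, name) is chosen once, and every dependent field reads from it instead of drawing its own independent value.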
--corrupt degrades values — OCR errors, mojibake, truncation, field swaps:
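To make the corruption types concrete, here is a rough Python sketch of three of them (OCR look-alikes, mojibake, truncation) applied reproducibly. It assumes nothing about seedfaker's internals: `corrupt`, `OPERATIONS`, and the `rate` parameter are hypothetical, and field swaps are left out for brevity.

```python
import random

# Common OCR look-alike substitutions (illustrative, not exhaustive).
OCR_CONFUSIONS = {"0": "O", "O": "0", "1": "l", "l": "1", "5": "S", "S": "5"}


def ocr_errors(value: str, rng: random.Random) -> str:
    """Swap look-alike characters, each with 50% probability."""
    return "".join(
        OCR_CONFUSIONS[ch] if ch in OCR_CONFUSIONS and rng.random() < 0.5 else ch
        for ch in value
    )


def mojibake(value: str, rng: random.Random) -> str:
    """Decode UTF-8 bytes as Latin-1, so 'é' becomes 'Ã©'."""
    return value.encode("utf-8").decode("latin-1")


def truncate(value: str, rng: random.Random) -> str:
    """Cut the value at a random position, keeping at least one character."""
    if len(value) < 2:
        return value
    return value[: rng.randrange(1, len(value))]


OPERATIONS = [ocr_errors, mojibake, truncate]


def corrupt(record: dict, seed: int, rate: float) -> dict:
    """Corrupt each string field with probability `rate`, reproducibly."""
    rng = random.Random(seed)
    out = {}
    for key in sorted(record):  # fixed order keeps the RNG stream stable
        value = record[key]
        if isinstance(value, str) and rng.random() < rate:
            value = rng.choice(OPERATIONS)(value, rng)
        out[key] = value
    return out


if __name__ == "__main__":
    print(corrupt({"name": "Zoë Söll", "zip": "10115"}, seed=7, rate=1.0))
```

Because the random generator is seeded, the same record and seed always yield the same damaged output, which keeps noisy test data reproducible.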
Workflows
Expressions — arithmetic between columns:
Templates — free-form output with conditionals and loops:
13 presets for common formats. Replace PII in existing data. Stream to pipes:
Determinism
Each field is independently derived from (seed, record_number, field_name). This means:
- Same seed — identical output, byte-for-byte
- Adding field B does not change field A
- Output format (CSV, JSONL, SQL) does not affect values
- Record N is always the same regardless of -n
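The properties listed above follow directly from keying every value on (seed, record_number, field_name). The Python sketch below shows one way such a derivation could work, using SHA-256 as the mixing function. The hash choice, the `derive` and `generate` helpers, and the `POOLS` dictionaries are assumptions made for illustration; the source does not say which function seedfaker uses.

```python
import hashlib

# Hypothetical value pools, invented for this sketch only.
POOLS = {
    "name": ["Ana", "Boris", "Chen", "Dana"],
    "city": ["Lyon", "Oslo", "Pune", "Quito"],
    "color": ["red", "green", "blue", "black"],
}


def derive(seed: str, record_number: int, field_name: str) -> int:
    """Map (seed, record_number, field_name) to a stable 64-bit integer."""
    # The unit separator avoids collisions such as ("a", 12) vs ("a1", 2).
    key = f"{seed}\x1f{record_number}\x1f{field_name}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


def generate(seed: str, n: int, fields: list) -> list:
    """Generate n records; each value depends only on its own key."""
    return [
        {f: POOLS[f][derive(seed, i, f) % len(POOLS[f])] for f in fields}
        for i in range(n)
    ]


if __name__ == "__main__":
    narrow = generate("demo", 3, ["name"])
    wide = generate("demo", 3, ["name", "city"])
    # Adding "city" leaves every "name" value untouched.
    print([r["name"] for r in narrow] == [r["name"] for r in wide])
```

No state is shared between fields or records, so there is no sequence to disturb: adding a field, changing the record count, or changing the output format cannot alter an existing value.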
Important: --until defaults to the current system time. Without pinning it, date and timestamp fields will shift across runs. Always use --until with --seed for full reproducibility:
# Paulina Laca im.ivana@eunet.rs
# Run again — same output. Change seed — different output.
A warning is printed when --seed is set without --until (suppress with -q). See determinism.
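Why an unpinned upper bound breaks reproducibility can be shown with a small Python sketch. It assumes timestamps are computed as a seeded offset back from the `until` bound, which is a plausible model and not a confirmed description of seedfaker; `fake_timestamp` and `window_days` are hypothetical names.

```python
import hashlib
from datetime import datetime, timedelta, timezone


def fake_timestamp(seed: str, record: int, until: datetime,
                   window_days: int = 365) -> datetime:
    """Return a timestamp a seeded number of seconds before `until`."""
    digest = hashlib.sha256(f"{seed}:{record}:timestamp".encode("utf-8")).digest()
    offset = int.from_bytes(digest[:8], "big") % (window_days * 86400)
    return until - timedelta(seconds=offset)


if __name__ == "__main__":
    pinned = datetime(2025, 1, 1, tzinfo=timezone.utc)

    # Pinned bound: the offset and the anchor are both fixed.
    print(fake_timestamp("demo", 0, pinned) == fake_timestamp("demo", 0, pinned))

    # Unpinned bound: the offset is identical, but the anchor is "now",
    # so the resulting value moves with the wall clock.
    print(fake_timestamp("demo", 0, datetime.now(timezone.utc)))
```

Under this model the seed fixes only the offset. The absolute value is reproducible only when the anchor is fixed as well, which is why the seed and the upper bound need to be set together.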
How it compares
| | Traditional faker | seedfaker |
|---|---|---|
| Determinism | Random by default | Byte-identical with --seed |
| Correlation | Fields independent | Email follows name, phone follows locale |
| Distribution | Uniform | Realistic (names from dictionaries, tiered amounts) |
| Data quality | Always clean | 15 corruption types, 4 levels |
| Scale | Library calls | CLI streaming, configs, SQL/CSV/JSONL/template |
| Cross-language | Language-specific | Same output: CLI = Python = Node.js = Go |
Performance
Rust core with native bindings — no subprocess, no serialization overhead.
| Runtime | 3 fields | 10 fields | 20 fields |
|---|---|---|---|
| CLI (100K) | 0.049s | 0.142s | 0.304s |
| Python (10K) | 0.010s | 0.028s | 0.058s |
| Node.js (10K) | 0.015s | 0.064s | 0.194s |
Per-field benchmarks · CLI tiers · vs competitors · uniqueness
Documentation
| | |
|---|---|
| Start here | Quick start |
| CLI | Commands, flags, and formats |
| Fields | 214 fields — syntax, ranges, modifiers · Full reference |
| Configs | YAML configs, templates, presets · Expressions and aggregators |
| Features | Context · Corruption · Replace · Streaming |
| Integrations | Library API (Python, Node.js, Go, PHP, Ruby) · MCP server |
Safety
Generated data is synthetic. Not for authentication, identity verification, or compliance. Passwords are deterministic — never use them for real authentication. See password modifiers.
License
MIT