# DataForge
**DataForge** is a high-performance random data generation and database population library for Rust developers.
## 📋 Prerequisites
A nightly Rust compiler is required. Check the installed toolchain with:
```
$ rustc --version
rustc 1.85.1 (4eb161250 2025-03-15)
```
## ✨ Features
- **High-performance Data Generation**
  - Rust-based high-performance random number generation engine
  - Multi-threaded parallel generation (powered by rayon)
  - Memory pool optimization technology
- **Database Support**
  - Support for MySQL, PostgreSQL, SQLite databases
  - Automatic Schema inference and matching
  - Bulk insert optimization
- **Rich Data Generators**
  - Name generators (Chinese, English, Japanese)
  - Address generators (supports Chinese regional data)
  - Network data generators (email, URL, IP, etc.)
  - Date and time generators
  - Number generators (phone numbers, ID cards, etc.)
- **Flexible Generation Methods**
  - Support for regular expression pattern generation
  - Convenient macro interface
  - Support for custom generator extensions
  - Multi-language data support
## 🚀 Quick Start
### Installation
```toml
[dependencies]
dataforge = "0.1.0"
# Or, to enable optional features, use this line instead of the one above:
# dataforge = { version = "0.1.0", features = ["database"] }
```
### Basic Usage
```rust
use dataforge::forge;
use dataforge::generators::*;
use serde_json::json;

// Generate test user data
let user = forge!({
    "id" => uuid_v4(),
    "name" => name::zh_cn_fullname(),
    "age" => number::adult_age(),
    "email" => internet::email(),
    "phone" => number::phone_number_cn(),
    "address" => json!({
        "province": address::zh_province(),
        "city": "北京市",
        "street": address::zh_address()
    }),
    "created_at" => datetime::iso8601()
});

println!("{}", serde_json::to_string_pretty(&user).unwrap());
```
### Using Macros to Generate Data
```rust
use dataforge::{pattern, rand_num, datetime};
// Generate using patterns
let phone = pattern!("1[3-9]\\d{9}");
// Generate random numbers
let age = rand_num!(18, 65);
// Generate date and time
let timestamp = datetime!("timestamp");
let iso_date = datetime!("iso");
```
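
To make the idea of pattern-based generation concrete, here is a minimal, dependency-free sketch that expands a very small regex subset (literals, `\d`, `[a-b]`, `{n}`) into a random string. It is not how `pattern!` is implemented; the `XorShift64`, `Atom`, and `generate` names are hypothetical and exist only for this illustration.

```rust
/// Tiny xorshift64 PRNG so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Value in `lo..=hi`; the slight modulo bias is acceptable for a sketch.
    fn range(&mut self, lo: u64, hi: u64) -> u64 {
        lo + self.next_u64() % (hi - lo + 1)
    }
}

/// One pattern unit: the characters it may produce.
#[derive(Clone, Copy)]
enum Atom {
    Literal(char),
    Range(char, char), // inclusive bounds, e.g. `[3-9]`
}

/// Expand a tiny regex subset: literals, `\d`, `[a-b]`, and `{n}`.
fn generate(pattern: &str, rng: &mut XorShift64) -> Result<String, String> {
    let chars: Vec<char> = pattern.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        // Parse one atom and advance past it.
        let atom = match chars[i] {
            '\\' => {
                if chars.get(i + 1) != Some(&'d') {
                    return Err("only \\d is supported".to_string());
                }
                i += 2;
                Atom::Range('0', '9')
            }
            '[' => match chars.get(i + 1..i + 5) {
                Some(&[lo, '-', hi, ']']) if lo <= hi => {
                    i += 5;
                    Atom::Range(lo, hi)
                }
                _ => return Err("expected a class like [a-b]".to_string()),
            },
            c => {
                i += 1;
                Atom::Literal(c)
            }
        };
        // Parse an optional `{n}` repetition count.
        let mut count = 1;
        if chars.get(i) == Some(&'{') {
            let close = chars[i..]
                .iter()
                .position(|&c| c == '}')
                .ok_or("unclosed {")?;
            let digits: String = chars[i + 1..i + close].iter().collect();
            count = digits.parse::<usize>().map_err(|e| e.to_string())?;
            i += close + 1;
        }
        // Emit `count` random characters drawn from the atom.
        for _ in 0..count {
            out.push(match atom {
                Atom::Literal(c) => c,
                Atom::Range(lo, hi) => {
                    let offset = rng.range(0, hi as u64 - lo as u64) as u32;
                    char::from_u32(lo as u32 + offset).unwrap_or(lo)
                }
            });
        }
    }
    Ok(out)
}
```

Calling `generate(r"1[3-9]\d{9}", &mut XorShift64::new(42))` yields an 11-digit string that starts with `1` and has a second digit between 3 and 9, matching the phone pattern used above.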
### Core Engine Usage
```rust
use dataforge::core::{CoreEngine, GenConfig, GenerationStrategy};
let config = GenConfig {
    batch_size: 1000,
    strategy: GenerationStrategy::Random,
    null_probability: 0.05,
    ..Default::default()
};
let engine = CoreEngine::new(config);
let data = engine.generate_batch(100)?;

// Get performance metrics
let metrics = engine.metrics();
println!(
    "Generated: {}, Errors: {}",
    metrics.generated_count(),
    metrics.error_count()
);
```
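
The `null_probability` setting presumably controls how often a generated value is replaced by a null. The sketch below shows one plausible reading of that behavior using only the standard library; `XorShift64` and `apply_null_probability` are hypothetical names, and the real engine may apply the setting differently.

```rust
/// Tiny xorshift64 PRNG so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Float in [0, 1) built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Replace each value with `None` with probability `null_probability`.
fn apply_null_probability<T>(
    values: Vec<T>,
    null_probability: f64,
    rng: &mut XorShift64,
) -> Vec<Option<T>> {
    values
        .into_iter()
        .map(|v| {
            if rng.next_f64() < null_probability {
                None
            } else {
                Some(v)
            }
        })
        .collect()
}
```

With `null_probability` set to `0.05`, roughly one value in twenty comes back as `None`; `0.0` keeps every value and `1.0` nulls them all.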
### Database Population
```rust
use dataforge::db::DatabaseForge;
use dataforge::generators::*;

// Create database filler
let forge = DatabaseForge::new("mysql://user:pass@localhost/db");

// Configure table and fill data
let result = forge
    .table("users", 1000, |t| {
        t.field("id", || uuid_v4())
            .field("name", || name::zh_cn_fullname())
            .field("email", || internet::email())
    })
    .fill_sync()?;
println!("Filled {} records", result);
```
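
Bulk insert optimization usually means grouping many rows into a single multi-row `INSERT` instead of issuing one statement per row. The following sketch illustrates that batching step only; `build_batched_inserts` is a hypothetical helper, not part of the DataForge API, and the `?` placeholders follow MySQL and SQLite conventions (PostgreSQL uses `$1`, `$2`, and so on).

```rust
/// Build one multi-row INSERT statement per chunk of at most `batch_size` rows.
fn build_batched_inserts(
    table: &str,
    columns: &[&str],
    row_count: usize,
    batch_size: usize,
) -> Vec<String> {
    assert!(batch_size > 0, "batch_size must be positive");
    // One placeholder group per row, e.g. "(?, ?, ?)" for three columns.
    let row_placeholder = format!("({})", vec!["?"; columns.len()].join(", "));
    let mut statements = Vec::new();
    let mut remaining = row_count;
    while remaining > 0 {
        let rows_in_batch = remaining.min(batch_size);
        let values = vec![row_placeholder.as_str(); rows_in_batch].join(", ");
        statements.push(format!(
            "INSERT INTO {} ({}) VALUES {}",
            table,
            columns.join(", "),
            values
        ));
        remaining -= rows_in_batch;
    }
    statements
}
```

For 5 rows and a batch size of 2, this produces three statements: two with two rows each and a final one with the remaining row.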
### Custom Generators
```rust
use dataforge::{DataForge, Language};
use serde_json::Value;
// Create data generator
let mut forge = DataForge::new(Language::ZhCN);
// Register custom generator (requires the `rand` crate)
forge.register("product_id", || {
    serde_json::json!(format!("PROD-{:06}", rand::random::<u32>() % 1_000_000))
});
// Use custom generator
let product_id = forge.generate("product_id");
```
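
A name-to-closure registry is one common way to support this kind of extension. The sketch below is an assumption about the general shape, not DataForge's actual internals: `GeneratorRegistry` and `registry_with_product_ids` are hypothetical, generators return plain `String` values rather than `serde_json::Value`, and the product ID is sequential instead of random so the result is predictable.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Hypothetical registry mapping generator names to boxed closures.
struct GeneratorRegistry {
    generators: HashMap<String, Box<dyn Fn() -> String>>,
}

impl GeneratorRegistry {
    fn new() -> Self {
        GeneratorRegistry {
            generators: HashMap::new(),
        }
    }

    /// Store a generator under `name`, replacing any previous one.
    fn register<F>(&mut self, name: &str, generator: F)
    where
        F: Fn() -> String + 'static,
    {
        self.generators.insert(name.to_string(), Box::new(generator));
    }

    /// Run the named generator, or return `None` if it was never registered.
    fn generate(&self, name: &str) -> Option<String> {
        self.generators.get(name).map(|g| g())
    }
}

/// Build a registry with a sequential product ID generator.
fn registry_with_product_ids() -> GeneratorRegistry {
    let mut registry = GeneratorRegistry::new();
    // `Cell` lets the `Fn` closure advance its counter through a shared reference.
    let counter = Cell::new(0u32);
    registry.register("product_id", move || {
        let n = counter.get();
        counter.set(n.wrapping_add(1));
        format!("PROD-{:06}", n % 1_000_000)
    });
    registry
}
```

The first call to `registry_with_product_ids().generate("product_id")` returns `Some("PROD-000000".to_string())`, and an unregistered name returns `None`.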
## Generator Types
### Name Generators
- `name::zh_cn_fullname()` - Chinese full name
- `name::en_us_fullname()` - English full name
- `name::ja_jp_fullname()` - Japanese full name
### Address Generators
- `address::zh_province()` - Chinese province
- `address::zh_address()` - Chinese address
- `address::us_state()` - US state name
- `address::us_city()` - US city
### Network Data Generators
- `internet::email()` - Email address
- `internet::url()` - Website URL
- `internet::ip_address()` - IP address
- `internet::mac_address()` - MAC address
- `internet::user_agent()` - User agent string
### Number Generators
- `number::phone_number_cn()` - Chinese mobile number
- `number::id_card_cn()` - Chinese ID card number
- `number::credit_card_number()` - Credit card number
- `number::adult_age()` - Adult age
- `number::currency(min, max)` - Currency amount
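
For context on what a structurally valid Chinese ID card number looks like: the 18th character is a check character computed from the first 17 digits with the ISO 7064 MOD 11-2 scheme (GB 11643-1999). The sketch below computes it; `id_card_check_char` is a hypothetical helper, and this README does not state whether `number::id_card_cn()` emits a valid check character.

```rust
/// Compute the check character for the first 17 digits of a Chinese ID number.
/// Returns `None` if the input is not exactly 17 ASCII digits.
fn id_card_check_char(first17: &str) -> Option<char> {
    // Position weights and the remainder-to-character table from GB 11643-1999.
    const WEIGHTS: [u32; 17] = [7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2];
    const CHECK: [char; 11] = ['1', '0', 'X', '9', '8', '7', '6', '5', '4', '3', '2'];

    if first17.len() != 17 {
        return None;
    }
    let mut sum = 0u32;
    for (c, w) in first17.chars().zip(WEIGHTS.iter()) {
        // `to_digit` rejects anything that is not 0-9.
        sum += c.to_digit(10)? * w;
    }
    Some(CHECK[(sum % 11) as usize])
}
```

For the sample prefix `11010519491231002` the weighted sum is 167, the remainder modulo 11 is 2, and the check character is `X`.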
### Date and Time Generators
- `datetime::iso8601()` - Date in ISO 8601 format
- `datetime::timestamp()` - Timestamp
- `datetime::birthday()` - Birthday date
- `datetime::work_time()` - Work time
## Advanced Features
### Parallel Generation
```rust
use dataforge::core::{CoreEngine, GenConfig, GenerationStrategy};
let config = GenConfig {
    batch_size: 1000,
    strategy: GenerationStrategy::Random,
    parallelism: 4,
    ..Default::default()
};
let engine = CoreEngine::new(config);
let results = engine.generate_batch(10000)?;
```
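
The general technique behind a `parallelism` setting is to split the batch into chunks, generate each chunk on its own worker with an independent random stream, and concatenate the results. The sketch below shows this with `std::thread::scope` rather than rayon so that it stays dependency-free; `XorShift64` and `generate_parallel` are hypothetical names, and the values are just random ages in `18..=65`.

```rust
use std::thread;

/// Tiny xorshift64 PRNG so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Value in `lo..=hi`; the slight modulo bias is acceptable for a sketch.
    fn range(&mut self, lo: u64, hi: u64) -> u64 {
        lo + self.next_u64() % (hi - lo + 1)
    }
}

/// Generate `total` values across `parallelism` worker threads.
fn generate_parallel(total: usize, parallelism: usize, seed: u64) -> Vec<u64> {
    let workers = parallelism.max(1);
    let chunk = (total + workers - 1) / workers; // ceiling division
    thread::scope(|s| {
        // Spawn every worker first so they run concurrently.
        let handles: Vec<_> = (0..workers)
            .map(|w| {
                let start = (w * chunk).min(total);
                let end = (start + chunk).min(total);
                s.spawn(move || {
                    // Derive a distinct seed per worker so streams do not repeat.
                    let mix = (w as u64 + 1).wrapping_mul(0x9E37_79B9_7F4A_7C15);
                    let mut rng = XorShift64::new(seed ^ mix);
                    (start..end).map(|_| rng.range(18, 65)).collect::<Vec<u64>>()
                })
            })
            .collect();
        // Join in spawn order so the output is deterministic for a given seed.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Because each worker owns its generator and results are joined in order, `generate_parallel(10_000, 4, 7)` returns the same 10,000 values on every run.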
### Memory Optimization
```rust
use dataforge::memory::{MemoryPool, MemoryPoolConfig};
let config = MemoryPoolConfig::default();
let mut pool = MemoryPool::new(config);
let buffer = pool.allocate(1024)?;
```
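
A memory pool cuts allocation overhead by handing back previously used buffers instead of allocating new ones. The sketch below shows the core idea with a free list of `Vec<u8>` buffers; `BufferPool` and its methods are hypothetical and much simpler than whatever `MemoryPool` does internally.

```rust
/// Hypothetical buffer pool that recycles byte buffers.
struct BufferPool {
    free: Vec<Vec<u8>>,
}

impl BufferPool {
    fn new() -> Self {
        BufferPool { free: Vec::new() }
    }

    /// Return a zeroed buffer of `size` bytes, reusing a released one if it is large enough.
    fn allocate(&mut self, size: usize) -> Vec<u8> {
        if let Some(pos) = self.free.iter().position(|b| b.capacity() >= size) {
            let mut buf = self.free.swap_remove(pos);
            // Clearing and resizing within capacity does not reallocate.
            buf.clear();
            buf.resize(size, 0);
            return buf;
        }
        vec![0; size]
    }

    /// Hand a buffer back to the pool for later reuse.
    fn release(&mut self, buf: Vec<u8>) {
        self.free.push(buf);
    }

    /// Number of buffers currently waiting to be reused.
    fn idle_buffers(&self) -> usize {
        self.free.len()
    }
}
```

After a 1024-byte buffer is released, a later `allocate(512)` reuses the same heap allocation instead of requesting a new one.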
### Rule Engine
```rust
use dataforge::rules::{RuleEngine, Rule, RuleType};
let mut engine = RuleEngine::new();
engine.add_rule(Rule {
    name: "adult_user".to_string(),
    rule_type: RuleType::Condition,
    condition: "age >= 18".to_string(),
    action: "generate_adult_data".to_string(),
});
```
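
The README does not specify the condition syntax, so the following is only a sketch of how a string such as `"age >= 18"` could be evaluated against a record. It assumes a `field operator integer` form separated by whitespace; `eval_condition` is a hypothetical function, not part of `dataforge::rules`.

```rust
use std::collections::HashMap;

/// Evaluate a condition of the form `<field> <op> <integer>` against a record.
/// Returns `None` if the condition is malformed or the field is missing.
fn eval_condition(condition: &str, record: &HashMap<&str, i64>) -> Option<bool> {
    let mut parts = condition.split_whitespace();
    let field = parts.next()?;
    let op = parts.next()?;
    let rhs: i64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // trailing tokens are not supported in this sketch
    }
    let lhs = *record.get(field)?;
    Some(match op {
        ">=" => lhs >= rhs,
        ">" => lhs > rhs,
        "<=" => lhs <= rhs,
        "<" => lhs < rhs,
        "==" => lhs == rhs,
        "!=" => lhs != rhs,
        _ => return None,
    })
}
```

A record with `age` set to 20 satisfies `"age >= 18"`, while one with `age` set to 17 does not; an unknown operator or a missing field yields `None`.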
## Configuration File Support
DataForge supports TOML and YAML configuration files. A TOML example:
```toml
# dataforge.toml
[generation]
batch_size = 1000
strategy = "Random"
null_probability = 0.05
[database]
url = "mysql://user:pass@localhost/db"
batch_size = 5000
```
## Performance Features
- **Multi-threaded Parallelism**: Efficient parallel processing based on rayon
- **Memory Pool**: Reduce memory allocation overhead
- **Batch Operations**: Optimize database insert performance
- **Lazy Loading**: Load data files on demand
- **Zero Copy**: Reduce unnecessary memory copying
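
As an illustration of the lazy loading point, the sketch below defers loading a data set until first use and then caches it with `std::sync::OnceLock` (stable since Rust 1.70). The inline string stands in for a data file, and `provinces`, `PROVINCES`, and `LOAD_COUNT` are hypothetical names; how DataForge actually loads its files is not described here.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how many times the data set was actually loaded.
static LOAD_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Cache that is filled on first access and shared afterwards.
static PROVINCES: OnceLock<Vec<&'static str>> = OnceLock::new();

/// Return the province list, loading it only on the first call.
fn provinces() -> &'static [&'static str] {
    PROVINCES.get_or_init(|| {
        LOAD_COUNT.fetch_add(1, Ordering::SeqCst);
        // Stand-in for reading and parsing a file under `data/`.
        "北京市\n上海市\n广东省".lines().collect()
    })
}
```

Repeated calls to `provinces()` return the same cached slice, and `LOAD_COUNT` stays at 1 no matter how many times it is called.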
## Project Structure
```
dataforge/
├── src/
│   ├── core.rs            # Core engine
│   ├── generators/        # Data generators
│   ├── regions/           # Regional data
│   ├── filling/           # Database filling
│   ├── multithreading/    # Multi-threaded processing
│   ├── memory/            # Memory management
│   ├── customization/     # User customization
│   ├── generation/        # Data generation
│   ├── db/                # Database related
│   │   └── schema.rs      # Schema parsing
│   ├── config.rs          # Configuration management
│   ├── rules/             # Rule engine
│   └── macros.rs          # Macro definitions
├── data/                  # External data files
├── tests/                 # Test files
└── doc/                   # Documentation
```
## 📚 Ecosystem
- `dataforge-faker`: Ruby Faker-compatible syntax
- `dataforge-sqlx`: Async database support via sqlx
- `dataforge-cli`: Command-line data generation tool
## License
This project is dual-licensed under either the MIT License or the Apache License 2.0.
## Contributing
Issues and pull requests are welcome!