dataprof 0.4.83

High-performance data profiler with ISO 8000/25012 quality metrics for CSV, JSON/JSONL, and Parquet files
Documentation
# Development Workflow ๐Ÿ› ๏ธ

This document describes the development workflow for DataProfiler using the **staging** branch for development and **master** for production releases.

## ๐ŸŒณ Branch Strategy

```
master (production)
  โ†‘
  โ””โ”€โ”€ staging (development/integration)
        โ†‘
        โ”œโ”€โ”€ feature/database-connectors
        โ”œโ”€โ”€ feature/arrow-integration
        โ”œโ”€โ”€ bugfix/unwrap-cleanup
        โ””โ”€โ”€ ... (other feature branches)
```

### Branch Purposes

- **`master`**: Production-ready code, stable releases only
- **`staging`**: Integration branch for development, pre-production testing
- **`feature/*`**: Individual feature development branches
- **`bugfix/*`**: Bug fix branches

## ๐Ÿ”„ Development Workflow

### 1. Starting New Work

```bash
# Always start from latest staging
git checkout staging
git pull origin staging

# Create feature branch
git checkout -b feature/your-feature-name

# Work on your feature...
git add .
git commit -m "feat: implement your feature"
git push -u origin feature/your-feature-name
```

### 2. Pull Request to Staging

1. **Create PR**: `feature/your-feature` โ†’ `staging`
2. **Automated checks run**: Comprehensive staging workflow
3. **Code review**: Team reviews the changes
4. **Merge**: Once approved and checks pass

### 3. Staging Integration

- All features are integrated and tested in `staging`
- Staging has comprehensive CI/CD with:
  - โœ… Code quality checks (format, lint, unwrap detection)
  - โœ… Multi-platform testing (Linux, Windows, macOS)
  - โœ… Performance monitoring
  - โœ… Memory leak detection
  - โœ… Security audits
  - โœ… Integration testing

### 4. Production Release

When staging is ready for release:

```bash
# Create production release PR
git checkout staging
git pull origin staging
gh pr create --base master --title "Release v0.3.2" --body "Production release from staging"
```

**Production PR includes**:
- ๐Ÿ”’ **Security audit**: Vulnerability scanning, secret detection
- ๐Ÿš€ **Performance validation**: Benchmark tests, memory usage
- ๐Ÿ“š **Documentation checks**: Version consistency, changelog
- ๐Ÿงช **Comprehensive testing**: All platforms, all features
- โœ… **Production readiness gate**: Final approval process

## ๐Ÿค– GitHub Actions Workflows

### `staging-dev.yml` - Development Workflow
**Triggers**: Push/PR to `staging`

**Jobs**:
- **quick-check**: Fast feedback (format, lint, basic checks)
- **test-suite**: Comprehensive testing across platforms/Rust versions
- **performance-check**: Memory analysis, basic performance validation
- **security-audit**: Dependency vulnerabilities, unsafe code detection
- **dev-environment**: Development tools validation
- **integration-check**: CLI integration testing

### `staging-to-master.yml` - Production Release
**Triggers**: PR from `staging` to `master`

**Jobs**:
- **validate-pr**: Ensures PR comes from staging branch
- **production-tests**: Strict cross-platform testing
- **security-production-audit**: Enhanced security scanning
- **performance-validation**: Production performance benchmarks
- **documentation-check**: Release documentation validation
- **production-ready**: Final approval gate

### `ci.yml` - Master Protection
**Triggers**: Push/PR to `master` (direct PRs blocked)

Lightweight checks for master branch protection.

## ๐Ÿ“‹ Development Guidelines

### Code Quality Standards

```bash
# Before committing, run:
cargo fmt                           # Format code
cargo clippy -- -D warnings       # Lint with no warnings
cargo test                         # All tests pass

# Avoid in production code:
.unwrap()                          # Use proper error handling
.expect()                          # Return Result<T, Error>
panic!()                           # Graceful error handling

# Check before PR:
grep -r "unwrap()" src/           # Should return nothing
grep -r "TODO\|FIXME" src/        # Address before merge
```

### Commit Message Format

```
type(scope): description

Types: feat, fix, docs, style, refactor, test, chore
Scope: module/component affected
Description: imperative mood, lowercase

Examples:
feat(connectors): add PostgreSQL database connector
fix(parsing): handle malformed CSV headers gracefully
docs(api): update streaming API documentation
refactor(lib): extract column analysis to separate module
```

### Testing Requirements

- **Unit tests**: New code must include unit tests
- **Integration tests**: Major features need integration tests
- **Error path testing**: Test failure scenarios
- **Performance tests**: Benchmark critical paths
- **Documentation tests**: Ensure doc examples work

### Performance Considerations

- Profile memory usage for large file processing
- Benchmark performance-critical changes
- Use SIMD optimizations where applicable
- Avoid unnecessary cloning in hot paths

## ๐Ÿš€ Release Process

### Preparing a Release

1. **Update version** in `Cargo.toml`
2. **Update CHANGELOG.md** with changes
3. **Test thoroughly** in staging environment
4. **Create release PR**: `staging` โ†’ `master`
5. **Production checks pass**
6. **Merge to master**
7. **Tag release**: `git tag v0.3.2 && git push origin v0.3.2`

### Version Numbering

Follow [Semantic Versioning](https://semver.org/):
- `MAJOR.MINOR.PATCH`
- **MAJOR**: Breaking changes
- **MINOR**: New features (backward compatible)
- **PATCH**: Bug fixes (backward compatible)

## ๐Ÿ›ก๏ธ Branch Protection Rules

### Master Branch Protection
- โœ… Require PR reviews
- โœ… Require status checks to pass
- โœ… Require branches to be up to date
- โœ… Restrict pushes to admins only
- โœ… Require PRs from staging branch only

### Staging Branch Protection
- โœ… Require status checks to pass
- โœ… Allow development team push access
- โœ… Require PR reviews for external contributors

## ๐Ÿ› Troubleshooting

### Common Issues

**Q: My PR to staging failed the unwrap() check**
```bash
# Find and fix unwrap calls
grep -r "\.unwrap()" src/
# Replace with proper error handling
```

**Q: Performance check failed**
```bash
# Profile your changes
cargo build --release
time ./target/release/dataprof-cli large_file.csv --quality
# Compare with baseline performance
```

**Q: Memory leak detected**
```bash
# Test with AddressSanitizer
RUSTFLAGS="-Zsanitizer=address" cargo test
# Check unsafe code blocks for proper cleanup
```

### Getting Help

- ๐Ÿ’ฌ **Discussions**: GitHub Discussions for questions
- ๐Ÿ› **Issues**: GitHub Issues for bugs/feature requests
- ๐Ÿ“ง **Maintainers**: Tag @AndreaBozzo for urgent issues
- ๐Ÿ“– **Documentation**: Check docs/ directory

## ๐Ÿ“Š Monitoring & Metrics

### CI/CD Metrics
- โœ… **Build success rate**: Target >95%
- ๐Ÿš€ **Build time**: Target <10 minutes for full suite
- ๐Ÿงช **Test coverage**: Target >80% on core modules
- ๐Ÿ”’ **Security scan**: Zero high-severity vulnerabilities

### Performance Metrics
- โšก **Processing speed**: Monitor throughput trends
- ๐Ÿ’พ **Memory usage**: Track memory efficiency
- ๐Ÿ“ˆ **Regression detection**: Alert on >10% performance drops

---

## ๐ŸŽฏ Quick Reference

```bash
# Start new feature
git checkout staging && git pull origin staging
git checkout -b feature/my-feature

# Daily development
cargo fmt && cargo clippy && cargo test
git add . && git commit -m "feat: my changes"
git push origin feature/my-feature

# Create PR to staging
gh pr create --base staging --title "Add my feature"

# Prepare production release
git checkout staging && git pull origin staging
gh pr create --base master --title "Release v0.3.x"
```

Happy coding! ๐Ÿฆ€โœจ