**Developer:** s4gor
**Github:** https://github.com/s4gor
---
# Schema Sync - Architecture Summary
## Project Structure
```
schema-sync/
├── Cargo.toml # Project configuration
├── README.md # User-facing documentation
├── DESIGN.md # Detailed design rationale
├── ARCHITECTURE.md # This file - architecture overview
├── .gitignore # Git ignore rules
├── src/
│ ├── lib.rs # Main library entry point with architecture diagram
│ ├── adapters.rs # Database adapter traits (DatabaseAdapter, SchemaInspector, MigrationRunner)
│ ├── diff.rs # Schema diff calculation and representation
│ ├── engine.rs # Main engine orchestration
│ ├── errors.rs # Error types
│ ├── planner.rs # Migration planning
│ ├── executor.rs # Migration execution
│ ├── snapshot.rs # Schema snapshot system
│ └── cli.rs # CLI types and context
└── examples/
    ├── basic_usage.rs # Basic sync example
    ├── dry_run_mode.rs # Dry-run mode example
    └── ci_validation.rs # CI validation example
```
## Core Abstractions
### 1. DatabaseAdapter Trait
**Purpose**: Main entry point for database operations.
**Key Methods**:
- `inspector()` → `Box<dyn SchemaInspector>`
- `migration_runner()` → `Box<dyn MigrationRunner>`
- `database_type()` → `&str`
- `test_connection()` → `Result<()>`
**Why it exists**: Factory pattern for creating inspectors and runners. Enables multi-database support.
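As a rough illustration of the factory role, the sketch below wires a hypothetical in-memory adapter behind the trait. The trait method names come from the list above; `InMemoryAdapter`, `InMemoryInspector`, `NoopRunner`, `describe`, the `&str` tenant argument, and the `String` error type are assumptions made for the example, not the crate's actual definitions.

```rust
// Stand-in for the crate's error type (assumption for this sketch).
type Result<T> = std::result::Result<T, String>;

// Reduced versions of the companion traits, just enough to compile.
trait SchemaInspector {
    fn schema_exists(&self, tenant: &str) -> Result<bool>;
}

trait MigrationRunner {
    fn validate_migration(&self, tenant: &str) -> Result<()>;
}

trait DatabaseAdapter {
    fn inspector(&self) -> Box<dyn SchemaInspector>;
    fn migration_runner(&self) -> Box<dyn MigrationRunner>;
    fn database_type(&self) -> &str;
    fn test_connection(&self) -> Result<()>;
}

// Hypothetical adapter backed by a list of tenant names.
struct InMemoryAdapter {
    tenants: Vec<String>,
}

struct InMemoryInspector {
    tenants: Vec<String>,
}

struct NoopRunner;

impl SchemaInspector for InMemoryInspector {
    fn schema_exists(&self, tenant: &str) -> Result<bool> {
        Ok(self.tenants.iter().any(|t| t == tenant))
    }
}

impl MigrationRunner for NoopRunner {
    fn validate_migration(&self, _tenant: &str) -> Result<()> {
        Ok(())
    }
}

impl DatabaseAdapter for InMemoryAdapter {
    fn inspector(&self) -> Box<dyn SchemaInspector> {
        Box::new(InMemoryInspector { tenants: self.tenants.clone() })
    }
    fn migration_runner(&self) -> Box<dyn MigrationRunner> {
        Box::new(NoopRunner)
    }
    fn database_type(&self) -> &str {
        "in-memory"
    }
    fn test_connection(&self) -> Result<()> {
        Ok(())
    }
}

// Caller code depends only on the trait, never on a concrete database.
fn describe(adapter: &dyn DatabaseAdapter, tenant: &str) -> Result<String> {
    adapter.test_connection()?;
    let exists = adapter.inspector().schema_exists(tenant)?;
    Ok(format!("{}: schema for {} exists = {}", adapter.database_type(), tenant, exists))
}
```

Because `describe` takes `&dyn DatabaseAdapter`, swapping in another database should only require a new adapter type.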
### 2. SchemaInspector Trait
**Purpose**: Read-only schema introspection.
**Key Methods**:
- `inspect_schema(tenant)` → `Result<SchemaSnapshot>`
- `schema_exists(tenant)` → `Result<bool>`
- `list_tenants()` → `Result<Vec<TenantContext>>`
**Why it exists**:
- Enables audit mode without write permissions
- Allows dry-run mode to calculate diffs without locks
- Supports testing with mock inspectors
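A mock inspector along these lines could back both unit tests and a read-only audit report. This is a sketch under assumptions: `MockInspector`, `with_schema`, and `audit_report` are hypothetical names, `SchemaSnapshot` is reduced to a list of table names, and errors are plain strings.

```rust
use std::collections::BTreeMap;

// Stand-in for the crate's error type (assumption for this sketch).
type Result<T> = std::result::Result<T, String>;

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct TenantContext {
    tenant_id: String,
}

// Heavily simplified: the real snapshot also holds columns, views, and so on.
#[derive(Debug, Clone, PartialEq, Default)]
struct SchemaSnapshot {
    tables: Vec<String>,
}

trait SchemaInspector {
    fn inspect_schema(&self, tenant: &TenantContext) -> Result<SchemaSnapshot>;
    fn schema_exists(&self, tenant: &TenantContext) -> Result<bool>;
    fn list_tenants(&self) -> Result<Vec<TenantContext>>;
}

// Hypothetical mock that serves canned snapshots from memory.
#[derive(Default)]
struct MockInspector {
    schemas: BTreeMap<TenantContext, SchemaSnapshot>,
}

impl MockInspector {
    fn with_schema(mut self, tenant_id: &str, tables: &[&str]) -> Self {
        let tenant = TenantContext { tenant_id: tenant_id.to_string() };
        let tables = tables.iter().map(|t| t.to_string()).collect();
        self.schemas.insert(tenant, SchemaSnapshot { tables });
        self
    }
}

impl SchemaInspector for MockInspector {
    fn inspect_schema(&self, tenant: &TenantContext) -> Result<SchemaSnapshot> {
        self.schemas
            .get(tenant)
            .cloned()
            .ok_or_else(|| format!("no schema for tenant {}", tenant.tenant_id))
    }
    fn schema_exists(&self, tenant: &TenantContext) -> Result<bool> {
        Ok(self.schemas.contains_key(tenant))
    }
    fn list_tenants(&self) -> Result<Vec<TenantContext>> {
        Ok(self.schemas.keys().cloned().collect())
    }
}

// Read-only report: needs an inspector only, no runner and no write access.
fn audit_report(inspector: &dyn SchemaInspector) -> Result<Vec<String>> {
    let mut lines = Vec::new();
    for tenant in inspector.list_tenants()? {
        let snapshot = inspector.inspect_schema(&tenant)?;
        lines.push(format!("{}: {} table(s)", tenant.tenant_id, snapshot.tables.len()));
    }
    Ok(lines)
}
```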
### 3. MigrationRunner Trait
**Purpose**: Execute schema changes.
**Key Methods**:
- `execute_migration(tenant, plan)` → `Result<MigrationResult>`
- `acquire_lock(tenant, timeout)` → `Result<Box<dyn LockGuard>>`
- `validate_migration(tenant, plan)` → `Result<()>`
**Why it exists**:
- Pluggable migration engines (SQL files, Rust code, external tools)
- Different strategies per database type
- Testing with mock runners
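The `acquire_lock` signature suggests an RAII-style guard, where dropping the guard releases the lock. The sketch below shows one way that could work for a single-threaded mock. `InMemoryRunner`, `InMemoryLock`, and `is_locked` are hypothetical, the tenant is passed as `&str` for brevity, and the timeout is ignored.

```rust
use std::cell::RefCell;
use std::collections::HashSet;
use std::rc::Rc;
use std::time::Duration;

// Stand-in for the crate's error type (assumption for this sketch).
type Result<T> = std::result::Result<T, String>;

/// Marker trait: the lock is released when the guard is dropped.
trait LockGuard {}

// Hypothetical guard that removes its tenant from the shared set on drop.
struct InMemoryLock {
    tenant: String,
    held: Rc<RefCell<HashSet<String>>>,
}

impl LockGuard for InMemoryLock {}

impl Drop for InMemoryLock {
    fn drop(&mut self) {
        self.held.borrow_mut().remove(&self.tenant);
    }
}

// Hypothetical single-threaded runner that only tracks per-tenant locks.
struct InMemoryRunner {
    held: Rc<RefCell<HashSet<String>>>,
}

impl InMemoryRunner {
    fn new() -> Self {
        InMemoryRunner { held: Rc::new(RefCell::new(HashSet::new())) }
    }

    // A real implementation would wait up to `timeout`; this one fails fast.
    fn acquire_lock(&self, tenant: &str, _timeout: Duration) -> Result<Box<dyn LockGuard>> {
        let newly_inserted = self.held.borrow_mut().insert(tenant.to_string());
        if !newly_inserted {
            return Err(format!("tenant {} is already locked", tenant));
        }
        Ok(Box::new(InMemoryLock {
            tenant: tenant.to_string(),
            held: Rc::clone(&self.held),
        }))
    }

    fn is_locked(&self, tenant: &str) -> bool {
        self.held.borrow().contains(tenant)
    }
}
```

Locks are per tenant, so holding a lock for one tenant does not block migrations for another.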
### 4. Planner Trait
**Purpose**: Create executable migration plans from schema diffs.
**Key Methods**:
- `create_plan(current, target, diff)` → `Result<MigrationPlan>`
- `validate_plan(plan)` → `Result<()>`
**Why it exists**:
- Dry-run mode can show what would happen
- Validation of plans before execution
- Different planning strategies (safe ordering, dependency resolution)
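One possible "safe ordering" strategy is to schedule additive steps before destructive ones. The sketch below assumes a reduced `SchemaDiff` that tracks tables only and a `create_plan` that takes just the diff (the documented signature also receives the current and target snapshots). `AdditiveFirstPlanner` and the `Step` enum are hypothetical.

```rust
// Stand-in for the crate's error type (assumption for this sketch).
type Result<T> = std::result::Result<T, String>;

// Reduced diff: table-level changes only.
#[derive(Debug, Default)]
struct SchemaDiff {
    tables_added: Vec<String>,
    tables_removed: Vec<String>,
}

#[derive(Debug, Clone, PartialEq)]
enum Step {
    CreateTable(String),
    DropTable(String),
}

#[derive(Debug, Default)]
struct MigrationPlan {
    steps: Vec<Step>,
    warnings: Vec<String>,
}

trait Planner {
    fn create_plan(&self, diff: &SchemaDiff) -> Result<MigrationPlan>;
    fn validate_plan(&self, plan: &MigrationPlan) -> Result<()>;
}

// Hypothetical strategy: all creates first, all drops last, names sorted
// so that the same diff always yields the same plan.
struct AdditiveFirstPlanner;

impl Planner for AdditiveFirstPlanner {
    fn create_plan(&self, diff: &SchemaDiff) -> Result<MigrationPlan> {
        let mut plan = MigrationPlan::default();
        let mut added = diff.tables_added.clone();
        let mut removed = diff.tables_removed.clone();
        added.sort();
        removed.sort();
        plan.steps.extend(added.into_iter().map(Step::CreateTable));
        for table in removed {
            plan.warnings.push(format!("dropping table {} destroys its data", table));
            plan.steps.push(Step::DropTable(table));
        }
        self.validate_plan(&plan)?;
        Ok(plan)
    }

    fn validate_plan(&self, plan: &MigrationPlan) -> Result<()> {
        // Once a destructive step appears, no additive step may follow it.
        let mut seen_drop = false;
        for step in &plan.steps {
            match step {
                Step::DropTable(_) => seen_drop = true,
                Step::CreateTable(name) if seen_drop => {
                    return Err(format!("create of {} is ordered after a drop", name));
                }
                Step::CreateTable(_) => {}
            }
        }
        Ok(())
    }
}
```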
### 5. Executor Trait
**Purpose**: Orchestrate the execution of migration plans.
**Key Methods**:
- `execute(tenant, plan, runner)` → `Result<ExecutionResult>`
- `dry_run(tenant, plan, runner)` → `Result<ExecutionResult>`
**Why it exists**:
- Different execution strategies (transactional, non-transactional)
- Progress reporting
- Retry logic
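A minimal executor might apply steps in order and stop at the first failure, while `dry_run` reports without touching the runner. In this sketch the runner exposes a hypothetical per-step `apply_step` method (the documented trait works on whole plans), plans are reduced to a list of step descriptions, and `StopOnFirstError` and `RecordingRunner` are illustrative names.

```rust
// Stand-in for the crate's error type (assumption for this sketch).
type Result<T> = std::result::Result<T, String>;

// Reduced plan: each step is just a description.
struct MigrationPlan {
    steps: Vec<String>,
}

#[derive(Debug, PartialEq)]
struct ExecutionResult {
    steps_applied: usize,
    dry_run: bool,
}

// Hypothetical per-step runner interface, used only in this example.
trait MigrationRunner {
    fn apply_step(&mut self, tenant: &str, step: &str) -> Result<()>;
}

trait Executor {
    fn execute(&self, tenant: &str, plan: &MigrationPlan, runner: &mut dyn MigrationRunner) -> Result<ExecutionResult>;
    fn dry_run(&self, tenant: &str, plan: &MigrationPlan, runner: &mut dyn MigrationRunner) -> Result<ExecutionResult>;
}

struct StopOnFirstError;

impl Executor for StopOnFirstError {
    fn execute(&self, tenant: &str, plan: &MigrationPlan, runner: &mut dyn MigrationRunner) -> Result<ExecutionResult> {
        for (index, step) in plan.steps.iter().enumerate() {
            runner.apply_step(tenant, step).map_err(|e| {
                format!("step {} of {} failed for {}: {}", index + 1, plan.steps.len(), tenant, e)
            })?;
        }
        Ok(ExecutionResult { steps_applied: plan.steps.len(), dry_run: false })
    }

    // Never calls the runner, so nothing is written.
    fn dry_run(&self, _tenant: &str, _plan: &MigrationPlan, _runner: &mut dyn MigrationRunner) -> Result<ExecutionResult> {
        Ok(ExecutionResult { steps_applied: 0, dry_run: true })
    }
}

// Test double that records applied steps and can simulate one failure.
#[derive(Default)]
struct RecordingRunner {
    applied: Vec<String>,
    fail_on: Option<String>,
}

impl MigrationRunner for RecordingRunner {
    fn apply_step(&mut self, _tenant: &str, step: &str) -> Result<()> {
        if self.fail_on.as_deref() == Some(step) {
            return Err("simulated failure".to_string());
        }
        self.applied.push(step.to_string());
        Ok(())
    }
}
```

A transactional or retrying executor would implement the same trait with a different `execute` body.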
### 6. DiffCalculator Trait
**Purpose**: Calculate differences between schema snapshots.
**Key Methods**:
- `calculate_diff(from, to)` → `SchemaDiff`
**Why it exists**:
- Different diff algorithms
- Three-way merge support
- Conflict detection
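The simplest diff algorithm compares objects by name. The sketch below does that for tables only, assuming a reduced snapshot that maps each table name to its column names; `NameBasedDiff` and `with_table` are hypothetical, and a `BTreeMap` is used here so results come out in sorted order.

```rust
use std::collections::BTreeMap;

// Reduced snapshot: table name -> column names (assumption for this sketch).
#[derive(Debug, Clone, PartialEq, Default)]
struct SchemaSnapshot {
    tables: BTreeMap<String, Vec<String>>,
}

impl SchemaSnapshot {
    fn with_table(mut self, name: &str, columns: &[&str]) -> Self {
        let columns = columns.iter().map(|c| c.to_string()).collect();
        self.tables.insert(name.to_string(), columns);
        self
    }
}

// Reduced diff: table-level changes only.
#[derive(Debug, PartialEq, Default)]
struct SchemaDiff {
    added: Vec<String>,
    removed: Vec<String>,
    modified: Vec<String>,
}

trait DiffCalculator {
    fn calculate_diff(&self, from: &SchemaSnapshot, to: &SchemaSnapshot) -> SchemaDiff;
}

struct NameBasedDiff;

impl DiffCalculator for NameBasedDiff {
    fn calculate_diff(&self, from: &SchemaSnapshot, to: &SchemaSnapshot) -> SchemaDiff {
        let mut diff = SchemaDiff::default();
        // Tables present in the target: either new, changed, or identical.
        for (name, columns) in &to.tables {
            match from.tables.get(name) {
                None => diff.added.push(name.clone()),
                Some(old) if old != columns => diff.modified.push(name.clone()),
                Some(_) => {}
            }
        }
        // Tables that exist only in the source were removed.
        for name in from.tables.keys() {
            if !to.tables.contains_key(name) {
                diff.removed.push(name.clone());
            }
        }
        diff
    }
}
```

A name-based comparison treats a renamed table as one removal plus one addition; detecting renames would need a different calculator.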
### 7. SnapshotStore Trait
**Purpose**: Store and retrieve schema snapshots.
**Key Methods**:
- `store(tenant, snapshot)` → `Result<()>`
- `get_latest(tenant)` → `Result<Option<SchemaSnapshot>>`
- `get_by_hash(tenant, hash)` → `Result<Option<SchemaSnapshot>>`
- `list(tenant)` → `Result<Vec<SchemaSnapshot>>`
- `compare(tenant, hash_a, hash_b)` → `Result<SchemaDiff>`
**Why it exists**:
- Multiple storage backends (filesystem, database, version control)
- Version history
- Deterministic versioning
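An in-memory backend is probably the smallest useful `SnapshotStore`. The sketch below assumes snapshots carry a precomputed `hash` field, uses `&mut self` for `store`, passes tenants as `&str`, and omits `compare`; all of these are simplifications for the example rather than the crate's actual shapes.

```rust
use std::collections::HashMap;

// Stand-in for the crate's error type (assumption for this sketch).
type Result<T> = std::result::Result<T, String>;

// Reduced snapshot with a precomputed content hash (assumption).
#[derive(Debug, Clone, PartialEq)]
struct SchemaSnapshot {
    hash: String,
    tables: Vec<String>,
}

trait SnapshotStore {
    fn store(&mut self, tenant: &str, snapshot: SchemaSnapshot) -> Result<()>;
    fn get_latest(&self, tenant: &str) -> Result<Option<SchemaSnapshot>>;
    fn get_by_hash(&self, tenant: &str, hash: &str) -> Result<Option<SchemaSnapshot>>;
    fn list(&self, tenant: &str) -> Result<Vec<SchemaSnapshot>>;
}

// Hypothetical backend: per-tenant history, oldest first.
#[derive(Default)]
struct MemoryStore {
    history: HashMap<String, Vec<SchemaSnapshot>>,
}

impl SnapshotStore for MemoryStore {
    fn store(&mut self, tenant: &str, snapshot: SchemaSnapshot) -> Result<()> {
        let versions = self.history.entry(tenant.to_string()).or_default();
        // Storing an unchanged schema twice in a row adds no new version.
        if versions.last().map(|s| s.hash.as_str()) != Some(snapshot.hash.as_str()) {
            versions.push(snapshot);
        }
        Ok(())
    }

    fn get_latest(&self, tenant: &str) -> Result<Option<SchemaSnapshot>> {
        Ok(self.history.get(tenant).and_then(|v| v.last().cloned()))
    }

    fn get_by_hash(&self, tenant: &str, hash: &str) -> Result<Option<SchemaSnapshot>> {
        Ok(self
            .history
            .get(tenant)
            .and_then(|v| v.iter().find(|s| s.hash == hash).cloned()))
    }

    fn list(&self, tenant: &str) -> Result<Vec<SchemaSnapshot>> {
        Ok(self.history.get(tenant).cloned().unwrap_or_default())
    }
}
```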
## Data Structures
### SchemaSnapshot
Normalized, database-agnostic representation of a schema.
**Properties**:
- Deterministic: Same schema always produces same snapshot
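- Order-independent: objects are keyed by name in HashMaps, so declaration order does not affect equality (any hashed or serialized form needs a stable key order, because HashMap iteration order is arbitrary)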
- Database-agnostic
**Contains**:
- Tables (with columns, constraints, indexes)
- Views
- Functions
- Types
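To make the determinism property concrete, the sketch below computes a fingerprint that does not depend on insertion order. It swaps in `BTreeMap` for sorted iteration and a hand-written FNV-1a hash for a result that is stable across runs; both choices, the `fingerprint` and `add_column` methods, and the table-and-column-only layout are assumptions of this example.

```rust
use std::collections::BTreeMap;

// Standard 64-bit FNV-1a parameters.
const FNV_OFFSET: u64 = 0xcbf29ce484222325;
const FNV_PRIME: u64 = 0x100000001b3;

// Folds `bytes` into a running FNV-1a hash.
fn fnv1a(mut hash: u64, bytes: &[u8]) -> u64 {
    for byte in bytes {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

// Reduced snapshot: table name -> (column name -> column type).
#[derive(Debug, Clone, PartialEq, Eq, Default)]
struct SchemaSnapshot {
    tables: BTreeMap<String, BTreeMap<String, String>>,
}

impl SchemaSnapshot {
    fn add_column(&mut self, table: &str, column: &str, column_type: &str) {
        self.tables
            .entry(table.to_string())
            .or_default()
            .insert(column.to_string(), column_type.to_string());
    }

    // BTreeMap iterates in key order, so the byte stream fed to the hash
    // is the same no matter how the snapshot was built.
    fn fingerprint(&self) -> u64 {
        let mut hash = FNV_OFFSET;
        for (table, columns) in &self.tables {
            hash = fnv1a(hash, table.as_bytes());
            hash = fnv1a(hash, &[0]); // field separator
            for (column, column_type) in columns {
                hash = fnv1a(hash, column.as_bytes());
                hash = fnv1a(hash, &[0]);
                hash = fnv1a(hash, column_type.as_bytes());
                hash = fnv1a(hash, &[0]);
            }
            hash = fnv1a(hash, &[1]); // end-of-table marker
        }
        hash
    }
}
```

A production version would likely use a cryptographic hash; FNV-1a is used here only to keep the example dependency-free.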
### SchemaDiff
Represents differences between two snapshots.
**Structure**: Hierarchical (schema → table → column → constraint)
**Contains**:
- Tables: added, removed, modified
- Views: added, removed, modified
- Functions: added, removed, modified
- Types: added, removed, modified
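The added/removed/modified triple repeats for each object kind, so one plausible layout reuses a single change-set type. In this sketch `Changes`, `is_empty`, and `summary` are hypothetical, and changes are reduced to object names.

```rust
// Hypothetical change set for one kind of schema object.
#[derive(Debug, Default, PartialEq)]
struct Changes {
    added: Vec<String>,
    removed: Vec<String>,
    modified: Vec<String>,
}

impl Changes {
    fn is_empty(&self) -> bool {
        self.added.is_empty() && self.removed.is_empty() && self.modified.is_empty()
    }
}

#[derive(Debug, Default, PartialEq)]
struct SchemaDiff {
    tables: Changes,
    views: Changes,
    functions: Changes,
    types: Changes,
}

impl SchemaDiff {
    // An empty diff means the two snapshots describe the same schema.
    fn is_empty(&self) -> bool {
        self.tables.is_empty()
            && self.views.is_empty()
            && self.functions.is_empty()
            && self.types.is_empty()
    }

    // One entry per object kind that has changes: +added -removed ~modified.
    fn summary(&self) -> String {
        [
            ("tables", &self.tables),
            ("views", &self.views),
            ("functions", &self.functions),
            ("types", &self.types),
        ]
        .iter()
        .filter(|(_, changes)| !changes.is_empty())
        .map(|(kind, changes)| {
            format!(
                "{}: +{} -{} ~{}",
                kind,
                changes.added.len(),
                changes.removed.len(),
                changes.modified.len()
            )
        })
        .collect::<Vec<_>>()
        .join(", ")
    }
}
```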
### MigrationPlan
Executable sequence of operations to transform schema.
**Structure**: Ordered steps with dependencies
**Contains**:
- Steps (ordered operations)
- Estimated duration
- Downtime requirements
- Warnings
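"Ordered steps with dependencies" implies the plan can derive a valid execution order. The sketch below resolves one by repeatedly picking steps whose dependencies are already scheduled. The `Step` fields, `execution_order`, and `requires_downtime` are hypothetical, and the estimated duration is left out.

```rust
// Hypothetical step shape: `depends_on` lists ids that must run first.
#[derive(Debug, Clone)]
struct Step {
    id: usize,
    description: String,
    depends_on: Vec<usize>,
    requires_downtime: bool,
}

struct MigrationPlan {
    steps: Vec<Step>,
    warnings: Vec<String>,
}

impl MigrationPlan {
    fn requires_downtime(&self) -> bool {
        self.steps.iter().any(|s| s.requires_downtime)
    }

    /// Returns step ids in an order that respects `depends_on`, or an error
    /// if the dependencies form a cycle or point at a missing step.
    fn execution_order(&self) -> Result<Vec<usize>, String> {
        let mut done: Vec<usize> = Vec::new();
        let mut remaining: Vec<&Step> = self.steps.iter().collect();
        while !remaining.is_empty() {
            let before = remaining.len();
            // Schedule every step whose dependencies are all scheduled.
            remaining.retain(|step| {
                let ready = step.depends_on.iter().all(|dep| done.contains(dep));
                if ready {
                    done.push(step.id);
                }
                !ready
            });
            // No progress in a full pass means the rest can never run.
            if remaining.len() == before {
                return Err("dependency cycle or unknown step id in plan".to_string());
            }
        }
        Ok(done)
    }
}
```

The quadratic scan is fine for plans with a handful of steps; a larger plan might prefer a proper topological sort with in-degree counts.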
### TenantContext
Explicit tenant scoping for all operations.
**Properties**:
- Single field: `tenant_id: String`
- Required for all operations
- Prevents cross-tenant leakage
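A dedicated type, rather than a bare string, lets function signatures state that a tenant is required. In the sketch below the `tenant_id` field matches the description above, while the validating constructor and the `lock_key` helper are assumptions added for illustration.

```rust
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct TenantContext {
    tenant_id: String,
}

impl TenantContext {
    // Hypothetical constructor: rejects blank ids and trims whitespace.
    fn new(tenant_id: &str) -> Result<Self, String> {
        let trimmed = tenant_id.trim();
        if trimmed.is_empty() {
            return Err("tenant_id must not be empty".to_string());
        }
        Ok(TenantContext { tenant_id: trimmed.to_string() })
    }
}

// Taking `&TenantContext` instead of `&str` means an arbitrary string
// (a table name, say) cannot be passed here by mistake.
fn lock_key(tenant: &TenantContext) -> String {
    format!("schema-sync:{}", tenant.tenant_id)
}
```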
## Extension Points
### Adding a New Database Type
1. Implement `DatabaseAdapter`
2. Implement `SchemaInspector` (convert DB schema → `SchemaSnapshot`)
3. Implement `MigrationRunner` (convert `MigrationPlan` → DB SQL)
4. Use with engine
**No changes needed to**: Engine, planner, executor, diff calculator.
### Adding a New Migration Strategy
1. Implement `MigrationRunner`
2. Convert `MigrationPlan` to your format
3. Execute using your tool
**Example**: SQL file migrations, diesel migrations, sqlx migrations.
### Adding a New Snapshot Storage Backend
1. Implement `SnapshotStore`
2. Store/retrieve `SchemaSnapshot` in your backend
3. Use with engine
**Example**: Filesystem, database, S3, version control.
### Adding a New Planning Strategy
1. Implement `Planner`
2. Create `MigrationPlan` from `SchemaDiff`
3. Use with engine
**Example**: Safe planner for zero-downtime migrations.
### Adding a New Diff Algorithm
1. Implement `DiffCalculator`
2. Calculate `SchemaDiff` from two `SchemaSnapshot`s
3. Use with engine
**Example**: Three-way merge calculator.
## Operation Modes
### Sync Mode
**Implementation**: `Engine::sync_tenant(tenant, target, execute=true)`
**Behavior**: Calculate diff, create plan, execute plan.
### Dry-Run Mode
**Implementation**: `Engine::sync_tenant(tenant, target, execute=false)`
**Behavior**: Calculate diff, create plan, validate plan, return diff without executing.
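The difference between the two modes comes down to the `execute` flag. The toy engine below follows the same diff, plan, execute sequence over schemas reduced to sets of table names. The `Engine` fields, `SyncResult` layout, and `schema` helper are hypothetical; only `sync_tenant` and `already_in_sync` are names taken from this document.

```rust
use std::collections::{BTreeMap, BTreeSet};

// A schema reduced to its table names (assumption for this sketch).
type Schema = BTreeSet<String>;

#[derive(Debug, PartialEq)]
struct SyncResult {
    already_in_sync: bool,
    planned_steps: Vec<String>,
    executed: bool,
}

// Toy engine: the map stands in for the live databases, one per tenant.
#[derive(Default)]
struct Engine {
    databases: BTreeMap<String, Schema>,
}

impl Engine {
    fn sync_tenant(&mut self, tenant: &str, target: &Schema, execute: bool) -> Result<SyncResult, String> {
        let current = self
            .databases
            .get_mut(tenant)
            .ok_or_else(|| format!("unknown tenant {}", tenant))?;

        // 1. Diff and 2. plan: creates first, then drops.
        let mut planned_steps: Vec<String> = target
            .difference(&*current)
            .map(|t| format!("CREATE TABLE {}", t))
            .collect();
        planned_steps.extend(current.difference(target).map(|t| format!("DROP TABLE {}", t)));

        // 3. Execute only when asked to and only when there is work to do.
        let already_in_sync = planned_steps.is_empty();
        let executed = execute && !already_in_sync;
        if executed {
            *current = target.clone();
        }
        Ok(SyncResult { already_in_sync, planned_steps, executed })
    }
}

fn schema(tables: &[&str]) -> Schema {
    tables.iter().map(|t| t.to_string()).collect()
}
```

With `execute` set to `false` the caller still receives the planned steps, which is what makes the dry run useful for review.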
### Validation Mode (CI)
**Implementation**:
- `Engine::sync_tenant(tenant, target, execute=false)` for all tenants
- Check `already_in_sync` flag
- Exit non-zero if any tenant has `already_in_sync=false`
**Behavior**: Verify all tenants match expected schema.
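The validation steps above can be sketched as a function that maps per-tenant dry-run results to a process exit code. Here `validation_exit_code` is a hypothetical helper, the `dry_run` closure stands in for a call to `Engine::sync_tenant` with `execute=false`, and treating errors like drift (exit code 1) is an assumption.

```rust
// Reduced result: only the flag the CI check needs.
struct SyncResult {
    already_in_sync: bool,
}

// Returns 0 when every tenant is in sync, 1 otherwise. All tenants are
// checked even after the first failure so the log lists every problem.
fn validation_exit_code<F>(tenants: &[&str], mut dry_run: F) -> i32
where
    F: FnMut(&str) -> Result<SyncResult, String>,
{
    let mut exit_code = 0;
    for &tenant in tenants {
        match dry_run(tenant) {
            Ok(result) if result.already_in_sync => println!("ok    {}", tenant),
            Ok(_) => {
                eprintln!("drift {}", tenant);
                exit_code = 1;
            }
            Err(error) => {
                eprintln!("error {}: {}", tenant, error);
                exit_code = 1;
            }
        }
    }
    exit_code
}
```

A CLI entry point would typically pass the returned value to `std::process::exit`.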
### Audit Mode
**Implementation**: Use `SchemaInspector` directly, no `MigrationRunner`.
**Behavior**: Read-only inspection, no changes allowed.
## Design Decisions
### Why Traits Over Enums?
Traits allow:
- Multiple implementations to coexist
- Pluggable components
- Testing with mocks
- Extension without modification
### Why Separate Inspector and Runner?
- Inspector can be used without runner (audit mode)
- Migration strategies can be swapped without touching introspection code
- Testing is easier with mock implementations
### Why Separate Planner and Executor?
- Dry-run mode can plan without executing
- Validation of plans before execution
- Different execution strategies
### Why Snapshot System?
- Enables diffing schema version A vs B
- Supports version control integration
- Allows deterministic versioning
### Why TenantContext Everywhere?
- Type safety: Can't accidentally operate on wrong tenant
- Makes tenant isolation explicit
- Supports batch operations
- Enables per-tenant locking
## Future Implementation Tasks
### Phase 1: Core Implementations
- [ ] Default `Planner` implementation
- [ ] Default `Executor` implementation
- [ ] Default `DiffCalculator` implementation
- [ ] File-based `SnapshotStore` implementation
### Phase 2: PostgreSQL Support
- [ ] `PostgresAdapter` implementation
- [ ] `PostgresInspector` implementation
- [ ] `PostgresMigrationRunner` implementation
### Phase 3: CLI
- [ ] CLI argument parsing
- [ ] Mode handling (sync, diff, validate, audit)
- [ ] Output formatting (text, JSON)
- [ ] Exit codes for CI
### Phase 4: Additional Databases
- [ ] MySQL support
- [ ] SQLite support
### Phase 5: Advanced Features
- [ ] Three-way merge support
- [ ] Conflict detection
- [ ] Zero-downtime migration strategies
- [ ] Progress reporting
- [ ] Audit trail
## Testing Strategy
### Unit Tests
- Mock implementations of all traits
- Test each component in isolation
### Integration Tests
- Test database (testcontainers or in-memory)
- Test full sync flow
- Test error handling and rollback
### Property Tests
- Deterministic snapshots
- Convergence: executing the plan derived from a diff yields the target schema (current + plan = target)
## Conclusion
This architecture keeps inspection, diffing, planning, execution, and snapshot storage behind separate traits. New databases, migration strategies, and storage backends can therefore be added without modifying the engine, and each component can be tested in isolation with mocks.
The key insight: **design for extension first**. Every abstraction exists to enable a future feature, not just to solve the current problem.