**Developer:** s4gor
**Github:** https://github.com/s4gor
---
# Schema Sync - Design Document
## Overview
This document explains the architectural decisions and design rationale for the schema-sync crate. It focuses on the **extension layer first** approach, explaining why each abstraction exists and how it enables future features.
## Core Design Principles
### 1. Trait-Based Extensibility
**Principle**: All database operations go through traits, never direct implementations.
**Rationale**:
- Enables multi-database support without changing core logic
- Allows testing with mock implementations
- Supports different migration strategies per database
- Makes the codebase maintainable as it grows
**Example**: `DatabaseAdapter` trait allows PostgreSQL, MySQL, and SQLite implementations to coexist.
### 2. Separation of Concerns
**Principle**: Clear boundaries between reading, planning, and executing.
**Components**:
- **SchemaInspector**: Read-only schema introspection
- **MigrationRunner**: Write operations for applying migrations
- **Planner**: Determines what changes need to be made
- **Executor**: Orchestrates the execution of planned changes
**Rationale**:
- Audit mode can use inspector without runner (no write permissions needed)
- Dry-run mode can plan without executing
- Different strategies can be plugged in for each concern
- Easier to test each component independently
### 3. Tenant Isolation
**Principle**: Every operation is explicitly scoped to a tenant context.
**Implementation**: `TenantContext` is required for all operations.
**Rationale**:
- Prevents accidental cross-tenant operations
- Makes tenant scoping explicit in the type system
- Enables per-tenant locking strategies
- Supports batch operations across tenants
### 4. Mode-Agnostic Core
**Principle**: The engine doesn't know about CLI modes (dry-run, validation, etc.).
**Rationale**:
- Same engine can be used for sync, dry-run, validation, and audit
- Mode-specific behavior is handled at the CLI layer
- Keeps core logic simple and focused
- Enables programmatic usage without CLI
**Example**: `Engine::sync_tenant()` takes an `execute` parameter. The CLI layer sets this based on the mode.
## Architecture Layers
### Layer 1: CLI Layer
**Responsibility**: User interface, mode handling, output formatting.
**Not in core crate**: The CLI will be implemented separately (either in this crate's `bin/` or a separate crate).
**Modes**:
- `sync`: Apply changes (`execute=true`)
- `diff`: Preview changes (`execute=false`)
- `validate`: Check if schemas match (`execute=false`, exit non-zero if mismatch)
- `audit`: Read-only inspection (uses inspector only)
### Layer 2: Engine Layer
**Responsibility**: Orchestrates all components to provide high-level API.
**Components**:
- `Engine`: Main entry point
- Coordinates inspector, planner, executor, diff calculator
- Handles snapshot management
- Provides unified interface
**Why it exists**: Hides complexity of coordinating multiple components. Provides a simple API for common operations.
### Layer 3: Adapter Layer
**Responsibility**: Database-specific operations through traits.
**Traits**:
- `DatabaseAdapter`: Connection management, factory methods
- `SchemaInspector`: Read schema from database
- `MigrationRunner`: Execute schema changes
**Why separate traits**:
- Inspector can be used without runner (audit mode)
- Different migration strategies can be implemented
- Testing is easier with mock implementations
### Layer 4: Database Implementations
**Responsibility**: Concrete implementations for each database type.
**Future implementations**:
- `PostgresAdapter`, `PostgresInspector`, `PostgresMigrationRunner`
- `MysqlAdapter`, `MysqlInspector`, `MysqlMigrationRunner`
- `SqliteAdapter`, `SqliteInspector`, `SqliteMigrationRunner`
**Why separate**: Each database has different SQL dialects, connection methods, and capabilities.
## Data Structures
### SchemaSnapshot
**Purpose**: Normalized, database-agnostic representation of a schema.
**Key properties**:
- Deterministic: Same schema always produces same snapshot
- Order-independent where possible (uses HashMaps)
- Database-agnostic: Not tied to PostgreSQL, MySQL, etc.
**Why it exists**:
- Enables diffing between different database types (theoretical)
- Supports snapshot storage and versioning
- Allows planning without database connection
### SchemaDiff
**Purpose**: Represents differences between two snapshots.
**Structure**: Hierarchical (schema → table → column → constraint).
**Why it exists**:
- Enables human-readable diff output
- Supports machine-readable JSON output
- Allows planning to understand what needs to change
### MigrationPlan
**Purpose**: Executable sequence of operations to transform schema.
**Structure**: Ordered steps with dependencies.
**Why it exists**:
- Separates "what needs to change" (diff) from "how to change it" (plan)
- Enables validation before execution
- Supports dry-run mode
- Allows different planning strategies
### TenantContext
**Purpose**: Explicit tenant scoping for all operations.
**Why it exists**:
- Type safety: Can't accidentally operate on wrong tenant
- Makes tenant isolation explicit
- Supports batch operations
- Enables per-tenant locking
## Extension Points
### 1. Adding a New Database Type
**Steps**:
1. Implement `DatabaseAdapter` trait
2. Implement `SchemaInspector` trait (convert database schema to `SchemaSnapshot`)
3. Implement `MigrationRunner` trait (convert `MigrationPlan` to database SQL)
4. Register in factory or use directly
**No changes needed to**: Engine, planner, executor, diff calculator.
### 2. Adding a New Migration Strategy
**Steps**:
1. Implement `MigrationRunner` trait
2. Convert `MigrationPlan` to your migration format
3. Execute migrations using your tool (diesel, sqlx, etc.)
**Example**: SQL file-based migrations would read SQL files and execute them in order.
### 3. Adding a New Snapshot Storage Backend
**Steps**:
1. Implement `SnapshotStore` trait
2. Store/retrieve `SchemaSnapshot` in your backend (database, S3, etc.)
3. Use with engine
**No changes needed to**: Engine, adapters, planner, executor.
### 4. Adding a New Planning Strategy
**Steps**:
1. Implement `Planner` trait
2. Create `MigrationPlan` from `SchemaDiff` using your strategy
3. Use with engine
**Example**: A "safe" planner might add extra steps to ensure zero-downtime migrations.
### 5. Adding a New Diff Algorithm
**Steps**:
1. Implement `DiffCalculator` trait
2. Calculate `SchemaDiff` from two `SchemaSnapshot`s
3. Use with engine
**Example**: A three-way merge calculator for handling conflicts.
## Future Extensions (Detailed)
### 1. Dry-Run / Diff Mode
**Implementation**: Already supported via `execute=false` parameter.
**Output formats**:
- Text: Format `SchemaDiff` as human-readable text
- JSON: Serialize `SchemaDiff` to JSON
**CLI command**: `schema-sync diff --tenant tenant_123 --format json`
### 2. CI / Validation Mode
**Implementation**:
- Use `Engine::sync_tenant()` with `execute=false` for all tenants
- Check `already_in_sync` flag
- Exit non-zero if any tenant has `already_in_sync=false`
**CLI command**: `schema-sync validate --all-tenants`
### 3. Multi-Database Support
**Implementation**: Implement adapter traits for each database type.
**Challenges**:
- Different SQL dialects
- Different schema introspection queries
- Different migration execution methods
- Different locking mechanisms
**Solution**: Traits abstract these differences. Each implementation handles database-specific details.
### 4. Read-Only Audit Mode
**Implementation**: Use `SchemaInspector` directly, no `MigrationRunner` needed.
**CLI command**: `schema-sync audit --tenant tenant_123`
**Safety**: No write operations, no locks, safe for production.
### 5. Pluggable Migration Engine
**Implementation**: Different `MigrationRunner` implementations.
**Examples**:
- `SqlFileMigrationRunner`: Executes SQL files in order
- `DieselMigrationRunner`: Uses diesel migration system
- `SqlxMigrationRunner`: Uses sqlx migration system
- `RustCodeMigrationRunner`: Executes Rust code migrations
### 6. Schema Snapshot System
**Implementation**: `SnapshotStore` trait implementations.
**Storage backends**:
- `FileSnapshotStore`: Store in filesystem
- `DatabaseSnapshotStore`: Store in database
- `GitSnapshotStore`: Store in version control
- `S3SnapshotStore`: Store in object storage
**Versioning**: Deterministic hash of snapshot content.
### 7. Tenant Isolation Guarantees
**Implementation**:
- All operations require `TenantContext`
- `MigrationRunner::acquire_lock()` ensures per-tenant locking
- Database adapters enforce tenant scoping in SQL
**Safety**: Type system prevents cross-tenant operations.
## Testing Strategy
### Unit Tests
**Mock implementations**: Create mock implementations of all traits for testing.
**Example**: `MockSchemaInspector` that returns predefined snapshots.
### Integration Tests
**Test database**: Use testcontainers or in-memory databases.
**Test scenarios**:
- Sync empty schema to target
- Sync existing schema to target
- Handle conflicts
- Rollback on failure
### Property Tests
**Deterministic snapshots**: Same schema always produces same snapshot.
**Reversibility**: Plan + execution should result in target schema.
## Performance Considerations
### Snapshot Calculation
**Optimization**: Cache snapshots, only recalculate when schema changes.
**Implementation**: Store snapshot hash, compare before full calculation.
### Batch Operations
**Multiple tenants**: Process tenants in parallel where possible.
**Implementation**: Use `Engine::list_tenants()` and process concurrently.
### Locking
**Per-tenant locks**: Don't block other tenants.
**Implementation**: `MigrationRunner::acquire_lock()` is per-tenant.
## Security Considerations
### Tenant Isolation
**Enforcement**: Database adapters must enforce tenant scoping in SQL.
**Example**: PostgreSQL adapter uses schema-qualified table names.
### Connection Security
**Credentials**: Adapters handle connection security (TLS, authentication).
**Not in core**: Core crate doesn't handle credentials, adapters do.
### Audit Trail
**Future**: Log all schema changes with tenant, timestamp, and user.
**Implementation**: `MigrationRunner` implementations can add logging.
## Error Handling
### Error Types
**Structured errors**: `Error` enum with context.
**Why**: Provides actionable error messages with tenant context.
### Recovery
**Rollback**: `MigrationRunner` implementations handle rollback on failure.
**Partial failures**: `ExecutionResult` indicates how many steps succeeded.
## Conclusion
This architecture provides a solid foundation for schema synchronization that can grow over years. The trait-based design enables extensibility without breaking changes, and the separation of concerns makes the codebase maintainable.
The key insight is designing for extension first: every abstraction exists to enable a future feature, not just to solve the current problem.