torc 0.20.7

Workflow management system
# Performance Improvements

## Summary

This document describes major performance optimizations implemented in Torc:

1. **Database Indexes** (2024-11) - 10-50× faster queries for large workflows
2. **Deferred Job Unblocking** (2024-11) - 100× faster job completions at scale

---

## 1. Deferred Job Unblocking (2024-11)

### Problem

When a job completes in a workflow with complex dependencies, the server must find and unblock all
downstream jobs that were waiting for it. This involves:

- Recursive CTE queries to find dependent jobs
- Checking if all blocking jobs are complete
- Updating job statuses
- Adding jobs to the ready queue
- Triggering workflow actions

With 2000 compute nodes completing ~1-minute jobs simultaneously (33 completions/second), this
became a bottleneck:

- Each `complete_job` API call took 50-500ms
- Exclusive database locks serialized all completions
- Workers were blocked waiting for API responses

### Solution

**Deferred unblocking with background processing:**

1. `complete_job` API now just updates job status and sets `unblocking_processed = 0` flag (1-5ms)
2. Background task runs periodically (configurable via `TORC_COMPLETION_CHECK_INTERVAL_SECS`)
3. Background task processes all pending completions in batches per workflow
4. Multiple completions processed in single transaction for efficiency

**Migration**: `20251123000000_add_unblocking_processed`

### Performance Impact

#### Before

- `complete_job` API latency: 50-500ms
- Throughput: ~33 completions/second (serialized by database locks)
- Workers blocked during unblocking logic

#### After

- `complete_job` API latency: 1-5ms (100× faster)
- Throughput: 2000+ completions/second
- No worker blocking
- Trade-off: Downstream jobs delayed by up to `TORC_COMPLETION_CHECK_INTERVAL_SECS` (default: 60
  seconds)

### Configuration

Control unblocking interval via environment variable:

```bash
# Production (default) - efficient batching
export TORC_COMPLETION_CHECK_INTERVAL_SECS=60

# Development/demos - faster feedback
export TORC_COMPLETION_CHECK_INTERVAL_SECS=1

# Testing - near-immediate
export TORC_COMPLETION_CHECK_INTERVAL_SECS=0.1
```

### HPC Suitability

The 60-second delay is negligible for typical HPC workflows:

- Jobs run for minutes to hours
- 60-second delay is <1% of job runtime
- Batching provides massive scalability benefits

For workflows with extremely short jobs (<5 seconds), consider:

- Reducing interval to 1-5 seconds
- Redesigning workflow to have longer-running jobs

### Database Changes

New column:

```sql
ALTER TABLE job ADD COLUMN unblocking_processed INTEGER NOT NULL DEFAULT 0;
```

New indexes:

```sql
CREATE INDEX idx_job_unblocking_pending ON job(workflow_id, status, unblocking_processed)
WHERE status IN (6, 7, 8) AND unblocking_processed = 0;

CREATE INDEX idx_job_workflow_unblocking ON job(workflow_id)
WHERE status IN (6, 7, 8) AND unblocking_processed = 0;
```

### Monitoring

Check background task status in server logs:

```
Starting background job completion check interval: 60 seconds
Processing 15 completed jobs for workflow 42
Processed 15 completions for workflow 42, 8 jobs became ready
```

---

## 2. Database Indexes (2024-11)

### Summary

Database indexes have been added to significantly improve query performance, particularly for
workflows with thousands of jobs.

## What Was Added

**Migration**: `20251105030141_add_database_indexes`

**Total Indexes Added**: 17 indexes across 11 tables

### Critical Performance Indexes (Phase 1)

1. `idx_job_workflow_id` - Filter jobs by workflow (10-50x faster)
2. `idx_job_workflow_status` - Filter jobs by workflow and status (10-50x faster)
3. `idx_result_workflow_id` - Filter results by workflow (10-50x faster)

### Relationship Lookups (Phase 2)

4. `idx_event_workflow_id` - Filter events by workflow
5. `idx_compute_node_workflow_id` - Filter compute nodes by workflow
6. `idx_job_depends_on_depends_on_job_id` - Reverse dependency lookups
7. `idx_job_input_file_file_id` - Find jobs consuming a file
8. `idx_job_output_file_file_id` - Find jobs producing a file
9. `idx_job_input_user_data_user_data_id` - Find jobs consuming user data
10. `idx_job_output_user_data_user_data_id` - Find jobs producing user data

### Resource Allocation Optimization (Phase 3)

11. `idx_resource_requirements_sort_gpus_runtime_memory` - Optimize resource-based job sorting

### User and Workflow Filtering (Phase 4)

12. `idx_workflow_user` - Filter workflows by user
13. `idx_workflow_user_archived` - Filter workflows by user and archived status

### Additional Optimizations

14. `idx_result_job_id` - Filter results by job
15. `idx_result_run_id` - Filter results by run
16. `idx_compute_node_workflow_active` - Filter active compute nodes

## Expected Performance Improvements

### Before Indexes (10,000-job workflow)

- List jobs: ~100-500ms (table scan)
- Find ready jobs: ~100-500ms (table scan)
- List results: ~50-200ms (table scan)
- Dependency lookups: ~50-200ms (table scan)

### After Indexes (10,000-job workflow)

- List jobs: ~5-20ms (index scan) - **10-50x faster**
- Find ready jobs: ~2-10ms (composite index) - **10-50x faster**
- List results: ~5-15ms (index scan) - **10-50x faster**
- Dependency lookups: ~1-5ms (index seek) - **10-50x faster**

## Storage Overhead

For a 10,000-job workflow:

- Total index overhead: ~500 KB - 1 MB
- Write performance impact: ~10-15% overhead on INSERT/UPDATE/DELETE
- Read performance improvement: 10-50x faster

**Conclusion**: Negligible storage cost for massive performance gains.

## Verification

### Check Installed Indexes

```bash
# View all indexes on the job table
sqlite3 torc.db ".indexes job"

# View index details
sqlite3 torc.db ".schema job"

# Check query plan (verify index usage)
sqlite3 torc.db "EXPLAIN QUERY PLAN SELECT * FROM job WHERE workflow_id = 1;"
```

### Expected Output

```
QUERY PLAN
`--SEARCH job USING INDEX idx_job_workflow_id (workflow_id=?)
```

If you see "SCAN TABLE job" instead of "SEARCH ... USING INDEX", the index is not being used.

## Testing

The migration has been tested and applied successfully. To verify in your environment:

```bash
# Check migration status
sqlx migrate info --source torc-server/migrations

# Should show: 20251105030141/installed add database indexes
```

## Rolling Back

If needed, the indexes can be removed:

```bash
sqlx migrate revert --source torc-server/migrations
```

This will drop all 17 indexes without affecting data.

## Monitoring

### Query Performance Logging

Enable SQL logging to verify index usage:

```bash
RUST_LOG=sqlx=debug cargo run --bin torc-server
```

### SQLite Statistics

```sql
-- Analyze table statistics (helps query planner)
ANALYZE;

-- Check table sizes
SELECT name, SUM(pgsize) as size FROM dbstat
WHERE name LIKE 'idx_%'
GROUP BY name
ORDER BY size DESC;
```

### Query Plan Analysis

```sql
-- For any slow query, check if indexes are being used:
EXPLAIN QUERY PLAN <your_query>;

-- Example:
EXPLAIN QUERY PLAN
SELECT * FROM job
WHERE workflow_id = 1 AND status = 1;

-- Expected: SEARCH job USING INDEX idx_job_workflow_status
```

## Maintenance

### Automatic Optimization

SQLite automatically:

- Updates statistics after significant changes
- Chooses optimal indexes for queries
- Maintains indexes during INSERT/UPDATE/DELETE

### Manual Optimization (Optional)

```sql
-- Re-analyze statistics (rarely needed)
ANALYZE;

-- Rebuild indexes and reclaim space (during maintenance window)
VACUUM;

-- Optimize database (SQLite 3.18+)
PRAGMA optimize;
```

## Future Considerations

### If Performance Issues Persist

1. **Check query plans**: Use `EXPLAIN QUERY PLAN` to verify indexes are being used
2. **Analyze slow queries**: Enable `RUST_LOG=sqlx=debug` to see query execution times
3. **Consider additional indexes**: Based on actual usage patterns
4. **Application-level caching**: Cache frequently-accessed data

### If Write Performance Becomes a Bottleneck

1. **Batch operations**: Group multiple INSERTs into single transaction
2. **Remove unused indexes**: Drop indexes that aren't being used
3. **Increase SQLite cache**: `PRAGMA cache_size = 10000;`

### If Database Size Becomes a Concern

1. **Archive old workflows**: Move completed workflows to separate database
2. **Vacuum regularly**: Reclaim space from deleted records
3. **Consider partitioning**: Separate databases for different projects (advanced)

## References

- **Detailed Analysis**: See `docs/DATABASE_INDEXES.md` for comprehensive rationale
- **Migration Files**: `migrations/20251105030141_add_database_indexes.{up,down}.sql`
- **SQLite Index Documentation**: https://www.sqlite.org/queryplanner.html

## Impact on Existing Systems

### Development Environments

- Migration will run automatically on next `cargo run`
- No code changes required
- Immediate performance improvement

### Production Systems

- Apply migration during maintenance window
- Migration is very fast (<1 second typical)
- No downtime required (WAL mode)
- Can be rolled back if needed

### Testing

- All existing tests pass with indexes
- No behavior changes, only performance improvements
- Test suite runs faster due to improved query performance

## Questions?

For issues or questions about database performance:

1. Check `EXPLAIN QUERY PLAN` output for your queries
2. Review `docs/DATABASE_INDEXES.md` for detailed analysis
3. Enable SQL logging with `RUST_LOG=sqlx=debug`
4. Create an issue with query plans and timing information