torc 0.20.7

Workflow management system
# Slurm Overview

This document explains how Torc simplifies running workflows on Slurm-based HPC systems. The key
insight is that **you don't need to understand Slurm schedulers or workflow actions** to run
workflows on HPC systems—Torc handles this automatically.

## The Simple Approach

Running a workflow on Slurm requires just two things:

1. **Define your jobs with resource requirements**
2. **Submit with `submit-slurm`**

That's it. Torc will analyze your workflow, generate appropriate Slurm configurations, and submit
everything for execution.

> **⚠️ Important:** The `submit-slurm` command uses heuristics to auto-generate Slurm schedulers and
> workflow actions. For complex workflows with unusual dependency patterns, the generated
> configuration may request allocations too early or too late. **Always preview the configuration
> first** using `torc slurm generate` (see
> [Previewing Generated Configuration](#previewing-generated-configuration)) before submitting
> production workflows.

### Example Workflow

Here's a complete workflow specification that runs on Slurm:

```yaml
name: data_analysis_pipeline
description: Analyze experimental data with preprocessing, training, and evaluation

resource_requirements:
  - name: light
    num_cpus: 4
    memory: 8g
    runtime: PT30M

  - name: compute
    num_cpus: 32
    memory: 64g
    runtime: PT2H

  - name: gpu
    num_cpus: 16
    num_gpus: 2
    memory: 128g
    runtime: PT4H

jobs:
  - name: preprocess
    command: python preprocess.py --input data/ --output processed/
    resource_requirements: light

  - name: train_model
    command: python train.py --data processed/ --output model/
    resource_requirements: gpu
    depends_on: [preprocess]

  - name: evaluate
    command: python evaluate.py --model model/ --output results/
    resource_requirements: compute
    depends_on: [train_model]

  - name: generate_report
    command: python report.py --results results/
    resource_requirements: light
    depends_on: [evaluate]
```

### Submitting the Workflow

```bash
torc submit-slurm --account myproject workflow.yaml
```

Torc will:

1. Detect which HPC system you're on (e.g., NLR Kestrel)
2. Match each job's requirements to appropriate partitions
3. Generate Slurm scheduler configurations
4. Create workflow actions that stage resource allocation based on dependencies
5. Submit the workflow for execution

## How It Works

When you use `submit-slurm`, Torc analyzes your workflow in four steps:

### 1. Per-Job Scheduler Generation

Each job is assigned a Slurm scheduler configuration based on its resource requirements (by default,
one scheduler per unique `resource_requirements` name). This means:

- Jobs are matched to the most appropriate partition
- Memory, CPU, and GPU requirements are correctly specified
- Walltime is derived from job runtimes and capped at the partition's maximum (explained below)

### 2. Staged Resource Allocation

Torc analyzes job dependencies and creates **staged workflow actions**:

- **Jobs without dependencies** trigger `on_workflow_start` — resources are allocated immediately
- **Jobs with dependencies** trigger `on_jobs_ready` — resources are allocated only when the job
  becomes ready to run

This prevents wasting allocation time on resources that aren't needed yet. For example, in the
workflow above:

- `preprocess` resources are allocated at workflow start
- `train_model` resources are allocated when `preprocess` completes
- `evaluate` resources are allocated when `train_model` completes
- `generate_report` resources are allocated when `evaluate` completes
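
The trigger rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not Torc's implementation; the `jobs` mapping and the `choose_trigger` helper are hypothetical names introduced for the example.

```python
# Hypothetical in-memory view of the example workflow: job name -> depends_on list.
jobs = {
    "preprocess": [],
    "train_model": ["preprocess"],
    "evaluate": ["train_model"],
    "generate_report": ["evaluate"],
}


def choose_trigger(depends_on):
    """Pick the trigger type for a job based on whether it has dependencies."""
    return "on_jobs_ready" if depends_on else "on_workflow_start"


triggers = {name: choose_trigger(deps) for name, deps in jobs.items()}

for name, trigger in triggers.items():
    print(f"{name}: {trigger}")
```

Only `preprocess` maps to `on_workflow_start`; the other three jobs map to `on_jobs_ready`, matching the list above.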

### 3. Walltime Calculation

By default, Torc sets the walltime to **1.5× your longest job's runtime** (capped at the partition's
maximum). This provides headroom for jobs that run slightly longer than expected.

You can customize this behavior:

- `--walltime-strategy max-job-runtime` (default): Uses longest job runtime × multiplier
- `--walltime-strategy max-partition-time`: Uses the partition's maximum walltime
- `--walltime-multiplier 2.0`: Change the safety multiplier (default: 1.5)

See [Walltime Strategy Options](#walltime-strategy-options) for details.
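
As a worked example of the arithmetic, the following Python sketch computes a walltime from the two strategies described above. The function name `scheduler_walltime` and the 48-hour partition limit are assumptions made for illustration; Torc's internal logic may differ in detail.

```python
from datetime import timedelta


def scheduler_walltime(job_runtimes, partition_max,
                       strategy="max-job-runtime", multiplier=1.5):
    """Return the walltime to request, following the strategies described above."""
    if strategy == "max-partition-time":
        return partition_max
    if strategy != "max-job-runtime":
        raise ValueError(f"unknown walltime strategy: {strategy!r}")
    # Longest job runtime times the safety multiplier, capped at the partition maximum.
    padded = max(job_runtimes) * multiplier
    return min(padded, partition_max)


runtimes = [timedelta(minutes=30), timedelta(hours=4)]
partition_max = timedelta(hours=48)  # assumed limit, for illustration only

default_walltime = scheduler_walltime(runtimes, partition_max)
capped_walltime = scheduler_walltime(runtimes, timedelta(hours=5), multiplier=2.0)

print(default_walltime)  # 6:00:00 (4 hours x 1.5)
print(capped_walltime)   # 5:00:00 (8 hours capped at the 5-hour partition limit)
```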

### 4. HPC Profile Knowledge

Torc includes built-in knowledge of HPC systems like NLR Kestrel, including:

- Available partitions and their resource limits
- GPU configurations
- Memory and CPU specifications
- Special requirements (e.g., minimum node counts for high-bandwidth partitions)

> **Using an unsupported HPC?** Please
> [request built-in support](https://github.com/NatLabRockies/torc/issues) so everyone benefits. You
> can also [create a custom profile](./custom-hpc-profile.md) for immediate use.

## Resource Requirements Specification

Resource requirements are the key to the simplified workflow. Define them once and reference them
from jobs:

```yaml
resource_requirements:
  - name: small
    num_cpus: 4
    num_gpus: 0
    num_nodes: 1
    memory: 8g
    runtime: PT1H

  - name: gpu_training
    num_cpus: 32
    num_gpus: 4
    num_nodes: 1
    memory: 256g
    runtime: PT8H
```

### Fields

| Field       | Description               | Example             |
| ----------- | ------------------------- | ------------------- |
| `name`      | Reference name for jobs   | `"compute"`         |
| `num_cpus`  | CPU cores required        | `32`                |
| `num_gpus`  | GPUs required (0 if none) | `2`                 |
| `num_nodes` | Nodes required            | `1`                 |
| `memory`    | Memory with unit suffix   | `"64g"`, `"512m"`   |
| `runtime`   | ISO8601 duration          | `"PT2H"`, `"PT30M"` |

### Runtime Format

Use ISO8601 duration format:

- `PT30M` — 30 minutes
- `PT2H` — 2 hours
- `PT1H30M` — 1 hour 30 minutes
- `P1D` — 1 day
- `P2DT4H` — 2 days 4 hours
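
To sanity-check a runtime string before putting it in a workflow file, a small parser such as the one below can help. It is a minimal sketch that handles only the day, hour, minute, and second designators used in the examples above (no weeks, months, years, or fractional values), and `parse_duration` is a hypothetical helper, not part of Torc.

```python
import re
from datetime import timedelta

# Matches durations such as PT30M, PT1H30M, P1D, and P2DT4H.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text):
    """Convert a simple ISO 8601 duration string into a timedelta."""
    match = _DURATION.match(text)
    if match is None or text.endswith("T"):
        raise ValueError(f"unsupported ISO 8601 duration: {text!r}")
    parts = {key: int(value) for key, value in match.groupdict().items() if value is not None}
    if not parts:
        raise ValueError(f"duration has no components: {text!r}")
    return timedelta(**parts)


for example in ["PT30M", "PT2H", "PT1H30M", "P1D", "P2DT4H"]:
    print(example, "->", parse_duration(example))
```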

## Job Dependencies

Define dependencies explicitly or implicitly through file/data relationships:

### Explicit Dependencies

```yaml
jobs:
  - name: step1
    command: ./step1.sh
    resource_requirements: small

  - name: step2
    command: ./step2.sh
    resource_requirements: small
    depends_on: [step1]

  - name: step3
    command: ./step3.sh
    resource_requirements: small
    depends_on: [step1, step2]  # Waits for both
```

### Implicit Dependencies (via Files)

```yaml
files:
  - name: raw_data
    path: /data/raw.csv
  - name: processed_data
    path: /data/processed.csv

jobs:
  - name: process
    command: python process.py
    input_files: [raw_data]
    output_files: [processed_data]
    resource_requirements: compute

  - name: analyze
    command: python analyze.py
    input_files: [processed_data]  # Creates implicit dependency on 'process'
    resource_requirements: compute
```
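
The idea behind implicit dependencies can be sketched as follows: a job depends on whichever job produces one of its input files. This is an illustrative reconstruction under that assumption, not Torc's actual resolution code, and `implicit_dependencies` is a hypothetical helper.

```python
# The two jobs from the example above, reduced to their file relationships.
jobs = [
    {"name": "process", "input_files": ["raw_data"], "output_files": ["processed_data"]},
    {"name": "analyze", "input_files": ["processed_data"], "output_files": []},
]


def implicit_dependencies(jobs):
    """Map each job to the jobs that produce its input files."""
    producers = {}
    for job in jobs:
        for file_name in job.get("output_files", []):
            producers[file_name] = job["name"]

    dependencies = {}
    for job in jobs:
        upstream = {
            producers[file_name]
            for file_name in job.get("input_files", [])
            if file_name in producers and producers[file_name] != job["name"]
        }
        dependencies[job["name"]] = sorted(upstream)
    return dependencies


dependencies = implicit_dependencies(jobs)
print(dependencies)  # {'process': [], 'analyze': ['process']}
```

Here `raw_data` has no producer in the workflow, so `process` has no upstream job, while `analyze` picks up a dependency on `process` through `processed_data`.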

## Previewing Generated Configuration

> **Recommended Practice:** Always preview the generated configuration before submitting to Slurm,
> especially for complex workflows. This allows you to verify that schedulers and actions are
> appropriate for your workflow structure.

### Viewing the Execution Plan

Before generating schedulers, visualize how your workflow will execute in stages:

```bash
torc workflows execution-plan workflow.yaml
```

This shows the execution stages, which jobs run at each stage, and (if schedulers are defined) when
Slurm allocations are requested. See
[Visualizing Workflow Structure](../../core/workflows/visualizing-workflows.md) for detailed
examples.
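
To build intuition for what "stages" means, the sketch below groups jobs into stages by repeatedly taking every job whose dependencies are already satisfied. It is a generic layering of a dependency graph, assumed here for illustration; the output of `torc workflows execution-plan` is the authoritative view. The job names and the `execution_stages` helper are hypothetical.

```python
def execution_stages(depends_on):
    """Group jobs into stages; each stage depends only on jobs in earlier stages."""
    remaining = {job: set(deps) for job, deps in depends_on.items()}
    stages = []
    while remaining:
        ready = sorted(job for job, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle or unknown job name in depends_on")
        stages.append(ready)
        for job in ready:
            del remaining[job]
        for deps in remaining.values():
            deps.difference_update(ready)
    return stages


# Hypothetical fan-out / fan-in workflow.
workflow = {
    "preprocess": [],
    "work_1": ["preprocess"],
    "work_2": ["preprocess"],
    "postprocess": ["work_1", "work_2"],
}

stages = execution_stages(workflow)
for index, stage in enumerate(stages, start=1):
    print(f"Stage {index}: {', '.join(stage)}")
```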

### Generating Slurm Configuration

Preview what Torc will generate:

```bash
torc slurm generate --account myproject --profile kestrel workflow.yaml
```

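This outputs the complete workflow with generated schedulers and actions. The options below control
how jobs are grouped into schedulers and how walltime is calculated; example output follows them.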

#### Scheduler Grouping Options

By default, Torc creates **one scheduler per unique `resource_requirements` name**. This means if
you have three jobs with three different resource requirement definitions (e.g., `cpu`, `memory`,
`mixed`), you get three schedulers—even if all three would fit on the same partition.

The `--group-by` option controls how jobs are grouped into schedulers:

```bash
# Default: one scheduler per resource_requirements name
torc slurm generate --account myproject workflow.yaml
torc slurm generate --account myproject --group-by resource-requirements workflow.yaml
# Result: 3 schedulers (cpu_scheduler, memory_scheduler, mixed_scheduler)

# Group by partition: one scheduler per partition
torc slurm generate --account myproject --group-by partition workflow.yaml
# Result: 1 scheduler (short_scheduler) if all jobs fit on the "short" partition
```

**When to use `--group-by partition`:**

- Your workflow has many small resource requirement definitions that all fit on the same partition
- You want to minimize Slurm queue overhead by reducing the number of allocations
- Jobs have similar characteristics and can share nodes efficiently

**When to use `--group-by resource-requirements` (default):**

- Jobs have significantly different resource profiles that benefit from separate allocations
- You want fine-grained control over which jobs share resources
- You're debugging and want clear separation between job types

When grouping by partition, the scheduler uses the **maximum** resource values from all grouped
requirements (max memory, max CPUs, max runtime, etc.) to ensure all jobs can run.
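
The "maximum of each field" rule can be illustrated with a short Python sketch. The partition assignment, the numeric values, and the `merge_by_partition` helper are all assumptions made for the example; in practice Torc performs the partition matching itself.

```python
from datetime import timedelta

# Hypothetical requirements, each already matched to a partition.
requirements = [
    {"name": "cpu", "partition": "short", "num_cpus": 32, "memory_gb": 16,
     "runtime": timedelta(hours=1)},
    {"name": "memory", "partition": "short", "num_cpus": 4, "memory_gb": 128,
     "runtime": timedelta(hours=2)},
    {"name": "mixed", "partition": "short", "num_cpus": 16, "memory_gb": 64,
     "runtime": timedelta(minutes=30)},
]


def merge_by_partition(requirements):
    """Combine requirements per partition, keeping the maximum of each field."""
    merged = {}
    for req in requirements:
        current = merged.setdefault(
            req["partition"],
            {"num_cpus": 0, "memory_gb": 0, "runtime": timedelta(0)},
        )
        for field in ("num_cpus", "memory_gb", "runtime"):
            current[field] = max(current[field], req[field])
    return merged


merged = merge_by_partition(requirements)
print(merged["short"])
```

The single `short` entry ends up with 32 CPUs, 128 GB of memory, and a 2-hour runtime, which is large enough for any of the three grouped requirements.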

#### Walltime Strategy Options

The `--walltime-strategy` option controls how Torc calculates the walltime for generated schedulers:

```bash
# Default: use max job runtime with a safety multiplier (1.5x)
torc slurm generate --account myproject workflow.yaml
torc slurm generate --account myproject --walltime-strategy max-job-runtime workflow.yaml

# Use the partition's maximum allowed walltime
torc slurm generate --account myproject --walltime-strategy max-partition-time workflow.yaml
```

**Walltime strategies:**

| Strategy             | Description                                                                               |
| -------------------- | ----------------------------------------------------------------------------------------- |
| `max-job-runtime`    | Uses the longest job's runtime × multiplier (default: 1.5x). Capped at partition max.     |
| `max-partition-time` | Uses the partition's maximum walltime. More conservative but may impact queue scheduling. |

**Customizing the multiplier:**

The `--walltime-multiplier` option (default: 1.5) provides a safety margin when using
`max-job-runtime`:

```bash
# Use 2x the max job runtime for extra buffer
torc slurm generate --account myproject --walltime-multiplier 2.0 workflow.yaml

# Use exact job runtime (no buffer - use with caution)
torc slurm generate --account myproject --walltime-multiplier 1.0 workflow.yaml
```

**When to use `max-job-runtime` (default):**

- You want better queue scheduling (shorter walltime requests often get prioritized)
- Your job runtime estimates are reasonably accurate
- You prefer the Torc runner to exit early rather than holding idle allocations

**When to use `max-partition-time`:**

- Your job runtimes are highly variable or unpredictable
- You consistently underestimate job runtimes
- Queue priority is not a concern

An abridged example of the output that `torc slurm generate` produces for the workflow above
(walltime values depend on the selected strategy and multiplier):

```yaml
name: data_analysis_pipeline
# ... original content ...

jobs:
  - name: preprocess
    command: python preprocess.py --input data/ --output processed/
    resource_requirements: light
    scheduler: preprocess_scheduler

  # ... more jobs ...

slurm_schedulers:
  - name: preprocess_scheduler
    account: myproject
    mem: 8g
    nodes: 1
    walltime: "04:00:00"

  - name: train_model_scheduler
    account: myproject
    mem: 128g
    nodes: 1
    gres: "gpu:2"
    walltime: "04:00:00"

  # ... more schedulers ...

actions:
  - trigger_type: on_workflow_start
    action_type: schedule_nodes
    scheduler: preprocess_scheduler
    scheduler_type: slurm
    num_allocations: 1

  - trigger_type: on_jobs_ready
    action_type: schedule_nodes
    jobs: [train_model]
    scheduler: train_model_scheduler
    scheduler_type: slurm
    num_allocations: 1

  # ... more actions ...
```

Save the output to inspect or modify before submission:

```bash
torc slurm generate --account myproject workflow.yaml -o workflow_with_schedulers.yaml
```

## Torc Server Considerations

The Torc server must be accessible to compute nodes. Options include:

1. **Shared server** (Recommended): A team member allocates a dedicated server in the HPC
   environment
2. **Login node**: Suitable for small workflows with few, long-running jobs

For large workflows with many short jobs, a dedicated server prevents overloading login nodes.

## Best Practices

### 1. Focus on Resource Requirements

Spend time accurately defining resource requirements. Torc handles the rest:

```yaml
resource_requirements:
  # Be specific about what each job type needs
  - name: io_heavy
    num_cpus: 4
    memory: 32g      # High memory for data loading
    runtime: PT1H

  - name: compute_heavy
    num_cpus: 64
    memory: 16g      # Less memory, more CPU
    runtime: PT4H
```

### 2. Use Meaningful Names

Name resource requirements by their purpose, not by partition:

```yaml
# Good - describes the workload
resource_requirements:
  - name: data_preprocessing
  - name: model_training
  - name: inference

# Avoid - ties you to specific infrastructure
resource_requirements:
  - name: short_partition
  - name: gpu_h100
```

### 3. Group Similar Jobs

Jobs with similar requirements can share resource requirement definitions:

```yaml
resource_requirements:
  - name: quick_task
    num_cpus: 2
    memory: 4g
    runtime: PT15M

jobs:
  - name: validate_input
    command: ./validate.sh
    resource_requirements: quick_task

  - name: check_output
    command: ./check.sh
    resource_requirements: quick_task
    depends_on: [main_process]  # main_process is defined elsewhere in the workflow
```

### 4. Test Locally First

Validate your workflow logic locally before submitting to HPC:

```bash
# Run locally (without Slurm)
torc run workflow.yaml

# Then submit to HPC
torc submit-slurm --account myproject workflow.yaml
```

## Limitations and Caveats

The auto-generation in `submit-slurm` uses heuristics that work well for common workflow patterns
but may not be optimal for all cases:

### When Auto-Generation Works Well

- **Linear pipelines**: A → B → C → D
- **Fan-out patterns**: One job unblocks many (e.g., preprocess → 100 work jobs)
- **Fan-in patterns**: Many jobs unblock one (e.g., 100 work jobs → postprocess)
- **Simple DAGs**: Clear dependency structures with distinct resource tiers

### When to Use Manual Configuration

Consider using `torc slurm generate` to preview and manually adjust, or define schedulers manually,
when:

- **Complex dependency graphs**: Multiple interleaved dependency patterns
- **Shared schedulers**: You want multiple jobs to share the same Slurm allocation
- **Custom timing**: Specific requirements for when allocations should be requested
- **Resource optimization**: Fine-tuning to minimize allocation waste
- **Multi-node jobs**: Jobs requiring coordination across multiple nodes (see
  [Multi-Node Jobs](./multi-node-jobs.md))

### What Could Go Wrong

Without previewing, auto-generation might:

1. **Request allocations too early**: Holding idle nodes while dependencies finish
2. **Request allocations too late**: Adding latency to job startup
3. **Create suboptimal scheduler groupings**: Not sharing allocations when beneficial
4. **Miss optimization opportunities**: Not recognizing patterns that could share resources

**Best Practice**: For production workflows, always run `torc slurm generate` first, review the
output, and submit the reviewed configuration with `torc submit`.

## Advanced: Manual Scheduler Configuration

For advanced users who need fine-grained control, you can define schedulers and actions manually.
See [Advanced Slurm Configuration](./slurm.md) for details.

Common reasons for manual configuration:

- Non-standard partition requirements
- Custom Slurm directives (e.g., `--constraint`)
- Multi-node jobs with specific topology requirements
- Reusing allocations across multiple jobs for efficiency

## Troubleshooting

### "No partition found for job"

Your resource requirements exceed what's available. Check:

- Memory doesn't exceed partition limits
- Runtime doesn't exceed partition walltime
- GPU count is available on GPU partitions

Use `torc hpc partitions <profile>` to see available resources.
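
The checklist above amounts to comparing each requirement against each partition's limits. The sketch below shows that comparison with made-up limits; the partition names and numbers are hypothetical and are not Kestrel's real values, so substitute the figures reported by `torc hpc partitions <profile>`.

```python
from datetime import timedelta

# Hypothetical partition limits, for illustration only.
partitions = {
    "short": {"max_memory_gb": 240, "max_walltime": timedelta(hours=4), "max_gpus": 0},
    "gpu": {"max_memory_gb": 360, "max_walltime": timedelta(hours=48), "max_gpus": 4},
}


def blocking_limits(requirement, limits):
    """List which limits prevent a requirement from fitting on a partition."""
    problems = []
    if requirement["memory_gb"] > limits["max_memory_gb"]:
        problems.append("memory")
    if requirement["runtime"] > limits["max_walltime"]:
        problems.append("runtime")
    if requirement["num_gpus"] > limits["max_gpus"]:
        problems.append("gpus")
    return problems


requirement = {"memory_gb": 512, "runtime": timedelta(hours=8), "num_gpus": 2}
report = {name: blocking_limits(requirement, limits) for name, limits in partitions.items()}

for name, problems in report.items():
    print(f"{name}: {', '.join(problems) if problems else 'fits'}")
```

With these assumed limits, the requirement fails on `short` for memory, runtime, and GPUs, and on `gpu` for memory alone, which points to the memory request as the value to reduce.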

### Jobs Not Starting

Ensure the Torc server is accessible from compute nodes:

```bash
# From a compute node
curl $TORC_API_URL/health
```

### Wrong Partition Selected

Use `torc hpc match` to see which partitions match your requirements:

```bash
torc hpc match kestrel --cpus 32 --memory 64g --walltime 02:00:00 --gpus 2
```

## See Also

- [Visualizing Workflow Structure](../../core/workflows/visualizing-workflows.md): Execution plans
  and DAG visualization
- [HPC Profiles](./hpc-profiles.md): Detailed HPC profile usage
- [Advanced Slurm Configuration](./slurm.md): Manual Slurm scheduler setup
- [Resource Requirements Reference](../../core/reference/resources.md): Complete specification
- [Workflow Actions](../design/workflow-actions.md): Understanding actions