# How to Debug a Failed Job
Systematically diagnose why a job failed.
## Step 1: Identify the Failed Job
```bash
torc jobs list <workflow_id> --status failed
```
Note the job ID and name.
## Step 2: Check the Exit Code
```bash
torc results get <workflow_id> --job-id <job_id>
```
Common exit codes:
| Code | Meaning |
|------|---------|
| 1 | General error |
| 2 | Misuse of shell command |
| 126 | Permission denied (command found but not executable) |
| 127 | Command not found |
| 137 | Killed (SIGKILL), often by the OOM killer |
| 139 | Segmentation fault (SIGSEGV) |
| 143 | Terminated (SIGTERM) |
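
Codes above 128 follow the shell convention of 128 plus the signal number, which is why 137 maps to SIGKILL (9) and 143 to SIGTERM (15). The sketch below decodes a code that way. `describe_exit_code` is a hypothetical helper written for this guide, not a torc command.

```bash
# Hypothetical helper (not part of torc): explain an exit code.
describe_exit_code() {
    local code="$1"
    if [ "$code" -eq 0 ]; then
        echo "success"
    elif [ "$code" -gt 128 ] && [ "$code" -le 192 ]; then
        # Shells report "killed by signal N" as 128 + N.
        local sig=$((code - 128))
        local name
        # kill -l <number> prints the signal name without the SIG prefix.
        name="$(kill -l "$sig" 2>/dev/null)" || name="unknown"
        echo "killed by signal ${sig} (SIG${name})"
    else
        echo "exited with status ${code}"
    fi
}

describe_exit_code 137   # prints: killed by signal 9 (SIGKILL)
describe_exit_code 143   # prints: killed by signal 15 (SIGTERM)
describe_exit_code 127   # prints: exited with status 127
```

A signal exit usually means something outside the job stopped it (a scheduler, the kernel, or a user), so check resource limits before looking for a bug in the job itself.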
## Step 3: Read the Logs
```bash
# Get log paths
torc reports results <workflow_id> --job-id <job_id>
# View stderr (usually contains error messages); the paths below are examples, use the ones reported above
cat output/job_stdio/job_wf43_j15_r1_a1.e
# View stdout
cat output/job_stdio/job_wf43_j15_r1_a1.o
# In combined stdio mode, both streams are in a single .log file
cat output/job_stdio/job_wf43_j15_r1_a1.log
```
> **Note:** If `stdio` is configured with `mode: none` or `mode: no_stderr`, log files may not
> exist. See [`StdioConfig`](../reference/workflow-spec.md#stdioconfig) for details.
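
For long logs, it can help to pull out only the lines that look like errors. The following is a minimal sketch. `scan_log` is a hypothetical helper, and the search patterns are assumptions that you should adjust to your application's error messages.

```bash
# Hypothetical helper (not part of torc): show likely error lines in a log file.
scan_log() {
    local file="$1"
    # -s is true only if the file exists and is non-empty.
    if [ ! -s "$file" ]; then
        echo "no output in ${file}" >&2
        return 1
    fi
    # Print matching lines with line numbers; if nothing matches,
    # fall back to the last 20 lines of the file.
    grep -n -i -E 'error|exception|traceback|killed|denied|no such file' "$file" \
        || tail -n 20 "$file"
}

# Example (path is illustrative):
#   scan_log output/job_stdio/job_wf43_j15_r1_a1.e
```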
## Step 4: Check Resource Usage
Did the job exceed its resource limits?
```bash
torc reports check-resource-utilization <workflow_id>
```
Look for:
- **Memory exceeded** — Job was likely OOM-killed (exit code 137)
- **Runtime exceeded** — Job was terminated for running too long
## Step 5: Reproduce Locally
Get the exact command that was run:
```bash
torc jobs get <job_id>
```
Try running it manually to see the error:
```bash
# Copy the command from the output and run it
python process.py --input data.csv
```
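
When reproducing a failure, it is useful to record the exit code and stderr so you can compare them with what torc reported. The sketch below wraps any command for that purpose. `run_and_report` is a hypothetical helper, not a torc feature.

```bash
# Hypothetical helper (not part of torc): run a command, save its stderr,
# and report the exit code.
run_and_report() {
    local errfile
    errfile="$(mktemp)"
    local code=0
    # "$@" runs the given command with its arguments exactly as passed.
    "$@" 2>"$errfile" || code=$?
    echo "exit code: ${code}"
    echo "stderr saved to: ${errfile}"
    return "$code"
}

# Example (command is illustrative):
#   run_and_report python process.py --input data.csv
```

If the local exit code differs from the recorded one, the failure likely depends on the execution environment (memory limits, missing modules, file paths) and not on the command itself.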
## Common Fixes
| Problem | Fix |
|---------|-----|
| OOM killed | Increase `memory` in resource requirements |
| File not found | Verify input files exist, check dependencies |
| Permission denied | Check file permissions, execution bits |
| Timeout | Increase `runtime` in resource requirements |
## Step 6: Fix and Retry
After fixing the issue:
```bash
# Reinitialize to reset failed jobs
torc workflows reset-status --failed --reinitialize <workflow_id>
# Run again locally
torc workflows run <workflow_id>
# Or submit to Slurm instead
torc submit-slurm <workflow_id>
```
## See Also
- [View Job Logs](./view-job-logs.md) — Finding log files
- [Check Resource Utilization](./check-resource-utilization.md) — Resource analysis
- [Debugging Workflows](../monitoring/debugging.md) — Comprehensive debugging guide