Panlabel
The universal annotation converter
If you've ever written a one-off Python script to wrangle COCO annotations into YOLO format (or vice versa), panlabel is here to save you the trouble. It's a fast, single-binary CLI that converts between common object detection annotation formats — with built-in validation, clear lossiness warnings, and no Python dependencies to manage.
Panlabel is also available as a Rust library if you want to integrate format conversion into your own tools.
Note: Panlabel is in active development (v0.5.x). The CLI and library APIs may change between versions, so pin to a specific version if you're using it in production.
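For a sense of what such a one-off script involves, here is a minimal Python sketch of the core COCO-to-YOLO box arithmetic. It is an illustration only, not panlabel's implementation: the function names `coco_to_yolo_bbox` and `yolo_label_line` are hypothetical, and the sketch assumes the usual conventions (COCO boxes as `[x_min, y_min, width, height]` in pixels, YOLO boxes as center and size normalized by the image dimensions).

```python
def coco_to_yolo_bbox(bbox, image_width, image_height):
    """Convert a COCO box [x_min, y_min, width, height] (pixels) into a
    YOLO box (x_center, y_center, width, height) normalized to [0, 1]."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    x_min, y_min, width, height = bbox
    x_center = (x_min + width / 2) / image_width
    y_center = (y_min + height / 2) / image_height
    return (x_center, y_center, width / image_width, height / image_height)


def yolo_label_line(class_index, bbox, image_width, image_height):
    """Format one line of a YOLO .txt label file for a single box."""
    values = coco_to_yolo_bbox(bbox, image_width, image_height)
    return f"{class_index} " + " ".join(f"{v:.6f}" for v in values)
```

A real script also has to map category IDs to class indices, handle directory layout, and decide what to do with fields YOLO cannot store, which is where a dedicated converter earns its keep.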
Installation
pip / uv (any platform)
# or
This installs a pre-built binary — no Rust toolchain needed.
Homebrew (macOS / Linux)
Shell script (macOS / Linux)
|
PowerShell (Windows)
powershell -ExecutionPolicy Bypass -c "irm https://github.com/strickvl/panlabel/releases/latest/download/panlabel-installer.ps1 | iex"
Cargo (from source)
# Enable full HF support (remote Hub import + metadata.parquet)
Pre-built binaries
Download from the latest GitHub Release. Builds are available for macOS (Intel + Apple Silicon), Linux (x86_64 + ARM64), and Windows.
Docker
# Convert a COCO file in your current directory to YOLO
Multi-arch images (amd64 + arm64) are published for each release.
As a Rust library
Quick start
# Convert COCO annotations to YOLO (auto-detects the input format)
# Convert a YOLO dataset to COCO JSON
# Convert a Pascal VOC dataset to COCO JSON
# Convert Label Studio JSON to COCO JSON
# Convert CVAT XML to COCO JSON
# Convert a LabelMe directory to COCO JSON
# Convert Apple CreateML JSON to COCO JSON
# Convert a KITTI dataset to COCO JSON
# Convert VIA JSON to COCO JSON
# Convert RetinaNet CSV to COCO JSON
# Convert local HF ImageFolder metadata to COCO JSON
# Convert remote HF dataset repo to COCO JSON (requires --features hf when building from source)
# Convert a zip-style HF dataset repo split to IR JSON (auto-detects extracted payload format)
# Check a dataset for problems before training
# Get machine-readable validation output
# Get a quick overview of what's in a dataset
# Compare two datasets
# Sample a smaller subset for quick experiments
# Preview a conversion without writing output files
# Preview a deterministic sample without writing output files
# Ask for a machine-readable conversion/sample report
# Discover supported formats as JSON
What can panlabel do?
| Command | What it does |
|---|---|
| convert | Convert between annotation formats, with clear warnings about what (if anything) gets lost |
| validate | Check your dataset for common problems: duplicate IDs, missing references, invalid bounding boxes |
| stats | Show rich dataset statistics in text, JSON, or HTML |
| diff | Compare two datasets semantically (summary or detailed output) |
| sample | Create subset datasets (random or stratified), with optional category filtering and JSON reports |
| list-formats | Show which formats are supported and their read/write/lossiness capabilities, including JSON discovery output |
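To make the kinds of problems validate looks for more concrete, the following Python sketch checks a COCO-style dictionary for duplicate IDs, missing references, and invalid bounding boxes. It is a rough illustration under stated assumptions, not panlabel's actual checks: the function name `find_problems` and the message wording are hypothetical, and the input is assumed to follow the standard COCO layout (`images`, `categories`, `annotations`).

```python
from collections import Counter


def find_problems(dataset):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    images = {img["id"]: img for img in dataset.get("images", [])}
    category_ids = {cat["id"] for cat in dataset.get("categories", [])}

    # Duplicate IDs within each top-level section.
    for section in ("images", "categories", "annotations"):
        counts = Counter(item["id"] for item in dataset.get(section, []))
        for item_id, count in counts.items():
            if count > 1:
                problems.append(f"{section}: duplicate id {item_id}")

    for ann in dataset.get("annotations", []):
        # Missing references to images or categories.
        image = images.get(ann["image_id"])
        if image is None:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")

        # Invalid boxes: non-positive size, or extending past the image.
        x, y, w, h = ann["bbox"]
        if w <= 0 or h <= 0:
            problems.append(f"annotation {ann['id']}: bbox has non-positive size")
        elif image is not None and (
            x < 0 or y < 0 or x + w > image["width"] or y + h > image["height"]
        ):
            problems.append(f"annotation {ann['id']}: bbox extends outside the image")
    return problems
```

An empty list means the sketch found nothing to report; it says nothing about issues outside these three categories.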
Supported formats
| Format | Extension / Layout | Description | Lossiness |
|---|---|---|---|
| ir-json | .json | Panlabel's own intermediate representation | Lossless |
| coco | .json | COCO object detection format | Conditional |
| cvat | .xml / annotations.xml export | CVAT for images XML annotation export | Lossy |
| label-studio | .json | Label Studio task export JSON (rectanglelabels) | Lossy |
| tfod | .csv | TensorFlow Object Detection | Lossy |
| yolo | images/ + labels/ directory | YOLO .txt labels (flat or split-aware, optional confidence) | Lossy |
| voc | Annotations/ + JPEGImages/ directory | Pascal VOC XML | Lossy |
| hf | metadata.jsonl / metadata.parquet directory | Hugging Face ImageFolder metadata | Lossy |
| labelme | .json file or annotations/ directory | LabelMe per-image JSON annotations | Lossy |
| create-ml | .json | Apple CreateML annotation format | Lossy |
| kitti | label_2/ + image_2/ directory | KITTI object detection labels | Lossy |
| via | .json | VGG Image Annotator (VIA) JSON | Lossy |
| retinanet | .csv | keras-retinanet CSV format | Lossy |
Run panlabel list-formats for the full details, or panlabel list-formats --output json for machine-readable format discovery.
list-formats shows canonical names (for example label-studio), while commands also accept aliases (for example ls, label-studio-json). Across commands, --output-format is the consistent way to request JSON reports; convert and sample also keep --report as an alias. JSON is pretty-printed on a terminal and compact when piped or captured, which makes it friendlier for scripts and agents. stats also adapts its text renderer: rich/Unicode on a terminal, plain text layout when piped.
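The terminal-versus-pipe behaviour described above is a common CLI pattern. As a rough Python sketch of the idea (not panlabel's code, which is written in Rust; the function name `render_json` is hypothetical), a tool can ask whether its output stream is a terminal and pick the JSON layout accordingly:

```python
import json
import sys


def render_json(report, stream=None):
    """Pretty-print JSON for a terminal, emit compact JSON otherwise."""
    stream = sys.stdout if stream is None else stream
    if stream.isatty():
        # A person is likely reading this: indent for readability.
        return json.dumps(report, indent=2)
    # Piped or captured: one compact line is easier for scripts to consume.
    return json.dumps(report, separators=(",", ":"))
```

Either layout parses to the same data, so scripts that read the output do not need to care which one they receive.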
More convert examples
# COCO to IR JSON (lossless — no data lost)
# IR JSON to TFOD (lossy — requires explicit opt-in)
# Auto-detect input format from file extension/content or directory layout
# Request a machine-readable conversion report
# Preview a conversion without touching the output path
Dry runs still do the real work (format detection, validation, sampling/conversion analysis, and lossiness checks) but skip the final filesystem write. That makes them good for “what would happen?” checks, but they do not prove that the output path is writable.
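The general dry-run pattern can be sketched in a few lines of Python. This is a simplified illustration under assumptions, not how panlabel is implemented: the function name `write_report` and the shape of the returned summary are hypothetical. All the preparation happens either way, and only the final write is conditional.

```python
import json
from pathlib import Path


def write_report(data, output_path, dry_run=False):
    """Serialize `data`, writing it to `output_path` only when not a dry run."""
    # The analysis/serialization step always runs, dry run or not.
    payload = json.dumps(data, indent=2)
    summary = {
        "output": str(output_path),
        "bytes": len(payload.encode("utf-8")),
        "written": False,
    }
    if not dry_run:
        # Only here does the filesystem get touched, so only here can a
        # permissions or missing-directory error surface.
        Path(output_path).write_text(payload, encoding="utf-8")
        summary["written"] = True
    return summary
```

Note that a dry run against a path in a directory that does not exist still succeeds in this sketch, which mirrors the caveat above: skipping the write also skips any evidence that the write would work.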
Getting help
Documentation
Want to go deeper? The full docs are readable right here on GitHub:
- Documentation home — start here
- CLI reference — every flag and option
- Format reference — how each format works
- Tasks and use cases — what's supported today
- Conversion and lossiness — understanding what gets lost
- Contributing — we'd love your help
- Roadmap — what's coming next
Contributing
Contributions are welcome! Whether it's a bug report, a new format adapter, or a documentation fix — we appreciate the help. For major changes, please open an issue first so we can discuss the approach.
See the contributing guide for details on the codebase structure and how to make changes.
License
MIT — see LICENSE for details.