Panlabel
The universal annotation converter
If you've ever written a one-off Python script to wrangle COCO annotations into YOLO format (or vice versa), panlabel is here to save you the trouble. It's a fast, single-binary CLI that converts between common object detection annotation formats — with built-in validation, clear lossiness warnings, and no Python dependencies to manage.
Panlabel's current core scope is bbox conversion for mainstream object detection: 2D axis-aligned boxes on static images. It does not provide first-class support for segmentation, keypoints/pose, oriented boxes, video tracking IDs, or 3D/multisensor labels. When a broader schema includes these richer structures, panlabel either skips and reports them or treats the conversion as lossy.
Panlabel is also available as a Rust library if you want to integrate format conversion into your own tools.
Note: Panlabel is in active development (v0.5.x). The CLI and library APIs may change between versions, so pin to a specific version if you're using it in production.
Installation
pip / uv (any platform)
# or
This installs a pre-built binary — no Rust toolchain needed.
Homebrew (macOS / Linux)
Shell script (macOS / Linux)
PowerShell (Windows)
powershell -ExecutionPolicy Bypass -c "irm https://github.com/strickvl/panlabel/releases/latest/download/panlabel-installer.ps1 | iex"
Cargo (from source)
# Enable full HF support (remote Hub import + metadata.parquet)
Pre-built binaries
Download from the latest GitHub Release. Builds are available for macOS (Intel + Apple Silicon), Linux (x86_64 + ARM64), and Windows.
Docker
# Convert a COCO file in your current directory to YOLO
Multi-arch images (amd64 + arm64) are published for each release.
As a Rust library
Quick start
# Convert between formats (auto-detects the input)
# Check a dataset for problems before training
# Get a quick overview of what's in a dataset
# Compare two datasets semantically
# Sample a smaller subset for quick experiments
# See every supported format and its capabilities
The convert shape is always -f <source> -t <dest> -i <input> -o <output> — pick any source/destination from the Supported formats table. See More convert examples below for lossless vs. lossy conversions, machine-readable JSON reports, dry runs, and remote Hugging Face datasets.
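If you drive panlabel from a script, the fixed `-f/-t/-i/-o` shape makes the command easy to assemble. The sketch below is illustrative only: `build_convert_command` is a hypothetical helper, the file names are placeholders, and it assumes a `panlabel` binary is on your `PATH`. It uses COCO to IR JSON because that direction is described as lossless.

```python
from typing import List


def build_convert_command(source: str, dest: str, input_path: str, output_path: str) -> List[str]:
    """Assemble argv for `panlabel convert` using the -f/-t/-i/-o shape.

    Hypothetical helper: it only builds the argument list and runs nothing.
    """
    return [
        "panlabel", "convert",
        "-f", source,
        "-t", dest,
        "-i", input_path,
        "-o", output_path,
    ]


# Placeholder file names, chosen for illustration.
cmd = build_convert_command("coco", "ir-json", "annotations.json", "dataset.ir.json")
print(" ".join(cmd))

# To actually run the conversion (requires panlabel to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.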
What can panlabel do?
| Command | What it does |
|---|---|
| convert | Convert between annotation formats, with clear warnings about what (if anything) gets lost |
| validate | Check your dataset for common problems: duplicate IDs, missing references, invalid bounding boxes |
| stats | Show rich dataset statistics in text, JSON, or HTML |
| diff | Compare two datasets semantically (summary or detailed output) |
| sample | Create subset datasets (random or stratified), with optional category filtering and JSON reports |
| list-formats | Show which formats are supported and their read/write/lossiness capabilities, including JSON discovery output |
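To make the kinds of problems `validate` looks for concrete, here is a minimal sketch of the three checks named above, run against a COCO-style dictionary. It is not panlabel's implementation: `find_problems`, the message wording, and the sample data are all made up for illustration, and the real command may check more and report differently.

```python
from collections import Counter
from typing import Dict, List


def find_problems(dataset: Dict) -> List[str]:
    """Return human-readable problems found in a COCO-style dataset dict."""
    problems = []

    # Duplicate IDs: the same image id appears more than once.
    image_ids = [img["id"] for img in dataset.get("images", [])]
    for image_id, count in Counter(image_ids).items():
        if count > 1:
            problems.append(f"duplicate image id {image_id}")

    known_images = set(image_ids)
    known_categories = {cat["id"] for cat in dataset.get("categories", [])}

    for ann in dataset.get("annotations", []):
        # Missing references: annotation points at an unknown image or category.
        if ann["image_id"] not in known_images:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
        if ann["category_id"] not in known_categories:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")

        # Invalid bounding boxes: COCO boxes are [x, y, width, height].
        _, _, width, height = ann["bbox"]
        if width <= 0 or height <= 0:
            problems.append(f"annotation {ann['id']}: invalid bbox {ann['bbox']}")

    return problems


sample_dataset = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 1, "file_name": "b.jpg"}],
    "categories": [{"id": 10, "name": "cat"}],
    "annotations": [
        {"id": 100, "image_id": 1, "category_id": 10, "bbox": [5, 5, 20, 10]},
        {"id": 101, "image_id": 2, "category_id": 10, "bbox": [0, 0, 4, 4]},
        {"id": 102, "image_id": 1, "category_id": 10, "bbox": [3, 3, 0, 8]},
    ],
}

problems = find_problems(sample_dataset)
for problem in problems:
    print(problem)
```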
Supported formats
| Format | Extension / Layout | Description | Lossiness |
|---|---|---|---|
| ir-json | .json | Panlabel's own intermediate representation | Lossless |
| coco | .json | COCO object detection format | Conditional |
| ibm-cloud-annotations | _annotations.json file or directory | IBM Cloud Annotations localization JSON | Lossy |
| cvat | .xml / annotations.xml export | CVAT for images XML annotation export | Lossy |
| label-studio | .json | Label Studio task export JSON (rectanglelabels) | Lossy |
| labelbox | .json / .jsonl / .ndjson | Labelbox current export rows (data_row / projects.*.labels) | Lossy |
| scale-ai | .json file or directory (annotations/) | Scale AI image annotation task/response JSON | Lossy |
| unity-perception | .json file or SOLO-like directory | Unity Perception / SOLO synthetic-data bbox JSON | Lossy |
| tfod | .csv | TensorFlow Object Detection CSV (normalized bbox corners) | Lossy |
| tfrecord | .tfrecord | TensorFlow Object Detection API-style tf.train.Example records (single-file, uncompressed, bbox-only in v1) | Lossy |
| vott-csv | .csv | Microsoft VoTT CSV export (image,xmin,ymin,xmax,ymax,label) | Lossy |
| vott-json | .json file or vott-json-export/ directory | Microsoft VoTT JSON export (assets / per-asset JSON with regions) | Lossy |
| yolo | images/ + labels/ directory, or split data.yaml pointing to image-list .txt files | YOLO .txt labels (flat, split-aware, Scaled-YOLOv4 aliases, optional confidence) | Lossy |
| yolo-keras | .txt file or directory (yolo_keras.txt, annotations.txt, train.txt) | YOLO Keras absolute-coordinate TXT (image xmin,ymin,xmax,ymax,class_id ...) | Lossy |
| yolov4-pytorch | .txt file or directory (yolov4_pytorch.txt, train_annotation.txt, train.txt) | YOLOv4 PyTorch absolute-coordinate TXT (image xmin,ymin,xmax,ymax,class_id ...) | Lossy |
| voc | Annotations/ + JPEGImages/ directory | Pascal VOC XML | Lossy |
| hf | metadata.jsonl / metadata.parquet directory | Hugging Face ImageFolder metadata | Lossy |
| sagemaker | .manifest / .jsonl file | AWS SageMaker Ground Truth object-detection manifest | Lossy |
| labelme | .json file or annotations/ directory | LabelMe per-image JSON annotations | Lossy |
| create-ml | .json | Apple CreateML annotation format | Lossy |
| kitti | label_2/ + image_2/ directory | KITTI object detection labels | Lossy |
| via | .json | VGG Image Annotator (VIA) JSON | Lossy |
| retinanet | .csv | keras-retinanet CSV format | Lossy |
| openimages | .csv | Google OpenImages CSV annotation format | Lossy |
| kaggle-wheat | .csv | Kaggle Global Wheat Detection CSV | Lossy |
| automl-vision | .csv | Google Cloud AutoML Vision CSV | Lossy |
| udacity | .csv | Udacity Self-Driving Car Dataset CSV | Lossy |
| superannotate | .json file or annotations/ directory | SuperAnnotate JSON export | Lossy |
| supervisely | .json file or ann/ / meta.json project directory | Supervisely JSON project / dataset | Lossy |
| cityscapes | .json, gtFine/, or dataset root with gtFine/ | Cityscapes polygon JSON; polygons become bbox envelopes | Lossy |
| marmot | .xml file or directory with same-stem companion images | Marmot XML document-layout composites; hex doubles become pixel bboxes | Lossy |
| datumaro | .json | Datumaro JSON annotation format | Lossy |
| wider-face | .txt | WIDER Face aggregate TXT (single face class in panlabel) | Lossy |
| oidv4 | directory with Label/ or .txt | OIDv4 Toolkit TXT labels (directory probe uses Label/, not YOLO labels/) | Lossy |
| bdd100k | .json | BDD100K / Scalabel JSON detection subset | Lossy |
| v7-darwin | .json | V7 Darwin JSON bbox subset | Lossy |
| edge-impulse | bounding_boxes.labels file or containing directory | Edge Impulse bounding-box labels JSON | Lossy |
| openlabel | .json | ASAM OpenLABEL JSON static-image 2D bbox subset | Lossy |
| via-csv | .csv | VGG Image Annotator CSV (separate format from VIA JSON) | Lossy |
Run panlabel list-formats for the full details, or panlabel list-formats --output json for machine-readable format discovery.
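Many of the formats in the table differ mainly in how they write the same box. The sketch below shows three of the conventions mentioned there: COCO's absolute `[x, y, width, height]`, YOLO's normalized center/size, and normalized corners as in the TFOD row, plus a polygon-to-bbox envelope like the one described for Cityscapes. The function names are hypothetical and the envelope rule (min/max over the vertices) is an assumption about the general technique, not a description of panlabel's internals.

```python
from typing import List, Sequence, Tuple


def coco_to_yolo(bbox: Sequence[float], img_w: int, img_h: int) -> Tuple[float, float, float, float]:
    """COCO [x, y, w, h] in pixels -> YOLO (cx, cy, w, h), all normalized to 0..1."""
    x, y, w, h = bbox
    return ((x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h)


def coco_to_normalized_corners(bbox: Sequence[float], img_w: int, img_h: int) -> Tuple[float, float, float, float]:
    """COCO [x, y, w, h] in pixels -> (xmin, ymin, xmax, ymax), normalized to 0..1."""
    x, y, w, h = bbox
    return (x / img_w, y / img_h, (x + w) / img_w, (y + h) / img_h)


def polygon_envelope(points: List[Tuple[float, float]]) -> Tuple[float, float, float, float]:
    """Smallest axis-aligned box around a polygon, as COCO-style (x, y, w, h).

    The polygon's shape is discarded, which is why this direction is lossy.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))


# A 100x40 box with its top-left corner at (50, 20) in a 200x100 image.
box = [50, 20, 100, 40]
yolo_box = coco_to_yolo(box, img_w=200, img_h=100)
corner_box = coco_to_normalized_corners(box, img_w=200, img_h=100)
envelope = polygon_envelope([(10, 40), (30, 5), (25, 60)])

print(yolo_box)
print(corner_box)
print(envelope)
```

Converting between the box conventions is reversible given the image size, but the polygon envelope cannot be undone, since the original vertices are gone.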
TFRecord support in v1 is intentionally narrow: panlabel currently supports only single-file, uncompressed TensorFlow Object Detection API-style tf.train.Example bbox records (not arbitrary TFRecord payloads).
list-formats shows canonical names (for example label-studio), while commands also accept aliases (for example ls, label-studio-json). Across commands, --output-format is the consistent way to request JSON reports; convert and sample also keep --report as an alias. JSON is pretty-printed on a terminal and compact when piped or captured, which makes it friendlier for scripts and agents. stats also adapts its text renderer: rich/Unicode on a terminal, plain text layout when piped.
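The terminal-versus-pipe behavior is a common CLI pattern. As a rough sketch of the idea (not panlabel's code, and with a made-up report payload), a tool can check whether stdout is a terminal and pick the JSON layout accordingly:

```python
import json
import sys
from typing import Dict


def render_json(report: Dict, is_terminal: bool) -> str:
    """Pretty-print for humans on a terminal, emit compact JSON otherwise."""
    if is_terminal:
        return json.dumps(report, indent=2)
    # No spaces after separators keeps piped output to a single compact line.
    return json.dumps(report, separators=(",", ":"))


# Hypothetical report contents, for illustration only.
report = {"images": 2, "annotations": 5}
print(render_json(report, sys.stdout.isatty()))
```

Either form parses to the same data, so scripts consuming the output do not need to care which one they receive.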
More convert examples
# COCO to IR JSON (lossless — no data lost)
# IR JSON to TFOD (lossy — requires explicit opt-in)
# Auto-detect input format from file extension/content or directory layout
# Request a machine-readable conversion report
# Preview a conversion without touching the output path
# Convert a remote Hugging Face dataset repo to COCO JSON
# (requires --features hf when building from source)
# Convert a zip-style HF dataset repo split to IR JSON (auto-detects extracted payload)
Dry runs still do the real thinking work — format detection, validation, sampling/conversion analysis, and lossiness checks — but they skip the final filesystem write. That means they are good for “what would happen?” checks, but they do not prove that the output path is writable.
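The dry-run idea can be sketched in a few lines: run the analysis in both modes and return before the write when the dry-run flag is set. Everything here is hypothetical (`run_conversion`, the record layout, the report fields) and only illustrates the pattern, not how panlabel is built.

```python
import json
import os
import tempfile
from typing import Dict, List


def run_conversion(records: List[Dict], output_path: str, dry_run: bool = False) -> Dict:
    """Analyze records, then write them unless dry_run is set."""
    # Analysis phase: runs in both modes.
    kept = [r for r in records if r["w"] > 0 and r["h"] > 0]
    report = {"input": len(records), "kept": len(kept), "written": False}

    if dry_run:
        # Skip the filesystem write entirely; the output path is never opened,
        # so a dry run cannot tell you whether it is writable.
        return report

    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(kept, f)
    report["written"] = True
    return report


records = [{"w": 10, "h": 5}, {"w": 0, "h": 5}]

with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "out.json")
    dry_report = run_conversion(records, out, dry_run=True)
    exists_after_dry_run = os.path.exists(out)
    real_report = run_conversion(records, out)
    exists_after_real_run = os.path.exists(out)

print(dry_report, exists_after_dry_run)
print(real_report, exists_after_real_run)
```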
Getting help
Documentation
Want to go deeper? The full docs are readable right here on GitHub:
- Documentation home — start here
- CLI reference — every flag and option
- Format reference — how each format works
- Tasks and use cases — what's supported today
- Conversion and lossiness — understanding what gets lost
- Contributing — we'd love your help
- Roadmap — what's coming next
Contributing
Contributions are welcome! Whether it's a bug report, a new format adapter, or a documentation fix — we appreciate the help. For major changes, please open an issue first so we can discuss the approach.
See the contributing guide for details on the codebase structure and how to make changes.
License
MIT — see LICENSE for details.