Panlabel


The universal annotation converter

If you've ever written a one-off Python script to wrangle COCO annotations into YOLO format (or vice versa), panlabel is here to save you the trouble. It's a fast, single-binary CLI that converts between common object detection annotation formats — with built-in validation, clear lossiness warnings, and no Python dependencies to manage.

Panlabel's current core scope is converting 2D axis-aligned bounding boxes for mainstream object detection on static images. It has no first-class support for segmentation, keypoints/pose, oriented boxes, video tracking IDs, or 3D/multisensor labels. When a broader schema includes these richer structures, panlabel either skips and reports them or treats the conversion as lossy.

Panlabel is also available as a Rust library if you want to integrate format conversion into your own tools.

Note: Panlabel is in active development (v0.5.x). The CLI and library APIs may change between versions, so pin to a specific version if you're using it in production.
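If you pin a version, a small guard in CI or a setup script can catch a mismatched install early. The following is only a sketch: `require_panlabel_version` is a hypothetical helper, and it assumes that `panlabel -V` prints a line ending in the version number (for example `panlabel X.Y.Z`), which this README does not spell out.

```shell
# Fail fast when the installed panlabel is not the version a project pinned.
# Usage: require_panlabel_version <expected_version>
require_panlabel_version() {
    expected=$1
    # Assumption: `panlabel -V` prints a line ending in the version number,
    # for example "panlabel X.Y.Z".
    actual=$(panlabel -V) || return 1
    case "$actual" in
        *" $expected") return 0 ;;
        *)
            echo "expected panlabel $expected, found: $actual" >&2
            return 1
            ;;
    esac
}
```

Call it with whatever version you pinned at install time, before running any conversions.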

Installation

pip / uv (any platform)

pip install panlabel
# or
uv pip install panlabel

This installs a pre-built binary — no Rust toolchain needed.

Homebrew (macOS / Linux)

brew install strickvl/tap/panlabel

Shell script (macOS / Linux)

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/strickvl/panlabel/releases/latest/download/panlabel-installer.sh | sh

PowerShell (Windows)

powershell -ExecutionPolicy Bypass -c "irm https://github.com/strickvl/panlabel/releases/latest/download/panlabel-installer.ps1 | iex"

Cargo (from source)

cargo install panlabel
# Enable full HF support (remote Hub import + metadata.parquet)
cargo install panlabel --features hf

Pre-built binaries

Download from the latest GitHub Release. Builds are available for macOS (Intel + Apple Silicon), Linux (x86_64 + ARM64), and Windows.

Docker

docker pull strickvl/panlabel
# Convert a COCO file in your current directory to YOLO
docker run --rm -v "$PWD":/data strickvl/panlabel convert -f coco -t yolo -i /data/annotations.json -o /data/yolo_out --allow-lossy

Multi-arch images (amd64 + arm64) are published for each release.

As a Rust library

cargo add panlabel

Quick start

# Convert between formats (auto-detects the input)
panlabel convert --from auto --to yolo -i annotations.json -o ./yolo_out --allow-lossy

# Check a dataset for problems before training
panlabel validate --format coco annotations.json

# Get a quick overview of what's in a dataset
panlabel stats --format coco annotations.json

# Compare two datasets semantically
panlabel diff --format-a auto --format-b auto old.json new.json

# Sample a smaller subset for quick experiments
panlabel sample -i annotations.json -o sample.ir.json --from auto --to ir-json -n 100 --seed 42

# See every supported format and its capabilities
panlabel list-formats

The convert shape is always -f <source> -t <dest> -i <input> -o <output> — pick any source/destination from the Supported formats table. See More convert examples below for lossless vs. lossy conversions, machine-readable JSON reports, dry runs, and remote Hugging Face datasets.
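Because the shape is fixed, it is easy to wrap in a loop. As a rough sketch, the hypothetical helper below converts every COCO `.json` file in a directory to IR JSON. The function name, the directory arguments, and the `.ir.json` output naming are illustrative choices, not part of panlabel.

```shell
# Convert every COCO JSON file in a directory to panlabel's IR JSON.
# Usage: convert_dir_to_ir <input_dir> <output_dir>
convert_dir_to_ir() {
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for src in "$in_dir"/*.json; do
        # Skip the unexpanded glob when the directory has no .json files.
        [ -e "$src" ] || continue
        name=$(basename "$src" .json)
        # Same shape as above: -f <source> -t <dest> -i <input> -o <output>.
        # COCO to IR JSON is lossless, so --allow-lossy is not passed.
        panlabel convert -f coco -t ir-json -i "$src" -o "$out_dir/$name.ir.json" || return 1
    done
}
```

The loop stops at the first failed conversion so that a bad file does not go unnoticed in a long batch.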

What can panlabel do?

Command	What it does
`convert`	Convert between annotation formats, with clear warnings about what (if anything) gets lost
`validate`	Check your dataset for common problems — duplicate IDs, missing references, invalid bounding boxes
`stats`	Show rich dataset statistics in text, JSON, or HTML
`diff`	Compare two datasets semantically (summary or detailed output)
`sample`	Create subset datasets (random or stratified), with optional category filtering and JSON reports
`list-formats`	Show which formats are supported and their read/write/lossiness capabilities, including JSON discovery output

Supported formats

Format	Extension / Layout	Description	Lossiness
`ir-json`	`.json`	Panlabel's own intermediate representation	Lossless
`coco`	`.json`	COCO object detection format	Conditional
`ibm-cloud-annotations`	`_annotations.json` file or directory	IBM Cloud Annotations localization JSON	Lossy
`cvat`	`.xml` / `annotations.xml` export	CVAT for images XML annotation export	Lossy
`label-studio`	`.json`	Label Studio task export JSON (`rectanglelabels`)	Lossy
`labelbox`	`.json` / `.jsonl` / `.ndjson`	Labelbox current export rows (`data_row` / `projects.*.labels`)	Lossy
`scale-ai`	`.json` file or directory (`annotations/`)	Scale AI image annotation task/response JSON	Lossy
`unity-perception`	`.json` file or SOLO-like directory	Unity Perception / SOLO synthetic-data bbox JSON	Lossy
`tfod`	`.csv`	TensorFlow Object Detection CSV (normalized bbox corners)	Lossy
`tfrecord`	`.tfrecord`	TensorFlow Object Detection API-style `tf.train.Example` records (single-file, uncompressed, bbox-only in v1)	Lossy
`vott-csv`	`.csv`	Microsoft VoTT CSV export (`image,xmin,ymin,xmax,ymax,label`)	Lossy
`vott-json`	`.json` file or `vott-json-export/` directory	Microsoft VoTT JSON export (`assets` / per-asset JSON with `regions`)	Lossy
`yolo`	`images/ + labels/` directory, or split `data.yaml` pointing to image-list `.txt` files	YOLO `.txt` labels (flat, split-aware, Scaled-YOLOv4 aliases, optional confidence)	Lossy
`yolo-keras`	`.txt` file or directory (`yolo_keras.txt`, `annotations.txt`, `train.txt`)	YOLO Keras absolute-coordinate TXT (`image xmin,ymin,xmax,ymax,class_id ...`)	Lossy
`yolov4-pytorch`	`.txt` file or directory (`yolov4_pytorch.txt`, `train_annotation.txt`, `train.txt`)	YOLOv4 PyTorch absolute-coordinate TXT (`image xmin,ymin,xmax,ymax,class_id ...`)	Lossy
`voc`	`Annotations/ + JPEGImages/` directory	Pascal VOC XML	Lossy
`hf`	`metadata.jsonl` / `metadata.parquet` directory	Hugging Face ImageFolder metadata	Lossy
`sagemaker`	`.manifest` / `.jsonl` file	AWS SageMaker Ground Truth object-detection manifest	Lossy
`labelme`	`.json` file or `annotations/` directory	LabelMe per-image JSON annotations	Lossy
`create-ml`	`.json`	Apple CreateML annotation format	Lossy
`kitti`	`label_2/ + image_2/` directory	KITTI object detection labels	Lossy
`via`	`.json`	VGG Image Annotator (VIA) JSON	Lossy
`retinanet`	`.csv`	keras-retinanet CSV format	Lossy
`openimages`	`.csv`	Google OpenImages CSV annotation format	Lossy
`kaggle-wheat`	`.csv`	Kaggle Global Wheat Detection CSV	Lossy
`automl-vision`	`.csv`	Google Cloud AutoML Vision CSV	Lossy
`udacity`	`.csv`	Udacity Self-Driving Car Dataset CSV	Lossy
`superannotate`	`.json` file or `annotations/` directory	SuperAnnotate JSON export	Lossy
`supervisely`	`.json` file or `ann/` / `meta.json` project directory	Supervisely JSON project / dataset	Lossy
`cityscapes`	`.json`, `gtFine/`, or dataset root with `gtFine/`	Cityscapes polygon JSON; polygons become bbox envelopes	Lossy
`marmot`	`.xml` file or directory with same-stem companion images	Marmot XML document-layout composites; hex doubles become pixel bboxes	Lossy
`datumaro`	`.json`	Datumaro JSON annotation format	Lossy
`wider-face`	`.txt`	WIDER Face aggregate TXT (single `face` class in panlabel)	Lossy
`oidv4`	directory with `Label/` or `.txt`	OIDv4 Toolkit TXT labels (directory probe uses `Label/`, not YOLO `labels/`)	Lossy
`bdd100k`	`.json`	BDD100K / Scalabel JSON detection subset	Lossy
`v7-darwin`	`.json`	V7 Darwin JSON bbox subset	Lossy
`edge-impulse`	`bounding_boxes.labels` file or containing directory	Edge Impulse bounding-box labels JSON	Lossy
`openlabel`	`.json`	ASAM OpenLABEL JSON static-image 2D bbox subset	Lossy
`via-csv`	`.csv`	VGG Image Annotator CSV (separate format from VIA JSON)	Lossy

Run panlabel list-formats for the full details, or panlabel list-formats --output json for machine-readable format discovery.

TFRecord support in v1 is intentionally narrow: panlabel currently supports only single-file, uncompressed TensorFlow Object Detection API-style tf.train.Example bbox records (not arbitrary TFRecord payloads).

list-formats shows canonical names (for example label-studio), while commands also accept aliases (for example ls, label-studio-json). Across commands, --output-format is the consistent way to request JSON reports; convert and sample also keep --report as an alias. JSON is pretty-printed on a terminal and compact when piped or captured, which makes it friendlier for scripts and agents. stats also adapts its text renderer: rich/Unicode on a terminal, plain text layout when piped.
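For scripts, that usually means redirecting the report to a file. Here is a minimal sketch, assuming the JSON report is written to stdout; `convert_with_report` and its arguments are hypothetical, and the report's fields are not described here, so the example stores the report without parsing it.

```shell
# Run a conversion and save its JSON report to a file.
# Usage: convert_with_report <input> <output> <report_path>
convert_with_report() {
    input=$1
    output=$2
    report=$3
    # Assumption: the report is written to stdout. Redirected output is not a
    # terminal, so the JSON should arrive in its compact form.
    panlabel convert --from auto -t coco -i "$input" -o "$output" \
        --output-format json > "$report"
}
```

The function returns panlabel's own exit status, so callers can still tell a failed conversion from a successful one.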

More convert examples

# COCO to IR JSON (lossless — no data lost)
panlabel convert -f coco -t ir-json -i input.json -o output.json

# IR JSON to TFOD (lossy — requires explicit opt-in)
panlabel convert -f ir-json -t tfod -i input.json -o output.csv --allow-lossy

# Auto-detect input format from file extension/content or directory layout
panlabel convert --from auto -t coco -i input.csv -o output.json

# Request a machine-readable conversion report
panlabel convert --from auto -t coco -i input.csv -o output.json --output-format json

# Preview a conversion without touching the output path
panlabel convert --from auto -t coco -i input.csv -o output.json --dry-run

# Convert a remote Hugging Face dataset repo to COCO JSON
# (requires --features hf when building from source)
panlabel convert -f hf -t coco --hf-repo rishitdagli/cppe-5 --split train -o coco_output.json

# Convert a zip-style HF dataset repo split to IR JSON (auto-detects extracted payload)
panlabel convert -f hf -t ir-json --hf-repo keremberke/football-object-detection --split train -o football.ir.json

Dry runs still do all of the analysis (format detection, validation, sampling/conversion analysis, and lossiness checks) but skip the final filesystem write. That makes them good for “what would happen?” checks, but they do not prove that the output path is writable.
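One way to use a dry run is as a gate in front of the real conversion. The sketch below is illustrative only: `convert_after_preview` is a hypothetical helper, and it assumes that a failed dry run exits with a non-zero status.

```shell
# Preview a conversion, then write the output only if the preview succeeds.
# Usage: convert_after_preview <input> <output>
convert_after_preview() {
    input=$1
    output=$2
    # Assumption: a failed dry run exits with a non-zero status.
    if ! panlabel convert --from auto -t coco -i "$input" -o "$output" --dry-run; then
        echo "dry run failed; not converting $input" >&2
        return 1
    fi
    # A dry run does not prove the output path is writable, so the real run
    # can still fail; its exit status is passed through to the caller.
    panlabel convert --from auto -t coco -i "$input" -o "$output"
}
```

This runs the analysis twice, which is a reasonable trade for small and medium datasets when you want a clean failure before anything is written.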

Getting help

panlabel --help              # See all commands
panlabel convert --help      # Help for a specific command
panlabel -V                  # Show version

Documentation

Want to go deeper? The full docs are readable right here on GitHub:

Documentation home — start here
CLI reference — every flag and option
Format reference — how each format works
Tasks and use cases — what's supported today
Conversion and lossiness — understanding what gets lost
Contributing — we'd love your help
Roadmap — what's coming next

Contributing

Contributions are welcome! Whether it's a bug report, a new format adapter, or a documentation fix, we appreciate the help. For major changes, please open an issue first so we can discuss the approach.

See the contributing guide for details on the codebase structure and how to make changes.

License

MIT — see LICENSE for details.

panlabel 0.7.0