panlabel 0.7.0

The universal annotation converter

Panlabel

If you've ever written a one-off Python script to wrangle COCO annotations into YOLO format (or vice versa), panlabel is here to save you the trouble. It's a fast, single-binary CLI that converts between common object detection annotation formats — with built-in validation, clear lossiness warnings, and no Python dependencies to manage.

Panlabel’s current core scope is 2D axis-aligned bounding-box conversion for static-image object detection. It does not provide first-class support for segmentation, keypoints/pose, oriented boxes, video tracking IDs, or 3D/multisensor labels. When a broad schema includes these richer structures, panlabel either skips and reports them or treats the conversion as lossy.

Panlabel is also available as a Rust library if you want to integrate format conversion into your own tools.

Note: Panlabel is in active development (v0.7.x). The CLI and library APIs may change between versions, so pin to a specific version if you're using it in production.

Installation

pip / uv (any platform)

pip install panlabel
# or
uv pip install panlabel

This installs a pre-built binary — no Rust toolchain needed.

Homebrew (macOS / Linux)

brew install strickvl/tap/panlabel

Shell script (macOS / Linux)

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/strickvl/panlabel/releases/latest/download/panlabel-installer.sh | sh

PowerShell (Windows)

powershell -ExecutionPolicy Bypass -c "irm https://github.com/strickvl/panlabel/releases/latest/download/panlabel-installer.ps1 | iex"

Cargo (from source)

cargo install panlabel
# Enable full HF support (remote Hub import + metadata.parquet)
cargo install panlabel --features hf

Pre-built binaries

Download from the latest GitHub Release. Builds are available for macOS (Intel + Apple Silicon), Linux (x86_64 + ARM64), and Windows.

Docker

docker pull strickvl/panlabel
# Convert a COCO file in your current directory to YOLO
docker run --rm -v "$PWD":/data strickvl/panlabel convert -f coco -t yolo -i /data/annotations.json -o /data/yolo_out --allow-lossy

Multi-arch images (amd64 + arm64) are published for each release.

As a Rust library

cargo add panlabel

Quick start

# Convert between formats (auto-detects the input)
panlabel convert --from auto --to yolo -i annotations.json -o ./yolo_out --allow-lossy

# Check a dataset for problems before training
panlabel validate --format coco annotations.json

# Get a quick overview of what's in a dataset
panlabel stats --format coco annotations.json

# Compare two datasets semantically
panlabel diff --format-a auto --format-b auto old.json new.json

# Sample a smaller subset for quick experiments
panlabel sample -i annotations.json -o sample.ir.json --from auto --to ir-json -n 100 --seed 42

# See every supported format and its capabilities
panlabel list-formats

The convert shape is generally -f <source> -t <dest> -i <input> -o <output>, with any source/destination picked from the Supported formats table. Remote Hugging Face imports are the exception: they take --hf-repo (and optionally --split) in place of -i. See More convert examples below for lossless vs. lossy conversions, machine-readable JSON reports, dry runs, and remote Hugging Face datasets.
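To make concrete what a COCO-to-YOLO conversion does to a single box, here is a minimal Python sketch of the coordinate change. It is not panlabel's implementation; it only assumes the usual conventions, where COCO stores [x_min, y_min, width, height] in pixels and YOLO stores a center point and size normalized by the image dimensions. The function names are hypothetical.

```python
def coco_to_yolo(bbox, img_w, img_h):
    """COCO [x_min, y_min, width, height] in pixels -> YOLO (cx, cy, w, h) in [0, 1]."""
    x, y, w, h = bbox
    return ((x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h)


def yolo_to_coco(box, img_w, img_h):
    """YOLO normalized (cx, cy, w, h) -> COCO [x_min, y_min, width, height] in pixels."""
    cx, cy, w, h = box
    return [(cx - w / 2) * img_w, (cy - h / 2) * img_h, w * img_w, h * img_h]


if __name__ == "__main__":
    # A 200x100 box whose top-left corner is at (100, 50) in a 400x200 image.
    yolo_box = coco_to_yolo([100, 50, 200, 100], img_w=400, img_h=200)
    print(yolo_box)                          # (0.5, 0.5, 0.5, 0.5)
    print(yolo_to_coco(yolo_box, 400, 200))  # [100.0, 50.0, 200.0, 100.0]
```

The box geometry survives the round trip here, but a YOLO label line carries only a class index and a box, so anything else attached to the COCO annotation has nowhere to go.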

What can panlabel do?

| Command | What it does |
| --- | --- |
| convert | Convert between annotation formats, with clear warnings about what (if anything) gets lost |
| validate | Check your dataset for common problems: duplicate IDs, missing references, invalid bounding boxes |
| stats | Show rich dataset statistics in text, JSON, or HTML |
| diff | Compare two datasets semantically (summary or detailed output) |
| sample | Create subset datasets (random or stratified), with optional category filtering and JSON reports |
| list-formats | Show which formats are supported and their read/write/lossiness capabilities, including JSON discovery output |
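The kinds of problems that validate is described as catching (duplicate IDs, missing references, invalid bounding boxes) can be illustrated with a small Python sketch over a COCO-style dictionary. This is an assumption-laden illustration rather than panlabel's actual rule set: the function name and the message wording are hypothetical, and the real command may check more or differently.

```python
from collections import Counter


def find_problems(dataset):
    """Return human-readable problems found in a COCO-style dict (illustrative only)."""
    problems = []
    image_ids = [img["id"] for img in dataset.get("images", [])]
    category_ids = {cat["id"] for cat in dataset.get("categories", [])}

    # Duplicate IDs: the same image id appearing more than once.
    for image_id, count in Counter(image_ids).items():
        if count > 1:
            problems.append(f"duplicate image id {image_id}")

    known_images = set(image_ids)
    for ann in dataset.get("annotations", []):
        # Missing references: annotation points at an image or category that does not exist.
        if ann["image_id"] not in known_images:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        # Invalid boxes: COCO bboxes are [x, y, width, height]; size must be positive.
        _, _, w, h = ann["bbox"]
        if w <= 0 or h <= 0:
            problems.append(f"annotation {ann['id']}: non-positive bbox size")
    return problems


if __name__ == "__main__":
    sample = {
        "images": [{"id": 1}, {"id": 1}],
        "categories": [{"id": 1}],
        "annotations": [
            {"id": 10, "image_id": 9, "category_id": 1, "bbox": [0, 0, 5, 5]},
            {"id": 11, "image_id": 1, "category_id": 1, "bbox": [0, 0, 0, 5]},
        ],
    }
    for problem in find_problems(sample):
        print(problem)
```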

Supported formats

| Format | Extension / Layout | Description | Lossiness |
| --- | --- | --- | --- |
| ir-json | .json | Panlabel's own intermediate representation | Lossless |
| coco | .json | COCO object detection format | Conditional |
| ibm-cloud-annotations | _annotations.json file or directory | IBM Cloud Annotations localization JSON | Lossy |
| cvat | .xml / annotations.xml export | CVAT for images XML annotation export | Lossy |
| label-studio | .json | Label Studio task export JSON (rectanglelabels) | Lossy |
| labelbox | .json / .jsonl / .ndjson | Labelbox current export rows (data_row / projects.*.labels) | Lossy |
| scale-ai | .json file or directory (annotations/) | Scale AI image annotation task/response JSON | Lossy |
| unity-perception | .json file or SOLO-like directory | Unity Perception / SOLO synthetic-data bbox JSON | Lossy |
| tfod | .csv | TensorFlow Object Detection CSV (normalized bbox corners) | Lossy |
| tfrecord | .tfrecord | TensorFlow Object Detection API-style tf.train.Example records (single-file, uncompressed, bbox-only in v1) | Lossy |
| vott-csv | .csv | Microsoft VoTT CSV export (image,xmin,ymin,xmax,ymax,label) | Lossy |
| vott-json | .json file or vott-json-export/ directory | Microsoft VoTT JSON export (assets / per-asset JSON with regions) | Lossy |
| yolo | images/ + labels/ directory, or split data.yaml pointing to image-list .txt files | YOLO .txt labels (flat, split-aware, Scaled-YOLOv4 aliases, optional confidence) | Lossy |
| yolo-keras | .txt file or directory (yolo_keras.txt, annotations.txt, train.txt) | YOLO Keras absolute-coordinate TXT (image xmin,ymin,xmax,ymax,class_id ...) | Lossy |
| yolov4-pytorch | .txt file or directory (yolov4_pytorch.txt, train_annotation.txt, train.txt) | YOLOv4 PyTorch absolute-coordinate TXT (image xmin,ymin,xmax,ymax,class_id ...) | Lossy |
| voc | Annotations/ + JPEGImages/ directory | Pascal VOC XML | Lossy |
| hf | metadata.jsonl / metadata.parquet directory | Hugging Face ImageFolder metadata | Lossy |
| sagemaker | .manifest / .jsonl file | AWS SageMaker Ground Truth object-detection manifest | Lossy |
| labelme | .json file or annotations/ directory | LabelMe per-image JSON annotations | Lossy |
| create-ml | .json | Apple CreateML annotation format | Lossy |
| kitti | label_2/ + image_2/ directory | KITTI object detection labels | Lossy |
| via | .json | VGG Image Annotator (VIA) JSON | Lossy |
| retinanet | .csv | keras-retinanet CSV format | Lossy |
| openimages | .csv | Google OpenImages CSV annotation format | Lossy |
| kaggle-wheat | .csv | Kaggle Global Wheat Detection CSV | Lossy |
| automl-vision | .csv | Google Cloud AutoML Vision CSV | Lossy |
| udacity | .csv | Udacity Self-Driving Car Dataset CSV | Lossy |
| superannotate | .json file or annotations/ directory | SuperAnnotate JSON export | Lossy |
| supervisely | .json file or ann/ / meta.json project directory | Supervisely JSON project / dataset | Lossy |
| cityscapes | .json, gtFine/, or dataset root with gtFine/ | Cityscapes polygon JSON; polygons become bbox envelopes | Lossy |
| marmot | .xml file or directory with same-stem companion images | Marmot XML document-layout composites; hex doubles become pixel bboxes | Lossy |
| datumaro | .json | Datumaro JSON annotation format | Lossy |
| wider-face | .txt | WIDER Face aggregate TXT (single face class in panlabel) | Lossy |
| oidv4 | directory with Label/ or .txt | OIDv4 Toolkit TXT labels (directory probe uses Label/, not YOLO labels/) | Lossy |
| bdd100k | .json | BDD100K / Scalabel JSON detection subset | Lossy |
| v7-darwin | .json | V7 Darwin JSON bbox subset | Lossy |
| edge-impulse | bounding_boxes.labels file or containing directory | Edge Impulse bounding-box labels JSON | Lossy |
| openlabel | .json | ASAM OpenLABEL JSON static-image 2D bbox subset | Lossy |
| via-csv | .csv | VGG Image Annotator CSV (separate format from VIA JSON) | Lossy |

Run panlabel list-formats for the full details, or panlabel list-formats --output-format json for machine-readable format discovery.

TFRecord support in v1 is intentionally narrow: panlabel currently supports only single-file, uncompressed TensorFlow Object Detection API-style tf.train.Example bbox records (not arbitrary TFRecord payloads).
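Several rows in the table above reduce richer geometry to boxes; the Cityscapes row, for instance, notes that polygons become bbox envelopes. As a rough illustration of what that means and why it is lossy, the Python sketch below computes the smallest axis-aligned box around a polygon's vertices. The function name is hypothetical and this is not taken from panlabel's source.

```python
def polygon_envelope(points):
    """Smallest axis-aligned box (xmin, ymin, xmax, ymax) containing every (x, y) vertex."""
    if not points:
        raise ValueError("polygon has no vertices")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    triangle = [(10, 40), (30, 5), (50, 40)]
    rectangle = [(10, 5), (50, 5), (50, 40), (10, 40)]
    print(polygon_envelope(triangle))   # (10, 5, 50, 40)
    print(polygon_envelope(rectangle))  # (10, 5, 50, 40)
```

Two different shapes produce the same envelope, so the original polygon cannot be recovered from the box alone.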

list-formats shows canonical names (for example label-studio), while commands also accept aliases (for example ls, label-studio-json).

Across commands, --output-format is the consistent way to request JSON reports; convert and sample also keep --report as an alias. JSON is pretty-printed on a terminal and compact when piped or captured, which makes it easier for scripts and agents to consume. stats adapts its text renderer in the same way: rich Unicode output on a terminal, a plain-text layout when piped.
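The terminal-versus-pipe behaviour described above is a common CLI pattern. A minimal Python sketch of the idea follows, assuming the decision is based on whether standard output is a TTY; how panlabel itself detects this is not stated here, and render_report is a hypothetical name.

```python
import json
import sys


def render_report(report, is_terminal):
    """Pretty-print for humans on a terminal, emit compact JSON when piped or captured."""
    if is_terminal:
        return json.dumps(report, indent=2)
    # Compact separators drop the spaces after ',' and ':'.
    return json.dumps(report, separators=(",", ":"))


if __name__ == "__main__":
    # The report contents are made-up placeholder data.
    report = {"format": "coco", "images": 3}
    print(render_report(report, sys.stdout.isatty()))
```

Running the script directly prints the indented form; piping it through another program (for example into a file) yields a single compact line.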

More convert examples

# COCO to IR JSON (lossless — no data lost)
panlabel convert -f coco -t ir-json -i input.json -o output.json

# IR JSON to TFOD (lossy — requires explicit opt-in)
panlabel convert -f ir-json -t tfod -i input.json -o output.csv --allow-lossy

# Auto-detect input format from file extension/content or directory layout
panlabel convert --from auto -t coco -i input.csv -o output.json

# Request a machine-readable conversion report
panlabel convert --from auto -t coco -i input.csv -o output.json --output-format json

# Preview a conversion without touching the output path
panlabel convert --from auto -t coco -i input.csv -o output.json --dry-run

# Convert a remote Hugging Face dataset repo to COCO JSON
# (requires --features hf when building from source)
panlabel convert -f hf -t coco --hf-repo rishitdagli/cppe-5 --split train -o coco_output.json

# Convert a zip-style HF dataset repo split to IR JSON (auto-detects extracted payload)
panlabel convert -f hf -t ir-json --hf-repo keremberke/football-object-detection --split train -o football.ir.json

Dry runs still perform all of the analysis (format detection, validation, sampling/conversion analysis, and lossiness checks) but skip the final filesystem write. That makes them good for “what would happen?” checks, but they do not prove that the output path is writable.
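For scripted use, the flags shown in the examples above can be assembled programmatically. The Python sketch below builds a convert command line from only those documented flags and, optionally, runs it. The helper names are hypothetical, and two things are assumptions rather than documented behaviour: that the JSON report is the only content written to standard output, and that a failed conversion exits with a non-zero status.

```python
import json
import subprocess


def convert_argv(src, dst, input_path, output_path,
                 allow_lossy=False, dry_run=False, json_report=False):
    """Build a `panlabel convert` argument list using flags shown in this README."""
    argv = ["panlabel", "convert", "--from", src, "--to", dst,
            "-i", input_path, "-o", output_path]
    if allow_lossy:
        argv.append("--allow-lossy")
    if dry_run:
        argv.append("--dry-run")
    if json_report:
        argv += ["--output-format", "json"]
    return argv


def run_convert(argv):
    """Run the command and parse stdout as JSON.

    Assumption: stdout contains only the JSON report, and failures raise
    CalledProcessError through check=True.
    """
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


if __name__ == "__main__":
    argv = convert_argv("auto", "coco", "input.csv", "output.json",
                        dry_run=True, json_report=True)
    print(" ".join(argv))
    # report = run_convert(argv)  # requires panlabel on PATH
```

Because the output is captured rather than shown on a terminal, the report would arrive in its compact form.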

Getting help

panlabel --help              # See all commands
panlabel convert --help      # Help for a specific command
panlabel -V                  # Show version

Documentation

Want to go deeper? The full docs are readable in the project's GitHub repository.

Contributing

Contributions are welcome! Whether it's a bug report, a new format adapter, or a documentation fix — we appreciate the help. For major changes, please open an issue first so we can discuss the approach.

See the contributing guide for details on the codebase structure and how to make changes.

License

MIT — see LICENSE for details.