Datalab CLI
Convert, extract, and process documents from the command line
A powerful command-line interface for the Datalab document processing API. Built in Rust for speed and reliability.
Features
- 📄 Document Conversion — Convert PDFs, images, and documents to Markdown, HTML, JSON, or semantic chunks
- 🔍 Structured Extraction — Extract data using JSON schemas with confidence scores
- 📝 Form Filling — Fill PDF forms programmatically with smart field matching
- ⚡ Smart Caching — Local file-based caching reduces API costs on repeated requests
- 🤖 Agent-Friendly — JSON output to stdout, progress events to stderr, designed for piping
- 📊 Progress Streaming — Real-time JSON progress events for monitoring long operations
Installation
From crates.io
From source
Pre-built binaries
Download from GitHub Releases.
Quick Start
1. Get your API key from datalab.to/app/keys
2. Set the environment variable
3. Convert your first document
That's it! The converted Markdown is written to stdout as JSON.
Usage
Convert Documents
# Convert to markdown (default)
# Convert to HTML
# High-quality mode for complex documents
# Convert specific pages
# Save to file
Extract Structured Data
# Extract with inline schema
# Extract with schema file
# Include confidence scores
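As a rough illustration of what a schema for structured extraction might look like, the sketch below builds a small JSON Schema in Python and prepares it both as a schema file and as a compact inline string. The invoice field names and the file name `invoice_schema.json` are hypothetical, and the exact schema features the Datalab API accepts are not stated here, so treat this as a starting point rather than a reference.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical invoice schema: the field names are illustrative only.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string", "description": "Invoice identifier"},
        "total": {"type": "number", "description": "Total amount due"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "amount": {"type": "number"},
                },
            },
        },
    },
    "required": ["invoice_number", "total"],
}

# Schema file variant: pretty-printed JSON on disk (hypothetical file name).
schema_path = Path(tempfile.gettempdir()) / "invoice_schema.json"
schema_path.write_text(json.dumps(invoice_schema, indent=2), encoding="utf-8")

# Inline variant: a compact single-line string that is easy to pass as one argument.
inline_schema = json.dumps(invoice_schema, separators=(",", ":"))
```

Keeping the schema in a file is usually easier to review and version, while the inline form is convenient for short, one-off extractions.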
Fill Forms
# Fill a form
File Management
# Upload a file
# List files
# Download a file
Cache Management
# View cache stats
# Clear old entries
Output Format
All commands output JSON to stdout for easy piping:
# Pipe to jq
# Save to file
Progress events stream to stderr as JSON:
Use --quiet to suppress progress, --verbose to force it.
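For scripts and agents that wrap the CLI, the stdout/stderr split described above can be consumed along these lines. This is a minimal sketch with several assumptions: progress events are taken to arrive one JSON object per line, the `event` and `markdown` field names are hypothetical, and `FAKE_CLI` is a stand-in child process that only mimics the split so the example runs without the real binary or an API key.

```python
import json
import subprocess
import sys


def run_and_parse(cmd):
    """Run cmd and return (result, events) parsed from stdout and stderr."""
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    # stdout is expected to hold a single JSON document (the command result).
    result = json.loads(proc.stdout)
    # Assumption: stderr holds one JSON progress event per line.
    events = []
    for line in proc.stderr.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError:
            # Skip non-JSON diagnostics rather than failing the whole run.
            continue
    return result, events


# Stand-in for the real CLI: emits two progress events on stderr and a result on stdout.
# The 'event' and 'markdown' keys are hypothetical.
FAKE_CLI = [
    sys.executable,
    "-c",
    "import sys, json; "
    "sys.stderr.write(json.dumps({'event': 'start'}) + '\\n'); "
    "sys.stderr.write(json.dumps({'event': 'done'}) + '\\n'); "
    "print(json.dumps({'markdown': '# Title'}))",
]

result, events = run_and_parse(FAKE_CLI)
print(result["markdown"])
print([e["event"] for e in events])
```

To use this with the real tool, replace `FAKE_CLI` with the actual command line. Because the result and the progress events travel on separate streams, the result can be parsed without filtering out progress noise.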
Environment Variables
| Variable | Required | Description |
|---|---|---|
| DATALAB_API_KEY | Yes | Your API key |
| DATALAB_BASE_URL | No | Custom API endpoint (for on-prem) |
| NO_COLOR | No | Disable colored output |
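A wrapper script may want to check these variables before invoking the CLI. The sketch below shows one way to do that in Python. The `Settings` class and `load_settings` function are hypothetical names, and the handling of `NO_COLOR` assumes the common convention that any non-empty value disables color. How the CLI itself resolves these variables may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: str
    base_url: Optional[str]  # None means "no custom endpoint configured"
    use_color: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the three variables from env, failing early if the key is missing."""
    api_key = env.get("DATALAB_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("DATALAB_API_KEY is required but not set")
    base_url = env.get("DATALAB_BASE_URL") or None
    # Assumption: any non-empty NO_COLOR value disables colored output.
    use_color = not env.get("NO_COLOR")
    return Settings(api_key=api_key, base_url=base_url, use_color=use_color)


# Example with an explicit mapping so the sketch does not depend on the real environment.
settings = load_settings({"DATALAB_API_KEY": "example-key", "NO_COLOR": "1"})
print(settings)
```

Passing the environment as a mapping keeps the function easy to test. In normal use, calling `load_settings()` with no arguments reads from the process environment.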
Caching
Results are cached locally in ~/.cache/datalab/ to reduce API costs:
# First run: calls API
# Second run: instant from cache
# Bypass cache
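To illustrate the general idea behind a local file-based cache, the sketch below keys each entry on a hash of the input bytes plus the request options and stores the result as a JSON file. This is not the CLI's actual cache layout or key scheme, which is not documented here. The names `cache_key`, `cached_call`, and `fake_api` are hypothetical, and a temporary directory stands in for `~/.cache/datalab/`.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cache_key(file_bytes: bytes, options: dict) -> str:
    """Hash the input bytes together with the request options."""
    h = hashlib.sha256()
    h.update(file_bytes)
    # sort_keys makes the key independent of option ordering.
    h.update(json.dumps(options, sort_keys=True).encode("utf-8"))
    return h.hexdigest()


def cached_call(cache_dir: Path, file_bytes: bytes, options: dict, fetch):
    """Return (result, was_cached); call fetch() only on a cache miss."""
    entry = cache_dir / f"{cache_key(file_bytes, options)}.json"
    if entry.exists():
        return json.loads(entry.read_text(encoding="utf-8")), True
    result = fetch()
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry.write_text(json.dumps(result), encoding="utf-8")
    return result, False


calls = []


def fake_api():
    # Stand-in for a real API request; records how often it is invoked.
    calls.append(1)
    return {"markdown": "# Title"}


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp) / "datalab"
    first = cached_call(cache_dir, b"%PDF-1.7", {"format": "markdown"}, fake_api)
    second = cached_call(cache_dir, b"%PDF-1.7", {"format": "markdown"}, fake_api)

print(first[1], second[1], len(calls))
```

Including the options in the key means that changing any option produces a separate cache entry, so a cached result is only reused for an identical request.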
Documentation
Full documentation is available in the documentation directory. To view locally:
Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
# Development setup
# Run tests
# Run lints
License
MIT License - see LICENSE for details.
Built with Rust | Powered by Datalab