# Datalab CLI
A command-line interface for converting documents, extracting structured data from them, and running other document-processing tasks through the [Datalab API](https://www.datalab.to).
---
## Features
### Document Conversion
Convert PDFs, images, and documents to markdown, HTML, JSON, or semantic chunks with high accuracy.
[Learn more](tutorials/convert-documents.md)
### Structured Extraction
Extract structured data from documents using JSON schemas. Get exactly the fields you need.
[Learn more](tutorials/extract-data.md)
### Smart Caching
Built-in file-based caching reduces API costs on repeated requests. Pay once, query many.
[Learn more](concepts/caching.md)
### Agent-Friendly
JSON output to stdout, progress events to stderr. Designed for piping and AI agent integration.
[Learn more](tutorials/agent-integration.md)
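Because JSON goes to stdout and progress events go to stderr, the two streams can be captured separately with ordinary shell redirection. The following is a minimal sketch rather than part of the CLI: `capture_streams` is a hypothetical helper name, and the file names are placeholders.

```bash
# capture_streams OUT_FILE LOG_FILE COMMAND [ARGS...]
# Hypothetical helper (not shipped with datalab): runs COMMAND,
# writing its stdout to OUT_FILE and its stderr to LOG_FILE.
capture_streams() {
  out_file="$1"
  log_file="$2"
  shift 2
  "$@" >"$out_file" 2>"$log_file"
}

# Example use (assumes datalab is on PATH and DATALAB_API_KEY is set):
#   capture_streams result.json progress.log datalab convert document.pdf
```

Keeping the progress stream out of the result file means the output can be handed to a JSON parser or an agent without filtering.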
---
## Quick Install
```bash
cargo install datalab
```
Or build from source:
```bash
git clone https://github.com/datalab/datalab-cli
cd datalab-cli
cargo build --release
```
---
## Quick Start
**1. Set your API key**
```bash
export DATALAB_API_KEY="your-api-key-here"
```
Get your API key from the [Datalab dashboard](https://www.datalab.to/app/keys).
**2. Convert a document**
```bash
datalab convert document.pdf
```
**3. Extract structured data**
```bash
datalab extract invoice.pdf --schema '{"fields": [{"name": "total", "type": "number"}]}'
```
**4. Fill a form**
```bash
datalab fill form.pdf --fields '{"name": "John Doe"}' --output filled.pdf
```
[:octicons-arrow-right-24: Full quickstart guide](getting-started/quickstart.md)
---
## Commands at a Glance
| Command | Description |
|---------|-------------|
| [`convert`](commands/convert.md) | Convert documents to markdown, HTML, JSON, or chunks |
| [`extract`](commands/extract.md) | Extract structured data using a JSON schema |
| [`segment`](commands/segment.md) | Split multi-document PDFs into logical sections |
| [`fill`](commands/fill.md) | Fill PDF or image forms with field data |
| [`track-changes`](commands/track-changes.md) | Extract tracked changes from Word documents |
| [`create-document`](commands/create-document.md) | Generate DOCX from markdown |
| [`extract-score`](commands/extract-score.md) | Score extraction results with confidence ratings |
| [`files`](commands/files.md) | Upload, list, download, and delete files |
| [`workflows`](commands/workflows.md) | Create and execute document processing workflows |
| [`cache`](commands/cache.md) | Manage the local response cache |
---
## Why Datalab CLI?
### For Developers
- **JSON everywhere**: All output is JSON for easy parsing and piping
- **Caching built-in**: Reduce API costs during development
- **Progress streaming**: Monitor long-running operations
### For AI Agents
- **Structured output**: Predictable JSON schemas for LLM consumption
- **Quiet mode**: Suppress progress for clean stdout
- **Checkpoints**: Efficient document reuse across operations
### For Automation
- **Exit codes**: Exits with 0 on success and 1 on failure, so scripts can branch on the result
- **Error handling**: Clear error messages with suggestions
- **Rate limit awareness**: Graceful handling with retry info
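The exit-code behavior above is what lets a script stop at the first failed step. A minimal sketch, assuming only that the CLI exits non-zero on failure; `run_step` is a hypothetical wrapper name, not a datalab feature.

```bash
# run_step COMMAND [ARGS...]
# Hypothetical wrapper: runs COMMAND, reports a failure on stderr,
# and returns COMMAND's exit status unchanged.
run_step() {
  if "$@"; then
    return 0
  else
    status=$?
    echo "step failed (exit $status): $*" >&2
    return "$status"
  fi
}

# Example use (assumes datalab is on PATH and DATALAB_API_KEY is set):
#   run_step datalab convert document.pdf > result.json || exit 1
```

The failure message goes to stderr so that it never mixes with JSON written to stdout.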
---
## Getting Help
- [GitHub Issues](https://github.com/datalab/datalab-cli/issues) - Bug reports and feature requests
- [Datalab Documentation](https://documentation.datalab.to) - API documentation
- [Datalab Support](mailto:support@datalab.to) - Direct support
```bash
# Built-in help
datalab --help
datalab convert --help
# Man pages (if installed)
man datalab
man datalab-convert
```