datalab-cli 0.1.0

# Datalab CLI

A powerful command-line interface for converting, extracting, and processing documents using the [Datalab API](https://www.datalab.to).

---

## Features

### Document Conversion

Convert PDFs, images, and documents to markdown, HTML, JSON, or semantic chunks with high accuracy.

[Learn more](tutorials/convert-documents.md)

### Structured Extraction

Extract structured data from documents using JSON schemas. Get exactly the fields you need.

[Learn more](tutorials/extract-data.md)

### Smart Caching

Built-in file-based caching reduces API costs on repeated requests. Pay once, query many.

[Learn more](concepts/caching.md)
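
The CLI's cache internals are not described on this page, so the following is only a sketch of the general idea behind file-based caching: derive a key from the input file's contents plus the arguments, and reuse the stored response when that key has been seen before. `cached_run`, `CACHE_DIR`, and the key scheme are hypothetical and do not reflect how `datalab` actually stores its cache.

```shell
#!/bin/sh
# Illustrative only: a generic file-based cache wrapper, not datalab's own cache.
# CACHE_DIR and cached_run are hypothetical names.
CACHE_DIR="${CACHE_DIR:-${TMPDIR:-/tmp}/cached-run-demo}"

# usage: cached_run INPUT_FILE COMMAND [ARGS...]
# Runs "COMMAND ARGS... INPUT_FILE" and stores its stdout under a key derived
# from the file contents and the command line. A repeat call prints the stored copy.
cached_run() {
  input=$1
  shift
  mkdir -p "$CACHE_DIR" || return 1
  # cksum is POSIX; a real cache would likely use a stronger hash.
  key=$( { printf '%s\n' "$*"; cat "$input"; } | cksum | tr ' ' '-' )
  entry="$CACHE_DIR/$key"
  if [ -f "$entry" ]; then
    cat "$entry"              # cache hit: the command is not run again
    return 0
  fi
  # Cache miss: run the command and keep the output only if it succeeded.
  if "$@" "$input" >"$entry.tmp"; then
    mv "$entry.tmp" "$entry"
    cat "$entry"
  else
    status=$?
    rm -f "$entry.tmp"
    return "$status"
  fi
}
```

With this wrapper, `cached_run document.pdf datalab convert` would call the API only the first time for a given file and argument list. The built-in cache already does this for you; the sketch is only meant to make the "pay once, query many" idea concrete.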

### Agent-Friendly

JSON output to stdout, progress events to stderr. Designed for piping and AI agent integration.

[Learn more](tutorials/agent-integration.md)
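
Because results go to stdout and progress events go to stderr, ordinary shell redirection is enough to separate them. The sketch below assumes only the `datalab convert FILE` form shown on this page; `convert_to_files`, `convert_quiet`, and the `DATALAB_BIN` override are hypothetical names introduced for illustration, not something the CLI reads or provides.

```shell
#!/bin/sh
# convert_to_files, convert_quiet, and DATALAB_BIN are hypothetical names for
# this sketch; only `datalab convert FILE` comes from the documentation.

# usage: convert_to_files INPUT RESULT_FILE PROGRESS_LOG
convert_to_files() {
  # stdout (the result) and stderr (progress events) land in separate files
  "${DATALAB_BIN:-datalab}" convert "$1" >"$2" 2>"$3"
}

# usage: convert_quiet INPUT
# Discards progress events so only the result reaches stdout, e.g. for a pipe.
convert_quiet() {
  "${DATALAB_BIN:-datalab}" convert "$1" 2>/dev/null
}
```

For example, `convert_to_files document.pdf result.json progress.log` keeps a progress log next to the result, while `convert_quiet document.pdf` can feed a downstream tool or agent without stray progress lines in the stream.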

---

## Quick Install

```bash
cargo install datalab-cli
```

Or build from source:

```bash
git clone https://github.com/datalab/datalab-cli
cd datalab-cli
cargo build --release
```

---

## Quick Start

**1. Set your API key**

```bash
export DATALAB_API_KEY="your-api-key-here"
```

Get your API key from the [Datalab dashboard](https://www.datalab.to/app/keys).

**2. Convert a document**

```bash
datalab convert document.pdf
```

**3. Extract structured data**

```bash
datalab extract invoice.pdf --schema '{"fields": [{"name": "total", "type": "number"}]}'
```

**4. Fill a form**

```bash
datalab fill form.pdf --fields '{"name": "John Doe"}' --output filled.pdf
```

[Full quickstart guide](getting-started/quickstart.md)
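
For scripted use, steps 1 and 3 can be wrapped so that a missing key fails early and the schema lives in a file rather than on the command line. This is a minimal sketch: `require_api_key`, `extract_with_schema_file`, and the `DATALAB_BIN` override are hypothetical helpers, and the only CLI syntax assumed is `datalab extract FILE --schema JSON` as shown in step 3.

```shell
#!/bin/sh
# Hypothetical helpers; the only CLI syntax used is
# `datalab extract FILE --schema JSON`, as shown in step 3.

# Fail early with a clear message when the key from step 1 is missing.
require_api_key() {
  if [ -z "${DATALAB_API_KEY:-}" ]; then
    echo "DATALAB_API_KEY is not set" >&2
    return 1
  fi
}

# usage: extract_with_schema_file INPUT SCHEMA_FILE
# Reads the schema from a file and passes its contents inline.
extract_with_schema_file() {
  require_api_key || return 1
  schema=$(cat "$2") || return 1
  "${DATALAB_BIN:-datalab}" extract "$1" --schema "$schema"
}
```

A call such as `extract_with_schema_file invoice.pdf schema.json` then behaves like step 3, with the schema kept under version control instead of embedded in a shell string.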

---

## Commands at a Glance

| Command | Description |
|---------|-------------|
| [`convert`](commands/convert.md) | Convert documents to markdown, HTML, JSON, or chunks |
| [`extract`](commands/extract.md) | Extract structured data using a JSON schema |
| [`segment`](commands/segment.md) | Split multi-document PDFs into logical sections |
| [`fill`](commands/fill.md) | Fill PDF or image forms with field data |
| [`track-changes`](commands/track-changes.md) | Extract tracked changes from Word documents |
| [`create-document`](commands/create-document.md) | Generate DOCX from markdown |
| [`extract-score`](commands/extract-score.md) | Score extraction results with confidence ratings |
| [`files`](commands/files.md) | Upload, list, download, and delete files |
| [`workflows`](commands/workflows.md) | Create and execute document processing workflows |
| [`cache`](commands/cache.md) | Manage the local response cache |

---

## Why Datalab CLI?

### For Developers

- **JSON everywhere**: All output is JSON for easy parsing and piping
- **Caching built-in**: Reduce API costs during development
- **Progress streaming**: Monitor long-running operations

### For AI Agents

- **Structured output**: Predictable JSON schemas for LLM consumption
- **Quiet mode**: Suppress progress for clean stdout
- **Checkpoints**: Efficient document reuse across operations

### For Automation

- **Exit codes**: Conventional exit codes (0 on success, 1 on failure) for scripting
- **Error handling**: Clear error messages with suggestions
- **Rate limit awareness**: Graceful handling with retry info
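
Since success and failure are reported through the exit code, a script can branch or retry on it. The wrapper below is a sketch, not part of the CLI: `retry` is a hypothetical helper that re-runs any command after a fixed delay. It retries on every non-zero exit, because this page does not say how rate-limit failures are distinguished from other errors.

```shell
#!/bin/sh
# retry is a hypothetical helper, not a datalab feature.
# usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    # Exit code 0 means success: stop retrying.
    "$@" && return 0
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}
```

For example, `retry 3 5 datalab convert document.pdf` would try the conversion up to three times with five seconds between attempts, and the wrapper's own exit code can drive an `if` or `||` in the calling script.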

---

## Getting Help

- [GitHub Issues](https://github.com/datalab/datalab-cli/issues) - Bug reports and feature requests
- [Datalab Documentation](https://documentation.datalab.to) - API documentation
- [Datalab Support](mailto:support@datalab.to) - Direct support

```bash
# Built-in help
datalab --help
datalab convert --help

# Man pages (if installed)
man datalab
man datalab-convert
```