datalab-cli 0.1.0

A powerful CLI for converting, extracting, and processing documents using the Datalab API
# extract-score

Score extraction results with confidence ratings.

## Synopsis

```
datalab extract-score [OPTIONS] --checkpoint-id <ID>
```

## Description

Score the results of a previous extraction using a checkpoint. Returns confidence ratings for the extracted data, helping you assess extraction quality.

!!! note
    This command requires a checkpoint ID from a previous extraction that was run with `--save-checkpoint`.

---

## Options

### Required Options

| Option | Description |
|--------|-------------|
| `--checkpoint-id <ID>` | Checkpoint ID returned by an extraction run with `--save-checkpoint` |

### Output Options

| Option | Description |
|--------|-------------|
| `-o, --output <FILE>` | Write result to file |

### Cache Options

| Option | Description |
|--------|-------------|
| `--skip-cache` | Skip local cache lookup |

### Advanced Options

| Option | Description | Default |
|--------|-------------|---------|
| `--timeout <SECS>` | Request timeout in seconds | `300` |
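
When calling the command from a script, the options above can be assembled programmatically. The following is a minimal sketch, not part of the CLI itself: the helper names `build_extract_score_cmd` and `run_extract_score` are hypothetical, and it assumes the `datalab` executable is on `PATH` and prints its result to standard output when `--output` is not given.

```python
import subprocess
from typing import List, Optional


def build_extract_score_cmd(
    checkpoint_id: str,
    output: Optional[str] = None,
    timeout: Optional[int] = None,
    skip_cache: bool = False,
) -> List[str]:
    """Assemble the argument list using only the flags documented above."""
    cmd = ["datalab", "extract-score", "--checkpoint-id", checkpoint_id]
    if output is not None:
        cmd += ["--output", output]
    if timeout is not None:
        cmd += ["--timeout", str(timeout)]
    if skip_cache:
        cmd.append("--skip-cache")
    return cmd


def run_extract_score(checkpoint_id: str, **options) -> str:
    """Run the command and return stdout (assumption: the result is printed there).

    Raises subprocess.CalledProcessError if the CLI exits with a non-zero status.
    """
    result = subprocess.run(
        build_extract_score_cmd(checkpoint_id, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(build_extract_score_cmd("abc123", output="scores.json", timeout=60))
```

Passing the arguments as a list avoids shell quoting problems with checkpoint IDs or file names that contain spaces.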

---

## Examples

### Basic Usage

First, run an extraction and save a checkpoint:

```bash
datalab extract invoice.pdf \
  --schema '{"fields": [{"name": "total", "type": "number"}]}' \
  --save-checkpoint

# Output includes: "checkpoint_id": "abc123"
```

Then score the extraction:

```bash
datalab extract-score --checkpoint-id abc123
```

### Save Score Results

```bash
datalab extract-score --checkpoint-id abc123 --output scores.json
```

---

## Output Schema

```json
{
  "scores": {
    "total": {
      "confidence": 0.95,
      "evidence": "Found clear total amount on page 1"
    },
    "invoice_number": {
      "confidence": 0.88,
      "evidence": "Invoice number found in header"
    }
  },
  "overall_confidence": 0.92,
  "metadata": {
    "checkpoint_id": "abc123",
    "processing_time": 0.5
  }
}
```
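
To inspect a result like this in a script, one option is to load the JSON and list the fields from least to most confident. This is a rough sketch that assumes the output has exactly the shape shown above; `fields_by_confidence` is a hypothetical helper name, and the sample data is copied from the schema example.

```python
import json

# Sample result copied from the schema example above.
SAMPLE = """
{
  "scores": {
    "total": {
      "confidence": 0.95,
      "evidence": "Found clear total amount on page 1"
    },
    "invoice_number": {
      "confidence": 0.88,
      "evidence": "Invoice number found in header"
    }
  },
  "overall_confidence": 0.92,
  "metadata": {
    "checkpoint_id": "abc123",
    "processing_time": 0.5
  }
}
"""


def fields_by_confidence(result: dict) -> list:
    """Return (field, confidence) pairs sorted from lowest to highest confidence."""
    scores = result.get("scores", {})
    pairs = [(name, entry["confidence"]) for name, entry in scores.items()]
    return sorted(pairs, key=lambda pair: pair[1])


result = json.loads(SAMPLE)
for name, confidence in fields_by_confidence(result):
    print(f"{name}: {confidence:.2f}")
# prints:
# invoice_number: 0.88
# total: 0.95
```

Sorting lowest first puts the fields most likely to need review at the top of the list.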

---

## Understanding Scores

| Confidence | Interpretation |
|------------|----------------|
| `0.9 - 1.0` | High confidence, likely accurate |
| `0.7 - 0.9` | Good confidence, probably accurate |
| `0.5 - 0.7` | Medium confidence, review recommended |
| `0.0 - 0.5` | Low confidence, manual verification needed |
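
The table can be turned into a small lookup when processing scores in code. The ranges above share their endpoints, so this sketch assumes each band includes its lower bound (a confidence of exactly `0.9` counts as high); that boundary rule and the name `interpret_confidence` are assumptions rather than documented behavior.

```python
def interpret_confidence(confidence: float) -> str:
    """Map a confidence value to the band names used in the table above.

    Assumption: each band includes its lower bound, e.g. 0.9 is "high".
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be within [0.0, 1.0], got {confidence}")
    if confidence >= 0.9:
        return "high"      # likely accurate
    if confidence >= 0.7:
        return "good"      # probably accurate
    if confidence >= 0.5:
        return "medium"    # review recommended
    return "low"           # manual verification needed


print(interpret_confidence(0.95))  # high
print(interpret_confidence(0.88))  # good
```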

---

## Workflow

1. **Extract with checkpoint**:
   ```bash
   datalab extract document.pdf --schema schema.json --save-checkpoint
   ```

2. **Review extraction results**

3. **Score if needed**:
   ```bash
   datalab extract-score --checkpoint-id <ID>
   ```

4. **Use scores to filter or flag results**
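
The last step could look roughly like the following, assuming the `scores` object has the shape shown under Output Schema. The helper `split_by_threshold`, the `0.7` cutoff, and the `due_date` entry are illustrative choices, not part of the CLI or its real output.

```python
from typing import Dict, List, Tuple


def split_by_threshold(
    scores: Dict[str, dict], threshold: float = 0.7
) -> Tuple[List[str], List[str]]:
    """Split field names into (accepted, flagged) by comparing confidence to a threshold."""
    accepted: List[str] = []
    flagged: List[str] = []
    for name, entry in scores.items():
        if entry["confidence"] >= threshold:
            accepted.append(name)
        else:
            flagged.append(name)
    return accepted, flagged


# Illustrative data: "due_date" is a made-up low-confidence field.
scores = {
    "total": {"confidence": 0.95},
    "invoice_number": {"confidence": 0.88},
    "due_date": {"confidence": 0.42},
}

accepted, flagged = split_by_threshold(scores)
print("accepted:", accepted)  # accepted: ['total', 'invoice_number']
print("flagged:", flagged)    # flagged: ['due_date']
```

A cutoff of `0.7` matches the point where the table under Understanding Scores starts recommending review; a stricter pipeline might raise it.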

---

## Related Commands

- [`extract`](extract.md) - Extract data from documents
- [`convert`](convert.md) - Convert documents (can save checkpoints)

---

## See Also

- [Checkpoints](../concepts/checkpoints.md)
- [Extracting Data Tutorial](../tutorials/extract-data.md)