canon-archive 0.2.2

A CLI tool for organizing large media libraries into a canonical archive.

# Introduction

Canon helps you understand and take control of digital assets spread across many drives, backups, and years, and build a canonical archive from messy data.
Think Marie Kondo, but for files.

## The Problem

Over time, files accumulate across devices: old hard drives, backup folders, cloud downloads, phone exports. Finding what you have, identifying duplicates, and organizing everything into a coherent archive quickly become overwhelming.

## The Approach

Canon takes a methodical, incremental approach:

1. **Scan** your devices to index files and compute content hashes
2. **Enrich** the index with metadata extracted by external tools (EXIF data, file types, etc.)
3. **Discover** what you have using filters and queries
4. **Archive** selected files to a canonical location, at your own pace

Each step is revisitable. You can scan new sources, add more metadata, refine your queries, and archive in small batches. Canon tracks what's already archived, so you always know your progress.
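
To make the scan step concrete, here is a minimal Python sketch of indexing files by content hash. It is an illustration of the idea only, not Canon's implementation: the function names `hash_file` and `scan` are hypothetical, and SHA-256 is an assumption, since this page does not say which hash Canon uses.

```python
import hashlib
import os


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(root):
    """Walk `root` and return {path: content hash} for every regular file found."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):  # skip broken symlinks and special files
                index[path] = hash_file(path)
    return index
```

Because the index is keyed by path and stores only the hash, a sketch like this can be re-run on a new source and merged into an existing index, which mirrors the revisitable workflow described above.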

## Key Features

- **Content-based deduplication**: Files are identified by their hash, not location
- **Flexible metadata**: Import any key-value facts from external tools
- **Powerful filtering**: Query by any combination of facts using boolean expressions
- **Safe archiving**: Preview operations, validate integrity, and maintain audit trails
- **Incremental workflow**: Work at your own pace with full state persistence
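
The first three features can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not Canon's data model or query syntax: the helpers `group_by_hash` and `select`, the fact keys `mime` and `year`, and the sample paths and hashes are all hypothetical.

```python
from collections import defaultdict


def group_by_hash(index):
    """Group paths by content hash; a group with more than one path is a duplicate set."""
    groups = defaultdict(list)
    for path, content_hash in index.items():
        groups[content_hash].append(path)
    return dict(groups)


def select(facts, predicate):
    """Return the sorted hashes whose fact dictionary satisfies `predicate`."""
    return sorted(h for h, f in facts.items() if predicate(f))


# Hypothetical index: path -> content hash (shortened for readability).
index = {
    "/mnt/old-drive/IMG_0001.jpg": "aa11",
    "/backup/2019/IMG_0001.jpg": "aa11",
    "/phone/export/clip.mov": "bb22",
}

# Hypothetical key-value facts, attached to the content hash rather than a path.
facts = {
    "aa11": {"mime": "image/jpeg", "year": 2019},
    "bb22": {"mime": "video/quicktime", "year": 2021},
}

# Identical content in two locations collapses into one entry.
duplicates = {h: paths for h, paths in group_by_hash(index).items() if len(paths) > 1}

# A boolean combination of facts, written here as an ordinary Python predicate.
jpegs_before_2020 = select(
    facts,
    lambda f: f.get("mime") == "image/jpeg" and f.get("year", 9999) < 2020,
)
```

Attaching facts to the hash instead of the path means metadata gathered for one copy of a file applies to every copy, wherever it lives.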

Ready to get started? See [Setup](setup.md) and [Getting Started](getting-started.md).