# Introduction
Canon helps you understand and take control of digital assets spread across many drives, backups, and years.
With it, you can build a canonical archive from messy data.
Think Marie Kondo, but for files.
## The Problem
Over time, files accumulate across devices: old hard drives, backup folders, cloud downloads, phone exports. Finding what you have, identifying duplicates, and organizing everything into a coherent archive become overwhelming.
## The Approach
Canon takes a methodical, incremental approach:
1. **Scan** your devices to index files and compute content hashes
2. **Enrich** the index with metadata extracted by external tools (EXIF data, file types, etc.)
3. **Discover** what you have using filters and queries
4. **Archive** selected files to a canonical location, at your own pace
Each step is revisitable. You can scan new sources, add more metadata, refine your queries, and archive in small batches. Canon tracks what's already archived, so you always know your progress.
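
To make the scan step concrete, here is a minimal Python sketch of indexing a directory by content hash. It is an illustration only, not Canon's implementation: the function names `hash_file` and `scan`, the choice of SHA-256, and the in-memory dictionary are all assumptions made for this example.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(root: Path) -> dict[str, list[Path]]:
    """Walk `root` and group every regular file by its content hash.

    Hypothetical helper for illustration; Canon's real index may differ.
    """
    index: dict[str, list[Path]] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            index.setdefault(hash_file(path), []).append(path)
    return index
```

Any hash that maps to more than one path marks a set of duplicates, regardless of what the files are named or where they live.
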
## Key Features
- **Content-based deduplication**: Files are identified by their content hash, not their location
- **Flexible metadata**: Import any key-value facts from external tools
- **Powerful filtering**: Query by any combination of facts using boolean expressions
- **Safe archiving**: Preview operations, validate integrity, and maintain audit trails
- **Incremental workflow**: Work at your own pace with full state persistence
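
The idea of filtering on key-value facts with boolean expressions can be sketched in a few lines of Python. This is not Canon's query syntax, which this page does not describe; the helpers `has`, `all_of`, `any_of`, and `negate`, the fact keys, and the placeholder hashes are hypothetical and exist only to show how predicates over facts compose.

```python
from typing import Callable

Facts = dict[str, str]
Predicate = Callable[[Facts], bool]


def has(key: str, value: str) -> Predicate:
    """Match files whose fact `key` equals `value`."""
    return lambda facts: facts.get(key) == value


def all_of(*preds: Predicate) -> Predicate:
    """Boolean AND over predicates."""
    return lambda facts: all(p(facts) for p in preds)


def any_of(*preds: Predicate) -> Predicate:
    """Boolean OR over predicates."""
    return lambda facts: any(p(facts) for p in preds)


def negate(pred: Predicate) -> Predicate:
    """Boolean NOT of a predicate."""
    return lambda facts: not pred(facts)


# Hypothetical facts keyed by content hash (placeholder hashes, made-up keys).
catalog: dict[str, Facts] = {
    "hash-a": {"type": "image/jpeg", "archived": "no"},
    "hash-b": {"type": "image/jpeg", "archived": "yes"},
    "hash-c": {"type": "application/pdf", "archived": "no"},
}

# JPEG files that have not been archived yet.
query = all_of(has("type", "image/jpeg"), negate(has("archived", "yes")))
matches = [h for h, facts in catalog.items() if query(facts)]
```

Because facts attach to the content hash and not to a path, a query like this selects each distinct file once, however many copies exist across sources.
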
Ready to get started? See [Setup](setup.md) and [Getting Started](getting-started.md).