youtubeinfo-sync 1.0.2

Download YouTube video and channel metadata
# youtubeinfo-sync

A command-line tool to download and sync YouTube video and channel metadata using the YouTube Data API v3.

## Features

- Batch fetch video metadata by video IDs
- Optionally sync channel metadata for video authors
- Organize output into configurable feed directories
- Atomic file writes for data integrity
- Automatic request batching (50 IDs per request)

## Installation

### From Source

Requires Rust 1.85+ (2024 edition).

```bash
git clone https://codeberg.org/evancarroll/youtubeinfo-sync.git
cd youtubeinfo-sync
cargo build --release
```

The binary will be at `target/release/youtubeinfo-sync`.

## Configuration

### API Key

Obtain a YouTube Data API v3 key from the [Google Cloud Console](https://console.cloud.google.com/apis/credentials) and set it as an environment variable:

```bash
export YOUTUBE_API_KEY="your-api-key-here"
```

### Configuration File

Create a TOML configuration file (see `example_config.toml`):

```toml
# Output directory for all generated files
output_dir = "./output"

# Global default: whether to fetch channel data for video authors
sync_channels = false

# Define one or more feeds
[[feed]]
name = "music"
videoids = ["dQw4w9WgXcQ", "9bZkp7q19f0"]
sync_channels = true  # Override: sync channels for this feed

[[feed]]
name = "gaming"
videoids = ["jNQXAC9IVRw"]
# Inherits global sync_channels = false
```

### Configuration Reference

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `output_dir` | string | No | `"./output"` | Base directory for output files |
| `sync_channels` | boolean | No | `false` | Global setting to fetch channel metadata |
| `feed` | array | Yes | - | List of feed configurations |

#### Feed Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `name` | string | Yes | - | Feed name (used as output subdirectory) |
| `videoids` | array | Yes | - | List of 11-character YouTube video IDs |
| `sync_channels` | boolean | No | *inherited* | Per-feed override for channel syncing |
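
The inheritance rule for `sync_channels` can be illustrated with a short Python sketch. It mirrors the documented behavior only and is not the tool's own (Rust) code; `effective_sync_channels` is a hypothetical helper name, and the config is assumed to be already parsed into a dictionary.

```python
def effective_sync_channels(config: dict, feed: dict) -> bool:
    """Return the per-feed setting if present, else the global one (default False)."""
    global_default = config.get("sync_channels", False)
    return feed.get("sync_channels", global_default)


# Same shape as the TOML example above, already parsed into a dict.
config = {
    "sync_channels": False,
    "feed": [
        {"name": "music", "videoids": ["dQw4w9WgXcQ"], "sync_channels": True},
        {"name": "gaming", "videoids": ["jNQXAC9IVRw"]},
    ],
}

for feed in config["feed"]:
    print(feed["name"], effective_sync_channels(config, feed))
```

Here `music` resolves to `True` (per-feed override) and `gaming` to `False` (inherited global value).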

### Validation Rules

- At least one feed must be defined
- Feed names cannot be empty or contain path separators (`/`, `\`)
- Video IDs must be exactly 11 characters
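
To check a configuration before spending any quota, the three rules above can be sketched in Python. This is a minimal illustration, assuming the TOML file has already been parsed into a dictionary; `validate_config` and its error messages are hypothetical and do not reflect the tool's actual output.

```python
def validate_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    feeds = config.get("feed", [])
    if not feeds:
        errors.append("at least one feed must be defined")
    for index, feed in enumerate(feeds):
        name = feed.get("name", "")
        if not name:
            errors.append(f"feed #{index}: name is empty")
        elif "/" in name or "\\" in name:
            errors.append(f"feed {name!r}: name contains a path separator")
        for video_id in feed.get("videoids", []):
            if len(video_id) != 11:
                errors.append(
                    f"feed {name!r}: video ID {video_id!r} is not 11 characters"
                )
    return errors


print(validate_config({"feed": [{"name": "music", "videoids": ["dQw4w9WgXcQ"]}]}))
```

On Python 3.11 or newer, the standard-library `tomllib.load` function can produce the dictionary from a file opened in binary mode.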

## Usage

### Basic Usage

```bash
youtubeinfo-sync batch --batch-file config.toml
```

### With Custom Output Directory

```bash
youtubeinfo-sync batch --batch-file config.toml --output-dir /path/to/output
```

### Command-Line Options

```
Usage: youtubeinfo-sync batch [OPTIONS] --batch-file <BATCH_FILE>

Options:
  -b, --batch-file <BATCH_FILE>  Path to TOML configuration file
  -o, --output-dir <OUTPUT_DIR>  Output directory (overrides config file)
  -h, --help                     Print help
```

### Logging

Control log verbosity with the `RUST_LOG` environment variable:

```bash
# Default info level
youtubeinfo-sync batch --batch-file config.toml

# Debug logging
RUST_LOG=debug youtubeinfo-sync batch --batch-file config.toml

# Trace logging (very verbose)
RUST_LOG=trace youtubeinfo-sync batch --batch-file config.toml
```

## Output Structure

```
output/
├── feeds/
│   ├── music/
│   │   └── videos.json      # Map of videoId -> video object
│   └── gaming/
│       └── videos.json
└── channels/
    └── channels.json        # Map of channelId -> channel object
```

- `videos.json` - Object mapping video IDs to video metadata for each feed
- `channels.json` - Object mapping channel IDs to channel metadata (only created if `sync_channels` is enabled for any feed)

See [SCHEMA.md](SCHEMA.md) for detailed JSON schema documentation.
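
As a sketch of how a downstream script might consume this layout, the following Python reads the two files by their documented paths. The helper names are hypothetical, and the code makes no assumption about the fields inside each video or channel object (those are described in SCHEMA.md).

```python
import json
from pathlib import Path


def load_feed_videos(output_dir: str, feed_name: str) -> dict:
    """Load <output_dir>/feeds/<feed_name>/videos.json as a dict keyed by video ID."""
    path = Path(output_dir) / "feeds" / feed_name / "videos.json"
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def load_channels(output_dir: str) -> dict:
    """Load <output_dir>/channels/channels.json, or {} if it was never created."""
    path = Path(output_dir) / "channels" / "channels.json"
    if not path.exists():
        # channels.json only exists when sync_channels is enabled for some feed.
        return {}
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `sorted(load_feed_videos("./output", "music"))` would list the video IDs stored for the `music` feed.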

## API Quota Management

The YouTube Data API v3 has a default quota of **10,000 units per day**.

### Quota Costs

| Operation | Cost |
|-----------|------|
| videos.list | 1 unit per request |
| channels.list | 1 unit per request |

### Request Batching

This tool batches up to 50 IDs per request to minimize quota usage:

| Videos | Requests | Quota Used |
|--------|----------|------------|
| 50 | 1 | 1 unit |
| 100 | 2 | 2 units |
| 500 | 10 | 10 units |
| 1,000 | 20 | 20 units |
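
The arithmetic behind the table is a ceiling division by the batch size. The Python sketch below reproduces it; `chunk_ids` and `quota_units` are hypothetical names, and the estimate assumes channel IDs are batched 50 per request in the same way as video IDs. If the tool issues requests per feed, partially filled batches in each feed would push the real number slightly higher.

```python
import math

BATCH_SIZE = 50  # IDs per request, as stated in this README


def chunk_ids(ids: list[str], size: int = BATCH_SIZE) -> list[list[str]]:
    """Split a list of IDs into consecutive batches of at most `size` items."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def quota_units(video_count: int, channel_count: int = 0) -> int:
    """Estimate units: 1 per videos.list request plus 1 per channels.list request."""
    video_requests = math.ceil(video_count / BATCH_SIZE)
    channel_requests = math.ceil(channel_count / BATCH_SIZE)  # assumed batching
    return video_requests + channel_requests


for count in (50, 100, 500, 1000):
    print(count, quota_units(count))
```

Note that a partial batch still costs a full request: 51 videos need 2 requests.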

### Quota Tips

1. **Plan your syncs** - At 1 unit per 50-video request, 10,000 units/day covers up to ~500,000 videos daily, less whatever `channels.list` calls consume
2. **Use `sync_channels` selectively** - Only enable for feeds where you need channel data
3. **Monitor usage** - Check quota in the [Google Cloud Console](https://console.cloud.google.com/apis/api/youtube.googleapis.com/quotas)
4. **Request increase** - For larger workloads, apply for a quota increase via Google Cloud

### Rate Limiting

The tool automatically:
- Adds a 100 ms delay between batch requests
- Detects quota exceeded errors (HTTP 403 with `quotaExceeded` reason)
- Reports quota errors clearly in output
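
For scripts that call the API directly, the quota check described above can be sketched in Python. This is not the tool's implementation; `is_quota_exceeded` is a hypothetical helper, and the sample body follows the error shape the YouTube Data API commonly returns for quota errors.

```python
import json

# Illustrative error body; the message text is a placeholder.
SAMPLE_BODY = """
{
  "error": {
    "code": 403,
    "message": "Quota exceeded.",
    "errors": [
      {"domain": "youtube.quota", "reason": "quotaExceeded", "message": "Quota exceeded."}
    ]
  }
}
"""


def is_quota_exceeded(status: int, body: str) -> bool:
    """True if the response is HTTP 403 and lists the reason 'quotaExceeded'."""
    if status != 403:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return False
    return any(
        isinstance(item, dict) and item.get("reason") == "quotaExceeded"
        for item in error.get("errors", [])
    )


print(is_quota_exceeded(403, SAMPLE_BODY))
```

Checking the `reason` field rather than the status code alone matters because HTTP 403 is also used for other failures, such as a restricted API key.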

## Exit Codes

| Code | Meaning |
|------|---------|
| 0 | Success - all feeds processed |
| 1 | Partial failure - some feeds failed |
| 2 | Complete failure - all feeds failed or configuration error |
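
When the tool runs from a scheduler or build script, the exit code can drive what happens next. The Python sketch below wraps the documented command line; `run_sync` and `describe_exit_code` are hypothetical helpers, and the binary is assumed to be on `PATH`.

```python
import subprocess

# Meanings taken from the exit code table above.
EXIT_MEANINGS = {
    0: "success: all feeds processed",
    1: "partial failure: some feeds failed",
    2: "complete failure: all feeds failed or configuration error",
}


def describe_exit_code(code: int) -> str:
    """Map an exit code to its documented meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_sync(config_path: str) -> int:
    """Run the batch command and return its exit code."""
    result = subprocess.run(
        ["youtubeinfo-sync", "batch", "--batch-file", config_path]
    )
    return result.returncode


# Example (not executed here): print(describe_exit_code(run_sync("config.toml")))
```

A caller might treat code 1 as a warning and still publish the feeds that succeeded, while treating code 2 as a hard stop.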

## Examples

### Sync Music Videos

```toml
# music.toml
output_dir = "./data"

[[feed]]
name = "favorites"
videoids = [
    "dQw4w9WgXcQ",  # Rick Astley - Never Gonna Give You Up
    "9bZkp7q19f0",  # PSY - Gangnam Style
    "kJQP7kiw5Fk",  # Luis Fonsi - Despacito
]
sync_channels = true
```

```bash
youtubeinfo-sync batch --batch-file music.toml
```

### Monitor Multiple Playlists

```toml
# playlists.toml
output_dir = "./youtube-data"
sync_channels = false

[[feed]]
name = "tech-reviews"
videoids = ["video1", "video2", "video3"]  # Placeholders: use real 11-character video IDs

[[feed]]
name = "tutorials"
videoids = ["video4", "video5"]  # Placeholders: use real 11-character video IDs
sync_channels = true  # Only sync channels for tutorials
```

## Troubleshooting

### "Missing API key" Error

Ensure `YOUTUBE_API_KEY` is set:

```bash
echo $YOUTUBE_API_KEY  # Should print your key
```

### "Quota exceeded" Error

You've hit the daily quota limit. Options:
- Wait until quota resets (midnight Pacific Time)
- Request a quota increase from Google Cloud
- Reduce the number of videos being synced

### Videos Not Found

Some videos may be unavailable due to:
- Private or deleted videos
- Region restrictions
- Age restrictions

The tool logs warnings for missing videos but continues processing.
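
To find out after a run which requested videos did not come back, the configured IDs can be compared with the keys of `videos.json`. The Python sketch below assumes that unavailable videos are simply absent from that file, which this README does not state explicitly; `missing_video_ids` is a hypothetical helper.

```python
def missing_video_ids(requested: list[str], videos: dict) -> list[str]:
    """Return requested IDs that have no entry in the videos map, in order."""
    return [video_id for video_id in requested if video_id not in videos]


# `videos` stands in for the parsed contents of a feed's videos.json.
requested = ["dQw4w9WgXcQ", "9bZkp7q19f0", "kJQP7kiw5Fk"]
videos = {"dQw4w9WgXcQ": {}, "kJQP7kiw5Fk": {}}

print(missing_video_ids(requested, videos))
```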

## Example Usage

This tool generates the video metadata for [Evan Carroll's Classes](https://www.evancarroll.com/classes/), a static site that builds its video listings from this data.

## License

Copyright © 2025 Evan Carroll

This software is licensed under the Anti-Capitalist Software License (v 1.4).

See [LICENSE.md](LICENSE.md) for full license text, or visit https://anticapitalist.software/