archivr
A Tumblr backup tool.
Installing
At present, there are no binary distributions available. You must install with cargo install, use Nix, or compile the binary yourself.
With cargo, you can run
This repository also provides a Nix flake. You can run the command by using:
# or `nix shell` to add it to a shell
You can also add it to your Nix configuration that way.
Prerequisites
Register a new Tumblr application
In order to interact with the Tumblr API, archivr needs an OAuth consumer key and secret.
- In Tumblr, go to Settings > Apps
- Click on the "Register" link at the bottom
- Click the green "Register application" button
- Fill out the following fields:
  - Application name: archivr (this doesn't really matter, it is only for your reference)
  - Application website: codeberg.org/ryf/archivr (again, doesn't really matter)
  - Application description: Tumblr backup tool
  - Administrative contact email: your email
  - Default callback URL: http://localhost:6263/callback
  - OAuth2 redirect URLs: http://localhost:6263/redirect
- Click "Save changes"
Usage
This will kick off a job to back up an entire blog.
CLI flags
| Flag | Short | Description |
|---|---|---|
| `--consumer-key` | | Tumblr OAuth consumer key |
| `--consumer-secret` | | Tumblr OAuth consumer secret |
| `--config-file` | | Path to a JSON config file with `blog_name`, `consumer_key`, and `consumer_secret` |
| `--output-dir` | `-o` | Output directory (defaults to `./{blog_name}`) |
| `--json` | | Save posts as raw JSON instead of HTML |
| `--template` | `-t` | Custom Jinja template for HTML output (exclusive with `--json`) |
| `--directories` | `-d` | Create a subdirectory for each post |
| `--save-images` | | Download post images locally instead of linking to CDN |
| `--before` | | Only fetch posts before this date (Unix timestamp or RFC3339) |
| `--after` | | Only fetch posts after this date (Unix timestamp or RFC3339) |
| `--resume` | | Resume a previously interrupted backup |
| `--quiet` | `-q` | Suppress progress output |
| `--reauth` | | Force re-authentication, ignoring saved tokens |
| `--cookies-file` | | Path to a Netscape/Mozilla-format cookies file for dashboard access |
| `--dashboard` | | Use Tumblr's internal dashboard API (requires `--cookies-file`) |
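The `--before` and `--after` flags accept either a Unix timestamp or an RFC3339 string. As a rough illustration, the Python sketch below derives both forms for midnight UTC on a given calendar date. The helper name `to_unix_and_rfc3339` is made up for this example, and how archivr handles time zones or posts exactly on the boundary is not stated here, so treat the values as a starting point.

```python
from datetime import datetime, timezone


def to_unix_and_rfc3339(year, month, day):
    """Return (unix_timestamp, rfc3339_string) for midnight UTC on the given date.

    Hypothetical helper for illustration; it is not part of archivr.
    """
    moment = datetime(year, month, day, tzinfo=timezone.utc)
    # isoformat() emits "+00:00" for UTC; "Z" is the equivalent RFC3339 suffix.
    return int(moment.timestamp()), moment.isoformat().replace("+00:00", "Z")


unix_ts, rfc3339 = to_unix_and_rfc3339(2024, 1, 1)
print(unix_ts)   # 1704067200
print(rfc3339)   # 2024-01-01T00:00:00Z
```

Either printed value should be usable as the argument to `--before` or `--after`.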
Job config file
You can specify all of the CLI arguments in a config file as well, passing --config-file <PATH> instead.
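As a minimal sketch, the Python snippet below writes a config file containing only the three keys named in the flags table (`blog_name`, `consumer_key`, `consumer_secret`). The file name `job.json`, the helper `write_job_config`, and the placeholder values are assumptions for illustration; the key names for other options are not listed here, so they are left out.

```python
import json
import tempfile
from pathlib import Path


def write_job_config(path, blog_name, consumer_key, consumer_secret):
    """Write a JSON job config containing the three documented keys.

    Hypothetical helper for illustration; it is not part of archivr.
    """
    config = {
        "blog_name": blog_name,
        "consumer_key": consumer_key,
        "consumer_secret": consumer_secret,
    }
    Path(path).write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo: write to a temporary directory and read the file back.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "job.json"  # hypothetical file name
    write_job_config(
        config_path, "example-blog", "YOUR_CONSUMER_KEY", "YOUR_CONSUMER_SECRET"
    )
    loaded = json.loads(config_path.read_text(encoding="utf-8"))
    print(sorted(loaded))  # ['blog_name', 'consumer_key', 'consumer_secret']
```

The resulting path is what you would pass to `--config-file`.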
Custom templates
By default, archivr renders each post as a self-contained HTML file using a built-in template. You can override this with your own Jinja template:
Note: The `--template` (`-t`) flag is mutually exclusive with `--json`. When `--json` is set, posts are saved as raw JSON and no template rendering occurs.
Templates are rendered with minijinja, which supports standard Jinja2 syntax: `{{ }}` for expressions, `{% %}` for control flow, and `{# #}` for comments.
Template context
Your template receives the following variables:
| Variable | Type | Description |
|---|---|---|
| `post` | object | The full post object (see fields below) |
| `is_reblog` | bool | `true` if the post was reblogged from another blog |
| `is_original` | bool | `true` if the post is original content |
| `newer_href` | string? | Relative URL to the next-newer post (for navigation links) |
Post fields
Access these as {{ post.field_name }}:
| Field | Type | Description |
|---|---|---|
| `id` | int | The post ID |
| `blog_name` | string | Name of the blog |
| `post_url` | string | Full URL to the post on Tumblr |
| `post_type` | string | Post type (e.g. "text", "photo") |
| `original_type` | string | Original post type before conversion |
| `timestamp` | int | Unix timestamp |
| `date` | string | Human-readable date |
| `content` | list | Content blocks (see below) |
| `trail` | list | Reblog trail items |
| `tags` | list | List of tag strings |
| `summary` | string | Post summary text |
| `note_count` | int | Number of notes |
| `slug` | string | URL slug |
| `short_url` | string | Short URL |
| `reblog_key` | string | Reblog key |
| `state` | string | Post state (e.g. "published") |
| `reblogged_from_name` | string? | Blog name this was reblogged from |
| `reblogged_from_url` | string? | URL of the blog this was reblogged from |
| `reblogged_root_name` | string? | Original post's blog name |
| `reblogged_root_url` | string? | Original post's blog URL |
| `liked` | bool | Whether you liked the post |
| `followed` | bool | Whether you follow the blog |
Content blocks
Each item in post.content (and in each trail item's content) is an object with a type field. The possible types and their fields are:
`text`: A text block.
- `text` (string): The text content (may contain HTML).
- `subtype` (string?): Style hint, e.g. "heading1", "heading2", "quote", "indented", "chat".

`image`: An image block.
- `media` (list): Each entry has `url`, `width`, `height`, and `media_type`.
- `alt_text` (string?): Alt text for the image.
- `caption` (string?): Image caption.

`video`: A video block.
- `media` (list?): Each entry has `url`, `width`, `height`, and `media_type`.
- `url` (string?): External video URL (when no direct media).
- `provider` (string?): Video provider name (e.g. "youtube").
- `embed_html` (string?): Embeddable HTML from the provider.
- `duration` (number?): Duration in seconds.

`audio`: An audio block.
- `media` (list?): Each entry has `url` and `media_type`.
- `url` (string?): External audio URL.
- `provider` (string?): Audio provider name.
- `title` (string?): Track title.
- `artist` (string?): Artist name.
- `album` (string?): Album name.
- `embed_html` (string?): Embeddable HTML.

`link`: A link block.
- `url` (string): The link URL.
- `title` (string?): Link title.
- `description` (string?): Link description.

`paywall`: A paywall/premium content marker.
- `text` (string?): Display text (defaults to "Premium content").
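To make the block shapes above more concrete, here is a small Python sketch that models an image block as a plain dictionary and picks its widest media entry. The sample data, the example.com URLs, and the helper `largest_media` are invented for illustration. Whether archivr's `--json` output uses exactly this shape is an assumption, not something stated here.

```python
def largest_media(block):
    """Return the widest entry in a block's media list, or None if there is none.

    Hypothetical helper for illustration; it is not part of archivr.
    """
    media = block.get("media") or []  # media may be missing or null on video/audio
    return max(media, key=lambda entry: entry.get("width") or 0, default=None)


# Made-up image block following the field list above.
sample_image_block = {
    "type": "image",
    "media": [
        {"url": "https://example.com/small.jpg", "width": 250,
         "height": 188, "media_type": "image/jpeg"},
        {"url": "https://example.com/large.jpg", "width": 1280,
         "height": 960, "media_type": "image/jpeg"},
    ],
    "alt_text": "A placeholder image",
    "caption": None,
}

print(largest_media(sample_image_block)["url"])  # https://example.com/large.jpg
print(largest_media({"type": "video", "media": None}))  # None
```

The `or []` guard matters because `media` is optional (`list?`) on video and audio blocks.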
Trail items
Each item in post.trail has:
| Field | Type | Description |
|---|---|---|
| `content` | list | Content blocks (same types as above) |
| `blog` | object? | Blog info with `name`, `url`, and `uuid` |
| `post` | object? | Post info with `id` |
| `is_root_item` | bool | Whether this is the root trail item |
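The trail structure can be sketched the same way. The Python snippet below walks a made-up trail and summarizes each item, tolerating the optional `blog` field. The sample data and the helper `summarize_trail` are hypothetical and only mirror the fields in the table above.

```python
def summarize_trail(trail):
    """Return (blog_name_or_None, block_types, is_root) for each trail item.

    Hypothetical helper for illustration; it is not part of archivr.
    """
    summary = []
    for item in trail:
        blog = item.get("blog") or {}  # blog is optional (object?)
        block_types = [block["type"] for block in item.get("content", [])]
        summary.append(
            (blog.get("name"), block_types, bool(item.get("is_root_item")))
        )
    return summary


# Made-up trail: a root post followed by a reblog whose blog info is missing.
sample_trail = [
    {
        "content": [{"type": "text", "text": "Original post"}],
        "blog": {"name": "first-blog", "url": "https://example.com/a", "uuid": "t:aaa"},
        "post": {"id": 1},
        "is_root_item": True,
    },
    {
        "content": [{"type": "text", "text": "A reply"}, {"type": "image", "media": []}],
        "blog": None,
        "post": None,
        "is_root_item": False,
    },
]

for entry in summarize_trail(sample_trail):
    print(entry)
```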
The render_block() function
Templates have access to a built-in render_block(block) function that converts a content block into the default HTML representation. This lets you customize the overall page layout while reusing the default rendering for individual blocks:
{# Loop through content blocks, using the built-in renderer for each one #}
{% for block in post.content %}
{{ render_block(block) }}
{% endfor %}
You can also selectively override rendering for specific block types:
{% for block in post.content %}
{% if block.type == "image" %}
{# Custom image rendering #}
{% for m in block.media %}
<img src="{{ m.url }}" alt="{{ block.alt_text }}" loading="lazy">
{% endfor %}
{% else %}
{{ render_block(block) }}
{% endif %}
{% endfor %}
Example: minimal custom template
{{ post.blog_name }} - {{ post.id }}
{{ post.blog_name }}
{{ post.date }} · {{ post.note_count }} notes
{% if is_reblog %}
Reblogged from {{ post.reblogged_from_name }}
{% endif %}
{% for item in post.trail %}
{% if item.blog %}{{ item.blog.name }}:{% endif %}
{% for block in item.content %}
{{ render_block(block) }}
{% endfor %}
{% endfor %}
{% for block in post.content %}
{{ render_block(block) }}
{% endfor %}
{% for tag in post.tags %}
#{{ tag }}
{% endfor %}
Planned features
The following features are not yet implemented but are planned for future releases:
- Incremental backups (only fetch posts newer than the last run)
- Video and audio downloading (`--save-video`, `--save-audio`)
- Liked posts backup (`--likes`)
- Tag filtering (`--include-tags`)
- Notes backup (`--save-notes`)
- Index page generation (`--index-file`)
- Automatic rate limit retry with backoff