eclipse-sanitizer 0.1.1

A fast Rust CLI for sanitizing metadata from documents and images

Eclipse

Local-first metadata sanitization for documents and images.
One Rust CLI to silence the gossip, keep the payload, and leave your files intact. A note on names: the GitHub repo is Eclipse, the crates.io package is eclipse-sanitizer (eclipse was already taken), and the binary is simply eclipse, because even tools deserve a stage name.


What is this | Install | Update | Uninstall | Quick Start | CLI Options | Commands | Supported Files | Security | Audit Log | Project Structure | FAQ | License


Metadata has one job: stay in its lane. Eclipse gives it the boot.

What is this

Eclipse removes common metadata from files without mangling the actual content. It walks directories recursively, can rewrite files in place or into a separate output directory, and compares SHA-256 hashes before and after persistence so the sanitized file does not quietly go feral.

In plain English: it is the digital equivalent of telling the metadata to sit down, shut up, and stop posting on main.
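
The recursive walk can be sketched with the standard library alone. This is a minimal illustration, not Eclipse's actual discovery code: the names SUPPORTED, is_supported, and discover are hypothetical, the extension list is copied from the Supported Files section, and case-insensitive matching is an assumption.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Extensions copied from the Supported Files section.
const SUPPORTED: &[&str] = &[
    "pdf", "docx", "docm", "dotx", "dotm", "xlsx", "xlsm", "xltx", "xltm",
    "pptx", "pptm", "potx", "potm", "png", "jpg", "jpeg",
];

/// Returns true when the path has a supported extension.
/// Case-insensitive matching is an assumption of this sketch.
fn is_supported(path: &Path) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| SUPPORTED.contains(&ext.to_ascii_lowercase().as_str()))
        .unwrap_or(false)
}

/// Collects supported files under `input`, descending into subdirectories.
/// A single-file input yields at most one entry.
/// Note: this naive walk follows symlinks and does not guard against cycles.
fn discover(input: &Path, found: &mut Vec<PathBuf>) -> io::Result<()> {
    if input.is_dir() {
        for entry in fs::read_dir(input)? {
            discover(&entry?.path(), found)?;
        }
    } else if is_supported(input) {
        found.push(input.to_path_buf());
    }
    Ok(())
}
```

Anything the filter rejects is simply skipped, which is why a folder full of .txt files produces the "No supported files found" outcome described in the FAQ.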

Supported Files

Type | Extensions | Notes
PDF | pdf | Strips trailer and document metadata fields where possible.
OOXML | docx, docm, dotx, dotm, xlsx, xlsm, xltx, xltm, pptx, pptm, potx, potm | Rewrites archives and removes document properties XML.
PNG | png | Removes metadata chunks such as text and time chunks.
JPEG | jpg, jpeg | Removes common metadata segments like EXIF, XMP, IPTC, and comments.
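
To make the PNG entry concrete, here is a dependency-free sketch of chunk stripping. It is not the code in src/sanitizers/png.rs: the function name strip_png_metadata is hypothetical, and the list of chunk types treated as metadata (tEXt, zTXt, iTXt, tIME) is an assumption based on the "text and time chunks" wording. It relies only on the PNG container layout: an 8-byte signature followed by chunks of length, type, data, and CRC.

```rust
/// The fixed 8-byte signature that opens every PNG file.
const PNG_SIGNATURE: [u8; 8] = [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A];

/// Chunk types treated as metadata in this sketch (an assumed list).
const METADATA_CHUNKS: [&[u8; 4]; 4] = [b"tEXt", b"zTXt", b"iTXt", b"tIME"];

/// Copies a PNG byte stream, skipping metadata chunks.
/// Returns the cleaned bytes and the names of the chunks that were dropped.
fn strip_png_metadata(input: &[u8]) -> Result<(Vec<u8>, Vec<String>), String> {
    if input.len() < 8 || input[..8] != PNG_SIGNATURE {
        return Err("not a PNG file".to_string());
    }
    let mut output = input[..8].to_vec();
    let mut removed = Vec::new();
    let mut pos = 8;
    while pos < input.len() {
        // Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        let header = input.get(pos..pos + 8).ok_or("truncated chunk header")?;
        let length = u32::from_be_bytes([header[0], header[1], header[2], header[3]]) as usize;
        let kind = [header[4], header[5], header[6], header[7]];
        let end = pos + 12 + length;
        let chunk = input.get(pos..end).ok_or("truncated chunk body")?;
        if METADATA_CHUNKS.iter().any(|m| **m == kind) {
            removed.push(String::from_utf8_lossy(&kind).into_owned());
        } else {
            // Kept chunks are copied verbatim, so their CRCs stay valid.
            output.extend_from_slice(chunk);
        }
        pos = end;
    }
    Ok((output, removed))
}
```

Because whole chunks are dropped and the rest are copied byte for byte, no CRC needs to be recomputed and the pixel data is never decoded.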

Install

Prerequisites

  • Rust stable toolchain
  • Cargo

If you do not already have Rust installed, use rustup first. It is the least dramatic way to get Cargo on your machine.

From crates.io

cargo install eclipse-sanitizer

That is the one-liner if you want the published build and would rather not flirt with cargo build on a Friday.

From source

git clone <REPO_URL>
cd Eclipse
cargo build --release

From a local path

cargo install --path .

That installs the eclipse binary into your Cargo bin directory, because the binary should still have a respectable name even if the package name got longer.

Update

If you installed Eclipse from crates.io, update it with the one-liner:

cargo install eclipse-sanitizer --force

If you installed from the local source tree, update it with:

cargo install --path . --force

If you are developing from the repository itself, the usual update flow is:

git pull
cargo build --release

Uninstall

If installed with Cargo from crates.io, remove it with:

cargo uninstall eclipse-sanitizer

If you built from source without installing globally, remove the project directory or the compiled binary in target/release.

Quick Start

  1. Build or install Eclipse.
  2. Point it at a file or folder.
  3. Use --dry-run first if you enjoy not being ambushed by your own filesystem.

Run the binary against either a single file or a directory:

eclipse <INPUT>

Write sanitized output to a separate folder:

eclipse <INPUT> --output <OUTPUT_DIR>

Preview changes without writing anything:

eclipse <INPUT> --dry-run

Override the worker thread count:

eclipse <INPUT> --jobs 8

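Example: sanitize a folder into a new destination with four workers. The paths use Windows-style separators; in a Unix-like shell, write ./sample-files and ./sanitized instead.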

eclipse .\sample-files --output .\sanitized --jobs 4

Example: preview a single file and keep your hands clean.

eclipse .\report.pdf --dry-run

CLI Options

Option | Description
INPUT | File or directory to scan.
-o, --output <OUTPUT_DIR> | Write sanitized output to a separate directory. If omitted, files are rewritten in place.
--jobs <THREADS> | Override the Rayon worker thread count. Use this when you want to tell the scheduler how hard to flex.
--dry-run | Report what would be removed without writing output.
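
The option semantics above can be modelled in a few lines. The project defines its CLI with Clap in src/cli.rs; the sketch below is a hand-rolled, dependency-free stand-in for illustration only. The Options struct, its field names, the parse_args function, and the error messages are all hypothetical.

```rust
use std::path::PathBuf;

/// Parsed options, mirroring the options listed above. Field names are hypothetical.
#[derive(Debug, PartialEq)]
struct Options {
    input: PathBuf,
    output: Option<PathBuf>,
    jobs: Option<usize>,
    dry_run: bool,
}

/// Parses arguments (without the program name) into `Options`.
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<Options, String> {
    let mut args = args.into_iter();
    let mut input: Option<PathBuf> = None;
    let mut output: Option<PathBuf> = None;
    let mut jobs: Option<usize> = None;
    let mut dry_run = false;
    while let Some(arg) = args.next() {
        match arg.as_str() {
            "-o" | "--output" => {
                let dir = args.next().ok_or("--output needs a directory")?;
                output = Some(PathBuf::from(dir));
            }
            "--jobs" => {
                let raw = args.next().ok_or("--jobs needs a thread count")?;
                let n: usize = raw
                    .parse()
                    .map_err(|_| format!("invalid thread count: {raw}"))?;
                jobs = Some(n);
            }
            "--dry-run" => dry_run = true,
            flag if flag.starts_with('-') => return Err(format!("unknown option: {flag}")),
            path if input.is_none() => input = Some(PathBuf::from(path)),
            extra => return Err(format!("unexpected argument: {extra}")),
        }
    }
    // INPUT is the only required argument.
    let input = input.ok_or("missing INPUT")?;
    Ok(Options { input, output, jobs, dry_run })
}
```

An output of None corresponds to the in-place mode, and a jobs value of None leaves the worker count at whatever the thread pool picks by default.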

Commands

Runtime Commands

Command | What It Does
eclipse <INPUT> | Scan and sanitize a file or directory in place.
eclipse <INPUT> --output <OUTPUT_DIR> | Sanitize into a separate output directory while preserving relative paths.
eclipse <INPUT> --dry-run | Show the planned changes without touching files.
eclipse <INPUT> --jobs <THREADS> | Run with a custom worker pool size.
eclipse <INPUT> --output <OUTPUT_DIR> --dry-run | Preview output mapping and metadata removal together.
eclipse <INPUT> --output <OUTPUT_DIR> --jobs <THREADS> | Full-speed output mode with a custom thread count.
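
"Preserving relative paths" can be pictured as a small path-mapping step. The sketch below is an assumption about how that mapping might look, not the logic in src/app.rs; map_destination is a hypothetical name, and the single-file behaviour (keep only the file name) is a guess.

```rust
use std::path::{Path, PathBuf};

/// Maps a discovered file to its destination under `output_dir`,
/// keeping the path relative to `input_root`.
/// Returns None when `file` does not live under `input_root`.
fn map_destination(input_root: &Path, file: &Path, output_dir: &Path) -> Option<PathBuf> {
    if input_root == file {
        // Single-file input: keep just the file name (assumed behaviour).
        return file.file_name().map(|name| output_dir.join(name));
    }
    file.strip_prefix(input_root)
        .ok()
        .map(|relative| output_dir.join(relative))
}
```

With an input root of sample-files and an output directory of sanitized, a file at sample-files/reports/q1.pdf would land at sanitized/reports/q1.pdf under this scheme.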

Cargo Commands

Command | What It Does
cargo build | Build the project in debug mode.
cargo build --release | Build an optimized release binary.
cargo run -- --help | Show the CLI help screen.
cargo test | Run the unit tests for the sanitizers.
cargo check | Validate the code compiles without building the final binary.
cargo install eclipse-sanitizer | Install the published package from crates.io.
cargo install eclipse-sanitizer --force | Update the published package from crates.io.
cargo uninstall eclipse-sanitizer | Remove the published package from your machine.
cargo install --path . | Install Eclipse locally as a Cargo binary.
cargo install --path . --force | Update a local Cargo installation.

Security

Eclipse uses a defensive workflow rather than a hopeful one.

  • Files are written through a temporary file first, then moved into place atomically.
  • SHA-256 hashes are calculated before and after processing.
  • The program verifies the persisted file hash before considering the job successful.
  • --dry-run avoids writes entirely.
  • Interrupt handling is wired so CTRL+C stops further processing cleanly.
  • Audit output is kept separate from the files being sanitized.

This is not a “trust me bro” pipeline. It double-checks its own homework and then asks the compiler to sign it in blood.
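
The write-then-verify flow from the list above can be sketched as follows. Two loud caveats: the standard library has no SHA-256, so the digest function here uses DefaultHasher purely as a stand-in to show the control flow, and the temporary-file naming (.eclipse-tmp suffix) is invented for this example. The names digest and persist_verified are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::Hasher;
use std::io::{self, Write};
use std::path::Path;

/// Stand-in digest. NOT SHA-256: this only illustrates where hashing happens.
fn digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(bytes);
    hasher.finish()
}

/// Writes `sanitized` to a temporary file beside `dest`, renames it into
/// place, then re-reads the persisted file and compares digests.
fn persist_verified(dest: &Path, sanitized: &[u8]) -> io::Result<u64> {
    let expected = digest(sanitized);
    let name = dest.file_name().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "destination has no file name")
    })?;
    // Hypothetical naming; same directory so the rename stays on one filesystem.
    let mut tmp_name = name.to_os_string();
    tmp_name.push(".eclipse-tmp");
    let tmp = dest.with_file_name(tmp_name);
    {
        let mut file = fs::File::create(&tmp)?;
        file.write_all(sanitized)?;
        file.sync_all()?; // flush to disk before the rename
    }
    fs::rename(&tmp, dest)?;
    // Verify what actually landed on disk, not what we meant to write.
    let persisted = digest(&fs::read(dest)?);
    if persisted != expected {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "hash mismatch after write"));
    }
    Ok(persisted)
}
```

A caller would compute the digest of the original bytes before sanitizing, then record both values once persist_verified returns successfully. In dry-run mode this function would simply never be called.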

Audit Log

Each run writes structured audit events to .eclipse_audit.log in the output directory when one is used. If no output directory is configured, the log is written beside the input data when possible.

The audit stream records details such as:

  • timestamp
  • status
  • file kind
  • source path
  • destination path
  • original hash
  • sanitized hash
  • removed items
  • errors, when present
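
As a rough picture of one audit record, the fields above could be carried in a struct like the one below. The actual logger is tracing-based and its on-disk format is not documented here, so the struct name, field names, field types, and the key=value line format are all assumptions made for illustration.

```rust
use std::fmt;

/// One audit record with the fields listed above. Names and types are hypothetical.
struct AuditEvent {
    timestamp: String,
    status: String,
    file_kind: String,
    source: String,
    destination: String,
    original_hash: String,
    sanitized_hash: String,
    removed: Vec<String>,
    error: Option<String>,
}

impl fmt::Display for AuditEvent {
    /// Renders the event as a single key=value line (an assumed format).
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "timestamp={:?} status={:?} kind={:?} source={:?} destination={:?} \
             original_hash={:?} sanitized_hash={:?} removed={:?}",
            self.timestamp,
            self.status,
            self.file_kind,
            self.source,
            self.destination,
            self.original_hash,
            self.sanitized_hash,
            self.removed,
        )?;
        // The error field only appears when something went wrong.
        if let Some(err) = &self.error {
            write!(f, " error={:?}", err)?;
        }
        Ok(())
    }
}
```

Debug formatting quotes and escapes each value, which keeps paths with spaces on one parseable line.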

Project Structure

File | Purpose
src/main.rs | Application entry point.
src/cli.rs | Clap CLI definition.
src/app.rs | Discovery, orchestration, reporting, and persistence.
src/models.rs | Shared file and run models.
src/hashing.rs | SHA-256 hashing helper.
src/audit.rs | Tracing-based audit logger.
src/sanitizers/mod.rs | Sanitizer trait and registry.
src/sanitizers/pdf.rs | PDF sanitization logic.
src/sanitizers/ooxml.rs | OOXML archive rewriting.
src/sanitizers/png.rs | PNG metadata stripping.
src/sanitizers/jpeg.rs | JPEG metadata stripping.
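
The "sanitizer trait and registry" layout suggests a shape along these lines. This is a guess at the pattern, not the real definitions: the trait signature, the JpegSanitizer type, and find_sanitizer are hypothetical, and the choice of segments to drop (APP1 for EXIF/XMP, APP13 for IPTC, COM for comments) is an assumption drawn from the Supported Files section.

```rust
use std::path::Path;

/// Hypothetical trait; the real definition in src/sanitizers/mod.rs may differ.
trait Sanitizer {
    /// Lowercase extensions this sanitizer handles.
    fn extensions(&self) -> &'static [&'static str];
    /// Returns the cleaned bytes plus a label for each removed item.
    fn sanitize(&self, input: &[u8]) -> Result<(Vec<u8>, Vec<String>), String>;
}

/// Toy JPEG sanitizer: drops APP1, APP13, and COM segments before the scan data.
struct JpegSanitizer;

impl Sanitizer for JpegSanitizer {
    fn extensions(&self) -> &'static [&'static str] {
        &["jpg", "jpeg"]
    }

    fn sanitize(&self, input: &[u8]) -> Result<(Vec<u8>, Vec<String>), String> {
        if input.len() < 2 || input[..2] != [0xFF, 0xD8] {
            return Err("not a JPEG file".to_string());
        }
        let mut output = vec![0xFF, 0xD8]; // start-of-image marker
        let mut removed = Vec::new();
        let mut pos = 2;
        while pos < input.len() {
            // Segment layout: 0xFF, marker byte, 2-byte big-endian length, payload.
            let header = input.get(pos..pos + 4).ok_or("truncated segment header")?;
            if header[0] != 0xFF {
                return Err("expected a segment marker".to_string());
            }
            if header[1] == 0xDA {
                // Start of scan: compressed image data follows, copy the rest verbatim.
                output.extend_from_slice(&input[pos..]);
                return Ok((output, removed));
            }
            // The length field counts its own two bytes but not the marker.
            let length = u16::from_be_bytes([header[2], header[3]]) as usize;
            if length < 2 {
                return Err("invalid segment length".to_string());
            }
            let segment = input.get(pos..pos + 2 + length).ok_or("truncated segment body")?;
            match header[1] {
                0xE1 => removed.push("APP1 (EXIF/XMP)".to_string()),
                0xED => removed.push("APP13 (IPTC)".to_string()),
                0xFE => removed.push("COM".to_string()),
                _ => output.extend_from_slice(segment),
            }
            pos += 2 + length;
        }
        Err("no start-of-scan marker found".to_string())
    }
}

/// Picks the first registered sanitizer whose extension list matches `path`.
fn find_sanitizer<'a>(registry: &'a [Box<dyn Sanitizer>], path: &Path) -> Option<&'a dyn Sanitizer> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    for sanitizer in registry {
        if sanitizer.extensions().contains(&ext.as_str()) {
            return Some(sanitizer.as_ref());
        }
    }
    None
}
```

Adding a new file type under this design means implementing the trait once and pushing the new sanitizer into the registry; discovery and persistence do not need to know about formats.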

FAQ

No supported files found?

Double-check the extensions. Eclipse only processes files it knows how to sanitize, not every random blob with confidence issues.

Output directory must not be the same as the input directory?

That is a safety guard. Point --output somewhere else so the sanitizer does not politely eat its own lunch.
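
A guard of this kind can be approximated by comparing canonical paths. The sketch below is an assumption about the check, not Eclipse's implementation; ensure_distinct is a hypothetical name, and treating a not-yet-created output directory as safe is a design choice of this example. It does not reject an output directory nested inside the input, which the real tool may or may not allow.

```rust
use std::io;
use std::path::Path;

/// Rejects an output directory that resolves to the same location as the input.
fn ensure_distinct(input: &Path, output: &Path) -> io::Result<()> {
    let input = input.canonicalize()?;
    // The output directory may not exist yet; in that case it cannot be the input.
    let output = match output.canonicalize() {
        Ok(path) => path,
        Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(()),
        Err(err) => return Err(err),
    };
    if input == output {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "output directory must not be the same as the input directory",
        ));
    }
    Ok(())
}
```

Canonicalizing first means that spellings such as a trailing /. or a symlink to the same folder are caught, not just byte-identical paths.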

One or more files failed to sanitize?

Inspect the console output and .eclipse_audit.log for the exact file and error. Some files may be malformed, cursed, or just plain rude to parse.

Build or test fails?

Try these commands in order:

cargo check
cargo test
cargo run -- --help

License

MIT License. See LICENSE for the legally boring but necessary bits.