biip 0.5.0

Redact sensitive information from text!

biip

biip (Beep + PII) is a tool (and a library) to scrub PII from text.

Install

For Linux and macOS, you can install biip using the pre-built binaries:

curl -sfSL https://raw.githubusercontent.com/crodjer/biip/main/download.sh | bash

If you have Rust installed, you can install biip using Cargo:

cargo install biip

How does it work?

Pipe any text to biip to have it scrub away sensitive information.

For example, given a file /tmp/info.txt with the following content:

Hi, I am "awesome-user"
Current Directory: /Users/awesome-user/foo/bar/baz
My Secret Key: mAM3zwogXpV6Czj6J
My Email: foo@bar.com
My IPs:
- fe80::aaa:8888:ffff:9999
- 192.168.42.42
Connect via ftp://user:pass@example.com
Auth token is eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0In0.sZtZQ...
My MAC address is 00-1A-2B-3C-4D-5E.

biip can redact some sensitive information from it:

$ cat /tmp/info.txt | biip
Hi, I am "user"
Current Directory: ~/foo/bar/baz
My Secret Key: ••••••••
My Email: •••@•••
My IPs:
- IPv6<••:••:••:••:••:••:••:••>
- IPv4<••.••.••.••>
Connect via ftp://••••:••••@example.com
Auth token is ••••🌐•
My MAC address is ••:••:••:••:••:••.
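
To make the pipe-based idea concrete, here is a rough sketch of a stdin-to-stdout filter that mimics three of the replacements shown above (URL credentials, emails, IPv4 addresses) with sed. This is not how biip is implemented: scrub_sketch is a hypothetical name, and the regular expressions are simplified assumptions that will miss or over-match cases biip handles.

```shell
# Illustrative stand-in for a few of biip's rules; NOT biip's implementation.
# scrub_sketch is a hypothetical name. Reads stdin, writes redacted text to stdout.
scrub_sketch() {
  # Rule order matters: URL credentials are masked first so that the
  # "pass@example.com" part of a URL is not mistaken for an email address.
  sed -E \
    -e 's#([A-Za-z][A-Za-z0-9+.-]*://)[^/@:[:space:]]+:[^/@[:space:]]+@#\1••••:••••@#g' \
    -e 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/•••@•••/g' \
    -e 's/([0-9]{1,3}\.){3}[0-9]{1,3}/IPv4<••.••.••.••>/g'
}

# Usage: cat /tmp/info.txt | scrub_sketch
```

Because the filter only reads stdin and writes stdout, it can sit anywhere in a pipeline, which is the same shape biip has.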

What does it scrub?

biip can scrub:

  1. Unix (Linux/macOS) username: Removes any mention of the user's Unix username.
  2. Home directory: Replaces paths referring to the home directory with ~.
  3. URL Credentials: Scrubs usernames and passwords from URLs (e.g., https://user:pass@...).
  4. Email Addresses: Replaces emails with •••@•••.
  5. IP Addresses: Redacts IPv4 and IPv6 addresses.
  6. MAC Addresses: Replaces MAC addresses.
  7. Phone Numbers: Redacts common phone number formats.
  8. Credit Card Numbers: Redacts common credit card number patterns.
  9. JSON Web Tokens (JWTs): Finds and redacts JWTs.
  10. API Keys: Redacts common API key formats from providers like AWS, OpenAI, etc.
  11. UUIDs: Replaces UUIDs with a redacted pattern.
  12. Keys / Passwords from environment: Replaces the values of any potentially sensitive environment variables with ••••••••.
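
Item 12 is the least pattern-like of these, so a small sketch may help. The idea, as assumed here, is to collect the values of environment variables whose names look sensitive and mask any literal occurrence of those values in the input. The name pattern (KEY, SECRET, TOKEN, PASSWORD), the minimum value length of 4, and the function name redact_env_values are all assumptions made for illustration; biip's actual rules may differ.

```shell
# Sketch only: mask the values of "sensitive-looking" environment variables.
# The name pattern and the minimum length are assumptions, not biip's rules.
redact_env_values() {
  awk '
    BEGIN {
      # Collect values of variables whose names hint at a secret.
      for (name in ENVIRON) {
        if (name ~ /(KEY|SECRET|TOKEN|PASSWORD)/ && length(ENVIRON[name]) >= 4)
          secrets[ENVIRON[name]] = 1
      }
    }
    {
      line = $0
      for (s in secrets) {
        # Literal (non-regex) replacement of every occurrence of s.
        out = ""
        while ((i = index(line, s)) > 0) {
          out = out substr(line, 1, i - 1) "••••••••"
          line = substr(line, i + length(s))
        }
        line = out line
      }
      print line
    }
  '
}

# Usage: export MY_SECRET_KEY=...; cat /tmp/info.txt | redact_env_values
```

Matching on the literal value rather than on a regular expression is what lets an arbitrary string such as mAM3zwogXpV6Czj6J be caught even though it follows no recognizable format.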

How is it useful?

LLM Context

When sharing code with an LLM for assistance, run it through biip first to strip out any sensitive information. For example:

fd --size -8K | xargs tail -n +1 | biip | pbcopy

This copies your codebase to the clipboard, skipping large files (those above the 8K size limit passed to fd) and redacting sensitive information. On Linux, use xclip (for X11) or wl-copy (for Wayland) instead of pbcopy.
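
If the same command has to work across machines, the clipboard tool can be chosen at run time. The helper below is a hypothetical sketch (pick_clipboard_cmd is not part of biip); it assumes that wl-copy should only be used when WAYLAND_DISPLAY is set, and that xclip should target the clipboard selection rather than its default primary selection.

```shell
# Hypothetical helper: print the first clipboard command found on this machine.
pick_clipboard_cmd() {
  if command -v pbcopy >/dev/null 2>&1; then
    echo "pbcopy"                          # macOS
  elif [ -n "${WAYLAND_DISPLAY:-}" ] && command -v wl-copy >/dev/null 2>&1; then
    echo "wl-copy"                         # Wayland
  elif command -v xclip >/dev/null 2>&1; then
    echo "xclip -selection clipboard"      # X11
  else
    return 1                               # nothing suitable installed
  fi
}

# Usage (left unquoted on purpose so the xclip arguments are split into words):
#   fd --size -8K | xargs tail -n +1 | biip | $(pick_clipboard_cmd)
```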

Copying .env

biip also takes .env into account, so the sensitive values defined there are not shared even when the contents of .env itself are piped in through stdin. Keys, secrets, and similar values are redacted from the output:

$ cat .env | biip
S3_KEY="••••••••"
S3_SECRET="••••••••"
OPENAI_API_KEY="••••☁️•"