base-d

A universal, multi-alphabet encoding library and CLI tool for Rust. Encode binary data to 33+ alphabets including RFC standards, ancient scripts, emoji, playing cards, and more.
Overview
base-d is a flexible encoding framework that goes far beyond traditional base64. It supports:
- 33 built-in alphabets - From RFC 4648 standards to hieroglyphics and emoji
- 3 encoding modes - Mathematical, chunked (RFC-compliant), and byte-range
- Custom alphabets - Define your own via TOML configuration
- Streaming support - Memory-efficient processing for large files
- Library + CLI - Use programmatically or from the command line
Key Features
Multiple Encoding Modes
- Mathematical Base Conversion - Treats data as a large number, works with any alphabet size
- Chunked Mode - RFC 4648 compatible (base64, base32, base16)
- Byte Range Mode - Direct 1:1 byte-to-emoji mapping (base100)
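As a rough illustration of the chunked mode, the sketch below splits the input bit stream into fixed-width groups and maps each group to one symbol. `encode_chunked` is a hypothetical helper written for this README, not base-d's API. It assumes a power-of-two alphabet of ASCII symbols and omits the `=` padding that RFC 4648 normally appends.

```rust
/// Encode `data` by splitting its bits into fixed-width groups.
/// Hypothetical helper: `alphabet.len()` must be a power of two (16, 32, 64, ...),
/// and no `=` padding is produced.
fn encode_chunked(data: &[u8], alphabet: &[u8]) -> String {
    assert!(alphabet.len() >= 2 && alphabet.len().is_power_of_two());
    let bits = alphabet.len().trailing_zeros(); // bits consumed per output symbol
    let mask = (1u32 << bits) - 1;

    let mut out = String::new();
    let mut buffer: u32 = 0; // holds bits not yet emitted
    let mut buffered: u32 = 0; // number of valid bits in `buffer`

    for &byte in data {
        buffer = (buffer << 8) | byte as u32;
        buffered += 8;
        // Emit as many full groups as the buffer currently holds.
        while buffered >= bits {
            buffered -= bits;
            out.push(alphabet[((buffer >> buffered) & mask) as usize] as char);
        }
        // Keep only the bits that have not been emitted yet.
        buffer &= (1u32 << buffered) - 1;
    }

    // Left-align any leftover bits and zero-fill on the right.
    if buffered > 0 {
        out.push(alphabet[((buffer << (bits - buffered)) & mask) as usize] as char);
    }
    out
}
```

With the standard base64 alphabet, `encode_chunked(b"Man", ...)` yields `TWFu`, and the two bytes `Ma` yield `TWE` (RFC 4648 would pad this to `TWE=`).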
Extensive Alphabet Collection
- Standards: base64, base32, base16, base58 (Bitcoin), base85 (Git)
- Ancient Scripts: Egyptian hieroglyphics, Sumerian cuneiform, Elder Futhark runes
- Game Pieces: Playing cards, mahjong tiles, domino tiles, chess pieces
- Esoteric: Alchemical symbols, zodiac signs, weather symbols, musical notation
- Emoji: Face emoji, animal emoji, base100 (256 emoji range)
- Custom: Define your own alphabets in TOML
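The base100 entry relies on the byte-range idea: each byte is added to a fixed starting code point, so 256 consecutive code points cover every byte value. The sketch below is only an illustration. `RANGE_START`, `encode_byte_range`, and `decode_byte_range` are hypothetical names, and the starting code point U+1F3F7 is an assumption based on the commonly used base100 scheme rather than something this README states.

```rust
/// Assumed start of the 256-code-point range (hypothetical; not taken from base-d).
const RANGE_START: u32 = 0x1F3F7;

/// Map every byte to the code point `RANGE_START + byte`.
fn encode_byte_range(data: &[u8]) -> String {
    data.iter()
        .map(|&b| {
            char::from_u32(RANGE_START + b as u32)
                .expect("the assumed range contains only valid scalar values")
        })
        .collect()
}

/// Reverse the mapping; returns `None` if a character lies outside the range.
fn decode_byte_range(text: &str) -> Option<Vec<u8>> {
    text.chars()
        .map(|c| {
            let offset = (c as u32).checked_sub(RANGE_START)?;
            if offset < 256 {
                Some(offset as u8)
            } else {
                None
            }
        })
        .collect()
}
```

The output always has exactly one character per input byte, although each of those characters takes four bytes in UTF-8.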
Advanced Capabilities
- Streaming Mode - Process multi-GB files with constant 4KB memory usage
- User Configuration - Load custom alphabets from ~/.config/base-d/alphabets.toml
- Project-Local Config - Override alphabets per-project with ./alphabets.toml
- Three Independent Algorithms - Choose the right mode for your use case
Quick Start
# Install (once published)
# Or build from source
# List all 33 available alphabets
# Encode with playing cards (default)
# RFC 4648 base32
# Bitcoin base58
# Egyptian hieroglyphics
# Emoji faces
# Process files
Installation
Usage
As a Library
Add to your Cargo.toml:
[dependencies]
base-d = "0.1"
Basic Encoding/Decoding
use ;
Streaming for Large Files
use ;
use std::fs::File;
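The general shape of constant-memory streaming can be sketched with the standard library alone. The example below is not base-d's streaming API: `stream_hex` and `CHUNK_SIZE` are hypothetical names, and hex is used only because each byte encodes independently, so chunk boundaries never split a symbol. Mathematical mode treats the whole input as one number, so it would not fit this pattern as written.

```rust
use std::io::{self, Read, Write};

/// 4 KB read buffer, chosen to match the figure quoted in this README.
const CHUNK_SIZE: usize = 4096;

/// Hex-encode `input` into `output` one chunk at a time.
/// Returns the number of input bytes processed. Hypothetical helper.
fn stream_hex<R: Read, W: Write>(mut input: R, mut output: W) -> io::Result<u64> {
    const HEX: &[u8; 16] = b"0123456789abcdef";
    let mut buf = [0u8; CHUNK_SIZE];
    let mut total = 0u64;

    loop {
        let n = input.read(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        // Encode only the bytes that were actually read.
        let mut encoded = Vec::with_capacity(n * 2);
        for &b in &buf[..n] {
            encoded.push(HEX[(b >> 4) as usize]);
            encoded.push(HEX[(b & 0x0f) as usize]);
        }
        output.write_all(&encoded)?;
        total += n as u64;
    }

    output.flush()?;
    Ok(total)
}
```

Because the function is generic over `Read` and `Write`, it can be called with a `File` as input and standard output as the sink, or with in-memory slices in tests.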
Custom Alphabets
use ;
Loading User Configurations
use AlphabetsConfig;
// Load with user overrides from:
// 1. Built-in alphabets
// 2. ~/.config/base-d/alphabets.toml
// 3. ./alphabets.toml
let config = load_with_overrides?;
// Or load from specific file
let config = load_from_file?;
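The three-layer lookup order listed above can be pictured as a simple map merge in which later layers win. This is a sketch under assumptions: `Alphabets` and `merge_layers` are hypothetical names, and replacing a whole entry by name is assumed behavior, not something confirmed for `AlphabetsConfig`.

```rust
use std::collections::HashMap;

/// Alphabet name mapped to its characters (hypothetical, simplified representation).
type Alphabets = HashMap<String, String>;

/// Merge layers in order; an entry in a later layer replaces an
/// earlier entry with the same name.
fn merge_layers(layers: &[Alphabets]) -> Alphabets {
    let mut merged = Alphabets::new();
    for layer in layers {
        for (name, chars) in layer {
            merged.insert(name.clone(), chars.clone());
        }
    }
    merged
}
```

Passing the layers as `[built_in, user, project_local]` reproduces the order in the comments: a project-local definition shadows a user-level one, which in turn shadows the built-in alphabet of the same name.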
As a CLI Tool
Encode and decode data using any alphabet defined in alphabets.toml:
# List available alphabets
# Encode from stdin (default alphabet is "cards")
# Encode a file
# Encode with specific alphabet
# Decode
# Round-trip encoding
# Stream mode for large files (memory efficient)
Custom Alphabets
Add your own alphabets to alphabets.toml:
[]
# Your custom 16-character alphabet
= "😀😁😂🤣😃😄😅😆😉😊😋😎😍😘🥰😗"
# Chess pieces (12 characters)
= "♔♕♖♗♘♙♚♛♜♝♞♟"
Or create custom alphabets in ~/.config/base-d/alphabets.toml to use across all projects. See Custom Alphabets Guide for details.
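A custom alphabet only decodes unambiguously if every character in it is distinct. The check below is a minimal sketch of that idea. `validate_alphabet` is a hypothetical helper, and whether base-d performs the same validation when loading `alphabets.toml` is an assumption.

```rust
use std::collections::HashSet;

/// Check that an alphabet has at least two symbols and no duplicates.
/// Returns the number of symbols on success. Hypothetical helper.
fn validate_alphabet(chars: &str) -> Result<usize, String> {
    let mut seen = HashSet::new();
    for c in chars.chars() {
        // `insert` returns false when the character was already present.
        if !seen.insert(c) {
            return Err(format!("duplicate character {:?}", c));
        }
    }
    if seen.len() < 2 {
        return Err("an alphabet needs at least two characters".to_string());
    }
    Ok(seen.len())
}
```

Note that this counts Unicode scalar values, so an emoji built from several code points (for example, one followed by a variation selector) would count as more than one symbol.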
Built-in Alphabets
base-d includes 33 pre-configured alphabets organized into several categories:
- RFC 4648 Standards: base16, base32, base32hex, base64, base64url
- Bitcoin & Blockchain: base58, base58flickr
- High-Density Encodings: base62, base85, ascii85, z85
- Human-Oriented: base32_crockford, base32_zbase
- Ancient Scripts: hieroglyphs, cuneiform, runic
- Game Pieces: cards, domino, mahjong, chess
- Esoteric Symbols: alchemy, zodiac, weather, music, arrows
- Emoji: emoji_faces, emoji_animals, base100
- Other: dna, binary, hex, base64_math, hex_math
Run base-d --list to see all available alphabets with their encoding modes.
For a complete reference with examples and use cases, see ALPHABETS.md.
How It Works
base-d supports three encoding algorithms:
- Mathematical Base Conversion (default) - Treats binary data as a single large number and converts it to the target base. Works with any alphabet size.
- Bit-Chunking - Groups bits into fixed-size chunks for RFC 4648 compatibility (base64, base32, base16).
- Byte Range - Direct 1:1 byte-to-character mapping using a Unicode range (like base100). Each byte maps to a specific emoji with zero encoding overhead.
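The mathematical mode can be sketched as repeated long division of the input, read as a big-endian base-256 number, by the alphabet size. `encode_math` is a hypothetical helper for illustration, not base-d's implementation. Emitting leading zero bytes as the first alphabet symbol is an assumption borrowed from the usual base58 convention.

```rust
/// Encode `data` as one large big-endian integer written in base `alphabet.len()`.
/// Hypothetical helper: leading zero bytes become the first alphabet symbol.
fn encode_math(data: &[u8], alphabet: &[char]) -> String {
    let base = alphabet.len() as u32;
    assert!(base >= 2, "alphabet needs at least two symbols");

    let leading_zeros = data.iter().take_while(|&&b| b == 0).count();
    let mut num: Vec<u8> = data[leading_zeros..].to_vec(); // base-256 digits
    let mut out: Vec<char> = Vec::new(); // output digits, least significant first

    // Each pass divides `num` by `base`; the remainder is the next output digit.
    while !num.is_empty() {
        let mut rem: u32 = 0;
        let mut quotient: Vec<u8> = Vec::with_capacity(num.len());
        for &byte in &num {
            let acc = rem * 256 + byte as u32;
            let q = acc / base; // always below 256 because rem < base
            rem = acc % base;
            // Skip leading zeros of the quotient so the loop terminates.
            if !quotient.is_empty() || q != 0 {
                quotient.push(q as u8);
            }
        }
        out.push(alphabet[rem as usize]);
        num = quotient;
    }

    for _ in 0..leading_zeros {
        out.push(alphabet[0]);
    }
    out.iter().rev().collect()
}
```

For example, the two bytes `[1, 0]` represent the number 256, so encoding them with the decimal alphabet `0123456789` gives `256`. The whole input must be in memory, since every output digit depends on all input bytes.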
For a detailed explanation of all modes with examples, see ENCODING_MODES.md.
License
MIT OR Apache-2.0
Documentation
- Alphabet Reference - Complete guide to all 33 built-in alphabets
- Custom Alphabets - Create and load your own alphabets
- Encoding Modes - Detailed explanation of mathematical vs chunked vs byte range encoding
- Streaming - Memory-efficient processing for large files
- Hexadecimal Explained - Special case where the mathematical and chunked modes produce identical output
- Roadmap - Planned features and development phases
- CI/CD Setup - GitHub Actions workflow documentation
Contributing
Contributions are welcome! Please see ROADMAP.md for planned features.