Oboron
Oboron is a general-purpose symmetric encryption library focused on developer ergonomics:
- String in, string out: Encryption and encoding are bundled into one seamless process
- Standardized interface: Multiple encryption algorithms accessible through the same API
- Unified key management: A single 512-bit key works across all schemes with internal extraction to algorithm-specific keys
- Prefix-focused entropy: Maximizes entropy in initial characters for referenceable short prefixes (similar to Git commit hashes)
In essence, Oboron provides an accessible interface over established
cryptographic primitives—implementing AES-CBC, AES-GCM-SIV, and AES-SIV—
with a focus on developer ergonomics and output characteristics. Each
scheme follows a consistent naming pattern that encodes its security
properties, making it easier to choose the right tool without deep
cryptographic expertise: e.g., aasv = Authenticated + Avalanche
property + SiV algorithm (AES-SIV).
Key Advantages:
- Referenceable prefixes: High initial entropy enables Git-like short IDs
- Simplified workflow:
  - No manual encoding/decoding between encryption stages
  - No decoding encryption keys from env vars to bytes
- Performance optimized
Contents
- Quick Start
- Formats
- Algorithm
- Key Management
- Properties
- Python API Overview
- Applications
- Compatibility
- Getting Help
- License
- Appendix: Obtext Lengths
Quick Start
Installation
pip install oboron
Generate your 512-bit key (86 base64 characters) using the keygen script:
python -m oboron.keygen
or in your code:
key = oboron.generate_key()
then save the key as an environment variable.
Use AasvC32 (a secure scheme: 256-bit AES-SIV encryption, encoded using Crockford's base32 variant) for enc/dec:
= # get the key
= # instantiate codec (cipher+encoder)
= # get obtext (encrypted+encoded)
= # get plaintext back (decode+decrypt obtext)
# "obtext: cbv74r1m7a7cf8n6gzdy6tf2vjddkhwdtwa5ssgv78v5c1g"
assert ==
Version 1.0: This release marks API stability. Oboron follows semantic versioning, so 1.x releases will maintain backward compatibility.
Formats
An Oboron format represents the full transformation of the plaintext to the encrypted text (obtext), including:
- Encryption: Plaintext UTF-8 string encrypted to ciphertext bytes using a cryptographic algorithm
- Encoding: The binary payload is encoded to a string representation
Scheme + Encoding = Format
Formats combine a scheme (cryptographic algorithm) with an encoding (string representation):
- Scheme: Cryptographic algorithm + mode + parameters (e.g., aasv)
- Encoding: String representation method (e.g., .b64)
- Format: Scheme + encoding = complete transformation (e.g., aasv.b64)
Given an encryption key, the format thus uniquely specifies the complete transformation from a plaintext string to an encoded obtext string.
Formats are represented by identifiers:
- ob:{scheme}.{encoding} (URI-like syntax, e.g., ob:aasv.c32)
- {scheme}.{encoding}, when the context is clear
API Notes:
- The ob: namespace prefix is not used in the oboron API. Formats like aasv.c32 are used directly.
- The public interface uses enc/dec names for methods and functions. Thus the enc operation comprises the full process, including the encryption and encoding stages.
Encodings
- b32 - standard base32: Balanced compactness and readability, uppercase alphanumeric (RFC 4648 Section 6)
- c32 - Crockford base32: Balanced compactness and readability, lowercase alphanumeric; designed to avoid accidental obscenity
- b64 - standard URL-safe base64: Most compact, case-sensitive, includes - and _ characters (RFC 4648 Section 5)
- hex - hexadecimal: Slightly faster performance (~2-3%), longest output
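To make the trade-offs between the four encodings concrete, the sketch below encodes the same payload bytes with each of them using only the Python standard library. It is an illustration, not Oboron's encoder: the function name encode_payload is hypothetical, and stripping the "=" padding is an assumption (it is consistent with the lengths listed in the Appendix).

```python
import base64

# RFC 4648 base32 alphabet and Crockford's alphabet (lowercase, no i, l, o, u).
RFC_B32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
CROCKFORD_B32 = "0123456789abcdefghjkmnpqrstvwxyz"
_RFC_TO_CROCKFORD = str.maketrans(RFC_B32, CROCKFORD_B32)


def encode_payload(payload: bytes, encoding: str) -> str:
    """Encode payload bytes as b32, c32, b64 or hex (hypothetical helper)."""
    if encoding == "hex":
        return payload.hex()
    if encoding == "b64":
        # URL-safe alphabet, padding stripped (assumption).
        return base64.urlsafe_b64encode(payload).decode("ascii").rstrip("=")
    b32 = base64.b32encode(payload).decode("ascii").rstrip("=")
    if encoding == "b32":
        return b32
    if encoding == "c32":
        # Same 5-bit groups, different symbol for each value.
        return b32.translate(_RFC_TO_CROCKFORD)
    raise ValueError(f"unknown encoding: {encoding}")


payload = bytes(range(22))  # 22 bytes, e.g. a 4-byte plaintext under aasv
for name in ("b32", "c32", "b64", "hex"):
    text = encode_payload(payload, name)
    print(f"{name}: {len(text)} chars  {text}")
```

For a 22-byte payload this yields 36 characters for both base32 variants, 30 for base64, and 44 for hex.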
FAQ: Why use Crockford's base32 instead of the RFC standard one?
Crockford's base32 alphabet minimizes the probability of accidentally forming obscene words, which matters when working with short prefixes. Accidental obscenity is not an issue with full encrypted outputs (any such word would be buried as a substring of a 28+ character obtext), but it may become a concern when short prefixes are used as references or quasi-hash identifiers.
Schemes
Schemes define the encryption algorithm and its properties, classified into tiers:
Scheme Tiers
- a - Authenticated
  - Provide both confidentiality and integrity protection
  - Examples: ob:aasv, ob:aags, ob:apsv, ob:apgs
  - Always prefer a-tier schemes for security-critical applications
- u - Unauthenticated
  - Provide confidentiality only (no integrity protection)
  - Example: ob:upbc
  - Suitable when integrity is verified externally or not required
  - Warning: Vulnerable to ciphertext tampering
- z - Obfuscation tier
  - Not cryptographically secure - for non-security use only
  - Example: ob:zrbcx - deterministic obfuscation with constant IV
  - Requires explicit ztier feature flag (not enabled by default)
  - See Z_TIER.md for details and warnings
Scheme Properties
The second letter of the scheme ID further describes the properties of the scheme:
- .a.. - avalanche, deterministic
  - deterministic => same plaintext always produces same obtext
  - avalanche => entropy uniformly distributed; change in any byte of plaintext completely changes the entire obtext (hash-like property)
  - Examples: ob:aasv, ob:aags
- .p.. - probabilistic
  - Different output each time
  - Examples: ob:apsv, ob:apgs, ob:upbc
Scheme Cryptographic Algorithms
The remaining two letters in scheme IDs indicate the algorithm:
- gs = AES-GCM-SIV
- sv = AES-SIV
- bc = AES-CBC
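The naming convention can be read mechanically. The following sketch decodes a four-letter a-tier or u-tier scheme ID (optionally with the ob: prefix and an encoding suffix) into the properties described above. The helper describe_format is hypothetical and not part of the oboron package; z-tier IDs such as zrbcx are deliberately left out because their letters are not covered in this section.

```python
# Lookup tables taken from the tier, property and algorithm lists above.
TIERS = {"a": "authenticated", "u": "unauthenticated"}
PROPERTIES = {"a": "avalanche, deterministic", "p": "probabilistic"}
ALGORITHMS = {"gs": "AES-GCM-SIV", "sv": "AES-SIV", "bc": "AES-CBC"}


def describe_format(fmt: str) -> dict:
    """Decode e.g. 'ob:aasv.c32' or 'upbc' (hypothetical helper)."""
    if fmt.startswith("ob:"):
        fmt = fmt[3:]
    scheme, _, encoding = fmt.partition(".")
    if len(scheme) != 4:
        raise ValueError(f"expected a 4-letter scheme ID, got {scheme!r}")
    tier, prop, algo = scheme[0], scheme[1], scheme[2:]
    if tier not in TIERS or prop not in PROPERTIES or algo not in ALGORITHMS:
        raise ValueError(f"unrecognized scheme ID: {scheme!r}")
    return {
        "scheme": scheme,
        "tier": TIERS[tier],
        "property": PROPERTIES[prop],
        "algorithm": ALGORITHMS[algo],
        "encoding": encoding or None,
    }


print(describe_format("ob:aasv.c32"))
print(describe_format("upbc"))
```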
Summary Table
| Scheme | Algorithm | Deterministic? | Authenticated? | Notes |
|---|---|---|---|---|
| ob:aasv | AES-SIV | Yes | Yes | General purpose, deterministic |
| ob:aags | AES-GCM-SIV | Yes | Yes | Deterministic alternative |
| ob:apsv | AES-SIV | No | Yes | Maximum privacy protection |
| ob:apgs | AES-GCM-SIV | No | Yes | Probabilistic alternative |
| ob:upbc | AES-CBC | No | No | Unauthenticated - use with caution |
Key Concepts:
- Deterministic: Same input (key + plaintext) always produces same output. Useful for idempotent operations, lookup keys, caching, or hash-like references.
- Probabilistic: Incorporates a random nonce, producing different ciphertexts for identical plaintexts. Standard for most cryptographic use cases (non-cached, not used as hidden references).
- Authenticated: Ciphertext is tamper-proof. Any modification (even a single bit flipped) results in decryption failure.
Choosing a Scheme
- ob:aasv: General-purpose secure encryption with deterministic output and compact size
- ob:apsv: Maximum privacy with probabilistic output (larger size due to nonce)
- ob:upbc: Only when integrity is handled externally
Note on encryption strength: All a-tier and u-tier schemes use 256-bit AES encryption. The z-tier uses 128-bit AES for performance in non-security contexts.
Algorithm
Oboron combines encryption and encoding in a single operation, requiring specific terminology:
- enc: Combines encryption and encoding stages
- dec: Combines decoding and decryption stages
- obtext: The output of the enc operation (encryption + encoding), distinct from cryptographic ciphertext
The cryptographic ciphertext (bytes, not string) is an internal implementation detail, not exposed in the public API.
The high-level process flow is:
enc operation:
[plaintext] (string) -> encryption -> [ciphertext] (bytes) -> encoding -> [obtext] (string)
dec operation:
[obtext] (string) -> decoding -> [ciphertext] (bytes) -> decryption -> [plaintext] (string)
The above diagram is conceptual; actual implementation includes
scheme-specific steps like scheme byte appending and (for z-tier
schemes only) optional ciphertext prefix restructuring. With this
middle-step included, the diagram becomes:
enc operation:
[plaintext] -> encryption -> [ciphertext] -> oboron pack -> [payload] -> encoding -> [obtext]
dec operation:
[obtext] -> decoding -> [payload] -> oboron unpack -> [ciphertext] -> decryption -> [plaintext]
In a-tier and u-tier schemes, the difference between the payload and
the ciphertext is in the 2-byte scheme marker that is appended to the
ciphertext, enabling scheme autodetection in decoding.
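The three-stage flow can be sketched in a few lines of Python. This is only a structural illustration under loud assumptions: the "cipher" is an identity placeholder (no security at all), the 2-byte marker value is made up, and hex is used as the encoding. None of these names or values come from the oboron package.

```python
# Hypothetical 2-byte scheme marker; the real marker values are not given here.
SCHEME_MARKER = b"\xab\xcd"


def pack(ciphertext: bytes) -> bytes:
    """oboron pack (sketch): append the scheme marker to the ciphertext."""
    return ciphertext + SCHEME_MARKER


def unpack(payload: bytes) -> bytes:
    """oboron unpack (sketch): check and strip the trailing scheme marker."""
    if not payload.endswith(SCHEME_MARKER):
        raise ValueError("unknown scheme marker")
    return payload[: -len(SCHEME_MARKER)]


def enc(plaintext: str) -> str:
    ciphertext = plaintext.encode("utf-8")  # placeholder for encryption (identity)
    payload = pack(ciphertext)
    return payload.hex()                    # encoding stage


def dec(obtext: str) -> str:
    payload = bytes.fromhex(obtext)         # decoding stage
    ciphertext = unpack(payload)
    return ciphertext.decode("utf-8")       # placeholder for decryption (identity)


obtext = enc("abcd")
print(obtext, len(obtext))
print(dec(obtext))
```

Because the marker travels inside the payload, the dec side can dispatch on it before attempting decryption, which is the idea behind scheme autodetection.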
Padding Design
Oboron's CBC schemes use a custom padding scheme optimized for UTF-8 strings:
- Uses the 0x01 byte for padding (the U+0001 control character, which does not occur in ordinary text)
- No padding needed when plaintext ends at block boundary
- 5% performance improvement over PKCS#7
- Smaller output size compared to PKCS#7
Rationale: Oboron exclusively processes UTF-8 strings, not arbitrary
binary data. The 0x01 byte is itself valid UTF-8 (it encodes the control
character U+0001), but it does not occur in ordinary text, so trailing
padding can be removed unambiguously for any plaintext that does not
itself end in U+0001. Under that input constraint, this padding is
functionally equivalent to PKCS#7 and does not weaken security. In the
Rust core, the UTF-8 constraint is guaranteed by the type system: all
enc functions and methods accept a &str, so input that is not valid
UTF-8 is rejected at compile time rather than surfacing as a runtime error.
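A minimal sketch of this padding rule, assuming a 16-byte AES block and a plaintext that does not itself end in the U+0001 character (a trailing 0x01 in the input would be indistinguishable from padding). The function names are hypothetical, and the handling of empty input in the real library is not specified here.

```python
BLOCK_SIZE = 16      # AES block size in bytes
PAD_BYTE = b"\x01"   # padding byte described above


def pad(data: bytes) -> bytes:
    """Pad with 0x01 up to the next block boundary; add nothing if aligned."""
    remainder = len(data) % BLOCK_SIZE
    if remainder == 0:
        return data
    return data + PAD_BYTE * (BLOCK_SIZE - remainder)


def unpad(data: bytes) -> bytes:
    """Strip trailing 0x01 bytes (assumes the plaintext has none of its own)."""
    return data.rstrip(PAD_BYTE)


text = "héllo".encode("utf-8")      # 6 bytes of UTF-8
padded = pad(text)
print(len(padded))                  # one full block
print(unpad(padded).decode("utf-8"))
print(len(pad(b"a" * 16)))          # already aligned: no extra block
```

By contrast, PKCS#7 always adds at least one byte, so a 16-byte input grows to 32 bytes; that is where the size saving at block boundaries comes from.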
Key Management
Single Master Key Model
Oboron uses a single 512-bit master key partitioned into algorithm-specific subkeys:
- ob:aags, ob:apgs: use the first 32 bytes (256 bits) for the AES-GCM-SIV key
- ob:aasv, ob:apsv: use the full 64 bytes (512 bits) for the AES-SIV key
- ob:upbc: uses the last 32 bytes (256 bits) for the AES-CBC key
Design Rationale: This approach prioritizes low latency for short-string encryption. No hash-based KDF (e.g., HKDF) is used, as this would dominate runtime for intended workloads.
The master key never leaves your application. Algorithm-specific keys are extracted on-the-fly and never cached or stored.
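The partitioning described above amounts to plain byte slicing. The sketch below shows it with the standard library only; the helper name split_master_key is hypothetical, and decoding the key as unpadded URL-safe base64 is an assumption about the key format.

```python
import base64


def split_master_key(master_key_b64: str) -> dict:
    """Slice a 512-bit master key into per-algorithm keys (hypothetical helper)."""
    # Restore the '=' padding that an 86-character key omits.
    padded = master_key_b64 + "=" * (-len(master_key_b64) % 4)
    master = base64.urlsafe_b64decode(padded)
    if len(master) != 64:
        raise ValueError("master key must decode to exactly 64 bytes")
    return {
        "AES-GCM-SIV": master[:32],  # first 32 bytes (aags, apgs)
        "AES-SIV": master,           # full 64 bytes (aasv, apsv)
        "AES-CBC": master[32:],      # last 32 bytes (upbc)
    }


# Demo with a fixed, obviously non-secret key.
demo_key = base64.urlsafe_b64encode(bytes(range(64))).decode("ascii").rstrip("=")
for name, key in split_master_key(demo_key).items():
    print(name, len(key) * 8, "bits")
```

Slicing costs essentially nothing at runtime, which fits the stated goal of avoiding a hash-based KDF on the hot path.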
FAQ: Why use a single key across all schemes?
- Simplifies deployment: Store one key instead of multiple
- Reduces errors: No risk of mismatching keys to algorithms
Key Format
The default key input format is base64. This is consistent with Oboron's strings-first API design. As any production use will typically read the key from an environment variable, this allows the string format to be directly fed into the constructor.
The base64 format was chosen for its compactness, as an 86-character base64 key is easier to handle manually (in secrets or environment variables management UI) than a 128-character hex key.
While any 512-bit key is accepted by Oboron, the keys generated with
oboron.generate_key() or python -m oboron.keygen do not include any
dashes or underscores, so that the keys are double-click selectable
and underscores cannot be misread or overlooked.
Valid Base64 Keys
Important technical detail: Not every 86-character base64 string is a valid 512-bit key. Since 512 bits occupy 85.33 base64 characters (6 bits each), the 86th character carries only 2 bits of key data and its remaining 4 bits must be zero, which limits the final character to 4 of the 64 possible values. When generating keys, it is recommended to use one of the following methods:
- use Oboron's key generator (oboron.generate_key() or python -m oboron.keygen)
- generate random 64 bytes, then encode as base64
- generate random 128 hex characters, then convert hexadecimal to base64
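As an illustration of the second method, the sketch below draws 64 random bytes and encodes them as unpadded URL-safe base64. It also retries until the result contains no dashes or underscores; that retry loop is an assumption about how such keys could be produced, not a description of Oboron's generator, and the name generate_plain_key is hypothetical.

```python
import base64
import secrets


def generate_plain_key() -> str:
    """Return an 86-character base64 key without '-' or '_' (sketch)."""
    while True:
        raw = secrets.token_bytes(64)  # 512 bits from the OS CSPRNG
        key = base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
        if "-" not in key and "_" not in key:
            return key


key = generate_plain_key()
print(len(key))   # 86
print(key[-1])    # always one of A, Q, g, w
```

The last character is always A, Q, g or w because it encodes only the final 2 bits of the key followed by four zero bits.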
Properties
Referenceable Prefixes
If you've used Git, you're already familiar with prefix entropy: you can
reference commits with just the first 7 characters of their SHA1 hash
(like git show a1b2c3d). This works because cryptographic hashes
distribute entropy evenly across all characters.
Oboron schemes exhibit similar prefix quality. Consider these comparisons:
Short Reference Strength:
- Git SHA1 (7 hex chars): 28 bits of entropy
- Oboron (6 base32 chars): 30 bits of entropy
- Oboron (7 base32 chars): 35 bits of entropy
Collision Resistance: For a 1-in-a-million chance that any two items share the same prefix (birthday approximation, n ≈ sqrt(2 · 2^bits · p)):
- Git 7-char prefix (28 bits): After ~23 items
- Oboron 6-char prefix (30 bits): After ~46 items
- Oboron 7-char prefix (35 bits): After ~262 items
(These estimates assume uniform ciphertext distribution under a fixed key.)
Practical Implications: Using 7-character Oboron prefixes (35 bits):
- In a system with 1,000 unique items: collision probability ~0.0015% (about 1 in 69,000)
- In a system with 10,000 items: ~0.15% (about 1 in 690)
This enables Git-like workflows for moderate-scale systems: database IDs, URL slugs, or commit references that are both human-friendly and cryptographically robust for everyday use cases.
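Figures of this kind can be checked with the standard birthday approximation. The sketch below assumes prefixes are uniformly distributed over 2^bits values; the function names are illustrative only.

```python
import math


def items_for_probability(bits: int, p: float) -> float:
    """Approximate item count at which P(any collision) reaches p."""
    return math.sqrt(2 * (2 ** bits) * p)


def collision_probability(bits: int, n: int) -> float:
    """Birthday approximation: 1 - exp(-n(n-1) / (2 * 2**bits))."""
    return -math.expm1(-n * (n - 1) / (2 * 2 ** bits))


for label, bits in (("7 hex chars", 28), ("6 base32 chars", 30), ("7 base32 chars", 35)):
    n = items_for_probability(bits, 1e-6)
    print(f"{label} ({bits} bits): ~{n:.0f} items for a 1-in-a-million collision")

for n in (1_000, 10_000):
    p = collision_probability(35, n)
    print(f"{n} items, 35-bit prefix: {p:.4%} (about 1 in {1 / p:,.0f})")
```

Each extra base32 character adds 5 bits, which multiplies the tolerable item count by about 5.7 (the square root of 32) at a fixed collision probability.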
Deterministic Injectivity
The previous section compared prefix collision resistance between Oboron and standard hashing algorithms. For the full output, however, the two are not on the same footing: while SHA1 and SHA256 collision probabilities are astronomically small, they are never zero, and the birthday paradox can become a factor in large systems even with the full hash. Oboron, on the other hand, is a symmetric encryption library, and as such it is collision-free (although applying this label to an encryption library is awkward): for a fixed key and within the block-cipher domain limits, Oboron is injective (one-to-one), i.e., two different inputs can never result in the same output.
Performance Comparison
(All performance benchmarks are from the Rust library benchmarks, without the Python bindings overhead.)
Oboron is optimized for performance with short strings, often exceeding both SHA256 and JWT performance while providing reversible encryption.
Note: As a general-purpose encryption library, Oboron is not a replacement for either JWT or SHA256. We use those two for baseline comparison, as they are both standard and highly optimized libraries. However, as we show in the Applications section below, overlaps in applications with JWT and SHA256 are possible.
| Scheme | 8B Encode | 8B Decode | Security | Use Case |
|---|---|---|---|---|
| ob:zrbcx | 132 ns | 126 ns | Insecure | Maximum speed + compactness |
| ob:aasv | 334 ns | 364 ns | Secure + Auth | Balanced performance + security |
| JWT | 550 ns | 846 ns | Auth only* | Signature without encryption |
| SHA256 | 191 ns | N/A | One-way | Hashing only |
* Note: JWT baseline (HMAC-SHA256) provides authentication without
encryption. Despite comparing against our stronger a-tier (secure +
authenticated), Oboron maintains performance advantages while providing full confidentiality.
More detailed benchmark results are presented in a separate document: BENCHMARKS.md. Data from the JWT and SHA256 benchmarks performed on the same machine is available in BASELINE_BENCHMARKS.md.
Performance advantages:
- ob:zrbcx encoding is 4.1x faster than JWT with 4.5x smaller output
- All Oboron schemes outperform JWT for both encoding and decoding
- ob:zrbcx shows lower latency than SHA256+hex for short strings while providing reversible (cryptographically insecure) encryption
Output Length Comparison
| Method | Small string output length |
|---|---|
| ob:aasv | 31-48 characters |
| ob:apsv | 56-74 characters |
| ob:zrbcx | 29 characters |
| SHA256 | 64 characters |
| JWT | 150+ characters |
A more complete output length comparison is given in the Appendix.
Scheme Selection Guidelines
- ob:aasv: General-purpose secure encryption with deterministic output and compact size
- ob:apsv: Maximum privacy protection with probabilistic output (larger size due to nonce)
- ob:zrbcx: Non-security-critical applications prioritizing speed and compactness
Choose ob:aasv when:
- Cryptographic security with compact output is needed (~34-47 chars)
- Deterministic behavior is beneficial (lookup keys, caching)
Choose ob:apsv when:
- Cryptographic security with maximum privacy is required (~60-72 chars)
- Hiding plaintext relationships is critical
Choose ob:zrbcx when:
- Performance and compactness are primary requirements (~28 chars)
- Security requirements are minimal (obfuscation contexts)
Python API Overview
Oboron provides multiple API styles supporting different use cases. For most production applications, compile-time format selection (option 1 below) offers the best combination of performance, type safety, and clarity.
1. Fixed Format Selection (Recommended for Production)
When your encryption format is fixed, instantiate the specific scheme class
(like AasvC32) directly for optimal performance and type safety:
=
=
=
assert ==
Available types include all combinations of scheme variants (e.g.,
Zrbcx, Upbc, Aags, Apgs, Aasv, Apsv) with encoding
specifications (B64, Hex, B32, or C32); class names concatenate
the two, for example:
- ZrbcxB32 - encoder for zrbcx.b32 format
- UpbcHex - encoder for upbc.hex format
- AagsB64 - encoder for aags.b64 format
- AasvC32 - encoder for aasv.c32 format
2. Runtime Format Selection (Ob)
When format specification at runtime is required, use Ob:
=
= # aasv.b64 format obtext
=
assert ==
# switch format to aasv.c32
# now aasv.c32-encoded obtext
# switch format to aags.c32
# now aags.c32-encoded obtext
# now upbc.b64-encoded obtext
Example use: format provided by environment variable.
3. Multiple Format Support (Omnib)
Omnib differs in format management and provides comprehensive
autodec() functionality.
Multi-Format Workflow: Designed for simultaneous work with different formats, requiring format specification in each operation:
=
# Format specification per operation
=
=
=
Autodecode: While other interfaces perform scheme autodetection in
dec() methods, only Omnib provides full format autodetection
including encoding (base32rfc, base32crockford, base64, or hex). Other
classes decode only encodings matching their format.
# Autodecode when format is unknown
=
Note the performance implications: autodetection uses trial-and-error across
encodings, with worst-case performance ~3x slower than known-format
dec operations. (However, the heuristic encoding detection makes the average
performance much closer to that of normal dec() operations than to the worst case.)
Meanwhile, scheme autodetection in other interfaces (e.g., Ob.dec(),
AasvB64.dec()) has zero overhead, as the scheme is detected based
on the scheme byte in the payload, and the logic follows a direct path
with no retries.
Using Format Constants
For type safety and discoverability, use the provided format constants instead of string literals:
# With Ob (runtime format selection)
=
# With Omnib (multi-format operations)
=
=
=
Available constants:
- ZRBCX_C32, ZRBCX_B32, ZRBCX_B64, ZRBCX_HEX
- UPBC_C32, UPBC_B32, UPBC_B64, UPBC_HEX
- AAGS_C32, AAGS_B32, AAGS_B64, AAGS_HEX
- APGS_C32, APGS_B32, APGS_B64, APGS_HEX
- AASV_C32, AASV_B32, AASV_B64, AASV_HEX
- APSV_C32, APSV_B32, APSV_B64, APSV_HEX
- Testing: MOCK1_*, MOCK2_*
- Legacy: LEGACY_*
Typical Production Use
For compile-time known schemes and encodings, however, static types provide optimal performance, concise syntax, and strongest type guarantees:
=
=
The format is built into the class; no format strings or constants are needed.
OboronBase class
All types except Omnib derive from the OboronBase class, providing a
consistent interface:
Methods:
- enc(plaintext: str) -> str - Encrypt plaintext to obtext
- dec(obtext: str) -> str - Decrypt obtext to plaintext
Properties:
- key -> str - Base64 key access
- key_bytes -> bytes - Raw key bytes access
- format -> str - Current format (scheme+encoding)
- scheme -> str - Current scheme
- encoding -> str - Current encoding
Working with Keys
= # base64 key
Warning: new_keyless() uses the publicly available hardcoded key
providing no security. Use only for testing or obfuscation contexts where
encryption is not required.
= # hardcoded key
Common Issues
- Key errors: Ensure keys are exactly 86 base64 characters, properly encoded from 512 bits (see note about valid base64 keys)
- Format strings: Must match exactly, e.g., "aasv.b64" not "aasv-b64"
- Decoding errors: Use autodec() when format is unknown
Applications
While Oboron serves as a general-purpose encryption library with its "string in, string out" API, its combination of properties—particularly prefix entropy and compactness—enables specialized applications:
- Git-like short IDs - High-entropy prefixes for unique references
- URL-friendly state tokens - Encrypt web application state into compact URLs
- No-lookup captcha systems - Server issues encrypted challenge, verifies without database lookup
- Database ID obfuscation - Hide sequential IDs while maintaining reversibility
- Compact authentication tokens - Efficient alternative to JWT for simple use cases where JWT may be overkill
- General-purpose symmetric encryption - Straightforward string-based API
Comparison with Alternatives
| Use Case | Traditional Solution | Oboron Approach |
|---|---|---|
| Short unique IDs | UUIDv4 (36 chars) | ob:zrbcx.c32 (28 chars, reversible) |
| URL parameters | JWT (150+ chars) | ob:aasv.b64 (4.5x smaller, 4x faster) |
| Database ID masking | Hashids (not secure) | Proper encryption |
API Simplification
Oboron simplifies symmetric encryption compared to lower-level cryptographic libraries:
Before (libsodium/ring - complex, byte-oriented):
# --- KEY ---
# Manual key and nonce management
=
=
# --- ENCRYPT+ENCODE ---
# Manual conversion of UTF-8 string to bytes
=
=
# Create a box
=
# Encrypt
=
# Manually encode for print/transport
=
# --- DECODE+DECRYPT ---
# Decode from base64
=
# Decrypt (returns bytes)
=
# Manual UTF-8 decoding required
=
After (Oboron - simplified, string-oriented):
# --- KEY ---
# Generate key in base64 (ready for storing as environment variable)
=
=
# --- ENCRYPT+ENCODE ---
# Direct string in, string out
=
=
# --- DECODE+DECRYPT ---
=
Benefits:
- No manual hex/base64 encoding/decoding
- Keys as base64 strings (no byte array management)
- Built-in nonce generation where applicable
- Consistent error handling
- Single dependency vs multiple packages
When Oboron is appropriate:
- General symmetric encryption requirements
- Need for compact, referenceable outputs
- Simplified key management (single 512-bit key)
- String-to-string interface preferred
When lower-level libraries may be preferable:
- Need for specific algorithms (ChaCha20-Poly1305, etc.)
- Streaming encryption of large files
- Asymmetric cryptography requirements
- Specialized protocols (Signal, Noise, etc.)
Pattern Implementation Examples
Database ID Obfuscation
Before (Hashids - insecure, encoding only):
=
=
= # "k2d3e4"
= # 123
Problems:
- Only works with integers
- Uses a weak "salt" (not a cryptographic key)
- Output reveals information about input (length, structure)
- Anyone with the salt can decode all IDs
After (Oboron - encrypted, reversible, secure):
=
=
= # "waz7vh42v1jqwtavafwnxqy2anhn12w6"
= # "123"
Advantages:
- Encodes arbitrary strings (vs integer-only encoding)
- Actual encryption (not just encoding)
- Can embed metadata (e.g., "user:", "order:" prefixes, or JSON)
- Tamper-proof with authenticated schemes
The advantage of Hashids is that they are both short and reversible. With Oboron, if no reversibility is required, the first 6 characters of the obtext can be used as a collision-resistant reference (e.g., "waz7vh").
State Tokens
Before (JWT - large, complex):
=
=
=
# 191-character base64 string
=
Note the API asymmetry:
- jwt.encode() takes algorithm="HS256"
- jwt.decode() takes algorithms=["HS256"]
- Security feature needed due to same API supporting both symmetric and asymmetric cryptography
Performance (on Intel i5):
- jwt.encode(): 20 us
- jwt.decode(): 24 us
HS256 accepts any length secret, no warnings for short secrets:
# works fine
After (Oboron - compact, simple):
# Deterministic, authenticated scheme
# Same 86 base64 characters format used for all algorithms
# Each algorithm gets proper length cryptographic key
# (e.g. 256-bit key for AES-GCM-SIV)
=
=
=
=
=
# 142 characters base64 string
=
=
# Implement your own token validation logic in a few lines of code
...
Performance comparison (Intel i5 CPU):
| 89B claims (example above) | encode | decode | Note |
|---|---|---|---|
| JWT w/ HS256 auth | 20 us | 24 us | |
| Oboron w/ string payload | 1.9 us | 1.9 us | Rust execution dominated by Python bindings overhead |
| Oboron w/ dict to JSON | 4.7 us | 4.0 us | JSON serialization overhead exceeds encryption call |
=> encryption + authentication is 5x faster than JWT (HS256 provides auth only)
Token size comparison:
- JWT: 191B
- Oboron: 142B (25% smaller)
When to prefer Oboron over JWT:
- Simple symmetric encryption requirements
- Compact size important (URL parameters)
- JWT standardization not required
- Performance considerations
When JWT may be preferable:
- Industry-standard token format required
- Public/private key signatures needed
- Complex claims with registered names
ID Generation and Hash-like Applications
Oboron provides efficient alternatives to UUIDs and SHA256 for generating unique, referenceable identifiers.
The examples in this section use the zrbcx and keyless features, which
are cryptographically insecure and therefore not enabled by default. Enable
the required features explicitly in your Cargo.toml.
Approach 1: Full Oboron Output (Reversible)
= # Obfuscation context
=
# "mdwsx9rdwkntyqcf806r9jhsp6gg" (28 base32 chars, reversible)
- Pros:
  - Reversible (decodes to "user:alice")
  - Opaque structure: When decoded with base32, the obtext produces a binary blob, revealing no input patterns.
  - Automatic handling: Oboron detects the scheme (zrbcx), and can decrypt with its hardcoded key
- Cons:
  - Using hardcoded key: Given the context (keyless Oboron), anyone can decode
- Best for:
  - Internal systems where reversibility is useful
  - Strong obfuscation where attackers have no context of Oboron use
Possible security tightening if reversibility is needed:
- Use aags or aasv for strong 256-bit tamper-proof encryption. (Trade-off: longer output: 44 chars; 2-3x slower than zrbcx but still comparable performance to SHA256)
- Keep the payload securely encrypted by having a shared secret: env::var("OBORON_KEY") (Trade-off: shared secret management)
Approach 2: Trimmed Prefix (Hash-like, Non-reversible)
ob = ZrbcxC32
full = ob.enc
short_id = full
shorter_id = full # "mdwsx9" ~ Git 7 char hex commit reference
- Pros:
  - Non-reversible even with hardcoded key
  - No key management
  - Adjustable length
- Cons:
  - Not reversible
- Best for:
  - Public-facing identifiers requiring opacity and referenceable short IDs.
Oboron for Hash-like Identifier Generation
SHA256 is the ubiquitous go-to solution for hash identifiers. However, it is not optimized for short strings. Hashing a 6-digit ID or a 10-character parameter is a very common use case, but reaching for SHA256 in this context has drawbacks:
- the output is much longer than the input (always 64 hex characters)
- cutting the output down to a short prefix requires weighing odds of the birthday paradox problem
- performance is not optimal (optimized for large files)
Performance considerations:
- SHA256 + hex: ~190 ns, 64 hex characters (128-bit collision resistance)
- Oboron zrbcx (one block): ~130 ns, 28 base32/34 hex chars (37% faster)
- Oboron zrbcx (two blocks): ~147 ns, 53 base32/66 hex chars (27% faster, stronger than SHA256)
(Times from benchmarks run on an Intel i5 laptop.)
Collision resistance comparison:
- 6 base32 chars (30 bits): Exceeds 7 hex chars (28 bits) for short references
- 20 base32 chars (100 bits): Comparable to SHA1 collision resistance
- 28 base32 chars (136 bits): Slightly stronger than SHA256's 128 bits
- 53 base32 chars (264 bits): Substantially stronger than SHA256
Note that considering Oboron's 28- and 53-character outputs in the context of collision resistance only makes sense in a global namespace; when using a fixed key, the collision problem for full Oboron outputs disappears altogether.
Oboron advantages:
- Better performance - 27-37% faster than SHA256 for short strings
- More compact encoding - Base32 provides 5 bits per char vs hex's 4 bits
- Referenceable prefixes - High entropy from initial characters
- Tunable security - Select prefix length for specific collision resistance requirements
- Deterministic guarantee - Different inputs always produce different outputs
When to choose which approach:
- Oboron (28 chars): General-purpose quasi-hashing with deterministic non-collision guarantee, and improved performance over SHA256
- Oboron (53 chars): Stronger-than-SHA256 collision resistance (in a scenario without a fixed key)
- Shorter prefixes (6 chars): Git-like short references
Note: Oboron provides strong collision resistance for identifier generation but is not a comprehensive replacement for cryptographic hashing in all contexts (e.g., password hashing where slow hashes are desirable).
Compatibility
Oboron implementations maintain full cross-language compatibility:
- Identical encryption algorithms and key management
- Consistent encoding formats and scheme specifications
- Interoperable encoded values across Rust, Python, and Go (latter currently under development)
All implementations must pass the common test vectors.
Getting Help
License
Licensed under the MIT license (LICENSE).
Appendix: Obtext Lengths
mock1 is a non-cryptographic scheme used for testing, whose ciphertext
is equal to the plaintext bytes (identity transformation). It is
included in the tables below as baseline.
(Note: the mock1 scheme is feature gated: use it by enabling the mock1
feature, or the ob7x testing feature group, or the non-crypto feature
group.)
Base32 Encoding (b32/c32)
| Format | 4B | 8B | 12B | 16B | 24B | 32B | 64B | 128B |
|---|---|---|---|---|---|---|---|---|
| mock1.b32 | 10 | 16 | 23 | 29 | 42 | 55 | 106 | 208 |
| aags.b32 | 36 | 42 | 48 | 55 | 68 | 80 | 132 | 234 |
| aasv.b32 | 36 | 42 | 48 | 55 | 68 | 80 | 132 | 234 |
| apgs.b32 | 55 | 61 | 68 | 74 | 87 | 100 | 151 | 253 |
| apsv.b32 | 61 | 68 | 74 | 80 | 93 | 106 | 157 | 260 |
| upbc.b32 | 55 | 55 | 55 | 55 | 80 | 80 | 132 | 234 |
| zrbcx.b32 | 29 | 29 | 29 | 29 | 55 | 55 | 106 | 208 |
Base64 Encoding (b64)
| Format | 4B | 8B | 12B | 16B | 24B | 32B | 64B | 128B |
|---|---|---|---|---|---|---|---|---|
| mock1.b64 | 8 | 14 | 19 | 24 | 35 | 46 | 88 | 174 |
| aags.b64 | 30 | 35 | 40 | 46 | 56 | 67 | 110 | 195 |
| aasv.b64 | 30 | 35 | 40 | 46 | 56 | 67 | 110 | 195 |
| upbc.b64 | 46 | 46 | 46 | 46 | 67 | 67 | 110 | 195 |
| apgs.b64 | 46 | 51 | 56 | 62 | 72 | 83 | 126 | 211 |
| apsv.b64 | 51 | 56 | 62 | 67 | 78 | 88 | 131 | 216 |
| zrbcx.b64 | 24 | 24 | 24 | 24 | 46 | 46 | 88 | 174 |
Hex Encoding (hex)
| Format | 4B | 8B | 12B | 16B | 24B | 32B | 64B | 128B |
|---|---|---|---|---|---|---|---|---|
| mock1.hex | 12 | 20 | 28 | 36 | 52 | 68 | 132 | 260 |
| aags.hex | 44 | 52 | 60 | 68 | 84 | 100 | 164 | 292 |
| aasv.hex | 44 | 52 | 60 | 68 | 84 | 100 | 164 | 292 |
| upbc.hex | 68 | 68 | 68 | 68 | 100 | 100 | 164 | 292 |
| apgs.hex | 68 | 76 | 84 | 92 | 108 | 124 | 188 | 316 |
| apsv.hex | 76 | 84 | 92 | 100 | 116 | 132 | 196 | 324 |
| zrbcx.hex | 36 | 36 | 36 | 36 | 68 | 68 | 132 | 260 |
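The a-tier and u-tier rows above can be reproduced with a small length model. The per-scheme overheads in this sketch are assumptions inferred from the tables, not taken from Oboron's source: a 16-byte tag for aags and aasv, an additional 12-byte nonce for apgs, an additional 16-byte nonce for apsv, a 16-byte IV plus whole 16-byte blocks for upbc, a 2-byte scheme marker, and unpadded encodings.

```python
BITS_PER_CHAR = {"b32": 5, "c32": 5, "b64": 6, "hex": 4}
MARKER_LEN = 2  # 2-byte scheme marker appended to the ciphertext


def ciphertext_len(scheme: str, n: int) -> int:
    """Ciphertext length in bytes for an n-byte plaintext (assumed overheads)."""
    if scheme in ("aags", "aasv"):
        return n + 16                     # 16-byte tag / synthetic IV
    if scheme == "apgs":
        return n + 16 + 12                # tag + 12-byte nonce
    if scheme == "apsv":
        return n + 16 + 16                # tag + 16-byte nonce
    if scheme == "upbc":
        return 16 + -(-n // 16) * 16      # IV + plaintext rounded up to blocks
    raise ValueError(f"no length model for scheme {scheme!r}")


def obtext_len(scheme: str, encoding: str, n: int) -> int:
    """Obtext length in characters: ceil(payload bits / bits per character)."""
    payload_bits = 8 * (ciphertext_len(scheme, n) + MARKER_LEN)
    return -(-payload_bits // BITS_PER_CHAR[encoding])


sizes = (4, 8, 12, 16, 24, 32, 64, 128)
for scheme in ("aags", "aasv", "apgs", "apsv", "upbc"):
    print(f"{scheme}.b64", [obtext_len(scheme, "b64", n) for n in sizes])
```

Under these assumptions the model matches every a-tier and u-tier cell in the three tables, which suggests the encodings are emitted without padding characters.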