florid 0.1.0

# florid

Generate nice human-readable unique identifiers from word dictionaries.

```bash
cargo run -- 20
# => e.g. crispy-galaxy-meadow
```

## Features

- **Human-readable**: IDs are composed of real English words
- **Configurable length**: Generate IDs from 5 to 36 characters
- **Case-insensitive**: IDs are always lowercase; input is normalized when parsed
- **Deterministic option**: Seed the RNG for reproducible IDs (useful for testing)
- **Minimal dependencies**: Only `rand` and `thiserror`

## Usage

### Library

```rust
use florid::{florid, Florid};

// Generate a 20-character ID
let id = florid(20).unwrap();
println!("{}", id); // e.g., "crispy-galaxy-meadow" (20 characters)

// Parse and validate
let parsed: Florid = "red-cat-dog".parse().unwrap();
assert_eq!(parsed.word_count(), 3);

// With custom RNG for deterministic generation
use rand::SeedableRng;
use rand::rngs::StdRng;

let mut rng = StdRng::seed_from_u64(42);
let id = florid::florid_with_rng(20, &mut rng).unwrap();
```

### CLI

```bash
# Generate a single 20-character ID (default)
florid

# Generate a 15-character ID
florid 15

# Generate 5 IDs
florid -n 5

# Generate 10 IDs of 25 characters
florid -l 25 -n 10

# Show entropy and collision info
florid --info
```

## ID Format

| Length | Format | Example |
|--------|--------|---------|
| 5-9 | `word` + digit + `word` | `go3cat` |
| 10-15 | `word-word` | `crisp-galaxy` |
| 16-24 | `word-word-word` | `crisp-galaxy-meadow` |
| 25-36 | `word-word-word-word` | `crisp-galaxy-meadow-river` |

Short IDs (5-9 chars) use digits as separators instead of hyphens to maximize character efficiency.
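
The table above can be read as a mapping from requested length to ID shape. The following is a minimal sketch of that mapping only; `Shape` and `shape_for_length` are hypothetical names used for illustration and are not part of the crate's documented API, and the crate may choose word counts differently internally.

```rust
/// Hypothetical description of an ID's layout (illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Shape {
    /// `word` + digit + `word`, e.g. "go3cat".
    Compact,
    /// N words joined by hyphens, e.g. "crisp-galaxy".
    Hyphenated(usize),
}

/// Maps a requested length to the shape listed in the format table.
/// Returns `None` outside the documented 5..=36 range.
fn shape_for_length(length: usize) -> Option<Shape> {
    match length {
        5..=9 => Some(Shape::Compact),
        10..=15 => Some(Shape::Hyphenated(2)),
        16..=24 => Some(Shape::Hyphenated(3)),
        25..=36 => Some(Shape::Hyphenated(4)),
        _ => None,
    }
}
```

For example, `shape_for_length(20)` yields `Some(Shape::Hyphenated(3))`, matching the `word-word-word` row.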

## Entropy and Collision Probability

The library uses a curated dictionary of ~67,000 words (~16.0 bits of entropy per word), sourced from SCOWL (Spell Checker Oriented Word Lists) and the Moby POS database.

### Collision Probability

| IDs Generated | 2 Words | 3 Words | 4 Words |
|--------------|---------|---------|---------|
| 1,000 | ~1.1e-04 | ~1.7e-09 | ~2.5e-14 |
| 10,000 | ~1.1e-02 | ~1.7e-07 | ~2.5e-12 |
| 100,000 | ~6.8e-01 | ~1.7e-05 | ~2.5e-10 |
| 1,000,000 | ~1.0 | ~1.7e-03 | ~2.5e-08 |

**Recommendation**: Use 3+ words (16+ characters) once you expect more than a few thousand IDs, since 2-word IDs already reach a ~1% collision probability at 10,000 IDs. Use 4 words (25+ characters) for high-volume applications (millions of IDs).

### Entropy per Configuration

- **2 words**: ~32.0 bits
- **3 words**: ~48.1 bits
- **4 words**: ~64.1 bits
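
The entropy and collision figures in this section are consistent with the standard birthday approximation, p ≈ 1 − exp(−n(n−1) / 2S), where S is the number of possible IDs. The sketch below recomputes them with the standard library only. It is not the crate's implementation: the function names are hypothetical, and the dictionary size is passed in as an assumed round figure (67,000), so results can differ from the tables in the last digit.

```rust
/// Entropy in bits of an ID made of `num_words` words drawn
/// independently and uniformly from `dict_size` words.
fn entropy_bits_sketch(dict_size: f64, num_words: u32) -> f64 {
    num_words as f64 * dict_size.log2()
}

/// Birthday approximation of the probability that at least two of
/// `num_ids` random IDs collide: 1 - exp(-n(n-1) / (2 * space)).
fn collision_probability_sketch(num_ids: u64, dict_size: f64, num_words: u32) -> f64 {
    let space = dict_size.powi(num_words as i32); // number of possible IDs
    let n = num_ids as f64;
    let exponent = -(n * (n - 1.0)) / (2.0 * space);
    // exp_m1(x) = e^x - 1, so its negation is 1 - e^x,
    // which stays accurate when the probability is tiny.
    -exponent.exp_m1()
}
```

With an assumed `dict_size` of `67_000.0`, `collision_probability_sketch(1_000, 67_000.0, 2)` comes out near 1.1e-04, in line with the first row of the table.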

## Limitations

1. **Not for secrets**: Although the default RNG is `rand::thread_rng()` (cryptographically seeded), the dictionary is public and the ID space is small (roughly 32 to 64 bits), so IDs should not be used as secrets or tokens.

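2. **Collision risk at scale**: With ~67,000 words (birthday approximation):
   - 2-word IDs: ~1% collision probability at 10,000 IDs, ~68% at 100,000 IDs
   - 3-word IDs: ~0.2% at 1M IDs; a collision becomes likely (~50%) around 20M IDs
   - 4-word IDs: ~2.5e-08 at 1M IDs; a collision becomes likely (~50%) around 5B IDs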

3. **Length constraints**: Some exact lengths may be harder to achieve due to word length distribution. The generator will retry up to 1,000 times before failing.

4. **No guaranteed uniqueness**: The library generates random IDs but does not check for uniqueness. For guaranteed uniqueness, maintain your own set/database.
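
One way to maintain your own set, as item 4 suggests, is to wrap the generator in a retry loop backed by a `HashSet`. This is a minimal in-memory sketch, not something the crate provides: `generate_unique` is a hypothetical helper, and the generator is passed in as a closure so the example does not depend on any particular API.

```rust
use std::collections::HashSet;

/// Hypothetical helper: calls `generate` until it returns an ID that is
/// not yet in `seen`, records it, and returns it. Gives up with `None`
/// after `max_attempts` tries or as soon as the generator itself fails.
fn generate_unique<F>(
    seen: &mut HashSet<String>,
    mut generate: F,
    max_attempts: usize,
) -> Option<String>
where
    F: FnMut() -> Option<String>,
{
    for _ in 0..max_attempts {
        let candidate = generate()?; // stop if the generator fails
        // `insert` returns true only if the ID was not already present.
        if seen.insert(candidate.clone()) {
            return Some(candidate);
        }
    }
    None
}
```

With this crate, the closure would presumably be something like `|| florid::florid(20).ok()`. For IDs shared across processes, a database unique constraint would play the role of the `HashSet`.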

## Performance

- ID generation is O(1) with a small constant factor
- Dictionary is embedded at compile time (no I/O)
- Thread-safe: uses thread-local RNG by default

Typical generation: ~1μs per ID on modern hardware.

## API Reference

### Functions

- `florid(length: usize) -> Result<String, FloridError>` - Generate an ID of exactly `length` characters
- `florid_with_rng(length: usize, rng: &mut R) -> Result<String, FloridError>` - Generate with a caller-supplied RNG of type `R` (e.g. a seeded `StdRng`)
- `is_valid(s: &str) -> bool` - Check if string is a valid florid
- `normalize(s: &str) -> Option<String>` - Normalize case and validate
- `entropy_bits() -> f64` - Get entropy bits per word
- `collision_probability(num_ids: u64, num_words: usize) -> f64` - Calculate collision probability
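
To illustrate how an `Option`-returning `normalize` and a boolean `is_valid` can relate, here is a rough sketch that checks only the shape of an ID. The rules shown (ASCII letters and digits in non-empty hyphen-separated segments, length 5 to 36) are assumptions inferred from the format table; the crate's real validation may be stricter, for example by checking words against the dictionary. The `_sketch` names are hypothetical.

```rust
const MIN_LENGTH: usize = 5;
const MAX_LENGTH: usize = 36;

/// Hypothetical shape-only normalization: lowercases the input and
/// returns it if the length is in range and every hyphen-separated
/// segment is non-empty ASCII alphanumeric. No dictionary lookup.
fn normalize_sketch(s: &str) -> Option<String> {
    let lowered = s.to_ascii_lowercase();
    let length_ok = (MIN_LENGTH..=MAX_LENGTH).contains(&lowered.len());
    let segments_ok = lowered
        .split('-')
        .all(|seg| !seg.is_empty() && seg.chars().all(|c| c.is_ascii_alphanumeric()));
    if length_ok && segments_ok {
        Some(lowered)
    } else {
        None
    }
}

/// Under this sketch, a string is valid exactly when it can be normalized.
fn is_valid_sketch(s: &str) -> bool {
    normalize_sketch(s).is_some()
}
```

Under these assumed rules, `normalize_sketch("Crisp-Galaxy")` yields `Some("crisp-galaxy")`, while `"crisp--galaxy"` and `"a-b"` are rejected.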

### Types

- `Florid` - Parsed and validated ID with `Display`, `FromStr`, `Clone`, `Hash`, `Eq`
- `FloridError` - Error type for invalid lengths or formats

### Constants

- `MIN_LENGTH = 5`
- `MAX_LENGTH = 36`
- `ADJECTIVES` - Word list
- `NOUNS` - Word list
- `SHORT_WORDS` - 2-4 letter words for compact IDs

## License

MIT