# string-width


[![Crates.io](https://img.shields.io/crates/v/string-width.svg)](https://crates.io/crates/string-width)
[![Documentation](https://docs.rs/string-width/badge.svg)](https://docs.rs/string-width)
[![License](https://img.shields.io/crates/l/string-width.svg)](https://github.com/sabry-awad97/string-width#license)
[![Build Status](https://github.com/sabry-awad97/string-width/workflows/CI/badge.svg)](https://github.com/sabry-awad97/string-width/actions)

Accurate Unicode string width calculation for terminal applications. This library correctly handles:

- **Emoji sequences** (👨‍👩‍👧‍👦, 🇺🇸, 1️⃣)
- **East Asian characters** (你好, こんにちは, 안녕하세요)
- **Combining marks and diacritics** (é, ñ, ü)
- **Zero-width characters** (ZWJ, ZWNJ, format characters)
- **ANSI escape sequences** (colors, cursor movement)
- **Ambiguous width characters** (±, ×, ÷)

## Features


- 🎯 **Accurate**: Implements Unicode Standard recommendations for character width
- πŸš€ **Fast**: Optimized for performance with minimal allocations
- πŸ”§ **Configurable**: Handle ambiguous characters and ANSI codes as needed
- πŸ“¦ **Zero-config**: Works out of the box with sensible defaults
- πŸ§ͺ **Well-tested**: Comprehensive test suite with 160+ test cases

## Quick Start


Add this to your `Cargo.toml`:

```toml
[dependencies]
string-width = "0.1.0"
```

**Minimum Supported Rust Version (MSRV)**: 1.85

## Usage


### Basic Usage


```rust
use string_width::string_width;

// ASCII text
assert_eq!(string_width("Hello"), 5);

// East Asian characters (full-width)
assert_eq!(string_width("你好"), 4);

// Emoji
assert_eq!(string_width("👋"), 2);
assert_eq!(string_width("🇺🇸"), 2);  // Flag
assert_eq!(string_width("1️⃣"), 2);   // Keycap sequence

// Mixed content
assert_eq!(string_width("Hello 👋 世界"), 13);

// ANSI escape sequences are ignored by default
assert_eq!(string_width("\x1b[31mRed\x1b[0m"), 3);
```
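
To make "ANSI escape sequences are ignored by default" concrete, the following std-only sketch strips CSI sequences (such as color codes) so that only the visible text is left to measure. `strip_csi` is a hypothetical helper written for this illustration; it is not part of the crate's API, and the crate's own detection may cover more kinds of escape sequences.

```rust
/// Hypothetical helper (not part of string-width): removes CSI sequences
/// such as "\x1b[31m" and returns the remaining visible text.
fn strip_csi(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\x1b' && chars.peek() == Some(&'[') {
            chars.next(); // consume '['
            // Skip parameter and intermediate bytes; a final byte in
            // 0x40..=0x7E terminates the sequence.
            for next in chars.by_ref() {
                if ('\x40'..='\x7e').contains(&next) {
                    break;
                }
            }
        } else {
            // Anything else, including a lone ESC, is kept as-is.
            out.push(c);
        }
    }
    out
}
```

With this sketch, `strip_csi("\x1b[31mRed\x1b[0m")` yields `"Red"`, which is three columns wide, matching the default result shown above.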

### Advanced Configuration


```rust
use string_width::{string_width_with_options, AmbiguousWidthTreatment, StringWidthOptions};

// Direct configuration
let options = StringWidthOptions {
    count_ansi: true,  // Count ANSI escape sequences
    ambiguous_width: AmbiguousWidthTreatment::Wide,  // Treat ambiguous as wide
};

assert_eq!(string_width_with_options("±×÷", options.clone()), 6);  // Ambiguous chars as wide
assert_eq!(string_width_with_options("\x1b[31mRed\x1b[0m", options), 12);  // Count ANSI

// Using the builder pattern (recommended)
let options = StringWidthOptions::builder()
    .count_ansi(true)
    .ambiguous_as_wide()
    .build();

assert_eq!(string_width_with_options("±×÷", options), 6);
```
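
The effect of the ambiguous-width setting can be pictured with a small std-only sketch. Everything here is hypothetical and for illustration only: `AmbiguousSketch` and `sketch_width` are not crate APIs, and the lookup covers just the three sample characters used above rather than the full East Asian Ambiguous set.

```rust
/// Hypothetical stand-in for an ambiguous-width policy.
#[derive(Clone, Copy)]
enum AmbiguousSketch {
    Narrow,
    Wide,
}

/// Sums per-character widths, treating a tiny sample of East Asian
/// Ambiguous characters according to `treatment`. All other characters
/// are assumed to be one column wide, which is NOT true in general.
fn sketch_width(text: &str, treatment: AmbiguousSketch) -> usize {
    const AMBIGUOUS_SAMPLE: [char; 3] = ['±', '×', '÷'];
    text.chars()
        .map(|c| {
            if AMBIGUOUS_SAMPLE.contains(&c) {
                match treatment {
                    AmbiguousSketch::Narrow => 1,
                    AmbiguousSketch::Wide => 2,
                }
            } else {
                1
            }
        })
        .sum()
}
```

Under these assumptions, `sketch_width("±×÷", AmbiguousSketch::Wide)` is 6 and the narrow variant is 3, which mirrors the difference the real option makes.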

### Configuration Builder Pattern


```rust
use string_width::{string_width, StringWidthOptions, AmbiguousWidthTreatment};

// Fluent builder API for easy configuration
let options = StringWidthOptions::builder()
    .count_ansi(true)
    .ambiguous_as_wide()
    .build();

let text = "\x1b[31m±×÷\x1b[0m";
assert_eq!(string_width((text, options)), 14);  // ANSI + wide ambiguous

// Convenience methods for common configurations
let narrow_options = StringWidthOptions::builder()
    .ambiguous_as_narrow()
    .build();

let wide_options = StringWidthOptions::builder()
    .ambiguous_as_wide()
    .build();

// Equivalent to direct construction but more readable
let manual_options = StringWidthOptions {
    count_ansi: true,
    ambiguous_width: AmbiguousWidthTreatment::Wide,
};
```

### Using the DisplayWidth Trait


```rust
use string_width::{DisplayWidth, StringWidthOptions};

let text = "Hello 🌍";
println!("Width: {}", text.display_width());

// With custom options using builder pattern
let options = StringWidthOptions::builder()
    .count_ansi(false)
    .ambiguous_as_narrow()
    .build();
println!("Width: {}", text.display_width_with_options(options));

// Works with both &str and String
let owned_string = String::from("Hello 🌍");
println!("Width: {}", owned_string.display_width());
```
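
One common way to make a method available on both `&str` and `String` is a blanket implementation over `AsRef<str>`. The sketch below shows that general pattern with a hypothetical `ApproxWidth` trait that merely counts `char`s. It is an assumption about the technique, not a description of how `DisplayWidth` is actually implemented.

```rust
/// Hypothetical extension trait; the name and method are illustrative only.
trait ApproxWidth {
    fn approx_width(&self) -> usize;
}

/// Blanket impl: anything that can be viewed as `&str` gets the method.
impl<T: AsRef<str> + ?Sized> ApproxWidth for T {
    fn approx_width(&self) -> usize {
        // Placeholder measurement: counts Unicode scalar values,
        // which is not a real display width.
        self.as_ref().chars().count()
    }
}
```

With the trait in scope, both `"Hello".approx_width()` and `String::from("Hello").approx_width()` compile and return the same value.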

## Terminal Applications


The crate is well suited to CLI tools that need correct text alignment.

### Text Alignment


```rust
use string_width::DisplayWidth;

fn align_text(text: &str, width: usize) -> String {
    let text_width = text.display_width();
    let padding = width.saturating_sub(text_width);
    format!("{}{}", text, " ".repeat(padding))
}

// Works correctly with Unicode content
println!("│{}│", align_text("Hello", 10));      // │Hello     │
println!("│{}│", align_text("你好", 10));        // │你好      │
println!("│{}│", align_text("🇺🇸 USA", 10));     // │🇺🇸 USA    │
```
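
The same padding idea extends to centering. The sketch below is a hypothetical helper, not part of the crate: `center_text` takes the measuring function as a parameter so that it stays self-contained. In a real program one would presumably pass something like `|s| s.display_width()`.

```rust
/// Hypothetical helper: centers `text` in a field of `width` columns,
/// using `measure` to obtain the display width of the text.
fn center_text(text: &str, width: usize, measure: impl Fn(&str) -> usize) -> String {
    // Text wider than the field is returned unpadded.
    let padding = width.saturating_sub(measure(text));
    let left = padding / 2;
    let right = padding - left; // any odd column goes to the right side
    format!("{}{}{}", " ".repeat(left), text, " ".repeat(right))
}
```

Passing the measurement in as a closure keeps the layout logic independent of how width is computed, so the helper can be tested with a trivial measure such as `|s| s.chars().count()` on ASCII input.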

### Table Formatting


```rust
use string_width::DisplayWidth;

let data = vec![
    ("Name", "Country", "Greeting"),
    ("Alice", "🇺🇸 USA", "Hello!"),
    ("田中", "🇯🇵 Japan", "こんにちは"),
];

// Calculate column widths
let widths: Vec<usize> = (0..3)
    .map(|col| data.iter().map(|row| {
        match col {
            0 => row.0.display_width(),
            1 => row.1.display_width(),
            2 => row.2.display_width(),
            _ => 0,
        }
    }).max().unwrap_or(0))
    .collect();

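// Pad by display width: `{:width$}` pads by `char` count, which would
// misalign cells containing wide characters or emoji sequences.
let pad = |text: &str, width: usize| {
    format!("{}{}", text, " ".repeat(width.saturating_sub(text.display_width())))
};

// Print aligned table
for (name, country, greeting) in data {
    println!("│ {} │ {} │ {} │",
        pad(name, widths[0]), pad(country, widths[1]), pad(greeting, widths[2])
    );
}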
```

## Examples


The library includes comprehensive examples demonstrating various use cases:

```bash
# Basic usage examples - demonstrates core functionality
cargo run --example basic_usage

# Terminal formatting and alignment - shows real-world usage
cargo run --example terminal_formatting

# CLI tool for measuring text width - interactive width analysis
cargo run --example cli_tool -- "Hello 👋 World"
echo "Hello 👋 World" | cargo run --example cli_tool
```

### Key Example Features


- **Basic Usage**: ASCII, Unicode, emoji, and mixed content width calculation
- **Terminal Formatting**: Text alignment, table formatting, and progress bars
- **CLI Tool**: Interactive analysis with detailed character breakdown and formatting examples

## Supported Unicode Features


| Feature          | Example                | Width |
| ---------------- | ---------------------- | ----- |
| ASCII            | `"Hello"`              | 5     |
| East Asian       | `"你好"`               | 4     |
| Emoji            | `"😀"`                 | 2     |
| Emoji sequences  | `"👨‍👩‍👧‍👦"`                 | 2     |
| Keycap sequences | `"1️⃣"`                 | 2     |
| Flag sequences   | `"🇺🇸"`                 | 2     |
| Combining marks  | `"é"` (e + ´)          | 1     |
| Zero-width chars | `"a\u{200B}b"`         | 2     |
| ANSI codes       | `"\x1b[31mRed\x1b[0m"` | 3     |
## Architecture


The library is designed with a modular architecture:

- **`width_calculation`**: Core width calculation logic and public API
- **`character_classification`**: Unicode character categorization and analysis
- **`emoji`**: Emoji sequence detection and handling
- **`options`**: Configuration types and builder pattern
- **`unicode_constants`**: Precomputed Unicode property tables

## Performance


The library is optimized for performance:

- Minimal allocations during width calculation
- Efficient Unicode property lookups using precomputed tables
- Compiled regex patterns for zero-width detection
- Single-pass processing for most cases
- Optimized grapheme cluster processing
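
As an illustration of lookups against a precomputed table, the sketch below binary-searches a sorted list of inclusive code point ranges. The table contents and the names `WIDE_RANGES_SAMPLE` and `in_table` are hypothetical: the sample lists only a few approximate East Asian blocks, and whether the crate uses binary search internally is an assumption.

```rust
use std::cmp::Ordering;

/// Hypothetical sample table: sorted, non-overlapping, inclusive ranges.
/// Far from a complete East Asian Width table.
const WIDE_RANGES_SAMPLE: &[(u32, u32)] = &[
    (0x3040, 0x309F), // Hiragana block
    (0x30A0, 0x30FF), // Katakana block
    (0x4E00, 0x9FFF), // CJK Unified Ideographs block
    (0xAC00, 0xD7A3), // Hangul Syllables
];

/// Returns true if `c` falls inside any range of `table`.
/// Runs in O(log n) because the table is sorted.
fn in_table(c: char, table: &[(u32, u32)]) -> bool {
    let cp = c as u32;
    table
        .binary_search_by(|&(start, end)| {
            if end < cp {
                Ordering::Less // range lies entirely below the code point
            } else if start > cp {
                Ordering::Greater // range lies entirely above it
            } else {
                Ordering::Equal // code point is inside this range
            }
        })
        .is_ok()
}
```

Because the table is a `const` slice, a lookup like this needs no allocation, which fits the goal of minimal allocations listed above.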

## API Overview


The library provides multiple ways to calculate string width:

| Function/Trait     | Use Case                  | Example                                 |
| ------------------ | ------------------------- | --------------------------------------- |
| `string_width()`   | Simple width calculation  | `string_width("Hello")`                 |
| `DisplayWidth`     | Ergonomic trait-based API | `"Hello".display_width()`               |
| `StringWidthInput` | Legacy compatibility      | `"Hello".calculate_width()`             |
| Builder Pattern    | Complex configuration     | `StringWidthOptions::builder().build()` |

## Comparison with Other Libraries


| Library          | Emoji Support | East Asian | ANSI Handling   | Combining Marks | Builder API |
| ---------------- | ------------- | ---------- | --------------- | --------------- | ----------- |
| **string-width** | ✅ Full       | ✅ Yes     | ✅ Configurable | ✅ Yes          | ✅ Yes      |
| unicode-width    | ❌ Basic      | ✅ Yes     | ❌ No           | ✅ Yes          | ❌ No       |
| textwrap         | ❌ No         | ✅ Yes     | ❌ No           | ✅ Yes          | ❌ No       |

## Contributing


Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

## License


This project is licensed under either of

- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)

at your option.

## Acknowledgments


- Unicode Consortium for the Unicode Standard
- East Asian Width property implementation
- Terminal emulator developers for width handling insights