litext 1.3.0 - Docs.rs

# litext - Detailed Documentation

This document covers advanced usage, edge cases, and implementation details.

## Quick Start

Extract a string literal:

```rust
use litext::{litext, TokenStream};

fn my_macro(input: TokenStream) -> TokenStream {
    let text: String = litext!(input);
    // Returns early with a compile error if extraction fails.
    // Use `litext!(input as String)` for the same result, explicitly typed.
}
```

Extract other literal types:

```rust
let num: i32  = litext!(input as i32);
let val: f64  = litext!(input as f64);
let ch: char  = litext!(input as char);
let flag: bool = litext!(input as bool);
```

Keep the `Result` instead of returning early:

```rust
let result: Result<String, TokenStream> = litext!(try input);
match result {
    Ok(text) => { /* use text */ }
    Err(e)   => return e,
}

// Same with an explicit type:
let result: Result<i32, TokenStream> = litext!(try input as i32);
```

## Extracting Multiple Literals at Once

`litext!` can extract several values from one `TokenStream` in a single call. Wrap the types in a tuple and put the separator token between them:

```rust
// Input TokenStream: "hello" , 42
let (s, n): (String, i32) = litext!(input as (String , i32));

// Three values separated by commas:
// Input: "hello" , 3.14 , true
let (s, f, b): (String, f64, bool) = litext!(input as (String , f64 , bool));

// Semicolon separator:
// Input: "key" ; 100
let (key, val): (String, u32) = litext!(input as (String ; u32));
```

The `try` form also works for tuples:

```rust
let result: Result<(String, i32), _> = litext!(try input as (String , i32));
```

Up to 12 values can be extracted in one call.

### Separator Rules

The separator written in the macro call and the separator in the actual `TokenStream` must match.

**Single-character separators**: Most single ASCII punctuation characters work, for example `,`, `;`, `|`, `&`, `+`, `-`, `*`, `/`, `%`, `^`, `!`, `?`, `<`, `>`, `=`, `.`, `:`, `@`, `~`. The exceptions are listed at the end of this section.

**Two-character separators**: The following multi-character punctuation sequences are supported:
- `->` (arrow)
- `=>` (fat arrow)
- `::` (double colon)
- `..` (dot dot, range)

```rust
// Input: "key" -> 42
let (k, v): (String, i32) = litext!(input as (String -> i32));

// Input: "opt" => "val"
let (k, v): (String, String) = litext!(input as (String => String));

// Input: "ns" :: "item"
let (ns, item): (String, String) = litext!(input as (String :: String));

// Input: 1 .. 10
let (start, end): (i32, i32) = litext!(input as (i32 .. i32));

// Mixed: arrow + comma
// Input: "a" -> 1 , true
let (s, n, b): (String, i32, bool) = litext!(input as (String -> i32 , bool));
```

Grouping delimiters (`(`, `)`, `[`, `]`, `{`, `}`) and `$` are not valid separators.
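
To make the separator rules concrete, here is a minimal sketch of how a separator spec could be split into single-character and two-character separators. It works on plain characters rather than on real `Punct` tokens, and `split_separators` is a hypothetical helper, not part of the `litext` API.

```rust
/// Hypothetical helper: splits a separator spec such as "-> ," into
/// individual separators, preferring the two-character forms listed above.
fn split_separators(spec: &str) -> Result<Vec<String>, String> {
    const TWO_CHAR: [&str; 4] = ["->", "=>", "::", ".."];
    const REJECTED: [char; 7] = ['(', ')', '[', ']', '{', '}', '$'];

    let chars: Vec<char> = spec.chars().collect();
    let mut out = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        // Whitespace only separates tokens; it is never a separator itself.
        if c.is_whitespace() {
            i += 1;
            continue;
        }
        if REJECTED.contains(&c) {
            return Err(format!("`{c}` is not a valid separator"));
        }
        if !c.is_ascii_punctuation() {
            return Err(format!("`{c}` is not punctuation"));
        }
        // Try a two-character separator first; the characters must be adjacent.
        if i + 1 < chars.len() {
            let pair: String = chars[i..i + 2].iter().collect();
            if TWO_CHAR.contains(&pair.as_str()) {
                out.push(pair);
                i += 2;
                continue;
            }
        }
        out.push(c.to_string());
        i += 1;
    }
    Ok(out)
}
```

Under these assumptions, `split_separators("-> ,")` yields the two separators `->` and `,`, while `split_separators("$")` returns an error. Checking the two-character forms before falling back to a single character is what keeps `->` from being read as `-` followed by `>`.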

### Type Constraints

Type arguments in tuple position must be single-token identifiers. Generic types like `LitInt<u8>` span multiple tokens and cannot appear in this position. Use the bare forms `LitInt` and `LitFloat` instead, which fall back to their default type parameters (`i32` and `f64`, respectively).

## Span-Aware Types

Every literal kind has a span-aware wrapper that bundles the parsed value with its source location. Use these when you need to emit a compiler diagnostic pointing at the exact literal in the user's code.

| Plain Type | Span-Aware Type | Literal Kind |
|------------|-----------------|--------------|
| `String` | `LitStr` | `"..."`, `r#"..."#` |
| `i8` to `i128`, `isize` | `LitInt<T>` (default: `i32`) | `42`, `0xFF`, `0b1010` |
| `u8` to `u128`, `usize` | `LitInt<T>` (default: `i32`) | same as above |
| `f32`, `f64` | `LitFloat<T>` (default: `f64`) | `3.14`, `1e10` |
| `bool` | `LitBool` | `true`, `false` |
| `char` | `LitChar` | `'a'`, `'\n'` |
| `u8` | `LitByte` | `b'a'`, `b'\xff'` |
| `Vec<u8>` | `LitByteStr` | `b"..."`, `br#"..."#` |
| `CString` | `LitCStr` | `c"..."`, `cr#"..."#` |

```rust
use litext::{litext, TokenStream};
use litext::literal::LitStr;

fn my_macro(input: TokenStream) -> TokenStream {
    let lit: LitStr = litext!(input as LitStr);

    if lit.value().is_empty() {
        return comperr::error(lit.span(), "string cannot be empty");
    }

    // Use lit.value() for the content, lit.span() for diagnostics.
}
```

```rust
use litext::literal::LitInt;

let lit: LitInt<u8> = litext!(input as LitInt<u8>);
let value: u8 = *lit.value();

if value == 0 {
    return comperr::error(lit.span(), "value cannot be zero");
}
```

All span-aware types expose:

- `.value()` - the parsed value
- `.span()` - the source location
- `.suffix()` - the explicit type suffix if present (`Some("i32")` for `42i32`, `None` for `42`)

### LitInt Details

`LitInt<T>` wraps an integer literal with its source span. The type parameter `T` must be one of the sealed integer types:

- Signed: `i8`, `i16`, `i32`, `i64`, `i128`, `isize`
- Unsigned: `u8`, `u16`, `u32`, `u64`, `u128`, `usize`

The default is `i32` when you use `LitInt` without a type parameter.

All integer literal formats are supported:

```rust
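// `ok` is not defined in this document; it is assumed to be a test helper
// that extracts a `T` from the given source text and unwraps the result.
let dec = ok::<LitInt<i32>>("42");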
let hex = ok::<LitInt<u32>>("0xFF");
let oct = ok::<LitInt<u32>>("0o77");
let bin = ok::<LitInt<u32>>("0b1010");
let with_underscores = ok::<LitInt<u32>>("1_000_000");
let with_suffix = ok::<LitInt<u8>>("255u8");
```

### LitFloat Details

`LitFloat<T>` wraps a float literal with its source span. The type parameter `T` must be `f32` or `f64`. The default is `f64`.

Formats supported:

```rust
// `ok` is assumed to be a test helper (not defined in this document).
let std = ok::<LitFloat<f64>>("3.14");
let scientific = ok::<LitFloat<f64>>("1e10");
let scientific_neg = ok::<LitFloat<f64>>("1e-10");
let underscores = ok::<LitFloat<f64>>("1_000.5");
let with_suffix = ok::<LitFloat<f32>>("3.14f32");
```

## Round-Tripping with ToTokens

All span-aware types implement `ToTokens`, which converts them back into a `TokenStream`. This enables an extract-validate-emit pattern:

```rust
use litext::literal::{LitStr, ToTokens};

let lit: LitStr = litext!(input as LitStr);

if lit.value().starts_with('_') {
    return comperr::error(lit.span(), "string cannot start with underscore");
}

// Emit the literal back into the output with its original span preserved.
lit.to_token_stream()
```

The round-trip preserves the original span for diagnostics, but the exact token representation may differ slightly from the input.

## Custom Literal Types

Implement `FromLit` to make `litext!(input as MyType)` work for your own types:

```rust
use litext::literal::FromLit;
use proc_macro2::{Literal, TokenStream};

pub struct NonEmpty(String);

impl FromLit for NonEmpty {
    fn from_lit(lit: Literal) -> Result<Self, TokenStream> {
        // Capture the span before `lit` is consumed so the error points at the literal.
        let span = lit.span();
        let s = String::from_lit(lit)?;
        if s.is_empty() {
            return Err(comperr::error(span, "string cannot be empty"));
        }
        Ok(NonEmpty(s))
    }
}

// Then in your macro:
let val: NonEmpty = litext!(input as NonEmpty);
```

### from_ident

Override `from_ident` if your type is represented as an identifier token (like `bool`):

```rust
fn from_ident(ident: proc_macro2::Ident) -> Result<Self, TokenStream> {
    // handle true/false/etc.
}
```

### from_negative_lit

Override `from_negative_lit` if your type supports negative numeric literals:

```rust
fn from_negative_lit(lit: Literal) -> Result<Self, TokenStream> {
    // Handle negative numbers (dash token followed by literal)
}
```

The default implementation returns an error, which is correct for unsigned types and non-numeric types.

## What Is Supported

### Strings

| Format | Example |
|--------|---------|
| Regular | `"hello world"` |
| Escape sequences | `"\n"`, `"\t"`, `"\\"`, `"\""`, `"\0"`, `"\x41"`, `"\u{1F600}"` |
| Line continuation | `"hello \` + newline + `world"` |
| Raw | `r#"no escapes here"#` |
| Raw (multiple hashes) | `r##"can contain #"##` |

Escape sequence details:

- Basic: `\n`, `\r`, `\t`, `\\`, `\"`, `\0`
- Hex: `\x41` (two hex digits, range 0x00-0x7F for strings, 0x00-0xFF for bytes)
- Unicode: `\u{1F600}` (any valid Unicode scalar value)
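
As an illustration of these rules, the following sketch decodes the escapes of a non-raw string body using only the standard library. The function name `unescape_str` is hypothetical and the code is not taken from `litext`; line continuations and underscores inside `\u{...}` are left out for brevity.

```rust
/// Decodes the escape sequences of a non-raw string literal body.
/// Illustrative sketch only; line continuations are not handled.
fn unescape_str(body: &str) -> Result<String, String> {
    let mut out = String::new();
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('r') => out.push('\r'),
            Some('t') => out.push('\t'),
            Some('\\') => out.push('\\'),
            Some('"') => out.push('"'),
            Some('0') => out.push('\0'),
            Some('x') => {
                // Exactly two hex digits; strings only allow 0x00..=0x7F.
                let hex: String = chars.by_ref().take(2).collect();
                if hex.len() != 2 || !hex.chars().all(|h| h.is_ascii_hexdigit()) {
                    return Err("`\\x` must be followed by two hex digits".to_string());
                }
                let byte = u8::from_str_radix(&hex, 16).map_err(|e| e.to_string())?;
                if byte > 0x7F {
                    return Err(format!("`\\x{hex}` is out of range for a string"));
                }
                out.push(byte as char);
            }
            Some('u') => {
                if chars.next() != Some('{') {
                    return Err("expected `{` after `\\u`".to_string());
                }
                // Collect hex digits up to the closing brace.
                let mut hex = String::new();
                loop {
                    match chars.next() {
                        Some('}') => break,
                        Some(h) if h.is_ascii_hexdigit() => hex.push(h),
                        _ => return Err("malformed `\\u{...}` escape".to_string()),
                    }
                }
                let code = u32::from_str_radix(&hex, 16)
                    .map_err(|_| format!("invalid unicode escape `\\u{{{hex}}}`"))?;
                // Surrogates and values above 0x10FFFF are not scalar values.
                let ch = char::from_u32(code)
                    .ok_or_else(|| format!("`{code:#X}` is not a Unicode scalar value"))?;
                out.push(ch);
            }
            other => return Err(format!("unknown escape: {other:?}")),
        }
    }
    Ok(out)
}
```

A byte-string variant would differ mainly in the `\x` branch, where the full 0x00-0xFF range is accepted and the result is pushed into a `Vec<u8>` instead of a `String`.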

### Integers

| Format | Example |
|--------|---------|
| Decimal | `42` |
| Hexadecimal | `0xFF`, `0XFF` |
| Octal | `0o77`, `0O77` |
| Binary | `0b1010`, `0B1010` |
| Underscore separators | `1_000_000`, `0xFF_FF` |
| Type suffix | `42i32`, `255u8`, `100usize` |

All integer types are supported: `i8` to `i128`, `isize`, `u8` to `u128`, `usize`. Overflow returns a compile error.
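
The formats in the table can be handled with a small amount of standard-library code. The sketch below is one possible approach, not the crate's actual implementation: `parse_int_literal` and `to_u8` are hypothetical names, and errors are plain `String`s rather than `compile_error!` token streams.

```rust
/// Parses the text of an integer literal into its magnitude and optional suffix.
fn parse_int_literal(src: &str) -> Result<(u128, Option<&str>), String> {
    // Split off a type suffix such as `u8` or `usize`, if present.
    const SUFFIXES: [&str; 12] = [
        "i8", "i16", "i32", "i64", "i128", "isize",
        "u8", "u16", "u32", "u64", "u128", "usize",
    ];
    let (digits, suffix) = match SUFFIXES.iter().find(|s| src.ends_with(**s)) {
        Some(s) => (&src[..src.len() - s.len()], Some(*s)),
        None => (src, None),
    };

    // Pick the radix from the prefix; upper- and lowercase are both accepted.
    let (radix, body) = match digits.get(..2) {
        Some("0x") | Some("0X") => (16, &digits[2..]),
        Some("0o") | Some("0O") => (8, &digits[2..]),
        Some("0b") | Some("0B") => (2, &digits[2..]),
        _ => (10, digits),
    };

    // Underscores are purely visual separators.
    let cleaned: String = body.chars().filter(|&c| c != '_').collect();
    if cleaned.is_empty() || !cleaned.chars().all(|c| c.is_digit(radix)) {
        return Err(format!("invalid integer literal `{src}`"));
    }
    let value = u128::from_str_radix(&cleaned, radix)
        .map_err(|_| format!("integer literal `{src}` is too large"))?;
    Ok((value, suffix))
}

/// Narrows the magnitude to `u8`, reporting overflow as an error.
fn to_u8(src: &str) -> Result<u8, String> {
    let (value, _suffix) = parse_int_literal(src)?;
    u8::try_from(value).map_err(|_| format!("integer out of range for u8: `{src}`"))
}
```

Parsing into `u128` first and narrowing with `try_from` afterwards keeps the overflow check in one place: `to_u8("255")` succeeds, while `to_u8("256")` produces the out-of-range error.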

### Negative Integer Literals

Negative numbers are supported for signed types. In Rust's token stream, `-42` is two tokens: a `-` punctuation token followed by the positive literal `42`. `litext` handles this automatically:

```rust
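// `tokenstream(..)` is assumed to be a helper (not defined in this document)
// that parses a string into a `TokenStream`.
let neg_i32: i32 = litext!(tokenstream("-42"));  // Works: returns -42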
let neg_f64: f64 = litext!(tokenstream("-3.14")); // Works: returns -3.14

let neg_u32: u32 = litext!(tokenstream("-42"));   // Error: cannot negate unsigned
```

Supported signed types: `i8`, `i16`, `i32`, `i64`, `i128`, `isize`, `f32`, `f64`.

The `LitInt<T>` and `LitFloat<T>` wrappers also support negative literals.
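
The two-token shape of a negative number can be sketched as follows. `Tok`, `extract_i32`, and `extract_u32` are toy stand-ins invented for this example, not `proc_macro2` or `litext` types, and whether `litext` treats `i32::MIN` exactly this way is an assumption.

```rust
/// Toy stand-ins for the two token kinds involved.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Minus,
    Int(u128), // magnitude of a positive integer literal
}

fn out_of_range() -> String {
    "integer out of range".to_string()
}

/// Combines an optional leading `-` with the literal that follows it.
fn extract_i32(tokens: &[Tok]) -> Result<i32, String> {
    match tokens {
        [Tok::Int(m)] => i32::try_from(*m).map_err(|_| out_of_range()),
        [Tok::Minus, Tok::Int(m)] => {
            // Negate in a wider type: the magnitude of i32::MIN does not fit in i32.
            let wide = i128::try_from(*m).map_err(|_| out_of_range())?;
            i32::try_from(-wide).map_err(|_| out_of_range())
        }
        _ => Err("expected a literal, optionally preceded by `-`".to_string()),
    }
}

/// Unsigned targets reject the `-` token outright.
fn extract_u32(tokens: &[Tok]) -> Result<u32, String> {
    match tokens {
        [Tok::Int(m)] => u32::try_from(*m).map_err(|_| out_of_range()),
        [Tok::Minus, Tok::Int(_)] => Err("cannot negate an unsigned integer".to_string()),
        _ => Err("expected a literal".to_string()),
    }
}
```

The detour through `i128` matters for the edge case `-2147483648`: the magnitude `2147483648` overflows `i32` on its own, but the negated value is exactly `i32::MIN`.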

### Floats

| Format | Example |
|--------|---------|
| Standard | `3.14` |
| Scientific | `1e10`, `2.5e-3`, `1E10` |
| Underscore separators | `1_000.5` |
| Type suffix | `3.14f32`, `1e10f64` |
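
For comparison with the integer case, a float literal can be reduced to something `str::parse` understands by stripping the suffix and the underscores. This is a rough sketch with a hypothetical function name; note that `f64::from_str` is more permissive than Rust's literal grammar (it also accepts inputs such as `inf` or `.5`), so a real implementation would need stricter validation.

```rust
/// Parses the text of a float literal into its value and optional suffix.
fn parse_float_literal(src: &str) -> Result<(f64, Option<&str>), String> {
    // Split off an explicit `f32` / `f64` suffix, if present.
    let (body, suffix) = if let Some(b) = src.strip_suffix("f32") {
        (b, Some("f32"))
    } else if let Some(b) = src.strip_suffix("f64") {
        (b, Some("f64"))
    } else {
        (src, None)
    };

    // Underscores are visual separators and carry no meaning.
    let cleaned: String = body.chars().filter(|&c| c != '_').collect();
    let value = cleaned
        .parse::<f64>()
        .map_err(|_| format!("invalid float literal `{src}`"))?;
    Ok((value, suffix))
}
```

With this sketch, `parse_float_literal("1_000.5")` returns the value `1000.5` with no suffix, and `parse_float_literal("3.14f32")` returns `3.14` together with `Some("f32")`.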

### Characters

Full escape support: `'a'`, `'\n'`, `'\t'`, `'\\'`, `'\''`, `'\"'`, `'\0'`, `'\x41'`, `'\u{1F600}'`.

### Booleans

`true` and `false` are parsed as identifier tokens, not literals. `litext` handles this transparently in every extraction form.

### Byte Literals

`b'a'`, `b'\n'`, `b'\xff'`. The full 0x00..=0xFF range is valid for `\x` escapes in byte literals.

### Byte Strings

`b"hello"`, `b"\xff\x80"`, `br#"raw bytes"#`. The full byte range is supported for `\x` escapes.

### C Strings

`c"hello"`, `cr#"raw"#`. Interior null bytes are rejected with a compile error.
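
The interior-null rule matches what the standard library enforces for `CString`. The sketch below shows that check in isolation; `to_c_string` is a hypothetical helper, and the error is a plain `String` rather than a compile error.

```rust
use std::ffi::CString;

/// Builds a C string from decoded literal bytes, rejecting interior NUL bytes.
fn to_c_string(bytes: &[u8]) -> Result<CString, String> {
    CString::new(bytes).map_err(|e| {
        format!(
            "C string literal contains an interior null byte at offset {}",
            e.nul_position()
        )
    })
}
```

`CString::new` appends the terminating NUL itself, so the input must not contain one: `to_c_string(b"hello")` succeeds, while `to_c_string(b"he\0llo")` fails and reports offset 2.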

## Error Handling

All errors are returned as `TokenStream` values containing `compile_error!` invocations. When such a stream is returned from a proc-macro entry point, the compiler reports the error at the corresponding source location.

### Common Error Cases

| Input | Expected Type | Error |
|-------|---------------|-------|
| Empty | Any | "expected a literal, got nothing" |
| `"a" "b"` | String | "expected exactly one literal" |
| `42` | String | "expected a string literal" |
| `256` | u8 | "integer out of range" |
| `-42` | u8 | "cannot negate an unsigned integer" |
| `b"hello"` | String | "expected a string literal, not a byte string" |
| `c"\0"` | CString | "C string literal contains an interior null byte" |

### Span Preservation

Errors preserve the span of the problematic token, enabling precise compiler diagnostics.

## Implementation Details

### proc_macro2 Dependency

`litext` uses `proc_macro2` instead of `proc_macro` so it can be tested outside a proc-macro context. At your proc-macro entry point, convert between the two:

```rust
#[proc_macro]
pub fn my_macro(input: proc_macro::TokenStream) -> proc_macro::TokenStream {
    my_macro_inner(input.into()).into()
}

fn my_macro_inner(input: proc_macro2::TokenStream) -> proc_macro2::TokenStream {
    let s: String = litext!(input);
    // ...
}
```

### How Extract Works

The `extract<T>` function and `litext!` macro work as follows:

1. Convert input to an iterator over tokens
2. Handle special cases (negative literals with `-` punct)
3. Match on token type (Literal, Ident, Punct, Group)
4. Call the appropriate `FromLit` method
5. Return `Result<T, TokenStream>`
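
The steps above can be sketched with toy types. Everything here is an assumption made for illustration: `Token` stands in for the real token tree (the `Group` case is omitted), `FromLitSketch` only mirrors the shape of `FromLit`, and errors are `String`s instead of `TokenStream`s.

```rust
/// Toy token type standing in for the real token tree.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Literal(String),
    Ident(String),
    Punct(char),
}

/// Hypothetical mirror of the `FromLit` shape, using `String` errors.
trait FromLitSketch: Sized {
    fn from_lit(lit: &str) -> Result<Self, String>;

    // Default: identifiers are rejected unless the type opts in.
    fn from_ident(ident: &str) -> Result<Self, String> {
        Err(format!("expected a literal, found identifier `{ident}`"))
    }

    // Default: negative literals are rejected unless the type opts in.
    fn from_negative_lit(lit: &str) -> Result<Self, String> {
        Err(format!("cannot negate `{lit}`"))
    }
}

impl FromLitSketch for i32 {
    fn from_lit(lit: &str) -> Result<Self, String> {
        lit.parse::<i32>().map_err(|_| format!("invalid integer `{lit}`"))
    }
    fn from_negative_lit(lit: &str) -> Result<Self, String> {
        format!("-{lit}")
            .parse::<i32>()
            .map_err(|_| format!("invalid integer `-{lit}`"))
    }
}

impl FromLitSketch for bool {
    fn from_lit(lit: &str) -> Result<Self, String> {
        Err(format!("expected `true` or `false`, found `{lit}`"))
    }
    fn from_ident(ident: &str) -> Result<Self, String> {
        match ident {
            "true" => Ok(true),
            "false" => Ok(false),
            other => Err(format!("expected `true` or `false`, found `{other}`")),
        }
    }
}

/// Steps 1-5: inspect the tokens, handle the `-` special case, then dispatch.
fn extract<T: FromLitSketch>(tokens: &[Token]) -> Result<T, String> {
    match tokens {
        [] => Err("expected a literal, got nothing".to_string()),
        [Token::Punct('-'), Token::Literal(lit)] => T::from_negative_lit(lit),
        [Token::Literal(lit)] => T::from_lit(lit),
        [Token::Ident(id)] => T::from_ident(id),
        _ => Err("expected exactly one literal".to_string()),
    }
}
```

Because the trait methods for identifiers and negative literals default to an error, a type such as `bool` only has to override `from_ident`, and a negated `bool` input is rejected without any extra code.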

### Trait Implementations

The `FromLit` trait is the extension point. See the trait documentation for all methods you can override.

### ToTokens Trait

The `ToTokens` trait converts span-aware types back to `TokenStream`. All `Lit*` types implement it. This enables round-tripping, validation patterns, and re-emission with preserved spans.

## Planned Features

- Planned for 2.0: feature flags that gate the `FromLit` impls for standard types, making `litext` lighter

Already released:

- Negative numbers (added in 1.2)
- Two-character separators (added in 1.3)

## Requirements

- Rust 2024 edition (Rust 1.85 or later)
- `proc_macro2` as a direct dependency in your crate