# litext - Detailed Documentation
This document covers advanced usage, edge cases, and implementation details.
## Quick Start
Extract a string literal:
```rust
use litext::{litext, TokenStream};
fn my_macro(input: TokenStream) -> TokenStream {
    let text: String = litext!(input);
    // Returns early with a compile error if extraction fails.
    // Use `litext!(input as String)` for the same result, explicitly typed.
}
```
Extract other literal types:
```rust
let num: i32 = litext!(input as i32);
let val: f64 = litext!(input as f64);
let ch: char = litext!(input as char);
let flag: bool = litext!(input as bool);
```
Keep the `Result` instead of returning early:
```rust
let result: Result<String, TokenStream> = litext!(try input);
match result {
    Ok(text) => { /* use text */ }
    Err(e) => return e,
}
// Same with an explicit type:
let result: Result<i32, TokenStream> = litext!(try input as i32);
```
## Extracting Multiple Literals at Once
`litext!` can extract several values from one `TokenStream` in a single call. Wrap the types in a tuple and put the separator token between them:
```rust
// Input TokenStream: "hello" , 42
let (s, n): (String, i32) = litext!(input as (String , i32));
// Three values separated by commas:
// Input: "hello" , 3.14 , true
let (s, f, b): (String, f64, bool) = litext!(input as (String , f64 , bool));
// Semicolon separator:
// Input: "key" ; 100
let (key, val): (String, u32) = litext!(input as (String ; u32));
```
The `try` form also works for tuples:
```rust
let result: Result<(String, i32), _> = litext!(try input as (String , i32));
```
Up to 12 values can be extracted in one call.
### Separator Rules
The separator written in the macro call and the separator in the actual `TokenStream` must match.
**Single-character separators**: The following ASCII punctuation characters work: `,`, `;`, `|`, `&`, `+`, `-`, `*`, `/`, `%`, `^`, `!`, `?`, `<`, `>`, `=`, `.`, `:`, `@`, `~`.
**Two-character separators**: The following multi-character punctuation sequences are supported:
- `->` (arrow)
- `=>` (fat arrow)
- `::` (double colon)
- `..` (dot dot, range)
```rust
// Input: "key" -> 42
let (k, v): (String, i32) = litext!(input as (String -> i32));
// Input: "opt" => "val"
let (k, v): (String, String) = litext!(input as (String => String));
// Input: "ns" :: "item"
let (ns, item): (String, String) = litext!(input as (String :: String));
// Input: 1 .. 10
let (start, end): (i32, i32) = litext!(input as (i32 .. i32));
// Mixed: arrow + comma
// Input: "a" -> 1 , true
let (s, n, b): (String, i32, bool) = litext!(input as (String -> i32 , bool));
```
Grouping delimiters (`(`, `)`, `[`, `]`, `{`, `}`) and `$` are not valid separators.
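To make the matching rule concrete, here is a minimal, std-only sketch of separator checking over a toy token list. It is not `litext`'s implementation: the `Tok` enum and `split_by_separators` function are hypothetical names invented for illustration, and the sketch ignores the joint/alone spacing that real `proc_macro2` punctuation tokens carry.

```rust
/// A toy token: either a literal's source text or one punctuation character.
/// (Hypothetical stand-in for a real token tree.)
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Lit(String),
    Punct(char),
}

/// Collects the literal texts from `tokens`, requiring `seps[i]` to appear
/// between value `i` and value `i + 1`. A two-character separator such as
/// `->` is matched as two consecutive punctuation tokens.
fn split_by_separators(tokens: &[Tok], seps: &[&str]) -> Result<Vec<String>, String> {
    let mut values = Vec::new();
    let mut pos = 0;
    for i in 0..=seps.len() {
        // Every value position must hold a literal.
        match tokens.get(pos) {
            Some(Tok::Lit(text)) => values.push(text.clone()),
            other => return Err(format!("expected a literal at token {pos}, got {other:?}")),
        }
        pos += 1;
        // Between values, the declared separator must match exactly.
        if let Some(sep) = seps.get(i) {
            for expected in sep.chars() {
                match tokens.get(pos) {
                    Some(Tok::Punct(c)) if *c == expected => pos += 1,
                    other => return Err(format!("expected `{sep}` at token {pos}, got {other:?}")),
                }
            }
        }
    }
    if pos != tokens.len() {
        return Err(format!("unexpected trailing tokens starting at {pos}"));
    }
    Ok(values)
}
```

With the input `"key" -> 42`, declaring `->` as the separator succeeds, while declaring `,` fails because the separator in the call and in the stream no longer match.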
### Type Constraints
Type arguments in tuple position must be single-token identifiers. Generic types like `LitInt<u8>` span multiple tokens and cannot appear in this position. Use the default forms instead: `LitInt` (which defaults to `i32`) and `LitFloat` (which defaults to `f64`).
## Span-Aware Types
Every literal kind has a span-aware wrapper that bundles the parsed value with its source location. Use these when you need to emit a compiler diagnostic pointing at the exact literal in the user's code.
| Plain type | Span-aware type | Literal forms |
|---|---|---|
| `String` | `LitStr` | `"..."`, `r#"..."#` |
| `i8` to `i128`, `isize` | `LitInt<T>` (default: `i32`) | `42`, `0xFF`, `0b1010` |
| `u8` to `u128`, `usize` | `LitInt<T>` (default: `i32`) | same as above |
| `f32`, `f64` | `LitFloat<T>` (default: `f64`) | `3.14`, `1e10` |
| `bool` | `LitBool` | `true`, `false` |
| `char` | `LitChar` | `'a'`, `'\n'` |
| `u8` | `LitByte` | `b'a'`, `b'\xff'` |
| `Vec<u8>` | `LitByteStr` | `b"..."`, `br#"..."#` |
| `CString` | `LitCStr` | `c"..."`, `cr#"..."#` |
```rust
use litext::{litext, TokenStream};
use litext::literal::LitStr;
fn my_macro(input: TokenStream) -> TokenStream {
    let lit: LitStr = litext!(input as LitStr);
    if lit.value().is_empty() {
        return comperr::error(lit.span(), "string cannot be empty");
    }
    // Use lit.value() for the content, lit.span() for diagnostics.
}
```
```rust
use litext::literal::LitInt;
let lit: LitInt<u8> = litext!(input as LitInt<u8>);
let value: u8 = *lit.value();
if value == 0 {
    return comperr::error(lit.span(), "value cannot be zero");
}
```
All span-aware types expose:
- `.value()` - the parsed value
- `.span()` - the source location
- `.suffix()` - the explicit type suffix if present (`"i32"` for `42i32`, `None` for `42`)
### LitInt Details
`LitInt<T>` wraps an integer literal with its source span. The type parameter `T` must be one of the sealed integer types:
- Signed: `i8`, `i16`, `i32`, `i64`, `i128`, `isize`
- Unsigned: `u8`, `u16`, `u32`, `u64`, `u128`, `usize`
The default is `i32` when you use `LitInt` without a type parameter.
All integer literal formats are supported:
```rust
let dec = ok::<LitInt<i32>>("42");
let hex = ok::<LitInt<u32>>("0xFF");
let oct = ok::<LitInt<u32>>("0o77");
let bin = ok::<LitInt<u32>>("0b1010");
let with_underscores = ok::<LitInt<u32>>("1_000_000");
let with_suffix = ok::<LitInt<u8>>("255u8");
```
### LitFloat Details
`LitFloat<T>` wraps a float literal with its source span. The type parameter `T` must be `f32` or `f64`. The default is `f64`.
Formats supported:
```rust
let std = ok::<LitFloat<f64>>("3.14");
let scientific = ok::<LitFloat<f64>>("1e10");
let scientific_neg = ok::<LitFloat<f64>>("1e-10");
let underscores = ok::<LitFloat<f64>>("1_000.5");
let with_suffix = ok::<LitFloat<f32>>("3.14f32");
```
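As a rough illustration of what accepting these formats involves, the following std-only sketch normalizes float literal text before handing it to `str::parse`. It is an assumption about the general technique, not `litext`'s actual code; `parse_float_literal` is a hypothetical name, and the sketch does not reject inputs such as `inf` that `str::parse` accepts but Rust literals do not allow.

```rust
/// Parses float literal source text into an `f64` and reports the explicit
/// suffix, if one was written. (Hypothetical helper for illustration.)
fn parse_float_literal(src: &str) -> Result<(f64, Option<&'static str>), String> {
    // Split off an explicit `f32` / `f64` suffix.
    let (body, suffix) = if let Some(b) = src.strip_suffix("f32") {
        (b, Some("f32"))
    } else if let Some(b) = src.strip_suffix("f64") {
        (b, Some("f64"))
    } else {
        (src, None)
    };
    // Underscores are visual separators only, so drop them before parsing.
    let cleaned: String = body.chars().filter(|&c| c != '_').collect();
    let value: f64 = cleaned
        .parse()
        .map_err(|e| format!("invalid float literal `{src}`: {e}"))?;
    Ok((value, suffix))
}
```

Scientific notation needs no special handling here because `str::parse::<f64>` already understands exponents such as `1e-10`.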
## Round-Tripping with ToTokens
All span-aware types implement `ToTokens`, which converts them back into a `TokenStream`. This enables an extract, validate, emit pattern:
```rust
use litext::literal::{LitStr, ToTokens};
let lit: LitStr = litext!(input as LitStr);
if lit.value().starts_with("_") {
    return comperr::error(lit.span(), "string cannot start with underscore");
}
// Emit the literal back into the output with its original span preserved.
lit.to_token_stream()
```
The round-trip preserves the original span for diagnostics, but the exact token representation may differ slightly from the input.
## Custom Literal Types
Implement `FromLit` to make `litext!(input as MyType)` work for your own types:
```rust
use litext::literal::FromLit;
use proc_macro2::{Literal, TokenStream};
pub struct NonEmpty(String);

impl FromLit for NonEmpty {
    fn from_lit(lit: Literal) -> Result<Self, TokenStream> {
        let s = String::from_lit(lit)?;
        if s.is_empty() {
            return Err(comperr::error(
                proc_macro2::Span::call_site(),
                "string cannot be empty",
            ));
        }
        Ok(NonEmpty(s))
    }
}
// Then in your macro:
let val: NonEmpty = litext!(input as NonEmpty);
```
### from_ident
Override `from_ident` if your type is represented as an identifier token (like `bool`):
```rust
fn from_ident(ident: proc_macro2::Ident) -> Result<Self, TokenStream> {
    // handle true/false/etc.
}
```
### from_negative_lit
Override `from_negative_lit` if your type supports negative numeric literals:
```rust
fn from_negative_lit(lit: Literal) -> Result<Self, TokenStream> {
    // Handle negative numbers (dash token followed by literal)
}
```
The default implementation returns an error, which is correct for unsigned types and non-numeric types.
## What Is Supported
### Strings
| Form | Example |
|---|---|
| Regular | `"hello world"` |
| Escape sequences | `"\n"`, `"\t"`, `"\\"`, `"\""`, `"\0"`, `"\x41"`, `"\u{1F600}"` |
| Line continuation | `"hello \` + newline + `world"` |
| Raw | `r#"no escapes here"#` |
| Raw (multiple hashes) | `r##"can contain #"##` |
Escape sequence details:
- Basic: `\n`, `\r`, `\t`, `\\`, `\"`, `\0`
- Hex: `\x41` (two hex digits, range 0x00-0x7F for strings, 0x00-0xFF for bytes)
- Unicode: `\u{1F600}` (any valid Unicode scalar value)
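The escape rules above can be sketched in plain Rust as follows. This is a simplified illustration under stated assumptions, not the crate's code: `parse_hex` and `unescape_str` are hypothetical names, the input is the literal body with quotes already removed, and line continuation is left out for brevity.

```rust
/// Parses 1 to 6 ASCII hex digits; rejects signs, empty input, and longer runs.
fn parse_hex(digits: &str) -> Option<u32> {
    if digits.is_empty() || digits.len() > 6 || !digits.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    u32::from_str_radix(digits, 16).ok()
}

/// Resolves escape sequences in the body of a non-raw string literal.
fn unescape_str(body: &str) -> Result<String, String> {
    let mut out = String::new();
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('r') => out.push('\r'),
            Some('t') => out.push('\t'),
            Some('\\') => out.push('\\'),
            Some('"') => out.push('"'),
            Some('0') => out.push('\0'),
            Some('x') => {
                // Exactly two hex digits; strings only allow 0x00-0x7F.
                let hex: String = chars.by_ref().take(2).collect();
                match parse_hex(&hex) {
                    Some(v) if hex.len() == 2 && v <= 0x7F => out.push(v as u8 as char),
                    _ => return Err(format!("invalid \\x escape `{hex}` in string")),
                }
            }
            Some('u') => {
                if chars.next() != Some('{') {
                    return Err("expected `{` after \\u".to_string());
                }
                // Collect digits up to the closing brace.
                let mut hex = String::new();
                loop {
                    match chars.next() {
                        Some('}') => break,
                        Some(h) => hex.push(h),
                        None => return Err("unterminated \\u escape".to_string()),
                    }
                }
                // `char::from_u32` rejects surrogates and out-of-range values.
                let ch = parse_hex(&hex)
                    .and_then(char::from_u32)
                    .ok_or_else(|| format!("invalid unicode escape `{hex}`"))?;
                out.push(ch);
            }
            other => return Err(format!("unknown escape: {other:?}")),
        }
    }
    Ok(out)
}
```

A byte-string variant would differ mainly in the `\x` arm, where the full 0x00-0xFF range is valid and the output is a `Vec<u8>` rather than a `String`.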
### Integers
| Form | Example |
|---|---|
| Decimal | `42` |
| Hexadecimal | `0xFF`, `0XFF` |
| Octal | `0o77`, `0O77` |
| Binary | `0b1010`, `0B1010` |
| Underscore separators | `1_000_000`, `0xFF_FF` |
| Type suffix | `42i32`, `255u8`, `100usize` |
All integer types are supported: `i8` to `i128`, `isize`, `u8` to `u128`, `usize`. Overflow returns a compile error.
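The sketch below shows one way the formats in the table and the overflow check could be handled using only the standard library. It is a hedged illustration rather than `litext`'s implementation; `parse_int_literal` and `parse_as` are hypothetical names, and the error strings are placeholders modelled on the messages listed under Error Handling.

```rust
/// Parses integer literal source text such as `0xFF_FF` or `255u8` into its
/// magnitude plus the explicit suffix, if any. (Hypothetical helper.)
fn parse_int_literal(src: &str) -> Result<(u128, Option<&'static str>), String> {
    const SUFFIXES: [&str; 12] = [
        "i8", "i16", "i32", "i64", "i128", "isize",
        "u8", "u16", "u32", "u64", "u128", "usize",
    ];
    // Suffixes start with `i` or `u`, which are not hex digits, so a simple
    // `ends_with` check cannot confuse a suffix with the digits before it.
    let suffix = SUFFIXES.iter().copied().find(|s| src.ends_with(*s));
    let body = match suffix {
        Some(s) => &src[..src.len() - s.len()],
        None => src,
    };
    // Pick the radix from the prefix; no prefix means decimal.
    let (radix, digits) = match body.get(..2) {
        Some("0x") | Some("0X") => (16, &body[2..]),
        Some("0o") | Some("0O") => (8, &body[2..]),
        Some("0b") | Some("0B") => (2, &body[2..]),
        _ => (10, body),
    };
    // Underscores are visual separators only.
    let cleaned: String = digits.chars().filter(|&c| c != '_').collect();
    let value = u128::from_str_radix(&cleaned, radix)
        .map_err(|e| format!("invalid integer literal `{src}`: {e}"))?;
    Ok((value, suffix))
}

/// Narrows the parsed magnitude to the requested type, reporting overflow.
fn parse_as<T: TryFrom<u128>>(src: &str) -> Result<T, String> {
    let (value, _suffix) = parse_int_literal(src)?;
    T::try_from(value).map_err(|_| format!("integer out of range: `{src}`"))
}
```

Parsing into the widest unsigned type first and narrowing afterwards keeps the range check in one place, whichever target type the caller asks for.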
### Negative Integer Literals
Negative numbers are supported for signed types. In Rust's token stream, `-42` is two tokens: a `-` punctuation token followed by the positive literal `42`. `litext` handles this automatically:
```rust
let neg_i32: i32 = litext!(tokenstream("-42")); // Works: returns -42
let neg_f64: f64 = litext!(tokenstream("-3.14")); // Works: returns -3.14
let neg_u32: u32 = litext!(tokenstream("-42")); // Error: cannot negate unsigned
```
Supported signed types: `i8`, `i16`, `i32`, `i64`, `i128`, `isize`, `f32`, `f64`.
The `LitInt<T>` and `LitFloat<T>` wrappers also support negative literals.
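Because the sign and the magnitude arrive as separate tokens, the most negative value of a signed type needs care: `2147483648` does not fit in `i32`, yet `-2147483648` does. The following std-only sketch shows one way to apply the sign safely. The function names are hypothetical, and how `litext` handles this edge case internally is not confirmed by this document.

```rust
/// Applies an optional leading `-` to a magnitude parsed from the positive
/// literal, then range-checks against `i32`. (Hypothetical helper.)
fn apply_sign_i32(negative: bool, magnitude: u128) -> Result<i32, String> {
    // Widen first so that negation can never overflow.
    let wide = i128::try_from(magnitude).map_err(|_| "integer out of range".to_string())?;
    let signed = if negative { -wide } else { wide };
    i32::try_from(signed).map_err(|_| "integer out of range".to_string())
}

/// Unsigned targets reject the `-` token outright.
fn apply_sign_u32(negative: bool, magnitude: u128) -> Result<u32, String> {
    if negative {
        return Err("cannot negate an unsigned integer".to_string());
    }
    u32::try_from(magnitude).map_err(|_| "integer out of range".to_string())
}
```

Range-checking after the sign is applied, rather than before, is what lets `-2147483648` succeed while a bare `2147483648` is still rejected.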
### Floats
| Form | Example |
|---|---|
| Standard | `3.14` |
| Scientific | `1e10`, `2.5e-3`, `1E10` |
| Underscore separators | `1_000.5` |
| Type suffix | `3.14f32`, `1e10f64` |
### Characters
Full escape support: `'a'`, `'\n'`, `'\t'`, `'\\'`, `'\''`, `'\"'`, `'\0'`, `'\x41'`, `'\u{1F600}'`.
### Booleans
`true` and `false` are parsed as identifier tokens, not literals. Every `litext` extraction form handles this transparently.
### Byte Literals
`b'a'`, `b'\n'`, `b'\xff'`. The full 0x00..=0xFF range is valid for `\x` escapes in byte literals.
### Byte Strings
`b"hello"`, `b"\xff\x80"`, `br#"raw bytes"#`. The full byte range is supported for `\x` escapes.
### C Strings
`c"hello"`, `cr#"raw"#`. Interior null bytes are rejected with a compile error.
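The interior-null rule mirrors the standard library's own constraint on `CString`. As a small sketch (the `to_c_string` wrapper is a hypothetical name, and the error text is modelled on the message listed under Error Handling), `CString::new` already reports the offending position:

```rust
use std::ffi::CString;

/// Builds a `CString` from already-unescaped bytes, rejecting interior NULs.
fn to_c_string(bytes: &[u8]) -> Result<CString, String> {
    CString::new(bytes).map_err(|e| {
        format!(
            "C string literal contains an interior null byte (at index {})",
            e.nul_position()
        )
    })
}
```

The terminating NUL is appended by `CString::new` itself, so the input must not include it.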
## Error Handling
All errors are returned as `TokenStream` values containing `compile_error!` invocations. When returned from a proc-macro entry point, the compiler displays the error at the correct source location.
### Common Error Cases
| Input | Target type | Error message |
|---|---|---|
| Empty | Any | "expected a literal, got nothing" |
| `"a" "b"` | String | "expected exactly one literal" |
| `42` | String | "expected a string literal" |
| `256` | u8 | "integer out of range" |
| `-42` | u8 | "cannot negate an unsigned integer" |
| `b"hello"` | String | "expected a string literal, not a byte string" |
| `c"\0"` | CString | "C string literal contains an interior null byte" |
### Span Preservation
Errors preserve the span of the problematic token, enabling precise compiler diagnostics.
## Implementation Details
### proc_macro2 Dependency
`litext` uses `proc_macro2` instead of `proc_macro` so it can be tested outside proc-macro context. At your proc-macro entry point, convert between the two:
```rust
#[proc_macro]
pub fn my_macro(input: proc_macro::TokenStream) -> proc_macro::TokenStream {
    my_macro_inner(input.into()).into()
}

fn my_macro_inner(input: proc_macro2::TokenStream) -> proc_macro2::TokenStream {
    let s: String = litext!(input);
    // ...
}
```
### How Extract Works
The `extract<T>` function and `litext!` macro work as follows:
1. Convert input to an iterator over tokens
2. Handle special cases (negative literals with `-` punct)
3. Match on token type (Literal, Ident, Punct, Group)
4. Call the appropriate `FromLit` method
5. Return `Result<T, TokenStream>`
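The steps above can be sketched with a toy token model that needs no external crates. Everything here is a simplified assumption for illustration: `MiniToken`, `MiniFromLit`, and `mini_extract` are hypothetical stand-ins for the real token tree, the `FromLit` trait, and `extract<T>`; errors are plain strings instead of `TokenStream` values; and `Group` tokens are omitted.

```rust
/// Toy token kinds standing in for a real token tree.
#[derive(Debug, Clone, PartialEq)]
enum MiniToken {
    Literal(String),
    Ident(String),
    Punct(char),
}

/// Stand-in for the extension trait: only `from_lit` is required, and the
/// other two methods default to an error.
trait MiniFromLit: Sized {
    fn from_lit(text: &str) -> Result<Self, String>;
    fn from_ident(name: &str) -> Result<Self, String> {
        Err(format!("expected a literal, got identifier `{name}`"))
    }
    fn from_negative_lit(_text: &str) -> Result<Self, String> {
        Err("cannot negate this literal".to_string())
    }
}

impl MiniFromLit for i32 {
    fn from_lit(text: &str) -> Result<Self, String> {
        text.parse::<i32>().map_err(|_| "expected an integer literal".to_string())
    }
    fn from_negative_lit(text: &str) -> Result<Self, String> {
        // Re-attach the sign before parsing so that i32::MIN is reachable.
        format!("-{text}").parse::<i32>().map_err(|_| "integer out of range".to_string())
    }
}

impl MiniFromLit for bool {
    fn from_lit(_text: &str) -> Result<Self, String> {
        Err("expected `true` or `false`".to_string())
    }
    fn from_ident(name: &str) -> Result<Self, String> {
        match name {
            "true" => Ok(true),
            "false" => Ok(false),
            _ => Err("expected `true` or `false`".to_string()),
        }
    }
}

/// Steps 1-5 in miniature: inspect the tokens, special-case a leading `-`,
/// dispatch on the token kind, and return a `Result`.
fn mini_extract<T: MiniFromLit>(tokens: &[MiniToken]) -> Result<T, String> {
    match tokens {
        [] => Err("expected a literal, got nothing".to_string()),
        [MiniToken::Punct('-'), MiniToken::Literal(text)] => T::from_negative_lit(text),
        [MiniToken::Literal(text)] => T::from_lit(text),
        [MiniToken::Ident(name)] => T::from_ident(name),
        _ => Err("expected exactly one literal".to_string()),
    }
}
```

Keeping the dispatch in one function and the per-type logic behind trait methods with error-returning defaults means a new type only has to override the cases it actually supports.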
### Trait Implementations
The `FromLit` trait is the extension point. See the trait documentation for all methods you can override.
### ToTokens Trait
The `ToTokens` trait converts span-aware types back to `TokenStream`. All `Lit*` types implement it. This enables round-tripping, validation patterns, and re-emission with preserved spans.
## Planned Features
- Feature flags gating the `FromLit` impls for standard types, so `litext` can be lighter (planned for 2.0)
- Negative numbers: shipped in 1.2
- Two-character separators: shipped in 1.3
## Requirements
- Rust 2024 edition (Rust 1.85 or later)
- `proc_macro2` as a direct dependency in your crate