qubit-codec 0.3.3

Reusable byte and text codecs for Rust applications
# Qubit Codec

[![Rust CI](https://github.com/qubit-ltd/rs-codec/actions/workflows/ci.yml/badge.svg)](https://github.com/qubit-ltd/rs-codec/actions/workflows/ci.yml)
[![Coverage Status](https://coveralls.io/repos/github/qubit-ltd/rs-codec/badge.svg?branch=main)](https://coveralls.io/github/qubit-ltd/rs-codec?branch=main)
[![Crates.io](https://img.shields.io/crates/v/qubit-codec.svg?color=blue)](https://crates.io/crates/qubit-codec)
[![Rust](https://img.shields.io/badge/rust-1.94+-blue.svg?logo=rust)](https://www.rust-lang.org)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)
[![中文文档](https://img.shields.io/badge/文档-中文版-blue.svg)](README.zh_CN.md)

Reusable byte and text codecs for Rust applications.

## Overview

Qubit Codec provides small, explicit codecs for stable byte and text encodings
commonly needed across Qubit Rust crates and applications. Its API stays
lightweight, typed, and idiomatic, with direct concrete methods for common use
cases and traits for generic boundaries.

This crate focuses on textual encodings with clear wire-format semantics:

- hexadecimal byte strings
- Base64 byte strings
- C integer literal fragments
- C string literal byte fragments
- percent-encoded UTF-8 text
- `application/x-www-form-urlencoded` UTF-8 text fragments

It intentionally does not replace Rust's `Display`, `FromStr`, `TryFrom`, or
`serde` APIs for ordinary object conversion.

## Design Goals

- **Explicit Semantics**: each codec documents its alphabet, separator, padding,
  and decoding rules.
- **Small API Surface**: expose direct `encode` and `decode` methods first, with
  traits available for generic call sites.
- **No Hidden Panics**: malformed input is reported as `CodecError` instead of
  panicking.
- **Composable Traits**: `Encoder`, `Decoder`, and `Codec` support reusable
  boundaries without forcing dynamic dispatch.
- **Reusable Implementations**: common encodings live in one crate instead of
  being reimplemented by downstream crates.
- **Minimal Dependencies**: rely on well-maintained crates only where they add
  real value.

## Features

### 🔡 **Hexadecimal Bytes**

- **Lowercase by Default**: `HexCodec::new()` produces contiguous lowercase hex.
- **Uppercase Mode**: `HexCodec::upper()` or `with_uppercase(true)` produces
  uppercase digits.
- **Optional Whole Prefix**: add and require a prefix such as `0x` before the
  entire encoded value.
- **Optional Per-Byte Prefix**: add and require a byte prefix such as `0x`
  before each encoded byte.
- **Optional Separator**: write and accept separators between bytes, such as
  `:` or a space.
- **Whitespace Handling**: optionally ignore ASCII whitespace while decoding.
- **Prefix Case Handling**: optionally ignore ASCII case when matching
  configured prefixes while decoding.
- **Buffer APIs**: `encode_into` and `decode_into` append into existing buffers.
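The prefix, separator, and case rules listed above can be illustrated with a small std-only sketch. It is not the crate's implementation: `hex_encode_sketch` and `hex_decode_sketch` are hypothetical helpers that only mirror the "add on encode, require on decode" behavior for a whole-value prefix and a separator.

```rust
/// Hypothetical helper (not part of qubit-codec): writes two hex digits per
/// byte, the prefix once at the start, and the separator between bytes.
fn hex_encode_sketch(bytes: &[u8], uppercase: bool, prefix: &str, separator: &str) -> String {
    let digits: &[u8; 16] = if uppercase {
        b"0123456789ABCDEF"
    } else {
        b"0123456789abcdef"
    };
    let mut out = String::from(prefix);
    for (index, &byte) in bytes.iter().enumerate() {
        if index > 0 {
            out.push_str(separator);
        }
        out.push(digits[(byte >> 4) as usize] as char);
        out.push(digits[(byte & 0x0f) as usize] as char);
    }
    out
}

/// Hypothetical helper: requires the same prefix and separator on input and
/// reports problems as plain strings instead of a typed error.
fn hex_decode_sketch(text: &str, prefix: &str, separator: &str) -> Result<Vec<u8>, String> {
    let mut rest = text
        .strip_prefix(prefix)
        .ok_or_else(|| format!("missing prefix {prefix:?}"))?;
    let mut out = Vec::new();
    let mut first = true;
    while !rest.is_empty() {
        if !first {
            // Every byte after the first must be preceded by the separator.
            rest = rest
                .strip_prefix(separator)
                .ok_or_else(|| "missing separator".to_string())?;
        }
        first = false;
        let pair = rest
            .get(..2)
            .ok_or_else(|| "truncated byte".to_string())?;
        let mut values = pair.chars().map(|c| c.to_digit(16));
        match (values.next().flatten(), values.next().flatten()) {
            (Some(hi), Some(lo)) => out.push((hi * 16 + lo) as u8),
            _ => return Err(format!("invalid hex digits {pair:?}")),
        }
        rest = &rest[2..];
    }
    Ok(out)
}
```

The sketch accepts hex digits in either case when decoding and omits the per-byte prefix, whitespace, and prefix-case options.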

### 🔐 **Base64 Bytes**

- **Standard Alphabet**: padded and no-padding standard Base64.
- **URL-Safe Alphabet**: padded and no-padding URL-safe Base64.
- **Typed Errors**: malformed input is reported as `CodecError::InvalidInput`.
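To make the difference between the four variants concrete, here is a minimal std-only encoder sketch. The crate itself delegates to the `base64` crate, so this is only an illustration; `STANDARD`, `URL_SAFE`, and `base64_encode_sketch` are hypothetical names.

```rust
/// Standard alphabet: indexes 62 and 63 are `+` and `/`.
const STANDARD: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
/// URL-safe alphabet: indexes 62 and 63 are `-` and `_`.
const URL_SAFE: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Hypothetical helper (not part of qubit-codec): encodes bytes with the
/// given alphabet, optionally padding the last group with `=`.
fn base64_encode_sketch(bytes: &[u8], alphabet: &[u8; 64], pad: bool) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into one 24-bit group, zero-filling the rest.
        let b0 = u32::from(chunk[0]);
        let b1 = u32::from(chunk.get(1).copied().unwrap_or(0));
        let b2 = u32::from(chunk.get(2).copied().unwrap_or(0));
        let group = (b0 << 16) | (b1 << 8) | b2;
        // A chunk of n bytes carries n + 1 significant 6-bit characters.
        let significant = chunk.len() + 1;
        for i in 0..4 {
            if i < significant {
                let index = (group >> (18 - 6 * i)) & 0x3f;
                out.push(alphabet[index as usize] as char);
            } else if pad {
                out.push('=');
            }
        }
    }
    out
}
```

The two alphabets differ only in their last two characters, which is why the same three bytes encode to `+//v` in the standard alphabet and `-__v` in the URL-safe one.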

### 🔤 **C String Literal Bytes**

- **Mixed Text and Escapes**: decodes fragments such as `PK\003\004` and
  `\xd0\xcf`.
- **C Escape Support**: handles simple, octal, hexadecimal, and universal byte
  escapes.
- **Byte-Oriented Output**: decodes directly to raw bytes without requiring
  UTF-8.
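As a rough illustration of byte-oriented escape decoding, the following std-only sketch handles a subset of the escapes named above: a few simple escapes, octal escapes of up to three digits, and `\x` escapes limited to two hex digits. `decode_c_fragment_sketch` is a hypothetical helper, and the two-digit limit is an assumption made for brevity; it is not a statement about how `CStringLiteralCodec` parses longer hex runs.

```rust
/// Hypothetical helper (not the crate's `CStringLiteralCodec`): decodes a
/// subset of C string literal escapes directly to raw bytes.
fn decode_c_fragment_sketch(text: &str) -> Result<Vec<u8>, String> {
    let bytes = text.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] != b'\\' {
            // Ordinary text is copied through byte for byte.
            out.push(bytes[i]);
            i += 1;
            continue;
        }
        let start = i;
        i += 1; // step over the backslash
        let Some(&kind) = bytes.get(i) else {
            return Err(format!("dangling backslash at index {start}"));
        };
        let simple = match kind {
            b'n' => Some(b'\n'),
            b't' => Some(b'\t'),
            b'r' => Some(b'\r'),
            b'\\' | b'"' | b'\'' => Some(kind),
            _ => None,
        };
        if let Some(byte) = simple {
            out.push(byte);
            i += 1;
        } else if kind == b'x' {
            i += 1;
            let (mut value, mut count) = (0u32, 0);
            // Assumption: read at most two hex digits per escape.
            while count < 2 {
                let Some(digit) = bytes.get(i).and_then(|b| (*b as char).to_digit(16)) else {
                    break;
                };
                value = value * 16 + digit;
                i += 1;
                count += 1;
            }
            if count == 0 {
                return Err(format!("\\x without hex digits at index {start}"));
            }
            out.push(value as u8);
        } else if (b'0'..=b'7').contains(&kind) {
            let (mut value, mut count) = (0u32, 0);
            // Octal escapes use one to three digits.
            while count < 3 {
                match bytes.get(i) {
                    Some(&d) if (b'0'..=b'7').contains(&d) => {
                        value = value * 8 + u32::from(d - b'0');
                        i += 1;
                        count += 1;
                    }
                    _ => break,
                }
            }
            if value > 0xff {
                return Err(format!("octal escape out of range at index {start}"));
            }
            out.push(value as u8);
        } else {
            return Err(format!("unsupported escape at index {start}"));
        }
    }
    Ok(out)
}
```

Because the output is a `Vec<u8>`, escapes such as `\xd0` can produce bytes that are not valid UTF-8 without any error.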

### 🔢 **C Integer Literals**

- **Radix Detection**: decodes decimal, octal, and `0x`/`0X` hexadecimal
  integer literals.
- **Unsigned Output**: returns `u64` for non-negative integer literal fragments.
- **Precise Errors**: reports invalid digits with their original input index.
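The radix rule can be sketched in a few lines of std-only Rust. `decode_c_integer_sketch` is a hypothetical helper, not the crate's `CIntegerLiteralCodec`; it assumes no integer suffixes (such as `u` or `L`) and reports errors as plain strings.

```rust
/// Hypothetical helper: picks the radix from the literal's prefix, then
/// accumulates digits into a `u64` with overflow checks.
fn decode_c_integer_sketch(text: &str) -> Result<u64, String> {
    let hex_body = text
        .strip_prefix("0x")
        .or_else(|| text.strip_prefix("0X"));
    // `offset` maps an index inside `digits` back to an index in `text`.
    let (radix, digits, offset) = if let Some(rest) = hex_body {
        (16, rest, 2)
    } else if text.len() > 1 && text.starts_with('0') {
        (8, &text[1..], 1)
    } else {
        (10, text, 0)
    };
    if digits.is_empty() {
        return Err("literal has no digits".to_string());
    }
    let mut value: u64 = 0;
    for (index, ch) in digits.char_indices() {
        let digit = ch
            .to_digit(radix)
            .ok_or_else(|| format!("invalid digit {ch:?} at index {}", index + offset))?;
        value = value
            .checked_mul(u64::from(radix))
            .and_then(|v| v.checked_add(u64::from(digit)))
            .ok_or_else(|| "literal does not fit in u64".to_string())?;
    }
    Ok(value)
}
```

A leading `0` followed by more digits selects octal, so `0123` decodes to 83, and `089` fails with the index of the offending `8` in the original text.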

### 🌐 **Percent-Encoding**

- **UTF-8 Text**: encodes and decodes UTF-8 strings.
- **RFC 3986 Unreserved Set**: leaves ASCII letters, digits, `-`, `.`, `_`, and
  `~` unchanged.
- **Uppercase Escapes**: writes percent escapes such as `%2F` and `%E4`.
- **Malformed Escape Detection**: reports truncated or invalid `%XX` sequences.
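The rules above fit in a short std-only sketch. `percent_encode_sketch` and `percent_decode_sketch` are hypothetical helpers written for illustration, not the code behind `PercentCodec`, and they report errors as plain strings.

```rust
/// Hypothetical helper: leaves the RFC 3986 unreserved set unchanged and
/// writes every other UTF-8 byte as an uppercase `%XX` escape.
fn percent_encode_sketch(text: &str) -> String {
    const HEX: &[u8; 16] = b"0123456789ABCDEF";
    let mut out = String::with_capacity(text.len());
    for &byte in text.as_bytes() {
        if byte.is_ascii_alphanumeric() || matches!(byte, b'-' | b'.' | b'_' | b'~') {
            out.push(byte as char);
        } else {
            out.push('%');
            out.push(HEX[(byte >> 4) as usize] as char);
            out.push(HEX[(byte & 0x0f) as usize] as char);
        }
    }
    out
}

/// Hypothetical helper: rejects truncated or non-hex `%XX` sequences and
/// decoded bytes that are not valid UTF-8.
fn percent_decode_sketch(text: &str) -> Result<String, String> {
    let bytes = text.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            let hi = bytes.get(i + 1).and_then(|b| (*b as char).to_digit(16));
            let lo = bytes.get(i + 2).and_then(|b| (*b as char).to_digit(16));
            match (hi, lo) {
                (Some(hi), Some(lo)) => out.push((hi * 16 + lo) as u8),
                _ => return Err(format!("malformed escape at index {i}")),
            }
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    String::from_utf8(out).map_err(|error| error.to_string())
}
```

Non-ASCII characters are escaped byte by byte, so `中` (UTF-8 bytes `E4 B8 AD`) becomes `%E4%B8%AD`.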

### 📝 **Form URL Encoding**

- **Form Fragment Codec**: handles `application/x-www-form-urlencoded` text
  fragments.
- **Space as Plus**: encodes spaces as `+` and decodes `+` back to spaces.
- **Percent Compatibility**: shares the same UTF-8 and `%XX` validation behavior
  as `PercentCodec`.

### 🎯 **Focused Public API**

- **`Encoder<Input>`**: encodes borrowed input into an associated output type.
- **`Decoder<Input>`**: decodes borrowed input into an associated output type.
- **`Codec<EncodeInput, DecodeInput>`**: combines encoder and decoder traits.
- **`CodecError` / `CodecResult`**: common error and result types for bundled
  codecs.

## Installation

Add this to your `Cargo.toml`:

```toml
[dependencies]
qubit-codec = "0.3.3"
```

## Quick Start

### Hexadecimal Bytes

```rust
use qubit_codec::HexCodec;

fn main() {
    let codec = HexCodec::upper()
        .with_prefix("0x")
        .with_separator(" ");

    let encoded = codec.encode(&[0x1f, 0x8b, 0x00, 0xff]);
    assert_eq!("0x1F 8B 00 FF", encoded);

    let decoded = codec
        .decode("0x1F 8B 00 FF")
        .expect("hex text should decode");
    assert_eq!(vec![0x1f, 0x8b, 0x00, 0xff], decoded);
}
```

### Base64 Bytes

```rust
use qubit_codec::Base64Codec;

fn main() {
    let codec = Base64Codec::standard();

    let encoded = codec.encode(b"hello");
    assert_eq!("aGVsbG8=", encoded);

    let decoded = codec
        .decode("aGVsbG8=")
        .expect("Base64 text should decode");
    assert_eq!(b"hello".to_vec(), decoded);
}
```

### URL-Safe Base64 Without Padding

```rust
use qubit_codec::Base64Codec;

fn main() {
    let codec = Base64Codec::url_safe_no_pad();

    let encoded = codec.encode(&[251, 255, 239]);
    assert_eq!("-__v", encoded);

    let decoded = codec
        .decode("-__v")
        .expect("URL-safe Base64 text should decode");
    assert_eq!(vec![251, 255, 239], decoded);
}
```

### C String Literal Bytes

```rust
use qubit_codec::CStringLiteralCodec;

fn main() {
    let codec = CStringLiteralCodec::new();

    let decoded = codec
        .decode(r"PK\003\004")
        .expect("C string literal should decode");
    assert_eq!(b"PK\x03\x04".to_vec(), decoded);

    let encoded = codec.encode(&[0xd0, 0xcf, 0x11, 0xe0]);
    assert_eq!(r"\xD0\xCF\x11\xE0", encoded);
}
```

### C Integer Literals

```rust
use qubit_codec::CIntegerLiteralCodec;

fn main() {
    let codec = CIntegerLiteralCodec::new();

    assert_eq!(123, codec.decode("123").expect("decimal should decode"));
    assert_eq!(83, codec.decode("0123").expect("octal should decode"));
    assert_eq!(
        0xbeef_c0de,
        codec.decode("0xBEEFC0DE").expect("hex should decode")
    );
}
```

### Percent-Encoding UTF-8 Text

```rust
use qubit_codec::PercentCodec;

fn main() {
    let codec = PercentCodec::new();

    let encoded = codec.encode("a b/中");
    assert_eq!("a%20b%2F%E4%B8%AD", encoded);

    let decoded = codec
        .decode("a%20b%2F%E4%B8%AD")
        .expect("percent-encoded text should decode");
    assert_eq!("a b/中", decoded);
}
```

### Form URL Encoding

```rust
use qubit_codec::FormUrlencodedCodec;

fn main() {
    let codec = FormUrlencodedCodec::new();

    let encoded = codec.encode("name=Qubit Codec");
    assert_eq!("name%3DQubit+Codec", encoded);

    let decoded = codec
        .decode("name%3DQubit+Codec")
        .expect("form-url-encoded text should decode");
    assert_eq!("name=Qubit Codec", decoded);
}
```

### Generic Trait Usage

Use the traits when application code should depend on an encoding capability
instead of a concrete codec type.

```rust
use qubit_codec::{
    CodecError,
    Encoder,
    HexCodec,
};

fn encode_payload<C>(codec: &C, payload: &[u8]) -> Result<String, CodecError>
where
    C: Encoder<[u8], Output = String, Error = CodecError>,
{
    codec.encode(payload)
}

fn main() {
    let text = encode_payload(&HexCodec::new(), &[0xab, 0xcd])
        .expect("hex encoding should not fail");
    assert_eq!("abcd", text);
}
```

## API Reference

### Trait Operations

| Trait | Method | Description |
|-------|--------|-------------|
| `Encoder<Input>` | `encode(&Input)` | Encode borrowed input into an associated output type |
| `Decoder<Input>` | `decode(&Input)` | Decode borrowed input into an associated output type |
| `Codec<EncodeInput, DecodeInput>` | - | Marker-style combination of `Encoder` and `Decoder` |
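To show how a codec might plug into trait bounds of this shape, here is a self-contained sketch. `EncoderSketch` and `DecoderSketch` are hypothetical stand-ins whose associated `Output` and `Error` types are inferred from the generic usage example in this README; they are not the crate's definitions. `BitStringCodec` and `encode_with` are likewise invented for illustration.

```rust
use std::convert::Infallible;

/// Hypothetical stand-in for an encoder trait with associated types.
trait EncoderSketch<Input: ?Sized> {
    type Output;
    type Error;
    fn encode(&self, input: &Input) -> Result<Self::Output, Self::Error>;
}

/// Hypothetical stand-in for the matching decoder trait.
trait DecoderSketch<Input: ?Sized> {
    type Output;
    type Error;
    fn decode(&self, input: &Input) -> Result<Self::Output, Self::Error>;
}

/// Toy codec: each byte becomes eight `0`/`1` characters.
struct BitStringCodec;

impl EncoderSketch<[u8]> for BitStringCodec {
    type Output = String;
    type Error = Infallible; // encoding cannot fail

    fn encode(&self, input: &[u8]) -> Result<String, Infallible> {
        Ok(input.iter().map(|byte| format!("{byte:08b}")).collect())
    }
}

impl DecoderSketch<str> for BitStringCodec {
    type Output = Vec<u8>;
    type Error = String;

    fn decode(&self, input: &str) -> Result<Vec<u8>, String> {
        if input.len() % 8 != 0 {
            return Err("length is not a multiple of 8".to_string());
        }
        let mut out = Vec::with_capacity(input.len() / 8);
        for chunk in input.as_bytes().chunks(8) {
            let mut value = 0u8;
            for &bit in chunk {
                value = match bit {
                    b'0' => value << 1,
                    b'1' => (value << 1) | 1,
                    _ => return Err("input contains a non-binary character".to_string()),
                };
            }
            out.push(value);
        }
        Ok(out)
    }
}

/// Generic call site: depends on the capability, not the concrete type,
/// and is resolved statically without dynamic dispatch.
fn encode_with<C>(
    codec: &C,
    payload: &[u8],
) -> Result<String, <C as EncoderSketch<[u8]>>::Error>
where
    C: EncoderSketch<[u8], Output = String>,
{
    codec.encode(payload)
}
```

Keeping `Output` and `Error` as associated types lets each codec choose its own result and error types while generic callers constrain only what they need.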

### `HexCodec` Operations

| Method | Description |
|--------|-------------|
| `new()` | Create a lowercase codec without prefix or separators |
| `upper()` | Create an uppercase codec without prefix or separators |
| `with_uppercase(enabled)` | Configure digit case |
| `with_prefix(prefix)` | Add and require a whole-output prefix, such as `0x1F8B` |
| `with_byte_prefix(prefix)` | Add and require a prefix before every byte, such as `0x1F 0x8B` |
| `with_separator(separator)` | Add and accept a separator between bytes |
| `with_ignored_ascii_whitespace(enabled)` | Ignore ASCII whitespace while decoding |
| `with_ignore_prefix_case(enabled)` | Ignore ASCII case when matching configured prefixes while decoding |
| `encode(bytes)` | Encode bytes into hexadecimal text |
| `encode_into(bytes, output)` | Append encoded text into an existing `String` |
| `decode(text)` | Decode hexadecimal text into bytes |
| `decode_into(text, output)` | Append decoded bytes into an existing `Vec<u8>` |

### `Base64Codec` Operations

| Method | Alphabet | Padding | Description |
|--------|----------|---------|-------------|
| `standard()` | Standard | Yes | Create standard Base64 codec |
| `standard_no_pad()` | Standard | No | Create standard Base64 codec without padding |
| `url_safe()` | URL-safe | Yes | Create URL-safe Base64 codec |
| `url_safe_no_pad()` | URL-safe | No | Create URL-safe Base64 codec without padding |
| `encode(bytes)` | Configured | Configured | Encode bytes into Base64 text |
| `decode(text)` | Configured | Configured | Decode Base64 text into bytes |

### `CStringLiteralCodec` Operations

| Method | Description |
|--------|-------------|
| `new()` | Create a C string literal byte codec |
| `encode(bytes)` | Encode bytes into a C string literal fragment |
| `decode(text)` | Decode a C string literal fragment into bytes |

### `CIntegerLiteralCodec` Operations

| Method | Description |
|--------|-------------|
| `new()` | Create a C integer literal decoder |
| `decode(text)` | Decode a non-negative C integer literal fragment into `u64` |

### Text Codec Operations

| Type | Method | Description |
|------|--------|-------------|
| `PercentCodec` | `new()` | Create a percent codec |
| `PercentCodec` | `encode(text)` | Encode UTF-8 text using percent encoding |
| `PercentCodec` | `decode(text)` | Decode percent-encoded UTF-8 text |
| `FormUrlencodedCodec` | `new()` | Create a form-url-encoded codec |
| `FormUrlencodedCodec` | `encode(text)` | Encode UTF-8 text, using `+` for spaces |
| `FormUrlencodedCodec` | `decode(text)` | Decode UTF-8 text, treating `+` as spaces |

## Error Handling

Bundled decoders return `CodecResult<T>`, an alias for
`Result<T, CodecError>`.

| Error | Meaning |
|-------|---------|
| `MissingPrefix` | A configured whole or per-byte hex prefix was required but missing |
| `InvalidDigit` | Input contained a digit that is invalid for the requested radix |
| `InvalidLength` | Input length does not satisfy a codec requirement |
| `InvalidEscape` | Input contained a malformed or unsupported escape sequence |
| `InvalidCharacter` | Input contained a character that cannot appear in that context |
| `InvalidInput` | Input was rejected by a codec-specific validator |
| `InvalidUtf8` | Decoded bytes were not valid UTF-8 |

## Testing & Code Coverage

This project keeps codec behavior covered by integration tests under `tests/`.

### Running Tests

```bash
# Run all tests
cargo test

# Run with coverage report
./coverage.sh

# Generate a text-format coverage report
./coverage.sh text

# Align code with CI requirements
./align-ci.sh

# Run CI checks (format, clippy, test, coverage, audit)
./ci-check.sh
```

## Dependencies

Runtime dependencies are intentionally small:

- `base64` provides the Base64 engines.
- `thiserror` provides the public error type implementation.

## License

Copyright (c) 2026. Haixing Hu.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

See [LICENSE](LICENSE) for the full license text.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

### Development Guidelines

- Follow Rust API Guidelines.
- Keep tests comprehensive and deterministic.
- Document public APIs and behavior changes.
- Ensure all checks pass before submitting a PR.

## Author

**Haixing Hu**

## Related Projects

More Rust libraries from Qubit are available under the
[qubit-ltd](https://github.com/qubit-ltd) GitHub organization.

---

Repository: [https://github.com/qubit-ltd/rs-codec](https://github.com/qubit-ltd/rs-codec)