redaction 0.1.7

Layered data redaction controls: classification and redaction
# Redaction

`redaction` helps you keep sensitive values (tokens, secrets, PII) out of places they don't belong by:

- Deriving `Sensitive` on your types with `#[derive(Sensitive)]`
- Marking sensitive fields with `#[sensitive]` or `#[sensitive(Classification)]`
- Calling `.redact()` to produce a copy that is safe to log or serialize
- Generating `Debug` output that prints `"[REDACTED]"` for sensitive fields (independent of policies)

## Design (DDD / Clean Architecture friendly)

- **Classifications are domain concepts**: marker types like `Secret`, `Token`, or your own `InternalId` represent *what kind* of data a field contains.
- **Policies belong in the application layer**: policies are attached to classification types (`impl RedactionPolicy for MyClassification`) in the layer where you define "what is safe to expose", typically close to logging/serialization boundaries.
- **Sinks are optional adapters**: integrations like `slog` live behind feature flags; your domain types don't depend on a logging framework.
- **Layering is optional**: you can put classifications, policies, and redaction calls in a single crate if you prefer. The library supports both "clean architecture" layering and simple, pragmatic project layouts.

## The Problem

Sensitive data ends up in places it shouldn't:

- **Logging**: Structured logs capture request/response bodies containing passwords, tokens, PII
- **Serialization**: API responses, database exports, and message queues include fields that should be hidden
- **Error reporting**: Stack traces and error contexts expose sensitive state
- **Debug output**: `#[derive(Debug)]` prints everything, including secrets

Once sensitive data reaches these systems, it is often:

- Stored long-term (retention policies, backups)
- Indexed and searchable
- Replicated across environments
- Visible to anyone with access to logs/telemetry

```rust
#[derive(Debug, serde::Serialize)]
struct LoginRequest {
    username: String,
    password: String,
}

let request = LoginRequest {
    username: "alice".into(),
    password: "hunter2".into(),
};

// Debug output exposes the password
println!("{:?}", request);
// → LoginRequest { username: "alice", password: "hunter2" }

// Serialization also exposes the password
let json = serde_json::to_string(&request).unwrap();
// → {"username":"alice","password":"hunter2"}
//
// (This example uses `serde_json` to make the risk concrete. The same problem
// exists with any serializer that includes `password`.)
```

## The Solution

Mark sensitive fields explicitly. This crate provides:

- **Safe Debug**: sensitive fields print as `[REDACTED]`
- **Explicit redaction**: call `.redact()` to get a copy safe for serialization and logging
- **Policy control**: choose how each classification is redacted (full, keep, mask)
- **External type support**: types you don't control (like `chrono::DateTime`) just work

```rust
use redaction::{Redactable, Secret, Token, Sensitive};

#[derive(Clone, Sensitive)]
struct LoginRequest {
    username: String,
    #[sensitive(Secret)]
    password: String,
    #[sensitive(Token)]
    api_key: String,
}

let request = LoginRequest {
    username: "alice".into(),
    password: "hunter2".into(),
    api_key: "tok_live_abcdef".into(),
};

// Get a redacted copy for serialization, APIs, and logging
let safe = request.redact();
assert_eq!(safe.password, "[REDACTED]");       // Secret: fully redacted
assert_eq!(safe.api_key, "***********cdef");  // Token: only last 4 visible
assert_eq!(safe.username, "alice");            // Not sensitive: unchanged

// Debug output is also safe, but it does NOT apply policies:
// it always prints `"[REDACTED]"` for `#[sensitive(...)]` fields.
println!("{:?}", request);
// → LoginRequest { username: "alice", password: "[REDACTED]", api_key: "[REDACTED]" }
```

## Installation

```toml
[dependencies]
redaction = "0.1"
```

## Basic Usage

- Add `#[derive(Clone, Sensitive)]` to your type
- Mark sensitive fields with `#[sensitive]` or `#[sensitive(Classification)]`
- Call `.redact()` before you log, serialize, return, or persist the value

```rust
use redaction::{Redactable, Secret, Token, Sensitive};

#[derive(Clone, Sensitive)]
struct ApiCredentials {
    #[sensitive(Secret)]
    password: String,
    #[sensitive(Token)]
    api_key: String,
    user_id: String,  // not sensitive, passed through unchanged
}

let creds = ApiCredentials {
    password: "super_secret".into(),
    api_key: "tok_live_abcdef".into(),
    user_id: "user_42".into(),
};

let redacted = creds.redact();
assert_eq!(redacted.password, "[REDACTED]");  // Secret → fully redacted
assert_eq!(redacted.api_key, "***********cdef"); // Token → only last 4 visible
assert_eq!(redacted.user_id, "user_42");      // unchanged
```

## Field Attributes

The `#[sensitive(...)]` attribute controls how each field is handled:

| Attribute | Use For | Behavior |
|-----------|---------|----------|
| *(none)* | Non-sensitive fields, external types | Pass through unchanged |
| `#[sensitive]` | Scalars OR nested `Sensitive` types | Walk containers, or redact scalars to default |
| `#[sensitive(Class)]` | String-like leaf values | Apply classification's redaction policy |

Classifications are for string-like leaf values; the field type must implement `SensitiveValue`
and `Classifiable`.

### Examples

```rust
use chrono::{DateTime, Utc};
use redaction::{Redactable, Secret, Sensitive};
use rust_decimal::Decimal;

#[derive(Clone, Sensitive)]
struct Address {
    #[sensitive(Secret)]
    street: String,
    city: String,        // Not sensitive
}

#[derive(Clone, Sensitive)]
struct User {
    #[sensitive(Secret)]
    ssn: String,         // Leaf value: apply Secret policy
    
    #[sensitive]
    address: Address,    // Nested struct: walk into it
    
    #[sensitive]
    age: i32,            // Scalar: redact to default (0)
    
    created_at: DateTime<Utc>,  // External type: passes through unchanged
    balance: Decimal,           // External type: passes through unchanged
}
```

### External Types Just Work

Fields without `#[sensitive]` pass through unchanged. This means external types like `chrono::DateTime`,
`rust_decimal::Decimal`, `uuid::Uuid`, or any type you don't control work automatically. Do not
add `#[sensitive]` unless the type implements `SensitiveType`.

```rust
use chrono::{DateTime, Utc};
use redaction::{Secret, Sensitive};
use rust_decimal::Decimal;
use uuid::Uuid;

#[derive(Clone, Sensitive)]
struct Transaction {
    #[sensitive(Secret)]
    account_number: String,
    
    // No annotation needed - external types pass through!
    timestamp: DateTime<Utc>,
    amount: Decimal,
    id: Uuid,
}
```

### Nested Sensitive Types

When a field's type also derives `Sensitive`, use `#[sensitive]` to walk into it:

```rust
#[derive(Clone, Sensitive)]
struct Credentials {
    #[sensitive(Secret)]
    password: String,
}

#[derive(Clone, Sensitive)]
struct User {
    #[sensitive]  // Walk into Credentials
    creds: Credentials,
}
```

**Important**: Without `#[sensitive]`, nested structs pass through unchanged (even if they derive `Sensitive`). This is by design - you explicitly choose what to redact.
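
To make the opt-in behaviour concrete, here is a hand-written, std-only sketch of roughly what walking into a nested type amounts to. The `Redact` trait and the `cached` field are hypothetical stand-ins for illustration; this is not the code the derive generates.

```rust
// Hypothetical stand-in for the crate's traversal trait; not its real API.
trait Redact {
    fn redact(self) -> Self;
}

#[derive(Clone)]
struct Credentials {
    password: String,
}

impl Redact for Credentials {
    fn redact(self) -> Self {
        // Leaf value: replaced wholesale, like a full-redaction policy.
        Credentials { password: "[REDACTED]".to_string() }
    }
}

#[derive(Clone)]
struct User {
    creds: Credentials,  // treated as if annotated with #[sensitive]
    cached: Credentials, // treated as if left unannotated
}

impl Redact for User {
    fn redact(self) -> Self {
        User {
            creds: self.creds.redact(), // walked into
            cached: self.cached,        // passed through unchanged
        }
    }
}
```

In this sketch `cached` keeps its original password even though `Credentials` knows how to redact itself, which mirrors the pass-through rule described above.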

### Nested Wrapper Classifications

Classifications work on nested wrapper types like `Option<Vec<String>>` automatically:

```rust
use std::collections::HashMap;

use redaction::{Pii, Secret, Sensitive};

#[derive(Clone, Sensitive)]
struct User {
    #[sensitive(Pii)]
    emails: Option<Vec<String>>,  // Works! Recursively applies Pii to each String
    
    #[sensitive(Secret)]
    backup_codes: Vec<Option<String>>,  // Also works!
    
    #[sensitive(Secret)]
    metadata: HashMap<String, Vec<String>>,  // Maps work too!
}
```

The classification is applied recursively through any nesting depth of:
- `Option<T>`
- `Vec<T>`
- `Box<T>`
- `HashMap<K, V>` (values only)
- `BTreeMap<K, V>` (values only)
- `HashSet<T>` / `BTreeSet<T>`
- `Result<T, E>`
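
The recursion can be pictured with a small std-only sketch. The `ApplyPolicy` trait below is a hypothetical illustration that loosely mirrors the role `Classifiable` plays; the crate's actual trait and signatures may differ. It covers only `String`, `Option`, `Vec`, and `HashMap`.

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Hypothetical trait for illustration; not the crate's `Classifiable`.
trait ApplyPolicy {
    fn apply(self, policy: &dyn Fn(&str) -> String) -> Self;
}

impl ApplyPolicy for String {
    fn apply(self, policy: &dyn Fn(&str) -> String) -> Self {
        policy(&self) // leaf: the policy runs here
    }
}

impl<T: ApplyPolicy> ApplyPolicy for Option<T> {
    fn apply(self, policy: &dyn Fn(&str) -> String) -> Self {
        self.map(|inner| inner.apply(policy))
    }
}

impl<T: ApplyPolicy> ApplyPolicy for Vec<T> {
    fn apply(self, policy: &dyn Fn(&str) -> String) -> Self {
        self.into_iter().map(|item| item.apply(policy)).collect()
    }
}

impl<K: Hash + Eq, V: ApplyPolicy> ApplyPolicy for HashMap<K, V> {
    fn apply(self, policy: &dyn Fn(&str) -> String) -> Self {
        // Keys are kept as-is; only values are transformed.
        self.into_iter().map(|(k, v)| (k, v.apply(policy))).collect()
    }
}
```

Because each wrapper delegates to its inner type, any combination such as `Option<Vec<String>>` or `HashMap<String, Vec<String>>` resolves down to the `String` leaf without extra code.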

## Built-in Classifications

Each classification has a default redaction policy. Use the one that matches your data:

| Classification | Use for | Example output |
| --- | --- | --- |
| `Secret` | Passwords, private keys | `[REDACTED]` |
| `Token` | API keys, bearer tokens | `…abcd` (last 4) |
| `Email` | Email addresses | `jo…` (first 2) |
| `CreditCard` | Card numbers (PANs) | `…1234` (last 4) |
| `Pii` | Generic PII | `…_doe` (last 4) |
| `PhoneNumber` | Phone numbers | `…12` (last 2) |
| `NationalId` | SSN, passport numbers | `…6789` (last 4) |
| `AccountId` | Account identifiers | `…abcd` (last 4) |
| `SessionId` | Session tokens | `…wxyz` (last 4) |
| `IpAddress` | IP addresses | `….1.1` (last 4 chars) |
| `DateOfBirth` | Birth dates | `[REDACTED]` |
| `BlockchainAddress` | Wallet addresses | `…abc123` (last 6) |

## Custom Classifications

When built-in classifications don't fit, create your own:

```rust
use redaction::{Classification, RedactionPolicy, TextRedactionPolicy};

#[derive(Clone, Copy)]
struct InternalId;
impl Classification for InternalId {}

impl RedactionPolicy for InternalId {
    fn policy() -> TextRedactionPolicy {
        TextRedactionPolicy::keep_last(2)  // Show only last 2 characters
    }
}
```

Clean architecture note:

- Put the **classification type** (`InternalId`) in your **domain** crate/module.
- Put the **policy implementation** (`impl RedactionPolicy for InternalId`) in your **application** or **infrastructure** layer (where you define what is safe to expose and where logging/serialization happens).

Then use it like any built-in:

```rust
#[derive(Clone, Sensitive)]
struct Record {
    #[sensitive(InternalId)]
    id: String,
}
```
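
The marker-type pattern behind custom classifications can be sketched without the crate. Everything below (`Strategy`, `HasPolicy`, `DemoSecret`, `redact_as`) is a hypothetical miniature written for illustration, not the library's types; only the idea of attaching a policy to a zero-sized classification type is taken from the text above.

```rust
// Hypothetical miniature of the pattern; these are not the crate's types.
enum Strategy {
    Full,
    KeepLast(usize),
}

trait HasPolicy {
    fn policy() -> Strategy;
}

struct DemoSecret; // marker type: says what the data is
struct InternalId; // marker type: says what the data is

impl HasPolicy for DemoSecret {
    fn policy() -> Strategy {
        Strategy::Full
    }
}

impl HasPolicy for InternalId {
    fn policy() -> Strategy {
        Strategy::KeepLast(2)
    }
}

// The classification is picked at the type level, much as a field attribute would.
fn redact_as<C: HasPolicy>(value: &str) -> String {
    match C::policy() {
        Strategy::Full => "[REDACTED]".to_string(),
        Strategy::KeepLast(n) => {
            let len = value.chars().count();
            value
                .chars()
                .enumerate()
                .map(|(i, c)| if i + n >= len { c } else { '*' })
                .collect()
        }
    }
}
```

Since the marker types carry no data, the domain layer can define them while the `HasPolicy` impls live wherever exposure rules are decided, which is the split the clean architecture note describes.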

## Policies

Three policy types control how values are transformed:

- **Full**: replace the entire value with a placeholder

```rust
TextRedactionPolicy::default_full()           // → "[REDACTED]"
TextRedactionPolicy::full_with("<hidden>")    // → "<hidden>"
```

- **Keep**: keep specified characters visible, mask everything else

```rust
TextRedactionPolicy::keep_first(4)            // "secret123" → "secr*****"
TextRedactionPolicy::keep_last(4)             // "secret123" → "*****t123"
TextRedactionPolicy::keep_with(KeepConfig::both(2, 2))  // "secret" → "se**et"
```

- **Mask**: mask specified characters, keep the rest visible

```rust
TextRedactionPolicy::mask_first(4)            // "secret123" → "****et123"
TextRedactionPolicy::mask_last(4)             // "secret123" → "secre****"
```

## Logging with slog

With the `slog` feature, `Sensitive` types automatically redact when logged as
structured JSON:

```toml
[dependencies]
redaction = { version = "0.1", features = ["slog"] }
```

```rust
#[derive(Clone, Sensitive)]
#[cfg_attr(feature = "slog", derive(serde::Serialize))]
struct LoginEvent {
    #[sensitive(Secret)]
    password: String,
    username: String,
}

// Redacts automatically (no explicit .redact() needed)
slog::info!(logger, "login"; "event" => event);
```

**Structured JSON requirements (why they exist):**
- Type must implement `Clone` so the redacted JSON payload can be built without
  consuming the original value.
- Type must implement `serde::Serialize` because structured logging emits JSON
  derived from the redacted copy.
- The slog adapter uses `IntoRedactedJson`, which is auto-implemented for
  `Redactable + Serialize`.

If those bounds are too strict, use `SensitiveError` instead to log a redacted
string without requiring `Serialize`.

### Logging errors without Serialize

For types that cannot or should not derive `Serialize`, use `SensitiveError`. It
emits the same redacted `Debug` output, but logs as a string using a redacted
display template rather than JSON:

```rust
use redaction::{Secret, SensitiveError};

#[derive(SensitiveError)]
enum LoginError {
    #[error("invalid login for {username} {password}")]
    InvalidCredentials {
        username: String,
        #[sensitive(Secret)]
        password: String,
    },
    #[error("I/O error")]
    Io(std::io::Error), // not serializable
}

slog::error!(logger, "login failed"; "error" => err);
```

This path does not require `Serialize` on the type. `SensitiveError` generates a
`RedactedDisplay` implementation used by the `slog` adapter, so your error’s
normal `Display` (from `thiserror` or `displaydoc`) can remain unchanged while
logs still use a redacted string.

**Template rules and bounds:**
- Template required: `#[error("...")]` or doc comments (derive fails otherwise)
- Pass-through `{field}` uses `Display`; `{field:?}` uses `Debug`
- `#[sensitive(Classification)]` fields referenced in the template require `Clone + Display` (or `Debug` for `:?`)
- `#[sensitive]` scalars use defaults (no extra bounds)
- `#[sensitive]` non-scalars in template must derive `SensitiveError`
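
As a rough picture of how a redacted display can coexist with the normal `Display`, here is a hand-written, std-only sketch. The trait `RedactedDisplaySketch` and the `Redacted` adapter are hypothetical names; the real `RedactedDisplay` trait generated by `SensitiveError` may have a different signature.

```rust
use std::fmt;

// Hypothetical trait standing in for the crate's `RedactedDisplay`.
trait RedactedDisplaySketch {
    fn fmt_redacted(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result;
}

enum LoginError {
    InvalidCredentials { username: String, password: String },
}

// Normal Display: unchanged, still includes the raw field.
impl fmt::Display for LoginError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LoginError::InvalidCredentials { username, password } => {
                write!(f, "invalid login for {username} {password}")
            }
        }
    }
}

// Redacted rendering: same template, sensitive field replaced.
impl RedactedDisplaySketch for LoginError {
    fn fmt_redacted(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LoginError::InvalidCredentials { username, .. } => {
                write!(f, "invalid login for {username} [REDACTED]")
            }
        }
    }
}

// Adapter so the redacted form works with `format!` and `to_string()`.
struct Redacted<'a, T>(&'a T);

impl<T: RedactedDisplaySketch> fmt::Display for Redacted<'_, T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.0.fmt_redacted(f)
    }
}
```

A logging adapter would format through something like `Redacted(&err)`, while user-facing code keeps calling the ordinary `Display`.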

## Feature flags

- `classification` (default): built-in classification types
- `policy` (default): redaction policies and `.redact()`
- `slog`: structured logging adapter
- `testing`: unredacted `Debug` output in tests

---

## Reference

### Trait Bound Summary

- Traversal: `#[sensitive]` on non-scalars requires `SensitiveType`
- Classification: `#[sensitive(Classification)]` requires `Classifiable`
- Debug: fields shown in `Debug` output require `Debug`
- `slog` JSON (`Sensitive`): the type itself requires `Clone + Serialize + IntoRedactedJson`
- `slog` string (`SensitiveError`): see template rules above (bounds are template-dependent)

### Trait Concepts

The library uses these core traits, organized by layer:

**Domain Layer** (what is sensitive):
| Trait | Purpose | Implemented By |
|-------|---------|----------------|
| `SensitiveType` | Types that *contain* sensitive data | Structs/enums deriving `Sensitive` |
| `SensitiveValue` | Types that *are* sensitive data | `String`, `Cow<str>`, custom newtypes |

**Policy Layer** (how to redact):
| Trait | Purpose | Implemented By |
|-------|---------|----------------|
| `RedactionPolicy` | Maps classification → redaction strategy | Your custom classifications |
| `TextRedactionPolicy` | Concrete string transformations | Built-in (Full, Keep, Mask) |

**Application Layer** (redaction machinery):
| Trait | Purpose | Implemented By |
|-------|---------|----------------|
| `Classifiable` | Types that can have classifications applied | `String`, wrappers (`Option`, `Vec`, etc.) |
| `Redactable` | User-facing `.redact()` method | Auto-implemented for `SensitiveType` |
| `RedactionMapper` | Internal traversal machinery | `#[doc(hidden)]` |

- Use `#[sensitive]` on fields of `SensitiveType` types (to walk into them)
- Use `#[sensitive(Classification)]` on fields of `Classifiable` types (supports nested wrappers)

### Supported field types

**String-like** (`SensitiveValue`): Use `#[sensitive(Classification)]`:
- `String`
- `Cow<'_, str>` (redaction returns an owned value)

**Scalars**: Use bare `#[sensitive]` (no classification):
- Integers: `i8`-`i128`, `u8`-`u128`, `isize`, `usize`
- Floats: `f32`, `f64`
- `bool` → redacts to `false`
- `char` → redacts to `'X'`
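
The scalar defaults listed above can be pictured as a trait with one fixed replacement per type. This is a std-only sketch with a hypothetical `RedactScalar` trait; the crate's internal mechanism may look different, and only the `i32`, `bool`, and `char` replacements stated in this document are shown.

```rust
// Hypothetical trait for illustration; not part of the crate's API.
trait RedactScalar {
    fn redacted() -> Self;
}

impl RedactScalar for i32 {
    fn redacted() -> Self {
        0 // integers redact to their default
    }
}

impl RedactScalar for bool {
    fn redacted() -> Self {
        false
    }
}

impl RedactScalar for char {
    fn redacted() -> Self {
        'X'
    }
}

// Replace a scalar with its fixed redacted value, ignoring the input.
fn redact_scalar<T: RedactScalar>(_value: T) -> T {
    T::redacted()
}
```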

**Containers** (`SensitiveType`): Use `#[sensitive]` to walk, or omit for pass-through:
- `Option<T>`: redacts inner value if present
- `Vec<T>`: redacts all elements
- `Box<T>`: redacts inner value
- `HashMap<K, V>`, `BTreeMap<K, V>`: redacts values only (keys unchanged)
- `HashSet<T>`, `BTreeSet<T>`: redacts elements
- `Result<T, E>`: redacts both `Ok` and `Err` sides

**External types**: No annotation needed (pass through):
- `chrono::DateTime<Tz>`, `rust_decimal::Decimal`, `uuid::Uuid`, etc.
- Any type that doesn't implement `SensitiveType`

**PhantomData**: Automatically handled (pass through, no trait bounds added).

### Compiler Error Messages

The library provides helpful error messages for common mistakes:

**Using a classification on a struct:**
```
error[E0277]: `Address` is not a `SensitiveValue`
   = note: classifications like `#[sensitive(Secret)]` are for leaf values (String, etc.)
   = note: if `Address` is a struct that derives `Sensitive`, use `#[sensitive]` instead
```

**Using `#[sensitive]` on an external type:**
```
error[E0277]: `DateTime<Utc>` does not implement `SensitiveType`
   = note: use `#[derive(Sensitive)]` on the type definition
   = note: or remove the #[sensitive] attribute to pass through unchanged
```

### Policy Behavior

- **Empty string (`""`)**:
  - **Keep/Mask**: returns `""`
  - **Full**: returns the placeholder (default: `"[REDACTED]"`)
- **Keep policies** (`keep_first`, `keep_last`, `KeepConfig::both`) operate on Unicode scalar values:
  - If `visible_prefix + visible_suffix >= length`, the value is returned unchanged
- **Mask policies** (`mask_first`, `mask_last`, `MaskConfig::both`) operate on Unicode scalar values:
  - If `mask_prefix + mask_suffix >= length`, the entire value is masked
- **Length**: keep/mask policies preserve the input length (full does not)
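
The rules above can be checked against a small std-only sketch. The functions `keep` and `mask` are hypothetical helpers written for this illustration, and the `*` mask character is assumed from the examples in this document; this is not the crate's implementation.

```rust
// Keep `prefix` leading and `suffix` trailing characters, mask the rest.
fn keep(input: &str, prefix: usize, suffix: usize) -> String {
    let len = input.chars().count();
    if prefix.saturating_add(suffix) >= len {
        return input.to_string(); // nothing left to mask
    }
    input
        .chars()
        .enumerate()
        .map(|(i, c)| if i < prefix || i >= len - suffix { c } else { '*' })
        .collect()
}

// Mask `prefix` leading and `suffix` trailing characters, keep the rest.
fn mask(input: &str, prefix: usize, suffix: usize) -> String {
    let len = input.chars().count();
    if prefix.saturating_add(suffix) >= len {
        return "*".repeat(len); // everything is masked
    }
    input
        .chars()
        .enumerate()
        .map(|(i, c)| if i < prefix || i >= len - suffix { '*' } else { c })
        .collect()
}
```

Both helpers count `char`s (Unicode scalar values) rather than bytes, so the output has the same number of characters as the input, and an empty input yields an empty output.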

### Edge Cases

**Scalar type aliases**: Only bare primitive names (`i32`, `bool`) are recognized as scalars. Type aliases like `type MyInt = i32` or qualified paths like `std::primitive::i32` are treated as non-scalars and require `#[sensitive(Classification)]` or pass-through.

**Boxed trait objects**: The derive detects only the simple syntax `Box<dyn Trait>` and calls `redact_boxed`. It does not match `std::boxed::Box<dyn Trait>` or type aliases. The trait object must implement `RedactableBoxed`.

**Foreign string types**: For string-like types from other crates, wrap in a newtype:

```rust
use redaction::SensitiveValue;

struct WrappedId(external_crate::Id);

impl SensitiveValue for WrappedId {
    fn as_str(&self) -> &str { self.0.as_str() }
    fn from_redacted(s: String) -> Self { WrappedId(external_crate::Id::from(s)) }
}
```

**Map keys**: Never redacted. Move sensitive data into values.

**Debug vs `redact()`**: The derived `Debug` formats the type normally, but replaces the values of `#[sensitive(...)]` fields with the string `"[REDACTED]"`. It does not apply the field's policy. Use `.redact()` when you need policy-based output.
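
For comparison, a hand-written `Debug` impl with the same effect might look like the following std-only sketch. It illustrates the fixed-placeholder behaviour only and is not the code the derive generates.

```rust
use std::fmt;

#[allow(dead_code)]
struct LoginRequest {
    username: String,
    password: String, // imagine this field is marked sensitive
}

impl fmt::Debug for LoginRequest {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("LoginRequest")
            .field("username", &self.username)
            // Fixed placeholder, regardless of any policy.
            .field("password", &"[REDACTED]")
            .finish()
    }
}
```

The placeholder is passed as a `&str`, so it is printed with quotes, matching the `"[REDACTED]"` form shown in the earlier `Debug` output.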

**Testing**: Enable the `testing` feature to get unredacted `Debug` output in tests:

```toml
[dev-dependencies]
redaction = { version = "0.1", features = ["testing"] }
```

### Security Considerations

- **Length preservation**: Keep/Mask policies preserve input length, which can leak information about value size. Use Full redaction for maximum privacy.
- **Timing**: Redaction is not constant-time. Do not use in cryptographic contexts.
- **Memory**: Original values may persist in memory until overwritten. Consider secure memory handling for highly sensitive data.

---

## Documentation

- [API Documentation](https://docs.rs/redaction)

## Development

To set up git hooks for pre-commit checks (fmt, clippy, tests):

```bash
git config core.hooksPath .githooks
```

## License

Licensed under the MIT license ([LICENSE.md](LICENSE.md) or [opensource.org/licenses/MIT](https://opensource.org/licenses/MIT)).