rfe-types 0.1.0

# RFE Types Analysis

---

## 1. Version/protocol naming inconsistencies

**`test_vectors.rs` line 1:**

```rust
//! NORM Protocol v1.2 Canonical Test Vectors.
```

This conflicts with everything else. `protocol_v1.rs`, `SealInput::new_v1`, crate version `0.1.0` — all say "v1". "v1.2" implies a prior published v1.1, which does not exist. The TrustBox report also used "Seal v1.2" in a comment in `hashing.rs` (`canonical_json` doc: "Seal v1.2 results hash"). Fix both to "Protocol v1":

```rust
//! NORM Protocol v1 Canonical Test Vectors.
// hashing.rs: "SINGLE authoritative source for the Seal v1 results hash"
```

---

## 2. `FraudSign` variant names — mismatch with published articles

The variants in `hashing.rs` are:

```
ReceiverInDatabase, DeviceInDatabase, AtypicalTransaction,
SuspiciousSbpTransfer, SuspiciousNfcActivity, MultipleAccountsFromSingleDevice,
InconsistentGeolocation, HighVelocityTransfersInShortWindow,
RemoteAccessToolDetected, KnownProxyOrVpnEndpoint,
SocialEngineeringPatternDetected, SanctionedEntityMatch, Other(String)
```

The articles written in the previous session used a completely different set:

```
ReceiverInDatabase, DeviceInDatabase, AtypicalTransaction,
NewReceiver, VolumeExceeded, AtypicalTime, AtypicalChannel, AtypicalAmount,
DeviceCompromised, SocialEngineering, ExternalSignal, Other(String)
```

The articles need to be corrected to use the actual code names before Habr publish. The 12-sign table in `article-ru-od2506-final.md` and `article-en-od2506-final.md` must match exactly what `cbr-finapi-rs` exports.

---

## 3. `SanctionedEntityMatch` — regulatory misattribution

This is the most significant regulatory issue in the codebase.

`SanctionedEntityMatch` describes sanctions screening — matching a transaction party against a sanctions list. This is a **115-FZ** obligation (AML/CFT, "О противодействии легализации"), not a **161-FZ** obligation (NPS, unauthorized transfers). The ОД-2506 fraud signs are about behavioral indicators for transfers without client consent. Sanctions list matching is a different statutory requirement with different enforcement, different reporting channels (Rosfinmonitoring vs CBR), and different operational consequences.

If `cbr-finapi-rs` is positioned as a 161-FZ / ОД-2506 implementation, `SanctionedEntityMatch` should either:

- Be removed from this enum and placed in a future `aml-rs` crate
- Or be explicitly documented as "extended sign for 115-FZ AML context, not part of ОД-2506" with a `#[doc]` note

Keeping it silently in the ОД-2506 enum creates a compliance misattribution that a bank's legal team will catch during procurement review.

The sign that ОД-2506 actually describes for sign 12/Other is an open-ended fallback for future CBR additions. `SanctionedEntityMatch` as a named variant implies ОД-2506 explicitly defines it, which it does not.

**Corrected 12-sign mapping against actual ОД-2506 text:**

| Variant | ОД-2506 sign | Regulatory basis |
| --- | --- | --- |
| `ReceiverInDatabase` | Получатель в базе ФинЦЕРТ/ГИС Антифрод | 161-FZ, ОД-2506 |
| `DeviceInDatabase` | Устройство отправителя в базе | 161-FZ, ОД-2506 |
| `AtypicalTransaction` | Нетипичная операция (поведенческий профиль) | 161-FZ, ОД-2506 |
| `SuspiciousSbpTransfer` | Признаки нетипичного СБП-перевода | 161-FZ, ОД-2506 |
| `SuspiciousNfcActivity` | Признаки NFC-скимминга / подмены карты | 161-FZ, ОД-2506 |
| `MultipleAccountsFromSingleDevice` | Множество счетов с одного устройства | 161-FZ, ОД-2506 |
| `InconsistentGeolocation` | Несоответствие геолокации | 161-FZ, ОД-2506 |
| `HighVelocityTransfersInShortWindow` | Высокая частота переводов в короткий период | 161-FZ, ОД-2506 |
| `RemoteAccessToolDetected` | Обнаружен инструмент удалённого доступа | 161-FZ, ОД-2506 (sign 10) |
| `KnownProxyOrVpnEndpoint` | Известный прокси/VPN-эндпоинт | 161-FZ, ОД-2506 |
| `SocialEngineeringPatternDetected` | Паттерн социальной инженерии | 161-FZ, ОД-2506 |
| `SanctionedEntityMatch` | **NOT in ОД-2506** | 115-FZ only |

Replace `SanctionedEntityMatch` with the actual ОД-2506 sign 12, which is an operator-reported signal from another PSP/bank:

```rust
/// Signal received from another payment system operator.
/// Corresponds to OD-2506 sign 12.
ExternalOperatorSignal,
```

This is a non-breaking addition if done before `0.1.0` publish. Post-publish it would require `0.2.0`.

---

## 4. `bench/hashing.rs` — missing `required-features`

The `[[bench]] name = "hashing"` entry in `Cargo.toml` has no `required-features`, but the bench includes `bench_gost_export` which calls `rfe_types::audit_root_gost_export` — a function gated behind `gost-export` feature:

```rust
fn bench_gost_export(c: &mut Criterion) {
    c.bench_function("audit_root_gost_export_streebog", |b| {
        b.iter(|| rfe_types::audit_root_gost_export(&blake3_root))  // requires gost-export
    });
}
```

`cargo bench` without `--features gost-export` will fail to compile. Either:

**Option A** — add `required-features` to the bench entry (matches how `streebog` bench is handled):

```toml
[[bench]]
name = "hashing"
harness = false
required-features = ["gost-export"]
```

**Option B** — split `bench_gost_export` out of `hashing.rs` into `streebog.rs` entirely (where it conceptually belongs), and remove it from `hashing.rs`. This is cleaner — `hashing.rs` bench then covers only the always-available blake3 path, no feature gate required.

Option B is the better design. The `streebog.rs` bench already benchmarks `audit_root_gost_export_32b` — `bench_gost_export` in `hashing.rs` duplicates it under a different name.

---

## 5. `test_vectors.rs` — vectors declared but not executed

The `TestVector` structs are `pub const` values but there are no `#[test]` functions in the file that actually assert against them. This was flagged in the previous session's assessment as a gap. Before `0.1.0` ships, add:

```rust
#[cfg(test)]
mod tests {
    use super::*;
    use rfe_types::hashing::hash_raw_request;
    use rfe_types::SealInput;

    fn verify_vector(v: &TestVector) {
        let request_hash = hash_raw_request(v.csv_payload);
        assert_eq!(request_hash, v.expected_request_hash, "{}: request hash mismatch", v.name);

        let seal = SealInput::new_v1(
            v.nonce,
            v.expected_request_hash,
            v.expected_result_hash,
            v.chain_head_pre,
        )
        .compute_seal();
        assert_eq!(seal, v.expected_seal, "{}: seal mismatch", v.name);
    }

    #[test]
    fn vector_1_simple_approved() { verify_vector(&VECTOR_1); }

    #[test]
    fn vector_2_flagged_pdn() { verify_vector(&VECTOR_2); }
}
```

Without these tests the vectors are documentation, not verification. Cross-platform determinism (Xtensa vs x86_64) is the stated purpose of this file — that claim requires the tests to run.

Also: the `expected_request_hash` values in `VECTOR_1` and `VECTOR_2` look suspicious. `VECTOR_1.expected_request_hash` contains the sequence `0x1a, 0x04, 0x04, 0x43, 0x24, 0x0e, 0x2c, 0x0a` repeated three times in the second half. Real blake3 output of `b"TX001,1000,5000,1\n"` would not have that repetition pattern. These need to be recomputed with actual blake3 before the vectors can be trusted. Run:

```bash
# Quick verification
echo -n "TX001,1000,5000,1\n" | b3sum --no-names
# or in Rust:
# println!("{:?}", blake3::hash(b"TX001,1000,5000,1\n").as_bytes());
```

If the values are wrong, the tests added above will fail and surface the correction needed.

---

## 6. `src/hashing.rs` — `canonical_json` vs `hash_canonical_reports` relationship

Two public functions exist:

```rust
pub fn canonical_json(reports: &mut [ComplianceReport]) -> Result<Vec<u8>, CanonicalHashError>
pub fn hash_canonical_reports(reports: &mut [ComplianceReport]) -> Result<[u8; 32], CanonicalHashError>
```

`canonical_json` materializes the full JSON `Vec<u8>`. `hash_canonical_reports` streams directly into the hasher without materializing. For embedded/constrained use the streaming path exists precisely to avoid the allocation. But `canonical_json` will allocate a `Vec` proportional to the number of reports — potentially several KB for a 500-row batch.

The issue: `canonical_json` uses `serde_json::to_vec` (which internally does its own JSON serialization), while `hash_canonical_reports` uses the manual `hash_report_canonical` function with its own field ordering and escaping. These two paths can produce different JSON if `serde_json` serializes fields in a different order than the manual path, or if it escapes differently.

This is a correctness risk: if `canonical_json` is used for anything other than debugging, results may diverge from `hash_canonical_reports`. The doc comment on `hash_canonical_reports` says "SINGLE authoritative source for the Seal v1 results hash." That means `canonical_json` should be documented as a diagnostic helper that does **not** guarantee byte-for-byte equality with the hash path.

Add to `canonical_json`:

```rust
/// Returns the canonical JSON bytes for inspection and debugging.
///
/// WARNING: This function uses `serde_json` serialization which may produce
/// different field ordering or escaping than the streaming hash path in
/// `hash_canonical_reports`. Do NOT use this output as input to seal computation.
/// For the authoritative seal hash, use `hash_canonical_reports` directly.
#[cfg(feature = "std")]  // or leave alloc-only but document it
pub fn canonical_json(...) -> ...
```

If `canonical_json` is not actually used in any production path, consider removing it entirely before first publish to keep the public API surface minimal.

---

## 7. `use alloc::string::ToString` — redundant import

In `src/hashing.rs`:

```rust
use alloc::string::ToString;
```

`ToString` is in the Rust prelude (re-exported from `core::convert`). This import is redundant and will produce a `unused_imports` warning on some compiler versions. Remove it.

---

## 8. Comment language — full audit

Going through each file:

`src/hashing.rs` — all English. Consistent.

`test_vectors.rs` — English, with one issue:

```rust
//! Used to verify cross-platform determinism across Xtensa (Trust-Box) and x86_64 (Risk-Lens).
```

"Trust-Box" should be "TrustBox" (no hyphen) — consistent with the rest of the codebase branding.

`tests/protocol_v1.rs` — all English, inline comments are sparse. The test names are descriptive enough. `// Request: Single line CSV` and `// Response: Single report` are fine functional English comments. Consistent.

`benches/hashing.rs`:

```rust
// Benchmark: BLAKE3 vs Streebog-256 on a 32-byte chain root re-hash.
// BLAKE3 is used for internal audit chaining (fast, no_std).
// Streebog-256 is used exclusively at the export boundary (CBR / SMEV 4).
// This bench quantifies the latency penalty of regulatory export.
```

This comment block appears at the top of `benches/streebog.rs`, not `benches/hashing.rs`. It belongs where it is. English, consistent.

`benches/hashing.rs` — no file-level doc comment at all. Add one matching the pattern in `streebog.rs`:

```rust
// Benchmarks: Blake3 single hash and audit chain step.
// These cover the always-available (no feature gate) hot paths.
```

`README.md` — English throughout. The "Freeze Notice" section title is clear. The field list under `SealInput` is missing `timestamp` — `SealInput` in protocol_v1.rs uses `new_v1(nonce, request_hash, result_hash, chain_head_pre)`, which implies 4 fields. But the earlier TrustBox Phase 1 work defined `SealInput` with `nonce, request_hash, result_hash, chain_head, timestamp`. The README lists only 4 fields without `timestamp`. Verify which is canonical in `lib.rs`.

`CHANGELOG.md` — English, clean.

`Cargo.toml` — English. `homepage` is the same URL as `repository`. Acceptable for now but `homepage` should ideally point to a landing page rather than the GitHub repo. Low priority for `0.1.0`.

---

## 9. Optimization opportunities (no breaking changes)

**`hash_report_canonical` — control character branch:**

```rust
0x00..=0x1F => {
    let mut esc: [u8; 6] = [b'\\', b'u', b'0', b'0', b'0', b'0'];
    let hi = (b >> 4) & 0x0F;
    let lo = b & 0x0F;
    esc[4] = if hi < 10 { b'0' + hi } else { b'a' + (hi - 10) };
    esc[5] = if lo < 10 { b'0' + lo } else { b'a' + (lo - 10) };
    hasher.update(&esc);
}
```

This is correct but calls `hasher.update` with a 6-byte slice, which involves a function call per control character. For the common case (no control characters in transaction IDs or recommendation strings), this path is never hit. No change needed — the common path is a single `hasher.update(&[b])` per byte which is already minimal.

**`hash_canonical_reports` — the sort is in-place and mutates the caller's slice.** This is correct behavior and is tested in `test_canonical_sorting`. The doc comment should explicitly state this:

```rust
/// Sorts `reports` in-place by `transaction_id` before hashing.
/// The sort is deterministic and modifies the slice.
pub fn hash_canonical_reports(...)
```

**`FraudSign::Other(v)` in `hash_report_canonical`:**

```rust
crate::FraudSign::Other(v) => {
    hash_json_escaped(hasher, v.as_bytes());
}
```

`v` is a `String` — `v.as_bytes()` is a zero-cost borrow. Fine.

**The `canonical_json` function calls `reports.sort()` and then `serde_json::to_vec`.** If `canonical_json` is kept, it should reuse the same sort logic as `hash_canonical_reports` to guarantee ordering consistency. Currently both call `.sort()` independently which is fine, but a shared `sort_reports` helper would remove the duplication.

---

## Summary of required changes before publish

| Priority | File | Change |
| --- | --- | --- |
| P0 blocker | `hashing.rs` | Replace `SanctionedEntityMatch` with `ExternalOperatorSignal` to correct 161-FZ vs 115-FZ attribution |
| P0 blocker | `benches/hashing.rs` | Remove `bench_gost_export` (duplicate of `streebog.rs` bench) OR add `required-features = ["gost-export"]` to the `[[bench]]` entry |
| P0 blocker | `test_vectors.rs` | Add executable `#[test]` functions; verify hardcoded hash values against actual blake3 output |
| P1 | `test_vectors.rs` | Change "Protocol v1.2" → "Protocol v1" |
| P1 | `hashing.rs` doc comment | Change "Seal v1.2" → "Seal v1" |
| P1 | `hashing.rs` | Remove `use alloc::string::ToString` |
| P1 | `hashing.rs` | Add doc clarification on `canonical_json` (not the hash path, diagnostic only) |
| P1 | `README.md` | Verify `SealInput` field list against `lib.rs` (timestamp missing?) |
| P2 | `test_vectors.rs` | "Trust-Box" → "TrustBox" |
| P2 | `benches/hashing.rs` | Add file-level doc comment |
| P2 | `hashing.rs` | Add explicit sort-mutation note to `hash_canonical_reports` doc |
| Post-publish | Articles | Update 12-sign tables to use actual variant names from `hashing.rs` |