csv-schema-validator 0.2.0

Derive macro to validate CSV
# csv-schema-validator

On the roadmap: version `0.2.1`, with more cross-column validations:

- `if_column`: if the value of the conditional column is in a given list of values, the value of the annotated field is then checked:

```rust
#[validate(if_column("status", ["paid", "cancelled"], ["done","rejected"]))] // This is not the final format! Just an idea.
```

## Version 0.2.0

[![Crates.io](https://img.shields.io/crates/v/csv-schema-validator.svg)](https://crates.io/crates/csv-schema-validator) [![Documentation](https://docs.rs/csv-schema-validator/badge.svg)](https://docs.rs/csv-schema-validator)

[<img src="./contribute_work_yes_badge.png" width=120>](./CONTRIBUTING.md)

[<img src="./noobs_yes_badge.png" width=120>](./CONTRIBUTING.md)

A Rust library for validating CSV record data based on rules defined directly in your structs using the `#[derive(ValidateCsv)]` macro.

## Installation

Add the following to your `Cargo.toml`:

```toml
[dependencies]
csv-schema-validator = "0.2.0"
serde = { version = "1.0", features = ["derive"] }
csv = "1.3"
regex = "1.11"
once_cell = "1.21"
```

## Quick Start

```rust
use serde::Deserialize;
use csv::Reader;
use csv_schema_validator::{ValidateCsv, ValidationError};

#[derive(Deserialize, ValidateCsv, Debug)]
struct TestRecord {
    #[validate(range(min = 0.0, max = 100.0))]
    grade: f64,

    #[validate(regex = r"^[A-Z]{3}\d{4}$")]
    code: String,

    #[validate(required, length(min = 10, max = 50), not_blank)]
    name: Option<String>,

    #[validate(custom = "length_validation")]
    comments: String,

    #[validate(required, one_of("short", "medium", "long"))]
    more_comments: Option<String>,

    #[validate(required, not_in("forbidden", "banned"))]
    tag: Option<String>,

    #[validate(range(min = -5, max = 20))]
    temp1: i32,

    #[validate(range(min = 10))]
    temp2: i32,

    #[validate(range(max = 100))]
    temp3: i32,

}

// Custom validator: comments must be at most 50 bytes long
fn length_validation(s: &str) -> Result<(), String> {
    if s.len() <= 50 {
        Ok(())
    } else {
        Err("Comments too long".into())
    }
}

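fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut reader = Reader::from_path("data.csv")?;
    for result in reader.deserialize() {
        let rec: TestRecord = result?;
        // validate_csv() reports every failure for the record at once
        match rec.validate_csv() {
            Ok(()) => println!("Record valid: {:?}", rec),
            Err(errors) => {
                for ValidationError { field, message } in errors {
                    eprintln!("Field `{}`: {}", field, message);
                }
            }
        }
    }
    Ok(())
}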
```

## Usage

### Range Validation (since 0.1.0, changed in 0.1.3)

```rust
#[validate(range(min = 0.0, max = 100.0))]
grade: f64,
```

Ensures that `grade` is between 0.0 and 100.0 (inclusive). 

Since version 0.1.3 you can specify only `min` or only `max` to check greater-than-or-equal-to or less-than-or-equal-to.
Only for numeric fields, integer or floating-point. The literals must match the field type (for example, `0.0` for an `f64` field and `0` for an `i32` field).
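
As a rough illustration of the rule above, the check can be written by hand in plain Rust. This is only a sketch of the semantics (inclusive bounds, each bound optional), not the code the macro generates; the function name `check_range` and the error strings are assumptions made for this example.

```rust
use std::fmt::Display;

/// Hand-written stand-in for `#[validate(range(...))]` (hypothetical helper).
/// Both bounds are optional and inclusive.
fn check_range<T: PartialOrd + Display>(
    value: T,
    min: Option<T>,
    max: Option<T>,
) -> Result<(), String> {
    if let Some(lo) = &min {
        if value < *lo {
            return Err(format!("value below min: {}", lo));
        }
    }
    if let Some(hi) = &max {
        if value > *hi {
            return Err(format!("value above max: {}", hi));
        }
    }
    Ok(())
}
```

Because `T` is a single type parameter, `check_range(50.0, Some(0), None)` does not compile, which mirrors the requirement that literals match the field type.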

### Regex Validation (since 0.1.0)

```rust
#[validate(regex = r"^[A-Z]{3}\d{4}$")]
code: String,
```

Validates the field against a regular expression. Only for `String`.

### Required Validation (since 0.1.0)

```rust
#[validate(required)]
name: Option<String>,
```

Ensures that the `Option` is not `None`. If using `required` the field must be `Option<T>`.

### Custom Validation (since 0.1.0)

```rust
#[validate(custom = "path::to::func")]
comments: String,
```

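Calls your custom function for additional checks. Only for `String` fields, so the function has the shape `fn(&str) -> Result<(), String>`, as in the Quick Start; the `Err` string becomes the error message.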
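
A custom validator placed in a module might look like the sketch below. The module `validators` and the function `no_digits` are hypothetical names chosen for this example, and the `&str` parameter follows the validators shown elsewhere in this README.

```rust
mod validators {
    /// Hypothetical validator: rejects text that contains ASCII digits.
    pub fn no_digits(s: &str) -> Result<(), String> {
        if s.chars().any(|c| c.is_ascii_digit()) {
            Err("must not contain digits".into())
        } else {
            Ok(())
        }
    }
}

// On a struct field it would then be referenced by path:
// #[validate(custom = "validators::no_digits")]
// comments: String,
```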

### Length (since 0.1.1)

```rust
#[validate(required, length(min = 10, max = 50))]
name: Option<String>,
```

Ensures that the length of the string is between `min` and `max`. Only for `String` fields.

### Not Blank (since 0.1.2)

Rejects strings that are empty or contain only whitespace:

```rust
#[validate(required, length(min = 10, max = 50), not_blank)]
name: Option<String>,
```

Only for `String` fields.
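
In plain Rust, a blank check of this kind is usually a one-liner. The helper below is a sketch of the idea, with `is_blank` as a hypothetical name; it is not taken from the crate's source.

```rust
/// Stand-in for `not_blank`: true when the string is empty
/// or consists only of whitespace.
fn is_blank(s: &str) -> bool {
    // `trim` strips leading and trailing Unicode whitespace
    s.trim().is_empty()
}
```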

### One of (since 0.1.2)

Checks that the string is one of the allowed values:

```rust
#[validate(required, one_of("short", "medium", "long"))]
more_comments: Option<String>,
```

Only for `String` fields.

### Not in (since 0.1.2)

Rejects the string if it is one of the disallowed values:

```rust
#[validate(required, not_in("forbidden", "banned"))]
tag: Option<String>,
```

Only for `String` fields.
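
`one_of` and `not_in` are mirror images of each other. A hand-written sketch of both, assuming exact, case-sensitive comparison (an assumption of this example, as are the function names and error strings):

```rust
/// Stand-in for `one_of(...)`: the value must appear in `allowed`.
fn one_of(value: &str, allowed: &[&str]) -> Result<(), String> {
    if allowed.contains(&value) {
        Ok(())
    } else {
        Err("invalid value".into())
    }
}

/// Stand-in for `not_in(...)`: the value must not appear in `forbidden`.
fn not_in(value: &str, forbidden: &[&str]) -> Result<(), String> {
    if forbidden.contains(&value) {
        Err("value not allowed".into())
    } else {
        Ok(())
    }
}
```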

### if_then (since 0.2.0)

Defines a *cross-column implication rule* between two columns. If the conditional column matches a given value, the current column must equal a specific target value.

```rust
#[validate(if_then("<conditional_column>", "<conditional_value>", "<expected_value>"))]
```

* All arguments must be string literals. The values are converted to match the types of the fields.
* If the conditional column (`<conditional_column>`) is `Some(<conditional_value>)`,
  then the current field must be equal to `<expected_value>`.
* If the condition is not met, the current field is not validated (it can be `None` or any other value).
* Both columns must be optional (`Option<T>` and `Option<R>`), but their inner types may differ, for example `Option<String>` and `Option<i32>`.
* Comparison uses equality (`==`).

```rust
#[derive(Deserialize, ValidateCsv, Debug)]
struct Order {
    status: Option<String>,

    // If status == "paid" → payment_state must be "done"
    #[validate(if_then("status", "paid", "done"))]
    payment_state: Option<String>,

    plan: Option<String>,

    // If plan == "P" → seats must be 100
    #[validate(if_then("plan", "P", "100"))]
    seats: Option<u32>,
}
```
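
The implication described by the bullets above can be sketched as an ordinary function. This is a reading of the documented semantics, not the macro's generated code: `check_if_then` and its error string are hypothetical, and treating a `None` field as a failure when the condition holds is this example's interpretation of "must be equal to".

```rust
/// Hand-written stand-in for `if_then(cond_column, cond_value, expected)`.
fn check_if_then<C: PartialEq, V: PartialEq>(
    condition: &Option<C>,
    conditional_value: &C,
    field: &Option<V>,
    expected_value: &V,
) -> Result<(), String> {
    // The rule only applies when the conditional column holds the trigger value.
    if condition.as_ref() != Some(conditional_value) {
        return Ok(());
    }
    // Condition met: the annotated field must equal the expected value.
    if field.as_ref() == Some(expected_value) {
        Ok(())
    } else {
        Err("value does not match the expected value".into())
    }
}
```

The two type parameters reflect that the conditional column and the annotated field may have different inner types, as with `plan: Option<String>` and `seats: Option<u32>`.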

### Struct check

The macro checks the type it annotates: only structs with named fields are allowed. For example:

```rust
use serde::Deserialize;
use csv_schema_validator::ValidateCsv;

#[derive(Deserialize, ValidateCsv)]
struct TupleStruct(f64, String);

#[derive(Deserialize, ValidateCsv)]
enum Status {
    Success { code: f64, message: String },
    Error(f64, String),
    Unknown,
}

fn main() {
    let record = TupleStruct(42.0, "ABC1234".to_string());
    let s = Status::Success { code: 200.0, message: "OK".into() };
    let _ = record.validate_csv();
    let _ = s.validate_csv();
}
```

Trying to compile this code will result in errors: 

```shell
cargo run
error: only structs with named fields (e.g., `struct S { a: T }`) are supported
 --> src/main.rs:5:19
  |
5 | struct TupleStruct(f64, String);
  |                   ^^^^^^^^^^^^^

error: only structs are supported
  --> src/main.rs:8:1
   |
8  | / enum Status {
9  | |     Success { code: f64, message: String },
10 | |     Error(f64, String),
11 | |     Unknown,
12 | | }
   | |_^

```

### Complete example

This example reads a CSV file and validates each record:

`Cargo.toml`:

```toml
[package]
name = "use-csv-validator"
version = "0.1.1"
edition = "2021"

[dependencies]
csv = "1.1"
serde = { version = "1.0", features = ["derive"] }
csv-schema-validator = "0.1.3"
```

`src/main.rs`:

```rust
use std::error::Error;
use csv::ReaderBuilder;
use serde::Deserialize;
use csv_schema_validator::{ValidateCsv, ValidationError};

/// Custom validator: ensure comments string isn't too long
fn length_validation(s: &str) -> Result<(), String> {
    if s.len() <= 20 {
        Ok(())
    } else {
        Err("Comments too long".into())
    }
}

#[derive(Deserialize, ValidateCsv, Debug)]
struct TestRecord {
    #[validate(range(min = 0.0, max = 100.0))]
    grade: f64,

    #[validate(regex = r"^[A-Z]{3}\d{4}$")]
    code: String,

    #[validate(required, length(min = 10, max = 50), not_blank)]
    name: Option<String>,

    #[validate(custom = "length_validation")]
    comments: String,

    #[serde(rename = "more")]
    #[validate(required, one_of("short", "medium", "long"))]
    more_comments: Option<String>,

    #[validate(required, not_in("forbidden", "banned"))]
    tag: Option<String>,

    #[validate(range(min = 1))]
    level: i32,

    #[validate(range(max = 100))]
    top: Option<i32>,
}

fn main() -> Result<(), Box<dyn Error>> {
    // open the CSV file placed alongside Cargo.toml
    let mut reader = ReaderBuilder::new()
        .has_headers(true)
        .from_path("data.csv")?;

    // for each record, deserialize and validate
    for (i, result) in reader.deserialize::<TestRecord>().enumerate() {
        let record = result?;
        match record.validate_csv() {
            Ok(()) => println!("Line {}: Record is valid: {:?}", i + 1, record),
            Err(errors) => {
                eprintln!("Line {}: Validation errors:", i + 1);
                for ValidationError { field, message } in errors {
                    eprintln!("  Field `{}`: {}", field, message);
                }
            }
        }
    }

    Ok(())
}

```

`data.csv`: 

```csv
grade,code,name,comments,more,tag,level,top
85.5,XYZ1234,Alice Smith,All good,short,allowed,2,
90.0,XYZ5678,Bob Marley,Too long comment indeed,medium,allowed,0,
110.0,XYZ4567,      ,ok,short,allowed,5,
95.0,xWF9101,Charlie,code,long,allowed,6,
110.0,XYZ2345,Dave Copperfield,range,short,allowed,-1,80
34.0,XYZ6789,,name,medium,allowed,5,
78.0,XYZ7890,Frank,more,invalid comment,allowed,10,
88.0,XYZ4567,Grace,All good,short,,3,
90.0,XYZ3567,Grace of All Times,All good,medium,forbidden,5,150
3.0,XYZ3456,Eve Max Smith,,short,invalid grade,2,
f34s,XYZ3456,Eve,comments,short,invalid grade,,,
```

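Running this example prints the messages below. The final `Error:` line is not a validation error: the last row of `data.csv` has nine fields instead of eight, so the `csv` reader fails to deserialize it and `result?` ends the program.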

```shell
Line 1: Record is valid: TestRecord { grade: 85.5, code: "XYZ1234", name: Some("Alice Smith"), comments: "All good", more_comments: Some("short"), tag: Some("allowed"), level: 2, top: None }
Line 2: Validation errors:
  Field `comments`: Comments too long
  Field `level`: value below min: 1
Line 3: Validation errors:
  Field `grade`: value out of expected range: 0 to 100
  Field `name`: length out of expected range: 10 to 50
  Field `name`: must not be blank or contain only whitespace
Line 4: Validation errors:
  Field `code`: does not match the expected pattern
  Field `name`: length out of expected range: 10 to 50
Line 5: Validation errors:
  Field `grade`: value out of expected range: 0 to 100
  Field `level`: value below min: 1
Line 6: Validation errors:
  Field `name`: mandatory field
Line 7: Validation errors:
  Field `name`: length out of expected range: 10 to 50
  Field `more_comments`: invalid value
Line 8: Validation errors:
  Field `name`: length out of expected range: 10 to 50
  Field `tag`: mandatory field
Line 9: Validation errors:
  Field `tag`: value not allowed
  Field `top`: value above max: 100
Line 10: Record is valid: TestRecord { grade: 3.0, code: "XYZ3456", name: Some("Eve Max Smith"), comments: "", more_comments: Some("short"), tag: Some("invalid grade"), level: 2, top: None }
Error: Error(UnequalLengths { pos: Some(Position { byte: 542, line: 12, record: 11 }), expected_len: 8, len: 9 })
```

## Why Use This Crate?

* **Declarative API:** Define validation rules directly in your struct.
* **Compile-Time Code Generation:** The derive macro generates the validation code at compile time, so there is no schema to load or interpret at runtime.
* **Seamless Serde & CSV Integration:** Works directly with `serde` and `csv` crates.
* **Clear Error Messages:** Each failure reports the field and reason.

## Comparison with csv Crate Validations

While the `csv` crate provides low‑level parsing and some helper methods, this derive‑based approach offers:

* **Field‑Level Declarative Rules:** Annotate each struct field with its own validation, rather than writing imperative checks after parsing.
* **Type‑Safety & Integration:** Leverages your existing `serde::Deserialize` types, so you get compile‑time guarantees on types and validations in one place.
* **Custom Validators:** Easily plug in custom functions per field without manual looping or error‑handling boilerplate.
* **Centralized Error Collection:** Automatically collects all errors into a single `Vec<ValidationError>`, instead of ad‑hoc early exits.
* **Reusable Across Projects:** Define your struct once, reuse validations in different contexts (CLI, web server, batch jobs) with the same guarantees.

By contrast, using the `csv` crate directly may require manual loops over records and explicit `match`/`if` chains for each validation, leading to more boilerplate and potential for missing checks.
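
To make the contrast concrete, here is a sketch of the imperative style for a two-field record, written so that it collects every failure rather than returning at the first one. `Row` and `validate_row` are hypothetical, and `ValidationError` is a local stand-in with the same two fields used in the examples above, not the crate's own type.

```rust
/// Local stand-in for the crate's `ValidationError` (assumed shape).
#[derive(Debug, PartialEq)]
struct ValidationError {
    field: String,
    message: String,
}

/// Hypothetical two-column record.
struct Row {
    grade: f64,
    name: Option<String>,
}

/// Runs every check and returns all failures together.
fn validate_row(row: &Row) -> Result<(), Vec<ValidationError>> {
    let mut errors = Vec::new();
    if !(0.0..=100.0).contains(&row.grade) {
        errors.push(ValidationError {
            field: "grade".into(),
            message: "value out of expected range: 0 to 100".into(),
        });
    }
    if row.name.is_none() {
        errors.push(ValidationError {
            field: "name".into(),
            message: "mandatory field".into(),
        });
    }
    if errors.is_empty() { Ok(()) } else { Err(errors) }
}
```

Every additional column adds another hand-written branch like these, which is the boilerplate the field attributes are meant to replace.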

## Compatibility

* This crate requires the Rust standard library (it is **not** compatible with `#![no_std]` environments).
* Rust **1.56+**
* `serde` **1.0**
* `csv` **1.3**
* `regex` **1.11**

## Contributing

Feel free to open issues and submit pull requests. See [CONTRIBUTING.md](CONTRIBUTING.md) for details.

## License

This project is licensed under the **MIT License**. See the [LICENSE](LICENSE) file for details.

## Links

* **Released on crates.io:** [csv-schema-validator](https://crates.io/crates/csv-schema-validator)
* **API Documentation:** [docs.rs](https://docs.rs/csv-schema-validator)
* **Source Code:** [https://github.com/cleuton/rustingcrab/tree/main/code_samples/csv-schema-validator](https://github.com/cleuton/rustingcrab/tree/main/code_samples/csv-schema-validator)