data-protocol-validator
A Rust validator for Data Protocol schemas. It validates versioned bioinformatics analysis output against JSON Schema-based protocol definitions.
Overview
Bioinformatics analysis programs evolve frequently, producing structural changes in their output data across versions. The Data Protocol format provides a versioned, machine-readable definition that describes the expected shape of output data from specific analysis pipelines.
This Rust implementation validates data against Data Protocol schemas, which are based on a strict subset of JSON Schema (draft 2020-12) augmented with domain-specific extensions for bioinformatics use cases.
Features
- ✅ Full conformance with Data Protocol 1.0 specification
- ✅ Validates data against versioned protocol schemas
- ✅ Supports all standard JSON Schema validation keywords (type, properties, required, etc.)
- ✅ Format validation (date, date-time, email, uri, uuid)
- ✅ Composition keywords (allOf, anyOf, oneOf)
- ✅ Reference resolution ($ref, $defs)
- ✅ Custom extensions (x-display-name, x-unit, x-deprecated, etc.)
- ✅ Detailed error messages with suggestions for fixes
- ✅ Partial validation mode for validating specific paths
- ✅ Validation statistics (fields checked, valid, invalid)
Installation
Add this to your Cargo.toml:
[]
= "0.1"
Quick Start
use ;
use json;
Usage Examples
Basic Validation
use validate;
use json;
let protocol = json!;
let data = json!;
let result = validate;
assert!;
Partial Validation
Validate only specific paths in your data:
use ;
let options = Some;
let result = validate;
Schema-Only Validation
Validate against a bare schema without a protocol envelope:
use validate_schema;
let schema = json!;
let data = json!;
let result = validate_schema;
assert!;
Error Handling with Suggestions
let result = validate;
for error in result.errors
Validation Options
Validation Result
Error Codes
The validator produces standardized error codes:
- E001: Type mismatch
- E002: Missing required property
- E003: Additional property not allowed
- E004: String constraint violation
- E005: Number constraint violation
- E006: Array constraint violation
- E007: Object constraint violation
- E008: Format validation failure
- E009: Enum/const violation
- E010: Composition failure (allOf/anyOf/oneOf)
- E011: Reference resolution failure
- W001: Deprecated field warning
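When handling results programmatically, it can help to keep the documented codes in one lookup table. The sketch below is illustrative only: `Severity`, `CODES`, `describe`, and `severity` are hypothetical names that are not part of this crate's API, and the mapping of the `E`/`W` prefix to a severity is an assumption drawn from the code names listed above.

```rust
/// Severity implied by the code prefix (assumption: `E` = error, `W` = warning).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Error,
    Warning,
}

/// Hypothetical lookup table mirroring the codes documented in this README.
const CODES: &[(&str, &str)] = &[
    ("E001", "Type mismatch"),
    ("E002", "Missing required property"),
    ("E003", "Additional property not allowed"),
    ("E004", "String constraint violation"),
    ("E005", "Number constraint violation"),
    ("E006", "Array constraint violation"),
    ("E007", "Object constraint violation"),
    ("E008", "Format validation failure"),
    ("E009", "Enum/const violation"),
    ("E010", "Composition failure (allOf/anyOf/oneOf)"),
    ("E011", "Reference resolution failure"),
    ("W001", "Deprecated field warning"),
];

/// Returns the documented description for a code, or `None` if it is unknown.
pub fn describe(code: &str) -> Option<&'static str> {
    CODES.iter().find(|(c, _)| *c == code).map(|(_, d)| *d)
}

/// Classifies a known code by its prefix; unknown codes yield `None`.
pub fn severity(code: &str) -> Option<Severity> {
    describe(code)?;
    match code.as_bytes().first() {
        Some(b'E') => Some(Severity::Error),
        Some(b'W') => Some(Severity::Warning),
        _ => None,
    }
}
```

A caller could use `severity` to decide, for example, whether a deprecated-field warning should fail a pipeline run or only be logged.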
Supported JSON Schema Keywords
Core
- type, properties, required, additionalProperties
- items, $ref, $defs
String Constraints
- minLength, maxLength, pattern, format
Numeric Constraints
- minimum, maximum, exclusiveMinimum, exclusiveMaximum, multipleOf
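To make the semantics of these keywords concrete, here is a minimal sketch of how they behave under JSON Schema draft 2020-12, where the exclusive bounds are numbers rather than booleans. It is not this crate's implementation: `NumericConstraints` and `violations` are hypothetical names, and the floating-point tolerance used for multipleOf is an assumption.

```rust
/// Hypothetical container for the numeric keywords of one schema node.
#[derive(Debug, Default, Clone, Copy)]
pub struct NumericConstraints {
    pub minimum: Option<f64>,
    pub maximum: Option<f64>,
    pub exclusive_minimum: Option<f64>,
    pub exclusive_maximum: Option<f64>,
    pub multiple_of: Option<f64>,
}

impl NumericConstraints {
    /// Returns the names of the keywords that `value` violates, in keyword order.
    pub fn violations(&self, value: f64) -> Vec<&'static str> {
        let mut out = Vec::new();
        if let Some(m) = self.minimum {
            if value < m {
                out.push("minimum");
            }
        }
        if let Some(m) = self.maximum {
            if value > m {
                out.push("maximum");
            }
        }
        if let Some(m) = self.exclusive_minimum {
            if value <= m {
                out.push("exclusiveMinimum");
            }
        }
        if let Some(m) = self.exclusive_maximum {
            if value >= m {
                out.push("exclusiveMaximum");
            }
        }
        if let Some(m) = self.multiple_of {
            // Assumes m > 0, as JSON Schema requires. The 1e-9 tolerance is an
            // assumption made here to absorb floating-point rounding.
            let q = value / m;
            if (q - q.round()).abs() > 1e-9 {
                out.push("multipleOf");
            }
        }
        out
    }
}
```

Returning every violated keyword, rather than stopping at the first, matches the idea of reporting detailed errors per field.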
Array Constraints
- minItems, maxItems, uniqueItems
Object Constraints
- minProperties, maxProperties
Enum and Const
- enum, const
Composition
- allOf, anyOf, oneOf
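The three composition keywords differ only in how many subschemas must accept the data. The sketch below shows that rule in isolation, assuming each subschema has already been evaluated to a boolean; `Composition` and `combine` are hypothetical names, not part of this crate.

```rust
/// The three composition keywords from JSON Schema.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Composition {
    AllOf,
    AnyOf,
    OneOf,
}

/// Combines per-subschema results according to the keyword:
/// allOf needs every branch to pass, anyOf at least one, oneOf exactly one.
pub fn combine(kind: Composition, branch_results: &[bool]) -> bool {
    let passed = branch_results.iter().filter(|ok| **ok).count();
    match kind {
        Composition::AllOf => passed == branch_results.len(),
        Composition::AnyOf => passed >= 1,
        Composition::OneOf => passed == 1,
    }
}
```

Note that oneOf fails when two branches match, which is a common source of E010-style composition failures when subschemas overlap.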
Custom Extensions
- x-display-name, x-unit, x-sort-key, x-sort-order
- x-deprecated, x-tags
Format Validation
Supports the following format validators:
- date: ISO 8601 date (YYYY-MM-DD)
- date-time: RFC 3339 date-time
- email: RFC 5322 email address
- uri: RFC 3986 URI
- uuid: UUID (case-insensitive)
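As an illustration of what two of these formats require, the following standard-library-only sketch checks the date and uuid shapes described above. It is an assumption about reasonable behavior, not the crate's actual validators: `is_valid_date` and `is_valid_uuid` are hypothetical names, and the uuid check only verifies the 8-4-4-4-12 hexadecimal layout.

```rust
/// Splits a `YYYY-MM-DD` string into numeric parts, or returns `None` if the
/// shape is wrong. Only ASCII digits and the two hyphens are accepted.
fn parse_date(s: &str) -> Option<(u32, u32, u32)> {
    let b = s.as_bytes();
    if b.len() != 10 || b[4] != b'-' || b[7] != b'-' {
        return None;
    }
    let digits_ok = b
        .iter()
        .enumerate()
        .all(|(i, c)| i == 4 || i == 7 || c.is_ascii_digit());
    if !digits_ok {
        return None;
    }
    // Slicing is safe here because every byte was verified to be ASCII.
    let year: u32 = s[0..4].parse().ok()?;
    let month: u32 = s[5..7].parse().ok()?;
    let day: u32 = s[8..10].parse().ok()?;
    Some((year, month, day))
}

/// Checks that the string is a real calendar date, including leap years.
pub fn is_valid_date(s: &str) -> bool {
    let (year, month, day) = match parse_date(s) {
        Some(parts) => parts,
        None => return false,
    };
    let leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    let days_in_month = match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        2 => {
            if leap {
                29
            } else {
                28
            }
        }
        _ => return false,
    };
    day >= 1 && day <= days_in_month
}

/// Checks the 8-4-4-4-12 hexadecimal layout, accepting upper or lower case.
pub fn is_valid_uuid(s: &str) -> bool {
    let b = s.as_bytes();
    b.len() == 36
        && b.iter().enumerate().all(|(i, c)| match i {
            8 | 13 | 18 | 23 => *c == b'-',
            _ => c.is_ascii_hexdigit(),
        })
}
```

The date check rejects strings such as 2023-02-29 that match the pattern but name no real day, which a pattern-only check would let through.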
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.