xportrs
Pure Rust SAS XPORT (XPT) reader and writer for CDISC clinical trial data submissions.
xportrs provides a safe, DataFrame-agnostic implementation of XPT v5 I/O with built-in regulatory compliance validation for FDA, PMDA, and NMPA submissions.
Features
- DataFrame-agnostic - Works with any in-memory table representation
- Agency compliance - Built-in validation for FDA, PMDA, and NMPA requirements
- Auto file splitting - Automatically splits large files to meet agency size limits (5 GB)
- XPT v5 support - Full read and write support for SAS XPORT v5 format
- Configurable - Text encoding modes, validation strictness, and more
Installation
Add to your Cargo.toml:
[dependencies]
xportrs = "0.0.2"
With optional features:
[dependencies]
xportrs = { version = "0.0.2", features = ["serde", "tracing"] }
Quick Start
Reading XPT Files
use Xpt;
// Simple: read the first dataset
let dataset = read ?;
println!;
println!;
// Read a specific member from a multi-dataset file
let dm = reader ?.read_member ?;
// Read all members
let datasets = reader ?.read_all ?;
// Inspect file metadata without loading data
let info = inspect ?;
for name in info.member_names
Writing XPT Files
use ;
// Create a dataset
let dataset = new ?;
// Write with structural validation only
writer
.finalize ?
.write_path ?;
Agency Compliance
When submitting clinical trial data to regulatory agencies, use the agency() method to enable agency-specific validation rules:
use ;
let dataset = new ?;
// FDA submission - applies all FDA validation rules
let files = writer
.agency
.finalize ?
.write_path ?;
// Returns Vec<PathBuf> - multiple files if splitting occurred
println!;
Supported Agencies
| Agency | Description | Max File Size |
|---|---|---|
| Agency::FDA | U.S. Food and Drug Administration | 5 GB |
| Agency::PMDA | Japan Pharmaceuticals and Medical Devices Agency | 5 GB |
| Agency::NMPA | China National Medical Products Administration | 5 GB |
Agency Validation Rules
When an agency is specified, the following validations are applied:
- ASCII-only names, labels, and character values
- Dataset names: max 8 bytes, uppercase alphanumeric, must start with letter
- Variable names: max 8 bytes, uppercase alphanumeric with underscores
- Labels: max 40 bytes
- Character values: max 200 bytes
- File naming: dataset name must match file stem (case-insensitive)
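The rules listed above can be sketched as plain predicate functions. This is a minimal, standard-library-only illustration of the checks as they are worded in this README; the function names are hypothetical and are not part of the xportrs API, and the crate's own validation may differ in detail.

```rust
use std::path::Path;

/// Hypothetical helper: dataset names are at most 8 bytes, uppercase
/// alphanumeric, and must start with a letter.
fn is_valid_dataset_name(name: &str) -> bool {
    let mut chars = name.chars();
    // The first character must be an uppercase ASCII letter.
    let starts_with_letter = matches!(chars.next(), Some(c) if c.is_ascii_uppercase());
    starts_with_letter
        && name.len() <= 8
        && chars.all(|c| c.is_ascii_uppercase() || c.is_ascii_digit())
}

/// Hypothetical helper: variable names are at most 8 bytes, uppercase
/// alphanumeric with underscores. The list above does not state a
/// leading-letter requirement for variables, so none is checked here.
fn is_valid_variable_name(name: &str) -> bool {
    !name.is_empty()
        && name.len() <= 8
        && name
            .bytes()
            .all(|b| b.is_ascii_uppercase() || b.is_ascii_digit() || b == b'_')
}

/// Hypothetical helper: labels are ASCII-only and at most 40 bytes.
fn is_valid_label(label: &str) -> bool {
    label.is_ascii() && label.len() <= 40
}

/// Hypothetical helper: character values are ASCII-only and at most 200 bytes.
fn is_valid_character_value(value: &str) -> bool {
    value.is_ascii() && value.len() <= 200
}

/// Hypothetical helper: the file stem must match the dataset name,
/// ignoring ASCII case (for example, "dm.xpt" matches "DM").
fn file_stem_matches(path: &Path, dataset_name: &str) -> bool {
    path.file_stem()
        .and_then(|stem| stem.to_str())
        .is_some_and(|stem| stem.eq_ignore_ascii_case(dataset_name))
}
```

Under these assumptions, is_valid_dataset_name("DM") holds while "dm" and "1DM" are rejected, and file_stem_matches(Path::new("dm.xpt"), "DM") holds.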
Automatic File Splitting
Large XPT files are automatically split when an agency is specified:
use ;
// Large dataset with millions of rows
let large_dataset = new ?;
// Files > 5GB are automatically split into numbered parts
let files = writer
.agency
.finalize ?
.write_path ?;
// Result: ["lb_001.xpt", "lb_002.xpt", ...] if split
// Result: ["lb.xpt"] if no split needed
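The naming pattern in the comments above can be illustrated with a small standalone sketch. The helper names are hypothetical, the 5 GB limit is assumed here to mean 5 * 1024^3 bytes, and the part count is a rough estimate only, since each part of a real split also carries its own headers. It is not the library's implementation.

```rust
/// Size limit named in this README (5 GB), taken here as 5 * 1024^3 bytes.
/// Whether the library counts binary or decimal gigabytes is an assumption.
const MAX_FILE_BYTES: u64 = 5 * 1024 * 1024 * 1024;

/// Hypothetical helper: how many parts are needed to hold `total_bytes`
/// when no part may exceed `max_bytes`. `max_bytes` must be non-zero.
/// Always returns at least 1, so an empty dataset still yields one file.
fn parts_needed(total_bytes: u64, max_bytes: u64) -> u64 {
    total_bytes.div_ceil(max_bytes).max(1)
}

/// Hypothetical helper: output file names following the pattern shown
/// above. A single part keeps the plain name; multiple parts get a
/// zero-padded, 1-based numeric suffix.
fn part_names(stem: &str, parts: u64) -> Vec<String> {
    if parts <= 1 {
        vec![format!("{stem}.xpt")]
    } else {
        (1..=parts).map(|i| format!("{stem}_{i:03}.xpt")).collect()
    }
}
```

For example, part_names("lb", parts_needed(MAX_FILE_BYTES + 1, MAX_FILE_BYTES)) yields lb_001.xpt and lb_002.xpt, while a dataset that fits within the limit yields just lb.xpt.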
Data Types
xportrs supports the following column data types:
| Rust Type | XPT Type | Description |
|---|---|---|
| ColumnData::F64 | Numeric | 64-bit floating point |
| ColumnData::I64 | Numeric | 64-bit integer (stored as float) |
| ColumnData::String | Character | Variable-length text (1-200 bytes) |
All types support Option<T> for missing values (SAS missing = .).
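Because integer columns are stored as floats, very large integers may not survive the conversion unchanged: an f64 represents every integer exactly only up to 2^53 in magnitude. The sketch below shows one way a caller might guard against that before building a column. The helper name is hypothetical and not part of the xportrs API, and the check is deliberately conservative.

```rust
/// Every integer with magnitude up to 2^53 is exactly representable as an f64.
const MAX_EXACT_INT: u64 = 1 << 53;

/// Hypothetical helper: convert an optional integer into an optional float.
/// Missing values stay missing. Values outside the exactly representable
/// range are rejected rather than silently rounded. (Some larger integers,
/// such as 2^54, are also exact; this check ignores them for simplicity.)
fn int_to_numeric(value: Option<i64>) -> Result<Option<f64>, String> {
    match value {
        None => Ok(None),
        // `unsigned_abs` avoids the overflow that `abs` has for i64::MIN.
        Some(v) if v.unsigned_abs() <= MAX_EXACT_INT => Ok(Some(v as f64)),
        Some(v) => Err(format!("{v} cannot be stored exactly as a 64-bit float")),
    }
}
```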
Validation Issues
The library provides detailed validation feedback:
use ;
let plan = writer
.agency
.finalize ?;
// Check for issues before writing
if plan.has_errors
if plan.has_warnings
plan.write_path ?;
CDISC Terminology
This crate uses CDISC SDTM vocabulary:
| Term | Description |
|---|---|
| Domain dataset | A table identified by a domain code (e.g., "AE", "DM", "LB") |
| Observation | One row/record in the dataset |
| Variable | One column; may have a role (Identifier/Topic/Timing/Qualifier/Rule) |
Feature Flags
| Feature | Description |
|---|---|
| serde | Enable serialization/deserialization support |
| tracing | Enable structured logging with the tracing crate |
| full | Enable all optional features |
# Enable all features
xportrs = { version = "0.0.2", features = ["full"] }
Temporal Utilities
Convert between Rust chrono types and SAS date/time values:
use ;
use chrono::NaiveDate;
// Convert Rust date to SAS days
let date = from_ymd_opt.unwrap;
let sas_days = sas_days_since_1960;
// Convert SAS days back to Rust date
let back = date_from_sas_days;
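SAS dates count days from 1960-01-01, which is 3653 days before the Unix epoch. For readers who want to see the arithmetic behind such a conversion, here is a standard-library-only sketch using the well-known civil-calendar day-count algorithm. The function names are hypothetical; this is not the crate's implementation and it does not use chrono.

```rust
/// Days from 1960-01-01 (SAS epoch) to 1970-01-01 (Unix epoch).
const SAS_EPOCH_OFFSET_DAYS: i64 = 3653;

/// Hypothetical helper: proleptic Gregorian date to SAS day number.
/// Assumes `month` is 1..=12 and `day` is valid for that month.
fn sas_days_from_ymd(year: i64, month: i64, day: i64) -> i64 {
    // Treat March as the first month so the leap day falls at year end.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let year_of_era = y.rem_euclid(400);
    let shifted_month = (month + 9) % 12; // March = 0, February = 11
    let day_of_year = (153 * shifted_month + 2) / 5 + day - 1;
    let day_of_era = year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    let unix_days = era * 146_097 + day_of_era - 719_468;
    unix_days + SAS_EPOCH_OFFSET_DAYS
}

/// Hypothetical helper: SAS day number back to (year, month, day).
fn ymd_from_sas_days(sas_days: i64) -> (i64, i64, i64) {
    let z = sas_days - SAS_EPOCH_OFFSET_DAYS + 719_468;
    let era = z.div_euclid(146_097);
    let day_of_era = z.rem_euclid(146_097);
    let year_of_era =
        (day_of_era - day_of_era / 1_460 + day_of_era / 36_524 - day_of_era / 146_096) / 365;
    let day_of_year = day_of_era - (365 * year_of_era + year_of_era / 4 - year_of_era / 100);
    let shifted_month = (5 * day_of_year + 2) / 153;
    let day = day_of_year - (153 * shifted_month + 2) / 5 + 1;
    let month = if shifted_month < 10 { shifted_month + 3 } else { shifted_month - 9 };
    // January and February belong to the following calendar year.
    let year = year_of_era + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}
```

With these definitions, 1960-01-01 maps to day 0, 1970-01-01 maps to day 3653, and dates before the SAS epoch map to negative day numbers.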
Safety
This crate is built with #![forbid(unsafe_code)]. All binary parsing and encoding uses safe Rust constructs. The library has been designed with security in mind:
- No unsafe code blocks
- No external C dependencies
- Comprehensive input validation
- Protection against malformed files
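To illustrate what bounds-checked parsing in safe Rust can look like, the sketch below validates the first record of an XPT v5 byte buffer without indexing that could panic. The 80-byte record length and the library header prefix come from the SAS XPORT v5 format; the type and function names are hypothetical and this is not the crate's actual parser.

```rust
/// Fixed record length used by the XPT v5 format.
const RECORD_LEN: usize = 80;

/// Leading bytes of the first record in an XPT v5 file.
const LIBRARY_HEADER_PREFIX: &[u8] = b"HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!";

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum HeaderError {
    /// The input ended before a full record was available.
    Truncated { needed: usize, found: usize },
    /// The first record does not start with the expected prefix.
    BadMagic,
}

/// Hypothetical helper: return the first record if it looks like an
/// XPT v5 library header. `slice::get` returns `None` instead of
/// panicking when the input is too short, so malformed or truncated
/// files surface as errors.
fn check_library_header(bytes: &[u8]) -> Result<&[u8], HeaderError> {
    let record = bytes.get(..RECORD_LEN).ok_or(HeaderError::Truncated {
        needed: RECORD_LEN,
        found: bytes.len(),
    })?;
    if record.starts_with(LIBRARY_HEADER_PREFIX) {
        Ok(record)
    } else {
        Err(HeaderError::BadMagic)
    }
}
```

Returning a typed error for each failure mode lets callers distinguish a truncated file from one that is simply not in XPT format.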
Minimum Supported Rust Version (MSRV)
The minimum supported Rust version is 1.92.
Related Projects
- xportr (R) - The R package that inspired this crate
- Trial Submission Studio - Desktop application using xportrs
License
MIT License - see LICENSE for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Run tests (cargo test)
- Run clippy (cargo clippy --all-targets --all-features -- -D warnings)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request