SQL-on-FHIR Implementation
This crate provides a complete implementation of the SQL-on-FHIR specification for Rust, enabling the transformation of FHIR resources into tabular data using declarative ViewDefinitions. It supports all major FHIR versions (R4, R4B, R5, R6) through a version-agnostic abstraction layer.
Overview
The sof crate implements the HL7 FHIR SQL-on-FHIR Implementation Guide, providing:
- ViewDefinition Processing - Transform FHIR resources into tabular data using declarative configuration
- Multi-Version Support - Works seamlessly with R4, R4B, R5, and R6 FHIR specifications
- FHIRPath Integration - Complex data extraction using FHIRPath expressions
- Multiple Output Formats - CSV, JSON, NDJSON, and Parquet support
- Command Line Interface - Ready-to-use CLI tool for batch processing
- Server Implementation - HTTP API for on-demand transformations (see sof-server below)
Python Developers
Looking to use SQL-on-FHIR from Python? Check out the pysof package, which provides Python bindings for this crate:
pysof lets you transform FHIR data to CSV, JSON, NDJSON, or Parquet directly from Python.
Features:
- High Performance - Rust-powered processing with automatic multithreading (5-7x speedup)
- Simple API - Easy-to-use functions with native Python types
- Multiple Formats - Support for CSV, JSON, NDJSON, and Parquet outputs
- FHIR Versions - Compatible with R4, R4B, R5, and R6 (configurable at build time)
- PyPI Distribution - Install with `pip install pysof`
See the pysof README for installation and usage details.
Executables
This crate provides two executable targets:
sof-cli - Command Line Interface
A full-featured command-line interface (CLI) for running ViewDefinition transformations. It accepts FHIR ViewDefinition and Bundle resources as input (from files or stdin) and applies the SQL-on-FHIR transformation to produce structured output in formats such as CSV, JSON, NDJSON, or Parquet. The invocations below use illustrative file names:

```bash
# Basic CSV output (includes headers by default)
sof-cli -v view.json -b bundle.json
# CSV output without headers
sof-cli -v view.json -b bundle.json --no-headers
# JSON output to file
sof-cli -v view.json -b bundle.json -f json -o output.json
# Read ViewDefinition from stdin, Bundle from file
cat view.json | sof-cli -b bundle.json
# Read Bundle from stdin, ViewDefinition from file
cat bundle.json | sof-cli -v view.json
# Load data using --source parameter (supports local paths and URLs)
sof-cli -v view.json -s https://example.com/fhir/bundle.json
# Filter resources modified after a specific date
sof-cli -v view.json -b bundle.json --since 2024-01-01T00:00:00Z
# Limit results to first 100 rows
sof-cli -v view.json -b bundle.json --limit 100
# Combine filters: recent resources limited to 50 results
sof-cli -v view.json -b bundle.json --since 2024-01-01T00:00:00Z --limit 50
# Load NDJSON file (newline-delimited JSON) - automatically detected by .ndjson extension
sof-cli -v view.json -b patients.ndjson
# NDJSON content detection (works even without .ndjson extension)
sof-cli -v view.json -b patients.txt
# Streaming mode for large NDJSON files (memory-efficient chunked processing)
sof-cli -v view.json -b large-export.ndjson --chunk-size 500 --skip-invalid
```
CLI Features
- Flexible Input:
  - Read ViewDefinitions from file (`-v`) or stdin
  - Read Bundles from file (`-b`), stdin, or external sources (`-s`)
  - Use `-s`/`--source` to load from local paths (relative or absolute) or URLs: `file://`, `http(s)://`, `s3://`, `gs://`, `azure://`
  - Cannot read both ViewDefinition and Bundle from stdin simultaneously
- Output Formats: CSV (with/without headers), JSON (pretty-printed array), NDJSON (newline-delimited), Parquet (columnar binary format)
- Output Options: Write to stdout (default) or to a file specified with `-o`
- Result Filtering:
  - Filter resources by modification time with `--since` (RFC3339 format)
  - Limit the number of results with `--limit` (1-10000)
- Streaming Mode: Memory-efficient chunked processing for large NDJSON files
  - Automatically enabled when using `--bundle` with `.ndjson` files
  - Configurable chunk size with `--chunk-size` (default: 1000 resources)
  - Skip invalid lines with `--skip-invalid` for fault-tolerant processing
- FHIR Version Support: R4 by default; other versions (R4B, R5, R6) require compilation with feature flags
- Error Handling: Clear, actionable error messages for debugging
Command Line Options
```
  -v, --view <VIEW>               Path to ViewDefinition JSON file (or use stdin if not provided)
  -b, --bundle <BUNDLE>           Path to FHIR Bundle JSON file (or use stdin if not provided)
  -s, --source <SOURCE>           Path or URL to FHIR data source (see Data Sources below)
  -f, --format <FORMAT>           Output format (csv, json, ndjson, parquet) [default: csv]
      --no-headers                Exclude CSV headers (only for CSV format)
  -o, --output <OUTPUT>           Output file path (defaults to stdout)
      --since <SINCE>             Filter resources modified after this time (RFC3339 format)
      --limit <LIMIT>             Limit the number of results (1-10000)
      --fhir-version <VERSION>    FHIR version to use [default: R4]
      --parquet-row-group-size <MB>  Row group size for Parquet (64-1024MB) [default: 256]
      --parquet-page-size <KB>    Page size for Parquet (64-8192KB) [default: 1024]
      --parquet-compression <ALG> Compression for Parquet [default: snappy]
                                  Options: none, snappy, gzip, lz4, brotli, zstd
      --max-file-size <MB>        Maximum file size for Parquet output (10-10000MB) [default: 1000]
                                  When exceeded, creates numbered files (e.g., output.parquet, output_002.parquet)
      --chunk-size <N>            Number of resources per chunk for streaming NDJSON [default: 1000]
      --skip-invalid              Skip invalid JSON lines in NDJSON files instead of failing
  -h, --help                      Print help
```
* Additional FHIR versions (R4B, R5, R6) available when compiled with corresponding features
Data Sources
The CLI provides two ways to specify FHIR data:
- `-b`/`--bundle`: Direct path to a local file (simple, no protocol prefix needed)
- `-s`/`--source`: URL-based loading with protocol support (more flexible)

The `--source` parameter supports loading FHIR data from various sources; the examples below use illustrative bucket, container, and file names:
Local Files
```bash
# Using --bundle (simpler for local files)
sof-cli -v view.json -b bundle.json
# Using --source with relative path
sof-cli -v view.json -s data/bundle.json
# Using --source with absolute path
sof-cli -v view.json -s /data/fhir/bundle.json
# Using --source with file:// protocol
sof-cli -v view.json -s file:///data/fhir/bundle.json
```
HTTP/HTTPS URLs
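For example (illustrative URL):

```bash
sof-cli -v view.json -s https://example.com/fhir/bundle.json
```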
AWS S3
```bash
# Set AWS credentials (standard AWS environment variables)
export AWS_ACCESS_KEY_ID=your-access-key
export AWS_SECRET_ACCESS_KEY=your-secret-key
export AWS_DEFAULT_REGION=us-east-1

# Load from S3 bucket
sof-cli -v view.json -s s3://my-bucket/fhir/bundle.json
```
S3-Compatible Services (MinIO, Ceph, LocalStack)
```bash
# Standard AWS credentials
export AWS_ACCESS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minioadmin

# Custom endpoint for S3-compatible service (variable name is an assumption)
export AWS_ENDPOINT=http://localhost:9000

# Required for HTTP (non-HTTPS) endpoints (variable name is an assumption)
export AWS_ALLOW_HTTP=true

sof-cli -v view.json -s s3://my-bucket/fhir/bundle.json
```
Google Cloud Storage
```bash
# Option 1: Service account credentials (standard GCP variable)
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

# Option 2: Application Default Credentials
gcloud auth application-default login

# Load from GCS bucket
sof-cli -v view.json -s gs://my-bucket/fhir/bundle.json
```
Azure Blob Storage
```bash
# Option 1: Storage account credentials (variable names are an assumption)
export AZURE_STORAGE_ACCOUNT_NAME=myaccount
export AZURE_STORAGE_ACCOUNT_KEY=your-account-key

# Option 2: Azure managed identity (when running in Azure)
# No environment variables needed

# Load from Azure container
sof-cli -v view.json -s azure://my-container/fhir/bundle.json
```
The source can contain:
- A FHIR Bundle (JSON)
- A single FHIR resource (will be wrapped in a Bundle)
- An array of FHIR resources (will be wrapped in a Bundle)
- An NDJSON file (newline-delimited FHIR resources, automatically detected)
NDJSON Input Format
In addition to standard JSON, the CLI and server support NDJSON (newline-delimited JSON) as an input format. NDJSON files contain one FHIR resource per line, making them ideal for streaming large datasets.
Format Detection:
- Extension-based: Files with a `.ndjson` extension are automatically parsed as NDJSON
- Content-based fallback: If JSON parsing fails on a multi-line file, NDJSON parsing is attempted automatically
- Works with all data sources: local files, HTTP(S), S3, GCS, and Azure
Example NDJSON file:
{"resourceType": "Patient", "id": "patient-1", "gender": "male"}
{"resourceType": "Patient", "id": "patient-2", "gender": "female"}
{"resourceType": "Observation", "id": "obs-1", "status": "final", "code": {"text": "Test"}}
Error Handling:
- Invalid lines are skipped with warnings printed to stderr (or use `--skip-invalid` in streaming mode)
- Processing continues as long as at least one valid FHIR resource is found
- Empty lines and whitespace-only lines are ignored
Streaming Mode (Memory-Efficient Processing):
When processing large NDJSON files with --bundle, the CLI automatically uses streaming mode for memory-efficient processing:
```bash
# Stream large NDJSON file with default chunk size (1000 resources)
sof-cli -v view.json -b large-export.ndjson
# Custom chunk size for memory-constrained environments
sof-cli -v view.json -b large-export.ndjson --chunk-size 250
# Skip invalid lines and continue processing
sof-cli -v view.json -b large-export.ndjson --skip-invalid
# Output to file with streaming
sof-cli -v view.json -b large-export.ndjson -f ndjson -o results.ndjson
```
Streaming mode features:
- Bounded memory: Only `--chunk-size` resources are loaded at a time (~10MB per 1000 resources)
- Progressive output: Results are written incrementally, ideal for large datasets
- Fault tolerance: Use `--skip-invalid` to continue past malformed JSON lines
- Statistics: Reports resources processed, chunks, and output rows on completion
Usage Examples:
```bash
# Load from local NDJSON file
sof-cli -v view.json -s patients.ndjson
# Load from cloud storage
sof-cli -v view.json -s s3://my-bucket/export/patients.ndjson
# Mix NDJSON source with JSON bundle (combining -s and -b is an assumption)
sof-cli -v view.json -b bundle.json -s patients.ndjson
# Server API with NDJSON (parameters.json stands for a FHIR Parameters request body)
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?source=s3://my-bucket/export/patients.ndjson' \
  -H 'Content-Type: application/json' -d @parameters.json
```
Output Formats
The CLI supports multiple output formats via the `-f`/`--format` parameter:

- `csv` (default) - Comma-separated values format
  - Includes headers by default
  - Use the `--no-headers` flag to exclude column headers
  - All values are properly quoted according to CSV standards
- `json` - JSON array format
  - Pretty-printed for readability
  - Each row is a JSON object with column names as keys
  - Suitable for further processing with JSON tools
- `ndjson` - Newline-delimited JSON format
  - One JSON object per line
  - Streaming-friendly format
  - Ideal for processing large datasets
- `parquet` - Apache Parquet columnar format
  - Efficient binary format for analytics workloads
  - Automatic schema inference from data
  - Configurable compression (snappy, gzip, lz4, brotli, zstd, none)
  - Optimized for large datasets with automatic chunking
  - Configurable row group and page sizes for performance tuning
sof-server - HTTP Server
A high-performance HTTP server providing SQL-on-FHIR ViewDefinition transformation capabilities, with advanced Parquet support and streaming for large datasets. Use this server when you need a simple, stateless web service for SQL-on-FHIR transformations. If you need to run transformations against server-stored ViewDefinitions and server-stored FHIR data, use the full Helios FHIR Server in the `hfs` crate instead.
```bash
# Start server with defaults
sof-server

# Custom configuration via command line
sof-server --port 3000 --host 0.0.0.0 --log-level debug

# Custom configuration via environment variables
SOF_SERVER_PORT=3000 SOF_SERVER_HOST=0.0.0.0 sof-server

# Check server health (health endpoint path is an assumption)
curl http://localhost:8080/health

# Get CapabilityStatement
curl http://localhost:8080/metadata
```
Configuration
The server can be configured using either command-line arguments or environment variables. Command-line arguments take precedence when both are provided.
Environment Variables
| Variable | Description | Default |
|---|---|---|
| `SOF_SERVER_PORT` | Server port | `8080` |
| `SOF_SERVER_HOST` | Server host address | `127.0.0.1` |
| `SOF_LOG_LEVEL` | Log level (error, warn, info, debug, trace) | `info` |
| `SOF_MAX_BODY_SIZE` | Maximum request body size in bytes | `10485760` (10MB) |
| `SOF_REQUEST_TIMEOUT` | Request timeout in seconds | `30` |
| `SOF_ENABLE_CORS` | Enable CORS (true/false) | `true` |
| `SOF_CORS_ORIGINS` | Allowed CORS origins (comma-separated, `*` for any) | `*` |
| `SOF_CORS_METHODS` | Allowed CORS methods (comma-separated, `*` for any) | `GET,POST,PUT,DELETE,OPTIONS` |
| `SOF_CORS_HEADERS` | Allowed CORS headers (comma-separated, `*` for any) | Common headers¹ |
Command-Line Arguments
| Argument | Short | Description | Default |
|---|---|---|---|
| `--port` | `-p` | Server port | `8080` |
| `--host` | `-H` | Server host address | `127.0.0.1` |
| `--log-level` | `-l` | Log level | `info` |
| `--max-body-size` | `-m` | Max request body (bytes) | `10485760` |
| `--request-timeout` | `-t` | Request timeout (seconds) | `30` |
| `--enable-cors` | `-c` | Enable CORS | `true` |
| `--cors-origins` | | Allowed origins (comma-separated) | `*` |
| `--cors-methods` | | Allowed methods (comma-separated) | `GET,POST,PUT,DELETE,OPTIONS` |
| `--cors-headers` | | Allowed headers (comma-separated) | Common headers¹ |
Examples
```bash
# Production configuration with environment variables (50MB max body size)
SOF_SERVER_HOST=0.0.0.0 SOF_SERVER_PORT=8080 SOF_LOG_LEVEL=warn \
  SOF_MAX_BODY_SIZE=52428800 sof-server

# Development configuration
sof-server --port 3000 --log-level debug

# CORS configuration for specific frontend
sof-server --cors-origins https://app.example.com --cors-methods GET,POST,OPTIONS

# Disable CORS for internal services (flag syntax is an assumption)
sof-server --enable-cors false

# Show all configuration options
sof-server --help
```
Cloud Storage Configuration
When using the source parameter with cloud storage URLs, ensure the appropriate credentials are configured:
AWS S3 (s3:// URLs): standard AWS credential variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and a region)

S3-Compatible Services (MinIO, Ceph, LocalStack): the same AWS credentials plus a custom endpoint; HTTP (non-HTTPS) endpoints must be explicitly allowed

Google Cloud Storage (gs:// URLs): a service account key or Application Default Credentials

Azure Blob Storage (azure:// URLs): storage account credentials, or managed identity when running in Azure

See the Data Sources section above for example environment variable setups.
CORS Configuration
The server provides flexible CORS (Cross-Origin Resource Sharing) configuration to control which web applications can access the API:
- Origins: Specify which domains can access the server
  - Use `*` to allow any origin (default)
  - Provide a comma-separated list for specific origins: `https://app1.com,https://app2.com`
- Methods: Control which HTTP methods are allowed
  - Default: `GET,POST,PUT,DELETE,OPTIONS`
  - Use `*` to allow any method
  - Provide a comma-separated list: `GET,POST,OPTIONS`
- Headers: Specify which headers clients can send
  - Default: Common headers¹
  - Use `*` to allow any header
  - Provide a comma-separated list: `Content-Type,Authorization,X-Custom-Header`
Important Security Notes:
- When using the wildcard (`*`) for origins, credentials (cookies, auth headers) are automatically disabled for security
- To enable credentials, you must specify exact origins, not wildcards
- In production, always specify exact origins rather than `*` to prevent unauthorized access
```bash
# Development (permissive, no credentials)
sof-server --cors-origins '*'

# Production CORS configuration (with credentials; origins are illustrative)
sof-server --cors-origins https://app.example.com \
  --cors-methods GET,POST,OPTIONS \
  --cors-headers Content-Type,Authorization
```
¹ Default headers: Accept,Accept-Language,Content-Type,Content-Language,Authorization,X-Requested-With
Server Features
- HTTP API: RESTful endpoints for ViewDefinition execution
- CapabilityStatement: Discovery endpoint for server capabilities
- ViewDefinition Runner: Synchronous execution of ViewDefinitions
- Multi-format Output: Support for CSV, JSON, NDJSON, and Parquet responses
- Advanced Parquet Support:
- Configurable compression, row group size, and page size
- Automatic file splitting when size limits are exceeded
- ZIP archive generation for multi-file outputs
- Streaming Response: Chunked transfer encoding for large datasets
- FHIR Compliance: Proper OperationOutcome error responses
- Configurable CORS: Fine-grained control over cross-origin requests with support for specific origins, methods, and headers
API Endpoints
GET /metadata
Returns the server's CapabilityStatement describing supported operations:
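For example, against a server running with the default host and port:

```bash
curl http://localhost:8080/metadata
```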
POST /ViewDefinition/$viewdefinition-run
Execute a ViewDefinition transformation. In the examples below, `parameters.json` stands for a FHIR Parameters request body containing the ViewDefinition and input resources (see Parameters below):

```bash
# JSON output (default)
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run' \
  -H 'Content-Type: application/json' -d @parameters.json

# CSV output (includes headers by default)
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=text/csv' \
  -H 'Content-Type: application/json' -d @parameters.json

# CSV output without headers
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=text/csv&header=false' \
  -H 'Content-Type: application/json' -d @parameters.json

# NDJSON output
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=application/ndjson' \
  -H 'Content-Type: application/json' -d @parameters.json
```
Parameters
The $viewdefinition-run POST operation accepts parameters either as query parameters or in a FHIR Parameters resource.
Parameter table:
| Name | Type | Use | Scope | Min | Max | Documentation |
|---|---|---|---|---|---|---|
| _format | code | in | type, instance | 1 | 1 | Output format - application/json, application/ndjson, text/csv, application/parquet |
| header | boolean | in | type, instance | 0 | 1 | This parameter only applies to text/csv requests. true (default) - return headers in the response, false - do not return headers. |
| maxFileSize | integer | in | type, instance | 0 | 1 | Maximum Parquet file size in MB (10-10000). When exceeded, generates multiple files in a ZIP archive. |
| rowGroupSize | integer | in | type, instance | 0 | 1 | Parquet row group size in MB (64-1024, default: 256) |
| pageSize | integer | in | type, instance | 0 | 1 | Parquet page size in KB (64-8192, default: 1024) |
| compression | code | in | type, instance | 0 | 1 | Parquet compression: none, snappy (default), gzip, lz4, brotli, zstd |
| viewReference | Reference | in | type, instance | 0 | 1 | Reference to ViewDefinition to be used for data transformation. (not yet supported) |
| viewResource | ViewDefinition | in | type | 0 | 1 | ViewDefinition to be used for data transformation. |
| patient | Reference | in | type, instance | 0 | * | Filter resources by patient. |
| group | Reference | in | type, instance | 0 | * | Filter resources by group. (not yet supported) |
| source | string | in | type, instance | 0 | 1 | URL or path to FHIR data source. Supports file://, http(s)://, s3://, gs://, and azure:// protocols. |
| _limit | integer | in | type, instance | 0 | 1 | Limits the number of results. (1-10000) |
| _since | instant | in | type, instance | 0 | 1 | Return resources that have been modified after the supplied time. (RFC3339 format, validates format only) |
| resource | Resource | in | type, instance | 0 | * | Collection of FHIR resources to be transformed into a tabular projection. |
Query Parameters
All parameters except viewReference, viewResource, patient, group, and resource can be provided as POST query parameters:
- _format: Output format (required if not in Accept header)
  - `application/json` - JSON array output (default)
  - `text/csv` - CSV output
  - `application/ndjson` - Newline-delimited JSON
  - `application/parquet` - Parquet file
- header: Control CSV headers (only applies to CSV format)
  - `true` - Include headers (default for CSV)
  - `false` - Exclude headers
- source: URL to FHIR data (file://, http://, s3://, gs://, azure://)
- _limit: Limit results (1-10000)
- _since: Filter by modification time (RFC3339 format)
- maxFileSize: Maximum Parquet file size in MB (10-10000)
- rowGroupSize: Parquet row group size in MB (64-1024)
- pageSize: Parquet page size in KB (64-8192)
- compression: Parquet compression algorithm
Body Parameters
For POST requests, parameters can be provided in a FHIR Parameters resource:
- _format: As valueCode or valueString (overrides query params and Accept header)
- header: As valueBoolean (overrides query params)
- viewReference: As valueReference (not yet supported)
- viewResource: As resource (inline ViewDefinition)
- patient: As valueReference
- group: As valueReference (not yet supported)
- source: As valueString (URL to external FHIR data)
- _limit: As valueInteger
- _since: As valueInstant
- resource: As resource (can be repeated)
- maxFileSize: As valueInteger (for Parquet output)
- rowGroupSize: As valueInteger (for Parquet output)
- pageSize: As valueInteger (for Parquet output)
- compression: As valueCode or valueString (for Parquet output)
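For reference, a minimal request body might look like the following sketch (the ViewDefinition and Patient resource are illustrative):

```json
{
  "resourceType": "Parameters",
  "parameter": [
    { "name": "_format", "valueCode": "text/csv" },
    {
      "name": "viewResource",
      "resource": {
        "resourceType": "ViewDefinition",
        "status": "active",
        "resource": "Patient",
        "select": [
          {
            "column": [
              { "name": "id", "path": "getResourceKey()" },
              { "name": "gender", "path": "gender" }
            ]
          }
        ]
      }
    },
    {
      "name": "resource",
      "resource": { "resourceType": "Patient", "id": "patient-1", "gender": "male" }
    }
  ]
}
```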
Parameter Precedence
When the same parameter is specified in multiple places, the precedence order is:
- Parameters in request body (highest priority)
- Query parameters
- Accept header (for format only, lowest priority)
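For example, a request whose query string asks for headers while the body disables them returns CSV without headers, because body parameters win (illustrative request):

```bash
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=text/csv&header=true' \
  -H 'Content-Type: application/json' \
  -d '{
    "resourceType": "Parameters",
    "parameter": [
      { "name": "header", "valueBoolean": false },
      { "name": "viewResource", "resource": { "resourceType": "ViewDefinition", "status": "active", "resource": "Patient", "select": [ { "column": [ { "name": "id", "path": "id" } ] } ] } },
      { "name": "resource", "resource": { "resourceType": "Patient", "id": "p1" } }
    ]
  }'
```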
Response Headers
The server automatically sets appropriate response headers based on the output format and size:
Standard Response Headers:
- `Content-Type`: Based on the format parameter
  - `application/json` for JSON output
  - `text/csv` for CSV output
  - `application/ndjson` for NDJSON output
  - `application/parquet` for a single Parquet file
  - `application/zip` for multiple Parquet files
Streaming Response Headers (for large files):
- `Transfer-Encoding: chunked` - Automatically set for files > 10MB
- `Content-Disposition: attachment; filename="..."` - Suggests a filename for downloads
  - Single Parquet: `filename="data.parquet"`
  - Multiple Parquet (ZIP): `filename="data.zip"`
Note: The Transfer-Encoding: chunked header is automatically managed by the server. Clients don't need to set any special headers to receive chunked responses - they will automatically receive data in chunks if the response is large.
Examples
The requests below use an illustrative host and a `parameters.json` file containing a FHIR Parameters request body (at minimum the ViewDefinition):

```bash
# Limit results - first 50 records as CSV
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=text/csv&_limit=50' -H 'Content-Type: application/json' -d @parameters.json

# CSV without headers, limited to 20 results
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=text/csv&header=false&_limit=20' -H 'Content-Type: application/json' -d @parameters.json

# Using header parameter in request body (overrides query params)
# (the Parameters resource includes {"name": "header", "valueBoolean": false})
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=text/csv&header=true' -H 'Content-Type: application/json' -d @parameters.json

# Filter by modification time (requires resources with lastUpdated metadata)
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=application/json&_since=2024-01-01T00:00:00Z' -H 'Content-Type: application/json' -d @parameters.json

# Load data from S3 bucket
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=application/json&source=s3://my-bucket/fhir/bundle.json' -H 'Content-Type: application/json' -d @parameters.json

# Load data from Azure with filtering
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=application/json&source=azure://my-container/bundle.json&_limit=100' -H 'Content-Type: application/json' -d @parameters.json

# Generate Parquet with custom compression and row group size
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=application/parquet&compression=zstd&rowGroupSize=512' -H 'Content-Type: application/json' -d @parameters.json -o output.parquet

# Generate large Parquet with file splitting (returns ZIP if multiple files)
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=application/parquet&maxFileSize=500' -H 'Content-Type: application/json' -d @parameters.json -o output.zip

# Using Parquet parameters in request body
# (the Parameters resource includes compression, rowGroupSize, and pageSize entries)
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=application/parquet' -H 'Content-Type: application/json' -d @parameters.json -o output.parquet
```
Core Features
ViewDefinition Processing
Transform FHIR resources using declarative ViewDefinitions:
```rust
// Module paths and the ContentType variant follow the names used elsewhere in
// this README; adjust to the crate's actual API if they differ.
use sof::{run_view_definition, ContentType, SofBundle, SofViewDefinition};

// Parse ViewDefinition and Bundle
let view_definition: ViewDefinition = serde_json::from_str(view_json)?;
let bundle: Bundle = serde_json::from_str(bundle_json)?;

// Wrap in version-agnostic containers
let sof_view = SofViewDefinition::R4(view_definition);
let sof_bundle = SofBundle::R4(bundle);

// Transform to CSV with headers
let csv_output = run_view_definition(sof_view, sof_bundle, ContentType::CsvWithHeader)?;
```
Multi-Version FHIR Support
Seamlessly work with any supported FHIR version:
```rust
// Version-agnostic processing (sketch: the FhirVersion enum is an assumption;
// the SofBundle variants follow the example above)
let sof_bundle = match fhir_version {
    FhirVersion::R4 => SofBundle::R4(serde_json::from_str(bundle_json)?),
    FhirVersion::R5 => SofBundle::R5(serde_json::from_str(bundle_json)?),
    other => return Err(format!("unsupported FHIR version: {other:?}").into()),
};
```
Advanced ViewDefinition Features
forEach Iteration
Process collections with automatic row generation:
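A sketch of a ViewDefinition using `forEach`, following the SQL-on-FHIR ViewDefinition shape (the columns are illustrative); each `Patient.name` element produces its own output row:

```json
{
  "resourceType": "ViewDefinition",
  "status": "active",
  "resource": "Patient",
  "select": [
    { "column": [ { "name": "patient_id", "path": "getResourceKey()" } ] },
    {
      "forEach": "name",
      "column": [
        { "name": "family", "path": "family" },
        { "name": "given", "path": "given.join(' ')" }
      ]
    }
  ]
}
```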
Constants and Variables
Define reusable values for complex expressions:
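For example (a sketch; `%loinc` refers to the declared constant):

```json
{
  "resourceType": "ViewDefinition",
  "status": "active",
  "resource": "Observation",
  "constant": [ { "name": "loinc", "valueString": "http://loinc.org" } ],
  "select": [
    {
      "column": [
        { "name": "id", "path": "getResourceKey()" },
        { "name": "loinc_code", "path": "code.coding.where(system = %loinc).code.first()" }
      ]
    }
  ]
}
```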
Where Clauses
Filter resources using FHIRPath expressions:
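For example, keeping only active patients (a sketch):

```json
{
  "resourceType": "ViewDefinition",
  "status": "active",
  "resource": "Patient",
  "where": [ { "path": "active = true" } ],
  "select": [
    { "column": [ { "name": "id", "path": "getResourceKey()" } ] }
  ]
}
```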
Union Operations
Combine multiple select statements:
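For example, collecting phone and email contact points into one column with `unionAll` (a sketch):

```json
{
  "resourceType": "ViewDefinition",
  "status": "active",
  "resource": "Patient",
  "select": [
    {
      "column": [ { "name": "id", "path": "getResourceKey()" } ],
      "unionAll": [
        {
          "forEach": "telecom.where(system = 'phone')",
          "column": [ { "name": "contact", "path": "value" } ]
        },
        {
          "forEach": "telecom.where(system = 'email')",
          "column": [ { "name": "contact", "path": "value" } ]
        }
      ]
    }
  ]
}
```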
Output Formats
Multiple output formats for different integration needs:
```rust
// ContentType variant names are illustrative; the containers are cloned here
// only so each call can be shown independently.
use sof::ContentType;

// CSV without headers
let csv = run_view_definition(sof_view.clone(), sof_bundle.clone(), ContentType::Csv)?;

// CSV with headers
let csv_headers = run_view_definition(sof_view.clone(), sof_bundle.clone(), ContentType::CsvWithHeader)?;

// Pretty-printed JSON array
let json = run_view_definition(sof_view.clone(), sof_bundle.clone(), ContentType::Json)?;

// Newline-delimited JSON (streaming friendly)
let ndjson = run_view_definition(sof_view.clone(), sof_bundle.clone(), ContentType::NdJson)?;

// Apache Parquet (columnar binary format)
let parquet = run_view_definition(sof_view, sof_bundle, ContentType::Parquet)?;
```
Parquet Export
The SOF implementation supports Apache Parquet format for efficient columnar data storage and analytics:
- Automatic Schema Inference: Column types are automatically determined from the data
- FHIR Type Mapping:
  - `boolean` → BOOLEAN
  - `string`/`code`/`uri` → UTF8
  - `integer` → INT32
  - `decimal` → FLOAT64
  - `dateTime`/`date` → UTF8
  - Arrays → List types with nullable elements
- Optimized for Large Datasets:
- Automatic chunking into optimal batch sizes (100K-500K rows)
- Memory-efficient streaming for datasets > 1GB
- Configurable row group size (default: 256MB, range: 64-1024MB)
- Configurable page size (default: 1MB, range: 64KB-8MB)
- Compression Options:
  - `snappy` (default): Fast compression with a good ratio
  - `gzip`: Maximum compatibility, good compression
  - `lz4`: Fastest compression/decompression
  - `zstd`: Balanced speed and compression ratio
  - `brotli`: Best compression ratio
  - `none`: No compression for maximum speed
- Null Handling: All fields are OPTIONAL to accommodate FHIR's nullable nature
- Complex Types: Objects and nested structures are serialized as JSON strings
Example usage:
```bash
# CLI export with default settings (256MB row groups, snappy compression)
sof-cli -v view.json -b bundle.json -f parquet -o output.parquet

# Optimize for smaller files with better compression
sof-cli -v view.json -b bundle.json -f parquet --parquet-compression zstd --parquet-row-group-size 128 -o output.parquet

# Maximize compression for archival
sof-cli -v view.json -b bundle.json -f parquet --parquet-compression brotli -o output.parquet

# Fast processing with minimal compression
sof-cli -v view.json -b bundle.json -f parquet --parquet-compression lz4 -o output.parquet

# Split large datasets into multiple files (500MB each)
sof-cli -v view.json -b large-bundle.json -f parquet --max-file-size 500 -o output.parquet
# Creates: output.parquet (first 500MB)
#          output_002.parquet (next 500MB)
#          output_003.parquet (remaining data)

# Server API - single Parquet file
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=application/parquet' -H 'Content-Type: application/json' -d @parameters.json -o output.parquet

# Server API - with file splitting (returns ZIP archive if multiple files)
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=application/parquet&maxFileSize=500' -H 'Content-Type: application/json' -d @parameters.json -o output.zip

# Server API - optimized settings for large datasets
curl -X POST 'http://localhost:8080/ViewDefinition/$viewdefinition-run?_format=application/parquet&rowGroupSize=512&pageSize=2048&compression=zstd' -H 'Content-Type: application/json' -d @parameters.json -o output.parquet
```
Performance Guidelines
- Row Group Size: Larger row groups (256-512MB) improve compression and columnar efficiency but require more memory during processing
- Page Size: Smaller pages (64-512KB) enable fine-grained reads and better predicate pushdown; larger pages (1-8MB) reduce metadata overhead
- Compression:
  - Use `snappy` or `lz4` for real-time processing
  - Use `zstd` for balanced storage and query performance
  - Use `brotli` or `gzip` for long-term storage where space is critical
- Large Datasets: The implementation automatically chunks data to prevent memory issues, processing in batches optimized for the configured row group size
- File Splitting: When `--max-file-size` or `maxFileSize` is specified:
  - Files are split when they exceed the specified size in MB
  - Each file contains complete row groups and is independently queryable
  - CLI: Files are named with sequential numbering: `base.parquet`, `base_002.parquet`, `base_003.parquet`, etc.
  - Server: Multiple files are automatically packaged into a ZIP archive for convenient download
  - Ideal for distributed processing systems that parallelize across files
- Streaming Response (Server only):
  - Files larger than 10MB are automatically streamed using chunked transfer encoding
  - Reduces memory usage on both server and client
  - Multiple Parquet files are streamed as a ZIP archive with proper content disposition headers
  - Enables processing of gigabyte-scale datasets without memory constraints
  - Response headers for streaming:
    - `Transfer-Encoding: chunked` - Automatically set by the server for streaming responses
    - `Content-Type: application/parquet` or `application/zip` - Based on single or multi-file output
    - `Content-Disposition: attachment; filename="data.parquet"` or `filename="data.zip"` - For convenient file downloads
  - Chunked responses use 64KB chunks for optimal network efficiency
Performance
Multi-Threading
The SQL-on-FHIR implementation leverages multi-core processors for optimal performance through parallel resource processing:
- Automatic Parallelization: FHIR resources are processed in parallel using `rayon` for both batch and streaming modes
- Streaming Mode Benefits: Streaming NDJSON processing uses parallel chunk processing, achieving throughput comparable to or better than batch mode while using 35-150x less memory
- Zero Configuration: Parallel processing is always enabled with intelligent work distribution
- Thread Pool Control: Optionally control thread count via the `RAYON_NUM_THREADS` environment variable
RAYON_NUM_THREADS Environment Variable
The RAYON_NUM_THREADS environment variable controls the number of threads used for parallel processing:
```bash
# Use all available CPU cores (default behavior)
sof-cli -v view.json -b bundle.json

# Limit to 4 threads for resource-constrained environments
RAYON_NUM_THREADS=4 sof-cli -v view.json -b bundle.json

# Use single thread (disables parallelization)
RAYON_NUM_THREADS=1 sof-cli -v view.json -b bundle.json

# Server with custom thread pool
RAYON_NUM_THREADS=8 sof-server

# Python (pysof) also respects this variable
RAYON_NUM_THREADS=4 python transform.py
```
When to adjust thread count:
- Reduce threads (`RAYON_NUM_THREADS=2-4`): On shared systems, containers with CPU limits, or when running multiple instances
- Increase threads: Rarely needed; rayon auto-detects available cores
- Single thread (`RAYON_NUM_THREADS=1`): For debugging, profiling, or deterministic output ordering
Performance Benchmarks
Batch Mode (Bundle processing):
| Bundle Size | Sequential Time | Parallel Time | Speedup |
|---|---|---|---|
| 10 patients | 22.7ms | 8.3ms | 2.7x |
| 50 patients | 113.8ms | 16.1ms | 7.1x |
| 100 patients | 229.4ms | 35.7ms | 6.4x |
| 500 patients | 1109ms | 152ms | 7.3x |
Streaming Mode (NDJSON processing):
| Dataset | Batch Mode | Streaming Mode | Memory Reduction |
|---|---|---|---|
| 10k Patients (32MB) | 2.66s, 1.6GB | 0.93s, 45MB | 35x less memory, 2.9x faster |
| 93k Encounters (136MB) | 3.97s, 3.9GB | 2.75s, 25MB | 155x less memory, 1.4x faster |
The parallel processing ensures:
- Each FHIR resource is processed independently on available threads
- Column ordering is maintained consistently across parallel operations
- Thread-safe evaluation contexts for FHIRPath expressions
- Efficient load balancing through work-stealing algorithms
- Both batch and streaming modes benefit from parallelization
Architecture
Version-Agnostic Design
The crate uses trait abstractions to provide uniform processing across FHIR versions: the processing pipeline is written against version-independent traits and containers (such as the SofViewDefinition and SofBundle wrappers used above) rather than against any one version's generated types.
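A minimal sketch of this pattern (illustrative names only, not the crate's actual API):

```rust
// Illustrative sketch only; the crate's real trait and type names may differ.

// Stand-ins for the generated, version-specific FHIR model types.
struct R4Bundle;
struct R5Bundle;

// A version-agnostic container lets one processing pipeline accept any
// supported FHIR version through a single entry point.
enum SofBundle {
    R4(R4Bundle),
    R5(R5Bundle),
}

// The pipeline depends only on the behavior it needs, not on a concrete version.
trait ResourceSource {
    fn resource_count(&self) -> usize;
}

impl ResourceSource for SofBundle {
    fn resource_count(&self) -> usize {
        // A real implementation would walk the bundle entries of whichever
        // version is wrapped; the stand-in types here carry no data.
        match self {
            SofBundle::R4(_) => 0,
            SofBundle::R5(_) => 0,
        }
    }
}

fn main() {
    let bundle = SofBundle::R4(R4Bundle);
    println!("resources: {}", bundle.resource_count());
}
```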
Processing Pipeline
- Input Validation - Verify ViewDefinition structure and FHIR version compatibility
- Constant Extraction - Parse constants/variables for use in expressions
- Resource Filtering - Apply where clauses to filter input resources
- Row Generation - Process select statements with forEach support
- Output Formatting - Convert to requested format (CSV, JSON, etc.)
Error Handling
Comprehensive error types for different failure scenarios:
```rust
// Module path and variant names are illustrative; match the error cases your
// version of the crate exposes.
use sof::SofError;

match run_view_definition(sof_view, sof_bundle, ContentType::Csv) {
    Ok(output) => { /* write the output */ }
    Err(SofError::InvalidViewDefinition(msg)) => eprintln!("invalid ViewDefinition: {msg}"),
    Err(other) => eprintln!("transformation failed: {other}"),
}
```
Feature Flags
Enable support for specific FHIR versions:
```toml
[dependencies]
# The crate name shown here is illustrative; use the published name of this crate
sof = { version = "1.0", features = ["R4", "R5"] }

# Or enable all versions
sof = { version = "1.0", features = ["R4", "R4B", "R5", "R6"] }
```
Available features:
- `R4` - FHIR 4.0.1 support (default)
- `R4B` - FHIR 4.3.0 support
- `R5` - FHIR 5.0.0 support
- `R6` - FHIR 6.0.0 support
Integration Examples
Batch Processing Pipeline
```rust
// Sketch of a batch pipeline; module paths and the version-specific model types
// (ViewDefinition, Bundle) are assumptions based on the examples above.
use std::fs;
use sof::{run_view_definition, ContentType, SofBundle, SofViewDefinition};

fn process_directory(view_path: &str, bundle_dir: &str) -> Result<(), Box<dyn std::error::Error>> {
    let view: ViewDefinition = serde_json::from_str(&fs::read_to_string(view_path)?)?;

    for entry in fs::read_dir(bundle_dir)? {
        let path = entry?.path();
        let bundle: Bundle = serde_json::from_str(&fs::read_to_string(&path)?)?;
        let csv = run_view_definition(
            SofViewDefinition::R4(view.clone()),
            SofBundle::R4(bundle),
            ContentType::CsvWithHeader,
        )?;
        fs::write(path.with_extension("csv"), csv)?;
    }
    Ok(())
}
```
Custom Error Handling
Applications that need custom recovery behavior can match on individual `SofError` variants, as shown in the Error Handling section above, and decide per error whether to skip a resource, retry, or abort the batch.
Testing
The crate includes comprehensive tests covering:
- ViewDefinition Validation - Structure and logic validation
- FHIRPath Integration - Expression evaluation and error handling
- Multi-Version Compatibility - Cross-version processing
- Output Format Validation - Correct CSV, JSON, and NDJSON generation
- Edge Cases - Empty results, null values, complex nested structures
- Query Parameter Validation - Pagination, filtering, and format parameters
- Error Handling - Proper FHIR OperationOutcome responses for invalid parameters
Run tests with:
```bash
# All tests
cargo test

# Specific FHIR version
cargo test --features R5

# Integration tests only (substitute the integration test target name)
cargo test --test <integration_test_name>
```