Module iterator

Streaming record iterator for file-level decoding.

This module provides iterator-based access to records for programmatic processing, allowing users to process records one at a time without loading entire files into memory.

§Overview

The iterator module implements streaming record processing with bounded memory usage. It provides low-level iterator primitives for reading COBOL data files sequentially, supporting both fixed-length and RDW (Record Descriptor Word) variable-length formats.

Key capabilities:

  1. Streaming iteration (RecordIterator) - Process records one at a time
  2. Format flexibility - Handle both fixed-length and RDW variable-length records
  3. Raw access (RecordIterator::read_raw_record) - Access undecoded record bytes
  4. Convenience functions (iter_records_from_file, iter_records) - Simplified creation
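
To make the two record formats concrete, here is a minimal standard-library sketch of how a reader might frame records. It is not the crate's implementation: the helper names (`fill`, `read_fixed`, `read_rdw`) are hypothetical, and the sketch assumes a 4-byte RDW whose first two bytes are a big-endian length. Whether that length counts the header itself varies between systems, so it is left as a parameter.

```rust
use std::io::{self, Read};

/// Hypothetical helper: read until `buf` is full or the input ends.
/// Returns how many bytes were actually read.
fn fill<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // end of input
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Fixed-length framing: every record is exactly `lrecl` bytes.
/// Returns `Ok(None)` at a clean end of input.
fn read_fixed<R: Read>(reader: &mut R, lrecl: usize) -> io::Result<Option<Vec<u8>>> {
    let mut record = vec![0u8; lrecl];
    match fill(reader, &mut record)? {
        0 => Ok(None),
        n if n == lrecl => Ok(Some(record)),
        _ => Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "truncated fixed-length record",
        )),
    }
}

/// RDW framing (assumed layout): 2-byte big-endian length, 2 reserved bytes, payload.
/// `length_includes_header` selects whether the length counts the 4 header bytes.
fn read_rdw<R: Read>(
    reader: &mut R,
    length_includes_header: bool,
) -> io::Result<Option<Vec<u8>>> {
    let mut header = [0u8; 4];
    match fill(reader, &mut header)? {
        0 => return Ok(None),
        4 => {}
        _ => {
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "truncated RDW header",
            ))
        }
    }
    let declared = u16::from_be_bytes([header[0], header[1]]) as usize;
    let payload_len = if length_includes_header {
        declared.checked_sub(4).ok_or_else(|| {
            io::Error::new(io::ErrorKind::InvalidData, "RDW length smaller than header")
        })?
    } else {
        declared
    };
    let mut payload = vec![0u8; payload_len];
    if fill(reader, &mut payload)? != payload_len {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            "truncated RDW payload",
        ));
    }
    Ok(Some(payload))
}
```

Calling either helper in a loop until it returns `Ok(None)` yields the records one at a time; decoding each payload against the schema is the part the real iterator adds on top of framing.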

§Performance Characteristics

The iterator uses buffered I/O and maintains bounded memory usage:

  • Memory: One record buffer (typically <32 KiB per record)
  • Throughput: Depends on decode complexity (DISPLAY vs COMP-3)
  • Latency: Sequential I/O optimized with BufReader
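
The bounded-memory claim can be illustrated with a small standard-library sketch: one `BufReader` plus one reusable record buffer, regardless of file size. `FixedRecords` and `count_records` are hypothetical names for illustration and say nothing about how `RecordIterator` is structured internally.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Hypothetical fixed-length reader that owns exactly one record buffer.
struct FixedRecords<R: Read> {
    reader: BufReader<R>,
    buf: Vec<u8>, // reused for every record, so memory stays bounded
}

impl<R: Read> FixedRecords<R> {
    fn new(inner: R, lrecl: usize) -> Self {
        assert!(lrecl > 0, "record length must be non-zero");
        Self {
            reader: BufReader::new(inner),
            buf: vec![0u8; lrecl],
        }
    }

    /// Returns a borrowed view of the next record, or `None` at a clean end of input.
    /// A trailing partial record surfaces as an `UnexpectedEof` error.
    fn next_record(&mut self) -> io::Result<Option<&[u8]>> {
        // An empty internal buffer after fill_buf means there is no more input.
        if self.reader.fill_buf()?.is_empty() {
            return Ok(None);
        }
        self.reader.read_exact(&mut self.buf)?;
        Ok(Some(self.buf.as_slice()))
    }
}

/// Walk the whole input while holding only one record in memory at a time.
fn count_records<R: Read>(inner: R, lrecl: usize) -> io::Result<usize> {
    let mut records = FixedRecords::new(inner, lrecl);
    let mut count = 0;
    while let Some(_record) = records.next_record()? {
        count += 1;
    }
    Ok(count)
}
```

Because `next_record` lends out its internal buffer, a caller that needs to keep a record past the next call has to copy it, which is the usual trade-off for avoiding a per-record allocation.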

For high-throughput parallel processing, consider using crate::decode_file_to_jsonl which provides parallel worker pools and streaming output.

§Examples

§Basic Fixed-Length Record Iteration

use copybook_codec::{iter_records_from_file, DecodeOptions, Codepage, RecordFormat};
use copybook_core::parse_copybook;

// Parse copybook schema
let copybook_text = r#"
    01 CUSTOMER-RECORD.
       05 CUSTOMER-ID    PIC 9(5).
       05 CUSTOMER-NAME  PIC X(20).
       05 BALANCE        PIC S9(7)V99 COMP-3.
"#;
let schema = parse_copybook(copybook_text)?;

// Configure decoding options
let options = DecodeOptions::new()
    .with_codepage(Codepage::CP037)
    .with_format(RecordFormat::Fixed);

// Create iterator from file
let iterator = iter_records_from_file("customers.bin", &schema, &options)?;

// Process records one at a time
for (index, result) in iterator.enumerate() {
    match result {
        Ok(json_value) => {
            println!("Record {}: {}", index + 1, json_value);
        }
        Err(error) => {
            eprintln!("Error in record {}: {}", index + 1, error);
            break; // Stop on first error
        }
    }
}
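
For orientation, BALANCE in the copybook above is a packed-decimal (COMP-3) field: nine digits plus a sign nibble, so it occupies 5 bytes and the whole record is 5 + 20 + 5 = 30 bytes. The sketch below shows the general packed-decimal layout using only the standard library. The function names are hypothetical and this is not the crate's decoder; it only illustrates why such fields cost more to decode than DISPLAY text.

```rust
/// Bytes needed for a packed-decimal field with `digits` decimal digits:
/// two digits per byte, with the final half-byte holding the sign.
fn packed_len(digits: usize) -> usize {
    digits / 2 + 1
}

/// Decode packed decimal into an unscaled integer (the implied decimal point
/// from `V99` is applied separately). Returns `None` for invalid nibbles.
fn unpack_comp3(bytes: &[u8]) -> Option<i64> {
    let (last, body) = bytes.split_last()?;
    let mut value: i64 = 0;
    for &b in body {
        let (hi, lo) = (b >> 4, b & 0x0F);
        if hi > 9 || lo > 9 {
            return None;
        }
        value = value
            .checked_mul(100)?
            .checked_add(i64::from(hi) * 10 + i64::from(lo))?;
    }
    // The last byte carries one digit and the sign nibble.
    let hi = last >> 4;
    if hi > 9 {
        return None;
    }
    value = value.checked_mul(10)?.checked_add(i64::from(hi))?;
    match last & 0x0F {
        0x0A | 0x0C | 0x0E | 0x0F => Some(value), // positive or unsigned
        0x0B | 0x0D => Some(-value),              // negative
        _ => None,
    }
}

/// Render an unscaled integer with `scale` implied decimal places.
fn format_scaled(unscaled: i64, scale: usize) -> String {
    let sign = if unscaled < 0 { "-" } else { "" };
    let digits = unscaled.unsigned_abs().to_string();
    if scale == 0 {
        return format!("{}{}", sign, digits);
    }
    // Left-pad with zeros so at least one digit precedes the decimal point.
    let padded = format!("{:0>width$}", digits, width = scale + 1);
    let (int_part, frac_part) = padded.split_at(padded.len() - scale);
    format!("{}{}.{}", sign, int_part, frac_part)
}
```

With these helpers, the bytes `00 00 12 34 5C` unpack to 12345, which formats as `123.45` at scale 2.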

§RDW Variable-Length Records

use copybook_codec::{RecordIterator, DecodeOptions, RecordFormat};
use copybook_core::parse_copybook;
use std::fs::File;

let copybook_text = r#"
    01 TRANSACTION.
       05 TRAN-ID       PIC 9(10).
       05 TRAN-AMOUNT   PIC S9(9)V99 COMP-3.
       05 TRAN-DESC     PIC X(100).
"#;
let schema = parse_copybook(copybook_text)?;

let options = DecodeOptions::new()
    .with_format(RecordFormat::RDW);  // RDW variable-length format

let file = File::open("transactions.dat")?;
let iterator = RecordIterator::new(file, &schema, &options)?;

// Process with error recovery
let mut processed = 0;
let mut errors = 0;

for (index, result) in iterator.enumerate() {
    match result {
        Ok(_json_value) => {
            processed += 1;
            // Process record...
        }
        Err(error) => {
            errors += 1;
            eprintln!("Record {}: {}", index + 1, error);

            if errors > 10 {
                eprintln!("Too many errors, stopping");
                break;
            }
        }
    }
}

println!("Processed: {}, Errors: {}", processed, errors);

§Raw Record Access (No Decoding)

use copybook_codec::{RecordIterator, DecodeOptions, RecordFormat};
use copybook_core::parse_copybook;
use std::io::Cursor;

let copybook_text = "01 RECORD.\n   05 DATA PIC X(10).";
let schema = parse_copybook(copybook_text)?;

let options = DecodeOptions::new()
    .with_format(RecordFormat::Fixed);

let data = b"RECORD0001RECORD0002";
let mut iterator = RecordIterator::new(Cursor::new(data), &schema, &options)?;

// Read raw bytes without JSON decoding
while let Some(raw_bytes) = iterator.read_raw_record()? {
    println!("Raw record {}: {} bytes",
             iterator.current_record_index(),
             raw_bytes.len());

    // Process raw bytes directly...
    // (useful for binary analysis, checksums, etc.)
}
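
As one example of the checksum use case mentioned in the comment above, the following standard-library sketch computes a 32-bit FNV-1a hash per raw record. The function names are hypothetical, and FNV-1a is chosen here only because it is short to write down; it is not something the crate is known to provide.

```rust
/// 32-bit FNV-1a hash of a byte slice.
fn fnv1a32(bytes: &[u8]) -> u32 {
    let mut hash: u32 = 0x811C_9DC5; // FNV offset basis
    for &b in bytes {
        hash ^= u32::from(b);
        hash = hash.wrapping_mul(0x0100_0193); // FNV prime
    }
    hash
}

/// Checksum every complete `lrecl`-byte record in `data`.
/// A trailing partial record is ignored by `chunks_exact`.
/// Panics if `lrecl` is zero.
fn record_checksums(data: &[u8], lrecl: usize) -> Vec<u32> {
    data.chunks_exact(lrecl).map(fnv1a32).collect()
}
```

In a streaming setting the same `fnv1a32` call could be applied to each slice returned by `read_raw_record` instead of to an in-memory buffer.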

§Collecting Records into a Vec

use copybook_codec::{iter_records, DecodeOptions};
use copybook_core::parse_copybook;
use serde_json::Value;
use std::io::Cursor;

let copybook_text = "01 RECORD.\n   05 ID PIC 9(5).";
let schema = parse_copybook(copybook_text)?;
let options = DecodeOptions::default();

let data = b"0000100002";
let iterator = iter_records(Cursor::new(data), &schema, &options)?;

// Collect all successful records
let records: Vec<Value> = iterator
    .filter_map(Result::ok)  // Silently drop records that failed to decode
    .collect();

println!("Collected {} records", records.len());
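
Dropping errors with `filter_map(Result::ok)` is convenient but hides decode failures. Since the iterator yields `Result` items, two alternatives work with any such iterator and are sketched here over generic types using only the standard library. `collect_strict` and `collect_lenient` are hypothetical helper names.

```rust
/// Stop at the first error and return it; otherwise return every value.
fn collect_strict<T, E>(iter: impl Iterator<Item = Result<T, E>>) -> Result<Vec<T>, E> {
    // `Result<Vec<T>, E>` implements `FromIterator<Result<T, E>>`,
    // which short-circuits on the first `Err`.
    iter.collect()
}

/// Keep going past errors, but return them alongside the successes.
fn collect_lenient<T, E>(iter: impl Iterator<Item = Result<T, E>>) -> (Vec<T>, Vec<E>) {
    let mut values = Vec::new();
    let mut errors = Vec::new();
    for item in iter {
        match item {
            Ok(value) => values.push(value),
            Err(error) => errors.push(error),
        }
    }
    (values, errors)
}
```

The strict form suits pipelines where a single bad record invalidates the batch; the lenient form suits auditing, where the caller wants both the decoded records and a list of what failed.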

§Using with DecodeOptions and Metadata

use copybook_codec::{iter_records_from_file, DecodeOptions, Codepage, JsonNumberMode};
use copybook_core::parse_copybook;

let copybook_text = r#"
    01 RECORD.
       05 AMOUNT PIC S9(9)V99 COMP-3.
"#;
let schema = parse_copybook(copybook_text)?;

// Configure with lossless numbers and metadata
let options = DecodeOptions::new()
    .with_codepage(Codepage::CP037)
    .with_json_number_mode(JsonNumberMode::Lossless)
    .with_emit_meta(true);  // Include field metadata

let iterator = iter_records_from_file("data.bin", &schema, &options)?;

for result in iterator {
    let json_value = result?;
    // JSON includes metadata: {"AMOUNT": "123.45", "_meta": {...}}
    println!("{}", serde_json::to_string_pretty(&json_value)?);
}
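
The comment above shows AMOUNT rendered as a string. One plausible motivation for a lossless mode (an inference from that example, not a statement about the crate's internals) is that binary floating point cannot hold every decimal value a COBOL field can. The hypothetical helper below checks whether an unscaled integer survives a round trip through `f64`.

```rust
/// Returns true if `unscaled` is exactly representable as an f64.
/// The comparison is done in i128 so that values near i64::MAX,
/// which round up past the i64 range, are reported correctly.
fn survives_f64(unscaled: i64) -> bool {
    (unscaled as f64) as i128 == i128::from(unscaled)
}
```

Small amounts such as 12345 pass, but 2^53 + 1 (9007199254740993) and the largest 18-digit value (999999999999999999, as in `PIC S9(16)V99`) do not, which is why emitting such numbers as strings avoids silent rounding.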

§Structs

RecordIterator
Iterator over records in a data file, yielding decoded JSON values

§Functions

iter_records
Convenience function to create a record iterator from any readable source
iter_records_from_file
Convenience function to create a record iterator from a file path