scanstate 0.1.0

Generic scan checkpoint, journal, and progress primitives for pause/resume workflows

scanstate — Scan checkpointing and progress primitives

Why

Long-running scanners get interrupted by crashes, deploy rollouts, and rate limits. scanstate is a reliability layer that avoids rescanning already-finished targets and preserves progress state between runs.

It provides checkpoint persistence, write-ahead journaling, and live progress metrics so scan orchestrators can resume safely and stay observable.

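Use it anywhere you need at-least-once progress semantics (a target interrupted before it is marked complete may be scanned again on resume, while a completed target is skipped) with deterministic resume behavior.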

Quick Start

use scanstate::{load_or_new, ScanProgress};

fn main() -> Result<(), scanstate::ScanStateError> {
    let path = std::path::Path::new("/tmp/my-scan.json");
    let mut checkpoint = load_or_new(path, "scan-2026-03-25");

    checkpoint.mark_complete("https://example.com");
    assert!(checkpoint.is_complete("https://example.com"));

    let mut progress = ScanProgress::new(10);
    progress.record_completed();
    progress.record_findings(2);

    checkpoint.save(path)?;
    println!("completed: {}", checkpoint.completed_count());
    Ok(())
}

Features

  • Atomic checkpoint persistence with crash-safe save/reload.
  • Write-ahead journal append/replay model for robust resume.
  • Progress counters with rate + ETA computation.
  • Checkpointable trait (mark_done, is_done, done_count) for custom state objects.
  • Merge support for combining checkpoints from multiple workers.
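
Merging checkpoints from several workers can be pictured as a set union over completed targets. The following is a minimal std-only sketch of that idea, not scanstate's implementation: the `Checkpoint` struct, its fields, and the `merge` signature are hypothetical stand-ins chosen for illustration.

```rust
use std::collections::BTreeSet;

/// Hypothetical checkpoint shape: a scan id plus the set of finished targets.
#[derive(Debug, Clone, PartialEq)]
struct Checkpoint {
    scan_id: String,
    completed: BTreeSet<String>,
}

impl Checkpoint {
    fn new(scan_id: &str, targets: &[&str]) -> Self {
        Checkpoint {
            scan_id: scan_id.to_string(),
            completed: targets.iter().map(|t| t.to_string()).collect(),
        }
    }

    /// Union the other worker's completed targets into this checkpoint.
    /// Checkpoints from a different scan are rejected rather than mixed in.
    fn merge(&mut self, other: &Checkpoint) -> Result<(), String> {
        if self.scan_id != other.scan_id {
            return Err(format!(
                "scan id mismatch: {} vs {}",
                self.scan_id, other.scan_id
            ));
        }
        self.completed.extend(other.completed.iter().cloned());
        Ok(())
    }
}

fn main() {
    // Two workers covered overlapping slices of the same scan.
    let mut a = Checkpoint::new("daily-run", &["target-a", "target-b"]);
    let b = Checkpoint::new("daily-run", &["target-b", "target-c"]);
    a.merge(&b).expect("same scan id");
    println!("merged: {:?}", a.completed);
}
```

Because the merge is a set union, it is idempotent and order-independent, so worker checkpoints can be combined in any order and re-merged without harm.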

TOML Configuration

scanstate supports checkpoint settings in TOML.

scan_id = "my-scan"
checkpoint_path = "/var/run/scan-checkpoint.json"
total_targets = 1000
sync_checkpoint = true
flush_interval_secs = 5

Parse the settings from a TOML string:

use scanstate::CheckpointSettings;

let settings = CheckpointSettings::from_toml(
    r#"scan_id = "my-scan"
checkpoint_path = "checkpoint.json""#
).unwrap();
assert_eq!(settings.scan_id, "my-scan");

API Overview

  • ScanCheckpoint: completed-target registry with mark_complete, is_complete, and save/load persistence.
  • WriteAheadJournal: append-only recovery log for target events.
  • Entry: journal record shape.
  • ScanProgress: completed/skipped/findings counters + rate + ETA (serde serializable).
  • Checkpointable: adapter trait for custom scan states.
  • load_or_new: one-call resume helper that takes a checkpoint path and a scan ID.
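
To make the "rate + ETA" part of ScanProgress concrete, here is a rough sketch of how such numbers are usually derived from a completed counter and elapsed time. The `Progress` struct and its `rate` and `eta` methods are hypothetical; scanstate's own method names and formulas may differ.

```rust
use std::time::Duration;

/// Hypothetical counters, loosely modelled on the ScanProgress description.
struct Progress {
    total: u64,
    completed: u64,
}

impl Progress {
    /// Completed targets per second over the elapsed wall-clock time.
    fn rate(&self, elapsed: Duration) -> f64 {
        let secs = elapsed.as_secs_f64();
        if secs > 0.0 {
            self.completed as f64 / secs
        } else {
            0.0
        }
    }

    /// Estimated time remaining, or None while no rate is known yet.
    fn eta(&self, elapsed: Duration) -> Option<Duration> {
        let rate = self.rate(elapsed);
        if rate <= 0.0 {
            return None;
        }
        let remaining = self.total.saturating_sub(self.completed);
        Some(Duration::from_secs_f64(remaining as f64 / rate))
    }
}

fn main() {
    let progress = Progress { total: 10, completed: 4 };
    let elapsed = Duration::from_secs(8);
    println!("rate: {:.2} targets/s", progress.rate(elapsed));
    println!("eta: {:?}", progress.eta(elapsed));
}
```

Returning `None` before the first completion avoids reporting a division-by-zero ETA at the start of a run.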

Examples

1) Persist and restore completion state

use scanstate::{ScanCheckpoint, load_or_new};

let path = std::path::Path::new("runs/checkpoint.json");
let mut ck = load_or_new(path, "daily-run");
ck.mark_complete("target-a");
ck.save(path).unwrap();

let restored = ScanCheckpoint::load(path).unwrap();
assert!(restored.is_complete("target-a"));
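
Crash-safe saves like the one listed under Features are commonly built on a write-to-temp-then-rename pattern. The sketch below shows that pattern with the standard library only; `atomic_write` is a hypothetical helper, and whether scanstate's `save` works this way internally is an assumption.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: write `bytes` to `path` so that a reader sees either
/// the previous file or the new one, not a partially written file.
fn atomic_write(path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Stage the new contents in a sibling file so the rename stays on
    // one filesystem.
    let tmp = path.with_extension("tmp");
    let mut file = File::create(&tmp)?;
    file.write_all(bytes)?;
    // Push the data to disk before the rename makes it visible.
    file.sync_all()?;
    // Replace the destination in a single step.
    fs::rename(&tmp, path)
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("scanstate-atomic-demo.json");
    atomic_write(&path, br#"{"completed":["target-a"]}"#)?;
    atomic_write(&path, br#"{"completed":["target-a","target-b"]}"#)?;
    println!("{}", fs::read_to_string(&path)?);
    fs::remove_file(&path)
}
```

If the process dies before the rename, the old checkpoint is still intact and only a stray `.tmp` file is left behind.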

2) Log target transitions with the write-ahead journal

use scanstate::{WriteAheadJournal, Entry};
use std::time::{SystemTime, UNIX_EPOCH};

let journal = WriteAheadJournal::new("runs/run.log");
journal
    .append(&Entry {
        target_id: "https://example.com".into(),
        status: "completed".into(),
        timestamp: SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_secs(),
        findings_count: 2,
    })
    .unwrap();

let (entries, corrupt) = journal.replay_lenient().unwrap();
println!("entries: {} corrupt: {}", entries.len(), corrupt);
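
The lenient replay above returns both the readable entries and a count of corrupt ones. As an illustration of that idea, the sketch below replays a line-oriented log and skips lines it cannot parse. The tab-separated format, the `Record` struct, and both functions are assumptions made for this example; scanstate's on-disk journal format is not documented here.

```rust
/// Hypothetical record, mirroring the fields of `Entry` shown above.
#[derive(Debug, PartialEq)]
struct Record {
    target_id: String,
    status: String,
    timestamp: u64,
    findings_count: u64,
}

/// Parse one tab-separated journal line; None means the line is corrupt.
fn parse_line(line: &str) -> Option<Record> {
    let mut parts = line.split('\t');
    let record = Record {
        target_id: parts.next()?.to_string(),
        status: parts.next()?.to_string(),
        timestamp: parts.next()?.parse().ok()?,
        findings_count: parts.next()?.parse().ok()?,
    };
    // Reject lines that carry unexpected extra fields.
    if parts.next().is_some() {
        return None;
    }
    Some(record)
}

/// Lenient replay: keep every parseable record and count the rest.
fn replay_lenient(log: &str) -> (Vec<Record>, usize) {
    let mut records = Vec::new();
    let mut corrupt = 0;
    for line in log.lines().filter(|l| !l.is_empty()) {
        match parse_line(line) {
            Some(record) => records.push(record),
            None => corrupt += 1,
        }
    }
    (records, corrupt)
}

fn main() {
    // The last line simulates an append that was cut off by a crash.
    let log = "https://a.example\tcompleted\t1700000000\t2\n\
               https://b.example\tcompleted\t1700000005\t0\n\
               https://c.example\tcomp";
    let (records, corrupt) = replay_lenient(log);
    println!("entries: {} corrupt: {}", records.len(), corrupt);
}
```

Counting corrupt lines instead of failing on them lets a resume proceed after a torn write while still surfacing that something was lost.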

3) Implement Checkpointable on a custom state

use scanstate::Checkpointable;

struct MyState {
    done: std::collections::HashSet<String>,
}

impl Checkpointable for MyState {
    fn mark_done(&mut self, target_id: &str) {
        self.done.insert(target_id.to_string());
    }
    fn is_done(&self, target_id: &str) -> bool {
        self.done.contains(target_id)
    }
    fn done_count(&self) -> usize {
        self.done.len()
    }
}

Traits

scanstate exposes Checkpointable for your own resume-ready state types. Implementing its three methods (mark_done, is_done, done_count) lets a custom state plug into the same checkpoint merge/persist patterns without hand-written adapters.
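
One way to see the value of the trait is a driver that is generic over any Checkpointable state. In the sketch below the trait is redeclared locally, with the method signatures from example 3, so the snippet compiles on its own; `run_pending` is a hypothetical helper and not part of scanstate.

```rust
use std::collections::HashSet;

/// Local stand-in with the method signatures shown in example 3.
trait Checkpointable {
    fn mark_done(&mut self, target_id: &str);
    fn is_done(&self, target_id: &str) -> bool;
    fn done_count(&self) -> usize;
}

#[derive(Default)]
struct MyState {
    done: HashSet<String>,
}

impl Checkpointable for MyState {
    fn mark_done(&mut self, target_id: &str) {
        self.done.insert(target_id.to_string());
    }
    fn is_done(&self, target_id: &str) -> bool {
        self.done.contains(target_id)
    }
    fn done_count(&self) -> usize {
        self.done.len()
    }
}

/// Hypothetical driver: scan only targets that are not done yet and
/// return how many were scanned in this run.
fn run_pending<S: Checkpointable>(
    state: &mut S,
    targets: &[&str],
    mut scan: impl FnMut(&str),
) -> usize {
    let mut scanned = 0;
    for &target in targets {
        if state.is_done(target) {
            continue;
        }
        scan(target);
        // Mark only after the work finished: a crash in between means the
        // target is scanned again on resume (at-least-once).
        state.mark_done(target);
        scanned += 1;
    }
    scanned
}

fn main() {
    let mut state = MyState::default();
    state.mark_done("target-a"); // pretend an earlier run finished this one
    let targets = ["target-a", "target-b", "target-c"];
    let scanned = run_pending(&mut state, &targets, |t| println!("scanning {}", t));
    println!("scanned {}, done {}", scanned, state.done_count());
}
```

Marking a target after the scan call, rather than before, is what gives the loop its at-least-once behavior.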

License

MIT, Corum Collective LLC

Docs: https://docs.rs/scanstate

Santh ecosystem: https://santh.io