mocra 0.3.0

A distributed, event-driven crawling and data collection framework

mocra

mocra is a Rust crawler runtime library for building module-based crawlers. A module describes what to request, how to parse responses, and how parsed output moves to the next node in a workflow.
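
To make the module idea concrete, here is a minimal, standard-library-only sketch of the three responsibilities named above. It is not mocra's API: the `CrawlModule` trait, the `Request`, `Response`, and `Parsed` types, the `LinkList` module, and the `detail_page` node name are all hypothetical and exist only for illustration. The response is canned rather than fetched.

```rust
// Hypothetical sketch: none of these types come from mocra.

/// A request the module wants the runtime to fetch.
#[derive(Debug, Clone, PartialEq)]
struct Request {
    url: String,
}

/// A fetched response handed back to the module.
struct Response {
    body: String,
}

/// Parsed output plus the name of the workflow node that should receive it.
#[derive(Debug, PartialEq)]
struct Parsed {
    items: Vec<String>,
    next_node: Option<&'static str>,
}

/// Illustrative module contract: what to request and how to parse it.
trait CrawlModule {
    fn name(&self) -> &'static str;
    fn requests(&self) -> Vec<Request>;
    fn parse(&self, response: &Response) -> Parsed;
}

struct LinkList;

impl CrawlModule for LinkList {
    fn name(&self) -> &'static str {
        "link_list"
    }

    fn requests(&self) -> Vec<Request> {
        vec![Request { url: "https://example.com/index".to_string() }]
    }

    fn parse(&self, response: &Response) -> Parsed {
        // Treat every non-empty line of the body as one extracted item.
        let items = response
            .body
            .lines()
            .map(str::trim)
            .filter(|line| !line.is_empty())
            .map(String::from)
            .collect();
        // Route the parsed items to a (hypothetical) downstream node.
        Parsed { items, next_node: Some("detail_page") }
    }
}

fn main() {
    let module = LinkList;
    for request in module.requests() {
        // A real runtime would fetch the URL; this response is canned.
        let response = Response { body: "/a\n/b\n".to_string() };
        let parsed = module.parse(&response);
        println!("{} fetched {} -> {:?}", module.name(), request.url, parsed);
    }
}
```

The point of the split is that the module only declares requests and parsing logic, while fetching, queuing, and routing stay with the runtime. Consult the crate's own documentation for the actual trait and type names.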

The current runtime supports:

  • single-node local execution;
  • Kafka-backed queue execution;
  • Raft/RocksDB-backed coordination and cache state;
  • DAG and linear module workflows;
  • response caching, middleware, cron scheduling, metrics, health checks, and DLQ operations.
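
As a rough illustration of how DAG and linear workflows relate, the sketch below orders workflow nodes with Kahn's topological sort, using only the standard library. It is not how mocra schedules modules: the `execution_order` function and the node names are hypothetical, and node names are assumed to be unique. A linear workflow is simply the special case where each node has at most one upstream and one downstream node.

```rust
use std::collections::{BTreeMap, VecDeque};

/// Hypothetical helper, not part of mocra.
/// Returns the nodes in an order where each node comes after all of its
/// upstream nodes, or `None` if the edges form a cycle or point at an
/// unknown node. Node names are assumed to be unique.
fn execution_order<'a>(
    nodes: &[&'a str],
    edges: &[(&'a str, &'a str)],
) -> Option<Vec<&'a str>> {
    let mut indegree: BTreeMap<&'a str, usize> =
        nodes.iter().map(|n| (*n, 0)).collect();
    let mut downstream: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();

    for &(from, to) in edges {
        *indegree.get_mut(to)? += 1;
        downstream.entry(from).or_default().push(to);
    }

    // Start with every node that has no upstream dependency.
    let mut ready: VecDeque<&'a str> = nodes
        .iter()
        .copied()
        .filter(|n| indegree[n] == 0)
        .collect();

    let mut order = Vec::new();
    while let Some(node) = ready.pop_front() {
        order.push(node);
        for &next in downstream.get(node).into_iter().flatten() {
            let remaining = indegree.get_mut(next)?;
            *remaining -= 1;
            if *remaining == 0 {
                ready.push_back(next);
            }
        }
    }

    // If some node was never released, the graph is not a valid DAG.
    (order.len() == nodes.len()).then_some(order)
}

fn main() {
    // Linear workflow: list -> detail -> export.
    let linear = execution_order(
        &["list", "detail", "export"],
        &[("list", "detail"), ("detail", "export")],
    );
    println!("linear: {:?}", linear);

    // DAG workflow: list fans out to detail and images, both feed export.
    let dag = execution_order(
        &["list", "detail", "images", "export"],
        &[
            ("list", "detail"),
            ("list", "images"),
            ("detail", "export"),
            ("images", "export"),
        ],
    );
    println!("dag: {:?}", dag);
}
```

In the DAG case, `export` is released only after both `detail` and `images` have been ordered, which is the property a fan-in node needs.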

Redis is not supported by the current codebase.

Install

[dependencies]
mocra = "0.3.0"

Optional features:

mocra = { version = "0.3.0", features = ["js-v8", "polars"] }

The crate currently targets Rust 1.85 and edition 2024.

Minimal Runtime

use std::sync::Arc;

use mocra::common::state::State;
use mocra::engine::Engine;

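#[tokio::main]
async fn main() -> mocra::errors::Result<()> {
    // Build the shared state from the configuration file path.
    let state = Arc::new(State::new("config.toml").await);
    // Create the engine on top of the shared state.
    let engine = Engine::new(Arc::clone(&state), None).await?;

    // Register your modules before starting the engine, for example:
    // engine.register_module(MyModule::default_arc()).await;

    // Start the engine.
    engine.start().await;
    Ok(())
}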

Documentation

Chinese documentation: