//! # Stygian Graph
//!
//! A high-performance, graph-based web scraping engine for Rust.
//!
//! ## Overview
//!
//! Stygian treats scraping pipelines as Directed Acyclic Graphs (DAGs) where each node
//! is a pluggable service module (HTTP fetchers, AI extractors, headless browsers).
//! It is built for high concurrency and extensibility on a hexagonal architecture.
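//!
//! As a rough illustration of the DAG idea, the sketch below orders nodes so that
//! each one comes after its dependencies (Kahn's algorithm). It uses only the
//! standard library rather than petgraph, and `topological_order` is a hypothetical
//! helper for this example, not part of this crate's API.
//!
//! ```rust
//! use std::collections::{HashMap, VecDeque};
//!
//! /// Hypothetical helper: returns the nodes ordered so that every edge
//! /// `(from, to)` has `from` before `to`. Returns `None` if an edge names
//! /// an unknown node or the edges form a cycle.
//! fn topological_order<'a>(
//!     nodes: &[&'a str],
//!     edges: &[(&'a str, &'a str)],
//! ) -> Option<Vec<&'a str>> {
//!     // Count incoming edges per node and record each node's successors.
//!     let mut indegree: HashMap<&str, usize> = nodes.iter().map(|n| (*n, 0)).collect();
//!     let mut successors: HashMap<&str, Vec<&str>> = HashMap::new();
//!     for &(from, to) in edges {
//!         if !indegree.contains_key(from) {
//!             return None;
//!         }
//!         *indegree.get_mut(to)? += 1;
//!         successors.entry(from).or_default().push(to);
//!     }
//!
//!     // Start with the nodes that have no dependencies, in declaration order.
//!     let mut ready: VecDeque<&str> = nodes
//!         .iter()
//!         .copied()
//!         .filter(|n| indegree[n] == 0)
//!         .collect();
//!
//!     let mut order = Vec::new();
//!     while let Some(node) = ready.pop_front() {
//!         order.push(node);
//!         // Finishing a node releases any successor whose last dependency it was.
//!         for &next in successors.get(node).into_iter().flatten() {
//!             let remaining = indegree.get_mut(next)?;
//!             *remaining -= 1;
//!             if *remaining == 0 {
//!                 ready.push_back(next);
//!             }
//!         }
//!     }
//!
//!     // If some node was never released, the graph contains a cycle.
//!     (order.len() == nodes.len()).then_some(order)
//! }
//! ```
//!
//! For nodes `fetch`, `extract`, and `store` chained in that order, the helper
//! returns them in that order; a cyclic edge set yields `None`.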
//!
//! ## Quick Start
//!
//! ```no_run
//! use stygian_graph::domain::graph::Pipeline;
//! use stygian_graph::domain::pipeline::PipelineUnvalidated;
//!
//! #[tokio::main]
//! async fn main() -> Result<(), Box<dyn std::error::Error>> {
//!     // Create a simple scraping pipeline
//!     let config = serde_json::json!({
//!         "nodes": [],
//!         "edges": []
//!     });
//!
//!     let pipeline = PipelineUnvalidated::new(config)
//!         .validate()?
//!         .execute()
//!         .complete(serde_json::json!({"status": "success"}));
//!
//!     println!("Pipeline complete: {:?}", pipeline.results());
//!     Ok(())
//! }
//! ```
//!
//! ## Architecture
//!
//! Stygian follows hexagonal (ports & adapters) architecture:
//!
//! - **Domain**: Core business logic (graph execution, pipeline orchestration)
//! - **Ports**: Trait definitions (service interfaces, abstractions)
//! - **Adapters**: Implementations (HTTP, AI providers, storage, caching)
//! - **Application**: Orchestration (service registry, executor, CLI)
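//!
//! The layering above can be sketched in a few lines. This is a minimal
//! illustration, assuming a hypothetical `FetchPort` trait and `StaticFetcher`
//! adapter; neither name is part of this crate's API.
//!
//! ```rust
//! use std::collections::HashMap;
//!
//! /// Port (hypothetical): what the domain needs from any fetcher.
//! trait FetchPort {
//!     fn fetch(&self, url: &str) -> Result<String, String>;
//! }
//!
//! /// Adapter (hypothetical): an in-memory implementation, handy for tests.
//! struct StaticFetcher {
//!     pages: HashMap<String, String>,
//! }
//!
//! impl FetchPort for StaticFetcher {
//!     fn fetch(&self, url: &str) -> Result<String, String> {
//!         self.pages
//!             .get(url)
//!             .cloned()
//!             .ok_or_else(|| format!("no page for {url}"))
//!     }
//! }
//!
//! /// Domain logic depends only on the port, never on a concrete adapter.
//! fn page_length(fetcher: &dyn FetchPort, url: &str) -> Result<usize, String> {
//!     Ok(fetcher.fetch(url)?.len())
//! }
//! ```
//!
//! Because `page_length` only sees `&dyn FetchPort`, a real HTTP adapter could
//! replace `StaticFetcher` without touching the domain function.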
//!
//! ## Features
//!
//! - **Graph-based execution**: DAG pipelines with petgraph
//! - **Multi-AI support**: Claude, GPT, Gemini, Copilot, Ollama
//! - **JavaScript rendering**: Optional browser automation via `stygian-browser`
//! - **Multi-modal extraction**: HTML, PDF, images, video, audio
//! - **Anti-bot handling**: User-Agent rotation, proxy support, rate limiting
//! - **High concurrency**: Worker pools, backpressure, Tokio + Rayon
//! - **Idempotent operations**: Safe retries with idempotency keys
//! - **Observability**: Metrics, tracing, monitoring
//!
//! ## Crate Features
//!
//! - `browser` (default): Include stygian-browser for JavaScript rendering
//! - `full`: All features enabled
//!
//! ## Request Signing
//!
//! Use [`ports::signing::SigningPort`] + [`adapters::signing::HttpSigningAdapter`] to attach
//! HMAC signatures, AWS Sig V4, OAuth 1.0a, or Frida RPC tokens to any outbound request.
//! No feature flag required; zero additional dependencies.

// ───────────────────────────────────────────────────────────────────────────
// Internal Module Organization (Hexagonal Architecture)
// ───────────────────────────────────────────────────────────────────────────
/// Core domain logic - graph execution, pipelines, orchestration
///
/// **Hexagonal principle**: Domain never imports adapters, only ports (traits).
/// Port trait definitions - service abstractions
///
/// Defines interfaces that adapters must implement:
/// - `ScrapingService`: HTTP fetchers, browser automation
/// - `AIProvider`: LLM extraction services
/// - `CachePort`: Caching abstractions
/// - `CircuitBreaker`: Resilience patterns
/// Adapter implementations - infrastructure concerns
///
/// Concrete implementations of port traits:
/// - HTTP client with anti-bot features
/// - AI providers (Claude, GPT, Gemini, Ollama)
/// - Storage backends (file, S3, database)
/// - Cache backends (memory, Redis, file)
/// Application layer - orchestration and coordination
///
/// High-level coordination logic:
/// - Service registry with dependency injection
/// - Pipeline executor
/// - CLI interface
/// - Configuration management
// ───────────────────────────────────────────────────────────────────────────
// Public API
// ───────────────────────────────────────────────────────────────────────────
/// Error types used throughout the crate
/// Re-exports for convenient imports
///
/// # Example
///
/// ```
/// use stygian_graph::prelude::*;
/// ```
// Re-export the browser crate when the `browser` feature is enabled
#[cfg(feature = "browser")]
pub use stygian_browser;
/// MCP (Model Context Protocol) server: exposes scraping & pipeline tools