stygian-plugin: Chrome browser plugin fallback scraper
Provides a flexible, interactive visual data extraction framework as a fallback when stygian-graph and stygian-browser cannot scrape a page.
§Architecture
The crate follows a hexagonal architecture with a clear separation between layers:
┌─────────────────────────────────────┐
│  Application / MCP Layer            │
│  (plugin_apply_template, etc.)      │
└──────────────┬──────────────────────┘
               │
┌──────────────▼──────────────────────┐
│  Domain Layer (pure Rust)           │
│    ExtractionTemplate               │
│    ExtractionRequest/Result         │
│    Transformation Pipeline          │
└──────────────┬──────────────────────┘
               │
┌──────────────▼──────────────────────┐
│  Ports (traits)                     │
│    PluginTemplateStore              │
│    PluginExtractionPort             │
│    IdempotencyKeyStore              │
└──────────────┬──────────────────────┘
               │
┌──────────────▼──────────────────────┐
│  Adapters (implementations)         │
│    FileTemplateStore                │
│    ExtractionEngine                 │
│    MemoryIdempotencyStore           │
└─────────────────────────────────────┘
§Features
- Template-based extraction: Define schema once, apply to multiple elements
- Recording-based: User clicks/highlights → learns pattern
- Query-driven: Declarative extraction with CSS/XPath selectors
- Region-based: Multiple independent zones, each with own rules
- Multi-instance: Iterate template across matching elements
- Multi-set: Extract different shapes from same page
- Cross-page: Reuse templates in crawl sessions
- Idempotency: Safe retries via ULID-based deduplication
- Transformations: Regex, type coercion, HTML stripping, etc.
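The transformation feature can be pictured as a chain of small steps applied to each extracted value. The following is a minimal, std-only sketch under stated assumptions, not the crate's actual `Transformation` API: the `Step` and `Value` enums and the `strip_html`, `apply`, and `run_pipeline` functions are hypothetical names, and a simple prefix removal stands in for the regex step so that no external crate is needed.

```rust
// Hypothetical step and value types; the crate's real `Transformation` may differ.
#[derive(Debug, Clone, PartialEq)]
enum Step {
    StripHtml,
    Trim,
    RemovePrefix(String), // stand-in for a regex-based cleanup
    ToNumber,             // type coercion: text -> f64
}

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Text(String),
    Number(f64),
}

/// Naive tag stripper: drops everything between '<' and the next '>'.
fn strip_html(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut in_tag = false;
    for ch in input.chars() {
        match ch {
            '<' => in_tag = true,
            '>' if in_tag => in_tag = false,
            _ if !in_tag => out.push(ch),
            _ => {}
        }
    }
    out
}

/// Applies one step; every step in this sketch expects text input.
fn apply(step: &Step, value: Value) -> Result<Value, String> {
    match (step, value) {
        (Step::StripHtml, Value::Text(s)) => Ok(Value::Text(strip_html(&s))),
        (Step::Trim, Value::Text(s)) => Ok(Value::Text(s.trim().to_string())),
        (Step::RemovePrefix(p), Value::Text(s)) => {
            let rest = s.strip_prefix(p.as_str()).unwrap_or(s.as_str());
            Ok(Value::Text(rest.to_string()))
        }
        (Step::ToNumber, Value::Text(s)) => s
            .parse::<f64>()
            .map(Value::Number)
            .map_err(|e| format!("cannot coerce {s:?} to a number: {e}")),
        (step, Value::Number(_)) => Err(format!("{step:?} expects text input")),
    }
}

/// Folds the raw text through the steps in order, stopping at the first error.
fn run_pipeline(steps: &[Step], raw: &str) -> Result<Value, String> {
    steps
        .iter()
        .try_fold(Value::Text(raw.to_string()), |value, step| apply(step, value))
}

fn main() {
    let steps = [
        Step::StripHtml,
        Step::Trim,
        Step::RemovePrefix("$".to_string()),
        Step::ToNumber,
    ];
    let price = run_pipeline(&steps, "<span class=\"product-price\"> $19.99 </span>");
    println!("{price:?}");
}
```

Ordering matters in a chain like this: coercion runs last because it only succeeds once markup, whitespace, and the currency prefix are gone.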
§Quick Start
use stygian_plugin::domain::{ExtractionTemplate, Region, Selector, ExtractionRequest};
use stygian_plugin::ports::PluginExtractionPort;
use serde_json::json;
// Create a template with regions
let template = ExtractionTemplate::new("Product")
    .with_region(
        Region::new(
            "title",
            Selector::css(".product-title"),
            json!({"type": "string"}),
        )
    )
    .with_region(
        Region::new(
            "price",
            Selector::css(".product-price"),
            json!({"type": "number"}),
        )
    );
// Create extraction request
let request = ExtractionRequest::new(
    template,
    "https://example.com/products",
    "<html>...</html>"
);
// Execute (requires a PluginExtractionPort adapter)
// let result = extraction_port.execute(&request).await?;
Re-exports§
pub use domain::ExtractionRequest;
pub use domain::ExtractionResult;
pub use domain::ExtractionTemplate;
pub use domain::IdempotencyKey;
pub use domain::Region;
pub use domain::Selector;
pub use domain::Transformation;
pub use error::PluginError;
pub use error::Result;
pub use mcp::McpPluginServer;
pub use mcp::McpRequestHandler;
pub use ports::IdempotencyKeyStore;
pub use ports::PluginExtractionPort;
pub use ports::PluginTemplateStore;
pub use reliability::ReliabilityBand;
pub use reliability::ReliabilityScore;
pub use reliability::ReliabilityScorer;
pub use reliability::ScoreWeightedSelector;
pub use reliability::ScoredCandidate;
pub use reliability::ScoringWeights;
Modules§
- adapters: Adapter implementations: concrete providers of port traits
- config: Runtime configuration for the standalone MCP server
- domain: Domain layer: pure business logic and value objects
- error: Error types for stygian-plugin
- mcp: MCP (Model Context Protocol) server for the plugin extraction system
- ports: Port trait definitions: interfaces adapters must implement
- reliability: Extraction reliability scoring
- storage: Storage adapters: template persistence, idempotency tracking
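To illustrate how a port and a storage adapter for idempotency tracking might fit together, here is a minimal synchronous sketch. Everything in it is an assumption made for illustration: the `KeyStore` trait, the `MemoryStore` struct, and the `execute_once` function are hypothetical stand-ins, and the crate's real `IdempotencyKeyStore` trait and `MemoryIdempotencyStore` adapter may have different (probably async) signatures.

```rust
use std::collections::HashMap;

/// Hypothetical port: the interface the domain depends on.
trait KeyStore {
    fn get(&self, key: &str) -> Option<String>;
    fn put(&mut self, key: &str, result: String);
}

/// Hypothetical in-memory adapter implementing the port.
#[derive(Default)]
struct MemoryStore {
    seen: HashMap<String, String>,
}

impl KeyStore for MemoryStore {
    fn get(&self, key: &str) -> Option<String> {
        self.seen.get(key).cloned()
    }

    fn put(&mut self, key: &str, result: String) {
        self.seen.insert(key.to_string(), result);
    }
}

/// Runs `work` at most once per key; a retry with the same key
/// returns the stored result instead of repeating the extraction.
fn execute_once<S: KeyStore>(
    store: &mut S,
    key: &str,
    work: impl FnOnce() -> String,
) -> String {
    if let Some(cached) = store.get(key) {
        return cached;
    }
    let result = work();
    store.put(key, result.clone());
    result
}

fn main() {
    let mut store = MemoryStore::default();
    let mut runs = 0;
    // A ULID-shaped placeholder; a real key would be generated per request.
    let key = "01ARZ3NDEKTSV4RRFFQ69G5FAV";

    let first = execute_once(&mut store, key, || {
        runs += 1;
        "extracted".to_string()
    });
    let retry = execute_once(&mut store, key, || {
        runs += 1;
        "extracted".to_string()
    });

    println!("first = {first}, retry = {retry}, runs = {runs}");
}
```

Because `execute_once` is generic over the trait rather than the concrete store, a file-backed or database-backed adapter could be swapped in without touching the calling code, which is the point of the ports-and-adapters split.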