Semioscan
Semioscan is a Rust library for blockchain analytics, providing production-grade tools for calculating gas costs, extracting price data from DEX swaps, and working with block ranges across multiple EVM-compatible chains.
Key differentiator: Semioscan is a library-only crate with no CLI, API server, or database dependencies. You bring your own infrastructure and integrate semioscan into your existing systems.
Built on Alloy, the modern Ethereum library for Rust, semioscan provides type-safe blockchain interactions with zero-copy parsing and excellent performance.
Table of Contents
- Semioscan
- Table of Contents
- Features
- Use Cases
- Installation
- Quick Start
- Examples and Tutorials
- Core Concepts
- Implementing Custom Price Sources
- Library Architecture
- Multi-Chain Support
- Advanced Configuration
- Performance Considerations
- Running Tests and Examples
- Troubleshooting
- When NOT to Use Semioscan
- Production Usage
- Contributing
- License
- Acknowledgments
Features
- Gas Cost Calculation: Accurately calculate transaction gas costs for both L1 (Ethereum) and L2 (Optimism Stack) chains, including L1 data fees
- Block Window Calculations: Map UTC dates to blockchain block ranges with intelligent caching
- DEX Price Extraction: Extensible trait-based system for extracting price data from on-chain swap events
- Multi-Chain Support: Works with 12+ EVM chains including Ethereum, Arbitrum, Base, Optimism, Polygon, and more
- Event Scanning: Extract transfer amounts and events from blockchain transaction logs
- Production-Ready: Battle-tested in production for automated trading and DeFi applications processing millions of dollars in swaps
Use Cases
Semioscan is ideal for:
- DeFi Liquidation Bots: Calculate profitability accounting for accurate gas costs across L1/L2 chains
- Trading Automation: Extract real-time price data from DEX swaps for arbitrage detection
- Blockchain Analytics: Map calendar dates to block ranges for historical analysis and reporting
- Token Discovery: Scan chains for tokens transferred to specific addresses (e.g., router contracts)
- Financial Reporting: Calculate transaction costs for accounting and tax purposes
- MEV Research: Analyze gas costs and swap prices for MEV opportunity detection
- Multi-Chain Operations: Consistent API across 12+ EVM chains with automatic L2 fee handling
Installation
Add semioscan to your Cargo.toml:
[dependencies]
# Core library (gas, block windows, events)
semioscan = "0.3"
# With Odos DEX reference implementation (optional)
semioscan = { version = "0.3", features = ["odos-example"] }
Feature Flags
odos-example: Includes OdosPriceSource as a reference implementation of the PriceSource trait for the Odos DEX aggregator (optional, not included by default)
Quick Start
1. Calculate Gas Costs
Calculate total gas costs for transactions between two addresses:
use GasCalculator;
use ProviderBuilder;
use NamedChain;
async
L2 chains (Arbitrum, Base, Optimism) automatically include L1 data fees in the calculation.
2. Calculate Daily Block Windows
Map a UTC date to the corresponding blockchain block range:
use BlockWindowCalculator;
use ProviderBuilder;
use NamedChain;
use NaiveDate;
async
Caching: Block windows are automatically cached to disk for faster subsequent queries.
3. Extract DEX Price Data
Use the PriceSource trait to extract price data from on-chain swap events:
use OdosPriceSource; // requires "odos-example" feature
use PriceCalculator;
use ProviderBuilder;
async
Examples and Tutorials
The examples/ directory contains complete, production-ready examples demonstrating semioscan's capabilities. See examples/README.md for comprehensive documentation, setup instructions, and troubleshooting.
Quick Reference
| Example | Use Case | Difficulty |
|---|---|---|
| daily_block_window.rs | Map UTC dates to block ranges | Beginner |
| router_token_discovery.rs | Discover tokens sent to router contracts | Intermediate |
| eip4844_blob_gas.rs | Calculate EIP-4844 blob gas for L2 rollups | Advanced |
| custom_dex_integration.rs | Implement PriceSource for any DEX | Advanced |
Running Examples
# Basic usage
RPC_URL=https://arb1.arbitrum.io/rpc
# With chain-specific environment variables
ARBITRUM_RPC_URL=https://arb1.arbitrum.io/rpc
# With logging for debugging
RUST_LOG=debug
For detailed setup, configuration, performance tips, and troubleshooting, see examples/README.md.
Core Concepts
Block Windows
A block window maps a calendar date (in UTC) to the range of blocks produced during that day. Different chains have different block production rates:
- Arbitrum: ~4 blocks/second (~345,600 blocks/day)
- Ethereum: ~12 seconds/block (~7,200 blocks/day)
- Base: ~2 seconds/block (~43,200 blocks/day)
Block windows enable date-based queries for analytics, reporting, and historical analysis.
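One common way to turn a UTC day into a block range is a binary search over block timestamps. The sketch below illustrates that idea only; it is not necessarily how semioscan implements it. The names `first_block_at_or_after` and `daily_window` are hypothetical, and the `timestamp_of` closure stands in for an RPC lookup of a block's timestamp.

```rust
/// Seconds in one UTC day.
const SECONDS_PER_DAY: u64 = 86_400;

/// Returns the lowest block number in `lo..=hi` whose timestamp is >= `target`,
/// or `None` if every block in the range is older than `target`.
/// `timestamp_of` is a stand-in for an RPC call and must be non-decreasing.
fn first_block_at_or_after<F>(mut lo: u64, mut hi: u64, target: u64, timestamp_of: F) -> Option<u64>
where
    F: Fn(u64) -> u64,
{
    if timestamp_of(hi) < target {
        return None;
    }
    // Invariant: the answer always lies in `lo..=hi`.
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if timestamp_of(mid) >= target {
            hi = mid;
        } else {
            lo = mid + 1;
        }
    }
    Some(lo)
}

/// Returns the inclusive (start, end) block range for the UTC day that begins
/// at Unix timestamp `day_start`. Returns `None` if the day has not finished
/// yet (no block at or after the next midnight) or contains no blocks.
fn daily_window<F>(latest: u64, day_start: u64, timestamp_of: F) -> Option<(u64, u64)>
where
    F: Fn(u64) -> u64,
{
    let start = first_block_at_or_after(0, latest, day_start, &timestamp_of)?;
    let next_day = day_start + SECONDS_PER_DAY;
    let next = first_block_at_or_after(start, latest, next_day, &timestamp_of)?;
    if next == start {
        return None; // no block was produced during this day
    }
    Some((start, next - 1))
}
```

On a synthetic chain with a fixed 12-second block time, the window returned by `daily_window` spans 7,200 blocks, matching the Ethereum estimate above. Each search needs only about log2(latest) timestamp lookups, which is why the result is worth caching.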
L1 Data Fees (L2 Chains)
L2 chains like Arbitrum, Base, and Optimism post transaction data to Ethereum for security. This creates two separate gas costs:
- Execution gas: Cost of running the transaction on L2 (cheap, uses L2 gas price)
- L1 data fee: Cost of posting transaction data to Ethereum (expensive, varies by calldata size and L1 gas price)
Semioscan automatically detects L2 chains and calculates both components for accurate total costs. This is critical for profitability calculations in liquidation bots and trading systems.
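The two components add up as in the following minimal sketch. The `L2TxCost` type and its fields are hypothetical and are not semioscan's API; the sketch assumes the L1 data fee is already known in wei (for example, OP Stack receipts report it in an `l1Fee` field).

```rust
/// Cost breakdown for a single L2 transaction. All values are in wei.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
struct L2TxCost {
    /// gas_used * effective_gas_price, paid for execution on the L2.
    execution_fee: u128,
    /// Fee for posting the transaction data to Ethereum.
    l1_data_fee: u128,
}

impl L2TxCost {
    /// Builds the breakdown from receipt-style fields.
    fn new(gas_used: u64, effective_gas_price: u128, l1_data_fee: u128) -> Self {
        Self {
            // Saturate instead of panicking on absurdly large inputs.
            execution_fee: u128::from(gas_used).saturating_mul(effective_gas_price),
            l1_data_fee,
        }
    }

    /// Total cost the sender actually paid.
    fn total(&self) -> u128 {
        self.execution_fee.saturating_add(self.l1_data_fee)
    }
}
```

With `gas_used = 100_000`, a gas price of 1,000,000 wei, and an L1 data fee of 50,000,000,000,000 wei, the execution fee is only 100,000,000,000 wei, so ignoring the L1 component would understate the cost by a factor of about 500.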
Caching
Semioscan provides flexible caching for block window calculations using a trait-based backend system. You can choose the caching strategy that best fits your needs.
Cache Backends
DiskCache (recommended for production)
- Persistent JSON-based cache with file locking
- Survives process restarts
- Multi-process safe (advisory file locks)
- Configurable TTL and size limits
- Automatic path validation
- ~1-2ms cache hit latency
MemoryCache
- In-memory HashMap cache
- Fastest performance (<0.1ms cache hits)
- Data lost when process exits
- Configurable size limits with LRU eviction
- Ideal for short-lived processes
NoOpCache
- Disables caching entirely
- Zero overhead
- Always performs RPC queries
- Useful for testing or one-time queries
Basic Usage
use ;
use Duration;
// Disk cache (simplest, recommended)
let calculator = with_disk_cache?;
// Memory cache
let calculator = with_memory_cache;
// No cache
let calculator = without_cache;
Advanced Configuration
use ;
use Duration;
// Disk cache with TTL and size limit
let cache = new
.with_ttl // 7 days
.with_max_entries // Max 1000 entries
.validate?; // Validate path
let calculator = new;
// Memory cache with size limit
let cache = new
.with_max_entries
.with_ttl;
let calculator = new;
Cache Statistics
All cache backends track performance metrics:
let stats = calculator.cache_stats.await;
println!;
println!;
println!;
Cache Best Practices
- Production: Use DiskCache with TTL for persistent caching
- Development: Use MemoryCache for faster iteration without disk I/O
- Testing: Use NoOpCache or MemoryCache to avoid file system dependencies
- Path validation: Always call .validate() on DiskCache to catch path issues early
- TTL: Set TTL based on your use case (block windows are immutable for past dates)
- Size limits: Set reasonable limits to prevent unbounded cache growth
Multi-Process Safety
DiskCache uses advisory file locking to prevent corruption when multiple processes share the same cache file. However, for high-concurrency scenarios, consider:
- Using separate cache files per process
- Using a centralized cache service (Redis, etc.) via a custom BlockWindowCache trait implementation
Custom Cache Backends
Implement the BlockWindowCache trait to create custom cache backends (Redis, S3, etc.):
use ;
use DailyBlockWindow;
use async_trait;
What's Cached
- Block windows: Mappings from (chain, date) to block ranges
  - Immutable for past dates (perfect for caching)
  - ~200 bytes per cached entry
  - Dramatically reduces RPC usage (5-15s query → <1ms)
- Gas calculations: In-memory cache only (not persisted)
- Price calculations: In-memory cache only (not persisted)
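Because a past date's block window never changes, a cache keyed by (chain, date) can return stored values without revalidation. The sketch below shows that idea with a plain HashMap and hit/miss counters. `WindowCache` and its methods are hypothetical and unrelated to semioscan's DiskCache, MemoryCache, or BlockWindowCache types; the `fetch` closure stands in for the slow RPC-backed calculation.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory cache of block windows, keyed by
/// (chain id, "YYYY-MM-DD") and storing inclusive (start, end) blocks.
#[derive(Default)]
struct WindowCache {
    entries: HashMap<(u64, String), (u64, u64)>,
    hits: u64,
    misses: u64,
}

impl WindowCache {
    /// Returns the cached window, or runs `fetch` once and stores its result.
    fn get_or_insert_with<F>(&mut self, chain_id: u64, date: &str, fetch: F) -> (u64, u64)
    where
        F: FnOnce() -> (u64, u64),
    {
        let key = (chain_id, date.to_string());
        if let Some(&window) = self.entries.get(&key) {
            self.hits += 1;
            return window;
        }
        self.misses += 1;
        let window = fetch();
        self.entries.insert(key, window);
        window
    }

    /// Fraction of lookups served from the cache (0.0 when nothing was looked up).
    fn hit_rate(&self) -> f64 {
        let total = self.hits + self.misses;
        if total == 0 {
            0.0
        } else {
            self.hits as f64 / total as f64
        }
    }
}
```

A real backend would also need eviction and, for the current day, some expiry policy, since only completed days are immutable.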
Implementing Custom Price Sources
Semioscan uses a trait-based architecture that allows you to implement price extraction for any DEX protocol. The PriceSource trait is object-safe and designed for easy extensibility.
Example: Uniswap V3 Price Source
use ;
use ;
use Log;
use sol;
// Define Uniswap V3 Swap event
sol!
See the PriceSource trait documentation for more details and best practices.
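Whatever the DEX, a price source ultimately reduces a swap event to input and output amounts, which then need to be normalized by token decimals. The helper below is a hypothetical sketch of that last step and is not part of the PriceSource trait. It uses f64 for readability, so it loses precision on very large amounts.

```rust
/// Returns how many units of the output token were received per unit of the
/// input token, after scaling raw amounts by each token's decimals.
/// Returns `None` when `amount_in` is zero. Hypothetical helper.
fn swap_price(amount_in: u128, decimals_in: u32, amount_out: u128, decimals_out: u32) -> Option<f64> {
    if amount_in == 0 {
        return None;
    }
    // Convert raw integer amounts to human-readable token units.
    let in_units = amount_in as f64 / 10f64.powi(decimals_in as i32);
    let out_units = amount_out as f64 / 10f64.powi(decimals_out as i32);
    Some(out_units / in_units)
}
```

For example, a swap of 1 token with 18 decimals (raw amount 10^18) for 2,500 tokens with 6 decimals (raw amount 2,500,000,000) yields a price of about 2,500.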
Library Architecture
Semioscan is a library-only crate with no binaries, CLI tools, or API servers. You bring your own:
- Blockchain Providers: Use Alloy to create providers for your chains
- Price Sources: Implement the PriceSource trait for your DEX protocol
- Configuration: Configure RPC endpoints and chain settings in your application
This design makes semioscan highly composable and easy to integrate into existing systems.
Multi-Chain Support
Semioscan works with any EVM-compatible chain. Chains with L2-specific features (like L1 data fees) are automatically detected and handled correctly.
Tested chains include:
- L1 and sidechains: Ethereum, Avalanche, BNB Chain, Polygon, Sonic
- L2: Arbitrum, Base, Optimism, Scroll, Mode, Fraxtal
Chain support is based on the NamedChain enum from alloy-chains.
Advanced Configuration
Use SemioscanConfig to customize RPC behavior per chain:
use SemioscanConfigBuilder;
use NamedChain;
let config = default
.with_chain_override
.build?;
// Pass config to calculators
let calculator = with_config;
Performance Considerations
Block Range Chunking
Large block ranges are automatically chunked to prevent RPC timeouts:
- Default: 5,000 blocks per chunk (configurable per chain)
- Benefits: Prevents timeouts, enables progress tracking, reduces memory usage
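Chunking itself is simple arithmetic. The following sketch splits an inclusive block range into pieces of at most `max_blocks` blocks; `chunk_block_range` is a hypothetical helper, not semioscan's internal function.

```rust
/// Splits the inclusive range `start..=end` into consecutive inclusive
/// sub-ranges of at most `max_blocks` blocks each.
/// Returns an empty vector when `start > end`. Hypothetical helper.
fn chunk_block_range(start: u64, end: u64, max_blocks: u64) -> Vec<(u64, u64)> {
    assert!(max_blocks > 0, "max_blocks must be at least 1");
    let mut chunks = Vec::new();
    let mut from = start;
    while from <= end {
        // Last block of this chunk, clamped to the end of the full range.
        let to = end.min(from.saturating_add(max_blocks - 1));
        chunks.push((from, to));
        if to == u64::MAX {
            break; // avoid overflow when the range ends at u64::MAX
        }
        from = to + 1;
    }
    chunks
}
```

With the 5,000-block default, the range 0 to 12,000 becomes three chunks: (0, 4999), (5000, 9999), and (10000, 12000). Each chunk can then be queried separately, so one failed request only requires retrying that chunk.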
Rate Limiting
Automatic rate limiting protects against RPC provider limits:
- Default: 100 requests/second (configurable per chain)
- Recommendation: Use paid RPC providers for production (300-1000+ req/s)
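A limit of 100 requests/second corresponds to one request every 10 ms. The sketch below shows one simple way to pace requests at a fixed interval. `Pacer` is a hypothetical type and not the limiter semioscan uses; the caller is expected to sleep for the returned duration before sending.

```rust
use std::time::{Duration, Instant};

/// Hypothetical fixed-interval pacer: hands out send slots spaced
/// `1 / requests_per_second` apart.
struct Pacer {
    interval: Duration,
    next_slot: Instant,
}

impl Pacer {
    /// Creates a pacer whose first slot is available at `now`.
    fn new(requests_per_second: u32, now: Instant) -> Self {
        assert!(requests_per_second > 0, "rate must be at least 1 request/second");
        Self {
            interval: Duration::from_secs(1) / requests_per_second,
            next_slot: now,
        }
    }

    /// Reserves the next slot and returns how long the caller should wait
    /// (measured from `now`) before sending the request.
    fn reserve(&mut self, now: Instant) -> Duration {
        let wait = self.next_slot.saturating_duration_since(now);
        // The following slot is one interval after this one; idle time is not banked.
        self.next_slot = self.next_slot.max(now) + self.interval;
        wait
    }
}
```

Passing `now` in explicitly keeps the pacer deterministic and easy to test. In an async application, the returned duration would typically be handed to the runtime's sleep function.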
Memory Usage
- Minimal: With DiskCache, block window caches are written to disk rather than held in memory
- Typical cache size: 1-10 MB per chain
- Concurrency: Safe to run multiple queries concurrently
Query Performance
Typical performance characteristics (depends on RPC provider):
- Block window calculation: 5-15 seconds (first query), <1ms (cached)
- Gas calculation (1,000 blocks): 10-30 seconds
- Token discovery (10,000 blocks): 2-5 minutes
See examples/README.md#performance-tips for optimization strategies.
Running Tests and Examples
Running Tests
Semioscan has comprehensive unit tests for all business logic:
# Run all tests
# Run only unit tests (no integration tests)
# Run specific test file
# Run with logging
RUST_LOG=debug
Running Examples
Examples demonstrate real-world usage with live blockchain connections:
# Run example with environment variables
RPC_URL=https://arb1.arbitrum.io/rpc
# Run with logging
RUST_LOG=info RPC_URL=https://arb1.arbitrum.io/rpc
# Run with chain-specific configuration
ARBITRUM_RPC_URL=https://arb1.arbitrum.io/rpc \
API_KEY=your_api_key \
For detailed example documentation, see examples/README.md.
Troubleshooting
Common Issues
Rate Limiting (429 Too Many Requests)
- Solution: Use a paid RPC provider or increase rate limit delay in config
- See: examples/README.md#rpc-errors
Block Range Too Large
- Solution: Reduce max_block_range in config (default: 5,000)
- Cause: Some RPC providers have stricter limits
Missing Data / No Logs Found
- Possible causes: Wrong contract address, invalid block range, chain reorganization
- Solution: Verify addresses and block range using a block explorer
Chain ID Issues
- Solution: Set the CHAIN_ID environment variable for chains without eth_chainId support
- Affected chains: Some Avalanche RPC endpoints
For comprehensive troubleshooting, see examples/README.md#troubleshooting.
When NOT to Use Semioscan
Semioscan may not be the best choice for:
- Real-time price feeds: Use WebSocket-based oracles (Chainlink, Pyth, etc.) for sub-second price updates
- Non-EVM chains: Semioscan is EVM-specific (Solana, Cosmos, etc. are not supported)
- Simple balance queries: Use lighter libraries like ethers-rs for basic token balances
- Indexing entire chains: Use The Graph or custom indexers for comprehensive blockchain indexing
- High-frequency trading: RPC-based queries have latency; use WebSocket streams or MEV infrastructure
Semioscan excels at batch analytics, historical queries, and multi-chain operations where accurate gas cost calculation and flexible price extraction are required.
Production Usage
Semioscan is battle-tested in production for:
- Automated trading and DeFi applications processing millions of dollars in swaps across 12+ chains
- Financial reporting for blockchain transaction accounting
- Token analytics for discovering and tracking token transfers
Contributing
Contributions are welcome! Areas of interest:
- Additional DEX protocol implementations (Uniswap, SushiSwap, Curve, etc.)
- Performance optimizations for large block ranges
- Additional caching strategies
- Documentation improvements
License
Licensed under the Apache License, Version 2.0. See LICENSE for details.
Acknowledgments
Built by Semiotic AI as part of the Likwid liquidation infrastructure. Extracted and open-sourced to benefit the Rust + Ethereum ecosystem.