dravr-sciotte-server-0.1.0 has been yanked.
dravr-sciotte
Sport activity scraper with headless Chrome, TOML-configurable providers, and in-memory caching.
Logs into sport platforms via a browser (no API credentials needed), scrapes training data from activity pages, and exposes it through four integration surfaces: Rust trait, REST API, MCP server, and CLI.
Quick Start
# Login (opens a browser — log in to your account)
# List activities (fast, from the training page)
# List with full detail (navigates each activity page for HR, cadence, weather, device, etc.)
# Start REST + MCP server
# Start MCP stdio transport (for Claude integration)
How It Works
- Browser login: opens a visible Chrome window on the provider's login page. You log in normally; session cookies are captured and encrypted at rest.
- List page scraping: navigates to the training/activity list page in headless Chrome and extracts activity rows using CSS selectors defined in the provider TOML.
- Detail enrichment (opt-in via --detail): navigates into each activity page and extracts full metrics using a JS snippet from the provider TOML.
- Caching: results are cached in memory with a configurable TTL (default 15 min).
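The caching step can be illustrated with a minimal, standard-library-only sketch of a TTL cache. This is not the crate's cache.rs: the TtlCache type, its method names, and the evict-the-oldest policy when the cap is reached are all assumptions made for illustration. Timestamps are passed in explicitly so the expiry logic is easy to follow and test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical in-memory cache whose entries expire after a fixed TTL.
/// Illustrative only; not the crate's actual implementation.
pub struct TtlCache<V> {
    ttl: Duration,
    max_entries: usize,
    entries: HashMap<String, (Instant, V)>,
}

impl<V: Clone> TtlCache<V> {
    pub fn new(ttl: Duration, max_entries: usize) -> Self {
        Self { ttl, max_entries, entries: HashMap::new() }
    }

    /// Store a value stamped with `now` as its insertion time.
    pub fn insert_at(&mut self, key: &str, value: V, now: Instant) {
        if self.max_entries == 0 {
            return; // a zero-capacity cache stores nothing
        }
        // Drop expired entries first so they do not count against the cap.
        let ttl = self.ttl;
        self.entries
            .retain(|_, (stored, _)| now.saturating_duration_since(*stored) < ttl);
        // Assumed policy: if still full and the key is new, evict the oldest entry.
        if self.entries.len() >= self.max_entries && !self.entries.contains_key(key) {
            let oldest = self
                .entries
                .iter()
                .min_by_key(|(_, (stored, _))| *stored)
                .map(|(k, _)| k.clone());
            if let Some(k) = oldest {
                self.entries.remove(&k);
            }
        }
        self.entries.insert(key.to_string(), (now, value));
    }

    /// Return a clone of the value if it exists and is younger than the TTL.
    pub fn get_at(&self, key: &str, now: Instant) -> Option<V> {
        match self.entries.get(key) {
            Some((stored, value)) if now.saturating_duration_since(*stored) < self.ttl => {
                Some(value.clone())
            }
            _ => None,
        }
    }
}
```

With a 900-second TTL, a value inserted at time t is still returned at t + 899 s and is treated as a miss from t + 900 s onward.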
Provider Configuration
Scraping rules are defined in TOML files under providers/. The default provider is Strava (providers/strava.toml), compiled into the binary.
Example: Strava (providers/strava.toml)
[]
= "strava"
= "https://www.strava.com/login"
= ["/dashboard", "/athlete", "/feed"]
= ["/login", "/session"]
[]
= "https://www.strava.com/athlete/training"
= "tr.training-activity-row"
= 'a[data-field-name="name"]'
= '/\/activities\/(\d+)/'
[]
= 'a[data-field-name="name"]'
= 'td[data-field-name="sport_type"]'
= "td.col-date"
= 'td[data-field-name="time"]'
= "td.col-dist"
= "td.col-elev"
= "td.col-suffer-score"
[]
= "https://www.strava.com/activities/{id}"
= '''
(function() {
// ... JS that extracts activity data and returns JSON ...
})()
'''
To add a new provider, create a TOML file with the same structure and load it via ProviderConfig::from_file().
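The config above contains an ID pattern, /\/activities\/(\d+)/, and a detail URL template with an {id} placeholder. How the crate applies them is not shown here, so the following is only a sketch of the idea using the standard library (no regex crate). The function names extract_activity_id and detail_url are hypothetical.

```rust
/// Extract the numeric activity ID from a link such as "/activities/12345".
/// Approximates the regex /\/activities\/(\d+)/ with plain string searching:
/// the first occurrence of "/activities/" followed by at least one ASCII digit.
pub fn extract_activity_id(href: &str) -> Option<&str> {
    const MARKER: &str = "/activities/";
    let mut rest = href;
    while let Some(pos) = rest.find(MARKER) {
        let tail = &rest[pos + MARKER.len()..];
        // Length of the leading run of ASCII digits after the marker.
        let end = tail
            .find(|c: char| !c.is_ascii_digit())
            .unwrap_or(tail.len());
        if end > 0 {
            return Some(&tail[..end]);
        }
        // No digits here; resume one byte past this match start, as a regex would.
        rest = &rest[pos + 1..];
    }
    None
}

/// Fill the "{id}" placeholder of a detail URL template.
pub fn detail_url(template: &str, id: &str) -> String {
    template.replace("{id}", id)
}
```

For example, a row link of https://www.strava.com/activities/12345 yields the ID 12345, which the template https://www.strava.com/activities/{id} turns back into the detail page URL.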
Integration Modes
CLI
REST API
| Method | Path | Description |
|---|---|---|
| POST | /auth/login | Trigger browser login |
| GET | /auth/status | Check authentication |
| GET | /api/activities?limit=20 | List activities |
| GET | /api/activities/{id} | Activity detail |
| GET | /health | Health check |
| POST | /mcp | MCP HTTP transport |
MCP (Model Context Protocol)
6 tools available via stdio or HTTP transport:
| Tool | Description |
|---|---|
| auth_status | Check session status |
| browser_login | Open browser for login |
| get_activities | Scrape activity list |
| get_activity | Scrape single activity detail |
| cache_status | Cache hit/miss stats |
| cache_clear | Clear cached data |
Rust Trait
use ;
use CacheConfig;
let scraper = default_config;
let cached = new;
let session = cached.browser_login.await?;
let activities = cached.get_activities.await?;
Activity Data Model
Activities scraped from detail pages include:
| Category | Fields |
|---|---|
| Core | id, name, sport_type, start_date, duration_seconds |
| Distance | distance_meters, elevation_gain, pace, gap |
| Heart Rate | average_heart_rate, max_heart_rate |
| Power | average_power, max_power, normalized_power |
| Cadence | average_cadence |
| Speed | average_speed, max_speed |
| Training | suffer_score, calories, elapsed_time_seconds |
| Weather | temperature, feels_like, humidity, wind_speed, wind_direction, weather |
| Equipment | device_name, gear_name |
| Location | city, region, country |
| Other | perceived_exertion, sport_type_detail, workout_type |
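As a rough picture of how such a record might look in Rust, here is a sketch covering a handful of the fields above. The field names come from the table; the types, the use of Option for values that may be absent on a page, and the pace_seconds_per_km helper are assumptions, not the crate's models.rs.

```rust
/// Hypothetical subset of the activity model. Field names follow the table;
/// types and optionality are assumptions made for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Activity {
    pub id: String,
    pub name: String,
    pub sport_type: String,
    pub duration_seconds: Option<u64>,
    pub distance_meters: Option<f64>,
    pub average_heart_rate: Option<f64>,
}

impl Activity {
    /// Average pace in seconds per kilometre, available only when both
    /// duration and a positive distance are known.
    pub fn pace_seconds_per_km(&self) -> Option<f64> {
        let secs = self.duration_seconds? as f64;
        let meters = self.distance_meters?;
        if meters > 0.0 {
            Some(secs / (meters / 1000.0))
        } else {
            None
        }
    }
}
```

Modelling scraped metrics as Option keeps list-page results (which lack heart rate, weather, and similar detail fields) representable with the same type as detail-page results.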
Architecture
dravr-sciotte/
├── providers/strava.toml # Provider config (selectors, JS, URLs)
├── src/ # Core library
│ ├── provider.rs # TOML config loading and JS generation
│ ├── scraper.rs # Chrome-based scraping engine
│ ├── models.rs # Activity data model
│ ├── cache.rs # In-memory TTL cache
│ ├── auth.rs # Session encryption/persistence
│ └── types.rs # ActivityScraper trait
├── crates/dravr-sciotte-mcp/ # MCP server (stdio + HTTP)
└── crates/dravr-sciotte-server/ # REST API + CLI
Environment Variables
| Variable | Description | Default |
|---|---|---|
| CHROME_PATH | Path to Chrome/Chromium binary | auto-detect |
| DRAVR_SCIOTTE_API_KEY | Bearer token for REST auth | none (open) |
| DRAVR_SCIOTTE_CACHE_TTL | Cache TTL in seconds | 900 (15 min) |
| DRAVR_SCIOTTE_CACHE_MAX | Max cache entries | 100 |
| DRAVR_SCIOTTE_SESSION_DIR | Session storage directory | ~/.config/dravr-sciotte/ |
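A sketch of how these variables and their defaults could be read with the standard library follows. The Settings struct and the settings_from and settings_from_env functions are hypothetical, and falling back to the default when a value fails to parse is an assumption rather than documented behaviour. The session directory is omitted because expanding ~ needs platform-specific handling.

```rust
use std::path::PathBuf;
use std::time::Duration;

/// Hypothetical settings mirroring the environment variable table.
#[derive(Debug, PartialEq)]
pub struct Settings {
    pub chrome_path: Option<PathBuf>, // None means auto-detect
    pub api_key: Option<String>,      // None means the REST API is open
    pub cache_ttl: Duration,
    pub cache_max: usize,
}

/// Build settings from any name -> value lookup, so the logic can be
/// exercised without touching the real process environment.
pub fn settings_from(lookup: impl Fn(&str) -> Option<String>) -> Settings {
    Settings {
        chrome_path: lookup("CHROME_PATH").map(PathBuf::from),
        api_key: lookup("DRAVR_SCIOTTE_API_KEY").filter(|k| !k.is_empty()),
        // Assumption: unparsable values fall back to the documented defaults.
        cache_ttl: Duration::from_secs(
            lookup("DRAVR_SCIOTTE_CACHE_TTL")
                .and_then(|v| v.trim().parse::<u64>().ok())
                .unwrap_or(900),
        ),
        cache_max: lookup("DRAVR_SCIOTTE_CACHE_MAX")
            .and_then(|v| v.trim().parse::<usize>().ok())
            .unwrap_or(100),
    }
}

/// Convenience wrapper that reads the real process environment.
pub fn settings_from_env() -> Settings {
    settings_from(|name| std::env::var(name).ok())
}
```

With no variables set, this yields a 900-second TTL, a cap of 100 entries, no API key, and no explicit Chrome path.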
License
MIT OR Apache-2.0