# backDisco
backDisco is a tool that discovers exposed backend origins from CDN frontends using LLM-assisted pattern analysis and brute force enumeration.
## Overview
Given a known pattern of CDN frontend and backend hostname pairs, backDisco uses an LLM to identify naming patterns and then applies those patterns to discover additional backend origins from a list of target frontends. It can also perform brute force subdomain enumeration based on extracted patterns.
## Features
- Multi-API LLM Support: Automatic detection and support for OpenAI-compatible, Ollama, and Anthropic APIs
- LLM-Powered Pattern Analysis: Automatically identifies naming patterns between frontend and backend hostnames
- Pattern-Based Discovery: Applies discovered patterns to target frontends to generate backend candidates
- Position-Aware Candidate Generation: Parses hostnames by both '.' and '-' delimiters for context-aware expansion
- Batched LLM Expansion: Efficiently expands word lists using configurable batch processing
- SAN Extraction: Extracts Subject Alternative Names from target certificates to discover additional hosts
- Brute Force Enumeration: Generates subdomain candidates based on backend URL patterns with structured generation
- LLM Word Expansion: Uses LLM to generate related words for brute force wordlists at specific positions
- Model Selection: Interactive model selection when model is not specified
- Concurrent Verification: Efficiently tests candidates with configurable concurrency
- DNS and HTTP Verification: Validates discovered backends via DNS resolution and HTTP/HTTPS checks
## Requirements
- Rust (edition 2021)
- Access to an LLM API endpoint (OpenAI-compatible, Ollama, or Anthropic)
- Network access to target hosts
## Installation
### From crates.io
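Assuming the crate is published under the binary name `backdisco` used elsewhere in this README:

```shell
cargo install backdisco
```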
### From source
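A typical source build; the repository URL below is a placeholder, substitute the project's actual location:

```shell
git clone <repository-url>
cd backdisco
cargo build --release
```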
The binary will be located at `target/release/backdisco`.
## Configuration
backDisco supports multiple LLM API types with automatic detection:
- **OpenAI-compatible APIs**: Standard OpenAI API format (e.g., `https://api.openai.com/v1`)
- **Ollama**: Local or remote Ollama instances (e.g., `http://localhost:11434/v1` or `http://localhost:11434/api`)
- **Anthropic**: Claude API endpoints (e.g., `https://api.anthropic.com/v1`)
The LLM endpoint URL and model can be specified via command-line arguments:
- `--llmurl`: LLM API base URL (defaults to `http://localhost:11434/v1`)
- `--model`: Model name (if not provided, the tool will fetch available models and prompt for selection)
The API type is automatically detected from the URL using pattern matching, so you don't need to specify it manually.
## Usage
### Basic Usage
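A minimal invocation sketch using the three required flags from the options table below (hostnames and the targets file are illustrative):

```shell
backdisco --front cdn.example.com \
          --back api.internal.example.com \
          --targets targets.txt
```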
### With Custom LLM Configuration
Specify a custom LLM endpoint and model:
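For example (the endpoint and model name are illustrative):

```shell
backdisco -f cdn.example.com -b api.internal.example.com -t targets.txt \
          --llmurl https://api.openai.com/v1 \
          --model gpt-4o
```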
If --model is not specified, the tool will fetch available models and prompt for selection:
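```shell
backdisco -f cdn.example.com -b api.internal.example.com -t targets.txt \
          --llmurl http://localhost:11434/v1
```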
### With SAN Extraction
Extract Subject Alternative Names from target certificates:
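A sketch using the `--extract-sans` flag documented below (hostnames are illustrative):

```shell
backdisco -f cdn.example.com -b api.internal.example.com -t targets.txt \
          --extract-sans
```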
### With Brute Force Enumeration
Enable brute force subdomain enumeration with position-aware expansion:
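A sketch combining `--brute` with the expansion-tuning flags documented below (values shown are the defaults):

```shell
backdisco -f cdn.example.com -b api.internal.example.com -t targets.txt \
          --brute --llm-expand 5 --llm-batch-size 20
```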
The `--brute` flag enables structured candidate generation that:
- Parses hostnames by both '.' and '-' delimiters to preserve positional context
- Expands words at each position using LLM (e.g., "dev" → ["dev", "prod", "test", "staging"])
- Generates candidates using cartesian products of position expansions
- Processes expansions in batches for efficiency
### Complete Example
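A full invocation sketch combining the flags documented below; the hostnames, model name, and file paths are illustrative:

```shell
backdisco \
  --front cdn.example.com \
  --back api-v2.internal.example.com \
  --targets targets.txt \
  --llmurl http://localhost:11434/v1 \
  --model llama3 \
  --extract-sans \
  --brute \
  --llm-expand 5 \
  --concurrency 100 \
  --timeout 10 \
  --verbose 2 \
  --output results.txt
```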
## Command-Line Options
| Option | Short | Description |
|---|---|---|
| `--front` | `-f` | Known frontend hostname (required) |
| `--back` | `-b` | Known backend hostname (required) |
| `--targets` | `-t` | File with target frontends, one per line (required) |
| `--output` | `-o` | Output file (defaults to stdout) |
| `--verbose` | `-v` | Verbosity level 0-2 (default: 1) |
| `--dns-only` | | Skip HTTP checks, DNS only |
| `--timeout` | | HTTP timeout in seconds (default: 5) |
| `--concurrency` | | Concurrent check limit (default: 50) |
| `--extract-sans` | | Extract SANs from target certificates and add to target list |
| `--no-sans` | | Skip SAN extraction (opposite of `--extract-sans`) |
| `--brute` | | Enable brute force subdomain enumeration with position-aware expansion |
| `--llm-expand` | | Number of related words to generate per seed word using LLM (default: 5, 0 to disable) |
| `--llm-batch-size` | | Batch size for LLM position expansion (default: 20) |
| `--max-depth` | | Override maximum subdomain depth for brute forcing (default: derived from backend URL) |
| `--llmurl` | | LLM API base URL (default: `http://localhost:11434/v1`) |
| `--model` | | LLM model name (if not provided, will fetch and prompt for selection) |
| `--gen-wordlist-output` | | Output file for generated candidate list (one hostname per line) |
## How It Works
1. **LLM Configuration**:
   - If `--llmurl` is provided, uses that endpoint; otherwise uses the default
   - If `--model` is provided, uses that model; otherwise fetches available models and prompts for selection
   - Automatically detects the API type (OpenAI-compatible, Ollama, or Anthropic) from the URL
2. **Pre-flight Check**: Verifies connectivity to the LLM endpoint and model availability
3. **Backend Analysis**:
   - Extracts subdomain words, depth, and base domain from the known backend hostname
   - Parses the hostname structure by both `.` and `-` delimiters for position-aware processing
4. **Target Loading**: Reads target frontends from the specified file
5. **SAN Extraction** (optional): Extracts Subject Alternative Names from target certificates to discover additional hosts and wildcard patterns
6. **Pattern Discovery**: Queries the LLM to identify naming patterns between the frontend and backend hostnames
7. **Pattern Application**: Applies discovered patterns to target frontends to generate backend candidates
8. **Brute Force** (optional):
   - Parses hostname structures from the backend, targets, and SANs
   - Groups words by position across all structures
   - Expands words at each position using batched LLM calls (context-aware: environment, service type, etc.)
   - Generates structured candidates using cartesian products of position expansions
   - Filters candidates to match the backend base domain
9. **Verification**: Tests all candidates via DNS resolution and HTTP/HTTPS checks with configurable concurrency
10. **Output**: Reports live backends with DNS and HTTP status information
## Output Format
The tool provides colored output with different verbosity levels:
- Level 0: Minimal output
- Level 1 (default): Standard output with progress and results
- Level 2: Detailed debug information including pattern details, SAN extraction results, and failed candidates
Example output:
```
[+] LIVE: api.internal.example.com
    DNS: 192.168.1.100
    HTTPS: 200 OK
    HTTP: 301 Moved Permanently
```
## Examples
### Example 1: Simple Pattern Discovery
Given:
- Frontend: `cdn.example.com`
- Backend: `api.internal.example.com`
- Pattern discovered: replace `cdn` with `api.internal`
Applied to targets:
- `cdn.target1.com` → `api.internal.target1.com`
- `cdn.target2.com` → `api.internal.target2.com`
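The transformation above can be sketched outside the tool as a one-line substitution; this is a simplification of what the LLM-derived pattern encodes:

```shell
# Rewrite a frontend hostname by replacing the leading "cdn" label
echo "cdn.target1.com" | sed 's/^cdn/api.internal/'
# api.internal.target1.com
```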
### Example 2: SAN Extraction
If `cdn.example.com` has a certificate with SANs:
- `*.internal.example.com`
- `api.internal.example.com`
- `admin.internal.example.com`
These will be added to the target list for pattern application.
### Example 3: Position-Aware Brute Force with LLM Expansion
Backend: `api-v2.internal.example.com`
- Parsed structure: segments `["api", "v2", "internal"]`, base `"example.com"`
- Position-aware expansion:
  - Position 0 (service): `["api"]` → `["api", "rest", "graphql", "rpc", "service", "gateway"]`
  - Position 1 (version): `["v2"]` → `["v2", "v1", "v3", "beta", "alpha", "prod"]`
  - Position 2 (environment): `["internal"]` → `["internal", "int", "private", "corp"]`
- Generated candidates (cartesian product): `api.v2.internal.example.com`, `rest.v1.internal.example.com`, `graphql.prod.int.example.com`, etc.
- All candidates filtered to match the base domain `example.com`
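The cartesian-product step can be illustrated with a plain shell loop over two expanded positions (word lists abbreviated from the example above):

```shell
# Enumerate every combination of service word x version word
for svc in api rest graphql; do
  for ver in v1 v2; do
    echo "$svc.$ver.internal.example.com"
  done
done
```

Each additional position multiplies the candidate count, which is why the tool filters against the backend's base domain before verification.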
## License
This project is licensed under the MIT License. See the LICENSE file for details.
## Contributing
Contributions are welcome! Please ensure your code follows Rust best practices and includes appropriate error handling.