Concept Analyzer - First Principles Extractor
A unified pipeline that analyzes code repositories and extracts first-principles instructions for AI agents to recreate systems from scratch.
Features
- Single API Call: Point at a repository and get results in S3
- Parallel Processing: Uses the pipelines crate for efficient parallel analysis
- First-Principles Output: Strips away implementation details to focus on core concepts
- Demo-Friendly: Detailed progress logging for presentations
- AI-Optimized: Output designed for agent consumption
Installation
Usage
CLI
# Set environment variables
# Run analysis
As a Library
use analyze_repository;
use std::path::Path;
let s3_url = analyze_repository.await?;
HTTP API
# Start the API server
# Make a request
Output Format
The S3 output contains a JSON document with:
Demo Mode
When running the CLI, you'll see detailed progress:
🎯 CONCEPT ANALYZER - First Principles Extractor
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔍 STARTING REPOSITORY ANALYSIS
📁 Repository: /path/to/repo
🪣 S3 Target: s3://my-bucket/results.json
⚙️ Workers: 8
📋 STAGE 1/6: File Collection
└─ Scanning repository for source files...
✓ Collected 150 files in 3 batches (125ms)
🔬 STAGE 2-4: Parallel Analysis Pipeline
├─ Stage 2: Extracting abstractions from code
├─ Stage 3: Analyzing relationships between concepts
└─ Stage 4: Processing with 8 parallel workers
✓ Found 25 abstractions and 45 relationships (2341ms)
🧬 STAGE 5/6: Synthesizing First Principles
├─ Extracting essential concepts
├─ Simplifying relationships
├─ Determining build order
└─ Generating rebuild instructions
✓ Synthesis complete (1523ms)
☁️ STAGE 6/6: Publishing to S3
└─ Uploading to s3://my-bucket/results.json
✓ Published successfully (234ms)
✅ ANALYSIS COMPLETE!
⏱️ Total time: 4.22s
📊 Output size: 12.34 KB
🔗 Results: s3://my-bucket/results.json
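The per-stage timings and the total in the log above could be produced roughly as follows. This is a minimal sketch using only the standard library; the helper names `timed`, `stage_done`, and `total_time` are hypothetical and are not taken from this crate.

```rust
use std::time::{Duration, Instant};

/// Run one stage and return its result together with the elapsed wall-clock time.
/// (Hypothetical helper, not part of this crate's API.)
fn timed<T>(stage: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let result = stage();
    (result, start.elapsed())
}

/// Format a per-stage completion line in the style of the demo log.
fn stage_done(message: &str, elapsed: Duration) -> String {
    format!("✓ {} ({}ms)", message, elapsed.as_millis())
}

/// Sum the stage durations and format the total in seconds with two decimals.
fn total_time(stages: &[Duration]) -> String {
    let total: Duration = stages.iter().sum();
    format!("{:.2}s", total.as_secs_f64())
}
```

With the four stage timings shown in the sample log (125, 2341, 1523, and 234 ms), `total_time` yields `4.22s`, which matches the total reported there.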
Architecture
The pipeline consists of 6 stages:
- File Collection: Scans repository for source files
- Abstraction Extraction: Uses LLM to identify high-level concepts
- Relationship Analysis: Maps dependencies between concepts
- Parallel Processing: Processes batches concurrently
- Synthesis: Generates first-principles output
- Publishing: Uploads results to S3
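To illustrate how the File Collection and Parallel Processing stages might fit together, here is a minimal sketch that splits collected files into batches and analyzes each batch on its own thread. It uses scoped threads from the standard library rather than the pipelines crate, and the function names, the batch size, and the placeholder `analyze_batch` (which only counts files where the real pipeline would call an LLM) are all assumptions made for illustration.

```rust
use std::thread;

/// Split collected file paths into batches of at most `batch_size` entries.
/// (Hypothetical helper; the real batching strategy is not documented here.)
fn make_batches(files: &[String], batch_size: usize) -> Vec<Vec<String>> {
    // `chunks` panics on a size of 0, so clamp to at least 1.
    files
        .chunks(batch_size.max(1))
        .map(|chunk| chunk.to_vec())
        .collect()
}

/// Placeholder for per-batch analysis: returns the number of files in the batch.
fn analyze_batch(batch: &[String]) -> usize {
    batch.len()
}

/// Analyze every batch on its own scoped thread and return results in batch order.
fn process_parallel(batches: &[Vec<String>]) -> Vec<usize> {
    thread::scope(|scope| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = batches
            .iter()
            .map(|batch| scope.spawn(move || analyze_batch(batch)))
            .collect();
        // Join in spawn order so results line up with the input batches.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

With 150 files and an assumed batch size of 50, `make_batches` produces 3 batches, the same shape as the sample run in Demo Mode. A production version would more likely cap concurrency at the configured worker count instead of spawning one thread per batch.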
Dependencies
- pipelines: For parallel processing
- aws-sdk-s3: For S3 publishing
- tokio: Async runtime
- axum: Web framework (optional)
- clap: CLI parsing
- serde: JSON serialization