# ETH.id - Zero-Knowledge Document Verification CLI

A high-performance Rust CLI that uses Zero-Knowledge Proofs and LLMs to answer yes/no questions about documents without exposing the original document content in results, logs, or storage. With a local model (Ollama) the document never leaves your machine; with a cloud provider, document text is sent to the LLM for semantic questions, but only boolean answers come back and audit logs contain only hashes.
## What It Does

ETH.id answers yes/no questions about your documents using a combination of:

- Zero-Knowledge Proofs for deterministic claims (age, amounts, dates)
- LLMs for semantic understanding (signatures, clauses, context)
Example Questions:

- "Is this person over 18 years old?" → `true`
- "Does the CPF match 123.456.789-00?" → `true`
- "Was this document issued in the last 90 days?" → `false`
- "Is the income above R$ 5,000?" → `true`
- "Is this document signed by both parties?" → `true`
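The split above can be pictured as a small routing step: deterministic claims go to the proof engine, everything else to the LLM. A minimal sketch, assuming nothing about the project's real API (the `ClaimKind` enum and the keyword heuristic are illustrative only):

```rust
/// Illustrative only: how a question might be routed to the
/// deterministic (ZK) path or the semantic (LLM) path.
#[derive(Debug, PartialEq)]
enum ClaimKind {
    Deterministic, // ages, amounts, dates -> provable without an LLM
    Semantic,      // signatures, clauses, context -> needs an LLM
}

fn classify(question: &str) -> ClaimKind {
    let q = question.to_lowercase();
    // Hypothetical heuristic: numeric/date keywords imply a deterministic claim.
    let deterministic_hints = ["years old", "cpf", "days", "r$", "amount", "date"];
    if deterministic_hints.iter().any(|h| q.contains(h)) {
        ClaimKind::Deterministic
    } else {
        ClaimKind::Semantic
    }
}

fn main() {
    assert_eq!(classify("Is this person over 18 years old?"), ClaimKind::Deterministic);
    assert_eq!(classify("Is this document signed by both parties?"), ClaimKind::Semantic);
    println!("routing ok");
}
```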
## Key Features

- Privacy-First: Documents are processed entirely in-memory and never stored
- Smart Filtering: Only minimal, claim-relevant data is processed
- Zero-Knowledge: Mathematical proofs for deterministic claims
- LLM Support: OpenAI (Claude and offline Ollama support planned; see Roadmap)
- Audit Trail: Complete history with hashes only
- Attestations: Cryptographic proof bundles
- Fast: Rust performance with async processing
- CLI-First: Simple, powerful command-line interface
## Quick Start

### Prerequisites

- Rust 1.70+ (install from rustup.rs)
- OpenAI API key

### Installation
1. Clone or navigate to the project
2. Copy the environment template
3. Add your OpenAI API key to `.env`
4. Build the project
5. Run the server
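The concrete commands were elided from this README; the following is a sketch assuming a standard Cargo layout and a `.env.example` template in the repository root (both assumptions):

```bash
# Copy environment template (filename assumed)
cp .env.example .env

# Add your OpenAI API key to .env (placeholder value shown)
echo "OPENAI_API_KEY=sk-..." >> .env

# Build the project
cargo build --release

# Run the server
cargo run --release
```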
The server will start on http://localhost:3000
## API Usage

### Single Verification

Endpoint: `POST /api/v1/verify`

Request:

Response:
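The request and response bodies were lost from this README; the shapes below are hypothetical reconstructions based on the feature list (every field name here is an assumption, not the actual schema).

Request (hypothetical fields):

```json
{
  "document_base64": "<base64-encoded PDF or image>",
  "question": "Is this person over 18 years old?"
}
```

Response (hypothetical fields):

```json
{
  "answer": true,
  "document_hash": "sha256:...",
  "session_id": "..."
}
```

Note that, per the privacy guarantees below, a response would carry only the boolean answer plus hashes and metadata, never document content.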
### Batch Verification

Endpoint: `POST /api/v1/verify/batch`

Request:
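As with the single-verification example, the batch body was lost; this is a hypothetical reconstruction (field names assumed):

```json
{
  "document_base64": "<base64-encoded PDF or image>",
  "questions": [
    "Is this person over 18 years old?",
    "Is this document signed by both parties?"
  ]
}
```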
## Security Features

### Zero-Knowledge Architecture

- No Persistent Storage: Documents are processed entirely in-memory
- Automatic Cleanup: Secure memory wiping after processing
- Hash-Only Logging: Only SHA-256 hashes are logged, never content
- Encryption: All sensitive data encrypted with AES-256-GCM
- Session Isolation: Each request gets a unique session ID
### Privacy Guarantees

- ✅ Documents never written to disk
- ✅ No database storage of document content
- ✅ LLM receives document content but only returns boolean answers
- ✅ Audit logs contain only hashes and metadata
- ✅ Automatic memory cleanup on drop
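"Automatic memory cleanup on drop" can be sketched in a few lines of std-only Rust. The real project may use a crate such as `zeroize`; this hand-rolled illustration just shows the idea of overwriting a buffer before its memory is freed:

```rust
/// A document buffer that wipes its contents when it goes out of scope.
struct DocumentBuffer {
    bytes: Vec<u8>,
}

impl Drop for DocumentBuffer {
    fn drop(&mut self) {
        // Overwrite contents before deallocation; volatile writes keep
        // the compiler from optimizing the wipe away.
        for b in self.bytes.iter_mut() {
            unsafe { std::ptr::write_volatile(b, 0) };
        }
    }
}

fn main() {
    {
        let doc = DocumentBuffer { bytes: b"sensitive document".to_vec() };
        // ... process `doc` in memory only ...
        assert_eq!(doc.bytes[0], b's');
    } // `drop` runs here and zeroes the buffer before the Vec is freed
    println!("buffer wiped on drop");
}
```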
## Example Use Cases
### Age Verification
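Age checks are deterministic claims, so they need no LLM. A self-contained sketch of the date arithmetic (the real engine presumably proves this in zero knowledge rather than computing it in the clear):

```rust
/// Returns true if a person born on `birth` (year, month, day) is at
/// least 18 years old on `today`. Pure arithmetic: no LLM involved.
fn is_over_18(birth: (i32, u32, u32), today: (i32, u32, u32)) -> bool {
    let (by, bm, bd) = birth;
    let (ty, tm, td) = today;
    let mut age = ty - by;
    // Subtract a year if this year's birthday hasn't happened yet.
    if (tm, td) < (bm, bd) {
        age -= 1;
    }
    age >= 18
}

fn main() {
    assert!(is_over_18((2000, 6, 15), (2018, 6, 15))); // 18th birthday
    assert!(!is_over_18((2000, 6, 15), (2018, 6, 14))); // one day short
    println!("age checks ok");
}
```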
### CPF Validation
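CPF check digits are also deterministic: the two trailing digits are a mod-11 checksum over the preceding nine. A sketch of the standard algorithm (illustrative, not the project's code):

```rust
/// Validates a Brazilian CPF's check digits (standard mod-11 scheme).
fn validate_cpf(cpf: &str) -> bool {
    let digits: Vec<u32> = cpf.chars().filter_map(|c| c.to_digit(10)).collect();
    if digits.len() != 11 {
        return false;
    }
    // All-identical digits (e.g. 111.111.111-11) pass the checksum but are invalid.
    if digits.windows(2).all(|w| w[0] == w[1]) {
        return false;
    }
    // Check digit over the first `n` digits, weighted (n+1) down to 2.
    let check = |n: usize| -> u32 {
        let sum: u32 = digits[..n]
            .iter()
            .enumerate()
            .map(|(i, d)| d * (n as u32 + 1 - i as u32))
            .sum();
        let r = (sum * 10) % 11;
        if r == 10 { 0 } else { r }
    };
    check(9) == digits[9] && check(10) == digits[10]
}

fn main() {
    assert!(validate_cpf("111.444.777-35"));  // classic valid example
    assert!(!validate_cpf("123.456.789-00")); // fails the second check digit
    println!("cpf checks ok");
}
```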
### Document Authenticity
## Architecture

```
┌─────────────────┐
│   Client App    │
└────────┬────────┘
         │ HTTPS
         ▼
┌─────────────────┐
│   Axum Server   │
│   (Port 3000)   │
└────────┬────────┘
         │
     ┌───┴───┐
     │  API  │
     └───┬───┘
         │
  ┌──────┴───────┐
  │ Verification │
  │    Engine    │
  └──────┬───────┘
         │
   ┌─────┴────┬──────────┬──────────┐
   │          │          │          │
┌──▼────┐ ┌───▼──┐ ┌─────▼────┐ ┌───▼────┐
│  Doc  │ │ LLM  │ │ Security │ │ Audit  │
│Parser │ │Client│ │ Manager  │ │  Log   │
└───────┘ └──────┘ └──────────┘ └────────┘
```
## Modules

- `api`: HTTP endpoints and request/response handling
- `document`: PDF/image parsing and text extraction
- `llm`: OpenAI integration for question answering
- `security`: Encryption, hashing, and secure memory
- `verification`: Core verification orchestration
- `error`: Error types and handling
## Development

### Run in Development Mode

### Run Tests

### Build for Production

### Enable OCR (Optional)
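The commands for the steps above were elided; the following assumes standard Cargo workflows, and the OCR feature-flag name is a guess (check `Cargo.toml` for the actual feature):

```bash
# Run in development mode
cargo run

# Run tests
cargo test

# Build for production
cargo build --release

# Enable OCR (feature name assumed)
cargo build --release --features ocr
```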
## Performance

- Document Processing: ~100-500ms (depending on size)
- LLM Query: ~1-3s per question
- Memory Usage: ~50MB base + document size
- Concurrent Requests: Supports async/await with Tokio
## Configuration

Environment variables:

- `OPENAI_API_KEY`: Your OpenAI API key (required)
- `RUST_LOG`: Logging level (default: `zkid_verifier=debug`)
- `SERVER_PORT`: Server port (default: `3000`)
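A matching `.env` might look like this (values are placeholders):

```
OPENAI_API_KEY=sk-...
RUST_LOG=zkid_verifier=debug
SERVER_PORT=3000
```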
## Contributing

This is a production-grade zero-knowledge verification system. Contributions should maintain:
- Zero-knowledge guarantees
- Memory safety
- Security best practices
- Performance standards
## License

MIT License - see the LICENSE file for details.
## Important Notes

- API Key Security: Never commit your `.env` file
- Production Use: Use HTTPS in production
- Rate Limiting: Implement rate limiting for production deployments
- LLM Costs: Each verification uses OpenAI API credits
- Document Size: Max 10MB per document
## Roadmap
- Support for more document formats (DOCX, etc.)
- OCR integration for scanned documents
- Multi-language support
- Custom LLM provider support (Anthropic, local models)
- WebSocket support for real-time verification
- Prometheus metrics
- Docker container
- Kubernetes deployment manifests