VELKA

A fast, privacy-first secret and PII scanner for codebases.

Detects leaked credentials (AWS, GCP, Azure, GitHub, Stripe, 52+ providers), PII (CPF, CNPJ, SSN, NIF, DNI, IBAN) and sensitive tokens — with ML-powered confidence scoring to cut false positives.

cargo install velka        # install
velka scan .               # scan current directory

Why Velka?

Zero telemetry — nothing leaves your machine, secrets redacted by default
Fast — memory-mapped I/O, parallel scanning, compiled regex
Low noise — structural validation + ML ensemble keeps false positives under 0.1%

English | Portugues (BR)

Features

52+ Detection Rules: AWS, GCP, Azure, GitHub, Stripe, SendGrid, Twilio, Datadog, Cloudflare, Supabase, Vercel, and more
PII Compliance: CPF, CNPJ (including 2026 alphanumeric), NIF, DNI, SSN, IBAN — all with check-digit validation
Privacy First: Zero telemetry, no network calls, secrets redacted by default
High Performance: Memory-mapped I/O, parallel scanning, compiled regex
CI/CD Ready: JUnit, SARIF, CSV, Markdown, HTML output formats
Incremental Scanning: --diff and --staged for fast pre-commit checks
Git Forensics: --deep-scan finds secrets buried in commit history
God Mode: --god-mode enables semantic analysis, bloom dedup, and full ML scoring
Library API: Use as a Rust crate (velka::scan_str, velka::scan)
LSP Server: Real-time secret detection in your editor
Interactive TUI: Terminal dashboard for triaging findings
ML Classifier: Ensemble scoring (entropy + char frequency + structural + length)
K8s Admission Controller: Block Pods with secrets in manifests
Runtime Log Scanner: Monitor container stdout for secret leaks

Privacy & Security

Velka is Local-First, No-Telemetry, and Air-Gapped by Default. No data ever leaves your machine unless you explicitly opt in with --verify.

See PRIVACY.md for the full privacy policy and independent verification steps.

Installation

Pre-built Binaries (Recommended)

Download the latest release for your platform from GitHub Releases.

# Linux / macOS (shell installer)
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/wesllen-lima/velka/releases/latest/download/velka-installer.sh | sh

# Windows (PowerShell installer)
powershell -ExecutionPolicy ByPass -c "irm https://github.com/wesllen-lima/velka/releases/latest/download/velka-installer.ps1 | iex"

Cargo (from crates.io)

cargo install velka

Cargo (from GitHub)

cargo install --git "https://github.com/wesllen-lima/velka" --locked

From Source (local checkout)

cargo install --path .

Docker

docker run --rm -v $(pwd):/code velka scan /code

As Library

# Cargo.toml
[dependencies]
velka = "1.4"

use velka::{scan, Severity};

fn main() -> velka::VelkaResult<()> {
    let sins = velka::scan(std::path::Path::new("."))?;

    let mortal_count = sins.iter()
        .filter(|s| s.severity == Severity::Mortal)
        .count();

    if mortal_count > 0 {
        std::process::exit(1);
    }
    Ok(())
}

v1.4.0 — The Precision Update

AST-Powered Analysis

Velka 1.4.0 introduces scope-aware analysis that understands code structure — not just text patterns.

Test detection: findings inside test functions, #[cfg(test)] blocks, and test files (*_test.go, test_*.py, *.spec.ts, etc.) are automatically down-scored
Docstring awareness: example credentials in documentation and JSDoc blocks are filtered
40% fewer false positives on real-world codebases without touching entropy thresholds
Multi-language: Rust, Python, Go, TypeScript, JavaScript, Java, Ruby, PHP, C/C++

# AST filtering is on by default — no flags needed
velka scan .

# See filtering decisions in JSON output
velka scan . --format json | jq '.[] | {file, rule, filtered_reason}'

Permission-Aware Verification

--verify now extracts the actual permissions attached to a live secret and classifies its blast radius.

[MORTAL] AWS_ACCESS_KEY  src/config.rs:14
  Value   : AKIA****MPLE
  Status  : ACTIVE
  Risk    : Critical
  Perms   : s3:*, iam:*, ec2:*  (Admin-equivalent)
  Detail  : Key belongs to IAM user "deploy-bot" (account 123456789012)

[MORTAL] GITHUB_TOKEN  .env:3
  Value   : ghp_****Xk9
  Status  : ACTIVE
  Risk    : High
  Perms   : repo, workflow, write:packages
  Detail  : Token owned by "wesllen-lima", expires never

Risk levels: Critical · High · Medium · Low · Info

velka scan . --verify --format json | jq '.[] | select(.risk_level == "Critical")'

Infrastructure Security

Dedicated IaC scanner for Terraform, Kubernetes, and Docker — same rule engine, purpose-built rules.

Category	Rules
Terraform	Hardcoded credentials in `provider {}`, public S3 buckets, open security groups (0.0.0.0/0), unencrypted RDS/EBS
Kubernetes	`privileged: true`, `hostNetwork/hostPID`, missing resource limits, secrets in env vars, latest image tags
Docker	`USER root`, `:latest` tag, secrets in `ENV`/`ARG`, `--privileged`, `curl \| bash` patterns

# Scan IaC files explicitly (also detected automatically during velka scan)
velka scan ./infra --format terminal
velka scan ./k8s   --format sarif > k8s-findings.sarif

Drift Detection (Baseline)

Track your secret posture over time. Save a baseline and alert only on new findings.

# Save current findings as baseline
velka baseline save

# Later: show only new findings since baseline
velka baseline diff

# Inspect saved baseline
velka baseline show

Example output of velka baseline diff:

Baseline: 2026-02-10T14:32:00Z  (12 findings)
Current : 2026-02-17T09:15:00Z  (14 findings)

NEW (2):
  [+] MORTAL  AWS_ACCESS_KEY   src/infra/deploy.tf:8
  [+] VENIAL  HARDCODED_IP     src/service/client.rs:42

RESOLVED (0):
  (none)

Baseline is stored in ~/.velka/baseline.json (per-project, keyed by repo root).

Usage

# Basic scan
velka scan .

# Show progress bar
velka scan . --progress

# Only changed files (fast pre-commit)
velka scan . --diff

# Only staged files
velka scan . --staged

# Git history forensics
velka scan . --deep-scan

# Only critical issues
velka scan . --mortal-only

# Different output formats
velka scan . --format json
velka scan . --format csv
velka scan . --format junit    # CI dashboards
velka scan . --format sarif    # GitHub Code Scanning
velka scan . --format markdown
velka scan . --format html
velka scan . --format report   # Before/After remediation (redacted)

# Use configuration profile
velka scan . --profile ci

# Show full secrets (debugging only)
velka scan . --no-redact

# Verify secrets via API (opt-in; makes network calls for GitHub token, etc.)
velka scan . --verify

# Migrate secrets to .env and update source (opt-in; requires .env in .gitignore)
velka scan . --migrate-to-env --dry-run   # Preview only
velka scan . --migrate-to-env --yes       # Apply without confirmation
velka scan . --migrate-to-env             # Interactive confirmation
velka scan . --migrate-to-env --env-file .env.local

# God Mode: full deep analysis (semantic decoding, bloom dedup, ML scoring)
velka scan . --god-mode
velka scan . --god-mode --format json

# Scan from stdin (e.g. pipe from git diff)
git diff | velka stdin
cat logs/*.log | velka stdin --format json

# Install pre-commit hook
velka hook install

Exit codes

0: no Mortal sins found
1: at least one Mortal sin found

LSP Server (Editor Integration)

Velka includes a built-in Language Server Protocol server that provides real-time secret detection as you type.

Setup

# Start the LSP server (stdio transport)
velka lsp

VS Code

Add to your settings.json:

{
  "velka.lsp.enabled": true,
  "velka.lsp.path": "velka"
}

Or use the VS Code extension in vscode-extension/.

Neovim (nvim-lspconfig)

require('lspconfig').velka.setup{
  cmd = { "velka", "lsp" },
  filetypes = { "*" },
}

Features

Diagnostics on save: warnings/errors for detected secrets
Works with any editor supporting LSP (VS Code, Neovim, Helix, Zed, Emacs)
Uses the same rule engine and ML classifier as the CLI
Hot-reloads dynamic rules from ~/.velka/rules.d/

Interactive TUI

A full terminal dashboard for triaging and managing secret findings.

# Launch TUI on current directory
velka tui .

# Include git history findings
velka tui . --deep-scan

Controls

Key	Action
`j`/`k` or arrows	Navigate findings
`Enter`	View finding details with syntax highlighting
`e`	Open entropy visualizer
`q`	Quit
`?`	Help

Features

File explorer with syntax-highlighted code preview
Entropy density visualization (bar charts)
ML confidence scores per finding
Keyboard-driven workflow for security triage

ML Classifier

Velka uses an ensemble scoring system to achieve <0.1% false positive rate. No external ML runtime required.

How it works

Pattern match (regex) establishes base confidence
Shannon entropy filters low-entropy false positives
Context scoring analyzes surrounding code (assignments, comments, tests)
ML features: character class distribution, bigram frequency, structural analysis
Final confidence = weighted blend of all factors

# Verify output includes confidence scores
velka scan . --format json | jq '.[].confidence'

See docs/architecture.md for the full technical explanation.

God Mode (Deep Analysis)

The --god-mode flag activates all analysis engines simultaneously:

Semantic decoding: Detects base64-encoded, hex-encoded, and ROT13 obfuscated secrets
Variable name analysis: Flags suspicious assignments like password = "..." even without regex match
String concatenation detection: Finds secrets split across multiple lines
Bloom filter dedup: Eliminates duplicate snippets across files (zero false negatives)
ML ensemble scoring: All findings enriched with confidence scores

Without --god-mode, Velka runs only pattern matching and ML scoring for maximum speed. God mode trades throughput for depth.

velka scan . --god-mode --format json

Kubernetes Integration

Admission Controller (Webhook)

Block Pods and Deployments that contain secrets in their manifests before they reach the cluster.

# Start admission webhook (plain HTTP for development)
velka k8s webhook --addr 0.0.0.0:8443

# With TLS (production)
velka k8s webhook --addr 0.0.0.0:8443 --tls-cert cert.pem --tls-key key.pem

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: velka-secret-scanner
webhooks:
  - name: velka.security.io
    clientConfig:
      service:
        name: velka-webhook
        namespace: velka-system
        path: /validate
    rules:
      - apiGroups: [""]
        resources: ["pods", "secrets", "configmaps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
    failurePolicy: Ignore
    sideEffects: None
    admissionReviewVersions: ["v1"]

Manifest Scanning

Scan local YAML files without running the webhook server:

velka k8s scan deployment.yaml

Runtime Log Scanner

Monitor container logs in real-time for accidentally leaked secrets.

# Scan from stdin (pipe from docker/kubectl)
kubectl logs -f my-pod | velka runtime

# Scan log files
velka runtime /var/log/app.log /var/log/worker.log

# Follow mode (tail -f behavior)
velka runtime /var/log/app.log --follow

Exits with code 1 if mortal secrets are detected. Useful as a sidecar container or log monitoring daemon.

Shell Completions

Generate autocompletion scripts for your shell:

# Bash
velka completions bash > ~/.local/share/bash-completion/completions/velka

# Zsh
velka completions zsh > ~/.zfunc/_velka

# Fish
velka completions fish > ~/.config/fish/completions/velka.fish

# PowerShell
velka completions powershell > velka.ps1

Configuration

Create velka.toml in your project root:

[scan]
ignore_paths = ["vendor/**", "tests/fixtures/**"]
entropy_threshold = 4.6
whitelist = ["localhost", "example.com", "test@example.com"]

[output]
redact_secrets = true

[cache]
enabled = true
location = "both"  # "project", "user", or "both"

[rules]
disable = ["HARDCODED_IP"]

[[rules.custom]]
id = "INTERNAL_API"
pattern = "MYCOMPANY_[A-Z0-9]{32}"
severity = "Mortal"
description = "Internal API key detected"

[profile.ci]
cache.enabled = false
output.redact_secrets = true

[profile.dev]
scan.entropy_threshold = 5.0
output.redact_secrets = false

Inline ignores: Add velka:ignore comment on any line to skip it.

Quick Init

velka init --preset balanced  # also: strict, ci, monorepo

Detection Rules

Mortal Sins (Critical)

Rule	Description
`AWS_ACCESS_KEY`	AWS Access Key ID
`AWS_SECRET_KEY`	AWS Secret Access Key
`GOOGLE_API_KEY`	Google API Key
`GITHUB_TOKEN`	GitHub Personal Access Token
`STRIPE_SECRET`	Stripe Secret Key
`PRIVATE_KEY`	SSH/PGP Private Keys
`SLACK_WEBHOOK`	Slack Webhook URL
`SENDGRID_API`	SendGrid API Key
`TWILIO_API`	Twilio API Key
`NPM_TOKEN`	NPM Auth Token
`PYPI_TOKEN`	PyPI API Token
`DISCORD_TOKEN`	Discord Bot Token
`TELEGRAM_BOT`	Telegram Bot Token
`DB_CONNECTION_STRING`	Database Connection String
`HARDCODED_PASSWORD`	Hardcoded Password
`AZURE_STORAGE_KEY`	Azure Storage Account Key
`GCP_SERVICE_ACCOUNT`	GCP Service Account Key
`HEROKU_API_KEY`	Heroku API Key
`MAILGUN_API_KEY`	Mailgun API Key
`SQUARE_ACCESS_TOKEN`	Square Access Token
`SQUARE_OAUTH_SECRET`	Square OAuth Secret
`CREDIT_CARD`	Credit Card (Luhn validated)
`HIGH_ENTROPY`	High Entropy Strings
`K8S_PRIVILEGED`	Kubernetes Privileged Pod

Venial Sins (Warnings)

Rule	Description
`JWT_TOKEN`	JWT Token
`HARDCODED_IP`	Hardcoded IP Address
`EVAL_CALL`	eval() Call
`DOCKER_ROOT`	Dockerfile Root User
`DOCKER_LATEST`	Dockerfile :latest Tag
`K8S_HOST_NETWORK`	Kubernetes Host Network
`K8S_HOST_PID`	Kubernetes Host PID
`GENERIC_API_KEY`	Generic API Key Pattern
`GENERIC_SECRET`	Generic Secret Pattern

CI/CD Integration

GitHub Actions (Official Action)

- uses: actions/checkout@v4
- uses: wesllen-lima/velka/.github/actions/velka-scan@main
  with:
    path: .
    format: terminal
    mortal-only: 'true'
    fail-on-secrets: 'true'
    # diff-only: 'true'    # PR mode: only scan changed files
    # deep-scan: 'true'    # Also scan git history
    # since: 'main'        # Incremental: changes since branch

GitHub Actions (Manual + SARIF)

- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- run: cargo install velka --locked
- run: velka scan . --format sarif > results.sarif
- uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: results.sarif

GitLab CI

velka-scan:
  script:
    - velka scan . --format junit > velka-report.xml
  artifacts:
    reports:
      junit: velka-report.xml

Pre-commit Hook

Option 1 - pre-commit framework (add to .pre-commit-config.yaml):

repos:
  - repo: https://github.com/wesllen-lima/velka
    rev: v1.4.0
    hooks:
      - id: velka

Requires velka on PATH (cargo install velka). Then run pre-commit run velka.

Option 2 - Git hook only:

velka hook install           # Standard (blocks mortal only)
velka hook install --strict  # Strict (blocks all sins)

Honeytokens

Generate and inject canary tokens to detect unauthorized access:

# Generate and inject to .env.example
velka honeytoken generate --target .env.example

# Also inject to README.md
velka honeytoken generate --target .env.example --readme

Velka automatically detects its own honeytokens during scans and flags them separately.

Secret Rotation

Get step-by-step rotation guides for detected secrets:

# Show rotation guidance
velka rotate .

# Filter by rule
velka rotate . --rule AWS_ACCESS_KEY

# Show executable CLI commands
velka rotate . --commands

# Mark as remediated
velka rotate . --mark-remediated

Security

Zero Telemetry: No data ever leaves your machine
Redaction by Default: Secrets are masked in output (AKIA****MPLE)
Secure Cache: Only stores file hashes, never secret content
Path Validation: System paths (/proc, /sys, /dev) cannot be scanned
Secure Errors: Error messages don't leak sensitive paths

Performance

Parallel Scanning: Uses ignore crate's parallel walker
Memory-Mapped I/O: Files >1MB use mmap for efficiency
Compiled Regex: All patterns compiled once via std::sync::LazyLock
Lock-free Channels: crossbeam-channel for zero-contention
Smart Skipping: Binary detection via magic bytes, minified code skipped
Batch Cache Writes: Cache misses are buffered and flushed once per run to reduce RwLock contention

Benchmarks

Run cargo bench to reproduce. Benchmarks live in benches/scan_bench.rs.

Throughput (cache disabled):

Files	Benchmark name	Typical median
100	`scan_100_files`	~2 ms
1,000	`scan_1000_files`	~4.5 ms
5,000	`scan_5000_files`	~12 ms
10,000	`scan_10000_files`	~21 ms

Cache impact (1,000 files, cache enabled):

Benchmark name	Description
`scan_1000_files_cache_cold`	First run: full scan, cache populated
`scan_1000_files_cache_hit`	Second run: cache hit, no re-scan

Run only cache benchmarks: cargo bench scan_1000_files_cache. Run a single bench: cargo bench scan_1000_files.

Velka is designed to be significantly faster than alternatives (e.g. TruffleHog, detect-secrets) due to Rust's zero-cost abstractions, parallel file walking, and memory-mapped I/O. Run both on your codebase to compare.

Architecture

For a deep dive into the Ensemble Scoring engine, rule plugin system, and module map, see docs/architecture.md.

Documentation

Architecture - Engine internals and scoring system
Privacy Policy - Local-first, no-telemetry guarantee
Contributing - How to contribute
Changelog - Version history
Security Policy - Vulnerability reporting

License

Licensed under MIT OR Apache-2.0.

See LICENSE, LICENSE-MIT, and LICENSE-APACHE.

velka 1.4.0

VELKA

Features

Privacy & Security

Installation

Pre-built Binaries (Recommended)

Cargo (from crates.io)

Cargo (from GitHub)

From Source (local checkout)

Docker

As Library

v1.4.0 — The Precision Update

AST-Powered Analysis

Permission-Aware Verification

Infrastructure Security

Drift Detection (Baseline)

Usage

Exit codes

LSP Server (Editor Integration)

Setup

VS Code

Neovim (nvim-lspconfig)

Features

Interactive TUI

Controls

Features

ML Classifier

How it works

God Mode (Deep Analysis)

Kubernetes Integration

Admission Controller (Webhook)

Manifest Scanning

Runtime Log Scanner

Shell Completions

Configuration

Quick Init

Detection Rules

Mortal Sins (Critical)

Venial Sins (Warnings)

CI/CD Integration

GitHub Actions (Official Action)

GitHub Actions (Manual + SARIF)

GitLab CI

Pre-commit Hook

Honeytokens

Secret Rotation

Security

Performance

Benchmarks

Architecture

Documentation

License