k8sql 0.1.11

Query Kubernetes with SQL
k8sql-0.1.11 is not a library.

k8sql

Query Kubernetes clusters using SQL. Powered by Apache DataFusion.

Features

  • SQL queries on Kubernetes resources - SELECT, WHERE, ORDER BY, LIMIT, GROUP BY
  • Multi-cluster support - Query across clusters with _cluster column
  • Query optimizer - Pushes filters to K8s API (namespaces, labels)
  • Multiple interfaces - Interactive REPL, batch mode, PostgreSQL wire protocol
  • CRD support - Automatically discovers and caches Custom Resource Definitions

Installation

Quick Install (Linux/macOS)

curl -sSfL https://raw.githubusercontent.com/ndenev/k8sql/master/install.sh | sh

The script automatically detects your OS and architecture, downloads the latest release, and installs to /usr/local/bin or ~/.local/bin.

From Cargo

cargo install k8sql

Manual Download

Download pre-built binaries for Linux, macOS, and Windows from GitHub Releases.

Quick Start

# Interactive REPL
k8sql

# Single query
k8sql -q "SELECT name, namespace, status->>'phase' FROM pods"

# Multi-cluster query
k8sql -c "prod-*" -q "SELECT _cluster, name FROM pods WHERE status->>'phase' = 'Failed'"

# Output as JSON
k8sql -q "SELECT * FROM deployments" -o json

CLI Options

-q, --query <SQL>       Execute a SQL query directly
-c, --context <CTX>     Kubernetes context(s): name, comma-separated, or glob pattern
-o, --output <FMT>      Output format: table, json, csv, yaml (default: table)
-f, --file <PATH>       Execute queries from a file
    --no-headers        Omit column headers in output
    --refresh-crds      Force refresh of cached CRD schemas
-v, --verbose           Enable verbose logging

Subcommands

k8sql interactive       # Start REPL (default)
k8sql daemon -p 15432   # PostgreSQL wire protocol server

SQL Extensions

The _cluster Column

Every table includes _cluster as the first column, enabling cross-cluster queries:

SELECT * FROM pods WHERE _cluster = 'prod'           -- Specific cluster
SELECT * FROM pods WHERE _cluster IN ('prod', 'dev') -- Multiple clusters
SELECT * FROM pods WHERE _cluster = '*'              -- All clusters (parallel)

JSON Field Access

Access nested fields in spec, status, labels, annotations using PostgreSQL-style operators:

SELECT name, status->>'phase' FROM pods
SELECT name, status->'nodeInfo'->>'kubeletVersion' FROM nodes
SELECT * FROM pods WHERE labels->>'app' = 'nginx'

Or function syntax for deeper nesting:

SELECT json_get_str(spec, 'containers', 0, 'image') FROM pods
SELECT json_get_int(spec, 'replicas') FROM deployments

Available functions: json_get_str, json_get_int, json_get_float, json_get_bool, json_get_json, json_get_array, json_length, json_keys.

Working with Arrays

Use json_get_array() with UNNEST to expand arrays:

SELECT name, json_get_str(UNNEST(json_get_array(spec, 'containers')), 'image') as image
FROM pods

Query Optimization

k8sql pushes predicates to the Kubernetes API when possible:

Predicate Optimization
namespace = 'x' Namespaced API call
labels->>'app' = 'nginx' Label selector (server-side)
_cluster = 'prod' Only queries that cluster
_cluster = '*' Parallel queries to all clusters

Use -v to see which filters are pushed down.

Table Schema

All resources share a consistent schema:

Column Type Description
_cluster text Kubernetes context name
api_version text API version (v1, apps/v1, etc.)
kind text Resource kind
name text Resource name
namespace text Namespace (null for cluster-scoped)
uid text Unique identifier
created timestamp Creation timestamp
labels json Labels (access with labels->>'key')
annotations json Annotations
spec json Resource specification
status json Resource status

Special cases: ConfigMaps/Secrets have data instead of spec/status. Events have dedicated columns.

REPL Commands

SHOW TABLES          List available tables
SHOW DATABASES       List kubectl contexts (* = active)
DESCRIBE <table>     Show table schema
USE <cluster>        Switch context(s) - supports globs: USE prod-*
\dt, \l, \d, \x, \q  Shortcuts

Context selection persists to ~/.k8sql/config.json and restores on next startup.

Daemon Mode

Run as a PostgreSQL-compatible server:

k8sql daemon --port 15432 --context "prod-*"
psql -h localhost -p 15432

Caching

k8sql caches CRD schemas for fast startup:

  • CRD schemas: Cached indefinitely (use --refresh-crds to force refresh)
  • Cluster CRD list: Checked hourly for new/removed CRDs
  • Cache location: ~/.k8sql/cache/

Example Queries

-- Pods with status
SELECT name, namespace, status->>'phase' as phase FROM pods

-- Unhealthy deployments
SELECT name, json_get_int(spec, 'replicas') as desired,
       json_get_int(status, 'readyReplicas') as ready
FROM deployments
WHERE json_get_int(status, 'readyReplicas') < json_get_int(spec, 'replicas')

-- Cross-cluster pod count
SELECT _cluster, COUNT(*) FROM pods WHERE _cluster = '*' GROUP BY _cluster

-- Find pods by label (server-side filtering)
SELECT name FROM pods WHERE labels->>'app' = 'nginx' AND namespace = 'default'

Built With

License

BSD-3-Clause