# Observability Guide
## Overview
PG-API includes optional observability features for sending metrics, logs, and traces to an OpenSearch API service. When enabled, the service automatically tracks:
- API request metrics (method, path, status, duration)
- Error rates and response times
- Database query performance
- System health metrics
## Configuration
Observability is disabled by default. To enable it, set the following environment variables:
```bash
# Enable observability
OPENSEARCH_ENABLED=true
# OpenSearch API endpoint
OPENSEARCH_API_URL=https://opensearch-api.yourdomain.com
# API token for authentication
OPENSEARCH_API_TOKEN=sk_live_pg_api_production_xxxxx
# Index prefix for data (default: pg-api)
OPENSEARCH_INDEX_PREFIX=pg-api
# Batch size for sending events (default: 100)
OPENSEARCH_BATCH_SIZE=100
# Flush interval in seconds (default: 5)
OPENSEARCH_FLUSH_INTERVAL=5
```
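The following is a minimal sketch of how these variables and their documented defaults could be read in Python. The `ObservabilityConfig` class and `load_config` function are hypothetical names used for illustration, not part of PG-API, and treating only the string `true` (case-insensitive) as enabled is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservabilityConfig:
    """Hypothetical container for the settings documented above."""
    enabled: bool
    api_url: str
    api_token: str
    index_prefix: str
    batch_size: int
    flush_interval: float


def load_config(env=None):
    """Read the OPENSEARCH_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return ObservabilityConfig(
        # Assumption: only "true" (any letter case) enables observability.
        enabled=env.get("OPENSEARCH_ENABLED", "false").strip().lower() == "true",
        api_url=env.get("OPENSEARCH_API_URL", ""),
        api_token=env.get("OPENSEARCH_API_TOKEN", ""),
        index_prefix=env.get("OPENSEARCH_INDEX_PREFIX", "pg-api"),
        batch_size=int(env.get("OPENSEARCH_BATCH_SIZE", "100")),
        flush_interval=float(env.get("OPENSEARCH_FLUSH_INTERVAL", "5")),
    )
```

Calling `load_config()` with no arguments reads from the process environment; passing a plain dictionary makes the function easy to exercise in isolation.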
## Generating API Tokens
Use the provided script to generate API tokens:
```bash
# Generate production token
python3 scripts/generate_opensearch_token.py --environment production
# Generate development token
python3 scripts/generate_opensearch_token.py --environment development --output token.json
```
The generated token must be configured in your opensearch-api service to allow ingestion.
## Metrics Collected
### Request Metrics
- `method`: HTTP method (GET, POST, etc.)
- `path`: Request path
- `status_code`: HTTP response code
- `duration_ms`: Request processing time in milliseconds
- `correlation_id`: Unique request identifier
- `client_ip`: Client IP address (when available)
- `api_key_id`: API key used for authentication
- `error`: Error message for failed requests
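To make the field list above concrete, here is a sketch of a single request event as a Python dictionary. The `build_request_event` helper is hypothetical, and both the use of a UUID for `correlation_id` and the omission of unset optional fields are assumptions; the actual event layout produced by PG-API may differ.

```python
import uuid


def build_request_event(method, path, status_code, duration_ms,
                        correlation_id=None, client_ip=None,
                        api_key_id=None, error=None):
    """Assemble one request event using the field names listed above."""
    event = {
        "method": method,
        "path": path,
        "status_code": status_code,
        "duration_ms": duration_ms,
        # Assumption: a random UUID serves as the correlation ID when none is given.
        "correlation_id": correlation_id or str(uuid.uuid4()),
    }
    # Assumption: optional fields are left out entirely when they have no value.
    optional = {"client_ip": client_ip, "api_key_id": api_key_id, "error": error}
    event.update({key: value for key, value in optional.items() if value is not None})
    return event
```

For example, `build_request_event("GET", "/health", 200, 3.2, correlation_id="req-1")` yields a dictionary with exactly the five always-present fields.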
### System Metrics
- Service health status
- Active connections count
- Rate limit status
- License validation status
## Data Flow
1. **Event Generation**: Metrics are generated automatically by middleware
2. **Buffering**: Events are buffered in memory (max 10,000 events)
3. **Batching**: Events are sent in batches (default: 100 events)
4. **Flushing**: A batch is sent when the flush interval elapses (default: 5 seconds) or as soon as a full batch is buffered, whichever comes first
5. **Retry**: If a send fails, its events are returned to the buffer and retried on a later flush
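The buffering, batching, and retry steps above can be sketched as a small in-memory buffer. This is an illustration under stated assumptions, not PG-API's actual implementation: the `EventBuffer` class and its `send` callable are hypothetical, the timer that would call `flush()` every flush interval is omitted, and dropping the oldest events once the 10,000-event limit is reached is an assumption.

```python
from collections import deque


class EventBuffer:
    """Hypothetical bounded buffer that sends events in batches."""

    def __init__(self, send, batch_size=100, max_events=10_000):
        # `send` is a hypothetical callable that takes a list of events
        # and raises an exception when delivery fails.
        self._send = send
        self._batch_size = batch_size
        # Assumption: when the buffer is full, the oldest events are discarded.
        self._events = deque(maxlen=max_events)

    def add(self, event):
        """Buffer one event and flush as soon as a full batch is available."""
        self._events.append(event)
        if len(self._events) >= self._batch_size:
            self.flush()

    def flush(self):
        """Send one batch; on failure, return its events to the front of the buffer."""
        if not self._events:
            return False
        count = min(self._batch_size, len(self._events))
        batch = [self._events.popleft() for _ in range(count)]
        try:
            self._send(batch)
        except Exception:
            # extendleft reverses its input, so reverse first to keep the order.
            self._events.extendleft(reversed(batch))
            return False
        return True

    def __len__(self):
        return len(self._events)
```

Because a failed batch goes back to the front of the buffer, the next call to `flush()` retries the same events in their original order.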
## Index Pattern
Data is sent to daily indices with the pattern:
```
{index_prefix}-YYYY.MM.DD
```
Example: `pg-api-2025.08.07`
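As a short illustration, the daily index name can be derived from a prefix and a date. The `daily_index` function is a hypothetical helper, and using the UTC date for the daily rollover is an assumption; the guide does not state which time zone PG-API uses.

```python
from datetime import date, datetime, timezone


def daily_index(prefix="pg-api", day=None):
    """Return the index name for `day` in the {index_prefix}-YYYY.MM.DD pattern."""
    # Assumption: indices roll over at midnight UTC.
    day = day or datetime.now(timezone.utc).date()
    return f"{prefix}-{day:%Y.%m.%d}"


print(daily_index("pg-api", date(2025, 8, 7)))  # pg-api-2025.08.07
```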
## Performance Impact
When observability is enabled:
- Minimal CPU overhead (<1%)
- Memory usage: ~10MB for buffer (10,000 events)
- Network: Async batched sends, non-blocking
- Graceful degradation if OpenSearch is unavailable
## Troubleshooting
### Observability Not Working
1. Check environment variables are set correctly
2. Verify OpenSearch API is accessible
3. Confirm API token is valid and has write permissions
4. Check logs for error messages
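The first check can be partly automated. The sketch below inspects the documented environment variables and reports likely misconfigurations. `check_observability_env` is a hypothetical helper, not a script shipped with PG-API, and its validation rules (for example, requiring whole numbers for batch size and flush interval) are assumptions.

```python
import os
from urllib.parse import urlparse


def check_observability_env(env=None):
    """Return a list of human-readable problems found in the OPENSEARCH_* variables."""
    env = os.environ if env is None else env
    problems = []
    if env.get("OPENSEARCH_ENABLED", "").strip().lower() != "true":
        problems.append("OPENSEARCH_ENABLED is not 'true'")
    url = urlparse(env.get("OPENSEARCH_API_URL", ""))
    if url.scheme not in ("http", "https") or not url.netloc:
        problems.append("OPENSEARCH_API_URL is missing or not a valid URL")
    if not env.get("OPENSEARCH_API_TOKEN"):
        problems.append("OPENSEARCH_API_TOKEN is not set")
    # Assumption: both numeric settings are expected to be whole numbers.
    for name in ("OPENSEARCH_BATCH_SIZE", "OPENSEARCH_FLUSH_INTERVAL"):
        value = env.get(name)
        if value is not None and not value.isdigit():
            problems.append(f"{name} must be a whole number, got {value!r}")
    return problems
```

An empty list means none of these basic checks failed; it does not confirm that the endpoint is reachable or that the token is accepted.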
### High Memory Usage
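Reduce the batch size. The configuration section above lists no environment variable for the in-memory buffer limit (10,000 events), so batch size is the setting shown here: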
```bash
OPENSEARCH_BATCH_SIZE=50
```
### Events Not Appearing
1. Wait at least one flush interval (default: 5 seconds); events are batched rather than sent immediately
2. Verify network connectivity to OpenSearch API
3. Check OpenSearch API logs for ingestion errors
## Security Considerations
- Keep API tokens secret: supply them through environment variables or a secrets manager, and never commit them to version control
- Use HTTPS for OpenSearch API endpoint
- Sensitive data is not logged by default
- Correlation IDs help trace requests without exposing data
## Disabling Observability
To disable observability:
```bash
OPENSEARCH_ENABLED=false
# or simply remove/unset the environment variable
```
## Integration with OpenSearch API
The observability module expects an OpenSearch API service that accepts:
- Bearer token authentication
- JSON payloads with event batches
- POST requests to `/v1/ingest/{index}`
Configure your opensearch-api service to accept tokens generated by the `generate_opensearch_token.py` script.
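A request matching these expectations can be sketched with Python's standard library. The `build_ingest_request` helper is hypothetical, and the payload shape (`{"events": [...]}`) is an assumption, since the exact JSON schema expected by the opensearch-api service is not specified here.

```python
import json
import urllib.request


def build_ingest_request(api_url, token, index, events):
    """Build a POST request to /v1/ingest/{index} with bearer token authentication."""
    # Assumption: the batch is wrapped in an object under an "events" key.
    body = json.dumps({"events": events}).encode("utf-8")
    return urllib.request.Request(
        url=f"{api_url.rstrip('/')}/v1/ingest/{index}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

The returned request can be sent with `urllib.request.urlopen(request, timeout=5)`, which raises `urllib.error.HTTPError` on a 4xx or 5xx response; the timeout value is illustrative.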