json-register
Note: This library is currently in beta. The API is stable but may change in future releases based on user feedback and production usage.
json-register is a caching registry for JSON objects that stores them in a PostgreSQL database using the JSONB type. It ensures that semantically equivalent JSON objects are stored only once, by canonicalising objects in the cache and by relying on JSONB comparisons in the database. The database assigns a unique 32-bit integer identifier to each object.
This library is written in Rust and provides native bindings for Python, allowing for seamless integration into applications written in either language.
Features
- Canonicalisation: JSON objects are canonicalised (keys sorted, whitespace removed) before storage to ensure uniqueness based on content.
- Caching: An in-memory Least Recently Used (LRU) cache minimizes database lookups for frequently accessed objects.
- PostgreSQL Integration: Efficiently stores and retrieves JSON data using PostgreSQL's JSONB type.
- Batch Processing: Supports batch registration of objects to reduce network round-trips and improve throughput.
- Cross-Language Support: Provides a native Rust API and a Python extension module.
- Security: SQL injection prevention through identifier validation and automatic password sanitization in error messages.
- Configurable Timeouts: Optional connection pool timeouts for acquire, idle, and maximum lifetime settings.
- Monitoring: Query methods for connection pool metrics and cache hit rate statistics.
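The canonicalisation feature listed above (keys sorted, whitespace removed) can be illustrated with a minimal Python sketch. It relies only on the standard json module; the helper name canonicalise is hypothetical, and the library's own Rust implementation may differ in details such as number formatting.

```python
import json


def canonicalise(obj):
    """Serialise a JSON-compatible value with sorted keys and no extra whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


a = {"name": "Alice", "age": 30}
b = {"age": 30, "name": "Alice"}

# Key order differs, but both objects produce the same canonical string,
# so a cache keyed on that string treats them as one entry.
assert canonicalise(a) == canonicalise(b)
print(canonicalise(a))  # {"age":30,"name":"Alice"}
```

Because sort_keys applies recursively, nested objects are normalised as well; array order is preserved, since it is significant in JSON.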
Installation
Rust
Add the following to your Cargo.toml:
[dependencies]
json-register = "0.1.0"
tokio = { version = "1.0", features = ["full"] }
serde_json = "1.0"
Python
Ensure you have a compatible Python environment (3.8+) and install the package.
Currently available on TestPyPI:
Once published to PyPI:
Database Schema
Before using json-register, create the required table and index in your PostgreSQL database:
CREATE TABLE <table_name> (
    id SERIAL PRIMARY KEY,
    json_object JSONB UNIQUE NOT NULL
);
CREATE INDEX <index_name> ON <table_name> USING GIN (json_object);
The GIN index enables efficient containment and path queries on the JSONB column. You can customise the table name, id column, and jsonb column names - just ensure they match your Register / JsonRegister configuration.
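Since table and column names are configurable, they end up interpolated into SQL, which is where identifier validation matters. The following is a rough sketch of that idea, not the library's actual rule: the pattern (unquoted identifiers of at most 63 characters), the helper names, the example table name json_objects, and the generated index name are all assumptions made for illustration.

```python
import re

# Assumed rule: a letter or underscore first, then letters, digits or
# underscores, at most 63 characters in total.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,62}")


def validate_identifier(name):
    """Return name unchanged if it is a safe identifier, else raise ValueError."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"invalid SQL identifier: {name!r}")
    return name


def schema_sql(table, id_column, jsonb_column):
    """Build the table and GIN index DDL from validated names."""
    t = validate_identifier(table)
    i = validate_identifier(id_column)
    j = validate_identifier(jsonb_column)
    return (
        f"CREATE TABLE IF NOT EXISTS {t} (\n"
        f"    {i} SERIAL PRIMARY KEY,\n"
        f"    {j} JSONB UNIQUE NOT NULL\n"
        f");\n"
        f"CREATE INDEX IF NOT EXISTS {t}_{j}_gin ON {t} USING GIN ({j});"
    )


# Hypothetical names; use whatever matches your register configuration.
print(schema_sql("json_objects", "id", "json_object"))
```

Rejecting anything outside a strict pattern is simpler and safer than trying to escape arbitrary input, because identifiers cannot be passed as bound query parameters.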
Usage
Rust Example
The following example demonstrates how to initialize the registry and register JSON objects using the Rust API.
use json_register::Register;
use serde_json::json;
use std::error::Error;
async
Python Example (Synchronous)
The following example demonstrates how to use the library within a Python application using the synchronous API.
# Initialize the register
=
# Register a single object
=
=
# Register a batch of objects
=
=
Python Example (Asynchronous)
For async Python applications (FastAPI, aiohttp, etc.), use the async variants to avoid blocking the event loop.
# Initialize the register (constructor is synchronous)
=
# Register a single object asynchronously
=
= await
# Register a batch of objects asynchronously
=
= await
Configuration
Timeout Parameters
Optional timeout parameters can be specified when initializing the register. All timeouts are in seconds.
- acquire_timeout_secs: Timeout for acquiring a connection from the pool (default: 5)
- idle_timeout_secs: Timeout before closing idle connections (default: 600)
- max_lifetime_secs: Maximum lifetime of a connection (default: 1800)
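One way to keep these three settings together in application code is a small settings object that carries the documented defaults. This is only a sketch: the PoolTimeouts class is hypothetical and not part of json-register, and the positivity check is an assumption about what values make sense rather than a documented constraint.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PoolTimeouts:
    """Hypothetical holder for the three documented timeouts, in seconds."""

    acquire_timeout_secs: int = 5
    idle_timeout_secs: int = 600
    max_lifetime_secs: int = 1800

    def __post_init__(self):
        # Assumed sanity check: every timeout must be a positive number.
        for name, value in asdict(self).items():
            if value <= 0:
                raise ValueError(f"{name} must be positive, got {value}")


# Override two values and keep the documented default for the third.
custom = PoolTimeouts(acquire_timeout_secs=10, idle_timeout_secs=300)
print(asdict(custom))
```

The resulting dictionary uses the documented parameter names as keys, so it could be forwarded wherever those settings are accepted.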
Rust Example with Custom Timeouts
let register = new.await?;
Python Example with Custom Timeouts
=
Monitoring
The library provides comprehensive telemetry metrics for integration with monitoring systems such as Prometheus, OpenTelemetry, or custom logging. All metrics can be retrieved individually or as a complete snapshot.
Connection Pool Metrics
- pool_size(): Total number of connections in the pool (idle and active)
- idle_connections(): Number of idle connections available for use
- active_connections(): Number of connections currently in use
- is_closed(): Whether the connection pool is closed
Cache Metrics
- cache_hits(): Total number of successful cache lookups
- cache_misses(): Total number of unsuccessful cache lookups
- cache_hit_rate(): Hit rate as a percentage (0.0 to 100.0)
- cache_size(): Current number of items in the cache
- cache_capacity(): Maximum cache capacity
- cache_evictions(): Total number of items evicted from the cache
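To make the relationship between these counters concrete, here is a small self-contained LRU cache in Python that tracks hits, misses, and evictions and derives the hit rate as a percentage. It is an illustrative stand-in, not the library's cache: the LruCache class and its method names are hypothetical.

```python
from collections import OrderedDict


class LruCache:
    """Minimal LRU cache that records hit, miss and eviction counts."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0
        self.evictions = 0

    def get(self, key):
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        return None

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
            self.evictions += 1

    def hit_rate(self):
        """Hit rate as a percentage in the range 0.0 to 100.0."""
        total = self.hits + self.misses
        return 100.0 * self.hits / total if total else 0.0

    def size(self):
        return len(self._items)


cache = LruCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # hit: "a" becomes the most recently used entry
cache.put("c", 3)  # capacity exceeded: "b" is evicted
cache.get("b")     # miss
print(cache.hits, cache.misses, cache.evictions, cache.hit_rate())  # 1 1 1 50.0
```

A low hit rate combined with a high eviction count usually suggests that the cache capacity is too small for the working set.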
Database Metrics
- db_queries_total(): Total number of database queries executed
- db_query_errors(): Total number of failed database queries
Operation Metrics
- register_single_calls(): Number of times register_object was called
- register_batch_calls(): Number of times register_batch_objects was called
- total_objects_registered(): Total number of objects registered across all calls
Telemetry Snapshot
The telemetry_metrics() method (Rust only) returns a complete snapshot of all metrics in a single call, which is useful for OpenTelemetry exporters.
Rust Monitoring Example
// Get all metrics at once (recommended for OpenTelemetry)
let metrics = register.telemetry_metrics();
println!;
println!;
println!;
println!;
println!;
// Or query individual metrics
let hit_rate = register.cache_hit_rate();
let active = register.active_connections();
Python Monitoring Example
# Individual metrics
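Because telemetry_metrics() is described as Rust only, a Python caller that wants a single snapshot could assemble one from the individual query methods. The sketch below assumes the Python binding exposes those methods under the names listed in the Monitoring section; collect_metrics and StubRegister are hypothetical, and the stub exists only so the example runs without a database.

```python
# Metric method names as listed in the Monitoring section.
METRIC_METHODS = (
    "pool_size", "idle_connections", "active_connections", "is_closed",
    "cache_hits", "cache_misses", "cache_hit_rate", "cache_size",
    "cache_capacity", "cache_evictions",
    "db_queries_total", "db_query_errors",
    "register_single_calls", "register_batch_calls", "total_objects_registered",
)


def collect_metrics(register):
    """Call every metric method the object provides and return a dict snapshot."""
    snapshot = {}
    for name in METRIC_METHODS:
        method = getattr(register, name, None)
        if callable(method):
            snapshot[name] = method()
    return snapshot


class StubRegister:
    """Stand-in exposing a few metric methods, for demonstration only."""

    def cache_hits(self):
        return 8

    def cache_misses(self):
        return 2

    def cache_hit_rate(self):
        return 80.0


print(collect_metrics(StubRegister()))
```

Methods that the object does not provide are skipped, so the same helper works even if only a subset of metrics is available.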
Logging
The library uses the tracing crate for structured logging. Logs include connection info, cache hit/miss statistics, and batch sizes.
Rust
Use tracing-subscriber to see logs:
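use tracing_subscriber::EnvFilter;

// Read the filter directives from the RUST_LOG environment variable.
tracing_subscriber::fmt()
    .with_env_filter(EnvFilter::from_default_env())
    .init();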
Set the RUST_LOG environment variable to control log levels:
# See debug logs from json-register
RUST_LOG=json_register=debug
# See trace logs (cache hits/misses)
RUST_LOG=json_register=trace
Python
Logs are automatically bridged to Python's logging module:
# Configure Python logging as usual
# Logs from json-register will appear with logger name 'json_register'
# You can also configure just the json_register logger:
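A minimal sketch of that configuration, using only the standard logging module. The logger name json_register comes from the comments above; the format string and the chosen levels are arbitrary examples.

```python
import logging

# Configure Python logging as usual: INFO and above for the whole application.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Raise verbosity for the json_register logger only.
logging.getLogger("json_register").setLevel(logging.DEBUG)
```

With this setup, other loggers stay at INFO while messages from json_register are emitted down to DEBUG.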
Log Levels
| Level | Content |
|---|---|
| INFO | Connection events, configuration |
| DEBUG | Cache statistics, batch sizes, database queries |
| TRACE | Individual cache hits/misses (verbose) |
License
This project is licensed under the Apache-2.0 License.