prefix-register 0.2.2

Status: Beta - API may change before 1.0 release.

A PostgreSQL-backed namespace prefix registry for CURIE expansion and prefix management. Available for both Rust and Python.

Features

  • Dual language support - Native Rust library with Python bindings via PyO3
  • Async-only - Built on tokio for high concurrency
  • In-memory caching - Prefixes loaded on startup for fast CURIE expansion
  • First-prefix-wins - Each URI can only have one registered prefix
  • Batch operations - Efficiently store multiple prefixes in a single round trip
  • PostgreSQL backend - Durable, scalable storage with connection pooling
  • Startup resilience - Optional retry with exponential backoff for container orchestration
  • Input validation - Rejects oversized input to limit denial-of-service risk (prefix max 64 chars, URI max 2048 chars)
  • Tracing instrumentation - Built-in spans and events for observability (configure subscriber in your app)
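
The length limits in the list above can also be checked on the caller's side before a value reaches the registry. The following is a minimal sketch, not the library's own validation: validate_mapping is a hypothetical helper, the limits are assumed to be counted in characters, and rejecting empty values is an additional assumption.

```python
MAX_PREFIX_LEN = 64    # limit stated in the feature list
MAX_URI_LEN = 2048     # limit stated in the feature list


def validate_mapping(prefix: str, uri: str) -> None:
    """Raise ValueError if either value is empty or longer than its limit.

    Hypothetical pre-flight check; the empty-value rule is an assumption.
    """
    if not prefix or len(prefix) > MAX_PREFIX_LEN:
        raise ValueError(f"prefix must be 1-{MAX_PREFIX_LEN} characters")
    if not uri or len(uri) > MAX_URI_LEN:
        raise ValueError(f"URI must be 1-{MAX_URI_LEN} characters")


validate_mapping("foaf", "http://xmlns.com/foaf/0.1/")  # passes silently
```

Running a check like this first lets an application report a bad input without a round trip to the database.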

Use Cases

  • CURIE expansion in RDF processing
  • Namespace prefix management for semantic web applications
  • Prefix discovery from Turtle, JSON-LD, and XML documents
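
As an illustration of the prefix-discovery use case, the sketch below pulls prefix declarations out of Turtle text with a regular expression. It is a simplification rather than a full Turtle parser (it does not skip comments or string literals), and discover_prefixes is a hypothetical helper that is not part of prefix-register.

```python
import re

# Matches Turtle "@prefix foaf: <http://...>" and SPARQL-style "PREFIX foaf: <http://...>".
# The prefix name is optional so that the default ":" prefix is recognised too.
PREFIX_DECL = re.compile(
    r"^\s*(?:@prefix|PREFIX)\s+([A-Za-z][\w.-]*)?:\s*<([^>\s]+)>",
    re.MULTILINE,
)


def discover_prefixes(turtle_text: str) -> list[tuple[str, str]]:
    """Return the (prefix, uri) pairs declared in the text, in document order."""
    pairs = []
    for decl in PREFIX_DECL.finditer(turtle_text):
        prefix, uri = decl.group(1), decl.group(2)
        if prefix:  # the default ":" prefix has no name to register
            pairs.append((prefix, uri))
    return pairs


doc = """@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix : <http://example.org/> .
PREFIX schema: <https://schema.org/>
"""
print(discover_prefixes(doc))
```

The resulting list of (prefix, uri) tuples has the same shape as the input that store_prefixes_if_new() receives in the Python usage example.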

Installation

Rust

Add to your Cargo.toml:

[dependencies]
prefix-register = "0.2"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }

Python

pip install prefix-register

Requires Python 3.10+.

Database Setup

Create the namespaces table in your PostgreSQL database:

CREATE TABLE IF NOT EXISTS namespaces (
    uri TEXT PRIMARY KEY,
    prefix TEXT NOT NULL UNIQUE
);
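
The schema enforces the first-prefix-wins rule at the storage level: uri is the primary key, so a URI can hold only one prefix, and prefix is UNIQUE, so a prefix can point at only one URI. The sketch below demonstrates this with an in-memory SQLite database standing in for PostgreSQL. The store_if_new helper and its INSERT OR IGNORE statement are illustrative assumptions; the SQL that prefix-register actually issues is not shown in this README.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS namespaces (
    uri TEXT PRIMARY KEY,
    prefix TEXT NOT NULL UNIQUE
)
"""


def store_if_new(conn: sqlite3.Connection, prefix: str, uri: str) -> bool:
    """Insert the mapping unless the URI or the prefix is already taken."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO namespaces (uri, prefix) VALUES (?, ?)",
        (uri, prefix),
    )
    return cur.rowcount == 1  # 1 if a row was inserted, 0 if it was ignored


conn = sqlite3.connect(":memory:")  # in-memory stand-in for PostgreSQL
conn.execute(SCHEMA)
first = store_if_new(conn, "foaf", "http://xmlns.com/foaf/0.1/")
second = store_if_new(conn, "friend", "http://xmlns.com/foaf/0.1/")  # URI already registered
third = store_if_new(conn, "foaf", "http://example.org/other/")      # prefix already registered
print(first, second, third)
```

In PostgreSQL, a comparable effect can be obtained with INSERT ... ON CONFLICT DO NOTHING.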

Usage

Rust

use prefix_register::PrefixRegistry;

#[tokio::main]
async fn main() -> prefix_register::Result<()> {
    // Connect to PostgreSQL
    let registry = PrefixRegistry::new(
        "postgres://localhost/mydb",
        10,  // max connections
    ).await?;

    // Store a prefix (only if URI doesn't already have one)
    let stored = registry.store_prefix_if_new(
        "foaf",
        "http://xmlns.com/foaf/0.1/"
    ).await?;
    println!("Prefix stored: {}", stored);

    // Expand a CURIE
    if let Some(uri) = registry.expand_curie("foaf", "Person").await? {
        println!("foaf:Person = {}", uri);
        // Output: foaf:Person = http://xmlns.com/foaf/0.1/Person
    }

    // Batch store prefixes - returns detailed result for logging
    let prefixes = vec![
        ("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#"),
        ("rdfs", "http://www.w3.org/2000/01/rdf-schema#"),
        ("schema", "https://schema.org/"),
    ];
    let result = registry.store_prefixes_if_new(prefixes).await?;
    println!("Stored {}, skipped {}", result.stored, result.skipped);

    // Shorten a URI (longest-match wins)
    if let Some((prefix, local)) = registry.shorten_uri("http://xmlns.com/foaf/0.1/Person").await? {
        println!("{}:{}", prefix, local);
        // Output: foaf:Person
    }

    // Or get a formatted string (returns original URI if no match)
    let curie = registry.shorten_uri_or_full("http://xmlns.com/foaf/0.1/Person").await?;
    println!("{}", curie);  // Output: foaf:Person

    Ok(())
}

Python

import asyncio
from prefix_register import PrefixRegistry

async def main():
    # Connect to PostgreSQL
    registry = await PrefixRegistry.new(
        "postgres://localhost/mydb",
        10  # max connections
    )

    # Store a prefix (only if URI doesn't already have one)
    stored = await registry.store_prefix_if_new(
        "foaf",
        "http://xmlns.com/foaf/0.1/"
    )
    print(f"Prefix stored: {stored}")

    # Expand a CURIE
    uri = await registry.expand_curie("foaf", "Person")
    if uri:
        print(f"foaf:Person = {uri}")
        # Output: foaf:Person = http://xmlns.com/foaf/0.1/Person

    # Batch store prefixes
    prefixes = [
        ("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#"),
        ("rdfs", "http://www.w3.org/2000/01/rdf-schema#"),
        ("schema", "https://schema.org/"),
    ]
    result = await registry.store_prefixes_if_new(prefixes)
    print(f"Stored {result['stored']}, skipped {result['skipped']}")

    # Shorten a URI (longest-match wins)
    shortened = await registry.shorten_uri("http://xmlns.com/foaf/0.1/Person")
    if shortened:
        prefix, local = shortened
        print(f"{prefix}:{local}")  # Output: foaf:Person

    # Or get a formatted string (returns original URI if no match)
    curie = await registry.shorten_uri_or_full("http://xmlns.com/foaf/0.1/Person")
    print(curie)  # Output: foaf:Person

asyncio.run(main())
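
expand_curie() takes the prefix and the local name as separate arguments, so a CURIE held as a single string has to be split first. A minimal sketch of such a split follows; split_curie is a hypothetical helper rather than part of prefix-register, and treating a "//" local part as a full URI is a heuristic borrowed from common CURIE handling.

```python
def split_curie(curie: str) -> tuple[str, str]:
    """Split "prefix:local" at the first colon into (prefix, local_name)."""
    prefix, sep, local = curie.partition(":")
    if not sep or not prefix:
        raise ValueError(f"not a CURIE: {curie!r}")
    if local.startswith("//"):
        # "http://..." looks like a full URI, not a CURIE with prefix "http"
        raise ValueError(f"looks like a full URI, not a CURIE: {curie!r}")
    return prefix, local


prefix, local_name = split_curie("foaf:Person")
print(prefix, local_name)
```

The two values can then be passed on, for example as await registry.expand_curie(prefix, local_name).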

API

PrefixRegistry

Method                               Description
new(database_url, max_connections)   Connect to PostgreSQL and load existing prefixes
new_with_retry(...)                  Connect with retry logic for transient failures
get_uri_for_prefix(prefix)           Get the URI for a prefix (cache-first)
get_prefix_for_uri(uri)              Get the prefix for a URI
store_prefix_if_new(prefix, uri)     Store a prefix if the URI doesn't have one (returns bool)
store_prefixes_if_new(prefixes)      Batch store prefixes
expand_curie(prefix, local_name)     Expand a CURIE to a full URI
expand_curie_batch(curies)           Batch expand CURIEs (returns None for unknown prefixes)
shorten_uri(uri)                     Shorten a URI to (prefix, local_name) using longest-match
shorten_uri_or_full(uri)             Shorten to "prefix:local" or return original URI
shorten_uri_batch(uris)              Batch shorten URIs (returns None for unmatched)
get_all_prefixes()                   Get all registered prefix mappings
prefix_count()                       Get the number of registered prefixes

RetryConfig (Rust) / new_with_retry parameters (Python)

Configuration for startup retry behaviour:

Rust:

  • RetryConfig::default() - 5 retries, 1s initial delay, 30s max delay
  • RetryConfig::none() - No retries (fail immediately)
  • RetryConfig::new(max_retries, initial_delay, max_delay) - Custom configuration

Python:

registry = await PrefixRegistry.new_with_retry(
    "postgres://localhost/mydb",
    10,                    # max_connections
    max_retries=5,         # default: 5
    initial_delay_ms=1000, # default: 1000
    max_delay_ms=30000     # default: 30000
)
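
To get a feel for how these parameters interact, the sketch below computes a delay schedule for exponential backoff. It assumes that the delay doubles after every failed attempt and that no jitter is applied; the README states only the initial delay, the maximum delay, and the retry count, so the exact growth factor used by the library is an assumption here. backoff_delays_ms is a hypothetical helper.

```python
def backoff_delays_ms(
    max_retries: int = 5,
    initial_delay_ms: int = 1000,
    max_delay_ms: int = 30000,
) -> list[int]:
    """Delay before each retry: doubles per attempt (assumed), capped at max_delay_ms."""
    return [
        min(initial_delay_ms * 2 ** attempt, max_delay_ms)
        for attempt in range(max_retries)
    ]


delays = backoff_delays_ms()
print(delays, sum(delays))
```

Under the doubling assumption, the default settings add up to roughly 31 seconds of waiting before the connection attempt is abandoned, which is worth comparing against the startup timeout of the container orchestrator in use.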

BatchStoreResult

Returned by store_prefixes_if_new() for detailed reporting:

Rust:

  • stored: usize - Number of new prefixes stored
  • skipped: usize - Number of prefixes skipped (URI already had a prefix)
  • total() - Total prefixes processed
  • all_stored() - Returns true if all were stored
  • none_stored() - Returns true if none were stored

Python:

result = await registry.store_prefixes_if_new(prefixes)
print(result["stored"])   # Number of new prefixes stored
print(result["skipped"])  # Number skipped (URI already had a prefix)
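
The Python result exposes only the two counters shown above. If the convenience values listed for Rust are wanted in Python, they can be derived from those counters. The sketch below assumes that the total is stored plus skipped and that an empty batch counts as both "all stored" and "none stored"; summarize_batch is a hypothetical helper.

```python
def summarize_batch(result: dict) -> dict:
    """Derive total / all_stored / none_stored from a {"stored", "skipped"} dict."""
    stored = result["stored"]
    skipped = result["skipped"]
    return {
        "total": stored + skipped,      # assumed definition of total()
        "all_stored": skipped == 0,     # nothing was skipped
        "none_stored": stored == 0,     # nothing new was written
    }


summary = summarize_batch({"stored": 2, "skipped": 1})
print(summary)
```

A summary like this is convenient for a single log line after a bulk import.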

License

Apache-2.0