synta 0.2.0

ASN.1 parser, decoder, and encoder library with DER/BER support and C FFI
# Synta Python Package

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents**  *generated with [DocToc](https://github.com/thlorenz/doctoc)*

- [Overview](#overview)
- [Installation](#installation)
  - [From Wheel](#from-wheel)
  - [From Source](#from-source)
- [Quick Start](#quick-start)
- [Features](#features)
  - [Supported ASN.1 Types](#supported-asn1-types)
- [X.509 Path Validation](#x509-path-validation)
- [Documentation](#documentation)
- [Testing](#testing)
- [Performance](#performance)
  - [X.509 Certificate Parsing](#x509-certificate-parsing)
  - [PKCS#7 and PKCS#12 Certificate Extraction](#pkcs7-and-pkcs12-certificate-extraction)
- [License](#license)
- [Contributing](#contributing)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

High-performance ASN.1 parser and encoder for Python, powered by Rust.

## Overview

Synta provides Python bindings to the high-performance Synta ASN.1 library written in Rust.
It offers a low-level API for parsing and encoding ASN.1 data with near-native performance.

The package is delivered as three native extension modules: `_synta.abi3.so` (built by maturin from the `synta-python` crate), `_krb5.abi3.so` (built by cargo from `synta-python-krb5`, registers `synta.krb5` and `synta.spnego`), and `_mtc.abi3.so` (built by cargo from `synta-python-mtc`, registers `synta.mtc`). A shared rlib `synta-python-common` is statically linked into all three.

## Installation

### From Wheel

```bash
pip install synta-0.1.0-cp38-abi3-manylinux_2_34_x86_64.whl
```

### From Source

Requirements:
- Rust toolchain (1.70+)
- Python 3.8+
- Maturin

```bash
# Install maturin
pip install maturin

# Build and install
maturin build --release
pip install target/wheels/synta-*.whl
```

## Quick Start

```python
import synta

# Decode an INTEGER
data = b'\x02\x01\x2A'  # DER-encoded INTEGER 42
decoder = synta.Decoder(data, synta.Encoding.DER)
integer = decoder.decode_integer()
print(integer.to_int())  # Output: 42

# Encode an OBJECT IDENTIFIER
oid = synta.ObjectIdentifier("1.2.840.113549.1.1.1")
encoder = synta.Encoder(synta.Encoding.DER)
encoder.encode_oid(oid)
output = encoder.finish()
print(output.hex())  # Output: 06092a864886f70d010101

# Parse an X.509 certificate
with open('cert.der', 'rb') as f:
    cert = synta.Certificate.from_der(f.read())
print(cert.subject)
print(cert.not_before, "–", cert.not_after)
```
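
To make the byte strings in the Quick Start less opaque, here is a minimal pure-Python sketch (standard library only, independent of synta) of how a short-form DER INTEGER is read and how an OBJECT IDENTIFIER is encoded. The helper names `decode_der_integer` and `encode_der_oid` are hypothetical and are not part of the synta API; the sketch only handles content shorter than 128 bytes.

```python
def decode_der_integer(data: bytes) -> int:
    """Decode a DER INTEGER with a short-form length (illustration only)."""
    tag, length = data[0], data[1]
    if tag != 0x02:
        raise ValueError("not an INTEGER tag")
    if length & 0x80:
        raise ValueError("long-form lengths are not handled in this sketch")
    content = data[2:2 + length]
    if length == 0 or len(content) != length:
        raise ValueError("empty or truncated INTEGER")
    # DER integers are big-endian two's complement.
    return int.from_bytes(content, "big", signed=True)


def encode_der_oid(dotted: str) -> bytes:
    """Encode a dotted OID string as DER (short-form length only)."""
    arcs = [int(part) for part in dotted.split(".")]
    if len(arcs) < 2:
        raise ValueError("an OID needs at least two arcs")
    # The first two arcs share one subidentifier: 40 * first + second.
    subids = [40 * arcs[0] + arcs[1]] + arcs[2:]
    content = bytearray()
    for value in subids:
        # Base-128, big-endian; every byte except the last has the high bit set.
        chunk = [value & 0x7F]
        value >>= 7
        while value:
            chunk.append((value & 0x7F) | 0x80)
            value >>= 7
        content.extend(reversed(chunk))
    if len(content) > 127:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([0x06, len(content)]) + bytes(content)


print(decode_der_integer(b"\x02\x01\x2A"))            # 42
print(encode_der_oid("1.2.840.113549.1.1.1").hex())   # 06092a864886f70d010101
```

The results match the Quick Start: `02 01 2A` is tag INTEGER, length 1, value 42, and the OID encodes to the same eleven bytes that `encoder.finish()` is shown to produce.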

## Features

- **High Performance**: Built in Rust with minimal Python overhead
- **Type Safety**: Full type support with Python type hints
- **Multiple Encodings**: Support for DER, BER, and CER
- **X.509 Certificate Parsing**: Parse DER-encoded X.509 certificates with 19 cached getters;
  `from_der()` is lazy (shallow envelope scan, ~0.16 µs); `full_from_der()` does a complete
  RFC 5280 decode upfront for workloads that cannot tolerate first-access latency
- **X.509 Path Validation** (`synta.x509`): RFC 5280 §6 + CABF Baseline Requirements
  certificate chain verification; multi-name SAN matching (`any`/`all`); configurable
  validation time, chain depth, and compliance profile (`webpki` / `rfc5280`); powered by
  the OpenSSL signature backend
- **PyCA Bridge**: Convert between `synta.Certificate` and `cryptography.x509.Certificate`
  via `cert.to_pyca()` / `Certificate.from_pyca(pyca_cert)` for cryptographic operations
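
The lazy/eager distinction between `from_der()` and `full_from_der()` can be pictured with a small pure-Python analogy. This is not synta's implementation: `LazyRecord`, `eager`, and `fields` are hypothetical names, and `functools.cached_property` merely stands in for a cached getter that decodes on first access and reuses the result afterwards.

```python
from functools import cached_property


class LazyRecord:
    """Hypothetical stand-in for a lazily decoded object (not a synta type)."""

    def __init__(self, raw: bytes):
        self._raw = raw          # construction only stores the input
        self.decode_calls = 0    # counts how often the expensive step runs

    @classmethod
    def eager(cls, raw: bytes) -> "LazyRecord":
        """Build the record and pay the decode cost immediately."""
        record = cls(raw)
        record.fields            # force the decode up front
        return record

    @cached_property
    def fields(self) -> dict:
        """Decoded view; computed once, then served from the instance cache."""
        self.decode_calls += 1
        return {"length": len(self._raw), "first_byte": self._raw[0]}


lazy = LazyRecord(b"\x30\x03\x02\x01\x2a")
print(lazy.decode_calls)   # 0: nothing decoded yet
lazy.fields
lazy.fields
print(lazy.decode_calls)   # 1: decoded on first access, cached afterwards

eager = LazyRecord.eager(b"\x30\x03\x02\x01\x2a")
print(eager.decode_calls)  # 1: decoded during construction
```

The trade-off is the same as the one described above: the lazy form is cheap to construct but pays on first access, while the eager form moves that cost into construction.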

### Supported ASN.1 Types

- **INTEGER**: Arbitrary precision integers
- **OCTET STRING**: Binary data
- **OBJECT IDENTIFIER**: OIDs with dotted notation
- **BIT STRING**: Bit-level data
- **BOOLEAN**: True/false values
- **REAL**: IEEE 754 double-precision floating point (including ±∞ and NaN)
- **NULL**: ASN.1 null value
- **UTF8String**: UTF-8 encoded strings
- **PrintableString**: Printable ASCII subset
- **IA5String**: ASCII strings
- **UTCTime**: UTC time values (1950–2049)
- **GeneralizedTime**: Generalized time values with optional milliseconds
- **Certificate**: X.509 DER certificate — 19 cached getters covering identity, validity,
  signature, public key, and raw DER spans; `from_der()` lazy (shallow scan) /
  `full_from_der()` eager (full RFC 5280 decode); `to_pyca()` / `from_pyca()` bridge to PyCA
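
The 1950–2049 range for UTCTime follows from its two-digit year: RFC 5280 interprets `YY >= 50` as `19YY` and `YY < 50` as `20YY`. A minimal standard-library sketch of that rule, handling only the `YYMMDDHHMMSSZ` form; `parse_utctime` is a hypothetical helper, not a synta function.

```python
from datetime import datetime, timezone


def parse_utctime(text: str) -> datetime:
    """Parse a UTCTime string of the form YYMMDDHHMMSSZ (illustration only)."""
    digits = text[:12]
    if len(text) != 13 or text[12] != "Z" or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected YYMMDDHHMMSSZ")
    yy = int(digits[0:2])
    # RFC 5280 sliding window: 50..99 -> 1950..1999, 00..49 -> 2000..2049.
    year = 1900 + yy if yy >= 50 else 2000 + yy
    return datetime(
        year,
        int(digits[2:4]),
        int(digits[4:6]),
        int(digits[6:8]),
        int(digits[8:10]),
        int(digits[10:12]),
        tzinfo=timezone.utc,
    )


print(parse_utctime("500101000000Z").year)  # 1950
print(parse_utctime("491231235959Z").year)  # 2049
```

Dates outside this window cannot be expressed in UTCTime and need GeneralizedTime, which carries a four-digit year.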

## X.509 Path Validation

```python
import synta
import synta.x509 as x509

with open("root.der", "rb") as f:
    root_der = f.read()
with open("leaf.der", "rb") as f:
    leaf_der = f.read()
with open("intermediate.der", "rb") as f:
    intermediate_der = f.read()

store  = x509.TrustStore([root_der])
policy = x509.VerificationPolicy(server_names=["example.com"])
chain  = x509.verify_server_certificate(leaf_der, [intermediate_der], store, policy)

# chain is list[bytes], root-first
for i, cert_der in enumerate(chain):
    cert = synta.Certificate.from_der(cert_der)
    print(f"chain[{i}]: {cert.subject}")
```

See [synta-x509-verification/README.md](../synta-x509-verification/README.md) for the
complete Rust API guide, and
[docs/python/src/introduction.md](../docs/python/src/introduction.md) for the full Python API reference.

## Documentation

Full documentation is available in [docs/python/src/introduction.md](../docs/python/src/introduction.md), including:
- Complete API reference for all types, Decoder, and Encoder
- Certificate parsing examples
- Error handling
- Build and development instructions

## Testing

Run the test suite:

```bash
python -m pytest tests/python/
```

## Performance

Measured with `python/bench_certificate.py` (release build, CPython 3.14+).

### X.509 Certificate Parsing

| Operation | `synta` lazy (`from_der`) | `synta_full` eager (`full_from_der`) | `cryptography.x509` |
|-----------|---------------------------|--------------------------------------|---------------------|
| Parse-only (traditional, ~900 B) | **0.16 µs** | 0.75–0.76 µs | 1.63–1.72 µs |
| Parse-only (ML-DSA, 4–7.5 KB) | **0.17 µs** | 0.73–0.74 µs | 1.54–1.55 µs |
| Parse + all fields (traditional, cold, 19 fields) | **3.38–3.51 µs** | | 16.64–16.95 µs (9 fields) |
| Parse + all fields (ML-DSA, cold, 19 fields) | **3.63–3.71 µs** | | 13.14–13.41 µs |
| Field access only (warm, 19 fields) | **0.44–0.45 µs** | | ~1.00 µs (9 fields) |

- **Parse-only (`from_der`):** synta is **~10× faster** than `cryptography.x509` for traditional
  certs (~9× for ML-DSA) — both do a lazy envelope scan; synta's is 4 ops (outer SEQUENCE
  tag+length, TBSCertificate tag+length), flat across cert sizes; `Py<PyBytes>` hold avoids
  copying input
- **Parse-only (`full_from_der`):** complete RFC 5280 decode upfront at 0.75–0.76 µs —
  **~2.2× faster** than `cryptography.x509`; comparable to `rust_typed` (0.50 µs) with ~0.25 µs
  PyO3 overhead (GIL + `Py<PyBytes>` + struct construction)
- **Parse + all fields (cold):** synta is **~4.7× faster** for traditional certs despite accessing
  19 fields vs `cryptography`'s 9 — all synta field caches are empty on first access; cost is
  parse (0.16 µs) + full RFC 5280 decode of all fields via lazy `OnceLock<Py<T>>` promotion
  (~3.2 µs). `cryptography`'s cold cost (16.6–17.0 µs for 9 fields) reflects eager Python-object
  allocation for every field on first parse, including `public_key()` SPKI decode via OpenSSL
- **Field access (warm):** synta is **~2.2× faster** — 0.44–0.45 µs (19 fields, ~23 ns/field)
  vs ~1.00 µs (9 fields, ~111 ns/field). synta's warm path is `OnceLock` atomic load +
  `clone_ref`; `cryptography`'s warm path reflects Python property dispatch + PyO3 boundary
  with per-field object construction (most fields do not use `PyOnceLock` internally)
- **ML-DSA:** `from_der` parse-only is flat across cert sizes (7 KB parses as fast as 4 KB);
  parse+fields adds ~0.3 µs for first-call `PyBytes` copies of large signature/key data;
  in `cryptography`, ML-DSA `public_key()` requires version ≥ 44.0 (FIPS 204 / OpenSSL ≥ 3.5)
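
The four-operation envelope scan described for `from_der` can be sketched in pure Python: read the outer SEQUENCE tag and length, then the TBSCertificate tag and length, and stop. This is an illustration of the idea under the assumption of definite-length DER, not synta's Rust code; `read_header` and `shallow_scan` are hypothetical names, and the sample bytes are a toy structure rather than a real certificate.

```python
from typing import Tuple


def read_header(data: bytes, offset: int) -> Tuple[int, int, int]:
    """Return (tag, content_length, content_offset) for the TLV at `offset`."""
    tag = data[offset]
    first = data[offset + 1]
    if first < 0x80:                       # short form: the byte is the length
        return tag, first, offset + 2
    count = first & 0x7F                   # long form: next `count` bytes hold the length
    if count == 0:
        raise ValueError("indefinite length is not valid DER")
    length = int.from_bytes(data[offset + 2:offset + 2 + count], "big")
    return tag, length, offset + 2 + count


def shallow_scan(der: bytes) -> Tuple[int, int]:
    """Return (start, end) of the first inner SEQUENCE without decoding its contents."""
    tag, length, body = read_header(der, 0)            # ops 1-2: outer tag + length
    if tag != 0x30 or body + length != len(der):
        raise ValueError("outer SEQUENCE does not span the input")
    tbs_tag, tbs_len, tbs_body = read_header(der, body)  # ops 3-4: inner tag + length
    if tbs_tag != 0x30 or tbs_body + tbs_len > len(der):
        raise ValueError("inner SEQUENCE is malformed")
    return body, tbs_body + tbs_len


# Toy input: SEQUENCE { SEQUENCE { INTEGER 42 }, NULL }
toy = bytes.fromhex("3007" "3003" "02012a" "0500")
start, end = shallow_scan(toy)
print(toy[start:end].hex())  # 300302012a
```

Only headers are inspected, so the work does not grow with the size of the payload, which is consistent with the flat parse-only timings across certificate sizes.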

### PKCS#7 and PKCS#12 Certificate Extraction

Measured with `python/bench_pkcs.py` (release build, CPython 3.14+).

| Operation | `synta` | `cryptography` | Speedup |
|-----------|---------|----------------|---------|
| PKCS#7 DER (amazon roots, 2 certs, 1,848 B) | **1.55 µs** | 48.3 µs | ~31× |
| PKCS#7 PEM (ISRG, 1 cert, 1,992 B) | **4.47 µs** | 37.4 µs | ~8× |
| PKCS#12 unencrypted (3 certs, 3,539 B) | **2.11 µs** | 159.7 µs | ~76× |
| PKCS#12 unencrypted (1 cert + key, 756 B) | **1.06 µs** | | |

- **PKCS#7 DER:** synta walks the SignedData SEQUENCE and collects raw DER certificate spans
  in a single pass — no intermediate allocation per certificate. `cryptography` constructs a
  `pkcs7.PKCS7` wrapper plus a list of `x509.Certificate` objects (~48 µs for 2 certs)
- **PKCS#7 PEM:** the PEM decode step adds ~3 µs of base-64 decoding before the DER parse;
  the ratio drops to 8× because both implementations share the PEM decode cost
- **PKCS#12:** synta finds all certificate bags in a single forward pass through the PFX
  ContentInfo tree with no MAC verification or key decryption for certificate-only extraction.
  `cryptography` calls OpenSSL `PKCS12_parse()` which verifies the MAC and decrypts the full
  archive even when only certificates are requested (~160 µs for 3 certs)
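
The extra step in the PEM case is base-64 decoding of the armored body before any DER parsing happens. A minimal standard-library sketch of that step; `pem_to_der` is a hypothetical helper, not the synta API, and it handles a single PEM block only.

```python
import base64


def pem_to_der(pem: str) -> bytes:
    """Strip the BEGIN/END armor of one PEM block and base-64 decode the body."""
    lines = [line.strip() for line in pem.strip().splitlines()]
    if (
        len(lines) < 3
        or not lines[0].startswith("-----BEGIN ")
        or not lines[-1].startswith("-----END ")
    ):
        raise ValueError("not a single PEM block")
    # validate=True rejects characters outside the base-64 alphabet.
    return base64.b64decode("".join(lines[1:-1]), validate=True)


# Round trip with toy DER bytes (SEQUENCE { INTEGER 42 }), not a real PKCS#7 object.
der = bytes.fromhex("300302012a")
pem = (
    "-----BEGIN PKCS7-----\n"
    + base64.b64encode(der).decode("ascii")
    + "\n-----END PKCS7-----\n"
)
print(pem_to_der(pem) == der)  # True
```

Once the armor is removed the input is ordinary DER, so everything after this step is the same work as in the DER row of the table.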

For full benchmark methodology and Rust-layer comparison see [docs/performance.md](../docs/performance.md).
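
The Speedup column and the ratios quoted in this section are plain quotients of the tabulated timings. A quick check in Python, using only numbers copied from the tables above; `speedup` is a helper defined here for illustration, and where the tables give a range the endpoint or midpoint used is noted in the comments.

```python
def speedup(baseline_us: float, synta_us: float) -> float:
    """How many times faster synta is than the baseline, given times in µs."""
    return baseline_us / synta_us


# PKCS#7 / PKCS#12 rows: (synta, cryptography) in microseconds.
pkcs_rows = {
    "PKCS#7 DER": (1.55, 48.3),
    "PKCS#7 PEM": (4.47, 37.4),
    "PKCS#12 unencrypted (3 certs)": (2.11, 159.7),
}
for name, (synta_us, crypto_us) in pkcs_rows.items():
    print(f"{name}: ~{speedup(crypto_us, synta_us):.1f}x")

# X.509 parse-only, traditional cert: low end of the cryptography range vs 0.16 µs.
print(f"parse-only (from_der): ~{speedup(1.63, 0.16):.1f}x")

# Warm field access, per field: midpoint of 0.44-0.45 µs over 19 fields vs ~1.00 µs over 9.
print(f"synta warm: ~{0.445 / 19 * 1000:.0f} ns/field")
print(f"cryptography warm: ~{1.00 / 9 * 1000:.0f} ns/field")
```

The printed values round to the figures quoted in the tables and bullet points (about 31×, 8×, 76×, 10×, 23 ns and 111 ns).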

## License

Synta is dual-licensed under MIT OR Apache-2.0.

## Contributing

Contributions are welcome! Please see the main Synta repository for contribution guidelines.