synta 0.2.3

ASN.1 parser, decoder, and encoder library with DER/BER support and C FFI
# synta.schema — ASN.1 Structure Definitions in Python

`synta.schema` lets you define ASN.1 SEQUENCE, SEQUENCE OF, and CHOICE types in
Python using class decorators and standard type annotations.  Decorated classes
automatically gain `to_der()` and `from_der()` methods, the Python equivalent of
Rust's `#[derive(Asn1Sequence)]` and `#[derive(Asn1Choice)]`.

```python
from synta.schema import asn1_sequence, asn1_sequence_of, asn1_choice, asn1_field
```

The module is implemented in pure Python on top of the `synta.Decoder` and
`synta.Encoder` primitives.  It requires no extra compilation step and is
available wherever the `synta` package is installed.


## Overview

In the Rust API you write an ASN.1 schema file, run `synta-codegen`, and the
generated Rust types derive encode/decode automatically.  In the Python bindings
the same effect is achieved with three decorators and one helper function:

| Decorator / helper | Purpose |
|---|---|
| `@asn1_sequence` | Add DER SEQUENCE encode/decode to a `@dataclass` |
| `@asn1_sequence_of` | Add DER SEQUENCE OF encode/decode to a `@dataclass` with a single `list[T]` field |
| `@asn1_choice` | Add DER CHOICE encode/decode to a `@dataclass` |
| `asn1_field(...)` | Attach an explicit or implicit context tag to a field |

The decorated class retains all standard dataclass behaviour — `__repr__`,
`__eq__`, `__init__`, and `dataclasses.fields()` all continue to work as
expected.


## `@asn1_sequence`

Applied **after** `@dataclass`.  Adds:

- `instance.to_der() -> bytes` — DER-encode the instance as an ASN.1 SEQUENCE.
- `ClassName.from_der(data: bytes) -> ClassName` — decode a DER-encoded SEQUENCE
  into a new instance.

Fields are encoded and decoded in annotation declaration order.

```python
from synta.schema import asn1_sequence
from dataclasses import dataclass
import synta

@asn1_sequence
@dataclass
class Validity:
    not_before: synta.UtcTime
    not_after:  synta.UtcTime

v = Validity(
    not_before=synta.UtcTime(2024, 1, 1, 0, 0, 0),
    not_after=synta.UtcTime(2025, 1, 1, 0, 0, 0),
)
der = v.to_der()          # bytes
v2  = Validity.from_der(der)
print(v2.not_before.year)  # 2024
print(v == v2)             # True  — dataclass __eq__
```

Applying `@asn1_sequence` before `@dataclass` raises `TypeError`.
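
Python applies stacked decorators bottom-up, so "after `@dataclass`" means the
schema decorator is written on the line above it.  The sketch below shows one way
such an ordering check can be written with the standard library alone.
`require_dataclass` is a hypothetical stand-in used for illustration; it is not
synta's actual implementation.

```python
from dataclasses import dataclass, is_dataclass

def require_dataclass(cls):
    """Hypothetical stand-in for a schema decorator's ordering check."""
    if not is_dataclass(cls):
        raise TypeError(f"{cls.__name__} must be decorated with @dataclass first")
    return cls

@require_dataclass   # applied second: receives a finished dataclass
@dataclass           # applied first
class Good:
    x: int

try:
    @dataclass           # applied second (never reached)
    @require_dataclass   # applied first: the class is not a dataclass yet
    class Bad:
        x: int
except TypeError as exc:
    order_error = str(exc)

print(order_error)
```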


## `@asn1_sequence_of`

Applied **after** `@dataclass`.  Use this for *naked* `SEQUENCE OF T` types — a
homogeneous list wrapped in a single outer SEQUENCE with no named intermediate
fields — as opposed to `@asn1_sequence` which handles the heterogeneous
`SEQUENCE { field1 T1, field2 T2, ... }` case.

The class must have exactly one field annotated as `list[T]` (not `Optional`).
The element type `T` is inferred from that annotation.

Adds:

- `instance.to_der() -> bytes` — DER-encode the instance as an ASN.1 `SEQUENCE OF T`.
- `ClassName.from_der(data: bytes) -> ClassName` — decode a DER-encoded
  `SEQUENCE OF T` into a new instance.

```python
from synta.schema import asn1_sequence_of
from dataclasses import dataclass
import synta

@asn1_sequence_of
@dataclass
class JWTClaimNames:
    names: list[synta.IA5String]

names = JWTClaimNames([synta.IA5String("div"), synta.IA5String("opt")])
der    = names.to_der()
names2 = JWTClaimNames.from_der(der)
assert [n.as_str() for n in names2.names] == ["div", "opt"]
```

`@asn1_sequence_of` types may be used as field types inside `@asn1_sequence`
classes, enabling nested `SEQUENCE OF` structures:

```python
@asn1_sequence_of
@dataclass
class Names:
    items: list[synta.Utf8String]

@asn1_sequence
@dataclass
class Outer:
    name_list: Names
    serial:    synta.Integer
```

Applying `@asn1_sequence_of` before `@dataclass` raises `TypeError`.
A class that does not have exactly one non-optional `list[T]` field also raises `TypeError`.
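
To illustrate how an element type can be inferred from a `list[T]` annotation,
here is a minimal standard-library sketch.  The helper name `infer_element_type`
and the error message are assumptions made for this example and do not describe
synta's internals.

```python
from dataclasses import dataclass, fields
from typing import get_args, get_origin, get_type_hints

def infer_element_type(cls):
    """Return T for a dataclass with exactly one list[T] field (hypothetical helper)."""
    hints = get_type_hints(cls)
    list_hints = [
        hints[f.name] for f in fields(cls)
        if get_origin(hints[f.name]) is list and get_args(hints[f.name])
    ]
    if len(list_hints) != 1:
        raise TypeError(f"{cls.__name__} needs exactly one list[T] field")
    return get_args(list_hints[0])[0]

@dataclass
class Ports:
    values: list[int]

@dataclass
class NoList:
    value: int

element_type = infer_element_type(Ports)
print(element_type)  # <class 'int'>
```

Calling `infer_element_type(NoList)` raises `TypeError`, mirroring the rule above.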


## `@asn1_choice`

Applied **after** `@dataclass`.  Adds the same `to_der()` / `from_der()` pair.

All fields must be typed as `SomeType | None` (or `Optional[SomeType]`) with
`default=None`.  Exactly one field must be non-`None` when encoding.  Dispatch
during decoding is performed by matching the ASN.1 tag of the incoming element
against the registered types.

```python
from synta.schema import asn1_choice
from dataclasses import dataclass
import synta

@asn1_choice
@dataclass
class Time:
    utc_time:         synta.UtcTime         | None = None
    generalized_time: synta.GeneralizedTime | None = None

t = Time(utc_time=synta.UtcTime(2024, 6, 1, 12, 0, 0))
t2 = Time.from_der(t.to_der())
assert t2.utc_time is not None
assert t2.generalized_time is None
```

Encoding a `@asn1_choice` instance where all fields are `None` raises
`ValueError`.
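
The two rules above (exactly one alternative set when encoding, dispatch on the
incoming tag when decoding) can be sketched in plain Python.  The function names
and the lookup table are hypothetical; only the universal tag numbers for UTCTime
(23) and GeneralizedTime (24) come from ASN.1 itself.  Raising `ValueError` when
more than one alternative is set is an assumption of this sketch.

```python
# Identifier octets for the two alternatives of the Time CHOICE above.
TAG_TO_FIELD = {
    0x17: "utc_time",          # UTCTime, universal tag 23
    0x18: "generalized_time",  # GeneralizedTime, universal tag 24
}

def choice_field_for(der: bytes) -> str:
    """Pick the CHOICE alternative from the first (identifier) octet."""
    if not der:
        raise ValueError("empty input")
    if der[0] not in TAG_TO_FIELD:
        raise ValueError(f"no CHOICE alternative for tag 0x{der[0]:02x}")
    return TAG_TO_FIELD[der[0]]

def selected_field(values: dict) -> str:
    """Return the name of the single non-None alternative."""
    chosen = [name for name, value in values.items() if value is not None]
    if len(chosen) != 1:
        raise ValueError(f"expected exactly one alternative, got {len(chosen)}")
    return chosen[0]

# DER for UTCTime "240601120000Z": tag 0x17, length 13, then the ASCII text.
utc_der = bytes([0x17, 0x0D]) + b"240601120000Z"

print(choice_field_for(utc_der))                                    # utc_time
print(selected_field({"utc_time": "x", "generalized_time": None}))  # utc_time
```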


## `asn1_field()`

Returns a `dataclasses.field()` with embedded ASN.1 tagging metadata.  Use it
as the default value for a field that carries an explicit or implicit context
tag.

```python
asn1_field(
    *,
    tag: int | None = None,
    explicit: bool = True,
    implicit: bool = False,
    default: Any = None,
) -> Any
```

| Parameter | Description |
|---|---|
| `tag` | Context tag number (e.g. `tag=0` for `[0]`).  `None` means no tagging. |
| `explicit` | EXPLICIT wrapping — the context tag is added around the original TLV.  Default `True`. |
| `implicit` | IMPLICIT tagging — the context tag replaces the original type tag.  Mutually exclusive with `explicit=True`. |
| `default` | Default value for the dataclass field.  Defaults to `None` so `Optional[T]` fields work without a separate `= None`. |

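The `explicit` and `implicit` parameters control tagging when a `tag` is given.
Because `explicit` defaults to `True`, passing `implicit=True` on its own is
enough: it sets `explicit=False` internally.  Do not pass `explicit=True` and
`implicit=True` together.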

```python
from synta.schema import asn1_sequence, asn1_field
from dataclasses import dataclass
from typing import Optional
import synta

@asn1_sequence
@dataclass
class TBSCertificate:
    serial:    synta.Integer
    # [0] EXPLICIT version, absent when None
    version:   Optional[synta.Integer]     = asn1_field(tag=0, explicit=True)
    # [2] IMPLICIT key identifier, absent when None
    key_id:    Optional[synta.OctetString] = asn1_field(tag=2, implicit=True)
```
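
To make the difference between the two tagging modes concrete, the following
sketch builds the DER bytes by hand, without using synta.  The helper names are
hypothetical, and the code assumes tag numbers below 31 and contents shorter
than 128 bytes.

```python
def der_tlv(tag: int, content: bytes) -> bytes:
    """Encode one TLV with a single-octet tag and short-form length (< 128 bytes)."""
    if len(content) >= 0x80:
        raise ValueError("sketch only handles short-form lengths")
    return bytes([tag, len(content)]) + content

def explicit_tag(number: int, inner_tlv: bytes) -> bytes:
    """[n] EXPLICIT: wrap the whole inner TLV in a constructed context tag."""
    return der_tlv(0xA0 | number, inner_tlv)

def implicit_tag(number: int, inner_tlv: bytes) -> bytes:
    """[n] IMPLICIT: keep length and content, replace only the identifier octet."""
    constructed_bit = inner_tlv[0] & 0x20
    return bytes([0x80 | constructed_bit | number]) + inner_tlv[1:]

version = der_tlv(0x02, b"\x02")      # INTEGER 2          -> 02 01 02
key_id = der_tlv(0x04, b"\xab\xcd")   # OCTET STRING abcd  -> 04 02 ab cd

print(explicit_tag(0, version).hex())  # a003020102
print(implicit_tag(2, key_id).hex())   # 8202abcd
```

The explicit form is two bytes longer because the original INTEGER header is
kept inside the `[0]` wrapper, while the implicit form loses the original type
tag, which is why the decoder needs the field's declared type to interpret it.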


## Field type support

### Primitive synta types

All synta primitive types are supported as field types:

| Type | ASN.1 tag |
|---|---|
| `synta.Integer` | 2 — INTEGER |
| `synta.BitString` | 3 — BIT STRING |
| `synta.OctetString` | 4 — OCTET STRING |
| `synta.Null` | 5 — NULL |
| `synta.ObjectIdentifier` | 6 — OBJECT IDENTIFIER |
| `synta.Real` | 9 — REAL |
| `synta.Utf8String` | 12 — UTF8String |
| `synta.NumericString` | 18 — NumericString |
| `synta.PrintableString` | 19 — PrintableString |
| `synta.TeletexString` | 20 — TeletexString |
| `synta.IA5String` | 22 — IA5String |
| `synta.UtcTime` | 23 — UTCTime |
| `synta.GeneralizedTime` | 24 — GeneralizedTime |
| `synta.VisibleString` | 26 — VisibleString |
| `synta.GeneralString` | 27 — GeneralString |
| `synta.UniversalString` | 28 — UniversalString |
| `synta.BmpString` | 30 — BMPString |

### Optional fields

Annotate a field as `Optional[T]` or `T | None`.  The field is omitted from the
encoded SEQUENCE when its value is `None`.  During decoding the tag of the next
element is peeked; if it does not match the expected tag the field is set to
`None` and parsing continues with the next field.

```python
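from dataclasses import dataclass
from typing import Optional
import synta
from synta.schema import asn1_sequence

@asn1_sequence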
@dataclass
class Example:
    required: synta.Integer
    optional: Optional[synta.OctetString] = None
```
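
The peek-and-skip behaviour described above can be sketched with a small
hand-written decoder.  This is an illustration in plain Python rather than
synta's code: `read_tlv`, `decode_fields`, and the three-field layout are all
hypothetical, and only short-form lengths are handled.

```python
def read_tlv(data: bytes, offset: int):
    """Read one TLV with a short-form length; return (tag, content, next_offset)."""
    tag, length = data[offset], data[offset + 1]
    start = offset + 2
    return tag, data[start:start + length], start + length

def decode_fields(body: bytes, layout):
    """Decode SEQUENCE contents; layout holds (name, expected_tag, is_optional)."""
    result, offset = {}, 0
    for name, expected_tag, is_optional in layout:
        # Peek at the next identifier octet without consuming it.
        next_tag = body[offset] if offset < len(body) else None
        if next_tag == expected_tag:
            _, content, offset = read_tlv(body, offset)
            result[name] = content
        elif is_optional:
            result[name] = None  # absent: the element is left for the next field
        else:
            raise ValueError(f"{name}: expected tag 0x{expected_tag:02x}")
    return result

# Hypothetical layout: INTEGER, optional OCTET STRING, UTF8String.
LAYOUT = [("serial", 0x02, False), ("key_id", 0x04, True), ("label", 0x0C, False)]

with_key = bytes.fromhex("020105" "0401ff" "0c026869")
without_key = bytes.fromhex("020105" "0c026869")

print(decode_fields(with_key, LAYOUT))
print(decode_fields(without_key, LAYOUT))
```

In the second call the UTF8String tag does not match the OCTET STRING expected
for `key_id`, so `key_id` becomes `None` and the same element is then decoded
as `label`.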

### SEQUENCE OF (list fields)

Annotate a field as `List[T]`.  The field is encoded as a nested SEQUENCE
wrapping zero or more elements of type `T`.  On decoding the nested SEQUENCE
is consumed and all elements are collected into a Python list.

```python
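from dataclasses import dataclass
from typing import List
import synta
from synta.schema import asn1_sequence

@asn1_sequence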
@dataclass
class SubjectAltNames:
    names: List[synta.IA5String]
```

When a list field is declared without `asn1_field()`, give it an empty-list
default with `dataclasses.field(default_factory=list)`.  The `default_factory`
must be set explicitly:

```python
import dataclasses

@asn1_sequence
@dataclass
class Names:
    entries: List[synta.Utf8String] = dataclasses.field(default_factory=list)
```
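
For reference, the wire format of a `SEQUENCE OF IA5String` can be reproduced by
hand.  The sketch below uses only the standard library; the function names are
hypothetical and only short-form lengths (under 128 bytes) are handled.

```python
def der_tlv(tag: int, content: bytes) -> bytes:
    """Encode one TLV with a single-octet tag and short-form length (< 128 bytes)."""
    if len(content) >= 0x80:
        raise ValueError("sketch only handles short-form lengths")
    return bytes([tag, len(content)]) + content

def encode_ia5_list(strings: list[str]) -> bytes:
    """SEQUENCE (0x30) wrapping zero or more IA5String (0x16) elements."""
    return der_tlv(0x30, b"".join(der_tlv(0x16, s.encode("ascii")) for s in strings))

def decode_ia5_list(der: bytes) -> list[str]:
    """Consume the outer SEQUENCE and collect every element into a list."""
    if der[0] != 0x30:
        raise ValueError("expected SEQUENCE")
    body, offset, items = der[2:2 + der[1]], 0, []
    while offset < len(body):
        if body[offset] != 0x16:
            raise ValueError("expected IA5String")
        length = body[offset + 1]
        items.append(body[offset + 2:offset + 2 + length].decode("ascii"))
        offset += 2 + length
    return items

encoded = encode_ia5_list(["a.example", "b.example"])
print(encoded.hex())
print(decode_ia5_list(encoded))
print(encode_ia5_list([]).hex())  # 3000
```

An empty list still produces the outer SEQUENCE header with a zero length, which
is how "zero or more elements" appears on the wire.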

### Nested `@asn1_sequence` and `@asn1_choice` types

Use another decorated class as a field type directly:

```python
@asn1_sequence
@dataclass
class TBSCertificate:
    serial:   synta.Integer
    validity: Validity     # @asn1_sequence
    time:     Time         # @asn1_choice
```

The nested type's `to_der()` is called during encoding.  During decoding the
SEQUENCE or CHOICE tag is matched and the nested type's `from_der`-equivalent
decoder is called recursively.


## Error handling

| Situation | Exception |
|---|---|
| Required field is `None` at encode time | `ValueError` |
| `@asn1_choice` instance has all fields `None` | `ValueError` |
| Decorator applied before `@dataclass` | `TypeError` |
| Unknown field type (no registered decoder) | `TypeError` |
| Implicit tagging used for a type not in the tag table | `TypeError` |
| `List[T]` without a type parameter | `TypeError` |
| `Union` with more than one non-`None` arm | `TypeError` |
| DER parse error from the underlying `synta.Decoder` | `synta.SyntaError` / `ValueError` |
| Error raised while processing a nested field | Same exception type, with the field name prefixed to the message |
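
The last row describes a common pattern: re-raising the same exception type with
the field name added to the message.  A minimal sketch of that pattern follows.
The helper names, the toy decoder, and the exact `"name: message"` format are
assumptions for illustration and may differ from what synta produces.

```python
def decode_field(field_name: str, decoder, data: bytes):
    """Run decoder(data); on failure re-raise the same exception type, prefixed."""
    try:
        return decoder(data)
    except (TypeError, ValueError) as exc:
        raise type(exc)(f"{field_name}: {exc}") from exc

def decode_small_int(data: bytes) -> int:
    """Toy decoder: a one-byte INTEGER with tag 0x02 and short-form length."""
    if len(data) != 3 or data[0] != 0x02 or data[1] != 1:
        raise ValueError("expected a one-byte INTEGER")
    return int.from_bytes(data[2:], "big", signed=True)

try:
    decode_field("serial", decode_small_int, b"\x04\x01\x05")
except ValueError as exc:
    message = str(exc)

print(message)  # serial: expected a one-byte INTEGER
```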


## Dataclass behaviour is preserved

`@asn1_sequence` and `@asn1_choice` do not replace `@dataclass` — they only
add the three ASN.1 methods (`to_der`, `from_der`, `_asn1_from_decoder`) to
the class.  All standard dataclass features remain:

```python
v = Validity(
    not_before=synta.UtcTime(2024, 1, 1, 0, 0, 0),
    not_after=synta.UtcTime(2025, 1, 1, 0, 0, 0),
)
print(v)         # Validity(not_before=..., not_after=...)  — __repr__
assert v == v    # True — __eq__
import dataclasses
print(dataclasses.asdict(v))   # works if field types support it
```


## Complete example

```python
from synta.schema import asn1_sequence, asn1_choice, asn1_field
from dataclasses import dataclass, field
from typing import Optional, List
import synta

# CHOICE type — exactly one of the two fields is set
@asn1_choice
@dataclass
class Time:
    utc_time:         synta.UtcTime         | None = None
    generalized_time: synta.GeneralizedTime | None = None

# Simple SEQUENCE
@asn1_sequence
@dataclass
class Validity:
    not_before: Time
    not_after:  Time

# SEQUENCE with optional and tagged fields, a nested type, and a SEQUENCE OF
@asn1_sequence
@dataclass
class TBSCertificate:
    serial:     synta.Integer
    validity:   Validity
    dns_names:  List[synta.IA5String]           = field(default_factory=list)
    version:    Optional[synta.Integer]          = asn1_field(tag=0, explicit=True)
    key_id:     Optional[synta.OctetString]      = asn1_field(tag=2, implicit=True)

# Build an instance
tbs = TBSCertificate(
    serial=synta.Integer(12345),
    validity=Validity(
        not_before=Time(utc_time=synta.UtcTime(2024, 1, 1, 0, 0, 0)),
        not_after=Time(utc_time=synta.UtcTime(2025, 1, 1, 0, 0, 0)),
    ),
    dns_names=[synta.IA5String("example.com"), synta.IA5String("www.example.com")],
    version=synta.Integer(2),
)

# Encode and decode
der = tbs.to_der()
tbs2 = TBSCertificate.from_der(der)
assert tbs2.serial.to_int() == 12345
assert tbs2.version.to_int() == 2
assert tbs2.key_id is None
assert len(tbs2.dns_names) == 2
```


## Current limitations (Phase 1)

- **DER only** — `synta.Encoding.DER` is used for all encoding and decoding.
  BER and CER are not supported.
- **SEQUENCE and SEQUENCE OF only** — `SET` and `SET OF` are not generated.
- **No per-variant CHOICE tags** — CHOICE variants with their own explicit or
  implicit context tags are not supported.  All CHOICE dispatch is performed by
  matching the universal ASN.1 tag of each field type.
- **No DEFAULT omission** — optional fields are always encoded when non-`None`.
  A field with a DEFAULT value in the ASN.1 schema will still be written into
  the encoding even when it holds the default value.
- **No constraint enforcement** — size constraints, value ranges, and permitted
  alphabet constraints from ASN.1 schemas are not enforced at encode or decode
  time.
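
Because constraints are not enforced, callers who need them can check values by
hand before calling `to_der()` or after `from_der()`.  The sketch below shows
the idea with a plain dataclass and ordinary Python values standing in for synta
types; `check_size`, `check_range`, and `validate` are hypothetical names, not
part of `synta.schema`.

```python
from dataclasses import dataclass

def check_size(name: str, value, minimum: int, maximum: int) -> None:
    """Enforce an ASN.1-style SIZE (minimum..maximum) constraint by hand."""
    if not minimum <= len(value) <= maximum:
        raise ValueError(f"{name}: size {len(value)} not in {minimum}..{maximum}")

def check_range(name: str, value: int, minimum: int, maximum: int) -> None:
    """Enforce an ASN.1-style value range (minimum..maximum) by hand."""
    if not minimum <= value <= maximum:
        raise ValueError(f"{name}: value {value} not in {minimum}..{maximum}")

@dataclass
class Label:
    text: str     # stands in for UTF8String (SIZE (1..8))
    version: int  # stands in for INTEGER (0..2)

    def validate(self) -> None:
        check_size("text", self.text, 1, 8)
        check_range("version", self.version, 0, 2)

Label("ok", 1).validate()  # passes silently

try:
    Label("ok", 3).validate()
except ValueError as exc:
    range_error = str(exc)

print(range_error)  # version: value 3 not in 0..2
```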


## Relation to Rust codegen

The Rust `synta-codegen` tool generates code from `.asn1` schema files and
compiles it into the native extension.  `synta.schema` offers the same
encode/decode convenience for use cases where:

- The schema is small, one-off, or highly dynamic.
- No Rust compilation step is available (e.g. in scripts or notebooks).
- The performance of a native Rust type is not required.

For production-grade schemas covering established protocols (X.509, Kerberos,
CMS, etc.) the pre-built types in `synta-certificate`, `synta-krb5`, and
`synta-mtc` are recommended.