# Decoder
## Constructor
```python
Decoder(data: bytes, encoding: Encoding)
```
Creates a streaming decoder over `data`. The internal position starts at 0.
Each `decode_*` call advances the position past the decoded element.
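
To make the "advance the position past the decoded element" behaviour concrete, here is a minimal pure-Python sketch of a TLV cursor. It is not synta's implementation: `read_tlv` is a hypothetical helper, and it assumes single-byte tags and definite lengths only.

```python
def read_tlv(data: bytes, pos: int = 0) -> tuple[int, bytes, int]:
    """Return (tag_byte, value, next_pos) for the TLV that starts at pos.

    Hypothetical helper: single-byte tags and definite lengths only.
    """
    if pos + 2 > len(data):
        raise EOFError("truncated TLV header")
    tag = data[pos]
    first = data[pos + 1]
    pos += 2
    if first < 0x80:
        length = first                      # short form: length fits in 7 bits
    else:
        n = first & 0x7F                    # long form: next n bytes hold the length
        if n == 0:
            raise ValueError("indefinite length is outside this sketch")
        if pos + n > len(data):
            raise EOFError("truncated length")
        length = int.from_bytes(data[pos:pos + n], "big")
        pos += n
    end = pos + length
    if end > len(data):
        raise EOFError("truncated value")
    return tag, data[pos:end], end


data = b"\x02\x01\x2a\x01\x01\xff"          # INTEGER 42 followed by BOOLEAN TRUE
tag1, value1, pos = read_tlv(data, 0)       # the position moves past the INTEGER
tag2, value2, pos = read_tlv(data, pos)     # then past the BOOLEAN
```

After the second call `pos` equals `len(data)`, which is the condition `is_empty()` is documented to report.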
## Primitive decode methods
| Method | Returns | ASN.1 type | Tag |
|---|---|---|---|
| `decode_integer()` | `Integer` | INTEGER | `0x02` |
| `decode_octet_string()` | `OctetString` | OCTET STRING | `0x04` |
| `decode_oid()` | `ObjectIdentifier` | OBJECT IDENTIFIER | `0x06` |
| `decode_bit_string()` | `BitString` | BIT STRING | `0x03` |
| `decode_boolean()` | `Boolean` | BOOLEAN | `0x01` |
| `decode_utc_time()` | `UtcTime` | UTCTime | `0x17` |
| `decode_generalized_time()` | `GeneralizedTime` | GeneralizedTime | `0x18` |
| `decode_null()` | `Null` | NULL | `0x05` |
| `decode_real()` | `Real` | REAL | `0x09` |
| `decode_utf8_string()` | `Utf8String` | UTF8String | `0x0c` |
| `decode_printable_string()` | `PrintableString` | PrintableString | `0x13` |
| `decode_ia5_string()` | `IA5String` | IA5String | `0x16` |
| `decode_numeric_string()` | `NumericString` | NumericString | `0x12` |
| `decode_teletex_string()` | `TeletexString` | TeletexString / T61String | `0x14` |
| `decode_visible_string()` | `VisibleString` | VisibleString | `0x1a` |
| `decode_general_string()` | `GeneralString` | GeneralString | `0x1b` |
| `decode_universal_string()` | `UniversalString` | UniversalString | `0x1c` |
| `decode_bmp_string()` | `BmpString` | BMPString | `0x1e` |
| `decode_any()` | any Python object | any element | — |
| `decode_any_str()` | `str` | any string type | — |
### `decode_any()` dispatch table
`decode_any()` dispatches on the tag at the current position:

| ASN.1 type | Python return type |
|---|---|
| BOOLEAN | `Boolean` |
| INTEGER | `Integer` |
| BIT STRING | `BitString` |
| OCTET STRING | `OctetString` |
| NULL | `Null` |
| OBJECT IDENTIFIER | `ObjectIdentifier` |
| UTF8String | `Utf8String` |
| PrintableString | `PrintableString` |
| IA5String | `IA5String` |
| NumericString | `NumericString` |
| TeletexString | `TeletexString` |
| VisibleString | `VisibleString` |
| GeneralString | `GeneralString` |
| UniversalString | `UniversalString` |
| BMPString | `BmpString` |
| UTCTime | `UtcTime` |
| GeneralizedTime | `GeneralizedTime` |
| SEQUENCE / SET | `list` of the above |
| Tagged | `TaggedElement` |
| Unknown universal | `RawElement` |
### `decode_any_str()` encoding table
`decode_any_str()` reads one TLV and decodes it as a native Python `str`,
applying the correct encoding for each of the nine ASN.1 string types:

| Tag number | ASN.1 type | Decoded as |
|---|---|---|
| 12 | UTF8String | UTF-8 (lossy) |
| 18 | NumericString | UTF-8 |
| 19 | PrintableString | UTF-8 |
| 20 | TeletexString / T61String | Latin-1 (each byte → U+0000–U+00FF) |
| 22 | IA5String | UTF-8 |
| 26 | VisibleString | UTF-8 |
| 27 | GeneralString | UTF-8 |
| 28 | UniversalString | UCS-4 big-endian |
| 30 | BMPString | UCS-2 big-endian |

Raises `ValueError` for any other tag; raises `EOFError` if the decoder is
empty. This is the single-call replacement for the duck-typing probe on
`decode_any()`:
```python
# Before: three-way probe on the decode_any() result
val = decoder.decode_any()
if hasattr(val, 'as_str'):
    s = val.as_str()
elif hasattr(val, 'to_bytes'):
    s = val.to_bytes().decode('latin-1')
else:
    raise ValueError(f"not a string: {type(val)}")

# After: one call, correct encoding for all nine types
s = decoder.decode_any_str()
```
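
The encoding table can be approximated with Python's built-in codecs. The sketch below is illustrative only and does not use synta: `STRING_CODECS` and `decode_string_value` are hypothetical names. It also makes three assumptions: "lossy" means `errors="replace"`, the other UTF-8 rows are decoded strictly, and `utf-16-be` (a superset of UCS-2) stands in for UCS-2.

```python
# Universal tag number -> (Python codec, error handler), mirroring the table above.
# The error handlers are assumptions; the table only marks UTF8String as lossy.
STRING_CODECS = {
    12: ("utf-8", "replace"),     # UTF8String (lossy)
    18: ("utf-8", "strict"),      # NumericString
    19: ("utf-8", "strict"),      # PrintableString
    20: ("latin-1", "strict"),    # TeletexString / T61String
    22: ("utf-8", "strict"),      # IA5String
    26: ("utf-8", "strict"),      # VisibleString
    27: ("utf-8", "strict"),      # GeneralString
    28: ("utf-32-be", "strict"),  # UniversalString (UCS-4 big-endian)
    30: ("utf-16-be", "strict"),  # BMPString (UCS-2 big-endian)
}


def decode_string_value(tag_number: int, value: bytes) -> str:
    """Decode the value bytes of an ASN.1 string type to a Python str."""
    try:
        codec, errors = STRING_CODECS[tag_number]
    except KeyError:
        raise ValueError(f"tag {tag_number} is not a string type") from None
    return value.decode(codec, errors)


bmp = decode_string_value(30, b"\x00h\x00i")     # two 16-bit code units
teletex = decode_string_value(20, b"caf\xe9")    # byte 0xE9 maps to U+00E9
```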
## Structured / container decode methods
| Method | Arguments | Returns | Description |
|---|---|---|---|
| `decode_sequence` | `()` | `Decoder` | Consume a SEQUENCE TLV; return child decoder over its contents. |
| `decode_set` | `()` | `Decoder` | Consume a SET TLV; return child decoder over its contents. |
| `decode_explicit_tag` | `(tag_num: int)` | `Decoder` | Strip an explicit context-specific tag `[tag_num]`; return child decoder over the content. |
| `decode_implicit_tag` | `(tag_num: int, tag_class: str)` | `Decoder` | Strip an implicit tag; return child decoder over the **value bytes only** (no tag/length). `tag_class` is `"Context"`, `"Application"`, `"Private"`, or `"Universal"`. |
| `decode_raw_tlv` | `()` | `bytes` | Read the next complete TLV (tag + length + value) as raw bytes and advance past it. |
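
The difference between explicit and implicit tagging is easiest to see at the byte level. The following pure-Python sketch does not call synta: `content_of` is a hypothetical helper that assumes a one-byte tag and a short-form length. It shows why an explicit tag leaves a complete inner TLV while an implicit tag leaves only bare value bytes.

```python
def content_of(tlv: bytes) -> bytes:
    """Strip a one-byte tag and a short-form length; return the content bytes."""
    length = tlv[1]
    if length >= 0x80:
        raise ValueError("long-form lengths are outside this sketch")
    return tlv[2:2 + length]


inner = bytes([0x02, 0x01, 0x63])               # INTEGER 99 as a full TLV

# [1] EXPLICIT INTEGER 99: a constructed context tag wraps the whole inner TLV.
explicit = bytes([0xA1, len(inner)]) + inner

# [1] IMPLICIT INTEGER 99: the context tag replaces the INTEGER tag.
implicit = bytes([0x81, 0x01, 0x63])

explicit_content = content_of(explicit)         # still starts with tag 0x02
implicit_content = content_of(implicit)         # bare value byte, no tag or length
value = int.from_bytes(implicit_content, "big", signed=True)
```

This is why a child decoder returned for an implicit primitive type holds value bytes only, and the caller has to know the underlying type to interpret them.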
## Introspection helpers
| Method | Returns | Description |
|---|---|---|
| `peek_tag()` | `tuple[int, str, bool]` | `(tag_number, tag_class, is_constructed)` — does **not** advance the position. Raises `EOFError` if no data remains. |
| `remaining_bytes()` | `bytes` | All bytes from the current position to the end. Useful after `decode_implicit_tag` to retrieve bare primitive value bytes. |
| `is_empty()` | `bool` | `True` when the current position equals the data length. |
| `position()` | `int` | Current byte offset. |
| `remaining()` | `int` | Number of bytes left. |
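
The tuple returned by `peek_tag()` corresponds to the three fields packed into an ASN.1 identifier octet. As a rough illustration only, the hypothetical `split_tag_byte` below decomposes a single tag byte into the same shape. It is not synta's code and it assumes the low-tag-number form (tag numbers 0 to 30).

```python
# Class names in the order of the two high bits (00, 01, 10, 11).
TAG_CLASSES = ("Universal", "Application", "Context", "Private")


def split_tag_byte(tag_byte: int) -> tuple[int, str, bool]:
    """Return (tag_number, tag_class, is_constructed) for one identifier octet."""
    tag_class = TAG_CLASSES[(tag_byte >> 6) & 0x03]   # bits 8-7: class
    is_constructed = bool(tag_byte & 0x20)            # bit 6: constructed flag
    tag_number = tag_byte & 0x1F                      # bits 5-1: tag number
    if tag_number == 0x1F:
        raise ValueError("high-tag-number form is outside this sketch")
    return tag_number, tag_class, is_constructed


sequence_tag = split_tag_byte(0x30)   # SEQUENCE
context_zero = split_tag_byte(0xA0)   # [0], constructed
context_two = split_tag_byte(0x82)    # [2], primitive
```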
## Full class stub
```python
class Decoder:
    def __init__(self, data: bytes, encoding: Encoding) -> None: ...

    # Primitive types
    def decode_integer(self) -> Integer: ...
    def decode_octet_string(self) -> OctetString: ...
    def decode_oid(self) -> ObjectIdentifier: ...
    def decode_bit_string(self) -> BitString: ...
    def decode_boolean(self) -> Boolean: ...
    def decode_real(self) -> Real: ...
    def decode_null(self) -> Null: ...
    def decode_utc_time(self) -> UtcTime: ...
    def decode_generalized_time(self) -> GeneralizedTime: ...

    # String types
    def decode_utf8_string(self) -> Utf8String: ...
    def decode_printable_string(self) -> PrintableString: ...
    def decode_ia5_string(self) -> IA5String: ...
    def decode_numeric_string(self) -> NumericString: ...      # tag 18
    def decode_teletex_string(self) -> TeletexString: ...      # tag 20
    def decode_visible_string(self) -> VisibleString: ...      # tag 26
    def decode_general_string(self) -> GeneralString: ...      # tag 27
    def decode_universal_string(self) -> UniversalString: ...  # tag 28
    def decode_bmp_string(self) -> BmpString: ...              # tag 30

    # Constructed / tagged
    def decode_sequence(self) -> Decoder: ...
    # Reads a SEQUENCE TLV, advances past it, and returns a new Decoder over
    # the content bytes. Raises ValueError if the next element is not a SEQUENCE.

    def decode_explicit_tag(self, tag_num: int) -> Decoder: ...
    # Reads an explicit context-specific tag [tag_num], advances past it, and
    # returns a new Decoder over the tagged content.
    # Raises ValueError if the tag number does not match.

    def decode_set(self) -> Decoder: ...
    # Reads a SET TLV (tag 0x31), advances past it, and returns a new Decoder
    # over the content bytes. Raises ValueError if the next element is not a SET.

    def decode_implicit_tag(self, tag_num: int, tag_class: str) -> Decoder: ...
    # Strips an implicit tag of the given number and class and returns a new
    # Decoder over the raw value bytes. tag_class must be "Universal",
    # "Context", "Application", or "Private". Raises ValueError on mismatch.
    # The caller must know the original type. For a constructed type, call the
    # appropriate decode_* methods on the returned Decoder to read the inner
    # elements; for a primitive type, read the value with remaining_bytes().
    #
    # Example (constructed type, e.g. [0] IMPLICIT SEQUENCE { INTEGER }):
    #   child = decoder.decode_implicit_tag(0, "Context")
    #   value = child.decode_integer()

    def peek_tag(self) -> tuple[int, str, bool]: ...
    # Returns (tag_number, tag_class, is_constructed) of the next element without
    # consuming any bytes. Raises EOFError if the decoder is empty.
    # Use for CHOICE dispatch or optional-field detection:
    #   tag_num, tag_class, _ = decoder.peek_tag()
    #   if tag_class == "Context" and tag_num == 0:
    #       version = decoder.decode_explicit_tag(0)

    def decode_raw_tlv(self) -> bytes: ...
    # Reads the complete next TLV (tag + length + value bytes) as a bytes object
    # and advances past it. Useful when the element type is unknown or when
    # decoding should be deferred:
    #   tlv = decoder.decode_raw_tlv()
    #   inner = synta.Decoder(tlv, synta.Encoding.DER)

    def remaining_bytes(self) -> bytes: ...
    # Returns all remaining bytes from the current position without advancing.
    # Primarily useful after decode_implicit_tag() for primitive implicit
    # types whose raw value bytes cannot be decoded with the decode_* methods
    # (those expect a full TLV, but implicit stripping leaves only the value):
    #
    #   # Decode dNSName [2] IMPLICIT IA5String
    #   child = decoder.decode_implicit_tag(2, "Context")
    #   dns_name = child.remaining_bytes().decode("ascii")
    #
    #   # Decode iPAddress [7] IMPLICIT OCTET STRING
    #   child = decoder.decode_implicit_tag(7, "Context")
    #   ip_bytes = child.remaining_bytes()  # 4 or 16 raw bytes

    # Dynamic decoding
    def decode_any(self) -> object: ...
    # Returns a typed Python object. Sequence/Set → list.
    # Tagged elements → TaggedElement.
    # Unknown universal tags → RawElement.

    def decode_any_str(self) -> str: ...
    # Decode any ASN.1 string type as a Python str (correct encoding per type).
    # Raises ValueError for non-string tags; EOFError if empty.

    # State
    def is_empty(self) -> bool: ...
    def position(self) -> int: ...
    def remaining(self) -> int: ...
```
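
The `remaining_bytes()` comments in the stub stop at raw bytes. One possible way to turn those bytes into readable values uses only the standard library. `format_dns_name` and `format_ip_address` are hypothetical helper names, and the inputs below are hand-written stand-ins for what `child.remaining_bytes()` would return.

```python
import ipaddress


def format_dns_name(raw: bytes) -> str:
    """Interpret the value bytes of dNSName [2] IMPLICIT IA5String."""
    return raw.decode("ascii")


def format_ip_address(raw: bytes) -> str:
    """Interpret the value bytes of iPAddress [7] IMPLICIT OCTET STRING."""
    if len(raw) not in (4, 16):
        raise ValueError(f"expected 4 or 16 bytes, got {len(raw)}")
    # ip_address() accepts packed big-endian bytes for both IPv4 and IPv6.
    return str(ipaddress.ip_address(raw))


dns_name = format_dns_name(b"example.com")
ipv4 = format_ip_address(bytes([192, 0, 2, 1]))
ipv6 = format_ip_address(bytes.fromhex("20010db8000000000000000000000001"))
```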
## Usage examples
### Decoding ASN.1 data
```python
import math

import synta

# Decode an INTEGER
data = b'\x02\x01\x2A'  # DER-encoded INTEGER 42
decoder = synta.Decoder(data, synta.Encoding.DER)
integer = decoder.decode_integer()
print(integer.to_int())  # Output: 42

# Decode an OBJECT IDENTIFIER
oid_data = b'\x06\x09\x2a\x86\x48\x86\xf7\x0d\x01\x01\x01'
decoder = synta.Decoder(oid_data, synta.Encoding.DER)
oid = decoder.decode_oid()
print(str(oid))  # Output: 1.2.840.113549.1.1.1

# Decode an OCTET STRING
octet_data = b'\x04\x05hello'
decoder = synta.Decoder(octet_data, synta.Encoding.DER)
octet_string = decoder.decode_octet_string()
print(octet_string.to_bytes())  # Output: b'hello'

# Decode a NULL
null_data = b'\x05\x00'
decoder = synta.Decoder(null_data, synta.Encoding.DER)
null = decoder.decode_null()

# Decode a REAL (IEEE 754 double)
real_data = b'\x09\x01\x40'  # PLUS-INFINITY
decoder = synta.Decoder(real_data, synta.Encoding.DER)
r = decoder.decode_real()
assert math.isinf(float(r))

# Decode any element dynamically
data = b'\x02\x01\x2A'
decoder = synta.Decoder(data, synta.Encoding.DER)
obj = decoder.decode_any()  # Returns Integer, OctetString, list (Sequence/Set), etc.
```
### Decoding SEQUENCE structures
Use `decode_sequence()` to enter a SEQUENCE and get a child `Decoder`
positioned over the content bytes. Iterate with typed `decode_*` methods
and `is_empty()`.
```python
import synta
# Encoded SEQUENCE { INTEGER 42, BOOLEAN TRUE }
data = b'\x30\x06\x02\x01\x2a\x01\x01\xff'
decoder = synta.Decoder(data, synta.Encoding.DER)
child = decoder.decode_sequence() # advances past the outer SEQUENCE TLV
assert decoder.is_empty()
while not child.is_empty():
    obj = child.decode_any()  # INTEGER, then BOOLEAN
# Decode an explicit context tag [1] wrapping an INTEGER
tagged_data = b'\xa1\x03\x02\x01\x63'  # [1] EXPLICIT INTEGER 99 (minimal-length INTEGER encoding)
decoder = synta.Decoder(tagged_data, synta.Encoding.DER)
child = decoder.decode_explicit_tag(1) # raises ValueError if tag != [1]
integer = child.decode_integer()
assert integer.to_int() == 99
```