Tensogram C FFI
Exposes the tensogram library to C and C++ callers via opaque handles, typed accessor functions, and a flat C ABI.
Memory ownership rules:
- Handles returned by tgm_* functions are owned by the caller. Free them with the matching tgm_*_free function.
- Pointers returned by accessor functions (e.g. tgm_object_shape) are borrowed from the handle and valid until the handle is freed.
- tgm_bytes_t returned by encode functions must be freed with tgm_bytes_free.
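The borrowed-pointer rule can be illustrated with a small self-contained sketch. The demo_handle type and the demo_* functions below are stand-ins invented for illustration and are not part of the tensogram API; they only mimic the pattern of a caller-owned handle with a borrowing accessor. The sketch shows one way to keep data past the handle's lifetime: copy it into caller-owned storage before calling the free function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for an opaque handle. Hypothetical; NOT a tensogram type. */
typedef struct demo_handle {
    uint64_t *shape;
    size_t ndim;
} demo_handle;

/* The caller owns the returned handle and releases it with demo_handle_free. */
demo_handle *demo_handle_create(const uint64_t *shape, size_t ndim) {
    assert(shape != NULL || ndim == 0);
    demo_handle *h = malloc(sizeof *h);
    if (h == NULL) return NULL;
    h->ndim = ndim;
    h->shape = NULL;
    if (ndim > 0) {
        h->shape = malloc(ndim * sizeof *h->shape);
        if (h->shape == NULL) { free(h); return NULL; }
        memcpy(h->shape, shape, ndim * sizeof *h->shape);
    }
    return h;
}

/* Borrowed pointer: valid only until demo_handle_free(h) is called. */
const uint64_t *demo_handle_shape(const demo_handle *h) { return h->shape; }
size_t demo_handle_ndim(const demo_handle *h) { return h->ndim; }

/* Releases the handle and everything borrowed from it. */
void demo_handle_free(demo_handle *h) {
    if (h == NULL) return;
    free(h->shape);
    free(h);
}

/* Copy up to cap dimensions into caller-owned storage so the values
 * outlive the handle. Returns the number of dimensions copied. */
size_t copy_shape(const demo_handle *h, uint64_t *out, size_t cap) {
    size_t n = demo_handle_ndim(h);
    if (n > cap) n = cap;
    if (n > 0) memcpy(out, demo_handle_shape(h), n * sizeof *out);
    return n;
}
```

With a real handle the same order applies: read through the accessor, copy what must survive, then call the matching free function exactly once.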
JSON schema for tgm_encode
The metadata_json argument to tgm_encode is a JSON object with:
- "descriptors" (array, required): one entry per data object. Each entry merges tensor info and encoding pipeline info into a single object: type, ndim, shape, strides, dtype, byte_order, encoding, filter, compression. Additional keys are stored as params.
- "base" (array, optional): per-object application metadata.
- Any other top-level keys (e.g. "mars") are stored under the message-level _extra_ map.
The CBOR metadata frame is free-form: the wire-format version lives exclusively in the preamble (see plans/WIRE_FORMAT.md §3) and must NOT be supplied by callers. A legacy "version" top-level field is tolerated for pre-0.17 schema compatibility and silently discarded.
Structs
- TgmBufferIter: Opaque handle for iterating over messages in a byte buffer.
- TgmBytes: An owned byte buffer returned by encode functions.
- TgmDecodeMaskOptions: Decode-side companion to TgmEncodeMaskOptions. Pass a pointer to opt out of canonical NaN / Inf restoration. Pass NULL for the default restore_non_finite = true.
- TgmEncodeMaskOptions: Mask-companion options for encode entry points (see plans/WIRE_FORMAT.md §6.5 and docs/src/guide/nan-inf-handling.md). Pass a pointer to this struct to opt into NaN / ±Inf substitution with bitmask companion frames.
- TgmFile: File handle.
- TgmFileIter: Opaque handle for iterating over messages in a file.
- TgmMessage: Decoded message, consisting of global metadata + decoded (descriptor, payload) pairs.
- TgmMetadata: Metadata-only handle (no decoded payloads).
- TgmObjectIter: Opaque handle for iterating over objects within a single message.
- TgmScanEntry: Scan result, an array of (offset, length) pairs.
- TgmScanResult: Opaque handle for scan results.
- TgmStreamingEncoder: Opaque handle for a streaming encoder that writes data objects progressively.
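Since scan results are described as (offset, length) pairs, a caller will typically turn each pair into a slice of the scanned buffer. The following is a minimal sketch of a bounds-checked lookup. The demo_scan_entry struct, its field names, and its field types are assumptions made for illustration; the real TgmScanEntry layout may differ, so check the generated header before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one scan entry; NOT the library's definition. */
typedef struct {
    size_t offset; /* byte offset of the message within the buffer */
    size_t length; /* byte length of the message */
} demo_scan_entry;

/* Return a pointer to the message described by e, or NULL if the entry
 * does not lie entirely inside buf[0 .. buf_len). The returned pointer
 * borrows from buf and is only valid while buf is. */
const uint8_t *message_at(const uint8_t *buf, size_t buf_len, demo_scan_entry e) {
    if (buf == NULL) return NULL;
    if (e.offset > buf_len) return NULL;
    /* Written as a subtraction so offset + length cannot overflow. */
    if (e.length > buf_len - e.offset) return NULL;
    return buf + e.offset;
}
```

The check is phrased as a subtraction rather than offset + length <= buf_len so that a corrupt entry with a very large offset cannot wrap around and pass.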
Enums
Constants
- TGM_WIRE_VERSION: Wire-format version emitted and required by this build of the library.
Functions
- tgm_buffer_iter_count: Return the total number of messages in the buffer iterator.
- tgm_buffer_iter_create: Create a buffer message iterator.
- tgm_buffer_iter_free: Free a buffer iterator handle.
- tgm_buffer_iter_next: Advance the buffer iterator. On success, sets out_buf and out_len to the next message slice (borrowed from the original buffer).
- tgm_bytes_free: Free a byte buffer returned by tgm_encode.
- tgm_compute_hash: Compute a hash of the given data. Returns TGM_ERROR_OK on success and fills the out parameter with a tgm_bytes_t containing the hex-encoded hash string (NOT null-terminated). Free with tgm_bytes_free.
- tgm_decode: Decode a complete message (global metadata + all object payloads).
- tgm_decode_metadata: Decode only the global metadata (no payload bytes are read).
- tgm_decode_object: Decode a single object by index.
- tgm_decode_range: Decode partial ranges from a data object.
- tgm_decode_with_options: Decode with explicit NaN / Inf restoration options.
- tgm_doctor_to_json: Run environment diagnostics and serialise the report as a JSON byte buffer.
- tgm_encode: Encode a Tensogram message from JSON metadata and raw data slices.
- tgm_encode_pre_encoded: Encode a Tensogram message from JSON metadata and pre-encoded payload bytes.
- tgm_encode_with_options: Encode with explicit NaN / Inf mask-companion options.
- tgm_error_string: Convert an error code to a human-readable string. Returns a static string (always valid, never NULL).
- tgm_file_append: Encode and append a message to the file. Same JSON schema as tgm_encode for metadata_json.
- tgm_file_append_raw: Append raw message bytes to the file.
- tgm_file_append_with_options: Append a message to a file with explicit NaN / Inf mask-companion options.
- tgm_file_close: Close a file handle and release resources.
- tgm_file_create: Create a new Tensogram file for writing.
- tgm_file_decode_message: Decode the message at index from the file. On success, fills the out parameter with a TgmMessage handle.
- tgm_file_iter_create: Create a file message iterator from an open TgmFile.
- tgm_file_iter_free: Free a file iterator handle.
- tgm_file_iter_next: Advance the file iterator. On success, fills the out parameter with a TgmBytes buffer containing the raw message bytes (caller owns, free with tgm_bytes_free).
- tgm_file_message_count: Count messages in the file (may trigger lazy scan).
- tgm_file_open: Open an existing Tensogram file for reading.
- tgm_file_path: Returns the file path as a null-terminated string. The pointer is valid until the file handle is closed.
- tgm_file_read_message: Read raw message bytes at index. On success, fills the out parameter with a TgmBytes buffer.
- tgm_last_error: Returns a pointer to the last error message, or NULL if no error. The pointer is valid until the next FFI call on the same thread.
- tgm_message_free: Free a decoded message handle.
- tgm_message_metadata: Extract a metadata handle from a decoded message. The metadata handle is independent; free it separately with tgm_metadata_free.
- tgm_message_num_decoded: Returns the number of decoded payload buffers. Equivalent to tgm_message_num_objects; kept for ABI compatibility.
- tgm_message_num_objects: Returns the number of decoded objects in this message handle. For tgm_decode this equals the total object count; for tgm_decode_object this is always 1.
- tgm_message_version: Returns the wire format version the decoder read from the preamble.
- tgm_metadata_free: Free a metadata handle.
- tgm_metadata_get_float: Look up a float value by dot-notation key.
- tgm_metadata_get_int: Look up an integer value by dot-notation key. Returns default_val if the key is not found or is not an integer.
- tgm_metadata_get_string: Look up a string value by dot-notation key (e.g. “mars.class”). Returns NULL if the key is not found or is not a string. The pointer is valid until the metadata handle is freed.
- tgm_metadata_num_objects: Returns the number of objects described in the global metadata.
- tgm_metadata_version: Returns the wire format version.
- tgm_object_byte_order: Returns the byte order string (“big” or “little”). Valid until the message is freed.
- tgm_object_compression: Returns the compression string (e.g. “none”, “zstd”). Valid until the message is freed.
- tgm_object_data: Returns a pointer to the decoded payload bytes for a decoded object. decoded_index is the index into the decoded objects array (0 for the first decoded object, regardless of the original object index). out_len receives the byte length.
- tgm_object_dtype: Returns the dtype as a null-terminated string (e.g. “float32”). The pointer is valid until the message is freed.
- tgm_object_filter: Returns the filter string (e.g. “none”, “shuffle”). Valid until the message is freed.
- tgm_object_hash_type: Returns the hash type string (“xxh3”) or NULL if no hash. Valid until the message is freed.
- tgm_object_hash_value: Returns the hash value hex string or NULL if no hash. Valid until the message is freed.
- tgm_object_iter_create: Create an object iterator from raw message bytes.
- tgm_object_iter_free: Free an object iterator handle.
- tgm_object_iter_next: Advance the object iterator. On success, fills the out parameter with a TgmMessage handle containing exactly one decoded object (the next in sequence).
- tgm_object_ndim: Returns the number of dimensions for the object at index.
- tgm_object_shape: Returns a pointer to the shape array. Length is tgm_object_ndim(). The pointer is valid until the message is freed.
- tgm_object_strides: Returns a pointer to the strides array. Length is tgm_object_ndim().
- tgm_object_type: Returns the object type string (e.g. “ndarray”). Valid until the message is freed.
- tgm_payload_encoding: Returns the encoding string for a data object descriptor (e.g. “none”, “simple_packing”). The pointer is valid until the message is freed.
- tgm_payload_has_hash: Returns 1 if the i-th data object has a populated inline hash slot, 0 otherwise.
- tgm_scan: Scan a buffer for message boundaries.
- tgm_scan_count: Returns the number of messages found by tgm_scan.
- tgm_scan_entry
- tgm_scan_free: Free a scan result handle.
- tgm_simple_packing_compute_params: Compute simple_packing parameters for a set of f64 values.
- tgm_streaming_encoder_count: Return the number of objects written so far.
- tgm_streaming_encoder_create: Create a streaming encoder writing to a file.
- tgm_streaming_encoder_create_with_options: Streaming-encoder constructor with NaN / Inf mask-companion options.
- tgm_streaming_encoder_finish: Finalize the streaming encoder, writing footer and closing the file.
- tgm_streaming_encoder_free: Free a streaming encoder without finalizing (abandons the output).
- tgm_streaming_encoder_write: Write a single data object to the streaming encoder.
- tgm_streaming_encoder_write_pre_encoded: Write a single pre-encoded data object to the streaming encoder.
- tgm_streaming_encoder_write_preceder: Write a PrecederMetadata frame for the next data object.
- tgm_validate: Validate a single Tensogram message buffer.
- tgm_validate_file: Validate all messages in a .tgm file.
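Several of the functions above (tgm_encode, tgm_file_append) take a metadata_json string. As a rough illustration of what such a string might look like, the sketch below formats a single-object document using only the descriptor keys named in the schema section. Several things here are assumptions rather than documented behaviour: build_metadata_json is a hypothetical helper, the strides are written as row-major element strides, the value "od" under "mars" is purely illustrative, and the chosen values ("ndarray", "float32", "little", "none") are taken from the examples given in the accessor descriptions. No "version" key is written, since the wire-format version must not be supplied by callers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a metadata_json string describing one 2-D
 * float32 object into buf. Returns the number of characters written
 * (excluding the terminator), or -1 if buf is too small. */
int build_metadata_json(char *buf, size_t cap, unsigned rows, unsigned cols) {
    assert(buf != NULL && cap > 0);
    int n = snprintf(buf, cap,
        "{"
          "\"descriptors\":[{"
            "\"type\":\"ndarray\","
            "\"ndim\":2,"
            "\"shape\":[%u,%u],"
            "\"strides\":[%u,1],"          /* assumed: row-major element strides */
            "\"dtype\":\"float32\","
            "\"byte_order\":\"little\","
            "\"encoding\":\"none\","
            "\"filter\":\"none\","
            "\"compression\":\"none\""
          "}],"
          "\"mars\":{\"class\":\"od\"}"    /* extra top-level key, illustrative value */
        "}",
        rows, cols, cols);
    /* snprintf reports the length it wanted; treat truncation as failure. */
    if (n < 0 || (size_t)n >= cap) return -1;
    return n;
}
```

A caller would pass the resulting buffer as the metadata_json argument; the exact parameter list of tgm_encode is not shown here because it is not given in this page.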