child_ai_json_engine 0.0.2

Crate provides a JSON engine for collecting, aggregating, and managing models.
Documentation
  • Coverage
  • 0%
    0 out of 111 items documented0 out of 49 items with examples
  • Size
  • Source code size: 106.13 kB This is the summed size of all the files inside the crates.io package for this release.
  • Documentation size: 11.05 MB This is the summed size of all files generated by rustdoc for all configured targets
  • Ø build duration
  • this release: 18s Average build duration of successful builds.
  • all releases: 19s Average build duration of successful builds in releases after 2024-10-23.
  • Links
  • crates.io
  • Dependencies
  • Versions
  • Owners
  • rayanmorel4498-ai

child_ai_json_engine

child_ai_json_engine is a lightweight, minimal Rust library that offers low-level primitives for collecting, validating, persisting and aggregating JSON event data, and for storing small binary model artifacts. The crate provides safe, atomic file operations (writes and appends), a tiny internal json::Value parser/printer, schema validation helpers and a small in-process metrics surface.

This README documents the library's responsibilities, important guarantees, and how to use the public API. This crate is intentionally small and dependency-free except for a single allowed helper crate; it is designed to be embedded under a higher-level application that enforces formats and security boundaries.

Important warning

THIS CRATE IS NOT INTENDED FOR DIRECT PUBLIC USE. It provides low-level I/O and JSON primitives but does not itself guarantee production-grade JSON schema contracts, user input sanitization, or API-level compatibility. Always place an abstraction layer between user-facing code and this crate that enforces JSON creation standards, schema contracts and access control before exposing data to external callers. This crate does not guarantee that the JSON formats it produces or persists constitute a stable or official standard. Do not rely on these formats as a protocol or interchange standard until they have been validated and accepted through appropriate review and consensus processes.

What this crate provides

This crate exposes low-level, durability-focused primitives for the following responsibilities:

  • Collecting newline-delimited JSON events and atomically appending them to data/events.jsonl.
  • Optional schema validation helpers (missing schemas are tolerated but logged).
  • Persisting small model artifacts (metadata .meta.json and binary .bin) under data/models/ with a configurable size limit.
  • Periodic aggregation of metrics/events into data/aggregated_stats.jsonl.
  • An in-process metrics surface (atomic counters) with a Prometheus-text exporter via metrics::to_prometheus_text().

Guarantees

  • Atomic writes: write/append helpers use a temporary file + fsync + rename pattern to avoid partial-file corruption on crashes.
  • Final flush on shutdown: the aggregator performs a final flush on orderly shutdown to reduce in-memory data loss.
  • Model size limit: model saves are rejected when exceeding CHILD_JSON_ENGINE_MAX_MODEL_BYTES (default: 200_000_000 bytes).
  • Validation tolerance: when a schema is missing the validator becomes permissive and logs the condition instead of failing hard.

Public API (examples)

Small usage examples (minimal interface):

// Build and use a collector
let mut collector = DataCollector::new("data/");
collector.record_event(json_str.as_bytes());
// Persist a model (enforces size limit)
collector.save_model("model-123", &model_bytes)?;

// Start the aggregator (run on a separate thread)
let mut agg = MetricsAggregator::start("data/", Duration::from_secs(60));
// Request stop and wait for final flush
agg.request_stop();
agg.join();

See the exact function signatures in the crate's exported modules.

Environment variables

  • CHILD_JSON_ENGINE_MAX_MODEL_BYTES — maximum allowed model size (bytes). Default: 200000000.

Operational notes

  • All file operations are blocking and perform fsync; for async apps, run these calls inside spawn_blocking or on a dedicated thread.
  • This crate is intentionally a low-level library: put a higher-level abstraction in front of it to validate, filter and enforce access control before exposing functionality to untrusted callers.

Visual architecture (Mermaid)

graph LR
	A[Client / Producer] -->|send events| B[DataCollector]
	B -->|buffer -> atomic append| C[events.jsonl]
	C -->|read by| D[MetricsAggregator]
	D -->|atomic append| E[aggregated_stats.jsonl]
	B -->|save model| F[model_store (data/models)]
	F -->|meta| G[model_id.meta.json]
	F -->|binary| H[model_id.bin]
	B -->|save vocab| I[vocab files (data/vocab/*.json)]
	style A fill:#f9f,stroke:#333,stroke-width:1px
	style B fill:#bbf,stroke:#333
	style D fill:#bfb,stroke:#333
	style F fill:#ffa,stroke:#333

JSON Schemas (examples)

Engine Execution Schema (engine_execution)

{
	"$schema": "http://json-schema.org/draft-07/schema#",
	"$id": "https://example.org/schemas/v1/engine_execution.json",
	"title": "EngineExecution",
	"type": "object",
	"required": ["schema_version","trace_id","task_id","engine_id","engine_instance_id","timestamp","input","output","metrics","status"],
	"properties": {
		"schema_version": { "type": "string" },
		"trace_id": { "type": "string" },
		"task_id": { "type": "string" },
		"engine_id": { "type": "string" },
		"engine_instance_id": { "type": "string" },
		"timestamp": { "type": "string", "format": "date-time" },
		"input": { "type": "object" },
		"output": { "type": "object" },
		"metrics": { "type": "object" },
		"status": { "type": "string" }
	}
}

Comparison Result Schema (comparison_result)

{
	"$schema": "http://json-schema.org/draft-07/schema#",
	"$id": "https://example.org/schemas/v1/comparison_result.json",
	"title": "ComparisonResult",
	"type": "object",
	"required": ["schema_version","trace_id","comparison_id","timestamp","query","candidates","winner","metrics"],
	"properties": {
		"comparison_id": { "type": "string" },
		"query": { "type": "object" },
		"candidates": { "type": "array" },
		"winner": { "type": "object" },
		"metrics": { "type": "object" }
	}
}

Training Sample Schema (training_sample)

{
	"$schema": "http://json-schema.org/draft-07/schema#",
	"$id": "https://example.org/schemas/v1/training_sample.json",
	"title": "TrainingSample",
	"type": "object",
	"required": ["schema_version","sample_id","created_at","source_trace_id","tier","input","output","validation"],
	"properties": {
		"sample_id": { "type": "string" },
		"created_at": { "type": "string", "format": "date-time" },
		"tier": { "type": "string" },
		"input": { "type": "object" },
		"output": { "type": "object" },
		"validation": { "type": "object" }
	}
}

Model Metadata Schema (model_metadata)

{
	"$schema": "http://json-schema.org/draft-07/schema#",
	"$id": "https://example.org/schemas/v1/model_metadata.json",
	"title": "ModelMetadata",
	"type": "object",
	"required": ["model_id","epoch","checksum","format_version","created_at","size_bytes"],
	"properties": {
		"model_id": { "type": "string" },
		"epoch": { "type": "integer" },
		"checksum": { "type": "string" },
		"format_version": { "type": "integer" },
		"created_at": { "type": "string", "format": "date-time" },
		"size_bytes": { "type": "integer" }
	}
}

Aggregated Stats Schema (aggregated_stats)

{
	"$schema": "http://json-schema.org/draft-07/schema#",
	"$id": "https://example.org/schemas/v1/aggregated_stats.json",
	"title": "AggregatedStats",
	"type": "object",
	"required": ["timestamp","engine_id","window_minutes","metrics"],
	"properties": {
		"timestamp": { "type": "string", "format": "date-time" },
		"engine_id": { "type": "string" },
		"window_minutes": { "type": "integer" },
		"metrics": { "type": "object" }
	}
}

Vocab Schema (vocab)

{
	"$schema": "http://json-schema.org/draft-07/schema#",
	"$id": "https://example.org/schemas/v1/vocab.json",
	"title": "Vocab",
	"type": "object",
	"required": ["name","created_at","tokens_count","entries"],
	"properties": {
		"name": { "type": "string" },
		"created_at": { "type": "string", "format": "date-time" },
		"tokens_count": { "type": "integer" },
		"entries": { "type": "object" }
	}
}

License: MIT