tokitai-operator 0.1.0

Verified DL kernel compiler: formally-checked GEMM, p-adic, sheaf, contract-carrying ops. Paper-artifact grade.
//! Phase 2.6: synthetic data generator for training.
//!
//! Standalone, pure-CPU module used to drive the `Model` / `MoEModel`
//! layers in `src/model/` and `src/moe_model/` when the real
//! `tokitai-search` SQLite ledgers are not available (e.g. unit tests
//! and dev/CI environments). Every generator is deterministic given
//! the seed, so the same `(n_samples, in_dim, out_dim, seed)` tuple
//! always yields the same dataset.
//!
//! Public surface (re-exported for convenience):
//! - [`regression::make_regression_dataset`] — closed-form-friendly
//!   linear regression: `y = X w_star + epsilon`.
//! - [`decision_outcome::make_quality_decision_dataset`] — mimics the
//!   real 96-dim input / 20-dim output schema from
//!   `crates/training/` in tokitai-search, with a 4-expert
//!   non-linear generation process.
//! - [`stream::SyntheticSampleStream`] — feeds any
//!   `Vec<QualitySample>` in mini-batches.
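/*!
The determinism guarantee and the `y = X w_star + epsilon` model described
above can be illustrated with a minimal, dependency-free sketch. Everything
in it is hypothetical and exists only for illustration: `SketchSample`,
`sketch_regression_dataset`, the `noise` parameter, and the SplitMix64-based
uniform sampler are assumptions, not this crate's types, signatures, or
noise model. The real generator lives in [`regression`].

```rust
/// Hypothetical sample type for this sketch (not the crate's `RegressionSample`).
#[derive(Debug, Clone, PartialEq)]
struct SketchSample {
    x: Vec<f64>,
    y: Vec<f64>,
}

/// One SplitMix64 step: advances `state` and returns a mixed 64-bit value.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Uniform value in [-1, 1), built from the top 53 bits of the generator.
fn uniform(state: &mut u64) -> f64 {
    let unit = (splitmix64(state) >> 11) as f64 / (1u64 << 53) as f64;
    2.0 * unit - 1.0
}

/// Builds `n_samples` pairs with `y = x w_star + noise * u`, where `u` is
/// uniform in [-1, 1). Assumption: uniform noise stands in for `epsilon`.
fn sketch_regression_dataset(
    n_samples: usize,
    in_dim: usize,
    out_dim: usize,
    seed: u64,
    noise: f64,
) -> Vec<SketchSample> {
    // All randomness flows from this single state, so the output is a pure
    // function of the arguments.
    let mut state = seed;

    // Ground-truth weights, stored row-major as an `in_dim` by `out_dim` matrix.
    let w_star: Vec<f64> = (0..in_dim * out_dim)
        .map(|_| uniform(&mut state))
        .collect();

    (0..n_samples)
        .map(|_| {
            let x: Vec<f64> = (0..in_dim).map(|_| uniform(&mut state)).collect();
            let y: Vec<f64> = (0..out_dim)
                .map(|j| {
                    // Column `j` of `w_star` dotted with `x`, plus scaled noise.
                    let dot: f64 = x
                        .iter()
                        .enumerate()
                        .map(|(i, xi)| xi * w_star[i * out_dim + j])
                        .sum();
                    dot + noise * uniform(&mut state)
                })
                .collect();
            SketchSample { x, y }
        })
        .collect()
}
```

Calling `sketch_regression_dataset(4, 3, 2, 42, 0.1)` twice returns equal
vectors, because the only source of randomness is the state derived from
`seed`. Changing the seed changes both `w_star` and the sampled inputs.
*/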

pub mod decision_outcome;
pub mod regression;
pub mod stream;

pub use decision_outcome::{
    AUX_DIMS, CATEGORICAL_DIMS, NUMERICAL_DIMS, OUTCOME_DIMS, QualitySample,
    make_quality_decision_dataset,
};
pub use regression::{RegressionSample, make_regression_dataset};
pub use stream::SyntheticSampleStream;

/// Dimensionality of the synthetic quality-decision input vector.
/// Mirrors `crates/training::CATEGORICAL_DIMS + NUMERICAL_DIMS` in
/// tokitai-search.
pub const QUALITY_INPUT_DIM: usize = 96;

/// Dimensionality of the synthetic quality-decision output vector.
/// Mirrors `crates/training::OUTCOME_KIND_DIMS + AUX_METRIC_DIMS`.
pub const QUALITY_OUTPUT_DIM: usize = 20;