Crate rullama_finetune


Local LoRA fine-tuning for the rullama Rust runtime.

Same trainer on native and wasm32-unknown-unknown. The forward, backward, LoRA state, optimizer, and dataset parsing all compile on both targets — the only native-only bits are filesystem helpers that wrap the bytes-based core API (see load_jsonl_from_bytes / save_adapter_to_bytes / load_adapter_into_state_from_bytes).
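The bytes-in core plus native path wrapper split described above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: `RawExample`, `parse_jsonl_lines`, and `parse_jsonl_path` are hypothetical names, and the real `load_jsonl_from_bytes` signature may differ. The sketch only splits lines and does not decode JSON.

```rust
/// Hypothetical record type: one raw JSONL line, not yet decoded as JSON.
#[derive(Debug, PartialEq)]
pub struct RawExample(pub String);

/// Bytes-in core: no filesystem access, so it can compile on wasm32 as well.
/// Blank lines are skipped; invalid UTF-8 is reported as an error.
pub fn parse_jsonl_lines(bytes: &[u8]) -> Result<Vec<RawExample>, std::str::Utf8Error> {
    let text = std::str::from_utf8(bytes)?;
    Ok(text
        .lines()
        .map(|line| line.trim())
        .filter(|line| !line.is_empty())
        .map(|line| RawExample(line.to_owned()))
        .collect())
}

/// Native-only wrapper: read the file, then defer to the bytes-based core.
#[cfg(not(target_arch = "wasm32"))]
pub fn parse_jsonl_path(path: &std::path::Path) -> std::io::Result<Vec<RawExample>> {
    let bytes = std::fs::read(path)?;
    parse_jsonl_lines(&bytes)
        .map_err(|e| std::io::Error::new(std::io::ErrorKind::InvalidData, e))
}
```

Keeping all parsing in the bytes-based function means a browser host can fetch the dataset however it likes and hand over a byte slice, while native callers get the path convenience for free.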

Module map:

  • shared — config / error / progress types.
  • dataset_loader — JSONL parser (bytes-in core + native path wrapper) + Tokenizer trait + byte-level and HF-tokenizers-backed implementations.
  • lr_schedule — warmup + linear / cosine / cosine-warm-restarts schedules. Cosine clamps progress at 1.0.
  • lora — LoRA A/B state, forward correction, A/B grad accumulation.
  • scratch — per-step GPU scratch buffers for the backward pass.
  • session — TrainingSession driving one training step end-to-end (forward → loss → backward → Adam).
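To make the lr_schedule entry concrete, here is a minimal sketch of linear warmup followed by cosine decay, with progress clamped at 1.0 so that steps past the planned total stay at the floor rate. The function name, parameters, and the exact warmup formula are assumptions for illustration; the crate's own schedule API is not shown in this page.

```rust
use std::f32::consts::PI;

/// Hypothetical schedule: linear warmup to `base_lr`, then cosine decay to `min_lr`.
pub fn warmup_cosine_lr(
    step: u32,
    warmup_steps: u32,
    total_steps: u32,
    base_lr: f32,
    min_lr: f32,
) -> f32 {
    // Warmup phase: ramp linearly, reaching base_lr on the last warmup step.
    if step < warmup_steps {
        return base_lr * (step + 1) as f32 / warmup_steps as f32;
    }
    // Decay phase: progress runs from 0.0 to 1.0 and is clamped at 1.0,
    // so overshooting total_steps never makes the cosine turn back upward.
    let decay_steps = total_steps.saturating_sub(warmup_steps).max(1);
    let progress = ((step - warmup_steps) as f32 / decay_steps as f32).min(1.0);
    min_lr + 0.5 * (base_lr - min_lr) * (1.0 + (PI * progress).cos())
}
```

Without the clamp, a run that continues beyond `total_steps` would see the cosine term rise again and the learning rate creep back up, which is usually not what is wanted.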

Re-exports

pub use session::load_adapter_into_state;
pub use session::TrainingSession;
pub use session::load_adapter_into_state_from_bytes;

Modules

dataset_loader
JSONL dataset loader + tokenizer trait for local training.
lora
Per-LoRA GPU state: A and B matrices for each wrapped projection.
lr_schedule
Learning rate schedules for local training.
scratch
TrainingScratch: per-step GPU scratch buffers for the backward pass.
session
TrainingSession drives one training step end-to-end: forward (with LoRA correction + activation capture) → loss → backward → Adam update.
shared
Shared configuration, error, and progress types, vendored from brainwires-finetune so rullama-finetune is fully standalone (no cross-repo dep on brainwires-framework). Kept structurally identical to the upstream modules so types map 1:1; tweaks can diverge over time.
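For readers unfamiliar with the "forward correction" that the lora and session modules refer to, the following CPU-only sketch shows the standard LoRA formulation, y = W x + (alpha / r) · B (A x), on plain slices. It is an assumption that the crate uses this exact scaling; the function names and the row-major layout are hypothetical, and the real implementation runs on GPU buffers.

```rust
/// Row-major matrix-vector product: `m` is rows x cols, `x` has length cols.
fn matvec(m: &[f32], rows: usize, cols: usize, x: &[f32]) -> Vec<f32> {
    assert_eq!(m.len(), rows * cols);
    assert_eq!(x.len(), cols);
    (0..rows)
        .map(|r| {
            m[r * cols..(r + 1) * cols]
                .iter()
                .zip(x)
                .map(|(w, v)| w * v)
                .sum::<f32>()
        })
        .collect()
}

/// Hypothetical LoRA forward: frozen base projection plus a low-rank correction.
/// `w` is d_out x d_in (frozen), `a` is rank x d_in, `b` is d_out x rank.
pub fn lora_forward(
    w: &[f32],
    a: &[f32],
    b: &[f32],
    x: &[f32],
    d_out: usize,
    d_in: usize,
    rank: usize,
    alpha: f32,
) -> Vec<f32> {
    let base = matvec(w, d_out, d_in, x); // W x
    let ax = matvec(a, rank, d_in, x); // A x, a rank-sized intermediate
    let bax = matvec(b, d_out, rank, &ax); // B (A x)
    let scale = alpha / rank as f32; // assumed alpha / r scaling
    base.iter().zip(&bax).map(|(y, c)| y + scale * c).collect()
}
```

With B initialised to zeros, as is conventional for LoRA, the correction vanishes and the output equals the base projection, so training starts from the unmodified model.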