expman 0.5.1

High-performance experiment manager with Rust backend

High-performance experiment manager written in Rust, with a Python wrapper for non-blocking logging, a live web dashboard, and a friendly CLI.

Features

  • Non-blocking Python logging: log_vector() is a ~100ns channel send — never blocks your training loop
  • Live dashboard: SSE-powered real-time metric streaming, run comparison charts, artifact browser
  • Scalar metric filtering: Toggle which metric columns appear in the runs table with one click
  • Single binary: CLI + web server in one exp binary — no Python runtime needed for the server
  • Efficient storage: Batched Arrow/Parquet writes, not per-step read-concat-write
  • Nix dev environment: Reproducible with nix develop
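The non-blocking logging model can be illustrated with a small pure-Python sketch. This shows only the channel-send pattern (producer enqueues, a background worker does the slow I/O) — it is not expman's actual Rust implementation, and `NonBlockingLogger` and its fields are hypothetical:

```python
import queue
import threading

class NonBlockingLogger:
    """Toy illustration of the channel-send pattern: the caller only
    enqueues a record; a background thread handles the slow writes."""

    def __init__(self):
        self.q = queue.Queue()
        self.records = []  # stands in for the batched Parquet writer
        self.worker = threading.Thread(target=self._drain, daemon=True)
        self.worker.start()

    def log_vector(self, metrics, step):
        # Fast path: a single queue put — never blocks on disk I/O.
        self.q.put((metrics, step))

    def _drain(self):
        while True:
            item = self.q.get()
            if item is None:  # sentinel: shut down
                break
            self.records.append(item)  # the real engine batches to Parquet

    def close(self):
        self.q.put(None)
        self.worker.join()

logger = NonBlockingLogger()
for step in range(3):
    logger.log_vector({"loss": 1.0 / (step + 1)}, step)
logger.close()
print(logger.records)
```

The real engine amortizes writes further by batching rows into Arrow record batches before flushing to Parquet.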

Screenshots

Installation

From Cargo

cargo install expman

From PyPI

pip install expman-rs

From Nix

# Run standalone CLI
nix run github:lokeshmohanty/expman-rs

# Build local package
nix build .#expman-rust
nix build .#python3Packages.expman-rs

[!TIP] This project provides a Cachix cache. Enable it for faster builds: cachix use lokeshmohanty

Alternatively: Download from GitHub Releases

  • Direct download: Grab the pre-built standalone exp binaries or Python wheels from our GitHub Releases page.

Quick Start

Python

Option A: Global Singleton (Easiest)

import expman as exp

exp.init("resnet_cifar10")
exp.log_params({"lr": 0.001})
exp.log_vector({"loss": 0.5}, step=0)
# Auto-closes on script exit

Option B: Context Manager (Recommended for scope control)

from expman import Experiment

with Experiment("resnet_cifar10") as exp:
    exp.log_vector({"loss": 0.5}, step=0)

For Rust

Basic usage:

use expman::{ExperimentConfig, LoggingEngine, RunStatus};

fn main() -> anyhow::Result<()> {
    let config = ExperimentConfig::new("my_rust_exp", "./experiments");
    let engine = LoggingEngine::new(config)?;

    engine.log_vector([("loss".to_string(), 0.5.into())].into(), Some(0));

    engine.close(RunStatus::Finished);
    Ok(())
}

Dashboard

exp serve ./experiments
# Open http://localhost:8000

CLI

exp list ./experiments              # list all experiments
exp list ./experiments -e resnet    # list runs for an experiment
exp inspect ./experiments/resnet/runs/20240101_120000
exp clean resnet --keep 5 --force   # delete old runs
exp export ./experiments/resnet/runs/20240101_120000 --format csv
exp export ./experiments/resnet/runs/20240101_120000 --format tensorboard -o ./tb_logs
exp import /path/to/tensorboard_logs --dir ./experiments

TensorBoard Interoperability

Drop-in SummaryWriter — Replace your TensorBoard import, keep the same code:

# Before:
# from torch.utils.tensorboard import SummaryWriter

# After:
from expman import SummaryWriter

writer = SummaryWriter(log_dir="runs/my_experiment")
for step in range(100):
    writer.add_scalar("loss", 1.0 / (step + 1), step)
writer.close()

Import existing TensorBoard logs into expman:

exp import /path/to/tensorboard_logs --dir ./experiments

Export expman runs to TensorBoard format:

exp export ./experiments/my_exp/20240101_120000 --format tensorboard -o ./tb_logs
tensorboard --logdir ./tb_logs

Development

Please see CONTRIBUTING.md for detailed instructions on setting up your local environment, building the Python bindings, and important git configuration notes.

nix develop                    # enter dev shell
just test                      # run all tests
just dev-py                    # build Python extension (uv pip install -e .)
just serve ./experiments       # start dashboard
just watch                     # watch mode for tests
just build-docs                # build and open documentation

Documentation

For detailed usage, refer to the source code modules in src/:

  • cli - Command-line interface definitions and references.
  • core - Core high-performance async Rust logging engine.
  • wrappers/python - Python extension Rust bindings.
  • api - Axum web server and SSE live streaming API.
  • app - Leptos frontend web application.
  • python-package - Python package code and tests.

Dashboard Features

  • Live Metrics: Real-time SSE streaming of experiment metrics and logs.
  • Live Jupyter Notebooks: Spawn a live Jupyter instance bound to any run's execution environment directly from the UI, with auto-generated analytics boilerplate (Polars).
  • Scalar Filter: Toggle individual metric columns in the Runs table via chip buttons — no page reload.
  • Deep Inspection: View detailed run configurations, metadata, and artifacts.
  • Artifact Browser: Preview parquet, csv, and other files directly in the browser.
  • Comparison View: Overlay multiple runs on a shared timeline for analysis.
  • Server-side filtering: Pass ?metrics=loss,acc to /api/experiments/:exp/runs to limit which scalars are returned.
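The server-side filter above can also be used programmatically. A minimal sketch that only builds the query URL (the `runs_url` helper is hypothetical, and the response schema is not documented here):

```python
from urllib.parse import urlencode

def runs_url(base, experiment, metrics):
    """Build the runs-endpoint URL with a server-side metric filter,
    per the ?metrics=loss,acc query described above (hypothetical helper)."""
    query = urlencode({"metrics": ",".join(metrics)})
    return f"{base}/api/experiments/{experiment}/runs?{query}"

print(runs_url("http://localhost:8000", "resnet_cifar10", ["loss", "acc"]))
```

Note that `urlencode` percent-encodes the comma (`loss%2Cacc`), which is equivalent to the literal `loss,acc` form shown above.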

Examples

Practical code samples are provided in the examples/ directory. The Python example demonstrates logging metrics alongside generating and storing rich media artifacts (audio, video, plots) natively.

To run the Python examples, ensure you have built the extension first with just dev-py and installed the dev dependencies (uv pip install -e ".[dev]").

To run the Rust example, use:

cargo run --example logging --features cli

Experiments Layout

experiments/
  my_experiment/
    experiment.yaml          # display name, description
    20240101_120000/         # run directory
      metrics.parquet      # all logged metrics (Arrow/Parquet)
      config.yaml          # logged params/hyperparameters
      run.yaml             # run metadata (status, duration, timestamps)
      run.log              # text log
      artifacts/           # user-saved files
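Because the layout is plain files on disk, runs can be enumerated with nothing but the standard library. A sketch (the `list_runs` helper is hypothetical, assuming every run directory contains a run.yaml as shown above):

```python
import tempfile
from pathlib import Path

def list_runs(experiments_dir, experiment):
    """Return run directory names for an experiment, treating any
    subdirectory that contains a run.yaml as a run (hypothetical helper)."""
    exp_dir = Path(experiments_dir) / experiment
    return sorted(p.name for p in exp_dir.iterdir()
                  if p.is_dir() and (p / "run.yaml").exists())

# Demo against a throwaway tree matching the layout above.
root = Path(tempfile.mkdtemp())
run = root / "my_experiment" / "20240101_120000"
run.mkdir(parents=True)
(run / "run.yaml").write_text("status: finished\n")
print(list_runs(root, "my_experiment"))
```

The same property means runs can be backed up, diffed, or synced with ordinary file tools — no database export step required.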