langsmith-rust 0.1.3

# Architecture Documentation

## Overview

This crate implements a modular, extensible architecture for tracing to LangSmith. It follows SOLID principles and uses the Strategy, Factory, and Observer patterns (plus a singleton `Config`) to keep tracing logic decoupled from application code and easy to extend.

This documentation reflects the **current** crate layout and also highlights how a LangGraph-like executor (e.g. TeachMeWowAgent’s Graph DSL) can integrate via `GraphTrace` and `RunScope`.

## Hierarchical Structure

```
langsmith-rust/
├── src/
│   ├── lib.rs                    # Main entry point, re-exports
│   ├── config/                   # Configuration management
│   │   ├── mod.rs
│   │   └── env.rs                # Environment variable loading
│   ├── client/                   # HTTP client layer
│   │   ├── mod.rs
│   │   └── http.rs               # LangSmith API client
│   ├── models/                   # Data models
│   │   ├── mod.rs
│   │   ├── run.rs                # Run, RunType, RunUpdate
│   │   ├── messages.rs           # Message types (AI, Tool, Human)
│   │   └── metrics.rs            # Metrics (tokens, costs)
│   ├── tracing/                  # Core tracing logic
│   │   ├── mod.rs
│   │   ├── tracer.rs             # Tracer (main tracing struct)
│   │   ├── context.rs            # TraceContext (propagation)
│   │   ├── decorator.rs          # trace_node helpers
│   │   ├── graph.rs              # GraphTrace (LangGraph-like helpers)
│   │   └── scope.rs              # RunScope (ergonomic run lifecycle)
│   ├── strategies/               # Strategy pattern implementations
│   │   ├── mod.rs
│   │   ├── tracing_strategy.rs   # Tracing strategies (async/sync)
│   │   └── serialization_strategy.rs  # Serialization strategies
│   ├── factories/                # Factory pattern
│   │   ├── mod.rs
│   │   └── tracer_factory.rs     # TracerFactory
│   ├── observability/            # Observer pattern
│   │   ├── mod.rs
│   │   ├── observer.rs           # Observer trait
│   │   ├── observable.rs         # Observable trait
│   │   └── node_wrapper.rs       # ObservableNodeWrapper
│   ├── utils/                    # Utilities
│   │   ├── mod.rs
│   │   ├── serialization.rs      # JSON serialization helpers
│   │   └── validation.rs         # Data validation
│   └── error.rs                  # Error types
```

## Class Relationships

### Core Classes

```
┌─────────────────┐
│     Config      │
│  (Singleton)    │
└────────┬────────┘
         │ provides config
         │
         ▼
┌─────────────────┐      ┌─────────────────┐
│ LangSmithClient │◄─────│     Tracer      │
│                 │ uses │                 │
└─────────────────┘      └────────┬────────┘
                                  │
                                  │ creates
                                  ▼
                          ┌─────────────────┐
                          │       Run       │
                          │  (Data Model)   │
                          └─────────────────┘
```

### Tracing Flow

```
User Code
    │
    ├─► TracerFactory.create() ──┐
    │                              │
    ├─► Tracer::new() ────────────┼──► Tracer
    │                              │      │
    └─► trace_node() ──────────────┘      │
                                         │
                                         ├─► Run (created)
                                         │
                                         ├─► LangSmithClient.post_run()
                                         │      │
                                         │      └─► HTTP POST /runs
                                         │
                                         ├─► Function execution
                                         │
                                         ├─► Run.end(outputs)
                                         │
                                         └─► LangSmithClient.patch_run()
                                                │
                                                └─► HTTP PATCH /runs/{id}
```

### Parent-Child Relationship

```
Root Tracer
    │
    ├─► trace_id: UUID-1
    ├─► dotted_order: "20240101T120000Zuuid-1"
    │
    └─► create_child()
            │
            ├─► parent_run_id: UUID-1
            ├─► trace_id: UUID-1 (inherited)
            └─► dotted_order: "20240101T120000Zuuid-1.20240101T120100Zuuid-2"
                    │
                    └─► create_child()
                            │
                            ├─► parent_run_id: UUID-2
                            ├─► trace_id: UUID-1 (inherited)
                            └─► dotted_order: "...uuid-1....uuid-2....uuid-3"
```
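The way `dotted_order` grows with each `create_child()` call can be sketched with plain string helpers. This is a minimal illustration, not the crate's implementation: the function names `root_dotted_order`, `child_dotted_order`, and `depth` are hypothetical, and the timestamp layout simply follows the diagram above.

```rust
/// Builds the `dotted_order` of a root run: timestamp immediately followed by the run id.
/// (Hypothetical helper; the timestamp format is copied from the diagram, not from the crate.)
fn root_dotted_order(timestamp: &str, run_id: &str) -> String {
    format!("{timestamp}{run_id}")
}

/// Builds a child's `dotted_order` by appending its own segment to the parent's,
/// separated by a single '.'.
fn child_dotted_order(parent: &str, timestamp: &str, run_id: &str) -> String {
    format!("{parent}.{}", root_dotted_order(timestamp, run_id))
}

/// Nesting depth of a run: 0 for a root, 1 for its children, and so on.
/// Assumes segments themselves contain no '.' characters.
fn depth(dotted_order: &str) -> usize {
    dotted_order.matches('.').count()
}
```

Because every descendant carries its ancestors' segments as a prefix, sorting runs by `dotted_order` keeps each subtree contiguous and in creation order.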

## Data Flow

### 1. Initialization Flow

```
Application Start
    │
    ├─► langsmith_rust::init()
    │      │
    │      └─► dotenvy::dotenv() ──► Load .env file
    │
    └─► Config::get() ──► Config::from_env()
            │
            ├─► Read LANGSMITH_TRACING
            ├─► Read LANGSMITH_ENDPOINT
            ├─► Read LANGSMITH_API_KEY
            └─► Read LANGSMITH_PROJECT
```
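The environment-loading step could look roughly like the sketch below. `EnvConfigSketch` and its field names are hypothetical stand-ins for the crate's `Config`, and treating only the value `true` as "enabled" is an assumption. The lookup is passed in as a closure so the parsing logic can be exercised without touching the process environment.

```rust
use std::env;

/// Hypothetical mirror of the crate's `Config`; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct EnvConfigSketch {
    tracing_enabled: bool,
    endpoint: Option<String>,
    api_key: Option<String>,
    project: Option<String>,
}

impl EnvConfigSketch {
    /// Builds a config from any key -> value lookup.
    fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Self {
        // Assumption: tracing is on only when the variable is literally "true"
        // (case-insensitive, surrounding whitespace ignored).
        let tracing_enabled = lookup("LANGSMITH_TRACING")
            .map(|value| value.trim().eq_ignore_ascii_case("true"))
            .unwrap_or(false);
        EnvConfigSketch {
            tracing_enabled,
            endpoint: lookup("LANGSMITH_ENDPOINT"),
            api_key: lookup("LANGSMITH_API_KEY"),
            project: lookup("LANGSMITH_PROJECT"),
        }
    }

    /// Reads the same four variables from the real process environment.
    fn from_env() -> Self {
        Self::from_lookup(|key| env::var(key).ok())
    }
}
```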

### 2. Tracing Flow (trace_node)

```
trace_node("node_name", RunType::Llm, inputs, function)
    │
    ├─► Check Config::is_tracing_enabled()
    │      │
    │      └─► If false: Execute function directly, return
    │
    ├─► ensure_inputs_object(inputs) ──► Convert to JSON object
    │      │
    │      └─► If primitive: wrap in {"input": value}
    │
    ├─► Tracer::new("node_name", RunType::Llm, inputs_json)
    │      │
    │      ├─► Run::new() ──► Generate UUID, timestamp
    │      └─► Set session_name from Config
    │
    ├─► tracer.post() ──► POST /runs
    │      │
    │      ├─► Initialize trace_id if root
    │      ├─► Generate dotted_order
    │      └─► LangSmithClient.post_run()
    │             │
    │             └─► HTTP POST to LangSmith API
    │
    ├─► Execute function(inputs)
    │      │
    │      ├─► Success ──┐
    │      │              │
    │      └─► Error ─────┼──► tracer.set_error()
    │                     │
    ├─► ensure_outputs_object(output) ──► Convert to JSON object
    │      │
    │      └─► If primitive: wrap in {"output": value}
    │
    ├─► tracer.end(outputs_json)
    │      │
    │      └─► Run.end() ──► Set outputs, end_time
    │
    └─► tracer.patch() ──► PATCH /runs/{id}
            │
            ├─► RunUpdate::from(&run)
            └─► LangSmithClient.patch_run()
                   │
                   └─► HTTP PATCH to LangSmith API
```
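The ordering of steps in this flow can be illustrated with a synchronous, std-only sketch. `RecordingTracer` and `trace_node_sketch` are hypothetical: the recorder only logs which lifecycle call happened, and the real `trace_node` is async and talks to the LangSmith API.

```rust
/// Hypothetical stand-in for `Tracer` that records lifecycle calls instead of doing HTTP.
#[derive(Default)]
struct RecordingTracer {
    events: Vec<String>,
}

impl RecordingTracer {
    fn post(&mut self, name: &str) {
        self.events.push(format!("post:{name}"));
    }
    fn set_error(&mut self, message: &str) {
        self.events.push(format!("error:{message}"));
    }
    fn end(&mut self) {
        self.events.push("end".to_string());
    }
    fn patch(&mut self) {
        self.events.push("patch".to_string());
    }
}

/// Runs `function`, wrapping it in post / end / patch calls when tracing is enabled.
fn trace_node_sketch<I, O, F>(
    tracing_enabled: bool,
    tracer: &mut RecordingTracer,
    name: &str,
    input: I,
    function: F,
) -> Result<O, String>
where
    F: FnOnce(I) -> Result<O, String>,
{
    // Early return: with tracing disabled the function runs untouched.
    if !tracing_enabled {
        return function(input);
    }
    tracer.post(name);
    let result = function(input);
    // A failing function is recorded on the run, then the run is still closed.
    if let Err(message) = &result {
        tracer.set_error(message);
    }
    tracer.end();
    tracer.patch();
    // The caller always gets the function's own result back.
    result
}
```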

### 3. Hierarchical Tracing Flow

```
Root Node (Chain)
    │
    ├─► Tracer::new("root", RunType::Chain, inputs)
    │      │
    │      ├─► trace_id = run.id (UUID-1)
    │      └─► dotted_order = "20240101T120000Zuuid-1"
    │
    ├─► tracer.post() ──► POST /runs (root run)
    │
    └─► Child Node (LLM)
            │
            ├─► tracer.create_child("llm", RunType::Llm, inputs)
            │      │
            │      ├─► parent_run_id = parent.run.id
            │      ├─► trace_id = parent.trace_id (UUID-1)
            │      └─► dotted_order = parent + ".20240101T120100Zuuid-2"
            │
            ├─► child.post() ──► POST /runs (child run)
            │
            ├─► Execute LLM call
            │
            ├─► child.end(outputs)
            │
            └─► child.patch() ──► PATCH /runs/{child_id}
```

## GraphTrace Flow (LangGraph-like executors)

Many agent runtimes (like TeachMeWowAgent’s Graph DSL) execute a graph as a sequence of node iterations. The recommended integration is:

- Create a single **root run** using `GraphTrace::start_root(...)`.
- For each node execution, create a **node RunScope** using `start_node_iteration(...)`.
- Trace nested operations (LLM/tool/decisions) using helper methods.
- Close the node scope with `end_ok(...)` or `end_error(...)`.
- Close the root run with `end_root(...)`.

```mermaid
sequenceDiagram
participant Exec as GraphExecutor
participant Trace as GraphTrace
participant Scope as RunScope
participant LangSmith as LangSmithAPI

Exec->>Trace: start_root(inputs,thread_id)
Trace->>LangSmith: POST /runs (root)

loop each_node_iteration
  Exec->>Trace: start_node_iteration(node,inputs)
  Trace->>LangSmith: POST /runs (child)
  Exec->>Scope: end_ok(outputs) OR end_error(err)
  Scope->>LangSmith: PATCH /runs/{id}
end

Exec->>Trace: end_root(outputs)
Trace->>LangSmith: PATCH /runs/{root_id}
```
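The executor loop from the sequence diagram can be sketched with mock types that log the HTTP calls they would make. `GraphTraceSketch`, `RunScopeSketch`, `new_log`, and `run_graph` are hypothetical, and the method signatures are simplified guesses: only the method names come from this document, and the real `GraphTrace` and `RunScope` may take different arguments.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared event log standing in for the LangSmith API.
type Log = Rc<RefCell<Vec<String>>>;

fn new_log() -> Log {
    Rc::new(RefCell::new(Vec::new()))
}

/// Hypothetical mock of `GraphTrace`: owns the root run and hands out node scopes.
struct GraphTraceSketch {
    log: Log,
    next_id: u32,
}

/// Hypothetical mock of `RunScope`: one node iteration that must be closed exactly once.
struct RunScopeSketch {
    log: Log,
    id: u32,
}

impl GraphTraceSketch {
    fn start_root(log: Log) -> Self {
        log.borrow_mut().push("POST /runs (root)".to_string());
        GraphTraceSketch { log, next_id: 1 }
    }

    fn start_node_iteration(&mut self, node: &str) -> RunScopeSketch {
        let id = self.next_id;
        self.next_id += 1;
        self.log.borrow_mut().push(format!("POST /runs (child {id}: {node})"));
        RunScopeSketch { log: Rc::clone(&self.log), id }
    }

    fn end_root(self) {
        self.log.borrow_mut().push("PATCH /runs/root".to_string());
    }
}

impl RunScopeSketch {
    // Both methods take `self` by value, so a scope cannot be closed twice.
    fn end_ok(self) {
        self.log.borrow_mut().push(format!("PATCH /runs/{} (ok)", self.id));
    }

    fn end_error(self, err: &str) {
        self.log.borrow_mut().push(format!("PATCH /runs/{} (error: {err})", self.id));
    }
}

/// Executes each node in order, opening and closing one scope per iteration.
fn run_graph(log: Log, nodes: &[(&str, Result<(), &str>)]) {
    let mut trace = GraphTraceSketch::start_root(log);
    for (node, outcome) in nodes {
        let scope = trace.start_node_iteration(node);
        match outcome {
            Ok(()) => scope.end_ok(),
            Err(err) => scope.end_error(err),
        }
    }
    trace.end_root();
}
```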

## Design Patterns

### 1. Strategy Pattern

**Purpose**: Allow different implementations of tracing and serialization

**Implementation**:
- `TracingStrategy` trait - defines interface for tracing operations
- `AsyncTracingStrategy` - async implementation
- `SyncTracingStrategy` - sync implementation (uses a blocking runtime)
- `SerializationStrategy` trait - defines interface for serialization
- `DefaultSerializationStrategy` - wraps primitives in objects

**Benefits**:
- Easy to add new strategies (e.g., BatchTracingStrategy)
- Testable with mock strategies
- Runtime selection of strategy
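To make the benefits concrete, here is a small sketch of a mock strategy and a batching strategy behind one trait. The trait `TracingStrategySketch` and its single `send` method are hypothetical simplifications, since the real `TracingStrategy` methods are not listed in this document, and `BatchStrategy` only illustrates the `BatchTracingStrategy` idea mentioned above.

```rust
/// Hypothetical, simplified stand-in for the `TracingStrategy` trait.
trait TracingStrategySketch {
    fn send(&mut self, payload: &str) -> Result<(), String>;
}

/// Mock strategy for tests: remembers every payload, performs no I/O.
#[derive(Default)]
struct RecordingStrategy {
    sent: Vec<String>,
}

impl TracingStrategySketch for RecordingStrategy {
    fn send(&mut self, payload: &str) -> Result<(), String> {
        self.sent.push(payload.to_string());
        Ok(())
    }
}

/// Buffers payloads and "flushes" them in groups of `batch_size`.
struct BatchStrategy {
    batch_size: usize,
    buffer: Vec<String>,
    flushed: Vec<Vec<String>>,
}

impl BatchStrategy {
    fn new(batch_size: usize) -> Self {
        BatchStrategy { batch_size, buffer: Vec::new(), flushed: Vec::new() }
    }
}

impl TracingStrategySketch for BatchStrategy {
    fn send(&mut self, payload: &str) -> Result<(), String> {
        self.buffer.push(payload.to_string());
        if self.buffer.len() >= self.batch_size {
            // Move the full buffer out and leave an empty one behind.
            self.flushed.push(std::mem::take(&mut self.buffer));
        }
        Ok(())
    }
}

/// Caller code depends only on the trait, so the strategy can be chosen at runtime.
/// Returns how many payloads the strategy accepted.
fn trace_all(strategy: &mut dyn TracingStrategySketch, payloads: &[&str]) -> usize {
    let mut accepted = 0;
    for payload in payloads {
        if strategy.send(payload).is_ok() {
            accepted += 1;
        }
    }
    accepted
}
```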

### 2. Factory Pattern

**Purpose**: Centralize creation of complex objects (Tracers)

**Implementation**:
- `TracerFactory` - static factory methods
- Methods: `create()`, `create_with_thread()`, `create_root()`, `create_for_node()`

**Benefits**:
- Consistent tracer creation
- Hides complexity of configuration
- Easy to extend with new creation patterns

### 3. Observer Pattern

**Purpose**: Allow nodes to be observed for events (start, end, error)

**Implementation**:
- `Observer` trait - defines observer interface
- `Observable` trait - defines observable interface
- `ObservableNodeWrapper` - wraps nodes to make them observable
- `LangSmithObserver` - concrete observer that traces to LangSmith

**Benefits**:
- Decouples tracing from node execution
- Multiple observers can be attached
- Easy to add logging, metrics, etc.
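A rough sketch of how a wrapper might notify several observers around one node execution follows. All names ending in `Sketch`, plus `RecordingObserver`, are hypothetical. The callbacks omit the `inputs` argument of the real `Observer` trait to stay free of external crates, and the `on_node_end` and `on_node_error` names are assumed from the "start, end, error" events listed above.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical, simplified observer interface (the real trait also receives inputs).
trait ObserverSketch: Send + Sync {
    fn on_node_start(&self, node_name: &str);
    fn on_node_end(&self, node_name: &str);
    fn on_node_error(&self, node_name: &str, error: &str);
}

/// Observer that records events; a `Mutex` is needed because callbacks take `&self`.
#[derive(Default)]
struct RecordingObserver {
    events: Mutex<Vec<String>>,
}

impl RecordingObserver {
    fn record(&self, event: String) {
        self.events.lock().unwrap().push(event);
    }
    fn snapshot(&self) -> Vec<String> {
        self.events.lock().unwrap().clone()
    }
}

impl ObserverSketch for RecordingObserver {
    fn on_node_start(&self, node_name: &str) {
        self.record(format!("start:{node_name}"));
    }
    fn on_node_end(&self, node_name: &str) {
        self.record(format!("end:{node_name}"));
    }
    fn on_node_error(&self, node_name: &str, error: &str) {
        self.record(format!("error:{node_name}:{error}"));
    }
}

/// Hypothetical wrapper: runs a node and fans events out to every attached observer.
struct NodeWrapperSketch {
    name: String,
    observers: Vec<Arc<dyn ObserverSketch>>,
}

impl NodeWrapperSketch {
    fn new(name: &str) -> Self {
        NodeWrapperSketch { name: name.to_string(), observers: Vec::new() }
    }

    fn attach(&mut self, observer: Arc<dyn ObserverSketch>) {
        self.observers.push(observer);
    }

    fn run<T>(&self, node: impl FnOnce() -> Result<T, String>) -> Result<T, String> {
        for observer in &self.observers {
            observer.on_node_start(&self.name);
        }
        let result = node();
        for observer in &self.observers {
            match &result {
                Ok(_) => observer.on_node_end(&self.name),
                Err(error) => observer.on_node_error(&self.name, error),
            }
        }
        result
    }
}
```

The node closure never sees the observers, which is the decoupling the pattern is meant to provide.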

## Module Responsibilities

### config/
- **Responsibility**: Load and manage configuration from environment
- **Key Types**: `Config`
- **Pattern**: Singleton (using `Lazy<Mutex<Option<Config>>>`)

### client/
- **Responsibility**: HTTP communication with LangSmith API
- **Key Types**: `LangSmithClient`
- **Methods**: `post_run()`, `patch_run()`

### models/
- **Responsibility**: Data structures representing LangSmith entities
- **Key Types**: `Run`, `RunType`, `RunUpdate`, `Message`, `Metrics`
- **Note**: Pure data structures, no business logic

### tracing/
- **Responsibility**: Core tracing logic and context propagation
- **Key Types**: `Tracer`, `TraceContext`, `GraphTrace`, `RunScope`
- **Key Functions**: `trace_node()`, `trace_node_sync()`

### strategies/
- **Responsibility**: Strategy pattern implementations
- **Key Types**: `TracingStrategy`, `SerializationStrategy`

### factories/
- **Responsibility**: Factory pattern for tracer creation
- **Key Types**: `TracerFactory`

### observability/
- **Responsibility**: Observer pattern for node observation
- **Key Types**: `Observer`, `Observable`, `ObservableNodeWrapper`

### utils/
- **Responsibility**: Utility functions (serialization, validation)
- **Key Functions**: `ensure_object()`, `validate_run()`

## Error Handling

All errors are wrapped in the `LangSmithError` enum:

```rust
pub enum LangSmithError {
    Http(reqwest::Error),
    Serialization(serde_json::Error),
    Config(String),
    TracingDisabled,
    Other(String),
}
```

**Error Propagation Strategy**:
- Tracing errors are logged but **never break application execution**
- Functions return `Result<T>`, but tracing failures are caught and logged rather than propagated to the caller
- Application code continues even if LangSmith is down
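The "log, don't propagate" rule can be sketched as a small helper. `TracingErrorSketch`, `best_effort`, and `run_node` are hypothetical: the error type is a trimmed-down stand-in for `LangSmithError` with string payloads so the example needs no external crates, and logging goes to stderr purely for illustration.

```rust
use std::fmt;

/// Simplified stand-in for `LangSmithError` (assumption: string payloads only).
#[derive(Debug, PartialEq)]
enum TracingErrorSketch {
    Http(String),
    TracingDisabled,
}

impl fmt::Display for TracingErrorSketch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TracingErrorSketch::Http(message) => write!(f, "HTTP error: {message}"),
            TracingErrorSketch::TracingDisabled => write!(f, "tracing is disabled"),
        }
    }
}

/// Converts a tracing failure into a log line and `None`, so callers cannot
/// accidentally bubble it up with `?`.
fn best_effort<T>(step: &str, result: Result<T, TracingErrorSketch>) -> Option<T> {
    match result {
        Ok(value) => Some(value),
        Err(error) => {
            eprintln!("langsmith tracing: {step} failed: {error}");
            None
        }
    }
}

/// Application work proceeds whether or not the tracing call succeeded.
fn run_node(post_result: Result<(), TracingErrorSketch>, input: i32) -> i32 {
    let _ = best_effort("post", post_result);
    input * 2
}
```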

## Thread Safety

- `Config`: Thread-safe singleton using `Lazy<Mutex<Option<Config>>>`
- `Tracer`: Not thread-safe (should be used within a single async task)
- `LangSmithClient`: Thread-safe (uses `Arc` internally)
- `Observer`: Thread-safe (uses `Arc<dyn Observer>`)
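The `Config` singleton shape can be approximated with the standard library alone. This sketch swaps `Lazy` for a `static` built with the const `Mutex::new` (available since Rust 1.63), which gives the same "load once, then share" behaviour. `CachedConfig` and `get_config` are hypothetical names, and passing the loader as a closure is a choice made here for testability.

```rust
use std::sync::Mutex;

/// Hypothetical config payload; the real `Config` has more fields.
#[derive(Debug, Clone, PartialEq)]
struct CachedConfig {
    project: String,
}

/// Empty until first use; the `Mutex` serialises concurrent first-time loads.
static CONFIG: Mutex<Option<CachedConfig>> = Mutex::new(None);

/// Returns the cached config, running `load` only if nothing is cached yet.
fn get_config(load: impl FnOnce() -> CachedConfig) -> CachedConfig {
    // Recover the guard even if another thread panicked while holding the lock.
    let mut guard = CONFIG.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    guard.get_or_insert_with(load).clone()
}
```

Keeping the value inside an `Option` is what makes the load lazy: later calls find `Some(..)` and skip their loader.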

## Performance Considerations

1. **Lazy Initialization**: Config is loaded only when needed
2. **Non-blocking**: HTTP calls are async; only `SyncTracingStrategy` blocks, by driving them on a blocking runtime
3. **Efficient Serialization**: Uses `serde_json` (fast)
4. **Minimal Allocations**: Reuses clients where possible
5. **Early Returns**: Checks `is_tracing_enabled()` before any work

## Extension Points

### Adding a New Run Type

```rust
// In models/run.rs
pub enum RunType {
    // ... existing variants
    Custom(String),
    MyNewType,  // Add here
}
```

### Adding a New Strategy

```rust
// In strategies/
pub struct MyNewStrategy;

impl TracingStrategy for MyNewStrategy {
    // Implement trait methods
}
```

### Adding a New Observer

```rust
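// Assumes `Value` is `serde_json::Value`; `Observer` comes from this crate's observability module.
use serde_json::Value;

pub struct MyObserver;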

impl Observer for MyObserver {
    fn on_node_start(&self, node_name: &str, inputs: &Value) {
        // Your logic
    }
    // ... implement other methods
}
```