# Case Study: Lord of the Rings - The Council of Elrond
This guide provides a comprehensive technical walkthrough for building a state-of-the-art narrative generation pipeline using the **Skill-based Workflow** system in `llama-cpp-v3-agent-sdk`. We will orchestrate a specialized "Pipeline of Experts" to generate high-stakes drama from *The Lord of the Rings*.
---
## 1. Core Philosophy: The Pipeline of Experts
Traditional "Mega-Prompts" often suffer from **Instruction Bleed**—a phenomenon where the LLM, overwhelmed by too many constraints, begins to ignore formatting rules, skips character nuances, or forgets specific lore requirements.
`llama-cpp-v3-agent-sdk` solves this through **Agentic Isolation**. By splitting a complex task into multiple specialized steps, we ensure:
- **Focused Context**: Each agent only sees the information it needs for its specific task.
- **Strict Formatting**: Smaller prompts allow for 100% adherence to complex JSON schemas or Markdown structures.
- **VRAM Efficiency**: Only one agent's context is active at a time, allowing high-fidelity generation on consumer hardware.
---
## 2. Anatomy of a "Skill"
A **Skill** is a self-contained module that encapsulates a specific workflow. This modularity allows developers to swap out "Writer" or "Critic" prompts without changing the application code.
### Skill Directory Structure
```text
skills/lotr-scene-generator/
├── SKILL.md              # Metadata & Requirements
├── workflow.json         # Orchestration logic
├── prompts/              # Specialized agent instructions
│   ├── planner.md
│   ├── writer.md
│   └── critic.md
├── schemas/              # JSON validation schemas
│   └── review_report.json
└── references/           # Lore & Context
    └── lore_reference.md
```
### 2.1 Skill Metadata (`SKILL.md`)
This file serves as the documentation for the skill, outlining its purpose and required input context.
```markdown
---
name: lotr-scene-generator
description: Generates high-fidelity dramatic scenes set in Middle-earth, tuned for the 'Council of Elrond' style of debate.
---
# Lord of the Rings: Council Scene Generator
This skill generates high-fidelity dramatic scenes set in Middle-earth. It is specifically tuned for the 'Council of Elrond' style of debate.
## Required Input Context
- `outline`: (String) A brief summary of the confrontation.
- `characters`: (Array) List of Tolkien characters to include.
- `lore_strictness`: (Number 0-1) How strictly to adhere to canon.
```
### 2.2 The Planner Prompt (`prompts/planner.md`)
The Planner is responsible for structure. It must output clean JSON.
```markdown
# Role: Middle-earth Scene Architect
You are an expert at narrative structure and Tolkien's storytelling patterns.
# Task
Deconstruct the provided scene outline into a detailed beat sheet.
# Requirements
1. Define 3-5 distinct emotional shifts.
2. Specify the "Lore Anchor" for this scene (e.g., the history of Isildur).
3. Identify the core conflict for each character.
# Output Format
You MUST output a valid JSON object following this structure:
{
  "beats": [{"description": "string", "emotion": "string"}],
  "lore_anchor": "string",
  "character_goals": {"character_name": "string"}
}
```
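For illustration, a planner response conforming to this structure might look like the following (all values are hypothetical):

```json
{
  "beats": [
    {"description": "Boromir proposes wielding the Ring as a weapon.", "emotion": "desperate hope"},
    {"description": "Aragorn reveals his lineage and counters him.", "emotion": "restrained authority"},
    {"description": "Elrond recounts Isildur's failure at Mount Doom.", "emotion": "grave warning"}
  ],
  "lore_anchor": "the history of Isildur",
  "character_goals": {"Boromir": "Secure the Ring for Gondor's defense"}
}
```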
### 2.3 The Writer Prompt (`prompts/writer.md`)
The Writer focuses on dialogue and prose. It receives the Planner's JSON output.
```markdown
# Role: Epic Fantasy Dramatist
You are a master of dialogue, subtext, and the specific voices of Middle-earth.
# Input Specification
You will receive a `beats` object from the Architect.
# Voice Guidelines
- Elrond: Ancient, weary but hopeful, authoritative.
- Boromir: Proud, desperate, uses military metaphors.
- Aragorn: Quietly noble, guarded, uses archaic but simple speech.
# Task
Write the full screenplay for the scene. Use the provided beats to drive the tension.
```
### 2.4 The Critic Schema (`schemas/review_report.json`)
By providing a schema, you ensure the Critic's feedback is actionable by the engine.
```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "overall_score": { "type": "number", "minimum": 0, "maximum": 1 },
    "lore_errors_found": { "type": "boolean" },
    "voice_consistency": { "type": "string" },
    "must_fix_notes": { "type": "array", "items": { "type": "string" } }
  },
  "required": ["overall_score", "lore_errors_found", "must_fix_notes"]
}
```
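A report that validates against this schema might look like this (hypothetical values):

```json
{
  "overall_score": 0.85,
  "lore_errors_found": true,
  "voice_consistency": "Boromir slips into modern idiom in beat 3.",
  "must_fix_notes": ["Isildur cut the Ring from Sauron's hand; he did not destroy it."]
}
```

Because `lore_errors_found` is a boolean, the engine can branch on it directly rather than parsing free-form prose feedback.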
### 2.5 Lore Reference (`references/lore_reference.md`)
Static reference files bundled with the skill ground the agents in your specific world.
```markdown
# The Nature of the One Ring
- It cannot be used for good, no matter the intent.
- It corrupts through the user's desire to do good (e.g., Boromir's desire to save Gondor).
- Only in the fires of Mount Doom can it be unmade.
```
---
## 3. The Orchestration Logic (`workflow.json`)
The `workflow.json` file is the declarative manifest of your pipeline. It manages data dependencies and control flow.
```json
{
  "name": "Middle-earth Narrative Pipeline",
  "steps": [
    {
      "name": "planner",
      "description": "Constructing scene architecture for Rivendell...",
      "agent_prompt": "prompts/planner.md",
      "temperature": 0.2,
      "output_type": "json"
    },
    {
      "name": "writer",
      "description": "Drafting the Council of Elrond screenplay...",
      "agent_prompt": "prompts/writer.md",
      "temperature": 0.8,
      "stop_sequences": ["# END OF SCENE"],
      "input_mapping": {
        "beats": "planner"
      }
    },
    {
      "name": "critic",
      "description": "Validating Ring-lore and character voices...",
      "agent_prompt": "prompts/critic.md",
      "temperature": 0.1,
      "output_type": "json",
      "input_mapping": {
        "initial_outline": "outline",
        "draft": "writer"
      }
    },
    {
      "name": "rewrite",
      "description": "Refining the dialogue based on Critic's lore check...",
      "agent_prompt": "prompts/rewrite.md",
      "conditional": "critic.lore_errors_found",
      "input_mapping": {
        "original_draft": "writer",
        "lore_report": "critic"
      }
    }
  ]
}
```
### In-Depth Feature Explanations
#### A. Input Mapping (`input_mapping`)
This is the "wiring" of your pipeline. By default, every step receives the initial global context. However, `input_mapping` allows you to inject results from previous steps into specific keys.
- **Example**: The `writer` step above will receive a JSON object with a key `"beats"` containing the result of the `planner` step. The engine handles the lookup and merging automatically.
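The lookup-and-merge behavior can be sketched in plain Rust. This is an illustration of the semantics, not the SDK's actual implementation; the function name is hypothetical, and the real engine operates on parsed JSON values rather than flat string maps:

```rust
use std::collections::HashMap;

/// Build a step's context: start from the global context, then inject prior
/// step results under the keys named in `input_mapping`
/// (target key -> source step name). Hypothetical sketch of the engine's wiring.
fn build_step_context(
    global: &HashMap<String, String>,
    artifacts: &HashMap<String, String>,
    input_mapping: &HashMap<String, String>,
) -> HashMap<String, String> {
    let mut ctx = global.clone();
    for (target_key, source_step) in input_mapping {
        if let Some(result) = artifacts.get(source_step) {
            ctx.insert(target_key.clone(), result.clone());
        }
    }
    ctx
}
```

With the mapping `{"beats": "planner"}`, the writer's context is the global context plus a `beats` key holding the planner's stored artifact.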
#### B. Conditional Execution (`conditional`)
The engine uses dot-notation (e.g., `critic.lore_errors_found`) to determine if a step should run.
1. The engine fetches the `"critic"` result.
2. It attempts to parse it as JSON.
3. It checks the value of `"lore_errors_found"`. If `false`, the `rewrite` step is entirely bypassed.
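A toy evaluator for such conditionals is sketched below. It is deliberately naive (a substring scan over a flat, well-formed JSON object, no real parser) and the function name is hypothetical; it only illustrates the "fetch artifact, check boolean field" flow:

```rust
/// Toy evaluator for a `step.field` conditional against a stored JSON artifact.
/// Scans a flat JSON object for `"field": true`; a real engine would parse the
/// JSON properly instead of matching substrings.
fn condition_is_true(artifact_json: &str, field: &str) -> bool {
    let needle = format!("\"{}\"", field);
    if let Some(pos) = artifact_json.find(&needle) {
        let rest = &artifact_json[pos + needle.len()..];
        // Skip the colon and any whitespace, then check the literal value.
        rest.trim_start_matches(|c: char| c == ':' || c.is_whitespace())
            .starts_with("true")
    } else {
        false
    }
}
```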
#### C. Output Types
Setting `output_type: "json"` triggers an internal "Sanitization Pass." The engine extracts the `{...}` block from the agent's output, cleans control characters, and parses it. This ensures that downstream agents receive clean data objects rather than raw strings with conversational filler.
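The extraction half of that pass can be approximated in a few lines of std-only Rust. This is illustrative only (the function name is hypothetical, and the engine's actual pass also validates the parsed JSON):

```rust
/// Sketch of a sanitization pass: pull the outermost `{...}` span out of a
/// chatty model response and strip ASCII control characters (newlines kept).
fn extract_json_block(raw: &str) -> Option<String> {
    let start = raw.find('{')?;
    let end = raw.rfind('}')?;
    if end < start {
        return None;
    }
    let block: String = raw[start..=end]
        .chars()
        .filter(|c| !c.is_control() || *c == '\n')
        .collect();
    Some(block)
}
```

Given `"Sure! Here is the plan: {\"beats\": []}"`, this yields just `{"beats": []}`, which downstream steps can parse directly.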
---
## 4. State Persistence: Implementing `WorkflowStorage`
For a professional system, losing progress due to a network error or crash is unacceptable. The `WorkflowEngine` uses a **Stateless Engine + Stateful Storage** pattern.
By implementing `WorkflowStorage` (e.g., using SQLite), you enable **Stateful Resumption**.
### Implementation Example (SQLite)
```rust
// Assumes the `rusqlite` crate for SQLite access; `SqliteArchive` is a
// user-defined type wrapping a `rusqlite::Connection` in its `conn` field.
use llama_cpp_v3_agent_sdk::workflow::{Result, WorkflowStorage};
use rusqlite::params;
use std::collections::HashMap;

impl WorkflowStorage for SqliteArchive {
    fn insert_artifact(&self, session_id: &str, artifact_type: &str, content: &str, is_json: bool) -> Result<()> {
        // Record the artifact; 'artifact_type' maps to 'step.name' in workflow.json.
        self.conn.execute(
            "INSERT INTO artifacts (session_id, step_name, content) VALUES (?1, ?2, ?3)",
            params![session_id, artifact_type, content],
        )?;
        Ok(())
    }

    fn get_latest_artifacts(&self, session_id: &str) -> Result<HashMap<String, String>> {
        // Load the most recent artifact per step for this session.
        // The engine uses this map to populate context for resumed runs.
        let mut results = HashMap::new();
        // ... fetch from DB ...
        Ok(results)
    }
}
```
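A minimal backing table for this storage layer might look like the following (column names assumed to match the `INSERT` statement above; the timestamp default supports "latest artifact per step" queries):

```sql
CREATE TABLE IF NOT EXISTS artifacts (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id  TEXT NOT NULL,
    step_name   TEXT NOT NULL,
    content     TEXT NOT NULL,
    created_at  TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX IF NOT EXISTS idx_artifacts_session ON artifacts (session_id, step_name);
```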
---
## 5. Execution & Lifecycle Management
The `WorkflowEngine` manages the VRAM lifecycle of agents. To keep memory usage low, it follows a **Build-Run-Drop** pattern:
1. **Build**: Constructs an `Agent` using the shared `InferenceEngine` and `InferenceScheduler`.
2. **Run**: Executes the inference and streams tokens.
3. **Drop**: The agent and its associated `LlamaContext` are dropped immediately after completion, freeing VRAM for the next step.
### Running the Council of Elrond Pipeline
```rust
// Assumes `serde_json` for the `json!` macro and std `io` for flushing stdout.
use std::io::{self, Write};

use llama_cpp_v3_agent_sdk::workflow::{PipelineEvent, WorkflowEngine};
use serde_json::json;

let engine = WorkflowEngine::new(inference_engine, scheduler, my_storage, skills_path);

// 1. Define the scene requirements
let context = json!({
    "outline": "Boromir demands the Ring for the defense of Gondor. Aragorn reveals himself as the Heir of Isildur.",
    "tone": "Grandiose and Tense"
});

// 2. Start the orchestrated run
let results = engine.run(
    "lotr-scene-generator",
    "council-session-001",
    context,
    None,   // resume_from_step: restart from a specific failed point
    vec![], // force_regenerate: ignore the cache and redo specific steps
    |event| match event {
        PipelineEvent::StepStarted { name, .. } => {
            println!("\n[PHASE: {}]", name.to_uppercase());
        },
        PipelineEvent::Token { token, .. } => {
            print!("{}", token);
            io::stdout().flush().unwrap();
        },
        PipelineEvent::Processing { message, .. } => {
            println!("\n[ENGINE]: {}", message);
        },
        _ => {}
    },
).await?;
```
---
## 6. Advanced: Output Post-Processing
Sometimes agents generate syntax that is slightly off (e.g., they might use `(Action)` instead of the required `[ACTION]` tag). You can define **Post-Processing Rules** in your skill:
### `skills/lotr-scene-generator/schemas/post_process.json`
```json
{
  "rules": [
    {
      "pattern": "\\((.*?)\\)",
      "replacement": "[ACTION: $1]"
    }
  ]
}
```
The engine automatically applies these regex-based rules to the `writer` and `rewrite` steps before persisting the results. This ensures your final data is always compliant with your application's requirements.
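The effect of this particular rule can be reproduced without a regex dependency. The sketch below (hypothetical function name, std-only) mirrors the non-greedy `(.*?)` pattern by matching each `(` to the nearest following `)`:

```rust
/// Rewrite every non-nested `(...)` group as `[ACTION: ...]`, mimicking the
/// regex rule `\((.*?)\)` -> `[ACTION: $1]`. Sketch only; the engine applies
/// arbitrary regex-based rules.
fn rewrite_actions(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while let Some(open) = rest.find('(') {
        if let Some(close_rel) = rest[open..].find(')') {
            let close = open + close_rel;
            out.push_str(&rest[..open]);      // copy text before the group
            out.push_str("[ACTION: ");
            out.push_str(&rest[open + 1..close]); // the captured action text
            out.push(']');
            rest = &rest[close + 1..];
        } else {
            break; // unmatched '(' — leave the remainder untouched
        }
    }
    out.push_str(rest);
    out
}
```

For example, `(He stands) I will take it.` becomes `[ACTION: He stands] I will take it.`.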
---
## 7. Developer Best Practices
1. **Low Temperature for Logic**: Set `temperature: 0.1` or `0.2` for `planner` and `critic` steps to ensure deterministic and logical results.
2. **High Temperature for Prose**: Set `temperature: 0.8` for the `writer` to allow for varied and creative dialogue.
3. **Schema Validation**: Always use `output_type: "json"` for steps that drive logic (like the `critic`), as this allows for robust conditional branching.
4. **Session Isolation**: Use unique `session_id`s for every generation request to prevent data corruption in the storage layer.