adk-skill
AgentSkills parser, index, matcher, and runtime injection helpers for ADK-Rust.
Overview
adk-skill is the engine for specification-driven agent skills, implementing the agentskills.io specification. It provides the core building blocks to discover, parse, and index skill metadata, enabling agents to dynamically configure their behavior based on structured Markdown definitions.
This crate is provider-agnostic and can be used through:
- `adk-agent` (`LlmAgentBuilder::with_skills*`)
- `adk-runner` (`Runner::with_auto_skills`)
- Direct API calls from custom runtimes
Supported File Conventions
agentskills.io Compliance (.skills/**/*.md)
Each skill is a Markdown file with YAML frontmatter following the agentskills.io standard. This structure allows for both human-readable instructions and machine-readable governance.
name: zenith-voice-receptionist
description: Official voice persona for Mario's Plumbing Co. receptionist.
version: "1.1.0"
license: MIT
compatibility: Gemini Live 2.5 Flash Native Audio
allowed-tools:
- - - -
Specification Fields
| Field | Required | Description |
|---|---|---|
| `name` | Yes | Unique identifier (lowercase, numbers, hyphens). |
| `description` | Yes | Concise summary of the skill's purpose for agent selection. |
| `version` | No | Semantic version of the skill. |
| `license` | No | License identifier (e.g., MIT, Apache-2.0). |
| `compatibility` | No | Environment or model constraints. |
| `tags` | No | List of discovery and filtering labels. |
| `allowed-tools` | No | List of tools the agent is permitted to use for this skill. |
| `references` | No | External assets (JSON, CSV) required by the skill. |
| `trigger` | No | If true, requires explicit @name invocation. |
| `hint` | No | UI guidance for user input. |
| `metadata` | No | Arbitrary key-value map for extensions. |
Parsing Strictness
adk-skill employs a dual-validation strategy based on the file's location:
- Strict Mode (`.skills/**/*.md`):
  - Mandatory: `name` and `description` must be present and non-empty.
  - Validation: Failing to provide these fields will result in a `SkillError`, and the file will not be indexed.
  - Optional: All other fields (version, license, tools, etc.) are optional and will default to empty or `None`.
- Permissive Mode (Convention Files):
  - Files like `AGENTS.md` or `SOUL.md` treat frontmatter as entirely optional.
  - If frontmatter is missing, the parser automatically derives the skill name from the filename and assigns default convention tags.
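The two modes can be pictured as one validation step that branches on where the file was found. The sketch below is illustrative only: `ParseMode`, `Frontmatter`, `ValidationError`, and `resolve_name` are hypothetical names rather than the crate's API (the real parser reports `SkillError`), and lowercasing the filename is an assumed derivation rule.

```rust
/// Hypothetical parsing mode, chosen from the file's location.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ParseMode {
    Strict,     // files under .skills/
    Permissive, // convention files such as AGENTS.md
}

/// Hypothetical, minimal view of parsed frontmatter.
#[derive(Debug, Default)]
pub struct Frontmatter {
    pub name: Option<String>,
    pub description: Option<String>,
}

#[derive(Debug, PartialEq)]
pub enum ValidationError {
    MissingField(&'static str),
}

/// Returns the skill name, or an error when a strict file lacks a required field.
pub fn resolve_name(
    mode: ParseMode,
    fm: &Frontmatter,
    file_stem: &str,
) -> Result<String, ValidationError> {
    // Treat whitespace-only values the same as missing values.
    let non_empty = |v: &Option<String>| {
        v.as_deref()
            .map(str::trim)
            .filter(|s| !s.is_empty())
            .map(str::to_string)
    };
    match mode {
        ParseMode::Strict => {
            let name = non_empty(&fm.name).ok_or(ValidationError::MissingField("name"))?;
            non_empty(&fm.description).ok_or(ValidationError::MissingField("description"))?;
            Ok(name)
        }
        // Permissive: fall back to a name derived from the filename (assumed rule).
        ParseMode::Permissive => {
            Ok(non_empty(&fm.name).unwrap_or_else(|| file_stem.to_lowercase()))
        }
    }
}
```

With this shape, a strict file missing `description` is rejected, while `AGENTS.md` without frontmatter still yields a usable name.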
Instruction Convention Files
The index loader also discovers and ingests these markdown files:
- `AGENTS.md` and `AGENT.md`
- `CLAUDE.md`
- `GEMINI.md`
- `COPILOT.md`
- `SKILLS.md`
- `SOUL.md` (root-level)
For these files, frontmatter is optional:
- If valid frontmatter is present, it is used.
- Otherwise the file is parsed as plain markdown instructions and converted into a skill document with convention tags (for example `agents-md`, `claude-md`).
Logic-as-Data
The core philosophy of adk-skill is that Agent behavior should be controlled by configuration, not just code.
By parsing allowed-tools and references, the runtime (e.g., the data_plane) can dynamically instantiate the appropriate toolset for an agent session. This enables:
- Role-Based Access Control: Limit an agent's capabilities based on the active skill.
- Pluggable Personalities: Swap personas by simply changing the active skill metadata.
- Resource Injection: Automatically load the correct reference data for specific flows.
What The Crate Does
1. Discovery
- Scans `<root>/.skills/` recursively for frontmatter skills.
- Scans `<root>` recursively for supported convention files.
- Skips common heavy directories (`.git`, `target`, `node_modules`, etc.).
- Returns deterministic sorted file paths with de-duplication.
API: discover_skill_files(root)
API: discover_instruction_files(root)
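The filtering and ordering rules above can be sketched without touching the filesystem. This is a minimal illustration, not the crate's implementation: `filter_candidates` and `SKIPPED_DIRS` are hypothetical names, and the skip list holds only the directories named above.

```rust
use std::collections::BTreeSet;
use std::path::{Component, Path, PathBuf};

/// Directories skipped during the scan (illustrative subset).
const SKIPPED_DIRS: [&str; 3] = [".git", "target", "node_modules"];

/// True when any component of `path` is one of the skipped directories.
fn in_skipped_dir(path: &Path) -> bool {
    path.components().any(|c| match c {
        Component::Normal(name) => SKIPPED_DIRS.iter().any(|d| name == *d),
        _ => false,
    })
}

/// Keeps Markdown files outside skipped directories.
/// A BTreeSet gives both de-duplication and a deterministic sort order.
pub fn filter_candidates(paths: &[PathBuf]) -> Vec<PathBuf> {
    let kept: BTreeSet<PathBuf> = paths
        .iter()
        .filter(|p| p.extension().map_or(false, |e| e == "md"))
        .filter(|p| !in_skipped_dir(p))
        .cloned()
        .collect();
    kept.into_iter().collect()
}
```

Collecting into an ordered set is one simple way to get the "sorted and de-duplicated" guarantee in a single pass.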
Tool Discovery & Validation
While adk-skill handles the declaration of tools via allowed-tools, the implementation and validation are managed via core ADK traits.
- `ToolRegistry` (Core): In adk-core, use the `ToolRegistry` trait to map string identifiers (e.g., `user_profile`) to concrete `Arc<dyn Tool>` implementations.
- `ValidationMode` (Core): Control whether the framework should strictly enforce tool availability or allow permissive binding.
- Selective Injection: Use the `ContextCoordinator` to filter available tools against a skill's `allowed_tools` list, ensuring the agent only sees authorized capabilities.
Example Flow:
1. `adk-skill` parses `allowed-tools: [weather]`.
2. Runtime looks up `"weather"` in its local registry.
3. Runtime injects the `WeatherTool` into the LLM context.
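The lookup step in this flow might look like the sketch below. It is a self-contained illustration under assumptions: the `Tool` trait here is a one-method stand-in for the core trait, `Mode` is a local stand-in rather than adk-core's `ValidationMode`, and `resolve_tools` is a hypothetical helper.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for the core `Tool` trait; the real trait has a richer interface.
pub trait Tool {
    fn name(&self) -> &str;
}

pub struct WeatherTool;

impl Tool for WeatherTool {
    fn name(&self) -> &str {
        "weather"
    }
}

/// Local stand-in for the strict/permissive switch described above.
#[derive(Clone, Copy)]
pub enum Mode {
    Strict,
    Permissive,
}

/// Resolves declared tool names against a registry.
/// Strict: any unknown name is an error. Permissive: unknown names are dropped.
pub fn resolve_tools(
    allowed: &[&str],
    registry: &HashMap<String, Arc<dyn Tool>>,
    mode: Mode,
) -> Result<Vec<Arc<dyn Tool>>, String> {
    let mut tools = Vec::new();
    for name in allowed {
        match (registry.get(*name), mode) {
            (Some(tool), _) => tools.push(Arc::clone(tool)),
            (None, Mode::Strict) => return Err(format!("unknown tool: {name}")),
            (None, Mode::Permissive) => {} // omit the missing tool
        }
    }
    Ok(tools)
}
```

Either way, only tools that actually exist in the registry reach the model.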
2. Parsing + Validation
- Strict path (`.skills/**`): parses required frontmatter as YAML with validation.
- Convention path (`AGENTS.md`, `CLAUDE.md`, etc.): parses plain markdown (or frontmatter if provided).
- Returns actionable errors with file path context for strict frontmatter paths.
API: parse_skill_markdown(path, content)
API: parse_instruction_markdown(path, content)
3. Indexing
- Builds `SkillIndex` from discovered files.
- Computes:
  - content hash (`SHA-256`)
  - `last_modified` (Unix timestamp seconds when available)
  - stable document id: `normalized-name + first-12-hash-chars`
- Sorts documents deterministically by `(name, path)`.
API: load_skill_index(root)
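To make the id and ordering rules concrete, here is a small sketch. It takes the hex-encoded hash as input rather than computing SHA-256, and it rests on assumptions: the `Doc` struct is a stand-in, the normalization rule (lowercase, other characters to `-`) is a guess, and so is the `-` separator between name and hash prefix.

```rust
/// Minimal stand-in for an indexed document (field names are illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct Doc {
    pub name: String,
    pub path: String,
    pub hash: String, // hex-encoded SHA-256 of the file content
}

/// Builds an id from a normalized name plus the first 12 hash characters.
/// Normalization rule and separator are assumptions, not the crate's contract.
pub fn document_id(name: &str, hash_hex: &str) -> String {
    let normalized: String = name
        .trim()
        .chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() {
                c.to_ascii_lowercase()
            } else {
                '-'
            }
        })
        .collect();
    let prefix: String = hash_hex.chars().take(12).collect();
    format!("{normalized}-{prefix}")
}

/// Sorts documents by (name, path) so index order is reproducible.
pub fn sort_docs(docs: &mut [Doc]) {
    docs.sort_by(|a, b| (&a.name, &a.path).cmp(&(&b.name, &b.path)));
}
```

Because the id embeds part of the content hash, editing a skill's text changes its id even when the name stays the same.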
4. The Context Coordinator (Context Engineering)
The ContextCoordinator is the high-level engine that orchestrates the Context Engineering Pipeline. It bridges the gap between skill selection and agent execution, ensuring that any instruction given to the LLM is backed by concrete, validated capabilities.
The Role
- Orchestration: It runs the full flow: `Selection` → `Validation` → `Engineering`.
- Preventing "Phantom Tools": It verifies that every tool listed in a skill's `allowed-tools` metadata exists in the host's `ToolRegistry`. If a tool is missing, it can reject the match (Strict) or omit the tool (Permissive), preventing the LLM from hallucinating an action it cannot perform.
- Atomic Delivery: It emits a `SkillContext`, which encapsulates the final system instruction and the collection of executable `Arc<dyn Tool>` instances as a single unit.
Resolution Strategies
The coordinator supports a cascading Resolution Strategy pattern, allowing for flexible skill loading:
- ByName: Load a specific skill for dedicated flows.
- ByQuery: Find the best skill for a user's intent.
- ByTag: Find a skill categorised with specific labels (e.g., "emergency", "fallback").
API: ContextCoordinator::new(index, registry, config)
API: coordinator.resolve(&[Strategy::ByName("..."), Strategy::ByQuery("...")])
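The cascading behaviour can be sketched as "try each strategy in order and stop at the first hit". The types below are simplified stand-ins, not the crate's definitions: the `Strategy` payloads are assumed to be string slices, `Skill` is a toy record, and the `ByQuery` arm uses a crude substring check in place of scored selection.

```rust
/// Illustrative strategy enum; variant payloads are assumptions.
pub enum Strategy<'a> {
    ByName(&'a str),
    ByQuery(&'a str),
    ByTag(&'a str),
}

/// Toy skill record used only for this sketch.
pub struct Skill {
    pub name: &'static str,
    pub description: &'static str,
    pub tags: &'static [&'static str],
}

/// Tries each strategy in order and returns the first skill that matches.
pub fn resolve<'s>(skills: &'s [Skill], strategies: &[Strategy<'_>]) -> Option<&'s Skill> {
    strategies.iter().find_map(|strategy| match strategy {
        Strategy::ByName(name) => skills.iter().find(|s| s.name == *name),
        // Stand-in for scored selection: first skill whose description
        // contains any word of the query (case-insensitive).
        Strategy::ByQuery(query) => skills.iter().find(|s| {
            let description = s.description.to_lowercase();
            query
                .split_whitespace()
                .any(|w| description.contains(&w.to_lowercase()))
        }),
        Strategy::ByTag(tag) => skills.iter().find(|s| s.tags.iter().any(|t| t == tag)),
    })
}
```

Placing a `ByTag("fallback")` strategy last gives a dedicated flow a safety net when neither the name nor the query matches.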
5. Selection (Scoring & Consumption)
The select_skills function implements a deterministic, token-based relevance engine. It calculates a score for each skill based on the query and then consumes that score to rank and filter the results.
The Scoring Algorithm
Relevance is determined by weighted lexical overlap. For every token in the query that is found in a skill field, the score increases by a specific weight:
| Field | Weight | Rationale |
|---|---|---|
| Name | +4.0 | Exact name matches are highly intentional. |
| Description | +2.5 | Concise summaries are the primary driver for relevance. |
| Tags | +2.0 | Explicit labels provide strong categorization. |
| Body | +1.0 | Mentions in instructions are relevant but can be noisy. |
Normalization: To prevent long-form instruction sets from unfairly drowning out concise skills, the raw score is divided by the square root of the unique tokens in the body: FinalScore = RawScore / sqrt(unique_tokens).
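The weights and the normalization step can be combined into a short worked sketch. This is an approximation under stated assumptions, not the crate's code: `SkillText` and `score` are hypothetical names, the tokenizer (lowercase, split on non-alphanumerics) is assumed, and query tokens are de-duplicated before matching.

```rust
use std::collections::HashSet;

/// Toy view of the fields that take part in scoring.
pub struct SkillText<'a> {
    pub name: &'a str,
    pub description: &'a str,
    pub tags: &'a [&'a str],
    pub body: &'a str,
}

/// Lowercases and splits on non-alphanumeric characters (assumed tokenizer).
fn tokens(text: &str) -> HashSet<String> {
    text.to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(str::to_string)
        .collect()
}

/// Weighted lexical overlap, normalized by sqrt(unique body tokens).
pub fn score(skill: &SkillText<'_>, query: &str) -> f32 {
    let name = tokens(skill.name);
    let description = tokens(skill.description);
    let tags = tokens(&skill.tags.join(" "));
    let body = tokens(skill.body);

    let mut raw = 0.0_f32;
    for q in tokens(query) {
        if name.contains(&q) { raw += 4.0; }
        if description.contains(&q) { raw += 2.5; }
        if tags.contains(&q) { raw += 2.0; }
        if body.contains(&q) { raw += 1.0; }
    }
    // Guard against an empty body so the divisor is never zero.
    let unique_body_tokens = body.len().max(1) as f32;
    raw / unique_body_tokens.sqrt()
}
```

For a skill named `plumbing-repair`, described as "Emergency plumbing help", tagged `plumbing`, with the body "fix the leak fast", the query "plumbing leak" scores 4.0 + 2.5 + 2.0 for `plumbing` and 1.0 for `leak`, a raw total of 9.5. The body has four unique tokens, so the final score is 9.5 / sqrt(4) = 4.75.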
Score Significance & Expression
Because the score measures weighted lexical overlap, a higher value should be read as stronger intent:
- `0.0 - 0.9` (Trace): Incidental token overlap. Likely irrelevant to the user's current goal.
- `1.0 - 2.4` (Broad Match): Weak relevance. Found in the body or tags, but missing from primary identifiers (Name/Description).
- `2.5 - 4.9` (Specific Match): Strong relevance. Usually indicates a match in the `description` or multiple tags.
- `5.0+` (High Confidence): Direct hit. Indicates a match in the `name` or high-density overlap across all fields.
The Runtime Lifecycle of a Score
In a running application, the score follows this logical flow:
1. Query Arrival: The user sends a message (e.g., "I need a plumber for a leak").
2. Lexical Calculation: The `select_skills` engine tokenizes the query and performs a weighted lookup across the `SkillIndex`. Initial scores are generated for all candidate skills.
3. Policy Enforcement: The `SelectionPolicy` is applied.
   - Filtering: If "Plumbing" scores `4.5` but the policy requires `min_score: 5.0`, it is discarded.
   - Clipping: The sorted list is trimmed to `top_k`.
4. Action/Injection:
   - Automated: If a match survives, its `body` is injected into the prompt context using `apply_skill_injection`.
   - Manual/UI: The application inspects the `SkillMatch.score` to decide whether to trigger a tool, log a warning, or ask for user confirmation.
API: select_skills(index, query, policy)
The skill engine uses the calculated score in two primary ways:
- Filtering: Any skill with a score below the `SelectionPolicy.min_score` (default: `1.0`) is immediately discarded. This keeps skills with only incidental overlap out of the agent's context.
- Ranking & Top-K: Resulting matches are sorted by descending score. If scores are tied, the engine applies deterministic Tie-Breaking:
  1. Lexicographical sort by Name.
  2. Lexicographical sort by File Path.
The engine then takes the top top_k results (default: 1) for injection or inspection.
SelectionPolicy defaults:
- `top_k = 1`
- `min_score = 1.0`
- `include_tags = []`
- `exclude_tags = []`
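The filter, sort, and clip steps can be sketched in a few lines. The `Match` struct and `rank` function are hypothetical stand-ins for illustration; tag filters are left out to keep the example short.

```rust
use std::cmp::Ordering;

/// Toy match record used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Match {
    pub name: String,
    pub path: String,
    pub score: f32,
}

/// Drops matches below `min_score`, sorts by descending score with
/// (name, path) as tie-breakers, then keeps the first `top_k`.
pub fn rank(mut matches: Vec<Match>, min_score: f32, top_k: usize) -> Vec<Match> {
    matches.retain(|m| m.score >= min_score);
    matches.sort_by(|a, b| {
        b.score
            .partial_cmp(&a.score)
            .unwrap_or(Ordering::Equal)
            .then_with(|| a.name.cmp(&b.name))
            .then_with(|| a.path.cmp(&b.path))
    });
    matches.truncate(top_k);
    matches
}
```

The explicit tie-breakers are what keep the result stable when two skills score identically.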
The Reliability Contract
A score is only useful in a production app if it is predictable and reproducible.
- Deterministic Selection: The lexical matching algorithm contains no random seeds or opaque model weights. The same query against the same skill index will always yield the same score.
- Stable Identification: Every `SkillMatch` includes a unique `id` derived from the content hash. If the score changes, it's because the instructions (the data) changed.
- Policy-Enforced Boundaries: By using `SelectionPolicy`, the application defines a "Safety Contract" where no agent ever receives instructions below a verified relevance threshold.
This transparency allows you to build unit tests for your agent's persona.
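A persona test of this kind might look like the sketch below. To stay self-contained it uses a hypothetical stand-in, `best_skill`, that counts shared words; in a real project the test would call `select_skills` against the loaded index instead.

```rust
/// Stand-in for selection: returns the name of the skill whose description
/// shares the most words with the query (ties go to the earlier name).
pub fn best_skill<'a>(skills: &[(&'a str, &'a str)], query: &str) -> Option<&'a str> {
    let query = query.to_lowercase();
    let mut best: Option<(&'a str, usize)> = None;
    for &(name, description) in skills {
        let description = description.to_lowercase();
        let hits = query
            .split_whitespace()
            .filter(|w| description.split_whitespace().any(|d| d == *w))
            .count();
        let improves = match best {
            None => true,
            Some((best_name, best_hits)) => {
                hits > best_hits || (hits == best_hits && name < best_name)
            }
        };
        if hits > 0 && improves {
            best = Some((name, hits));
        }
    }
    best.map(|(name, _)| name)
}

/// Toy skill set: (name, description) pairs.
pub const SKILLS: [(&str, &str); 2] = [
    ("billing", "refund an invoice"),
    ("plumbing", "fix a leak or blocked drain"),
];

#[test]
fn leak_query_selects_plumbing_every_time() {
    let first = best_skill(&SKILLS, "I need a plumber for a leak");
    assert_eq!(first, Some("plumbing"));
    // Determinism: a second run must agree with the first.
    assert_eq!(best_skill(&SKILLS, "I need a plumber for a leak"), first);
}
```

Because selection involves no randomness, a test like this only fails when the skill text or the policy changes.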
6. Injection
Injection helpers prepend the selected skill body to user content using:
[skill:<name>]
<skill body>
[/skill]
Then original user text follows.
Behavior:
- Injection runs only when `Content.role == "user"`.
- Query text is extracted from text parts and joined with newlines.
- Only the top match is injected.
- Injected body is truncated to `max_injected_chars`.
APIs:
- `select_skill_prompt_block(...)`
- `apply_skill_injection(...)`
- `SkillInjector` / `SkillInjectorConfig`
- `SkillInjector::build_plugin(...)`
- `SkillInjector::build_plugin_manager(...)`
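The block format and truncation rule can be illustrated with plain strings. This sketch is not the crate's implementation: `inject_skill` and `maybe_inject` are hypothetical helpers, truncation is assumed to count characters rather than bytes, and a single newline is assumed between `[/skill]` and the user text.

```rust
/// Builds the injected prompt: a delimited skill block followed by the user text.
/// The body is truncated to at most `max_injected_chars` characters.
pub fn inject_skill(name: &str, body: &str, user_text: &str, max_injected_chars: usize) -> String {
    // Truncate on character boundaries so multi-byte UTF-8 text is never split.
    let truncated: String = body.chars().take(max_injected_chars).collect();
    format!("[skill:{name}]\n{truncated}\n[/skill]\n{user_text}")
}

/// Injection applies only to user content; other roles pass through unchanged.
pub fn maybe_inject(role: &str, name: &str, body: &str, text: &str, max: usize) -> String {
    if role == "user" {
        inject_skill(name, body, text, max)
    } else {
        text.to_string()
    }
}
```

Truncating by characters instead of slicing bytes avoids a panic when the cut would fall inside a multi-byte character.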
Quick Start
Load and Match Skills
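use adk_skill::{SelectionPolicy, load_skill_index, select_skills};
// The root path, query, and policy values are illustrative; argument forms are assumed.
let index = load_skill_index(".")?;
let policy = SelectionPolicy { top_k: 1, min_score: 1.0, ..Default::default() };
let matches = select_skills(&index, "I need a plumber for a leak", &policy);
for m in matches {
    println!("score = {:.2}", m.score);
}
# Ok::<(), adk_skill::SkillError>(())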
Inject Into User Content
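use adk_core::Content;
use adk_skill::{SelectionPolicy, apply_skill_injection, load_skill_index};
// Values and argument order are illustrative; check the crate docs for the exact signature.
let index = load_skill_index(".")?;
let policy = SelectionPolicy { top_k: 1, min_score: 1.0, ..Default::default() };
let mut content = Content::new("user").with_text("I need a plumber for a leak");
let matched = apply_skill_injection(&mut content, &index, &policy, 1500);
if let Some(skill_match) = matched {
    println!("injected skill with score {:.2}", skill_match.score);
}
# Ok::<(), adk_skill::SkillError>(())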
Build A Plugin Manager
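use adk_skill::{SkillInjector, SkillInjectorConfig};
// Arguments are illustrative; check the crate docs for the exact signatures.
let injector = SkillInjector::from_root(".", SkillInjectorConfig::default())?;
let plugin_manager = injector.build_plugin_manager("skills");
# let _ = plugin_manager;
# Ok::<(), adk_skill::SkillError>(())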
Error Model
Main error type: SkillError
- `Io`
- `Yaml`
- `InvalidFrontmatter { path, message }`
- `MissingField { path, field }`
- `InvalidSkillsRoot(path)`
Type alias: SkillResult<T> = Result<T, SkillError>
Current Limits
- No embedding/vector retrieval (lexical matching only).
- No incremental file reload API yet.
- No remote catalog (`skills-ref`/MCP) in this crate yet.
- No script/file reference execution layer in this crate (selection + injection only).
- No standard CLI for skill management (use `adk-cli` wrapper if available).
Application Integration Patterns
To put scores to good use in a production application, consider these patterns:
1. The "Confidence Gatekeeper"
Don't always accept the top-1 match. In high-stakes applications (e.g., medical or financial), use a higher min_score (e.g., 5.0) to ensure the agent only acts when it is highly confident in the match.
2. Ambiguity Handling (The "Menu" Pattern)
If the top 3 skills have very close scores (e.g., within 0.5 of each other), instead of injecting one and potentially being wrong, use the scores to present a "Menu" to the user:
"I found a few skills that could help. Would you like to use the Account Recovery skill or the Security Update skill?"
3. The "Generalist Fallback"
Always maintain a "Generalist" skill with broad tags and no strict allowed-tools. If select_skills returns an empty list, fall back to this base instruction set to ensure the agent doesn't simply fail or hallucinate a persona.
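The menu and fallback patterns above reduce to one decision over the ranked matches. The sketch below is a hypothetical helper, not part of the crate: `Decision` and `decide` are illustrative names, and the caller supplies the margin, the candidate limit, and the generalist skill name.

```rust
/// What the application should do with a ranked list of matches.
#[derive(Debug, PartialEq)]
pub enum Decision<'a> {
    Inject(&'a str),
    Menu(Vec<&'a str>),
    Fallback(&'a str),
}

/// `ranked` must be sorted by descending score. Candidates within `margin`
/// of the best score (up to `limit`) are treated as ambiguous.
pub fn decide<'a>(
    ranked: &[(&'a str, f32)],
    margin: f32,
    limit: usize,
    generalist: &'a str,
) -> Decision<'a> {
    // No match at all: fall back to the generalist skill.
    let Some(&(best_name, best_score)) = ranked.first() else {
        return Decision::Fallback(generalist);
    };
    let close: Vec<&'a str> = ranked
        .iter()
        .take(limit)
        .take_while(|(_, score)| best_score - *score <= margin)
        .map(|(name, _)| *name)
        .collect();
    if close.len() > 1 {
        Decision::Menu(close)
    } else {
        Decision::Inject(best_name)
    }
}
```

Returning an enum keeps the policy in one place and lets the UI layer decide how to render a menu or apply the fallback.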
4. Observability & Evaluation
Log the SkillMatch.score along with the SkillDocument.hash for every interaction. This allows you to perform "Offline Evaluation":
- Re-run queries against newer instruction sets.
- Detect "Score Drift" where changes to a popular skill's description cause it to lose relevance for core queries.
Best Practices for Skill Authors
1. Specification vs. Runtime Split
Maintain two versions of high-stakes skills:
- `SKILL.md` (Design-Time): The comprehensive specification, including rationale, edge cases, and compliance data. Store this in your source-controlled `skills/` directory.
- `.skills/<name>.md` (Runtime): An optimized, "distilled" version of the instructions. Strip out noise and focus on "Loud" instructions that the LLM can follow with minimal latency.
2. Explicit Tooling (Logic-as-Data)
Always define allowed-tools if your runtime supports dynamic loading. It acts as a safety barrier and reduces "tool hallucination" where the model tries to use a tool it isn't authorized for.
3. Descriptive Metadata
The description field is the primary driver for selection. If multiple skills are being matched, ensure descriptions are mutually exclusive to avoid "prompt pollution" where two conflicting personas are injected simultaneously.
4. Optimizing Selection Score
Because the selection engine uses weighted token matches, you can "steer" discovery by optimizing your metadata:
- Title Power: Use specific, unique terms in the `name` field (+4.0 weight).
- Keyword Density: Ensure the `description` (+2.5) contains the primary keywords you expect in users' queries.
- Tag Categorization: Use `tags` (+2.0) for synonyms or broad-brush categories (e.g., `[plumbing, drainage]`) that might not be in the name.
- Normalization Strategy: Keep your instructional `body` concise. A very long body increases the normalization factor (`sqrt(tokens)`), which can slightly penalize the overall score compared to a punchy, focused skill.
Related Examples
From this repository:
- `examples/skills_llm_minimal`
- `examples/skills_auto_discovery`
- `examples/skills_policy_filters`
- `examples/skills_runner_injector`
- `examples/skills_workflow_minimal`
Development
License
Apache-2.0