Usage
agent-review allowed-skills #agent-creator
Configuration
read_file, glob, grep, shell, ask_user
Instructions
Overview
schema/Agent.yaml
Steps
1. Identify the agent to review from the user's request: accept an agent name, a directory path, or an AGENT.md file path.
2. Resolve the agent file: if given a name, look for .stencila/agents/<name>/AGENT.md, walking up from the current directory; also check ~/.config/stencila/agents/<name>/AGENT.md for user-level agents. If given a path, use it directly.
3. Read the full AGENT.md file and any supporting files in the agent directory (check for scripts/, references/, and assets/ subdirectories).
4. Read schema/Agent.yaml to verify the checklist covers current agent schema fields and to identify any unknown frontmatter properties.
5. Evaluate the agent against each criterion in the Review Checklist below.
6. Produce a structured review report with a summary, per-criterion findings, and a prioritized list of suggestions.
7. If the user asks you to apply improvements, make the changes and validate the result with stencila agents validate <agent-name>.
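The lookup order in step 2 can be sketched as a pure function that lists candidate paths without touching the filesystem. This is a minimal illustration, not Stencila's actual resolver: the function name `candidate_agent_paths` is hypothetical, and it assumes the walk continues all the way to the filesystem root, which the real implementation may not do.

```python
from pathlib import PurePosixPath
from typing import List


def candidate_agent_paths(
    name: str, start: PurePosixPath, home: PurePosixPath
) -> List[PurePosixPath]:
    """List AGENT.md locations to try for `name`, in lookup order (hypothetical helper)."""
    # Workspace candidates: the starting directory first, then each ancestor.
    candidates = [
        directory / ".stencila" / "agents" / name / "AGENT.md"
        for directory in [start, *start.parents]
    ]
    # The user-level location is tried last.
    candidates.append(home / ".config" / "stencila" / "agents" / name / "AGENT.md")
    return candidates


paths = candidate_agent_paths(
    "code-reviewer", PurePosixPath("/work/repo/src"), PurePosixPath("/home/me")
)
for path in paths:
    print(path)
```

A reviewer would take the first candidate that exists on disk and report "agent not found" if none does.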
Review Checklist
Frontmatter
- name: present, matches directory name, valid kebab-case (^[a-z0-9]([a-z0-9-]{0,62}[a-z0-9])?$), follows the thing-role naming convention (e.g., code-reviewer, data-analyst)
- description: present, not empty, not a placeholder (TODO, <placeholder>), recommended to be concise (under ~1,024 characters), specific enough to convey the agent's purpose
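The two required-field checks above could be approximated as follows. This is a sketch under stated assumptions: the frontmatter is assumed to be already parsed into a dict, and `check_required_fields` is a hypothetical helper, not part of Stencila. The regular expression is the one quoted in the checklist.

```python
import re

# Pattern quoted in the checklist for valid agent names.
NAME_PATTERN = re.compile(r"^[a-z0-9]([a-z0-9-]{0,62}[a-z0-9])?$")
PLACEHOLDERS = ("TODO", "<placeholder>")


def check_required_fields(frontmatter: dict, directory_name: str) -> list:
    """Return a list of problems found in `name` and `description` (hypothetical helper)."""
    problems = []
    name = frontmatter.get("name")
    if not name:
        problems.append("name is missing")
    else:
        # fullmatch avoids accepting a name with a trailing newline.
        if not NAME_PATTERN.fullmatch(name):
            problems.append("name is not valid kebab-case")
        if name != directory_name:
            problems.append("name does not match directory name")

    description = (frontmatter.get("description") or "").strip()
    if not description:
        problems.append("description is missing or empty")
    elif any(marker in description for marker in PLACEHOLDERS):
        problems.append("description contains placeholder text")
    elif len(description) > 1024:
        problems.append("description is longer than ~1,024 characters")
    return problems


print(check_required_fields(
    {"name": "code-reviewer", "description": "Reviews code for bugs"},
    "code-reviewer",
))
```

Whether a description is specific enough, and whether the name follows the thing-role convention, still needs a reviewer's judgment; a pattern check cannot decide that.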
Optional Fields
- model: do not hard-code a model unless the user explicitly requires one; flag hard-coded models as a warning since they reduce portability
- provider: same as model; omit unless explicitly needed, and flag hard-coded providers as a warning
- model-size: if present, check that it is being used coherently as a broad model-tier preference (for example small, medium, large), and that it matches the agent's stated role and task complexity. Treat it as a Stencila cross-provider classification rather than an exact provider guarantee
- reasoning-effort: typically low, medium, or high if present; custom provider-specific values are also valid
- trust-level: must be low, medium, or high if present; check that it matches the agent's intended use (e.g., a read-only reviewer should not have high trust)
- allowed-tools: check that listed tools are valid Stencila tool names (read_file, write_file, edit_file, grep, glob, shell, web_fetch, use_skill, spawn_agent, send_input, wait, close_agent, ask_user, mcp_codemode); flag unknown tool names
- allowed-skills: if present, check that listed skill names are valid kebab-case. Skills that do not yet have a corresponding SKILL.md are valid forward references (top-down design); note them as outstanding dependencies rather than flagging them as errors
- allowed-domains / disallowed-domains: if present, check format (exact hosts or *.example.com wildcards)
- max-turns: non-negative integer if present
- max-tool-rounds: positive integer if present
- tool-timeout: positive integer (seconds) if present
- max-subagent-depth: non-negative integer if present
- enable-mcp / enable-mcp-codemode: boolean if present
- allowed-mcp-servers: list of server ID strings if present
- history-thinking-replay: must be none or full if present
- truncation-preset: must be strict, balanced, or verbose if present
- compaction-trigger-percent: unsigned integer (0–100) if present
- compatibility: under 500 characters if present
- unknown fields: flag any frontmatter fields not defined in schema/Agent.yaml (or inherited from CreativeWork) as warnings; they may be typos or unsupported properties that will be silently ignored
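A few of the optional-field checks above lend themselves to a mechanical pass. The sketch below covers only hard-coded model/provider, the three closed-value fields, and tool names. The tool list and allowed values are copied from the checklist, while `check_optional_fields` and the dict-shaped frontmatter are assumptions made for illustration.

```python
# Tool names as listed in the checklist above.
KNOWN_TOOLS = {
    "read_file", "write_file", "edit_file", "grep", "glob", "shell",
    "web_fetch", "use_skill", "spawn_agent", "send_input", "wait",
    "close_agent", "ask_user", "mcp_codemode",
}

# Fields whose values the checklist restricts to a closed set.
ENUM_FIELDS = {
    "trust-level": {"low", "medium", "high"},
    "history-thinking-replay": {"none", "full"},
    "truncation-preset": {"strict", "balanced", "verbose"},
}


def check_optional_fields(frontmatter: dict) -> list:
    """Return warnings for a subset of the optional-field checks (hypothetical helper)."""
    warnings = []
    for field in ("model", "provider"):
        if field in frontmatter:
            warnings.append(f"{field} is hard-coded, which reduces portability")

    for field, allowed in ENUM_FIELDS.items():
        if field in frontmatter and frontmatter[field] not in allowed:
            warnings.append(f"{field} must be one of {sorted(allowed)}")

    # Accept either a list or a space-separated string of tool names.
    tools = frontmatter.get("allowed-tools", [])
    if isinstance(tools, str):
        tools = tools.split()
    for tool in tools:
        if tool not in KNOWN_TOOLS:
            warnings.append(f"unknown tool name: {tool}")
    return warnings


print(check_optional_fields({"allowed-tools": ["read_file", "read_files"]}))
```

Everything here is reported as a warning rather than a failure, in line with how the checklist treats hard-coded models and unknown names.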
Agent CreativeWork license
allowed-skills allowed-tools allowed-tools use_skill
Discovery and Delegation Metadata
- keywords: if present, check that keywords are relevant, not redundant with the description, and include likely user intent words, artifact types, and domain terms. Flag generic or overly broad keywords. If absent, recommend adding keywords to improve discoverability
- when-to-use / when-not-to-use: if present, check that entries are specific, actionable, and complementary to the description rather than duplicating it. Flag vague signals like "when appropriate" or "when needed". If absent, recommend adding them to improve manager delegation accuracy
- Coherence check: verify that description, keywords, and when-to-use / when-not-to-use work together; they should be complementary, not redundant. Flag cases where the same text appears verbatim in multiple fields
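Two of the checks above are simple enough to automate: verbatim repetition across fields and the vague signals the checklist names. The following is a rough sketch; the helper names are hypothetical, the frontmatter is assumed to be a parsed dict, and comparison is case-insensitive by choice rather than by any documented rule.

```python
VAGUE_SIGNALS = ("when appropriate", "when needed")
DISCOVERY_FIELDS = ["description", "keywords", "when-to-use", "when-not-to-use"]


def _entries(frontmatter: dict, field: str) -> list:
    """Normalize a field to a list of non-empty strings."""
    value = frontmatter.get(field, [])
    items = [value] if isinstance(value, str) else list(value)
    return [str(item).strip() for item in items if str(item).strip()]


def find_verbatim_repeats(frontmatter: dict) -> list:
    """Return (field, other_field) pairs that share an identical entry."""
    repeats = []
    for index, first in enumerate(DISCOVERY_FIELDS):
        first_texts = {text.lower() for text in _entries(frontmatter, first)}
        for second in DISCOVERY_FIELDS[index + 1:]:
            second_texts = {text.lower() for text in _entries(frontmatter, second)}
            if first_texts & second_texts:
                repeats.append((first, second))
    return repeats


def find_vague_signals(frontmatter: dict) -> list:
    """Return delegation entries that contain a known vague phrase."""
    flagged = []
    for field in ("when-to-use", "when-not-to-use"):
        for text in _entries(frontmatter, field):
            if any(signal in text.lower() for signal in VAGUE_SIGNALS):
                flagged.append(f"{field}: {text}")
    return flagged


example = {
    "description": "Reviews pull requests for bugs",
    "keywords": ["code review", "pull request"],
    "when-to-use": ["Reviews pull requests for bugs", "Use when needed"],
}
print(find_verbatim_repeats(example))
print(find_vague_signals(example))
```

Relevance and breadth of keywords remain judgment calls that this kind of string matching does not address.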
System Instructions and Body
- A missing body is valid: a frontmatter-only AGENT.md is a legitimate configuration-only agent. However, a body that exists but contains only empty sections or placeholder content should be flagged
- If allowed-skills contains exactly one skill, recognize that Stencila automatically preloads that skill's full instructions into the system prompt. In this case, a short body that acts as a preamble or identity-setting introduction is sufficient and should not be criticized as too sparse merely for brevity
- If present, instructions are clear, imperative, and unambiguous
- Instructions align with the agent's described purpose; a code reviewer's instructions should not describe code generation
- No contradictions between frontmatter configuration and body instructions (e.g., body says "modify files" but allowed-tools excludes write_file and edit_file)
- No placeholder content (TODO, <placeholder>, or empty sections)
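The placeholder and empty-section checks above can be approximated with a short scan of the body text. This is a sketch only: `find_body_problems` is a hypothetical helper, it assumes ATX-style Markdown headings (lines starting with #), and it does not skip headings-like lines inside code fences.

```python
import re


def _heading_level(line: str) -> int:
    """Number of leading '#' characters, or 0 if the line is not a heading."""
    return len(line) - len(line.lstrip("#")) if line.startswith("#") else 0


def find_body_problems(body: str) -> list:
    """Flag placeholder text and empty sections in an AGENT.md body."""
    problems = []
    if not body.strip():
        # A frontmatter-only agent is valid, so an absent body is not a problem.
        return problems

    if re.search(r"\bTODO\b|<placeholder>", body):
        problems.append("body contains placeholder content")

    lines = [line.strip() for line in body.splitlines() if line.strip()]
    for index, line in enumerate(lines):
        level = _heading_level(line)
        if level == 0:
            continue
        is_last = index == len(lines) - 1
        # A section is empty if nothing follows it, or if the next line is a
        # heading at the same or a higher level (a deeper subheading is content).
        next_level = 0 if is_last else _heading_level(lines[index + 1])
        if is_last or 0 < next_level <= level:
            problems.append(f"empty section: {line}")
    return problems


print(find_body_problems("You are a helpful assistant\n\n## TODO\n"))
```

Contradictions between the body and the tool list are deliberately left out, since a phrase such as "do not modify files" defeats naive keyword matching.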
Security
- Tool scope: agent only has access to tools it needs; flag overly broad tool access for specialized agents (e.g., a documentation agent with shell access)
- Trust level: appropriate for the agent's role; flag high trust on agents that do not need it
- Domain restrictions: if the agent uses web_fetch, consider whether domain restrictions are appropriate
- MCP access: if MCP is enabled, check whether allowed-mcp-servers restricts access to only needed servers
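The configuration-level parts of the security checks above might be sketched like this. The function name `security_warnings` is hypothetical, and the rules are a simplification: they only surface settings worth a closer look, leaving the decision about whether the agent's role justifies them to the reviewer.

```python
def security_warnings(frontmatter: dict) -> list:
    """Surface security-relevant settings that deserve a reviewer's attention."""
    warnings = []

    # Accept either a list or a space-separated string of tool names.
    tools = frontmatter.get("allowed-tools", [])
    if isinstance(tools, str):
        tools = tools.split()

    if frontmatter.get("trust-level") == "high":
        warnings.append("trust-level is high: confirm the role needs it")

    has_domain_rules = bool(
        frontmatter.get("allowed-domains") or frontmatter.get("disallowed-domains")
    )
    if "web_fetch" in tools and not has_domain_rules:
        warnings.append("web_fetch is allowed without domain restrictions")

    if frontmatter.get("enable-mcp") and not frontmatter.get("allowed-mcp-servers"):
        warnings.append("MCP is enabled without allowed-mcp-servers")
    return warnings


quick_helper = {
    "trust-level": "high",
    "allowed-tools": "read_file write_file edit_file shell web_fetch spawn_agent",
}
print(security_warnings(quick_helper))
```

As the Limitations section notes, this is configuration analysis only and says nothing about runtime behavior.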
Consistency
- Frontmatter property names use kebab-case (not camelCase or snake_case)
- Formatting is consistent (heading levels, list styles, code block languages)
- Naming follows the thing-role convention
- Configuration choices are internally consistent (e.g., max-turns: 5 with reasoning-effort: high suggests the agent expects complex tasks but has limited turns)
- model-size and reasoning-effort are used coherently: model-size should reflect the desired cost/latency/capability tier, while reasoning-effort should reflect how much the selected model should deliberate. Flag cases where a very simple agent uses an unnecessarily large model tier without justification, or where a demanding analysis/review agent likely needs a larger tier than configured
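The first consistency check above, kebab-case property names, is easy to express directly. A minimal sketch, assuming the frontmatter keys are available as a dict; `non_kebab_keys` is a hypothetical helper and the pattern is one reasonable definition of kebab-case, not one taken from the Stencila schema.

```python
import re

# Lowercase alphanumeric words joined by single hyphens.
KEBAB_KEY = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")


def non_kebab_keys(frontmatter: dict) -> list:
    """Return frontmatter keys that are not kebab-case, in document order."""
    return [key for key in frontmatter if not KEBAB_KEY.fullmatch(key)]


print(non_kebab_keys({
    "name": "code-reviewer",
    "maxTurns": 5,
    "trust_level": "low",
    "model-size": "medium",
}))
```

Keys flagged here are likely also unknown fields under the schema check, so a reviewer may want to report them once with both reasons.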
Report Format
Summary
Findings
- ✅ Pass: criterion fully met
- ⚠️ Warning: minor issue or room for improvement
- ❌ Fail: significant problem that should be fixed
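If findings are collected programmatically, the three status levels above can be rendered into an Area / Status / Notes table like the ones in the examples. The sketch below is illustrative: `render_findings` and the lowercase status keys are assumptions, not part of the skill.

```python
# Status labels as defined in the list above, keyed by a hypothetical short name.
STATUS_LABELS = {
    "pass": "✅ Pass",
    "warning": "⚠️ Warning",
    "fail": "❌ Fail",
}


def render_findings(findings: list) -> str:
    """Render (area, status, notes) tuples as a pipe-delimited table."""
    rows = ["| Area | Status | Notes |", "| --- | --- | --- |"]
    for area, status, notes in findings:
        rows.append(f"| {area} | {STATUS_LABELS[status]} | {notes} |")
    return "\n".join(rows)


table = render_findings([
    ("Required fields", "pass", "Name and description are valid and specific"),
    ("Security", "fail", "Broad tool access with high trust"),
])
print(table)
```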
Suggestions
Examples
1. Resolve to .stencila/agents/code-reviewer/AGENT.md
2. Read the file and check for supporting files in subdirectories (scripts/, references/, assets/)
3. Read schema/Agent.yaml to verify field validity
4. Evaluate frontmatter: name is code-reviewer, matches directory, valid kebab-case, follows the thing-role convention; description is specific
5. Check optional fields: model-size and reasoning-effort are not set, although model-size: medium with reasoning-effort: high would be a sensible combination for a read-only reviewer; allowed-tools lists read_file, grep, glob, shell, which is appropriate for that role
6. Evaluate body: instructions say "do not modify files", which is consistent with read-only tools
7. Check security: no write_file or edit_file, so least privilege is respected
8. Run stencila agents validate code-reviewer
9. Produce the report
Summary
The code-reviewer agent is well-configured with appropriate read-only tool restrictions and clear instructions. One minor improvement is possible.
Findings
| Area | Status | Notes |
| --- | --- | --- |
| Required fields | ✅ Pass | Name and description are valid and specific |
| Optional fields | ✅ Pass | All present fields have valid values |
| System instructions | ✅ Pass | Clear, imperative, consistent with tool restrictions |
| Security | ✅ Pass | Read-only tools, appropriate trust level |
| Consistency | ⚠️ Warning | reasoning-effort is not set; consider high for code review tasks, and consider model-size: medium if the agent should consistently avoid the smallest tier |
Suggestions
Add reasoning-effort: high. Code review benefits from deeper analysis, and this matches the defaults in the agent-creation skill's configuration guide
Consider model-size: medium. This keeps the agent portable across providers while signaling that review work typically needs more than the smallest, fastest model tier
1. Resolve to .stencila/agents/code-reviewer/AGENT.md
2. Read the file: frontmatter includes allowed-skills: with exactly one entry, code-review
3. Note from the agent configuration pattern that a single allowed skill is preloaded into the system prompt automatically
4. Evaluate the body as a short preamble that frames the agent's identity, rather than expecting the full workflow to be repeated there
5. Check that the body is not placeholder text and does not contradict the preloaded skill or tool restrictions
6. Produce the report without flagging the short preamble as too sparse
Summary
The code-reviewer agent is appropriately concise for a single-skill setup. Its short body works as a preamble, and the substantive workflow is supplied by the preloaded code-review skill.
Findings
| Area | Status | Notes |
| --- | --- | --- |
| Required fields | ✅ Pass | Name and description are valid and specific |
| Optional fields | ✅ Pass | allowed-skills has one valid skill name, which indicates a preloaded single-skill configuration |
| System instructions | ✅ Pass | The body is brief but appropriate because it frames the agent and the detailed instructions come from the preloaded skill |
| Security | ✅ Pass | Tool access matches the review-only role |
| Consistency | ✅ Pass | The configuration and preamble are internally consistent |
Suggestions
Optionally expand the preamble by one sentence if you want a stronger agent identity, but do not duplicate the preloaded skill instructions
1. Resolve to .stencila/agents/quick-helper/AGENT.md
2. Read the file: frontmatter has name: quick-helper, description: Helps with stuff, trust-level: high, allowed-tools: read_file write_file edit_file shell web_fetch spawn_agent; body says "You are a helpful assistant" with a ## TODO section
3. Read schema/Agent.yaml to check field validity
4. Evaluate against the checklist
5. Produce the report
Summary
The quick-helper agent has significant configuration issues: an overly broad tool set for an unspecified purpose, unnecessarily high trust, a vague description, and placeholder content in the body.
Findings
| Area | Status | Notes |
| --- | --- | --- |
| Required fields | ⚠️ Warning | Description "Helps with stuff" is too vague to convey the agent's purpose |
| Optional fields | ⚠️ Warning | trust-level: high is set without clear justification |
| System instructions | ❌ Fail | Body contains a ## TODO placeholder section |
| Security | ❌ Fail | Agent has shell, web_fetch, and spawn_agent with trust-level: high but no clear need for these capabilities |
| Consistency | ⚠️ Warning | Name quick-helper does not follow the thing-role convention; it is unclear what domain it covers |
Suggestions
Remove the placeholder ## TODO section or replace it with actual instructions
Reduce trust-level to medium (or low) unless elevated trust is justified by the agent's purpose
Restrict allowed-tools to only the tools the agent needs; remove shell, web_fetch, and spawn_agent unless required
Rewrite description to specifically convey what the agent does and when to use it
Rename to follow the thing-role convention (e.g., general-assistant or a more specific name)
Edge Cases
- Agent not found: Report the error clearly and suggest checking the name or path. List available agents if possible using stencila agents list or by listing .stencila/agents/ directories.
- Multiple agents requested: Review each agent separately with its own report section. Ask the user to confirm if reviewing all agents is intended.
- Frontmatter-only agent (no body): This is valid; do not flag it as a failure. A frontmatter-only AGENT.md is a legitimate configuration-only agent.
- Single allowed skill with brief body: Do not flag a short one- or two-sentence body as too sparse when allowed-skills contains exactly one skill. Treat it as a preamble, because the skill content is preloaded automatically.
- Unresolved skill references: If allowed-skills lists skill names that have no corresponding SKILL.md, do not flag them as errors. These are valid forward references from top-down design; note them as outstanding dependencies and evaluate the rest of the agent definition on its own merits. The runtime produces a warning for unresolved skill names, not an error.
- User-level agent: Check ~/.config/stencila/agents/ if the agent is not found in the workspace.
- Hard-coded model or provider: Flag as a warning, not a failure. Hard-coding reduces portability but may be intentional.
- Missing model-size: Do not flag absence as a failure. Recommend it only when the agent would benefit from an explicit cross-provider size preference, such as a small tier for quick, low-stakes tasks or a medium / large tier for heavier review and analysis work.
- Unknown frontmatter fields: Flag any fields not in the Agent schema as warnings; they may be typos or unsupported properties that will be silently ignored.
- User asks to fix issues: If the user asks you to apply suggestions, make the changes, then validate with stencila agents validate <agent-name> before reporting completion.
Validation
# By agent name
stencila agents validate <agent-name>
# By directory path
stencila agents validate .stencila/agents/<agent-name>
# By AGENT.md path
stencila agents validate .stencila/agents/<agent-name>/AGENT.md
Limitations
This skill reviews the structure, quality, and configuration of an agent definition. It does not test the agent's runtime behavior or execute it against real inputs. The review checks tool names against known Stencila tools but cannot verify that third-party MCP server IDs are valid. Security assessment is based on configuration analysis, not runtime behavior.
.stencila/skills/agent-review/SKILL.md