Agent persona format

An agent persona is a markdown file describing how one agent — a coordinator, an incident investigator, a specialist — behaves: which tools it can call, which model it runs on, how many ReAct turns it gets, and what system prompt it carries.

Source of truth: Agent in internal/manager/biz/aiops/chatruntime/types.go.

On-disk shape

markdown
---
name: incident_investigator
description: Walk an incident from symptom to root cause, calling host + observability tools.
when_to_use: When the user asks why an alert is firing, or shares an incident link.
tools:
  - expand_topology
  - find_topology_node
  - query_promql
  - search_logs
  - query_traceql
  - host_probe
  - bash
disallowed_tools: []
permission_mode: read-only
max_turns: 24
model: anthropic/claude-sonnet-4-6
critical_reminder: |
  Always show your evidence. Cite the PromQL / LogQL / file path. Never speculate
  beyond what the tools returned.
initial_prompt: |
  You are investigating incident {{.incident_id}} on device {{.device_id}}.
  Start by reading the incident summary.
background: false
omit_claude_md: false
metadata:
  os: [linux, darwin]
  requires:
    bins: []
    config: []
  ongrid:
    scope: manager
---

# Incident investigator

You are an SRE-grade incident investigator. Given an incident, your job is to:

1. Pull the alert detail and any attached evidence (alert summary, snapshot).
2. Expand the device's topology to understand the blast radius.
3. Query the relevant signal (metric / log / trace) to confirm the symptom.
4. Walk upstream services / underlying resources until you find the root cause.
5. Return an evidence-backed answer in plain language.

When the user asks a follow-up, stay grounded in tool output. If you cannot
verify a claim with a tool call, say so explicitly and stop.

The frontmatter is YAML. The body (everything after the closing ---) is the system prompt the worker LLM sees. Whitespace and markdown formatting in the body are preserved verbatim.
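As a rough illustration of that split, the sketch below separates the frontmatter from the body using only the standard library. It is not the project's parser: `splitPersona` is a hypothetical helper, and it assumes LF line endings and a newline after the closing `---`. YAML decoding of the frontmatter is left out.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitPersona separates the YAML frontmatter from the markdown body.
// Hypothetical helper for illustration; not the loader in agent_parser.go.
// The body is returned unchanged, so whitespace and markdown survive as-is.
func splitPersona(raw string) (frontmatter, body string, err error) {
	const delim = "---\n"
	if !strings.HasPrefix(raw, delim) {
		return "", "", errors.New("persona must start with a --- line")
	}
	rest := raw[len(delim):]
	if strings.HasPrefix(rest, delim) { // empty frontmatter
		return "", rest[len(delim):], nil
	}
	// The first "---" line after the opening one closes the frontmatter;
	// later "---" lines (markdown rules in the body) are left alone.
	end := strings.Index(rest, "\n"+delim)
	if end < 0 {
		return "", "", errors.New("frontmatter is not closed by a --- line")
	}
	return rest[:end+1], rest[end+1+len(delim):], nil
}

func main() {
	raw := "---\nname: disk_specialist\nmax_turns: 8\n---\n\n# Disk specialist\n"
	fm, body, err := splitPersona(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("frontmatter=%q\nbody=%q\n", fm, body)
}
```

Only the first closing delimiter is honoured, so a horizontal rule inside the system prompt does not truncate the body.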

Frontmatter fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | yes | Agent identifier used at spawn time (/v1/agents/{name}). |
| description | string | yes | Human listing string shown in the agent picker. |
| when_to_use | string | yes | Coordinator's spawn-decision hint. The coordinator reads this when deciding which specialist to invoke. |
| tools | string[] | no | Explicit tool whitelist. Empty = inherit from policy (every tool the user's role can see). |
| disallowed_tools | string[] | no | Blacklist applied after the whitelist. The blacklist wins over the whitelist. |
| permission_mode | enum | no (default read-only) | read-only, mutating-with-confirm, dual-sign-required. Gates which tool classes can run without confirmation. |
| max_turns | int | no | Caps the worker's internal ReAct loop. Coordinator default applies if zero. |
| model | string | no | LLM identifier (anthropic/claude-sonnet-4-6, openai/gpt-5.4, zhipu/glm-4.7, etc.). Empty = inherit coordinator default. |
| critical_reminder | string | no | System-reminder block injected on every turn. Anti-drift mechanism. |
| initial_prompt | string | no | Prepended to the first user message at spawn. Supports Go template syntax over the spawn context ({{.incident_id}}, {{.device_id}}). |
| background | bool | no | true forces async execution (long-running workers). |
| omit_claude_md | bool | no | Skip inheriting the global system context. Used for tightly-scoped reviewer agents. |
| metadata | object | no | OS gate, required binaries / config keys, ongrid extensions (scope, edge_runtime, edge_capabilities). |

Unknown frontmatter keys are preserved (the parser stores them under UnknownFields) so future fields from openclaw / claude-code do not break the loader.
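The fallback rules for optional fields (permission_mode defaulting to read-only, a zero max_turns and an empty model inheriting from the coordinator) can be pictured with a small sketch. The `Persona` and `Defaults` types and `withDefaults` are hypothetical stand-ins for illustration, not the Agent definition in types.go, and the default values in `main` are made up.

```go
package main

import "fmt"

// Persona carries only the three fields with documented fallbacks.
// Hypothetical type; the real definition lives in types.go.
type Persona struct {
	PermissionMode string
	MaxTurns       int
	Model          string
}

// Defaults stands in for the coordinator's settings (assumed shape).
type Defaults struct {
	MaxTurns int
	Model    string
}

// withDefaults fills unset fields: an empty permission mode becomes
// read-only, a zero max_turns and an empty model inherit from the coordinator.
func withDefaults(p Persona, d Defaults) Persona {
	if p.PermissionMode == "" {
		p.PermissionMode = "read-only"
	}
	if p.MaxTurns == 0 {
		p.MaxTurns = d.MaxTurns
	}
	if p.Model == "" {
		p.Model = d.Model
	}
	return p
}

func main() {
	// Illustrative coordinator defaults; the real values are not documented here.
	coord := Defaults{MaxTurns: 16, Model: "anthropic/claude-sonnet-4-6"}
	p := withDefaults(Persona{Model: "zhipu/glm-4.7"}, coord)
	fmt.Printf("%+v\n", p)
}
```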

Source field

When the SPA reads back an agent, the API also includes a source field that is not part of the on-disk frontmatter:

| Value | Meaning |
| --- | --- |
| builtin | shipped in the binary (programmatic Add). Read-only in the UI. |
| disk | loaded from agents/*.md next to the binary or under an external dir. Read-only in the UI. |
| user | created by the user via POST /v1/agents/custom. Editable and deletable from the UI. |

Permission modes

The permission_mode field gates which tool classes the persona can run.

| Mode | Allowed classes | Confirmation required |
| --- | --- | --- |
| read-only | read (alias safe) | never |
| mutating-with-confirm | read + write | once per write call |
| dual-sign-required | read + write + destructive | two-step SOP for destructive; once for write |
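The mode table can be read as a lookup from (mode, tool class) to a decision. The sketch below encodes it that way. The function name and the `Confirmation` values are hypothetical, and it assumes that read calls never need confirmation in any mode, which the table implies but does not state outright.

```go
package main

import "fmt"

// Confirmation describes how much sign-off a tool call needs.
type Confirmation string

const (
	None    Confirmation = "never"
	Once    Confirmation = "once"
	TwoStep Confirmation = "two-step"
)

// check reports whether a tool class may run under a permission mode and,
// if allowed, which confirmation applies. Illustrative only.
func check(mode, class string) (allowed bool, confirm Confirmation) {
	if class == "safe" { // "safe" is an alias for "read"
		class = "read"
	}
	switch mode {
	case "read-only":
		if class == "read" {
			return true, None
		}
	case "mutating-with-confirm":
		switch class {
		case "read":
			return true, None
		case "write":
			return true, Once
		}
	case "dual-sign-required":
		switch class {
		case "read":
			return true, None
		case "write":
			return true, Once
		case "destructive":
			return true, TwoStep
		}
	}
	// Unknown modes and classes outside the mode's allowance are rejected.
	return false, ""
}

func main() {
	for _, class := range []string{"read", "write", "destructive"} {
		ok, c := check("mutating-with-confirm", class)
		fmt.Printf("%-11s allowed=%v confirm=%q\n", class, ok, c)
	}
}
```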

A persona can further constrain its tool set via tools (whitelist) and disallowed_tools (blacklist). The runtime applies them in this order:

  1. Take the global tool set the user's role can see.
  2. Intersect with tools if non-empty.
  3. Remove anything in disallowed_tools.
  4. For each remaining tool, check permission_mode against its class.

Registration flow

text
Built-in personas
  ↳ programmatic Add() in cmd/ongrid/main.go at startup. Cannot be deleted.

On-disk personas
  ↳ ./agents/*.md (relative to manager working dir) scanned at boot.
  ↳ ONGRID_AGENTS_EXTERNAL_DIRS adds more.
  ↳ The loader walks every .md, parses frontmatter via skill_parser.go / agent_parser.go.
  ↳ Each agent is registered with Source="disk".
  ↳ Cannot be edited or deleted via the UI; remove the file and restart.

User personas
  ↳ POST /v1/agents/custom with the frontmatter as a JSON body.
  ↳ Stored in agents table (DB), not on disk.
  ↳ Source="user"; fully editable via PATCH /v1/agents/custom/{name}.

The merge order at startup is builtin → disk → user. A user persona with the same name as a built-in or disk persona shadows it.
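One way to picture the merge is a registry keyed by name, filled layer by layer so that later layers overwrite earlier ones. This is a sketch under assumptions: the `Persona` type and `merge` function are hypothetical, and the text above only states that user personas shadow the others, so disk shadowing builtin here is inferred from the merge order rather than confirmed.

```go
package main

import "fmt"

// Persona is a pared-down, hypothetical record for illustration.
type Persona struct {
	Name   string
	Source string
}

// merge registers personas layer by layer. A later layer overwrites an
// earlier entry with the same name, which is what produces shadowing.
func merge(layers ...[]Persona) map[string]Persona {
	registry := make(map[string]Persona)
	for _, layer := range layers {
		for _, p := range layer {
			registry[p.Name] = p
		}
	}
	return registry
}

func main() {
	builtin := []Persona{{Name: "incident_investigator", Source: "builtin"}}
	disk := []Persona{{Name: "disk_specialist", Source: "disk"}}
	user := []Persona{{Name: "incident_investigator", Source: "user"}}

	registry := merge(builtin, disk, user) // builtin, then disk, then user
	fmt.Println(registry["incident_investigator"].Source) // prints user
	fmt.Println(registry["disk_specialist"].Source)       // prints disk
}
```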

Spawning an agent

The coordinator picks a specialist by matching the user's query against every persona's when_to_use. To spawn programmatically (chat API):

http
POST /api/v1/chat/sessions/{id}/messages
Content-Type: application/json
Authorization: Bearer ...

{
  "content": "Investigate incident 4217.",
  "agent": "incident_investigator",
  "context": { "incident_id": 4217, "device_id": 102 }
}

If agent is omitted, the coordinator chooses. context is templated into the persona's initial_prompt.
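To make the templating concrete, the sketch below renders an initial_prompt with Go's standard text/template package over a context map. `renderInitialPrompt` is a hypothetical helper. The `missingkey=error` option is a choice made for this example so that an absent key fails instead of rendering `<no value>`; whether the runtime does the same is not stated here.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderInitialPrompt executes a persona's initial_prompt as a Go template
// over the spawn context. Illustrative only; not the runtime's own code.
func renderInitialPrompt(prompt string, ctx map[string]any) (string, error) {
	tmpl, err := template.New("initial_prompt").
		Option("missingkey=error"). // assumption: fail on missing context keys
		Parse(prompt)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, ctx); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	prompt := "You are investigating incident {{.incident_id}} on device {{.device_id}}."
	ctx := map[string]any{"incident_id": 4217, "device_id": 102}

	out, err := renderInitialPrompt(prompt, ctx)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	// prints: You are investigating incident 4217 on device 102.
}
```

With this option, spawning the persona without `device_id` in the context returns an error rather than sending a prompt with a hole in it.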

Critical reminders

The critical_reminder block is injected as a system-reminder message at the top of every turn, not just the first. This is the standard claude-code anti-drift mechanism — when the model wanders mid-conversation (e.g. stops citing evidence after turn 8), the reminder pulls it back.

Use it sparingly. One short paragraph per persona is plenty. The agent kernel already injects framework-level reminders (locale, model name, available tools); your critical_reminder should add only persona-specific behavior.

Examples

Minimal specialist

markdown
---
name: disk_specialist
description: Diagnose disk pressure issues — usage, IO, mount points.
when_to_use: When the user asks about disk-full, slow IO, or mount errors.
tools: [host_probe, bash, query_promql]
permission_mode: read-only
model: zhipu/glm-4.7
---

# Disk specialist

Focus exclusively on disk-related questions. ...

Reviewer (omits global context)

markdown
---
name: change_reviewer
description: Review a proposed config change for blast radius.
when_to_use: When the user wants a second opinion on a destructive action.
tools: [expand_topology, read_repo, search_knowledge]
permission_mode: read-only
omit_claude_md: true
max_turns: 8
model: anthropic/claude-opus-4-7
critical_reminder: |
  Be a skeptic. Default to "do not proceed" unless evidence is overwhelming.
---

# Change reviewer

You are reviewing a proposed change. Your job is to find reasons the change
should NOT proceed. Approve only when you cannot find a reason to block.

See also