Agent persona format

An agent persona is a markdown file describing how one agent — a coordinator, an incident investigator, a specialist — behaves: which tools it can call, which model it runs on, how many ReAct turns it gets, and what system prompt it carries.

Source of truth: `Agent` in `internal/manager/biz/aiops/chatruntime/types.go`.

On-disk shape

markdown

---
name: incident_investigator
description: Walk an incident from symptom to root cause, calling host + observability tools.
when_to_use: When the user asks why an alert is firing, or shares an incident link.
tools:
  - expand_topology
  - find_topology_node
  - query_promql
  - search_logs
  - query_traceql
  - host_probe
  - bash
disallowed_tools: []
permission_mode: read-only
max_turns: 24
model: anthropic/claude-sonnet-4-6
critical_reminder: |
  Always show your evidence. Cite the PromQL / LogQL / file path. Never speculate
  beyond what the tools returned.
initial_prompt: |
  You are investigating incident {{.incident_id}} on device {{.device_id}}.
  Start by reading the incident summary.
background: false
omit_claude_md: false
metadata:
  os: [linux, darwin]
  requires:
    bins: []
    config: []
  ongrid:
    scope: manager
---

# Incident investigator

You are an SRE-grade incident investigator. Given an incident, your job is to:

1. Pull the alert detail and any attached evidence (alert summary, snapshot).
2. Expand the device's topology to understand the blast radius.
3. Query the relevant signal (metric / log / trace) to confirm the symptom.
4. Walk upstream services / underlying resources until you find the root cause.
5. Return an evidence-backed answer in plain language.

When the user asks a follow-up, stay grounded in tool output. If you cannot
verify a claim with a tool call, say so explicitly and stop.

The frontmatter is YAML, delimited by an opening and a closing `---` line. Everything after the closing `---` is the body, which is the system prompt the worker LLM sees. Whitespace and markdown formatting in the body are preserved verbatim.
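As a rough illustration of that split, the sketch below separates the frontmatter from the body using only the standard library. `splitPersona` is a hypothetical helper written for this page, not the loader's actual code (which lives in the parser files mentioned under Registration flow); it assumes the file starts with a `---` line and leaves YAML decoding to a separate step.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitPersona separates the YAML frontmatter from the markdown body.
// Hypothetical helper for illustration; the real loader may behave differently.
func splitPersona(src string) (frontmatter, body string, err error) {
	const delim = "---"
	lines := strings.Split(src, "\n")
	if strings.TrimRight(lines[0], "\r") != delim {
		return "", "", errors.New("persona must start with a --- line")
	}
	for i := 1; i < len(lines); i++ {
		if strings.TrimRight(lines[i], "\r") == delim {
			frontmatter = strings.Join(lines[1:i], "\n")
			// Everything after the closing delimiter is kept as-is,
			// including blank lines and indentation.
			body = strings.Join(lines[i+1:], "\n")
			return frontmatter, body, nil
		}
	}
	return "", "", errors.New("missing closing --- line")
}

func main() {
	src := "---\nname: disk_specialist\n---\n\n# Disk specialist\n"
	fm, body, err := splitPersona(src)
	if err != nil {
		panic(err)
	}
	fmt.Printf("frontmatter: %q\nbody: %q\n", fm, body)
}
```

The body is returned untouched so that the verbatim-preservation rule holds; trimming, if any is wanted, would be a decision for the caller.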

Frontmatter fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `name` | string | yes | Agent identifier used at spawn time (`/v1/agents/{name}`). |
| `description` | string | yes | Human-readable summary shown in the agent picker. |
| `when_to_use` | string | yes | Coordinator's spawn-decision hint. The coordinator reads this when deciding which specialist to invoke. |
| `tools` | string[] | no | Explicit tool whitelist. Empty = inherit from policy (every tool the user's role can see). |
| `disallowed_tools` | string[] | no | Blacklist applied after the whitelist. The blacklist takes precedence over the whitelist. |
| `permission_mode` | enum | no (default `read-only`) | `read-only`, `mutating-with-confirm`, `dual-sign-required`. Gates which tool classes can run without confirmation. |
| `max_turns` | int | no | Caps the worker's internal ReAct loop. Coordinator default applies if zero. |
| `model` | string | no | LLM identifier (`anthropic/claude-sonnet-4-6`, `openai/gpt-5.4`, `zhipu/glm-4.7`, etc.). Empty = inherit coordinator default. |
| `critical_reminder` | string | no | System-reminder block injected on every turn. Anti-drift mechanism. |
| `initial_prompt` | string | no | Prepended to the first user message at spawn. Supports Go template syntax over the spawn context (`{{.incident_id}}`, `{{.device_id}}`). |
| `background` | bool | no | `true` forces async execution (long-running workers). |
| `omit_claude_md` | bool | no | Skip inheriting the global system context. Used for tightly-scoped reviewer agents. |
| `metadata` | object | no | OS gate, required binaries / config keys, ongrid extensions (`scope`, `edge_runtime`, `edge_capabilities`). |

Unknown frontmatter keys are preserved (the parser stores them under `UnknownFields`) so future fields from openclaw / claude-code do not break the loader.
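The idea of keeping unrecognized keys rather than rejecting them can be sketched as follows. This assumes the frontmatter has already been decoded into a generic map; `splitUnknown` and `knownFields` are names invented for this example, and the real parser's `UnknownFields` handling may look different.

```go
package main

import (
	"fmt"
	"sort"
)

// knownFields lists the frontmatter keys documented in the table above.
var knownFields = map[string]bool{
	"name": true, "description": true, "when_to_use": true, "tools": true,
	"disallowed_tools": true, "permission_mode": true, "max_turns": true,
	"model": true, "critical_reminder": true, "initial_prompt": true,
	"background": true, "omit_claude_md": true, "metadata": true,
}

// splitUnknown partitions decoded frontmatter into documented keys and
// everything else. Unrecognized keys are kept, never dropped or rejected.
func splitUnknown(raw map[string]any) (known, unknown map[string]any) {
	known, unknown = map[string]any{}, map[string]any{}
	for k, v := range raw {
		if knownFields[k] {
			known[k] = v
		} else {
			unknown[k] = v
		}
	}
	return known, unknown
}

func main() {
	raw := map[string]any{
		"name":  "disk_specialist",
		"model": "zhipu/glm-4.7",
		"color": "blue", // hypothetical key from some future format revision
	}
	_, unknown := splitUnknown(raw)
	keys := make([]string, 0, len(unknown))
	for k := range unknown {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	fmt.Println("preserved unknown keys:", keys)
}
```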

Source field

When the SPA reads back an agent, the API also includes a `source` field that is not part of the on-disk frontmatter:

| Value | Meaning |
| --- | --- |
| `builtin` | Shipped in the binary (programmatic `Add()`). Read-only in the UI. |
| `disk` | Loaded from `agents/*.md` next to the binary or under an external dir. Read-only in the UI. |
| `user` | Created by the user via `POST /v1/agents/custom`. Editable and deletable from the UI. |

Permission modes

The `permission_mode` field gates which tool classes the persona can run.

| Mode | Allowed classes | Confirmation required |
| --- | --- | --- |
| `read-only` | `read` (alias `safe`) | never |
| `mutating-with-confirm` | `read` + `write` | once per `write` call |
| `dual-sign-required` | `read` + `write` + `destructive` | two-step SOP for `destructive`; once per `write` call |
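One way to read the table is as a lookup from (mode, tool class) to an outcome. The sketch below encodes it that way. The `Decision` type and `decide` function are hypothetical names for this example, and treating a class outside the allowed set as an outright denial is an assumption; the table itself only says which classes are allowed.

```go
package main

import "fmt"

// Decision is what this sketch says should happen to one tool call.
type Decision string

const (
	Deny        Decision = "deny"
	Allow       Decision = "allow"
	ConfirmOnce Decision = "confirm-once"
	TwoStepSOP  Decision = "two-step-sop"
)

// decide maps a permission_mode and a tool class to a Decision,
// following the permission-mode table row by row.
func decide(mode, class string) Decision {
	if class == "safe" {
		class = "read" // documented alias
	}
	if mode == "" {
		mode = "read-only" // documented default
	}
	switch mode {
	case "read-only":
		if class == "read" {
			return Allow
		}
	case "mutating-with-confirm":
		switch class {
		case "read":
			return Allow
		case "write":
			return ConfirmOnce
		}
	case "dual-sign-required":
		switch class {
		case "read":
			return Allow
		case "write":
			return ConfirmOnce
		case "destructive":
			return TwoStepSOP
		}
	}
	return Deny // assumption: anything not listed in the table is refused
}

func main() {
	for _, mode := range []string{"read-only", "mutating-with-confirm", "dual-sign-required"} {
		for _, class := range []string{"read", "write", "destructive"} {
			fmt.Printf("%-22s %-12s -> %s\n", mode, class, decide(mode, class))
		}
	}
}
```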

A persona can further constrain its tool set via `tools` (whitelist) and `disallowed_tools` (blacklist). The runtime applies them in this order:

1. Take the global tool set the user's role can see.
2. Intersect with `tools` if non-empty.
3. Remove anything in `disallowed_tools`.
4. For each remaining tool, check `permission_mode` against its class.
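The four steps above can be sketched as a single filtering pass. Everything here is illustrative: the `Tool` type, the `resolveTools` function, and the tool classes assigned in `main` are assumptions made for the example, and the real runtime may perform the class check at call time rather than when the tool list is built.

```go
package main

import "fmt"

// Tool is a hypothetical descriptor: a tool name plus its class
// ("read", "write", or "destructive").
type Tool struct {
	Name  string
	Class string
}

// allowedClasses mirrors the "Allowed classes" column of the permission-mode table.
var allowedClasses = map[string]map[string]bool{
	"read-only":             {"read": true},
	"mutating-with-confirm": {"read": true, "write": true},
	"dual-sign-required":    {"read": true, "write": true, "destructive": true},
}

func toSet(names []string) map[string]bool {
	set := make(map[string]bool, len(names))
	for _, n := range names {
		set[n] = true
	}
	return set
}

// resolveTools applies the four steps in order and returns the surviving tool names.
func resolveTools(visible []Tool, whitelist, blacklist []string, mode string) []string {
	allow, deny := toSet(whitelist), toSet(blacklist)
	classes := allowedClasses[mode]
	var out []string
	for _, t := range visible { // step 1: start from the role-visible set
		if len(allow) > 0 && !allow[t.Name] {
			continue // step 2: intersect with tools when it is non-empty
		}
		if deny[t.Name] {
			continue // step 3: the blacklist wins over the whitelist
		}
		if !classes[t.Class] {
			continue // step 4: the class must be allowed by permission_mode
		}
		out = append(out, t.Name)
	}
	return out
}

func main() {
	visible := []Tool{
		{"host_probe", "read"},
		{"query_promql", "read"},
		{"bash", "write"},            // class chosen for the example only
		{"restart_service", "write"}, // hypothetical tool
	}
	got := resolveTools(visible, []string{"host_probe", "bash", "query_promql"}, nil, "read-only")
	fmt.Println(got)
}
```

With these made-up classes, a `read-only` persona that whitelists `bash` still ends up without it, which is the kind of interaction the ordering is meant to make predictable.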

Registration flow

text

Built-in personas
  ↳ programmatic Add() in cmd/ongrid/main.go at startup. Cannot be deleted.

On-disk personas
  ↳ ./agents/*.md (relative to manager working dir) scanned at boot.
  ↳ ONGRID_AGENTS_EXTERNAL_DIRS adds more.
  ↳ The loader walks every .md, parses frontmatter via skill_parser.go / agent_parser.go.
  ↳ Each agent is registered with Source="disk".
  ↳ Cannot be edited or deleted via the UI; remove the file and restart.

User personas
  ↳ POST /v1/agents/custom with the frontmatter as a JSON body.
  ↳ Stored in agents table (DB), not on disk.
  ↳ Source="user"; fully editable via PATCH /v1/agents/custom/{name}.

The merge order at startup is builtin → disk → user. A user persona with the same name as a built-in or disk persona shadows it.
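The shadowing rule falls out naturally if each layer is written into a name-keyed registry in merge order. The sketch below shows that idea with a hypothetical `Persona` type and `mergePersonas` function. Note that it also lets a disk persona overwrite a built-in of the same name, which is an assumption this page does not confirm.

```go
package main

import "fmt"

// Persona is a hypothetical, trimmed-down record for this example.
type Persona struct {
	Name   string
	Source string // "builtin", "disk", or "user"
}

// mergePersonas registers layers in the order given. A later layer
// overwrites an earlier entry with the same name, i.e. shadows it.
func mergePersonas(layers ...[]Persona) map[string]Persona {
	registry := make(map[string]Persona)
	for _, layer := range layers {
		for _, p := range layer {
			registry[p.Name] = p
		}
	}
	return registry
}

func main() {
	builtin := []Persona{{"incident_investigator", "builtin"}}
	disk := []Persona{{"disk_specialist", "disk"}}
	user := []Persona{{"incident_investigator", "user"}}

	reg := mergePersonas(builtin, disk, user) // builtin, then disk, then user
	fmt.Println(reg["incident_investigator"].Source)
	fmt.Println(reg["disk_specialist"].Source)
}
```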

Spawning an agent

The coordinator picks a specialist by matching the user's query against every persona's `when_to_use`. To spawn a specific agent programmatically, name it in a chat API request:

http

POST /api/v1/chat/sessions/{id}/messages
Content-Type: application/json
Authorization: Bearer ...

{
  "content": "Investigate incident 4217.",
  "agent": "incident_investigator",
  "context": { "incident_id": 4217, "device_id": 102 }
}

If `agent` is omitted, the coordinator chooses. `context` is templated into the persona's `initial_prompt`.
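To make the templating step concrete, the sketch below renders an `initial_prompt` against a spawn context with Go's standard `text/template` package. Whether the runtime uses `text/template` directly is an assumption, as is the `missingkey=error` option, which makes a missing context key fail loudly instead of rendering `<no value>`; `renderInitialPrompt` is a name invented for this example.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderInitialPrompt fills the template with values from the spawn context.
// Hypothetical helper; strictness on missing keys is this sketch's choice.
func renderInitialPrompt(tmpl string, ctx map[string]any) (string, error) {
	t, err := template.New("initial_prompt").Option("missingkey=error").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, ctx); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	tmpl := "You are investigating incident {{.incident_id}} on device {{.device_id}}.\n" +
		"Start by reading the incident summary."
	// Same values as the "context" object in the request above.
	ctx := map[string]any{"incident_id": 4217, "device_id": 102}

	out, err := renderInitialPrompt(tmpl, ctx)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```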

Critical reminders

The `critical_reminder` block is injected as a system-reminder message at the top of every turn, not just the first. This is the standard claude-code anti-drift mechanism: when the model wanders mid-conversation (e.g. stops citing evidence after turn 8), the reminder pulls it back.

Use it sparingly. One short paragraph per persona is plenty. The agent kernel already injects framework-level reminders (locale, model name, available tools); your `critical_reminder` should add only persona-specific behavior.
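The per-turn injection can be pictured as rebuilding the message list on every turn, with the reminder placed ahead of the conversation history. This is only a sketch of the idea: the `Message` type, the `buildTurn` function, and the `<system-reminder>` wrapper text are all assumptions for illustration, not the kernel's actual wire format.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a hypothetical chat message shape for this example.
type Message struct {
	Role    string
	Content string
}

// buildTurn assembles the messages for one turn. The reminder is re-added
// on every call, so it is present on turn 8 just as it was on turn 1.
func buildTurn(systemPrompt, reminder string, history []Message) []Message {
	msgs := []Message{{Role: "system", Content: systemPrompt}}
	if r := strings.TrimSpace(reminder); r != "" {
		msgs = append(msgs, Message{
			Role:    "system",
			Content: "<system-reminder>\n" + r + "\n</system-reminder>", // assumed wrapper
		})
	}
	return append(msgs, history...)
}

func main() {
	history := []Message{{Role: "user", Content: "Investigate incident 4217."}}
	for turn := 1; turn <= 2; turn++ {
		msgs := buildTurn("You are an SRE-grade incident investigator.", "Always show your evidence.", history)
		fmt.Printf("turn %d: %d messages, second message role: %s\n", turn, len(msgs), msgs[1].Role)
		history = append(history, Message{Role: "assistant", Content: "..."})
	}
}
```

Because the reminder costs tokens on every turn, this framing also shows why a single short paragraph is the sensible upper bound.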

Examples

Minimal specialist

markdown

---
name: disk_specialist
description: Diagnose disk pressure issues — usage, IO, mount points.
when_to_use: When the user asks about disk-full, slow IO, or mount errors.
tools: [host_probe, bash, query_promql]
permission_mode: read-only
model: zhipu/glm-4.7
---

# Disk specialist

Focus exclusively on disk-related questions. ...

Reviewer (omits global context)

markdown

---
name: change_reviewer
description: Review a proposed config change for blast radius.
when_to_use: When the user wants a second opinion on a destructive action.
tools: [expand_topology, read_repo, search_knowledge]
permission_mode: read-only
omit_claude_md: true
max_turns: 8
model: anthropic/claude-opus-4-7
critical_reminder: |
  Be a skeptic. Default to "do not proceed" unless evidence is overwhelming.
---

# Change reviewer

You are reviewing a proposed change. Your job is to find reasons the change
should NOT proceed. Approve only when you cannot find a reason to block.
