Knowledge base

The knowledge base is what makes the agent's answers feel grounded — the LLM doesn't have to guess from training data because it can search your runbooks, your past postmortems, and your team's docs at every turn.

Storage shape

One vector store, four source types.

| Source type | Where it comes from | Read-only? |
| --- | --- | --- |
| vault | The built-in ongridio/vault playbook collection (96 Markdown docs as of 2026-05-29) | Yes |
| repo | Git URLs you register: public HTTPS, or SSH via per-repo identity | No (edit settings + re-sync) |
| upload | Direct uploads (md / txt / pdf / docx) via the SPA's /knowledge page | No (edit / delete row) |
| manual | Paste markdown via the inline editor | No |

All four types share one Qdrant collection (ongrid_knowledge) and one embedding model. Search hits return the source_type payload field so the UI can render the right "from vault" / "from your repos" badge.

ADR-028 — knowledge sources, tiered

Why one collection: the alternative (one collection per source type) forced the LLM to call N searches per question to get coverage, blowing the tool-call budget. One collection + source_type filter gets the same isolation with one query.
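
To make the "one query plus filter" idea concrete, here is a minimal sketch of a search request body with an optional source_type filter. It assumes the request shape of Qdrant's REST points search (vector, limit, with_payload, and a filter with a must list of key/match conditions). The helper name searchBody is hypothetical, and the real qdrantx.SearchOpts may express the same filter differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// searchBody builds a JSON body in the shape of a Qdrant REST search request.
// An empty sourceType means "search every source in the collection".
// The function name and signature are illustrative, not part of qdrantx.
func searchBody(vector []float32, topK int, sourceType string) ([]byte, error) {
	body := map[string]any{
		"vector":       vector,
		"limit":        topK,
		"with_payload": true, // hits need the payload to report source_type
	}
	if sourceType != "" {
		// One "must" condition narrows the single collection to one source type.
		body["filter"] = map[string]any{
			"must": []map[string]any{
				{"key": "source_type", "match": map[string]any{"value": sourceType}},
			},
		}
	}
	return json.Marshal(body)
}

func main() {
	b, err := searchBody([]float32{0.1, 0.2}, 5, "vault")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

Leaving sourceType empty gives full coverage across all sources in one call, which is the case the single-collection design optimizes for.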

The Qdrant client

Narrow surface, in internal/manager/biz/knowledge/usecase.go:62:

```go
type QdrantClient interface {
    EnsureCollection(ctx context.Context, name string, dim int) error
    EnsurePayloadIndex(ctx context.Context, collection, field, schema string) error
    Upsert(ctx context.Context, collection string, points []qdrantx.Point) error
    DeleteByFilter(ctx context.Context, collection string, mustMatch map[string]any) error
    DeleteByID(ctx context.Context, collection string, id uint64) error
    GetPoints(ctx context.Context, collection string, ids []uint64) ([]qdrantx.SearchHit, error)
    Search(ctx context.Context, collection string, vector []float32, opts qdrantx.SearchOpts) ([]qdrantx.SearchHit, error)
    Scroll(ctx context.Context, collection string, opts qdrantx.ScrollOpts) (*qdrantx.ScrollResult, error)
}
```

Backed by internal/pkg/qdrantx in production, fake-able in tests. Qdrant ships in the docker-compose; external Qdrant is supported via ONGRID_QDRANT_URL.
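
As an illustration of what "fake-able in tests" can look like, the sketch below implements two of the interface's methods in memory. It is not the project's actual test double: Point is a hypothetical stand-in for qdrantx.Point (whose real fields are not shown here), and fakeQdrant, newFakeQdrant, matches, and count are invented names.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Point is a hypothetical stand-in for qdrantx.Point.
type Point struct {
	ID      uint64
	Vector  []float32
	Payload map[string]any
}

// fakeQdrant keeps points in memory, keyed by collection and then by point ID.
type fakeQdrant struct {
	mu     sync.Mutex
	points map[string]map[uint64]Point
}

func newFakeQdrant() *fakeQdrant {
	return &fakeQdrant{points: make(map[string]map[uint64]Point)}
}

// Upsert inserts points, overwriting any existing point with the same ID.
func (f *fakeQdrant) Upsert(_ context.Context, collection string, pts []Point) error {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.points[collection] == nil {
		f.points[collection] = make(map[uint64]Point)
	}
	for _, p := range pts {
		f.points[collection][p.ID] = p
	}
	return nil
}

// DeleteByFilter removes every point whose payload equals all pairs in mustMatch.
func (f *fakeQdrant) DeleteByFilter(_ context.Context, collection string, mustMatch map[string]any) error {
	f.mu.Lock()
	defer f.mu.Unlock()
	for id, p := range f.points[collection] {
		if matches(p.Payload, mustMatch) {
			delete(f.points[collection], id)
		}
	}
	return nil
}

// matches compares payload values with ==, so it assumes comparable values
// such as strings and numbers.
func matches(payload, mustMatch map[string]any) bool {
	for k, want := range mustMatch {
		if got, ok := payload[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// count reports how many points a collection currently holds.
func (f *fakeQdrant) count(collection string) int {
	f.mu.Lock()
	defer f.mu.Unlock()
	return len(f.points[collection])
}

func main() {
	ctx := context.Background()
	f := newFakeQdrant()
	_ = f.Upsert(ctx, "ongrid_knowledge", []Point{
		{ID: 1, Payload: map[string]any{"source": "vault"}},
		{ID: 2, Payload: map[string]any{"source": "repo", "repo_id": "7"}},
	})
	_ = f.DeleteByFilter(ctx, "ongrid_knowledge", map[string]any{"source": "vault"})
	fmt.Println(f.count("ongrid_knowledge")) // only the repo point is left
}
```

A fake like this is enough to assert that a filtered delete touches only the intended source, without starting a Qdrant container in unit tests.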

The vault

The vault is a public git repository (no auth) that ships pre-curated diagnostics playbooks. Topics covered (as of 2026-05-29):

  • Network: DNS / TCP / TLS / HTTP debugging recipes.
  • Kubernetes: pod / node / kubelet symptom-to-cause maps.
  • Linux: process / memory / disk / journald investigation paths.
  • Database: MySQL slow-log triage, PostgreSQL replication lag.
  • Diagnostics: 70+ symptom-keyed playbooks.

ADR-029 added cloud sync: the manager pulls the latest vault on a schedule (default 24h, configurable via the SPA settings) and re-embeds on changes. The on-disk checkout lives under /var/lib/ongrid/repos/<vault_id>/. The Qdrant points are tagged source=vault, so a sync deletes only vault points, never your own.

Don't treat vault as your repo

The vault is a built-in source — it never appears in the /knowledge/repos CRUD page. The code uses isBuiltinVault() / is_builtin to gate this; do not substring-match ongridio/vault (see the recurring-pitfalls feedback).
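
The difference between the flag check and a substring match is easiest to see on a user repository whose URL happens to contain the vault's name. The sketch below is illustrative only: Repository is a hypothetical trimmed-down row, the signature of isBuiltinVault is a guess, and looksLikeVault exists purely to show the anti-pattern.

```go
package main

import (
	"fmt"
	"strings"
)

// Repository is a hypothetical, trimmed-down row. Only IsBuiltin is meant to
// mirror the is_builtin flag mentioned above; the other fields are assumptions.
type Repository struct {
	Name      string
	URL       string
	IsBuiltin bool
}

// isBuiltinVault trusts the explicit flag and never inspects the URL.
func isBuiltinVault(r Repository) bool {
	return r.IsBuiltin
}

// looksLikeVault is the anti-pattern: it guesses from the URL text.
func looksLikeVault(r Repository) bool {
	return strings.Contains(r.URL, "ongridio/vault")
}

func main() {
	// A user-registered mirror whose path contains the vault's name.
	fork := Repository{
		Name: "our-playbooks",
		URL:  "https://git.example.com/acme/ongridio/vault-fork.git",
	}
	fmt.Println("flag check:     ", isBuiltinVault(fork))
	fmt.Println("substring check:", looksLikeVault(fork))
}
```

With the substring check, the mirror would be classified as the built-in vault and hidden from the CRUD page even though the user registered it.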

Your own repos

POST /v1/knowledge/repos with a { url, name } body registers a git repository. On the next sync tick the manager:

  1. git clone --depth=1 (or git pull if the dir already exists) into /var/lib/ongrid/repos/<id>/.
  2. Walks the tree for .md / .txt / .rst / .yaml / .yml / .toml / .json files.
  3. Embeds each file with the configured embedding model.
  4. Replaces the point-set for that repo (source=repo, repo_id=<id>) atomically.
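
Step 2 above can be sketched with the standard library's filepath.WalkDir. The extension list comes from the step itself; skipping the .git directory and matching extensions case-insensitively are assumptions, and the names isIndexable and indexableFiles are hypothetical.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// indexable lists the extensions named in step 2.
var indexable = map[string]bool{
	".md": true, ".txt": true, ".rst": true,
	".yaml": true, ".yml": true, ".toml": true, ".json": true,
}

// isIndexable reports whether a path has one of the listed extensions.
// Lower-casing the extension is an assumption of this sketch.
func isIndexable(path string) bool {
	return indexable[strings.ToLower(filepath.Ext(path))]
}

// indexableFiles walks root and returns matching files as paths relative to root.
func indexableFiles(root string) ([]string, error) {
	var out []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			if d.Name() == ".git" {
				return filepath.SkipDir // assumption: git metadata is never embedded
			}
			return nil
		}
		if isIndexable(path) {
			rel, relErr := filepath.Rel(root, path)
			if relErr != nil {
				return relErr
			}
			out = append(out, rel)
		}
		return nil
	})
	return out, err
}

func main() {
	root := "."
	if len(os.Args) > 1 {
		root = os.Args[1]
	}
	files, err := indexableFiles(root)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, f := range files {
		fmt.Println(f)
	}
}
```

Running it against a local checkout prints the files that would be handed to the embedding step.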

SSH auth

Per-repo identity, managed via the ssh_identities table and the SPA's /knowledge/identities page. The Repository row does NOT carry a provider field — the identity table is the single source of git creds (ADR-023).

GIT_SSH_COMMAND is set per-sync to point at the identity's key file, so each repo's clone uses its own key without polluting $HOME/.ssh/.
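
A per-sync environment along these lines could be built as follows. This is a sketch, not the manager's code: the helper names are hypothetical, the -o IdentitiesOnly=yes option is an added assumption (it stops ssh from offering other keys), and the key path is assumed to contain no spaces or shell metacharacters.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/exec"
)

// sshCommandFor returns a GIT_SSH_COMMAND value that pins ssh to one key file.
// It does no shell quoting, so keyPath must be a plain path.
func sshCommandFor(keyPath string) string {
	return "ssh -i " + keyPath + " -o IdentitiesOnly=yes"
}

// envWithIdentity copies base and appends GIT_SSH_COMMAND for this sync only.
func envWithIdentity(base []string, keyPath string) []string {
	env := append([]string{}, base...)
	return append(env, "GIT_SSH_COMMAND="+sshCommandFor(keyPath))
}

// cloneCmd prepares (but does not run) a shallow clone that uses the given key.
func cloneCmd(ctx context.Context, url, dest, keyPath string) *exec.Cmd {
	cmd := exec.CommandContext(ctx, "git", "clone", "--depth=1", url, dest)
	cmd.Env = envWithIdentity(os.Environ(), keyPath)
	return cmd
}

func main() {
	// Example values only; nothing is cloned because Run is never called.
	cmd := cloneCmd(context.Background(),
		"git@example.com:acme/runbooks.git", "/tmp/runbooks", "/tmp/id_ed25519")
	fmt.Println(cmd.Args)
	fmt.Println(cmd.Env[len(cmd.Env)-1])
}
```

Because the variable lives only in the child process environment, two syncs with different identities can run side by side without touching $HOME/.ssh/.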

HTTPS auth

Public HTTPS repos clone unauthenticated. Private HTTPS auth is parked: the original GitHub-PAT path was removed when SSH covered the realistic use cases. Until ADR-018's RepoFetcher lands, use SSH for private repos.

Uploads

The /knowledge/upload page accepts:

| Format | Extractor |
| --- | --- |
| .md, .txt | Raw read + chunk |
| .pdf | docextract (internal/pkg/docextract) |
| .docx | docextract |

Each upload becomes a row in knowledge_docs and a point in Qdrant with source=upload. Inline edit and delete work per-row from the "Organization knowledge" tree.
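
The format table amounts to a dispatch on file extension. A minimal sketch, assuming case-insensitive matching and rejection of anything not listed; extractorFor and its return labels are hypothetical and do not reflect the docextract API:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// extractorFor names the extraction path for an uploaded file, following the
// format table. The labels "raw" and "docextract" are illustrative only.
func extractorFor(filename string) (string, error) {
	ext := strings.ToLower(filepath.Ext(filename))
	switch ext {
	case ".md", ".txt":
		return "raw", nil // read as-is, then chunk
	case ".pdf", ".docx":
		return "docextract", nil // text extraction first, then chunk
	default:
		return "", fmt.Errorf("unsupported upload format %q", ext)
	}
}

func main() {
	for _, name := range []string{"runbook.md", "postmortem.pdf", "diagram.png"} {
		kind, err := extractorFor(name)
		if err != nil {
			fmt.Println(name, "->", err)
			continue
		}
		fmt.Println(name, "->", kind)
	}
}
```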

The LLM gets a query_knowledge BaseTool — schema in query_knowledge_basetool.go. Args: query, optional source_type filter, optional top_k. Returns: a list of {score, source, title, snippet, url?} hits.
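
For orientation, the argument and result shapes described above could be modeled as below. The JSON field names come from the description; the Go types, the omitempty choices, and the helper names are assumptions, and the authoritative schema remains the one in query_knowledge_basetool.go. The sample query and hit are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// QueryKnowledgeArgs mirrors the documented arguments. Types are assumed.
type QueryKnowledgeArgs struct {
	Query      string `json:"query"`
	SourceType string `json:"source_type,omitempty"` // optional filter
	TopK       int    `json:"top_k,omitempty"`       // optional result cap
}

// KnowledgeHit mirrors one entry of the documented result list.
type KnowledgeHit struct {
	Score   float64 `json:"score"`
	Source  string  `json:"source"`
	Title   string  `json:"title"`
	Snippet string  `json:"snippet"`
	URL     string  `json:"url,omitempty"` // optional in the description
}

// encodeArgs serializes tool arguments, omitting unset optional fields.
func encodeArgs(a QueryKnowledgeArgs) (string, error) {
	b, err := json.Marshal(a)
	return string(b), err
}

// decodeHits parses a JSON list of hits.
func decodeHits(raw []byte) ([]KnowledgeHit, error) {
	var hits []KnowledgeHit
	if err := json.Unmarshal(raw, &hits); err != nil {
		return nil, err
	}
	return hits, nil
}

func main() {
	args, err := encodeArgs(QueryKnowledgeArgs{
		Query:      "why is the pod in CrashLoopBackOff",
		SourceType: "vault",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(args)

	// Made-up response, for illustration only.
	hits, err := decodeHits([]byte(`[{"score":0.82,"source":"vault","title":"Pod restarts","snippet":"Check the previous container logs first."}]`))
	if err != nil {
		panic(err)
	}
	for _, h := range hits {
		fmt.Printf("%.2f %s %s\n", h.Score, h.Source, h.Title)
	}
}
```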

The coordinator persona is told to call this early for any "how do I" / "why does X happen" / "what's the runbook for Y" question — the cost of one knowledge search is much lower than a 5-tool investigation that lands on the same answer the playbook already had.

The chat UI also exposes a Quick Action button "Search knowledge" so operators can hit the same index without going through chat.

See also

  • Skills: query_knowledge is one of the 18 bridged BaseTools surfaced in the skill registry.
  • RCA: the investigator persona that calls knowledge search as a first-resort tool.
  • Architecture: where Qdrant sits in the L1 stack.