Kubernetes

Ongrid runs on Kubernetes, but Kubernetes is not a first-class deployment target. The default install path is Docker Compose on a single Linux host, because that is the fastest route from git clone to a working https://your-host/. K8s is for operators who already run a Kubernetes platform and want to fit Ongrid into it.

Helm chart status

There is no official Helm chart yet. The release tarball ships deploy/docker-compose.yml and the install scripts that wrap it — no deploy/charts/, no kustomize/. If you need to deploy on K8s today, you translate the compose file by hand. A community chart is on the roadmap; track it on the GitHub issue tracker.

The services the compose file gives you, and the K8s equivalents you will need:

Compose service	K8s equivalent
`mysql`	StatefulSet (1 replica) + headless Service + PVC
`ongrid`	Deployment (1 replica) + ClusterIP Service for `:8080` (`/api`) and `:9100` (`/metrics`)
`nginx` (front door)	Deployment + LoadBalancer/Ingress for `:443` and `:80`
`frontier`	Deployment + LoadBalancer/NodePort for the tunnel port `:40012`
`prometheus`	StatefulSet + PVC, or your existing kube-prom-stack
`loki`	StatefulSet + PVC, or your existing loki-distributed
`tempo`	StatefulSet + PVC
`grafana`	Deployment + PVC, or your existing grafana
`searxng`	Deployment + ClusterIP
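
As an illustration of the first row, here is a minimal sketch of the mysql StatefulSet with its headless Service. The image tag, storage size, and the mysql-root Secret are assumptions for the example, not values taken from the compose file; copy the real ones from deploy/docker-compose.yml.

```yaml
# Sketch only. Image tag, PVC size, and Secret name are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: mysql
  namespace: ongrid
spec:
  clusterIP: None            # headless: gives the pod a stable DNS name
  selector: { app: mysql }
  ports:
  - { name: mysql, port: 3306 }
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
  namespace: ongrid
spec:
  serviceName: mysql         # must match the headless Service above
  replicas: 1
  selector: { matchLabels: { app: mysql } }
  template:
    metadata: { labels: { app: mysql } }
    spec:
      containers:
      - name: mysql
        image: mysql:8.0     # assumption: use the tag from the compose file
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom: { secretKeyRef: { name: mysql-root, key: password } }
        ports:
        - { containerPort: 3306 }
        volumeMounts:
        - { name: data, mountPath: /var/lib/mysql }
  volumeClaimTemplates:
  - metadata: { name: data }
    spec:
      accessModes: ["ReadWriteOnce"]
      resources: { requests: { storage: 20Gi } }   # assumption: size to your retention
```

The other StatefulSet rows (prometheus, loki, tempo) follow the same shape with their own image, port, and data path.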

Use ConfigMap + Secret for the ONGRID_* env. Mount the same TLS cert into the nginx pod that you mount in the compose stack.
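
A minimal sketch of that split, assuming non-sensitive URLs go in a ConfigMap and the database DSN goes in a Secret. The object names (ongrid-env, ongrid-secrets), the in-cluster URLs, and the placeholder DSN are hypothetical; only the ONGRID_* variable names come from this page.

```yaml
# Sketch only. Object names and URL values are hypothetical.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ongrid-env
  namespace: ongrid
data:
  ONGRID_PROM_URL: "http://prometheus.ongrid.svc:9090"
  ONGRID_LOG_QUERY_URL: "http://loki.ongrid.svc:3100"
  ONGRID_TRACE_QUERY_URL: "http://tempo.ongrid.svc:3200"
---
apiVersion: v1
kind: Secret
metadata:
  name: ongrid-secrets
  namespace: ongrid
type: Opaque
stringData:
  ONGRID_DB_DSN: "<your MySQL DSN>"   # placeholder: use the DSN format from your compose env
```

The manager container can then load both with envFrom (one configMapRef and one secretRef), which keeps the pod spec free of individual env entries.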

Should the manager run on K8s?

Honest answer: it works, but you do not get much out of it. The manager is a single process with persistent state (MySQL) and is not horizontally scalable — running two pods behind a Service breaks the in-process alert evaluator, the chat session router, and the agent kernel's tool registry caches. So you are running a 1-replica Deployment with a PVC, which is fundamentally the same shape as a VM with Docker, just with more YAML.

Where K8s buys you something:

You already have ingress / cert-manager / external-DNS wired up — reuse them for the manager front door.
You already have a managed MySQL (RDS / CloudSQL / Vitess) — point ONGRID_DB_DSN at it and drop the StatefulSet.
You already have a Prometheus / Loki / Tempo stack — point ONGRID_PROM_URL, ONGRID_LOG_QUERY_URL, ONGRID_TRACE_QUERY_URL at them and drop those StatefulSets.

After the substitutions, "manager on K8s" reduces to two Deployments (manager + nginx) plus a Service per upstream you reused.
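
A minimal sketch of the manager half of that reduced setup, assuming MySQL is external and the ONGRID_* env lives in a ConfigMap and a Secret named ongrid-env and ongrid-secrets. Those names and the image name are hypothetical. The Recreate strategy is a suggestion that follows from the single-instance constraint: the default RollingUpdate strategy starts the new pod before the old one stops, which briefly runs two managers.

```yaml
# Sketch only. Image name and ConfigMap/Secret names are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ongrid
  namespace: ongrid
spec:
  replicas: 1                      # the manager is not horizontally scalable
  strategy: { type: Recreate }     # stop the old pod before starting the new one
  selector: { matchLabels: { app: ongrid } }
  template:
    metadata: { labels: { app: ongrid } }
    spec:
      containers:
      - name: ongrid
        image: your-registry/ongrid:vX.Y.Z
        envFrom:
        - configMapRef: { name: ongrid-env }
        - secretRef: { name: ongrid-secrets }
        ports:
        - { name: api, containerPort: 8080 }
        - { name: metrics, containerPort: 9100 }
---
apiVersion: v1
kind: Service
metadata:
  name: ongrid
  namespace: ongrid
spec:
  type: ClusterIP
  selector: { app: ongrid }
  ports:
  - { name: api, port: 8080, targetPort: api }
  - { name: metrics, port: 9100, targetPort: metrics }
```

The trade-off with Recreate is a short outage on every upgrade, which is the same behaviour you get when restarting the container on a single Docker host.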

Edge on K8s — daemonset pattern

If your workload runs on K8s and you want per-node observability, run the edge agent as a DaemonSet: one agent pod per node, with hostPath mounts for /proc, /sys, the node's journal, and the node's syslog.

Sketch (the syslog mount is omitted here):

yaml

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ongrid-edge
  namespace: ongrid
spec:
  selector: { matchLabels: { app: ongrid-edge } }
  template:
    metadata: { labels: { app: ongrid-edge } }
    spec:
      hostPID: true
      hostNetwork: false      # tunnel is outbound; no need for hostNetwork
      containers:
      - name: edge
        image: your-registry/ongrid-edge:vX.Y.Z
        env:
        - name: ONGRID_EDGE_CLOUD_ADDR
          value: ongrid.example.com:40012
        - name: ONGRID_EDGE_ACCESS_KEY
          valueFrom: { secretKeyRef: { name: ongrid-edge, key: access_key } }
        - name: ONGRID_EDGE_SECRET_KEY
          valueFrom: { secretKeyRef: { name: ongrid-edge, key: secret_key } }
        # Optional: only needed in rare cases (read-only network introspection
        # skills). Remove this securityContext if you do not use them.
        securityContext:
          capabilities: { add: ["NET_ADMIN"] }
        volumeMounts:
        - { name: proc,   mountPath: /host/proc, readOnly: true }
        - { name: sys,    mountPath: /host/sys,  readOnly: true }
        - { name: jrnl,   mountPath: /var/log/journal, readOnly: true }
      volumes:
      - { name: proc,   hostPath: { path: /proc } }
      - { name: sys,    hostPath: { path: /sys } }
      - { name: jrnl,   hostPath: { path: /var/log/journal } }

You will need a container image of ongrid-edge. Running make docker-ongrid-edge builds one, tagged ongrid-edge:<version>. Push it to your private registry and reference it in the DaemonSet's image field.

The same access/secret pair gets reused by every node. The manager's Edges page will show one row per node, deduplicated by hostname.
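
The DaemonSet sketch reads that pair from a Secret named ongrid-edge with the keys access_key and secret_key. A minimal sketch of the matching Secret follows; the values are placeholders for the pair you already hold.

```yaml
# Sketch only. Replace the placeholders with your real access/secret pair.
apiVersion: v1
kind: Secret
metadata:
  name: ongrid-edge        # matches secretKeyRef.name in the DaemonSet
  namespace: ongrid        # must be the DaemonSet's namespace
type: Opaque
stringData:
  access_key: "<your access key>"
  secret_key: "<your secret key>"
```

Using stringData lets you write the values in plain text; the API server stores them base64-encoded under data.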

Edge on K8s — sidecar pattern

If you only want to observe one app — not the whole node — run the edge agent as a sidecar in that app's pod. Same image, same env vars, no hostPath. You lose host metrics (the gopsutil collector will return the pod's view, not the node's) but gain narrower scope.

This is the right shape when:

Multiple teams share a cluster and each team wants its own Ongrid manager / tenant.
You do not have permission to run privileged DaemonSets.
You only care about one workload's logs and traces.
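
A minimal sketch of the sidecar shape, assuming a hypothetical application called my-app in a my-team namespace. The app image, names, and namespace are placeholders; the edge container reuses the image and env vars from the DaemonSet pattern and has no hostPath mounts or hostPID.

```yaml
# Sketch only. App name, namespace, and app image are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: my-team
spec:
  replicas: 1
  selector: { matchLabels: { app: my-app } }
  template:
    metadata: { labels: { app: my-app } }
    spec:
      containers:
      - name: app
        image: your-registry/my-app:vX.Y.Z
      - name: edge                               # the sidecar
        image: your-registry/ongrid-edge:vX.Y.Z
        env:
        - name: ONGRID_EDGE_CLOUD_ADDR
          value: ongrid.example.com:40012
        - name: ONGRID_EDGE_ACCESS_KEY
          valueFrom: { secretKeyRef: { name: ongrid-edge, key: access_key } }
        - name: ONGRID_EDGE_SECRET_KEY
          valueFrom: { secretKeyRef: { name: ongrid-edge, key: secret_key } }
```

A secretKeyRef can only see Secrets in the pod's own namespace, so the ongrid-edge Secret has to exist in my-team as well.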

What does NOT work

ongrid running across multiple replicas. The manager is stateful in-process: alert evaluator state, chat session router, agent kernel caches are all single-instance. A 2-replica Deployment will produce duplicate notifications, race the alert evaluator, and split chat sessions across instances. Run one replica.
The curl-pipe installer inside a pod. It hard-fails without systemd. Use the image instead.
The ADR-024 staged-bundle upgrade. The hook is a Linux shell script wired into systemd. On K8s you upgrade by changing the image tag and rolling the DaemonSet — the in-place swap path is bypassed.
NodePort for port 40012. Works, but every edge connection re-hashes when the node behind the NodePort changes. Prefer a LoadBalancer for the tunnel listener, or expose the frontier broker on a dedicated node.
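
For the last point, a minimal sketch of a LoadBalancer Service in front of the frontier tunnel listener. The app: frontier selector label is an assumption about how you labelled the frontier pods, and the sketch assumes the tunnel runs over TCP, which is the Service default.

```yaml
# Sketch only. Selector label is hypothetical; protocol defaults to TCP.
apiVersion: v1
kind: Service
metadata:
  name: frontier
  namespace: ongrid
spec:
  type: LoadBalancer
  selector: { app: frontier }
  ports:
  - { name: tunnel, port: 40012, targetPort: 40012 }
```

Point ONGRID_EDGE_CLOUD_ADDR on the edge agents at the load balancer's address (or a DNS name for it) on port 40012.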

Roadmap

A Helm chart with sensible defaults (1-replica manager, optional bundled MySQL / Prom / Loki / Tempo).
A CRD-driven operator that takes a Manager and EdgePool resource, generates the access/secret pair automatically, and rolls upgrades.
A k8s-native edge plugin that reads pod logs via the kubelet API instead of journald, so cluster-local logs do not require hostPath.

None of these are committed to a release; see the GitHub issues if you want to help drive them.
