Linux (server)

The manager ships as a Docker image (ongrid:<version>) and is operated through docker compose. Both the image and the compose file come from the release tarball produced by make package; there is no apt install ongrid and no host-side daemon.

Supported distributions

We run the manager day-to-day on the following hosts. Anything that runs Docker Engine 24+ should work; the list below is what gets explicit testing per release.

Distribution	Tested on
Ubuntu	22.04 LTS, 24.04 LTS
Debian	12 (bookworm)
RHEL	9
Rocky Linux	9
AlmaLinux	9

CentOS 7 is not supported: it has reached upstream EOL, and its glibc and systemd are too old for the bundled Compose stack.

Required software

Docker Engine 24.0 or later. Older releases lack --platform plumbing that make docker-build relies on and break extra_hosts: host-gateway.
Docker Compose v2 (the docker compose CLI plugin, not the legacy docker-compose binary). The compose file at deploy/docker-compose.yml deliberately omits the top-level version: key, which v1 rejects.
systemd for the host. The manager itself does not register a systemd unit — the install script deploy/install/server-install.sh only relies on systemd to start Docker — but auto-restart on reboot needs docker.service enabled.
iptables or nftables. Either firewall backend works.

No required kernel modules beyond what Docker itself needs (overlayfs, namespaces, cgroups v1 or v2).
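
The version requirements above can be checked before running the install script. What follows is a minimal sketch, not something shipped in the release: the function names version_ge and preflight are hypothetical, and the comparison assumes GNU sort -V is available on the host.

```shell
#!/bin/sh
# version_ge A B: succeeds when dotted version A is greater than or equal to B.
# Assumes GNU sort with -V (version sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# preflight: check Docker Engine, the Compose v2 plugin, and docker.service.
preflight() {
  engine="$(docker version --format '{{.Server.Version}}')" || {
    echo "cannot reach the Docker daemon" >&2
    return 1
  }
  version_ge "$engine" 24.0 || {
    echo "Docker Engine $engine is older than 24.0" >&2
    return 1
  }
  compose="$(docker compose version --short)" || {
    echo "docker compose plugin not found (legacy docker-compose does not count)" >&2
    return 1
  }
  version_ge "${compose#v}" 2.0 || {
    echo "Compose $compose is older than v2" >&2
    return 1
  }
  # Not fatal: the stack runs, but will not come back after a reboot.
  systemctl is-enabled --quiet docker.service ||
    echo "warning: docker.service is not enabled at boot" >&2
  echo "ok: engine $engine, compose $compose"
}
```

Source the file and run preflight on the target host. Nothing is executed until the function is called.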

Resource floors

Resource	Floor	Comfortable	Why
CPU	2 vCPU	4 vCPU	The manager Go process, Prometheus, Loki, Tempo, Grafana, MySQL all live on the same host.
Memory	4 GB	8 GB	Prometheus + Loki together can hit 1.5 GB at idle. Headroom keeps the OOM killer off MySQL.
Disk	20 GB	100 GB	Prometheus retention is 90 days / 20 GB by default (see `--storage.tsdb.retention.*` in the compose file). Loki + Tempo share the rest.
Network	10 Mbps	100 Mbps	Telemetry is the data plane — log shipping from many edges can saturate a thin pipe long before the manager itself does.

The 20 GB floor assumes you only test with a handful of edges. Once you cross ~25 hosts shipping metrics + logs + traces, plan on 200 GB just to keep 30 days of Loki around.
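
To compare a host against the floors in the table, something like the following sketch could be used. The function names are hypothetical. It assumes Docker's default data root (/var/lib/docker), measures available rather than total disk space, and rounds memory to the nearest GB, because a nominal 4 GB VM usually reports slightly less in /proc/meminfo.

```shell
#!/bin/sh
# check_floor NAME ACTUAL FLOOR UNIT: report one resource, fail if below the floor.
check_floor() {
  if [ "$2" -ge "$3" ]; then
    echo "ok   $1: $2 $4 (floor: $3 $4)"
  else
    echo "LOW  $1: $2 $4 (floor: $3 $4)" >&2
    return 1
  fi
}

# host_floors [PATH]: compare this host against the floors in the table.
# PATH defaults to Docker's default data root (an assumption; adjust if moved).
host_floors() {
  data_root="${1:-/var/lib/docker}"
  cpus="$(nproc)"
  # MemTotal is in kB; round to the nearest GB.
  mem_gb="$(awk '/^MemTotal:/ { print int($2 / 1024 / 1024 + 0.5) }' /proc/meminfo)"
  # Column 4 of POSIX df output is available space in 1 kB blocks.
  disk_gb="$(df -Pk "$data_root" | awk 'NR == 2 { print int($4 / 1024 / 1024) }')"
  rc=0
  check_floor cpu "$cpus" 2 vCPU || rc=1
  check_floor memory "$mem_gb" 4 GB || rc=1
  check_floor disk "$disk_gb" 20 GB || rc=1
  return "$rc"
}
```

Calling host_floors prints one line per resource and returns non-zero if any floor is missed.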

Ports

Port	Direction	Protocol	What it is
443	inbound	HTTPS	nginx — SPA, REST API, log/trace ingest
80	inbound	HTTP	nginx redirect to 443
40012	inbound	TCP (geminio)	Frontier broker — every edge dials this
3306	exposed locally	TCP	MySQL — bind to `127.0.0.1` in production
9090	exposed locally	HTTP	Prometheus — bind to `127.0.0.1` in production
9100	inbound (optional)	HTTP	Manager `/metrics` — for an external Prometheus to scrape

Only 443, 80, and 40012 need to be reachable from the public internet (or your edge fleet's network). Everything else stays bound to localhost in production, except 9100 if you let an external Prometheus scrape the manager. nginx fronts the API, and Grafana / Prometheus / Loki / Tempo are reached over the docker network.
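
One way to confirm that MySQL and Prometheus are not listening on a public address is to filter the output of ss. The helper below is a hypothetical sketch: it reads ss -ltnH output on stdin and prints every watched port that is bound to a non-loopback address. It assumes Docker's userland proxy is in use; with the proxy disabled, published ports may not show up in ss at all.

```shell
#!/bin/sh
# exposed_ports PORTS: PORTS is a comma-separated list such as 3306,9090.
# Reads `ss -ltnH` lines on stdin (field 4 is Local Address:Port) and prints
# each watched port that listens on something other than loopback.
exposed_ports() {
  awk -v ports="$1" '
    BEGIN {
      n = split(ports, want, ",")
      for (i = 1; i <= n; i++) watch[want[i]] = 1
    }
    {
      addr = $4
      port = addr
      sub(/.*:/, "", port)                       # keep text after the last colon
      host = substr(addr, 1, length(addr) - length(port) - 1)
      loopback = (host ~ /^127\./ || host == "[::1]")
      if ((port in watch) && !loopback && !seen[port]++) print port
    }'
}
```

Typical use is ss -ltnH | exposed_ports 3306,9090. Empty output means both ports are loopback-only or not listening.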

What the compose stack runs

From deploy/docker-compose.yml:

Service	Image	Role
`mysql`	mysql:8.0	Manager's primary store (users, edges, devices, alert rules, sessions).
`ongrid`	`ongrid:<version>`	Manager — Go binary built from `cmd/ongrid`.
`nginx`	`ongrid-web:<version>`	TLS termination, SPA, `/api/` reverse proxy, `/edge/` static for the install bundle.
`frontier`	singchia/frontier:1.2.5	Geminio broker; edge tunnel endpoint (ADR-007).
`prometheus`	prom/prometheus:v2.54.0	Metric store + remote_write receiver (ADR-009).
`loki`	grafana/loki:3.4.0	Log store (ADR-012).
`tempo`	grafana/tempo:2.5.0	Trace store (ADR-013).
`searxng`	searxng/searxng:latest	Self-hosted search backend for the `web_search` skill.
`grafana`	grafana/grafana-oss:11.1.4	Dashboard embed for the Monitor page.

The compose file is the production shape — there is no second "prod" file. Override knobs via the env block (ONGRID_* — see Environment variables) instead of forking the YAML.

TLS

nginx terminates TLS using certificates bind-mounted from deploy/certs/ (cert.pem, key.pem). On first boot the install script generates a self-signed cert. You can replace it with a real cert at any time; run docker compose restart nginx afterwards so nginx picks it up. Let's Encrypt is supported via a sidecar pattern documented in deploy/README.md.
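
Before restarting nginx with a replacement certificate, it can help to confirm that cert.pem and key.pem actually belong together. A minimal sketch, assuming the openssl CLI is installed on the host; the function name cert_matches_key is hypothetical.

```shell
#!/bin/sh
# cert_matches_key CERT KEY: succeeds when the certificate and the private key
# carry the same public key (compared as PEM-encoded SubjectPublicKeyInfo).
cert_matches_key() {
  cert_pub="$(openssl x509 -in "$1" -noout -pubkey)" || return 1
  key_pub="$(openssl pkey -in "$2" -pubout)" || return 1
  [ "$cert_pub" = "$key_pub" ]
}
```

For example: cert_matches_key deploy/certs/cert.pem deploy/certs/key.pem && docker compose restart nginx. A mismatch returns non-zero, so the restart is skipped.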

SELinux + AppArmor

Both work out of the box. The compose file does not mount sensitive host paths read-write; the bind mounts (./edge, ./nginx, ./certs, ./grafana/provisioning) are all :ro. If your SELinux policy enforces strict labels on bind mounts, append :Z to the mount strings inside docker-compose.override.yml.

Upgrading

make package           # produces dist/out/ongrid-vX.Y.Z-linux-amd64.tar.xz
scp dist/out/ongrid-vX.Y.Z-linux-amd64.tar.xz host:/tmp/
ssh host /tmp/upgrade.sh

upgrade.sh lives next to the tarball and does the idempotent dance: load the new image, write a temporary docker-compose.override.yml pinning the new tag, run docker compose up -d, and prune the previous image once the stack has been healthy for one minute. See Upgrade.
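
To illustrate the tag-pinning step only, the sketch below writes an override file that pins the two versioned images from the service table (ongrid and ongrid-web) to a given tag. It is not the contents of upgrade.sh: the function name and the exact override layout are assumptions, and the shipped script may do this differently.

```shell
#!/bin/sh
# write_pin_override TAG [FILE]: write a compose override pinning the manager
# and nginx images to TAG. FILE defaults to docker-compose.override.yml, which
# docker compose merges automatically when it sits next to docker-compose.yml
# and no -f flag is given.
write_pin_override() {
  tag="$1"
  out="${2:-docker-compose.override.yml}"
  cat > "$out" <<EOF
services:
  ongrid:
    image: ongrid:${tag}
  nginx:
    image: ongrid-web:${tag}
EOF
}
```

Running the function twice with the same tag produces the same file, so the step can be repeated safely.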

What does NOT work

Podman + podman-compose: the compose file uses Docker-specific shorthand (extra_hosts: host-gateway, :Z SELinux semantics) that podman either parses incompatibly or ignores. We do not test it.
K3s host-mode networking: works for the manager, but the edge installer is built around bare systemd — see Kubernetes for the daemonset pattern.
OpenRC / runit / s6: the manager container does not care, but the host's Docker daemon needs to come up on boot. If your init system can do that, you are fine.
