Environment variables
Both ongrid and ongrid-edge are configured exclusively through environment variables; neither binary reads a YAML config file of its own (the only YAML involved is the optional edge scrape config that ONGRID_EDGE_SCRAPE_CONFIG_FILE points at). The canonical wiring lives in internal/pkg/config/config.go and the compose env block in deploy/docker-compose.yml. Every variable below is read at startup; the manager does not hot-reload.
Tables group variables by subsystem. The default shown is the value the binary uses when the variable is unset or empty. "Required" means the feature that the variable configures will not work without it.
HTTP & metrics listeners
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_HTTP_ADDR | string | :8080 | TCP listen for the API + SPA. nginx in the compose stack proxies /api/* here. |
| ONGRID_METRICS_ADDR | string | :9100 | TCP listen for /metrics. Scraped by Prometheus. |
| ONGRID_TUNNEL_ADDR | string | :40012 | Geminio broker listen. Bound by the frontier service, not the manager itself. |
| ONGRID_PUBLIC_URL | string | empty | Canonical https://... URL the manager hands out to edges as the data-plane endpoint (logs/traces ingest, edge bundle download). Empty disables data plane plugin endpoints. Set this in production. |
Database (MySQL default, SQLite opt-in)
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_DB_DIALECT | string | mysql | mysql (default) or sqlite. Empty is treated as mysql. |
| ONGRID_DB_DSN | string | ongrid:ongrid@tcp(127.0.0.1:3306)/ongrid?parseTime=true&charset=utf8mb4&loc=Local | MySQL DSN. Required in production. |
| ONGRID_DB_PATH | string | ./data/ongrid.db | SQLite database file path. :memory: is accepted in tests. |
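The three database variables interact as follows: the dialect decides whether the DSN or the file path is consulted. A minimal sketch of that selection, assuming a hypothetical dbTarget function; rejecting unknown dialects with an error is an assumption, since the table only says that empty is treated as mysql.

```go
package main

import "fmt"

// dbTarget maps the ONGRID_DB_* values to a dialect and a connection target.
// Hypothetical helper; the error for unknown dialects is an assumption.
func dbTarget(dialect, dsn, path string) (string, string, error) {
	switch dialect {
	case "", "mysql": // empty is treated as mysql
		return "mysql", dsn, nil
	case "sqlite":
		return "sqlite", path, nil
	default:
		return "", "", fmt.Errorf("unsupported ONGRID_DB_DIALECT %q", dialect)
	}
}

func main() {
	d, target, err := dbTarget("sqlite", "", "./data/ongrid.db")
	fmt.Println(d, target, err)
}
```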
JWT (iam)
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_JWT_SECRET | string | dev-insecure-secret-change-me | HS256 signing key for access + refresh tokens. Required; the default refuses to issue tokens in production builds. |
| ONGRID_JWT_ACCESS_TTL | duration | 15m | Access-token TTL. |
| ONGRID_JWT_REFRESH_TTL | duration | 168h (7d) | Refresh-token TTL. |
Durations accept Go time.ParseDuration syntax (15m, 2h, 30s). A bare integer is interpreted as seconds.
LLM providers
The chat agent supports six first-class providers plus a Custom (OpenAI-compatible) slot. Each provider is gated by its API key: when the key is empty, the provider is not shown in the chat picker.
OpenAI
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_OPENAI_API_KEY | string | empty | OpenAI API key. Empty = OpenAI hidden from the picker. |
| ONGRID_OPENAI_MODEL | string | gpt-5.4 | Default model when OpenAI is the selected provider. |
| ONGRID_OPENAI_BASE_URL | string | empty | Override base URL for OpenAI-compatible relays (Azure / vLLM / Ollama / one-api). |
Anthropic, Zhipu, Gemini, DeepSeek, Kimi
Each of these providers takes the same three variables as OpenAI (API key, default model, base URL) plus a fourth, MODELS, a comma-separated list of selectable model slugs:
| Variable | Default |
|---|---|
| ONGRID_ANTHROPIC_API_KEY | empty |
| ONGRID_ANTHROPIC_MODEL | claude-sonnet-4-6 |
| ONGRID_ANTHROPIC_BASE_URL | empty |
| ONGRID_ANTHROPIC_MODELS | claude-opus-4-7,claude-sonnet-4-6,claude-haiku-4-5 |
| ONGRID_ZHIPU_API_KEY | empty |
| ONGRID_ZHIPU_MODEL | glm-4.7 |
| ONGRID_ZHIPU_BASE_URL | empty |
| ONGRID_ZHIPU_MODELS | glm-5.1,glm-5,glm-4.7,glm-4.7-flash |
| ONGRID_GEMINI_API_KEY | empty |
| ONGRID_GEMINI_MODEL | gemini-2.5-pro |
| ONGRID_GEMINI_BASE_URL | empty |
| ONGRID_GEMINI_MODELS | gemini-3.5-flash,gemini-2.5-pro,gemini-2.5-flash |
| ONGRID_DEEPSEEK_API_KEY | empty |
| ONGRID_DEEPSEEK_MODEL | deepseek-v4-flash |
| ONGRID_DEEPSEEK_BASE_URL | empty |
| ONGRID_DEEPSEEK_MODELS | deepseek-v4-pro,deepseek-v4-flash,deepseek-reasoner |
| ONGRID_KIMI_API_KEY | empty |
| ONGRID_KIMI_MODEL | kimi-k2.6 |
| ONGRID_KIMI_BASE_URL | empty |
| ONGRID_KIMI_MODELS | kimi-k2.6,kimi-k2.5,moonshot-v1-128k |
Routing & budget
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_LLM_DEFAULT_PROVIDER | string | empty | Provider used when a request does not specify one. Empty = first configured provider (alphabetical). Set this when you want a specific provider to be the site default. |
| ONGRID_LLM_DAILY_TOKEN_LIMIT | int | 0 | Global per-UTC-day token ceiling. 0 = unlimited. |
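The key gating and the alphabetical fallback combine roughly as in the sketch below. The function name, the lowercase provider identifiers, and the behaviour when ONGRID_LLM_DEFAULT_PROVIDER names a provider without an API key are all assumptions made for illustration; this page does not specify them.

```go
package main

import (
	"fmt"
	"sort"
)

// defaultProvider picks the provider for a request that names none.
// keys maps a provider name to its API key (empty key = not configured);
// configured is the value of ONGRID_LLM_DEFAULT_PROVIDER.
func defaultProvider(keys map[string]string, configured string) (string, bool) {
	var available []string
	for name, key := range keys {
		if key != "" { // empty key: provider is not surfaced at all
			available = append(available, name)
		}
	}
	sort.Strings(available)

	if configured != "" {
		for _, name := range available {
			if name == configured {
				return name, true
			}
		}
		// Assumption: an explicit default without a key yields no provider.
		return "", false
	}
	if len(available) == 0 {
		return "", false
	}
	return available[0], true // first configured provider, alphabetical
}

func main() {
	keys := map[string]string{"openai": "sk-example", "kimi": "k-example", "gemini": ""}
	name, ok := defaultProvider(keys, "")
	fmt.Println(name, ok)
}
```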
Agent kernel & tools
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_AGENT_KERNEL | string | graph | graph enables the eino graph kernel + SkillRegistry.Resolve activation-keyword filter + ToolBag deferral pipeline. legacy is the older for-loop runner with all tools always full-schema. Flip to legacy only to bisect. |
| ONGRID_TOOLBAG_DEFERRAL_THRESHOLD | int | 30 | Tool-count threshold above which specialty-tier tools get redacted schemas (LLM must call ToolSearch to expand). |
| ONGRID_SKILLS_EXTERNAL_DIRS | csv | empty | Comma/colon-separated absolute paths the skill loader scans for skill.json manifests. Each must be absolute; relative or missing entries are skipped with a log line. |
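The splitting rule for ONGRID_SKILLS_EXTERNAL_DIRS can be sketched as follows. The helper name is hypothetical, trimming of surrounding whitespace is an assumption, Unix-style paths are assumed (a colon separator would clash with Windows drive letters), and the check for directories that do not exist on disk is left out to keep the example short.

```go
package main

import (
	"fmt"
	"log"
	"path/filepath"
	"strings"
)

// splitSkillDirs splits a comma/colon-separated list and keeps only
// absolute paths, logging each entry it skips. Hypothetical helper.
func splitSkillDirs(raw string) []string {
	fields := strings.FieldsFunc(raw, func(r rune) bool {
		return r == ',' || r == ':'
	})
	var dirs []string
	for _, f := range fields {
		p := strings.TrimSpace(f)
		if p == "" {
			continue
		}
		if !filepath.IsAbs(p) {
			log.Printf("skipping non-absolute skill dir %q", p)
			continue
		}
		dirs = append(dirs, filepath.Clean(p))
	}
	return dirs
}

func main() {
	fmt.Println(splitSkillDirs("/opt/ongrid/skills,relative/dir:/srv/skills"))
}
```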
Frontier broker client
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_FRONTIER_ADDR | string | frontier:40011 | Service-bound listen address of the upstream frontier broker that the manager dials. |
| ONGRID_FRONTIER_SERVICE_NAME | string | ongrid-manager | Identifier reported on connect. |
| ONGRID_FRONTIER_DISABLED | bool | false | Skip the long-lived service-end dial entirely. Used by the e2e harness; features that require the broker (webssh, edge reverse calls) return an error at the call site. |
Cloud-side Prometheus
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_PROM_ENABLED | bool | false | Gates Prometheus wiring. When false the manager runs without metric storage; push_prom_samples silently drops, and query_promql is not registered as a tool. |
| ONGRID_PROM_URL | string | http://prometheus:9090 | Prom server root URL. |
| ONGRID_PROM_REMOTE_WRITE_URL | string | empty | Exact remote_write endpoint when the upstream is not rooted at /api/v1/write (Mimir / Cortex / VictoriaMetrics gateway). |
| ONGRID_PROM_QUERY_URL | string | empty | Query API root for query_promql. Empty falls back to ONGRID_PROM_URL. |
| ONGRID_PROM_TLS_INSECURE | bool | false | Skip TLS cert verification. |
| ONGRID_PROM_TLS_CA_FILE | string | empty | PEM file with the root CA used to verify the TSDB's cert. Empty = system roots. |
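The three URL variables resolve to two effective endpoints. The sketch below shows one plausible resolution; the type and function names are hypothetical, and appending /api/v1/write to ONGRID_PROM_URL when no explicit remote_write URL is set is inferred from the table wording rather than confirmed by it.

```go
package main

import (
	"fmt"
	"strings"
)

// promEndpoints holds the effective query root and remote_write endpoint.
type promEndpoints struct {
	Query       string
	RemoteWrite string
}

// resolveProm applies the fallbacks described in the table above.
// Hypothetical helper; the default write path is an inference.
func resolveProm(baseURL, queryURL, remoteWriteURL string) promEndpoints {
	ep := promEndpoints{Query: queryURL, RemoteWrite: remoteWriteURL}
	if ep.Query == "" {
		ep.Query = baseURL // ONGRID_PROM_QUERY_URL falls back to ONGRID_PROM_URL
	}
	if ep.RemoteWrite == "" {
		ep.RemoteWrite = strings.TrimRight(baseURL, "/") + "/api/v1/write"
	}
	return ep
}

func main() {
	fmt.Printf("%+v\n", resolveProm("http://prometheus:9090", "", ""))
}
```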
Grafana
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_GRAFANA_INTERNAL_URL | string | http://grafana:3000/grafana | URL the manager uses to reach Grafana over the docker network. |
| ONGRID_GRAFANA_BOOTSTRAP_USER | string | admin | One-time admin user used to auto-create the ongrid Service Account + token. |
| ONGRID_GRAFANA_BOOTSTRAP_PASSWORD | string | empty | Bootstrap password; empty disables bootstrap (paste a manually-created SA token in the UI). |
| ONGRID_GRAFANA_TLS_INSECURE | bool | false | Skip cert verification for the bootstrap call. |
| ONGRID_GRAFANA_ROOT_URL | string | %(protocol)s://%(domain)s/grafana/ | Forwarded to GF_SERVER_ROOT_URL. |
Logs & traces (data plane)
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_LOG_QUERY_URL | string | http://loki:3100 | Loki API root the manager talks to for query_range / labels / values. Empty = Logs page returns 503. |
| ONGRID_TRACE_QUERY_URL | string | http://tempo:3200 | Tempo HTTP listener root for /api/search, /api/traces/<id>, /api/search/tag/<tag>/values. Empty = Traces page returns 503. |
The edge data plane endpoints (where the logs / traces plugins POST) are derived from ONGRID_PUBLIC_URL; see Telemetry data plane.
Built-in alert thresholds
These drive the four canonical built-in rules over the closed set of host metrics. Setting a threshold to 0 disables that rule.
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_ALERT_ENABLED | bool | true | Master switch for built-in host alert evaluation. |
| ONGRID_ALERT_COOLDOWN | duration | 10m | Per-(edge, rule) cooldown. Notifications inside this window are suppressed. |
| ONGRID_ALERT_CPU_PERCENT | float | 90 | Fires when cpu_pct >= threshold. |
| ONGRID_ALERT_MEM_PERCENT | float | 90 | Fires when mem_pct >= threshold. |
| ONGRID_ALERT_DISK_USED_PERCENT | float | 90 | Fires when disk_used_pct >= threshold. |
| ONGRID_ALERT_LOAD1 | float | 0 | Fires when load1 >= threshold. 0 disables (load varies too widely across host shapes for a useful default). |
| ONGRID_ALERT_EVAL_INTERVAL | duration | 5m | How often the pipeline evaluator scans edges and queries Prom. |
| ONGRID_ALERT_EDGE_OFFLINE_THRESHOLD | duration | 90s | Heartbeat staleness above which an edge counts as offline. |
| ONGRID_ALERT_PROM_INGEST_FAIL_LIMIT | int | 5 | Consecutive remote_write failure count at which prom_ingest_fail fires. |
Notifications
Master switch + the four built-in channel types. UI-created channels carry their own enabled flag and are unaffected by these.
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_NOTIFY_ENABLED | bool | true | Master switch for outbound notifications. |
| ONGRID_NOTIFY_DEFAULT_CHANNELS | csv | empty | Ordered channel-name list used when a caller does not specify destinations. |
| ONGRID_NOTIFY_TIMEOUT | duration | 10s | Per-channel send timeout. |
| ONGRID_NOTIFY_WEBHOOK_ENABLED | bool | false | Enable the env-configured webhook channel. |
| ONGRID_NOTIFY_WEBHOOK_NAME | string | webhook | Display name. |
| ONGRID_NOTIFY_WEBHOOK_URL | string | empty | POST endpoint. |
| ONGRID_NOTIFY_WEBHOOK_SECRET | string | empty | Optional HMAC secret. |
| ONGRID_NOTIFY_SLACK_ENABLED | bool | false | Enable the env-configured Slack channel. |
| ONGRID_NOTIFY_SLACK_NAME | string | slack | Display name. |
| ONGRID_NOTIFY_SLACK_WEBHOOK_URL | string | empty | Incoming Webhook URL. |
| ONGRID_NOTIFY_FEISHU_ENABLED | bool | false | Enable the env-configured Larksuite / Feishu channel. |
| ONGRID_NOTIFY_FEISHU_NAME | string | feishu | Display name. |
| ONGRID_NOTIFY_FEISHU_WEBHOOK_URL | string | empty | Custom-Bot URL. |
| ONGRID_NOTIFY_FEISHU_SECRET | string | empty | Signing secret. |
| ONGRID_NOTIFY_DINGTALK_ENABLED | bool | false | Enable the env-configured DingTalk channel. |
| ONGRID_NOTIFY_DINGTALK_NAME | string | dingtalk | Display name. |
| ONGRID_NOTIFY_DINGTALK_WEBHOOK_URL | string | empty | Custom-Bot URL. |
| ONGRID_NOTIFY_DINGTALK_SECRET | string | empty | Signing secret. |
WeCom and Telegram channels are first-class but have no env-configuration shortcut; create them via the Settings → Channels UI.
Bootstrap admin
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_ADMIN_EMAIL | string | empty | Email of the bootstrap admin. If empty, no admin is seeded; you must register from the UI on first boot. |
| ONGRID_ADMIN_PASSWORD | string | empty | Initial password. Operator is expected to change it on first login. |
Edge agent
These are consumed by ongrid-edge; the installer writes them into /etc/ongrid-edge/ongrid-edge.env.
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_EDGE_CLOUD_ADDR | string | 127.0.0.1:40012 | Frontier broker host:port. The agent dials this with TLS. |
| ONGRID_EDGE_ACCESS_KEY | string | empty | Per-edge access key. Issued by the manager when you create an edge in the UI. |
| ONGRID_EDGE_SECRET_KEY | string | empty | Matching secret. Shown once at edge creation; rotate via the UI. |
| ONGRID_EDGE_COLLECTOR_MODE | string | off | off (default; the hostmetrics + procmetrics plugins handle metrics), auto (legacy embedded + scraper), embedded (embedded push only), scrape (multi-target HTTP scraper). |
| ONGRID_EDGE_SCRAPE_CONFIG_FILE | string | /etc/ongrid-edge/scrape.yaml | Path to the scrape config YAML. Only consulted when COLLECTOR_MODE=scrape. |
| ONGRID_EDGE_COLLECTOR_INTERVAL | duration | 10s | How often the embedded collector snapshots. Scrape mode ignores this. |
| ONGRID_EDGE_PLUGIN_BIN_DIR | string | /usr/local/lib/ongrid-edge | Directory holding plugin binaries (promtail, otelcol-contrib, node_exporter, process_exporter). |
| ONGRID_EDGE_PLUGIN_WORK_DIR | string | /var/lib/ongrid-edge/plugins | Per-plugin runtime dirs (configs, PID files, queue spool). |
| ONGRID_EDGE_UPGRADE_STAGE_DIR | string | /var/lib/ongrid-edge/.upgrade | ADR-024 staged-bundle directory. Empty disables remote whole-bundle upgrades. |
| ONGRID_INSTALL_WAIT | int | 20 | Seconds the curl-pipe installer polls the journal waiting for "registered with cloud". |
Embedding & RAG
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_EMBEDDING_PROVIDER | string | zhipu | zhipu (default; calls GLM embedding API), local (uses an on-disk bge model), openai (uses OpenAI embeddings). |
| ONGRID_EMBEDDING_LOCAL_MODEL_PATH | string | empty | Absolute path to the local model when PROVIDER=local. The release tarball stages bge-base-en-v1.5 under .cache/. |
| ONGRID_VAULT_REPO_URL | string | https://github.com/ongridio/vault | Upstream vault repository the manager pulls baseline knowledge from. Override for air-gapped mirrors. |
Locale & misc
| Variable | Type | Default | Description |
|---|---|---|---|
| ONGRID_DEFAULT_LOCALE | string | en | Default locale used by automatic LLM outputs (alert investigations, scheduled summaries). UI-triggered chat uses the user's UI locale instead. |
See also
- REST API — endpoints these env vars wire up.
- CLI — the two binaries' command-line flags.
- Telemetry data plane — why log/trace endpoints differ from the tunnel.
- Architecture — where each env var lives in the stack diagram.