Kimi

TL;DR

bash

ONGRID_KIMI_API_KEY=...
ONGRID_KIMI_MODEL=kimi-k2.6                 # default
ONGRID_KIMI_BASE_URL=                       # defaults to api.moonshot.cn/v1

Provider id: kimi. SDK adapter: OpenAI-compatible.

Env vars

Var	Default	Notes
`ONGRID_KIMI_API_KEY`	—	Empty = provider dropped
`ONGRID_KIMI_MODEL`	`kimi-k2.6`	Default model
`ONGRID_KIMI_BASE_URL`	`https://api.moonshot.cn/v1`	Moonshot's endpoint
`ONGRID_KIMI_MODELS`	`kimi-k2.6,kimi-k2.5,moonshot-v1-128k`	Catalog list

Default catalog

kimi-k2.6 — the catalog default; Moonshot's current frontier.
kimi-k2.5 — previous generation; still competitive on cost.
moonshot-v1-128k — long-context variant. 128k tokens.

China-based

Moonshot's api.moonshot.cn endpoint is in mainland CN. Non-CN networks need either a VPC peering or a relay; the Settings UI tags the BaseURL field as "China-based" alongside Zhipu.

Long-context tip

moonshot-v1-128k is the only model in the default catalog with serious context length. Use it for:

The correlate_incident composite — long Prom + Loki + Tempo result blob.
Knowledge-base searches over long playbooks.

The Ongrid investigator persona's 10-tool-call cap means the prompt rarely gets large enough to matter for the routine path; long-context is for the deep-dive case where you've manually pulled lots of data.

Making Kimi the default

bash

ONGRID_LLM_DEFAULT_PROVIDER=kimi

Quirks

OpenAI-compatible wire — same as Zhipu / DeepSeek. Function calling, streaming, system messages all standard.
Output language — Kimi is bilingual but defaults to Chinese responses unless the prompt directive says otherwise. Same LANGUAGE: ... directive that handles GLM works here.
Rate limits — Moonshot's per-account rate limits are tight. Use the Config.MaxConcurrent=5 default on the RCA worker to avoid starving manual chat when an alert storm hits.

Kimi ​

Env vars ​

Default catalog ​

China-based ​

Long-context tip ​

Making Kimi the default ​

Quirks ​

See also ​