Skip to content

Kimi

TL;DR

bash
ONGRID_KIMI_API_KEY=...
ONGRID_KIMI_MODEL=kimi-k2.6                 # default
ONGRID_KIMI_BASE_URL=                       # defaults to api.moonshot.cn/v1

Provider id: kimi. SDK adapter: OpenAI-compatible.

Env vars

VarDefaultNotes
ONGRID_KIMI_API_KEYEmpty = provider dropped
ONGRID_KIMI_MODELkimi-k2.6Default model
ONGRID_KIMI_BASE_URLhttps://api.moonshot.cn/v1Moonshot's endpoint
ONGRID_KIMI_MODELSkimi-k2.6,kimi-k2.5,moonshot-v1-128kCatalog list

Default catalog

  • kimi-k2.6 — the catalog default; Moonshot's current frontier.
  • kimi-k2.5 — previous generation; still competitive on cost.
  • moonshot-v1-128k — long-context variant. 128k tokens.

China-based

Moonshot's api.moonshot.cn endpoint is in mainland CN. Non-CN networks need either a VPC peering or a relay; the Settings UI tags the BaseURL field as "China-based" alongside Zhipu.

Long-context tip

moonshot-v1-128k is the only model in the default catalog with serious context length. Use it for:

  • The correlate_incident composite — long Prom + Loki + Tempo result blob.
  • Knowledge-base searches over long playbooks.

The Ongrid investigator persona's 10-tool-call cap means the prompt rarely gets large enough to matter for the routine path; long-context is for the deep-dive case where you've manually pulled lots of data.

Making Kimi the default

bash
ONGRID_LLM_DEFAULT_PROVIDER=kimi

Quirks

  • OpenAI-compatible wire — same as Zhipu / DeepSeek. Function calling, streaming, system messages all standard.
  • Output language — Kimi is bilingual but defaults to Chinese responses unless the prompt directive says otherwise. Same LANGUAGE: ... directive that handles GLM works here.
  • Rate limits — Moonshot's per-account rate limits are tight. Use the Config.MaxConcurrent=5 default on the RCA worker to avoid starving manual chat when an alert storm hits.

See also