Kimi
TL;DR
bash
ONGRID_KIMI_API_KEY=...
ONGRID_KIMI_MODEL=kimi-k2.6 # default
ONGRID_KIMI_BASE_URL= # defaults to api.moonshot.cn/v1

Provider id: kimi. SDK adapter: OpenAI-compatible.
Env vars
| Var | Default | Notes |
|---|---|---|
| ONGRID_KIMI_API_KEY | (none) | Empty = provider dropped |
| ONGRID_KIMI_MODEL | kimi-k2.6 | Default model |
| ONGRID_KIMI_BASE_URL | https://api.moonshot.cn/v1 | Moonshot's endpoint |
| ONGRID_KIMI_MODELS | kimi-k2.6,kimi-k2.5,moonshot-v1-128k | Catalog list |
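
To make the defaulting rules in the table concrete, here is a minimal shell sketch. It is not Ongrid's actual loader; the function names `kimi_effective_config` and `kimi_catalog` are hypothetical, and the only behavior assumed is what the table states: an empty API key drops the provider, and empty or unset values fall back to the listed defaults.

```bash
#!/usr/bin/env bash
# Sketch only: mirrors the defaults in the env var table; not Ongrid's real loader.

# Print the effective Kimi settings as key=value lines.
# Returns 1 and prints nothing when the API key is empty (provider dropped).
kimi_effective_config() {
  if [ -z "${ONGRID_KIMI_API_KEY:-}" ]; then
    return 1
  fi
  # ${VAR:-default} treats an empty value the same as an unset one.
  printf 'model=%s\n' "${ONGRID_KIMI_MODEL:-kimi-k2.6}"
  printf 'base_url=%s\n' "${ONGRID_KIMI_BASE_URL:-https://api.moonshot.cn/v1}"
  printf 'models=%s\n' "${ONGRID_KIMI_MODELS:-kimi-k2.6,kimi-k2.5,moonshot-v1-128k}"
}

# Print the catalog one model id per line by splitting the comma-separated list.
kimi_catalog() {
  local list="${ONGRID_KIMI_MODELS:-kimi-k2.6,kimi-k2.5,moonshot-v1-128k}"
  printf '%s\n' "$list" | tr ',' '\n'
}
```

With only `ONGRID_KIMI_API_KEY` set, `kimi_effective_config` prints the three defaults from the table, and `kimi_catalog` prints the three catalog entries on separate lines.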
Default catalog
- kimi-k2.6: the catalog default; Moonshot's current frontier.
- kimi-k2.5: previous generation; still competitive on cost.
- moonshot-v1-128k: long-context variant. 128k tokens.
China-based
Moonshot's api.moonshot.cn endpoint is hosted in mainland China. Networks outside China need either VPC peering or a relay to reach it; the Settings UI tags the BaseURL field as "China-based", as it does for Zhipu.
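
A quick way to check whether a given network can reach the endpoint (directly or through a relay set in ONGRID_KIMI_BASE_URL) is to probe it with curl. The sketch below assumes the OpenAI-style `GET /models` route is available under the base URL; the function names `kimi_models_url` and `kimi_probe` are hypothetical helpers, not part of Ongrid.

```bash
#!/usr/bin/env bash
# Sketch only: a reachability probe, assuming an OpenAI-style GET /models route.

# Build the models URL from the configured base URL, tolerating a trailing slash.
kimi_models_url() {
  local base="${ONGRID_KIMI_BASE_URL:-https://api.moonshot.cn/v1}"
  printf '%s/models\n' "${base%/}"
}

# Print the HTTP status code of a GET against the models URL.
# curl reports 000 when no HTTP response was received (e.g. timeout or blocked route).
kimi_probe() {
  curl -sS -o /dev/null --max-time 10 \
    -w '%{http_code}\n' \
    -H "Authorization: Bearer ${ONGRID_KIMI_API_KEY:-}" \
    "$(kimi_models_url)"
}
```

Run `kimi_probe` from the host that will run Ongrid. A status of 000 suggests the network path is the problem; a 401 suggests the endpoint is reachable but the key is wrong.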
Long-context tip
moonshot-v1-128k is the only long-context model in the default catalog (128k tokens). Use it for:
- The correlate_incident composite: long Prom + Loki + Tempo result blob.
- Knowledge-base searches over long playbooks.
The Ongrid investigator persona's 10-tool-call cap means the prompt rarely gets large enough to matter for the routine path; long-context is for the deep-dive case where you've manually pulled lots of data.
Making Kimi the default
bash
ONGRID_LLM_DEFAULT_PROVIDER=kimi

Quirks
- OpenAI-compatible wire: same as Zhipu / DeepSeek. Function calling, streaming, and system messages are all standard.
- Output language: Kimi is bilingual but defaults to Chinese responses unless the prompt directive says otherwise. The same LANGUAGE: ... directive that handles GLM works here.
- Rate limits: Moonshot's per-account rate limits are tight. Use the Config.MaxConcurrent=5 default on the RCA worker to avoid starving manual chat when an alert storm hits.
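
To illustrate what a concurrency cap of 5 means in practice, the sketch below processes a burst of alert ids with at most `MAX_CONCURRENT` handlers in flight at once. It is not the RCA worker's implementation: `run_capped` and the inline handler are hypothetical stand-ins, and `xargs -P` is used here only as a convenient way to show bounded parallelism.

```bash
#!/usr/bin/env bash
# Sketch only: illustrates a concurrency cap; not the RCA worker's implementation.

# Mirrors the Config.MaxConcurrent=5 default mentioned in the rate-limit note.
MAX_CONCURRENT="${MAX_CONCURRENT:-5}"

# Read alert ids from stdin (one per line) and handle them with at most
# MAX_CONCURRENT handlers running at the same time. The handler is a
# placeholder; a real one would call the LLM provider.
run_capped() {
  xargs -n 1 -P "$MAX_CONCURRENT" sh -c 'printf "handled %s\n" "$1"' _
}
```

For example, `printf '%s\n' a1 a2 a3 a4 a5 a6 a7 | run_capped` handles all seven ids, but never more than five at once, so a burst of alerts cannot consume the whole per-account budget. Output order is not guaranteed.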
See also
- Zhipu (GLM) — the other China-based provider.
- Models overview.
- Routing.