DeepSeek

TL;DR

bash

ONGRID_DEEPSEEK_API_KEY=sk-...
ONGRID_DEEPSEEK_MODEL=deepseek-v4-flash     # default
ONGRID_DEEPSEEK_BASE_URL=                   # optional; defaults to api.deepseek.com/v1

Provider id: deepseek. SDK adapter: OpenAI-compatible.

DeepSeek's V4 family is the cheap-and-fast option. The endpoint is OpenAI-compatible at the wire level.

Env vars

Var	Default	Notes
`ONGRID_DEEPSEEK_API_KEY`	—	Empty = provider dropped
`ONGRID_DEEPSEEK_MODEL`	`deepseek-v4-flash`	Default model
`ONGRID_DEEPSEEK_BASE_URL`	`https://api.deepseek.com/v1`	Override for VPC endpoints
`ONGRID_DEEPSEEK_MODELS`	`deepseek-v4-pro,deepseek-v4-flash,deepseek-reasoner`	Catalog list

Default catalog

deepseek-v4-pro — top of the V4 family; closest to frontier quality at a fraction of the cost.
deepseek-v4-flash — the catalog default; recommended for chat.
deepseek-reasoner — chain-of-thought variant. See quirks below.

`deepseek-reasoner` caveats

deepseek-reasoner emits a <thinking>...</thinking> block before its final answer. The Ongrid LLM adapter does NOT strip these — they show up in the chat transcript and in the RCA report's findings_md.

If you don't want the thinking blocks rendered:

Use a different model for chat (deepseek-v4-pro).
Or post-process the transcript with a CSS rule that hides details[open] > summary:contains("thinking") — the SPA wraps them in collapsible <details> by default.

The reasoner's response is slower than v4-flash (the chain-of- thought is real compute). Don't use it for the Pass-2 structured extractor — the timeout will hit.

Making DeepSeek the default

bash

ONGRID_LLM_DEFAULT_PROVIDER=deepseek

The agent runtime auto-picks the default-resolver-provided model for the investigator persona's calls; this means flipping default to DeepSeek immediately routes all auto-RCAs there — at much lower cost than Claude / GPT for similar quality on the structured-extraction half of the pipeline.

BaseURL

The api.deepseek.com/v1 endpoint is globally reachable. No China-based tag in the SPA. Use BaseURL override only for relays.

Quirks

OpenAI-compatible wire — flat tool_calls, OpenAI streaming format. The adapter is the same as for Custom / Zhipu / Kimi / Gemini-OAI-mode.
Long context — V4 supports 64k tokens; the Ongrid budget estimator uses a conservative len(text)/4 so you'll see the budget reject before you actually hit the model limit.

DeepSeek ​

Env vars ​

Default catalog ​

deepseek-reasoner caveats ​

Making DeepSeek the default ​

BaseURL ​

Quirks ​

See also ​