
Routing & default

Ongrid runs N providers in parallel and dispatches each LLM call to exactly one. This page covers how that dispatch works at three layers:

  1. MultiClient — the wire-level router used by the legacy llm.Chat path (translate, knowledge search).
  2. RoutingChatModel — the eino model.ChatModel wrapper used by the graph-kernel ReAct agent.
  3. DefaultResolver — the dynamic-default hook that lets the default_provider setting take effect mid-process.

MultiClient

The bottom of the router stack lives in internal/pkg/llm/router.go.

go
// router.go:67
type MultiClient struct {
    // map: provider id -> sub-Client built from a ProviderConfig
    // ...
}

func (m *MultiClient) Chat(ctx context.Context, req ChatReq) (*ChatResp, error)

Each provider config produces one sub-client at construction:

go
// cmd/ongrid/main.go:492
providerCfgs := []llm.ProviderConfig{}
if cfg.OpenAI.APIKey != "" {
    providerCfgs = append(providerCfgs, llm.ProviderConfig{
        ID: "openai", Label: "OpenAI",
        APIKey: cfg.OpenAI.APIKey,
        Model: firstNonEmpty(cfg.OpenAI.Model, "gpt-5.4"),
        BaseURL: cfg.OpenAI.BaseURL,
        Models: dedupeModels(...),
    })
}
// ...same for each provider...
llmRouter := llm.NewMultiClient(providerCfgs, cfg.LLM.Default, openaiClient)

Chat dispatches on req.Provider:

  • Non-empty → look up the sub-client; an unregistered id returns ErrUnknownProvider.
  • Empty → fall back to defaultProvider.

MultiClient.SetProvidersResolver(r) layers the DB-backed resolver on top: each call re-resolves the active sub-client set, and the result is cached for 60s (the TTL is set through SetResolveTTL).

RoutingChatModel

The graph kernel uses eino's model.ChatModel interface, not llm.Chat. RoutingChatModel (eino_routing.go:89) wraps N inner ChatModels and dispatches via an impl-specific option:

go
type RoutingChatModel struct {
    inner           map[string]model.ChatModel
    defaultProvider string
    defaultResolver func(context.Context) (provider, mdl string)
}

func WithProvider(provider string) model.Option {
    return model.WrapImplSpecificOptFn(func(o *providerOpts) {
        o.provider = provider
    })
}

Usage from a call site:

go
resp, err := chatModel.Generate(ctx, msgs,
    model.WithModel("glm-4.7-flash"),
    llm.WithProvider("zhipu"),
)

pick() resolves the inner:

go
// eino_routing.go:173
func (r *RoutingChatModel) pick(opts ...model.Option) (model.ChatModel, string, error) {
    po := model.GetImplSpecificOptions(&providerOpts{}, opts...)
    prov := po.provider
    if prov == "" {
        prov = r.defaultProvider
    }
    inner, ok := r.inner[prov]
    if !ok {
        return nil, prov, fmt.Errorf("%w: %s", ErrUnknownProvider, prov)
    }
    return inner, prov, nil
}
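To see why a provider option can travel alongside ordinary model options without the inner models noticing it, here is a self-contained sketch of the impl-specific option pattern. It is a simplified stand-in for eino's model.Option machinery, not its actual source: option, wrapImplSpecific, getImplSpecific, otherOpts, and withTemperature are all hypothetical names chosen for this example.

```go
package main

import "fmt"

// option is a simplified stand-in for an eino-style option: it carries an
// opaque setter that only one implementation knows how to apply.
type option struct{ implFn any }

// wrapImplSpecific stores a typed setter inside an option.
func wrapImplSpecific[T any](fn func(*T)) option { return option{implFn: fn} }

// getImplSpecific applies every option whose setter targets *T and silently
// skips options meant for other implementations.
func getImplSpecific[T any](base *T, opts ...option) *T {
	for _, o := range opts {
		if fn, ok := o.implFn.(func(*T)); ok {
			fn(base)
		}
	}
	return base
}

// providerOpts is the routing wrapper's private option struct.
type providerOpts struct{ provider string }

// otherOpts belongs to some unrelated implementation.
type otherOpts struct{ temperature float64 }

func withProvider(p string) option {
	return wrapImplSpecific(func(o *providerOpts) { o.provider = p })
}

func withTemperature(t float64) option {
	return wrapImplSpecific(func(o *otherOpts) { o.temperature = t })
}

func main() {
	opts := []option{withTemperature(0.2), withProvider("zhipu")}

	// The router extracts only its own options; the temperature setter is ignored.
	po := getImplSpecific(&providerOpts{}, opts...)
	fmt.Println("provider:", po.provider)

	// With no provider option, the struct stays at its zero value, which is
	// what triggers the default-provider fallback in pick().
	empty := getImplSpecific(&providerOpts{}, withTemperature(0.7))
	fmt.Printf("provider when unset: %q\n", empty.provider)
}
```

Because the setter is matched by its concrete function type, options for different implementations can share one slice without colliding.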

DefaultResolver — the dynamic default

The bug this fixes: an admin flips default_provider from Anthropic to GLM in /settings/llm. The chat picker UI immediately respects the new default (it re-fetches /v1/aiops/models on each load). But the RCA investigator worker — which is process-internal and binds its provider at boot — keeps routing to Anthropic until restart.

Fix: RoutingChatModelConfig.DefaultResolver.

go
var defaultResolver func(context.Context) (string, string)
if resolver != nil {
    defaultResolver = func(rctx context.Context) (string, string) {
        provCfgs, resolvedDefault, rerr := resolver.ResolveProviders(rctx)
        if rerr != nil || resolvedDefault == "" {
            return "", ""
        }
        for _, pc := range provCfgs {
            if pc.ID == resolvedDefault {
                return resolvedDefault, pc.Model
            }
        }
        return resolvedDefault, ""
    }
}
chatModel, err := llm.NewRoutingChatModel(llm.RoutingChatModelConfig{
    Inner:           innerModels,
    DefaultProvider: defProv,
    DefaultResolver: defaultResolver,
})

withDynamicDefault injects the resolver's output as WithProvider + WithModel for calls that pinned neither. The chat picker pins a provider per message and bypasses the resolver entirely, while default-routed calls (investigator, translate, knowledge fan-out) now follow live configuration.

Per-call provider in Chat

For the MultiClient.Chat path (non-graph-kernel callers), set ChatReq.Provider explicitly:

go
resp, err := llmClient.Chat(ctx, llm.ChatReq{
    Provider: "openai",
    Model:    "gpt-5.5",
    Messages: msgs,
})

Empty Provider → MultiClient resolves the default at call time. Empty Model → the resolved sub-client uses its configured default.

Per-call provider in the graph kernel

The agent runtime injects llm.WithProvider whenever the chat send envelope contains a non-empty provider. The persona registry can also pin a provider for an entire persona (e.g. the cheap extractor persona pins to Anthropic Haiku). See the agent persona format reference.

Pitfalls

  • Forgetting default_provider — the resolver picks the first sorted provider id; you'll send your model id to the wrong endpoint. Always set ONGRID_LLM_DEFAULT_PROVIDER (or write the DB row).
  • Pinning a provider that has no inner ChatModel — happens when the resolver returns a provider id for which no inner ChatModel was built at boot, either from configuration or by pre-registration. The custom slot is pre-registered for exactly this reason; everything else gates on cfg.LLM.*.APIKey != "".
  • Hot-swap timing — the cache TTL is 60s on the resolver and 60s on the MultiClient. Worst case 120s before an admin edit takes effect. The Invalidate path is exposed but not wired to the SPA's save action yet.
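The 120s worst case in the hot-swap bullet comes from two TTL caches stacked on top of each other. The simulation below is a sketch with whole-second ticks: ttlCache and simulate are hypothetical names, and the timeline (inner cache filled at t=0 just before the edit, outer cache filled at t=59) is chosen to produce the worst case.

```go
package main

import "fmt"

// ttlCache caches one string for ttl seconds and reloads from src on expiry.
type ttlCache struct {
	ttl      int
	loadedAt int
	valid    bool
	value    string
	src      func(now int) string
}

func (c *ttlCache) get(now int) string {
	if !c.valid || now-c.loadedAt >= c.ttl {
		c.value, c.loadedAt, c.valid = c.src(now), now, true
	}
	return c.value
}

// simulate returns the first second at which the outer cache serves the new
// default after an edit made right after t=0.
func simulate() int {
	db := "anthropic"
	inner := &ttlCache{ttl: 60, src: func(int) string { return db }}
	outer := &ttlCache{ttl: 60, src: func(now int) string { return inner.get(now) }}

	inner.get(0) // inner cache fills with the old value at t=0
	db = "glm"   // admin edit lands immediately afterwards
	outer.get(59) // outer cache fills from the still-valid inner cache

	for t := 60; ; t++ {
		if outer.get(t) == "glm" {
			return t
		}
	}
}

func main() {
	fmt.Printf("first fresh read at t=%ds\n", simulate())
}
```

With one-second ticks the stale window ends at t=119; in continuous time it approaches the sum of the two TTLs, which is the 120s figure.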

See also

  • Models overview: []llm.ProviderConfig assembly.
  • Budget — orthogonal to routing; same cap applies to every provider.
  • RCA — investigator worker routing.