Middleware for runtime model selection via LangGraph runtime context.
Allows switching the model per invocation by passing a CLIContext via
context= on agent.astream() / agent.invoke() without recompiling
the graph.
Declared context_schema for the agent graph.
Registered via context_schema= when the graph is built, so LangGraph
coerces each run's context= payload into this dataclass. In-process,
runtime.context is therefore a CLIContextSchema instance.
It exists alongside CLIContext (below) because the payload is shaped
differently on each side of the API boundary: in-process it is coerced to
this dataclass, but over the LangGraph API server (RemoteGraph) it is
serialized to JSON and arrives as a plain dict. Consumers
(configurable_model._get_context, _should_interrupt_tool_call)
therefore accept both shapes. CLIContext is the client-facing builder for
constructing that payload.
Fields mirror CLIContext; see its per-field docstrings for semantics.
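
As an illustration of why consumers accept both shapes, here is a minimal sketch of a reader that works on either a dataclass instance or a plain dict. The names ContextSketch and read_context_field are hypothetical, and the two fields shown are assumed from the keys described elsewhere in this module; this is not the actual _get_context implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ContextSketch:
    """Hypothetical stand-in for CLIContextSchema (field set is assumed)."""

    model: Optional[str] = None
    model_params: Dict[str, Any] = field(default_factory=dict)


def read_context_field(context: Any, name: str, default: Any = None) -> Any:
    """Read one field from a dict payload or a dataclass-like object."""
    if context is None:
        return default
    if isinstance(context, dict):
        # Shape seen over the API server: JSON decoded into a plain dict.
        return context.get(name, default)
    # Shape seen in-process: the payload was coerced into the dataclass.
    return getattr(context, name, default)


in_process = ContextSketch(model="openai:gpt-5")
over_the_wire = {"model": "openai:gpt-5", "model_params": {"temperature": 0}}
```

Both read_context_field(in_process, "model") and read_context_field(over_the_wire, "model") return the same spec string, so downstream code does not need to branch on the transport.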
Result of creating a chat model, bundling the model with its metadata.
This separates model creation from settings mutation so callers can decide when to commit the metadata to global settings.
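
The separation described above could look roughly like the following sketch. Everything here is an assumption made for illustration: the names ModelResultSketch, create_model_sketch, and apply_to, the field set, and the use of a plain dict in place of global settings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class ModelResultSketch:
    """Hypothetical bundle of a model object and its metadata."""

    model: Any
    model_name: str
    provider: str

    def apply_to(self, settings: Dict[str, Any]) -> None:
        """Commit the metadata to a settings mapping, only when called."""
        settings["model_name"] = self.model_name
        settings["provider"] = self.provider


def create_model_sketch(spec: str) -> ModelResultSketch:
    """Build a result from a provider:model spec without touching settings."""
    provider, _, name = spec.partition(":")
    # object() is a placeholder for a real chat model instance.
    return ModelResultSketch(model=object(), model_name=name, provider=provider)
```

Creating the result leaves the settings untouched; the caller decides whether and when to call apply_to, for example only after the model has been used successfully.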
Swap the model or per-call settings from runtime.context.
Reads two optional keys from the runtime context dict:

- 'model': a provider:model spec (e.g. "openai:gpt-5").
  When present and different from the current model, the request is
  re-routed to the new model.
- 'model_params': a dict of extra model settings (e.g.
  {"temperature": 0}) that are shallow-merged into the
  request's model_settings.

This middleware is typically the outermost layer so it intercepts every
model call before provider-specific middleware (like
AnthropicPromptCachingMiddleware) runs.
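
The override rules above can be sketched as a small pure function. This is only an illustration under stated assumptions: plan_override is a hypothetical name, the context is assumed to have already been normalized to a dict, and the non-OpenAI model spec is a made-up placeholder string. It is not the middleware's actual code.

```python
from typing import Any, Dict, Tuple


def plan_override(
    current_model: str,
    current_settings: Dict[str, Any],
    context: Dict[str, Any],
) -> Tuple[str, Dict[str, Any]]:
    """Return the model spec and settings a request would end up with."""
    requested = context.get("model")
    # Re-route only when a model is given and differs from the current one.
    target = requested if requested and requested != current_model else current_model
    # Shallow merge: top-level keys from model_params win, and nested
    # dicts are replaced as a whole rather than merged recursively.
    params = context.get("model_params") or {}
    merged = {**current_settings, **params}
    return target, merged


target, merged = plan_override(
    "provider_a:model_x",  # placeholder spec, not a real model
    {"temperature": 1, "extra": {"a": 1, "b": 2}},
    {"model": "openai:gpt-5", "model_params": {"temperature": 0, "extra": {"a": 9}}},
)
```

Note that the nested "extra" dict loses its "b" key in the merged result, which is the practical consequence of a shallow merge. The input settings dict is left unmodified because the merge builds a new dict.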