Module traits

Structs§

ChatMessage: A single message in a conversation.
ChatRequest: Request payload for model_provider chat calls.
ChatResponse: An LLM response that may contain text, tool calls, or both.
ModelInfo: Model info with optional pricing — returned by list_models_with_pricing.
ModelPricing: Per-token pricing for a model. All values are per-token rates as strings expressed in USD per token — e.g. "0.000005" = $5.00 per 1M tokens.
NativeThinkingParams: Parameters for native extended thinking support.
ProviderCapabilities: ModelProvider capabilities declaration.
ProviderCapabilityError: Structured error returned when a requested capability is not supported.
StreamChunk: A chunk of content from a streaming response.
StreamOptions: Options for streaming chat requests.
TokenUsage: Raw token counts from a single LLM API response.
ToolCall: A tool call requested by the LLM.
ToolResultMessage: A tool result to feed back to the LLM.

ConversationMessage: A message in a multi-turn conversation, including tool interactions.
StreamError: Errors that can occur during streaming.
StreamEvent: Structured events emitted by model_provider streaming APIs.
ToolsPayload: ModelProvider-specific tool payload formats.

BASELINE_MAX_TOKENS: Output-token budget roomy enough for typical agent turns. Providers override per family where the model’s own context window is the binding constraint.
BASELINE_TEMPERATURE: Industry-neutral sampling temperature. OpenAI, Gemini, OpenRouter, and most OpenAI-compatible endpoints document 0.7 as their typical default; Anthropic and Ollama override (1.0 and 0.0 respectively).
BASELINE_TIMEOUT_SECS: HTTP timeout for cloud inference. Local model_providers (Ollama) override upward since CPU/GPU-bound inference runs slower than round-tripping to a hyperscaler.
BASELINE_WIRE_API: Wire protocol used when the model_provider doesn’t declare one. Only OpenAI’s Codex stack uses the “responses” protocol; everything else speaks the classic chat completions shape.
MAX_BUDGET_TOKENS
MIN_BUDGET_TOKENS: Anthropic’s documented minimum for extended-thinking budget_tokens. Requests below this are rejected with 400 by the provider; clamping at resolution time gives a clearer error site than the first API call.

build_tool_instructions_text: Build tool instructions text for prompt-guided tool calling.