Custom Providers

Three ways to add a provider ZeroClaw doesn’t ship with:

  1. Use the custom slot. For any OpenAI-compatible endpoint not covered by an existing canonical slot.
  2. Use the first-class local-server slots (lmstudio, llamacpp, sglang, vllm, osaurus, litellm). Thin wrappers with sensible defaults.
  3. Implement the ModelProvider trait in Rust. For anything that’s not OpenAI-compatible.

OpenAI-compatible endpoint: use the custom slot

If the service speaks OpenAI chat-completions, this is a config-only change. The custom slot requires uri (the family’s endpoint enum has no default); reference it from an agent’s model_provider.

This is the same OpenAiCompatibleModelProvider runtime impl used by groq, mistral, xai, and every other vendor with its own canonical slot in the catalog. The difference is which family slot you use: custom is the catch-all for endpoints not represented by a vendor slot.

First-class local-inference servers

ZeroClaw ships canonical slots for popular local-inference stacks. They are all OpenAI-compatible under the hood, with default uri values pre-applied, so you can usually omit uri entirely.

llama.cpp: slot llamacpp

sh

llama-server -hf ggml-org/gpt-oss-20b-GGUF --jinja -c 133000 --host 127.0.0.1 --port 8033

Optional fields apply to any compat-slot family (including llamacpp). The full set, derived from the schema:

api_key 🔑 secret · default

Secret API token for this model_provider. Grab it from the model_provider’s dashboard (OpenAI platform, Anthropic console, OpenRouter keys page, etc.). Stored via the OS keyring when possible; never commit it to config.toml directly.

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.api_key field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.api_key field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.api_key    # masked input, stored encrypted

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__api_key=
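
The export lines in this section all follow one visible pattern: the dotted config path with each dot replaced by a double underscore, prefixed with ZEROCLAW_. The sketch below only reproduces that pattern as it appears on this page; the function name env_override_name is hypothetical, and any extra normalisation ZeroClaw may apply to aliases is not covered.

```rust
/// Builds the environment-variable name for a dotted config path.
/// Assumption: the mapping is exactly "ZEROCLAW_" + path with "." replaced
/// by "__", as the export lines on this page suggest. Case is left untouched.
pub fn env_override_name(config_path: &str) -> String {
    format!("ZEROCLAW_{}", config_path.replace('.', "__"))
}
```

For an alias named local, env_override_name("providers.models.custom.local.api_key") yields ZEROCLAW_providers__models__custom__local__api_key.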
chat_template_kwargs table · default

Arbitrary key/value pairs forwarded verbatim as chat_template_kwargs in the request body (llama.cpp-specific). Use this to pass model-family template variables that control behaviour not exposed by other fields. Example (Qwen3 thinking suppression): chat_template_kwargs = { enable_thinking = false }

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.chat_template_kwargs field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.chat_template_kwargs field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.chat_template_kwargs <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__chat_template_kwargs=
extra_headers 🔑 secret · default

Extra HTTP headers sent with every request. Niche: used for auth bridges, corporate proxies, or custom gateways that demand a tracing header. Most users never touch this; edit config.toml directly if you need it.

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.extra_headers field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.extra_headers field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.extra_headers    # masked input, stored encrypted

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__extra_headers=
fallback ModelProviderRef[] · default

Ordered list of other provider aliases to try when every model on this alias has failed. Each entry is a dotted <type>.<alias> reference into providers.models and resolves with its own credentials, endpoint, and model. A fallback never inherits this alias’s key. The walk is depth-first: this alias’s models are exhausted first, then each fallback alias is descended in turn (applying its own fallback_models and fallback). Empty means no provider-level fallback.

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.fallback field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.fallback field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.fallback <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__fallback=
fallback_models string[] · default

Ordered alternate models to try on THIS provider before falling over to the fallback aliases. Same endpoint, key, and headers as the primary model. Only the model identifier changes. Use this when a provider serves a backup model (e.g. a smaller or older variant) that should be tried before leaving the provider entirely. Empty means only model is tried.

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.fallback_models field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.fallback_models field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.fallback_models <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__fallback_models=
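
To make the interaction between fallback_models and fallback concrete, here is a minimal sketch of the depth-first order described above. AliasEntry, attempt_order, and walk are hypothetical names, not ZeroClaw types. The cycle guard and the silent skip of unknown aliases are assumptions added so the sketch terminates; the real resolver may treat those cases differently.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified view of one provider alias entry.
pub struct AliasEntry {
    pub model: String,
    pub fallback_models: Vec<String>,
    /// Dotted `<type>.<alias>` references to other entries.
    pub fallback: Vec<String>,
}

/// Returns the (alias, model) pairs in the order they would be attempted.
pub fn attempt_order(
    aliases: &HashMap<String, AliasEntry>,
    start: &str,
) -> Vec<(String, String)> {
    let mut order = Vec::new();
    let mut visited = HashSet::new();
    walk(aliases, start, &mut visited, &mut order);
    order
}

fn walk(
    aliases: &HashMap<String, AliasEntry>,
    name: &str,
    visited: &mut HashSet<String>,
    order: &mut Vec<(String, String)>,
) {
    // Assumption: an alias is descended at most once (cycle guard).
    if !visited.insert(name.to_string()) {
        return;
    }
    // Assumption: a reference to an unknown alias is skipped silently.
    let Some(entry) = aliases.get(name) else { return };

    // This alias's own models come first: primary, then fallback_models.
    order.push((name.to_string(), entry.model.clone()));
    for model in &entry.fallback_models {
        order.push((name.to_string(), model.clone()));
    }
    // Then each fallback alias is descended in turn, with its own settings.
    for next in &entry.fallback {
        walk(aliases, next, visited, order);
    }
}
```

Each pair carries the alias it belongs to, which mirrors the rule that a fallback resolves with its own credentials and endpoint rather than inheriting the caller's.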
kind string? · default

Provider implementation to instantiate for this profile. Use this when a canonical typed slot should run through a compatible implementation, e.g. [providers.models.openai.proxy] kind = "openai-compatible".

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.kind field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.kind field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.kind <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__kind=
max_tokens integer? · default

Hard cap on response length in tokens. Most models enforce sensible built-in limits already; leave unset unless you specifically need to clip long outputs for cost or latency reasons.

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.max_tokens field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.max_tokens field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.max_tokens <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__max_tokens=
merge_system_into_user bool · default

ModelProvider-specific quirk: fold the system prompt into the first user message instead of sending a separate system role. Only needed for models that reject (or mishandle) a standalone system role, e.g. certain older Mistral variants.

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.merge_system_into_user field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.merge_system_into_user field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.merge_system_into_user <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__merge_system_into_user=
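
As an illustration of what folding the system prompt into the first user message can look like, here is a minimal sketch. Msg and the function below are hypothetical stand-ins, not ZeroClaw's message types. The blank-line separator, and the choice to create a user message when none exists, are assumptions made for the example.

```rust
/// Hypothetical minimal chat message.
#[derive(Debug, Clone, PartialEq)]
pub struct Msg {
    pub role: String,
    pub content: String,
}

/// Removes every system message and prepends their text to the first
/// user message, so no standalone system role is sent.
pub fn merge_system_into_user(messages: Vec<Msg>) -> Vec<Msg> {
    let mut system_parts = Vec::new();
    let mut rest = Vec::new();
    for m in messages {
        if m.role == "system" {
            system_parts.push(m.content);
        } else {
            rest.push(m);
        }
    }
    if system_parts.is_empty() {
        return rest;
    }
    // Assumption: system text and user text are joined with a blank line.
    let system = system_parts.join("\n\n");
    if let Some(i) = rest.iter().position(|m| m.role == "user") {
        let merged = format!("{}\n\n{}", system, rest[i].content);
        rest[i].content = merged;
    } else {
        // Assumption: with no user message, the system text becomes one.
        rest.insert(0, Msg { role: "user".to_string(), content: system });
    }
    rest
}
```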
model string? · default

Model identifier to send with each request: the ID string from the model_provider’s catalog (e.g. gpt-4o, claude-sonnet-4-5, llama-3.3-70b). Must match a model the model_provider actually serves on this account.

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.model field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.model field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.model <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__model=
native_tools bool? · default

Override the provider’s default for native tool calling. None (default) honors the provider’s built-in choice. Some(true) forces native tool calls on, Some(false) forces text-fallback. Currently consulted only by the Groq factory, which defaults to text-fallback because llama-family Groq models reject native tool calls with HTTP 400. Setting native_tools = true re-enables native tool calling for Groq models that support it.

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.native_tools field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.native_tools field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.native_tools <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__native_tools=
pricing map · default

Per-model pricing for cost tracking, USD per 1M tokens. Free-form key/value map. Keys are user-defined model identifiers; an optional .input / .output suffix encodes pricing dimension when the operator wants to split rates. A bare key without a suffix is used as a flat per-token rate when neither dimension is specified. Default is empty: cost tracking falls back to “unknown” rates and only token usage is recorded. Example: pricing = { opus = 15.0, sonnet = 3.0 } Or split: pricing = { "opus.input" = 15.0, "opus.output" = 75.0 }

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.pricing field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.pricing field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.pricing <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__pricing=
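
The pricing keys can be read as a two-step lookup, sketched below. The function names are hypothetical. The precedence (a split key such as opus.input wins, the bare key is the flat fallback for either dimension) is an assumption based on the description above, and ZeroClaw's own cost tracker may resolve mixed entries differently.

```rust
use std::collections::HashMap;

/// Looks up the USD-per-1M-token rate for one dimension ("input" or "output").
/// Assumption: "<model>.<dimension>" is tried first, then the bare "<model>" key.
pub fn rate_per_million(
    pricing: &HashMap<String, f64>,
    model: &str,
    dimension: &str,
) -> Option<f64> {
    pricing
        .get(&format!("{model}.{dimension}"))
        .or_else(|| pricing.get(model))
        .copied()
}

/// Estimates the cost of one call in USD, or None when no rate is known
/// (the "unknown rates" case, where only token usage would be recorded).
pub fn estimate_cost_usd(
    pricing: &HashMap<String, f64>,
    model: &str,
    input_tokens: u64,
    output_tokens: u64,
) -> Option<f64> {
    let input_rate = rate_per_million(pricing, model, "input")?;
    let output_rate = rate_per_million(pricing, model, "output")?;
    let total = input_tokens as f64 * input_rate + output_tokens as f64 * output_rate;
    // Rates are quoted per one million tokens.
    Some(total / 1_000_000.0)
}
```

With the split example above, one million input tokens plus one million output tokens on opus come to 15.0 + 75.0 = 90.0 USD.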
provider_extra table · default

Extra JSON parameters to include in API requests. Merged at the top level of the request body, allowing provider-specific features (routing, transforms, etc.) without code changes. Example: provider_extra = { model_provider = { only = ["Anthropic"] } }

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.provider_extra field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.provider_extra field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.provider_extra <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__provider_extra=
requires_openai_auth bool · default

When true, the client pulls credentials from OPENAI_API_KEY or ~/.codex/auth.json instead of the api_key field above. Turn on only for the OpenAI Codex model_provider; leave off for standard API-key model_providers.

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.requires_openai_auth field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.requires_openai_auth field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.requires_openai_auth <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__requires_openai_auth=
temperature number? · default

Sampling temperature passed to the model. Lower values (0.0–0.3) give more deterministic, focused output, which fits code, routing, and summarization. Higher values (0.7–1.2) give more varied output, which fits open-ended chat.

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.temperature field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.temperature field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.temperature <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__temperature=
think bool? · default

Enable or disable chain-of-thought thinking for models that support it (e.g. Qwen3, GLM-4). true turns thinking on, false turns it off. None (default) lets the model decide. Forwarded as enable_thinking in the request body; mirrors the Ollama provider’s think field.

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.think field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.think field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.think <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__think=
timeout_secs integer? · default

HTTP request timeout in seconds. Bump this for slow local model_providers (Ollama on CPU, big local models) or high-latency networks; leave unset otherwise.

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.timeout_secs field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.timeout_secs field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.timeout_secs <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__timeout_secs=
uri string? · default

Endpoint URI the client hits. Override the family’s default endpoint when pointing at a self-hosted gateway (LiteLLM, vLLM, Ollama), a custom proxy, or any non-standard URL. Leave unset to use the family’s default URI from its ModelEndpoint impl. Set this to the FULL endpoint URL; there is no separate path-suffix field.

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.uri field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.uri field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.uri <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__uri=
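wire_api WireApi · default

Wire protocol used to talk to the endpoint. Omit the field or set "chat_completions" for the default chat-completions wire; set "responses" for the OpenAI responses wire.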

Set it on any surface:

Gateway dashboard

Open /config/providers.models/custom and set the providers.models.custom.<alias>.wire_api field.

zerocode

In the Config pane, set the providers.models.custom.<alias>.wire_api field.

zeroclaw config

zeroclaw config set providers.models.custom.<alias>.wire_api <value>

Environment variable

Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:

export ZEROCLAW_providers__models__custom__<alias>__wire_api=

How thinking mode is controlled varies by model family. think = false sets the top-level enable_thinking field in the request. Some models (e.g. Qwen3) instead read this flag from the Jinja template via chat_template_kwargs:

chat_template_kwargs = { enable_thinking = false }

Other model families use different template variable names; check your model’s chat template and set the appropriate key under chat_template_kwargs.

SGLang: slot sglang

sh

python -m sglang.launch_server --model meta-llama/Llama-3.1-8B-Instruct --port 30000

vLLM: slot vllm

sh

vllm serve meta-llama/Llama-3.1-8B-Instruct

LM Studio, Osaurus, LiteLLM

The lmstudio, osaurus, and litellm slots follow the same pattern; see the catalog.

Wire protocol: wire_api = "responses"

Bring-your-own-endpoint slots default to the OpenAI chat-completions wire. An endpoint that only speaks the OpenAI responses wire (some self-hosted vLLM / TGI deployments) needs an explicit wire_api = "responses" opt-in on the alias entry.

When set to "responses", the provider is built as an OpenAiResponsesModelProvider (full streaming tool calls over the responses protocol) instead of a chat-completions provider. Omit the field, or set "chat_completions", for the default wire.

wire_api is honored by the bring-your-own-endpoint families where the wire is operator-configurable: openai, llamacpp, and custom (plus the generic openai-compatible path). Branded vendor slots (groq, mistral, deepseek, …) have a fixed wire protocol and ignore the field, with one exception: opencode honors wire_api = "responses" because OpenCode Zen serves both wires. With no uri override, the OpenCode responses route targets https://opencode.ai/zen/v1/responses:

[providers.models.opencode.default]
model    = "big-pickle"
wire_api = "responses"

The setting governs both the primary agent path and delegate targets, so a delegate whose target alias declares wire_api = "responses" reaches the endpoint over the responses wire.
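
The rules above can be summarised in a small sketch. Wire, parse_wire_api, and honors_wire_api are hypothetical names and do not mirror ZeroClaw's actual WireApi type or factory code. Rejecting unknown values with an error is an assumption, since this page does not say how invalid values are handled.

```rust
/// Hypothetical stand-in for the two wire protocols named on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Wire {
    ChatCompletions,
    Responses,
}

/// Maps the configured wire_api value to a wire protocol.
/// An omitted field or "chat_completions" selects the default wire.
pub fn parse_wire_api(value: Option<&str>) -> Result<Wire, String> {
    match value {
        None | Some("chat_completions") => Ok(Wire::ChatCompletions),
        Some("responses") => Ok(Wire::Responses),
        // Assumption: anything else is reported as an error.
        Some(other) => Err(format!("unknown wire_api value: {other}")),
    }
}

/// True for the families that this page says honor wire_api:
/// openai, llamacpp, custom, and the opencode exception.
pub fn honors_wire_api(family: &str) -> bool {
    matches!(family, "openai" | "llamacpp" | "custom" | "opencode")
}
```

Under this reading, setting wire_api on a groq or mistral alias has no effect, because those slots keep their fixed wire.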

Validation

Regardless of approach:

sh

zeroclaw config list                          # loads config; any validation failures print to stderr
zeroclaw models refresh --provider <type>.<alias>   # list models the endpoint advertises
zeroclaw agent -a <alias> -m "hello"          # smoke-test against the agent at `[agents.<alias>]`

Implementing the ModelProvider trait

If the endpoint isn’t OpenAI-compatible and isn’t one of the local-server slots, you need code.

The trait lives in crates/zeroclaw-api/src/model_provider.rs:

#[async_trait]
pub trait ModelProvider: Send + Sync {
    fn name(&self) -> &str;
    fn supports_streaming(&self) -> bool { true }
    fn supports_streaming_tool_events(&self) -> bool { false }

    async fn chat(
        &self,
        messages: Vec<Message>,
        tools: Vec<ToolSchema>,
        options: ChatOptions,
    ) -> Pin<Box<dyn Stream<Item = Result<StreamEvent>> + Send>>;
}

Implementation pattern:

  1. Define the typed config in crates/zeroclaw-config/src/schema.rs:
    pub struct MyProviderModelProviderConfig {
        #[serde(flatten)]
        pub base: ModelProviderConfig,
        pub endpoint: MyProviderEndpoint,
        // family-specific fields
    }

    pub enum MyProviderEndpoint { Default }
    impl ModelEndpoint for MyProviderEndpoint {
        fn uri(&self) -> &'static str {
            match self { Self::Default => "https://my-provider.example.com/v1" }
        }
    }
  2. Add the slot to for_each_model_provider_slot! in crates/zeroclaw-config/src/providers.rs. Every helper picks up the new slot automatically.
  3. Add the runtime impl in crates/zeroclaw-providers/src/myprovider.rs. Translate Vec<Message> to the wire format, stream the response, emit StreamEvent values.
  4. Wire the factory branch in crates/zeroclaw-providers/src/lib.rs::create_provider_with_url_and_options.
  5. Add a feature flag in Cargo.toml if the provider pulls heavy deps.

See anthropic.rs as a reference for a provider with a fully custom wire format. See compatible.rs for the SSE-streaming OpenAI-compat pattern.

Troubleshooting

Authentication errors

  • Verify the API key matches the endpoint (many vendors use key prefixes: sk-, gsk_, sk-ant-).
  • Check that uri includes the scheme (http:// / https://) and the /v1 path if the endpoint expects it.
  • Endpoints behind a VPN or proxy? Confirm routing from the ZeroClaw host.
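
The uri checks in this list are easy to script before filing a bug. The helper below is a hypothetical sketch, not part of ZeroClaw. It only inspects the string, and the /v1 hint is advisory because not every endpoint uses that path.

```rust
/// Returns human-readable warnings for a configured uri value.
/// An empty result means neither of the two string checks fired.
pub fn uri_warnings(uri: &str) -> Vec<&'static str> {
    let mut warnings = Vec::new();
    // The scheme must be spelled out; "127.0.0.1:8033" alone is not enough.
    if !(uri.starts_with("http://") || uri.starts_with("https://")) {
        warnings.push("uri has no http:// or https:// scheme");
    }
    // Advisory only: many OpenAI-compatible servers expect a /v1 path.
    if !uri.contains("/v1") {
        warnings.push("uri has no /v1 path segment; add it if the endpoint expects one");
    }
    warnings
}
```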

Model not found

  • List what the endpoint advertises:

sh

  curl -sS "$URI/models" -H "Authorization: Bearer $API_KEY" | jq
  • If the endpoint doesn’t implement /models, send a direct chat request and read the error; most endpoints return the expected model family in the error body.
  • Gateway services often expose only a subset of upstream models.

Connection issues

  • Run curl -I "$URI"; does it respond?
  • Firewall, proxy, egress rules? VPS providers sometimes block outbound high ports.
  • Vendor status page if it’s a hosted service.

Gateway rejects temperature

Some gateways (e.g. a LiteLLM proxy fronting claude-opus-4-7) return an error when a temperature field is present at all. ZeroClaw honors the Option contract: if you leave temperature unset in config, the field is omitted from the request body entirely and the backend picks its own default. Only set temperature explicitly when the endpoint accepts it.
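
The Option contract described above can be pictured with a minimal sketch: when temperature is None, the key never appears in the body. The function chat_body is hypothetical and hand-builds a tiny JSON string to stay dependency-free; ZeroClaw's real request serialisation is not shown on this page and almost certainly differs.

```rust
/// Builds a minimal request body, including "temperature" only when set.
/// Assumption: `model` needs no JSON escaping (plain identifier characters).
pub fn chat_body(model: &str, temperature: Option<f64>) -> String {
    let mut body = format!("{{\"model\":\"{}\"", model);
    // None means the field is omitted entirely, not sent as null or 0.
    if let Some(t) = temperature {
        body.push_str(&format!(",\"temperature\":{}", t));
    }
    body.push('}');
    body
}
```

chat_body("my-model", None) produces {"model":"my-model"}, which is the shape a gateway that rejects temperature needs to see.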

See also