Custom Providers
Three ways to add a provider ZeroClaw doesn’t ship with:
- Use the custom slot. For any OpenAI-compatible endpoint not covered by an existing canonical slot.
- Use the first-class local-server slots (lmstudio, llamacpp, sglang, vllm, osaurus, litellm). Thin wrappers with sensible defaults.
- Implement the ModelProvider trait in Rust. For anything that’s not OpenAI-compatible.
OpenAI-compatible endpoint: use the custom slot
If the service speaks OpenAI chat-completions, this is a config-only change. The custom slot requires uri (the family’s endpoint enum has no default); reference it from an agent’s model_provider.
This is the same OpenAiCompatibleModelProvider runtime impl used by groq, mistral, xai, and every other vendor with its own canonical slot in the catalog. The difference is which family slot you use: custom is the catch-all for endpoints not represented by a vendor slot.
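A minimal sketch of such a config-only change, assuming the `[providers.models.<type>.<alias>]` table layout used elsewhere on this page. The alias `myendpoint`, the URL, the model name, and the agent name are hypothetical, and the `<type>.<alias>` form of the `model_provider` reference is an assumption:

```toml
# Hypothetical alias "myendpoint" on the custom slot.
[providers.models.custom.myendpoint]
# Required for custom: the slot has no default endpoint.
# Adjust the path to whatever your endpoint expects.
uri = "https://llm.example.com/v1"
model = "my-model"

# Hypothetical agent that dispatches to the alias above.
[agents.assistant]
model_provider = "custom.myendpoint"
```

Set `api_key` through one of the surfaces described under the field reference rather than writing it into the file.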
First-class local-inference servers
ZeroClaw ships canonical slots for popular local-inference stacks. They’re all OpenAI-compatible under the hood but with default uri values pre-applied so you can usually omit uri entirely.
llama.cpp: slot llamacpp
sh
llama-server -hf ggml-org/gpt-oss-20b-GGUF --jinja -c 133000 --host 127.0.0.1 --port 8033
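A matching alias for the server started above might look like the following sketch. The alias `local` and the model name are hypothetical. The `uri` override is shown because the command binds port 8033, which is assumed here to differ from the slot's default; drop it if the default already matches your server:

```toml
# Hypothetical alias "local" on the llamacpp slot.
[providers.models.llamacpp.local]
# Assumption: the server is reachable at the host and port passed to llama-server.
uri = "http://127.0.0.1:8033/v1"
model = "gpt-oss-20b"
```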
Optional fields apply to any compat-slot family (including llamacpp). The
full set, derived from the schema:
api_key 🔑
Secret API token for this provider. Grab it from the provider’s dashboard (OpenAI platform, Anthropic console, OpenRouter keys page, etc.). Stored via the OS keyring when possible; never commit it to config.toml directly.
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.api_key field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.api_key field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.api_key # masked input, stored encrypted
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__api_key=
chat_template_kwargs
Arbitrary key/value pairs forwarded verbatim as chat_template_kwargs in the request body (llama.cpp-specific). Use this to pass model-family template variables that control behaviour not exposed by other fields. Example (Qwen3 thinking suppression): chat_template_kwargs = { enable_thinking = false }
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.chat_template_kwargs field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.chat_template_kwargs field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.chat_template_kwargs <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__chat_template_kwargs=
extra_headers 🔑
Extra HTTP headers sent with every request. Niche: used for auth bridges, corporate proxies, or custom gateways that demand a tracing header. Most users never touch this; edit config.toml directly if you need it.
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.extra_headers field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.extra_headers field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.extra_headers # masked input, stored encrypted
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__extra_headers=
fallback
Ordered list of other provider aliases to try when every model on this alias has failed. Each entry is a dotted <type>.<alias> reference into providers.models and resolves with its own credentials, endpoint, and model. A fallback never inherits this alias’s key. The walk is depth-first: this alias’s models are exhausted first, then each fallback alias is descended in turn (applying its own fallback_models and fallback). Empty means no provider-level fallback.
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.fallback field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.fallback field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.fallback <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__fallback=
fallback_models
Ordered alternate models to try on THIS provider before falling over to the fallback aliases. Same endpoint, key, and headers as the primary model. Only the model identifier changes. Use this when a provider serves a backup model (e.g. a smaller or older variant) that should be tried before leaving the provider entirely. Empty means only model is tried.
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.fallback_models field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.fallback_models field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.fallback_models <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__fallback_models=
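The two fallback fields compose. The sketch below uses hypothetical aliases, URLs, and model names, and assumes both fields are written as TOML arrays of strings. The try order it describes follows the depth-first walk stated above: `large-model` and then `small-model` on `primary`, then `backup-model` on `backup` with that alias's own endpoint and credentials:

```toml
# Hypothetical primary alias.
[providers.models.custom.primary]
uri = "https://llm.example.com/v1"
model = "large-model"
# Tried next on the same endpoint, key, and headers.
fallback_models = ["small-model"]
# Tried only after every model on this alias has failed.
fallback = ["custom.backup"]

# Hypothetical fallback alias; resolves with its own uri, key, and model.
[providers.models.custom.backup]
uri = "https://backup.example.com/v1"
model = "backup-model"
```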
kind
Provider implementation to instantiate for this profile. Use this when a canonical typed slot should run through a compatible implementation, e.g. [providers.models.openai.proxy] kind = "openai-compatible".
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.kind field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.kind field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.kind <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__kind=
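Expanding the inline example from the `kind` description into a fuller alias entry might look like this sketch. The `uri` and `model` values are hypothetical; only the table name and the `kind` value come from the description above:

```toml
# A canonical openai slot routed through the OpenAI-compatible implementation.
[providers.models.openai.proxy]
kind = "openai-compatible"
uri = "https://proxy.example.com/v1"   # hypothetical proxy address
model = "gpt-4o"
```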
max_tokens
Hard cap on response length in tokens. Most models enforce sensible built-in limits already; leave unset unless you specifically need to clip long outputs for cost or latency reasons.
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.max_tokens field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.max_tokens field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.max_tokens <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__max_tokens=
merge_system_into_user
Provider-specific quirk: fold the system prompt into the first user message instead of sending a separate system role. Only needed for models that reject (or mishandle) a standalone system role, e.g. certain older Mistral variants.
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.merge_system_into_user field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.merge_system_into_user field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.merge_system_into_user <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__merge_system_into_user=
model
Model identifier to send with each request: the ID string from the provider’s catalog (e.g. gpt-4o, claude-sonnet-4-5, llama-3.3-70b). Must match a model the provider actually serves on this account.
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.model field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.model field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.model <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__model=
native_tools
Override the provider’s default for native tool calling. None (default) honors the provider’s built-in choice. Some(true) forces native tool calls on, Some(false) forces text-fallback. Currently consulted only by the Groq factory, which defaults to text-fallback because llama-family Groq models reject native tool calls with HTTP 400. Setting native_tools = true re-enables native tool calling for Groq models that support it.
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.native_tools field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.native_tools field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.native_tools <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__native_tools=
pricing
Per-model pricing for cost tracking, USD per 1M tokens. Free-form key/value map. Keys are user-defined model identifiers; an optional .input / .output suffix encodes pricing dimension when the operator wants to split rates. A bare key without a suffix is used as a flat per-token rate when neither dimension is specified. Default is empty: cost tracking falls back to “unknown” rates and only token usage is recorded. Example: pricing = { opus = 15.0, sonnet = 3.0 } Or split: pricing = { "opus.input" = 15.0, "opus.output" = 75.0 }
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.pricing field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.pricing field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.pricing <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__pricing=
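The inline-table examples in the `pricing` description can also be written as a sub-table, which is easier to read once several models are listed. This is a sketch with a hypothetical alias and URL; the rates are the ones from the description:

```toml
[providers.models.custom.myendpoint]
uri = "https://llm.example.com/v1"
model = "opus"

# Equivalent to pricing = { ... } on the alias; USD per 1M tokens.
[providers.models.custom.myendpoint.pricing]
"opus.input" = 15.0    # split rate: input tokens
"opus.output" = 75.0   # split rate: output tokens
sonnet = 3.0           # bare key: flat rate
```

The suffixed keys are quoted because an unquoted `opus.input` is a TOML dotted key and would create a nested `opus` table instead of a single key containing a dot.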
provider_extra
Extra JSON parameters to include in API requests. Merged at the top level of the request body, allowing provider-specific features (routing, transforms, etc.) without code changes. The keys are forwarded as written, so use the names the upstream API expects. Example: provider_extra = { provider = { only = ["Anthropic"] } }
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.provider_extra field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.provider_extra field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.provider_extra <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__provider_extra=
requires_openai_auth
When true, the client pulls credentials from OPENAI_API_KEY or ~/.codex/auth.json instead of the api_key field above. Turn on only for the OpenAI Codex provider; leave off for standard API-key providers.
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.requires_openai_auth field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.requires_openai_auth field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.requires_openai_auth <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__requires_openai_auth=
temperature
Sampling temperature passed to the model. Lower values (0.0–0.3) give deterministic, near-verbatim output, which fits code, routing, summarization. Higher values (0.7–1.2) give more varied output, which fits open-ended chat.
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.temperature field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.temperature field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.temperature <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__temperature=
think
Enable or disable chain-of-thought thinking for models that support it (e.g. Qwen3, GLM-4). true turns thinking on, false turns it off. None (default) lets the model decide. Forwarded as enable_thinking in the request body; mirrors the Ollama provider’s think field.
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.think field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.think field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.think <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__think=
timeout_secs
HTTP request timeout in seconds. Bump this for slow local providers (Ollama on CPU, big local models) or high-latency networks; leave unset otherwise.
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.timeout_secs field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.timeout_secs field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.timeout_secs <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__timeout_secs=
uri
Endpoint URI the client hits. Override the family’s default endpoint when pointing at a self-hosted gateway (LiteLLM, vLLM, Ollama), a custom proxy, or any non-standard URL. Leave unset to use the family’s default URI from its ModelEndpoint impl. Set this to the FULL endpoint URL; there is no separate path-suffix field.
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.uri field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.uri field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.uri <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__uri=
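wire_api
Wire protocol used to talk to the endpoint. Omit the field, or set "chat_completions", for the default OpenAI chat-completions wire; set "responses" for an endpoint that only speaks the OpenAI responses wire. Only some slots honor this field; see the wire protocol section further down.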
Set it on any surface:
Gateway dashboard
Open /config/providers.models/custom and set the providers.models.custom.<alias>.wire_api field.
zerocode
In the Config pane, set the providers.models.custom.<alias>.wire_api field.
zeroclaw config
zeroclaw config set providers.models.custom.<alias>.wire_api <value>
Environment variable
Export the override (POSIX shells; drop into ~/.bashrc, ~/.zshrc, .env, or a Dockerfile). Replace <alias> with the literal alias:
export ZEROCLAW_providers__models__custom__<alias>__wire_api=
Controlling thinking mode varies by model family. think = false sets the top-level enable_thinking field in the request. Some models (e.g. Qwen3) instead read the flag inside the Jinja chat template, so it has to be passed through chat_template_kwargs, for example chat_template_kwargs = { enable_thinking = false }.
Other model families use different template variable names; check your model’s chat template and set the appropriate key under chat_template_kwargs.
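As a sketch, both routes can be set on one llama.cpp alias so that thinking is suppressed whichever way the model reads the flag. The alias and model name are hypothetical, and whether setting both at once is needed or harmless for a given model is an assumption to verify against its chat template:

```toml
# Hypothetical alias for a Qwen3 model served by llama-server.
[providers.models.llamacpp.qwen]
model = "qwen3-8b"
# Sets the top-level enable_thinking field in the request body.
think = false
# Forwarded verbatim to the Jinja chat template (llama.cpp-specific).
chat_template_kwargs = { enable_thinking = false }
```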
SGLang: slot sglang
sh
python -m sglang.launch_server --model meta-llama/Llama-3.1-8B-Instruct --port 30000
vLLM: slot vllm
sh
vllm serve meta-llama/Llama-3.1-8B-Instruct
LM Studio, Osaurus, LiteLLM
Slots lmstudio, osaurus, litellm follow the same pattern; see the catalog.
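For the slots with pre-applied defaults, an alias can be as small as a model name. The sketch below uses a hypothetical alias `local` and assumes each slot's default `uri` matches the address the server actually listens on; add a `uri` override where it does not:

```toml
# SGLang, launched as shown above.
[providers.models.sglang.local]
model = "meta-llama/Llama-3.1-8B-Instruct"

# vLLM, launched as shown above.
[providers.models.vllm.local]
model = "meta-llama/Llama-3.1-8B-Instruct"

# LM Studio; the model name is whatever the local server advertises (hypothetical here).
[providers.models.lmstudio.local]
model = "my-loaded-model"
```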
Wire protocol: wire_api = "responses"
Bring-your-own-endpoint slots default to the OpenAI chat-completions wire. An endpoint that only speaks the OpenAI responses wire (some self-hosted vLLM / TGI deployments) needs an explicit wire_api = "responses" opt-in on the alias entry.
When set to "responses", the provider is built as an OpenAiResponsesModelProvider (full streaming tool calls over the responses protocol) instead of a chat-completions provider. Omit the field, or set "chat_completions", for the default wire.
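On a bring-your-own endpoint the opt-in might look like the following sketch. The alias, URL, and model name are hypothetical; only the `wire_api` values come from the text above:

```toml
# Hypothetical self-hosted endpoint that only speaks the responses wire.
[providers.models.custom.selfhosted]
uri = "https://inference.example.com/v1"
model = "my-served-model"
wire_api = "responses"   # omit, or use "chat_completions", for the default wire
```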
wire_api is honored by the bring-your-own-endpoint families where the wire is operator-configurable: openai, llamacpp, and custom (plus the generic openai-compatible path). Branded vendor slots (groq, mistral, deepseek, …) have a fixed wire protocol and ignore the field, with one exception: opencode honors wire_api = "responses" because OpenCode Zen serves both wires. With no uri override, the OpenCode responses route targets https://opencode.ai/zen/v1/responses:
[providers.models.opencode.default]
model = "big-pickle"
wire_api = "responses"
The setting governs both the primary agent path and delegate targets, so a delegate whose target alias declares wire_api = "responses" reaches the endpoint over the responses wire.
Validation
Regardless of approach:
sh
zeroclaw config list # loads config; any validation failures print to stderr
zeroclaw models refresh --provider <type>.<alias> # list models the endpoint advertises
zeroclaw agent -a <alias> -m "hello" # smoke-test against the agent at `[agents.<alias>]`
Implementing the ModelProvider trait for a new provider
If the endpoint isn’t OpenAI-compatible and isn’t one of the local-server slots, you need code.
The trait lives in crates/zeroclaw-api/src/model_provider.rs:
#[async_trait]
pub trait ModelProvider: Send + Sync {
    fn name(&self) -> &str;
    fn supports_streaming(&self) -> bool { true }
    fn supports_streaming_tool_events(&self) -> bool { false }

    async fn chat(
        &self,
        messages: Vec<Message>,
        tools: Vec<ToolSchema>,
        options: ChatOptions,
    ) -> Pin<Box<dyn Stream<Item = Result<StreamEvent>> + Send>>;
}
Implementation pattern:
- Define the typed config in crates/zeroclaw-config/src/schema.rs:

    pub struct MyProviderModelProviderConfig {
        #[serde(flatten)]
        pub base: ModelProviderConfig,
        pub endpoint: MyProviderEndpoint,
        // family-specific fields
    }

    pub enum MyProviderEndpoint { Default }

    impl ModelEndpoint for MyProviderEndpoint {
        fn uri(&self) -> &'static str {
            match self {
                Self::Default => "https://my-provider.example.com/v1",
            }
        }
    }

- Add the slot to for_each_model_provider_slot! in crates/zeroclaw-config/src/providers.rs. Every helper picks up the new slot automatically.
- Add the runtime impl in crates/zeroclaw-providers/src/myprovider.rs. Translate Vec<Message> to the wire format, stream the response, emit StreamEvent values.
- Wire the factory branch in crates/zeroclaw-providers/src/lib.rs::create_provider_with_url_and_options.
- Add a feature flag in Cargo.toml if the provider pulls heavy deps.
See anthropic.rs as a reference for a provider with a fully custom wire format. See compatible.rs for the SSE-streaming OpenAI-compat pattern.
Troubleshooting
Authentication errors
- Verify the API key matches the endpoint (many vendors use key prefixes: sk-, gsk_, sk-ant-).
- Check that uri includes the scheme (http:// or https://) and the /v1 path if the endpoint expects it.
- Endpoints behind a VPN or proxy? Confirm routing from the ZeroClaw host.
Model not found
- List what the endpoint advertises:
sh
curl -sS "$URI/models" -H "Authorization: Bearer $API_KEY" | jq
Connection issues
- curl -I "$URI": does it respond?
- Firewall, proxy, egress rules? VPS providers sometimes block outbound high ports.
- Vendor status page if it’s a hosted service.
Gateway rejects temperature
Some gateways (e.g. a LiteLLM proxy fronting claude-opus-4-7) return an error
when a temperature field is present at all. ZeroClaw honors the Option
contract: if you leave temperature unset in config, the field is omitted
from the request body entirely and the backend picks its own default. Only set
temperature explicitly when the endpoint accepts it.
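For such a gateway, the alias simply carries no temperature key. A sketch, with a hypothetical alias and proxy address; the model name is the one from the example above:

```toml
# Hypothetical LiteLLM proxy alias.
[providers.models.litellm.proxy]
uri = "http://localhost:4000/v1"   # hypothetical proxy address
model = "claude-opus-4-7"
# temperature is deliberately absent, so the field is omitted from the request body.
```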
See also
- Overview: provider model and how per-agent dispatch works
- Configuration: full [providers.*] schema, Azure typed config, regional and OAuth variants
- Catalog: every canonical slot with a worked TOML example
- Developing → Plugin protocol: if a plugin works better than a first-class crate