Logging and observability

Every event ZeroClaw emits flows through one crate: zeroclaw-log. The crate owns the on-disk JSONL schema, the in-process broadcast stream the dashboard reads, the bridge to the typed Observer (Prometheus / OTel), and the macros (record!, scope!, spawn!) that subsystems call.

This page covers what an operator needs: configuration, where the log lives, the shape of the events, and how to query them.

Config ([observability])

[observability]
# Storage policy for the JSONL log.
# "none"    — in-process broadcast only (no disk writes).
# "rolling" — append + trim once `log_persistence_max_entries` is exceeded.
# "full"    — append forever, operator manages rotation.
log_persistence = "rolling"

# Workspace-relative path (or absolute).
log_persistence_path = "state/runtime-trace.jsonl"

# Cap for "rolling".
log_persistence_max_entries = 200

# Tool input/output capture policy.
# "off"      — only tool name + outcome + duration; no I/O bodies.
# "redacted" — bodies are leak-scanned and truncated at `log_tool_io_truncate_bytes`.
# "full"     — bodies are leak-scanned; no truncation.
log_tool_io = "redacted"
log_tool_io_truncate_bytes = 8192

# Tool names whose I/O is never persisted beyond name + outcome + duration,
# regardless of `log_tool_io`. For tools whose I/O is intrinsically sensitive.
log_tool_io_denylist = []

# OTel / Prometheus backend (independent of the JSONL log).
backend = "none"            # "none" | "log" | "verbose" | "prometheus" | "otel"
otel_endpoint = "http://localhost:4318"
otel_service_name = "zeroclaw"
# otel_headers = { Authorization = "Bearer …" }

Defaults: log_persistence = "rolling", log_persistence_max_entries = 200, log_tool_io = "redacted", log_tool_io_truncate_bytes = 8192. A fresh install produces a 200-event rolling JSONL at ~/.zeroclaw/state/runtime-trace.jsonl, and the dashboard’s Logs page works without further configuration.
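To make the "rolling" policy concrete, here is a minimal Python sketch of append-then-trim semantics. It is an illustration under assumptions, not the zeroclaw-log writer: the helper name append_rolling is hypothetical, and the real writer may trim in batches or on a different trigger.

```python
import os
import tempfile


def append_rolling(path: str, line: str, max_entries: int = 200) -> None:
    """Append one JSONL line, then keep only the newest max_entries lines.

    Hypothetical helper illustrating "rolling"; not the real writer.
    """
    with open(path, "a", encoding="utf-8") as f:
        f.write(line.rstrip("\n") + "\n")

    with open(path, "r", encoding="utf-8") as f:
        lines = f.readlines()
    if len(lines) <= max_entries:
        return

    # Rewrite through a temp file in the same directory, then rename over
    # the original so a reader never observes a half-written file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.writelines(lines[-max_entries:])
    os.replace(tmp, path)
```

With max_entries = 200, the file never holds more than 200 events after an append completes; older events are gone from disk, which is why "full" exists for operators who rotate logs themselves.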

log_persistence = "none" disables persistence entirely. The broadcast stream (dashboard SSE) and the typed Observer bridge still receive events; only the JSONL writer is gated.
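The interaction between log_tool_io, log_tool_io_truncate_bytes, and log_tool_io_denylist can be sketched as a single decision function. This is a reading of the config comments above, not the crate's code: capture_tool_io is a hypothetical name, and leak scanning is deliberately omitted because its rules are not described on this page.

```python
from typing import Optional


def capture_tool_io(
    tool_name: str,
    body: str,
    policy: str = "redacted",
    truncate_bytes: int = 8192,
    denylist: frozenset = frozenset(),
) -> Optional[str]:
    """Return the body to persist, or None when only name/outcome/duration are kept.

    Hypothetical helper; leak scanning is left out of this sketch.
    """
    if policy not in ("off", "redacted", "full"):
        raise ValueError(f"unknown log_tool_io policy: {policy!r}")
    # The denylist wins regardless of the configured policy.
    if tool_name in denylist or policy == "off":
        return None
    if policy == "full":
        return body
    # "redacted": cap the body at truncate_bytes of UTF-8, dropping any
    # multi-byte character that the cut would split.
    raw = body.encode("utf-8")
    if len(raw) <= truncate_bytes:
        return body
    return raw[:truncate_bytes].decode("utf-8", errors="ignore")
```

Truncating on bytes rather than characters keeps the on-disk cost predictable; the errors="ignore" decode is one way to avoid emitting a broken trailing character.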

On-disk format

JSONL: one event per line, UTF-8, 0o600 permissions on Unix. Every line is sync_data’d after write — the line is durable before the emitting code returns.
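For readers who want to see what "0o600 and synced after every line" amounts to at the system-call level, here is a rough Python equivalent. It is a sketch only: append_durable is a hypothetical name, and os.fsync is used for portability even though Rust's sync_data corresponds more closely to fdatasync.

```python
import os


def append_durable(path: str, line: str) -> None:
    """Append one line and flush it to stable storage before returning."""
    # O_APPEND positions every write at the end of the file. The 0o600 mode
    # applies only when the file is created and is subject to the umask.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview((line.rstrip("\n") + "\n").encode("utf-8"))
        while view:  # os.write may write fewer bytes than requested
            written = os.write(fd, view)
            view = view[written:]
        # fsync flushes data and metadata; sync_data (fdatasync) skips
        # metadata that is not needed to read the data back.
        os.fsync(fd)
    finally:
        os.close(fd)
```

The cost of this guarantee is one sync per event, which is part of why high-frequency noise is better routed to the "internal" category or kept out of persistence altogether.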

Line shape mirrors zeroclaw_log::event::LogEvent. Top-level keys:

| Key | Type | Notes |
| --- | --- | --- |
| id | UUID v4 string | Persistent event id. |
| @timestamp | RFC 3339 + ms, UTC | Lexicographically sortable; the reader sorts on this. |
| severity_number | u8 | OTel: 1 TRACE, 5 DEBUG, 9 INFO, 13 WARN, 17 ERROR. |
| severity_text | string | Bucket label for severity_number. |
| event.category | string | agent, channel, cron, memory, tool, provider, session, system, or internal. |
| event.action | string | Stable identifier (llm_request, channel_message_inbound, …). |
| event.outcome | string or omitted | success, failure, unknown (omitted when unknown). |
| service.name | string | Constant "zeroclaw". |
| service.version | string | Crate version of the running daemon. |
| trace_id | hex string or omitted | Per-turn correlation. One agent turn = one trace_id. |
| span_id | hex string or omitted | Sub-span within a turn. |
| zeroclaw.* | flat string map | Alias-bound attribution (see below). |
| message | string or omitted | Human-readable line body. |
| attributes | object or omitted | Free-form per-action payload. |
| schema_version | u8 | Currently 2. v1 rows migrate in-place on startup. |
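Because each line is a standalone JSON object, the file can be inspected with a few lines of Python when the gateway is not available. The sketch below filters by severity_number and sorts on @timestamp, relying on the lexicographic ordering noted in the table. The sample rows are made up for illustration (real ids are UUID v4) and use only a handful of the keys listed above.

```python
import json
from typing import Iterable, Iterator


def at_least(lines: Iterable[str], min_severity: int = 13) -> Iterator[dict]:
    """Yield parsed events at or above min_severity, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("severity_number", 0) >= min_severity:
            yield event


# Made-up sample rows; a real run would iterate over the open JSONL file.
SAMPLE = [
    '{"id": "a", "@timestamp": "2024-01-01T00:00:02.000Z", "severity_number": 17, "severity_text": "ERROR", "message": "second"}',
    '{"id": "b", "@timestamp": "2024-01-01T00:00:01.000Z", "severity_number": 13, "severity_text": "WARN", "message": "first"}',
    '{"id": "c", "@timestamp": "2024-01-01T00:00:03.000Z", "severity_number": 9, "severity_text": "INFO", "message": "dropped"}',
]

# String comparison is enough because the timestamps share one fixed format.
events = sorted(at_least(SAMPLE), key=lambda e: e["@timestamp"])
```

To run this against a live file, pass the open file object in place of SAMPLE, for example at_least(open(path, encoding="utf-8")).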

zeroclaw.* attribution

The Rust source of truth is ATTRIBUTION_FIELDS + COMPOSITE_PREFIXES in crates/zeroclaw-log/src/event.rs. The /api/logs response carries the canonical list as attribution_keys; fetch it instead of hard-coding.

Plain fields (ATTRIBUTION_FIELDS) carry a single string each. Composite prefixes get three keys: <prefix>, <prefix>_type, <prefix>_alias (e.g. channel = "discord.glados", channel_type = "discord", channel_alias = "glados"). Filters can match either coarse or precise.

When a tracing call sets a composite-prefix field to a bare type (no .), only the _type slot is populated — that way a tracing::*!(model_provider = name, …) call inside a span that already carries the full <type>.<alias> composite doesn’t clobber it on the leaf→root merge.
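The composite-prefix rule can be modelled in a few lines. The following Python sketch is a reading of the two paragraphs above, not a port of event.rs: expand_composite is a hypothetical name, and the dictionary merge at the end stands in for the span merge, whose exact order and mechanics are not described here.

```python
def expand_composite(prefix: str, value: str) -> dict:
    """Expand a composite-prefix field into its attribution keys.

    "<type>.<alias>" fills <prefix>, <prefix>_type, and <prefix>_alias.
    A bare type (no ".") fills only <prefix>_type.
    """
    if "." in value:
        type_part, alias_part = value.split(".", 1)
        return {
            prefix: value,
            f"{prefix}_type": type_part,
            f"{prefix}_alias": alias_part,
        }
    return {f"{prefix}_type": value}


# An enclosing span carries the full composite; an inner call sets a bare type.
outer = expand_composite("channel", "discord.glados")
inner = expand_composite("channel", "discord")

# Assumed merge: inner fields overlay outer ones. Because the bare type only
# touches channel_type, the composite and the alias survive.
merged = {**outer, **inner}
```

Here merged still has channel = "discord.glados" and channel_alias = "glados", which is the non-clobbering behaviour the paragraph describes.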

Querying

The dashboard’s Logs page is the primary surface. Underneath:

GET /api/logs

Top-level filters (query params): since_ts, until_ts, until_id, action, category, outcome, severity_min, trace_id, q (substring across message + attributes), hide_internal (drops event.category = "internal"), limit.

Every other ?<key>=<value> is treated as a per-attribution equality filter — the gateway validates the key against is_attribution_field and rejects unknowns with 400. The response includes attribution_keys: string[], so callers don’t have to guess.

Examples:

# All WARN+ events (add since_ts=<daemon_started_at> to limit to the current run).
curl "$ZEROCLAW_GATEWAY/api/logs?severity_min=13"

# A specific agent's events:
curl "$ZEROCLAW_GATEWAY/api/logs?agent_alias=glados"

# Discord traffic for one bot:
curl "$ZEROCLAW_GATEWAY/api/logs?channel=discord.glados"

# A single agent turn:
curl "$ZEROCLAW_GATEWAY/api/logs?trace_id=<value-from-a-prior-event>"

Pagination is reverse-cursor. The response includes next_cursor: [timestamp, id] | null; pass these back as until_ts + until_id to load older. at_end: true means the reader scanned the whole file for the current filter.
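A client-side loop over the reverse cursor might look like the sketch below. Several details are assumptions rather than documented API: the name of the response field holding the event array (events_key, defaulting to "events") is hypothetical, as are the helper names, and no authentication is shown. Only next_cursor, until_ts, and until_id come from the description above.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator, Optional


def http_fetch(base_url: str) -> Callable[[dict], dict]:
    """Build a fetcher that GETs /api/logs with the given query params."""
    def fetch(params: dict) -> dict:
        query = urllib.parse.urlencode(params)
        with urllib.request.urlopen(f"{base_url}/api/logs?{query}") as resp:
            return json.load(resp)
    return fetch


def iter_events(
    fetch_page: Callable[[dict], dict],
    filters: Optional[dict] = None,
    events_key: str = "events",  # hypothetical field name; check the response
) -> Iterator[dict]:
    """Walk newest to oldest, following next_cursor until it is null."""
    params = dict(filters or {})
    while True:
        page = fetch_page(params)
        yield from page.get(events_key, [])
        cursor = page.get("next_cursor")
        if cursor is None:
            return
        # next_cursor is [timestamp, id]; feed both back for the older page.
        params["until_ts"], params["until_id"] = cursor
```

Usage would be along the lines of iter_events(http_fetch(gateway_url), {"severity_min": 13}). Passing the fetcher in as a callable keeps the paging logic testable without a running daemon.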

The /api/status response includes daemon_started_at: string (RFC 3339), so a dashboard can default to “since daemon start” without an extra round-trip.
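As a small illustration of the "since daemon start" default, the following Python helper builds a /api/logs URL from an already-fetched /api/status body. It assumes since_ts accepts the same RFC 3339 string that daemon_started_at carries, which this page does not state explicitly; logs_url_since_start is a hypothetical name.

```python
from urllib.parse import urlencode


def logs_url_since_start(gateway: str, status: dict, **filters) -> str:
    """Build a /api/logs URL scoped to events since the daemon started.

    `status` is the parsed /api/status response; extra keyword arguments
    become additional query params (top-level or attribution filters).
    """
    params = {"since_ts": status["daemon_started_at"], **filters}
    return f"{gateway.rstrip('/')}/api/logs?{urlencode(params)}"
```

urlencode percent-encodes the colons in the timestamp, so the result can be passed straight to curl or an HTTP client.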

External log viewers

The JSONL schema is an OTel-logs + ECS hybrid: @timestamp, severity_number + severity_text, event.{category,action,outcome}, service.{name,version}, attributes, plus the zeroclaw.* vendor namespace. Most log viewers ingest it with little or no transform. Replace <install> with the absolute path to your install dir in the examples below (typically ~/.zeroclaw expanded).

Grafana Loki

Promtail labels lift agent_alias, channel, and severity_text so they’re filterable in Grafana:

scrape_configs:
  - job_name: zeroclaw
    static_configs:
      - targets: [localhost]
        labels:
          job: zeroclaw
          __path__: <install>/data/state/runtime-trace.jsonl
    pipeline_stages:
      - json:
          expressions:
            agent: zeroclaw.agent_alias
            channel: zeroclaw.channel
            level: severity_text
      - labels:
          agent:
          channel:
          level:
      - timestamp:
          source: '@timestamp'
          format: RFC3339

OpenTelemetry Collector

The filelog receiver maps the schema directly. Export to any OTel sink afterward (Tempo, Honeycomb, Datadog, etc.):

receivers:
  filelog/zeroclaw:
    include: [<install>/data/state/runtime-trace.jsonl]
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes["@timestamp"]
          layout: '%Y-%m-%dT%H:%M:%S.%LZ'
        severity:
          parse_from: attributes.severity_number

Kibana / Elastic

Ingest works as-is. Strict ECS pipelines expect log.level in place of severity_text. A Filebeat ingest pipeline that renames severity_text to log.level (and severity_number to log.syslog.severity.code) covers the gap. @timestamp and event.{category,action,outcome} are already in canonical positions.
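The rename step described above can be expressed as an Elasticsearch ingest pipeline with two rename processors. The Python sketch below only builds the pipeline body as a dictionary; the function name and description text are illustrative, and the pipeline name and how it is attached to Filebeat are left to the operator.

```python
import json


def ecs_rename_pipeline() -> dict:
    """Build an ingest pipeline body that moves severity fields to ECS names."""
    renames = {
        "severity_text": "log.level",
        "severity_number": "log.syslog.severity.code",
    }
    return {
        "description": "Rename ZeroClaw severity fields to ECS positions",
        "processors": [
            # ignore_missing avoids failing documents that lack the field.
            {"rename": {"field": src, "target_field": dst, "ignore_missing": True}}
            for src, dst in renames.items()
        ],
    }


if __name__ == "__main__":
    # Print the JSON body, ready to send to the _ingest/pipeline API.
    print(json.dumps(ecs_rename_pipeline(), indent=2))
```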

Vector / Fluent Bit

Both tail JSONL with a JSON parser stage; no schema transforms needed before shipping to any backend.

Terminal format

The daemon’s stderr formatter prefixes every line with the closest enclosing alias-bound identity:

  • agent context → [<agent_alias>]
  • channel-only context (channel listener, no agent yet) → [<channel_composite>] (e.g. [discord.glados])
  • otherwise → [system]

The span chain follows: channel_listener{channel=discord.glados}: …. Span fields are visible inline.
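The prefix selection above reduces to a short precedence check. This Python sketch assumes the formatter sees a flat mapping of the merged attribution fields, keyed as agent_alias and channel; the real formatter walks the span chain, so treat line_prefix as a hypothetical simplification.

```python
from typing import Mapping


def line_prefix(fields: Mapping[str, str]) -> str:
    """Pick the stderr prefix: agent alias, else channel composite, else system."""
    agent = fields.get("agent_alias")
    if agent:
        return f"[{agent}]"
    channel = fields.get("channel")
    if channel:
        return f"[{channel}]"
    return "[system]"
```

An agent context wins even when a channel is also present, which matches the ordering of the list above.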

Schema migration

On startup, if log_persistence is enabled and the file exists, the writer streams any schema-1 rows through an in-place migration to schema-2 before the first append. The migration is purely streaming, so memory use is bounded by a single line’s allocation regardless of file size. The migrated file is atomically renamed into place. Files already at v2 are left untouched.

If migration fails, the daemon logs a warning and continues appending v2 rows. The old v1 rows stay in the file: tools that still understand v1 can read them, but the v2 reader’s deserializer rejects them.
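The stream-then-rename pattern described in this section can be sketched as follows. The actual v1 to v2 field changes are not documented on this page, so upgrade_row is a placeholder that only bumps schema_version; treating a row without schema_version as v1 is likewise an assumption, and both function names are hypothetical.

```python
import json
import os
import tempfile


def upgrade_row(row: dict) -> dict:
    """Placeholder v1 -> v2 transform; the real field changes live in migrate.rs."""
    upgraded = dict(row)
    upgraded["schema_version"] = 2
    return upgraded


def needs_migration(path: str) -> bool:
    """Return True if any row is older than v2 (missing version counts as v1)."""
    with open(path, encoding="utf-8") as src:
        return any(
            json.loads(line).get("schema_version", 1) < 2
            for line in src
            if line.strip()
        )


def migrate_in_place(path: str) -> bool:
    """Rewrite old rows as v2, one line at a time. Returns True if rewritten."""
    if not needs_migration(path):
        return False  # already v2: leave the file untouched
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)  # created with 0o600
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as dst, \
                open(path, encoding="utf-8") as src:
            for line in src:  # only one line is held in memory at a time
                if not line.strip():
                    continue
                row = json.loads(line)
                if row.get("schema_version", 1) < 2:
                    row = upgrade_row(row)
                dst.write(json.dumps(row, ensure_ascii=False) + "\n")
            dst.flush()
            os.fsync(dst.fileno())
        os.replace(tmp, path)  # atomic rename into place
    except Exception:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return True
```

Writing to a temp file in the same directory matters: os.replace is only atomic when source and destination are on the same filesystem.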

What is internal?

event.category = "internal" is the bucket for ops noise an operator doesn’t need on the dashboard by default: heartbeat ticks, idle broadcasts, lossy sync retries, and the like. The dashboard’s “Hide internal” toggle (on by default) filters these.

Use it when you have a high-frequency event whose presence matters for forensics but whose absence is the normal state. Don’t use it as a volume governor for genuine errors.

Files of interest

  • crates/zeroclaw-log/src/event.rs: the canonical LogEvent shape.
  • crates/zeroclaw-log/src/layer.rs: the tracing-subscriber Layer that captures every tracing::* call and feeds the pipeline.
  • crates/zeroclaw-log/src/macro.rs: record!, scope!, spawn!.
  • crates/zeroclaw-log/src/writer.rs: append + rolling trim.
  • crates/zeroclaw-log/src/reader.rs: /api/logs reader.
  • crates/zeroclaw-log/src/config.rs: StoragePolicy, ToolIoPolicy, ResolvedPolicy.
  • crates/zeroclaw-log/src/migrate.rs: schema-1 → schema-2 streaming migration.
  • crates/zeroclaw-log/src/observer_bridge.rs: typed Observer projection for Prometheus / OTel consumers.
  • crates/zeroclaw-gateway/src/api_logs.rs: the HTTP adapter.

Touch the source before you trust the prose on this page.