Observability

When an agent does the wrong thing, you need to know why. Minara Agent emits three distinct streams of information: structured logs, the audit log, and tier events. Each has a purpose. This page explains what goes where and how to investigate common failure modes.

Why three streams instead of one big log? Agent misbehavior shows up in three shapes: the infrastructure broke (LLM call failed, DB locked), the agent took the wrong action (called a tool with bad args), or permission logic blocked something unexpectedly. One stream that tries to cover all three becomes unreadable. Splitting them lets you go straight to the right source: behavior → audit, permissions → tier_events, infra → structured logs. All three share a trace_id, so you can stitch a story back together when you need the full picture.

See this in use: the 6-stage safety stack writes a row to tier_events for every fund-moving call. This is how you prove, after the fact, that a trade passed every gate before executing.

The three streams

Stream	Lives in	Purpose
Structured logs	`$dataDir/logs/*.ndjson` plus stdout	Operational visibility, startup state, LLM calls
Audit log	SQLite `audit` table	"What tools did the agent call and what did they do"
Tier events	SQLite `tier_events` table	"Why was this call allowed or blocked"

The rule of thumb: if you're debugging behavior, start with the audit log. If you're debugging infrastructure, start with structured logs. If you're debugging permissions, start with tier events.

Structured logs

apps/agent/src/core/logger.ts exposes a minimal JSON logger:

logger.info("skills/registry", "skill_registered", { id });
logger.warn("agent/loop", "max_iterations_reached", { session_id });
logger.error("llm/anthropic-wire", "cache_miss", { reason });

Each line is a single JSON object (pretty-printed here for readability) with these fields:

{
  "ts": "2026-04-14T12:34:56.789Z",
  "level": "info",
  "category": "skills/registry",
  "event": "skill_registered",
  "correlation_id": "turn_abc123",
  "trace_id": "t_xyz",
  "data": { "id": "minara.core" }
}

Correlation IDs come from withCorrelation(id, fn) wrappers around turn execution. Every log line produced inside a turn inherits the same id via AsyncLocalStorage, so jq -c 'select(.correlation_id == "turn_abc123")' $dataDir/logs/*.ndjson returns the full story of one turn.
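The inheritance mechanism can be sketched with Node's AsyncLocalStorage. This is a minimal illustration, not the code in logger.ts: the store shape, the formatLine helper, and the null fallback outside a turn are all assumptions made for the example.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical store shape: the real logger may track more than one field.
const correlationStore = new AsyncLocalStorage<{ correlationId: string }>();

// Runs fn with the given correlation id visible to everything it calls,
// including code that resumes after an await.
function withCorrelation<T>(id: string, fn: () => T): T {
  return correlationStore.run({ correlationId: id }, fn);
}

// Builds one NDJSON line, reading the correlation id from the active context.
// formatLine is an illustrative helper, not part of the documented API.
function formatLine(
  level: "debug" | "info" | "warn" | "error",
  category: string,
  event: string,
  data: Record<string, unknown> = {},
): string {
  return JSON.stringify({
    ts: new Date().toISOString(),
    level,
    category,
    event,
    correlation_id: correlationStore.getStore()?.correlationId ?? null,
    data,
  });
}

// Inside the wrapper the id is attached automatically; outside it is null.
const insideTurn = withCorrelation("turn_abc123", () =>
  formatLine("info", "agent/loop", "turn_started"),
);
const outsideTurn = formatLine("info", "core/boot", "startup");
```

The point of the design is that call sites never pass the id explicitly: anything that logs inside the wrapped function picks it up from the ambient context.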

Log rotation

Logs are written with daily rotation under $dataDir/logs/. Configure the level threshold with:

LOG_LEVEL (debug / info / warn / error, default info)

Setting LOG_LEVEL=debug is safe: the structured format means you can jq out the noise. Expect roughly 4× the log volume.
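Filtering debug noise after the fact might look like the following. It is a sketch that assumes only the level field shown in the sample line earlier; LEVEL_RANK and filterByLevel are hypothetical names, not part of the logger.

```typescript
type Level = "debug" | "info" | "warn" | "error";

// Severity order matching the LOG_LEVEL values listed above.
const LEVEL_RANK: Record<Level, number> = { debug: 0, info: 1, warn: 2, error: 3 };

// Keeps only NDJSON lines at or above the given threshold.
// Lines that fail to parse or carry an unknown level are dropped.
function filterByLevel(ndjson: string, threshold: Level): string[] {
  return ndjson
    .split("\n")
    .filter((line) => line.trim() !== "")
    .filter((line) => {
      try {
        const level = (JSON.parse(line) as { level?: string }).level;
        const rank = LEVEL_RANK[level as Level];
        return typeof rank === "number" && rank >= LEVEL_RANK[threshold];
      } catch {
        return false;
      }
    });
}

const sample = [
  '{"level":"debug","event":"cache_probe"}',
  '{"level":"info","event":"skill_registered"}',
  '{"level":"error","event":"cache_miss"}',
].join("\n");

// Only the error line survives a "warn" threshold.
const warnAndUp = filterByLevel(sample, "warn");
```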

Audit log: the source of truth

Every tool call goes through auditLogHook in apps/agent/src/core/audit-log-hook.ts. The table schema:

CREATE TABLE audit (
  id              TEXT PRIMARY KEY,
  session_id      TEXT,
  trace_id        TEXT,
  tool_name       TEXT,
  tool_set        TEXT,
  args_json       TEXT,       -- redacted
  result_json     TEXT,
  blocked         INTEGER,
  block_reason    TEXT,
  permission_tier INTEGER,
  source          TEXT,       -- user | cron | autopilot | delegation
  duration_ms     INTEGER,
  created_at      INTEGER
);
CREATE VIRTUAL TABLE audit_fts USING fts5(
  tool_name, block_reason, args_json, content='audit'
);

FTS5 means you can grep the whole history:

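-- MATCH runs against the FTS index (audit_fts), not the base table.
SELECT a.created_at, a.tool_name, a.blocked, a.block_reason
  FROM audit_fts
  JOIN audit a ON a.rowid = audit_fts.rowid
 WHERE audit_fts MATCH 'withdraw'
 ORDER BY a.created_at DESC
 LIMIT 50;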

Every investigation starts here. When a user reports "the agent did something weird," the first move is to pull their session's audit rows in order. The tool arguments, the raw tool output, any block reason, and the timestamps are all there.

What's redacted

The redactor in tools/_shared/result.ts masks known sensitive keys (api_key, secret, password, token, private_key, mnemonic, seed) with *** before anything reaches the audit log. Combined with the rule that secrets are never tool arguments to begin with (they come from process.env at factory time), the audit log is safe to share with support or drop into a bug report with minimal screening.
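As an illustration of key-based masking, a redactor along these lines would produce the *** values described above. The key list comes from the text; exact-match comparison and recursion into nested objects and arrays are assumptions of this sketch, and the real code in tools/_shared/result.ts may behave differently.

```typescript
// Key list taken from the text above; exact-match comparison is an assumption.
const SENSITIVE_KEYS = new Set([
  "api_key", "secret", "password", "token", "private_key", "mnemonic", "seed",
]);

// Returns a deep copy with sensitive values replaced by "***".
// Recursing into nested objects and arrays is an assumption of this sketch.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SENSITIVE_KEYS.has(key) ? "***" : redact(inner);
    }
    return out;
  }
  return value;
}

// Example tool arguments (hypothetical) serialized the way args_json might be.
const argsJson = JSON.stringify(
  redact({ symbol: "BTC", auth: { token: "abc", user: "alice" } }),
);
```

Masking by key name only catches secrets stored under a recognized key, which is why the rule that secrets never travel as tool arguments matters as the first line of defense.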

Tier events: why things were blocked

core/permission-tier-hook.ts emits a row to tier_events for every decision, whether allowed or blocked:

CREATE TABLE tier_events (
  id            TEXT PRIMARY KEY,
  tool_name     TEXT,
  source        TEXT,
  tier          INTEGER,
  allow_ceiling INTEGER,
  decision      TEXT,          -- allow | block | pending_confirmation
  reason        TEXT,
  trace_id      TEXT,
  created_at    INTEGER
);

The audit log tells you what happened. Tier events tell you why the permission system made that call. The two join on trace_id and are often queried together:

SELECT a.tool_name, a.blocked, te.decision, te.reason
  FROM audit a
  LEFT JOIN tier_events te ON te.trace_id = a.trace_id
                           AND te.tool_name = a.tool_name
 WHERE a.session_id = ?
 ORDER BY a.created_at;

Correlation and trace IDs

There are two distinct ids in play:

correlation_id is per-turn. Generated at the start of each turn by the agent loop. Appears in structured logs and audit rows.
trace_id is per-workflow-run or per-signal. Propagated from a SignalContext or a WorkflowInstance into every tool call downstream, including sub-agent delegations.

A user chat turn usually has a fresh correlation_id and no trace_id. A cron fire has both. trace_id lets you query "what did the 14:03 BTC alert do across its whole lifetime, including any sub-agents it spawned."

Common investigations

"Why did the agent refuse to call X?"

SELECT tool_name, decision, reason, created_at
  FROM tier_events
 WHERE tool_name = 'swap'
   AND decision = 'block'
 ORDER BY created_at DESC
 LIMIT 10;

Check reason. The most common values:

tier_exceeds_ceiling. Skill wasn't activated, or the turn's allowRiskTier is lower than the tool's tier.
analysis_to_trade_boundary. The turn already made analysis calls; trade calls must be a separate user message.
daily_cap_exceeded. daily_spend plus this call's notional would exceed MINARA_DAILY_CAP_USD.
kill_switch_active. Someone (or the agent) called kill.
tool_set_not_allowed. The turn's allowedToolSets excluded this tool.
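To make the conditions behind each reason concrete, here is a hedged sketch of how such checks could be expressed. It is not the logic in core/permission-tier-hook.ts: the CallContext fields are hypothetical names, and the order in which checks run is an assumption.

```typescript
type BlockReason =
  | "kill_switch_active"
  | "tool_set_not_allowed"
  | "tier_exceeds_ceiling"
  | "analysis_to_trade_boundary"
  | "daily_cap_exceeded";

// Hypothetical inputs: field names are illustrative, not the real hook's types.
interface CallContext {
  killSwitchActive: boolean;
  allowedToolSets: string[];
  toolSet: string;
  toolTier: number;
  allowRiskTier: number;
  isTradeCall: boolean;
  turnMadeAnalysisCalls: boolean;
  dailySpendUsd: number;
  notionalUsd: number;
  dailyCapUsd: number; // e.g. read from MINARA_DAILY_CAP_USD
}

// Returns the first failing check, or null when the call would be allowed.
// The check order here is an assumption.
function firstBlockReason(ctx: CallContext): BlockReason | null {
  if (ctx.killSwitchActive) return "kill_switch_active";
  if (!ctx.allowedToolSets.includes(ctx.toolSet)) return "tool_set_not_allowed";
  if (ctx.toolTier > ctx.allowRiskTier) return "tier_exceeds_ceiling";
  if (ctx.isTradeCall && ctx.turnMadeAnalysisCalls) return "analysis_to_trade_boundary";
  if (ctx.dailySpendUsd + ctx.notionalUsd > ctx.dailyCapUsd) return "daily_cap_exceeded";
  return null;
}

const base: CallContext = {
  killSwitchActive: false,
  allowedToolSets: ["trading"],
  toolSet: "trading",
  toolTier: 2,
  allowRiskTier: 2,
  isTradeCall: true,
  turnMadeAnalysisCalls: false,
  dailySpendUsd: 900,
  notionalUsd: 50,
  dailyCapUsd: 1000,
};

const allowed = firstBlockReason(base); // 950 <= 1000, nothing blocks
const overCap = firstBlockReason({ ...base, notionalUsd: 150 }); // 1050 > 1000
```

When a reason in tier_events looks surprising, comparing the row's tier and allow_ceiling columns against a mental model like this one is usually the fastest way to see which input was off.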

"Why is this turn slow?"

SELECT tool_name, AVG(duration_ms) AS avg_ms, COUNT(*) AS n
  FROM audit
 WHERE session_id = ?
 GROUP BY tool_name
 ORDER BY avg_ms DESC;

Combine with structured logs grepped by correlation_id to see LLM call durations and cache hit ratios.

"Did the prompt cache work?"

Grep the logs for llm.cache_read_input_tokens. If the value is 0 across a session, the cacheable prompt blocks aren't stable (you're probably regenerating them per turn, which defeats caching). See LLM Integration.

"What did autopilot do overnight?"

SELECT created_at, tool_name, blocked, substr(result_json, 1, 200)
  FROM audit
 WHERE source = 'autopilot'
   AND created_at > ?
 ORDER BY created_at;

Health endpoints

The HTTP gateway exposes:

GET /healthz is a liveness probe that returns 200 if the process is up and the DB is openable.
GET /status returns readiness detail: DB stats, skill count, active triggers, last-successful LLM call timestamp.

See the API reference for the exact schema.

Exporting for external tooling

If you want logs in Loki / Datadog / Grafana Cloud, pipe stdout:

docker run minara 2>&1 | vector --config vector.toml

All stdout lines are valid NDJSON. There is no separate "structured log" export path: stdout is canonical.

Observability is boring on purpose. Two tables, one log format, one correlation field, one trace field. When something goes wrong, you read rows. When nothing goes wrong, you ignore it. That is the entire design.
