MINARA

Safety & Sandboxing

Sandbox, permission tiers, hook pipeline, and the 6-stage finance safety stack

Minara applies safety at two levels: system-wide isolation and permission gates that apply to every tool call, and a finance-specific safety stack that applies to trade paths.

Why 6 stages for finance? Each stage catches a different class of mistake. The LLM can hallucinate a ticker, forget a preference, get prompt-injected, or simply have stale prices. Stacking 6 cheap-but-orthogonal checks means an error has to slip through all of them to move real funds, which empirically doesn't happen. One big "LLM judge" check would cost more and would still miss the categorical errors that a 1-line typed guard catches.

This page covers both, starting with the foundations.

📘 For the operator-facing companion — how the four independent safety layers (command-guard, OS jail, fund-moving confirm, script-risk gate) show up in daily use, and what each protects against — see the Security chapter. This page is the implementation reference; that chapter is the user walk-through.

See this in use: every fund-moving feature page (Trading, Portfolio, Predictions) links back here. The Your First Trade walkthrough shows the user-facing preview step (stage 3 in the 6-stage stack).

Foundations: sandbox, tiers, and the hook pipeline

Every tool call passes through two orthogonal layers: sandbox (filesystem isolation — what a tool can touch) and permission tiers plus the hook pipeline (behavioral gating — what a tool is allowed to do right now).

Sandbox

All filesystem tools root at $dataDir/sandbox/files/ (default: ~/.minara/sandbox/files/). Every path argument goes through resolveInSandbox() from apps/agent/src/tools/_shared/sandbox.ts, which:

  • Resolves the requested path against the sandbox root.
  • Rejects paths containing .. that escape the root.
  • Resolves symlinks and rejects any that point outside the sandbox.
  • Returns the absolute resolved path the tool can safely use.

Tools are physically incapable of reading or writing files under apps/agent/src/, the user's home dir, or anywhere else outside the sandbox. This applies to read_file, write_file, patch, search_files, and every file-adjacent tool.
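The four resolution steps listed above can be sketched as follows. This is a minimal illustration, not the contents of sandbox.ts: the function name resolveInSandboxSketch, its two-argument signature, and the helper names are assumptions, and the real resolveInSandbox() may handle more cases.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Throws unless `target` is `root` itself or a descendant of it.
function assertInside(root: string, target: string): void {
  const rel = path.relative(root, target);
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) throw new Error(`path escapes sandbox: ${target}`);
}

// True if any entry (file, directory, or symlink, even a dangling one) exists at `p`.
function entryExists(p: string): boolean {
  try {
    fs.lstatSync(p);
    return true;
  } catch {
    return false;
  }
}

// Hypothetical stand-in for resolveInSandbox(); the real signature may differ.
function resolveInSandboxSketch(sandboxRoot: string, requested: string): string {
  const root = fs.realpathSync(sandboxRoot);       // canonical sandbox root
  const candidate = path.resolve(root, requested); // step 1: resolve against the root
  assertInside(root, candidate);                   // step 2: reject ".." escapes

  // Step 3: resolve symlinks on the deepest existing ancestor, so a path
  // that does not exist yet (a file about to be written) can still be checked.
  let existing = candidate;
  while (!entryExists(existing)) existing = path.dirname(existing);
  const real = fs.realpathSync(existing);
  assertInside(root, real);

  // Step 4: re-attach the not-yet-existing tail and return an absolute path.
  return path.join(real, path.relative(existing, candidate));
}
```

Comparing with path.relative rather than a plain string-prefix test avoids treating a sibling directory such as sandbox/files-old as if it were inside sandbox/files.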

Shell egress is also locked down. The terminal tool scrubs 38 credential-like env prefixes from every subprocess, and shell-wrapper egress (bash -c curl and friends) is denied by default.

Permission tiers

Every ToolEntry carries a permissionTier:

Tier  Name            Examples
1     READ_ONLY       get_price, get_balance, read_file, web_search
2     CONFIRM_ONCE    analyze_market, deep_research_run, small swaps
3     ALWAYS_CONFIRM  write_file, patch, buy_token, docx_create
4     MANUAL_ONLY     transfer_token to external address, emergency stop flip

The hook pipeline

A global BeforeToolCallHook runs before every tool invocation. It enforces, in order:

  1. Emergency stop. If safetyConfig.killSwitch is active, all trading tools (tier ≥ 2) are blocked.
  2. Daily spend cap. Accumulated fund-moving volume is tracked against safetyConfig.dailySpendCap.
  3. Per-tx max. Individual trade amount checked against safetyConfig.perTxMax.
  4. Tier gating. On autonomous turns (no user in the loop), tier 3+ tools are blocked unless safetyConfig.autopilotEnabled is true.
  5. Manual-only enforcement. Tier 4 tools always require a user-confirmation round-trip.
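The ordering above can be read as a chain of early returns. The sketch below assumes a simplified call shape (ToolCallSketch) and hypothetical reason strings; only the four safetyConfig field names come from this page, and the real BeforeToolCallHook may carry more state.

```typescript
type Tier = 1 | 2 | 3 | 4;

interface SafetyConfigSketch {
  killSwitch: boolean;
  dailySpendCap: number;
  perTxMax: number;
  autopilotEnabled: boolean;
}

// Hypothetical call shape, for illustration only.
interface ToolCallSketch {
  tier: Tier;
  amountUsd: number;      // 0 for calls that move no funds
  autonomous: boolean;    // true when no user is in the loop
  userConfirmed: boolean; // true once the user approved this exact call
}

interface HookDecision {
  action: "allow" | "confirm" | "block";
  reason?: string;
}

function beforeToolCallSketch(
  call: ToolCallSketch,
  cfg: SafetyConfigSketch,
  spentTodayUsd: number,
): HookDecision {
  // 1. Emergency stop blocks every tier >= 2 tool.
  if (cfg.killSwitch && call.tier >= 2) return { action: "block", reason: "emergency_stop" };
  // 2. Daily spend cap on accumulated fund-moving volume.
  if (spentTodayUsd + call.amountUsd > cfg.dailySpendCap) {
    return { action: "block", reason: "daily_spend_cap" };
  }
  // 3. Per-transaction maximum.
  if (call.amountUsd > cfg.perTxMax) return { action: "block", reason: "per_tx_max" };
  // 4. Tier gating on autonomous turns.
  if (call.autonomous && call.tier >= 3 && !cfg.autopilotEnabled) {
    return { action: "block", reason: "tier_gating" };
  }
  // 5. Tier 4 always needs a user-confirmation round-trip.
  if (call.tier === 4 && !call.userConfirmed) return { action: "confirm", reason: "manual_only" };
  return { action: "allow" };
}
```

Because the checks run in a fixed order, the recorded reason always names the first gate that refused the call, which keeps audit entries unambiguous.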

State lives in SQLite's audit_log table. Every call is logged with its reasoning, arguments, outcome, and hook decisions. Nothing is ever silently deleted.

Prompt injection defense

The agent defends against prompt injection from market data, tool outputs, and memory recalls via:

  • Delimiter isolation. Untrusted content is wrapped in unique random delimiters the LLM is told never to follow instructions from.
  • Zero-width char stripping. Removes ZWSP, ZWJ, RLO, and friends.
  • Content scanning. Regex match against a curated injection payload database before the content enters the prompt.

See apps/agent/src/core/prompt-builder.ts and tests/unit/prompt-injection.test.ts.
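As a rough illustration of how the three defenses could combine, the sketch below strips invisible characters first, scans the cleaned text, and then wraps it in nonce-tagged delimiters. The delimiter format, the two regex patterns, and the function name are all assumptions; the curated payload database and the logic in prompt-builder.ts are not reproduced here.

```typescript
// Zero-width and bidi-control characters: ZWSP/ZWNJ/ZWJ (U+200B..U+200D),
// LRE through RLO (U+202A..U+202E), LRI through PDI (U+2066..U+2069), and BOM.
const INVISIBLE_CHARS = /[\u200B-\u200D\u202A-\u202E\u2066-\u2069\uFEFF]/g;

// Hypothetical payload patterns; the real curated database is not shown here.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore\s+(all\s+)?(previous|prior)\s+instructions/i,
  /you\s+are\s+now\s+in\s+developer\s+mode/i,
];

interface WrappedContent {
  text: string;     // cleaned content inside nonce-tagged delimiters
  flagged: boolean; // true if any injection pattern matched
}

// `nonce` should be freshly generated per call so content cannot guess the delimiter.
function wrapUntrustedSketch(raw: string, nonce: string): WrappedContent {
  // Strip first, so a payload split by zero-width characters is still detected.
  const cleaned = raw.replace(INVISIBLE_CHARS, "");
  const flagged = INJECTION_PATTERNS.some((re) => re.test(cleaned));
  const open = `<<UNTRUSTED-${nonce}>>`;
  const close = `<<END-UNTRUSTED-${nonce}>>`;
  return { text: `${open}\n${cleaned}\n${close}`, flagged };
}
```

What happens to flagged content (drop, redact, or pass through with a warning) is a policy decision this sketch leaves to the caller.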

The finance safety stack

The permission tier hook chain stops obviously bad calls. The finance safety stack stops subtly bad trades: a swap that executes at 50% slippage, a 5× position on one token, a 15× leverage perp that blows up at the first dip. This module lives under apps/agent/src/finance/ and composes six independently testable pieces into a full-trade safety path.

[Figure: finance-safety diagram]

The stack

          trade intent
                │
                ▼
       ┌─────────────────┐
       │ token-safety    │  scam detection, canonical address, chain resolution
       └────────┬────────┘
                ▼
       ┌─────────────────┐
       │ position-sizing │  fixed_usd / fixed_fraction / half_kelly
       └────────┬────────┘
                ▼
       ┌─────────────────┐
       │ exposure-limits │  per-token / per-chain / per-asset-class caps
       └────────┬────────┘
                ▼
       ┌─────────────────┐
       │ slippage        │  simulate → check price impact → reject if too high
       └────────┬────────┘
                ▼
       ┌─────────────────┐
       │ risk-manager    │  per-tx max, daily cap, emergency stop (atomic debit)
       └────────┬────────┘
                ▼
       SafeTradingClient → Minara backend
                │
                ▼
       ┌─────────────────┐
       │ stop-loss       │  periodic workflow closes positions on exit trigger
       └─────────────────┘

Each stage is a separate module. The top-level full-risk-manager.ts composes them rather than growing into a monolith. That composition is the point: any piece can be replaced or extended without touching the others.

risk-manager.ts: the minimum floor

apps/agent/src/finance/risk-manager.ts is the Phase 1 floor that always runs. Three hard limits:

  1. Per-transaction max (default $500). Rejects any trade whose estimated_value_usd exceeds the cap.
  2. Daily spend cap (default $2,000). Atomic check-and-debit against the daily_spend SQLite table using BEGIN IMMEDIATE so concurrent trades cannot exceed the cap by racing.
  3. Emergency stop. When active, every tier-2-plus trade is rejected until /unkill is called.

Per-tx and daily caps are controlled by safety.json fields:

{
  "maxTransactionAmount": 500,
  "dailySpendCap": 2000,
  "killSwitchActive": false,
  "allowedTokens": [],
  "blockedTokens": ["SQUID", "SAFEMARS"]
}

The allowlist is empty by default (meaning all tokens pass). The blocklist is always enforced. Edit via minara config set safety.maxTransactionAmount 1000.
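A minimal sketch of how these fields might be evaluated for one trade. The check order, the reason strings, and the in-memory DailyLedgerSketch are assumptions: the page describes the real daily counter as a SQLite table debited inside BEGIN IMMEDIATE, which this example only imitates with a single synchronous check-and-debit.

```typescript
interface SafetyJsonSketch {
  maxTransactionAmount: number;
  dailySpendCap: number;
  killSwitchActive: boolean;
  allowedTokens: string[];
  blockedTokens: string[];
}

// In-memory stand-in for the daily_spend table (assumption, not the real store).
class DailyLedgerSketch {
  private spentUsd = 0;

  // Check and debit in one step, so two trades cannot both pass on the same budget.
  tryDebit(amountUsd: number, capUsd: number): boolean {
    if (this.spentUsd + amountUsd > capUsd) return false;
    this.spentUsd += amountUsd;
    return true;
  }
}

// Returns null when the trade passes, or a hypothetical reason code when it is rejected.
function checkFloorSketch(
  cfg: SafetyJsonSketch,
  ledger: DailyLedgerSketch,
  token: string,
  estimatedValueUsd: number,
): string | null {
  if (cfg.killSwitchActive) return "emergency_stop";
  // The blocklist is always enforced.
  if (cfg.blockedTokens.includes(token)) return "token_blocked";
  // An empty allowlist means every token passes.
  if (cfg.allowedTokens.length > 0 && !cfg.allowedTokens.includes(token)) {
    return "token_not_allowed";
  }
  if (estimatedValueUsd > cfg.maxTransactionAmount) return "per_tx_max";
  // Debit last, so a trade rejected earlier never consumes daily budget.
  if (!ledger.tryDebit(estimatedValueUsd, cfg.dailySpendCap)) return "daily_spend_cap";
  return null;
}
```

With the defaults shown above, four $500 trades fit under the $2,000 daily cap and a fifth is rejected.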

token-safety.ts: the entry gate

Runs as a trade hook at priority 50, before the permission tier hook. Three responsibilities:

  1. Canonical address resolution. When the user says "buy USDT on Arbitrum," the hook looks up the canonical address from CANONICAL_ADDRESSES so the trade goes to the real USDT contract rather than a scam imitator with the same ticker.
  2. Known scam detection. A small curated set of token tickers and addresses that are hard-blocked regardless of allowlist/blocklist configuration. The set lives in source code so adding entries requires a PR.
  3. Chain resolution. Maps ambiguous chain identifiers ("eth" vs "ethereum" vs "mainnet") to the canonical chain id the Minara backend expects.

If the token doesn't resolve, the trade is rejected with a structured error the LLM can explain ("I don't recognize 'SAFEMARS' on 'ethereum' as a canonical token, and it matches our scam list").
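The three responsibilities can be sketched as one lookup function. Everything in the tables below is placeholder data: the addresses are deliberately not real contract addresses, the chain identifiers are illustrative strings, and the error codes are hypothetical. Only the CANONICAL_ADDRESSES idea and the example tickers come from this page.

```typescript
// Maps user-facing chain names to one canonical id (illustrative values).
const CHAIN_ALIASES_SKETCH = new Map<string, string>([
  ["eth", "ethereum"],
  ["ethereum", "ethereum"],
  ["mainnet", "ethereum"],
  ["arb", "arbitrum"],
  ["arbitrum", "arbitrum"],
]);

// Keyed by "chain:TICKER". Placeholder values, NOT real contract addresses.
const CANONICAL_ADDRESSES_SKETCH = new Map<string, string>([
  ["ethereum:USDT", "0xPLACEHOLDER_USDT_ON_ETHEREUM"],
  ["arbitrum:USDT", "0xPLACEHOLDER_USDT_ON_ARBITRUM"],
]);

// Hard-blocked tickers, independent of allowlist/blocklist configuration.
const KNOWN_SCAMS_SKETCH = new Set<string>(["SQUID", "SAFEMARS"]);

type TokenResolution =
  | { ok: true; chain: string; address: string }
  | { ok: false; error: "unknown_chain" | "known_scam" | "not_canonical" };

function resolveTokenSketch(ticker: string, chain: string): TokenResolution {
  const symbol = ticker.trim().toUpperCase();
  // Chain resolution: collapse aliases to one canonical id.
  const canonicalChain = CHAIN_ALIASES_SKETCH.get(chain.trim().toLowerCase());
  if (canonicalChain === undefined) return { ok: false, error: "unknown_chain" };
  // Known scam detection runs before any address lookup.
  if (KNOWN_SCAMS_SKETCH.has(symbol)) return { ok: false, error: "known_scam" };
  // Canonical address resolution: the ticker alone is never trusted.
  const address = CANONICAL_ADDRESSES_SKETCH.get(`${canonicalChain}:${symbol}`);
  if (address === undefined) return { ok: false, error: "not_canonical" };
  return { ok: true, chain: canonicalChain, address };
}
```

Returning a tagged result instead of throwing gives the LLM a structured error it can turn into an explanation for the user.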

position-sizing.ts: how big

Three strategies, picked by sizing.strategy in safety.json:

fixed_usd

size_usd = min(fixedAmountUsd, max_transaction_usd)

Deterministic. "Always trade $100." Good for DCA workflows.

fixed_fraction

size_usd = min(portfolio_value_usd * fraction, max_transaction_usd)

Percent of portfolio. "Always 2% of equity." Good for fixed-fractional setups, where each trade automatically sizes itself down as the portfolio shrinks.

half_kelly

f_star = (p * b - q) / b        // p = win prob, q = 1 - p, b = payoff ratio
size_usd = min(portfolio_value_usd * f_star * kellyMultiplier, max_transaction_usd)   // kellyMultiplier defaults to 0.5

Optimal bet sizing given edge and variance. Half-Kelly by default because full Kelly produces brutal drawdowns even when the edge is real. Requires the LLM to supply win_probability and payoff_ratio as arguments; if either is missing the sizer falls back to fixed_fraction.

Every strategy is always clipped by the per-transaction cap. The SizingDecision.clipped flag records whether the computed size hit the ceiling so audit queries can surface when caps are binding.
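The three strategies and the final clip fit in one function. The sketch below is illustrative: the input shape and camelCase field names are assumptions, as is clamping a negative Kelly fraction to zero (a negative f_star means there is no edge, so no trade is sized). The fallback to fixed_fraction and the clipped flag follow the description above.

```typescript
type SizingStrategy = "fixed_usd" | "fixed_fraction" | "half_kelly";

// Hypothetical input shape; field names are illustrative.
interface SizingInputSketch {
  strategy: SizingStrategy;
  portfolioValueUsd: number;
  maxTransactionUsd: number;
  fixedAmountUsd: number;
  fraction: number;
  kellyMultiplier: number;  // 0.5 for half-Kelly
  winProbability?: number;  // supplied by the LLM
  payoffRatio?: number;     // supplied by the LLM
}

interface SizingDecisionSketch {
  sizeUsd: number;
  clipped: boolean;             // true when the per-transaction cap was binding
  strategyUsed: SizingStrategy; // differs from the request after a fallback
}

function computeSizeSketch(input: SizingInputSketch): SizingDecisionSketch {
  const { winProbability: p, payoffRatio: b } = input;
  let strategyUsed: SizingStrategy = input.strategy;
  let raw: number;

  if (input.strategy === "fixed_usd") {
    raw = input.fixedAmountUsd;
  } else if (input.strategy === "half_kelly" && p !== undefined && b !== undefined) {
    const fStar = (p * b - (1 - p)) / b;
    // Assumption: a negative Kelly fraction sizes to zero.
    raw = input.portfolioValueUsd * Math.max(fStar, 0) * input.kellyMultiplier;
  } else {
    // fixed_fraction, or half_kelly with a missing argument.
    strategyUsed = "fixed_fraction";
    raw = input.portfolioValueUsd * input.fraction;
  }

  return {
    sizeUsd: Math.min(raw, input.maxTransactionUsd),
    clipped: raw > input.maxTransactionUsd,
    strategyUsed,
  };
}
```

As a worked case, p = 0.55 and b = 2 give f_star = (1.10 - 0.45) / 2 = 0.325. Half-Kelly on a $10,000 portfolio asks for 0.325 × 0.5 × 10,000 = $1,625, which a $500 per-transaction cap clips to $500 with clipped set to true.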

exposure-limits.ts: concentration control

Four exposure limits (per token, per chain, per asset class, and total), computed against the current portfolio (not historical):

interface ExposureLimitsConfig {
  maxPerTokenUsd: number;           // default $5,000
  maxPerChainUsd: number;           // default $15,000
  maxPerAssetClassFraction: number; // default 0.40 (40% in any one class)
  maxTotalExposureUsd: number;      // default $50,000
}

The exposure check runs before the trade executes. If the new trade would push exposure past any of the four limits, the hook rejects it with the specific ceiling that was violated.
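A sketch of the pre-trade check against the four limits. The ExposureContextSketch shape is hypothetical, and so is the choice of denominator for the asset-class fraction: this example divides by total portfolio value including cash, which may not match what ExposureLimiter.check does.

```typescript
interface ExposureLimitsConfig {
  maxPerTokenUsd: number;
  maxPerChainUsd: number;
  maxPerAssetClassFraction: number;
  maxTotalExposureUsd: number;
}

// Current exposure along the dimensions touched by this trade (hypothetical shape).
interface ExposureContextSketch {
  tokenUsd: number;          // existing exposure to the trade's token
  chainUsd: number;          // existing exposure on the trade's chain
  assetClassUsd: number;     // existing exposure in the trade's asset class
  totalUsd: number;          // existing exposure across all positions
  portfolioValueUsd: number; // assumed denominator for the class fraction
}

// Returns the name of the violated limit, or null if the trade fits.
function checkExposureSketch(
  limits: ExposureLimitsConfig,
  current: ExposureContextSketch,
  tradeUsd: number,
): keyof ExposureLimitsConfig | null {
  if (current.tokenUsd + tradeUsd > limits.maxPerTokenUsd) return "maxPerTokenUsd";
  if (current.chainUsd + tradeUsd > limits.maxPerChainUsd) return "maxPerChainUsd";
  const classFraction =
    current.portfolioValueUsd > 0
      ? (current.assetClassUsd + tradeUsd) / current.portfolioValueUsd
      : 1; // an empty portfolio cannot satisfy a fractional cap
  if (classFraction > limits.maxPerAssetClassFraction) return "maxPerAssetClassFraction";
  if (current.totalUsd + tradeUsd > limits.maxTotalExposureUsd) return "maxTotalExposureUsd";
  return null;
}
```

Returning the violated field name lets the block reason point at the exact ceiling, which is what the exposure query under Observability relies on.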

Asset class is computed by learning/methodology-store.ts → classifyAsset, which shares the skill system's asset class taxonomy, so a trade labeled crypto_meme in the router is also labeled crypto_meme in the exposure check. The single vocabulary means "no more than 40% in memecoins" is enforceable with no glue code.

slippage-protection.ts: price impact gates

Large trades get tighter slippage budgets:

Trade size          Max price impact
Under $1,000        2.0%
$1,000 – $10,000    1.0%
Over $10,000        0.5%

The tiered structure prevents a small trade from eating 5% impact (annoying but recoverable) and also prevents a large trade from eating 1% impact (potentially catastrophic on whale-size orders). The thresholds are defaults; every field is configurable via safety.json.

Before executing a swap, the hook calls the Minara backend's /v1/tx/cross-chain/swaps-simulate endpoint to get the estimated output and price impact. If the impact exceeds the tier ceiling, the trade is rejected. No actual swap has happened yet; the simulation is cheap and the rejection is clean.
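The tier lookup and the comparison against the simulated impact are small enough to sketch directly. The function names are hypothetical, the thresholds are the defaults from the table above, and the simulation call itself is left out: simulatedImpact stands in for whatever the backend returns.

```typescript
// Default ceilings from the tier table; trade size in USD, impact as a fraction.
function maxImpactForSketch(tradeUsd: number): number {
  if (tradeUsd < 1000) return 0.02;   // under $1,000
  if (tradeUsd <= 10000) return 0.01; // $1,000 to $10,000
  return 0.005;                       // over $10,000
}

interface ImpactCheckSketch {
  ok: boolean;     // false means the trade should be rejected
  ceiling: number; // the tier ceiling that applied
}

// `simulatedImpact` stands in for the price impact reported by the simulation.
function checkPriceImpactSketch(tradeUsd: number, simulatedImpact: number): ImpactCheckSketch {
  const ceiling = maxImpactForSketch(tradeUsd);
  return { ok: simulatedImpact <= ceiling, ceiling };
}
```

A 1.5% impact therefore passes on a $500 trade (ceiling 2.0%) and is rejected on a $5,000 trade (ceiling 1.0%).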

Leverage cap for perps

maxLeverage: 10

Hard cap. A perp order with leverage: 15x is rejected with leverage_exceeds_cap. The cap is per-trade; there is no portfolio-wide leverage aggregation yet (if you need that, the exposure limiter is the place to add it).

Min output ratio

minOutputRatio: 0.95  // expect at least 95% of input USD out

A last-line check against swap quotes that show pathologically low output. If the simulation says "your $100 trade will yield $40," the check catches it even when price impact looks okay.
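The leverage cap and the minimum output ratio are both single comparisons. In the sketch below the quote shape and function names are assumptions; the default values and the leverage_exceeds_cap code come from this page.

```typescript
// Hypothetical shape of the values read from a swap simulation.
interface SwapQuoteSketch {
  inputUsd: number;
  estimatedOutputUsd: number;
}

// True when the quote returns at least `minOutputRatio` of the input value.
function passesMinOutputSketch(quote: SwapQuoteSketch, minOutputRatio = 0.95): boolean {
  return quote.estimatedOutputUsd >= quote.inputUsd * minOutputRatio;
}

// Returns the rejection code for an over-leveraged perp order, or null if it is within the cap.
function checkLeverageSketch(leverage: number, maxLeverage = 10): string | null {
  return leverage > maxLeverage ? "leverage_exceeds_cap" : null;
}
```

For the $100 example, the required output is 100 × 0.95 = $95, so a quote of $40 fails while a quote of $97 passes. The ratio check is independent of the price-impact tier, which is why it can catch quotes whose reported impact looks acceptable.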

stop-loss.ts: exit management

Position-level stop-loss rules. Three types:

fixed_percent

Exit when price drops threshold_pct from entry.

{ type: "fixed_percent", threshold_pct: 0.10 }  // 10% drop triggers exit

trailing

Exit when price drops threshold_pct from the peak since entry rather than from entry itself. Locks in gains.

{ type: "trailing", threshold_pct: 0.08 }  // 8% drawdown from high

time_based

Exit after a fixed duration regardless of P&L.

{ type: "time_based", max_age_ms: 86400000 }  // 24 hours

All three support an optional take_profit_pct to close on a gain as well as a loss.
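One way to evaluate the three rule types is a single pure function over a position snapshot. This sketch assumes long positions only, a hypothetical PositionSketch shape, and illustrative return codes; StopLossEvaluator may be organized differently.

```typescript
type StopLossRuleSketch =
  | { type: "fixed_percent"; threshold_pct: number; take_profit_pct?: number }
  | { type: "trailing"; threshold_pct: number; take_profit_pct?: number }
  | { type: "time_based"; max_age_ms: number; take_profit_pct?: number };

// Hypothetical snapshot of a long position.
interface PositionSketch {
  entryPrice: number;
  peakPrice: number;   // highest price seen since entry
  currentPrice: number;
  openedAtMs: number;  // epoch milliseconds
}

// Returns a trigger name when the position should be closed, else null.
function shouldExitSketch(
  rule: StopLossRuleSketch,
  pos: PositionSketch,
  nowMs: number,
): string | null {
  const gain = (pos.currentPrice - pos.entryPrice) / pos.entryPrice;
  // Optional take-profit applies to every rule type.
  if (rule.take_profit_pct !== undefined && gain >= rule.take_profit_pct) return "take_profit";

  switch (rule.type) {
    case "fixed_percent":
      // Drop measured from the entry price.
      return -gain >= rule.threshold_pct ? "stop_loss" : null;
    case "trailing": {
      // Drop measured from the peak since entry.
      const drawdown = (pos.peakPrice - pos.currentPrice) / pos.peakPrice;
      return drawdown >= rule.threshold_pct ? "trailing_stop" : null;
    }
    case "time_based":
      return nowMs - pos.openedAtMs >= rule.max_age_ms ? "max_age" : null;
  }
}
```

The function only decides. It returns a trigger name and issues no orders, so the close still has to travel through the normal trade path.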

How it runs

Stop-loss rules are checked by a periodic workflow (workflow/templates/stop-loss-monitor.ts). The workflow polls open positions on a schedule, computes whether each trigger fires, and issues close orders through the normal trade path.

This matters: the stop-loss module itself never calls the trading backend directly. It only decides. The close order still goes through the full safety stack (permission tier, daily cap, slippage check, everything). A stop-loss trying to exit during an exchange outage still respects every gate; it does not get a special-purpose bypass.

Composition: FullRiskManager

full-risk-manager.ts wires everything together into one trade hook:

new FullRiskManager(db, safetyConfig, {
  sizing: DEFAULT_SIZING_CONFIG,
  exposure: DEFAULT_EXPOSURE_LIMITS,
  slippage: DEFAULT_SLIPPAGE_CONFIG,
});

The composition is intentionally explicit. You can swap any piece (custom sizing strategy, tighter exposure limits for a conservative user, disabled slippage checks in tests) without touching the others. The risk manager underneath always runs; that's the minimum floor.
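The composition idea can be illustrated independently of the real class: each stage exposes the same small interface, and a runner walks them in order and stops at the first rejection. Nothing below is FullRiskManager's actual API; the types, the runner, and the stage list are assumptions meant only to show why swapping one stage leaves the others untouched.

```typescript
// Hypothetical trade shape shared by every stage.
interface TradeIntentSketch {
  token: string;
  sizeUsd: number;
}

// A stage returns null to pass, or a reason string to reject.
interface StageSketch {
  name: string;
  check: (trade: TradeIntentSketch) => string | null;
}

type StackResultSketch =
  | { ok: true; passed: string[] }
  | { ok: false; stage: string; reason: string; passed: string[] };

// Runs stages in order and stops at the first rejection.
function runStagesSketch(stages: StageSketch[], trade: TradeIntentSketch): StackResultSketch {
  const passed: string[] = [];
  for (const stage of stages) {
    const reason = stage.check(trade);
    if (reason !== null) return { ok: false, stage: stage.name, reason, passed };
    passed.push(stage.name);
  }
  return { ok: true, passed };
}

// Example wiring with stub checks; the floor stage is always appended last.
const floorStageSketch: StageSketch = {
  name: "risk-manager",
  check: (t) => (t.sizeUsd > 500 ? "per_tx_max" : null),
};

function buildStackSketch(optional: StageSketch[]): StageSketch[] {
  return [...optional, floorStageSketch];
}
```

Building the list through a helper that always appends the floor stage mirrors the rule that optional pieces may be swapped or disabled while the minimum floor still runs.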

Config surface

Safety lives in safety.json under $dataDir/. Edit via minara config:

minara config list                                    # show every field
minara config get sizing.strategy
minara config set sizing.strategy half_kelly
minara config set exposure.maxPerTokenUsd 2500
minara config set slippage.largeTradeMaxImpact 0.003

Validation happens at read time: invalid values (negative caps, leverage > 100, threshold_pct > 1.0) cause the safety config loader to fall back to defaults and emit a warning to the structured log. There is no way to boot with an invalid config.
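A sketch of validate-then-fall-back loading, covering the three invalid cases named above. The field subset, the threshold_pct default, and the decision to discard the whole config (rather than only the offending field) are assumptions; the returned warnings stand in for the structured-log entry.

```typescript
interface SafetyFieldsSketch {
  maxTransactionAmount: number;
  maxLeverage: number;
  threshold_pct: number;
}

const DEFAULTS_SKETCH: SafetyFieldsSketch = {
  maxTransactionAmount: 500,
  maxLeverage: 10,
  threshold_pct: 0.1, // assumed default, not taken from this page
};

interface LoadResultSketch {
  config: SafetyFieldsSketch;
  warnings: string[]; // would be written to the structured log
}

function loadSafetySketch(raw: Partial<SafetyFieldsSketch>): LoadResultSketch {
  const merged: SafetyFieldsSketch = { ...DEFAULTS_SKETCH, ...raw };
  const warnings: string[] = [];

  // Each test is written as !(valid) so that NaN also counts as invalid.
  if (!(merged.maxTransactionAmount >= 0)) warnings.push("maxTransactionAmount must not be negative");
  if (!(merged.maxLeverage <= 100)) warnings.push("maxLeverage must not exceed 100");
  if (!(merged.threshold_pct <= 1.0)) warnings.push("threshold_pct must not exceed 1.0");

  // Assumption: any invalid field discards the whole file in favor of defaults.
  return warnings.length > 0
    ? { config: { ...DEFAULTS_SKETCH }, warnings }
    : { config: merged, warnings };
}
```

Because the loader always returns a usable config, callers never have to handle a "no config" state at boot.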

Observability

Every trade hook decision lands in two tables:

  • audit: the tool call, the args, the outcome.
  • tier_events: which hook fired and why.

Common investigation queries:

-- Why was my swap rejected yesterday?
SELECT a.tool_name, te.decision, te.reason, a.created_at
  FROM audit a
  LEFT JOIN tier_events te ON te.trace_id = a.trace_id
 WHERE a.tool_name IN ('swap', 'buy', 'sell')
   AND a.blocked = 1
   AND date(a.created_at) = '2026-04-14';

-- How often are sizing caps binding?
SELECT COUNT(*) FROM audit
 WHERE tool_set = 'trade'
   AND json_extract(result_json, '$.sizing.clipped') = 1;

-- Which tokens hit exposure limits most often?
SELECT json_extract(args_json, '$.token'), COUNT(*)
  FROM audit
 WHERE block_reason LIKE '%exposure%'
 GROUP BY 1 ORDER BY 2 DESC LIMIT 10;

Extending the stack

  • Custom sizing strategy. Add a new variant to SizingStrategy, implement computeSize for it in PositionSizer, and document the new sizing.* config fields in env-vars plus this page.
  • New exposure dimension. Add a field to ExposureLimitsConfig, implement the check in ExposureLimiter.check, and surface it in the trade hook's block reason. Write a unit test that covers both the pass and the fail case.
  • Different slippage tiering. Change the thresholds in DEFAULT_SLIPPAGE_CONFIG or add a new tier. The tiered table is small on purpose; do not turn it into a continuous function unless you have a good reason.
  • New stop-loss type. Add a variant to StopLossType, implement the trigger in StopLossEvaluator, and make sure the periodic workflow template recognizes it.

Do not add a trade path that bypasses FullRiskManager. The single-gate property is what makes the safety model defensible. A fast-path for "trusted" callers is the kind of change that looks harmless in review and breaks the audit log the day you need it most.
