Safety & Sandboxing
Sandbox, permission tiers, hook pipeline, and the 6-stage finance safety stack
Minara applies safety at two levels: system-wide isolation and permission gates that apply to every tool call, and a finance-specific safety stack that applies to trade paths.
Why 6 stages for finance? Each stage catches a different class of mistake. The LLM can hallucinate a ticker, forget a preference, get prompt-injected, or simply have stale prices. Stacking 6 cheap-but-orthogonal checks means an error has to slip through all of them to move real funds, which has not happened in practice. One big "LLM judge" check would cost more and still miss the categorical errors that a 1-line typed guard catches.
This page covers both, starting with the foundations.
📘 For the operator-facing companion — how the four independent safety layers (command-guard, OS jail, fund-moving confirm, script-risk gate) show up in daily use, and what each protects against — see the Security chapter. This page is the implementation reference; that chapter is the user walk-through.
See this in use: every fund-moving feature page (Trading, Portfolio, Predictions) links back here. The Your First Trade walkthrough shows the user-facing preview step (stage 3 in the 6-stage stack).
Foundations: sandbox, tiers, and the hook pipeline
Every tool call passes through two orthogonal layers: sandbox (filesystem isolation — what a tool can touch) and permission tiers plus the hook pipeline (behavioral gating — what a tool is allowed to do right now).
Sandbox
All filesystem tools root at $dataDir/sandbox/files/ (default:
~/.minara/sandbox/files/). Every path argument goes through
resolveInSandbox() from apps/agent/src/tools/_shared/sandbox.ts, which:
- Resolves the requested path against the sandbox root.
- Rejects paths containing .. that escape the root.
- Resolves symlinks and rejects any that point outside the sandbox.
- Returns the absolute resolved path the tool can safely use.
Tools are physically incapable of reading or writing files under
apps/agent/src/, the user's home dir, or anywhere else outside the sandbox.
This applies to read_file, write_file, patch, search_files,
and every file-adjacent tool.
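The containment check can be sketched as follows. This is a minimal, lexical-only illustration for POSIX-style paths: the names resolveLexically and SandboxEscapeError are hypothetical and are not the real resolveInSandbox() signature. The symlink step, which needs a filesystem call such as realpath, is deliberately left out, and treating absolute requests as relative to the root is an assumption.

```typescript
// Hypothetical sketch: lexical containment check for POSIX-style paths.
// The real resolveInSandbox() also resolves symlinks; that step is not shown.
class SandboxEscapeError extends Error {}

function resolveLexically(root: string, requested: string): string {
  const rootParts = root.split("/").filter((p) => p.length > 0);
  // Segments of the request that remain below the sandbox root.
  const stack: string[] = [];
  for (const part of requested.split("/")) {
    if (part === "" || part === ".") continue; // skip empty and current-dir segments
    if (part === "..") {
      // Popping past the root means the path would escape the sandbox.
      if (stack.length === 0) {
        throw new SandboxEscapeError(`path escapes sandbox: ${requested}`);
      }
      stack.pop();
    } else {
      stack.push(part);
    }
  }
  // Assumption: absolute requests are re-rooted under the sandbox.
  return "/" + [...rootParts, ...stack].join("/");
}
```

A request such as notes/../a.txt stays inside the root, while ../../etc/passwd is rejected before any file is opened.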
Shell egress is also locked down. The terminal tool scrubs 38
credential-like env prefixes from every subprocess, and shell-wrapper
egress (bash -c curl and friends) is denied by default.
Permission tiers
Every ToolEntry carries a permissionTier:
| Tier | Name | Examples |
|---|---|---|
| 1 | READ_ONLY | get_price, get_balance, read_file, web_search |
| 2 | CONFIRM_ONCE | analyze_market, deep_research_run, small swaps |
| 3 | ALWAYS_CONFIRM | write_file, patch, buy_token, docx_create |
| 4 | MANUAL_ONLY | transfer_token to external address, emergency stop flip |
The hook pipeline
A global BeforeToolCallHook runs before every tool invocation. It
enforces, in order:
- Emergency stop. If safetyConfig.killSwitch is active, all trading tools (tier ≥ 2) are blocked.
- Daily spend cap. Accumulated fund-moving volume is tracked against safetyConfig.dailySpendCap.
- Per-tx max. Individual trade amount is checked against safetyConfig.perTxMax.
- Tier gating. On autonomous turns (no user in the loop), tier 3+ tools are blocked unless safetyConfig.autopilotEnabled is true.
- Manual-only enforcement. Tier 4 tools always require a user-confirmation round-trip.
State lives in SQLite's audit_log table. Every call is logged
with its reasoning, arguments, outcome, and hook decisions.
Nothing is ever silently deleted.
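The ordering of those checks matters, because the first failing check decides the reported reason. A minimal sketch, assuming a simplified context object: the safetyConfig field names come from the list above, while CallContext, Decision, beforeToolCall, and the reason strings are hypothetical and do not reflect the real BeforeToolCallHook API.

```typescript
// Hypothetical sketch of the ordered checks; not the real hook implementation.
const Tier = { READ_ONLY: 1, CONFIRM_ONCE: 2, ALWAYS_CONFIRM: 3, MANUAL_ONLY: 4 } as const;
type Tier = (typeof Tier)[keyof typeof Tier];

interface SafetyConfig {
  killSwitch: boolean;
  dailySpendCap: number;
  perTxMax: number;
  autopilotEnabled: boolean;
}

interface CallContext {
  tier: Tier;
  amountUsd: number;      // 0 for calls that move no funds
  spentTodayUsd: number;  // running total of fund-moving volume
  autonomous: boolean;    // true when no user is in the loop
  userConfirmed: boolean; // true after a confirmation round-trip
}

type Decision = { allow: true } | { allow: false; reason: string };

function beforeToolCall(cfg: SafetyConfig, ctx: CallContext): Decision {
  const deny = (reason: string): Decision => ({ allow: false, reason });
  // 1. Emergency stop blocks every trading tool (tier >= 2).
  if (cfg.killSwitch && ctx.tier >= Tier.CONFIRM_ONCE) return deny("emergency_stop");
  // 2. Daily spend cap on accumulated volume.
  if (ctx.spentTodayUsd + ctx.amountUsd > cfg.dailySpendCap) return deny("daily_spend_cap");
  // 3. Per-transaction maximum.
  if (ctx.amountUsd > cfg.perTxMax) return deny("per_tx_max");
  // 4. Tier gating on autonomous turns.
  if (ctx.autonomous && ctx.tier >= Tier.ALWAYS_CONFIRM && !cfg.autopilotEnabled) {
    return deny("tier_gating");
  }
  // 5. Tier 4 always needs an explicit user confirmation.
  if (ctx.tier === Tier.MANUAL_ONLY && !ctx.userConfirmed) return deny("manual_only");
  return { allow: true };
}
```

In this sketch a $600 trade that breaks both the daily cap and the per-tx max is reported as a daily-cap rejection, since that check runs first.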
Prompt injection defense
The agent defends against prompt injection from market data, tool outputs, and memory recalls via:
- Delimiter isolation. Untrusted content is wrapped in unique random delimiters, and the LLM is told never to follow instructions that appear between them.
- Zero-width char stripping. Removes ZWSP, ZWJ, RLO, and friends.
- Content scanning. Regex match against a curated injection payload database before the content enters the prompt.
See apps/agent/src/core/prompt-builder.ts and tests/unit/prompt-injection.test.ts.
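The first two defenses can be illustrated with a short sketch. The function names, the delimiter format, and the exact character set are assumptions for illustration; the real logic lives in prompt-builder.ts and may differ. The nonce is passed in because it has to come from a secure random source chosen by the caller.

```typescript
// Zero-width and bidi control characters: ZWSP, ZWNJ, ZWJ, word joiner, BOM,
// the LRE..RLO embedding/override range, and the isolate controls.
const INVISIBLE_CHARS = /[\u200B-\u200D\u2060\uFEFF\u202A-\u202E\u2066-\u2069]/g;

function stripInvisible(text: string): string {
  return text.replace(INVISIBLE_CHARS, "");
}

// Hypothetical wrapper: the nonce must be fresh and unpredictable per prompt.
function wrapUntrusted(content: string, nonce: string): string {
  const cleaned = stripInvisible(content);
  // If the content already contains the nonce, it could fake a closing delimiter.
  if (cleaned.includes(nonce)) {
    throw new Error("untrusted content contains the delimiter nonce");
  }
  return `<<UNTRUSTED ${nonce}>>\n${cleaned}\n<<END ${nonce}>>`;
}
```

Stripping before wrapping means hidden characters cannot be used to split or disguise the delimiter inside the untrusted payload.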
The finance safety stack
The permission tier hook chain stops obviously bad calls. The
finance safety stack stops subtly bad trades: a swap that
executes at 50% slippage, a 5× position on one token, a 15×
leverage perp that blows up at the first dip. This module lives
under apps/agent/src/finance/ and composes
six independently testable pieces into a full-trade safety path.
The stack
trade intent
│
▼
┌───────────────┐
│ token-safety │ scam detection, canonical address, chain resolution
└───────┬───────┘
▼
┌───────────────┐
│ position-sizing│ fixed_usd / fixed_fraction / half_kelly
└───────┬───────┘
▼
┌───────────────┐
│ exposure-limits│ per-token / per-chain / per-asset-class caps
└───────┬───────┘
▼
┌───────────────┐
│ slippage │ simulate → check price impact → reject if too high
└───────┬───────┘
▼
┌───────────────┐
│ risk-manager │ per-tx max, daily cap, emergency stop (atomic debit)
└───────┬───────┘
▼
SafeTradingClient → Minara backend
│
▼
┌───────────────┐
│ stop-loss │ periodic workflow closes positions on exit trigger
└───────────────┘
Each stage is a separate module. The top-level
full-risk-manager.ts
composes them rather than growing into a monolith. That
composition is the point: any piece can be replaced or extended
without touching the others.
risk-manager.ts: the minimum floor
apps/agent/src/finance/risk-manager.ts
is the Phase 1 floor that always runs. Three hard limits:
- Per-transaction max (default $500). Rejects any trade whose estimated_value_usd exceeds the cap.
- Daily spend cap (default $2,000). Atomic check-and-debit against the daily_spend SQLite table using BEGIN IMMEDIATE so concurrent trades cannot exceed the cap by racing.
- Emergency stop. When active, every tier-2-plus trade is rejected until /unkill is called.
Per-tx and daily caps are controlled by safety.json fields:
{
"maxTransactionAmount": 500,
"dailySpendCap": 2000,
"killSwitchActive": false,
"allowedTokens": [],
"blockedTokens": ["SQUID", "SAFEMARS"]
}
The allowlist is empty by default (meaning all tokens pass). The
blocklist is always enforced. Edit via minara config set safety.maxTransactionAmount 1000.
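How these fields interact can be shown with an in-memory sketch. The interface mirrors the safety.json example above; the RiskFloor class, its method, the order of the checks, and the reason strings are hypothetical. The running total here is a plain number standing in for the daily_spend table, so the SQLite transaction itself is not shown.

```typescript
// Field names mirror the safety.json example; everything else is hypothetical.
interface SafetyJson {
  maxTransactionAmount: number;
  dailySpendCap: number;
  killSwitchActive: boolean;
  allowedTokens: string[]; // empty means "allow all"
  blockedTokens: string[];
}

class RiskFloor {
  private cfg: SafetyJson;
  private spentTodayUsd = 0; // in-memory stand-in for the daily_spend table

  constructor(cfg: SafetyJson) {
    this.cfg = cfg;
  }

  // Returns null on success or a rejection reason. The cap check and the debit
  // run in one synchronous step, so no other call can interleave between them.
  checkAndDebit(token: string, estimatedValueUsd: number): string | null {
    const cfg = this.cfg;
    if (cfg.killSwitchActive) return "emergency_stop";
    if (cfg.blockedTokens.includes(token)) return "token_blocked";
    if (cfg.allowedTokens.length > 0 && !cfg.allowedTokens.includes(token)) {
      return "token_not_allowed";
    }
    if (estimatedValueUsd > cfg.maxTransactionAmount) return "per_tx_max";
    if (this.spentTodayUsd + estimatedValueUsd > cfg.dailySpendCap) return "daily_spend_cap";
    this.spentTodayUsd += estimatedValueUsd; // debit only after every check passed
    return null;
  }
}
```

With the defaults shown above, four $500 trades succeed and a fifth is rejected by the daily cap, because a rejected trade never debits the running total.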
token-safety.ts: the entry gate
Runs as a trade hook at priority 50, before the permission tier hook. Three responsibilities:
- Canonical address resolution. When the user says "buy USDT on Arbitrum," the hook looks up the canonical address from CANONICAL_ADDRESSES so the trade goes to the real USDT contract rather than a scam imitator with the same ticker.
- Known scam detection. A small curated set of token tickers and addresses that are hard-blocked regardless of allowlist/blocklist configuration. The set lives in source code so adding entries requires a PR.
- Chain resolution. Maps ambiguous chain identifiers ("eth" vs "ethereum" vs "mainnet") to the canonical chain id the Minara backend expects.
If the token doesn't resolve, the trade is rejected with a structured error the LLM can explain ("I don't recognize 'SAFEMARS' on 'ethereum' as a canonical token, and it matches our scam list").
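The three responsibilities compose into a single lookup, sketched below. Every table entry is an illustrative placeholder: the alias map, the scam set, the chain ids, and the address string are assumptions, not Minara's real data, and resolveToken and the error codes are hypothetical names.

```typescript
// All tables below are illustrative placeholders, not Minara's real data.
const CHAIN_ALIASES: Record<string, string> = {
  eth: "ethereum",
  ethereum: "ethereum",
  mainnet: "ethereum",
  arb: "arbitrum",
  arbitrum: "arbitrum",
};

const CANONICAL_ADDRESSES: Record<string, Record<string, string>> = {
  arbitrum: { USDT: "0x<canonical-usdt-on-arbitrum>" }, // placeholder, not a real address
};

const KNOWN_SCAMS = new Set<string>(["SAFEMARS", "SQUID"]);

type Resolution =
  | { ok: true; chain: string; address: string }
  | { ok: false; error: string };

function resolveToken(ticker: string, chainInput: string): Resolution {
  const symbol = ticker.trim().toUpperCase();
  // Chain resolution: normalize aliases to one canonical id.
  const chain = CHAIN_ALIASES[chainInput.trim().toLowerCase()];
  if (chain === undefined) return { ok: false, error: "unknown_chain" };
  // Known scam detection: hard block, independent of user config.
  if (KNOWN_SCAMS.has(symbol)) return { ok: false, error: "known_scam" };
  // Canonical address resolution: the ticker alone is never trusted.
  const address = CANONICAL_ADDRESSES[chain]?.[symbol];
  if (address === undefined) return { ok: false, error: "no_canonical_address" };
  return { ok: true, chain, address };
}
```

Returning a structured error code rather than throwing is what lets the LLM explain the rejection in plain language.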
position-sizing.ts: how big
Three strategies, picked by safety.json → sizing.strategy:
fixed_usd
size_usd = min(fixedAmountUsd, max_transaction_usd)
Deterministic. "Always trade $100." Good for DCA workflows.
fixed_fraction
size_usd = min(portfolio_value_usd * fraction, max_transaction_usd)
Percent of portfolio. "Always 2% of equity." Good for risk parity setups where each trade sizes itself down as the portfolio shrinks.
half_kelly
f_star = (p * b - q) / b // p = win prob, q = 1 - p, b = payoff ratio
size_usd = portfolio_value_usd * f_star * kellyMultiplier // default 0.5
Optimal bet sizing given edge and variance. Half-Kelly by
default because full Kelly produces brutal drawdowns even
when the edge is real. Requires the LLM to supply
win_probability and payoff_ratio as arguments; if either is
missing the sizer falls back to fixed_fraction.
Every strategy is always clipped by the per-transaction cap.
The SizingDecision.clipped flag records whether the computed
size hit the ceiling so audit queries can surface when caps are
binding.
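The three strategies and the clipping rule fit in one function. This is a sketch under stated assumptions: the type and field names are hypothetical rather than the real PositionSizer API, fallbackFraction is an invented field for the fixed_fraction fallback, and sizing a negative Kelly fraction (no edge) to zero is an assumption the page does not state.

```typescript
type SizingConfig =
  | { strategy: "fixed_usd"; fixedAmountUsd: number }
  | { strategy: "fixed_fraction"; fraction: number }
  | { strategy: "half_kelly"; kellyMultiplier: number; fallbackFraction: number };

interface SizingInput {
  portfolioValueUsd: number;
  maxTransactionUsd: number;
  winProbability?: number; // p, supplied by the LLM
  payoffRatio?: number;    // b, supplied by the LLM
}

interface SizingDecision {
  sizeUsd: number;
  clipped: boolean; // true when the per-transaction cap was binding
}

function computeSize(cfg: SizingConfig, input: SizingInput): SizingDecision {
  let raw: number;
  if (cfg.strategy === "fixed_usd") {
    raw = cfg.fixedAmountUsd;
  } else if (cfg.strategy === "fixed_fraction") {
    raw = input.portfolioValueUsd * cfg.fraction;
  } else {
    const p = input.winProbability;
    const b = input.payoffRatio;
    if (p === undefined || b === undefined) {
      // Missing edge estimates: fall back to fixed_fraction sizing.
      raw = input.portfolioValueUsd * cfg.fallbackFraction;
    } else {
      const fStar = (p * b - (1 - p)) / b;
      // Assumption: a negative Kelly fraction sizes the trade to zero.
      raw = input.portfolioValueUsd * Math.max(fStar, 0) * cfg.kellyMultiplier;
    }
  }
  return {
    sizeUsd: Math.min(raw, input.maxTransactionUsd),
    clipped: raw > input.maxTransactionUsd,
  };
}
```

As a worked case, p = 0.5 and b = 2 give f_star = (1.0 - 0.5) / 2 = 0.25. With a multiplier of 0.5 and a $2,000 portfolio the size is $250; with a $10,000 portfolio the raw size of $1,250 is clipped to a $500 cap and clipped is set.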
exposure-limits.ts: concentration control
Four exposure limits, computed against the current portfolio (not historical):
interface ExposureLimitsConfig {
maxPerTokenUsd: number; // default $5,000
maxPerChainUsd: number; // default $15,000
maxPerAssetClassFraction: number; // default 0.40 (40% in any one class)
maxTotalExposureUsd: number; // default $50,000
}
The exposure check runs before the trade executes. If the new trade would push exposure past any of the four limits, the hook rejects it with the specific ceiling that was violated.
Asset class is computed by
learning/methodology-store.ts → classifyAsset
and overlaps with the skill system's asset class taxonomy. So a
trade labeled crypto_meme in the router is also labeled
crypto_meme in the exposure check. The single vocabulary means
"no more than 40% in memecoins" is enforceable with no glue
code.
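A sketch of the check makes the "specific ceiling" behavior concrete. ExposureLimitsConfig is copied from above; the Holding shape and checkExposure are hypothetical, and two details are assumptions: the asset-class fraction is measured against total exposure after the trade, and the limits are tested in the order they are declared.

```typescript
interface ExposureLimitsConfig {
  maxPerTokenUsd: number;
  maxPerChainUsd: number;
  maxPerAssetClassFraction: number;
  maxTotalExposureUsd: number;
}

// Hypothetical shape for one open position or one proposed trade.
interface Holding {
  token: string;
  chain: string;
  assetClass: string;
  usd: number;
}

// Returns the name of the first limit the trade would violate, or null if it fits.
function checkExposure(
  limits: ExposureLimitsConfig,
  portfolio: Holding[],
  trade: Holding,
): keyof ExposureLimitsConfig | null {
  const after = [...portfolio, trade]; // exposure as it would be post-trade
  const sum = (pred: (h: Holding) => boolean): number =>
    after.filter(pred).reduce((acc, h) => acc + h.usd, 0);
  const total = sum(() => true);
  if (sum((h) => h.token === trade.token) > limits.maxPerTokenUsd) return "maxPerTokenUsd";
  if (sum((h) => h.chain === trade.chain) > limits.maxPerChainUsd) return "maxPerChainUsd";
  // Assumption: class fraction is taken over total exposure after the trade.
  const classUsd = sum((h) => h.assetClass === trade.assetClass);
  if (classUsd / total > limits.maxPerAssetClassFraction) return "maxPerAssetClassFraction";
  if (total > limits.maxTotalExposureUsd) return "maxTotalExposureUsd";
  return null;
}
```

Returning the key of the violated limit keeps the block reason machine-readable, which is what the exposure queries in the audit tables rely on.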
slippage-protection.ts: price impact gates
Large trades get tighter slippage budgets:
| Trade size | Max price impact |
|---|---|
| Under $1,000 | 2.0% |
| $1,000 – $10,000 | 1.0% |
| Over $10,000 | 0.5% |
The tiered structure prevents a small trade from eating 5%
impact (annoying but recoverable) and also prevents a large
trade from eating 1% impact (potentially catastrophic on
whale-size orders). The thresholds are defaults; every field is
configurable via safety.json.
Before executing a swap, the hook calls the Minara backend's
/v1/tx/cross-chain/swaps-simulate endpoint to get the
estimated output and price impact. If the impact exceeds the
tier ceiling, the trade is rejected. No actual swap has happened
yet; the simulation is cheap and the rejection is clean.
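The tier lookup and the rejection rule can be sketched in a few lines. The thresholds are the defaults from the table above; the function names, the SimulationResult shape, and the reason string are hypothetical, and the call to the simulation endpoint is not shown. Treating exactly $1,000 and exactly $10,000 as part of the middle tier follows the table's wording.

```typescript
// Default tiers from the table above, expressed as fractions (0.02 = 2.0%).
function maxPriceImpact(tradeUsd: number): number {
  if (tradeUsd < 1000) return 0.02;   // under $1,000
  if (tradeUsd <= 10000) return 0.01; // $1,000 to $10,000
  return 0.005;                       // over $10,000
}

// Hypothetical shape of the fields used from the simulation response.
interface SimulationResult {
  estimatedOutputUsd: number;
  priceImpact: number; // fraction, e.g. 0.015 for 1.5%
}

// Returns null when the trade may proceed, or a rejection reason.
function checkSlippage(tradeUsd: number, sim: SimulationResult): string | null {
  return sim.priceImpact > maxPriceImpact(tradeUsd) ? "price_impact_too_high" : null;
}
```

A $5,000 swap that simulates at 1.5% impact is rejected here, while the same impact on a $500 swap passes.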
Leverage cap for perps
maxLeverage: 10
Hard cap. A perp order with leverage: 15x is rejected with
leverage_exceeds_cap. The cap is per-trade; there is no
portfolio-wide leverage aggregation yet (if you need that, the
exposure limiter is the place to add it).
Min output ratio
minOutputRatio: 0.95 // expect at least 95% of input USD out
A last-line check against swap quotes that show pathologically low output. If the simulation says "your $100 trade will yield $40," the check catches it even when price impact looks okay.
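Both guards are single comparisons, sketched here with the defaults quoted above. The TradeGuards type, the function names, and the output_below_min_ratio code are hypothetical; leverage_exceeds_cap is the rejection code named in the text.

```typescript
interface TradeGuards {
  maxLeverage: number;
  minOutputRatio: number;
}

const DEFAULT_GUARDS: TradeGuards = { maxLeverage: 10, minOutputRatio: 0.95 };

// Per-trade leverage cap; there is no portfolio-wide aggregation in this sketch.
function checkLeverage(leverage: number, g: TradeGuards = DEFAULT_GUARDS): string | null {
  return leverage > g.maxLeverage ? "leverage_exceeds_cap" : null;
}

// Rejects quotes whose simulated output falls below the required share of input value.
function checkMinOutput(
  inputUsd: number,
  simulatedOutputUsd: number,
  g: TradeGuards = DEFAULT_GUARDS,
): string | null {
  return simulatedOutputUsd < inputUsd * g.minOutputRatio ? "output_below_min_ratio" : null;
}
```

The $100 trade that would yield $40 fails the second check regardless of the reported price impact, which is why it is kept separate from the tiered slippage gate.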
stop-loss.ts: exit management
Position-level stop-loss rules. Three types:
fixed_percent
Exit when price drops threshold_pct from entry.
{ type: "fixed_percent", threshold_pct: 0.10 } // 10% drop triggers exit
trailing
Exit when price drops threshold_pct from the peak since
entry rather than from entry itself. Locks in gains.
{ type: "trailing", threshold_pct: 0.08 } // 8% drawdown from high
time_based
Exit after a fixed duration regardless of P&L.
{ type: "time_based", max_age_ms: 86400000 } // 24 hours
All three support an optional take_profit_pct to close on a
gain as well as a loss.
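The three triggers reduce to a pure decision function. The rule shapes follow the examples above; PositionSnapshot and shouldExit are hypothetical names, and measuring take_profit_pct as a gain over the entry price is an assumption.

```typescript
type StopLossRule =
  | { type: "fixed_percent"; threshold_pct: number; take_profit_pct?: number }
  | { type: "trailing"; threshold_pct: number; take_profit_pct?: number }
  | { type: "time_based"; max_age_ms: number; take_profit_pct?: number };

// Hypothetical snapshot of one open position at evaluation time.
interface PositionSnapshot {
  entryPrice: number;
  peakPrice: number; // highest price seen since entry
  currentPrice: number;
  openedAtMs: number;
  nowMs: number;
}

// Pure decision: reports whether to exit and never places an order itself.
function shouldExit(rule: StopLossRule, pos: PositionSnapshot): boolean {
  // Assumption: take-profit is a fractional gain measured from the entry price.
  const gain = pos.currentPrice / pos.entryPrice - 1;
  if (rule.take_profit_pct !== undefined && gain >= rule.take_profit_pct) return true;
  if (rule.type === "fixed_percent") {
    return pos.currentPrice <= pos.entryPrice * (1 - rule.threshold_pct);
  }
  if (rule.type === "trailing") {
    return pos.currentPrice <= pos.peakPrice * (1 - rule.threshold_pct);
  }
  // time_based: exit once the position is at least max_age_ms old.
  return pos.nowMs - pos.openedAtMs >= rule.max_age_ms;
}
```

For a position entered at 100 that peaked at 150, an 8% trailing rule fires at 138 or below, while a 10% fixed_percent rule would not fire until 90.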
How it runs
Stop-loss rules are checked by a periodic workflow
(workflow/templates/stop-loss-monitor.ts). The workflow polls
open positions on a schedule, computes whether each trigger
fires, and issues close orders through the normal trade path.
This matters: the stop-loss module itself never calls the trading backend directly. It only decides. The close order still goes through the full safety stack (permission tier, daily cap, slippage check, everything). A stop-loss trying to exit during an exchange outage still respects every gate; it does not get a special-purpose bypass.
Composition: FullRiskManager
full-risk-manager.ts
wires everything together into one trade hook:
new FullRiskManager(db, safetyConfig, {
sizing: DEFAULT_SIZING_CONFIG,
exposure: DEFAULT_EXPOSURE_LIMITS,
slippage: DEFAULT_SLIPPAGE_CONFIG,
});
The composition is intentionally explicit. You can swap any piece (custom sizing strategy, tighter exposure limits for a conservative user, disabled slippage checks in tests) without touching the others. The risk manager underneath always runs; that's the minimum floor.
Config surface
Safety lives in safety.json under $dataDir/. Edit via
minara config:
minara config list # show every field
minara config get sizing.strategy
minara config set sizing.strategy half_kelly
minara config set exposure.maxPerTokenUsd 2500
minara config set slippage.largeTradeMaxImpact 0.003
Validation happens at read time: invalid values (negative caps, leverage > 100, threshold_pct > 1.0) cause the safety config loader to fall back to defaults and emit a warning to the structured log. There is no way to boot with an invalid config.
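The fall-back-on-invalid behavior can be sketched as below. This is a guess at the shape, not the real loader: the SafetyFields subset, the loadSafetyConfig name, and the warning text are hypothetical, the thresholdPct default of 0.1 is invented for the example, and falling back to defaults for the whole config (rather than per field) is an assumption.

```typescript
// Hypothetical subset of the safety config; the checks mirror the examples above.
interface SafetyFields {
  dailySpendCap: number;
  maxLeverage: number;
  thresholdPct: number;
}

const DEFAULTS: SafetyFields = { dailySpendCap: 2000, maxLeverage: 10, thresholdPct: 0.1 };

function loadSafetyConfig(
  raw: Partial<SafetyFields>,
  warn: (msg: string) => void,
): SafetyFields {
  const candidate: SafetyFields = { ...DEFAULTS, ...raw };
  const problems: string[] = [];
  // Negated comparisons also catch NaN, which fails every comparison.
  if (!(candidate.dailySpendCap >= 0)) problems.push("dailySpendCap must be non-negative");
  if (!(candidate.maxLeverage > 0 && candidate.maxLeverage <= 100)) {
    problems.push("maxLeverage must be in (0, 100]");
  }
  if (!(candidate.thresholdPct > 0 && candidate.thresholdPct <= 1)) {
    problems.push("thresholdPct must be in (0, 1]");
  }
  if (problems.length > 0) {
    // Assumption: any invalid field discards the whole file in favor of defaults.
    warn(`invalid safety config, using defaults: ${problems.join("; ")}`);
    return { ...DEFAULTS };
  }
  return candidate;
}
```

Because the loader always returns a valid object, callers never need a separate "config is broken" code path.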
Observability
Every trade hook decision lands in two tables:
- audit: the tool call, the args, the outcome.
- tier_events: which hook fired and why.
Common investigation queries:
-- Why was my swap rejected yesterday?
SELECT a.tool_name, te.decision, te.reason, a.created_at
FROM audit a
LEFT JOIN tier_events te ON te.trace_id = a.trace_id
WHERE a.tool_name IN ('swap', 'buy', 'sell')
AND a.blocked = 1
AND date(a.created_at) = '2026-04-14';
-- How often are sizing caps binding?
SELECT COUNT(*) FROM audit
WHERE tool_set = 'trade'
AND json_extract(result_json, '$.sizing.clipped') = 1;
-- Which tokens hit exposure limits most often?
SELECT json_extract(args_json, '$.token'), COUNT(*)
FROM audit
WHERE block_reason LIKE '%exposure%'
GROUP BY 1 ORDER BY 2 DESC LIMIT 10;
Extending the stack
- Custom sizing strategy. Add a new variant to SizingStrategy, implement computeSize for it in PositionSizer, and document the new sizing.* config fields in env-vars plus this page.
- New exposure dimension. Add a field to ExposureLimitsConfig, implement the check in ExposureLimiter.check, and surface it in the trade hook's block reason. Write a unit test that covers both the pass and the fail case.
- Different slippage tiering. Change the thresholds in DEFAULT_SLIPPAGE_CONFIG or add a new tier. The tiered table is small on purpose; do not turn it into a continuous function unless you have a good reason.
- New stop-loss type. Add a variant to StopLossType, implement the trigger in StopLossEvaluator, and make sure the periodic workflow template recognizes it.
Do not add a trade path that bypasses FullRiskManager.
The single-gate property is what makes the safety model
defensible. A fast-path for "trusted" callers is the kind of
change that looks harmless in review and breaks the audit log
the day you need it most.