Security Model
Trust Boundaries
Content is classified by trust source:
| Source | Trust Level |
|---|---|
| User instructions | Trusted |
| Project policy | Trusted |
| AgentZero core code | Trusted |
| Skill instructions | Untrusted |
| Document content | Untrusted |
| Tool output | Untrusted |
| Network content | Untrusted |
Untrusted content never becomes trusted instruction.
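The rule above can be sketched as a tagging scheme. This is an illustration only, not AgentZero's implementation: the `Source`, `Content`, and `split_context` names are hypothetical, and the trusted set simply mirrors the table.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    USER = "user_instructions"
    PROJECT_POLICY = "project_policy"
    CORE = "agentzero_core"
    SKILL = "skill_instructions"
    DOCUMENT = "document_content"
    TOOL_OUTPUT = "tool_output"
    NETWORK = "network_content"


# Mirrors the table: only these sources may contribute instructions.
TRUSTED_SOURCES = frozenset({Source.USER, Source.PROJECT_POLICY, Source.CORE})


@dataclass(frozen=True)
class Content:
    source: Source
    text: str

    @property
    def trusted(self) -> bool:
        return self.source in TRUSTED_SOURCES


def split_context(items):
    """Separate instructions from data; untrusted text is only ever data."""
    instructions = [c.text for c in items if c.trusted]
    data = [c.text for c in items if not c.trusted]
    return instructions, data
```

Because the trust level is derived from the source tag and the dataclass is frozen, nothing in this sketch can promote a piece of tool output into the instruction list after the fact.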
Data Classification
| Level | Remote Models | Handling |
|---|---|---|
| `public` | Allowed | No restrictions |
| `internal` | Policy check | Requires explicit policy |
| `private` | Redact or deny | Must redact before remote |
| `pii` | Redact | Always redacted for remote |
| `secret` | Denied | Never sent remotely |
| `credential` | Denied | Never sent remotely |
| `unknown` | Denied | Fails closed as private |
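As a rough sketch, the "Remote Models" column can be read as a lookup that falls back to denial. The `Decision` enum and `remote_decision` function are hypothetical names chosen for this example.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    POLICY_CHECK = "policy_check"
    REDACT_OR_DENY = "redact_or_deny"
    REDACT = "redact"
    DENY = "deny"


# One entry per row of the table.
REMOTE_RULES = {
    "public": Decision.ALLOW,
    "internal": Decision.POLICY_CHECK,
    "private": Decision.REDACT_OR_DENY,
    "pii": Decision.REDACT,
    "secret": Decision.DENY,
    "credential": Decision.DENY,
    "unknown": Decision.DENY,
}


def remote_decision(level: str) -> Decision:
    # A label that is not in the table is handled like "unknown": denied.
    return REMOTE_RULES.get(level, Decision.DENY)
```

The default argument to `get` is the important part: a misspelled or missing classification label leads to denial, never to an accidental allow.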
Fail Closed
When uncertain, AgentZero denies:
- Unknown data classification → treat as private
- Unknown permission → deny
- Unknown runtime safety → deny
- Failed redaction → deny remote call
- Failed policy load → deny privileged actions
- Failed audit init → deny session start
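A minimal sketch of three of these rules (unknown permission, failed policy load, failed audit init), assuming a policy represented as a plain dictionary. `check_permission` and `start_session` are hypothetical helpers, not AgentZero APIs.

```python
def check_permission(policy, capability):
    """Return True only when the policy explicitly allows the capability."""
    if policy is None:  # the policy failed to load
        return False
    return policy.get(capability) == "allow"


def start_session(load_policy, init_audit):
    """Hypothetical startup: every failure path ends in denial."""
    try:
        init_audit()
    except Exception as exc:
        # No audit trail means no session at all.
        raise RuntimeError("audit init failed; session not started") from exc
    try:
        return load_policy()
    except Exception:
        # The session starts, but check_permission denies everything.
        return None
```

Note that `check_permission` compares against `"allow"` rather than checking for the absence of `"deny"`, so a capability missing from the policy is denied as well.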
Encryption
- AES-256-GCM for all encryption at rest
- Argon2id key derivation (memory-hard)
- Random salt + nonce per encryption
- Per-line encryption for audit logs (appendable)
- Per-secret files in vault
Runtime Isolation
Skills execute at different isolation tiers per ADR 0006:
| Tier | Use Case | Isolation | Status |
|---|---|---|---|
| None | Instruction-only skills | No code executed | ✓ shipped |
| Host Readonly | Safe read-only tools | Filesystem access, path-restricted | ✓ shipped |
| Host Supervised | Tools with write/shell | Policy-gated ShellCommand capability, full audit trail | ✓ shipped |
| WASM Sandbox | Portable tools | No filesystem, network denied by default, fuel-limited | ✓ shipped |
| MVM MicroVM | High-risk execution | Full VM isolation | planned |
| Deny | Blocked | Nothing executes | ✓ shipped |
WASM skills run inside a wasmtime sandbox with:
- No ambient filesystem: all WASI access denied
- Network access controlled: `SandboxNetworkPolicy` with three tiers: `Deny` (default), `AllowEgress` (unrestricted), `AllowEgressFiltered` (host allowlist)
- Memory cap: 64 MB default
- Time limit: fuel-based, 30 seconds default
- Policy-gated: requires `wasm_execution = "allow"` in policy
Security Hardening
.agentzero/ Directory Protection
The .agentzero/ directory is blocked from tool access. Tools cannot read, write, or list files inside this directory. This prevents policy files, vault secrets, and audit logs from being disclosed to the model or exfiltrated through tool calls.
TOCTOU Prevention
Paths are canonicalized before policy evaluation. This prevents time-of-check-to-time-of-use attacks where a symlink could point to a different file between the policy check and the actual file operation.
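The following sketch combines the two path protections above: canonicalize first, then evaluate the canonical path. It is an illustration under stated assumptions, not the project's code; `resolve_for_tool` and the project-root check are hypothetical.

```python
import os
from pathlib import Path

PROTECTED_DIR = ".agentzero"


def resolve_for_tool(root: str, requested: str) -> Path:
    """Canonicalize, then check the canonical path.

    Returns the canonical path to use for the file operation,
    or raises PermissionError.
    """
    root_real = Path(os.path.realpath(root))
    # realpath follows symlinks and collapses ".." components.
    target = Path(os.path.realpath(root_real / requested))
    try:
        relative = target.relative_to(root_real)
    except ValueError:
        raise PermissionError(f"{requested!r} resolves outside the project root")
    if relative.parts[:1] == (PROTECTED_DIR,):
        raise PermissionError(f"{requested!r} resolves into {PROTECTED_DIR}/")
    return target
```

In this sketch the caller is expected to operate on the returned canonical path rather than on the original string, so the path that was checked is the path that gets used. A symlink such as `src/innocent -> .agentzero` is rejected because the check sees the link's target, not its name.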
Redaction Pipeline
Tool arguments are redacted in ToolCallRecord and the approval flow. Secrets are replaced with [REDACTED_SECRET] and PII with [REDACTED_PII]. Redaction placeholders include a random hex suffix (e.g., [SECRET_a1b2]) to prevent placeholder collision.
Tool output is scanned for secrets before audit event logging. The scanning functions (scan_for_secrets(), redact_json_value()) are extracted to agentzero-core as shared utilities.
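To illustrate the idea of recursive redaction with suffixed placeholders, here is a small Python sketch. It is not the `agentzero-core` implementation: `SECRET_PATTERN` is a deliberately simplistic, hypothetical detector, and `redact_text` and `redact_value` are names made up for this example.

```python
import re
import secrets

# Hypothetical detector; a real scanner would cover many more formats.
SECRET_PATTERN = re.compile(r"\b(?:sk|ghp|xoxb)[-_][A-Za-z0-9_-]{8,}")


def redact_text(text: str) -> str:
    """Replace each detected secret with a placeholder that has a random hex suffix."""
    return SECRET_PATTERN.sub(lambda _m: f"[SECRET_{secrets.token_hex(2)}]", text)


def redact_value(value):
    """Walk a JSON-like value and redact every string inside it."""
    if isinstance(value, str):
        return redact_text(value)
    if isinstance(value, list):
        return [redact_value(v) for v in value]
    if isinstance(value, dict):
        return {k: redact_value(v) for k, v in value.items()}
    return value  # numbers, booleans, and None pass through unchanged
```

Each match gets its own suffix, so two different secrets in one record end up with distinguishable placeholders.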
WASM Import Verification
WASM modules with undeclared imports are rejected before execution. Only imports declared in the WIT interface (az:host) are linked. This prevents modules from requesting capabilities they were not designed for.
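Conceptually, the check is a set difference between what a module asks for and what the interface declares. The sketch below assumes imports are represented as `(module, name)` pairs; the helper names and that representation are hypothetical.

```python
def undeclared_imports(module_imports, declared):
    """Return the imports a module requests that the interface does not declare."""
    return sorted(set(module_imports) - set(declared))


def verify_imports(module_imports, declared):
    """Reject the module before execution if it asks for anything extra."""
    extra = undeclared_imports(module_imports, declared)
    if extra:
        raise ValueError(f"module rejected, undeclared imports: {extra}")
```

For example, a module that requests `("wasi_snapshot_preview1", "fd_write")` when only `("az:host", "http_request")` is declared would be rejected before any of its code runs.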
Approval Scope Tracking
The approval prompt supports three responses:
| Response | Scope |
|---|---|
| `y` | Approve this one tool call |
| `yes-all` / `a` | Approve all calls to this tool for the current session |
| `n` | Deny |
Session-scoped approvals are cached per tool name and do not persist across sessions.
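A session-scoped cache of this kind might look like the sketch below. `SessionApprovals` is a hypothetical class, and treating any unrecognized response as a denial is an assumption made here to stay consistent with the fail-closed rule.

```python
class SessionApprovals:
    """In-memory approval cache; discarded when the session ends."""

    def __init__(self):
        self._approved_tools = set()

    def needs_prompt(self, tool: str) -> bool:
        return tool not in self._approved_tools

    def record(self, tool: str, response: str) -> bool:
        """Apply a prompt response; return True if this call may proceed."""
        answer = response.strip().lower()
        if answer in ("yes-all", "a"):
            self._approved_tools.add(tool)  # remembered for this session only
            return True
        # "y" approves this one call; anything else, including "n", denies.
        return answer == "y"
```

Because the cache lives on the instance and is keyed by tool name, approving `write` for the session does not approve `shell`, and a new session starts with an empty cache.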
Gateway Safety
Messaging gateways (Slack, Telegram) route external messages through the agent loop with no human in the loop. To prevent abuse:
- Dangerous tools denied: `GatewayApprovalHandler` blocks write, edit, shell, and generate_tool in gateway mode. Only read-only tools (read, list, search) are available.
- PII redaction on outbound: all responses are scanned and redacted before sending to the messaging platform
- Policy still applies: the deny-by-default policy engine runs before every tool call, even in gateway mode
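The tool restriction in the list above can be sketched as two checks that must both pass. This is not the `GatewayApprovalHandler` itself: `gateway_allows` is a hypothetical function, and expressing the rule as an allowlist of read-only tools (rather than a list of blocked ones) is a choice made for this example.

```python
READ_ONLY_TOOLS = frozenset({"read", "list", "search"})


def gateway_allows(tool: str, policy_allows) -> bool:
    """In gateway mode a tool runs only if it is read-only AND policy allows it."""
    if tool not in READ_ONLY_TOOLS:
        return False  # write, edit, shell, generate_tool, and anything unknown
    return bool(policy_allows(tool))
```

An allowlist means a tool added later is unavailable in gateway mode until someone deliberately lists it.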
HTTP Request Controls (WASM)
WASM plugins can make outbound HTTP requests via the az::http_request host import. Three barriers protect against data exfiltration:
- Capability check: the policy engine must allow `NetworkRequest`
- URL allowlist: `SandboxNetworkPolicy` must allow the specific URL (plugins declare allowed hosts in `PLUGIN.toml`)
- PII scan: the request body is scanned for secrets, and the request is blocked if any are found

Plugins declare their network requirements in `PLUGIN.toml`:

    [plugin.network]
    policy = "allow_egress_filtered"
    allowed_hosts = ["slack.com", "api.telegram.org"]

Plugins without a `[plugin.network]` section default to `Deny`: no outbound requests are allowed.
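The three barriers can be read as sequential checks, each able to block the request. The sketch below is a simplified model with hypothetical names (`NetworkPolicy`, `host_allowed`, `check_http_request`, `SECRET_PATTERN`); exact-host matching is an assumption, and the real subdomain handling is not specified here.

```python
import re
from enum import Enum
from urllib.parse import urlsplit


class NetworkPolicy(Enum):
    DENY = "deny"
    ALLOW_EGRESS = "allow_egress"
    ALLOW_EGRESS_FILTERED = "allow_egress_filtered"


# Hypothetical secret detector, for illustration only.
SECRET_PATTERN = re.compile(r"\b(?:sk|ghp|xoxb)[-_][A-Za-z0-9_-]{8,}")


def host_allowed(policy, allowed_hosts, url):
    if policy is NetworkPolicy.ALLOW_EGRESS:
        return True
    if policy is NetworkPolicy.ALLOW_EGRESS_FILTERED:
        host = (urlsplit(url).hostname or "").lower()
        return host in allowed_hosts  # exact match (assumption)
    return False  # DENY, the default


def check_http_request(capabilities, policy, allowed_hosts, url, body):
    """Return None if the request may proceed, else the reason it is blocked."""
    if "NetworkRequest" not in capabilities:
        return "capability NetworkRequest not allowed"
    if not host_allowed(policy, allowed_hosts, url):
        return "host not allowed by network policy"
    if SECRET_PATTERN.search(body):
        return "secret detected in request body"
    return None
```

With the `PLUGIN.toml` values above, a request to `https://slack.com/...` with a clean body passes all three checks, while the same request to an unlisted host stops at the second one.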
What AgentZero Prevents
- Raw secrets reaching model context
- PII in remote model calls without redaction
- Prompt injection via tool output (tool output is never treated as trusted instruction)
- Ambient host access from skills/tools
- WASM modules escaping sandbox boundaries
- WASM modules with undeclared imports executing
- Secret leakage in audit logs (tool output scanned before logging)
- Unaudited tool or skill execution
- Tool access to the `.agentzero/` configuration directory
- TOCTOU path manipulation attacks
- Arbitrary code execution via messaging gateways (dangerous tools denied when no human in the loop)
- WASM plugin data exfiltration to unauthorized hosts (URL allowlist enforcement)
- Secrets in outbound HTTP request bodies from WASM plugins (PII scan)