Security Model

Trust Boundaries

Content is classified by trust source:

Source	Trust Level
User instructions	Trusted
Project policy	Trusted
AgentZero core code	Trusted
Skill instructions	Untrusted
Document content	Untrusted
Tool output	Untrusted
Network content	Untrusted

Untrusted content never becomes trusted instruction.
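
The rule above can be sketched as a type that tags every piece of content with its source and refuses to hand untrusted text back as an instruction. This is an illustrative sketch only: the names `Source`, `Trust`, `Tagged`, and `as_instruction` are assumptions made for the example, not AgentZero's actual types.

```rust
/// Where a piece of content came from (variants mirror the table above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    UserInstruction,
    ProjectPolicy,
    CoreCode,
    SkillInstruction,
    DocumentContent,
    ToolOutput,
    NetworkContent,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trust {
    Trusted,
    Untrusted,
}

impl Source {
    /// Trust is a fixed property of the source and cannot be overridden.
    pub fn trust(self) -> Trust {
        match self {
            Source::UserInstruction | Source::ProjectPolicy | Source::CoreCode => Trust::Trusted,
            Source::SkillInstruction
            | Source::DocumentContent
            | Source::ToolOutput
            | Source::NetworkContent => Trust::Untrusted,
        }
    }
}

/// Content tagged with its origin (hypothetical type for this sketch).
pub struct Tagged {
    pub source: Source,
    pub text: String,
}

impl Tagged {
    /// Returns the text only if it may be followed as an instruction.
    /// Untrusted content can still be read as data through `text`.
    pub fn as_instruction(&self) -> Option<&str> {
        match self.source.trust() {
            Trust::Trusted => Some(self.text.as_str()),
            Trust::Untrusted => None,
        }
    }
}
```

Because no method changes `source` after construction, tool output that contains instruction-like text stays data in this sketch.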

Data Classification

Level	Remote Models	Handling
`public`	Allowed	No restrictions
`internal`	Policy check	Requires explicit policy
`private`	Redact or deny	Must redact before remote
`pii`	Redact	Always redacted for remote
`secret`	Denied	Never sent remotely
`credential`	Denied	Never sent remotely
`unknown`	Denied	Fails closed as private
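
As a rough illustration, the table can be read as a total function from classification level to remote-model handling. The enum and method names below (`Classification`, `RemoteAction`, `from_label`, `remote_action`) are assumptions for this sketch and may not match the real code.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Classification {
    Public,
    Internal,
    Private,
    Pii,
    Secret,
    Credential,
    Unknown,
}

/// What may happen before data is sent to a remote model.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RemoteAction {
    Allow,
    PolicyCheck,
    RedactOrDeny,
    Redact,
    Deny,
}

impl Classification {
    /// Parse a label; anything unrecognized becomes `Unknown`.
    pub fn from_label(label: &str) -> Self {
        match label {
            "public" => Classification::Public,
            "internal" => Classification::Internal,
            "private" => Classification::Private,
            "pii" => Classification::Pii,
            "secret" => Classification::Secret,
            "credential" => Classification::Credential,
            _ => Classification::Unknown,
        }
    }

    /// Remote handling per the table above.
    pub fn remote_action(self) -> RemoteAction {
        match self {
            Classification::Public => RemoteAction::Allow,
            Classification::Internal => RemoteAction::PolicyCheck,
            Classification::Private => RemoteAction::RedactOrDeny,
            Classification::Pii => RemoteAction::Redact,
            Classification::Secret
            | Classification::Credential
            | Classification::Unknown => RemoteAction::Deny,
        }
    }
}
```

Matching exhaustively, with no wildcard arm in `remote_action`, means a newly added level fails to compile until its handling is stated explicitly.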

Fail Closed

When uncertain, AgentZero denies:

Unknown data classification → treat as private
Unknown permission → deny
Unknown runtime safety → deny
Failed redaction → deny remote call
Failed policy load → deny privileged actions
Failed audit init → deny session start
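
Two of the rules above (unknown permission and failed policy load) can be sketched as follows. The `Policy` type, its rule map, and `check_privileged` are hypothetical and exist only to show the deny-on-uncertainty pattern.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny,
}

/// Hypothetical policy: explicit decisions keyed by permission name.
pub struct Policy {
    rules: HashMap<String, Decision>,
}

impl Policy {
    pub fn new(rules: &[(&str, Decision)]) -> Self {
        Policy {
            rules: rules.iter().map(|(k, v)| (k.to_string(), *v)).collect(),
        }
    }

    /// Unknown permission -> deny.
    pub fn check(&self, permission: &str) -> Decision {
        self.rules.get(permission).copied().unwrap_or(Decision::Deny)
    }
}

/// Failed policy load -> deny privileged actions.
pub fn check_privileged<E>(loaded: &Result<Policy, E>, permission: &str) -> Decision {
    match loaded {
        Ok(policy) => policy.check(permission),
        Err(_) => Decision::Deny,
    }
}
```

In both functions the only path to `Allow` is an explicit rule in a successfully loaded policy; every other path returns `Deny`.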

Encryption

AES-256-GCM for all encryption at rest
Argon2id key derivation (memory-hard)
Random salt + nonce per encryption
Per-line encryption for audit logs (appendable)
Per-secret files in vault

Runtime Isolation

Skills execute at different isolation tiers per ADR 0006:

Tier	Use Case	Isolation	Status
None	Instruction-only skills	No code executed	✓ shipped
Host Readonly	Safe read-only tools	Filesystem access, path-restricted	✓ shipped
Host Supervised	Tools with write/shell	Policy-gated `ShellCommand` capability, full audit trail	✓ shipped
WASM Sandbox	Portable tools	No filesystem, network denied by default, fuel-limited	✓ shipped
MVM MicroVM	High-risk execution	Full VM isolation	planned
Deny	Blocked	Nothing executes	✓ shipped

WASM skills run inside a wasmtime sandbox with:

No ambient filesystem — all WASI access denied
Network access controlled — SandboxNetworkPolicy with three tiers: Deny (default), AllowEgress (unrestricted), AllowEgressFiltered (host allowlist)
Memory cap — 64 MB default
Time limit — fuel-based, 30 seconds default
Policy-gated — requires wasm_execution = "allow" in policy

Security Hardening

.agentzero/ Directory Protection

The .agentzero/ directory is blocked from tool access. Tools cannot read, write, or list files inside this directory. This prevents policy files, vault secrets, and audit logs from being disclosed to the model or exfiltrated through tool calls.
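
A minimal sketch of such a check, assuming it is applied to a path that has already been resolved: reject any path with a component named `.agentzero`. The function name `touches_agentzero_dir` is hypothetical.

```rust
use std::ffi::OsStr;
use std::path::{Component, Path};

/// True if any component of `path` is the `.agentzero` directory.
/// Compares whole components, so `.agentzero-notes.md` is not matched.
pub fn touches_agentzero_dir(path: &Path) -> bool {
    path.components().any(|component| match component {
        Component::Normal(name) => name == OsStr::new(".agentzero"),
        _ => false,
    })
}
```

This comparison is case-sensitive; a real implementation on a case-insensitive filesystem would presumably need to normalize case as well.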

TOCTOU Prevention

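Paths are canonicalized before policy evaluation, so the policy is checked against the resolved target rather than the path as written. This guards against time-of-check-to-time-of-use (TOCTOU) attacks, where a symlink is repointed to a different file between the policy check and the actual file operation.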
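
The idea can be sketched with the standard library alone: resolve the path first, then check the resolved result against the allowed root. The function `resolve_within` and its error message are assumptions for illustration, not AgentZero's API.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Resolve `requested` (following symlinks and `..`) and accept it only
/// if the resolved path stays inside `root`.
pub fn resolve_within(root: &Path, requested: &Path) -> io::Result<PathBuf> {
    let root = root.canonicalize()?;
    // `canonicalize` fails if the path does not exist, which also denies.
    let resolved = root.join(requested).canonicalize()?;
    if resolved.starts_with(&root) {
        Ok(resolved)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "path escapes the allowed root",
        ))
    }
}
```

In this sketch the caller is expected to perform the file operation on the returned `PathBuf`, not on the original string, so that the path that was checked is the path that is used.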

Redaction Pipeline

Tool arguments are redacted in ToolCallRecord and the approval flow. Secrets are replaced with [REDACTED_SECRET] and PII with [REDACTED_PII]. Redaction placeholders include a random hex suffix (e.g., [SECRET_a1b2]) to prevent placeholder collision.

Tool output is scanned for secrets before audit event logging. The scanning functions (scan_for_secrets(), redact_json_value()) are extracted to agentzero-core as shared utilities.
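
The suffixed-placeholder idea can be sketched as below. This is not the signature of `scan_for_secrets()` or `redact_json_value()`, which this document does not show; `redact_known_secrets` and `random_suffix` are hypothetical, and the sketch assumes the secret values are already known rather than detected by pattern.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

/// Four hex digits from the standard library's randomly seeded hasher.
/// Assumption for this sketch: collision avoidance, not cryptographic randomness.
fn random_suffix() -> String {
    let n = RandomState::new().build_hasher().finish();
    format!("{:04x}", n & 0xffff)
}

/// Replace every occurrence of each known secret with a suffixed placeholder.
pub fn redact_known_secrets(text: &str, secrets: &[&str]) -> String {
    let mut out = text.to_string();
    for secret in secrets.iter().filter(|s| !s.is_empty()) {
        if out.contains(*secret) {
            let placeholder = format!("[SECRET_{}]", random_suffix());
            out = out.replace(*secret, &placeholder);
        }
    }
    out
}
```

The suffix means text that already contains a literal such as `[SECRET]` is unlikely to be mistaken for a placeholder produced by the redactor.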

WASM Import Verification

WASM modules with undeclared imports are rejected before execution. Only imports declared in the WIT interface (az:host) are linked. This prevents modules from requesting capabilities they were not designed for.
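
The check can be sketched independently of any WASM runtime: compare the imports a module requests against the declared set and reject on any difference. The `Import` alias, the function names, and the example import names in the usage note are assumptions for illustration.

```rust
use std::collections::HashSet;

/// An import requested by a module: (namespace, function name).
pub type Import = (String, String);

/// Imports that the module requests but the interface does not declare.
pub fn undeclared_imports<'a>(
    requested: &'a [Import],
    declared: &HashSet<Import>,
) -> Vec<&'a Import> {
    requested
        .iter()
        .filter(|import| !declared.contains(*import))
        .collect()
}

/// Reject the module before execution if any import is undeclared.
pub fn verify_imports(requested: &[Import], declared: &HashSet<Import>) -> Result<(), String> {
    let missing = undeclared_imports(requested, declared);
    if missing.is_empty() {
        Ok(())
    } else {
        let names: Vec<String> = missing
            .iter()
            .map(|import| format!("{}/{}", import.0, import.1))
            .collect();
        Err(format!("undeclared imports: {}", names.join(", ")))
    }
}
```

For example, with only a hypothetical `("az:host", "log")` declared, a module that also requests `("env", "open_file")` is rejected with an error naming `env/open_file`.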

Approval Scope Tracking

The approval prompt supports three responses:

Response	Scope
`y`	Approve this one tool call
`yes-all` / `a`	Approve all calls to this tool for the current session
`n`	Deny

Session-scoped approvals are cached per tool name and do not persist across sessions.
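
A sketch of how the three responses and the per-tool session cache could fit together. The type and function names are hypothetical, and treating unrecognized input as a denial is an assumption made here to match the fail-closed rules.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Response {
    Once,
    Session,
    Deny,
}

/// Parse prompt input. Assumption: anything unrecognized is a denial.
pub fn parse_response(input: &str) -> Response {
    match input.trim() {
        "y" => Response::Once,
        "yes-all" | "a" => Response::Session,
        _ => Response::Deny,
    }
}

/// In-memory cache of session-wide approvals, keyed by tool name.
/// It is never written to disk, so it ends with the session.
#[derive(Default)]
pub struct SessionApprovals {
    approved_tools: HashSet<String>,
}

impl SessionApprovals {
    /// Returns whether the call may proceed. `ask` is invoked only when
    /// the tool has no cached session-wide approval.
    pub fn authorize(&mut self, tool: &str, ask: impl FnOnce() -> Response) -> bool {
        if self.approved_tools.contains(tool) {
            return true;
        }
        match ask() {
            Response::Once => true,
            Response::Session => {
                self.approved_tools.insert(tool.to_string());
                true
            }
            Response::Deny => false,
        }
    }
}
```

A `y` answer approves one call and leaves the cache unchanged, so the next call to the same tool prompts again; only `yes-all` or `a` suppresses later prompts for that tool.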

Gateway Safety

Messaging gateways (Slack, Telegram) route external messages through the agent loop with no human in the loop. To prevent abuse:

Dangerous tools denied — GatewayApprovalHandler blocks write, edit, shell, and generate_tool in gateway mode. Only read-only tools (read, list, search) are available.
PII redaction on outbound — all responses are scanned and redacted before sending to the messaging platform
Policy still applies — the deny-by-default policy engine runs before every tool call, even in gateway mode
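
The first and third rules can be sketched as two independent checks that must both pass. This is not the real `GatewayApprovalHandler`; `gateway_allows` and the constant are hypothetical, and the sketch uses an allowlist of the read-only tools named above so that any tool not listed is denied.

```rust
/// Read-only tools available in gateway mode, as listed above.
const GATEWAY_READ_ONLY_TOOLS: [&str; 3] = ["read", "list", "search"];

/// A tool call proceeds in gateway mode only if the policy engine allows
/// it AND the tool is on the read-only allowlist.
pub fn gateway_allows(tool: &str, policy_allows: bool) -> bool {
    policy_allows && GATEWAY_READ_ONLY_TOOLS.contains(&tool)
}
```

With an allowlist, `write`, `edit`, `shell`, and `generate_tool` are denied without being named, and so is any tool added later.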

HTTP Request Controls (WASM)

WASM plugins can make outbound HTTP requests via the az::http_request host import. Three barriers protect against data exfiltration:

Capability check — policy engine must allow NetworkRequest
URL allowlist — SandboxNetworkPolicy must allow the specific URL (plugins declare allowed hosts in PLUGIN.toml)
PII scan — request body is scanned for secrets; blocked if any are found

Plugins declare their network requirements in PLUGIN.toml:

```toml
[plugin.network]
policy = "allow_egress_filtered"
allowed_hosts = ["slack.com", "api.telegram.org"]
```

Plugins without a [plugin.network] section default to Deny — no outbound requests.
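
The three barriers can be sketched as a single conjunction. The variant names follow the `SandboxNetworkPolicy` tiers named in this document, but the data layout, the exact-host matching rule, and the `may_send` function are assumptions for illustration.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SandboxNetworkPolicy {
    Deny,
    AllowEgress,
    /// Hosts the plugin declared in PLUGIN.toml.
    AllowEgressFiltered(Vec<String>),
}

impl Default for SandboxNetworkPolicy {
    /// No [plugin.network] section means no outbound requests.
    fn default() -> Self {
        SandboxNetworkPolicy::Deny
    }
}

impl SandboxNetworkPolicy {
    /// Assumption: hosts match exactly (case-insensitive), with no
    /// subdomain wildcards.
    pub fn allows_host(&self, host: &str) -> bool {
        match self {
            SandboxNetworkPolicy::Deny => false,
            SandboxNetworkPolicy::AllowEgress => true,
            SandboxNetworkPolicy::AllowEgressFiltered(hosts) => {
                hosts.iter().any(|allowed| allowed.eq_ignore_ascii_case(host))
            }
        }
    }
}

/// All three barriers must pass before a request leaves the sandbox.
pub fn may_send(
    capability_granted: bool,
    policy: &SandboxNetworkPolicy,
    host: &str,
    body_has_secrets: bool,
) -> bool {
    capability_granted && policy.allows_host(host) && !body_has_secrets
}
```

With the `PLUGIN.toml` shown above, a request to `slack.com` with a clean body passes, while a request to any other host, a request without the `NetworkRequest` capability, or a request whose body contains a secret is blocked.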

What AgentZero Prevents

Raw secrets reaching model context
PII in remote model calls without redaction
Prompt injection from tool output
Ambient host access from skills/tools
WASM modules escaping sandbox boundaries
WASM modules with undeclared imports executing
Secret leakage in audit logs (tool output scanned before logging)
Unaudited tool or skill execution
Tool access to .agentzero/ configuration directory
TOCTOU path manipulation attacks
Arbitrary code execution via messaging gateways (dangerous tools denied when no human in the loop)
WASM plugin data exfiltration to unauthorized hosts (URL allowlist enforcement)
Secrets in outbound HTTP request bodies from WASM plugins (PII scan)