Skip to content

Security Overview

AgentZero implements defense-in-depth security with 12 independent layers. Every capability is denied by default and must be explicitly enabled. All boundaries are fail-closed — if a security check can’t complete, the operation is rejected.

LayerWhat It ProtectsKey Mechanism
File I/OPath traversal, symlink attacksCanonicalization + hard-link guard + 40+ sensitive patterns
Shell ExecutionCommand injectionQuote-aware parser + explicit allowlist
Encryption at RestData confidentialityXChaCha20-Poly1305 AEAD, random nonce per write
Network SecuritySSRF, DNS rebindingPrivate IP blocking + DNS resolution check
Credential Leak GuardSecret exfiltrationPattern matching + Shannon entropy + channel boundary isolation
PII ProtectionData leakage to remote LLM providersMandatory redaction of 9 PII categories + opaque request IDs (always on, cannot be disabled)
LLM GuardrailsPrompt injection, additional PII patternsRegex-based detection, configurable enforcement mode
WASM Plugin SandboxMalicious pluginsFuel metering + memory cap + Ed25519 signatures
Gateway SecurityUnauthorized access, replay attacksmTLS + HMAC-SHA256 signing + TLS enforcement
MCP AttestationSupply chain attacksSHA-256 binary hash verification
Autonomy & DelegationPrivilege escalation3-tier autonomy + policy intersection + per-tool rate limits
Declarative PoliciesMisconfigurationYAML per-tool rules for egress, filesystem, commands
Audit & RedactionInformation leakageSecret redaction in errors/logs/panics, path sanitization

Every file operation passes through a multi-stage validation pipeline:

Input Path
-> Component::ParentDir rejection (blocks ../)
-> Canonicalization (resolves symlinks to real paths)
-> Allowed Root check (must be within workspace)
-> Hard-link guard (rejects files with multiple hard links)
-> Sensitive file check (40+ patterns)
-> Size limit check
-> Execute

The following files are blocked by default (configurable via allow_sensitive_file_reads/allow_sensitive_file_writes):

Environment files: .env, .env.* (any suffix)

Cloud credentials: .aws/credentials, .aws/config, .azure/accessTokens.json, .config/gcloud/credentials.db, .config/gcloud/application_default_credentials.json

SSH keys: .ssh/id_rsa, .ssh/id_ed25519, .ssh/id_ecdsa, .ssh/id_dsa

Kubernetes/Docker: .kube/config, .docker/config.json

Package registries: .npmrc, .pypirc, .gem/credentials

Database credentials: .pgpass, .my.cnf, .netrc

Certificates: *.pem, *.key, *.p12, *.pfx

Service accounts: credentials.json, service-account.json, client_secret.json


AgentZero uses a context-aware quote parser that understands shell quoting rules:

  • Backtick (`) is always forbidden — even inside quotes
  • $ is blocked outside quotes — prevents variable expansion and $(command) substitution
  • ;, &, |, >, < are blocked outside quotes — prevents command chaining
  • Quoted metacharacters are safe: echo 'hello;world' is allowed

Commands must be on an explicit allowlist — there is no path to execute arbitrary commands.

InputResultReason
echo 'hello;world'AllowedSemicolon inside single quotes
echo hello;worldBlockedUnquoted semicolon
echo `whoami`BlockedBacktick always forbidden
echo $HOMEBlockedUnquoted dollar sign
grep "pattern" file.txtAllowedQuoted argument

All sensitive data is encrypted before writing to disk:

  • Algorithm: XChaCha20-Poly1305 (authenticated encryption with associated data)
  • Key size: 256-bit
  • Nonce: 24-byte, randomly generated per encryption operation
  • Key storage: Auto-generated, stored with Unix permissions 0o600
  • Key sources: File-based (.agentzero-data.key) or environment variable (AGENTZERO_DATA_KEY)
  • Atomic writes: Data is written to a temporary file, then atomically renamed
  • API key hashing: Raw API keys are SHA-256 hashed before storage — the raw key is never persisted

All network tools share a unified URL Access Policy:

URL Parse -> Scheme Check (http/https only) -> Domain Blocklist ->
IP Resolution -> Private IP Check -> DNS Rebinding Check ->
Domain Allowlist -> Execute

DNS rebinding protection: Domain names are resolved to IP addresses and verified against the blocklist. An attacker cannot register a domain that initially resolves to a public IP, then change DNS to point to 127.0.0.1.

  • Loopback-only binding by default — public binding requires gateway.allow_public_bind = true
  • Constant-time token comparison via subtle::ConstantTimeEq to prevent timing side-channel attacks
  • Rate limiting: Sliding window with per-identity isolation (600 req/min default)
  • HSTS headers automatically applied when TLS is active
  • Request size limit: 1 MB default maximum body size

Configure gateway.tls.client_ca_path to require client certificates:

[gateway.tls]
cert_path = "/etc/ssl/server.pem"
key_path = "/etc/ssl/server-key.pem"
client_ca_path = "/etc/ssl/client-ca.pem" # Enables mTLS

When set, clients must present a certificate signed by the specified CA. Unsigned or mis-signed connections are rejected at the TLS layer.

API keys can be created with HMAC signing enabled. When active, requests must include:

  • X-AZ-Timestamp: Unix epoch timestamp (must be within 5 minutes of server time)
  • X-AZ-Signature: hmac-sha256=<hex> covering {timestamp}.{METHOD}.{path}.{body}

This prevents replay attacks and ensures request integrity.

When AGENTZERO_ENV=production, the gateway refuses to start without TLS unless gateway.allow_insecure = true is explicitly set.


Outbound messages to external channels (Telegram, Discord, Slack) are scanned for credentials before sending:

PatternExample
API key prefixessk-*, api_key=*
Bearer tokensBearer <token>
JWT tokenseyJ*.eyJ*.*
AWS access keysAKIA*
Private key headers-----BEGIN PRIVATE KEY-----
GitHub tokensghp_*, ghs_*
Anthropic keyssk-ant-*
Slack tokensxox[baprs]-*
X25519/Noise keys64-char hex strings with key-like prefixes

High-entropy strings (32+ characters) are flagged using Shannon entropy analysis. The sensitivity threshold is configurable (default 0.7).

Content marked local_only is blocked from being sent to remote channels. This prevents accidental leakage of internal data to external platforms.

Add your own detection patterns in agentzero.toml:

[security.outbound_leak_guard]
enabled = true
action = "redact"
sensitivity = 0.7
[[security.outbound_leak_guard.extra_patterns]]
name = "stripe_key"
regex = "sk_live_[a-zA-Z0-9]{24,}"
[[security.outbound_leak_guard.extra_patterns]]
name = "internal_token"
regex = "int_[a-f0-9]{32}"

The LLM pipeline includes composable guardrails that are enabled in audit mode by default:

Detects and optionally redacts:

  • Email addresses -> [EMAIL_REDACTED]
  • US phone numbers -> [PHONE_REDACTED]
  • Social Security Numbers -> [SSN_REDACTED]
  • API key patterns (sk-*, AKIA*, ghp_*) -> [API_KEY_REDACTED]

Detects 9+ common injection patterns:

  • “ignore all previous instructions”
  • “you are now [DAN/jailbreak/unrestricted]”
  • “new system prompt:”, “override system prompt”
  • “forget all rules/instructions”
  • “pretend you have no restrictions”

Detects invisible Unicode characters commonly used for steganographic prompt injection:

  • Zero-width characters: ZWSP, ZWNJ, ZWJ, BOM, word joiner
  • Bidirectional overrides: LTR/RTL marks, embedding, isolation (can reorder visible text)
  • Tag characters: U+E0001..U+E007F (Unicode steganography)
  • Invisible separators: soft hyphen, combining grapheme joiner, Mongolian vowel separator, function application, invisible times/separator/plus
  • Annotation anchors: interlinear annotation markers

In sanitize mode, all suspicious characters are stripped. This guard is applied to both user input and context file content before inclusion in system prompts.

Before any file is included in the system prompt (.agentzero.md, project context files, loaded references), its content is scanned through both the prompt injection guard and the Unicode injection guard. This catches:

  • Prompt injection hidden inside project documentation
  • Invisible Unicode steganography in source files
  • Bidirectional text overrides that reorder visible content

Use scan_for_injection(content) programmatically, or rely on the automatic scanning in the guardrails pipeline.

Configure via [guardrails] in agentzero.toml:

[guardrails]
pii_mode = "audit" # "off", "audit", "sanitize", "block"
injection_mode = "audit" # "off", "audit", "sanitize", "block"
unicode_mode = "sanitize" # "off", "audit", "sanitize", "block"
ModeBehavior
offDisabled
auditLog violations, pass through (default)
sanitizeRedact content, continue
blockReject the request entirely

Plugins execute in a strict sandbox with configurable isolation:

ControlDefault
Max execution time30 seconds
Max module size5 MB
Memory cap256 MB
Network accessDenied
Filesystem writeDenied
Filesystem readDenied
Signature requiredYes (release builds)

In release builds, require_signed defaults to true. Unsigned WASM modules are rejected. Plugin manifests are signed with Ed25519 and verified before loading.

Use --allow-unsigned-plugins for local development.


MCP servers are spawned as subprocesses. To prevent supply-chain attacks, configure a SHA-256 hash for each server binary:

{
"mcpServers": {
"filesystem": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem"],
"sha256": "a1b2c3d4e5f6..."
}
}
}

Before spawning, AgentZero:

  1. Resolves the binary path (absolute or via PATH)
  2. Reads the binary and computes SHA-256
  3. Compares against the configured hash
  4. Refuses to spawn on mismatch

LevelRead ToolsWrite ToolsNetwork Tools
ReadOnlyAuto-approveBlockedBlocked
SupervisedAuto-approveRequires approvalRequires approval
FullAuto-approveAuto-approveAuto-approve

When delegating to sub-agents, policies are intersected (most restrictive wins):

  • Autonomy level: most restrictive of parent and child
  • Forbidden paths: union (more paths forbidden)
  • Allowed roots: intersection (only roots allowed by both)
  • Sensitive file access: only if both parent and child allow
  • Per-tool rate limits: the lower limit wins

A sub-agent can never escalate beyond its parent’s privileges.

Configure rate limits per tool in the autonomy policy to prevent resource-intensive tools from being called excessively:

[autonomy.tool_rate_limits.shell]
max_calls = 10
window_secs = 60
[autonomy.tool_rate_limits.http_request]
max_calls = 30
window_secs = 60

Create .agentzero/security-policy.yaml for fine-grained per-tool rules:

default: deny
rules:
- tool: http_request
egress:
- api.openai.com
- "*.github.com"
action: allow
- tool: shell
commands: [git, cargo, rustc]
action: allow
- tool: read_file
filesystem: [/workspace, /tmp]
action: allow
- tool: "mcp:*"
egress: [prompt]
action: prompt

Paths in filesystem rules are canonicalized before comparison, preventing traversal attacks at the policy layer.


Secrets are automatically stripped from errors, logs, and panic output:

PatternReplacement
OPENAI_API_KEY=sk-*OPENAI_API_KEY=[REDACTED]
"api_key": "...""api_key":"[REDACTED]"
Authorization: Bearer ...Authorization: Bearer [REDACTED]
sk-[A-Za-z0-9_-]{10,}sk-[REDACTED]

The panic hook ensures secrets are redacted even during crashes. The error chain walker redacts every level of an anyhow error chain.

Error messages returned to the LLM replace the user’s home directory with ~, preventing leakage of filesystem layout (e.g., /home/alice/secret-project/ becomes ~/secret-project/).

Security events are logged as structured tracing with target: "audit":

  • auth_failure — failed authentication attempt
  • scope_denied — authenticated but insufficient scope
  • pair_success / pair_failure — pairing flow events
  • api_key_created / api_key_revoked — key lifecycle
  • rate_limited — request rejected by rate limiter
  • estop — emergency stop triggered

These integrate with any log aggregation system (ELK, Datadog, Grafana Loki) via JSON/JSONL subscriber configuration.


For sensitive deployments, AgentZero supports end-to-end encrypted communication:

  • Noise protocol key exchange with X25519
  • Session management with configurable TTL and max concurrent sessions
  • Key rotation with overlap periods for graceful rollover
  • Sealed envelope relay with timing jitter to obscure message patterns

Configure via [privacy] in agentzero.toml:

[privacy]
mode = "encrypted" # "off", "private", "encrypted", "full"

# Autonomy
[autonomy]
level = "supervised" # read_only, supervised, full
# Guardrails (default: audit)
[guardrails]
pii_mode = "sanitize"
injection_mode = "block"
# Gateway TLS + mTLS
[gateway.tls]
cert_path = "/etc/ssl/cert.pem"
key_path = "/etc/ssl/key.pem"
client_ca_path = "/etc/ssl/client-ca.pem"
# WebSocket tuning
[gateway.websocket]
heartbeat_interval_secs = 30
pong_timeout_secs = 60
idle_timeout_secs = 300
max_message_bytes = 2097152
# Leak guard with custom patterns
[security.outbound_leak_guard]
enabled = true
action = "redact"
sensitivity = 0.7
extra_patterns = [
{ name = "stripe_key", regex = "sk_live_[a-zA-Z0-9]{24,}" },
]