
Gateway API

The AgentZero gateway exposes a localhost HTTP API for programmatic access to the agent runtime.

```sh
agentzero gateway
agentzero gateway --host 127.0.0.1 --port 42617
agentzero gateway --new-pairing
```
| Method | Path | Auth | Description |
| ------ | ---- | ---- | ----------- |
| GET | / | None | Dashboard HTML page |
| GET | /health | None | Service health probe |
| GET | /metrics | None | Prometheus-compatible metrics |
| POST | /pair | Pairing code | Exchange pairing code for bearer token |
| POST | /v1/ping | Bearer | Echo test endpoint |
| POST | /v1/webhook/:channel | Bearer | Channel message dispatch |
| POST | /api/chat | Bearer | Chat with agent (JSON response) |
| POST | /v1/chat/completions | Bearer | OpenAI-compatible chat completions (supports SSE streaming) |
| GET | /v1/models | Bearer | List available models (OpenAI-compatible) |
| GET | /ws/chat | Bearer | WebSocket chat with streaming agent responses |
| POST | /webhook | Bearer | Legacy webhook endpoint |
| GET | /v1/openapi.json | None | OpenAPI 3.1 specification |
| GET | /v1/privacy/info | None | Privacy capabilities discovery (feature-gated) |
| POST | /v1/noise/handshake/step1 | None | Noise XX handshake step 1 (feature-gated) |
| POST | /v1/noise/handshake/step2 | None | Noise XX handshake step 2 (feature-gated) |
| POST | /v1/noise/handshake/ik | None | Noise IK handshake (feature-gated) |
| POST | /v1/relay/submit | None | Submit sealed envelope (relay mode, feature-gated) |
| GET | /v1/relay/poll/:routing_id | None | Poll sealed envelopes (relay mode, feature-gated) |
| GET | /v1/agents/:agent_id/stats | Bearer | Per-agent aggregated metrics (runs, cost, tokens, tool usage) |
| GET | /v1/topology | Bearer | Live agent topology snapshot (nodes + delegation edges) |
| GET | /v1/autopilot/proposals | Bearer | List autopilot proposals (feature-gated) |
| POST | /v1/autopilot/proposals/:id/approve | Bearer | Approve an autopilot proposal (feature-gated) |
| POST | /v1/autopilot/proposals/:id/reject | Bearer | Reject an autopilot proposal (feature-gated) |
| GET | /v1/autopilot/missions | Bearer | List autopilot missions (feature-gated) |
| GET | /v1/autopilot/missions/:id | Bearer | Get mission detail with steps (feature-gated) |
| GET | /v1/autopilot/triggers | Bearer | List autopilot triggers (feature-gated) |
| POST | /v1/autopilot/triggers/:id/toggle | Bearer | Enable/disable a trigger (feature-gated) |
| GET | /v1/autopilot/stats | Bearer | Daily spend, mission counts, agent activity (feature-gated) |

The gateway supports two authentication methods: bearer tokens obtained through pairing, and scoped API keys. The pairing flow works as follows:

  1. On first start, the gateway prints a one-time pairing code to the terminal.
  2. POST the pairing code to /pair to receive a bearer token.
  3. Include the bearer token in subsequent requests.
```sh
# Health check (no auth required)
curl http://127.0.0.1:42617/health
# → { "status": "ok" }
```

```sh
# Exchange pairing code for token
curl -X POST http://127.0.0.1:42617/pair \
  -H "X-Pairing-Code: <code-from-terminal>"
```

```sh
# Authenticated request
curl -X POST http://127.0.0.1:42617/v1/ping \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json"
```

API keys provide fine-grained RBAC for multi-tenant deployments. Each key carries a set of scopes:

| Scope | Grants access to |
| ----- | ---------------- |
| runs:read | Read runs, results, models, agents, subscribe to events |
| runs:write | Submit runs, chat, webhooks, ping |
| runs:manage | Cancel runs |
| admin | Emergency stop, key management |

Bearer tokens from pairing have full access (all scopes). API keys are scoped and persisted with encrypted-at-rest storage.

Paired tokens can optionally expire after a configurable TTL. Legacy tokens without timestamps remain valid for backward compatibility.

Send a message to the agent and receive a complete JSON response.

```sh
curl -X POST http://127.0.0.1:42617/api/chat \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the weather?", "context": ""}'
# → { "message": "I can help with that...", "tokens_used_estimate": 42 }
```

Returns 503 Service Unavailable if the gateway was started without agent configuration.

OpenAI-Compatible Completions (POST /v1/chat/completions)


Accepts the standard OpenAI chat completions format. Set stream: true for SSE streaming.

```sh
# Non-streaming
curl -X POST http://127.0.0.1:42617/v1/chat/completions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "hello"}]}'
```

```sh
# Streaming (SSE)
curl -X POST http://127.0.0.1:42617/v1/chat/completions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "hello"}], "stream": true}'
```

SSE events follow the OpenAI format:

data: {"id":"chatcmpl-...","choices":[{"index":0,"delta":{"content":"token"},"finish_reason":null}]}
data: [DONE]

The model field is passed through to the agent, allowing model override per request.
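A client can consume the stream by reading the response line by line and extracting delta text. A minimal Python sketch, assuming the OpenAI-style chunk shape shown above:

```python
# Hypothetical SSE line parser for the stream format above.
import json
from typing import Optional

def parse_sse_line(line: str) -> Optional[str]:
    """Return the delta text from one SSE line, or None for
    non-data lines, empty deltas, and the final [DONE] sentinel."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    chunk = json.loads(payload)
    delta = chunk["choices"][0]["delta"]
    return delta.get("content")  # absent on role/finish chunks
```

Feed every line of the response body through this function and concatenate the non-None results to reconstruct the full completion.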

Upgrade to a WebSocket connection for bidirectional streaming chat. Send a text message and receive streaming delta frames:

```
// Incoming delta
{"type": "delta", "delta": "partial response text"}

// Stream complete
{"type": "done"}

// Error
{"type": "error", "message": "description"}
```

List available models in OpenAI-compatible format.

```sh
curl http://127.0.0.1:42617/v1/models \
  -H "Authorization: Bearer <token>"
```

These endpoints are available when the gateway is built with the privacy Cargo feature and privacy is configured. See the Privacy Guide for details.

Discover gateway privacy capabilities before initiating a handshake.

```json
{
  "noise_enabled": true,
  "handshake_pattern": "XX",
  "public_key": "<base64-encoded X25519 public key>",
  "key_fingerprint": "a1b2c3d4e5f6a1b2",
  "sealed_envelopes_enabled": false,
  "relay_mode": false,
  "supported_patterns": ["XX", "IK"]
}
```

Two-step mutual authentication handshake:

  1. POST /v1/noise/handshake/step1 — Client sends {"client_message": "<base64>"}. Server returns {"server_message": "<base64>"}.
  2. POST /v1/noise/handshake/step2 — Client sends {"client_message": "<base64>"}. Server returns {"session_id": "<64-char hex>"}.

Single round-trip handshake when the client knows the server’s public key:

POST /v1/noise/handshake/ik — Client sends {"client_message": "<base64>", "server_public_key": "<base64>"}. Server returns {"server_message": "<base64>", "session_id": "<64-char hex>"}.

After the handshake, every request includes an X-Noise-Session: <session_id> header and an encrypted body. The gateway middleware transparently decrypts request bodies and encrypts response bodies.

Available when relay_mode = true in gateway config:

  • POST /v1/relay/submit — Submit a sealed envelope. Body: {"routing_id": "<64-char hex>", "payload": "<base64>", "nonce": "<base64 24-byte>", "ttl_secs": 300}. Returns HTTP 409 on replay (duplicate nonce).
  • GET /v1/relay/poll/:routing_id — Poll for envelopes addressed to a routing ID.

The relay strips identifying headers (X-Forwarded-For, X-Real-IP, Via) from all requests.
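The replay and TTL rules above can be illustrated with a toy in-memory mailbox (a hypothetical sketch, not the gateway's implementation):

```python
# Toy relay mailbox: duplicate nonces are rejected (the HTTP 409 case)
# and expired envelopes are dropped on poll.
import time
from collections import defaultdict

class RelayMailbox:
    def __init__(self):
        self._seen_nonces = set()
        self._boxes = defaultdict(list)  # routing_id -> [(expiry, payload)]

    def submit(self, routing_id: str, payload: str, nonce: str, ttl_secs: int = 300) -> int:
        """Return an HTTP-like status: 409 on nonce replay, else 200."""
        if nonce in self._seen_nonces:
            return 409  # replay: duplicate nonce
        self._seen_nonces.add(nonce)
        self._boxes[routing_id].append((time.monotonic() + ttl_secs, payload))
        return 200

    def poll(self, routing_id: str) -> list:
        """Drain and return all unexpired envelopes for a routing ID."""
        now = time.monotonic()
        return [p for (exp, p) in self._boxes.pop(routing_id, []) if exp > now]
```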

Agent Stats (GET /v1/agents/:agent_id/stats)


Returns aggregated metrics for a specific agent: total runs, status breakdown, cost, token usage, and tool call frequency.

```sh
curl http://127.0.0.1:42617/v1/agents/coder/stats \
  -H "Authorization: Bearer <token>"
```

```json
{
  "agent_id": "coder",
  "total_runs": 42,
  "running_count": 1,
  "completed_count": 38,
  "failed_count": 3,
  "total_cost_microdollars": 1250000,
  "total_tokens_used": 845000,
  "tool_usage": {
    "read_file": 120,
    "write_file": 45,
    "shell": 30,
    "web_search": 8
  }
}
```

Returns a live snapshot of the agent topology — agents as nodes with status and active run counts, and delegation links as edges between agents.

```sh
curl http://127.0.0.1:42617/v1/topology \
  -H "Authorization: Bearer <token>"
```

```json
{
  "nodes": [
    {
      "agent_id": "coordinator",
      "name": "Coordinator",
      "status": "running",
      "active_run_count": 2,
      "total_cost_microdollars": 500000
    },
    {
      "agent_id": "coder",
      "name": "Coder",
      "status": "running",
      "active_run_count": 1,
      "total_cost_microdollars": 250000
    }
  ],
  "edges": [
    {
      "from_agent_id": "coordinator",
      "to_agent_id": "coder",
      "run_id": "run-abc123",
      "edge_type": "delegation"
    }
  ]
}
```

Edges are derived from running jobs with parent_run_id — when a child run’s parent belongs to a different agent, a delegation edge is created.
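That rule can be sketched as a pure function, assuming hypothetical run records shaped like {run_id, agent_id, parent_run_id} (the gateway's internal schema may differ):

```python
# Hypothetical sketch of the delegation-edge rule: a child run whose
# parent run belongs to a *different* agent yields one edge.
def derive_edges(runs):
    by_id = {r["run_id"]: r for r in runs}
    edges = []
    for run in runs:
        parent = by_id.get(run.get("parent_run_id"))
        if parent and parent["agent_id"] != run["agent_id"]:
            edges.append({
                "from_agent_id": parent["agent_id"],
                "to_agent_id": run["agent_id"],
                "run_id": run["run_id"],
                "edge_type": "delegation",
            })
    return edges
```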

The /metrics endpoint exposes Prometheus-compatible metrics for monitoring:

  • gateway_requests_total{method, path, status} — Request counter by method, path, and status code
  • gateway_request_duration_seconds{method, path} — Request latency histogram
  • gateway_errors_total{error_type} — Error counter by structured error type
  • gateway_ws_connections_total — WebSocket connection counter
  • gateway_active_connections — Current active connection gauge

Provider metrics (emitted per LLM provider request):

  • agentzero_provider_requests_total{provider, model, status} — Request counter by provider, model, and status (success/error)
  • agentzero_provider_request_duration_seconds{provider, model} — Request latency histogram
  • agentzero_provider_errors_total{provider, model, error_type} — Error counter by type (e.g., http_429, http_500, transport)
  • agentzero_provider_tokens_total{provider, model, type} — Token usage counter (input/output)

Fallback metrics (when provider fallback chains are configured):

  • provider_fallback_total{from, to} — Fallback events by source and target provider

Privacy metrics (when privacy feature is enabled):

  • agentzero_noise_sessions_active — Active Noise sessions (gauge)
  • agentzero_noise_handshakes_total{result} — Handshake attempts by result (counter)
  • agentzero_relay_mailbox_envelopes — Envelopes in relay mailboxes (gauge)
  • agentzero_relay_submit_total — Total envelope submissions (counter)
  • agentzero_key_rotation_total{epoch} — Key rotation events (counter)
  • agentzero_privacy_encrypt_duration_seconds — Encrypt/decrypt latency (histogram)

Prometheus scrape config:

```yaml
scrape_configs:
  - job_name: agentzero-gateway
    static_configs:
      - targets: ['127.0.0.1:42617']
    metrics_path: /metrics
    scrape_interval: 15s
```

The /v1/models endpoint dynamically returns all models from the provider catalog. The response follows the OpenAI format:

```json
{
  "object": "list",
  "data": [
    { "id": "claude-sonnet-4-6", "object": "model", "owned_by": "anthropic" },
    { "id": "gpt-4o", "object": "model", "owned_by": "openai" }
  ]
}
```

All error responses use a structured JSON format:

```json
{
  "error": {
    "type": "auth_required",
    "message": "authentication required"
  }
}
```
| Error Type | HTTP Status | Description |
| ---------- | ----------- | ----------- |
| auth_required | 401 | No bearer token provided |
| auth_failed | 403 | Invalid token or pairing code |
| insufficient_scope | 403 | API key lacks required scope |
| not_found | 404 | Unknown endpoint or resource |
| agent_unavailable | 503 | Gateway started without agent config |
| agent_execution_failed | 500 | Agent runtime error |
| rate_limited | 429 | Rate limit exceeded |
| payload_too_large | 413 | Request body too large |
| bad_request | 400 | Malformed request |

The WebSocket endpoint (/ws/chat) includes production hardening:

  • Heartbeat — Server sends a ping every 30 seconds. If no pong is received within 60 seconds, the connection is closed.
  • Idle timeout — Connections with no messages for 5 minutes are automatically closed.
  • Message size limit — Messages larger than 2 MB are rejected.
  • Binary rejection — Binary WebSocket frames are rejected with an error JSON frame.

The gateway includes built-in middleware for production hardening:

Rate Limiting — Sliding window counter that rejects excess requests with 429 Too Many Requests. Default: 600 requests per 60-second window (10 req/s). Set rate_limit_max = 0 in config to disable.

Per-Identity Rate Limiting — When rate_limit_per_identity is set (default: 0 = disabled), each API key gets its own rate limit bucket. Identity is extracted from the Authorization header: API keys use key:<prefix>, bearer tokens use "bearer", unauthenticated requests use "_anonymous". Expired identity buckets are garbage-collected automatically.
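The per-identity sliding-window behavior described above can be sketched as follows (a hypothetical illustration, not the gateway's actual code):

```python
# Toy per-identity sliding-window limiter: each identity keeps its own
# timestamp window; requests beyond the cap should be answered 429.
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    def __init__(self, max_requests: int = 600, window_secs: float = 60.0):
        self.max = max_requests
        self.window = window_secs
        self._hits = defaultdict(deque)  # identity -> request timestamps

    def allow(self, identity: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[identity]
        while hits and hits[0] <= now - self.window:
            hits.popleft()  # drop requests that slid out of the window
        if len(hits) >= self.max:
            return False  # caller should respond 429 Too Many Requests
        hits.append(now)
        return True

    def remaining(self, identity: str) -> int:
        """A value suitable for the X-RateLimit-Remaining header."""
        return max(0, self.max - len(self._hits[identity]))
```

Identities here would be the `key:<prefix>`, `bearer`, or `_anonymous` strings described above.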

All rate-limited responses include these headers:

| Header | Description |
| ------ | ----------- |
| X-RateLimit-Limit | Maximum requests allowed in the window |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |

These headers are present on both successful (200) and rate-limited (429) responses.

Request Size Limits — Rejects requests with Content-Length exceeding the configured maximum (default: 1 MB) with 413 Payload Too Large.

CORS — Configurable origin allowlist for browser clients. Supports exact origin matching and wildcard (*). Handles preflight OPTIONS requests automatically.

HSTS — When TLS is enabled, Strict-Transport-Security: max-age=31536000; includeSubDomains is added to all responses.

Channel Validation — Webhook channel names are validated to contain only alphanumeric characters, hyphens, and underscores (1–64 chars).
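The stated channel-name rule reduces to a single regular expression (a sketch of the rule, not the gateway's code):

```python
# Channel names: 1-64 characters from [A-Za-z0-9_-], per the rule above.
import re

CHANNEL_RE = re.compile(r"^[A-Za-z0-9_-]{1,64}$")

def valid_channel(name: str) -> bool:
    return bool(CHANNEL_RE.fullmatch(name))
```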

Request Metrics — All requests are automatically instrumented with Prometheus counters and histograms.

Graceful Shutdown — On SIGTERM or SIGINT, the gateway drains active connections before exiting.

The full OpenAPI 3.1 spec is available at /v1/openapi.json. It covers all 20+ endpoints with request/response schemas and Bearer auth security scheme.

```sh
curl http://127.0.0.1:42617/v1/openapi.json | jq .info
```
```toml
[gateway]
host = "127.0.0.1"        # bind interface
port = 42617              # bind port
require_pairing = true    # require OTP pairing
allow_public_bind = false # allow non-loopback bind
allow_insecure = false    # must be true to run production mode without TLS

# TLS configuration (requires --features tls)
# [gateway.tls]
# cert_path = "/path/to/cert.pem"
# key_path = "/path/to/key.pem"

[gateway.node_control]
enabled = false
# auth_token = "your-token"
allowed_node_ids = []
```

Run the gateway as a background process with automatic local AI service discovery, PID file management, and log rotation:

```sh
# Start in background
agentzero daemon start --host 127.0.0.1 --port 42617

# Check status (includes PID, uptime, address)
agentzero daemon status
agentzero daemon status --json

# Stop the daemon
agentzero daemon stop

# Run in foreground (for debugging or systemd)
agentzero daemon start --foreground
```

Daemon logs are written to {data_dir}/daemon.log with automatic rotation (10 MB max, 5 rotated files).

Install as a system service for automatic startup:

```sh
# Auto-detect init system (systemd or openrc)
agentzero service install

# Explicit init system
agentzero service --service-init systemd install
agentzero service --service-init openrc install

# Lifecycle
agentzero service start
agentzero service status
agentzero service restart
agentzero service stop
agentzero service uninstall
```

systemd installs as a user-level unit (no root required). OpenRC installs system-wide (requires sudo).