
Gateway API

The AgentZero gateway exposes a localhost HTTP API for programmatic access to the agent runtime.

```sh
agentzero gateway
agentzero gateway --host 127.0.0.1 --port 42617
agentzero gateway --new-pairing
```
| Method | Path | Auth | Description |
| ------ | ---- | ---- | ----------- |
| GET | / | None | Dashboard HTML page |
| GET | /health | None | Service health probe |
| GET | /metrics | None | Prometheus-compatible metrics |
| POST | /pair | Pairing code | Exchange pairing code for bearer token |
| POST | /v1/ping | Bearer | Echo test endpoint |
| POST | /v1/webhook/:channel | Bearer | Channel message dispatch |
| POST | /api/chat | Bearer | Chat with agent (JSON response) |
| POST | /v1/chat/completions | Bearer | OpenAI-compatible chat completions (supports SSE streaming) |
| GET | /v1/models | Bearer | List available models (OpenAI-compatible) |
| GET | /ws/chat | Bearer | WebSocket chat with streaming agent responses |
| POST | /webhook | Bearer | Legacy webhook endpoint |
| GET | /v1/openapi.json | None | OpenAPI 3.1 specification |
| GET | /v1/privacy/info | None | Privacy capabilities discovery (feature-gated) |
| POST | /v1/noise/handshake/step1 | None | Noise XX handshake step 1 (feature-gated) |
| POST | /v1/noise/handshake/step2 | None | Noise XX handshake step 2 (feature-gated) |
| POST | /v1/noise/handshake/ik | None | Noise IK handshake (feature-gated) |
| POST | /v1/relay/submit | None | Submit sealed envelope (relay mode, feature-gated) |
| GET | /v1/relay/poll/:routing_id | None | Poll sealed envelopes (relay mode, feature-gated) |
| GET | /v1/agents/:agent_id/stats | Bearer | Per-agent aggregated metrics (runs, cost, tokens, tool usage) |
| GET | /v1/topology | Bearer | Live agent topology snapshot (nodes + delegation edges) |
| GET | /v1/autopilot/proposals | Bearer | List autopilot proposals (feature-gated) |
| POST | /v1/autopilot/proposals/:id/approve | Bearer | Approve an autopilot proposal (feature-gated) |
| POST | /v1/autopilot/proposals/:id/reject | Bearer | Reject an autopilot proposal (feature-gated) |
| GET | /v1/autopilot/missions | Bearer | List autopilot missions (feature-gated) |
| GET | /v1/autopilot/missions/:id | Bearer | Get mission detail with steps (feature-gated) |
| GET | /v1/autopilot/triggers | Bearer | List autopilot triggers (feature-gated) |
| POST | /v1/autopilot/triggers/:id/toggle | Bearer | Enable/disable a trigger (feature-gated) |
| GET | /v1/autopilot/stats | Bearer | Daily spend, mission counts, agent activity (feature-gated) |

The gateway supports two authentication methods: bearer tokens obtained through pairing, and scoped API keys. The pairing flow works as follows:

  1. On first start, the gateway prints a one-time pairing code to the terminal.
  2. POST the pairing code to /pair to receive a bearer token.
  3. Include the bearer token in subsequent requests.
```sh
# Health check (no auth required)
curl http://127.0.0.1:42617/health
# → { "status": "ok" }
```

```sh
# Exchange pairing code for token
curl -X POST http://127.0.0.1:42617/pair \
  -H "X-Pairing-Code: <code-from-terminal>"
```

```sh
# Authenticated request
curl -X POST http://127.0.0.1:42617/v1/ping \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json"
```

API keys provide fine-grained RBAC for multi-tenant deployments. Each key carries a set of scopes:

| Scope | Grants access to |
| ----- | ---------------- |
| runs:read | Read runs, results, models, agents, subscribe to events |
| runs:write | Submit runs, chat, webhooks, ping |
| runs:manage | Cancel runs |
| admin | Emergency stop, key management |

Bearer tokens from pairing have full access (all scopes). API keys are scoped and persisted with encrypted-at-rest storage.

Paired tokens can optionally expire after a configurable TTL. Legacy tokens without timestamps remain valid for backward compatibility.

Send a message to the agent and receive a complete JSON response.

```sh
curl -X POST http://127.0.0.1:42617/api/chat \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the weather?", "context": ""}'
# → { "message": "I can help with that...", "tokens_used_estimate": 42 }
```

Returns 503 Service Unavailable if the gateway was started without agent configuration.

OpenAI-Compatible Completions (POST /v1/chat/completions)


Accepts the standard OpenAI chat completions format. Set stream: true for SSE streaming.

```sh
# Non-streaming
curl -X POST http://127.0.0.1:42617/v1/chat/completions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "hello"}]}'
```

```sh
# Streaming (SSE)
curl -X POST http://127.0.0.1:42617/v1/chat/completions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "hello"}], "stream": true}'
```

SSE events follow the OpenAI format:

data: {"id":"chatcmpl-...","choices":[{"index":0,"delta":{"content":"token"},"finish_reason":null}]}
data: [DONE]

The model field is passed through to the agent, allowing model override per request.
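A client can consume the stream by reading the response line by line and extracting delta text. A minimal Python sketch, assuming the OpenAI-style chunk shape shown above:

```python
# Hypothetical SSE line parser for the stream format above.
import json
from typing import Optional

def parse_sse_line(line: str) -> Optional[str]:
    """Return the delta text from one SSE line, or None for
    non-data lines, empty deltas, and the final [DONE] sentinel."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):].strip()
    if payload == "[DONE]":
        return None
    chunk = json.loads(payload)
    delta = chunk["choices"][0]["delta"]
    return delta.get("content")  # absent on role/finish chunks
```

Feed every line of the response body through this function and concatenate the non-None results to reconstruct the full completion.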

Upgrade to a WebSocket connection for bidirectional streaming chat. Send a text message and receive streaming delta frames:

```
// Incoming delta
{"type": "delta", "delta": "partial response text"}

// Stream complete
{"type": "done"}

// Error
{"type": "error", "message": "description"}
```

List available models in OpenAI-compatible format.

```sh
curl http://127.0.0.1:42617/v1/models \
  -H "Authorization: Bearer <token>"
```

These endpoints are available when the gateway is built with the privacy Cargo feature and privacy is configured. See the Privacy Guide for details.

Discover gateway privacy capabilities before initiating a handshake.

```json
{
  "noise_enabled": true,
  "handshake_pattern": "XX",
  "public_key": "<base64-encoded X25519 public key>",
  "key_fingerprint": "a1b2c3d4e5f6a1b2",
  "sealed_envelopes_enabled": false,
  "relay_mode": false,
  "supported_patterns": ["XX", "IK"]
}
```

Two-step mutual authentication handshake:

  1. POST /v1/noise/handshake/step1 — Client sends {"client_message": "<base64>"}. Server returns {"server_message": "<base64>"}.
  2. POST /v1/noise/handshake/step2 — Client sends {"client_message": "<base64>"}. Server returns {"session_id": "<64-char hex>"}.

Single round-trip handshake when the client knows the server’s public key:

POST /v1/noise/handshake/ik — Client sends {"client_message": "<base64>", "server_public_key": "<base64>"}. Server returns {"server_message": "<base64>", "session_id": "<64-char hex>"}.

After the handshake, every request includes an X-Noise-Session: <session_id> header and an encrypted body. The gateway middleware transparently decrypts request bodies and encrypts response bodies.

Available when relay_mode = true in gateway config:

  • POST /v1/relay/submit — Submit a sealed envelope. Body: {"routing_id": "<64-char hex>", "payload": "<base64>", "nonce": "<base64 24-byte>", "ttl_secs": 300}. Returns HTTP 409 on replay (duplicate nonce).
  • GET /v1/relay/poll/:routing_id — Poll for envelopes addressed to a routing ID.

The relay strips identifying headers (X-Forwarded-For, X-Real-IP, Via) from all requests.
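The replay and TTL rules above can be illustrated with a toy in-memory mailbox (a hypothetical sketch, not the gateway's implementation):

```python
# Toy relay mailbox: duplicate nonces are rejected (the HTTP 409 case)
# and expired envelopes are dropped on poll.
import time
from collections import defaultdict

class RelayMailbox:
    def __init__(self):
        self._seen_nonces = set()
        self._boxes = defaultdict(list)  # routing_id -> [(expiry, payload)]

    def submit(self, routing_id: str, payload: str, nonce: str, ttl_secs: int = 300) -> int:
        """Return an HTTP-like status: 409 on nonce replay, else 200."""
        if nonce in self._seen_nonces:
            return 409  # replay: duplicate nonce
        self._seen_nonces.add(nonce)
        self._boxes[routing_id].append((time.monotonic() + ttl_secs, payload))
        return 200

    def poll(self, routing_id: str) -> list:
        """Drain and return all unexpired envelopes for a routing ID."""
        now = time.monotonic()
        return [p for (exp, p) in self._boxes.pop(routing_id, []) if exp > now]
```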

Agent Stats (GET /v1/agents/:agent_id/stats)


Returns aggregated metrics for a specific agent: total runs, status breakdown, cost, token usage, and tool call frequency.

```sh
curl http://127.0.0.1:42617/v1/agents/coder/stats \
  -H "Authorization: Bearer <token>"
```

```json
{
  "agent_id": "coder",
  "total_runs": 42,
  "running_count": 1,
  "completed_count": 38,
  "failed_count": 3,
  "total_cost_microdollars": 1250000,
  "total_tokens_used": 845000,
  "tool_usage": {
    "read_file": 120,
    "write_file": 45,
    "shell": 30,
    "web_search": 8
  }
}
```

Returns a live snapshot of the agent topology — agents as nodes with status and active run counts, and delegation links as edges between agents.

```sh
curl http://127.0.0.1:42617/v1/topology \
  -H "Authorization: Bearer <token>"
```

```json
{
  "nodes": [
    {
      "agent_id": "coordinator",
      "name": "Coordinator",
      "status": "running",
      "active_run_count": 2,
      "total_cost_microdollars": 500000
    },
    {
      "agent_id": "coder",
      "name": "Coder",
      "status": "running",
      "active_run_count": 1,
      "total_cost_microdollars": 250000
    }
  ],
  "edges": [
    {
      "from_agent_id": "coordinator",
      "to_agent_id": "coder",
      "run_id": "run-abc123",
      "edge_type": "delegation"
    }
  ]
}
```

Edges are derived from running jobs with parent_run_id — when a child run’s parent belongs to a different agent, a delegation edge is created.
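That rule can be sketched as a pure function, assuming hypothetical run records shaped like {run_id, agent_id, parent_run_id} (the gateway's internal schema may differ):

```python
# Hypothetical sketch of the delegation-edge rule: a child run whose
# parent run belongs to a *different* agent yields one edge.
def derive_edges(runs):
    by_id = {r["run_id"]: r for r in runs}
    edges = []
    for run in runs:
        parent = by_id.get(run.get("parent_run_id"))
        if parent and parent["agent_id"] != run["agent_id"]:
            edges.append({
                "from_agent_id": parent["agent_id"],
                "to_agent_id": run["agent_id"],
                "run_id": run["run_id"],
                "edge_type": "delegation",
            })
    return edges
```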

The /metrics endpoint exposes Prometheus-compatible metrics for monitoring:

  • gateway_requests_total{method, path, status} — Request counter by method, path, and status code
  • gateway_request_duration_seconds{method, path} — Request latency histogram
  • gateway_errors_total{error_type} — Error counter by structured error type
  • gateway_ws_connections_total — WebSocket connection counter
  • gateway_active_connections — Current active connection gauge

Provider metrics (emitted per LLM provider request):

  • agentzero_provider_requests_total{provider, model, status} — Request counter by provider, model, and status (success/error)
  • agentzero_provider_request_duration_seconds{provider, model} — Request latency histogram
  • agentzero_provider_errors_total{provider, model, error_type} — Error counter by type (e.g., http_429, http_500, transport)
  • agentzero_provider_tokens_total{provider, model, type} — Token usage counter (input/output)

Fallback metrics (when provider fallback chains are configured):

  • provider_fallback_total{from, to} — Fallback events by source and target provider

Privacy metrics (when privacy feature is enabled):

  • agentzero_noise_sessions_active — Active Noise sessions (gauge)
  • agentzero_noise_handshakes_total{result} — Handshake attempts by result (counter)
  • agentzero_relay_mailbox_envelopes — Envelopes in relay mailboxes (gauge)
  • agentzero_relay_submit_total — Total envelope submissions (counter)
  • agentzero_key_rotation_total{epoch} — Key rotation events (counter)
  • agentzero_privacy_encrypt_duration_seconds — Encrypt/decrypt latency (histogram)

Prometheus scrape config:

```yaml
scrape_configs:
  - job_name: agentzero-gateway
    static_configs:
      - targets: ['127.0.0.1:42617']
    metrics_path: /metrics
    scrape_interval: 15s
```

The /v1/models endpoint dynamically returns all models from the provider catalog. The response follows the OpenAI format:

```json
{
  "object": "list",
  "data": [
    { "id": "claude-sonnet-4-6", "object": "model", "owned_by": "anthropic" },
    { "id": "gpt-4o", "object": "model", "owned_by": "openai" }
  ]
}
```

All error responses use a structured JSON format:

```json
{
  "error": {
    "type": "auth_required",
    "message": "authentication required"
  }
}
```
| Error Type | HTTP Status | Description |
| ---------- | ----------- | ----------- |
| auth_required | 401 | No bearer token provided |
| auth_failed | 403 | Invalid token or pairing code |
| insufficient_scope | 403 | API key lacks required scope |
| not_found | 404 | Unknown endpoint or resource |
| agent_unavailable | 503 | Gateway started without agent config |
| agent_execution_failed | 500 | Agent runtime error |
| rate_limited | 429 | Rate limit exceeded |
| payload_too_large | 413 | Request body too large |
| bad_request | 400 | Malformed request |

The WebSocket endpoint (/ws/chat) includes production hardening:

  • Heartbeat — Server sends a ping every 30 seconds. If no pong is received within 60 seconds, the connection is closed.
  • Idle timeout — Connections with no messages for 5 minutes are automatically closed.
  • Message size limit — Messages larger than 2 MB are rejected.
  • Binary rejection — Binary WebSocket frames are rejected with an error JSON frame.

The gateway includes built-in middleware for production hardening:

Rate Limiting — Sliding window counter that rejects excess requests with 429 Too Many Requests. Default: 600 requests per 60-second window (10 req/s). Set rate_limit_max = 0 in config to disable.

Per-Identity Rate Limiting — When rate_limit_per_identity is set (default: 0 = disabled), each API key gets its own rate limit bucket. Identity is extracted from the Authorization header: API keys use key:<prefix>, bearer tokens use "bearer", unauthenticated requests use "_anonymous". Expired identity buckets are garbage-collected automatically.
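The per-identity sliding-window behavior described above can be sketched as follows (a hypothetical illustration, not the gateway's actual code):

```python
# Toy per-identity sliding-window limiter: each identity keeps its own
# timestamp window; requests beyond the cap should be answered 429.
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    def __init__(self, max_requests: int = 600, window_secs: float = 60.0):
        self.max = max_requests
        self.window = window_secs
        self._hits = defaultdict(deque)  # identity -> request timestamps

    def allow(self, identity: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[identity]
        while hits and hits[0] <= now - self.window:
            hits.popleft()  # drop requests that slid out of the window
        if len(hits) >= self.max:
            return False  # caller should respond 429 Too Many Requests
        hits.append(now)
        return True

    def remaining(self, identity: str) -> int:
        """A value suitable for the X-RateLimit-Remaining header."""
        return max(0, self.max - len(self._hits[identity]))
```

Identities here would be the `key:<prefix>`, `bearer`, or `_anonymous` strings described above.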

All rate-limited responses include these headers:

| Header | Description |
| ------ | ----------- |
| X-RateLimit-Limit | Maximum requests allowed in the window |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |

These headers are present on both successful (200) and rate-limited (429) responses.

Request Size Limits — Rejects requests with Content-Length exceeding the configured maximum (default: 1 MB) with 413 Payload Too Large.

CORS — Configurable origin allowlist for browser clients. Supports exact origin matching and wildcard (*). Handles preflight OPTIONS requests automatically.

HSTS — When TLS is enabled, Strict-Transport-Security: max-age=31536000; includeSubDomains is added to all responses.

Channel Validation — Webhook channel names are validated to contain only alphanumeric characters, hyphens, and underscores (1–64 chars).
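The stated channel-name rule reduces to a single regular expression (a sketch of the rule, not the gateway's code):

```python
# Channel names: 1-64 characters from [A-Za-z0-9_-], per the rule above.
import re

CHANNEL_RE = re.compile(r"^[A-Za-z0-9_-]{1,64}$")

def valid_channel(name: str) -> bool:
    return bool(CHANNEL_RE.fullmatch(name))
```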

Request Metrics — All requests are automatically instrumented with Prometheus counters and histograms.

Graceful Shutdown — On SIGTERM or SIGINT, the gateway drains active connections before exiting.

The full OpenAPI 3.1 spec is available at /v1/openapi.json. It covers all 20+ endpoints with request/response schemas and Bearer auth security scheme.

```sh
curl http://127.0.0.1:42617/v1/openapi.json | jq .info
```
```toml
[gateway]
host = "127.0.0.1"        # bind interface
port = 42617              # bind port
require_pairing = true    # require OTP pairing
allow_public_bind = false # allow non-loopback bind
allow_insecure = false    # must be true to run production mode without TLS

# TLS configuration (requires --features tls)
# [gateway.tls]
# cert_path = "/path/to/cert.pem"
# key_path = "/path/to/key.pem"

[gateway.node_control]
enabled = false
# auth_token = "your-token"
allowed_node_ids = []
```

Run the gateway as a background process with automatic local AI service discovery, PID file management, and log rotation:

```sh
# Start in background
agentzero daemon start --host 127.0.0.1 --port 42617

# Check status (includes PID, uptime, address)
agentzero daemon status
agentzero daemon status --json

# Stop the daemon
agentzero daemon stop

# Run in foreground (for debugging or systemd)
agentzero daemon start --foreground
```

Daemon logs are written to {data_dir}/daemon.log with automatic rotation (10 MB max, 5 rotated files).

Install as a system service for automatic startup:

```sh
# Auto-detect init system (systemd or openrc)
agentzero service install

# Explicit init system
agentzero service --service-init systemd install
agentzero service --service-init openrc install

# Lifecycle
agentzero service start
agentzero service status
agentzero service restart
agentzero service stop
agentzero service uninstall
```

systemd installs as a user-level unit (no root required). OpenRC installs system-wide (requires sudo).