Model Providers

Supported Providers

Provider	API Type	Default Port	Flag
Ollama	Native `/api/chat`	11434	`--provider ollama`
llama.cpp	OpenAI `/v1/chat/completions`	8080	`--provider llama-cpp`
vLLM	OpenAI `/v1/chat/completions`	8000	`--provider vllm`
LM Studio	OpenAI `/v1/chat/completions`	1234	`--provider lm-studio`
Anthropic Claude	Messages `/v1/messages`	N/A (remote)	`--provider anthropic`
Custom	Any (configurable via `driver`)	Any	`"type": "custom"` in models.json

The OpenAI-compatible provider has been verified against Groq, Together, and DeepSeek in addition to the servers listed above. Any service implementing /v1/chat/completions should work.

Provider Routing

AgentZero tries providers in priority order, following these rules:

Local providers are always attempted first.
Remote providers are used only if the data classification allows it.
The router falls back automatically when the primary provider is unavailable.
Transient failures are retried with backoff.
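
The routing rules above can be sketched as a small loop. This is a minimal, synchronous illustration and not AgentZero's actual router: `Candidate`, `CallError`, `route`, the attempt limit, and the doubling delay are all assumptions made for the example.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical error type for a single provider call.
#[derive(Debug, Clone, PartialEq)]
pub enum CallError {
    /// Worth retrying (for example a timeout or a dropped connection).
    Transient(String),
    /// Not worth retrying; the router moves on to the next provider.
    Unavailable(String),
}

/// Hypothetical stand-in for a configured provider.
pub struct Candidate<'a> {
    pub name: &'a str,
    pub is_local: bool,
    pub call: Box<dyn FnMut() -> Result<String, CallError> + 'a>,
}

/// Tries local candidates first, then remote ones if `remote_allowed`.
/// Returns the name of the provider that answered together with its reply,
/// or the list of failures if every eligible candidate failed.
pub fn route(
    mut candidates: Vec<Candidate<'_>>,
    remote_allowed: bool,
    max_attempts: u32,
    base_delay: Duration,
) -> Result<(String, String), Vec<(String, CallError)>> {
    // `sort_by_key` is stable, so the configured order is kept inside each
    // group; `false` sorts before `true`, so local candidates come first.
    candidates.sort_by_key(|c| !c.is_local);
    let mut failures = Vec::new();
    for cand in candidates.iter_mut() {
        if !cand.is_local && !remote_allowed {
            continue; // classification forbids remote providers
        }
        let mut delay = base_delay;
        for attempt in 1..=max_attempts {
            match (cand.call)() {
                Ok(text) => return Ok((cand.name.to_string(), text)),
                // Transient failure with attempts left: wait, then retry.
                Err(CallError::Transient(_)) if attempt < max_attempts => {
                    sleep(delay);
                    delay *= 2; // exponential backoff
                }
                // Hard failure or retries exhausted: fall back to the next one.
                Err(err) => {
                    failures.push((cand.name.to_string(), err));
                    break;
                }
            }
        }
    }
    Err(failures)
}
```

Calling `route` with a remote candidate listed before a local one still tries the local one first, and a remote candidate is skipped entirely when `remote_allowed` is false.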

Classification-Based Routing

Classification	Local	Remote
Public	Allowed	Allowed
Internal	Allowed	Policy check
Private	Allowed	Requires redaction
PII	Allowed	Redacted then allowed
Secret	Allowed	Denied
Credential	Allowed	Denied
Unknown	Allowed	Denied
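
The table above maps directly onto a `match`. The following is a sketch only: the `Classification` and `RemoteDecision` enums and the function names are hypothetical, chosen to mirror the table rows one for one, and are not taken from the AgentZero source.

```rust
/// Hypothetical mirror of the classification column in the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Classification {
    Public,
    Internal,
    Private,
    Pii,
    Secret,
    Credential,
    Unknown,
}

/// Hypothetical mirror of the "Remote" column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RemoteDecision {
    Allowed,
    PolicyCheck,
    RequiresRedaction,
    RedactedThenAllowed,
    Denied,
}

/// Every classification may be sent to a local provider.
pub fn local_allowed(_classification: Classification) -> bool {
    true
}

/// Looks up what has to happen before data goes to a remote provider.
pub fn remote_decision(classification: Classification) -> RemoteDecision {
    match classification {
        Classification::Public => RemoteDecision::Allowed,
        Classification::Internal => RemoteDecision::PolicyCheck,
        Classification::Private => RemoteDecision::RequiresRedaction,
        Classification::Pii => RemoteDecision::RedactedThenAllowed,
        Classification::Secret
        | Classification::Credential
        | Classification::Unknown => RemoteDecision::Denied,
    }
}
```

Because the `match` has no wildcard arm, adding a new classification variant fails to compile until a remote policy is chosen for it, which keeps new variants from being routed remotely by accident.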

Custom Server URL

Pass --url to override the provider's default server address:

az chat --provider llama-cpp --url http://gpu-box:8080
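
One way to picture how the flag interacts with the default ports listed under Supported Providers is the sketch below. The helper names `default_port` and `resolve_base_url` are hypothetical, and the choice of `http://localhost` as the default host is an assumption based on the models.json examples in this document.

```rust
/// Default port for each local provider flag, taken from the table under
/// Supported Providers. Remote or unknown providers have no default port.
pub fn default_port(provider: &str) -> Option<u16> {
    match provider {
        "ollama" => Some(11434),
        "llama-cpp" => Some(8080),
        "vllm" => Some(8000),
        "lm-studio" => Some(1234),
        _ => None,
    }
}

/// Hypothetical resolution order: an explicit URL wins, otherwise fall back
/// to localhost on the provider's default port (assumed default host).
pub fn resolve_base_url(provider: &str, url_flag: Option<&str>) -> Option<String> {
    if let Some(url) = url_flag {
        // Drop trailing slashes so endpoint paths can be appended cleanly.
        return Some(url.trim_end_matches('/').to_string());
    }
    default_port(provider).map(|port| format!("http://localhost:{port}"))
}
```

With these helpers, `resolve_base_url("llama-cpp", Some("http://gpu-box:8080"))` yields the explicit URL, while `resolve_base_url("ollama", None)` yields `http://localhost:11434`.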

models.json Configuration

Providers are configured in .agentzero/models.json. The ProviderRouter loads this file at startup and instantiates providers dynamically.

{
  "providers": [
    {
      "name": "ollama-local",
      "provider_type": "ollama",
      "base_url": "http://localhost:11434",
      "default_model": "llama3.2",
      "is_local": true
    },
    {
      "name": "lm-studio",
      "provider_type": "openai-compatible",
      "base_url": "http://localhost:1234",
      "default_model": "gemma-4-12b",
      "is_local": true
    },
    {
      "name": "remote-vllm",
      "provider_type": "openai-compatible",
      "base_url": "https://gpu-box.internal:8000",
      "default_model": "meta-llama/Llama-3.2-70B",
      "is_local": false,
      "api_key": "vault://vllm/api-key"
    },
    {
      "name": "claude",
      "provider_type": "anthropic",
      "base_url": "https://api.anthropic.com",
      "default_model": "claude-sonnet-4-20250514",
      "is_local": false,
      "api_key": "vault://anthropic/api-key"
    }
  ],
  "default_provider": "ollama-local"
}

See the Model Configuration guide for the full schema and examples.
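
As a rough picture of the file's shape, the entries above could be modelled with plain Rust structs and checked before use. This is a sketch under stated assumptions: `ProviderEntry`, `ModelsFile`, and `validate` are hypothetical names (not the real `ModelsConfig`), parsing is left out to keep the example dependency-free, and the three checks are illustrative rather than rules the project is known to enforce.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory form of one entry in the "providers" array.
#[derive(Debug, Clone, PartialEq)]
pub struct ProviderEntry {
    pub name: String,
    pub provider_type: String,
    pub base_url: String,
    pub default_model: String,
    pub is_local: bool,
    /// Absent for the local entries shown above.
    pub api_key: Option<String>,
}

/// Hypothetical in-memory form of the whole file.
#[derive(Debug, Clone, PartialEq)]
pub struct ModelsFile {
    pub providers: Vec<ProviderEntry>,
    pub default_provider: String,
}

/// Returns a list of human-readable problems; an empty list means the
/// configuration passed these (assumed) sanity checks.
pub fn validate(cfg: &ModelsFile) -> Vec<String> {
    let mut problems = Vec::new();
    let mut seen = HashSet::new();
    for p in &cfg.providers {
        // `insert` returns false when the name was already present.
        if !seen.insert(p.name.as_str()) {
            problems.push(format!("duplicate provider name: {}", p.name));
        }
        if !(p.base_url.starts_with("http://") || p.base_url.starts_with("https://")) {
            problems.push(format!("{}: base_url must start with http:// or https://", p.name));
        }
    }
    if !seen.contains(cfg.default_provider.as_str()) {
        problems.push(format!(
            "default_provider '{}' does not match any provider name",
            cfg.default_provider
        ));
    }
    problems
}
```

Collecting every problem instead of stopping at the first lets a user fix a broken file in one pass.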

Custom Providers

The custom provider type lets you add any compatible endpoint via models.json without code changes. Set the driver field to select which client implementation to use:

Driver	Protocol	Use For
`openai-compatible` (default)	`/v1/chat/completions`	Together AI, Groq, Fireworks, DeepSeek, any OpenAI-compatible API
`ollama`	`/api/chat`	Self-hosted Ollama instances on custom ports
`anthropic`	`/v1/messages`	Anthropic-compatible endpoints

For example, to add Together AI with no code changes:

{
  "providers": [
    {
      "name": "together-ai",
      "type": "custom",
      "driver": "openai-compatible",
      "url": "https://api.together.xyz/v1",
      "default_model": "meta-llama/Llama-3-70b",
      "is_local": false,
      "api_key": "vault://together/key"
    }
  ]
}

If driver is omitted, it defaults to openai-compatible.
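
The driver table and its default can be summarised in a few lines of Rust. This is an illustrative sketch, not the project's code: the `Driver` enum and its methods are hypothetical names, and rejecting unknown driver strings with an error is an assumption about how such a field might be handled.

```rust
/// Hypothetical enum mirroring the driver table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum Driver {
    /// Used when the "driver" field is omitted.
    #[default]
    OpenAiCompatible,
    Ollama,
    Anthropic,
}

impl Driver {
    /// Maps the optional "driver" field to a variant.
    /// Treating an unknown value as an error is an assumption.
    pub fn from_field(field: Option<&str>) -> Result<Self, String> {
        match field {
            None => Ok(Driver::default()),
            Some("openai-compatible") => Ok(Driver::OpenAiCompatible),
            Some("ollama") => Ok(Driver::Ollama),
            Some("anthropic") => Ok(Driver::Anthropic),
            Some(other) => Err(format!("unknown driver: {other}")),
        }
    }

    /// Protocol path spoken by each driver, as listed in the table.
    pub fn protocol_path(self) -> &'static str {
        match self {
            Driver::OpenAiCompatible => "/v1/chat/completions",
            Driver::Ollama => "/api/chat",
            Driver::Anthropic => "/v1/messages",
        }
    }
}
```

For instance, `Driver::from_field(None)` returns `Ok(Driver::OpenAiCompatible)`, matching the documented default.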

ModelProvider Trait

All providers implement the ModelProvider trait:

#[async_trait]
pub trait ModelProvider: Send + Sync {
    async fn chat_with_tools(
        &self,
        messages: &[ChatMessage],
        tools: &[ToolDef],
    ) -> Result<ChatResponse>;

    async fn chat_streaming(
        &self,
        messages: &[ChatMessage],
        tools: &[ToolDef],
        tx: mpsc::Sender<StreamEvent>,
    ) -> Result<()>;

    async fn health_check(&self) -> Result<bool>;
    fn model_name(&self) -> &str;
}

OllamaProvider, OpenAICompatProvider, and AnthropicProvider all implement this trait. For standard OpenAI-compatible endpoints, set "type": "custom" in models.json, which requires no code changes. For providers with non-standard APIs, implement the ModelProvider trait in Rust.
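
To give a feel for what implementing a provider involves, here is a deliberately simplified sketch. It is not the real trait: `SyncModelProvider` is a synchronous, trimmed-down stand-in (no streaming, no `async_trait`), the `ChatMessage`, `ToolDef`, and `ChatResponse` fields are guesses, the error type is a plain `String`, and `EchoProvider` is a hypothetical test double that never touches the network.

```rust
/// Hypothetical minimal message type; the real fields may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct ChatMessage {
    pub role: String,
    pub content: String,
}

/// Hypothetical minimal tool definition.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolDef {
    pub name: String,
}

/// Hypothetical minimal response type.
#[derive(Debug, Clone, PartialEq)]
pub struct ChatResponse {
    pub content: String,
}

/// Synchronous stand-in for ModelProvider, for illustration only.
pub trait SyncModelProvider {
    fn chat_with_tools(
        &self,
        messages: &[ChatMessage],
        tools: &[ToolDef],
    ) -> Result<ChatResponse, String>;
    fn health_check(&self) -> Result<bool, String>;
    fn model_name(&self) -> &str;
}

/// Test double that echoes the last message back, tagged with its model name.
pub struct EchoProvider {
    model: String,
}

impl EchoProvider {
    pub fn new(model: &str) -> Self {
        Self { model: model.to_string() }
    }
}

impl SyncModelProvider for EchoProvider {
    fn chat_with_tools(
        &self,
        messages: &[ChatMessage],
        _tools: &[ToolDef], // tools are ignored by this double
    ) -> Result<ChatResponse, String> {
        let last = messages
            .last()
            .ok_or_else(|| "no messages supplied".to_string())?;
        Ok(ChatResponse {
            content: format!("[{}] {}", self.model, last.content),
        })
    }

    fn health_check(&self) -> Result<bool, String> {
        Ok(true) // nothing to connect to, so always healthy
    }

    fn model_name(&self) -> &str {
        &self.model
    }
}
```

A real implementation would follow the async signatures shown in the trait definition above and would translate `messages` and `tools` into the provider's wire format.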

Dynamic Loading

ProviderRouter::from_config() reads models.json and instantiates the correct provider for each entry:

let config = ModelsConfig::load(".agentzero/models.json")?;
let router = ProviderRouter::from_config(&config)?;

// Router tries providers in priority order with fallback
let response = router.chat_with_tools(&messages, &tools).await?;

Settings Defaults

Configure in .agentzero/settings.toml:

[general]
default_provider = "ollama"
default_model = "llama3.2"

These values apply when the corresponding CLI flags (such as --provider) are left at their defaults.