Document Querying
Overview
AgentZero can index a directory of text files (code, markdown, config, prose) and answer questions about them using semantic search. During chat, the LLM can call a query tool to retrieve the most relevant chunks from the index.
This works by:
- Chunking files into overlapping pieces using sentence/paragraph boundaries
- Embedding each chunk via Ollama’s /api/embed endpoint
- Storing the vectors to disk (.agentzero/index/)
- Querying with cosine similarity at chat time
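
The chunking step can be pictured with a small Python sketch. This is not AgentZero's implementation: it is a simplified greedy packer, and the names `chunk_text` and `overlap_sentences` are hypothetical. It splits on naive sentence and blank-line boundaries, packs sentences up to a character limit, and carries the last sentence of each chunk into the next one so neighbouring chunks overlap.

```python
import re


def chunk_text(text: str, max_chars: int = 1000, overlap_sentences: int = 1) -> list[str]:
    """Greedily pack sentences into chunks of roughly max_chars characters.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    # Naive boundaries: whitespace after ., ! or ?, or a blank line between paragraphs.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+|\n\s*\n", text) if s.strip()]
    chunks: list[str] = []
    current: list[str] = []
    for sentence in sentences:
        if current and len(" ".join(current + [sentence])) > max_chars:
            chunks.append(" ".join(current))
            # Carry the tail of the finished chunk forward as overlap.
            current = current[-overlap_sentences:] if overlap_sentences > 0 else []
            # Drop the overlap if it would push the new chunk over the limit.
            if len(" ".join(current + [sentence])) > max_chars:
                current = []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("One two three. Four five six. Seven eight nine.", max_chars=40):
        print(repr(chunk))
```

With the sample input, the middle sentence appears in both chunks, which is the overlap that keeps context from being lost at a chunk boundary.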
Prerequisites
Pull an embedding model in Ollama:

    ollama pull nomic-embed-text

Other supported models: mxbai-embed-large, snowflake-arctic-embed, all-minilm.
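
To check that the model responds, it can help to see what a request to Ollama's /api/embed endpoint looks like. The sketch below is an illustration in Python, not AgentZero's code: the helper names `batched`, `build_embed_request`, and `embed` are hypothetical, and the URL assumes Ollama's default local address. It groups chunks into batches of 32, the batch size this page gives under How It Works.

```python
import json
import urllib.request


def batched(items: list[str], size: int = 32) -> list[list[str]]:
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_embed_request(
    chunks: list[str],
    model: str = "nomic-embed-text",
    url: str = "http://localhost:11434",  # assumed default Ollama address
) -> urllib.request.Request:
    """Build a POST request carrying one batch of chunks as the `input` field."""
    body = json.dumps({"model": model, "input": chunks}).encode("utf-8")
    return urllib.request.Request(
        f"{url}/api/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(chunks: list[str], **kwargs) -> list[list[float]]:
    """Send one batch and return one vector per chunk (needs a running Ollama server)."""
    with urllib.request.urlopen(build_embed_request(chunks, **kwargs)) as response:
        return json.load(response)["embeddings"]


if __name__ == "__main__":
    request = build_embed_request(["hello world"])
    print(request.full_url)
    print(request.data.decode("utf-8"))
```

Calling `embed(["hello world"])` against a running server should return a list containing a single vector; the vector length depends on the model.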
Build the Index
    az index build

This walks the current directory, skipping .agentzero/, .git/, target/, node_modules/, and other build artifacts. It indexes all text-based files: source code, markdown, TOML, YAML, JSON, and more.
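
The directory walk can be approximated as follows. This Python sketch is an assumption about the behaviour, not the tool's code: `SKIP_DIRS` lists only the four directories named above (the other build artifacts are not enumerated on this page), and `walk_files` is a hypothetical helper.

```python
import os

# Only the directories this page names explicitly; the real skip list is longer.
SKIP_DIRS = {".agentzero", ".git", "target", "node_modules"}


def walk_files(root: str) -> list[str]:
    """Return paths relative to root, never descending into skipped directories."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not enter the skipped directories at all.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(found)


if __name__ == "__main__":
    for path in walk_files("."):
        print(path)
```

Pruning `dirnames` in place matters for large trees: a directory like node_modules/ is skipped without being traversed, rather than traversed and filtered afterwards.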
Options
    # Index a specific directory
    az index build --path /path/to/documents

    # Use a different embedding model
    az index build --model mxbai-embed-large

    # Custom Ollama server
    az index build --url http://gpu-box:11434

    # Adjust chunk size (default: 1000 characters)
    az index build --chunk-size 500

Check Index Status
    az index status

    Index status:
      Model: nomic-embed-text
      Files: 47
      Chunks: 312
      Created at: 1715184000

Use in Chat
Once the index is built, the query tool is automatically available during chat:
    az chat

    you> what error handling patterns does this project use?
      [tool: query] ok (2048 bytes)

    agentzero> Based on the indexed documents, the project uses thiserror for error types with a consistent pattern of...

The LLM decides whether to call query, read, or search based on the question. The query tool is best for semantic or conceptual questions, while search is better for exact string matches.
Clear the Index
    az index clear

What Gets Indexed
| File Type | Extensions |
|---|---|
| Prose | .txt, .md, .rst, .org, .adoc |
| Code | .rs, .py, .js, .ts, .go, .java, .c, .cpp, .rb, .php, .swift, .kt, .scala, .lua, .sh, .zig, .hs, .ex, .clj, .nim, .v |
| Config | .toml, .yaml, .yml, .json, .xml, .csv, .ini, .cfg |
| Data/Schema | .sql, .graphql, .proto |
| Build | Dockerfile, Makefile, Justfile |
| Web | .html, .css, .scss, .less |
Binary files, images, and PDFs are skipped in Phase 1. PDF/HTML/DOCX parsing is planned for a future release.
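
The table above translates directly into a filter. The Python sketch below is illustrative only: `is_indexed` is a hypothetical helper, and lowercasing the extension before comparing is an assumption, since this page does not say whether matching is case-sensitive.

```python
from pathlib import Path

# Extensions copied from the table above.
INDEXED_EXTENSIONS = {
    ".txt", ".md", ".rst", ".org", ".adoc",
    ".rs", ".py", ".js", ".ts", ".go", ".java", ".c", ".cpp", ".rb", ".php",
    ".swift", ".kt", ".scala", ".lua", ".sh", ".zig", ".hs", ".ex", ".clj", ".nim", ".v",
    ".toml", ".yaml", ".yml", ".json", ".xml", ".csv", ".ini", ".cfg",
    ".sql", ".graphql", ".proto",
    ".html", ".css", ".scss", ".less",
}
# Build files are matched by full file name because they have no extension.
INDEXED_FILENAMES = {"Dockerfile", "Makefile", "Justfile"}


def is_indexed(path: str) -> bool:
    """Return True if the file name or extension appears in the table."""
    p = Path(path)
    # Assumption: extensions are compared case-insensitively.
    return p.name in INDEXED_FILENAMES or p.suffix.lower() in INDEXED_EXTENSIONS


if __name__ == "__main__":
    for candidate in ["src/main.rs", "Dockerfile", "logo.png", "paper.pdf"]:
        print(candidate, is_indexed(candidate))
```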
Storage
The index lives at .agentzero/index/:

| File | Format | Purpose |
|---|---|---|
| default.idx | bincode | Serialized chunk embeddings |
| metadata.json | JSON | Human-readable stats |
The index is local to your project and should not be committed to git; add .agentzero/ to .gitignore.
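
If you script project setup, the ignore entry can be added idempotently. The sketch below is a convenience example in Python, not part of AgentZero; `ensure_ignored` is a hypothetical helper name.

```python
from pathlib import Path


def ensure_ignored(project_root: str, entry: str = ".agentzero/") -> bool:
    """Append `entry` to .gitignore if it is missing. Returns True if the file changed."""
    gitignore = Path(project_root) / ".gitignore"
    text = gitignore.read_text() if gitignore.exists() else ""
    if entry in (line.strip() for line in text.splitlines()):
        return False
    # Start on a fresh line when the existing file lacks a trailing newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    gitignore.write_text(f"{text}{prefix}{entry}\n")
    return True


if __name__ == "__main__":
    print("updated" if ensure_ignored(".") else "already ignored")
```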
How It Works
- Chunking: text-splitter splits files at semantic boundaries (sentences, paragraphs) with a configurable max size
- Embedding: Chunks are sent in batches of 32 to Ollama’s /api/embed endpoint
- Storage: Embedded chunks are serialized to disk with bincode
- Query: The question is embedded with the same model, then ranked against all chunks by cosine similarity. Top 5 results are returned.
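
The query step above amounts to scoring every stored vector against the question vector and keeping the best five. Here is a minimal Python sketch of that ranking, assuming the chunks are held in memory as (text, vector) pairs; the function names are hypothetical and the real storage format is bincode, which this example does not read.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: list[float],
    chunks: list[tuple[str, list[float]]],
    k: int = 5,
) -> list[tuple[float, str]]:
    """Return the k highest-scoring (score, text) pairs, best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    index = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
    for score, text in top_k([1.0, 0.0], index, k=2):
        print(f"{score:.3f} {text}")
```

Because the question must be embedded with the same model as the chunks, the vectors share a dimension and the scores are comparable; mixing models would make the ranking meaningless.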
All of this runs locally through your existing Ollama instance: no external API calls are made and no data leaves your machine.