Retrieval & Memory
AgentZero’s retrieval layer is built around three independent indexes that share a single source of truth. Each index can be enabled independently and gracefully falls back to a brute-force scan when missing or rebuilding.
What’s stored where
| Data | Source of truth | Index | Used for |
|---|---|---|---|
| Conversation memory | SQLite (memory table) — encrypted via SQLCipher | HNSW (optional) | Semantic recall in agent loop |
| RAG documents | Encrypted JSON store ({data_dir}/rag/index.jsonl) | Tantivy (when rag feature enabled) | BM25 keyword search via agentzero rag query |
The source-of-truth stores never go away. Indexes are sidecar artifacts that can be deleted and rebuilt without losing data.
Conversation memory
Every agent interaction is appended to the memory table in the encrypted SQLite database. Rows carry the standard fields you’d expect (role, content, created_at, conversation_id, org_id, agent_id, expires_at) plus an optional embedding BLOB.
When you call MemoryStore::append_with_embedding(entry, vec), the embedding is stored alongside the row.
Semantic recall — brute force vs HNSW
The default MemoryStore::semantic_recall() implementation does a full table scan: load every row that has an embedding, compute cosine similarity in process, sort, take the top limit. That’s O(n) and fine for hundreds or low thousands of rows.
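The full-scan path described above can be sketched in a few lines of plain Rust. This is a minimal illustration, not AgentZero's implementation: the `Row` struct, `cosine`, and `brute_force_recall` are hypothetical names, and the real store reads rows from SQLite rather than from a slice.

```rust
/// Hypothetical stand-in for a memory row; only rows with an embedding take part in recall.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub id: u64,
    pub content: String,
    pub embedding: Option<Vec<f32>>,
}

/// Cosine similarity; returns 0.0 for mismatched lengths or zero-norm vectors.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Full scan: score every embedded row, sort descending, keep the top `limit`.
pub fn brute_force_recall<'a>(
    rows: &'a [Row],
    query: &[f32],
    limit: usize,
) -> Vec<(&'a Row, f32)> {
    let mut scored: Vec<(&Row, f32)> = rows
        .iter()
        .filter_map(|r| r.embedding.as_deref().map(|e| (r, cosine(e, query))))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(limit);
    scored
}
```

Every call scores every embedded row, so the cost grows linearly with the table. That linear cost is the reason to consider an approximate index once the table gets large.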
For larger memory stores, opt into the HNSW approximate nearest neighbor index:

    use agentzero_storage::SqliteMemoryStore;

    // `key` holds the database encryption key, defined elsewhere.
    let mut store = SqliteMemoryStore::open("memory.db", Some(&key))?;
    // Point the sidecar HNSW index at its own directory.
    store.enable_hnsw_index("/var/lib/agentzero/hnsw", 384)?;

After this call:

- append_with_embedding() writes to both SQLite (durable) and HNSW (fast lookup)
- The HNSW index checkpoints to disk every 100 inserts
- semantic_recall(query, k) queries HNSW for (k * 3) candidate IDs, then resolves the full MemoryEntry rows from SQLite preserving HNSW ranking
- Expired rows (expires_at < now) are filtered after the HNSW lookup
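
The over-fetch and post-filter steps can be illustrated with a small sketch. It assumes the approximate index has already returned a ranked list of candidate IDs; `MemoryEntry` here is a simplified hypothetical struct, and `resolve_candidates` is an illustrative helper, not part of the AgentZero API. Fetching more candidates than `k` presumably leaves headroom so that dropping expired rows still yields `k` results.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical version of a memory row.
#[derive(Debug, Clone, PartialEq)]
pub struct MemoryEntry {
    pub id: u64,
    pub content: String,
    /// Unix seconds; `None` means the row never expires.
    pub expires_at: Option<u64>,
}

/// Resolve ranked candidate IDs against the durable rows, drop missing and
/// expired rows, and keep the first `k` in their original ranking order.
pub fn resolve_candidates(
    candidate_ids: &[u64],
    rows: &HashMap<u64, MemoryEntry>,
    now: u64,
    k: usize,
) -> Vec<MemoryEntry> {
    candidate_ids
        .iter()
        .filter_map(|id| rows.get(id))
        // A row counts as expired when expires_at < now.
        .filter(|e| e.expires_at.map_or(true, |t| t >= now))
        .take(k)
        .cloned()
        .collect()
}
```

Because the filter runs on the candidate list rather than inside the index, an expired row only costs one wasted candidate slot; the index itself never needs to know about expiry.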
If the HNSW directory is missing on startup (cold start, fresh deployment, accidental delete), enable_hnsw_index() rebuilds it by scanning every embedded row from SQLite. The index is never authoritative — it’s always derivable from the source of truth.
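As a rough sketch of the rebuild-on-missing behaviour, the pattern looks like the following. `SidecarIndex` and `open_or_rebuild` are hypothetical names used for illustration only, and a plain map stands in for the HNSW graph; the point is that the index is a pure function of the durable rows.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a sidecar vector index: row id -> embedding.
#[derive(Debug, Default)]
pub struct SidecarIndex {
    pub vectors: HashMap<u64, Vec<f32>>,
}

/// Return the persisted index if one was loaded; otherwise rebuild it by
/// scanning every durable row that carries an embedding.
/// The boolean reports whether a rebuild happened.
pub fn open_or_rebuild(
    persisted: Option<SidecarIndex>,
    durable_rows: &[(u64, Option<Vec<f32>>)],
) -> (SidecarIndex, bool) {
    match persisted {
        Some(index) => (index, false),
        None => {
            let vectors = durable_rows
                .iter()
                .filter_map(|(id, emb)| emb.as_ref().map(|e| (*id, e.clone())))
                .collect();
            (SidecarIndex { vectors }, true)
        }
    }
}
```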
The hnsw_rs dep is always linked into agentzero-storage regardless of feature flags, so the API surface is uniform across all builds.
Hybrid retrieval
For combined keyword + semantic queries, use MemoryStore::hybrid_recall(query_text, query_embedding, limit). The default implementation:
- Runs semantic_recall() over (limit * 4) candidates to get the semantic ranking
- Runs a substring match over the same recent() window to get a keyword ranking
- Fuses both rankings via reciprocal rank fusion with k = 60
- Deduplicates on a content fingerprint and returns the top limit results
The SemanticRecallTool supports this directly:

    { "query": "what did we decide about the database?", "limit": 5, "mode": "hybrid" }

mode defaults to "semantic" for backward compatibility; pass "hybrid" explicitly to opt into the fused ranking.
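
For readers unfamiliar with reciprocal rank fusion, the fusion step used by hybrid recall can be sketched as follows. This is a generic illustration of the technique, not AgentZero's code: `rrf_fuse` is a hypothetical helper, and it keys on row IDs, whereas the text describes deduplication on a content fingerprint.

```rust
use std::collections::HashMap;

/// Reciprocal rank fusion: each ranking contributes 1 / (k + rank) per item,
/// with ranks starting at 1. Returns IDs sorted by fused score, best first.
pub fn rrf_fuse(rankings: &[Vec<u64>], k: f64, limit: usize) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for ranking in rankings {
        for (i, id) in ranking.iter().enumerate() {
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + (i as f64 + 1.0));
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Sort by score descending; break ties by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused.truncate(limit);
    fused
}
```

With k = 60, an item ranked first in one list and second in the other scores 1/61 + 1/62. Items that appear in both rankings therefore tend to outrank items that appear high in only one, and the raw similarity and keyword scores never need to be put on a common scale.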
RAG document index
The RAG index is a separate store for explicitly-ingested documents (notes, knowledge base articles, code snippets) that you want the agent to retrieve at query time. It is independent from conversation memory.
Documents are persisted twice:
- Encrypted JSON store at {data_dir}/rag/index.jsonl: durable source of truth, AES-256-GCM encrypted
- Tantivy inverted index in a sibling {data_dir}/rag/index.jsonl.tantivy/ directory: fast BM25 query path
When you call agentzero rag ingest --id <id> --text "...", the document is written to both. When you call agentzero rag query "search terms", the Tantivy index is consulted; results are returned with a score: f32 BM25 relevance value.
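To give a feel for what the BM25 score measures, here is a sketch of the textbook per-term formula. It is an assumption-laden illustration: `bm25_term_score` is a hypothetical function, the parameters k1 = 1.2 and b = 0.75 used in the example are the commonly cited defaults, and Tantivy's internal scoring may differ in detail.

```rust
/// Okapi BM25 contribution of one query term to one document's score.
///
/// - `tf`: occurrences of the term in the document
/// - `doc_len` / `avg_doc_len`: document length and corpus average, in tokens
/// - `n_docs` / `doc_freq`: corpus size and number of documents containing the term
/// - `k1`, `b`: saturation and length-normalization parameters
pub fn bm25_term_score(
    tf: f64,
    doc_len: f64,
    avg_doc_len: f64,
    n_docs: f64,
    doc_freq: f64,
    k1: f64,
    b: f64,
) -> f64 {
    // Rare terms get a larger inverse document frequency.
    let idf = (1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5)).ln();
    // Longer-than-average documents are penalized.
    let norm = 1.0 - b + b * doc_len / avg_doc_len;
    idf * tf * (k1 + 1.0) / (tf + k1 * norm)
}
```

A document's score for a multi-term query is the sum of these per-term contributions. The score is only meaningful relative to other results for the same query, not as an absolute measure of relevance.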
If the Tantivy directory is missing or corrupt (newer install, accidental delete, schema drift across versions), the next query rebuilds it from the encrypted store. Migration from the legacy plaintext JSONL format also happens transparently.
The Tantivy dep is gated behind the rag feature flag; builds without rag skip both Tantivy and the multimodal/document chunking surface.
Embedding providers
Both semantic recall and hybrid retrieval need an embedding model. AgentZero ships two implementations:
| Provider | When to use |
|---|---|
| CandleEmbeddingProvider | Local, in-process. Uses sentence-transformers/all-MiniLM-L6-v2 (384 dims, ~23 MB). Downloads from HuggingFace on first use, then runs entirely offline. |
| ApiEmbeddingProvider | Calls an OpenAI-compatible /v1/embeddings endpoint. Use when you want a hosted model or already have credentials configured. |
Both implement the same EmbeddingProvider trait. The SemanticRecallTool accepts whichever you wire it with.
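The value of a shared trait is that callers never need to know which provider they hold. The sketch below shows the idea with a made-up trait shape: the method names `dimensions` and `embed`, the `String` error type, and `ToyProvider` are all assumptions for illustration, so consult the Trait System page for the real signatures.

```rust
/// Hypothetical shape of an embedding trait; the real definition may differ.
pub trait EmbeddingProvider {
    fn dimensions(&self) -> usize;
    fn embed(&self, text: &str) -> Result<Vec<f32>, String>;
}

/// Toy provider for tests: buckets bytes into a fixed-size, L2-normalized vector.
/// It carries no semantic meaning and only demonstrates the interface.
pub struct ToyProvider {
    pub dims: usize,
}

impl EmbeddingProvider for ToyProvider {
    fn dimensions(&self) -> usize {
        self.dims
    }

    fn embed(&self, text: &str) -> Result<Vec<f32>, String> {
        if self.dims == 0 {
            return Err("dims must be non-zero".to_string());
        }
        let mut v = vec![0.0f32; self.dims];
        for (i, byte) in text.bytes().enumerate() {
            v[(i + byte as usize) % self.dims] += 1.0;
        }
        let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 0.0 {
            for x in &mut v {
                *x /= norm;
            }
        }
        Ok(v)
    }
}

/// Callers depend only on the trait, so local and hosted providers are interchangeable.
pub fn embed_query(provider: &dyn EmbeddingProvider, query: &str) -> Result<Vec<f32>, String> {
    let v = provider.embed(query)?;
    if v.len() != provider.dimensions() {
        return Err(format!("expected {} dims, got {}", provider.dimensions(), v.len()));
    }
    Ok(v)
}
```

Checking the vector length at the call site is a cheap guard: an index built for 384-dimensional vectors cannot meaningfully compare against embeddings of another size.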
Why this design
We deliberately keep the source-of-truth stores separate from the indexes, even though it means writing the same data twice on ingest:
- Recovery is simple. If an index is corrupt, delete the directory and the next call rebuilds it from SQLite or the encrypted JSON store. There is no “split brain” recovery path.
- Indexes are optional. A small deployment can ignore HNSW and Tantivy entirely; the brute-force fallbacks still work and the API surface is identical.
- Encryption stays at the storage layer. The Tantivy index lives on disk in plaintext (Tantivy doesn’t encrypt natively), but the content it indexes is always reachable from the encrypted store. Compromising the Tantivy directory leaks the keyword index, not the conversation history.
See also
- Trait System: MemoryStore, EmbeddingProvider, Tool interface definitions
- Config Reference: Storage backend configuration
- Provider Setup: Embedding provider configuration