Architecture

ExoVault provides encrypted memory for AI agents. Its stack has three parts: a Node.js MCP server, a Next.js gateway API, and Supabase PostgreSQL with pgvector.

System Overview

┌─────────────┐        MCP         ┌──────────────┐     HTTPS      ┌──────────────┐
│   AI Agent   │ ──────────────────▶│  MCP Server   │ ─────────────▶│  Gateway API  │
│ (Claude, etc)│                    │  (Node.js)    │               │  (Next.js)    │
└─────────────┘                    └──────────────┘               └──────┬───────┘
                                                                         │
                                                                         ▼
                                                                  ┌──────────────┐
                                                                  │   Supabase    │
                                                                  │  PostgreSQL   │
                                                                  │  + pgvector   │
                                                                  └──────────────┘

Components

MCP Server (exovault-mcp-server)

The MCP server is a Node.js process that implements the Model Context Protocol. It:

  • Registers 37+ tools for agent interaction
  • Manages session lifecycle (auto-session injection, idle timeout)
  • Handles tool call tracking and checkpoint reminders
  • Supports two modes: gateway (via agent key) and direct (via Supabase credentials)

Gateway API (Next.js)

The web application serves as both the dashboard UI and the API gateway:

  • Agent routes: POST /api/agent/* for all MCP tool operations
  • Dashboard routes: Standard REST API for the web UI
  • Authentication: Agent keys (Bearer token) and Supabase auth (session cookies)
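
As an illustration of the agent-key side of authentication, the sketch below pulls a Bearer token out of an Authorization header. The function name `extractAgentKey` is hypothetical and not taken from the ExoVault codebase; how the gateway actually validates the key is not shown here.

```typescript
// Hypothetical helper (not ExoVault's actual code): extracts the agent key
// from an HTTP Authorization header of the form "Bearer <key>".
function extractAgentKey(authorization: string | null | undefined): string | null {
  if (!authorization) return null;
  // Scheme names are case-insensitive; the key itself must not contain spaces.
  const match = /^Bearer\s+(\S+)$/i.exec(authorization.trim());
  return match?.[1] ?? null;
}

// A request without a Bearer token would fall through to cookie-based
// dashboard auth or be rejected.
const key = extractAgentKey("Bearer example-agent-key");
console.log(key !== null ? "agent request" : "not an agent request");
```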

Database (PostgreSQL + pgvector)

  • PostgreSQL 15+ with Row Level Security (RLS)
  • pgvector extension for embedding-based semantic search
  • 17+ tables covering memories, notes, vaults, agents, sessions, messages, tasks
  • Exact kNN search for full-precision nearest-neighbor retrieval (the HNSW index was removed because pgvector indexes support at most 2,000 dimensions and the embeddings have 3072)
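
To make "exact kNN" concrete, here is a minimal in-memory sketch of what the database computes: the cosine distance from the query to every stored vector, sorted, with the top k kept. The `Row` type and function names are illustrative assumptions. In pgvector the same ordering comes from the cosine distance operator `<=>` in an `ORDER BY ... LIMIT k` clause without an approximate index.

```typescript
// Illustrative types and names; not ExoVault's schema.
type Row = { id: string; embedding: number[] };

// Cosine distance = 1 - cosine similarity (what pgvector's <=> operator returns).
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Exact kNN: score every row, sort ascending by distance, keep the first k.
function exactKnn(rows: Row[], query: number[], k: number): Row[] {
  return rows
    .map((row) => ({ row, distance: cosineDistance(row.embedding, query) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k)
    .map((x) => x.row);
}
```

Exact search scans every row, so its cost grows linearly with table size; the trade-off is that no result is missed, unlike with an approximate index.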

Data Flow

Write Path

  1. Agent calls write_memory via MCP
  2. MCP server sends request to POST /api/agent/write-memory
  3. Gateway authenticates agent key and resolves vault
  4. Content is encrypted with vault's MEK (AES-256-GCM)
  5. Embedding is generated via Gemini gemini-embedding-2-preview
  6. Blind index tokens are computed for keyword search
  7. Memory is stored in PostgreSQL with embedding vector
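
Steps 4 and 6 can be sketched with Node's built-in crypto module. This is a minimal sketch under stated assumptions, not ExoVault's implementation: the 12-byte IV length, the use of HMAC-SHA-256 for blind index tokens, the separate index key, and the word tokenizer are all assumptions, and the function names are hypothetical.

```typescript
import { createCipheriv, createDecipheriv, createHmac, randomBytes } from "node:crypto";

interface EncryptedBlob {
  iv: Buffer;
  ciphertext: Buffer;
  authTag: Buffer;
}

// AES-256-GCM with a fresh random IV per operation (12 bytes is an assumption).
function encryptContent(mek: Buffer, plaintext: string): EncryptedBlob {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", mek, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { iv, ciphertext, authTag: cipher.getAuthTag() };
}

// Decryption verifies the auth tag; final() throws if the data was tampered with.
function decryptContent(mek: Buffer, blob: EncryptedBlob): string {
  const decipher = createDecipheriv("aes-256-gcm", mek, blob.iv);
  decipher.setAuthTag(blob.authTag);
  return Buffer.concat([decipher.update(blob.ciphertext), decipher.final()]).toString("utf8");
}

// Assumed blind index scheme: keyed HMAC of each distinct lowercase word.
// Equal words map to equal tokens, so keyword matches work without plaintext.
function blindIndexTokens(indexKey: Buffer, plaintext: string): string[] {
  const words: string[] = plaintext.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  const unique = Array.from(new Set(words));
  return unique.map((word) => createHmac("sha256", indexKey).update(word).digest("hex"));
}
```

At search time the query words would be tokenized with the same key, and rows whose stored tokens overlap the query tokens count as keyword matches.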

Read Path

  1. Agent calls search_memories with a query
  2. Query is embedded using the same model
  3. PostgreSQL performs hybrid search:
    • Vector similarity (pgvector cosine distance)
    • Blind index tokens (keyword matching on encrypted content)
    • Graph signals (relation-based boosting)
  4. Results are ranked using Reciprocal Rank Fusion (RRF)
  5. Temporal decay and MMR diversity are applied
  6. Content is decrypted and returned to agent
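
Reciprocal Rank Fusion in step 4 scores each result as the sum of 1 / (k + rank) over every ranked list it appears in. The sketch below shows that formula together with one possible form of temporal decay. The constant k = 60 is the conventional default and an assumption here, and the exponential half-life decay is a hypothetical choice, since the source does not state which decay function ExoVault uses. MMR is omitted for brevity.

```typescript
// Fuse several ranked lists of ids (best first) into one ranking.
// k = 60 is the commonly used RRF constant; assumed, not confirmed by ExoVault.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}

// Hypothetical decay: the score halves every `halfLifeDays` days of age.
function applyTemporalDecay(score: number, ageDays: number, halfLifeDays: number): number {
  return score * Math.pow(0.5, ageDays / halfLifeDays);
}

const fused = reciprocalRankFusion([
  ["a", "b", "c"], // vector similarity ranking
  ["b", "d"],      // blind index keyword ranking
  ["b", "a"],      // graph signal ranking
]);
console.log(fused.map((r) => r.id));
```

Because RRF uses only ranks, the three signals do not need comparable score scales, which is why it suits mixing vector distances with keyword and graph matches.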

Encryption Architecture

ExoVault uses a three-level key hierarchy:

  1. SEK (Storage Encryption Key) — Derived from user passphrase via PBKDF2
  2. MEK (Master Encryption Key) — Per-vault key, encrypted with SEK
  3. Per-operation IV — Unique initialization vector for every encryption operation

All note content, memory content, and message bodies are encrypted client-side. Metadata (titles, tags, timestamps) remains in plaintext for indexing.
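
The first two levels of the hierarchy can be sketched as follows: the SEK is derived from the passphrase with PBKDF2, and the per-vault MEK is wrapped (encrypted) under the SEK. The iteration count, the SHA-256 digest, the salt handling, and the use of AES-256-GCM for wrapping are assumptions made for illustration; the function names are hypothetical.

```typescript
import { createCipheriv, createDecipheriv, pbkdf2Sync, randomBytes } from "node:crypto";

// Level 1: derive a 32-byte SEK from the passphrase.
// 600,000 iterations and SHA-256 are assumed parameters.
function deriveSek(passphrase: string, salt: Buffer, iterations = 600_000): Buffer {
  return pbkdf2Sync(passphrase, salt, iterations, 32, "sha256");
}

interface WrappedKey {
  iv: Buffer;
  ciphertext: Buffer;
  authTag: Buffer;
}

// Level 2: encrypt the per-vault MEK under the SEK.
// Level 3: a fresh IV is generated for this (and every) encryption operation.
function wrapMek(sek: Buffer, mek: Buffer): WrappedKey {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", sek, iv);
  const ciphertext = Buffer.concat([cipher.update(mek), cipher.final()]);
  return { iv, ciphertext, authTag: cipher.getAuthTag() };
}

// Unwrapping with an SEK from the wrong passphrase fails the auth tag check and throws.
function unwrapMek(sek: Buffer, wrapped: WrappedKey): Buffer {
  const decipher = createDecipheriv("aes-256-gcm", sek, wrapped.iv);
  decipher.setAuthTag(wrapped.authTag);
  return Buffer.concat([decipher.update(wrapped.ciphertext), decipher.final()]);
}
```

One consequence of this layering is that changing the passphrase only requires re-wrapping each MEK under the new SEK; the encrypted content itself does not need to be rewritten.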

See Encryption Model for the full deep dive.

Embedding Pipeline

  • Model: Gemini gemini-embedding-2-preview (3072 dimensions)
  • Natively multimodal — text, images, audio, video, and PDFs embedded in the same vector space
  • Storage: pgvector with exact kNN (HNSW removed because pgvector indexes are capped at 2,000 dimensions, below the model's 3072)
  • Env var: GEMINI_API_KEY (required, currently free tier)
  • Embeddings are generated for memory content, note content, and media attachments
  • Used for semantic search, deduplication, similarity scoring, and cross-modal retrieval

Session Lifecycle

session_start → tool calls (tracked) → context_checkpoint (periodic) → final flush
  • Auto-session: First tool call triggers session_start if not already called
  • Buffer: Tool calls are recorded in a session buffer
  • Extraction: Hooks capture conversation turns; an LLM pipeline extracts facts automatically
  • Idle timeout: Sessions auto-checkpoint after 5 minutes of inactivity
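
The lifecycle above can be modeled as a small state tracker. The class below is a simplified sketch, not the MCP server's code: the `SessionTracker` name, its methods, and the injected timestamps are hypothetical, and a real server would drive `tick` from a timer and send the buffer to the gateway rather than just logging.

```typescript
type ToolCallRecord = { tool: string; at: number };

class SessionTracker {
  readonly log: string[] = [];
  private started = false;
  private buffer: ToolCallRecord[] = [];
  private lastActivity = 0;
  private idleTimeoutMs: number;

  // Default matches the 5-minute idle timeout described above.
  constructor(idleTimeoutMs = 5 * 60 * 1000) {
    this.idleTimeoutMs = idleTimeoutMs;
  }

  // Auto-session: the first tool call starts the session implicitly.
  recordToolCall(tool: string, now: number): void {
    if (!this.started) {
      this.started = true;
      this.log.push("session_start");
    }
    this.buffer.push({ tool, at: now });
    this.lastActivity = now;
  }

  // Called periodically; checkpoints and clears the buffer once the session
  // has been idle for the timeout. Returns true if a checkpoint happened.
  tick(now: number): boolean {
    const idle =
      this.started &&
      this.buffer.length > 0 &&
      now - this.lastActivity >= this.idleTimeoutMs;
    if (idle) {
      this.log.push(`context_checkpoint(${this.buffer.length})`);
      this.buffer = [];
    }
    return idle;
  }
}
```

Passing timestamps in explicitly, instead of reading the clock inside the class, keeps the timeout logic deterministic and easy to test.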

Next Steps