Memory Protocol

This page defines the protocol agents should follow when writing memories to ExoVault. Following these guidelines ensures high-quality, non-redundant memory that is useful across sessions.

When to Write Memories

Strong Triggers (Always Write)

| Trigger | Memory Type | Example |
| --- | --- | --- |
| User states a preference | preference | "I prefer PostgreSQL over MySQL" |
| User sets a rule or constraint | constraint | "Never use any in TypeScript" |
| User corrects the agent | correction | "Actually, the API uses POST not GET" |
| Agent learns a new skill/pattern | skill | "This codebase uses Drizzle ORM with PostgreSQL" |
| Important factual discovery | fact | "The production database is on Supabase" |
| Session ends | episodic | Summary of what was accomplished |
| User assigns a task | task | "Add input validation to the API" |

Moderate Triggers (Write if Important)

  • A decision was made after discussion (importance >= 3)
  • A workaround was found for a known issue
  • Configuration details were discovered
  • A project convention was established

Weak Triggers (Usually Skip)

  • Trivial, transient information
  • Information already available in the codebase
  • Highly specific debugging details unlikely to recur
  • Content that would be stale within hours
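The three trigger tiers can be read as a small decision rule. The following is a minimal Python sketch of that rule, not ExoVault code: the `TriggerStrength` enum and `should_write` helper are hypothetical names, and applying the "importance >= 3" cut-off to every moderate trigger is an assumption extrapolated from the single bullet that mentions it.

```python
from enum import Enum


class TriggerStrength(Enum):
    STRONG = "strong"      # always write
    MODERATE = "moderate"  # write if important
    WEAK = "weak"          # usually skip


# Assumed cut-off, borrowed from the "importance >= 3" note on moderate triggers.
MODERATE_MIN_IMPORTANCE = 3


def should_write(strength: TriggerStrength, importance: int) -> bool:
    """Return True if a memory should be written for this trigger."""
    if strength is TriggerStrength.STRONG:
        return True
    if strength is TriggerStrength.MODERATE:
        return importance >= MODERATE_MIN_IMPORTANCE
    # Weak triggers are skipped by default in this sketch.
    return False
```

An agent that occasionally needs to keep a weak-trigger item would bypass this helper rather than raise the importance score to force a write.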

Memory Type Selection

Choose the correct type based on the nature of the information:

| Type | When to Use | Typical Importance |
| --- | --- | --- |
| fact | Objective information about the user, project, or environment | 3-4 |
| skill | Learned patterns, techniques, or project-specific knowledge | 3-4 |
| preference | User preferences and opinions | 4-5 |
| constraint | Rules, restrictions, and mandatory practices | 4-5 |
| task | Action items, to-dos, and work tracking | 3-5 |
| episodic | Session summaries and event records | 2-3 |
| correction | Corrections to previous agent behavior or beliefs | 4-5 |

See Memory Types for detailed descriptions and examples.

Importance Guidelines

Rate importance from 1 (trivial) to 5 (critical):

| Score | Meaning | Example |
| --- | --- | --- |
| 1 | Trivial, ephemeral | "User said hello" |
| 2 | Low importance, background context | "Session focused on refactoring" |
| 3 | Moderate, useful in context | "Project uses Next.js 14" |
| 4 | Important, affects decisions | "User prefers TypeScript over JavaScript" |
| 5 | Critical, must always remember | "Never commit .env files" |

Guidelines:

  • Preferences and constraints should generally be 4-5
  • Facts about the project/stack are typically 3-4
  • Episodic summaries are typically 2-3
  • Corrections are typically 4-5 (you need to remember what went wrong)
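One way to keep scores honest is to compare a proposed importance against the typical range for its memory type. The sketch below copies the ranges from the Memory Type Selection table. The `check_importance` helper and its three result labels are hypothetical, and an "atypical" result is only a prompt to double-check the score, not an error.

```python
# Typical importance ranges per memory type, copied from the
# Memory Type Selection table on this page.
TYPICAL_IMPORTANCE = {
    "fact": (3, 4),
    "skill": (3, 4),
    "preference": (4, 5),
    "constraint": (4, 5),
    "task": (3, 5),
    "episodic": (2, 3),
    "correction": (4, 5),
}


def check_importance(memory_type: str, importance: int) -> str:
    """Classify a score as 'invalid', 'typical', or 'atypical' for its type."""
    if not 1 <= importance <= 5:
        return "invalid"  # outside the 1-5 scale
    low, high = TYPICAL_IMPORTANCE[memory_type]
    return "typical" if low <= importance <= high else "atypical"
```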

Confidence Guidelines

Rate confidence from 1 (uncertain) to 5 (certain):

| Score | Meaning | When to Use |
| --- | --- | --- |
| 1 | Speculative | Agent inferred something without confirmation |
| 2 | Likely but unverified | Based on patterns, not explicit statements |
| 3 | Moderately confident | Reasonable inference from context |
| 4 | High confidence | User stated it clearly |
| 5 | Certain | Explicitly confirmed or verified |

Guidelines:

  • Direct user statements: confidence 4-5
  • Inferences from code/context: confidence 2-3
  • Corrections from the user: confidence 5
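The confidence scale can be approximated by keying the score on where the information came from. This is a rough Python sketch under stated assumptions: the source labels (`user_correction`, `user_statement`, and so on) and the `suggest_confidence` helper are invented for illustration and are not ExoVault identifiers.

```python
# Hypothetical source labels mapped to the confidence scale on this page.
CONFIDENCE_BY_SOURCE = {
    "user_correction": 5,    # corrections from the user
    "user_statement": 4,     # user stated it clearly
    "context_inference": 3,  # reasonable inference from context
    "pattern_inference": 2,  # based on patterns, not explicit statements
    "speculation": 1,        # inferred without confirmation
}


def suggest_confidence(source: str, verified: bool = False) -> int:
    """Suggest a 1-5 confidence score for a memory from its source."""
    # An explicitly confirmed user statement is treated as certain.
    if source == "user_statement" and verified:
        return 5
    return CONFIDENCE_BY_SOURCE[source]
```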

Deduplication

The dedup: true Flag

Always set dedup: true when writing memories to enable ExoVault's multi-layer deduplication:

```json
{
  "content": "User prefers PostgreSQL",
  "memoryType": "preference",
  "dedup": true
}
```

Dedup Layers

ExoVault runs four deduplication layers in sequence:

  1. In-batch hash -- Catches exact duplicates within a context_checkpoint batch (zero cost)
  2. Content hash -- SHA-256 hash match against all stored memories (zero API cost)
  3. Blind token overlap -- HMAC-based token matching catches near-duplicates without embeddings (zero API cost)
  4. Semantic embedding -- Cosine similarity using gemini-embedding-2-preview catches paraphrased duplicates
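To illustrate why the layers run cheapest-first, here is a simplified Python sketch of a four-layer check. It is not ExoVault's implementation: the text normalization, the use of Jaccard overlap on HMAC'd tokens, both thresholds, and the `find_duplicate` helper are assumptions, and `embed` is a caller-supplied stand-in for the embedding model.

```python
import hashlib
import hmac
import math
from typing import Callable, List, Optional, Set


def content_hash(text: str) -> str:
    """SHA-256 hex digest of lowercased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def blind_tokens(text: str, key: bytes) -> Set[str]:
    """HMAC each token so overlap can be measured without plaintext tokens."""
    return {
        hmac.new(key, tok.encode("utf-8"), hashlib.sha256).hexdigest()
        for tok in text.lower().split()
    }


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u: List[float], v: List[float]) -> float:
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def find_duplicate(
    new: str,
    stored: List[str],
    batch_hashes: Set[str],
    key: bytes,
    embed: Callable[[str], List[float]],
    token_threshold: float = 0.8,   # assumed value
    cosine_threshold: float = 0.9,  # assumed value
) -> Optional[str]:
    """Return the first layer that flags `new` as a duplicate, else None."""
    h = content_hash(new)
    if h in batch_hashes:                              # 1. in-batch hash
        return "in_batch_hash"
    batch_hashes.add(h)                                # remember for later items
    if any(content_hash(s) == h for s in stored):      # 2. content hash
        return "content_hash"
    new_tokens = blind_tokens(new, key)
    if any(jaccard(new_tokens, blind_tokens(s, key)) >= token_threshold
           for s in stored):                           # 3. blind token overlap
        return "blind_token_overlap"
    new_vec = embed(new)                               # only now call the model
    if any(cosine(new_vec, embed(s)) >= cosine_threshold for s in stored):
        return "semantic_embedding"                    # 4. semantic embedding
    return None
```

Because each layer returns early, the embedding call is reached only when the three zero-API-cost layers find nothing.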

Dedup Outcomes

| Outcome | Meaning |
| --- | --- |
| skip | A sufficiently similar memory already exists; no new memory created |
| supersede | New memory created, old similar memory archived |
| (no dedup field) | Memory is novel, created normally |
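An agent can branch on these outcomes after each write. The sketch below assumes the write response is a dictionary whose optional `dedup` key holds the outcome as a plain string. That response shape is a guess made for illustration, and `describe_write_result` is a hypothetical helper.

```python
def describe_write_result(response: dict) -> str:
    """Map an assumed write response to 'created', 'skipped', or 'superseded'."""
    outcome = response.get("dedup")
    if outcome is None:
        return "created"     # no dedup field: the memory was novel
    if outcome == "skip":
        return "skipped"     # a similar memory already exists
    if outcome == "supersede":
        return "superseded"  # new memory created, old one archived
    raise ValueError(f"unknown dedup outcome: {outcome!r}")
```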

Search Before Write

Even with dedup: true, agents should search before writing when practical:

1. search_memories("PostgreSQL preference")
2. If found: skip writing or update existing
3. If not found: write_memory with dedup: true

This reduces unnecessary embedding API calls and provides the agent with existing context.
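The search-then-write flow could be wrapped in a small helper. In this sketch the `search` and `write` callables stand in for the search_memories and write_memory tools. Their real signatures and return values are not specified on this page, so the shapes used here are assumptions, and `search_then_write` is a hypothetical name.

```python
from typing import Any, Callable, Dict, List


def search_then_write(
    query: str,
    memory: Dict[str, Any],
    search: Callable[[str], List[Dict[str, Any]]],
    write: Callable[[Dict[str, Any]], Dict[str, Any]],
) -> Dict[str, Any]:
    """Search first; write with dedup enabled only if nothing was found."""
    existing = search(query)
    if existing:
        # Reuse what is already stored instead of writing again.
        return {"written": False, "existing": existing, "result": None}
    result = write({**memory, "dedup": True})
    return {"written": True, "existing": [], "result": result}
```

Passing the tools in as callables keeps the helper testable with in-memory fakes and avoids hard-coding a client API.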

Anti-Patterns

Do Not

  • Write every message -- Not everything said is worth remembering
  • Duplicate search results -- If you just found a memory via search, do not write it again
  • Write without entities -- Always extract entities for better searchability
  • Use generic content -- "User said something about databases" is not useful
  • Forget dedup: true -- Always enable deduplication
  • Record corrections as duplicate facts -- When you correct yourself, write a correction memory rather than a second, conflicting fact
  • Set all importance to 5 -- Reserve 5 for genuinely critical information
  • Set all confidence to 5 -- Be honest about uncertainty

Do

  • Include entities -- Extract key names, technologies, and concepts
  • Write clear summaries -- The summary field is used for token-efficient retrieval
  • Use writeReason -- Explain why the memory was written for audit purposes
  • Link related memories -- Use relatedMemoryIds to connect related knowledge
  • Use correct types -- A preference is not a fact, and a constraint is not a preference
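Several of the points above can be checked mechanically before a write. The sketch below is a hypothetical pre-write lint, not an ExoVault feature. It uses the field names mentioned on this page (entities, dedup, summary, writeReason, relatedMemoryIds), and the example values are purely illustrative.

```python
from typing import List


def lint_memory(payload: dict) -> List[str]:
    """Return a list of protocol problems found in a memory payload."""
    problems = []
    if not payload.get("entities"):
        problems.append("missing entities")
    if payload.get("dedup") is not True:
        problems.append("dedup is not true")
    if not payload.get("summary"):
        problems.append("missing summary")
    if not payload.get("writeReason"):
        problems.append("missing writeReason")
    return problems


# Illustrative payload that passes every check above.
example = {
    "content": "The production PostgreSQL database is hosted on Supabase with pgvector enabled",
    "summary": "Prod DB: PostgreSQL on Supabase with pgvector",
    "memoryType": "fact",
    "entities": ["PostgreSQL", "Supabase", "pgvector", "production"],
    "writeReason": "Discovered while reading deployment config",
    "relatedMemoryIds": [],
    "dedup": True,
}
```

Checks such as "content is too generic" or "type is wrong" need judgment and are left out of this sketch.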

Entity Extraction

Always extract meaningful entities from the content:

```json
{
  "content": "The production PostgreSQL database is hosted on Supabase with pgvector enabled",
  "entities": ["PostgreSQL", "Supabase", "pgvector", "production"],
  "memoryType": "fact"
}
```

Good entities are:

  • Technology names (PostgreSQL, React, Docker)
  • Project names and components
  • People and team names
  • Key concepts and patterns
  • Environment names (production, staging)
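As a rough illustration of entity extraction, the sketch below matches content against a fixed vocabulary of known names. This is a deliberately naive approach chosen for the example. The vocabulary and the `extract_entities` helper are assumptions, and a real agent would more likely pick entities out with its own judgment.

```python
import re
from typing import List, Sequence

# Hypothetical vocabulary built from the entity categories listed above.
KNOWN_ENTITIES = (
    "PostgreSQL", "Supabase", "pgvector", "production",
    "staging", "React", "Docker",
)


def extract_entities(
    content: str, vocabulary: Sequence[str] = KNOWN_ENTITIES
) -> List[str]:
    """Return vocabulary entries that appear in content as whole words."""
    found = []
    for name in vocabulary:
        # Case-insensitive whole-word match; escape names like "Next.js".
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        if re.search(pattern, content, flags=re.IGNORECASE):
            found.append(name)
    return found
```

Results come back in vocabulary order and use the vocabulary's spelling, which keeps entity names consistent across memories.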