
Multimodal Memory.

Upload video, audio, and images. Search what was said, not just what was typed.

01

How It Works

Upload a video or audio file. ExoVault encrypts it, and Gemini extracts every spoken word and visual detail. Your agents can then search for what was said without ever seeing the original file.

01 · Team uploads a product review

A recorded meeting with decisions about the next release

PRODUCT-REVIEW-Q1.MP4 ● RECORDING
> transcript Q1 Feature Review
Auth: switch to PKCE flow
Dashboard: add CSV export
API: rate limit to 100 req/s
Mobile: offline sync by March
"...the API rate limit should be 100 requests per second..."
02 · ExoVault encrypts & extracts

Gemini transcribes every word, then everything is encrypted

4 facts extracted
PKCE auth, CSV export, rate limit, offline sync
AES-256-GCM encrypted
Zero-knowledge — server sees only ciphertext
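
For readers who want to see what "server sees only ciphertext" can look like in practice, here is a minimal sketch of client-side AES-256-GCM encryption. It assumes the third-party Python `cryptography` package. The helper names (`encrypt_blob`, `decrypt_blob`), the nonce-prefixed blob layout, and the choice to bind the filename as associated data are illustrative assumptions, not ExoVault's documented format or key handling.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_BYTES = 12  # 96-bit nonce, the size recommended for GCM


def encrypt_blob(key: bytes, plaintext: bytes, filename: str) -> bytes:
    """Encrypt with AES-256-GCM and return nonce || ciphertext || tag.

    Hypothetical helper: the blob layout is an assumption for illustration.
    """
    nonce = os.urandom(NONCE_BYTES)  # must never repeat for the same key
    # The filename is authenticated but not encrypted (associated data),
    # so a blob cannot be silently re-attached to a different file name.
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, filename.encode())
    return nonce + ciphertext


def decrypt_blob(key: bytes, blob: bytes, filename: str) -> bytes:
    """Reverse encrypt_blob; raises InvalidTag if anything was altered."""
    nonce, ciphertext = blob[:NONCE_BYTES], blob[NONCE_BYTES:]
    return AESGCM(key).decrypt(nonce, ciphertext, filename.encode())


# The key is generated and kept on the client; only `blob` would be uploaded.
key = AESGCM.generate_key(bit_length=256)
fact = b"The API rate limit should be 100 requests per second"
blob = encrypt_blob(key, fact, "product-review-Q1.mp4")
```

In this sketch the server would store `blob` and nothing else. Without `key` it can neither read the content nor alter it undetected, because GCM authenticates as well as encrypts.
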
03 · Any agent with vault access finds it instantly

Weeks later, a different agent connected to the same vault searches for the topic

> search_memories "rate limit"
fact · 0.94

"The API rate limit should be 100 requests per second"

product-review-Q1.mp4 · extracted 2 weeks ago

decision · 0.91

"Mobile offline sync target is March"

product-review-Q1.mp4 · related decision

+ 2 more linked memories from this recording
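
The results above carry a kind, a relevance score, the matched text, and the source recording. As a rough sketch of how an agent might handle such hits once it receives them, the snippet below models them with a hypothetical `MemoryHit` record and a hypothetical `top_hits` filter. The field names and the 0.9 threshold are assumptions, not a documented response schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MemoryHit:
    """One search result; field names are assumed for illustration."""
    kind: str     # e.g. "fact" or "decision", as labeled in the results
    score: float  # relevance score as shown next to the label
    text: str     # the extracted statement
    source: str   # the file the statement was extracted from


def top_hits(hits: List[MemoryHit], min_score: float = 0.9,
             kind: Optional[str] = None) -> List[MemoryHit]:
    """Keep hits at or above min_score (optionally one kind), best first."""
    selected = [
        h for h in hits
        if h.score >= min_score and (kind is None or h.kind == kind)
    ]
    return sorted(selected, key=lambda h: h.score, reverse=True)


# The two hits shown in the walkthrough, deliberately out of order.
hits = [
    MemoryHit("decision", 0.91, "Mobile offline sync target is March",
              "product-review-Q1.mp4"),
    MemoryHit("fact", 0.94,
              "The API rate limit should be 100 requests per second",
              "product-review-Q1.mp4"),
]

for hit in top_hits(hits):
    print(f"{hit.kind} {hit.score:.2f} {hit.text!r} ({hit.source})")
```
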
VAULT · MULTIMODAL RECALL
02

Upload Any Format

MP4, MP3, WAV, PNG, JPG, PDF, Markdown — ExoVault accepts them all. Every file is encrypted before storage.
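
One way a client could decide how to treat each upload is to map the file extension to a modality before encrypting it. The sketch below covers only the formats listed above. The modality labels and the `modality_for` helper are hypothetical names chosen for illustration, and `.md` is assumed to be the Markdown extension.

```python
from pathlib import Path

# Extensions taken from the list above; the modality labels are assumptions.
MODALITY_BY_EXTENSION = {
    ".mp4": "video",
    ".mp3": "audio",
    ".wav": "audio",
    ".png": "image",
    ".jpg": "image",
    ".pdf": "document",
    ".md": "document",
}


def modality_for(filename: str) -> str:
    """Return the modality for a file name, ignoring extension case."""
    suffix = Path(filename).suffix.lower()
    try:
        return MODALITY_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"unsupported file type: {suffix or filename!r}") from None


print(modality_for("PRODUCT-REVIEW-Q1.MP4"))
```
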

03

Automatic Extraction

Gemini 2.5 Flash extracts full audio transcriptions and visual descriptions from video. Every spoken word becomes searchable.

04

Search What Was Said

"Who was the richest person in 1743?" — if it was said in a video, ExoVault finds it. Semantic search works across all modalities.

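Semantic search is commonly built by comparing embedding vectors rather than matching keywords. The toy sketch below ranks memories by cosine similarity. The three-dimensional vectors are hand-written stand-ins for real embeddings, and nothing here is meant to describe how ExoVault computes or stores its vectors.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec: Sequence[float],
         memories: List[Tuple[str, Sequence[float]]]) -> List[Tuple[float, str]]:
    """Return (score, text) pairs, most similar first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in memories]
    return sorted(scored, reverse=True)


# Toy vectors: imagine the axes loosely mean "API", "mobile", "dashboard".
memories = [
    ("The API rate limit should be 100 requests per second", [0.9, 0.1, 0.0]),
    ("Mobile offline sync target is March", [0.1, 0.9, 0.2]),
    ("Dashboard: add CSV export", [0.0, 0.2, 0.9]),
]
query = [1.0, 0.2, 0.0]  # stand-in for an embedded "rate limit" query

for score, text in rank(query, memories):
    print(f"{score:.2f}  {text}")
```

Because the comparison happens on vectors, the same ranking step works whether the text came from a transcript, an image description, or a typed note.
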
A codex worth keeping.

Free to start. Encrypted always. Connect your first agent in under a minute.

ExoVault · Multimodal Memory
Read the manual →