attach_media

Attach a media file (image, PDF, audio, video) to a memory for multimodal embedding. The file is encrypted at rest and embedded via Gemini for cross-modal search. Accepts base64-encoded file content.

After upload, the file is automatically processed:

  1. Encrypted and stored in Supabase Storage
  2. Embedded via Gemini gemini-embedding-2-preview (3072-dimensional multimodal vector)
  3. Text extracted via Gemini 2.5 Flash (speech transcription, visual description, OCR)

The extracted text and embedding make the media searchable alongside text memories.

Supported Types

| Type      | Formats               | Max Size |
|-----------|-----------------------|----------|
| Images    | PNG, JPEG, WebP, GIF  | 20 MB    |
| Documents | PDF                   | 50 MB    |
| Audio     | MP3, WAV, OGG, WebM   | 50 MB    |
| Video     | MP4, WebM, QuickTime  | 20 MB    |

SVG is not supported. Per-user storage quota applies.
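
The limits in the table above can be checked on the client before any bytes are sent. The following is a minimal sketch, not the service's own validation: the function name `check_media` is hypothetical, the exact MIME strings the server accepts are assumed from the listed formats, and the limit is assumed to apply to the raw (pre-base64) file size with 1 MB taken as 2**20 bytes.

```python
from typing import Optional

MB = 1024 * 1024  # assumption: "MB" in the table means 2**20 bytes

# Size limits per category, taken from the Supported Types table.
MAX_BYTES = {
    "image": 20 * MB,
    "document": 50 * MB,
    "audio": 50 * MB,
    "video": 20 * MB,
}

# Assumed MIME strings for the listed formats; the server may accept
# other spellings (for example audio/x-wav) or reject some of these.
MIME_CATEGORY = {
    "image/png": "image",
    "image/jpeg": "image",
    "image/webp": "image",
    "image/gif": "image",
    "application/pdf": "document",
    "audio/mpeg": "audio",
    "audio/wav": "audio",
    "audio/ogg": "audio",
    "audio/webm": "audio",
    "video/mp4": "video",
    "video/webm": "video",
    "video/quicktime": "video",
}


def check_media(mime_type: str, size_bytes: int) -> Optional[str]:
    """Return None if the file looks acceptable, else a reason string."""
    category = MIME_CATEGORY.get(mime_type.lower())
    if category is None:
        return f"unsupported MIME type: {mime_type}"
    limit = MAX_BYTES[category]
    if size_bytes > limit:
        return f"{category} file exceeds {limit // MB} MB limit"
    return None
```

A check like this only avoids obviously doomed uploads; the per-user storage quota still has to be enforced by the server.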

Parameters

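| Parameter  | Type          | Required | Description                                        |
|------------|---------------|----------|----------------------------------------------------|
| fileBase64 | string        | required | Base64-encoded file content                        |
| fileName   | string        | required | Original file name with extension                  |
| mimeType   | string        | required | MIME type (e.g. image/png, video/mp4, audio/mpeg)  |
| memoryId   | string (UUID) | optional | Memory to attach this media to                     |
| vaultId    | string (UUID) | optional | Vault scope (defaults to defaultVaultId)           |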
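
As an illustration of how these parameters might be filled in from a file on disk, here is a minimal sketch using only the Python standard library. The helper name `build_attach_media_args` is hypothetical, and guessing the MIME type from the file extension is an assumption about what the caller wants; the service may expect a specific value.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Optional


def build_attach_media_args(
    path: str,
    memory_id: Optional[str] = None,
    vault_id: Optional[str] = None,
) -> dict:
    """Build the arguments object for attach_media from a local file."""
    file_path = Path(path)

    # Guess the MIME type from the extension (assumption: this matches
    # what the server expects for the file).
    mime_type, _ = mimetypes.guess_type(file_path.name)
    if mime_type is None:
        raise ValueError(f"cannot determine MIME type for {file_path.name}")

    args = {
        "fileBase64": base64.b64encode(file_path.read_bytes()).decode("ascii"),
        "fileName": file_path.name,
        "mimeType": mime_type,
    }
    # Optional parameters are left out entirely when not provided.
    if memory_id is not None:
        args["memoryId"] = memory_id
    if vault_id is not None:
        args["vaultId"] = vault_id
    return args
```

Omitting `vaultId` rather than sending an empty value lets the documented default (defaultVaultId) apply.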

Returns

Confirmation with attachment ID, storage path, and processing status.

Example

```json
{
  "tool": "attach_media",
  "arguments": {
    "fileBase64": "iVBORw0KGgo...",
    "fileName": "architecture-diagram.png",
    "mimeType": "image/png",
    "memoryId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
  }
}
```

Notes

  • Requires gateway mode (agent key). Not available in direct Supabase mode.
  • Embedding and text extraction happen asynchronously via Inngest after upload.
  • Video embedding is expensive ($0.00079/frame), which is why video files are capped at 20 MB.
  • Search results include an attachments array with extractedText when media matches.
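
To get a feel for the per-frame video cost mentioned above, the arithmetic can be sketched as follows. Only the $0.00079/frame figure comes from this page; the function name is hypothetical, and the frame sampling rate is an assumption (this page does not say how many frames per second are embedded), so it is left as a parameter.

```python
import math

COST_PER_FRAME_USD = 0.00079  # per-frame price quoted in the notes above


def estimate_video_embedding_cost(
    duration_seconds: float,
    frames_per_second: float = 1.0,  # assumption: sampling rate is not documented
) -> float:
    """Estimate the embedding cost in USD for a video of the given length."""
    frames = math.ceil(duration_seconds * frames_per_second)
    return frames * COST_PER_FRAME_USD


# Under the assumed 1 frame/second, a 60-second clip is 60 frames.
one_minute = estimate_video_embedding_cost(60)
print(f"{one_minute:.4f} USD")
```

Treat the result as an order-of-magnitude estimate until the actual sampling rate is known.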