Introducing CLAIV Echo: Multi-Model AI Chat with Persistent Memory
CLAIV Echo is a multi-model AI chat that connects GPT-5, Claude, Gemini, Llama, and DeepSeek into a single workspace — and remembers everything across every model, every conversation, indefinitely.
Today we are launching CLAIV Echo — a multi-model AI chat that brings GPT-5, Claude, Gemini, Llama, and DeepSeek together in one workspace, powered by CLAIV Memory V6. Echo is available now at echo.claiv.io.
The central idea is simple: every frontier model has different strengths, and switching between them should not mean starting from scratch. Echo gives every model access to the same persistent memory — so whether you are drafting with Claude Opus, running analysis with GPT-5, or exploring a math problem with DeepSeek R1, your context, preferences, and history follow you everywhere.
Persistent Memory Across Every Model
Most AI chat applications either use ephemeral context windows or bolt on a simplistic memory layer that stores raw text blobs. Echo uses CLAIV Memory V6 — a structured fact extraction pipeline that distils every conversation into subject-predicate-object triples, each with a verbatim source quote and importance tier. This means:
- Facts are structured, not raw chunks — they are queryable, auditable, and deduplicated.
- Memory is tiered — hot facts always surface, warm facts appear when relevant, cold facts are archived but never lost.
- Facts evolve correctly — when you change your mind, the old fact is preserved as history and the new fact becomes current. Nothing is silently overwritten.
- Memory is model-agnostic — Claude and GPT-5 both see the same synthesised context narrative from the same V6 recall response.
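The fact model described above can be sketched in a few lines. This is an illustrative data shape, not the documented V6 schema: the field names, tier labels, and the history link are assumptions drawn from the bullets above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Fact:
    """One extracted memory fact (illustrative shape, not the real schema)."""
    subject: str
    predicate: str
    obj: str
    source_quote: str                       # verbatim quote it was extracted from
    tier: str = "warm"                      # "hot" | "warm" | "cold"
    superseded_by: Optional["Fact"] = None  # newer fact, if any; history is kept

def revise(old: Fact, new: Fact) -> Fact:
    """Update a fact without losing history: the old fact stays, linked
    to its replacement, rather than being silently overwritten."""
    old.superseded_by = new
    return new

coffee = Fact("user", "prefers", "coffee", "I'm a coffee person")
tea = revise(coffee, Fact("user", "prefers", "tea", "I've switched to tea"))
```

The point of the `superseded_by` link is the third bullet: a changed preference produces a new current fact while the old one survives as queryable history.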
When you start a conversation in Echo, a POST /v6/recall call runs in the background, retrieving relevant facts and producing the llm_context.text narrative that is injected into the system prompt. The model receives a pre-synthesised paragraph — not a raw list of triples — which means memory integrates naturally into the conversation without artefacts.
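The injection step can be illustrated with a simulated recall response. Only `llm_context.text` is named above; the rest of the response shape here is an assumption, and the prompt wording is purely illustrative.

```python
# Simulated shape of a POST /v6/recall response. Only llm_context.text
# is named in the post; everything else here is an assumed placeholder.
recall_response = {
    "llm_context": {
        "text": "The user prefers concise answers, works mostly in Python, "
                "and is currently drafting a product launch post."
    }
}

def build_system_prompt(base_prompt: str, recall: dict) -> str:
    """Inject the pre-synthesised narrative into the system prompt.
    The model receives one paragraph, not a raw list of triples."""
    narrative = recall["llm_context"]["text"]
    return f"{base_prompt}\n\nRelevant memory:\n{narrative}"

prompt = build_system_prompt("You are a helpful assistant.", recall_response)
```

Because the narrative is already synthesised server-side, the same string can be dropped into any provider's system prompt unchanged, which is what makes the memory model-agnostic.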
Every Major Frontier Model, One Workspace
Echo launches with 15 models across four providers:
- OpenAI — GPT-5 Nano (free tier), GPT-5 Mini, GPT-5, GPT-5.2, and GPT-5.4 with 1M context and computer use capability.
- Anthropic — Claude Haiku 4.5 for fast structured tasks, Claude Sonnet 4.6 for long-form writing and analysis, and Claude Opus 4.6 for demanding expert-level work.
- Google — Gemini 2.5 Flash-Lite and Flash for high-speed document work, Gemini 2.5 Pro for multimodal reasoning, and Gemini 3.1 Pro with a 1M context window.
- Open Source — Llama 3.3 70B for auditable AI workflows, DeepSeek R1 for chain-of-thought reasoning and math, and DeepSeek V3 for general-purpose tasks with a 256k context window.
You can switch models mid-conversation. Memory and context carry across every switch — there is no context reset when you move from Claude to GPT-5.
Echo Tokens: One Currency for Every Model
Rather than exposing raw per-token pricing from four different providers, Echo uses Echo Tokens (ET) — a single unified billing unit. Every message costs a flat number of ET determined by the model used:
- Economy models (Nano, Flash-Lite, DeepSeek V3) → 4 ET / message
- Standard models (Mini, Flash, Llama 3.3, R1) → 5 ET / message
- Claude Haiku 4.5 → 6 ET / message
- GPT-5 / Gemini 2.5 Pro → 9 ET / message
- Gemini 3.1 Pro → 10 ET / message
- GPT-5.2 → 11 ET / message
- Claude Sonnet 4.6 → 12 ET / message
- GPT-5.4 → 13 ET / message
- Claude Opus 4.6 → 18 ET / message
- Online search mode (any model) → +5 ET / message
Flat per-message pricing means you always know what a message costs before you send it. No surprise bills from unexpectedly long responses, no hidden infrastructure fees. Memory is included in every plan at no extra ET cost.
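The pricing model above is simple enough to state as a lookup. This sketch encodes the published rates; the model keys are informal shorthand, not official identifiers.

```python
# Flat per-message ET rates from the table above.
# Keys are informal shorthand, not official model identifiers.
ET_PER_MESSAGE = {
    "gpt-5-nano": 4, "gemini-2.5-flash-lite": 4, "deepseek-v3": 4,
    "gpt-5-mini": 5, "gemini-2.5-flash": 5, "llama-3.3-70b": 5, "deepseek-r1": 5,
    "claude-haiku-4.5": 6,
    "gpt-5": 9, "gemini-2.5-pro": 9,
    "gemini-3.1-pro": 10,
    "gpt-5.2": 11,
    "claude-sonnet-4.6": 12,
    "gpt-5.4": 13,
    "claude-opus-4.6": 18,
}
ONLINE_SURCHARGE = 5  # flat add-on for online search mode, any model

def message_cost(model: str, online: bool = False) -> int:
    """Return the ET cost of one message, known before it is sent."""
    return ET_PER_MESSAGE[model] + (ONLINE_SURCHARGE if online else 0)
```

For example, `message_cost("gpt-5", online=True)` is 14 ET: 9 for the model plus the 5 ET search surcharge.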
Plans and Token Bundles
Echo launches with four subscription tiers:
- Free — 500 ET / month — Economy models and GPT-5 Nano. Approximately 50–100 messages per month. No credit card required.
- Pro — $19/month, 5,000 ET — GPT-5 Mini, Claude Haiku 4.5, Gemini 2.5 Flash. Approximately 600–1,200 messages per month.
- Power — $49/month, 20,000 ET — GPT-5, GPT-5.2, Gemini 2.5 Pro, Claude Sonnet 4.6. Approximately 2,500–5,000 messages per month.
- Builder — $129/month, 60,000 ET — Every model including Claude Opus 4.6, GPT-5.4, Gemini 3.1 Pro, Llama 3.3 70B, DeepSeek R1 and V3. Approximately 8,000–15,000 messages per month.
For usage above your monthly allocation, four one-time token bundles are available: Spark (500 ET, $5), Boost (2,000 ET, $15), Surge (6,000 ET, $35), and Blaze (20,000 ET, $90). Bundles never expire and stack on top of your monthly allowance indefinitely.
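The bundle sizes imply a volume discount, which is easy to check with the published numbers:

```python
# One-time bundles from above, as (ET, price in USD).
BUNDLES = {
    "Spark": (500, 5),
    "Boost": (2000, 15),
    "Surge": (6000, 35),
    "Blaze": (20000, 90),
}

def usd_per_1000_et(bundle: str) -> float:
    """Effective price per 1,000 ET for a given bundle."""
    et, usd = BUNDLES[bundle]
    return round(usd / et * 1000, 2)
```

Per 1,000 ET, Spark works out to $10.00, Boost to $7.50, Surge to $5.83, and Blaze to $4.50, so larger bundles are roughly half the unit price of the smallest.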
Image Generation
Echo includes native image generation via OpenRouter. Generated images are stored in your gallery backed by Cloudflare R2 — you can browse, download, and reference them in future conversations. Image generation is available on Pro and above.
Documents and Project Workspaces
Echo ships with two organisational layers designed for serious AI-assisted work:
- Documents — Upload text files, PDFs, or raw content directly into a conversation. Documents are chunked and indexed via the CLAIV Memory V6 documents endpoint and surfaced automatically in relevant recall responses. No manual retrieval step required.
- Project Workspaces — Group related conversations, documents, and sessions into named projects. Memory is scoped to the project, so facts from one project do not bleed into unrelated ones. Switch between projects without losing context.
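The project scoping described above amounts to tagging everything you upload with a project identifier. This sketch shows a plausible upload payload; the field names are assumptions based on the description, not a documented API.

```python
# Hypothetical payload for the V6 documents endpoint. Field names are
# assumptions; only the project-scoping behaviour is described in the post.
def build_document_payload(project_id: str, filename: str, content: str) -> dict:
    return {
        "project_id": project_id,  # scopes recall so facts stay within the project
        "filename": filename,
        "content": content,        # raw text; chunking and indexing happen server-side
    }

payload = build_document_payload("proj-roadmap", "q3-plan.md", "# Q3 plan\n...")
```

Because recall is filtered by `project_id`, a document uploaded here surfaces only in conversations inside the same workspace.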
Online Search Mode
Enable online mode on any model to give it live web search access via OpenRouter. Results are grounded in current information and integrated into the response naturally. Online mode adds a flat 5 ET surcharge per message, regardless of which model you have selected.
Full Memory Control
Echo exposes the full CLAIV Memory control surface directly in the UI. You can view what Echo remembers about you at any time, delete individual facts, or trigger a full forget to remove all memory associated with your account. The forget operation returns a structured V6 deletion receipt documenting exactly what was removed — usable as a GDPR compliance record.
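A deletion receipt like the one described might look as follows. The post states only that the receipt is structured and documents what was removed; every field name and value below is an illustrative assumption.

```python
# Illustrative shape of a V6 deletion receipt; all field names and
# values are assumed placeholders, not the documented format.
receipt = {
    "operation": "forget",
    "user_id": "user-123",
    "facts_deleted": 42,
    "documents_deleted": 3,
    "completed_at": "2025-06-01T12:00:00Z",
}

def compliance_summary(r: dict) -> str:
    """Render the receipt as a one-line audit record, e.g. for a GDPR log."""
    return (f"{r['completed_at']}: forget for {r['user_id']} removed "
            f"{r['facts_deleted']} facts and {r['documents_deleted']} documents")
```

Keeping the receipt alongside the erasure request gives you a timestamped record of exactly what was deleted and when.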
How Echo Differs from Other Multi-Model Chat Apps
There are other multi-model frontends. The meaningful difference in Echo is the memory layer. Most alternatives either have no persistent memory at all, or implement it as a raw text log injected into the context window. That approach has hard limits: you can only inject as much as the context window allows, older context gets dropped, and there is no deduplication, conflict resolution, or temporal tracking.
V6 memory is different in kind. The llm_context.text narrative injected into each Echo conversation is a dense synthesis of the facts most relevant to the current query — not a raw log of past messages. It scales indefinitely as you use Echo more, costs a fixed number of tokens per recall regardless of how much history you have accumulated, and works identically whether you are using GPT-5 or Claude Opus.
Get Started
CLAIV Echo is available now. The Free plan requires no credit card. Sign up at echo.claiv.io and start your first conversation. If you are already building with CLAIV Memory, Echo is the fastest way to see V6 memory working end-to-end in a real product.