Concepts
Memory lifecycle, fact structure, tiered memory, predicate routing, scoping, conflict detection, temporal resolution, and the document system.
How CLAIV Memory works
CLAIV Memory is a context engine, not a chatbot. You ingest events; when your LLM needs context about a user, you call recall. CLAIV returns structured context (llm_context.text + answer_facts) that you inject into your LLM's prompt. Your LLM generates the final reply.
Data flow
Ingest
Your app sends messages to /v6/ingest. Returns event_id immediately. Fact extraction runs asynchronously (1–5 s).
Enrich (async)
A background worker runs the 5-step pipeline: Extract → Map → Gate → Embed → Tier. It produces structured Facts, detects conflicts, and builds Episode summaries.
Recall
Before your LLM call, ask for context with a query. Pass conversation_id on every recall call — it is required and drives conversation history and working memory. Optionally pass document_id or collection_id to route through the document memory system. CLAIV returns llm_context.text, answer_facts, and routing metadata.
Generate
Inject the returned context into your system prompt. Your LLM generates the final response.
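The four steps above can be sketched as payload builders and a prompt assembler. This is a minimal illustration, not an official client: the field names (`conversation_id`, `type`, `role`, `content`, `query`, `document_id`, `llm_context.text`) follow this page, while the function names and prompt format are assumptions.

```python
def build_ingest_payload(conversation_id, role, content, event_type="message"):
    """Payload for POST /v6/ingest; the call returns an event_id immediately."""
    return {
        "conversation_id": conversation_id,
        "type": event_type,
        "role": role,
        "content": content,
    }

def build_recall_payload(conversation_id, query, document_id=None):
    """Payload for POST /v6/recall; conversation_id is required on every call."""
    payload = {"conversation_id": conversation_id, "query": query}
    if document_id is not None:
        payload["document_id"] = document_id  # routes through the document memory system
    return payload

def build_system_prompt(base_prompt, recall_response):
    """Inject llm_context.text into the system prompt before the LLM call."""
    context = recall_response["llm_context"]["text"]
    return f"{base_prompt}\n\n# Memory context\n{context}"
```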
Async enrichment pipeline
After /v6/ingest returns, a background worker runs five sequential steps:
- Step 1 (Extract): an LLM reads the message and produces proposition cards — subject-predicate-object triples with verbatim evidence spans.
- Step 2 (Map): each proposition is mapped to a catalog predicate (e.g. prefers_framework, has_deadline, works_as).
- Step 3 (Gate): a utility score filters low-value propositions before storage.
- Step 4 (Embed): high-utility facts are embedded for vector similarity search.
- Step 5 (Tier): facts are assigned a hot, warm, or cold tier based on importance score and access frequency.
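The Gate and Tier steps can be sketched in a few lines. The `Proposition` shape, the 0.3 utility threshold, and the tier cutoffs below are illustrative assumptions, not the service's actual values.

```python
from dataclasses import dataclass

@dataclass
class Proposition:
    subject: str
    predicate: str      # catalog predicate, assigned in the Map step
    object_text: str
    source_text: str    # verbatim evidence span from the Extract step
    utility: float = 0.0

def gate(propositions, threshold=0.3):
    """Step 3 (Gate): drop low-value propositions before storage."""
    return [p for p in propositions if p.utility >= threshold]

def tier_for(importance):
    """Step 5 (Tier): assign hot/warm/cold from importance (illustrative cutoffs)."""
    if importance >= 0.8:
        return "hot"
    if importance >= 0.3:
        return "warm"
    return "cold"
```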
If you call /v6/recall immediately after ingest, the newly ingested event may not appear yet. In most production flows this is not a problem — recall happens on the next user interaction.
Memory block types
The enrichment pipeline produces several block types. Each block has a type, content, source_ids, score, and evidence.
| Type | Description |
|---|---|
| fact | A structured subject + relation + object extracted from a message, backed by a verbatim evidence quote, e.g. "user uses_technology: React". |
| conflict | An unresolved contradiction between two facts. Surfaced during recall so your LLM can ask the user for clarification. |
| episode | A conversation-segment summary capturing the gist of a multi-turn exchange. Scoped to the thread where it was created. |
Fact structure
Every piece of information CLAIV stores is a Fact — a structured object extracted from an ingested event. Facts are the atomic unit of memory.
| Field | Type | Description |
|---|---|---|
| subject | string | Who the fact is about — typically "user" or a named entity. |
| predicate | string | Catalog predicate (e.g. prefers_framework, has_deadline). |
| object_text | string | The fact value (e.g. "React", "2026-04-01"). |
| source_text | string | Verbatim quote from the original message — a character-exact evidence span. |
| confidence | 0.0–1.0 | Extraction model confidence score. |
| importance | 0.0–1.0 | Utility score — determines tier placement and recall priority. |
| tier | hot / warm / cold | Memory tier assigned during enrichment. |
| superseded_at | timestamp or null | Set when a newer fact replaces this one — preserves temporal history. |
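The fields above map naturally onto a small data type. This class is an illustration of the fact shape, not a client type shipped by CLAIV.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Fact:
    subject: str                  # e.g. "user"
    predicate: str                # catalog predicate, e.g. "prefers_framework"
    object_text: str              # fact value, e.g. "React"
    source_text: str              # verbatim evidence quote
    confidence: float             # 0.0-1.0 extraction confidence
    importance: float             # 0.0-1.0 utility score
    tier: str = "warm"            # "hot", "warm", or "cold"
    superseded_at: Optional[str] = None  # set when a newer fact replaces this one
```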
Tiered memory
Every fact is assigned to one of three tiers based on its importance score and access patterns. The tier controls how the fact surfaces during recall.
Hot tier — always recalled
Facts with high importance scores (identity, persistent preferences, active goals) are placed in hot tier. They appear in every recall response regardless of the query, populating background_context.
Example predicates: is_named, works_as, primary_language
Warm tier — recalled by relevance
Facts with moderate importance are embedded and retrieved via semantic vector search when the recall query is relevant. Warm results populate answer_facts and supporting_facts.
Example predicates: prefers_framework, has_deadline, reported_issue
Cold tier — long-term archive
Superseded facts and low-utility extractions move to cold storage. They do not surface in standard recall but remain queryable for temporal history. They preserve the change history without polluting active context.
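The three tier behaviors at recall time can be summarized as: hot facts always surface, warm facts surface only when semantically relevant, cold facts never surface in standard recall. The sketch below assumes a similarity function and a 0.5 relevance cutoff; both are illustrative.

```python
def recall_candidates(facts, query_similarity):
    """facts: list of (tier, fact_id) pairs; query_similarity: fact_id -> float."""
    selected = []
    for tier, fact_id in facts:
        if tier == "hot":
            selected.append(fact_id)          # always recalled, regardless of query
        elif tier == "warm" and query_similarity(fact_id) >= 0.5:
            selected.append(fact_id)          # recalled by semantic relevance
        # cold: skipped in standard recall, queryable for temporal history
    return selected
```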
Scoping with conversation_id
In V6.4, conversation_id is a required field on both /v6/ingest and /v6/recall. It is the single unified scoping key for both read and write operations.
At ingest time — write scoping
Groups events into a named conversation. All events ingested with the same conversation_id form a conversation that can be referenced for history retrieval and episode summarisation.
Episode summaries are conversation-scoped. Facts, however, persist globally for the user — a preference extracted in one conversation is available when recalling in any other conversation for the same user.
At recall time — read scoping
Enables working memory and conversation history retrieval. The conversation_id tells CLAIV which conversation's recent context to surface, along with user-wide facts.
The recall response includes a conversation_history block containing recent turns and a working_memory block of active in-session context alongside the standard fact memory.
conversation_id on /v6/forget
Optionally pass conversation_id on /v6/forget to scope deletion to a single conversation's events and episodes only. Combine with project_id or a time range for finer-grained deletion.
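Combining the scoping keys for /v6/forget might look like the sketch below. The field names `conversation_id`, `project_id`, and a time range follow this page; treating `user_id` as the base scope and the exact `time_range` shape are assumptions.

```python
def build_forget_payload(user_id, conversation_id=None, project_id=None,
                         time_range=None):
    """Illustrative payload builder for POST /v6/forget."""
    payload = {"user_id": user_id}  # base scope: assumed field name
    if conversation_id:
        payload["conversation_id"] = conversation_id  # one conversation's events/episodes only
    if project_id:
        payload["project_id"] = project_id
    if time_range:
        payload["time_range"] = {"start": time_range[0], "end": time_range[1]}
    return payload
```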
Either a UUID or your app's internal session identifier works well as the conversation_id. The value must match exactly across all ingest and recall calls that belong to the same conversation.
Event types
The type field on /v6/ingest must be one of:
message
A user or assistant message in a conversation. The most common event type. Examples: chat turns, Q&A exchanges, instructions from the user.
tool_call
A tool or function call made by the AI assistant. Captures what tools were invoked and their results. Examples: API calls, database queries, function invocations.
app_event
An application-level event: navigation, purchases, feature usage, or any custom event your app emits. Examples: page views, button clicks, purchases, settings changes.
Role-aware extraction
The role field determines how the enrichment pipeline processes the event content:
| Role | Extracts facts? | Resolves conflicts? |
|---|---|---|
user | Yes | Yes — user statements are the source of truth |
assistant | Yes | No — assistant assertions cannot overrule user assertions |
system | Skipped | N/A — system messages are not processed for memory |
tool | Yes | No — treated like assistant |
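The role table above reduces to a simple lookup, sketched here as a pair of booleans (extracts facts, resolves conflicts):

```python
ROLE_POLICY = {
    "user":      (True,  True),   # user statements are the source of truth
    "assistant": (True,  False),  # cannot overrule user assertions
    "tool":      (True,  False),  # treated like assistant
    "system":    (False, False),  # not processed for memory
}

def extraction_policy(role):
    """Return (extracts_facts, resolves_conflicts) for a given event role."""
    return ROLE_POLICY.get(role, (False, False))
```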
Conflict detection
When the enrichment pipeline detects a new assertion that contradicts an existing one, CLAIV handles it automatically:
1. Both assertions marked contested
Neither assertion is deleted. Both are marked contested so the system knows there is an unresolved conflict.
2. Conflict record created
A conflict record is created linking the two contradicting assertions, preserving both sides of the disagreement.
3. Recall surfaces a conflict block
When you call recall, CLAIV surfaces a conflict memory block. Your LLM can use this to ask the user for clarification rather than silently choosing one version.
4. Auto-resolves on clarification
When the user clarifies (e.g. "Actually I switched to light mode"), a new user-role event is ingested. The pipeline detects the resolution, promotes the winning assertion, and retires the loser to cold tier.
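The four-step lifecycle above can be sketched with plain dicts. The record shapes are illustrative; only the contested/conflict/cold-tier behavior follows this page.

```python
def detect_conflict(existing, incoming):
    """Steps 1-2: mark both assertions contested and link them in a conflict record."""
    existing["contested"] = True
    incoming["contested"] = True
    return {"a": existing, "b": incoming, "resolved": False}

def resolve_conflict(conflict, winner):
    """Step 4: on user clarification, promote the winner and retire the loser."""
    loser = conflict["b"] if winner is conflict["a"] else conflict["a"]
    winner["contested"] = False
    loser["contested"] = False
    loser["tier"] = "cold"        # retired, but preserved for temporal history
    conflict["resolved"] = True
    return conflict
```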
Predicate routing
Before executing search, CLAIV analyses the recall query to identify what kinds of facts are needed. Multiple search channels run in parallel:
Search channels (parallel)
Predicate match: Direct lookup for facts whose predicate matches the query intent (e.g. a query about "framework preferences" targets prefers_framework).
Vector similarity: Embedding-based search across warm-tier facts for semantic relevance.
Temporal range: Date-aware search when the query contains temporal expressions ("last week", "next month") resolved to absolute ranges.
Keyword: Cross-entity keyword matching for proper nouns and project names not well-captured by embeddings.
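Channel selection might look like the sketch below. The trigger heuristics (a regex for temporal phrases, precomputed predicate hits, a proper-noun flag) are simplified stand-ins for CLAIV's actual query classifier.

```python
import re

def select_channels(query, predicate_hits=(), has_proper_noun=False):
    """Return the set of search channels to run in parallel for a recall query."""
    channels = {"vector"}                      # always search warm-tier embeddings
    if predicate_hits:
        channels.add("predicate")              # direct catalog-predicate lookup
    if re.search(r"\b(last|next)\s+(week|month|year)\b", query):
        channels.add("temporal")               # resolved to an absolute date range
    if has_proper_noun:
        channels.add("keyword")                # proper nouns, project names
    return channels
```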
The routing object in the recall response describes how the query was classified and which channels ran. This is useful for debugging unexpected results — check routing.mode and routing.predicates first.
Temporal resolution
CLAIV resolves relative date expressions in both ingested content and recall queries into absolute timestamps. Every resolved range includes a temporal_range with start and end dates.
Anchor time
The reference time is the anchor for resolving expressions like "next week" or "last month". On recall, it defaults to server time but can be overridden with the reference_time parameter.
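Resolving a relative expression against an anchor time can be sketched as below. CLAIV's actual temporal grammar is richer; the two expressions and the 30-day month are illustrative.

```python
from datetime import datetime, timedelta

def resolve_relative(expression, anchor):
    """Resolve a relative date expression to an absolute (start, end) range."""
    if expression == "last week":
        return (anchor - timedelta(days=7), anchor)
    if expression == "next month":
        return (anchor, anchor + timedelta(days=30))  # simplified month length
    raise ValueError(f"unsupported expression: {expression}")
```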
Fact supersession
When a new fact contradicts an older one (e.g. user updates their deadline), the old fact's superseded_at field is set. The old fact moves to cold tier, preserving history without polluting active recall.
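Supersession, sketched with plain dicts: the old fact keeps its data, gains a superseded_at timestamp, and drops to cold tier. The dict shape is illustrative.

```python
def supersede(old_fact, new_fact, now):
    """Retire old_fact in favor of new_fact, preserving temporal history."""
    old_fact["superseded_at"] = now
    old_fact["tier"] = "cold"          # removed from active recall, kept for history
    new_fact["superseded_at"] = None
    return new_fact
```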
Superseded facts remain queryable through the created_at or temporal_matches fields.
Document system
The document system provides RAG over structured reference material — product manuals, knowledge base articles, policy documents, and more. Documents are stored separately from event memory and are retrieved using one of four routing strategies.
Document structure: document → sections → spans
When you upload a document, CLAIV parses its content into a three-level tree:
Document — top-level node with metadata, name, and project scope.
Sections — structural segments derived from headings and paragraph groupings. Each section receives an LLM-generated distillation asynchronously.
Spans — fine-grained text chunks embedded with text-embedding-3-small and indexed synchronously. Available for recall immediately after upload returns.
Async distillations
After the upload API returns, CLAIV generates LLM summaries for each section and a holistic document distillation asynchronously. The upload response includes status: "processing" while distillations complete.
Distillations are used by the SECTION and DOCUMENT routing strategies. Span-level recall (LOCAL strategy) works immediately without waiting for distillations.
Re-upload semantics
Re-uploading a document with the same document_id is fully idempotent — all previous structure nodes, spans, and distillations are deleted before re-processing. This is the correct pattern for updating a document that has changed.
Routing strategies
| Strategy | Triggered by | Description |
|---|---|---|
LOCAL | Default (no document_id / collection_id) | Cosine similarity search across all spans in the project. Standard semantic chunk retrieval. |
SECTION | Named section in query | Identifies a target section by name and returns that section's distillation plus its spans. Useful for "show me the Introduction" queries. |
DOCUMENT | document_id on recall | Full document context: document distillation, all section distillations, and top-k spans. Falls back with [DOCUMENT UNDERSTANDING STATUS: DEGRADED] if distillations are still processing. |
COLLECTION | collection_id on recall | Tiered multi-document context across all documents in a collection. Returns per-document distillations and relevance-weighted spans. |
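The trigger column above amounts to a precedence check. The section-name detection below is a simplified stand-in for CLAIV's query analysis; the precedence order (collection over document) is an assumption.

```python
def document_strategy(query, document_id=None, collection_id=None,
                      known_sections=()):
    """Pick one of the four document routing strategies for a recall call."""
    if collection_id:
        return "COLLECTION"   # tiered multi-document context
    if document_id:
        return "DOCUMENT"     # full document context with distillations
    if any(name.lower() in query.lower() for name in known_sections):
        return "SECTION"      # named section's distillation plus its spans
    return "LOCAL"            # cosine similarity across all project spans
```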
Collections
Collections group documents for coordinated multi-document retrieval. Create a collection with POST /v6/collections, then add documents to it at upload time (via collection_id on POST /v6/documents) or later (via POST /v6/collections/:id/documents).
Deleting a collection does not delete the documents inside it — only the membership relationships are removed.
Documents are scoped to a project (project_id required). Recall automatically filters to the correct project. Delete a document with DELETE /v6/documents/:document_id — this removes all its spans, sections, distillations, and collection memberships.
Compliance deletion
/v6/forget deletes all memory for a user — or a scoped subset (by conversation_id, project_id, document_id, or time range). Every forget call returns a receipt_id and a deleted_counts breakdown of every entity removed. Store the receipt ID for your compliance records. The deleted_counts.chunks field is non-zero when document spans are also deleted.
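Handling a forget response for compliance records might look like the sketch below. The `receipt_id` and `deleted_counts` field names follow this page; `store_receipt` is a hypothetical callback into your own record keeping.

```python
def record_forget_receipt(response, store_receipt):
    """Persist the forget receipt and report whether document spans were deleted."""
    receipt_id = response["receipt_id"]
    counts = response["deleted_counts"]
    store_receipt(receipt_id, counts)          # keep for compliance records
    return counts.get("chunks", 0) > 0         # non-zero when spans were deleted
```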