CLAIV Memory Architecture
LLM-curated fact extraction, tiered storage, evidence traceability, temporal change tracking, and multi-channel recall — explained in full.
Memory Lifecycle
From raw text to ranked recall
Every message passes through a five-stage enrichment pipeline. Ingestion is synchronous — enrichment is fully async with no latency impact on your API calls.
Ingest
Raw conversation events enter via POST /v6/ingest. Messages, tool calls, and app events are accepted. The event is stored immediately — enrichment runs async in the background with no latency impact.
Extract
An LLM reads each message and produces proposition cards — subject-predicate-object triples, each anchored to a verbatim source_text span from the original content for full traceability.
Map
Each proposition is mapped to a catalog predicate using deterministic rules plus an LLM fallback. Produces consistent, queryable predicates like prefers_framework and has_deadline across all users.
Gate + Embed
A utility score determines storage worthiness. High-utility facts are embedded with text-embedding-3-small for semantic search. Generic noise is discarded — keeping the memory graph dense with signal.
Tier + Store
Facts are assigned hot, warm, or cold tiers based on importance score and access frequency. Tiering determines recall priority — hot facts always surface, warm facts require semantic relevance.
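The ingest step can be sketched as a small payload builder. Only the POST /v6/ingest endpoint and the three accepted event kinds come from this page; the field names (type, role, content) are illustrative assumptions, not the documented schema.

```python
import json

def build_ingest_event(role: str, content: str, event_type: str = "message") -> dict:
    """Build a request body for POST /v6/ingest (hypothetical field names)."""
    # The docs accept messages, tool calls, and app events; names assumed here.
    assert event_type in {"message", "tool_call", "app_event"}
    return {"type": event_type, "role": role, "content": content}

event = build_ingest_event("user", "I use React and TypeScript")
body = json.dumps(event)  # send as the JSON body of POST /v6/ingest
```

Because ingestion is synchronous and enrichment is async, the call returns as soon as the raw event is stored; no extraction work happens on the request path.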
Fact Structure
Subject-predicate-object triples
Every extracted fact is a structured triple with full evidence provenance — deterministic, inspectable, each linked back to the exact verbatim source text it was derived from. No black-box embeddings, no opaque blobs.
subject · The entity the fact is about · "user-123"
predicate · The relationship or property type · "prefers_framework"
object_text · The value or target entity · "React with TypeScript"
source_text · Character-exact verbatim quote from the message · "I use React and TypeScript"
kind · Semantic category of the fact · "preference"
tier · Recall priority: hot / warm / cold · "hot"
{
  "fact_id": "fact_8x7k2m",
  "subject": "user-123",
  "kind": "preference",
  "predicate": "prefers_framework",
  "object_text": "React with TypeScript",
  "relation_phrase": "prefers working with",
  "source_text": "I use React and TypeScript",
  "source_event_id": "evt_abc123",
  "confidence": 0.95,
  "importance": 0.85,
  "tier": "hot",
  "created_at": "2026-03-04T10:00:00Z",
  "temporal_edges": []
}

Evidence traceability
The source_text field is a character-exact verbatim span from the original message — not a paraphrase, not a summary. Every fact can be audited back to the exact words that produced it.
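The traceability guarantee above is mechanically checkable: because source_text is a character-exact span, a simple substring test audits any fact against its source message. This is a minimal sketch of that audit, using the field names from the fact JSON.

```python
def audit_fact(fact: dict, original_message: str) -> bool:
    """True iff the fact's source_text appears verbatim in the source message."""
    return fact["source_text"] in original_message

fact = {
    "subject": "user-123",
    "predicate": "prefers_framework",
    "object_text": "React with TypeScript",
    "source_text": "I use React and TypeScript",
}
# Passes: the span is a character-exact quote, not a paraphrase.
assert audit_fact(fact, "Hey! I use React and TypeScript for everything.")
```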
Memory Tiers
Hot, warm, and cold storage
Every fact is assigned a tier based on importance score and access frequency. Tiers determine recall priority — ensuring the most relevant memories always surface first.
Hot · importance score 0.8 – 1.0
Always included in every recall response. No vector search needed — retrieved deterministically.
Core identity facts, critical preferences, active goals.
Warm · importance score 0.4 – 0.79
Included when semantically relevant via vector similarity to the current query.
Project details, recent decisions, situational context.
Cold · importance score 0.0 – 0.39
Only surfaced on highly specific queries. Archived but never deleted — superseded facts live here.
Historical preferences, superseded facts, old conversations.
Importance scores are computed by the LLM during extraction and can rise or fall over time — a fact promoted by repeated recall moves to a warmer tier. Superseded facts are demoted to cold but never deleted.
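The tier thresholds in the table above map directly to a cutoff function. The thresholds come from this page; the function itself is an illustrative sketch, not the service's actual code.

```python
def assign_tier(importance: float) -> str:
    """Map an importance score to a storage tier (thresholds from the docs)."""
    if importance >= 0.8:
        return "hot"   # always included in recall, no vector search needed
    if importance >= 0.4:
        return "warm"  # included when semantically relevant to the query
    return "cold"      # archived; surfaced only on highly specific queries
```

A fact at 0.79 is warm, not hot; promotion happens only when repeated recall pushes its score across the 0.8 boundary.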
Role-Aware Processing
Different roles, different extraction
CLAIV applies different extraction logic depending on who said what — treating user preferences, assistant commitments, system prompts, and tool outputs as distinct categories of memory.
user · Full extraction
Preferences, goals, biographical details, stated requirements, and opinions are all extracted as first-class facts attributed directly to the user.
assistant · Selective extraction
Commitments, recommendations, and action items made by the assistant are extracted. Filler, hedging, and acknowledgements are ignored.
system · Context metadata
System instructions are stored as context metadata and not extracted as user-attributed facts — preventing instruction contamination of user memory.
tool · Result extraction
Tool outputs (API responses, search results) are extracted as evidence-backed facts attributed to the source, not the user.
Temporal Change Tracking
Facts evolve. CLAIV tracks it all.
When a fact changes, CLAIV creates temporal edges linking old and new versions — preserving the full evolution history. Superseded facts move to cold storage and are never lost, still queryable with a date-range filter.
UPDATE · Fact's object value changes. Old version preserved with a temporal supersession edge pointing to the new version.
MERGE · Two related facts are combined into a richer single fact with higher confidence. Both source facts retained as history.
DELETE · Fact explicitly contradicted or retracted. Marked superseded and moved to cold tier — never purged.
NOOP · Re-ingested fact matches existing memory within the confidence threshold. The existing fact's confidence is reinforced; no duplicate is created.
Jan 15, 2026 · original
user prefers_framework Vue.js
→ SUPERSEDED · cold tier
UPDATE — Mar 2, 2026
user prefers_framework React with TypeScript
source_text: "I switched to React and TypeScript last month"
CURRENT — Active
user prefers_framework React with TypeScript
importance: 0.85 · tier: hot · confidence: 0.95
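The UPDATE shown in the timeline can be sketched as a pure bookkeeping step: demote the old fact to cold and link the two versions with a temporal edge. Field names follow the fact JSON earlier on this page; the edge shape and the superseded_by field are hypothetical.

```python
def apply_update(old_fact: dict, new_fact: dict) -> None:
    """Supersede old_fact with new_fact: demote to cold, link via a temporal edge."""
    old_fact["tier"] = "cold"                       # superseded, never deleted
    old_fact["superseded_by"] = new_fact["fact_id"]  # hypothetical back-reference
    new_fact.setdefault("temporal_edges", []).append(
        {"op": "UPDATE", "supersedes": old_fact["fact_id"]}
    )
```

Nothing is removed: the Vue.js fact in the timeline stays queryable via a date-range filter even after React with TypeScript becomes current.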
Multi-Channel Recall
Four search channels, one ranked result
POST /v6/recall runs all four search channels in parallel, merges results by relevance score, and returns both structured answer_facts and a pre-synthesized llm_context.text narrative ready to drop into any system prompt.
Predicate match
Exact match on structured predicate — prefers_framework, has_deadline, etc. Highest precision, zero hallucination.
Vector similarity
Semantic search over embedded facts using cosine similarity. Catches paraphrases and conceptually related memories.
Keyword search
Full-text BM25 search over object_text and source_text fields. Catches specific terms not covered by semantic similarity.
Temporal awareness
Superseded facts are excluded by default. Date-range filters let you query state at any point in history.
{
  "answer_facts": [
    {
      "predicate": "prefers_framework",
      "object_text": "React with TypeScript",
      "tier": "hot",
      "confidence": 0.95,
      "source_text": "I use React and TypeScript"
    },
    {
      "predicate": "has_deadline",
      "object_text": "Q2 2026",
      "tier": "warm",
      "confidence": 0.87
    }
  ],
  "llm_context": {
    "text": "The user prefers React with TypeScript and is working toward a Q2 2026 deadline. They have been using this stack since switching from Vue.js in March 2026.",
    "token_count": 42
  }
}

llm_context.text
A pre-synthesized narrative built from the top-ranked facts. Drop it straight into your system prompt with no extra LLM call; it is returned exactly as synthesized, never truncated or modified.
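Consuming the recall response looks like this in practice: extract llm_context.text and append it to your system prompt. The response dict mirrors the example JSON above; the prompt template itself is an assumption.

```python
def build_system_prompt(recall_response: dict, base_prompt: str) -> str:
    """Append the pre-synthesized memory narrative to a system prompt."""
    # llm_context.text is ready-made prose; no extra LLM call is needed.
    memory = recall_response["llm_context"]["text"]
    return f"{base_prompt}\n\nKnown user context:\n{memory}"

response = {"llm_context": {"text": "The user prefers React with TypeScript."}}
prompt = build_system_prompt(response, "You are a helpful assistant.")
```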
Ready to build with V6?
Start ingesting memories in minutes. The full API reference, quickstart guides, and playground are all available now.
5 · Pipeline stages · Ingest → Recall
3 · Memory tiers · Hot / Warm / Cold
4 · Recall channels · Parallel search
4 · Temporal ops · Full change history