V6 Architecture

CLAIV Memory Architecture

LLM-curated fact extraction, tiered storage, evidence traceability, temporal change tracking, and multi-channel recall — explained in full.


Memory Lifecycle

From raw text to ranked recall

Every message passes through a five-stage enrichment pipeline. Ingestion is synchronous — enrichment is fully async with no latency impact on your API calls.

01

Ingest

Raw conversation events enter via POST /v6/ingest. Messages, tool calls, and app events are accepted. The event is stored and acknowledged immediately; enrichment runs asynchronously in the background.

02

Extract

An LLM reads each message and produces proposition cards — subject-predicate-object triples, each anchored to a verbatim source_text span from the original content for full traceability.

03

Map

Each proposition is mapped to a catalog predicate using deterministic rules plus an LLM fallback. This produces consistent, queryable predicates like prefers_framework and has_deadline across all users.

04

Gate + Embed

A utility score determines storage worthiness. High-utility facts are embedded with text-embedding-3-small for semantic search. Generic noise is discarded — keeping the memory graph dense with signal.

05

Tier + Store

Facts are assigned hot, warm, or cold tiers based on importance score and access frequency. Tiering determines recall priority: hot facts always surface, warm facts require semantic relevance, and cold facts surface only on highly specific queries.
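The entry point of the pipeline can be sketched as a payload builder for POST /v6/ingest. Only the endpoint path and the accepted event kinds (messages, tool calls, app events) come from the description above; the field names (`subject`, `event`, `type`, `role`, `content`) are illustrative assumptions, not the documented schema.

```python
import json

def build_ingest_payload(subject: str, role: str, content: str) -> dict:
    """Hypothetical payload for POST /v6/ingest. Field names are
    assumptions for illustration; only the endpoint path and the
    accepted event kinds come from the docs above."""
    return {
        "subject": subject,        # entity the memory belongs to
        "event": {
            "type": "message",     # messages, tool calls, and app events accepted
            "role": role,          # user / assistant / system / tool
            "content": content,
        },
    }

payload = build_ingest_payload("user-123", "user", "I use React and TypeScript")
print(json.dumps(payload, indent=2))
```

The call returns as soon as the event is stored; extraction, mapping, gating, and tiering all happen afterwards in the background.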

Fact Structure

Subject-predicate-object triples

Every extracted fact is a structured triple with full evidence provenance — deterministic, inspectable, each linked back to the exact verbatim source text it was derived from. No black-box embeddings, no opaque blobs.

subject

The entity the fact is about

"user-123"

predicate

The relationship or property type

"prefers_framework"

object_text

The value or target entity

"React with TypeScript"

source_text

Character-exact verbatim quote from message

"I use React and TypeScript"

kind

Semantic category of the fact

"preference"

tier

Recall priority: hot / warm / cold

"hot"

V6 fact object
{
  "fact_id":        "fact_8x7k2m",
  "subject":        "user-123",
  "kind":           "preference",
  "predicate":      "prefers_framework",
  "object_text":    "React with TypeScript",
  "relation_phrase":"prefers working with",
  "source_text":    "I use React and TypeScript",
  "source_event_id":"evt_abc123",
  "confidence":     0.95,
  "importance":     0.85,
  "tier":           "hot",
  "created_at":     "2026-03-04T10:00:00Z",
  "temporal_edges": []
}

Evidence traceability

The source_text field is a character-exact verbatim span from the original message — not a paraphrase, not a summary. Every fact can be audited back to the exact words that produced it.
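Because source_text is a character-exact span, the audit is a plain substring check. A minimal sketch of such an audit helper (`is_traceable` is a hypothetical name, not part of the API):

```python
# Audit helper: confirm a fact's source_text is a character-exact
# verbatim span of the original message, per the traceability
# guarantee above.
def is_traceable(fact: dict, original_message: str) -> bool:
    return fact["source_text"] in original_message

fact = {
    "predicate":   "prefers_framework",
    "object_text": "React with TypeScript",
    "source_text": "I use React and TypeScript",
}
message = "Quick note: I use React and TypeScript for everything now."
print(is_traceable(fact, message))  # True: the span appears verbatim
```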

Memory Tiers

Hot, warm, and cold storage

Every fact is assigned a tier based on importance score and access frequency. Tiers determine recall priority — ensuring the most relevant memories always surface first.

Hot · 0.8 – 1.0

importance score

Always included in every recall response. No vector search needed — retrieved deterministically.

Core identity facts, critical preferences, active goals

Warm · 0.4 – 0.79

importance score

Included when semantically relevant via vector similarity to the current query.

Project details, recent decisions, situational context

Cold · 0.0 – 0.39

importance score

Only surfaced on highly specific queries. Archived but never deleted — superseded facts live here.

Historical preferences, superseded facts, old conversations

Importance scores are computed by the LLM during extraction and can rise or fall over time — a fact promoted by repeated recall moves to a warmer tier. Superseded facts are demoted to cold but never deleted.
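The score-to-tier mapping above is a simple threshold function. A sketch using exactly the published ranges (hot 0.8–1.0, warm 0.4–0.79, cold 0.0–0.39):

```python
# Tier assignment from the importance score, using the ranges above.
def assign_tier(importance: float) -> str:
    if importance >= 0.8:
        return "hot"       # always recalled, no vector search needed
    if importance >= 0.4:
        return "warm"      # recalled when semantically relevant
    return "cold"          # surfaced only on highly specific queries

print(assign_tier(0.85))  # hot
print(assign_tier(0.50))  # warm
print(assign_tier(0.10))  # cold
```

Since importance can rise or fall over time, re-running this mapping after a score change is what promotes or demotes a fact between tiers.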

Role-Aware Processing

Different roles, different extraction

CLAIV applies different extraction logic depending on who said what — treating user preferences, assistant commitments, system prompts, and tool outputs as distinct categories of memory.

user

Full extraction

Preferences, goals, biographical details, stated requirements, and opinions are all extracted as first-class facts attributed directly to the user.

assistant

Selective extraction

Commitments, recommendations, and action items made by the assistant are extracted. Filler, hedging, and acknowledgements are ignored.

system

Context metadata

System instructions are stored as context metadata and not extracted as user-attributed facts — preventing instruction contamination of user memory.

tool

Result extraction

Tool outputs (API responses, search results) are extracted as evidence-backed facts attributed to the source, not the user.
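The four roles above amount to a dispatch table from role to extraction mode. A sketch of that policy (the mode names and the fallback-to-metadata default are assumptions for illustration):

```python
# Role-aware extraction policy, mirroring the four cases above.
EXTRACTION_POLICY = {
    "user":      "full",       # preferences, goals, bio, requirements
    "assistant": "selective",  # commitments and action items only
    "system":    "metadata",   # stored as context, never user-attributed
    "tool":      "result",     # evidence-backed, attributed to the source
}

def extraction_mode(role: str) -> str:
    # Assumed conservative default: treat unknown roles like system
    # context rather than extracting user-attributed facts from them.
    return EXTRACTION_POLICY.get(role, "metadata")

print(extraction_mode("assistant"))  # selective
```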

Temporal Change Tracking

Facts evolve. CLAIV tracks it all.

When a fact changes, CLAIV creates temporal edges linking old and new versions — preserving the full evolution history. Superseded facts move to cold storage and are never lost, still queryable with a date-range filter.

UPDATE

Fact's object value changes. Old version preserved with a temporal supersession edge pointing to the new version.

MERGE

Two related facts are combined into a richer single fact with higher confidence. Both source facts retained as history.

DELETE

Fact explicitly contradicted or retracted. Marked superseded and moved to cold tier — never purged.

NOOP

Re-ingested fact matches existing memory within the confidence threshold. The existing fact's confidence is reinforced; no duplicate is created.

temporal evolution example

Jan 15, 2026 · original

user prefers_framework Vue.js

→ SUPERSEDED · cold tier

UPDATE — Mar 2, 2026

user prefers_framework React with TypeScript

source_text: "I switched to React and TypeScript last month"

CURRENT — Active

user prefers_framework React with TypeScript

importance: 0.85 · tier: hot · confidence: 0.95
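The UPDATE path in the example above can be sketched as a supersession function: the old fact is demoted to cold and linked to its replacement via a temporal edge, never deleted. The edge shape (`superseded_by`, `target`, `at`) and the version-suffix id scheme are illustrative assumptions.

```python
from datetime import datetime, timezone

# UPDATE sketch: create the new fact, demote the old one to cold,
# and link old -> new with a temporal supersession edge.
def supersede(old: dict, new_object_text: str, source_text: str) -> dict:
    new = {
        "fact_id":        old["fact_id"] + "_v2",  # illustrative id scheme
        "predicate":      old["predicate"],
        "object_text":    new_object_text,
        "source_text":    source_text,
        "tier":           "hot",
        "temporal_edges": [],
    }
    old["tier"] = "cold"          # archived, still queryable by date range
    old["superseded"] = True
    old["temporal_edges"].append({
        "type":   "superseded_by",
        "target": new["fact_id"],
        "at":     datetime.now(timezone.utc).isoformat(),
    })
    return new

vue = {"fact_id": "fact_1", "predicate": "prefers_framework",
       "object_text": "Vue.js", "tier": "hot", "temporal_edges": []}
react = supersede(vue, "React with TypeScript",
                  "I switched to React and TypeScript last month")
print(vue["tier"], "->", react["object_text"])
```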

Multi-Channel Recall

Four search channels, one ranked result

POST /v6/recall runs all four search channels in parallel, merges results by relevance score, and returns both structured answer_facts and a pre-synthesized llm_context.text narrative ready to drop into any system prompt.

Predicate match

Exact match on structured predicate — prefers_framework, has_deadline, etc. Highest precision, zero hallucination.

Vector similarity

Semantic search over embedded facts using cosine similarity. Catches paraphrases and conceptually related memories.

Keyword search

Full-text BM25 search over object_text and source_text fields. Catches specific terms not covered by semantic similarity.

Temporal awareness

Superseded facts are excluded by default. Date-range filters let you query state at any point in history.
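Merging the four channels comes down to deduplicating hits by fact id, keeping each fact's best relevance score, and ranking descending. A sketch of that merge step (the hit shape with `fact_id` and `score` is an assumption about the internal representation):

```python
# Channel-merge sketch: dedupe hits from the parallel channels by
# fact_id, keep each fact's best relevance score, rank descending.
def merge_channels(*channels: list) -> list:
    best = {}
    for hits in channels:
        for hit in hits:
            current = best.get(hit["fact_id"])
            if current is None or hit["score"] > current["score"]:
                best[hit["fact_id"]] = hit
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)

predicate_hits = [{"fact_id": "f1", "score": 1.00}]  # exact predicate match
vector_hits    = [{"fact_id": "f1", "score": 0.62},
                  {"fact_id": "f2", "score": 0.81}]
merged = merge_channels(predicate_hits, vector_hits)
print([h["fact_id"] for h in merged])  # ['f1', 'f2']
```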

POST /v6/recall → response
{
  "answer_facts": [
    {
      "predicate":   "prefers_framework",
      "object_text": "React with TypeScript",
      "tier":        "hot",
      "confidence":  0.95,
      "source_text": "I use React and TypeScript"
    },
    {
      "predicate":   "has_deadline",
      "object_text": "Q2 2026",
      "tier":        "warm",
      "confidence":  0.87
    }
  ],
  "llm_context": {
    "text": "The user prefers React with TypeScript
and is working toward a Q2 2026 deadline.
They have been using this stack since
switching from Vue.js in March 2026.",
    "token_count": 42
  }
}

llm_context.text

A pre-synthesized narrative built from the top-ranked facts. Drop it straight into your system prompt with no extra LLM call; the text is returned exactly as synthesized, never truncated or modified.
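In practice that means concatenating the field into the system prompt before calling your model. A minimal sketch (`build_messages` and the prompt wording are hypothetical; the response shape follows the example above):

```python
# Drop llm_context.text into a chat-style system prompt verbatim.
def build_messages(recall_response: dict, user_query: str) -> list:
    memory = recall_response["llm_context"]["text"]
    return [
        {"role": "system",
         "content": "Known about this user:\n" + memory},
        {"role": "user", "content": user_query},
    ]

recall = {"llm_context": {"text": "The user prefers React with TypeScript."}}
messages = build_messages(recall, "Scaffold a new project for me.")
print(messages[0]["role"])  # system
```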

Ready to build with V6?

Start ingesting memories in minutes. The full API reference, quickstart guides, and playground are all available now.

At a glance

5 pipeline stages · Ingest → Recall
3 memory tiers · Hot / Warm / Cold
4 recall channels · parallel search
4 temporal ops · full change history