V6.4

API Reference

Complete reference for the CLAIV Memory V6.4 API. It covers the core endpoints (ingest, recall, forget), the overhauled document system with collections, usage monitoring, and health checks.

Base URL: https://api.claiv.io

Your tenant is inferred from your API key — you never send a tenant_id in request payloads. CLAIV Memory returns context; your LLM generates the final reply.

Authentication

All API requests require a Bearer token in the Authorization header. Generate keys from your project dashboard — each key is scoped to one project (tenant). Keys are SHA-256 hashed before storage; copy yours immediately since it is shown only once.

Authorization: Bearer YOUR_API_KEY
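
As a minimal sketch of the header scheme above, the following Python builds an authenticated JSON request using only the standard library. The helper names `build_request` and `send` are hypothetical and not part of any CLAIV SDK; only the base URL and the two headers come from this reference.

```python
import json
import urllib.request

BASE_URL = "https://api.claiv.io"  # base URL stated in this reference


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect or log the request (minus the key) before anything goes over the network.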
POST /v6/ingest

Store a conversation event. Returns immediately with an event_id. Fact extraction, conflict detection, and embedding run asynchronously in the background (1–5 s via the Extract → Map → Gate → Embed → Tier pipeline).

Request body

| Parameter | Type | Description |
| --- | --- | --- |
| user_id (required) | string | Your application's user identifier. Scopes all memory for this user within your tenant. |
| type (required) | string | Event type: "message" (conversation turn), "tool_call" (function call result), or "app_event" (custom application event). |
| content (required) | string | The event text to ingest and extract facts from. |
| role | string | Conversation role: "user", "assistant", "system", or "tool". Controls extraction behaviour: user-role events are the source of truth for conflict resolution; system-role events are skipped. |
| conversation_id (required) | string | Groups events into a named conversation. Drives conversation history, working memory, pending plans, and stable conversation-scoped recall sessions. |
| event_time | ISO datetime | Override the event timestamp. Defaults to server time. Useful when ingesting backfilled history. |
| idempotency_key | string | Optional client-supplied deduplication key. If a request with the same key was stored within the last 24 hours, the original event_id is returned and deduped is set to true. |
| metadata | object | Arbitrary key-value pairs stored alongside the event for your own use. Not used during fact extraction. |

Response

| Field | Type | Description |
| --- | --- | --- |
| event_id | UUID | Stable identifier for this event. Use for debugging or building evidence trails. |
| deduped | boolean | true if this message was already stored (byte-identical within the dedup window, or matching idempotency_key). |

cURL: Request
curl -X POST https://api.claiv.io/v6/ingest \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user-123",
    "type": "message",
    "role": "user",
    "conversation_id": "session-abc",
    "content": "I use React and TypeScript. My deadline is March 15th."
  }'
Response: 200 OK
{
  "event_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "deduped": false
}
Enrichment is asynchronous. Fact extraction, mapping, and embedding run in the background after ingest returns. If you call /v6/recall immediately after ingest, the newly ingested event may not yet appear in results. In most production flows this is fine — recall happens on the next user interaction.
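
The request fields above can be assembled as in the sketch below. It validates `type` and `role` against the values listed in the parameter table and derives an `idempotency_key` from a client-side message ID. The `message_id` argument and the `conversation_id:message_id` key format are assumptions for illustration; the API only requires that the key be a client-supplied string.

```python
ALLOWED_TYPES = {"message", "tool_call", "app_event"}
ALLOWED_ROLES = {"user", "assistant", "system", "tool"}


def build_ingest_payload(
    user_id: str,
    conversation_id: str,
    content: str,
    message_id: str,
    role: str = "user",
    event_type: str = "message",
) -> dict:
    """Assemble a /v6/ingest body with a retry-safe idempotency key."""
    if event_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown event type: {event_type!r}")
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return {
        "user_id": user_id,
        "type": event_type,
        "role": role,
        "conversation_id": conversation_id,
        "content": content,
        # Assumed key format: stable across retries of the same message.
        "idempotency_key": f"{conversation_id}:{message_id}",
    }
```

Because the key is the same on every retry of one message, a retried request within the 24-hour window should come back with `deduped: true` rather than creating a second event.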
POST /v6/recall

Retrieve relevant memory for a user before an LLM call. Runs predicate match, vector similarity, keyword, and temporal search channels in parallel. Returns tiered facts plus a pre-synthesized llm_context.text narrative ready to inject directly into your system prompt.

Request body

| Parameter | Type | Description |
| --- | --- | --- |
| user_id (required) | string | The user whose memory to recall. |
| query (required) | string | Natural language description of the context you need. Drives predicate routing and vector search. |
| conversation_id (required) | string | Drives conversation history in llm_context, working memory, pending plans, and conversation-scoped recall sessions. |
| reference_time | ISO datetime | Override the temporal anchor used to resolve relative date expressions (e.g. 'next week'). Defaults to server time. |
| mode_hint | string | Hint the routing mode: "single", "multi", "temporal", or "broad". By default the router classifies the query automatically. |
| project_id | string | When provided, enables project-scoped retrieval: recalls facts and documents associated with this project. |
| document_id | string | When provided, restricts document retrieval to this specific document and triggers the DOCUMENT routing strategy (full document context with distillations). |
| collection_id | string | When provided, triggers the COLLECTION routing strategy: tiered multi-document context across all documents in the collection within a 100K-char budget. |
| limits.answer_facts | integer | Max number of answer facts to return. Default 10. |
| limits.supporting_facts | integer | Max number of supporting facts to return. Default 5. |
| limits.background_context | integer | Max number of background context facts to return. Default 5. |
| limits.document_chunks | integer | Max document spans injected in LOCAL semantic mode. Range 1–50, default 5. |
| include.pending_plan | boolean | When true, includes the pending_plan object (the conversation-scoped pending plan from the plan ledger). Default false. |
| include.debug | boolean | When true, includes a debug object with routing internals. Default false. |

Response

| Field | Type | Description |
| --- | --- | --- |
| answer_facts | Fact[] | Facts that directly answer the query, ranked by relevance and confidence. |
| supporting_facts | Fact[] | Corroborating facts that provide additional context beyond the direct answer. |
| background_context | Fact[] | Broad, always-relevant user context (identity facts, stable preferences). Hot-tier facts surface here regardless of query. |
| working_memory | object or null | Conversation-scoped working state: turn-level focus, recent constraints, confirmation state, and last plan state. |
| pending_plan | object or null | Present when include.pending_plan is true and data exists for that conversation. Conversation-scoped pending plan from the plan ledger. |
| llm_context.text | string | Pre-synthesized narrative built from selected facts, conversation context, and any matching document spans as a [DOCUMENT CONTEXT] section. Inject directly into your system prompt. |
| llm_context.fact_ids | UUID[] | IDs of the facts used to build llm_context.text. |
| llm_context.reference_time | ISO datetime | The temporal anchor used to resolve date expressions in facts. |
| llm_context.anchor_source | string | How the reference_time was determined: "client_provided" or "server_now". |
| llm_context.conversation_history | object[] | Recent conversation turns for the requested conversation_id. |
| routing | object | Describes how the query was routed: mode, matched fact kinds, predicates, temporal intent, and document routing strategy used. |

Document routing strategies

When documents exist for the user, the query classifier selects one of four strategies. The result is injected into llm_context.text as a [DOCUMENT CONTEXT] section.

| Strategy | When triggered | Behaviour |
| --- | --- | --- |
| LOCAL | Default: no document_id or collection_id | Cosine similarity search across spans; top matches labelled by document_name and section title. |
| SECTION | Query names a specific section (e.g. "chapter 3", "the introduction") | Fetches all spans in the named section in reading order; falls back to span vector search if no title match. |
| DOCUMENT | document_id passed explicitly, or query targets a whole document | Full document context: distillation + section distillations + evidence spans within 100K-char budget. Falls back to raw spans with a [DOCUMENT UNDERSTANDING STATUS: DEGRADED] flag if distillations are not yet ready. |
| COLLECTION | collection_id passed explicitly | Tiered multi-document context: per-document distillations + section importance scores + representative spans within 100K-char budget. |

Fact object

| Field | Type | Description |
| --- | --- | --- |
| fact_id | UUID | Stable fact identifier. |
| subject | string | Who the fact is about, typically "user" or a named entity. |
| kind | string | Fact category: preference, constraint, task, identity, relationship, event. |
| predicate | string | Catalog predicate label (e.g. prefers_framework, has_deadline, works_as). |
| object_text | string | The fact value (e.g. React, 2026-04-01). |
| relation_phrase | string | Human-readable relation (e.g. prefers working with). |
| source_text | string | Verbatim quote from the original message (a character-exact evidence span). |
| confidence | float 0–1 | Extraction model confidence score. |
| importance | float 0–1 | Utility score that determines tier placement: ≥0.8 = hot, 0.4 to <0.8 = warm, <0.4 = cold. |
| tier | hot, warm, or cold | Memory tier assigned during enrichment. |
| created_at | ISO datetime | When the fact was first extracted. |
| superseded_at | ISO datetime or null | Set when a newer fact replaces this one; preserves temporal history. |
| temporal_matches | object[] | Resolved date ranges if the fact contained temporal expressions. |
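
The importance thresholds in the table can be written as a small client-side check, for example to sanity-check the `tier` field on returned facts. This is a sketch only: tiering happens server-side during enrichment, the function name `tier_for_importance` is hypothetical, and treating values between 0.79 and 0.8 as warm is an assumption.

```python
def tier_for_importance(importance: float) -> str:
    """Map an importance score in [0, 1] to the documented tier label."""
    if not 0.0 <= importance <= 1.0:
        raise ValueError("importance must be between 0 and 1")
    if importance >= 0.8:
        return "hot"
    if importance >= 0.4:
        return "warm"  # assumption: everything from 0.4 up to (not including) 0.8
    return "cold"
```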

Routing object

| Field | Type | Description |
| --- | --- | --- |
| mode | single, multi, temporal, or broad | How the query was classified. single = one predicate cluster; multi = several clusters; temporal = time-scoped; broad = general context. |
| kinds | string[] | Fact kinds targeted in this query. |
| predicates | string[] | Catalog predicates matched for exact-match retrieval. |
| temporal_intent | object or null | Resolved date range if the query contained temporal expressions (e.g. 'what changed last week'). |

cURL: Request
curl -X POST https://api.claiv.io/v6/recall \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user-123",
    "conversation_id": "session-abc",
    "query": "What tech stack does this user prefer?",
    "limits": { "answer_facts": 5 }
  }'
Response: 200 OK
{
  "answer_facts": [
    {
      "fact_id": "uuid-1",
      "subject": "user",
      "kind": "preference",
      "predicate": "prefers_framework",
      "object_text": "React",
      "relation_phrase": "prefers working with",
      "source_text": "I use React and TypeScript",
      "confidence": 0.95,
      "importance": 0.82,
      "tier": "hot",
      "created_at": "2026-03-04T10:00:00Z",
      "superseded_at": null,
      "temporal_matches": []
    }
  ],
  "supporting_facts": [],
  "background_context": [],
  "llm_context": {
    "text": "The user prefers React and TypeScript. Their project deadline is March 15th.",
    "fact_ids": ["uuid-1", "uuid-2"],
    "reference_time": "2026-03-06T12:00:00Z",
    "anchor_source": "server_now"
  },
  "routing": {
    "mode": "single",
    "kinds": ["preference"],
    "predicates": ["prefers_framework"],
    "temporal_intent": null
  }
}
Always inject llm_context.text into your system prompt. It is pre-synthesized and ready to use without post-processing. The structured answer_facts array is available when you need individual facts for custom rendering, citation display, or filtering.
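
A minimal sketch of that injection step follows. It assumes a chat-style message list (`role` / `content` dictionaries), which is a common LLM client shape but not something this API prescribes; `build_messages` and the prompt wording are illustrative.

```python
def build_messages(
    recall_response: dict,
    user_message: str,
    base_prompt: str = "You are a helpful assistant.",
) -> list:
    """Prepend recalled context to the system prompt for an LLM call."""
    # llm_context may be missing or null in an error path, so guard both.
    context = (recall_response.get("llm_context") or {}).get("text", "")
    if context:
        system = f"{base_prompt}\n\nKnown context about the user:\n{context}"
    else:
        system = base_prompt
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_message},
    ]
```

When recall returns no context, the base prompt is used unchanged, so the LLM call still works for brand-new users.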
POST /v6/forget

Delete memory for a user. With only user_id, all of the user's memory is removed; add conversation_id, project_id, document_id, or a from_time / to_time range to narrow the scope. Returns a structured receipt documenting exact counts of everything removed, suitable for GDPR right-to-erasure requests.

Request body

| Parameter | Type | Description |
| --- | --- | --- |
| user_id (required) | string | The user whose memory to delete. |
| conversation_id | string | When provided, scopes deletion to events and facts from this conversation only. |
| project_id | string | When provided, scopes deletion to events and facts from this project only. |
| document_id | string | When provided, removes all spans, structure nodes, distillations, and document metadata for this document. Reflected in deleted_counts.chunks. |
| from_time | ISO datetime | Limit deletion to events from this time onwards. Combine with to_time for a time-range deletion. |
| to_time | ISO datetime | Limit deletion to events up to and including this time. Combine with from_time for a time-range deletion. |

Response

| Field | Type | Description |
| --- | --- | --- |
| receipt_id | UUID | Stable audit-trail identifier for this deletion. Store for compliance records. |
| deleted_counts.events | integer | Number of raw ingested events deleted. |
| deleted_counts.chunks | integer | Number of document spans deleted (for document-scoped forget with document_id). |
| deleted_counts.episodes | integer | Number of episode summaries deleted. |
| deleted_counts.facts | integer | Number of extracted facts deleted. |
| deleted_counts.claims | integer | Number of contested assertion records deleted. |
| deleted_counts.open_loops | integer | Number of open-loop task records deleted. |

cURL: Full user deletion
curl -X POST https://api.claiv.io/v6/forget \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "user_id": "user-123" }'
cURL: Scoped deletion (time range)
curl -X POST https://api.claiv.io/v6/forget \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user-123",
    "from_time": "2026-01-01T00:00:00Z",
    "to_time": "2026-01-31T23:59:59Z"
  }'
Response: 200 OK
{
  "receipt_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "deleted_counts": {
    "events":     12,
    "chunks":      0,
    "episodes":    3,
    "facts":      28,
    "claims":      1,
    "open_loops":  2
  }
}
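
The time-range request and the receipt above can be handled as in the following sketch. It formats timezone-aware datetimes as UTC ISO strings matching the cURL example and totals the receipt counts for an audit log. The function names are hypothetical, and rejecting naive datetimes is a design choice of this sketch rather than an API rule.

```python
from datetime import datetime, timezone
from typing import Optional


def to_iso_utc(moment: datetime) -> str:
    """Format an aware datetime as e.g. 2026-01-01T00:00:00Z."""
    if moment.tzinfo is None:
        raise ValueError("use timezone-aware datetimes to avoid local-time ambiguity")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_forget_payload(
    user_id: str,
    from_time: Optional[datetime] = None,
    to_time: Optional[datetime] = None,
) -> dict:
    """Assemble a /v6/forget body, optionally scoped to a time range."""
    payload = {"user_id": user_id}
    if from_time is not None:
        payload["from_time"] = to_iso_utc(from_time)
    if to_time is not None:
        payload["to_time"] = to_iso_utc(to_time)
    if from_time is not None and to_time is not None and from_time > to_time:
        raise ValueError("from_time must not be later than to_time")
    return payload


def total_deleted(receipt: dict) -> int:
    """Sum every counter in deleted_counts for a one-line audit entry."""
    return sum(receipt["deleted_counts"].values())
```

Storing `receipt_id` together with the total gives a compact compliance record without having to keep the full response.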
POST /v6/documents

Parse, embed, and index a document for RAG. The API parses content into a document → sections → spans tree and embeds each span synchronously with text-embedding-3-small, so spans are available for recall as soon as the call returns. LLM-generated section and document distillations complete asynchronously and enable richer structured recall once they are ready.

project_id is required. All documents are project-scoped. Re-uploading with the same document_id deletes all previous structure nodes, spans, and distillations before re-processing — making re-upload idempotent.

Request body

| Parameter | Type | Description |
| --- | --- | --- |
| user_id (required) | string | The user this document belongs to. |
| document_name (required) | string | Display name for the document (e.g. 'Product Manual'). |
| content (required) | string (up to 5 MB) | Full text content of the document. For binary formats (PDF, DOCX) extract the text before sending. |
| project_id (required) | string | The project this document belongs to. All recall filters spans to the correct project automatically. |
| document_id | string | Optional client-supplied identifier. Generated if omitted. Re-uploading the same ID replaces the document. |
| collection_id | string | Add the document to a collection on upload. |
| position | integer | Position within an ordered collection. |

Response

| Field | Type | Description |
| --- | --- | --- |
| document_id | UUID | Stable document identifier. Store this to scope recall or delete the document later. |
| document_name | string | The document name as stored. |
| project_id | string | The project this document was indexed under. |
| collection_id | string or null | The collection this document was added to, if any. |
| sections | object[] | Array of section nodes created from the document structure. Each has node_id and title. |
| spans_created | integer | Number of spans embedded and indexed synchronously. Available for recall immediately. |
| status | "processing" or "ready" | "processing": spans are available but distillations are still generating asynchronously. "ready": all distillations are complete. |

cURL: Upload a document
curl -X POST https://api.claiv.io/v6/documents \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user-123",
    "project_id": "proj-abc",
    "document_name": "Product Manual",
    "content": "... full document text ..."
  }'
Response: 200 OK
{
  "document_id": "doc_uuid",
  "document_name": "Product Manual",
  "project_id": "proj-abc",
  "collection_id": null,
  "sections": [
    { "node_id": "node_uuid1", "title": "Introduction" },
    { "node_id": "node_uuid2", "title": "Chapter 1" }
  ],
  "spans_created": 42,
  "status": "processing"
}
cURL: Recall against a specific document (DOCUMENT strategy)
curl -X POST https://api.claiv.io/v6/recall \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user-123",
    "conversation_id": "session-abc",
    "query": "What are the installation requirements?",
    "document_id": "doc_uuid",
    "limits": { "document_chunks": 10 }
  }'
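
Before uploading, it can help to check the content size client-side and to pin a `document_id` so that re-uploads replace the previous version. The sketch below does both. The 5,000,000-byte cap is an assumption (a conservative reading of "5 MB", measured on the UTF-8 encoding), and `build_document_payload` is a hypothetical helper.

```python
from typing import Optional

# Assumption: "up to 5 MB" read conservatively as 5,000,000 UTF-8 bytes.
MAX_CONTENT_BYTES = 5_000_000


def build_document_payload(
    user_id: str,
    project_id: str,
    document_name: str,
    content: str,
    document_id: Optional[str] = None,
    collection_id: Optional[str] = None,
    position: Optional[int] = None,
) -> dict:
    """Assemble a /v6/documents body, omitting optional fields left unset."""
    size = len(content.encode("utf-8"))
    if size > MAX_CONTENT_BYTES:
        raise ValueError(f"content is {size} bytes; limit is {MAX_CONTENT_BYTES}")
    payload = {
        "user_id": user_id,
        "project_id": project_id,
        "document_name": document_name,
        "content": content,
    }
    optional = {
        "document_id": document_id,
        "collection_id": collection_id,
        "position": position,
    }
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload
```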
GET /v6/documents

List documents for a user, optionally filtered by project or collection.

Query parameters

| Parameter | Type | Description |
| --- | --- | --- |
| user_id (required) | string | The user whose documents to list. |
| project_id | string | Filter to a specific project. |
| collection_id | string | Filter to a specific collection. |
| limit | integer | Max results to return. Default 20, max 100. |
| offset | integer | Pagination offset. Default 0. |
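
The `limit` and `offset` parameters suggest a standard pagination loop, sketched below. The response shape of this endpoint is not documented here, so the loop takes a caller-supplied `fetch_page(limit, offset)` function that returns one page as a list; that callable, the stop condition (a page shorter than `limit`), and the helper names are all assumptions.

```python
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode


def documents_url(
    user_id: str,
    limit: int = 20,
    offset: int = 0,
    project_id: Optional[str] = None,
    collection_id: Optional[str] = None,
) -> str:
    """Build the GET /v6/documents URL with its query string."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {"user_id": user_id, "limit": limit, "offset": offset}
    if project_id is not None:
        params["project_id"] = project_id
    if collection_id is not None:
        params["collection_id"] = collection_id
    return f"https://api.claiv.io/v6/documents?{urlencode(params)}"


def iter_documents(fetch_page: Callable[[int, int], list], limit: int = 100) -> Iterator:
    """Yield every document by walking pages until a short page appears."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            return
        offset += limit
```

Passing the fetch function in keeps the loop independent of the HTTP client and of whatever envelope the real response uses.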
DELETE /v6/documents/:document_id

Permanently removes a document and all its spans, structure nodes, distillations, and collection memberships. Alternatively, use POST /v6/forget with document_id to delete a document as part of a broader user forget operation.

Response: 200 OK
{ "deleted": true, "document_id": "doc_uuid" }

Collections

Collections group documents for multi-document recall. When a collection_id is passed on /v6/recall, the COLLECTION routing strategy assembles tiered context across all documents in the collection. Deleting a collection does not delete the documents inside it.

POST /v6/collections

Request body

| Parameter | Type | Description |
| --- | --- | --- |
| user_id (required) | string | The user who owns the collection. |
| project_id (required) | string | The project this collection belongs to. |
| name (required) | string | Display name for the collection. |
| ordered | boolean | Whether documents in the collection have an explicit order. Default false. |
| collection_id | string | Optional client-supplied identifier. Generated if omitted. |

GET /v6/collections

List collections. Query params: user_id (required), project_id, limit, offset.

GET /v6/collections/:collection_id

Returns the collection with its document list. Query param: user_id (required).

DELETE /v6/collections/:collection_id

Deletes the collection. Documents inside the collection are not deleted. Query param: user_id (required).

POST /v6/collections/:collection_id/documents

Adds an existing document to a collection.

| Parameter | Type | Description |
| --- | --- | --- |
| document_id (required) | string | The document to add. |
| user_id (required) | string | The user who owns both the collection and the document. |
| position | integer | Position within an ordered collection. |

DELETE /v6/collections/:collection_id/documents

Removes a document from a collection. Pass document_id in the request body. The document itself is not deleted — only its collection membership is removed.
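
To show how the collection endpoints fit together, the sketch below plans the sequence of calls needed to create an ordered collection and attach existing documents to it. It only builds `(method, path, body)` tuples and sends nothing. Using a client-supplied `collection_id` and zero-based `position` values are assumptions; the reference does not state where positions start.

```python
def plan_ordered_collection(
    user_id: str,
    project_id: str,
    name: str,
    collection_id: str,
    document_ids: list,
) -> list:
    """Return the requests that create a collection and add documents in order."""
    requests = [
        (
            "POST",
            "/v6/collections",
            {
                "user_id": user_id,
                "project_id": project_id,
                "name": name,
                "ordered": True,
                "collection_id": collection_id,
            },
        )
    ]
    for position, document_id in enumerate(document_ids):
        requests.append(
            (
                "POST",
                f"/v6/collections/{collection_id}/documents",
                # Assumption: positions are zero-based.
                {"document_id": document_id, "user_id": user_id, "position": position},
            )
        )
    return requests
```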

Usage endpoints

Monitor ingest and recall consumption against your plan quotas. All usage endpoints are prefixed /v1/usage and require the same Bearer auth.

GET /v1/usage/summary

Returns total ingests and recalls for the current billing period alongside your plan limits.

Response: 200 OK
{
  "period": {
    "start": "2026-03-01T00:00:00Z",
    "end":   "2026-03-31T23:59:59Z"
  },
  "ingests": {
    "used":  4821,
    "limit": 50000
  },
  "recalls": {
    "used":  1204,
    "limit": 25000
  },
  "plan": "growth"
}
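
A summary response like the one above can be turned into remaining-quota figures with a few lines of Python, for example to raise an alert before a hard stop. This is a sketch over the documented response fields; `quota_report` is a hypothetical helper.

```python
def quota_report(summary: dict) -> dict:
    """Compute remaining calls and fraction used for ingests and recalls."""
    report = {}
    for metric in ("ingests", "recalls"):
        used = summary[metric]["used"]
        limit = summary[metric]["limit"]
        report[metric] = {
            "remaining": max(limit - used, 0),
            # Guard against a zero limit to avoid division by zero.
            "fraction_used": used / limit if limit else 0.0,
        }
    return report
```

With the sample response, this reports 45,179 ingests and 23,796 recalls remaining for the period.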
GET /v1/usage/breakdown

Returns per-day usage for the current period — useful for building usage dashboards.

Response: 200 OK
{
  "breakdown": [
    { "date": "2026-03-04", "ingests": 312, "recalls": 89 },
    { "date": "2026-03-05", "ingests": 478, "recalls": 120 }
  ]
}
GET /v1/usage/limits

Returns current rate limits and overage policy for your plan. Check this to understand throttling behaviour.

Response: 200 OK
{
  "rate_limits": {
    "ingest_per_minute":  500,
    "recall_per_minute":  200,
    "forget_per_minute":   60
  },
  "monthly_quotas": {
    "ingests": 50000,
    "recalls": 25000
  },
  "overage_policy": "hard_stop"
}
Overage behaviour depends on your plan. On Starter and Growth, quota exhaustion returns 429 Quota Exceeded. On Scale, overages are billed at $4 per 1,000 additional ingests.

Health endpoints

Use these endpoints in load balancer health checks and uptime monitors. No authentication required.

GET /healthz

Liveness check. Returns 200 OK when the API process is running, regardless of downstream dependency health.

Response: 200 OK
{ "status": "ok" }
GET /readyz

Readiness check. Returns 200 OK only when Postgres and Redis are reachable. Returns 503 Service Unavailable otherwise — use this to gate traffic.

Response: 200 OK
{ "status": "ready", "checks": { "postgres": "ok", "redis": "ok" } }
Response: 503 (dependency down)
{ "status": "not_ready", "checks": { "postgres": "ok", "redis": "error" } }
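
For monitoring, the `checks` object in a readiness response can be reduced to the list of failing dependencies, as in this small sketch. It treats any value other than "ok" as a failure, which is an assumption based on the two sample responses; `failing_checks` is a hypothetical helper.

```python
def failing_checks(readyz_body: dict) -> list:
    """Return the names of dependencies whose check is not "ok", sorted."""
    checks = readyz_body.get("checks", {})
    return sorted(name for name, state in checks.items() if state != "ok")
```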

Error codes

All errors follow the same shape: { "error": { "code": "...", "message": "..." } }

| Status | Code | Meaning |
| --- | --- | --- |
| 400 | invalid_request | Missing or malformed required field (e.g. missing user_id). |
| 401 | unauthorized | Missing or invalid API key. |
| 403 | forbidden | The API key does not have access to this resource. |
| 404 | not_found | The requested resource does not exist. |
| 409 | conflict | Idempotency key collision with different payload. |
| 422 | unprocessable | Request body is valid JSON but semantically invalid. |
| 429 | rate_limited | Per-minute rate limit exceeded. Back off and retry. |
| 429 | quota_exceeded | Monthly quota exhausted. Upgrade plan or wait for reset. |
| 500 | internal_error | Unexpected server error. Retry with exponential back-off. |
| 503 | service_unavailable | Dependency (Postgres / Redis) temporarily unreachable. |

For 429 rate_limited, inspect the Retry-After header for the seconds to wait before retrying. All 5xx errors are safe to retry with exponential back-off (initial 1 s, max 30 s, jitter ±500 ms).
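
The retry guidance above can be sketched as a single decision function that returns how long to wait, or `None` when the request should not be retried. Several details are assumptions: the delay doubles per attempt, the 30 s cap is applied after jitter, `Retry-After` is a number of seconds, `rate_limited` without that header falls back to the same back-off, and `quota_exceeded` is never retried. The function name and signature are illustrative.

```python
import random
from typing import Callable, Optional

INITIAL_DELAY_S = 1.0
MAX_DELAY_S = 30.0
JITTER_S = 0.5


def retry_delay(
    status: int,
    code: str,
    attempt: int,
    retry_after: Optional[str] = None,
    rng: Callable[[], float] = random.random,
) -> Optional[float]:
    """Seconds to wait before retry number `attempt` (0-based), or None."""
    if status == 429:
        if code != "rate_limited":
            return None  # quota_exceeded: retrying will not help
        if retry_after is not None:
            return float(retry_after)  # honour the server's hint
    elif not 500 <= status < 600:
        return None  # other 4xx errors are caller mistakes
    base = min(INITIAL_DELAY_S * (2 ** attempt), MAX_DELAY_S)
    jitter = (rng() * 2.0 - 1.0) * JITTER_S  # uniform in [-0.5, +0.5)
    return max(0.0, min(base + jitter, MAX_DELAY_S))
```

The `rng` parameter exists so the jitter can be made deterministic in tests; production callers can leave it at the default.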
