Deployment

Architecture overview and deployment guide for the CLAIV Memory backend infrastructure.

Architecture overview

CLAIV Memory consists of four primary components:

API Server

Handles /v6/ingest, /v6/recall, and /v6/forget requests. Stateless and horizontally scalable. Validates API keys and enforces tenant isolation. Exposes /healthz and /readyz.

Worker

Processes enrichment jobs asynchronously. Runs the Extract → Map → Gate → Embed → Tier pipeline to produce structured facts, episode summaries, and embeddings from ingested events. Pulls jobs from a Redis queue.

PostgreSQL + pgvector

Primary data store for events, facts, embeddings, API keys, and tenant records. Shared between the API Server and Worker. pgvector extension required for embedding similarity search.

Redis

Job queue for async enrichment. Also used for per-minute rate limiting and recall result caching.

Environment variables

Required environment variables for the CLAIV Memory backend:

.env
# ── Database ──────────────────────────────────────────
DATABASE_URL=postgresql://user:pass@host:5432/claiv_memory

# ── Redis ─────────────────────────────────────────────
REDIS_URL=redis://host:6379

# ── API Server ────────────────────────────────────────
PORT=3000
API_KEY_SALT_ROUNDS=12

# ── Worker (LLM + Embedding) ──────────────────────────
OPENAI_API_KEY=sk-...
EMBEDDING_MODEL=text-embedding-3-small

# ── Console (optional, if co-deploying) ──────────────
CLAIV_MEMORY_API_BASE_URL=https://api.claiv.io
CLAIV_ENCRYPTION_KEY=<32-byte-hex-key>
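
A suitable CLAIV_ENCRYPTION_KEY is 32 random bytes encoded as 64 hex characters. One way to generate one, assuming OpenSSL is available:

```shell
# 32 random bytes, hex-encoded (prints 64 characters)
openssl rand -hex 32
```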

Database setup

PostgreSQL 14+ with the pgvector extension is required. Run migrations before starting the API or Worker:

Terminal
# Enable pgvector
psql -d claiv_memory -c "CREATE EXTENSION IF NOT EXISTS vector;"

# Run migrations
npm run db:migrate
# or
pnpm db:migrate

For embedding search performance on large datasets, create an HNSW index on the facts embedding column after your initial data load. IVFFlat is an alternative for very large deployments (>1M facts).
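Creating the HNSW index from psql might look like the following sketch. The `facts` table and `embedding` column names are assumptions — check the actual schema produced by the migrations before running this:

```shell
# Hypothetical table/column names; cosine distance ops to match
# embedding similarity search. CONCURRENTLY avoids blocking writes
# during the build.
psql -d claiv_memory -c \
  "CREATE INDEX CONCURRENTLY IF NOT EXISTS facts_embedding_hnsw_idx
     ON facts USING hnsw (embedding vector_cosine_ops)
     WITH (m = 16, ef_construction = 64);"
```

`m` and `ef_construction` shown are pgvector's defaults; raising them improves recall at the cost of build time and memory.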

Health checks

The API Server exposes two health endpoints. Configure your load balancer to use /readyz.

GET /healthz — liveness

Returns 200 OK when the API process is running. Does not check database or Redis connectivity. Use in container liveness probes.

GET /readyz — readiness

Returns 200 OK only when Postgres and Redis are both reachable. Returns 503 Service Unavailable otherwise. Use this in load balancer health checks — it prevents traffic from reaching instances that cannot serve requests.

You can test connectivity to your CLAIV Memory backend from the console's Settings page using the “Test Connection” feature, which calls both /healthz and /readyz.
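
The two endpoints can also be probed directly from a shell — replace the placeholder host with your deployment's API base URL:

```shell
# Liveness — 200 whenever the process is up
curl -sS -o /dev/null -w "%{http_code}\n" https://api.example.com/healthz

# Readiness — 200 only when Postgres and Redis are reachable, 503 otherwise
curl -sS -o /dev/null -w "%{http_code}\n" https://api.example.com/readyz
```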

Scaling considerations

API Server

Stateless — scale horizontally behind a load balancer. Each instance handles ~1,000 concurrent connections. Health-check with /readyz before routing traffic.

Worker

Scale workers based on ingestion throughput. Each worker processes enrichment jobs sequentially, so add instances to increase enrichment concurrency. Monitor the Redis queue depth to size your worker fleet.
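
Queue depth can be checked with redis-cli. The key name below is a placeholder — use whatever key your worker configuration actually consumes from:

```shell
# Placeholder queue key; substitute the real enrichment queue key.
redis-cli -u "$REDIS_URL" LLEN claiv:enrichment:queue
```

A depth that grows faster than it drains means the worker fleet is undersized.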

PostgreSQL

Use read replicas for recall-heavy workloads. Enable pgvector indexing (HNSW recommended) for embedding search. Partition the events table by tenant_id at very high data volumes.
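
As an illustration only, hash-partitioning an events-style table by tenant_id could be sketched as below. The column list is invented for the example (the real table definition comes from the migrations), and only one of the four partitions is shown:

```shell
psql -d claiv_memory <<'SQL'
-- Illustrative sketch: hash-partition by tenant_id.
-- Note the partition key must be part of the primary key.
CREATE TABLE events_example (
  id         bigserial,
  tenant_id  uuid NOT NULL,
  payload    jsonb,
  PRIMARY KEY (tenant_id, id)
) PARTITION BY HASH (tenant_id);

CREATE TABLE events_example_p0 PARTITION OF events_example
  FOR VALUES WITH (MODULUS 4, REMAINDER 0);
SQL
```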

Redis

A single Redis instance with persistence (AOF) is sufficient for most deployments. Use Redis Cluster for high-availability production setups with >10K ingests/hour.

Docker compose (development)

docker-compose.yml
services:
  postgres:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_DB: claiv_memory
      POSTGRES_USER: claiv
      POSTGRES_PASSWORD: localpass
    ports: ["5432:5432"]

  redis:
    image: redis:7-alpine
    ports: ["6379:6379"]

  api:
    build: .
    command: node dist/api.js
    env_file: .env
    ports: ["3000:3000"]
    depends_on: [postgres, redis]

  worker:
    build: .
    command: node dist/worker.js
    env_file: .env
    depends_on: [postgres, redis]
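
To bring the stack up locally (Compose v2 syntax assumed, and assuming the built image includes the repo's npm scripts), start the data stores, run migrations once, then start the services:

```shell
docker compose up -d postgres redis
docker compose run --rm api npm run db:migrate   # one-off migration run
docker compose up -d api worker
```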
