Deployment
Architecture overview and deployment guide for the CLAIV Memory backend infrastructure.
Architecture overview
CLAIV Memory consists of four primary components:
API Server
Handles ingest, recall, and forget requests. Stateless and horizontally scalable. Validates API keys and enforces tenant isolation.
Worker
Processes enrichment jobs asynchronously. Extracts facts and episodes and generates embeddings from ingested events. Pulls jobs from a Redis queue.
PostgreSQL
Primary data store for events, memories, embeddings (pgvector), API keys, and tenants. Shared between the API and Worker.
Redis
Job queue for async enrichment. Also used for rate limiting and caching.
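The way the four components hand work to each other can be sketched roughly as follows. This is an illustration only, not the CLAIV Memory implementation: `queue.Queue` stands in for Redis, `InMemoryStore` stands in for PostgreSQL, and the names `ingest` and `run_worker_once` are hypothetical.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class InMemoryStore:
    """Stand-in for PostgreSQL: holds raw events and derived memories."""
    events: dict = field(default_factory=dict)
    memories: dict = field(default_factory=dict)


def ingest(store, jobs, tenant_id, event_id, text):
    """API server side: persist the event, then enqueue an enrichment job."""
    # Keys include the tenant ID so one tenant never reads another's data.
    store.events[(tenant_id, event_id)] = text
    jobs.put((tenant_id, event_id))


def run_worker_once(store, jobs):
    """Worker side: pull one job and write a placeholder memory for it."""
    tenant_id, event_id = jobs.get_nowait()
    text = store.events[(tenant_id, event_id)]
    # Placeholder for fact/episode extraction and embedding generation.
    store.memories[(tenant_id, event_id)] = {"facts": [text], "embedding": None}
    jobs.task_done()
```

Because the API server only writes the event and enqueues a job, it holds no per-request state, which is what lets it scale horizontally while the Worker does the slower enrichment out of band.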
Environment variables
Required environment variables for the CLAIV Memory backend:
# Database
DATABASE_URL=postgresql://user:pass@host:5432/claiv_memory
# Redis
REDIS_URL=redis://host:6379
# API Server
PORT=3000
API_KEY_SALT_ROUNDS=12
# Worker
OPENAI_API_KEY=sk-...
EMBEDDING_MODEL=text-embedding-3-small
# Console (if deploying the console alongside)
CLAIV_MEMORY_API_BASE_URL=https://api.claiv.io
CLAIV_SETTINGS_ENCRYPTION_KEY=<32-byte-hex-key>
Health checks
The API server exposes two health check endpoints:
GET /healthz
Liveness check. Returns 200 if the API server process is running. Does not check database connectivity.
GET /readyz
Readiness check. Returns 200 if the API server can connect to PostgreSQL and Redis. Use this for load balancer health checks.
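A small script can probe both endpoints from outside the load balancer, for example during a deploy. The sketch below uses only the Python standard library; the helper names `probe` and `check_instance` and the 2-second timeout are assumptions, and only the status code is inspected because the response body format is not specified here.

```python
import urllib.error
import urllib.request


def probe(base_url: str, path: str, timeout: float = 2.0) -> bool:
    """Return True if GET base_url + path answers with HTTP 200."""
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers error status codes (HTTPError), refused connections, and timeouts.
        return False


def check_instance(base_url: str) -> dict:
    """Probe the liveness and readiness endpoints of one API server instance."""
    return {
        "live": probe(base_url, "/healthz"),
        "ready": probe(base_url, "/readyz"),
    }
```

An instance that reports `live` but not `ready` is running but cannot reach PostgreSQL or Redis, so it should be kept out of rotation rather than restarted blindly.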
Scaling considerations
API server
Stateless. Scale horizontally behind a load balancer. Each instance handles ~1000 concurrent connections.
Worker
Scale workers based on ingestion throughput. Each worker processes events sequentially, so total throughput grows with the number of workers.
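Since each worker handles one event at a time, a first estimate of the worker count follows from the ingestion rate and the average enrichment time per event. The function below is a back-of-the-envelope sketch; the name `workers_needed`, the 70% utilization target, and the example numbers are assumptions, not measured CLAIV Memory figures.

```python
import math


def workers_needed(events_per_second: float,
                   seconds_per_event: float,
                   target_utilization: float = 0.7) -> int:
    """Estimate how many sequential workers keep up with the ingest rate."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Average number of workers that are busy at any moment.
    busy_workers = events_per_second * seconds_per_event
    # Leave headroom so bursts drain from the queue instead of piling up.
    return max(1, math.ceil(busy_workers / target_utilization))
```

For example, 50 events per second at 0.2 seconds each keeps about 10 workers busy, and `workers_needed(50, 0.2)` returns 15 once the headroom is included. Measure the real per-event time before relying on such an estimate, since it depends heavily on the embedding provider's latency.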
Database
Use read replicas for recall-heavy workloads. Enable pgvector indexing (IVFFlat or HNSW) for embedding search performance.
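The two pgvector index types are created with different options. The helper below builds the corresponding `CREATE INDEX` statement as a sketch: the table name `memories` and column name `embedding` used in the example are hypothetical, cosine distance (`vector_cosine_ops`) is an assumption that should match the distance operator used in recall queries, and the option values are starting points rather than tuned settings.

```python
def pgvector_index_sql(table: str, column: str, method: str = "hnsw") -> str:
    """Build a CREATE INDEX statement for a pgvector column."""
    if not (table.isidentifier() and column.isidentifier()):
        raise ValueError("table and column must be plain identifiers")
    if method == "hnsw":
        # pgvector's documented defaults for HNSW graph construction.
        options = "m = 16, ef_construction = 64"
    elif method == "ivfflat":
        # Number of IVFFlat lists; tune to the row count.
        options = "lists = 100"
    else:
        raise ValueError("method must be 'hnsw' or 'ivfflat'")
    return (
        f"CREATE INDEX IF NOT EXISTS {table}_{column}_{method}_idx "
        f"ON {table} USING {method} ({column} vector_cosine_ops) "
        f"WITH ({options});"
    )
```

Calling `pgvector_index_sql("memories", "embedding", "ivfflat")` yields a statement that can be run with any PostgreSQL client. HNSW generally gives better recall at the cost of slower builds and more memory, while IVFFlat builds faster but should be created after the table already holds data.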