The Architecture of agent.ceo: A Technical Deep-Dive
Building a platform where autonomous AI agents collaborate, scale, and persist state across sessions requires more than a single monolithic application. agent.ceo is a distributed system designed from the ground up for multi-agent orchestration. This post walks through every layer of the stack, from user authentication to agent execution and back.
The Full Request Path
When a user interacts with agent.ceo, the request traverses a carefully designed pipeline:
User Request
|
v
+------------------+
| Firebase Auth | (JWT validation, org membership)
+------------------+
|
v
+------------------+
| API Gateway | (rate limiting, routing, request shaping)
+------------------+
|
v
+------------------+
| Firestore | (agent config lookup, task creation)
+------------------+
|
v
+------------------+
| NATS JetStream | (message dispatch to agent inbox)
+------------------+
|
v
+------------------+
| GKE Pod | (agent container spins up or receives msg)
+------------------+
|
v
+------------------+
| Claude Code CLI | (LLM reasoning + tool execution)
+------------------+
|
v
+------------------+
| MCP Tools | (bash, git, web, agent-hub, etc.)
+------------------+
|
v
+------------------+
| Results -> NATS | (publish completion events)
+------------------+
|
v
+------------------+
| Firestore Update | (persist state, notify subscribers)
+------------------+
Each layer is independently scalable, observable, and replaceable. This separation of concerns is what lets agent.ceo scale from a single agent to 100 concurrent workers without architectural changes.
Layer 1: Authentication and Authorization
Firebase Auth handles identity. Every request carries a JWT that encodes the user's organization, role, and billing tier. The API gateway validates this token before any downstream processing occurs.
# Firebase Auth custom claims structure
customClaims:
  orgId: "org_abc123"
  role: "admin"
  tier: "growth"
  agentLimit: 10
  features:
    - "multi-agent"
    - "custom-tools"
    - "priority-scheduling"
Organization-level isolation ensures that agents from different tenants never share NATS subjects, Firestore collections, or compute resources. This is enforced at every layer, not just the gateway.
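To make the claims structure concrete, here is a minimal sketch of the gateway-side authorization check after the JWT has been decoded. The helper names are illustrative, not the platform's actual API; in production the Firebase Admin SDK verifies the token signature before any of this runs.

```typescript
// Typed view of the custom claims shown above.
interface CustomClaims {
  orgId: string;
  role: string;
  tier: string;
  agentLimit: number;
  features: string[];
}

// Reject requests whose decoded claims do not cover the requested capability.
// Every request must be org-scoped; feature gates come from the billing tier.
function authorize(claims: CustomClaims, requiredFeature: string): boolean {
  if (!claims.orgId) return false;
  return claims.features.includes(requiredFeature);
}

const claims: CustomClaims = {
  orgId: "org_abc123",
  role: "admin",
  tier: "growth",
  agentLimit: 10,
  features: ["multi-agent", "custom-tools", "priority-scheduling"],
};

console.log(authorize(claims, "multi-agent")); // true
console.log(authorize(claims, "sso"));         // false
```

Keeping the check to a pure function over decoded claims makes it easy to enforce the same policy at every layer, not just the gateway.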
Layer 2: State Management with Firestore
Firestore serves as the system of record for all mutable state. Agent configurations, task records, session metadata, and user preferences all live here. Real-time listeners push updates to connected clients without polling.
// Agent configuration document structure
// Collection: organizations/{orgId}/agents/{agentId}
{
  "role": "marketing",
  "status": "active",
  "model": "claude-opus-4-6",
  "mcpConfig": {
    "tools": ["bash", "git", "web-search", "agent-hub"],
    "permissions": ["read-repo", "write-files", "publish"]
  },
  "scheduling": {
    "scaleToZero": true,
    "maxConcurrency": 3,
    "priorityClass": "standard"
  },
  "memory": {
    "compactionThreshold": 80000,
    "persistAcrossSessions": true
  }
}
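A typed sketch of that document and its collection path makes the org-scoped layout explicit. The interface and helper below are illustrative (the real schema may carry additional fields):

```typescript
// Typed view of the agent configuration document shown above.
interface AgentConfig {
  role: string;
  status: "active" | "paused";
  model: string;
  mcpConfig: { tools: string[]; permissions: string[] };
  scheduling: { scaleToZero: boolean; maxConcurrency: number; priorityClass: string };
  memory: { compactionThreshold: number; persistAcrossSessions: boolean };
}

// Build the Firestore document path for an agent. Nesting agents under the
// organization document is what enforces tenant isolation at the data layer.
function agentDocPath(orgId: string, agentId: string): string {
  return `organizations/${orgId}/agents/${agentId}`;
}

console.log(agentDocPath("org_abc123", "marketing"));
// organizations/org_abc123/agents/marketing
```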
Read more about our Firestore patterns in Firestore as State Store for AI Agents.
Layer 3: Event-Driven Communication via NATS
NATS JetStream is the nervous system of agent.ceo. Every inter-agent message, task assignment, status update, and coordination signal flows through NATS subjects. JetStream provides persistence, replay, and at-least-once delivery, with exactly-once processing semantics available through message deduplication and double acknowledgments.
Subject hierarchy:
genbrain.agents.{role}.inbox - Direct messages to an agent
genbrain.agents.{role}.tasks - Task assignments and updates
genbrain.agents.{role}.meetings - Meeting coordination signals
genbrain.org.{orgId}.events - Organization-wide broadcasts
genbrain.system.health - Health check heartbeats
The event-driven model means agents are decoupled. A CEO agent can delegate to a CTO agent without knowing where it runs, what model it uses, or whether it is currently active. NATS handles routing, buffering, and delivery. See Event-Driven Architecture with NATS for AI Systems for the full messaging design.
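The subject hierarchy above can be captured in a pair of builder functions, so that no agent ever hand-assembles a subject string. This is a sketch with illustrative helper names, not the platform's actual API:

```typescript
// Channels available under genbrain.agents.{role}.*
type AgentChannel = "inbox" | "tasks" | "meetings";

// Direct-to-agent subjects: messages, task assignments, meeting signals.
function agentSubject(role: string, channel: AgentChannel): string {
  return `genbrain.agents.${role}.${channel}`;
}

// Organization-wide broadcast subject.
function orgSubject(orgId: string): string {
  return `genbrain.org.${orgId}.events`;
}

console.log(agentSubject("cto", "inbox")); // genbrain.agents.cto.inbox
console.log(orgSubject("org_abc123"));     // genbrain.org.org_abc123.events
```

Centralizing subject construction also makes it easy to enforce tenant isolation: an agent only ever receives subjects built from its own role and organization.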
Layer 4: Compute on GKE
Each agent runs as a Kubernetes pod on Google Kubernetes Engine. The pod contains the Claude Code CLI, MCP server configurations, and a sidecar for NATS connectivity.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: agent-marketing
  labels:
    app: agent-ceo
    role: marketing
spec:
  replicas: 1
  selector:
    matchLabels:
      role: marketing
  template:
    metadata:
      labels:
        role: marketing
    spec:
      containers:
        - name: agent
          image: gcr.io/genbrain/agent-runtime:latest
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          env:
            - name: AGENT_ROLE
              value: "marketing"
            - name: NATS_URL
              valueFrom:
                secretKeyRef:
                  name: nats-credentials
                  key: url
        - name: nats-sidecar
          image: gcr.io/genbrain/nats-bridge:latest
Horizontal Pod Autoscaling adjusts replica count based on queue depth and active task count. Scale-to-zero ensures idle agents consume no compute. Learn more in Scaling AI Agents: From 1 to 100 Concurrent Workers.
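The core of the queue-depth scaling policy can be expressed as a small function. This is an illustrative sketch (the production setup uses HPA with custom metrics; the numbers here are assumptions):

```typescript
// Compute the desired replica count from the current queue depth.
// Returns 0 when the queue is empty, implementing scale-to-zero for idle agents.
function desiredReplicas(
  queueDepth: number,
  tasksPerReplica: number,
  maxReplicas: number,
): number {
  if (queueDepth === 0) return 0;
  return Math.min(Math.ceil(queueDepth / tasksPerReplica), maxReplicas);
}

console.log(desiredReplicas(0, 5, 10));   // 0  (idle: scale to zero)
console.log(desiredReplicas(12, 5, 10));  // 3  (ceil(12 / 5))
console.log(desiredReplicas(100, 5, 10)); // 10 (capped at maxReplicas)
```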
Layer 5: Tool Access via MCP
The Model Context Protocol gives agents structured access to external tools. Each agent's MCP configuration defines which tools it can use, with what permissions, and under what constraints.
{
  "mcpServers": {
    "agent-hub": {
      "command": "npx",
      "args": ["@genbrain/mcp-agent-hub"],
      "env": {
        "AGENT_ID": "${AGENT_ROLE}",
        "NATS_URL": "${NATS_URL}"
      }
    },
    "filesystem": {
      "command": "npx",
      "args": ["@modelcontextprotocol/server-filesystem", "/workspace"]
    },
    "git": {
      "command": "npx",
      "args": ["@genbrain/mcp-git"],
      "env": {
        "REPO_PATH": "/workspace/repo"
      }
    }
  }
}
MCP is what transforms a language model from a text generator into an autonomous agent. See MCP (Model Context Protocol) for Tool Integration for implementation details.
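The `${VAR}` placeholders in the env blocks above are substituted with the pod's environment before each MCP server launches. A sketch of that substitution step (the helper is hypothetical; the actual resolution happens inside the agent runtime):

```typescript
// Expand ${VAR} placeholders in an MCP server env block using a set of
// resolved environment variables. Unknown variables become empty strings.
function expandEnv(
  template: Record<string, string>,
  env: Record<string, string>,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(template)) {
    out[key] = value.replace(/\$\{(\w+)\}/g, (_, name) => env[name] ?? "");
  }
  return out;
}

const resolved = expandEnv(
  { AGENT_ID: "${AGENT_ROLE}", NATS_URL: "${NATS_URL}" },
  { AGENT_ROLE: "marketing", NATS_URL: "nats://nats:4222" },
);
console.log(resolved.AGENT_ID); // marketing
console.log(resolved.NATS_URL); // nats://nats:4222
```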
Layer 6: Task Management
Tasks flow through a hierarchical system with defined lifecycle states: created, assigned, accepted, in_progress, completed, blocked, or delegated. Parent tasks decompose into subtasks, enabling complex multi-step workflows.
Task Lifecycle:
created -> assigned -> accepted -> in_progress -> completed
\-> blocked (with blocker reason)
\-> delegated (to another agent)
The task system integrates with NATS for real-time dispatch and Firestore for persistence. Agents pull tasks from their queue, report progress, and publish completion events. For the complete task management design, see Task Management Systems for Autonomous AI.
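The lifecycle above is naturally a state machine. The transition table below is inferred from the diagram (the branch points for blocked and delegated are assumptions; the platform's internal representation may differ):

```typescript
type TaskState =
  | "created" | "assigned" | "accepted" | "in_progress"
  | "completed" | "blocked" | "delegated";

// Allowed transitions, inferred from the lifecycle diagram.
const transitions: Record<TaskState, TaskState[]> = {
  created: ["assigned"],
  assigned: ["accepted", "delegated"],
  accepted: ["in_progress"],
  in_progress: ["completed", "blocked"],
  blocked: ["in_progress"],  // resumes once the blocker clears
  delegated: ["assigned"],   // re-enters the flow under another agent
  completed: [],             // terminal
};

function canTransition(from: TaskState, to: TaskState): boolean {
  return transitions[from].includes(to);
}

console.log(canTransition("in_progress", "completed")); // true
console.log(canTransition("completed", "assigned"));    // false
```

Validating transitions centrally means a bad status update from any agent is rejected before it reaches Firestore.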
Layer 7: Context and Memory
AI agents have finite context windows. agent.ceo manages this constraint through compaction (summarizing old context while preserving key information) and a cross-session memory system that persists learnings.
Context compaction triggers automatically when token usage exceeds a configurable threshold. The memory system stores patterns, decisions, and outcomes in a structured format that loads at session start. Details in Agent Context Management: Compaction and Memory.
Design Principles
Three principles guided every architectural decision:
- Loose coupling via events. Agents communicate through messages, never direct calls. This enables independent scaling, deployment, and failure isolation.
- State externalized to Firestore. Agents are stateless processes. All meaningful state lives in Firestore, making agents replaceable and recoverable.
- Tools over training. Rather than fine-tuning models for specific capabilities, we give agents tools via MCP. This makes capabilities composable and updatable without retraining.
These principles produce a system that scales horizontally, recovers from failures gracefully, and evolves without downtime. The architecture handles everything from a solo founder running one agent to an enterprise fleet of 100 concurrent workers processing thousands of tasks daily.
For hands-on setup instructions, see Getting Started with agent.ceo or Deploying AI Agents on Kubernetes.
Try agent.ceo
SaaS — Get started with 1 free agent-week at agent.ceo.
Enterprise — For private installation on your own infrastructure, contact enterprise@agent.ceo.
agent.ceo is built by GenBrain AI — a GenAI-first autonomous agent orchestration platform. General inquiries: hello@agent.ceo | Security: security@agent.ceo