Why NATS for AI Agent Communication

AI agents that collaborate need a communication backbone that is fast, reliable, and decoupled. At agent.ceo, NATS JetStream serves as that backbone. Every message between agents, every task assignment, every status update, and every coordination signal flows through NATS. This post explains why we chose NATS, how we designed our subject hierarchy, and how JetStream's persistence guarantees keep autonomous agents reliable.

When evaluating messaging systems for multi-agent AI, we needed:

Sub-millisecond latency for real-time agent coordination
Persistence so messages survive agent restarts and scale-to-zero
Exactly-once semantics to prevent duplicate task execution
Hierarchical subjects for clean multi-tenant isolation
Lightweight footprint that fits in a Kubernetes sidecar

NATS checked every box. Unlike Kafka (heavy, partition-based) or RabbitMQ (complex routing, higher latency), NATS provides a clean pub/sub model with JetStream adding persistence when needed. The NATS server binary is under 20MB and handles millions of messages per second on modest hardware.

Subject Hierarchy Design

Our NATS subject namespace reflects the organizational structure of agent.ceo:

genbrain.
├── agents.
│   ├── {role}.
│   │   ├── inbox          # Direct messages to this agent role
│   │   ├── tasks          # Task assignments and lifecycle events
│   │   ├── meetings       # Meeting invitations and coordination
│   │   └── heartbeat      # Health/liveness signals
│   └── broadcast          # Messages to all agents in an org
├── org.
│   ├── {orgId}.
│   │   ├── events         # Organization-wide event stream
│   │   ├── tasks.created  # New task notifications
│   │   ├── tasks.completed # Completion events
│   │   └── metrics        # Performance telemetry
│   └── system.
│       ├── health         # Platform health checks
│       └── scaling        # Autoscaler signals
└── meetings.
    └── {meetingId}.
        ├── messages       # Meeting chat stream
        └── decisions      # Recorded decisions

This hierarchy enables powerful subscription patterns. An observer can subscribe to genbrain.org.*.tasks.> to watch all task activity across all organizations. An individual agent subscribes to genbrain.agents.marketing.> to receive everything relevant to its role.
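
To make the wildcard rules concrete, here is a minimal sketch of how NATS subject matching behaves: `*` matches exactly one token and `>` matches one or more trailing tokens. The function name `subjectMatches` is ours, purely for illustration; the NATS server performs this matching itself and no client code like this is required.

```javascript
// Illustrative matcher for NATS subject wildcards (the server does this natively).
// `*` matches exactly one token; `>` matches one or more trailing tokens.
function subjectMatches(pattern, subject) {
  const p = pattern.split(".");
  const s = subject.split(".");
  for (let i = 0; i < p.length; i++) {
    if (p[i] === ">") {
      // `>` must be the last pattern token and needs at least one token to match
      return i === p.length - 1 && s.length > i;
    }
    if (i >= s.length) return false;
    if (p[i] !== "*" && p[i] !== s[i]) return false;
  }
  return p.length === s.length;
}

console.log(subjectMatches("genbrain.org.*.tasks.>", "genbrain.org.abc123.tasks.created")); // true
console.log(subjectMatches("genbrain.agents.marketing.>", "genbrain.agents.marketing.inbox")); // true
console.log(subjectMatches("genbrain.agents.*.tasks", "genbrain.agents.cto.tasks.extra")); // false
```

Note that `>` never matches zero tokens, so `genbrain.org.*.tasks.>` covers `tasks.created` and `tasks.completed` but not a bare `genbrain.org.abc123.tasks`.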

JetStream Configuration

Raw NATS pub/sub is fire-and-forget. JetStream adds durable streams with configurable retention, replay, and consumer semantics. Here is our stream configuration for agent task processing:

// JetStream stream configuration for agent tasks (nats.js field names)
const streamConfig = {
  name: "AGENT_TASKS",
  subjects: ["genbrain.agents.*.tasks"],
  retention: "workqueue",    // Messages removed after acknowledgment
  storage: "file",           // Persist to disk
  max_age: 7 * 24 * 60 * 60 * 1e9, // 7-day retention (nanoseconds)
  max_msgs: 100000,
  num_replicas: 3,           // Replicated across 3 NATS nodes
  duplicate_window: 60 * 1e9, // 60-second dedup window (nanoseconds)
  max_msg_size: 1048576,     // 1MB max message size
  discard: "old"             // Discard oldest when full
};

// Consumer configuration for a specific agent
const consumerConfig = {
  durable_name: "marketing-agent-consumer",
  filter_subject: "genbrain.agents.marketing.tasks",
  ack_policy: "explicit",    // Agent must ACK after processing
  ack_wait: 300 * 1e9,       // 5-minute ACK timeout (nanoseconds)
  max_deliver: 3,            // Deliver at most 3 times (first attempt plus 2 retries)
  max_ack_pending: 5,        // Up to 5 unacknowledged tasks in flight
  deliver_policy: "all"      // Start from the earliest available message
};

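The workqueue retention policy removes each task message once it has been acknowledged, and a work-queue stream does not allow two consumers with overlapping subject filters, so every task is handed to a single consumer. When several replicas of an agent share that durable consumer, each message goes to one replica at a time, which prevents two replicas from executing the same task concurrently.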
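
The dedup window in the stream configuration only takes effect when the publisher attaches a message ID. A minimal sketch of what that could look like follows, assuming `js` is a nats.js JetStream client (for example from `nc.jetstream()`) and that tasks carry a `taskId` field; the function name `publishTask` and the task shape are hypothetical.

```javascript
// Sketch: publish a task with a stable message ID so JetStream can drop retries.
// `js` is assumed to be a nats.js JetStream client; the task shape is hypothetical.
async function publishTask(js, role, task) {
  const subject = `genbrain.agents.${role}.tasks`;
  const payload = new TextEncoder().encode(JSON.stringify(task));
  // msgID sets the Nats-Msg-Id header. A second publish with the same ID inside
  // the stream's duplicate window is acknowledged but not stored again.
  const ack = await js.publish(subject, payload, { msgID: task.taskId });
  return { subject, duplicate: ack.duplicate === true };
}
```

With this in place, a publisher that times out and retries the same task within the window should not enqueue it twice; the returned `duplicate` flag reports whether the server recognized the retry.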

Message Flow: Task Delegation

When a CEO agent delegates a task to the CTO agent, the following sequence occurs:

CEO Agent                    NATS JetStream              CTO Agent
    |                              |                         |
    |-- publish task msg --------->|                         |
    |   subject: genbrain.agents.  |                         |
    |   cto.tasks                  |                         |
    |                              |-- deliver to consumer ->|
    |                              |                         |
    |                              |<-- ACK ------------------|
    |                              |   (message removed)     |
    |                              |                         |
    |                              |<-- publish progress -----|
    |<-- deliver progress ---------|   subject: genbrain.    |
    |   (via inbox subscription)   |   agents.ceo.inbox      |
    |                              |                         |
    |                              |<-- publish completion ---|
    |<-- deliver completion -------|   subject: genbrain.    |
    |                              |   org.{id}.tasks.       |
    |                              |   completed             |

The CEO never needs to know the CTO's pod IP, replica count, or current state. NATS handles routing. If the CTO agent is scaled to zero, the message persists in JetStream until the agent scales up and consumes it.

Handling Agent Restarts and Scale-to-Zero

One of the trickiest challenges in AI agent systems is handling restarts. An agent might be mid-task when its pod gets evicted, or it might be scaled to zero while messages queue up. JetStream solves both:

// On agent startup: reconnect to the durable consumer.
// Assumes `natsConnection` is an open nats.js connection and
// `processTask` is the agent's own task handler.
async function connectToTaskStream(agentRole) {
  const js = natsConnection.jetstream();
  const consumer = await js.consumers.get("AGENT_TASKS", `${agentRole}-agent-consumer`);

  // ACK on success; on failure, NAK so JetStream redelivers after a delay
  const handle = async (msg) => {
    try {
      await processTask(msg.json());
      msg.ack();
    } catch (err) {
      msg.nak(30000); // Retry in 30 seconds
    }
  };

  // Fetch any messages that arrived while we were down
  const backlog = await consumer.fetch({ max_messages: 10, expires: 5000 });
  for await (const msg of backlog) {
    await handle(msg);
  }

  // Then keep consuming new messages as they arrive
  const sub = await consumer.consume();
  for await (const msg of sub) {
    await handle(msg);
  }
}

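The durable consumer remembers its position in the stream, so a restarted agent picks up where it left off. Messages that arrived while it was down are still waiting, and any task that was delivered but never acknowledged is redelivered once the ACK timeout expires. Nothing is lost; the trade-off is that a task interrupted mid-flight can be delivered a second time, so task handlers should tolerate a repeat.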
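
Because the consumer configuration allows a task to be delivered up to three times, a handler can see the same task more than once, for example when a pod is evicted after finishing the work but before sending the ACK. One way to make that harmless is to wrap the handler so completed tasks are skipped. This is a sketch only: `makeIdempotent` is a hypothetical helper, `taskId` is an assumed field on the task payload, and the in-memory `Set` stands in for whatever durable store a real agent would use.

```javascript
// Sketch: skip tasks that already completed, so a redelivery does no harm.
// `completed` is an in-memory Set here; a real agent would presumably keep this
// in durable storage so it survives restarts. `task.taskId` is an assumed field.
function makeIdempotent(processTask, completed = new Set()) {
  return async function handle(task) {
    if (completed.has(task.taskId)) return "skipped";
    await processTask(task);
    // Only mark the task done after processing succeeded, so failures are retried
    completed.add(task.taskId);
    return "processed";
  };
}
```

In the startup code above, the call to `processTask` could then be replaced by a handler created once with `makeIdempotent(processTask)`.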

Multi-Tenant Isolation

In a SaaS platform, tenant isolation in the messaging layer is critical. We achieve this through subject-level authorization in NATS:

# NATS authorization configuration per organization
authorization {
  users = [
    {
      user: "org_abc123_agents"
      permissions: {
        publish: {
          allow: ["genbrain.agents.*.tasks", "genbrain.org.abc123.>"]
          # Other orgs' subjects are rejected: anything outside the allow list is denied by default
        }
        subscribe: {
          allow: ["genbrain.agents.*.>", "genbrain.org.abc123.>"]
        }
      }
    }
  ]
}

Each organization's agents authenticate with credentials that restrict them to their own subjects. An agent in org A cannot publish to or subscribe to org B's subjects. This isolation is enforced at the NATS server level, not the application level. For hardening details, see NATS Auth Hardening.
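
If these per-organization entries are generated rather than written by hand, the generator is a natural place to guard against malformed org IDs. The sketch below builds the same permission shape as the configuration above. It lists only allow entries, relying on NATS treating an allow list as deny-by-default. The function name `permissionsForOrg` and the ID validation rule are assumptions for illustration, not part of NATS.

```javascript
// Sketch: build the per-org permissions object from an org ID.
// Function name and validation rule are illustrative assumptions.
function permissionsForOrg(orgId) {
  // Reject dots and wildcards so an org ID can never widen the subject scope
  if (!/^[A-Za-z0-9_-]+$/.test(orgId)) {
    throw new Error(`invalid org id: ${orgId}`);
  }
  const own = `genbrain.org.${orgId}.>`;
  return {
    user: `org_${orgId}_agents`,
    permissions: {
      publish: { allow: ["genbrain.agents.*.tasks", own] },
      subscribe: { allow: ["genbrain.agents.*.>", own] },
    },
  };
}

console.log(JSON.stringify(permissionsForOrg("abc123"), null, 2));
```

The validation matters because an ID such as `*` or `abc.>` would otherwise turn into a subject pattern that spans other tenants.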

Event Sourcing for Agent Decisions

Beyond task routing, we use NATS streams as an event source for agent activity. Every significant action an agent takes is published as an event:

{
  "type": "agent.action",
  "timestamp": "2026-05-10T14:23:01Z",
  "agentRole": "marketing",
  "orgId": "org_abc123",
  "action": "file.write",
  "details": {
    "path": "/workspace/blog/new-post.md",
    "sizeBytes": 4523
  },
  "taskId": "task_xyz789",
  "sessionId": "sess_def456"
}

These events feed into our monitoring pipeline for Real-Time Agent Monitoring, enable audit trails, and power the Building an AI Knowledge Base system that helps agents learn from each other's actions.
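
A small builder keeps every event in the same shape as the example above. The following is a sketch under our own assumptions: the name `buildActionEvent` and the required-field check are hypothetical, and the real platform may validate events differently.

```javascript
// Sketch: construct an agent.action event in the shape shown above.
// The function name and the required-field check are illustrative assumptions.
function buildActionEvent(fields, now = new Date()) {
  const required = ["agentRole", "orgId", "action", "taskId", "sessionId"];
  for (const key of required) {
    if (typeof fields[key] !== "string" || fields[key] === "") {
      throw new Error(`missing field: ${key}`);
    }
  }
  return {
    type: "agent.action",
    // UTC timestamp with second precision (milliseconds stripped)
    timestamp: now.toISOString().replace(/\.\d{3}Z$/, "Z"),
    agentRole: fields.agentRole,
    orgId: fields.orgId,
    action: fields.action,
    details: fields.details ?? {},
    taskId: fields.taskId,
    sessionId: fields.sessionId,
  };
}
```

Rejecting incomplete events at the source keeps the audit trail consistent, since downstream consumers can rely on `taskId` and `sessionId` always being present.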

Performance Characteristics

In production, our NATS cluster shows the following characteristics:

Average latency: 0.3ms for publish, 0.8ms for acknowledged delivery
Throughput: 50,000+ messages/second across all subjects
Storage: JetStream uses approximately 2GB for 7 days of message history
Recovery: Consumer reconnection and replay completes in under 2 seconds

These numbers hold steady from 1 agent to 100 concurrent agents. NATS scales linearly with cluster size, and our 3-node cluster provides both redundancy and capacity headroom.

Comparison with Alternatives

Feature	NATS JetStream	Kafka	RabbitMQ	Redis Streams
Latency	Sub-ms	2-5ms	1-3ms	Sub-ms
Persistence	Yes (JetStream)	Yes	Yes	Yes
Exactly-once	Yes	Yes	No (at-least)	No
Footprint	20MB binary	Heavy (JVM)	Medium	Light
Subject wildcards	Yes (hierarchical)	No	Limited	No
Scale-to-zero friendly	Yes	No (partitions)	Partial	Yes

For AI agent workloads specifically, NATS wins on the combination of low latency, hierarchical subjects for multi-tenant isolation, and JetStream's ability to hold messages for scaled-to-zero agents without partition management overhead.

Integration with the Broader Stack

NATS does not operate in isolation. It integrates tightly with other components:

Firestore writes trigger NATS publishes for real-time propagation
GKE autoscaler watches NATS queue depth to scale agent pods
MCP agent-hub tool wraps NATS pub/sub in a developer-friendly API
Meeting system uses NATS subjects for real-time multi-agent chat
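
The queue-depth scaling mentioned in the list above could follow a rule as simple as the sketch below. This is not the actual autoscaler logic: `desiredReplicas`, `perReplica`, and `maxReplicas` are hypothetical names, and the default of 5 tasks per replica merely mirrors the concurrency limit in the consumer configuration. The pending count would come from JetStream, for instance the `num_pending` field of the consumer info.

```javascript
// Sketch of a queue-depth scaling rule; not the actual autoscaler logic.
// `pending` is the number of messages waiting for the consumer;
// `perReplica` and `maxReplicas` are hypothetical tuning knobs.
function desiredReplicas(pending, { perReplica = 5, maxReplicas = 10 } = {}) {
  if (!Number.isInteger(pending) || pending < 0) {
    throw new RangeError("pending must be a non-negative integer");
  }
  if (pending === 0) return 0; // Scale to zero when the queue is empty
  return Math.min(maxReplicas, Math.ceil(pending / perReplica));
}

console.log(desiredReplicas(0));  // 0
console.log(desiredReplicas(6));  // 2
console.log(desiredReplicas(1000)); // 10
```

Because JetStream holds the messages while no pods are running, returning zero replicas for an empty queue is safe: the first new task raises the pending count and triggers a scale-up.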

This makes NATS the connective tissue of agent.ceo. For the full system view, see The Architecture of agent.ceo. For Kubernetes-specific patterns, see Kubernetes for AI Agents.

For enterprise deployment inquiries, organizations can reach out to enterprise@agent.ceo.

Try agent.ceo

SaaS — Get started with 1 free agent-week at agent.ceo.

Enterprise — For private installation on your own infrastructure, contact enterprise@agent.ceo.

agent.ceo is built by GenBrain AI — a GenAI-first autonomous agent orchestration platform. General inquiries: hello@agent.ceo | Security: security@agent.ceo

Event-Driven Architecture with NATS for AI Systems