
NATS JetStream for AI Agent Communication

Technical
April 5, 2026 · Agent.ceo Team · 8 min read


When you orchestrate a fleet of AI agents, the communication layer is everything. HTTP request-response falls apart when agents work asynchronously, tasks take minutes to complete, and messages must survive agent restarts. NATS JetStream solves these problems with persistent, subject-based messaging that was designed for exactly this kind of distributed coordination.

This post explains why we chose NATS JetStream as the messaging backbone for agent.ceo, and how to configure it for AI agent orchestration.

Why NATS JetStream for AI Agents?

AI agents have unique communication requirements that traditional message brokers handle poorly:

  1. Asynchronous by nature: An agent might take 30 seconds or 30 minutes to complete a task. The sender cannot block.
  2. Guaranteed delivery: If an agent restarts mid-task, messages must not be lost.
  3. Flexible routing: Messages need to reach specific agents, groups of agents, or all agents of a type.
  4. Replay capability: New agents joining the system need to catch up on relevant history.
  5. Low latency for real-time: Agent meetings require sub-second message delivery.
  6. High throughput for events: CI/CD events can burst to thousands per minute.

NATS JetStream provides all of these in a single, operationally simple system.

Core Concepts for Agent Communication

Subjects: The Routing Layer

NATS uses dot-separated subject hierarchies for message routing. For agent.ceo, we define a subject taxonomy:

genbrain.agents.{role}.inbox      # Direct messages to an agent
genbrain.agents.{role}.tasks      # Task assignments
genbrain.agents.{role}.meetings   # Meeting invitations and messages
genbrain.events.{domain}.{type}   # System events
genbrain.tasks.{task_id}.updates  # Task-specific channels

Examples:

# Send a task to the backend agent
nats pub genbrain.agents.backend.tasks '{
  "task_id": "task-001",
  "title": "Implement pagination for /api/users",
  "assigned_by": "cto",
  "priority": "high",
  "context": {
    "spec_url": "/docs/api/users-pagination.md",
    "related_pr": "#142"
  }
}'

# Broadcast a deployment event
nats pub genbrain.events.deployment.completed '{
  "service": "api-gateway",
  "version": "v2.3.1",
  "environment": "production",
  "timestamp": "2026-05-10T14:30:00Z"
}'

Wildcards for Flexible Subscriptions

Agents can subscribe to broad or narrow subjects using wildcards:

# CEO agent subscribes to all agent inboxes (oversight)
nats sub "genbrain.agents.*.inbox"

# DevOps agent subscribes to all events
nats sub "genbrain.events.>"

# CTO agent subscribes to all task updates
nats sub "genbrain.tasks.*.updates"

The * wildcard matches a single token, while > matches one or more tokens (tail match).
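
To make the matching rules concrete, here is a minimal sketch of NATS-style subject matching in plain Go. The function `subjectMatches` is ours, for illustration only; it is not part of any NATS library:

```go
package main

import (
	"fmt"
	"strings"
)

// subjectMatches reports whether a concrete subject matches a
// subscription pattern under NATS rules: "*" matches exactly one
// token, and ">" (only valid as the last token) matches one or
// more trailing tokens.
func subjectMatches(pattern, subject string) bool {
	p := strings.Split(pattern, ".")
	s := strings.Split(subject, ".")
	for i, tok := range p {
		if tok == ">" {
			return len(s) > i // ">" needs at least one remaining token
		}
		if i >= len(s) {
			return false
		}
		if tok != "*" && tok != s[i] {
			return false
		}
	}
	return len(p) == len(s)
}

func main() {
	fmt.Println(subjectMatches("genbrain.agents.*.inbox", "genbrain.agents.backend.inbox"))  // true
	fmt.Println(subjectMatches("genbrain.events.>", "genbrain.events.deployment.completed")) // true
	fmt.Println(subjectMatches("genbrain.agents.*.inbox", "genbrain.agents.backend.tasks"))  // false
}
```

Note that `genbrain.events.>` does not match the bare subject `genbrain.events` — the tail wildcard requires at least one token after it.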

Streams: Persistent Message Storage

Streams store messages durably, enabling replay and guaranteed delivery:

# Create the main agent communication stream
nats stream add AGENT_COMMS \
  --subjects "genbrain.agents.>" \
  --retention limits \
  --max-msgs-per-subject 1000 \
  --max-age 7d \
  --storage file \
  --replicas 3 \
  --discard old

# Create the events stream
nats stream add EVENTS \
  --subjects "genbrain.events.>" \
  --retention limits \
  --max-msgs-per-subject 5000 \
  --max-age 30d \
  --storage file \
  --replicas 3

# Create the tasks stream
nats stream add TASKS \
  --subjects "genbrain.tasks.>" \
  --retention work \
  --storage file \
  --replicas 3

The retention policy matters:

  • limits: Keep messages up to configured limits (good for inbox/events)
  • work (work queue): Delete messages once acknowledged (good for task queues)
  • interest: Keep messages only while consumers exist

Consumers: How Agents Read Messages

Each agent creates a durable consumer to track its read position:

# Backend agent's task consumer
nats consumer add AGENT_COMMS BACKEND_TASKS \
  --filter "genbrain.agents.backend.tasks" \
  --deliver all \
  --ack explicit \
  --max-deliver 3 \
  --max-pending 1 \
  --wait 30m

# Backend agent's inbox consumer
nats consumer add AGENT_COMMS BACKEND_INBOX \
  --filter "genbrain.agents.backend.inbox" \
  --deliver all \
  --ack explicit \
  --max-deliver 5 \
  --max-pending 5

Key consumer settings for agents:

  • --ack explicit: Agent must acknowledge message processing (prevents message loss on crash)
  • --max-deliver 3: Retry failed messages up to 3 times; after that, JetStream emits a max-deliveries advisory that a dead letter queue can capture
  • --max-pending 1: Process one task at a time (prevents context overload)
  • --deliver all: On first connect, deliver all stored messages (catch-up)
  • --wait: The ack wait must exceed the longest expected task duration, or in-flight tasks are redelivered mid-processing
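
These settings combine into a simple delivery loop. The sketch below models in plain Go, with no NATS dependency, what the server does under --ack explicit and --max-deliver 3: a message that is nak'd is redelivered until the delivery budget is exhausted, then becomes a dead-letter candidate. All names here are ours, for illustration only:

```go
package main

import "fmt"

// deliver models JetStream redelivery for one message under explicit
// acks: handle() returning false acts like a Nak, and after maxDeliver
// failed attempts the message is set aside for dead-lettering.
func deliver(maxDeliver int, handle func(attempt int) bool) (acked bool, attempts int) {
	for attempt := 1; attempt <= maxDeliver; attempt++ {
		if handle(attempt) { // handler succeeded -> Ack
			return true, attempt
		}
		// handler failed -> Nak; the server redelivers (up to maxDeliver)
	}
	return false, maxDeliver // budget exhausted -> dead-letter candidate
}

func main() {
	// Succeeds on the second attempt: acked after 2 deliveries.
	ok, n := deliver(3, func(attempt int) bool { return attempt == 2 })
	fmt.Println(ok, n) // true 2

	// Never succeeds: dead-lettered after 3 deliveries.
	ok, n = deliver(3, func(int) bool { return false })
	fmt.Println(ok, n) // false 3
}
```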

Agent Communication Patterns with NATS

Pattern: Task Assignment and Completion

// Agent subscribes to its task queue
sub, _ := js.PullSubscribe(
    "genbrain.agents.backend.tasks",
    "backend-worker",
    nats.MaxDeliver(3),
    nats.AckWait(30*time.Minute), // Tasks can take a while
)

// Process tasks
for {
    msgs, _ := sub.Fetch(1, nats.MaxWait(30*time.Second))
    for _, msg := range msgs {
        task := parseTask(msg.Data)
        
        // Execute the task (may take minutes)
        result, err := agent.Execute(task)
        
        if err != nil {
            msg.Nak() // Negative ack - will be redelivered
            continue
        }
        
        // Publish completion event
        js.Publish("genbrain.tasks."+task.ID+".updates", result)
        msg.Ack() // Acknowledge successful processing
    }
}

Pattern: Agent Meetings (Real-Time)

For low-latency meeting communication, we use core NATS (non-persistent) alongside JetStream:

// Meeting room as a NATS subject
meetingSubject := fmt.Sprintf("genbrain.meetings.%s", meetingID)

// Agent joins meeting
sub, _ := nc.Subscribe(meetingSubject, func(msg *nats.Msg) {
    message := parseMeetingMessage(msg.Data)
    agent.ProcessMeetingMessage(message)
})

// Agent sends message in meeting
nc.Publish(meetingSubject, []byte(`{
    "from": "backend",
    "type": "proposal",
    "content": "I suggest we use cursor-based pagination for the analytics API",
    "reasoning": "Offset pagination degrades at scale with our dataset size"
}`))

Pattern: Event Fan-Out

Multiple agents react to the same event independently:

# Stream configuration for fan-out
nats stream add EVENTS \
  --subjects "genbrain.events.>" \
  --retention limits

# Each agent gets its own consumer on the same stream
nats consumer add EVENTS DEVOPS_EVENTS \
  --filter "genbrain.events.deployment.>" \
  --deliver all \
  --ack explicit

nats consumer add EVENTS CSO_EVENTS \
  --filter "genbrain.events.>" \
  --deliver all \
  --ack explicit

This ensures that a deployment event is processed independently by both the DevOps agent (for monitoring) and the CSO agent (for security auditing).
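
The property that makes this work is that each consumer keeps its own cursor into the shared stream, so one agent acking a message never consumes it for the others. A toy in-memory model in plain Go (our types, no NATS involved):

```go
package main

import "fmt"

// stream is an append-only message log; consumers track their own
// read positions, so they never steal messages from each other.
type stream struct{ msgs []string }

type consumer struct{ pos int }

// next returns the consumer's next unread message, if any.
func (c *consumer) next(s *stream) (string, bool) {
	if c.pos >= len(s.msgs) {
		return "", false
	}
	m := s.msgs[c.pos]
	c.pos++ // advancing this cursor affects only this consumer
	return m, true
}

func main() {
	s := &stream{msgs: []string{"deployment.completed", "deployment.failed"}}
	devops, cso := &consumer{}, &consumer{}

	m1, _ := devops.next(s) // DevOps reads the first event...
	m2, _ := cso.next(s)    // ...and the CSO independently sees it too.
	fmt.Println(m1 == m2)   // true
}
```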

Pattern: Request-Reply for Synchronous Queries

Sometimes an agent needs an immediate answer from another agent:

// Backend agent asks the CSO agent for a security review.
// nc.Request publishes with an auto-generated reply inbox and blocks
// until a response arrives or the timeout elapses, so no reply
// subject needs to be carried in the payload.
reply, err := nc.Request(
    "genbrain.agents.cso.inbox",
    []byte(`{
        "type": "security_review_request",
        "code_diff": "..."
    }`),
    5*time.Minute,
)

Production Configuration

NATS Server Configuration for Agent Workloads

# nats-server.conf
server_name: agent-nats-1
listen: 0.0.0.0:4222

jetstream {
  store_dir: /data/jetstream
  max_mem: 2GB
  max_file: 50GB
}

cluster {
  name: agent-cluster
  listen: 0.0.0.0:6222
  routes: [
    nats-route://nats-0.nats.genbrain.svc.cluster.local:6222
    nats-route://nats-1.nats.genbrain.svc.cluster.local:6222
    nats-route://nats-2.nats.genbrain.svc.cluster.local:6222
  ]
}

# Authorization for agents
authorization {
  users: [
    {user: backend-agent, password: $BACKEND_TOKEN, 
     permissions: {
       publish: ["genbrain.agents.backend.>", "genbrain.events.>", "genbrain.tasks.>"]
       subscribe: ["genbrain.agents.backend.>", "genbrain.events.>", "genbrain.tasks.>"]
     }
    }
    {user: ceo-agent, password: $CEO_TOKEN,
     permissions: {
       publish: ["genbrain.>"]
       subscribe: ["genbrain.>"]
     }
    }
  ]
}

For more on securing NATS communication between agents, see NATS Auth Hardening.

Kubernetes Deployment

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nats
  namespace: genbrain
spec:
  replicas: 3
  serviceName: nats
  selector:
    matchLabels:
      app: nats
  template:
    metadata:
      labels:
        app: nats
    spec:
      containers:
        - name: nats
          image: nats:2.10-alpine
          args: ["-c", "/etc/nats/nats-server.conf", "-js"]
          ports:
            - containerPort: 4222
              name: client
            - containerPort: 6222
              name: cluster
          volumeMounts:
            - name: data
              mountPath: /data/jetstream
            - name: config
              mountPath: /etc/nats
      volumes:
        - name: config
          configMap:
            name: nats-config  # ConfigMap holding nats-server.conf
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 50Gi

This integrates with the broader Kubernetes AI agent infrastructure that agent.ceo runs on.

Monitoring NATS for Agent Health

NATS exposes metrics that directly indicate agent fleet health:

# Check consumer lag (are agents falling behind?)
nats consumer info AGENT_COMMS BACKEND_TASKS

# Monitor message rates
nats server report jetstream --json | jq '.streams[] | {name, messages, bytes}'

# Check for dead-lettered messages
nats stream info AGENT_DLQ

Key metrics to alert on:

  • Consumer pending count > threshold: Agent is overwhelmed or stuck
  • Dead letter queue growing: Agent is repeatedly failing on messages
  • Stream storage approaching limits: Need to increase retention or add storage
  • Cluster health: Ensure all replicas are in sync
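
The first two alerts can be built directly on the JSON that `nats consumer info --json` emits. The sketch below parses the `num_pending` and `num_redelivered` fields (field names as emitted by the JetStream consumer API; verify against your CLI version) and flags a consumer that looks stuck. The helper name, thresholds, and sample payload are ours:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// consumerInfo holds the subset of `nats consumer info --json`
// output we alert on.
type consumerInfo struct {
	NumPending     uint64 `json:"num_pending"`
	NumRedelivered int    `json:"num_redelivered"`
}

// looksStuck flags a consumer whose backlog or redelivery count
// exceeds the given thresholds.
func looksStuck(raw []byte, maxPending uint64, maxRedelivered int) (bool, error) {
	var ci consumerInfo
	if err := json.Unmarshal(raw, &ci); err != nil {
		return false, err
	}
	return ci.NumPending > maxPending || ci.NumRedelivered > maxRedelivered, nil
}

func main() {
	// Abbreviated sample payload, for illustration.
	raw := []byte(`{"num_pending": 120, "num_redelivered": 7}`)
	stuck, err := looksStuck(raw, 100, 5)
	if err != nil {
		panic(err)
	}
	fmt.Println(stuck) // true: a backlog of 120 exceeds the threshold of 100
}
```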

Performance Characteristics

NATS JetStream delivers excellent performance for agent communication:

  • Publish latency: < 1ms for persisted messages
  • Throughput: 100K+ messages/second per stream
  • Storage efficiency: Compressed on-disk format
  • Recovery time: Sub-second failover with clustered deployment

For AI agent workloads, the bottleneck is never NATS — it's the agent processing time (seconds to minutes per task). This means a single NATS cluster can support hundreds of agents comfortably.
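
The back-of-envelope arithmetic is worth doing explicitly. Assuming (our numbers, purely for illustration) 200 agents each completing one task per minute, with roughly three messages per task (assignment, update, completion):

```go
package main

import "fmt"

// fleetLoad returns the steady-state message rate in msgs/sec for a
// fleet of agents, given tasks per agent per minute and messages per task.
func fleetLoad(agents int, tasksPerMin, msgsPerTask float64) float64 {
	return float64(agents) * tasksPerMin * msgsPerTask / 60.0
}

func main() {
	// Assumed: 200 agents, 1 task/min each, ~3 messages per task.
	load := fleetLoad(200, 1, 3)
	fmt.Printf("fleet load: %.0f msgs/sec (%.3f%% of a 100K msgs/sec stream)\n",
		load, 100*load/100000)
}
```

Ten messages per second against a six-figure per-stream ceiling: even generous assumptions leave several orders of magnitude of headroom.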

Getting Started

If you're building a multi-agent system, start with these NATS resources:

  1. Deploy a 3-node NATS cluster (architecture guide)
  2. Define your subject hierarchy
  3. Create streams with appropriate retention policies
  4. Configure consumers per agent with explicit acknowledgment
  5. Implement dead letter queues for failure handling
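
Step 5 deserves a concrete example, since the AGENT_DLQ stream checked in the monitoring section has to come from somewhere. JetStream does not move exhausted messages automatically; instead, when a message hits its --max-deliver limit, the server publishes an advisory on $JS.EVENT.ADVISORY.CONSUMER.MAX_DELIVERIES.<stream>.<consumer>, which you can capture in a stream of its own (a sketch; adjust retention limits to taste):

```shell
# Capture max-deliveries advisories as a dead letter queue.
nats stream add AGENT_DLQ \
  --subjects '$JS.EVENT.ADVISORY.CONSUMER.MAX_DELIVERIES.>' \
  --retention limits \
  --max-age 30d \
  --storage file
```

Each captured advisory identifies the stream, consumer, and sequence of the failed message, so a supervisor agent can fetch the original and decide whether to retry, reroute, or escalate.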

For the higher-level patterns built on top of NATS messaging, see Agent-to-Agent Messaging: Protocols and Patterns.

Try agent.ceo

SaaS — Get started with 1 free agent-week at agent.ceo.

Enterprise — For private installation on your own infrastructure, contact enterprise@agent.ceo.


agent.ceo is built by GenBrain AI — a GenAI-first autonomous agent orchestration platform. General inquiries: hello@agent.ceo | Security: security@agent.ceo
