
Deploying AI Agents to Kubernetes

Technical
April 1, 2026 · Agent.ceo Team · 7 min read

Every AI agent on agent.ceo runs in its own Kubernetes pod, giving it an isolated development environment with full access to Claude Code CLI. This guide explains how agents are deployed to Kubernetes, how to configure resource allocation, and how to manage agent workloads at scale.

Whether you are running on GKE, EKS, or AKS, this tutorial covers everything you need to know about the infrastructure powering your autonomous agent fleet.

How agent.ceo Uses Kubernetes

Each agent operates as an independent pod with:

  • Its own filesystem: Agents clone repos and work in isolation
  • Claude Code CLI: The AI runtime that powers agent reasoning and actions
  • Tool access: Configured integrations injected as environment variables and secrets
  • Network policies: Controlled access to external services
  • Resource limits: CPU and memory boundaries to control costs

This architecture ensures agents cannot interfere with each other and provides the security isolation needed for production workloads. Learn more in our architecture overview.

Prerequisites

To follow this guide, you need:

  • An agent.ceo account with a connected team (see Getting Started)
  • Basic familiarity with Kubernetes concepts (pods, deployments, services)
  • Access to your Kubernetes cluster (for custom configurations)

Note: agent.ceo manages the Kubernetes infrastructure for you by default. This guide is for teams that want to understand or customize the deployment.

Step 1: Understand the Default Deployment

When you deploy an agent through the dashboard, agent.ceo automatically creates a Kubernetes deployment:

# Auto-generated agent deployment (simplified)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: agent-code-reviewer
  namespace: agentceo-agents
  labels:
    app: agent-ceo
    agent-role: code-reviewer
    team: core-engineering
spec:
  replicas: 1
  selector:
    matchLabels:
      agent-id: agent-code-reviewer
  template:
    metadata:
      labels:
        agent-id: agent-code-reviewer
        agent-role: code-reviewer
    spec:
      containers:
        - name: agent-runtime
          image: agentceo/agent-runtime:latest
          resources:
            requests:
              cpu: "500m"
              memory: "2Gi"
            limits:
              cpu: "2000m"
              memory: "8Gi"
          env:
            - name: AGENT_ROLE
              value: "code-reviewer"
            - name: AGENT_TEAM
              value: "core-engineering"
            - name: GITHUB_TOKEN
              valueFrom:
                secretKeyRef:
                  name: agent-github-token
                  key: token
          volumeMounts:
            - name: agent-workspace
              mountPath: /workspace
            - name: agent-config
              mountPath: /etc/agent
      volumes:
        - name: agent-workspace
          emptyDir:
            sizeLimit: 20Gi
        - name: agent-config
          configMap:
            name: agent-code-reviewer-config
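
The deployment mounts a ConfigMap named agent-code-reviewer-config at /etc/agent. agent.ceo generates this for you; as a rough sketch of the idea, it might look like the following (the keys and values here are illustrative, not the actual schema):

# Hypothetical agent ConfigMap (keys and values illustrative)
apiVersion: v1
kind: ConfigMap
metadata:
  name: agent-code-reviewer-config
  namespace: agentceo-agents
data:
  agent.yaml: |
    role: code-reviewer
    team: core-engineering
    repositories:
      - org/backend-api
    max_concurrent_tasks: "2"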

Step 2: Configure Resource Limits

Different agent roles require different resources. A code review agent needs less memory than one that runs full test suites. Configure resources based on workload:

# Resource profiles for different agent types
resource_profiles:
  code-reviewer:
    cpu_request: "500m"
    cpu_limit: "2000m"
    memory_request: "2Gi"
    memory_limit: "8Gi"
    storage: "10Gi"

  security-analyst:
    cpu_request: "1000m"
    cpu_limit: "4000m"
    memory_request: "4Gi"
    memory_limit: "16Gi"
    storage: "30Gi"

  devops-engineer:
    cpu_request: "500m"
    cpu_limit: "2000m"
    memory_request: "2Gi"
    memory_limit: "8Gi"
    storage: "20Gi"

  backend-developer:
    cpu_request: "1000m"
    cpu_limit: "4000m"
    memory_request: "4Gi"
    memory_limit: "16Gi"
    storage: "50Gi"

Apply custom resource profiles through the CLI:

# Set resource profile for an agent
agentceo agent configure CodeReviewer \
  --cpu-limit 2000m \
  --memory-limit 8Gi \
  --storage-limit 10Gi
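
If you manage the cluster yourself, you can also enforce defaults at the namespace level with a standard Kubernetes LimitRange, so any agent pod created without explicit limits still gets sane boundaries (the values here are illustrative):

# Namespace-level defaults for agent containers (values illustrative)
apiVersion: v1
kind: LimitRange
metadata:
  name: agent-defaults
  namespace: agentceo-agents
spec:
  limits:
    - type: Container
      default:            # applied when a container sets no limits
        cpu: "2000m"
        memory: "8Gi"
      defaultRequest:     # applied when a container sets no requests
        cpu: "500m"
        memory: "2Gi"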

Step 3: Set Up Namespace Isolation

For security, agent.ceo deploys agents in isolated namespaces with network policies:

# Network policy restricting agent communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: agent-network-policy
  namespace: agentceo-agents
spec:
  podSelector:
    matchLabels:
      app: agent-ceo
  policyTypes:
    - Ingress
    - Egress
  egress:
    # Allow HTTPS egress to external APIs (GitHub, Slack, etc.)
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - protocol: TCP
          port: 443
    # Allow DNS resolution
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
  ingress:
    # Only allow traffic from agent.ceo control plane
    - from:
        - namespaceSelector:
            matchLabels:
              app: agent-ceo-control-plane

This policy restricts agents to outbound HTTPS and DNS, and blocks all ingress except from the agent.ceo control plane, so a misbehaving agent cannot reach other services in your cluster.
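
Note that the egress rule above still permits HTTPS to any external destination. If your security posture requires it, you can tighten the ipBlock to specific provider ranges; a sketch, where the CIDR is a placeholder (GitHub publishes its current ranges at api.github.com/meta):

# Tighter egress: HTTPS only to an allow-listed range (placeholder CIDR)
  egress:
    - to:
        - ipBlock:
            cidr: 192.0.2.0/24   # replace with your provider's published ranges
      ports:
        - protocol: TCP
          port: 443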

Step 4: Configure Persistent Storage

By default, agent workspaces use ephemeral storage (emptyDir). For agents that need to persist state between restarts, configure persistent volumes:

# Persistent volume claim for agent workspace
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: agent-workspace-pvc
  namespace: agentceo-agents
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: ssd

# Enable persistent storage for an agent
agentceo agent configure BackendDev \
  --persistent-storage true \
  --storage-class ssd \
  --storage-size 50Gi

Persistent storage is recommended for agents that:

  • Work with large repositories (monorepos)
  • Build Docker images
  • Run integration test suites
  • Cache dependency installations
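
To actually use the claim, the workspace volume in the agent deployment switches from emptyDir to the PVC. A fragment of the pod spec from Step 1, assuming the claim name above:

# Workspace volume backed by the PVC instead of emptyDir
      volumes:
        - name: agent-workspace
          persistentVolumeClaim:
            claimName: agent-workspace-pvc
        - name: agent-config
          configMap:
            name: agent-code-reviewer-config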

Step 5: Configure Auto-Scaling

For teams with variable workloads, configure horizontal pod autoscaling:

# HPA for agent workloads
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: code-reviewer-hpa
  namespace: agentceo-agents
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: agent-code-reviewer
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: External
      external:
        metric:
          name: agent_queue_depth
          selector:
            matchLabels:
              agent-role: code-reviewer
        target:
          type: AverageValue
          averageValue: "3"

# Enable auto-scaling through the CLI
agentceo agent scale CodeReviewer \
  --min-replicas 1 \
  --max-replicas 5 \
  --scale-metric queue-depth \
  --scale-threshold 3

This scales your code review agent from 1 to 5 replicas when the PR queue grows, ensuring fast response times during busy periods.
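
Queue depth can be spiky, so it is worth dampening scale-in to avoid removing agents mid-task. The standard autoscaling/v2 API supports a behavior stanza for this (values illustrative):

# Slow down scale-in so agents finish in-flight work (values illustrative)
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600   # wait 10 min of low load before removing replicas
      policies:
        - type: Pods
          value: 1                      # remove at most one replica per period
          periodSeconds: 120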

Step 6: Set Up Health Checks

Agent pods include health checks to ensure reliability:

# Health check configuration
livenessProbe:
  httpGet:
    path: /health/live
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3

readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 2

startupProbe:
  httpGet:
    path: /health/startup
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 30

If an agent becomes unresponsive, Kubernetes automatically restarts the pod. The agent resumes work from its last checkpoint, ensuring no tasks are lost.
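
On a related note, if you run multiple replicas of an agent, a standard PodDisruptionBudget keeps at least one available during voluntary disruptions such as node drains:

# Keep at least one code-reviewer running during voluntary disruptions
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: code-reviewer-pdb
  namespace: agentceo-agents
spec:
  minAvailable: 1
  selector:
    matchLabels:
      agent-id: agent-code-reviewer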

Step 7: Deploy to Your Own Cluster (Self-Hosted)

For organizations that require agents to run in their own infrastructure, agent.ceo supports self-hosted deployment:

# Install the agent.ceo operator in your cluster
kubectl apply -f https://agent.ceo/install/operator.yaml

# Configure the operator with your agent.ceo API key
kubectl create secret generic agentceo-api-key \
  --namespace agentceo-system \
  --from-literal=api-key=your-api-key-here

# Deploy agents to your cluster
agentceo cluster register \
  --name production-cluster \
  --kubeconfig ~/.kube/config \
  --context my-gke-cluster

Self-hosted deployments support GKE, EKS, AKS, and any conformant Kubernetes cluster running version 1.26 or higher.

Step 8: Monitor Deployment Health

After deploying, verify your agents are running correctly:

# Check all agent pods
kubectl get pods -n agentceo-agents -l app=agent-ceo

# View agent logs
kubectl logs -n agentceo-agents -l agent-role=code-reviewer --tail=50

# Check resource usage
kubectl top pods -n agentceo-agents

# View events for troubleshooting
kubectl get events -n agentceo-agents --sort-by='.lastTimestamp'

For ongoing monitoring, integrate with your existing observability stack. See our monitoring guide for detailed instructions.

Production Deployment Checklist

Before deploying agents to production, verify:

  • Resource limits are set appropriately for each agent role
  • Network policies restrict agent access to approved services only
  • Secrets are stored in Kubernetes secrets or an external vault
  • Health checks are configured and tested
  • Auto-scaling thresholds are set based on expected workload
  • Logging and monitoring are connected to your observability stack
  • Backup and recovery procedures are documented
  • Cost alerts are configured for unexpected scaling events

Cost Considerations

Agent pods consume cluster resources. Estimate costs based on:

  • CPU: Most agents need 0.5-2 vCPU sustained, with bursts to 4 vCPU
  • Memory: 2-16 GB depending on workload (large repos need more)
  • Storage: 10-50 GB per agent for workspace and caches
  • Network: Minimal for API calls; more for cloning large repos

Use spot/preemptible instances for non-critical agents to reduce costs by 60-80%.
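
On GKE, for example, steering a non-critical agent onto spot nodes takes a nodeSelector and a matching toleration in the pod spec; the label and taint shown are GKE's, and EKS and AKS have their own equivalents:

# Schedule an agent pod onto GKE spot nodes
    spec:
      nodeSelector:
        cloud.google.com/gke-spot: "true"
      tolerations:
        - key: cloud.google.com/gke-spot
          operator: Equal
          value: "true"
          effect: NoSchedule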

Try agent.ceo

SaaS — Get started with 1 free agent-week at agent.ceo.

Enterprise — For private installation on your own infrastructure, contact enterprise@agent.ceo.


agent.ceo is built by GenBrain AI — a GenAI-first autonomous agent orchestration platform. General inquiries: hello@agent.ceo | Security: security@agent.ceo
