How Google Agent Substrate Works: 250 Agents, 8 Pods

Google quietly released a new open-source infrastructure layer for running AI agents at massive scale on Kubernetes. Here’s the technical breakdown — and where Solo fits in.

Last month, Google published three engineering blog posts at Google I/O that barely made a ripple in the press. No keynote stage time, no dedicated product launch — just a few technical posts on the Google Cloud blog. That’s a shame, because what they announced may be one of the more important infrastructure developments for production AI agents this year.

The project is called Agent Substrate, and its headline capability is striking: it can multiplex approximately 250 stateful agent sessions onto just 8 physical Kubernetes pods. That is roughly 31 sessions per pod (250 / 8 ≈ 31), an oversubscription ratio of more than 30x, with full state preservation across hibernation cycles.

Key Stat:  Agent Substrate’s demo multiplexes ~250 stateful actor sessions across just 8 physical pods — a 30x+ oversubscription ratio with sub-second activation latency.

This post is a technical breakdown of how Agent Substrate works, how it fits into the broader Google agentic infrastructure stack, and — critically — what it doesn’t include, which is where Solo’s tooling becomes part of the story.

The Problem Agent Substrate Solves

To understand why Agent Substrate exists, you have to understand the specific failure mode of running AI agents on standard Kubernetes.

A traditional Kubernetes workload is a long-running service: a web server, a database proxy, a microservice. These services run continuously, justifying the overhead of keeping a dedicated pod alive. Kubernetes was designed and optimized for exactly this pattern.

AI agents are fundamentally different. An agent handling a user task might fire a flurry of tool calls for 200 milliseconds, then sit completely idle for 30 seconds waiting for a human response, a database query, or an external API. Scale this to millions of concurrent agent sessions and the economics collapse: you are paying for idle compute at a scale that would not be tolerated in any other part of your infrastructure.
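To put rough numbers on that idle time, here is a back-of-the-envelope sketch in Python. The 200 ms burst and 30 s wait come from the example above; treating every cycle as identical is a simplifying assumption, not a measurement of any real workload.

```python
# Back-of-the-envelope duty cycle for the agent described above.
# Assumption: every cycle is one 200 ms burst followed by a 30 s wait.
ACTIVE_SECONDS = 0.2
IDLE_SECONDS = 30.0


def duty_cycle(active: float, idle: float) -> float:
    """Fraction of wall-clock time the agent is actually computing."""
    return active / (active + idle)


busy = duty_cycle(ACTIVE_SECONDS, IDLE_SECONDS)
print(f"busy: {busy:.2%}, idle: {1 - busy:.2%}")
```

Under these assumptions, a pod dedicated to a single agent would do useful work for well under 1% of the time it is billed.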

There are secondary problems too. When agents do need to resume, cold-starting a new pod introduces seconds of latency, which is noticeable to end users and ruinous for multi-step workflows. And the Kubernetes control plane, built for managing thousands of long-running services, was not designed to schedule and route millions of sub-second tool calls in real time.

The Core Insight:  Agent workloads are idle most of the time. Agent Substrate exploits this to achieve heavy multiplexing: a large population of agents is mapped onto a small pool of ready physical pods, with full state transferred in and out as work arrives.

The Architecture: Three Layers Working Together

Google’s May 2026 announcements describe a three-layer stack. It’s worth understanding all three before focusing on Agent Substrate specifically, because each layer solves a different problem.

Layer 1: GKE Agent Sandbox — The Isolation Layer

Agent Sandbox is the secure execution environment where agent code actually runs. It is built on gVisor, Google’s user-space kernel sandbox, and is now generally available on GKE.

Standard Kubernetes containers share the host kernel. A container escape gives an attacker root on the physical host and access to every other container on the machine. For multi-tenant agent workloads — where one user’s agent might execute untrusted code generated by a language model — this is an unacceptable risk.

gVisor addresses this by interposing a user-space kernel between the containerized application and the real Linux kernel. When your agent makes a syscall, gVisor intercepts it and handles it in the Sentry, a component written in Go that implements the Linux syscall interface in user space and makes only a much more limited set of calls to the real kernel underneath. The attack surface shrinks dramatically. Even a full Sentry exploit leaves the attacker in user space, not kernel space.

Agent Sandbox also ships a warm pool that can provision 300 sandboxes per second at sub-200ms latency, and Pod Snapshots to suspend idle agents and resume them in seconds. These capabilities are what enable Agent Substrate to do what it does.

Layer 2: Agent Substrate — The Scheduling Layer

Agent Substrate is the focus of this post. It sits on top of Kubernetes and manages a fundamentally different scheduling problem: mapping a large, dynamic population of mostly-idle agents onto a small, fixed pool of ready physical pods.

The two core abstractions:

  • Actors:  the agents or applications being managed (the large, sparse population)
  • Workers:  the Kubernetes Pods that physically execute them (the small, dense pool)

When an actor needs to run, Agent Substrate assigns it to an available worker pod, transfers its state (volatile RAM and filesystem), and routes traffic to it. When the actor goes idle, its state is snapshotted and the worker is freed for another actor. The entire activate/deactivate cycle is designed to complete in under a second.
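The activate/deactivate cycle can be sketched as a toy model in Python. This is an illustration of the idea only: the class name, method names, and dictionary-based "state" are all hypothetical and are not taken from the Agent Substrate codebase, where state is a full process snapshot rather than a Python dict.

```python
from dataclasses import dataclass, field


@dataclass
class SubstrateSketch:
    """Toy model of the actor/worker mapping; not the real ateapi."""
    workers: list[str]
    assignments: dict[str, str] = field(default_factory=dict)  # actor -> worker
    snapshots: dict[str, dict] = field(default_factory=dict)   # actor -> saved state
    live_state: dict[str, dict] = field(default_factory=dict)  # actor -> state on a worker

    def free_workers(self) -> list[str]:
        busy = set(self.assignments.values())
        return [w for w in self.workers if w not in busy]

    def activate(self, actor: str) -> str:
        """Place an actor on a free worker and restore its saved state."""
        if actor in self.assignments:
            return self.assignments[actor]
        free = self.free_workers()
        if not free:
            raise RuntimeError("no free worker; caller must wait or retry")
        worker = free[0]
        self.live_state[actor] = self.snapshots.pop(actor, {})
        self.assignments[actor] = worker
        return worker

    def deactivate(self, actor: str) -> None:
        """Snapshot the actor's state and release its worker."""
        self.snapshots[actor] = self.live_state.pop(actor)
        del self.assignments[actor]


sub = SubstrateSketch(workers=["pod-0", "pod-1"])
sub.activate("agent-a")
sub.live_state["agent-a"]["count"] = 1
sub.deactivate("agent-a")           # state saved, pod-0 released
sub.activate("agent-b")             # agent-b takes pod-0
worker = sub.activate("agent-a")    # agent-a resumes on a different pod
```

In this sketch the actor comes back on a different worker with its state intact, which is the property the real system has to deliver with memory and filesystem snapshots.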

Critically, Agent Substrate takes the Kubernetes control plane out of the critical path for this scheduling work. Kubernetes’ API server was not designed for millions of rapid-fire scheduling events per second. Agent Substrate introduces its own lightweight control plane (ateapi) that handles actor lifecycle and routing with lower latency, while still sitting on top of Kubernetes for infrastructure provisioning and pod management.

Under the Hood:  The codebase uses the internal codename ‘ate’ throughout: ateapi (control plane), atelet (node-level DaemonSet), atecontroller (Kubernetes reconciler), atenet (networking), ateom-gvisor (snapshot helper), and kubectl-ate (CLI). Developers exploring the repo will encounter these names.

Layer 3: Agent Executor — The Runtime Layer

Agent Executor is the durable execution engine for long-running agent workflows. It handles the problems that arise when an agent workflow spans hours or days: what happens when the client disconnects? When a human needs to approve a step and doesn’t respond for six hours? When you need to test two different decision paths from the same checkpoint?

Agent Executor provides:

  • Durable execution:  resume after outages or human-in-the-loop interruptions via event log and snapshotting
  • Session consistency:  single-writer architecture prevents state corruption in distributed workflows
  • Connection recovery:  clients reconnect and backfill missed responses from the last seen sequence
  • Trajectory branching:  checkpoint and branch agent decision paths without losing context
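The connection-recovery item above depends on an ordered event log. A minimal sketch of backfilling from a last-seen sequence number might look like the following; the `EventLog` class and its methods are hypothetical illustrations, not Agent Executor's API.

```python
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """Append-only log with monotonically increasing sequence numbers."""
    events: list[tuple[int, str]] = field(default_factory=list)

    def append(self, payload: str) -> int:
        seq = len(self.events) + 1
        self.events.append((seq, payload))
        return seq

    def backfill(self, last_seen: int) -> list[tuple[int, str]]:
        """Return every event recorded after `last_seen`."""
        return [event for event in self.events if event[0] > last_seen]


log = EventLog()
for step in ["plan", "call_tool", "await_approval", "resume"]:
    log.append(step)

# Hypothetical client saw events up to sequence 2, then lost its connection.
missed = log.backfill(last_seen=2)
```

Because the log is the single source of truth, a reconnecting client only needs to remember one integer to catch up.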

Agent Executor is designed to be harness-agnostic — it works with Google’s own Antigravity, with LangChain/LangGraph, with ADK, and with any agent using the A2A protocol. It is the glue layer that lets you mix Google-managed agents with your own custom agents in the same deployment.

How Agent Substrate Works: The Detail

Let’s go deeper on the mechanics of Agent Substrate, because the architecture is genuinely novel.

The Actor/Worker Model

Think of it like a hotel with more guests than rooms, where most guests are out sightseeing most of the time. The hotel (Agent Substrate) manages which guests are currently in their rooms (active on a worker pod), which are checked in but out (hibernated actors with saved state), and handles the logistics of moving luggage (state transfer) when a guest returns.

In the demo, 250 stateful sessions are maintained with full state across just 8 physical pods. The oversubscription works because at any given moment, only a small fraction of those 250 sessions are actively executing. The rest are hibernated, their memory and filesystem state preserved as snapshots, waiting to be reactivated on demand.
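A quick probability sketch shows why this can work. It borrows the 200 ms busy / 30 s idle pattern from the example earlier in this post as an assumed duty cycle and treats actors as independent; neither assumption comes from the demo itself, whose real activity pattern is not published.

```python
from math import comb

ACTORS = 250            # sessions in the demo
WORKERS = 8             # physical pods in the demo
P_ACTIVE = 0.2 / 30.2   # assumed per-actor duty cycle (200 ms busy, 30 s idle)


def prob_more_than(k: int, n: int, p: float) -> float:
    """P(X > k) for X ~ Binomial(n, p): chance that demand exceeds k workers."""
    at_most_k = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))
    return 1.0 - at_most_k


ratio = ACTORS / WORKERS
expected_active = ACTORS * P_ACTIVE
overflow = prob_more_than(WORKERS, ACTORS, P_ACTIVE)
print(f"ratio {ratio:.2f}:1, expected active {expected_active:.2f}, "
      f"P(active > {WORKERS}) = {overflow:.1e}")
```

With these inputs the expected number of simultaneously active actors is below two, so eight workers leave considerable headroom. Burstier or correlated workloads would need a larger pool.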

State Preservation: The Hard Part

Hibernating and resuming an agent session is not trivial. A running agent has in-memory state (variables, conversation context, open file handles, network connections) and filesystem state (files it has created or modified during the session). Losing any of this between activations would break the agent’s continuity.

Agent Substrate solves this with full-process snapshots: it captures the complete memory image and filesystem state of a running process, transfers it to persistent storage, and can restore it on any available worker pod. The ateom-gvisor component runs inside sandboxed worker pods specifically to execute these gVisor checkpoint and restore operations.

The result is what Google calls “Instant Session Teleport”: a hibernated actor can be reactivated on any available worker pod, regardless of which physical node it was last running on, with its full state intact. Sub-second activation.

The Networking Layer: An Important Detail

Agent Substrate includes a networking component called atenet that handles DNS resolution, Envoy-based routing, and proxy sidecars for agent-to-agent traffic.

⚠ Important distinction:  atenet uses Envoy for internal pod routing within the substrate. Envoy was not designed, however, for the AI-protocol layer above it: MCP tool calls and A2A agent-to-agent traffic. Solo’s agentgateway was built from scratch in Rust specifically because existing proxies, including Envoy, could not handle AI-native traffic patterns at the required performance. The two layers are complementary: atenet handles internal substrate routing, and agentgateway handles the protocol-aware edge.

When an actor is activated on a worker pod, atenet updates routing state so that incoming requests for that actor are directed to its current worker. When the actor hibernates, routing is suspended. This routing layer is what makes the actor abstraction transparent to callers — from the outside, you address an actor by its stable identity, not by its transient pod assignment.
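The stable-identity idea can be sketched as a small lookup table. The domain suffix below is the one used in the Host header of the curl example later in this post; the `RouteTable` class, its method names, and the worker addresses are hypothetical and do not reflect atenet's actual data model.

```python
from typing import Optional

# Suffix taken from the demo's Host header.
ACTOR_DOMAIN = ".actors.resources.substrate.ate.dev"


def actor_from_host(host: str) -> Optional[str]:
    """Extract the stable actor name from a Host header, or None if it doesn't match."""
    host = host.split(":", 1)[0].lower()  # drop an optional port
    if not host.endswith(ACTOR_DOMAIN):
        return None
    name = host[: -len(ACTOR_DOMAIN)]
    return name or None


class RouteTable:
    """Hypothetical actor -> worker address map."""

    def __init__(self) -> None:
        self._routes: dict[str, str] = {}

    def on_activate(self, actor: str, worker_addr: str) -> None:
        self._routes[actor] = worker_addr

    def on_hibernate(self, actor: str) -> None:
        self._routes.pop(actor, None)

    def resolve(self, host: str) -> Optional[str]:
        # None means "no live worker"; what a real router does next is out of scope here.
        actor = actor_from_host(host)
        return self._routes.get(actor) if actor else None


routes = RouteTable()
host = "my-agent-1.actors.resources.substrate.ate.dev"
routes.on_activate("my-agent-1", "10.0.0.5:8080")
first = routes.resolve(host)
routes.on_hibernate("my-agent-1")
routes.on_activate("my-agent-1", "10.0.0.9:8080")
second = routes.resolve(host)
```

The caller uses the same hostname both times; only the table entry behind it changes.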

Framework Compatibility

Agent Substrate is explicitly designed to be framework-agnostic. Because it manages standard OCI containers at the kernel level via gVisor, it can host agents built on any stack. The README lists first-class support for:

  • ADK (Agent Development Kit):  native session identity and persistent working memory
  • LangChain:  long-running stateful agents and sandboxed tool-calling
  • Claude Code and Codex:  high-density coding environments with persistent terminal and filesystem state across sessions
  • MCP servers:  deploy secure, sandboxed MCP servers as Substrate Actors to provide durable tools for any LLM

That last point — MCP servers as Substrate Actors — is particularly significant and is discussed in a dedicated post in this series.

What Agent Substrate Does Not Include (And Why That Matters)

Google’s blog posts describe Agent Substrate handling compute scheduling and isolation. What they do not describe — at all — is the networking layer above the sandbox.

To run agents in production at enterprise scale, you need more than just isolated sandboxes that can be rapidly activated. You need:

  • Policy enforcement: which agents can call which tools, under what conditions
  • Traffic management: rate limiting, retries, circuit breaking for agent-to-tool calls
  • Observability: distributed tracing across agent-to-agent and agent-to-tool interactions
  • Security: mTLS between agents, authentication for MCP tool servers, audit logging
  • Protocol awareness: MCP and A2A traffic require a gateway that understands agent-native protocols, not just HTTP

None of this is in Agent Substrate. It is not a gap or an oversight — it is simply out of scope for a compute scheduling layer. But it means that Agent Substrate, on its own, is not a complete production infrastructure story.
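To make the first requirement concrete, here is a minimal sketch of an agent-to-tool allowlist with an audit trail. The agent and tool names are invented for illustration, and the `ToolPolicy` class is hypothetical: it is not the configuration model or API of agentgateway or any other product.

```python
from dataclasses import dataclass, field


@dataclass
class ToolPolicy:
    """Hypothetical allowlist: which agent may call which tool."""
    allowed: dict[str, set[str]]
    audit_log: list[tuple[str, str, bool]] = field(default_factory=list)

    def authorize(self, agent: str, tool: str) -> bool:
        decision = tool in self.allowed.get(agent, set())
        self.audit_log.append((agent, tool, decision))  # record every decision
        return decision


policy = ToolPolicy(allowed={"billing-agent": {"lookup_invoice"}})
ok = policy.authorize("billing-agent", "lookup_invoice")
denied = policy.authorize("billing-agent", "delete_customer")
```

A check like this has to sit in the request path between agent and tool, which is why it belongs to the network layer and not to the compute scheduler.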

Where Solo.io Fits

Solo’s products complement Agent Substrate’s compute layer by providing the network and connectivity layer that enterprise agent deployments require.

agentgateway

agentgateway is Solo’s next-generation data plane, built from scratch in Rust to handle AI-native traffic patterns that traditional Envoy-based proxies weren’t designed for. Where kgateway (Solo’s Envoy-based Kubernetes gateway) handled HTTP and gRPC, agentgateway handles HTTP and gRPC as well as MCP and A2A. These protocols are stateful and require session multiplexing, which makes them fundamentally incompatible with the stateless one-request-to-one-backend model of traditional reverse proxies. When Agent Substrate activates an actor exposing MCP tools, agentgateway provides the protocol-aware edge: tool discovery, authentication, fine-grained access control, semantic guardrails, and end-to-end observability for agent-to-tool and agent-to-agent interactions.

Istio Ambient Mesh

Istio’s ambient mesh mode — developed in large part by Solo.io engineers — provides the security and observability layer for service-to-service traffic without sidecar overhead. For agent infrastructure, this means mTLS between agent pods, distributed tracing of multi-step agent workflows, and network policy enforcement — all with the low overhead required at agent scale.

Notably, atenet (Agent Substrate’s networking component) uses Envoy only for internal pod routing. In a deployment that adds Solo’s stack, the zero-trust foundation underneath is Istio ambient mesh (via ztunnel), which provides mTLS-based workload identity for all agent-to-agent and agent-to-tool traffic transparently, without sidecars or special libraries. agentgateway then operates as the AI-protocol-aware layer above that foundation, adding MCP and A2A semantics, fine-grained authorization, and semantic observability that ztunnel alone cannot provide.

kagent

kagent is a CNCF Sandbox project for building and running AI agents natively in Kubernetes. It provides a Kubernetes-native runtime that can coexist with Agent Substrate’s scheduling layer, giving platform teams a consistent model for managing both the agent runtime and the agent infrastructure.

Getting Started with Agent Substrate

Agent Substrate is in very early development (v0.0.0, tagged May 19, 2026). The APIs will change, and it is not production-ready. That said, it is fully functional for experimentation, and the demo in the repo (github.com/agent-substrate/substrate) is genuinely impressive.

The quickest path to running it locally uses kind (Kubernetes in Docker):

# Create a local cluster with a registry
hack/create-kind-cluster.sh
 
# Install Agent Substrate core system and dependencies
hack/install-ate-kind.sh --deploy-ate-system
 
# Deploy the counter demo (stateful actor, demonstrates suspend/resume)
hack/install-ate-kind.sh --deploy-demo-counter
 
# Install the kubectl plugin
go install ./cmd/kubectl-ate
 
# Create an actor and test it
kubectl ate create actor my-agent-1 --template ate-demo-counter/counter
kubectl port-forward -n ate-system svc/atenet-router 8000:80 &
curl -X POST -H "Host: my-agent-1.actors.resources.substrate.ate.dev" -i http://localhost:8000/

The counter demo creates a simple stateful HTTP server as an actor, demonstrates state preservation across suspend/resume cycles, and shows the multiplexing in action. The more sophisticated “Secret Agent” demo highlights zero-idle self-suspension: an actor that automatically hibernates when idle and reanimates on request, preserving volatile RAM state through the cycle.
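To exercise the counter actor from a script instead of curl, a small Python client can send approximately the same request. The URL and Host header mirror the commands above; the function names are hypothetical, and `call_actor` only works while the kind cluster and the port-forward are running.

```python
import urllib.request

ROUTER_URL = "http://localhost:8000/"  # the port-forward target from the commands above
ACTOR_DOMAIN = "actors.resources.substrate.ate.dev"


def build_actor_request(actor: str) -> urllib.request.Request:
    """Build a POST addressed to the actor via the Host header, as the curl command does."""
    return urllib.request.Request(
        ROUTER_URL,
        method="POST",
        headers={"Host": f"{actor}.{ACTOR_DOMAIN}"},
    )


def call_actor(actor: str, timeout: float = 5.0) -> tuple[int, bytes]:
    """Send the request and return (status, body). Needs the local demo to be up."""
    with urllib.request.urlopen(build_actor_request(actor), timeout=timeout) as resp:
        return resp.status, resp.read()


req = build_actor_request("my-agent-1")
```

Calling `call_actor("my-agent-1")` several times, with pauses long enough for the actor to suspend in between, is one way to watch state survive a suspend/resume cycle.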

For GKE deployment, the setup script automates provisioning the required GCP resources (GKE cluster with a gVisor node pool, Redis, GCS, and IAM bindings):

# Configure your environment
cp hack/ate-dev-env.sh.example .ate-dev-env.sh
# Edit .ate-dev-env.sh with your GCP project settings
source .ate-dev-env.sh
 
# Provision GCP resources
gcloud auth application-default login --project="${PROJECT_ID}"
go run ./cmd/setup --all
 
# Deploy Agent Substrate
./hack/install-ate.sh --deploy-ate-system

What to Watch

Agent Substrate is explicitly positioned as the compute foundation for Agent Executor, Google’s distributed agent runtime. As Agent Executor matures (currently in preview), the tight integration between the two will become more visible. The combination of a durable execution engine on top of a hyper-dense compute scheduler is what is intended to make running millions of concurrent agent sessions economically viable.

Google has also signaled that Agent Substrate’s scheduler will eventually incorporate data locality — ensuring that agent state and scheduling work together to reduce latency further by preferring to activate an actor on the same node where its state is already warm. This is a significant optimization for stateful, long-running agents.
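Google has not published how locality-aware placement will work, so the following is only a guess at the general shape: prefer a free worker on the node that already holds the actor's state, and fall back to any free worker otherwise. Every name here is hypothetical.

```python
from typing import Optional


def pick_worker(free_workers: dict[str, str], warm_node: Optional[str]) -> Optional[str]:
    """Choose a free worker, preferring one on the node that holds the actor's state.

    free_workers maps worker name -> node name (a hypothetical shape).
    """
    if not free_workers:
        return None
    for worker, node in free_workers.items():
        if node == warm_node:
            return worker  # local restore: state is already on this node
    return next(iter(free_workers))  # otherwise fall back to any free worker


free = {"pod-0": "node-a", "pod-1": "node-b"}
local = pick_worker(free, warm_node="node-b")
fallback = pick_worker(free, warm_node="node-z")
```

A real scheduler would also have to weigh load balance against locality, which this sketch ignores.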

The project is actively seeking community contributions. Given how early it is, contributions to docs, examples, and integrations stand a good chance of being accepted.

The Bottom Line

Agent Substrate solves a real and important problem: how to run millions of AI agent sessions efficiently on Kubernetes without either burning money on idle compute or accepting the latency of cold-starting pods on demand. The 30x oversubscription ratio in the demo is not a parlor trick — it reflects a genuine architectural innovation that exploits the bursty, idle nature of agent workloads.

The stack Google has assembled — gVisor isolation, Agent Sandbox provisioning, Agent Substrate scheduling, and Agent Executor runtime — is a credible foundation for enterprise-scale agent infrastructure. What it does not provide is the network layer: the MCP and A2A gateway, the service mesh, the observability and policy enforcement that production deployments require.

That is exactly where Solo.io’s tooling connects. In the next post in this series, we’ll go deeper on the networking layer — specifically, why Agent Substrate’s Envoy-based internal routing (atenet) is distinct from the AI-protocol layer above it, and how agentgateway’s Rust data plane fills that gap for MCP and A2A traffic.

About Solo.io

Solo.io is reimagining infrastructure for cloud and AI, uniting secure, seamless cloud connectivity with AI-ready, agentic infrastructure. Solo’s open-source projects — including agentgateway, kagent, and contributions to Istio and Envoy — are used by Fortune 2000 enterprises running AI workloads on Kubernetes.

Resources referenced in this post:

Agent Substrate repo: github.com/agent-substrate/substrate

GKE Agent Sandbox docs: cloud.google.com/kubernetes-engine/docs/concepts/machine-learning/agent-sandbox

Agent Executor (Google/ax): github.com/google/ax

Solo agentgateway: solo.io/agentgateway

kagent (CNCF Sandbox): kagent.dev