How Google Agent Substrate Works: 250 Agents on 8 Pods

Last month, Google published three engineering blog posts at Google I/O that barely made a ripple in the press. No keynote stage time, no dedicated product launch — just a few technical posts on the Google Cloud blog. That’s a shame, because what they announced may be one of the more important infrastructure developments for production AI agents this year.

The project is called Agent Substrate, and its headline capability is striking: it can multiplex approximately 250 stateful agent sessions onto just 8 physical Kubernetes pods. That’s roughly 31 sessions per pod (250 / 8 ≈ 31), an oversubscription ratio of more than 30x, with full state preservation across hibernation cycles.

Key Stat: Agent Substrate’s demo multiplexes ~250 stateful actor sessions across just 8 physical pods — a 30x+ oversubscription ratio with sub-second activation latency.

This post is a technical breakdown of how Agent Substrate works, how it fits into the broader Google agentic infrastructure stack, and — critically — what it doesn’t include, which is where Solo’s tooling becomes part of the story.

The Problem Agent Substrate Solves

To understand why Agent Substrate exists, you have to understand the specific failure mode of running AI agents on standard Kubernetes.

A traditional Kubernetes workload is a long-running service: a web server, a database proxy, a microservice. These services run continuously, justifying the overhead of keeping a dedicated pod alive. Kubernetes was designed and optimized for exactly this pattern.

AI agents are fundamentally different. An agent handling a user task might fire a flurry of tool calls for 200 milliseconds, then sit completely idle for 30 seconds waiting for a human response, a database query, or an external API. Scale this to millions of concurrent agent sessions and the economics collapse: you are paying for idle compute at a scale that would not be tolerated in any other part of your infrastructure.
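To put rough numbers on that duty cycle, here is a back-of-the-envelope calculation in Python. The 200 ms burst and 30 s wait are the illustrative figures from the paragraph above, not measurements, and the function name is made up for this sketch.

```python
def busy_fraction(active_seconds: float, idle_seconds: float) -> float:
    """Fraction of wall-clock time a session actually needs compute."""
    return active_seconds / (active_seconds + idle_seconds)


# Illustrative figures from the text: a 200 ms burst, then 30 s of waiting.
fraction = busy_fraction(0.2, 30.0)
print(f"busy: {fraction:.2%}")
print(f"idle: {1 - fraction:.2%}")
```

Under these assumptions a dedicated pod would be doing useful work well under 1% of the time, which is the gap that multiplexing aims to close.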

There are secondary problems too. When agents do need to resume, cold-starting a new pod introduces seconds of latency — noticeable to end users and ruinous for multi-step workflows. And Kubernetes’ control plane, built for managing thousands of long-running services, simply was not designed for the chatter of millions of sub-second tool calls trying to schedule and route in real time.

The Core Insight: Agent workloads are idle most of the time. Agent Substrate exploits this to achieve heavy multiplexing: a large population of agents is mapped onto a small pool of ready physical pods, with full state transferred in and out as work arrives.

The Architecture: Three Layers Working Together

Google’s May 2026 announcements describe a three-layer stack. It’s worth understanding all three before focusing on Agent Substrate specifically, because each layer solves a different problem.

Layer 1: GKE Agent Sandbox — The Isolation Layer

Agent Sandbox is the secure execution environment where agent code actually runs. It is built on gVisor, Google’s user-space kernel sandbox, and is now generally available on GKE.

Standard Kubernetes containers share the host kernel. A container escape gives an attacker root on the physical host and access to every other container on the machine. For multi-tenant agent workloads — where one user’s agent might execute untrusted code generated by a language model — this is an unacceptable risk.

gVisor addresses this by interposing a user-space kernel between the containerized application and the real Linux kernel. When your agent makes a syscall, gVisor intercepts it, re-implements it in Go (via a component called the Sentry), and only makes a much more limited set of calls to the real kernel underneath. The attack surface shrinks dramatically. Even a full Sentry exploit leaves the attacker in user-space, not kernel-space.

Agent Sandbox also ships a warm pool that can provision 300 sandboxes per second at sub-200ms latency, and Pod Snapshots to suspend idle agents and resume them in seconds. These capabilities are what enable Agent Substrate to do what it does.

Layer 2: Agent Substrate — The Scheduling Layer

Agent Substrate is the focus of this post. It sits on top of Kubernetes and manages a fundamentally different scheduling problem: mapping a large, dynamic population of mostly-idle agents onto a small, fixed pool of ready physical pods.

The two core abstractions:

Actors: the agents or applications being managed (the large, sparse population)
Workers: the Kubernetes Pods that physically execute them (the small, dense pool)

When an actor needs to run, Agent Substrate assigns it to an available worker pod, transfers its state (volatile RAM and filesystem), and routes traffic to it. When the actor goes idle, its state is snapshotted and the worker is freed for another actor. The entire activate/deactivate cycle is designed to complete in under a second.
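The activate/deactivate cycle can be pictured with a toy in-memory model. This is only a sketch of the idea, not Agent Substrate's implementation: the `Pool` class, its method names, and the use of plain dicts as "state" are all assumptions made for illustration. Real state transfer involves process snapshots, not Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    """Toy model: many actors share a small, fixed set of workers."""
    free_workers: list[str]
    # Saved state of hibernated actors, keyed by actor name.
    snapshots: dict[str, dict] = field(default_factory=dict)
    # Running actors: actor name -> (worker name, live state).
    running: dict[str, tuple[str, dict]] = field(default_factory=dict)

    def activate(self, actor: str) -> str:
        """Bind an actor to a free worker and restore its saved state."""
        if actor in self.running:
            return self.running[actor][0]
        if not self.free_workers:
            raise RuntimeError("no free worker; a real scheduler would queue or evict")
        worker = self.free_workers.pop()
        state = self.snapshots.pop(actor, {})  # restore, or start fresh
        self.running[actor] = (worker, state)
        return worker

    def deactivate(self, actor: str) -> None:
        """Snapshot the actor's state, then release its worker."""
        worker, state = self.running.pop(actor)
        self.snapshots[actor] = state
        self.free_workers.append(worker)

    def state_of(self, actor: str) -> dict:
        return self.running[actor][1]


pool = Pool(free_workers=["worker-0"])
pool.activate("actor-a")
pool.state_of("actor-a")["count"] = 1
pool.deactivate("actor-a")

worker_for_b = pool.activate("actor-b")  # the single worker is reused
pool.deactivate("actor-b")

pool.activate("actor-a")
restored = pool.state_of("actor-a")["count"]
print(worker_for_b, restored)
```

Two actors take turns on one worker here, and the first actor's counter survives its time in hibernation.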

Critically, Agent Substrate takes the Kubernetes control plane out of the critical path for this scheduling work. Kubernetes’ API server was not designed for millions of rapid-fire scheduling events per second. Agent Substrate introduces its own lightweight control plane (ateapi) that handles actor lifecycle and routing with lower latency, while still sitting on top of Kubernetes for infrastructure provisioning and pod management.

Under the Hood: The codebase uses the internal codename ‘ate’ throughout: ateapi (control plane), atelet (node-level DaemonSet), atecontroller (Kubernetes reconciler), atenet (networking), ateom-gvisor (snapshot helper), and kubectl-ate (CLI). Developers exploring the repo will encounter these names.

Layer 3: Agent Executor — The Runtime Layer

Agent Executor is the durable execution engine for long-running agent workflows. It handles the problems that arise when an agent workflow spans hours or days: what happens when the client disconnects? When a human needs to approve a step and doesn’t respond for six hours? When you need to test two different decision paths from the same checkpoint?

Agent Executor provides:

Durable execution: resume after outages or human-in-the-loop interruptions via event log and snapshotting
Session consistency: single-writer architecture prevents state corruption in distributed workflows
Connection recovery: clients reconnect and backfill missed responses from the last seen sequence
Trajectory branching: checkpoint and branch agent decision paths without losing context
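Connection recovery and trajectory branching both follow naturally from an append-only event log with sequence numbers. The sketch below shows that general pattern only. `EventLog`, `since`, and `fork` are hypothetical names for this example and are not Agent Executor's API.

```python
class EventLog:
    """Append-only log owned by a single writer; sequence numbers start at 1."""

    def __init__(self, events=None):
        self._events: list[str] = list(events or [])

    def append(self, payload: str) -> int:
        """Record an event and return its sequence number."""
        self._events.append(payload)
        return len(self._events)

    def since(self, last_seen: int) -> list[tuple[int, str]]:
        """Events a reconnecting client missed after sequence `last_seen`."""
        return [(seq, e) for seq, e in enumerate(self._events, start=1) if seq > last_seen]

    def fork(self, at_seq: int) -> "EventLog":
        """Start a new trajectory that shares history up to `at_seq`."""
        return EventLog(self._events[:at_seq])


log = EventLog()
for step in ["plan", "call_tool", "await_approval", "resume"]:
    log.append(step)

# A client that disconnected after seeing sequence 2 backfills the rest.
missed = log.since(2)

# Branch from the checkpoint at sequence 2 and try a different path.
alt = log.fork(2)
alt.append("call_other_tool")
print(missed, alt.since(0))
```

Because only one writer appends, sequence numbers give every client an unambiguous resume point, and a fork leaves the original history untouched.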

Agent Executor is designed to be harness-agnostic — it works with Google’s own Antigravity, with LangChain/LangGraph, with ADK, and with any agent using the A2A protocol. It is the glue layer that lets you mix Google-managed agents with your own custom agents in the same deployment.

How Agent Substrate Works: The Detail

Let’s go deeper on the mechanics of Agent Substrate, because the architecture is genuinely novel.

The Actor/Worker Model

Think of it like a hotel with more guests than rooms, where most guests are out sightseeing most of the time. The hotel (Agent Substrate) manages which guests are currently in their rooms (active on a worker pod), which are checked in but out (hibernated actors with saved state), and handles the logistics of moving luggage (state transfer) when a guest returns.

In the demo, roughly 250 stateful sessions are maintained with full state across just 8 physical pods. The oversubscription works because at any given moment, only a small fraction of those sessions are actively executing. The rest are hibernated, their memory and filesystem state preserved as snapshots, waiting to be reactivated on demand.
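One way to sanity-check why this ratio can hold is a simple binomial model. The sketch below assumes that sessions are independent, that each is busy with the same probability, and that a pod serves one active session at a time. None of these assumptions comes from Google's posts, and the busy fractions are illustrative.

```python
from math import comb


def prob_more_than(k: int, n: int, p: float) -> float:
    """P(X > k) for X ~ Binomial(n, p): more than k of n sessions busy at once."""
    at_most_k = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))
    return 1.0 - at_most_k


sessions, pods = 250, 8
print(f"sessions per pod: {sessions / pods:.2f}")

# Assumed per-session busy fractions, for illustration only.
for busy in (0.005, 0.01, 0.02):
    overflow = prob_more_than(pods, sessions, busy)
    print(f"busy {busy:.1%}: P(more than {pods} active) = {overflow:.2e}")
```

The model shows the trade-off directly: the chance of needing a ninth pod rises quickly as the assumed busy fraction grows, so the safe ratio depends on how idle the workload really is.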

State Preservation: The Hard Part

Hibernating and resuming an agent session is not trivial. A running agent has in-memory state (variables, conversation context, open file handles, network connections) and filesystem state (files it has created or modified during the session). Losing any of this between activations would break the agent’s continuity.

Agent Substrate solves this with full-process snapshots: it captures the complete memory image and filesystem state of a running process, transfers it to persistent storage, and can restore it on any available worker pod. The ateom-gvisor component runs inside sandboxed worker pods specifically to execute these gVisor checkpoint and restore operations.

The result is what Google calls “Instant Session Teleport”: a hibernated actor can be reactivated on any available worker pod, regardless of which physical node it was last running on, with its full state intact. Sub-second activation.

The Networking Layer: An Important Detail

Agent Substrate includes a networking component called atenet that handles DNS resolution, Envoy-based routing, and proxy sidecars for agent-to-agent traffic.

⚠ Important distinction: atenet uses Envoy for internal pod routing within the substrate. Envoy, however, was not designed for the AI-protocol layer above it: MCP tool calls and A2A agent-to-agent traffic. Solo built agentgateway from scratch in Rust because, in Solo’s assessment, existing proxies, including Envoy, could not handle AI-native traffic patterns at the required performance. The two layers are complementary: atenet handles internal substrate routing, and agentgateway handles the protocol-aware edge.

When an actor is activated on a worker pod, atenet updates routing state so that incoming requests for that actor are directed to its current worker. When the actor hibernates, routing is suspended. This routing layer is what makes the actor abstraction transparent to callers — from the outside, you address an actor by its stable identity, not by its transient pod assignment.
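The idea of addressing an actor by stable identity can be sketched as a lookup keyed on the Host header. This is an illustration only, not how atenet is implemented. The domain suffix is taken from the curl example later in this post, while `RoutingTable`, its methods, and the worker addresses are hypothetical.

```python
from typing import Optional

# Suffix taken from the Host header in this post's curl example.
ACTOR_DOMAIN = ".actors.resources.substrate.ate.dev"


def actor_from_host(host: str) -> Optional[str]:
    """Extract the stable actor name from a Host header, ignoring any port."""
    hostname = host.split(":", 1)[0].lower()
    if not hostname.endswith(ACTOR_DOMAIN):
        return None
    return hostname[: -len(ACTOR_DOMAIN)] or None


class RoutingTable:
    """Hypothetical table mapping actor name -> address of its current worker."""

    def __init__(self) -> None:
        self._worker_for: dict[str, str] = {}

    def on_activate(self, actor: str, worker_addr: str) -> None:
        self._worker_for[actor] = worker_addr

    def on_hibernate(self, actor: str) -> None:
        self._worker_for.pop(actor, None)  # routing is suspended while hibernated

    def resolve(self, host: str) -> Optional[str]:
        actor = actor_from_host(host)
        return self._worker_for.get(actor) if actor else None


table = RoutingTable()
host = "my-agent-1.actors.resources.substrate.ate.dev"

table.on_activate("my-agent-1", "10.0.0.7:8080")
first = table.resolve(host)

table.on_hibernate("my-agent-1")
suspended = table.resolve(host)

table.on_activate("my-agent-1", "10.0.1.3:8080")  # resumed on a different worker
second = table.resolve(host)
print(first, suspended, second)
```

The caller uses the same hostname throughout, while the address behind it changes each time the actor is reactivated.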

Framework Compatibility

Agent Substrate is explicitly designed to be framework-agnostic. Because it manages standard OCI containers at the kernel level via gVisor, it can host agents built on any stack. The README lists first-class support for:

ADK (Agent Development Kit): native session identity and persistent working memory
LangChain: long-running stateful agents and sandboxed tool-calling
Claude Code and Codex: high-density coding environments with persistent terminal and filesystem state across sessions
MCP servers: deploy secure, sandboxed MCP servers as Substrate Actors to provide durable tools for any LLM

That last point — MCP servers as Substrate Actors — is particularly significant and is discussed in a dedicated post in this series.

What Agent Substrate Does Not Include (And Why That Matters)

Google’s blog posts describe Agent Substrate handling compute scheduling and isolation. What they do not describe — at all — is the networking layer above the sandbox.

To run agents in production at enterprise scale, you need more than just isolated sandboxes that can be rapidly activated. You need:

Policy enforcement: which agents can call which tools, under what conditions
Traffic management: rate limiting, retries, circuit breaking for agent-to-tool calls
Observability: distributed tracing across agent-to-agent and agent-to-tool interactions
Security: mTLS between agents, authentication for MCP tool servers, audit logging
Protocol awareness: MCP and A2A traffic require a gateway that understands agent-native protocols, not just HTTP

None of this is in Agent Substrate. It is not a gap or an oversight — it is simply out of scope for a compute scheduling layer. But it means that Agent Substrate, on its own, is not a complete production infrastructure story.

Where Solo.io Fits

Solo’s products complement Agent Substrate’s compute layer by providing the network and connectivity layer that enterprise agent deployments require.

agentgateway

agentgateway is Solo’s next-generation data plane, built from scratch in Rust to handle AI-native traffic patterns that traditional Envoy-based proxies weren’t designed for. Where kgateway (Solo’s Envoy-based Kubernetes gateway) handled HTTP and gRPC, agentgateway handles HTTP and gRPC as well as MCP and A2A. These protocols are stateful and require session multiplexing, which makes them fundamentally incompatible with the stateless one-request-to-one-backend model of traditional reverse proxies. When Agent Substrate activates an actor exposing MCP tools, agentgateway provides the protocol-aware edge: tool discovery, authentication, fine-grained access control, semantic guardrails, and end-to-end observability for agent-to-tool and agent-to-agent interactions.

Istio Ambient Mesh

Istio’s ambient mesh mode — developed in large part by Solo.io engineers — provides the security and observability layer for service-to-service traffic without sidecar overhead. For agent infrastructure, this means mTLS between agent pods, distributed tracing of multi-step agent workflows, and network policy enforcement — all with the low overhead required at agent scale.

Notably, atenet (Agent Substrate’s networking component) uses Envoy for internal pod routing. In the combined stack described here, the zero-trust foundation underneath is Istio ambient mesh (via ztunnel), which provides mTLS-based workload identity for all agent-to-agent and agent-to-tool traffic transparently, without sidecars or special libraries. agentgateway then operates as the AI-protocol-aware layer above that foundation, adding MCP and A2A semantics, fine-grained authorization, and semantic observability that ztunnel alone cannot provide.

kagent

kagent is a CNCF Sandbox project for building and running AI agents natively in Kubernetes. It provides a Kubernetes-native runtime that can coexist with Agent Substrate’s scheduling layer, giving platform teams a consistent model for managing both the agent runtime and the agent infrastructure.

Getting Started with Agent Substrate

Agent Substrate is in very early development (v0.0.0, tagged May 19, 2026) and is available at github.com/agent-substrate/substrate. The APIs will change, and it is not production-ready. That said, it is fully functional for experimentation, and the demo is genuinely impressive.

The quickest path to running it locally uses kind (Kubernetes in Docker):

# Create a local cluster with a registry
hack/create-kind-cluster.sh
 
# Install Agent Substrate core system and dependencies
hack/install-ate-kind.sh --deploy-ate-system
 
# Deploy the counter demo (stateful actor, demonstrates suspend/resume)
hack/install-ate-kind.sh --deploy-demo-counter
 
# Install the kubectl plugin
go install ./cmd/kubectl-ate
 
# Create an actor and test it
kubectl ate create actor my-agent-1 --template ate-demo-counter/counter
kubectl port-forward -n ate-system svc/atenet-router 8000:80 &
curl -X POST -H "Host: my-agent-1.actors.resources.substrate.ate.dev" -i http://localhost:8000/

The counter demo creates a simple stateful HTTP server as an actor, demonstrates state preservation across suspend/resume cycles, and shows the multiplexing in action. The more sophisticated “Secret Agent” demo highlights zero-idle self-suspension: an actor that automatically hibernates when idle and reanimates on request, preserving volatile RAM state through the cycle.

For GKE deployment, the setup script automates provisioning the required GCP resources (GKE cluster with a gVisor node pool, Redis, GCS, and IAM bindings):

# Configure your environment
cp hack/ate-dev-env.sh.example .ate-dev-env.sh
# Edit .ate-dev-env.sh with your GCP project settings
source .ate-dev-env.sh
 
# Provision GCP resources
gcloud auth application-default login --project=${PROJECT_ID}
go run ./cmd/setup --all
 
# Deploy Agent Substrate
./hack/install-ate.sh --deploy-ate-system

What to Watch

Agent Substrate is explicitly positioned as the compute foundation for Agent Executor, Google’s distributed agent runtime. As Agent Executor matures (currently in preview), the tight integration between the two will become more visible. The combination of a durable execution engine on top of a hyper-dense compute scheduler is what makes running millions of concurrent agent sessions economically viable.

Google has also signaled that Agent Substrate’s scheduler will eventually incorporate data locality — ensuring that agent state and scheduling work together to reduce latency further by preferring to activate an actor on the same node where its state is already warm. This is a significant optimization for stateful, long-running agents.
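Google has not published how this locality preference will work, so the following is only a sketch of one possible rule: prefer a free worker on the node where the actor's state is already warm, and otherwise fall back to any free worker. The function name, the worker-to-node mapping, and the node names are all assumptions for illustration.

```python
from typing import Optional


def pick_worker(free_workers: dict[str, str], warm_node: Optional[str]) -> Optional[str]:
    """Choose a worker for an actor.

    free_workers maps worker name -> node name.
    warm_node is the node that last held the actor's state, if known.
    """
    if not free_workers:
        return None
    candidates = sorted(free_workers)  # deterministic order for the example
    for worker in candidates:
        if free_workers[worker] == warm_node:
            return worker  # state is already local: cheapest activation
    return candidates[0]  # no local worker free: accept a remote restore


free = {"worker-1": "node-a", "worker-2": "node-b"}
local_choice = pick_worker(free, warm_node="node-b")
fallback_choice = pick_worker(free, warm_node="node-c")
print(local_choice, fallback_choice)
```

A production scheduler would presumably weigh locality against load and queueing delay rather than apply a strict preference like this one.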

The project is actively seeking community contributions. Given how early it is, contributions to docs, examples, and integrations are especially likely to be welcomed.

The Bottom Line

Agent Substrate solves a real and important problem: how to run millions of AI agent sessions efficiently on Kubernetes without either burning money on idle compute or accepting the latency of cold-starting pods on demand. The 30x oversubscription ratio in the demo is not a parlor trick — it reflects a genuine architectural innovation that exploits the bursty, idle nature of agent workloads.

The stack Google has assembled — gVisor isolation, Agent Sandbox provisioning, Agent Substrate scheduling, and Agent Executor runtime — is a credible foundation for enterprise-scale agent infrastructure. What it does not provide is the network layer: the MCP and A2A gateway, the service mesh, the observability and policy enforcement that production deployments require.

That is exactly where Solo.io’s tooling connects. In the next post in this series, we’ll go deeper on the networking layer — specifically, why Agent Substrate’s Envoy-based internal routing (atenet) is distinct from the AI-protocol layer above it, and how agentgateway’s Rust data plane fills that gap for MCP and A2A traffic.

About Solo.io

Solo.io is reimagining infrastructure for cloud and AI, uniting secure, seamless cloud connectivity with AI-ready, agentic infrastructure. Solo’s open-source projects — including agentgateway, kagent, and contributions to Istio and Envoy — are used by Fortune 2000 enterprises running AI workloads on Kubernetes.

Resources referenced in this post:

Agent Substrate repo: github.com/agent-substrate/substrate

GKE Agent Sandbox docs: cloud.google.com/kubernetes-engine/docs/concepts/machine-learning/agent-sandbox

Agent Executor (Google/ax): github.com/google/ax

Solo agentgateway: solo.io/agentgateway

kagent (CNCF Sandbox): kagent.dev

