Service Mesh Architecture: 4 Key Components and Design Factors

What is a service mesh?

A service mesh enables you to manage communications between individual services within a microservices architecture. It decouples network logic from the business logic of each microservice, ensuring you can implement and manage networking and communication consistently across the entire system.

A service mesh is an infrastructure layer that includes network proxies deployed alongside each service instance, collectively known as the data plane. In addition, it has a control plane that can configure and manage these proxies at large scale.

The need for a service mesh architecture

Service mesh architectures emerged as a response to many of the problems associated with microservices. Many development teams have moved away from monolithic application development to microservices architectures, which split the application from a single monolithic unit into a collection of autonomous services. The challenge is finding an efficient way for these microservices to communicate with each other.

In a microservices architecture, application performance depends on services working together quickly and efficiently to share data and provide functionality. For example, a web-based email application might consist of a login service that handles user registration and authentication, a UI service that displays the web-based interface, and a database that stores emails and contacts. These three services must communicate perfectly with each other, and any lapse in communication will result in a poor user experience.

Microservices communicate through APIs, so it is important to find a good solution for service discovery and routing. Developers also need to ensure that communication is secure. A firewall protects applications from external attacks, but a microservices architecture has a flat, open network, and if any one service is compromised, attackers could gain access to the entire system.

Before the advent of the service mesh, traffic routing was handled by load balancers. However, load balancers are complex to deploy, costly, and often ill-suited to a microservices environment. The service mesh was envisioned as a solution to all of these problems. Service mesh solutions:

Provide a centralized control plane for the network layer of a microservices application
Integrate with all services through a proxy
Are easier to configure and scale than load balancers
Enable central control over routing rules without requiring any changes to services
Implement networking and communication logic at the platform layer, rather than requiring it to be built into each individual microservice

The 4 components of a service mesh architecture

Sidecar Proxies

A service mesh architecture adds an extra hop to every call, because calls to a service must go through a proxy. To minimize additional latency, proxies run on the same machine (virtual or physical), or on the same pod (in Kubernetes clusters), as the microservice. This allows the proxy to communicate with the service quickly via localhost. This model is called a “sidecar” deployment, and therefore service mesh proxies are known as “sidecar proxies”.

Node Proxies

Proxies can also be deployed at the node level instead of alongside each service instance. In this model, a shared proxy runs on the same machine (virtual or physical) as the microservices it serves, which keeps the extra hop local without requiring a separate proxy for every instance. This “node-level proxy” deployment was introduced in 2022 as part of Istio Ambient Mesh.

The Data Plane

In a service mesh architecture, the data plane refers to a network of proxies deployed together with individual microservices. Sidecar proxies are deployed with each instance of a service that needs to communicate with other services. All service calls go through these proxies, which perform authentication, authorization, encryption, rate limiting, and load balancing, handle service discovery, and enable logging and tracing.
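The per-call work described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `SidecarProxy`, its parameters, and the status strings are hypothetical names rather than any real mesh API, the rate limit is a fixed request budget instead of a time window, and load balancing is plain round-robin.

```python
import itertools
from typing import Iterable, List


class SidecarProxy:
    """Hypothetical stand-in for one data plane proxy (not a real mesh API)."""

    def __init__(self, allowed_callers: Iterable[str], upstreams: List[str],
                 max_requests: int) -> None:
        self.allowed_callers = set(allowed_callers)
        self.max_requests = max_requests  # simplified fixed budget, not a time window
        self.forwarded = 0
        self._upstreams = itertools.cycle(upstreams)  # round-robin load balancing

    def route(self, caller: str) -> str:
        # Authorization: reject callers that the policy does not allow.
        if caller not in self.allowed_callers:
            return "403 caller not authorized"
        # Rate limiting: stop forwarding once the budget is used up.
        if self.forwarded >= self.max_requests:
            return "429 rate limit exceeded"
        self.forwarded += 1
        return f"200 forwarded to {next(self._upstreams)}"


proxy = SidecarProxy(allowed_callers=["ui"], upstreams=["db-1", "db-2"], max_requests=3)
results = [proxy.route("ui") for _ in range(4)]
denied = proxy.route("unknown-service")
```

The calling service never sees any of this logic. It only sends a request, and the proxy decides whether and where to forward it.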

The Control Plane

In an architecture with hundreds of microservices, each service scales on demand and may run a large number of instances. Altogether, the application might contain hundreds or thousands of service instances, each with its own sidecar proxy. This is where the control plane comes in.

The control plane of a service mesh architecture provides an interface where users can define policies to configure a proxy’s behavior in the data plane, and propagate this configuration to all proxies. This requires that each sidecar proxy connects to the control plane, registers itself, and receives configuration details.
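The register-and-receive flow can be sketched as follows. The classes and the policy keys (`mutual_tls`, `retries`) are hypothetical and chosen only for illustration. A real control plane distributes configuration over the network, whereas this sketch uses direct method calls.

```python
from typing import Any, Dict, List


class SidecarProxy:
    """Hypothetical proxy that stores whatever configuration it is given."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.config: Dict[str, Any] = {}

    def apply(self, config: Dict[str, Any]) -> None:
        self.config = dict(config)  # keep a private copy


class ControlPlane:
    """Hypothetical control plane: holds policy and pushes it to proxies."""

    def __init__(self) -> None:
        self._policy: Dict[str, Any] = {}
        self._proxies: List[SidecarProxy] = []

    def register(self, proxy: SidecarProxy) -> None:
        # A newly registered proxy immediately receives the current policy.
        self._proxies.append(proxy)
        proxy.apply(self._policy)

    def set_policy(self, **settings: Any) -> None:
        # A policy change is propagated to every registered proxy.
        self._policy.update(settings)
        for proxy in self._proxies:
            proxy.apply(self._policy)


control_plane = ControlPlane()
login_proxy = SidecarProxy("login")
control_plane.register(login_proxy)
control_plane.set_policy(mutual_tls=True, retries=2)

ui_proxy = SidecarProxy("ui")  # registers after the policy was defined
control_plane.register(ui_proxy)
```

Both proxies end up with the same configuration, whether they registered before or after the policy was defined.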

For an example of a popular service mesh platform, read our guide to Istio.

Service mesh architecture: design considerations

A service mesh might seem like an ideal solution for various aspects of designing and implementing microservice systems, but there are some caveats.

Processing Overhead

A service mesh routes every invocation between microservices through a proxy, often via a load balancer. The proxies also track invocations and encrypt traffic. Encryption adds little processing overhead to any single call, but the aggregate cost across all services increases resource consumption and latency.

Analysis based on scalability and performance metrics can help determine whether this overhead is significant for a given use case.
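As a rough back-of-the-envelope sketch, the added latency can be estimated from the number of service-to-service hops in a request path. The figures below are made up for illustration, and the default of two proxies per hop assumes a sidecar on both the calling and the called side. Real numbers should come from measurement.

```python
def added_latency_ms(service_hops: int, per_proxy_ms: float,
                     proxies_per_hop: int = 2) -> float:
    """Estimate the latency a mesh adds to one request path.

    Assumes every hop crosses `proxies_per_hop` proxies and that each
    proxy adds the same fixed cost (a simplification).
    """
    return service_hops * proxies_per_hop * per_proxy_ms


# Hypothetical request: UI -> login -> database -> UI, i.e. 3 hops,
# with an assumed 0.5 ms of proxy processing per traversal.
overhead = added_latency_ms(service_hops=3, per_proxy_ms=0.5)  # 3.0 ms
```

A cost that looks negligible per proxy grows linearly with call depth, which is why deep call chains deserve the closest measurement.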

Configuration Complexity

Configuring a service mesh involves complex design work. The admin must know the service mesh’s general configuration options and how to compose the right configuration for each application, so that the resulting setup matches the system’s requirements.

Validating and Testing Configurations

Once the service mesh is configured, it is important to validate the configuration, and to do so repeatedly throughout the CI/CD pipeline, because configurations change often.
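A validation step of this kind could be a small function that a pipeline runs against every configuration change. The configuration shape here (a route with weighted `destinations`) is a hypothetical example and not the schema of any particular mesh.

```python
from typing import Dict, List


def validate_route(route: Dict) -> List[str]:
    """Return a list of problems found in a hypothetical route definition."""
    errors: List[str] = []
    destinations = route.get("destinations", [])
    if not destinations:
        errors.append("route has no destinations")
        return errors
    for destination in destinations:
        if not destination.get("service"):
            errors.append("destination is missing a service name")
    total = sum(destination.get("weight", 0) for destination in destinations)
    if total != 100:
        errors.append(f"weights sum to {total}, expected 100")
    return errors


good_route = {"destinations": [{"service": "inbox-v1", "weight": 90},
                               {"service": "inbox-v2", "weight": 10}]}
bad_route = {"destinations": [{"service": "inbox-v1", "weight": 90},
                              {"service": "", "weight": 20}]}

good_errors = validate_route(good_route)
bad_errors = validate_route(bad_route)
```

A pipeline would fail the build whenever the returned list is non-empty.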

After validating service mesh configurations, organizations should test them to confirm that each configuration behaves as intended when microservices are invoked.
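A behavioral test differs from validation in that it exercises the configuration and checks the outcome. The sketch below assumes a hypothetical 90/10 canary split and a deterministic weighted picker, so the expected counts are exact. A real mesh typically picks destinations at random, in which case a test would assert on a tolerance instead.

```python
from collections import Counter
from typing import Dict, List


def pick_destination(destinations: List[Dict], request_number: int) -> str:
    """Deterministic weighted choice: map the request number to a 0-99 slot."""
    slot = request_number % 100
    upper = 0
    for destination in destinations:
        upper += destination["weight"]
        if slot < upper:
            return destination["service"]
    raise ValueError("weights must sum to 100")


# Hypothetical canary rollout: 90% of traffic to v1, 10% to v2.
canary_route = [{"service": "inbox-v1", "weight": 90},
                {"service": "inbox-v2", "weight": 10}]

counts = Counter(pick_destination(canary_route, n) for n in range(1000))
```

The test then asserts that the observed split matches the intent of the configuration, here 900 requests to `inbox-v1` and 100 to `inbox-v2`.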

Reviewing Configurations

The control plane does not always ensure the service mesh system is secure and reliable. After configuring and testing the service mesh, a verification process helps prevent issues such as insecure, undetected invocations.

Any change to a microservice, such as an addition or update, could also affect how the mesh behaves. A change might seem too minor to warrant a configuration update, even though it affects communication. Each change should therefore trigger a review of the service mesh configuration to ensure it still covers all updates.

Service meshes don’t address all security concerns affecting an enterprise; they only address the aspects relating to communication between services. Additional security measures, such as infrastructure provisioning (e.g., network controls, firewalls), require separate tools and processes.

Control Plane Changes

Service mesh systems usually change over time, with new updates to improve performance and scalability, add features and functionality, or apply patches and bug fixes. Regression tests are important during updates to the service mesh control plane – they help ensure the system updates do not introduce negative changes to a service mesh’s behavior.
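One simple form of regression test replays a recorded set of requests against the behavior before and after an update and reports every difference. The two routing functions below are hypothetical stand-ins for the old and new mesh behavior.

```python
from typing import Callable, Dict, List, Tuple

Router = Callable[[str], str]


def routing_regressions(baseline: Router, candidate: Router,
                        paths: List[str]) -> Dict[str, Tuple[str, str]]:
    """Return every path whose destination differs between two versions."""
    changed: Dict[str, Tuple[str, str]] = {}
    for path in paths:
        before, after = baseline(path), candidate(path)
        if before != after:
            changed[path] = (before, after)
    return changed


def route_before_update(path: str) -> str:
    return "login" if path.startswith("/auth") else "ui"


def route_after_update(path: str) -> str:
    if path.startswith("/auth"):
        return "login"
    if path.startswith("/mail"):
        return "inbox"
    return "ui"


recorded_paths = ["/auth/login", "/mail/1", "/home"]
regressions = routing_regressions(route_before_update, route_after_update,
                                  recorded_paths)
```

Each reported difference is either an intended change to acknowledge or an unintended regression to fix before the update is rolled out.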

Emergence of Envoy and Istio

Solo.io provides Gloo Mesh, an enterprise service mesh based on Istio and Envoy that is part of the integrated Gloo Platform. Gloo Mesh Enterprise delivers connectivity, security, observability, and reliability for Kubernetes, VMs, and microservices, spanning single-cluster to multi-cluster and hybrid environments, plus production support for Istio. According to the 2022 GigaOm Service Mesh Radar report, “Solo.io Gloo Mesh continues to be the leading Istio-based service mesh, incorporating built-in best practices for extensibility and security and simplified, centralized Istio and Envoy lifecycle management.”
