10 things I wish I had known before using Istio

As we’ve helped a growing number of organizations and individuals adopt Istio over the years, we’ve seen a lot of recurring pain points. Using that knowledge and experience, we’ve put together a collection of 10 tips to help users like you improve your Istio experience.

In fact, we recently held a live webinar to share this knowledge. Watch the webinar on-demand here.

1: Istio vocabulary

Like any technology, you have to get used to the lexicon. When looking through the Istio documentation, terms like destination and workload might be new. Here is how they roughly map to Kubernetes concepts:

Destination – Kubernetes service
Workload – Kubernetes deployment
Workload selector – Labels on a pod
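
To make the last definition concrete, here is a minimal sketch of how a workload selector relates to pod labels, assuming the usual matchLabels behavior where every key/value pair in the selector must be present on the pod. The function name and the sample labels are hypothetical, chosen only for illustration.

```python
def selector_matches(match_labels: dict, pod_labels: dict) -> bool:
    """Return True when every key/value pair in the selector is set on the pod."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


# Hypothetical labels on a pod that belongs to a "reviews" deployment.
pod_labels = {"app": "reviews", "version": "v2"}

print(selector_matches({"app": "reviews"}, pod_labels))                   # True
print(selector_matches({"app": "reviews", "version": "v1"}, pod_labels))  # False
```

An empty selector matches every pod in this sketch, which mirrors how an Istio resource without a workload selector applies to all workloads in its namespace.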

2: Inbound and outbound traffic

Zooming in on the architecture of a service mesh with sidecar proxies helps you understand traffic flow. Clients connect to a proxy that runs next to your service. This is called inbound traffic, and it is where you can apply policies like authentication and authorization.

Traffic leaving your service goes through the proxy again before reaching its destination. This is called outbound traffic, and it is where you can implement traffic management policies like VirtualService and DestinationRule.

3: Use CRDs individually

Some tutorials tell users to use the CRD triplet:

Gateway
VirtualService
DestinationRule

However, using those together is not mandatory. Each CRD has its own particular purpose and usage. For example, you can use a Gateway for routing TCP traffic, a VirtualService on its own for in-mesh routing, or a DestinationRule to enable features like a circuit breaker, without using the others.
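
As an illustration of using one CRD on its own, the sketch below builds a standalone DestinationRule that enables circuit breaking, with no Gateway or VirtualService involved. The resource name, namespace, host, and every threshold are hypothetical values, not recommendations. The manifest is printed as JSON, which kubectl accepts.

```python
import json

# Standalone DestinationRule: connection limits plus outlier detection
# (circuit breaking). All names and numbers below are illustrative assumptions.
destination_rule = {
    "apiVersion": "networking.istio.io/v1beta1",
    "kind": "DestinationRule",
    "metadata": {"name": "reviews-circuit-breaker", "namespace": "default"},
    "spec": {
        "host": "reviews.default.svc.cluster.local",
        "trafficPolicy": {
            "connectionPool": {
                "tcp": {"maxConnections": 100},
                "http": {"http1MaxPendingRequests": 10},
            },
            "outlierDetection": {
                "consecutive5xxErrors": 5,   # eject a host after 5 consecutive 5xx errors
                "interval": "30s",           # how often hosts are evaluated
                "baseEjectionTime": "60s",   # minimum ejection duration
                "maxEjectionPercent": 50,    # never eject more than half the hosts
            },
        },
    },
}

print(json.dumps(destination_rule, indent=2))
```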

4: Control the scope

Controlling the scope becomes a concern as your cluster and your mesh grow. With hundreds or thousands of services running in the mesh, in one cluster or multiple clusters, memory usage increases for both the sidecar proxies and the Istio control plane.

This can be tuned on both the control plane and the data plane. On the control plane, use discovery selectors to tell Istio to stop watching all the services in the cluster and focus on a particular namespace or set of namespaces. From there, Istio builds its own service registry in an optimized way, listing only the services that are relevant to it.
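
A minimal sketch of discovery selectors follows, assuming you opt namespaces in with a label. The label key and value (istio-discovery: enabled) are hypothetical. The resulting structure belongs under meshConfig, for example in the Helm values of the control plane.

```python
import json


def discovery_values(label_key: str, label_value: str) -> dict:
    """Build a values fragment that limits discovery to namespaces carrying one label."""
    return {
        "meshConfig": {
            "discoverySelectors": [
                {"matchLabels": {label_key: label_value}},
            ]
        }
    }


# Hypothetical label: only namespaces labeled istio-discovery=enabled are watched.
values = discovery_values("istio-discovery", "enabled")
print(json.dumps(values, indent=2))
```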

On the data plane side, you can use the “exportTo” option and the Sidecar resource to control configuration sharing across namespaces. That is a way for service owners to make a VirtualService that routes to their service visible only to a particular set of namespaces, instead of exposing that Istio resource to the whole mesh.

For instance, as a service owner using the “exportTo” option, you can decide whether your Istio resource will be available to other namespaces. It is all about configuring the scope of your policies. Keep in mind that “exportTo” is not a way of enforcing service-to-service authorization. It only controls the visibility of Istio configuration.
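
For example, a VirtualService can be kept private to its own namespace by setting exportTo to ".". The sketch below assumes a hypothetical service and namespace both named reviews.

```python
import json

# VirtualService visible only inside its own namespace.
# "." means "the namespace this resource lives in"; "*" would mean all namespaces.
virtual_service = {
    "apiVersion": "networking.istio.io/v1beta1",
    "kind": "VirtualService",
    "metadata": {"name": "reviews-routes", "namespace": "reviews"},  # hypothetical names
    "spec": {
        "hosts": ["reviews.reviews.svc.cluster.local"],
        "exportTo": ["."],
        "http": [
            {"route": [{"destination": {"host": "reviews.reviews.svc.cluster.local"}}]}
        ],
    },
}

print(json.dumps(virtual_service, indent=2))
```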

The Sidecar CRD is helpful if you want to trim down the configuration being pushed from the control plane to each Envoy instance. By default, every sidecar proxy is aware of all services in the mesh, which increases the memory consumed by each sidecar. A good practice is to configure the proxies only with the services they are supposed to connect to, and a Sidecar custom resource does exactly that.
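
A minimal sketch of such a Sidecar resource, assuming the workloads in a hypothetical reviews namespace only need to reach services in their own namespace and in istio-system:

```python
import json

# Namespace-wide Sidecar (no workloadSelector): proxies in "reviews" only
# receive configuration for the listed hosts. Namespace names are assumptions.
sidecar = {
    "apiVersion": "networking.istio.io/v1beta1",
    "kind": "Sidecar",
    "metadata": {"name": "default", "namespace": "reviews"},
    "spec": {
        "egress": [
            {
                "hosts": [
                    "./*",             # every service in the same namespace
                    "istio-system/*",  # every service in istio-system
                ]
            }
        ]
    },
}

print(json.dumps(sidecar, indent=2))
```

Any destination not listed under egress hosts is left out of the configuration pushed to those proxies, so the list has to cover every service the workloads actually call.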

5: Zero trust is next door

In the security space, zero trust is a big topic, but it is also difficult to accomplish. The idea is that nothing is accepted unless a policy explicitly opts in, which is hard to do across a network without blanket rules and a lot of configuration. Istio makes it easier because users have control over the entire data plane and everything runs through Envoy.
Istio:

Provides cryptographic identity to all workloads with PeerAuthentication
Controls the traffic from the Edge and does dynamic routing with RequestAuthentication
Can apply some fine-grained traffic policies in the mesh with AuthorizationPolicy

Within a service mesh, you can apply mesh-wide policies easily and start out with a deny-all stance. This, along with mTLS, gets you on the path to a secure, repeatable, defense-in-depth environment.
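
A deny-all starting point with strict mTLS can be sketched with two resources. This assumes istio-system is the mesh root namespace, which is the default but may differ in your installation. The resource names are hypothetical.

```python
import json

ROOT_NAMESPACE = "istio-system"  # assumption: default root namespace

# Mesh-wide strict mTLS: workloads only accept mutually authenticated traffic.
peer_authentication = {
    "apiVersion": "security.istio.io/v1beta1",
    "kind": "PeerAuthentication",
    "metadata": {"name": "default", "namespace": ROOT_NAMESPACE},
    "spec": {"mtls": {"mode": "STRICT"}},
}

# An AuthorizationPolicy with an empty spec has no rules, so it allows nothing.
# Placed in the root namespace, it applies to the whole mesh.
deny_all = {
    "apiVersion": "security.istio.io/v1beta1",
    "kind": "AuthorizationPolicy",
    "metadata": {"name": "allow-nothing", "namespace": ROOT_NAMESPACE},
    "spec": {},
}

for manifest in (peer_authentication, deny_all):
    print(json.dumps(manifest, indent=2))
```

From there, each service owner adds narrow ALLOW policies for the traffic that is actually expected.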

Download our whitepaper to find out how to achieve compliance and zero trust with Istio Ambient Mesh.

6: Deploy Istio right

The documentation includes a lot of references to istioctl and Helm, and both are valid options to deploy Istio. Since last year, the community has recommended using GitOps with Helm. The current best practice is to use declarative infrastructure, or infrastructure-as-code, and Helm is a great way to do just that.

Even if you are installing Istio with Helm, do not discount istioctl. It is a powerful debugging tool on its own, not just an installer.

7: Revisions are key

Using revisions is very important because it allows you to canary Istio upgrades in production. Upgrades are a frequent event, since each Istio release has a support lifecycle of only 6 to 7 months, and revisions help you keep pace safely with no downtime. You deploy a second Istio control plane under its own revision, and it only manages the namespaces and workloads labeled with that same revision. If you do not do this right out of the gate, backtracking later is more challenging.

More information can be found here.
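
As a sketch of the namespace side of a revision-based upgrade, the helper below builds a JSON merge patch that points a namespace at one revision through the istio.io/rev label and removes the istio-injection label, which would otherwise take precedence. The revision name 1-20-0 and the helper name are hypothetical.

```python
import json


def revision_label_patch(revision: str) -> dict:
    """Build a merge patch that moves a namespace to the given control-plane revision."""
    return {
        "metadata": {
            "labels": {
                "istio.io/rev": revision,
                # null in a JSON merge patch deletes the key
                "istio-injection": None,
            }
        }
    }


patch = revision_label_patch("1-20-0")  # hypothetical revision name
print(json.dumps(patch))
# {"metadata": {"labels": {"istio.io/rev": "1-20-0", "istio-injection": null}}}
```

The printed patch could be applied with kubectl patch namespace and --type merge. Existing pods keep their old sidecar until they are restarted.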

8: Compare apples to apples

The istioctl CLI is a really powerful tool for debugging. Is your mTLS broken? Or do you have certificate issues? Three helpful commands are:

istioctl proxy-status will provide an overview of all of the sidecars in your mesh, while istioctl proxy-config shows the Envoy configuration of a single one
istioctl proxy-config rootca-compare will compare the root CA of two different workloads
istioctl x describe will show all the policies applied to a particular sidecar

There are also commands to expose the Envoy administration UI, change the log level, and more. A complete command argument list can be found here.
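
The sketch below only assembles a few of these command lines so they can be reviewed or scripted. It does not run them. The pod names and namespace are hypothetical, and the mesh-wide overview uses istioctl proxy-status.

```python
import shlex


def debug_commands(pod_a: str, pod_b: str, namespace: str) -> list:
    """Return istioctl command lines (as argv lists) for a first debugging pass."""
    a = f"{pod_a}.{namespace}"  # istioctl accepts the <pod>.<namespace> form
    b = f"{pod_b}.{namespace}"
    return [
        ["istioctl", "proxy-status"],                          # sync state of every proxy
        ["istioctl", "proxy-config", "rootca-compare", a, b],  # do both pods share a root CA?
        ["istioctl", "x", "describe", "pod", a],               # configuration affecting one pod
    ]


# Hypothetical pods in the "default" namespace.
for argv in debug_commands("reviews-v1-abc", "ratings-v1-def", "default"):
    print(shlex.join(argv))
```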

9: External services as first-class citizens

What is usually called an external service can be either a service outside of the mesh or a service outside of your cluster, like a database, a web service, an AWS Lambda function, or any kind of function as a service (FaaS). Istio policies can also be applied to traffic going to VMs or web services.

We often see people moving workloads from on-prem to the cloud using traffic policies. It’s a nice way to modernize IT. Users can still enforce encrypted connections between a service or database on a VM and your mesh. That means gaining visibility of connections going in and out of the cluster and the mesh, including traces, logs, and metrics – for free.
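
One common way to make an external service known to the mesh is a ServiceEntry. The sketch below registers a hypothetical PostgreSQL database reachable at orders-db.example.com. The hostname, port, and resource name are assumptions for illustration.

```python
import json

# ServiceEntry that adds an external database to the mesh's service registry,
# so routing rules and telemetry can be applied to traffic going to it.
service_entry = {
    "apiVersion": "networking.istio.io/v1beta1",
    "kind": "ServiceEntry",
    "metadata": {"name": "external-orders-db", "namespace": "default"},
    "spec": {
        "hosts": ["orders-db.example.com"],  # hypothetical external hostname
        "location": "MESH_EXTERNAL",         # the service lives outside the mesh
        "ports": [{"number": 5432, "name": "tcp-postgres", "protocol": "TCP"}],
        "resolution": "DNS",                 # resolve the host through DNS
    },
}

print(json.dumps(service_entry, indent=2))
```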

10: Not all namespaces are born the same

Namespaces are a nexus point for controlling a lot of Istio. Besides the common usage of labels on namespaces to control injection (remember to use revisions!), namespaces can also be used to control which services can access each other. It is important to understand where the namespace boundaries lie, especially when moving towards a zero trust architecture.
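
As one example of a namespace boundary, the sketch below builds an AuthorizationPolicy that only allows requests coming from workloads in the same namespace. The payments namespace and the helper name are hypothetical, and matching on the source namespace assumes mTLS is enabled so that the caller's identity is known.

```python
import json


def same_namespace_only(namespace: str) -> dict:
    """Build a policy that allows traffic only from callers in the same namespace."""
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "AuthorizationPolicy",
        "metadata": {"name": "allow-same-namespace", "namespace": namespace},
        "spec": {
            "action": "ALLOW",
            # Source namespace is derived from the peer certificate (requires mTLS).
            "rules": [{"from": [{"source": {"namespaces": [namespace]}}]}],
        },
    }


policy = same_namespace_only("payments")  # hypothetical namespace
print(json.dumps(policy, indent=2))
```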

11 (bonus tip!): Ambient Mesh is here

When using Istio with sidecars, you are essentially putting an Envoy proxy next to every single pod you are running. That comes with a cost in CPU and RAM. Solo.io and Google, along with the great Istio community, have created Istio Ambient Mesh to address that concern.

Ambient Mesh removes the need for sidecars everywhere with shared L4 proxies per node, and L7 proxies per service account; it also makes upgrade cycles easier and can improve performance. See more about Ambient Mesh here.

Read our series about traffic in Ambient Mesh.

Learn more

We hope these tips help you better navigate your Istio experience.

To participate in our community and learn more about Istio, join our Slack channel.

To find out how Istio drives business value, download our ebook today.
