
Demystifying Istio Ambient Mesh for Total Beginners

In my recent talk, I aimed to clear up the core concepts of Istio in a way that beginners can grasp. In this post, we’ll explore the challenges posed by sidecar mode and how Istio Ambient Mesh ingeniously tackles them by minimizing proxies, simplifying scalability, and maintaining security. Let’s dig in.

Service Mesh As We Know It Today

In the contemporary landscape, mobile phones have become omnipresent, effectively serving as the center of our daily lives. We use them for communication, whether through messaging apps like WhatsApp and Twitter or traditional phone calls.

A curious habit some people have when making calls is inquiring, “Where are you?” However, in most instances, this isn’t the primary purpose of our call, is it? It’s merely a customary question. Our calls or messages often serve different purposes, and knowing the exact location of the other person isn’t essential.

Consider international calls: They involve dialing a universally recognized number format, ensuring secure, encrypted communication that can’t be intercepted. Our phones offer a range of statistics, from call records to app usage and consumption data. They even allow us to block unwanted callers.

Given our heavy reliance on mobile phones and their standardized, user-friendly approach, why haven’t we applied a similar philosophy to the numerous services operating within our systems? Shouldn’t we adopt a comparable strategy for our digital services?

And this, in essence, is what a service mesh is about. Rather than a mobile phone carried by a person, it’s a proxy sitting next to an application. In Kubernetes, since the proxy is deployed as a sidecar container in each application pod, this architecture is called sidecar mode.
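
To make this concrete, here is a minimal sketch of how sidecar mode is commonly switched on in Istio: labeling a namespace tells Istio to inject its proxy into every pod created there (the namespace name is just an example).

    apiVersion: v1
    kind: Namespace
    metadata:
      name: demo                   # example namespace
      labels:
        istio-injection: enabled   # Istio injects an Envoy sidecar proxy into every pod created here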

Ambient Mesh, The Concept

Now that we’ve established the idea of what we might call a mobile phone-based service mesh, let’s take it a step further.

Imagine you’re in an office, conducting a meeting with remote colleagues. You and your local team gather in a conference room.

Given that we predominantly communicate via our mobile phones, does it seem logical for every person in the room to join from their own phone? Or would it be more efficient to place a single speakerphone in the room so that everyone shares one device? Security concerns are mitigated by the conference room’s physical walls.

In essence, this is the fundamental premise of ambient mesh: a service mesh re-architecture aimed at minimizing the number of proxies, simplifying scalability, and preserving the level of security that sidecar mode offered. In Kubernetes, since this architecture does not force you to deploy a proxy for each pod, it is called sidecarless mode.
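
In Istio terms, opting a namespace into ambient mode is again just a label; here is a minimal sketch (the namespace name is an example), to contrast with the sidecar-injection label shown earlier.

    apiVersion: v1
    kind: Namespace
    metadata:
      name: demo                           # example namespace
      labels:
        istio.io/dataplane-mode: ambient   # pods here join the mesh without any sidecar being injected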

Within the conference room, the walls establish a secure perimeter, enabling open communication. In Kubernetes, that secure perimeter can be provided by two elements:

  • A cluster node, which groups the applications scheduled onto it; traffic between them never leaves the machine, so encrypting it would be pointless.
  • A ServiceAccount, which groups the many replicas of a service under a single identity; deploying a proxy for each replica is therefore inefficient (see the sketch just after this list).
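
As a sketch of the second point, consider a hypothetical service with five replicas: all of them run under one ServiceAccount and therefore share a single identity, yet sidecar mode would still attach a proxy to each replica.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: reviews                    # hypothetical service
    spec:
      replicas: 5                      # five pods...
      selector:
        matchLabels:
          app: reviews
      template:
        metadata:
          labels:
            app: reviews
        spec:
          serviceAccountName: reviews  # ...all sharing one identity; sidecar mode still deploys five proxies
          containers:
          - name: reviews
            image: example/reviews:v1  # hypothetical image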

By now, you’ve grasped the primary advantage of the sidecarless approach over sidecar mode. However, there’s more to the story. Throughout the years, as the community tested sidecar mode, certain challenges emerged, compelling us to rethink the architecture.

3 Major Challenges from Sidecar Mode

To demonstrate the benefits of sidecarless mode (ambient mesh) over sidecar mode (the service mesh as we know it today), let’s explore these challenges:

Resource Consumption

One primary challenge in sidecar mode is its resource consumption, a concern often raised by users exploring service meshes like Istio.

To grasp why this happens, it’s essential to dig into the origins of service meshes. Initially, companies like Netflix introduced libraries (Hystrix, Zuul, Ribbon, etc.) that were integrated into applications at build time.

These libraries addressed critical cross-cutting concerns, such as authentication, authorization, security, routing, service discovery, and observability. In the early days, everything operated within a single application.

However, with the advent of cloud-native approaches and Kubernetes, this paradigm shifted toward the sidecar pattern. Instead of embedding these concerns within each app, they were added as sidecar proxies, adjacent to the application in the Kubernetes environment.

As a result, each pod now carries its own proxy, leading to a noticeable increase in resource consumption.
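
To get a feel for the scale, remember that every injected sidecar reserves its own CPU and memory. Istio exposes per-pod annotations for tuning those reservations; assuming, purely for illustration, 100m of CPU and 128Mi of memory per sidecar, 500 pods would reserve roughly 50 CPU cores and about 62 GiB of memory for proxies alone. A hedged sketch of such a deployment (names, numbers, and image are hypothetical):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: catalog                                # hypothetical service
    spec:
      replicas: 500                                # many replicas means just as many sidecars
      selector:
        matchLabels:
          app: catalog
      template:
        metadata:
          labels:
            app: catalog
          annotations:
            sidecar.istio.io/proxyCPU: "100m"      # CPU requested by each injected sidecar
            sidecar.istio.io/proxyMemory: "128Mi"  # memory requested by each injected sidecar
        spec:
          containers:
          - name: catalog
            image: example/catalog:v1              # hypothetical image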

Expected Vs. Actual Outcomes

One major challenge in Istio’s sidecar mode is culture. Historically, there’s been a persistent battle between developers (devs) and operations (ops). You often hear, “It’s not my code; it’s your machine.”

DevOps tried to bridge this gap by combining the two worlds, but it required a cultural shift that many enterprises resisted. Change is slower in large companies. So, even if they claimed to be “DevOps,” they often didn’t build fully end-to-end teams, a crucial DevOps requirement.

Now, with service mesh and sidecar architecture, this battle resurfaces. Generally, devs prefer coding over learning Kubernetes, which ops typically manages. Ops crafts simple YAML files for devs to deploy their apps in Kubernetes.

When deploying to plain Kubernetes, whatever is in the YAML is exactly what gets deployed. If an app has one container, the dev sees only one.

But with sidecar mode, when a dev deploys an app, they see their app’s container plus two more: istio-proxy and istio-init. Istio silently injects its proxy and init containers.
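
Roughly speaking, the pod the dev ends up inspecting looks something like the sketch below; names and images are illustrative rather than exact.

    apiVersion: v1
    kind: Pod
    metadata:
      name: myapp                 # hypothetical pod, as the dev sees it after injection
    spec:
      initContainers:
      - name: istio-init          # injected by Istio: sets up iptables rules that redirect traffic through the proxy
        image: istio/proxyv2      # image and tag depend on the installed Istio version
      containers:
      - name: myapp               # the only container the dev actually wrote
        image: example/myapp:v1   # hypothetical image
      - name: istio-proxy         # injected by Istio: the Envoy sidecar proxy itself
        image: istio/proxyv2      # image and tag depend on the installed Istio version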

A dev doesn’t need to grasp Istio’s intricacies. However, if their app fails, they might suspect that these extra containers are somehow to blame.

For instance, a user faced an issue when an app listened on port 15006. Without Istio, it worked fine in Kubernetes, but with Istio it failed, because that port is reserved for the proxy’s inbound traffic and can’t be changed.

Change in the Proxy Implies Redeploying Apps

The third intriguing challenge involves Day-2 operations. With sidecar mode, the architecture couples the lifecycles of two components, the app and the proxy, because they share the same pod.

When a new app version is rolled out, the proxy in the same pod is also restarted. This isn’t a major issue because the change occurs in the app, not the proxy.

The challenge arises when the proxy needs an upgrade. In such cases, the operations team must ensure a graceful shutdown of the pod.

While we live in a cloud-native world, not all apps in Kubernetes follow the 12-factor rules that define a cloud-native app.

Factor number 9 in these rules relates to startup and graceful shutdown for a 12-factor app:

IX. Disposability
Maximize robustness with fast startup and graceful shutdown
The twelve-factor app’s processes are disposable, meaning they can be started or stopped at a moment’s notice. This facilitates fast elastic scaling, rapid deployment of code or config changes, and robustness of production deploys.

For apps that aren’t 12-factor compliant, a proxy rollout can potentially disrupt the business.
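
For teams in that situation, Kubernetes itself can buy some breathing room: a termination grace period plus a preStop hook gives the app time to drain in-flight requests before it is killed during a proxy rollout. A minimal, illustrative sketch (names and timings are assumptions, not recommendations):

    apiVersion: v1
    kind: Pod
    metadata:
      name: legacy-app                            # hypothetical, not-quite-12-factor app
    spec:
      terminationGracePeriodSeconds: 60           # allow up to 60s for shutdown once termination starts
      containers:
      - name: legacy-app
        image: example/legacy-app:v1              # hypothetical image
        lifecycle:
          preStop:
            exec:
              command: ["sh", "-c", "sleep 10"]   # delay SIGTERM so traffic stops arriving before shutdown begins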

Why a Sidecarless Architecture?

The primary concerns and challenges revolve around the sidecar pattern used in current service mesh implementations, which has hindered the evolution and wider adoption of service meshes.

To better understand the new architecture, let’s step back to the era of object-oriented programming and SOLID principles. Many architects were once developers, and they applied lessons learned from designing monolithic functionality to architecting microservices-based solutions. One key principle is the single responsibility principle, defined by Robert C. Martin as “A class or module should have one, and only one, reason to be changed.”

Applying this principle to our microservices-based solution, the service mesh, raises questions. Why should we upgrade a Layer 4 proxy (responsible for encryption, security, zero trust, etc.) when we’re only modifying Layer 7 functionality like authentication?

The logical conclusion is to decouple the Layer 4 proxy from the Layer 7 proxy. The Layer 4 proxy serves as the essential foundation and follows a distinct development lifecycle compared to the Layer 7 proxy.

The Layer 7 proxy is created on demand and covers a group of workloads with the same security level, typically based on the service account. Instead of having one proxy per application replica, you have a proxy that scales differently, adjusting to the specific needs of the applications.
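
In Istio’s ambient implementation, this on-demand Layer 7 proxy is called a waypoint, and at the time of writing it is typically declared as a Kubernetes Gateway resource with the istio-waypoint class. The sketch below is illustrative; exact fields and the way a waypoint is scoped (per service account or per namespace) vary between Istio versions.

    apiVersion: gateway.networking.k8s.io/v1
    kind: Gateway
    metadata:
      name: waypoint                     # a Layer 7 waypoint proxy for workloads in this namespace
      namespace: demo                    # example namespace
    spec:
      gatewayClassName: istio-waypoint   # asks Istio to provision a waypoint (Layer 7) proxy
      listeners:
      - name: mesh
        port: 15008                      # HBONE port used for mesh traffic
        protocol: HBONE

Because the waypoint is its own deployment, it can be scaled and upgraded on its own schedule, independently of the application replicas it serves.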

Solving the Challenges

Let’s now evaluate whether this architecture effectively addresses the previously mentioned challenges:

  1. Resource consumption: Even in the worst case, far fewer proxies are deployed than in sidecar mode, and as the applications scale, the number of proxies no longer scales with them.
  2. Expected outcomes: By having Istio components in a separate deployment, developers can work without confusion caused by injected Istio containers as part of their applications. This clarity in the YAML files fosters better understanding and collaboration, eliminating the dev vs. ops conflict.
  3. Proxy updates and application deployment: The separation of applications from Istio components allows for distinct development lifecycles. Consequently, changes in the proxy no longer necessitate redeploying the entire application stack.

Final Thoughts

Ambient, or sidecarless, mode not only addresses these three challenges but also brings significant innovations.

By fully decoupling the two proxies, it enabled a groundbreaking evolution of the Layer 4 proxy.

It no longer relies on Envoy and its C++ codebase. Instead, the new Layer 4 proxy, known as ztunnel, is built in Rust, which delivers superior performance. Additionally, the libraries it uses have been thoroughly tested in other products, proving their reliability and making the latency added by the mesh almost negligible.

In conclusion, Istio Ambient Mesh represents the most substantial improvement to service mesh technology in recent years and is the strongest answer in any discussion about the best service mesh.

Bigger challenges, such as fully integrating virtual machines or IoT (Internet of Things) devices into the mesh, cannot be tackled without embracing Istio Ambient Mesh.