Compliance and zero trust with Istio ambient mesh

Security breaches involving financial and customer data call for a deeper conversation about how we think about trust. Historically, organizations established trust by securing an organizational boundary, with the intention of keeping bad actors out.

But as organizations modernize their technology and look for ways to move faster and experiment, they’re adopting new cloud services. They rely on cloud networks that they don’t own and can’t control, which creates further security risks.

On top of that, add federal regulations and laws that organizations dealing with sensitive information need to abide by, as well as audits that ensure that these standards are upheld. Industry compliance means organizations must:

  • Maintain a secure network
  • Restrict sensitive data
  • Track vulnerabilities and patch or upgrade affected components
  • Implement strong access control to sensitive data
  • Monitor, track, and dynamically alter policy 

This convergence of factors, from the evolution of applications to compliance, is why our industry is now rethinking what trust should look like. That’s where the idea of zero trust comes into the picture. 

I recently presented a webinar for The Linux Foundation on this topic. Watch the full webinar on “Solving for Compliance and Zero Trust with Istio Ambient Mesh.”

Understanding zero trust

Zero trust restricts – down to the smallest possible scope – where trust is granted. Zero trust means:

  • Assume a hostile environment: There are malicious people inside and outside the environment.
  • Presume breach: Operate and defend resources with the assumption that an adversary has presence in your environment.
  • Never trust, always verify: Deny by default; every resource is explicitly authorized using least privilege, multiple attributes, and dynamic cybersecurity principles.
  • Scrutinize explicitly: Access to resources is conditional and access can dynamically change based on action and confidence levels resulting from those actions.
  • Apply unified analytics: Analyze data, applications, assets, and services together, including behavioristics, and log each transaction.

We want traffic as it comes in to be authenticated (to know who it is). We want it to be authorized (to know what they’re allowed access to). We want to do that dynamically – and we want to do that in such a way that holds up.

What that boils down to is that:

  • All communication to resources is secured, regardless of location on the network
  • Access to resources is granted per session
  • Access to resources is determined dynamically 
  • All access is authenticated and authorized
  • Access is tracked, logged, audited, and can be dynamically revoked
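
As a rough illustration of the properties in this list, here is a minimal Python sketch of per-session access that is authorized explicitly, logged, and revocable. The `SessionBroker` class, the identities, and the resource names are all hypothetical and exist only for this example; a real mesh would rely on workload certificates and policy resources rather than an in-memory table.

```python
from dataclasses import dataclass, field
import secrets


@dataclass
class SessionBroker:
    """Hypothetical broker that grants per-session access and keeps an audit trail."""
    # (identity, resource) pairs that are explicitly allowed; everything else is denied.
    allowed: set = field(default_factory=set)
    sessions: dict = field(default_factory=dict)   # token -> (identity, resource)
    audit_log: list = field(default_factory=list)  # (event, identity, resource)

    def open_session(self, identity, resource):
        """Authorize one identity for one resource; return a session token or None."""
        if (identity, resource) not in self.allowed:
            self.audit_log.append(("denied", identity, resource))
            return None
        token = secrets.token_hex(16)
        self.sessions[token] = (identity, resource)
        self.audit_log.append(("granted", identity, resource))
        return token

    def check(self, token, resource):
        """Verify on every request that the session is still valid for this resource."""
        entry = self.sessions.get(token)
        ok = entry is not None and entry[1] == resource
        identity = entry[0] if entry else "unknown"
        self.audit_log.append(("access" if ok else "rejected", identity, resource))
        return ok

    def revoke(self, token):
        """Revoke a session; later checks with this token fail."""
        entry = self.sessions.pop(token, None)
        if entry:
            self.audit_log.append(("revoked", entry[0], entry[1]))


broker = SessionBroker(allowed={("checkout", "payments-db")})
token = broker.open_session("checkout", "payments-db")
print(broker.check(token, "payments-db"))   # True
broker.revoke(token)
print(broker.check(token, "payments-db"))   # False
```

Every decision, including denials and revocations, ends up in `audit_log`, which is the part an auditor would care about.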

Zero trust networking is usually implemented through coordination between a policy enforcement point, which handles traffic on behalf of a resource, and a policy engine. An administrator drives any needed policy changes through this engine. The policy enforcement points sit in line with the requests, and decisions are made in line and dynamically, based on the policies that have been set.
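
The relationship between the two components can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how any particular product implements it: `PolicyEngine`, `EnforcementPoint`, and the service names are hypothetical.

```python
class PolicyEngine:
    """Hypothetical policy engine: holds rules an administrator can change at runtime."""

    def __init__(self):
        self._rules = {}  # (source, destination) -> "allow" or "deny"

    def set_rule(self, source, destination, effect):
        self._rules[(source, destination)] = effect

    def decide(self, source, destination):
        # Deny by default: anything without an explicit "allow" rule is rejected.
        return self._rules.get((source, destination), "deny") == "allow"


class EnforcementPoint:
    """Hypothetical policy enforcement point in the request path of one resource."""

    def __init__(self, resource, engine):
        self.resource = resource
        self.engine = engine

    def handle(self, source, request):
        # The decision is made in line for every request, so policy changes apply at once.
        if not self.engine.decide(source, self.resource):
            return (403, "denied")
        return (200, f"{self.resource} handled {request}")


engine = PolicyEngine()
pep = EnforcementPoint("orders", engine)

print(pep.handle("web", "GET /orders"))      # (403, 'denied')
engine.set_rule("web", "orders", "allow")    # administrator updates policy
print(pep.handle("web", "GET /orders"))      # (200, 'orders handled GET /orders')
```

Because the enforcement point asks the engine on every request, the administrator’s change takes effect without restarting or redeploying anything.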

This is an area where service mesh – a technology that’s specifically built for these types of dynamic environments – can help.

Understanding service mesh

The idea behind service mesh is to solve some of the challenges that come up when applications communicate with each other, and security is a big part of that. When services communicate with each other, for example through APIs, the networking challenges we often see are service discovery, load balancing, and resilience (handling cases where services are not available).

We need a way to solve these problems consistently. With service mesh, networking concerns like connectivity, security, and reliability are handled by an agent that lives alongside the application instance. This agent is deployed with the application instance and handles these concerns on its behalf, and it can become an enforcement point for networking and security policies.

These agents run with the applications, acting as proxies and as policy enforcement points, and they are all remotely controlled and configured by a component we call the control plane. The control plane connects to these agents and dynamically gives them the policies they’re supposed to enforce. The agents can also reach back out to elements of the control plane to determine more dynamically what a policy should be.
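
To make the push model concrete, here is a small Python sketch of a control plane distributing versioned policy to connected agents. The `ControlPlane` and `Agent` classes and the policy strings are hypothetical; Istio’s actual configuration protocol is considerably richer than this.

```python
class Agent:
    """Hypothetical data plane agent that enforces whatever policy it was last given."""

    def __init__(self, name):
        self.name = name
        self.version = 0
        self.policies = {}

    def apply(self, version, policies):
        # Ignore stale pushes so an agent never rolls back to older policy.
        if version >= self.version:
            self.version = version
            self.policies = dict(policies)


class ControlPlane:
    """Hypothetical control plane that keeps every connected agent up to date."""

    def __init__(self):
        self.version = 0
        self.policies = {}
        self.agents = []

    def connect(self, agent):
        # A newly connected agent immediately receives the current policy set.
        self.agents.append(agent)
        agent.apply(self.version, self.policies)

    def set_policy(self, name, rule):
        # Each change bumps the version and is pushed to all connected agents.
        self.version += 1
        self.policies[name] = rule
        for agent in self.agents:
            agent.apply(self.version, self.policies)


cp = ControlPlane()
orders_agent = Agent("orders")
payments_agent = Agent("payments")

cp.connect(orders_agent)
cp.set_policy("orders-ingress", "allow web")
cp.connect(payments_agent)   # joins later, still gets the current policy

print(orders_agent.version, payments_agent.version)   # 1 1
print(payments_agent.policies)                        # {'orders-ingress': 'allow web'}
```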

A policy, for example, is something like “should this request continue or should it not continue?” That determination is based on who’s calling the service, on behalf of what user, and other attributes like location, time, and claims that the user might have. 
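
A decision like this can be pictured as a function over request attributes. The sketch below is only an illustration: the `RequestContext` fields, the SPIFFE-style identity string, the `role` claim, the regions, and the business-hours window are all made-up values, not a real policy.

```python
from dataclasses import dataclass, replace
from datetime import time


@dataclass
class RequestContext:
    caller: str          # workload identity of the calling service
    user_claims: dict    # claims carried in the end user's token
    location: str        # where the request originated
    at: time             # time of day of the request


def should_continue(ctx: RequestContext) -> bool:
    """Return True only if every attribute check passes (hypothetical rule set)."""
    checks = [
        ctx.caller == "spiffe://example.org/ns/web/sa/frontend",
        ctx.user_claims.get("role") == "teller",
        ctx.location in {"us-east", "us-west"},
        time(8, 0) <= ctx.at <= time(18, 0),
    ]
    return all(checks)


ctx = RequestContext(
    caller="spiffe://example.org/ns/web/sa/frontend",
    user_claims={"sub": "alice", "role": "teller"},
    location="us-east",
    at=time(9, 30),
)
print(should_continue(ctx))                          # True
print(should_continue(replace(ctx, at=time(23, 0)))) # False
```

The same caller and user are allowed at 9:30 and rejected at 23:00, which is the sense in which access is conditional rather than granted once.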

Thus, the service mesh has a lot of the elements and architecture for implementing zero trust networking principles.

Improving on the service mesh model

There are many opportunities that service mesh presents. It has the right components and the right pieces in the right places to implement a zero trust networking architecture. However, we can still make improvements. This blog post outlines some of the challenges we were seeing 1.5 years ago. 

We announced last year that we’d be contributing our work to the Istio open source community. 

Today, our approach to running Istio is in a sidecarless mode that allows users to take advantage of the capabilities of the service mesh, including some of the properties around zero trust that we want, and do that in a way that simplifies operations like upgrades and onboarding applications into the mesh. Some of the side benefits include reducing cost and improving performance of the mesh. 

In the Istio ambient data plane, we take out the proxies that get co-located with the application instances – but keep them in the request path. We use lower level networking control to force traffic through the secure transport overlay layer. This layer is made up of agents that work closely with the CNI to implement the zero trust networking behaviors that we want from the service mesh. 

A service mesh needs to do more than establish or refuse connections. It also needs to understand what’s in the request, like the tokens, headers, or claims, which means having a layer seven understanding of what’s happening in the mesh. We’ve separated that out. If a request path doesn’t need any layer seven introspection, the traffic can stay in the secure overlay layer, which is quite a bit faster than going through any layer seven proxy (the sidecar approach included), while still providing the properties of zero trust.
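
The split can be sketched as a path-planning decision: only policies that inspect the request itself add a layer seven hop. This is a conceptual model with hypothetical names (`Policy`, `plan_path`, and the hop labels), not Istio’s actual routing logic; in Istio’s documentation the secure overlay role is filled by ztunnel and the layer seven role by waypoint proxies.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical policy attached to a destination service."""
    allowed_identities: set = field(default_factory=set)   # layer 4: who may connect
    required_headers: dict = field(default_factory=dict)   # layer 7: what a request must carry

    @property
    def needs_l7(self):
        # Only rules that look inside the request require layer seven processing.
        return bool(self.required_headers)


def plan_path(policy):
    """Return the hops a request traverses under this policy."""
    hops = ["secure-overlay(source)"]
    if policy.needs_l7:
        hops.append("l7-proxy")
    hops.append("secure-overlay(destination)")
    return hops


identity_only = Policy(allowed_identities={"web"})
with_headers = Policy(allowed_identities={"web"}, required_headers={"x-tenant": "acme"})

print(plan_path(identity_only))
# ['secure-overlay(source)', 'secure-overlay(destination)']
print(plan_path(with_headers))
# ['secure-overlay(source)', 'l7-proxy', 'secure-overlay(destination)']
```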

The benefits of Istio ambient mesh

Overall, Istio ambient mesh brings a lot of optimization and benefits to the Istio project:

  • No more race conditions between workload containers and sidecar/init-container, etc. 
  • No need to inject Pods/alter deployment resources
  • Upgrades/patching are out of band/transparent from the application
  • Limited risk profile for opting into mesh features
  • Reduced blast radius of application vulnerabilities
  • Cost savings with reduced data plane components
  • Maintain isolated tenancy, customization, and configuration
  • Maintain the foundations of zero trust network security
  • Improved performance 

How Solo.io can help

Solo’s Gloo Platform is based on Istio, and provides all its capabilities for an initial and robust implementation of zero trust architecture:

  • Identity verification – Gloo Mesh uses SPIFFE for workload attestation and identity, and OpenID Connect-based JSON Web Token (JWT) verification for non-machine identity.
  • Psychological acceptability – A cybersecurity principle that refers to ensuring ease of use to avoid non-compliance among approved users. Gloo Platform, through its management plane, provides a well-defined and user-friendly mechanism to configure, deploy, and operate a service mesh.
  • Microsegmentation – Gloo Platform’s workspace feature provides advanced microsegmentation capability to configure granular access control for each segment. In a Kubernetes setup, workspaces span multiple clusters to provide manageable access control across microsegments. You can read more about Gloo Mesh workspaces here.
  • Combined gateway and mesh – Gloo Platform’s management plane provides lifecycle management for both Gloo Gateway and Gloo Mesh. This enables use of both API gateway and service mesh for implementing a zero trust architecture, which can extend beyond Kubernetes clusters into legacy virtual machine-based applications, as well as serverless capabilities like Lambda.
  • Centralized policy decision point – The Gloo management plane can manage multiple Istio control planes to create a multi-cluster mesh with identity and certificate distribution and rotation across the entire mesh. It can also be integrated with zero trust components like public key infrastructure, threat intelligence systems, etc. to provide a central policy engine with distributed policy enforcement.

All security concepts mentioned here also apply to ambient mesh, with the added advantage of a further reduced blast radius.

Download my latest white paper to learn how your company can use Istio ambient mesh to more easily and transparently enable zero trust.