Istio Ambient Mesh in Azure Kubernetes Service: A primer

Istio Ambient Mesh – a sidecar-less data plane for Istio – represents true innovation in the years-old service mesh industry as it addresses serious concerns about complexity, manageability, and day-two operations compared to sidecar-based deployments. Workload onboarding, data plane upgrades and CVE patches now become much easier. 

In addition, for large clusters with thousands of Pods, the resources requested by the sidecar containers are an expensive service mesh tax, as the memory usage of the Envoy sidecars grows linearly with the size of the service mesh. Istio Ambient Mesh alleviates these concerns as well.

We at Solo.io spearheaded its development, and recently this new sidecar-less pattern was integrated into the main branch of Istio. Introducing a new proxy architecture with a split for L4/L7 traffic, it’s quickly becoming interesting wherever a non-invasive approach to application networking is needed and where an incremental adoption of service mesh is the preferred path to production. 

We wanted to validate that Ambient Mesh, in its current state, can already be deployed and used in managed Kubernetes services: not just in simple examples on local development clusters, but in setups that more closely approximate real-world scenarios and quasi-production deployments. We chose Azure Kubernetes Service (AKS) first because of its strength and popularity among enterprises and power users, and because AKS offers multiple network plugin options.

Currently, there are five main options available for AKS networking:

  • Kubenet, the simpler and older option, which is not a CNI implementation and relies on NAT
  • The Azure CNI with IP assignment for pods from an existing VNet
  • Azure CNI Overlay with a Pod CIDR different from the VNet hosting the nodes
  • Azure CNI with Cilium and IP assignment from an overlay network
  • Bring-your-own CNI mode, where you can choose which CNI to deploy

 

AKS network plugin     Istio w/ sidecars   Istio Ambient Mesh   Ambient w/ eBPF
---------------------  ------------------  -------------------  ----------------
Kubenet                YES                 NO (1)               NO (1)
Azure CNI              YES                 YES                  NO
Azure CNI Overlay      YES                 YES                  NO
Azure CNI w/ Cilium    YES                 NO                   NO
BYOCNI w/ Cilium       YES                 NO (2)               NO (2)

 

  1. Kubenet is not a CNI plugin, so AKS clusters that use it cannot run Ambient Mesh at all
  2. Cilium support is currently tracked in this Istio issue on GitHub

As you can see from the table, the only viable option at this moment is to use Azure CNI (with or without Overlay) and without Cilium. As Ambient Mesh matures and starts supporting Cilium and other eBPF-based CNIs, we will update this blog with new information on deploying Ambient Mesh with eBPF-accelerated routing tables. Solo.io is committed to working with the Istio upstream community to continue to drive the evolution of Istio Ambient, including integration with Cilium and eBPF, and to provide Azure Kubernetes Service users with the best possible service-mesh experience.
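As a quick pre-flight check, the compatibility table can be encoded in a small shell function. This is only a sketch of the table as it stands at the time of writing: the function name ambient_supported and its two arguments are hypothetical, and the data plane name has to be supplied by hand because how it is reported depends on your az CLI version.

```shell
# Prints "yes", "no", or "unknown" for a given AKS network setup,
# mirroring the compatibility table in this post (time of writing).
#   $1: network plugin as reported by AKS (kubenet | azure | none);
#       "none" corresponds to bring-your-own CNI
#   $2: optional data plane / CNI name, for example "cilium"
ambient_supported() {
  plugin="$1"
  dataplane="${2:-}"
  # Cilium is not supported yet, however it was installed.
  if [ "$dataplane" = "cilium" ]; then
    echo "no"
    return 0
  fi
  case "$plugin" in
    kubenet) echo "no" ;;       # kubenet is not a CNI
    azure)   echo "yes" ;;      # Azure CNI and Azure CNI Overlay
    *)       echo "unknown" ;;  # combination not covered by the table
  esac
}

# Example (needs a logged-in az CLI and an existing cluster):
#   plugin=$(az aks show --resource-group ambient --name ambient \
#     --query networkProfile.networkPlugin --output tsv)
#   ambient_supported "$plugin"
```

Returning "unknown" for anything outside the table is deliberate: bring-your-own CNI with a CNI other than Cilium was not tested here.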

Ambient Mesh in Azure Kubernetes Service with Azure CNI

If you want to try Ambient Mesh in Azure Kubernetes Service, you’ll need:

  • An Azure account and the azure-cli command line tool (installation instructions here)
  • Access to GitHub and the istio/istio repository
  • Docker Desktop to run the istioctl image
  • kubectl, jq, and Stern, which are used in the commands below

First, let’s create an AKS cluster with the Azure CNI network plugin (at the time of writing, 1.25.6 is the latest supported version):

$> az group create --location eastus --name ambient

$> az aks create \
     --location eastus \
     --name ambient \
     --resource-group ambient \
     --network-plugin azure \
     --kubernetes-version 1.25.6 \
     --node-vm-size Standard_DS3_v2 \
     --node-count 2

$> az aks get-credentials --resource-group ambient --name ambient

We suggest a minimum size of Standard_DS3_v2 to run Istio because the Kubernetes nodes should have at least 4 CPU cores.
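If you reuse an existing cluster instead of creating a new one, a short helper can flag nodes that fall below this suggestion. This is a minimal sketch: the function name undersized_nodes is hypothetical, and it assumes the CPU capacity is reported as a plain integer number of cores.

```shell
# Reads "<node-name> <cpu-count>" lines on stdin and prints every node
# with fewer CPU cores than the minimum ($1, default 4).
# Exit status is 1 if any node is too small, 0 otherwise.
undersized_nodes() {
  awk -v min="${1:-4}" '
    NF >= 2 && $2 + 0 < min {
      print $1 " has only " $2 " CPU cores"
      bad = 1
    }
    END { exit (bad ? 1 : 0) }
  '
}

# Example (needs kubectl access to the cluster):
#   kubectl get nodes --no-headers \
#     -o custom-columns=NAME:.metadata.name,CPU:.status.capacity.cpu \
#     | undersized_nodes 4
```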

Next, we add the Gateway API CRDs to the AKS cluster; Istio uses them for the waypoint proxies.

$> kubectl get crd gateways.gateway.networking.k8s.io &> /dev/null || \
     { kubectl kustomize "github.com/kubernetes-sigs/gateway-api/config/crd?ref=v0.6.1" | kubectl apply -f -; }

Now that ambient mode is included in the main branch, the nightly build containers published at gcr.io/istio-testing/ contain the ambient mode functionality. We can use the istioctl container to install Istio in our cluster. The command below deploys the Istio ingress gateway; please refer to this guide if you wish to deploy the new Gateway API-based ingress instead:

$> docker run -ti --rm -v ~/.kube/config:/config \
     gcr.io/istio-testing/istioctl -c /config install \
     --set profile=ambient --set meshConfig.accessLogFile=/dev/stdout \
     --set "components.ingressGateways[0].enabled=true" \
     --set "components.ingressGateways[0].name=istio-ingressgateway" -y

Confirm that all pods in the istio-system namespace are up and running:

$> kubectl get pod -n istio-system
NAME                                   READY   STATUS    RESTARTS   AGE
istio-cni-node-67gsz                   1/1     Running   0          3h44m
istio-cni-node-sm7cn                   1/1     Running   0          3h44m
istio-ingressgateway-d9fb9779f-22rjm   1/1     Running   0          3h40m
istiod-c8fc4d865-mrncs                 1/1     Running   0          3h45m
ztunnel-7nxms                          1/1     Running   0          3h45m
ztunnel-kk264                          1/1     Running   0          3h45m

Note the istio-cni and ztunnel daemonsets: the first will take care of modifying the iptables rules on each node to redirect mesh traffic to the ztunnel and the latter is the L4 proxy that will tunnel connections to and from pods that are part of the mesh. 

Istio CNI works in parallel with Azure CNI, and the two do not interfere with each other. The ztunnel source code lives outside the istio repository and has its own issue tracker, which is important to know when looking for known issues with Istio Ambient.
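If you want to script the readiness check on the istio-system pods rather than eyeball the listing, something along these lines could work. It is a sketch only: the function name pods_not_ready is hypothetical, and it assumes the default column layout of kubectl get pod (NAME, READY, STATUS, ...).

```shell
# Reads `kubectl get pod` output on stdin and prints every pod that is
# not fully ready (READY x/y with x != y) or not in the Running state.
# Exit status is 1 if any such pod is found, 0 otherwise.
pods_not_ready() {
  awk '
    $1 == "NAME" || NF < 3 { next }   # skip the header and blank lines
    {
      split($2, ready, "/")
      if (ready[1] != ready[2] || $3 != "Running") {
        print $1 " " $2 " " $3
        bad = 1
      }
    }
    END { exit (bad ? 1 : 0) }
  '
}

# Example (needs kubectl access to the cluster):
#   kubectl get pod -n istio-system | pods_not_ready && echo "all pods ready"
```

For a blocking variant, kubectl wait --for=condition=Ready pod --all -n istio-system --timeout=300s achieves a similar result without any parsing.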

The traffic interception in Ambient Mesh works by leveraging GENEVE tunnels and iptables interception, as explained in detail in this blog post by Peter Jausovec; it’s completely transparent to the user and only works on tagged traffic, allowing flexible interoperability of non-mesh and meshed applications.

Let’s deploy a sample application

Deploy the bookinfo demo app and tag the namespace to be part of the Ambient Mesh:

$> git clone https://github.com/istio/istio
$> cd istio
$> kubectl create namespace bookinfo
$> kubectl apply \
     -n bookinfo \
     -f samples/bookinfo/platform/kube/bookinfo.yaml
$> kubectl apply \
     -n bookinfo \
     -f samples/bookinfo/networking/bookinfo-gateway.yaml
$> kubectl label namespace bookinfo istio.io/dataplane-mode=ambient

To check if the traffic is encrypted we are going to:

  1. Find the IP address of the istio-ingressgateway, which is exposed through an Azure Load Balancer by a Kubernetes Service of type LoadBalancer in the istio-system namespace.
  2. Use curl to generate some traffic.
  3. Use Stern to look at the logs of the ztunnel pods.

$> export INGRESSIP=$(kubectl get service -n istio-system istio-ingressgateway -o json | jq -r ".status.loadBalancer.ingress[].ip")
$> curl http://$INGRESSIP/productpage
$> stern -n istio-system ztunnel
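One caveat for step 1: the load balancer can take a little while to receive its public address, and until then INGRESSIP ends up empty. A small guard, sketched below, avoids sending curl to an empty host. The function name is_ipv4 is hypothetical, and the check is purely syntactic (it only tests that the value looks like a dotted-quad address).

```shell
# Succeeds only if $1 looks like a dotted-quad IPv4 address.
is_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Example (needs kubectl access to the cluster): poll until the
# address has been assigned, then continue with curl.
#   until is_ipv4 "$INGRESSIP"; do
#     sleep 5
#     INGRESSIP=$(kubectl get service -n istio-system istio-ingressgateway \
#       -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
#   done
```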

Notice the log lines confirming that the traffic flows through the ztunnel and into the application pods:

ztunnel-kk264 istio-proxy 2023-04-05T09:00:20.183524Z  INFO ztunnel::proxy::inbound_passthrough: accepted connection source=10.224.0.38:49968 destination=10.224.0.15:9080 component="inbound plaintext"

ztunnel-kk264 istio-proxy 2023-04-05T09:00:20.190896Z  INFO outbound{id=bc720d1302fcdad7e72dbdce5bbbbd84}: ztunnel::proxy::outbound: proxy to 10.224.0.39:9080 using HBONE via 10.224.0.39:15008 type Direct

ztunnel-7nxms istio-proxy 2023-04-05T09:00:20.195660Z  INFO inbound{id=bc720d1302fcdad7e72dbdce5bbbbd84 peer_ip=10.224.0.15 peer_id=spiffe://cluster.local/ns/bookinfo/sa/bookinfo-productpage}: ztunnel::proxy::inbound: got CONNECT request to 10.224.0.39:9080

ztunnel-kk264 istio-proxy 2023-04-05T09:00:20.200360Z  INFO outbound{id=bc720d1302fcdad7e72dbdce5bbbbd84}: ztunnel::proxy::outbound: complete dur=9.598033ms

ztunnel-kk264 istio-proxy 2023-04-05T09:00:20.205214Z  INFO outbound{id=5633e52f6b933abc58a2ce10087e320c}: ztunnel::proxy::outbound: proxy to 10.224.0.54:9080 using HBONE via 10.224.0.54:15008 type Direct

ztunnel-7nxms istio-proxy 2023-04-05T09:00:20.210804Z  INFO inbound{id=5633e52f6b933abc58a2ce10087e320c peer_ip=10.224.0.15 peer_id=spiffe://cluster.local/ns/bookinfo/sa/bookinfo-productpage}: ztunnel::proxy::inbound: got CONNECT request to 10.224.0.54:9080

ztunnel-7nxms istio-proxy 2023-04-05T09:00:20.223560Z  INFO outbound{id=f00410da0cf1b399d1edb4c1397e8a70}: ztunnel::proxy::outbound: proxying to 10.224.0.56:9080 using node local fast path

ztunnel-7nxms istio-proxy 2023-04-05T09:00:20.225234Z  INFO outbound{id=f00410da0cf1b399d1edb4c1397e8a70}: ztunnel::proxy::outbound: complete dur=1.76561ms

ztunnel-kk264 istio-proxy 2023-04-05T09:00:20.234281Z  INFO ztunnel::proxy::inbound_passthrough: connection complete source=10.224.0.38:49968 destination=10.224.0.15:9080 component="inbound plaintext"

ztunnel-kk264 istio-proxy 2023-04-05T09:00:20.234306Z  INFO outbound{id=5633e52f6b933abc58a2ce10087e320c}: ztunnel::proxy::outbound: complete dur=29.182997ms

Note the traffic flowing outbound from one ztunnel and inbound into the second, and the correct use of SPIFFE identities (which will come in handy in the next section).
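To get a quick overview instead of reading every line, the ztunnel logs can be summarized by how each connection was handled. The sketch below is not part of Istio: the function name summarize_ztunnel_logs is hypothetical, and the matched strings are taken from the sample output above, so they may differ in other ztunnel builds.

```shell
# Reads ztunnel log lines on stdin and counts how connections were handled:
#   hbone             outbound connections tunneled over HBONE
#   node_local        outbound connections using the node local fast path
#   plaintext_inbound inbound passthrough connections accepted as plaintext
summarize_ztunnel_logs() {
  awk '
    /using HBONE/                   { hbone++ }
    /node local fast path/          { node_local++ }
    /inbound_passthrough: accepted/ { plaintext++ }
    END {
      printf "hbone=%d node_local=%d plaintext_inbound=%d\n",
             hbone, node_local, plaintext
    }
  '
}

# Example (needs kubectl access to the cluster):
#   kubectl logs -n istio-system daemonset/ztunnel | summarize_ztunnel_logs
```

Note that kubectl logs on a daemonset reads from a single pod; to cover every node, pipe the output of Stern (with --no-follow) or loop over the ztunnel pods instead.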

Add an L7 Gateway

We can add a waypoint proxy with the new “x waypoint apply” command of istioctl; this will create a waypoint proxy in the same namespace as the application, associated with the service account bookinfo-productpage:

$> docker run -ti --rm -v ~/.kube/config:/config \
     gcr.io/istio-testing/istioctl -c /config \
     -n bookinfo x waypoint apply --service-account bookinfo-productpage

A waypoint proxy will make sure that the L7 policies are applied to the connections to the pods using the service account, and that custom policies are enforced, such as request type limiting, network routing, etc.

This setup can be seen in the picture below.

[Figure: Istio Ambient Mesh in Azure Kubernetes Service: A primer]

When you execute the same request you can see the product page waypoint pod being used: 

$> stern -n bookinfo productpage-istio-waypoint

productpage-istio-waypoint-58f9ffd98-gwb9l istio-proxy [2023-04-05T13:12:29.454Z] "GET /productpage HTTP/1.1" 200 - via_upstream - "-" 0 4294 16 16 "10.224.0.33" "curl/7.87.0" "0e88e6b4-f4a0-484e-9130-fa51d6c673b1" "4.231.74.253" "envoy://connect_originate/" encap envoy://internal_client_address/ 10.224.0.18:9080 10.224.0.33:0 - default

More examples of using L7 waypoint proxy are available in the preliminary Istio documentation.

Conclusion

We demonstrated how the latest version of Istio Ambient can be easily deployed in Azure Kubernetes Service, enabling its users to kick the tires of this new sidecar-less model even in a managed Kubernetes service. This new operating model for service mesh allows for progressive adoption and incremental enablement of your workloads in the service mesh, avoiding big-bang migrations and allowing meshed applications to co-exist side by side with other applications in your cluster.

Learn more about Istio Ambient Mesh in this eBook.