Understanding Istio Ambient Ztunnel and Secure Overlay
There are a variety of tunnel mechanisms that can be used today to create connectivity between disconnected/remote networks. These have been traditionally used to overcome limitations in the WAN, or even to provide a deep layer of encryption.
- IPsec, a protocol used to provide encryption and authentication between two remote networks using the internet as a medium
- VXLAN/GENEVE, an overlay protocol used to carry MAC-in-UDP, and allow for extending Layer 2 networks across Layer 3 boundaries
- NV/GRE, an IP-in-IP encapsulation and tunneling protocol used to extend IP networks over IP networks
These tunneling techniques have paved the way for the secure overlays used in a service mesh future.
The way microservices have evolved has demonstrated a need for flexible connectivity options that still maintain zero trust, and Istio has evolved accordingly to provide a sidecar-less data plane called ambient mesh. You can read more about the announcement from the Istio community over here. Special care was taken during the development of ambient mesh and the sidecar-less approach to ensure that the properties of zero trust, such as identity and encryption, were maintained.
Ambient mesh has taken the service-mesh functionality and broken it into two complementary layers: one that focuses on securing Layer 4 connectivity and one that implements Layer 7 policy and behaviors.
L4 → ztunnel | L7 → waypoint proxy
With ambient mesh, the concepts of ztunnel and waypoint proxy were introduced. The zero-trust tunnel, or ztunnel, is strictly responsible for L4 connectivity. Ztunnels create overlays to each other as well as to waypoint proxies, and the waypoint proxy is responsible for L7 policy enforcement.
Welcome to ztunnel
Ztunnel, or “zero-trust tunnel”, is a secure overlay layer that secures connections between services. The key features of ztunnel are:
- Security: Mutual TLS encrypted communication among your applications with cryptographic-based identity and L4 authorization policy
- Observability: TCP metrics and logs
- Connection multiplexing and balancing
Ztunnel originated from the idea of having a zero-trust tunnel where identity can be maintained on both ends using SPIFFE ID, which originates from the workloads. Ztunnel is a resource deployed as a daemonset and facilitates the creation of a secure overlay using the HTTP Based Overlay Network Encapsulation protocol, or HBONE for short.
HBONE runs on a dedicated port (15008) between the proxies in the data plane and uses mTLS and strong identity, similar to how Istio currently works. However, HBONE is hidden from workloads: applications do not use it directly. The reasons to support a better transport mechanism include:
- Better support for protocols, including server-send-first protocols
- Better support for incrementally adopting Istio, especially when apps use their own TLS certificates
- Support for calling pod IPs directly, eliminating the ways around Istio mTLS/encapsulation that we see today
HBONE is made possible by enhancements to Envoy’s HTTP upgrade support, which allow tunneling raw TCP over HTTP POST or HTTP CONNECT. The upgrade details of Envoy’s HTTP CONNECT support can be reviewed here.
Figure 1 – The traffic stream using HBONE on TCP Port 15008 between ztunnels and HBONE packet composition.
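To make the encapsulation a little more concrete, here is a minimal Python sketch of what an HBONE tunnel request looks like from the initiating side. It is an illustration only, not ztunnel's implementation: the names HboneRequest and build_hbone_request are hypothetical, and it assumes (based on the sample ztunnel log in this post) that the tunnel connects to the destination pod IP on port 15008 while the CONNECT request names the original destination port. The one protocol fact it relies on is that an HTTP/2 CONNECT request carries only the :method and :authority pseudo-headers.

```python
from __future__ import annotations

from dataclasses import dataclass

HBONE_PORT = 15008  # dedicated HBONE port named in the text


@dataclass(frozen=True)
class HboneRequest:
    """Hypothetical description of one HBONE tunnel request."""
    tunnel_address: tuple[str, int]   # where the mTLS connection is opened
    headers: list[tuple[str, str]]    # HTTP/2 headers sent inside that connection


def build_hbone_request(dest_ip: str, dest_port: int) -> HboneRequest:
    """Describe the tunnel request for a single TCP destination."""
    if not 0 < dest_port < 65536:
        raise ValueError(f"invalid port: {dest_port}")
    # An HTTP/2 CONNECT request carries only :method and :authority;
    # :authority names the address the application originally dialed.
    headers = [
        (":method", "CONNECT"),
        (":authority", f"{dest_ip}:{dest_port}"),
    ]
    # Assumption: the overlay targets the same pod IP on the HBONE port.
    return HboneRequest((dest_ip, HBONE_PORT), headers)


request = build_hbone_request("10.101.1.4", 8080)
print(request.tunnel_address)
print(request.headers)
```

The application only ever sees its own TCP stream to 10.101.1.4:8080; the CONNECT request and the mTLS session on port 15008 exist purely between the proxies.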
ztunnel implementation
Today, a ztunnel is deployed as a pod on each node in the cluster using a daemonset, and it happens to be based on Envoy Proxy. Envoy was the most expedient option for the original release of ambient but, because of the need for a simpler and higher-performance Layer 4 component, the underlying implementation will change soon; more on this later. What matters is that, as long as you have two endpoints that can be configured to initiate and terminate the overlay, mTLS can be established and requests can be forwarded on.
To better understand how this works in ambient mesh, let’s use the example from the workshop right here to look under the hood of ztunnel.
Because the output is too large to display here, you should run the following commands in the workshop suggested above:
- kubectl get pods -n istio-system
- kubectl -n istio-system describe pod ztunnel-wl482
The first command lists all pods in the istio-system namespace, including the ztunnel pods. Pick any one of the ztunnel pods and run the second command against it (your pod name will differ) to describe its container specifications, which have several key areas of interest:
- The image being used is the proxyv2 image, tuned for ambient mesh, and is present on both the init container and the main running container.
- The security context, which calls out the NET_ADMIN privileges used for the redirection of traffic.
- The specific X.509 certificate information for identity and encryption as you would normally see in the Istio sidecar configuration.
- The token that authorizes ztunnel to run through the CSR process for a given workload.
If you run kubectl -n istio-system logs -l app=ztunnel | grep '^\[' while making requests from one pod to another pod in ambient mesh, you will also see the requests flowing through the secure overlay. Below is the output for a request from a sleep pod to the web-api pod. The HTTP 200 status code shows that the connection was successful. Also, the SPIFFE identity is present for both web-api and sleep pods at each end of the tunnel.
root@virtualmachine:~# kubectl -n istio-system logs -l app=ztunnel | grep '^\['
[2022-09-26T12:41:52.458Z] "CONNECT - HTTP/2" 200 - via_upstream - "-" 127 440 1 - "-" "-" "7271ea29-33df-46a0-b1a9-023a39efd3fb" "10.101.1.4:8080" "10.101.1.4:8080" virtual_inbound 10.101.2.3:42811 10.101.1.4:15008 10.101.2.3:41588 - - inbound hcm
[2022-09-26T12:41:52.456Z] "- - -" 0 - - - "-" 125 825 5 - "-" "-" "-" "-" "10.101.2.5:15008" outbound_tunnel_clus_spiffe://cluster.local/ns/default/sa/web-api 10.101.1.2:45910 10.101.2.5:8080 envoy://internal_client_address/ - - outbound tunnel
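The outbound log entry embeds a SPIFFE ID in its cluster name. As a small illustration of how that identity is structured, the Python sketch below splits an Istio-style SPIFFE ID into its trust domain, namespace, and service account. The helper names (WorkloadIdentity, parse_spiffe_id) are hypothetical and this is not how ztunnel itself parses identities; it simply assumes the spiffe://<trust-domain>/ns/<namespace>/sa/<service-account> layout visible in the log.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class WorkloadIdentity(NamedTuple):
    trust_domain: str
    namespace: str
    service_account: str


def parse_spiffe_id(spiffe_id: str) -> WorkloadIdentity:
    """Split spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>."""
    parsed = urlparse(spiffe_id)
    parts = parsed.path.strip("/").split("/")
    well_formed = (
        parsed.scheme == "spiffe"
        and parsed.netloc != ""
        and len(parts) == 4
        and parts[0] == "ns"
        and parts[2] == "sa"
    )
    if not well_formed:
        raise ValueError(f"not an Istio-style SPIFFE ID: {spiffe_id!r}")
    return WorkloadIdentity(parsed.netloc, parts[1], parts[3])


# Cluster name copied from the outbound log entry above.
cluster = "outbound_tunnel_clus_spiffe://cluster.local/ns/default/sa/web-api"
identity = parse_spiffe_id(cluster[cluster.index("spiffe://"):])
print(identity)
```

Here the identity resolves to the web-api service account in the default namespace of the cluster.local trust domain, which is the workload the outbound tunnel is addressed to.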
Visualizing ztunnel
Where does identity originate from? Ztunnel can assume the identity of the pods currently running on the same node. Similar to the sidecar model, each service account has its own identity, and a key/certificate pair is signed for each service account via Certificate Signing Requests (CSRs) sent from ztunnel to the Istio control plane. By using its own service account, ztunnel can have istiod sign the CSR and return an X.509 certificate for the workload it impersonates. The ztunnel at the other end of the secure overlay repeats that process for its own co-located workload, which ensures end-to-end encryption. You can view the X.509 certificates managed by your ztunnel using the istioctl pc secret command and then base64-decoding each certificate. Similar to the sidecar architecture, these X.509 certificates are rotated automatically well before they expire (every 12 hours by default in Istio) without you needing to do anything.
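One consequence of identity being tied to the service account is that a ztunnel needs one certificate per distinct service account running on its node, not one per pod. The Python sketch below illustrates that bookkeeping under stated assumptions: the Pod record, the node names, and the identities_for_node helper are all hypothetical, and the trust domain defaults to the cluster.local value seen in the sample log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    """Hypothetical, minimal view of a scheduled pod."""
    name: str
    namespace: str
    service_account: str
    node: str


def identities_for_node(pods, node, trust_domain="cluster.local"):
    """Return the distinct SPIFFE IDs needed for the pods on `node`, sorted."""
    return sorted({
        f"spiffe://{trust_domain}/ns/{p.namespace}/sa/{p.service_account}"
        for p in pods
        if p.node == node
    })


pods = [
    Pod("sleep-1", "default", "sleep", "node-a"),
    Pod("sleep-2", "default", "sleep", "node-a"),
    Pod("web-api-1", "default", "web-api", "node-b"),
]
node_a_ids = identities_for_node(pods, "node-a")
print(node_a_ids)
```

Both sleep pods share a service account, so the ztunnel on node-a would need a single identity for them in this model.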
Let’s review a few architectures to better understand how the secure transport is initiated with ztunnel.
Normally, a service such as C1 would have a sidecar to establish a connection with S1. In this case, however, traffic originates from C1 and is tunneled through the ztunnel on the local node to the ztunnel on the remote node, which decapsulates the traffic and forwards it on to the destination, S1. We see this depicted below.
Figure 2. Ztunnel implementation as daemonset, receiving configuration from Istiod
More specifically, as shown below, the HBONE overlay is established between the ztunnels on each node. mTLS is still maintained to ensure encryption, and it is possible because each ztunnel has impersonated its co-located workload.
Figure 3. Ztunnel pods establish a two-way tunnel with mTLS enabled
For any workloads with a waypoint proxy deployed, a secure tunnel is established between ztunnel and the waypoint proxy. We can see in Figure 4 below that the ztunnel and waypoint proxy establish a tunnel to each other and that mTLS with identity is maintained end-to-end.
Figure 4. Ztunnel pods establish two-way tunnels to the waypoint proxy with mTLS enabled; Istiod still sends configuration data to ztunnel and the waypoint proxy
How are we enforcing L4 authorization policies?
We can enforce service-to-service network policy with the L4/secure overlay layer, including things like “deny-all” or very fine-grained service-to-service connectivity such as “Service A can talk to Service B but not Service C”. We cannot do anything that requires parsing the connection, such as inspecting HTTP headers or JWTs, in the secure overlay layer.
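The kind of decision available at this layer can be sketched as a toy model in Python. This is not Istio's AuthorizationPolicy evaluation logic; the AllowRule type, the is_allowed function, and the service-a/b/c identities are hypothetical. The sketch assumes a default-deny posture with explicit allow rules, and it deliberately uses only what an L4 proxy knows: the peer identities from mTLS and the destination port.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class AllowRule:
    """Hypothetical L4 allow rule keyed on workload identity."""
    source: str                 # SPIFFE ID of the caller
    destination: str            # SPIFFE ID of the target
    port: Optional[int] = None  # None means any port


def is_allowed(rules: Iterable[AllowRule], source: str, destination: str, port: int) -> bool:
    """Default deny: a connection passes only if some rule matches it."""
    return any(
        rule.source == source
        and rule.destination == destination
        and (rule.port is None or rule.port == port)
        for rule in rules
    )


SERVICE_A = "spiffe://cluster.local/ns/default/sa/service-a"
SERVICE_B = "spiffe://cluster.local/ns/default/sa/service-b"
SERVICE_C = "spiffe://cluster.local/ns/default/sa/service-c"

rules = [AllowRule(source=SERVICE_A, destination=SERVICE_B, port=8080)]

print(is_allowed(rules, SERVICE_A, SERVICE_B, 8080))  # A may talk to B
print(is_allowed(rules, SERVICE_A, SERVICE_C, 8080))  # A may not talk to C
```

Note that nothing in the rule refers to paths, headers, or tokens: anything that depends on request contents belongs to the waypoint proxy at L7.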
Understanding ztunnel resiliency
Ztunnel is deployed as a daemonset, which runs one pod per node in the cluster, including Kubernetes control-plane nodes. If a ztunnel goes down, it means that its pod has failed, and there are a few reasons why this could occur:
- Underlying node has failed in the cluster
- Node has run out of CPU, memory, or disk resources
- The node has lost physical network connectivity
- The node is in another network and a firewall is blocking traffic between this node and other nodes in the cluster
In the case of a ztunnel failure, such as a pod failure, the impact is limited to the workloads on the node where the ztunnel failed.
In any of these conditions, Kubernetes will reconcile ztunnels to the best of its ability provided upstream physical issues have been resolved. This is no different than any other application or critical component going down, and being reconciled by Kubernetes.
Optimizing ztunnel with eBPF and other approaches
Today, Istio-CNI uses iptables rules to direct traffic into a tunnel. These iptables rules for redirecting traffic to ztunnels have an effect similar to the rules that redirect traffic to a sidecar in a pod.
eBPF can optimize this by handling the packet processing that gets traffic into a tunnel, removing the need for iptables rules. Solo.io is in the process of open-sourcing this functionality to upstream Istio.
Also, as mentioned previously, ztunnel leverages Envoy today to establish the secure overlay between itself and other ztunnels, as well as waypoint proxies. The xDS configuration of Envoy relies on copying the entire config pipeline per workload, which is expected to scale poorly for larger K8s clusters because it makes updates to xDS more costly. Open source Istio is also exploring ways to optimize the ztunnel architecture using a proxy that could be developed in Rust. See here for more details: https://github.com/istio/istio/issues/40956
Conclusion
Istio Ambient Mesh is a take on sidecar-less service mesh that balances various concerns to achieve better operations, cost, and performance. As we’ve seen in this blog, Istio ambient relies on layering and “opting in” to various layers as needed. The secure overlay layer implemented with the ztunnel component provides the backbone for mTLS, zero-trust properties, and overall secure communications. The ztunnel component will continue to evolve, and we encourage anyone interested in service mesh and application networking to get involved. If you’d like to be an early adopter with enterprise support (from the people who wrote the code!), please reach out to Solo.io.
To learn more about Ambient Mesh, take a look at this blog post, and take this self-paced workshop!