Traffic in ambient mesh: Ztunnel, eBPF configuration, and waypoint proxies (Part 3)
In previous posts, I explained the role of Istio CNI and went into the details of how Istio CNI configures the iptables rules and interfaces on the cluster nodes.
In this post, I cover configuration on the ztunnels and explain the eBPF redirection mechanism and how to deploy waypoint proxies.
You can also watch the accompanying video here.
How is ztunnel configured?
In addition to configuring the nodes, the CNI plugin also configures the iptables for the ztunnel proxy. I already briefly introduced the pistioin and pistioout interfaces. These interfaces on the ztunnel side are connected to the istioin and istioout interfaces on the node using the GENEVE tunnels.
On the ztunnel pod, anything received on the pistioin interface gets forwarded to ports 15008 (HBONE) and 15006 (plain text). Similarly, packets received on the pistioout interface end up on port 15001.
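The port mapping described above can be summarized as a small lookup. This is only an illustrative sketch: the function name is hypothetical, and it assumes that HBONE traffic is recognized by its destination port (15008), which the text does not state explicitly.

```python
# Illustrative model of which ztunnel port handles a packet,
# based on the ports listed in the text.
HBONE_PORT = 15008              # inbound HBONE
INBOUND_PLAINTEXT_PORT = 15006  # inbound plain text
OUTBOUND_PORT = 15001           # everything arriving via pistioout


def ztunnel_listener_port(interface: str, dst_port: int) -> int:
    """Return the ztunnel port for a packet (hypothetical helper)."""
    if interface == "pistioin":
        # Assumption: HBONE traffic is identified by destination port 15008;
        # any other inbound traffic is treated as plain text.
        return HBONE_PORT if dst_port == HBONE_PORT else INBOUND_PLAINTEXT_PORT
    if interface == "pistioout":
        return OUTBOUND_PORT
    raise ValueError(f"unexpected interface: {interface}")


print(ztunnel_listener_port("pistioin", 15008))  # 15008
print(ztunnel_listener_port("pistioin", 80))     # 15006
print(ztunnel_listener_port("pistioout", 80))    # 15001
```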
Figure: Ports on ztunnel
The ztunnel also captures DNS requests on port 15053 to improve the performance and usability of the mesh. Note that the rules configuring the routing to 15053 are only created if ISTIO_META_DNS_CAPTURE is set to true, as specified in the DNS Proxying documentation.
The iptables and rules configuration in the ztunnel pod is set up in such a way that traffic received via the inbound and outbound tunnels (istioin and istioout) gets routed to the localhost.
The iptables rules use the TPROXY mark (0x400/0xfff) to mark packets from the inbound or outbound tunnels and direct them to the corresponding ztunnel inbound and outbound ports.
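The notation 0x400/0xfff is a value/mask pair. Here is a minimal sketch of the bit arithmetic, assuming the usual netfilter semantics in which only the bits selected by the mask are written or compared. The function names are made up for illustration.

```python
TPROXY_MARK = 0x400
TPROXY_MASK = 0xFFF


def set_mark(current: int, value: int = TPROXY_MARK, mask: int = TPROXY_MASK) -> int:
    """Overwrite only the masked bits of an existing packet mark."""
    return (current & ~mask) | (value & mask)


def mark_matches(mark: int, value: int = TPROXY_MARK, mask: int = TPROXY_MASK) -> bool:
    """Return True if the masked bits of the mark equal the expected value."""
    return (mark & mask) == value


marked = set_mark(0xABC123)   # bits outside the mask are left alone
print(hex(marked))            # 0xabc400
print(mark_matches(marked))   # True
print(mark_matches(0x100))    # False
```

The mask is what lets several independent marks share the same 32-bit field without overwriting each other.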
TPROXY is a Linux kernel feature that allows transparent interception and redirection of network traffic at the transport layer. It’s commonly used for implementing transparent proxy servers or load balancers that intercept and redirect traffic without requiring changes to client-side applications or configurations. In contrast, using the REDIRECT target modifies the packets to change the destination address – remember all the hops the request needs to make from the pod, through the tunnels, and to the ztunnel proxy? Using the TPROXY, the original source IP and port of the traffic are preserved, allowing the destination server to see the original client’s IP address and port.
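To make the contrast concrete, here is a toy model (not kernel code) of the two behaviours. The Packet type, the proxy address, and the sleep pod IP and port are assumptions for illustration; only the httpbin IP comes from this post.

```python
from dataclasses import dataclass, replace
from typing import Tuple

Addr = Tuple[str, int]


@dataclass(frozen=True)
class Packet:
    src: Addr
    dst: Addr


PROXY_ADDR: Addr = ("127.0.0.1", 15001)  # hypothetical local proxy listener


def redirect(pkt: Packet) -> Packet:
    """REDIRECT-style handling: the destination is rewritten to the proxy."""
    return replace(pkt, dst=PROXY_ADDR)


def tproxy(pkt: Packet) -> Tuple[Packet, Addr]:
    """TPROXY-style handling: the packet is unchanged; only the local
    socket that receives it is selected."""
    return pkt, PROXY_ADDR


pkt = Packet(src=("10.244.2.3", 40000), dst=("10.244.1.5", 80))
print(redirect(pkt).dst)         # ('127.0.0.1', 15001)
delivered, listener = tproxy(pkt)
print(delivered.dst, listener)   # ('10.244.1.5', 80) ('127.0.0.1', 15001)
```

In the first case the proxy has to recover the original destination some other way; in the second it can read it straight from the packet.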
Putting it all together
Let’s recap everything I have learned so far and trace a sample request between the two pods in the ambient mesh.
I have a Kubernetes cluster and deployed Istio ambient mesh with two workloads, both part of the ambient mesh. I will be making a request from the sleep pod to the httpbin pod (e.g. curl httpbin).
- The request from the sleep pod is captured by the rules and iptables configuration on the node.
- Because the pod is part of the ambient mesh, its IP address was added to the IP set on the node and the packets get marked with 0x100.
- The rules on the node specify that any packets marked with 0x100 are to be directed to the destination 192.168.127.2 through the istioout interface.
- The rules on the ztunnel proxy transparently proxy the packets from pistioout to the ztunnel outbound port 15001.
- The ztunnel processes the packets and sends them to the destination IP 10.244.1.5 (httpbin), where they get captured on the dedicated interface that was created for the httpbin IP on Node B.
- The rules for the inbound traffic ensure the packets get routed to the istioin interface.
- A tunnel between istioin and pistioin makes the packets land on the ztunnel pod.
- The iptables configuration captures the packets from pistioin and, based on the marks, directs them to port 15008.
- The proxy processes the packets and sends them to the destination pod.
Redirection mechanism using eBPF
The Linux kernel is not easy to modify; however, with eBPF we can extend the kernel's capabilities and run sandboxed programs without modifying or recompiling the kernel. With eBPF we can programmatically control the network and create our own networking stack.
This means that, instead of configuring iptables rules, route tables, and GENEVE tunnels, we could write an eBPF program that does exactly what we want and attach it to a kernel-defined trigger/event hook. There isn’t inherently anything wrong with iptables and GENEVE tunnels; however, that approach adds complexity and slowness due to the number of links created, more complex kernel forwarding rules, and so on. From that standpoint, eBPF is less complex, more performant, and easier to manage.
The traffic control (TC) subsystem gives us a way to attach eBPF programs to the ingress and egress points of a specific network interface. So instead of configuring kernel firewall rules that specify what happens to inbound and outbound packets, we can write eBPF programs to execute within the kernel at those points.
Figure: eBPF programs for redirecting traffic
The Istio CNI installs the eBPF programs that hook onto the TC ingress and egress points and route the traffic accordingly.
The current eBPF redirection implementation includes four eBPF programs:
- App inbound
- App outbound
- Ztunnel host ingress
- Ztunnel ingress
The app inbound and outbound eBPF programs are attached to the ingress/egress TC of the workload pod on the host side. On the ztunnel side, the programs are attached on the ingress TC on the host and on the pod side.
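The attachment points can be pictured as a registry keyed by interface and direction. The following is a conceptual model only, not real eBPF or TC tooling. The interface names are hypothetical, and the assignment of the app outbound program to ingress (and app inbound to egress) is an assumption based on the fact that packets leaving a pod arrive as ingress on the host side of its veth pair.

```python
from collections import defaultdict

# (interface, direction) -> names of attached programs
tc_hooks = defaultdict(list)


def attach(interface: str, direction: str, program: str) -> None:
    """Register a program on a TC hook of the given interface."""
    if direction not in ("ingress", "egress"):
        raise ValueError(f"unknown direction: {direction}")
    tc_hooks[(interface, direction)].append(program)


def programs_for(interface: str, direction: str) -> list:
    """Return the programs that would run at this hook."""
    return list(tc_hooks.get((interface, direction), []))


# Host side of the workload pod's veth (hypothetical interface name).
attach("app-veth-host", "ingress", "app_outbound")  # assumed direction
attach("app-veth-host", "egress", "app_inbound")    # assumed direction
# Ztunnel: ingress TC on the host side and on the pod side.
attach("ztunnel-veth-host", "ingress", "ztunnel_host_ingress")
attach("ztunnel-veth-pod", "ingress", "ztunnel_ingress")

print(programs_for("app-veth-host", "ingress"))     # ['app_outbound']
print(programs_for("ztunnel-veth-pod", "egress"))   # []
```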
What happens when using a waypoint proxy?
Since ztunnels only handle L4 concerns, we need to deploy a waypoint proxy to handle any L7 requests.
The waypoint proxies are deployed per service account and can live on any node, regardless of where the actual workload is running. When the waypoint is deployed, the ztunnel configuration is updated and any workloads whose traffic should be handled through the waypoint proxy (i.e. workloads using the same service account) have a waypoint address added to their configuration entry.
For example, here’s a snippet that shows the entry for the httpbin workload (IP 10.244.1.5) when we deployed the waypoint proxy (10.244.2.6):
"10.244.1.5": {
  "workloadIp": "10.244.1.5",
  "waypointAddresses": ["10.244.2.6"],
  "gatewayAddress": null,
  "protocol": "HBONE",
  "name": "httpbin-5984865fdc-qh894",
  "namespace": "default",
  "serviceAccount": "httpbin",
  "workloadName": "httpbin",
  "workloadType": "deployment",
  "canonicalName": "httpbin",
  "canonicalRevision": "v1",
  "node": "ambient-worker2",
  "nativeHbone": false,
  "authorizationPolicies": [],
  "status": "Healthy"
},
The waypoint proxy is an instance of Envoy that’s configured to exactly match the IP and port of the destination workload, enforce any L7 authentication rules and forward the request to the destination.
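As a rough illustration of how such a workload entry could drive the routing decision, the sketch below picks a next hop from a trimmed copy of the entry shown earlier. The next_hop function is hypothetical, and choosing the first waypoint address is an assumption, not a description of ztunnel's actual selection logic.

```python
import json

# Trimmed copy of the httpbin workload entry.
entry_json = """
{
  "workloadIp": "10.244.1.5",
  "waypointAddresses": ["10.244.2.6"],
  "protocol": "HBONE",
  "name": "httpbin-5984865fdc-qh894",
  "serviceAccount": "httpbin"
}
"""


def next_hop(entry: dict) -> str:
    """Return the waypoint address if one is configured, else the workload IP."""
    waypoints = entry.get("waypointAddresses") or []
    # Assumption: use the first listed waypoint.
    return waypoints[0] if waypoints else entry["workloadIp"]


entry = json.loads(entry_json)
print(next_hop(entry))                          # 10.244.2.6
print(next_hop({"workloadIp": "10.244.1.5"}))   # 10.244.1.5
```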
An important thing to note is that the waypoint proxies are only part of the request flow on the server side – they are strictly reverse proxies for L7 traffic.
Figure: Waypoint proxies
For example, if we have two workloads, both with waypoint proxies deployed, and make a request from workload A to workload B, the request will skip the client-side waypoint proxy (workload A's) and go to the server-side waypoint proxy (waypoint proxy B), and then on to workload B.
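The rule that only the server-side waypoint takes part can be written as a small path function. This is a simplified sketch with hypothetical names, and the ztunnel hops are left out for brevity.

```python
def request_path(client: str, server: str, waypoints: dict) -> list:
    """Return the hops of a request, consulting only the server's waypoint."""
    path = [client]
    # The client's own waypoint is never consulted; only the destination's.
    if server in waypoints:
        path.append(waypoints[server])
    path.append(server)
    return path


waypoints = {"workload-a": "waypoint-a", "workload-b": "waypoint-b"}
print(request_path("workload-a", "workload-b", waypoints))
# ['workload-a', 'waypoint-b', 'workload-b']
print(request_path("workload-b", "workload-a", waypoints))
# ['workload-b', 'waypoint-a', 'workload-a']
```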
Conclusion
Istio ambient mesh offers advantages over the traditional sidecar data plane deployment model. Its non-invasive and leaner design allows for a less risky and incremental adoption of the service mesh. As fewer components are involved – per-node ztunnel proxies and optional waypoint proxies – the adoption of the ambient mesh directly leads to reduced infrastructure costs and performance improvements.
In this blog post series, I went into a lot of detail and looked at how different components, such as the Istio CNI, work together with the platform and how the configuration is set up for transparently proxying the traffic within the ambient mesh. I explained the two redirection mechanisms: iptables with GENEVE tunnels, and eBPF programs. Both approaches achieve the same end goal, with eBPF being more performant, less complex, and easier to maintain, albeit still in experimental mode.
Comparing the configuration burden between the sidecar and ambient mesh data plane deployment modes, it’s fair to say that the ambient mesh requires less complex configuration and is easier to understand. As the improvements on the eBPF side of the redirection mechanism continue, the risk and the complexity will only go down. At the same time, the performance will increase compared to the iptables and GENEVE tunnel approach.
As Solo.io is a co-founder of the Istio ambient sidecar-less architecture and leads the development upstream in the Istio community, we are uniquely positioned to help our customers adopt this architecture for production security and compliance requirements. Please reach out to us to talk with an expert.