A few days ago, I published the Exploring Cilium Layer 7 Capabilities Compared to Istio blog, where I mentioned that network cache-based identity may fail when a pod dies and a new pod is created that gets the old pod's IP but has a different identity. Thank you to everyone who sent me feedback about that blog! In this post, I would like to demonstrate how identity can be mistaken with network cache-based identity, where identities are generated from network information such as Kubernetes pod IPs and the IP-to-identity mappings are cached, so they can become outdated and wrong. You'll see a client that should NOT be able to call a server based on the network policy nevertheless succeed, because the wrong identity is assigned to the client, allowing it to bypass network policy enforcement.
In this experiment, you'll set up a Kubernetes kind cluster and deploy v1 and v2 of the client application (sleep) and v1 and v2 of the server application (helloworld), along with a v1 network policy that allows ONLY the v1 client to call the v1 server and a v2 network policy that allows ONLY the v2 client to call the v2 server. You'll first observe the network policies enforced as expected. Then you'll trigger an error scenario while scaling the client pods up and down, and call the v1 server successfully from the v2 client because the v2 client has the v1 client's identity. Let us get started!
Setting up the environment
To run the test, you can create a Kubernetes kind cluster with 3 workers, disabling the default CNI per Cilium’s documentation:
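For reference, a minimal sketch of such a kind configuration, following the pattern in Cilium's kind guide (the exact node layout and file name here are assumptions, not necessarily what the blog's repo uses):

```bash
# Create a kind cluster with one control-plane node and three workers,
# with the default CNI (kindnet) disabled so Cilium can be installed instead.
cat <<EOF > kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
networking:
  disableDefaultCNI: true
EOF

kind create cluster --config kind-config.yaml
```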
Download the Cilium image:
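A sketch of this step, following Cilium's kind documentation; the exact v1.12 patch tag is an assumption:

```bash
# Pull the Cilium image locally and preload it into the kind nodes
# so the agents don't have to pull it from the registry.
docker pull quay.io/cilium/cilium:v1.12.0
kind load docker-image quay.io/cilium/cilium:v1.12.0
```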
Install Cilium v1.12 with pod-to-pod encryption enabled with WireGuard:
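A Helm-based sketch of this install, assuming the standard Cilium chart values for WireGuard encryption (the patch version and the l7Proxy setting are assumptions; adjust to match your environment):

```bash
helm repo add cilium https://helm.cilium.io/
helm repo update

# Install Cilium v1.12 with WireGuard pod-to-pod encryption.
# The WireGuard guide for this release recommends disabling the L7 proxy.
helm install cilium cilium/cilium --version 1.12.0 \
  --namespace kube-system \
  --set image.pullPolicy=IfNotPresent \
  --set ipam.mode=kubernetes \
  --set encryption.enabled=true \
  --set encryption.type=wireguard \
  --set l7Proxy=false
```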
Check the cilium pods to ensure they all reached the running status:
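For example:

```bash
# Every node should have a cilium agent pod in Running state,
# plus the cilium-operator pod.
kubectl -n kube-system get pods -l k8s-app=cilium -o wide
kubectl -n kube-system get pods | grep cilium-operator
```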
Deploy the applications and network policies
We have two applications, sleep (the client) and helloworld (the server). Both sleep and helloworld have two versions, v1 and v2. We also have two simple L4 network policies: one allows sleep-v1 to call helloworld-v1, and the other allows sleep-v2 to call helloworld-v2. All other calls to helloworld-v1 or helloworld-v2 should be denied.
Clone the repo, then deploy the sleep-v1 and sleep-v2 deployments along with the helloworld-v1 and helloworld-v2 deployments. The sleep-v1 deployment has 15 replicas, while the sleep-v2, helloworld-v1, and helloworld-v2 deployments each have 1 replica.
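The manifest layout of the repo isn't shown here, so the file names below are placeholders; the intent is simply to apply the four deployments and confirm the replica counts:

```bash
# Hypothetical manifest names; substitute the actual paths from the repo.
kubectl apply -f sleep-v1.yaml -f sleep-v2.yaml \
  -f helloworld-v1.yaml -f helloworld-v2.yaml

# Verify the expected replica counts: 15 for sleep-v1, 1 for the others.
kubectl get deploy sleep-v1 sleep-v2 helloworld-v1 helloworld-v2
```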
Deploy the v1 Cilium L4 network policy. The v1 network policy specifies that ONLY sleep-v1 is allowed to call helloworld-v1; sleep-v2 should NOT be able to call helloworld-v1.
Deploy the v2 Cilium L4 network policy. The v2 network policy specifies that ONLY sleep-v2 is allowed to call helloworld-v2; sleep-v1 should NOT be able to call helloworld-v2.
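Both policies follow the same shape. For reference, here is a minimal sketch of what the v1 policy could look like as a CiliumNetworkPolicy; the label keys (app, version) and port 5000 are assumptions based on the common sleep/helloworld samples, and the actual policies live in the repo. The v2 policy is analogous with v2 labels.

```bash
# Sketch of the v1 policy: only pods labeled app=sleep,version=v1 may reach
# helloworld-v1 on TCP port 5000.
kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-sleep-v1-to-helloworld-v1
spec:
  endpointSelector:
    matchLabels:
      app: helloworld
      version: v1
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: sleep
        version: v1
    toPorts:
    - ports:
      - port: "5000"
        protocol: TCP
EOF
```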
Can sleep-v2 call helloworld-v1 successfully when it should not be allowed?
With the above applications and network policies deployed, in most cases the network policies will be effective, so sleep-v2 will not be able to call helloworld-v1 successfully. Assuming all of your sleep and helloworld pods are up and running, you can call helloworld-v1 from the sleep-v1 pod and helloworld-v2 from the sleep-v2 pod:
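A sketch of those calls; the service names, port 5000, and the /hello path are assumptions based on the common helloworld sample:

```bash
# Call helloworld-v1 from a sleep-v1 pod (allowed by the v1 policy)...
kubectl exec deploy/sleep-v1 -- curl -sS helloworld-v1:5000/hello

# ...and helloworld-v2 from the sleep-v2 pod (allowed by the v2 policy).
kubectl exec deploy/sleep-v2 -- curl -sS helloworld-v2:5000/hello

# This call should be denied by the v1 policy and eventually time out.
kubectl exec deploy/sleep-v2 -- curl -sS --connect-timeout 5 helloworld-v1:5000/hello
```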
You'll get output like the one below, where only sleep-v1 can call helloworld-v1 and only sleep-v2 can call helloworld-v2, and nothing else. When sleep-v2 calls helloworld-v1, the connection fails.
Use the command below to display Cilium's IP cache:
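For example, against one of the cilium agent pods:

```bash
# Pick one of the cilium agent pods and dump its BPF IP cache,
# which maps pod IPs to numeric security identities.
CILIUM_POD=$(kubectl -n kube-system get pods -l k8s-app=cilium \
  -o jsonpath='{.items[0].metadata.name}')
kubectl -n kube-system exec "$CILIUM_POD" -- cilium bpf ipcache list
```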
You'll see Cilium's IP cache, similar to the output below:
Display the sleep-v1’s Cilium identity:
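Either of the following works; the label selector assumes app/version labels on the sleep pods, and `$CILIUM_POD` is the agent pod chosen in the previous snippet:

```bash
# Each pod's CiliumEndpoint records its numeric security identity.
kubectl get ciliumendpoints -l app=sleep,version=v1

# Alternatively, list identities and their label sets from a cilium agent pod.
kubectl -n kube-system exec "$CILIUM_POD" -- cilium identity list
```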
Sample output showing that 13174 is sleep-v1's Cilium identity:
Prepare the environment to simulate some failure:
In a perfect world where everything works, you'll observe the network policies enforced as above. However, in certain scenarios sleep-v2 could call helloworld-v1 successfully for various reasons, such as:
- A network outage
- The Cilium pod being down, for example during an upgrade
- Resource constraints that cause the agent to be slow
- Resource constraints that cause the API server to be slow in sending pod events to the Cilium pod
- A programming bug that causes the agent to crash
- Etc…
While these events are unlikely, they are not outside the realm of possibility, and when they happen it is possible for sleep-v2 to call helloworld-v1 successfully.
To demonstrate the eventual-consistency issue that can cause a wrong identity for a pod, let us simulate an "outage" by making the Cilium pod running on the node that hosts helloworld-v1 unable to reach the Kubernetes API server.
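One possible way to simulate this in a kind cluster, since each kind node is just a Docker container, is sketched below. This is a hypothetical approach, not necessarily what the blog's scripts do, and the pod labels and API server port are assumptions; adjust them to your cluster.

```bash
# Hypothetical sketch: find the node running helloworld-v1, then drop that
# node's traffic to the API server so the local Cilium agent stops receiving
# pod and identity updates.
NODE=$(kubectl get pod -l app=helloworld,version=v1 \
  -o jsonpath='{.items[0].spec.nodeName}')
APISERVER=$(kubectl get endpoints kubernetes \
  -o jsonpath='{.subsets[0].addresses[0].ip}')
docker exec "$NODE" iptables -I OUTPUT -d "$APISERVER" -p tcp --dport 6443 -j DROP
```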
Review the test script
Let us review the run-test.sh script together before we run the test. First, it records all the IPs used by the sleep-v1 pods. Then it scales the sleep-v1 deployment to 0 and the sleep-v2 deployment to 15 to simulate an environment where pods come and go rapidly.
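Roughly, the first part of the script does something along these lines (an illustrative sketch, not the script itself; label selectors are assumptions):

```bash
# Record the sleep-v1 pod IPs, then swap the scale of the two sleep deployments.
SLEEP_V1_IPS=$(kubectl get pods -l app=sleep,version=v1 \
  -o jsonpath='{.items[*].status.podIP}')
echo "sleep-v1 IPs before scale-down: $SLEEP_V1_IPS"

kubectl scale deployment sleep-v1 --replicas=0
kubectl scale deployment sleep-v2 --replicas=15
```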
Then it keeps rotating the sleep-v2 pods until one of them is assigned the same IP as one of the sleep-v1 pods recorded before the scale-down.
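An illustrative sketch of that rotation loop, continuing from the variables above:

```bash
# Restart sleep-v2 until one of its pods lands on an IP previously used by
# a sleep-v1 pod, then remember that IP in $FOUND.
FOUND=""
while [ -z "$FOUND" ]; do
  kubectl rollout restart deployment sleep-v2
  kubectl rollout status deployment sleep-v2
  for ip in $(kubectl get pods -l app=sleep,version=v2 \
      -o jsonpath='{.items[*].status.podIP}'); do
    if echo "$SLEEP_V1_IPS" | grep -qw "$ip"; then
      FOUND="$ip"
      echo "sleep-v2 pod reusing former sleep-v1 IP: $FOUND"
    fi
  done
done
```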
Now that you have identified that sleep-v2 pod, call helloworld-v1 from it. Because the pod reuses an IP address that belonged to a sleep-v1 pod earlier, it carries the mistaken sleep-v1 identity. The call should fail per the v1 network policy; however, it succeeds in the test.
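A sketch of that step, reusing the `$FOUND` variable from the rotation sketch above (the service name and port are assumptions):

```bash
# Resolve the sleep-v2 pod that holds the reused IP and curl helloworld-v1 from it.
POD=$(kubectl get pods -l app=sleep,version=v2 \
  -o jsonpath="{.items[?(@.status.podIP==\"$FOUND\")].metadata.name}")
kubectl exec "$POD" -- curl -sS helloworld-v1:5000/hello
```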
Check the Cilium IP Cache to obtain the sleep-v2 pod’s identity from its IP address.
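For example, querying the agent on the node where helloworld-v1 runs, since that is the agent whose cache went stale during the simulated outage (still using the `$FOUND` IP from the sketch above):

```bash
NODE=$(kubectl get pod -l app=helloworld,version=v1 \
  -o jsonpath='{.items[0].spec.nodeName}')
CILIUM_POD=$(kubectl -n kube-system get pods -l k8s-app=cilium \
  --field-selector spec.nodeName="$NODE" \
  -o jsonpath='{.items[0].metadata.name}')

# Look up the identity the stale IP cache still associates with that IP.
kubectl -n kube-system exec "$CILIUM_POD" -- cilium bpf ipcache get "$FOUND"
```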
Run the test!
Simply issue run-test.sh to run the test. You'll get output like the one below, where the sleep-v2 pod (IP address 10.244.2.65) calls helloworld-v1 successfully after a few rotations of the sleep-v2 pods:
From Cilium's IP cache map output below, IP address 10.244.2.65 is associated with identity 13174. If you recall, before you ran the test, 13174 was sleep-v1's Cilium identity.
Display the identity using the kubectl get ciliumidentity 13174 -o yaml command. Per the output below, the sleep-v2 pod with IP address 10.244.2.65 has sleep-v1's identity, 13174. This explains why you could curl helloworld-v1 from the sleep-v2 pod successfully earlier, even though the v1 network policy ONLY allows the sleep-v1 pod to call helloworld-v1.
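For reference:

```bash
# The CiliumIdentity object shows the security-relevant labels behind identity
# 13174; in this scenario they are sleep-v1's labels, not sleep-v2's.
kubectl get ciliumidentity 13174 -o yaml
```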
Take a look at the short video to watch me run the above steps in my test environment:
Wrapping Up
As demonstrated above, network cache-based identity can be mistaken in certain scenarios, even with WireGuard encryption enabled. Without a coherent cache (regardless of what caused the incoherence), the identity can be mistaken, thus bypassing the network policy. Honestly, this is simply a limitation of network-based identity, and I would not consider it a bug. While Cilium is used in our testing, the same applies to any CNI that doesn't use cryptographic primitives for identity. To achieve defense in depth, you should consider security policies from a service mesh that provides cryptographic identity in addition to L3/L4 network policies.