Identity Federation for Multi-Cluster Kubernetes and Service Mesh
In this blog series, we dig into specific challenge areas of multi-cluster Kubernetes and service mesh architectures, along with considerations and approaches for solving them.
The previous blog posts focused on failover and fallback routing from a service mesh perspective, in comparison with (and combined with) multi-cluster API gateway instances.
In this blog post we start looking at federating identity across multiple clusters for authentication between services. This blog post and the previous Service Discovery post are complementary: together they cover what services exist, where they exist, and which ones should communicate with each other.
To start, there are two different kinds of authentication needed in these environments:
- Service to Service authentication
- End user authentication
In this blog post, we’ll focus on service-to-service authentication.
If you want to learn more about End user authentication, you can have a look at the Gloo documentation.
Service to Service Authentication
By default, the TLS protocol only proves the identity of the server to the client using an X.509 certificate; the authentication of the client to the server is left to the application layer.
Mutual TLS authentication refers to two parties authenticating each other at the same time.
In Istio, mutual TLS works as follows:
- Istio re-routes the outbound traffic from a client to the client’s local sidecar Envoy.
- The client side Envoy starts a mutual TLS handshake with the server side Envoy. During the handshake, the client side Envoy also does a secure naming check to verify that the service account presented in the server certificate is authorized to run the target service.
- The client side Envoy and the server side Envoy establish a mutual TLS connection, and Istio forwards the traffic from the client side Envoy to the server side Envoy.
- After authorization, the server side Envoy forwards the traffic to the server service through local TCP connections.
SPIFFE, the Secure Production Identity Framework for Everyone, is a set of open-source standards for securely identifying software systems in dynamic and heterogeneous environments. Systems that adopt SPIFFE can easily and reliably mutually authenticate wherever they are running.
A SPIFFE ID is a string that uniquely and specifically identifies a workload. SPIFFE IDs are Uniform Resource Identifiers (URIs) of the following format: spiffe://<trust domain>/<workload identifier>
In the case of Istio, the SPIFFE ID of a workload looks like spiffe://<trust domain>/ns/<namespace>/sa/<service account>
The default trust domain is cluster.local, so the SPIFFE ID corresponding to a Pod started with the service account pod-sa in the default namespace would be spiffe://cluster.local/ns/default/sa/pod-sa.
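The format above can be sketched as a small shell snippet that assembles the Istio-style SPIFFE ID from its parts (the trust domain, namespace, and service account values are the ones from the example above):

```shell
# Build an Istio-style SPIFFE ID from its components.
trust_domain="cluster.local"
namespace="default"
service_account="pod-sa"

# spiffe://<trust domain>/ns/<namespace>/sa/<service account>
spiffe_id="spiffe://${trust_domain}/ns/${namespace}/sa/${service_account}"
echo "${spiffe_id}"
```

Changing only the trust domain is enough to make otherwise identical workloads distinguishable, which is the key point of the next section.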
In a multi-cluster deployment, using the cluster.local trust domain is a problem: there would be no way to differentiate a workload of one cluster from a workload of another cluster if they use the same service account and namespace names.
Istio allows you to use a different trust domain via the trustDomain parameter of the MeshConfig option.
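For example, the trust domain can be set at install time in the IstioOperator resource (a minimal sketch; the kind2 value matches the lab used below):

```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    trustDomain: kind2
```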
Local Service to Service Authentication
Let’s start with a simple local example.
I’ve deployed Istio on a cluster using the kind2 trust domain.
When you deploy the bookinfo demo application on Istio, the productpage microservice sends requests to the reviews microservice.
If we modify the logging level of the Envoy sidecar proxy running in the reviews Pod and load the web page, we’ll see the information below in the logs of the reviews Pod:
fields { key: "source.namespace" value { string_value: "default" } } fields { key: "source.principal" value { string_value: "kind2/ns/default/sa/bookinfo-productpage" } }
As you can see, the Envoy sidecar proxy running in the reviews Pod is able to determine that the request is coming from a Pod running on the cluster deployed with the trust domain kind2, using the Service Account bookinfo-productpage of the default namespace.
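For reference, one way to raise the sidecar’s log level is through Envoy’s local admin endpoint (port 15000 inside the istio-proxy container); this is a sketch assuming the deployment name and cluster context used in this lab:

```shell
# Raise the Envoy log level of the reviews sidecar via the Envoy admin API.
kubectl --context kind-kind2 exec -t deploy/reviews-v1 -c istio-proxy -- \
  curl -s -X POST "localhost:15000/logging?level=debug"
```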
Multi-cluster Service to Service Authentication
First of all, I’ve deployed Istio on a second cluster using the kind3 trust domain.
The lab is composed of 3 kind clusters: kind1 runs Service Mesh Hub, while kind2 and kind3 each run Istio.
Service Mesh Hub can help unify the root identity between multiple service mesh installations so any intermediates are signed by the same Root CA and end-to-end mTLS between clusters and services can be established correctly.
Run this command to see how the communication between microservices currently occurs:
kubectl --context kind-kind2 exec -t deploy/reviews-v1 -c istio-proxy \
-- openssl s_client -showcerts -connect ratings:9080
You should get something like this:
CONNECTED(00000005)
139706332271040:error:1408F10B:SSL routines:ssl3_get_record:wrong version number:../ssl/record/ssl3_record.c:332:
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 5 bytes and written 309 bytes
Verification: OK
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
command terminated with exit code 1
It means that the traffic is currently not encrypted: openssl attempted a TLS handshake, but the server answered in plaintext, which causes the wrong version number error.
Enable TLS on both clusters:
kubectl --context kind-kind2 apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
EOF

kubectl --context kind-kind3 apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
EOF
Run the command again:
kubectl --context kind-kind2 exec -t deploy/reviews-v1 -c istio-proxy \
-- openssl s_client -showcerts -connect ratings:9080
Now, the output should look like this:
...
Certificate chain
0 s:
i:O = kind2
-----BEGIN CERTIFICATE-----
MIIDFzCCAf+gAwIBAgIRALsoWlroVcCc1n+VROhATrcwDQYJKoZIhvcNAQELBQAw
...
BPiAYRMH5j0gyBqiZZEwCfzfQe1e6aAgie9T
-----END CERTIFICATE-----
1 s:O = kind2
i:O = kind2
-----BEGIN CERTIFICATE-----
MIICzjCCAbagAwIBAgIRAKIx2hzMbAYzM74OC4Lj1FUwDQYJKoZIhvcNAQELBQAw
...
uMTPjt7p/sv74fsLgrx8WMI0pVQ7+2plpjaiIZ8KvEK9ye/0Mx8uyzTG7bpmVVWo
ugY=
-----END CERTIFICATE-----
...
As you can see, mTLS is now enabled.
Now, run the same command on the second cluster:
kubectl --context kind-kind3 exec -t deploy/reviews-v1 -c istio-proxy \
-- openssl s_client -showcerts -connect ratings:9080
The output should look like this:
...
Certificate chain
0 s:
i:O = kind3
-----BEGIN CERTIFICATE-----
MIIDFzCCAf+gAwIBAgIRALo1dmnbbP0hs1G82iBa2oAwDQYJKoZIhvcNAQELBQAw
...
YvDrZfKNOKwFWKMKKhCSi2rmCvLKuXXQJGhy
-----END CERTIFICATE-----
1 s:O = kind3
i:O = kind3
-----BEGIN CERTIFICATE-----
MIICzjCCAbagAwIBAgIRAIjegnzq/hN/NbMm3dmllnYwDQYJKoZIhvcNAQELBQAw
...
GZRM4zV9BopZg745Tdk2LVoHiBR536QxQv/0h1P0CdN9hNLklAhGN/Yf9SbDgLTw
6Sk=
-----END CERTIFICATE-----
...
The first certificate in the chain is the workload certificate and the second one is the Istio CA’s signing certificate.
As you can see, the Istio CA’s signing certificates are different in the 2 clusters, so one cluster can’t validate certificates issued by the other cluster.
Creating a Virtual Mesh will unify the root identity.
Run the following command to create the Virtual Mesh:
cat << EOF | kubectl --context kind-kind1 apply -f -
apiVersion: networking.smh.solo.io/v1alpha2
kind: VirtualMesh
metadata:
  name: virtual-mesh
  namespace: service-mesh-hub
spec:
  mtlsConfig:
    autoRestartPods: true
    shared:
      rootCertificateAuthority:
        generated: null
  federation: {}
  meshes:
  - name: istiod-istio-system-kind2
    namespace: service-mesh-hub
  - name: istiod-istio-system-kind3
    namespace: service-mesh-hub
EOF
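You can follow the progress of the federation workflow by inspecting the resource that was just created (a sketch using the names from this lab; the exact status fields depend on the Service Mesh Hub version):

```shell
# Inspect the VirtualMesh to follow the identity-unification workflow.
kubectl --context kind-kind1 get virtualmesh -n service-mesh-hub virtual-mesh -o yaml
```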
When we create the VirtualMesh and set the trust model to shared, Service Mesh Hub will kick off the process to unify the identity to a shared root.
First, Service Mesh Hub will create the Root CA.
Then, Service Mesh Hub will use a Certificate Request (CR) agent on each of the clusters to create a new key/cert pair that will form an intermediate CA used by the mesh on that cluster. It will then create a Certificate Request.
Service Mesh Hub will sign the certificate with the Root CA. At that point, we want Istio to pick up the new intermediate CA and start using that for its workloads.
To do that, Service Mesh Hub creates a Kubernetes secret called cacerts in the istio-system namespace.
You can have a look at the Istio documentation if you want to get more information about this process.
Check that the new certificate has been created on the first cluster:
kubectl --context kind-kind2 get secret -n istio-system cacerts -o yaml
Here is the expected output:
apiVersion: v1
data:
ca-cert.pem: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZFRENDQXZpZ0F3SUJBZ0lRUG5kRDkwejN4dytYeTBzYzNmcjRmekFOQmdrcWhraUc5dzBCQVFzRkFEQWIKTVJrd0Z3WURWU...
jFWVlZtSWl3Si8va0NnNGVzWTkvZXdxSGlTMFByWDJmSDVDCmhrWnQ4dz09Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
ca-key.pem: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlKS0FJQkFBS0NBZ0VBczh6U0ZWcEFxeVNodXpMaHVXUlNFMEJJMXVwbnNBc3VnNjE2TzlKdzBlTmhhc3RtClUvZERZS...
DT2t1bzBhdTFhb1VsS1NucldpL3kyYUtKbz0KLS0tLS1FTkQgUlNBIFBSSVZBVEUgS0VZLS0tLS0K
cert-chain.pem: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZFRENDQXZpZ0F3SUJBZ0lRUG5kRDkwejN4dytYeTBzYzNmcjRmekFOQmdrcWhraUc5dzBCQVFzRkFEQWIKTVJrd0Z3WURWU...
RBTHpzQUp2ZzFLRUR4T2QwT1JHZFhFbU9CZDBVUDk0KzJCN0tjM2tkNwpzNHYycEV2YVlnPT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
key.pem: ""
root-cert.pem: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUU0ekNDQXN1Z0F3SUJBZ0lRT2lZbXFGdTF6Q3NzR0RFQ3JOdnBMakFOQmdrcWhraUc5dzBCQVFzRkFEQWIKTVJrd0Z3WURWU...
UNBVEUtLS0tLQo=
kind: Secret
metadata:
labels:
agent.certificates.smh.solo.io: service-mesh-hub
cluster.multicluster.solo.io: ""
name: cacerts
namespace: istio-system
type: certificates.smh.solo.io/issued_certificate
Check that the new certificate has been created on the second cluster:
kubectl --context kind-kind3 get secret -n istio-system cacerts -o yaml
Here is the expected output:
apiVersion: v1
data:
ca-cert.pem: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZFRENDQXZpZ0F3SUJBZ0lRWXE1V29iWFhGM1gwTjlNL3BYYkNKekFOQmdrcWhraUc5dzBCQVFzRkFEQWIKTVJrd0Z3WURWU...
XpqQ1RtK2QwNm9YaDI2d1JPSjdQTlNJOTkrR29KUHEraXltCkZIekhVdz09Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
ca-key.pem: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlKS1FJQkFBS0NBZ0VBMGJPMTdSRklNTnh4K1lMUkEwcFJqRmRvbG1SdW9Oc3gxNUUvb3BMQ1l1RjFwUEptCndhR1U1V...
MNU9JWk5ObDA4dUE1aE1Ca2gxNCtPKy9HMkoKLS0tLS1FTkQgUlNBIFBSSVZBVEUgS0VZLS0tLS0K
cert-chain.pem: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZFRENDQXZpZ0F3SUJBZ0lRWXE1V29iWFhGM1gwTjlNL3BYYkNKekFOQmdrcWhraUc5dzBCQVFzRkFEQWIKTVJrd0Z3WURWU...
RBTHpzQUp2ZzFLRUR4T2QwT1JHZFhFbU9CZDBVUDk0KzJCN0tjM2tkNwpzNHYycEV2YVlnPT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
key.pem: ""
root-cert.pem: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUU0ekNDQXN1Z0F3SUJBZ0lRT2lZbXFGdTF6Q3NzR0RFQ3JOdnBMakFOQmdrcWhraUc5dzBCQVFzRkFEQWIKTVJrd0Z3WURWU...
UNBVEUtLS0tLQo=
kind: Secret
metadata:
labels:
agent.certificates.smh.solo.io: service-mesh-hub
cluster.multicluster.solo.io: ""
name: cacerts
namespace: istio-system
type: certificates.smh.solo.io/issued_certificate
As you can see, the secrets contain the same Root CA (base64 encoded), but different intermediate certs.
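Rather than eyeballing the base64 blobs, you can compare the two root certificates by fingerprint (a sketch using the cluster contexts from this lab; both commands should print the same digest):

```shell
# Extract root-cert.pem from each cluster's cacerts secret and fingerprint it.
for ctx in kind-kind2 kind-kind3; do
  kubectl --context "$ctx" get secret -n istio-system cacerts \
    -o jsonpath='{.data.root-cert\.pem}' | base64 -d \
    | openssl x509 -noout -fingerprint -sha256
done
```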
Now, let’s check what certificates we get when we run the same commands we ran before we created the Virtual Mesh:
kubectl --context kind-kind2 exec -t deploy/reviews-v1 -c istio-proxy \
-- openssl s_client -showcerts -connect ratings:9080
The output should look like this:
...
Certificate chain
0 s:
i:
-----BEGIN CERTIFICATE-----
MIIEBzCCAe+gAwIBAgIRAK1yjsFkisSjNqm5tzmKQS8wDQYJKoZIhvcNAQELBQAw
...
T77lFKXx0eGtDNtWm/1IPiOutIMlFz/olVuN
-----END CERTIFICATE-----
1 s:
i:O = service-mesh-hub
-----BEGIN CERTIFICATE-----
MIIFEDCCAvigAwIBAgIQPndD90z3xw+Xy0sc3fr4fzANBgkqhkiG9w0BAQsFADAb
...
hkZt8w==
-----END CERTIFICATE-----
2 s:O = service-mesh-hub
i:O = service-mesh-hub
-----BEGIN CERTIFICATE-----
MIIE4zCCAsugAwIBAgIQOiYmqFu1zCssGDECrNvpLjANBgkqhkiG9w0BAQsFADAb
...
s4v2pEvaYg==
-----END CERTIFICATE-----
3 s:O = service-mesh-hub
i:O = service-mesh-hub
-----BEGIN CERTIFICATE-----
MIIE4zCCAsugAwIBAgIQOiYmqFu1zCssGDECrNvpLjANBgkqhkiG9w0BAQsFADAb
...
s4v2pEvaYg==
-----END CERTIFICATE-----
...
And let’s compare with what we get on the second cluster:
kubectl --context kind-kind3 exec -t deploy/reviews-v1 -c istio-proxy \
-- openssl s_client -showcerts -connect ratings:9080
The output should look like this:
...
Certificate chain
0 s:
i:
-----BEGIN CERTIFICATE-----
MIIEBjCCAe6gAwIBAgIQfSeujXiz3KsbG01+zEcXGjANBgkqhkiG9w0BAQsFADAA
...
EtTlhPLbyf2GwkUgzXhdcu2G8uf6o16b0qU=
-----END CERTIFICATE-----
1 s:
i:O = service-mesh-hub
-----BEGIN CERTIFICATE-----
MIIFEDCCAvigAwIBAgIQYq5WobXXF3X0N9M/pXbCJzANBgkqhkiG9w0BAQsFADAb
...
FHzHUw==
-----END CERTIFICATE-----
2 s:O = service-mesh-hub
i:O = service-mesh-hub
-----BEGIN CERTIFICATE-----
MIIE4zCCAsugAwIBAgIQOiYmqFu1zCssGDECrNvpLjANBgkqhkiG9w0BAQsFADAb
...
s4v2pEvaYg==
-----END CERTIFICATE-----
3 s:O = service-mesh-hub
i:O = service-mesh-hub
-----BEGIN CERTIFICATE-----
MIIE4zCCAsugAwIBAgIQOiYmqFu1zCssGDECrNvpLjANBgkqhkiG9w0BAQsFADAb
...
s4v2pEvaYg==
-----END CERTIFICATE-----
...
You can see that the last certificate in the chain is now identical on both clusters. It’s the new root certificate.
The first certificate is the certificate of the service. Let’s decode it.
Copy and paste the content of the certificate (including the BEGIN and END CERTIFICATE lines) into a new file called /tmp/cert and run the following command:
openssl x509 -in /tmp/cert -text
The output should be as follows:
Certificate:
Data:
Version: 3 (0x2)
Serial Number:
7d:27:ae:8d:78:b3:dc:ab:1b:1b:4d:7e:cc:47:17:1a
Signature Algorithm: sha256WithRSAEncryption
Issuer:
Validity
Not Before: Sep 17 08:21:08 2020 GMT
Not After : Sep 18 08:21:08 2020 GMT
Subject:
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
Public-Key: (2048 bit)
Modulus:
...
Exponent: 65537 (0x10001)
X509v3 extensions:
X509v3 Key Usage: critical
Digital Signature, Key Encipherment
X509v3 Extended Key Usage:
TLS Web Server Authentication, TLS Web Client Authentication
X509v3 Basic Constraints: critical
CA:FALSE
X509v3 Subject Alternative Name: critical
URI:spiffe://kind3/ns/default/sa/bookinfo-ratings
Signature Algorithm: sha256WithRSAEncryption
...
-----BEGIN CERTIFICATE-----
MIIEBjCCAe6gAwIBAgIQfSeujXiz3KsbG01+zEcXGjANBgkqhkiG9w0BAQsFADAA
...
EtTlhPLbyf2GwkUgzXhdcu2G8uf6o16b0qU=
-----END CERTIFICATE-----
The Subject Alternative Name (SAN) is the most interesting part: it allows the sidecar proxy of the reviews service to validate that it is talking to the sidecar proxy of the ratings service.
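You can extract just the SAN without reading the whole decoded certificate. The sketch below issues a throwaway self-signed certificate carrying the same SPIFFE URI (file names and subject are assumptions for illustration), then greps out the SAN; run the same grep against /tmp/cert to check a real workload certificate:

```shell
# Issue a throwaway certificate with a SPIFFE URI SAN (requires OpenSSL 1.1.1+
# for -addext), then extract the Subject Alternative Name from it.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout /tmp/demo.key -out /tmp/demo.crt -subj "/O=demo" \
  -addext "subjectAltName=URI:spiffe://kind3/ns/default/sa/bookinfo-ratings" 2>/dev/null

# Print only the SAN line and its value.
openssl x509 -in /tmp/demo.crt -noout -text | grep -A1 "Subject Alternative Name"
```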
References
- Mutual Authentication https://en.wikipedia.org/wiki/Mutual_authentication
- Istio Security Concepts https://istio.io/latest/docs/concepts/security/
- SPIFFE Overview https://spiffe.io/docs/latest/spiffe/overview/
- Istio Config Docs https://istio.io/latest/docs/reference/config/istio.mesh.v1alpha1/
Get started
Service Mesh Hub was updated and open sourced in May, and has recently started community meetings to expand the conversation around service mesh. We invite you to check out the project and join the community. Solo.io also offers enterprise support for Istio service mesh for those looking to operationalize service mesh environments; request a meeting to learn more.
- Learn more about Service Mesh Hub
- Read the docs and watch the demos
- Request a personalized demo
- Questions? Join the community