Getting Started with Multi-tenancy and Routing Delegation in Gloo Platform
An Istio user recently approached us with questions that we at Solo.io frequently hear from enterprises embarking on Day 2 of their service mesh journey. In this post, we’ll focus on this one:
“We’ve experienced challenges in scaling our organization to support multiple application groups using community Istio. How can Gloo Platform help us with multi-tenancy?”
The purpose of this post is to present a simple example illustrating Gloo Platform's support for multi-tenancy and how it can lead to better results when managing the configuration of multiple project teams, as compared with basic open-source Istio.
Prerequisites for Following Along
If you’d like to follow along with this example in your own environment, you’ll need a Kubernetes cluster and associated tools, plus an installation of Gloo Platform. We ran the tests in this blog on Gloo Platform v2.3.4 with Istio v1.17.2. We hosted all of this on a local instance of k3d v5.4.3. (We’ll provide a single-command installer that will allow you to set up and later tear down all the infrastructure on a local workstation.)
You’ll need a license key to install Gloo Platform if you don’t already have one. You can obtain a key by initiating a free trial.
For this exercise, we'll also use some common CLI utilities like kubectl, curl, and git. Make sure these prerequisites are all available to you before jumping into the next section. We're building this on macOS, but other platforms should work perfectly well too.
Clone GitHub Repo
The resources required for this exercise are available in the gloo-gateway-use-cases repo on GitHub. Clone that to your workstation and switch to the gloo-gateway-use-cases directory. We'll primarily be using the resources in the multitenant example.
git clone https://github.com/solo-io/gloo-gateway-use-cases.git
cd gloo-gateway-use-cases
Install Gloo Platform
As this is a getting-started example with Gloo Platform, you’ll only need a single k8s cluster active. However, if you already have multiple clusters in place, you can certainly use that configuration as well.
If you don’t have Gloo Platform installed, there is a simplified installation script available in the GitHub repo you cloned in the previous section. Before you walk through that script, you’ll need three pieces of information.
- Place a Gloo license key in the environment variable GLOO_GATEWAY_LICENSE_KEY. If you don't already have one of these, you can obtain it from your Solo account executive.
- Supply a reference to the repo where the hardened Solo images for Istio live. This value belongs in the environment variable ISTIO_REPO. You can obtain the proper value from this location once you're a Gloo Mesh customer or have activated a free trial.
- Supply a version string for Gloo Mesh Gateway in the environment variable GLOO_MESH_VERSION. For the tests we are running here, we use v2.3.4.
If you've never installed any Gloo Platform technology before, you will need to add the Gloo Platform Helm repository before the installation script below will work properly.
helm repo add gloo-platform https://storage.googleapis.com/gloo-platform/helm-charts
helm repo update
Now from the gloo-gateway-use-cases directory at the top level of the cloned repo, execute the setup script below. It will configure a local k3d cluster containing Gloo Platform and an underlying Istio deployment. The script will fail if any of the three environment variables above is not present.
./setup/setup.sh
The output from the setup script should resemble what you see below. If you require a more complex installation, a more complete Gloo Platform installation guide is available here.
INFO[0000] Using config file setup/k3d/gloo.yaml (k3d.io/v1alpha4#simple)
INFO[0000] portmapping '8080:80' targets the loadbalancer: defaulting to [servers:*:proxy agents:*:proxy]
INFO[0000] portmapping '8443:443' targets the loadbalancer: defaulting to [servers:*:proxy agents:*:proxy]
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-gloo'
INFO[0000] Created image volume k3d-gloo-images
INFO[0000] Starting new tools node...
INFO[0000] Starting Node 'k3d-gloo-tools'
INFO[0001] Creating node 'k3d-gloo-server-0'
INFO[0001] Creating LoadBalancer 'k3d-gloo-serverlb'
INFO[0001] Using the k3d-tools node to gather environment information
INFO[0001] Starting new tools node...
INFO[0001] Starting Node 'k3d-gloo-tools'
INFO[0002] Starting cluster 'gloo'
INFO[0002] Starting servers...
INFO[0002] Starting Node 'k3d-gloo-server-0'
INFO[0007] All agents already running.
INFO[0007] Starting helpers...
INFO[0008] Starting Node 'k3d-gloo-serverlb'
INFO[0014] Injecting records for hostAliases (incl. host.k3d.internal) and for 3 network members into CoreDNS configmap...
INFO[0016] Cluster 'gloo' created successfully!
INFO[0016] You can now use it like this:
kubectl config use-context k3d-gloo
kubectl cluster-info
*******************************************
Waiting to complete k3d cluster config...
*******************************************
Context "k3d-gloo" renamed to "gloo".
*******************************************
Installing Gloo Gateway...
*******************************************
Attempting to download meshctl version v2.3.4
Downloading meshctl-darwin-amd64...
Download complete!, validating checksum...
Checksum valid.
meshctl was successfully installed 🎉

Add the Gloo Mesh CLI to your path with:
  export PATH=$HOME/.gloo-mesh/bin:$PATH

Now run:
  meshctl install     # install Gloo Mesh management plane

Please see visit the Gloo Mesh website for more info:
  https://www.solo.io/products/gloo-mesh/
 INFO  💻 Installing Gloo Platform components in the management cluster
 SUCCESS  Finished downloading chart.
 SUCCESS  Finished installing chart 'gloo-platform-crds' as release gloo-mesh:gloo-platform-crds
 SUCCESS  Finished downloading chart.
 SUCCESS  Finished installing chart 'gloo-platform' as release gloo-mesh:gloo-platform
workspace.admin.gloo.solo.io "gloo" deleted
workspacesettings.admin.gloo.solo.io "default" deleted
*******************************************
Waiting to complete Gloo Gateway config...
*******************************************
🟢 License status
 INFO  gloo-mesh enterprise license expiration is 08 Mar 24 10:04 EST
 INFO  Valid GraphQL license module found
🟢 CRD version check
🟢 Gloo Platform deployment status
Namespace        | Name                           | Ready | Status
gloo-mesh        | gloo-mesh-redis                | 1/1   | Healthy
gloo-mesh        | gloo-mesh-mgmt-server          | 1/1   | Healthy
gloo-mesh        | gloo-telemetry-gateway         | 1/1   | Healthy
gloo-mesh        | gloo-mesh-agent                | 1/1   | Healthy
gloo-mesh        | gloo-mesh-ui                   | 1/1   | Healthy
gloo-mesh        | prometheus-server              | 1/1   | Healthy
gloo-mesh-addons | rate-limiter                   | 1/1   | Healthy
gloo-mesh-addons | redis                          | 1/1   | Healthy
gloo-mesh-addons | ext-auth-service               | 1/1   | Healthy
gloo-mesh        | gloo-telemetry-collector-agent | 1/1   | Healthy
🟢 Mgmt server connectivity to workload agents
Cluster | Registered | Connected Pod
gloo    | true       | gloo-mesh/gloo-mesh-mgmt-server-6c58598fcd-6c78n
Note that while we've set up Gloo Platform to support this entire exercise, Istio is installed as part of that setup. For the initial example, we will use only community Istio features.
Istio Example
In this first example, we'll configure routing for multiple teams using only community Istio resources. Along the way, we'll see some of the multi-tenancy problems that can surface in a shared environment, including:
- Unreachable routes for some services;
- Non-deterministic behavior for certain scenarios; and
- Unexpected routing when virtual service changes occur.
Our simple organization consists of three tenants:
- An operations team responsible for the platform itself (ops-team); and
- Two application teams (app-1 and app-2) responsible for their own sets of services.
Spin Up the Base Services
First, we'll create a namespace called ops-team to hold configuration owned by Operations. Then we will establish separate namespaces with a service instance for each of the application teams, app-1 and app-2. The services for each team are based on the Fake Service to keep this example as simple as possible. Fake Service instances simply respond to requests with pre-configured messages.
# Establish ops-team namespace
kubectl apply -f ./multitenant/common/01-ns-ops.yaml
# Deploy app-1 and app-2 services to separate namespaces
kubectl apply -f ./multitenant/common/02-app-1.yaml
kubectl apply -f ./multitenant/common/03-app-2.yaml
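In case you're curious, the sketch below shows roughly what one of those application manifests could look like. This is a hypothetical illustration rather than the literal contents of ./multitenant/common/02-app-1.yaml: the image tag and the NAME/MESSAGE environment variables are assumptions based on Fake Service's documented defaults, and the Service maps port 8080 (the port referenced by the routes later in this post) to the container.
# Hypothetical sketch of an app-1 Fake Service deployment; the real
# ./multitenant/common/02-app-1.yaml in the repo may differ in detail.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-1
  namespace: app-1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app-1
  template:
    metadata:
      labels:
        app: app-1
    spec:
      containers:
      - name: app-1
        image: nicholasjackson/fake-service:v0.25.2   # illustrative tag
        env:
        - name: NAME                  # service name echoed in responses
          value: "app-1"
        - name: MESSAGE               # body echoed in responses
          value: "Hello From App-1"
        ports:
        - containerPort: 9090         # fake-service listens on 9090 by default
---
apiVersion: v1
kind: Service
metadata:
  name: app-1
  namespace: app-1
spec:
  selector:
    app: app-1
  ports:
  - port: 8080          # port referenced by the VirtualService below
    targetPort: 9090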
Establish an Istio Gateway
Next, the operations team establishes an Istio Gateway in the ops-team namespace. This Gateway will be shared by both application teams:
kubectl apply -f ./multitenant/istio/01-app-gw.yaml
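We won't reproduce 01-app-gw.yaml here, but a Gateway for this example could be shaped roughly like the sketch below. The listener port and the istio: ingressgateway selector are assumptions based on a default Istio ingress gateway installation; the name and host match what the VirtualServices in the next step reference.
# Hypothetical sketch; the actual ./multitenant/istio/01-app-gw.yaml may differ.
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: app-gateway          # referenced below as ops-team/app-gateway
  namespace: ops-team
spec:
  selector:
    istio: ingressgateway    # bind to the default Istio ingress gateway pods
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "api.example.com"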
Establish an Istio Virtual Service
Now we'll establish two VirtualServices on our Gateway, one for each of our imaginary application teams. Each of these VSes will have two routes: one that matches on a specific URL prefix, and a catch-all route for all other requests. Establishing default routes on VSes is considered an Istio best practice.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: app-vs-1
  namespace: app-1
spec:
  hosts:
  - "api.example.com"
  gateways:
  - ops-team/app-gateway
  http:
  - name: "app-1-foo-route"
    match:
    - uri:
        prefix: "/foo"
    route:
    - destination:
        host: app-1.app-1.svc.cluster.local
        port:
          number: 8080
  - name: "app-1-catchall-route"
    route:
    - destination:
        host: app-1.app-1.svc.cluster.local
        port:
          number: 8080
Let's apply the VirtualService for just app-1 now:
kubectl apply -f ./multitenant/istio/02-app1-vs.yaml
Test the App1 Service
curl -H "host: api.example.com" localhost:8080/foo -i
/foo
route goes to app-1
as expected:HTTP/1.1 200 OK vary: Origin date: Tue, 20 Jun 2023 18:45:09 GMT content-length: 260 content-type: text/plain; charset=utf-8 x-envoy-upstream-service-time: 6 server: istio-envoy { "name": "app-1", "uri": "/foo", "type": "HTTP", "ip_addresses": [ "10.42.0.35" ], "start_time": "2023-06-20T18:45:09.234491", "end_time": "2023-06-20T18:45:09.236515", "duration": "2.0242ms", "body": "Hello From App-1", "code": 200 }
Configure and Test a Second VirtualService
Now apply the second team's VirtualService as well:
kubectl apply -f ./multitenant/istio/03-app2-vs.yaml
The /bar route behaves as expected and sends traffic to app-2…
curl -H "host: api.example.com" localhost:8080/bar
{
  "name": "app-2",
  "uri": "/bar",
  "type": "HTTP",
  "ip_addresses": [
    "10.42.0.36"
  ],
  "start_time": "2023-06-20T18:56:50.043017",
  "end_time": "2023-06-20T18:56:50.043289",
  "duration": "314.6µs",
  "body": "Hello From App-2",
  "code": 200
}
…but there's a problem lurking here: as far as the gateway is concerned, the catch-all route for app-2 doesn't exist. Now that we're working in a "shared environment", those requests that team2 expects to be routed to their app go to app-1 instead. And there's no indication of an error in the Gateway or VirtualService resources.
curl -H "host: api.example.com" localhost:8080/goto/app2
Team2 sees its expected requests route to app-1:
{
  "name": "app-1-default",
  "uri": "/goto/app2",
  "type": "HTTP",
  "ip_addresses": [
    "10.42.0.35"
  ],
  "start_time": "2023-06-20T19:14:49.185166",
  "end_time": "2023-06-20T19:14:49.185291",
  "duration": "125µs",
  "body": "Hello From App-1 Default",
  "code": 200
}
What’s REALLY Happening Here?
Why does Istio ignore the catch-all route defined by team2? This is actually a documented Istio limitation. In cases like this, where multiple tenants define similar routes across VirtualService resources, Istio chooses its route based on which one has been in existence the longest. We can see this by deleting the app-1 VS and then re-applying it. Taking this step-by-step:
- Delete the app-1 VS and note that the default route now sends traffic to app-2 as expected.
kubectl delete -f ./multitenant/istio/02-app1-vs.yaml
curl -H "host: api.example.com" localhost:8080/goto/app2
{
  "name": "app-2-default",
  "uri": "/goto/app2",
...snip...
- Now add back the app-1 VS and note that the default route does not resume sending traffic to app-1; it instead goes to app-2 (because that is now the "older" of the two routes).
kubectl apply -f ./multitenant/istio/02-app1-vs.yaml
curl -H "host: api.example.com" localhost:8080/goto/app1
{
  "name": "app-2-default",
  "uri": "/goto/app1",
...snip...
To summarize the problems we've observed with the Istio-only approach:
- Routes can be "lost" when there are VirtualService conflicts between resources, even with as few as two tenants.
- Race conditions between VirtualService resources, say in parallel branches of CI/CD pipelines, can result in the same logical configuration exhibiting different routing behaviors, non-deterministically, and without any indication of a problem.
Manage Multiple Tenants with Gloo Platform
Let's see how Gloo Platform addresses these problems. First, we'll remove the Istio-only configuration from the previous example:
kubectl delete -f ./multitenant/istio
Second, we'll use a Gloo Platform CRD called Workspace to lay down Kubernetes boundaries for multiple teams within the organization. These Workspace boundaries can span both Kubernetes clusters and namespaces. In our case we'll define three Workspaces: one for the ops-team that owns the overall service mesh platform, and two for the application teams, app1-team and app2-team, to whom we want to delegate routing responsibilities.
Below is a sample Workspace and its companion WorkspaceSettings for the app1-team. Note that it includes the app-1 Kubernetes namespace across all clusters. While there is only a single cluster present in this example, this Workspace would dynamically expand to include that same namespace on other clusters added to our mesh in the future. Note also that via the WorkspaceSettings, tenants can choose precisely which resources they are willing to export from their workspace and who is able to consume them. See the API reference for more details on workspace import and export.
apiVersion: admin.gloo.solo.io/v2
kind: Workspace
metadata:
  name: app1-team
  namespace: gloo-mesh
  labels:
    team-category: app-team
spec:
  workloadClusters:
  - name: '*'
    namespaces:
    - name: app-1
---
apiVersion: admin.gloo.solo.io/v2
kind: WorkspaceSettings
metadata:
  name: app1-team
  namespace: app-1
spec:
  exportTo:
  - workspaces:
    - name: ops-team
Apply the Workspace and WorkspaceSettings resources for all three teams:
kubectl apply -f ./multitenant/gloo/01-ws-opsteam.yaml
kubectl apply -f ./multitenant/gloo/02-ws-appteams.yaml
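For contrast, here is a hypothetical sketch of what the operations team's Workspace and WorkspaceSettings might look like. This isn't necessarily the literal contents of 01-ws-opsteam.yaml; the namespace list and the label-based selectors are assumptions intended to illustrate how the ops-team can import what the app teams export (and share its own resources back).
# Hypothetical sketch; the actual ./multitenant/gloo/01-ws-opsteam.yaml may differ.
apiVersion: admin.gloo.solo.io/v2
kind: Workspace
metadata:
  name: ops-team
  namespace: gloo-mesh
spec:
  workloadClusters:
  - name: '*'
    namespaces:
    - name: ops-team          # assumed: the ops-owned namespace
---
apiVersion: admin.gloo.solo.io/v2
kind: WorkspaceSettings
metadata:
  name: ops-team
  namespace: ops-team
spec:
  importFrom:
  - workspaces:
    - selector:
        team-category: app-team   # import from any workspace labeled as an app team
  exportTo:
  - workspaces:
    - selector:
        team-category: app-team   # share ops-owned resources (e.g., the gateway) back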
Establish a Virtual Gateway and Routing Rules
The third step in our process of demonstrating multi-tenancy with Gloo Platform is to lay down a VirtualGateway that selects the Istio ingress gateway on our cluster and delegates traffic to RouteTables (another Gloo Platform abstraction) that are owned by the ops-team.
kubectl apply -f ./multitenant/gloo/03-vg-httpbin.yaml
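As a rough illustration of the shape of this resource, a VirtualGateway for this scenario might look like the sketch below. This is not necessarily the literal contents of 03-vg-httpbin.yaml; the resource name, listener port, and workload selector are assumptions based on a default Istio ingress gateway deployment.
# Hypothetical sketch; the actual ./multitenant/gloo/03-vg-httpbin.yaml may differ.
apiVersion: networking.gloo.solo.io/v2
kind: VirtualGateway
metadata:
  name: north-south-gw        # illustrative name
  namespace: ops-team
spec:
  workloads:
  - selector:
      labels:
        istio: ingressgateway   # select the Istio ingress gateway pods
  listeners:
  - http: {}
    port:
      number: 80
    allowedRouteTables:
    - host: '*'                 # any host; the ops-team RouteTables narrow this down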
[Figure: VirtualGateway and RouteTable resources manage traffic in Gloo Platform.]
The fourth step is to establish a layer of delegating RouteTables. This is the heart of Gloo's multi-tenant support. The first set of RTs are owned by the ops-team and select the gateway established in the previous step. These RTs intercept requests with a prefix designated for their respective teams, /team1 and /team2, and then delegate to other RTs that are owned exclusively by those teams.
kubectl apply -f ./multitenant/gloo/04-rt-ops-delegating.yaml
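To make the delegation pattern concrete, here's a hypothetical sketch of an ops-team delegating RouteTable. It isn't a copy of 04-rt-ops-delegating.yaml; the resource and gateway names are illustrative, and namespace-based delegation selectors are just one of the selection styles Gloo Platform supports (selection by labels works as well).
# Hypothetical sketch; the actual ./multitenant/gloo/04-rt-ops-delegating.yaml may differ.
apiVersion: networking.gloo.solo.io/v2
kind: RouteTable
metadata:
  name: ops-delegation
  namespace: ops-team
spec:
  hosts:
  - 'api.example.com'
  virtualGateways:
  - name: north-south-gw        # the VirtualGateway sketched above
    namespace: ops-team
  http:
  - name: team1-routes
    matchers:
    - uri:
        prefix: /team1
    delegate:
      routeTables:
      - namespace: app-1        # only RouteTables owned by team 1 can match here
  - name: team2-routes
    matchers:
    - uri:
        prefix: /team2
    delegate:
      routeTables:
      - namespace: app-2        # only RouteTables owned by team 2 can match here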
Finally, we'll apply the RouteTables that are owned entirely by the application teams. They establish routes that are functionally identical to what we built in the Istio-only example, including default routes for each team's app. Those default routes led to the multi-tenancy issues we observed in the original example. But now, because they are deployed in delegated RTs, the default / routes no longer introduce any ambiguity or risk of race conditions in determining which route is appropriate.
kubectl apply -f ./multitenant/gloo/05-rt-team1.yaml,./multitenant/gloo/06-rt-team2.yaml
This is an example of one of the RouteTables owned by an application team.
apiVersion: networking.gloo.solo.io/v2
kind: RouteTable
metadata:
  name: app1-route
  namespace: app-1
spec:
  workloadSelectors: []
  http:
  - name: app1-foo-route
    matchers:
    - uri:
        prefix: /foo
    forwardTo:
      destinations:
      - ref:
          name: app-1
          namespace: app-1
        port:
          number: 8080
  - name: app1-default-route
    forwardTo:
      destinations:
      - ref:
          name: app-1-default
          namespace: app-1
        port:
          number: 8080
By using an intermediate delegating RT, we have completely removed the risk of conflicting routes from multiple tenants leading to confusion in Istio’s routing choices.
Test the Gloo Services
With the delegated RouteTables in place, requests intended for team1 or team2 are routed exactly as expected. Let's prove this out with a couple of curl commands. The first routes a request that would have previously triggered a conflict to the app-1-default service as expected.
curl -H "host: api.example.com" localhost:8080/team1/anything
{
  "name": "app-1-default",
  "uri": "/team1/anything",
  "type": "HTTP",
  "ip_addresses": [
    "10.42.0.43"
  ],
  "start_time": "2023-06-21T17:07:08.888625",
  "end_time": "2023-06-21T17:07:08.888746",
  "duration": "121.3µs",
  "body": "Hello From App-1 Default",
  "code": 200
}
The second test routes to the app-2-default service, again just as expected.
curl -H "host: api.example.com" localhost:8080/team2/anything
{
  "name": "app-2-default",
  "uri": "/team2/anything",
  "type": "HTTP",
  "ip_addresses": [
    "10.42.0.46"
  ],
  "start_time": "2023-06-21T17:11:29.713765",
  "end_time": "2023-06-21T17:11:29.713990",
  "duration": "225.9µs",
  "body": "Hello From App-2 Default",
  "code": 200
}
Multi-Tenant Observability
Gloo Platform also provides observability that respects these workspace boundaries. Launch the Gloo UI in your browser with this command:
meshctl dashboard
Then select Graph on the left navigation menu. Next to the Filter By: label, be sure to select all Workspaces, all Clusters, and all Namespaces. After a few seconds to allow for telemetry collection and processing, you'll see a graph like the one below. It shows you the traffic moving between the ingress gateway and the four services we established, across all three workspaces. (You may also want to fire off a few additional curl commands like the ones above to the gateway endpoint in order to make the statistics slightly more interesting.)
Exercise Cleanup
If you used the setup.sh script described earlier to establish your Gloo Platform environment for this exercise, then there is an easy way to tear down the environment as well. Just run this command:
./setup/teardown.sh
Learn More
In this blog post, we took an introductory tour of service mesh multi-tenancy features using Gloo Platform. All resources used to build the example in this post are available on GitHub.
Do you want to explore further how Solo and Gloo Platform can help you manage multi-tenant service mesh environments with best practices for traffic management, zero trust networking, and observability?
- Find hands-on training resources for both Gloo Platform and its underlying open-source components at Solo Academy
- Explore the full Gloo Platform documentation here
- Reach out to Solo experts on the Solo.io Slack community and particularly the #gloo-mesh channel
- Request a live demo or trial
- See video content on the solo.io YouTube channel