Open Source Service Mesh Hub – Technical Overview

Solo.io Engineering | April 8, 2020

Today, we announced the open source release of Service Mesh Hub, representing a big step forward in our effort to simplify the experience of working with service meshes in complex enterprise environments. As Idit mentioned in her announcement, this project follows in the footsteps of our previous work on SuperGloo, as well as our prior releases of Service Mesh Hub and our broader collaboration with the service mesh community.  

 

A multi-cluster, multi-mesh management plane

Service Mesh Hub is designed to scale with your operations, from a single mesh on a single Kubernetes cluster to multiple service meshes spanning many clusters. It consists of a set of components that run on a single cluster, often referred to as your management plane cluster.

You can register a cluster with Service Mesh Hub, and it will handle the communications with other clusters – discovering what is running, pushing out configurations, scraping metrics, and more. You don’t need to switch Kubernetes contexts between clusters in order to update the configuration of your service mesh or application, which makes it much simpler to run and scale your operations. 

 

Discovery

When a cluster is registered, Service Mesh Hub starts discovery. The first task of discovery is to find any service meshes that are installed on the cluster. When it finds the control plane for a service mesh, discovery will write a Mesh resource to the management plane cluster, linked to the KubernetesCluster resource that was written during cluster registration. Currently, Service Mesh Hub discovers and manages Istio and Linkerd meshes, with plans to support more. 

Discovery then looks for workloads that are associated with the mesh, such as a deployment that has created a pod with a sidecar proxy for that mesh. It will write a MeshWorkload resource to the management plane cluster representing this workload. 

Finally, discovery also looks for services that are exposing the workloads of a mesh, and as before, writes a MeshService resource to the management plane cluster. 

At this point, the management plane has a complete view of the meshes, services, and workloads across your multi-cluster, multi-mesh environment. 
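
The three discovery passes described above can be pictured with a small model. This is an illustrative Python sketch, not Service Mesh Hub code: the `inventory` dictionary, the `discover` function, and the record fields are all hypothetical stand-ins for what discovery reads from a registered cluster and writes to the management plane.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mesh:
    name: str
    cluster: str    # the registered cluster the control plane runs on
    mesh_type: str  # e.g. "istio" or "linkerd"


@dataclass(frozen=True)
class MeshWorkload:
    name: str
    mesh: str
    labels: tuple   # pod labels as sorted (key, value) pairs


@dataclass(frozen=True)
class MeshService:
    name: str
    mesh: str
    workloads: tuple  # names of the workloads this service exposes


def discover(cluster, inventory):
    """Run the three discovery passes over a hypothetical cluster inventory."""
    # Pass 1: one Mesh record per control plane found on the cluster.
    meshes = [Mesh(f"{cp['type']}-{cluster}", cluster, cp["type"])
              for cp in inventory["control_planes"]]
    by_type = {m.mesh_type: m for m in meshes}

    # Pass 2: a deployment is a mesh workload if its pods carry that mesh's sidecar.
    workloads = [
        MeshWorkload(d["name"], by_type[d["sidecar"]].name,
                     tuple(sorted(d["labels"].items())))
        for d in inventory["deployments"]
        if d.get("sidecar") in by_type
    ]

    # Pass 3: a service is a mesh service if its selector matches a mesh workload.
    services = []
    for s in inventory["services"]:
        selector = set(s["selector"].items())
        backing = [w for w in workloads if selector and selector <= set(w.labels)]
        if backing:
            services.append(MeshService(s["name"], backing[0].mesh,
                                        tuple(w.name for w in backing)))
    return meshes, workloads, services


inventory = {
    "control_planes": [{"type": "istio"}],
    "deployments": [
        {"name": "reviews-v1", "sidecar": "istio", "labels": {"app": "reviews"}},
        {"name": "batch-job", "sidecar": None, "labels": {"app": "batch"}},
    ],
    "services": [
        {"name": "reviews", "selector": {"app": "reviews"}},
        {"name": "batch", "selector": {"app": "batch"}},
    ],
}
meshes, workloads, services = discover("remote-cluster", inventory)
```

In this toy inventory only `reviews-v1` has a sidecar, so only it and the `reviews` service end up in the management plane's view; the `batch` deployment and service are ignored.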

 

Virtual Meshes

In order to enable multi-cluster configuration, users will group multiple meshes together into an object called a VirtualMesh. The virtual mesh contains a few pieces of configuration that facilitate cross-cluster communications. 

In order for a virtual mesh to be considered valid, Service Mesh Hub will first try to establish trust based on the trust model defined by the user: is there complete shared trust with a common root and identity, or is there limited trust between clusters, with traffic gated by egress and ingress gateways? Service Mesh Hub ships with an agent that helps facilitate cross-cluster certificate signing requests safely, to minimize the operational burden around managing certificates.

Once trust has been established, Service Mesh Hub will start federating services so that they are accessible across clusters. Behind the scenes, Service Mesh Hub will handle the networking — possibly through egress and ingress gateways, and possibly affected by user-defined traffic and access policies — and ensure requests to the service will resolve and be routed to the right destination. Users can fine-tune which services are federated where by editing the virtual mesh. 
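
To make the idea of federation concrete, here is a minimal Python sketch of what a federation plan might look like. Everything in it is an assumption for illustration: the `VirtualMesh` fields, the `federation_plan` function, the `service.mesh` naming scheme, and the choice to federate every service to every other member mesh. It does not reflect how Service Mesh Hub names or routes federated services.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualMesh:
    name: str
    meshes: tuple     # names of the member meshes
    trust_model: str  # hypothetical labels: "shared" or "limited"


def federation_plan(virtual_mesh, services_by_mesh):
    """Map each member mesh to the remote services it should be able to resolve.

    Assumes the most permissive mode: every service is made visible to
    every other mesh in the group.
    """
    plan = {}
    for target in virtual_mesh.meshes:
        plan[target] = sorted(
            f"{svc}.{source}"  # hypothetical naming scheme for a remote service
            for source in virtual_mesh.meshes
            if source != target
            for svc in services_by_mesh.get(source, [])
        )
    return plan


vm = VirtualMesh("demo", ("istio-a", "istio-b"), "shared")
plan = federation_plan(vm, {"istio-a": ["reviews"], "istio-b": ["ratings"]})
print(plan)  # {'istio-a': ['ratings.istio-b'], 'istio-b': ['reviews.istio-a']}
```

A finer-grained plan, where only selected services are federated to selected meshes, would replace the inner loop with a per-service filter.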

As of this release, Service Mesh Hub supports creating a virtual mesh with multiple Istio 1.5 control planes across multiple clusters. In the future, the team is planning to add support for more types of meshes, and to enable virtual meshes that include different types of meshes (e.g., AWS App Mesh and Istio).

 

Traffic and Access Policies

Service Mesh Hub enables users to write simple configuration objects to the management plane to enact traffic and access policies between services, across any cluster under management. These objects are designed to be translated into the underlying mesh configuration, abstracting the mesh-specific complexity away from the user.

A TrafficPolicy applies between a set of sources (mesh workloads) and destinations (mesh services), and is used to describe rules like “when A sends POST requests to B, add a header and set the timeout to 10 seconds” or “for every request to services on cluster C, increase the timeout and add retries”. As of this release, traffic policies support timeouts, retries, CORS, traffic shifting, header manipulation, fault injection, subset routing, weighted destinations, and more. Note that some meshes don’t support all of these features; Service Mesh Hub will translate as best it can into the underlying mesh configuration, or report an error back to the user.
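
The two example rules above can be modeled in a few lines of Python. This is a conceptual sketch only: the `TrafficPolicy` fields, the label-based selectors, and the "later policy wins" merge order are assumptions made for illustration, not the schema or semantics of the actual resource.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TrafficPolicy:
    sources: dict                 # label selector for source workloads; {} matches all
    destinations: dict            # label selector for destination services; {} matches all
    method: Optional[str] = None  # restrict to one HTTP method; None matches all
    timeout_s: Optional[float] = None
    retries: Optional[int] = None
    add_headers: dict = field(default_factory=dict)


def _selects(selector, labels):
    """True if every key/value in the selector is present in the labels."""
    return all(labels.get(k) == v for k, v in selector.items())


def resolve(policies, source, destination, method):
    """Merge the settings of every policy that applies to this request.

    Later policies override earlier ones (an assumption of this sketch).
    """
    effective = {"timeout_s": None, "retries": None, "headers": {}}
    for p in policies:
        if not (_selects(p.sources, source) and _selects(p.destinations, destination)):
            continue
        if p.method is not None and p.method != method:
            continue
        if p.timeout_s is not None:
            effective["timeout_s"] = p.timeout_s
        if p.retries is not None:
            effective["retries"] = p.retries
        effective["headers"].update(p.add_headers)
    return effective


policies = [
    # "for every request to services on cluster C, increase the timeout and add retries"
    TrafficPolicy(sources={}, destinations={"cluster": "C"}, timeout_s=30, retries=3),
    # "when A sends POST requests to B, add a header and set the timeout to 10 seconds"
    TrafficPolicy(sources={"app": "A"}, destinations={"app": "B"}, method="POST",
                  timeout_s=10, add_headers={"x-source": "A"}),
]
post = resolve(policies, {"app": "A"}, {"app": "B", "cluster": "C"}, "POST")
get = resolve(policies, {"app": "A"}, {"app": "B", "cluster": "C"}, "GET")
```

Here the POST request from A to B picks up both policies (10-second timeout, 3 retries, one added header), while the GET request only matches the cluster-wide policy.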

An AccessPolicy also applies between sources (this time representing identities) and destinations, and is used to finely control which services are allowed to communicate. On the virtual mesh, a user can specify a global policy to restrict access, which then requires users to define access policies in order to enable communication to services.
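
The interaction between the global restriction and individual access policies amounts to a default-deny check. The Python sketch below shows that logic under stated assumptions: `AccessPolicy`, its fields, and `is_allowed` are hypothetical names, and identities and destinations are plain strings rather than real mesh identities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPolicy:
    source_identities: frozenset  # e.g. service account names
    destinations: frozenset       # destination service names


def is_allowed(policies, restrict_by_default, identity, destination):
    """Decide whether an identity may call a destination service."""
    if not restrict_by_default:
        # No global restriction on the virtual mesh: everything is allowed.
        return True
    # Global restriction enabled: some policy must explicitly permit the call.
    return any(
        identity in p.source_identities and destination in p.destinations
        for p in policies
    )


policies = [AccessPolicy(frozenset({"productpage"}), frozenset({"reviews"}))]
print(is_allowed(policies, True, "productpage", "reviews"))  # True
print(is_allowed(policies, True, "productpage", "ratings"))  # False
```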

With traffic and access policies, Service Mesh Hub gives users a powerful language to dictate how services should communicate, even within complex multi-cluster, multi-mesh applications. 

 

CLI Tooling

Service Mesh Hub is tackling really hard problems related to multi-cluster networking and configuration, so to shorten your learning curve it comes with a command line tool called meshctl. This tool provides interactive commands that make it easier to author your first virtual mesh, register a cluster, or create a traffic or access policy. Once you’ve authored config, it also has a “describe” command to help you understand how your workloads and services are affected by your policies.

Finally, we understand that playing around with tools that require setting up multiple clusters and managing multiple meshes is not easy, so we included a few commands to help get started immediately. You can run “meshctl demo init” to set up two kind clusters locally, and we’ll be expanding the set of demos we include to make it easy to dive right in. 

 

Learn More

To get started, watch this demo from Christian Posta. If you have any questions or feedback, we’d love to hear from you!

Here are some resources to look at if you’d like to learn more:
