Hoot [Episode 4]: AWS App Mesh

March 6, 2020

Hoot is a live video series hosted roughly every two weeks where engineers unbox various cloud-native technologies, dig into features of new releases, and answer your questions live. To kick off the launch of Hoot, we start with a series called Get to Know Service Mesh, since service mesh is the latest buzzword in our ecosystem. The questions that come up most often include: What is it? Why do I need it? And which one should I choose?

Get to Know Service Mesh

To help explain service mesh, this series will explore the different service meshes, explain each one's architectural approach and unique capabilities, contrast them, and provide guidance on how to navigate the landscape when choosing a mesh (or meshes) for your applications.

Covered in this series:

Episode 4: Get to Know AWS App Mesh

Speaker: Eitan Yarmush, Software Engineer

Transcript

Welcome to the Hoot series Get to Know Service Mesh. I'm Eitan Yarmush, one of the developers here at solo.io, and I'm here to talk about AWS App Mesh.

In previous episodes we discussed Istio with Christian Posta, Linkerd with Rick Ducott, and Consul with Yuval. Continuing that series, I'm going to give a quick unboxing of AWS App Mesh and its features, as well as what makes it special and a good contender in the market. So with that in mind, let's jump right in.

So, AWS App Mesh is really the main vendor mesh on the market right now, and because of that it's able to integrate very closely with other AWS services, which can be very nice if you're already running workloads on EC2 or ECS, if you're using other services such as CloudWatch or X-Ray for observability, or if you're using AWS Outposts for on-prem. With that said, it's actually the first mesh built from the ground up to work in a non-Kubernetes environment, meaning ECS and EC2 as I mentioned earlier, which is a big pro, as well as AWS Fargate, the newest offering from Amazon, which is more of a serverless container offering.

And if we just quickly skim the docs here: as I said, you can use it with the other AWS services. It's built on top of Envoy and, just like Istio and Linkerd, it uses a sidecar model. However, the Envoy build they ship, as well as the mesh itself, is not open source as of this moment, so we don't know exactly which proxy they are running, only that it is Envoy.

So if we look quickly at their architecture diagram, we can see what separates App Mesh from some of the other meshes, which are traditionally Kubernetes-native service meshes. There is the data plane, which consists of all of the containers and their sidecars, and then there is the control plane, which in Istio's case is made up of components like Pilot and Citadel, and which in Linkerd handles the routing and the policy as well. In the case of App Mesh, however, the control plane does not live in Kubernetes but rather in a centralized location: to interact with AWS App Mesh, you actually make API calls to Amazon. So the control plane does not live in your cluster, but as I said earlier, that makes it easier to add non-Kubernetes workloads to the mesh.

As it says here, these little orange boxes represent the sidecar proxies, the Envoy instances, with App Mesh running centrally. If we look at the features really quick: open source, visibility, traffic control. In terms of traffic control, App Mesh allows you to do canary-style routing, which I'm going to show here in a minute, as well as retries and some other basic traffic management policies. It's also worth saying that it's fully managed by AWS, so using the mesh features themselves is completely free if you're already using AWS.

So with that, I think we're going to jump right into a demo. The demo that I'm following, so you can check it out at home, is this one; there's the link right there, and I'll leave it up for a few seconds while I talk about it. App Mesh, as I said earlier, runs within Amazon, it doesn't actually run in your cluster, so configuring it requires communicating with the AWS API. Originally, if you wanted to configure App Mesh, you either had to do it through the API or through the web console, which I have up here. This will get populated later; we'll come back to it. However, they recently created a controller to translate CRDs into their API objects, which is very useful, as well as a sidecar injector.

As I said earlier, App Mesh is a mesh that operates using Envoy sidecars, and originally you had to inject them yourself, but they have created a sidecar injector to do that for you. It's worth pointing out, though, that the documentation has a very good guide, "Getting started with AWS App Mesh and Kubernetes" under EKS, that shows how to manually add a sidecar to your pods: it walks through exactly what's happening, what all of the different environment variables do, and how to bootstrap the App Mesh environment from the ground up without the help of the controller and the sidecar injector, so you can get more comfortable with the changes the sidecar injector is actually making. I wouldn't recommend that for normal workflows, but as a tutorial I think it's quite useful.
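
To give a rough idea of what that manual setup involves, here is a minimal sketch of a pod with the Envoy sidecar added by hand. The image URI, tag, and mesh and node names are placeholders, and the environment variable shown is my recollection of how the sidecar is pointed at its virtual node, so check the docs for the real values:

    # Hypothetical pod with a manually added App Mesh Envoy sidecar.
    # Image URI, tag, and mesh/node names are placeholders; the App Mesh docs
    # list the real region-specific image and the full set of environment variables.
    apiVersion: v1
    kind: Pod
    metadata:
      name: my-app
    spec:
      containers:
        - name: app
          image: my-app:latest
          ports:
            - containerPort: 9080
        - name: envoy
          image: <account-id>.dkr.ecr.<region>.amazonaws.com/aws-appmesh-envoy:<tag>
          env:
            # Points this Envoy at the virtual node it fronts.
            - name: APPMESH_VIRTUAL_NODE_NAME
              value: mesh/my-mesh/virtualNode/my-app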

The other thing to point out is that this tutorial assumes you already have EKS running with App Mesh access. To do that, I actually used eksctl. You can download eksctl right here, you can see it's a Weaveworks product, you can install it using brew, it's super simple, and it allows you to bootstrap EKS clusters very quickly. As I said earlier, you're really able to run AWS App Mesh with any kind of workload, but for this unboxing I have chosen to do it on EKS, because that's the environment we work with the most here at Solo.io, and it's definitely the most popular one for service mesh.

Like I said, I installed the cluster with eksctl; for anyone who's curious about what command I used, this is the one. The important things here are the App Mesh access and the version: the version has to be greater than 1.12, and the App Mesh access is what allows the workloads to actually reach the App Mesh API in AWS. Unfortunately, this cluster creation takes about 20 minutes, so I ran it beforehand. So with that, we'll just jump right in.
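
For reference, the invocation looked something like the sketch below. The cluster name is arbitrary and the --appmesh-access flag is how eksctl granted App Mesh IAM permissions at the time of this recording, so double-check the current eksctl documentation before relying on it:

    # Sketch of the eksctl command (cluster name and region are placeholders).
    eksctl create cluster \
      --name appmesh-demo \
      --region us-east-2 \
      --version 1.13 \
      --appmesh-access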

So if I go ahead and get my pods in all namespaces, I do not have much running: just the standard aws-node, CoreDNS, and kube-proxy pods. Now that that's all done, I'm going to go ahead and try out App Mesh with the controller. It looks like it needs to be 1.11 or later; I have 1.13 running, plus jq and OpenSSL. The first thing the tutorial wants us to do is create and launch the controller that I talked about earlier and initialize the custom resource definitions, so I'm going to go ahead and do that. As we see, it created our new custom resource definitions, the controller, the roles and binding it needs to run that controller, and the appmesh-system namespace, which it seems is where it put the controller.

So if we now do a kubectl get pods on the appmesh-system namespace, we see that we have the controller running. If we also go ahead and get all of our CRDs, we see that we now have three: meshes, virtual nodes, and virtual services. These are the three main resources that App Mesh uses to configure routing and other traffic management. We can confirm that this is correct by running this command. It successfully rolled out, good, and we already have the CRDs.
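
For anyone following along, the checks I'm running look roughly like this; the controller's deployment name is an assumption and may differ depending on how you installed it:

    # The controller pod should be running in the appmesh-system namespace.
    kubectl get pods -n appmesh-system

    # The App Mesh CRDs registered by the controller.
    kubectl get crd | grep appmesh.k8s.aws

    # Wait for the controller rollout to finish (deployment name is an assumption).
    kubectl rollout status deployment/app-mesh-controller -n appmesh-system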

And now, as I said earlier, we need to install the sidecar injector. This is where the actual mesh comes into play; that's the console I was showing earlier. The first thing we're going to create is the mesh, so in the command line I'm going to set my mesh name to "my-mesh", just for the sake of it. So that's my mesh name, and the region for this tutorial is going to be us-east-2, so we're going to export the mesh region as us-east-2. And now we're going to go ahead and install the sidecar injector.
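
In the terminal, that amounts to exporting two variables before running the injector's install script. The exact variable names are my recollection of that script's inputs, so treat them as an assumption:

    # Variable names are an assumption based on the injector install script.
    export MESH_NAME=my-mesh
    export MESH_REGION=us-east-2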

That's just waiting for the injector to finish deploying, and there you go. So now, if we get pods again in the appmesh-system namespace, we should see that there are two pods running: the controller and the injector. The injector, as I said earlier, is a mutating admission webhook that captures pod scheduling requests and adds the containers that App Mesh needs to intercept all the traffic. It adds the Envoy proxy, as well as an init container that changes some of the pod's networking so that all of the network traffic is captured by the Envoy proxy.

So with that done, we can actually configure App Mesh. The first step is going to be creating the mesh, so I'm just going to copy the mesh CRD here. It does not tell me where to put it, so I'm going to assume the mesh might be a global, cluster-wide resource. We're just going to copy this, and on a Mac we can copy and paste it straight into kubectl apply -f. And my-mesh was created successfully.
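
The manifest being applied is essentially just a named Mesh object. This is a sketch from memory, and the apiVersion in particular is an assumption that depends on the controller version:

    # Minimal Mesh resource; the apiVersion is an assumption for this
    # generation of the controller.
    apiVersion: appmesh.k8s.aws/v1beta1
    kind: Mesh
    metadata:
      name: my-mesh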

Now let's see how we can verify that that happened. Typically with Kubernetes resources you can do a get on them and there might be a status field or something like that, so let's just go ahead and do that. The mesh is active, according to the status; that's good news. So let's go into the console and refresh, and there we go, we have my-mesh. Okay, now as I said earlier, this mesh doesn't have any resources in it yet. The three main resources are virtual services, virtual routers, and virtual nodes, and we have only created the mesh resource so far.

The purpose of the controller is to sync the custom resources that you write into Kubernetes with the AWS API. So let's continue with the tutorial. It looks like it wants you to create a virtual service. We don't have any services yet, so I'm not 100% sure this will work; the next part of the tutorial is actually where you create the services. But let's go ahead and try it. First we create a namespace for it: "kubectl create namespace appmesh-app". Unfortunately the example manifest will not apply cleanly as-is because the namespace in it is set incorrectly, so let's just open up Visual Studio and make a new window here.

Let's just open up a new window, drag it out, see where that goes. Can we drag this out? No, it doesn't seem to want to do that. So let's just open it up in here, paste that in, and then change the namespace to appmesh-app. We don't need to save that. Cool. And now we're going to go ahead and apply that as well.
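
To give a feel for what's being applied, a virtual service manifest for this controller looks roughly like the sketch below. The service name, route, and target node are placeholders, and the v1beta1 field names are my best recollection rather than something copied from the tutorial:

    # Hypothetical VirtualService; names and schema details are assumptions.
    apiVersion: appmesh.k8s.aws/v1beta1
    kind: VirtualService
    metadata:
      name: my-svc.appmesh-app.svc.cluster.local
      namespace: appmesh-app
    spec:
      meshName: my-mesh
      routes:
        - name: route-to-my-svc
          http:
            match:
              prefix: /
            action:
              weightedTargets:
                - virtualNodeName: my-node
                  weight: 1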

Awesome. And so now, if we go back into the console, that virtual service should have been created for us. No. Well, then there might be an error, so let's go see. Okay: "kubectl get virtualservices.appmesh.k8s.aws -n appmesh-app -o yaml" so we can take a look at that status, and there is no status on it. It makes sense that it wouldn't have worked, because there's no router for it; it's pointing to something that doesn't exist. It's interesting that the tutorial wants you to do it like that, so I think this is more of a "show you what it can do" step. Now what we're going to do is go to the next part of the tutorial, which is deploying the actual service, or app, to get injected by App Mesh so we can actually handle all of the routing.

So let's go ahead and do that. We're going to delete this virtual service by doing "kubectl delete namespace appmesh-app", which should also delete the virtual services. And now we're going to deploy the sample app, and that will actually go ahead and create everything.

We have done all of the prerequisites, so let's go ahead and see everything that this is going to apply, because it's going to apply a whole mesh. If we just copy and paste that, it looks like it's going to create a namespace for us called "appmesh-demo", which has the webhook injection label enabled; that label is what tells the sidecar injector to inject any pods in this namespace. We're going to create a mesh, and we're going to create virtual nodes. A virtual node is representative of the workload as a service: how a given workload is exposed via service discovery, as well as the backends it is able to talk to, meaning the servers that this workload is allowed to communicate with.

We have one of those for our color gateway, our color teller, and our specific colors. So we're just going to go ahead and apply this, and we can look more into it once it is running in our cluster. So we're going to copy that, paste, and let it run.
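
To make that concrete, the injection label on the namespace and one of those virtual node manifests look roughly like the sketch below. The label key and the v1beta1 field names are my recollection rather than the tutorial's exact text, so check them against the repo you're actually applying:

    # Namespace carrying the injection label (the label key is an assumption).
    apiVersion: v1
    kind: Namespace
    metadata:
      name: appmesh-demo
      labels:
        appmesh.k8s.aws/sidecarInjectorWebhook: enabled
    ---
    # Hypothetical VirtualNode: how the workload is discovered, what it listens
    # on, and which virtual services it is allowed to call (its backends).
    apiVersion: appmesh.k8s.aws/v1beta1
    kind: VirtualNode
    metadata:
      name: colorgateway
      namespace: appmesh-demo
    spec:
      meshName: color-mesh
      listeners:
        - portMapping:
            port: 9080
            protocol: http
      serviceDiscovery:
        dns:
          hostName: colorgateway.appmesh-demo.svc.cluster.local
      backends:
        - virtualService:
            virtualServiceName: colorteller.appmesh-demo.svc.cluster.local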

So now, if we go ahead and get namespaces, you can see that we have an appmesh-demo namespace running now. If we do "kubectl get pods -n appmesh-demo", we have six pods, and if you notice, each one of these shows READY 2/2. The second container running in each of these pods is the Envoy proxy container, which handles all of the in-mesh routing, so it looks like the injection happened properly.

We can also see if it will show us everything, including the custom resources, and it did. We see all of those pods, as well as services for each of them, and more importantly, here are our App Mesh resources. We have two meshes: "my-mesh" is actually a leftover from the first part, but the color mesh is the one that was applied here, along with a virtual node for each workload and virtual services, which hold the routing rules.

If we go back into our console, into meshes, we see there's now this newly created color mesh. If we go to virtual nodes, we have our six virtual nodes; each one of these represents a unique mesh workload. We can see that they have unique DNS names corresponding to their Kubernetes service names; you can tell from the "name.namespace.svc.cluster.local" pattern. We can also see here that the gateway is able to communicate with two of the other nodes.

Then if we look here at virtual services, we can see that this is actually what handles the routing and sets up the weights, and we'll look further into that in a second. Let's keep running through the guide. We need to be able to curl from within the cluster, and this is what is going to allow us to do that, so we're going to run it. It looks like it's going to create a pod, most likely one injected into the appmesh-demo namespace, that has curl running in it.

So now we're going to ping the color gateway's 9080/color endpoint. If we go ahead and look at our App Mesh configuration, we have our virtual nodes, and as we see here, the one that we are actually going to hit is the color gateway in the appmesh-demo namespace.

When we reach out to the gateway, it calls the /color route, which is going to route to our virtual service here, the color teller, which will return either blue or black. So let's go ahead and run that command.
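
The in-cluster check looks something like the sketch below. The curl image and the exact kubectl run flags are placeholders rather than the tutorial's own invocation, and the service name assumes the demo exposes the gateway as colorgateway:

    # Run a throwaway pod with curl in the demo namespace (image is a placeholder).
    kubectl -n appmesh-demo run curler --image=curlimages/curl --restart=Never --command -- sleep 3600

    # Hit the color gateway through its Kubernetes service name.
    kubectl -n appmesh-demo exec curler -- curl -s http://colorgateway:9080/color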

Oh no, that's not good. What happened here? Let's go ahead and find out. Strange. Okay, let's see, everything looks good up to this point. Unfortunately, it looks like we're going to have to do a bit of debugging, which is always the fun part. It seems like we could not connect to the gateway on port 9080, so I'm guessing that our curler pod is not injected for some reason or another. Let's go look at it: "kubectl get pods -n appmesh-demo". As we can see, the curler is running in the namespace, but it is not in fact injected, so it looks like this demo unfortunately does not add the sidecar to that pod by default. I'm wondering if I can just delete the pod, but if it was scheduled on its own with no replica set behind it, I might have to quickly create a deployment for it. Let me just check the replica sets.

There is one. Oh, so let's go look at the deployment. It looks like we may have gotten lucky and can just delete the pod and have it be rescheduled; yes, that looks like the case. So if we just go ahead and delete the pod, it will get recreated, and when it gets initialized it should have the proxy running in it... and it does not. I was under the impression that the namespace being labeled would handle that for us.

But let's dig further. Let's get the appmesh-demo namespace. It looks like we do have the sidecar injector enabled there, so I'm wondering why this particular deployment does not want to be injected. If we look at the deployment, it does not have any unique labels aside from "run". So let's look at the other deployments and see if they have unique labels.

Color gateway does not in fact have any special labels, so this is very confusing to me. Let's see if we can edit the curler deployment, take away that "run" label, and maybe it will work then. I apologize for this; again, I am not too sure what is happening. We're going to go ahead and get the pods in appmesh-demo again. Editing the curler deployment should have rolled the pod, but we'll try again. I'm going to quickly create a new deployment from this one and see if we can get this going quickly; I don't want to spend too much time on this.
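
For anyone hitting the same thing at home, the usual checks are that the namespace really carries the injection label and that the pod was created after the injector webhook came up. A sketch of those checks, with the label key again being an assumption, looks like this:

    # Confirm the namespace carries the injection label (key is an assumption).
    kubectl get namespace appmesh-demo --show-labels

    # If it is missing, add it and recreate the curler pod so the webhook sees it.
    kubectl label namespace appmesh-demo appmesh.k8s.aws/sidecarInjectorWebhook=enabled --overwrite
    kubectl -n appmesh-demo delete pod -l run=curler

    # Then check the new pod; it should show 2/2 containers once injected.
    kubectl -n appmesh-demo get pods -l run=curler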

[Continues debugging]

I will continue to debug this on my own. As I said earlier, I hope this was helpful for everyone. Feel free to reach out to me on Twitter, my handle is my name, @Eitan_Yarmush, or find Solo.io on Twitter, and come check out our Slack. We're always having interesting conversations about all things mesh and otherwise. So thank you so much and have a wonderful day.