The Elephant (Payload) in the Room, Part 2: Handling Super-Sized Requests with Gloo API Gateway
Recently, a customer approached us with a problem. They use another vendor’s API gateway that checks the boxes for most of their requirements, with one notable exception: it fails on messages with elephantine payloads. They need to issue requests that post files as large as a gargantuan 100MB. And adding another dimension to this pachydermic pickle, they would like the gateway layer to simultaneously inject arbitrary custom headers into the upstream request. Can Gloo’s gateway technology help with such a problem?
In Part 1 of this post, we worked through this example using a dedicated Gloo Edge API gateway. Here in Part 2, we’ll work through the same example using Solo’s new Gloo API Gateway, built on top of Istio. In addition to offering equivalent gateway features, the Gloo API Gateway delivers the benefits of being integrated with the Istio control plane under Gloo management: multi-tenancy capabilities, superior cross-cluster operations, and more complete service observability.
But for this exercise, we’re strictly focused on the API Gateway as it can be adopted almost independently of any underlying service mesh. The larger benefits of Istio and Gloo Mesh are covered elsewhere.
So let’s get started. We invite you to follow along on your own Kubernetes cluster.
Prerequisites
To complete this guide, you’ll need a Kubernetes cluster and associated tools, plus an Istio deployment and an installation of Gloo Mesh Enterprise. We ran the tests in this blog on Gloo Mesh Enterprise v2.3.4 with Istio v1.17.2. We hosted all of this on a local instance of k3d v5.4.3.
You’ll need a license key to install Gloo Mesh Enterprise if you don’t already have one. You can obtain a key by initiating a free trial here.
For this exercise, we’ll also use some common CLI utilities like kubectl, curl, and git. Make sure these prerequisites are all available to you before jumping into the next section. We’re building this on macOS, but other platforms should be perfectly fine as well.
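If you’d like a quick sanity check that the tooling is in place, a few version commands do the job (the exact versions don’t matter much, as long as they are reasonably current):

# Confirm the CLI prerequisites are on the PATH
kubectl version --client
k3d version
curl --version | head -n 1
git --version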
Clone Github Repo
The resources required for this exercise are available in the gloo-gateway-use-cases repo on GitHub. Clone that to your workstation and switch to its top-level directory; the files for this exercise live in its large-payloads subdirectory:

git clone https://github.com/solo-io/gloo-gateway-use-cases.git
cd gloo-gateway-use-cases
Install Gloo API Gateway
Since we’re just evaluating the API Gateway component of Gloo, you’ll only need a single k8s cluster active. However, if you already have multiple clusters in place, you can certainly use that configuration as well.
If you don’t have Istio or Gloo installed, there is a simplified installation script available in the GitHub repo you cloned in the previous section. Before you run that script, you’ll need three pieces of information; an example of exporting them appears after the list.
- Place a Gloo license key in the environment variable GLOO_GATEWAY_LICENSE_KEY. If you don’t already have one of these, you can obtain it from your Solo account executive.
- Supply a reference to the repo where the hardened Solo images for Istio live, in the environment variable ISTIO_REPO. You can obtain the proper value from this location once you’re a Gloo Edge customer or have activated a free trial.
- Supply a version string for Gloo Mesh Gateway in the environment variable GLOO_MESH_VERSION. For the tests we are running here, we use v2.3.4.
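For convenience, you can export all three in your shell before running the script. The values below are placeholders, not real credentials or repo locations; substitute your own:

# Placeholders only -- replace with your actual license key and Solo-provided repo value
export GLOO_GATEWAY_LICENSE_KEY=<your-gloo-license-key>
export ISTIO_REPO=<solo-istio-image-repo>
export GLOO_MESH_VERSION=v2.3.4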
From the gloo-gateway-use-cases directory at the top level of the cloned repo, execute the setup script below. It will configure a local k3d cluster containing Istio with the Gloo API Gateway component activated. The script will fail if any of the three environment variables above is not present.
./setup/setup.sh
The output from the setup script should resemble what you see below. If you require a more complex installation, the complete Gloo Mesh installation guide is available here.
INFO[0000] Using config file setup/k3d/gloo.yaml (k3d.io/v1alpha4#simple)
INFO[0000] portmapping '8080:80' targets the loadbalancer: defaulting to [servers:*:proxy agents:*:proxy]
INFO[0000] portmapping '8443:443' targets the loadbalancer: defaulting to [servers:*:proxy agents:*:proxy]
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-gloo'
INFO[0000] Created image volume k3d-gloo-images
INFO[0000] Starting new tools node...
INFO[0000] Starting Node 'k3d-gloo-tools'
INFO[0001] Creating node 'k3d-gloo-server-0'
INFO[0001] Creating LoadBalancer 'k3d-gloo-serverlb'
INFO[0001] Using the k3d-tools node to gather environment information
INFO[0001] Starting new tools node...
INFO[0001] Starting Node 'k3d-gloo-tools'
INFO[0002] Starting cluster 'gloo'
INFO[0002] Starting servers...
INFO[0002] Starting Node 'k3d-gloo-server-0'
INFO[0007] All agents already running.
INFO[0007] Starting helpers...
INFO[0007] Starting Node 'k3d-gloo-serverlb'
INFO[0014] Injecting records for hostAliases (incl. host.k3d.internal) and for 3 network members into CoreDNS configmap...
INFO[0016] Cluster 'gloo' created successfully!
INFO[0016] You can now use it like this:
kubectl config use-context k3d-gloo
kubectl cluster-info
*******************************************
Waiting to complete k3d cluster config...
*******************************************
Context "k3d-gloo" renamed to "gloo".
*******************************************
Installing Gloo Gateway...
*******************************************
Attempting to download meshctl version v2.3.4
Downloading meshctl-darwin-amd64...
Download complete!, validating checksum...
Checksum valid.
meshctl was successfully installed 🎉

Add the Gloo Mesh CLI to your path with:
  export PATH=$HOME/.gloo-mesh/bin:$PATH

Now run:
  meshctl install     # install Gloo Mesh management plane

Please see visit the Gloo Mesh website for more info:
  https://www.solo.io/products/gloo-mesh/

 INFO  💻 Installing Gloo Platform components in the management cluster
 SUCCESS  Finished downloading chart.
 SUCCESS  Finished installing chart 'gloo-platform-crds' as release gloo-mesh:gloo-platform-crds
 SUCCESS  Finished downloading chart.
 SUCCESS  Finished installing chart 'gloo-platform' as release gloo-mesh:gloo-platform
workspace.admin.gloo.solo.io "gloo" deleted
workspacesettings.admin.gloo.solo.io "default" deleted
*******************************************
Waiting to complete Gloo Gateway config...
*******************************************

🟢 License status

 INFO  gloo-mesh enterprise license expiration is 08 Mar 24 10:04 EST
 INFO  Valid GraphQL license module found

🟢 CRD version check

🟢 Gloo Platform deployment status

Namespace        | Name                           | Ready | Status
gloo-mesh        | gloo-mesh-redis                | 1/1   | Healthy
gloo-mesh        | gloo-mesh-mgmt-server          | 1/1   | Healthy
gloo-mesh        | gloo-telemetry-gateway         | 1/1   | Healthy
gloo-mesh        | gloo-mesh-agent                | 1/1   | Healthy
gloo-mesh        | prometheus-server              | 1/1   | Healthy
gloo-mesh        | gloo-mesh-ui                   | 1/1   | Healthy
gloo-mesh-addons | redis                          | 1/1   | Healthy
gloo-mesh-addons | rate-limiter                   | 1/1   | Healthy
gloo-mesh-addons | ext-auth-service               | 1/1   | Healthy
gloo-mesh        | gloo-telemetry-collector-agent | 1/1   | Healthy

🟢 Mgmt server connectivity to workload agents

Cluster | Registered | Connected Pod
gloo    | true       | gloo-mesh/gloo-mesh-mgmt-server-6c58598fcd-qkw65
Install httpbin Application
httpbin is a great little REST service that can be used to exercise a variety of HTTP operations and echo elements of the request back to the consumer. We’ll use it throughout this exercise. First, we’ll install the httpbin service on our k3d cluster. Run:
kubectl create namespace httpbin --context gloo
kubectl --context gloo label namespace httpbin istio-injection=enabled
kubectl apply -f large-payloads/httpbin.yaml -n httpbin --context gloo
You should see:
namespace/httpbin created
namespace/httpbin labeled
serviceaccount/httpbin created
service/httpbin created
deployment.apps/httpbin created
You can confirm that the httpbin pod is running by searching for pods with an app label of httpbin:
kubectl get pods -l app=httpbin -n httpbin
And you will see:
NAME                       READY   STATUS    RESTARTS   AGE
httpbin-66cdbdb6c5-2cnm7   1/1     Running   0          29s
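If you’d like an optional smoke test before wiring up the gateway, you can port-forward the httpbin service and hit one of its endpoints directly. This is a sketch that assumes the service exposes port 8000, the same port we’ll reference later in the RouteTable:

# Forward the httpbin service locally and hit its /get endpoint
kubectl --context gloo -n httpbin port-forward svc/httpbin 8000:8000 &
sleep 2
curl -s localhost:8000/get
# Stop the port-forward when finished
kill %1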
Generate Payload Files
If you’d like to follow along with this exercise, we’ll test our service using some preposterously large payloads that we generate for ourselves. (You wouldn’t want us to flood your network with these behemoths when you cloned our GitHub repo, would you?) Generate them with the commands below; a quick size check follows the list.
- 1MB: base64 /dev/urandom | head -c 10000000 > large-payloads/1m-payload.txt
- 10MB: base64 /dev/urandom | head -c 100000000 > large-payloads/10m-payload.txt
- 100MB: base64 /dev/urandom | head -c 1000000000 > large-payloads/100m-payload.txt
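As an optional sanity check, you can confirm the generated files landed on disk with the byte counts produced by the commands above:

# Report the byte count of each generated payload file
wc -c large-payloads/*-payload.txt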
Create a Workspace
A Workspace is a key new feature in Gloo Mesh 2.0. By providing a team-oriented artifact “container”, Workspaces make it much easier to express policies that clearly delineate boundaries between resources owned by the various teams within your organization. The Workspaces you specify in turn generate Istio artifacts that enforce multi-tenant-aware policies. You can learn more about them here.
In our case, we’re focused strictly on gateway functionality and not so much on shared tenancy. So we’ll create a namespace and a single Workspace to reflect the domain of the ops-team that maintains our gateway capability.
apiVersion: v1
kind: Namespace
metadata:
  name: ops-team
---
apiVersion: admin.gloo.solo.io/v2
kind: Workspace
metadata:
  name: ops-team
  namespace: gloo-mesh
spec:
  workloadClusters:
    - name: '*'
      namespaces:
        - name: ops-team
        - name: gloo-mesh-gateways
        - name: gloo-mesh-addons
        - name: httpbin
---
apiVersion: admin.gloo.solo.io/v2
kind: WorkspaceSettings
metadata:
  name: ops-team
  namespace: ops-team
spec:
  options:
    eastWestGateways:
      - selector:
          labels:
            istio: eastwestgateway
You can create the Workspace above using this command:
kubectl apply -f large-payloads/workspace.yaml --context gloo
You should see results like this:
namespace/ops-team created
workspace.admin.gloo.solo.io/ops-team created
workspacesettings.admin.gloo.solo.io/ops-team unchanged
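Optionally, you can confirm both objects are in place. This is a minimal sketch; per the YAML above, the Workspace lives in the gloo-mesh namespace and the WorkspaceSettings in ops-team:

# List the Workspace and WorkspaceSettings we just applied
kubectl get workspaces.admin.gloo.solo.io -n gloo-mesh --context gloo
kubectl get workspacesettings.admin.gloo.solo.io -n ops-team --context gloo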
Establish a VirtualGateway
Let’s establish a Gloo Mesh VirtualGateway that we’ll attach to the default istio-ingressgateway that was configured when we installed our local Istio instance earlier. We’ll configure this gateway to handle our inbound, north-south traffic by selecting any RouteTables that are specified in the ops-team workspace. We’ll create such a RouteTable momentarily. Here is the VirtualGateway YAML:
apiVersion: networking.gloo.solo.io/v2
kind: VirtualGateway
metadata:
  name: north-south-gw
  namespace: ops-team
spec:
  workloads:
    - selector:
        labels:
          istio: ingressgateway
        cluster: gloo
  listeners:
    - http: {}
      port:
        name: http2
      allowedRouteTables:
        - host: '*'
          selector:
            workspace: ops-team
Now we’ll apply this configuration to establish the north-south gateway:
kubectl apply -f large-payloads/virtual-gateway.yaml --context gloo
That should yield a result like this:
virtualgateway.networking.gloo.solo.io/north-south-gw created
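If you want to double-check the gateway object, a simple listing works; this is just a sketch and assumes only that the VirtualGateway was created in the ops-team namespace as shown above:

# Confirm the VirtualGateway resource exists
kubectl get virtualgateways.networking.gloo.solo.io -n ops-team --context gloo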
Configure a RouteTable
RouteTables are a key Gloo API Gateway abstraction: they specify the routing policies to apply to requests. You can learn more about them in the request routing documentation here. For this exercise, we require just a simple RouteTable that attaches to our north-south-gw and routes all inbound requests to our httpbin service.
apiVersion: networking.gloo.solo.io/v2
kind: RouteTable
metadata:
  name: httpbin
  namespace: ops-team
spec:
  hosts:
    - '*'
  virtualGateways:
    - name: north-south-gw
      namespace: ops-team
      cluster: gloo
  workloadSelectors: []
  http:
    - name: httpbin
      labels:
        big-payload: "true"
      forwardTo:
        destinations:
          - ref:
              name: httpbin
              namespace: httpbin
              cluster: gloo
            port:
              number: 8000
Let’s apply this configuration:
kubectl apply -f large-payloads/route-table.yaml --context gloo
And observe that the RouteTable was created as expected:
routetable.networking.gloo.solo.io/httpbin created
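At this point, north-south routing to httpbin should already work for ordinary requests. As an optional check, you can hit httpbin’s /get endpoint through the gateway; this sketch assumes the 8080:80 port mapping the setup script configured for the k3d load balancer:

# Expect a 200 response code once the RouteTable is in place
curl -s -o /dev/null -w "%{response_code}\n" localhost:8080/get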
Add a TransformationPolicy
A Gloo API Gateway TransformationPolicy provides an API for specifying a set of transformation rules to apply to an inbound request. These policies are quite expressive, and you can learn more about them here. In our case, we will simply inject a single custom header x-my-custom-header with the value my-custom-value.
apiVersion: trafficcontrol.policy.gloo.solo.io/v2
kind: TransformationPolicy
metadata:
  name: big-payloads
  namespace: ops-team
spec:
  applyToRoutes:
    - route:
        labels:
          big-payload: "true"
  config:
    request:
      injaTemplate:
        headers:
          x-my-custom-header:
            text: 'my-custom-value'
Note that we’ve specified a label selector on this policy to apply it to any route that has the label big-payload set to true. Look back at the RouteTable in the previous section to see that this label is specified there.
Now let’s apply the policy:
kubectl apply -f large-payloads/transformation-policy.yaml --context gloo
Here’s the expected result:
transformationpolicy.trafficcontrol.policy.gloo.solo.io/big-payloads created
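Before moving on to large payloads, you can optionally confirm the header injection by asking httpbin to echo the request headers back (again assuming the 8080:80 mapping from the setup script). The response should include X-My-Custom-Header with the value my-custom-value:

# httpbin's /headers endpoint echoes the request headers it received
curl -s localhost:8080/headers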
Test, Test, Test
Managing with Marlin
Let’s not start with our full-grown, whale-sized payload. Instead, to get going, we’ll use a small clownfish-sized payload that we’ll call Marlin. Note that Marlin swims upstream with its microscopic 100-byte payload with no problem. In addition, you can see the X-My-Custom-Header with my-custom-value appearing in the request headers that httpbin echoes back to the caller. So far, so good.
% curl -i -s -w "@large-payloads/curl-format.txt" -X POST -d "@large-payloads/100b-payload.txt" localhost:8080/post

HTTP/1.1 200 OK
server: istio-envoy
date: Mon, 13 Jun 2022 21:49:37 GMT
content-type: application/json
content-length: 970
access-control-allow-origin: *
access-control-allow-credentials: true
x-envoy-upstream-service-time: 31

{
  "args": {},
  "data": "",
  "files": {},
  "form": {
    "1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890": ""
  },
  "headers": {
    "Accept": "*/*",
    "Content-Length": "100",
    "Content-Type": "application/x-www-form-urlencoded",
    "Host": "localhost:8080",
    "User-Agent": "curl/7.79.1",
    "X-B3-Parentspanid": "8b4b17495cac4e95",
    "X-B3-Sampled": "0",
    "X-B3-Spanid": "2a556c09898f4ebf",
    "X-B3-Traceid": "bf810fb1d1979a228b4b17495cac4e95",
    "X-Envoy-Attempt-Count": "1",
    "X-Envoy-Internal": "true",
    "X-Forwarded-Client-Cert": "By=spiffe://gloo/ns/httpbin/sa/httpbin;Hash=b4de294baf5ff32636866c1c3cb971fc504f795c352d07ea92fd0b9c0640c978;Subject=\"\";URI=spiffe://gloo/ns/istio-gateways/sa/istio-ingressgateway-service-account",
    "X-My-Custom-Header": "my-custom-value"
  },
  "json": null,
  "origin": "10.42.0.1",
  "url": "http://localhost:8080/post"
}

time_total:  0.059175s
response_code:  200
payload_size:  100
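The -w flag above references a small curl write-out format file from the repo. We haven’t reproduced the repo’s exact file here, but a minimal sketch that produces the same three fields could be a single line of curl’s built-in write-out variables:

\ntime_total:  %{time_total}s\nresponse_code:  %{response_code}\npayload_size:  %{size_upload}\n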
Cruising with Crush?
Marlin was no problem, so let’s move up the food chain by trying a sea turtle-sized payload that we’ll call Crush. Crush carries a 1MB payload, so he may create some cacophony.
curl -i -s -w "@large-payloads/curl-format.txt" -X POST -d "@large-payloads/1m-payload.txt" localhost:8080/post
This is not the response we wanted to see from Crush:
HTTP/1.1 100 Continue

HTTP/1.1 413 Payload Too Large
content-length: 17
content-type: text/plain
date: Mon, 13 Jun 2022 21:51:50 GMT
server: istio-envoy
connection: close

payload too large

time_total:  0.058075s
response_code:  413
payload_size:  2308844
An HTTP 413 response indicates that we have overflowed Envoy’s default buffer size for a given request. Learn more about Envoy buffering and flow control here and here. It is possible to increase the Envoy buffer size, but this must be considered very carefully since multiple large requests with excessive buffer sizes could result in memory consumption issues for the proxy.
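To make that trade-off concrete, here is a purely hypothetical snippet, not something we apply in this exercise, in the style of Envoy’s buffer HTTP filter, whose max_request_bytes field caps how much of a request body the proxy will hold in memory:

# Hypothetical Envoy HTTP filter config, shown only to illustrate the memory trade-off
name: envoy.filters.http.buffer
typed_config:
  "@type": type.googleapis.com/envoy.extensions.filters.http.buffer.v3.Buffer
  max_request_bytes: 104857600   # 100MB potentially held in proxy memory per in-flight request

Raising a limit like that means every concurrent large request could pin that much memory in the proxy, which is why we avoid buffering altogether below.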
The good news is that for this use case we don’t require buffering of the request payload at all, since we are not contemplating transformations on the payload itself, which is what we see most commonly in cases like this. Instead, we’re simply delivering a large file to a service endpoint. The only transformation we require of the Envoy proxy is to add X-My-Custom-Header to the request, which we have carried along since the original example.
Re-calibrating for Crush
The example repo contains a version of our TransformationPolicy that sets the optional passthrough flag. It is commonly used in cases like this to instruct the proxy NOT to buffer the payload at all, but simply to pass it through unchanged to the upstream service. Here is the relevant fragment of the updated TransformationPolicy that enables massive message payloads:

  config:
    request:
      injaTemplate:
        passthrough: {}   # <<====== NOTE the addition of the passthrough directive
        headers:
          x-my-custom-header:
            text: 'my-custom-value'
Now apply the “passthrough” version of the TransformationPolicy:
kubectl apply -f large-payloads/transformation-policy-with-passthrough.yaml --context gloo
Expect this response:
transformationpolicy.trafficcontrol.policy.gloo.solo.io/big-payloads created
Note that for this and all subsequent examples, we’ll suppress the native httpbin output because it echoes back the entire original request payload, and life is too short to watch all of that scroll by. Instead, we’ll rely on curl facilities to show just the response bits we care about: the total processing time, the HTTP response code, and, for confirmation, the size of the request payload.
Now let’s retry Crush and watch him cruise all the way to Sydney with no constrictions:
% curl -i -s -w "@large-payloads/curl-format.txt" -X POST -d "@large-payloads/1m-payload.txt" localhost:8080/post -o /dev/null

time_total:  0.445716s
response_code:  200
payload_size:  10000000
Bashing with Bruce
Of course, the most fearsome payloads of all swim with Bruce, the great white shark. We’ll throw our bulkiest, Bruce-sized payloads at the gateway: 10MB first, and then our ultimate goal of 100MB.
% curl -s -w "@large-payloads/curl-format.txt" -X POST -T "large-payloads/10m-payload.txt" localhost:8080/post -o /dev/null

time_total:  4.401224s
response_code:  200
payload_size:  100000000
Finally, we achieve our goal of handling a 100MB payload:
% curl -s -w "@large-payloads/curl-format.txt" -X POST -T "large-payloads/100m-payload.txt" localhost:8080/post -o /dev/null

time_total:  35.882104s
response_code:  200
payload_size:  1000000000
Bruce ran the gauntlet with no problems, thanks to our passthrough directive causing the proxy to bypass buffering of the payload.
Cleanup
If you’ve followed along closely with this example and would like to clean up your work, you can simply delete the k3d cluster you created at the beginning by running the teardown script in our example repo:
./setup/teardown.sh
Alternatively, if you brought your own cluster, then you can simply delete the Kubernetes namespaces we’ve created over the course of this exercise.
kubectl delete namespace httpbin
kubectl delete namespace ops-team
You should see a response like this confirming that the resources have been deleted:
namespace "httpbin" deleted namespace "ops-team" deleted
Learn More about API Gateways
- Review Part 1 of this blog post, where we solved the same large-request problem using Gloo Edge.
- Explore the documentation for Gloo API Gateway.
- Request a live demo or trial for Gloo API Gateway.
- See video content on the solo.io YouTube channel.
- Questions? Join the Solo.io Slack community.