The Elephant (Payload) in the Room: Handling Super-Sized Requests with the Gateway API and Envoy Proxy
A customer recently approached us with a problem. They use another vendor’s API gateway that satisfies most of their requirements with one notable exception: it fails on messages with elephantine payloads. They need to issue requests that post files of up to a gargantuan 100MB. Could Gloo’s gateway technology help with such a problem? Another dimension to this pachyderm pickle is that they also wanted the gateway layer to inject some arbitrary custom headers into the upstream request.
The purpose of this blog post is to try and wrap our arms around this oversized issue. We’ll work through an example to determine if Envoy proxy with a Gloo Gateway control plane can help us work through this problem. As a bonus, we’ll learn a bit more about how Gloo supports the new Kubernetes Gateway API standard, and even extends it in places where the core standard isn’t expressive enough to meet all our requirements. Feel free to follow along with this exercise in your own Kubernetes cluster.
Prerequisites
To complete this guide, you’ll need a Kubernetes cluster and associated tools, plus an instance of Gloo Gateway Enterprise. Note that there is a free and open source version of Gloo Edge, and it will work with this example as well. We ran the tests in this blog on Gloo Gateway Enterprise v1.17. Use this guide if you need to install Gloo Edge Enterprise. And if you don’t already have access to the Enterprise bits of Gloo Edge, you can request a free trial here.
We used GKE with Kubernetes v1.21.11 to test this guide, although any recent version with any Kubernetes provider should suffice.
For this exercise, we’ll also use some common CLI utilities like kubectl, curl, and git. Make sure these prerequisites are all available to you before jumping into the next section. I’m building this on macOS, but other platforms should be perfectly fine as well.
Clone GitHub Repo
The resources required for this exercise are available in the gloo-gateway-use-cases repo on GitHub. Clone that to your workstation and switch to the large-payload-gateway-api example directory:
git clone https://github.com/solo-io/gloo-gateway-use-cases.git
cd gloo-gateway-use-cases/large-payload-gateway-api
Install httpbin Application
HTTPBIN is a great little REST service that can be used to test a variety of HTTP operations and echo the response elements back to the consumer. We’ll use it throughout this exercise. First, we’ll install the httpbin service on our kind cluster. Run:
kubectl apply -f 01-httpbin-svc.yaml
You should see:
namespace/httpbin created
serviceaccount/httpbin created
service/httpbin created
deployment.apps/httpbin created
You can confirm that the httpbin pod is running by searching for pods with an app label of httpbin:
kubectl get pods -l app=httpbin -n httpbin
And you will see something like this:
NAME                       READY   STATUS    RESTARTS   AGE
httpbin-66cdbdb6c5-2cnm7   1/1     Running   0          21m
Configure a Gateway Listener
A Gateway object represents a host:port listener that the proxy will expose to accept ingress traffic. We’ll establish a Gateway resource that sets up an HTTP listener to expose routes from all our namespaces. Gateway custom resources like this are a core part of the Gateway API standard.
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: http
spec:
  gatewayClassName: gloo-gateway
  listeners:
    - protocol: HTTP
      port: 8080
      name: http
      allowedRoutes:
        namespaces:
          from: All
Now we’ll apply this to our kube cluster:
kubectl apply -f 02-gateway.yaml
Expect to see this response:
gateway.gateway.networking.k8s.io/http created
Now we can confirm that the Gateway has been activated:
kubectl get gateway http -n gloo-system
You’ll see this sort of response from a kind cluster:
NAME   CLASS          ADDRESS   PROGRAMMED   AGE
http   gloo-gateway             True         42s
You can also confirm that Gloo Gateway has spun up an Envoy proxy instance in response to the creation of this Gateway object by checking for the gloo-proxy-http deployment:
kubectl get deployment gloo-proxy-http -n gloo-system
Expect a response like this:
NAME              READY   UP-TO-DATE   AVAILABLE   AGE
gloo-proxy-http   1/1     1            1           4m12s
Generate Payload Files
If you’d like to follow along with this exercise, we’ll test our service using some preposterously large payloads that we generate for ourselves. (You wouldn’t want us to flood your network with these behemoths when you cloned our GitHub repo, would you?)
- 10MB:
echo {\"payload\": \" $(base64 -i /dev/urandom | head -c 10000000) \"\} > 10m-payload.txt
- 100MB:
echo {\"payload\": \" $(base64 -i /dev/urandom | head -c 100000000) \"\} > 100m-payload.txt
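The tests later in this post also upload a tiny file named 100b-payload.txt, but the command that creates it is not shown. Here is one way it could be generated. This is a minimal sketch and an assumption on our part, not necessarily the command used for the original tests. It encodes 75 random bytes so that the base64 text is exactly 100 characters, and it strips any line wrapping so the result is the same on macOS and Linux.

```shell
# 75 random bytes encode to exactly 100 base64 characters (75 is divisible by 3, so no padding).
# tr removes the line wrapping that some base64 implementations add.
body=$(head -c 75 /dev/urandom | base64 | tr -d '\n')

# Write a one-line JSON document with the random text as its payload.
printf '{"payload": "%s"}\n' "$body" > 100b-payload.txt

# Size on disk: 13 bytes of JSON prefix + 100 + 2 closing bytes + 1 newline = 116.
wc -c < 100b-payload.txt
```

A file built this way is 116 bytes, which is consistent with the Content-Length that httpbin echoes back in the small-payload test.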
Install a Basic HTTPRoute
Let’s begin our routing configuration with the simplest possible route to expose httpbin’s portfolio of operations through our gateway proxy. You can sample the public version of this service here.
HTTPRoute is one of the new Kubernetes CRDs introduced by the Gateway API, as documented here. We’ll start by introducing a simple HTTPRoute for our service. This route manages routing and policy enforcement on behalf of an upstream service, like httpbin in this case. We will begin with a simple configuration that forwards requests for any path on the api.example.com virtual host to the httpbin service.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: httpbin
  namespace: httpbin
spec:
  parentRefs:
    - name: http
      namespace: gloo-system
  hostnames:
    - api.example.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: httpbin
          port: 8000
Let’s apply this HTTPRoute now.
kubectl apply -f 03-httpbin-route.yaml
This is the expected response:
httproute.gateway.networking.k8s.io/httpbin created
Test the Simple Route with Curl
Now that the HTTPRoute is in place and attached to our Gateway object, let’s use curl with the -i option to display the response along with the HTTP response code and headers. Since we plan to test the response of the gateway and service with large payload submissions, we’ll use the httpbin /post endpoint.
curl -is -X POST -d '{"payload: "my-small-payload"}' -H "Host: api.example.com" http://localhost:8080/post
Note that if you’re running on a cloud-provisioned cluster, you won’t access your service via port-forwarding to your localhost. Instead, you can obtain your proxy’s address using the glooctl CLI like this: glooctl proxy url. Then your curl command would be expressed like this:
curl -is -X POST -d '{"payload: "my-small-payload"}' -H "Host: api.example.com" $(glooctl proxy url)/post
When you use the appropriate technique for your Kubernetes environment, then this command should complete successfully:
HTTP/1.1 200 OK
server: envoy
date: Thu, 13 Jun 2024 23:06:04 GMT
content-type: application/json
content-length: 437
access-control-allow-origin: *
access-control-allow-credentials: true
x-envoy-upstream-service-time: 37

{
  "args": {},
  "data": "",
  "files": {},
  "form": {
    "{\"payload: \"my-small-payload\"}": ""
  },
  "headers": {
    "Accept": "*/*",
    "Content-Length": "30",
    "Content-Type": "application/x-www-form-urlencoded",
    "Host": "api.example.com",
    "User-Agent": "curl/8.6.0",
    "X-Envoy-Expected-Rq-Timeout-Ms": "15000"
  },
  "json": null,
  "origin": "10.244.0.11",
  "url": "http://api.example.com/post"
}
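The localhost examples assume that local port 8080 is forwarded to the proxy, a step this post does not show. The sketch below illustrates one way to set that up and to keep the gateway address in a single place. The GATEWAY_URL variable and both function names are hypothetical helpers introduced for illustration. The port-forward target assumes the gloo-proxy-http deployment and the 8080 listener created earlier.

```shell
# Base URL of the gateway. Defaults to the port-forwarded address used in this post.
# On a cloud-provisioned cluster, set it first: export GATEWAY_URL="$(glooctl proxy url)"
GATEWAY_URL="${GATEWAY_URL:-http://localhost:8080}"

# Forward local port 8080 to the Envoy proxy deployment.
# This blocks while forwarding, so run it in a separate terminal.
forward_gateway() {
  kubectl port-forward deployment/gloo-proxy-http -n gloo-system 8080:8080
}

# POST the given string to httpbin's /post endpoint through the gateway.
post_body() {
  curl -is -X POST -d "$1" -H "Host: api.example.com" "$GATEWAY_URL/post"
}
```

With the port-forward running, post_body 'hello' should produce a response shaped like the one above.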
Inject a Custom Header using Gateway API Extensions
We’ll satisfy our customer’s custom header request by making one more change to our gateway configuration before we start ramping up to larger payloads.
We’ll use a transformation to modify our HTTP request to inject the custom header X-My-Custom-Header with value my-custom-value. This modified request will then be passed along to the backend httpbin service. This type of requirement is common in scenarios where an integration is required, and you’d like to hide some of the required details from the consuming service.
Gloo Gateway and the Gateway API give us multiple avenues for satisfying this requirement. For this simple scenario, we could use a requestHeaderModifier filter directly in the HTTPRoute we built earlier. The standard defines this filter with Core support for HTTPRoute rules, so conformant gateways are expected to implement it. Gloo Gateway fully supports it.
But in this case we’re going to use Gloo Gateway’s more fully featured transformation libraries. Why? Solo has a long history of providing sophisticated transformation policies with its gateway products, including capabilities like in-line Inja templates that can dynamically compute values from multiple sources in request and response transformations.
The core Gateway API standard does not offer this level of sophistication in its transformations, but there is good news. The community has learned from its experience with earlier, similar APIs like the Kubernetes Ingress API. The Ingress API did not offer extension points, which locked users strictly into the set of features envisioned by the creators of the standard. This ensured limited adoption of that API. So while many cloud-native API gateway vendors like Solo support the Ingress API, its active development has largely stopped.
The new Gateway API offers the core functionality described in this blog post. But just as importantly, it delivers extensibility by allowing vendors to define their own Kubernetes CRDs to express policy. In the case of transformations, Gloo Gateway users can now leverage Solo’s long history of innovation to add important capabilities to the gateway, while staying within the boundaries of the new standard. For example, Solo’s extensive transformation library is now available in Gloo Gateway via Gateway API extensions like RouteOption and VirtualHostOption.
We’ll add this to our gateway configuration by adding a RouteOption describing the transformation, and by adding a reference to the new RouteOption in our existing HTTPRoute.
apiVersion: gateway.solo.io/v1
kind: RouteOption
metadata:
  name: route-option-httpbin
  namespace: httpbin
spec:
  options:
    stagedTransformations:
      early:
        requestTransforms:
          - matcher:
              prefix: /
            requestTransformation:
              transformationTemplate:
                headers:
                  X-My-Custom-Header:
                    text: 'my-custom-value'
Here is the extension filter we add in our HTTPRoute to activate our transformation. While these ExtensionRef filters are part of the Gateway API standard, the Solo RouteOption extension they point to is not part of the standard.
filters:
  # Attach our transformation spec to this route
  - type: ExtensionRef
    extensionRef:
      group: gateway.solo.io
      kind: RouteOption
      name: route-option-httpbin
Now let’s apply these changes:
kubectl apply -f 04-route-option-header.yaml
kubectl apply -f 05-httpbin-route-ext.yaml
Here are the expected results:
routeoption.gateway.solo.io/route-option-httpbin created
httproute.gateway.networking.k8s.io/httpbin configured
Test, Test, Test
Managing with Marlin
Let’s not start with our full-grown, whale-sized payload. Instead, we’ll create a small clownfish-sized payload (we’ll call it Marlin) to get going. Note that Marlin swims upstream with its microscopic 100-byte payload with no problem. In addition, you can see X-My-Custom-Header with the value my-custom-value in the request headers that httpbin echoes back to the caller. So far, so good.
% curl -is -w "@curl-format.txt" -X POST -T "100b-payload.txt" -H "Host: api.example.com" http://localhost:8080/post
HTTP/1.1 200 OK
server: envoy
date: Fri, 14 Jun 2024 00:04:48 GMT
content-type: application/json
content-length: 619
access-control-allow-origin: *
access-control-allow-credentials: true
x-envoy-upstream-service-time: 11

{
  "args": {},
  "data": "{\"payload\": \"aSR8VIIW3aEZjEjYHg6EeDTayVHwLt9XV+QYoDe+IlE/+sWW07TZsek5KvCiPhm9NmCVRQh5l8oST1MDCfcx2eUP5V73u2a53CjE\"}\n",
  "files": {},
  "form": {},
  "headers": {
    "Accept": "*/*",
    "Content-Length": "116",
    "Host": "api.example.com",
    "User-Agent": "curl/8.6.0",
    "X-Envoy-Expected-Rq-Timeout-Ms": "15000",
    "X-My-Custom-Header": "my-custom-value"
  },
  "json": {
    "payload": "aSR8VIIW3aEZjEjYHg6EeDTayVHwLt9XV+QYoDe+IlE/+sWW07TZsek5KvCiPhm9NmCVRQh5l8oST1MDCfcx2eUP5V73u2a53CjE"
  },
  "origin": "10.244.0.11",
  "url": "http://api.example.com/post"
}

time_total: 0.025844s
response_code: 200
payload_size: 116
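The -w "@curl-format.txt" option tells curl to read its write-out template from a file, but the contents of that file are not shown in this post. A template along these lines should produce the three summary lines above. Treat it as a sketch: the labels are taken from the output shown here, while mapping payload_size to curl’s size_upload variable is our assumption.

```shell
# Create a curl write-out template (the labels mirror the output shown in this post).
# The leading blank line puts the summary on its own line after the response body.
cat > curl-format.txt <<'EOF'

time_total: %{time_total}s
response_code: %{response_code}
payload_size: %{size_upload}
EOF
```

The time_total, response_code, and size_upload names are standard curl write-out variables. size_upload reports the number of bytes curl actually uploaded, which can be less than the file size when the server rejects the request early.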
Cruising with Crush?
Marlin was no problem, so let’s move up the food chain by trying a sea turtle-sized payload that we’ll call Crush. Crush carries a 10MB payload, so he may create some cacophony.
curl -is -w "@curl-format.txt" -X POST -T "10m-payload.txt" -H "Host: api.example.com" http://localhost:8080/post
This is not the response we wanted to see from Crush:
HTTP/1.1 100 Continue

HTTP/1.1 413 Payload Too Large
content-length: 17
content-type: text/plain
date: Fri, 14 Jun 2024 00:08:37 GMT
server: envoy
connection: close

payload too large
time_total: 0.580871s
response_code: 413
payload_size: 2624556
An HTTP 413 response indicates that we have overflowed Envoy’s default 1MB buffer size for a given request. Learn more about Envoy buffering and flow control here and here. It is possible to increase the Envoy buffer size, but this must be considered very carefully since multiple large requests with excessive buffer sizes could result in memory consumption issues for the proxy.
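To make the size arithmetic concrete, the sketch below compares each generated payload file against the default limit. It assumes the 1MB default means 1 MiB (1,048,576 bytes). The variable and function names are hypothetical helpers, not part of Envoy or Gloo.

```shell
# Assumption: the default buffer limit is 1 MiB (1048576 bytes).
ENVOY_DEFAULT_BUFFER_BYTES=$((1024 * 1024))

# Succeeds (exit status 0) when the file is larger than the assumed default limit.
exceeds_default_buffer() {
  size=$(($(wc -c < "$1")))   # arithmetic expansion strips any padding from wc
  [ "$size" -gt "$ENVOY_DEFAULT_BUFFER_BYTES" ]
}

for f in 100b-payload.txt 10m-payload.txt 100m-payload.txt; do
  [ -f "$f" ] || continue     # skip payloads that have not been generated
  if exceeds_default_buffer "$f"; then
    echo "$f: larger than the default buffer, expect a 413 if the proxy buffers the body"
  else
    echo "$f: fits within the default buffer"
  fi
done
```

By this measure Marlin fits comfortably, while Crush is roughly ten times larger than the limit, which matches the 413 above.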
The good news is that for this use case we don’t require buffering of the request payload at all, since we are not contemplating transformations on the payload itself. That is the situation we see most commonly with large uploads like this. Instead, we’re simply delivering a large file to a service endpoint. The only transformation we require of the Envoy proxy is to add X-My-Custom-Header to the input, which we have carried along since the original example.
Note that if you’d still prefer the approach of increasing Envoy’s buffer size to handle large payloads, there is an API in Gloo Edge for that, too. Check out the perConnectionBufferLimitBytes setting in the ListenerOptions API. This can be managed on a per-gateway level, as documented here. But generally speaking, eliminating buffering altogether offers superior performance and less risk.
Re-calibrating for Crush
Let’s re-calibrate with a RouteOption that sets the optional Gloo Gateway passthrough flag. It is commonly used in use cases like this to instruct the proxy NOT to buffer the payload at all, but simply to pass it through unchanged to the upstream service.
Here is the change to the RouteOption’s transformation spec to enable massive message payloads:
requestTransforms:
  - matcher:
      prefix: /
    requestTransformation:
      transformationTemplate:
        passthrough: {}   # <<====== NOTE the addition of the passthrough directive
        headers:
          x-my-custom-header:
            text: 'my-custom-value'
Now apply the “passthrough” version of the RouteOption:
kubectl apply -f 06-route-option-passthrough.yaml
Expect this response:
routeoption.gateway.solo.io/route-option-httpbin configured
Note that for this and all subsequent examples, we’ll suppress the native httpbin output because it wants to echo back the entire original request payload. And life is too short to watch all of that scroll by. Instead, we’ll rely on curl facilities to show just the response bits we care about: the total processing time, the HTTP response code, and the size of the request payload.
Now let’s retry Crush and watch him cruise all the way to Sydney with no constrictions:
% curl -s -w "@curl-format.txt" -X POST -T "10m-payload.txt" -H "Host: api.example.com" http://localhost:8080/post -o /dev/null
time_total: 0.610423s
response_code: 200
payload_size: 10000018
Bashing with Bruce
Of course, the most fearsome payloads of all swim with Bruce, the great white shark. We’ll smash our bulkiest, Bruce-sized payloads against the proxy with our ultimate goal of 100MB.
% curl -is -w "@curl-format.txt" -X POST -T "100m-payload.txt" -H "Host: api.example.com" http://localhost:8080/post -o /dev/null
time_total: 5.865839s
response_code: 200
payload_size: 100000018
Even Bruce ran the gauntlet with no problems, thanks to our passthrough directive causing the proxy to bypass buffering of the payload. Increasing the payload size by an order of magnitude, from 10MB to 100MB, caused no issues.
Cleanup
If you’d like to clean up the work you’ve done, you can either delete the entire Kubernetes cluster you created earlier, or simply delete the Kubernetes resources we’ve created over the course of this exercise and uninstall Gloo Gateway.
kubectl delete -f 01-httpbin-svc.yaml,02-gateway.yaml,03-httpbin-route.yaml,06-route-option-passthrough.yaml
glooctl uninstall --all
You should see a response like this that confirms the resources have been deleted.
namespace "httpbin" deleted
serviceaccount "httpbin" deleted
service "httpbin" deleted
deployment.apps "httpbin" deleted
gateway.gateway.networking.k8s.io "http" deleted
httproute.gateway.networking.k8s.io "httpbin" deleted
routeoption.gateway.solo.io "route-option-httpbin" deleted
Uninstalling Gloo Gateway...
Removing Gloo system components from namespace gloo-system...
Gloo was successfully uninstalled.
Learn More
- Explore the documentation for Gloo Gateway.
- Request a live demo or trial for Gloo Gateway Enterprise.
- See video content on the solo.io YouTube channel.