Fast and Furious: Gateway API at Scale with Envoy Proxy and Gloo Gateway
Gloo Gateway is Solo’s cloud-native API gateway based on the open source Envoy Proxy and the Kubernetes Gateway API. It provides authentication (using OAuth, JWT, API keys, HTTP basic auth, to name a few), authorization (with OPA or custom approaches), a web application firewall (WAF – based on ModSecurity), function discovery for OpenAPI and AWS Lambda, and advanced transformations.
“So, how many instances of Envoy proxy do I need?”
That’s one of the first questions asked by prospective users. Of course, each use case is different. But we generally answer that they need two instances to get high availability. However, it is rare for users to deploy multiple instances for performance reasons. This blog post provides hard data in the form of benchmark results for common usage patterns that Solo sees within its customer base.
Test Strategy
We’ll start in this blog post by benchmarking Gloo Gateway without any filters, using only basic HTTP requests to delegate to a lightweight, internally deployed httpbin service.
Then we’ll show the impact of mixing in popular gateway features, like authentication and authorization using API keys, OAuth and WAF; transformations; and rate limiting.
We’ll combine all these operations into a single test run, use k6 load testing to orchestrate the scenarios, and produce Grafana dashboards that summarize our results.
Test Environment
Our test environment lived in a CPU-optimized, 32-CPU EKS cluster. We deployed an httpbin workload as the lone application service in the cluster. We fronted this service with a single instance of Envoy proxy, allocated 10 CPUs and 8 GiB of memory.
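As an illustrative sketch, that allocation expressed as standard Kubernetes resource requests and limits on the proxy container would look roughly like this; the exact mechanism we used to size the Gloo-managed proxy is not reproduced here, so treat the stanza below as an assumption rather than the configuration we ran with.

resources:
  requests:
    cpu: "10"       # 10 CPUs, matching the allocation described above
    memory: 8Gi
  limits:
    cpu: "10"
    memory: 8Gi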
Gloo Gateway is frequently deployed in production with multiple Envoy replicas, for both scalability and high availability. But one of the primary objectives of this test was to explore the limits of individual Envoy proxies under Gloo control.
We used a small suite of k6 test runners to drive the test traffic and to help with analyzing the results, using 2 CPUs with 8GiB of memory.
For more complex test scenarios, we activated an 8-replica suite of Gloo ExtAuth processors. Traffic is delegated from Envoy to the ExtAuth service only when a relevant authNZ policy is present (e.g., API keys, OAuth, Open Policy Agent).
A 4-replica suite of Gloo Rate Limiting processors was also available in the test environment. As with ExtAuth, traffic was only delegated to these services when a rate limiting policy was active. Note that neither ExtAuth nor Rate Limiting was active for the baseline scenario.
Test Scenarios
We executed six separate test scenarios for just over an hour, 62 minutes to be precise. Here is a list of all the scenarios, with individual durations:
- Baseline Delegation: Delegate to httpbin with no authNZ or rate limiting constraints (12 minutes).
- AuthNZ with API Keys: Add an API key-based authNZ policy to baseline (10 minutes).
- AuthNZ with OAuth: Switch authNZ policy to OAuth with Keycloak (10 minutes).
- AuthNZ with Transformations: Add a transformation to a JWT claim from Keycloak (10 minutes).
- AuthNZ with Transformations and Rate Limiting: Add rate limiting policy to prior test (10 minutes).
- Web Application Firewall: Reset to baseline and add WAF processing (10 minutes).
Baseline Delegation
In the baseline test scenario, we configured the single Envoy proxy instance via Gloo Gateway using a simple HTTPRoute that forwarded all traffic to the httpbin service via an Upstream resource. Upstreams are useful Gloo abstractions that provide a single point of reference to a potentially complex network of services that may be deployed either inside or outside of a Kube cluster.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: httpbin
  namespace: httpbin
spec:
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: httpbin1
          group: gloo.solo.io
          kind: Upstream
          port: 8881
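The httpbin1 backend referenced above is such an Upstream. A minimal sketch of an Upstream pointing at the in-cluster httpbin Service might look like the following; the kube stanza values here are illustrative assumptions based on the route above, not the exact resource used in our test.

apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  name: httpbin1
  namespace: httpbin
spec:
  # Reference the in-cluster httpbin Service (service name and port are assumptions)
  kube:
    serviceName: httpbin
    serviceNamespace: httpbin
    servicePort: 8881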
We ran this baseline test for 12 minutes, with top CPU consumption for the Gloo components and upstream services under 12%. The system reached a maximum rate of 25.5k requests per second by the end of that phase of the test, with zero errors. At 86,400 seconds per day, that translates into a ceiling of over 2.2 billion requests per day, from a single Envoy instance!
You can see that result in the first segment of the “K6 runners” panel labeled “Baseline” in the Grafana chart below.
Take special note of the efficiency of the Envoy proxy itself, at the heart of processing all requests across all test scenarios. It is listed as gloo-proxy-http in the results chart above. Envoy consumes a maximum of 10% of the available CPU across all tests, and its memory requirements never exceed 1.1 GiB.
AuthNZ with API Keys
This scenario adds authNZ policies and a significant new element, the Gloo extauth processor, to the data path for each request processed. We’ll declare a simple API key policy that requires all requests to contain an authorization header with a key taken from a particular category of Kubernetes secrets.

The centerpiece of this policy is an AuthConfig object, a Gloo Gateway abstraction that we will plug into our route as a standard Gateway API extension. The policy looks for headers with the name api-key and accepts only requests whose keys match Secrets that carry the product-excellence team label.
apiVersion: enterprise.gloo.solo.io/v1
kind: AuthConfig
metadata:
  name: apikeys
  namespace: httpbin
spec:
  configs:
    - apiKeyAuth:
        headerName: api-key
        labelSelector:
          team: product-excellence
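Each acceptable API key lives in a Kubernetes Secret carrying the team label matched by the labelSelector above. A minimal sketch of such a Secret follows; the Secret name and key value are illustrative, and the secret type and api-key data field follow the pattern Gloo uses for API key auth, so verify the exact format against the Gloo documentation for your version.

apiVersion: v1
kind: Secret
metadata:
  name: apikey-user1              # hypothetical name for illustration
  namespace: httpbin
  labels:
    team: product-excellence      # matched by the labelSelector in the AuthConfig above
type: extauth.solo.io/apikey      # assumed secret type for Gloo API key auth
stringData:
  api-key: example-api-key-value  # the value clients must send in the api-key header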
That policy can be attached to an entire gateway, or scoped down to an individual route. We will create a RouteOption object and then use that to attach the policy to our initial HTTPRoute.
apiVersion: gateway.solo.io/v1
kind: RouteOption
metadata:
  name: routeoption
  namespace: httpbin
spec:
  options:
    extauth:
      configRef:
        name: apikeys
        namespace: httpbin
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: httpbin
  namespace: httpbin
spec:
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      filters:
        # Attach the api-key authNZ behavior to our initial httpbin route
        - type: ExtensionRef
          extensionRef:
            group: gateway.solo.io
            kind: RouteOption
            name: routeoption
      backendRefs:
        - name: httpbin1
          group: gloo.solo.io
          kind: Upstream
          port: 8881
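The ExtensionRef filter is the Gateway API’s standard hook for implementation-specific behavior: it points this route at the RouteOption, which in turn references the apikeys AuthConfig. Because the policy is attached at the route level, only requests matching this route incur the extra extauth round trip.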
How much does layering in this authNZ behavior impact our request throughput? Not as much as you might expect. Refer to the second segment of the “K6 runners” result chart that we introduced earlier. You see that the rate reaches a steady state of 21.0k requests per second by the end of that 10-minute test phase. That represents about an 18% throughput decline from the baseline: (25.5 – 21.0) / 25.5.
You can also see a modest uptick in CPU and memory consumption from the chart, due to the increased processing load on the gateway. Note that practically all of the extra load is attributable to the Gloo component responsible for extauth processing, the green area on the CPU and memory panels.
In this case, we could still handle about 1.8 billion requests per day through our single Envoy instance, and all of this occurred with zero unexpected errors.
AuthNZ with OAuth and Keycloak
OAuth is a popular security framework that allows users to grant third-party applications access to their data without sharing their passwords. OAuth uses access tokens to give third-party services temporary access to a limited amount of a user’s personal information.
In this third phase of the test, we’ll use the OAuth support in open-source Keycloak to validate user requests.
The primary change in our routing is to swap out our API key AuthConfig policy for a more complex OAuth configuration.
apiVersion: enterprise.gloo.solo.io/v1
kind: AuthConfig
metadata:
  name: oauth
  namespace: httpbin
spec:
  configs:
    - oauth2:
        oidcAuthorizationCode:
          appUrl: "https://httpbin.example.com"
          callbackPath: /callback
          clientId: ${KEYCLOAK_CLIENT}
          clientSecretRef:
            name: oauth
            namespace: gloo-system
          issuerUrl: "${KEYCLOAK_URL}/realms/workshop/"
          logoutPath: /logout
          afterLogoutUrl: "https://httpbin.example.com/get"
          session:
            failOnFetchFailure: true
            redis:
              cookieName: keycloak-session
              options:
                host: 127.0.0.1
          scopes:
            - email
          headers:
            idTokenHeader: jwt
          identityToken:
            claimsToHeaders:
              - claim: email
                header: X-Email
If you’re familiar with OAuth security, then you’ll recognize that the verbosity in the configuration is driven by the complexity of the interactions required by the OAuth standard. But this complexity offers value to us as well. Note, for example, that we can extract claims information from the OAuth identity tokens and use it to enrich the content of our request. In this case, you can see that we’re extracting an email claim from the identity token and adding it to a header called X-Email that this policy will pass along with the request. In future test phases, we’ll build on this with Gloo transformations to synthesize other request headers.
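For completeness, the clientSecretRef in the AuthConfig above points to a Kubernetes Secret in the gloo-system namespace that holds the Keycloak client secret. A minimal sketch is shown below; the secret type and client-secret data field follow the pattern Gloo uses for OAuth secrets, and the placeholder value is an assumption, so verify the exact format against the Gloo documentation.

apiVersion: v1
kind: Secret
metadata:
  name: oauth
  namespace: gloo-system
type: extauth.solo.io/oauth          # assumed secret type for Gloo OAuth client secrets
stringData:
  client-secret: ${KEYCLOAK_SECRET}  # hypothetical placeholder for the Keycloak client secret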
From a results standpoint, we expect our throughput to decrease a bit as we are adding another runtime component to the mix in Keycloak, plus a more complex set of security algorithms with OAuth.
So what’s the throughput impact of swapping in OAuth security? You can understand it better from the third segment of the “K6 runners” result chart. There is an incremental decrease to a steady state of 19.5k requests per second by the end of that 10-minute test phase. That represents roughly a 7% throughput decline from the simpler API key processing, and 23.5% off the baseline case: (25.5 – 19.5) / 25.5.
You can also see a substantial increase in CPU consumption from the chart, and a more modest memory uptick. As was the case with API key processing, the extra load is attributable to the Gloo component responsible for extauth processing, the green area on the CPU and memory panels.
Even with the additional OAuth processing, we could still handle over 1.6 billion requests per day through our single Envoy instance, and all of this occurred with zero unexpected errors.
AuthNZ with OAuth and Transformations
In the fourth phase of the test, we expanded our use of OAuth from the previous section, where we simply extracted an email claim from the identity token and converted that into an X-Email header. Now we’ll demonstrate the power of the Gloo Gateway transformation features. We’ll use those to extract the organization name from the email address, and we’ll synthesize that into a new header.

So, for example, if the email claim has the value jim@solo.io, then our Gloo configuration will add a new X-Organization header with the value solo.io. That header will then be available for subsequent Envoy filters in the chain to use, and it can also be passed to the upstream service.
The configuration change required to activate this transformation is to add a stagedTransformations stanza to the RouteOption that we used in the previous section. It originally delegated to our OAuth AuthConfig. Now, in addition to that, it transforms the email value as described above.
apiVersion: gateway.solo.io/v1
kind: RouteOption
metadata:
  name: routeoption
  namespace: httpbin
spec:
  options:
    # Delegate to OAuth processing in Keycloak
    extauth:
      configRef:
        name: oauth
        namespace: httpbin
    # Extract email claim into an organization variable using a regex
    stagedTransformations:
      regular:
        requestTransforms:
          - requestTransformation:
              transformationTemplate:
                extractors:
                  organization:
                    header: 'X-Email'
                    regex: '.*@(.*)$'
                    subgroup: 1
                # Use the extracted organization variable to create a new header
                headers:
                  x-organization:
                    text: "{{ organization }}"
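The {{ organization }} reference in the header value is a transformation template placeholder: it substitutes the value captured by the organization extractor (the first regex subgroup of the X-Email header) into the new x-organization header. Gloo’s transformation templates are based on the Inja templating engine, so more elaborate manipulations than this simple substitution are possible.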
From a results standpoint, adding this transformation has only a small impact on throughput, decreasing about 3% from the previous test stage, down to 19.0k requests per second. Changes in CPU and memory usage were negligible.
AuthNZ with Transformations and Rate Limiting
The fifth phase of the test explores the most complex scenario in this blog. We’re adding one more major component to our request processing, the Gloo Gateway rate limiter. We’ll add a new policy that limits users from any single organization to a given request rate. This adds two more components to the data path of our request: the Gloo rate-limit processor and its Redis cache of rate limit counters. These are in addition to the extauth processor that is responsible for enforcing our OAuth policy.
We’ll declare this policy using a RateLimitConfig custom resource. It limits the number of requests from any organization to a maximum of 999,999 per second.

Note that this is an impractically high limit on purpose. Our goal for this scalability testing is not to trigger the limit, but instead to force each request through the rate limiting path.
apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
  name: limit-users
  namespace: httpbin
spec:
  raw:
    # Use previously synthesized X-Organization header as the basis for rate limiting
    setDescriptors:
      - simpleDescriptors:
          - key: organization
            value: solo.io
        rateLimit:
          requestsPerUnit: 999999
          unit: SECOND
    rateLimits:
      - setActions:
          - requestHeaders:
              descriptorKey: organization
              headerName: X-Organization
We will also modify our httpbin RouteOption to add a rateLimitConfigs stanza that mixes this policy into our existing OAuth and transformation processing.
apiVersion: gateway.solo.io/v1
kind: RouteOption
metadata:
  name: routeoption
  namespace: httpbin
spec:
  options:
    extauth:
      configRef:
        name: oauth
        namespace: httpbin
    stagedTransformations:
      regular:
        requestTransforms:
          - requestTransformation:
              transformationTemplate:
                extractors:
                  organization:
                    header: 'X-Email'
                    regex: '.*@(.*)$'
                    subgroup: 1
                headers:
                  x-organization:
                    text: "{{ organization }}"
    # Add per-organization RateLimitConfig to routing behavior
    rateLimitConfigs:
      refs:
        - name: limit-users
          namespace: httpbin
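With this change, a single RouteOption stacks all three behaviors on the route: each request is first validated by the extauth service against the OAuth policy, then run through the transformation filter to synthesize the X-Organization header, and finally counted by the rate-limit service against its Redis-backed counters before being forwarded to httpbin.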
How much does layering in this rate limiting behavior impact our request throughput? Refer to the fifth segment of the “K6 runners” result chart. You can see that the rate reaches a steady state of 14.5k requests per second by the end of that 10-minute test phase. That represents about a 24% decline from the previous stage, before rate limiting was added, and a 43% throughput decline from the baseline.
You can also see a significant uptick in CPU and memory consumption from the chart, due to the increased processing load on the rate limiter.
This case extrapolates to about 1.25 billion requests per day through our single Envoy instance with both external authorization and rate limiting policies active. As in the other test scenarios, all of this occurred with zero unexpected errors.
Web Application Firewall
The final phase of this test resets to the baseline to demonstrate a use case that enterprises might activate at the outer edge of their application networks. This is based on the 2021 Log4Shell attack that devastated a variety of organizations.
One approach to thwarting this attack that we discussed at Solo is a declarative one, based on a Web Application Firewall (WAF). It can be quickly deployed at the gateway level, completely independent of any application changes happening deeper in the network. Such responsiveness is critical when responding to a zero-day exploit like Log4Shell. This sort of Tiered Gateway strategy can be quite effective when organization-wide policy requirements exist that are independent of individual applications. See this blog for more details.
In our case, we’ll add a waf stanza that implements a single ModSecurity rule. It uses regular expressions to scan multiple elements of the inbound request for indicators of a Log4Shell attack; detecting any of these triggers leads to denial of the request.
apiVersion: gateway.solo.io/v1
kind: RouteOption
metadata:
  name: routeoption
  namespace: httpbin
spec:
  options:
    waf:
      customInterventionMessage: 'Log4Shell malicious payload'
      ruleSets:
        - ruleStr: |
            SecRuleEngine On
            SecRequestBodyAccess On
            SecRule REQUEST_LINE|ARGS|ARGS_NAMES|REQUEST_COOKIES|REQUEST_COOKIES_NAMES|REQUEST_BODY|REQUEST_HEADERS|XML:/*|XML://@* "@rx \${jndi:(?:ldaps?|iiop|dns|rmi)://" "id:1000,phase:2,deny,status:403,log,msg:'Potential Remote Command Execution: Log4j CVE-2021-44228'"
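With this rule active, any request whose request line, arguments, cookies, body, or headers match the jndi pattern (for example, a header value containing ${jndi:ldap:// followed by a host) is denied with an HTTP 403 and the custom intervention message above, while all other traffic continues to flow to httpbin as in the baseline scenario.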
From a results standpoint, we see a return to the levels of the initial baseline phase of the test, since we have eliminated the most complex processing and replaced it with simpler regex scans. Note that the steady state throughput has rebounded to a tick above the baseline case: 26.2k requests per second, extrapolating to nearly 2.3 billion requests per day.
Summary and Further Reading
This table summarizes the results across all phases of this benchmark.
| Test Phase | Throughput (per sec, per day) | Throughput Difference (vs. baseline) | CPU (max %) | Memory (GiB) |
|---|---|---|---|---|
| Baseline Delegation | 25.5k, 2.2B | — | 11.8% | 2.9 |
| AuthNZ: API keys | 21.0k, 1.8B | -18% | 16.2% | 3.3 |
| AuthNZ: OAuth | 19.5k, 1.7B | -24% | 28.9% | 3.8 |
| AuthNZ: OAuth + Transform | 19.0k, 1.6B | -25% | 28.9% | 3.5 |
| AuthNZ: OAuth + Transform + Rate Limiting | 14.5k, 1.25B | -45% | 28.0% | 3.7 |
| Web Application Firewall | 26.2k, 2.3B | +2.7% | 13.4% | 3.4 |
Gloo Gateway takes full advantage of Envoy to deliver amazing levels of performance. There was no special tuning applied to achieve these results; we did not change a single Gloo parameter to get these numbers. Plus, there were zero errors encountered for any of the 132.92 million requests issued across this 62-minute test.
Check out this blog for a step-by-step tutorial through open-source Gloo Gateway featuring the Kubernetes Gateway API.
To experience the Gloo Gateway magic across multiple platforms for yourself, request a live demo here or a trial of the enterprise product here.
If you have questions or would like to stay closer to Solo’s streams of innovation, subscribe to our YouTube channel or join our Slack community.
Acknowledgements
Many thanks to Denis Jannot and Jesús Muñoz Rodríguez for building out the superb test framework that makes executing and extending these test scenarios surprisingly easy, and for their reviews of this blog.