Fast and Furious: Gateway API at Scale with Envoy Proxy and Gloo Gateway

September 10, 2024
Jim Barton

Gloo Gateway is Solo’s cloud-native API gateway based on the open source Envoy Proxy and the Kubernetes Gateway API. It provides authentication (using OAuth, JWT, API keys, HTTP basic auth, to name a few), authorization (with OPA or custom approaches), a web application firewall (WAF – based on ModSecurity), function discovery for OpenAPI and AWS Lambda, and advanced transformations.

“So, how many instances of Envoy proxy do I need?”

That’s one of the first questions asked by prospective users. Of course, each use case is different. But we generally answer that they need two instances to get high availability. However, it is rare for users to need multiple instances for performance reasons. This blog post provides hard data in the form of benchmark results for common usage patterns that Solo sees within its customer base.

Test Strategy

We’ll start in this blog post by benchmarking Gloo Gateway without any filters, using only basic HTTP requests to delegate to a lightweight, internally deployed httpbin service.

Then we’ll show the impact of mixing in popular gateway features, like authentication and authorization using API keys, OAuth and WAF; transformations; and rate limiting.

We’ll combine all these operations into a single test run and use K6 load testing to orchestrate and produce Grafana dashboards that summarize our results.

Test Environment

Our test environment lived in a CPU-optimized, 32-CPU EKS cluster. We deployed an httpbin workload as the lone application service in the cluster. We fronted this service with a single instance of Envoy proxy, allocated 10 CPUs and 8GiB of memory.
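
For context, provisioning that single proxy with the Kubernetes Gateway API comes down to one Gateway resource bound to the gloo-gateway GatewayClass, which tells Gloo Gateway to spin up and manage the Envoy deployment. A minimal sketch is shown below; the gateway name, namespace, and listener port are illustrative (we assume a gateway named http, since the proxy Deployment shows up later in the charts as gloo-proxy-http).

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: http
  namespace: gloo-system
spec:
  # The gloo-gateway GatewayClass hands this Gateway to Gloo Gateway,
  # which provisions and manages the Envoy proxy Deployment for it.
  gatewayClassName: gloo-gateway
  listeners:
  - name: http
    protocol: HTTP
    port: 8080       # illustrative listener port
    allowedRoutes:
      namespaces:
        from: All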

Gloo Gateway deployments are frequently deployed in production with multiple Envoy replicas, for both scalability and high availability. But one of the primary objectives of this test was to explore the limits of individual Envoy proxies under Gloo control.

We used a small suite of k6 test runners to drive the test traffic and to help with analyzing the results, using 2 CPUs with 8GiB of memory.

For more complex test scenarios, we activated an 8-replica suite of Gloo ExtAuth processors. Traffic is delegated from Envoy to the ExtAuth service only when a relevant authNZ policy is present (e.g., API keys, OAuth, Open Policy Agent).

A 4-replica suite of Gloo Rate Limiting processors was also available in the test environment. As with ExtAuth, traffic was only delegated to these services when a rate limiting policy was active. Note that neither ExtAuth nor Rate Limiting was active for the baseline scenario.

Test Scenarios

We executed six separate test scenarios for just over an hour, 62 minutes to be precise. Here is a list of all the scenarios, with individual durations:

  • Baseline Delegation: Delegate to httpbin with no authNZ or rate limiting constraints (12 minutes).
  • AuthNZ with API Keys: Add an API key-based authNZ policy to baseline (10 minutes).
  • AuthNZ with OAuth: Switch authNZ policy to OAuth with Keycloak (10 minutes).
  • AuthNZ with Transformations: Add a transformation to a JWT claim from Keycloak (10 minutes).
  • AuthNZ with Transformations and Rate Limiting: Add rate limiting policy to prior test (10 minutes).
  • AuthNZ with Web Application Firewall: Reset to baseline and add WAF processing (10 minutes).

Baseline Delegation

In the baseline test scenario, we configured the single Envoy proxy instance via Gloo Gateway using a simple HTTPRoute that forwarded all traffic to the httpbin service via an Upstream resource. Upstreams are useful Gloo abstractions that provide a single point of reference to a potentially complex network of services that may be deployed either inside or outside of a Kube cluster.

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: httpbin
  namespace: httpbin
spec:
  rules:
    - matches:
      - path:
          type: PathPrefix
          value: /
      backendRefs:
        - name: httpbin1
          group: gloo.solo.io
          kind: Upstream
          port: 8881
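
For reference, the httpbin1 Upstream referenced by that route might look roughly like the sketch below. The service name and port are assumptions about this particular environment; any kube-type Upstream that points at the httpbin Service behaves the same way.

apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  name: httpbin1
  namespace: httpbin
spec:
  # A kube-type Upstream simply points at an in-cluster Service.
  kube:
    serviceName: httpbin1      # illustrative Service name
    serviceNamespace: httpbin
    servicePort: 8000          # illustrative Service port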

We ran this baseline test for 12 minutes, with top CPU consumption for the Gloo components and upstream services under 12%. The system reached a maximum rate of 25.5k requests per second by the end of that phase of the test, and with zero errors. That translates into a ceiling of over 2.2 billion requests per day, and with a single Envoy instance!

You can see that result in the first segment of the “K6 runners” panel labeled “Baseline” in the Grafana chart below.

Take special note of the efficiency of the Envoy proxy itself, at the heart of processing all requests across all test scenarios. It is listed as gloo-proxy-http in the results chart above. Envoy consumes a maximum of 10% of the available CPU across all tests, and its memory requirements never exceed 1.1 GiB.

AuthNZ with API Keys

This scenario adds authNZ policies and a significant new element, the Gloo extauth processor, to the data path for each request processed. We’ll declare a simple API key policy that requires all requests to contain an authorization header with a key taken from a particular category of Kubernetes secrets.

The centerpiece of this policy is an AuthConfig object, a Gloo Gateway abstraction that we will plug into our route as a standard Gateway API extension. This policy looks for a header named api-key and accepts only requests whose key matches a Kubernetes Secret carrying the product-excellence team label.

apiVersion: enterprise.gloo.solo.io/v1
kind: AuthConfig
metadata:
  name: apikeys
  namespace: httpbin
spec:
  configs:
  - apiKeyAuth:
      headerName: api-key
      labelSelector:
        team: product-excellence
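
For completeness, an API key that satisfies this policy lives in an ordinary Kubernetes Secret, along the lines of the sketch below. The Secret name and key value are purely illustrative; what matters is the extauth.solo.io/apikey type, the api-key field, and the team: product-excellence label that the selector above matches.

apiVersion: v1
kind: Secret
metadata:
  name: httpbin-apikey           # illustrative name
  namespace: gloo-system
  labels:
    team: product-excellence     # matched by the labelSelector above
type: extauth.solo.io/apikey
stringData:
  api-key: my-example-api-key    # illustrative key value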

That policy can be attached to an entire gateway, or scoped down to an individual route. We will create a RouteOption object and then use that to attach the policy to our initial HTTPRoute.

apiVersion: gateway.solo.io/v1
kind: RouteOption
metadata:
  name: routeoption
  namespace: httpbin
spec:
  options:
    extauth:
      configRef:
        name: apikeys
        namespace: httpbin
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: httpbin
  namespace: httpbin
spec:
  rules:
    - matches:
      - path:
          type: PathPrefix
          value: /
      filters:
        # Attach the api-key authNZ behavior to our initial httpbin route
        - type: ExtensionRef
          extensionRef:
            group: gateway.solo.io
            kind: RouteOption
            name: routeoption
      backendRefs:
        - name: httpbin1
          group: gloo.solo.io
          kind: Upstream
          port: 8881

How much does layering in this authNZ behavior impact our request throughput? Not as much as you might expect. Refer to the second segment of the “K6 runners” result chart that we introduced earlier. You see that the rate reaches a steady state of 21.0k requests per second by the end of that 10-minute test phase. That represents about an 18% throughput decline from the baseline: (25.5 – 21.0) / 25.5.

You can also see a modest uptick in CPU and memory consumption from the chart, due to the increased processing load on the gateway. Note that practically all of the extra load is attributable to the Gloo component responsible for extauth processing, the green area on the CPU and memory panels.

In this case, we could still handle about 1.8 billion requests per day through our single Envoy instance, and all of this occurred with zero unexpected errors.

AuthNZ with OAuth and Keycloak

OAuth is a popular security framework that allows users to grant third-party applications access to their data without sharing their passwords. OAuth uses access tokens to give third-party services temporary access to a limited amount of a user’s personal information.

In this third phase of the test, we’ll use the OAuth support in open-source Keycloak to validate user requests.

The primary change in our routing is to swap out our API key AuthConfig policy for a more complex OAuth configuration.

apiVersion: enterprise.gloo.solo.io/v1
kind: AuthConfig
metadata:
  name: oauth
  namespace: httpbin
spec:
  configs:
    - oauth2:
        oidcAuthorizationCode:
          appUrl: "https://httpbin.example.com"
          callbackPath: /callback
          clientId: ${KEYCLOAK_CLIENT}
          clientSecretRef:
            name: oauth
            namespace: gloo-system
          issuerUrl: "${KEYCLOAK_URL}/realms/workshop/"
          logoutPath: /logout
          afterLogoutUrl: "https://httpbin.example.com/get"
          session:
            failOnFetchFailure: true
            redis:
              cookieName: keycloak-session
              options:
                host: 127.0.0.1
          scopes:
          - email
          headers:
            idTokenHeader: jwt
          identityToken:
            claimsToHeaders:
              - claim: email
                header: X-Email
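
Note that the clientSecretRef above points at a Kubernetes Secret holding the client secret issued by Keycloak. A minimal sketch of that Secret follows; the value is obviously specific to the Keycloak client used in the test.

apiVersion: v1
kind: Secret
metadata:
  name: oauth
  namespace: gloo-system
type: extauth.solo.io/oauth
stringData:
  client-secret: my-keycloak-client-secret   # illustrative value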

If you’re familiar with OAuth security, then you’ll recognize the verbosity in the configuration is driven by the complexity of the interactions required by the OAuth standard. But this complexity offers value to us as well. Note for example that we can extract claims information from the OAuth identity tokens and use those to enrich the content of our request. In this case, for example, you see that we’re extracting an email claim from the identity token and adding it to a header called X-Email that this policy will pass along with the request. In future test phases, we’ll build on this with Gloo transformations to synthesize other request headers.

From a results standpoint, we expect our throughput to decrease a bit as we are adding another runtime component to the mix in Keycloak, plus a more complex set of security algorithms with OAuth.

So what’s the throughput impact of swapping in OAuth security? You can understand it better from the third segment of the “K6 runners” result chart. There is an incremental decrease to a steady state of 19.5k requests per second by the end of that 10-minute test phase. That represents about a 7% throughput decline from the simpler API key processing, and 23.5% off the baseline case: (25.5 – 19.5) / 25.5.

You can also see a substantial increase in CPU consumption from the chart, and a more modest memory uptick. As was the case with API key processing, the extra load is attributable to the Gloo component responsible for extauth processing, the green area on the CPU and memory panels.

Even with the additional OAuth processing, we could still handle over 1.6 billion requests per day through our single Envoy instance, and all of this occurred with zero unexpected errors.

AuthNZ with OAuth and Transformations

In the fourth phase of the test, we expanded our use of OAuth from the previous section, where we simply extracted an email claim from the identity token and converted that into a header X-Email. Now we’ll demonstrate the power of the Gloo Gateway transformation features. We’ll use those to extract the organization name from the email address, and we’ll synthesize that into a new header.

So for example, if the email claim has a value jim@solo.io, then our Gloo configuration will add a new X-Organization header with value solo.io. That header will then be available for subsequent Envoy filters in the chain to use, and it can also be passed to the upstream service.

The configuration change required to activate this transformation is to add a stagedTransformations stanza to the RouteOption that we used in the previous section. It originally delegated to our OAuth AuthConfig. Now in addition to that, it transforms the email value as we described before.

apiVersion: gateway.solo.io/v1
kind: RouteOption
metadata:
  name: routeoption
  namespace: httpbin
spec:
  options:
    # Delegate to OAuth processing in Keycloak
    extauth:
      configRef:
        name: oauth
        namespace: httpbin
    # Extract email claim into an organization variable using a regex
    stagedTransformations:
      regular:
        requestTransforms:
        - requestTransformation:
            transformationTemplate:
              extractors:
                organization:
                  header: 'X-Email'
                  regex: '.*@(.*)$'
                  subgroup: 1
              # Use the extracted organization variable to create a new header
              headers:
                x-organization:
                  text: "{{ organization }}"

From a results standpoint, adding this transformation has only a small impact on throughput, decreasing about 3% from the previous test stage, down to 19.0k requests per second. Changes in CPU and memory usage were negligible.

AuthNZ with Transformations and Rate Limiting

The fifth phase of the test explores the most complex scenario in this blog. We’re adding one more major component to our request processing, the Gloo Gateway rate limiter. We’ll add a new policy that limits users from any single organization to a given request rate. This adds two more components to the data path of our request: the Gloo rate-limit processor and its Redis cache of rate limit counters. These are in addition to the extauth processor that is responsible for enforcing our OAuth policy.

We’ll declare this policy using a RateLimitConfig custom resource. This limits the number of requests from any organization to a maximum of 999,999 per second.

Note that this is an impractically high limit, on purpose. Our goal for this scalability testing is not to trigger the limit, but instead to force each request through the rate limiting path.

apiVersion: ratelimit.solo.io/v1alpha1
kind: RateLimitConfig
metadata:
  name: limit-users
  namespace: httpbin
spec:
  raw:
    # Use the previously synthesized X-Organization header as the basis for rate limiting
    setDescriptors:
      - simpleDescriptors:
          - key: organization
            value: solo.io
        rateLimit:
          requestsPerUnit: 999999
          unit: SECOND
    rateLimits:
    - setActions:
      - requestHeaders:
          descriptorKey: organization
          headerName: X-Organization

We will also modify our httpbin RouteOption to add a rateLimitConfigs stanza that mixes this policy into our existing OAuth and transformation processing.

apiVersion: gateway.solo.io/v1
kind: RouteOption
metadata:
  name: routeoption
  namespace: httpbin
spec:
  options:
    extauth:
      configRef:
        name: oauth
        namespace: httpbin
    stagedTransformations:
      regular:
        requestTransforms:
        - requestTransformation:
            transformationTemplate:
              extractors:
                organization:
                  header: 'X-Email'
                  regex: '.*@(.*)$'
                  subgroup: 1
              headers:
                x-organization:
                  text: "{{ organization }}"
    # Add per-organization RateLimitConfig to routing behavior
    rateLimitConfigs:
      refs:
      - name: limit-users
        namespace: httpbin

How much does layering in this rate limiting behavior impact our request throughput? Refer to the fifth segment of the “K6 runners” result chart. You see that the rate reaches a steady state of 14.5k requests per second by the end of that 10-minute test phase. That represents about a 24% decline from the previous stage, before rate limiting was added: (19.0 – 14.5) / 19.0. It represents a 43% throughput decline from the baseline.

You can also see a significant uptick in CPU and memory consumption from the chart, due to the increased processing load on the rate limiter.

This case extrapolates to about 1.25 billion requests per day through our single Envoy instance with both external authorization and rate limiting policies active. As in the other test scenarios, all of this occurred with zero unexpected errors.

AuthNZ with Web Application Firewall

The final phase of this test resets to the baseline to demonstrate a use case that enterprises might activate at the outer edge of their application networks. This is based on the 2021 Log4Shell attack that devastated a variety of organizations.

One way to thwart this attack that we discussed at Solo is a declarative approach based on a Web Application Firewall (WAF). This can be quickly deployed at the gateway level, completely independent of any application changes happening deeper in the network. Such responsiveness is critical when responding to a zero-day exploit like Log4Shell. This sort of Tiered Gateway strategy can be quite effective when organization-wide policy requirements exist that are independent of individual applications. See this blog for more details.

In our case, we’ll add a waf stanza that implements a single ModSecurity rule that uses regular expressions to scan multiple elements of the inbound request for indicators of a Log4Shell attack. Detecting any of these triggers leads to denial of the request.

apiVersion: gateway.solo.io/v1
kind: RouteOption
metadata:
  name: routeoption
  namespace: httpbin
spec:
  options:
    waf:
      customInterventionMessage: 'Log4Shell malicious payload'
      ruleSets:
      - ruleStr: |
          SecRuleEngine On
          SecRequestBodyAccess On
          SecRule REQUEST_LINE|ARGS|ARGS_NAMES|REQUEST_COOKIES|REQUEST_COOKIES_NAMES|REQUEST_BODY|REQUEST_HEADERS|XML:/*|XML://@*
            "@rx \\\${jndi:(?:ldaps?|iiop|dns|rmi)://"
            "id:1000,phase:2,deny,status:403,log,msg:'Potential Remote Command Execution: Log4j CVE-2021-44228'"

From a results standpoint, we see a return to the levels of the initial baseline phase of the test, since we have eliminated the most complex processing and replaced it with simpler regex scans. Note that the steady state throughput has rebounded to a tick above the baseline case: 26.2k requests per second, extrapolating to nearly 2.3 billion requests per day.

Summary and Further Reading

This table summarizes the results across all phases of this benchmark.

Test Phase                                  Throughput (per sec, per day)   TP Difference (vs. baseline)   CPU     Memory (GiB)
Baseline Delegation                         25.5k, 2.2B                     —                              11.8%   2.9
AuthNZ: API keys                            21.0k, 1.8B                     -18%                           16.2%   3.3
AuthNZ: OAuth                               19.5k, 1.7B                     -24%                           28.9%   3.8
AuthNZ: OAuth + Transform                   19.0k, 1.6B                     -25%                           28.9%   3.5
AuthNZ: OAuth + Transform + Rate Limiting   14.5k, 1.25B                    -43%                           28.0%   3.7
Web Application Firewall                    26.2k, 2.3B                     +2.7%                          13.4%   3.4

Gloo Gateway takes full advantage of Envoy to deliver amazing levels of performance. There was no special tuning applied to achieve these results; we did not adjust a single Gloo parameter to get these numbers. Plus, there were zero errors encountered for any of the 132.92 million requests issued across this 62-minute test.

Check out this blog for a step-by-step tutorial through open-source Gloo Gateway featuring the Kubernetes Gateway API.

To experience the Gloo Gateway magic across multiple platforms for yourself, request a live demo here or a trial of the enterprise product here.

If you have questions or would like to stay closer to Solo’s streams of innovation, subscribe to our YouTube channel or join our Slack community.

Acknowledgements

Many thanks to Denis Jannot and Jesús Muñoz Rodríguez for building out the superb test framework that makes executing and extending these test scenarios surprisingly easy, and for their reviews of this blog.
