Advanced Rate Limiting Use Cases with Envoy Proxy

Solo.io Engineering | June 15, 2020

As teams deploy applications into production and begin exposing them to external clients and users, they need to consider how to secure and protect those applications to ensure a good user experience and meet their SLAs. Rate limiting is one of those methods.

What is Rate Limiting?

Rate limiting is a method of protecting backend applications by controlling the rate of traffic coming into or out of a network. A rate limit specifies how many times a service can be called within a given time interval (per second, minute, or hour), and a quota caps the total number of requests over a longer period (X requests per day, week, or month). Without such limits, a surge of incoming requests can overwhelm the capacity of the services, resulting in poor performance, reduced functionality, and downtime. These surges can be the result of either intentional or unintentional events, such as DDoS attacks or client/system errors.

While setting these limits manually at the application level is possible, it greatly increases complexity and administrative burden. API Gateways offer an alternative: they control the outside world’s access to the various applications through centrally defined configurations that are distributed across environments.

 

Rate Limiting with Envoy Proxy and Gloo 

Envoy Proxy is a popular edge and service proxy and serves as the data plane for many cloud-native application networking technologies, including modern API Gateways and service meshes. Envoy uses a chain of filters to shape and control the network traffic that flows through the proxy, and rate limiting is one of the filters that can be configured and managed through a control plane like the Gloo API gateway.

In a previous blog post, we provided an introductory overview of rate limiting. In this post, we’ll cover additional rate limiting use cases that have been developed with our end users and customers, including:

  • Multiple Rate Limits per Client ID
  • Rate Limit Traffic Prioritization on HTTP 
  • Integrating Rate Limits with JSON Web Tokens (JWT)

Gloo exposes Envoy’s rate-limit API, which allows users to provide their own implementation of an Envoy gRPC rate-limit service. If you are trying these use cases with Gloo Enterprise, a rate limiting server is included in the install; with open source Gloo, you’ll need to build your own. To fully leverage the Envoy rate limiter, you’ll need to complete two additional Gloo configurations: define rate limiting descriptors in the Gloo Settings manifest, and define Envoy rate limiting actions at the Virtual Service level for each route.
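As a rough sketch of those two pieces (field names follow the Gloo v1 APIs at the time of writing; the `generic_key` descriptor value, limits, and resource names here are illustrative placeholders):

```yaml
# 1. Rate limiting descriptors in the Gloo Settings manifest
apiVersion: gloo.solo.io/v1
kind: Settings
metadata:
  name: default
  namespace: gloo-system
spec:
  ratelimit:
    descriptors:
      - key: generic_key
        value: per-minute          # placeholder descriptor value
        rateLimit:
          requestsPerUnit: 60
          unit: MINUTE
---
# 2. Rate limiting actions at the Virtual Service level
apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
  name: default
  namespace: gloo-system
spec:
  virtualHost:
    domains: ["*"]
    options:
      ratelimit:
        rateLimits:
          - actions:
              - genericKey:
                  descriptorValue: per-minute   # must match a descriptor value above
```

The descriptors define the counters and their limits; the actions tell Envoy which descriptor entries to increment for each request that matches the route.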

Additionally, we are hosting a webinar on June 25th to dig into these use cases with live demos and Q&A – Register here to learn more. 

Rate Limits by Client ID 

This is a common use case: applying a rate limit per initiating client. If you have specific clients or end users that always need access to certain services, one option is to define the corresponding rate limit by client IP address (also known as the downstream remote address). To do this, define a descriptor called remote_address in the Settings manifest, along with the preferred requests per unit and the unit itself. Make sure to check the configuration to ensure the remote_address is a real client IP address and not an internal Kubernetes cluster or load balancer address. Admins can configure one or multiple rate limits for the same remote_address, and once defined, Gloo will increment the appropriate counters based on the defined rate limits. Read more here.
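A minimal sketch of this configuration, assuming the Gloo Settings and Virtual Service resources shown earlier (the 10-per-minute limit is an example value):

```yaml
# Settings manifest: one counter per downstream client IP
spec:
  ratelimit:
    descriptors:
      - key: remote_address
        rateLimit:
          requestsPerUnit: 10
          unit: MINUTE
---
# Virtual Service: count each request against the caller's remote address
spec:
  virtualHost:
    options:
      ratelimit:
        rateLimits:
          - actions:
              - remoteAddress: {}
```

Note that for the remote address to reflect the real client rather than an intermediate hop, the proxy must receive the original client IP; depending on your environment, that may require load balancer or Kubernetes Service settings that preserve the source address.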

 

Rate Limit Traffic Prioritization 

Adding different priorities or classes of traffic is a network management tactic for building resilient distributed systems, and it can be implemented in the form of rate limits. If a given service handles multiple types of requests, you can prioritize one type over another based on business priority. This allows the system to drop the lower priority traffic to protect the higher priority traffic when both arrive simultaneously.

As an example: an API supports both GET and POST methods, for listing data and creating resources, respectively. The business finds the POST action more important, so a generous rate limit can be applied to the POST method and a smaller rate limit applied to the GET method. Read more here.
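This GET/POST example might be sketched as follows, deriving the descriptor from Envoy’s `:method` pseudo-header (the descriptor key `method` and the specific limits are illustrative):

```yaml
# Settings manifest: a larger limit for POST, a smaller one for GET
spec:
  ratelimit:
    descriptors:
      - key: method
        value: POST
        rateLimit:
          requestsPerUnit: 100
          unit: SECOND
      - key: method
        value: GET
        rateLimit:
          requestsPerUnit: 20
          unit: SECOND
---
# Virtual Service: populate the descriptor from the request's HTTP method
spec:
  virtualHost:
    options:
      ratelimit:
        rateLimits:
          - actions:
              - requestHeaders:
                  headerName: ":method"
                  descriptorKey: method
```

With this in place, GET traffic hits its smaller limit first, leaving headroom for the higher-priority POST traffic.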

 

Integrating Rate Limits with JWT 

Header values can be a convenient way to define rate limits, but for those looking to add incremental security, an option is to encode the values as claims in a JSON Web Token (JWT) that is then passed along in the request. JSON Web Tokens, or JWTs for short, are a standard way to carry verifiable identity information and can be used for authentication.

For the rate limiting use case, the JWT configuration specifies headers to be derived from the claims extracted after the JWT has been verified. Clients and end users are then provided a secure method of acquiring a JWT by authenticating with a trusted identity provider. The virtual service will have an additional JWT configuration section to extract the x-type and x-header claims from the verified JWT, and the request will then continue on through the rate limit filter. If the JWT is invalid, the entire request is considered invalid and the user is presented with an error. Read more here.
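A sketch of the Virtual Service side of this setup, assuming the JWT filter available in Gloo Enterprise (the provider name, issuer, JWKS upstream, claim names, and descriptor key are all illustrative):

```yaml
# Virtual Service: verify the JWT, copy a claim into a header, then rate limit on it
spec:
  virtualHost:
    options:
      jwt:
        providers:
          example-provider:                  # hypothetical provider name
            issuer: https://issuer.example.com
            jwks:
              remote:
                url: https://issuer.example.com/.well-known/jwks.json
                upstreamRef:
                  name: issuer-upstream      # hypothetical upstream for the JWKS endpoint
                  namespace: gloo-system
            claimsToHeaders:
              - claim: type                  # claim in the verified JWT
                header: x-type               # header the rate limit filter will see
      ratelimit:
        rateLimits:
          - actions:
              - requestHeaders:
                  headerName: x-type
                  descriptorKey: type
---
# Settings manifest: limit keyed on the claim-derived header value
spec:
  ratelimit:
    descriptors:
      - key: type
        value: premium                       # illustrative claim value
        rateLimit:
          requestsPerUnit: 100
          unit: MINUTE
```

Because the JWT filter runs before the rate limit filter, the header values are only ever populated from verified claims, so clients cannot spoof themselves into a more generous limit.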

 

Learn More 

We hope you give these and other rate limiting scenarios a try in your application environment. Check out more articles on Gloo features and tutorials here.
