Implementation of Throttling in Java

When implementing services, especially web services, one of the problems that must be addressed at the architecture stage is defining the aggregate limit of requests that the back end can process per unit of time. This limit is often expressed as TPS (transactions per second). In some cases the system also has a physical limit on the amount of data it can transfer, in which case the measure is BPS (bytes per second).

When several clients send requests to the application concurrently, the number of requests may exceed the limit expressed by the TPS. In the best case, the excess requests time out or are never picked up by the back end; in the worst case, they cause a system fault and make the application completely unavailable. For this reason it is often useful to implement a throttling policy, which essentially consists of limiting the rate of requests served by the service, either by rejecting or by queuing all requests in excess.

Managing throttling is therefore essential if you want the application to remain available and to perform acceptably even under particularly high workloads.

In this article we will see how throttling can be managed at the application layer, using a simple Spring Boot sample application. The application exposes a single REST service, /hello-world, managed by a Controller in which an AtomicInteger object is used to generate an id associated with each request.
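A minimal sketch of the handler logic follows, written without the Spring annotations so that it runs on its own. The class name HelloService and the response text are assumptions made for illustration; only the use of AtomicInteger to generate the request id comes from the description above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Plain-Java stand-in for the controller (hypothetical name, no Spring annotations).
class HelloService {
    // Thread-safe counter: concurrent requests never receive the same id.
    private final AtomicInteger counter = new AtomicInteger();

    // Hypothetical handler body; the response text is an assumption.
    String sayHello() {
        int id = counter.incrementAndGet();
        return "Hello World #" + id;
    }

    public static void main(String[] args) {
        HelloService service = new HelloService();
        System.out.println(service.sayHello());
        System.out.println(service.sayHello());
    }
}
```

In a Spring Boot application the same method would sit in a controller class and be mapped to /hello-world.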

Guava RateLimiter

The Guava project contains many of the core libraries used in Google's Java projects. These libraries cover many areas: collections, caching, concurrency, string processing, and so on. In particular, among its concurrency utilities Guava provides a RateLimiter class that lets you implement throttling in a few simple steps.

First, it is necessary to include the Guava dependency; with Maven, for example, this means adding a <dependency> element for the artifact com.google.guava:guava to the pom.xml.

Implementing throttling first requires instantiating an object of type RateLimiter, specifying the rate you want to support. This rate is defined in terms of permits per second. The meaning of a permit depends on how we use it. In general, N permits per second means that the service can handle N requests per second, but a permit can equally represent a transaction or a byte.

The class offers several create() factory methods; the simplest takes a single parameter, permitsPerSecond. For example, RateLimiter.create(100.0) creates a limiter that grants 100 permits per second.

When the service is invoked, the first action to perform is to check that the resources (the permits) required to process the request are available. For this purpose, the RateLimiter class provides the acquire() method, which optionally takes the number of permits required. Applied to our example controller, this means calling acquire() as one of the first instructions of the sayHello() method, and in any case before its most computationally expensive part.

The effect of acquire() is to check that the rate at which the method is invoked is compatible with the rate declared when the RateLimiter was instantiated. If it is not, execution blocks until the request can proceed. The method returns the number of seconds spent sleeping to enforce the requested rate, or zero if there was no wait.

The class also provides a tryAcquire() method (with several overloads) that lets you avoid blocking altogether or bound the wait with a timeout. In this case the method returns a boolean indicating whether the permits were acquired; if they cannot be obtained within the timeout, it returns false.
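To make the blocking behavior concrete without depending on Guava, here is a simplified stand-in. It is a sketch, not Guava's implementation: the class name BlockingLimiter and the reserve() helper are hypothetical, and it ignores refinements such as burst handling. What it does reproduce is the contract described above: acquire() sleeps as long as needed and returns the seconds waited.

```java
import java.util.concurrent.TimeUnit;

// Simplified stand-in for a blocking rate limiter; NOT Guava's implementation.
class BlockingLimiter {
    private final double nanosPerPermit;
    private long nextFreeNanos; // earliest instant at which the next request may run

    BlockingLimiter(double permitsPerSecond) {
        if (permitsPerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        this.nanosPerPermit = TimeUnit.SECONDS.toNanos(1) / permitsPerSecond;
        this.nextFreeNanos = System.nanoTime();
    }

    // Books the permits and returns how long the caller must wait, in nanoseconds.
    // The cost of a request is charged to the requests that follow it.
    synchronized long reserve(int permits, long nowNanos) {
        long wait = Math.max(0L, nextFreeNanos - nowNanos);
        nextFreeNanos = Math.max(nextFreeNanos, nowNanos) + (long) (permits * nanosPerPermit);
        return wait;
    }

    // Blocks until the reserved slot is reached; returns the seconds spent sleeping.
    double acquire(int permits) throws InterruptedException {
        long waitNanos = reserve(permits, System.nanoTime());
        TimeUnit.NANOSECONDS.sleep(waitNanos);
        return waitNanos / 1e9;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingLimiter limiter = new BlockingLimiter(5.0); // 5 requests per second
        for (int i = 1; i <= 3; i++) {
            double slept = limiter.acquire(1);
            System.out.printf("request %d waited %.2f s%n", i, slept);
        }
    }
}
```

Separating reserve() from acquire() keeps the bookkeeping free of sleeping, which makes the logic easy to check with a fixed timestamp.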

Interpretation of the Rate Limit

The correct operation of throttling depends on how the concept of permit offered by the RateLimiter class is interpreted. For example, if we intend to support 100 transactions per second, and each invocation of the method counts as 1 transaction, we can simply set permitsPerSecond to 100 when instantiating the class and consume 1 permit for each acquisition. The same result can be achieved with a permitsPerSecond of 1000 and 10 permits for every invocation.

This reasoning is very useful when the choice of the two parameters is less obvious. Suppose, for example, that we want to allow a maximum of 3 transactions per minute. This can be achieved either by setting permitsPerSecond to 3/60 (3 transactions / 60 seconds in a minute) and acquiring 1 permit per invocation, or by setting permitsPerSecond to 1 and acquiring 20 permits per invocation. Acquiring 20 permits means that no other request can be served for the next 20 seconds.

The same considerations apply when the rate limit is defined in bytes. The only difference is that, in general, permitsPerSecond is set to the maximum number of bytes that can be handled per second, while acquire() is called with a number of permits equal to the bytes consumed by the request. If, for example, the method receives a file to process as input, the size of that file can be used as the number of permits consumed.
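The equivalence between the two parameter choices comes down to one division: the minimum spacing between requests is the permits consumed per request divided by permitsPerSecond. The helper below is a hypothetical illustration of that arithmetic, not part of any library.

```java
// Hypothetical helper that only illustrates the arithmetic of permits.
class PermitMath {
    // Minimum spacing between two consecutive requests, in seconds.
    static double secondsBetweenRequests(double permitsPerSecond, int permitsPerRequest) {
        return permitsPerRequest / permitsPerSecond;
    }

    public static void main(String[] args) {
        // 3 transactions per minute, expressed in two equivalent ways.
        System.out.println(secondsBetweenRequests(3.0 / 60.0, 1));
        System.out.println(secondsBetweenRequests(1.0, 20));
        // 100 transactions per second, again in two equivalent ways.
        System.out.println(secondsBetweenRequests(100.0, 1));
        System.out.println(secondsBetweenRequests(1000.0, 10));
    }
}
```

Both configurations of each pair give the same spacing (20 seconds and 0.01 seconds respectively), so the choice between them is a matter of readability.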
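One practical detail when permits represent bytes: Guava's acquire(int permits) takes an int and rejects values that are not positive, while file sizes are usually long and may be zero. The helper below is a sketch of one way to bridge the two; the class and method names are hypothetical, and charging an empty file 1 permit is a design assumption rather than a rule.

```java
// Hypothetical helper: converts a payload size into a permit count usable with acquire(int).
class BytePermits {
    static int permitsFor(long sizeInBytes) {
        if (sizeInBytes < 0) {
            throw new IllegalArgumentException("size must not be negative");
        }
        // Clamp to [1, Integer.MAX_VALUE]: an empty payload still costs one permit (assumption),
        // and very large payloads are capped at the largest value an int can hold.
        return (int) Math.max(1L, Math.min(sizeInBytes, (long) Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        System.out.println(permitsFor(0L));             // prints 1
        System.out.println(permitsFor(1024L));          // prints 1024
        System.out.println(permitsFor(5_000_000_000L)); // prints 2147483647
    }
}
```

The value returned would then be passed to acquire() in place of the constant 1 used for transaction-based limits.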

Test Execution

We complete the hello-world service with a rate limit of 3 transactions per minute and add some log messages that help to understand the behavior of the method.

To run a test, we use a shell command that chains 4 curl invocations, the effect of which is to send 4 requests.

From the analysis of the log, notice that the first request received, which in our case is the fourth one sent, is served immediately, while the other 3 are served after 20, 40 and 60 seconds respectively.

Finally, consider instead using the tryAcquire() method with a timeout of 1 second. As expected, all the requests except the first are dropped.
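The waits reported above follow directly from the rate: at 3 transactions per minute each request pushes the next available slot 20 seconds further. The small simulation below reproduces the numbers under the assumption that the 4 requests arrive at the same instant; ThrottleTimeline is a hypothetical name and the code involves no real waiting.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the waits imposed on requests that arrive together.
class ThrottleTimeline {
    // Wait (in seconds) imposed on each of `requests` simultaneous calls.
    static List<Long> waits(int requests, long secondsPerRequest) {
        List<Long> result = new ArrayList<>();
        long nextFree = 0; // seconds until the next slot is available
        for (int i = 0; i < requests; i++) {
            result.add(nextFree);
            nextFree += secondsPerRequest; // each request delays the ones after it
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(waits(4, 60 / 3)); // prints [0, 20, 40, 60]
    }
}
```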
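The same scenario can be sketched for the non-blocking case. The class below is a simplified stand-in, not Guava's implementation: TryLimiter and its constructor parameters are hypothetical, and timestamps are passed in explicitly so that the outcome does not depend on real time. A request is accepted only if its wait fits within the timeout; otherwise it is refused and the limiter is left untouched.

```java
import java.util.concurrent.TimeUnit;

// Simplified non-blocking limiter; NOT Guava's implementation.
class TryLimiter {
    private final long nanosPerPermit;
    private long nextFreeNanos;

    TryLimiter(long permits, long period, TimeUnit unit, long startNanos) {
        this.nanosPerPermit = unit.toNanos(period) / permits;
        this.nextFreeNanos = startNanos;
    }

    // Returns true and books one permit only if the wait fits within the timeout;
    // otherwise returns false and leaves the limiter untouched.
    synchronized boolean tryAcquire(long nowNanos, long timeoutNanos) {
        long wait = Math.max(0L, nextFreeNanos - nowNanos);
        if (wait > timeoutNanos) {
            return false;
        }
        nextFreeNanos = Math.max(nextFreeNanos, nowNanos) + nanosPerPermit;
        return true;
    }

    public static void main(String[] args) {
        // 3 requests per minute, 4 requests arriving together, timeout of 1 second.
        TryLimiter limiter = new TryLimiter(3, 1, TimeUnit.MINUTES, 0L);
        long timeout = TimeUnit.SECONDS.toNanos(1);
        for (int i = 1; i <= 4; i++) {
            System.out.println("request " + i + " accepted: " + limiter.tryAcquire(0L, timeout));
        }
    }
}
```

Only the first request is accepted, because every later one would have to wait 20 seconds, well beyond the 1 second timeout. A real caller that receives true would still sleep for the remaining wait before proceeding.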

Spring Boot Throttling

For Spring Boot specifically, there is the open source library weddini, which lets you manage throttling declaratively through an annotation.

Unlike Guava, this library is more rigid in how it manages throttling: it works by rejecting all requests that arrive once the configured rate limit has been exceeded. In such cases a ThrottlingException is thrown and returned to the caller, and the method is not even executed.

The throttling policy is configured with a simple annotation, @Throttling(), placed on the method that implements the service. Through its parameters it is possible to define not only the required rate limit but also the policy by which throttling is applied.

Before going into the details of the configuration, we apply the library to the simple hello-world service seen above. To do this we must first add the GitHub repository and the library's dependency to the build.

The controller for the service is then implemented in much the same way as before, the only difference being that, through the @Throttling() annotation, control of the rate limit is delegated entirely to the library.

Note that, in the absence of parameters, the default rate limit is one request per second. If we then invoke the service with the same 4 curl invocations seen above, only the first is served, while the others are dropped. The caller then receives a ThrottlingException as a JSON response.
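The reject-instead-of-wait behavior can be illustrated with a fixed-window counter. This is only a sketch of the general idea: the algorithm the library actually uses is not described in this article, and RejectingThrottle, TooManyRequestsException and check() are hypothetical names rather than part of its API.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical fixed-window throttle that rejects instead of queuing.
class RejectingThrottle {
    static class TooManyRequestsException extends RuntimeException {
        TooManyRequestsException(String message) {
            super(message);
        }
    }

    private final int limit;
    private final long windowNanos;
    private long windowStart;
    private int count;
    private boolean started;

    RejectingThrottle(int limit, long period, TimeUnit unit) {
        this.limit = limit;
        this.windowNanos = unit.toNanos(period);
    }

    // Counts the call, or throws if the current window is already full.
    synchronized void check(long nowNanos) {
        if (!started || nowNanos - windowStart >= windowNanos) {
            started = true;
            windowStart = nowNanos; // open a new window
            count = 0;
        }
        if (count >= limit) {
            throw new TooManyRequestsException("rate limit exceeded");
        }
        count++;
    }

    public static void main(String[] args) {
        RejectingThrottle throttle = new RejectingThrottle(1, 1, TimeUnit.SECONDS);
        long now = System.nanoTime();
        throttle.check(now); // first request passes
        try {
            throttle.check(now); // second request in the same window is rejected
        } catch (TooManyRequestsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the check throws before any work is done, the protected method is never entered for a rejected request, which matches the behavior described above.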

Configuration of the Throttling

To configure the throttling policy, note that the basic annotation accepts the following three parameters:

  • The throttling type, which can take the different values described below.
  • The rate limit, which takes integer values.
  • The timeUnit, the unit of time in which the limit is defined; it can take the values defined in the class java.util.concurrent.TimeUnit, that is: days, hours, minutes, seconds, milliseconds, microseconds, and nanoseconds.

Further parameters are added to these depending on the value of the type parameter, as described in the following paragraphs.

Remote Address

This is the default type, in which the rate limit is applied per IP address of the origin of the request, as obtained from HttpServletRequest#getRemoteAddr().

Spring Expression Language (SpEL)

Through a SpEL expression it is possible to use the parameters of the invoked method as the discriminant for applying throttling.

Http Cookie Value

Throttling can be applied based on the value of a cookie present in the request and retrieved with the method HttpServletRequest#getCookies().

Http Header Value

An HTTP header, retrieved from the request using HttpServletRequest#getHeader(), can also be used as the discriminant for throttling.

User Principal Name

Finally, throttling can be discriminated by the user authenticated in the request, obtained using HttpServletRequest#getUserPrincipal().getName().

Final Considerations

It is important to emphasize that the RateLimiter class offered by Guava is certainly more flexible, because it does not rely on the concepts of remote address, cookies, and so on to limit requests, as the weddini library does. This, however, implies that implementing any more selective throttling policy requires ad hoc code. For example, throttling per remote address has to be implemented by hand.

A second important difference is that Guava is able to queue requests rather than reject them. The only drawback is that it is not possible to specify the maximum number of requests that can be held in the queue. This matters because a queued request holds system resources and may therefore jeopardize the availability of the application.

Source Code

The source code of the project is available here: throttling.
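A sketch of what such ad hoc code can look like for the remote address case is shown below. It keeps one independent schedule per address in a ConcurrentHashMap and, to stay runnable without Guava, stores a simple "next free instant" instead of a RateLimiter. PerAddressThrottle and reserve() are hypothetical names, and the address would come from HttpServletRequest#getRemoteAddr() in a real controller.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-address throttle: each remote address gets its own schedule.
class PerAddressThrottle {
    private final long nanosPerRequest;
    private final Map<String, Long> nextFreeByAddress = new ConcurrentHashMap<>();

    PerAddressThrottle(double requestsPerSecond) {
        this.nanosPerRequest = (long) (1_000_000_000L / requestsPerSecond);
    }

    // Returns how long (in nanoseconds) a request from this address has to wait.
    long reserve(String remoteAddress, long nowNanos) {
        long[] wait = new long[1];
        // compute() updates the entry for a key atomically.
        nextFreeByAddress.compute(remoteAddress, (address, nextFree) -> {
            long free = (nextFree == null) ? nowNanos : Math.max(nextFree, nowNanos);
            wait[0] = free - nowNanos;
            return free + nanosPerRequest;
        });
        return wait[0];
    }

    public static void main(String[] args) {
        PerAddressThrottle throttle = new PerAddressThrottle(1.0); // 1 request per second per address
        System.out.println(throttle.reserve("10.0.0.1", 0L)); // prints 0
        System.out.println(throttle.reserve("10.0.0.1", 0L)); // prints 1000000000
        System.out.println(throttle.reserve("10.0.0.2", 0L)); // prints 0
    }
}
```

With Guava, the map values would instead be RateLimiter instances, created on first use with computeIfAbsent(). In either variant the map grows with the number of distinct addresses, so a production version would need some form of eviction.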

Translated by Yandex.Translate and Global Translator
