Providing Rate-Limiting Via Bucket4j in Java

2 February, 2022

No Comments

Token Bucket is an algorithm used in packet-switched and telecommunications networks. It can be used to check that data transmissions, in the form of packets, conform to defined limits on bandwidth and burstiness . It can also be used as a scheduling algorithm to determine the timing of transmissions that will comply with the limits set for the bandwidth and burstiness

In a review by DZone, Maxim Bartkov, a Java Team Lead, RooX Solutions, gave a more thorough explanation of the “Token Bucket” Algorithm, where it is needed and what are the disadvantages of implementing it.

Every single user has, at least one in a lifetime, faced a problem with limiting their external API – a functionality where we should limit calling to our API for many reasons. Here is what the review explained:

Where Is rate-limiting Needed?

Fraud detection (Protection from Bots): For instance, if we have a website and we want to prevent spam from customers.
From the point of business logic, usually, it is used to realize the “API Business Model:” When we need to introduce tarification functionality for our external API, and we want to create a few tariffs. For each tariff, we set a call count limit per hour, such as:

START – up to 100 calls per hour
STANDARD – up to 10000 per hour
BUSINESS – up to 100000 per hour

These are the main reasons why to implement rate-limiting, however, there are many other reasons to use it. If we want to realize this function, we can use several algorithms. However, in this review we will only speak about “Token Bucket Algorithm”.

An Explanation

Let’s consider the algorithm in the following example.

Photo Credits: DZone

Bucket: It has a fixed volume of tokens (if you set 1000 tokens into our bucket, this is the maximum value of volume).
Refiller: Regularly fills missing tokens based on bandwidth management to Bucket (calling every time before Consume).
Consume: It takes away tokens from our Bucket (take away 1 token or many tokens (depends on the weight of calling consume method, it is a customizable and flexible variable, but in 99% of cases, we need to consume only one token).

Here is an example of a refiller working with Bandwidth management to refresh tokens every minute:

Photo Credits: DZone

Consume (as action) takes away Tokens from Bucket. The bucket is needed for storing a current count of Tokens, maximum possible count of tokens, and refresh time to generate a new token.

The Token Bucket algorithm has fixed memory for storing Bucket, and it consists of the following variables:

The volume of Bucket (maximum possible count of tokens) – 8 bytes
The current count of tokens in a bucket – 8 bytes
The count of nanoseconds for generating a new token – 8 bytes
The header of the object: 16 bytes

In total: 40 bytes.

For instance, if we have one gigabyte, we can store 25 million buckets.

The Disadvantage of the Algorithm

Despite all the positives, the algorithm is not perfect. The main problem of the Token Bucket algorithm is called “Burst.” The following example explains:

At some moments, our bucket contains 100 tokens.
At the same time, we consume 100 tokens.
After one second, the refiller again fills 100 tokens.
At the same time, we consume 100 tokens.

By ~1 second we consumed 200 tokens and accordingly, we exceeded the limit x2 times! However, this is not a problem at all.

Explanation: If we use our Bucket for only one second, we are over-consuming tokens x2 times (200 tokens), but if we use our bucket for 60 seconds, consumption of the bucket is equal to about 6100 seconds because the Burst problem happened only once. The more you use the bucket, the better it is for its accuracy.

The most important is consuming memory because we have a problem related to “Burst.” A bucket has a requirement about fixed memory size (40 bytes), and we face a problem of “Burst” because to create Bucket we need 2 variables: count of nanoseconds to generate a new token (refill) and volume of a bucket (capacity) – because of that, we can’t realize accuracy contract of Token Bucket.

To understand how to implement the “Token Bucket” algorithm, you can follow the Maxim Bartkov tutorial published on DZone.