Riya's Medley #14 - Understanding Rate Limiting for Seamless Performance
A rate limiter is like a bouncer for data or requests. It controls how fast or how often you can access data.
Why do you need a rate limiter?
Prevent system overload - Bots, or a user attempting a DDoS attack or data scraping, can send excessive requests, causing the system to overload and stop working.
Control credential stuffing - By restricting the number of login attempts from a single IP, we can stop credential stuffing attacks. These attacks use automated tools to test username and password combinations until a working one is found.
Stop inventory hoarding attacks - In these attacks, individuals attempt to buy up all the available stock of a product with the intention of reselling it at a higher price.
Most programming languages and frameworks have rate-limiting libraries that can be used in the application.
Python →
flask-limiter, django-ratelimit
Node.js →
express-rate-limit, ratelimiter
Nginx has a built-in rate-limiting configuration feature
These libraries abstract away the algorithmic implementation of the rate limiter, but it’s good to know some of the basic concepts behind it.
Fixed Window
This is the most straightforward algorithm and can be implemented even without any external library. In this algorithm, a limited number of requests is allowed in a fixed interval. For example, if the rate limit is 5 requests/minute, the client can make a maximum of 5 requests in each 1-minute window.
The problem with this approach is that requests can overflow at the edges of the window.
If we consider the time frame between 0.5 and 1.5 minutes, that is also a 1-minute span, but the number of requests served in it can be 6 (for example, 3 late in the first window and 3 early in the second), which is above the allowed limit. To overcome this issue, we have the sliding window algorithm.
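To make the edge problem concrete, here is a minimal fixed window sketch in Python. The class name `FixedWindowLimiter`, the `client_id` key, and the injectable `now` argument are assumptions made for illustration; they are not taken from any of the libraries mentioned above.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per fixed window of `window_seconds`."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self.counters = {}  # client_id -> (window_index, request_count)

    def allow(self, client_id, now=None):
        # `now` can be passed in so the example is deterministic (assumption).
        now = time.time() if now is None else now
        window_index = int(now // self.window_seconds)
        saved_index, count = self.counters.get(client_id, (window_index, 0))
        if saved_index != window_index:
            count = 0  # a new window has started, so the counter resets
        if count >= self.limit:
            self.counters[client_id] = (window_index, count)
            return False
        self.counters[client_id] = (window_index, count + 1)
        return True


limiter = FixedWindowLimiter(limit=5, window_seconds=60)
# Three requests late in the first minute, three early in the second minute.
burst = [30, 40, 50, 60, 70, 80]
results = [limiter.allow("client-a", now=t) for t in burst]
print(results)
# [True, True, True, True, True, True]
```

All six requests fall between the 0.5 and 1.5 minute marks, yet every one is accepted because the counter resets at the 60 second boundary.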
Sliding Window
Sliding window solves the limitation of fixed window by moving the window forward each time a request is made.
Each timestamp is stored when a new request is made, and timestamps older than the window are removed. A request is only allowed if the number of stored timestamps does not exceed the allowed limit. This algorithm is efficient; the only downside is that we need extra space to store all these timestamps.
Cloudflare has an article on how they tweaked this algorithm to mitigate L7 DDoS attacks.
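The timestamp bookkeeping described in this section can be sketched roughly as follows. This is the plain version that stores every timestamp, not Cloudflare's tweaked variant, and the names `SlidingWindowLimiter`, `client_id`, and `now` are hypothetical.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any trailing `window_seconds` span."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self.timestamps = defaultdict(deque)  # client_id -> request times

    def allow(self, client_id, now=None):
        # `now` can be passed in so the example is deterministic (assumption).
        now = time.time() if now is None else now
        log = self.timestamps[client_id]
        # Drop timestamps that have slid out of the trailing window.
        while log and log[0] <= now - self.window_seconds:
            log.popleft()
        if len(log) >= self.limit:
            return False
        log.append(now)
        return True


sliding = SlidingWindowLimiter(limit=5, window_seconds=60)
# Six requests packed into one minute that straddles a minute boundary.
sliding_results = [sliding.allow("client-a", now=t) for t in [30, 40, 50, 60, 70, 80]]
print(sliding_results)
# [True, True, True, True, True, False]
```

The sixth request is rejected because the five earlier timestamps are all still inside the trailing 60 seconds. The cost is one stored timestamp per accepted request, which is the extra space mentioned above.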
Token Bucket
As the name suggests, the idea behind the algorithm is that each IP address or user is assigned a bucket of tokens. Each request consumes a token; when the tokens are exhausted, requests are denied until the bucket is refilled.
So, the Token Bucket Algorithm helps control how often you can make requests. If you try to make too many requests too quickly, you'll run out of tokens and have to wait until more tokens appear. It's like having a system that says, "Hey, you can only do this every so often, and if you try to do it too much, you'll have to wait."
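Here is a minimal token bucket sketch in Python. It assumes one common variant in which tokens are refilled continuously at a fixed rate up to the bucket capacity; the names `TokenBucket`, `allow_request`, `capacity`, and `refill_rate` are hypothetical, and the IP address is a documentation placeholder.

```python
import time


class TokenBucket:
    """Hold up to `capacity` tokens, refilled at `refill_rate` tokens/second."""

    def __init__(self, capacity, refill_rate, now=None):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)  # the bucket starts full
        self.last_refill = time.monotonic() if now is None else now

    def allow(self, now=None):
        # `now` can be passed in so the example is deterministic (assumption).
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last_refill)
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1  # each accepted request spends one token
            return True
        return False


buckets = {}  # one bucket per IP address or user


def allow_request(client_id, now=None):
    if client_id not in buckets:
        buckets[client_id] = TokenBucket(capacity=3, refill_rate=1.0, now=now)
    return buckets[client_id].allow(now=now)


print([allow_request("203.0.113.7", now=0.0) for _ in range(4)])
# [True, True, True, False]
print(allow_request("203.0.113.7", now=1.0))
# True
```

Unlike the window-based approaches, this permits a short burst up to the bucket capacity and then settles to the refill rate.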
There are also other rate limiter algorithms like leaky bucket and sliding window log.
That’s it folks. Thanks for reading!