This algorithm divides time into fixed durations/windows. For example each
window is 10 seconds long. When a new request comes in, the current time is used
to determine the window and a counter is increased. If the counter is larger
than the set limit, the request is rejected.
Builds on top of fixed window but instead of a fixed window, we use a rolling
window. Take this example: We have a rate limit of 10 requests per 1 minute. We
divide time into 1 minute slices, just like in the fixed window algorithm.
Window 1 will be from 00:00:00 to 00:01:00 (HH:MM:SS). Let’s assume it is
currently 00:01:15 and we have received 4 requests in the first window and 5
requests so far in the current window. The approximation to determine if the
request should pass works like this:
Copy
Ask AI
limit = 10# 4 request from the old window, weighted + requests in current windowrate = 4 * ((60 - 15) / 60) + 5 = 8return rate < limit # True means we should allow the request
Consider a bucket filled with maximum number of tokens that refills constantly
at a rate per interval. Every request will remove one token from the bucket and
if there is no token to take, the request is rejected.