Coming back to your question regarding the differences between the token
bucket algorithm and our current quota mechanism. I did some tests and
they confirmed my first intuition that our current mechanism does not work
well with a bursty workload. Let me try to illustrate the difference with an
example. One important aspect to keep in mind is that we don't want to
reject requests when the quota is exhausted.
Let's say that we want to guarantee an average rate R=5 partitions/sec while
allowing a burst B=500 partitions.
With our current mechanism, this translates to the following parameters:
- Quota = 5
- Samples = B / R + 1 = 101 (to allow the burst)
- Time Window = 1s (the default)
Now, let's say that a client wants to create 7 topics with 80 partitions at
time T. It brings the rate to 5.6 (7 * 80 / 100), which is above the quota,
so any new request is rejected until the rate gets back to R. In theory, the
client must wait 12 secs ((5.6 - 5) / 5 * 100) to bring it back to R. In
practice, due to the sparse samples (one sample worth 560), the rate won't
decrease until
that sample is dropped, which only happens after 101 secs. It gets worse if
the number of samples is increased to allow a larger burst.
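To make the stall concrete, here is a rough sketch of a sampled rate
measurement. This is an illustration, not Kafka's actual Rate/SampledStat
code: values land in 1-second samples and the rate is the windowed sum
divided by the window length. The 560-partition burst sits in one sample, so
the measured rate stays at 5.6 until that whole sample ages out of the
window.

```python
# Sketch of a sampled rate (illustrative, not Kafka's implementation).
SAMPLES = 101  # B / R + 1, as in the example
WINDOW = 1     # seconds per sample

def rate_at(events, now):
    """Rate over the trailing window, from (timestamp, value) events."""
    window_start = now - (SAMPLES - 1) * WINDOW
    total = sum(v for t, v in events if t >= window_start)
    return total / ((SAMPLES - 1) * WINDOW)

events = [(0, 7 * 80)]              # the burst: 560 partitions at T=0
assert rate_at(events, 0) == 5.6    # 560 / 100, above the quota of 5
assert rate_at(events, 100) == 5.6  # unchanged: the sample is still in the window
assert rate_at(events, 101) == 0.0  # only drops once the whole sample is dropped
```

Note how the rate does not decay gradually: it is pinned at 5.6 for the full
window and then collapses to 0, which is exactly the behaviour that hurts a
bursty client.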
With the token bucket algorithm, this translates to the following parameters:
- Rate = 5
- Tokens = 500
The same request decreases the number of available tokens to -60 which is
below 0 so any new request is rejected until the number of available tokens
gets back above 0. This takes about 12 secs ((60 + 1) / 5).
The token bucket algorithm is better suited to bursty workloads, which is
our case here. I hope that this example helps to clarify the choice.
On Tue, May 12, 2020 at 3:19 PM Tom Bentley <[EMAIL PROTECTED]> wrote: