[Distributed Rate Limiting] Have you been tripped up by the 12306 CAPTCHA?


Table of Contents

  • Preface
  • Basic concepts
  • Solutions
    • Rate limiting with Guava
    • Rate limiting at the gateway layer
    • Rate limiting with middleware
  • Common rate-limiting algorithms
    • Token bucket algorithm
    • Leaky bucket algorithm
  • Rate limiting in practice
    • Guava rate limiting in practice
    • Nginx rate limiting in practice
    • Rate-limiting component based on Redis + Lua (omitted)
  • Closing words


I suspect many developers at small and medium-sized companies, or at B2B companies, have never dealt with rate limiting first-hand, yet rate limiting is all around us in everyday software. Many of you will have bought train tickets on 12306 to go home for the holidays, so the picture below should look very familiar.

That's right, the infuriating CAPTCHA, especially during ticket-grabbing season (although grabbing tickets was much easier during this year's epidemic). No matter what you pick, you are told you picked wrong, or you are handed a set of pictures so blurry even a ghost couldn't make them out; sometimes it genuinely makes you question life. This is in fact a rate-limiting measure: by deliberately making things hard for users, it honestly throttles traffic and greatly reduces the access pressure on the system.

Basic concepts

Through the example above, Lao Mao hopes you now have at least a rough feel for rate limiting, so let's take a closer look at its dimensions.

In general, a rate-limiting scenario involves two dimensions of information:

  1. Time: the limit is defined over a period or point in time, the "time window", e.g. per minute or even per second.
  2. Resources: the limit is on available resources, e.g. a maximum number of requests or a maximum number of available connections.

Combining these two dimensions gives a simple definition: rate limiting restricts access to a resource within a time window, for example at most 100 requests per second. In real scenarios, though, we rarely set just one rule; we combine rules across multiple dimensions.
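The definition above can be sketched as a fixed-window counter, the simplest possible rate limiter. This is an illustrative sketch (the class and method names are mine, not from any library): it allows at most `limit` requests within each time window on a single machine.

```java
// A minimal fixed-window rate limiter: at most `limit` requests per window.
class FixedWindowLimiter {
    private final int limit;          // max requests allowed per window
    private final long windowMillis;  // window length in milliseconds
    private long windowStart = System.currentTimeMillis();
    private int count = 0;            // requests seen in the current window

    FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    synchronized boolean tryPass() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            windowStart = now;  // a new window begins: reset the counter
            count = 0;
        }
        return ++count <= limit;
    }
}
```

Note that a fixed window lets bursts straddle the window boundary, which is one reason the token-bucket and leaky-bucket algorithms described later are preferred in practice.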

QPS and connection-count control

For both connection count and request frequency (QPS), we can set limits per IP, or per server. In practice we usually combine rules across several dimensions: for example, the same IP must make fewer than 10 requests per second and hold fewer than 5 connections, while each machine is capped at 1000 QPS and 200 connections. Going further, we can treat a whole server group or data center as a unit and layer higher-level rules on top. Lao Mao gives a concrete demo of this scenario later in the article.

Transfer rate

Transfer-rate limits should be familiar to everyone: some download sites give non-members only a few KB/s while members get tens of MB/s. That is rate limiting based on user tier. Nginx can also limit transfer rate; see the demo later in this article.

Blacklists and whitelists

Blacklists and whitelists are a common way for companies to restrict or admit traffic, and they are often dynamic. If an IP accesses the system too frequently within a period and is identified by the system as a bot or a traffic attack, that IP is added to the blacklist and blocked from system resources; this is commonly known as IP banning. Back to train tickets: many people use third-party ticket-grabbing software, and you may have noticed that it sometimes gets locked in a "little black room" for a while and then decisively released. That is the blacklist changing dynamically.

Whitelists need no explanation: a whitelist is a free pass that lets traffic through every rate-limiting rule.

Distributed rate limiting

Most systems today are distributed, and Lao Mao has previously shared the distributed lock mechanism with you. So what is distributed rate limiting? It is quite simple: unlike the single-machine case, all servers in the distributed environment are treated as a whole. Take IP-based limiting: if we allow an IP at most 100 requests per second, then no matter which machine a request lands on, access to any service node in the cluster counts against that limit.

Accordingly, distributed rate-limiting schemes generally come in two flavors:

  • Gateway-layer limiting: the rules are applied at the traffic entrance.
  • Middleware limiting: using middleware such as a Redis cache, every component reads the current traffic statistics from one place to decide whether to admit or reject a request.


In this article, Lao Mao mainly introduces three solutions.

Solution 1: Rate limiting with Guava

Many of you know Guava, a utility library from Google. We usually reach for it for collection helpers or in-memory caching, but beyond these basics Guava also covers reflection utilities, functional idioms, math utilities, and more. It contributes to rate limiting too: the concurrency module provides several supporting classes, led by RateLimiter. Note that Guava is a client-side component: its scope is limited to the current server, and it cannot limit traffic on other servers in the cluster. A simple example appears in the hands-on section later in this article.

Solution 2: Rate limiting at the gateway layer

Let's look at a picture first; strictly speaking it is a funnel model:

The figure shows the normal path of an everyday request:

  1. User traffic passes from the gateway layer to the backend service.
  2. The backend service takes over the traffic and calls the cache for data.
  3. On a cache miss, the query falls back to the database.

Why call it a funnel model? Simple: traffic shrinks from top to bottom. The densest user requests gather at the gateway layer; next comes the backend service, where validation filters out some bad requests; the remainder hit the cache; only cache misses reach the database, so the database sees the lowest request frequency. Later, Lao Mao demonstrates gateway-layer limiting with Nginx.

Solution 3: Rate limiting with middleware

For developers, gateway-layer limiting requires cooperation from the ops team, but young people these days have a strong desire for control, so most developers prefer to limit traffic at the code level. Here the middleware Redis becomes the best choice. We can use Redis key expiration to define the limit's time span (for example, one request per second, or 10 requests per 10 seconds). Redis also has a special skill, script programming: we can write the limiting logic as a Lua script and plant it inside Redis, which moves the burden of rate limiting out of the service layer entirely. Meanwhile, Redis's strong concurrency characteristics and highly available cluster architecture can support rate-limited access for very large clusters.

Common rate-limiting algorithms

Algorithm 1: Token bucket algorithm

The token bucket algorithm is currently the most widely used rate-limiting algorithm. As the name suggests, it has two roles:

  • Token: only requests holding a token are processed; other requests either wait in line or are discarded.
  • Bucket: the place tokens live; every request tries to take its token from the bucket.

Simple token processing is shown in the figure below:

Token generation

This part involves a token generator and a token bucket. The generator adds tokens to the bucket at a predetermined rate, say 100 tokens per second, issued uniformly over time. Think of the generator as a faucet and the token bucket as, well, a bucket: once the bucket is full, further water overflows. Token issuance works the same way: the bucket's capacity is limited, and once it is full, newly generated tokens are discarded. (As an exercise, think about whether a fixed issuance rate hides any pitfalls.)

Token acquisition

Each arriving request must obtain a token before executing the logic behind it. When tokens are scarce and requests are plentiful, some requests naturally cannot get one. One option is a buffer queue to hold the overflow requests; the queue is optional, and not every token-bucket implementation has one. With a queue, requests that failed to get a token wait in it until new tokens are produced, and the request at the head of the queue is matched with each fresh token. Once the queue is full, further requests are discarded. In practice the queue can carry extra attributes, such as a time-to-live for queued requests, or priority ordering.
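Putting token generation and acquisition together, a minimal single-machine token bucket can be sketched as follows. This is an illustrative sketch without the optional buffer queue (the names are mine, not from any library); tokens are refilled lazily at a fixed rate, capped at the bucket capacity, and a request passes only if it can take a token.

```java
// A minimal token bucket: constant-rate refill, bounded capacity.
class TokenBucket {
    private final long capacity;       // max tokens the bucket can hold
    private final double refillPerMs;  // tokens generated per millisecond
    private double tokens;             // current token count
    private long lastRefill;

    TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerMs = tokensPerSecond / 1000.0;
        this.tokens = capacity;        // start full: the "pre-stored" tokens
        this.lastRefill = System.currentTimeMillis();
    }

    synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Lazy refill: add the tokens accumulated since the last call,
        // discarding any overflow beyond capacity (the "full bucket" case).
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerMs);
        lastRefill = now;
        if (tokens >= 1) {
            tokens -= 1;   // a token is consumed; the request passes
            return true;
        }
        return false;      // no token available; queue or discard the request
    }
}
```

Because the bucket starts full, a burst can drain up to `capacity` tokens at once, which is exactly the burst-absorbing behavior discussed below.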

Algorithm 2: Leaky bucket algorithm

The schematic diagram is as follows:

The leaky bucket's decision logic resembles the token bucket's, but the object in the bucket differs: the token bucket holds tokens, while the leaky bucket holds requests. What they share is that once the bucket is full, further arrivals are discarded. The leaky bucket lets requests flow out of the bucket only at a constant rate.

Similarities and differences between the two algorithms:

  • Common point: both have one "constant" rate and one "variable" rate. The token bucket creates tokens at a constant rate, while the rate at which requests consume tokens varies: as long as tokens remain they are handed out, and beyond that requests must wait. The leaky bucket processes requests at a constant rate, while the rate at which requests flow into the bucket varies.
  • Difference: the leaky bucket by its nature cannot produce burst traffic; even at 1000 incoming requests per second, the output rate toward the downstream service stays constant. The token bucket is different: it can "pre-store" a number of tokens, all of which can be consumed in a short time to absorb a traffic burst. It handles bursts more efficiently than the leaky bucket, at the cost of greater pressure on the backend system.
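For comparison, a leaky bucket can be sketched in the same style (again an illustrative sketch; the names are mine). Here the bucket holds requests rather than tokens: arrivals raise the "water level", the bucket drains at a constant rate, and arrivals that would overflow the bucket are dropped.

```java
// A minimal leaky bucket: bounded request bucket, constant-rate drain.
class LeakyBucket {
    private final int capacity;      // max requests the bucket can hold
    private final double leakPerMs;  // requests drained per millisecond
    private double water = 0;        // current water level (pending requests)
    private long lastLeak = System.currentTimeMillis();

    LeakyBucket(int capacity, double leaksPerSecond) {
        this.capacity = capacity;
        this.leakPerMs = leaksPerSecond / 1000.0;
    }

    synchronized boolean tryAdd() {
        long now = System.currentTimeMillis();
        // Drain at a constant rate since the last check.
        water = Math.max(0, water - (now - lastLeak) * leakPerMs);
        lastLeak = now;
        if (water + 1 <= capacity) {
            water += 1;    // the request fits in the bucket
            return true;
        }
        return false;      // bucket full: drop the request
    }
}
```

Notice the structural symmetry with the token bucket above, and the behavioral difference: no matter how fast `tryAdd` is called, accepted requests leave the bucket at the constant drain rate, so the downstream never sees a burst.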

Rate limiting in practice

Guava rate limiting in practice

pom dependency:
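A typical Guava dependency looks like the following (the version shown here is illustrative; any recent release provides RateLimiter):

```xml
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>30.1-jre</version>
</dependency>
```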


demo code:

    // Reconstructed as a Spring Boot controller (the demo runs on port 10088);
    // @Slf4j is Lombok's logger annotation.
    @Slf4j
    @RestController
    public class KTController {
        // issue 2 permits (tokens) per second
        private final RateLimiter limiter = RateLimiter.create(2.0);

        // non-blocking: fail immediately if the permits are unavailable
        @GetMapping("/tryAcquire")
        public String tryAcquire(Integer count) {
            if (limiter.tryAcquire(count)) {
                log.info("success, rate is {}", limiter.getRate());
                return "success";
            }
            log.info("fail, rate is {}", limiter.getRate());
            return "fail";
        }

        // non-blocking, but wait up to `timeout` seconds for the permits
        @GetMapping("/tryAcquireWithTimeout")
        public String tryAcquireWithTimeout(Integer count, Integer timeout) {
            if (limiter.tryAcquire(count, timeout, TimeUnit.SECONDS)) {
                log.info("success, rate is {}", limiter.getRate());
                return "success";
            }
            log.info("fail, rate is {}", limiter.getRate());
            return "fail";
        }

        // blocking: wait as long as needed for the permits
        @GetMapping("/acquire")
        public String acquire(Integer count) {
            limiter.acquire(count);
            log.info("success, rate is {}", limiter.getRate());
            return "success";
        }
    }

Lao Mao wrote a few simple demos of Guava single-machine rate limiting, which splits mainly into blocking and non-blocking styles. Start the project and vary the input parameters to watch the log output change.

As an example, let's analyze the result of the request localhost:10088/tryAcquire?count=2; the log after several requests looks like this:

2021-02-18 23:41:48.615  INFO 5004 --- [io-10088-exec-9] com.ktdaddy.KTController: success, rate is 2.0
2021-02-18 23:41:49.164  INFO 5004 --- [io-10088-exec-2] com.ktdaddy.KTController: success, rate is 2.0
2021-02-18 23:41:49.815  INFO 5004 --- [o-10088-exec-10] com.ktdaddy.KTController: success, rate is 2.0
2021-02-18 23:41:50.205  INFO 5004 --- [io-10088-exec-1] com.ktdaddy.KTController: fail, rate is 2.0
2021-02-18 23:41:50.769  INFO 5004 --- [io-10088-exec-3] com.ktdaddy.KTController: success, rate is 2.0
2021-02-18 23:41:51.470  INFO 5004 --- [io-10088-exec-4] com.ktdaddy.KTController: fail, rate is 2.0

From the log we find, curiously, that the first two requests were less than a second apart, yet both consumed tokens successfully. Why? I'll keep you in suspense for now; Lao Mao will cover Guava's traffic warm-up model with you later.

Nginx rate limiting in practice

The rate-limiting configuration in nginx.conf is as follows:

# Request-rate limiting keyed by client IP
# 1. $binary_remote_addr: the client IP in binary form (more compact than $remote_addr)
# 2. zone=iplimit:20m: a 20 MB shared-memory zone named "iplimit" that tracks per-IP state
# 3. rate=10r/s: at most 10 requests per second per IP (a per-minute form such as 100r/m also works)
limit_req_zone $binary_remote_addr zone=iplimit:20m rate=10r/s;

# Request-rate limiting keyed by server name (all clients combined)
limit_req_zone $server_name zone=serverlimit:10m rate=1r/s;

# Connection-count limiting keyed by client IP
limit_conn_zone $binary_remote_addr zone=perip:20m;

# Connection-count limiting keyed by server name
limit_conn_zone $server_name zone=perserver:20m;

# Virtual host limit-rate.ktdaddy.com (map the host name locally to test)
server {
        server_name  limit-rate.ktdaddy.com;

        # Request-rate limit per IP
        # 1. zone=iplimit => references the limit_req_zone named "iplimit" above
        # 2. burst=2 => allow a queue of up to 2 excess requests
        # 3. nodelay => reject excess requests immediately instead of delaying them
        limit_req zone=iplimit burst=2 nodelay;

        # Request-rate limit for the whole server, layered on the per-IP rule
        limit_req zone=serverlimit burst=1 nodelay;

        # At most 100 concurrent connections per server
        limit_conn perserver 100;

        # At most 1 concurrent connection per client IP
        limit_conn perip 1;

        # Return 504 instead of the default 503 when a request is rejected
        limit_req_status 504;
        limit_conn_status 504;

        # Bandwidth limit: after the first 100m of a response,
        # cap the transfer rate at 256k/s
        limit_rate_after 100m;
        limit_rate 256k;
}
The configuration above combines four Nginx rate-limiting mechanisms: per-IP request limiting, per-server request limiting, per-IP connection limiting, and per-server connection limiting, plus the download speed limit mentioned earlier. If you are interested, reproduce the configuration and get a feel for Nginx-based rate limiting yourself.

Rate-limiting component based on Redis + Lua

This topic is important enough that Lao Mao has decided to share it as a standalone piece, so it is omitted here for now.

Closing words

This article covered the concepts of rate limiting, its use in practice, and the related algorithms and implementations. The Redis + Lua implementation is not included here; personally I find it practical enough to deserve its own article, so stay tuned.