Rate-limit support in the aiohivebot Python lib.


Some context for aiohivebot

It has been a while since I pushed code to the aiohivebot project. It is one of three interrelated projects that I'm working on:

  • coinzdense: post-quantum signatures for Web-3
  • aiohivebot: an async Python library for HIVE, meant primarily for writing bots and middleware
  • hive-archeology-bot: a simple HIVE bot for personal use that allows the owner to vote on content that is past its 7-day reward window.

Currently my focus is on aiohivebot, but the goal is to let these three projects grow towards each other as a proof-of-concept stack, with the hive-archeology-bot running on top of the aiohivebot library, which in turn uses coinzdense for signing with a post-quantum twist. I'll write a blog post on that last part when I'm further along, but the idea is to demonstrate the viability of hash-based signatures for HIVE by redundantly signing most hive-archeology-bot operations with hash-based signatures in custom_json.

What makes aiohivebot different from beem and lighthive

The aiohivebot library has a somewhat different setup from other Python HIVE libraries. Its design is less suitable for command-line scripts than libraries such as beem and lighthive, but it is significantly better suited for long-running middleware and bots. The core of aiohivebot is made up of a collection of (currently) 13 async tasks, one for each of the HIVE public API nodes. Each node is constantly probed for health, latency and new blocks, and a stream of blocks is fetched from whichever nodes have each block available the fastest. When other API calls are needed during block processing, the fastest of the healthy nodes is used to get the results.
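To make the fastest-node idea concrete, here is a minimal sketch, not aiohivebot's actual internals, of racing one JSON-RPC call against several nodes with plain aiohttp and keeping whichever answer arrives first:

import asyncio
import aiohttp

NODES = ["api.hive.blog", "api.deathwing.me"]

async def call_node(session, node, payload):
    """Send one JSON-RPC request to a single node."""
    async with session.post("https://" + node + "/", json=payload) as resp:
        resp.raise_for_status()
        return node, await resp.json()

async def fastest_result(payload):
    """Race the same call against all nodes and keep the first response."""
    async with aiohttp.ClientSession() as session:
        tasks = [asyncio.create_task(call_node(session, node, payload))
                 for node in NODES]
        done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
        for task in pending:
            task.cancel()  # the slower nodes lose the race
        # Note: a real client would also handle a fastest reply that errored.
        return done.pop().result()

payload = {"jsonrpc": "2.0", "id": 1, "params": [],
           "method": "condenser_api.get_dynamic_global_properties"}
node, result = asyncio.run(fastest_result(payload))
print("fastest node:", node)

The real library additionally weighs per-node health and latency history, which this toy race obviously doesn't.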

If for some reason the bot or middleware needs to be paused or power-cycled, aiohivebot has hooks to keep track of the last block processed, so if that hook's state is persisted, after a reboot the bot can start at that block and not miss any blocks.
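As a sketch of that resume pattern, persisting and restoring the last block number could look like the snippet below; note that the hook name block_processed is my placeholder here, not necessarily the library's real hook:

import json
import os
from aiohivebot import BaseBot

STATE_FILE = "last_block.json"

class ResumableBot(BaseBot):
    """Sketch only; 'block_processed' is a placeholder hook name."""
    def __init__(self):
        start_block = 1
        if os.path.exists(STATE_FILE):
            with open(STATE_FILE) as handle:
                # Resume one block after the last one we fully processed.
                start_block = json.load(handle)["last_block"] + 1
        super().__init__(start_block)

    async def block_processed(self, blockno):
        # Persist progress so a restart can pick up where it left off.
        with open(STATE_FILE, "w") as handle:
            json.dump({"last_block": blockno}, handle)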
For bots and middleware this is a decent setup, but there is, or rather there was, an issue when we want to use this last feature for streaming historic data from the blockchain into our bot.

The problem with streaming historic data

While the aiohivebot Python library was made primarily for processing and responding to live on-chain events in an async and cross-connective way, the start-at-block option that was designed for clean power cycling of middleware and bots also works when we, say, want to stream three months of blocks to gather some interesting stats. I had used the library for that purpose a couple of times, mainly with the hive-archeology-bot migration in mind, and everything worked just fine until a few months ago, when I wanted to look into the subject of the HBD APR.

It seems that HIVE public-API nodes hold up just fine when the library only streams blocks at close to its maximum speed, but when every block also gets queried for virtual operations, things get a bit more involved. I have no insight into the inner workings of HIVE public API nodes, but the fact that peakd stopped working from my home IP while I was running this script is probably a good indication that my script using aiohivebot ended up querying the public API nodes way too aggressively.

It became clear that if I wanted to fully support the start-at-block feature for more than short catch-up bursts, then some kind of rate-limiting would be needed.

My reluctance to work on rate-limiting

Those of you who have been on the chain long enough to remember the STEEM flag-wars may remember that in those days I used to run a little periodic script that made a graphical representation of recent events in the flag-wars.
Those scripts worked with an ancient and much more basic predecessor of aiohivebot called asyncsteem. I had quite some time invested in asyncsteem, an asynchronous Python 2.x library that was built on top of the Twisted framework. Like, as it seems, the current API nodes, the main public API node at that time (run by Steemit Inc) didn't have any kind of rate-limiting headers or standard HTTP error codes, but one day Steemit Inc decided to introduce a rather brutal and primitive rate-limiting implementation, and from that day asyncsteem, my flag-war reports, and some other projects I was working on died an instant death.

This experience at first made me very reluctant to start wading into the streaming of historic blocks again.

HTTP 429 response and the Retry-After header

The simplest form of rate limiting a server can implement leaves the client rather blind, because no indication that any rate limiting is in place reaches the client until the server decides it has had enough.
This form can be OK if the window isn't too big. What happens is that the server sets some kind of per-window quota for the client, keyed on IP or some other property that the server deems unique enough, and once the quota is reached, each consecutive web request from the client results in a 429 Too Many Requests error. This response will usually be accompanied by a Retry-After header.

Imagine that an API node has a 30-second window with a policy of 100 requests per 30 seconds, and that our client sends out 10 requests per second. After 10 seconds our client will get a 429 error with a Retry-After header set to 20 seconds. While not ideal, this rate-limits the client in a crude but effective way. But imagine the same scenario with a 1-hour window and, proportionally, 12,000 requests per hour as quota. At 10 requests per second the client will burn blindly through its 12,000 requests in 20 minutes, after which it will get hit with a Retry-After header telling it to come back after 40 minutes. Even though the policy isn't unreasonable, the fact that the client has no way to know how hard it will hit makes a 1-hour policy with only these headers pretty much a death blow for a block-streaming library like aiohivebot.

So in short, if a node owner implements only 429 responses for rate limiting, they should take care not to choose too big a window for the policies.
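For completeness, here is roughly what honouring a 429 looks like on the client side; this is a generic aiohttp sketch, not aiohivebot's internal retry logic:

import asyncio
import aiohttp

async def post_with_retry(session, url, payload, max_tries=5):
    """Back off and retry when the server answers 429 Too Many Requests."""
    for _ in range(max_tries):
        async with session.post(url, json=payload) as resp:
            if resp.status == 429:
                # Retry-After is normally in seconds; fall back to 1s if it
                # is absent or uses the HTTP-date form instead.
                try:
                    delay = float(resp.headers.get("Retry-After", "1"))
                except ValueError:
                    delay = 1.0
                await asyncio.sleep(delay)
                continue
            resp.raise_for_status()
            return await resp.json()
    raise RuntimeError("rate limited too often, giving up")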

draft-ietf-httpapi-ratelimit-headers and the RateLimit header

There is a draft standard for HTTP rate-limiting headers that has been in the making for a while. Although the latest draft has recently expired, and it's unclear whether work will continue, the headers proposed in the draft form a pretty decent addition to the 429 setup.

While the IETF draft proposes two headers, the most usable one is the RateLimit header, which looks something like this:

RateLimit: limit=100, remaining=50, reset=5

What this header basically tells the client is that in the current, undisclosed time window the client got a quota of 100 requests and there are still 50 left. It also tells the client that the current time window will end in 5 seconds.
This means that the client is free to make 50 requests in the next 5 seconds however it sees fit.
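Parsing the header and turning it into a pacing decision is simple arithmetic; a minimal sketch:

def parse_ratelimit(value):
    """Parse a RateLimit value like 'limit=100, remaining=50, reset=5'."""
    fields = (item.strip().split("=", 1) for item in value.split(","))
    return {key: int(val) for key, val in fields}

info = parse_ratelimit("limit=100, remaining=50, reset=5")
# Spread the remaining quota evenly over the rest of the window:
# 50 requests in 5 seconds means at most one request every 0.1 seconds.
min_interval = info["reset"] / max(info["remaining"], 1)
print("minimum seconds between requests:", min_interval)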

Next to the RateLimit header, the IETF draft also defines the RateLimit-Policy header that should look something like this:

RateLimit-Policy: 10;w=1, 50;w=60, 1000;w=3600, 5000;w=86400

The above example defines four rate-limiting policies with four different time windows; a small parsing sketch follows the list.

  • 10 requests max per 1 second window
  • 50 requests max per 1 minute window
  • 1,000 requests max per 1 hour window
  • 5,000 requests max per 1 day window
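Turning that header line into (quota, window) pairs is mechanical; a small sketch:

def parse_ratelimit_policy(value):
    """Parse '10;w=1, 50;w=60, ...' into (max_requests, window_seconds) pairs."""
    policies = []
    for item in value.split(","):
        quota_part, window_part = item.strip().split(";")
        policies.append((int(quota_part), int(window_part.split("=", 1)[1])))
    return policies

print(parse_ratelimit_policy("10;w=1, 50;w=60, 1000;w=3600, 5000;w=86400"))
# [(10, 1), (50, 60), (1000, 3600), (5000, 86400)]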

The aiohivebot code currently doesn't use or support this header. When HIVE public API nodes start sending these headers, I'll definitely start looking into making proper use of them in a way that best suits aiohivebot.

Client side rate-limiting if the server uses no such headers and codes

So currently the public API nodes don't return any of the above headers, but we still want to rate-limit our client. To accommodate this, I added a client-side rate limiter to aiohivebot that allows for faking RateLimit headers based on a list of static client-side policies like the ones specified in the RateLimit-Policy line above. Basically, aiohivebot comes with a JSON config file that allows a client-side policy to be set on a per-node basis, simulating a server that adds the RateLimit header based on these policies in the expected way.
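The concept behind such a client-side limiter can be sketched as a set of fixed-window counters, one per policy, where a request only goes out when every window still has quota left; this is an illustration of the idea, not aiohivebot's actual implementation:

import asyncio
import time

class ClientSideRateLimiter:
    """Sketch: one fixed-window counter per policy; not aiohivebot's real code."""
    def __init__(self, policies):
        # policies come as dicts like {"w": 60, "v": 240} from the config.
        self.windows = [{"length": pol["w"], "quota": pol["v"],
                         "start": time.monotonic(), "used": 0}
                        for pol in policies]

    async def acquire(self):
        """Wait until every policy window has quota, then consume one request."""
        while True:
            now = time.monotonic()
            wait = 0.0
            for win in self.windows:
                if now - win["start"] >= win["length"]:  # window expired, reset
                    win["start"], win["used"] = now, 0
                if win["used"] >= win["quota"]:  # this window is out of quota
                    wait = max(wait, win["start"] + win["length"] - now)
            if wait == 0.0:
                for win in self.windows:
                    win["used"] += 1
                return
            await asyncio.sleep(wait)

Calling await limiter.acquire() before every request then keeps the client inside all configured policies at once.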

The new hive-nodes.json format

Until the previous version of aiohivebot, the config file hive-nodes.json was just a list of node DNS names. In the latest version the format changed to a dict. Most of the content of the dict is currently used for defining rate-limiting policies, like this:

{
  "api.hive.blog": {
    "policies": [ 
            {"w": 60, "v": 240},
            {"w": 600, "v": 1800},
            {"w": 3600, "v": 9000},
            {"w": 10800, "v": 21600}
    ]
  },
  "api.deathwing.me": {
    "policies": [ 
            {"w": 60, "v": 240},
            {"w": 600, "v": 1800},
            {"w": 3600, "v": 9000},
            {"w": 10800, "v": 21600}
    ]
  }
  ...
}

For now the policies are set like this for each of the nodes; the short check after the list confirms the per-second rates.

  • 240 requests per minute (4 req/sec)
  • 1,800 requests per 10 minutes (3 req/sec)
  • 9,000 requests per hour (2.5 req/sec)
  • 21,600 requests per 3 hours (2 req/sec)
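Those per-second numbers follow directly from the config values:

policies = [{"w": 60, "v": 240}, {"w": 600, "v": 1800},
            {"w": 3600, "v": 9000}, {"w": 10800, "v": 21600}]
for pol in policies:
    print(pol["v"], "requests per", pol["w"], "seconds =",
          pol["v"] / pol["w"], "req/sec")
# 240 requests per 60 seconds = 4.0 req/sec
# 1800 requests per 600 seconds = 3.0 req/sec
# 9000 requests per 3600 seconds = 2.5 req/sec
# 21600 requests per 10800 seconds = 2.0 req/sec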

Let's demonstrate the consequences of these policies with a short script:

import asyncio
from aiohivebot import BaseBot

class MyBot(BaseBot):
    """Example of an aiohivebot python bot without real utility"""
    def __init__(self):
        # Start streaming at this block number (January 2024).
        jan2024 = 82316952
        super().__init__(jan2024, use_virtual=False, maintain_order=False)

    async def monitor_block_rate(self, rate):
        # Invoked by the library roughly once a minute with the current rate.
        print("block rate =", rate, "blocks/second")

pncset = MyBot()
asyncio.run(pncset.run())

This is basically a do-nothing script that just prints the block processing rate approximately once every 60 seconds.

block rate = 4.51 blocks/second
block rate = 4.66 blocks/second
block rate = 4.65 blocks/second
block rate = 4.56 blocks/second
block rate = 4.71 blocks/second
block rate = 4.56 blocks/second
block rate = 4.69 blocks/second
block rate = 3.41 blocks/second
block rate = 1.84 blocks/second
block rate = 2.31 blocks/second
block rate = 4.17 blocks/second
block rate = 4.78 blocks/second
block rate = 4.67 blocks/second
block rate = 4.61 blocks/second
block rate = 4.66 blocks/second
block rate = 4.66 blocks/second
block rate = 4.63 blocks/second
block rate = 3.31 blocks/second
block rate = 1.82 blocks/second
block rate = 2.38 blocks/second
block rate = 4.82 blocks/second
block rate = 4.7 blocks/second
block rate = 4.79 blocks/second
block rate = 4.77 blocks/second
block rate = 4.69 blocks/second
block rate = 4.74 blocks/second
block rate = 4.03 blocks/second
block rate = 3.15 blocks/second
block rate = 1.96 blocks/second
block rate = 2.78 blocks/second
block rate = 4.56 blocks/second
block rate = 4.65 blocks/second
block rate = 4.01 blocks/second
block rate = 4.49 blocks/second
block rate = 4.57 blocks/second
block rate = 4.52 blocks/second
block rate = 4.46 blocks/second
block rate = 3.01 blocks/second
block rate = 1.73 blocks/second
block rate = 2.67 blocks/second
block rate = 4.71 blocks/second
block rate = 4.69 blocks/second
block rate = 4.56 blocks/second
block rate = 4.47 blocks/second
block rate = 4.44 blocks/second
block rate = 4.69 blocks/second
block rate = 3.82 blocks/second
block rate = 3.44 blocks/second
block rate = 1.61 blocks/second
block rate = 1.57 blocks/second
block rate = 1.99 blocks/second
block rate = 1.92 blocks/second
block rate = 1.99 blocks/second
block rate = 1.98 blocks/second
block rate = 1.92 blocks/second
block rate = 1.91 blocks/second
block rate = 1.92 blocks/second
block rate = 1.88 blocks/second
block rate = 1.98 blocks/second
block rate = 3.17 blocks/second
block rate = 4.55 blocks/second
block rate = 4.63 blocks/second
block rate = 4.63 blocks/second
block rate = 4.7 blocks/second
block rate = 4.78 blocks/second
block rate = 4.85 blocks/second
block rate = 4.19 blocks/second
block rate = 2.94 blocks/second
block rate = 1.95 blocks/second
block rate = 3.4 blocks/second
block rate = 4.84 blocks/second
block rate = 4.71 blocks/second
block rate = 4.84 blocks/second
block rate = 4.69 blocks/second
block rate = 4.66 blocks/second
block rate = 4.66 blocks/second
block rate = 3.86 blocks/second
block rate = 2.92 blocks/second
block rate = 1.88 blocks/second
block rate = 3.37 blocks/second
block rate = 4.64 blocks/second
block rate = 4.71 blocks/second
block rate = 4.74 blocks/second
block rate = 4.7 blocks/second
block rate = 4.73 blocks/second
block rate = 4.76 blocks/second
block rate = 3.93 blocks/second
block rate = 2.81 blocks/second
block rate = 1.85 blocks/second
block rate = 3.54 blocks/second
block rate = 4.64 blocks/second
block rate = 4.79 blocks/second
block rate = 4.7 blocks/second
block rate = 4.66 blocks/second
block rate = 4.66 blocks/second
block rate = 4.5 blocks/second
block rate = 3.76 blocks/second
block rate = 2.39 blocks/second
block rate = 1.78 blocks/second
block rate = 3.56 blocks/second
block rate = 3.77 blocks/second
block rate = 4.64 blocks/second
block rate = 4.81 blocks/second
block rate = 4.73 blocks/second
block rate = 4.69 blocks/second
block rate = 4.61 blocks/second
block rate = 3.97 blocks/second
block rate = 3.47 blocks/second
block rate = 1.68 blocks/second
block rate = 1.79 blocks/second
block rate = 1.89 blocks/second
block rate = 1.89 blocks/second
block rate = 1.84 blocks/second
block rate = 1.91 blocks/second
block rate = 1.94 blocks/second
block rate = 1.9 blocks/second
block rate = 1.87 blocks/second

At a glance, the block processing rate fluctuates quite a bit, but not too much. If you look closely, you should be able to see how the fluctuations follow a pattern that matches the policies: the rate periodically dips when one of the window quotas runs out and recovers once that window resets.

Input from node operators appreciated

If you are reading this and happen to operate a public API node, please let me know what you think about the current client-side rate-limiting policies for the nodes, and for your node in particular. Are the policies OK? Can I increase the rate limits? Should I decrease them? If you have the option to add the header fields, that would be amazing, but if you don't, please let me know what client-side policies you would like me to put in the config for your node.

Support and merchandise

If you appreciate my work on aiohivebot, or on the coinzdense/aiohivebot/hive-archeology-bot stack as a whole, I would very much like to work more on these projects, and I could really use your support to allow myself to allocate more time to them. I've got (normal and legal) bills to pay, so I often need to choose activities that make me money over activities I'm passionate about, but with your support the second can become the first again.

If you appreciate my work on these projects, please consider supporting them.

If you have your own projects, are in need of some Python and/or C++ expertise, and have other skill sets to contribute to my projects, that's very interesting too. We can support each other's projects with an hour swap.

While I prefer to work on these projects, if instead of supporting my open-source work you are in need of my skills, I'm available for short contracts, with a discount if the project I'll be working on will be open source.


Comments

My node is set to limit per IP to 50 calls per second (barring any extra Cloudflare anti-bad-actor measures). It should return a proper 429 response explaining the situation, as well as a proper Retry-After header, so from what I can see you should be able to increase the limits there.
