RE: Hive Scaling: 30x More Capacity

in LeoFinance, 9 months ago

64kB * 32 = 2MB, but that is just the block size limit. What was actually tested is 180 to 200 times more traffic than we normally have on mainnet, and with the whole stack at that. The Python script prepares transactions, signs them with BeeKeeper, and sends them to an API node (which was also the block producer); that node then communicates with a HAF node that consumes incoming blocks and fills up the database, all on a single computer. That's way more work than a server with a node would be doing in a normal environment.

Before that, the silent bee guy who prepared the script (can you imagine working on Hive and not even having a Hive account? most people on the team are like that 😡) accidentally tested the old AH RocksDB, because he forgot that test-tools add it to the configuration by default (it didn't make much difference in performance with or without it). Before the first test on HAF was done, the expectation was that it would break on the first big block due to the size of the SQL code produced by the serializer to be processed in a single SQL transaction. It turned out that not only does the serializer not dish out as much data as assumed, but PostgreSQL can handle orders of magnitude more (according to its documentation). That was a very positive surprise.

Because in a normal environment you'd not have all the work of the whole network done on a single computer, the test does not fully reflect our limitations (we can do more). I also have some objections to the transaction mix the script produces. It dishes out 3-4k transactions per block, and not very consistently, due to all the extra work. That's why in the meantime I've made the colony plugin, which can skip most of the extra work and, because it runs inside the node, can be smart about its production rate. We've run tests with it too, although not with HAF yet, since the plugin is not yet merged into develop, which makes it harder to include in the HAF build stack. Because it skips some steps (e.g. it is not limited by the API) and is easier to configure than the script (and was actually configured to reflect a transaction mix closer to that of mainnet), during the test it consistently filled 2MB blocks with around 7250 transactions per block.

This is where we ran into some limitations. First, the p2p plugin has a one-minute deep limit per peer that does not allow for more than 1000 transactions per second. Of course we continued testing with that limit increased a hundredfold (and also separately with 4 concurrent nodes with colony enabled, to see if it would work without touching the limit if we just spread the load between multiple peers, which is what you'd expect in a normal network setting). We know there is still some other limit in p2p, which I didn't have time to identify, because we've also tested with 30k transactions per block (you can fit that many when those are minimal transactions "signed" with accounts that use open authority, that is, require no signature - only useful to test the influence of the number of transactions, won't ever happen in real life). That works as long as it is all on a single node, but when colony is run on one node and the other node is a witness, the p2p communication (even after the per-peer limit was increased) is still choking somewhere.
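As a rough back-of-the-envelope check of the numbers above, here is a minimal sketch. It assumes Hive's 3-second block interval (not stated in this comment), reads "kB" as 1024 bytes, and treats the per-peer p2p limit as a flat average rate rather than a one-minute window. The function and constant names are made up for illustration and are not taken from hived.

```python
import math

# Assumption: one block every 3 seconds (Hive's block interval; not stated above).
BLOCK_INTERVAL_S = 3

# Block size limits, reading "64kB" as 64 * 1024 bytes (assumption).
OLD_BLOCK_LIMIT = 64 * 1024
NEW_BLOCK_LIMIT = OLD_BLOCK_LIMIT * 32  # 2 MB

# Per-peer p2p cap quoted in the comment, simplified to a flat rate.
P2P_PEER_LIMIT_TPS = 1000


def tps(tx_per_block, interval=BLOCK_INTERVAL_S):
    """Average transactions per second needed to fill blocks at this rate."""
    return tx_per_block / interval


def peers_needed(tx_per_block, per_peer_tps=P2P_PEER_LIMIT_TPS,
                 interval=BLOCK_INTERVAL_S):
    """Minimum number of peers so that no peer exceeds the per-peer cap."""
    per_peer_per_block = per_peer_tps * interval
    return math.ceil(tx_per_block / per_peer_per_block)


# Transaction counts per block mentioned in the comment.
scenarios = {
    "script (upper end)": 4000,
    "colony, mainnet-like mix": 7250,
    "colony, minimal transactions": 30000,
}

for name, count in scenarios.items():
    print(f"{name}: {tps(count):.0f} tx/s, "
          f"{peers_needed(count)} peer(s) at the default cap, "
          f"{NEW_BLOCK_LIMIT / count:.0f} bytes/tx if the block is full")
```

Under these assumptions, 7250 transactions per block is roughly 2400 per second, which is above a single peer's 1000/s cap but fits within 3 peers, so spreading the load over 4 colony nodes stays under it. 30k per block is about 10000 per second, well below the hundredfold-raised cap, which is consistent with some other p2p limit being the one that chokes.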

At the moment I'm pretty confident that not only are the 2MB blocks no challenge, but we can still increase performance (and then use that buffer to implement some aggressive memory optimizations, so in the end we would work at around the same pace but with less RAM consumed). The question is whether we actually decide to do it, and if so, whether there will be enough human resources to complete the task. I've recently added a meta issue with my wish list for hived (@taskmaster4450 it contains bonds, although not quite like you've described them 😉).

Thanks for the write-up. I only went by what Blocktrades said about the 30x, not presuming that it was a 1:1 increase in throughput. Obviously, block size is only one factor and, as you detailed, there are many more being worked on.

It is good that you saw the 180x traffic during testing. That is very positive. Scaling is something that so many focus on, often only after bottlenecks appear. It is good to see what is going on before Hive is full.

I will take a look at the wish list, especially the bonds. Perhaps another article will result.