RE: Exploring Steem Scalability

in #steem · 7 years ago

One of the main concerns the post was trying to address was the idea that the requirements for witness and seed nodes had already reached 64 GB of RAM and were on their way to 128 GB and beyond.

The belief that the state file should be stored in /dev/shm/, and that it was best to either rely on swapping or have enough RAM to hold the entire state file, was a large part of this misconception.

You are correct that at some point in the future, if we continue to grow without making any changes, we would reach a point where 16 GB servers, and eventually even 32 GB servers, would not be enough. However, many of the changes we are working on (such as AppBase and RocksDB) are intended to address this long before we reach that point.

  1. The info (both in the post and here) about /dev/shm is completely wrong
  2. It is true that a witness node does not currently require 64 GB (and certainly not 128 GB, which is overkill to the point of being useless). 32 GB works fine. 16 GB is pretty questionable, and I doubt that 8 GB is even usable (I'm running some tests right now). 32 GB will become questionable too once the file exceeds physical memory by a sufficient ratio, which from past experience at various sizes seems to be about 2x (I believe the file is currently around 35 GB). The memory-mapped approach just does not perform well when the data size is much larger than memory.
  3. Moving the data to a database should dramatically improve how well data size scales relative to physical memory once development is finished, but it may bring other tradeoffs. We will have to see how that works out.
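
To make the rule of thumb in point 2 concrete, here is a rough sketch of the arithmetic. It assumes the "about 2x" ratio and the "around 35 GB" file size quoted above; the function name and the hard threshold are made up for illustration, and real behavior will depend on access patterns, not just this ratio.

```python
def file_to_ram_ratio(state_file_gb, ram_gb):
    """Ratio of the memory-mapped state file size to physical RAM."""
    return state_file_gb / ram_gb


STATE_FILE_GB = 35.0  # assumption: "around 35 GB" as estimated in the comment
MAX_RATIO = 2.0       # assumption: the "about 2x" rule of thumb from the comment

for ram_gb in (8, 16, 32, 64):
    ratio = file_to_ram_ratio(STATE_FILE_GB, ram_gb)
    verdict = "probably fine" if ratio <= MAX_RATIO else "likely to struggle"
    print(f"{ram_gb:>3} GB RAM: file is {ratio:.2f}x RAM, {verdict}")
```

With these numbers, 16 GB lands just above the 2x mark (35 / 16 ≈ 2.19), which fits the "pretty questionable" description, and a 32 GB node would cross the same line once the file grows past roughly 64 GB.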

@andarchy

The belief that the state file should be stored in /dev/shm/, and that it was best to either rely on swapping or have enough RAM to hold the entire state file, was a large part of this misconception.

So essentially the suggestion is that the files need to be read quickly only during the initial indexing, and from that point onwards only the latest blocks need to be in RAM for faster I/O?

In other words, once the reindexing (which is CPU- and I/O-bound) is over, only the "tail end of the blockchain" sees I/O, and the rest need not be in memory.

If this is the case, we need to carefully page out the older parts of the blockchain from memory so that only the new (tail end) part stays in memory.

Does this make sense?
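
For what it's worth, the paging idea in the question above could be sketched roughly like this. This is not how steemd manages its state file; it is only a generic illustration of memory-mapping a file and hinting to the kernel that the older region is not needed. The function name `read_tail` and the demo file are hypothetical. Note that for a memory-mapped file the kernel already evicts cold pages on its own under memory pressure, so an explicit hint like this is optional.

```python
import mmap
import os
import tempfile


def read_tail(path, tail_bytes):
    """Map a file read-only, hint that the older region is unneeded, return the tail."""
    size = os.path.getsize(path)
    start = max(0, size - tail_bytes)
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        # madvise needs a page-aligned start, so round the split point down.
        split = start // mmap.PAGESIZE * mmap.PAGESIZE
        dontneed = getattr(mmap, "MADV_DONTNEED", None)  # not defined on every platform
        if split > 0 and dontneed is not None and hasattr(m, "madvise"):
            # Only a hint: the pages can still be read later and are
            # simply faulted back in from the file when touched.
            m.madvise(dontneed, 0, split)
        return m[start:]


# Demo with a throwaway file standing in for a much larger state file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * (4 * mmap.PAGESIZE) + b"latest-blocks")
    demo_path = tmp.name

tail = read_tail(demo_path, len(b"latest-blocks"))
os.remove(demo_path)
print(tail)
```

The sketch needs Python 3.8 or later for `mmap.madvise`, and it quietly skips the hint on platforms that lack `MADV_DONTNEED`.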